Video Search and Retrieval with Vespa and TwelveLabs¶
In the following notebook, we will demonstrate how to leverage TwelveLabs Marengo-retrieval-2.7
a SOTA multimodal embedding model to demonstrate a use case of video embeddings storage and semantic search retrieval using Vespa.ai.
The steps we will take in this notebook are:
- Setup and configuration
- Generate Attributes and Embeddings for 3 sample videos using the TwelveLabs python SDK.
- Deploy the Vespa application to Vespa Cloud and Feed the Data
- Perform a semantic search with hybrid multi-phase ranking on the videos
- Review the results
- Cleanup
All the steps that are needed to provision the Vespa application, including feeding the data, can be done by running this notebook. We have tried to make it easy for others to run this notebook, to create your own Video semantic search application using TwelveLabs models with Vespa.
1. Setup and Configuration¶
For reference, this is the Python version used for this notebook.
!python --version
Python 3.11.11
1.1 Install libraries¶
Install the required Python dependencies from TwelveLabs python SDK and pyvespa python API.
!pip3 install pyvespa vespacli twelvelabs pandas
Requirement already satisfied: pyvespa in /usr/local/lib/python3.11/dist-packages (0.53.0) Requirement already satisfied: vespacli in /usr/local/lib/python3.11/dist-packages (8.478.26) Requirement already satisfied: twelvelabs in /usr/local/lib/python3.11/dist-packages (0.4.4) Requirement already satisfied: pandas in /usr/local/lib/python3.11/dist-packages (2.2.2) Requirement already satisfied: requests in /usr/local/lib/python3.11/dist-packages (from pyvespa) (2.32.3) Requirement already satisfied: requests_toolbelt in /usr/local/lib/python3.11/dist-packages (from pyvespa) (1.0.0) Requirement already satisfied: docker in /usr/local/lib/python3.11/dist-packages (from pyvespa) (7.1.0) Requirement already satisfied: jinja2 in /usr/local/lib/python3.11/dist-packages (from pyvespa) (3.1.5) Requirement already satisfied: cryptography in /usr/local/lib/python3.11/dist-packages (from pyvespa) (43.0.3) Requirement already satisfied: aiohttp in /usr/local/lib/python3.11/dist-packages (from pyvespa) (3.11.12) Requirement already satisfied: httpx[http2] in /usr/local/lib/python3.11/dist-packages (from pyvespa) (0.28.1) Requirement already satisfied: tenacity>=8.4.1 in /usr/local/lib/python3.11/dist-packages (from pyvespa) (9.0.0) Requirement already satisfied: typing_extensions in /usr/local/lib/python3.11/dist-packages (from pyvespa) (4.12.2) Requirement already satisfied: python-dateutil in /usr/local/lib/python3.11/dist-packages (from pyvespa) (2.8.2) Requirement already satisfied: fastcore>=1.7.8 in /usr/local/lib/python3.11/dist-packages (from pyvespa) (1.7.29) Requirement already satisfied: lxml in /usr/local/lib/python3.11/dist-packages (from pyvespa) (5.3.1) Requirement already satisfied: pydantic>=2.4.2 in /usr/local/lib/python3.11/dist-packages (from twelvelabs) (2.10.6) Requirement already satisfied: numpy>=1.23.2 in /usr/local/lib/python3.11/dist-packages (from pandas) (1.26.4) Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/dist-packages (from pandas) (2025.1) Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.11/dist-packages (from pandas) (2025.1) Requirement already satisfied: packaging in /usr/local/lib/python3.11/dist-packages (from fastcore>=1.7.8->pyvespa) (24.2) Requirement already satisfied: anyio in /usr/local/lib/python3.11/dist-packages (from httpx[http2]->pyvespa) (3.7.1) Requirement already satisfied: certifi in /usr/local/lib/python3.11/dist-packages (from httpx[http2]->pyvespa) (2025.1.31) Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.11/dist-packages (from httpx[http2]->pyvespa) (1.0.7) Requirement already satisfied: idna in /usr/local/lib/python3.11/dist-packages (from httpx[http2]->pyvespa) (3.10) Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.11/dist-packages (from httpcore==1.*->httpx[http2]->pyvespa) (0.14.0) Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.4.2->twelvelabs) (0.7.0) Requirement already satisfied: pydantic-core==2.27.2 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.4.2->twelvelabs) (2.27.2) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil->pyvespa) (1.17.0) Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->pyvespa) (2.4.6) Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.11/dist-packages (from aiohttp->pyvespa) (1.3.2) Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->pyvespa) (25.1.0) Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.11/dist-packages (from aiohttp->pyvespa) (1.5.0) Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.11/dist-packages (from aiohttp->pyvespa) (6.1.0) Requirement already satisfied: propcache>=0.2.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->pyvespa) (0.2.1) Requirement already satisfied: yarl<2.0,>=1.17.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp->pyvespa) (1.18.3) Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.11/dist-packages (from cryptography->pyvespa) (1.17.1) Requirement already satisfied: urllib3>=1.26.0 in /usr/local/lib/python3.11/dist-packages (from docker->pyvespa) (2.3.0) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests->pyvespa) (3.4.1) Requirement already satisfied: h2<5,>=3 in /usr/local/lib/python3.11/dist-packages (from httpx[http2]->pyvespa) (4.2.0) Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from jinja2->pyvespa) (3.0.2) Requirement already satisfied: pycparser in /usr/local/lib/python3.11/dist-packages (from cffi>=1.12->cryptography->pyvespa) (2.22) Requirement already satisfied: hyperframe<7,>=6.1 in /usr/local/lib/python3.11/dist-packages (from h2<5,>=3->httpx[http2]->pyvespa) (6.1.0) Requirement already satisfied: hpack<5,>=4.1 in /usr/local/lib/python3.11/dist-packages (from h2<5,>=3->httpx[http2]->pyvespa) (4.1.0) Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.11/dist-packages (from anyio->httpx[http2]->pyvespa) (1.3.1)
Import all the required packages in this notebook.
import os
import hashlib
import json
from vespa.package import (
ApplicationPackage,
Field,
Schema,
Document,
HNSW,
RankProfile,
FieldSet,
SecondPhaseRanking,
Function,
)
from vespa.deployment import VespaCloud
from vespa.io import VespaResponse, VespaQueryResponse
from twelvelabs import TwelveLabs
from twelvelabs.models.embed import EmbeddingsTask
import pandas as pd
from datetime import datetime
TL_API_KEY = os.getenv("TL_API_KEY") or input("Enter your TL_API key: ")
1.3 Sign-up for a Vespa Trial Account¶
Pre-requisite:
- Spin-up a Vespa Cloud Trial account.
- Login to the account you just created and create a tenant at console.vespa-cloud.com.
- Save the tenant name.
1.4 Setup the tenant name and the application name¶
- Paste below the name of the tenant name.
- Give your application a name. Note that the name cannot have
-
or_
.
# Replace with your tenant name from the Vespa Cloud Console
tenant_name = "vespa-team"
# Replace with your application name (does not need to exist yet)
application = "videosearch"
2. Generate Attributes and Embeddings for sample videos using TwelveLabs Embedding API¶
2.1 Generate attributes on the videos¶
In this section, we will leverage the Pegasus 1.1 generative model to generate some attributes about our videos to store as part of the searchable information in Vespa. Attributes we want to store as part of the videos include:
- Keywords
- Summaries
For video samples, we are selecting the 3 videos in the array below from the Internet Archive.
You can customize this code with the urls of your choice. Note that there are certain restrictions such as the resolution of the videos.
VIDEO_URLs = [
"https://ia801503.us.archive.org/27/items/hide-and-seek-with-giant-jenny/HnVideoEditor_2022_10_29_205557707.ia.mp4",
"https://ia601401.us.archive.org/1/items/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net.mp4",
"https://dn720401.ca.archive.org/0/items/mr-bean-the-animated-series-holiday-for-teddy/S2E12.ia.mp4",
]
In order to generate text on the videos, the prerequisite is to upload the videos and index them. Let's first create an index below:
# Spin-up session
client = TwelveLabs(api_key=TL_API_KEY)
# Generating Index Name
timestamp = int(datetime.now().timestamp())
index_name = "Vespa_" + str(timestamp)
# Create Index
print("Creating Index:" + index_name)
index = client.index.create(
name=index_name,
models=[
{
"name": "pegasus1.2",
"options": ["visual", "audio"],
}
],
addons=["thumbnail"], # Optional
)
print(f"Created index: id={index.id} name={index.name} models={index.models}")
Creating Index:Vespa_1739986655 Created index: id=67b616dfc82670193cb59a05 name=Vespa_1739986655 models=root=[Model(name='pegasus1.2', options=['visual', 'audio'], addons=None, finetuned=False)]
We can now upload the videos:
# Capturing index id for upload
index_id = index.id
def on_task_update(task: EmbeddingsTask):
print(f" Status={task.status}")
for video_url in VIDEO_URLs:
# Create a video indexing task
task = client.task.create(index_id=index_id, url=video_url)
print(f"Task created successfully! Task ID: {task.id}")
status = task.wait_for_done(sleep_interval=10, callback=on_task_update)
print(f"Indexing done: {status}")
if task.status != "ready":
raise RuntimeError(f"Indexing failed with status {task.status}")
print(
f"Uploaded {video_url}. The unique identifer of your video is {task.video_id}."
)
Task created successfully! Task ID: 67b616e4c82670193cb59a06 Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=ready Indexing done: Task(id='67b616e4c82670193cb59a06', created_at='2025-02-19T17:37:40.286Z', updated_at='2025-02-19T17:37:40.286Z', index_id='67b616dfc82670193cb59a05', video_id='67b616e551e07a2910a9b956', status='ready', system_metadata={'filename': 'HnVideoEditor_2022_10_29_205557707.ia', 'duration': 221.9666671, 'width': 854, 'height': 480}, hls=None) Uploaded https://ia801503.us.archive.org/27/items/hide-and-seek-with-giant-jenny/HnVideoEditor_2022_10_29_205557707.ia.mp4. The unique identifer of your video is 67b616e551e07a2910a9b956. Task created successfully! Task ID: 67b6179dc82670193cb59a0a Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=ready Indexing done: Task(id='67b6179dc82670193cb59a0a', created_at='2025-02-19T17:40:45.027Z', updated_at='2025-02-19T17:40:45.027Z', index_id='67b616dfc82670193cb59a05', video_id='67b617b6589f15770cd94602', status='ready', system_metadata={'filename': 'twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net', 'duration': 1448.8000001, 'width': 640, 'height': 480}, hls=None) Uploaded https://ia601401.us.archive.org/1/items/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net.mp4. The unique identifer of your video is 67b617b6589f15770cd94602. Task created successfully! Task ID: 67b618c0c82670193cb59a10 Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=pending Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=indexing Status=ready Indexing done: Task(id='67b618c0c82670193cb59a10', created_at='2025-02-19T17:45:36.117Z', updated_at='2025-02-19T17:45:36.117Z', index_id='67b616dfc82670193cb59a05', video_id='67b618c2589f15770cd94603', status='ready', system_metadata={'filename': 'S2E12.ia', 'duration': 659.9200001, 'width': 854, 'height': 480}, hls=None) Uploaded https://dn720401.ca.archive.org/0/items/mr-bean-the-animated-series-holiday-for-teddy/S2E12.ia.mp4. The unique identifer of your video is 67b618c2589f15770cd94603.
Now that the videos have been uploaded, we can generate the keywords, and summaries on the videos below. You will notice on the output that the video uploaded last is the one that is processed first in this stage. This matters since we store other attributes on the videos on arrays (eg URLs, Titles).
client = TwelveLabs(api_key=TL_API_KEY)
summaries = []
keywords_array = []
# Get all videos in an Index
videos = client.index.video.list(index_id)
for video in videos:
print(f"Generating text for {video.id}")
res = client.generate.summarize(
video_id=video.id,
type="summary",
prompt="Generate an abstract of the video serving as metadata on the video, up to five sentences.",
)
print(f"Summary: {res.summary}")
summaries.append(res.summary)
keywords = client.generate.text(
video_id=video.id,
prompt="Based on this video, I want to generate five keywords for SEO (Search Engine Optimization). Provide just the keywords as a comma delimited list without any additional text.",
)
print(f"Open-ended Text: {keywords.data}")
keywords_array.append(keywords.data)
Generating text for 67b618c2589f15770cd94603 Summary: The video showcases a series of comedic scenes featuring Mr. Bean, who embarks on a holiday at Seaview Hotel. Throughout his stay, he engages in various mishaps, including accidentally starting a car and exchanging toys with a little girl. At the beach, he interacts with a family and participates in a dodgem car game, where he manages to take possession of a teddy bear, causing confusion and amusement. The video concludes with Mr. Bean back at his room, reflecting on his holiday adventures through photographs, highlighting his characteristic blend of humor and awkwardness. Open-ended Text: MrBean, TeddyBear, Beach, BumperCars, Holiday Generating text for 67b617b6589f15770cd94602 Summary: The video is an animated adaptation of "A Visit from St. Nicholas," commonly known as "The Night Before Christmas," narrated and sung by Joel Grey. It features a town where the residents, including a clockmaker named Joshua Trundle and his family, are troubled by Santa's absence due to a critical letter published in the local newspaper. The story unfolds with the clockmaker's son, Albert, realizing his mistake and attempting to fix a malfunctioning clock in the town hall that was meant to welcome Santa. Despite initial setbacks, the community's efforts eventually lead to Santa's arrival, restoring joy and belief in the magic of Christmas. The video concludes with Santa Claus descending through chimneys to deliver gifts, symbolizing the triumph of hope and the spirit of the holiday. Open-ended Text: Christmas Eve, Santa Claus, Clockmaker, Snowy Night, Mouse Characters Generating text for 67b616e551e07a2910a9b956 Summary: The video showcases a series of animated scenes featuring a panda and three cartoon wolves engaging in various activities, from watching TV to building a miniature cardboard town. The wolves, along with a pink dog and a green alien, encounter unexpected situations such as a giant fox descending upon them and a chase involving toy cars and a convertible. The narrative culminates in a heartwarming reunion under a bridge, where the characters express gratitude for their day's adventures. Throughout the video, the characters display a range of emotions and interactions, highlighting themes of friendship and teamwork. Open-ended Text: Cartoon, Wolves, Hide-and-Seek, Cardboard City, Alien
We need to store the titles of the videos as an additional attribute.
# Creating array with titles
titles = [
"Mr. Bean the Animated Series Holiday for Teddy",
"Twas the night before Christmas",
"Hide and Seek with Giant Jenny",
]
2.2 Generate Embeddings¶
The following code leverages the Embed API to create an asynchronous embedding task to embed the sample videos.
Twelve Labs video embeddings capture all the subtle cues and interactions between different modalities, including the visual expressions, body language, spoken words, and the overall context of the video, encapsulating the essence of all these modalities and their interrelations over time.
client = TwelveLabs(api_key=TL_API_KEY)
# Initialize an array to store the task IDs as strings
task_ids = []
for url in VIDEO_URLs:
task = client.embed.task.create(model_name="Marengo-retrieval-2.7", video_url=url)
print(
f"Created task: id={task.id} model_name={task.model_name} status={task.status}"
)
# Append the task ID to the array
task_ids.append(str(task.id))
status = task.wait_for_done(sleep_interval=10, callback=on_task_update)
print(f"Embedding done: {status}")
if task.status != "ready":
raise RuntimeError(f"Embedding failed with status {task.status}")
Created task: id=67b61b3567ce2ae76ec6b3eb model_name=Marengo-retrieval-2.7 status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=ready Embedding done: ready Created task: id=67b61b7e67ce2ae76ec6b3f4 model_name=Marengo-retrieval-2.7 status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=ready Embedding done: ready Created task: id=67b61c0c67ce2ae76ec6b3fa model_name=Marengo-retrieval-2.7 status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=processing Status=ready Embedding done: ready
2.3 Retrieve Embeddings¶
Once the embedding task is completed, we can retrieve the results of the embedding task based on the task_ids.
# Spin-up session
client = TwelveLabs(api_key=TL_API_KEY)
# Initialize an array to store the task objects directly
tasks = []
for task_id in task_ids:
# Retrieve the task
task = client.embed.task.retrieve(task_id)
tasks.append(task)
# Print task details
print(f"Task ID: {task.id}")
print(f"Status: {task.status}")
Task ID: 67b61b3567ce2ae76ec6b3eb Status: ready Task ID: 67b61b7e67ce2ae76ec6b3f4 Status: ready Task ID: 67b61c0c67ce2ae76ec6b3fa Status: ready
We can now review the output structure of the first segment for each one of these videos. This output will help us define the schema to store the embeddings in Vespa in the second part of this notebook.
From looking at this output, the video has been embedded into chunks of 6 seconds each (default configurable value in the Embed API). Each embedding has a float vector of dimension 1024.
The number of segments generated vary per video, based on the length of the videos ranging from 37 to 242 segments.
for task in tasks:
print(task.id)
# Display data types of each field
for key, value in task.video_embedding.segments[0]:
if isinstance(value, list):
print(
f"{key}: list of size {len(value)} (truncated to 5 items): {value[:5]} "
)
else:
print(f"{key}: {type(value).__name__} : {value}")
print(f"Total Number of segments: {len(task.video_embedding.segments)}")
67b61b3567ce2ae76ec6b3eb start_offset_sec: float : 0.0 end_offset_sec: float : 6.0 embedding_scope: str : clip embeddings_float: list of size 1024 (truncated to 5 items): [0.030361895, 0.008698823, -0.0048321243, -0.019013105, -0.011488311] Total Number of segments: 37 67b61b7e67ce2ae76ec6b3f4 start_offset_sec: float : 0.0 end_offset_sec: float : 6.0 embedding_scope: str : clip embeddings_float: list of size 1024 (truncated to 5 items): [0.024328815, -0.0035867887, 0.016065866, 0.02501548, 0.007778642] Total Number of segments: 242 67b61c0c67ce2ae76ec6b3fa start_offset_sec: float : 0.0 end_offset_sec: float : 6.0 embedding_scope: str : clip embeddings_float: list of size 1024 (truncated to 5 items): [0.04080625, 0.0086980555, 0.00096186635, -0.00607, -0.020250283] Total Number of segments: 110
3. Deploy a Vespa Application¶
At this point, we are ready to deploy a Vespa Application. We have generated the attributes we needed on each video, as well as the embeddings.
3.1 Create an Application Package¶
The application package has all the Vespa configuration files - create one from scratch:
The Vespa schema deployed as part of the package is called videos
. All the fields are matching the output of the Twelvelabs Embed API above. Refer to the Vespa documentation for more information on the schema specification.
We can first define the schema using pyvespa
videos_schema = Schema(
name="videos",
document=Document(
fields=[
Field(name="video_url", type="string", indexing=["summary"]),
Field(
name="title",
type="string",
indexing=["index", "summary"],
match=["text"],
index="enable-bm25",
),
Field(
name="keywords",
type="string",
indexing=["index", "summary"],
match=["text"],
index="enable-bm25",
),
Field(
name="video_summary",
type="string",
indexing=["index", "summary"],
match=["text"],
index="enable-bm25",
),
Field(
name="embedding_scope", type="string", indexing=["attribute", "summary"]
),
Field(
name="start_offset_sec",
type="array<float>",
indexing=["attribute", "summary"],
),
Field(
name="end_offset_sec",
type="array<float>",
indexing=["attribute", "summary"],
),
Field(
name="embeddings",
type="tensor<float>(p{},x[1024])",
indexing=["index", "attribute"],
ann=HNSW(distance_metric="angular"),
),
]
),
)
fieldsets = (
[
FieldSet(
name="default",
fields=["title", "keywords", "video_summary"],
),
],
)
mapfunctions = [
Function(
name="similarities",
expression="""
sum(
query(q) * attribute(embeddings), x
)
""",
),
Function(
name="bm25_score",
expression="bm25(title) + bm25(keywords) + bm25(video_summary)",
),
]
semantic_rankprofile = RankProfile(
name="hybrid",
inputs=[("query(q)", "tensor<float>(x[1024])")],
first_phase="bm25_score",
second_phase=SecondPhaseRanking(
expression="closeness(field, embeddings)", rerank_count=10
),
match_features=["closest(embeddings)"],
summary_features=["similarities"],
functions=mapfunctions,
)
videos_schema.add_rank_profile(semantic_rankprofile)
We can now create the package based on the previous schema
# Create the Vespa application package
package = ApplicationPackage(name=application, schema=[videos_schema])
3.2 Deploy the Application Package¶
The app is now defined and ready to deploy to Vespa Cloud.
Deploy package
to Vespa Cloud, by creating an instance of
VespaCloud:
vespa_cloud = VespaCloud(
tenant=tenant_name,
application=application,
application_package=package,
key_content=os.getenv("VESPA_TEAM_API_KEY", None),
)
Setting application... Running: vespa config set application vespa-presales.videosearch Setting target cloud... Running: vespa config set target cloud No api-key found for control plane access. Using access token. No auth.json found. Please authenticate. Your Device Confirmation code is: QVNV-TRSK Automatically open confirmation page in your default browser? [Y/n] Y Y Opened link in your browser: https://login.console.vespa-cloud.com/activate?user_code=QVNV-TRSK /usr/bin/xdg-open: 882: www-browser: not found /usr/bin/xdg-open: 882: links2: not found /usr/bin/xdg-open: 882: elinks: not found /usr/bin/xdg-open: 882: links: not found /usr/bin/xdg-open: 882: lynx: not found /usr/bin/xdg-open: 882: w3m: not found xdg-open: no method available for opening 'https://login.console.vespa-cloud.com/activate?user_code=QVNV-TRSK' Couldn't open the URL, please do it manually Waiting for login to complete in browser ... done Warning: Could not store the refresh token locally. You may need to login again once your access token expires Success: Logged in auth.json created at /root/.vespa/auth.json Successfully obtained access token for control plane access. Certificate and key not found in /content/.vespa or /root/.vespa/vespa-presales.videosearch.default: Creating new cert/key pair with vespa CLI. Generating certificate and key... Running: vespa auth cert -N Success: Certificate written to '/root/.vespa/vespa-presales.videosearch.default/data-plane-public-cert.pem' Success: Private key written to '/root/.vespa/vespa-presales.videosearch.default/data-plane-private-key.pem'
app = vespa_cloud.deploy()
Deployment started in run 1 of dev-aws-us-east-1c for vespa-presales.videosearch. This may take a few minutes the first time. INFO [18:05:07] Deploying platform version 8.482.31 and application dev build 1 for dev-aws-us-east-1c of default ... INFO [18:05:08] Using CA signed certificate version 1 INFO [18:05:08] Using 1 nodes in container cluster 'videosearch_container' INFO [18:05:11] Session 2384 for tenant 'vespa-presales' prepared and activated. INFO [18:05:33] ######## Details for all nodes ######## INFO [18:05:42] h113694g.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP INFO [18:05:42] --- platform vespa/cloud-tenant-rhel8:8.482.31 INFO [18:05:42] --- container-clustercontroller on port 19050 has not started INFO [18:05:42] --- metricsproxy-container on port 19092 has not started INFO [18:05:42] h113694b.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP INFO [18:05:42] --- platform vespa/cloud-tenant-rhel8:8.482.31 INFO [18:05:42] --- logserver-container on port 4080 has not started INFO [18:05:42] --- metricsproxy-container on port 19092 has not started INFO [18:05:42] h113669a.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP INFO [18:05:42] --- platform vespa/cloud-tenant-rhel8:8.482.31 INFO [18:05:42] --- storagenode on port 19102 has not started INFO [18:05:42] --- searchnode on port 19107 has not started INFO [18:05:42] --- distributor on port 19111 has not started INFO [18:05:42] --- metricsproxy-container on port 19092 has not started INFO [18:05:42] h113963a.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP INFO [18:05:42] --- platform vespa/cloud-tenant-rhel8:8.482.31 INFO [18:05:42] --- container on port 4080 has not started INFO [18:05:42] --- metricsproxy-container on port 19092 has not started INFO [18:07:10] Waiting for convergence of 10 services across 4 nodes INFO [18:07:10] 3 nodes booting INFO [18:07:10] 10 application services still deploying DEBUG [18:07:17] h113694b.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP DEBUG [18:07:17] --- platform vespa/cloud-tenant-rhel8:8.482.31 DEBUG [18:07:17] --- logserver-container on port 4080 has not started DEBUG [18:07:17] --- metricsproxy-container on port 19092 has not started DEBUG [18:07:17] h113669a.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP DEBUG [18:07:17] --- platform vespa/cloud-tenant-rhel8:8.482.31 DEBUG [18:07:17] --- storagenode on port 19102 has not started DEBUG [18:07:17] --- searchnode on port 19107 has not started DEBUG [18:07:17] --- distributor on port 19111 has not started DEBUG [18:07:17] --- metricsproxy-container on port 19092 has not started DEBUG [18:07:17] h113963a.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP DEBUG [18:07:17] --- platform vespa/cloud-tenant-rhel8:8.482.31 DEBUG [18:07:17] --- container on port 4080 has not started DEBUG [18:07:17] --- metricsproxy-container on port 19092 has not started DEBUG [18:07:17] h113694g.dev.us-east-1c.aws.vespa-cloud.net: expected to be UP DEBUG [18:07:17] --- platform vespa/cloud-tenant-rhel8:8.482.31 DEBUG [18:07:17] --- container-clustercontroller on port 19050 has not started DEBUG [18:07:17] --- metricsproxy-container on port 19092 has not started INFO [18:08:06] Found endpoints: INFO [18:08:06] - dev.aws-us-east-1c INFO [18:08:06] |-- https://aefbf207.f5d60452.z.vespa-app.cloud/ (cluster 'videosearch_container') INFO [18:08:22] Deployment complete! Only region: aws-us-east-1c available in dev environment. Found mtls endpoint for videosearch_container URL: https://aefbf207.f5d60452.z.vespa-app.cloud/ Application is up!
3.3 Feed the Vespa Application¶
The vespa_feed
feed format for pyvespa
expects a dict with the keys id
and fields
:
{ "id": "vespa-document-id", "fields": {"vespa_field": "vespa-field-value"}}
For the id, we will use a md5 hash of the video url.
The video embedding output segments are added to the fields
in vespa_feed
.
# Initialize a list to store Vespa feed documents
vespa_feed = []
# Need to reverse VIDEO_URLS as keywords/summaries generated in reverse order
VIDEO_URLs.reverse()
# Iterate through each task and corresponding metadata
for i, task in enumerate(tasks):
video_url = VIDEO_URLs[i]
title = titles[i]
keywords = keywords_array[i]
summary = summaries[i]
start_offsets = [] # Reset for each video
end_offsets = [] # Reset for each video
embeddings = {} # Reset for each video
# Iterate through the video embedding segments
for index, segment in enumerate(task.video_embedding.segments):
# Append start and end offsets as floats
start_offsets.append(float(segment.start_offset_sec))
end_offsets.append(float(segment.end_offset_sec))
# Add embedding to a multi-dimensional dictionary with index as the key
embeddings[str(index)] = list(map(float, segment.embeddings_float))
# Create Vespa document for each task
for segment in task.video_embedding.segments:
start_offset_sec = segment.start_offset_sec
end_offset_sec = segment.end_offset_sec
embedding = list(map(float, segment.embeddings_float))
# Create a unique ID by hashing the URL and segment index
id_hash = hashlib.md5(f"{video_url}_{index}".encode()).hexdigest()
document = {
"id": id_hash,
"fields": {
"video_url": video_url,
"title": title,
"keywords": keywords,
"video_summary": summary,
"embedding_scope": segment.embedding_scope,
"start_offset_sec": start_offsets,
"end_offset_sec": end_offsets,
"embeddings": embeddings,
},
}
vespa_feed.append(document)
We can quickly validate the number of the number of documents created (one for each video), and visually check the first record.
# Print Vespa feed size and an example
print(f"Total documents created: {len(vespa_feed)}")
Total documents created: 3
# The positional index of the document
i = 0
# Iterate through the first 3 embeddings in vespa_feed
for i in range(
min(3, len(vespa_feed))
): # Ensure we don't exceed the length of vespa_feed
# Limit the embedding to the first 3 keys and first 5 values for each key
embedding = vespa_feed[i]["fields"]["embeddings"]
embedding_sample = {key: values[:3] for key, values in list(embedding.items())[:3]}
# Beautify and print the first document with only the first 5 embedding values
pretty_json = json.dumps(
{
"id": vespa_feed[i]["id"],
"fields": {
"video_url": vespa_feed[i]["fields"]["video_url"],
"title": vespa_feed[i]["fields"]["title"],
"keywords": vespa_feed[i]["fields"]["keywords"],
"video_summary": vespa_feed[i]["fields"]["video_summary"],
"embedding_scope": vespa_feed[i]["fields"]["embedding_scope"],
"start_offset_sec": vespa_feed[i]["fields"]["start_offset_sec"][:3],
"end_offset_sec": vespa_feed[i]["fields"]["end_offset_sec"][:3],
"embedding": embedding_sample,
},
},
indent=4,
)
print(pretty_json)
{ "id": "0b1fc68a17391fb58102a539ed290d27", "fields": { "video_url": "https://ia801503.us.archive.org/27/items/hide-and-seek-with-giant-jenny/HnVideoEditor_2022_10_29_205557707.ia.mp4", "title": "Hide and Seek with Giant Jenny", "keywords": "Cartoon, Wolves, Hide-and-Seek, Cardboard City, Alien", "video_summary": "The video showcases a series of animated scenes featuring a panda and three cartoon wolves engaging in various activities, from watching TV to building a miniature cardboard town. The wolves, along with a pink dog and a green alien, encounter unexpected situations such as a giant fox descending upon them and a chase involving toy cars and a convertible. The narrative culminates in a heartwarming reunion under a bridge, where the characters express gratitude for their day's adventures. Throughout the video, the characters display a range of emotions and interactions, highlighting themes of friendship and teamwork.", "embedding_scope": "clip", "start_offset_sec": [ 0.0, 6.0, 12.0 ], "end_offset_sec": [ 6.0, 12.0, 18.0 ], "embedding": { "0": [ 0.04080625, 0.0086980555, 0.00096186635 ], "1": [ 0.05161131, -0.0063618324, -0.008135624 ], "2": [ 0.050463274, 0.0006376326, -0.010785032 ] } } }
Now we can feed to Vespa using feed_iterable
which accepts any Iterable
and an optional callback function where we can
check the outcome of each operation.
def callback(response: VespaResponse, id: str):
if not response.is_successful():
print(
f"Failed to feed document {id} with status code {response.status_code}: Reason {response.get_json()}"
)
# Feed data into Vespa synchronously
app.feed_iterable(vespa_feed, schema="videos", callback=callback)
client = TwelveLabs(api_key=TL_API_KEY)
user_query = "Santa Claus on his sleigh"
res = client.embed.create(
model_name="Marengo-retrieval-2.7",
text=user_query,
)
print("Created a text embedding")
print(f" Model: {res.model_name}")
if res.text_embedding is not None and res.text_embedding.segments is not None:
q_embedding = res.text_embedding.segments[0].embeddings_float
print(f" Embedding Dimension: {len(q_embedding)}")
print(f" Sample 5 values from array: {q_embedding[:5]}")
Created a text embedding Model: Marengo-retrieval-2.7 Embedding Dimension: 1024 Sample 5 values from array: [-0.018066406, -0.0065307617, 0.05859375, -0.033447266, -0.02368164]
The following uses dense vector representations of the query embedding obtained previously and document and matching is performed and accelerated by Vespa's support for approximate nearest neighbor search.
The output is limited to the top 1 hit, as we only have a sample of 3 videos. The top hit returned was based on a hybrid ranking based on a bm25 ranking based on a lexical search on the text, keywords and summary of the video, performed as a first phase, and similarity search on the embeddings.
We can see as part of the match-features
, the segment 212 in the video was the one providing the highest match.
We also calculate the similarities as part of the summary-features
for the rest of the segments so we can look for top N segments within a video, optionally.
with app.syncio(connections=1) as session:
response: VespaQueryResponse = session.query(
yql="select * from videos where userQuery() OR ({targetHits:100}nearestNeighbor(embeddings,q))",
query=user_query,
ranking="hybrid",
hits=1,
body={"input.query(q)": q_embedding},
)
assert response.is_successful()
for hit in response.hits:
print(json.dumps(hit, indent=4))
# response.get_json()
You should see output similar to this:
{
"id": "id:videos:videos::13bcb994b389c9d925993e611877e40b",
"relevance": 0.47162757625475055,
"source": "videosearch_content",
"fields": {
"matchfeatures": {
"closest(embeddings)": {
"type": "tensor<float>(p{})",
"cells": {
"212": 1.0
}
}
},
"sddocname": "videos",
"documentid": "id:videos:videos::13bcb994b389c9d925993e611877e40b",
"video_url": "https://ia601401.us.archive.org/1/items/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net.mp4",
"title": "Twas the night before Christmas",
"keywords": "Christmas Eve, Santa Claus, Clockmaker, Snowy Night, Mouse Characters",
"video_summary": "The video is an animated adaptation of \"A Visit from St. Nicholas,\" commonly known as \"The Night Before Christmas,\" narrated and sung by Joel Grey. It features a town where the residents, including a clockmaker named Joshua Trundle and his family, are troubled by Santa's absence due to a critical letter published in the local newspaper. The story unfolds with the clockmaker's son, Albert, realizing his mistake and attempting to fix a malfunctioning clock in the town hall that was meant to welcome Santa. Despite initial setbacks, the community's efforts eventually lead to Santa's arrival, restoring joy and belief in the magic of Christmas. The video concludes with Santa Claus descending through chimneys to deliver gifts, symbolizing the triumph of hope and the spirit of the holiday.",
"embedding_scope": "clip",
"start_offset_sec": [
0.0,
6.0,
12.0,
18.0,
],
}
}
In order to process the results above in a more consumable format and sort out the top N segments based on similarities, we can do this more conveniently in a pandas dataframe below:
def get_top_n_similarity_matches(data, N=5):
"""
Function to extract the top N similarity scores and their corresponding start and end offsets.
Args:
- data (dict): Input JSON-like structure containing similarities and offsets.
- N (int): The number of top similarity scores to return.
Returns:
- pd.DataFrame: A DataFrame with the top N similarity scores and their corresponding offsets.
"""
# Extract relevant fields
similarities = data["fields"]["summaryfeatures"]["similarities"]["cells"]
start_offset_sec = data["fields"]["start_offset_sec"]
end_offset_sec = data["fields"]["end_offset_sec"]
# Convert similarity scores to a list of tuples (index, similarity_score) and sort by similarity score
sorted_similarities = sorted(similarities.items(), key=lambda x: x[1], reverse=True)
# Extract top N similarity scores
top_n_similarities = sorted_similarities[:N]
# Prepare results
results = []
for index_str, score in top_n_similarities:
index = int(index_str)
if index < len(start_offset_sec):
result = {
"index": index,
"similarity_score": score,
"start_offset_sec": start_offset_sec[index],
"end_offset_sec": end_offset_sec[index],
}
else:
result = {
"index": index,
"similarity_score": score,
"start_offset_sec": None,
"end_offset_sec": None,
}
results.append(result)
# Convert results to a DataFrame
df = pd.DataFrame(results)
return df
df_result = get_top_n_similarity_matches(response.hits[0], N=10)
df_result
index | similarity_score | start_offset_sec | end_offset_sec | |
---|---|---|---|---|
0 | 212 | 0.435371 | 1272.0 | 1278.0 |
1 | 230 | 0.418007 | 1380.0 | 1386.0 |
2 | 210 | 0.411242 | 1260.0 | 1266.0 |
3 | 211 | 0.409344 | 1266.0 | 1272.0 |
4 | 208 | 0.408644 | 1248.0 | 1254.0 |
5 | 231 | 0.406000 | 1386.0 | 1392.0 |
6 | 209 | 0.404767 | 1254.0 | 1260.0 |
7 | 229 | 0.403729 | 1374.0 | 1380.0 |
8 | 203 | 0.403292 | 1218.0 | 1224.0 |
9 | 207 | 0.391671 | 1242.0 | 1248.0 |
5. Review results (Optional)¶
We can review the results by spinning up a video player in the notebook and check the segments identified and judge by ourselves.
But, first we need to obtain the contiguous segments, add 3 seconds overlap in the consolidated segments and convert to MM:SS so we can quickly find the segments to watch in the player. Let's write a function that takes the response as an input and provides the consolidated segments to view in the player.
def concatenate_contiguous_segments(df):
"""
Function to concatenate contiguous segments based on their start and end offsets.
Converts the concatenated segments to MM:SS format.
Args:
- df (pd.DataFrame): DataFrame with columns 'start_offset_sec' and 'end_offset_sec'.
Returns:
- List of tuples with concatenated segments in MM:SS format as (start_time, end_time).
"""
if df.empty:
return []
# Sort by start_offset_sec for ordered processing
df = df.sort_values(by="start_offset_sec").reset_index(drop=True)
# Initialize the list to hold concatenated segments
concatenated_segments = []
# Initialize the first segment
start = df.iloc[0]["start_offset_sec"]
end = df.iloc[0]["end_offset_sec"]
for i in range(1, len(df)):
current_start = df.iloc[i]["start_offset_sec"]
current_end = df.iloc[i]["end_offset_sec"]
# Check if the current segment is contiguous with the previous one
if current_start <= end:
# Extend the segment if it is contiguous
end = max(end, current_end)
else:
# Add the previous segment to the result list in MM:SS format
concatenated_segments.append(
(convert_seconds_to_mmss(start - 3), convert_seconds_to_mmss(end + 3))
)
# Start a new segment
start = current_start
end = current_end
# Add the final segment
concatenated_segments.append(
(convert_seconds_to_mmss(start - 3), convert_seconds_to_mmss(end + 3))
)
return concatenated_segments
def convert_seconds_to_mmss(seconds):
"""
Converts seconds to MM:SS format.
Args:
- seconds (float): Time in seconds.
Returns:
- str: Time in MM:SS format.
"""
minutes = int(seconds // 60)
seconds = int(seconds % 60)
return f"{minutes:02}:{seconds:02}"
segments = concatenate_contiguous_segments(df_result)
segments
[('20:15', '20:27'), ('20:39', '21:21'), ('22:51', '23:15')]
We can now spin-up the player and review the segments of interest. Video player is set to start in the middle of the first segment.
from IPython.display import HTML
video_url = "https://ia601401.us.archive.org/1/items/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net/twas-the-night-before-christmas-1974-full-movie-freedownloadvideo.net.mp4"
video_player = f"""
<video id="myVideo" width="640" height="480" controls>
<source src="{video_url}" type="video/mp4">
Your browser does not support the video tag.
</video>
"""
HTML(video_player)
6. Clean-up¶
The following will delete the application and data from the dev environment.
vespa_cloud.delete()
Deactivated vespa-presales.videosearch in dev.aws-us-east-1c Deleted instance vespa-presales.videosearch.default
The following will delete the index created earlier where videos where uploaded:
# Creating a client
client = TwelveLabs(api_key=TL_API_KEY)
client.index.delete(index_id)