Billion-scale vector search with Cohere binary embeddings in Vespa¶
Cohere just released a new embedding API with support for binary and int8
vectors. Read the announcement
in the blog post: Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets.
We are excited to announce that Cohere Embed is the first embedding model that natively supports int8 and binary embeddings.
This is huge because:
- Binarization reduces the storage (disk/memory) footprint from 1024 floats (4096 bytes) per vector to 128 bytes.
- Faster distance calculations using hamming distance that Vespa natively supports for bits packed into int8 tensor cells. More on hamming distance in Vespa.
- Multiple vector representations allow for coarse retrieval in hamming space and subsequent phases using higher-resolution representations.
- Drastically reduces the deployment cost due to tiered storage economics.
Vespa supports hamming distance with and without HNSW indexing.
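The storage arithmetic and the hamming computation can be sketched in NumPy. This is a minimal illustration that assumes binarization by sign thresholding (positive values become 1 bits); Cohere's actual quantization may differ, and the helper names `binarize` and `hamming_distance` are hypothetical.

```python
import numpy as np


def binarize(vec: np.ndarray) -> np.ndarray:
    """Pack the sign bits of a float vector into int8 cells (8 dimensions per byte)."""
    bits = (vec > 0).astype(np.uint8)
    # np.packbits returns uint8; view the same bytes as int8 to match int8 tensor cells.
    return np.packbits(bits).view(np.int8)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count the bit positions where two packed vectors differ."""
    xor = np.bitwise_xor(a.view(np.uint8), b.view(np.uint8))
    return int(np.unpackbits(xor).sum())


# Synthetic 1024-dimensional vector with alternating signs (not a real embedding).
vec = np.where(np.arange(1024) % 2 == 0, 1.0, -1.0).astype(np.float32)
packed = binarize(vec)
flipped = binarize(-vec)

print(vec.nbytes, packed.nbytes)          # 4096 bytes as float32 versus 128 bytes packed
print(hamming_distance(packed, packed))   # identical vectors differ in 0 bits
print(hamming_distance(packed, flipped))  # every one of the 1024 bits differs
```

XOR followed by a bit count is all the distance calculation needs, which is why it is cheap compared to a float dot product over 1024 dimensions.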
For those wanting to learn more about binary vectors, we recommend our 2021 blog series on Billion-scale vector search with Vespa and Billion-scale vector search with Vespa - part two.
This notebook demonstrates using the Cohere embeddings with a coarse-to-fine search and re-ranking pipeline that reduces costs, but offers the same retrieval (nDCG) accuracy.
- The packed binary vector representation is stored in memory, with an optional HNSW index using hamming distance.
- The int8 vector representation is stored on disk using Vespa's paged option.
At query time:
- Retrieve in hamming space (1000 candidates) as the coarse-level search using the compact binary representation.
- Re-rank by using a dot product between the float version of the query vector (1024 dims) and an unpacked float version of the binary embedding (also 1024 dims).
- Re-rank again using the 1024-dimensional int8 representations. This stage pages the vector data from disk using Vespa's paged option (unless it is already cached).
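The three query-time steps can be sketched outside Vespa with NumPy on synthetic data. This only illustrates the coarse-to-fine idea: the random vectors, the sign-threshold binarization, the naive int8 scaling and the function name `coarse_to_fine` are all assumptions, and Vespa's own implementation is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
num_docs, dims = 5000, 1024

# Synthetic stand-ins for document embeddings (assumption: real ones come from the embed API).
doc_float = rng.standard_normal((num_docs, dims)).astype(np.float32)
doc_bits = (doc_float > 0).astype(np.uint8)   # 1 bit per dimension
doc_packed = np.packbits(doc_bits, axis=1)    # 128 bytes per document
doc_int8 = np.clip(np.round(doc_float * 40), -128, 127).astype(np.int8)  # naive scaling


def coarse_to_fine(q_float, coarse_k=1000, rerank_k=30):
    q_packed = np.packbits((q_float > 0).astype(np.uint8))
    q_int8 = np.clip(np.round(q_float * 40), -128, 127).astype(np.int8)

    # Step 1: coarse retrieval in hamming space over the packed bits.
    xor = np.bitwise_xor(doc_packed, q_packed)
    hamming = np.unpackbits(xor, axis=1).sum(axis=1)
    candidates = np.argsort(hamming)[:coarse_k]

    # Step 2: float query against the unpacked (-1/+1) binary document vectors.
    unpacked = 2.0 * doc_bits[candidates] - 1.0
    first_phase = unpacked @ q_float
    top = candidates[np.argsort(-first_phase)[:rerank_k]]

    # Step 3: cosine similarity between the int8 representations.
    d = doc_int8[top].astype(np.float32)
    q = q_int8.astype(np.float32)
    cosine = (d @ q) / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    order = np.argsort(-cosine)
    return top[order], cosine[order]


# Query: a slightly perturbed copy of document 42, so we know what should rank first.
query = doc_float[42] + 0.1 * rng.standard_normal(dims).astype(np.float32)
ids, scores = coarse_to_fine(query)
print(ids[:3], scores[:3])
```

Each step touches fewer vectors than the one before it, so the most expensive representation is only read for the final 30 candidates.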
Install the dependencies:
!pip3 install -U pyvespa cohere==4.57 vespacli
Examining the Cohere embeddings¶
Let us check out the Cohere embedding API and how we can obtain vector embeddings with different precisions for the same text input (without additional cost). See also Cohere embed API doc.
import cohere
# Make sure that the environment variable CO_API_KEY is set to your API key
co = cohere.Client()
Some sample documents¶
Define a few sample documents that we want to embed:
documents = [
    "Alan Turing was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist.",
    "Albert Einstein was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time.",
    "Isaac Newton was an English polymath active as a mathematician, physicist, astronomer, alchemist, theologian, and author who was described in his time as a natural philosopher.",
    "Marie Curie was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity",
]
# Compute the embeddings of our sample documents.
# Set input_type to "search_document" and embedding_types to "binary" and "int8"
embeddings = co.embed(
    texts=documents,
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["binary", "int8"],
)
print(embeddings)
cohere.Embeddings { response_type: embeddings_by_type embeddings: cohere.EmbeddingsByType { float: None int8: [[-23, -22, -52, 18, -42, -48, 2, -8, 6, 44, 73, 9, 3, -44, -25, 15, 19, 3, 18, -19, 6, 17, 0, -62, -14, 46, -8, -14, 20, 22, 10, -40, 10, 48, -20, 40, -8, 8, 29, 0, -27, 11, -39, -28, -93, 33, -89, 4, 15, -41, -12, -2, 7, -23, -15, -21, 47, -9, 88, -107, -91, -50, 65, 27, 5, 5, 52, 27, -15, -4, 14, -7, 6, -1, -17, 13, 17, 74, 26, -9, -4, -1, 56, -15, -7, 6, -17, -25, -23, -38, -38, 78, -61, -27, -53, -20, -3, -8, -9, -18, 9, -24, 14, -13, -40, 90, 40, 24, -48, -7, -11, -116, 36, -56, -15, -1, -6, 31, 31, 8, 44, 80, 36, -35, -24, -13, -36, -64, 44, -11, -35, 46, -43, -68, -40, 12, 32, -8, -1, 58, -9, -4, 49, 3, 9, 44, 45, -33, -52, -25, -53, 27, -67, 22, 33, 29, -32, 36, 37, 83, -17, 19, 66, -17, 4, -57, -57, 20, 19, -20, -3, 18, 43, -16, -8, 29, -45, -39, -42, 121, 73, -49, -128, 127, -19, 41, -10, 55, 38, 13, -66, 1, -52, -35, 59, 6, -60, -35, 20, -11, -20, 58, -50, 27, -1, -27, 0, 33, 36, 39, -22, -6, 0, -43, -34, -4, -2, -27, -37, -19, -48, 30, -59, 33, -79, 27, -51, 38, -46, 7, 99, 0, 46, 21, -39, -13, -1, -87, 22, 65, 42, -47, -66, -109, 73, 77, 47, -79, -17, 28, 8, -2, -2, -36, -12, 35, -41, 25, -1, 13, -17, 57, 98, -31, -26, -23, -3, -8, -13, -33, 22, -13, 6, 63, -64, -12, 5, -11, 0, 27, 5, 50, 35, -7, 11, 64, 9, 30, 31, -14, 2, 53, 23, 54, 21, -19, -30, 90, -20, -16, -69, -5, -7, -79, 6, -2, 23, 8, 18, -11, -14, 8, 21, 16, -14, -58, -37, -8, -86, -34, -22, 7, -39, -14, 5, 27, -78, -2, -5, -39, 42, 1, 4, 22, 16, -7, -2, 48, -26, -68, -48, -37, -7, -26, -27, 44, 9, 4, -33, 47, -59, -19, 10, 44, 56, -123, -38, 111, -25, -10, 18, 29, 8, -41, -26, -51, 9, 20, 68, 46, -44, 45, -67, 0, 41, 35, 39, -28, 13, 21, 25, 71, -28, 32, -18, 59, 14, 7, 10, -40, 20, 72, 51, -26, -18, -25, -35, 39, -34, 23, 127, -24, -26, 77, -88, 104, 45, 37, 31, -36, 23, -34, -1, -50, 0, -35, -45, -8, 40, 1, -51, 71, -60, 4, -18, -26, 19, 20, 1, 30, 6, -20, -13, 3, 23, 88, -14, 
-12, -31, -36, 51, 15, -4, 13, 5, -42, 17, 29, 13, 23, -17, 8, 23, 25, -36, -60, 22, 57, 4, 2, 29, -36, -41, 34, 12, -34, 46, 10, -28, 31, 18, 11, 4, 3, 7, 19, -30, 25, -56, 7, 7, 0, 64, -35, -33, 19, -72, -35, -20, -79, -81, 2, -1, -54, -17, -6, -24, 97, 47, -46, 48, -12, -33, -20, 43, 7, -16, 45, -5, 27, -7, -8, 19, 43, -43, 15, -21, 35, -35, -18, -39, -21, 18, 4, 13, 12, 12, 57, 0, -11, 121, 15, 58, 29, -86, 11, -42, 17, 47, -18, -27, -29, -26, 55, -19, 20, -6, 34, 0, -9, 4, 7, 27, -17, -35, -4, -20, 11, 4, 36, 5, -7, 27, -40, 127, 23, -30, -111, 37, -15, -35, -22, 5, -17, -23, -36, -23, 45, -38, 16, 47, 5, -49, 52, -28, -20, -6, -51, -50, -53, 33, 4, 16, -63, -2, 13, -36, -37, -19, -9, -42, 46, -14, -22, 72, 93, 106, -27, -5, 13, -23, -47, 4, 25, -6, -30, 22, -45, -96, -34, 22, -44, 43, 40, -2, -9, -45, 15, -11, 23, 18, 0, -44, 11, 25, -30, -29, -6, -19, -20, 47, 35, 39, -24, -19, 25, 19, -11, -13, 2, -50, -4, -9, -22, 17, -2, -65, 37, 15, 30, 15, 107, -47, 28, 11, 18, -22, 53, -41, 58, 8, -14, -28, -8, -10, 11, -18, 20, -38, 4, 0, -18, -13, 26, -51, 20, -23, 23, 52, 5, -3, 25, 3, 27, 28, 60, 1, -13, -21, -14, 10, 7, 12, 21, 0, -5, -39, 7, 3, -2, 4, 42, -45, -12, 38, 0, -10, -7, -39, 6, -37, 24, 17, -37, 26, 13, -60, -22, 27, 36, 5, 54, -21, -19, 30, -79, 17, 19, -24, 17, 111, -54, 61, -56, 7, 86, 17, 60, 11, 26, -6, 59, 16, 21, 25, -17, 13, 15, 7, -13, -83, -2, -17, 39, 21, 60, 33, 40, -69, 36, 14, 19, -3, -2, -37, 14, -4, -40, -9, 3, 49, 16, 54, -6, 3, -11, -4, 4, -6, 25, -65, 47, -25, -29, -41, 31, 57, -35, 30, -7, -3, -27, -36, -23, -34, 39, -2, -25, 2, 58, 11, 16, -14, -55, -7, -7, -110, -14, -47, -85, 77, 71, -10, 6, 13, -72, -32, 69, 7, -27, 9, -41, -40, -28, 30, -12, 26, -58, 74, -1, -50, 37, -81, -41, 42, -49, -22, 25, 0, 86, -8, -4, -1, -17, 1, 58, 12, -34, -42, -24, -33, 23, 2, 23, 3, -44, -33, -19, 14, -70, 7, 25, -13, -90, -57, -29, -11, -46, -34, 6, 14, 79, 108, 26, 31, 3, -9, 27, 66, 2, 41, -17, -19, 62, 23, 48, -20, 6, -88, 74, -59, -53, 67, 
-77, -32, 1, -3, -43, 22, -45, -34, 20, 60, 58, -65, -48, 116, 76, 127, 24, -29, 59, 10, -20, -57, -19, -3, 35, 19, 3, 34, 6, 55, 27, 35, -4, -55, 32, 22, -4, -12, -34, -50, -16, 0, -22, 75, -48, -51, -26, -12, 1, -9, -17, -26, -4, -60, -128, -3, -19, -23, -17, -4, -5, -5, 37, -8, -21, -1, -16, 49, 6, -31, -21, -18, -13, 33, -11, -29, 16, -31, 41, -19, 0, 57, -4, -9, 16, 27, -27, 6, 104, -53, 39, -6, -8, 3, 0, 4, 39, -46, 33, 10, 26, -19, 53, 41, 31, 15, 12, 2, 44, -67, -18, -88, -29, 27, 3, 55, -8, -6, 38, 0, 13], [-18, -43, -29, 10, -43, -28, 0, -20, -10, 81, 107, 17, 35, -44, -27, 54, -4, 31, 17, -23, 19, -18, -41, -67, 31, -74, -39, 18, -4, 2, 13, 37, -7, 51, -7, 42, 9, 11, 43, 22, 12, 0, -32, -20, -39, 9, -21, -28, 27, -33, -11, 25, 11, -44, -24, -38, 109, -75, 73, -125, -89, -59, 103, 43, 20, -14, -24, 8, -3, 55, 22, -23, -4, 8, -25, -1, 28, 37, 28, 0, -27, 13, 40, -8, -43, -16, -39, -13, 9, 7, -11, 42, -32, 63, 2, -42, 2, 14, -34, -30, 17, -45, 21, -19, -41, 123, 32, 55, -63, -7, 11, -128, 28, 7, -29, -18, -17, 51, 7, 46, 25, 70, 61, -86, -7, -8, -27, -92, 88, 8, -20, 19, -42, -29, -19, 5, -10, 38, -8, 68, -45, -51, 46, 0, 5, 39, 35, -16, 2, -56, -16, 16, -26, 6, 21, -12, -28, 6, 53, 31, -35, -5, 20, 20, -1, 0, 46, 14, 33, 30, 29, -4, 56, 8, -21, -3, -3, -122, -24, 127, 71, 5, -128, 83, 30, -52, -14, -49, 29, 23, -21, 4, -45, -22, 16, 39, -64, 29, -2, -31, 18, 10, 27, 2, 4, 18, -13, 31, 91, 23, -37, -2, 2, -32, -69, 14, -7, 8, -38, 47, -45, 6, -52, -2, -24, 44, -50, 28, 18, 21, 98, -20, -25, 53, -2, 16, 68, 29, 14, -23, -4, -91, -40, 40, -30, 46, 17, 11, 37, 24, -18, -65, 13, -110, 39, 13, 15, 69, -78, 31, -39, 54, 43, -4, -21, 13, -36, -21, -62, 51, 56, -66, 8, 59, -80, 23, -13, 6, -2, 38, -17, 55, 11, -17, -19, -20, 23, -5, -13, -47, 31, -16, -21, 15, -26, -35, -39, 1, 28, -15, -52, 63, -3, 8, -9, 1, -20, 4, 0, -34, -19, 27, 17, -9, -11, -42, -10, 0, -66, -34, -7, -21, 17, -1, -11, 1, -7, 10, -5, 7, 127, 72, 37, 0, 49, -14, 28, -32, -11, 20, 31, 30, 0, 
-71, -50, 66, 9, 25, -28, 29, -43, -40, -27, -13, -1, -78, 23, 46, -33, -22, 1, -11, -22, -16, 36, -26, -24, 7, 5, 2, -29, 30, -87, -21, -5, 49, 0, -50, 23, -13, -11, 29, 24, 44, 3, 30, -44, -9, 13, 3, -10, -16, 16, -27, 54, -28, 6, 110, -99, -21, 127, 2, -1, 52, -86, 94, 23, 36, 22, -18, -14, 5, -59, 0, -26, -22, -103, 0, -18, 17, -50, 99, -72, 28, 48, 47, 9, -48, 51, 40, 45, -15, -34, -6, 14, 103, -1, 48, -21, 0, 41, -9, -6, 66, -11, 4, -33, -2, 52, 0, 16, -8, 58, 3, -33, -9, 50, 51, 20, 43, 64, 0, -53, 39, -41, -20, 98, 5, -49, 18, -39, 25, 5, 30, -9, 57, -31, 3, -41, 32, -2, 11, 33, -27, -47, 36, -76, -34, -4, -47, -51, -19, 31, -30, 14, -14, -30, 100, 42, -52, 47, -24, -77, -1, 45, 9, 20, 52, 4, 83, 44, 5, 45, 49, -15, -3, 81, 2, 22, -23, -39, -27, 20, 32, -14, 10, -21, 17, 13, 32, 77, -9, 45, 29, -51, -24, -4, 29, 22, -50, 7, 10, -25, -2, -20, 30, -35, 27, -12, -1, -9, -15, 27, -5, -29, -85, -52, -20, 16, 68, 48, -23, -6, -20, 92, 19, -63, -128, 30, -9, -51, -36, 54, 45, -24, -41, 10, 36, -32, -28, -2, 25, 3, 44, -61, 32, 33, -7, -31, -2, 20, 7, -31, -17, -2, 19, 63, 26, 61, -4, -18, 23, 3, -26, -11, 59, 45, -22, 14, 7, -11, -30, -21, 48, -18, -25, 2, -20, -25, -38, 15, -4, 5, 8, 18, -37, -42, -56, -41, -10, -67, -2, -54, -86, -4, 49, 1, -2, -21, -11, 59, 14, 10, -59, -62, -15, -15, -19, 3, 6, -19, -1, -46, 4, 51, -17, -32, 37, 1, 13, 19, 114, -11, 6, 21, -12, 1, 21, -22, 10, -31, 11, -51, -39, -4, 2, 78, 30, 28, -97, -8, -53, -12, 15, -19, 30, -15, -2, 43, 15, 53, 93, 0, 55, 23, 19, -23, -51, -3, 1, 2, -26, 14, -27, -15, 61, 26, -16, 9, 4, 12, 24, -16, -14, 43, -20, 35, 34, -18, -14, -33, -20, 4, -29, 20, -6, -37, -21, 13, 40, -36, 30, 12, -34, -3, -52, -5, 4, -58, -21, 57, -29, 11, -48, -15, 12, -4, 26, -22, -43, 4, 41, -58, 25, 10, 17, -33, 33, -60, 30, -50, 20, 34, 12, 20, 65, -57, -1, -16, 41, 26, 92, 20, 16, -128, -11, 2, -30, -41, -17, 35, 9, 67, 3, -21, -13, 17, 19, 8, 26, -37, 47, 10, 33, 34, 35, 14, -12, 55, -5, -65, -14, -84, 5, -30, 35, -5, -27, 
22, 34, 16, 32, 0, 33, -12, 72, -68, 0, -50, -50, 20, 37, -43, 74, 0, -11, 15, -43, 23, -49, -29, -35, 61, 47, 85, -3, -9, -53, 41, -20, -14, 11, -59, 1, -1, -56, -12, 11, 19, 4, 38, -76, -23, 18, 20, -15, -25, -55, -69, 53, 54, -82, -60, 6, 17, -74, -40, -25, -40, -83, 0, 33, -73, -101, -50, -39, -26, 43, -67, 1, -51, -11, 24, -26, 23, -43, 7, 11, 73, 3, 59, 2, -7, 89, 54, 58, -37, 1, -84, 88, -65, -56, -11, -60, -58, 21, 25, -54, 16, -42, -58, -14, 24, 17, -41, 27, 48, 40, 79, 13, -38, 31, -8, 15, 4, 6, 3, -33, -59, 45, 55, -7, -12, 0, 10, -22, -17, 35, 4, -7, 4, -23, -34, -23, -3, 58, 50, -32, -72, -37, -56, 36, -46, 6, 3, 12, -39, 20, 13, 37, 15, 9, 0, -28, -21, -14, 4, 17, 57, 1, -24, -12, -1, -14, -11, -47, -13, 0, -36, -44, -4, -43, -48, -33, -4, 8, -19, 12, -23, 24, -10, 18, 19, 38, -5, -6, 54, -28, 41, -19, -28, -19, 17, -26, -41, 9, -15, 90, 33, 20, -6, 82, -60, -40, -84, -36, -3, -62, 6, -34, -10, -31, -31, 20], [5, 6, -45, -35, -3, -37, -11, -10, -18, 36, 26, 57, -17, -33, 25, -7, -3, 66, 44, 8, 1, 40, -34, -57, 2, -34, -7, 12, 9, 2, 20, -4, 19, 37, 2, 58, -16, 28, 50, -19, 20, -14, -59, -48, -98, 61, -61, 48, 7, -50, -4, -35, 11, -32, -38, -37, 60, -75, 47, -48, -56, -41, 69, 12, 3, -1, 85, 32, -35, -21, 23, 17, 12, -33, -6, -30, -2, -14, 39, 12, 34, 64, -5, -65, 24, -60, -14, -20, -58, -27, -48, -33, -87, 6, 11, -23, -33, -87, 3, 22, 52, -50, 73, -2, -33, 99, 4, 86, -9, 8, 18, -104, 40, -12, -64, -19, -3, -5, 11, -18, -4, 13, 77, -36, 32, 7, -56, -34, 65, -40, -24, 76, -62, -88, -58, 32, 4, 22, 23, 8, -2, -43, 39, 39, -6, -24, 8, 14, 13, 22, -42, 21, -36, 4, 26, 30, -25, 32, 0, 66, 12, 7, 16, 8, -16, -21, 9, 7, 29, -26, -4, 12, 55, 21, 16, 69, -2, -53, -50, 127, 98, -3, -128, 116, -8, -27, 11, 19, -4, 10, 16, 8, -38, -22, 66, 11, -37, -19, 30, -62, 29, 47, -34, -12, -57, -16, -14, 35, 47, 38, -16, -7, -6, -4, -18, -24, -23, -48, -25, -17, -7, 22, -27, 64, -23, 1, -29, 5, -32, 14, 18, 16, 23, 37, -1, 21, 46, -50, 19, 21, 49, -71, -17, -17, 34, 16, 33, 
-31, -4, 69, -57, 39, 3, -43, -22, 69, -50, 33, -32, 8, -7, 112, 84, 17, -23, -4, 1, 0, -9, -14, 26, -22, 17, 102, -47, -15, 26, -22, -32, 40, -22, 29, -7, -26, -21, -55, 11, -4, 16, -9, 39, 14, 8, 36, -13, -32, -88, 38, 22, -19, -69, 43, -15, -15, 8, 4, -29, 21, 23, -13, -55, 9, 23, -32, 21, -37, -23, -4, -55, -3, 28, -28, 19, -48, -1, -20, -59, 2, -30, 42, -9, 47, 24, 100, -12, 9, -9, 12, -41, -10, -49, -11, 16, -64, 21, 49, 33, 27, -46, 68, -75, -44, 3, 41, 62, -81, -31, 72, 13, -30, -28, 27, -22, 16, -12, -24, 8, 25, 16, 11, -64, 34, -13, -11, 8, 29, 16, -29, 16, 20, 38, 44, 22, 13, 12, 29, -23, -26, 25, -25, -8, 27, 41, -23, 10, -7, -45, 0, -63, 16, 127, -21, -8, 52, -59, 74, 55, 40, 18, 2, -12, -9, -42, -8, -11, -9, -71, 1, -2, 27, -50, 80, -62, 21, -4, 16, -25, 10, -8, -9, 0, -32, -8, -3, -11, 57, -5, 37, 0, -41, 52, 29, -20, 18, -18, -22, 46, 29, 36, 8, 21, -25, 42, 9, -30, 49, 22, 13, -3, 33, 35, 25, -75, -13, -33, -77, 95, -2, 1, -16, -49, 92, -27, 7, 13, 77, -13, -13, -42, 17, -57, 19, -30, -12, -45, 28, -45, -13, -8, 0, -16, 2, 47, -28, -9, 30, -38, 127, 39, -30, 15, -18, -16, 10, 14, -9, 41, 27, 18, 63, 14, 3, 45, 5, -24, 41, -36, 46, -32, -28, 4, -10, 18, 35, 0, -15, -15, -24, -29, 32, 43, 16, 23, -14, 7, -13, -54, 11, 69, 40, -2, -9, -26, 82, 0, -24, -27, 38, -94, 54, -31, -22, 20, -27, -13, -128, -39, -22, 47, 78, 36, 6, -4, -45, 33, 17, -37, -103, 55, -41, -42, -46, -17, -29, 8, 11, 25, 60, 10, -28, 54, -65, -62, -10, -40, 26, 30, 13, -24, -25, -17, -4, -15, -54, 19, 48, -47, -38, -3, 6, 1, 18, -6, -13, -1, 63, 111, -18, -10, -11, -6, -62, -19, 53, -25, 9, 75, -50, -42, -43, 2, 26, 5, 0, -25, -62, -21, -27, -25, -1, 1, 19, -47, -37, -16, 13, -23, -40, -3, 19, 23, 38, 43, -102, 71, 5, -13, 52, -18, -29, -68, 2, -48, 28, 54, 9, -12, -14, -37, 3, 50, 63, -109, 63, 8, 21, -28, 58, -30, -2, 22, -14, -37, 28, 9, -48, 14, 1, -94, 10, -8, 6, 45, -5, -39, 26, -43, 5, 50, 55, -5, 6, 9, 11, 4, 29, -4, -6, -16, -56, 6, 16, 0, 14, 8, 39, -35, 10, 38, 28, 43, 
-11, -128, -36, 22, 2, -32, 28, 30, -13, 1, -40, 5, 13, 24, -52, -16, 16, 15, 15, -41, 91, -32, -43, 28, -21, -32, -2, -57, 25, 70, -32, 37, -25, -19, 67, 2, 53, 10, -25, -9, 79, -20, -20, 30, -35, -31, -36, -26, 20, -71, 25, 18, -18, 8, -31, 12, 118, -27, 43, 24, 15, 5, 49, -66, 21, -33, -40, 14, 52, 35, 72, 36, -38, 24, 0, -20, -19, 5, -32, -60, 51, 29, 21, -1, 40, 82, 24, 88, 8, -45, -29, -83, -37, -39, 42, -10, -34, 36, -7, 19, -18, -10, 0, -39, 35, -98, -27, -35, -15, 28, 44, -7, 13, -31, -24, 25, -29, 13, 62, 8, 14, 64, 61, 107, -42, 51, -86, 70, 0, -18, 60, -76, -17, 11, -70, -11, 32, -15, 60, 14, -43, -14, 0, 7, 20, 7, -24, -1, -28, -42, 66, 29, -2, 1, -10, -60, 2, -1, -48, 8, 78, -25, -62, -57, 29, -41, 46, -36, 41, -78, -19, -10, -33, 6, -19, -18, -12, 44, 10, 22, -52, 1, 10, 37, 50, -24, -15, -50, 13, -22, -29, 44, -74, -50, 20, 44, -18, 50, -42, -53, 25, 35, 46, -30, -39, 76, 12, 127, 21, -9, 10, 59, -43, -11, 22, 45, -20, 8, 17, 10, -27, 14, -11, 43, -22, -108, 72, -1, -1, -37, -29, 6, 50, -15, -12, 76, -51, -91, -48, -45, -9, -14, -31, 28, 16, -47, -41, 8, 10, 20, -17, -19, -35, 13, -8, -5, 4, 80, 46, 20, 35, -44, 39, -22, -54, -11, -9, -38, -28, 9, 6, -4, 3, -24, -63, 43, 56, 2, 9, -12, 127, -65, 22, -30, -9, -41, -43, -23, 50, -43, 61, 24, 81, -35, 36, 53, 30, -23, 43, -38, 43, -40, 13, -18, 0, 0, 3, -52, -45, 8, 46, -16, 38], [-47, -19, -104, 29, -32, -72, 0, 5, -53, 69, 56, 17, 0, -38, -10, 10, -44, 69, 20, -17, -2, -45, -19, -128, 34, -4, -64, 21, -23, -9, 13, 21, 28, 55, -52, 39, -1, 24, 0, 30, 2, 5, 28, 3, -30, 19, -33, -47, 27, -35, -29, -28, 5, -41, -40, -60, 43, -21, 49, -92, -60, -22, 59, 65, 35, -10, 24, 35, -76, 31, 35, -58, -4, -13, -47, 4, 12, -7, 18, -14, -36, 47, -8, -35, -28, -15, -41, -18, -72, -38, -39, 36, -128, 20, 44, -26, -8, 14, 1, 17, -20, -23, 1, -38, -30, 85, 61, 81, -16, 4, 2, -39, 40, -77, -22, 26, 24, 48, 56, 41, 25, 99, -37, -16, 41, 50, 16, -61, -25, -18, -34, 48, 60, 20, -16, 0, 28, -17, 28, 12, 49, -46, 13, 47, -7, 
10, 19, 15, 19, -26, 8, -24, -22, 12, -5, 7, -28, -4, 32, 21, 38, 16, 16, -1, -15, -32, -32, 12, 9, 47, 9, -5, -60, -39, -35, -14, -9, -10, -65, 127, 93, -48, -118, 63, 58, -71, 21, -58, -32, 37, -21, 33, 1, 0, -21, -27, -17, 12, -17, 0, -50, 64, -39, 52, 11, 24, -11, 33, 0, 0, -4, -37, 23, -52, -11, 31, -8, 1, -30, -1, -8, 6, -33, 43, 34, 27, -36, 7, 39, 19, 8, 30, 11, -12, -33, 33, 103, -15, 0, 8, 28, -66, -45, -18, 69, 64, 27, -22, 27, 60, -35, -49, 12, -86, -29, 50, 51, 71, -25, 27, -50, 72, 11, -1, -59, 63, 13, 29, -29, 8, 32, 41, -13, 60, -39, 58, -27, 0, -5, 48, 16, 65, 36, -11, 18, 59, 18, -7, -19, -41, 25, 6, 9, -9, 0, -30, 18, 73, 2, -31, -74, -32, -43, -44, 38, -18, 25, -4, 31, 31, -66, 35, 21, -39, 20, -36, -41, -6, -71, -11, -76, -57, 16, -51, -20, 28, -60, 34, 7, -1, 102, 58, 46, 3, 32, -37, -4, 32, -26, -4, -27, -56, -9, -47, -59, 62, 1, 18, -26, 18, -28, -30, 26, 46, 13, -18, -19, 53, -32, -27, -48, 13, -39, -1, 15, -15, -11, 29, 33, -5, 25, -21, -4, -16, 27, -46, 4, -40, 51, 21, 13, 0, 14, -16, 56, -28, -68, -22, 60, 52, 20, -29, 52, -27, -25, -28, -33, 34, -30, 39, 15, -16, -30, 19, -44, 4, 5, -29, 14, 34, -38, -62, -54, -72, 8, 2, -98, -33, 71, 36, -85, 71, -68, 51, -7, 3, 2, -9, 59, 62, 19, 5, -17, 4, 19, 9, -21, -24, 10, -20, 68, 3, -18, 59, -11, -2, -4, 20, 17, -7, 35, -12, 30, 20, -48, 72, 73, 30, 15, 91, 35, 19, -13, 0, -43, 12, 43, 8, -32, -2, -17, 35, -1, 51, 2, 100, -37, 44, -112, 33, -64, 36, 20, -50, -63, 31, -75, -69, 11, 1, -68, -1, 34, -19, 3, 39, -12, 38, 7, -54, -8, 0, 26, 21, 6, -8, -9, 2, -16, 18, -37, -19, 76, -18, 16, -31, 112, -42, 53, -5, -9, -13, 55, 19, -8, -39, 61, -11, -13, 3, 56, 33, 30, 4, -25, -46, -14, -32, 13, 9, 11, 4, -46, 38, -9, 39, -47, 27, -1, 59, -49, -80, 49, -32, -17, -44, 14, -40, 3, 17, 32, 17, -23, -39, 32, 11, -38, -87, 118, -81, -52, -59, -1, -24, 39, 11, 10, 24, 16, 18, 8, 29, 0, 27, 7, -21, 7, -38, 8, -31, -14, -15, 27, -83, -18, -21, -28, -15, -6, 0, 22, 16, 0, -15, -34, 21, 62, 35, 2, 39, -13, -64, 
30, -1, 31, 9, -5, 24, -84, -6, -14, -38, 15, -76, -39, -27, -21, -73, -64, 9, -84, 10, -46, -92, -29, 79, -8, -45, -3, 38, 32, 20, -13, -35, -4, 0, -24, -12, -17, 4, 2, 4, -11, -11, 12, 14, -41, 49, 1, 6, -37, 49, 5, -6, -31, 48, 82, 54, 8, -1, 8, 69, -13, -42, 12, 24, 16, 15, 8, -61, -4, -6, 54, -34, -20, 14, 11, 8, 21, 0, 8, 43, -14, 29, 54, 30, -38, -46, 15, 22, 8, -22, 23, -32, -7, 22, 15, -6, -3, -12, 5, 47, -16, -8, 35, -45, -31, -40, -71, 22, -41, 22, 7, -37, 6, 24, -62, 14, 19, 40, -34, 53, -26, -44, 27, -23, 8, 42, 34, -4, 39, -33, 64, -13, 27, 59, 8, 87, 61, 43, 16, 43, -3, 4, 33, -3, 10, -39, -20, -8, -71, 11, 5, 44, -8, 40, -44, 6, -2, 0, -44, 65, -10, -4, -34, -32, 13, -32, -19, -16, 10, 0, 15, 12, -1, 8, 51, -5, -20, -7, 29, 47, -39, -1, -3, 37, 14, -8, 86, -15, -15, -94, -7, -25, 6, 27, -29, 12, 10, 44, 4, -36, 9, -17, 108, 35, -72, -9, -56, -55, 54, 53, -28, 32, 37, -6, 5, 5, -1, 26, 39, 14, -29, -24, 54, 13, 0, -128, 95, 0, -38, 40, -90, -7, -46, -23, 36, 11, 30, 26, 25, -49, -101, 44, 20, 59, 12, -40, -77, -35, 27, -10, 32, 18, 2, -107, -50, -60, -19, -128, 4, 76, 9, -71, -35, 7, 5, -50, -6, 9, -101, -68, -1, -11, 2, -95, 19, -10, 34, -22, 48, 11, 8, 78, 34, 8, -58, -14, -31, 29, -35, 3, -39, -42, 4, -12, -11, -41, 1, -33, 24, 30, 70, 48, -37, -9, 127, 61, 127, -5, -19, 30, 4, -9, -29, 10, -57, -7, 5, -16, -17, -26, 9, 47, 34, -3, 2, 17, 11, 32, 1, 20, -31, -22, 24, 8, 51, 3, -25, 44, 41, 32, -75, 14, -25, -32, -45, -28, 41, 64, 37, 12, 18, -34, -42, -59, -10, 40, 46, 4, -1, 0, -31, 20, -15, -56, 18, -19, 16, 51, -36, 44, -61, 15, 9, 1, -22, -14, -12, 4, -21, 13, 9, 31, 12, 10, 62, -45, 35, 23, 3, 0, -19, 0, 10, -10, -44, 35, -10, -89, -48, 41, 12, -32, -7, -27, -13, -23, -67, -34, -6, -16, -33, 16]] uint8: None binary: [[-110, 121, 110, -50, 87, -59, 8, 35, 114, 30, -92, -112, -118, -16, 7, 96, 17, 19, 97, -9, -23, 25, -103, -35, -78, -45, 72, -123, -41, 67, 14, -31, -42, -126, 75, 111, 62, -64, 57, 64, -52, -66, -64, -12, 100, 99, 87, 61, -5, 
5, 23, 34, -75, -66, -16, 91, 92, 121, 55, 117, 100, -112, -24, 84, 84, -65, 61, -31, -45, 7, 44, 8, -35, -125, 16, -50, -52, 11, -105, -32, 102, -62, -3, 86, -107, 21, 95, 15, 27, -79, -20, 114, 90, 125, 110, -97, -15, -98, 21, -102, -124, 112, -115, 26, -86, -55, 67, 7, 11, -127, 125, 103, -46, -55, 79, -31, 126, -32, 33, -128, -124, -80, 21, 27, -49, -9, 112, 103], [-110, -7, -24, 23, -33, 68, 24, 35, 22, -50, -32, 86, 74, -14, 71, 96, 81, -45, 105, -25, -73, 108, -99, 13, -76, 125, 73, -44, -34, -34, -105, 75, 86, -58, 85, -30, -92, -27, -39, 0, -91, -2, 30, -12, -116, 9, 81, 39, 76, 44, 87, 20, -107, 110, -75, 20, 44, 125, -75, 85, -28, -118, -24, 127, 78, -75, 108, -20, -48, 3, 12, 12, 71, -29, -98, -26, 68, 11, 0, -104, 96, 70, -3, 53, -98, -108, 127, -102, -17, -84, -88, 88, -54, -45, -11, -4, -4, 15, -67, 122, -108, 117, -115, 40, 98, -47, 102, -103, 3, -123, -85, 119, -48, -24, 95, -34, -26, -24, -31, -9, 99, 64, -128, -43, 74, -91, 80, -95], [64, -14, -4, 30, 118, 5, 8, 35, 51, 3, 72, -122, -70, -10, 2, -20, 17, 115, -67, -11, 115, 31, -103, -73, -78, 65, 64, -123, -41, 91, 14, -39, -41, -78, 73, -62, 60, -28, 89, 32, 33, -35, -62, 116, 102, -45, 83, 63, 73, 37, 23, 64, -43, -46, -106, 83, 109, 92, -87, -15, -60, -39, -23, 63, 84, 56, -6, -15, 20, 3, 76, 3, 104, -16, -79, 70, -123, 15, -125, -111, 109, -105, -99, 82, -19, -27, 95, -113, 94, -74, 57, 82, -102, -7, -95, -21, -3, -66, 73, 95, -124, 37, -115, -81, 107, -55, -25, 6, 19, -107, -120, 111, -110, -23, 79, -26, 106, -61, -96, -77, 9, 116, -115, -67, -63, -9, -43, 77], [-109, -7, -32, 19, 87, 116, 8, 35, 54, -102, -64, -106, -14, -10, 31, 78, -99, 59, -6, -45, 97, 96, -103, 37, 69, -35, 9, -59, 95, 25, 14, 73, 86, -9, -43, 110, -70, 96, 45, 32, -91, 62, -64, -12, 100, -55, 34, 62, 14, 5, 22, 67, -75, -17, -14, 81, 45, 125, -15, -11, -28, 75, -25, 20, 42, -78, -4, -67, -44, 11, 76, 3, 127, 40, 0, 103, 75, -62, -123, -111, 68, -13, -10, -5, -66, -89, 119, -70, -29, -95, -19, 82, 106, 127, -24, -11, 
-48, 15, -29, -102, -115, 107, -115, 55, -69, -61, 103, 11, 3, 25, -118, 63, -108, 11, 78, -28, 14, 124, 119, -61, 97, 84, 53, 69, 123, 89, -104, -127]] ubinary: None } meta: {'api_version': {'version': '1'}, 'billed_units': {'input_tokens': 106}} }
As we can see from the above, we got multiple vector representations for the same input strings.
print(
    "int8 dimensionality: {}, binary dimensionality {}".format(
        len(embeddings.embeddings.int8[0]), len(embeddings.embeddings.binary[0])
    )
)
int8 dimensionality: 1024, binary dimensionality 128
Defining the Vespa application¶
First, we define a Vespa schema with the fields we want to store and their type.
Notice the binary_vector field, which defines an indexed (dense) Vespa tensor with the dimension x[128]. Indexing specifies index, which means that Vespa will build an HNSW graph for searching this vector field. Also, notice the configuration of the distance-metric.
We also want to store the int8_vector on disk; we use paged to signal this.
from vespa.package import Schema, Document, Field, FieldSet
my_schema = Schema(
    name="doc",
    mode="index",
    document=Document(
        fields=[
            Field(name="doc_id", type="string", indexing=["summary"]),
            Field(
                name="text",
                type="string",
                indexing=["summary", "index"],
                index="enable-bm25",
            ),
            Field(
                name="binary_vector",
                type="tensor<int8>(x[128])",
                indexing=["attribute", "index"],
                attribute=["distance-metric: hamming"],
            ),
            Field(
                name="int8_vector",
                type="tensor<int8>(x[1024])",
                indexing=["attribute"],
                attribute=["paged"],
            ),
        ]
    ),
    fieldsets=[FieldSet(name="default", fields=["text"])],
)
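A back-of-the-envelope estimate shows why this schema keeps the binary field in memory and pages the int8 field from disk. The sketch assumes one billion documents and counts raw vector bytes only; HNSW graph overhead and other per-document data are deliberately ignored.

```python
num_docs = 1_000_000_000  # assumption: one billion documents
dims = 1024

bytes_per_vector = {
    "float32": dims * 4,   # 4096 bytes
    "int8": dims * 1,      # 1024 bytes
    "binary": dims // 8,   # 128 bytes, 8 dimensions packed per int8 cell
}

for name, size in bytes_per_vector.items():
    total_gb = num_docs * size / 1e9
    print(f"{name:>8}: {size:>5} bytes/vector, about {total_gb:,.0f} GB in total")
```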
We must add the schema to a Vespa application package. This consists of configuration files, schemas, models, and possibly even custom code (plugins).
from vespa.package import ApplicationPackage
vespa_app_name = "coherebillion"
vespa_application_package = ApplicationPackage(name=vespa_app_name, schema=[my_schema])
from vespa.package import RankProfile, FirstPhaseRanking, SecondPhaseRanking, Function
rerank = RankProfile(
    name="rerank",
    inputs=[
        ("query(q_binary)", "tensor<int8>(x[128])"),
        ("query(q_full)", "tensor<float>(x[1024])"),
        ("query(q_int8)", "tensor<int8>(x[1024])"),
    ],
    functions=[
        Function(  # this returns a tensor<float>(x[1024]) with values -1 or 1
            name="unpack_binary_representation",
            expression="2*unpack_bits(attribute(binary_vector)) -1",
        )
    ],
    first_phase=FirstPhaseRanking(
        expression="sum(query(q_full)*unpack_binary_representation )"  # phase 1 ranking using the float query and the unpacked float version of the binary_vector
    ),
    second_phase=SecondPhaseRanking(
        expression="cosine_similarity(query(q_int8),attribute(int8_vector),x)",  # phase 2 using the int8 vector representations
        rerank_count=30,  # number of hits to rerank, upper bound on number of random IO operations
    ),
    match_features=[
        "distance(field, binary_vector)",
        "closeness(field, binary_vector)",
        "firstPhase",
    ],
)
my_schema.add_rank_profile(rerank)
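To make the first-phase expression concrete, here is a NumPy emulation of `2*unpack_bits(attribute(binary_vector)) - 1` followed by the dot product with the float query. It is a sketch on a tiny 16-bit example, and it assumes that bits are unpacked most-significant-bit first, which is the default of `np.unpackbits`; check the Vespa documentation for the bit order of `unpack_bits` before relying on this.

```python
import numpy as np


def unpack_binary_representation(packed_int8):
    """Map packed int8 cells to a float vector with values -1.0 or +1.0."""
    as_bytes = np.asarray(packed_int8, dtype=np.int8).view(np.uint8)
    bits = np.unpackbits(as_bytes)  # most significant bit first
    return 2.0 * bits - 1.0


# Two int8 cells: -128 is 0b10000000 and 3 is 0b00000011 in two's complement.
packed = [-128, 3]
unpacked = unpack_binary_representation(packed)
print(unpacked)

# Hypothetical float query of matching length (16 dimensions here instead of 1024).
q_full = np.linspace(-1.0, 1.0, 16)
first_phase = float(np.sum(q_full * unpacked))
print(first_phase)
```

Because the document side only contributes a sign per dimension, the float query still carries magnitude information into this phase, which is what lets it refine the pure hamming ordering.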
Deploy the application to Vespa Cloud¶
With the configured application, we can deploy it to Vespa Cloud.
To deploy the application to Vespa Cloud, we need to create a tenant:
Create a tenant at console.vespa-cloud.com (unless you already have one). This step requires a Google or GitHub account, and will start your free trial.
Make note of the tenant name; it is used in the next steps.
Note: Deployments to dev and perf expire after 7 days of inactivity, i.e., 7 days after running deploy. This applies to all plans, not only the Free Trial. Use the Vespa Console to extend the expiry period, or redeploy the application to add 7 more days.
from vespa.deployment import VespaCloud
import os
# Replace with your tenant name from the Vespa Cloud Console
tenant_name = "vespa-team"
# Key is only used for CI/CD. Can be removed if logging in interactively
key = os.getenv("VESPA_TEAM_API_KEY", None)
if key is not None:
    key = key.replace(r"\n", "\n")  # To parse key correctly
vespa_cloud = VespaCloud(
    tenant=tenant_name,
    application=vespa_app_name,
    key_content=key,  # Key is only used for CI/CD. Can be removed if logging in interactively
    application_package=vespa_application_package,
)
Now deploy the app to the Vespa Cloud dev zone.
The first deployment typically takes 2 minutes until the endpoint is up.
from vespa.application import Vespa
app: Vespa = vespa_cloud.deploy()
Feed our sample documents and their binary embedding representation¶
With few documents, we use the synchronous API. Read more in reads and writes.
for i, doc in enumerate(documents):
    response = app.feed_data_point(
        schema="doc",
        data_id=str(i),
        fields={
            "doc_id": str(i),
            "text": doc,
            "binary_vector": embeddings.embeddings.binary[i],
            "int8_vector": embeddings.embeddings.int8[i],
        },
    )
    assert response.is_successful()
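For more than a handful of documents, it helps to separate building the feed payload from sending it. The generator below is a sketch with a hypothetical name, `to_feed_operations`; it only shapes the data into `id`/`fields` dictionaries and makes no Vespa calls. Whether and how your pyvespa version accepts such an iterable for bulk feeding is an assumption to verify against the pyvespa documentation.

```python
from typing import Dict, Iterable, Iterator, List


def to_feed_operations(
    texts: Iterable[str],
    binary_vectors: Iterable[List[int]],
    int8_vectors: Iterable[List[int]],
) -> Iterator[Dict]:
    """Yield one dict per document with the fields used by the schema above."""
    for i, (text, binary, int8) in enumerate(zip(texts, binary_vectors, int8_vectors)):
        # Fail early if a vector does not match the tensor types in the schema.
        if len(binary) != 128 or len(int8) != 1024:
            raise ValueError(f"document {i}: unexpected vector length")
        yield {
            "id": str(i),
            "fields": {
                "doc_id": str(i),
                "text": text,
                "binary_vector": binary,
                "int8_vector": int8,
            },
        }


# Synthetic placeholders standing in for the Cohere response (not real embeddings).
sample_ops = list(to_feed_operations(["a", "b"], [[0] * 128] * 2, [[0] * 1024] * 2))
print(len(sample_ops), sample_ops[0]["id"])
```

Checking the vector lengths before feeding catches a mismatch with the schema locally instead of as a failed write.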
Querying data¶
Read more about querying Vespa in:
- Vespa Query API
- Vespa Query API reference
- Vespa Query Language API (YQL)
- Practical Nearest Neighbor Search Guide
We now need to invoke the embed API again to embed the query text; we ask for all three representations:
query = "Who discovered x-ray?"
# Make sure to set input_type="search_query" when getting the embeddings for the query.
# We ask for 3 versions (float, binary, and int8) of the embeddings.
query_emb = co.embed(
    [query],
    model="embed-english-v3.0",
    input_type="search_query",
    embedding_types=["float", "binary", "int8"],
)
print(query_emb)
Now, we use Vespa's nearestNeighbor query operator to expose up to 1000 hits to ranking using the configured distance-metric (hamming distance).
This is the retrieval logic, or phase-0 search, as it only uses the hamming distance. See phased ranking for more on phased ranking pipelines.
The hits that are near in hamming space are exposed to the flexibility of the Vespa ranking framework:
- The first phase uses the unpacked version of the binary vector and computes the dot product against the float query version.
- The second and final phase re-ranks the 30 best hits from the previous phase, here using cosine similarity between the int8 embedding representations.
response = app.query(
    yql="select * from doc where {targetHits:1000}nearestNeighbor(binary_vector,q_binary)",
    ranking="rerank",
    body={
        "input.query(q_binary)": query_emb.embeddings.binary[0],
        "input.query(q_full)": query_emb.embeddings.float[0],
        "input.query(q_int8)": query_emb.embeddings.int8[0],
    },
)
assert response.is_successful()
response.hits
[{'id': 'id:doc:doc::3', 'relevance': 0.45650564242263414, 'source': 'cohere_content', 'fields': {'matchfeatures': {'closeness(field,binary_vector)': 0.0030303030303030303, 'distance(field,binary_vector)': 329.0, 'firstPhase': 4.905200004577637}, 'sddocname': 'doc', 'documentid': 'id:doc:doc::3', 'doc_id': '3', 'text': 'Marie Curie was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity'}}, {'id': 'id:doc:doc::1', 'relevance': 0.337421116422118, 'source': 'cohere_content', 'fields': {'matchfeatures': {'closeness(field,binary_vector)': 0.002544529262086514, 'distance(field,binary_vector)': 391.99999999999994, 'firstPhase': 3.7868080139160156}, 'sddocname': 'doc', 'documentid': 'id:doc:doc::1', 'doc_id': '1', 'text': 'Albert Einstein was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time.'}}, {'id': 'id:doc:doc::2', 'relevance': 0.280400768492745, 'source': 'cohere_content', 'fields': {'matchfeatures': {'closeness(field,binary_vector)': 0.0026595744680851063, 'distance(field,binary_vector)': 375.0, 'firstPhase': 3.854860305786133}, 'sddocname': 'doc', 'documentid': 'id:doc:doc::2', 'doc_id': '2', 'text': 'Isaac Newton was an English polymath active as a mathematician, physicist, astronomer, alchemist, theologian, and author who was described in his time as a natural philosopher.'}}, {'id': 'id:doc:doc::0', 'relevance': 0.2570603626828106, 'source': 'cohere_content', 'fields': {'matchfeatures': {'closeness(field,binary_vector)': 0.0024390243902439024, 'distance(field,binary_vector)': 409.0, 'firstPhase': 2.845644474029541}, 'sddocname': 'doc', 'documentid': 'id:doc:doc::0', 'doc_id': '0', 'text': 'Alan Turing was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist.'}}]
The relevance is the cosine similarity between the int8 vector representations, calculated in the second phase. Note also that we return the hamming distance and the firstPhase score, which is the dot product between the float query and the unpacked binary vector.
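The closeness values in the match features are consistent with `1 / (1 + distance)`. The check below recomputes them from the hamming distances printed above (rounded to whole bits); the relation is inferred from these numbers, so treat it as an observation to confirm in the Vespa rank feature documentation.

```python
# (distance, closeness) pairs copied from the response above, keyed by doc_id.
observed = {
    "3": (329.0, 0.0030303030303030303),
    "1": (392.0, 0.002544529262086514),
    "2": (375.0, 0.0026595744680851063),
    "0": (409.0, 0.0024390243902439024),
}

for doc_id, (distance, closeness) in observed.items():
    recomputed = 1.0 / (1.0 + distance)
    print(doc_id, recomputed, abs(recomputed - closeness) < 1e-9)
```

Note that the hamming ordering (3, 2, 1, 0) differs from the final ordering (3, 1, 2, 0), which shows the re-ranking phases changing the result of the coarse search.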
Conclusions¶
These new Cohere binary embeddings are a huge step forward for cost-efficient vector search at scale and integrate perfectly with Vespa features for building out vector search at scale.
Storing the int8 vector representation on disk using the paged attribute option enables phased retrieval and ranking close to the data.
First, one can use the compact in-memory binary representation for the coarse-level search to efficiently find a limited number of candidates.
Then, the candidates from the coarse search can be re-scored and re-ranked using a more advanced scoring function using a finer resolution.
Clean up¶
We can now delete the cloud instance:
vespa_cloud.delete()