Models
vespa.models
ModelConfig(model_id, embedding_dim, tokenizer_id=None, binarized=False, embedding_field_type='bfloat16', distance_metric=None, component_id=None, model_path=None, tokenizer_path=None, model_url=None, tokenizer_url=None, max_tokens=None, transformer_input_ids=None, transformer_attention_mask=None, transformer_token_type_ids=None, transformer_output=None, pooling_strategy=None, normalize=None, query_prepend=None, document_prepend=None)
dataclass
Configuration for an embedding model.
This class encapsulates all model-specific parameters that affect the Vespa schema, component configuration, and ranking expressions.
Attributes:
| Name | Type | Description |
|---|---|---|
| model_id | str | The model identifier (e.g., 'e5-small-v2', 'snowflake-arctic-embed-xs') |
| embedding_dim | int | The dimension of the embedding vectors (e.g., 384, 768). When binarized=True, specify the original model dimension; it is automatically divided by 8 for storage (e.g., 1024 -> 128 bytes). |
| tokenizer_id | Optional[str] | The tokenizer model identifier (if different from model_id) |
| binarized | bool | Whether the embeddings should be binarized (packed to bits). When True, overrides embedding_field_type to int8, and embedding_dim must be divisible by 8. |
| embedding_field_type | EmbeddingFieldType | Tensor cell type for embeddings. Options: "double": 64-bit float (highest precision, highest memory); "float": 32-bit float (good balance); "bfloat16": 16-bit brain float (reduced memory, good for large scale), the default; "int8": 8-bit integer (quantized, or used automatically when binarized=True) |
| distance_metric | Optional[DistanceMetric] | Distance metric for HNSW index. Options: "angular": cosine similarity (default for non-binarized); "hamming": Hamming distance (required for binarized embeddings); "euclidean", "dotproduct", "prenormalized-angular", "geodegrees" |
| component_id | Optional[str] | The ID to use for the Vespa component (defaults to sanitized model_id) |
| model_path | Optional[str] | Optional local path to the model file |
| tokenizer_path | Optional[str] | Optional local path to the tokenizer file |
| model_url | Optional[str] | Optional URL to the ONNX model file (alternative to model_id) |
| tokenizer_url | Optional[str] | Optional URL to the tokenizer file (alternative to tokenizer_id) |
| max_tokens | Optional[int] | Maximum number of tokens accepted by the transformer model (default: 512) |
| transformer_input_ids | Optional[str] | Name/identifier for transformer input IDs (default: "input_ids") |
| transformer_attention_mask | Optional[str] | Name/identifier for transformer attention mask (default: "attention_mask") |
| transformer_token_type_ids | Optional[str] | Name/identifier for transformer token type IDs (default: "token_type_ids"). Set to None to disable token_type_ids. |
| transformer_output | Optional[str] | Name/identifier for transformer output (default: "last_hidden_state") |
| pooling_strategy | Optional[PoolingStrategy] | How to pool output vectors ("mean", "cls", or "none") (default: "mean") |
| normalize | Optional[bool] | Whether to normalize output to unit length (default: False) |
| query_prepend | Optional[str] | Optional instruction to prepend to query text |
| document_prepend | Optional[str] | Optional instruction to prepend to document text |
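The storage arithmetic for `binarized=True` (1024 dimensions stored as 128 int8 cells) can be illustrated with a minimal plain-Python sketch. This is not the library's code: `pack_sign_bits` is a hypothetical helper, and it assumes that binarization keeps one bit per dimension (1 for values above zero) and packs eight bits, most significant first, into one signed 8-bit cell.

```python
from typing import List, Sequence


def pack_sign_bits(vector: Sequence[float]) -> List[int]:
    """Pack one sign bit per dimension into signed int8 values (hypothetical helper)."""
    if len(vector) % 8 != 0:
        raise ValueError("embedding_dim must be divisible by 8 when binarized")
    packed = []
    for start in range(0, len(vector), 8):
        byte = 0
        for value in vector[start:start + 8]:
            # Assumption: values above zero map to bit 1, everything else to 0.
            byte = (byte << 1) | (1 if value > 0 else 0)
        # Reinterpret the unsigned byte (0..255) as a signed int8 (-128..127).
        packed.append(byte - 256 if byte > 127 else byte)
    return packed


full_precision = [0.5, -0.1] * 512      # a made-up 1024-dimensional embedding
stored = pack_sign_bits(full_precision)
print(len(stored))                      # 1024 / 8 = 128 cells
```

The divisibility check mirrors the documented requirement that `embedding_dim` be divisible by 8 for binarized embeddings.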
__post_init__()
Set defaults and validate configuration.
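As a rough illustration of the defaults and checks described in the attribute table, here is a minimal sketch using a hypothetical `SketchModelConfig` dataclass. It covers only the documented binarization rules and the documented distance-metric defaults; where the real class applies each default, and which exception type it raises, are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchModelConfig:
    """Hypothetical stand-in for ModelConfig; only a few fields are modelled."""

    model_id: str
    embedding_dim: int
    binarized: bool = False
    embedding_field_type: str = "bfloat16"
    distance_metric: Optional[str] = None

    def __post_init__(self) -> None:
        if self.binarized:
            # Documented rule: binarized embeddings need a dimension divisible by 8.
            if self.embedding_dim % 8 != 0:
                # Assumption: the real class may raise a different exception type.
                raise ValueError("embedding_dim must be divisible by 8 when binarized")
            # Documented rule: binarized=True overrides the cell type to int8.
            self.embedding_field_type = "int8"
        if self.distance_metric is None:
            # Documented defaults: hamming for binarized, angular otherwise.
            self.distance_metric = "hamming" if self.binarized else "angular"


binary = SketchModelConfig(model_id="bge-m3", embedding_dim=1024, binarized=True)
print(binary.embedding_field_type, binary.distance_metric)
```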
sanitize_component_id(model_id)
Sanitize a model ID to create a valid Vespa component identifier.
Vespa component IDs must match the pattern [a-zA-Z][a-zA-Z0-9_]* (start with a letter, followed by letters, digits, or underscores).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_id | str | The model identifier to sanitize | required |
Returns:
| Type | Description |
|---|---|
| str | A valid Vespa component ID |
Example
    >>> sanitize_component_id("e5-small-v2")
    'e5_small_v2'
    >>> sanitize_component_id("sentence-transformers/all-MiniLM-L6-v2")
    'sentence_transformers_all_MiniLM_L6_v2'
    >>> sanitize_component_id("model.v1.0")
    'model_v1_0'
    >>> sanitize_component_id("123-model")
    'model_123_model'
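The pattern rule above can be reproduced with a short regular-expression sketch. `sketch_sanitize_component_id` is a hypothetical re-implementation written to match the documented examples; the actual function may handle edge cases differently.

```python
import re


def sketch_sanitize_component_id(model_id: str) -> str:
    """Hypothetical re-implementation that matches the documented examples."""
    # Replace every character outside [a-zA-Z0-9_] with an underscore.
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", model_id)
    # IDs must start with a letter; the documented example prefixes "model_".
    if not re.match(r"[a-zA-Z]", cleaned):
        cleaned = "model_" + cleaned
    return cleaned


for raw in ["e5-small-v2", "model.v1.0", "123-model"]:
    print(raw, "->", sketch_sanitize_component_id(raw))
```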
create_embedder_component(config)
Create a Vespa hugging-face-embedder component from a model configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | ModelConfig | ModelConfig instance with model parameters | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Component | Component | A Vespa Component configured as a hugging-face-embedder |
Example
    >>> config = ModelConfig(model_id="e5-small-v2", embedding_dim=384)
    >>> component = create_embedder_component(config)
    >>> component.id
    'e5_small_v2'
Example with URL-based model and custom parameters
    >>> config = ModelConfig(
    ...     model_id="gte-multilingual",
    ...     embedding_dim=768,
    ...     model_url="https://huggingface.co/onnx-community/gte-multilingual-base/resolve/main/onnx/model_quantized.onnx",
    ...     tokenizer_url="https://huggingface.co/onnx-community/gte-multilingual-base/resolve/main/tokenizer.json",
    ...     transformer_output="token_embeddings",
    ...     max_tokens=8192,
    ...     query_prepend="Represent this sentence for searching relevant passages: ",
    ...     document_prepend="passage: ",
    ... )
    >>> component = create_embedder_component(config)
    >>> component.id
    'gte_multilingual'
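To show roughly what such a component corresponds to in `services.xml`, the sketch below builds a `hugging-face-embedder` element with the standard library. The element names follow Vespa's embedder configuration, but `sketch_embedder_xml`, the placeholder URLs, and the selection of fields are illustrative assumptions, not the output of `create_embedder_component`.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def sketch_embedder_xml(
    component_id: str,
    model_url: str,
    tokenizer_url: str,
    max_tokens: Optional[int] = None,
    transformer_output: Optional[str] = None,
    query_prepend: Optional[str] = None,
    document_prepend: Optional[str] = None,
) -> str:
    """Build an illustrative <component> element (hypothetical helper)."""
    component = ET.Element(
        "component", {"id": component_id, "type": "hugging-face-embedder"}
    )
    ET.SubElement(component, "transformer-model", {"url": model_url})
    ET.SubElement(component, "tokenizer-model", {"url": tokenizer_url})
    # Optional settings are emitted only when a value was supplied.
    if max_tokens is not None:
        ET.SubElement(component, "max-tokens").text = str(max_tokens)
    if transformer_output is not None:
        ET.SubElement(component, "transformer-output").text = transformer_output
    if query_prepend is not None or document_prepend is not None:
        prepend = ET.SubElement(component, "prepend")
        if query_prepend is not None:
            ET.SubElement(prepend, "query").text = query_prepend
        if document_prepend is not None:
            ET.SubElement(prepend, "document").text = document_prepend
    return ET.tostring(component, encoding="unicode")


xml_text = sketch_embedder_xml(
    component_id="gte_multilingual",
    model_url="https://example.com/model.onnx",          # placeholder URL
    tokenizer_url="https://example.com/tokenizer.json",  # placeholder URL
    max_tokens=8192,
    transformer_output="token_embeddings",
    document_prepend="passage: ",
)
print(xml_text)
```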
create_embedding_field(config, field_name='embedding', indexing=None, distance_metric=None, embedder_id=None)
Create a Vespa embedding field from a model configuration.
The field type and indexing statement are automatically configured based on whether the embeddings are binarized.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | ModelConfig | ModelConfig instance with model parameters | required |
| field_name | str | Name of the embedding field (default: "embedding") | 'embedding' |
| indexing | Optional[List[str]] | Custom indexing statement (default: auto-generated based on config) | None |
| distance_metric | Optional[DistanceMetric] | Distance metric for HNSW (default: "hamming" for binarized, "angular" for float) | None |
| embedder_id | Optional[str] | Embedder ID to use in the indexing statement (default: uses config.component_id) | None |
Returns:
| Name | Type | Description |
|---|---|---|
| Field | Field | A Vespa Field configured for embeddings |
Example
    >>> config = ModelConfig(model_id="e5-small-v2", embedding_dim=384)
    >>> field = create_embedding_field(config)
    >>> field.type
    'tensor<bfloat16>(x[384])'
    >>> config_float = ModelConfig(model_id="e5-small-v2", embedding_dim=384, embedding_field_type="float")
    >>> field_float = create_embedding_field(config_float)
    >>> field_float.type
    'tensor<float>(x[384])'
    >>> config_binary = ModelConfig(model_id="bge-m3", embedding_dim=1024, binarized=True)
    >>> field_binary = create_embedding_field(config_binary)
    >>> field_binary.type
    'tensor<int8>(x[128])'
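The way the field type follows from the configuration can be sketched in a few lines. `sketch_tensor_type` is a hypothetical helper that derives the tensor type string from the documented rules (cell type from `embedding_field_type`, int8 and one eighth of the dimension when binarized); it is not the library's implementation.

```python
def sketch_tensor_type(
    embedding_dim: int,
    embedding_field_type: str = "bfloat16",
    binarized: bool = False,
) -> str:
    """Derive a Vespa tensor type string from the documented rules (hypothetical)."""
    if binarized:
        if embedding_dim % 8 != 0:
            raise ValueError("embedding_dim must be divisible by 8 when binarized")
        # Eight dimensions are packed into each int8 cell.
        return f"tensor<int8>(x[{embedding_dim // 8}])"
    return f"tensor<{embedding_field_type}>(x[{embedding_dim}])"


print(sketch_tensor_type(384))
print(sketch_tensor_type(384, embedding_field_type="float"))
print(sketch_tensor_type(1024, binarized=True))
```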
create_semantic_rank_profile(config, profile_name='semantic', embedding_field='embedding', query_tensor='q')
Create a semantic ranking profile based on model configuration.
The ranking expression is automatically configured to use hamming distance for binarized embeddings or cosine similarity for float embeddings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | ModelConfig | ModelConfig instance with model parameters | required |
| profile_name | str | Name of the rank profile (default: "semantic") | 'semantic' |
| embedding_field | str | Name of the embedding field (default: "embedding") | 'embedding' |
| query_tensor | str | Name of the query tensor (default: "q") | 'q' |
Returns:
| Name | Type | Description |
|---|---|---|
| RankProfile | RankProfile | A Vespa RankProfile configured for semantic search |
Example
    >>> config = ModelConfig(model_id="e5-small-v2", embedding_dim=384, binarized=False)
    >>> profile = create_semantic_rank_profile(config)
    >>> profile.name
    'semantic'
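For intuition about the two measures the profile chooses between, the following plain-Python sketch computes Hamming distance over bit-packed int8 vectors and cosine similarity over float vectors. Both functions are hypothetical illustrations of the underlying math; they do not show the ranking expression that the library generates.

```python
import math
from typing import Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count differing bits between two bit-packed int8 vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # XOR marks differing bits; masking with 0xFF treats each cell as one byte.
    return sum(bin((x ^ y) & 0xFF).count("1") for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


print(hamming_distance([-1, 1], [0, 3]))          # 8 differing bits + 1 differing bit
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

A lower Hamming distance and a higher cosine similarity both indicate closer vectors, so a ranking expression has to orient the two measures consistently.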
create_hybrid_rank_profile(config, profile_name='fusion', base_profile='bm25', embedding_field='embedding', query_tensor='q', fusion_method='rrf', global_rerank_count=1000, first_phase_keep_rank_count=None)
Create a hybrid ranking profile combining BM25 and semantic search.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | ModelConfig | ModelConfig instance with model parameters | required |
| profile_name | str | Name of the rank profile (default: "fusion") | 'fusion' |
| base_profile | str | Name of the BM25 profile to inherit from (default: "bm25") | 'bm25' |
| embedding_field | str | Name of the embedding field (default: "embedding") | 'embedding' |
| query_tensor | str | Name of the query tensor (default: "q") | 'q' |
| fusion_method | FusionMethod | Fusion method: "rrf" for reciprocal rank fusion, "atan_norm" for atan-normalized sum in first phase, or "norm_linear" for linear normalization in global phase. | 'rrf' |
| global_rerank_count | int | Number of hits to rerank in global phase (default: 1000) | 1000 |
| first_phase_keep_rank_count | Optional[int] | How many documents to keep the first phase top rank values for (default: None, uses Vespa default of 10000) | None |
Returns:
| Name | Type | Description |
|---|---|---|
| RankProfile | RankProfile | A Vespa RankProfile configured for hybrid search |
Example
    >>> config = ModelConfig(model_id="e5-small-v2", embedding_dim=384)
    >>> profile = create_hybrid_rank_profile(config)
    >>> profile.name
    'fusion'
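The three fusion methods named in the parameter table can be illustrated outside Vespa with a minimal sketch. The functions below are hypothetical stand-ins for the underlying formulas; the constant `k=60`, the `2 * atan(x) / pi` scaling, and the handling of identical scores are assumptions rather than values taken from the generated rank profile.

```python
import math
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Sum 1 / (k + rank) over every ranking in which a document appears."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


def normalize_linear(scores: List[float]) -> List[float]:
    """Min-max normalize scores into [0, 1]."""
    low, high = min(scores), max(scores)
    if high == low:
        # Assumption: identical scores carry no ranking signal, so map them to 0.
        return [0.0 for _ in scores]
    return [(score - low) / (high - low) for score in scores]


def atan_norm(score: float) -> float:
    """Squash a non-negative score into [0, 1) with an arctangent."""
    return 2.0 * math.atan(score) / math.pi


bm25_order = ["doc_a", "doc_b"]       # made-up lexical ranking
semantic_order = ["doc_b", "doc_c"]   # made-up semantic ranking
fused = reciprocal_rank_fusion([bm25_order, semantic_order])
print(max(fused, key=fused.get))      # doc_b appears in both rankings
print(normalize_linear([1.0, 3.0, 5.0]))
```

Rank-based fusion ignores score magnitudes entirely, while the two normalization approaches keep them but first bring BM25 and similarity scores onto a comparable scale.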
get_model_config(model_name)
Get a predefined model configuration by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Name of a predefined model | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ModelConfig | ModelConfig | The model configuration |
Raises:
| Type | Description |
|---|---|
| KeyError | If the model name is not found |
Example
    >>> config = get_model_config("e5-small-v2")
    >>> config.embedding_dim
    384
list_models()
List all available predefined model configurations.
Returns:
| Type | Description |
|---|---|
| List[str] | List of model names that can be used with get_model_config() |
Example
    >>> models = list_models()
    >>> 'e5-small-v2' in models
    True
    >>> 'nomic-ai-modernbert' in models
    True
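The lookup behaviour of `get_model_config()` and `list_models()` resembles a dictionary-backed registry. The sketch below uses a hypothetical `SKETCH_MODELS` dictionary with two illustrative entries; the real registry holds full `ModelConfig` objects, and whether its names are returned sorted is an assumption.

```python
from typing import Dict, List

# Hypothetical registry; only the e5-small-v2 dimension (384) appears in the docs above.
SKETCH_MODELS: Dict[str, Dict[str, int]] = {
    "e5-small-v2": {"embedding_dim": 384},
    "e5-base-v2": {"embedding_dim": 768},
}


def sketch_get_model_config(model_name: str) -> Dict[str, int]:
    """Return the predefined configuration or raise KeyError for unknown names."""
    if model_name not in SKETCH_MODELS:
        available = ", ".join(sorted(SKETCH_MODELS))
        raise KeyError(f"Unknown model '{model_name}'. Available: {available}")
    return SKETCH_MODELS[model_name]


def sketch_list_models() -> List[str]:
    """Return the names accepted by sketch_get_model_config()."""
    return sorted(SKETCH_MODELS)


print(sketch_get_model_config("e5-small-v2")["embedding_dim"])
print(sketch_list_models())
```

Checking a name against the list first avoids handling the `KeyError` at the call site.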
create_hybrid_package(models, app_name='hybridapp', schema_name='doc', global_rerank_count=1000)
Create a Vespa application package configured for hybrid search evaluation.
This function creates a complete Vespa application package with all necessary components, fields, and rank profiles for evaluation. It supports single or multiple embedding models, automatically handling naming conflicts by using model-specific field and component names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| models | Union[str, ModelConfig, List[Union[str, ModelConfig]]] | Single model or list of models to configure. Each can be a string model name (e.g., "e5-small-v2") to use a predefined config, or a ModelConfig instance for custom configuration. | required |
| app_name | str | Name of the application (default: "hybridapp") | 'hybridapp' |
| schema_name | str | Name of the schema (default: "doc") | 'doc' |
| global_rerank_count | int | Number of hits to rerank in global phase (default: 1000) | 1000 |
Returns:
| Name | Type | Description |
|---|---|---|
| ApplicationPackage | ApplicationPackage | Configured Vespa application package with: components for each embedding model; embedding fields for each model (named "embedding" for a single model, "embedding_{component_id}" for multiple models); BM25 and semantic rank profiles for each model; hybrid rank profiles (RRF, atan_norm, norm_linear) for each model; a match-only rank profile for baseline evaluation |
Raises:
| Type | Description |
|---|---|
| ValueError | If models list is empty |
| KeyError | If a model name is not found in COMMON_MODELS |
Example
Single model by name
    >>> package = create_hybrid_package("e5-small-v2")
    >>> len(package.components)
    1
    >>> package.schema.document.fields[2].name
    'embedding'
Single model with custom config
    >>> config = ModelConfig(model_id="my-model", embedding_dim=512)
    >>> package = create_hybrid_package(config)
    >>> package.schema.document.fields[2].name
    'embedding'
Multiple models - creates separate fields and profiles for each
    >>> package = create_hybrid_package(["e5-small-v2", "e5-base-v2"])
    >>> len(package.components)
    2
Fields will be named: embedding_e5_small_v2, embedding_e5_base_v2
    >>> field_names = [f.name for f in package.schema.document.fields if f.name.startswith('embedding')]
    >>> len(field_names)
    2
Multiple models with mixed configs
    >>> custom = ModelConfig(model_id="custom-model", embedding_dim=384)
    >>> package = create_hybrid_package(["e5-small-v2", custom])
    >>> len(package.components)
    2
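The naming rule for single versus multiple models can be summarized in a small sketch. `normalize_models` and `embedding_field_names` are hypothetical helpers that mirror the documented behaviour (a single model or a list is accepted, an empty list raises `ValueError`, and field names gain a component-ID suffix only when there are several models); they are not part of the library.

```python
from typing import Any, List, Sequence


def normalize_models(models: Any) -> List[Any]:
    """Accept a single model or a list of models and always return a list."""
    items = list(models) if isinstance(models, (list, tuple)) else [models]
    if not items:
        raise ValueError("At least one model must be provided")
    return items


def embedding_field_names(component_ids: Sequence[str]) -> List[str]:
    """Name the embedding field(s) following the documented convention."""
    if len(component_ids) == 1:
        # A single model keeps the plain field name.
        return ["embedding"]
    # Several models get model-specific names to avoid conflicts.
    return [f"embedding_{component_id}" for component_id in component_ids]


print(embedding_field_names(["e5_small_v2"]))
print(embedding_field_names(["e5_small_v2", "e5_base_v2"]))
```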