Skip to content

Models

vespa.models

ModelConfig(model_id, embedding_dim, tokenizer_id=None, binarized=False, embedding_field_type='bfloat16', distance_metric=None, component_id=None, model_path=None, tokenizer_path=None, model_url=None, tokenizer_url=None, max_tokens=None, transformer_input_ids=None, transformer_attention_mask=None, transformer_token_type_ids=None, transformer_output=None, pooling_strategy=None, normalize=None, query_prepend=None, document_prepend=None) dataclass

Configuration for an embedding model.

This class encapsulates all model-specific parameters that affect the Vespa schema, component configuration, and ranking expressions.

Attributes:

Name Type Description
model_id str

The model identifier (e.g., 'e5-small-v2', 'snowflake-arctic-embed-xs')

embedding_dim int

The dimension of the embedding vectors (e.g., 384, 768). When binarized=True, specify the original model dimension - it will be automatically divided by 8 for storage (e.g., 1024 -> 128 bytes).

tokenizer_id Optional[str]

The tokenizer model identifier (if different from model_id)

binarized bool

Whether the embeddings should be binarized (packed to bits). When True, overrides embedding_field_type to int8 and embedding_dim must be divisible by 8.

embedding_field_type EmbeddingFieldType

Tensor cell type for embeddings. Options: - "double": 64-bit float (highest precision, highest memory) - "float": 32-bit float (good balance) - "bfloat16": 16-bit brain float (reduced memory, good for large scale) - DEFAULT - "int8": 8-bit integer (quantized, or used automatically when binarized=True)

distance_metric Optional[DistanceMetric]

Distance metric for HNSW index. Options: - "angular": Cosine similarity (default for non-binarized) - "hamming": Hamming distance (required for binarized embeddings) - "euclidean", "dotproduct", "prenormalized-angular", "geodegrees"

component_id Optional[str]

The ID to use for the Vespa component (defaults to sanitized model_id)

model_path Optional[str]

Optional local path to the model file

tokenizer_path Optional[str]

Optional local path to the tokenizer file

model_url Optional[str]

Optional URL to the ONNX model file (alternative to model_id)

tokenizer_url Optional[str]

Optional URL to the tokenizer file (alternative to tokenizer_id)

max_tokens Optional[int]

Maximum number of tokens accepted by the transformer model (default: 512)

transformer_input_ids Optional[str]

Name/identifier for transformer input IDs (default: "input_ids")

transformer_attention_mask Optional[str]

Name/identifier for transformer attention mask (default: "attention_mask")

transformer_token_type_ids Optional[str]

Name/identifier for transformer token type IDs (default: "token_type_ids") Set to None to disable token_type_ids

transformer_output Optional[str]

Name/identifier for transformer output (default: "last_hidden_state")

pooling_strategy Optional[PoolingStrategy]

How to pool output vectors ("mean", "cls", or "none") (default: "mean")

normalize Optional[bool]

Whether to normalize output to unit length (default: False)

query_prepend Optional[str]

Optional instruction to prepend to query text

document_prepend Optional[str]

Optional instruction to prepend to document text

__post_init__()

Set defaults and validate configuration.

sanitize_component_id(model_id)

Sanitize a model ID to create a valid Vespa component identifier.

Vespa component IDs must match the pattern [a-zA-Z][a-zA-Z0-9_]* (start with a letter, followed by letters, digits, or underscores).

Parameters:

Name Type Description Default
model_id str

The model identifier to sanitize

required

Returns:

Type Description
str

A valid Vespa component ID

Example

sanitize_component_id("e5-small-v2") 'e5_small_v2' sanitize_component_id("sentence-transformers/all-MiniLM-L6-v2") 'sentence_transformers_all_MiniLM_L6_v2' sanitize_component_id("model.v1.0") 'model_v1_0' sanitize_component_id("123-model") 'model_123_model'

create_embedder_component(config)

Create a Vespa hugging-face-embedder component from a model configuration.

Parameters:

Name Type Description Default
config ModelConfig

ModelConfig instance with model parameters

required

Returns:

Name Type Description
Component Component

A Vespa Component configured as a hugging-face-embedder

Example

config = ModelConfig(model_id="e5-small-v2", embedding_dim=384) component = create_embedder_component(config) component.id 'e5_small_v2'

Example with URL-based model and custom parameters

config = ModelConfig( ... model_id="gte-multilingual", ... embedding_dim=768, ... model_url="https://huggingface.co/onnx-community/gte-multilingual-base/resolve/main/onnx/model_quantized.onnx", ... tokenizer_url="https://huggingface.co/onnx-community/gte-multilingual-base/resolve/main/tokenizer.json", ... transformer_output="token_embeddings", ... max_tokens=8192, ... query_prepend="Represent this sentence for searching relevant passages: ", ... document_prepend="passage: ", ... ) component = create_embedder_component(config) component.id 'gte_multilingual'

create_embedding_field(config, field_name='embedding', indexing=None, distance_metric=None, embedder_id=None)

Create a Vespa embedding field from a model configuration.

The field type and indexing statement are automatically configured based on whether the embeddings are binarized.

Parameters:

Name Type Description Default
config ModelConfig

ModelConfig instance with model parameters

required
field_name str

Name of the embedding field (default: "embedding")

'embedding'
indexing Optional[List[str]]

Custom indexing statement (default: auto-generated based on config)

None
distance_metric Optional[DistanceMetric]

Distance metric for HNSW (default: "hamming" for binarized, "angular" for float)

None
embedder_id Optional[str]

Embedder ID to use in the indexing statement (default: uses config.component_id)

None

Returns:

Name Type Description
Field Field

A Vespa Field configured for embeddings

Example

config = ModelConfig(model_id="e5-small-v2", embedding_dim=384) field = create_embedding_field(config) field.type 'tensor(x[384])'

config_float = ModelConfig(model_id="e5-small-v2", embedding_dim=384, embedding_field_type="float") field_float = create_embedding_field(config_float) field_float.type 'tensor(x[384])'

config_binary = ModelConfig(model_id="bge-m3", embedding_dim=1024, binarized=True) field_binary = create_embedding_field(config_binary) field_binary.type 'tensor(x[128])'

create_semantic_rank_profile(config, profile_name='semantic', embedding_field='embedding', query_tensor='q')

Create a semantic ranking profile based on model configuration.

The ranking expression is automatically configured to use hamming distance for binarized embeddings or cosine similarity for float embeddings.

Parameters:

Name Type Description Default
config ModelConfig

ModelConfig instance with model parameters

required
profile_name str

Name of the rank profile (default: "semantic")

'semantic'
embedding_field str

Name of the embedding field (default: "embedding")

'embedding'
query_tensor str

Name of the query tensor (default: "q")

'q'

Returns:

Name Type Description
RankProfile RankProfile

A Vespa RankProfile configured for semantic search

Example

config = ModelConfig(model_id="e5-small-v2", embedding_dim=384, binarized=False) profile = create_semantic_rank_profile(config) profile.name 'semantic'

create_hybrid_rank_profile(config, profile_name='fusion', base_profile='bm25', embedding_field='embedding', query_tensor='q', fusion_method='rrf', global_rerank_count=1000, first_phase_keep_rank_count=None)

Create a hybrid ranking profile combining BM25 and semantic search.

Parameters:

Name Type Description Default
config ModelConfig

ModelConfig instance with model parameters

required
profile_name str

Name of the rank profile (default: "fusion")

'fusion'
base_profile str

Name of the BM25 profile to inherit from (default: "bm25")

'bm25'
embedding_field str

Name of the embedding field (default: "embedding")

'embedding'
query_tensor str

Name of the query tensor (default: "q")

'q'
fusion_method FusionMethod

Fusion method - "rrf" for reciprocal rank fusion, "atan_norm" for atan-normalized sum in first phase, or "norm_linear" for linear normalization in global phase.

'rrf'
global_rerank_count int

Number of hits to rerank in global phase (default: 1000)

1000
first_phase_keep_rank_count Optional[int]

How many documents to keep the first phase top rank values for (default: None, uses Vespa default of 10000)

None

Returns:

Name Type Description
RankProfile RankProfile

A Vespa RankProfile configured for hybrid search

Example

config = ModelConfig(model_id="e5-small-v2", embedding_dim=384) profile = create_hybrid_rank_profile(config) profile.name 'fusion'

get_model_config(model_name)

Get a predefined model configuration by name.

Parameters:

Name Type Description Default
model_name str

Name of a predefined model

required

Returns:

Name Type Description
ModelConfig ModelConfig

The model configuration

Raises:

Type Description
KeyError

If the model name is not found

Example

config = get_model_config("e5-small-v2") config.embedding_dim 384

list_models()

List all available predefined model configurations.

Returns:

Type Description
List[str]

List of model names that can be used with get_model_config()

Example

models = list_models() 'e5-small-v2' in models True 'nomic-ai-modernbert' in models True

create_hybrid_package(models, app_name='hybridapp', schema_name='doc', global_rerank_count=1000)

Create a Vespa application package configured for hybrid search evaluation.

This function creates a complete Vespa application package with all necessary components, fields, and rank profiles for evaluation. It supports single or multiple embedding models, automatically handling naming conflicts by using model-specific field and component names.

Parameters:

Name Type Description Default
models Union[str, ModelConfig, List[Union[str, ModelConfig]]]

Single model or list of models to configure. Each can be: - A string model name (e.g., "e5-small-v2") to use a predefined config - A ModelConfig instance for custom configuration

required
app_name str

Name of the application (default: "hybridapp")

'hybridapp'
schema_name str

Name of the schema (default: "doc")

'doc'
global_rerank_count int

Number of hits to rerank in global phase (default: 1000)

1000

Returns:

Name Type Description
ApplicationPackage ApplicationPackage

Configured Vespa application package with: - Components for each embedding model - Embedding fields for each model (named "embedding" for single model, "embedding_{component_id}" for multiple models) - BM25 and semantic rank profiles for each model - Hybrid rank profiles (RRF, atan_norm, norm_linear) for each model - A match-only rank profile for baseline evaluation

Raises:

Type Description
ValueError

If models list is empty

KeyError

If a model name is not found in COMMON_MODELS

Example

Single model by name

package = create_hybrid_package("e5-small-v2") len(package.components) 1 package.schema.document.fields[2].name 'embedding'

Single model with custom config

config = ModelConfig(model_id="my-model", embedding_dim=512) package = create_hybrid_package(config) package.schema.document.fields[2].name 'embedding'

Multiple models - creates separate fields and profiles for each

package = create_hybrid_package(["e5-small-v2", "e5-base-v2"]) len(package.components) 2

Fields will be named: embedding_e5_small_v2, embedding_e5_base_v2

field_names = [f.name for f in package.schema.document.fields if f.name.startswith('embedding')] len(field_names) 2

Multiple models with mixed configs

custom = ModelConfig(model_id="custom-model", embedding_dim=384) package = create_hybrid_package(["e5-small-v2", custom]) len(package.components) 2