Package
vespa.package
VT(tag, cs, attrs=None, void_=False, replace_underscores=True, **kwargs)
A 'Vespa Tag' structure, containing tag
, children
, and attrs
sanitize_tag_name(tag)
staticmethod
Convert invalid tag names (with '-') to valid Python identifiers (with '_')
restore_tag_name()
Restore sanitized tag names back to the original names for XML generation
Summary(name=None, type=None, fields=None)
Bases: object
Configures a summary field.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
The name of the summary field. Can be |
None
|
type
|
str
|
The type of the summary field. Can be |
None
|
fields
|
list
|
A list of properties used to configure the summary. These can be single properties (like "summary: dynamic", common in |
None
|
Example
Summary(None, None, ["dynamic"])
Summary(None, None, ['dynamic'])
Summary(
"title",
"string",
[("source", "title")]
)
Summary('title', 'string', [('source', 'title')])
Summary(
"title",
"string",
[("source", ["title", "abstract"])]
)
Summary('title', 'string', [('source', ['title', 'abstract'])])
Summary(
name="artist",
type="string",
)
Summary('artist', 'string', None)
as_lines
property
Returns the object as a list of strings, where each string represents a line of configuration that can be used during schema generation as shown below:
Example
HNSW(distance_metric='euclidean', max_links_per_node=16, neighbors_to_explore_at_insert=200)
Bases: object
Configures Vespa HNSW indexes.
For more information, check the Vespa documentation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
distance_metric
|
str
|
The distance metric to use when computing distance between vectors. Default is 'euclidean'. |
'euclidean'
|
max_links_per_node
|
int
|
Specifies how many links per HNSW node to select when building the graph. Default is 16. |
16
|
neighbors_to_explore_at_insert
|
int
|
Specifies how many neighbors to explore when inserting a document in the HNSW graph. Default is 200. |
200
|
StructField(name, **kwargs)
Create a Vespa struct-field.
For more detailed information about struct-fields, check the Vespa documentation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
The name of the struct-field. |
required |
indexing
|
list
|
Configures how to process data of a struct-field during indexing. |
required |
attribute
|
list
|
Specifies a property of an index structure attribute. |
required |
match
|
list
|
Set properties that decide how the matching method for this field operates. |
required |
query_command
|
list
|
Add configuration for the query-command of the field. |
required |
summary
|
Summary
|
Add configuration for the summary of the field. |
required |
rank
|
str
|
Specifies the property that defines ranking calculations done for a field. |
required |
Example
StructField(
name = "first_name",
indexing = ["attribute"],
attribute = ["fast-search"],
)
StructField('first_name', ['attribute'], ['fast-search'], None, None, None, None)
StructField(
name = "last_name",
match = ["exact", ("exact-terminator", '"@%"')],
query_command = ['"exact %%"'],
summary = Summary(None, None, fields=["dynamic", ("bolding", "on")])
)
StructField('last_name', None, None, ['exact', ('exact-terminator', '"@%"')], ['"exact %%"'], Summary(None, None, ['dynamic', ('bolding', 'on')]), None)
Field(name, type, indexing=None, index=None, attribute=None, ann=None, match=None, weight=None, bolding=None, summary=None, is_document_field=True, **kwargs)
Bases: object
Create a Vespa field.
For more detailed information about fields, check the Vespa documentation.
Once we have an ApplicationPackage
instance containing a Schema
and a Document
,
we usually want to add fields so that we can store our data in a structured manner.
We can accomplish that by creating Field
instances and adding those to the ApplicationPackage
instance via Schema
and Document
methods.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
The name of the field. |
required |
type
|
str
|
The data type of the field. |
required |
indexing
|
list
|
Configures how to process data of a field during indexing. |
None
|
index
|
str
|
Sets index parameters. Fields with index are normalized and tokenized by default. |
None
|
attribute
|
list
|
Specifies a property of an index structure attribute. |
None
|
ann
|
HNSW
|
Add configuration for approximate nearest neighbor. |
None
|
match
|
list
|
Set properties that decide how the matching method for this field operates. |
None
|
weight
|
int
|
Sets the weight of the field, used when calculating rank scores. |
None
|
bolding
|
bool
|
Whether to highlight matching query terms in the summary. |
None
|
summary
|
Summary
|
Add configuration for the summary of the field. |
None
|
is_document_field
|
bool
|
Whether the field is a document field or part of the schema. Default is True. |
True
|
stemming
|
str
|
Add configuration for stemming of the field. |
required |
rank
|
str
|
Add configuration for ranking calculations of the field. |
required |
query_command
|
list
|
Add configuration for query-command of the field. |
required |
struct_fields
|
list
|
Add struct-fields to the field. |
required |
alias
|
list
|
Add alias to the field. |
required |
Example
Field(name = "title", type = "string", indexing = ["index", "summary"], index = "enable-bm25")
Field('title', 'string', ['index', 'summary'], 'enable-bm25', None, None, None, None, None, None, True, None, None, None, [], None)
Field(
name = "abstract",
type = "string",
indexing = ["attribute"],
attribute=["fast-search", "fast-access"]
)
Field('abstract', 'string', ['attribute'], None, ['fast-search', 'fast-access'], None, None, None, None, None, True, None, None, None, [], None)
Field(name="tensor_field",
type="tensor<float>(x[128])",
indexing=["attribute"],
ann=HNSW(
distance_metric="euclidean",
max_links_per_node=16,
neighbors_to_explore_at_insert=200,
),
)
Field('tensor_field', 'tensor<float>(x[128])', ['attribute'], None, None, HNSW('euclidean', 16, 200), None, None, None, None, True, None, None, None, [], None)
Field(
name = "abstract",
type = "string",
match = ["exact", ("exact-terminator", '"@%"',)],
)
Field('abstract', 'string', None, None, None, None, ['exact', ('exact-terminator', '"@%"')], None, None, None, True, None, None, None, [], None)
Field(
name = "abstract",
type = "string",
weight = 200,
)
Field('abstract', 'string', None, None, None, None, None, 200, None, None, True, None, None, None, [], None)
Field(
name = "abstract",
type = "string",
bolding = True,
)
Field('abstract', 'string', None, None, None, None, None, None, True, None, True, None, None, None, [], None)
Field(
name = "abstract",
type = "string",
summary = Summary(None, None, ["dynamic", ["bolding", "on"]]),
)
Field('abstract', 'string', None, None, None, None, None, None, None, Summary(None, None, ['dynamic', ['bolding', 'on']]), True, None, None, None, [], None)
Field(
name = "abstract",
type = "string",
stemming = "shortest",
)
Field('abstract', 'string', None, None, None, None, None, None, None, None, True, 'shortest', None, None, [], None)
Field(
name = "abstract",
type = "string",
rank = "filter",
)
Field('abstract', 'string', None, None, None, None, None, None, None, None, True, None, 'filter', None, [], None)
Field(
name = "abstract",
type = "string",
query_command = ['"exact %%"'],
)
Field('abstract', 'string', None, None, None, None, None, None, None, None, True, None, None, ['"exact %%"'], [], None)
Field(
name = "abstract",
type = "string",
struct_fields = [
StructField(
name = "first_name",
indexing = ["attribute"],
attribute = ["fast-search"],
),
],
)
Field('abstract', 'string', None, None, None, None, None, None, None, None, True, None, None, None, [StructField('first_name', ['attribute'], ['fast-search'], None, None, None, None)], None)
add_struct_fields(*struct_fields)
Add StructField
's to the Field
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
struct_fields
|
list
|
A list of |
()
|
ImportedField(name, reference_field, field_to_import)
Bases: object
Imported field from a reference document.
Useful to implement parent/child relationships.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Field name. |
required |
reference_field
|
str
|
A field of type reference that points to the document that contains the field to be imported. |
required |
field_to_import
|
str
|
Field name to be imported, as defined in the reference document. |
required |
Example
Struct(name, fields=None)
Bases: object
Create a Vespa struct. A struct defines a composite type. Check the Vespa documentation for more detailed information about structs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Name of the struct. |
required |
fields
|
list
|
List of |
None
|
Example
Struct("person")
Struct('person', None)
Struct(
"person",
[
Field("first_name", "string"),
Field("last_name", "string"),
],
)
Struct('person', [Field('first_name', 'string', None, None, None, None, None, None, None, None, True, None, None, None, [], None), Field('last_name', 'string', None, None, None, None, None, None, None, None, True, None, None, None, [], None)])
DocumentSummary(name, inherits=None, summary_fields=None, from_disk=None, omit_summary_features=None)
Bases: object
Create a Document Summary. Check the Vespa documentation for more detailed information about document-summary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Name of the document-summary. |
required |
inherits
|
str
|
Name of another document-summary from which this inherits. |
None
|
summary_fields
|
list
|
List of |
None
|
from_disk
|
bool
|
Marks this document-summary as accessing fields on disk. |
None
|
omit_summary_features
|
bool
|
Specifies that summary-features should be omitted from this document summary. |
None
|
Example
DocumentSummary(
name="document-summary",
)
DocumentSummary('document-summary', None, None, None, None)
DocumentSummary(
name="which-inherits",
inherits="base-document-summary",
)
DocumentSummary('which-inherits', 'base-document-summary', None, None, None)
DocumentSummary(
name="with-field",
summary_fields=[Summary("title", "string", [("source", "title")])]
)
DocumentSummary('with-field', None, [Summary('title', 'string', [('source', 'title')])], None, None)
DocumentSummary(
name="with-bools",
from_disk=True,
omit_summary_features=True,
)
DocumentSummary('with-bools', None, None, True, True)
Document(fields=None, inherits=None, structs=None)
Bases: object
Create a Vespa Document.
Check the Vespa documentation for more detailed information about documents.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fields
|
list
|
A list of |
None
|
Example
Document()
Document(None, None, None)
Document(fields=[Field(name="title", type="string")])
Document([Field('title', 'string', None, None, None, None, None, None, None, None, True, None, None, None, [], None)], None, None)
Document(fields=[Field(name="title", type="string")], inherits="context")
Document([Field('title', 'string', None, None, None, None, None, None, None, None, True, None, None, None, [], None)], context, None)
add_fields(*fields)
Add Field
objects to the document.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fields
|
list
|
Fields to be added. |
()
|
Returns:
Type | Description |
---|---|
None
|
None |
add_structs(*structs)
Add Struct
objects to the document.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
structs
|
list
|
Structs to be added. |
()
|
Returns:
Type | Description |
---|---|
None
|
None |
FieldSet(name, fields)
Bases: object
Create a Vespa field set.
A fieldset groups fields together for searching. Check the Vespa documentation for more detailed information about field sets.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Name of the fieldset. |
required |
fields
|
list
|
Field names to be included in the fieldset. |
required |
Returns:
Name | Type | Description |
---|---|---|
FieldSet |
None
|
A field set instance. |
Function(name, expression, args=None)
Bases: object
Create a Vespa rank function.
Define a named function that can be referenced as a part of the ranking expression, or (if having no arguments) as a feature. Check the Vespa documentation for more detailed information about rank functions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Name of the function. |
required |
expression
|
str
|
String representing a Vespa expression. |
required |
args
|
list
|
List of arguments to be used in the function expression. Defaults to None. |
None
|
Returns:
Name | Type | Description |
---|---|---|
Function |
None
|
A rank function instance. |
Example
Function(
name="myfeature",
expression="fieldMatch(bar) + freshness(foo)",
args=["foo", "bar"]
)
Function('myfeature', 'fieldMatch(bar) + freshness(foo)', ['foo', 'bar'])
It is possible to define functions with multi-line expressions:
Function(
name="token_type_ids",
expression="tensor<float>(d0[1],d1[128])(\n"
" if (d1 < question_length,\n"
" 0,\n"
" if (d1 < question_length + doc_length,\n"
" 1,\n"
" TOKEN_NONE\n"
" )))",
)
Function('token_type_ids', 'tensor<float>(d0[1],d1[128])(\n if (d1 < question_length,\n 0,\n if (d1 < question_length + doc_length,\n 1,\n TOKEN_NONE\n )))', None)
FirstPhaseRanking(expression, keep_rank_count=None, rank_score_drop_limit=None)
Create a Vespa first phase ranking configuration.
This is the initial ranking performed on all matching documents. Check the Vespa documentation for more detailed information about first phase ranking configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
expression
|
str
|
Specify the ranking expression to be used for the first phase of ranking. Check also the Vespa documentation for ranking expressions. |
required |
keep_rank_count
|
int
|
How many documents to keep the first phase top rank values for. Default value is 10000. |
None
|
rank_score_drop_limit
|
float
|
Drop all hits with a first phase rank score less than or equal to this floating point number. |
None
|
Returns:
Name | Type | Description |
---|---|---|
FirstPhaseRanking |
None
|
A first phase ranking configuration instance. |
Example
SecondPhaseRanking(expression, rerank_count=100, rank_score_drop_limit=None)
Bases: object
Create a Vespa second phase ranking configuration.
This is the optional reranking performed on the best hits from the first phase. Check the Vespa documentation for more detailed information about second phase ranking configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
expression
|
str
|
Specify the ranking expression to be used for the second phase of ranking. Check also the Vespa documentation for ranking expressions. |
required |
rerank_count
|
int
|
Specifies the number of hits to be reranked in the second phase. Default value is 100. |
100
|
rank_score_drop_limit
|
float
|
Drop all hits with a first phase rank score less than or equal to this floating point number. |
None
|
Returns:
Name | Type | Description |
---|---|---|
SecondPhaseRanking |
None
|
A second phase ranking configuration instance. |
Example
SecondPhaseRanking(expression="1.25 * bm25(title) + 3.75 * bm25(body)", rerank_count=10)
SecondPhaseRanking('1.25 * bm25(title) + 3.75 * bm25(body)', 10, None)
SecondPhaseRanking(expression="1.25 * bm25(title) + 3.75 * bm25(body)", rerank_count=10, rank_score_drop_limit=5)
SecondPhaseRanking('1.25 * bm25(title) + 3.75 * bm25(body)', 10, 5)
GlobalPhaseRanking(expression, rerank_count=100, rank_score_drop_limit=None)
Bases: object
Create a Vespa global phase ranking configuration.
This is the optional reranking performed on the best hits from the content nodes phase(s). Check the Vespa documentation for more detailed information about global phase ranking configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
expression
|
str
|
Specify the ranking expression to be used for the global phase of ranking. Check also the Vespa documentation for ranking expressions. |
required |
rerank_count
|
int
|
Specifies the number of hits to be reranked in the global phase. Default value is 100. |
100
|
rank_score_drop_limit
|
float
|
Drop all hits with a first phase rank score less than or equal to this floating point number. |
None
|
Returns:
Name | Type | Description |
---|---|---|
GlobalPhaseRanking |
None
|
A global phase ranking configuration instance. |
Example
GlobalPhaseRanking(expression="1.25 * bm25(title) + 3.75 * bm25(body)", rerank_count=10)
GlobalPhaseRanking('1.25 * bm25(title) + 3.75 * bm25(body)', 10, None)
GlobalPhaseRanking(expression="1.25 * bm25(title) + 3.75 * bm25(body)", rerank_count=10, rank_score_drop_limit=5)
GlobalPhaseRanking('1.25 * bm25(title) + 3.75 * bm25(body)', 10, 5)
Mutate(on_match, on_first_phase, on_second_phase, on_summary)
Bases: object
Enable mutating operations in rank profiles.
Check the Vespa documentation for more detailed information about mutable attributes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
on_match
|
dict
|
Dictionary for the on-match phase containing 3 mandatory keys:
- |
required |
on_first_phase
|
dict
|
Dictionary for the on-first-phase phase containing 3 mandatory keys:
- |
required |
on_second_phase
|
dict
|
Dictionary for the on-second-phase phase containing 3 mandatory keys:
- |
required |
on_summary
|
dict
|
Dictionary for the on-summary phase containing 3 mandatory keys:
- |
required |
Example
enable_mutating_operations(
on_match={
'attribute': 'popularity',
'operation_string': 'add',
'operation_value': 5
},
on_first_phase={
'attribute': 'score',
'operation_string': 'subtract',
'operation_value': 3
}
)
enable_mutating_operations({'attribute': 'popularity', 'operation_string': 'add', 'operation_value': 5},
{'attribute': 'score', 'operation_string': 'subtract', 'operation_value': 3})
MatchPhaseRanking(attribute, order, max_hits)
Bases: object
Create a Vespa match phase ranking configuration.
This is an optional phase that can be used to quickly select a subset of hits for further ranking. Check the Vespa documentation for more detailed information about match phase ranking configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
attribute
|
str
|
The numeric attribute to use for filtering. |
required |
order
|
str
|
The sort order, either "ascending" or "descending". |
required |
max_hits
|
int
|
Maximum number of hits to pass to the next phase. |
required |
Example
RankProfile(name, first_phase, inherits=None, constants=None, functions=None, summary_features=None, match_features=None, second_phase=None, global_phase=None, match_phase=None, num_threads_per_search=None, **kwargs)
Bases: object
Create a Vespa rank profile.
Rank profiles are used to specify an alternative ranking of the same data for different purposes, and to experiment with new rank settings. Check the Vespa documentation for more detailed information about rank profiles.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Rank profile name. |
required |
first_phase
|
str
|
The config specifying the first phase of ranking. More info about first phase ranking. |
required |
inherits
|
str
|
The inherits attribute is optional. If defined, it contains the name of another rank profile in the same schema. Values not defined in this rank profile will then be inherited. |
None
|
constants
|
dict
|
Dict of constants available in ranking expressions, resolved and optimized at configuration time. More info about constants. |
None
|
functions
|
list
|
List of |
None
|
summary_features
|
list
|
List of rank features to be included with each hit. More info about summary features. |
None
|
match_features
|
list
|
List of rank features to be included with each hit. More info about match features. |
None
|
second_phase
|
SecondPhaseRanking
|
Config specifying the second phase of ranking. See |
None
|
global_phase
|
GlobalPhaseRanking
|
Config specifying the global phase of ranking. See |
None
|
match_phase
|
MatchPhaseRanking
|
Config specifying the match phase of ranking. See |
None
|
num_threads_per_search
|
int
|
Overrides the global |
None
|
weight
|
list
|
A list of tuples containing the field and their weight. |
required |
rank_type
|
list
|
A list of tuples containing a field and the rank-type-name. More info about rank-type. |
required |
rank_properties
|
list
|
A list of tuples containing a field and its configuration. More info about rank-properties. |
required |
mutate
|
Mutate
|
A |
required |
Example
RankProfile(name = "default", first_phase = "nativeRank(title, body)")
RankProfile('default', 'nativeRank(title, body)', None, None, None, None, None, None, None, None, None, None, None, None, None)
RankProfile(name = "new", first_phase = "BM25(title)", inherits = "default")
RankProfile('new', 'BM25(title)', 'default', None, None, None, None, None, None, None, None, None, None, None, None)
RankProfile(
name = "new",
first_phase = "BM25(title)",
inherits = "default",
constants={"TOKEN_NONE": 0, "TOKEN_CLS": 101, "TOKEN_SEP": 102},
summary_features=["BM25(title)"]
)
RankProfile('new', 'BM25(title)', 'default', {'TOKEN_NONE': 0, 'TOKEN_CLS': 101, 'TOKEN_SEP': 102}, None, ['BM25(title)'], None, None, None, None, None, None, None, None, None)
RankProfile(
name="bert",
first_phase="bm25(title) + bm25(body)",
second_phase=SecondPhaseRanking(expression="1.25 * bm25(title) + 3.75 * bm25(body)", rerank_count=10),
inherits="default",
constants={"TOKEN_NONE": 0, "TOKEN_CLS": 101, "TOKEN_SEP": 102},
functions=[
Function(
name="question_length",
expression="sum(map(query(query_token_ids), f(a)(a > 0)))"
),
Function(
name="doc_length",
expression="sum(map(attribute(doc_token_ids), f(a)(a > 0)))"
)
],
summary_features=["question_length", "doc_length"]
)
RankProfile('bert', 'bm25(title) + bm25(body)', 'default', {'TOKEN_NONE': 0, 'TOKEN_CLS': 101, 'TOKEN_SEP': 102}, [Function('question_length', 'sum(map(query(query_token_ids), f(a)(a > 0)))', None), Function('doc_length', 'sum(map(attribute(doc_token_ids), f(a)(a > 0)))', None)], ['question_length', 'doc_length'], None, SecondPhaseRanking('1.25 * bm25(title) + 3.75 * bm25(body)', 10, None), None, None, None, None, None, None, None)
RankProfile(
name = "default",
first_phase = "nativeRank(title, body)",
weight = [("title", 200), ("body", 100)]
)
RankProfile('default', 'nativeRank(title, body)', None, None, None, None, None, None, None, None, None, [('title', 200), ('body', 100)], None, None, None)
RankProfile(
name = "default",
first_phase = "nativeRank(title, body)",
rank_type = [("body", "about")]
)
RankProfile('default', 'nativeRank(title, body)', None, None, None, None, None, None, None, None, None, None, [('body', 'about')], None, None)
RankProfile(
name = "default",
first_phase = "nativeRank(title, body)",
rank_properties = [("fieldMatch(title).maxAlternativeSegmentations", "10")]
)
RankProfile('default', 'nativeRank(title, body)', None, None, None, None, None, None, None, None, None, None, None, [('fieldMatch(title).maxAlternativeSegmentations', '10')], None)
RankProfile(
name = "default",
first_phase = FirstPhaseRanking(expression="nativeRank(title, body)", keep_rank_count=50)
)
RankProfile('default', FirstPhaseRanking('nativeRank(title, body)', 50, None), None, None, None, None, None, None, None, None, None, None, None, None, None)
RankProfile(
name = "default",
first_phase = "nativeRank(title, body)",
num_threads_per_search = 2
)
RankProfile('default', 'nativeRank(title, body)', None, None, None, None, None, None, None, None, 2, None, None, None, None)
OnnxModel(model_name, model_file_path, inputs, outputs)
Bases: object
Create a Vespa ONNX model config.
Vespa has support for advanced ranking models through its tensor API. If you have your model in the ONNX format, Vespa can import the models and use them directly. Check the Vespa documentation for more detailed information about field sets.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_name
|
str
|
Unique model name to use as an ID when referencing the model. |
required |
model_file_path
|
str
|
ONNX model file path. |
required |
inputs
|
dict
|
Dict mapping the ONNX input names as specified in the ONNX file to valid Vespa inputs.
These can be a document field ( |
required |
outputs
|
dict
|
Dict mapping the ONNX output names as specified in the ONNX file to the name used in Vespa to specify the output. If omitted, the first output in the ONNX file will be used. |
required |
Example
OnnxModel(
model_name="bert",
model_file_path="bert.onnx",
inputs={
"input_ids": "input_ids",
"token_type_ids": "token_type_ids",
"attention_mask": "attention_mask",
},
outputs={"logits": "logits"},
)
OnnxModel('bert', 'bert.onnx', {'input_ids': 'input_ids', 'token_type_ids': 'token_type_ids', 'attention_mask': 'attention_mask'}, {'logits': 'logits'})
Schema(name, document, fieldsets=None, rank_profiles=None, models=None, global_document=False, imported_fields=None, document_summaries=None, mode='index', inherits=None, **kwargs)
Bases: object
Create a Vespa Schema.
Check the Vespa documentation for more detailed information about schemas.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Schema name. |
required |
document
|
Document
|
Vespa |
required |
fieldsets
|
list
|
A list of |
None
|
rank_profiles
|
list
|
A list of |
None
|
models
|
list
|
A list of |
None
|
global_document
|
bool
|
Set to True to copy the documents to all content nodes. Defaults to False. |
False
|
imported_fields
|
list
|
A list of |
None
|
document_summaries
|
list
|
A list of |
None
|
mode
|
str
|
Schema mode. Defaults to 'index'. Other options are 'store-only' and 'streaming'. |
'index'
|
inherits
|
str
|
Schema to inherit from. |
None
|
stemming
|
str
|
The default stemming setting. Defaults to 'best'. |
required |
Example
add_fields(*fields)
Add Field
to the Schema's Document
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fields
|
list
|
A list of |
()
|
add_field_set(field_set)
Add a FieldSet
to the Schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
field_set
|
list
|
A list of |
required |
add_rank_profile(rank_profile)
Add a RankProfile
to the Schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rank_profile
|
RankProfile
|
The rank profile to be added to the Schema. |
required |
Returns:
Type | Description |
---|---|
None
|
None |
add_model(model)
Add an OnnxModel
to the Schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model
|
OnnxModel
|
The ONNX model to be added to the Schema. |
required |
Returns:
Type | Description |
---|---|
None
|
None |
add_imported_field(imported_field)
Add an ImportedField
to the Schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
imported_field
|
ImportedField
|
The imported field to be added to the Schema. |
required |
add_document_summary(document_summary)
Add a DocumentSummary
to the Schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
document_summary
|
DocumentSummary
|
The document summary to be added to the Schema. |
required |
QueryTypeField(name, type)
Bases: object
Create a field to be included in a QueryProfileType
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Field name. |
required |
type
|
str
|
Field type. |
required |
Example
QueryProfileType(fields=None)
Bases: object
Create a Vespa Query Profile Type.
Check the Vespa documentation for more detailed information about query profile types.
An ApplicationPackage
instance comes with a default QueryProfile
named default
that is associated with a QueryProfileType
named root
,
meaning that you usually do not need to create those yourself, only add fields to them when required.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fields
|
list[QueryTypeField]
|
A list of |
None
|
Example
add_fields(*fields)
Add QueryTypeField
objects to the Query Profile Type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fields
|
QueryTypeField
|
Fields to be added. |
()
|
QueryField(name, value)
QueryProfile(fields=None)
Bases: object
Create a Vespa Query Profile.
Check the Vespa documentation for more detailed information about query profiles.
A QueryProfile
is a named collection of query request parameters given in the configuration.
The query request can specify a query profile whose parameters will be used as parameters of that request.
The query profiles may optionally be type-checked.
Type checking is turned on by referencing a QueryProfileType
from the query profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fields
|
list[QueryField]
|
A list of |
None
|
Example
add_fields(*fields)
Add QueryField
objects to the Query Profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fields
|
QueryField
|
Fields to be added. |
()
|
ApplicationConfiguration(name, value)
Bases: object
Create a Vespa Schema.
Check the Config documentation for more detailed information about generic configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Configuration name. |
required |
value
|
str | dict
|
Either a string or a dictionary (which may be nested) of values. |
required |
Example
Parameter(name, args=None, children=None)
Bases: object
Create a Vespa Component configuration parameter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Parameter name. |
required |
args
|
Any
|
Parameter arguments. |
None
|
children
|
str | list[Parameter]
|
Parameter children. Can be either a string or a list of |
None
|
AuthClient(id, permissions, parameters=None)
Bases: object
Create a Vespa AuthClient.
Check the Vespa documentation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id
|
str
|
The auth client ID. |
required |
permissions
|
list[str]
|
List of permissions. |
required |
parameters
|
list[Parameter]
|
List of |
None
|
Example
Component(id, cls=None, bundle=None, type=None, parameters=None)
Bases: object
Nodes(count='1', parameters=None)
Bases: object
Specify node resources for a content or container cluster as part of a ContainerCluster
or ContentCluster
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
count
|
int
|
Number of nodes in a cluster. |
'1'
|
parameters
|
list[Parameter]
|
List of |
None
|
Example
ContainerCluster(
id="example_container",
nodes=Nodes(
count="2",
parameters=[
Parameter(
"resources",
{"vcpu": "4.0", "memory": "16Gb", "disk": "125Gb"},
children=[Parameter("gpu", {"count": "1", "memory": "16Gb"})]
),
Parameter("node", {"hostalias": "node1", "distribution-key": "0"}),
]
)
)
# Output: ContainerCluster(id="example_container", version="1.0", nodes="Nodes(count='2')")
Cluster(id, version='1.0', nodes=None)
Bases: object
Base class for a cluster configuration. Should not be instantiated directly.
Use subclasses ContainerCluster
or ContentCluster
instead.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id
|
str
|
Cluster ID. |
required |
version
|
str
|
Cluster version. |
'1.0'
|
nodes
|
Nodes
|
|
None
|
to_xml(root)
Set up XML elements that are used in both container and content clusters.
ContainerCluster(id, version='1.0', nodes=None, components=None, auth_clients=None)
Bases: Cluster
Defines the configuration of a container cluster.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
components
|
list[Component]
|
List of |
None
|
auth_clients
|
list[AuthClient]
|
List of |
None
|
nodes
|
Nodes
|
|
None
|
If ContainerCluster
is used, any Component
s must be added to the ContainerCluster
,
rather than to the ApplicationPackage
, in order to be included in the generated schema.
Example
ContainerCluster(
id="example_container",
components=[
Component(
id="e5",
type="hugging-face-embedder",
parameters=[
Parameter(
"transformer-model",
{"url": "https://github.com/vespa-engine/sample-apps/raw/master/examples/model-exporting/model/e5-small-v2-int8.onnx"}
),
Parameter(
"tokenizer-model",
{"url": "https://raw.githubusercontent.com/vespa-engine/sample-apps/master/examples/model-exporting/model/tokenizer.json"}
)
]
)
],
auth_clients=[AuthClient(id="mtls", permissions=["read", "write"])],
nodes=Nodes(count="2", parameters=[Parameter("resources", {"vcpu": "4.0", "memory": "16Gb", "disk": "125Gb"})])
)
# Output: ContainerCluster(id="example_container", version="1.0", nodes="Nodes(count='2')", components="[Component(id='e5', type='hugging-face-embedder')]", auth_clients="[AuthClient(id='mtls', permissions=['read', 'write'])]")
ContentCluster(id, document_name, version='1.0', nodes=None, min_redundancy='1')
Bases: Cluster
Defines the configuration of a content cluster.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
document_name
|
str
|
Name of document. |
required |
min_redundancy
|
int
|
Minimum redundancy of the content cluster. Must be at least 2 for production deployments. |
'1'
|
Example
ValidationID
Bases: Enum
Collection of IDs that can be used in validation-overrides.xml.
Taken from ValidationId.java.
clusterSizeReduction
was not added as it will be removed in Vespa 9.
indexingChange = 'indexing-change'
class-attribute
instance-attribute
Changing what tokens are expected and stored in field indexes
indexModeChange = 'indexing-mode-change'
class-attribute
instance-attribute
Changing the index mode (streaming, indexed, store-only) of documents
fieldTypeChange = 'field-type-change'
class-attribute
instance-attribute
Field type changes
tensorTypeChange = 'tensor-type-change'
class-attribute
instance-attribute
Tensor type change
resourcesReduction = 'resources-reduction'
class-attribute
instance-attribute
Large reductions in node resources (> 50% of the current max total resources)
contentTypeRemoval = 'schema-removal'
class-attribute
instance-attribute
Removal of a schema (causes deletion of all documents)
contentClusterRemoval = 'content-cluster-removal'
class-attribute
instance-attribute
Removal (or id change) of content clusters
deploymentRemoval = 'deployment-removal'
class-attribute
instance-attribute
Removal of production zones from deployment.xml
globalDocumentChange = 'global-document-change'
class-attribute
instance-attribute
Changing global attribute for document types in content clusters
configModelVersionMismatch = 'config-model-version-mismatch'
class-attribute
instance-attribute
Internal use
skipOldConfigModels = 'skip-old-config-models'
class-attribute
instance-attribute
Internal use
accessControl = 'access-control'
class-attribute
instance-attribute
Internal use, used in zones where there should be no access-control
globalEndpointChange = 'global-endpoint-change'
class-attribute
instance-attribute
Changing global endpoints
zoneEndpointChange = 'zone-endpoint-change'
class-attribute
instance-attribute
Changing zone (possibly private) endpoint settings
redundancyIncrease = 'redundancy-increase'
class-attribute
instance-attribute
Increasing redundancy - may easily cause feed blocked
redundancyOne = 'redundancy-one'
class-attribute
instance-attribute
redundancy=1 requires a validation override on first deployment
pagedSettingRemoval = 'paged-setting-removal'
class-attribute
instance-attribute
May cause content nodes to run out of memory
certificateRemoval = 'certificate-removal'
class-attribute
instance-attribute
Remove data plane certificates
Validation(validation_id, until, comment=None)
Bases: object
Represents a validation to be overridden on application.
Check the Vespa documentation for more detailed information about validations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
validation_id
|
str
|
ID of the validation. |
required |
until
|
str
|
The last day this change is allowed, as an ISO-8601-format date in UTC, e.g. 2016-01-30.
Dates may at most be 30 days in the future, but should be as close to now as possible for safety,
while allowing time for review and propagation to all deployed zones. |
required |
comment
|
str
|
Optional text explaining the reason for the change to humans. |
None
|
DeploymentConfiguration(environment, regions)
Bases: object
Create a DeploymentConfiguration, which defines how to generate a deployment.xml file (for use in production deployments).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
environment
|
str
|
The environment to deploy to. Currently, only 'prod' is supported. |
required |
regions
|
list[str]
|
List of regions to deploy to, e.g. ["us-east-1", "us-west-1"]. See Vespa documentation for more information. |
required |
Example
EmptyDeploymentConfiguration()
Bases: DeploymentConfiguration
Create an EmptyDeploymentConfiguration, which creates an empty deployment.xml, used to delete production deployments.
ServicesConfiguration(application_name, schemas=None, configurations=[], stateless_model_evaluation=False, components=[], auth_clients=[], clusters=[], services_config=None)
Bases: object
Create a ServicesConfiguration, adopting the VespaTag (VT) approach, rather than Jinja templates.
Intended to be used in ApplicationPackage, to generate services.xml, based on either:
- A passed services_config
(VT) object, or
- A set of configurations, schemas, components, auth_clients, and clusters (equivalent to the old approach).
The latter will be done in code by calling build_services_vt()
to generate the VT object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
application_name
|
str
|
Application name. |
required |
schemas
|
Optional[List[Schema]]
|
List of |
None
|
configurations
|
Optional[List[ApplicationConfiguration]]
|
List of |
[]
|
stateless_model_evaluation
|
Optional[bool]
|
Enable stateless model evaluation. Default is False. |
False
|
components
|
Optional[List[Component]]
|
List of |
[]
|
auth_clients
|
Optional[List[AuthClient]]
|
List of |
[]
|
clusters
|
Optional[List[Cluster]]
|
List of |
[]
|
services_config
|
Optional[VT]
|
|
None
|
Example
config = ServicesConfiguration(
application_name="myapp",
schemas=[Schema(name="myschema", document=Document())],
configurations=[ApplicationConfiguration(name="container.handler.observability.application-userdata", value={"version": "my-version"})],
components=[Component(id="hf-embedder", type="huggingface-embedder")],
stateless_model_evaluation=True,
)
print(str(config))
# Output: <?xml version="1.0" encoding="UTF-8" ?>
# <services version="1.0">...</services>
services_config = ServicesConfiguration(
application_name="myapp",
services_config=services(
container(id="myapp_default", version="1.0")(
component(
model(url="https://huggingface.co/mixedbread-ai/mxbai-rerank-xsmall-v1/raw/main/tokenizer.json"),
id="tokenizer", type="hugging-face-tokenizer"
),
document_api(),
search(),
),
content(id="myapp", version="1.0")(
min_redundancy("1"),
documents(document(type="doc", mode="index")),
engine(proton(tuning(searchnode(requestthreads(persearch("4"))))))
),
version="1.0", minimum_required_vespa_version="8.311.28",
),
)
print(str(services_config))
# Output: <?xml version="1.0" encoding="UTF-8" ?>
# <services version="1.0" minimum-required-vespa-version="8.311.28">...</services>
ApplicationPackage(name, schema=None, query_profile=None, query_profile_type=None, stateless_model_evaluation=False, create_schema_by_default=True, create_query_profile_by_default=True, configurations=None, validations=None, components=None, auth_clients=None, clusters=None, deployment_config=None, services_config=None)
Bases: object
Create an application package.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
Application name. Cannot contain '-' or '_'. |
required |
schema
|
list
|
List of Schema objects for the application. If None, a default Schema with the same name as the application will be created. Defaults to None. |
None
|
query_profile
|
QueryProfile
|
QueryProfile of the application. If None, a default QueryProfile with QueryProfileType 'root' will be created. Defaults to None. |
None
|
query_profile_type
|
QueryProfileType
|
QueryProfileType of the application. If None, a default QueryProfileType 'root' will be created. Defaults to None. |
None
|
stateless_model_evaluation
|
bool
|
Enable stateless model evaluation. Defaults to False. |
False
|
create_schema_by_default
|
bool
|
Include a default Schema if none is provided in the schema argument. Defaults to True. |
True
|
create_query_profile_by_default
|
bool
|
Include a default QueryProfile and QueryProfileType if not explicitly defined by the user. Defaults to True. |
True
|
configurations
|
list
|
List of ApplicationConfiguration for the application. Defaults to None. |
None
|
validations
|
list
|
Optional list of Validation objects to be overridden. Defaults to None. |
None
|
components
|
list
|
List of Component objects for application components. Defaults to None. |
None
|
clusters
|
list
|
List of Cluster objects for content or container clusters. If clusters is provided, any Component must be part of a cluster. Defaults to None. |
None
|
auth_clients
|
list
|
List of AuthClient objects for client authorization. If clusters is passed, pass the auth clients to the ContainerCluster instead. Defaults to None. |
None
|
deployment_config
|
DeploymentConfiguration
|
Configuration for production deployments. Defaults to None. |
None
|
Example
To create a default application package:
This creates a default Schema, QueryProfile, and QueryProfileType, which can be populated with your application's specifics.
services_to_text
property
Intention is to only use services_config, but keeping this until 100% compatibility is achieved through tests.
add_schema(*schemas)
Add Schema's to the application package.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
schemas
|
list
|
Schemas to be added. |
()
|
Returns:
Type | Description |
---|---|
None
|
None |
to_zip()
Return the application package as zipped bytes, to be used in a subsequent deploy.
Returns:
Name | Type | Description |
---|---|---|
BytesIO |
BytesIO
|
A buffer containing the zipped application package. |
to_zipfile(zfile)
Export the application package as a deployable zipfile. See application packages for deployment options.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
zfile
|
str
|
Filename to export to. |
required |
Returns:
Type | Description |
---|---|
None
|
None |
to_files(root)
Export the application package as a directory tree.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
root
|
str
|
Directory to export files to. |
required |
Returns:
Type | Description |
---|---|
None
|
None |
validate_services(xml_input)
Validate an XML input against the RelaxNG schema file for services.xml
Parameters:
Name | Type | Description | Default |
---|---|---|---|
xml_input
|
Path or str or Element
|
The XML input to validate. |
required |
Returns: True if the XML input is valid according to the RelaxNG schema, False otherwise.