from learntorank.query import QueryModel, Ranking, OR

standard_query_model = QueryModel(
    name="or_bm25",
    match_phase=OR(),
    ranking=Ranking(name="bm25")
)

Query models
Starting in version 0.5.0, we can bypass the pyvespa high-level API and create a QueryModel with the full flexibility of the Vespa Query API. This is useful for use cases that the pyvespa API does not cover and for users who are familiar with the Vespa Query API and prefer to work with it directly.
def body_function(query):
    body = {'yql': 'select * from sources * where userQuery();',
            'query': query,
            'type': 'any',
            'ranking': {'profile': 'bm25', 'listFeatures': 'false'}}
    return body

flexible_query_model = QueryModel(body_function=body_function)

The flexible_query_model defined above is equivalent to the standard_query_model, as we can see when querying the app. We will use the cord19 app in our demonstration.
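Because the body function returns a plain dictionary of Vespa Query API parameters, it can be parameterized like any other Python function. The following is a minimal sketch, not part of learntorank: make_body_function and its rank_profile and hits arguments are hypothetical names introduced here for illustration, and hits is the Query API parameter that limits the number of returned results.

```python
def make_body_function(rank_profile="bm25", hits=10):
    """Return a body function for the given rank profile and hit count.

    Hypothetical helper for illustration; not part of learntorank.
    """
    def body_function(query):
        # Same request body as body_function above, plus the `hits` parameter.
        return {
            "yql": "select * from sources * where userQuery();",
            "query": query,
            "type": "any",
            "ranking": {"profile": rank_profile, "listFeatures": "false"},
            "hits": hits,
        }
    return body_function


# Build the request body for one query without contacting any application.
body = make_body_function(hits=3)("this is a test")
print(body["hits"], body["ranking"]["profile"])
```

Under these assumptions, the returned function could be passed to QueryModel(body_function=...) in the same way as body_function.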
from vespa.application import Vespa
app = Vespa(url="https://api.cord19.vespa.ai")

from learntorank.query import send_query

standard_result = send_query(
    app=app,
    query="this is a test",
    query_model=standard_query_model
)
standard_result.get_hits().head(3)

flexible_result = send_query(
    app=app,
    query="this is a test",
    query_model=flexible_query_model
)
flexible_result.get_hits().head(3)

Specify a query model
Query + term-matching + rank profile
from learntorank.query import QueryModel, OR, Ranking, send_query

results = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?",
    query_model=QueryModel(
        match_phase=OR(),
        ranking=Ranking(name="bm25")
    )
)

results.number_documents_retrieved

Query + term-matching + ann operator + rank_profile
from learntorank.query import QueryModel, QueryRankingFeature, ANN, WeakAnd, Union, Ranking
from random import random

match_phase = Union(
    WeakAnd(hits=10),
    ANN(
        doc_vector="specter_embedding",
        query_vector="specter_vector",
        hits=10,
        label="title"
    )
)
ranking = Ranking(name="related-specter", list_features=True)
query_model = QueryModel(
    query_properties=[QueryRankingFeature(
        name="specter_vector",
        # The mapping ignores the query text and returns a random 768-dimensional vector.
        mapping=lambda query: [random() for _ in range(768)]
    )],
    match_phase=match_phase, ranking=ranking
)

results = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?",
    query_model=query_model
)

results.number_documents_retrieved

Recall specific documents
Let’s take a look at the top 3 ids from the last query.
top_ids = [hit["fields"]["id"] for hit in results.hits[0:3]]
top_ids

Assume that we now want to retrieve the second and third ids above. We can do so with the recall argument.
results_with_recall = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?",
    query_model=query_model,
    recall=("id", top_ids[1:3])
)

Only documents whose Vespa field id matches one of the values in the list inside the tuple are retrieved.
id_recalled = [hit["fields"]["id"] for hit in results_with_recall.hits]
id_recalled
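One way to picture the recall argument is as an extra filter added to the request. The sketch below assumes the (field, values) tuple is translated into Vespa's recall query parameter using the +(field:value ...) form. build_recall_param is a hypothetical helper written for illustration, and the exact translation performed by learntorank may differ.

```python
def build_recall_param(recall):
    """Format a (field, values) tuple as a Vespa `recall` string.

    Hypothetical helper; assumes the `+(field:value ...)` recall syntax.
    """
    field, values = recall
    # One `field:value` term per requested document, separated by spaces.
    terms = " ".join(f"{field}:{value}" for value in values)
    return f"+({terms})"


# The ids below are made up for illustration.
print(build_recall_param(("id", [11, 42])))
```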