from learntorank.query import QueryModel, Ranking, OR

standard_query_model = QueryModel(
    name="or_bm25",
    match_phase=OR(),
    ranking=Ranking(name="bm25")
)
Query models
Starting in version 0.5.0, we can bypass the pyvespa high-level API and create a QueryModel with the full flexibility of the Vespa Query API. This is useful for use cases not covered by the pyvespa API and for users who are familiar with the Vespa Query API and prefer to work with it directly.
def body_function(query):
    body = {'yql': 'select * from sources * where userQuery();',
            'query': query,
            'type': 'any',
            'ranking': {'profile': 'bm25', 'listFeatures': 'false'}}
    return body

flexible_query_model = QueryModel(body_function=body_function)
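A body function is plain Python, so it can be called on its own to inspect the request body it would produce before it is wrapped in a QueryModel. The sketch below repeats body_function from above and prints the resulting dictionary. It does not contact Vespa, and using json.dumps for display is only an illustration choice.

```python
import json

def body_function(query):
    # Same body as above: match any query term and rank with the bm25 profile.
    body = {'yql': 'select * from sources * where userQuery();',
            'query': query,
            'type': 'any',
            'ranking': {'profile': 'bm25', 'listFeatures': 'false'}}
    return body

# Build the body for one query string and pretty-print it for inspection.
body = body_function("this is a test")
print(json.dumps(body, indent=2))
```

Only the 'query' entry changes from one call to the next; the YQL statement, the grammar type, and the rank profile stay fixed for every query sent with this model.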
The flexible_query_model defined above is equivalent to the standard_query_model, as we can see when querying the app. We will use the cord19 app in our demonstration.
from vespa.application import Vespa

app = Vespa(url="https://api.cord19.vespa.ai")
from learntorank.query import send_query

standard_result = send_query(
    app=app,
    query="this is a test",
    query_model=standard_query_model
)
standard_result.get_hits().head(3)
flexible_result = send_query(
    app=app,
    query="this is a test",
    query_model=flexible_query_model
)
flexible_result.get_hits().head(3)
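One way to check the equivalence beyond reading the two tables is to compare the ids of the top hits from each result. The sketch below is a minimal illustration under stated assumptions: top_ids_of is a hypothetical helper, and the two hit lists are made-up stand-ins for standard_result.hits and flexible_result.hits, not real output from the cord19 app.

```python
def top_ids_of(hits, n=3):
    # Collect the Vespa "id" field of the first n hits, in ranked order.
    return [hit["fields"]["id"] for hit in hits[:n]]

# Made-up stand-ins for standard_result.hits and flexible_result.hits.
standard_hits = [{"fields": {"id": 101}}, {"fields": {"id": 202}}, {"fields": {"id": 303}}]
flexible_hits = [{"fields": {"id": 101}}, {"fields": {"id": 202}}, {"fields": {"id": 303}}]

# Equivalent query models should return the same documents in the same order.
same_top_hits = top_ids_of(standard_hits) == top_ids_of(flexible_hits)
print(same_top_hits)
```

With live results, the same helper could be applied to standard_result.hits and flexible_result.hits in place of the placeholder lists.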
Specify a query model
Query + term-matching + rank profile
from learntorank.query import QueryModel, OR, Ranking, send_query

results = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?",
    query_model=QueryModel(
        match_phase=OR(),
        ranking=Ranking(name="bm25")
    )
)
results.number_documents_retrieved
Query + term-matching + ANN operator + rank profile
from learntorank.query import QueryModel, QueryRankingFeature, ANN, WeakAnd, Union, Ranking
from random import random

match_phase = Union(
    WeakAnd(hits=10),
    ANN(
        doc_vector="specter_embedding",
        query_vector="specter_vector",
        hits=10,
        label="title"
    )
)
ranking = Ranking(name="related-specter", list_features=True)
query_model = QueryModel(
    query_properties=[QueryRankingFeature(
        name="specter_vector",
        mapping=lambda x: [random() for x in range(768)]
    )],
    match_phase=match_phase, ranking=ranking
)
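The mapping argument of QueryRankingFeature is a function from the query text to the vector used for that query. The lambda above ignores the text and returns 768 random numbers as a placeholder. A minimal sketch of a named equivalent follows; random_specter_vector and its dim and seed parameters are hypothetical names introduced here for illustration, and the dimension 768 is taken from the example above.

```python
from random import Random

def random_specter_vector(query, dim=768, seed=None):
    # Placeholder embedding: the query text is ignored and dim random
    # floats in [0, 1) are returned, mirroring the lambda in the example.
    rng = Random(seed)
    return [rng.random() for _ in range(dim)]

vector = random_specter_vector(
    "Is remdesivir an effective treatment for COVID-19?", seed=0
)
print(len(vector))
```

Because the function takes the query as its first argument and has defaults for the rest, it could presumably be passed as mapping=random_specter_vector. A real application would replace the random numbers with the output of an embedding model that matches the document vectors.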
results = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?",
    query_model=query_model
)
results.number_documents_retrieved
Recall specific documents
Let’s take a look at the top 3 ids from the last query.
top_ids = [hit["fields"]["id"] for hit in results.hits[0:3]]
top_ids
Assume that we now want to retrieve only the second and third ids above. We can do so with the recall argument.
results_with_recall = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?",
    query_model=query_model,
    recall=("id", top_ids[1:3])
)
It will only retrieve the documents whose Vespa field id is in the list given as the second element of the tuple.
id_recalled = [hit["fields"]["id"] for hit in results_with_recall.hits]
id_recalled
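The effect of the recall tuple can be mimicked locally to make its meaning concrete: the first element names a field, the second lists the accepted values. The sketch below is only an illustration of that semantics. keep_recalled is a hypothetical helper, the hits are made-up placeholders, and the real restriction is applied by Vespa at query time rather than by filtering results afterwards.

```python
def keep_recalled(hits, recall):
    # recall is a (field_name, accepted_values) tuple, as in the example above.
    field, accepted = recall
    accepted = set(accepted)
    return [hit for hit in hits if hit["fields"][field] in accepted]

# Made-up hits standing in for results.hits.
hits = [{"fields": {"id": 11}}, {"fields": {"id": 22}}, {"fields": {"id": 33}}]
top_ids = [hit["fields"]["id"] for hit in hits[0:3]]

# Keep only the second and third ids, as in the recall example.
recalled = keep_recalled(hits, ("id", top_ids[1:3]))
id_recalled = [hit["fields"]["id"] for hit in recalled]
print(id_recalled)
```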