query():
Query Parameters
You can specify additional parameters for the query, such as:
top_k: the number of results to return.
n_probes: the number of clusters to search for each query vector.
filters: a list of metadata filters to apply to the query.
include: a list of item fields to return (e.g., ["distance", "metadata"]).
greedy: whether to perform a greedy search (higher recall but slower).
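As a rough sketch, the parameters listed above could be passed as keyword arguments. The helper name query_with_options and the query_vectors keyword are assumptions made for illustration, and index stands for an index object that already exists.

```python
def query_with_options(index, query_vector):
    """Run one query using the optional parameters described above.

    Assumptions: `index` is an existing index object exposing query(),
    and the vector is passed under a `query_vectors` keyword.
    """
    return index.query(
        query_vectors=query_vector,
        top_k=5,                           # return at most 5 results
        n_probes=4,                        # clusters searched per query vector
        include=["distance", "metadata"],  # item fields to return
        greedy=False,                      # leave greedy search off
    )
```

The specific values (5 results, 4 probes) are placeholders; suitable settings depend on the dataset and on how much recall is needed.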
Batched Queries
It’s also possible to perform batch queries by passing a list of query vectors to query():
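A batched call might look like the sketch below. The helper batch_query, the query_vectors keyword, and the dimension check are illustrative assumptions, not part of the documented API.

```python
def batch_query(index, query_vectors, top_k=10):
    """Send several query vectors in a single query() call.

    Assumptions: `index` is an existing index object, and the list of
    vectors is passed under a `query_vectors` keyword.
    """
    # Sanity check: the vectors in one batch should agree on their dimension.
    dims = {len(vector) for vector in query_vectors}
    if len(dims) != 1:
        raise ValueError(f"inconsistent vector dimensions: {sorted(dims)}")
    return index.query(query_vectors=query_vectors, top_k=top_k)
```

Checking dimensions on the client side is optional; it only surfaces a malformed batch before the call is made.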
Querying with Metadata Filters
You can filter query results based on metadata fields. For example, to filter items where the age field is greater than 18, you can use the following filter:
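Such a filter might look like the sketch below. The $gt operator is inferred by analogy with the other operators named in this section, and the query_vectors keyword and the helper query_adults are assumptions; the Metadata Filtering guide is the authority on the exact filter shape.

```python
# Match items whose "age" metadata field is greater than 18.
# Assumption: "$gt" is the greater-than operator.
age_filter = {"age": {"$gt": 18}}


def query_adults(index, query_vector):
    """Run a query restricted by age_filter (keyword names are assumptions)."""
    return index.query(query_vectors=query_vector, filters=age_filter)
```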
This filter matches items where the age field is greater than 18. You can also use other comparison operators such as $lt, $gte, $lte, $eq, and $neq.
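To make the meaning of these operators concrete, here is a small client-side sketch that evaluates a filter against one item's metadata. It only illustrates the assumed semantics and is not how CyborgDB evaluates filters; the names OPERATORS and matches are hypothetical, and $gt is inferred rather than quoted from the text.

```python
import operator

# Assumed meaning of each comparison operator.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$eq": operator.eq,
    "$neq": operator.ne,
}


def matches(metadata, filters):
    """Return True if `metadata` satisfies every condition in `filters`."""
    for field, conditions in filters.items():
        if field not in metadata:
            return False  # sketch's choice: a missing field never matches
        for op_name, expected in conditions.items():
            if not OPERATORS[op_name](metadata[field], expected):
                return False
    return True
```

For instance, matches({"age": 21}, {"age": {"$gt": 18}}) is true, while the same filter rejects an item whose age is exactly 18.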
For more details on metadata filters, see the Metadata Filtering guide.
Automatic Embedding Generation
This feature is only available in Python. To use it, install the embeddings extra with pip install cyborgdb-core[embeddings]. If you provided an embedding_model during index creation, you can automatically generate embeddings for queries by providing query_contents to the query() call:
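A minimal sketch of such a call follows. The helper name query_by_text is hypothetical, and index is assumed to be an index that was created with an embedding_model.

```python
def query_by_text(index, text, top_k=5):
    """Query with raw text instead of a precomputed vector.

    Assumes `index` was created with an embedding_model, so that query()
    can embed `text` itself when it is passed as query_contents.
    """
    return index.query(query_contents=text, top_k=top_k)
```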
Embedding generation uses sentence-transformers. You can use any model from the HuggingFace Model Hub that is compatible with sentence-transformers.
Retrieving Items Post-Query
In certain applications, such as RAG, it may be desirable to retrieve matching items after a query. This is possible via get(), which retrieves and decrypts items added via upsert(). For more details, see the Get Items guide.
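The query-then-retrieve pattern could be sketched as follows. The shapes used here are assumptions: that query() takes a query_vectors keyword, that each result is a dict with an "id" key, and that get() accepts a list of IDs. The Get Items guide has the actual signature.

```python
def query_then_get(index, query_vector, top_k=3):
    """Query the index, then fetch the matching items with get().

    Assumptions: each query result is a dict with an "id" key, and
    get() accepts a list of item IDs.
    """
    results = index.query(query_vectors=query_vector, top_k=top_k)
    ids = [result["id"] for result in results]
    return index.get(ids)
```

In a RAG pipeline, the returned items would typically supply the text passed to the language model as context.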
Note on Trained vs. Untrained Queries
For the embedded library version of CyborgDB, queries initially default to ‘untrained’ queries, which use an exhaustive search algorithm. This is fine for small datasets, but once you have more than 50,000 vectors in your index, you should train the index. Without doing so, queries will run slower. For more details, see Training an Encrypted Index.
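The rule of thumb above can be expressed as a small check. The names TRAINING_THRESHOLD and should_train are hypothetical, and how the vector count and trained state are obtained is left to the caller.

```python
TRAINING_THRESHOLD = 50_000  # vector count above which training is recommended


def should_train(num_vectors, already_trained=False):
    """Return True once an untrained index has grown past the threshold."""
    return (not already_trained) and num_vectors > TRAINING_THRESHOLD
```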