query_binary is the high-throughput query path. Query vectors are sent as base64-encoded binary instead of JSON arrays, which is significantly faster for large batch queries.
results = index.query_binary(
    query_vectors,
    top_k=None,
    n_probes=None,
    filters=None,
    include=None,
    greedy=None,
    rerank_mult=None,
)
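To make the "base64-encoded binary" idea concrete, here is a minimal sketch of how a batch of vectors could be packed as little-endian float32 bytes and then base64-encoded. The helper names `encode_query_vectors` and `decode_query_vectors` are hypothetical and are not part of the SDK; the actual wire format used by query_binary may include additional framing that is not described on this page.

```python
import base64

import numpy as np


def encode_query_vectors(query_vectors: np.ndarray) -> str:
    """Hypothetical helper: base64 text of little-endian float32 bytes."""
    # "<f4" is NumPy's dtype code for little-endian 32-bit floats.
    # ascontiguousarray casts if needed and guarantees row-major layout.
    arr = np.ascontiguousarray(query_vectors, dtype="<f4")
    return base64.b64encode(arr.tobytes()).decode("ascii")


def decode_query_vectors(payload: str, dim: int) -> np.ndarray:
    """Hypothetical inverse: recover an (n_queries, dim) float32 array."""
    raw = base64.b64decode(payload)
    return np.frombuffer(raw, dtype="<f4").reshape(-1, dim)


queries = np.random.rand(32, 384)        # float64 by default
payload = encode_query_vectors(queries)  # cast to float32 happens here
restored = decode_query_vectors(payload, dim=384)
```

Each float32 value occupies 4 bytes before base64 encoding, whereas a JSON array spells every number out as decimal text that has to be parsed back on the server.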

Parameters

Parameter | Type | Default | Description
--- | --- | --- | ---
query_vectors | np.ndarray | - | A 1D array (dim,) for a single query, or a 2D array (n_queries, dim) for batch queries. The SDK casts to little-endian float32 if needed.
top_k | int | None (server default: 100) | (Optional) Number of nearest neighbors to return per query.
n_probes | int | None (auto) | (Optional) Number of clusters to probe. Higher = better recall, more latency.
filters | Dict | None | (Optional) Metadata filter expression. See metadata filtering.
include | List[str] | None (server default returns only id) | (Optional) Subset of "distance", "metadata", "vector". Unlike get, "contents" is not a valid value here, because the query response model has no contents field.
greedy | bool | None | (Optional) Use the greedy approximate search algorithm.
rerank_mult | int | None (server default: 10) | (Optional) Multiplier for stage 1 retrieval on reranking indexes. Stage 1 returns top_k * rerank_mult candidates before reranking narrows to top_k. Ignored by indexes that do not rerank.
Note that the include defaults differ between endpoints: query_binary defaults to [] (only id is returned), while get defaults to ["vector", "contents", "metadata"].
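As a quick illustration of the rerank_mult arithmetic described in the parameter list, the sketch below computes how many stage 1 candidates a reranking index would retrieve. The function `stage1_candidates` is a hypothetical helper for illustration only; its defaults simply mirror the documented server defaults (top_k of 100, rerank_mult of 10).

```python
def stage1_candidates(top_k: int = 100, rerank_mult: int = 10) -> int:
    """Hypothetical helper: candidates fetched in stage 1 before reranking."""
    # Documented relationship: stage 1 returns top_k * rerank_mult candidates,
    # and reranking then narrows that set back down to top_k.
    return top_k * rerank_mult


default_pool = stage1_candidates()                       # 100 * 10
small_pool = stage1_candidates(top_k=5, rerank_mult=4)   # 5 * 4
print(default_pool, small_pool)
```

A larger rerank_mult gives the reranker more candidates to choose from, at the cost of more stage 1 work per query.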

Returns

  • Single query (1D query_vectors): List[Dict] — one result dict per neighbor.
  • Batch query (2D query_vectors): List[List[Dict]] — one inner list per query vector.
Each result dict contains id plus whichever of distance, metadata, vector were requested via include.
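Because the return shape depends on the dimensionality of query_vectors, code that handles both cases may want to normalize it first. The sketch below assumes a hypothetical helper `as_batches`, which is not part of the SDK, that wraps single-query results so callers can always iterate over one inner list per query.

```python
from typing import Any, Dict, List

Result = Dict[str, Any]


def as_batches(results: list, query_ndim: int) -> List[List[Result]]:
    """Hypothetical helper: always return one inner list per query vector."""
    if query_ndim == 1:
        # Single query: results is List[Dict], so wrap it in an outer list.
        return [results]
    if query_ndim == 2:
        # Batch query: results is already List[List[Dict]].
        return results
    raise ValueError("query_ndim must be 1 or 2")


# Stand-in data shaped like the documented return values.
single = [{"id": "a", "distance": 0.1}, {"id": "b", "distance": 0.2}]
batch = [[{"id": "a"}], [{"id": "c"}, {"id": "d"}]]

for i, hits in enumerate(as_batches(single, query_ndim=1)):
    print(i, [hit["id"] for hit in hits])
```

Passing the query's ndim explicitly avoids guessing the shape from the results themselves, which would be ambiguous when a query returns no hits.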

Exceptions

  • TypeError: query_vectors is not a numpy array.
  • ValueError: query_vectors.ndim is not 1 or 2, or the server returns an error.
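The two client-side conditions above can be checked before a request is ever sent. The following is a sketch that mirrors the documented conditions; `check_query_vectors` is a hypothetical name and this is not the SDK's actual implementation.

```python
import numpy as np


def check_query_vectors(query_vectors) -> np.ndarray:
    """Hypothetical pre-flight check mirroring the documented exceptions."""
    if not isinstance(query_vectors, np.ndarray):
        # Documented: TypeError when the input is not a numpy array.
        raise TypeError(
            f"query_vectors must be a numpy array, got {type(query_vectors).__name__}"
        )
    if query_vectors.ndim not in (1, 2):
        # Documented: ValueError when ndim is neither 1 nor 2.
        raise ValueError(
            f"query_vectors must be 1D or 2D, got ndim={query_vectors.ndim}"
        )
    return query_vectors


# A plain Python list fails the type check; converting it first passes.
as_list = [0.1, 0.2, 0.3]
checked = check_query_vectors(np.asarray(as_list, dtype=np.float32))
```

Since a server-side error also surfaces as ValueError, callers that need to tell the two apart would have to inspect the exception message.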

Example Usage

Single query

import numpy as np

q = np.random.rand(384).astype(np.float32)
results = index.query_binary(q, top_k=5, include=["distance", "metadata"])
for r in results:
    print(r["id"], r["distance"])

Batch query

queries = np.random.rand(32, 384).astype(np.float32)
batch_results = index.query_binary(queries, top_k=10)
for i, results in enumerate(batch_results):
    print(f"Query {i}: {len(results)} hits")
For ad-hoc queries and most application code, prefer query() — it accepts either lists or numpy arrays and automatically routes to the binary path when numpy is supplied. Call query_binary directly only when you want to bypass the routing logic.