v0.17 collapsed the embedded family of index types (
IVFFlat, IVFPQ, IVFSQ) into a single DiskIVF index type at the service layer. The polymorphic index_config argument is gone — configuration is now expressed as flat parameters on create_index.memory, disk, or s3). The four knobs you control at create time are:
| Parameter | Default | What it controls |
|---|---|---|
dimension | auto-detect | Vector dimensionality. If you omit it, the server fixes the dimension to whatever shape the first upsert uses. |
metric | euclidean | Distance function used for similarity search. Accepts euclidean, squared_euclidean, or cosine. |
embedding_model | unset | Optional sentence-transformers model name. When set, the service generates embeddings server-side from contents on upsert/query; dimension is inferred from the model. |
storage_precision | float32 | On-disk rerank-vector dtype. float32 is highest recall; float16 halves on-disk storage at the cost of small recall loss. |
| Parameter | When to use it |
|---|---|
index_key | SDK-supplied KEK path. You generate and persist a 32-byte key locally; the service records the index as provider: none. The same key must be re-supplied on every subsequent call. |
kms_name | KMS-backed path. References a named entry in the service YAML’s kms.registry (e.g. aws-kms or aws). The service generates the KEK, wraps it via the named provider, and persists the envelope. The SDK never sees the plaintext key. |
Distance metric
Python SDK
TypeScript SDK
Go SDK
cURL
euclidean— L2 distance. Default if omitted.squared_euclidean— L2 without the square root; faster, ordering identical toeuclidean.cosine— cosine distance. Use with normalized vectors.
storage_precision: float32 vs float16
storage_precision selects the dtype used for the on-disk rerank vectors. Reranking happens after IVF candidate retrieval to recover recall; storing rerank vectors at lower precision halves the on-disk footprint with a small recall trade-off.
| Value | On-disk size | Recall | When to use |
|---|---|---|---|
float32 (default) | 4 bytes / element | Highest | Most workloads, especially small/medium indexes |
float16 | 2 bytes / element | Small recall loss | Large indexes where disk footprint matters more than the last 1–2% of recall |
Python SDK
TypeScript SDK
Go SDK
Automatic embeddings (embedding_model)
Pass a sentence-transformers model name and the server will embed text contents server-side on upsert and query. When embedding_model is set, dimension is inferred from the model and can be omitted.
Python SDK
TypeScript SDK
Go SDK
Training the index
DiskIVF needs to be trained once it has enough vectors. The service auto-triggers training whennum_vectors > n_lists * RETRAIN_THRESHOLD (default RETRAIN_THRESHOLD = 10000) — most callers do not need to call train() explicitly. See Train an Encrypted Index for the manual-training path and the tuning knobs (n_lists, batch_size, max_iters, tolerance, max_memory).
API reference
REST API Reference
POST /v1/indexes/createPython SDK Reference
Client.create_index() in PythonJS/TS SDK Reference
Client.createIndex() in JavaScript/TypeScriptGo SDK Reference
Client.CreateIndex() in Go