Adds new vectors to the index or updates existing ones. The Python SDK exposes a single positional API with two calling shapes:
# Shape 1: list of item dicts
index.upsert(items)

# Shape 2: parallel arrays — IDs + numpy array of vectors
index.upsert(ids, vectors)
For batches large enough that JSON encoding becomes a bottleneck, call upsert_binary directly: it sends vectors as base64-encoded binary and also accepts parallel metadata and contents lists. Shape 2 above is a thin wrapper that forwards to upsert_binary, passing only ids and vectors (no metadata or contents).
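To make the JSON bottleneck concrete, the sketch below compares the size of a JSON-encoded vector with the same values packed as float32 and base64-encoded. It uses only the standard library and does not reproduce the SDK's actual wire format; `pack_vector_b64` and `unpack_vector_b64` are hypothetical helper names, and the little-endian float32 layout is an assumption made for illustration.

```python
import base64
import json
import random
import struct


def pack_vector_b64(vector):
    """Pack floats as little-endian float32 and base64-encode the bytes.

    Hypothetical helper: the SDK's real encoding may differ.
    """
    raw = struct.pack(f"<{len(vector)}f", *vector)
    return base64.b64encode(raw).decode("ascii")


def unpack_vector_b64(encoded):
    """Reverse pack_vector_b64: base64 text back to a list of floats."""
    raw = base64.b64decode(encoded)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


random.seed(0)
vector = [random.random() for _ in range(128)]

# JSON spells out every digit of each float; float32 uses 4 bytes per value.
json_size = len(json.dumps(vector))
b64_size = len(pack_vector_b64(vector))
print(f"JSON: {json_size} chars, base64 float32: {b64_size} chars")
```

Note that packing to float32 rounds each Python float (a 64-bit double) to 32-bit precision, which matches the float32 dtype expected by the parallel-array form.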

Parameters

Shape 1: List of item dicts

| Parameter | Type | Default | Description |
|---|---|---|---|
| items | List[Dict] | - | List of dictionaries, one per vector. |
Each dictionary can contain the following keys:
[
  {
    "id": str,                # Unique identifier for the vector (required)
    "vector": List[float],    # Vector data. Optional if the index has an embedding model and `contents` is provided.
    "contents": str | bytes,  # Optional content. Bytes are base64-encoded, strings are passed through. All contents are encrypted before storage.
    "metadata": Dict          # Optional key-value pairs for filtering and retrieval
  },
  ...
]
The contents field accepts both strings and bytes. Bytes are automatically base64-encoded before encryption; strings are passed as-is. Contents are returned in their original format (string or bytes) when retrieved with get().
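The string-versus-bytes rule described above can be sketched as follows. This is an illustration of the rule only, not the SDK's internal code: `encode_contents` and `decode_contents` are hypothetical names, and the `was_bytes` flag is an assumed way of remembering the original type so it can be restored on retrieval.

```python
import base64


def encode_contents(contents):
    """Return (payload, was_bytes).

    Bytes are base64-encoded to ASCII text; strings pass through unchanged.
    Hypothetical helper, shown only to illustrate the documented rule.
    """
    if isinstance(contents, bytes):
        return base64.b64encode(contents).decode("ascii"), True
    if isinstance(contents, str):
        return contents, False
    raise TypeError("contents must be str or bytes")


def decode_contents(payload, was_bytes):
    """Restore the original value: decode base64 only if it was bytes."""
    return base64.b64decode(payload) if was_bytes else payload


text_payload, text_flag = encode_contents("First doc body")
blob_payload, blob_flag = encode_contents(b"\x00\xff")

print(decode_contents(text_payload, text_flag))
print(decode_contents(blob_payload, blob_flag))
```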

Shape 2: Parallel arrays

| Parameter | Type | Default | Description |
|---|---|---|---|
| ids | List[str] | - | List of unique vector identifiers. |
| vectors | np.ndarray (shape (n, dim), dtype float32) | - | Vector data as a 2D numpy array. |
The two-arg form of upsert() does not accept metadata or contents keyword arguments. To attach metadata or contents alongside parallel arrays, call upsert_binary(ids, vectors, metadata=..., contents=...) directly.

Returns

None

Example Usage

Dictionary format

items = [
    {"id": "doc1", "vector": [0.1, 0.2, 0.3, 0.4]},
    {"id": "doc2", "vector": [0.5, 0.6, 0.7, 0.8], "metadata": {"category": "news"}},
]

index.upsert(items)

Parallel arrays (binary fast path)

import numpy as np

ids = ["vec1", "vec2", "vec3"]
vectors = np.random.rand(3, 128).astype(np.float32)

index.upsert(ids, vectors)

Parallel arrays with metadata / contents

When you need metadata or contents alongside parallel arrays, call upsert_binary directly:
import numpy as np

ids = ["vec1", "vec2", "vec3"]
vectors = np.random.rand(3, 128).astype(np.float32)
metadata = [
    {"category": "news"},
    {"category": "research"},
    None,
]
contents = ["First doc body", "Second doc body", None]

index.upsert_binary(ids, vectors, metadata=metadata, contents=contents)
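If your data already exists as a list of item dicts (Shape 1) and you want to send it through upsert_binary, the items need to be split into parallel lists first. The sketch below shows one way to do that; `split_items` is a hypothetical helper, not part of the SDK. It fills missing metadata or contents with None, mirroring the example above, and rejects items without a vector because the parallel-array form requires one per ID.

```python
def split_items(items):
    """Split item dicts into parallel lists: ids, vectors, metadata, contents.

    Hypothetical helper. Missing "metadata" or "contents" keys become None;
    an item without a "vector" raises ValueError.
    """
    ids, vectors, metadata, contents = [], [], [], []
    for item in items:
        if "vector" not in item:
            raise ValueError(f"item {item.get('id')!r} has no vector")
        ids.append(item["id"])
        vectors.append(item["vector"])
        metadata.append(item.get("metadata"))
        contents.append(item.get("contents"))
    return ids, vectors, metadata, contents


items = [
    {"id": "doc1", "vector": [0.1, 0.2, 0.3, 0.4]},
    {"id": "doc2", "vector": [0.5, 0.6, 0.7, 0.8], "metadata": {"category": "news"}},
]

ids, vectors, metadata, contents = split_items(items)
print(ids, metadata, contents)
```

The resulting `vectors` list of lists can then be converted with `np.asarray(vectors, dtype=np.float32)` and passed as `index.upsert_binary(ids, array, metadata=metadata, contents=contents)`, assuming the signature shown in the example above.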