# Upsert

Adds new vectors to the index or updates existing ones. The Python SDK exposes a single `upsert()` method that takes positional arguments in one of two shapes:

```python
# Shape 1: list of item dicts
index.upsert(items)

# Shape 2: parallel arrays — IDs + numpy array of vectors
index.upsert(ids, vectors)
```

<Tip>For batches large enough that JSON encoding becomes a bottleneck, call <a href="./upsert-binary">`upsert_binary`</a> directly. It sends vectors as base64-encoded binary and also accepts parallel `metadata` and `contents` lists. Shape 2 above is a thin wrapper that already forwards to `upsert_binary` under the hood, but it passes only `ids` and `vectors` (no metadata or contents).</Tip>
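
To see why a binary encoding helps at scale, the sketch below compares the size of the same batch encoded as JSON text and as base64-encoded `float32` bytes. It is an illustration only: the SDK's actual request format is not described on this page, and the batch size and dimension are arbitrary choices.

```python
import base64
import json

import numpy as np

# Arbitrary illustrative batch: 1000 vectors of dimension 128.
rng = np.random.default_rng(0)
vectors = rng.random((1000, 128), dtype=np.float32)

# JSON path: every float is written out as decimal text.
json_payload = json.dumps(vectors.tolist())

# Binary path: raw float32 bytes, base64-encoded so they can travel as text.
b64_payload = base64.b64encode(vectors.tobytes()).decode("ascii")

print(f"JSON:   {len(json_payload):,} characters")
print(f"base64: {len(b64_payload):,} characters")

# Decoding restores the exact float32 values.
decoded = np.frombuffer(base64.b64decode(b64_payload), dtype=np.float32)
decoded = decoded.reshape(vectors.shape)
assert np.array_equal(decoded, vectors)
```

Base64 adds roughly one third on top of the raw bytes, which is still far smaller than spelling each value out as decimal text.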

### Parameters

#### Shape 1: List of item dicts

| Parameter | Type         | Default | Description                           |
| --------- | ------------ | ------- | ------------------------------------- |
| `items`   | `List[Dict]` | -       | List of dictionaries, one per vector. |

Each dictionary can contain the following keys:

```python
[
  {
    "id": str,                # Unique identifier for the vector (required)
    "vector": List[float],    # Vector data. Optional if the index has an embedding model and `contents` is provided.
    "contents": str | bytes,  # Optional content. Bytes are base64-encoded, strings are passed through. All contents are encrypted before storage.
    "metadata": Dict          # Optional key-value pairs for filtering and retrieval
  },
  ...
]
```

<Tip>The `contents` field accepts both strings and bytes. Bytes are automatically base64-encoded before encryption; strings are passed as-is. Contents are returned in their original format (string or bytes) when retrieved with `get()`.</Tip>

#### Shape 2: Parallel arrays

| Parameter | Type                                             | Default | Description                        |
| --------- | ------------------------------------------------ | ------- | ---------------------------------- |
| `ids`     | `List[str]`                                      | -       | List of unique vector identifiers. |
| `vectors` | `np.ndarray` (shape `(n, dim)`, dtype `float32`) | -       | Vector data as a 2D numpy array.   |

<Warning>The two-arg form of `upsert()` does **not** accept `metadata` or `contents` keyword arguments. To attach metadata or contents alongside parallel arrays, call <a href="./upsert-binary">`upsert_binary(ids, vectors, metadata=..., contents=...)`</a> directly.</Warning>

### Returns

`None`

### Example Usage

#### Dictionary format

```python
items = [
    {"id": "doc1", "vector": [0.1, 0.2, 0.3, 0.4]},
    {"id": "doc2", "vector": [0.5, 0.6, 0.7, 0.8], "metadata": {"category": "news"}},
]

index.upsert(items)
```
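
If your data starts out as separate lists rather than item dicts, a small helper can assemble the Shape 1 payload. The sketch below is illustrative: `build_items` is a hypothetical helper, not part of the SDK, and it simply omits optional keys whose value is `None`.

```python
from typing import Any, Dict, List, Optional, Sequence, Union


def build_items(
    ids: Sequence[str],
    vectors: Sequence[Sequence[float]],
    metadata: Optional[Sequence[Optional[Dict[str, Any]]]] = None,
    contents: Optional[Sequence[Optional[Union[str, bytes]]]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: zip parallel lists into Shape 1 item dicts."""
    n = len(ids)
    if len(vectors) != n:
        raise ValueError("ids and vectors must have the same length")
    for name, seq in (("metadata", metadata), ("contents", contents)):
        if seq is not None and len(seq) != n:
            raise ValueError(f"{name} must have one entry per id")

    items: List[Dict[str, Any]] = []
    for i, item_id in enumerate(ids):
        item: Dict[str, Any] = {
            "id": item_id,
            "vector": [float(x) for x in vectors[i]],
        }
        # Only include optional keys when a value was supplied for this item.
        if metadata is not None and metadata[i] is not None:
            item["metadata"] = metadata[i]
        if contents is not None and contents[i] is not None:
            item["contents"] = contents[i]
        items.append(item)
    return items


items = build_items(
    ["doc1", "doc2"],
    [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]],
    metadata=[None, {"category": "news"}],
)
# index.upsert(items)  # `index` is an existing index handle, as above
```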

#### Parallel arrays (binary fast path)

```python
import numpy as np

ids = ["vec1", "vec2", "vec3"]
vectors = np.random.rand(3, 128).astype(np.float32)

index.upsert(ids, vectors)
```
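
The shape and dtype requirements from the Shape 2 parameters table can be checked before the call. The following is a minimal sketch; `check_parallel_arrays` is a hypothetical helper, and the checks reflect what the table asks for rather than documented SDK validation behavior.

```python
import numpy as np


def check_parallel_arrays(ids, vectors):
    """Hypothetical pre-flight check for the two-argument form of upsert()."""
    arr = np.asarray(vectors)
    if arr.ndim != 2:
        raise ValueError(f"vectors must be 2D (n, dim), got shape {arr.shape}")
    if len(ids) != arr.shape[0]:
        raise ValueError(f"{len(ids)} ids but {arr.shape[0]} vectors")
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique within a batch")
    # np.random.rand returns float64, so convert explicitly to float32.
    return np.ascontiguousarray(arr, dtype=np.float32)


ids = ["vec1", "vec2", "vec3"]
vectors = check_parallel_arrays(ids, np.random.rand(3, 128))
# index.upsert(ids, vectors)  # `index` is an existing index handle
```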

#### Parallel arrays with metadata / contents

When you need metadata or contents alongside parallel arrays, call `upsert_binary` directly:

```python
import numpy as np

ids = ["vec1", "vec2", "vec3"]
vectors = np.random.rand(3, 128).astype(np.float32)
metadata = [
    {"category": "news"},
    {"category": "research"},
    None,
]
contents = ["First doc body", "Second doc body", None]

index.upsert_binary(ids, vectors, metadata=metadata, contents=contents)
```
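
If you already have Shape 1 item dicts and want to use the binary path, they can be split into parallel arrays first. This is a sketch under two assumptions: `split_items` is a hypothetical helper (not an SDK function), and every item carries a `vector`. Missing metadata or contents become `None`, matching the placeholders in the example above.

```python
import numpy as np


def split_items(items):
    """Hypothetical helper: turn Shape 1 item dicts into parallel arrays."""
    ids = [item["id"] for item in items]
    vectors = np.asarray([item["vector"] for item in items], dtype=np.float32)
    # Use None as the placeholder for items without metadata or contents.
    metadata = [item.get("metadata") for item in items]
    contents = [item.get("contents") for item in items]
    return ids, vectors, metadata, contents


items = [
    {
        "id": "vec1",
        "vector": [0.1, 0.2, 0.3, 0.4],
        "metadata": {"category": "news"},
        "contents": "First doc body",
    },
    {"id": "vec2", "vector": [0.5, 0.6, 0.7, 0.8]},
]
ids, vectors, metadata, contents = split_items(items)
# index.upsert_binary(ids, vectors, metadata=metadata, contents=contents)
```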
