# Upsert

Adds or updates vector embeddings in the index. Accepts a list of dictionaries, where each dictionary represents a vector with its ID.

```python theme={null}
def upsert(self,
           items: List[Dict[str, Any]],
           *,
           index_key: bytes = None,
           user_id: bytes = None)
```

### Parameters

| Parameter   | Type                   | Description                                                                                                                                                   |
| ----------- | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `items`     | `List[Dict[str, Any]]` | A list of dictionaries, where each dictionary has the format `{"id": str, "vector": List[float], "contents": Union[bytes, str], "metadata": Dict[str, Any]}`. |
| `index_key` | `bytes`                | *(Optional, keyword-only)* Override the per-operation index key. See [Per-operation key override](#per-operation-key-override).                               |
| `user_id`   | `bytes`                | *(Optional, keyword-only)* 16-byte RBAC user identifier. See [Per-operation key override](#per-operation-key-override).                                       |

Where each `item` dictionary has the following fields:

| Parameter  | Type             | Description                                                                                                                                                                |
| ---------- | ---------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `id`       | `str`            | Unique string identifier for the item.                                                                                                                                     |
| `vector`   | `List[float]`    | Embedding vector as a list of floats (*optional* if the index is configured to compute embeddings).                                                                        |
| `contents` | `bytes` or `str` | Item contents (*optional*). Accepts `str` or `bytes`; encoded to bytes and encrypted before storage. Must be a `str` if the index is configured to compute embeddings.     |
| `metadata` | `Dict[str, Any]` | Dictionary of key-value metadata pairs associated with the vector (*optional*).                                                                                            |

<Tip>The `contents` field accepts strings or bytes. All contents are encoded to bytes and encrypted before storage, and will be returned as bytes when retrieved with `get()`.</Tip>

<Note>If embedding auto-generation is enabled (by setting the `embedding_model` parameter in [`create_index()`](../client/create-index)), the `vector` field is optional.
If `vector` is provided, it is used as-is (its dimensionality must match that of the `embedding_model`).
If `vector` is not provided, an embedding is auto-generated from `contents` using `sentence-transformers`; `contents` must be text in this case.</Note>

<Tip>For more info on metadata, see [Metadata Filtering](../../guides/data-operations/metadata-filtering).</Tip>
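
The field rules above can be checked on the client before calling `upsert()`. The following is a minimal sketch, not part of the CyborgDB API: `check_items`, `dimension`, and `auto_embed` are hypothetical names introduced for illustration, and the checks only mirror the table and note above.

```python
from typing import Any, Dict, List, Optional


def check_items(items: List[Dict[str, Any]],
                dimension: Optional[int] = None,
                auto_embed: bool = False) -> None:
    """Hypothetical pre-flight check; not part of the CyborgDB API.

    Raises ValueError if an item does not match the documented fields.
    """
    for pos, item in enumerate(items):
        # `id` is the only field that is always required.
        if not isinstance(item.get("id"), str):
            raise ValueError(f"item {pos}: 'id' must be a str")

        vector = item.get("vector")
        contents = item.get("contents")

        if vector is None:
            # Without a vector, the index must embed `contents`, which must be text.
            if not auto_embed:
                raise ValueError(f"item {pos}: 'vector' is required")
            if not isinstance(contents, str):
                raise ValueError(f"item {pos}: 'contents' must be a str to be embedded")
        elif dimension is not None and len(vector) != dimension:
            raise ValueError(
                f"item {pos}: expected {dimension} dimensions, got {len(vector)}"
            )

        if contents is not None and not isinstance(contents, (str, bytes)):
            raise ValueError(f"item {pos}: 'contents' must be str or bytes")

        if not isinstance(item.get("metadata", {}), dict):
            raise ValueError(f"item {pos}: 'metadata' must be a dict")


items = [
    {"id": "item_1", "vector": [0.1, 0.2, 0.3], "metadata": {"type": "dog"}},
    {"id": "item_2", "contents": "text to embed"},  # no vector: needs auto_embed=True
]
check_items(items, dimension=3, auto_embed=True)  # passes without raising
```

Checking locally is optional; the index performs its own validation, and this sketch only surfaces obvious mistakes earlier.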

### Exceptions

<AccordionGroup>
  <Accordion title="ValueError">
    * Raised if the vector dimensions are incompatible with the index configuration.
    * Raised if the index was not created or loaded yet.
  </Accordion>

  <Accordion title="RuntimeError">
    * Raised if the vectors could not be upserted.
  </Accordion>
</AccordionGroup>
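
A caller can handle the two documented exception types separately. The sketch below assumes only that `upsert()` raises `ValueError` and `RuntimeError` as listed above; `try_upsert` and `_StubIndex` are hypothetical names, and the stub stands in for a loaded index so the example runs without a server.

```python
from typing import Any, Dict, List


def try_upsert(index: Any, items: List[Dict[str, Any]]) -> bool:
    """Return True on success, False if `upsert()` raised a documented exception."""
    try:
        index.upsert(items)
        return True
    except ValueError as err:
        # Dimension mismatch, or the index was not created/loaded.
        print(f"invalid upsert request: {err}")
    except RuntimeError as err:
        # The upsert itself failed.
        print(f"upsert failed: {err}")
    return False


class _StubIndex:
    """Hypothetical stand-in for a loaded index (not the real class)."""

    def __init__(self, dimension: int) -> None:
        self.dimension = dimension
        self.stored: Dict[str, List[float]] = {}

    def upsert(self, items: List[Dict[str, Any]]) -> None:
        for item in items:
            if len(item["vector"]) != self.dimension:
                raise ValueError("vector dimensions do not match the index")
            self.stored[item["id"]] = item["vector"]


stub = _StubIndex(dimension=3)
ok = try_upsert(stub, [{"id": "item_1", "vector": [0.1, 0.2, 0.3]}])
bad = try_upsert(stub, [{"id": "item_2", "vector": [0.1, 0.2]}])  # wrong dimension
```

With a real index, pass the object returned by `load_index()` in place of the stub.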

### Example Usage

```python theme={null}
# Initial configuration already done...

# Load Index
index = client.load_index(
    index_name=index_name,
    index_key=index_key
)

# Single upsert (wrapped in a list)
index.upsert([{"id": "item_1", "vector": [0.1, 0.2, 0.3]}])

# Upsert with metadata
index.upsert([
    {
        "id": "item_1",
        "vector": [0.1, 0.2, 0.3],
        "metadata": {"type": "dog", "temperament": "good boy"}
    }
])

# Batch upsert
index.upsert([
    {"id": "item_1", "vector": [0.1, 0.2, 0.3]},
    {"id": "item_2", "vector": [0.4, 0.5, 0.6]}
])

# Upsert with contents
index.upsert([
    {"id": "item_1", "vector": [0.1, 0.1, 0.1], "contents": b'item_contents_here...'},  # Bytes
    {"id": "item_2", "vector": [0.2, 0.2, 0.2], "contents": "item_contents_here..."}  # Text
])
```

### Per-operation key override

The calls above reuse the key supplied to [`create_index()`](../client/create-index) / [`load_index()`](../client/load-index). You may instead pass `index_key=` (and `user_id=` for an [RBAC user](./manage-users)) to supply the key for a single operation. This is required in stateless/service deployments that reload the index per request:

```python theme={null}
# Stateless / service style: supply the key per operation
index.upsert(
    [{"id": "item_1", "vector": [0.1, 0.2, 0.3]}],
    index_key=index_key,
)
```

***

## Upsert Secondary Overload: NumPy Array Format

<Tip>This format is optimal for large batches due to its memory efficiency and compatibility with batch processing optimizations.
It is primarily intended for benchmarking, which is why `ids` may also be an integer-dtype array.</Tip>

```python theme={null}
def upsert(self,
           ids: Union[list[str], np.ndarray],
           vectors: np.ndarray,
           *,
           index_key: bytes = None,
           user_id: bytes = None)
```

Accepts a list of string IDs (or a NumPy array) and a NumPy array of vectors:

* `ids`: a list of strings or a 1D array of unique identifiers.
* `vectors`: a 2D array of float32 values holding the vector embeddings.

This structure is suited for efficient handling of large batches, with type safety for IDs and embeddings.

### Parameters

| Parameter | Type                        | Description                                                                                                                                                             |
| --------- | --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `ids`     | `list[str]` or `np.ndarray` | A list of strings or 1D NumPy array of shape `(num_items,)` containing unique identifiers for each vector. Can be string or integer dtype. Length must match `vectors`. |
| `vectors` | `np.ndarray`                | 2D NumPy array of shape `(num_items, vector_dim)` with `dtype=float32`, representing vector embeddings.                                                                 |
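
Data held in the dictionary format can be converted to this layout with plain NumPy. This is a minimal sketch: `to_arrays` is a hypothetical helper, not part of the CyborgDB API, and it assumes every item carries a `vector` of the same length.

```python
from typing import Any, Dict, List, Tuple

import numpy as np


def to_arrays(items: List[Dict[str, Any]]) -> Tuple[List[str], np.ndarray]:
    """Hypothetical helper: split item dicts into `ids` and a float32 `vectors` array."""
    ids = [item["id"] for item in items]
    # float32 matches the dtype documented for `vectors`.
    vectors = np.asarray([item["vector"] for item in items], dtype=np.float32)
    if vectors.ndim != 2:
        raise ValueError("vectors must form a 2D array of shape (num_items, vector_dim)")
    return ids, vectors


items = [
    {"id": "item_101", "vector": [0.1, 0.2, 0.3]},
    {"id": "item_102", "vector": [0.4, 0.5, 0.6]},
]
ids, vectors = to_arrays(items)
# ids and vectors can now be passed as: index.upsert(ids, vectors)
```

Any `contents` or `metadata` in the dictionaries is dropped by this conversion, since this overload takes only IDs and vectors.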

### Exceptions

<AccordionGroup>
  <Accordion title="ValueError">
    * Raised if the vector dimensions are incompatible with the index configuration.
    * Raised if the index was not created or loaded yet.
  </Accordion>

  <Accordion title="RuntimeError">
    * Raised if the vectors could not be upserted.
  </Accordion>
</AccordionGroup>

### Example Usage

```python theme={null}
import numpy as np

# Load index
index = client.load_index(index_name=index_name, index_key=index_key)

# NumPy-based upsert with two arrays
vectors = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], dtype=np.float32)  # 2 vectors of dimension 3
ids = np.array(["item_101", "item_102"])  # Unique string IDs for each vector

index.upsert(ids, vectors)
```
