> ## Documentation Index
> Fetch the complete documentation index at: https://docs.cyborg.co/llms.txt
> Use this file to discover all available pages before exploring further.

# Upsert

Adds or updates vector embeddings in the index. Accepts a list of dictionaries, where each dictionary represents a vector with its ID.

```python theme={null}
def upsert(self, items: List[Dict[str, Any]])
```

### Parameters

| Parameter | Type                   | Description                                                                                                                                                   |
| --------- | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `items`   | `List[Dict[str, Any]]` | A list of dictionaries, where each dictionary has the format `{"id": str, "vector": List[float], "contents": Union[bytes, str], "metadata": Dict[str, str]}`. |

Where each `item` dictionary has the following fields:

| Parameter  | Type                      | Description                                                                                          |
| ---------- | ------------------------- | ---------------------------------------------------------------------------------------------------- |
| `id`       | `str`                     | Unique string identifier for the item.                                                               |
| `vector`   | `List[float]`             | Embedding vector as a list of floats (*optional* if index is configured to compute embeddings).      |
| `contents` | `bytes`, `str` or `Image` | Item contents (*optional*). Must be a `str` or `Image` if index is configured to compute embeddings. |
| `metadata` | `Dict[str, any]`          | Dictionary of key-value metadata pairs associated with the vector (*optional*).                      |

<Note>If embedding auto-generation is enabled (by setting the `embedding_model` parameter in [`create_index()`](../client/create-index)), then the `vector` parameter is optional.
If `vector` provided, then it will be used (its dimensionality must match that of the `embedding_model`).
If `vector` is not provided, a vector embedding will be auto-generated from `contents` using `sentence-transformers`. `contents` must be text in this case.</Note>

<Tip>For more info on metadata, see [Metadata Filtering](../../../guides/data-operations/metadata-filtering).</Tip>

### Exceptions

<AccordionGroup>
  <Accordion title="ValueError">
    * Throws if the vector dimensions are incompatible with the index configuration.
    * Throws if the index was not created or loaded yet.
  </Accordion>

  <Accordion title="RuntimeError">
    * Throws if the vectors could not be upserted.
  </Accordion>
</AccordionGroup>

### Example Usage

```python theme={null}
# Initial configuration already done...

# Load Index
index = client.load_index(index_name=index_name, index_key=index_key)

# Single upsert (wrapped in a list)
index.upsert([{"id": "item_1", "vector": [0.1, 0.2, 0.3]}])

# Upsert with metadata
index.upsert([
  {"id": "item_1",
   "vector": [0.1, 0.2, 0.3],
   "metadata": {"type": "dog", "temperament": "good boy"}
  }
])

# Batch upsert
index.upsert([
    {"id": "item_1", "vector": [0.1, 0.2, 0.3]},
    {"id": "item_2", "vector": [0.4, 0.5, 0.6]}
])

# Upsert with items
index.upsert([
    {"id": "item_1", "vector": [0.1, 0.1, 0.1, 0.1], "contents": b'item_contents_here...'}, # Bytes
    {"id": "item_2", "vector": [0.2, 0.2, 0.2, 0.2], "contents": "item_contents_here..."} # Text
])
```

<Tip>You can pass the `id` and `vector` fields as tuples, if you wish to skip the dictionary. On big datasets, this can make a signficant difference in memory usage.</Tip>

***

## Upsert Secondary Overload: NumPy Array Format

<Tip>This format is optimal for large batches due to its memory efficiency and compatibility with batch processing optimizations.
It is primarily intended for benchmarking purposes (hence the use of `int` for IDs).</Tip>

```python theme={null}
def upsert(self, ids: np.ndarray, vectors: np.ndarray)
```

Accepts two NumPy arrays:

* A 2D array of floats for the vector embeddings.
* A 1D array of integers for the unique IDs.

This structure is suited for efficient handling of large batches, with type safety for IDs and embeddings.

### Parameters

| Parameter | Type         | Description                                                                                                                                  |
| --------- | ------------ | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `ids`     | `np.ndarray` | 1D NumPy array of shape `(num_items,)` with `dtype=int`, containing unique integer identifiers for each vector. Length must match `vectors`. |
| `vectors` | `np.ndarray` | 2D NumPy array of shape `(num_items, vector_dim)` with `dtype=float`, representing vector embeddings.                                        |

### Exceptions

<AccordionGroup>
  <Accordion title="ValueError">
    * Throws if the vector dimensions are incompatible with the index configuration.
    * Throws if the index was not created or loaded yet.
  </Accordion>

  <Accordion title="RuntimeError">
    * Throws if the vectors could not be upserted.
  </Accordion>
</AccordionGroup>

### Example Usage

```python theme={null}
import numpy as np

# Load index
index = client.load_index(index_name=index_name, index_key=index_key)

# NumPy-based upsert with two arrays
vectors = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], dtype=float)  # 2 vectors of dimension 3
ids = np.array([101, 102], dtype=int)  # Unique integer IDs for each vector

index.upsert(vectors=vectors, ids=ids)
```
