# Train Index

Train an index to enable efficient querying. Training is required for optimal query performance.

## Authentication

Required. Pass your API key in the `X-API-Key` header:

```http
X-API-Key: cyborg_your_api_key_here
```

You can get an API key from the [CyborgDB Admin Dashboard](https://cyborgdb.co). For more info, follow [this guide](../../../intro/get-api-key).

## Request Body

```json
{
  "index_name": "my_index",
  "index_key": "64_character_hex_string_representing_32_bytes",
  "n_lists": null,
  "batch_size": 2048,
  "max_iters": 100,
  "tolerance": 0.000001,
  "max_memory": 0
}
```

<Expandable title="parameters">
  <ParamField body="index_name" type="string" required="true">
    Name of the index to train
  </ParamField>

  <ParamField body="index_key" type="string">
    32-byte encryption key as a hex string. Required for indexes created with the SDK-supplied KEK path; omit for KMS-backed indexes (the service resolves the key via the stored KMSBlob).
  </ParamField>

  <ParamField body="n_lists" type="integer" default="null (auto)">
    Number of inverted lists to create. When `null` or omitted, the service automatically determines a value based on dataset size.
  </ParamField>

  <ParamField body="batch_size" type="integer" default={2048}>
    Size of each training batch
  </ParamField>

  <ParamField body="max_iters" type="integer" default={100}>
    Maximum training iterations
  </ParamField>

  <ParamField body="tolerance" type="number" default={1e-6}>
    Convergence tolerance
  </ParamField>

  <ParamField body="max_memory" type="integer" default={0}>
    Maximum memory usage in MB (0 = no limit)
  </ParamField>
</Expandable>
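
The parameters above can be assembled and sanity-checked on the client before sending. The following is a minimal sketch, not part of any CyborgDB SDK: `build_train_body` is a hypothetical helper, and the client-side hex check (including accepting uppercase digits) is an assumption about what the service accepts. Defaults mirror the values documented above.

```python
import json
import re
from typing import Optional

# Hypothetical client-side check: 64 hex characters encode 32 bytes.
_HEX_KEY = re.compile(r"[0-9a-fA-F]{64}")


def build_train_body(
    index_name: str,
    index_key: Optional[str] = None,
    n_lists: Optional[int] = None,
    batch_size: int = 2048,
    max_iters: int = 100,
    tolerance: float = 1e-6,
    max_memory: int = 0,
) -> str:
    """Return a JSON request body using the documented field names."""
    if not index_name:
        raise ValueError("index_name is required")
    if index_key is not None and not _HEX_KEY.fullmatch(index_key):
        raise ValueError("index_key must be 64 hex characters (32 bytes)")

    body = {
        "index_name": index_name,
        "n_lists": n_lists,  # None serializes to null (automatic selection)
        "batch_size": batch_size,
        "max_iters": max_iters,
        "tolerance": tolerance,
        "max_memory": max_memory,  # in MB, 0 means no limit
    }
    # Omit index_key entirely for KMS-backed indexes.
    if index_key is not None:
        body["index_key"] = index_key
    return json.dumps(body)
```

Calling `build_train_body("my_index")` produces a body without `index_key`, which matches the KMS-backed case described above.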

## Response

**Training Completed:**

```json
{
  "status": "success",
  "message": "Index 'my_index' trained successfully"
}
```

**Training Failed:**

```json
{
  "status": "error",
  "message": "Training failed for 'my_index': internal error during training"
}
```

<Note>
  This endpoint blocks until training completes. The `status` field in the response is `"success"` when training finishes normally, or `"error"` if training fails.
</Note>
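
Because the outcome is reported in the `status` field of the body, a client may want to turn an `"error"` status into an exception. The sketch below assumes only the two response shapes shown above; `TrainingError` and `check_train_response` are hypothetical names introduced for illustration.

```python
import json


class TrainingError(Exception):
    """Hypothetical exception for a body whose status is "error"."""


def check_train_response(raw_body: str) -> str:
    """Return the message on success, raise TrainingError on failure."""
    payload = json.loads(raw_body)
    status = payload.get("status")
    message = payload.get("message", "")
    if status == "success":
        return message
    if status == "error":
        raise TrainingError(message)
    # Any other value is outside the two shapes documented above.
    raise ValueError(f"unexpected status: {status!r}")
```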

## Exceptions

* `401`: Authentication failed (invalid API key) **or** wrong `index_key` on SDK-supplied indexes — see [error model](../introduction#error-model-api-keys-index-keys-and-kms)
* `404`: Index not found
* `422`: Invalid request parameters or insufficient vectors
* `500`: Internal server error
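
One way a client could surface these status codes is to map each to a Python exception. This is an illustrative sketch: the choice of exception types and the helper name `raise_for_train_status` are assumptions, and the descriptions are copied from the list above.

```python
# Status code -> (exception type, meaning). The exception types are an
# illustrative choice, not something the service prescribes.
_TRAIN_ERRORS = {
    401: (PermissionError, "Authentication failed or wrong index_key"),
    404: (LookupError, "Index not found"),
    422: (ValueError, "Invalid request parameters or insufficient vectors"),
    500: (RuntimeError, "Internal server error"),
}


def raise_for_train_status(status_code: int) -> None:
    """Raise an exception for the documented error codes; return on 2xx."""
    if 200 <= status_code < 300:
        return
    exc_type, meaning = _TRAIN_ERRORS.get(
        status_code, (RuntimeError, "Unexpected HTTP status")
    )
    raise exc_type(f"{status_code}: {meaning}")
```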

## Example Usage

**Basic Training:**

```bash
curl -X POST "http://localhost:8000/v1/indexes/train" \
     -H "X-API-Key: cyborg_your_api_key_here" \
     -H "Content-Type: application/json" \
     -d '{
       "index_name": "my_index",
       "index_key": "your_64_character_hex_key_here"
     }'
```

**Custom Training Parameters:**

```bash
curl -X POST "http://localhost:8000/v1/indexes/train" \
     -H "X-API-Key: cyborg_your_api_key_here" \
     -H "Content-Type: application/json" \
     -d '{
       "index_name": "my_index",
       "index_key": "your_64_character_hex_key_here",
       "batch_size": 1024,
       "max_iters": 50,
       "tolerance": 0.0001
     }'
```
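
The same request can be prepared from Python with the standard library. This sketch only builds the request object and does not send it; `make_train_request` is a hypothetical helper, and the URL, API key, and index key are the placeholders used in the curl examples above.

```python
import json
import urllib.request


def make_train_request(base_url, api_key, index_name, index_key, **params):
    """Build (but do not send) a POST request for the train endpoint."""
    body = {"index_name": index_name, "index_key": index_key, **params}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/indexes/train",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = make_train_request(
    "http://localhost:8000",
    "cyborg_your_api_key_here",
    "my_index",
    "your_64_character_hex_key_here",
    batch_size=1024,
    max_iters=50,
    tolerance=0.0001,
)
# To send it, pass the object to urllib.request.urlopen(request).
# The call blocks until training completes, so choose any timeout generously.
```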

<Tip>There must be at least `2 * n_lists` vector embeddings in the index prior to training.</Tip>

## Training Requirements

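* **Minimum vectors**: At least `2 * n_lists` vectors must be present in the index; otherwise the request fails with `422`
* **Memory**: Training may require significant memory depending on `batch_size` and dataset size; set `max_memory` (in MB) to cap usage
* **Time**: Training duration varies with dataset size and the convergence parameters (`max_iters`, `tolerance`); because the endpoint blocks until training completes, set client timeouts accordingly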
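
The minimum-vector rule is simple arithmetic that can be checked before calling the endpoint. The helpers below are hypothetical and apply only when `n_lists` is set explicitly; when it is left as `null`, the service chooses the value, so the client cannot compute the threshold in advance.

```python
def min_vectors_for_training(n_lists: int) -> int:
    """Smallest vector count that satisfies the 2 * n_lists rule."""
    if n_lists < 1:
        raise ValueError("n_lists must be a positive integer")
    return 2 * n_lists


def can_train(vector_count: int, n_lists: int) -> bool:
    """True if the index holds enough vectors for the requested n_lists."""
    return vector_count >= min_vectors_for_training(n_lists)


def max_n_lists_for(vector_count: int) -> int:
    """Largest n_lists the rule allows for a given vector count."""
    return vector_count // 2
```

For example, an index holding 1,000 vectors satisfies the rule for any `n_lists` up to 500.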

## Use Cases

* **Performance optimization**: Enable fast approximate nearest neighbor search
* **Production preparation**: Train indexes before deploying to production
* **Batch processing**: Train indexes after bulk data ingestion
* **Index maintenance**: Retrain indexes after significant data updates
