Trains an index for efficient querying. Training is required before the index can deliver optimal query performance.
Authentication
Required. Pass the API key in the X-API-Key header:
X-API-Key: cyborg_your_api_key_here
You can get an API key from the CyborgDB Admin Dashboard. For more info, follow this guide.
Request Body
{
  "index_name": "my_index",
  "index_key": "64_character_hex_string_representing_32_bytes",
  "n_lists": null,
  "batch_size": 2048,
  "max_iters": 100,
  "tolerance": 0.000001,
  "max_memory": 0
}
index_name: Name of the index to train
index_key: 32-byte encryption key as a hex string (64 characters)
n_lists (integer, default: null, meaning auto): Number of inverted lists to create. When null or omitted, the service automatically determines a suitable value based on dataset size
batch_size: Size of each training batch
max_iters: Maximum number of training iterations
max_memory: Maximum memory usage in MB (0 = no limit)
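As a minimal sketch of how a client might assemble this request body, the Python helper below checks the documented key format (64 hex characters for 32 bytes) and leaves n_lists out when no value is given. The function name build_train_payload and its validation rules are illustrative assumptions, not part of any CyborgDB SDK.

```python
import json
import string


def build_train_payload(index_name, index_key, n_lists=None, **training_options):
    """Build the JSON body for a train request (hypothetical helper, not an SDK call)."""
    # The key is documented as 32 bytes encoded as hex, i.e. 64 hex characters.
    if len(index_key) != 64 or any(c not in string.hexdigits for c in index_key):
        raise ValueError("index_key must be a 64-character hex string")

    payload = {"index_name": index_name, "index_key": index_key}

    # Omitting n_lists (or sending null) lets the service choose a value.
    if n_lists is not None:
        payload["n_lists"] = n_lists

    # Only pass through the optional fields shown in the request body above.
    allowed = {"batch_size", "max_iters", "tolerance", "max_memory"}
    unknown = set(training_options) - allowed
    if unknown:
        raise ValueError(f"unknown training options: {sorted(unknown)}")
    payload.update(training_options)
    return payload


# "ab" * 32 is a placeholder key used only to satisfy the format check.
example = build_train_payload("my_index", "ab" * 32, batch_size=1024, max_iters=50)
print(json.dumps(example, indent=2))
```

Validating the key on the client side is a design choice made here so that a malformed key fails before any request is sent.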
Response
Training Completed:
{
  "status": "success",
  "message": "Index 'my_index' trained successfully"
}
Training Already In Progress:
{
  "status": "in_progress",
  "message": "Index 'my_index' is already being trained"
}
If an index is already being trained (either manually or via an automatic training trigger), the API returns an in_progress status instead of starting a new training session.
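A client will usually want to tell the two response shapes apart. The sketch below parses a response body and reports whether training finished or was already running. The function name training_finished is a hypothetical helper, and treating any other status value as an error is an assumption rather than documented behavior.

```python
import json


def training_finished(response_body: str) -> bool:
    """Return True if training completed, False if it was already in progress."""
    data = json.loads(response_body)
    status = data.get("status")
    if status == "success":
        return True
    if status == "in_progress":
        # Another training session is running; no new session was started.
        return False
    # Assumption: any other status is unexpected and should surface loudly.
    raise ValueError(f"unexpected status: {status!r}")


done = training_finished(
    '{"status": "success", "message": "Index \'my_index\' trained successfully"}'
)
busy = training_finished(
    '{"status": "in_progress", "message": "Index \'my_index\' is already being trained"}'
)
print(done, busy)
```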
Exceptions
401: Authentication failed (invalid API key)
404: Index not found
422: Invalid request parameters or insufficient vectors
500: Internal server error
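One way to surface these status codes in client code is to map them to readable errors. The following is a sketch only: the TrainRequestError class and raise_for_train_status function are hypothetical names, and the messages simply restate the list above.

```python
# Meanings copied from the documented status codes for this endpoint.
ERROR_MEANINGS = {
    401: "Authentication failed (invalid API key)",
    404: "Index not found",
    422: "Invalid request parameters or insufficient vectors",
    500: "Internal server error",
}


class TrainRequestError(Exception):
    """Raised when the train endpoint answers with a non-2xx status (hypothetical)."""

    def __init__(self, status_code: int, meaning: str):
        super().__init__(f"{status_code}: {meaning}")
        self.status_code = status_code
        self.meaning = meaning


def raise_for_train_status(status_code: int) -> None:
    """Do nothing for 2xx codes; raise TrainRequestError otherwise."""
    if 200 <= status_code < 300:
        return
    meaning = ERROR_MEANINGS.get(status_code, "Unexpected HTTP status")
    raise TrainRequestError(status_code, meaning)


try:
    raise_for_train_status(422)
except TrainRequestError as err:
    print(err)
```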
Example Usage
Basic Training:
curl -X POST "http://localhost:8000/v1/indexes/train" \
-H "X-API-Key: cyborg_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"index_name": "my_index",
"index_key": "your_64_character_hex_key_here"
}'
Custom Training Parameters:
curl -X POST "http://localhost:8000/v1/indexes/train" \
-H "X-API-Key: cyborg_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"index_name": "my_index",
"index_key": "your_64_character_hex_key_here",
"batch_size": 1024,
"max_iters": 50,
"tolerance": 0.0001
}'
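For callers who prefer Python over curl, the same request can be sketched with the standard library's urllib. The helper name make_train_request is an assumption made for illustration. The URL, header names, and placeholder values are taken from the curl examples above, and the actual send is left commented out because it needs a running service.

```python
import json
import urllib.request


def make_train_request(base_url, api_key, payload):
    """Build (but do not send) a POST request to the train endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/indexes/train",
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


request = make_train_request(
    "http://localhost:8000",
    "cyborg_your_api_key_here",
    {"index_name": "my_index", "index_key": "your_64_character_hex_key_here"},
)
print(request.get_method(), request.full_url)

# To send it against a running service:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so the status codes listed under Exceptions would arrive as exceptions rather than return values.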
The index must contain at least 2 × n_lists vector embeddings before training.
Training Requirements
Minimum vectors: At least 2 × n_lists vectors must be present in the index
Memory: Training may require significant memory depending on batch_size and dataset size
Time: Training duration varies based on dataset size and convergence parameters
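The minimum-vector rule is simple arithmetic, and a client could check it before calling the endpoint to avoid a 422 response. The sketch below assumes the caller already knows the vector count and passes an explicit n_lists. Both function names are hypothetical, and the check does not apply when n_lists is left as null, since the service then chooses the value.

```python
def min_vectors_for_training(n_lists: int) -> int:
    """Minimum number of vectors required for a given n_lists (2 x n_lists)."""
    return 2 * n_lists


def can_train(vector_count: int, n_lists: int) -> bool:
    """True when the index holds enough vectors to train with this n_lists."""
    return vector_count >= min_vectors_for_training(n_lists)


print(min_vectors_for_training(1024))  # 2048
print(can_train(1500, 1024))           # False
print(can_train(2048, 1024))           # True
```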
Use Cases
Performance optimization: Enable fast approximate nearest neighbor search
Production preparation: Train indexes before deploying to production
Batch processing: Train indexes after bulk data ingestion
Index maintenance: Retrain indexes after significant data updates