Train - CyborgDB Docs

Trains the encrypted index to optimize it for efficient similarity search queries. Training is essential for IVF-based indexes to achieve optimal query performance and accuracy.

In CyborgDB Service v0.17, training is triggered automatically on the server once upserts cross the configured RETRAIN_THRESHOLD. upsert() returns None whether or not training starts, so poll is_training() to observe it. Calling train() explicitly forces immediate clustering and is useful when you want to block until the index is ready (for example, before benchmarking queries). Auto-training can be disabled on the service side with the AUTO_TRAIN_DISABLED setting.
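
Because upsert() gives no signal about auto-triggered training, a small polling loop is one way to wait for it. The following is a minimal sketch, assuming is_training() returns a boolean; wait_for_training, poll_interval, and timeout are hypothetical names introduced here for illustration and are not part of the SDK.

```python
import time


def wait_for_training(index, poll_interval=1.0, timeout=600.0):
    """Poll index.is_training() until it reports False or the timeout elapses.

    Assumption: is_training() returns True while training is in progress.
    Returns True if training finished, False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout
    while index.is_training():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

A caller might run wait_for_training(index) after a large batch of upserts and only start issuing queries once it returns True.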

index.train(
    n_lists=None,
    batch_size=None,
    max_iters=None,
    tolerance=None
)

Parameters

Parameter	Type	Default	Description
`n_lists`	`int`	`None`	(Optional) Number of inverted lists to use for the index. When `None`, the server auto-selects a value based on the dataset size
`batch_size`	`int`	`None`	(Optional) Number of vectors to process per training batch. When `None`, the server uses 2048
`max_iters`	`int`	`None`	(Optional) Maximum number of training iterations. When `None`, the server uses 100
`tolerance`	`float`	`None`	(Optional) Convergence tolerance for training completion. When `None`, the server uses 1e-6
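
To make the `None` handling in the table concrete, the sketch below mirrors it on the client side: any parameter left as `None` is shown with the server default listed above. The helper effective_training_params is hypothetical and only illustrates the documented defaults; the real resolution happens on the server, and the "auto" placeholder stands in for a value the server chooses itself.

```python
# Defaults taken from the parameter table; n_lists has no fixed default
# because the server derives it from the dataset size.
SERVER_DEFAULTS = {"batch_size": 2048, "max_iters": 100, "tolerance": 1e-6}


def effective_training_params(n_lists=None, batch_size=None,
                              max_iters=None, tolerance=None):
    """Return the values training would use, per the documented defaults."""
    requested = {
        "n_lists": n_lists,
        "batch_size": batch_size,
        "max_iters": max_iters,
        "tolerance": tolerance,
    }
    effective = {}
    for name, value in requested.items():
        if value is None:
            # "auto" marks a value chosen by the server (n_lists only).
            effective[name] = SERVER_DEFAULTS.get(name, "auto")
        else:
            effective[name] = value
    return effective
```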

Training is a compute-intensive operation that may take several seconds to minutes depending on the index size and configuration.
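
Since training time varies with index size and configuration, it can help to measure it before benchmarking queries. The sketch below assumes train() blocks until training completes, as described above; timed_train is a hypothetical helper, not an SDK function.

```python
import time


def timed_train(index, **train_params):
    """Call index.train() with the given parameters and return elapsed seconds.

    Assumption: train() returns only after training has finished.
    """
    start = time.perf_counter()
    index.train(**train_params)
    return time.perf_counter() - start
```

For example, elapsed = timed_train(index, n_lists=100) records how long an explicit training pass took with a chosen number of inverted lists.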

Returns

None

Exceptions

Error

Throws if the API request fails due to network connectivity issues.
Throws if authentication fails (invalid API key).
Throws if the encryption key is invalid for the specified index.
Throws if there are insufficient resources to complete training.

Training Errors

Throws if the index has no vectors to train on.
Throws if the index configuration is incompatible with training.
Throws if training parameters are out of valid ranges.
Throws if training fails to converge within the specified parameters.
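
The failure cases above can be handled with an ordinary try/except around train(). This page does not name the exception classes, so the sketch below catches Exception broadly as a stated assumption; try_train is a hypothetical wrapper, and real code should catch the specific classes the SDK raises once they are known.

```python
import logging

logger = logging.getLogger(__name__)


def try_train(index, **train_params):
    """Run index.train() and report success as a boolean instead of raising.

    Assumption: any failure listed above surfaces as a Python exception.
    """
    try:
        index.train(**train_params)
    except Exception as exc:  # exception classes are not named on this page
        logger.error("Index training failed: %s", exc)
        return False
    return True
```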

Example Usage

Basic Index Training

# Train the index after adding the data
index.train()

Custom Training Parameters

# Train with custom parameters for large dataset
index.train(
    n_lists=100,        # Number of inverted lists
    batch_size=4096,    # Larger batches for better performance
    max_iters=200,      # More iterations for better convergence
    tolerance=1e-7      # Stricter convergence criteria
)