Trains the encrypted index to optimize it for efficient similarity search queries. Training is essential for IVF-based indexes to achieve optimal query performance and accuracy.
In CyborgDB Service, training is typically handled automatically by the service. However, you can explicitly trigger training once enough vectors have been added.
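As a minimal sketch of explicit triggering, the helper below calls train() only when the index reports that it is not yet trained. It relies on the isTrained() and train() methods shown in the examples on this page; the TrainableIndex interface and the trainIfNeeded name are hypothetical helpers introduced for illustration, and the boolean result of isTrained() is an assumption.

```typescript
// Minimal structural type for the two index methods used here.
// TrainableIndex is a hypothetical name, not an SDK export.
interface TrainableIndex {
    isTrained(): Promise<boolean>; // assumed to resolve to a boolean
    train(params?: object): Promise<object>;
}

// Trigger training only when the index reports that it is untrained.
// Resolves to the train() response, or null when training was skipped.
async function trainIfNeeded(index: TrainableIndex): Promise<object | null> {
    if (await index.isTrained()) {
        return null; // already trained, nothing to do
    }
    return index.train();
}
```

Because the helper only depends on the two methods, it can be exercised against a stub object in tests before pointing it at a real index.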
async train({
    nLists?: number,         // optional, default: 0
    batchSize?: number,      // optional, default: 2048
    maxIters?: number,       // optional, default: 100
    tolerance?: number       // optional, default: 1e-6
}): Promise<object>

Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| nLists | number | 0 | (Optional) Number of inverted lists to use for the index. Defaults to 0, which will auto-select based on the dataset size |
| batchSize | number | 2048 | (Optional) Size of each batch processed during training. Larger values may improve training quality but use more memory |
| maxIters | number | 100 | (Optional) Maximum number of iterations for the training algorithm. More iterations may improve accuracy but take longer |
| tolerance | number | 1e-6 | (Optional) Convergence tolerance for training. Smaller values result in more precise training but may take longer |
Training is a compute-intensive operation that may take several seconds to minutes depending on the index size and configuration.
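The documented defaults can be captured in a small client-side helper that fills in omitted options and rejects obviously invalid values before a potentially long training run starts. This is a sketch only: TrainParams mirrors the signature above, while resolveTrainParams and its validation rules are hypothetical and do not describe checks performed by the service.

```typescript
// Shape of the options object from the train() signature.
interface TrainParams {
    nLists?: number;
    batchSize?: number;
    maxIters?: number;
    tolerance?: number;
}

// Defaults copied from the parameter documentation.
const TRAIN_DEFAULTS: Required<TrainParams> = {
    nLists: 0,        // 0 = auto-select based on dataset size
    batchSize: 2048,
    maxIters: 100,
    tolerance: 1e-6,
};

// Hypothetical helper: merge caller options over the defaults and validate.
// The validation rules below are assumptions, not documented service behavior.
function resolveTrainParams(params: TrainParams = {}): Required<TrainParams> {
    const resolved = { ...TRAIN_DEFAULTS, ...params };
    if (!Number.isInteger(resolved.nLists) || resolved.nLists < 0) {
        throw new RangeError('nLists must be a non-negative integer');
    }
    if (!Number.isInteger(resolved.batchSize) || resolved.batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    if (!Number.isInteger(resolved.maxIters) || resolved.maxIters < 1) {
        throw new RangeError('maxIters must be a positive integer');
    }
    if (!(resolved.tolerance > 0)) {
        throw new RangeError('tolerance must be a positive number');
    }
    return resolved;
}
```

The resolved object can then be passed straight to train(), for example `await index.train(resolveTrainParams({ maxIters: 200 }))`.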

Returns

Promise<object>: A Promise that resolves to a response object containing the operation status and training completion message.

Exceptions

Example Usage

import { Client } from 'cyborgdb';

const client = new Client({ baseUrl: 'http://localhost:8000', apiKey: 'your-api-key' });

// Load an existing index
const indexKey = new Uint8Array(Buffer.from('your-stored-hex-key', 'hex'));
const index = await client.loadIndex({ indexName: 'my-vector-index', indexKey });

// Add vectors to the index first
await index.upsert({
    items: [
        { id: 'doc1', vector: [0.1, 0.2, 0.3, 0.4], metadata: { title: 'Document 1' } },
        { id: 'doc2', vector: [0.4, 0.5, 0.6, 0.7], metadata: { title: 'Document 2' } },
        { id: 'doc3', vector: [0.7, 0.8, 0.9, 1.0], metadata: { title: 'Document 3' } }
    ]
});

// Train the index with default parameters
try {
    console.log('Starting index training...');
    const startTime = Date.now();
    
    const result = await index.train();
    
    const duration = Date.now() - startTime;
    console.log(`Training completed in ${duration}ms`);
    console.log('Training result:', result);
    // Typical output: { status: 'success', message: "Index 'my-vector-index' trained successfully" }
    
    // Verify training completed
    const isTrained = await index.isTrained();
    console.log('Index is now trained:', isTrained);
    
    // Index is now optimized for queries
    const queryResults = await index.query([0.1, 0.2, 0.3, 0.4], undefined, 5);
    console.log('Query after training:', queryResults);
    
} catch (error: any) {
    console.error('Training failed:', error.message);
}
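The catch block above types the error as any. A slightly stricter variant, sketched here, treats the caught value as unknown and narrows it before reading a message. The errorMessage and trainSafely helpers are hypothetical names introduced for illustration and are not part of the SDK.

```typescript
// Turn any thrown value into a printable message without assuming its type.
function errorMessage(err: unknown): string {
    if (err instanceof Error) return err.message;
    if (typeof err === 'string') return err;
    return String(err);
}

// Run a training call and log failures; resolves to null on error.
// The train callback stands in for () => index.train(...).
async function trainSafely(train: () => Promise<object>): Promise<object | null> {
    try {
        return await train();
    } catch (err: unknown) {
        console.error('Training failed:', errorMessage(err));
        return null;
    }
}
```

Usage would look like `const result = await trainSafely(() => index.train());`, with a null result signalling that the failure has already been logged.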

Custom Training Parameters

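// Wrapped in a function so that `return` and `throw` are valid here.
// Assumes `index` is the loaded index from the previous example.
async function trainHighQuality() {
    try {
        console.log('Starting high-quality training...');

        const result = await index.train({
            batchSize: 4096,    // larger batches for better quality
            maxIters: 200,      // more iterations for convergence
            tolerance: 1e-8     // stricter convergence criteria
        });

        console.log('High-quality training completed:', result);
        return result;

    } catch (error: any) {
        console.error('High-quality training failed:', error.message);
        throw error; // rethrow so the caller can handle the failure
    }
}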

Response Format

The method returns a response object with the following structure:
// Standard training completion response
{
    "status": "success",
    "message": "Index 'my-vector-index' trained successfully"
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| status | string | Operation status (typically "success") |
| message | string | Descriptive message about the training completion |
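Since train() is declared as returning Promise<object>, callers may want to check the response shape at runtime before reading its fields. The following is a minimal sketch based on the two documented fields; TrainResponse, isTrainResponse, and assertTrainingSucceeded are hypothetical names, and treating any status other than "success" as a failure is an assumption.

```typescript
// Response shape taken from the documented fields.
interface TrainResponse {
    status: string;
    message: string;
}

// Type guard: true when the value has string `status` and `message` fields.
function isTrainResponse(value: unknown): value is TrainResponse {
    if (typeof value !== 'object' || value === null) return false;
    const v = value as Record<string, unknown>;
    return typeof v['status'] === 'string' && typeof v['message'] === 'string';
}

// Throw if the response is malformed or reports a non-success status.
// Assumption: any status other than 'success' indicates a failure.
function assertTrainingSucceeded(value: unknown): TrainResponse {
    if (!isTrainResponse(value)) {
        throw new TypeError('Unexpected train() response shape');
    }
    if (value.status !== 'success') {
        throw new Error(`Training did not succeed: ${value.message}`);
    }
    return value;
}
```

A call such as `assertTrainingSucceeded(await index.train())` then yields a typed object whose status and message can be read without casts.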