DBConfig

The DBConfig class specifies the storage location for the index, with options for in-memory storage, databases, or file-based storage.

Parameters

Parameter          Type    Default  Description
location           string  -        DB location (redis, postgres, memory)
table_name         string  None     (Optional) Table name (postgres-only)
connection_string  string  None     (Optional) Connection string to access DB.

The supported location options are:

  • "redis": Use for high-speed, in-memory storage (recommended for index_location).
  • "postgres": Use for reliable, SQL-based storage (recommended for config_location).
  • "memory": Use for temporary in-memory storage (for benchmarking and evaluation purposes).

Example Usage

import cyborgdb_core as cyborgdb

# Redis-backed location for the index data
index_location = cyborgdb.DBConfig(location="redis",
                                   connection_string="redis://localhost")

# Postgres-backed location for the index configuration
config_location = cyborgdb.DBConfig(location="postgres",
                                    table_name="config_table",
                                    connection_string="host=localhost dbname=postgres")

For more information, see the documentation on supported backing stores.
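
The parameter rules above (three supported locations, table_name only for postgres) can be checked before a DBConfig is constructed. The following is a minimal sketch, not part of cyborgdb_core: the helper name build_dbconfig_kwargs and its error messages are hypothetical, and the library may perform its own validation.

```python
# Hypothetical helper (not part of cyborgdb_core): validates DBConfig
# arguments against the parameter table and returns them as keyword arguments.
SUPPORTED_LOCATIONS = ("redis", "postgres", "memory")


def build_dbconfig_kwargs(location, table_name=None, connection_string=None):
    if location not in SUPPORTED_LOCATIONS:
        raise ValueError(
            f"location must be one of {SUPPORTED_LOCATIONS}, got {location!r}"
        )
    # The parameter table marks table_name as postgres-only.
    if table_name is not None and location != "postgres":
        raise ValueError("table_name applies only when location is 'postgres'")

    kwargs = {"location": location}
    if table_name is not None:
        kwargs["table_name"] = table_name
    if connection_string is not None:
        kwargs["connection_string"] = connection_string
    return kwargs


memory_kwargs = build_dbconfig_kwargs("memory")
postgres_kwargs = build_dbconfig_kwargs(
    "postgres",
    table_name="config_table",
    connection_string="host=localhost dbname=postgres",
)
```

The resulting dictionaries could then be unpacked into the constructor, for example cyborgdb.DBConfig(**memory_kwargs).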


DistanceMetric

DistanceMetric is a string representing the distance metric used for the index. Options include:

  • "cosine": Cosine similarity.
  • "euclidean": Euclidean distance.
  • "squared_euclidean": Squared Euclidean distance.
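
To make the three options concrete, here is a small pure-Python sketch of the standard textbook definitions. It is an illustration only: it does not show how the library computes distances internally, and whether the library reports cosine as a similarity or as a distance (1 - similarity) is not stated here.

```python
import math


def squared_euclidean(a, b):
    # Sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    # Square root of the squared Euclidean distance.
    return math.sqrt(squared_euclidean(a, b))


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = [1.0, 0.0]
b = [0.0, 2.0]
print(squared_euclidean(a, b))  # 5.0
print(euclidean(a, b))          # square root of 5
print(cosine_similarity(a, b))  # 0.0 (orthogonal vectors)
```

Squared Euclidean distance ranks neighbors in the same order as Euclidean distance while skipping the square root, which is why it is often offered as a separate option.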

IndexConfig

The IndexConfig class defines the parameters for the type of index to be created. Each index type (ivf, ivfflat, ivfpq) has its own configuration options, described in the sections below.

For guidance on how to select the right IndexConfig and params, refer to the index configuration tuning guide.

IndexIVF

Ideal for large-scale datasets where fast retrieval is prioritized over high recall:

Speed    Recall  Index Size
Fastest  Lowest  Smallest

Parameters

Parameter  Type  Description
dimension  int   Dimensionality of vector embeddings.
n_lists    int   Number of inverted index lists to create in the index (a power of 2 is recommended).
metric     str   (Optional) Distance metric to use for index build and queries.
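
Reading the "base-2 value" recommendation for n_lists as "a power of 2" (an interpretation, since the table does not spell it out), a candidate list count can be rounded as sketched below. The helper name nearest_power_of_two is hypothetical and not part of cyborgdb_core; how to choose the candidate itself is covered by the tuning guide, not by this sketch.

```python
def nearest_power_of_two(n):
    """Round a positive integer to the closest power of 2 (ties round down)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    lower = 1 << (n.bit_length() - 1)  # largest power of 2 that is <= n
    upper = lower << 1                 # smallest power of 2 that is > n
    return lower if (n - lower) <= (upper - n) else upper


print(nearest_power_of_two(1000))  # 1024
print(nearest_power_of_two(700))   # 512
```

The result can be passed as n_lists, for example cyborgdb.IndexIVF(dimension=128, n_lists=nearest_power_of_two(1000)).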

Example Usage

import cyborgdb_core as cyborgdb

index_config = cyborgdb.IndexIVF(dimension=128, n_lists=1024, metric="euclidean")

IndexIVFFlat

Suitable for applications requiring high recall with less concern for memory usage:

Speed  Recall   Index Size
Fast   Highest  Biggest

Parameters

Parameter  Type  Description
dimension  int   Dimensionality of vector embeddings.
n_lists    int   Number of inverted index lists to create in the index (a power of 2 is recommended).
metric     str   (Optional) Distance metric to use for index build and queries.

Example Usage

import cyborgdb_core as cyborgdb

index_config = cyborgdb.IndexIVFFlat(dimension=128, n_lists=1024, metric="euclidean")

IndexIVFPQ

Product Quantization compresses embeddings, making it suitable for balancing memory use and recall:

Speed  Recall  Index Size
Fast   High    Medium

Parameters

Parameter  Type  Description
dimension  int   Dimensionality of vector embeddings.
n_lists    int   Number of inverted index lists to create in the index (a power of 2 is recommended).
pq_dim     int   Dimensionality of embeddings after quantization (less than or equal to dimension).
pq_bits    int   Number of bits per dimension for PQ embeddings (between 1 and 16).
metric     str   (Optional) Distance metric to use for index build and queries.
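
The constraints on pq_dim and pq_bits, and their effect on memory, can be sketched in a few lines. This is a rough illustration under stated assumptions: the helper names are hypothetical, uncompressed embeddings are assumed to be stored as 4-byte floats, and the estimate ignores codebooks, identifiers, and any other overhead the library adds to the index.

```python
def check_ivfpq_params(dimension, pq_dim, pq_bits):
    # Constraints taken from the parameter table.
    if not 1 <= pq_dim <= dimension:
        raise ValueError("pq_dim must be between 1 and dimension")
    if not 1 <= pq_bits <= 16:
        raise ValueError("pq_bits must be between 1 and 16")


def pq_bytes_per_vector(pq_dim, pq_bits):
    # pq_dim quantized dimensions at pq_bits each, rounded up to whole bytes.
    return (pq_dim * pq_bits + 7) // 8


def float32_bytes_per_vector(dimension):
    # Assumption: uncompressed embeddings use 4 bytes per dimension.
    return dimension * 4


check_ivfpq_params(dimension=128, pq_dim=64, pq_bits=8)
compressed = pq_bytes_per_vector(pq_dim=64, pq_bits=8)  # 64 bytes
raw = float32_bytes_per_vector(dimension=128)           # 512 bytes
print(raw / compressed)                                 # 8.0
```

Under these assumptions, lowering pq_dim or pq_bits shrinks each stored vector further, typically at some cost in recall.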

Example Usage

import cyborgdb_core as cyborgdb

index_config = cyborgdb.IndexIVFPQ(dimension=128, n_lists=1024, pq_dim=64, pq_bits=8, metric="euclidean")

If embedding_model is defined in create_index(), then dimension is unnecessary in IndexConfig.