DBConfig
The DBConfig class specifies the storage location for the index, with options for in-memory storage, databases, or file-based storage.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| location | string | - | DB location ("redis", "postgres", or "memory") |
| table_name | string | None | (Optional) Table name (postgres only) |
| connection_string | string | None | (Optional) Connection string used to access the DB |
The supported location options are:
"redis": Use for high-speed, in-memory storage (recommended for index_location).
"postgres": Use for reliable, SQL-based storage (recommended for config_location).
"memory": Use for temporary in-memory storage (for benchmarking and evaluation purposes).
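The constraints in the parameter table can be checked before a DBConfig is constructed. The helper below is a minimal sketch and is not part of cyborgdb_core: the name check_db_config_args is hypothetical, and it only mirrors the documented rules (three supported locations, table_name for postgres only).

```python
SUPPORTED_LOCATIONS = ("redis", "postgres", "memory")


def check_db_config_args(location, table_name=None, connection_string=None):
    """Validate DBConfig-style arguments and return them as keyword arguments.

    Hypothetical helper: it mirrors the documented table, not library internals.
    """
    if location not in SUPPORTED_LOCATIONS:
        raise ValueError(
            f"location must be one of {SUPPORTED_LOCATIONS}, got {location!r}"
        )
    if table_name is not None and location != "postgres":
        # The parameter table lists table_name as postgres-only.
        raise ValueError("table_name is only meaningful when location='postgres'")

    # Only pass on the optional arguments that were actually supplied.
    kwargs = {"location": location}
    if table_name is not None:
        kwargs["table_name"] = table_name
    if connection_string is not None:
        kwargs["connection_string"] = connection_string
    return kwargs


index_kwargs = check_db_config_args("redis", connection_string="redis://localhost")
print(index_kwargs)
```

The returned dictionary could then be unpacked into the constructor, for example cyborgdb.DBConfig(**index_kwargs).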
Example Usage
import cyborgdb_core as cyborgdb
index_location = cyborgdb.DBConfig(
    location="redis",
    connection_string="redis://localhost"
)
config_location = cyborgdb.DBConfig(
    location="postgres",
    table_name="config_table",
    connection_string="host=localhost dbname=postgres"
)
For more information, see the documentation on supported backing stores.
GPUConfig
The GPUConfig class configures which operations should use GPU acceleration. GPU acceleration requires CUDA support.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| upsert | bool | False | (Optional) Enable GPU for upsert operations |
| train | bool | False | (Optional) Enable GPU for training operations |
The constructor has no query parameter. To enable GPU for query operations, set both upsert=True and train=True, or use the bitflag operations directly in C++.
Properties (Read-Only)
| Property | Type | Description |
|---|---|---|
| upsert | bool | Whether GPU is enabled for upsert operations |
| train | bool | Whether GPU is enabled for training operations |
| query | bool | Whether GPU is enabled for query operations |
| all | bool | Whether all GPU operations are enabled |
| none | bool | Whether no GPU operations are enabled |
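The relationship between the two constructor arguments and the five read-only properties can be modelled with a small flag set. This is only a sketch of the rule described above, not the library's implementation: GPUFlags, its bit values, flags_from_constructor, and describe are all hypothetical names chosen for illustration.

```python
from enum import IntFlag


class GPUFlags(IntFlag):
    """Stand-in flag set; the bit values are illustrative assumptions."""
    NONE = 0
    UPSERT = 1
    TRAIN = 2
    QUERY = 4


ALL_FLAGS = GPUFlags.UPSERT | GPUFlags.TRAIN | GPUFlags.QUERY


def flags_from_constructor(upsert=False, train=False):
    """Model the documented rule: query is enabled only when both
    upsert and train are set."""
    if upsert and train:
        return ALL_FLAGS
    flags = GPUFlags.NONE
    if upsert:
        flags |= GPUFlags.UPSERT
    if train:
        flags |= GPUFlags.TRAIN
    return flags


def describe(flags):
    """Return the five read-only properties as a dictionary."""
    return {
        "upsert": bool(flags & GPUFlags.UPSERT),
        "train": bool(flags & GPUFlags.TRAIN),
        "query": bool(flags & GPUFlags.QUERY),
        "all": flags == ALL_FLAGS,
        "none": flags == GPUFlags.NONE,
    }


print(describe(flags_from_constructor(upsert=True, train=True)))
print(describe(flags_from_constructor(train=True)))
```

Under this model, a configuration with only train=True reports query as disabled, so a check such as `config.train and config.query` would be false for it.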
Example Usage
import cyborgdb_core as cyborgdb
# Enable GPU for upsert and training (query will also be enabled via properties)
gpu_config1 = cyborgdb.GPUConfig(upsert=True, train=True)
# Enable GPU only for training
gpu_config2 = cyborgdb.GPUConfig(train=True)
# Disable GPU (default)
gpu_config3 = cyborgdb.GPUConfig()
# Check GPU configuration
if gpu_config1.all:
    print("All GPU operations enabled")
if gpu_config2.train and gpu_config2.query:
    print("GPU enabled for training and query")
DistanceMetric
DistanceMetric is a string representing the distance metric used for the index. Options include:
"cosine": Cosine similarity.
"euclidean": Euclidean distance.
"squared_euclidean": Squared Euclidean distance.
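For reference, the three metrics can be written out in plain Python. This sketch uses the textbook definitions and says nothing about how the library computes them internally; the function names are illustrative.

```python
import math


def squared_euclidean(a, b):
    """Sum of squared component differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


a = [3.0, 4.0]
b = [6.0, 8.0]
print(cosine_similarity(a, b))   # 1.0 (same direction)
print(euclidean(a, b))           # 5.0
print(squared_euclidean(a, b))   # 25.0
```

Squared Euclidean distance ranks neighbours in the same order as Euclidean distance while skipping the square root.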
IndexConfig
The IndexConfig class defines the parameters for the type of index to be created. Each index type (ivf, ivfflat, ivfpq) has its own configuration options:
IndexIVF
Ideal for large-scale datasets where fast retrieval is prioritized over high recall:
| Speed | Recall | Index Size |
|---|---|---|
| Fastest | Lowest | Smallest |
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| dimension | int | 0 | (Optional) Dimensionality of vector embeddings. Auto-detected if 0. |
Properties (Read-Only)
| Property | Type | Description |
|---|---|---|
| n_lists | int | Number of inverted lists (coarse clusters). Set internally during training, initially 1. |
| dimension | int | Dimensionality of vector embeddings. |
| metric | str | Distance metric used. |
| index_type | str | Returns "ivf". |
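To make the idea of inverted lists (coarse clusters) concrete, the following sketch assigns vectors to their nearest centroid and searches only the closest list. It illustrates the general IVF technique in plain Python and is not the library's implementation; the centroids are fixed by hand rather than trained, and n_probes is a hypothetical parameter name.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(vector, centroids):
    """Return the index of the centroid closest to vector."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def build_inverted_lists(vectors, centroids):
    """Group vector ids by their nearest centroid (one list per centroid)."""
    lists = [[] for _ in centroids]
    for vec_id, vec in enumerate(vectors):
        lists[nearest_centroid(vec, centroids)].append(vec_id)
    return lists


def search(query, vectors, centroids, lists, n_probes=1):
    """Scan only the n_probes lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)),
                   key=lambda i: squared_distance(query, centroids[i]))
    candidates = [vid for list_id in order[:n_probes] for vid in lists[list_id]]
    return min(candidates, key=lambda vid: squared_distance(query, vectors[vid]))


vectors = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
centroids = [[0.0, 0.0], [5.0, 5.0]]  # fixed here; training would learn these
lists = build_inverted_lists(vectors, centroids)
print(lists)                                          # [[0, 1], [2, 3]]
print(search([4.9, 5.0], vectors, centroids, lists))  # 2
```

With a single list, which corresponds to the initial n_lists value of 1, this search degenerates into a scan of every vector.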
Example Usage
import cyborgdb_core as cyborgdb
# Basic configuration with auto-detection
index_config = cyborgdb.IndexIVF()
# Explicit dimension configuration
index_config = cyborgdb.IndexIVF(dimension=128)
# Access read-only properties
print(f"n_lists: {index_config.n_lists}") # Will show 1 (default)
print(f"dimension: {index_config.dimension}") # Will show 128
IndexIVFFlat
Suitable for applications requiring high recall with less concern for memory usage:
| Speed | Recall | Index Size |
|---|---|---|
| Fast | Highest | Biggest |
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| dimension | int | 0 | (Optional) Dimensionality of vector embeddings. Auto-detected if 0. |
Properties (Read-Only)
| Property | Type | Description |
|---|---|---|
| n_lists | int | Number of inverted lists (coarse clusters). Set internally during training, initially 1. |
| dimension | int | Dimensionality of vector embeddings. |
| metric | str | Distance metric used. |
| index_type | str | Returns "ivfflat". |
Example Usage
import cyborgdb_core as cyborgdb
# Basic configuration with auto-detection
index_config = cyborgdb.IndexIVFFlat()
# Explicit dimension configuration
index_config = cyborgdb.IndexIVFFlat(dimension=128)
# Access read-only properties
print(f"n_lists: {index_config.n_lists}") # Will show 1 (default)
print(f"dimension: {index_config.dimension}") # Will show 128
IndexIVFFlat is the default index configuration and is suitable for most use cases.
IndexIVFPQ
Product Quantization compresses embeddings, making it suitable for balancing memory use and recall:
| Speed | Recall | Index Size |
|---|---|---|
| Fast | High | Medium |
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| dimension | int | None | (Optional) Dimensionality of vector embeddings. Auto-detected if not provided. |
| pq_dim | int | - | (Required) Dimensionality of PQ codes after quantization. |
| pq_bits | int | - | (Required) Number of bits per quantizer (between 1 and 16). |
Properties (Read-Only)
| Property | Type | Description |
|---|---|---|
| n_lists | int | Number of inverted lists (coarse clusters). Set internally during training, initially 1. |
| dimension | int | Dimensionality of vector embeddings. |
| metric | str | Distance metric used. |
| index_type | str | Returns "ivfpq". |
| pq_dim | int | Dimensionality of PQ codes after quantization. |
| pq_bits | int | Number of bits per quantizer. |
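A rough estimate of how much pq_dim and pq_bits compress each vector can be computed by hand. The sketch below rests on several assumptions that the documentation does not state: raw embeddings are stored as 32-bit floats, each of the pq_dim code components takes exactly pq_bits bits, codes are tightly packed, and codebook and per-list overhead are ignored. The function names are illustrative.

```python
def pq_code_bytes(pq_dim, pq_bits):
    """Bytes for one PQ code: pq_dim components of pq_bits bits each (assumed)."""
    if pq_dim <= 0:
        raise ValueError("pq_dim must be positive")
    if not 1 <= pq_bits <= 16:
        # Range taken from the parameter table.
        raise ValueError("pq_bits must be between 1 and 16")
    total_bits = pq_dim * pq_bits
    return (total_bits + 7) // 8  # round up to whole bytes


def raw_vector_bytes(dimension, bytes_per_component=4):
    """Bytes for one uncompressed vector, assuming float32 components."""
    return dimension * bytes_per_component


raw = raw_vector_bytes(128)
code = pq_code_bytes(pq_dim=64, pq_bits=8)
print(raw, code, raw / code)  # 512 64 8.0
```

Under these assumptions, the configuration used in the example (dimension=128, pq_dim=64, pq_bits=8) stores each vector in one eighth of its uncompressed size.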
Example Usage
import cyborgdb_core as cyborgdb
# Basic configuration (dimension auto-detected)
index_config = cyborgdb.IndexIVFPQ(pq_dim=64, pq_bits=8)
# Explicit configuration
index_config = cyborgdb.IndexIVFPQ(
    dimension=128,
    pq_dim=64,
    pq_bits=8
)
# Access read-only properties
print(f"n_lists: {index_config.n_lists}") # Will show 1 (default)
print(f"pq_dim: {index_config.pq_dim}") # Will show 64
print(f"pq_bits: {index_config.pq_bits}") # Will show 8
If dimension is not provided, it will be auto-determined based on the first vector embedding added to the index.
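The auto-detection rule can be pictured with a small stand-in class. DimensionTracker is hypothetical and does not exist in cyborgdb_core; it treats both None and 0 as "not provided", and its rejection of later vectors with a different length is an assumption rather than documented behaviour.

```python
class DimensionTracker:
    """Hypothetical stand-in: the first vector seen fixes the dimension."""

    def __init__(self, dimension=None):
        # None or 0 both mean "not provided" in this sketch.
        self.dimension = dimension or None

    def observe(self, vector):
        """Record the dimension from the first vector, then enforce it."""
        if self.dimension is None:
            self.dimension = len(vector)
        elif len(vector) != self.dimension:
            # Assumed behaviour: mismatched vectors are rejected.
            raise ValueError(
                f"expected {self.dimension} components, got {len(vector)}"
            )
        return self.dimension


tracker = DimensionTracker()
print(tracker.dimension)                  # None
print(tracker.observe([0.1, 0.2, 0.3]))   # 3
```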