DBConfig
The DBConfig class specifies the storage location for the index, with options for in-memory storage, databases, or file-based storage.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| location | string | - | DB location (redis, postgres, memory) |
| table_name | string | None | (Optional) Table name (postgres-only) |
| connection_string | string | None | (Optional) Connection string to access DB. |
The location options are:

- "redis": Use for high-speed, in-memory storage (recommended for index_location).
- "postgres": Use for reliable, SQL-based storage (recommended for config_location).
- "memory": Use for temporary in-memory storage (for benchmarking and evaluation purposes).
Example Usage
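As a minimal sketch of the parameters above, the stand-in dataclass below shows how the three location options might be configured. `DBConfigSketch`, its validation rules, the table name, and the connection strings are hypothetical placeholders for illustration; they are not the library's actual class or real endpoints.

```python
from dataclasses import dataclass
from typing import Optional

# Location values listed in this document.
VALID_LOCATIONS = ("redis", "postgres", "memory")


@dataclass
class DBConfigSketch:
    """Hypothetical stand-in for DBConfig; mirrors the parameter table only."""
    location: str
    table_name: Optional[str] = None          # postgres-only
    connection_string: Optional[str] = None   # how to reach the DB

    def __post_init__(self) -> None:
        if self.location not in VALID_LOCATIONS:
            raise ValueError(f"unknown location: {self.location!r}")
        if self.table_name is not None and self.location != "postgres":
            raise ValueError("table_name applies only to the postgres location")


# Placeholder connection strings; substitute real hosts and credentials.
index_location = DBConfigSketch(
    location="redis",
    connection_string="redis://localhost:6379",
)
config_location = DBConfigSketch(
    location="postgres",
    table_name="index_config",
    connection_string="postgresql://user:password@localhost:5432/vectors",
)
scratch_location = DBConfigSketch(location="memory")
```

The split follows the recommendations above: redis for index_location, postgres for config_location, and memory only for short-lived benchmarking or evaluation runs.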
DistanceMetric
DistanceMetric is a string representing the distance metric used for the index. Options include:

- "cosine": Cosine similarity.
- "euclidean": Euclidean distance.
- "squared_euclidean": Squared Euclidean distance.
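To make the three options concrete, the sketch below computes each quantity in plain Python using the standard textbook definitions. The function names and the `METRICS` lookup table are illustrative assumptions; how the library evaluates these metrics internally is not described here.

```python
import math


def squared_euclidean(a, b):
    # Sum of squared component-wise differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    # Square root of the squared Euclidean distance.
    return math.sqrt(squared_euclidean(a, b))


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical lookup keyed by the DistanceMetric strings.
METRICS = {
    "cosine": cosine_similarity,
    "euclidean": euclidean,
    "squared_euclidean": squared_euclidean,
}

a = [1.0, 0.0]
b = [0.0, 1.0]
for name, fn in METRICS.items():
    print(name, fn(a, b))
```

Because the square root is monotonic, "euclidean" and "squared_euclidean" rank neighbors in the same order; the squared form simply skips the square root.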
IndexConfig
The IndexConfig class defines the parameters for the type of index to be created. Each index type (e.g., ivf, ivfflat, ivfpq) has unique configuration options:
IndexIVF
Ideal for large-scale datasets where fast retrieval is prioritized over high recall:

| Speed | Recall | Index Size |
|---|---|---|
| Fastest | Lowest | Smallest |
Parameters
| Parameter | Type | Description |
|---|---|---|
| dimension | int | Dimensionality of vector embeddings. |
| n_lists | int | Number of inverted index lists to create in the index (a power of 2 is recommended). |
| metric | str | (Optional) Distance metric to use for index build and queries. |
Example Usage
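A minimal sketch of an IndexIVF configuration, assuming only the three parameters in the table above. `IndexIVFSketch` is a hypothetical stand-in rather than the library's class, the example values (768, 1024) are arbitrary, and the power-of-two check reflects one reading of the n_lists recommendation.

```python
from dataclasses import dataclass
from typing import Optional

# DistanceMetric strings listed in this document.
VALID_METRICS = ("cosine", "euclidean", "squared_euclidean")


@dataclass
class IndexIVFSketch:
    """Hypothetical stand-in for IndexIVF; mirrors the parameter table only."""
    dimension: int
    n_lists: int
    metric: Optional[str] = None  # optional; the default is not stated here

    def __post_init__(self) -> None:
        if self.dimension <= 0 or self.n_lists <= 0:
            raise ValueError("dimension and n_lists must be positive")
        if self.metric is not None and self.metric not in VALID_METRICS:
            raise ValueError(f"unknown metric: {self.metric!r}")

    def n_lists_is_power_of_two(self) -> bool:
        # A positive integer is a power of two when exactly one bit is set.
        return self.n_lists & (self.n_lists - 1) == 0


ivf_config = IndexIVFSketch(dimension=768, n_lists=1024, metric="euclidean")
print(ivf_config.n_lists_is_power_of_two())  # True
```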
IndexIVFFlat
Suitable for applications requiring high recall with less concern for memory usage:

| Speed | Recall | Index Size |
|---|---|---|
| Fast | Highest | Biggest |
Parameters
| Parameter | Type | Description |
|---|---|---|
| dimension | int | Dimensionality of vector embeddings. |
| n_lists | int | Number of inverted index lists to create in the index (a power of 2 is recommended). |
| metric | str | (Optional) Distance metric to use for index build and queries. |
Example Usage
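IndexIVFFlat takes the same parameters as the table above, so this sketch focuses on choosing n_lists. It assumes the "base-2" recommendation means a power of two and rounds a target list count up accordingly. The helper `next_power_of_two`, the `ivfflat_params` dictionary, and the example values are hypothetical; the dictionary only gathers the keyword arguments one would presumably pass when constructing the index config.

```python
def next_power_of_two(n: int) -> int:
    # Smallest power of two that is greater than or equal to n.
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


# Hypothetical keyword arguments mirroring the IndexIVFFlat parameter table.
ivfflat_params = {
    "dimension": 384,
    "n_lists": next_power_of_two(1000),
    "metric": "cosine",
}
print(ivfflat_params["n_lists"])  # 1024
```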
IndexIVFPQ
Product Quantization compresses embeddings, making it suitable for balancing memory use and recall:

| Speed | Recall | Index Size |
|---|---|---|
| Fast | High | Medium |
Parameters
| Parameter | Type | Description |
|---|---|---|
| dimension | int | Dimensionality of vector embeddings. |
| n_lists | int | Number of inverted index lists to create in the index (a power of 2 is recommended). |
| pq_dim | int | Dimensionality of embeddings after quantization (less than or equal to dimension). |
| pq_bits | int | Number of bits per dimension for PQ embeddings (between 1 and 16). |
| metric | str | (Optional) Distance metric to use for index build and queries. |
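The constraints on pq_dim and pq_bits, and their effect on storage, can be sketched as follows. `IndexIVFPQSketch` is a hypothetical stand-in rather than the library's class, and the size estimate is a back-of-the-envelope figure: it assumes uncompressed embeddings use 4-byte float32 components and ignores any per-list or codebook overhead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexIVFPQSketch:
    """Hypothetical stand-in for IndexIVFPQ; mirrors the parameter table only."""
    dimension: int
    n_lists: int
    pq_dim: int
    pq_bits: int
    metric: Optional[str] = None

    def __post_init__(self) -> None:
        if not 1 <= self.pq_dim <= self.dimension:
            raise ValueError("pq_dim must be between 1 and dimension")
        if not 1 <= self.pq_bits <= 16:
            raise ValueError("pq_bits must be between 1 and 16")

    def code_bytes_per_vector(self) -> float:
        # pq_dim quantized dimensions, pq_bits bits each, 8 bits per byte.
        return self.pq_dim * self.pq_bits / 8

    def compression_ratio(self, bytes_per_component: int = 4) -> float:
        # Assumption: raw vectors store float32 components (4 bytes each).
        raw_bytes = self.dimension * bytes_per_component
        return raw_bytes / self.code_bytes_per_vector()


pq_config = IndexIVFPQSketch(
    dimension=768, n_lists=1024, pq_dim=64, pq_bits=8, metric="euclidean"
)
print(pq_config.code_bytes_per_vector())  # 64.0
print(pq_config.compression_ratio())      # 48.0
```

Under these assumptions, lowering pq_dim or pq_bits shrinks the stored codes further, at the cost of a coarser approximation of each embedding.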