Types

Index Configuration Types

IndexIVF

Standard IVF (Inverted File) index configuration, ideal for balanced performance:

Speed	Accuracy	Memory Usage
Fast	Good	Medium

from cyborgdb import IndexIVF

config = IndexIVF(
    dimension=768  # optional, defaults to auto-detect
)

IndexIVFFlat

IVFFlat index configuration, suitable for highest accuracy requirements:

Speed	Accuracy	Memory Usage
Medium	Highest	High

from cyborgdb import IndexIVFFlat

config = IndexIVFFlat(
    dimension=512  # optional, defaults to auto-detect
)

IndexIVFPQ

IVFPQ (Product Quantization) index configuration, optimized for memory efficiency:

Speed	Accuracy	Memory Usage
Fast	Good	Low

from cyborgdb import IndexIVFPQ

config = IndexIVFPQ(
    pq_dim=64,        # required: product quantization dimension
    pq_bits=8,        # required: bits per quantization code
    dimension=1536    # optional, defaults to auto-detect
)

Unlike IndexIVF and IndexIVFFlat, which have no required parameters, IndexIVFPQ requires both pq_dim and pq_bits to be specified explicitly.
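To see why IVFPQ sits in the low-memory column, the sketch below estimates per-vector storage under the usual product quantization layout: pq_dim codes of pq_bits each, compared with an uncompressed float32 vector. The helper names are hypothetical and not part of cyborgdb, and real index sizes will also include overhead this estimate ignores.

```python
def pq_code_bytes(pq_dim: int, pq_bits: int) -> int:
    """Bytes for one PQ code: pq_dim codes of pq_bits each, rounded up to whole bytes."""
    return (pq_dim * pq_bits + 7) // 8


def raw_vector_bytes(dimension: int, bytes_per_component: int = 4) -> int:
    """Bytes for an uncompressed vector, assuming float32 components."""
    return dimension * bytes_per_component


# Same values as the IndexIVFPQ example above.
compressed = pq_code_bytes(pq_dim=64, pq_bits=8)
uncompressed = raw_vector_bytes(dimension=1536)
ratio = uncompressed / compressed

print(f"{compressed} bytes vs {uncompressed} bytes ({ratio:.0f}x smaller)")
```

Lowering pq_dim or pq_bits shrinks the code further, typically at some cost in accuracy.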

Vector Item Format

Dictionary format for upsert operations:

vector_item = {
    "id": "unique_identifier",           # Required: string
    "vector": [0.1, 0.2, 0.3, ...],    # Optional: List[float]
    "contents": "text content",          # Optional: string or bytes
    "metadata": {                        # Optional: Dict[str, Any]
        "category": "research",
        "author": "Dr. Smith",
        "tags": ["ai", "ml"]
    }
}
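As a rough illustration of the format above, the following sketch checks an item's shape before it is sent to an upsert call. validate_vector_item is a hypothetical helper, not part of cyborgdb; the library may apply different or stricter checks of its own.

```python
from typing import Any, Dict, List


def validate_vector_item(item: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an upsert item (empty if none)."""
    problems = []
    if not isinstance(item.get("id"), str):
        problems.append("'id' is required and must be a string")

    vector = item.get("vector")
    if vector is not None:
        is_numeric_list = isinstance(vector, list) and all(
            isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector
        )
        if not is_numeric_list:
            problems.append("'vector' must be a list of numbers")

    contents = item.get("contents")
    if contents is not None and not isinstance(contents, (str, bytes)):
        problems.append("'contents' must be a string or bytes")

    metadata = item.get("metadata")
    if metadata is not None and not isinstance(metadata, dict):
        problems.append("'metadata' must be a dict")

    return problems


good = {"id": "doc1", "vector": [0.1, 0.2, 0.3], "metadata": {"tags": ["ai", "ml"]}}
bad = {"id": 42, "vector": "not a list", "contents": 3.14}

print(validate_vector_item(good))
print(validate_vector_item(bad))
```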

Query Result Format

Results returned from query operations:

# Single query result format (flat list)
single_query_results = [
    {
        "id": "doc1",                    # string (always included)
        "distance": 0.125,               # float (always included, lower = more similar)
        "metadata": {                    # Dict (if included in query)
            "category": "research"
        },
        "contents": "text content",      # string (if included in query)
        "vector": [0.1, 0.2, ...]      # List[float] (if included in query)
    },
    # ... more results
]

# Batch query result format (nested list)
batch_query_results = [
    [  # Results for first query vector
        {"id": "doc1", "distance": 0.125, ...},
        {"id": "doc2", "distance": 0.234, ...}
    ],
    [  # Results for second query vector
        {"id": "doc3", "distance": 0.156, ...},
        {"id": "doc4", "distance": 0.278, ...}
    ]
]
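Because a single query returns a flat list while a batch query returns a nested one, calling code often needs to treat both uniformly. A minimal sketch, using a hypothetical as_batches helper that is not part of cyborgdb:

```python
from typing import Any, Dict, List, Union

Result = Dict[str, Any]


def as_batches(results: Union[List[Result], List[List[Result]]]) -> List[List[Result]]:
    """Wrap a flat single-query result list so callers can always iterate batches."""
    if results and isinstance(results[0], list):
        return results  # already nested: one inner list per query vector
    return [results]  # flat (or empty) list: treat it as a single query


single = [{"id": "doc1", "distance": 0.125}, {"id": "doc2", "distance": 0.234}]
batch = [
    [{"id": "doc1", "distance": 0.125}, {"id": "doc2", "distance": 0.234}],
    [{"id": "doc3", "distance": 0.156}, {"id": "doc4", "distance": 0.278}],
]

best_ids = []
for hits in as_batches(batch):
    best = min(hits, key=lambda hit: hit["distance"])  # lower = more similar
    best_ids.append(best["id"])

print(best_ids)
```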

Metadata Filtering

The filters parameter in query operations supports MongoDB-style operators:

Supported Operators

$eq: Equality ({"category": {"$eq": "research"}}, or the shorthand {"category": "research"})
$ne: Not equal ({"status": {"$ne": "draft"}})
$gt: Greater than ({"score": {"$gt": 0.8}})
$gte: Greater than or equal ({"year": {"$gte": 2020}})
$lt: Less than ({"price": {"$lt": 100}})
$lte: Less than or equal ({"rating": {"$lte": 4.5}})
$in: In array ({"tag": {"$in": ["ai", "ml"]}})
$nin: Not in array ({"category": {"$nin": ["spam", "deleted"]}})
$and: Logical AND ({"$and": [{"a": 1}, {"b": 2}]})
$or: Logical OR ({"$or": [{"x": 1}, {"y": 2}]})

Filter Examples

# Simple equality filter
simple_filter = {"category": "research"}

# Range filter
range_filter = {
    "published_year": {"$gte": 2020, "$lte": 2024}
}

# Complex compound filter
complex_filter = {
    "$and": [
        {"category": "research"},
        {"confidence": {"$gte": 0.9}},
        {"$or": [
            {"language": "en"},
            {"translated": True}
        ]}
    ]
}
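To make the operator semantics concrete, here is a small local evaluator that applies a filter to a single metadata dict. It is only a sketch assuming MongoDB-like semantics: matches is a hypothetical function, not how cyborgdb evaluates filters, and it ignores cases such as list-valued metadata fields.

```python
from typing import Any, Dict

# One comparison function per field-level operator.
_COMPARATORS = {
    "$eq": lambda value, target: value == target,
    "$ne": lambda value, target: value != target,
    "$gt": lambda value, target: value is not None and value > target,
    "$gte": lambda value, target: value is not None and value >= target,
    "$lt": lambda value, target: value is not None and value < target,
    "$lte": lambda value, target: value is not None and value <= target,
    "$in": lambda value, target: value in target,
    "$nin": lambda value, target: value not in target,
}


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True if metadata satisfies every clause of the filter."""
    for key, condition in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            value = metadata.get(key)
            # Several operators on one field must all hold (implicit AND).
            if not all(_COMPARATORS[op](value, target) for op, target in condition.items()):
                return False
        elif metadata.get(key) != condition:  # bare value means equality
            return False
    return True


doc = {"category": "research", "confidence": 0.95, "language": "fr",
       "translated": True, "published_year": 2022}
complex_filter = {
    "$and": [
        {"category": "research"},
        {"confidence": {"$gte": 0.9}},
        {"$or": [{"language": "en"}, {"translated": True}]},
    ]
}

print(matches(doc, complex_filter))
print(matches(doc, {"published_year": {"$gte": 2020, "$lte": 2024}}))
```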

Field Selection

Many operations support field selection through the include parameter:

Available Fields

vector: The vector data itself
contents: Text or binary content associated with the vector
metadata: Structured metadata object
distance: Similarity distance (query operations only, always included automatically)

The id and distance fields are always included in query results regardless of the include parameter.

Example Usage

# Include only metadata (efficient for existence checks)
metadata_only = ["metadata"]

# Include vectors and distances (for similarity analysis)
vectors_and_distances = ["vector", "distance"]

# Include all available fields
all_fields = ["vector", "contents", "metadata", "distance"]
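The effect of include on a query result can be pictured as a projection that always keeps id and distance. The project helper below is hypothetical and only mimics that documented behavior locally; it does not call cyborgdb.

```python
from typing import Any, Dict, Iterable

# Fields the documentation says are returned regardless of `include`.
ALWAYS_INCLUDED = ("id", "distance")


def project(result: Dict[str, Any], include: Iterable[str]) -> Dict[str, Any]:
    """Keep only the requested fields plus the always-included ones."""
    wanted = set(include) | set(ALWAYS_INCLUDED)
    return {key: value for key, value in result.items() if key in wanted}


full_result = {
    "id": "doc1",
    "distance": 0.125,
    "metadata": {"category": "research"},
    "contents": "text content",
    "vector": [0.1, 0.2],
}

print(project(full_result, ["metadata"]))
```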

Distance Metrics

Supported distance metrics for similarity calculations:

cosine: Cosine similarity (recommended for normalized vectors)
euclidean: Euclidean distance (L2 norm)
squared_euclidean: Squared Euclidean distance (cheaper to compute than euclidean because it skips the square root, and it ranks results in the same order)
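For reference, the three metrics can be written out in plain Python as below. This is a sketch for intuition only: the function names are hypothetical, and treating cosine as 1 minus cosine similarity (so that lower means more similar) is an assumption about how the metric is reported.

```python
import math
from typing import Sequence


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared component differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """L2 distance: the square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity (assumed convention); undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
candidates = {"a": [0.0, 1.0], "b": [3.0, 4.0], "c": [2.0, 0.0]}

# Squaring is monotonic, so both Euclidean variants rank candidates identically.
rank_sq = sorted(candidates, key=lambda k: squared_euclidean(query, candidates[k]))
rank_eu = sorted(candidates, key=lambda k: euclidean(query, candidates[k]))
# Cosine ignores vector length, so it can produce a different order.
rank_cos = sorted(candidates, key=lambda k: cosine_distance(query, candidates[k]))

print(rank_sq, rank_eu, rank_cos)
```

On unit-length vectors, squared Euclidean distance equals twice the cosine distance, so the metrics agree on ranking once vectors are normalized.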
