# Types

## Index Configuration Types

### IndexIVFSQ

IVFSQ (Scalar Quantization) index configuration, which compresses each dimension independently for a balance of speed and index size:

| Speed | Accuracy | Memory Usage |
| ----- | -------- | ------------ |
| Fast  | High     | Low          |

```python theme={null}
from cyborgdb import IndexIVFSQ

config = IndexIVFSQ(
    sq_bits=16,       # optional, default: 16
    dimension=768     # optional, defaults to auto-detect
)
```

<Note>The `sq_bits` parameter controls the precision of the scalar quantization. Higher values give higher recall at the cost of a larger index. Accepted values are `8` and `16` (default: `16`).</Note>
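To illustrate the precision trade-off, the sketch below applies plain uniform scalar quantization to a single vector at 8 and 16 bits and measures the reconstruction error. It is a generic illustration, not CyborgDB's internal quantizer; the helper names `quantize` and `dequantize` and the per-vector min/max range are assumptions made for the example.

```python
def quantize(vector, bits):
    """Map each dimension to an integer code of `bits` bits (uniform steps)."""
    lo, hi = min(vector), max(vector)
    levels = (1 << bits) - 1                      # highest code value
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximate the original values from their integer codes."""
    return [lo + code * scale for code in codes]


vector = [0.12, -0.53, 0.98, 0.04]
errors = {}
for bits in (8, 16):
    codes, lo, scale = quantize(vector, bits)
    restored = dequantize(codes, lo, scale)
    # Largest per-dimension reconstruction error at this precision
    errors[bits] = max(abs(a - b) for a, b in zip(vector, restored))
    print(f"sq_bits={bits}: max error {errors[bits]:.2e}")
```

The 16-bit codes reconstruct the vector more closely than the 8-bit codes, while needing twice the storage per dimension.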

### IndexIVFFlat

IVFFlat index configuration, suitable for highest accuracy requirements:

| Speed  | Accuracy | Memory Usage |
| ------ | -------- | ------------ |
| Medium | Highest  | High         |

```python theme={null}
from cyborgdb import IndexIVFFlat

config = IndexIVFFlat(
    dimension=512 # optional, defaults to auto-detect
)
```

### IndexIVFPQ

IVFPQ (Product Quantization) index configuration, optimized for memory efficiency:

| Speed | Accuracy | Memory Usage |
| ----- | -------- | ------------ |
| Fast  | Good     | Low          |

```python theme={null}
from cyborgdb import IndexIVFPQ

config = IndexIVFPQ(
    pq_dim=64,        # required: product quantization dimension
    pq_bits=8,        # required: bits per quantization code
    dimension=1536    # optional, defaults to auto-detect
)
```

<Note>Both `pq_dim` and `pq_bits` are required for `IndexIVFPQ`. Unlike the parameters of `IndexIVFSQ` and `IndexIVFFlat`, they have no defaults and must be specified explicitly.</Note>
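As a rough way to compare the three index types, the following sketch estimates the payload size of one stored vector. It assumes 32-bit floats for IVFFlat, one `sq_bits`-wide code per dimension for IVFSQ, and `pq_dim` codes of `pq_bits` each for IVFPQ. It ignores centroids, IDs, encryption, and any other overhead, so treat the numbers as illustrative only. `bytes_per_vector` is a hypothetical helper, not part of the SDK.

```python
def bytes_per_vector(index_type, dimension, sq_bits=16, pq_dim=0, pq_bits=0):
    """Estimate the raw payload size of one stored vector, in bytes."""
    if index_type == "ivfflat":
        return dimension * 4               # assumption: 32-bit floats
    if index_type == "ivfsq":
        return dimension * sq_bits / 8     # one code per dimension
    if index_type == "ivfpq":
        return pq_dim * pq_bits / 8        # one code per PQ sub-vector
    raise ValueError(f"unknown index type: {index_type}")


flat_size = bytes_per_vector("ivfflat", 768)
sq_size = bytes_per_vector("ivfsq", 768, sq_bits=8)
pq_size = bytes_per_vector("ivfpq", 768, pq_dim=64, pq_bits=8)
print(flat_size, sq_size, pq_size)
```

Under these assumptions, a 768-dimensional vector shrinks from 3072 bytes (flat) to 768 bytes (8-bit SQ) to 64 bytes (PQ with `pq_dim=64`, `pq_bits=8`), which matches the memory ordering in the tables above.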

## Vector Item Format

Dictionary format for upsert operations:

```python theme={null}
vector_item = {
    "id": "unique_identifier",           # Required: string
    "vector": [0.1, 0.2, 0.3, ...],      # Optional: List[float]
    "contents": "text content",          # Optional: string or bytes
    "metadata": {                        # Optional: Dict[str, Any]
        "category": "research",
        "author": "Dr. Smith",
        "tags": ["ai", "ml"]
    }
}
```
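A client-side check can catch malformed items before they are sent. The sketch below validates a dictionary against the shape documented above; `validate_vector_item` is a hypothetical helper written for this example, not an SDK function, and the server may apply additional rules.

```python
from typing import Any, Dict


def validate_vector_item(item: Dict[str, Any]) -> None:
    """Raise ValueError if `item` does not match the documented shape."""
    if not isinstance(item.get("id"), str):
        raise ValueError("'id' is required and must be a string")

    vector = item.get("vector")
    if vector is not None and not all(isinstance(x, (int, float)) for x in vector):
        raise ValueError("'vector' must be a list of numbers")

    contents = item.get("contents")
    if contents is not None and not isinstance(contents, (str, bytes)):
        raise ValueError("'contents' must be a string or bytes")

    metadata = item.get("metadata")
    if metadata is not None and not isinstance(metadata, dict):
        raise ValueError("'metadata' must be a dictionary")


# A well-formed item passes silently
validate_vector_item({
    "id": "doc1",
    "vector": [0.1, 0.2, 0.3],
    "contents": "text content",
    "metadata": {"category": "research"},
})
```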

## Query Result Format

Results returned from query operations:

```python theme={null}
# Single query result format (flat list)
single_query_results = [
    {
        "id": "doc1",                    # string (always included)
        "distance": 0.125,               # float (if included, lower = more similar)
        "metadata": {                    # Dict (if included in query)
            "category": "research"
        },
        "contents": "text content",      # string (if included in query)
        "vector": [0.1, 0.2, ...]        # List[float] (if included in query)
    },
    # ... more results
]

# Batch query result format (nested list)
batch_query_results = [
    [  # Results for first query vector
        {"id": "doc1", "distance": 0.125},  # plus any other included fields
        {"id": "doc2", "distance": 0.234}
    ],
    [  # Results for second query vector
        {"id": "doc3", "distance": 0.156},
        {"id": "doc4", "distance": 0.278}
    ]
]
```
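Because single queries return a flat list and batch queries return a nested list, downstream code may want one uniform shape. The following sketch wraps a flat result in an outer list so both cases can be iterated the same way; `as_batches` is a hypothetical helper, not part of the SDK.

```python
def as_batches(results):
    """Return results as a list of per-query lists, whichever shape came in."""
    # A flat result holds dicts directly; a batch result holds lists of dicts.
    if results and isinstance(results[0], dict):
        return [results]
    return results


single = [{"id": "doc1", "distance": 0.125}, {"id": "doc2", "distance": 0.234}]
batch = [
    [{"id": "doc1", "distance": 0.125}],
    [{"id": "doc3", "distance": 0.156}],
]

for query_index, hits in enumerate(as_batches(single)):
    print(query_index, [hit["id"] for hit in hits])
for query_index, hits in enumerate(as_batches(batch)):
    print(query_index, [hit["id"] for hit in hits])
```

An empty list is ambiguous (no results versus no queries) and is returned unchanged.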

## Metadata Filtering

The `filters` parameter in query operations supports MongoDB-style operators:

### Supported Operators

* **`$eq`**: Equality (implicit form: `{"category": "research"}`)
* **`$ne`**: Not equal (`{"status": {"$ne": "draft"}}`)
* **`$gt`**: Greater than (`{"score": {"$gt": 0.8}}`)
* **`$gte`**: Greater than or equal (`{"year": {"$gte": 2020}}`)
* **`$lt`**: Less than (`{"price": {"$lt": 100}}`)
* **`$lte`**: Less than or equal (`{"rating": {"$lte": 4.5}}`)
* **`$in`**: In array (`{"tag": {"$in": ["ai", "ml"]}}`)
* **`$nin`**: Not in array (`{"category": {"$nin": ["spam", "deleted"]}}`)
* **`$and`**: Logical AND (`{"$and": [{"a": 1}, {"b": 2}]}`)
* **`$or`**: Logical OR (`{"$or": [{"x": 1}, {"y": 2}]}`)

### Filter Examples

```python theme={null}
# Simple equality filter
simple_filter = {"category": "research"}

# Range filter
range_filter = {
    "published_year": {"$gte": 2020, "$lte": 2024}
}

# Complex compound filter
complex_filter = {
    "$and": [
        {"category": "research"},
        {"confidence": {"$gte": 0.9}},
        {"$or": [
            {"language": "en"},
            {"translated": True}
        ]}
    ]
}
```
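To make the operator semantics concrete, the sketch below evaluates a filter against a plain metadata dictionary on the client. It is an approximation of MongoDB-style matching written for illustration, not the server's implementation; `matches` and `COMPARATORS` are hypothetical names, and edge cases such as array-valued fields or missing keys may behave differently in CyborgDB.

```python
COMPARATORS = {
    "$eq": lambda value, arg: value == arg,
    "$ne": lambda value, arg: value != arg,
    "$gt": lambda value, arg: value is not None and value > arg,
    "$gte": lambda value, arg: value is not None and value >= arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$lte": lambda value, arg: value is not None and value <= arg,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
}


def matches(metadata, filters):
    """Return True if `metadata` satisfies every condition in `filters`."""
    for key, condition in filters.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            # Operator form, e.g. {"$gte": 2020, "$lte": 2024}
            value = metadata.get(key)
            if not all(COMPARATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif metadata.get(key) != condition:
            # Implicit equality, e.g. {"category": "research"}
            return False
    return True


complex_filter = {
    "$and": [
        {"category": "research"},
        {"confidence": {"$gte": 0.9}},
        {"$or": [{"language": "en"}, {"translated": True}]},
    ]
}
record = {"category": "research", "confidence": 0.95, "language": "fr", "translated": True}
print(matches(record, complex_filter))
```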

## Field Selection

Many operations support field selection through the `include` parameter:

### Available Fields

* **`vector`**: The vector data itself
* **`contents`**: Text or binary content associated with the vector
* **`metadata`**: Structured metadata object
* **`distance`**: Similarity distance (query operations only)

<Note>The `id` field is always included in query results. All other fields, including `distance` and `metadata`, are returned only when listed in `include`. The server default is `[]`, so only `id` is returned unless `include` names additional fields.</Note>

### Example Usage

```python theme={null}
# Include only metadata (efficient for existence checks)
metadata_only = ["metadata"]

# Include vectors and distances (for similarity analysis)
vectors_and_distances = ["vector", "distance"]

# Include all available fields
all_fields = ["vector", "contents", "metadata", "distance"]
```
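The effect of `include` can be pictured as a projection over each result. The sketch below mimics that behavior locally: `id` is always kept and every other field is kept only if listed. `project` is a hypothetical helper for illustration; the actual selection happens on the server.

```python
def project(record, include=()):
    """Keep 'id' plus the fields named in `include`, dropping everything else."""
    allowed = {"id", *include}
    return {key: value for key, value in record.items() if key in allowed}


full_record = {
    "id": "doc1",
    "distance": 0.125,
    "metadata": {"category": "research"},
    "contents": "text content",
    "vector": [0.1, 0.2, 0.3],
}

print(project(full_record))                          # default: only the id
print(project(full_record, ["metadata"]))            # id and metadata
print(project(full_record, ["vector", "distance"]))  # id, vector, distance
```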

## Distance Metrics

Supported distance metrics for similarity calculations:

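* **`cosine`**: Cosine distance, derived from cosine similarity (recommended for normalized vectors)
* **`euclidean`**: Euclidean distance (L2 norm)
* **`squared_euclidean`**: Squared Euclidean distance (faster than `euclidean` because it skips the square root, and it ranks results in the same order)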
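For reference, the three metrics can be written out in a few lines of plain Python. This is a minimal sketch of the standard definitions; it assumes cosine distance is computed as one minus cosine similarity, which may differ from the exact formula CyborgDB uses internally.

```python
import math


def squared_euclidean(a, b):
    """Sum of squared per-dimension differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance: the square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(a, b))


def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [0.0, 0.0]
candidates = {"near": [1.0, 1.0], "far": [3.0, 4.0]}

# Both Euclidean variants order the candidates identically
by_l2 = sorted(candidates, key=lambda name: euclidean(query, candidates[name]))
by_sq = sorted(candidates, key=lambda name: squared_euclidean(query, candidates[name]))
print(by_l2, by_sq)
```

Since the square root is monotonic, `squared_euclidean` produces the same ranking as `euclidean`; only the reported distance values differ.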
