# Types

## Location

The `Location` enum contains the supported index backing store locations for CyborgDB. These are:

```cpp theme={null}
enum class Location {
    kRedis,             // In-memory storage via Redis
    kMemory,            // Temporary in-memory storage
    kPostgres,          // Relational database storage
    kRocksDB,           // Local persistent storage via RocksDB
    kThreadSafeMemory,  // Thread-safe in-memory storage
    kNone               // Undefined storage type
};
```

<Warning>`kMemory` is deprecated and will be removed in a future release. Please use `kThreadSafeMemory` instead.</Warning>

***

## DBConfig

`DBConfig` defines the storage location for various index components.

### Constructor

```cpp theme={null}
explicit DBConfig(Location location,
                const std::optional<std::string>& table_name = std::nullopt,
                const std::optional<std::string>& db_connection_string = std::nullopt);
```

### Parameters

| Parameter              | Type                         | Description                                                                                    |
| ---------------------- | ---------------------------- | ---------------------------------------------------------------------------------------------- |
| `location`             | [`Location`](#location)      | Specifies the type of storage location.                                                        |
| `table_name`           | `std::optional<std::string>` | *(Optional)* Name of the table in the database, if applicable. Defaults to `std::nullopt`.     |
| `db_connection_string` | `std::optional<std::string>` | *(Optional)* Connection string for database access, if applicable. Defaults to `std::nullopt`. |

### Example Usage

```cpp theme={null}
cyborg::DBConfig index_loc(cyborg::Location::kRedis, std::nullopt, "redis://localhost");
cyborg::DBConfig config_loc(cyborg::Location::kRedis, std::nullopt, "redis://localhost");
cyborg::DBConfig items_loc(cyborg::Location::kPostgres, "items", "host=localhost dbname=postgres");

// RocksDB (recommended for embedded/local deployments)
cyborg::DBConfig rocksdb_loc(cyborg::Location::kRocksDB, std::nullopt, "~/.cyborgdb/data");
```

For more information, see [supported backing stores](../../intro/backing-stores).

***

## GPUConfig

`GPUConfig` is an enum that specifies which operations should use GPU acceleration. It uses bitflags that can be combined using the `|` (OR) operator.

### Enum Values

```cpp theme={null}
enum GPUConfig : uint8_t {
    kNone = 0,                        // No GPU usage
    kUpsert = 1 << 0,                 // Use GPU for upsert operations
    kTrain = 1 << 1,                  // Use GPU for training operations
    kQuery = 1 << 2,                  // Use GPU for query operations
    kAll = kUpsert | kTrain | kQuery  // Use GPU for all operations
};
```

### Example Usage

```cpp theme={null}
// Enable GPU for all operations
cyborg::GPUConfig config1 = cyborg::kAll;

// Enable GPU only for training and query
cyborg::GPUConfig config2 = cyborg::kTrain | cyborg::kQuery;

// Enable GPU only for upsert
cyborg::GPUConfig config3 = cyborg::kUpsert;

// Disable GPU completely
cyborg::GPUConfig config4 = cyborg::kNone;
```
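
To test whether a particular operation is enabled in a combined value, mask the value with the flag using `&`. The sketch below is standalone: it re-declares the enum as documented above inside a hypothetical `sketch` namespace rather than including the CyborgDB header, and `has_flag` is an illustrative helper, not a library function.

```cpp
#include <cstdint>

namespace sketch {  // hypothetical namespace; mirrors the documented enum

enum GPUConfig : std::uint8_t {
    kNone = 0,
    kUpsert = 1 << 0,
    kTrain = 1 << 1,
    kQuery = 1 << 2,
    kAll = kUpsert | kTrain | kQuery
};

// Illustrative helper (not part of CyborgDB): true if every bit of `flag`
// is set in `config`. Asking for kNone always yields false.
constexpr bool has_flag(std::uint8_t config, GPUConfig flag) {
    return flag != kNone && (config & flag) == flag;
}

}  // namespace sketch

// Compile-time checks of the bit arithmetic.
static_assert(sketch::kAll == 7, "kAll combines the three low bits");
static_assert(sketch::has_flag(sketch::kTrain | sketch::kQuery, sketch::kQuery),
              "kQuery is set in kTrain | kQuery");
static_assert(!sketch::has_flag(sketch::kTrain | sketch::kQuery, sketch::kUpsert),
              "kUpsert is not set in kTrain | kQuery");
```

Because each flag occupies its own bit, combining and testing flags never affects the other operations' settings.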

***

## DeviceConfig

The `DeviceConfig` class holds the device configuration used for vector search operations: the number of CPU threads and the GPU acceleration settings.

### Constructor

```cpp theme={null}
DeviceConfig(const int cpu_threads = 0, const GPUConfig gpu_config = kNone);
```

### Parameters

| Parameter     | Type                      | Description                                                                           |
| ------------- | ------------------------- | ------------------------------------------------------------------------------------- |
| `cpu_threads` | `int`                     | *(Optional)* Number of CPU threads to use. Defaults to `0` (use all available cores). |
| `gpu_config`  | [`GPUConfig`](#gpuconfig) | *(Optional)* GPU operations configuration. Defaults to `kNone` (no GPU).              |

### Methods

| Method                | Return Type               | Description                               |
| --------------------- | ------------------------- | ----------------------------------------- |
| `cpu_threads() const` | `int`                     | Get the number of CPU threads configured. |
| `gpu_config() const`  | [`GPUConfig`](#gpuconfig) | Get the GPU operations configuration.     |

### Example Usage

```cpp theme={null}
// 4 CPU threads, GPU enabled for training and query
cyborg::DeviceConfig device_config(4, cyborg::kTrain | cyborg::kQuery);
int threads = device_config.cpu_threads();           // Returns 4
cyborg::GPUConfig gpu = device_config.gpu_config();  // Returns kTrain | kQuery
```

***

## DistanceMetric

The `DistanceMetric` enum contains the supported distance metrics for CyborgDB. These are:

```cpp theme={null}
enum class DistanceMetric {
    Cosine,
    Euclidean,
    SquaredEuclidean
};
```
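
The enum does not spell out the formulas behind each metric. As a point of reference, the sketch below implements the standard textbook definitions. That CyborgDB computes exactly these values (in particular, that `Cosine` is reported as `1 - similarity`) is an assumption, and the function names are illustrative only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of squared component differences. Assumes a.size() == b.size().
inline float squared_euclidean(const std::vector<float>& a,
                               const std::vector<float>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Euclidean (L2) distance: square root of the squared distance.
inline float euclidean(const std::vector<float>& a,
                       const std::vector<float>& b) {
    return std::sqrt(squared_euclidean(a, b));
}

// Cosine distance, taken here as 1 - cosine similarity (an assumed convention).
// Not defined when either vector has zero length.
inline float cosine_distance(const std::vector<float>& a,
                             const std::vector<float>& b) {
    float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    return 1.0f - dot / std::sqrt(norm_a * norm_b);
}
```

Squared Euclidean distance ranks neighbors in the same order as Euclidean distance while skipping the square root.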

***

## IndexConfig

`IndexConfig` is an abstract base class for configuring index types. Four derived classes can be used to configure indexes:

### IndexIVF

Ideal for large-scale datasets where fast retrieval is prioritized over high recall:

|  Speed  | Recall | Index Size |
| :-----: | :----: | :--------: |
| Fastest | Lowest |  Smallest  |

#### Constructor

```cpp theme={null}
IndexIVF(size_t dimension = 0, std::string embedding_model = "");
```

#### Parameters

| Parameter         | Type          | Default | Description                                                           |
| ----------------- | ------------- | ------- | --------------------------------------------------------------------- |
| `dimension`       | `size_t`      | `0`     | *(Optional)* Dimensionality of vector embeddings. Auto-detected if 0. |
| `embedding_model` | `std::string` | `""`    | *(Optional)* Embedding model name for auto-generation.                |

#### Methods

| Method                       | Return Type      | Description                                                                |
| ---------------------------- | ---------------- | -------------------------------------------------------------------------- |
| `dimension()`                | `size_t`         | Get vector dimensionality.                                                 |
| `metric()`                   | `DistanceMetric` | Get distance metric.                                                       |
| `set_metric(DistanceMetric)` | `void`           | Set distance metric.                                                       |
| `n_lists()`                  | `size_t`         | Get number of inverted lists (initially 1, set during training).           |
| `set_n_lists(size_t)`        | `void`           | Set number of inverted lists (usually done automatically during training). |

### IndexIVFFlat

Suitable for applications requiring high recall with less concern for memory usage:

| Speed |  Recall | Index Size |
| :---: | :-----: | :--------: |
|  Fast | Highest |   Biggest  |

#### Constructor

```cpp theme={null}
IndexIVFFlat(size_t dimension = 0, std::string embedding_model = "");
```

#### Parameters

| Parameter         | Type          | Default | Description                                                           |
| ----------------- | ------------- | ------- | --------------------------------------------------------------------- |
| `dimension`       | `size_t`      | `0`     | *(Optional)* Dimensionality of vector embeddings. Auto-detected if 0. |
| `embedding_model` | `std::string` | `""`    | *(Optional)* Embedding model name for auto-generation.                |

#### Methods

| Method                       | Return Type      | Description                                                                |
| ---------------------------- | ---------------- | -------------------------------------------------------------------------- |
| `dimension()`                | `size_t`         | Get vector dimensionality.                                                 |
| `metric()`                   | `DistanceMetric` | Get distance metric.                                                       |
| `set_metric(DistanceMetric)` | `void`           | Set distance metric.                                                       |
| `n_lists()`                  | `size_t`         | Get number of inverted lists (initially 1, set during training).           |
| `set_n_lists(size_t)`        | `void`           | Set number of inverted lists (usually done automatically during training). |

### IndexIVFPQ

Product Quantization compresses embeddings, making it suitable for balancing memory use and recall:

| Speed | Recall | Index Size |
| :---: | :----: | :--------: |
|  Fast |  High  |   Medium   |

#### Constructor

```cpp theme={null}
IndexIVFPQ(size_t dimension = 0, size_t pq_dim = 0, size_t pq_bits = 8,
           std::string embedding_model = "");
```

#### Parameters

| Parameter         | Type          | Default | Description                                                                       |
| ----------------- | ------------- | ------- | --------------------------------------------------------------------------------- |
| `dimension`       | `size_t`      | `0`     | *(Optional)* Dimensionality of vector embeddings. Auto-detected if 0.             |
| `pq_dim`          | `size_t`      | `0`     | *(Optional)* Dimensionality of embeddings after quantization. Auto-detected if 0. |
| `pq_bits`         | `size_t`      | `8`     | *(Optional)* Number of bits per dimension for PQ embeddings (between 1 and 16).   |
| `embedding_model` | `std::string` | `""`    | *(Optional)* Embedding model name for auto-generation.                            |

#### Methods

| Method                       | Return Type      | Description                                                                |
| ---------------------------- | ---------------- | -------------------------------------------------------------------------- |
| `dimension()`                | `size_t`         | Get vector dimensionality.                                                 |
| `metric()`                   | `DistanceMetric` | Get distance metric.                                                       |
| `set_metric(DistanceMetric)` | `void`           | Set distance metric.                                                       |
| `n_lists()`                  | `size_t`         | Get number of inverted lists (initially 1, set during training).           |
| `set_n_lists(size_t)`        | `void`           | Set number of inverted lists (usually done automatically during training). |
| `pq_dim()`                   | `size_t`         | Get PQ dimensionality.                                                     |
| `pq_bits()`                  | `size_t`         | Get PQ bits per quantizer.                                                 |

### IndexIVFSQ

Scalar Quantization compresses embeddings, providing a good balance of speed, recall, and index size:

| Speed | Recall | Index Size |
| :---: | :----: | :--------: |
|  Fast |  High  |    Small   |

#### Constructor

```cpp theme={null}
IndexIVFSQ(size_t dimension = 0, size_t sq_bits = 16,
           std::string embedding_model = "");
```

#### Parameters

| Parameter         | Type          | Default | Description                                                           |
| ----------------- | ------------- | ------- | --------------------------------------------------------------------- |
| `dimension`       | `size_t`      | `0`     | *(Optional)* Dimensionality of vector embeddings. Auto-detected if 0. |
| `sq_bits`         | `size_t`      | `16`    | *(Optional)* Number of bits for scalar quantization (8 or 16).        |
| `embedding_model` | `std::string` | `""`    | *(Optional)* Embedding model name for auto-generation.                |

#### Methods

| Method                       | Return Type      | Description                                                                |
| ---------------------------- | ---------------- | -------------------------------------------------------------------------- |
| `dimension()`                | `size_t`         | Get vector dimensionality.                                                 |
| `metric()`                   | `DistanceMetric` | Get distance metric.                                                       |
| `set_metric(DistanceMetric)` | `void`           | Set distance metric.                                                       |
| `n_lists()`                  | `size_t`         | Get number of inverted lists (initially 1, set during training).           |
| `set_n_lists(size_t)`        | `void`           | Set number of inverted lists (usually done automatically during training). |
| `sq_bits()`                  | `size_t`         | Get SQ bits per dimension.                                                 |

#### Example Usage

```cpp theme={null}
// Default configuration (16-bit scalar quantization)
cyborg::IndexIVFSQ config1;

// 8-bit scalar quantization (smaller index, lower precision)
cyborg::IndexIVFSQ config2(0, 8);

// Explicit dimension and SQ bits
cyborg::IndexIVFSQ config3(128, 16);
```

***

## Array2D

The `Array2D` class template provides a 2D container for elements of type `T`. It can be initialized with a specific number of rows and columns, or from an existing vector.

### Constructors

```cpp theme={null}
Array2D(size_t rows, size_t cols, const T& initial_value = T());
Array2D(std::vector<T>&& data, size_t cols);
Array2D(const std::vector<T>& data, size_t cols);
Array2D(std::initializer_list<std::initializer_list<T>> init_list);
Array2D(Array2D&& other) noexcept;
Array2D();
```

* **`Array2D(size_t rows, size_t cols, const T& initial_value = T())`**: Creates a 2D array with specified dimensions, initialized with the given value.
* **`Array2D(std::vector<T>&& data, size_t cols)`**: Initializes the 2D array from a 1D vector (move semantics).
* **`Array2D(const std::vector<T>& data, size_t cols)`**: Initializes the 2D array from a 1D vector (copy).
* **`Array2D(std::initializer_list<std::initializer_list<T>> init_list)`**: Initializes from a nested initializer list (e.g., `{{1, 2}, {3, 4}}`).
* **`Array2D(Array2D&& other) noexcept`**: Move constructor - transfers ownership without copying.
* **`Array2D()`**: Default constructor - creates an empty array (0 rows, 0 columns).

<Note>The copy constructor is deleted. Use `Clone()` to copy an `Array2D`, or move semantics to transfer ownership.</Note>

### Access Methods

* **`operator()(size_t row, size_t col) const`**: Access an element at the specified row and column (read-only).
* **`operator()(size_t row, size_t col)`**: Access an element at the specified row and column (read-write).
* **`size_t rows() const`**: Returns the number of rows.
* **`size_t cols() const`**: Returns the number of columns.
* **`size_t size() const`**: Returns the total number of elements.

### Example Usage

```cpp theme={null}
// Converting a vector to an array
std::vector<uint8_t> vec = {0, 1, 2, 3, 4, 5, 6, 7};
cyborg::Array2D<uint8_t> arr(vec, 2);
// arr is now a 2D array of 4 rows and 2 columns, with the contents from vec

// Creating a 2D array with 3 rows and 2 columns, initialized to zero
cyborg::Array2D<int> array(3, 2, 0);

// Access and modify elements
array(0, 0) = 1;
array(0, 1) = 2;

// Printing the array
for (size_t i = 0; i < array.rows(); ++i) {
    for (size_t j = 0; j < array.cols(); ++j) {
        std::cout << array(i, j) << " ";
    }
    std::cout << std::endl;
}
```
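
The vector-based constructors take a flat buffer plus a column count, so the row count follows from `data.size() / cols`. The sketch below illustrates that arithmetic under the assumption of row-major layout (element `(r, c)` at flat index `r * cols + c`). This is consistent with the 8-element example above yielding 4 rows of 2 columns, but the layout is not confirmed by this reference. `row_count`, `flat_index`, and `element_at` are hypothetical helpers, not `Array2D` members.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: number of rows implied by a flat buffer and a column
// count. Throws if the buffer does not divide evenly into rows.
inline std::size_t row_count(std::size_t total, std::size_t cols) {
    if (cols == 0 || total % cols != 0) {
        throw std::invalid_argument("data size must be a multiple of cols");
    }
    return total / cols;
}

// Hypothetical helper: flat index of element (row, col), ASSUMING row-major order.
inline std::size_t flat_index(std::size_t row, std::size_t col, std::size_t cols) {
    return row * cols + col;
}

// Reads element (row, col) from a flat row-major buffer (bounds-checked).
template <typename T>
const T& element_at(const std::vector<T>& data, std::size_t cols,
                    std::size_t row, std::size_t col) {
    return data.at(flat_index(row, col, cols));
}
```

With `vec = {0, 1, 2, 3, 4, 5, 6, 7}` and `cols = 2`, `row_count(8, 2)` is 4 and, under the row-major assumption, element `(2, 1)` is `vec[5]`.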

***

## TrainingConfig

The `TrainingConfig` struct defines parameters for training an index, allowing control over convergence and memory usage.

### Constructor

```cpp theme={null}
TrainingConfig(std::optional<size_t> n_lists = std::nullopt,
               std::optional<size_t> batch_size = std::nullopt,
               std::optional<size_t> max_iters = std::nullopt,
               std::optional<double> tolerance = std::nullopt,
               std::optional<size_t> max_memory = std::nullopt);
```

### Parameters

| Parameter    | Type                    | Description                                                                                                       |
| ------------ | ----------------------- | ----------------------------------------------------------------------------------------------------------------- |
| `n_lists`    | `std::optional<size_t>` | *(Optional)* Number of inverted lists to create. Defaults to `std::nullopt` (auto-determined; stored as `0`).     |
| `batch_size` | `std::optional<size_t>` | *(Optional)* Size of each batch for training. Defaults to `std::nullopt` (auto-determined based on dataset size). |
| `max_iters`  | `std::optional<size_t>` | *(Optional)* Maximum iterations for training. Defaults to `std::nullopt` (uses `100`).                            |
| `tolerance`  | `std::optional<double>` | *(Optional)* Convergence tolerance for training. Defaults to `std::nullopt` (uses `1e-6`).                        |
| `max_memory` | `std::optional<size_t>` | *(Optional)* Maximum memory (MB) usage during training. Defaults to `std::nullopt` (no limit).                    |

### Struct Members

Note: The struct members are stored in this order (different from constructor parameter order):

```cpp theme={null}
size_t batch_size;   // Batch size (default: 2048)
size_t max_iters;    // Maximum iterations (default: 100)
double tolerance;    // Convergence tolerance (default: 1e-6)
size_t max_memory;   // Maximum memory in MB (default: 0, no limit)
size_t n_lists;      // Number of inverted lists (default: 0, auto-determine)
```
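
Because every constructor argument is a `std::optional`, callers can override a single setting and leave the rest unset. The sketch below shows one way unset arguments could be resolved to the defaults from the member listing above, using `value_or`. `TrainingConfigSketch` is a hypothetical stand-in written for illustration, not the library's implementation.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical stand-in for illustration; not the library's TrainingConfig.
struct TrainingConfigSketch {
    // Members in the documented storage order.
    std::size_t batch_size;
    std::size_t max_iters;
    double tolerance;
    std::size_t max_memory;
    std::size_t n_lists;

    // Parameters in the documented constructor order (n_lists first).
    // Unset arguments fall back to the defaults from the member listing.
    explicit TrainingConfigSketch(
        std::optional<std::size_t> n_lists_arg = std::nullopt,
        std::optional<std::size_t> batch_size_arg = std::nullopt,
        std::optional<std::size_t> max_iters_arg = std::nullopt,
        std::optional<double> tolerance_arg = std::nullopt,
        std::optional<std::size_t> max_memory_arg = std::nullopt)
        : batch_size(batch_size_arg.value_or(2048)),
          max_iters(max_iters_arg.value_or(100)),
          tolerance(tolerance_arg.value_or(1e-6)),
          max_memory(max_memory_arg.value_or(0)),
          n_lists(n_lists_arg.value_or(0)) {}
};
```

For example, `TrainingConfigSketch cfg(std::nullopt, std::nullopt, 50);` overrides only `max_iters`. Arguments are positional, so earlier parameters must be passed as `std::nullopt` to reach a later one.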

***

## QueryParams

The `QueryParams` struct defines parameters for querying the index, controlling the number of results and probing behavior.

### Constructor

```cpp theme={null}
explicit QueryParams(size_t top_k = 100,
                     size_t n_probes = 0,
                     std::string filters = "",
                     std::vector<ResultFields> include = {},
                     bool greedy = false);
```

### Parameters

| Parameter  | Type                        | Description                                                                                                   |
| ---------- | --------------------------- | ------------------------------------------------------------------------------------------------------------- |
| `top_k`    | `size_t`                    | *(Optional)* Number of nearest neighbors to return. Defaults to `100`.                                        |
| `n_probes` | `size_t`                    | *(Optional)* Number of lists to probe during query. Defaults to `0`, which auto-determines the probe count.   |
| `filters`  | `std::string`               | *(Optional)* A JSON string of filters to apply to vector metadata, limiting search scope to these vectors.    |
| `include`  | `std::vector<ResultFields>` | *(Optional)* List of result fields to return. Can include `kDistance` and `kMetadata`. Defaults to empty.     |
| `greedy`   | `bool`                      | *(Optional)* Whether to perform greedy search. Defaults to `false`.                                           |

Higher `n_probes` values may improve recall but can increase query latency, so choose a value based on the recall and performance you need.

<Tip>`filters` uses a subset of the [MongoDB Query and Projection Operators](https://www.mongodb.com/docs/manual/reference/operator/query/).
For instance: `filters: { "$and": [ { "label": "cat" }, { "confidence": { "$gte": 0.9 } } ] }` means that only vectors where `label == "cat"` and `confidence >= 0.9` will be considered for encrypted vector search.
For more info on metadata, see [Metadata Filtering](../guides/data-operations/metadata-filtering).</Tip>
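
Since `filters` is passed as a JSON string, a C++ raw string literal avoids having to escape the quotes. The sketch below only builds strings. `cat_filter` reproduces the filter from the Tip above, and `make_label_filter` is a hypothetical helper that performs no JSON escaping of `label`, so it is suitable only for trusted input.

```cpp
#include <string>

// The filter from the Tip above, written as a raw string literal.
inline std::string cat_filter() {
    return R"({ "$and": [ { "label": "cat" }, { "confidence": { "$gte": 0.9 } } ] })";
}

// Hypothetical helper: equality filter on the "label" metadata field.
// No escaping is performed, so `label` must not contain quotes or backslashes.
inline std::string make_label_filter(const std::string& label) {
    return R"({ "label": ")" + label + R"(" })";
}
```

Going by the constructor signature above, the resulting string would be passed as the third argument, for example `QueryParams params(10, 0, cat_filter());`.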

***

## QueryResults

The `QueryResults` class holds the results of a `Query` operation, including IDs, distances, and metadata for the nearest neighbors of each query. Results are vector-based and immutable after construction.

### Getter Methods

| Method        | Return Type                                    | Description                                                   |
| ------------- | ---------------------------------------------- | ------------------------------------------------------------- |
| `ids()`       | `const std::vector<std::vector<std::string>>&` | IDs of nearest neighbors for each query.                      |
| `distances()` | `const std::vector<std::vector<float>>&`       | Distances of nearest neighbors for each query.                |
| `metadata()`  | `const std::vector<std::vector<std::string>>&` | Metadata for nearest neighbors for each query (JSON strings). |

### Methods

| Method                               | Return Type             | Description                                                                    |
| ------------------------------------ | ----------------------- | ------------------------------------------------------------------------------ |
| `operator[](size_t query_idx) const` | `ResultView`            | Returns a read-only view of IDs, distances, and metadata for a specific query. |
| `num_results() const`                | `std::vector<uint32_t>` | Returns the actual number of results per query (may be less than `top_k`).     |
| `num_queries() const`                | `size_t`                | Returns the number of queries.                                                 |
| `empty() const`                      | `bool`                  | Checks if the results are empty.                                               |
| `Empty(size_t num_queries)`          | `QueryResults`          | Static factory method to create empty results for a given number of queries.   |

### ResultView

The `ResultView` struct provides read-only access to results for a single query:

```cpp theme={null}
struct ResultView {
    const std::vector<std::string>& ids;
    const std::vector<float>& distances;
    const std::vector<std::string>& metadata;
    const uint32_t& num_results;
};
```

### Example Usage

```cpp theme={null}
// Access results for each query
for (size_t i = 0; i < results.num_queries(); ++i) {
    auto view = results[i];
    for (uint32_t j = 0; j < view.num_results; ++j) {
        std::cout << "ID: " << view.ids[j]
                  << ", Distance: " << view.distances[j] << std::endl;
    }
}

// Access all IDs and distances directly
const auto& all_ids = results.ids();
const auto& all_distances = results.distances();

// Get actual result counts per query
auto counts = results.num_results();

// Create empty results
auto empty = QueryResults::Empty(num_queries);
```
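
Because `ResultView` holds references rather than copies, a view is only valid while the `QueryResults` object it came from is alive. The sketch below uses a hypothetical `ResultViewSketch` with the same members to show one way to copy hits into an owning structure. Whether `distances` is populated when `kDistance` is not requested is not stated in this reference, so the helper checks sizes rather than assuming. `Hit` and `copy_hits` are illustrative names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in with the same members as the documented ResultView.
struct ResultViewSketch {
    const std::vector<std::string>& ids;
    const std::vector<float>& distances;
    const std::vector<std::string>& metadata;
    const std::uint32_t& num_results;
};

// An owning copy of one hit, safe to keep after the results object is gone.
struct Hit {
    std::string id;
    std::optional<float> distance;  // empty if no distance was available
};

inline std::vector<Hit> copy_hits(const ResultViewSketch& view) {
    // Never read past the end of `ids`, even if num_results is larger.
    const std::size_t n =
        std::min<std::size_t>(view.num_results, view.ids.size());
    std::vector<Hit> hits;
    hits.reserve(n);
    for (std::size_t j = 0; j < n; ++j) {
        Hit hit{view.ids[j], std::nullopt};
        if (j < view.distances.size()) {
            hit.distance = view.distances[j];
        }
        hits.push_back(std::move(hit));
    }
    return hits;
}
```

The returned `Hit` objects own their data, so they can be stored or returned from a function after the original results have been destroyed.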

***

## ItemID

`ItemID` is a type alias for unique identifiers used throughout CyborgDB.

```cpp theme={null}
using ItemID = std::string;
```

`ItemID` uniquely identifies vectors and items within an encrypted index. It is currently implemented as `std::string` to allow flexible, human-readable identifiers.

***

## IndexType

The `IndexType` enum defines the supported index types in CyborgDB:

```cpp theme={null}
enum IndexType {
    IVF,     // Inverted File index (deprecated)
    IVFPQ,   // Inverted File with Product Quantization
    IVFFLAT, // Inverted File with flat (uncompressed) storage
    IVFSQ    // Inverted File with Scalar Quantization
};
```

All four index types are available:

* `IVF`: Fastest retrieval, lowest recall, smallest index size
* `IVFPQ`: Balanced memory usage and recall with product quantization
* `IVFFLAT`: Highest recall, largest index size, no compression
* `IVFSQ`: Good balance of recall, speed, and index size with scalar quantization (default)

<Warning>The `IVF` type is deprecated and will be removed in a future release.</Warning>

***

## Item

The `Item` struct holds an individual result from a `Get` operation, including the requested fields.

```cpp theme={null}
struct Item {
    const std::string id;                   // Item ID
    const std::vector<float> vector;        // Vector embedding
    const std::vector<uint8_t> contents;    // Decrypted contents
    const std::string metadata;             // Metadata (JSON string)
};
```

***

## ResultFields

The `ResultFields` enum specifies which fields to include in query results.

```cpp theme={null}
enum class ResultFields {
    kDistance,    // Include distance scores in query results
    kMetadata     // Include metadata in query results
};
```

***

## ItemFields

The `ItemFields` enum defines the fields that can be requested for an `Item` object.

```cpp theme={null}
enum class ItemFields {
    kVector,       // Include vector in returned items
    kMetadata,     // Include metadata in returned items
    kContents      // Include content data in returned items
};
```

The `id` field is always included in returned items, regardless of which fields are requested.
