# Configure an Encrypted Index

<Info>Index configuration is handled automatically by default. This guide shows how to override the defaults to customize index behavior and performance characteristics.</Info>

CyborgDB offers four index types, each with different trade-offs:

|            Index Type            |  Speed  |  Recall | Index Size |
| :------------------------------: | :-----: | :-----: | :--------: |
|   [`IVFSQ`](#ivfsq-index-type)   | Fastest |   High  |    Small   |
| [`IVFFlat`](#ivfflat-index-type) |   Fast  | Highest |   Biggest  |
|   [`IVFPQ`](#ivfpq-index-type)   |   Fast  |   High  |   Medium   |
|     [`IVF`](#ivf-index-type)     |   Fast  |  Lowest |  Smallest  |

Generally speaking, we recommend that you start with `IVFSQ`, the **default** index type, which provides high recall with a good balance of speed and index size. If you need the absolute highest recall, use `IVFFlat`. If minimizing index size is important, `IVFPQ` applies Product Quantization (PQ, a form of lossy compression) to the vector embeddings, which can greatly reduce index size.

## `IVFSQ` Index Type

The `IVFSQ` index type applies Scalar Quantization (SQ) to compress vector embeddings while maintaining high recall. This provides a good balance of recall, speed, and index size, making it the **default index type** in CyborgDB.

We recommend `IVFSQ` indexes for most applications, as they provide high recall rates with significantly smaller index sizes than `IVFFlat`. The default configuration uses 16-bit scalar quantization.

To create an `IVFSQ` index, you can use its configuration constructor:

<CodeGroup>
  ```python Python icon="python" theme={null}
  import cyborgdb_core as cyborgdb

  # Create the IVFSQ index config (default: 16-bit SQ)
  index_config = cyborgdb.IndexIVFSQ(sq_bits=16)

  # Create the index
  index_name = "test_index"
  index_key = bytes([0] * 32) # Set your private key here
  index = client.create_index(
      index_name=index_name,
      index_key=index_key,
      index_config=index_config)
  ```

  ```cpp C++ icon="brackets-curly" theme={null}
  #include "cyborgdb_core/client.hpp"
  #include "cyborgdb_core/encrypted_index.hpp"

  // Create the IVFSQ index config (default: 16-bit SQ)
  cyborg::IndexIVFSQ index_config(0, 16);

  // Create the index
  std::string index_name = "test_index";
  std::array<uint8_t, 32> index_key = {0}; // Set your private key here
  auto index = client.CreateIndex(index_name, index_key, index_config);
  ```
</CodeGroup>

`sq_bits` controls the precision of the scalar quantization. Higher values (e.g., 16) provide higher recall at the cost of larger index sizes. Lower values (e.g., 4 or 8) provide smaller index sizes but may reduce recall.
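
To make the trade-off concrete, here is a minimal sketch of uniform scalar quantization in plain Python. It is a generic illustration of the technique, not CyborgDB's implementation: the helper names `scalar_quantize` and `dequantize` are hypothetical, and the byte counts ignore encryption overhead and metadata.

```python
import random

def scalar_quantize(values, bits):
    """Uniformly quantize floats to `bits`-bit integer codes (illustrative only)."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]

random.seed(0)
vector = [random.uniform(-1.0, 1.0) for _ in range(128)]

for bits in (4, 8, 16):
    codes, lo, scale = scalar_quantize(vector, bits)
    restored = dequantize(codes, lo, scale)
    max_err = max(abs(a - b) for a, b in zip(vector, restored))
    size_bytes = len(vector) * bits // 8  # raw code size only
    print(f"sq_bits={bits:2d}  bytes/vector={size_bytes:4d}  max error={max_err:.6f}")
```

Each extra bit halves the quantization step, so the reconstruction error shrinks as `sq_bits` grows while the per-vector code size grows linearly.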

## `IVFFlat` Index Type

The `IVFFlat` index type improves significantly on `IVF` by storing encrypted vector embeddings in the index. In addition to selecting the closest clusters for a query vector, it can compute the exact distance between each candidate vector and the query, yielding very high recall rates (which can exceed 99%). This comes at the cost of index size and some search speed.

Use `IVFFlat` indexes when you need the absolute highest recall rates and index size is not a concern.

To create an `IVFFlat` index, you can use its configuration constructor:

<CodeGroup>
  ```python Python icon="python" theme={null}
  import cyborgdb_core as cyborgdb

  # Create the IVFFlat index config
  index_config = cyborgdb.IndexIVFFlat()

  # Create the index
  index_name = "test_index"
  index_key = bytes([0] * 32) # Set your private key here
  index = client.create_index(
      index_name=index_name,
      index_key=index_key,
      index_config=index_config)
  ```

  ```cpp C++ icon="brackets-curly" theme={null}
  #include "cyborgdb_core/client.hpp"
  #include "cyborgdb_core/encrypted_index.hpp"

  // Create the IVFFlat index config
  cyborg::IndexIVFFlat index_config;

  // Create the index
  std::string index_name = "test_index";
  std::array<uint8_t, 32> index_key = {0}; // Set your private key here
  auto index = client.CreateIndex(index_name, index_key, index_config);
  ```
</CodeGroup>
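
The two-step search described at the start of this section (select the closest clusters, then compute exact distances to the stored vectors) can be sketched in plain Python. This is a toy illustration over plaintext vectors, not CyborgDB's implementation: it ignores encryption entirely, and the function `ivf_flat_search` and its `n_probes` parameter are hypothetical names chosen for this example.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ivf_flat_search(query, centroids, clusters, n_probes=1, top_k=2):
    """Toy IVFFlat-style search (illustrative only).

    centroids: list of centroid vectors
    clusters:  clusters[i] is a list of (item_id, vector) assigned to centroid i
    """
    # Step 1: pick the n_probes clusters whose centroids are closest to the query.
    order = sorted(range(len(centroids)), key=lambda i: euclidean(query, centroids[i]))
    probed = order[:n_probes]
    # Step 2: compute exact distances to every stored vector in those clusters.
    candidates = [
        (euclidean(query, vec), item_id)
        for i in probed
        for item_id, vec in clusters[i]
    ]
    # Step 3: return the IDs of the top_k closest candidates.
    return [item_id for _, item_id in sorted(candidates)[:top_k]]

centroids = [[0.0, 0.0], [10.0, 10.0]]
clusters = [
    [("a", [0.1, 0.2]), ("b", [1.0, 1.0]), ("c", [-0.5, 0.3])],
    [("d", [9.0, 9.5]), ("e", [10.5, 10.0])],
]
print(ivf_flat_search([0.0, 0.1], centroids, clusters))  # ['a', 'c']
```

Because step 2 uses the full stored vectors rather than compressed codes, the ranking within the probed clusters is exact, which is where the higher recall and the larger index size both come from.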

## `IVFPQ` Index Type

The `IVFPQ` index type builds upon `IVFFlat` by applying Product Quantization (PQ), a form of lossy compression, to reduce the index size. When applied correctly, `IVFPQ` indexes can maintain high recall (>95%) while reducing index size significantly (2-4x).

We recommend `IVFPQ` indexes for mature applications, where the dataset and query distributions are well-established. This is because `IVFPQ` requires the most tuning to yield an ideal balance between recall and index size. It is possible to go from `IVFFlat` to `IVFPQ` on the same index, but not vice versa.

To create an `IVFPQ` index, you can use its configuration constructor:

<CodeGroup>
  ```python Python icon="python" theme={null}
  import cyborgdb_core as cyborgdb

  # Set index parameters (both are optional and have defaults)
  pq_dim = 32 # Dimension must be divisible by pq_dim (0 = auto-detect)
  pq_bits = 8 # Number of bits for each pq dimension (default: 8)

  # Create the IVFPQ index config
  index_config = cyborgdb.IndexIVFPQ(pq_dim=pq_dim, pq_bits=pq_bits)

  # Create the index
  index_name = "test_index"
  index_key = bytes([0] * 32) # Set your private key here
  index = client.create_index(
      index_name=index_name,
      index_key=index_key,
      index_config=index_config
  )
  ```

  ```cpp C++ icon="brackets-curly" theme={null}
  #include "cyborgdb_core/client.hpp"
  #include "cyborgdb_core/encrypted_index.hpp"

  // Set index parameters (both are optional and have defaults)
  size_t pq_dim = 32; // Dimension must be divisible by pq_dim (0 = auto-detect)
  size_t pq_bits = 8; // Number of bits for each pq dimension (default: 8)

  // Create the IVFPQ index config
  cyborg::IndexIVFPQ index_config(0, pq_dim, pq_bits);

  // Create the index
  std::string index_name = "test_index";
  std::array<uint8_t, 32> index_key = {0}; // Set your private key here
  auto index = client.CreateIndex(index_name, index_key, index_config);
  ```
</CodeGroup>

`pq_dim` is the number of dimensions of each vector after product quantization. It must be between `1` and `dimension`, and `dimension` must be evenly divisible by `pq_dim`. A lower `pq_dim` yields smaller index sizes but lower recall.

`pq_bits` is the number of bits that will be used to represent each dimension of the product-quantized vector embeddings. It must be between `1` and `16`, with lower values yielding smaller index sizes but lower recall.
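
The constraints on `pq_dim` and `pq_bits` can be checked before building a config. The sketch below is a hypothetical helper, not part of the CyborgDB API. It treats `pq_dim = 0` as auto-detect, following the comment in the example above. `pq_code_bits` reflects the standard PQ code size of `pq_dim * pq_bits` bits; how CyborgDB lays out its codes on disk is an assumption not covered here.

```python
def validate_pq_params(dimension: int, pq_dim: int, pq_bits: int) -> None:
    """Check the documented constraints (hypothetical helper, not a CyborgDB API)."""
    if pq_dim != 0:  # 0 is treated as "auto-detect" and skips the pq_dim checks
        if not 1 <= pq_dim <= dimension:
            raise ValueError(f"pq_dim must be between 1 and {dimension}, got {pq_dim}")
        if dimension % pq_dim != 0:
            raise ValueError(f"dimension {dimension} is not divisible by pq_dim {pq_dim}")
    if not 1 <= pq_bits <= 16:
        raise ValueError(f"pq_bits must be between 1 and 16, got {pq_bits}")

def pq_code_bits(pq_dim: int, pq_bits: int) -> int:
    """Bits in one PQ code, ignoring codebooks, encryption, and metadata."""
    return pq_dim * pq_bits

validate_pq_params(dimension=768, pq_dim=32, pq_bits=8)  # passes silently
print(pq_code_bits(32, 8))  # 256
print(pq_code_bits(16, 8))  # 128: halving pq_dim halves the code size
```

Running the validation up front turns a misconfigured `pq_dim` into an immediate, descriptive error rather than a failure later in the workflow.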

## `IVF` Index Type

<Warning>The `IVF` type is deprecated and will be removed in a future release.</Warning>

The `IVF` index type (Inverted File Index) is the simplest index type offered by CyborgDB. We recommend `IVF` indexes for applications that require high-speed, low-latency search and can tolerate lower recall (or where `top_k` is rather large, e.g., above 500).

To create an `IVF` index, you can use its configuration constructor:

<CodeGroup>
  ```python Python icon="python" theme={null}
  import cyborgdb_core as cyborgdb

  # Create the IVF index config
  index_config = cyborgdb.IndexIVF()

  # Create the index
  index_name = "test_index"
  index_key = bytes([0] * 32) # Set your private key here
  index = client.create_index(
      index_name=index_name,
      index_key=index_key,
      index_config=index_config
  )
  ```

  ```cpp C++ icon="brackets-curly" theme={null}
  #include "cyborgdb_core/client.hpp"
  #include "cyborgdb_core/encrypted_index.hpp"

  // Create the IVF index config
  cyborg::IndexIVF index_config;

  // Create the index
  std::string index_name = "test_index";
  std::array<uint8_t, 32> index_key = {0}; // Set your private key here
  auto index = client.CreateIndex(index_name, index_key, index_config);
  ```
</CodeGroup>
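
The Python examples above use an all-zero placeholder for `index_key`. As a minimal sketch, a random 32-byte key can be generated with Python's standard library. This uses only the standard `secrets` module and assumes nothing about any key-generation helper CyborgDB itself may provide. Storing and managing the key securely is left to your application.

```python
import secrets

# Generate a random 256-bit (32-byte) key from the operating system's CSPRNG.
index_key = secrets.token_bytes(32)

# The examples above pass a 32-byte value as index_key.
assert len(index_key) == 32
```

Keep the generated key somewhere safe, such as a secrets manager, so it is available again whenever you need to pass `index_key`.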

## Customizing Distance Metrics

By default, CyborgDB uses `euclidean` distance as its metric for all index types. You can override this default by specifying a metric when you create the index (the `metric` argument of `create_index` in Python, or the final argument of `CreateIndex` in C++). For example:

<CodeGroup>
  ```python Python icon="python" theme={null}
  # Existing setup ...

  index_config = cyborgdb.IndexIVFFlat()

  index = client.create_index(
      index_name="index_name", 
      index_key=index_key, 
      index_config=index_config, 
      metric="euclidean"
  )
  ```

  ```cpp C++ icon="brackets-curly" theme={null}
  // Existing setup
  cyborg::IndexIVFFlat index_config;

  auto index = client.CreateIndex("index_name", index_key, index_config, cyborg::DistanceMetric::Euclidean);
  ```
</CodeGroup>

The currently supported distance metrics are:

* `"cosine"`: Cosine similarity.
* `"euclidean"`: Euclidean distance.
* `"squared_euclidean"`: Squared Euclidean distance.
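
For reference, the three metrics can be written out in plain Python. This is a generic sketch of the standard mathematical definitions, not CyborgDB's implementation. In particular, whether CyborgDB reports cosine results as a similarity or as a distance (for example `1 - similarity`) is not specified here.

```python
import math

def squared_euclidean(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(a, b))

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a, b = [1.0, 0.0], [0.0, 2.0]
print(squared_euclidean(a, b))  # 5.0
print(euclidean(a, b))          # square root of 5
print(cosine_similarity(a, b))  # 0.0 (orthogonal vectors)
```

Squared Euclidean distance ranks neighbors in the same order as Euclidean distance but skips the square root, while cosine similarity ignores vector magnitude and compares direction only.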

## API Reference

For more information on the `IndexConfig` classes, refer to the API Reference:

<CardGroup cols={2}>
  <Card title="Python API Reference" href="../../python/types#indexconfig" icon="python">
    API reference for `IndexConfig` in Python
  </Card>

  <Card title="C++ API Reference" href="../../cpp/types#indexconfig" icon="brackets-curly">
    API reference for `IndexConfig` in C++
  </Card>
</CardGroup>
