
DBConfig

The DBConfig class specifies the storage location for the index, with options for in-memory storage or databases.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| location | string | - | DB location: "redis", "postgres", "rocksdb", "memory", or "threadsafememory" |
| table_name | string | None | (Optional) Table name (Postgres only) |
| connection_string | string | None | (Optional) Connection string used to access the DB |

The supported location options are:
  • "redis": Use for high-speed, in-memory storage (recommended for index_location)
  • "postgres": Use for reliable, SQL-based storage (recommended for config_location)
  • "rocksdb": Use for persistent, on-disk key-value storage
  • "memory": Use for temporary in-memory storage (for benchmarking and evaluation purposes)
  • "threadsafememory": Use for thread-safe in-memory storage (for multi-threaded benchmarking)

Example Usage

from cyborgdb_core import DBConfig

# Redis configuration
index_location = DBConfig(
    location="redis",
    connection_string="redis://localhost:6379"
)

# PostgreSQL configuration
config_location = DBConfig(
    location="postgres",
    table_name="config_table",
    connection_string="host=localhost dbname=vectordb user=postgres"
)

# RocksDB configuration
rocksdb_location = DBConfig(
    location="rocksdb",
    connection_string="/path/to/rocksdb"
)

# Memory configuration (for testing)
memory_location = DBConfig(location="memory")

# Thread-safe memory configuration (for multi-threaded testing)
ts_memory_location = DBConfig(location="threadsafememory")

Embeddings

The Embedded LangChain integration accepts any LangChain Embeddings implementation:

Supported Embedding Types

| Type | Description | Example |
| --- | --- | --- |
| Embeddings | Any LangChain Embeddings implementation | OpenAIEmbeddings(), HuggingFaceEmbeddings() |

Example Usage

from langchain_openai import OpenAIEmbeddings
from cyborgdb_core.integrations.langchain import CyborgVectorStore
from cyborgdb_core import DBConfig

# Using LangChain Embeddings
store = CyborgVectorStore(
    index_name="docs",
    index_key=key,
    api_key="your-api-key",
    embedding=OpenAIEmbeddings(),
    index_location=DBConfig("memory"),
    config_location=DBConfig("memory")
)

DistanceMetric

DistanceMetric is a string representing the distance metric used for the index. Options include:
  • "cosine": Cosine similarity (recommended for normalized embeddings)
  • "euclidean": Euclidean distance
  • "squared_euclidean": Squared Euclidean distance

Metric Characteristics

| Metric | Range | Best Match | Use Case |
| --- | --- | --- | --- |
| cosine | [0, 2] | 0 | Text embeddings, normalized vectors |
| euclidean | [0, ∞) | 0 | Raw feature vectors |
| squared_euclidean | [0, ∞) | 0 | When avoiding sqrt computation |
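
The formulas behind these three metrics can be sketched in plain Python. The helper names below are illustrative only (the library computes distances internally), but the math matches the ranges in the table above:

```python
import math

def cosine_distance(a, b):
    # 1 - cos(theta); ranges over [0, 2], with 0 for identical directions
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Straight-line distance; ranges over [0, inf)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def squared_euclidean_distance(a, b):
    # Same ranking as euclidean, but skips the sqrt computation
    return sum((x - y) ** 2 for x, y in zip(a, b))

a, b = [1.0, 0.0], [0.0, 1.0]
print(cosine_distance(a, b))             # orthogonal vectors -> 1.0
print(euclidean_distance(a, b))          # sqrt(2), about 1.414
print(squared_euclidean_distance(a, b))  # 2.0
```

Because squared Euclidean distance is a monotonic transform of Euclidean distance, both produce identical nearest-neighbor rankings; the squared variant simply avoids the sqrt per comparison.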

IndexType

The index type determines the algorithm used for approximate nearest neighbor search.

Available Index Types

| Type | Description | Speed | Recall | Index Size |
| --- | --- | --- | --- | --- |
| "ivfflat" | Inverted file with flat storage | Fast | Highest | Biggest |
| "ivfpq" | Inverted file with product quantization | Fast | High | Medium |
| "ivfsq" | Inverted file with scalar quantization | Fast | High | Small |

The default index type for the Embedded library is "ivfsq".

Example Usage

from langchain_openai import OpenAIEmbeddings
from cyborgdb_core.integrations.langchain import CyborgVectorStore
from cyborgdb_core import DBConfig

# IVFFlat index (highest recall)
store = CyborgVectorStore(
    index_name="high_recall_index",
    index_key=key,
    api_key="your-api-key",
    embedding=OpenAIEmbeddings(),
    index_location=DBConfig("memory"),
    config_location=DBConfig("memory"),
    index_type="ivfflat",
    index_config_params={"n_lists": 1024}
)

# IVFPQ index (balanced performance)
store = CyborgVectorStore(
    index_name="balanced_index",
    index_key=key,
    api_key="your-api-key",
    embedding=OpenAIEmbeddings(),
    index_location=DBConfig("memory"),
    config_location=DBConfig("memory"),
    index_type="ivfpq",
    index_config_params={
        "n_lists": 1024,
        "pq_dim": 64,
        "pq_bits": 8
    }
)

IndexConfigParams

Optional parameters for configuring the index, passed as a dictionary.

Parameters by Index Type

IVFFlat

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_lists | int | 1024 | Number of inverted lists (clusters) |

IVFPQ

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_lists | int | 1024 | Number of inverted lists (clusters) |
| pq_dim | int | 8 | Dimensionality after product quantization |
| pq_bits | int | 8 | Bits per quantized dimension (1-16) |

IVFSQ

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_lists | int | 1024 | Number of inverted lists (clusters) |
| sq_bits | int | 8 | Bits per scalar quantized value |

Tuning Guidelines

  • n_lists: Use √n where n is the expected number of vectors. Common values: 256, 512, 1024, 2048
  • pq_dim: Should divide the embedding dimension evenly. Lower values = more compression
  • pq_bits / sq_bits: 8 bits provides good balance. Lower = more compression, higher = better accuracy
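
The guidelines above can be encoded as short helpers. These function names are hypothetical (not part of the library); they just make the √n rule and the divisibility constraint on pq_dim concrete:

```python
import math

def suggest_n_lists(n_vectors, choices=(256, 512, 1024, 2048)):
    # Pick the common value closest to sqrt(n), per the rule of thumb above
    target = math.sqrt(n_vectors)
    return min(choices, key=lambda c: abs(c - target))

def valid_pq_dims(embedding_dim):
    # pq_dim should divide the embedding dimension evenly
    return [d for d in range(1, embedding_dim + 1) if embedding_dim % d == 0]

print(suggest_n_lists(1_000_000))  # sqrt(1e6) = 1000 -> 1024
print(valid_pq_dims(1536)[:8])     # smallest valid pq_dim values for a 1536-dim embedding
```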

Document

LangChain Document object used for storing text with metadata.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| page_content | str | The text content of the document |
| metadata | dict | Optional metadata associated with the document |

Example Usage

from langchain_core.documents import Document

# Create a document
doc = Document(
    page_content="This is the content of my document",
    metadata={
        "source": "manual",
        "author": "John Doe",
        "timestamp": "2024-01-01"
    }
)

# Add to vector store
store.add_documents([doc])

Filter Format

Metadata filters use a dictionary format for querying documents.

Simple Filters

# Exact match
filter = {"category": "technology"}

# Multiple conditions (AND)
filter = {
    "category": "technology",
    "year": 2024
}

Advanced Filters

# Range queries
filter = {
    "price": {"$gte": 100, "$lte": 500}
}

# IN queries
filter = {
    "tags": {"$in": ["python", "machine-learning"]}
}

# Nested fields
filter = {
    "metadata.author": "John Doe"
}

Supported Operators

| Operator | Description | Example |
| --- | --- | --- |
| $eq | Equal to | {"age": {"$eq": 25}} |
| $ne | Not equal to | {"status": {"$ne": "archived"}} |
| $gt | Greater than | {"price": {"$gt": 100}} |
| $gte | Greater than or equal | {"score": {"$gte": 0.8}} |
| $lt | Less than | {"quantity": {"$lt": 10}} |
| $lte | Less than or equal | {"rating": {"$lte": 5}} |
| $in | In array | {"tags": {"$in": ["ai", "ml"]}} |
| $nin | Not in array | {"category": {"$nin": ["draft", "deleted"]}} |
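
To make the operator semantics concrete, here is a minimal pure-Python matcher that mirrors this filter format. It is an illustration only, not the library's implementation, and it omits nested "metadata.author"-style paths:

```python
OPS = {
    "$eq":  lambda v, arg: v == arg,
    "$ne":  lambda v, arg: v != arg,
    "$gt":  lambda v, arg: v > arg,
    "$gte": lambda v, arg: v >= arg,
    "$lt":  lambda v, arg: v < arg,
    "$lte": lambda v, arg: v <= arg,
    "$in":  lambda v, arg: v in arg,
    "$nin": lambda v, arg: v not in arg,
}

def matches(metadata, flt):
    # All top-level conditions are ANDed together
    for field, cond in flt.items():
        value = metadata.get(field)
        if isinstance(cond, dict):
            # Operator form, e.g. {"price": {"$gte": 100, "$lte": 500}}
            if not all(OPS[op](value, arg) for op, arg in cond.items()):
                return False
        elif value != cond:
            # Bare value means exact match
            return False
    return True

doc = {"category": "technology", "price": 250, "year": 2024}
print(matches(doc, {"category": "technology", "year": 2024}))      # True
print(matches(doc, {"price": {"$gte": 100, "$lte": 500}}))         # True
print(matches(doc, {"category": {"$nin": ["draft", "deleted"]}}))  # True
```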

Return Types

Query Results

Query operations return documents with optional scores:
# similarity_search returns List[Document]
docs = store.similarity_search("query", k=5)
# Returns: [Document(...), Document(...), ...]

# similarity_search_with_score returns List[Tuple[Document, float]]
results = store.similarity_search_with_score("query", k=5)
# Returns: [(Document(...), 0.95), (Document(...), 0.87), ...]

Score Normalization

Scores are normalized to the [0, 1] range, where:
  • 1.0 = Perfect match
  • 0.0 = Worst match
The normalization depends on the distance metric used.
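
As an illustration of one plausible mapping (not necessarily the library's exact formula), a cosine distance in [0, 2] can be linearly rescaled to a [0, 1] similarity score:

```python
def cosine_score(distance):
    # Cosine distance 0 (identical direction) -> score 1.0
    # Cosine distance 2 (opposite direction)  -> score 0.0
    return 1.0 - distance / 2.0

print(cosine_score(0.0))  # 1.0 (perfect match)
print(cosine_score(2.0))  # 0.0 (worst match)
print(cosine_score(0.1))  # 0.95
```

Unbounded metrics such as Euclidean distance require a different rescaling, which is why the normalization depends on the chosen metric.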

Async Support

All methods have async variants prefixed with "a":

| Sync Method | Async Method |
| --- | --- |
| add_texts | aadd_texts |
| add_documents | aadd_documents |
| similarity_search | asimilarity_search |
| similarity_search_with_score | asimilarity_search_with_score |
| similarity_search_by_vector | asimilarity_search_by_vector |
| max_marginal_relevance_search | amax_marginal_relevance_search |
| delete | adelete |

Example Usage

import asyncio

async def main():
    # Async text addition
    ids = await store.aadd_texts(["async text 1", "async text 2"])

    # Async search
    docs = await store.asimilarity_search("query", k=5)

    # Async deletion
    success = await store.adelete(ids)

asyncio.run(main())