- Encrypted index tokens
- Index centroid salted hashes
- Index payload (embedding) encryption
- Item content encryptions
- Encrypted query tokens
The index_key is a 256-bit (32-byte) symmetric Key-Encryption-Key (KEK), the same kind of key you would use with AES-256 encryption. CyborgDB wraps a per-index data key (DEK) under this KEK internally and encrypts all index data with that DEK using AES-256-GCM; callers only ever handle the KEK. CyborgDB’s cryptography is based entirely on well-established cryptographic standards, including AES-256-GCM, HMAC, and SHA-3 (Keccak).
Without the correct index_key, it is impossible to use a CyborgDB Encrypted Index. You cannot upsert vectors, query the index, or even delete it. Hence, it is critical to manage these encryption keys safely.
For multi-tenant scenarios, you can mint per-user keys with scoped read/write permissions instead of sharing the root KEK — see Access Control (RBAC). To persist an external-KMS wrapping envelope for the KEK, see External Key Management (KMS).
Generating Key Locally (for Development)
For local development and evaluation, you can generate a 256-bit encryption key locally and use it in your calls to CyborgDB. Here’s how you can do this using OpenSSL and integrate it with CyborgDB in Python. The index_key must be exactly 32 raw bytes. The simplest way to generate one is with the language’s cryptographic RNG.
Generate a 32-byte key
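A minimal Python sketch of this step, using the standard library's `secrets` module as the cryptographic RNG. The helper names (`generate_index_key`, `save_index_key`, `load_index_key`) are hypothetical and not part of CyborgDB; the file name index_key.bin is only a default.

```python
import os
import secrets

KEY_SIZE_BYTES = 32  # 256 bits, the size required for index_key


def generate_index_key() -> bytes:
    """Return 32 random bytes from the operating system's CSPRNG."""
    return secrets.token_bytes(KEY_SIZE_BYTES)


def save_index_key(key: bytes, path: str = "index_key.bin") -> None:
    """Write the raw key bytes to a new file (hypothetical helper)."""
    if len(key) != KEY_SIZE_BYTES:
        raise ValueError(f"index_key must be exactly {KEY_SIZE_BYTES} bytes")
    # O_EXCL refuses to overwrite an existing key file.
    # Mode 0o600 restricts access to the owner on POSIX systems.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key)


def load_index_key(path: str = "index_key.bin") -> bytes:
    """Read the raw key bytes back and check the length."""
    with open(path, "rb") as f:
        key = f.read()
    if len(key) != KEY_SIZE_BYTES:
        raise ValueError(f"index_key must be exactly {KEY_SIZE_BYTES} bytes")
    return key


# Generate a fresh key in memory; pass these bytes as index_key.
index_key = generate_index_key()
```

Calling `save_index_key(index_key)` once and `load_index_key()` on later runs keeps the same key across restarts, which matters because an index cannot be opened with a different key.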
If you save the key to a file such as index_key.bin, ensure that this file is kept secure and not included in your source control.
Using Key Management Services (for Production)
For production environments, a Key Management Service (KMS) or Hardware Security Module (HSM) is strongly recommended to securely store and manage encryption keys. This ensures the highest level of security by isolating the keys from application logic and providing robust access controls.
Overview of KMS
A KMS is a managed service that provides centralized control over encryption keys, including:
- Key Generation: Automatically generating keys with required levels of entropy.
- Secure Storage: Storing keys in a tamper-proof environment.
- Access Control: Enforcing strict role-based access control (RBAC) policies.
- Auditing: Logging key usage for compliance and monitoring.
Supported KMS Providers
CyborgDB can integrate with popular KMS providers, such as:
- AWS Key Management Service (AWS KMS)
- Google Cloud Key Management (Google Cloud KMS)
- Azure Key Vault
- HashiCorp Vault
Step-by-Step Example: AWS KMS
Generate a Data Key Using AWS KMS
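Option A: Command Line
Use the AWS CLI to generate a data key (aws kms generate-data-key --key-id <Your-Key-Id> --key-spec AES_256):
- Replace <Your-Key-Id> with your KMS Key ID or ARN.
- The command outputs a JSON object containing both the plaintext key and the encrypted key.
Note: The Plaintext and CiphertextBlob are Base64-encoded binary data.
Option B: Python
Call generate_data_key on a boto3 KMS client with the same key ID and the AES_256 key spec. Unlike the CLI, boto3 returns Plaintext and CiphertextBlob as raw bytes.
Store the Encrypted Key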
- Encrypted Key: Since it’s encrypted, you can safely store it in your application’s configuration file, environment variable, or secure parameter store.
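The generate-and-store steps above can be sketched in Python as follows. This assumes a boto3 KMS client is passed in; the function name `generate_wrapped_index_key` is hypothetical, while `generate_data_key` with `KeyId` and `KeySpec="AES_256"` is the boto3 KMS call.

```python
import base64
from typing import Tuple


def generate_wrapped_index_key(kms_client, key_id: str) -> Tuple[bytes, str]:
    """Request a new 256-bit data key from KMS (hypothetical helper).

    Returns (plaintext_key, encrypted_key_b64):
      - plaintext_key: 32 raw bytes, to be kept in memory only
      - encrypted_key_b64: Base64 text of the KMS-encrypted key, safe to store
    """
    response = kms_client.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    plaintext_key = response["Plaintext"]
    # boto3 returns raw bytes; Base64-encode so the blob fits in a config
    # file, environment variable, or parameter store.
    encrypted_key_b64 = base64.b64encode(response["CiphertextBlob"]).decode("ascii")
    return plaintext_key, encrypted_key_b64


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   kms_client = boto3.client("kms")
#   index_key, encrypted_key_b64 = generate_wrapped_index_key(
#       kms_client, "<Your-Key-Id>"
#   )
#   # Store encrypted_key_b64; never write index_key to disk.
```

Taking the client as a parameter keeps the helper easy to test with a stub and leaves region and credential configuration to the caller.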
Use the Encrypted Key in Your Application at Runtime
In your application code, decrypt the encrypted key at runtime using AWS KMS.
Explanation:
- Encrypted Key Storage: The encrypted key (encrypted_key_b64) is safe to store in configuration files since it’s encrypted.
- Runtime Decryption: The kms_client.decrypt method securely retrieves the plaintext key at runtime.
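A minimal sketch of the runtime decryption described above, assuming a boto3 KMS client and a symmetric KMS key. The function name `decrypt_index_key` and the environment variable `CYBORGDB_ENCRYPTED_KEY_B64` are hypothetical; `encrypted_key_b64` and `kms_client.decrypt` follow the names used in the explanation.

```python
import base64


def decrypt_index_key(kms_client, encrypted_key_b64: str) -> bytes:
    """Decrypt the stored key with KMS and return 32 raw bytes (hypothetical helper)."""
    ciphertext_blob = base64.b64decode(encrypted_key_b64)
    # For symmetric KMS keys, the key ID is embedded in the ciphertext,
    # so only the blob has to be supplied.
    response = kms_client.decrypt(CiphertextBlob=ciphertext_blob)
    index_key = response["Plaintext"]
    if len(index_key) != 32:
        raise ValueError("decrypted index_key must be exactly 32 bytes")
    return index_key


# Example usage (requires boto3 and AWS credentials):
#   import os
#   import boto3
#   kms_client = boto3.client("kms")
#   # Hypothetical variable name; use wherever the encrypted key is stored.
#   encrypted_key_b64 = os.environ["CYBORGDB_ENCRYPTED_KEY_B64"]
#   index_key = decrypt_index_key(kms_client, encrypted_key_b64)
#   # index_key now lives only in memory; pass it to your CyborgDB calls.
```

The length check catches a mismatched or corrupted stored value early, before the key is used against an index.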
Benefits of This Approach
- Security: The plaintext key is never stored persistently. It’s only available in memory during runtime.
- Simplicity: You can generate and manage the key using AWS KMS and standard AWS tools.
- No Need for Secure Storage: Since the encrypted key is safe to store in your application’s configuration, you don’t need additional secure storage solutions.
CyborgDB can also persist a per-index KMS envelope describing how the KEK is wrapped, so a service layer can unwrap it at request time. See External Key Management (KMS) for details on KMSBlob and the create_index_kms / push_index_kms / get_index_kms / delete_index_kms functions.