LanceDB

LanceDB is a serverless, embedded vector database — it runs in-process with no separate server. Data is stored in the Lance columnar format on local disk or in cloud object storage (S3, GCS, Azure Blob).

Setup

Add the feature flag to your Cargo.toml:

[dependencies]
synaptic = { version = "0.4", features = ["lancedb"] }

No Docker container or external service is required.

Dependency Note

The lancedb crate (>= 0.20) has transitive dependencies that require Rust >= 1.91, which is newer than the MSRV of 1.88. For now, the synaptic-rag crate (with the lancedb feature enabled) ships a pure-Rust in-memory backend that implements the full VectorStore interface, so your application compiles and its tests run today. Once the toolchain requirement aligns, the implementation will be upgraded to use native Lance on-disk storage.
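For intuition about what an in-memory vector backend has to do, the following is a minimal sketch of a brute-force similarity search. It is not synaptic's implementation: the names `InMemoryStore` and `cosine_similarity` are hypothetical, and the choice of cosine similarity as the metric is an assumption made for illustration.

```rust
/// Hypothetical in-memory store: each entry pairs a document id with its embedding.
pub struct InMemoryStore {
    entries: Vec<(String, Vec<f32>)>,
}

impl InMemoryStore {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Store an embedding under the given id.
    pub fn add(&mut self, id: &str, vector: Vec<f32>) {
        self.entries.push((id.to_string(), vector));
    }

    /// Return the ids of the `k` entries most similar to `query`, best first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<String> {
        let mut scored: Vec<(f32, &String)> = self
            .entries
            .iter()
            .map(|(id, v)| (cosine_similarity(query, v), id))
            .collect();
        // Sort by descending similarity; total_cmp gives a total order over f32.
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        scored.into_iter().take(k).map(|(_, id)| id.clone()).collect()
    }
}

/// Cosine similarity; returns 0.0 when either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

A linear scan like this is adequate for tests and small datasets. Anything held only in memory is lost when the process exits.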

Usage

use synaptic::lancedb::{LanceDbConfig, LanceDbVectorStore};
use synaptic::core::VectorStore;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Arguments: storage URI (a local path here), table name, vector dimension
    let config = LanceDbConfig::new("/var/lib/myapp/vectors", "documents", 1536);
    let store = LanceDbVectorStore::new(config).await?;

    // Add documents
    // store.add_documents(docs, &embeddings).await?;

    // Search
    // let results = store.similarity_search("query text", 5, &embeddings).await?;

    Ok(())
}

Cloud Storage

Once the native lancedb backend is available, S3-backed storage will work by passing an S3 URI as the storage path:

let config = LanceDbConfig::new("s3://my-bucket/vectors", "documents", 1536);
let store = LanceDbVectorStore::new(config).await?;
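Since the same `uri` field carries both local paths and cloud locations, an application may want to know which kind it has been handed, for example to log a warning when a cloud URI is used with the in-memory backend. The sketch below is a hypothetical helper, not part of synaptic. Only the `s3://` scheme appears in this guide; the `gs://` and `az://` prefixes are assumptions based on upstream LanceDB conventions.

```rust
/// Hypothetical classification of a storage URI by its scheme prefix.
#[derive(Debug, PartialEq, Eq)]
pub enum StorageKind {
    Local,
    S3,
    Gcs,
    Azure,
}

/// Inspect the URI prefix; anything without a known cloud scheme is treated as local.
pub fn storage_kind(uri: &str) -> StorageKind {
    if uri.starts_with("s3://") {
        StorageKind::S3
    } else if uri.starts_with("gs://") {
        // Assumed scheme for Google Cloud Storage.
        StorageKind::Gcs
    } else if uri.starts_with("az://") {
        // Assumed scheme for Azure Blob Storage.
        StorageKind::Azure
    } else {
        StorageKind::Local
    }
}
```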

Configuration

| Field | Type | Description |
|---|---|---|
| uri | String | Storage path: local (/data/mydb) or cloud (s3://bucket/path) |
| table_name | String | Table name within the database |
| dim | usize | Vector dimension; must match your embedding model |
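Because `dim` must match the embedding model, it can help to check vector lengths before handing them to the store. The following is a minimal sketch under stated assumptions: `StoreConfig` is a hypothetical stand-in that mirrors the three documented fields, and `check_embedding` is an illustrative helper. Neither is part of the synaptic API.

```rust
/// Hypothetical stand-in mirroring the three documented configuration fields.
#[derive(Debug, Clone)]
pub struct StoreConfig {
    pub uri: String,
    pub table_name: String,
    pub dim: usize,
}

impl StoreConfig {
    pub fn new(uri: &str, table_name: &str, dim: usize) -> Self {
        Self {
            uri: uri.to_string(),
            table_name: table_name.to_string(),
            dim,
        }
    }

    /// Check that an embedding has exactly the configured number of dimensions.
    pub fn check_embedding(&self, embedding: &[f32]) -> Result<(), String> {
        if embedding.len() == self.dim {
            Ok(())
        } else {
            Err(format!(
                "table '{}' expects {} dimensions, got {}",
                self.table_name,
                self.dim,
                embedding.len()
            ))
        }
    }
}
```

With a 1536-dimensional configuration, for instance, a 768-dimensional vector from a different embedding model would be rejected up front instead of producing a confusing failure later.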

Advantages

  • No server required — runs entirely in-process
  • Versioned — Lance format supports time-travel queries
  • Cloud-native — S3/GCS/Azure Blob backed storage without an intermediary service
  • High throughput — columnar format optimised for scan-heavy vector workloads