Integrations

Synaptic uses a provider-centric architecture for external service integrations. Each integration lives in its own crate, depends only on synaptic-core (plus any provider SDK), and implements one or more core traits.

Architecture

synaptic-core (defines traits)
  ├── synaptic-openai          (ChatModel + Embeddings)
  ├── synaptic-anthropic       (ChatModel)
  ├── synaptic-gemini          (ChatModel)
  ├── synaptic-ollama          (ChatModel + Embeddings)
  ├── synaptic-bedrock         (ChatModel)
  ├── synaptic-groq            (ChatModel — OpenAI-compatible, LPU)
  ├── synaptic-mistral         (ChatModel — OpenAI-compatible)
  ├── synaptic-deepseek        (ChatModel — OpenAI-compatible)
  ├── synaptic-cohere          (DocumentCompressor + Embeddings)
  ├── synaptic-huggingface     (Embeddings)
  ├── synaptic-qdrant          (VectorStore)
  ├── synaptic-pgvector        (VectorStore + Checkpointer)
  ├── synaptic-pinecone        (VectorStore)
  ├── synaptic-chroma          (VectorStore)
  ├── synaptic-mongodb         (VectorStore)
  ├── synaptic-elasticsearch   (VectorStore)
  ├── synaptic-weaviate        (VectorStore)
  ├── synaptic-redis           (Store + LlmCache + Checkpointer)
  ├── synaptic-sqlite          (LlmCache)
  ├── synaptic-pdf             (Loader)
  ├── synaptic-tavily          (Tool)
  └── synaptic-sqltoolkit      (Tool×3: ListTables, DescribeTable, ExecuteQuery)

All integration crates share a common pattern:

  1. Core traits — ChatModel, Embeddings, VectorStore, Store, LlmCache, Loader are defined in synaptic-core
  2. Independent crates — Each integration is a separate crate with its own feature flag
  3. Zero coupling — Integration crates never depend on each other
  4. Config structs — Builder-pattern configuration with new() + with_*() methods
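
To illustrate the config pattern: required fields go through new(), optional settings through chained with_*() calls. The specific with_* methods below are hypothetical placeholders, not a guaranteed API — check each provider crate for the options it actually exposes:

use synaptic::openai::OpenAiConfig;

// Required fields via new(), optional settings via chained with_*() builders.
// The two with_* calls below are illustrative examples of the pattern only.
let config = OpenAiConfig::new("sk-...", "gpt-4o")
    .with_base_url("https://gateway.example.com/v1")
    .with_timeout_secs(30);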

Core Traits

Trait              | Purpose                      | Crate Implementations
ChatModel          | LLM chat completion          | openai, anthropic, gemini, ollama, bedrock, groq, mistral, deepseek
Embeddings         | Text embedding vectors       | openai, ollama, cohere, huggingface
VectorStore        | Vector similarity search     | qdrant, pgvector, pinecone, chroma, mongodb, elasticsearch, weaviate, (+ in-memory)
Store              | Key-value storage            | redis, (+ in-memory)
LlmCache           | LLM response caching         | redis, sqlite, (+ in-memory)
Checkpointer       | Graph state persistence      | redis, pgvector
Loader             | Document loading             | pdf, (+ text, json, csv, directory)
DocumentCompressor | Document reranking/filtering | cohere, (+ embeddings filter)
Tool               | Agent tool                   | tavily, sqltoolkit (3 tools), duckduckgo, wikipedia, (+ custom tools)

LLM Provider Pattern

All LLM providers follow the same pattern — a config struct, a model struct, and a ProviderBackend for HTTP transport:

use std::sync::Arc;
use synaptic::openai::{OpenAiChatModel, OpenAiConfig};
use synaptic::models::{HttpBackend, FakeBackend};

// Production: real HTTP transport
let config = OpenAiConfig::new("sk-...", "gpt-4o");
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));

// Testing: deterministic fake transport, no network calls
let config = OpenAiConfig::new("sk-...", "gpt-4o");
let model = OpenAiChatModel::new(config, Arc::new(FakeBackend::with_responses(vec![...])));

The ProviderBackend abstraction (in synaptic-models) enables:

  • HttpBackend — real HTTP calls in production
  • FakeBackend — deterministic responses in tests
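
A test-setup sketch of the same model wired to FakeBackend instead of HttpBackend; the canned responses stay elided as in the snippet above, and the #[tokio::test] attribute assumes a Tokio test runtime:

use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::FakeBackend;
use synaptic::openai::{OpenAiChatModel, OpenAiConfig};

#[tokio::test]
async fn chats_without_touching_the_network() {
    // Same construction as production; only the backend differs.
    let config = OpenAiConfig::new("test-key", "gpt-4o");
    let backend = Arc::new(FakeBackend::with_responses(vec![/* canned responses */]));
    let model = OpenAiChatModel::new(config, backend);

    let request = ChatRequest::new(vec![Message::human("ping")]);
    // Returns the next canned response deterministically; no HTTP involved.
    let _reply = model.chat(&request).await;
}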

Storage & Retrieval Pattern

Vector stores, key-value stores, and caches implement core traits that allow drop-in replacement:

// Swap InMemoryVectorStore for QdrantVectorStore — same trait interface
use synaptic::qdrant::{QdrantVectorStore, QdrantConfig};

// `docs` and `embeddings` are assumed to be constructed already
// (see the complete pipeline example further down this page).
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config);
store.add_documents(docs, &embeddings).await?;
let results = store.similarity_search("query", 5, &embeddings).await?;
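
Because every concrete store implements the VectorStore trait, application code can stay store-agnostic. A sketch under the method shapes shown above — the Document type, the anyhow error handling, and the exact trait bounds are assumptions; the real signatures live in synaptic-core:

use synaptic::core::{Document, Embeddings, VectorStore};

// Generic over any store + embeddings pair: the same function works with
// Qdrant, pgvector, Pinecone, or the in-memory store. Method shapes follow
// the snippet above and are illustrative, not authoritative.
async fn index_and_search(
    store: &impl VectorStore,
    embeddings: &impl Embeddings,
    docs: Vec<Document>,
    query: &str,
) -> anyhow::Result<Vec<Document>> {
    store.add_documents(docs, embeddings).await?;
    Ok(store.similarity_search(query, 5, embeddings).await?)
}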

Feature Flags

Each integration has its own feature flag in the synaptic facade crate:

[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant"] }

Feature       | Integration
openai        | OpenAI ChatModel + Embeddings (+ OpenAI-compatible providers + Azure)
anthropic     | Anthropic ChatModel
gemini        | Google Gemini ChatModel
ollama        | Ollama ChatModel + Embeddings
bedrock       | AWS Bedrock ChatModel
groq          | Groq ChatModel (ultra-fast LPU inference, OpenAI-compatible)
mistral       | Mistral ChatModel (OpenAI-compatible)
deepseek      | DeepSeek ChatModel (cost-efficient reasoning, OpenAI-compatible)
cohere        | Cohere Reranker + Embeddings
huggingface   | HuggingFace Inference API Embeddings
qdrant        | Qdrant vector store
pgvector      | PostgreSQL pgvector store + graph checkpointer
pinecone      | Pinecone vector store
chroma        | Chroma vector store
mongodb       | MongoDB Atlas vector search
elasticsearch | Elasticsearch vector store
weaviate      | Weaviate vector store
redis         | Redis store + cache + graph checkpointer
sqlite        | SQLite LLM cache
pdf           | PDF document loader
tavily        | Tavily search tool
sqltoolkit    | SQL database toolkit (ListTables, DescribeTable, ExecuteQuery)

Convenience combinations: models (all LLM providers), agent (includes openai + graph), rag (includes openai + retrieval stack), full (everything).
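
For example, a retrieval-augmented app can enable the whole stack through the rag combination instead of listing individual features:

[dependencies]
synaptic = { version = "0.3", features = ["rag"] }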

Provider Selection Guide

Choose a provider based on your requirements:

Provider    | Auth                  | Streaming | Tool Calling | Embeddings | Best For
OpenAI      | API key (header)      | SSE       | Yes          | Yes        | General-purpose, widest model selection
Anthropic   | API key (x-api-key)   | SSE       | Yes          | No         | Long context, reasoning tasks
Gemini      | API key (query param) | SSE       | Yes          | No         | Google ecosystem, multimodal
Ollama      | None (local)          | NDJSON    | Yes          | Yes        | Privacy-sensitive, offline, development
Bedrock     | AWS IAM               | AWS SDK   | Yes          | No         | Enterprise AWS environments
Groq        | API key (header)      | SSE       | Yes          | No         | Ultra-fast inference (LPU), latency-critical
Mistral     | API key (header)      | SSE       | Yes          | No         | EU compliance, cost-efficient tool calling
DeepSeek    | API key (header)      | SSE       | Yes          | No         | Cost-efficient reasoning (90%+ cheaper)
Cohere      | API key (header)      | N/A       | N/A          | Yes        | Reranking + production-grade embeddings
HuggingFace | API key (optional)    | N/A       | N/A          | Yes        | Open-source sentence-transformers

Deciding factors:

  • Privacy & compliance — Ollama runs entirely locally; Bedrock keeps data within AWS
  • Cost — Ollama is free; OpenAI-compatible providers (Groq, DeepSeek) offer competitive pricing
  • Latency — Ollama has no network round-trip; Groq is optimized for speed
  • Ecosystem — OpenAI has the most third-party integrations; Bedrock integrates with AWS services
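
Whichever provider you pick, application code can be written once against the ChatModel trait so the choice stays swappable. A sketch using the chat API as it appears elsewhere on this page; the anyhow error handling and the exact return type of content() are assumptions:

use synaptic::core::{ChatModel, ChatRequest, Message};

// Provider-agnostic: works with any ChatModel implementation
// (OpenAI, Anthropic, Ollama, Groq, and so on).
async fn summarize(model: &impl ChatModel, text: &str) -> anyhow::Result<String> {
    let request = ChatRequest::new(vec![
        Message::system("Summarize the following text in one sentence."),
        Message::human(text),
    ]);
    let response = model.chat(&request).await?;
    Ok(response.message.content().unwrap_or_default().to_string())
}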

Vector Store Selection Guide

Store         | Deployment           | Managed              | Filtering              | Scaling     | Best For
Qdrant        | Self-hosted / Cloud  | Yes (Qdrant Cloud)   | Rich (payload filters) | Horizontal  | General-purpose, production
pgvector      | Self-hosted          | Via managed Postgres | SQL WHERE clauses      | Vertical    | Teams already using PostgreSQL
Pinecone      | Fully managed        | Yes                  | Metadata filters       | Automatic   | Zero-ops, rapid prototyping
Chroma        | Self-hosted / Docker | No                   | Metadata filters       | Single node | Development, small-medium datasets
MongoDB Atlas | Fully managed        | Yes                  | MQL filters            | Automatic   | Teams already using MongoDB
Elasticsearch | Self-hosted / Cloud  | Yes (Elastic Cloud)  | Full query DSL         | Horizontal  | Hybrid text + vector search
Weaviate      | Self-hosted / Cloud  | Yes (WCS)            | GraphQL filters        | Horizontal  | Multi-tenancy, hybrid search
InMemory      | In-process           | N/A                  | None                   | N/A         | Testing, prototyping

Deciding factors:

  • Existing infrastructure — Use pgvector if you have PostgreSQL, MongoDB Atlas if you use MongoDB, Elasticsearch if you already run an ES cluster
  • Operational complexity — Pinecone and MongoDB Atlas are fully managed; Qdrant and Elasticsearch require cluster management
  • Query capabilities — Elasticsearch excels at hybrid text + vector queries; Qdrant has the richest filtering
  • Cost — InMemory and Chroma are free; pgvector reuses existing database infrastructure

Cache Selection Guide

Cache    | Persistence              | Deployment      | TTL Support | Best For
InMemory | No (process lifetime)    | In-process      | Yes         | Testing, single-process apps
Redis    | Yes (configurable)       | External server | Yes         | Multi-process, distributed
SQLite   | Yes (file-based)         | In-process      | Yes         | Single-machine persistence
Semantic | Depends on backing store | In-process      | No          | Fuzzy-match caching

Complete RAG Pipeline Example

This example combines multiple integrations into a full retrieval-augmented generation pipeline with caching and reranking:

use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings};
use synaptic::openai::{OpenAiChatModel, OpenAiConfig, OpenAiEmbeddings};
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::cohere::{CohereReranker, CohereConfig};
use synaptic::cache::{CachedChatModel, InMemoryCache};
use synaptic::retrieval::ContextualCompressionRetriever;
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;

let backend = Arc::new(HttpBackend::new());

// 1. Set up embeddings
let embeddings = Arc::new(OpenAiEmbeddings::new(
    OpenAiEmbeddings::config("text-embedding-3-small"),
    backend.clone(),
));

// 2. Ingest documents into Qdrant
let loader = TextLoader::new("knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;

let qdrant_config = QdrantConfig::new("http://localhost:6334", "knowledge", 1536);
let store = QdrantVectorStore::new(qdrant_config, embeddings.clone()).await?;
store.add_documents(&chunks).await?;

// 3. Build retriever with Cohere reranking
let base_retriever = Arc::new(VectorStoreRetriever::new(Arc::new(store)));
let reranker = CohereReranker::new(CohereConfig::new(std::env::var("COHERE_API_KEY")?));
let retriever = ContextualCompressionRetriever::new(base_retriever, Arc::new(reranker));

// 4. Wrap the LLM with a cache
let llm_config = OpenAiConfig::new(std::env::var("OPENAI_API_KEY")?, "gpt-4o");
let base_model = OpenAiChatModel::new(llm_config, backend.clone());
let cache = Arc::new(InMemoryCache::new());
let model = CachedChatModel::new(Arc::new(base_model), cache);

// 5. Retrieve and generate
let relevant = retriever.retrieve("How does Synaptic handle streaming?").await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");

let request = ChatRequest::new(vec![
    Message::system(&format!("Answer based on the following context:\n\n{context}")),
    Message::human("How does Synaptic handle streaming?"),
]);
let response = model.chat(&request).await?;
println!("{}", response.message.content().unwrap_or_default());

This pipeline demonstrates:

  • Qdrant for vector storage and retrieval
  • Cohere for reranking retrieved documents
  • InMemoryCache for caching LLM responses (swap with Redis/SQLite for persistence)
  • OpenAI for both embeddings and chat completion

Adding a New Integration

To add a new integration:

  1. Create a new crate synaptic-{name} in crates/
  2. Depend on synaptic-core for trait definitions
  3. Implement the appropriate trait(s)
  4. Add a feature flag in the synaptic facade crate
  5. Re-export via pub use synaptic_{name} as {name} in the facade lib.rs
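
For steps 4 and 5, the facade re-export is typically gated behind the new feature flag. A sketch using a hypothetical synaptic-foo crate:

// In the synaptic facade's lib.rs (hypothetical "foo" integration):
#[cfg(feature = "foo")]
pub use synaptic_foo as foo;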

See Also