# Integrations

Synaptic uses a provider-centric architecture for external service integrations. Each integration lives in its own crate, depends only on `synaptic-core` (plus any provider SDK), and implements one or more core traits.
## Architecture

```
synaptic-core (defines traits)
├── synaptic-openai (ChatModel + Embeddings)
├── synaptic-anthropic (ChatModel)
├── synaptic-gemini (ChatModel)
├── synaptic-ollama (ChatModel + Embeddings)
├── synaptic-bedrock (ChatModel)
├── synaptic-groq (ChatModel — OpenAI-compatible, LPU)
├── synaptic-mistral (ChatModel — OpenAI-compatible)
├── synaptic-deepseek (ChatModel — OpenAI-compatible)
├── synaptic-cohere (DocumentCompressor + Embeddings)
├── synaptic-huggingface (Embeddings)
├── synaptic-qdrant (VectorStore)
├── synaptic-pgvector (VectorStore + Checkpointer)
├── synaptic-pinecone (VectorStore)
├── synaptic-chroma (VectorStore)
├── synaptic-mongodb (VectorStore)
├── synaptic-elasticsearch (VectorStore)
├── synaptic-weaviate (VectorStore)
├── synaptic-redis (Store + LlmCache + Checkpointer)
├── synaptic-sqlite (LlmCache)
├── synaptic-pdf (Loader)
├── synaptic-tavily (Tool)
└── synaptic-sqltoolkit (Tool×3: ListTables, DescribeTable, ExecuteQuery)
```
All integration crates share a common pattern:

- Core traits — `ChatModel`, `Embeddings`, `VectorStore`, `Store`, `LlmCache`, `Loader` are defined in `synaptic-core`
- Independent crates — each integration is a separate crate with its own feature flag
- Zero coupling — integration crates never depend on each other
- Config structs — builder-pattern configuration with `new()` + `with_*()` methods (sketched below)
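For illustration, a builder-style config might be assembled as follows. `OpenAiConfig::new(api_key, model)` appears in the provider pattern below; the `with_*()` setters shown here are hypothetical placeholders, since each crate documents its own setters.

```rust
use synaptic::openai::OpenAiConfig;

// `with_base_url` and `with_temperature` are hypothetical examples of the
// builder style; check each integration crate for its actual with_*() methods.
let config = OpenAiConfig::new("sk-...", "gpt-4o")
    .with_base_url("https://my-proxy.example.com/v1") // hypothetical setter
    .with_temperature(0.2);                           // hypothetical setter
```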
## Core Traits

| Trait | Purpose | Crate Implementations |
|---|---|---|
| `ChatModel` | LLM chat completion | openai, anthropic, gemini, ollama, bedrock, groq, mistral, deepseek |
| `Embeddings` | Text embedding vectors | openai, ollama, cohere, huggingface |
| `VectorStore` | Vector similarity search | qdrant, pgvector, pinecone, chroma, mongodb, elasticsearch, weaviate (+ in-memory) |
| `Store` | Key-value storage | redis (+ in-memory) |
| `LlmCache` | LLM response caching | redis, sqlite (+ in-memory) |
| `Checkpointer` | Graph state persistence | redis, pgvector |
| `Loader` | Document loading | pdf (+ text, json, csv, directory) |
| `DocumentCompressor` | Document reranking/filtering | cohere (+ embeddings filter) |
| `Tool` | Agent tool | tavily, sqltoolkit (3 tools), duckduckgo, wikipedia (+ custom tools) |
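Because every provider implements the same trait, application code can be written against the trait instead of a concrete crate. The sketch below assumes `ChatModel::chat` takes a `&ChatRequest` and is awaited, matching the RAG example later on this page; the `anyhow::Result` return type is only for illustration.

```rust
use synaptic::core::{ChatModel, ChatRequest, Message};

// Works with OpenAiChatModel, AnthropicChatModel, OllamaChatModel, or any
// other type implementing the ChatModel trait.
async fn summarize<M: ChatModel>(model: &M, text: &str) -> anyhow::Result<String> {
    let request = ChatRequest::new(vec![
        Message::system("Summarize the following text in one sentence."),
        Message::human(text),
    ]);
    let response = model.chat(&request).await?;
    Ok(response.message.content().unwrap_or_default().to_string())
}
```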
## LLM Provider Pattern

All LLM providers follow the same pattern — a config struct, a model struct, and a `ProviderBackend` for HTTP transport:
```rust
use std::sync::Arc;

use synaptic::openai::{OpenAiChatModel, OpenAiConfig};
use synaptic::models::{HttpBackend, FakeBackend};

// Production
let config = OpenAiConfig::new("sk-...", "gpt-4o");
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));

// Testing (no network calls)
let model = OpenAiChatModel::new(config, Arc::new(FakeBackend::with_responses(vec![...])));
```
The `ProviderBackend` abstraction (in `synaptic-models`) enables:

- `HttpBackend` — real HTTP calls in production
- `FakeBackend` — deterministic responses in tests
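The same shape applies to every other provider crate; only the config and model types change. The type names below (`AnthropicConfig`, `AnthropicChatModel`) and the model string are inferred by analogy with the OpenAI crate and should be checked against `synaptic-anthropic`.

```rust
use std::sync::Arc;

use synaptic::anthropic::{AnthropicChatModel, AnthropicConfig};
use synaptic::models::HttpBackend;

// Assumed constructor shape: new(api_key, model), mirroring OpenAiConfig::new.
let config = AnthropicConfig::new("sk-ant-...", "claude-sonnet-4-5");
let model = AnthropicChatModel::new(config, Arc::new(HttpBackend::new()));
```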
## Storage & Retrieval Pattern

Vector stores, key-value stores, and caches implement core traits that allow drop-in replacement:
```rust
// Swap InMemoryVectorStore for QdrantVectorStore — same trait interface.
// `docs` and `embeddings` come from your loader and embeddings provider.
use synaptic::qdrant::{QdrantVectorStore, QdrantConfig};

let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config);

store.add_documents(docs, &embeddings).await?;
let results = store.similarity_search("query", 5, &embeddings).await?;
```
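In tests, the same calls can run against the in-memory store mentioned in the comment above. The `InMemoryVectorStore::new()` constructor and its module path are assumptions here; the point is that the `VectorStore` calls do not change.

```rust
// Same trait, different backing store: nothing downstream needs to change.
// `InMemoryVectorStore` location and constructor are assumed; see synaptic's
// vectorstores module for the actual API.
use synaptic::vectorstores::InMemoryVectorStore;

let store = InMemoryVectorStore::new();
store.add_documents(docs, &embeddings).await?;
let results = store.similarity_search("query", 5, &embeddings).await?;
```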
## Feature Flags

Each integration has its own feature flag in the `synaptic` facade crate:

```toml
[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant"] }
```
| Feature | Integration |
|---|---|
| `openai` | OpenAI ChatModel + Embeddings (+ OpenAI-compatible providers + Azure) |
| `anthropic` | Anthropic ChatModel |
| `gemini` | Google Gemini ChatModel |
| `ollama` | Ollama ChatModel + Embeddings |
| `bedrock` | AWS Bedrock ChatModel |
| `groq` | Groq ChatModel (ultra-fast LPU inference, OpenAI-compatible) |
| `mistral` | Mistral ChatModel (OpenAI-compatible) |
| `deepseek` | DeepSeek ChatModel (cost-efficient reasoning, OpenAI-compatible) |
| `cohere` | Cohere Reranker + Embeddings |
| `huggingface` | HuggingFace Inference API Embeddings |
| `qdrant` | Qdrant vector store |
| `pgvector` | PostgreSQL pgvector store + graph checkpointer |
| `pinecone` | Pinecone vector store |
| `chroma` | Chroma vector store |
| `mongodb` | MongoDB Atlas vector search |
| `elasticsearch` | Elasticsearch vector store |
| `weaviate` | Weaviate vector store |
| `redis` | Redis store + cache + graph checkpointer |
| `sqlite` | SQLite LLM cache |
| `pdf` | PDF document loader |
| `tavily` | Tavily search tool |
| `sqltoolkit` | SQL database toolkit (ListTables, DescribeTable, ExecuteQuery) |
Convenience combinations: `models` (all 9 LLM providers), `agent` (includes `openai` + `graph`), `rag` (includes `openai` + retrieval stack), `full` (everything).
## Provider Selection Guide

Choose a provider based on your requirements:
| Provider | Auth | Streaming | Tool Calling | Embeddings | Best For |
|---|---|---|---|---|---|
| OpenAI | API key (header) | SSE | Yes | Yes | General-purpose, widest model selection |
| Anthropic | API key (x-api-key) | SSE | Yes | No | Long context, reasoning tasks |
| Gemini | API key (query param) | SSE | Yes | No | Google ecosystem, multimodal |
| Ollama | None (local) | NDJSON | Yes | Yes | Privacy-sensitive, offline, development |
| Bedrock | AWS IAM | AWS SDK | Yes | No | Enterprise AWS environments |
| Groq | API key (header) | SSE | Yes | No | Ultra-fast inference (LPU), latency-critical |
| Mistral | API key (header) | SSE | Yes | No | EU compliance, cost-efficient tool calling |
| DeepSeek | API key (header) | SSE | Yes | No | Cost-efficient reasoning (90%+ cheaper) |
| Cohere | API key (header) | — | — | Yes | Reranking + production-grade embeddings |
| HuggingFace | API key (optional) | — | — | Yes | Open-source sentence-transformers |
Deciding factors:
- Privacy & compliance — Ollama runs entirely locally; Bedrock keeps data within AWS
- Cost — Ollama is free; OpenAI-compatible providers (Groq, DeepSeek) offer competitive pricing
- Latency — Ollama has no network round-trip; Groq is optimized for speed
- Ecosystem — OpenAI has the most third-party integrations; Bedrock integrates with AWS services
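As an example of these trade-offs, local development can run entirely against Ollama with no API key, while production swaps in a hosted provider behind the same `ChatModel` trait. `OllamaConfig::new(base_url, model)` below is an assumed constructor shape; check `synaptic-ollama` for the exact signature.

```rust
use std::sync::Arc;

use synaptic::models::HttpBackend;
use synaptic::ollama::{OllamaChatModel, OllamaConfig};

// Local, key-less development model. Constructor shape is assumed by analogy
// with the other provider crates.
let config = OllamaConfig::new("http://localhost:11434", "llama3.1");
let model = OllamaChatModel::new(config, Arc::new(HttpBackend::new()));
```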
## Vector Store Selection Guide
| Store | Deployment | Managed | Filtering | Scaling | Best For |
|---|---|---|---|---|---|
| Qdrant | Self-hosted / Cloud | Yes (Qdrant Cloud) | Rich (payload filters) | Horizontal | General-purpose, production |
| pgvector | Self-hosted | Via managed Postgres | SQL WHERE clauses | Vertical | Teams already using PostgreSQL |
| Pinecone | Fully managed | Yes | Metadata filters | Automatic | Zero-ops, rapid prototyping |
| Chroma | Self-hosted / Docker | No | Metadata filters | Single node | Development, small-medium datasets |
| MongoDB Atlas | Fully managed | Yes | MQL filters | Automatic | Teams already using MongoDB |
| Elasticsearch | Self-hosted / Cloud | Yes (Elastic Cloud) | Full query DSL | Horizontal | Hybrid text + vector search |
| Weaviate | Self-hosted / Cloud | Yes (WCS) | GraphQL filters | Horizontal | Multi-tenancy, hybrid search |
| InMemory | In-process | N/A | None | N/A | Testing, prototyping |
Deciding factors:
- Existing infrastructure — Use pgvector if you have PostgreSQL, MongoDB Atlas if you use MongoDB, Elasticsearch if you already run an ES cluster
- Operational complexity — Pinecone and MongoDB Atlas are fully managed; Qdrant and Elasticsearch require cluster management
- Query capabilities — Elasticsearch excels at hybrid text + vector queries; Qdrant has the richest filtering
- Cost — InMemory and Chroma are free; pgvector reuses existing database infrastructure
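For instance, a team already running PostgreSQL could point the ingestion step of the RAG example below at pgvector instead of Qdrant. The `PgVectorConfig` / `PgVectorStore` names and constructor arguments here are assumptions made by analogy with the Qdrant crate; consult `synaptic-pgvector` for the real API.

```rust
use synaptic::pgvector::{PgVectorConfig, PgVectorStore};

// Assumed to mirror QdrantConfig::new(url, collection, dimensions) and
// QdrantVectorStore::new(config, embeddings) from the RAG example below.
let config = PgVectorConfig::new("postgres://user:pass@localhost/app", "documents", 1536);
let store = PgVectorStore::new(config, embeddings.clone()).await?;
store.add_documents(&chunks).await?;
```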
## Cache Selection Guide
| Cache | Persistence | Deployment | TTL Support | Best For |
|---|---|---|---|---|
| InMemory | No (process lifetime) | In-process | Yes | Testing, single-process apps |
| Redis | Yes (configurable) | External server | Yes | Multi-process, distributed |
| SQLite | Yes (file-based) | In-process | Yes | Single-machine persistence |
| Semantic | Depends on backing store | In-process | No | Fuzzy-match caching |
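Since `CachedChatModel` only needs something implementing `LlmCache`, swapping the in-memory cache for a persistent one is a small change around the same `base_model` used in the RAG example below. The `RedisCache` type name and constructor are assumptions; see `synaptic-redis` for the actual API.

```rust
use std::sync::Arc;

use synaptic::cache::CachedChatModel;
use synaptic::redis::RedisCache; // type name assumed

// Persistent cache shared across processes; constructor shape is an assumption.
let cache = Arc::new(RedisCache::new("redis://localhost:6379").await?);
let model = CachedChatModel::new(Arc::new(base_model), cache);
```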
## Complete RAG Pipeline Example

This example combines multiple integrations into a full retrieval-augmented generation pipeline with caching and reranking:
```rust
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings};
use synaptic::openai::{OpenAiChatModel, OpenAiConfig, OpenAiEmbeddings};
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::cohere::{CohereReranker, CohereConfig};
use synaptic::cache::{CachedChatModel, InMemoryCache};
use synaptic::retrieval::ContextualCompressionRetriever;
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;

let backend = Arc::new(HttpBackend::new());

// 1. Set up embeddings
let embeddings = Arc::new(OpenAiEmbeddings::new(
    OpenAiEmbeddings::config("text-embedding-3-small"),
    backend.clone(),
));

// 2. Ingest documents into Qdrant
let loader = TextLoader::new("knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;

let qdrant_config = QdrantConfig::new("http://localhost:6334", "knowledge", 1536);
let store = QdrantVectorStore::new(qdrant_config, embeddings.clone()).await?;
store.add_documents(&chunks).await?;

// 3. Build retriever with Cohere reranking
let base_retriever = Arc::new(VectorStoreRetriever::new(Arc::new(store)));
let reranker = CohereReranker::new(CohereConfig::new(std::env::var("COHERE_API_KEY")?));
let retriever = ContextualCompressionRetriever::new(base_retriever, Arc::new(reranker));

// 4. Wrap the LLM with a cache
let llm_config = OpenAiConfig::new(std::env::var("OPENAI_API_KEY")?, "gpt-4o");
let base_model = OpenAiChatModel::new(llm_config, backend.clone());
let cache = Arc::new(InMemoryCache::new());
let model = CachedChatModel::new(Arc::new(base_model), cache);

// 5. Retrieve and generate
let relevant = retriever.retrieve("How does Synaptic handle streaming?").await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");

let request = ChatRequest::new(vec![
    Message::system(&format!("Answer based on the following context:\n\n{context}")),
    Message::human("How does Synaptic handle streaming?"),
]);
let response = model.chat(&request).await?;
println!("{}", response.message.content().unwrap_or_default());
```
This pipeline demonstrates:
- Qdrant for vector storage and retrieval
- Cohere for reranking retrieved documents
- InMemoryCache for caching LLM responses (swap with Redis/SQLite for persistence)
- OpenAI for both embeddings and chat completion
## Adding a New Integration

To add a new integration (a skeleton sketch follows the steps):

- Create a new crate `synaptic-{name}` in `crates/`
- Depend on `synaptic-core` for trait definitions
- Implement the appropriate trait(s)
- Add a feature flag in the `synaptic` facade crate
- Re-export via `pub use synaptic_{name} as {name}` in the facade `lib.rs`
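A minimal skeleton for a hypothetical `synaptic-example` embeddings crate might look like the following. Everything here is illustrative: the `Embeddings` trait's method signatures should be copied from `synaptic-core` rather than guessed, so the trait impl is left as a comment.

```rust
// crates/synaptic-example/src/lib.rs (hypothetical)

/// Builder-pattern configuration, following the shared crate conventions.
pub struct ExampleConfig {
    api_key: String,
    model: String,
}

impl ExampleConfig {
    pub fn new(api_key: impl Into<String>, model: impl Into<String>) -> Self {
        Self { api_key: api_key.into(), model: model.into() }
    }

    pub fn with_model(mut self, model: impl Into<String>) -> Self {
        self.model = model.into();
        self
    }
}

/// The integration type; implement `synaptic_core::Embeddings` for it.
pub struct ExampleEmbeddings {
    config: ExampleConfig,
}

impl ExampleEmbeddings {
    pub fn new(config: ExampleConfig) -> Self {
        Self { config }
    }
}

// impl synaptic_core::Embeddings for ExampleEmbeddings { ... }
```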
## See Also
- Installation — Feature flag reference
- Architecture — Overall system design