Introduction
Synaptic is a Rust agent framework with a LangChain-compatible architecture.
Build production-grade AI agents, chains, and retrieval pipelines in Rust with the same mental model you know from LangChain -- but with compile-time safety, zero-cost abstractions, and native async performance.
Why Synaptic?
- Type-safe -- Message types, tool definitions, and runnable pipelines are checked at compile time. No runtime surprises from mismatched schemas.
- Async-native -- Built on Tokio and async-trait from the ground up. Every trait method is async, and streaming is a first-class citizen via Stream.
- Composable -- LCEL-style pipe operator (|), parallel branches, conditional routing, and fallback chains let you build complex workflows from simple parts.
- LangChain-compatible -- Familiar concepts map directly: ChatPromptTemplate, StateGraph, create_react_agent, ToolNode, VectorStoreRetriever, and more.
Features at a Glance
| Area | What you get |
|---|---|
| Chat Models | OpenAI, Anthropic, Gemini, Ollama adapters with streaming, retry, rate limiting, and caching |
| Messages | Typed message enum with factory methods, filtering, trimming, and merge utilities |
| Prompts | Template interpolation, chat prompt templates, few-shot prompting |
| Output Parsers | String, JSON, structured, list, enum, boolean, XML parsers |
| Runnables (LCEL) | Pipe operator, parallel, branch, assign/pick, bind, fallbacks, retry |
| Tools | Tool trait, registry, serial/parallel execution, tool choice |
| Memory | Buffer, window, summary, token buffer, summary buffer strategies |
| Graph | LangGraph-style state machines with checkpointing, streaming, and human-in-the-loop |
| Retrieval | Loaders, splitters, embeddings, vector stores, BM25, multi-query, ensemble retrievers |
| Evaluation | Exact match, regex, JSON validity, embedding distance, LLM judge evaluators |
| Callbacks | Recording, tracing, composite callback handlers |
Quick Links
- What is Synaptic? -- Concept mapping from LangChain Python to Synaptic Rust
- Architecture Overview -- Layered crate design and dependency graph
- Installation -- Add Synaptic to your project
- Quickstart -- Your first Synaptic program in 30 lines
- Tutorials -- Step-by-step guides for common use cases
- API Reference -- Full API documentation
What is Synaptic?
Synaptic is a Rust framework for building AI agents, chains, and retrieval pipelines. It follows the same architecture and abstractions as LangChain (Python), translated into idiomatic Rust with strong typing, async-native design, and zero-cost abstractions.
If you have used LangChain in Python, you already know the mental model. Synaptic provides the same composable building blocks -- chat models, prompts, output parsers, runnables, tools, memory, graphs, and retrieval -- but catches errors at compile time instead of runtime.
LangChain to Synaptic Mapping
The table below shows how core LangChain Python concepts map to their Synaptic Rust equivalents:
| LangChain (Python) | Synaptic (Rust) | Crate |
|---|---|---|
| ChatOpenAI | OpenAiChatModel | synaptic-openai |
| ChatAnthropic | AnthropicChatModel | synaptic-anthropic |
| ChatGoogleGenerativeAI | GeminiChatModel | synaptic-gemini |
| HumanMessage / AIMessage | Message::human() / Message::ai() | synaptic-core |
| RunnableSequence / LCEL | BoxRunnable with the \| pipe operator | synaptic-runnables |
| RunnableLambda | RunnableLambda | synaptic-runnables |
| RunnableParallel | RunnableParallel | synaptic-runnables |
| RunnableBranch | RunnableBranch | synaptic-runnables |
| RunnablePassthrough.assign() | RunnableAssign | synaptic-runnables |
| ChatPromptTemplate | ChatPromptTemplate | synaptic-prompts |
| ToolNode | ToolNode | synaptic-graph |
| StateGraph | StateGraph | synaptic-graph |
| create_react_agent | create_react_agent | synaptic-graph |
| InMemorySaver | MemorySaver | synaptic-graph |
| StrOutputParser | StrOutputParser | synaptic-parsers |
| JsonOutputParser | JsonOutputParser | synaptic-parsers |
| VectorStoreRetriever | VectorStoreRetriever | synaptic-vectorstores |
| RecursiveCharacterTextSplitter | RecursiveCharacterTextSplitter | synaptic-splitters |
| OpenAIEmbeddings | OpenAiEmbeddings | synaptic-openai |
Key Differences from LangChain Python
While the architecture is compatible, Synaptic makes deliberate Rust-idiomatic choices:
- Message is a tagged enum, not a class hierarchy. You construct messages with factory methods like Message::human("hello") rather than instantiating classes.
- ChatRequest uses a constructor with builder methods: ChatRequest::new(messages).with_tools(tools).with_tool_choice(ToolChoice::Auto).
- All traits are async via #[async_trait]. Every chat(), invoke(), and call() is an async function.
- Concurrency uses Arc-based sharing. Registries use Arc<RwLock<_>>; callbacks and memory use Arc<tokio::sync::Mutex<_>>.
- Errors are typed. SynapticError is an enum with one variant per subsystem, not a generic exception.
- Streaming is trait-based. ChatModel::stream_chat() returns a ChatStream (a pinned Stream of AIMessageChunk), and graph streaming yields GraphEvent values. A consumption sketch follows this list.
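As an illustration, here is a minimal sketch of consuming a chat stream. It assumes stream_chat() is async and returns Result<ChatStream, SynapticError>; check the API reference for the exact signature and for how to read text out of each chunk.
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, SynapticError};

async fn stream_reply(model: &dyn ChatModel) -> Result<(), SynapticError> {
    let request = ChatRequest::new(vec![Message::human("Tell me about Rust.")]);
    // Assumption: stream_chat() is async and returns Result<ChatStream, SynapticError>.
    let mut stream = model.stream_chat(request).await?;
    while let Some(chunk) = stream.next().await {
        let _chunk = chunk?; // each AIMessageChunk carries a partial piece of the reply
        // accumulate or render the chunk here
    }
    Ok(())
}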
When to Use Synaptic
Synaptic is a good fit when you need:
- Performance-critical AI applications -- Rust's zero-cost abstractions and lack of garbage collection make Synaptic suitable for high-throughput, low-latency agent workloads. There is no Python GIL limiting concurrency.
- Rust ecosystem integration -- If your application is already written in Rust (web servers with Axum/Actix, CLI tools, embedded systems), Synaptic lets you add AI agent capabilities without crossing an FFI boundary or managing a Python subprocess.
- Compile-time safety -- Tool argument schemas, message types, and runnable pipeline signatures are all checked by the compiler. Refactoring a tool's input type produces compile errors at every call site, not runtime crashes in production.
- Deployable binaries -- Synaptic compiles to a single static binary with no runtime dependencies. No Python interpreter, no virtual environment, no pip install.
- Concurrent agent workloads -- Tokio's async runtime lets you run hundreds of concurrent agent sessions on a single machine with efficient task scheduling.
When Not to Use Synaptic
- If your team primarily writes Python and rapid prototyping speed matters more than runtime performance, LangChain Python is the more pragmatic choice.
- If you need access to the full LangChain ecosystem of third-party integrations (hundreds of vector stores, document loaders, and model providers), LangChain Python has broader coverage today.
Architecture Overview
Synaptic is organized as a Cargo workspace with 26 library crates, 1 facade crate, and several example binaries. The crates form a layered architecture where each layer builds on the one below it.
Crate Layers
Core Layer
synaptic-core defines all shared traits and types. Every other crate depends on it.
- Traits: ChatModel, Tool, RuntimeAwareTool, MemoryStore, CallbackHandler, Store, Embeddings
- Types: Message, ChatRequest, ChatResponse, ToolCall, ToolDefinition, ToolChoice, AIMessageChunk, TokenUsage, RunEvent, RunnableConfig, Runtime, ToolRuntime, ModelProfile, Item, ContentBlock
- Error type: SynapticError (20 variants covering all subsystems)
- Stream type: ChatStream (Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send>>)
Implementation Crates
Each crate implements one core trait or provides a focused capability:
| Crate | Purpose |
|---|---|
| synaptic-models | ProviderBackend abstraction, ScriptedChatModel test double, wrappers (retry, rate limit, structured output, bound tools) |
| synaptic-openai | OpenAiChatModel + OpenAiEmbeddings |
| synaptic-anthropic | AnthropicChatModel |
| synaptic-gemini | GeminiChatModel |
| synaptic-ollama | OllamaChatModel + OllamaEmbeddings |
| synaptic-tools | ToolRegistry, SerialToolExecutor, ParallelToolExecutor |
| synaptic-memory | Memory strategies: buffer, window, summary, token buffer, summary buffer, RunnableWithMessageHistory |
| synaptic-callbacks | RecordingCallback, TracingCallback, CompositeCallback |
| synaptic-prompts | PromptTemplate, ChatPromptTemplate, FewShotChatMessagePromptTemplate |
| synaptic-parsers | Output parsers: string, JSON, structured, list, enum, boolean, XML, markdown list, numbered list |
| synaptic-cache | InMemoryCache, SemanticCache, CachedChatModel |
Composition Crates
These crates provide higher-level orchestration:
| Crate | Purpose |
|---|---|
| synaptic-runnables | Runnable trait with invoke()/batch()/stream(), BoxRunnable with pipe operator, RunnableLambda, RunnableParallel, RunnableBranch, RunnableAssign, RunnablePick, RunnableWithFallbacks |
| synaptic-graph | LangGraph-style state machines: StateGraph, CompiledGraph, ToolNode, create_react_agent, create_supervisor, create_swarm, Command, GraphResult, Checkpointer, MemorySaver, multi-mode streaming |
Retrieval Pipeline
These crates form the document ingestion and retrieval pipeline:
| Crate | Purpose |
|---|---|
| synaptic-loaders | TextLoader, JsonLoader, CsvLoader, DirectoryLoader |
| synaptic-splitters | CharacterTextSplitter, RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter, TokenTextSplitter |
| synaptic-embeddings | Embeddings trait, FakeEmbeddings, CacheBackedEmbeddings |
| synaptic-vectorstores | VectorStore trait, InMemoryVectorStore, VectorStoreRetriever |
| synaptic-retrieval | Retriever trait, BM25Retriever, MultiQueryRetriever, EnsembleRetriever, ContextualCompressionRetriever, SelfQueryRetriever, ParentDocumentRetriever |
Evaluation
| Crate | Purpose |
|---|---|
| synaptic-eval | Evaluator trait, ExactMatchEvaluator, RegexMatchEvaluator, JsonValidityEvaluator, EmbeddingDistanceEvaluator, LLMJudgeEvaluator, Dataset, batch evaluation pipeline |
Advanced Crates
These crates provide specialized capabilities for production agent systems:
| Crate | Purpose |
|---|---|
| synaptic-store | Store trait implementation, InMemoryStore with semantic search (optional embeddings) |
| synaptic-middleware | AgentMiddleware trait, MiddlewareChain, built-in middleware: model retry, PII filtering, prompt caching, summarization, human-in-the-loop approval, tool call limiting |
| synaptic-mcp | Model Context Protocol adapters: MultiServerMcpClient, Stdio/SSE/HTTP transports for tool discovery and invocation |
| synaptic-macros | Procedural macros: #[tool], #[chain], #[entrypoint], #[task], #[traceable], middleware macros |
| synaptic-deep | Deep Agent harness: Backend trait (State/Store/Filesystem), 7 filesystem tools, 6 middleware, create_deep_agent() factory |
Integration Crates
These crates provide third-party service integrations:
| Crate | Purpose |
|---|---|
| synaptic-qdrant | QdrantVectorStore (Qdrant vector database) |
| synaptic-pgvector | PgVectorStore (PostgreSQL pgvector extension) |
| synaptic-redis | RedisStore + RedisCache (Redis key-value store and LLM cache) |
| synaptic-pdf | PdfLoader (PDF document loading) |
Facade
synaptic re-exports all sub-crates for convenient single-import usage:
use synaptic::core::{ChatModel, Message, ChatRequest};
use synaptic::openai::OpenAiChatModel; // requires "openai" feature
use synaptic::models::ScriptedChatModel; // requires "model-utils" feature
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::graph::{StateGraph, create_react_agent};
Dependency Diagram
All crates depend on synaptic-core for shared traits and types. Higher-level crates depend on the layer below:
┌──────────┐
│ synaptic │ (facade: re-exports all)
└─────┬────┘
│
┌──────────────┬─────────────┼──────────────┬───────────────┐
│ │ │ │ │
┌───┴───┐ ┌─────┴────┐ ┌────┴─────┐ ┌─────┴────┐ ┌─────┴───┐
│ deep │ │middleware│ │ graph │ │runnables │ │ eval │
└───┬───┘ └─────┬────┘ └────┬─────┘ └────┬─────┘ └─────┬───┘
│ │ │ │ │
├──────────────┴────┬───────┴──────────────┤ │
│ │ │ │
┌────┴──┐ ┌─────┐ ┌─────┴──┐ ┌──────┐ ┌───────┐│┌──────┐┌─────┴──┐
│models │ │tools│ │memory │ │store │ │prompts│││parsers││cache │
└───┬───┘ └──┬──┘ └───┬────┘ └──┬───┘ └───┬───┘│└───┬───┘└───┬────┘
│ │ │ │ │ │ │ │
│ ┌─────┴─┬──────┤ ┌────┘ │ │ │ │
│ │ │ │ │ │ │ │ │
├──┤ ┌────┴──┐ │ ┌─┴────┐ ┌─────┴────┴────┴────────┤
│ │ │macros │ │ │ mcp │ │ callbacks │
│ │ └───┬───┘ │ └──┬───┘ └────────┬────────────────┘
│ │ │ │ │ │
┌─┴──┴──────┴───────┴─────┴───────────────┴──┐
│ synaptic-core │
│ (ChatModel, Tool, Store, Embeddings, ...) │
└──────────────────┬──────────────────────────┘
│
Provider crates (each depends on synaptic-core + synaptic-models):
openai, anthropic, gemini, ollama
Retrieval pipeline:
loaders ──► splitters ──► embeddings ──► vectorstores ──► retrieval
Integration crates: qdrant, pgvector, redis, pdf
Design Principles
Async-first with #[async_trait]
Every trait in Synaptic is async. The ChatModel::chat() method, Tool::call(), MemoryStore::load(), and Runnable::invoke() are all async functions. This means you can freely await network calls, database queries, and concurrent operations inside any implementation without blocking the runtime.
Arc-based sharing
Synaptic uses Arc<RwLock<_>> for registries (like ToolRegistry) where many readers need concurrent access, and Arc<tokio::sync::Mutex<_>> for stateful components (like callbacks and memory stores) where mutations must be serialized. This allows safe sharing across async tasks and agent sessions.
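This is plain Tokio/std Rust rather than Synaptic-specific API, but it shows the sharing pattern; the sketch uses tokio::sync::RwLock, and the exact lock types inside Synaptic may differ:
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::RwLock;

#[tokio::main]
async fn main() {
    // Many readers, occasional writers: the shape used for registry-like state.
    let registry: Arc<RwLock<HashMap<String, String>>> = Arc::new(RwLock::new(HashMap::new()));

    let writer = Arc::clone(&registry);
    tokio::spawn(async move {
        writer.write().await.insert("calculator".into(), "adds numbers".into());
    })
    .await
    .unwrap();

    // Readers take a shared lock and do not block each other.
    println!("{} tools registered", registry.read().await.len());
}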
Session isolation
Memory stores and agent runs are keyed by session_id. Multiple conversations can run concurrently on the same model and tool set without state leaking between sessions.
Event-driven callbacks
The CallbackHandler trait receives RunEvent values at each lifecycle stage (run started, LLM called, tool called, run finished, run failed). You can compose multiple handlers with CompositeCallback for logging, tracing, metrics, and recording simultaneously.
Typed error handling
SynapticError has one variant per subsystem (Prompt, Model, Tool, Memory, Graph, etc.). This makes it straightforward to match on specific failure modes and provide targeted recovery logic.
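A minimal sketch of targeted recovery by matching on the error, assuming tuple-style variants named after the subsystems listed above; the real variant names and payloads may differ, so consult the API reference:
use synaptic::core::SynapticError;

fn is_retryable(err: &SynapticError) -> bool {
    // Hypothetical variant shapes, for illustration only.
    match err {
        SynapticError::Model(_) => true,  // provider/network failures are worth retrying
        SynapticError::Tool(_) => false,  // surface tool failures to the agent instead
        _ => false,
    }
}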
Composition over inheritance
Rather than deep trait hierarchies, Synaptic favors composition. A CachedChatModel wraps any ChatModel. A RetryChatModel wraps any ChatModel. A RunnableWithFallbacks wraps any Runnable. You stack behaviors by wrapping, not by extending base classes.
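The sketch below uses a stripped-down stand-in trait rather than Synaptic's real ChatModel, purely to show the wrapping pattern in plain Rust: the wrapper accepts any implementation and implements the same trait itself, which is exactly how the CachedChatModel and RetryChatModel wrappers are described.
use async_trait::async_trait;

// A stand-in trait for illustration; not the real ChatModel.
#[async_trait]
trait Chat {
    async fn chat(&self, prompt: String) -> String;
}

struct Echo;

#[async_trait]
impl Chat for Echo {
    async fn chat(&self, prompt: String) -> String {
        prompt
    }
}

// A wrapper adds behavior (here: logging) around any Chat implementation.
struct Logged<M: Chat + Send + Sync>(M);

#[async_trait]
impl<M: Chat + Send + Sync> Chat for Logged<M> {
    async fn chat(&self, prompt: String) -> String {
        println!("calling inner model");
        self.0.chat(prompt).await
    }
}

#[tokio::main]
async fn main() {
    let model = Logged(Echo); // stack behaviors by wrapping, not by extending a base class
    println!("{}", model.chat("hello".into()).await);
}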
Installation
Requirements
- Rust edition: 2021
- Minimum supported Rust version (MSRV): 1.88
- Runtime: Tokio (async runtime)
Adding Synaptic to Your Project
The synaptic facade crate re-exports all sub-crates. Use feature flags to control which modules are compiled.
Feature Flags
Synaptic provides fine-grained feature flags, similar to tokio:
[dependencies]
# Full -- everything enabled (equivalent to the previous default)
synaptic = { version = "0.2", features = ["full"] }
# Agent development (OpenAI + tools + graph + memory, etc.)
synaptic = { version = "0.2", features = ["agent"] }
# RAG applications (OpenAI + retrieval + loaders + splitters + embeddings + vectorstores, etc.)
synaptic = { version = "0.2", features = ["rag"] }
# Agent + RAG
synaptic = { version = "0.2", features = ["agent", "rag"] }
# Just OpenAI model calls
synaptic = { version = "0.2", features = ["openai"] }
# All model providers (see the "models" row in the feature table below)
synaptic = { version = "0.2", features = ["models"] }
# Fine-grained: one provider + specific modules
synaptic = { version = "0.2", features = ["anthropic", "graph", "cache"] }
Composite features:
| Feature | Description |
|---|---|
| default | model-utils, runnables, prompts, parsers, tools, callbacks |
| agent | default + openai, graph, memory |
| rag | default + openai, retrieval, loaders, splitters, embeddings, vectorstores |
| models | All 6 providers: openai + anthropic + gemini + ollama + bedrock + cohere |
| full | All features enabled |
Provider features (each enables one provider crate):
| Feature | Description |
|---|---|
| openai | OpenAiChatModel + OpenAiEmbeddings (synaptic-openai) |
| anthropic | AnthropicChatModel (synaptic-anthropic) |
| gemini | GeminiChatModel (synaptic-gemini) |
| ollama | OllamaChatModel + OllamaEmbeddings (synaptic-ollama) |
Module features:
Individual features: model-utils, runnables, prompts, parsers, tools, memory, callbacks, retrieval, loaders, splitters, embeddings, vectorstores, graph, cache, eval, store, middleware, mcp, macros, deep.
| Feature | Description |
|---|---|
| model-utils | ProviderBackend abstraction, ScriptedChatModel, wrappers (RetryChatModel, RateLimitedChatModel, StructuredOutputChatModel, etc.) |
| store | Key-value store with namespace hierarchy and optional semantic search |
| middleware | Agent middleware chain (tool call limits, HITL, summarization, context editing) |
| mcp | Model Context Protocol client (Stdio/SSE/HTTP transports) |
| macros | Proc macros (#[tool], #[chain], #[entrypoint], #[traceable]) |
| deep | Deep agent harness (backends, filesystem tools, sub-agents, skills) |
Integration features:
| Feature | Description |
|---|---|
| qdrant | Qdrant vector store (synaptic-qdrant) |
| pgvector | PostgreSQL pgvector store (synaptic-pgvector) |
| redis | Redis store + cache (synaptic-redis) |
| pdf | PDF document loader (synaptic-pdf) |
| bedrock | AWS Bedrock ChatModel (synaptic-bedrock) |
| cohere | Cohere Reranker (synaptic-cohere) |
| pinecone | Pinecone vector store (synaptic-pinecone) |
| chroma | Chroma vector store (synaptic-chroma) |
| mongodb | MongoDB Atlas vector search (synaptic-mongodb) |
| elasticsearch | Elasticsearch vector store (synaptic-elasticsearch) |
| sqlite | SQLite LLM cache (synaptic-sqlite) |
| tavily | Tavily search tool (synaptic-tavily) |
The core module (traits and types) is always available regardless of feature selection.
Quick Start Example
[dependencies]
synaptic = { version = "0.2", features = ["agent"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Using the Facade
The facade crate provides namespaced re-exports for all sub-crates. You access types through their module path:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings}; // requires "openai" feature
use synaptic::anthropic::AnthropicChatModel; // requires "anthropic" feature
use synaptic::models::ScriptedChatModel; // requires "model-utils" feature
use synaptic::runnables::{Runnable, BoxRunnable, RunnableLambda};
use synaptic::prompts::ChatPromptTemplate;
use synaptic::parsers::StrOutputParser;
use synaptic::tools::ToolRegistry;
use synaptic::memory::InMemoryStore;
use synaptic::graph::{StateGraph, create_react_agent};
use synaptic::retrieval::Retriever;
use synaptic::vectorstores::InMemoryVectorStore;
Alternatively, you can depend on individual crates directly if you want to minimize compile times:
[dependencies]
synaptic-core = "0.2"
synaptic-models = "0.2"
Provider API Keys
Synaptic reads API keys from environment variables. Set the ones you need for your chosen provider:
| Provider | Environment Variable |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Google Gemini | GOOGLE_API_KEY |
| Ollama | No key required (runs locally) |
For example, on a Unix shell:
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="AI..."
You do not need any API keys to run the Quickstart example, which uses the ScriptedChatModel test double.
Building and Testing
From the workspace root:
# Build all crates
cargo build --workspace
# Run all tests
cargo test --workspace
# Test a single crate
cargo test -p synaptic-models
# Run a specific test by name
cargo test -p synaptic-core -- trim_messages
# Check formatting
cargo fmt --all -- --check
# Run lints
cargo clippy --workspace
Workspace Dependencies
Synaptic uses Cargo workspace-level dependency management. Key shared dependencies include:
- async-trait -- async trait methods
- serde / serde_json -- serialization
- thiserror 2.0 -- error derive
- tokio -- async runtime (macros, rt-multi-thread, sync, time)
- reqwest -- HTTP client (json, stream features)
- futures / async-stream -- stream utilities
- tracing / tracing-subscriber -- structured logging
Quickstart
This guide walks you through a minimal Synaptic program that sends a chat request and prints the response. It uses ScriptedChatModel, a test double that returns pre-configured responses, so you do not need any API keys to run it.
The Complete Example
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError};
use synaptic::models::ScriptedChatModel;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// 1. Create a scripted model with a predefined response.
// ScriptedChatModel returns responses in order, one per chat() call.
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Hello! I'm a Synaptic assistant. How can I help you today?"),
usage: None,
},
]);
// 2. Build a chat request with a system prompt and a user message.
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant built with Synaptic."),
Message::human("Hello! What are you?"),
]);
// 3. Send the request and get a response.
let response = model.chat(request).await?;
// 4. Print the assistant's reply.
println!("Assistant: {}", response.message.content());
Ok(())
}
Running this program prints:
Assistant: Hello! I'm a Synaptic assistant. How can I help you today?
What is Happening
- ScriptedChatModel::new(vec![...]) creates a chat model that returns the given ChatResponse values in sequence. This is useful for testing and examples without requiring a live API. In production, you would replace this with OpenAiChatModel (from synaptic::openai), AnthropicChatModel (from synaptic::anthropic), or another provider adapter.
- ChatRequest::new(messages) constructs a chat request from a vector of messages. Messages are created with factory methods: Message::system() for system prompts, Message::human() for user input, and Message::ai() for assistant responses.
- model.chat(request).await? sends the request asynchronously and returns a ChatResponse containing the model's message and optional token usage information.
- response.message.content() extracts the text content from the response message.
Using a Real Provider
To use OpenAI instead of the scripted model, replace the model creation:
use synaptic::openai::OpenAiChatModel;
// Reads OPENAI_API_KEY from the environment automatically.
let model = OpenAiChatModel::new("gpt-4o");
You will also need the "openai" feature enabled in your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai"] }
The rest of the code stays the same -- ChatModel::chat() has the same signature regardless of provider.
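For instance, a helper that works with any provider can take the model as a trait object; the ask function below is ours, not part of Synaptic, and it compiles against the chat() signature shown in this guide:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, Message, SynapticError};

// Works with OpenAiChatModel, AnthropicChatModel, ScriptedChatModel, or any other adapter.
async fn ask(model: Arc<dyn ChatModel>, question: &str) -> Result<String, SynapticError> {
    let request = ChatRequest::new(vec![
        Message::system("You are a helpful assistant."),
        Message::human(question),
    ]);
    let response = model.chat(request).await?;
    Ok(response.message.content().to_string())
}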
Next Steps
- Build a Simple LLM Application -- Chain prompts with output parsers
- Build a Chatbot with Memory -- Add conversation history
- Build a ReAct Agent -- Give your model tools to call
- Build a RAG Application -- Retrieve documents for context
- Architecture Overview -- Understand the crate structure
Build a Simple LLM Application
This tutorial walks you through building a basic chat application with Synaptic. You will learn how to create a chat model, send messages, template prompts, and compose processing pipelines using the LCEL pipe operator.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = "0.2"
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Create a Chat Model
Every LLM interaction in Synaptic goes through a type that implements the ChatModel trait. For production use you would reach for OpenAiChatModel (from synaptic::openai), AnthropicChatModel (from synaptic::anthropic), or one of the other provider adapters. For this tutorial we use ScriptedChatModel, which returns pre-configured responses -- perfect for offline development and testing.
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Paris is the capital of France."),
usage: None,
},
]);
ScriptedChatModel pops responses from a queue in order. Each call to chat() returns the next response. This makes tests deterministic and lets you compile and run examples without an API key.
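Because the responses come back in order, a unit test can script a whole exchange up front. A sketch, assuming two sequential chat() calls as in the API shown above:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;

#[tokio::test]
async fn scripted_model_returns_responses_in_order() {
    let model = ScriptedChatModel::new(vec![
        ChatResponse { message: Message::ai("first reply"), usage: None },
        ChatResponse { message: Message::ai("second reply"), usage: None },
    ]);

    let first = model
        .chat(ChatRequest::new(vec![Message::human("one")]))
        .await
        .unwrap();
    let second = model
        .chat(ChatRequest::new(vec![Message::human("two")]))
        .await
        .unwrap();

    assert_eq!(first.message.content(), "first reply");
    assert_eq!(second.message.content(), "second reply");
}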
Step 2: Build a Request and Get a Response
A ChatRequest holds the conversation messages (and optionally tool definitions). Build one with ChatRequest::new() and pass a vector of messages:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
#[tokio::main]
async fn main() {
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Paris is the capital of France."),
usage: None,
},
]);
let request = ChatRequest::new(vec![
Message::system("You are a geography expert."),
Message::human("What is the capital of France?"),
]);
let response = model.chat(request).await.unwrap();
println!("{}", response.message.content());
// Output: Paris is the capital of France.
}
Key points:
- Message::system(), Message::human(), and Message::ai() are factory methods for building typed messages.
- ChatRequest::new(messages) is the constructor. Never build the struct literal directly.
- model.chat(request) is async and returns Result<ChatResponse, SynapticError>.
Step 3: Template Messages with ChatPromptTemplate
Hard-coding message strings works for one-off calls, but real applications need parameterized prompts. ChatPromptTemplate lets you define message templates with {{ variable }} placeholders that are filled in at runtime.
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a helpful assistant that speaks {{ language }}."),
MessageTemplate::human("{{ question }}"),
]);
To render the template, call format() with a map of variable values:
use std::collections::HashMap;
use serde_json::Value;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a helpful assistant that speaks {{ language }}."),
MessageTemplate::human("{{ question }}"),
]);
let mut values = HashMap::new();
values.insert("language".to_string(), Value::String("French".to_string()));
values.insert("question".to_string(), Value::String("What is the capital of France?".to_string()));
let messages = template.format(&values).unwrap();
// messages[0] => System("You are a helpful assistant that speaks French.")
// messages[1] => Human("What is the capital of France?")
ChatPromptTemplate also implements the Runnable trait, which means it can participate in LCEL pipelines. When used as a Runnable, it takes a HashMap<String, Value> as input and produces Vec<Message> as output.
Step 4: Compose a Pipeline with the Pipe Operator
Synaptic implements LangChain Expression Language (LCEL) composition through the | pipe operator. You can chain any two runnables together as long as the output type of the first matches the input type of the second.
Here is a complete example that templates a prompt and extracts the response text:
use std::collections::HashMap;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
#[tokio::main]
async fn main() {
// 1. Define the model
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The capital of France is Paris."),
usage: None,
},
]);
// 2. Define the prompt template
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a geography expert."),
MessageTemplate::human("{{ question }}"),
]);
// 3. Build the chain: template -> model -> parser
// Each step is boxed to erase types, then piped with |
let chain = template.boxed() | model.boxed() | StrOutputParser.boxed();
// 4. Invoke the chain
let mut input = HashMap::new();
input.insert(
"question".to_string(),
serde_json::Value::String("What is the capital of France?".to_string()),
);
let config = RunnableConfig::default();
let result: String = chain.invoke(input, &config).await.unwrap();
println!("{}", result);
// Output: The capital of France is Paris.
}
Here is what happens at each stage of the pipeline:
- ChatPromptTemplate receives HashMap<String, Value>, renders the templates, and outputs Vec<Message>.
- ScriptedChatModel receives Vec<Message> (via its Runnable implementation, which wraps them in a ChatRequest), calls the model, and outputs a Message.
- StrOutputParser receives a Message and extracts its text content as a String.
The boxed() method wraps each component into a BoxRunnable, which is a type-erased wrapper that enables the | operator. Without boxing, Rust cannot unify the different concrete types.
Summary
In this tutorial you learned how to:
- Create a ScriptedChatModel for offline development
- Build ChatRequest objects from typed messages
- Use ChatPromptTemplate with {{ variable }} interpolation
- Compose processing pipelines with the LCEL | pipe operator
Next Steps
- Build a Chatbot with Memory -- add conversation history
- Build a ReAct Agent -- give the LLM tools to call
- Runnables & LCEL -- deeper look at composition patterns
Build a Chatbot with Memory
This tutorial walks you through building a session-based chatbot that remembers conversation history. You will learn how to store and retrieve messages with InMemoryStore, isolate conversations by session ID, and choose the right memory strategy for your use case.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["memory"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Store and Load Messages
Every chatbot needs to remember what was said. Synaptic provides the MemoryStore trait for this purpose, and InMemoryStore as a simple in-process implementation backed by a HashMap.
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::InMemoryStore;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let memory = InMemoryStore::new();
let session_id = "demo-session";
// Simulate a conversation
memory.append(session_id, Message::human("Hello, Synaptic")).await?;
memory.append(session_id, Message::ai("Hello! How can I help you?")).await?;
memory.append(session_id, Message::human("What can you do?")).await?;
memory.append(session_id, Message::ai("I can help with many tasks!")).await?;
// Load the conversation history
let transcript = memory.load(session_id).await?;
for message in &transcript {
println!("{}: {}", message.role(), message.content());
}
// Clear memory when done
memory.clear(session_id).await?;
Ok(())
}
The output will be:
human: Hello, Synaptic
ai: Hello! How can I help you?
human: What can you do?
ai: I can help with many tasks!
The MemoryStore trait defines three methods:
- append(session_id, message) -- adds a message to a session's history.
- load(session_id) -- returns all messages for a session as a Vec<Message>.
- clear(session_id) -- removes all messages for a session.
Step 2: Session Isolation
Each session ID maps to an independent conversation history. This is how you keep multiple users or threads separate:
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::InMemoryStore;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let memory = InMemoryStore::new();
// Alice's conversation
memory.append("alice", Message::human("Hi, I'm Alice")).await?;
memory.append("alice", Message::ai("Hello, Alice!")).await?;
// Bob's conversation (completely independent)
memory.append("bob", Message::human("Hi, I'm Bob")).await?;
memory.append("bob", Message::ai("Hello, Bob!")).await?;
// Each session has its own history
let alice_history = memory.load("alice").await?;
let bob_history = memory.load("bob").await?;
assert_eq!(alice_history.len(), 2);
assert_eq!(bob_history.len(), 2);
assert_eq!(alice_history[0].content(), "Hi, I'm Alice");
assert_eq!(bob_history[0].content(), "Hi, I'm Bob");
Ok(())
}
Session IDs are arbitrary strings. In a web application you would typically use a user ID, a conversation thread ID, or a combination of both.
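Continuing the example above, a composite session key is just string formatting; the key scheme here is hypothetical and entirely up to your application:
// Hypothetical key scheme: one conversation thread per user.
let user_id = "user-42";
let thread_id = "thread-7";
let session_id = format!("{user_id}:{thread_id}");

memory.append(session_id.as_str(), Message::human("Hi")).await?;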
Step 3: Choose a Memory Strategy
As conversations grow long, sending every message to the LLM becomes expensive and eventually exceeds the context window. Synaptic provides several memory strategies that wrap an underlying MemoryStore and control what gets returned by load().
ConversationBufferMemory
Keeps all messages. This is the simplest strategy -- a passthrough wrapper that makes the "keep everything" policy explicit:
use std::sync::Arc;
use synaptic::core::MemoryStore;
use synaptic::memory::{InMemoryStore, ConversationBufferMemory};
let store = Arc::new(InMemoryStore::new());
let memory = ConversationBufferMemory::new(store);
// memory.load() returns all messages
Best for: short conversations where you want the full history available.
ConversationWindowMemory
Keeps only the last K messages. Older messages are still stored but are not returned by load():
use std::sync::Arc;
use synaptic::core::MemoryStore;
use synaptic::memory::{InMemoryStore, ConversationWindowMemory};
let store = Arc::new(InMemoryStore::new());
let memory = ConversationWindowMemory::new(store, 10); // keep last 10 messages
// memory.load() returns at most 10 messages
Best for: conversations where recent context is sufficient and you want predictable costs.
ConversationSummaryMemory
Uses an LLM to summarize older messages. When the stored message count exceeds buffer_size * 2, the older portion is compressed into a summary that is prepended as a system message:
use std::sync::Arc;
use synaptic::core::{ChatModel, MemoryStore};
use synaptic::memory::{InMemoryStore, ConversationSummaryMemory};
let store = Arc::new(InMemoryStore::new());
let model: Arc<dyn ChatModel> = /* your chat model */;
let memory = ConversationSummaryMemory::new(store, model, 6);
// When messages exceed 12, older ones are summarized
// memory.load() returns: [summary system message] + [recent 6 messages]
Best for: long-running conversations where you need to retain the gist of older context without the full verbatim history.
ConversationTokenBufferMemory
Keeps messages within a token budget. Uses a configurable token estimator to drop the oldest messages once the total exceeds the limit:
use std::sync::Arc;
use synaptic::core::MemoryStore;
use synaptic::memory::{InMemoryStore, ConversationTokenBufferMemory};
let store = Arc::new(InMemoryStore::new());
let memory = ConversationTokenBufferMemory::new(store, 4000); // 4000 token budget
// memory.load() returns as many recent messages as fit within 4000 tokens
Best for: staying within a model's context window by directly managing token count.
ConversationSummaryBufferMemory
A hybrid of summary and buffer strategies. Keeps the most recent messages verbatim, and summarizes everything older when the token count exceeds a threshold:
use std::sync::Arc;
use synaptic::core::{ChatModel, MemoryStore};
use synaptic::memory::{InMemoryStore, ConversationSummaryBufferMemory};
let store = Arc::new(InMemoryStore::new());
let model: Arc<dyn ChatModel> = /* your chat model */;
let memory = ConversationSummaryBufferMemory::new(store, model, 2000);
// Keeps recent messages verbatim; summarizes when total tokens exceed 2000
Best for: balancing cost with context quality -- you get the detail of recent messages and the compressed gist of older ones.
Step 4: Auto-Manage History with RunnableWithMessageHistory
In a real chatbot, you want the history load/save to happen automatically on each turn. RunnableWithMessageHistory wraps any Runnable<Vec<Message>, String> and handles this for you:
- Extracts the session_id from RunnableConfig.metadata["session_id"]
- Loads conversation history from memory
- Appends the user's new message
- Calls the inner runnable with the full message list
- Saves the AI response back to memory
use std::sync::Arc;
use std::collections::HashMap;
use synaptic::core::{MemoryStore, RunnableConfig};
use synaptic::memory::{InMemoryStore, RunnableWithMessageHistory};
use synaptic::runnables::Runnable;
// Wrap a model chain with automatic history management
let memory = Arc::new(InMemoryStore::new());
let chain = /* your model chain (BoxRunnable<Vec<Message>, String>) */;
let chatbot = RunnableWithMessageHistory::new(chain, memory);
// Each call automatically loads/saves history
let mut config = RunnableConfig::default();
config.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("user-42".to_string()),
);
let response = chatbot.invoke("What is Rust?".to_string(), &config).await?;
// The user message and AI response are now stored in memory for session "user-42"
This is the recommended approach for production chatbots because it keeps the memory management out of your application logic.
How It All Fits Together
Here is the mental model for Synaptic memory:
+-----------------------+
| MemoryStore trait |
| append / load / clear |
+-----------+-----------+
|
+----------------------+----------------------+
| | |
InMemoryStore (other stores) Memory Strategies
(raw storage) (wrap a MemoryStore)
|
+----------------------+----------------------+
| | | | |
Buffer Window Summary TokenBuffer SummaryBuffer
(all) (last K) (LLM) (tokens) (hybrid)
All memory strategies implement MemoryStore themselves, so they are composable -- you could wrap an InMemoryStore in a ConversationWindowMemory, and everything downstream only sees the MemoryStore trait.
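A sketch of what that composability buys you: downstream code can accept a MemoryStore trait object while you swap strategies at the call site. This assumes, as stated above, that the strategies implement MemoryStore; check the API reference for exact bounds.
use std::sync::Arc;
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::{ConversationWindowMemory, InMemoryStore};

// Downstream code depends only on the trait, not on a concrete strategy.
async fn record_turn(memory: &dyn MemoryStore, session_id: &str) -> Result<(), SynapticError> {
    memory.append(session_id, Message::human("ping")).await?;
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), SynapticError> {
    let store = Arc::new(InMemoryStore::new());
    let windowed = ConversationWindowMemory::new(store, 10); // wrap the raw store
    record_turn(&windowed, "user-42").await?;
    Ok(())
}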
Summary
In this tutorial you learned how to:
- Use InMemoryStore to store and retrieve conversation messages
- Isolate conversations with session IDs
- Choose a memory strategy based on your conversation length and cost requirements
- Automate history management with RunnableWithMessageHistory
Next Steps
- Build a RAG Application -- add document retrieval to your chatbot
- Memory How-to Guides -- detailed guides for each memory strategy
- Memory Concepts -- deeper understanding of memory architecture
Build a RAG Application
This tutorial walks you through building a Retrieval-Augmented Generation (RAG) pipeline with Synaptic. RAG is a pattern where you retrieve relevant documents from a knowledge base and include them as context in a prompt, so the LLM can answer questions grounded in your data rather than relying solely on its training.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["rag"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
How RAG Works
A RAG pipeline has two phases:
Indexing (offline) Querying (online)
================== ==================
+-----------+ +-----------+
| Documents | | Query |
+-----+-----+ +-----+-----+
| |
v v
+-----+------+ +-----+------+
| Split | | Retrieve | <--- Vector Store
+-----+------+ +-----+------+
| |
v v
+-----+------+ +-----+------+
| Embed | | Augment | (inject context into prompt)
+-----+------+ +-----+------+
| |
v v
+-----+------+ +-----+------+
| Store | ---> Vector Store | Generate | (LLM produces answer)
+------------+ +------------+
- Indexing -- Load documents, split them into chunks, embed each chunk, and store the vectors.
- Querying -- Embed the user's question, find the most similar chunks, include them in a prompt, and ask the LLM.
Step 1: Load Documents
Synaptic provides several document loaders. TextLoader wraps an in-memory string into a Document. For files on disk, use FileLoader.
use synaptic::loaders::{Loader, TextLoader};
let loader = TextLoader::new(
"rust-intro",
"Rust is a systems programming language focused on safety, speed, and concurrency. \
It achieves memory safety without a garbage collector through its ownership system. \
Rust's type system and borrow checker ensure that references are always valid. \
The language has grown rapidly since its 1.0 release in 2015 and is widely used \
for systems programming, web backends, embedded devices, and command-line tools.",
);
let docs = loader.load().await?;
// docs[0].id == "rust-intro"
// docs[0].content == the full text above
Each Document has three fields:
- id -- a unique identifier (a string you provide).
- content -- the text content.
- metadata -- a HashMap<String, serde_json::Value> for arbitrary key-value pairs.
For loading files from disk, use FileLoader:
use synaptic::loaders::{Loader, FileLoader};
let loader = FileLoader::new("data/rust-book.txt");
let docs = loader.load().await?;
// docs[0].id == "data/rust-book.txt"
// docs[0].metadata["source"] == "data/rust-book.txt"
Other loaders include JsonLoader, CsvLoader, and DirectoryLoader (for loading many files at once with glob filtering).
Step 2: Split Documents into Chunks
Large documents need to be split into smaller chunks so that retrieval can return focused, relevant passages instead of entire files. RecursiveCharacterTextSplitter tries a hierarchy of separators ("\n\n", "\n", " ", "") and keeps chunks within a size limit.
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
let splitter = RecursiveCharacterTextSplitter::new(100)
.with_chunk_overlap(20);
let chunks = splitter.split_documents(docs);
for chunk in &chunks {
println!("[{}] {} chars: {}...", chunk.id, chunk.content.len(), &chunk.content[..40]);
}
The splitter produces new Document values with IDs like rust-intro-chunk-0, rust-intro-chunk-1, etc. Each chunk inherits the parent document's metadata and gains a chunk_index metadata field.
Key parameters:
- chunk_size -- the maximum character length of each chunk (passed to new()).
- chunk_overlap -- how many characters from the end of one chunk overlap with the start of the next (set with .with_chunk_overlap()). Overlap helps preserve context across chunk boundaries.
Other splitters are available for specialized content: CharacterTextSplitter, MarkdownHeaderTextSplitter, HtmlHeaderTextSplitter, and TokenTextSplitter.
Step 3: Embed and Store
Embeddings convert text into numerical vectors so that similarity can be computed mathematically. FakeEmbeddings provides deterministic, hash-based vectors for testing -- no API key required.
use std::sync::Arc;
use synaptic::embeddings::FakeEmbeddings;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
let embeddings = Arc::new(FakeEmbeddings::new(128));
// Create a vector store and add the chunks
let store = InMemoryVectorStore::new();
let ids = store.add_documents(chunks, embeddings.as_ref()).await?;
println!("Indexed {} chunks", ids.len());
InMemoryVectorStore stores document vectors in memory and uses cosine similarity for search. For convenience, you can also create a pre-populated store in one step:
let store = InMemoryVectorStore::from_documents(chunks, embeddings.as_ref()).await?;
For production use, replace FakeEmbeddings with OpenAiEmbeddings (from synaptic::openai) or OllamaEmbeddings (from synaptic::ollama), which call real embedding APIs.
Step 4: Retrieve Relevant Documents
Now you can search the vector store for chunks that are similar to a query:
use synaptic::vectorstores::VectorStore;
let results = store.similarity_search("What is Rust?", 3, embeddings.as_ref()).await?;
for doc in &results {
println!("Found: {}", doc.content);
}
The second argument (3) is k -- the number of results to return.
Using a Retriever
For a cleaner API that decouples retrieval logic from the store implementation, wrap the store in a VectorStoreRetriever:
use synaptic::retrieval::Retriever;
use synaptic::vectorstores::VectorStoreRetriever;
let retriever = VectorStoreRetriever::new(
Arc::new(store),
embeddings.clone(),
3, // default k
);
let results = retriever.retrieve("What is Rust?", 3).await?;
The Retriever trait has a single method -- retrieve(query, top_k) -- and is implemented by many retrieval strategies in Synaptic:
- VectorStoreRetriever -- wraps any VectorStore for similarity search.
- BM25Retriever -- keyword-based scoring (no embeddings needed).
- MultiQueryRetriever -- generates multiple query variants with an LLM to improve recall.
- EnsembleRetriever -- combines multiple retrievers with Reciprocal Rank Fusion.
Step 5: Generate an Answer
The final step combines retrieved context with the user's question in a prompt. Here is the complete pipeline:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError};
use synaptic::models::ScriptedChatModel;
use synaptic::loaders::{Loader, TextLoader};
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore, VectorStoreRetriever};
use synaptic::retrieval::Retriever;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// 1. Load
let loader = TextLoader::new(
"rust-guide",
"Rust is a systems programming language focused on safety, speed, and concurrency. \
It achieves memory safety without a garbage collector through its ownership system. \
Rust was first released in 2015 and has grown into one of the most loved languages \
according to developer surveys.",
);
let docs = loader.load().await?;
// 2. Split
let splitter = RecursiveCharacterTextSplitter::new(100).with_chunk_overlap(20);
let chunks = splitter.split_documents(docs);
// 3. Embed and store
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = InMemoryVectorStore::from_documents(chunks, embeddings.as_ref()).await?;
// 4. Retrieve
let retriever = VectorStoreRetriever::new(Arc::new(store), embeddings.clone(), 2);
let question = "When was Rust first released?";
let relevant = retriever.retrieve(question, 2).await?;
// 5. Build the augmented prompt
let context = relevant
.iter()
.map(|doc| doc.content.as_str())
.collect::<Vec<_>>()
.join("\n\n");
let prompt = format!(
"Answer the question based only on the following context:\n\n\
{context}\n\n\
Question: {question}"
);
// 6. Generate (using ScriptedChatModel for offline testing)
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Rust was first released in 2015."),
usage: None,
},
]);
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant. Answer questions using only the provided context."),
Message::human(prompt),
]);
let response = model.chat(request).await?;
println!("Answer: {}", response.message.content());
// Output: Answer: Rust was first released in 2015.
Ok(())
}
In production, you would replace ScriptedChatModel with a real provider like OpenAiChatModel (from synaptic::openai) or AnthropicChatModel (from synaptic::anthropic).
Building RAG with LCEL Chains
For a more composable approach, you can integrate the retrieval step into an LCEL pipeline using RunnableParallel, RunnableLambda, and the pipe operator. This lets you express the RAG pattern as a single chain:
+---> retriever ---> format context ---+
| |
input (query) ---+ +---> prompt ---> model ---> parser
| |
+---> passthrough (question) ----------+
Each step is a Runnable, and they compose with |. See the Runnables how-to guides for details on RunnableParallel and RunnableLambda.
Summary
In this tutorial you learned how to:
- Load documents with TextLoader and FileLoader
- Split documents into retrieval-friendly chunks with RecursiveCharacterTextSplitter
- Embed and store chunks in an InMemoryVectorStore
- Retrieve relevant documents with VectorStoreRetriever
- Combine retrieved context with a prompt to generate grounded answers
Next Steps
- Build a Graph Workflow -- orchestrate multi-step agent logic with a state graph
- Retrieval How-to Guides -- BM25, multi-query, ensemble, and compression retrievers
- Retrieval Concepts -- deeper look at embedding and retrieval strategies
Build a ReAct Agent
This tutorial walks you through building a ReAct (Reasoning + Acting) agent that can decide when to call tools and when to respond to the user. You will define a custom tool, wire it into a prebuilt agent graph, and watch the agent loop through reasoning and tool execution.
What is a ReAct Agent?
A ReAct agent follows a loop:
- Reason -- The LLM looks at the conversation so far and decides what to do next.
- Act -- If the LLM determines it needs information, it emits one or more tool calls.
- Observe -- The tool results are added to the conversation as Tool messages.
- Repeat -- The LLM reviews the tool output and either calls more tools or produces a final answer.
Synaptic provides create_react_agent(model, tools), which builds a compiled StateGraph that implements this loop automatically.
Prerequisites
Add the required crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["agent", "macros"] }
async-trait = "0.1"
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Define a Custom Tool
The easiest way to define a tool in Synaptic is with the #[tool] macro. Write an async function, add a doc comment (this becomes the description the LLM sees), and the macro generates the struct, Tool trait implementation, and a factory function automatically.
use serde_json::json;
use synaptic::core::SynapticError;
use synaptic::macros::tool;
/// Adds two numbers.
#[tool]
async fn add(
/// The first number
a: i64,
/// The second number
b: i64,
) -> Result<serde_json::Value, SynapticError> {
Ok(json!({ "value": a + b }))
}
The function parameters are automatically mapped to a JSON Schema that tells the LLM what arguments to provide. Parameter doc comments become "description" fields in the schema. In production, you can use Option<T> for optional parameters and #[default = value] for defaults. See Procedural Macros for the full reference.
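Building on that, here is a hypothetical tool that uses an optional parameter and a default value. The search_catalog name and its body are invented for illustration, and the attribute syntax follows the description above, so treat the exact form as an assumption and check the macro reference:
use serde_json::json;
use synaptic::core::SynapticError;
use synaptic::macros::tool;

/// Searches a product catalog.
#[tool]
async fn search_catalog(
    /// Free-text search query
    query: String,
    /// Optional category filter
    category: Option<String>,
    /// Maximum number of results
    #[default = 10]
    limit: i64,
) -> Result<serde_json::Value, SynapticError> {
    // A real tool would query a database here; this sketch just echoes its arguments.
    Ok(json!({ "query": query, "category": category, "limit": limit }))
}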
Step 2: Create a Chat Model
For this tutorial we build a simple demo model that simulates the ReAct loop. On the first call (when there is no tool output in the conversation yet), it returns a tool call. On the second call (after tool output has been added), it returns a final text answer.
use async_trait::async_trait;
use serde_json::json;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError, ToolCall};
struct DemoModel;
#[async_trait]
impl ChatModel for DemoModel {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, SynapticError> {
let has_tool_output = request.messages.iter().any(|m| m.is_tool());
if !has_tool_output {
// First turn: ask to call the "add" tool
Ok(ChatResponse {
message: Message::ai_with_tool_calls(
"I will use a tool to calculate this.",
vec![ToolCall {
id: "call-1".to_string(),
name: "add".to_string(),
arguments: json!({ "a": 7, "b": 5 }),
}],
),
usage: None,
})
} else {
// Second turn: the tool result is in, produce the final answer
Ok(ChatResponse {
message: Message::ai("The result is 12."),
usage: None,
})
}
}
}
In a real application you would use one of the provider adapters (OpenAiChatModel from synaptic::openai, AnthropicChatModel from synaptic::anthropic, etc.) instead of a scripted model.
Step 3: Build the Agent Graph
create_react_agent takes a model and a vector of tools, and returns a CompiledGraph<MessageState>. Under the hood, it creates two nodes:
- "agent" -- calls the
ChatModelwith the current messages and tool definitions. - "tools" -- executes any tool calls from the agent's response using a
ToolNode.
A conditional edge routes from "agent" to "tools" if the response contains tool calls, or to END if it does not. An unconditional edge routes from "tools" back to "agent" so the model can review the results.
use std::sync::Arc;
use synaptic::core::Tool;
use synaptic::graph::create_react_agent;
let model = Arc::new(DemoModel);
let tools: Vec<Arc<dyn Tool>> = vec![add()];
let graph = create_react_agent(model, tools).unwrap();
The add() factory function (generated by #[tool]) returns Arc<dyn Tool>, so it can be used directly in the tools vector. The model is wrapped in Arc because the graph needs shared ownership -- nodes may be invoked concurrently in more complex workflows.
Step 4: Run the Agent
Create an initial MessageState with the user's question and invoke the graph:
use synaptic::core::Message;
use synaptic::graph::MessageState;
let initial_state = MessageState {
messages: vec![Message::human("What is 7 + 5?")],
};
let result = graph.invoke(initial_state).await.unwrap();
let last = result.last_message().unwrap();
println!("agent answer: {}", last.content());
// Output: agent answer: The result is 12.
MessageState is the built-in state type for conversational agents. It holds a Vec<Message> that grows as the agent loop progresses. After invocation, last_message() returns the final message in the conversation -- typically the agent's answer.
Full Working Example
Here is the complete program that ties all the pieces together:
use std::sync::Arc;
use async_trait::async_trait;
use serde_json::json;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError, Tool, ToolCall};
use synaptic::graph::{create_react_agent, MessageState};
use synaptic::macros::tool;
// --- Model ---
struct DemoModel;
#[async_trait]
impl ChatModel for DemoModel {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, SynapticError> {
let has_tool_output = request.messages.iter().any(|m| m.is_tool());
if !has_tool_output {
Ok(ChatResponse {
message: Message::ai_with_tool_calls(
"I will use a tool to calculate this.",
vec![ToolCall {
id: "call-1".to_string(),
name: "add".to_string(),
arguments: json!({ "a": 7, "b": 5 }),
}],
),
usage: None,
})
} else {
Ok(ChatResponse {
message: Message::ai("The result is 12."),
usage: None,
})
}
}
}
// --- Tool ---
/// Adds two numbers.
#[tool]
async fn add(
/// The first number
a: i64,
/// The second number
b: i64,
) -> Result<serde_json::Value, SynapticError> {
Ok(json!({ "value": a + b }))
}
// --- Main ---
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let model = Arc::new(DemoModel);
let tools: Vec<Arc<dyn Tool>> = vec![add()];
let graph = create_react_agent(model, tools)?;
let initial_state = MessageState {
messages: vec![Message::human("What is 7 + 5?")],
};
let result = graph.invoke(initial_state).await?;
let last = result.last_message().unwrap();
println!("agent answer: {}", last.content());
Ok(())
}
How the Loop Executes
Here is the sequence of events when you run this example:
| Step | Node | What happens |
|---|---|---|
| 1 | agent | Receives [Human("What is 7 + 5?")]. Returns an AI message with a ToolCall for add(a=7, b=5). |
| 2 | routing | The conditional edge sees tool calls in the last message and routes to tools. |
| 3 | tools | ToolNode looks up "add" in the registry, calls the add tool's call method, and appends a Tool message with {"value": 12}. |
| 4 | edge | The unconditional edge routes from tools back to agent. |
| 5 | agent | Receives the full conversation including the tool result. Returns AI("The result is 12.") with no tool calls. |
| 6 | routing | No tool calls in the last message, so the conditional edge routes to END. |
The graph terminates and returns the final MessageState.
Next Steps
- Build a Graph Workflow -- build custom state graphs with conditional edges
- Tool Choice -- control which tools the model can call
- Human-in-the-Loop -- add interrupt points for human review
- Checkpointing -- persist agent state across invocations
Build a Graph Workflow
This tutorial walks you through building a custom multi-step workflow using Synaptic's LangGraph-style state graph. You will learn how to define nodes, wire them with edges, stream execution events, add conditional routing, and visualize the graph.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["graph"] }
async-trait = "0.1"
futures = "0.3"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
How State Graphs Work
A Synaptic state graph is a directed graph where:
- Nodes are processing steps. Each node takes the current state, transforms it, and returns the new state.
- Edges connect nodes. Fixed edges always route to the same target; conditional edges choose the target at runtime based on the state.
- State is a value that flows through the graph. It carries all the data nodes need to read and write.
The lifecycle is:
START ---> node_a ---> node_b ---> node_c ---> END
| | |
v v v
state_0 --> state_1 --> state_2 --> state_3
Each node receives the state, processes it, and passes the updated state to the next node. The graph terminates when execution reaches the END sentinel.
Step 1: Define the State
The simplest built-in state is MessageState, which holds a Vec<Message>. It is suitable for most agent and chatbot workflows:
use synaptic::graph::MessageState;
use synaptic::core::Message;
let state = MessageState::with_messages(vec![
Message::human("Hi"),
]);
MessageState implements the State trait, which requires a merge() method. When states are merged (e.g., during checkpointing or human-in-the-loop updates), MessageState appends the new messages to the existing list.
For custom workflows, you can implement State on your own types. The trait requires Clone + Send + Sync + 'static and a merge method:
use serde::{Serialize, Deserialize};
use synaptic::graph::State;
#[derive(Debug, Clone, Serialize, Deserialize)]
struct MyState {
counter: u32,
results: Vec<String>,
}
impl State for MyState {
fn merge(&mut self, other: Self) {
self.counter += other.counter;
self.results.extend(other.results);
}
}
Step 2: Define Nodes
A node is any type that implements the Node<S> trait. The trait has a single async method, process, which takes the state and returns the updated state:
use async_trait::async_trait;
use synaptic::core::{Message, SynapticError};
use synaptic::graph::{MessageState, Node};
struct GreetNode;
#[async_trait]
impl Node<MessageState> for GreetNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Hello! Let me help you."));
Ok(state)
}
}
struct ProcessNode;
#[async_trait]
impl Node<MessageState> for ProcessNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Processing your request..."));
Ok(state)
}
}
struct FinalizeNode;
#[async_trait]
impl Node<MessageState> for FinalizeNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Done! Here's the result."));
Ok(state)
}
}
For simpler cases, you can use FnNode to wrap an async closure without defining a separate struct:
use synaptic::graph::FnNode;
let greet = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello!"));
Ok(state)
});
Step 3: Build and Compile the Graph
Use StateGraph to wire nodes and edges into a workflow, then call compile() to produce an executable CompiledGraph:
use synaptic::graph::{StateGraph, END};
let graph = StateGraph::new()
.add_node("greet", GreetNode)
.add_node("process", ProcessNode)
.add_node("finalize", FinalizeNode)
.set_entry_point("greet")
.add_edge("greet", "process")
.add_edge("process", "finalize")
.add_edge("finalize", END)
.compile()?;
The builder methods are chainable:
- add_node(name, node) -- registers a named node.
- set_entry_point(name) -- designates the first node to execute.
- add_edge(source, target) -- adds a fixed edge between two nodes (use END as the target to terminate).
- compile() -- validates the graph and returns a CompiledGraph. It returns an error if the entry point is missing or if any edge references a non-existent node.
Step 4: Invoke the Graph
Call invoke() with an initial state. The graph executes each node in sequence according to the edges, and returns the final state:
use synaptic::core::Message;
use synaptic::graph::MessageState;
let state = MessageState::with_messages(vec![Message::human("Hi")]);
let result = graph.invoke(state).await?;
for msg in &result.messages {
println!("{}: {}", msg.role(), msg.content());
}
Output:
human: Hi
ai: Hello! Let me help you.
ai: Processing your request...
ai: Done! Here's the result.
Step 5: Stream Execution
For real-time feedback, use stream() to receive a GraphEvent after each node completes. Each event contains the node name and the current state snapshot:
use futures::StreamExt;
use synaptic::graph::StreamMode;
let state = MessageState::with_messages(vec![Message::human("Hi")]);
let mut stream = graph.stream(state, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node '{}' completed, {} messages in state",
event.node, event.state.messages.len());
}
Output:
Node 'greet' completed, 2 messages in state
Node 'process' completed, 3 messages in state
Node 'finalize' completed, 4 messages in state
StreamMode controls what each event contains:
- StreamMode::Values -- the event's state is the full accumulated state after the node ran.
- StreamMode::Updates -- the event's state holds what that node produced, useful for observing per-node changes.
Step 6: Add Conditional Edges
Real workflows often need branching logic. Use add_conditional_edges with a routing function that inspects the state and returns the name of the next node:
use std::collections::HashMap;
use synaptic::graph::{StateGraph, END};
let graph = StateGraph::new()
.add_node("greet", GreetNode)
.add_node("process", ProcessNode)
.add_node("finalize", FinalizeNode)
.set_entry_point("greet")
.add_edge("greet", "process")
.add_conditional_edges_with_path_map(
"process",
|state: &MessageState| {
if state.messages.len() > 3 {
"finalize".to_string()
} else {
"process".to_string()
}
},
HashMap::from([
("finalize".to_string(), "finalize".to_string()),
("process".to_string(), "process".to_string()),
]),
)
.add_edge("finalize", END)
.compile()?;
In this example, the process node loops back to itself until the state has more than 3 messages, at which point it routes to finalize.
There are two variants:
- add_conditional_edges(source, router_fn) -- the routing function returns a node name directly. Simple, but visualization tools cannot display the possible targets.
- add_conditional_edges_with_path_map(source, router_fn, path_map) -- also provides a HashMap<String, String> that maps labels to target node names. This enables visualization tools to show all possible routing targets.
The routing function must be Fn(&S) -> String + Send + Sync + 'static. It receives a reference to the current state and returns the name of the target node (or END to terminate).
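For example, a standalone router that loops on process until an AI message appears and then terminates might look like this (a sketch; it assumes END converts to a String via to_string()):
use synaptic::graph::{MessageState, END};
// Route back to "process" until the state contains an AI message, then stop.
fn route_or_end(state: &MessageState) -> String {
    if state.messages.iter().any(|m| m.is_ai()) {
        END.to_string()
    } else {
        "process".to_string()
    }
}
You can pass a function like this (or a closure) as the routing argument to add_conditional_edges.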
Step 7: Visualize the Graph
CompiledGraph provides several methods for visualizing the graph structure. These are useful for debugging and documentation.
Mermaid Diagram
println!("{}", graph.draw_mermaid());
Produces a Mermaid flowchart that can be rendered by GitHub, GitLab, or any Mermaid-compatible viewer:
graph TD
__start__(["__start__"])
greet["greet"]
process["process"]
finalize["finalize"]
__end__(["__end__"])
__start__ --> greet
greet --> process
finalize --> __end__
process -.-> |finalize| finalize
process -.-> |process| process
Fixed edges appear as solid arrows (-->), conditional edges as dashed arrows (-.->) with labels.
ASCII Summary
println!("{}", graph.draw_ascii());
Produces a compact text summary:
Graph:
Nodes: finalize, greet, process
Entry: __start__ -> greet
Edges:
finalize -> __end__
greet -> process
process -> finalize | process [conditional]
Other Formats
- draw_dot() -- produces a Graphviz DOT string, suitable for rendering with the dot command.
- draw_png(path) -- renders the graph as a PNG image using Graphviz (requires dot to be installed).
- draw_mermaid_png(path) -- renders via the mermaid.ink API (requires internet access).
- draw_mermaid_svg(path) -- renders as SVG via the mermaid.ink API.
Complete Example
Here is the full program combining all the concepts:
use std::collections::HashMap;
use async_trait::async_trait;
use futures::StreamExt;
use synaptic::core::{Message, SynapticError};
use synaptic::graph::{MessageState, Node, StateGraph, StreamMode, END};
struct GreetNode;
#[async_trait]
impl Node<MessageState> for GreetNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Hello! Let me help you."));
Ok(state)
}
}
struct ProcessNode;
#[async_trait]
impl Node<MessageState> for ProcessNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Processing your request..."));
Ok(state)
}
}
struct FinalizeNode;
#[async_trait]
impl Node<MessageState> for FinalizeNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Done! Here's the result."));
Ok(state)
}
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// Build the graph with a conditional loop
let graph = StateGraph::new()
.add_node("greet", GreetNode)
.add_node("process", ProcessNode)
.add_node("finalize", FinalizeNode)
.set_entry_point("greet")
.add_edge("greet", "process")
.add_conditional_edges_with_path_map(
"process",
|state: &MessageState| {
if state.messages.len() > 3 {
"finalize".to_string()
} else {
"process".to_string()
}
},
HashMap::from([
("finalize".to_string(), "finalize".to_string()),
("process".to_string(), "process".to_string()),
]),
)
.add_edge("finalize", END)
.compile()?;
// Visualize the graph
println!("=== Graph Structure ===");
println!("{}", graph.draw_ascii());
println!();
println!("=== Mermaid ===");
println!("{}", graph.draw_mermaid());
println!();
// Stream execution
println!("=== Execution ===");
let state = MessageState::with_messages(vec![Message::human("Hi")]);
let mut stream = graph.stream(state, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
let last_msg = event.state.last_message().unwrap();
println!("[{}] {}: {}", event.node, last_msg.role(), last_msg.content());
}
Ok(())
}
Output:
=== Graph Structure ===
Graph:
Nodes: finalize, greet, process
Entry: __start__ -> greet
Edges:
finalize -> __end__
greet -> process
process -> finalize | process [conditional]
=== Mermaid ===
graph TD
__start__(["__start__"])
finalize["finalize"]
greet["greet"]
process["process"]
__end__(["__end__"])
__start__ --> greet
finalize --> __end__
greet --> process
process -.-> |finalize| finalize
process -.-> |process| process
=== Execution ===
[greet] ai: Hello! Let me help you.
[process] ai: Processing your request...
[process] ai: Processing your request...
[finalize] ai: Done! Here's the result.
The process node executes twice because on the first pass the state has only 3 messages (the human message plus greet and process outputs), so the conditional edge loops back. On the second pass it has 4 messages, which exceeds the threshold, and routing proceeds to finalize.
Summary
In this tutorial you learned how to:
- Define graph state with MessageState or a custom State type
- Create nodes by implementing the Node<S> trait or using FnNode
- Build a graph with StateGraph using fixed and conditional edges
- Execute a graph with invoke() or stream it with stream()
- Visualize the graph with Mermaid, ASCII, DOT, and image output
Next Steps
- Build a ReAct Agent -- use the prebuilt create_react_agent helper for tool-calling agents
- Graph How-to Guides -- checkpointing, human-in-the-loop, streaming, and tool nodes
- Graph Concepts -- deeper look at state machines and the LangGraph execution model
Build a Deep Agent
This tutorial walks you through building a Deep Agent step by step. You will start with a minimal agent that can read and write files, then progressively add skills, subagents, memory, and custom configuration. By the end you will understand every layer of the deep agent stack.
What You Will Build
A Deep Agent that:
- Uses filesystem tools to read, write, and search files.
- Loads domain-specific skills from SKILL.md files.
- Delegates subtasks to custom subagents.
- Persists learned knowledge in an AGENTS.md memory file.
- Auto-summarizes conversation history when context grows large.
Prerequisites
Create a new binary crate:
cargo new deep-agent-tutorial
cd deep-agent-tutorial
Add dependencies to Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["deep", "openai"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Set your OpenAI API key:
export OPENAI_API_KEY="sk-..."
Step 1: Create a Backend
Every deep agent needs a backend that provides filesystem operations. The backend is the agent's view of the world -- it determines where files are read from and written to.
Synaptic ships three backend implementations:
- StateBackend -- in-memory HashMap<String, String>. Great for tests and sandboxed demos. No real files are touched.
- StoreBackend -- delegates to a Synaptic Store implementation. Useful when you already have a store with semantic search.
- FilesystemBackend -- reads and writes real files on disk, sandboxed to a root directory. Requires the filesystem feature flag.
For this tutorial we use StateBackend so everything runs in memory:
use std::sync::Arc;
use synaptic::deep::backend::{Backend, StateBackend};
let backend = Arc::new(StateBackend::new());
The deep agent wraps each backend operation as a tool that the model can call.
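You can also drive the backend directly, which is handy in tests -- these are the same operations the generated tools perform (a sketch reusing the write_file and read_file signatures shown later in this tutorial):
// Write a file into the in-memory backend, then read it back.
backend.write_file("notes.txt", "scratch space").await?;
let text = backend.read_file("notes.txt", 0, 100).await?;
assert!(text.contains("scratch space"));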
Step 2: Create a Minimal Deep Agent
The create_deep_agent function assembles a full middleware stack and tool set in one call. It returns a CompiledGraph<MessageState> -- the same graph type used by create_agent and create_react_agent, so you run it with invoke().
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::deep::backend::StateBackend;
use synaptic::core::{ChatModel, Message};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(StateBackend::new());
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model.clone(), options)?;
let state = MessageState::with_messages(vec![
Message::human("Create a file called hello.txt with 'Hello World!'"),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
println!("{}", final_state.last_message().unwrap().content());
Ok(())
}
What happens under the hood:
- DeepAgentOptions::new(backend) configures sensible defaults -- filesystem tools enabled, skills enabled, memory enabled, subagents enabled.
- create_deep_agent assembles 6 middleware layers and 6-7 tools, then calls create_agent to produce a compiled graph.
- agent.invoke(state) runs the agent loop. The model sees the write_file tool and calls it to create hello.txt in the backend.
- result.into_state() unwraps the GraphResult into the final MessageState.
Because we are using StateBackend, the file lives only in memory. You can verify it:
let content = backend.read_file("hello.txt", 0, 100).await?;
assert!(content.contains("Hello World!"));
Step 3: Use Filesystem Tools
The deep agent automatically registers these tools: ls, read_file, write_file, edit_file, glob, grep, and execute (if the backend supports shell commands).
Let us seed the backend with a small Rust project and ask the agent to analyze it:
// Seed files into the in-memory backend
backend.write_file("src/main.rs", r#"fn main() {
let items = vec![1, 2, 3, 4, 5];
let mut total = 0;
for i in items {
total = total + i;
}
println!("Total: {}", total);
// TODO: add error handling
// TODO: extract into a function
}
"#).await?;
backend.write_file("Cargo.toml", r#"[package]
name = "sample"
version = "0.1.0"
edition = "2021"
"#).await?;
let state = MessageState::with_messages(vec![
Message::human("Read src/main.rs. List all the TODO comments and suggest improvements."),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
println!("{}", final_state.last_message().unwrap().content());
The agent calls read_file to get the source, finds the TODO comments, and responds with suggestions. You can follow up with a write request:
let state = MessageState::with_messages(vec![
Message::human(
"Create src/lib.rs with a public function `sum_items(items: &[i32]) -> i32` \
that uses iter().sum(). Then update src/main.rs to use it."
),
]);
let result = agent.invoke(state).await?;
The agent uses write_file and edit_file to make the changes.
Step 4: Add Skills
Skills are domain-specific instructions stored as SKILL.md files in the backend. The SkillsMiddleware scans {skills_dir}/*/SKILL.md on each model call, parses YAML frontmatter for name and description, and injects a skill index into the system prompt. The agent can then read_file any skill for full details.
Write a skill file directly to the backend:
backend.write_file(
".skills/testing/SKILL.md",
"---\nname: testing\ndescription: Write comprehensive tests\n---\n\
Testing Skill\n\n\
When asked to test Rust code:\n\n\
1. Create a `tests/` module with `#[cfg(test)]`.\n\
2. Write at least one happy-path test and one edge-case test.\n\
3. Use `assert_eq!` with descriptive messages.\n\
4. Test error paths with `assert!(result.is_err())`.\n"
).await?;
Skills are enabled by default (enable_skills = true). When the agent processes a request, it sees the skill index in its system prompt:
<available_skills>
- **testing**: Write comprehensive tests (read `.skills/testing/SKILL.md` for details)
</available_skills>
The agent can call read_file on .skills/testing/SKILL.md to get the full instructions. This is progressive disclosure -- the index is always small, and full skill content is loaded on demand.
You can add multiple skills:
backend.write_file(
".skills/refactoring/SKILL.md",
"---\nname: refactoring\ndescription: Rust refactoring best practices\n---\n\
Refactoring Skill\n\n\
1. Prefer `iter().sum()` over manual loops.\n\
2. Add `#[must_use]` to pure functions.\n\
3. Run clippy before and after changes.\n"
).await?;
Step 5: Add Custom Subagents
The deep agent can spawn child agents via a task tool. Each child gets its own conversation, runs the same middleware stack, and returns a summary to the parent.
Define custom subagent types with SubAgentDef:
use synaptic::deep::SubAgentDef;
let mut options = DeepAgentOptions::new(backend.clone());
options.subagents = vec![SubAgentDef {
name: "researcher".to_string(),
description: "Research specialist".to_string(),
system_prompt: "You are a research assistant. Use grep and read_file to \
find information in the codebase. Report findings concisely."
.to_string(),
tools: vec![], // inherits filesystem tools from the deep agent
}];
let agent = create_deep_agent(model.clone(), options)?;
When the model calls the task tool, it passes a description and an optional agent_type. If agent_type matches a SubAgentDef name, the child uses that definition's system prompt and extra tools. Otherwise a general-purpose child agent is spawned.
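For illustration, a task call emitted by the model might look roughly like this -- the description and agent_type argument names follow the prose above, not a verified schema:
use synaptic::core::ToolCall;
use serde_json::json;
// Hypothetical task call targeting the "researcher" subagent defined above.
let call = ToolCall {
    id: "call_task_1".to_string(),
    name: "task".to_string(),
    arguments: json!({
        "description": "Find every TODO comment in src/ and summarize them",
        "agent_type": "researcher"
    }),
};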
Subagent depth is bounded by max_subagent_depth (default 3) to prevent runaway recursion. You can disable subagents entirely:
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_subagents = false;
let agent = create_deep_agent(model.clone(), options)?;
Step 6: Add Memory Persistence
The DeepMemoryMiddleware loads a memory file from the backend on each model call and injects it into the system prompt wrapped in <agent_memory> tags. Write an initial memory file:
backend.write_file(
"AGENTS.md",
"# Agent Memory\n\n\
- Always use Rust idioms\n\
- Prefer async/await over blocking I/O\n\
- User prefers 4-space indentation\n"
).await?;
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_memory = true; // this is already the default
let agent = create_deep_agent(model.clone(), options)?;
The agent now sees this in its system prompt on every call:
<agent_memory>
# Agent Memory
- Always use Rust idioms
- Prefer async/await over blocking I/O
- User prefers 4-space indentation
</agent_memory>
The memory file path defaults to "AGENTS.md". You can change it:
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("project-notes.md".to_string());
The agent can update memory by calling write_file or edit_file on the memory file. Future sessions will pick up the changes automatically.
Step 7: Customize Options
DeepAgentOptions gives you control over the entire agent stack:
let mut options = DeepAgentOptions::new(backend.clone());
// System prompt prepended to all model calls
options.system_prompt = Some("You are a coding assistant.".to_string());
// Token budget and summarization
options.max_input_tokens = 128_000; // default
options.summarization_threshold = 0.85; // default (85% of max)
options.eviction_threshold = 20_000; // evict large tool results (default)
// Subagent configuration
options.max_subagent_depth = 3; // default
options.enable_subagents = true; // default
// Feature toggles
options.enable_filesystem = true; // default
options.enable_skills = true; // default
options.enable_memory = true; // default
// Paths in the backend
options.skills_dir = Some(".skills".to_string()); // default
options.memory_file = Some("AGENTS.md".to_string()); // default
// Extensibility: add your own tools, middleware, checkpointer, or store
options.tools = vec![];
options.middleware = vec![];
options.checkpointer = None;
options.store = None;
options.subagents = vec![];
let agent = create_deep_agent(model.clone(), options)?;
Step 8: Putting It All Together
Here is a complete example that combines everything:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, SubAgentDef};
use synaptic::deep::backend::StateBackend;
use synaptic::core::{ChatModel, Message};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(StateBackend::new());
// Seed the workspace
backend.write_file("src/main.rs", "fn main() {\n println!(\"hello\");\n}\n").await?;
// Add a skill
backend.write_file(
".skills/testing/SKILL.md",
"---\nname: testing\ndescription: Write comprehensive tests\n---\n# Testing\nAlways write unit tests.\n"
).await?;
// Add agent memory
backend.write_file("AGENTS.md", "# Memory\n- Use Rust 2021 edition\n").await?;
// Configure the deep agent
let mut options = DeepAgentOptions::new(backend.clone());
options.system_prompt = Some("You are a senior Rust engineer. Be concise.".to_string());
options.max_input_tokens = 64_000;
options.summarization_threshold = 0.80;
options.max_subagent_depth = 2;
options.subagents = vec![SubAgentDef {
name: "researcher".to_string(),
description: "Code research specialist".to_string(),
system_prompt: "You research codebases and report findings.".to_string(),
tools: vec![],
}];
let agent = create_deep_agent(model, options)?;
// Run the agent
let state = MessageState::with_messages(vec![
Message::human(
"Audit this project: read all source files, find TODOs, \
and write a summary to REPORT.md."
),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
println!("{}", final_state.last_message().unwrap().content());
// Verify the report was created
let report = backend.read_file("REPORT.md", 0, 100).await?;
println!("--- REPORT.md ---\n{}", report);
Ok(())
}
How the Middleware Stack Works
create_deep_agent assembles this middleware stack in order:
- DeepMemoryMiddleware -- reads AGENTS.md and appends it to the system prompt.
- SkillsMiddleware -- scans .skills/*/SKILL.md and injects a skill index into the system prompt.
- FilesystemMiddleware -- registers filesystem tools. Evicts results larger than eviction_threshold tokens to .evicted/ files with a preview.
- SubAgentMiddleware -- provides the task tool for spawning child agents.
- DeepSummarizationMiddleware -- summarizes older messages when token count exceeds the threshold, saving full history to .context/history_N.md.
- PatchToolCallsMiddleware -- fixes malformed tool calls (strips code fences, deduplicates IDs, removes empty names).
- User middleware -- anything in options.middleware runs last.
Using a Real Filesystem Backend
For production use, enable the filesystem feature to work with real files:
[dependencies]
synaptic = { version = "0.2", features = ["deep", "openai"] }
synaptic-deep = { version = "0.2", features = ["filesystem"] }
Note: The filesystem feature is on the synaptic-deep crate directly because the synaptic facade does not forward it. Add synaptic-deep as an explicit dependency when you need FilesystemBackend.
use synaptic::deep::backend::FilesystemBackend;
let backend = Arc::new(FilesystemBackend::new("/path/to/workspace"));
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
FilesystemBackend sandboxes all operations to the root directory. Path traversal via .. is rejected. It also supports shell command execution via the execute tool.
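A quick way to see the sandboxing in action (a sketch that reuses the read_file signature from earlier in this tutorial):
// A path that escapes the workspace root should come back as an error.
let escape = backend.read_file("../outside.txt", 0, 100).await;
assert!(escape.is_err());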
Offline Mode (No API Key Required)
For testing and CI, combine StateBackend with ScriptedChatModel to run the entire deep agent without network access:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::deep::backend::StateBackend;
use synaptic::graph::MessageState;
use serde_json::json;
let backend = Arc::new(StateBackend::new());
// Script the model to: 1) write a file, 2) respond
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"Creating the file.",
vec![ToolCall {
id: "call_1".into(),
name: "write_file".into(),
arguments: json!({"path": "/output.txt", "content": "Hello from offline test!"}),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("Done! Created output.txt."),
usage: None,
},
]));
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
let state = MessageState::with_messages(vec![
Message::human("Create output.txt with a greeting."),
]);
let result = agent.invoke(state).await?.into_state();
// Verify the file was created in the virtual filesystem
let content = backend.read_file("/output.txt", 0, 100).await?;
assert!(content.contains("Hello from offline test!"));
This approach is ideal for:
- Unit tests -- deterministic, no API costs, fast execution
- CI pipelines -- no secrets required
- Demos -- runs anywhere without configuration
What You Built
Over the course of this tutorial you:
- Created a StateBackend as an in-memory filesystem for the agent.
- Used create_deep_agent to assemble a full agent with tools and middleware.
- Ran the agent with invoke() on a MessageState and extracted results with into_state().
- Registered built-in filesystem tools (ls, read_file, write_file, edit_file, glob, grep).
- Added domain skills via SKILL.md files with YAML frontmatter.
- Defined custom subagents with SubAgentDef for task delegation.
- Enabled persistent memory via AGENTS.md.
- Customized every option through DeepAgentOptions.
Next Steps
- Multi-Agent Patterns -- supervisor and swarm architectures
- Middleware -- write custom middleware for the agent stack
- Store -- persistent key-value storage with semantic search
Chat Models
Synaptic supports multiple LLM providers through the ChatModel trait defined in synaptic-core. Each provider lives in its own crate, giving you a uniform interface for sending messages and receiving responses -- whether you are using OpenAI, Anthropic, Gemini, or a local Ollama instance.
Providers
Each provider adapter lives in its own crate. You enable only the providers you need via feature flags:
| Provider | Adapter | Crate | Feature |
|---|---|---|---|
| OpenAI | OpenAiChatModel | synaptic-openai | "openai" |
| Anthropic | AnthropicChatModel | synaptic-anthropic | "anthropic" |
| Google Gemini | GeminiChatModel | synaptic-gemini | "gemini" |
| Ollama (local) | OllamaChatModel | synaptic-ollama | "ollama" |
use std::sync::Arc;
use synaptic::openai::OpenAiChatModel;
let model = OpenAiChatModel::new("gpt-4o");
For testing, use ScriptedChatModel (returns pre-defined responses) or FakeBackend (simulates HTTP responses without network calls).
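For example, a scripted model with one canned reply can stand in for a real provider in unit tests (a sketch following the constructor shape used in the deep agent offline example above):
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
// Returns "Scripted reply" for the first chat() call.
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![
    ChatResponse {
        message: Message::ai("Scripted reply"),
        usage: None,
    },
]));
// let response = model.chat(ChatRequest::new(vec![Message::human("Hi")])).await?;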
Wrappers
Synaptic provides composable wrappers that add behavior on top of any ChatModel:
| Wrapper | Purpose |
|---|---|
| RetryChatModel | Automatic retry with exponential backoff |
| RateLimitedChatModel | Concurrency-based rate limiting (semaphore) |
| TokenBucketChatModel | Token bucket rate limiting |
| StructuredOutputChatModel<T> | JSON schema enforcement for structured output |
| CachedChatModel | Response caching (exact-match or semantic) |
| BoundToolsChatModel | Automatically attach tool definitions to every request |
All wrappers implement ChatModel, so they can be stacked:
use std::sync::Arc;
use synaptic::models::{RetryChatModel, RetryPolicy, RateLimitedChatModel};
let model: Arc<dyn ChatModel> = Arc::new(base_model);
let with_retry = Arc::new(RetryChatModel::new(model, RetryPolicy::default()));
let with_rate_limit = RateLimitedChatModel::new(with_retry, 5);
Guides
- Streaming Responses -- consume tokens as they arrive with stream_chat()
- Control Tool Choice -- force, prevent, or target specific tool usage
- Structured Output -- get typed Rust structs from LLM responses
- Caching LLM Responses -- avoid redundant API calls with in-memory or semantic caching
- Retry & Rate Limiting -- handle transient failures and control request throughput
- Model Profiles -- query model capabilities and limits at runtime
Streaming Responses
This guide shows how to consume LLM responses as a stream of tokens, rather than waiting for the entire response to complete.
Overview
Every ChatModel in Synaptic provides two methods:
- chat() -- returns a complete ChatResponse once the model finishes generating.
- stream_chat() -- returns a ChatStream, which yields AIMessageChunk items as the model produces them.
Streaming is useful for displaying partial results to users in real time.
Basic streaming
Use stream_chat() and iterate over chunks with StreamExt::next():
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, AIMessageChunk};
async fn stream_example(model: &dyn ChatModel) -> Result<(), Box<dyn std::error::Error>> {
let request = ChatRequest::new(vec![
Message::human("Tell me a story about a brave robot"),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
print!("{}", chunk.content); // Print each token as it arrives
}
println!(); // Final newline
Ok(())
}
The ChatStream type is defined as:
type ChatStream<'a> = Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send + 'a>>;
Accumulating chunks into a message
AIMessageChunk supports the + and += operators for merging chunks together. After streaming completes, convert the accumulated result into a full Message:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, AIMessageChunk};
async fn accumulate_stream(model: &dyn ChatModel) -> Result<Message, Box<dyn std::error::Error>> {
let request = ChatRequest::new(vec![
Message::human("Summarize Rust's ownership model"),
]);
let mut stream = model.stream_chat(request);
let mut full = AIMessageChunk::default();
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
full += chunk; // Merge content, tool_calls, usage, etc.
}
let final_message = full.into_message();
println!("Complete response: {}", final_message.content());
Ok(final_message)
}
When merging chunks:
- content strings are concatenated.
- tool_calls are appended to the accumulated list.
- usage token counts are summed.
- The first non-None id is preserved.
Using the + operator
You can also combine two chunks with + without mutation:
let combined = chunk_a + chunk_b;
This produces a new AIMessageChunk with the merged fields from both.
Streaming with tool calls
When the model streams a response that includes tool calls, tool call data arrives across multiple chunks. After accumulation, the full tool call information is available on the resulting message:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, AIMessageChunk, ToolDefinition};
use serde_json::json;
async fn stream_with_tools(model: &dyn ChatModel) -> Result<(), Box<dyn std::error::Error>> {
let tool = ToolDefinition {
name: "get_weather".to_string(),
description: "Get current weather".to_string(),
parameters: json!({"type": "object", "properties": {"city": {"type": "string"}}}),
};
let request = ChatRequest::new(vec![
Message::human("What's the weather in Paris?"),
]).with_tools(vec![tool]);
let mut stream = model.stream_chat(request);
let mut full = AIMessageChunk::default();
while let Some(chunk) = stream.next().await {
full += chunk?;
}
let message = full.into_message();
for tc in message.tool_calls() {
println!("Call tool '{}' with: {}", tc.name, tc.arguments);
}
Ok(())
}
Default streaming behavior
If a provider adapter does not implement native streaming, the default stream_chat() implementation wraps the chat() result as a single-chunk stream. This means you can always use stream_chat() regardless of provider -- you just may not get incremental token delivery from providers that do not support it natively.
Bind Tools to a Model
This guide shows how to include tool (function) definitions in a chat request so the model can decide to call them.
Defining tools
A ToolDefinition describes a tool the model can invoke. It has a name, description, and a JSON Schema for its parameters:
use synaptic::core::ToolDefinition;
use serde_json::json;
let weather_tool = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, e.g. 'Tokyo'"
}
},
"required": ["location"]
}),
};
Sending tools with a request
Use ChatRequest::with_tools() to attach tool definitions to a single request:
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition};
use serde_json::json;
async fn call_with_tools(model: &dyn ChatModel) -> Result<(), Box<dyn std::error::Error>> {
let tool_def = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name"
}
},
"required": ["location"]
}),
};
let request = ChatRequest::new(vec![
Message::human("What's the weather in Tokyo?"),
]).with_tools(vec![tool_def]);
let response = model.chat(request).await?;
// Check if the model decided to call any tools
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
Ok(())
}
Processing tool calls
When the model returns tool calls, each ToolCall contains:
- id -- a unique identifier for this call (used to match the tool result back)
- name -- the name of the tool to invoke
- arguments -- a serde_json::Value with the arguments
After executing the tool, send the result back as a Tool message:
use synaptic::core::{ChatRequest, Message, ToolCall};
use serde_json::json;
// Suppose the model returned a tool call
let tool_call = ToolCall {
id: "call_123".to_string(),
name: "get_weather".to_string(),
arguments: json!({"location": "Tokyo"}),
};
// Execute your tool logic...
let result = "Sunny, 22C";
// Send the result back in a follow-up request
let messages = vec![
Message::human("What's the weather in Tokyo?"),
Message::ai_with_tool_calls("", vec![tool_call]),
Message::tool(result, "call_123"), // tool_call_id must match
];
let follow_up = ChatRequest::new(messages);
// let final_response = model.chat(follow_up).await?;
Permanently binding tools with BoundToolsChatModel
If you want every request through a model to automatically include certain tool definitions, use BoundToolsChatModel:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition};
use synaptic::models::BoundToolsChatModel;
use serde_json::json;
let tools = vec![
ToolDefinition {
name: "get_weather".to_string(),
description: "Get weather for a city".to_string(),
parameters: json!({"type": "object", "properties": {"city": {"type": "string"}}}),
},
ToolDefinition {
name: "search".to_string(),
description: "Search the web".to_string(),
parameters: json!({"type": "object", "properties": {"query": {"type": "string"}}}),
},
];
let base_model: Arc<dyn ChatModel> = Arc::new(model); // any concrete ChatModel implementation
let bound = BoundToolsChatModel::new(base_model, tools);
// Now every call to bound.chat() will include both tools automatically
let request = ChatRequest::new(vec![Message::human("Look up Rust news")]);
// let response = bound.chat(request).await?;
Multiple tools
You can provide any number of tools. The model will choose which (if any) to call based on the conversation context:
let request = ChatRequest::new(vec![
Message::human("Search for Rust news and tell me the weather in Berlin"),
]).with_tools(vec![search_tool, weather_tool, calculator_tool]);
See also: Control Tool Choice for fine-grained control over which tools the model uses.
Control Tool Choice
This guide shows how to control whether and which tools the model uses when responding to a request.
Overview
When you attach tools to a ChatRequest, the model decides by default whether to call any of them. The ToolChoice enum lets you override this behavior, forcing the model to use tools, avoid them, or target a specific one.
The ToolChoice enum
use synaptic::core::ToolChoice;
// Auto -- the model decides whether to use tools (this is the default)
ToolChoice::Auto
// Required -- the model must call at least one tool
ToolChoice::Required
// None -- the model must not call any tools, even if tools are provided
ToolChoice::None
// Specific -- the model must call this exact tool
ToolChoice::Specific("get_weather".to_string())
Setting tool choice on a request
Use ChatRequest::with_tool_choice():
use synaptic::core::{ChatRequest, Message, ToolChoice, ToolDefinition};
use serde_json::json;
let tools = vec![
ToolDefinition {
name: "get_weather".to_string(),
description: "Get weather for a city".to_string(),
parameters: json!({
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}),
},
ToolDefinition {
name: "search".to_string(),
description: "Search the web".to_string(),
parameters: json!({
"type": "object",
"properties": {
"query": { "type": "string" }
},
"required": ["query"]
}),
},
];
let messages = vec![Message::human("What's the weather in London?")];
Auto (default)
The model chooses freely whether to call tools:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Auto);
This is equivalent to not calling with_tool_choice() at all.
Required
Force the model to call at least one tool. Useful when you know the user's intent maps to a tool call:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Required);
None
Prevent the model from calling tools, even though tools are provided. This is helpful when you want to temporarily disable tool usage without removing the definitions:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::None);
Specific
Force the model to call one specific tool by name. The model will always call this tool, regardless of the conversation context:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Specific("get_weather".to_string()));
Practical patterns
Routing with specific tool choice
When building a multi-step agent, you can force a classification step by requiring a specific "router" tool:
let router_tool = ToolDefinition {
name: "route".to_string(),
description: "Classify the user's intent".to_string(),
parameters: json!({
"type": "object",
"properties": {
"intent": {
"type": "string",
"enum": ["weather", "search", "calculator"]
}
},
"required": ["intent"]
}),
};
let request = ChatRequest::new(vec![Message::human("What is 2 + 2?")])
.with_tools(vec![router_tool])
.with_tool_choice(ToolChoice::Specific("route".to_string()));
Two-phase generation
First call with Required to extract structured data, then call with None to generate a natural language response:
// Phase 1: extract data
let extract_request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Required);
// Phase 2: generate response (no tools)
let respond_request = ChatRequest::new(full_conversation)
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::None);
Structured Output
This guide shows how to get typed Rust structs from LLM responses using StructuredOutputChatModel<T>.
Overview
StructuredOutputChatModel<T> wraps any ChatModel and instructs it to respond with valid JSON matching a schema you describe. It injects a system prompt with the schema instructions and provides a parse_response() method to deserialize the JSON into your Rust type.
Basic usage
Define your output type as a struct that implements Deserialize, then wrap your model:
use std::sync::Arc;
use serde::Deserialize;
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::StructuredOutputChatModel;
#[derive(Debug, Deserialize)]
struct MovieReview {
title: String,
rating: f32,
summary: String,
}
async fn get_review(base_model: Arc<dyn ChatModel>) -> Result<(), Box<dyn std::error::Error>> {
let structured = StructuredOutputChatModel::<MovieReview>::new(
base_model,
r#"{"title": "string", "rating": "number (1-10)", "summary": "string"}"#,
);
let request = ChatRequest::new(vec![
Message::human("Review the movie 'Interstellar'"),
]);
// Use generate() to get both the parsed struct and the raw response
let (review, _raw_response) = structured.generate(request).await?;
println!("Title: {}", review.title);
println!("Rating: {}/10", review.rating);
println!("Summary: {}", review.summary);
Ok(())
}
How it works
When you call chat() or generate() on a StructuredOutputChatModel:
- A system message is prepended to the request instructing the model to respond with valid JSON matching the schema description.
- The request is forwarded to the inner model.
- With generate(), the response text is parsed as JSON into your target type T.
The schema description is a free-form string. It does not need to be valid JSON Schema -- it just needs to clearly communicate the expected shape to the LLM:
// Simple field descriptions
let schema = r#"{"name": "string", "age": "integer", "hobbies": ["string"]}"#;
// More detailed descriptions
let schema = r#"{
"sentiment": "one of: positive, negative, neutral",
"confidence": "float between 0.0 and 1.0",
"key_phrases": "array of strings"
}"#;
Parsing responses manually
If you want to use the model as a normal ChatModel and parse later, you can call chat() followed by parse_response():
let structured = StructuredOutputChatModel::<MovieReview>::new(base_model, schema);
let response = structured.chat(request).await?;
let parsed: MovieReview = structured.parse_response(&response)?;
Handling markdown code blocks
The parser automatically handles responses wrapped in markdown code blocks. All of these formats are supported:
{"title": "Interstellar", "rating": 9.0, "summary": "..."}
```json
{"title": "Interstellar", "rating": 9.0, "summary": "..."}
```
```
{"title": "Interstellar", "rating": 9.0, "summary": "..."}
```
Complex output types
You can use nested structs, enums, and collections:
#[derive(Debug, Deserialize)]
struct AnalysisResult {
entities: Vec<Entity>,
sentiment: Sentiment,
language: String,
}
#[derive(Debug, Deserialize)]
struct Entity {
name: String,
entity_type: String,
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "lowercase")]
enum Sentiment {
Positive,
Negative,
Neutral,
}
let structured = StructuredOutputChatModel::<AnalysisResult>::new(
base_model,
r#"{
"entities": [{"name": "string", "entity_type": "person|org|location"}],
"sentiment": "positive|negative|neutral",
"language": "ISO 639-1 code"
}"#,
);
Combining with other wrappers
Since StructuredOutputChatModel<T> implements ChatModel, it composes with other wrappers:
use synaptic::models::{RetryChatModel, RetryPolicy};
let base: Arc<dyn ChatModel> = Arc::new(base_model);
let structured = Arc::new(StructuredOutputChatModel::<MovieReview>::new(
base,
r#"{"title": "string", "rating": "number", "summary": "string"}"#,
));
// Add retry logic on top
let reliable = RetryChatModel::new(structured, RetryPolicy::default());
Caching LLM Responses
This guide shows how to cache LLM responses to avoid redundant API calls and reduce latency.
Overview
Synaptic provides two cache implementations through the LlmCache trait:
- InMemoryCache -- exact-match caching with optional TTL expiration.
- SemanticCache -- embedding-based similarity matching for semantically equivalent queries.
Both are used with CachedChatModel, which wraps any ChatModel and checks the cache before making an API call.
Exact-match caching with InMemoryCache
The simplest cache stores responses keyed by the exact request content:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::cache::{InMemoryCache, CachedChatModel};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
let cache = Arc::new(InMemoryCache::new());
let cached_model = CachedChatModel::new(base_model, cache);
// First call hits the LLM
// let response1 = cached_model.chat(request.clone()).await?;
// Identical request returns cached response instantly
// let response2 = cached_model.chat(request.clone()).await?;
Cache with TTL
Set a time-to-live so entries expire automatically:
use std::time::Duration;
use std::sync::Arc;
use synaptic::cache::InMemoryCache;
// Entries expire after 1 hour
let cache = Arc::new(InMemoryCache::with_ttl(Duration::from_secs(3600)));
// Entries expire after 5 minutes
let cache = Arc::new(InMemoryCache::with_ttl(Duration::from_secs(300)));
After the TTL elapses, a cache lookup for that entry returns None, and the next request will hit the LLM again.
Semantic caching with SemanticCache
Semantic caching uses embeddings to find similar queries, even when the exact wording differs. For example, "What's the weather?" and "Tell me the current weather" could match the same cached response.
use std::sync::Arc;
use synaptic::cache::{SemanticCache, CachedChatModel};
use synaptic::openai::OpenAiEmbeddings;
let embeddings: Arc<dyn synaptic::embeddings::Embeddings> = Arc::new(embeddings_provider);
// Similarity threshold of 0.95 means only very similar queries match
let cache = Arc::new(SemanticCache::new(embeddings, 0.95));
let cached_model = CachedChatModel::new(base_model, cache);
When looking up a cached response:
- The query is embedded using the provided Embeddings implementation.
- The embedding is compared against all stored entries using cosine similarity (see the sketch below).
- If the best match exceeds the similarity threshold, the cached response is returned.
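The comparison itself is plain cosine similarity between embedding vectors. As a rough illustration of the math (not the SemanticCache internals):
// Cosine similarity in [-1, 1]; a lookup hits when the best score meets the threshold.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}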
Choosing a threshold
- 0.95 -- 0.99: Very strict. Only nearly identical queries match. Good for factual Q&A where slight wording changes can change meaning.
- 0.90 -- 0.95: Moderate. Catches common rephrasing. Good for general-purpose chatbots.
- 0.80 -- 0.90: Loose. Broader matching. Useful when you want aggressive caching and approximate answers are acceptable.
The LlmCache trait
Both cache types implement the LlmCache trait:
#[async_trait]
pub trait LlmCache: Send + Sync {
async fn get(&self, key: &str) -> Result<Option<ChatResponse>, SynapticError>;
async fn put(&self, key: &str, response: &ChatResponse) -> Result<(), SynapticError>;
async fn clear(&self) -> Result<(), SynapticError>;
}
You can implement this trait for custom cache backends (Redis, SQLite, etc.).
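As a sketch of what that might look like -- an in-memory map standing in for Redis or SQLite, and assuming ChatResponse implements Clone:
use std::collections::HashMap;
use std::sync::Mutex;
use async_trait::async_trait;
use synaptic::cache::LlmCache;
use synaptic::core::{ChatResponse, SynapticError};
// Hypothetical custom backend; swap the HashMap for a real store.
struct MyCache {
    entries: Mutex<HashMap<String, ChatResponse>>,
}
#[async_trait]
impl LlmCache for MyCache {
    async fn get(&self, key: &str) -> Result<Option<ChatResponse>, SynapticError> {
        // Assumes ChatResponse: Clone.
        Ok(self.entries.lock().unwrap().get(key).cloned())
    }
    async fn put(&self, key: &str, response: &ChatResponse) -> Result<(), SynapticError> {
        self.entries.lock().unwrap().insert(key.to_string(), response.clone());
        Ok(())
    }
    async fn clear(&self) -> Result<(), SynapticError> {
        self.entries.lock().unwrap().clear();
        Ok(())
    }
}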
Clearing the cache
Both cache implementations support clearing all entries:
use synaptic::cache::LlmCache;
// cache implements LlmCache
// cache.clear().await?;
Combining with other wrappers
Since CachedChatModel implements ChatModel, it composes with retry, rate limiting, and other wrappers:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::cache::{InMemoryCache, CachedChatModel};
use synaptic::models::{RetryChatModel, RetryPolicy};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Cache first, then retry on cache miss + API failure
let cache = Arc::new(InMemoryCache::new());
let cached = Arc::new(CachedChatModel::new(base_model, cache));
let reliable = RetryChatModel::new(cached, RetryPolicy::default());
Retry & Rate Limiting
This guide shows how to add automatic retry logic and rate limiting to any ChatModel.
Retry with RetryChatModel
RetryChatModel wraps a model and automatically retries on transient failures (rate limit errors and timeouts). It uses exponential backoff between attempts.
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::{RetryChatModel, RetryPolicy};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Use default policy: 3 attempts, 500ms base delay
let retry_model = RetryChatModel::new(base_model, RetryPolicy::default());
Custom retry policy
Configure the maximum number of attempts and the base delay for exponential backoff:
use std::time::Duration;
use synaptic::models::RetryPolicy;
let policy = RetryPolicy {
max_attempts: 5, // Try up to 5 times
base_delay: Duration::from_millis(200), // Start with 200ms delay
};
let retry_model = RetryChatModel::new(base_model, policy);
The delay between retries follows exponential backoff: base_delay * 2^attempt. With a 200ms base delay:
| Attempt | Delay before retry |
|---|---|
| 1st retry | 200ms |
| 2nd retry | 400ms |
| 3rd retry | 800ms |
| 4th retry | 1600ms |
Only retryable errors trigger retries:
- SynapticError::RateLimit -- the provider returned a rate limit response.
- SynapticError::Timeout -- the request timed out.
All other errors are returned immediately without retrying.
Streaming with retry
RetryChatModel also retries stream_chat() calls. If a retryable error occurs during streaming, the entire stream is retried from the beginning.
Concurrency limiting with RateLimitedChatModel
RateLimitedChatModel uses a semaphore to limit the number of concurrent requests to the underlying model:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::RateLimitedChatModel;
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Allow at most 5 concurrent requests
let limited = RateLimitedChatModel::new(base_model, 5);
When the concurrency limit is reached, additional callers wait until a slot becomes available. This is useful for:
- Respecting provider concurrency limits.
- Preventing resource exhaustion in high-throughput applications.
- Controlling costs by limiting parallel API calls.
Token bucket rate limiting with TokenBucketChatModel
TokenBucketChatModel uses a token bucket algorithm for smoother rate limiting. The bucket starts full and refills at a steady rate:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::TokenBucketChatModel;
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Bucket capacity: 100 tokens, refill rate: 10 tokens/second
let throttled = TokenBucketChatModel::new(base_model, 100.0, 10.0);
Each chat() or stream_chat() call consumes one token from the bucket. When the bucket is empty, callers wait until a token is refilled.
Parameters:
- capacity -- the maximum burst size. A capacity of 100 allows 100 rapid-fire requests before throttling kicks in.
- refill_rate -- tokens added per second. A rate of 10.0 means the bucket refills at 10 tokens per second.
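For intuition, the core of a token bucket looks roughly like this (illustrative logic only, not the wrapper's internals):
use std::time::Instant;
// Refills continuously at refill_rate tokens/second, capped at capacity;
// each request consumes one token or must wait for the next refill.
struct Bucket {
    capacity: f64,
    tokens: f64,
    refill_rate: f64,
    last_refill: Instant,
}
impl Bucket {
    fn try_acquire(&mut self) -> bool {
        let elapsed = self.last_refill.elapsed().as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_rate).min(self.capacity);
        self.last_refill = Instant::now();
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false // caller waits and retries
        }
    }
}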
Token bucket vs concurrency limiting
| Feature | RateLimitedChatModel | TokenBucketChatModel |
|---|---|---|
| Controls | Concurrent requests | Request rate over time |
| Mechanism | Semaphore | Token bucket |
| Burst handling | Blocks when N requests are in-flight | Allows bursts up to capacity |
| Best for | Concurrency limits | Rate limits (requests/second) |
Stacking wrappers
All wrappers implement ChatModel, so they compose naturally. A common pattern is retry on the outside, rate limiting on the inside:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::{RetryChatModel, RetryPolicy, TokenBucketChatModel};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// First, apply rate limiting
let throttled: Arc<dyn ChatModel> = Arc::new(
TokenBucketChatModel::new(base_model, 50.0, 5.0)
);
// Then, add retry on top
let reliable = RetryChatModel::new(throttled, RetryPolicy::default());
This ensures that retried requests also go through the rate limiter, preventing retry storms from overwhelming the provider.
Model Profiles
ModelProfile exposes a model's capabilities and limits so that calling code can inspect provider support flags at runtime without hard-coding provider-specific knowledge.
The ModelProfile Struct
pub struct ModelProfile {
pub name: String,
pub provider: String,
pub supports_tool_calling: bool,
pub supports_structured_output: bool,
pub supports_streaming: bool,
pub max_input_tokens: Option<usize>,
pub max_output_tokens: Option<usize>,
}
| Field | Type | Description |
|---|---|---|
| name | String | Model identifier (e.g. "gpt-4o", "claude-3-opus") |
| provider | String | Provider name (e.g. "openai", "anthropic") |
| supports_tool_calling | bool | Whether the model can handle ToolDefinition in requests |
| supports_structured_output | bool | Whether the model supports JSON schema enforcement |
| supports_streaming | bool | Whether stream_chat() produces real token-level chunks |
| max_input_tokens | Option<usize> | Maximum context window size, if known |
| max_output_tokens | Option<usize> | Maximum generation length, if known |
Querying a Model's Profile
Every ChatModel implementation exposes a profile() method that returns Option<ModelProfile>. The default implementation returns None, so providers opt in by overriding it:
use synaptic::core::ChatModel;
let model = my_chat_model();
if let Some(profile) = model.profile() {
println!("Provider: {}", profile.provider);
println!("Supports tools: {}", profile.supports_tool_calling);
if let Some(max) = profile.max_input_tokens {
println!("Context window: {} tokens", max);
}
} else {
println!("No profile available for this model");
}
Using Profiles for Capability Checks
Profiles are useful when writing generic code that works across multiple providers. For example, you can guard tool-calling or structured-output logic behind a capability check:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, SynapticError, ToolChoice};
async fn maybe_call_with_tools(
model: &dyn ChatModel,
request: ChatRequest,
) -> Result<ChatResponse, SynapticError> {
let supports_tools = model
.profile()
.map(|p| p.supports_tool_calling)
.unwrap_or(false);
if supports_tools {
let request = request.with_tool_choice(ToolChoice::Auto);
model.chat(request).await
} else {
// Fall back to plain chat without tools
model.chat(ChatRequest::new(request.messages)).await
}
}
Implementing profile() for a Custom Model
If you implement your own ChatModel, override profile() to advertise capabilities:
use async_trait::async_trait;
use synaptic::core::{ChatModel, ModelProfile};
#[async_trait]
impl ChatModel for MyCustomModel {
// ... chat() and stream_chat() ...
fn profile(&self) -> Option<ModelProfile> {
Some(ModelProfile {
name: "my-model-v1".to_string(),
provider: "custom".to_string(),
supports_tool_calling: true,
supports_structured_output: false,
supports_streaming: true,
max_input_tokens: Some(128_000),
max_output_tokens: Some(4_096),
})
}
}
Messages
Messages are the fundamental unit of communication in Synaptic. Every interaction with a chat model is expressed as a sequence of Message values, and every response comes back as a Message.
The Message enum is defined in synaptic_core and uses a tagged union with six variants: System, Human, AI, Tool, Chat, and Remove. You create messages through factory methods rather than struct literals.
Quick example
use synaptic::core::{ChatRequest, Message};
let messages = vec![
Message::system("You are a helpful assistant."),
Message::human("What is Rust?"),
];
let request = ChatRequest::new(messages);
Guides
- Message Types -- all message variants, factory methods, and accessor methods
- Filter & Trim Messages -- select messages by type/name/id and trim to a token budget
- Merge Message Runs -- combine consecutive messages of the same role into one
Message Types
This guide covers all message variants in Synaptic, how to create them, and how to inspect their contents.
The Message enum
Message is a tagged enum (#[serde(tag = "role")]) with six variants:
| Variant | Factory method | Role string | Purpose |
|---|---|---|---|
| System | Message::system() | "system" | System instructions for the model |
| Human | Message::human() | "human" | User input |
| AI | Message::ai() | "assistant" | Model response (text only) |
| AI (with tools) | Message::ai_with_tool_calls() | "assistant" | Model response with tool calls |
| Tool | Message::tool() | "tool" | Tool execution result |
| Chat | Message::chat() | custom | Custom role message |
| Remove | Message::remove() | "remove" | Signals removal of a message by ID |
Creating messages
Always use factory methods instead of constructing enum variants directly:
use synaptic::core::{Message, ToolCall};
use serde_json::json;
// System message -- sets the model's behavior
let system = Message::system("You are a helpful assistant.");
// Human message -- user input
let human = Message::human("Hello, how are you?");
// AI message -- plain text response
let ai = Message::ai("I'm doing well, thanks for asking!");
// AI message with tool calls
let ai_tools = Message::ai_with_tool_calls(
"Let me look that up for you.",
vec![
ToolCall {
id: "call_1".to_string(),
name: "search".to_string(),
arguments: json!({"query": "Rust programming"}),
},
],
);
// Tool message -- result of a tool execution
// Second argument is the tool_call_id, which must match the ToolCall's id
let tool = Message::tool("Found 42 results for 'Rust programming'", "call_1");
// Chat message -- custom role
let chat = Message::chat("moderator", "This conversation is on topic.");
// Remove message -- used in message history management
let remove = Message::remove("msg-id-to-remove");
Accessor methods
All message variants share a common set of accessor methods:
use synaptic::core::Message;
let msg = Message::human("Hello!");
// Get the role as a string
assert_eq!(msg.role(), "human");
// Get the text content
assert_eq!(msg.content(), "Hello!");
// Type-checking predicates
assert!(msg.is_human());
assert!(!msg.is_ai());
assert!(!msg.is_system());
assert!(!msg.is_tool());
assert!(!msg.is_chat());
assert!(!msg.is_remove());
// Tool-related accessors (empty/None for non-AI/non-Tool messages)
assert!(msg.tool_calls().is_empty());
assert!(msg.tool_call_id().is_none());
// Optional fields
assert!(msg.id().is_none());
assert!(msg.name().is_none());
Tool call accessors
use synaptic::core::{Message, ToolCall};
use serde_json::json;
let ai = Message::ai_with_tool_calls("", vec![
ToolCall {
id: "call_1".into(),
name: "search".into(),
arguments: json!({"q": "rust"}),
},
]);
// Get all tool calls (only meaningful for AI messages)
let calls = ai.tool_calls();
assert_eq!(calls.len(), 1);
assert_eq!(calls[0].name, "search");
let tool_msg = Message::tool("result", "call_1");
// Get the tool_call_id (only meaningful for Tool messages)
assert_eq!(tool_msg.tool_call_id(), Some("call_1"));
Builder methods
Messages support a builder pattern for setting optional fields:
use synaptic::core::Message;
use serde_json::json;
let msg = Message::human("Hello!")
.with_id("msg-001")
.with_name("Alice")
.with_additional_kwarg("source", json!("web"))
.with_response_metadata_entry("model", json!("gpt-4o"));
assert_eq!(msg.id(), Some("msg-001"));
assert_eq!(msg.name(), Some("Alice"));
Available builder methods:
| Method | Description |
|---|---|
.with_id(id) | Set the message ID |
.with_name(name) | Set the sender name |
.with_additional_kwarg(key, value) | Add an arbitrary key-value pair |
.with_response_metadata_entry(key, value) | Add response metadata |
.with_content_blocks(blocks) | Set multimodal content blocks |
.with_usage_metadata(usage) | Set token usage (AI messages only) |
Serialization
Messages serialize to JSON with a "role" tag:
use synaptic::core::Message;
let msg = Message::human("Hello!");
let json = serde_json::to_string_pretty(&msg).unwrap();
// {
// "role": "human",
// "content": "Hello!"
// }
Note that the AI variant serializes with "role": "assistant" (not "ai"), matching the convention used by most LLM providers.
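For example, serializing an AI message (a minimal sketch; optional fields such as tool calls may also appear in the output):
use synaptic::core::Message;
let ai = Message::ai("Hi there!");
let json = serde_json::to_string(&ai).unwrap();
// The serialized role tag is "assistant", e.g. {"role":"assistant","content":"Hi there!"}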
Filter & Trim Messages
This guide shows how to select specific messages from a conversation and trim message lists to fit within token budgets.
Filtering messages with filter_messages
The filter_messages function selects messages based on their type (role), name, or ID. It supports both inclusion and exclusion filters.
use synaptic::core::{filter_messages, Message};
Filter by type
let messages = vec![
Message::system("You are helpful."),
Message::human("Question 1"),
Message::ai("Answer 1"),
Message::human("Question 2"),
Message::ai("Answer 2"),
];
// Keep only human messages
let humans = filter_messages(
&messages,
Some(&["human"]), // include_types
None, // exclude_types
None, // include_names
None, // exclude_names
None, // include_ids
None, // exclude_ids
);
assert_eq!(humans.len(), 2);
assert_eq!(humans[0].content(), "Question 1");
assert_eq!(humans[1].content(), "Question 2");
Exclude by type
// Remove system messages, keep everything else
let without_system = filter_messages(
&messages,
None, // include_types
Some(&["system"]), // exclude_types
None, None, None, None,
);
assert_eq!(without_system.len(), 4);
Filter by name
let messages = vec![
Message::human("Hi").with_name("Alice"),
Message::human("Hello").with_name("Bob"),
Message::ai("Hey!"),
];
// Only messages from Alice
let alice_msgs = filter_messages(
&messages,
None, None,
Some(&["Alice"]), // include_names
None, None, None,
);
assert_eq!(alice_msgs.len(), 1);
assert_eq!(alice_msgs[0].content(), "Hi");
Filter by ID
let messages = vec![
Message::human("First").with_id("msg-1"),
Message::human("Second").with_id("msg-2"),
Message::human("Third").with_id("msg-3"),
];
// Exclude a specific message
let filtered = filter_messages(
&messages,
None, None, None, None,
None, // include_ids
Some(&["msg-2"]), // exclude_ids
);
assert_eq!(filtered.len(), 2);
Combining filters
All filter parameters can be combined. A message must pass all active filters to be included:
// Keep only human messages from Alice
let result = filter_messages(
&messages,
Some(&["human"]), // include_types
None, // exclude_types
Some(&["Alice"]), // include_names
None, None, None,
);
Trimming messages with trim_messages
The trim_messages function trims a message list to fit within a token budget. It supports two strategies: keep the first messages or keep the last messages.
use synaptic::core::{trim_messages, TrimStrategy, Message};
Keep last messages (most common)
This is the typical pattern for chat applications where you want to preserve the most recent context:
let messages = vec![
Message::system("You are a helpful assistant."),
Message::human("Question 1"),
Message::ai("Answer 1"),
Message::human("Question 2"),
Message::ai("Answer 2"),
Message::human("Question 3"),
];
// Simple token counter: estimate ~4 chars per token
let token_counter = |msg: &Message| -> usize {
msg.content().len() / 4
};
// Keep last messages within 50 tokens, preserve the system message
let trimmed = trim_messages(
messages,
50, // max_tokens
token_counter,
TrimStrategy::Last,
true, // include_system: preserve the leading system message
);
// Result: system message + as many recent messages as fit in the budget
assert!(trimmed[0].is_system());
Keep first messages
Useful when you want to preserve the beginning of a conversation:
let trimmed = trim_messages(
messages,
50,
token_counter,
TrimStrategy::First,
false, // include_system not relevant for First strategy
);
The include_system parameter
When using TrimStrategy::Last with include_system: true:
- If the first message is a system message, it is always preserved.
- The system message's tokens are subtracted from the budget.
- The remaining budget is filled with messages from the end of the list.
This ensures your system prompt is never trimmed away, even as the conversation grows.
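To make the budget arithmetic concrete, here is a minimal sketch that reuses the messages and character-based token_counter from the example above with a deliberately tight budget (the exact number of surviving turns depends on your counter):
let trimmed = trim_messages(
    messages,
    15, // max_tokens -- deliberately tight
    token_counter,
    TrimStrategy::Last,
    true, // include_system
);
// The system prompt survives; its cost is deducted from the budget,
// and only the most recent turns that fit the remainder are kept.
assert!(trimmed[0].is_system());
assert!(trimmed.len() < 6);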
Custom token counters
The token_counter parameter is a function that takes a &Message and returns a usize token count. You can use any estimation strategy:
// Simple character-based estimate
let simple = |msg: &Message| -> usize { msg.content().len() / 4 };
// Word-based estimate
let word_based = |msg: &Message| -> usize {
msg.content().split_whitespace().count()
};
// Fixed cost per message (useful when all messages are similar size)
let fixed = |_msg: &Message| -> usize { 10 };
Merge Message Runs
This guide shows how to use merge_message_runs to combine consecutive messages of the same role into a single message.
Overview
Some LLM providers require alternating message roles (human, assistant, human, assistant). If your message history has consecutive messages from the same role, you can merge them into one message before sending the request.
Basic usage
use synaptic::core::{merge_message_runs, Message};
let messages = vec![
Message::human("Hello"),
Message::human("How are you?"), // Same role as previous
Message::ai("I'm fine!"),
Message::ai("Thanks for asking!"), // Same role as previous
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 2);
assert_eq!(merged[0].content(), "Hello\nHow are you?");
assert_eq!(merged[1].content(), "I'm fine!\nThanks for asking!");
How merging works
When two consecutive messages share the same role:
- Their content strings are joined with a newline (\n).
- For AI messages, tool_calls and invalid_tool_calls from subsequent messages are appended to the first message's lists.
- The resulting message retains the id, name, and other metadata of the first message in the run (see the sketch below).
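A small sketch of the metadata rule, using only the factory, builder, and accessor methods shown earlier:
use synaptic::core::{merge_message_runs, Message};
let messages = vec![
    Message::human("First part").with_id("msg-1").with_name("Alice"),
    Message::human("Second part").with_id("msg-2"),
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 1);
assert_eq!(merged[0].content(), "First part\nSecond part");
// The id and name come from the first message in the run
assert_eq!(merged[0].id(), Some("msg-1"));
assert_eq!(merged[0].name(), Some("Alice"));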
Merging AI messages with tool calls
Tool calls from consecutive AI messages are combined:
use synaptic::core::{merge_message_runs, Message, ToolCall};
use serde_json::json;
let messages = vec![
Message::ai_with_tool_calls("Looking up weather...", vec![
ToolCall {
id: "call_1".into(),
name: "get_weather".into(),
arguments: json!({"city": "Tokyo"}),
},
]),
Message::ai_with_tool_calls("Also checking news...", vec![
ToolCall {
id: "call_2".into(),
name: "search_news".into(),
arguments: json!({"query": "Tokyo"}),
},
]),
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 1);
assert_eq!(merged[0].content(), "Looking up weather...\nAlso checking news...");
assert_eq!(merged[0].tool_calls().len(), 2);
Preserving different roles
Messages with different roles are never merged, even if they appear to be related:
use synaptic::core::{merge_message_runs, Message};
let messages = vec![
Message::system("Be helpful."),
Message::human("Hi"),
Message::ai("Hello!"),
Message::human("Bye"),
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 4); // No change -- all roles are different
Practical use case: preparing messages for providers
Some providers reject requests with consecutive same-role messages. Use merge_message_runs to clean up before sending:
use synaptic::core::{merge_message_runs, ChatRequest, Message};
let conversation = vec![
Message::system("You are a translator."),
Message::human("Translate to French:"),
Message::human("Hello, how are you?"), // User sent two messages in a row
Message::ai("Bonjour, comment allez-vous ?"),
];
let cleaned = merge_message_runs(conversation);
let request = ChatRequest::new(cleaned);
// Now safe to send: roles alternate correctly
Empty input
merge_message_runs returns an empty vector when given an empty input:
use synaptic::core::merge_message_runs;
let result = merge_message_runs(vec![]);
assert!(result.is_empty());
Prompts
Synaptic provides two levels of prompt template:
- PromptTemplate -- simple string interpolation with {{ variable }} syntax. Takes a HashMap<String, String> and returns a rendered String.
- ChatPromptTemplate -- produces a Vec<Message> from a sequence of MessageTemplate entries. Each entry can be a system, human, or AI message template, or a Placeholder that injects an existing list of messages.
Both template types implement the Runnable trait, so they compose directly with chat models, output parsers, and other runnables using the LCEL pipe operator (|).
Quick Example
use synaptic::prompts::{PromptTemplate, ChatPromptTemplate, MessageTemplate};
// Simple string template
let pt = PromptTemplate::new("Hello, {{ name }}!");
let mut values = std::collections::HashMap::new();
values.insert("name".to_string(), "world".to_string());
assert_eq!(pt.render(&values).unwrap(), "Hello, world!");
// Chat message template (produces Vec<Message>)
let chat = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
Sub-Pages
- Chat Prompt Template -- build multi-message prompts with variable interpolation and placeholders
- Few-Shot Prompting -- inject example conversations for few-shot learning
Chat Prompt Template
ChatPromptTemplate produces a Vec<Message> from a sequence of MessageTemplate entries. Each entry renders one or more messages with {{ variable }} interpolation. The template implements the Runnable trait, so it integrates directly into LCEL pipelines.
Creating a Template
Use ChatPromptTemplate::from_messages() (or new()) with a vector of MessageTemplate variants:
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
Rendering with format()
Call format() with a HashMap<String, serde_json::Value> to produce messages:
use std::collections::HashMap;
use serde_json::json;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
let values: HashMap<String, serde_json::Value> = HashMap::from([
("role".to_string(), json!("helpful")),
("question".to_string(), json!("What is Rust?")),
]);
let messages = template.format(&values).unwrap();
// messages[0] => Message::system("You are a helpful assistant.")
// messages[1] => Message::human("What is Rust?")
Using as a Runnable
Because ChatPromptTemplate implements Runnable<HashMap<String, Value>, Vec<Message>>, you can call invoke() or compose it with the pipe operator:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::RunnableConfig;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::runnables::Runnable;
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
let config = RunnableConfig::default();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("role".to_string(), json!("helpful")),
("question".to_string(), json!("What is Rust?")),
]);
let messages = template.invoke(values, &config).await?;
// messages = [Message::system("You are a helpful assistant."), Message::human("What is Rust?")]
MessageTemplate Variants
MessageTemplate is an enum with four variants:
| Variant | Description |
|---|---|
MessageTemplate::system(text) | Renders a system message from a template string |
MessageTemplate::human(text) | Renders a human message from a template string |
MessageTemplate::ai(text) | Renders an AI message from a template string |
MessageTemplate::Placeholder(key) | Injects a list of messages from the input map |
Placeholder Example
Placeholder injects messages stored under a key in the input map. The value must be a JSON array of serialized Message objects. This is useful for injecting conversation history:
use std::collections::HashMap;
use serde_json::json;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are helpful."),
MessageTemplate::Placeholder("history".to_string()),
MessageTemplate::human("{{ input }}"),
]);
let history = json!([
{"role": "human", "content": "Hi"},
{"role": "assistant", "content": "Hello!"}
]);
let values: HashMap<String, serde_json::Value> = HashMap::from([
("history".to_string(), history),
("input".to_string(), json!("How are you?")),
]);
let messages = template.format(&values).unwrap();
// messages[0] => System("You are helpful.")
// messages[1] => Human("Hi") -- from placeholder
// messages[2] => AI("Hello!") -- from placeholder
// messages[3] => Human("How are you?")
Composing in a Pipeline
A common pattern is to pipe a prompt template into a chat model and then into an output parser:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::{ChatModel, ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Rust is a systems programming language."),
usage: None,
},
]);
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
let chain = template.boxed() | model.boxed() | StrOutputParser.boxed();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("role".to_string(), json!("helpful")),
("question".to_string(), json!("What is Rust?")),
]);
let config = RunnableConfig::default();
let result: String = chain.invoke(values, &config).await.unwrap();
// result = "Rust is a systems programming language."
Few-Shot Prompting
FewShotChatMessagePromptTemplate injects example conversations into a prompt for few-shot learning. Each example is a pair of human input and AI output, formatted as alternating Human and AI messages. An optional system prefix message can be prepended.
Basic Usage
Create the template with a list of FewShotExample values and a suffix PromptTemplate for the user's actual query:
use std::collections::HashMap;
use synaptic::prompts::{
FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
let template = FewShotChatMessagePromptTemplate::new(
vec![
FewShotExample {
input: "What is 2+2?".to_string(),
output: "4".to_string(),
},
FewShotExample {
input: "What is 3+3?".to_string(),
output: "6".to_string(),
},
],
PromptTemplate::new("{{ question }}"),
);
let values = HashMap::from([
("question".to_string(), "What is 4+4?".to_string()),
]);
let messages = template.format(&values).unwrap();
// messages[0] => Human("What is 2+2?") -- example 1 input
// messages[1] => AI("4") -- example 1 output
// messages[2] => Human("What is 3+3?") -- example 2 input
// messages[3] => AI("6") -- example 2 output
// messages[4] => Human("What is 4+4?") -- actual query (suffix)
Each FewShotExample has two fields:
- input -- the human message for this example
- output -- the AI response for this example
The suffix template is rendered with the user-provided variables and appended as the final human message.
Adding a System Prefix
Use with_prefix() to prepend a system message before the examples:
use std::collections::HashMap;
use synaptic::prompts::{
FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
let template = FewShotChatMessagePromptTemplate::new(
vec![FewShotExample {
input: "hi".to_string(),
output: "hello".to_string(),
}],
PromptTemplate::new("{{ input }}"),
)
.with_prefix(PromptTemplate::new("You are a polite assistant."));
let values = HashMap::from([("input".to_string(), "hey".to_string())]);
let messages = template.format(&values).unwrap();
// messages[0] => System("You are a polite assistant.") -- prefix
// messages[1] => Human("hi") -- example input
// messages[2] => AI("hello") -- example output
// messages[3] => Human("hey") -- actual query
The prefix template supports {{ variable }} interpolation, so you can parameterize the system message too.
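A sketch of a parameterized prefix, assuming the prefix is rendered from the same values map passed to format() (the tone variable here is illustrative):
use std::collections::HashMap;
use synaptic::prompts::{
    FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
let template = FewShotChatMessagePromptTemplate::new(
    vec![FewShotExample {
        input: "hi".to_string(),
        output: "hello".to_string(),
    }],
    PromptTemplate::new("{{ input }}"),
)
.with_prefix(PromptTemplate::new("You are a {{ tone }} assistant."));
let values = HashMap::from([
    ("tone".to_string(), "formal".to_string()),
    ("input".to_string(), "hey".to_string()),
]);
let messages = template.format(&values).unwrap();
// messages[0] => System("You are a formal assistant.")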
Using as a Runnable
FewShotChatMessagePromptTemplate implements Runnable<HashMap<String, String>, Vec<Message>>, so you can call invoke() or compose it in pipelines:
use std::collections::HashMap;
use synaptic::core::RunnableConfig;
use synaptic::prompts::{
FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
use synaptic::runnables::Runnable;
let template = FewShotChatMessagePromptTemplate::new(
vec![FewShotExample {
input: "x".to_string(),
output: "y".to_string(),
}],
PromptTemplate::new("{{ q }}"),
);
let config = RunnableConfig::default();
let values = HashMap::from([("q".to_string(), "z".to_string())]);
let messages = template.invoke(values, &config).await?;
// 3 messages: Human("x"), AI("y"), Human("z")
Note: The Runnable implementation for FewShotChatMessagePromptTemplate takes HashMap<String, String>, while ChatPromptTemplate takes HashMap<String, serde_json::Value>. This difference reflects their underlying template rendering: few-shot templates use PromptTemplate::render(), which works with string values.
Output Parsers
Output parsers transform raw LLM output into structured data. Every parser in Synaptic implements the Runnable trait, so they compose naturally with prompt templates, chat models, and other runnables using the LCEL pipe operator (|).
Available Parsers
| Parser | Input | Output | Description |
|---|---|---|---|
StrOutputParser | Message | String | Extracts the text content from a message |
JsonOutputParser | String | serde_json::Value | Parses a string as JSON |
StructuredOutputParser<T> | String | T | Deserializes JSON into a typed struct |
ListOutputParser | String | Vec<String> | Splits by a configurable separator |
EnumOutputParser | String | String | Validates against a list of allowed values |
BooleanOutputParser | String | bool | Parses yes/no/true/false strings |
MarkdownListOutputParser | String | Vec<String> | Parses markdown bullet lists |
NumberedListOutputParser | String | Vec<String> | Parses numbered lists |
XmlOutputParser | String | XmlElement | Parses XML into a tree structure |
All parsers also implement the FormatInstructions trait, which provides a get_format_instructions() method. You can include these instructions in your prompt to guide the LLM toward producing output in the expected format.
Quick Example
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::{Message, RunnableConfig};
let parser = StrOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(Message::ai("Hello world"), &config).await?;
assert_eq!(result, "Hello world");
Sub-Pages
- Basic Parsers -- StrOutputParser, JsonOutputParser, ListOutputParser
- Structured Parser -- deserialize JSON into typed Rust structs
- Enum Parser -- validate output against a fixed set of values
Basic Parsers
Synaptic provides several simple output parsers for common transformations. Each implements Runnable, so it can be used standalone or composed in a pipeline.
StrOutputParser
Extracts the text content from a Message. This is the most commonly used parser -- it sits at the end of most chains to convert the model's response into a plain String.
Signature: Runnable<Message, String>
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::{Message, RunnableConfig};
let parser = StrOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(Message::ai("Hello world"), &config).await?;
assert_eq!(result, "Hello world");
StrOutputParser works with any Message variant -- system, human, AI, or tool messages all have content that can be extracted.
JsonOutputParser
Parses a JSON string into a serde_json::Value. Useful when you need to work with arbitrary JSON structures without defining a specific Rust type.
Signature: Runnable<String, serde_json::Value>
use synaptic::parsers::JsonOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = JsonOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(
r#"{"name": "Synaptic", "version": 1}"#.to_string(),
&config,
).await?;
assert_eq!(result["name"], "Synaptic");
assert_eq!(result["version"], 1);
If the input is not valid JSON, the parser returns Err(SynapticError::Parsing(...)).
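A quick sketch of the failure path, reusing the parser and config from the snippet above:
// Invalid JSON surfaces as Err(SynapticError::Parsing(...))
let result = parser.invoke("not json at all".to_string(), &config).await;
assert!(result.is_err());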
ListOutputParser
Splits a string into a Vec<String> using a configurable separator. Useful when you ask the LLM to return a comma-separated or newline-separated list.
Signature: Runnable<String, Vec<String>>
use synaptic::parsers::{ListOutputParser, ListSeparator};
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let config = RunnableConfig::default();
// Split on commas
let parser = ListOutputParser::comma();
let result = parser.invoke("apple, banana, cherry".to_string(), &config).await?;
assert_eq!(result, vec!["apple", "banana", "cherry"]);
// Split on newlines (default)
let parser = ListOutputParser::newline();
let result = parser.invoke("first\nsecond\nthird".to_string(), &config).await?;
assert_eq!(result, vec!["first", "second", "third"]);
// Custom separator
let parser = ListOutputParser::new(ListSeparator::Custom("|".to_string()));
let result = parser.invoke("a | b | c".to_string(), &config).await?;
assert_eq!(result, vec!["a", "b", "c"]);
Each item is trimmed of leading and trailing whitespace. Empty items after trimming are filtered out.
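A short sketch of that cleanup behavior, reusing the config from above:
let parser = ListOutputParser::comma();
// Surrounding whitespace is stripped and empty items are dropped
let result = parser.invoke("apple, , banana,  ".to_string(), &config).await?;
assert_eq!(result, vec!["apple", "banana"]);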
BooleanOutputParser
Parses yes/no, true/false, y/n, and 1/0 style responses into a bool. Case-insensitive and whitespace-trimmed.
Signature: Runnable<String, bool>
use synaptic::parsers::BooleanOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = BooleanOutputParser;
let config = RunnableConfig::default();
assert_eq!(parser.invoke("Yes".to_string(), &config).await?, true);
assert_eq!(parser.invoke("false".to_string(), &config).await?, false);
assert_eq!(parser.invoke("1".to_string(), &config).await?, true);
assert_eq!(parser.invoke("N".to_string(), &config).await?, false);
Unrecognized values return Err(SynapticError::Parsing(...)).
XmlOutputParser
Parses XML-formatted LLM output into an XmlElement tree. Supports nested elements, attributes, and text content without requiring a full XML library.
Signature: Runnable<String, XmlElement>
use synaptic::parsers::{XmlOutputParser, XmlElement};
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let config = RunnableConfig::default();
// Parse with a root tag filter
let parser = XmlOutputParser::with_root_tag("answer");
let result = parser.invoke(
"Here is my answer: <answer><item>hello</item></answer>".to_string(),
&config,
).await?;
assert_eq!(result.tag, "answer");
assert_eq!(result.children[0].tag, "item");
assert_eq!(result.children[0].text, Some("hello".to_string()));
Use XmlOutputParser::new() to parse the entire input as XML, or with_root_tag("tag") to extract content from within a specific root tag.
MarkdownListOutputParser
Parses markdown-formatted bullet lists (- item or * item) into a Vec<String>. Lines not starting with a bullet marker are ignored.
Signature: Runnable<String, Vec<String>>
use synaptic::parsers::MarkdownListOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = MarkdownListOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(
"Here are the items:\n- Apple\n- Banana\n* Cherry\nNot a list item".to_string(),
&config,
).await?;
assert_eq!(result, vec!["Apple", "Banana", "Cherry"]);
NumberedListOutputParser
Parses numbered lists (1. item, 2. item) into a Vec<String>. The number prefix is stripped; only lines matching the N. text pattern are included.
Signature: Runnable<String, Vec<String>>
use synaptic::parsers::NumberedListOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = NumberedListOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(
"Top 3 languages:\n1. Rust\n2. Python\n3. TypeScript".to_string(),
&config,
).await?;
assert_eq!(result, vec!["Rust", "Python", "TypeScript"]);
Format Instructions
All parsers implement the FormatInstructions trait. You can include the instructions in your prompt to guide the model:
use synaptic::parsers::{JsonOutputParser, ListOutputParser, FormatInstructions};
let json_parser = JsonOutputParser;
println!("{}", json_parser.get_format_instructions());
// "Your response should be a valid JSON object."
let list_parser = ListOutputParser::comma();
println!("{}", list_parser.get_format_instructions());
// "Your response should be a list of items separated by commas."
Pipeline Example
A typical chain pipes a prompt template through a model and into a parser:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::{ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The answer is 42."),
usage: None,
},
]);
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a helpful assistant."),
MessageTemplate::human("{{ question }}"),
]);
// template -> model -> parser
let chain = template.boxed() | model.boxed() | StrOutputParser.boxed();
let config = RunnableConfig::default();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("question".to_string(), json!("What is the meaning of life?")),
]);
let result: String = chain.invoke(values, &config).await?;
assert_eq!(result, "The answer is 42.");
Structured Parser
StructuredOutputParser<T> deserializes a JSON string directly into a typed Rust struct. This is the preferred parser when you know the exact shape of the data you expect from the LLM.
Basic Usage
Define a struct that derives Deserialize, then create a parser for it:
use synaptic::parsers::StructuredOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
use serde::Deserialize;
#[derive(Deserialize)]
struct Person {
name: String,
age: u32,
}
let parser = StructuredOutputParser::<Person>::new();
let config = RunnableConfig::default();
let result = parser.invoke(
r#"{"name": "Alice", "age": 30}"#.to_string(),
&config,
).await?;
assert_eq!(result.name, "Alice");
assert_eq!(result.age, 30);
Signature: Runnable<String, T> where T: DeserializeOwned + Send + Sync + 'static
Error Handling
If the input string is not valid JSON or does not match the struct's schema, the parser returns Err(SynapticError::Parsing(...)):
use synaptic::parsers::StructuredOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
use serde::Deserialize;
#[derive(Deserialize)]
struct Config {
enabled: bool,
threshold: f64,
}
let parser = StructuredOutputParser::<Config>::new();
let config = RunnableConfig::default();
// Missing required field -- returns an error
let err = parser.invoke(
r#"{"enabled": true}"#.to_string(),
&config,
).await.unwrap_err();
assert!(err.to_string().contains("structured parse error"));
Format Instructions
StructuredOutputParser<T> implements the FormatInstructions trait. Include the instructions in your prompt to guide the model toward producing correctly-shaped JSON:
use synaptic::parsers::{StructuredOutputParser, FormatInstructions};
use serde::Deserialize;
#[derive(Deserialize)]
struct Answer {
reasoning: String,
answer: String,
}
let parser = StructuredOutputParser::<Answer>::new();
let instructions = parser.get_format_instructions();
// "Your response should be a valid JSON object matching the expected schema."
Pipeline Example
In a chain, StructuredOutputParser typically follows a StrOutputParser step or receives the string content directly. Here is a complete example:
use synaptic::parsers::StructuredOutputParser;
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::{Message, RunnableConfig};
use serde::Deserialize;
#[derive(Debug, Deserialize)]
struct Sentiment {
label: String,
confidence: f64,
}
// Simulate an LLM that returns JSON in a Message
let extract_content = RunnableLambda::new(|msg: Message| async move {
Ok(msg.content().to_string())
});
let parser = StructuredOutputParser::<Sentiment>::new();
let chain = extract_content.boxed() | parser.boxed();
let config = RunnableConfig::default();
let input = Message::ai(r#"{"label": "positive", "confidence": 0.95}"#);
let result: Sentiment = chain.invoke(input, &config).await?;
assert_eq!(result.label, "positive");
assert!((result.confidence - 0.95).abs() < f64::EPSILON);
When to Use Structured vs. JSON Parser
- Use StructuredOutputParser<T> when you know the exact schema at compile time and want type-safe access to fields.
- Use JsonOutputParser when you need to work with arbitrary or dynamic JSON structures where the shape is not known in advance.
Enum Parser
EnumOutputParser validates that the LLM's output matches one of a predefined set of allowed values. This is useful for classification tasks where the model should respond with exactly one of several categories.
Basic Usage
Create the parser with a list of allowed values, then invoke it:
use synaptic::parsers::EnumOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let config = RunnableConfig::default();
// Valid value -- returns Ok
let result = parser.invoke("positive".to_string(), &config).await?;
assert_eq!(result, "positive");
Signature: Runnable<String, String>
Validation
The parser trims whitespace from the input before checking. If the trimmed input does not match any allowed value, it returns Err(SynapticError::Parsing(...)):
use synaptic::parsers::EnumOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let config = RunnableConfig::default();
// Whitespace is trimmed -- this succeeds
let result = parser.invoke(" neutral ".to_string(), &config).await?;
assert_eq!(result, "neutral");
// Invalid value -- returns an error
let err = parser.invoke("invalid".to_string(), &config).await.unwrap_err();
assert!(err.to_string().contains("expected one of"));
Format Instructions
EnumOutputParser implements FormatInstructions. Include the instructions in your prompt so the model knows which values to choose from:
use synaptic::parsers::{EnumOutputParser, FormatInstructions};
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let instructions = parser.get_format_instructions();
// "Your response should be one of the following values: positive, negative, neutral"
Pipeline Example
A typical classification pipeline combines a prompt, a model, a content extractor, and the enum parser:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::{ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::{StrOutputParser, EnumOutputParser, FormatInstructions};
use synaptic::runnables::Runnable;
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("positive"),
usage: None,
},
]);
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system(
&format!(
"Classify the sentiment of the text. {}",
parser.get_format_instructions()
),
),
MessageTemplate::human("{{ text }}"),
]);
// template -> model -> extract content -> validate enum
let chain = template.boxed()
| model.boxed()
| StrOutputParser.boxed()
| parser.boxed();
let config = RunnableConfig::default();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("text".to_string(), json!("I love this product!")),
]);
let result: String = chain.invoke(values, &config).await?;
assert_eq!(result, "positive");
Runnables (LCEL)
Synaptic implements LCEL (LangChain Expression Language) through the Runnable trait and a set of composable building blocks. Every component in an LCEL chain -- prompts, models, parsers, custom logic -- implements the same Runnable<I, O> interface, so they can be combined freely with a uniform API.
The Runnable trait
The Runnable<I, O> trait is defined in synaptic_core and provides three core methods:
| Method | Description |
|---|---|
invoke(input, config) | Execute on a single input, returning one output |
batch(inputs, config) | Execute on multiple inputs sequentially |
stream(input, config) | Return a RunnableOutputStream of incremental results |
Every Runnable also has a boxed() method that wraps it into a BoxRunnable<I, O> -- a type-erased container that enables the | pipe operator for composition.
use synaptic::runnables::{Runnable, RunnableLambda, BoxRunnable};
use synaptic::core::RunnableConfig;
let step = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
let config = RunnableConfig::default();
let result = step.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "HELLO");
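batch() follows the same pattern. A minimal sketch, assuming it accepts a Vec of inputs and returns the outputs in the same order:
let results = step.batch(
    vec!["hello".to_string(), "world".to_string()],
    &config,
).await?;
assert_eq!(results, vec!["HELLO".to_string(), "WORLD".to_string()]);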
BoxRunnable -- type-erased composition
BoxRunnable<I, O> is the key type for building chains. It wraps any Runnable<I, O> behind a trait object, erasing the concrete type. This is what makes the | operator possible: the operator is implemented on BoxRunnable, so both sides of a pipe must be boxed into that common wrapper even though their underlying implementations differ.
BoxRunnable itself implements Runnable<I, O>, so boxed runnables compose seamlessly.
Building blocks
Synaptic provides the following LCEL building blocks:
| Type | Purpose |
|---|---|
RunnableLambda | Wraps an async closure as a runnable |
RunnablePassthrough | Passes input through unchanged |
RunnableSequence | Chains two runnables (created by | operator) |
RunnableParallel | Runs named branches concurrently, merges to JSON |
RunnableBranch | Routes input by condition, with a default fallback |
RunnableAssign | Merges parallel branch results into the input JSON object |
RunnablePick | Extracts specific keys from a JSON object |
RunnableWithFallbacks | Tries alternatives when the primary runnable fails |
RunnableRetry | Retries with exponential backoff on failure |
RunnableEach | Maps a runnable over each element in a Vec |
RunnableGenerator | Wraps a generator function for true streaming output |
Tip: For standalone async functions, you can also use the #[chain] macro to generate a BoxRunnable factory. This avoids writing RunnableLambda::new(|x| async { ... }).boxed() by hand. See Procedural Macros.
Guides
- Pipe Operator -- chain runnables with | to build sequential pipelines
- Streaming -- consume incremental output through a chain
- Parallel & Branch -- run branches concurrently or route by condition
- Assign & Pick -- merge computed keys into JSON and extract specific fields
- Fallbacks -- provide alternative runnables when the primary one fails
- Bind -- attach config transforms to a runnable
- Retry -- retry with exponential backoff on transient failures
- Generator -- wrap a streaming generator function as a runnable
- Each -- map a runnable over each element in a list
Pipe Operator
This guide shows how to chain runnables together using the | pipe operator to build sequential processing pipelines.
Overview
The | operator on BoxRunnable creates a RunnableSequence that feeds the output of the first runnable into the input of the second. This is the primary way to build LCEL chains in Synaptic.
The pipe operator is implemented via Rust's BitOr trait on BoxRunnable. Both sides must be boxed first with .boxed(), because the operator needs type-erased wrappers to connect runnables with different concrete types.
Basic chaining
use synaptic::runnables::{Runnable, RunnableLambda, BoxRunnable};
use synaptic::core::RunnableConfig;
let step1 = RunnableLambda::new(|x: String| async move {
Ok(format!("Step 1: {x}"))
});
let step2 = RunnableLambda::new(|x: String| async move {
Ok(format!("{x} -> Step 2"))
});
// Pipe operator creates a RunnableSequence
let chain = step1.boxed() | step2.boxed();
let config = RunnableConfig::default();
let result = chain.invoke("input".to_string(), &config).await?;
assert_eq!(result, "Step 1: input -> Step 2");
The types must be compatible: the output type of step1 must match the input type of step2. In this example both work with String, so the types line up. The compiler will reject chains where the types do not match.
Multi-step chains
You can chain more than two steps by continuing to pipe. The result is still a single BoxRunnable:
let step3 = RunnableLambda::new(|x: String| async move {
Ok(format!("{x} -> Step 3"))
});
let chain = step1.boxed() | step2.boxed() | step3.boxed();
let result = chain.invoke("start".to_string(), &config).await?;
assert_eq!(result, "Step 1: start -> Step 2 -> Step 3");
Each | wraps the left side into a new RunnableSequence, so a | b | c produces a RunnableSequence(RunnableSequence(a, b), c). This nesting is transparent -- you interact with the result as a single BoxRunnable<I, O>.
Type conversions across steps
Steps can change the type flowing through the chain, as long as each step's output matches the next step's input:
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
// String -> usize -> String
let count_chars = RunnableLambda::new(|s: String| async move {
Ok(s.len())
});
let format_count = RunnableLambda::new(|n: usize| async move {
Ok(format!("Length: {n}"))
});
let chain = count_chars.boxed() | format_count.boxed();
let config = RunnableConfig::default();
let result = chain.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "Length: 5");
Why boxed() is required
Rust's type system needs to know the exact types at compile time. Without boxed(), each RunnableLambda has a unique closure type that cannot appear on both sides of |. Calling .boxed() erases the concrete type into BoxRunnable<I, O>, which is a trait object that can compose with any other BoxRunnable as long as the input/output types align.
BoxRunnable::new(runnable) is equivalent to runnable.boxed() -- use whichever reads better in context.
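For illustration, both forms below produce the same type-erased wrapper (a trivial sketch):
use synaptic::runnables::{BoxRunnable, RunnableLambda};
// Method form
let via_method = RunnableLambda::new(|x: String| async move { Ok(x.len()) }).boxed();
// Constructor form
let via_constructor = BoxRunnable::new(RunnableLambda::new(|x: String| async move { Ok(x.len()) }));
// Both are BoxRunnable<String, usize> and compose identically with |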
Using RunnablePassthrough
RunnablePassthrough is a no-op runnable that passes its input through unchanged. It is useful when you need an identity step in a chain -- for example, as one branch in a RunnableParallel:
use synaptic::runnables::{Runnable, RunnablePassthrough};
let passthrough = RunnablePassthrough;
let result = passthrough.invoke("unchanged".to_string(), &config).await?;
assert_eq!(result, "unchanged");
Error propagation
If any step in the chain returns an Err, the chain short-circuits immediately and returns that error. Subsequent steps are not executed:
use synaptic::core::SynapticError;
let failing = RunnableLambda::new(|_x: String| async move {
Err::<String, _>(SynapticError::Validation("something went wrong".into()))
});
let after = RunnableLambda::new(|x: String| async move {
Ok(format!("This won't run: {x}"))
});
let chain = failing.boxed() | after.boxed();
let result = chain.invoke("test".to_string(), &config).await;
assert!(result.is_err());
Streaming through Chains
This guide shows how to use stream() to consume incremental output from an LCEL chain.
Overview
Every Runnable provides a stream() method that returns a RunnableOutputStream -- a pinned, boxed Stream of Result<O, SynapticError> items. This allows downstream consumers to process results as they become available, rather than waiting for the entire chain to finish.
The default stream() implementation wraps invoke() as a single-item stream. Runnables that support true incremental output (such as LLM model adapters or RunnableGenerator) override stream() to yield items one at a time.
Streaming a single runnable
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
let upper = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
let config = RunnableConfig::default();
let mut stream = upper.stream("hello".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("Got: {value}");
}
// Prints: Got: HELLO
Because RunnableLambda uses the default stream() implementation, this yields exactly one item -- the full result of invoke().
Streaming through a chain
When you stream through a RunnableSequence (created by the | operator), the behavior is:
- The first step runs fully via invoke() and produces its complete output.
- That output is fed into the second step's stream(), which yields items incrementally.
This means only the final component in a chain truly streams. Intermediate steps buffer their output. This matches the LangChain behavior.
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
let step1 = RunnableLambda::new(|x: String| async move {
Ok(format!("processed: {x}"))
});
let step2 = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
let chain = step1.boxed() | step2.boxed();
let config = RunnableConfig::default();
let mut stream = chain.stream("input".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("Got: {value}");
}
// Prints: Got: PROCESSED: INPUT
Streaming with BoxRunnable
BoxRunnable preserves the streaming behavior of the inner runnable. Call .stream() directly on it:
let boxed_chain = step1.boxed() | step2.boxed();
let mut stream = boxed_chain.stream("input".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("{value}");
}
True streaming with RunnableGenerator
RunnableGenerator wraps a generator function that returns a Stream, enabling true incremental output:
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableGenerator};
use synaptic::core::RunnableConfig;
let gen = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_uppercase());
}
}
});
let config = RunnableConfig::default();
let mut stream = gen.stream("hello world foo".to_string(), &config);
while let Some(result) = stream.next().await {
    let chunk = result?;
    println!("Chunk: {chunk}");
}
// Prints each word as a separate chunk:
// Chunk: HELLO
// Chunk: WORLD
// Chunk: FOO
When you call invoke() on a RunnableGenerator, it collects all streamed items into a Vec<O>.
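A minimal sketch, reusing the generator and config from above:
// invoke() drains the stream and returns every chunk at once
let all = gen.invoke("hello world foo".to_string(), &config).await?;
assert_eq!(all, vec!["HELLO", "WORLD", "FOO"]);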
Collecting a stream into a single result
If you need the full result rather than incremental output, use invoke() instead of stream(). Alternatively, collect the stream manually:
use futures::StreamExt;
let mut stream = chain.stream("input".to_string(), &config);
let mut items = Vec::new();
while let Some(result) = stream.next().await {
items.push(result?);
}
// items now contains all yielded values
Error handling in streams
If any step in a chain fails during streaming, the stream yields an Err item. Consumers should check each item:
while let Some(result) = stream.next().await {
match result {
Ok(value) => println!("Got: {value}"),
Err(e) => eprintln!("Error: {e}"),
}
}
When the first step of a RunnableSequence fails (during its invoke()), the stream immediately yields that error as the only item.
Parallel & Branch
This guide shows how to run multiple runnables concurrently with RunnableParallel and how to route input to different runnables with RunnableBranch.
RunnableParallel
RunnableParallel runs named branches concurrently on the same input, then merges all outputs into a single serde_json::Value object keyed by branch name.
The input type must implement Clone, because each branch receives its own copy. Every branch must produce a serde_json::Value output.
Basic usage
use serde_json::Value;
use synaptic::runnables::{Runnable, RunnableParallel, RunnableLambda};
use synaptic::core::RunnableConfig;
let parallel = RunnableParallel::new(vec![
(
"upper".to_string(),
RunnableLambda::new(|x: String| async move {
Ok(Value::String(x.to_uppercase()))
}).boxed(),
),
(
"lower".to_string(),
RunnableLambda::new(|x: String| async move {
Ok(Value::String(x.to_lowercase()))
}).boxed(),
),
(
"length".to_string(),
RunnableLambda::new(|x: String| async move {
Ok(Value::Number(x.len().into()))
}).boxed(),
),
]);
let config = RunnableConfig::default();
let result = parallel.invoke("Hello".to_string(), &config).await?;
// result is a JSON object:
// {"upper": "HELLO", "lower": "hello", "length": 5}
assert_eq!(result["upper"], "HELLO");
assert_eq!(result["lower"], "hello");
assert_eq!(result["length"], 5);
Constructor
RunnableParallel::new() takes a Vec<(String, BoxRunnable<I, Value>)> -- a list of (name, runnable) pairs. All branches run concurrently via futures::future::join_all.
In a chain
RunnableParallel implements Runnable<I, Value>, so you can use it in a pipe chain. A common pattern is to fan out processing and then merge the results:
let analyze = RunnableParallel::new(vec![
("summary".to_string(), summarizer.boxed()),
("keywords".to_string(), keyword_extractor.boxed()),
]);
let format_report = RunnableLambda::new(|data: Value| async move {
Ok(format!(
"Summary: {}\nKeywords: {}",
data["summary"], data["keywords"]
))
});
let chain = analyze.boxed() | format_report.boxed();
Error handling
If any branch fails, the entire RunnableParallel invocation returns the first error encountered. Successful branches that completed before the failure are discarded.
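A small sketch of a failing branch, reusing the imports and config from the basic example (the error message is illustrative):
use synaptic::core::SynapticError;
let parallel = RunnableParallel::new(vec![
    (
        "ok".to_string(),
        RunnableLambda::new(|x: String| async move {
            Ok(Value::String(x))
        }).boxed(),
    ),
    (
        "broken".to_string(),
        RunnableLambda::new(|_x: String| async move {
            Err::<Value, _>(SynapticError::Validation("branch failed".into()))
        }).boxed(),
    ),
]);
// The whole invocation fails even though the "ok" branch would have succeeded
let result = parallel.invoke("hello".to_string(), &config).await;
assert!(result.is_err());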
RunnableBranch
RunnableBranch routes input to one of several runnables based on condition functions. It evaluates conditions in order, invoking the runnable associated with the first matching condition. If no conditions match, the default runnable is used.
Basic usage
use synaptic::runnables::{Runnable, RunnableBranch, RunnableLambda, BoxRunnable};
use synaptic::core::RunnableConfig;
let branch = RunnableBranch::new(
vec![
(
Box::new(|x: &String| x.starts_with("hi")) as Box<dyn Fn(&String) -> bool + Send + Sync>,
RunnableLambda::new(|x: String| async move {
Ok(format!("Greeting: {x}"))
}).boxed(),
),
(
Box::new(|x: &String| x.starts_with("bye")),
RunnableLambda::new(|x: String| async move {
Ok(format!("Farewell: {x}"))
}).boxed(),
),
],
// Default: used when no condition matches
RunnableLambda::new(|x: String| async move {
Ok(format!("Other: {x}"))
}).boxed(),
);
let config = RunnableConfig::default();
let r1 = branch.invoke("hi there".to_string(), &config).await?;
assert_eq!(r1, "Greeting: hi there");
let r2 = branch.invoke("bye now".to_string(), &config).await?;
assert_eq!(r2, "Farewell: bye now");
let r3 = branch.invoke("something else".to_string(), &config).await?;
assert_eq!(r3, "Other: something else");
Constructor
RunnableBranch::new() takes two arguments:
- branches: Vec<(BranchCondition<I>, BoxRunnable<I, O>)> -- condition/runnable pairs evaluated in order. The condition type is Box<dyn Fn(&I) -> bool + Send + Sync>.
- default: BoxRunnable<I, O> -- the fallback runnable used when no condition matches.
In a chain
RunnableBranch implements Runnable<I, O>, so it works with the pipe operator:
let preprocess = RunnableLambda::new(|x: String| async move {
Ok(x.trim().to_string())
});
let route = RunnableBranch::new(
vec![/* conditions */],
default_handler.boxed(),
);
let chain = preprocess.boxed() | route.boxed();
When to use each
- Use RunnableParallel when you need to run multiple operations on the same input concurrently and combine all results.
- Use RunnableBranch when you need to select a single processing path based on the input value.
Assign & Pick
This guide shows how to use RunnableAssign to merge computed values into a JSON object and RunnablePick to extract specific keys from one.
RunnableAssign
RunnableAssign takes a JSON object as input, runs named branches in parallel on that object, and merges the branch outputs back into the original object. This is useful for enriching data as it flows through a chain -- you keep the original fields and add new computed ones.
Basic usage
use serde_json::{json, Value};
use synaptic::runnables::{Runnable, RunnableAssign, RunnableLambda};
use synaptic::core::RunnableConfig;
let assign = RunnableAssign::new(vec![
(
"name_upper".to_string(),
RunnableLambda::new(|input: Value| async move {
let name = input["name"].as_str().unwrap_or_default();
Ok(Value::String(name.to_uppercase()))
}).boxed(),
),
(
"greeting".to_string(),
RunnableLambda::new(|input: Value| async move {
let name = input["name"].as_str().unwrap_or_default();
Ok(Value::String(format!("Hello, {name}!")))
}).boxed(),
),
]);
let config = RunnableConfig::default();
let input = json!({"name": "Alice", "age": 30});
let result = assign.invoke(input, &config).await?;
// Original fields are preserved, new fields are merged in
assert_eq!(result["name"], "Alice");
assert_eq!(result["age"], 30);
assert_eq!(result["name_upper"], "ALICE");
assert_eq!(result["greeting"], "Hello, Alice!");
How it works
- The input must be a JSON object (Value::Object). If it is not, RunnableAssign returns a SynapticError::Validation error.
- Each branch receives a clone of the full input object.
- All branches run concurrently via futures::future::join_all.
- Branch outputs are inserted into the original object using the branch name as the key. If a branch name collides with an existing key, the branch output overwrites the original value (see the sketch below).
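A small sketch of the collision case, reusing the imports and config from the basic example above:
// A branch named after an existing key overwrites that key's value
let overwrite = RunnableAssign::new(vec![
    (
        "name".to_string(),
        RunnableLambda::new(|input: Value| async move {
            let name = input["name"].as_str().unwrap_or_default();
            Ok(Value::String(name.to_uppercase()))
        }).boxed(),
    ),
]);
let result = overwrite.invoke(json!({"name": "Alice", "age": 30}), &config).await?;
assert_eq!(result, json!({"name": "ALICE", "age": 30}));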
Constructor
RunnableAssign::new() takes a Vec<(String, BoxRunnable<Value, Value>)> -- named branches that each transform the input into a value to be merged.
Shorthand via RunnablePassthrough
RunnablePassthrough provides a convenience method that creates a RunnableAssign directly:
use synaptic::runnables::{RunnablePassthrough, RunnableLambda};
use serde_json::Value;
let assign = RunnablePassthrough::assign(vec![
(
"processed".to_string(),
RunnableLambda::new(|input: Value| async move {
// compute something from the input
Ok(Value::String("result".to_string()))
}).boxed(),
),
]);
RunnablePick
RunnablePick extracts specified keys from a JSON object, producing a new object containing only those keys. Keys that do not exist in the input are silently omitted from the output.
Basic usage
use serde_json::{json, Value};
use synaptic::runnables::{Runnable, RunnablePick};
use synaptic::core::RunnableConfig;
let pick = RunnablePick::new(vec![
"name".to_string(),
"age".to_string(),
]);
let config = RunnableConfig::default();
let input = json!({
"name": "Alice",
"age": 30,
"email": "alice@example.com",
"internal_id": 42
});
let result = pick.invoke(input, &config).await?;
// Only the picked keys are present
assert_eq!(result, json!({"name": "Alice", "age": 30}));
Error handling
RunnablePick expects a JSON object as input. If the input is not an object (e.g., a string or array), it returns a SynapticError::Validation error.
Missing keys are not an error -- they are simply absent from the output:
let pick = RunnablePick::new(vec!["name".to_string(), "missing_key".to_string()]);
let result = pick.invoke(json!({"name": "Bob"}), &config).await?;
assert_eq!(result, json!({"name": "Bob"}));
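A quick sketch of the non-object case:
// A non-object input is rejected with a SynapticError::Validation error
let pick = RunnablePick::new(vec!["name".to_string()]);
let result = pick.invoke(json!(["not", "an", "object"]), &config).await;
assert!(result.is_err());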
Combining Assign and Pick in a chain
A common pattern is to use RunnableAssign to enrich data, then RunnablePick to select only the fields needed downstream:
use serde_json::{json, Value};
use synaptic::runnables::{Runnable, RunnableAssign, RunnablePick, RunnableLambda};
use synaptic::core::RunnableConfig;
// Step 1: Enrich input with a computed field
let assign = RunnableAssign::new(vec![
(
"full_name".to_string(),
RunnableLambda::new(|input: Value| async move {
let first = input["first"].as_str().unwrap_or_default();
let last = input["last"].as_str().unwrap_or_default();
Ok(Value::String(format!("{first} {last}")))
}).boxed(),
),
]);
// Step 2: Pick only what the next step needs
let pick = RunnablePick::new(vec!["full_name".to_string()]);
let chain = assign.boxed() | pick.boxed();
let config = RunnableConfig::default();
let input = json!({"first": "Jane", "last": "Doe", "internal_id": 99});
let result = chain.invoke(input, &config).await?;
assert_eq!(result, json!({"full_name": "Jane Doe"}));
Fallbacks
This guide shows how to use RunnableWithFallbacks to provide alternative runnables that are tried when the primary one fails.
Overview
RunnableWithFallbacks wraps a primary runnable and a list of fallback runnables. When invoked, it tries the primary first. If the primary returns an error, it tries each fallback in order until one succeeds. If all fail, it returns the error from the last fallback attempted.
This is particularly useful when working with LLM providers that may experience transient outages, or when you want to try a cheaper model first and fall back to a more capable one.
Basic usage
use synaptic::runnables::{Runnable, RunnableWithFallbacks, RunnableLambda};
use synaptic::core::{RunnableConfig, SynapticError};
// A runnable that always fails
let unreliable = RunnableLambda::new(|_x: String| async move {
Err::<String, _>(SynapticError::Provider("service unavailable".into()))
});
// A reliable fallback
let fallback = RunnableLambda::new(|x: String| async move {
Ok(format!("Fallback handled: {x}"))
});
let with_fallbacks = RunnableWithFallbacks::new(
unreliable.boxed(),
vec![fallback.boxed()],
);
let config = RunnableConfig::default();
let result = with_fallbacks.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "Fallback handled: hello");
Multiple fallbacks
You can provide multiple fallbacks. They are tried in order:
let primary = failing_model.boxed();
let fallback_1 = cheaper_model.boxed();
let fallback_2 = local_model.boxed();
let resilient = RunnableWithFallbacks::new(
primary,
vec![fallback_1, fallback_2],
);
// Tries: primary -> fallback_1 -> fallback_2
let result = resilient.invoke(input, &config).await?;
If the primary succeeds, no fallbacks are attempted. If the primary fails but fallback_1 succeeds, fallback_2 is never tried.
Input cloning requirement
The input type must implement Clone, because RunnableWithFallbacks needs to pass a copy of the input to each fallback attempt. This is enforced by the type signature:
pub struct RunnableWithFallbacks<I: Send + Clone + 'static, O: Send + 'static> {
primary: BoxRunnable<I, O>,
fallbacks: Vec<BoxRunnable<I, O>>,
}
String, Vec<Message>, serde_json::Value, and most standard types implement Clone.
Streaming with fallbacks
RunnableWithFallbacks also supports stream(). When streaming, it buffers the primary stream's output. If the primary stream yields an error, it discards the buffered items and tries the next fallback's stream. This means there is no partial output from a failed provider -- you get the complete output from whichever provider succeeds.
use futures::StreamExt;
let resilient = RunnableWithFallbacks::new(primary.boxed(), vec![fallback.boxed()]);
let mut stream = resilient.stream("input".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("Got: {value}");
}
In a chain
RunnableWithFallbacks implements Runnable<I, O>, so it composes with the pipe operator:
let resilient_model = RunnableWithFallbacks::new(
primary_model.boxed(),
vec![fallback_model.boxed()],
);
let chain = preprocess.boxed() | resilient_model.boxed() | postprocess.boxed();
When to use fallbacks vs. retry
- Use RunnableWithFallbacks when you have genuinely different alternatives (e.g., different LLM providers or different strategies).
- Use RunnableRetry when you want to retry the same runnable with exponential backoff (e.g., transient network errors).
You can combine both -- wrap a retrying runnable as the primary, with a different provider as a fallback:
use synaptic::runnables::{RunnableRetry, RetryPolicy, RunnableWithFallbacks};
let retrying_primary = RunnableRetry::new(primary.boxed(), RetryPolicy::default());
let resilient = RunnableWithFallbacks::new(
retrying_primary.boxed(),
vec![fallback.boxed()],
);
Bind
This guide shows how to use BoxRunnable::bind() to attach configuration transforms and listeners to a runnable.
Overview
bind() creates a new BoxRunnable that applies a transformation to the RunnableConfig before each invocation. This is useful for injecting tags, metadata, or other config fields into a runnable without modifying the call site.
Internally, bind() wraps the runnable in a RunnableBind that calls the transform function on the config, then delegates to the inner runnable with the modified config.
Basic usage
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
let step = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
// Bind a config transform that adds a tag
let bound = step.boxed().bind(|mut config| {
config.tags.push("my-tag".to_string());
config
});
let config = RunnableConfig::default();
let result = bound.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "HELLO");
// The inner runnable received a config with tags: ["my-tag"]
The transform function receives the RunnableConfig by value (cloned from the original) and returns the modified config.
Adding metadata
You can use bind() to attach metadata that downstream runnables or callbacks can inspect:
use serde_json::json;
let bound = step.boxed().bind(|mut config| {
config.metadata.insert("source".to_string(), json!("user-query"));
config.metadata.insert("priority".to_string(), json!("high"));
config
});
Setting a fixed config with with_config()
If you want to replace the config entirely rather than modify it, use with_config(). This ignores whatever config is passed at invocation time and uses the provided config instead:
let fixed_config = RunnableConfig {
tags: vec!["production".to_string()],
run_name: Some("fixed-pipeline".to_string()),
..RunnableConfig::default()
};
let bound = step.boxed().with_config(fixed_config);
// Even if a different config is passed to invoke(), the fixed config is used
let any_config = RunnableConfig::default();
let result = bound.invoke("hello".to_string(), &any_config).await?;
Streaming with bind
bind() also applies the config transform during stream() calls, not just invoke():
use futures::StreamExt;
let bound = step.boxed().bind(|mut config| {
config.tags.push("streaming".to_string());
config
});
let mut stream = bound.stream("hello".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("{value}");
}
Attaching listeners with with_listeners()
with_listeners() wraps a runnable with before/after callbacks that fire on each invocation. The callbacks receive a reference to the RunnableConfig:
let with_logging = step.boxed().with_listeners(
|config| {
println!("Starting run: {:?}", config.run_name);
},
|config| {
println!("Finished run: {:?}", config.run_name);
},
);
let result = with_logging.invoke("hello".to_string(), &config).await?;
// Prints: Starting run: None
// Prints: Finished run: None
Listeners also fire around stream() calls -- on_start fires before the first item is yielded, and on_end fires after the stream completes.
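For illustration, here is a minimal sketch combining with_listeners() with stream(), reusing the step and config values from the examples above:

```rust
use futures::StreamExt;

// Sketch: on_start fires before the first streamed item, on_end after the stream ends.
let observed = step.boxed().with_listeners(
    |config| println!("Stream starting: {:?}", config.run_name),
    |config| println!("Stream finished: {:?}", config.run_name),
);

let mut stream = observed.stream("hello".to_string(), &config);
while let Some(result) = stream.next().await {
    let value = result?;
    println!("{value}");
}
```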
Composing with bind in a chain
bind() returns a BoxRunnable, so you can chain it with the pipe operator:
let tagged_step = step.boxed().bind(|mut config| {
config.tags.push("step-1".to_string());
config
});
let chain = tagged_step | next_step.boxed();
let result = chain.invoke("input".to_string(), &config).await?;
RunnableConfig fields reference
The RunnableConfig struct has the following fields that you can modify via bind():
| Field | Type | Description |
|---|---|---|
tags | Vec<String> | Tags for filtering and categorization |
metadata | HashMap<String, Value> | Arbitrary key-value metadata |
max_concurrency | Option<usize> | Concurrency limit for batch operations |
recursion_limit | Option<usize> | Maximum recursion depth for chains |
run_id | Option<String> | Unique identifier for the current run |
run_name | Option<String> | Human-readable name for the current run |
Retry
This guide shows how to use RunnableRetry with RetryPolicy to automatically retry a runnable on failure with exponential backoff.
Overview
RunnableRetry wraps any runnable with retry logic. When the inner runnable returns an error, RunnableRetry waits for a backoff delay and tries again, up to a configurable maximum number of attempts. The backoff follows an exponential schedule: min(base_delay * 2^attempt, max_delay).
Basic usage
use std::time::Duration;
use synaptic::runnables::{Runnable, RunnableRetry, RetryPolicy, RunnableLambda};
use synaptic::core::RunnableConfig;
let flaky_step = RunnableLambda::new(|x: String| async move {
// Imagine this sometimes fails due to network issues
Ok(x.to_uppercase())
});
let policy = RetryPolicy::default(); // 3 attempts, 100ms base delay, 10s max delay
let with_retry = RunnableRetry::new(flaky_step.boxed(), policy);
let config = RunnableConfig::default();
let result = with_retry.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "HELLO");
Configuring the retry policy
RetryPolicy uses a builder pattern for configuration:
use std::time::Duration;
use synaptic::runnables::RetryPolicy;
let policy = RetryPolicy::default()
.with_max_attempts(5) // Up to 5 total attempts (1 initial + 4 retries)
.with_base_delay(Duration::from_millis(200)) // Start with 200ms delay
.with_max_delay(Duration::from_secs(30)); // Cap delay at 30 seconds
Default values
| Field | Default |
|---|---|
max_attempts | 3 |
base_delay | 100ms |
max_delay | 10 seconds |
Backoff schedule
The delay for each retry attempt is calculated as:
delay = min(base_delay * 2^attempt, max_delay)
For the defaults (100ms base, 10s max):
| Attempt | Delay |
|---|---|
| 1st retry (attempt 0) | 100ms |
| 2nd retry (attempt 1) | 200ms |
| 3rd retry (attempt 2) | 400ms |
| 4th retry (attempt 3) | 800ms |
| ... | ... |
| Capped at | 10s |
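To make the schedule concrete, here is a small standalone sketch of the documented formula (illustrative only -- not Synaptic's internal code):

```rust
use std::time::Duration;

// Sketch of the documented backoff formula: min(base_delay * 2^attempt, max_delay).
fn backoff_delay(attempt: u32, base_delay: Duration, max_delay: Duration) -> Duration {
    let scaled = base_delay.saturating_mul(2u32.saturating_pow(attempt));
    scaled.min(max_delay)
}

// With the defaults (100ms base, 10s max):
assert_eq!(backoff_delay(0, Duration::from_millis(100), Duration::from_secs(10)), Duration::from_millis(100));
assert_eq!(backoff_delay(3, Duration::from_millis(100), Duration::from_secs(10)), Duration::from_millis(800));
assert_eq!(backoff_delay(8, Duration::from_millis(100), Duration::from_secs(10)), Duration::from_secs(10));
```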
Filtering retryable errors
By default, all errors trigger a retry. Use with_retry_on() to specify a predicate that decides which errors are worth retrying:
use synaptic::runnables::RetryPolicy;
use synaptic::core::SynapticError;
let policy = RetryPolicy::default()
.with_max_attempts(4)
.with_retry_on(|error: &SynapticError| {
// Only retry provider errors (e.g., rate limits, timeouts)
matches!(error, SynapticError::Provider(_))
});
When the predicate returns false for an error, RunnableRetry immediately returns that error without further retries.
Input cloning requirement
The input type must implement Clone, because the input is reused for each retry attempt:
pub struct RunnableRetry<I: Send + Clone + 'static, O: Send + 'static> { ... }
In a chain
RunnableRetry implements Runnable<I, O>, so it works with the pipe operator:
use synaptic::runnables::{Runnable, RunnableRetry, RetryPolicy, RunnableLambda};
let preprocess = RunnableLambda::new(|x: String| async move {
Ok(x.trim().to_string())
});
let retrying_model = RunnableRetry::new(
model_step.boxed(),
RetryPolicy::default().with_max_attempts(3),
);
let chain = preprocess.boxed() | retrying_model.boxed();
Combining retry with fallbacks
For maximum resilience, wrap a retrying runnable with fallbacks. The primary is retried up to its limit; if it still fails, the fallback is tried:
use synaptic::runnables::{RunnableRetry, RetryPolicy, RunnableWithFallbacks};
let retrying_primary = RunnableRetry::new(
primary_model.boxed(),
RetryPolicy::default().with_max_attempts(3),
);
let resilient = RunnableWithFallbacks::new(
retrying_primary.boxed(),
vec![fallback_model.boxed()],
);
Full example
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;
use synaptic::runnables::{Runnable, RunnableRetry, RetryPolicy, RunnableLambda};
use synaptic::core::{RunnableConfig, SynapticError};
// Simulate a flaky service that fails twice then succeeds
let call_count = Arc::new(AtomicUsize::new(0));
let counter = call_count.clone();
let flaky = RunnableLambda::new(move |x: String| {
let counter = counter.clone();
async move {
let n = counter.fetch_add(1, Ordering::SeqCst);
if n < 2 {
Err(SynapticError::Provider("temporary failure".into()))
} else {
Ok(format!("Success: {x}"))
}
}
});
let policy = RetryPolicy::default()
.with_max_attempts(5)
.with_base_delay(Duration::from_millis(10));
let retrying = RunnableRetry::new(flaky.boxed(), policy);
let config = RunnableConfig::default();
let result = retrying.invoke("test".to_string(), &config).await?;
assert_eq!(result, "Success: test");
assert_eq!(call_count.load(Ordering::SeqCst), 3); // 2 failures + 1 success
Generator
This guide shows how to use RunnableGenerator to create a runnable from a streaming generator function.
Overview
RunnableGenerator wraps a function that produces a Stream of results. It bridges the gap between streaming generators and the Runnable trait:
- invoke() collects the entire stream into a Vec<O>
- stream() yields each item individually as it is produced
This is useful when you want a runnable that naturally produces output incrementally -- for example, tokenizers, chunkers, or any computation that yields partial results.
Basic usage
use synaptic::runnables::{Runnable, RunnableGenerator};
use synaptic::core::{RunnableConfig, SynapticError};
let gen = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_uppercase());
}
}
});
let config = RunnableConfig::default();
let result = gen.invoke("hello world".to_string(), &config).await?;
assert_eq!(result, vec!["HELLO", "WORLD"]);
Streaming
The real power of RunnableGenerator is streaming. stream() yields each item as it is produced, without waiting for the generator to finish:
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableGenerator};
use synaptic::core::RunnableConfig;
let gen = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for ch in input.chars() {
yield Ok(ch.to_string());
}
}
});
let config = RunnableConfig::default();
let mut stream = gen.stream("abc".to_string(), &config);
// Each item arrives individually wrapped in a Vec
while let Some(item) = stream.next().await {
let chunk = item?;
println!("{:?}", chunk); // ["a"], ["b"], ["c"]
}
Each streamed item is wrapped in Vec<O> to match the output type of invoke(). This means stream() yields Result<Vec<O>, SynapticError> where each Vec contains a single element.
Error handling
If the generator yields an Err, invoke() stops collecting and returns that error. stream() yields the error and continues to the next item:
use synaptic::runnables::RunnableGenerator;
use synaptic::core::SynapticError;
let gen = RunnableGenerator::new(|_input: String| {
async_stream::stream! {
yield Ok("first".to_string());
yield Err(SynapticError::Other("oops".into()));
yield Ok("third".to_string());
}
});
// invoke() fails on the error:
// gen.invoke("x".to_string(), &config).await => Err(...)
// stream() yields all three items:
// Ok(["first"]), Err(...), Ok(["third"])
In a pipeline
RunnableGenerator implements Runnable<I, Vec<O>>, so it works with the pipe operator. Place it wherever you need streaming generation in a chain:
use synaptic::runnables::{Runnable, RunnableGenerator, RunnableLambda};
let tokenize = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for token in input.split_whitespace() {
yield Ok(token.to_string());
}
}
});
let count = RunnableLambda::new(|tokens: Vec<String>| async move {
Ok(tokens.len())
});
let chain = tokenize.boxed() | count.boxed();
// chain.invoke("one two three".to_string(), &config).await => Ok(3)
Type signature
pub struct RunnableGenerator<I: Send + 'static, O: Send + 'static> { ... }
impl<I, O> Runnable<I, Vec<O>> for RunnableGenerator<I, O> { ... }
The constructor accepts any function Fn(I) -> S where S: Stream<Item = Result<O, SynapticError>> + Send + 'static. The async_stream::stream! macro is the most ergonomic way to produce such a stream.
Each
This guide shows how to use RunnableEach to map a runnable over each element in a list.
Overview
RunnableEach wraps any BoxRunnable<I, O> and applies it to every element in a Vec<I>, producing a Vec<O>. It is the runnable equivalent of Iterator::map() -- process a batch of items through the same transformation.
Basic usage
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
use synaptic::core::RunnableConfig;
let upper = RunnableLambda::new(|s: String| async move {
Ok(s.to_uppercase())
});
let each = RunnableEach::new(upper.boxed());
let config = RunnableConfig::default();
let result = each.invoke(
vec!["hello".into(), "world".into()],
&config,
).await?;
assert_eq!(result, vec!["HELLO", "WORLD"]);
Error propagation
If the inner runnable fails on any element, RunnableEach stops and returns that error immediately. Elements processed before the failure are discarded:
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
use synaptic::core::{RunnableConfig, SynapticError};
let must_be_short = RunnableLambda::new(|s: String| async move {
if s.len() > 5 {
Err(SynapticError::Other(format!("too long: {s}")))
} else {
Ok(s.to_uppercase())
}
});
let each = RunnableEach::new(must_be_short.boxed());
let config = RunnableConfig::default();
let result = each.invoke(
vec!["hi".into(), "toolong".into(), "ok".into()],
&config,
).await;
assert!(result.is_err()); // fails on "toolong"
Empty input
An empty input vector produces an empty output vector:
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
use synaptic::core::RunnableConfig;
let identity = RunnableLambda::new(|s: String| async move { Ok(s) });
let each = RunnableEach::new(identity.boxed());
let config = RunnableConfig::default();
let result = each.invoke(vec![], &config).await?;
assert!(result.is_empty());
In a pipeline
RunnableEach implements Runnable<Vec<I>, Vec<O>>, so it composes with the pipe operator. A common pattern is to split input into parts, process each with RunnableEach, and then combine the results:
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
// Step 1: split a string into words
let split = RunnableLambda::new(|s: String| async move {
Ok(s.split_whitespace().map(String::from).collect::<Vec<_>>())
});
// Step 2: process each word
let process = RunnableEach::new(
RunnableLambda::new(|w: String| async move {
Ok(w.to_uppercase())
}).boxed()
);
// Step 3: join results
let join = RunnableLambda::new(|words: Vec<String>| async move {
Ok(words.join(", "))
});
let chain = split.boxed() | process.boxed() | join.boxed();
// chain.invoke("hello world".to_string(), &config).await => Ok("HELLO, WORLD")
Type signature
pub struct RunnableEach<I: Send + 'static, O: Send + 'static> {
inner: BoxRunnable<I, O>,
}
impl<I, O> Runnable<Vec<I>, Vec<O>> for RunnableEach<I, O> { ... }
Elements are processed sequentially in order. For concurrent processing, use RunnableParallel or the batch() method on a BoxRunnable instead.
Retrieval
Synaptic provides a complete Retrieval-Augmented Generation (RAG) pipeline. The pipeline follows five stages:
- Load -- ingest raw data from files, JSON, CSV, web URLs, or entire directories.
- Split -- break large documents into smaller chunks that fit within context windows.
- Embed -- convert text chunks into numerical vectors using an embedding model.
- Store -- persist embeddings in a vector store for efficient similarity search.
- Retrieve -- find the most relevant documents for a given query.
Key types
| Type | Crate | Purpose |
|---|---|---|
Document | synaptic_retrieval | A unit of text with id, content, and metadata: HashMap<String, Value> |
Loader trait | synaptic_loaders | Async trait for loading documents from various sources |
TextSplitter trait | synaptic_splitters | Splits text into chunks with optional overlap |
Embeddings trait | synaptic_embeddings | Converts text into vector representations |
VectorStore trait | synaptic_vectorstores | Stores and searches document embeddings |
Retriever trait | synaptic_retrieval | Retrieves relevant documents given a query string |
Retrievers
Synaptic ships with seven retriever implementations, each suited to different use cases:
| Retriever | Strategy |
|---|---|
VectorStoreRetriever | Wraps any VectorStore for cosine similarity search |
BM25Retriever | Okapi BM25 keyword scoring -- no embeddings required |
MultiQueryRetriever | Uses an LLM to generate query variants, retrieves for each, deduplicates |
EnsembleRetriever | Combines multiple retrievers via Reciprocal Rank Fusion |
ContextualCompressionRetriever | Post-filters retrieved documents using a DocumentCompressor |
SelfQueryRetriever | Uses an LLM to extract structured metadata filters from natural language |
ParentDocumentRetriever | Searches small child chunks but returns full parent documents |
Guides
- Document Loaders -- load data from text, JSON, CSV, files, directories, and the web
- Text Splitters -- break documents into chunks with character, recursive, markdown, or token-based strategies
- Embeddings -- embed text using OpenAI, Ollama, or deterministic fake embeddings
- Vector Stores -- store and search embeddings with InMemoryVectorStore
- BM25 Retriever -- keyword-based retrieval with Okapi BM25 scoring
- Multi-Query Retriever -- improve recall by generating multiple query perspectives
- Ensemble Retriever -- combine retrievers with Reciprocal Rank Fusion
- Contextual Compression -- post-filter results with embedding similarity thresholds
- Self-Query Retriever -- LLM-powered metadata filtering from natural language
- Parent Document Retriever -- search small chunks, return full parent documents
Document Loaders
This guide shows how to load documents from various sources using Synaptic's Loader trait and its built-in implementations.
Overview
Every loader implements the Loader trait from synaptic_loaders:
#[async_trait]
pub trait Loader: Send + Sync {
async fn load(&self) -> Result<Vec<Document>, SynapticError>;
}
Each loader returns Vec<Document>. A Document has three fields:
- id: String -- a unique identifier
- content: String -- the document text
- metadata: HashMap<String, Value> -- arbitrary key-value metadata
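As a quick sketch of that shape, using the Document constructors that appear later in this section:

```rust
use std::collections::HashMap;
use serde_json::json;
use synaptic::retrieval::Document;

// Construct a document directly; loaders produce the same shape.
let doc = Document::with_metadata(
    "doc-1",
    "Rust is a systems programming language.",
    HashMap::from([("source".to_string(), json!("inline"))]),
);
assert_eq!(doc.id, "doc-1");
assert_eq!(doc.content, "Rust is a systems programming language.");
assert_eq!(doc.metadata["source"], "inline");
```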
TextLoader
Wraps a string of text into a single Document. Useful when you already have content in memory.
use synaptic::loaders::{TextLoader, Loader};
let loader = TextLoader::new("doc-1", "Rust is a systems programming language.");
let docs = loader.load().await?;
assert_eq!(docs.len(), 1);
assert_eq!(docs[0].content, "Rust is a systems programming language.");
The first argument is the document ID; the second is the content.
FileLoader
Reads a file from disk using tokio::fs::read_to_string and returns a single Document. The file path is used as the document ID, and a source metadata key is set to the file path.
use synaptic::loaders::{FileLoader, Loader};
let loader = FileLoader::new("data/notes.txt");
let docs = loader.load().await?;
assert_eq!(docs[0].metadata["source"], "data/notes.txt");
JsonLoader
Loads documents from a JSON string. If the JSON is an array of objects, each object becomes a Document. If it is a single object, one Document is produced.
use synaptic::loaders::{JsonLoader, Loader};
let json_data = r#"[
{"id": "1", "content": "First document"},
{"id": "2", "content": "Second document"}
]"#;
let loader = JsonLoader::new(json_data);
let docs = loader.load().await?;
assert_eq!(docs.len(), 2);
assert_eq!(docs[0].content, "First document");
By default, JsonLoader looks for "id" and "content" keys. You can customize them with builder methods:
let loader = JsonLoader::new(json_data)
.with_id_key("doc_id")
.with_content_key("text");
CsvLoader
Loads documents from CSV data. Each row becomes a Document. All columns are stored as metadata.
use synaptic::loaders::{CsvLoader, Loader};
let csv_data = "title,body,author\nIntro,Hello world,Alice\nChapter 1,Once upon a time,Bob";
let loader = CsvLoader::new(csv_data)
.with_content_column("body")
.with_id_column("title");
let docs = loader.load().await?;
assert_eq!(docs.len(), 2);
assert_eq!(docs[0].id, "Intro");
assert_eq!(docs[0].content, "Hello world");
assert_eq!(docs[0].metadata["author"], "Alice");
If no content_column is specified, all columns are concatenated. If no id_column is specified, IDs default to "row-0", "row-1", etc.
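A minimal sketch of those defaults (the exact concatenated content format is not asserted here):

```rust
use synaptic::loaders::{CsvLoader, Loader};

// No with_content_column / with_id_column: content is the concatenated columns,
// and IDs fall back to the row-based scheme described above.
let csv_data = "title,body\nIntro,Hello world";
let loader = CsvLoader::new(csv_data);
let docs = loader.load().await?;

assert_eq!(docs.len(), 1);
assert_eq!(docs[0].id, "row-0");
```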
DirectoryLoader
Loads all files from a directory, each file becoming a Document. Use with_glob to filter by extension and with_recursive to include subdirectories.
use synaptic::loaders::{DirectoryLoader, Loader};
let loader = DirectoryLoader::new("./docs")
.with_glob("*.txt")
.with_recursive(true);
let docs = loader.load().await?;
// Each document has a `source` metadata key set to the file path
Document IDs are the relative file paths from the base directory.
MarkdownLoader
Reads a markdown file and returns it as a single Document with format: "markdown" in metadata.
use synaptic::loaders::{MarkdownLoader, Loader};
let loader = MarkdownLoader::new("docs/guide.md");
let docs = loader.load().await?;
assert_eq!(docs[0].metadata["format"], "markdown");
WebBaseLoader
Fetches content from a URL via HTTP GET and returns a single Document. Metadata includes source (the URL) and content_type (from the response header).
use synaptic::loaders::{WebBaseLoader, Loader};
let loader = WebBaseLoader::new("https://example.com/page.html");
let docs = loader.load().await?;
assert_eq!(docs[0].metadata["source"], "https://example.com/page.html");
Lazy loading
Every Loader also provides a lazy_load() method that returns a Stream of documents instead of loading all at once. The default implementation wraps load(), but custom loaders can override it for true lazy behavior.
use futures::StreamExt;
use synaptic::loaders::{DirectoryLoader, Loader};
let loader = DirectoryLoader::new("./data").with_glob("*.txt");
let mut stream = loader.lazy_load();
while let Some(result) = stream.next().await {
let doc = result?;
println!("Loaded: {}", doc.id);
}
Text Splitters
This guide shows how to break large documents into smaller chunks using Synaptic's TextSplitter trait and its built-in implementations.
Overview
All splitters implement the TextSplitter trait from synaptic_splitters:
pub trait TextSplitter: Send + Sync {
fn split_text(&self, text: &str) -> Vec<String>;
fn split_documents(&self, docs: Vec<Document>) -> Vec<Document>;
}
- split_text() takes a string and returns a vector of chunks.
- split_documents() splits each document's content, producing new Document values with preserved metadata and an added chunk_index field.
CharacterTextSplitter
Splits text on a single separator string, then merges small pieces to stay under chunk_size.
use synaptic::splitters::CharacterTextSplitter;
use synaptic::splitters::TextSplitter;
// Chunk size in characters, default separator is "\n\n"
let splitter = CharacterTextSplitter::new(500);
let chunks = splitter.split_text("long text...");
Configure the separator and overlap:
let splitter = CharacterTextSplitter::new(500)
.with_separator("\n") // Split on single newlines
.with_chunk_overlap(50); // 50 characters of overlap between chunks
RecursiveCharacterTextSplitter
The most commonly used splitter. Tries a hierarchy of separators in order, splitting with the first one that produces chunks small enough. If a chunk is still too large, it recurses with the next separator.
Default separators: ["\n\n", "\n", " ", ""]
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::splitters::TextSplitter;
let splitter = RecursiveCharacterTextSplitter::new(1000)
.with_chunk_overlap(200);
let chunks = splitter.split_text("long document text...");
Custom separators:
let splitter = RecursiveCharacterTextSplitter::new(1000)
.with_separators(vec![
"\n\n\n".to_string(),
"\n\n".to_string(),
"\n".to_string(),
" ".to_string(),
String::new(),
]);
Language-aware splitting
Use from_language() to get separators tuned for a specific programming language:
use synaptic::splitters::{RecursiveCharacterTextSplitter, Language};
let splitter = RecursiveCharacterTextSplitter::from_language(
Language::Rust,
1000, // chunk_size
200, // chunk_overlap
);
MarkdownHeaderTextSplitter
Splits markdown text by headers, adding the header hierarchy to each chunk's metadata.
use synaptic::splitters::{MarkdownHeaderTextSplitter, HeaderType};
let splitter = MarkdownHeaderTextSplitter::new(vec![
HeaderType { level: "#".to_string(), name: "h1".to_string() },
HeaderType { level: "##".to_string(), name: "h2".to_string() },
HeaderType { level: "###".to_string(), name: "h3".to_string() },
]);
let docs = splitter.split_markdown("# Title\n\nIntro text\n\n## Section\n\nBody text");
// docs[0].metadata contains {"h1": "Title"}
// docs[1].metadata contains {"h1": "Title", "h2": "Section"}
A convenience constructor provides the default #, ##, ### configuration:
let splitter = MarkdownHeaderTextSplitter::default_headers();
Note that MarkdownHeaderTextSplitter also implements TextSplitter, but split_markdown() returns Vec<Document> with full metadata, which is usually what you want.
TokenTextSplitter
Splits text by estimated token count using a ~4 characters per token heuristic. Splits at word boundaries to keep chunks readable.
use synaptic::splitters::TokenTextSplitter;
use synaptic::splitters::TextSplitter;
// chunk_size is in estimated tokens (not characters)
let splitter = TokenTextSplitter::new(500)
.with_chunk_overlap(50);
let chunks = splitter.split_text("long text...");
This is consistent with the token estimation used in ConversationTokenBufferMemory.
HtmlHeaderTextSplitter
Splits HTML text by header tags (<h1>, <h2>, etc.), adding header hierarchy to each chunk's metadata. Similar to MarkdownHeaderTextSplitter but for HTML content.
use synaptic::splitters::HtmlHeaderTextSplitter;
let splitter = HtmlHeaderTextSplitter::new(vec![
("h1".to_string(), "Header 1".to_string()),
("h2".to_string(), "Header 2".to_string()),
]);
let html = "<h1>Title</h1><p>Intro text</p><h2>Section</h2><p>Body text</p>";
let docs = splitter.split_html(html);
// docs[0].metadata contains {"Header 1": "Title"}
// docs[1].metadata contains {"Header 1": "Title", "Header 2": "Section"}
The constructor takes a list of (tag_name, metadata_key) pairs. Only the specified tags are treated as split points; all other HTML content is treated as body text within the current section.
Splitting documents
All splitters can split a Vec<Document> into smaller chunks. Each chunk inherits the parent's metadata and gets a chunk_index field. The chunk ID is formatted as "{original_id}-chunk-{index}".
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
use synaptic::retrieval::Document;
let splitter = RecursiveCharacterTextSplitter::new(500);
let docs = vec![
Document::new("doc-1", "A very long document..."),
Document::new("doc-2", "Another long document..."),
];
let chunks = splitter.split_documents(docs);
// chunks[0].id == "doc-1-chunk-0"
// chunks[0].metadata["chunk_index"] == 0
Choosing a splitter
| Splitter | Best for |
|---|---|
CharacterTextSplitter | Simple splitting on a known delimiter |
RecursiveCharacterTextSplitter | General-purpose text -- tries to preserve paragraphs, then sentences, then words |
MarkdownHeaderTextSplitter | Markdown documents where you want header context in metadata |
TokenTextSplitter | When you need to control chunk size in tokens rather than characters |
Embeddings
This guide shows how to convert text into vector representations using Synaptic's Embeddings trait and its built-in providers.
Overview
All embedding providers implement the Embeddings trait from synaptic_embeddings:
#[async_trait]
pub trait Embeddings: Send + Sync {
async fn embed_documents(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, SynapticError>;
async fn embed_query(&self, text: &str) -> Result<Vec<f32>, SynapticError>;
}
- embed_documents() embeds multiple texts in a single batch -- use this for indexing.
- embed_query() embeds a single query text -- use this at retrieval time.
FakeEmbeddings
Generates deterministic vectors based on a simple hash of the input text. Useful for testing and development without API calls.
use synaptic::embeddings::FakeEmbeddings;
use synaptic::embeddings::Embeddings;
// Specify the number of dimensions (default is 4)
let embeddings = FakeEmbeddings::new(4);
let doc_vectors = embeddings.embed_documents(&["doc one", "doc two"]).await?;
let query_vector = embeddings.embed_query("search query").await?;
// Vectors are normalized to unit length
// Similar texts produce similar vectors
OpenAiEmbeddings
Uses the OpenAI embeddings API. Requires an API key and a ProviderBackend.
use std::sync::Arc;
use synaptic::embeddings::{OpenAiEmbeddings, OpenAiEmbeddingsConfig};
use synaptic::embeddings::Embeddings;
use synaptic::models::backend::HttpBackend;
let config = OpenAiEmbeddingsConfig::new("sk-...")
.with_model("text-embedding-3-small"); // default model
let backend = Arc::new(HttpBackend::new());
let embeddings = OpenAiEmbeddings::new(config, backend);
let vectors = embeddings.embed_documents(&["hello world"]).await?;
You can customize the base URL for compatible APIs:
let config = OpenAiEmbeddingsConfig::new("sk-...")
.with_base_url("https://my-proxy.example.com/v1");
OllamaEmbeddings
Uses a local Ollama instance for embedding. No API key required -- just specify the model name.
use std::sync::Arc;
use synaptic::embeddings::{OllamaEmbeddings, OllamaEmbeddingsConfig};
use synaptic::embeddings::Embeddings;
use synaptic::models::backend::HttpBackend;
let config = OllamaEmbeddingsConfig::new("nomic-embed-text");
// Default base_url: http://localhost:11434
let backend = Arc::new(HttpBackend::new());
let embeddings = OllamaEmbeddings::new(config, backend);
let vector = embeddings.embed_query("search query").await?;
Custom Ollama endpoint:
let config = OllamaEmbeddingsConfig::new("nomic-embed-text")
.with_base_url("http://my-ollama:11434");
CacheBackedEmbeddings
Wraps any Embeddings provider with an in-memory cache. Previously computed embeddings are returned from cache; only uncached texts are sent to the underlying provider.
use std::sync::Arc;
use synaptic::embeddings::{CacheBackedEmbeddings, FakeEmbeddings, Embeddings};
let inner = Arc::new(FakeEmbeddings::new(128));
let cached = CacheBackedEmbeddings::new(inner);
// First call computes the embedding
let v1 = cached.embed_query("hello").await?;
// Second call returns the cached result -- no recomputation
let v2 = cached.embed_query("hello").await?;
assert_eq!(v1, v2);
This is especially useful when adding documents to a vector store and then querying, since the same text may be embedded multiple times across operations.
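A minimal sketch of that pattern, assuming CacheBackedEmbeddings implements the Embeddings trait just like the provider it wraps:

```rust
use std::sync::Arc;
use synaptic::embeddings::{CacheBackedEmbeddings, FakeEmbeddings};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
use synaptic::retrieval::Document;

// Sketch: share one cache between indexing and querying so repeated texts
// are not re-embedded.
let cached = CacheBackedEmbeddings::new(Arc::new(FakeEmbeddings::new(128)));
let store = InMemoryVectorStore::new();

let docs = vec![Document::new("1", "Rust is fast")];
store.add_documents(docs, &cached).await?;              // embeddings computed and cached here
let results = store.similarity_search("Rust is fast", 3, &cached).await?; // served from the cache
```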
Using embeddings with vector stores
Embeddings are passed to vector store methods rather than stored inside the vector store. This lets you swap embedding providers without rebuilding the store.
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Document;
let embeddings = FakeEmbeddings::new(128);
let store = InMemoryVectorStore::new();
let docs = vec![Document::new("1", "Rust is fast")];
store.add_documents(docs, &embeddings).await?;
let results = store.similarity_search("fast language", 5, &embeddings).await?;
Vector Stores
This guide shows how to store and search document embeddings using Synaptic's VectorStore trait and the built-in InMemoryVectorStore.
Overview
The VectorStore trait from synaptic_vectorstores provides methods for adding, searching, and deleting documents:
#[async_trait]
pub trait VectorStore: Send + Sync {
async fn add_documents(
&self, docs: Vec<Document>, embeddings: &dyn Embeddings,
) -> Result<Vec<String>, SynapticError>;
async fn similarity_search(
&self, query: &str, k: usize, embeddings: &dyn Embeddings,
) -> Result<Vec<Document>, SynapticError>;
async fn similarity_search_with_score(
&self, query: &str, k: usize, embeddings: &dyn Embeddings,
) -> Result<Vec<(Document, f32)>, SynapticError>;
async fn similarity_search_by_vector(
&self, embedding: &[f32], k: usize,
) -> Result<Vec<Document>, SynapticError>;
async fn delete(&self, ids: &[&str]) -> Result<(), SynapticError>;
}
The embeddings parameter is passed to each method rather than stored inside the vector store. This design lets you swap embedding providers without rebuilding the store.
InMemoryVectorStore
An in-memory vector store that uses cosine similarity for search. Backed by a RwLock<HashMap>.
Creating a store
use synaptic::vectorstores::InMemoryVectorStore;
let store = InMemoryVectorStore::new();
Adding documents
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Document;
let store = InMemoryVectorStore::new();
let embeddings = FakeEmbeddings::new(128);
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
// ids == ["1", "2", "3"]
Similarity search
Find the k most similar documents to a query:
let results = store.similarity_search("fast systems language", 2, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
Get similarity scores alongside results (higher is more similar):
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Search by vector
Search using a pre-computed embedding vector instead of a text query:
use synaptic::embeddings::Embeddings;
let query_vec = embeddings.embed_query("systems programming").await?;
let results = store.similarity_search_by_vector(&query_vec, 3).await?;
Deleting documents
store.delete(&["1", "3"]).await?;
Convenience constructors
Create a store pre-populated with documents:
use synaptic::vectorstores::InMemoryVectorStore;
use synaptic::embeddings::FakeEmbeddings;
let embeddings = FakeEmbeddings::new(128);
// From (id, content) tuples
let store = InMemoryVectorStore::from_texts(
vec![("1", "Rust is fast"), ("2", "Python is flexible")],
&embeddings,
).await?;
// From Document values
let store = InMemoryVectorStore::from_documents(docs, &embeddings).await?;
Maximum Marginal Relevance (MMR)
MMR search balances relevance with diversity. The lambda_mult parameter controls the trade-off:
- 1.0 -- pure relevance (equivalent to standard similarity search)
- 0.0 -- maximum diversity
- 0.5 -- balanced (typical default)
let results = store.max_marginal_relevance_search(
"programming language",
3, // k: number of results
10, // fetch_k: initial candidates to consider
0.5, // lambda_mult: relevance vs. diversity
&embeddings,
).await?;
VectorStoreRetriever
VectorStoreRetriever bridges any VectorStore to the Retriever trait, making it compatible with the rest of Synaptic's retrieval infrastructure.
use std::sync::Arc;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Retriever;
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::new());
// ... add documents to store ...
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("query", 5).await?;
MultiVectorRetriever
MultiVectorRetriever stores small child chunks in a vector store for precise retrieval, but returns the larger parent documents they came from. This gives you the best of both worlds: small chunks for accurate embedding search and full documents for LLM context.
use std::sync::Arc;
use synaptic::vectorstores::{InMemoryVectorStore, MultiVectorRetriever};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::{Document, Retriever};
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::new());
let retriever = MultiVectorRetriever::new(store, embeddings, 3);
// Add parent documents with their child chunks
let parent = Document::new("parent-1", "Full article about Rust ownership...");
let children = vec![
Document::new("child-1", "Ownership rules in Rust"),
Document::new("child-2", "Borrowing and references"),
];
retriever.add_documents(parent, children).await?;
// Search finds child chunks but returns the parent
let results = retriever.retrieve("ownership", 1).await?;
assert_eq!(results[0].id, "parent-1");
The id_key metadata field links children to their parent. By default it is "doc_id".
Score threshold filtering
Set a minimum similarity score. Only documents meeting the threshold are returned:
let retriever = VectorStoreRetriever::new(store, embeddings, 10)
.with_score_threshold(0.7);
let results = retriever.retrieve("query", 10).await?;
// Only documents with cosine similarity >= 0.7 are included
BM25 Retriever
This guide shows how to use the BM25Retriever for keyword-based document retrieval using Okapi BM25 scoring.
Overview
BM25 (Best Matching 25) is a classic information retrieval algorithm that scores documents based on term frequency, inverse document frequency, and document length normalization. Unlike vector-based retrieval, BM25 does not require embeddings -- it works directly on the text.
BM25 is a good choice when:
- You need exact keyword matching rather than semantic similarity.
- You want fast retrieval without the cost of computing embeddings.
- You want to combine it with a vector retriever in an ensemble (see Ensemble Retriever).
Basic usage
use synaptic::retrieval::{BM25Retriever, Document, Retriever};
let docs = vec![
Document::new("1", "Rust is a systems programming language focused on safety"),
Document::new("2", "Python is widely used for data science and machine learning"),
Document::new("3", "Go was designed at Google for concurrent programming"),
Document::new("4", "Rust provides memory safety without garbage collection"),
];
let retriever = BM25Retriever::new(docs);
let results = retriever.retrieve("Rust memory safety", 2).await?;
// Returns documents 4 and 1 (highest BM25 scores for those query terms)
The retriever pre-computes term frequencies, document lengths, and inverse document frequencies at construction time, so retrieval itself is fast.
Custom BM25 parameters
BM25 has two tuning parameters:
- k1 (default 1.5) -- controls term frequency saturation. Higher values give more weight to term frequency.
- b (default 0.75) -- controls document length normalization. 1.0 means full length normalization; 0.0 means no length normalization.
let retriever = BM25Retriever::with_params(docs, 1.2, 0.8);
How scoring works
For each query term, BM25 computes:
score = IDF * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * dl / avgdl))
Where:
- IDF = ln((N - df + 0.5) / (df + 0.5) + 1) -- inverse document frequency
- dl -- document length (in tokens)
- avgdl -- average document length across the corpus
- N -- total number of documents
- df -- number of documents containing the term
Documents with a total score of zero (no matching terms) are excluded from results.
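To make the formula concrete, here is a small self-contained sketch of the per-term score (illustrative only -- not Synaptic's internal implementation):

```rust
// Per-term BM25 score, following the formula above.
fn bm25_term_score(tf: f64, df: f64, n_docs: f64, dl: f64, avgdl: f64, k1: f64, b: f64) -> f64 {
    let idf = ((n_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
    idf * (tf * (k1 + 1.0)) / (tf + k1 * (1.0 - b + b * dl / avgdl))
}

// Example: a term appearing twice in a 10-token document, in a corpus of 4 documents
// where 2 contain the term and the average length is 12 tokens, with defaults k1=1.5, b=0.75.
let score = bm25_term_score(2.0, 2.0, 4.0, 10.0, 12.0, 1.5, 0.75);
assert!(score > 0.0);
```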
Combining with vector search
BM25 pairs well with vector retrieval through the EnsembleRetriever. This gives you the best of both keyword matching and semantic search:
use std::sync::Arc;
use synaptic::retrieval::{BM25Retriever, EnsembleRetriever, Retriever};
let bm25 = Arc::new(BM25Retriever::new(docs.clone()));
let vector_retriever = Arc::new(/* VectorStoreRetriever */);
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever as Arc<dyn Retriever>, 0.5),
(bm25 as Arc<dyn Retriever>, 0.5),
]);
let results = ensemble.retrieve("query", 5).await?;
See Ensemble Retriever for more details on combining retrievers.
Multi-Query Retriever
This guide shows how to use the MultiQueryRetriever to improve retrieval recall by generating multiple query perspectives with an LLM.
Overview
A single search query may not capture all relevant documents, especially when the user's phrasing does not match the vocabulary in the document corpus. The MultiQueryRetriever addresses this by:
- Using a ChatModel to generate alternative phrasings of the original query.
- Running each query variant through a base retriever.
- Deduplicating and merging the results.
This technique improves recall by overcoming limitations of distance-based similarity search.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{MultiQueryRetriever, Retriever};
let base_retriever: Arc<dyn Retriever> = Arc::new(/* any retriever */);
let model: Arc<dyn ChatModel> = Arc::new(/* any ChatModel */);
// Default: generates 3 query variants
let retriever = MultiQueryRetriever::new(base_retriever, model);
let results = retriever.retrieve("What are the benefits of Rust?", 5).await?;
When you call retrieve(), the retriever:
- Sends a prompt to the LLM asking it to rephrase the query into 3 different versions.
- Runs the original query plus all generated variants through the base retriever.
- Collects all results, deduplicates by document id, and returns up to top_k documents.
Custom number of query variants
Specify a different number of generated queries:
let retriever = MultiQueryRetriever::with_num_queries(
base_retriever,
model,
5, // Generate 5 query variants
);
More variants increase recall but also increase the number of LLM and retriever calls.
How it works internally
The retriever sends a prompt like this to the LLM:
You are an AI language model assistant. Your task is to generate 3 different versions of the given user question to retrieve relevant documents from a vector database. By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of distance-based similarity search. Provide these alternative questions separated by newlines. Only output the questions, nothing else.
Original question: What are the benefits of Rust?
The LLM might respond with:
Why should I use Rust as a programming language?
What advantages does Rust offer over other languages?
What makes Rust a good choice for software development?
Each of these queries is then run through the base retriever, and all results are merged with deduplication.
Example with a vector store
use std::sync::Arc;
use synaptic::retrieval::{MultiQueryRetriever, Retriever};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Document;
// Set up vector store
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::new());
let docs = vec![
Document::new("1", "Rust ensures memory safety without a garbage collector"),
Document::new("2", "Rust's ownership system prevents data races at compile time"),
Document::new("3", "Go uses goroutines for lightweight concurrency"),
];
store.add_documents(docs, embeddings.as_ref()).await?;
// Wrap vector store as a retriever
let base = Arc::new(VectorStoreRetriever::new(store, embeddings, 5));
// Create multi-query retriever
let model: Arc<dyn ChatModel> = Arc::new(/* your model */);
let retriever = MultiQueryRetriever::new(base, model);
let results = retriever.retrieve("Why is Rust safe?", 5).await?;
// May find documents that mention "memory safety", "ownership", "data races"
// even if the original query doesn't use those exact terms
Ensemble Retriever
This guide shows how to combine multiple retrievers using the EnsembleRetriever and Reciprocal Rank Fusion (RRF).
Overview
Different retrieval strategies have different strengths. Keyword-based methods (like BM25) excel at exact term matching, while vector-based methods capture semantic similarity. The EnsembleRetriever combines results from multiple retrievers into a single ranked list, giving you the best of both approaches.
It uses Reciprocal Rank Fusion (RRF) to merge rankings. Each retriever contributes a weighted RRF score for each document based on the document's rank in that retriever's results. Documents are then sorted by their total RRF score.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{EnsembleRetriever, Retriever};
let retriever_a: Arc<dyn Retriever> = Arc::new(/* vector retriever */);
let retriever_b: Arc<dyn Retriever> = Arc::new(/* BM25 retriever */);
let ensemble = EnsembleRetriever::new(vec![
(retriever_a, 0.5), // weight 0.5
(retriever_b, 0.5), // weight 0.5
]);
let results = ensemble.retrieve("query", 5).await?;
Each tuple contains a retriever and its weight. The weight scales the RRF score contribution from that retriever.
Combining vector search with BM25
The most common use case is combining semantic (vector) search with keyword (BM25) search:
use std::sync::Arc;
use synaptic::retrieval::{BM25Retriever, EnsembleRetriever, Document, Retriever};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
let docs = vec![
Document::new("1", "Rust provides memory safety through ownership"),
Document::new("2", "Python has a large ecosystem for machine learning"),
Document::new("3", "Rust's borrow checker prevents data races"),
Document::new("4", "Go is designed for building scalable services"),
];
// BM25 retriever (keyword-based)
let bm25 = Arc::new(BM25Retriever::new(docs.clone()));
// Vector retriever (semantic)
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::from_documents(docs, embeddings.as_ref()).await?);
let vector = Arc::new(VectorStoreRetriever::new(store, embeddings, 5));
// Combine with equal weights
let ensemble = EnsembleRetriever::new(vec![
(vector as Arc<dyn Retriever>, 0.5),
(bm25 as Arc<dyn Retriever>, 0.5),
]);
let results = ensemble.retrieve("Rust safety", 3).await?;
Adjusting weights
Weights control how much each retriever contributes to the final ranking. Higher weight means more influence.
// Favor semantic search
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever, 0.7),
(bm25_retriever, 0.3),
]);
// Favor keyword search
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever, 0.3),
(bm25_retriever, 0.7),
]);
How Reciprocal Rank Fusion works
For each document returned by a retriever, RRF computes a score:
rrf_score = weight / (k + rank)
Where:
- weight is the retriever's configured weight.
- k is a constant (60, the standard RRF constant) that prevents top-ranked documents from dominating.
- rank is the document's 1-based position in the retriever's results.
If a document appears in results from multiple retrievers, its RRF scores are summed. The final results are sorted by total RRF score in descending order.
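A standalone sketch of this fusion step (illustrative only -- not Synaptic's internal implementation):

```rust
use std::collections::HashMap;

// Weighted Reciprocal Rank Fusion over ranked lists of document IDs.
// Each entry is (weight, ranked_ids); rank is 1-based and k = 60.
fn rrf_fuse(ranked_lists: &[(f64, Vec<&str>)]) -> Vec<(String, f64)> {
    const K: f64 = 60.0;
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (weight, ids) in ranked_lists {
        for (i, id) in ids.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += weight / (K + rank);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

let fused = rrf_fuse(&[
    (0.5, vec!["1", "3", "2"]), // vector retriever ranking
    (0.5, vec!["3", "4"]),      // BM25 ranking
]);
// Document "3" appears in both lists, so its summed score ranks it first.
assert_eq!(fused[0].0, "3");
```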
Combining more than two retrievers
You can combine any number of retrievers:
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever, 0.4),
(bm25_retriever, 0.3),
(multi_query_retriever, 0.3),
]);
let results = ensemble.retrieve("query", 10).await?;
Contextual Compression
This guide shows how to post-filter retrieved documents using the ContextualCompressionRetriever and EmbeddingsFilter.
Overview
A base retriever may return documents that are only loosely related to the query. Contextual compression adds a second filtering step: after retrieval, a DocumentCompressor evaluates each document against the query and removes documents that do not meet a relevance threshold.
This is especially useful when your base retriever fetches broadly (high recall) and you want to tighten the results (high precision).
DocumentCompressor trait
The filtering logic is defined by the DocumentCompressor trait:
#[async_trait]
pub trait DocumentCompressor: Send + Sync {
async fn compress_documents(
&self,
documents: Vec<Document>,
query: &str,
) -> Result<Vec<Document>, SynapticError>;
}
Synaptic provides EmbeddingsFilter as a built-in compressor.
EmbeddingsFilter
Filters documents by computing cosine similarity between the query embedding and each document's content embedding. Only documents that meet or exceed the similarity threshold are kept.
use std::sync::Arc;
use synaptic::retrieval::EmbeddingsFilter;
use synaptic::embeddings::FakeEmbeddings;
let embeddings = Arc::new(FakeEmbeddings::new(128));
// Only keep documents with similarity >= 0.7
let filter = EmbeddingsFilter::new(embeddings, 0.7);
A convenience constructor uses the default threshold of 0.75:
let filter = EmbeddingsFilter::with_default_threshold(embeddings);
ContextualCompressionRetriever
Wraps a base retriever and applies a DocumentCompressor to the results:
use std::sync::Arc;
use synaptic::retrieval::{
ContextualCompressionRetriever,
EmbeddingsFilter,
Retriever,
};
use synaptic::embeddings::FakeEmbeddings;
let embeddings = Arc::new(FakeEmbeddings::new(128));
let base_retriever: Arc<dyn Retriever> = Arc::new(/* any retriever */);
// Create the filter
let filter = Arc::new(EmbeddingsFilter::new(embeddings, 0.7));
// Wrap the base retriever with compression
let retriever = ContextualCompressionRetriever::new(base_retriever, filter);
let results = retriever.retrieve("query", 5).await?;
// Only documents with cosine similarity >= 0.7 to the query are returned
Full example
use std::sync::Arc;
use synaptic::retrieval::{
BM25Retriever,
ContextualCompressionRetriever,
EmbeddingsFilter,
Document,
Retriever,
};
use synaptic::embeddings::FakeEmbeddings;
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "The weather today is sunny and warm"),
Document::new("3", "Rust provides memory safety guarantees"),
Document::new("4", "Cooking pasta requires boiling water"),
];
// BM25 might return loosely relevant results
let base = Arc::new(BM25Retriever::new(docs));
// Use embedding similarity to filter out irrelevant documents
let embeddings = Arc::new(FakeEmbeddings::new(128));
let filter = Arc::new(EmbeddingsFilter::new(embeddings, 0.6));
let retriever = ContextualCompressionRetriever::new(base, filter);
let results = retriever.retrieve("Rust programming", 5).await?;
// Documents about weather and cooking are filtered out
How it works
- The ContextualCompressionRetriever calls base.retrieve(query, top_k) to get candidate documents.
- It passes those candidates to the DocumentCompressor (e.g., EmbeddingsFilter).
- The compressor embeds the query and all candidate documents, computes cosine similarity, and removes documents below the threshold.
- The filtered results are returned.
Custom compressors
You can implement your own DocumentCompressor for other filtering strategies -- for example, using an LLM to judge relevance or extracting only the most relevant passage from each document.
use async_trait::async_trait;
use synaptic::retrieval::{DocumentCompressor, Document};
use synaptic::core::SynapticError;
struct MyCompressor;
#[async_trait]
impl DocumentCompressor for MyCompressor {
async fn compress_documents(
&self,
documents: Vec<Document>,
query: &str,
) -> Result<Vec<Document>, SynapticError> {
// Your filtering logic here
Ok(documents)
}
}
Self-Query Retriever
This guide shows how to use the SelfQueryRetriever to automatically extract structured metadata filters from natural language queries.
Overview
Users often express search intent that includes both a semantic query and metadata constraints in the same sentence. For example:
"Find documents about Rust published after 2024"
This contains:
- A semantic query: "documents about Rust"
- A metadata filter: year > 2024
The SelfQueryRetriever uses a ChatModel to parse the user's natural language query into a structured search query plus metadata filters, then applies those filters to the results from a base retriever.
Defining metadata fields
First, describe the metadata fields available in your document corpus using MetadataFieldInfo:
use synaptic::retrieval::MetadataFieldInfo;
let fields = vec![
MetadataFieldInfo {
name: "year".to_string(),
description: "The year the document was published".to_string(),
field_type: "integer".to_string(),
},
MetadataFieldInfo {
name: "language".to_string(),
description: "The programming language discussed".to_string(),
field_type: "string".to_string(),
},
MetadataFieldInfo {
name: "author".to_string(),
description: "The author of the document".to_string(),
field_type: "string".to_string(),
},
];
Each field has a name, a human-readable description, and a field_type that tells the LLM what kind of values to expect.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{SelfQueryRetriever, MetadataFieldInfo, Retriever};
let base_retriever: Arc<dyn Retriever> = Arc::new(/* any retriever */);
let model: Arc<dyn ChatModel> = Arc::new(/* any ChatModel */);
let retriever = SelfQueryRetriever::new(base_retriever, model, fields);
let results = retriever.retrieve(
"find articles about Rust written by Alice",
5,
).await?;
// LLM extracts: query="Rust", filters: [language eq "Rust", author eq "Alice"]
How it works
- The retriever builds a prompt describing the available metadata fields and sends the user's query to the LLM.
- The LLM responds with a JSON object containing:
"query"-- the extracted semantic search query."filters"-- an array of filter objects, each with"field","op", and"value".
- The retriever runs the extracted query through the base retriever (fetching extra candidates,
top_k * 2). - Filters are applied to the results, keeping only documents whose metadata matches all filter conditions.
- The final filtered results are truncated to
top_kand returned.
Supported filter operators
| Operator | Meaning |
|---|---|
eq | Equal to |
gt | Greater than |
gte | Greater than or equal to |
lt | Less than |
lte | Less than or equal to |
contains | String contains substring |
Numeric comparisons work on both integers and floats. String comparisons use lexicographic ordering.
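For reference, a sketch of the structured output described in "How it works" for the earlier example query ("Find documents about Rust published after 2024"):

```rust
use serde_json::json;

// Sketch of the extracted query plus filter objects ({"field", "op", "value"}).
let parsed = json!({
    "query": "documents about Rust",
    "filters": [
        { "field": "year", "op": "gt", "value": 2024 }
    ]
});
assert_eq!(parsed["filters"][0]["op"], "gt");
```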
Full example
use std::sync::Arc;
use std::collections::HashMap;
use synaptic::retrieval::{
BM25Retriever,
SelfQueryRetriever,
MetadataFieldInfo,
Document,
Retriever,
};
use serde_json::json;
// Documents with metadata
let docs = vec![
Document::with_metadata(
"1",
"An introduction to Rust's ownership model",
HashMap::from([
("year".to_string(), json!(2024)),
("language".to_string(), json!("Rust")),
]),
),
Document::with_metadata(
"2",
"Advanced Python patterns for data pipelines",
HashMap::from([
("year".to_string(), json!(2023)),
("language".to_string(), json!("Python")),
]),
),
Document::with_metadata(
"3",
"Rust async programming with Tokio",
HashMap::from([
("year".to_string(), json!(2025)),
("language".to_string(), json!("Rust")),
]),
),
];
let base = Arc::new(BM25Retriever::new(docs));
let model: Arc<dyn ChatModel> = Arc::new(/* your model */);
let fields = vec![
MetadataFieldInfo {
name: "year".to_string(),
description: "Publication year".to_string(),
field_type: "integer".to_string(),
},
MetadataFieldInfo {
name: "language".to_string(),
description: "Programming language topic".to_string(),
field_type: "string".to_string(),
},
];
let retriever = SelfQueryRetriever::new(base, model, fields);
// Natural language query with implicit filters
let results = retriever.retrieve("Rust articles from 2025", 5).await?;
// LLM extracts: query="Rust articles", filters: [language eq "Rust", year eq 2025]
// Returns only document 3
Considerations
- The quality of filter extraction depends on the LLM. Use a capable model for reliable results.
- Only filters referencing fields declared in MetadataFieldInfo are applied; unknown fields are ignored.
- If the LLM cannot parse the query into structured filters, it falls back to an empty filter list and returns standard retrieval results.
Parent Document Retriever
This guide shows how to use the ParentDocumentRetriever to search on small chunks for precision while returning full parent documents for context.
The problem
When splitting documents for retrieval, you face a trade-off:
- Small chunks are better for search precision -- they match queries more accurately because there is less noise.
- Large documents are better for context -- they give the LLM more information to work with when generating answers.
The ParentDocumentRetriever solves this by maintaining both: it splits parent documents into small child chunks for indexing, but when a child chunk matches a query, it returns the full parent document.
How it works
- You provide parent documents and a splitting function.
- The retriever splits each parent into child chunks, storing a child-to-parent mapping.
- Child chunks are indexed in a child retriever (e.g., backed by a vector store).
- At retrieval time, the child retriever finds matching chunks, then the parent retriever maps those back to their parent documents, deduplicating along the way.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{ParentDocumentRetriever, Document, Retriever};
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
// Create a child retriever (any Retriever implementation)
let child_retriever: Arc<dyn Retriever> = Arc::new(/* vector store retriever */);
// Create the parent document retriever with a splitting function
let splitter = RecursiveCharacterTextSplitter::new(200);
let parent_retriever = ParentDocumentRetriever::new(
child_retriever.clone(),
move |text: &str| splitter.split_text(text),
);
Adding documents
The add_documents() method splits parent documents into children and stores the mappings. It returns the child documents so you can index them in the child retriever.
let parent_docs = vec![
Document::new("doc-1", "A very long document about Rust ownership..."),
Document::new("doc-2", "A detailed guide to async programming in Rust..."),
];
// Split parents into children and get child docs for indexing
let child_docs = parent_retriever.add_documents(parent_docs).await;
// Index child docs in the vector store
// child_docs[0].id == "doc-1-child-0"
// child_docs[0].metadata["parent_id"] == "doc-1"
// child_docs[0].metadata["chunk_index"] == 0
Each child document:
- Has an ID formatted as "{parent_id}-child-{index}".
- Inherits all metadata from the parent.
- Gets additional parent_id and chunk_index metadata fields.
Retrieval
When you call retrieve(), the retriever searches for matching child chunks, then returns the corresponding parent documents:
let results = parent_retriever.retrieve("ownership borrowing", 3).await?;
// Returns full parent documents, not individual chunks
The retriever fetches top_k * 3 child results internally to ensure enough parent documents can be assembled after deduplication.
Full example
use std::sync::Arc;
use synaptic::retrieval::{ParentDocumentRetriever, Document, Retriever};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
// Set up embeddings and vector store for child chunks
let embeddings = Arc::new(FakeEmbeddings::new(128));
let child_store = Arc::new(InMemoryVectorStore::new());
// Create the child retriever
let child_retriever = Arc::new(VectorStoreRetriever::new(
child_store.clone(),
embeddings.clone(),
10,
));
// Create parent retriever with a small chunk size for children
let splitter = RecursiveCharacterTextSplitter::new(200);
let parent_retriever = ParentDocumentRetriever::new(
child_retriever,
move |text: &str| splitter.split_text(text),
);
// Add parent documents
let parents = vec![
Document::new("rust-guide", "A comprehensive guide to Rust. \
Rust is a systems programming language focused on safety, speed, and concurrency. \
It achieves memory safety without garbage collection through its ownership system. \
The borrow checker enforces ownership rules at compile time..."),
Document::new("go-guide", "A comprehensive guide to Go. \
Go is a statically typed language designed at Google. \
It features goroutines for lightweight concurrency. \
Go's garbage collector manages memory automatically..."),
];
let children = parent_retriever.add_documents(parents).await;
// Index children in the vector store
child_store.add_documents(children, embeddings.as_ref()).await?;
// Search for child chunks, get back full parent documents
let results = parent_retriever.retrieve("memory safety ownership", 2).await?;
// Returns the full "rust-guide" parent document, even though only
// a small chunk about ownership matched the query
When to use this
The ParentDocumentRetriever is most useful when:
- Your documents are long and cover multiple topics, but you want precise retrieval.
- You need the LLM to see the full document context for generating high-quality answers.
- Small chunks alone would lose important surrounding context.
For simpler use cases where chunks are self-contained, a standard VectorStoreRetriever may be sufficient.
Tools
Tools give LLMs the ability to take actions in the world -- calling APIs, querying databases, performing calculations, or any other side effect. Synaptic provides a complete tool system built around the Tool trait defined in synaptic-core.
Key Components
| Component | Crate | Description |
|---|---|---|
Tool trait | synaptic-core | The interface every tool must implement: name(), description(), and call() |
ToolRegistry | synaptic-tools | Thread-safe collection of registered tools (Arc<RwLock<HashMap>>) |
SerialToolExecutor | synaptic-tools | Dispatches tool calls by name through the registry |
ToolNode | synaptic-graph | Graph node that executes tool calls from AI messages in a state machine workflow |
ToolDefinition | synaptic-core | Schema description sent to the model so it knows what tools are available |
ToolChoice | synaptic-core | Controls whether and how the model selects tools |
How It Works
- You define tools using the #[tool] macro (or by implementing the Tool trait manually).
- Register them in a ToolRegistry.
- Convert them to ToolDefinition values and attach them to a ChatRequest so the model knows what tools are available.
- When the model responds with ToolCall entries, dispatch them through SerialToolExecutor to get results.
- Send the results back to the model as Message::tool(...) messages to continue the conversation.
Quick Example
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
/// Add two numbers.
#[tool]
async fn add(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a + b}))
}
let registry = ToolRegistry::new();
registry.register(add())?; // add() returns Arc<dyn Tool>
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("add", json!({"a": 3, "b": 4})).await?;
assert_eq!(result, json!({"result": 7.0}));
Sub-Pages
- Custom Tools -- implement the Tool trait for your own tools
- Tool Registry -- register, look up, and execute tools
- Tool Choice -- control how the model selects tools with ToolChoice
- Tool Definition Extras -- attach provider-specific parameters to tool definitions
- Runtime-Aware Tools -- tools that access graph state, store, and runtime context
Custom Tools
Every tool in Synaptic implements the Tool trait from synaptic-core. The recommended way to define tools is with the #[tool] attribute macro, which generates all the boilerplate for you.
Defining a Tool with #[tool]
The #[tool] macro converts an async function into a full Tool implementation. Doc comments on the function become the tool description, and doc comments on parameters become JSON Schema descriptions:
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use serde_json::{json, Value};
/// Get the current weather for a location.
#[tool]
async fn get_weather(
/// The city name
location: String,
) -> Result<Value, SynapticError> {
// In production, call a real weather API here
Ok(json!({
"location": location,
"temperature": 22,
"condition": "sunny"
}))
}
// `get_weather()` returns Arc<dyn Tool>
let tool = get_weather();
assert_eq!(tool.name(), "get_weather");
Key points:
- The function name becomes the tool name (override with #[tool(name = "custom_name")]; see the sketch after this list).
- The doc comment on the function becomes the tool description.
- Each parameter becomes a JSON Schema property; doc comments on parameters become "description" fields in the schema.
- String, i64, f64, bool, Vec<T>, and Option<T> types are mapped to JSON Schema types automatically.
- The factory function (get_weather()) returns Arc<dyn Tool>.
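As a quick illustration of the name override, the following sketch renames the generated tool while keeping the factory function name (the body is a placeholder):
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
/// Look up a stock price.
#[tool(name = "stock_price")]
async fn lookup_price(
    /// Ticker symbol, e.g. "ACME"
    symbol: String,
) -> Result<Value, SynapticError> {
    // Placeholder result for illustration only
    Ok(json!({"symbol": symbol, "price": 123.45}))
}
let tool = lookup_price();
assert_eq!(tool.name(), "stock_price");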
Error Handling
Return SynapticError::Tool(...) for tool-specific errors. The macro handles parameter validation automatically, but you can add your own domain-specific checks:
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use serde_json::{json, Value};
/// Divide two numbers.
#[tool]
async fn divide(
/// The numerator
a: f64,
/// The denominator
b: f64,
) -> Result<Value, SynapticError> {
if b == 0.0 {
return Err(SynapticError::Tool("division by zero".to_string()));
}
Ok(json!({"result": a / b}))
}
Note that the macro auto-generates validation for missing or invalid parameters (returning SynapticError::Tool errors), so you no longer need manual args["a"].as_f64().ok_or_else(...) checks.
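As a sketch of that behavior (assuming the generated validation surfaces as SynapticError::Tool, as described above), calling the tool with a missing argument fails before your function body runs:
use serde_json::json;
use synaptic::core::SynapticError;
// "b" is missing, so the macro-generated validation rejects the call.
let err = divide().call(json!({"a": 1.0})).await.unwrap_err();
assert!(matches!(err, SynapticError::Tool(_)));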
Registering and Using
The #[tool] macro factory returns Arc<dyn Tool>, which you register directly:
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
use serde_json::json;
let registry = ToolRegistry::new();
registry.register(get_weather())?;
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("get_weather", json!({"location": "Tokyo"})).await?;
// result = {"location": "Tokyo", "temperature": 22, "condition": "sunny"}
See the Tool Registry page for more on registration and execution.
Full ReAct Agent Loop
Here is a complete offline example that defines tools with #[tool], then wires them into a ReAct agent with ScriptedChatModel:
use std::sync::Arc;
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::{ChatModel, ChatResponse, Message, Tool, ToolCall, SynapticError};
use synaptic::models::ScriptedChatModel;
use synaptic::graph::{create_react_agent, MessageState};
// 1. Define tools with the macro
/// Add two numbers.
#[tool]
async fn add(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a + b}))
}
// 2. Script the model to call the tool and then respond
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"",
vec![ToolCall {
id: "call_1".into(),
name: "add".into(),
arguments: r#"{"a": 3, "b": 4}"#.into(),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("The sum is 7."),
usage: None,
},
]));
// 3. Build the agent -- add() returns Arc<dyn Tool>
let tools: Vec<Arc<dyn Tool>> = vec![add()];
let agent = create_react_agent(model, tools)?;
// 4. Run it
let state = MessageState::with_messages(vec![
Message::human("What is 3 + 4?"),
]);
let result = agent.invoke(state).await?.into_state();
assert_eq!(result.messages.last().unwrap().content(), "The sum is 7.");
Tool Definitions for Models
To tell a chat model about available tools, create ToolDefinition values and attach them to a ChatRequest:
use serde_json::json;
use synaptic::core::{ChatRequest, Message, ToolDefinition};
let tool_def = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city name"
}
},
"required": ["location"]
}),
};
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(vec![tool_def]);
The parameters field follows the JSON Schema format that LLM providers expect.
Optional and Default Parameters
Use #[default = ...] to give a parameter a default value, and Option<T> for parameters the model may omit:
use synaptic::macros::tool;
use synaptic::core::SynapticError;
#[tool]
async fn search(
/// The search query
query: String,
/// Maximum results (default 10)
#[default = 10]
max_results: i64,
/// Language filter
language: Option<String>,
) -> Result<String, SynapticError> {
let lang = language.unwrap_or_else(|| "en".into());
Ok(format!("Searching '{}' (max {}, lang {})", query, max_results, lang))
}
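A minimal usage sketch, assuming the macro-generated argument handling applies the default and leaves the optional parameter as None when the caller omits them:
use serde_json::json;
// Only "query" is provided; max_results falls back to 10 and language to "en".
let tool = search();
let result = tool.call(json!({"query": "rust retrieval"})).await?;
// result: "Searching 'rust retrieval' (max 10, lang en)" wrapped as a JSON value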
Stateful Tools with #[field]
Tools that need to hold state (database connections, API clients, etc.) can use
#[field] to create struct fields that are hidden from the LLM schema:
use std::sync::Arc;
use serde_json::Value;
use synaptic::macros::tool;
use synaptic::core::SynapticError;
// `DbPool` stands in for your own connection-pool type.
#[tool]
async fn db_query(
#[field] pool: Arc<DbPool>,
/// SQL query to execute
query: String,
) -> Result<Value, SynapticError> {
let result = pool.execute(&query).await?;
Ok(serde_json::to_value(result).unwrap())
}
// Factory requires the field parameter
let tool = db_query(pool.clone());
For the full macro reference including #[inject], #[default], and middleware
macros, see the Procedural Macros page.
Manual Implementation
For advanced cases that the macro cannot handle (custom parameters() overrides, conditional logic in name() or description(), or implementing both Tool and other traits on the same struct), you can implement the Tool trait directly:
use async_trait::async_trait;
use serde_json::{json, Value};
use synaptic::core::{Tool, SynapticError};
struct WeatherTool;
#[async_trait]
impl Tool for WeatherTool {
fn name(&self) -> &'static str {
"get_weather"
}
fn description(&self) -> &'static str {
"Get the current weather for a location"
}
async fn call(&self, args: Value) -> Result<Value, SynapticError> {
let location = args["location"]
.as_str()
.unwrap_or("unknown");
Ok(json!({
"location": location,
"temperature": 22,
"condition": "sunny"
}))
}
}
The trait requires three methods:
- name() -- a &'static str identifier the model uses when making tool calls.
- description() -- tells the model what the tool does.
- call() -- receives arguments as a serde_json::Value and returns a Value result.
Wrap manual implementations in Arc::new(WeatherTool) when registering them.
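For example, a minimal registration sketch:
use std::sync::Arc;
use synaptic::tools::ToolRegistry;
let registry = ToolRegistry::new();
registry.register(Arc::new(WeatherTool))?;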
Tool Registry
ToolRegistry is a thread-safe collection of tools, and SerialToolExecutor dispatches tool calls through the registry by name. Both are provided by the synaptic-tools crate.
ToolRegistry
ToolRegistry stores tools in an Arc<RwLock<HashMap<String, Arc<dyn Tool>>>>. It is Clone and can be shared across threads.
Creating and Registering Tools
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use synaptic::tools::ToolRegistry;
/// Echo back the input.
#[tool]
async fn echo(
#[args] args: Value,
) -> Result<Value, SynapticError> {
Ok(json!({"echo": args}))
}
let registry = ToolRegistry::new();
registry.register(echo())?; // echo() returns Arc<dyn Tool>
If you register two tools with the same name, the second registration replaces the first.
Looking Up Tools
Use get() to retrieve a tool by name:
let tool = registry.get("echo");
assert!(tool.is_some());
let missing = registry.get("nonexistent");
assert!(missing.is_none());
get() returns Option<Arc<dyn Tool>>, so the tool can be called directly if needed.
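Because get() hands back the Arc<dyn Tool>, you can bypass the executor and call it yourself -- a small sketch reusing the echo tool from above:
use serde_json::json;
if let Some(tool) = registry.get("echo") {
    // Call the tool directly with raw JSON arguments.
    let result = tool.call(json!({"message": "direct"})).await?;
    assert_eq!(result, json!({"echo": {"message": "direct"}}));
}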
SerialToolExecutor
SerialToolExecutor wraps a ToolRegistry and provides a convenience method that looks up a tool by name and calls it in one step.
Creating and Using
use synaptic::tools::SerialToolExecutor;
use serde_json::json;
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("echo", json!({"message": "hello"})).await?;
assert_eq!(result, json!({"echo": {"message": "hello"}}));
The execute() method:
- Looks up the tool by name in the registry.
- Calls tool.call(args) with the provided arguments.
- Returns the result, or SynapticError::ToolNotFound if the tool does not exist.
Handling Unknown Tools
If you call execute() with a name that is not registered, it returns SynapticError::ToolNotFound:
let err = executor.execute("nonexistent", json!({})).await.unwrap_err();
assert!(matches!(err, synaptic::core::SynapticError::ToolNotFound(name) if name == "nonexistent"));
Complete Example
Here is a full example that registers multiple tools and executes them:
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
/// Add two numbers.
#[tool]
async fn add(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a + b}))
}
/// Multiply two numbers.
#[tool]
async fn multiply(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a * b}))
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let registry = ToolRegistry::new();
registry.register(add())?;
registry.register(multiply())?;
let executor = SerialToolExecutor::new(registry);
let sum = executor.execute("add", json!({"a": 3, "b": 4})).await?;
assert_eq!(sum, json!({"result": 7.0}));
let product = executor.execute("multiply", json!({"a": 3, "b": 4})).await?;
assert_eq!(product, json!({"result": 12.0}));
Ok(())
}
Integration with Chat Models
In a typical agent workflow, the model's response contains ToolCall entries. You dispatch them through the executor and send the results back:
use synaptic::core::{Message, ToolCall};
use serde_json::json;
// After model responds with tool calls:
let tool_calls = vec![
ToolCall {
id: "call-1".to_string(),
name: "add".to_string(),
arguments: json!({"a": 3, "b": 4}),
},
];
// Execute each tool call
for tc in &tool_calls {
let result = executor.execute(&tc.name, tc.arguments.clone()).await?;
// Create a tool message with the result
let tool_message = Message::tool(
result.to_string(),
&tc.id,
);
// Append tool_message to the conversation and send back to the model
}
See the ReAct Agent tutorial for a complete agent loop example.
Tool Choice
ToolChoice controls whether and how a chat model selects tools when responding. It is defined in synaptic-core and attached to a ChatRequest via the with_tool_choice() builder method.
ToolChoice Variants
| Variant | Behavior |
|---|---|
ToolChoice::Auto | The model decides whether to call a tool or respond with text (default when tools are provided) |
ToolChoice::Required | The model must call at least one tool -- it cannot respond with plain text |
ToolChoice::None | The model must not call any tools, even if tools are provided in the request |
ToolChoice::Specific(name) | The model must call the specific named tool |
Basic Usage
Attach ToolChoice to a ChatRequest alongside tool definitions:
use serde_json::json;
use synaptic::core::{ChatRequest, Message, ToolChoice, ToolDefinition};
let weather_tool = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}),
};
// Force the model to use tools
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(vec![weather_tool])
.with_tool_choice(ToolChoice::Required);
When to Use Each Variant
Auto (Default)
Let the model decide. This is the best choice for general-purpose agents that should respond with text when no tool is needed:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::human("Hello, how are you?"),
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::Auto);
Required
Force tool usage. Useful in agent loops where the next step must be a tool call, or when you know the user's request requires tool invocation:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::human("Look up the weather in Paris and Tokyo."),
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::Required);
// The model MUST respond with one or more tool calls
None
Suppress tool calls. Useful when you want to temporarily disable tools without removing them from the request, or during a final summarization step:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::system("Summarize the tool results for the user."),
Message::human("What is the weather?"),
// ... tool result messages ...
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::None);
// The model MUST respond with text, not tool calls
Specific
Force a particular tool. Useful when you know exactly which tool should be called:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::human("Check the weather in London."),
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::Specific("get_weather".to_string()));
// The model MUST call the "get_weather" tool specifically
Complete Example
Here is a full example that creates tools, forces a specific tool call, and processes the result:
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::{
ChatModel, ChatRequest, Message, SynapticError, Tool,
ToolChoice,
};
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
/// Perform arithmetic calculations.
#[tool]
async fn calculator(
/// The arithmetic expression to evaluate
expression: String,
) -> Result<Value, SynapticError> {
// Simplified: in production, parse and evaluate the expression
Ok(json!({"result": expression}))
}
// Register tools
let registry = ToolRegistry::new();
let calc_tool = calculator(); // Arc<dyn Tool>
registry.register(calc_tool.clone())?;
// Build the tool definition from the tool itself
let calc_def = calc_tool.as_tool_definition();
// Build a request that forces the calculator tool
let request = ChatRequest::new(vec![
Message::human("What is 42 * 17?"),
])
.with_tools(vec![calc_def])
.with_tool_choice(ToolChoice::Specific("calculator".to_string()));
// Send to the model, then execute the returned tool calls
let response = model.chat(request).await?;
for tc in response.message.tool_calls() {
let executor = SerialToolExecutor::new(registry.clone());
let result = executor.execute(&tc.name, tc.arguments.clone()).await?;
println!("Tool {} returned: {}", tc.name, result);
}
Provider Support
All Synaptic provider adapters (OpenAiChatModel, AnthropicChatModel, GeminiChatModel, OllamaChatModel) support ToolChoice. The adapter translates the Synaptic ToolChoice enum into the provider-specific format automatically.
See also: Bind Tools for attaching tools to a model permanently, and the ReAct Agent tutorial for a complete agent loop.
Tool Definition Extras
The extras field on ToolDefinition carries provider-specific parameters that fall outside the standard name/description/parameters schema, such as Anthropic's cache_control or any custom metadata your provider adapter needs.
The extras Field
pub struct ToolDefinition {
pub name: String,
pub description: String,
pub parameters: Value,
/// Provider-specific parameters (e.g., Anthropic's `cache_control`).
pub extras: Option<HashMap<String, Value>>,
}
When extras is None (the default), no additional fields are serialized. Provider adapters inspect extras during request building and map recognized keys into the provider's wire format.
Setting Extras on a Tool Definition
Build a ToolDefinition with extras by populating the field directly:
use std::collections::HashMap;
use serde_json::{json, Value};
use synaptic::core::ToolDefinition;
let mut extras = HashMap::new();
extras.insert("cache_control".to_string(), json!({"type": "ephemeral"}));
let tool_def = ToolDefinition {
name: "search".to_string(),
description: "Search the web".to_string(),
parameters: json!({
"type": "object",
"properties": {
"query": { "type": "string" }
},
"required": ["query"]
}),
extras: Some(extras),
};
Common Use Cases
Anthropic prompt caching -- Anthropic supports a cache_control field on tool definitions to enable prompt caching for tool schemas that rarely change:
let mut extras = HashMap::new();
extras.insert("cache_control".to_string(), json!({"type": "ephemeral"}));
let def = ToolDefinition {
name: "lookup".to_string(),
description: "Look up a record".to_string(),
parameters: json!({"type": "object", "properties": {}}),
extras: Some(extras),
};
Custom metadata -- You can attach arbitrary key-value pairs for your own adapter logic:
let mut extras = HashMap::new();
extras.insert("priority".to_string(), json!("high"));
extras.insert("timeout_ms".to_string(), json!(5000));
let def = ToolDefinition {
name: "deploy".to_string(),
description: "Deploy the service".to_string(),
parameters: json!({"type": "object", "properties": {}}),
extras: Some(extras),
};
Extras with #[tool] Macro Tools
The #[tool] macro does not support extras directly -- extras are a property of the ToolDefinition, not the tool function itself. Define your tool with the macro, then add extras to the generated definition:
use std::collections::HashMap;
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
/// Does something useful.
#[tool]
async fn my_tool(
/// The input query
query: String,
) -> Result<Value, SynapticError> {
Ok(json!("done"))
}
// Get the tool definition and add extras
let tool = my_tool();
let mut def = tool.as_tool_definition();
def.extras = Some(HashMap::from([
("cache_control".to_string(), json!({"type": "ephemeral"})),
]));
// Use `def` when building the ChatRequest
This approach works with any tool -- whether defined via #[tool] or by implementing the Tool trait manually.
Runtime-Aware Tools
RuntimeAwareTool extends the basic Tool trait with runtime context -- current graph state, a store reference, stream writer, tool call ID, and runnable config. Implement this trait for tools that need to read or modify graph state during execution.
The ToolRuntime Struct
When a runtime-aware tool is invoked, it receives a ToolRuntime with the following fields:
pub struct ToolRuntime {
pub store: Option<Arc<dyn Store>>,
pub stream_writer: Option<StreamWriter>,
pub state: Option<Value>,
pub tool_call_id: String,
pub config: Option<RunnableConfig>,
}
| Field | Description |
|---|---|
store | Shared key-value store for cross-tool persistence |
stream_writer | Writer for pushing streaming output from within a tool |
state | Serialized snapshot of the current graph state |
tool_call_id | The ID of the tool call being executed |
config | Runnable config with tags, metadata, and run ID |
Implementing with #[tool] and #[inject]
The recommended way to define a runtime-aware tool is with the #[tool] macro. Use #[inject(store)], #[inject(state)], or #[inject(tool_call_id)] on parameters to receive runtime context. These injected parameters are hidden from the LLM schema. Using any #[inject] attribute automatically switches the generated impl to RuntimeAwareTool:
use std::sync::Arc;
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::{Store, SynapticError};
/// Save a note to the store.
#[tool]
async fn save_note(
/// The note key
key: String,
/// The note text
text: String,
#[inject(store)] store: Arc<dyn Store>,
) -> Result<Value, SynapticError> {
store.put(
&["notes"],
&key,
json!({"text": text}),
).await?;
Ok(json!({"saved": key}))
}
// save_note() returns Arc<dyn RuntimeAwareTool>
let tool = save_note();
The #[inject(store)] parameter receives the Arc<dyn Store> from the ToolRuntime at execution time. Only key and text appear in the JSON Schema sent to the model.
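For illustration, the parameters schema generated for save_note would look roughly like this (a sketch -- the exact JSON the macro emits may differ in detail):
use serde_json::json;
// Only the model-facing parameters appear; the injected store does not.
let expected_schema = json!({
    "type": "object",
    "properties": {
        "key":  { "type": "string", "description": "The note key" },
        "text": { "type": "string", "description": "The note text" }
    },
    "required": ["key", "text"]
});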
Using with ToolNode in a Graph
ToolNode automatically injects runtime context into registered RuntimeAwareTool instances. Register them with with_runtime_tool() and optionally attach a store with with_store():
use synaptic::graph::ToolNode;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
let registry = ToolRegistry::new();
let executor = SerialToolExecutor::new(registry);
let tool_node = ToolNode::new(executor)
.with_store(store.clone())
.with_runtime_tool(save_note()); // save_note() returns Arc<dyn RuntimeAwareTool>
When the graph executes this tool node and encounters a tool call matching "save_note", it builds a ToolRuntime populated with the current graph state, the store, and the tool call ID, then calls call_with_runtime().
RuntimeAwareToolAdapter -- Using Outside a Graph
If you need to use a RuntimeAwareTool in a context that expects the standard Tool trait (for example, with SerialToolExecutor directly), wrap it in a RuntimeAwareToolAdapter:
use std::sync::Arc;
use synaptic::core::{RuntimeAwareTool, RuntimeAwareToolAdapter, ToolRuntime};
use serde_json::json;
let tool = save_note(); // Arc<dyn RuntimeAwareTool>
let adapter = RuntimeAwareToolAdapter::new(tool);
// Optionally inject a runtime before calling
adapter.set_runtime(ToolRuntime {
store: Some(store.clone()),
stream_writer: None,
state: None,
tool_call_id: "call-1".to_string(),
config: None,
}).await;
// Now use it as a regular Tool
let result = adapter.call(json!({"key": "k", "text": "hello"})).await?;
If set_runtime() is not called before call(), the adapter uses a default empty ToolRuntime with all optional fields set to None and an empty tool_call_id.
create_react_agent with a Store
When building a ReAct agent via create_react_agent, pass a store through AgentOptions to have it automatically wired into the ToolNode for all registered runtime-aware tools:
use synaptic::graph::{create_react_agent, AgentOptions};
let graph = create_react_agent(
model,
tools,
AgentOptions {
store: Some(store),
..Default::default()
},
);
Memory
Synaptic provides session-keyed conversation memory through the MemoryStore trait and a family of memory strategies that control how conversation history is stored, trimmed, and summarized.
The MemoryStore Trait
All memory strategies implement the MemoryStore trait, which defines three async operations:
#[async_trait]
pub trait MemoryStore: Send + Sync {
async fn append(&self, session_id: &str, message: Message) -> Result<(), SynapticError>;
async fn load(&self, session_id: &str) -> Result<Vec<Message>, SynapticError>;
async fn clear(&self, session_id: &str) -> Result<(), SynapticError>;
}
- append -- adds a message to the session's history.
- load -- retrieves the conversation history for a session.
- clear -- removes all messages for a session.
Every operation is keyed by a session_id string, which isolates conversations from one another. You choose the session key (a user ID, a thread ID, a UUID -- whatever makes sense for your application).
InMemoryStore
The simplest MemoryStore implementation is InMemoryStore, which stores messages in a HashMap protected by an Arc<RwLock<_>>:
use synaptic::memory::InMemoryStore;
use synaptic::core::{MemoryStore, Message};
let store = InMemoryStore::new();
store.append("session-1", Message::human("Hello")).await?;
store.append("session-1", Message::ai("Hi there!")).await?;
let history = store.load("session-1").await?;
assert_eq!(history.len(), 2);
// Different sessions are completely isolated
let other = store.load("session-2").await?;
assert!(other.is_empty());
InMemoryStore is often used as the backing store for the higher-level memory strategies described below.
Memory Strategies
Each memory strategy wraps an underlying MemoryStore and applies a different policy when loading messages. All strategies implement MemoryStore themselves, so they are interchangeable wherever a MemoryStore is expected.
| Strategy | Behavior | When to Use |
|---|---|---|
| Buffer Memory | Keeps the entire conversation history | Short conversations where full context matters |
| Window Memory | Keeps only the last K messages | Chat UIs where older context is less relevant |
| Summary Memory | Summarizes older messages with an LLM | Very long conversations requiring compact history |
| Token Buffer Memory | Keeps recent messages within a token budget | Cost control and prompt size limits |
| Summary Buffer Memory | Hybrid -- summarizes old messages, keeps recent ones verbatim | Best balance of context and efficiency |
Auto-Managing History
For the common pattern of loading history before a chain call and saving the result afterward, Synaptic provides RunnableWithMessageHistory. It wraps any Runnable<Vec<Message>, String> and handles the load/save lifecycle automatically, keyed by a session ID in the RunnableConfig metadata.
Choosing a Strategy
- If your conversations are short (under 20 messages), Buffer Memory is the simplest choice.
- If you want predictable memory usage without an LLM call, use Window Memory or Token Buffer Memory.
- If conversations are long and you need the full context preserved in compressed form, use Summary Memory.
- If you want the best of both worlds -- exact recent messages plus a compressed summary of older history -- use Summary Buffer Memory.
Buffer Memory
ConversationBufferMemory is the simplest memory strategy. It keeps the entire conversation history, returning every message on load() with no trimming or summarization.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationBufferMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
// Create a backing store and wrap it with buffer memory
let store = Arc::new(InMemoryStore::new());
let memory = ConversationBufferMemory::new(store);
let session = "user-1";
memory.append(session, Message::human("Hello")).await?;
memory.append(session, Message::ai("Hi there!")).await?;
memory.append(session, Message::human("What is Rust?")).await?;
memory.append(session, Message::ai("Rust is a systems programming language.")).await?;
let history = memory.load(session).await?;
// Returns ALL 4 messages -- the full conversation
assert_eq!(history.len(), 4);
How It Works
ConversationBufferMemory is a thin passthrough wrapper. It delegates append(), load(), and clear() directly to the underlying MemoryStore without modification. The "strategy" here is simply: keep everything.
This makes the buffer strategy explicit and composable. By wrapping your store in ConversationBufferMemory, you signal that this particular use site intentionally stores full history, and you can later swap in a different strategy (e.g., ConversationWindowMemory) without changing the rest of your code.
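A small sketch of that swap, assuming the consuming code depends only on the MemoryStore trait:
use std::sync::Arc;
use synaptic::memory::{ConversationBufferMemory, ConversationWindowMemory, InMemoryStore};
use synaptic::core::MemoryStore;
// Callers that accept Arc<dyn MemoryStore> work with either strategy.
fn build_memory(keep_everything: bool) -> Arc<dyn MemoryStore> {
    let store = Arc::new(InMemoryStore::new());
    if keep_everything {
        Arc::new(ConversationBufferMemory::new(store))
    } else {
        Arc::new(ConversationWindowMemory::new(store, 10))
    }
}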
When to Use
Buffer memory is a good fit when:
- Conversations are short (under ~20 exchanges) and the full history fits comfortably within the model's context window.
- You need perfect recall of every message (e.g., for auditing or evaluation).
- You are prototyping and do not yet need a more sophisticated strategy.
Trade-offs
- Grows unbounded -- every message is stored and returned. For long conversations, this will eventually exceed the model's context window or cause high token costs.
- No compression -- there is no summarization or trimming, so you pay for every token in the history on every LLM call.
If unbounded growth is a concern, consider Window Memory for a fixed-size window, Token Buffer Memory for a token budget, or Summary Memory for LLM-based compression.
Window Memory
ConversationWindowMemory keeps only the most recent K messages. All messages are stored in the underlying store, but load() returns a sliding window of the last window_size messages.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationWindowMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
let store = Arc::new(InMemoryStore::new());
// Keep only the last 4 messages visible
let memory = ConversationWindowMemory::new(store, 4);
let session = "user-1";
memory.append(session, Message::human("Message 1")).await?;
memory.append(session, Message::ai("Reply 1")).await?;
memory.append(session, Message::human("Message 2")).await?;
memory.append(session, Message::ai("Reply 2")).await?;
memory.append(session, Message::human("Message 3")).await?;
memory.append(session, Message::ai("Reply 3")).await?;
let history = memory.load(session).await?;
// Only the last 4 messages are returned
assert_eq!(history.len(), 4);
assert_eq!(history[0].content(), "Message 2");
assert_eq!(history[3].content(), "Reply 3");
How It Works
- append() stores every message in the underlying MemoryStore -- nothing is discarded on write.
- load() retrieves all messages from the store, then returns only the last window_size entries. If the total number of messages is less than or equal to window_size, all messages are returned.
- clear() removes all messages from the underlying store for the given session.
The window is applied at load time, not at write time. This means the full history remains in the backing store and could be accessed directly if needed.
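A minimal sketch of that distinction, assuming you keep a clone of the backing store handle:
use std::sync::Arc;
use synaptic::memory::{ConversationWindowMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
let store = Arc::new(InMemoryStore::new());
// Pass a clone so the raw history stays directly accessible.
let memory = ConversationWindowMemory::new(store.clone(), 4);
for text in ["m1", "m2", "m3", "m4", "m5", "m6"] {
    memory.append("s", Message::human(text)).await?;
}
assert_eq!(memory.load("s").await?.len(), 4); // windowed view
assert_eq!(store.load("s").await?.len(), 6);  // full history still stored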
Choosing window_size
The window_size parameter is measured in individual messages, not pairs. A typical human/AI exchange produces 2 messages, so a window_size of 10 keeps roughly 5 turns of conversation.
Consider your model's context window when choosing a size. A window of 20 messages is usually safe for most models, while a window of 4-6 messages works well for lightweight chat UIs where only the most recent context matters.
When to Use
Window memory is a good fit when:
- You want fixed, predictable memory usage with no LLM calls for summarization.
- Older context is genuinely less relevant (e.g., a casual chatbot or customer support flow).
- You need a simple strategy that is easy to reason about.
Trade-offs
- Hard cutoff -- messages outside the window are invisible to the model. There is no summary or compressed representation of older history.
- No token awareness -- the window is measured in message count, not token count. A few long messages could still exceed the model's context window. If you need token-level control, see Token Buffer Memory.
For a strategy that preserves older context through summarization, see Summary Memory or Summary Buffer Memory.
Summary Memory
ConversationSummaryMemory uses an LLM to compress older messages into a running summary. Recent messages are kept verbatim, while everything beyond a buffer_size threshold is summarized into a single system message.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationSummaryMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message, ChatModel};
// You need a ChatModel to generate summaries
let model: Arc<dyn ChatModel> = Arc::new(my_model);
let store = Arc::new(InMemoryStore::new());
// Keep the last 4 messages verbatim; summarize older ones
let memory = ConversationSummaryMemory::new(store, model, 4);
let session = "user-1";
// As messages accumulate beyond buffer_size * 2, summarization triggers
memory.append(session, Message::human("Tell me about Rust.")).await?;
memory.append(session, Message::ai("Rust is a systems programming language...")).await?;
memory.append(session, Message::human("What about ownership?")).await?;
memory.append(session, Message::ai("Ownership is Rust's core memory model...")).await?;
// ... more messages ...
let history = memory.load(session).await?;
// If summarization has occurred, history starts with a system message
// containing the summary, followed by the most recent messages.
How It Works
- append() stores the message in the underlying store, then checks the total message count.
- When the count exceeds buffer_size * 2, the strategy splits messages into "older" and "recent" (the last buffer_size messages).
- The older messages are sent to the ChatModel with a prompt asking for a concise summary. If a previous summary already exists, it is included as context for the new summary.
- The store is cleared and repopulated with only the recent messages.
- load() returns the stored messages, prepended with a system message containing the summary text (if one exists): Summary of earlier conversation: <summary text>
- clear() removes both the stored messages and the summary for the session.
Parameters
| Parameter | Type | Description |
|---|---|---|
store | Arc<dyn MemoryStore> | The backing store for raw messages |
model | Arc<dyn ChatModel> | The LLM used to generate summaries |
buffer_size | usize | Number of recent messages to keep verbatim |
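You can exercise summarization offline with ScriptedChatModel -- a sketch, assuming a small buffer_size so the threshold is crossed quickly:
use std::sync::Arc;
use synaptic::core::{ChatResponse, MemoryStore, Message};
use synaptic::models::ScriptedChatModel;
use synaptic::memory::{ConversationSummaryMemory, InMemoryStore};
// The scripted model returns a canned summary when summarization triggers.
let summarizer = Arc::new(ScriptedChatModel::new(vec![
    ChatResponse {
        message: Message::ai("The user asked about Rust and ownership."),
        usage: None,
    },
]));
let store = Arc::new(InMemoryStore::new());
// buffer_size = 1: summarization triggers once more than 2 messages are stored.
let memory = ConversationSummaryMemory::new(store, summarizer, 1);
memory.append("test", Message::human("What is Rust?")).await?;
memory.append("test", Message::ai("Rust is a systems programming language.")).await?;
memory.append("test", Message::human("How does ownership work?")).await?;
let history = memory.load("test").await?;
// history[0] is a System message containing the scripted summary.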
When to Use
Summary memory is a good fit when:
- Conversations are very long and you need to preserve context from the entire history.
- You can afford the additional LLM call for summarization (it only triggers when the buffer overflows, not on every append).
- You want roughly constant token usage regardless of how long the conversation runs.
Trade-offs
- Lossy compression -- the summary is generated by an LLM, so specific details from older messages may be lost or distorted.
- Additional LLM cost -- each summarization step makes a separate ChatModel call. The model used for summarization can be a smaller, cheaper model than your primary model.
- Latency -- the append() call that triggers summarization will be slower than usual due to the LLM round-trip.
If you want exact recent messages with no LLM calls, use Window Memory or Token Buffer Memory. For a hybrid approach that balances exact recall of recent messages with summarized older history, see Summary Buffer Memory.
Token Buffer Memory
ConversationTokenBufferMemory keeps the most recent messages that fit within a token budget. On load(), the oldest messages are dropped until the total estimated token count is at or below max_tokens.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationTokenBufferMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
let store = Arc::new(InMemoryStore::new());
// Keep messages within a 200-token budget
let memory = ConversationTokenBufferMemory::new(store, 200);
let session = "user-1";
memory.append(session, Message::human("Hello!")).await?;
memory.append(session, Message::ai("Hi! How can I help?")).await?;
memory.append(session, Message::human("Tell me a long story about Rust.")).await?;
memory.append(session, Message::ai("Rust began as a personal project...")).await?;
let history = memory.load(session).await?;
// Only messages that fit within 200 estimated tokens are returned.
// Oldest messages are dropped first.
How It Works
- append() stores every message in the underlying MemoryStore without modification.
- load() retrieves all messages, estimates their total token count, and removes the oldest messages one by one until the total fits within max_tokens.
- clear() removes all messages from the underlying store for the session.
Token Estimation
Synaptic uses a simple heuristic of approximately 4 characters per token, with a minimum of 1 token per message:
fn estimate_tokens(text: &str) -> usize {
text.len() / 4 + 1
}
This is a rough approximation. Actual token counts vary by model and tokenizer. The heuristic is intentionally conservative (slightly overestimates) to avoid exceeding real token limits.
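For example, continuing the snippet above:
// 7 characters => 7 / 4 + 1 = 2 estimated tokens
assert_eq!(estimate_tokens("Hello!!"), 2);
// 120 characters => 120 / 4 + 1 = 31 estimated tokens
assert_eq!(estimate_tokens(&"x".repeat(120)), 31);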
Parameters
| Parameter | Type | Description |
|---|---|---|
store | Arc<dyn MemoryStore> | The backing store for raw messages |
max_tokens | usize | Maximum estimated tokens to return from load() |
When to Use
Token buffer memory is a good fit when:
- You need to control prompt size in token terms rather than message count.
- You want to stay within a model's context window without manually counting messages.
- You prefer a simple, no-LLM-call strategy for managing memory size.
Trade-offs
- Approximate -- the token estimate is a heuristic, not an exact count. For precise token budgeting, you would need a model-specific tokenizer.
- Hard cutoff -- dropped messages are lost entirely. There is no summary or compressed representation of older history.
- Drops whole messages -- if a single message is very long, it may consume most of the budget by itself.
For a fixed message count instead of a token budget, see Window Memory. For a strategy that preserves older context through summarization, see Summary Memory or Summary Buffer Memory.
Summary Buffer Memory
ConversationSummaryBufferMemory is a hybrid strategy that combines the strengths of Summary Memory and Token Buffer Memory. Recent messages are kept verbatim, while older messages are compressed into a running LLM-generated summary when the total estimated token count exceeds a configurable threshold.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationSummaryBufferMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message, ChatModel};
let model: Arc<dyn ChatModel> = Arc::new(my_model);
let store = Arc::new(InMemoryStore::new());
// Summarize older messages when total tokens exceed 500
let memory = ConversationSummaryBufferMemory::new(store, model, 500);
let session = "user-1";
memory.append(session, Message::human("What is Rust?")).await?;
memory.append(session, Message::ai("Rust is a systems programming language...")).await?;
memory.append(session, Message::human("How does ownership work?")).await?;
memory.append(session, Message::ai("Ownership is a set of rules...")).await?;
// ... as conversation grows and exceeds 500 estimated tokens,
// older messages are summarized automatically ...
let history = memory.load(session).await?;
// history = [System("Summary of earlier conversation: ..."), recent messages...]
How It Works
- append() stores the new message, then estimates the total token count across all stored messages.
- When the total exceeds max_token_limit and there is more than one message:
  - A split point is calculated: recent messages that fit within half the token limit are kept verbatim.
  - All messages before the split point are summarized by the ChatModel. If a previous summary exists, it is included as context.
  - The store is cleared and repopulated with only the recent messages.
- load() returns the stored messages, prepended with a system message containing the summary (if one exists): Summary of earlier conversation: <summary text>
- clear() removes both stored messages and the summary for the session.
Parameters
| Parameter | Type | Description |
|---|---|---|
store | Arc<dyn MemoryStore> | The backing store for raw messages |
model | Arc<dyn ChatModel> | The LLM used to generate summaries |
max_token_limit | usize | Token threshold that triggers summarization |
Token Estimation
Like ConversationTokenBufferMemory, this strategy estimates tokens at approximately 4 characters per token (with a minimum of 1). The same heuristic caveat applies: actual token counts will vary by model.
When to Use
Summary buffer memory is the recommended strategy when:
- Conversations are long and you need both exact recent context and compressed older context.
- You want to stay within a token budget while preserving as much information as possible.
- The additional cost of occasional LLM summarization calls is acceptable.
This is the closest equivalent to LangChain's ConversationSummaryBufferMemory and is generally the best default choice for production chatbots.
Trade-offs
- LLM cost on overflow -- summarization only triggers when the token limit is exceeded, but each summarization call adds latency and cost.
- Lossy for old messages -- details from older messages may be lost in the summary, though recent messages are always exact.
- Heuristic token counting -- the split point is based on estimated tokens, not exact counts.
Offline Testing with ScriptedChatModel
Use ScriptedChatModel to test summarization without API keys:
use std::sync::Arc;
use synaptic::core::{ChatResponse, MemoryStore, Message};
use synaptic::models::ScriptedChatModel;
use synaptic::memory::{ConversationSummaryBufferMemory, InMemoryStore};
// Script the model to return a summary when called
let summarizer = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The user asked about Rust and ownership."),
usage: None,
},
]));
let store = Arc::new(InMemoryStore::new());
let memory = ConversationSummaryBufferMemory::new(store, summarizer, 50);
let session = "test";
// Add enough messages to exceed the 50-token threshold
memory.append(session, Message::human("What is Rust?")).await?;
memory.append(session, Message::ai("Rust is a systems programming language focused on safety, speed, and concurrency.")).await?;
memory.append(session, Message::human("How does ownership work?")).await?;
memory.append(session, Message::ai("Ownership is a set of rules the compiler checks at compile time. Each value has a single owner.")).await?;
// Load -- older messages are now summarized
let history = memory.load(session).await?;
// history[0] is a System message with the summary
// Remaining messages are the most recent ones kept verbatim
For simpler alternatives, see Buffer Memory (keep everything), Window Memory (fixed message count), or Token Buffer Memory (token budget without summarization).
RunnableWithMessageHistory
RunnableWithMessageHistory wraps any Runnable<Vec<Message>, String> to automatically load conversation history before invocation and save the result afterward. This eliminates the boilerplate of manually calling memory.load() and memory.append() around every chain invocation.
Usage
use std::sync::Arc;
use synaptic::memory::{RunnableWithMessageHistory, InMemoryStore};
use synaptic::core::{MemoryStore, Message, RunnableConfig};
use synaptic::runnables::Runnable;
let store = Arc::new(InMemoryStore::new());
// `chain` is any Runnable<Vec<Message>, String>, e.g. a ChatModel pipeline
let with_history = RunnableWithMessageHistory::new(
chain.boxed(),
store,
);
// The session_id is passed via config metadata
let mut config = RunnableConfig::default();
config.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("user-42".to_string()),
);
// First invocation
let response = with_history.invoke("Hello!".to_string(), &config).await?;
// Internally:
// 1. Loads existing messages for session "user-42" (empty on first call)
// 2. Appends Message::human("Hello!") to the store and to the message list
// 3. Passes the full Vec<Message> to the inner runnable
// 4. Saves Message::ai(response) to the store
// Second invocation -- history is automatically carried forward
let response = with_history.invoke("Tell me more.".to_string(), &config).await?;
// The inner runnable now receives all 4 messages:
// [Human("Hello!"), AI(first_response), Human("Tell me more."), ...]
How It Works
RunnableWithMessageHistory implements Runnable<String, String>. On each invoke() call:
- Extract session ID -- reads session_id from config.metadata. If not present, defaults to "default".
- Load history -- calls memory.load(session_id) to retrieve existing messages.
- Append human message -- creates Message::human(input), appends it to both the in-memory list and the store.
- Invoke inner runnable -- passes the full Vec<Message> (history + new message) to the wrapped runnable.
- Save AI response -- creates Message::ai(output) and appends it to the store.
- Return -- returns the output string.
Session Isolation
Different session IDs produce completely isolated conversation histories:
let mut config_a = RunnableConfig::default();
config_a.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("alice".to_string()),
);
let mut config_b = RunnableConfig::default();
config_b.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("bob".to_string()),
);
// Alice and Bob have independent conversation histories
with_history.invoke("Hi, I'm Alice.".to_string(), &config_a).await?;
with_history.invoke("Hi, I'm Bob.".to_string(), &config_b).await?;
Combining with Memory Strategies
Because RunnableWithMessageHistory takes any Arc<dyn MemoryStore>, you can pass in a memory strategy to control how history is managed:
use synaptic::memory::{ConversationWindowMemory, InMemoryStore, RunnableWithMessageHistory};
use std::sync::Arc;
let store = Arc::new(InMemoryStore::new());
let windowed = Arc::new(ConversationWindowMemory::new(store, 10));
let with_history = RunnableWithMessageHistory::new(
chain.boxed(),
windowed, // Only the last 10 messages will be loaded
);
This lets you combine automatic history management with any trimming or summarization strategy.
When to Use
Use RunnableWithMessageHistory when:
- You have a Runnable chain that takes messages and returns a string (the common pattern for chat pipelines).
- You want to avoid manually loading and saving messages around every invocation.
- You need session-based conversation management with minimal boilerplate.
Clearing History
Use MemoryStore::clear() on the underlying store to reset a session's history:
let store = Arc::new(InMemoryStore::new());
let with_history = RunnableWithMessageHistory::new(chain.boxed(), store.clone());
// After some conversation...
store.clear("user-42").await?;
// Next invocation starts fresh -- no previous messages are loaded
For lower-level control over when messages are loaded and saved, use the MemoryStore trait directly.
Graph
Synaptic provides LangGraph-style graph orchestration through the synaptic_graph crate. A StateGraph is a state machine where nodes process state and edges control the flow between nodes. This architecture supports fixed routing, conditional branching, checkpointing for persistence, human-in-the-loop interrupts, and streaming execution.
Core Concepts
| Concept | Description |
|---|---|
State trait | Defines how graph state is merged when nodes produce updates |
Node<S> trait | A processing unit that takes state and returns updated state |
StateGraph | Builder for assembling nodes and edges into a graph |
CompiledGraph | The executable graph produced by StateGraph::compile() |
Checkpointer | Trait for persisting graph state across invocations |
ToolNode | Prebuilt node that auto-dispatches tool calls from AI messages |
How It Works
- Define a state type that implements State (or use the built-in MessageState).
- Create nodes -- either by implementing the Node<S> trait or by wrapping a closure with FnNode.
- Build a graph with StateGraph::new(), adding nodes and edges.
- Call .compile() to validate the graph and produce a CompiledGraph.
- Run the graph with invoke() for a single result or stream() for per-node events.
use synaptic::graph::{StateGraph, MessageState, FnNode, END};
use synaptic::core::Message;
let greet = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello from the graph!"));
Ok(state)
});
let graph = StateGraph::new()
.add_node("greet", greet)
.set_entry_point("greet")
.add_edge("greet", END)
.compile()?;
let initial = MessageState::with_messages(vec![Message::human("Hi")]);
let result = graph.invoke(initial).await?;
assert_eq!(result.messages.len(), 2);
Guides
- State & Nodes -- define state types and processing nodes
- Edges -- connect nodes with fixed and conditional edges
- Graph Streaming -- consume per-node events during execution (single and multi-mode)
- Checkpointing -- persist and resume graph state
- Human-in-the-Loop -- interrupt execution for human review
- Tool Node -- auto-dispatch tool calls from AI messages
- Visualization -- render graphs as Mermaid, ASCII, DOT, or PNG
Advanced Features
Node Caching
Use add_node_with_cache() to cache node results based on input state. Cached entries expire after the specified TTL:
use synaptic::graph::{StateGraph, CachePolicy, END};
use std::time::Duration;
let graph = StateGraph::new()
.add_node_with_cache(
"expensive",
expensive_node,
CachePolicy::new(Duration::from_secs(300)),
)
.add_edge("expensive", END)
.set_entry_point("expensive")
.compile()?;
When the same input state is seen again within the TTL, the cached result is returned without re-executing the node.
Deferred Nodes
Use add_deferred_node() to create nodes that wait for ALL incoming paths to complete before executing. This is useful for fan-in aggregation after parallel fan-out with Send:
let graph = StateGraph::new()
.add_node("branch_a", node_a)
.add_node("branch_b", node_b)
.add_deferred_node("aggregate", aggregator_node)
.add_edge("branch_a", "aggregate")
.add_edge("branch_b", "aggregate")
.add_edge("aggregate", END)
.set_entry_point("branch_a")
.compile()?;
Structured Output (response_format)
When creating an agent with create_agent(), set response_format in AgentOptions to force the final response into a specific JSON schema:
use synaptic::graph::{create_agent, AgentOptions};
let graph = create_agent(model, tools, AgentOptions {
response_format: Some(serde_json::json!({
"type": "object",
"properties": {
"answer": { "type": "string" },
"confidence": { "type": "number" }
},
"required": ["answer", "confidence"]
})),
..Default::default()
})?;
When the agent produces its final answer (no tool calls), it re-calls the model with structured output instructions matching the schema.
State & Nodes
Graphs in Synaptic operate on a state value that flows through nodes. Each node receives the current state, processes it, and returns an updated state. The State trait defines how states are merged, and the Node<S> trait defines how nodes process state.
The State Trait
Any type used as graph state must implement the State trait:
pub trait State: Clone + Send + Sync + 'static {
/// Merge another state into this one (reducer pattern).
fn merge(&mut self, other: Self);
}
The merge() method is called when combining state updates -- for example, when update_state() is used during human-in-the-loop flows. The merge semantics are up to you: append, replace, or any custom logic.
MessageState -- The Built-in State
For the common case of conversational agents, Synaptic provides MessageState:
use synaptic::graph::MessageState;
use synaptic::core::Message;
// Create an empty state
let state = MessageState::new();
// Create with initial messages
let state = MessageState::with_messages(vec![
Message::human("Hello"),
Message::ai("Hi there!"),
]);
// Access the last message
if let Some(msg) = state.last_message() {
println!("Last: {}", msg.content());
}
MessageState implements State by appending messages on merge:
fn merge(&mut self, other: Self) {
self.messages.extend(other.messages);
}
This append-only behavior is the right default for conversational workflows where each node adds new messages to the history.
Custom State
You can define your own state type for non-conversational graphs:
use synaptic::graph::State;
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
struct PipelineState {
input: String,
steps_completed: Vec<String>,
result: Option<String>,
}
impl State for PipelineState {
fn merge(&mut self, other: Self) {
self.steps_completed.extend(other.steps_completed);
if other.result.is_some() {
self.result = other.result;
}
}
}
If you plan to use checkpointing, your state must also implement Serialize and Deserialize.
The Node<S> Trait
A node is any type that implements Node<S>:
use async_trait::async_trait;
use synaptic::core::SynapticError;
use synaptic::graph::{Node, NodeOutput, MessageState};
use synaptic::core::Message;
struct GreeterNode;
#[async_trait]
impl Node<MessageState> for GreeterNode {
async fn process(&self, mut state: MessageState) -> Result<NodeOutput<MessageState>, SynapticError> {
state.messages.push(Message::ai("Hello! How can I help?"));
Ok(state.into()) // NodeOutput::State(state)
}
}
Nodes return NodeOutput<S>, which is an enum:
- NodeOutput::State(S) -- a regular state update (existing behavior). The From<S> impl lets you write Ok(state.into()).
- NodeOutput::Command(Command<S>) -- a control flow command (goto, interrupt, fan-out). See Human-in-the-Loop for interrupt examples.
Nodes are Send + Sync, so they can safely hold shared references (e.g., Arc<dyn ChatModel>) and be used across async tasks.
FnNode -- Closure-based Nodes
For simple logic, FnNode wraps an async closure as a node without defining a separate struct:
use synaptic::graph::{FnNode, MessageState};
use synaptic::core::Message;
let greeter = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello from a closure!"));
Ok(state.into())
});
FnNode accepts any function with the signature Fn(S) -> Future<Output = Result<NodeOutput<S>, SynapticError>> where S: State.
Adding Nodes to a Graph
Nodes are added to a StateGraph with a string name. The name is used to reference the node in edges and conditional routing:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;
let node_a = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step A"));
Ok(state.into())
});
let node_b = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step B"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("a", node_a)
.add_node("b", node_b)
.set_entry_point("a")
.add_edge("a", "b")
.add_edge("b", END)
.compile()?;
Both struct-based nodes (implementing Node<S>) and FnNode closures can be passed to add_node() interchangeably.
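For example, a graph can combine the GreeterNode struct from above with a closure node (a minimal sketch):
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;

let follow_up = FnNode::new(|mut state: MessageState| async move {
    state.messages.push(Message::ai("Anything else I can help with?"));
    Ok(state.into())
});

let graph = StateGraph::new()
    .add_node("greet", GreeterNode)      // struct-based node
    .add_node("follow_up", follow_up)    // closure-based node
    .set_entry_point("greet")
    .add_edge("greet", "follow_up")
    .add_edge("follow_up", END)
    .compile()?;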
Edges
Edges define the flow of execution between nodes in a graph. Synaptic supports two kinds of edges: fixed edges that always route to the same target, and conditional edges that route dynamically based on the current state.
Fixed Edges
A fixed edge unconditionally routes execution from one node to another:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;
let node_a = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step A"));
Ok(state.into())
});
let node_b = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step B"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("a", node_a)
.add_node("b", node_b)
.set_entry_point("a")
.add_edge("a", "b") // a always flows to b
.add_edge("b", END) // b always flows to END
.compile()?;
Use the END constant to indicate that a node terminates the graph. Every execution path must eventually reach END; otherwise, the graph will hit the 100-iteration safety limit.
Entry Point
Every graph requires an entry point -- the first node to execute:
let graph = StateGraph::new()
.add_node("start", my_node)
.set_entry_point("start") // required
// ...
Calling .compile() without setting an entry point returns an error.
Conditional Edges
Conditional edges route execution based on a function that inspects the current state and returns the name of the next node:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;
let router = FnNode::new(|state: MessageState| async move {
Ok(state.into()) // routing logic is in the edge, not the node
});
let handle_greeting = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello!"));
Ok(state.into())
});
let handle_question = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Let me look that up."));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("router", router)
.add_node("greeting", handle_greeting)
.add_node("question", handle_question)
.set_entry_point("router")
.add_conditional_edges("router", |state: &MessageState| {
let last = state.last_message().map(|m| m.content().to_string());
match last.as_deref() {
Some("hi") | Some("hello") => "greeting".to_string(),
_ => "question".to_string(),
}
})
.add_edge("greeting", END)
.add_edge("question", END)
.compile()?;
The router function receives an immutable reference to the state (&S) and returns a String -- the name of the next node to execute (or END to terminate).
Conditional Edges with Path Map
For graph visualization, you can provide a path_map that enumerates the possible routing targets. This gives visualization tools (Mermaid, DOT, ASCII) the information they need to draw all possible paths:
use std::collections::HashMap;
use synaptic::graph::{StateGraph, MessageState, END};
let graph = StateGraph::new()
.add_node("router", router_node)
.add_node("path_a", node_a)
.add_node("path_b", node_b)
.set_entry_point("router")
.add_conditional_edges_with_path_map(
"router",
|state: &MessageState| {
if state.messages.len() > 3 {
"path_a".to_string()
} else {
"path_b".to_string()
}
},
HashMap::from([
("path_a".to_string(), "path_a".to_string()),
("path_b".to_string(), "path_b".to_string()),
]),
)
.add_edge("path_a", END)
.add_edge("path_b", END)
.compile()?;
The path_map is a HashMap<String, String> where keys are labels and values are target node names. The compile step validates that all path map targets reference existing nodes (or END).
Validation
When you call .compile(), the graph validates:
- An entry point is set and refers to an existing node.
- Every fixed edge source and target refers to an existing node (or END).
- Every conditional edge source refers to an existing node.
- All path_map targets refer to existing nodes (or END).
If any validation fails, compile() returns a SynapticError::Graph with a descriptive message.
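A quick sketch of handling a validation failure (here the edge targets a node that was never added):
use synaptic::graph::{StateGraph, FnNode, MessageState};

let node = FnNode::new(|state: MessageState| async move { Ok(state.into()) });

let result = StateGraph::new()
    .add_node("only", node)
    .set_entry_point("only")
    .add_edge("only", "missing") // "missing" was never added -- validation fails
    .compile();

if let Err(e) = result {
    eprintln!("graph validation failed: {e}");
}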
Graph Streaming
Instead of waiting for the entire graph to finish, you can stream execution and receive a GraphEvent after each node completes. This is useful for progress reporting, real-time UIs, and debugging.
stream() and StreamMode
The stream() method on CompiledGraph returns a GraphStream -- a Pin<Box<dyn Stream>> that yields Result<GraphEvent<S>, SynapticError> values:
use synaptic::graph::{StateGraph, FnNode, MessageState, StreamMode, GraphEvent, END};
use synaptic::core::Message;
use futures::StreamExt;
let step_a = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step A done"));
Ok(state.into())
});
let step_b = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step B done"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("a", step_a)
.add_node("b", step_b)
.set_entry_point("a")
.add_edge("a", "b")
.add_edge("b", END)
.compile()?;
let initial = MessageState::with_messages(vec![Message::human("Start")]);
let mut stream = graph.stream(initial, StreamMode::Values);
while let Some(event) = stream.next().await {
let event: GraphEvent<MessageState> = event?;
println!(
"Node '{}' completed -- {} messages in state",
event.node,
event.state.messages.len()
);
}
// Output:
// Node 'a' completed -- 2 messages in state
// Node 'b' completed -- 3 messages in state
GraphEvent
Each event contains:
| Field | Type | Description |
|---|---|---|
node | String | The name of the node that just executed |
state | S | The state snapshot after the node ran |
Stream Modes
The StreamMode enum controls what the state field contains:
| Mode | Behavior |
|---|---|
StreamMode::Values | Each event contains the full accumulated state after the node |
StreamMode::Updates | Each event contains the pre-node state (useful for computing per-node deltas) |
StreamMode::Messages | Same as Values -- callers filter for AI messages in chat UIs |
StreamMode::Debug | Same as Values -- intended for detailed debug information |
StreamMode::Custom | Events emitted via StreamWriter during node execution |
Multi-Mode Streaming
You can request multiple stream modes simultaneously using stream_modes(). Each event is wrapped in a MultiGraphEvent tagged with its mode:
use synaptic::graph::{StreamMode, MultiGraphEvent};
use futures::StreamExt;
let mut stream = graph.stream_modes(
initial_state,
vec![StreamMode::Values, StreamMode::Updates],
);
while let Some(result) = stream.next().await {
let event: MultiGraphEvent<MessageState> = result?;
match event.mode {
StreamMode::Values => {
println!("Full state after '{}': {:?}", event.event.node, event.event.state);
}
StreamMode::Updates => {
println!("State before '{}': {:?}", event.event.node, event.event.state);
}
_ => {}
}
}
For each node execution, one event per requested mode is emitted. With two modes and three nodes, you get six events total.
Streaming with Checkpoints
You can combine streaming with checkpointing using stream_with_config():
use synaptic::graph::{MemorySaver, CheckpointConfig, StreamMode};
use std::sync::Arc;
let checkpointer = Arc::new(MemorySaver::new());
let graph = graph.with_checkpointer(checkpointer);
let config = CheckpointConfig::new("thread-1");
let mut stream = graph.stream_with_config(
initial_state,
StreamMode::Values,
Some(config),
);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node: {}", event.node);
}
Checkpoints are saved after each node during streaming, just as they are during invoke(). If the graph is interrupted (via interrupt_before or interrupt_after), the stream yields the interrupt error and terminates.
Error Handling
The stream yields Result values. If a node returns an error, the stream yields that error and terminates. Consuming code should handle both successful events and errors:
while let Some(result) = stream.next().await {
match result {
Ok(event) => println!("Node '{}' succeeded", event.node),
Err(e) => {
eprintln!("Graph error: {e}");
break;
}
}
}
Checkpointing
Checkpointing persists graph state between invocations, enabling resumable execution, multi-turn conversations over a graph, and human-in-the-loop workflows. The Checkpointer trait abstracts the storage backend, and MemorySaver provides an in-memory implementation for development and testing.
The Checkpointer Trait
#[async_trait]
pub trait Checkpointer: Send + Sync {
async fn put(&self, config: &CheckpointConfig, checkpoint: &Checkpoint) -> Result<(), SynapticError>;
async fn get(&self, config: &CheckpointConfig) -> Result<Option<Checkpoint>, SynapticError>;
async fn list(&self, config: &CheckpointConfig) -> Result<Vec<Checkpoint>, SynapticError>;
}
A Checkpoint stores the serialized state and the name of the next node to execute:
pub struct Checkpoint {
pub state: serde_json::Value,
pub next_node: Option<String>,
}
MemorySaver
MemorySaver is the built-in in-memory checkpointer. It stores checkpoints in a HashMap keyed by thread ID:
use synaptic::graph::MemorySaver;
use std::sync::Arc;
let checkpointer = Arc::new(MemorySaver::new());
For production use, you would implement Checkpointer with a persistent backend (database, Redis, file system, etc.).
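As a rough skeleton, a persistent implementation only needs to map a thread's checkpoints onto your backend. The import paths below are assumptions; adapt them to the actual module layout:
use async_trait::async_trait;
use synaptic::core::SynapticError;
use synaptic::graph::{Checkpoint, CheckpointConfig, Checkpointer};

struct MyDbSaver {
    // connection pool, file handle, etc.
}

#[async_trait]
impl Checkpointer for MyDbSaver {
    async fn put(&self, config: &CheckpointConfig, checkpoint: &Checkpoint) -> Result<(), SynapticError> {
        // Append the checkpoint (state + next_node) under this thread's ID.
        todo!("write checkpoint.state and checkpoint.next_node to your backend")
    }

    async fn get(&self, config: &CheckpointConfig) -> Result<Option<Checkpoint>, SynapticError> {
        // Return the most recent checkpoint for this thread, if any.
        todo!("read the latest checkpoint from your backend")
    }

    async fn list(&self, config: &CheckpointConfig) -> Result<Vec<Checkpoint>, SynapticError> {
        // Return the full checkpoint history, oldest to newest.
        todo!("read the checkpoint history from your backend")
    }
}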
Attaching a Checkpointer
After compiling a graph, attach a checkpointer with .with_checkpointer():
use synaptic::graph::{StateGraph, FnNode, MessageState, MemorySaver, END};
use synaptic::core::Message;
use std::sync::Arc;
let node = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Processed"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("process", node)
.set_entry_point("process")
.add_edge("process", END)
.compile()?
.with_checkpointer(Arc::new(MemorySaver::new()));
CheckpointConfig
A CheckpointConfig identifies a thread (conversation) for checkpointing:
use synaptic::graph::CheckpointConfig;
let config = CheckpointConfig::new("thread-1");
The thread_id string isolates different conversations. Each thread maintains its own checkpoint history.
Invoking with Checkpoints
Use invoke_with_config() to run the graph with checkpointing enabled:
let config = CheckpointConfig::new("thread-1");
let initial = MessageState::with_messages(vec![Message::human("Hello")]);
let result = graph.invoke_with_config(initial, Some(config.clone())).await?;
After each node executes, the current state and next node are saved to the checkpointer. On subsequent invocations with the same CheckpointConfig, the graph resumes from the last checkpoint.
Retrieving State
You can inspect the current state saved for a thread:
// Get the latest state for a thread
if let Some(state) = graph.get_state(&config).await? {
println!("Messages: {}", state.messages.len());
}
// Get the full checkpoint history (oldest to newest)
let history = graph.get_state_history(&config).await?;
for (state, next_node) in &history {
println!(
"State with {} messages, next node: {:?}",
state.messages.len(),
next_node
);
}
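Putting the two together, a multi-turn conversation can reuse the same thread across invocations. A sketch (how new input is merged with the restored checkpoint follows your State::merge implementation):
let config = CheckpointConfig::new("support-chat-7");

// Turn 1
let turn1 = MessageState::with_messages(vec![Message::human("Hi, I need some help.")]);
graph.invoke_with_config(turn1, Some(config.clone())).await?;

// Turn 2 -- resumes from the checkpoint saved at the end of turn 1
let turn2 = MessageState::with_messages(vec![Message::human("It's about my last invoice.")]);
graph.invoke_with_config(turn2, Some(config.clone())).await?;

// The saved state for the thread now reflects both turns
if let Some(saved) = graph.get_state(&config).await? {
    println!("history so far: {} messages", saved.messages.len());
}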
State Serialization
Checkpointing requires your state type to implement Serialize and Deserialize (from serde). The built-in MessageState already has these derives. For custom state types, add the derives:
use serde::{Serialize, Deserialize};
use synaptic::graph::State;
#[derive(Clone, Serialize, Deserialize)]
struct MyState {
data: Vec<String>,
}
impl State for MyState {
fn merge(&mut self, other: Self) {
self.data.extend(other.data);
}
}
Human-in-the-Loop
Human-in-the-loop (HITL) allows you to pause graph execution at specific points, giving a human the opportunity to review, approve, or modify the state before the graph continues. Synaptic supports two approaches:
- interrupt_before / interrupt_after -- declarative interrupts on the StateGraph builder.
- interrupt() function -- programmatic interrupts inside nodes via Command.
Both require a checkpointer to persist state for later resumption.
Interrupt Before and After
The StateGraph builder provides two interrupt modes:
- interrupt_before(nodes) -- pause execution before the named nodes run.
- interrupt_after(nodes) -- pause execution after the named nodes run.
Example: Approval Before Tool Execution
A common pattern is to interrupt before a tool execution node so a human can review the tool calls the agent proposed:
use synaptic::graph::{StateGraph, FnNode, MessageState, MemorySaver, CheckpointConfig, END};
use synaptic::core::Message;
use std::sync::Arc;
let agent_node = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("I want to call the delete_file tool."));
Ok(state.into())
});
let tool_node = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::tool("File deleted.", "call-1"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_edge("agent", "tools")
.add_edge("tools", END)
// Pause before the tools node executes
.interrupt_before(vec!["tools".to_string()])
.compile()?
.with_checkpointer(Arc::new(MemorySaver::new()));
let config = CheckpointConfig::new("thread-1");
let initial = MessageState::with_messages(vec![Message::human("Delete old logs")]);
Step 1: First Invocation -- Interrupt
The first invoke_with_config() runs the agent node, then stops before tools:
let result = graph.invoke_with_config(initial, Some(config.clone())).await?;
// Returns GraphResult::Interrupted
assert!(result.is_interrupted());
// You can inspect the interrupt value
if let Some(iv) = result.interrupt_value() {
println!("Interrupted: {iv}");
}
At this point, the checkpointer has saved the state after agent ran, with tools as the next node.
Step 2: Human Review
The human can inspect the saved state to review what the agent proposed:
if let Some(state) = graph.get_state(&config).await? {
for msg in &state.messages {
println!("[{}] {}", msg.role(), msg.content());
}
}
Step 3: Update State (Optional)
If the human wants to modify the state before resuming -- for example, to add an approval message or to change the tool call -- use update_state():
let approval = MessageState::with_messages(vec![
Message::human("Approved -- go ahead and delete."),
]);
graph.update_state(&config, approval).await?;
update_state() loads the current checkpoint, calls State::merge() with the provided update, and saves the merged result back to the checkpointer.
Step 4: Resume Execution
Resume the graph by calling invoke_with_config() again with the same config and a default (empty) state. The graph loads the checkpoint and continues from the interrupted node:
let result = graph
.invoke_with_config(MessageState::default(), Some(config))
.await?;
// The graph executed "tools" and reached END
let state = result.into_state();
println!("Final messages: {}", state.messages.len());
Programmatic Interrupt with interrupt()
For more control, nodes can call the interrupt() function to pause execution with a custom value. This is useful when the decision to interrupt depends on runtime state:
use synaptic::graph::{interrupt, Node, NodeOutput, MessageState};
struct ApprovalNode;
#[async_trait]
impl Node<MessageState> for ApprovalNode {
async fn process(&self, state: MessageState) -> Result<NodeOutput<MessageState>, SynapticError> {
// Check if any tool call is potentially dangerous
if let Some(msg) = state.last_message() {
for call in msg.tool_calls() {
if call.name == "delete_file" {
// Interrupt and ask for approval
return Ok(interrupt(serde_json::json!({
"question": "Approve file deletion?",
"tool_call": call.name,
})));
}
}
}
// No dangerous calls -- continue normally
Ok(state.into())
}
}
The caller receives a GraphResult::Interrupted with the interrupt value:
let result = graph.invoke_with_config(state, Some(config.clone())).await?;
if result.is_interrupted() {
let question = result.interrupt_value().unwrap();
println!("Agent asks: {}", question["question"]);
}
Dynamic Routing with Command
Nodes can also use Command to override the normal edge-based routing:
use synaptic::graph::{Command, NodeOutput};
// Route to a specific node, skipping normal edges
Ok(NodeOutput::Command(Command::goto("summary")))
// Route to a specific node with a state update
Ok(NodeOutput::Command(Command::goto_with_update("next", delta_state)))
// End the graph immediately
Ok(NodeOutput::Command(Command::end()))
// Update state without overriding routing
Ok(NodeOutput::Command(Command::update(delta_state)))
interrupt_after
interrupt_after works the same way, but the specified node runs before the interrupt. This is useful when you want to see the node's output before deciding whether to continue:
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_edge("agent", "tools")
.add_edge("tools", END)
// Interrupt after the agent node runs (to review its output)
.interrupt_after(vec!["agent".to_string()])
.compile()?
.with_checkpointer(Arc::new(MemorySaver::new()));
GraphResult
graph.invoke() returns Result<GraphResult<S>, SynapticError>. GraphResult is an enum:
- GraphResult::Complete(state) -- graph ran to END normally.
- GraphResult::Interrupted { state, interrupt_value } -- graph paused.
Key methods:
| Method | Description |
|---|---|
is_complete() | Returns true if the graph completed normally |
is_interrupted() | Returns true if the graph was interrupted |
state() | Borrow the state (regardless of completion/interrupt) |
into_state() | Consume and return the state |
interrupt_value() | Returns Some(&Value) if interrupted, None otherwise |
Notes
- Interrupts require a checkpointer. Without one, the graph cannot save state for resumption.
- interrupt_before / interrupt_after return GraphResult::Interrupted (not an error).
- Programmatic interrupt() also returns GraphResult::Interrupted with the value you pass.
- You can interrupt at multiple nodes by passing multiple names to interrupt_before() or interrupt_after().
- You can combine interrupt_before and interrupt_after on different nodes in the same graph.
Command & Routing
Command<S> gives nodes dynamic control over graph execution, allowing them to override edge-based routing, update state, fan out to multiple nodes, or terminate early. Use it when routing decisions depend on runtime state.
Nodes return NodeOutput<S> -- either NodeOutput::State(S) for a regular state update (via Ok(state.into())), or NodeOutput::Command(Command<S>) for dynamic control flow.
Command Constructors
| Constructor | Behavior |
|---|---|
Command::goto("node") | Route to a specific node, skipping normal edges |
Command::goto_with_update("node", delta) | Route to a node and merge delta into state |
Command::update(delta) | Merge delta into state, then follow normal routing |
Command::end() | Terminate the graph immediately |
Command::send(targets) | Fan-out to multiple nodes via Send |
Command::resume(value) | Resume from a previous interrupt (see Interrupt & Resume) |
Conditional Routing with goto
A "triage" node inspects the input and routes to different handlers:
use synaptic::graph::{Command, FnNode, NodeOutput, State, StateGraph, END};
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct TicketState {
category: String,
resolved: bool,
}
impl State for TicketState {
fn merge(&mut self, other: Self) {
if !other.category.is_empty() { self.category = other.category; }
self.resolved = self.resolved || other.resolved;
}
}
let triage = FnNode::new(|state: TicketState| async move {
let target = if state.category == "billing" {
"billing_handler"
} else {
"support_handler"
};
Ok(NodeOutput::Command(Command::goto(target)))
});
let billing = FnNode::new(|mut state: TicketState| async move {
state.resolved = true;
Ok(state.into())
});
let support = FnNode::new(|mut state: TicketState| async move {
state.resolved = true;
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("triage", triage)
.add_node("billing_handler", billing)
.add_node("support_handler", support)
.set_entry_point("triage")
.add_edge("billing_handler", END)
.add_edge("support_handler", END)
.compile()?;
let result = graph.invoke(TicketState {
category: "billing".into(),
resolved: false,
}).await?.into_state();
assert!(result.resolved);
Routing with State Update
goto_with_update routes and merges a state delta in one step. The delta is merged via State::merge() before the target node runs:
Ok(NodeOutput::Command(Command::goto_with_update("escalation", delta)))
Update Without Routing
Command::update(delta) merges state but follows normal edges. Useful when a node contributes a partial update without overriding the next step:
Ok(NodeOutput::Command(Command::update(delta)))
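For instance, a classifier node could contribute just the category and let the existing edges decide what runs next (a sketch using the TicketState defined above):
let classifier = FnNode::new(|_state: TicketState| async move {
    let delta = TicketState { category: "billing".into(), resolved: false };
    // Merge the delta into the running state, then follow the graph's normal edges.
    Ok(NodeOutput::Command(Command::update(delta)))
});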
Early Termination
Command::end() stops the graph immediately. No further nodes execute:
let guard = FnNode::new(|state: TicketState| async move {
if state.category == "spam" {
return Ok(NodeOutput::Command(Command::end()));
}
Ok(state.into())
});
Fan-Out with Send
Command::send() dispatches work to multiple targets. Each Send carries a node name and a JSON payload:
use synaptic::graph::Send;
let targets = vec![
Send::new("worker", serde_json::json!({"chunk": "part1"})),
Send::new("worker", serde_json::json!({"chunk": "part2"})),
];
Ok(NodeOutput::Command(Command::send(targets)))
Note: Full parallel fan-out is not yet implemented. Targets are currently processed sequentially.
Commands in Streaming Mode
Commands work identically when streaming. If node "a" issues Command::goto("c"), the stream yields events for "a" and "c" but skips "b", even if an a -> b edge exists.
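A minimal sketch for a hypothetical three-node graph where node "a" returns Command::goto("c"):
use futures::StreamExt;
use synaptic::graph::StreamMode;

let mut stream = graph.stream(initial_state, StreamMode::Values);
while let Some(event) = stream.next().await {
    println!("ran: {}", event?.node); // prints "a" then "c" -- "b" never runs
}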
Interrupt & Resume
interrupt(value) pauses graph execution and returns control to the caller with a JSON value, enabling human-in-the-loop workflows where a node decides at runtime whether to pause. A checkpointer is required to persist state for later resumption.
For declarative interrupts (interrupt_before/interrupt_after), see Human-in-the-Loop.
The interrupt() Function
use synaptic::graph::{interrupt, Node, NodeOutput, MessageState};
use synaptic::core::SynapticError;
use async_trait::async_trait;
struct ApprovalGate;
#[async_trait]
impl Node<MessageState> for ApprovalGate {
async fn process(
&self,
state: MessageState,
) -> Result<NodeOutput<MessageState>, SynapticError> {
if let Some(msg) = state.last_message() {
for call in msg.tool_calls() {
if call.name == "delete_database" {
return Ok(interrupt(serde_json::json!({
"question": "Approve database deletion?",
"tool_call": call.name,
})));
}
}
}
Ok(state.into()) // continue normally
}
}
Detecting Interrupts with GraphResult
graph.invoke() returns GraphResult<S> -- either Complete(state) or Interrupted { state, interrupt_value }:
let result = graph.invoke_with_config(state, Some(config.clone())).await?;
if result.is_interrupted() {
println!("Paused: {}", result.interrupt_value().unwrap());
} else {
println!("Done: {:?}", result.into_state());
}
Full Round-Trip Example
use std::sync::Arc;
use serde::{Serialize, Deserialize};
use serde_json::json;
use synaptic::graph::{
interrupt, CheckpointConfig, FnNode, MemorySaver,
NodeOutput, State, StateGraph, END,
};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct ReviewState {
proposal: String,
approved: bool,
done: bool,
}
impl State for ReviewState {
fn merge(&mut self, other: Self) {
if !other.proposal.is_empty() { self.proposal = other.proposal; }
self.approved = self.approved || other.approved;
self.done = self.done || other.done;
}
}
let propose = FnNode::new(|mut state: ReviewState| async move {
state.proposal = "Delete all temporary files".into();
Ok(state.into())
});
let gate = FnNode::new(|state: ReviewState| async move {
Ok(interrupt(json!({"question": "Approve?", "proposal": state.proposal})))
});
let execute = FnNode::new(|mut state: ReviewState| async move {
state.done = true;
Ok(state.into())
});
let saver = Arc::new(MemorySaver::new());
let graph = StateGraph::new()
.add_node("propose", propose)
.add_node("gate", gate)
.add_node("execute", execute)
.set_entry_point("propose")
.add_edge("propose", "gate")
.add_edge("gate", "execute")
.add_edge("execute", END)
.compile()?
.with_checkpointer(saver);
let config = CheckpointConfig::new("review-thread");
// Step 1: Invoke -- graph pauses at the gate
let result = graph
.invoke_with_config(ReviewState::default(), Some(config.clone()))
.await?;
assert!(result.is_interrupted());
// Step 2: Review saved state
let saved = graph.get_state(&config).await?.unwrap();
println!("Proposal: {}", saved.proposal);
// Step 3: Optionally update state before resuming
graph.update_state(&config, ReviewState {
proposal: String::new(), approved: true, done: false,
}).await?;
// Step 4: Resume execution
let result = graph
.invoke_with_config(ReviewState::default(), Some(config))
.await?;
assert!(result.is_complete());
assert!(result.into_state().done);
Notes
- Checkpointer required. Without one, state cannot be saved between interrupt and resume. MemorySaver works for development; implement Checkpointer for production.
- State is not merged on interrupt. When a node returns interrupt(), the node's state update is not applied -- only state from previously executed nodes is preserved.
- Command::resume(value) passes a value to the graph on resumption, available via the command's resume_value field.
- State history. Call graph.get_state_history(&config) to inspect all checkpoints for a thread.
Node Caching
CachePolicy paired with add_node_with_cache() enables hash-based result caching on individual graph nodes. When the same serialized input state is seen within the TTL window, the cached output is returned without re-executing the node. Use this for expensive nodes (LLM calls, API requests) where identical inputs produce identical outputs.
Setup
use std::time::Duration;
use synaptic::graph::{CachePolicy, FnNode, StateGraph, MessageState, END};
use synaptic::core::Message;
let expensive = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Expensive result"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node_with_cache(
"llm_call",
expensive,
CachePolicy::new(Duration::from_secs(60)),
)
.add_edge("llm_call", END)
.set_entry_point("llm_call")
.compile()?;
How It Works
- Before executing a cached node, the graph serializes the current state to JSON and computes a hash.
- If the cache contains a valid (non-expired) entry for that
(node_name, state_hash), the cachedNodeOutputis returned immediately --process()is not called. - On a cache miss, the node executes normally and the result is stored.
The cache is held in Arc<RwLock<HashMap>> inside CompiledGraph, persisting across multiple invoke() calls on the same instance.
Example: Verifying Cache Hits
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;
use async_trait::async_trait;
use serde::{Serialize, Deserialize};
use synaptic::core::SynapticError;
use synaptic::graph::{CachePolicy, Node, NodeOutput, State, StateGraph, END};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct MyState { counter: usize }
impl State for MyState {
fn merge(&mut self, other: Self) { self.counter += other.counter; }
}
struct TrackedNode { call_count: Arc<AtomicUsize> }
#[async_trait]
impl Node<MyState> for TrackedNode {
async fn process(&self, mut state: MyState) -> Result<NodeOutput<MyState>, SynapticError> {
self.call_count.fetch_add(1, Ordering::SeqCst);
state.counter += 1;
Ok(state.into())
}
}
let calls = Arc::new(AtomicUsize::new(0));
let graph = StateGraph::new()
.add_node_with_cache("n", TrackedNode { call_count: calls.clone() },
CachePolicy::new(Duration::from_secs(60)))
.add_edge("n", END)
.set_entry_point("n")
.compile()?;
// First call: cache miss
graph.invoke(MyState::default()).await?;
assert_eq!(calls.load(Ordering::SeqCst), 1);
// Same input: cache hit -- node not called
graph.invoke(MyState::default()).await?;
assert_eq!(calls.load(Ordering::SeqCst), 1);
// Different input: cache miss
graph.invoke(MyState { counter: 5 }).await?;
assert_eq!(calls.load(Ordering::SeqCst), 2);
TTL Expiry
Cached entries expire after the configured TTL. The next call with the same input re-executes the node:
let graph = StateGraph::new()
.add_node_with_cache("n", my_node,
CachePolicy::new(Duration::from_millis(100)))
.add_edge("n", END)
.set_entry_point("n")
.compile()?;
graph.invoke(state.clone()).await?; // executes
tokio::time::sleep(Duration::from_millis(150)).await;
graph.invoke(state.clone()).await?; // executes again
Mixing Cached and Uncached Nodes
Only nodes added with add_node_with_cache() are cached. Nodes added with add_node() always execute:
let graph = StateGraph::new()
.add_node_with_cache("llm", llm_node, CachePolicy::new(Duration::from_secs(300)))
.add_node("format", format_node) // always runs
.set_entry_point("llm")
.add_edge("llm", "format")
.add_edge("format", END)
.compile()?;
Notes
- State must implement Serialize. The cache key is a hash of the JSON-serialized state.
- Cache scope. The cache lives on the CompiledGraph instance. A new compile() starts with an empty cache.
- Works with Commands. Cached entries store the full NodeOutput, including Command variants.
Deferred Nodes
add_deferred_node() registers a node that is intended to wait until all incoming edges have been traversed before executing. Use deferred nodes as fan-in aggregation points after parallel fan-out with Command::send(), where multiple upstream branches must complete before the aggregator runs.
Adding a Deferred Node
Use add_deferred_node() on StateGraph instead of add_node():
use synaptic::graph::{FnNode, State, StateGraph, END};
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct AggState { values: Vec<String> }
impl State for AggState {
fn merge(&mut self, other: Self) { self.values.extend(other.values); }
}
let worker_a = FnNode::new(|mut state: AggState| async move {
state.values.push("from_a".into());
Ok(state.into())
});
let worker_b = FnNode::new(|mut state: AggState| async move {
state.values.push("from_b".into());
Ok(state.into())
});
let aggregator = FnNode::new(|state: AggState| async move {
println!("Collected {} results", state.values.len());
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("worker_a", worker_a)
.add_node("worker_b", worker_b)
.add_deferred_node("aggregator", aggregator)
.add_edge("worker_a", "aggregator")
.add_edge("worker_b", "aggregator")
.add_edge("aggregator", END)
.set_entry_point("worker_a")
.compile()?;
Querying Deferred Status
After compiling, check whether a node is deferred with is_deferred():
assert!(graph.is_deferred("aggregator"));
assert!(!graph.is_deferred("worker_a"));
Counting Incoming Edges
incoming_edge_count() returns the total number of fixed and conditional edges targeting a node. Use it to validate that a deferred node has the expected number of upstream dependencies:
assert_eq!(graph.incoming_edge_count("aggregator"), 2);
assert_eq!(graph.incoming_edge_count("worker_a"), 0);
The count includes fixed edges (add_edge) and conditional edge path-map entries that reference the node. Conditional edges without a path map are not counted because their targets cannot be determined statically.
Combining with Command::send()
Deferred nodes are designed as the aggregation target after Command::send() fans out work:
use synaptic::graph::{Command, NodeOutput, Send};
let dispatcher = FnNode::new(|_state: AggState| async move {
let targets = vec![
Send::new("worker", serde_json::json!({"chunk": "A"})),
Send::new("worker", serde_json::json!({"chunk": "B"})),
];
Ok(NodeOutput::Command(Command::send(targets)))
});
let graph = StateGraph::new()
.add_node("dispatch", dispatcher)
.add_node("worker", worker_node)
.add_deferred_node("collect", collector_node)
.add_edge("worker", "collect")
.add_edge("collect", END)
.set_entry_point("dispatch")
.compile()?;
Note: Full parallel fan-out for Command::send() is not yet implemented. Targets are currently processed sequentially. The deferred node infrastructure is in place for when parallel execution is added.
Linear Graphs
A deferred node in a linear chain compiles and executes normally. The deferred marker only becomes meaningful when multiple edges converge on the same node:
let graph = StateGraph::new()
.add_node("step1", step1)
.add_deferred_node("step2", step2)
.add_edge("step1", "step2")
.add_edge("step2", END)
.set_entry_point("step1")
.compile()?;
let result = graph.invoke(AggState::default()).await?.into_state();
// Runs identically to a non-deferred node in a linear chain
Notes
- Deferred is a marker. The current execution engine does not block on incoming edge completion -- it runs nodes in edge/command order. The marker is forward-looking infrastructure for future parallel fan-out support.
- is_deferred() and incoming_edge_count() are introspection-only. They let you validate graph topology in tests without affecting execution.
Tool Node
ToolNode is a prebuilt graph node that automatically dispatches tool calls found in the last AI message of the state. It bridges the synaptic_tools crate's execution infrastructure with the graph system, making it straightforward to build tool-calling agent loops.
How It Works
When ToolNode processes state, it:
- Reads the last message from the state.
- Extracts any tool_calls from that message (AI messages carry tool call requests).
- Executes each tool call through the provided SerialToolExecutor.
- Appends a Message::tool(result, call_id) for each tool call result.
- Returns the updated state.
If the last message has no tool calls, the node passes the state through unchanged.
Setup
Create a ToolNode by providing a SerialToolExecutor with registered tools:
use synaptic::graph::ToolNode;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
use synaptic::core::{Tool, SynapticError};
use synaptic::macros::tool;
use std::sync::Arc;
// Define a tool using the #[tool] macro
/// Evaluates math expressions.
#[tool(name = "calculator")]
async fn calculator(
/// The math expression to evaluate
expression: String,
) -> Result<String, SynapticError> {
Ok(format!("Result: {expression}"))
}
// Register and create the executor
let registry = ToolRegistry::new();
registry.register(calculator()).await?;
let executor = SerialToolExecutor::new(registry);
let tool_node = ToolNode::new(executor);
Note: The #[tool] macro generates the struct, Tool trait implementation, and a factory function automatically. The doc comment becomes the tool description, and function parameters become the JSON Schema. See Procedural Macros for full details.
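Before wiring it into a graph, you can call the node directly. As noted above, when the last message carries no tool calls, the state passes through unchanged -- a quick sketch using the tool_node built in the setup:
use synaptic::graph::{Node, NodeOutput, MessageState};
use synaptic::core::Message;

let state = MessageState::with_messages(vec![Message::ai("Nothing to execute.")]);
if let NodeOutput::State(unchanged) = tool_node.process(state).await? {
    assert_eq!(unchanged.messages.len(), 1); // no tool messages appended
}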
Using ToolNode in a Graph
ToolNode implements Node<MessageState>, so it can be added directly to a StateGraph:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::{Message, ToolCall};
// An agent node that produces tool calls
let agent = FnNode::new(|mut state: MessageState| async move {
let tool_call = ToolCall {
id: "call-1".to_string(),
name: "calculator".to_string(),
arguments: serde_json::json!({"expression": "2+2"}),
};
state.messages.push(Message::ai_with_tool_calls("", vec![tool_call]));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("agent", agent)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_edge("agent", "tools")
.add_edge("tools", END)
.compile()?;
let result = graph.invoke(MessageState::new()).await?.into_state();
// State now contains:
// [0] AI message with tool_calls
// [1] Tool message with "Result: 2+2"
tools_condition -- Standard Routing Function
Synaptic provides a tools_condition function that implements the standard routing logic: returns "tools" if the last message has tool calls, otherwise returns END. This replaces the need to write a custom routing closure:
use synaptic::graph::{StateGraph, MessageState, tools_condition, END};
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_conditional_edges("agent", tools_condition)
.add_edge("tools", "agent") // tool results go back to agent
.compile()?;
Agent Loop Pattern
In a typical ReAct agent, the tool node feeds results back to the agent node, which decides whether to call more tools or produce a final answer. Use tools_condition or conditional edges to implement this loop:
use std::collections::HashMap;
use synaptic::graph::{StateGraph, MessageState, END};
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_conditional_edges_with_path_map(
"agent",
|state: &MessageState| {
// If the last message has tool calls, go to tools
if let Some(msg) = state.last_message() {
if !msg.tool_calls().is_empty() {
return "tools".to_string();
}
}
END.to_string()
},
HashMap::from([
("tools".to_string(), "tools".to_string()),
(END.to_string(), END.to_string()),
]),
)
.add_edge("tools", "agent") // tool results go back to agent
.compile()?;
This is exactly the pattern that create_react_agent() implements automatically (using tools_condition internally).
create_react_agent
For convenience, Synaptic provides a factory function that assembles the standard ReAct agent graph:
use synaptic::graph::create_react_agent;
let graph = create_react_agent(model, tools);
This creates a compiled graph with "agent" and "tools" nodes wired in a conditional loop, equivalent to the manual setup shown above.
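A sketch of running it end-to-end, assuming create_react_agent accepts the same model and tool arguments as create_agent and reusing the calculator tool from the setup above:
use std::sync::Arc;
use synaptic::graph::{create_react_agent, MessageState};
use synaptic::core::Message;
use synaptic::openai::OpenAiChatModel;

let model = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let graph = create_react_agent(model, vec![calculator()]);

let initial = MessageState::with_messages(vec![Message::human("What is 2 + 2?")]);
let result = graph.invoke(initial).await?.into_state();
println!("{}", result.messages.last().unwrap().content());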
RuntimeAwareTool Injection
ToolNode supports RuntimeAwareTool instances that receive the current graph state, store reference, and tool call ID via ToolRuntime. Register runtime-aware tools with with_runtime_tool():
use synaptic::graph::ToolNode;
use synaptic::core::{RuntimeAwareTool, ToolRuntime};
let tool_node = ToolNode::new(executor)
.with_store(store) // inject store into ToolRuntime
.with_runtime_tool(my_tool); // register a RuntimeAwareTool
When create_agent is called with AgentOptions { store: Some(store), .. }, the store is automatically wired into the ToolNode.
Graph Visualization
Synaptic provides multiple ways to visualize a compiled graph, from text-based formats suitable for terminals and documentation to image formats for presentations and debugging.
Mermaid Diagram
Generate a Mermaid flowchart string. This is ideal for embedding in Markdown documents and GitHub READMEs:
let mermaid = graph.draw_mermaid();
println!("{mermaid}");
Example output:
graph TD
__start__(["__start__"])
agent["agent"]
tools["tools"]
__end__(["__end__"])
__start__ --> agent
agent --> tools
tools -.-> |continue| agent
tools -.-> |end| __end__
- __start__ and __end__ are rendered as rounded nodes.
- User-defined nodes are rendered as rectangles.
- Fixed edges use solid arrows (-->).
- Conditional edges with a path map use dashed arrows (-.->) with labels.
ASCII Art
Generate a simple text summary for terminal output:
let ascii = graph.draw_ascii();
println!("{ascii}");
Example output:
Graph:
Nodes: agent, tools
Entry: __start__ -> agent
Edges:
agent -> tools
tools -> __end__ | agent [conditional]
The Display trait is also implemented, so you can use println!("{graph}") directly, which outputs the ASCII representation.
DOT Format (Graphviz)
Generate a Graphviz DOT string for use with the dot command-line tool:
let dot = graph.draw_dot();
println!("{dot}");
Example output:
digraph G {
rankdir=TD;
"__start__" [shape=oval];
"agent" [shape=box];
"tools" [shape=box];
"__end__" [shape=oval];
"__start__" -> "agent" [style=solid];
"agent" -> "tools" [style=solid];
"tools" -> "agent" [style=dashed, label="continue"];
"tools" -> "__end__" [style=dashed, label="end"];
}
PNG via Graphviz
Render the graph to a PNG image using the Graphviz dot command. This requires dot to be installed and available in your $PATH:
graph.draw_png("my_graph.png")?;
Under the hood, this pipes the DOT output through dot -Tpng and writes the resulting image to the specified path.
PNG via Mermaid.ink API
Render the graph to a PNG image using the mermaid.ink web service. This requires internet access but does not require any local tools:
graph.draw_mermaid_png("graph_mermaid.png").await?;
The Mermaid text is base64-encoded and sent to https://mermaid.ink/img/{encoded}. The returned image is saved to the specified path.
SVG via Mermaid.ink API
Similarly, you can generate an SVG instead:
graph.draw_mermaid_svg("graph_mermaid.svg").await?;
Summary
| Method | Format | Requires |
|---|---|---|
draw_mermaid() | Mermaid text | Nothing |
draw_ascii() | Plain text | Nothing |
draw_dot() | DOT text | Nothing |
draw_png(path) | PNG image | Graphviz dot in PATH |
draw_mermaid_png(path) | PNG image | Internet access |
draw_mermaid_svg(path) | SVG image | Internet access |
Display trait | Plain text | Nothing |
Tips
- Use draw_mermaid() for documentation that renders on GitHub or mdBook.
- Use draw_ascii() or Display for quick debugging in the terminal.
- Conditional edges without a path_map cannot show their targets in visualizations. If you want full visualization support, use add_conditional_edges_with_path_map() instead of add_conditional_edges().
Middleware Overview
The middleware system intercepts and modifies agent behavior at every lifecycle point -- before/after the agent run, before/after each model call, and around each tool call. Use middleware when you need cross-cutting concerns (rate limiting, retries, context management) without modifying your agent logic.
AgentMiddleware Trait
All methods have default no-op implementations. Override only the hooks you need.
#[async_trait]
pub trait AgentMiddleware: Send + Sync {
async fn before_agent(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError>;
async fn after_agent(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError>;
async fn before_model(&self, request: &mut ModelRequest) -> Result<(), SynapticError>;
async fn after_model(&self, request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError>;
async fn wrap_model_call(&self, request: ModelRequest, next: &dyn ModelCaller) -> Result<ModelResponse, SynapticError>;
async fn wrap_tool_call(&self, request: ToolCallRequest, next: &dyn ToolCaller) -> Result<Value, SynapticError>;
}
Lifecycle Diagram
before_agent(messages)
loop {
before_model(request)
-> wrap_model_call(request, next)
after_model(request, response)
for each tool_call {
wrap_tool_call(request, next)
}
}
after_agent(messages)
before_agent and after_agent run once per invocation. The inner loop repeats for each agent step (model call followed by tool execution). before_model / after_model run around every model call and can mutate the request or response. wrap_model_call and wrap_tool_call are onion-style wrappers that receive a next caller to delegate to the next layer.
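Because every hook has a default no-op, a middleware can implement the trait directly and override a single method. A minimal sketch:
use async_trait::async_trait;
use synaptic::core::SynapticError;
use synaptic::middleware::{AgentMiddleware, ModelRequest};

struct RequestLogger;

#[async_trait]
impl AgentMiddleware for RequestLogger {
    async fn before_model(&self, request: &mut ModelRequest) -> Result<(), SynapticError> {
        // Log the request size before every model call; all other hooks keep their defaults.
        println!("calling the model with {} messages", request.messages.len());
        Ok(())
    }
}
An Arc::new(RequestLogger) can then be added to AgentOptions::middleware exactly like the built-in middlewares shown below.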
MiddlewareChain
MiddlewareChain composes multiple middlewares and executes them in registration order for before_* hooks, and in reverse order for after_* hooks.
use synaptic::middleware::MiddlewareChain;
let chain = MiddlewareChain::new(vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolRetryMiddleware::new(3)),
]);
Using Middleware with create_agent
Pass middlewares through AgentOptions::middleware. The agent graph wires them into both the model node and the tool node automatically.
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::{ModelCallLimitMiddleware, ToolRetryMiddleware};
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
Built-in Middlewares
| Middleware | Hook Used | Description |
|---|---|---|
ModelCallLimitMiddleware | wrap_model_call | Limits model invocations per run |
ToolCallLimitMiddleware | wrap_tool_call | Limits tool invocations per run |
ToolRetryMiddleware | wrap_tool_call | Retries failed tools with exponential backoff |
ModelFallbackMiddleware | wrap_model_call | Falls back to alternative models on failure |
SummarizationMiddleware | before_model | Auto-summarizes when context exceeds token limit |
TodoListMiddleware | before_model | Injects a task list into the agent context |
HumanInTheLoopMiddleware | wrap_tool_call | Pauses for human approval before tool execution |
ContextEditingMiddleware | before_model | Trims or filters context before model calls |
Writing a Custom Middleware
The easiest way to define a middleware is with the corresponding macro. Each lifecycle hook has its own macro (#[before_agent], #[before_model], #[after_model], #[after_agent], #[wrap_model_call], #[wrap_tool_call], #[dynamic_prompt]). The macro generates the struct, AgentMiddleware trait implementation, and a factory function automatically.
use synaptic::macros::before_model;
use synaptic::middleware::ModelRequest;
use synaptic::core::SynapticError;
#[before_model]
async fn log_model_call(request: &mut ModelRequest) -> Result<(), SynapticError> {
println!("Model call with {} messages", request.messages.len());
Ok(())
}
Then add it to your agent:
let options = AgentOptions {
middleware: vec![log_model_call()],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
Note: The log_model_call() factory function returns Arc<dyn AgentMiddleware>. For stateful middleware, use #[field] parameters on the function. See Procedural Macros for the full reference, including all seven middleware macros and stateful middleware with #[field].
ModelCallLimitMiddleware
Limits the number of model invocations during a single agent run, preventing runaway loops. Use this when you want a hard cap on how many times the LLM is called per invocation.
Constructor
use synaptic::middleware::ModelCallLimitMiddleware;
let mw = ModelCallLimitMiddleware::new(10); // max 10 model calls
The middleware also exposes call_count() to inspect the current count and reset() to zero it out.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ModelCallLimitMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(5)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_model_call
- Before delegating to the next layer, the middleware atomically increments an internal counter.
- If the counter has reached or exceeded max_calls, it returns SynapticError::MaxStepsExceeded immediately without calling the model.
- Otherwise, it delegates to next.call(request) as normal.
This means the agent loop terminates with an error once the limit is hit. The counter persists across the entire agent invocation (all steps in the agent loop), so a limit of 5 means at most 5 model round-trips total.
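If you want to inspect the counter after a run, keep a handle to the middleware before handing it to the agent (a sketch, with model, tools, and state as in the earlier examples):
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ModelCallLimitMiddleware;

let limit = Arc::new(ModelCallLimitMiddleware::new(5));

let options = AgentOptions {
    middleware: vec![limit.clone()],
    ..Default::default()
};
let graph = create_agent(model, tools, options)?;

let _ = graph.invoke(state).await?;
println!("model calls used: {}", limit.call_count());
limit.reset(); // start fresh for the next run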
Example: Combining with Other Middleware
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
The model call limit is checked on every model call regardless of whether other middlewares modify the request or response.
ToolCallLimitMiddleware
Limits the number of tool invocations during a single agent run. Use this to cap tool usage when agents may generate excessive tool calls in a loop.
Constructor
use synaptic::middleware::ToolCallLimitMiddleware;
let mw = ToolCallLimitMiddleware::new(20); // max 20 tool calls
The middleware exposes call_count() and reset() for inspection and manual reset.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ToolCallLimitMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ToolCallLimitMiddleware::new(20)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_tool_call
- Each time a tool call is dispatched, the middleware atomically increments an internal counter.
- If the counter has reached or exceeded max_calls, it returns SynapticError::MaxStepsExceeded without executing the tool.
- Otherwise, it delegates to next.call(request) normally.
The counter tracks individual tool calls, not agent steps. If a single model response requests three tool calls, the counter increments three times. This gives you precise control over total tool usage across the entire agent run.
Combining Model and Tool Limits
Both limits can be applied simultaneously to guard against different failure modes:
use synaptic::middleware::{ModelCallLimitMiddleware, ToolCallLimitMiddleware};
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolCallLimitMiddleware::new(30)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
The agent stops as soon as either limit is hit.
Handling the Error
When the limit is exceeded, the middleware returns SynapticError::MaxStepsExceeded. You can catch this to provide a graceful fallback:
use synaptic::core::SynapticError;
let mut state = MessageState::new();
state.messages.push(Message::human("Do something complex."));
match graph.invoke(state).await {
Ok(result) => println!("{}", result.into_state().messages.last().unwrap().content()),
Err(SynapticError::MaxStepsExceeded(msg)) => {
println!("Agent hit tool call limit: {msg}");
// Retry with a higher limit, summarize progress, or inform the user
}
Err(e) => println!("Other error: {e}"),
}
Inspecting and Resetting
The middleware provides methods to inspect and reset the counter:
let mw = ToolCallLimitMiddleware::new(10);
// After an agent run, check how many tool calls were made
println!("Tool calls used: {}", mw.call_count());
// Reset the counter for a new run
mw.reset();
assert_eq!(mw.call_count(), 0);
ToolRetryMiddleware
Retries failed tool calls with exponential backoff. Use this when tools may experience transient failures (network timeouts, rate limits, temporary unavailability) and you want automatic recovery without surfacing errors to the model.
Constructor
use synaptic::middleware::ToolRetryMiddleware;
// Retry up to 3 times (4 total attempts including the first)
let mw = ToolRetryMiddleware::new(3);
Configuration
The base delay between retries defaults to 100ms and doubles on each attempt (exponential backoff). You can customize it with with_base_delay:
use std::time::Duration;
let mw = ToolRetryMiddleware::new(3)
.with_base_delay(Duration::from_millis(500));
// Delays: 500ms, 1000ms, 2000ms
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ToolRetryMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_tool_call
- When a tool call fails, the middleware waits for base_delay * 2^attempt and retries.
- Retries continue up to max_retries times. If all retries fail, the last error is returned.
- If the tool call succeeds on any attempt, the result is returned immediately.
The backoff schedule with the default 100ms base delay:
| Attempt | Delay before retry |
|---|---|
| 1st retry | 100ms |
| 2nd retry | 200ms |
| 3rd retry | 400ms |
Combining with Tool Call Limits
When both middlewares are active, the retry middleware operates inside the tool call limit. Each retry counts as a separate tool call:
let options = AgentOptions {
middleware: vec![
Arc::new(ToolCallLimitMiddleware::new(30)),
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
ModelFallbackMiddleware
Falls back to alternative models when the primary model fails. Use this for high-availability scenarios where you want seamless failover between providers (e.g., OpenAI to Anthropic) or between model tiers (e.g., GPT-4 to GPT-3.5).
Constructor
use synaptic::middleware::ModelFallbackMiddleware;
let mw = ModelFallbackMiddleware::new(vec![
fallback_model_1, // Arc<dyn ChatModel>
fallback_model_2, // Arc<dyn ChatModel>
]);
The fallback list is tried in order. The first successful response is returned.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::openai::OpenAiChatModel;
use synaptic::anthropic::AnthropicChatModel;
use synaptic::middleware::ModelFallbackMiddleware;
let primary = Arc::new(OpenAiChatModel::new("gpt-4o"));
let fallback = Arc::new(AnthropicChatModel::new("claude-sonnet-4-20250514"));
let options = AgentOptions {
middleware: vec![
Arc::new(ModelFallbackMiddleware::new(vec![fallback])),
],
..Default::default()
};
let graph = create_agent(primary, tools, options)?;
How It Works
- Lifecycle hook: wrap_model_call
- The middleware first delegates to next.call(request), which calls the primary model through the rest of the middleware chain.
- If the primary call succeeds, the response is returned as-is.
- If the primary call fails, the middleware tries each fallback model in order by creating a BaseChatModelCaller and sending the same request.
- The first fallback that succeeds is returned. If all fallbacks also fail, the original error from the primary model is returned.
Fallback models are called directly (bypassing the middleware chain) to avoid interference from other middlewares that may have caused or contributed to the failure.
Example: Multi-tier Fallback
let primary = Arc::new(OpenAiChatModel::new("gpt-4o"));
let tier2 = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let tier3 = Arc::new(AnthropicChatModel::new("claude-sonnet-4-20250514"));
let options = AgentOptions {
middleware: vec![
Arc::new(ModelFallbackMiddleware::new(vec![tier2, tier3])),
],
..Default::default()
};
let graph = create_agent(primary, tools, options)?;
The agent tries GPT-4o first, then GPT-4o-mini, then Claude Sonnet.
SummarizationMiddleware
Automatically summarizes conversation history when it exceeds a token limit. Use this for long-running agents where the context window would otherwise overflow, replacing older messages with a concise summary while keeping recent messages intact.
Constructor
use synaptic::middleware::SummarizationMiddleware;
let mw = SummarizationMiddleware::new(
summarizer_model, // Arc<dyn ChatModel> -- model used to generate summaries
4000, // max_tokens -- threshold that triggers summarization
|msg: &Message| { // token_counter -- estimates tokens per message
msg.content().len() / 4
},
);
Parameters:
- model -- The ChatModel used to generate the summary. Can be the same model as the agent or a cheaper/faster one.
- max_tokens -- When the estimated total tokens exceed this value, summarization is triggered.
- token_counter -- A function Fn(&Message) -> usize that estimates the token count for a single message. A common heuristic is content.len() / 4.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::SummarizationMiddleware;
use synaptic::openai::OpenAiChatModel;
let summarizer = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let options = AgentOptions {
middleware: vec![
Arc::new(SummarizationMiddleware::new(
summarizer,
4000,
|msg| msg.content().len() / 4,
)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: before_model
- Before each model call, the middleware sums the estimated tokens across all messages.
- If the total is within max_tokens, no action is taken.
- If the total exceeds the limit, it splits messages into two groups:
  - Recent messages that fit within half the token budget (kept as-is).
  - Older messages that are sent to the summarizer model.
- The summarizer produces a concise summary, which replaces the older messages as a system message prefixed with [Previous conversation summary].
- The request then proceeds with the summary plus the recent messages, staying within budget.
This approach preserves the most recent context verbatim while compressing older exchanges, keeping the agent informed about prior conversation without exceeding context limits.
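A sketch of the split under the rule described above -- the half-budget cutoff follows the description; the function works on per-message token estimates rather than real Message values:
// Returns the index at which to split: tokens[..idx] belong to messages that get
// summarized, tokens[idx..] to recent messages kept verbatim.
fn split_point(tokens: &[usize], max_tokens: usize) -> usize {
    let recent_budget = max_tokens / 2;
    let mut used = 0;
    let mut idx = tokens.len();
    // Walk backwards, keeping recent messages while they fit in half the budget.
    while idx > 0 && used + tokens[idx - 1] <= recent_budget {
        used += tokens[idx - 1];
        idx -= 1;
    }
    idx
}
let estimates = [1200, 900, 800, 400, 300]; // oldest .. newest
assert_eq!(split_point(&estimates, 4000), 2); // the two oldest messages get summarized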
Example: Budget-conscious Summarization
Use a cheaper model for summaries to reduce costs:
use synaptic::openai::OpenAiChatModel;
let agent_model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let cheap_model = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let options = AgentOptions {
middleware: vec![
Arc::new(SummarizationMiddleware::new(
cheap_model,
8000,
|msg| msg.content().len() / 4,
)),
],
..Default::default()
};
let graph = create_agent(agent_model, tools, options)?;
Offline Testing with ScriptedChatModel
Test summarization behavior without API keys:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
use synaptic::middleware::SummarizationMiddleware;
use synaptic::graph::{create_agent, AgentOptions, MessageState};
// Script: summarizer returns a summary, agent responds
let summarizer = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Summary: discussed Rust ownership."),
usage: None,
},
]));
let agent_model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Here's more about lifetimes."),
usage: None,
},
]));
let options = AgentOptions {
middleware: vec![
Arc::new(SummarizationMiddleware::new(
summarizer,
100, // low threshold for testing
|msg| msg.content().len() / 4,
)),
],
..Default::default()
};
let graph = create_agent(agent_model, vec![], options)?;
// Build a state with enough messages to exceed the threshold
let mut state = MessageState::new();
state.messages.push(Message::human("What is Rust?"));
state.messages.push(Message::ai("Rust is a systems programming language..."));
state.messages.push(Message::human("Tell me about ownership."));
state.messages.push(Message::ai("Ownership is a set of rules that govern memory..."));
state.messages.push(Message::human("And lifetimes?"));
let result = graph.invoke(state).await?.into_state();
TodoListMiddleware
Injects task-planning state into the agent's context by maintaining a shared todo list. Use this when your agent performs multi-step operations and you want it to track progress across model calls.
Constructor
use synaptic::middleware::TodoListMiddleware;
let mw = TodoListMiddleware::new();
Managing Tasks
The middleware provides async methods to add and complete tasks programmatically:
let mw = TodoListMiddleware::new();
// Add tasks before or during agent execution
let id1 = mw.add("Research competitor pricing").await;
let id2 = mw.add("Draft summary report").await;
// Mark tasks as done
mw.complete(id1).await;
// Inspect current state
let items = mw.items().await;
Each task gets a unique auto-incrementing ID. Tasks have an id, task (description), and done (completion status).
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::TodoListMiddleware;
let todo = Arc::new(TodoListMiddleware::new());
todo.add("Gather user requirements").await;
todo.add("Generate implementation plan").await;
todo.add("Write code").await;
let options = AgentOptions {
middleware: vec![todo.clone()],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: before_model
- Before each model call, the middleware checks the current todo list.
- If the list is non-empty, it inserts a system message at the beginning of the request's message list containing the formatted task list.
- The model sees the current state of all tasks, including which ones are done.
The injected message looks like:
Current TODO list:
[ ] #1: Gather user requirements
[x] #2: Generate implementation plan
[ ] #3: Write code
This gives the model awareness of the overall plan and progress, enabling it to work through tasks methodically. You can call complete() from tool implementations or external code to update progress between agent steps.
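If you need the same rendering outside the middleware (for logging or a UI), it can be reproduced from items(); the id, task, and done field names follow the description above:
let items = mw.items().await;
let rendered: String = items
    .iter()
    .map(|item| {
        let mark = if item.done { "x" } else { " " };
        format!("[{}] #{}: {}\n", mark, item.id, item.task)
    })
    .collect();
println!("Current TODO list:\n{rendered}");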
Example: Tool-driven Task Completion
Combine with a custom tool that marks tasks as done:
let todo = Arc::new(TodoListMiddleware::new());
todo.add("Fetch data from API").await;
todo.add("Transform data").await;
todo.add("Save results").await;
// The agent sees the todo list in its context and can
// reason about which tasks remain. Your tools can call
// todo.complete(id) when they finish their work.
HumanInTheLoopMiddleware
Pauses tool execution to request human approval before proceeding. Use this when certain tool calls (e.g., database writes, payments, deployments) require human oversight.
Constructor
There are two constructors depending on the scope of approval:
use synaptic::middleware::HumanInTheLoopMiddleware;
// Require approval for ALL tool calls
let mw = HumanInTheLoopMiddleware::new(callback);
// Require approval only for specific tools
let mw = HumanInTheLoopMiddleware::for_tools(
callback,
vec!["delete_record".to_string(), "send_email".to_string()],
);
ApprovalCallback Trait
You must implement the ApprovalCallback trait to define how approval is obtained:
use synaptic::middleware::ApprovalCallback;
struct CliApproval;
#[async_trait]
impl ApprovalCallback for CliApproval {
    async fn approve(&self, tool_name: &str, arguments: &Value) -> Result<bool, SynapticError> {
        use std::io::{BufRead, Write};
        println!("Tool '{}' wants to run with args: {}", tool_name, arguments);
        print!("Approve? (y/n) ");
        std::io::stdout().flush().ok();
        // Read one line from stdin; approve only on an explicit "y" / "yes"
        let mut line = String::new();
        std::io::stdin().lock().read_line(&mut line).ok();
        Ok(matches!(line.trim().to_lowercase().as_str(), "y" | "yes"))
    }
}
Return Ok(true) to approve, Ok(false) to reject (the model receives a rejection message), or Err(...) to abort the entire agent run.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::HumanInTheLoopMiddleware;
let approval = Arc::new(CliApproval);
let hitl = HumanInTheLoopMiddleware::for_tools(
approval,
vec!["delete_record".to_string()],
);
let options = AgentOptions {
middleware: vec![Arc::new(hitl)],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_tool_call
- When a tool call arrives, the middleware checks whether it requires approval:
  - If constructed with new(), all tools require approval.
  - If constructed with for_tools(), only the named tools require approval.
- For tools that require approval, it calls callback.approve(tool_name, arguments).
- If approved (true), the tool call proceeds normally via next.call(request).
- If rejected (false), the middleware returns a Value::String message saying the call was rejected. This message is fed back to the model as the tool result, allowing it to adjust its plan.
Example: Selective Approval with Logging
struct AuditApproval {
auto_approve: HashSet<String>,
}
#[async_trait]
impl ApprovalCallback for AuditApproval {
async fn approve(&self, tool_name: &str, arguments: &Value) -> Result<bool, SynapticError> {
if self.auto_approve.contains(tool_name) {
tracing::info!("Auto-approved: {}", tool_name);
return Ok(true);
}
tracing::warn!("Requires manual approval: {} with {:?}", tool_name, arguments);
// In production, this could send a Slack message, webhook, etc.
Ok(false) // reject by default until approved
}
}
This pattern lets you auto-approve safe operations while gating dangerous ones.
ContextEditingMiddleware
Trims or filters the conversation context before each model call. Use this to keep the context window manageable when full summarization is unnecessary -- for example, dropping old messages or stripping tool call noise from the history.
Constructor
The middleware accepts a ContextStrategy that defines how messages are edited:
use synaptic::middleware::{ContextEditingMiddleware, ContextStrategy};
// Keep only the last 10 non-system messages
let mw = ContextEditingMiddleware::new(ContextStrategy::LastN(10));
// Remove tool call/result pairs, keeping only human/AI content messages
let mw = ContextEditingMiddleware::new(ContextStrategy::StripToolCalls);
// Strip tool calls first, then keep last N
let mw = ContextEditingMiddleware::new(ContextStrategy::StripAndTruncate(10));
Convenience Constructors
let mw = ContextEditingMiddleware::last_n(10);
let mw = ContextEditingMiddleware::strip_tool_calls();
Strategies
| Strategy | Behavior |
|---|---|
LastN(n) | Keeps leading system messages, then the last n non-system messages |
StripToolCalls | Removes Tool messages and AI messages that contain only tool calls (no text) |
StripAndTruncate(n) | Applies StripToolCalls first, then LastN(n) |
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ContextEditingMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ContextEditingMiddleware::last_n(20)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: before_model
- Before each model call, the middleware applies the configured strategy to request.messages.
- LastN: System messages at the start of the list are always preserved. From the remaining messages, only the last n are kept. Earlier messages are dropped.
- StripToolCalls: Messages with is_tool() == true are removed. AI messages that have tool calls but empty text content are also removed. This cleans up the tool-call/tool-result pairs while preserving the conversational content.
- StripAndTruncate: Runs both filters in sequence -- first strips tool calls, then truncates to the last N.
The original message list in the agent state is not modified; only the request sent to the model is trimmed.
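A sketch of the LastN rule in isolation -- leading system messages survive, and only the last n of the rest are kept. It operates on (is_system, content) pairs purely for illustration:
fn last_n(messages: &[(bool, &str)], n: usize) -> Vec<(bool, String)> {
    // Leading system messages are preserved unconditionally.
    let lead = messages.iter().take_while(|(is_system, _)| *is_system).count();
    let (system_prefix, rest) = messages.split_at(lead);
    // From everything else, keep only the last n entries.
    let start = rest.len().saturating_sub(n);
    system_prefix
        .iter()
        .chain(&rest[start..])
        .map(|&(is_system, content)| (is_system, content.to_string()))
        .collect()
}
let history = [(true, "You are helpful."), (false, "hi"), (false, "hello"), (false, "thanks")];
assert_eq!(last_n(&history, 2).len(), 3); // 1 system message + the last 2 others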
Example: Combining with Summarization
For maximum context efficiency, strip tool calls first, then summarize what remains:
let options = AgentOptions {
middleware: vec![
Arc::new(ContextEditingMiddleware::strip_tool_calls()),
Arc::new(SummarizationMiddleware::new(model.clone(), 4000, |msg| msg.content().len() / 4)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
The context editor removes tool noise before summarization runs, producing cleaner summaries.
Key-Value Store
The Store trait provides persistent key-value storage for agents, enabling cross-invocation state management.
Store Trait
use synaptic::store::Store;
#[async_trait]
pub trait Store: Send + Sync {
async fn get(&self, namespace: &[&str], key: &str) -> Result<Option<Item>, SynapticError>;
async fn search(&self, namespace: &[&str], query: Option<&str>, limit: usize) -> Result<Vec<Item>, SynapticError>;
async fn put(&self, namespace: &[&str], key: &str, value: Value) -> Result<(), SynapticError>;
async fn delete(&self, namespace: &[&str], key: &str) -> Result<(), SynapticError>;
async fn list_namespaces(&self, prefix: &[&str]) -> Result<Vec<Vec<String>>, SynapticError>;
}
Each Item returned from get() or search() contains:
pub struct Item {
pub namespace: Vec<String>,
pub key: String,
pub value: Value,
pub created_at: String,
pub updated_at: String,
pub score: Option<f64>, // populated by semantic search
}
InMemoryStore
use synaptic::store::InMemoryStore;
let store = InMemoryStore::new();
store.put(&["users", "prefs"], "theme", json!("dark")).await?;
let item = store.get(&["users", "prefs"], "theme").await?;
Semantic Search
When configured with an embeddings model, InMemoryStore uses cosine similarity for search() queries instead of substring matching. Items are ranked by relevance and Item::score is populated.
use synaptic::store::InMemoryStore;
use synaptic::openai::OpenAiEmbeddings;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = InMemoryStore::new().with_embeddings(embeddings);
// Put documents
store.put(&["docs"], "rust", json!("Rust is a systems programming language")).await?;
store.put(&["docs"], "python", json!("Python is an interpreted language")).await?;
// Semantic search — results ranked by similarity
let results = store.search(&["docs"], Some("systems programming"), 10).await?;
// results[0] will be the "rust" item with highest similarity score
assert!(results[0].score.unwrap() > results[1].score.unwrap());
Without embeddings, search() falls back to substring matching on key and value.
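For example, with no embeddings configured, a hit simply means the query string appears in the key or stored value (a sketch of the stated fallback; details such as case sensitivity are not specified here):
use serde_json::json;
use synaptic::store::InMemoryStore;
let store = InMemoryStore::new();
store.put(&["notes"], "ownership", json!("Rust ownership rules")).await?;
store.put(&["notes"], "gc", json!("Garbage-collected languages")).await?;
// Substring match on key/value -- only the "ownership" item should match.
let hits = store.search(&["notes"], Some("ownership"), 10).await?;
assert_eq!(hits.len(), 1);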
Using with Agents
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::store::InMemoryStore;
let store = Arc::new(InMemoryStore::new());
let options = AgentOptions {
store: Some(store),
..Default::default()
};
let graph = create_agent(model, tools, options)?;
When a store is provided to create_agent, it is automatically wired into ToolNode. Any RuntimeAwareTool registered with the agent will receive the store via ToolRuntime.
Multi-Agent Patterns
Synaptic provides prebuilt multi-agent orchestration patterns that compose individual agents into collaborative workflows.
Pattern Comparison
| Pattern | Coordinator | Routing | Best For |
|---|---|---|---|
| Supervisor | Central supervisor model | Supervisor decides which sub-agent runs next | Structured delegation with clear task boundaries |
| Swarm | None (decentralized) | Each agent hands off to peers directly | Organic collaboration where any agent can escalate |
| Handoff Tools | Custom | You wire the topology | Arbitrary graphs that don't fit supervisor or swarm |
When to Use Each
Supervisor -- Use when you have a clear hierarchy. A single model reads the conversation and decides which specialist agent should handle the next step. The supervisor sees the full message history and can route back to itself when done.
Swarm -- Use when agents are peers. Each agent has its own model, tools, and a set of handoff tools to transfer to any other agent. There is no central coordinator; any agent can decide to transfer at any time.
Handoff Tools -- Use when you need a custom topology. create_handoff_tool produces a Tool that signals an intent to transfer to another agent. You can register these in any graph structure you design manually.
Key Types
All multi-agent functions live in synaptic_graph:
use synaptic::graph::{
create_supervisor, SupervisorOptions,
create_swarm, SwarmAgent, SwarmOptions,
create_handoff_tool,
create_agent, AgentOptions,
MessageState,
};
Minimal Example
use std::sync::Arc;
use synaptic::graph::{
create_agent, create_supervisor, AgentOptions, SupervisorOptions, MessageState,
};
use synaptic::core::Message;
// Build two sub-agents
let agent_a = create_agent(model.clone(), tools_a, AgentOptions::default())?;
let agent_b = create_agent(model.clone(), tools_b, AgentOptions::default())?;
// Wire them under a supervisor
let graph = create_supervisor(
model,
vec![
("agent_a".to_string(), agent_a),
("agent_b".to_string(), agent_b),
],
SupervisorOptions::default(),
)?;
let mut state = MessageState::new();
state.messages.push(Message::human("Hello, delegate this."));
let result = graph.invoke(state).await?.into_state();
See the individual pages for detailed usage of each pattern.
Supervisor Pattern
The supervisor pattern uses a central model to route conversations to specialized sub-agents.
How It Works
create_supervisor builds a graph with a "supervisor" node at the center. The supervisor node calls a ChatModel with handoff tools -- one per sub-agent. When the model emits a transfer_to_<agent_name> tool call, the graph routes to that sub-agent. When the sub-agent finishes, control returns to the supervisor, which can delegate again or produce a final answer.
+------------+
| supervisor |<-----+
+-----+------+ |
/ \ |
agent_a agent_b ------+
API
use synaptic::graph::{create_supervisor, SupervisorOptions};
pub fn create_supervisor(
model: Arc<dyn ChatModel>,
agents: Vec<(String, CompiledGraph<MessageState>)>,
options: SupervisorOptions,
) -> Result<CompiledGraph<MessageState>, SynapticError>;
SupervisorOptions
| Field | Type | Description |
|---|---|---|
checkpointer | Option<Arc<dyn Checkpointer>> | Persist state across invocations |
store | Option<Arc<dyn Store>> | Shared key-value store |
system_prompt | Option<String> | Override the default supervisor prompt |
If no system_prompt is provided, a default is generated:
"You are a supervisor managing these agents: agent_a, agent_b. Use the transfer tools to delegate tasks to the appropriate agent. When the task is complete, respond directly to the user."
Full Example
use std::sync::Arc;
use synaptic::core::{ChatModel, Message, Tool};
use synaptic::graph::{
create_agent, create_supervisor, AgentOptions, MessageState, SupervisorOptions,
};
// Assume `model` implements ChatModel, `research_tools` and `writing_tools`
// are Vec<Arc<dyn Tool>>.
// 1. Create sub-agents
let researcher = create_agent(
model.clone(),
research_tools,
AgentOptions {
system_prompt: Some("You are a research assistant.".into()),
..Default::default()
},
)?;
let writer = create_agent(
model.clone(),
writing_tools,
AgentOptions {
system_prompt: Some("You are a writing assistant.".into()),
..Default::default()
},
)?;
// 2. Create the supervisor graph
let supervisor = create_supervisor(
model,
vec![
("researcher".to_string(), researcher),
("writer".to_string(), writer),
],
SupervisorOptions {
system_prompt: Some(
"Route research questions to researcher, writing tasks to writer.".into(),
),
..Default::default()
},
)?;
// 3. Invoke
let mut state = MessageState::new();
state.messages.push(Message::human("Write a summary of recent AI trends."));
let result = supervisor.invoke(state).await?.into_state();
println!("{}", result.messages.last().unwrap().content());
With Checkpointing
Pass a checkpointer to persist the supervisor's state across calls:
use synaptic::graph::MemorySaver;
let supervisor = create_supervisor(
model,
agents,
SupervisorOptions {
checkpointer: Some(Arc::new(MemorySaver::new())),
..Default::default()
},
)?;
Offline Testing with ScriptedChatModel
You can test supervisor graphs without an API key using ScriptedChatModel. Script the supervisor to emit a handoff tool call, and script the sub-agent to produce a response:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::graph::{
create_agent, create_supervisor, AgentOptions, MessageState, SupervisorOptions,
};
// Sub-agent model: responds directly (no tool calls)
let agent_model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The research is complete."),
usage: None,
},
]);
// Supervisor model: first response transfers to researcher, second is final answer
let supervisor_model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"",
vec![ToolCall {
id: "call_1".into(),
name: "transfer_to_researcher".into(),
arguments: "{}".into(),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("All done. Here is the summary."),
usage: None,
},
]);
let researcher = create_agent(
Arc::new(agent_model),
vec![],
AgentOptions::default(),
)?;
let supervisor = create_supervisor(
Arc::new(supervisor_model),
vec![("researcher".to_string(), researcher)],
SupervisorOptions::default(),
)?;
let mut state = MessageState::new();
state.messages.push(Message::human("Research AI trends."));
let result = supervisor.invoke(state).await?.into_state();
Notes
- Each sub-agent is wrapped in a SubAgentNode that calls graph.invoke(state) and returns the resulting state back to the supervisor.
- The supervisor sees the full message history, including messages appended by sub-agents.
- The graph terminates when the supervisor produces a response with no tool calls.
Swarm Pattern
The swarm pattern creates a decentralized multi-agent graph where every agent can hand off to any other agent directly.
How It Works
create_swarm takes a list of SwarmAgent definitions. Each agent has its own model, tools, and system prompt. Synaptic automatically generates handoff tools (transfer_to_<peer>) for every other agent and adds them to each agent's tool set. A shared "tools" node executes regular tool calls and routes handoff tool calls to the target agent.
triage ----> tools ----> billing
^ | |
| v |
+------- support <------+
The first agent in the list is the entry point.
API
use synaptic::graph::{create_swarm, SwarmAgent, SwarmOptions};
pub fn create_swarm(
agents: Vec<SwarmAgent>,
options: SwarmOptions,
) -> Result<CompiledGraph<MessageState>, SynapticError>;
SwarmAgent
| Field | Type | Description |
|---|---|---|
name | String | Unique agent identifier |
model | Arc<dyn ChatModel> | The model this agent uses |
tools | Vec<Arc<dyn Tool>> | Agent-specific tools (handoff tools are added automatically) |
system_prompt | Option<String> | Optional system prompt for this agent |
SwarmOptions
| Field | Type | Description |
|---|---|---|
checkpointer | Option<Arc<dyn Checkpointer>> | Persist state across invocations |
store | Option<Arc<dyn Store>> | Shared key-value store |
Full Example
use std::sync::Arc;
use synaptic::core::{ChatModel, Message, Tool};
use synaptic::graph::{create_swarm, MessageState, SwarmAgent, SwarmOptions};
// Assume `model` implements ChatModel and *_tools are Vec<Arc<dyn Tool>>.
let swarm = create_swarm(
vec![
SwarmAgent {
name: "triage".to_string(),
model: model.clone(),
tools: triage_tools,
system_prompt: Some("You triage incoming requests.".into()),
},
SwarmAgent {
name: "billing".to_string(),
model: model.clone(),
tools: billing_tools,
system_prompt: Some("You handle billing questions.".into()),
},
SwarmAgent {
name: "support".to_string(),
model: model.clone(),
tools: support_tools,
system_prompt: Some("You provide technical support.".into()),
},
],
SwarmOptions::default(),
)?;
// The first agent ("triage") is the entry point.
let mut state = MessageState::new();
state.messages.push(Message::human("I need to update my payment method."));
let result = swarm.invoke(state).await?.into_state();
// The triage agent will call `transfer_to_billing`, routing to the billing agent.
println!("{}", result.messages.last().unwrap().content());
Routing Logic
- When an agent produces tool calls, the graph routes to the "tools" node.
- The tools node executes regular tool calls via the shared SerialToolExecutor.
- For handoff tools (transfer_to_<name>), it adds a synthetic tool response message and skips execution.
- After the tools node, routing inspects the last AI message for handoff calls and transfers to the target agent. If no handoff occurred, the current agent continues.
Offline Testing with ScriptedChatModel
Test swarm graphs without API keys by scripting each agent's model:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::graph::{create_swarm, MessageState, SwarmAgent, SwarmOptions};
// Triage model: transfers to billing
let triage_model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"",
vec![ToolCall {
id: "call_1".into(),
name: "transfer_to_billing".into(),
arguments: "{}".into(),
}],
),
usage: None,
},
]));
// Billing model: responds directly
let billing_model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Your payment method has been updated."),
usage: None,
},
]));
let swarm = create_swarm(
vec![
SwarmAgent {
name: "triage".to_string(),
model: triage_model,
tools: vec![],
system_prompt: Some("Route requests to the right agent.".into()),
},
SwarmAgent {
name: "billing".to_string(),
model: billing_model,
tools: vec![],
system_prompt: Some("Handle billing questions.".into()),
},
],
SwarmOptions::default(),
)?;
let mut state = MessageState::new();
state.messages.push(Message::human("Update my payment method."));
let result = swarm.invoke(state).await?.into_state();
Notes
- The swarm requires at least one agent. An empty list returns an error.
- All agent tools are registered in a single shared ToolRegistry, so tool names must be unique across agents.
- Each agent has its own model, so you can mix providers (e.g., a fast model for triage, a powerful model for support).
- Handoff tools are generated for all peers -- an agent cannot hand off to itself.
Handoff Tools
Handoff tools signal an intent to transfer a conversation from one agent to another.
create_handoff_tool
The create_handoff_tool function creates a Tool that, when called, returns a transfer message. The tool is named transfer_to_<agent_name> and routing logic uses this naming convention to detect handoffs.
use synaptic::graph::create_handoff_tool;
let handoff = create_handoff_tool("billing", "Transfer to the billing specialist");
// handoff.name() => "transfer_to_billing"
// handoff.description() => "Transfer to the billing specialist"
When invoked, the tool returns:
"Transferring to agent 'billing'."
Using Handoff Tools in Custom Agents
You can register handoff tools alongside regular tools when building an agent:
use std::sync::Arc;
use synaptic::graph::{create_agent, create_handoff_tool, AgentOptions};
let escalate = create_handoff_tool("human_review", "Escalate to a human reviewer");
let mut all_tools: Vec<Arc<dyn Tool>> = my_tools;
all_tools.push(escalate);
let agent = create_agent(model, all_tools, AgentOptions::default())?;
The model will see transfer_to_human_review as an available tool. When it decides to call it, your graph's conditional edges can detect the handoff and route accordingly.
Building Custom Topologies
For workflows that don't fit the supervisor or swarm patterns, combine handoff tools with a manual StateGraph:
use std::collections::HashMap;
use synaptic::graph::{
create_handoff_tool, StateGraph, FnNode, MessageState, END,
};
// Create handoff tools
let to_reviewer = create_handoff_tool("reviewer", "Send to reviewer");
let to_publisher = create_handoff_tool("publisher", "Send to publisher");
// Build nodes (agent_node, reviewer_node, publisher_node defined elsewhere)
let graph = StateGraph::new()
.add_node("drafter", drafter_node)
.add_node("reviewer", reviewer_node)
.add_node("publisher", publisher_node)
.set_entry_point("drafter")
.add_conditional_edges("drafter", |state: &MessageState| {
if let Some(last) = state.last_message() {
for tc in last.tool_calls() {
if tc.name == "transfer_to_reviewer" {
return "reviewer".to_string();
}
if tc.name == "transfer_to_publisher" {
return "publisher".to_string();
}
}
}
END.to_string()
})
.add_edge("reviewer", "drafter")
.add_edge("publisher", END)
.compile()?;
Naming Convention
The handoff tool is always named transfer_to_<agent_name>. Both create_supervisor and create_swarm rely on this convention internally when routing. If you build custom topologies, match against the same pattern in your conditional edges.
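A small sketch of matching that convention in a conditional edge, pulling the target agent name out with strip_prefix:
// Given the tool call names from the last AI message, return the handoff target, if any.
fn handoff_target(tool_names: &[&str]) -> Option<String> {
    tool_names
        .iter()
        .find_map(|name| name.strip_prefix("transfer_to_").map(str::to_string))
}
assert_eq!(handoff_target(&["search", "transfer_to_reviewer"]), Some("reviewer".to_string()));
assert_eq!(handoff_target(&["search"]), None);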
Notes
- Handoff tools take no arguments. The model calls them with an empty object {}.
- The tool itself only returns a string message -- the actual routing is handled by the graph's conditional edges, not by the tool execution.
- You can create multiple handoff tools per agent to build complex routing graphs (e.g., an agent can hand off to three different specialists).
MCP (Model Context Protocol)
The synaptic_mcp crate connects to external MCP-compatible tool servers, discovers their tools, and exposes them as standard synaptic::core::Tool implementations.
What is MCP?
The Model Context Protocol is an open standard for connecting AI models to external tool servers. An MCP server advertises a set of tools via a JSON-RPC interface. Synaptic's MCP client discovers those tools at connection time and wraps each one as a native Tool that can be used with any agent, graph, or tool executor.
Supported Transports
| Transport | Config Struct | Communication |
|---|---|---|
| Stdio | StdioConnection | Spawn a child process; JSON-RPC over stdin/stdout |
| SSE | SseConnection | HTTP POST with Server-Sent Events for streaming |
| HTTP | HttpConnection | Standard HTTP POST with JSON-RPC |
All transports use the same JSON-RPC tools/list method for discovery and tools/call method for invocation.
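The payloads are ordinary JSON-RPC 2.0 requests. A sketch of what the two request bodies look like, built with serde_json (the tool name and arguments below are examples, not part of any specific server):
use serde_json::json;
// Discovery: ask the server which tools it provides.
let list_request = json!({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
});
// Invocation: call a discovered tool by name with JSON arguments.
let call_request = json!({
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": { "name": "read_file", "arguments": { "path": "README.md" } }
});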
Quick Start
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, StdioConnection};
// Configure a single MCP server
let mut servers = HashMap::new();
servers.insert(
"my_server".to_string(),
McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@my/mcp-server".to_string()],
env: HashMap::new(),
}),
);
// Connect and discover tools
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// Use discovered tools with an agent
let agent = create_react_agent(model, tools)?;
Tool Name Prefixing
By default, discovered tool names are prefixed with the server name to avoid collisions (e.g., my_server_search). Disable this with:
let client = MultiServerMcpClient::new(servers).with_prefix(false);
Convenience Function
The load_mcp_tools function combines connect() and get_tools() in a single call:
use synaptic::mcp::load_mcp_tools;
let tools = load_mcp_tools(&client).await?;
Crate Imports
use synaptic::mcp::{
MultiServerMcpClient,
McpConnection,
StdioConnection,
SseConnection,
HttpConnection,
load_mcp_tools,
};
See the individual transport pages for detailed configuration examples.
Stdio Transport
The Stdio transport spawns a child process and communicates with it over stdin/stdout using JSON-RPC.
Configuration
use synaptic::mcp::StdioConnection;
use std::collections::HashMap;
let connection = StdioConnection {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@modelcontextprotocol/server-filesystem".to_string()],
env: HashMap::from([
("HOME".to_string(), "/home/user".to_string()),
]),
};
Fields
| Field | Type | Description |
|---|---|---|
command | String | The executable to spawn |
args | Vec<String> | Command-line arguments |
env | HashMap<String, String> | Additional environment variables (empty by default) |
How It Works
- Discovery (tools/list): Synaptic spawns the process, writes a JSON-RPC tools/list request to stdin, reads the response from stdout, then kills the process.
- Invocation (tools/call): For each tool call, Synaptic spawns a fresh process, writes a JSON-RPC tools/call request, reads the response, and kills the process.
Full Example
use std::collections::HashMap;
use std::sync::Arc;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, StdioConnection};
use synaptic::graph::create_react_agent;
// Configure an MCP server that provides filesystem tools
let mut servers = HashMap::new();
servers.insert(
"filesystem".to_string(),
McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec![
"-y".to_string(),
"@modelcontextprotocol/server-filesystem".to_string(),
"/allowed/path".to_string(),
],
env: HashMap::new(),
}),
);
// Connect and discover tools
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// tools might include: filesystem_read_file, filesystem_write_file, etc.
// Wire into an agent
let agent = create_react_agent(model, tools)?;
Testing Without a Server
For unit tests, you can test MCP client types without spawning a real server. The connection types are serializable and the client can be inspected before connecting:
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, StdioConnection};
// Create a client without connecting
let mut servers = HashMap::new();
servers.insert(
"test".to_string(),
McpConnection::Stdio(StdioConnection {
command: "echo".to_string(),
args: vec!["hello".to_string()],
env: HashMap::new(),
}),
);
let client = MultiServerMcpClient::new(servers);
// Before connect(), no tools are available
let tools = client.get_tools().await;
assert!(tools.is_empty());
// Connection types round-trip through serde
let json = serde_json::to_string(&McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec![],
env: HashMap::new(),
}))?;
let _: McpConnection = serde_json::from_str(&json)?;
For integration tests that need actual tool discovery, use a simple echo script as the MCP server command.
Notes
- Each tool call spawns a new process. This is simple but adds latency for each invocation.
- Ensure the command is available on PATH or provide an absolute path.
- The env map is merged with the current process environment -- it does not replace it.
- Stderr from the child process is discarded (Stdio::null()).
SSE Transport
The SSE (Server-Sent Events) transport connects to a remote MCP server over HTTP, using the SSE transport variant of the protocol.
Configuration
use synaptic::mcp::SseConnection;
use std::collections::HashMap;
let connection = SseConnection {
url: "http://localhost:3001/mcp".to_string(),
headers: HashMap::from([
("Authorization".to_string(), "Bearer my-token".to_string()),
]),
};
Fields
| Field | Type | Description |
|---|---|---|
url | String | The MCP server endpoint URL |
headers | HashMap<String, String> | Additional HTTP headers (e.g., auth tokens) |
How It Works
Both tool discovery (tools/list) and tool invocation (tools/call) use HTTP POST requests with JSON-RPC payloads against the configured URL. The Content-Type: application/json header is added automatically.
Full Example
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, SseConnection};
let mut servers = HashMap::new();
servers.insert(
"search".to_string(),
McpConnection::Sse(SseConnection {
url: "http://localhost:3001/mcp".to_string(),
headers: HashMap::from([
("Authorization".to_string(), "Bearer secret".to_string()),
]),
}),
);
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// tools might include: search_web_search, search_image_search, etc.
Notes
- SSE and HTTP transports share the same underlying HTTP POST mechanism for tool calls.
- The headers map is applied to every request (both discovery and invocation).
- The server must implement the MCP JSON-RPC interface at the given URL.
HTTP Transport
The HTTP transport connects to an MCP server using standard HTTP POST requests with JSON-RPC payloads.
Configuration
use synaptic::mcp::HttpConnection;
use std::collections::HashMap;
let connection = HttpConnection {
url: "https://mcp.example.com/rpc".to_string(),
headers: HashMap::from([
("X-Api-Key".to_string(), "my-api-key".to_string()),
]),
};
Fields
| Field | Type | Description |
|---|---|---|
url | String | The MCP server endpoint URL |
headers | HashMap<String, String> | Additional HTTP headers (e.g., API keys) |
How It Works
Both tool discovery (tools/list) and tool invocation (tools/call) send a JSON-RPC POST request to the configured URL. The Content-Type: application/json header is added automatically. Custom headers from the config are included in every request.
Full Example
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, HttpConnection};
let mut servers = HashMap::new();
servers.insert(
"calculator".to_string(),
McpConnection::Http(HttpConnection {
url: "https://mcp.example.com/rpc".to_string(),
headers: HashMap::from([
("X-Api-Key".to_string(), "my-api-key".to_string()),
]),
}),
);
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// tools might include: calculator_add, calculator_multiply, etc.
Notes
- HTTP and SSE transports use identical request/response handling for tool calls. The distinction is in how the MCP server manages the connection.
- Use HTTPS in production to protect API keys and tool call payloads.
- The headers map is applied to every request, making it suitable for static authentication tokens.
Multi-Server Client
MultiServerMcpClient connects to multiple MCP servers simultaneously and aggregates all discovered tools into a single collection.
Why Multiple Servers?
Real-world agents often need tools from several sources: a filesystem server for local files, a web search server for internet queries, and a database server for structured data. MultiServerMcpClient lets you configure all of them in one place and get back a unified Vec<Arc<dyn Tool>>.
Configuration
Pass a HashMap<String, McpConnection> where keys are server names and values are connection configs. You can mix transports freely:
use std::collections::HashMap;
use synaptic::mcp::{
MultiServerMcpClient, McpConnection,
StdioConnection, HttpConnection, SseConnection,
};
let mut servers = HashMap::new();
// Local filesystem server via stdio
servers.insert(
"fs".to_string(),
McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@mcp/server-filesystem".to_string()],
env: HashMap::new(),
}),
);
// Remote search server via HTTP
servers.insert(
"search".to_string(),
McpConnection::Http(HttpConnection {
url: "https://search.example.com/mcp".to_string(),
headers: HashMap::from([
("Authorization".to_string(), "Bearer token".to_string()),
]),
}),
);
// Analytics server via SSE
servers.insert(
"analytics".to_string(),
McpConnection::Sse(SseConnection {
url: "http://localhost:8080/mcp".to_string(),
headers: HashMap::new(),
}),
);
Connecting and Using Tools
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// Tools from all three servers are combined:
// fs_read_file, fs_write_file, search_web_search, analytics_query, ...
// Pass directly to an agent
let agent = create_react_agent(model, tools)?;
Tool Name Prefixing
By default, every tool name is prefixed with its server name to prevent collisions. For example, a tool named read_file from the "fs" server becomes fs_read_file.
To disable prefixing (when you know tool names are globally unique):
let client = MultiServerMcpClient::new(servers).with_prefix(false);
load_mcp_tools Shorthand
The load_mcp_tools convenience function combines connect() and get_tools():
use synaptic::mcp::load_mcp_tools;
let client = MultiServerMcpClient::new(servers);
let tools = load_mcp_tools(&client).await?;
Notes
- connect() iterates over all servers sequentially. If any server fails, the entire call returns an error.
- Tools are stored in an Arc<RwLock<Vec<...>>> internally, so get_tools() is safe to call from multiple tasks.
- The server name is used only for prefixing tool names -- it does not need to match any value on the server side.
Deep Agent
A Deep Agent is a high-level agent abstraction that combines a middleware stack, a backend for filesystem and state operations, and a factory for creating fully-configured agents in a single call. It is designed for tasks that require reading and writing files, spawning subagents, loading skills, and maintaining persistent memory -- the kinds of workflows typically associated with coding assistants and autonomous research agents.
Architecture
A Deep Agent is assembled from layers that wrap a core ReAct agent graph:
+-----------------------------------------------+
| Deep Agent |
| +------------------------------------------+ |
| | Middleware Stack | |
| | - DeepMemoryMiddleware (AGENTS.md) | |
| | - SkillsMiddleware (SKILL.md injection) | |
| | - FilesystemMiddleware (tool eviction) | |
| | - SubAgentMiddleware (task tool) | |
| | - DeepSummarizationMiddleware | |
| | - PatchToolCallsMiddleware | |
| +------------------------------------------+ |
| +------------------------------------------+ |
| | Filesystem Tools | |
| | ls, read_file, write_file, edit_file, | |
| | glob, grep (+execute if supported) | |
| +------------------------------------------+ |
| +------------------------------------------+ |
| | Backend (State / Store / Filesystem) | |
| +------------------------------------------+ |
| +------------------------------------------+ |
| | ReAct Agent Graph (agent + tools nodes) | |
| +------------------------------------------+ |
+-----------------------------------------------+
Core Capabilities
| Capability | Description |
|---|---|
| Filesystem tools | Read, write, edit, search, and list files through a pluggable backend. An execute tool is added when the backend supports it. |
| Subagents | Spawn child agents for isolated subtasks with recursion depth control (max_subagent_depth) |
| Skills | Load SKILL.md files from a configurable directory that inject domain-specific instructions into the system prompt |
| Memory | Persist learned context in AGENTS.md and reload it across sessions |
| Summarization | Auto-summarize conversation history when context length exceeds summarization_threshold of max_input_tokens |
| Backend abstraction | Swap between in-memory (StateBackend), persistent store (StoreBackend), and real filesystem (FilesystemBackend) backends |
Minimal Example
use synaptic::deep::{create_deep_agent, DeepAgentOptions, backend::FilesystemBackend};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
use synaptic::core::Message;
use std::sync::Arc;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(FilesystemBackend::new("/path/to/workspace"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
let result = agent.invoke(MessageState::with_messages(vec![
Message::human("List the Rust files in src/"),
])).await?;
println!("{}", result.into_state().last_message_content());
create_deep_agent returns a CompiledGraph<MessageState> -- the same graph type used by create_react_agent. You invoke it with a MessageState containing your input messages and receive a GraphResult<MessageState> back.
Guides
- Quickstart -- create and run your first Deep Agent
- Backends -- choose between State, Store, and Filesystem backends
- Filesystem Tools -- reference for the built-in tools
- Subagents -- delegate subtasks to child agents
- Skills -- extend agent behavior with SKILL.md files
- Memory -- persistent agent memory via AGENTS.md
- Customization -- full DeepAgentOptions reference
When to Use a Deep Agent
Use a Deep Agent when your task involves file manipulation, multi-step reasoning over project state, or spawning subtasks. If you only need a simple question-answering loop, a plain create_react_agent is sufficient. Deep Agent adds the infrastructure layers that turn a basic ReAct loop into an autonomous coding or research assistant.
Quickstart
This guide walks you through creating and running a Deep Agent in three steps.
Prerequisites
Add the required crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["deep", "openai"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Create a Backend
The backend determines how the agent interacts with the outside world. For this quickstart we use FilesystemBackend, which reads and writes real files on your machine:
use synaptic::deep::backend::FilesystemBackend;
use std::sync::Arc;
let backend = Arc::new(FilesystemBackend::new("/tmp/my-workspace"));
For testing without touching the filesystem, swap in StateBackend::new() instead:
use synaptic::deep::backend::StateBackend;
let backend = Arc::new(StateBackend::new());
Step 2: Create the Agent
Use create_deep_agent with a model and a DeepAgentOptions. The options struct has sensible defaults -- you only need to provide the backend:
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::openai::OpenAiChatModel;
use std::sync::Arc;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
create_deep_agent wires up the full middleware stack (memory, skills, filesystem, subagents, summarization, tool-call patching), registers the filesystem tools, and compiles the underlying ReAct graph. It returns a CompiledGraph<MessageState>.
Step 3: Run the Agent
Build a MessageState with your prompt and call invoke. The agent will reason, call tools, and return a final result:
use synaptic::graph::MessageState;
use synaptic::core::Message;
let state = MessageState::with_messages(vec![
Message::human("Create a file called hello.txt containing 'Hello, world!'"),
]);
let result = agent.invoke(state).await?;
println!("{}", result.into_state().last_message_content());
What Happens Under the Hood
When you call agent.invoke(state):
- Memory loading -- The DeepMemoryMiddleware checks for an AGENTS.md file via the backend and injects any saved context into the system prompt.
- Skills injection -- The SkillsMiddleware scans the .skills/ directory for SKILL.md files and adds matching skill instructions to the system prompt.
- Agent loop -- The underlying ReAct graph enters its reason-act-observe loop. The model sees the filesystem tools and decides which ones to call.
- Tool execution -- Each tool call (e.g. write_file) is dispatched through the backend. FilesystemBackend performs real I/O; StateBackend operates on an in-memory map.
- Summarization -- If the conversation grows beyond the configured token threshold (default: 85% of 128,000 tokens), the DeepSummarizationMiddleware compresses older messages into a summary before the next model call.
- Tool-call patching -- The PatchToolCallsMiddleware fixes malformed tool calls before they reach the executor.
- Final answer -- When the model responds without tool calls, the graph terminates and invoke returns the GraphResult<MessageState>.
Customizing Options
DeepAgentOptions fields can be set directly before passing to create_deep_agent:
let mut options = DeepAgentOptions::new(backend);
options.system_prompt = Some("You are a Rust expert.".to_string());
options.max_input_tokens = 64_000;
options.enable_subagents = false;
let agent = create_deep_agent(model, options)?;
Key defaults:
| Field | Default |
|---|---|
max_input_tokens | 128,000 |
summarization_threshold | 0.85 |
eviction_threshold | 20,000 |
max_subagent_depth | 3 |
skills_dir | ".skills" |
memory_file | "AGENTS.md" |
enable_subagents | true |
enable_filesystem | true |
enable_skills | true |
enable_memory | true |
Full Working Example
use std::sync::Arc;
use synaptic::core::Message;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, backend::FilesystemBackend};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(FilesystemBackend::new("/tmp/demo"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
let state = MessageState::with_messages(vec![
Message::human("What files are in the current directory?"),
]);
let result = agent.invoke(state).await?;
println!("{}", result.into_state().last_message_content());
Ok(())
}
Next Steps
- Backends -- learn about State, Store, and Filesystem backends
- Filesystem Tools -- see what each tool does
- Customization -- tune every option with DeepAgentOptions
Backends
A Deep Agent backend controls how filesystem tools interact with the outside world. Synaptic provides three built-in backends. You choose the one that matches your deployment context.
StateBackend
An entirely in-memory backend. Files are stored in a HashMap<String, String> keyed by normalized paths and never touch the real filesystem. Directories are inferred from path prefixes rather than stored as explicit entries. This is the default for tests and sandboxed demos.
use synaptic::deep::backend::StateBackend;
use std::sync::Arc;
let backend = Arc::new(StateBackend::new());
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
// After the agent runs, inspect the virtual filesystem:
let entries = backend.ls("/").await?;
let content = backend.read_file("/hello.txt", 0, 2000).await?;
StateBackend does not support shell command execution -- supports_execution() returns false and execute() returns an error.
When to use: Unit tests, CI pipelines, sandboxed playgrounds where no real I/O should occur.
StoreBackend
Persists files through Synaptic's Store trait. Each file is stored as an item with key=path and value={"content": "..."}. All items share a configurable namespace prefix. This lets you back the agent's workspace with any store implementation -- InMemoryStore for development, or a custom database-backed store for production.
use synaptic::deep::backend::StoreBackend;
use synaptic::store::InMemoryStore;
use std::sync::Arc;
let store = Arc::new(InMemoryStore::new());
let namespace = vec!["workspace".to_string(), "agent1".to_string()];
let backend = Arc::new(StoreBackend::new(store, namespace));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
The second argument is a Vec<String> namespace. All file keys are stored under this namespace, so multiple agents can share a single store without key collisions.
StoreBackend does not support shell command execution -- supports_execution() returns false and execute() returns an error.
When to use: Server deployments where you want persistence without granting direct filesystem access. Ideal for multi-tenant applications.
FilesystemBackend
Reads and writes real files on the host operating system. This is the backend you want for coding assistants and local automation.
use synaptic::deep::backend::FilesystemBackend;
use std::sync::Arc;
let backend = Arc::new(FilesystemBackend::new("/home/user/project"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
The path you provide becomes the agent's root directory. All tool paths are resolved relative to this root. The agent cannot escape the root directory -- paths containing .. are rejected.
FilesystemBackend is the only built-in backend that supports shell command execution. Commands run via sh -c in the root directory with an optional timeout. When this backend is used, create_filesystem_tools automatically includes the execute tool.
Feature gate:
FilesystemBackend requires the filesystem Cargo feature on synaptic-deep. The synaptic facade does not forward this feature, so add synaptic-deep as an explicit dependency:
synaptic = { version = "0.2", features = ["deep"] }
synaptic-deep = { version = "0.2", features = ["filesystem"] }
When to use: Local CLI tools, coding assistants, any scenario where the agent must interact with real files.
Implementing a Custom Backend
All three backends implement the Backend trait from synaptic::deep::backend:
use synaptic::deep::backend::{Backend, DirEntry, ExecResult, GrepOutputMode};
#[async_trait]
pub trait Backend: Send + Sync {
/// List entries in a directory.
async fn ls(&self, path: &str) -> Result<Vec<DirEntry>, SynapticError>;
/// Read file contents with line-based pagination.
async fn read_file(&self, path: &str, offset: usize, limit: usize)
-> Result<String, SynapticError>;
/// Create or overwrite a file.
async fn write_file(&self, path: &str, content: &str) -> Result<(), SynapticError>;
/// Find-and-replace text in a file.
async fn edit_file(&self, path: &str, old_text: &str, new_text: &str, replace_all: bool)
-> Result<(), SynapticError>;
/// Match file paths against a glob pattern within a base directory.
async fn glob(&self, pattern: &str, base: &str) -> Result<Vec<String>, SynapticError>;
/// Search file contents by regex pattern.
async fn grep(&self, pattern: &str, path: Option<&str>, file_glob: Option<&str>,
output_mode: GrepOutputMode) -> Result<String, SynapticError>;
/// Execute a shell command. Returns error by default.
async fn execute(&self, command: &str, timeout: Option<Duration>)
-> Result<ExecResult, SynapticError> { /* default: error */ }
/// Whether this backend supports shell command execution.
fn supports_execution(&self) -> bool { false }
}
Supporting types:
- DirEntry -- { name: String, is_dir: bool, size: Option<u64> }
- ExecResult -- { stdout: String, stderr: String, exit_code: i32 }
- GrepMatch -- { file: String, line_number: usize, line: String }
- GrepOutputMode -- FilesWithMatches | Content | Count
Implement this trait to back the agent with S3, a database, a remote server over SSH, or any other storage layer. Override execute and supports_execution if you want to enable the execute tool for your backend.
Offline Testing
Use StateBackend with ScriptedChatModel to test deep agents without API keys or real filesystem access:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::deep::backend::StateBackend;
// Script the model to write a file then finish
let model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"I'll create a file.",
vec![ToolCall {
id: "call_1".into(),
name: "write_file".into(),
arguments: r#"{"path": "/hello.txt", "content": "Hello from test!"}"#.into(),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("Done! I created hello.txt."),
usage: None,
},
]));
let backend = Arc::new(StateBackend::new());
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
// Run the agent...
// Then inspect the virtual filesystem:
let content = backend.read_file("/hello.txt", 0, 2000).await?;
assert!(content.contains("Hello from test!"));
This pattern is ideal for CI pipelines and unit tests. The StateBackend is fully deterministic and requires no cleanup.
Comparison
| Backend | Persistence | Real I/O | Execution | Feature gate | Best for |
|---|---|---|---|---|---|
StateBackend | None (in-memory) | No | No | None | Tests, sandboxing |
StoreBackend | Via Store trait | No | No | None | Servers, multi-tenant |
FilesystemBackend | Disk | Yes | Yes | filesystem | Local CLI, coding assistants |
Filesystem Tools
A Deep Agent ships with six built-in filesystem tools, plus a conditional seventh. These tools are automatically registered when you call create_deep_agent (if enable_filesystem is true, which is the default) and are dispatched through whichever backend you configure.
Creating the Tools
If you need the tool set outside of a DeepAgent (for example, in a custom graph), use the factory function:
use synaptic::deep::tools::create_filesystem_tools;
use synaptic::deep::backend::FilesystemBackend;
use std::sync::Arc;
let backend = Arc::new(FilesystemBackend::new("/workspace"));
let tools = create_filesystem_tools(backend);
// tools: Vec<Arc<dyn Tool>>
// 6 tools always: ls, read_file, write_file, edit_file, glob, grep
// + execute (only if backend.supports_execution() returns true)
The execute tool is only included when the backend reports that it supports execution. For FilesystemBackend this is always the case. For StateBackend and StoreBackend, execution is not supported and the tool is omitted.
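You can confirm this at runtime by checking the returned tool names (using Tool::name() as shown elsewhere in these docs):
let has_execute = tools.iter().any(|t| t.name() == "execute");
// true for FilesystemBackend, false for StateBackend and StoreBackend
println!("execute tool available: {has_execute}");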
Tool Reference
| Tool | Description | Always present |
|---|---|---|
ls | List directory contents | Yes |
read_file | Read file contents with optional line-based pagination | Yes |
write_file | Create or overwrite a file | Yes |
edit_file | Find and replace text in an existing file | Yes |
glob | Find files matching a glob pattern | Yes |
grep | Search file contents by regex pattern | Yes |
execute | Run a shell command and capture output | Only if backend supports execution |
ls
Lists files and directories at the given path.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | Directory to list |
Returns a JSON array of entries, each with name (string), is_dir (boolean), and size (integer or null) fields.
read_file
Reads the contents of a single file with line-based pagination.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | File path to read |
offset | integer | no | Starting line number, 0-based (default 0) |
limit | integer | no | Maximum number of lines to return (default 2000) |
Returns the file contents as a string. When offset and limit are provided, returns only the requested line range.
write_file
Creates a new file or overwrites an existing one.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | Destination file path |
content | string | yes | Full file contents to write |
Returns a confirmation string (e.g. "wrote path/to/file").
edit_file
Applies a targeted string replacement within an existing file.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | File to edit |
old_string | string | yes | Exact text to find |
new_string | string | yes | Replacement text |
replace_all | boolean | no | Replace all occurrences (default false) |
When replace_all is false (the default), only the first occurrence is replaced. The tool returns an error if old_string is not found in the file.
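For illustration, an edit_file call the model might emit (the path and strings here are hypothetical):
{ "name": "edit_file", "arguments": { "path": "src/parser.rs", "old_string": "fn parse(", "new_string": "fn parse_module(", "replace_all": true } }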
glob
Finds files matching a glob pattern.
| Parameter | Type | Required | Description |
|---|---|---|---|
pattern | string | yes | Glob pattern (e.g. "**/*.rs", "src/*.toml") |
path | string | no | Base directory to search from (default ".") |
Returns matching file paths as a newline-separated string.
grep
Searches file contents for lines matching a regular expression.
| Parameter | Type | Required | Description |
|---|---|---|---|
pattern | string | yes | Regex pattern to search for |
path | string | no | Directory or file to search in (defaults to workspace root) |
glob | string | no | Glob pattern to filter which files are searched (e.g. "*.rs") |
output_mode | string | no | Output format: "files_with_matches" (default), "content", or "count" |
Output modes control the format of results:
- `files_with_matches` -- Returns one matching file path per line.
- `content` -- Returns matching lines in `file:line_number:line` format.
- `count` -- Returns match counts in `file:count` format.
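As an illustration, a grep call that searches Rust source files and returns matching lines (the pattern and path are hypothetical):
{ "name": "grep", "arguments": { "pattern": "TODO", "path": "src", "glob": "*.rs", "output_mode": "content" } }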
execute
Runs a shell command in the backend's working directory. This tool is only registered when the backend supports execution (i.e. FilesystemBackend).
| Parameter | Type | Required | Description |
|---|---|---|---|
command | string | yes | The shell command to execute |
timeout | integer | no | Timeout in seconds |
Returns a JSON object with stdout, stderr, and exit_code fields. Commands are executed via sh -c in the backend's root directory.
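For example, a call and the shape of its result (the command and output values are illustrative):
{ "name": "execute", "arguments": { "command": "cargo test", "timeout": 120 } }
// result: { "stdout": "running 12 tests ...", "stderr": "", "exit_code": 0 }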
Subagents
A Deep Agent can spawn child agents -- called subagents -- to handle isolated subtasks. Subagents run in their own context, with their own conversation history, and return a result to the parent agent when they finish.
Task Tool
When subagents are enabled, create_deep_agent adds a built-in task tool. When the parent agent calls the task tool, a new child deep agent is created via create_deep_agent() with the same model and backend, runs the requested subtask, and returns its final answer as the tool result.
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend);
options.enable_subagents = true; // enabled by default
let agent = create_deep_agent(model, options)?;
// The agent can now call the "task" tool in its reasoning loop.
// Example tool call the model might emit:
// { "name": "task", "arguments": { "description": "Refactor the parse module" } }
The task tool accepts two parameters:
| Parameter | Required | Description |
|---|---|---|
description | yes | A detailed description of the task for the sub-agent |
agent_type | no | Name of a custom sub-agent type to spawn (defaults to "general-purpose") |
SubAgentDef
For more control, define named subagent types with SubAgentDef. Each definition specifies a name, description, system prompt, and an optional tool set. SubAgentDef is a plain struct -- create it with a struct literal:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, SubAgentDef};
let mut options = DeepAgentOptions::new(backend);
options.subagents = vec![
SubAgentDef {
name: "researcher".to_string(),
description: "Research specialist".to_string(),
system_prompt: "You are a research assistant. Find relevant files and summarize them.".to_string(),
tools: vec![], // inherits default deep agent tools
},
SubAgentDef {
name: "writer".to_string(),
description: "Code writer".to_string(),
system_prompt: "You are a code writer. Implement the requested changes.".to_string(),
tools: vec![],
},
];
let agent = create_deep_agent(model, options)?;
When the parent agent calls the task tool with "agent_type": "researcher", the TaskTool finds the matching SubAgentDef by name and uses its system_prompt and tools for the child agent. If no matching definition is found, a general-purpose child agent is spawned with default settings.
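For example, the parent might delegate to the researcher defined above (the task description is illustrative):
{ "name": "task", "arguments": { "description": "Find every module that parses config files and summarize them", "agent_type": "researcher" } }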
Recursion Depth Control
Subagents can themselves spawn further subagents. To prevent unbounded recursion, configure max_subagent_depth:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend);
options.max_subagent_depth = 3; // default is 3
let agent = create_deep_agent(model, options)?;
The SubAgentMiddleware tracks the current depth with an AtomicUsize counter. When the depth limit is reached, the task tool returns an error instead of spawning a new agent. The parent agent sees this error as a tool result and can adjust its strategy.
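The middleware internals are not reproduced here, but the guard amounts to a compare-against-limit check on an atomic counter. A minimal sketch of that pattern, with hypothetical names (DepthGuard is not part of the Synaptic API):
use std::sync::atomic::{AtomicUsize, Ordering};
// Hypothetical illustration of the depth-limit pattern described above.
struct DepthGuard {
    current: AtomicUsize,
    max: usize,
}
impl DepthGuard {
    // Try to descend one level; returns false once the limit is reached.
    fn try_enter(&self) -> bool {
        let depth = self.current.fetch_add(1, Ordering::SeqCst);
        if depth >= self.max {
            self.current.fetch_sub(1, Ordering::SeqCst); // roll back
            return false;
        }
        true
    }
    // Called when the subagent finishes.
    fn leave(&self) {
        self.current.fetch_sub(1, Ordering::SeqCst);
    }
}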
Context Isolation
Each subagent starts with a fresh conversation. The parent's message history is not forwarded. This keeps the subagent focused and avoids blowing the context window. The only information the subagent receives is:
- Its own system prompt (from `SubAgentDef` or the default deep agent prompt).
- The task description provided by the parent, sent as a `Message::human()`.
- The shared backend -- subagents read and write the same workspace.
The child agent is a full deep agent created via create_deep_agent(), so it has access to the same filesystem tools, skills, and middleware stack as the parent (subject to the depth limit for further subagent spawning).
When the subagent finishes, only the content of its last AI message is returned to the parent as a tool result string. Intermediate reasoning and tool calls are discarded.
Example: Delegating a Research Task
use std::sync::Arc;
use synaptic::core::Message;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::graph::MessageState;
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
let state = MessageState::with_messages(vec![
Message::human("Find all TODO comments in the codebase and write a summary to TODO_REPORT.md"),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
// Under the hood, the agent may call:
// task({ "description": "Search for TODO comments in all .rs files" })
// The subagent runs, returns results, and the parent writes the report.
Skills
Skills extend a Deep Agent's behavior by injecting domain-specific instructions into the system prompt. A skill is defined by a SKILL.md file with YAML frontmatter and a body of Markdown instructions. The SkillsMiddleware discovers skills from the backend filesystem and presents an index to the agent, which can then read the full skill file on demand via the read_file tool.
SKILL.md Format
Each skill file starts with YAML frontmatter between --- markers containing name and description fields:
---
name: search
description: Search the web for information
---
# Search Skill
Detailed instructions for how to perform web searches effectively...
The frontmatter fields:
| Field | Required | Description |
|---|---|---|
name | yes | Unique identifier for the skill |
description | no | One-line summary shown in the skill index (defaults to empty string if omitted) |
The parser extracts name and description by scanning lines between the --- markers for name: and description: prefixes. Values may optionally be quoted with single or double quotes.
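A rough sketch of that scanning approach (illustrative only, not the actual Synaptic parser):
// Illustrative frontmatter scan: pull `name:` and `description:` from the
// block between the first pair of `---` markers, stripping optional quotes.
fn parse_frontmatter(content: &str) -> (Option<String>, Option<String>) {
    let mut name = None;
    let mut description = None;
    let mut inside = false;
    for line in content.lines() {
        let line = line.trim();
        if line == "---" {
            if inside { break; } // closing marker: stop scanning
            inside = true;
            continue;
        }
        if !inside { continue; }
        let unquote = |v: &str| v.trim().trim_matches(|c| c == '"' || c == '\'').to_string();
        if let Some(value) = line.strip_prefix("name:") {
            name = Some(unquote(value));
        } else if let Some(value) = line.strip_prefix("description:") {
            description = Some(unquote(value));
        }
    }
    (name, description)
}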
Skills Directory Structure
Place skill files in a .skills/ directory at the workspace root:
my-project/
.skills/
search/SKILL.md
testing/SKILL.md
documentation/SKILL.md
src/
main.rs
Each skill lives in its own subdirectory. The SkillsMiddleware discovers them by listing directories under the configured skills_dir and reading {skills_dir}/{dir}/SKILL.md from each.
How Discovery Works
The SkillsMiddleware implements the AgentMiddleware trait. On each call to before_model(), it:
- Lists entries in the skills directory via the backend's `ls()` method.
- For each directory entry, reads the first 50 lines of `{dir}/SKILL.md`.
- Parses the YAML frontmatter to extract `name` and `description`.
- Builds an `<available_skills>` section and appends it to the system prompt.
The injected section looks like:
<available_skills>
- **search**: Search the web for information (read `.skills/search/SKILL.md` for details)
- **testing**: Guidelines for writing tests (read `.skills/testing/SKILL.md` for details)
</available_skills>
The agent sees this index and can read the full SKILL.md file via the read_file tool when it needs the detailed instructions.
Configuration
Skills are enabled by default. Configure via DeepAgentOptions:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend);
options.skills_dir = Some(".skills".to_string()); // default
options.enable_skills = true; // default
let agent = create_deep_agent(model, options)?;
To disable skills entirely, set enable_skills = false. To change the skills directory, set skills_dir to a different path within the backend.
Example: Adding a Rust Refactoring Skill
Create the file .skills/rust-refactoring/SKILL.md in your workspace:
---
name: rust-refactoring
description: Best practices for refactoring Rust code
---
When refactoring Rust code, follow these guidelines:
1. Run `cargo clippy` before and after changes.
2. Prefer extracting functions over inline complexity.
3. Use `#[must_use]` on public functions that return values.
4. Write a test for every extracted function.
Once this file is present in the backend, the SkillsMiddleware will automatically discover it and include it in the system prompt index. The agent can then read the full file for detailed instructions when it encounters a refactoring task.
There is no programmatic skill registration API. All skills are filesystem-based, discovered at runtime by scanning the backend.
More Examples
Code Review Skill
A code review skill injects a structured checklist so the agent applies consistent review standards:
---
name: code-review
description: Structured code review checklist with severity levels
---
When reviewing code, evaluate each change against this checklist:
## Severity Levels
- **Critical**: Security vulnerabilities, data loss risks, correctness bugs
- **Major**: Performance issues, missing error handling, API contract violations
- **Minor**: Style inconsistencies, missing docs, naming improvements
## Review Checklist
1. **Correctness** — Does the logic match the stated intent?
2. **Error handling** — Are all failure paths covered?
3. **Security** — Any injection, auth bypass, or data exposure risks?
4. **Performance** — Unnecessary allocations, O(n²) loops, missing indexes?
5. **Tests** — Are new paths tested? Are edge cases covered?
6. **Naming** — Do names convey purpose without needing comments?
## Output Format
For each finding, report:
- File and line range
- Severity level
- Description and suggested fix
This turns the agent into a disciplined reviewer that categorizes findings by severity rather than giving unstructured feedback.
TDD Workflow Skill
A TDD skill constrains the agent to follow a strict Red-Green-Refactor cycle:
---
name: tdd
description: Enforce test-driven development workflow
---
Follow the Red-Green-Refactor cycle strictly:
## Step 1: Red
- Write a failing test FIRST. Run it and confirm it fails.
- The test must describe the desired behavior, not the implementation.
## Step 2: Green
- Write the MINIMUM code to make the test pass.
- Do not add extra logic, optimizations, or edge case handling yet.
- Run the test and confirm it passes.
## Step 3: Refactor
- Clean up the implementation while keeping all tests green.
- Extract helpers, rename variables, remove duplication.
- Run the full test suite after each refactoring step.
## Rules
- Never write production code without a failing test.
- One behavior per test. If a test name contains "and", split it.
- Commit after each green-refactor cycle.
This prevents the agent from jumping ahead to write implementation code before tests exist.
API Design Conventions Skill
A conventions skill encodes team-wide API standards so every endpoint the agent creates follows the same patterns:
---
name: api-conventions
description: Team API design standards for REST endpoints
---
All REST endpoints must follow these conventions:
## URL Structure
- Use kebab-case for path segments: `/user-profiles`, not `/userProfiles`
- Nest resources: `/teams/{team_id}/members/{member_id}`
- Version prefix: `/api/v1/...`
## Request/Response
- Use `snake_case` for JSON field names
- Wrap collections: `{ "items": [...], "total": 42, "next_cursor": "..." }`
- Error format: `{ "error": { "code": "NOT_FOUND", "message": "..." } }`
## Status Codes
- 200 for success, 201 for creation, 204 for deletion
- 400 for validation errors, 404 for missing resources
- 409 for conflicts, 422 for semantic errors
## Naming
- List endpoint: `GET /resources`
- Create endpoint: `POST /resources`
- Get endpoint: `GET /resources/{id}`
- Update endpoint: `PATCH /resources/{id}`
- Delete endpoint: `DELETE /resources/{id}`
Any agent working on the API layer will automatically produce consistent endpoints without per-task reminders.
Multi-Skill Cooperation
When multiple skills exist in the workspace, the agent sees all of them in the index and reads the relevant ones based on the current task. Consider this layout:
my-project/
.skills/
code-review/SKILL.md
tdd/SKILL.md
api-conventions/SKILL.md
rust-refactoring/SKILL.md
src/
main.rs
The SkillsMiddleware injects the full index into the system prompt:
<available_skills>
- **code-review**: Structured code review checklist with severity levels (read `.skills/code-review/SKILL.md` for details)
- **tdd**: Enforce test-driven development workflow (read `.skills/tdd/SKILL.md` for details)
- **api-conventions**: Team API design standards for REST endpoints (read `.skills/api-conventions/SKILL.md` for details)
- **rust-refactoring**: Best practices for refactoring Rust code (read `.skills/rust-refactoring/SKILL.md` for details)
</available_skills>
The agent then selectively reads skills that match the task at hand:
- "Add a new
/usersendpoint with tests" — the agent readsapi-conventionsandtdd, then follows the TDD cycle while applying the URL and response format standards. - "Review this pull request" — the agent reads
code-reviewand produces findings with severity levels. - "Refactor the auth module" — the agent reads
rust-refactoringandcode-review(to self-check the result).
Skills are composable: each one contributes a focused set of instructions, and the agent combines them as needed. This is more maintainable than a single monolithic system prompt.
Best Practices
Keep skills focused and concise. Each skill should cover one topic. A 20–50 line SKILL.md is ideal. If a skill grows beyond 100 lines, consider splitting it.
Use action-oriented language. Write instructions as directives ("Run tests before committing", "Use kebab-case for URLs") rather than descriptions ("Tests should ideally be run").
Format with Markdown structure. Use headings, numbered lists, and bold text. The agent processes structured content more reliably than prose paragraphs.
Name directories in kebab-case. Use lowercase with hyphens: code-review/, api-conventions/, rust-refactoring/. Avoid spaces, underscores, or camelCase.
Skills vs. system prompt. Use skills for instructions that are reusable across tasks and discoverable by name. Use the system prompt directly for instructions that always apply to every interaction. If you find yourself copying the same instructions into multiple prompts, extract them into a skill.
Memory
A Deep Agent can persist learned context across sessions by reading and writing a memory file (default AGENTS.md) in the workspace. This gives the agent a form of long-term memory that survives restarts.
How It Works
The DeepMemoryMiddleware implements AgentMiddleware. On every model call, its before_model() hook reads the configured memory file from the backend. If the file exists and is not empty, its contents are wrapped in <agent_memory> tags and appended to the system prompt:
<agent_memory>
- The user prefers tabs over spaces.
- The project uses `thiserror 2.0` for error types.
- Always run `cargo fmt` after editing Rust files.
</agent_memory>
If the file does not exist or is empty, the middleware silently skips injection. The agent sees this context before processing each message, so it can apply learned preferences immediately.
Writing to Memory
The agent can update its memory at any time by writing to the memory file using the built-in filesystem tools (e.g., write_file or edit_file). A typical pattern is for the agent to append a new line when it learns something important:
Agent reasoning: "The user corrected me -- they want snake_case, not camelCase.
I should remember this for future sessions."
Tool call: edit_file({
"path": "AGENTS.md",
"old_string": "- Always run `cargo fmt` after editing Rust files.",
"new_string": "- Always run `cargo fmt` after editing Rust files.\n- Use snake_case for all function names."
})
Because the middleware re-reads the file on every model call, updates take effect on the very next turn.
Configuration
Memory is controlled by two fields on DeepAgentOptions:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("AGENTS.md".to_string()); // default
options.enable_memory = true; // default
let agent = create_deep_agent(model, options)?;
`memory_file` (`Option<String>`, default `Some("AGENTS.md")`) -- path to the memory file within the backend. You can point this at a different file if you prefer:
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("docs/MEMORY.md".to_string());
`enable_memory` (`bool`, default `true`) -- when `true`, the `DeepMemoryMiddleware` is added to the middleware stack.
Disabling Memory
To run without persistent memory, set enable_memory to false:
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_memory = false;
let agent = create_deep_agent(model, options)?;
The DeepMemoryMiddleware is not added to the stack at all, so there is no overhead.
DeepMemoryMiddleware Internals
The middleware struct is straightforward:
pub struct DeepMemoryMiddleware {
backend: Arc<dyn Backend>,
memory_file: String,
}
impl DeepMemoryMiddleware {
pub fn new(backend: Arc<dyn Backend>, memory_file: String) -> Self;
}
It implements AgentMiddleware with a single hook:
- `before_model()` -- reads the memory file from the backend. If the content is non-empty, wraps it in `<agent_memory>` tags and appends it to the system prompt. If the file is missing or empty, does nothing.
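A simplified sketch of that injection step (illustrative; the real hook works through the AgentMiddleware trait and the configured backend):
// Illustrative: how the memory contents end up in the system prompt.
fn append_memory(system_prompt: &mut String, memory_contents: &str) {
    if memory_contents.trim().is_empty() {
        return; // missing or empty memory file: inject nothing
    }
    system_prompt.push_str("\n<agent_memory>\n");
    system_prompt.push_str(memory_contents);
    system_prompt.push_str("\n</agent_memory>");
}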
Middleware Stack Position
DeepMemoryMiddleware runs first in the middleware stack (position 1 of 7), ensuring that memory context is available to all subsequent middleware and to the model itself. See the Customization page for the full assembly order.
Customization
Every aspect of a Deep Agent can be tuned through DeepAgentOptions. This page is a field-by-field reference with examples.
DeepAgentOptions Reference
DeepAgentOptions uses direct field assignment rather than a builder pattern. Create an instance with DeepAgentOptions::new(backend) to get sensible defaults, then override fields as needed:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend.clone());
options.system_prompt = Some("You are a senior Rust engineer.".into());
options.max_subagent_depth = 2;
let agent = create_deep_agent(model, options)?;
Full Field List
pub struct DeepAgentOptions {
pub backend: Arc<dyn Backend>, // required
pub system_prompt: Option<String>, // None
pub tools: Vec<Arc<dyn Tool>>, // empty
pub middleware: Vec<Arc<dyn AgentMiddleware>>, // empty
pub checkpointer: Option<Arc<dyn Checkpointer>>, // None
pub store: Option<Arc<dyn Store>>, // None
pub max_input_tokens: usize, // 128_000
pub summarization_threshold: f64, // 0.85
pub eviction_threshold: usize, // 20_000
pub max_subagent_depth: usize, // 3
pub skills_dir: Option<String>, // Some(".skills")
pub memory_file: Option<String>, // Some("AGENTS.md")
pub subagents: Vec<SubAgentDef>, // empty
pub enable_subagents: bool, // true
pub enable_filesystem: bool, // true
pub enable_skills: bool, // true
pub enable_memory: bool, // true
}
Field Details
backend
The backend provides filesystem operations for the agent. This is the only required argument to DeepAgentOptions::new(). All other fields have defaults.
use synaptic::deep::backend::FilesystemBackend;
let backend = Arc::new(FilesystemBackend::new("/home/user/project"));
let options = DeepAgentOptions::new(backend);
system_prompt
Override the default system prompt entirely. When None, the agent uses a built-in prompt that describes the filesystem tools and expected behavior.
let mut options = DeepAgentOptions::new(backend.clone());
options.system_prompt = Some("You are a Rust expert. Use the provided tools to help.".into());
tools
Additional tools beyond the built-in filesystem tools. These are added to the agent's tool registry and made available to the model.
let mut options = DeepAgentOptions::new(backend.clone());
options.tools = vec![
Arc::new(MyCustomTool),
Arc::new(DatabaseQueryTool::new(db_pool)),
];
middleware
Custom middleware layers that run after the entire built-in stack. See Middleware Stack for ordering details.
let mut options = DeepAgentOptions::new(backend.clone());
options.middleware = vec![
Arc::new(AuditLogMiddleware::new(log_file)),
];
checkpointer
Optional checkpointer for graph state persistence. When provided, the agent can resume from checkpoints.
use synaptic::graph::MemorySaver;
let mut options = DeepAgentOptions::new(backend.clone());
options.checkpointer = Some(Arc::new(MemorySaver::new()));
store
Optional store for runtime tool injection via ToolRuntime.
use synaptic::store::InMemoryStore;
let mut options = DeepAgentOptions::new(backend.clone());
options.store = Some(Arc::new(InMemoryStore::new()));
max_input_tokens
Maximum input tokens before summarization is considered (default 128_000). The DeepSummarizationMiddleware uses this together with summarization_threshold to decide when to compress context.
let mut options = DeepAgentOptions::new(backend.clone());
options.max_input_tokens = 200_000; // for models with larger context windows
summarization_threshold
Fraction of max_input_tokens at which summarization triggers (default 0.85). When context exceeds max_input_tokens * summarization_threshold tokens, the middleware summarizes older messages.
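For example, with the default max_input_tokens of 128_000 and the default threshold of 0.85, summarization triggers once the estimated context exceeds 108_800 tokens; lowering the threshold to 0.70 (as below) moves that point to 89_600 tokens.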
let mut options = DeepAgentOptions::new(backend.clone());
options.summarization_threshold = 0.70; // summarize earlier
eviction_threshold
Token count above which tool results are evicted to files by the FilesystemMiddleware (default 20_000). Large tool outputs are written to a file and replaced with a reference.
let mut options = DeepAgentOptions::new(backend.clone());
options.eviction_threshold = 10_000; // evict smaller results
max_subagent_depth
Maximum recursion depth for nested subagent spawning (default 3). Prevents runaway agent chains.
let mut options = DeepAgentOptions::new(backend.clone());
options.max_subagent_depth = 2;
skills_dir
Directory path within the backend to scan for skill files (default Some(".skills")). Set to None to disable skill scanning even when enable_skills is true.
let mut options = DeepAgentOptions::new(backend.clone());
options.skills_dir = Some("my-skills".into());
memory_file
Path to the persistent memory file within the backend (default Some("AGENTS.md")). See the Memory page for details.
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("docs/MEMORY.md".into());
subagents
Custom subagent definitions for the task tool. Each SubAgentDef describes a specialized subagent that can be spawned.
use synaptic::deep::SubAgentDef;
let mut options = DeepAgentOptions::new(backend.clone());
options.subagents = vec![
SubAgentDef {
name: "researcher".into(),
description: "Searches the web for information".into(),
// ...
},
];
enable_subagents
Toggle the task tool for child agent spawning (default true). When false, the SubAgentMiddleware and its task tool are not added.
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_subagents = false;
enable_filesystem
Toggle the built-in filesystem tools and FilesystemMiddleware (default true). When false, no filesystem tools are registered.
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_filesystem = false;
enable_skills
Toggle the SkillsMiddleware for progressive skill disclosure (default true).
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_skills = false;
enable_memory
Toggle the DeepMemoryMiddleware for persistent memory (default true). See the Memory page for details.
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_memory = false;
Middleware Stack
create_deep_agent assembles the middleware stack in a fixed order. Each layer can be individually enabled or disabled:
| Order | Middleware | Controlled by |
|---|---|---|
| 1 | DeepMemoryMiddleware | enable_memory |
| 2 | SkillsMiddleware | enable_skills |
| 3 | FilesystemMiddleware + filesystem tools | enable_filesystem |
| 4 | SubAgentMiddleware's task tool | enable_subagents |
| 5 | DeepSummarizationMiddleware | always added |
| 6 | PatchToolCallsMiddleware | always added |
| 7 | User-provided middleware | middleware field |
The DeepSummarizationMiddleware and PatchToolCallsMiddleware are always present regardless of configuration.
Return Type
create_deep_agent returns Result<CompiledGraph<MessageState>, SynapticError>. The resulting graph is used like any other Synaptic graph:
use synaptic::core::Message;
use synaptic::graph::MessageState;
let agent = create_deep_agent(model, options)?;
let result = agent.invoke(MessageState::with_messages(vec![
Message::human("Refactor the error handling in src/lib.rs"),
])).await?;
Full Example
use std::sync::Arc;
use synaptic::core::Message;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, backend::FilesystemBackend};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(FilesystemBackend::new("/home/user/project"));
let mut options = DeepAgentOptions::new(backend);
options.system_prompt = Some("You are a senior Rust engineer.".into());
options.summarization_threshold = 0.70;
options.enable_subagents = true;
options.max_subagent_depth = 2;
let agent = create_deep_agent(model, options)?;
let result = agent.invoke(MessageState::with_messages(vec![
Message::human("Refactor the error handling in src/lib.rs"),
])).await?;
Callbacks
Synaptic provides an event-driven callback system for observing agent execution. The CallbackHandler trait receives RunEvent values at key lifecycle points -- when a run starts, when the LLM is called, when tools are executed, and when the run finishes or fails.
The CallbackHandler Trait
The trait is defined in synaptic_core:
#[async_trait]
pub trait CallbackHandler: Send + Sync {
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError>;
}
A single method receives all event types. Handlers are Send + Sync so they can be shared across async tasks.
RunEvent Variants
The RunEvent enum covers the full agent lifecycle:
| Variant | Fields | When It Fires |
|---|---|---|
RunStarted | run_id, session_id | At the beginning of an agent run |
RunStep | run_id, step | At each iteration of the agent loop |
LlmCalled | run_id, message_count | When the LLM is invoked with messages |
ToolCalled | run_id, tool_name | When a tool is executed |
RunFinished | run_id, output | When the agent produces a final answer |
RunFailed | run_id, error | When the agent run fails with an error |
RunEvent implements Clone, so handlers can store copies of events for later inspection.
Built-in Handlers
Synaptic ships with four callback handlers:
| Handler | Purpose |
|---|---|
| RecordingCallback | Records all events in memory for later inspection |
| TracingCallback | Emits structured tracing spans and events |
| StdOutCallbackHandler | Prints events to stdout (with optional verbose mode) |
| CompositeCallback | Dispatches events to multiple handlers |
Implementing a Custom Handler
You can implement CallbackHandler to add your own observability:
use async_trait::async_trait;
use synaptic::core::{CallbackHandler, RunEvent, SynapticError};
struct MetricsCallback;
#[async_trait]
impl CallbackHandler for MetricsCallback {
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError> {
match event {
RunEvent::LlmCalled { message_count, .. } => {
// Record to your metrics system
println!("LLM called with {message_count} messages");
}
RunEvent::ToolCalled { tool_name, .. } => {
println!("Tool executed: {tool_name}");
}
_ => {}
}
Ok(())
}
}
Guides
- Recording Callback -- capture events in memory for testing and inspection
- Tracing Callback -- integrate with the Rust `tracing` ecosystem
- Composite Callback -- dispatch events to multiple handlers simultaneously
Recording Callback
RecordingCallback captures every RunEvent in an in-memory list. This is useful for testing agent behavior, debugging execution flow, and building audit logs.
Usage
use synaptic::callbacks::RecordingCallback;
use synaptic::core::RunEvent;
let callback = RecordingCallback::new();
// ... pass the callback to an agent or use it manually ...
// After the run, inspect all recorded events
let events = callback.events().await;
for event in &events {
match event {
RunEvent::RunStarted { run_id, session_id } => {
println!("Run started: run_id={run_id}, session={session_id}");
}
RunEvent::RunStep { run_id, step } => {
println!("Step {step} in run {run_id}");
}
RunEvent::LlmCalled { run_id, message_count } => {
println!("LLM called with {message_count} messages (run {run_id})");
}
RunEvent::ToolCalled { run_id, tool_name } => {
println!("Tool '{tool_name}' called (run {run_id})");
}
RunEvent::RunFinished { run_id, output } => {
println!("Run {run_id} finished: {output}");
}
RunEvent::RunFailed { run_id, error } => {
println!("Run {run_id} failed: {error}");
}
}
}
How It Works
RecordingCallback stores events in an Arc<RwLock<Vec<RunEvent>>>. Each call to on_event() appends the event to the list. The events() method returns a clone of the full event list.
Because it uses Arc, the callback can be cloned and shared across tasks. All clones refer to the same event storage.
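A minimal sketch of that sharing behavior:
use synaptic::callbacks::RecordingCallback;
let callback = RecordingCallback::new();
let for_agent = callback.clone();
// Pass `for_agent` to the agent (or a spawned task); keep `callback` for inspection.
// Both handles point at the same Arc-backed storage, so events recorded through
// either clone are visible from the other:
let events = callback.events().await;
println!("recorded {} events so far", events.len());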
Testing Example
RecordingCallback is particularly useful in tests to verify that an agent followed the expected execution path:
#[tokio::test]
async fn test_agent_calls_tool() {
let callback = RecordingCallback::new();
// ... run the agent with this callback ...
let events = callback.events().await;
// Verify the agent called the expected tool
let tool_events: Vec<_> = events.iter()
.filter_map(|e| match e {
RunEvent::ToolCalled { tool_name, .. } => Some(tool_name.clone()),
_ => None,
})
.collect();
assert!(tool_events.contains(&"calculator".to_string()));
}
Thread Safety
RecordingCallback is Clone, Send, and Sync. You can safely share it across async tasks and inspect events from any task that holds a reference.
Tracing Callback
TracingCallback integrates Synaptic's callback system with the Rust tracing ecosystem. Instead of storing events in memory, it emits structured tracing spans and events that flow into whatever subscriber you have configured -- terminal output, JSON logs, OpenTelemetry, etc.
Setup
First, initialize a tracing subscriber. The simplest option is the fmt subscriber from tracing-subscriber:
use tracing_subscriber;
// Initialize the default subscriber (prints to stdout)
tracing_subscriber::fmt::init();
Then create the callback:
use synaptic::callbacks::TracingCallback;
let callback = TracingCallback::new();
Pass this callback to your agent or use it with CompositeCallback.
What Gets Logged
TracingCallback maps each RunEvent variant to a tracing call:
| RunEvent | Tracing Level | Key Fields |
|---|---|---|
RunStarted | info! | run_id, session_id |
RunStep | info! | run_id, step |
LlmCalled | info! | run_id, message_count |
ToolCalled | info! | run_id, tool_name |
RunFinished | info! | run_id, output_len |
RunFailed | error! | run_id, error |
All events except RunFailed are logged at the INFO level. Failures are logged at ERROR.
Example Output
With the default fmt subscriber, you might see:
2026-02-17T10:30:00.123Z INFO synaptic: run started run_id="abc-123" session_id="user-1"
2026-02-17T10:30:00.456Z INFO synaptic: LLM called run_id="abc-123" message_count=3
2026-02-17T10:30:01.234Z INFO synaptic: tool called run_id="abc-123" tool_name="calculator"
2026-02-17T10:30:01.567Z INFO synaptic: run finished run_id="abc-123" output_len=42
Integration with the Tracing Ecosystem
Because TracingCallback uses the standard tracing macros, it works with any compatible subscriber:
- `tracing-subscriber` -- terminal formatting, filtering, layering.
- `tracing-opentelemetry` -- export spans to Jaeger, Zipkin, or any OTLP collector.
- `tracing-appender` -- write logs to rolling files.
- JSON output -- use `tracing_subscriber::fmt().json()` for structured log ingestion.
// Example: JSON-formatted logs
tracing_subscriber::fmt()
.json()
.init();
let callback = TracingCallback::new();
When to Use
Use TracingCallback when:
- You want production-grade structured logging with minimal setup.
- You are already using the `tracing` ecosystem in your application.
- You need to export agent telemetry to an observability platform (Datadog, Grafana, etc.).
For test-time event inspection, consider RecordingCallback instead, which stores events for programmatic access.
Composite Callback
CompositeCallback dispatches each RunEvent to multiple callback handlers. This lets you combine different observability strategies without choosing just one -- for example, recording events in memory for tests while also logging them via tracing.
Usage
use synaptic::callbacks::{CompositeCallback, RecordingCallback, TracingCallback};
use std::sync::Arc;
let recording = Arc::new(RecordingCallback::new());
let tracing_cb = Arc::new(TracingCallback::new());
let composite = CompositeCallback::new(vec![
recording.clone(),
tracing_cb,
]);
When composite.on_event(event) is called, the event is forwarded to each handler in order. If any handler returns an error, the composite stops and propagates that error.
How It Works
CompositeCallback holds a Vec<Arc<dyn CallbackHandler>>. On each event:
- The event is cloned for each handler (since `RunEvent` implements `Clone`).
- Each handler's `on_event()` is awaited sequentially.
- If all handlers succeed, `Ok(())` is returned.
// Pseudocode of the dispatch logic
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError> {
for handler in &self.handlers {
handler.on_event(event.clone()).await?;
}
Ok(())
}
Example: Recording + Tracing + Custom
You can mix built-in and custom handlers:
use async_trait::async_trait;
use synaptic::core::{CallbackHandler, RunEvent, SynapticError};
use synaptic::callbacks::{
CompositeCallback, RecordingCallback, TracingCallback, StdOutCallbackHandler,
};
use std::sync::Arc;
struct ToolCounter {
count: Arc<tokio::sync::RwLock<usize>>,
}
#[async_trait]
impl CallbackHandler for ToolCounter {
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError> {
if matches!(event, RunEvent::ToolCalled { .. }) {
*self.count.write().await += 1;
}
Ok(())
}
}
let counter = Arc::new(ToolCounter {
count: Arc::new(tokio::sync::RwLock::new(0)),
});
let composite = CompositeCallback::new(vec![
Arc::new(RecordingCallback::new()),
Arc::new(TracingCallback::new()),
Arc::new(StdOutCallbackHandler::new()),
counter.clone(),
]);
When to Use
Use CompositeCallback whenever you need more than one callback handler active at the same time. Common combinations:
- Development: `StdOutCallbackHandler` + `RecordingCallback` -- see events in the terminal and inspect them programmatically.
- Testing: `RecordingCallback` alone is usually sufficient.
- Production: `TracingCallback` + custom metrics handler -- structured logs plus application-specific telemetry.
Evaluation
Synaptic provides an evaluation framework for measuring the quality of AI outputs. The Evaluator trait defines a standard interface for scoring predictions against references, and the Dataset + evaluate() pipeline makes it easy to run batch evaluations across many test cases.
The Evaluator Trait
All evaluators implement the Evaluator trait from synaptic_eval:
#[async_trait]
pub trait Evaluator: Send + Sync {
async fn evaluate(
&self,
prediction: &str,
reference: &str,
input: &str,
) -> Result<EvalResult, SynapticError>;
}
- `prediction` -- the AI's output to evaluate.
- `reference` -- the expected or ground-truth answer.
- `input` -- the original input that produced the prediction.
EvalResult
Every evaluator returns an EvalResult:
pub struct EvalResult {
pub score: f64, // Between 0.0 and 1.0
pub passed: bool, // true if score >= 0.5
pub reasoning: Option<String>, // Optional explanation
}
Helper constructors:
| Method | Score | Passed |
|---|---|---|
EvalResult::pass() | 1.0 | true |
EvalResult::fail() | 0.0 | false |
EvalResult::with_score(0.75) | 0.75 | true (>= 0.5) |
You can attach reasoning with .with_reasoning("explanation").
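For example (assuming EvalResult is exported from synaptic::eval alongside the evaluators):
use synaptic::eval::EvalResult;
let result = EvalResult::with_score(0.75).with_reasoning("close, but one detail is missing");
assert!(result.passed); // 0.75 >= 0.5
assert_eq!(result.score, 0.75);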
Built-in Evaluators
Synaptic provides five evaluators out of the box:
| Evaluator | What It Checks |
|---|---|
ExactMatchEvaluator | Exact string equality (with optional case-insensitive mode) |
JsonValidityEvaluator | Whether the prediction is valid JSON |
RegexMatchEvaluator | Whether the prediction matches a regex pattern |
EmbeddingDistanceEvaluator | Cosine similarity between prediction and reference embeddings |
LLMJudgeEvaluator | Uses an LLM to score prediction quality on a 0-10 scale |
See Evaluators for detailed usage of each.
Batch Evaluation
The evaluate() function runs an evaluator across a Dataset of test cases, producing an EvalReport with aggregate statistics. See Datasets for details.
Guides
- Evaluators -- usage and configuration for each built-in evaluator
- Datasets -- batch evaluation with
Datasetandevaluate()
Evaluators
Synaptic provides five built-in evaluators, ranging from simple string matching to LLM-based judgment. All implement the Evaluator trait and return an EvalResult with a score, pass/fail status, and optional reasoning.
ExactMatchEvaluator
Checks whether the prediction exactly matches the reference string:
use synaptic::eval::{ExactMatchEvaluator, Evaluator};
// Case-sensitive (default)
let eval = ExactMatchEvaluator::new();
let result = eval.evaluate("hello", "hello", "").await?;
assert!(result.passed);
assert_eq!(result.score, 1.0);
let result = eval.evaluate("Hello", "hello", "").await?;
assert!(!result.passed); // Case mismatch
// Case-insensitive
let eval = ExactMatchEvaluator::case_insensitive();
let result = eval.evaluate("Hello", "hello", "").await?;
assert!(result.passed); // Now passes
On failure, the reasoning field shows what was expected versus what was received.
JsonValidityEvaluator
Checks whether the prediction is valid JSON. The reference and input are ignored:
use synaptic::eval::{JsonValidityEvaluator, Evaluator};
let eval = JsonValidityEvaluator::new();
let result = eval.evaluate(r#"{"key": "value"}"#, "", "").await?;
assert!(result.passed);
let result = eval.evaluate("not json", "", "").await?;
assert!(!result.passed);
// reasoning: "Invalid JSON: expected ident at line 1 column 2"
This is useful for validating that an LLM produced well-formed JSON output.
RegexMatchEvaluator
Checks whether the prediction matches a regular expression pattern:
use synaptic::eval::{RegexMatchEvaluator, Evaluator};
// Match a date pattern
let eval = RegexMatchEvaluator::new(r"\d{4}-\d{2}-\d{2}")?;
let result = eval.evaluate("2024-01-15", "", "").await?;
assert!(result.passed);
let result = eval.evaluate("January 15, 2024", "", "").await?;
assert!(!result.passed);
The constructor returns a Result because the regex pattern is validated at creation time. Invalid patterns produce a SynapticError::Validation.
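For example, an invalid pattern is rejected when the evaluator is constructed rather than at evaluation time:
use synaptic::eval::RegexMatchEvaluator;
// An unclosed character class is not a valid regex, so construction fails.
assert!(RegexMatchEvaluator::new(r"[unclosed").is_err());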
EmbeddingDistanceEvaluator
Computes cosine similarity between the embeddings of the prediction and reference. The score equals the cosine similarity, and the evaluation passes if the similarity meets or exceeds the threshold:
use synaptic::eval::{EmbeddingDistanceEvaluator, Evaluator};
use synaptic::embeddings::FakeEmbeddings;
use std::sync::Arc;
let embeddings = Arc::new(FakeEmbeddings::new());
let eval = EmbeddingDistanceEvaluator::new(embeddings, 0.8);
let result = eval.evaluate("the cat sat", "the cat sat on the mat", "").await?;
println!("Similarity: {:.4}", result.score);
println!("Passed (>= 0.8): {}", result.passed);
// reasoning: "Cosine similarity: 0.9234, threshold: 0.8000"
Parameters:
- `embeddings` -- any embeddings implementation wrapped in `Arc<dyn Embeddings>` (e.g., `OpenAiEmbeddings` from `synaptic::openai`, `OllamaEmbeddings` from `synaptic::ollama`, `FakeEmbeddings` from `synaptic::embeddings`).
- `threshold` -- minimum cosine similarity to pass. A typical value is `0.8` for semantic similarity checks.
LLMJudgeEvaluator
Uses an LLM to judge the quality of a prediction on a 0-10 scale. The score is normalized to 0.0-1.0:
use synaptic::eval::{LLMJudgeEvaluator, Evaluator};
use synaptic::openai::OpenAiChatModel;
use std::sync::Arc;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let eval = LLMJudgeEvaluator::new(model);
let result = eval.evaluate(
"Paris is the capital of France.", // prediction
"The capital of France is Paris.", // reference
"What is the capital of France?", // input
).await?;
println!("Score: {:.1}/10", result.score * 10.0);
// reasoning: "LLM judge score: 9.0/10"
Custom Prompt Template
You can customize the judge prompt. The template must contain {input}, {prediction}, and {reference} placeholders:
let eval = LLMJudgeEvaluator::with_prompt(
model,
r#"Evaluate whether the response is factually accurate.
Question: {input}
Expected: {reference}
Response: {prediction}
Rate accuracy from 0 (wrong) to 10 (perfect). Reply with a single number."#,
);
The default prompt asks the LLM to rate overall quality. The response is parsed for a number between 0 and 10; if no valid number is found, the evaluator returns a SynapticError::Parsing.
Summary
| Evaluator | Speed | Requires |
|---|---|---|
ExactMatchEvaluator | Instant | Nothing |
JsonValidityEvaluator | Instant | Nothing |
RegexMatchEvaluator | Instant | Nothing |
EmbeddingDistanceEvaluator | Fast | Embeddings model |
LLMJudgeEvaluator | Slow (LLM call) | Chat model |
Datasets
The Dataset type and evaluate() function provide a batch evaluation pipeline. You define a dataset of input-reference pairs, generate predictions, and score them all at once to produce an EvalReport.
Creating a Dataset
A Dataset is a collection of DatasetItem values, each with an input and a reference (expected answer):
use synaptic::eval::{Dataset, DatasetItem};
// From DatasetItem structs
let dataset = Dataset::new(vec![
DatasetItem {
input: "What is 2+2?".to_string(),
reference: "4".to_string(),
},
DatasetItem {
input: "Capital of France?".to_string(),
reference: "Paris".to_string(),
},
]);
// From string pairs (convenience method)
let dataset = Dataset::from_pairs(vec![
("What is 2+2?", "4"),
("Capital of France?", "Paris"),
]);
Running Batch Evaluation
The evaluate() function takes an evaluator, a dataset, and a slice of predictions. It evaluates each prediction against the corresponding dataset item and returns an EvalReport:
use synaptic::eval::{evaluate, Dataset, ExactMatchEvaluator};
let dataset = Dataset::from_pairs(vec![
("What is 2+2?", "4"),
("Capital of France?", "Paris"),
("Largest ocean?", "Pacific"),
]);
let evaluator = ExactMatchEvaluator::new();
// Your model's predictions (one per dataset item)
let predictions = vec![
"4".to_string(),
"Paris".to_string(),
"Atlantic".to_string(), // Wrong!
];
let report = evaluate(&evaluator, &dataset, &predictions).await?;
println!("Total: {}", report.total); // 3
println!("Passed: {}", report.passed); // 2
println!("Accuracy: {:.0}%", report.accuracy * 100.0); // 67%
The number of predictions must match the number of dataset items. If they differ, evaluate() returns a SynapticError::Validation.
EvalReport
The report contains aggregate statistics and per-item results:
pub struct EvalReport {
pub total: usize,
pub passed: usize,
pub accuracy: f32,
pub results: Vec<EvalResult>,
}
You can inspect individual results for detailed feedback:
for (i, result) in report.results.iter().enumerate() {
let status = if result.passed { "PASS" } else { "FAIL" };
let reason = result.reasoning.as_deref().unwrap_or("--");
println!("[{status}] Item {i}: score={:.2}, reason={reason}", result.score);
}
End-to-End Example
A typical evaluation workflow:
- Build a dataset of test cases.
- Run your model/chain on each input to produce predictions.
- Score predictions with an evaluator.
- Inspect the report.
use synaptic::eval::{evaluate, Dataset, ExactMatchEvaluator};
// 1. Dataset
let dataset = Dataset::from_pairs(vec![
("2+2", "4"),
("3*5", "15"),
("10/2", "5"),
]);
// 2. Generate predictions (in practice, run your model)
let predictions: Vec<String> = dataset.items.iter()
.map(|item| {
// Simulated model output
match item.input.as_str() {
"2+2" => "4",
"3*5" => "15",
"10/2" => "5",
_ => "unknown",
}.to_string()
})
.collect();
// 3. Evaluate
let evaluator = ExactMatchEvaluator::new();
let report = evaluate(&evaluator, &dataset, &predictions).await?;
// 4. Report
println!("Accuracy: {:.0}% ({}/{})",
report.accuracy * 100.0, report.passed, report.total);
Using Different Evaluators
The evaluate() function works with any Evaluator. Swap in a different evaluator to change the scoring criteria without modifying the dataset or prediction pipeline:
use synaptic::eval::{evaluate, RegexMatchEvaluator};
// Check that predictions contain a date
let evaluator = RegexMatchEvaluator::new(r"\d{4}-\d{2}-\d{2}")?;
let report = evaluate(&evaluator, &dataset, &predictions).await?;
Integrations
Synaptic provides optional integration crates that connect to external services. Each integration is gated behind a Cargo feature flag and adds no overhead when not enabled.
Available Integrations
| Integration | Feature | Purpose |
|---|---|---|
| OpenAI-Compatible Providers | openai | Groq, DeepSeek, Fireworks, Together, xAI, MistralAI, HuggingFace, Cohere, OpenRouter |
| Azure OpenAI | openai | Azure-hosted OpenAI models (chat + embeddings) |
| Anthropic | anthropic | Anthropic Claude models (chat + streaming + tool calling) |
| Google Gemini | gemini | Google Gemini models via Generative Language API |
| Ollama | ollama | Local LLM inference with Ollama (chat + embeddings) |
| AWS Bedrock | bedrock | AWS Bedrock foundation models (Claude, Llama, Mistral, etc.) |
| Cohere Reranker | cohere | Document reranking for improved retrieval quality |
| Qdrant | qdrant | Vector store backed by the Qdrant vector database |
| PgVector | pgvector | Vector store backed by PostgreSQL with the pgvector extension |
| Pinecone | pinecone | Managed vector store backed by Pinecone |
| Chroma | chroma | Open-source vector store backed by Chroma |
| MongoDB Atlas | mongodb | Vector search backed by MongoDB Atlas |
| Elasticsearch | elasticsearch | Vector store backed by Elasticsearch kNN |
| Redis | redis | Key-value store and LLM response cache backed by Redis |
| SQLite Cache | sqlite | Persistent LLM response cache backed by SQLite |
| PDF Loader | pdf | Document loader for PDF files |
| Tavily Search | tavily | Web search tool for agents |
Enabling integrations
Add the desired feature flags to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant", "redis"] }
You can combine any number of feature flags. Each integration pulls in only the dependencies it needs.
Trait compatibility
Every integration implements a core Synaptic trait, so it plugs directly into the existing framework:
- OpenAI-Compatible, Azure OpenAI, and Bedrock implement `ChatModel` -- use them anywhere a model is accepted.
- OpenAI-Compatible (MistralAI, HuggingFace, Cohere) and Azure OpenAI also implement `Embeddings`.
- Cohere Reranker implements `DocumentCompressor` -- use it with `ContextualCompressionRetriever` for two-stage retrieval.
- Qdrant, PgVector, Pinecone, Chroma, MongoDB Atlas, and Elasticsearch implement `VectorStore` -- use them with `VectorStoreRetriever` or any component that accepts `&dyn VectorStore`.
- Redis Store implements `Store` -- use it anywhere `InMemoryStore` is used, including agent `ToolRuntime` injection.
- Redis Cache and SQLite Cache implement `LlmCache` -- wrap any `ChatModel` with `CachedChatModel` for persistent response caching.
- PDF Loader implements `Loader` -- use it in RAG pipelines alongside `TextSplitter`, `Embeddings`, and `VectorStore`.
- Tavily Search implements `Tool` -- register it with an agent for web search capabilities.
Guides
LLM Providers
- OpenAI-Compatible Providers -- Groq, DeepSeek, Fireworks, Together, xAI, MistralAI, HuggingFace, Cohere, OpenRouter
- Azure OpenAI -- Azure-hosted OpenAI models
- Anthropic -- Anthropic Claude models
- Google Gemini -- Google Gemini models
- Ollama -- Local LLM inference (chat + embeddings)
- AWS Bedrock -- AWS Bedrock foundation models
Reranking
- Cohere Reranker -- document reranking for improved retrieval
Vector Stores
- Qdrant Vector Store -- store and search embeddings with Qdrant
- PgVector -- store and search embeddings with PostgreSQL + pgvector
- Pinecone Vector Store -- managed vector store with Pinecone
- Chroma Vector Store -- open-source embedding database
- MongoDB Atlas Vector Search -- vector search with MongoDB Atlas
- Elasticsearch Vector Store -- vector search with Elasticsearch kNN
Storage & Caching
- Redis Store & Cache -- persistent key-value storage and LLM caching with Redis
- SQLite Cache -- local LLM response caching with SQLite
Loaders & Tools
- PDF Loader -- load documents from PDF files
- Tavily Search Tool -- web search tool for agents
OpenAI-Compatible Providers
Many LLM providers expose an OpenAI-compatible API. Synaptic ships convenience constructors for nine popular providers so you can connect without building configuration by hand.
Setup
Add the openai feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai"] }
All OpenAI-compatible providers use the synaptic-openai crate under the hood, so only the openai feature is required.
Supported Providers
The synaptic::openai::compat module provides two functions per provider:
- `{provider}_config(api_key, model)` -- returns an `OpenAiConfig` pre-configured with the correct base URL.
- `{provider}_chat_model(api_key, model, backend)` -- returns a ready-to-use `OpenAiChatModel`.
Some providers also offer embeddings variants.
| Provider | Config function | Chat model function | Embeddings? |
|---|---|---|---|
| Groq | groq_config | groq_chat_model | No |
| DeepSeek | deepseek_config | deepseek_chat_model | No |
| Fireworks | fireworks_config | fireworks_chat_model | No |
| Together | together_config | together_chat_model | No |
| xAI | xai_config | xai_chat_model | No |
| MistralAI | mistral_config | mistral_chat_model | Yes |
| HuggingFace | huggingface_config | huggingface_chat_model | Yes |
| Cohere | cohere_config | cohere_chat_model | Yes |
| OpenRouter | openrouter_config | openrouter_chat_model | No |
Usage
Chat model
use std::sync::Arc;
use synaptic::openai::compat::{groq_chat_model, deepseek_chat_model};
use synaptic::models::HttpBackend;
use synaptic::core::{ChatModel, ChatRequest, Message};
let backend = Arc::new(HttpBackend::new());
// Groq
let model = groq_chat_model("gsk-...", "llama-3.3-70b-versatile", backend.clone());
let request = ChatRequest::new(vec![Message::human("Hello from Groq!")]);
let response = model.chat(&request).await?;
// DeepSeek
let model = deepseek_chat_model("sk-...", "deepseek-chat", backend.clone());
let response = model.chat(&request).await?;
Config-first approach
If you need to customize the config further before creating the model:
use std::sync::Arc;
use synaptic::openai::compat::fireworks_config;
use synaptic::openai::OpenAiChatModel;
use synaptic::models::HttpBackend;
let config = fireworks_config("fw-...", "accounts/fireworks/models/llama-v3p1-70b-instruct")
.with_temperature(0.7)
.with_max_tokens(2048);
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
Embeddings
Providers that support embeddings have {provider}_embeddings_config and {provider}_embeddings functions:
use std::sync::Arc;
use synaptic::openai::compat::{mistral_embeddings, cohere_embeddings, huggingface_embeddings};
use synaptic::models::HttpBackend;
use synaptic::core::Embeddings;
let backend = Arc::new(HttpBackend::new());
// MistralAI embeddings
let embeddings = mistral_embeddings("sk-...", "mistral-embed", backend.clone());
let vectors = embeddings.embed_documents(&["Hello world"]).await?;
// Cohere embeddings
let embeddings = cohere_embeddings("co-...", "embed-english-v3.0", backend.clone());
// HuggingFace embeddings
let embeddings = huggingface_embeddings("hf_...", "BAAI/bge-small-en-v1.5", backend.clone());
Unlisted providers
Any provider that exposes an OpenAI-compatible API can be used by setting a custom base URL on OpenAiConfig:
use std::sync::Arc;
use synaptic::openai::{OpenAiConfig, OpenAiChatModel};
use synaptic::models::HttpBackend;
let config = OpenAiConfig::new("your-api-key", "model-name")
.with_base_url("https://api.example.com/v1");
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
This works for any service that accepts the OpenAI chat completions request format at {base_url}/chat/completions.
Streaming
All OpenAI-compatible models support streaming. Use stream_chat() just like you would with the standard OpenAiChatModel:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![Message::human("Tell me a story")]);
let mut stream = model.stream_chat(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(text) = &chunk.content {
print!("{}", text);
}
}
Provider reference
| Provider | Base URL | Env variable (convention) |
|---|---|---|
| Groq | https://api.groq.com/openai/v1 | GROQ_API_KEY |
| DeepSeek | https://api.deepseek.com/v1 | DEEPSEEK_API_KEY |
| Fireworks | https://api.fireworks.ai/inference/v1 | FIREWORKS_API_KEY |
| Together | https://api.together.xyz/v1 | TOGETHER_API_KEY |
| xAI | https://api.x.ai/v1 | XAI_API_KEY |
| MistralAI | https://api.mistral.ai/v1 | MISTRAL_API_KEY |
| HuggingFace | https://api-inference.huggingface.co/v1 | HUGGINGFACE_API_KEY |
| Cohere | https://api.cohere.com/v1 | CO_API_KEY |
| OpenRouter | https://openrouter.ai/api/v1 | OPENROUTER_API_KEY |
Azure OpenAI
This guide shows how to use Azure OpenAI Service as a chat model and embeddings provider in Synaptic. Azure OpenAI uses deployment-based URLs and api-key header authentication instead of Bearer tokens.
Setup
Add the openai feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai"] }
Azure OpenAI support is included in the synaptic-openai crate, so no additional feature flag is needed.
Configuration
Create an AzureOpenAiConfig with your API key, resource name, and deployment name:
use std::sync::Arc;
use synaptic::openai::{AzureOpenAiConfig, AzureOpenAiChatModel};
use synaptic::models::HttpBackend;
let config = AzureOpenAiConfig::new(
"your-azure-api-key",
"my-resource", // Azure resource name
"gpt-4o-deployment", // Deployment name
);
let model = AzureOpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
The resulting endpoint URL is:
https://{resource_name}.openai.azure.com/openai/deployments/{deployment_name}/chat/completions?api-version={api_version}
API version
The default API version is "2024-10-21". You can override it:
let config = AzureOpenAiConfig::new("key", "resource", "deployment")
.with_api_version("2024-12-01-preview");
Model parameters
Configure temperature, max tokens, and other generation parameters:
let config = AzureOpenAiConfig::new("key", "resource", "deployment")
.with_temperature(0.7)
.with_max_tokens(4096);
Usage
AzureOpenAiChatModel implements the ChatModel trait, so it works everywhere a standard model does:
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("What is Azure OpenAI?"),
]);
let response = model.chat(&request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
use futures::StreamExt;
let mut stream = model.stream_chat(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(text) = &chunk.content {
print!("{}", text);
}
}
Tool calling
use synaptic::core::{ChatRequest, Message, ToolDefinition};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![Message::human("What's the weather in Seattle?")])
.with_tools(tools);
let response = model.chat(&request).await?;
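As with the other providers, you can then inspect any tool calls the model requested:
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
    println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}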
Embeddings
Use AzureOpenAiEmbeddings for text embedding with Azure-hosted models:
use std::sync::Arc;
use synaptic::openai::{AzureOpenAiEmbeddingsConfig, AzureOpenAiEmbeddings};
use synaptic::models::HttpBackend;
use synaptic::core::Embeddings;
let config = AzureOpenAiEmbeddingsConfig::new(
"your-azure-api-key",
"my-resource",
"text-embedding-ada-002-deployment",
);
let embeddings = AzureOpenAiEmbeddings::new(config, Arc::new(HttpBackend::new()));
let vectors = embeddings.embed_documents(&["Hello world", "Rust is fast"]).await?;
Environment variables
A common pattern is to read credentials from the environment:
let config = AzureOpenAiConfig::new(
std::env::var("AZURE_OPENAI_API_KEY").unwrap(),
std::env::var("AZURE_OPENAI_RESOURCE").unwrap(),
std::env::var("AZURE_OPENAI_DEPLOYMENT").unwrap(),
);
Configuration reference
AzureOpenAiConfig
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Azure OpenAI API key |
resource_name | String | required | Azure resource name |
deployment_name | String | required | Model deployment name |
api_version | String | "2024-10-21" | Azure API version |
temperature | Option<f32> | None | Sampling temperature |
max_tokens | Option<u32> | None | Maximum tokens to generate |
AzureOpenAiEmbeddingsConfig
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Azure OpenAI API key |
resource_name | String | required | Azure resource name |
deployment_name | String | required | Embeddings deployment name |
api_version | String | "2024-10-21" | Azure API version |
Anthropic
This guide shows how to use the Anthropic Messages API as a chat model provider in Synaptic. AnthropicChatModel wraps the Anthropic REST API and supports streaming, tool calling, and all standard ChatModel operations.
Setup
Add the anthropic feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["anthropic"] }
API key
Set your Anthropic API key as an environment variable:
export ANTHROPIC_API_KEY="sk-ant-..."
The key is passed to AnthropicConfig at construction time. Requests are authenticated with the x-api-key header (not a Bearer token).
Configuration
Create an AnthropicConfig with your API key and model name:
use synaptic::anthropic::{AnthropicConfig, AnthropicChatModel};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = AnthropicConfig::new("sk-ant-...", "claude-sonnet-4-20250514");
let model = AnthropicChatModel::new(config, Arc::new(HttpBackend::new()));
Custom base URL
To use a proxy or alternative endpoint:
let config = AnthropicConfig::new(api_key, "claude-sonnet-4-20250514")
.with_base_url("https://my-proxy.example.com");
Model parameters
let config = AnthropicConfig::new(api_key, "claude-sonnet-4-20250514")
.with_max_tokens(4096)
.with_top_p(0.9)
.with_stop(vec!["END".to_string()]);
Usage
AnthropicChatModel implements the ChatModel trait:
use synaptic::anthropic::{AnthropicConfig, AnthropicChatModel};
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = AnthropicConfig::new(
std::env::var("ANTHROPIC_API_KEY").unwrap(),
"claude-sonnet-4-20250514",
);
let model = AnthropicChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain Rust's ownership model in one sentence."),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
AnthropicChatModel supports native SSE streaming via the stream_chat method:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::human("Write a short poem about Rust."),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if !chunk.content.is_empty() {
print!("{}", chunk.content);
}
}
Tool calling
Anthropic models support tool calling through tool_use and tool_result content blocks. Synaptic maps ToolDefinition and ToolChoice to the Anthropic format automatically.
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition, ToolChoice};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather for a city".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(tools)
.with_tool_choice(ToolChoice::Auto);
let response = model.chat(request).await?;
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
ToolChoice variants map to Anthropic's tool_choice as follows:
| Synaptic | Anthropic |
|---|---|
Auto | {"type": "auto"} |
Required | {"type": "any"} |
None | {"type": "none"} |
Specific(name) | {"type": "tool", "name": "..."} |
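For example, to force the model to call a particular tool by name, pass the Specific variant (the exact argument construction shown here is illustrative):
// Re-using the `tools` definition from above
let request = ChatRequest::new(vec![Message::human("What is the weather in Tokyo?")])
    .with_tools(tools)
    .with_tool_choice(ToolChoice::Specific("get_weather".into()));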
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Anthropic API key |
model | String | required | Model name (e.g. claude-sonnet-4-20250514) |
base_url | String | "https://api.anthropic.com" | API base URL |
max_tokens | u32 | 1024 | Maximum tokens to generate |
top_p | Option<f64> | None | Nucleus sampling parameter |
stop | Option<Vec<String>> | None | Stop sequences |
Google Gemini
This guide shows how to use the Google Generative Language API as a chat model provider in Synaptic. GeminiChatModel wraps Google's Generative Language REST API and supports streaming, tool calling, and all standard ChatModel operations.
Setup
Add the gemini feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["gemini"] }
API key
Set your Google API key as an environment variable:
export GOOGLE_API_KEY="AIza..."
The key is passed to GeminiConfig at construction time. Unlike other providers, the API key is sent as a query parameter (?key=...) rather than in a request header.
Configuration
Create a GeminiConfig with your API key and model name:
use synaptic::gemini::{GeminiConfig, GeminiChatModel};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = GeminiConfig::new("AIza...", "gemini-2.0-flash");
let model = GeminiChatModel::new(config, Arc::new(HttpBackend::new()));
Custom base URL
To use a proxy or alternative endpoint:
let config = GeminiConfig::new(api_key, "gemini-2.0-flash")
.with_base_url("https://my-proxy.example.com");
Model parameters
let config = GeminiConfig::new(api_key, "gemini-2.0-flash")
.with_top_p(0.9)
.with_stop(vec!["END".to_string()]);
Usage
GeminiChatModel implements the ChatModel trait:
use synaptic::gemini::{GeminiConfig, GeminiChatModel};
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = GeminiConfig::new(
std::env::var("GOOGLE_API_KEY").unwrap(),
"gemini-2.0-flash",
);
let model = GeminiChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain Rust's ownership model in one sentence."),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
GeminiChatModel supports native SSE streaming via the stream_chat method. The streaming endpoint uses streamGenerateContent?alt=sse:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::human("Write a short poem about Rust."),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if !chunk.content.is_empty() {
print!("{}", chunk.content);
}
}
Tool calling
Gemini models support tool calling through functionCall and functionResponse parts (camelCase format). Synaptic maps ToolDefinition and ToolChoice to the Gemini format automatically.
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition, ToolChoice};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather for a city".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(tools)
.with_tool_choice(ToolChoice::Auto);
let response = model.chat(request).await?;
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
ToolChoice variants map to Gemini's functionCallingConfig as follows:
| Synaptic | Gemini |
|---|---|
Auto | {"mode": "AUTO"} |
Required | {"mode": "ANY"} |
None | {"mode": "NONE"} |
Specific(name) | {"mode": "ANY", "allowedFunctionNames": ["..."]} |
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Google API key |
model | String | required | Model name (e.g. gemini-2.0-flash) |
base_url | String | "https://generativelanguage.googleapis.com" | API base URL |
top_p | Option<f64> | None | Nucleus sampling parameter |
stop | Option<Vec<String>> | None | Stop sequences |
Ollama
This guide shows how to use Ollama as a local chat model and embeddings provider in Synaptic. OllamaChatModel wraps the Ollama REST API and supports streaming, tool calling, and all standard ChatModel operations. Because Ollama runs locally, no API key is needed.
Setup
Add the ollama feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["ollama"] }
Installing Ollama
Install Ollama from ollama.com and pull a model before using the provider:
# Install Ollama (macOS)
brew install ollama
# Start the Ollama server
ollama serve
# Pull a model
ollama pull llama3.1
The default endpoint is http://localhost:11434. Make sure the Ollama server is running before sending requests.
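A quick way to verify the server is reachable is to list the locally installed models:
curl http://localhost:11434/api/tags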
Configuration
Create an OllamaConfig with a model name. No API key is required:
use synaptic::ollama::{OllamaConfig, OllamaChatModel};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = OllamaConfig::new("llama3.1");
let model = OllamaChatModel::new(config, Arc::new(HttpBackend::new()));
Custom base URL
To connect to a remote Ollama instance or a non-default port:
let config = OllamaConfig::new("llama3.1")
.with_base_url("http://192.168.1.100:11434");
Model parameters
let config = OllamaConfig::new("llama3.1")
.with_top_p(0.9)
.with_stop(vec!["END".to_string()])
.with_seed(42);
Usage
OllamaChatModel implements the ChatModel trait:
use synaptic::ollama::{OllamaConfig, OllamaChatModel};
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = OllamaConfig::new("llama3.1");
let model = OllamaChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain Rust's ownership model in one sentence."),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
OllamaChatModel supports native streaming via the stream_chat method. Unlike cloud providers that use SSE, Ollama uses NDJSON (newline-delimited JSON) where each line is a complete JSON object:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::human("Write a short poem about Rust."),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if !chunk.content.is_empty() {
print!("{}", chunk.content);
}
}
Tool calling
Ollama models that support function calling (such as llama3.1) can use tool calling through the tool_calls array format. Synaptic maps ToolDefinition and ToolChoice to the Ollama format automatically.
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition, ToolChoice};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather for a city".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(tools)
.with_tool_choice(ToolChoice::Auto);
let response = model.chat(request).await?;
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
ToolChoice variants map to Ollama's tool_choice as follows:
| Synaptic | Ollama |
|---|---|
Auto | "auto" |
Required | "required" |
None | "none" |
Specific(name) | {"type": "function", "function": {"name": "..."}} |
Reproducibility with seed
Ollama supports a seed parameter for reproducible generation. When set, the model will produce deterministic output for the same input:
let config = OllamaConfig::new("llama3.1")
.with_seed(42);
let model = OllamaChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::human("Pick a random number between 1 and 100."),
]);
// Same seed + same input = same output
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Embeddings
OllamaEmbeddings provides local embedding generation through Ollama's /api/embed endpoint. Pull an embedding model first:
ollama pull nomic-embed-text
Configuration
use synaptic::ollama::{OllamaEmbeddingsConfig, OllamaEmbeddings};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = OllamaEmbeddingsConfig::new("nomic-embed-text");
let embeddings = OllamaEmbeddings::new(config, Arc::new(HttpBackend::new()));
To connect to a remote instance:
let config = OllamaEmbeddingsConfig::new("nomic-embed-text")
.with_base_url("http://192.168.1.100:11434");
Usage
OllamaEmbeddings implements the Embeddings trait:
use synaptic::core::Embeddings;
// Embed a single query
let vector = embeddings.embed_query("What is Rust?").await?;
println!("Dimension: {}", vector.len());
// Embed multiple documents
let vectors = embeddings.embed_documents(&["First doc", "Second doc"]).await?;
println!("Embedded {} documents", vectors.len());
Configuration reference
OllamaConfig
| Field | Type | Default | Description |
|---|---|---|---|
model | String | required | Model name (e.g. llama3.1) |
base_url | String | "http://localhost:11434" | Ollama server URL |
top_p | Option<f64> | None | Nucleus sampling parameter |
stop | Option<Vec<String>> | None | Stop sequences |
seed | Option<u64> | None | Seed for reproducible generation |
OllamaEmbeddingsConfig
| Field | Type | Default | Description |
|---|---|---|---|
model | String | required | Embedding model name (e.g. nomic-embed-text) |
base_url | String | "http://localhost:11434" | Ollama server URL |
AWS Bedrock
This guide shows how to use AWS Bedrock as a chat model provider in Synaptic. Bedrock provides access to foundation models from Amazon, Anthropic, Meta, Mistral, and others through the AWS SDK.
Setup
Add the bedrock feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["bedrock"] }
AWS credentials
BedrockChatModel uses the AWS SDK for Rust, which reads credentials from the standard AWS credential chain:
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
- Shared credentials file (~/.aws/credentials)
- IAM role (when running on EC2, ECS, Lambda, etc.)
Ensure your IAM principal has bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream permissions.
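A minimal IAM policy sketch granting these two actions looks like the following (in production, scope Resource down to the specific model ARNs you use):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}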
Configuration
Create a BedrockConfig with the model ID:
use synaptic::bedrock::{BedrockConfig, BedrockChatModel};
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0");
let model = BedrockChatModel::new(config).await;
Note: The constructor is async because it initializes the AWS SDK client, which loads credentials and resolves the region from the environment.
Region
By default, the region is resolved from the AWS SDK default chain (environment variable AWS_REGION, config file, etc.). You can override it:
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0")
.with_region("us-west-2");
Model parameters
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0")
.with_temperature(0.7)
.with_max_tokens(4096);
Usage
BedrockChatModel implements the ChatModel trait:
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain AWS Bedrock in one sentence."),
]);
let response = model.chat(&request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
use futures::StreamExt;
let mut stream = model.stream_chat(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(text) = &chunk.content {
print!("{}", text);
}
}
Tool calling
Bedrock supports tool calling for models that expose it (e.g. Anthropic Claude models):
use synaptic::core::{ChatRequest, Message, ToolDefinition};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![Message::human("Weather in Tokyo?")])
.with_tools(tools);
let response = model.chat(&request).await?;
Using an existing AWS client
If you already have a configured aws_sdk_bedrockruntime::Client, pass it directly with from_client:
use synaptic::bedrock::{BedrockConfig, BedrockChatModel};
let aws_config = aws_config::from_env().region("eu-west-1").load().await;
let client = aws_sdk_bedrockruntime::Client::new(&aws_config);
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0");
let model = BedrockChatModel::from_client(config, client);
Note: Unlike the standard constructor, from_client is not async because it skips AWS SDK initialization.
Architecture note
BedrockChatModel does not use the ProviderBackend abstraction (HttpBackend/FakeBackend). It calls the AWS SDK directly via the Bedrock Runtime converse and converse_stream APIs. This means you cannot inject a FakeBackend for testing. Instead, use ScriptedChatModel as a test double:
use synaptic::models::ScriptedChatModel;
use synaptic::core::Message;
let model = ScriptedChatModel::new(vec![
Message::ai("Mocked Bedrock response"),
]);
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
model_id | String | required | Bedrock model ID (e.g. anthropic.claude-3-5-sonnet-20241022-v2:0) |
region | Option<String> | None (auto-detect) | AWS region override |
temperature | Option<f32> | None | Sampling temperature |
max_tokens | Option<u32> | None | Maximum tokens to generate |
Cohere Reranker
This guide shows how to use the Cohere Reranker in Synaptic. The reranker re-scores a list of documents by relevance to a query, improving retrieval quality when used as a second-stage filter.
Note: For Cohere chat models and embeddings, use the OpenAI-compatible constructors (cohere_chat_model, cohere_embeddings) instead. This page covers the Reranker only.
Setup
Add the cohere feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["cohere"] }
Set your Cohere API key:
export CO_API_KEY="your-cohere-api-key"
Configuration
Create a CohereRerankerConfig and build the reranker:
use synaptic::cohere::{CohereRerankerConfig, CohereReranker};
let config = CohereRerankerConfig::new("your-cohere-api-key");
let reranker = CohereReranker::new(config);
Custom model
The default model is "rerank-v3.5". You can specify a different one:
let config = CohereRerankerConfig::new("your-cohere-api-key")
.with_model("rerank-english-v3.0");
Usage
Reranking documents
Pass a query, a list of documents, and the number of top results to return:
use synaptic::core::Document;
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is popular for data science"),
Document::new("3", "Rust ensures memory safety without a garbage collector"),
Document::new("4", "JavaScript runs in the browser"),
];
let top_docs = reranker.rerank("memory safe language", &docs, 2).await?;
for doc in &top_docs {
println!("{}: {}", doc.id, doc.content);
}
// Likely returns docs 3 and 1, re-ordered by relevance
The returned documents are sorted by descending relevance score, and only the top_n highest-ranked documents are returned.
With ContextualCompressionRetriever
When the retrieval feature is also enabled, CohereReranker implements the DocumentCompressor trait. This allows it to plug into a ContextualCompressionRetriever for automatic reranking:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "cohere", "retrieval", "vectorstores", "embeddings"] }
use std::sync::Arc;
use synaptic::cohere::{CohereRerankerConfig, CohereReranker};
use synaptic::retrieval::ContextualCompressionRetriever;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever};
use synaptic::openai::OpenAiEmbeddings;
// Set up a base retriever
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(InMemoryVectorStore::new());
// ... add documents to the store ...
let base_retriever = Arc::new(VectorStoreRetriever::new(store, embeddings, 20));
// Wrap with reranker for two-stage retrieval
let reranker = Arc::new(CohereReranker::new(
CohereRerankerConfig::new("your-cohere-api-key"),
));
let retriever = ContextualCompressionRetriever::new(base_retriever, reranker);
// Retrieves 20 candidates, then reranks and returns the top 5
use synaptic::core::Retriever;
let results = retriever.retrieve("memory safety in Rust", 5).await?;
This two-stage pattern (broad retrieval followed by reranking) often produces better results than relying on embedding similarity alone.
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Cohere API key |
model | String | "rerank-v3.5" | Reranker model name |
Qdrant Vector Store
This guide shows how to use Qdrant as a vector store backend in Synaptic. Qdrant is a high-performance vector database purpose-built for similarity search.
Setup
Add the qdrant feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant"] }
Start a Qdrant instance (e.g. via Docker):
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
Port 6333 is the REST API; port 6334 is the gRPC endpoint used by the Rust client.
Configuration
Create a QdrantConfig with the connection URL, collection name, and vector dimensionality:
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config)?;
API key authentication
For Qdrant Cloud or secured deployments, attach an API key:
let config = QdrantConfig::new("https://my-cluster.cloud.qdrant.io:6334", "docs", 1536)
.with_api_key("your-api-key-here");
let store = QdrantVectorStore::new(config)?;
Distance metric
The default distance metric is cosine similarity. You can change it with with_distance():
use qdrant_client::qdrant::Distance;
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536)
.with_distance(Distance::Euclid);
Available options: Distance::Cosine (default), Distance::Euclid, Distance::Dot, Distance::Manhattan.
Creating the collection
Call ensure_collection() to create the collection if it does not already exist. This is idempotent and safe to call on every startup:
store.ensure_collection().await?;
The collection is created with the vector size and distance metric from your config.
Adding documents
QdrantVectorStore implements the VectorStore trait. Pass an embeddings provider to compute vectors:
use synaptic::qdrant::VectorStore;
use synaptic::retrieval::Document;
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Document IDs are mapped to Qdrant point UUIDs. If a document ID is already a valid UUID, it is used directly. Otherwise, a deterministic UUID v5 is generated from the ID string.
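Purely for illustration, this is what a deterministic UUID v5 derivation looks like with the uuid crate (the namespace Synaptic uses internally may differ):
use uuid::Uuid;
// The same input ID always produces the same point UUID
let point_id = Uuid::new_v5(&Uuid::NAMESPACE_OID, b"doc-42");
println!("{point_id}");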
Similarity search
Find the k most similar documents to a text query:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
Get similarity scores alongside results:
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Search by vector
Search using a pre-computed embedding vector:
use synaptic::embeddings::Embeddings;
let query_vec = embeddings.embed_query("systems programming").await?;
let results = store.similarity_search_by_vector(&query_vec, 3).await?;
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever to use it with the rest of Synaptic's retrieval infrastructure:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::retrieval::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Using an existing client
If you already have a configured qdrant_client::Qdrant instance, you can pass it directly:
use qdrant_client::Qdrant;
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
let client = Qdrant::from_url("http://localhost:6334").build()?;
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::from_client(client, config);
RAG Pipeline Example
A complete RAG pipeline: load documents, split them into chunks, embed and store in Qdrant, then retrieve relevant context and generate an answer.
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 1. Load and split
let loader = TextLoader::new("docs/knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
// 2. Store in Qdrant
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config)?;
store.ensure_collection().await?;
store.add_documents(chunks, embeddings.as_ref()).await?;
// 3. Retrieve and answer
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings.clone(), 5);
let relevant = retriever.retrieve("What is Synaptic?", 5).await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let model = OpenAiChatModel::new(/* config */);
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(&request).await?;
Using with an Agent
Wrap the retriever as a tool so a ReAct agent can decide when to search the vector store during multi-step reasoning:
use synaptic::graph::create_react_agent;
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use std::sync::Arc;
// Build the retriever (as shown above)
let config = QdrantConfig::new("http://localhost:6334", "knowledge", 1536);
let store = Arc::new(QdrantVectorStore::new(config)?);
store.ensure_collection().await?;
let embeddings = Arc::new(OpenAiEmbeddings::new(/* config */));
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
// Register the retriever as a tool and create a ReAct agent
// that can autonomously decide when to search
let model = OpenAiChatModel::new(/* config */);
let agent = create_react_agent(model, vec![/* retriever tool */]).compile();
The agent will invoke the retriever tool whenever it determines that external knowledge is needed to answer the user's question.
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
url | String | required | Qdrant gRPC URL (e.g. http://localhost:6334) |
collection_name | String | required | Name of the Qdrant collection |
vector_size | u64 | required | Dimensionality of the embedding vectors |
api_key | Option<String> | None | API key for authenticated access |
distance | Distance | Cosine | Distance metric for similarity search |
PgVector
This guide shows how to use PostgreSQL with the pgvector extension as a vector store backend in Synaptic. This is a good choice when you already run PostgreSQL and want to keep embeddings alongside your relational data.
Prerequisites
Your PostgreSQL instance must have the pgvector extension installed. Once installed, enable it in your target database with:
CREATE EXTENSION IF NOT EXISTS vector;
Refer to the pgvector installation guide for platform-specific instructions.
Setup
Add the pgvector feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "pgvector"] }
sqlx = { version = "0.8", features = ["runtime-tokio", "postgres"] }
The sqlx dependency is needed to create the connection pool. Synaptic uses sqlx::PgPool for all database operations.
Creating a store
Connect to PostgreSQL and create the store:
use sqlx::postgres::PgPoolOptions;
use synaptic::pgvector::{PgVectorConfig, PgVectorStore};
let pool = PgPoolOptions::new()
.max_connections(5)
.connect("postgres://user:pass@localhost/mydb")
.await?;
let config = PgVectorConfig::new("documents", 1536);
let store = PgVectorStore::new(pool, config);
The first argument to PgVectorConfig::new is the table name; the second is the embedding vector dimensionality (e.g. 1536 for OpenAI text-embedding-3-small).
Initializing the table
Call initialize() once to create the pgvector extension and the backing table. This is idempotent and safe to run on every application startup:
store.initialize().await?;
This creates a table with the following schema:
CREATE TABLE IF NOT EXISTS documents (
id TEXT PRIMARY KEY,
content TEXT NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}',
embedding vector(1536)
);
The vector(N) column type is provided by the pgvector extension, where N matches the vector_dimensions in your config.
Adding documents
PgVectorStore implements the VectorStore trait. Pass an embeddings provider to compute vectors:
use synaptic::pgvector::VectorStore;
use synaptic::retrieval::Document;
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Documents with empty IDs are assigned a random UUID. Existing documents with the same ID are upserted (content, metadata, and embedding are updated).
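For example, re-adding a document under an existing ID replaces its stored content and embedding rather than creating a duplicate:
// Document "1" already exists, so this updates it in place
let updated = vec![Document::new("1", "Rust is a memory-safe systems language")];
store.add_documents(updated, &embeddings).await?;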
Similarity search
Find the k most similar documents using cosine distance (<=>):
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
Get cosine similarity scores (higher is more similar):
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Scores are computed as 1 - cosine_distance, so a score of 1.0 means identical vectors.
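Because scores are normalized this way, you can drop weak matches with a simple threshold (0.8 here is an arbitrary illustrative cutoff):
let strong_matches: Vec<_> = scored
    .into_iter()
    .filter(|(_, score)| *score >= 0.8)
    .collect();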
Search by vector
Search using a pre-computed embedding vector:
use synaptic::embeddings::Embeddings;
let query_vec = embeddings.embed_query("systems programming").await?;
let results = store.similarity_search_by_vector(&query_vec, 3).await?;
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever for use with Synaptic's retrieval infrastructure:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::openai::OpenAiEmbeddings;
use synaptic::retrieval::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Schema-qualified table names
You can use schema-qualified names (e.g. public.documents) for the table:
let config = PgVectorConfig::new("myschema.embeddings", 1536);
Table names are validated to contain only alphanumeric characters, underscores, and dots, preventing SQL injection.
Common patterns
RAG pipeline with PgVector
use synaptic::pgvector::{PgVectorConfig, PgVectorStore, VectorStore};
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::retrieval::{Document, Retriever};
use synaptic::core::{ChatModel, ChatRequest, Message};
use sqlx::postgres::PgPoolOptions;
use std::sync::Arc;
// Set up the store
let pool = PgPoolOptions::new()
.max_connections(5)
.connect("postgres://user:pass@localhost/mydb")
.await?;
let config = PgVectorConfig::new("knowledge_base", 1536);
let store = PgVectorStore::new(pool, config);
store.initialize().await?;
// Add documents
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let docs = vec![
Document::new("doc1", "Synaptic is a Rust agent framework"),
Document::new("doc2", "It supports RAG with vector stores"),
];
store.add_documents(docs, embeddings.as_ref()).await?;
// Retrieve and generate
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 3);
let context_docs = retriever.retrieve("What is Synaptic?", 3).await?;
let context = context_docs.iter()
.map(|d| d.content.as_str())
.collect::<Vec<_>>()
.join("\n");
let model = OpenAiChatModel::new("gpt-4o-mini");
let request = ChatRequest::new(vec![
Message::system(format!("Answer using this context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(request).await?;
Index Strategies
pgvector supports two index types for accelerating approximate nearest-neighbor search. Choosing the right one depends on your dataset size and performance requirements.
HNSW (Hierarchical Navigable Small World) -- recommended for most use cases. It provides better recall, faster queries at search time, and does not require a separate training step. The trade-off is higher memory usage and slower index build time.
IVFFlat (Inverted File with Flat compression) -- a good option for very large datasets where memory is a concern. It partitions vectors into lists and searches only a subset at query time. You must build the index after the table already contains data (it needs representative vectors for training).
-- HNSW index (recommended for most use cases)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- IVFFlat index (better for very large datasets)
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
| Property | HNSW | IVFFlat |
|---|---|---|
| Recall | Higher | Lower |
| Query speed | Faster | Slower (depends on probes) |
| Memory usage | Higher | Lower |
| Build speed | Slower | Faster |
| Training required | No | Yes (needs existing data) |
Tip: For tables with fewer than 100k rows, the default sequential scan is often fast enough. Add an index when query latency becomes a concern.
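Both index types also expose query-time tuning knobs. These are standard pgvector settings (defaults noted in the comments are from pgvector's documentation); tune them to your recall/latency needs:
-- HNSW: higher ef_search improves recall at the cost of latency (default 40)
SET hnsw.ef_search = 100;
-- IVFFlat: more probes improves recall at the cost of latency (default 1)
SET ivfflat.probes = 10;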
Reusing an Existing Connection Pool
If your application already maintains a sqlx::PgPool (e.g. for your main relational data), you can pass it directly to PgVectorStore instead of creating a new pool:
use sqlx::PgPool;
use synaptic::pgvector::{PgVectorConfig, PgVectorStore};
// Reuse the pool from your application state
let pool: PgPool = app_state.db_pool.clone();
let config = PgVectorConfig::new("app_embeddings", 1536);
let store = PgVectorStore::new(pool, config);
store.initialize().await?;
This avoids opening duplicate connections and lets your vector operations share the same transaction boundaries and connection limits as the rest of your application.
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
table_name | String | required | PostgreSQL table name (supports schema-qualified names) |
vector_dimensions | u32 | required | Dimensionality of the embedding vectors |
Pinecone Vector Store
This guide shows how to use Pinecone as a vector store backend in Synaptic. Pinecone is a managed vector database built for real-time similarity search at scale.
Setup
Add the pinecone feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "pinecone"] }
Set your Pinecone API key:
export PINECONE_API_KEY="your-pinecone-api-key"
You also need an existing Pinecone index. Create one through the Pinecone console or the Pinecone API. Note the index host URL (e.g. https://my-index-abc123.svc.aped-1234.pinecone.io).
Configuration
Create a PineconeConfig with your API key and index host URL:
use synaptic::pinecone::{PineconeConfig, PineconeVectorStore};
let config = PineconeConfig::new("your-pinecone-api-key", "https://my-index-abc123.svc.aped-1234.pinecone.io");
let store = PineconeVectorStore::new(config);
Namespace
Pinecone supports namespaces for partitioning data within an index:
let config = PineconeConfig::new("api-key", "https://my-index.pinecone.io")
.with_namespace("production");
If no namespace is set, the default namespace is used.
Adding documents
PineconeVectorStore implements the VectorStore trait. Pass an embeddings provider to compute vectors:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents to a text query:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever for use with Synaptic's retrieval infrastructure:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Namespace Isolation
Namespaces are a common pattern for building multi-tenant RAG applications with Pinecone. Each tenant's data lives in a separate namespace within the same index, providing logical isolation without the overhead of managing multiple indexes.
use synaptic::pinecone::{PineconeConfig, PineconeVectorStore};
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let api_key = std::env::var("PINECONE_API_KEY")?;
let index_host = "https://my-index-abc123.svc.aped-1234.pinecone.io";
// Create stores with different namespaces for tenant isolation
let config_a = PineconeConfig::new(&api_key, index_host)
.with_namespace("tenant-a");
let config_b = PineconeConfig::new(&api_key, index_host)
.with_namespace("tenant-b");
let store_a = PineconeVectorStore::new(config_a);
let store_b = PineconeVectorStore::new(config_b);
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
// Tenant A's documents are invisible to Tenant B
let docs_a = vec![Document::new("a1", "Tenant A internal report")];
store_a.add_documents(docs_a, &embeddings).await?;
// Searching in Tenant B's namespace returns no results from Tenant A
let results = store_b.similarity_search("internal report", 5, &embeddings).await?;
assert!(results.is_empty());
This approach scales well because Pinecone handles namespace-level partitioning internally. You can add, search, and delete documents in one namespace without affecting others.
RAG Pipeline Example
A complete RAG pipeline: load documents, split them into chunks, embed and store in Pinecone, then retrieve relevant context and generate an answer.
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings, VectorStore, Retriever};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::pinecone::{PineconeConfig, PineconeVectorStore};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 1. Load and split
let loader = TextLoader::new("docs/knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
// 2. Store in Pinecone
let config = PineconeConfig::new(
std::env::var("PINECONE_API_KEY")?,
"https://my-index-abc123.svc.aped-1234.pinecone.io",
);
let store = PineconeVectorStore::new(config);
store.add_documents(chunks, embeddings.as_ref()).await?;
// 3. Retrieve and answer
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings.clone(), 5);
let relevant = retriever.retrieve("What is Synaptic?", 5).await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let model = OpenAiChatModel::new(/* config */);
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(&request).await?;
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Pinecone API key |
host | String | required | Index host URL from the Pinecone console |
namespace | Option<String> | None | Namespace for data partitioning |
Chroma Vector Store
This guide shows how to use Chroma as a vector store backend in Synaptic. Chroma is an open-source embedding database that runs locally or in the cloud.
Setup
Add the chroma feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "chroma"] }
Start a Chroma server (e.g. via Docker):
docker run -p 8000:8000 chromadb/chroma
Configuration
Create a ChromaConfig with the server URL and collection name:
use synaptic::chroma::{ChromaConfig, ChromaVectorStore};
let config = ChromaConfig::new("http://localhost:8000", "my_collection");
let store = ChromaVectorStore::new(config);
The default URL is http://localhost:8000.
Creating the collection
Call ensure_collection() to create the collection if it does not already exist. This is idempotent and safe to call on every startup:
store.ensure_collection().await?;
Authentication
If your Chroma server requires authentication, pass credentials:
let config = ChromaConfig::new("https://chroma.example.com", "my_collection")
.with_auth_token("your-token");
Adding documents
ChromaVectorStore implements the VectorStore trait:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Docker Deployment
Chroma is easy to deploy with Docker for both development and production environments.
Quick start -- run a Chroma server with default settings:
# Start Chroma on port 8000
docker run -p 8000:8000 chromadb/chroma:latest
With persistent storage -- mount a volume so data survives container restarts:
docker run -p 8000:8000 -v ./chroma-data:/chroma/chroma chromadb/chroma:latest
Docker Compose -- for production deployments, use a docker-compose.yml:
version: "3.8"
services:
chroma:
image: chromadb/chroma:latest
ports:
- "8000:8000"
volumes:
- chroma-data:/chroma/chroma
restart: unless-stopped
volumes:
chroma-data:
Then connect from Synaptic:
use synaptic::chroma::{ChromaConfig, ChromaVectorStore};
let config = ChromaConfig::new("http://localhost:8000", "my_collection");
let store = ChromaVectorStore::new(config);
store.ensure_collection().await?;
For remote or authenticated deployments, use with_auth_token():
let config = ChromaConfig::new("https://chroma.example.com", "my_collection")
.with_auth_token("your-token");
RAG Pipeline Example
A complete RAG pipeline: load documents, split them into chunks, embed and store in Chroma, then retrieve relevant context and generate an answer.
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings, VectorStore, Retriever};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::chroma::{ChromaConfig, ChromaVectorStore};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 1. Load and split
let loader = TextLoader::new("docs/knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
// 2. Store in Chroma
let config = ChromaConfig::new("http://localhost:8000", "my_collection");
let store = ChromaVectorStore::new(config);
store.ensure_collection().await?;
store.add_documents(chunks, embeddings.as_ref()).await?;
// 3. Retrieve and answer
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings.clone(), 5);
let relevant = retriever.retrieve("What is Synaptic?", 5).await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let model = OpenAiChatModel::new(/* config */);
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(&request).await?;
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
url | String | "http://localhost:8000" | Chroma server URL |
collection_name | String | required | Name of the collection |
auth_token | Option<String> | None | Authentication token |
MongoDB Atlas Vector Search
This guide shows how to use MongoDB Atlas Vector Search as a vector store backend in Synaptic. Atlas Vector Search enables semantic similarity search on data stored in MongoDB.
Setup
Add the mongodb feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "mongodb"] }
Prerequisites
- A MongoDB Atlas cluster (M10 or higher, or a free shared cluster with Atlas Search enabled).
- A vector search index configured on the target collection. Create one via the Atlas UI or the Atlas Admin API.
Example index definition (JSON):
{
"type": "vectorSearch",
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
}
]
}
Configuration
Create a MongoVectorConfig with the database name, collection name, index name, and vector dimensionality:
use synaptic::mongodb::{MongoVectorConfig, MongoVectorStore};
let config = MongoVectorConfig::new("my_database", "my_collection", "vector_index", 1536);
let store = MongoVectorStore::from_uri("mongodb+srv://user:pass@cluster.mongodb.net/", config).await?;
The from_uri constructor connects to MongoDB and is async.
Embedding field name
By default, vectors are stored in a field called "embedding". You can change this:
let config = MongoVectorConfig::new("mydb", "docs", "vector_index", 1536)
.with_embedding_field("vector");
Make sure this matches the path in your Atlas vector search index definition.
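For example, with .with_embedding_field("vector") the index definition would use that field as the path:
{
  "fields": [
    {
      "type": "vector",
      "path": "vector",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}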
Content and metadata fields
Customize which fields store the document content and metadata:
let config = MongoVectorConfig::new("mydb", "docs", "vector_index", 1536)
.with_content_field("text")
.with_metadata_field("meta");
Adding documents
MongoVectorStore implements the VectorStore trait:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Atlas Search Index Setup
Before you can run similarity searches, you must create a vector search index on your MongoDB Atlas collection. Free and shared clusters support vector search with a limited number of search indexes; an M10 or higher dedicated cluster is recommended for production workloads.
Creating an index via the Atlas UI
- Navigate to your cluster in the MongoDB Atlas console.
- Go to Search > Create Search Index.
- Choose JSON Editor and select the target database and collection.
- Paste the following index definition:
{
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
}
]
}
- Name your index (e.g. vector_index) and click Create Search Index.
Note: The path field must match the embedding_field configured in your MongoVectorConfig. If you customized it with .with_embedding_field("vector"), set "path": "vector" in the index definition. Similarly, adjust numDimensions to match your embedding model's output dimensionality.
Creating an index via the Atlas CLI
You can also create the index programmatically using the MongoDB Atlas CLI:
First, save the index definition to a file called index.json:
{
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
}
]
}
Then run:
atlas clusters search indexes create \
--clusterName my-cluster \
--db my_database \
--collection my_collection \
--file index.json
The index build runs asynchronously. You can check its status with:
atlas clusters search indexes list \
--clusterName my-cluster \
--db my_database \
--collection my_collection
Wait until the status shows READY before running similarity searches.
Similarity options
The similarity field in the index definition controls how vectors are compared:
| Value | Description |
|---|---|
cosine | Cosine similarity (default, good for normalized embeddings) |
euclidean | Euclidean (L2) distance |
dotProduct | Dot product (use with unit-length vectors) |
RAG Pipeline Example
Below is a complete Retrieval-Augmented Generation (RAG) pipeline that loads documents, splits them, embeds and stores them in MongoDB Atlas, then retrieves relevant context to answer a question.
use std::sync::Arc;
use synaptic::core::{
ChatModel, ChatRequest, Document, Embeddings, Message, Retriever, VectorStore,
};
use synaptic::mongodb::{MongoVectorConfig, MongoVectorStore};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::vectorstores::VectorStoreRetriever;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Configure embeddings and LLM
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let llm = OpenAiChatModel::new("gpt-4o-mini");
// 2. Connect to MongoDB Atlas
let config = MongoVectorConfig::new("my_database", "documents", "vector_index", 1536);
let store = MongoVectorStore::from_uri(
"mongodb+srv://user:pass@cluster.mongodb.net/",
config,
)
.await?;
// 3. Load and split documents
let raw_docs = vec![
Document::new("doc1", "Rust is a multi-paradigm, general-purpose programming language \
that emphasizes performance, type safety, and concurrency. It enforces memory safety \
without a garbage collector."),
Document::new("doc2", "MongoDB Atlas is a fully managed cloud database service. It provides \
built-in vector search capabilities for AI applications, supporting cosine, euclidean, \
and dot product similarity metrics."),
];
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&raw_docs)?;
// 4. Embed and store in MongoDB
store.add_documents(chunks, embeddings.as_ref()).await?;
// 5. Create a retriever
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 3);
// 6. Retrieve relevant context
let query = "What is Rust?";
let relevant_docs = retriever.retrieve(query, 3).await?;
let context = relevant_docs
.iter()
.map(|doc| doc.content.as_str())
.collect::<Vec<_>>()
.join("\n\n");
// 7. Generate answer using retrieved context
let messages = vec![
Message::system("Answer the user's question based on the following context. \
If the context doesn't contain relevant information, say so.\n\n\
Context:\n{context}".replace("{context}", &context)),
Message::human(query),
];
let response = llm.chat(ChatRequest::new(messages)).await?;
println!("Answer: {}", response.message.content());
Ok(())
}
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
database | String | required | MongoDB database name |
collection | String | required | MongoDB collection name |
index_name | String | required | Atlas vector search index name |
dims | u32 | required | Dimensionality of embedding vectors |
embedding_field | String | "embedding" | Field name for the vector embedding |
content_field | String | "content" | Field name for document text content |
metadata_field | String | "metadata" | Field name for document metadata |
Elasticsearch Vector Store
This guide shows how to use Elasticsearch as a vector store backend in Synaptic. Elasticsearch supports approximate kNN (k-nearest neighbors) search using dense vector fields.
Setup
Add the elasticsearch feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "elasticsearch"] }
Start an Elasticsearch instance (e.g. via Docker):
docker run -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.12.0
Configuration
Create an ElasticsearchConfig with the server URL, index name, and vector dimensionality:
use synaptic::elasticsearch::{ElasticsearchConfig, ElasticsearchVectorStore};
let config = ElasticsearchConfig::new("http://localhost:9200", "my_index", 1536);
let store = ElasticsearchVectorStore::new(config);
Authentication
For secured Elasticsearch clusters, provide credentials:
let config = ElasticsearchConfig::new("https://es.example.com:9200", "my_index", 1536)
.with_credentials("elastic", "changeme");
Creating the index
Call ensure_index() to create the index with the appropriate kNN vector mapping if it does not already exist:
store.ensure_index().await?;
This creates an index with a dense_vector field configured for the specified dimensionality and cosine similarity. The call is idempotent.
Similarity metric
The default similarity is cosine. You can change it:
let config = ElasticsearchConfig::new("http://localhost:9200", "my_index", 1536)
.with_similarity("dot_product");
Available options: "cosine" (default), "dot_product", "l2_norm".
Adding documents
ElasticsearchVectorStore implements the VectorStore trait:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Index Mapping Configuration
While ensure_index() creates a default mapping automatically, you may want full control over the index mapping for production use. Below is the recommended Elasticsearch mapping for vector search:
{
"mappings": {
"properties": {
"embedding": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine"
},
"content": { "type": "text" },
"metadata": { "type": "object", "enabled": true }
}
}
}
Creating the index via the REST API
You can create the index with a custom mapping using the Elasticsearch REST API:
curl -X PUT "http://localhost:9200/my-index" \
-H "Content-Type: application/json" \
-d '{
"mappings": {
"properties": {
"embedding": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine"
},
"content": { "type": "text" },
"metadata": { "type": "object", "enabled": true }
}
}
}'
Key mapping fields
type: "dense_vector"-- Tells Elasticsearch this field stores a fixed-length float array for vector operations.dims-- Must match the dimensionality of your embedding model (e.g. 1536 fortext-embedding-3-small, 768 for many open-source models).index: true-- Enables the kNN search data structure. Without this, you can store vectors but cannot perform efficient approximate nearest-neighbor queries. Set totruefor production use.similarity-- Determines the distance function used for kNN search:"cosine"(default) -- Cosine similarity, recommended for most embedding models."dot_product"-- Dot product, best for unit-length normalized vectors."l2_norm"-- Euclidean distance.
Mapping for metadata filtering
If you plan to filter search results by metadata fields, add explicit mappings for those fields:
{
"mappings": {
"properties": {
"embedding": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine"
},
"content": { "type": "text" },
"metadata": {
"properties": {
"source": { "type": "keyword" },
"category": { "type": "keyword" },
"created_at": { "type": "date" }
}
}
}
}
}
Using keyword type for metadata fields enables exact-match filtering in kNN queries.
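For illustration, here is a raw kNN query that restricts results to documents whose metadata.category is "guides". This uses standard Elasticsearch _search syntax directly rather than a Synaptic API, and the query vector is a shortened placeholder (a real query supplies all 1536 dimensions):
curl -X POST "http://localhost:9200/my-index/_search" \
  -H "Content-Type: application/json" \
  -d '{
    "knn": {
      "field": "embedding",
      "query_vector": [0.12, -0.03, 0.51],
      "k": 5,
      "num_candidates": 50,
      "filter": {
        "term": { "metadata.category": "guides" }
      }
    }
  }'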
RAG Pipeline Example
Below is a complete Retrieval-Augmented Generation (RAG) pipeline that loads documents, splits them, embeds and stores them in Elasticsearch, then retrieves relevant context to answer a question.
use std::sync::Arc;
use synaptic::core::{
ChatModel, ChatRequest, Document, Embeddings, Message, Retriever, VectorStore,
};
use synaptic::elasticsearch::{ElasticsearchConfig, ElasticsearchVectorStore};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::vectorstores::VectorStoreRetriever;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Configure embeddings and LLM
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let llm = OpenAiChatModel::new("gpt-4o-mini");
// 2. Connect to Elasticsearch and create the index
let config = ElasticsearchConfig::new("http://localhost:9200", "rag_documents", 1536);
let store = ElasticsearchVectorStore::new(config);
store.ensure_index().await?;
// 3. Load and split documents
let raw_docs = vec![
Document::new("doc1", "Rust is a multi-paradigm, general-purpose programming language \
that emphasizes performance, type safety, and concurrency. It enforces memory safety \
without a garbage collector."),
Document::new("doc2", "Elasticsearch is a distributed, RESTful search and analytics engine. \
It supports vector search through dense_vector fields and approximate kNN queries, \
making it suitable for semantic search and RAG applications."),
];
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&raw_docs);
// 4. Embed and store in Elasticsearch
store.add_documents(chunks, embeddings.as_ref()).await?;
// 5. Create a retriever
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 3);
// 6. Retrieve relevant context
let query = "What is Rust?";
let relevant_docs = retriever.retrieve(query, 3).await?;
let context = relevant_docs
.iter()
.map(|doc| doc.content.as_str())
.collect::<Vec<_>>()
.join("\n\n");
// 7. Generate answer using retrieved context
let messages = vec![
Message::system("Answer the user's question based on the following context. \
If the context doesn't contain relevant information, say so.\n\n\
Context:\n{context}".replace("{context}", &context)),
Message::human(query),
];
let response = llm.chat(ChatRequest::new(messages)).await?;
println!("Answer: {}", response.message.content());
Ok(())
}
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
url | String | required | Elasticsearch server URL |
index_name | String | required | Name of the Elasticsearch index |
dims | u32 | required | Dimensionality of embedding vectors |
username | Option<String> | None | Username for basic auth |
password | Option<String> | None | Password for basic auth |
similarity | String | "cosine" | Similarity metric (cosine, dot_product, l2_norm) |
Redis Store & Cache
This guide shows how to use Redis for persistent key-value storage and LLM response caching in Synaptic. The redis integration provides two components:
- RedisStore -- implements the Store trait for namespace-scoped key-value storage.
- RedisCache -- implements the LlmCache trait for caching LLM responses with optional TTL.
Setup
Add the redis feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "redis"] }
Ensure you have a Redis server running:
docker run -p 6379:6379 redis:7
RedisStore
Creating a store
The simplest way to create a store is from a Redis URL:
use synaptic::redis::RedisStore;
let store = RedisStore::from_url("redis://127.0.0.1/")?;
Custom key prefix
By default, all keys are prefixed with "synaptic:store:". You can customize this:
use synaptic::redis::{RedisStore, RedisStoreConfig};
let config = RedisStoreConfig {
prefix: "myapp:store:".to_string(),
};
let store = RedisStore::from_url_with_config("redis://127.0.0.1/", config)?;
Using an existing client
If you already have a configured redis::Client, pass it directly:
use synaptic::redis::{RedisStore, RedisStoreConfig};
let client = redis::Client::open("redis://127.0.0.1/")?;
let store = RedisStore::new(client, RedisStoreConfig::default());
Storing and retrieving data
RedisStore implements the Store trait with full namespace support:
use synaptic::core::Store;
use serde_json::json;
// Put a value under a namespace
store.put(&["users", "prefs"], "theme", json!("dark")).await?;
// Retrieve the value
let item = store.get(&["users", "prefs"], "theme").await?;
if let Some(item) = item {
println!("Theme: {}", item.value); // "dark"
}
Searching within a namespace
Search for items using substring matching on keys and values:
store.put(&["docs"], "rust", json!("Rust is fast")).await?;
store.put(&["docs"], "python", json!("Python is flexible")).await?;
// Search with a query string (substring match)
let results = store.search(&["docs"], Some("fast"), 10).await?;
assert_eq!(results.len(), 1);
// Search without a query (list all items in namespace)
let all = store.search(&["docs"], None, 10).await?;
assert_eq!(all.len(), 2);
Deleting data
store.delete(&["users", "prefs"], "theme").await?;
Listing namespaces
List all known namespace paths, optionally filtered by prefix:
store.put(&["app", "settings"], "key1", json!("v1")).await?;
store.put(&["app", "cache"], "key2", json!("v2")).await?;
store.put(&["logs"], "key3", json!("v3")).await?;
// List all namespaces
let all_ns = store.list_namespaces(&[]).await?;
// [["app", "settings"], ["app", "cache"], ["logs"]]
// List namespaces under "app"
let app_ns = store.list_namespaces(&["app"]).await?;
// [["app", "settings"], ["app", "cache"]]
Using with agents
Pass the store to create_agent so that RuntimeAwareTool implementations receive it via ToolRuntime:
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::redis::RedisStore;
let store = Arc::new(RedisStore::from_url("redis://127.0.0.1/")?);
let options = AgentOptions {
store: Some(store),
..Default::default()
};
let graph = create_agent(model, tools, options)?;
RedisCache
Creating a cache
Create a cache from a Redis URL:
use synaptic::redis::RedisCache;
let cache = RedisCache::from_url("redis://127.0.0.1/")?;
Cache with TTL
Set a TTL (in seconds) so entries expire automatically:
use synaptic::redis::{RedisCache, RedisCacheConfig};
let config = RedisCacheConfig {
ttl: Some(3600), // 1 hour
..Default::default()
};
let cache = RedisCache::from_url_with_config("redis://127.0.0.1/", config)?;
Without a TTL, cached entries persist indefinitely until explicitly cleared.
Custom key prefix
The default cache prefix is "synaptic:cache:". Customize it to avoid collisions:
let config = RedisCacheConfig {
prefix: "myapp:llm_cache:".to_string(),
ttl: Some(1800), // 30 minutes
};
let cache = RedisCache::from_url_with_config("redis://127.0.0.1/", config)?;
Wrapping a ChatModel
Use CachedChatModel to cache responses from any ChatModel:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::cache::CachedChatModel;
use synaptic::redis::RedisCache;
use synaptic::openai::OpenAiChatModel;
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let cache = Arc::new(RedisCache::from_url("redis://127.0.0.1/")?);
let cached_model = CachedChatModel::new(model, cache);
// First call hits the LLM; identical requests return the cached response
Clearing the cache
Remove all cached entries:
use synaptic::core::LlmCache;
cache.clear().await?;
This deletes all Redis keys matching the cache prefix.
Using an existing client
let client = redis::Client::open("redis://127.0.0.1/")?;
let cache = RedisCache::new(client, RedisCacheConfig::default());
Configuration reference
RedisStoreConfig
| Field | Type | Default | Description |
|---|---|---|---|
prefix | String | "synaptic:store:" | Key prefix for all store entries |
RedisCacheConfig
| Field | Type | Default | Description |
|---|---|---|---|
prefix | String | "synaptic:cache:" | Key prefix for all cache entries |
ttl | Option<u64> | None | TTL in seconds; None means entries never expire |
Key format
- Store keys: {prefix}{namespace_joined_by_colon}:{key} (e.g. synaptic:store:users:prefs:theme)
- Cache keys: {prefix}{key} (e.g. synaptic:cache:abc123)
- Namespace index: {prefix}__namespaces__ (a Redis SET tracking all namespace paths)
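To inspect the resulting layout on a live server, you can scan for the default prefixes with redis-cli (adjust the patterns if you customized the prefixes):
# Store entries
redis-cli --scan --pattern "synaptic:store:*"

# Cached LLM responses
redis-cli --scan --pattern "synaptic:cache:*"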
SQLite Cache
This guide shows how to use SQLite as a persistent LLM response cache in Synaptic. SqliteCache stores chat model responses locally so identical requests are served from disk without calling the LLM again.
Setup
Add the sqlite feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "sqlite"] }
No external service is required. The cache uses a local SQLite file (or an in-memory database for testing).
Configuration
File-based cache
Create a SqliteCacheConfig pointing to a database file:
use synaptic::sqlite::{SqliteCacheConfig, SqliteCache};
let config = SqliteCacheConfig::new("cache.db");
let cache = SqliteCache::new(config).await?;
The database file is created automatically if it does not exist. The constructor is async because it initializes the database schema.
In-memory cache
For testing or ephemeral use, create an in-memory SQLite cache:
let config = SqliteCacheConfig::in_memory();
let cache = SqliteCache::new(config).await?;
TTL (time-to-live)
Set an optional TTL so cached entries expire automatically:
use std::time::Duration;
let config = SqliteCacheConfig::new("cache.db")
.with_ttl(Duration::from_secs(3600)); // 1 hour
let cache = SqliteCache::new(config).await?;
Without a TTL, cached entries persist indefinitely.
Usage
Wrapping a ChatModel
Use CachedChatModel from synaptic-cache to wrap any ChatModel:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::cache::CachedChatModel;
use synaptic::sqlite::{SqliteCacheConfig, SqliteCache};
use synaptic::openai::OpenAiChatModel;
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let config = SqliteCacheConfig::new("llm_cache.db");
let cache = Arc::new(SqliteCache::new(config).await?);
let cached_model = CachedChatModel::new(model, cache);
// First call hits the LLM
let request = ChatRequest::new(vec![Message::human("What is Rust?")]);
let response = cached_model.chat(&request).await?;
// Second identical call returns the cached response instantly
let response2 = cached_model.chat(&request).await?;
Direct cache access
SqliteCache implements the LlmCache trait, so you can use it directly:
use synaptic::core::LlmCache;
// Look up a cached response by key
let cached = cache.lookup("some-cache-key").await?;
// Store a response
cache.update("some-cache-key", &response).await?;
// Clear all entries
cache.clear().await?;
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
path | String | required | Path to the SQLite database file (or ":memory:" for in-memory) |
ttl | Option<Duration> | None | Time-to-live for cache entries; None means entries never expire |
PDF Loader
This guide shows how to load documents from PDF files using Synaptic's PdfLoader. It extracts text content from PDFs and produces Document values that can be passed to text splitters, embeddings, and vector stores.
Setup
Add the pdf feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["pdf"] }
The PDF extraction is handled by the pdf_extract library, which is pulled in automatically.
Loading a PDF as a single document
By default, PdfLoader combines all pages into one Document:
use synaptic::pdf::{PdfLoader, Loader};
let loader = PdfLoader::new("report.pdf");
let docs = loader.load().await?;
assert_eq!(docs.len(), 1);
println!("Content: {}", docs[0].content);
println!("Source: {}", docs[0].metadata["source"]); // "report.pdf"
println!("Pages: {}", docs[0].metadata["total_pages"]); // e.g. 12
The document ID is set to the file path string. Metadata includes:
- source -- the file path
- total_pages -- the total number of pages in the PDF
Loading with one document per page
Use with_split_pages to produce a separate Document for each page:
use synaptic::pdf::{PdfLoader, Loader};
let loader = PdfLoader::with_split_pages("report.pdf");
let docs = loader.load().await?;
for doc in &docs {
println!(
"Page {}/{}: {}...",
doc.metadata["page"],
doc.metadata["total_pages"],
&doc.content[..80]
);
}
Each document has the following metadata:
- source -- the file path
- page -- the 1-based page number
- total_pages -- the total number of pages
Document IDs follow the format {path}:page_{n} (e.g. report.pdf:page_3). Empty pages are automatically skipped.
RAG pipeline with PDF
A common pattern is to load a PDF, split it into chunks, embed, and store for retrieval:
use synaptic::pdf::{PdfLoader, Loader};
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore, VectorStoreRetriever};
use synaptic::openai::OpenAiEmbeddings;
use synaptic::retrieval::Retriever;
use std::sync::Arc;
// 1. Load the PDF
let loader = PdfLoader::with_split_pages("manual.pdf");
let docs = loader.load().await?;
// 2. Split into chunks
let splitter = RecursiveCharacterTextSplitter::new(1000, 200);
let chunks = splitter.split_documents(&docs)?;
// 3. Embed and store
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(InMemoryVectorStore::new());
store.add_documents(chunks, embeddings.as_ref()).await?;
// 4. Retrieve
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("How do I configure the system?", 5).await?;
This works equally well with QdrantVectorStore or PgVectorStore in place of InMemoryVectorStore.
Processing multiple PDFs
Use DirectoryLoader with a glob filter, or load PDFs individually and merge the results:
use synaptic::pdf::{PdfLoader, Loader};
let paths = vec!["docs/intro.pdf", "docs/guide.pdf", "docs/reference.pdf"];
let mut all_docs = Vec::new();
for path in paths {
let loader = PdfLoader::with_split_pages(path);
let docs = loader.load().await?;
all_docs.extend(docs);
}
// all_docs now contains page-level documents from all three PDFs
How text extraction works
PdfLoader uses the pdf_extract library internally. Text extraction runs on a blocking thread via tokio::task::spawn_blocking to avoid blocking the async runtime.
Page boundaries are detected by form feed characters (\x0c) that pdf_extract inserts between pages. When using with_split_pages, the text is split on these characters and each non-empty segment becomes a document.
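A minimal sketch of that splitting step, shown standalone for illustration (this is not PdfLoader's actual source, just the same technique applied to a plain string):
// pdf_extract separates pages with form feed characters; splitting on them
// and dropping blank segments yields one chunk of text per non-empty page.
let extracted = "Page one text\x0cPage two text\x0c\x0cPage four text";
let pages: Vec<&str> = extracted
    .split('\x0c')
    .map(str::trim)
    .filter(|page| !page.is_empty())
    .collect();
assert_eq!(pages.len(), 3); // the empty third page is skipped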
Configuration reference
| Constructor | Behavior |
|---|---|
PdfLoader::new(path) | All pages combined into a single Document |
PdfLoader::with_split_pages(path) | One Document per page |
Metadata fields
| Field | Type | Present in | Description |
|---|---|---|---|
source | String | Both modes | The file path |
page | Number | Split pages only | 1-based page number |
total_pages | Number | Both modes | Total number of pages in the PDF |
Tavily Search Tool
This guide shows how to use the Tavily web search API as a tool in Synaptic. Tavily is a search engine optimized for LLM agents, returning concise and relevant results.
Setup
Add the tavily feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "tavily"] }
Set your Tavily API key:
export TAVILY_API_KEY="tvly-..."
Configuration
Create a TavilyConfig and build the tool:
use synaptic::tavily::{TavilyConfig, TavilySearchTool};
let config = TavilyConfig::new("your-tavily-api-key");
let tool = TavilySearchTool::new(config);
Max results
Control how many search results are returned (default is 5):
let config = TavilyConfig::new("your-tavily-api-key")
.with_max_results(10);
Search depth
Choose between "basic" (default) and "advanced" search depth. Advanced search performs deeper crawling for more comprehensive results:
let config = TavilyConfig::new("your-tavily-api-key")
.with_search_depth("advanced");
Usage
As a standalone tool
TavilySearchTool implements the Tool trait with the name "tavily_search". It accepts a JSON input with a "query" field:
use synaptic::core::Tool;
let result = tool.call(serde_json::json!({
"query": "latest Rust programming news"
})).await?;
println!("{}", result);
The result is a JSON string containing search results with titles, URLs, and content snippets.
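If you want to post-process the results yourself, the string can be parsed with serde_json. The field names below (results, title, url) are illustrative assumptions about the response shape rather than a documented schema:
use serde_json::Value;

// `result` is the JSON string returned by tool.call(...)
let parsed: Value = serde_json::from_str(&result)?;

// Assumed shape: a "results" array of objects with "title" and "url" fields
if let Some(items) = parsed.get("results").and_then(Value::as_array) {
    for item in items {
        let title = item.get("title").and_then(Value::as_str).unwrap_or("untitled");
        let url = item.get("url").and_then(Value::as_str).unwrap_or("");
        println!("{} -- {}", title, url);
    }
}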
With an agent
Register the tool with an agent so the LLM can invoke web searches:
use std::sync::Arc;
use synaptic::tavily::{TavilyConfig, TavilySearchTool};
use synaptic::tools::ToolRegistry;
use synaptic::graph::create_react_agent;
use synaptic::openai::OpenAiChatModel;
let search = TavilySearchTool::new(TavilyConfig::new("your-tavily-api-key"));
let mut registry = ToolRegistry::new();
registry.register(Arc::new(search));
let model = OpenAiChatModel::new("gpt-4o");
let agent = create_react_agent(Arc::new(model), registry)?;
The agent can now call tavily_search when it needs to look up current information.
Tool definition
The tool advertises the following schema to the LLM:
{
"name": "tavily_search",
"description": "Search the web for current information on a topic.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query"
}
},
"required": ["query"]
}
}
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Tavily API key |
max_results | usize | 5 | Maximum number of search results to return |
search_depth | String | "basic" | Search depth: "basic" or "advanced" |
Procedural Macros
The synaptic-macros crate ships 12 attribute macros that eliminate boilerplate
when building agents with Synaptic. Instead of manually implementing traits such
as Tool, AgentMiddleware, or Entrypoint, you annotate an ordinary function
and the macro generates the struct, the trait implementation, and a factory
function for you.
All macros live in the synaptic_macros crate and are re-exported through the
synaptic facade, so you can import them with:
use synaptic::macros::*; // all macros at once
use synaptic::macros::tool; // or pick individually
| Macro | Purpose | Page |
|---|---|---|
#[tool] | Define tools from functions | This page |
#[chain] | Create runnable chains | This page |
#[entrypoint] | Workflow entry points | This page |
#[task] | Trackable tasks | This page |
#[traceable] | Tracing instrumentation | This page |
#[before_agent] | Middleware: before agent loop | Middleware Macros |
#[before_model] | Middleware: before model call | Middleware Macros |
#[after_model] | Middleware: after model call | Middleware Macros |
#[after_agent] | Middleware: after agent loop | Middleware Macros |
#[wrap_model_call] | Middleware: wrap model call | Middleware Macros |
#[wrap_tool_call] | Middleware: wrap tool call | Middleware Macros |
#[dynamic_prompt] | Middleware: dynamic system prompt | Middleware Macros |
For complete end-to-end scenarios, see Macro Examples.
#[tool] -- Define Tools from Functions
#[tool] converts an async fn into a full Tool (or RuntimeAwareTool)
implementation. The macro generates:
- A struct named {PascalCase}Tool (e.g. web_search becomes WebSearchTool).
- An impl Tool for WebSearchTool block with name(), description(), parameters() (JSON Schema), and call().
- A factory function with the original name that returns Arc<dyn Tool>.
Basic Usage
use synaptic::macros::tool;
use synaptic::core::SynapticError;
/// Search the web for a given query.
#[tool]
async fn web_search(query: String) -> Result<String, SynapticError> {
Ok(format!("Results for '{}'", query))
}
// The macro produces:
// struct WebSearchTool;
// impl Tool for WebSearchTool { ... }
// fn web_search() -> Arc<dyn Tool> { ... }
let tool = web_search();
assert_eq!(tool.name(), "web_search");
Doc Comments as Description
The doc comment on the function becomes the tool description that is sent to the LLM. Write a clear, concise sentence -- this is what the model reads when deciding whether to call your tool.
/// Fetch the current weather for a city.
#[tool]
async fn get_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
let tool = get_weather();
assert_eq!(tool.description(), "Fetch the current weather for a city.");
You can also override the description explicitly:
#[tool(description = "Look up weather information.")]
async fn get_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
Parameter Types and JSON Schema
Each function parameter is mapped to a JSON Schema property automatically. The following type mappings are supported:
| Rust Type | JSON Schema |
|---|---|
String | {"type": "string"} |
i8, i16, i32, i64, u8, u16, u32, u64, usize, isize | {"type": "integer"} |
f32, f64 | {"type": "number"} |
bool | {"type": "boolean"} |
Vec<T> | {"type": "array", "items": <schema of T>} |
serde_json::Value | {"type": "object"} |
T: JsonSchema (with schemars feature) | Full schema from schemars |
Any other type (without schemars) | {"type": "object"} (fallback) |
Parameter doc comments become "description" in the JSON Schema, giving the LLM
extra context about what to pass:
#[tool]
async fn search(
/// The search query string
query: String,
/// Maximum number of results to return
max_results: i64,
) -> Result<String, SynapticError> {
Ok(format!("Searching '{}' (limit {})", query, max_results))
}
This generates a JSON Schema similar to:
{
"type": "object",
"properties": {
"query": { "type": "string", "description": "The search query string" },
"max_results": { "type": "integer", "description": "Maximum number of results to return" }
},
"required": ["query", "max_results"]
}
Custom Types with schemars
By default, custom struct parameters generate a minimal {"type": "object"} schema
with no field details — the LLM has no guidance about the struct's shape. To generate
full schemas for custom types, enable the schemars feature and derive JsonSchema
on your parameter types.
Enable the feature in your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["macros", "schemars"] }
schemars = { version = "0.8", features = ["derive"] }
Derive JsonSchema on your parameter types:
use schemars::JsonSchema;
use serde::Deserialize;
use synaptic::macros::tool;
use synaptic::core::SynapticError;
#[derive(Deserialize, JsonSchema)]
struct UserInfo {
/// User's display name
name: String,
/// Age in years
age: i32,
email: Option<String>,
}
/// Process user information.
#[tool]
async fn process_user(
/// The user to process
user: UserInfo,
/// Action to perform
action: String,
) -> Result<String, SynapticError> {
Ok(format!("{}: {}", user.name, action))
}
Without schemars, user generates:
{ "type": "object", "description": "The user to process" }
With schemars, user generates a full schema:
{
"type": "object",
"description": "The user to process",
"properties": {
"name": { "type": "string" },
"age": { "type": "integer", "format": "int32" },
"email": { "type": "string" }
},
"required": ["name", "age"]
}
Nested types work automatically — if UserInfo contained an Address struct that
also derives JsonSchema, the address schema is included via $defs references.
Note: Known primitive types (String, i32, Vec<T>, bool, etc.) always use the built-in hardcoded schemas regardless of whether schemars is enabled. Only unknown/custom types benefit from the schemars integration.
Optional Parameters (Option<T>)
Wrap a parameter in Option<T> to make it optional. Optional parameters are
excluded from the "required" array in the schema. At runtime, missing or
null JSON values are deserialized as None.
#[tool]
async fn search(
query: String,
/// Filter by language (optional)
language: Option<String>,
) -> Result<String, SynapticError> {
let lang = language.unwrap_or_else(|| "en".into());
Ok(format!("Searching '{}' in {}", query, lang))
}
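The schema generated for this function looks roughly like the following (the exact representation of the Option type may differ); the important part is that language is absent from required:
{
  "type": "object",
  "properties": {
    "query": { "type": "string" },
    "language": { "type": "string", "description": "Filter by language (optional)" }
  },
  "required": ["query"]
}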
Default Values (#[default = ...])
Use #[default = value] on a parameter to supply a compile-time default.
Parameters with defaults are not required in the schema, and the default is
recorded in the "default" field of the schema property.
#[tool]
async fn search(
query: String,
#[default = 10]
max_results: i64,
#[default = "en"]
language: String,
) -> Result<String, SynapticError> {
Ok(format!("Searching '{}' (max {}, lang {})", query, max_results, language))
}
If the LLM omits max_results, it defaults to 10. If it omits language,
it defaults to "en".
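The resulting schema is roughly the following, with only query required and each default recorded on its property:
{
  "type": "object",
  "properties": {
    "query": { "type": "string" },
    "max_results": { "type": "integer", "default": 10 },
    "language": { "type": "string", "default": "en" }
  },
  "required": ["query"]
}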
Custom Tool Name (#[tool(name = "...")])
By default the tool name matches the function name. Override it with the name
attribute when you need a different identifier exposed to the LLM:
#[tool(name = "google_search")]
async fn search(query: String) -> Result<String, SynapticError> {
Ok(format!("Searching for '{}'", query))
}
let tool = search();
assert_eq!(tool.name(), "google_search");
The factory function keeps the original Rust name (search()), but
tool.name() returns "google_search".
Struct Fields (#[field])
Some tools need to hold state — a database connection, an API client, a backend
reference, etc. Mark those parameters with #[field] and they become struct
fields instead of JSON Schema parameters. The factory function will require
these values at construction time, and they are hidden from the LLM entirely.
use std::sync::Arc;
use synaptic::core::SynapticError;
use serde_json::Value;
#[tool]
async fn db_lookup(
#[field] connection: Arc<String>,
/// The table to query
table: String,
) -> Result<String, SynapticError> {
Ok(format!("Querying {} on {}", table, connection))
}
// Factory now requires the field parameter:
let tool = db_lookup(Arc::new("postgres://localhost".into()));
assert_eq!(tool.name(), "db_lookup");
// Only "table" appears in the schema; "connection" is hidden
The macro generates a struct with the field:
struct DbLookupTool {
connection: Arc<String>,
}
You can combine #[field] with regular parameters, Option<T>, and
#[default = ...]. Multiple #[field] parameters are supported:
#[tool]
async fn annotate(
#[field] prefix: String,
#[field] suffix: String,
/// The input text
text: String,
#[default = 1]
repeat: i64,
) -> Result<String, SynapticError> {
let inner = text.repeat(repeat as usize);
Ok(format!("{}{}{}", prefix, inner, suffix))
}
let tool = annotate("<<".into(), ">>".into());
Note: #[field] and #[inject] cannot be used on the same parameter. Use #[field] when the value is provided at construction time; use #[inject] when it comes from the agent runtime.
Raw Arguments (#[args])
Some tools need to receive the raw JSON arguments without any deserialization —
for example, echo tools that forward the entire input, or tools that handle
arbitrary JSON payloads. Mark the parameter with #[args] and it will receive
the raw serde_json::Value passed to call() directly.
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use serde_json::{json, Value};
/// Echo the input back.
#[tool(name = "echo")]
async fn echo(#[args] args: Value) -> Result<Value, SynapticError> {
Ok(json!({"echo": args}))
}
let tool = echo();
assert_eq!(tool.name(), "echo");
// parameters() returns None — no JSON Schema is generated
assert!(tool.parameters().is_none());
The #[args] parameter:
- Receives the raw Value without any JSON Schema generation or deserialization
- Causes parameters() to return None (unless there are other normal parameters)
- Can be combined with #[field] parameters (struct fields are still supported)
- Cannot be combined with #[inject] on the same parameter
- At most one parameter can be marked #[args]
/// Echo with a configurable prefix.
#[tool]
async fn echo_with_prefix(
#[field] prefix: String,
#[args] args: Value,
) -> Result<Value, SynapticError> {
Ok(json!({"prefix": prefix, "data": args}))
}
let tool = echo_with_prefix(">>".into());
Runtime Injection (#[inject(state)], #[inject(store)], #[inject(tool_call_id)])
Some tools need access to agent runtime state that the LLM should not (and
cannot) provide. Mark those parameters with #[inject(...)] and they will be
populated from the ToolRuntime context instead of from the LLM-supplied JSON
arguments. Injected parameters are hidden from the JSON Schema entirely.
When any parameter uses #[inject(...)], the macro generates a
RuntimeAwareTool implementation (with call_with_runtime) instead of a plain
Tool.
There are three injection kinds:
| Annotation | Source | Typical Type |
|---|---|---|
#[inject(state)] | ToolRuntime::state (deserialized from Value) | Your state struct, or Value |
#[inject(store)] | ToolRuntime::store (cloned Option<Arc<dyn Store>>) | Arc<dyn Store> |
#[inject(tool_call_id)] | ToolRuntime::tool_call_id (the ID of the current call) | String |
use synaptic::core::{SynapticError, ToolRuntime};
use std::sync::Arc;
#[tool]
async fn save_note(
/// The note content
content: String,
/// Injected: the current tool call ID
#[inject(tool_call_id)]
call_id: String,
/// Injected: shared application state
#[inject(state)]
state: serde_json::Value,
) -> Result<String, SynapticError> {
Ok(format!("Saved note (call={}) with state {:?}", call_id, state))
}
// Factory returns Arc<dyn RuntimeAwareTool> instead of Arc<dyn Tool>
let tool = save_note();
The LLM only sees content in the schema; call_id and state are supplied
by the agent runtime automatically.
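Concretely, the parameters schema advertised for save_note contains only the content property, roughly:
{
  "type": "object",
  "properties": {
    "content": { "type": "string", "description": "The note content" }
  },
  "required": ["content"]
}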
#[chain] -- Create Runnable Chains
#[chain] wraps an async fn as a BoxRunnable. It is a lightweight way to
create composable runnable steps that can be piped together.
The macro generates:
- A private {name}_impl function containing the original body.
- A public factory function with the original name that returns a BoxRunnable<InputType, OutputType> backed by a RunnableLambda.
Output Type Inference
The macro automatically detects the return type:
| Return Type | Generated Type | Behavior |
|---|---|---|
Result<Value, _> | BoxRunnable<I, Value> | Serializes result to Value |
Result<String, _> | BoxRunnable<I, String> | Returns directly, no serialization |
Result<T, _> (any other) | BoxRunnable<I, T> | Returns directly, no serialization |
Basic Usage
use synaptic::macros::chain;
use synaptic::core::SynapticError;
use serde_json::Value;
// Value output — result is serialized to Value
#[chain]
async fn uppercase(input: Value) -> Result<Value, SynapticError> {
let s = input.as_str().unwrap_or_default().to_uppercase();
Ok(Value::String(s))
}
// `uppercase()` returns BoxRunnable<Value, Value>
let runnable = uppercase();
Typed Output
When the return type is not Value, the macro generates a typed runnable
without serialization overhead:
// String output — returns BoxRunnable<String, String>
#[chain]
async fn to_upper(s: String) -> Result<String, SynapticError> {
Ok(s.to_uppercase())
}
#[chain]
async fn exclaim(s: String) -> Result<String, SynapticError> {
Ok(format!("{}!", s))
}
// Typed chains compose naturally with |
let pipeline = to_upper() | exclaim();
let result = pipeline.invoke("hello".into(), &config).await?;
assert_eq!(result, "HELLO!");
Composition with |
Runnables support pipe-based composition. Chain multiple steps together by combining the factories:
#[chain]
async fn step_a(input: Value) -> Result<Value, SynapticError> {
// ... transform input ...
Ok(input)
}
#[chain]
async fn step_b(input: Value) -> Result<Value, SynapticError> {
// ... transform further ...
Ok(input)
}
// Compose into a pipeline: step_a | step_b
let pipeline = step_a() | step_b();
let config = RunnableConfig::default();
let result = pipeline.invoke(serde_json::json!("hello"), &config).await?;
Note: #[chain] does not accept any arguments. Attempting to write #[chain(name = "...")] will produce a compile error.
#[entrypoint] -- Workflow Entry Points
#[entrypoint] defines a LangGraph-style workflow entry point. The macro
generates a factory function that returns a synaptic::core::Entrypoint struct
containing the configuration and a boxed async closure.
The decorated function must:
- Be async.
- Accept exactly one parameter of type serde_json::Value.
- Return Result<Value, SynapticError>.
Basic Usage
use synaptic::macros::entrypoint;
use synaptic::core::SynapticError;
use serde_json::Value;
#[entrypoint]
async fn my_workflow(input: Value) -> Result<Value, SynapticError> {
// orchestrate agents, tools, subgraphs...
Ok(input)
}
let ep = my_workflow();
// ep.config.name == "my_workflow"
Attributes (name, checkpointer)
| Attribute | Default | Description |
|---|---|---|
name = "..." | function name | Override the entrypoint name |
checkpointer = "..." | None | Hint which checkpointer backend to use (e.g. "memory", "redis") |
#[entrypoint(name = "chat_bot", checkpointer = "memory")]
async fn my_workflow(input: Value) -> Result<Value, SynapticError> {
Ok(input)
}
let ep = my_workflow();
assert_eq!(ep.config.name, "chat_bot");
assert_eq!(ep.config.checkpointer.as_deref(), Some("memory"));
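The returned Entrypoint is executed through its boxed run closure, using the same calling pattern shown in the Macro Examples section (a minimal sketch; the input shape is whatever your workflow expects):
use serde_json::json;

let ep = my_workflow();
// ep.run is the boxed async closure generated by #[entrypoint]
let output = (ep.run)(json!({"message": "hello"})).await?;
assert_eq!(output["message"], "hello"); // my_workflow echoes its input back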
#[task] -- Trackable Tasks
#[task] marks an async function as a named task. This is useful inside
entrypoints for tracing and streaming identification. The macro:
- Renames the original function to {name}_impl.
- Creates a public wrapper function that defines a __TASK_NAME constant and delegates to the impl.
Basic Usage
use synaptic::macros::task;
use synaptic::core::SynapticError;
#[task]
async fn fetch_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
// Calling fetch_weather("Paris".into()) internally sets __TASK_NAME = "fetch_weather"
// and delegates to fetch_weather_impl("Paris".into()).
let result = fetch_weather("Paris".into()).await?;
Custom Task Name
Override the task name with name = "...":
#[task(name = "weather_lookup")]
async fn fetch_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
// __TASK_NAME is now "weather_lookup"
#[traceable] -- Tracing Instrumentation
#[traceable] adds tracing instrumentation to any function. It wraps the
function body in a tracing::info_span! with parameter values recorded as span
fields. For async functions, the span is propagated correctly using
tracing::Instrument.
Basic Usage
use synaptic::macros::traceable;
#[traceable]
async fn process_data(input: String, count: usize) -> String {
format!("{}: {}", input, count)
}
This generates code equivalent to:
async fn process_data(input: String, count: usize) -> String {
use tracing::Instrument;
let __span = tracing::info_span!(
"process_data",
input = tracing::field::debug(&input),
count = tracing::field::debug(&count),
);
async move {
format!("{}: {}", input, count)
}
.instrument(__span)
.await
}
For synchronous functions, the macro uses a span guard instead of Instrument:
#[traceable]
fn compute(x: i32, y: i32) -> i32 {
x + y
}
// Generates a span guard: let __enter = __span.enter();
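Put together, the synchronous expansion is roughly equivalent to:
fn compute(x: i32, y: i32) -> i32 {
    let __span = tracing::info_span!(
        "compute",
        x = tracing::field::debug(&x),
        y = tracing::field::debug(&y),
    );
    let __enter = __span.enter();
    x + y
}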
Custom Span Name
Override the default span name (which is the function name) with name = "...":
#[traceable(name = "data_pipeline")]
async fn process_data(input: String) -> String {
input.to_uppercase()
}
// The span is named "data_pipeline" instead of "process_data"
Skipping Parameters
Exclude sensitive or large parameters from being recorded in the span with
skip = "param1,param2":
#[traceable(skip = "api_key")]
async fn call_api(query: String, api_key: String) -> Result<String, SynapticError> {
// `query` is recorded in the span, `api_key` is not
Ok(format!("Called API with '{}'", query))
}
You can combine both attributes:
#[traceable(name = "api_call", skip = "api_key,secret")]
async fn call_api(query: String, api_key: String, secret: String) -> Result<String, SynapticError> {
Ok("done".into())
}
Middleware Macros
Synaptic provides seven macros for defining agent middleware. Each one generates:
- A struct named {PascalCase}Middleware (e.g. log_response becomes LogResponseMiddleware).
- An impl AgentMiddleware for {PascalCase}Middleware with the corresponding hook method overridden.
- A factory function with the original name that returns Arc<dyn AgentMiddleware>.
None of the middleware macros accept attribute arguments. However, all middleware
macros support #[field] parameters for building stateful middleware (see
Stateful Middleware with #[field] below).
#[before_agent]
Runs before the agent loop starts. The function receives a mutable reference to the message list.
Signature: async fn(messages: &mut Vec<Message>) -> Result<(), SynapticError>
use synaptic::macros::before_agent;
use synaptic::core::{Message, SynapticError};
#[before_agent]
async fn inject_system(messages: &mut Vec<Message>) -> Result<(), SynapticError> {
println!("Starting agent with {} messages", messages.len());
Ok(())
}
let mw = inject_system(); // Arc<dyn AgentMiddleware>
#[before_model]
Runs before each model call. Use this to modify the request (e.g., add headers, tweak temperature, inject a system prompt).
Signature: async fn(request: &mut ModelRequest) -> Result<(), SynapticError>
use synaptic::macros::before_model;
use synaptic::middleware::ModelRequest;
use synaptic::core::SynapticError;
#[before_model]
async fn set_temperature(request: &mut ModelRequest) -> Result<(), SynapticError> {
request.temperature = Some(0.7);
Ok(())
}
let mw = set_temperature(); // Arc<dyn AgentMiddleware>
#[after_model]
Runs after each model call. Use this to inspect or mutate the response.
Signature: async fn(request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError>
use synaptic::macros::after_model;
use synaptic::middleware::{ModelRequest, ModelResponse};
use synaptic::core::SynapticError;
#[after_model]
async fn log_usage(request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError> {
if let Some(usage) = &response.usage {
println!("Tokens used: {}", usage.total_tokens);
}
Ok(())
}
let mw = log_usage(); // Arc<dyn AgentMiddleware>
#[after_agent]
Runs after the agent loop finishes. Receives the final message list.
Signature: async fn(messages: &mut Vec<Message>) -> Result<(), SynapticError>
use synaptic::macros::after_agent;
use synaptic::core::{Message, SynapticError};
#[after_agent]
async fn summarize(messages: &mut Vec<Message>) -> Result<(), SynapticError> {
println!("Agent finished with {} messages", messages.len());
Ok(())
}
let mw = summarize(); // Arc<dyn AgentMiddleware>
#[wrap_model_call]
Wraps the model call with custom logic, giving you full control over whether and how the underlying model is invoked. This is the right hook for retries, fallbacks, caching, or circuit-breaker patterns.
Signature: async fn(request: ModelRequest, next: &dyn ModelCaller) -> Result<ModelResponse, SynapticError>
use synaptic::macros::wrap_model_call;
use synaptic::middleware::{ModelRequest, ModelResponse, ModelCaller};
use synaptic::core::SynapticError;
#[wrap_model_call]
async fn retry_once(
request: ModelRequest,
next: &dyn ModelCaller,
) -> Result<ModelResponse, SynapticError> {
match next.call(request.clone()).await {
Ok(response) => Ok(response),
Err(_) => next.call(request).await, // retry once
}
}
let mw = retry_once(); // Arc<dyn AgentMiddleware>
#[wrap_tool_call]
Wraps individual tool calls. Same pattern as #[wrap_model_call] but for tool
invocations. Useful for logging, permission checks, or sandboxing.
Signature: async fn(request: ToolCallRequest, next: &dyn ToolCaller) -> Result<Value, SynapticError>
use synaptic::macros::wrap_tool_call;
use synaptic::middleware::{ToolCallRequest, ToolCaller};
use synaptic::core::SynapticError;
use serde_json::Value;
#[wrap_tool_call]
async fn log_tool(
request: ToolCallRequest,
next: &dyn ToolCaller,
) -> Result<Value, SynapticError> {
println!("Calling tool: {}", request.call.name);
let result = next.call(request).await?;
println!("Tool returned: {}", result);
Ok(result)
}
let mw = log_tool(); // Arc<dyn AgentMiddleware>
#[dynamic_prompt]
Generates a system prompt dynamically based on the current conversation. Unlike
the other middleware macros, the decorated function is synchronous (not
async). It reads the message history and returns a String that is set as the
system prompt before each model call.
Under the hood, the macro generates a middleware whose before_model hook sets
request.system_prompt to the return value of your function.
Signature: fn(messages: &[Message]) -> String
use synaptic::macros::dynamic_prompt;
use synaptic::core::Message;
#[dynamic_prompt]
fn context_aware_prompt(messages: &[Message]) -> String {
if messages.len() > 10 {
"Be concise. The conversation is getting long.".into()
} else {
"Be thorough and detailed in your responses.".into()
}
}
let mw = context_aware_prompt(); // Arc<dyn AgentMiddleware>
Why is #[dynamic_prompt] synchronous?
Unlike the other middleware macros, #[dynamic_prompt] takes a plain fn instead of async fn. This is a deliberate design choice:
- Pure computation -- Dynamic prompt generation typically involves inspecting the message list and building a string. These are pure CPU operations (pattern matching, string formatting) with no I/O involved. Making them async would add unnecessary overhead (Future state machine, poll machinery) for zero benefit.
- Simplicity -- Synchronous functions are easier to write and reason about. No .await, no pinning, no Send/Sync bounds to worry about.
- Internal async wrapping -- The macro generates a before_model hook that calls your sync function inside an async context. The hook itself is async (as required by AgentMiddleware), but your function doesn't need to be.
If you need async operations in your prompt generation (e.g., fetching context from a database or calling an API), use #[before_model] directly and set request.system_prompt yourself:
#[before_model]
async fn async_prompt(request: &mut ModelRequest) -> Result<(), SynapticError> {
    let context = fetch_from_database().await?; // async I/O
    request.system_prompt = Some(format!("Context: {}", context));
    Ok(())
}
Stateful Middleware with #[field]
All middleware macros support #[field] parameters — function parameters that
become struct fields rather than trait method parameters. This lets you build
middleware with configuration state, just like #[tool] tools with #[field].
Field parameters must come before the trait-mandated parameters. The factory function will accept the field values, and the generated struct stores them.
Example: Retry middleware with configurable retries
use std::time::Duration;
use synaptic::macros::wrap_tool_call;
use synaptic::middleware::{ToolCallRequest, ToolCaller};
use synaptic::core::SynapticError;
use serde_json::Value;
#[wrap_tool_call]
async fn tool_retry(
#[field] max_retries: usize,
#[field] base_delay: Duration,
request: ToolCallRequest,
next: &dyn ToolCaller,
) -> Result<Value, SynapticError> {
let mut last_err = None;
for attempt in 0..=max_retries {
match next.call(request.clone()).await {
Ok(val) => return Ok(val),
Err(e) => {
last_err = Some(e);
if attempt < max_retries {
let delay = base_delay * 2u32.saturating_pow(attempt as u32);
tokio::time::sleep(delay).await;
}
}
}
}
Err(last_err.unwrap())
}
// Factory function accepts the field values:
let mw = tool_retry(3, Duration::from_millis(100));
Example: Model fallback with alternative models
use std::sync::Arc;
use synaptic::macros::wrap_model_call;
use synaptic::middleware::{BaseChatModelCaller, ModelRequest, ModelResponse, ModelCaller};
use synaptic::core::{ChatModel, SynapticError};
#[wrap_model_call]
async fn model_fallback(
#[field] fallbacks: Vec<Arc<dyn ChatModel>>,
request: ModelRequest,
next: &dyn ModelCaller,
) -> Result<ModelResponse, SynapticError> {
match next.call(request.clone()).await {
Ok(resp) => Ok(resp),
Err(primary_err) => {
for fallback in &fallbacks {
let caller = BaseChatModelCaller::new(fallback.clone());
if let Ok(resp) = caller.call(request.clone()).await {
return Ok(resp);
}
}
Err(primary_err)
}
}
}
let mw = model_fallback(vec![backup_model]);
Example: Dynamic prompt with branding
use synaptic::macros::dynamic_prompt;
use synaptic::core::Message;
#[dynamic_prompt]
fn branded_prompt(#[field] brand: String, messages: &[Message]) -> String {
format!("[{}] You have {} messages", brand, messages.len())
}
let mw = branded_prompt("Acme Corp".into());
Macro Examples
The following end-to-end scenarios show how the macros work together in realistic applications.
Scenario A: Weather Agent with Custom Tool
This example defines a tool with #[tool] and a #[field] for an API key,
registers it, creates a ReAct agent with create_react_agent, and runs a
query.
use synaptic::core::{ChatModel, Message, SynapticError};
use synaptic::graph::{create_react_agent, MessageState, GraphResult};
use synaptic::macros::tool;
use synaptic::models::ScriptedChatModel;
use std::sync::Arc;
/// Get the current weather for a city.
#[tool]
async fn get_weather(
#[field] api_key: String,
/// City name to look up
city: String,
) -> Result<String, SynapticError> {
// In production, call a real weather API with api_key
Ok(format!("72°F and sunny in {}", city))
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let tool = get_weather("sk-fake-key".into());
let tools: Vec<Arc<dyn synaptic::core::Tool>> = vec![tool];
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![/* ... */]));
let agent = create_react_agent(model, tools).compile()?;
let state = MessageState::from_messages(vec![
Message::human("What's the weather in Tokyo?"),
]);
let result = agent.invoke(state, None).await?;
println!("{:?}", result.into_state().messages);
Ok(())
}
Scenario B: Data Pipeline with Chain Macros
This example composes multiple #[chain] steps into a processing pipeline
that extracts text, normalizes it, and counts words.
use synaptic::core::{RunnableConfig, SynapticError};
use synaptic::macros::chain;
use synaptic::runnables::Runnable;
use serde_json::{json, Value};
#[chain]
async fn extract_text(input: Value) -> Result<Value, SynapticError> {
let text = input["content"].as_str().unwrap_or("");
Ok(json!(text.to_string()))
}
#[chain]
async fn normalize(input: Value) -> Result<Value, SynapticError> {
let text = input.as_str().unwrap_or("").to_lowercase().trim().to_string();
Ok(json!(text))
}
#[chain]
async fn word_count(input: Value) -> Result<Value, SynapticError> {
let text = input.as_str().unwrap_or("");
let count = text.split_whitespace().count();
Ok(json!({"text": text, "word_count": count}))
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let pipeline = extract_text() | normalize() | word_count();
let config = RunnableConfig::default();
let input = json!({"content": " Hello World from Synaptic! "});
let result = pipeline.invoke(input, &config).await?;
println!("Result: {}", result);
// {"text": "hello world from synaptic!", "word_count": 4}
Ok(())
}
Scenario C: Agent with Middleware Stack
This example combines middleware macros into a real agent with logging, retry, and dynamic prompting.
use synaptic::core::{Message, SynapticError};
use synaptic::macros::{after_model, dynamic_prompt, wrap_model_call};
use synaptic::middleware::{AgentMiddleware, MiddlewareChain, ModelRequest, ModelResponse, ModelCaller};
use std::sync::Arc;
// Log every model call
#[after_model]
async fn log_response(request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError> {
println!("[LOG] Model responded with {} chars",
response.message.content().len());
Ok(())
}
// Retry failed model calls up to 2 times
#[wrap_model_call]
async fn retry_model(
#[field] max_retries: usize,
request: ModelRequest,
next: &dyn ModelCaller,
) -> Result<ModelResponse, SynapticError> {
let mut last_err = None;
for _ in 0..=max_retries {
match next.call(request.clone()).await {
Ok(resp) => return Ok(resp),
Err(e) => last_err = Some(e),
}
}
Err(last_err.unwrap())
}
// Dynamic system prompt based on conversation length
#[dynamic_prompt]
fn adaptive_prompt(messages: &[Message]) -> String {
if messages.len() > 20 {
"Be concise. Summarize rather than elaborate.".into()
} else {
"You are a helpful assistant. Be thorough.".into()
}
}
fn build_middleware_stack() -> Vec<Arc<dyn AgentMiddleware>> {
vec![
adaptive_prompt(),
retry_model(2),
log_response(),
]
}
Scenario D: Store-Backed Note Manager with Typed Input
This example combines #[inject] for runtime access and schemars for rich
JSON Schema generation. A save_note tool accepts a custom NoteInput struct
whose full schema (title, content, tags) is visible to the LLM, while the
shared store and tool call ID are injected transparently by the agent runtime.
Cargo.toml -- enable the agent, store, and schemars features:
[dependencies]
synaptic = { version = "0.2", features = ["agent", "store", "schemars"] }
schemars = { version = "0.8", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Full example:
use std::sync::Arc;
use schemars::JsonSchema;
use serde::Deserialize;
use serde_json::json;
use synaptic::core::{Store, SynapticError};
use synaptic::macros::tool;
// --- Custom input type with schemars ---
// Deriving JsonSchema gives the LLM a complete description of every field,
// including the nested Vec<String> for tags.
#[derive(Deserialize, JsonSchema)]
struct NoteInput {
/// Title of the note
title: String,
/// Body content of the note (Markdown supported)
content: String,
/// Tags for categorisation (e.g. ["work", "urgent"])
tags: Vec<String>,
}
// --- What the LLM sees (with schemars enabled) ---
//
// The generated JSON Schema for the `note` parameter looks like:
//
// {
// "type": "object",
// "properties": {
// "title": { "type": "string", "description": "Title of the note" },
// "content": { "type": "string", "description": "Body content of the note (Markdown supported)" },
// "tags": { "type": "array", "items": { "type": "string" },
// "description": "Tags for categorisation (e.g. [\"work\", \"urgent\"])" }
// },
// "required": ["title", "content", "tags"]
// }
//
// --- Without schemars, the same parameter would produce only: ---
//
// { "type": "object" }
//
// ...giving the LLM no guidance about the expected fields.
/// Save a note to the shared store.
#[tool]
async fn save_note(
/// The note to save (title, content, and tags)
note: NoteInput,
/// Injected: persistent key-value store
#[inject(store)]
store: Arc<dyn Store>,
/// Injected: the current tool call ID for traceability
#[inject(tool_call_id)]
call_id: String,
) -> Result<String, SynapticError> {
// Build a unique key from the tool call ID
let key = format!("note:{}", call_id);
// Persist the note as a JSON item in the store
let value = json!({
"title": note.title,
"content": note.content,
"tags": note.tags,
"call_id": call_id,
});
store.put("notes", &key, value.clone()).await?;
Ok(format!(
"Saved note '{}' with {} tag(s) [key={}]",
note.title,
note.tags.len(),
key,
))
}
// Usage:
// let tool = save_note(); // Arc<dyn RuntimeAwareTool>
// assert_eq!(tool.name(), "save_note");
//
// The LLM sees only the `note` parameter in the schema.
// `store` and `call_id` are injected by ToolNode at runtime.
Key takeaways:
- NoteInput derives both Deserialize (for runtime deserialization) and JsonSchema (for compile-time schema generation). The schemars feature must be enabled in Cargo.toml for the #[tool] macro to pick up the derived schema.
- #[inject(store)] gives the tool direct access to the shared Store without exposing it to the LLM. The ToolNode populates the store from ToolRuntime before each call.
- #[inject(tool_call_id)] provides a unique identifier for the current invocation, useful for creating deterministic storage keys or audit trails.
- Because #[inject] is present, the macro generates a RuntimeAwareTool (not a plain Tool). The factory function returns Arc<dyn RuntimeAwareTool>.
Scenario E: Workflow with Entrypoint, Tasks, and Tracing
This scenario demonstrates #[entrypoint], #[task], and #[traceable]
working together to build an instrumented data pipeline.
use synaptic::core::SynapticError;
use synaptic::macros::{entrypoint, task, traceable};
use serde_json::{json, Value};
// A helper that calls an external API. The #[traceable] macro wraps it
// in a tracing span. We skip the api_key so it never appears in logs.
#[traceable(name = "external_api_call", skip = "api_key")]
async fn call_external_api(
url: String,
api_key: String,
) -> Result<Value, SynapticError> {
// In production: reqwest::get(...).await
Ok(json!({"status": "ok", "data": [1, 2, 3]}))
}
// Each #[task] gets a stable name used by streaming and tracing.
#[task(name = "fetch")]
async fn fetch_data(source: String) -> Result<Value, SynapticError> {
let api_key = std::env::var("API_KEY").unwrap_or_default();
let result = call_external_api(source, api_key).await?;
Ok(result)
}
#[task(name = "transform")]
async fn transform_data(raw: Value) -> Result<Value, SynapticError> {
let items = raw["data"].as_array().cloned().unwrap_or_default();
let doubled: Vec<Value> = items
.iter()
.filter_map(|v| v.as_i64())
.map(|n| json!(n * 2))
.collect();
Ok(json!({"transformed": doubled}))
}
// The entrypoint ties the workflow together with a name and checkpointer.
#[entrypoint(name = "data_pipeline", checkpointer = "memory")]
async fn run_pipeline(input: Value) -> Result<Value, SynapticError> {
let source = input["source"].as_str().unwrap_or("default").to_string();
let raw = fetch_data(source).await?;
let result = transform_data(raw).await?;
Ok(result)
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// Set up tracing to see the spans emitted by #[traceable] and #[task]:
// tracing_subscriber::fmt()
// .with_max_level(tracing::Level::INFO)
// .init();
let ep = run_pipeline();
let output = (ep.run)(json!({"source": "https://api.example.com/data"})).await?;
println!("Pipeline output: {}", output);
Ok(())
}
Key takeaways:
- #[task] gives each step a stable name ("fetch", "transform") that appears in streaming events and tracing spans, making it easy to identify which step is running or failed.
- #[traceable] instruments any function with an automatic tracing span. Use skip = "api_key" to keep secrets out of your traces.
- #[entrypoint] ties the workflow together with a logical name and an optional checkpointer hint for state persistence.
- These macros are composable -- use them in any combination. A #[task] can call a #[traceable] helper, and an #[entrypoint] can orchestrate any number of #[task] functions.
Scenario F: Tool Permission Gating with Audit Logging
This scenario demonstrates #[wrap_tool_call] with an allowlist field for
permission gating, plus #[before_agent] and #[after_agent] for lifecycle
audit logging.
use std::sync::Arc;
use synaptic::core::{Message, SynapticError};
use synaptic::macros::{before_agent, after_agent, wrap_tool_call};
use synaptic::middleware::{AgentMiddleware, ToolCallRequest, ToolCaller};
use serde_json::Value;
// --- Permission gating ---
// Only allow tools whose names appear in the allowlist.
// If the LLM tries to call a tool not in the list, return an error.
#[wrap_tool_call]
async fn permission_gate(
#[field] allowed_tools: Vec<String>,
request: ToolCallRequest,
next: &dyn ToolCaller,
) -> Result<Value, SynapticError> {
if !allowed_tools.contains(&request.call.name) {
return Err(SynapticError::Tool(format!(
"Tool '{}' is not in the allowed list: {:?}",
request.call.name, allowed_tools,
)));
}
next.call(request).await
}
// --- Audit: before agent ---
// Log the number of messages when the agent starts.
#[before_agent]
async fn audit_start(
#[field] label: String,
messages: &mut Vec<Message>,
) -> Result<(), SynapticError> {
println!("[{}] Agent starting with {} messages", label, messages.len());
Ok(())
}
// --- Audit: after agent ---
// Log the number of messages when the agent finishes.
#[after_agent]
async fn audit_end(
#[field] label: String,
messages: &mut Vec<Message>,
) -> Result<(), SynapticError> {
println!("[{}] Agent completed with {} messages", label, messages.len());
Ok(())
}
// --- Assemble the middleware stack ---
fn build_secured_stack() -> Vec<Arc<dyn AgentMiddleware>> {
let allowed = vec![
"web_search".to_string(),
"get_weather".to_string(),
];
vec![
audit_start("prod-agent".into()),
permission_gate(allowed),
audit_end("prod-agent".into()),
]
}
Key takeaways:
- #[wrap_tool_call] gives full control over tool execution. Check permissions, transform arguments, or deny the call entirely by returning an error instead of calling next.call().
- #[before_agent] and #[after_agent] bracket the entire agent lifecycle, making them ideal for audit logging, metrics collection, or resource setup/teardown.
- #[field] makes each middleware configurable and reusable. The permission_gate can be instantiated with different allowlists for different agents, and the audit middleware accepts a label for log disambiguation.
Scenario G: State-Aware Tool with Raw Arguments
This scenario demonstrates #[inject(state)] for reading graph state and
#[args] for accepting raw JSON payloads, plus a combination of both
patterns with #[field].
use serde::Deserialize;
use serde_json::Value;
use synaptic::core::SynapticError;
use synaptic::macros::tool;
// --- State-aware tool ---
// Reads the graph state to adjust its behavior. After 10 conversation
// turns the tool switches to shorter replies.
#[derive(Deserialize)]
struct ConversationState {
turn_count: usize,
}
/// Generate a context-aware reply.
#[tool]
async fn smart_reply(
/// The user's latest message
message: String,
#[inject(state)]
state: ConversationState,
) -> Result<String, SynapticError> {
if state.turn_count > 10 {
// After 10 turns, keep it short
Ok(format!("TL;DR: {}", &message[..message.len().min(50)]))
} else {
Ok(format!(
"Turn {}: Let me elaborate on '{}'...",
state.turn_count, message
))
}
}
// --- Raw-args JSON proxy ---
// Accepts any JSON payload and forwards it to a webhook endpoint.
// No schema is generated -- the LLM sends whatever JSON it wants.
/// Forward a JSON payload to an external webhook.
#[tool(name = "webhook_forward")]
async fn webhook_forward(#[args] payload: Value) -> Result<String, SynapticError> {
// In production: reqwest::Client::new().post(url).json(&payload).send().await
Ok(format!("Forwarded payload with {} keys", payload.as_object().map_or(0, |m| m.len())))
}
// --- Configurable API proxy ---
// Combines #[field] for a base endpoint with #[args] for the request body.
// Each instance points at a different API.
/// Proxy arbitrary JSON to a configured API endpoint.
#[tool(name = "api_proxy")]
async fn api_proxy(
#[field] endpoint: String,
#[args] body: Value,
) -> Result<String, SynapticError> {
// In production: reqwest::Client::new().post(&endpoint).json(&body).send().await
Ok(format!(
"POST {} with {} bytes",
endpoint,
body.to_string().len()
))
}
fn main() {
// State-aware tool -- the LLM only sees "message" in the schema
let reply_tool = smart_reply();
// Raw-args tool -- parameters() returns None
let webhook_tool = webhook_forward();
// Configurable proxy -- each instance targets a different endpoint
let users_api = api_proxy("https://api.example.com/users".into());
let orders_api = api_proxy("https://api.example.com/orders".into());
}
Key takeaways:
- #[inject(state)] gives tools read access to the current graph state without exposing it to the LLM. The state is deserialized from ToolRuntime::state into your custom struct automatically.
- #[args] bypasses schema generation entirely -- the tool accepts whatever JSON the LLM sends. Use this for proxy/forwarding patterns or tools that handle arbitrary payloads. parameters() returns None when #[args] is the only non-field, non-inject parameter.
- #[field] + #[args] combine naturally. The field is provided at construction time (hidden from the LLM), while the raw JSON arrives at call time. This makes it easy to create reusable tool templates that differ only in configuration.
Comparison with Python LangChain
If you are coming from Python LangChain / LangGraph, here is how the Synaptic macros map to their Python equivalents:
| Python | Rust (Synaptic) | Notes |
|---|---|---|
| @tool | #[tool] | Both generate a tool from a function; Rust version produces a struct + trait impl |
| RunnableLambda(fn) | #[chain] | Rust version returns BoxRunnable<I, O> with auto-detected output type |
| @entrypoint | #[entrypoint] | Both define a workflow entry point; Rust adds checkpointer hint |
| @task | #[task] | Both mark a function as a named sub-task |
| Middleware classes | #[before_agent], #[before_model], #[after_model], #[after_agent], #[wrap_model_call], #[wrap_tool_call], #[dynamic_prompt] | Rust splits each hook into its own macro for clarity |
| @traceable | #[traceable] | Rust uses tracing crate spans; Python uses LangSmith |
| InjectedState, InjectedStore, InjectedToolCallId | #[inject(state)], #[inject(store)], #[inject(tool_call_id)] | Rust uses parameter-level attributes instead of type annotations |
How Tool Definitions Reach the LLM
Understanding the full journey from a Rust function to an LLM tool call helps debug schema issues and customize behavior. Here is the complete chain:
#[tool] macro
|
v
struct + impl Tool (generated at compile time)
|
v
tool.as_tool_definition() -> ToolDefinition { name, description, parameters }
|
v
ChatRequest::with_tools(vec![...]) (tool definitions attached to request)
|
v
Model Adapter (OpenAI / Anthropic / Gemini)
| Converts ToolDefinition -> provider-specific JSON
| e.g. OpenAI: {"type": "function", "function": {"name": ..., "parameters": ...}}
v
HTTP POST -> LLM API
|
v
LLM returns ToolCall { id, name, arguments }
|
v
ToolNode dispatches -> tool.call(arguments)
|
v
Tool Message back into conversation
Key files in the codebase:
| Step | File |
|---|---|
| #[tool] macro expansion | crates/synaptic-macros/src/tool.rs |
| Tool / RuntimeAwareTool traits | crates/synaptic-core/src/lib.rs |
| ToolDefinition, ToolCall types | crates/synaptic-core/src/lib.rs |
| ToolNode (dispatches calls) | crates/synaptic-graph/src/tool_node.rs |
| OpenAI adapter | crates/synaptic-openai/src/lib.rs |
| Anthropic adapter | crates/synaptic-anthropic/src/lib.rs |
| Gemini adapter | crates/synaptic-gemini/src/lib.rs |
Testing Macro-Generated Code
Tools generated by #[tool] can be tested like any other Tool implementation. Call as_tool_definition() to inspect the schema and call() to verify behavior:
use serde_json::json;
use synaptic::core::{SynapticError, Tool};
use synaptic::macros::tool;
/// Add two numbers.
#[tool]
async fn add(
/// The first number
a: f64,
/// The second number
b: f64,
) -> Result<serde_json::Value, SynapticError> {
Ok(json!({"result": a + b}))
}
#[tokio::test]
async fn test_add_tool() {
let tool = add();
// Verify metadata
assert_eq!(tool.name(), "add");
assert_eq!(tool.description(), "Add two numbers.");
// Verify schema
let def = tool.as_tool_definition();
let required = def.parameters["required"].as_array().unwrap();
assert!(required.contains(&json!("a")));
assert!(required.contains(&json!("b")));
// Verify execution
let result = tool.call(json!({"a": 3.0, "b": 4.0})).await.unwrap();
assert_eq!(result["result"], 7.0);
}
For #[chain] macros, test the returned BoxRunnable with invoke():
use synaptic::core::{RunnableConfig, SynapticError};
use synaptic::macros::chain;
use synaptic::runnables::Runnable;
#[chain]
async fn to_upper(s: String) -> Result<String, SynapticError> {
Ok(s.to_uppercase())
}
#[tokio::test]
async fn test_chain() {
let runnable = to_upper();
let config = RunnableConfig::default();
let result = runnable.invoke("hello".into(), &config).await.unwrap();
assert_eq!(result, "HELLO");
}
What can go wrong
- Custom types without schemars: The parameter schema is {"type": "object"} with no field details. The LLM guesses (often incorrectly) what to send. Fix: Enable the schemars feature and derive JsonSchema.
- Missing as_tool_definition() call: If you construct ToolDefinition manually with json!({}) for parameters instead of calling tool.as_tool_definition(), the schema will be empty. Fix: Always use as_tool_definition() on your Tool / RuntimeAwareTool.
- OpenAI strict mode: OpenAI's function calling strict mode rejects schemas with missing type fields. All built-in types and Value now generate valid schemas with "type" specified.
Architecture
Synaptic is organized as a workspace of focused Rust crates. Each crate owns exactly one concern, and they compose together through shared traits defined in a single core crate. This page explains the layered design, the principles behind it, and how the crates depend on each other.
Design Principles
Async-first. Every trait in Synaptic is async via #[async_trait], and the runtime is tokio. This is not an afterthought bolted onto a synchronous API -- async is the foundation. LLM calls, tool execution, memory access, and embedding queries are all naturally asynchronous operations, and Synaptic models them as such from the start.
One crate, one concern. Each provider has its own crate: synaptic-openai, synaptic-anthropic, synaptic-gemini, synaptic-ollama. The synaptic-tools crate knows how to register and execute tools. The synaptic-memory crate knows how to store and retrieve conversation history. No crate does two jobs. This keeps compile times manageable, makes it possible to use only what you need, and ensures that changes to one subsystem do not cascade across the codebase.
Shared traits in core. The synaptic-core crate defines every trait and type that crosses crate boundaries: ChatModel, Tool, MemoryStore, CallbackHandler, Message, ChatRequest, ChatResponse, ToolCall, SynapticError, RunnableConfig, and more. Implementation crates depend on core, never on each other (unless composition requires it).
Concurrency-safe by default. Shared registries use Arc<RwLock<_>> (standard library RwLock for low-contention read-heavy data like tool registries). Mutable state that requires async access -- callbacks, memory stores, checkpointers -- uses Arc<tokio::sync::Mutex<_>> or Arc<tokio::sync::RwLock<_>>. All core traits require Send + Sync.
Session isolation. Memory, agent runs, and graph checkpoints are keyed by a session or thread identifier. Two concurrent conversations never interfere with each other, even when they share the same model and tool instances.
Event-driven observability. The RunEvent enum captures every significant lifecycle event (run started, LLM called, tool called, run finished, run failed). Callback handlers receive these events asynchronously, enabling logging, tracing, recording, and custom side effects without modifying application code.
The Four Layers
Synaptic's crates fall into four layers, each building on the ones below it.
Layer 1: Core
synaptic-core is the foundation. It defines:
- Traits: ChatModel, Tool, MemoryStore, CallbackHandler
- Message types: The Message enum (System, Human, AI, Tool, Chat, Remove), AIMessageChunk for streaming, ToolCall, ToolDefinition, ToolChoice
- Request/response: ChatRequest, ChatResponse, TokenUsage
- Streaming: The ChatStream type alias (Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send>>)
- Configuration: RunnableConfig (tags, metadata, concurrency limits, run IDs)
- Events: RunEvent enum with six lifecycle variants
- Errors: SynapticError enum with 19 variants spanning all subsystems
Every other crate in the workspace depends on synaptic-core, and synaptic-core itself depends on no other workspace crate. It is the single shared foundation through which the rest of the workspace interoperates.
Layer 2: Implementation Crates
Each crate implements one core concern:
| Crate | Purpose |
|---|---|
synaptic-models | ProviderBackend abstraction, test doubles (ScriptedChatModel), wrappers (RetryChatModel, RateLimitedChatModel, StructuredOutputChatModel<T>, BoundToolsChatModel) |
synaptic-openai | OpenAiChatModel + OpenAiEmbeddings |
synaptic-anthropic | AnthropicChatModel |
synaptic-gemini | GeminiChatModel |
synaptic-ollama | OllamaChatModel + OllamaEmbeddings |
synaptic-tools | ToolRegistry, SerialToolExecutor, ParallelToolExecutor, HandleErrorTool, ReturnDirectTool |
synaptic-memory | InMemoryStore and strategy types: Buffer, Window, Summary, TokenBuffer, SummaryBuffer, RunnableWithMessageHistory, FileChatMessageHistory |
synaptic-callbacks | RecordingCallback, TracingCallback, CompositeCallback |
synaptic-prompts | PromptTemplate, ChatPromptTemplate, FewShotChatMessagePromptTemplate, ExampleSelector |
synaptic-parsers | Output parsers: StrOutputParser, JsonOutputParser, StructuredOutputParser<T>, ListOutputParser, EnumOutputParser, BooleanOutputParser, MarkdownListOutputParser, NumberedListOutputParser, XmlOutputParser, RetryOutputParser, FixingOutputParser |
synaptic-cache | InMemoryCache, SemanticCache, CachedChatModel |
synaptic-eval | Evaluators (ExactMatch, JsonValidity, RegexMatch, EmbeddingDistance, LLMJudge), Dataset, batch evaluation pipeline |
Layer 3: Composition and Retrieval
These crates combine the implementation crates into higher-level abstractions:
| Crate | Purpose |
|---|---|
synaptic-runnables | The LCEL system: Runnable trait, BoxRunnable with pipe operator, RunnableSequence, RunnableParallel, RunnableBranch, RunnableWithFallbacks, RunnableAssign, RunnablePick, RunnableEach, RunnableRetry, RunnableGenerator |
synaptic-graph | LangGraph-style state machines: StateGraph builder, CompiledGraph, Node trait, ToolNode, create_react_agent(), checkpointing, streaming, visualization |
synaptic-loaders | Document loaders: TextLoader, JsonLoader, CsvLoader, DirectoryLoader, FileLoader, MarkdownLoader, WebLoader |
synaptic-splitters | Text splitters: CharacterTextSplitter, RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter, HtmlHeaderTextSplitter, LanguageTextSplitter, TokenTextSplitter |
synaptic-embeddings | Embeddings trait, FakeEmbeddings, CacheBackedEmbeddings |
synaptic-vectorstores | VectorStore trait, InMemoryVectorStore, MultiVectorRetriever |
synaptic-retrieval | Retriever trait and seven implementations: InMemory, BM25, MultiQuery, Ensemble, ContextualCompression, SelfQuery, ParentDocument |
synaptic-qdrant | QdrantVectorStore (Qdrant integration) |
synaptic-pgvector | PgVectorStore (PostgreSQL pgvector integration) |
synaptic-redis | RedisStore + RedisCache (Redis integration) |
synaptic-pdf | PdfLoader (PDF document loading) |
Layer 4: Facade
The synaptic crate re-exports everything from all sub-crates under a unified namespace. Application code can use a single dependency:
[dependencies]
synaptic = "0.2"
And then import from organized modules:
use synaptic::core::{Message, ChatRequest};
use synaptic::openai::OpenAiChatModel; // requires "openai" feature
use synaptic::anthropic::AnthropicChatModel; // requires "anthropic" feature
use synaptic::graph::{create_react_agent, MessageState};
use synaptic::runnables::{BoxRunnable, Runnable};
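The provider modules in the snippet above are gated behind feature flags. A minimal sketch of enabling the two mentioned there (feature names taken from the comments above; see Installation for the complete list):
[dependencies]
synaptic = { version = "0.2", features = ["openai", "anthropic"] }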
Crate Dependency Diagram
synaptic (facade)
|
+--------------------+--------------------+
| | |
synaptic-graph synaptic-runnables synaptic-eval
| | |
synaptic-tools synaptic-core synaptic-embeddings
| ^ |
synaptic-core | synaptic-core
|
+--------+-----------+-----------+--------+--------+
| | | | | |
synap- synap- synap- synap- synap- Provider
tic- tic- tic- tic- tic- crates:
models memory callbacks prompts parsers openai,
| | | | | anthropic,
+--------+-----------+-----------+--------+ gemini,
| ollama
synaptic-core
Retrieval pipeline (all depend on synaptic-core):
synaptic-loaders --> synaptic-splitters --> synaptic-embeddings
|
synaptic-vectorstores
|
synaptic-retrieval
Integration crates (each depends on synaptic-core):
synaptic-qdrant, synaptic-pgvector, synaptic-redis, synaptic-pdf
The arrows point downward toward dependencies. Every crate ultimately depends on synaptic-core. The composition crates (synaptic-graph, synaptic-runnables) additionally depend on the implementation crates they orchestrate.
Provider Abstraction
Each LLM provider lives in its own crate (synaptic-openai, synaptic-anthropic, synaptic-gemini, synaptic-ollama). They all use the ProviderBackend trait from synaptic-models to separate HTTP concerns from protocol mapping. HttpBackend makes real HTTP requests; FakeBackend returns scripted responses for testing. This means you can test any code that uses ChatModel without network access and without mocking at the HTTP level. You only compile the providers you actually use.
The Runnable Abstraction
The Runnable<I, O> trait in synaptic-runnables is the universal composition primitive. Prompt templates, output parsers, chat models, and entire graphs can all be treated as runnables. They compose via the | pipe operator into chains that can be invoked, batched, or streamed. See Runnables & LCEL for details.
The Graph Abstraction
The StateGraph builder in synaptic-graph provides a higher-level orchestration model for complex workflows. Where LCEL chains are linear pipelines (with branching), graphs support cycles, conditional routing, checkpointing, human-in-the-loop interrupts, and dynamic control flow via GraphCommand. See Graph for details.
See Also
- Installation -- feature flags for enabling specific crates
- Runnables & LCEL -- the composition primitive
- Graph -- state-machine orchestration
- Middleware -- cross-cutting agent concerns
- Key-Value Store -- persistent namespaced storage
Messages
Messages are the fundamental unit of communication in Synaptic. Every interaction with an LLM -- whether a simple question, a multi-turn conversation, a tool call, or a streaming response -- is expressed as a sequence of messages. This page explains the message system's design, its variants, and the utilities that operate on message sequences.
Message as a Tagged Enum
Message is a Rust enum with six variants, serialized with #[serde(tag = "role")]:
| Variant | Role String | Purpose |
|---|---|---|
System | "system" | Instructions to the model about behavior and constraints |
Human | "human" | User input |
AI | "assistant" | Model responses, optionally carrying tool calls |
Tool | "tool" | Results from tool execution, linked by tool_call_id |
Chat | custom | Messages with a user-defined role for special protocols |
Remove | "remove" | A signal to remove a message by ID from history |
This is a tagged enum, not a trait hierarchy. Pattern matching is exhaustive, serialization is automatic, and the compiler enforces that every code path handles every variant.
Why an Enum?
An enum makes it impossible to construct an invalid message. An AI message always has a tool_calls field (even if empty). A Tool message always has a tool_call_id. A System message never has tool calls. These invariants are enforced by the type system rather than by runtime checks.
Creating Messages
Synaptic provides factory methods rather than exposing struct literals. This keeps the API stable even as internal fields are added:
use synaptic::core::Message;
// Basic messages
let sys = Message::system("You are a helpful assistant.");
let user = Message::human("What is the weather?");
let reply = Message::ai("The weather is sunny today.");
// AI message with tool calls
let with_tools = Message::ai_with_tool_calls("Let me check.", vec![tool_call]);
// Tool result linked to a specific call
let result = Message::tool("72 degrees", "call_abc123");
// Custom role
let custom = Message::chat("moderator", "This message is approved.");
// Removal signal
let remove = Message::remove("msg_id_to_remove");
Builder Methods
Factory methods create messages with default (empty) optional fields. Builder methods let you set them:
let msg = Message::human("Hello")
.with_id("msg_001")
.with_name("Alice")
.with_content_blocks(vec![
ContentBlock::Text { text: "Hello".into() },
ContentBlock::Image { url: "https://example.com/photo.jpg".into(), detail: None },
]);
Available builders: with_id(), with_name(), with_additional_kwarg(), with_response_metadata_entry(), with_content_blocks(), with_usage_metadata() (AI only).
Accessing Message Fields
Accessor methods work uniformly across variants:
let msg = Message::ai("Hello world");
msg.content() // "Hello world"
msg.role() // "assistant"
msg.is_ai() // true
msg.is_human() // false
msg.tool_calls() // &[] (empty slice for non-AI messages)
msg.tool_call_id() // None (only Some for Tool messages)
msg.id() // None (unless set with .with_id())
msg.name() // None (unless set with .with_name())
Type-check methods: is_system(), is_human(), is_ai(), is_tool(), is_chat(), is_remove().
The Remove variant is special: it carries only an id field. Calling content() on it returns "", and name() returns None. The remove_id() method returns Some(&str) only for Remove messages.
Common Fields
Every message variant (except Remove) carries these fields:
- content: String -- the text content
- id: Option<String> -- optional unique identifier
- name: Option<String> -- optional sender name
- additional_kwargs: HashMap<String, Value> -- extensible key-value metadata
- response_metadata: HashMap<String, Value> -- provider-specific response metadata
- content_blocks: Vec<ContentBlock> -- multimodal content (text, images, audio, video, files, data, reasoning)
The AI variant additionally carries:
- tool_calls: Vec<ToolCall> -- structured tool invocations
- invalid_tool_calls: Vec<InvalidToolCall> -- tool calls that failed to parse
- usage_metadata: Option<TokenUsage> -- token usage from the provider
The Tool variant additionally carries:
- tool_call_id: String -- links back to the ToolCall that produced this result
Streaming with AIMessageChunk
When streaming responses from an LLM, content arrives in chunks. The AIMessageChunk struct represents a single chunk:
pub struct AIMessageChunk {
pub content: String,
pub tool_calls: Vec<ToolCall>,
pub usage: Option<TokenUsage>,
pub id: Option<String>,
pub tool_call_chunks: Vec<ToolCallChunk>,
pub invalid_tool_calls: Vec<InvalidToolCall>,
}
Chunks support the + and += operators to merge them incrementally:
let mut accumulated = AIMessageChunk::default();
accumulated += chunk1; // content is concatenated
accumulated += chunk2; // tool_calls are extended
accumulated += chunk3; // usage is summed
// Convert the accumulated chunk to a Message
let message = accumulated.into_message();
The merge semantics are:
- content is concatenated via push_str
- tool_calls, tool_call_chunks, and invalid_tool_calls are extended
- id takes the first non-None value
- usage is summed field-by-field (input_tokens, output_tokens, total_tokens)
Multimodal Content
The ContentBlock enum supports rich content types beyond plain text:
| Variant | Fields | Purpose |
|---|---|---|
Text | text | Plain text |
Image | url, detail | Image reference with optional detail level |
Audio | url | Audio reference |
Video | url | Video reference |
File | url, mime_type | Generic file reference |
Data | data: Value | Arbitrary structured data |
Reasoning | content | Model reasoning/chain-of-thought |
Content blocks are carried alongside the content string field, allowing messages to contain both a text summary and structured multimodal data.
Message Utility Functions
Synaptic provides four utility functions for working with message sequences:
filter_messages
Filter messages by role, name, or ID with include/exclude lists:
use synaptic::core::filter_messages;
let humans_only = filter_messages(
&messages,
Some(&["human"]), // include_types
None, // exclude_types
None, None, // include/exclude names
None, None, // include/exclude ids
);
trim_messages
Trim a message sequence to fit within a token budget:
use synaptic::core::{trim_messages, TrimStrategy};
let trimmed = trim_messages(
messages,
4096, // max tokens
|msg| msg.content().len() / 4, // token counter function
TrimStrategy::Last, // keep most recent
true, // always preserve system message
);
TrimStrategy::First keeps messages from the beginning. TrimStrategy::Last keeps messages from the end, optionally preserving the leading system message.
merge_message_runs
Merge consecutive messages of the same role into a single message:
use synaptic::core::merge_message_runs;
let merged = merge_message_runs(vec![
Message::human("Hello"),
Message::human("How are you?"),
Message::ai("I'm fine"),
]);
// Result: [Human("Hello\nHow are you?"), AI("I'm fine")]
For AI messages, tool calls and invalid tool calls are also merged.
get_buffer_string
Convert a message sequence to a human-readable string:
use synaptic::core::get_buffer_string;
let text = get_buffer_string(&messages, "Human", "AI");
// "System: You are helpful.\nHuman: Hello\nAI: Hi there!"
Serialization
Messages serialize as JSON with a role discriminator field:
{
"role": "assistant",
"content": "Hello!",
"tool_calls": [],
"id": null,
"name": null
}
The AI variant serializes its role as "assistant" (matching the OpenAI convention), and role() returns the same string at runtime. Empty collections and None optionals are omitted from serialization via skip_serializing_if attributes.
This serialization format is compatible with LangChain's message schema, making it straightforward to exchange message histories between Synaptic and Python-based systems.
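A minimal round-trip sketch, assuming Message derives both Serialize and Deserialize (which the tagged-enum serialization described above implies):
use synaptic::core::Message;

fn main() -> Result<(), serde_json::Error> {
    let msg = Message::ai("Hello!");

    // Serialize: the enum variant becomes the "role" discriminator.
    let json = serde_json::to_string(&msg)?;
    assert!(json.contains(r#""role":"assistant""#));

    // Deserialize back into the same variant.
    let parsed: Message = serde_json::from_str(&json)?;
    assert!(parsed.is_ai());
    assert_eq!(parsed.content(), "Hello!");
    Ok(())
}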
See Also
- Message Types -- detailed examples for each message variant
- Filter & Trim -- filtering and trimming message sequences
- Merge Runs -- merging consecutive same-role messages
- Memory -- how messages are stored and managed across sessions
Runnables & LCEL
The LangChain Expression Language (LCEL) is a composition system for building data processing pipelines. In Synaptic, this is implemented through the Runnable trait and a set of combinators that let you pipe, branch, parallelize, retry, and stream operations. This page explains the design and the key types.
The Runnable Trait
At the heart of LCEL is a single trait:
#[async_trait]
pub trait Runnable<I, O>: Send + Sync
where
I: Send + 'static,
O: Send + 'static,
{
async fn invoke(&self, input: I, config: &RunnableConfig) -> Result<O, SynapticError>;
async fn batch(&self, inputs: Vec<I>, config: &RunnableConfig) -> Vec<Result<O, SynapticError>>;
fn stream<'a>(&'a self, input: I, config: &'a RunnableConfig) -> RunnableOutputStream<'a, O>;
fn boxed(self) -> BoxRunnable<I, O>;
}
Only invoke() is required. Default implementations are provided for:
- batch() -- runs invoke() sequentially for each input
- stream() -- wraps invoke() as a single-item stream
- boxed() -- wraps self into a type-erased BoxRunnable
The RunnableConfig parameter threads runtime configuration (tags, metadata, concurrency limits, run IDs) through the entire pipeline without changing the input/output types.
BoxRunnable and the Pipe Operator
Rust's type system requires concrete types for composition, but LCEL chains can contain heterogeneous steps. BoxRunnable<I, O> is a type-erased wrapper that erases the concrete type while preserving the Runnable interface.
The pipe operator (|) connects two boxed runnables into a RunnableSequence:
use synaptic::runnables::{BoxRunnable, Runnable, RunnableLambda};
let step1 = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
}).boxed();
let step2 = RunnableLambda::new(|x: String| async move {
Ok(format!("Result: {x}"))
}).boxed();
let chain = step1 | step2;
let output = chain.invoke("hello".into(), &config).await?;
// output: "Result: HELLO"
This is Rust's BitOr trait overloaded on BoxRunnable. The intermediate type between steps must match -- the output of step1 must be the input type of step2.
Key Runnable Types
RunnablePassthrough
Passes input through unchanged. Useful as a branch in RunnableParallel or as a placeholder in a chain:
let passthrough = RunnablePassthrough::new().boxed();
// invoke("hello") => Ok("hello")
RunnableLambda
Wraps an async closure into a Runnable. This is the most common way to insert custom logic into a chain:
let transform = RunnableLambda::new(|input: String| async move {
Ok(input.split_whitespace().count())
}).boxed();
Tip: For named, reusable functions you can use the #[chain] macro instead of RunnableLambda::new. It generates a factory function that returns a BoxRunnable directly. See Procedural Macros.
RunnableSequence
Created by the | operator. Executes steps in order, feeding each output as the next step's input. You rarely construct this directly.
RunnableParallel
Runs named branches concurrently and merges their outputs into a serde_json::Value object:
let parallel = RunnableParallel::new()
.add("upper", RunnableLambda::new(|s: String| async move {
Ok(Value::String(s.to_uppercase()))
}).boxed())
.add("length", RunnableLambda::new(|s: String| async move {
Ok(Value::Number(s.len().into()))
}).boxed());
let result = parallel.invoke("hello".into(), &config).await?;
// result: {"upper": "HELLO", "length": 5}
All branches receive a clone of the same input and run concurrently via tokio::join!. The output is a JSON object keyed by the branch names.
RunnableBranch
Routes input to one of several branches based on conditions, with a default fallthrough:
let branch = RunnableBranch::new(
vec![
(
|input: &String| input.starts_with("math:"),
math_chain.boxed(),
),
(
|input: &String| input.starts_with("code:"),
code_chain.boxed(),
),
],
default_chain.boxed(), // fallback
);
Conditions are checked in order. The first matching condition's branch is invoked. If none match, the default branch handles it.
RunnableWithFallbacks
Tries alternatives when the primary runnable fails:
let robust = RunnableWithFallbacks::new(
primary_model.boxed(),
vec![fallback_model.boxed()],
);
If primary_model returns an error, fallback_model is tried with the same input. This is useful for model failover (e.g., try GPT-4, fall back to GPT-3.5).
RunnableAssign
Runs a parallel branch and merges its output into the existing JSON value. The input must be a serde_json::Value object, and the parallel branch's outputs are merged as additional keys:
let assign = RunnableAssign::new(
RunnableParallel::new()
.add("word_count", count_words_runnable)
);
// Input: {"text": "hello world"}
// Output: {"text": "hello world", "word_count": 2}
RunnablePick
Extracts specific keys from a JSON value:
let pick = RunnablePick::new(vec!["name".into(), "age".into()]);
// Input: {"name": "Alice", "age": 30, "email": "..."}
// Output: {"name": "Alice", "age": 30}
Single-key picks return the value directly rather than wrapping it in an object.
RunnableEach
Maps a runnable over each element of a collection:
let each = RunnableEach::new(transform_single_item.boxed());
// Input: vec!["a", "b", "c"]
// Output: vec![transformed_a, transformed_b, transformed_c]
RunnableRetry
Retries a runnable on failure with configurable policy:
let retry = RunnableRetry::new(
flaky_runnable.boxed(),
RetryPolicy {
max_retries: 3,
delay: Duration::from_millis(100),
backoff_factor: 2.0,
},
);
RunnableGenerator
Produces values from a stream, useful for wrapping streaming sources into the runnable pipeline:
let generator = RunnableGenerator::new(|input: String, _config| {
Box::pin(async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_string());
}
})
});
Config Binding
BoxRunnable::bind() applies a config transform before delegation. This lets you attach metadata, set concurrency limits, or override run names without changing the chain's input/output types:
let tagged = chain.bind(|mut config| {
config.tags.push("production".into());
config
});
with_config() is a convenience that replaces the config entirely. with_listeners() adds before/after callbacks around invocation.
Streaming Through Pipelines
When you call stream() on a chain, the streaming behavior depends on the components:
- If the final component in a sequence truly streams (e.g., an LLM that yields token-by-token), the chain streams those chunks through.
- Intermediate steps in the pipeline run their invoke() and pass the result forward.
- RunnableGenerator produces a true stream from any async function.
This means a chain like prompt | model | parser will stream the model's output chunks through the parser, provided the parser implements true streaming.
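A small sketch of consuming a chain's stream, assuming RunnableOutputStream implements futures::Stream (as the ChatStream alias in core suggests). Because neither step below truly streams, the default stream() yields a single item containing the full result:
use futures::StreamExt;
use synaptic::core::{RunnableConfig, SynapticError};
use synaptic::runnables::{Runnable, RunnableLambda};

async fn stream_demo() -> Result<(), SynapticError> {
    let chain = RunnableLambda::new(|s: String| async move { Ok(s.to_uppercase()) }).boxed()
        | RunnableLambda::new(|s: String| async move { Ok(format!("Result: {s}")) }).boxed();

    let config = RunnableConfig::default();
    let mut stream = chain.stream("hello".into(), &config);
    while let Some(item) = stream.next().await {
        // Each item is a Result<String, SynapticError>.
        println!("chunk: {}", item?);
    }
    Ok(())
}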
Everything Is a Runnable
Synaptic's LCEL design means that many types across the framework implement Runnable:
- Prompt templates (ChatPromptTemplate) implement Runnable<Value, Vec<Message>> -- they take template variables and produce messages.
- Output parsers (StrOutputParser, JsonOutputParser, etc.) implement Runnable -- they transform one output format to another.
- Graphs produce state from state.
This uniformity means you can compose any of these with | and get type-safe, streamable pipelines.
See Also
- Pipe Operator -- composing runnables with |
- Parallel & Branch -- concurrent execution and routing
- Assign & Pick -- JSON manipulation in chains
- Fallbacks -- error recovery
- Retry -- automatic retry with backoff
- Streaming (concept) -- streaming across all layers
Agents & Tools
Agents are systems where an LLM decides what actions to take. Rather than following a fixed script, the model examines the conversation, chooses which tools to call (if any), processes the results, and decides whether to call more tools or produce a final answer. This page explains how Synaptic models tools, how they are registered and executed, and how the agent loop works.
The Tool Trait
A tool in Synaptic is anything that implements the Tool trait:
#[async_trait]
pub trait Tool: Send + Sync {
fn name(&self) -> &'static str;
fn description(&self) -> &'static str;
async fn call(&self, args: Value) -> Result<Value, SynapticError>;
}
- name() returns a unique identifier the LLM uses to refer to this tool.
- description() explains what the tool does, in natural language. This is sent to the LLM so it knows when and how to use the tool.
- call() executes the tool with JSON arguments and returns a JSON result.
The trait is intentionally minimal. A tool does not know about conversations, memory, or models. It receives arguments, does work, and returns a result. This keeps tools reusable and testable in isolation.
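For illustration, here is a minimal hand-written implementation of the trait above (the #[tool] macro generates an equivalent struct and impl for you). The weather data is hard-coded, and the async-trait crate is assumed as a direct dependency:
use async_trait::async_trait;
use serde_json::{json, Value};
use synaptic::core::{SynapticError, Tool};

struct WeatherTool;

#[async_trait]
impl Tool for WeatherTool {
    fn name(&self) -> &'static str {
        "weather"
    }

    fn description(&self) -> &'static str {
        "Get the current weather for a city."
    }

    async fn call(&self, args: Value) -> Result<Value, SynapticError> {
        let city = args["city"].as_str().unwrap_or("unknown");
        // A real tool would call a weather API here.
        Ok(json!({ "city": city, "forecast": "sunny", "temp_c": 22 }))
    }
}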
ToolDefinition
When tools are sent to an LLM, they are described as ToolDefinition structs:
pub struct ToolDefinition {
pub name: String,
pub description: String,
pub parameters: Value, // JSON Schema
pub extras: Option<HashMap<String, Value>>, // provider-specific params
}
The parameters field is a JSON Schema that describes the tool's expected arguments. LLM providers use this schema to generate valid tool calls. The ToolDefinition is metadata about the tool -- it never executes anything.
The optional extras field carries provider-specific parameters (e.g., Anthropic's cache_control). Provider adapters in synaptic-models forward these to the API when present.
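As a sketch, a hand-built definition for the weather tool above -- the #[tool] macro produces the same shape for you via as_tool_definition():
use serde_json::json;
use synaptic::core::ToolDefinition;

fn weather_definition() -> ToolDefinition {
    ToolDefinition {
        name: "weather".to_string(),
        description: "Get the current weather for a city.".to_string(),
        // Plain JSON Schema describing the expected arguments.
        parameters: json!({
            "type": "object",
            "properties": {
                "city": { "type": "string", "description": "City name" }
            },
            "required": ["city"]
        }),
        extras: None,
    }
}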
ToolCall and ToolChoice
When an LLM decides to use a tool, it produces a ToolCall:
pub struct ToolCall {
pub id: String,
pub name: String,
pub arguments: Value,
}
The id links the call to its result. When a tool finishes execution, the result is wrapped in a Message::tool(result, tool_call_id) that references this ID, allowing the LLM to match results back to calls.
ToolChoice controls the LLM's tool-calling behavior:
| Variant | Behavior |
|---|---|
Auto | The model decides whether to call tools |
Required | The model must call at least one tool |
None | Tool calling is disabled |
Specific(name) | The model must call the named tool |
ToolChoice is set on ChatRequest via .with_tool_choice().
ToolRegistry
The ToolRegistry is a thread-safe collection of tools, backed by Arc<RwLock<HashMap<String, Arc<dyn Tool>>>>:
use synaptic::tools::ToolRegistry;
let registry = ToolRegistry::new();
registry.register(Arc::new(WeatherTool))?;
registry.register(Arc::new(CalculatorTool))?;
// Look up a tool by name
let tool = registry.get("weather");
Registration is idempotent -- registering a tool with the same name replaces the previous one. The Arc<RwLock<_>> ensures safe concurrent access: multiple readers can look up tools simultaneously, and registration briefly acquires a write lock.
Tool Executors
Executors bridge the gap between tool calls from an LLM and the tool registry:
SerialToolExecutor -- executes tool calls one at a time. Simple and predictable:
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("weather", json!({"city": "Tokyo"})).await?;
ParallelToolExecutor -- executes multiple tool calls concurrently. Useful when the LLM produces several independent tool calls in a single response.
Tool Wrappers
Synaptic provides wrapper types that add behavior to existing tools:
- HandleErrorTool -- catches errors from the inner tool and returns them as a string result instead of propagating the error. This allows the LLM to see the error and retry with different arguments.
- ReturnDirectTool -- marks the tool's output as the final response, short-circuiting the agent loop instead of feeding the result back to the LLM.
ToolNode
In the graph system, ToolNode is a pre-built graph node that processes AI messages containing tool calls. It:
- Reads the last message from the graph state
- Extracts all ToolCall entries from it
- Executes each tool call via a SerialToolExecutor
- Appends the results as Message::tool(...) messages back to the state
ToolNode is the standard way to handle tool execution inside a graph workflow. You do not need to write tool dispatching logic yourself.
The ReAct Agent Pattern
ReAct (Reasoning + Acting) is the most common agent pattern. The model alternates between reasoning about what to do and acting by calling tools. Synaptic provides a prebuilt ReAct agent via create_react_agent():
use synaptic::graph::{create_react_agent, MessageState};
let graph = create_react_agent(model, tools)?;
let state = MessageState::from_messages(vec![
Message::human("What is the weather in Tokyo?"),
]);
let result = graph.invoke(state).await?;
This builds a graph with two nodes:
[START] --> [agent] --tool_calls--> [tools] --> [agent] ...
\--no_tools----> [END]
- "agent" node: Calls the LLM with the current messages and tool definitions. The LLM's response is appended to the state.
- "tools" node: A
ToolNodethat executes any tool calls from the agent's response and appends results.
The conditional edge after "agent" checks if the last message has tool calls. If yes, route to "tools". If no, route to END. The edge from "tools" always returns to "agent", creating the loop.
The Agent Loop in Detail
- The user message enters the graph state.
- The "agent" node sends all messages to the LLM along with tool definitions.
- The LLM responds. If it includes tool calls:
  a. The response (with tool calls) is appended to the state.
  b. Routing sends execution to the "tools" node.
  c. Each tool call is executed and results are appended as Tool messages.
  d. Routing sends execution back to the "agent" node.
  e. The LLM now sees the tool results and can decide what to do next.
- When the LLM responds without tool calls, it has produced its final answer. Routing sends execution to END.
This loop continues until the LLM decides it has enough information to answer directly, or until the graph's iteration safety limit (100) is reached.
ReactAgentOptions
The create_react_agent_with_options() function accepts a ReactAgentOptions struct for advanced configuration:
let options = ReactAgentOptions {
checkpointer: Some(Arc::new(MemorySaver::new())),
system_prompt: Some("You are a helpful weather assistant.".into()),
interrupt_before: vec!["tools".into()],
interrupt_after: vec![],
};
let graph = create_react_agent_with_options(model, tools, options)?;
| Option | Purpose |
|---|---|
checkpointer | State persistence for resumption across invocations |
system_prompt | Prepended to messages before each LLM call |
interrupt_before | Pause before named nodes (for human approval of tool calls) |
interrupt_after | Pause after named nodes (for human review of tool results) |
Setting interrupt_before: vec!["tools".into()] creates a human-in-the-loop agent: the graph pauses before executing tools, allowing a human to inspect the proposed tool calls, modify them, or reject them entirely. The graph is then resumed via update_state().
See Also
- Custom Tools -- creating tools with the #[tool] macro
- Tool Choice -- controlling model tool-calling behavior
- Tool Definition Extras -- provider-specific parameters
- Runtime-Aware Tools -- tools with store/state access
- Tool Node -- ToolNode in graph workflows
- Graph -- the graph system that agents run within
Memory
Without memory, every LLM call is stateless -- the model has no knowledge of previous interactions. Memory in Synaptic solves this by storing, retrieving, and managing conversation history so that subsequent calls include relevant context. This page explains the memory abstraction, the available strategies, and how they trade off between completeness and cost.
The MemoryStore Trait
All memory backends implement a single trait:
#[async_trait]
pub trait MemoryStore: Send + Sync {
async fn append(&self, session_id: &str, message: Message) -> Result<(), SynapticError>;
async fn load(&self, session_id: &str) -> Result<Vec<Message>, SynapticError>;
async fn clear(&self, session_id: &str) -> Result<(), SynapticError>;
}
Three operations, keyed by a session identifier:
- append -- add a message to the session's history
- load -- retrieve the full history for a session
- clear -- delete all messages for a session
The session_id parameter is central to Synaptic's memory design. Two conversations with different session IDs are completely isolated, even if they share the same memory store instance. This enables multi-tenant applications where many users interact concurrently through a single system.
InMemoryStore
The simplest implementation -- a HashMap<String, Vec<Message>> wrapped in Arc<RwLock<_>>:
use synaptic::memory::InMemoryStore;
let store = InMemoryStore::new();
store.append("session_1", Message::human("Hello")).await?;
let history = store.load("session_1").await?;
InMemoryStore is fast, requires no external dependencies, and is suitable for development, testing, and short-lived applications. Data is lost when the process exits.
FileChatMessageHistory
A persistent store that writes messages to a JSON file on disk. Each session is stored as a separate file. This is useful for applications that need persistence without a database:
use synaptic::memory::FileChatMessageHistory;
let history = FileChatMessageHistory::new("./chat_history")?;
Memory Strategies
Raw MemoryStore keeps every message forever. For long conversations, this leads to unbounded token usage and eventually exceeds the model's context window. Memory strategies wrap a store and control which messages are included in the context.
ConversationBufferMemory
Keeps all messages. The simplest strategy -- everything is sent to the LLM every time.
- Advantage: No information loss.
- Disadvantage: Token usage grows without bound. Eventually exceeds the context window.
- Use case: Short conversations where you know the total message count is small.
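The buffer strategy needs no extra configuration. A construction sketch, assuming the same new(store) convention as the other strategies on this page:
use synaptic::memory::{ConversationBufferMemory, InMemoryStore};

let memory = ConversationBufferMemory::new(InMemoryStore::new());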
ConversationWindowMemory
Keeps only the last K message pairs (human + AI). Older messages are dropped:
use synaptic::memory::ConversationWindowMemory;
let memory = ConversationWindowMemory::new(store, 5); // keep last 5 exchanges
- Advantage: Fixed, predictable token usage.
- Disadvantage: Complete loss of older context. The model has no knowledge of what happened more than K turns ago.
- Use case: Chat UIs, customer service bots, and any scenario where recent context matters most.
ConversationSummaryMemory
Summarizes older messages using an LLM, keeping only the summary plus recent messages:
use synaptic::memory::ConversationSummaryMemory;
let memory = ConversationSummaryMemory::new(store, summarizer_model);
After each exchange, the strategy uses an LLM to produce a running summary of the conversation. The summary replaces the older messages, so the context sent to the main model includes the summary followed by recent messages.
- Advantage: Retains the gist of the entire conversation. Constant-ish token usage.
- Disadvantage: Summarization has a cost (an extra LLM call). Details may be lost in compression. Summarization quality depends on the model.
- Use case: Long-running conversations where historical context matters (e.g., a multi-session assistant that remembers past preferences).
ConversationTokenBufferMemory
Keeps as many recent messages as fit within a token budget:
use synaptic::memory::ConversationTokenBufferMemory;
let memory = ConversationTokenBufferMemory::new(store, 4096); // max 4096 tokens
Unlike window memory (which counts messages), token buffer memory counts tokens. This is more precise when messages vary significantly in length.
- Advantage: Direct control over context size. Works well with models that have strict context limits.
- Disadvantage: Still loses old messages entirely.
- Use case: Cost-sensitive applications where you want to fill the context window efficiently.
ConversationSummaryBufferMemory
A hybrid: summarizes old messages and keeps recent ones, with a token threshold controlling the boundary:
use synaptic::memory::ConversationSummaryBufferMemory;
let memory = ConversationSummaryBufferMemory::new(store, model, 2000);
// Summarize when recent messages exceed 2000 tokens
When the total token count of recent messages exceeds the threshold, the oldest messages are summarized and replaced with the summary. The result is a context that starts with a summary of the distant past, followed by verbatim recent messages.
- Advantage: Best of both worlds -- retains old context through summaries while keeping recent messages verbatim.
- Disadvantage: More complex. Requires an LLM for summarization.
- Use case: Production chat applications that need both historical awareness and accurate recent context.
Strategy Comparison
| Strategy | What It Keeps | Token Growth | Info Loss | Extra LLM Calls |
|---|---|---|---|---|
| Buffer | Everything | Unbounded | None | None |
| Window | Last K turns | Fixed | Old messages lost | None |
| Summary | Summary + recent | Near-constant | Details compressed | Yes |
| TokenBuffer | Recent within budget | Fixed | Old messages lost | None |
| SummaryBuffer | Summary + recent buffer | Bounded | Old details compressed | Yes |
RunnableWithMessageHistory
Rather than manually loading and saving messages around each LLM call, RunnableWithMessageHistory wraps any Runnable and handles it automatically:
use synaptic::memory::RunnableWithMessageHistory;
let chain_with_memory = RunnableWithMessageHistory::new(
my_chain,
store,
|config| config.metadata.get("session_id")
.and_then(|v| v.as_str())
.unwrap_or("default")
.to_string(),
);
On each invocation:
- The session ID is extracted from the RunnableConfig metadata.
- The inner runnable is invoked with the historical context prepended.
- The new messages (input and output) are appended to the store.
This separates memory management from application logic. The inner runnable does not need to know about memory at all.
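Continuing the snippet above, a hedged sketch of routing one invocation to a specific session by placing the ID where the extractor closure looks for it (this assumes metadata is a plain key-value map of JSON values and that the wrapper itself implements Runnable):
use serde_json::json;
use synaptic::core::RunnableConfig;
use synaptic::runnables::Runnable;

let mut config = RunnableConfig::default();
config.metadata.insert("session_id".to_string(), json!("alice"));
let reply = chain_with_memory.invoke("Hello again!".into(), &config).await?;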
Session Isolation
A key design property: memory is always scoped to a session. The session_id is just a string -- it could be a user ID, a conversation ID, a thread ID, or any other identifier meaningful to your application.
Different sessions sharing the same InMemoryStore (or any other store) are completely independent. Appending to session "alice" never affects session "bob". This makes it safe to use a single store instance across an entire application serving multiple users.
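A small sketch demonstrating the isolation guarantee with the InMemoryStore shown earlier:
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::InMemoryStore;

async fn isolation_demo() -> Result<(), SynapticError> {
    let store = InMemoryStore::new();
    store.append("alice", Message::human("My favourite colour is green")).await?;
    store.append("bob", Message::human("What is the capital of France?")).await?;

    // Each session sees only its own history.
    assert_eq!(store.load("alice").await?.len(), 1);
    assert_eq!(store.load("bob").await?.len(), 1);

    // Clearing one session leaves the other untouched.
    store.clear("alice").await?;
    assert!(store.load("alice").await?.is_empty());
    assert_eq!(store.load("bob").await?.len(), 1);
    Ok(())
}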
See Also
- Buffer Memory -- keeping all messages
- Window Memory -- keeping last K turns
- Summary Memory -- LLM-based summarization
- Token Buffer Memory -- token-budget trimming
- Summary Buffer Memory -- hybrid summary + recent buffer
- RunnableWithMessageHistory -- automatic history management
- Messages -- the Message type that memory stores
Retrieval
Retrieval-Augmented Generation (RAG) grounds LLM responses in external knowledge. Instead of relying solely on what the model learned during training, a RAG system retrieves relevant documents at query time and includes them in the prompt. This page explains the retrieval pipeline's architecture, the role of each component, and the retriever types Synaptic provides.
The Pipeline
A RAG pipeline has five stages:
Load --> Split --> Embed --> Store --> Retrieve
- Load: Read raw content from files, databases, or the web into Document structs.
- Embed: Convert text chunks into numerical vectors that capture meaning.
- Store: Index the vectors for efficient similarity search.
- Retrieve: Given a query, find the most relevant chunks.
Each stage has a dedicated trait and multiple implementations. You can mix and match implementations at each stage depending on your data sources and requirements.
Document
The Document struct is the universal unit of content:
pub struct Document {
pub id: Option<String>,
pub content: String,
pub metadata: HashMap<String, Value>,
}
- content holds the text.
- metadata holds arbitrary key-value pairs (source filename, page number, section heading, creation date, etc.).
- id is an optional unique identifier used by stores for upsert and delete operations.
Documents flow through every stage of the pipeline. Loaders produce them, splitters transform them (preserving and augmenting metadata), and retrievers return them.
Loading
The Loader trait is async and returns a stream of documents:
| Loader | Source | Behavior |
|---|---|---|
TextLoader | Plain text files | One document per file |
JsonLoader | JSON files | Configurable id_key and content_key extraction |
CsvLoader | CSV files | Column-based, with metadata from other columns |
DirectoryLoader | Directory of files | Recursive, with glob filtering to select file types |
FileLoader | Single file | Generic file loading with configurable parser |
MarkdownLoader | Markdown files | Markdown-aware parsing |
WebLoader | URLs | Fetches and processes web content |
Loaders handle the mechanics of reading and parsing. They produce Document values with appropriate metadata (e.g., a source field with the file path).
Splitting
Large documents must be split into chunks that fit within embedding models' context windows and that contain focused, coherent content. The TextSplitter trait provides:
pub trait TextSplitter: Send + Sync {
fn split_text(&self, text: &str) -> Result<Vec<String>, SynapticError>;
fn split_documents(&self, documents: Vec<Document>) -> Result<Vec<Document>, SynapticError>;
}
| Splitter | Strategy |
|---|---|
CharacterTextSplitter | Splits on a single separator (default: "\n\n") with configurable chunk size and overlap |
RecursiveCharacterTextSplitter | Tries a hierarchy of separators ("\n\n", "\n", " ", "") -- splits on the largest unit that fits within the chunk size |
MarkdownHeaderTextSplitter | Splits on Markdown headers, adding header hierarchy to metadata |
HtmlHeaderTextSplitter | Splits on HTML header tags, adding header hierarchy to metadata |
TokenTextSplitter | Splits based on approximate token count (~4 chars/token heuristic, word-boundary aware) |
LanguageTextSplitter | Splits code using language-aware separators (functions, classes, etc.) |
The most commonly used splitter is RecursiveCharacterTextSplitter. It produces chunks that respect natural document boundaries (paragraphs, then sentences, then words) and includes configurable overlap between chunks so that information at chunk boundaries is not lost.
split_documents() preserves the original document's metadata on each chunk, so you can trace every chunk back to its source.
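A usage sketch -- the module path and the (chunk_size, chunk_overlap) constructor arguments are assumptions, while split_text() matches the TextSplitter trait above:
use synaptic::core::SynapticError;
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};

fn chunk_text(text: &str) -> Result<Vec<String>, SynapticError> {
    // Roughly 1000-character chunks with 200 characters of overlap (assumed constructor).
    let splitter = RecursiveCharacterTextSplitter::new(1000, 200);
    splitter.split_text(text)
}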
Embedding
Embedding models convert text into dense numerical vectors. Texts with similar meaning produce vectors that are close together in the vector space. The trait:
#[async_trait]
pub trait Embeddings: Send + Sync {
async fn embed_documents(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>, SynapticError>;
async fn embed_query(&self, text: &str) -> Result<Vec<f32>, SynapticError>;
}
Two methods because some providers optimize differently for documents (which may be batched) versus queries (single text, possibly with different prompt prefixes).
| Implementation | Description |
|---|---|
OpenAiEmbeddings | OpenAI's embedding API (text-embedding-ada-002, etc.) |
OllamaEmbeddings | Local Ollama embedding models |
FakeEmbeddings | Deterministic vectors for testing (no API calls) |
CachedEmbeddings | Wraps any Embeddings with a cache to avoid redundant API calls |
Vector Storage
Vector stores hold embedded documents and support similarity search:
#[async_trait]
pub trait VectorStore: Send + Sync {
async fn add_documents(&self, docs: Vec<Document>, embeddings: Vec<Vec<f32>>) -> Result<Vec<String>, SynapticError>;
async fn similarity_search(&self, query_embedding: &[f32], k: usize) -> Result<Vec<Document>, SynapticError>;
async fn delete(&self, ids: &[String]) -> Result<(), SynapticError>;
}
InMemoryVectorStore uses cosine similarity with brute-force search. It stores documents and their embeddings in a RwLock<HashMap>, computes cosine similarity against all stored vectors at query time, and returns the top-k results. This is suitable for small to medium collections (thousands of documents). For larger collections, you would implement the VectorStore trait with a dedicated vector database.
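An end-to-end sketch of the Embed and Store stages using the test-friendly FakeEmbeddings and the in-memory store. The module paths, the Document re-export location, and the new() constructors are assumptions; the trait methods match the signatures shown above:
use std::collections::HashMap;
use synaptic::core::{Document, SynapticError};
use synaptic::embeddings::{Embeddings, FakeEmbeddings};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};

async fn index_and_search() -> Result<(), SynapticError> {
    let embeddings = FakeEmbeddings::new();
    let store = InMemoryVectorStore::new();

    let docs = vec![
        Document { id: None, content: "Synaptic is a Rust agent framework.".into(), metadata: HashMap::new() },
        Document { id: None, content: "BM25 is a lexical ranking function.".into(), metadata: HashMap::new() },
    ];

    // Embed the chunks, then index them alongside their vectors.
    let texts: Vec<String> = docs.iter().map(|d| d.content.clone()).collect();
    let vectors = embeddings.embed_documents(texts).await?;
    store.add_documents(docs, vectors).await?;

    // Embed the query and fetch the single closest document.
    let query = embeddings.embed_query("What is Synaptic?").await?;
    let hits = store.similarity_search(&query, 1).await?;
    println!("top hit: {}", hits[0].content);
    Ok(())
}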
Retrieval
The Retriever trait is the query-time interface:
#[async_trait]
pub trait Retriever: Send + Sync {
async fn retrieve(&self, query: &str) -> Result<Vec<Document>, SynapticError>;
}
A retriever takes a natural-language query and returns relevant documents. Synaptic provides eight retriever implementations (seven in synaptic-retrieval, plus MultiVectorRetriever in synaptic-vectorstores), each with different strengths.
InMemoryRetriever
The simplest retriever -- stores documents in memory and returns them based on keyword matching. Useful for testing and small collections.
BM25Retriever
Implements the Okapi BM25 scoring algorithm, a classical information retrieval method that ranks documents by term frequency and inverse document frequency. No embeddings required -- purely lexical matching.
BM25 excels at exact keyword matching. If a user searches for "tokio runtime" and a document contains exactly those words, BM25 ranks it highly, while semantically similar documents that use different wording score lower.
MultiQueryRetriever
Uses an LLM to generate multiple query variants from the original query, then runs each variant through a base retriever and combines the results. This addresses the problem that a single query phrasing may miss relevant documents:
Original query: "How do I handle errors?"
Generated variants:
- "What is the error handling approach?"
- "How are errors propagated in the system?"
- "What error types are available?"
EnsembleRetriever
Combines results from multiple retrievers using Reciprocal Rank Fusion (RRF). A typical setup pairs BM25 (good at exact matches) with a vector store retriever (good at semantic matches).
The RRF algorithm assigns scores based on rank position across retrievers, so a document that appears in the top results of multiple retrievers gets a higher combined score.
ContextualCompressionRetriever
Wraps a base retriever and compresses retrieved documents to remove irrelevant content. Uses a DocumentCompressor (such as EmbeddingsFilter, which filters out documents below a similarity threshold) to refine results after retrieval.
SelfQueryRetriever
Uses an LLM to parse the user's query into a structured filter over document metadata, combined with a semantic search query. For example:
User query: "Find papers about transformers published after 2020"
Parsed:
- Semantic query: "papers about transformers"
- Metadata filter: year > 2020
This enables natural-language queries that combine semantic search with precise metadata filtering.
ParentDocumentRetriever
Stores small child chunks for embedding (which improves retrieval precision) but returns the larger parent documents they came from (which provides more context to the LLM). This addresses the tension between small chunks (better for matching) and large chunks (better for context).
MultiVectorRetriever
Similar to ParentDocumentRetriever, but implemented at the vector store level. MultiVectorRetriever stores child document embeddings in a VectorStore and maintains a separate docstore mapping child IDs to parent documents. At query time, it searches for matching child chunks and returns their parent documents. This retriever is available in synaptic-vectorstores.
Connecting Retrieval to Generation
Retrievers produce Vec<Document>. To use them in a RAG chain, you typically format the documents into a prompt and pass them to an LLM:
// Pseudocode for a RAG chain
let docs = retriever.retrieve("What is Synaptic?").await?;
let context = docs.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let prompt = format!("Context:\n{context}\n\nQuestion: What is Synaptic?");
Using LCEL, this can be composed into a reusable chain with RunnableParallel (to fetch context and pass through the question simultaneously), RunnableLambda (to format the prompt), and a chat model.
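Without LCEL, the same flow can be written as a plain async helper. The sketch below sticks to APIs shown elsewhere in this documentation (Retriever, ChatRequest, the Message factory methods, and ChatModel::chat); the import paths are assumptions:
use synaptic::core::{ChatModel, ChatRequest, Message, SynapticError};
use synaptic::retrieval::Retriever; // exact path may differ

async fn answer(
    retriever: &dyn Retriever,
    model: &dyn ChatModel,
    question: &str,
) -> Result<String, SynapticError> {
    // 1. Retrieve and format the context.
    let docs = retriever.retrieve(question).await?;
    let context = docs.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
    // 2. Ground the generation in the retrieved context.
    let request = ChatRequest::new(vec![
        Message::system(&format!("Answer using only this context:\n\n{context}")),
        Message::human(question),
    ]);
    let response = model.chat(request).await?;
    Ok(response.message.content().to_string())
}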
See Also
- Document Loaders -- loading data from files and the web
- Text Splitters -- splitting documents into chunks
- Embeddings -- embedding models for vector search
- Vector Stores -- storing and searching vectors
- BM25 Retriever -- keyword-based retrieval
- Ensemble Retriever -- combining multiple retrievers
- Self-Query Retriever -- LLM-powered metadata filtering
- Runnables & LCEL -- composing retrieval into chains
Graph
LCEL chains are powerful for linear pipelines, but some workflows need cycles, conditional branching, checkpointed state, and human intervention. The graph system (Synaptic's equivalent of LangGraph) provides these capabilities through a state-machine abstraction. This page explains the graph model, its key concepts, and how it differs from chain-based composition.
Why Graphs?
Consider a ReAct agent. The LLM calls tools, sees the results, and decides whether to call more tools or produce a final answer. This is a loop -- the execution path is not known in advance. LCEL chains compose linearly (A | B | C), but a ReAct agent needs to go from A to B, then back to A, then conditionally to C.
Graphs solve this. Each step is a node, transitions are edges, and the graph runtime handles routing, checkpointing, and streaming. The execution path emerges at runtime based on the state.
State
Every graph operates on a shared state type that implements the State trait:
pub trait State: Send + Sync + Clone + 'static {
fn merge(&mut self, other: Self);
}
The merge() method defines how state updates are combined. When a node returns a new state, it is merged into the current state. This is the graph's "reducer" -- it determines how concurrent or sequential updates compose.
MessageState
Synaptic provides MessageState as the built-in state type for conversational agents:
pub struct MessageState {
pub messages: Vec<Message>,
}
Its merge() implementation appends new messages to the existing list. This means each node can add messages (LLM responses, tool results, etc.) and they accumulate naturally.
You can define custom state types for non-conversational workflows. Any Clone + Send + Sync + 'static type that implements State (specifically, the merge method) can be used.
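For example, a non-conversational pipeline might track collected items and a completion flag. A minimal sketch, assuming the State trait is importable from synaptic::graph:
use synaptic::graph::State;

#[derive(Clone, Default)]
struct PipelineState {
    items: Vec<String>,
    done: bool,
}

impl State for PipelineState {
    fn merge(&mut self, other: Self) {
        // Updates accumulate: later nodes append items and may flip the flag.
        self.items.extend(other.items);
        self.done = self.done || other.done;
    }
}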
Nodes
A node is a unit of computation within the graph:
#[async_trait]
pub trait Node<S: State>: Send + Sync {
async fn process(&self, state: S) -> Result<NodeOutput<S>, SynapticError>;
}
A node receives the current state, does work, and returns a NodeOutput<S>:
- NodeOutput::State(S) -- a regular state update. The From<S> impl lets you write Ok(state.into()).
- NodeOutput::Command(Command<S>) -- a control flow command: dynamic routing (Command::goto), early termination (Command::end), or interrupts (interrupt()).
FnNode wraps an async closure into a node, which is the most common way to define nodes:
let my_node = FnNode::new(|state: MessageState| async move {
// Process state, add messages, etc.
Ok(state.into())
});
ToolNode is a pre-built node that extracts tool calls from the last AI message, executes them, and appends the results. The tools_condition function provides standard routing: returns "tools" if the last message has tool calls, else END.
Building a Graph
StateGraph<S> is the builder:
use synaptic::graph::{StateGraph, MessageState, END};
let graph = StateGraph::new()
.add_node("step_1", node_1)
.add_node("step_2", node_2)
.set_entry_point("step_1")
.add_edge("step_1", "step_2")
.add_edge("step_2", END)
.compile()?;
add_node(name, node)
Registers a named node. Names are arbitrary strings. Two special constants exist: START (the entry sentinel) and END (the exit sentinel). You never add START or END as nodes -- they are implicit.
set_entry_point(name)
Defines which node executes first after START.
add_edge(source, target)
A fixed edge -- after source completes, always go to target. The target can be END to terminate the graph.
add_conditional_edges(source, router_fn)
A conditional edge -- after source completes, call router_fn with the current state to determine the next node:
.add_conditional_edges("agent", |state: &MessageState| {
if state.last_message().map_or(false, |m| !m.tool_calls().is_empty()) {
"tools".to_string()
} else {
END.to_string()
}
})
The router function receives a reference to the state and returns the name of the next node (or END).
There is also add_conditional_edges_with_path_map(), which additionally provides a mapping from router return values to node names. This path map is used by visualization tools to render the conditional branches.
compile()
Validates the graph (checks that all referenced nodes exist, that the entry point is set, etc.) and returns a CompiledGraph<S>.
Executing a Graph
CompiledGraph<S> provides two execution methods:
invoke(state)
Runs the graph and returns a GraphResult<S>:
let initial = MessageState::with_messages(vec![Message::human("Hello")]);
let result = graph.invoke(initial).await?;
match result {
GraphResult::Complete(state) => println!("Done: {} messages", state.messages.len()),
GraphResult::Interrupted { state, interrupt_value } => {
println!("Paused: {interrupt_value}");
}
}
// Or use convenience methods:
let state = result.into_state(); // works for both Complete and Interrupted
stream(state, mode)
Returns a GraphStream that yields GraphEvent<S> after each node executes:
use futures::StreamExt;
use synaptic::graph::StreamMode;
let mut stream = graph.stream(initial, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node '{}' completed", event.node);
}
StreamMode::Values yields the full state after each node. StreamMode::Updates yields the per-node state changes.
Checkpointing
Graphs support state persistence through the Checkpointer trait. After each node executes, the current state and the next scheduled node are saved. This enables:
- Resumption: If the process crashes, the graph can resume from the last checkpoint.
- Human-in-the-loop: The graph can pause, persist state, and resume later after human input.
MemorySaver is the built-in in-memory checkpointer. For production use, you would implement Checkpointer with a database backend.
use std::sync::Arc;
use synaptic::graph::MemorySaver;
let checkpointer = Arc::new(MemorySaver::new());
let graph = graph.with_checkpointer(checkpointer);
Checkpoints are identified by a CheckpointConfig that includes a thread_id. Different threads have independent checkpoint histories.
get_state / get_state_history
You can inspect the current state and full history of a checkpointed graph:
let current = graph.get_state(&config).await?;
let history = graph.get_state_history(&config).await?;
get_state_history() returns a list of (state, next_node) pairs, ordered from oldest to newest.
Human-in-the-Loop
Two mechanisms pause graph execution for human intervention:
interrupt_before(nodes)
The graph pauses before executing the named nodes. The current state is checkpointed, and the graph returns GraphResult::Interrupted.
let graph = StateGraph::new()
// ...
.interrupt_before(vec!["tools".into()])
.compile()?;
After the interrupt, the human can inspect the state (e.g., review proposed tool calls), modify it via update_state(), and resume execution:
// Inspect the proposed tool calls
let state = graph.get_state(&config).await?.unwrap();
// Modify state if needed
graph.update_state(&config, updated_state).await?;
// Resume execution
let result = graph.invoke_with_config(
MessageState::default(),
Some(config),
).await?;
let final_state = result.into_state();
interrupt_after(nodes)
The graph pauses after executing the named nodes. The node's output is already in the state, and the next node is recorded in the checkpoint. Useful for reviewing a node's output before proceeding.
Programmatic interrupt()
Nodes can also interrupt programmatically using the interrupt() function:
use synaptic::graph::{interrupt, NodeOutput};
// Inside a node's process() method:
Ok(interrupt(serde_json::json!({"question": "Approve?"})))
This returns GraphResult::Interrupted with the specified value, which the caller can inspect via result.interrupt_value().
Dynamic Control Flow with Command
Nodes can override normal edge-based routing by returning NodeOutput::Command(...):
Command::goto(target)
Redirects execution to a specific node, skipping normal edge resolution:
Ok(NodeOutput::Command(Command::goto("summary")))
Command::goto_with_update(target, state_delta)
Routes to a node while also applying a state update:
Ok(NodeOutput::Command(Command::goto_with_update("next", delta)))
Command::end()
Ends graph execution immediately:
Ok(NodeOutput::Command(Command::end()))
Command::update(state_delta)
Applies a state update without overriding routing (uses normal edges):
Ok(NodeOutput::Command(Command::update(delta)))
Commands take priority over edges. After a node executes, the graph checks for a command before consulting edges. This enables dynamic, state-dependent control flow that goes beyond what static edge definitions can express.
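Putting these pieces together, here is a sketch of a node that overrides routing only when the conversation grows too long; the 20-message threshold and the "summary" node name are illustrative:
let router_node = FnNode::new(|state: MessageState| async move {
    if state.messages.len() > 20 {
        // Too much history: jump straight to a summarization node.
        Ok(NodeOutput::Command(Command::goto("summary")))
    } else {
        // Otherwise fall through to normal edge-based routing.
        Ok(state.into())
    }
});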
Send (Fan-out)
The Send mechanism allows a node to dispatch work to multiple target nodes via Command::send(), enabling fan-out (map-reduce) patterns within the graph.
Visualization
CompiledGraph provides multiple rendering methods:
| Method | Output | Requirements |
|---|---|---|
draw_mermaid() | Mermaid flowchart string | None |
draw_ascii() | Plain text summary | None |
draw_dot() | Graphviz DOT format | None |
draw_png(path) | PNG image file | Graphviz dot in PATH |
draw_mermaid_png(path) | PNG via mermaid.ink API | Internet access |
draw_mermaid_svg(path) | SVG via mermaid.ink API | Internet access |
Display is also implemented, so println!("{graph}") outputs the ASCII representation.
Mermaid output example for a ReAct agent:
graph TD
__start__(["__start__"])
agent["agent"]
tools["tools"]
__end__(["__end__"])
__start__ --> agent
tools --> agent
agent -.-> |tools| tools
agent -.-> |__end__| __end__
Prebuilt Multi-Agent Patterns
Beyond create_react_agent, Synaptic provides two multi-agent graph constructors:
create_supervisor
Builds a supervisor graph where a central LLM orchestrates sub-agents. The supervisor decides which agent to delegate to by calling handoff tools (transfer_to_<agent_name>). Each sub-agent is itself a compiled react agent graph.
use synaptic::graph::{create_supervisor, SupervisorOptions};
let agents = vec![
("researcher".to_string(), researcher_graph),
("writer".to_string(), writer_graph),
];
let graph = create_supervisor(supervisor_model, agents, SupervisorOptions::default())?;
The supervisor loop: supervisor calls LLM → if handoff tool call, route to sub-agent → sub-agent runs to completion → return to supervisor → repeat until supervisor produces a final answer (no tool calls).
create_swarm
Builds a swarm graph where agents hand off to each other peer-to-peer, without a central coordinator. Each agent has its own model, tools, and system prompt. Handoff is done via transfer_to_<agent_name> tool calls.
use synaptic::graph::{create_swarm, SwarmAgent, SwarmOptions};
let agents = vec![
SwarmAgent { name: "triage".into(), model, tools, system_prompt: Some("...".into()) },
SwarmAgent { name: "support".into(), model, tools, system_prompt: Some("...".into()) },
];
let graph = create_swarm(agents, SwarmOptions::default())?;
The first agent in the list is the entry point. Each agent runs until it either produces a final answer or hands off to another agent.
Safety Limits
The graph runtime enforces a maximum of 100 iterations per execution to prevent infinite loops. If a graph cycles more than 100 times, it returns SynapticError::Graph("max iterations (100) exceeded"). This is a safety guard, not a configurable limit -- if your workflow legitimately needs more iterations, the graph structure should be reconsidered.
See Also
- State & Nodes -- building custom nodes and state types
- Command & Routing -- dynamic control flow with Command
- Interrupt & Resume -- programmatic interrupts
- Human-in-the-Loop -- pausing for human input
- Streaming -- graph streaming with StreamMode
- Supervisor -- supervisor pattern how-to
- Swarm -- swarm pattern how-to
- Tool Node -- ToolNode and tools_condition
Streaming
LLM responses can take seconds to generate. Without streaming, the user sees nothing until the entire response is complete. Streaming delivers tokens as they are produced, reducing perceived latency and enabling real-time UIs. This page explains how streaming works across Synaptic's layers -- from individual model calls through LCEL chains to graph execution.
Model-Level Streaming
The ChatModel trait provides two methods:
#[async_trait]
pub trait ChatModel: Send + Sync {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, SynapticError>;
fn stream_chat(&self, request: ChatRequest) -> ChatStream<'_>;
}
chat() waits for the complete response. stream_chat() returns a ChatStream immediately:
pub type ChatStream<'a> =
Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send + 'a>>;
This is a pinned, boxed, async stream of AIMessageChunk values. Each chunk contains a fragment of the response -- typically a few tokens of text, part of a tool call, or usage information.
Default Implementation
The stream_chat() method has a default implementation that wraps chat() as a single-chunk stream. If a model adapter does not implement true streaming, it falls back to this behavior -- the caller still gets a stream, but it contains only one chunk (the complete response). This means code that consumes a ChatStream works with any model, whether or not it supports true streaming.
Consuming a Stream
use futures::StreamExt;
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
print!("{}", chunk.content); // print tokens as they arrive
}
AIMessageChunk Merging
Streaming produces many chunks that must be assembled into a complete message. AIMessageChunk supports the + and += operators:
let mut accumulated = AIMessageChunk::default();
while let Some(chunk) = stream.next().await {
accumulated += chunk?;
}
let complete_message: Message = accumulated.into_message();
The merge rules:
- content: Concatenated via push_str. Each chunk's content fragment is appended to the accumulated string.
- tool_calls: Extended. Chunks may carry partial or complete tool call objects.
- tool_call_chunks: Extended. Raw partial tool call data from the provider.
- invalid_tool_calls: Extended.
- id: The first non-None value wins. Subsequent chunks do not overwrite the ID.
- usage: Summed field-by-field. If both sides have usage data, input_tokens, output_tokens, and total_tokens are added together. If only one side has usage, it is preserved.
After accumulation, into_message() converts the chunk into a Message::AI with the complete content and tool calls.
LCEL Streaming
The Runnable trait includes a stream() method:
fn stream<'a>(&'a self, input: I, config: &'a RunnableConfig) -> RunnableOutputStream<'a, O>;
The default implementation wraps invoke() as a single-item stream, similar to the model-level default. Components that support true streaming override this method.
Streaming Through Chains
When you call stream() on a BoxRunnable chain (e.g., prompt | model | parser), the behavior is:
- Intermediate steps run their invoke() method and pass the result forward.
- The final component in the chain streams its output.
This means in a prompt | model | parser chain, the prompt template runs synchronously, the model truly streams, and the parser processes each chunk as it arrives (if it supports streaming) or waits for the complete output (if it does not).
let chain = prompt_template.boxed() | model_runnable.boxed() | parser.boxed();
let mut stream = chain.stream(input, &config);
while let Some(item) = stream.next().await {
let output = item?;
// Process each streamed output
}
RunnableGenerator
For producing custom streams, RunnableGenerator wraps an async function that returns a stream:
let generator = RunnableGenerator::new(|input: String, _config| {
Box::pin(async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_string());
}
})
});
This is useful when you need to inject a streaming source into an LCEL chain that is not a model.
Graph Streaming
Graph execution can also stream, yielding events after each node completes:
use synaptic::graph::StreamMode;
let mut stream = graph.stream(initial_state, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node '{}' completed. Messages: {}", event.node, event.state.messages.len());
}
StreamMode
| Mode | Yields | Use Case |
|---|---|---|
Values | Full state after each node | When you need the complete picture at each step |
Updates | Per-node state changes (what each node returned) | When you want to observe what each node changed |
GraphEvent
pub struct GraphEvent<S> {
pub node: String,
pub state: S,
}
Each event tells you which node just executed and what the state looks like. For a ReAct agent, you would see alternating "agent" and "tools" events, with messages accumulating in the state.
When to Use Streaming
Use model-level streaming when you need token-by-token output for a chat UI or when you want to show partial results to the user as they are generated.
Use LCEL streaming when you have a chain of operations and want the final output to stream. The intermediate steps run synchronously, but the user sees the final result incrementally.
Use graph streaming when you have a multi-step workflow and want to observe progress. Each node completion is an event, giving you visibility into the graph's execution.
Streaming and Error Handling
Streams can yield errors at any point. A network failure mid-stream, a malformed chunk from the provider, or a graph node failure all produce Err items in the stream. Consumers should handle errors on each next() call:
while let Some(result) = stream.next().await {
match result {
Ok(chunk) => process(chunk),
Err(e) => {
eprintln!("Stream error: {e}");
break;
}
}
}
There is no automatic retry at the stream level. If a stream fails mid-way, the consumer decides how to handle it -- retry the entire call, return a partial result, or propagate the error. For automatic retries, wrap the model in a RetryChatModel before streaming, which retries the entire request on failure.
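For example, wrapping the model first and then streaming through the wrapper; the constructor shape follows the RetryChatModel example in the Error Handling chapter, and the retry count, delay type, and delay value are illustrative:
use futures::StreamExt;
use std::time::Duration;
use synaptic::models::RetryChatModel;

// Retries the whole request on failure; each successful attempt streams from the start.
let robust_model = RetryChatModel::new(model, 3, Duration::from_millis(500));
let mut stream = robust_model.stream_chat(request);
while let Some(chunk) = stream.next().await {
    print!("{}", chunk?.content);
}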
See Also
- Chat Model Streaming -- model-level streaming how-to
- LCEL Streaming -- streaming through runnable chains
- Graph Streaming -- graph-level streaming with StreamMode
- Runnables & LCEL -- the composition system that streams run through
Middleware
Middleware intercepts and transforms agent behavior at well-defined lifecycle points. Rather than modifying agent logic directly, middleware wraps around model calls and tool calls, adding cross-cutting concerns like rate limiting, human approval, summarization, and context management. This page explains the middleware abstraction, the lifecycle hooks, and the available middleware classes.
The AgentMiddleware Trait
All middleware implements a single trait with six hooks:
#[async_trait]
pub trait AgentMiddleware: Send + Sync {
async fn before_agent(&self, state: &MessageState) -> Result<(), SynapticError> { Ok(()) }
async fn after_agent(&self, state: &MessageState) -> Result<(), SynapticError> { Ok(()) }
async fn before_model(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError> { Ok(()) }
async fn after_model(&self, response: &mut ChatResponse) -> Result<(), SynapticError> { Ok(()) }
async fn wrap_model_call(&self, messages: Vec<Message>, next: ModelCallFn) -> Result<ChatResponse, SynapticError>;
async fn wrap_tool_call(&self, name: &str, args: &Value, next: ToolCallFn) -> Result<Value, SynapticError>;
}
Each hook has a default implementation that passes through unchanged. Middleware only overrides the hooks it needs.
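As a sketch, a middleware that only watches the outgoing history can override before_model and nothing else; the import paths are assumptions, and wrap_model_call / wrap_tool_call are assumed to keep the pass-through defaults described here:
use async_trait::async_trait;
use synaptic::core::{Message, SynapticError};
use synaptic::middleware::AgentMiddleware; // path assumed

struct HistoryLogger;

#[async_trait]
impl AgentMiddleware for HistoryLogger {
    // Runs before every LLM request; the hook could also trim or rewrite `messages`.
    async fn before_model(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError> {
        println!("sending {} messages to the model", messages.len());
        Ok(())
    }
}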
Lifecycle
A single agent turn follows this sequence:
before_agent → before_model → wrap_model_call → after_model → wrap_tool_call (per tool) → after_agent
- before_agent -- called once at the start of each agent turn. Use for setup, logging, or state inspection.
- before_model -- called before the LLM request. Can modify messages (e.g., inject context, trim history).
- wrap_model_call -- wraps the actual model invocation. Can retry, add fallbacks, or replace the call entirely.
- after_model -- called after the LLM responds. Can modify the response (e.g., fix tool calls, add metadata).
- wrap_tool_call -- wraps each tool invocation. Can approve/reject, add logging, or modify arguments.
- after_agent -- called once at the end of each agent turn. Use for cleanup or state persistence.
MiddlewareChain
Multiple middleware instances are composed into a MiddlewareChain. The chain applies middleware in order for "before" hooks and in reverse order for "after" hooks (onion model):
use synaptic::middleware::MiddlewareChain;
let chain = MiddlewareChain::new(vec![
Arc::new(ToolCallLimitMiddleware::new(10)),
Arc::new(HumanInTheLoopMiddleware::new(callback)),
Arc::new(SummarizationMiddleware::new(model, 4000)),
]);
Available Middleware
ToolCallLimitMiddleware
Limits the total number of tool calls per agent session. When the limit is reached, subsequent tool calls return an error instead of executing.
- Use case: Preventing runaway agents that call tools in an infinite loop.
- Configuration: ToolCallLimitMiddleware::new(max_calls)
HumanInTheLoopMiddleware
Routes tool calls through an approval callback before execution. The callback receives the tool name and arguments and returns an approval decision.
- Use case: High-stakes operations (database writes, external API calls) that require human review.
- Configuration: HumanInTheLoopMiddleware::new(callback), or .for_tools(vec!["dangerous_tool"]) to guard only specific tools.
SummarizationMiddleware
Monitors message history length and summarizes older messages when a token threshold is exceeded. Replaces distant messages with a summary while preserving recent ones.
- Use case: Long-running agents that accumulate large message histories.
- Configuration: SummarizationMiddleware::new(summarizer_model, token_threshold)
ContextEditingMiddleware
Transforms the message history before each model call using a configurable strategy:
- ContextStrategy::LastN(n) -- keep only the last N messages (preserving leading system messages).
- ContextStrategy::StripToolCalls -- remove tool call/result messages, keeping only human and AI content messages.
ModelRetryMiddleware
Wraps the model call with retry logic, attempting the call multiple times on transient failures.
ModelFallbackMiddleware
Provides fallback models when the primary model fails. Tries alternatives in order until one succeeds.
Middleware vs. Graph Features
Middleware and graph features (checkpointing, interrupts) serve different purposes:
| Concern | Middleware | Graph |
|---|---|---|
| Tool approval | HumanInTheLoopMiddleware | interrupt_before("tools") |
| Context management | ContextEditingMiddleware | Custom node logic |
| Rate limiting | ToolCallLimitMiddleware | Not applicable |
| State persistence | Not applicable | Checkpointer |
Middleware operates within a single agent node. Graph features operate across the entire graph. Use middleware for per-turn concerns and graph features for workflow-level concerns.
See Also
- Middleware How-to Guides -- detailed usage for each middleware class
- Tool Call Limit -- limiting tool calls
- Human-in-the-Loop -- approval workflows
- Summarization -- automatic context summarization
- Context Editing -- message history strategies
Key-Value Store
The key-value store provides persistent, namespaced storage for structured data. Unlike memory (which stores conversation messages by session), the store holds arbitrary key-value items organized into hierarchical namespaces. It supports CRUD operations, namespace listing, and optional semantic search when an embeddings model is configured.
The Store Trait
The Store trait is defined in synaptic-core and implemented in synaptic-store:
#[async_trait]
pub trait Store: Send + Sync {
async fn put(&self, namespace: &[&str], key: &str, value: Item) -> Result<(), SynapticError>;
async fn get(&self, namespace: &[&str], key: &str) -> Result<Option<Item>, SynapticError>;
async fn delete(&self, namespace: &[&str], key: &str) -> Result<(), SynapticError>;
async fn search(&self, namespace: &[&str], query: &SearchQuery) -> Result<Vec<Item>, SynapticError>;
async fn list_namespaces(&self, prefix: &[&str]) -> Result<Vec<Vec<String>>, SynapticError>;
}
Namespace Hierarchy
Namespaces are arrays of strings, forming a path-like hierarchy:
// Store user preferences
store.put(&["users", "alice", "preferences"], "theme", item).await?;
// Store project data
store.put(&["projects", "my-app", "config"], "settings", item).await?;
// List all user namespaces
let namespaces = store.list_namespaces(&["users"]).await?;
// [["users", "alice", "preferences"], ["users", "bob", "preferences"]]
Items in different namespaces are completely isolated. A get or search in one namespace never returns items from another.
Item
The Item struct holds the stored value:
pub struct Item {
pub key: String,
pub value: Value, // serde_json::Value
pub namespace: Vec<String>,
pub created_at: Option<DateTime<Utc>>,
pub updated_at: Option<DateTime<Utc>>,
pub score: Option<f32>, // populated by semantic search
}
The score field is None for regular CRUD operations and is populated only when items are returned from a semantic search query.
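For example, reading an item back and inspecting its JSON payload (the namespace, key, and field name are illustrative):
// `get` returns Ok(None) when the key does not exist in that namespace.
if let Some(item) = store.get(&["users", "alice", "preferences"], "theme").await? {
    let name = item.value.get("name").and_then(|v| v.as_str()).unwrap_or("default");
    println!("theme = {name}, updated at {:?}", item.updated_at);
}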
InMemoryStore
The built-in implementation uses Arc<RwLock<HashMap>> for thread-safe concurrent access:
use synaptic::store::InMemoryStore;
let store = InMemoryStore::new();
Suitable for development, testing, and applications that don't need persistence across restarts. For production use, implement the Store trait with a database backend.
Semantic Search
When an embeddings model is configured, the store supports semantic search -- finding items by meaning rather than exact key match:
use synaptic::store::InMemoryStore;
let store = InMemoryStore::with_embeddings(embeddings_model);
// Items are automatically embedded when stored
store.put(&["docs"], "rust-intro", item).await?;
// Search by semantic similarity
let results = store.search(&["docs"], &SearchQuery {
query: Some("programming language".into()),
limit: 5,
..Default::default()
}).await?;
Each returned item has a score field (0.0 to 1.0) indicating semantic similarity to the query.
Store vs. Memory
| Aspect | Store | Memory (MemoryStore) |
|---|---|---|
| Purpose | General key-value storage | Conversation message history |
| Keyed by | Namespace + key | Session ID |
| Value type | Arbitrary JSON (Value) | Message |
| Operations | CRUD + search + list | Append + load + clear |
| Search | Semantic (with embeddings) | Not applicable |
| Use case | Agent knowledge, user profiles, configuration | Chat history, context management |
Use memory for conversation state. Use the store for everything else -- agent knowledge bases, user preferences, cached computations, cross-session data.
Store in the Graph
The store is accessible within graph nodes through the ToolRuntime:
// Inside a RuntimeAwareTool
async fn call_with_runtime(&self, args: Value, runtime: &ToolRuntime) -> Result<Value, SynapticError> {
if let Some(store) = &runtime.store {
let item = store.get(&["memory"], "context").await?;
// Use stored data in tool execution
}
Ok(json!({"status": "ok"}))
}
This enables tools to read and write persistent data during graph execution without passing the store through function arguments.
See Also
- Key-Value Store How-to -- usage examples and patterns
- Runtime-Aware Tools -- accessing the store from tools
- Deep Agent Backends -- StoreBackend uses the Store trait
Integrations
Synaptic uses a provider-centric architecture for external service integrations. Each integration lives in its own crate, depends only on synaptic-core (plus any provider SDK), and implements one or more core traits.
Architecture
synaptic-core (defines traits)
├── synaptic-openai (ChatModel + Embeddings)
├── synaptic-anthropic (ChatModel)
├── synaptic-gemini (ChatModel)
├── synaptic-ollama (ChatModel + Embeddings)
├── synaptic-bedrock (ChatModel)
├── synaptic-cohere (DocumentCompressor)
├── synaptic-qdrant (VectorStore)
├── synaptic-pgvector (VectorStore)
├── synaptic-pinecone (VectorStore)
├── synaptic-chroma (VectorStore)
├── synaptic-mongodb (VectorStore)
├── synaptic-elasticsearch (VectorStore)
├── synaptic-redis (Store + LlmCache)
├── synaptic-sqlite (LlmCache)
├── synaptic-pdf (Loader)
└── synaptic-tavily (Tool)
All integration crates share a common pattern:
- Core traits — ChatModel, Embeddings, VectorStore, Store, LlmCache, and Loader are defined in synaptic-core
- Independent crates — Each integration is a separate crate with its own feature flag
- Zero coupling — Integration crates never depend on each other
- Config structs — Builder-pattern configuration with new() + with_*() methods
Core Traits
| Trait | Purpose | Crate Implementations |
|---|---|---|
ChatModel | LLM chat completion | openai, anthropic, gemini, ollama, bedrock |
Embeddings | Text embedding vectors | openai, ollama |
VectorStore | Vector similarity search | qdrant, pgvector, pinecone, chroma, mongodb, elasticsearch, (+ in-memory) |
Store | Key-value storage | redis, (+ in-memory) |
LlmCache | LLM response caching | redis, sqlite, (+ in-memory) |
Loader | Document loading | pdf, (+ text, json, csv, directory) |
DocumentCompressor | Document reranking/filtering | cohere, (+ embeddings filter) |
Tool | Agent tool | tavily, (+ custom tools) |
LLM Provider Pattern
All LLM providers follow the same pattern — a config struct, a model struct, and a ProviderBackend for HTTP transport:
use synaptic::openai::{OpenAiChatModel, OpenAiConfig};
use synaptic::models::{HttpBackend, FakeBackend};
// Production
let config = OpenAiConfig::new("sk-...", "gpt-4o");
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
// Testing (no network calls)
let model = OpenAiChatModel::new(config, Arc::new(FakeBackend::with_responses(vec![...])));
The ProviderBackend abstraction (in synaptic-models) enables:
- HttpBackend — real HTTP calls in production
- FakeBackend — deterministic responses in tests
Storage & Retrieval Pattern
Vector stores, key-value stores, and caches implement core traits that allow drop-in replacement:
// Swap InMemoryVectorStore for QdrantVectorStore — same trait interface
use synaptic::qdrant::{QdrantVectorStore, QdrantConfig};
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config);
store.add_documents(docs, &embeddings).await?;
let results = store.similarity_search("query", 5, &embeddings).await?;
Feature Flags
Each integration has its own feature flag in the synaptic facade crate:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant"] }
| Feature | Integration |
|---|---|
openai | OpenAI ChatModel + Embeddings (+ OpenAI-compatible providers + Azure) |
anthropic | Anthropic ChatModel |
gemini | Google Gemini ChatModel |
ollama | Ollama ChatModel + Embeddings |
bedrock | AWS Bedrock ChatModel |
cohere | Cohere Reranker |
qdrant | Qdrant vector store |
pgvector | PostgreSQL pgvector store |
pinecone | Pinecone vector store |
chroma | Chroma vector store |
mongodb | MongoDB Atlas vector search |
elasticsearch | Elasticsearch vector store |
redis | Redis store + cache |
sqlite | SQLite LLM cache |
pdf | PDF document loader |
tavily | Tavily search tool |
Convenience combinations: models (all LLM providers plus cohere), agent (includes openai), rag (includes openai + retrieval stack), full (everything).
Provider Selection Guide
Choose a provider based on your requirements:
| Provider | Auth | Streaming | Tool Calling | Embeddings | Best For |
|---|---|---|---|---|---|
| OpenAI | API key (header) | SSE | Yes | Yes | General-purpose, widest model selection |
| Anthropic | API key (x-api-key) | SSE | Yes | No | Long context, reasoning tasks |
| Gemini | API key (query param) | SSE | Yes | No | Google ecosystem, multimodal |
| Ollama | None (local) | NDJSON | Yes | Yes | Privacy-sensitive, offline, development |
| Bedrock | AWS IAM | AWS SDK | Yes | No | Enterprise AWS environments |
| OpenAI-Compatible | Varies | SSE | Varies | Varies | Cost optimization (Groq, DeepSeek, etc.) |
Deciding factors:
- Privacy & compliance — Ollama runs entirely locally; Bedrock keeps data within AWS
- Cost — Ollama is free; OpenAI-compatible providers (Groq, DeepSeek) offer competitive pricing
- Latency — Ollama has no network round-trip; Groq is optimized for speed
- Ecosystem — OpenAI has the most third-party integrations; Bedrock integrates with AWS services
Vector Store Selection Guide
| Store | Deployment | Managed | Filtering | Scaling | Best For |
|---|---|---|---|---|---|
| Qdrant | Self-hosted / Cloud | Yes (Qdrant Cloud) | Rich (payload filters) | Horizontal | General-purpose, production |
| pgvector | Self-hosted | Via managed Postgres | SQL WHERE clauses | Vertical | Teams already using PostgreSQL |
| Pinecone | Fully managed | Yes | Metadata filters | Automatic | Zero-ops, rapid prototyping |
| Chroma | Self-hosted / Docker | No | Metadata filters | Single node | Development, small-medium datasets |
| MongoDB Atlas | Fully managed | Yes | MQL filters | Automatic | Teams already using MongoDB |
| Elasticsearch | Self-hosted / Cloud | Yes (Elastic Cloud) | Full query DSL | Horizontal | Hybrid text + vector search |
| InMemory | In-process | N/A | None | N/A | Testing, prototyping |
Deciding factors:
- Existing infrastructure — Use pgvector if you have PostgreSQL, MongoDB Atlas if you use MongoDB, Elasticsearch if you already run an ES cluster
- Operational complexity — Pinecone and MongoDB Atlas are fully managed; Qdrant and Elasticsearch require cluster management
- Query capabilities — Elasticsearch excels at hybrid text + vector queries; Qdrant has the richest filtering
- Cost — InMemory and Chroma are free; pgvector reuses existing database infrastructure
Cache Selection Guide
| Cache | Persistence | Deployment | TTL Support | Best For |
|---|---|---|---|---|
| InMemory | No (process lifetime) | In-process | Yes | Testing, single-process apps |
| Redis | Yes (configurable) | External server | Yes | Multi-process, distributed |
| SQLite | Yes (file-based) | In-process | Yes | Single-machine persistence |
| Semantic | Depends on backing store | In-process | No | Fuzzy-match caching |
Complete RAG Pipeline Example
This example combines multiple integrations into a full retrieval-augmented generation pipeline with caching and reranking:
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings};
use synaptic::openai::{OpenAiChatModel, OpenAiConfig, OpenAiEmbeddings};
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::cohere::{CohereReranker, CohereConfig};
use synaptic::cache::{CachedChatModel, InMemoryCache};
use synaptic::retrieval::ContextualCompressionRetriever;
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
// 1. Set up embeddings
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 2. Ingest documents into Qdrant
let loader = TextLoader::new("knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
let qdrant_config = QdrantConfig::new("http://localhost:6334", "knowledge", 1536);
let store = QdrantVectorStore::new(qdrant_config, embeddings.clone()).await?;
store.add_documents(&chunks).await?;
// 3. Build retriever with Cohere reranking
let base_retriever = Arc::new(VectorStoreRetriever::new(Arc::new(store)));
let reranker = CohereReranker::new(CohereConfig::new(std::env::var("COHERE_API_KEY")?));
let retriever = ContextualCompressionRetriever::new(base_retriever, Arc::new(reranker));
// 4. Wrap the LLM with a cache
let llm_config = OpenAiConfig::new(std::env::var("OPENAI_API_KEY")?, "gpt-4o");
let base_model = OpenAiChatModel::new(llm_config, backend.clone());
let cache = Arc::new(InMemoryCache::new());
let model = CachedChatModel::new(Arc::new(base_model), cache);
// 5. Retrieve and generate
let relevant = retriever.retrieve("How does Synaptic handle streaming?").await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on the following context:\n\n{context}")),
Message::human("How does Synaptic handle streaming?"),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
This pipeline demonstrates:
- Qdrant for vector storage and retrieval
- Cohere for reranking retrieved documents
- InMemoryCache for caching LLM responses (swap with Redis/SQLite for persistence)
- OpenAI for both embeddings and chat completion
Adding a New Integration
To add a new integration:
- Create a new crate synaptic-{name} in crates/
- Depend on synaptic-core for trait definitions
- Implement the appropriate trait(s)
- Add a feature flag in the synaptic facade crate
- Re-export via pub use synaptic_{name} as {name} in the facade lib.rs
See Also
- Installation — Feature flag reference
- Architecture — Overall system design
Error Handling
Synaptic uses a single error enum, SynapticError, across the entire framework. Every async function returns Result<T, SynapticError>, and errors propagate naturally with the ? operator. This page explains the error model, the available variants, and the patterns for handling and recovering from errors.
SynapticError
#[derive(Debug, Error)]
pub enum SynapticError {
#[error("prompt error: {0}")] Prompt(String),
#[error("model error: {0}")] Model(String),
#[error("tool error: {0}")] Tool(String),
#[error("tool not found: {0}")] ToolNotFound(String),
#[error("memory error: {0}")] Memory(String),
#[error("rate limit: {0}")] RateLimit(String),
#[error("timeout: {0}")] Timeout(String),
#[error("validation error: {0}")] Validation(String),
#[error("parsing error: {0}")] Parsing(String),
#[error("callback error: {0}")] Callback(String),
#[error("max steps exceeded: {max_steps}")] MaxStepsExceeded { max_steps: usize },
#[error("embedding error: {0}")] Embedding(String),
#[error("vector store error: {0}")] VectorStore(String),
#[error("retriever error: {0}")] Retriever(String),
#[error("loader error: {0}")] Loader(String),
#[error("splitter error: {0}")] Splitter(String),
#[error("graph error: {0}")] Graph(String),
#[error("cache error: {0}")] Cache(String),
#[error("config error: {0}")] Config(String),
#[error("mcp error: {0}")] Mcp(String),
}
Twenty variants, one for each subsystem. The design is intentional:
- Single type everywhere: You never need to convert between error types. Any function in any crate can return SynapticError, and the caller can propagate it with ? without conversion.
- String payloads: Most variants carry a String message. This keeps the error type simple and avoids nested error hierarchies. The message provides context about what went wrong.
- thiserror derivation: SynapticError implements std::error::Error and Display automatically via the #[error(...)] attributes.
Variant Reference
Infrastructure Errors
| Variant | When It Occurs |
|---|---|
Model(String) | LLM provider returns an error, network failure, invalid response format |
RateLimit(String) | Provider rate limit exceeded, token bucket exhausted |
Timeout(String) | Request timed out |
Config(String) | Invalid configuration (missing API key, bad parameters) |
Input/Output Errors
| Variant | When It Occurs |
|---|---|
Prompt(String) | Template variable missing, invalid template syntax |
Validation(String) | Input fails validation (e.g., empty message list, invalid schema) |
Parsing(String) | Output parser cannot extract structured data from LLM response |
Tool Errors
| Variant | When It Occurs |
|---|---|
Tool(String) | Tool execution failed (network error, computation error, etc.) |
ToolNotFound(String) | Requested tool name is not in the registry |
Subsystem Errors
| Variant | When It Occurs |
|---|---|
Memory(String) | Memory store read/write failure |
Callback(String) | Callback handler raised an error |
Embedding(String) | Embedding API failure |
VectorStore(String) | Vector store read/write failure |
Retriever(String) | Retrieval operation failed |
Loader(String) | Document loading failed (file not found, parse error) |
Splitter(String) | Text splitting failed |
Cache(String) | Cache read/write failure |
Execution Control Errors
| Variant | When It Occurs |
|---|---|
Graph(String) | Graph execution error (compilation, routing, missing nodes) |
MaxStepsExceeded { max_steps } | Agent loop exceeded the maximum iteration count |
Mcp(String) | MCP server connection, transport, or protocol error |
Error Propagation
Because every async function in Synaptic returns Result<T, SynapticError>, errors propagate naturally:
async fn process_query(model: &dyn ChatModel, query: &str) -> Result<String, SynapticError> {
let messages = vec![Message::human(query)];
let request = ChatRequest::new(messages);
let response = model.chat(request).await?; // Model error propagates
Ok(response.message.content().to_string())
}
There is no need for .map_err() conversions in application code. A Model error from a provider adapter, a Tool error from execution, or a Graph error from the state machine all flow through the same Result type.
Retry and Fallback Patterns
Not all errors are fatal. Synaptic provides several mechanisms for resilience:
RetryChatModel
Wraps a ChatModel and retries on transient failures:
use synaptic::models::RetryChatModel;
let robust_model = RetryChatModel::new(model, max_retries, delay);
On failure, it waits and retries up to max_retries times. This handles transient network errors and rate limits without application code needing to implement retry logic.
RateLimitedChatModel and TokenBucketChatModel
Proactively prevent rate limit errors by throttling requests:
- RateLimitedChatModel limits requests per time window.
- TokenBucketChatModel uses a token bucket algorithm for smooth rate limiting.
By throttling before hitting the provider's limit, these wrappers convert potential RateLimit errors into controlled delays.
RunnableWithFallbacks
Tries alternative runnables when the primary one fails:
use synaptic::runnables::RunnableWithFallbacks;
let chain = RunnableWithFallbacks::new(
primary.boxed(),
vec![fallback_1.boxed(), fallback_2.boxed()],
);
If primary fails, fallback_1 is tried with the same input. If that also fails, fallback_2 is tried. Only if all options fail does the error propagate.
RunnableRetry
Retries a runnable with configurable exponential backoff:
use std::time::Duration;
use synaptic::runnables::{RunnableRetry, RetryPolicy};
let retry = RunnableRetry::new(
flaky_step.boxed(),
RetryPolicy::default()
.with_max_attempts(4)
.with_base_delay(Duration::from_millis(200))
.with_max_delay(Duration::from_secs(5)),
);
The delay doubles after each attempt (200ms, 400ms, 800ms, ...) up to max_delay. You can also set a retry_on predicate to only retry specific error types. This is useful for any step in an LCEL chain, not just model calls.
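If only transient failures should be retried, the predicate might look like the following sketch; the with_retry_on builder name is an assumption based on the retry_on description above, while the matched variants come from SynapticError:
let policy = RetryPolicy::default()
    .with_max_attempts(4)
    .with_base_delay(Duration::from_millis(200))
    // Builder name assumed: retry only rate limits and timeouts, fail fast otherwise.
    .with_retry_on(|e: &SynapticError| matches!(e, SynapticError::RateLimit(_) | SynapticError::Timeout(_)));
let retry = RunnableRetry::new(flaky_step.boxed(), policy);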
HandleErrorTool
Wraps a tool so that errors are returned as string results instead of propagating:
use synaptic::tools::HandleErrorTool;
let safe_tool = HandleErrorTool::new(risky_tool);
When the inner tool fails, the error message becomes the tool's output. The LLM sees the error and can decide to retry with different arguments or take a different approach. This prevents a single tool failure from crashing the entire agent loop.
Graph Interrupts (Not Errors)
Human-in-the-loop interrupts in the graph system are not errors. Graph invoke() returns GraphResult<S>, which is either Complete(state) or Interrupted { state, interrupt_value }:
use synaptic::graph::GraphResult;
match graph.invoke(state).await? {
GraphResult::Complete(final_state) => {
// Graph finished normally
handle_result(final_state);
}
GraphResult::Interrupted { state: partial_state, .. } => {
// Human-in-the-loop: inspect state, get approval, resume
// The graph has checkpointed its state automatically
}
}
To extract the state regardless of completion status, use .into_state():
let state = graph.invoke(initial).await?.into_state();
Interrupts can also be triggered programmatically via Command::interrupt() from within a node:
use synaptic::graph::Command;
// Inside a node's process() method:
Command::interrupt(updated_state)
SynapticError::Graph is reserved for true errors: compilation failures, missing nodes, routing errors, and recursion limit violations.
Matching on Error Variants
Since SynapticError is an enum, you can match on specific variants to implement targeted error handling:
match result {
Ok(value) => use_value(value),
Err(SynapticError::RateLimit(_)) => {
// Wait and retry
}
Err(SynapticError::ToolNotFound(name)) => {
// Log the missing tool and continue without it
}
Err(SynapticError::Parsing(msg)) => {
// LLM output was malformed; ask the model to try again
}
Err(e) => {
// All other errors: propagate
return Err(e);
}
}
This pattern is especially useful in agent loops where some errors are recoverable (the model can try again) and others are not (network is down, API key is invalid).
See Also
- Retry & Rate Limiting -- automatic retry for model errors
- Fallbacks -- fallback chains for error recovery
- Interrupt & Resume -- graph interrupts (not errors)
API Reference
Synaptic is organized as a workspace of focused crates. Each crate has its own API documentation generated from doc comments in the source code.
Crate Reference
| Crate | Description | Docs |
|---|---|---|
synaptic-core | Shared traits and types (ChatModel, Tool, Message, SynapticError, etc.) | docs.rs |
synaptic-models | ProviderBackend abstraction, ScriptedChatModel test double, wrappers (retry, rate limit, structured output, bound tools) | docs.rs |
synaptic-openai | OpenAI provider (OpenAiChatModel, OpenAiEmbeddings) | docs.rs |
synaptic-anthropic | Anthropic provider (AnthropicChatModel) | docs.rs |
synaptic-gemini | Google Gemini provider (GeminiChatModel) | docs.rs |
synaptic-ollama | Ollama provider (OllamaChatModel, OllamaEmbeddings) | docs.rs |
synaptic-runnables | LCEL composition (Runnable trait, BoxRunnable, pipe operator, parallel, branch, fallbacks, assign, pick) | docs.rs |
synaptic-prompts | Prompt templates (PromptTemplate, ChatPromptTemplate, FewShotChatMessagePromptTemplate) | docs.rs |
synaptic-parsers | Output parsers (string, JSON, structured, list, enum, boolean, XML, fixing, retry) | docs.rs |
synaptic-tools | Tool system (ToolRegistry, SerialToolExecutor, ParallelToolExecutor) | docs.rs |
synaptic-memory | Memory strategies (buffer, window, summary, token buffer, summary buffer, RunnableWithMessageHistory) | docs.rs |
synaptic-callbacks | Callback handlers (RecordingCallback, TracingCallback, CompositeCallback) | docs.rs |
synaptic-retrieval | Retriever implementations (in-memory, BM25, multi-query, ensemble, contextual compression, self-query, parent document) | docs.rs |
synaptic-loaders | Document loaders (text, JSON, CSV, directory, file, markdown, web) | docs.rs |
synaptic-splitters | Text splitters (character, recursive character, markdown header, token, HTML header, language) | docs.rs |
synaptic-embeddings | Embeddings trait, FakeEmbeddings, CacheBackedEmbeddings | docs.rs |
synaptic-vectorstores | Vector store implementations (InMemoryVectorStore, VectorStoreRetriever, MultiVectorRetriever) | docs.rs |
synaptic-qdrant | Qdrant vector store (QdrantVectorStore) | docs.rs |
synaptic-pgvector | PostgreSQL pgvector store (PgVectorStore) | docs.rs |
synaptic-redis | Redis store and cache (RedisStore, RedisCache) | docs.rs |
synaptic-pdf | PDF document loader (PdfLoader) | docs.rs |
synaptic-graph | Graph orchestration (StateGraph, CompiledGraph, ToolNode, create_react_agent, checkpointing, streaming) | docs.rs |
synaptic-cache | LLM caching (InMemoryCache, SemanticCache, CachedChatModel) | docs.rs |
synaptic-eval | Evaluation framework (exact match, regex, JSON validity, embedding distance, LLM judge evaluators; Dataset and evaluate()) | docs.rs |
synaptic | Unified facade crate that re-exports all sub-crates under a single namespace | docs.rs |
Note: The docs.rs links above will become active once the crates are published to crates.io. In the meantime, generate local documentation as described below.
Local API Documentation
You can generate and browse the full API documentation locally with:
cargo doc --workspace --open
This builds rustdoc for every crate in the workspace and opens the result in your browser. The generated documentation includes all public types, traits, functions, and their doc comments.
To generate docs without opening the browser (useful in CI):
cargo doc --workspace --no-deps
Using the Facade Crate
If you prefer a single dependency instead of listing individual crates, use the synaptic facade:
[dependencies]
synaptic = "0.2"
Then import through the unified namespace:
use synaptic::core::Message;
use synaptic::openai::OpenAiChatModel; // requires "openai" feature
use synaptic::models::ScriptedChatModel; // requires "model-utils" feature
use synaptic::graph::create_react_agent;
use synaptic::runnables::Runnable;
Contributing
Thank you for your interest in contributing to Synaptic. This guide covers the workflow and standards for submitting changes.
Getting Started
- Fork the repository on GitHub.
- Clone your fork locally:
git clone https://github.com/<your-username>/synaptic.git
cd synaptic
- Create a branch for your changes:
git checkout -b feature/my-change
Development Workflow
Before submitting a pull request, make sure all checks pass locally.
Run Tests
cargo test --workspace
All tests must pass. If you are adding a new feature, add tests for it in the appropriate tests/ directory within the crate.
Run Clippy
cargo clippy --workspace
Fix any warnings. Clippy enforces idiomatic Rust patterns and catches common mistakes.
Check Formatting
cargo fmt --all -- --check
If this fails, run cargo fmt --all to auto-format and commit the result.
Build the Workspace
cargo build --workspace
Ensure everything compiles cleanly.
Submitting a Pull Request
- Push your branch to your fork.
- Open a pull request against the
main branch.
- Provide a clear description of what your change does and why.
- Reference any related issues.
Guidelines
Code
- Follow existing patterns in the codebase. Each crate has a consistent structure with
src/for implementation andtests/for integration tests. - All traits are async via
#[async_trait]. Tests use#[tokio::test]. - Use
Arc<RwLock<_>>for shared registries andArc<tokio::sync::Mutex<_>>for callbacks and memory. - Prefer factory methods over struct literals for core types (e.g.,
Message::human(),ChatRequest::new()).
Documentation
- When adding a new feature or changing a public API, update the corresponding documentation page in
docs/book/en/src/. - How-to guides go in
how-to/, conceptual explanations inconcepts/, and step-by-step walkthroughs intutorials/. - If your change affects the project overview, update the README at the repository root.
Tests
- Each crate has a
tests/directory with integration-style tests in separate files. - Use
ScriptedChatModelorFakeBackendfor testing model interactions without real API calls. - Use
FakeEmbeddingsfor testing embedding-dependent features.
Commit Messages
- Write clear, concise commit messages that explain the "why" behind the change.
- Use conventional prefixes when appropriate:
feat:,fix:,docs:,refactor:,test:.
Project Structure
The workspace contains 17 library crates in crates/ plus example binaries in examples/. See Architecture Overview for a detailed breakdown of the crate layers and dependency graph.
Questions
If you are unsure about an approach, open an issue to discuss before writing code. This helps avoid wasted effort and keeps changes aligned with the project direction.
Development Setup
This page covers everything you need to build, test, and run Synaptic locally.
Prerequisites
-
Rust 1.88 or later -- Install via rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Verify with:
rustc --version   # Should print 1.88.0 or later
cargo --version
cargo -- Included with the Rust toolchain. No separate install needed.
Clone the Repository
git clone https://github.com/<your-username>/synaptic.git
cd synaptic
Build
Build every crate in the workspace:
cargo build --workspace
Test
Run All Tests
cargo test --workspace
This runs unit tests and integration tests across all 17 library crates.
Test a Single Crate
cargo test -p synaptic-tools
Replace synaptic-tools with any crate name from the workspace.
Run a Specific Test by Name
cargo test -p synaptic-core -- chunk
This runs only tests whose names contain "chunk" within the synaptic-core crate.
Run Examples
The examples/ directory contains runnable binaries that demonstrate common patterns:
cargo run -p react_basic
List all available example targets with:
ls examples/
Lint
Run Clippy to catch common mistakes and enforce idiomatic patterns:
cargo clippy --workspace
Fix any warnings before submitting changes.
Format
Check that all code follows the standard Rust formatting:
cargo fmt --all -- --check
If this fails, auto-format with:
cargo fmt --all
Pre-commit Hook
The repository ships a pre-commit hook that runs cargo fmt --check automatically before each commit. Enable it once after cloning:
git config core.hooksPath .githooks
If formatting fails the hook will run cargo fmt --all for you — just re-stage the changes and commit again.
Build Documentation Locally
API Docs (rustdoc)
Generate and open the full API reference in your browser:
cargo doc --workspace --open
mdBook Site
The documentation site is built with mdBook. Install it and serve the English docs locally:
cargo install mdbook
mdbook serve docs/book/en
This starts a local server (typically at http://localhost:3000) with live reload. Edit any .md file under docs/book/en/src/ and the browser will update automatically.
To build the book without serving:
mdbook build docs/book/en
The output is written to docs/book/en/book/.
Editor Setup
Synaptic is a standard Cargo workspace. Any editor with rust-analyzer support will provide inline errors, completions, and go-to-definition across all crates. Recommended:
- VS Code with the rust-analyzer extension
- IntelliJ IDEA with the Rust plugin
- Neovim with rust-analyzer via LSP
Environment Variables
Some provider adapters require API keys at runtime (not at build time):
| Variable | Used by |
|---|---|
OPENAI_API_KEY | OpenAiChatModel, OpenAiEmbeddings |
ANTHROPIC_API_KEY | AnthropicChatModel |
GOOGLE_API_KEY | GeminiChatModel |
These are only needed when running examples or tests that hit real provider APIs. The test suite uses ScriptedChatModel, FakeBackend, and FakeEmbeddings for offline testing, so you can run cargo test --workspace without any API keys.