Introduction
Synaptic is a Rust agent framework with a LangChain-compatible architecture.
Build production-grade AI agents, chains, and retrieval pipelines in Rust with the same mental model you know from LangChain -- but with compile-time safety, zero-cost abstractions, and native async performance.
Why Synaptic?
- Type-safe -- Message types, tool definitions, and runnable pipelines are checked at compile time. No runtime surprises from mismatched schemas.
- Async-native -- Built on Tokio and async-trait from the ground up. Every trait method is async, and streaming is a first-class citizen via Stream.
- Composable -- LCEL-style pipe operator (|), parallel branches, conditional routing, and fallback chains let you build complex workflows from simple parts.
- LangChain-compatible -- Familiar concepts map directly: ChatPromptTemplate, StateGraph, create_react_agent, ToolNode, VectorStoreRetriever, and more.
Features at a Glance
| Area | What you get |
|---|---|
| Chat Models | OpenAI, Anthropic, Gemini, Ollama adapters with streaming, retry, rate limiting, and caching |
| Messages | Typed message enum with factory methods, filtering, trimming, and merge utilities |
| Prompts | Template interpolation, chat prompt templates, few-shot prompting |
| Output Parsers | String, JSON, structured, list, enum, boolean, XML parsers |
| Runnables (LCEL) | Pipe operator, parallel, branch, assign/pick, bind, fallbacks, retry |
| Tools | Tool trait, registry, serial/parallel execution, tool choice |
| Memory | Buffer, window, summary, token buffer, summary buffer strategies |
| Graph | LangGraph-style state machines with checkpointing, streaming, and human-in-the-loop |
| Retrieval | Loaders, splitters, embeddings, vector stores, BM25, multi-query, ensemble retrievers |
| Evaluation | Exact match, regex, JSON validity, embedding distance, LLM judge evaluators |
| Callbacks | Recording, tracing, composite callback handlers |
Quick Links
- What is Synaptic? -- Concept mapping from LangChain Python to Synaptic Rust
- Architecture Overview -- Layered crate design and dependency graph
- Installation -- Add Synaptic to your project
- Quickstart -- Your first Synaptic program in 30 lines
- Tutorials -- Step-by-step guides for common use cases
- API Reference -- Full API documentation
What is Synaptic?
Synaptic is a Rust framework for building AI agents, chains, and retrieval pipelines. It follows the same architecture and abstractions as LangChain (Python), translated into idiomatic Rust with strong typing, async-native design, and zero-cost abstractions.
If you have used LangChain in Python, you already know the mental model. Synaptic provides the same composable building blocks -- chat models, prompts, output parsers, runnables, tools, memory, graphs, and retrieval -- but catches errors at compile time instead of runtime.
LangChain to Synaptic Mapping
The table below shows how core LangChain Python concepts map to their Synaptic Rust equivalents:
| LangChain (Python) | Synaptic (Rust) | Crate |
|---|---|---|
| ChatOpenAI | OpenAiChatModel | synaptic-openai |
| ChatAnthropic | AnthropicChatModel | synaptic-anthropic |
| ChatGoogleGenerativeAI | GeminiChatModel | synaptic-gemini |
| HumanMessage / AIMessage | Message::human() / Message::ai() | synaptic-core |
| RunnableSequence / LCEL | BoxRunnable with the \| pipe operator | synaptic-runnables |
| RunnableLambda | RunnableLambda | synaptic-runnables |
| RunnableParallel | RunnableParallel | synaptic-runnables |
| RunnableBranch | RunnableBranch | synaptic-runnables |
| RunnablePassthrough.assign() | RunnableAssign | synaptic-runnables |
| ChatPromptTemplate | ChatPromptTemplate | synaptic-prompts |
| ToolNode | ToolNode | synaptic-graph |
| StateGraph | StateGraph | synaptic-graph |
| create_react_agent | create_react_agent | synaptic-graph |
| InMemorySaver | MemorySaver | synaptic-graph |
| StrOutputParser | StrOutputParser | synaptic-parsers |
| JsonOutputParser | JsonOutputParser | synaptic-parsers |
| VectorStoreRetriever | VectorStoreRetriever | synaptic-vectorstores |
| RecursiveCharacterTextSplitter | RecursiveCharacterTextSplitter | synaptic-splitters |
| OpenAIEmbeddings | OpenAiEmbeddings | synaptic-openai |
Key Differences from LangChain Python
While the architecture is compatible, Synaptic makes deliberate Rust-idiomatic choices:
- Message is a tagged enum, not a class hierarchy. You construct messages with factory methods like Message::human("hello") rather than instantiating classes.
- ChatRequest uses a constructor with builder methods: ChatRequest::new(messages).with_tools(tools).with_tool_choice(ToolChoice::Auto).
- All traits are async via #[async_trait]. Every chat(), invoke(), and call() is an async function.
- Concurrency uses Arc-based sharing. Registries use Arc<RwLock<_>>; callbacks and memory use Arc<tokio::sync::Mutex<_>>.
- Errors are typed. SynapticError is an enum with one variant per subsystem, not a generic exception.
- Streaming is trait-based. ChatModel::stream_chat() returns a ChatStream (a pinned Stream of AIMessageChunk), and graph streaming yields GraphEvent values. A consumption sketch follows this list.
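As an illustration, here is a minimal sketch of consuming a chat stream. It assumes stream_chat() is async and returns Result<ChatStream, SynapticError>; check the API reference for the exact signature and for how to read text out of each chunk.
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, SynapticError};

async fn stream_reply(model: &dyn ChatModel) -> Result<(), SynapticError> {
    let request = ChatRequest::new(vec![Message::human("Tell me about Rust.")]);
    // Assumption: stream_chat() is async and returns Result<ChatStream, SynapticError>.
    let mut stream = model.stream_chat(request).await?;
    while let Some(chunk) = stream.next().await {
        let _chunk = chunk?; // each AIMessageChunk carries a partial piece of the reply
        // accumulate or render the chunk here
    }
    Ok(())
}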
When to Use Synaptic
Synaptic is a good fit when you need:
- Performance-critical AI applications -- Rust's zero-cost abstractions and lack of garbage collection make Synaptic suitable for high-throughput, low-latency agent workloads. There is no Python GIL limiting concurrency.
- Rust ecosystem integration -- If your application is already written in Rust (web servers with Axum/Actix, CLI tools, embedded systems), Synaptic lets you add AI agent capabilities without crossing an FFI boundary or managing a Python subprocess.
- Compile-time safety -- Tool argument schemas, message types, and runnable pipeline signatures are all checked by the compiler. Refactoring a tool's input type produces compile errors at every call site, not runtime crashes in production.
- Deployable binaries -- Synaptic compiles to a single static binary with no runtime dependencies. No Python interpreter, no virtual environment, no pip install.
- Concurrent agent workloads -- Tokio's async runtime lets you run hundreds of concurrent agent sessions on a single machine with efficient task scheduling.
When Not to Use Synaptic
- If your team primarily writes Python and rapid prototyping speed matters more than runtime performance, LangChain Python is the more pragmatic choice.
- If you need access to the full LangChain ecosystem of third-party integrations (hundreds of vector stores, document loaders, and model providers), LangChain Python has broader coverage today.
Architecture Overview
Synaptic is organized as a Cargo workspace with 26 library crates, 1 facade crate, and several example binaries. The crates form a layered architecture where each layer builds on the one below it.
Crate Layers
Core Layer
synaptic-core defines all shared traits and types. Every other crate depends on it.
- Traits: ChatModel, Tool, RuntimeAwareTool, MemoryStore, CallbackHandler, Store, Embeddings
- Types: Message, ChatRequest, ChatResponse, ToolCall, ToolDefinition, ToolChoice, AIMessageChunk, TokenUsage, RunEvent, RunnableConfig, Runtime, ToolRuntime, ModelProfile, Item, ContentBlock
- Error type: SynapticError (20 variants covering all subsystems)
- Stream type: ChatStream (Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send>>)
Implementation Crates
Each crate implements one core trait or provides a focused capability:
| Crate | Purpose |
|---|---|
| synaptic-models | ProviderBackend abstraction, ScriptedChatModel test double, wrappers (retry, rate limit, structured output, bound tools) |
| synaptic-openai | OpenAiChatModel + OpenAiEmbeddings |
| synaptic-anthropic | AnthropicChatModel |
| synaptic-gemini | GeminiChatModel |
| synaptic-ollama | OllamaChatModel + OllamaEmbeddings |
| synaptic-tools | ToolRegistry, SerialToolExecutor, ParallelToolExecutor |
| synaptic-memory | Memory strategies: buffer, window, summary, token buffer, summary buffer, RunnableWithMessageHistory |
| synaptic-callbacks | RecordingCallback, TracingCallback, CompositeCallback |
| synaptic-prompts | PromptTemplate, ChatPromptTemplate, FewShotChatMessagePromptTemplate |
| synaptic-parsers | Output parsers: string, JSON, structured, list, enum, boolean, XML, markdown list, numbered list |
| synaptic-cache | InMemoryCache, SemanticCache, CachedChatModel |
Composition Crates
These crates provide higher-level orchestration:
| Crate | Purpose |
|---|---|
| synaptic-runnables | Runnable trait with invoke()/batch()/stream(), BoxRunnable with pipe operator, RunnableLambda, RunnableParallel, RunnableBranch, RunnableAssign, RunnablePick, RunnableWithFallbacks |
| synaptic-graph | LangGraph-style state machines: StateGraph, CompiledGraph, ToolNode, create_react_agent, create_supervisor, create_swarm, Command, GraphResult, Checkpointer, MemorySaver, multi-mode streaming |
Retrieval Pipeline
These crates form the document ingestion and retrieval pipeline:
| Crate | Purpose |
|---|---|
| synaptic-loaders | TextLoader, JsonLoader, CsvLoader, DirectoryLoader |
| synaptic-splitters | CharacterTextSplitter, RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter, TokenTextSplitter |
| synaptic-embeddings | Embeddings trait, FakeEmbeddings, CacheBackedEmbeddings |
| synaptic-vectorstores | VectorStore trait, InMemoryVectorStore, VectorStoreRetriever |
| synaptic-retrieval | Retriever trait, BM25Retriever, MultiQueryRetriever, EnsembleRetriever, ContextualCompressionRetriever, SelfQueryRetriever, ParentDocumentRetriever |
Evaluation
| Crate | Purpose |
|---|---|
| synaptic-eval | Evaluator trait, ExactMatchEvaluator, RegexMatchEvaluator, JsonValidityEvaluator, EmbeddingDistanceEvaluator, LLMJudgeEvaluator, Dataset, batch evaluation pipeline |
Advanced Crates
These crates provide specialized capabilities for production agent systems:
| Crate | Purpose |
|---|---|
| synaptic-store | Store trait implementation, InMemoryStore with semantic search (optional embeddings) |
| synaptic-middleware | AgentMiddleware trait, MiddlewareChain, built-in middleware: model retry, PII filtering, prompt caching, summarization, human-in-the-loop approval, tool call limiting |
| synaptic-mcp | Model Context Protocol adapters: MultiServerMcpClient, Stdio/SSE/HTTP transports for tool discovery and invocation |
| synaptic-macros | Procedural macros: #[tool], #[chain], #[entrypoint], #[task], #[traceable], middleware macros |
| synaptic-deep | Deep Agent harness: Backend trait (State/Store/Filesystem), 7 filesystem tools, 6 middleware, create_deep_agent() factory |
Integration Crates
These crates provide third-party service integrations:
| Crate | Purpose |
|---|---|
| synaptic-qdrant | QdrantVectorStore (Qdrant vector database) |
| synaptic-pgvector | PgVectorStore (PostgreSQL pgvector extension) |
| synaptic-redis | RedisStore + RedisCache (Redis key-value store and LLM cache) |
| synaptic-pdf | PdfLoader (PDF document loading) |
Facade
synaptic re-exports all sub-crates for convenient single-import usage:
use synaptic::core::{ChatModel, Message, ChatRequest};
use synaptic::openai::OpenAiChatModel; // requires "openai" feature
use synaptic::models::ScriptedChatModel; // requires "model-utils" feature
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::graph::{StateGraph, create_react_agent};
Dependency Diagram
All crates depend on synaptic-core for shared traits and types. Higher-level crates depend on the layer below:
┌──────────┐
│ synaptic │ (facade: re-exports all)
└─────┬────┘
│
┌──────────────┬─────────────┼──────────────┬───────────────┐
│ │ │ │ │
┌───┴───┐ ┌─────┴────┐ ┌────┴─────┐ ┌─────┴────┐ ┌─────┴───┐
│ deep │ │middleware│ │ graph │ │runnables │ │ eval │
└───┬───┘ └─────┬────┘ └────┬─────┘ └────┬─────┘ └─────┬───┘
│ │ │ │ │
├──────────────┴────┬───────┴──────────────┤ │
│ │ │ │
┌────┴──┐ ┌─────┐ ┌─────┴──┐ ┌──────┐ ┌───────┐│┌──────┐┌─────┴──┐
│models │ │tools│ │memory │ │store │ │prompts│││parsers││cache │
└───┬───┘ └──┬──┘ └───┬────┘ └──┬───┘ └───┬───┘│└───┬───┘└───┬────┘
│ │ │ │ │ │ │ │
│ ┌─────┴─┬──────┤ ┌────┘ │ │ │ │
│ │ │ │ │ │ │ │ │
├──┤ ┌────┴──┐ │ ┌─┴────┐ ┌─────┴────┴────┴────────┤
│ │ │macros │ │ │ mcp │ │ callbacks │
│ │ └───┬───┘ │ └──┬───┘ └────────┬────────────────┘
│ │ │ │ │ │
┌─┴──┴──────┴───────┴─────┴───────────────┴──┐
│ synaptic-core │
│ (ChatModel, Tool, Store, Embeddings, ...) │
└──────────────────┬──────────────────────────┘
│
Provider crates (each depends on synaptic-core + synaptic-models):
openai, anthropic, gemini, ollama
Retrieval pipeline:
loaders ──► splitters ──► embeddings ──► vectorstores ──► retrieval
Integration crates: qdrant, pgvector, redis, pdf
Design Principles
Async-first with #[async_trait]
Every trait in Synaptic is async. The ChatModel::chat() method, Tool::call(), MemoryStore::load(), and Runnable::invoke() are all async functions. This means you can freely await network calls, database queries, and concurrent operations inside any implementation without blocking the runtime.
Arc-based sharing
Synaptic uses Arc<RwLock<_>> for registries (like ToolRegistry) where many readers need concurrent access, and Arc<tokio::sync::Mutex<_>> for stateful components (like callbacks and memory stores) where mutations must be serialized. This allows safe sharing across async tasks and agent sessions.
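This is plain Tokio/std Rust rather than Synaptic-specific API, but it shows the sharing pattern; the sketch uses tokio::sync::RwLock, and the exact lock types inside Synaptic may differ:
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::RwLock;

#[tokio::main]
async fn main() {
    // Many readers, occasional writers: the shape used for registry-like state.
    let registry: Arc<RwLock<HashMap<String, String>>> = Arc::new(RwLock::new(HashMap::new()));

    let writer = Arc::clone(&registry);
    tokio::spawn(async move {
        writer.write().await.insert("calculator".into(), "adds numbers".into());
    })
    .await
    .unwrap();

    // Readers take a shared lock and do not block each other.
    println!("{} tools registered", registry.read().await.len());
}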
Session isolation
Memory stores and agent runs are keyed by session_id. Multiple conversations can run concurrently on the same model and tool set without state leaking between sessions.
Event-driven callbacks
The CallbackHandler trait receives RunEvent values at each lifecycle stage (run started, LLM called, tool called, run finished, run failed). You can compose multiple handlers with CompositeCallback for logging, tracing, metrics, and recording simultaneously.
Typed error handling
SynapticError has one variant per subsystem (Prompt, Model, Tool, Memory, Graph, etc.). This makes it straightforward to match on specific failure modes and provide targeted recovery logic.
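A minimal sketch of targeted recovery by matching on the error, assuming tuple-style variants named after the subsystems listed above; the real variant names and payloads may differ, so consult the API reference:
use synaptic::core::SynapticError;

fn is_retryable(err: &SynapticError) -> bool {
    // Hypothetical variant shapes, for illustration only.
    match err {
        SynapticError::Model(_) => true,  // provider/network failures are worth retrying
        SynapticError::Tool(_) => false,  // surface tool failures to the agent instead
        _ => false,
    }
}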
Composition over inheritance
Rather than deep trait hierarchies, Synaptic favors composition. A CachedChatModel wraps any ChatModel. A RetryChatModel wraps any ChatModel. A RunnableWithFallbacks wraps any Runnable. You stack behaviors by wrapping, not by extending base classes.
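The sketch below uses a stripped-down stand-in trait rather than Synaptic's real ChatModel, purely to show the wrapping pattern in plain Rust: the wrapper accepts any implementation and implements the same trait itself, which is exactly how the CachedChatModel and RetryChatModel wrappers are described.
use async_trait::async_trait;

// A stand-in trait for illustration; not the real ChatModel.
#[async_trait]
trait Chat {
    async fn chat(&self, prompt: String) -> String;
}

struct Echo;

#[async_trait]
impl Chat for Echo {
    async fn chat(&self, prompt: String) -> String {
        prompt
    }
}

// A wrapper adds behavior (here: logging) around any Chat implementation.
struct Logged<M: Chat + Send + Sync>(M);

#[async_trait]
impl<M: Chat + Send + Sync> Chat for Logged<M> {
    async fn chat(&self, prompt: String) -> String {
        println!("calling inner model");
        self.0.chat(prompt).await
    }
}

#[tokio::main]
async fn main() {
    let model = Logged(Echo); // stack behaviors by wrapping, not by extending a base class
    println!("{}", model.chat("hello".into()).await);
}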
Installation
Requirements
- Rust edition: 2021
- Minimum supported Rust version (MSRV): 1.88
- Runtime: Tokio (async runtime)
Adding Synaptic to Your Project
The synaptic facade crate re-exports all sub-crates. Use feature flags to control which modules are compiled.
Feature Flags
Synaptic provides fine-grained feature flags, similar to tokio:
[dependencies]
# Full -- everything enabled (equivalent to the previous default)
synaptic = { version = "0.2", features = ["full"] }
# Agent development (OpenAI + tools + graph + memory, etc.)
synaptic = { version = "0.2", features = ["agent"] }
# RAG applications (OpenAI + retrieval + loaders + splitters + embeddings + vectorstores, etc.)
synaptic = { version = "0.2", features = ["rag"] }
# Agent + RAG
synaptic = { version = "0.2", features = ["agent", "rag"] }
# Just OpenAI model calls
synaptic = { version = "0.2", features = ["openai"] }
# All model providers (see the "models" row in the feature table below)
synaptic = { version = "0.2", features = ["models"] }
# Fine-grained: one provider + specific modules
synaptic = { version = "0.2", features = ["anthropic", "graph", "cache"] }
Composite features:
| Feature | Description |
|---|---|
| default | model-utils, runnables, prompts, parsers, tools, callbacks |
| agent | default + openai, graph, memory |
| rag | default + openai, retrieval, loaders, splitters, embeddings, vectorstores |
| models | All 6 providers: openai + anthropic + gemini + ollama + bedrock + cohere |
| full | All features enabled |
Provider features (each enables one provider crate):
| Feature | Description |
|---|---|
| openai | OpenAiChatModel + OpenAiEmbeddings (synaptic-openai) |
| anthropic | AnthropicChatModel (synaptic-anthropic) |
| gemini | GeminiChatModel (synaptic-gemini) |
| ollama | OllamaChatModel + OllamaEmbeddings (synaptic-ollama) |
Module features:
Individual features: model-utils, runnables, prompts, parsers, tools, memory, callbacks, retrieval, loaders, splitters, embeddings, vectorstores, graph, cache, eval, store, middleware, mcp, macros, deep.
| Feature | Description |
|---|---|
| model-utils | ProviderBackend abstraction, ScriptedChatModel, wrappers (RetryChatModel, RateLimitedChatModel, StructuredOutputChatModel, etc.) |
| store | Key-value store with namespace hierarchy and optional semantic search |
| middleware | Agent middleware chain (tool call limits, HITL, summarization, context editing) |
| mcp | Model Context Protocol client (Stdio/SSE/HTTP transports) |
| macros | Proc macros (#[tool], #[chain], #[entrypoint], #[traceable]) |
| deep | Deep agent harness (backends, filesystem tools, sub-agents, skills) |
Integration features:
| Feature | Description |
|---|---|
| qdrant | Qdrant vector store (synaptic-qdrant) |
| pgvector | PostgreSQL pgvector store (synaptic-pgvector) |
| redis | Redis store + cache (synaptic-redis) |
| pdf | PDF document loader (synaptic-pdf) |
| bedrock | AWS Bedrock ChatModel (synaptic-bedrock) |
| cohere | Cohere Reranker (synaptic-cohere) |
| pinecone | Pinecone vector store (synaptic-pinecone) |
| chroma | Chroma vector store (synaptic-chroma) |
| mongodb | MongoDB Atlas vector search (synaptic-mongodb) |
| elasticsearch | Elasticsearch vector store (synaptic-elasticsearch) |
| sqlite | SQLite LLM cache (synaptic-sqlite) |
| tavily | Tavily search tool (synaptic-tavily) |
The core module (traits and types) is always available regardless of feature selection.
Quick Start Example
[dependencies]
synaptic = { version = "0.2", features = ["agent"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Using the Facade
The facade crate provides namespaced re-exports for all sub-crates. You access types through their module path:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings}; // requires "openai" feature
use synaptic::anthropic::AnthropicChatModel; // requires "anthropic" feature
use synaptic::models::ScriptedChatModel; // requires "model-utils" feature
use synaptic::runnables::{Runnable, BoxRunnable, RunnableLambda};
use synaptic::prompts::ChatPromptTemplate;
use synaptic::parsers::StrOutputParser;
use synaptic::tools::ToolRegistry;
use synaptic::memory::InMemoryStore;
use synaptic::graph::{StateGraph, create_react_agent};
use synaptic::retrieval::Retriever;
use synaptic::vectorstores::InMemoryVectorStore;
Alternatively, you can depend on individual crates directly if you want to minimize compile times:
[dependencies]
synaptic-core = "0.2"
synaptic-models = "0.2"
Provider API Keys
Synaptic reads API keys from environment variables. Set the ones you need for your chosen provider:
| Provider | Environment Variable |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Google Gemini | GOOGLE_API_KEY |
| Ollama | No key required (runs locally) |
For example, on a Unix shell:
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="AI..."
You do not need any API keys to run the Quickstart example, which uses the ScriptedChatModel test double.
Building and Testing
From the workspace root:
# Build all crates
cargo build --workspace
# Run all tests
cargo test --workspace
# Test a single crate
cargo test -p synaptic-models
# Run a specific test by name
cargo test -p synaptic-core -- trim_messages
# Check formatting
cargo fmt --all -- --check
# Run lints
cargo clippy --workspace
Workspace Dependencies
Synaptic uses Cargo workspace-level dependency management. Key shared dependencies include:
- async-trait -- async trait methods
- serde / serde_json -- serialization
- thiserror 2.0 -- error derive
- tokio -- async runtime (macros, rt-multi-thread, sync, time)
- reqwest -- HTTP client (json, stream features)
- futures / async-stream -- stream utilities
- tracing / tracing-subscriber -- structured logging
Quickstart
This guide walks you through a minimal Synaptic program that sends a chat request and prints the response. It uses ScriptedChatModel, a test double that returns pre-configured responses, so you do not need any API keys to run it.
The Complete Example
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError};
use synaptic::models::ScriptedChatModel;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// 1. Create a scripted model with a predefined response.
// ScriptedChatModel returns responses in order, one per chat() call.
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Hello! I'm a Synaptic assistant. How can I help you today?"),
usage: None,
},
]);
// 2. Build a chat request with a system prompt and a user message.
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant built with Synaptic."),
Message::human("Hello! What are you?"),
]);
// 3. Send the request and get a response.
let response = model.chat(request).await?;
// 4. Print the assistant's reply.
println!("Assistant: {}", response.message.content());
Ok(())
}
Running this program prints:
Assistant: Hello! I'm a Synaptic assistant. How can I help you today?
What is Happening
- ScriptedChatModel::new(vec![...]) creates a chat model that returns the given ChatResponse values in sequence. This is useful for testing and examples without requiring a live API. In production, you would replace this with OpenAiChatModel (from synaptic::openai), AnthropicChatModel (from synaptic::anthropic), or another provider adapter.
- ChatRequest::new(messages) constructs a chat request from a vector of messages. Messages are created with factory methods: Message::system() for system prompts, Message::human() for user input, and Message::ai() for assistant responses.
- model.chat(request).await? sends the request asynchronously and returns a ChatResponse containing the model's message and optional token usage information.
- response.message.content() extracts the text content from the response message.
Using a Real Provider
To use OpenAI instead of the scripted model, replace the model creation:
use synaptic::openai::OpenAiChatModel;
// Reads OPENAI_API_KEY from the environment automatically.
let model = OpenAiChatModel::new("gpt-4o");
You will also need the "openai" feature enabled in your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai"] }
The rest of the code stays the same -- ChatModel::chat() has the same signature regardless of provider.
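For instance, a helper that works with any provider can take the model as a trait object; the ask function below is ours, not part of Synaptic, and it compiles against the chat() signature shown in this guide:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, Message, SynapticError};

// Works with OpenAiChatModel, AnthropicChatModel, ScriptedChatModel, or any other adapter.
async fn ask(model: Arc<dyn ChatModel>, question: &str) -> Result<String, SynapticError> {
    let request = ChatRequest::new(vec![
        Message::system("You are a helpful assistant."),
        Message::human(question),
    ]);
    let response = model.chat(request).await?;
    Ok(response.message.content().to_string())
}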
Next Steps
- Build a Simple LLM Application -- Chain prompts with output parsers
- Build a Chatbot with Memory -- Add conversation history
- Build a ReAct Agent -- Give your model tools to call
- Build a RAG Application -- Retrieve documents for context
- Architecture Overview -- Understand the crate structure
Build a Simple LLM Application
This tutorial walks you through building a basic chat application with Synaptic. You will learn how to create a chat model, send messages, template prompts, and compose processing pipelines using the LCEL pipe operator.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = "0.2"
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Create a Chat Model
Every LLM interaction in Synaptic goes through a type that implements the ChatModel trait. For production use you would reach for OpenAiChatModel (from synaptic::openai), AnthropicChatModel (from synaptic::anthropic), or one of the other provider adapters. For this tutorial we use ScriptedChatModel, which returns pre-configured responses -- perfect for offline development and testing.
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Paris is the capital of France."),
usage: None,
},
]);
ScriptedChatModel pops responses from a queue in order. Each call to chat() returns the next response. This makes tests deterministic and lets you compile and run examples without an API key.
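Because the responses come back in order, a unit test can script a whole exchange up front. A sketch, assuming two sequential chat() calls as in the API shown above:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;

#[tokio::test]
async fn scripted_model_returns_responses_in_order() {
    let model = ScriptedChatModel::new(vec![
        ChatResponse { message: Message::ai("first reply"), usage: None },
        ChatResponse { message: Message::ai("second reply"), usage: None },
    ]);

    let first = model
        .chat(ChatRequest::new(vec![Message::human("one")]))
        .await
        .unwrap();
    let second = model
        .chat(ChatRequest::new(vec![Message::human("two")]))
        .await
        .unwrap();

    assert_eq!(first.message.content(), "first reply");
    assert_eq!(second.message.content(), "second reply");
}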
Step 2: Build a Request and Get a Response
A ChatRequest holds the conversation messages (and optionally tool definitions). Build one with ChatRequest::new() and pass a vector of messages:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
#[tokio::main]
async fn main() {
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Paris is the capital of France."),
usage: None,
},
]);
let request = ChatRequest::new(vec![
Message::system("You are a geography expert."),
Message::human("What is the capital of France?"),
]);
let response = model.chat(request).await.unwrap();
println!("{}", response.message.content());
// Output: Paris is the capital of France.
}
Key points:
- Message::system(), Message::human(), and Message::ai() are factory methods for building typed messages.
- ChatRequest::new(messages) is the constructor. Never build the struct literal directly.
- model.chat(request) is async and returns Result<ChatResponse, SynapticError>.
Step 3: Template Messages with ChatPromptTemplate
Hard-coding message strings works for one-off calls, but real applications need parameterized prompts. ChatPromptTemplate lets you define message templates with {{ variable }} placeholders that are filled in at runtime.
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a helpful assistant that speaks {{ language }}."),
MessageTemplate::human("{{ question }}"),
]);
To render the template, call format() with a map of variable values:
use std::collections::HashMap;
use serde_json::Value;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a helpful assistant that speaks {{ language }}."),
MessageTemplate::human("{{ question }}"),
]);
let mut values = HashMap::new();
values.insert("language".to_string(), Value::String("French".to_string()));
values.insert("question".to_string(), Value::String("What is the capital of France?".to_string()));
let messages = template.format(&values).unwrap();
// messages[0] => System("You are a helpful assistant that speaks French.")
// messages[1] => Human("What is the capital of France?")
ChatPromptTemplate also implements the Runnable trait, which means it can participate in LCEL pipelines. When used as a Runnable, it takes a HashMap<String, Value> as input and produces Vec<Message> as output.
Step 4: Compose a Pipeline with the Pipe Operator
Synaptic implements LangChain Expression Language (LCEL) composition through the | pipe operator. You can chain any two runnables together as long as the output type of the first matches the input type of the second.
Here is a complete example that templates a prompt and extracts the response text:
use std::collections::HashMap;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
#[tokio::main]
async fn main() {
// 1. Define the model
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The capital of France is Paris."),
usage: None,
},
]);
// 2. Define the prompt template
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a geography expert."),
MessageTemplate::human("{{ question }}"),
]);
// 3. Build the chain: template -> model -> parser
// Each step is boxed to erase types, then piped with |
let chain = template.boxed() | model.boxed() | StrOutputParser.boxed();
// 4. Invoke the chain
let mut input = HashMap::new();
input.insert(
"question".to_string(),
serde_json::Value::String("What is the capital of France?".to_string()),
);
let config = RunnableConfig::default();
let result: String = chain.invoke(input, &config).await.unwrap();
println!("{}", result);
// Output: The capital of France is Paris.
}
Here is what happens at each stage of the pipeline:
- ChatPromptTemplate receives HashMap<String, Value>, renders the templates, and outputs Vec<Message>.
- ScriptedChatModel receives Vec<Message> (via its Runnable implementation, which wraps them in a ChatRequest), calls the model, and outputs a Message.
- StrOutputParser receives a Message and extracts its text content as a String.
The boxed() method wraps each component into a BoxRunnable, which is a type-erased wrapper that enables the | operator. Without boxing, Rust cannot unify the different concrete types.
Summary
In this tutorial you learned how to:
- Create a ScriptedChatModel for offline development
- Build ChatRequest objects from typed messages
- Use ChatPromptTemplate with {{ variable }} interpolation
- Compose processing pipelines with the LCEL | pipe operator
Next Steps
- Build a Chatbot with Memory -- add conversation history
- Build a ReAct Agent -- give the LLM tools to call
- Runnables & LCEL -- deeper look at composition patterns
Build a Chatbot with Memory
This tutorial walks you through building a session-based chatbot that remembers conversation history. You will learn how to store and retrieve messages with InMemoryStore, isolate conversations by session ID, and choose the right memory strategy for your use case.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["memory"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Store and Load Messages
Every chatbot needs to remember what was said. Synaptic provides the MemoryStore trait for this purpose, and InMemoryStore as a simple in-process implementation backed by a HashMap.
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::InMemoryStore;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let memory = InMemoryStore::new();
let session_id = "demo-session";
// Simulate a conversation
memory.append(session_id, Message::human("Hello, Synaptic")).await?;
memory.append(session_id, Message::ai("Hello! How can I help you?")).await?;
memory.append(session_id, Message::human("What can you do?")).await?;
memory.append(session_id, Message::ai("I can help with many tasks!")).await?;
// Load the conversation history
let transcript = memory.load(session_id).await?;
for message in &transcript {
println!("{}: {}", message.role(), message.content());
}
// Clear memory when done
memory.clear(session_id).await?;
Ok(())
}
The output will be:
human: Hello, Synaptic
ai: Hello! How can I help you?
human: What can you do?
ai: I can help with many tasks!
The MemoryStore trait defines three methods:
- append(session_id, message) -- adds a message to a session's history.
- load(session_id) -- returns all messages for a session as a Vec<Message>.
- clear(session_id) -- removes all messages for a session.
Step 2: Session Isolation
Each session ID maps to an independent conversation history. This is how you keep multiple users or threads separate:
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::InMemoryStore;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let memory = InMemoryStore::new();
// Alice's conversation
memory.append("alice", Message::human("Hi, I'm Alice")).await?;
memory.append("alice", Message::ai("Hello, Alice!")).await?;
// Bob's conversation (completely independent)
memory.append("bob", Message::human("Hi, I'm Bob")).await?;
memory.append("bob", Message::ai("Hello, Bob!")).await?;
// Each session has its own history
let alice_history = memory.load("alice").await?;
let bob_history = memory.load("bob").await?;
assert_eq!(alice_history.len(), 2);
assert_eq!(bob_history.len(), 2);
assert_eq!(alice_history[0].content(), "Hi, I'm Alice");
assert_eq!(bob_history[0].content(), "Hi, I'm Bob");
Ok(())
}
Session IDs are arbitrary strings. In a web application you would typically use a user ID, a conversation thread ID, or a combination of both.
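Continuing the example above, a composite session key is just string formatting; the key scheme here is hypothetical and entirely up to your application:
// Hypothetical key scheme: one conversation thread per user.
let user_id = "user-42";
let thread_id = "thread-7";
let session_id = format!("{user_id}:{thread_id}");

memory.append(session_id.as_str(), Message::human("Hi")).await?;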
Step 3: Choose a Memory Strategy
As conversations grow long, sending every message to the LLM becomes expensive and eventually exceeds the context window. Synaptic provides several memory strategies that wrap an underlying MemoryStore and control what gets returned by load().
ConversationBufferMemory
Keeps all messages. This is the simplest strategy -- a passthrough wrapper that makes the "keep everything" policy explicit:
use std::sync::Arc;
use synaptic::core::MemoryStore;
use synaptic::memory::{InMemoryStore, ConversationBufferMemory};
let store = Arc::new(InMemoryStore::new());
let memory = ConversationBufferMemory::new(store);
// memory.load() returns all messages
Best for: short conversations where you want the full history available.
ConversationWindowMemory
Keeps only the last K messages. Older messages are still stored but are not returned by load():
use std::sync::Arc;
use synaptic::core::MemoryStore;
use synaptic::memory::{InMemoryStore, ConversationWindowMemory};
let store = Arc::new(InMemoryStore::new());
let memory = ConversationWindowMemory::new(store, 10); // keep last 10 messages
// memory.load() returns at most 10 messages
Best for: conversations where recent context is sufficient and you want predictable costs.
ConversationSummaryMemory
Uses an LLM to summarize older messages. When the stored message count exceeds buffer_size * 2, the older portion is compressed into a summary that is prepended as a system message:
use std::sync::Arc;
use synaptic::core::{ChatModel, MemoryStore};
use synaptic::memory::{InMemoryStore, ConversationSummaryMemory};
let store = Arc::new(InMemoryStore::new());
let model: Arc<dyn ChatModel> = /* your chat model */;
let memory = ConversationSummaryMemory::new(store, model, 6);
// When messages exceed 12, older ones are summarized
// memory.load() returns: [summary system message] + [recent 6 messages]
Best for: long-running conversations where you need to retain the gist of older context without the full verbatim history.
ConversationTokenBufferMemory
Keeps messages within a token budget. Uses a configurable token estimator to drop the oldest messages once the total exceeds the limit:
use std::sync::Arc;
use synaptic::core::MemoryStore;
use synaptic::memory::{InMemoryStore, ConversationTokenBufferMemory};
let store = Arc::new(InMemoryStore::new());
let memory = ConversationTokenBufferMemory::new(store, 4000); // 4000 token budget
// memory.load() returns as many recent messages as fit within 4000 tokens
Best for: staying within a model's context window by directly managing token count.
ConversationSummaryBufferMemory
A hybrid of summary and buffer strategies. Keeps the most recent messages verbatim, and summarizes everything older when the token count exceeds a threshold:
use std::sync::Arc;
use synaptic::core::{ChatModel, MemoryStore};
use synaptic::memory::{InMemoryStore, ConversationSummaryBufferMemory};
let store = Arc::new(InMemoryStore::new());
let model: Arc<dyn ChatModel> = /* your chat model */;
let memory = ConversationSummaryBufferMemory::new(store, model, 2000);
// Keeps recent messages verbatim; summarizes when total tokens exceed 2000
Best for: balancing cost with context quality -- you get the detail of recent messages and the compressed gist of older ones.
Step 4: Auto-Manage History with RunnableWithMessageHistory
In a real chatbot, you want the history load/save to happen automatically on each turn. RunnableWithMessageHistory wraps any Runnable<Vec<Message>, String> and handles this for you:
- Extracts the session_id from RunnableConfig.metadata["session_id"]
- Loads conversation history from memory
- Appends the user's new message
- Calls the inner runnable with the full message list
- Saves the AI response back to memory
use std::sync::Arc;
use std::collections::HashMap;
use synaptic::core::{MemoryStore, RunnableConfig};
use synaptic::memory::{InMemoryStore, RunnableWithMessageHistory};
use synaptic::runnables::Runnable;
// Wrap a model chain with automatic history management
let memory = Arc::new(InMemoryStore::new());
let chain = /* your model chain (BoxRunnable<Vec<Message>, String>) */;
let chatbot = RunnableWithMessageHistory::new(chain, memory);
// Each call automatically loads/saves history
let mut config = RunnableConfig::default();
config.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("user-42".to_string()),
);
let response = chatbot.invoke("What is Rust?".to_string(), &config).await?;
// The user message and AI response are now stored in memory for session "user-42"
This is the recommended approach for production chatbots because it keeps the memory management out of your application logic.
How It All Fits Together
Here is the mental model for Synaptic memory:
+-----------------------+
| MemoryStore trait |
| append / load / clear |
+-----------+-----------+
|
+----------------------+----------------------+
| | |
InMemoryStore (other stores) Memory Strategies
(raw storage) (wrap a MemoryStore)
|
+----------------------+----------------------+
| | | | |
Buffer Window Summary TokenBuffer SummaryBuffer
(all) (last K) (LLM) (tokens) (hybrid)
All memory strategies implement MemoryStore themselves, so they are composable -- you could wrap an InMemoryStore in a ConversationWindowMemory, and everything downstream only sees the MemoryStore trait.
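A sketch of what that composability buys you: downstream code can accept a MemoryStore trait object while you swap strategies at the call site. This assumes, as stated above, that the strategies implement MemoryStore; check the API reference for exact bounds.
use std::sync::Arc;
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::{ConversationWindowMemory, InMemoryStore};

// Downstream code depends only on the trait, not on a concrete strategy.
async fn record_turn(memory: &dyn MemoryStore, session_id: &str) -> Result<(), SynapticError> {
    memory.append(session_id, Message::human("ping")).await?;
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), SynapticError> {
    let store = Arc::new(InMemoryStore::new());
    let windowed = ConversationWindowMemory::new(store, 10); // wrap the raw store
    record_turn(&windowed, "user-42").await?;
    Ok(())
}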
Summary
In this tutorial you learned how to:
- Use InMemoryStore to store and retrieve conversation messages
- Isolate conversations with session IDs
- Choose a memory strategy based on your conversation length and cost requirements
- Automate history management with RunnableWithMessageHistory
Next Steps
- Build a RAG Application -- add document retrieval to your chatbot
- Memory How-to Guides -- detailed guides for each memory strategy
- Memory Concepts -- deeper understanding of memory architecture
Build a RAG Application
This tutorial walks you through building a Retrieval-Augmented Generation (RAG) pipeline with Synaptic. RAG is a pattern where you retrieve relevant documents from a knowledge base and include them as context in a prompt, so the LLM can answer questions grounded in your data rather than relying solely on its training.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["rag"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
How RAG Works
A RAG pipeline has two phases:
Indexing (offline) Querying (online)
================== ==================
+-----------+ +-----------+
| Documents | | Query |
+-----+-----+ +-----+-----+
| |
v v
+-----+------+ +-----+------+
| Split | | Retrieve | <--- Vector Store
+-----+------+ +-----+------+
| |
v v
+-----+------+ +-----+------+
| Embed | | Augment | (inject context into prompt)
+-----+------+ +-----+------+
| |
v v
+-----+------+ +-----+------+
| Store | ---> Vector Store | Generate | (LLM produces answer)
+------------+ +------------+
- Indexing -- Load documents, split them into chunks, embed each chunk, and store the vectors.
- Querying -- Embed the user's question, find the most similar chunks, include them in a prompt, and ask the LLM.
Step 1: Load Documents
Synaptic provides several document loaders. TextLoader wraps an in-memory string into a Document. For files on disk, use FileLoader.
use synaptic::loaders::{Loader, TextLoader};
let loader = TextLoader::new(
"rust-intro",
"Rust is a systems programming language focused on safety, speed, and concurrency. \
It achieves memory safety without a garbage collector through its ownership system. \
Rust's type system and borrow checker ensure that references are always valid. \
The language has grown rapidly since its 1.0 release in 2015 and is widely used \
for systems programming, web backends, embedded devices, and command-line tools.",
);
let docs = loader.load().await?;
// docs[0].id == "rust-intro"
// docs[0].content == the full text above
Each Document has three fields:
- id -- a unique identifier (a string you provide).
- content -- the text content.
- metadata -- a HashMap<String, serde_json::Value> for arbitrary key-value pairs.
For loading files from disk, use FileLoader:
use synaptic::loaders::{Loader, FileLoader};
let loader = FileLoader::new("data/rust-book.txt");
let docs = loader.load().await?;
// docs[0].id == "data/rust-book.txt"
// docs[0].metadata["source"] == "data/rust-book.txt"
Other loaders include JsonLoader, CsvLoader, and DirectoryLoader (for loading many files at once with glob filtering).
Step 2: Split Documents into Chunks
Large documents need to be split into smaller chunks so that retrieval can return focused, relevant passages instead of entire files. RecursiveCharacterTextSplitter tries a hierarchy of separators ("\n\n", "\n", " ", "") and keeps chunks within a size limit.
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
let splitter = RecursiveCharacterTextSplitter::new(100)
.with_chunk_overlap(20);
let chunks = splitter.split_documents(docs);
for chunk in &chunks {
println!("[{}] {} chars: {}...", chunk.id, chunk.content.len(), &chunk.content[..40]);
}
The splitter produces new Document values with IDs like rust-intro-chunk-0, rust-intro-chunk-1, etc. Each chunk inherits the parent document's metadata and gains a chunk_index metadata field.
Key parameters:
- chunk_size -- the maximum character length of each chunk (passed to new()).
- chunk_overlap -- how many characters from the end of one chunk overlap with the start of the next (set with .with_chunk_overlap()). Overlap helps preserve context across chunk boundaries.
Other splitters are available for specialized content: CharacterTextSplitter, MarkdownHeaderTextSplitter, HtmlHeaderTextSplitter, and TokenTextSplitter.
Step 3: Embed and Store
Embeddings convert text into numerical vectors so that similarity can be computed mathematically. FakeEmbeddings provides deterministic, hash-based vectors for testing -- no API key required.
use std::sync::Arc;
use synaptic::embeddings::FakeEmbeddings;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
let embeddings = Arc::new(FakeEmbeddings::new(128));
// Create a vector store and add the chunks
let store = InMemoryVectorStore::new();
let ids = store.add_documents(chunks, embeddings.as_ref()).await?;
println!("Indexed {} chunks", ids.len());
InMemoryVectorStore stores document vectors in memory and uses cosine similarity for search. For convenience, you can also create a pre-populated store in one step:
let store = InMemoryVectorStore::from_documents(chunks, embeddings.as_ref()).await?;
For production use, replace FakeEmbeddings with OpenAiEmbeddings (from synaptic::openai) or OllamaEmbeddings (from synaptic::ollama), which call real embedding APIs.
Step 4: Retrieve Relevant Documents
Now you can search the vector store for chunks that are similar to a query:
use synaptic::vectorstores::VectorStore;
let results = store.similarity_search("What is Rust?", 3, embeddings.as_ref()).await?;
for doc in &results {
println!("Found: {}", doc.content);
}
The second argument (3) is k -- the number of results to return.
Using a Retriever
For a cleaner API that decouples retrieval logic from the store implementation, wrap the store in a VectorStoreRetriever:
use synaptic::retrieval::Retriever;
use synaptic::vectorstores::VectorStoreRetriever;
let retriever = VectorStoreRetriever::new(
Arc::new(store),
embeddings.clone(),
3, // default k
);
let results = retriever.retrieve("What is Rust?", 3).await?;
The Retriever trait has a single method -- retrieve(query, top_k) -- and is implemented by many retrieval strategies in Synaptic:
- VectorStoreRetriever -- wraps any VectorStore for similarity search.
- BM25Retriever -- keyword-based scoring (no embeddings needed).
- MultiQueryRetriever -- generates multiple query variants with an LLM to improve recall.
- EnsembleRetriever -- combines multiple retrievers with Reciprocal Rank Fusion.
Step 5: Generate an Answer
The final step combines retrieved context with the user's question in a prompt. Here is the complete pipeline:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError};
use synaptic::models::ScriptedChatModel;
use synaptic::loaders::{Loader, TextLoader};
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore, VectorStoreRetriever};
use synaptic::retrieval::Retriever;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// 1. Load
let loader = TextLoader::new(
"rust-guide",
"Rust is a systems programming language focused on safety, speed, and concurrency. \
It achieves memory safety without a garbage collector through its ownership system. \
Rust was first released in 2015 and has grown into one of the most loved languages \
according to developer surveys.",
);
let docs = loader.load().await?;
// 2. Split
let splitter = RecursiveCharacterTextSplitter::new(100).with_chunk_overlap(20);
let chunks = splitter.split_documents(docs);
// 3. Embed and store
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = InMemoryVectorStore::from_documents(chunks, embeddings.as_ref()).await?;
// 4. Retrieve
let retriever = VectorStoreRetriever::new(Arc::new(store), embeddings.clone(), 2);
let question = "When was Rust first released?";
let relevant = retriever.retrieve(question, 2).await?;
// 5. Build the augmented prompt
let context = relevant
.iter()
.map(|doc| doc.content.as_str())
.collect::<Vec<_>>()
.join("\n\n");
let prompt = format!(
"Answer the question based only on the following context:\n\n\
{context}\n\n\
Question: {question}"
);
// 6. Generate (using ScriptedChatModel for offline testing)
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Rust was first released in 2015."),
usage: None,
},
]);
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant. Answer questions using only the provided context."),
Message::human(prompt),
]);
let response = model.chat(request).await?;
println!("Answer: {}", response.message.content());
// Output: Answer: Rust was first released in 2015.
Ok(())
}
In production, you would replace ScriptedChatModel with a real provider like OpenAiChatModel (from synaptic::openai) or AnthropicChatModel (from synaptic::anthropic).
Building RAG with LCEL Chains
For a more composable approach, you can integrate the retrieval step into an LCEL pipeline using RunnableParallel, RunnableLambda, and the pipe operator. This lets you express the RAG pattern as a single chain:
+---> retriever ---> format context ---+
| |
input (query) ---+ +---> prompt ---> model ---> parser
| |
+---> passthrough (question) ----------+
Each step is a Runnable, and they compose with |. See the Runnables how-to guides for details on RunnableParallel and RunnableLambda.
Summary
In this tutorial you learned how to:
- Load documents with TextLoader and FileLoader
- Split documents into retrieval-friendly chunks with RecursiveCharacterTextSplitter
- Embed and store chunks in an InMemoryVectorStore
- Retrieve relevant documents with VectorStoreRetriever
- Combine retrieved context with a prompt to generate grounded answers
Next Steps
- Build a Graph Workflow -- orchestrate multi-step agent logic with a state graph
- Retrieval How-to Guides -- BM25, multi-query, ensemble, and compression retrievers
- Retrieval Concepts -- deeper look at embedding and retrieval strategies
Build a ReAct Agent
This tutorial walks you through building a ReAct (Reasoning + Acting) agent that can decide when to call tools and when to respond to the user. You will define a custom tool, wire it into a prebuilt agent graph, and watch the agent loop through reasoning and tool execution.
What is a ReAct Agent?
A ReAct agent follows a loop:
- Reason -- The LLM looks at the conversation so far and decides what to do next.
- Act -- If the LLM determines it needs information, it emits one or more tool calls.
- Observe -- The tool results are added to the conversation as Tool messages.
- Repeat -- The LLM reviews the tool output and either calls more tools or produces a final answer.
Synaptic provides create_react_agent(model, tools), which builds a compiled StateGraph that implements this loop automatically.
Prerequisites
Add the required crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["agent", "macros"] }
async-trait = "0.1"
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Define a Custom Tool
The easiest way to define a tool in Synaptic is with the #[tool] macro. Write an async function, add a doc comment (this becomes the description the LLM sees), and the macro generates the struct, Tool trait implementation, and a factory function automatically.
use serde_json::json;
use synaptic::core::SynapticError;
use synaptic::macros::tool;
/// Adds two numbers.
#[tool]
async fn add(
/// The first number
a: i64,
/// The second number
b: i64,
) -> Result<serde_json::Value, SynapticError> {
Ok(json!({ "value": a + b }))
}
The function parameters are automatically mapped to a JSON Schema that tells the LLM what arguments to provide. Parameter doc comments become "description" fields in the schema. In production, you can use Option<T> for optional parameters and #[default = value] for defaults. See Procedural Macros for the full reference.
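Building on that, here is a hypothetical tool that uses an optional parameter and a default value. The search_catalog name and its body are invented for illustration, and the attribute syntax follows the description above, so treat the exact form as an assumption and check the macro reference:
use serde_json::json;
use synaptic::core::SynapticError;
use synaptic::macros::tool;

/// Searches a product catalog.
#[tool]
async fn search_catalog(
    /// Free-text search query
    query: String,
    /// Optional category filter
    category: Option<String>,
    /// Maximum number of results
    #[default = 10]
    limit: i64,
) -> Result<serde_json::Value, SynapticError> {
    // A real tool would query a database here; this sketch just echoes its arguments.
    Ok(json!({ "query": query, "category": category, "limit": limit }))
}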
Step 2: Create a Chat Model
For this tutorial we build a simple demo model that simulates the ReAct loop. On the first call (when there is no tool output in the conversation yet), it returns a tool call. On the second call (after tool output has been added), it returns a final text answer.
use async_trait::async_trait;
use serde_json::json;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError, ToolCall};
struct DemoModel;
#[async_trait]
impl ChatModel for DemoModel {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, SynapticError> {
let has_tool_output = request.messages.iter().any(|m| m.is_tool());
if !has_tool_output {
// First turn: ask to call the "add" tool
Ok(ChatResponse {
message: Message::ai_with_tool_calls(
"I will use a tool to calculate this.",
vec![ToolCall {
id: "call-1".to_string(),
name: "add".to_string(),
arguments: json!({ "a": 7, "b": 5 }),
}],
),
usage: None,
})
} else {
// Second turn: the tool result is in, produce the final answer
Ok(ChatResponse {
message: Message::ai("The result is 12."),
usage: None,
})
}
}
}
In a real application you would use one of the provider adapters (OpenAiChatModel from synaptic::openai, AnthropicChatModel from synaptic::anthropic, etc.) instead of a scripted model.
Step 3: Build the Agent Graph
create_react_agent takes a model and a vector of tools, and returns a CompiledGraph<MessageState>. Under the hood, it creates two nodes:
- "agent" -- calls the
ChatModelwith the current messages and tool definitions. - "tools" -- executes any tool calls from the agent's response using a
ToolNode.
A conditional edge routes from "agent" to "tools" if the response contains tool calls, or to END if it does not. An unconditional edge routes from "tools" back to "agent" so the model can review the results.
use std::sync::Arc;
use synaptic::core::Tool;
use synaptic::graph::create_react_agent;
let model = Arc::new(DemoModel);
let tools: Vec<Arc<dyn Tool>> = vec![add()];
let graph = create_react_agent(model, tools).unwrap();
The add() factory function (generated by #[tool]) returns Arc<dyn Tool>, so it can be used directly in the tools vector. The model is wrapped in Arc because the graph needs shared ownership -- nodes may be invoked concurrently in more complex workflows.
Step 4: Run the Agent
Create an initial MessageState with the user's question and invoke the graph:
use synaptic::core::Message;
use synaptic::graph::MessageState;
let initial_state = MessageState {
messages: vec![Message::human("What is 7 + 5?")],
};
let result = graph.invoke(initial_state).await.unwrap();
let last = result.last_message().unwrap();
println!("agent answer: {}", last.content());
// Output: agent answer: The result is 12.
MessageState is the built-in state type for conversational agents. It holds a Vec<Message> that grows as the agent loop progresses. After invocation, last_message() returns the final message in the conversation -- typically the agent's answer.
Full Working Example
Here is the complete program that ties all the pieces together:
use std::sync::Arc;
use async_trait::async_trait;
use serde_json::json;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message, SynapticError, Tool, ToolCall};
use synaptic::graph::{create_react_agent, MessageState};
use synaptic::macros::tool;
// --- Model ---
struct DemoModel;
#[async_trait]
impl ChatModel for DemoModel {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, SynapticError> {
let has_tool_output = request.messages.iter().any(|m| m.is_tool());
if !has_tool_output {
Ok(ChatResponse {
message: Message::ai_with_tool_calls(
"I will use a tool to calculate this.",
vec![ToolCall {
id: "call-1".to_string(),
name: "add".to_string(),
arguments: json!({ "a": 7, "b": 5 }),
}],
),
usage: None,
})
} else {
Ok(ChatResponse {
message: Message::ai("The result is 12."),
usage: None,
})
}
}
}
// --- Tool ---
/// Adds two numbers.
#[tool]
async fn add(
/// The first number
a: i64,
/// The second number
b: i64,
) -> Result<serde_json::Value, SynapticError> {
Ok(json!({ "value": a + b }))
}
// --- Main ---
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let model = Arc::new(DemoModel);
let tools: Vec<Arc<dyn Tool>> = vec![add()];
let graph = create_react_agent(model, tools)?;
let initial_state = MessageState {
messages: vec![Message::human("What is 7 + 5?")],
};
let result = graph.invoke(initial_state).await?;
let last = result.last_message().unwrap();
println!("agent answer: {}", last.content());
Ok(())
}
How the Loop Executes
Here is the sequence of events when you run this example:
| Step | Node | What happens |
|---|---|---|
| 1 | agent | Receives [Human("What is 7 + 5?")]. Returns an AI message with a ToolCall for add(a=7, b=5). |
| 2 | routing | The conditional edge sees tool calls in the last message and routes to tools. |
| 3 | tools | ToolNode looks up "add" in the registry, calls the add tool's call method, and appends a Tool message with {"value": 12}. |
| 4 | edge | The unconditional edge routes from tools back to agent. |
| 5 | agent | Receives the full conversation including the tool result. Returns AI("The result is 12.") with no tool calls. |
| 6 | routing | No tool calls in the last message, so the conditional edge routes to END. |
The graph terminates and returns the final MessageState.
Next Steps
- Build a Graph Workflow -- build custom state graphs with conditional edges
- Tool Choice -- control which tools the model can call
- Human-in-the-Loop -- add interrupt points for human review
- Checkpointing -- persist agent state across invocations
Build a Graph Workflow
This tutorial walks you through building a custom multi-step workflow using Synaptic's LangGraph-style state graph. You will learn how to define nodes, wire them with edges, stream execution events, add conditional routing, and visualize the graph.
Prerequisites
Add the required Synaptic crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["graph"] }
async-trait = "0.1"
futures = "0.3"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
How State Graphs Work
A Synaptic state graph is a directed graph where:
- Nodes are processing steps. Each node takes the current state, transforms it, and returns the new state.
- Edges connect nodes. Fixed edges always route to the same target; conditional edges choose the target at runtime based on the state.
- State is a value that flows through the graph. It carries all the data nodes need to read and write.
The lifecycle is:
START ---> node_a ---> node_b ---> node_c ---> END
| | |
v v v
state_0 --> state_1 --> state_2 --> state_3
Each node receives the state, processes it, and passes the updated state to the next node. The graph terminates when execution reaches the END sentinel.
Step 1: Define the State
The simplest built-in state is MessageState, which holds a Vec<Message>. It is suitable for most agent and chatbot workflows:
use synaptic::graph::MessageState;
use synaptic::core::Message;
let state = MessageState::with_messages(vec![
Message::human("Hi"),
]);
MessageState implements the State trait, which requires a merge() method. When states are merged (e.g., during checkpointing or human-in-the-loop updates), MessageState appends the new messages to the existing list.
For custom workflows, you can implement State on your own types. The trait requires Clone + Send + Sync + 'static and a merge method:
use serde::{Serialize, Deserialize};
use synaptic::graph::State;
#[derive(Debug, Clone, Serialize, Deserialize)]
struct MyState {
counter: u32,
results: Vec<String>,
}
impl State for MyState {
fn merge(&mut self, other: Self) {
self.counter += other.counter;
self.results.extend(other.results);
}
}
Step 2: Define Nodes
A node is any type that implements the Node<S> trait. The trait has a single async method, process, which takes the state and returns the updated state:
use async_trait::async_trait;
use synaptic::core::{Message, SynapticError};
use synaptic::graph::{MessageState, Node};
struct GreetNode;
#[async_trait]
impl Node<MessageState> for GreetNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Hello! Let me help you."));
Ok(state)
}
}
struct ProcessNode;
#[async_trait]
impl Node<MessageState> for ProcessNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Processing your request..."));
Ok(state)
}
}
struct FinalizeNode;
#[async_trait]
impl Node<MessageState> for FinalizeNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Done! Here's the result."));
Ok(state)
}
}
For simpler cases, you can use FnNode to wrap an async closure without defining a separate struct:
use synaptic::graph::FnNode;
let greet = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello!"));
Ok(state)
});
Step 3: Build and Compile the Graph
Use StateGraph to wire nodes and edges into a workflow, then call compile() to produce an executable CompiledGraph:
use synaptic::graph::{StateGraph, END};
let graph = StateGraph::new()
.add_node("greet", GreetNode)
.add_node("process", ProcessNode)
.add_node("finalize", FinalizeNode)
.set_entry_point("greet")
.add_edge("greet", "process")
.add_edge("process", "finalize")
.add_edge("finalize", END)
.compile()?;
The builder methods are chainable:
- add_node(name, node) -- registers a named node.
- set_entry_point(name) -- designates the first node to execute.
- add_edge(source, target) -- adds a fixed edge between two nodes (use END as the target to terminate).
- compile() -- validates the graph and returns a CompiledGraph. It returns an error if the entry point is missing or if any edge references a non-existent node.
Step 4: Invoke the Graph
Call invoke() with an initial state. The graph executes each node in sequence according to the edges, and returns the final state:
use synaptic::core::Message;
use synaptic::graph::MessageState;
let state = MessageState::with_messages(vec![Message::human("Hi")]);
let result = graph.invoke(state).await?;
for msg in &result.messages {
println!("{}: {}", msg.role(), msg.content());
}
Output:
human: Hi
ai: Hello! Let me help you.
ai: Processing your request...
ai: Done! Here's the result.
Step 5: Stream Execution
For real-time feedback, use stream() to receive a GraphEvent after each node completes. Each event contains the node name and the current state snapshot:
use futures::StreamExt;
use synaptic::graph::StreamMode;
let state = MessageState::with_messages(vec![Message::human("Hi")]);
let mut stream = graph.stream(state, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node '{}' completed, {} messages in state",
event.node, event.state.messages.len());
}
Output:
Node 'greet' completed, 2 messages in state
Node 'process' completed, 3 messages in state
Node 'finalize' completed, 4 messages in state
StreamMode controls what each event contains:
- StreamMode::Values -- the event's state is the full accumulated state after the node ran.
- StreamMode::Updates -- the event's state holds what that node produced, useful for observing per-node changes.
Step 6: Add Conditional Edges
Real workflows often need branching logic. Use add_conditional_edges with a routing function that inspects the state and returns the name of the next node:
use std::collections::HashMap;
use synaptic::graph::{StateGraph, END};
let graph = StateGraph::new()
.add_node("greet", GreetNode)
.add_node("process", ProcessNode)
.add_node("finalize", FinalizeNode)
.set_entry_point("greet")
.add_edge("greet", "process")
.add_conditional_edges_with_path_map(
"process",
|state: &MessageState| {
if state.messages.len() > 3 {
"finalize".to_string()
} else {
"process".to_string()
}
},
HashMap::from([
("finalize".to_string(), "finalize".to_string()),
("process".to_string(), "process".to_string()),
]),
)
.add_edge("finalize", END)
.compile()?;
In this example, the process node loops back to itself until the state has more than 3 messages, at which point it routes to finalize.
There are two variants:
- add_conditional_edges(source, router_fn) -- the routing function returns a node name directly. Simple, but visualization tools cannot display the possible targets.
- add_conditional_edges_with_path_map(source, router_fn, path_map) -- also provides a HashMap<String, String> that maps labels to target node names. This enables visualization tools to show all possible routing targets.
The routing function must be Fn(&S) -> String + Send + Sync + 'static. It receives a reference to the current state and returns the name of the target node (or END to terminate).
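For example, a standalone router that loops on process until an AI message appears and then terminates might look like this (a sketch; it assumes END converts to a String via to_string()):
use synaptic::graph::{MessageState, END};
// Route back to "process" until the state contains an AI message, then stop.
fn route_or_end(state: &MessageState) -> String {
    if state.messages.iter().any(|m| m.is_ai()) {
        END.to_string()
    } else {
        "process".to_string()
    }
}
You can pass a function like this (or a closure) as the routing argument to add_conditional_edges.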
Step 7: Visualize the Graph
CompiledGraph provides several methods for visualizing the graph structure. These are useful for debugging and documentation.
Mermaid Diagram
println!("{}", graph.draw_mermaid());
Produces a Mermaid flowchart that can be rendered by GitHub, GitLab, or any Mermaid-compatible viewer:
graph TD
__start__(["__start__"])
greet["greet"]
process["process"]
finalize["finalize"]
__end__(["__end__"])
__start__ --> greet
greet --> process
finalize --> __end__
process -.-> |finalize| finalize
process -.-> |process| process
Fixed edges appear as solid arrows (-->), conditional edges as dashed arrows (-.->) with labels.
ASCII Summary
println!("{}", graph.draw_ascii());
Produces a compact text summary:
Graph:
Nodes: finalize, greet, process
Entry: __start__ -> greet
Edges:
finalize -> __end__
greet -> process
process -> finalize | process [conditional]
Other Formats
- draw_dot() -- produces a Graphviz DOT string, suitable for rendering with the dot command.
- draw_png(path) -- renders the graph as a PNG image using Graphviz (requires dot to be installed).
- draw_mermaid_png(path) -- renders via the mermaid.ink API (requires internet access).
- draw_mermaid_svg(path) -- renders as SVG via the mermaid.ink API.
Complete Example
Here is the full program combining all the concepts:
use std::collections::HashMap;
use async_trait::async_trait;
use futures::StreamExt;
use synaptic::core::{Message, SynapticError};
use synaptic::graph::{MessageState, Node, StateGraph, StreamMode, END};
struct GreetNode;
#[async_trait]
impl Node<MessageState> for GreetNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Hello! Let me help you."));
Ok(state)
}
}
struct ProcessNode;
#[async_trait]
impl Node<MessageState> for ProcessNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Processing your request..."));
Ok(state)
}
}
struct FinalizeNode;
#[async_trait]
impl Node<MessageState> for FinalizeNode {
async fn process(&self, mut state: MessageState) -> Result<MessageState, SynapticError> {
state.messages.push(Message::ai("Done! Here's the result."));
Ok(state)
}
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// Build the graph with a conditional loop
let graph = StateGraph::new()
.add_node("greet", GreetNode)
.add_node("process", ProcessNode)
.add_node("finalize", FinalizeNode)
.set_entry_point("greet")
.add_edge("greet", "process")
.add_conditional_edges_with_path_map(
"process",
|state: &MessageState| {
if state.messages.len() > 3 {
"finalize".to_string()
} else {
"process".to_string()
}
},
HashMap::from([
("finalize".to_string(), "finalize".to_string()),
("process".to_string(), "process".to_string()),
]),
)
.add_edge("finalize", END)
.compile()?;
// Visualize the graph
println!("=== Graph Structure ===");
println!("{}", graph.draw_ascii());
println!();
println!("=== Mermaid ===");
println!("{}", graph.draw_mermaid());
println!();
// Stream execution
println!("=== Execution ===");
let state = MessageState::with_messages(vec![Message::human("Hi")]);
let mut stream = graph.stream(state, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
let last_msg = event.state.last_message().unwrap();
println!("[{}] {}: {}", event.node, last_msg.role(), last_msg.content());
}
Ok(())
}
Output:
=== Graph Structure ===
Graph:
Nodes: finalize, greet, process
Entry: __start__ -> greet
Edges:
finalize -> __end__
greet -> process
process -> finalize | process [conditional]
=== Mermaid ===
graph TD
__start__(["__start__"])
finalize["finalize"]
greet["greet"]
process["process"]
__end__(["__end__"])
__start__ --> greet
finalize --> __end__
greet --> process
process -.-> |finalize| finalize
process -.-> |process| process
=== Execution ===
[greet] ai: Hello! Let me help you.
[process] ai: Processing your request...
[process] ai: Processing your request...
[finalize] ai: Done! Here's the result.
The process node executes twice because on the first pass the state has only 3 messages (the human message plus greet and process outputs), so the conditional edge loops back. On the second pass it has 4 messages, which exceeds the threshold, and routing proceeds to finalize.
Summary
In this tutorial you learned how to:
- Define graph state with MessageState or a custom State type
- Create nodes by implementing the Node<S> trait or using FnNode
- Build a graph with StateGraph using fixed and conditional edges
- Execute a graph with invoke() or stream it with stream()
- Visualize the graph with Mermaid, ASCII, DOT, and image output
Next Steps
- Build a ReAct Agent -- use the prebuilt create_react_agent helper for tool-calling agents
- Graph How-to Guides -- checkpointing, human-in-the-loop, streaming, and tool nodes
- Graph Concepts -- deeper look at state machines and the LangGraph execution model
Build a Deep Agent
This tutorial walks you through building a Deep Agent step by step. You will start with a minimal agent that can read and write files, then progressively add skills, subagents, memory, and custom configuration. By the end you will understand every layer of the deep agent stack.
What You Will Build
A Deep Agent that:
- Uses filesystem tools to read, write, and search files.
- Loads domain-specific skills from SKILL.md files.
- Delegates subtasks to custom subagents.
- Persists learned knowledge in an AGENTS.md memory file.
- Auto-summarizes conversation history when context grows large.
Prerequisites
Create a new binary crate:
cargo new deep-agent-tutorial
cd deep-agent-tutorial
Add dependencies to Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["deep", "openai"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Set your OpenAI API key:
export OPENAI_API_KEY="sk-..."
Step 1: Create a Backend
Every deep agent needs a backend that provides filesystem operations. The backend is the agent's view of the world -- it determines where files are read from and written to.
Synaptic ships three backend implementations:
- StateBackend -- in-memory HashMap<String, String>. Great for tests and sandboxed demos. No real files are touched.
- StoreBackend -- delegates to a Synaptic Store implementation. Useful when you already have a store with semantic search.
- FilesystemBackend -- reads and writes real files on disk, sandboxed to a root directory. Requires the filesystem feature flag.
For this tutorial we use StateBackend so everything runs in memory:
use std::sync::Arc;
use synaptic::deep::backend::{Backend, StateBackend};
let backend = Arc::new(StateBackend::new());
The deep agent wraps each backend operation as a tool that the model can call.
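You can also drive the backend directly, which is handy in tests -- these are the same operations the generated tools perform (a sketch reusing the write_file and read_file signatures shown later in this tutorial):
// Write a file into the in-memory backend, then read it back.
backend.write_file("notes.txt", "scratch space").await?;
let text = backend.read_file("notes.txt", 0, 100).await?;
assert!(text.contains("scratch space"));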
Step 2: Create a Minimal Deep Agent
The create_deep_agent function assembles a full middleware stack and tool set in one call. It returns a CompiledGraph<MessageState> -- the same graph type used by create_agent and create_react_agent, so you run it with invoke().
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::deep::backend::StateBackend;
use synaptic::core::{ChatModel, Message};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(StateBackend::new());
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model.clone(), options)?;
let state = MessageState::with_messages(vec![
Message::human("Create a file called hello.txt with 'Hello World!'"),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
println!("{}", final_state.last_message().unwrap().content());
Ok(())
}
What happens under the hood:
- DeepAgentOptions::new(backend) configures sensible defaults -- filesystem tools enabled, skills enabled, memory enabled, subagents enabled.
- create_deep_agent assembles 6 middleware layers and 6-7 tools, then calls create_agent to produce a compiled graph.
- agent.invoke(state) runs the agent loop. The model sees the write_file tool and calls it to create hello.txt in the backend.
- result.into_state() unwraps the GraphResult into the final MessageState.
Because we are using StateBackend, the file lives only in memory. You can verify it:
let content = backend.read_file("hello.txt", 0, 100).await?;
assert!(content.contains("Hello World!"));
Step 3: Use Filesystem Tools
The deep agent automatically registers these tools: ls, read_file, write_file, edit_file, glob, grep, and execute (if the backend supports shell commands).
Let us seed the backend with a small Rust project and ask the agent to analyze it:
// Seed files into the in-memory backend
backend.write_file("src/main.rs", r#"fn main() {
let items = vec![1, 2, 3, 4, 5];
let mut total = 0;
for i in items {
total = total + i;
}
println!("Total: {}", total);
// TODO: add error handling
// TODO: extract into a function
}
"#).await?;
backend.write_file("Cargo.toml", r#"[package]
name = "sample"
version = "0.1.0"
edition = "2021"
"#).await?;
let state = MessageState::with_messages(vec![
Message::human("Read src/main.rs. List all the TODO comments and suggest improvements."),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
println!("{}", final_state.last_message().unwrap().content());
The agent calls read_file to get the source, finds the TODO comments, and responds with suggestions. You can follow up with a write request:
let state = MessageState::with_messages(vec![
Message::human(
"Create src/lib.rs with a public function `sum_items(items: &[i32]) -> i32` \
that uses iter().sum(). Then update src/main.rs to use it."
),
]);
let result = agent.invoke(state).await?;
The agent uses write_file and edit_file to make the changes.
Step 4: Add Skills
Skills are domain-specific instructions stored as SKILL.md files in the backend. The SkillsMiddleware scans {skills_dir}/*/SKILL.md on each model call, parses YAML frontmatter for name and description, and injects a skill index into the system prompt. The agent can then read_file any skill for full details.
Write a skill file directly to the backend:
backend.write_file(
".skills/testing/SKILL.md",
"---\nname: testing\ndescription: Write comprehensive tests\n---\n\
Testing Skill\n\n\
When asked to test Rust code:\n\n\
1. Create a `tests/` module with `#[cfg(test)]`.\n\
2. Write at least one happy-path test and one edge-case test.\n\
3. Use `assert_eq!` with descriptive messages.\n\
4. Test error paths with `assert!(result.is_err())`.\n"
).await?;
Skills are enabled by default (enable_skills = true). When the agent processes a request, it sees the skill index in its system prompt:
<available_skills>
- **testing**: Write comprehensive tests (read `.skills/testing/SKILL.md` for details)
</available_skills>
The agent can call read_file on .skills/testing/SKILL.md to get the full instructions. This is progressive disclosure -- the index is always small, and full skill content is loaded on demand.
You can add multiple skills:
backend.write_file(
".skills/refactoring/SKILL.md",
"---\nname: refactoring\ndescription: Rust refactoring best practices\n---\n\
Refactoring Skill\n\n\
1. Prefer `iter().sum()` over manual loops.\n\
2. Add `#[must_use]` to pure functions.\n\
3. Run clippy before and after changes.\n"
).await?;
Step 5: Add Custom Subagents
The deep agent can spawn child agents via a task tool. Each child gets its own conversation, runs the same middleware stack, and returns a summary to the parent.
Define custom subagent types with SubAgentDef:
use synaptic::deep::SubAgentDef;
let mut options = DeepAgentOptions::new(backend.clone());
options.subagents = vec![SubAgentDef {
name: "researcher".to_string(),
description: "Research specialist".to_string(),
system_prompt: "You are a research assistant. Use grep and read_file to \
find information in the codebase. Report findings concisely."
.to_string(),
tools: vec![], // inherits filesystem tools from the deep agent
}];
let agent = create_deep_agent(model.clone(), options)?;
When the model calls the task tool, it passes a description and an optional agent_type. If agent_type matches a SubAgentDef name, the child uses that definition's system prompt and extra tools. Otherwise a general-purpose child agent is spawned.
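For illustration, a task call emitted by the model might look roughly like this -- the description and agent_type argument names follow the prose above, not a verified schema:
use synaptic::core::ToolCall;
use serde_json::json;
// Hypothetical task call targeting the "researcher" subagent defined above.
let call = ToolCall {
    id: "call_task_1".to_string(),
    name: "task".to_string(),
    arguments: json!({
        "description": "Find every TODO comment in src/ and summarize them",
        "agent_type": "researcher"
    }),
};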
Subagent depth is bounded by max_subagent_depth (default 3) to prevent runaway recursion. You can disable subagents entirely:
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_subagents = false;
let agent = create_deep_agent(model.clone(), options)?;
Step 6: Add Memory Persistence
The DeepMemoryMiddleware loads a memory file from the backend on each model call and injects it into the system prompt wrapped in <agent_memory> tags. Write an initial memory file:
backend.write_file(
"AGENTS.md",
"# Agent Memory\n\n\
- Always use Rust idioms\n\
- Prefer async/await over blocking I/O\n\
- User prefers 4-space indentation\n"
).await?;
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_memory = true; // this is already the default
let agent = create_deep_agent(model.clone(), options)?;
The agent now sees this in its system prompt on every call:
<agent_memory>
# Agent Memory
- Always use Rust idioms
- Prefer async/await over blocking I/O
- User prefers 4-space indentation
</agent_memory>
The memory file path defaults to "AGENTS.md". You can change it:
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("project-notes.md".to_string());
The agent can update memory by calling write_file or edit_file on the memory file. Future sessions will pick up the changes automatically.
Step 7: Customize Options
DeepAgentOptions gives you control over the entire agent stack:
let mut options = DeepAgentOptions::new(backend.clone());
// System prompt prepended to all model calls
options.system_prompt = Some("You are a coding assistant.".to_string());
// Token budget and summarization
options.max_input_tokens = 128_000; // default
options.summarization_threshold = 0.85; // default (85% of max)
options.eviction_threshold = 20_000; // evict large tool results (default)
// Subagent configuration
options.max_subagent_depth = 3; // default
options.enable_subagents = true; // default
// Feature toggles
options.enable_filesystem = true; // default
options.enable_skills = true; // default
options.enable_memory = true; // default
// Paths in the backend
options.skills_dir = Some(".skills".to_string()); // default
options.memory_file = Some("AGENTS.md".to_string()); // default
// Extensibility: add your own tools, middleware, checkpointer, or store
options.tools = vec![];
options.middleware = vec![];
options.checkpointer = None;
options.store = None;
options.subagents = vec![];
let agent = create_deep_agent(model.clone(), options)?;
Step 8: Putting It All Together
Here is a complete example that combines everything:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, SubAgentDef};
use synaptic::deep::backend::StateBackend;
use synaptic::core::{ChatModel, Message};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(StateBackend::new());
// Seed the workspace
backend.write_file("src/main.rs", "fn main() {\n println!(\"hello\");\n}\n").await?;
// Add a skill
backend.write_file(
".skills/testing/SKILL.md",
"---\nname: testing\ndescription: Write comprehensive tests\n---\n# Testing\nAlways write unit tests.\n"
).await?;
// Add agent memory
backend.write_file("AGENTS.md", "# Memory\n- Use Rust 2021 edition\n").await?;
// Configure the deep agent
let mut options = DeepAgentOptions::new(backend.clone());
options.system_prompt = Some("You are a senior Rust engineer. Be concise.".to_string());
options.max_input_tokens = 64_000;
options.summarization_threshold = 0.80;
options.max_subagent_depth = 2;
options.subagents = vec![SubAgentDef {
name: "researcher".to_string(),
description: "Code research specialist".to_string(),
system_prompt: "You research codebases and report findings.".to_string(),
tools: vec![],
}];
let agent = create_deep_agent(model, options)?;
// Run the agent
let state = MessageState::with_messages(vec![
Message::human(
"Audit this project: read all source files, find TODOs, \
and write a summary to REPORT.md."
),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
println!("{}", final_state.last_message().unwrap().content());
// Verify the report was created
let report = backend.read_file("REPORT.md", 0, 100).await?;
println!("--- REPORT.md ---\n{}", report);
Ok(())
}
How the Middleware Stack Works
create_deep_agent assembles this middleware stack in order:
- DeepMemoryMiddleware -- reads AGENTS.md and appends it to the system prompt.
- SkillsMiddleware -- scans .skills/*/SKILL.md and injects a skill index into the system prompt.
- FilesystemMiddleware -- registers filesystem tools. Evicts results larger than eviction_threshold tokens to .evicted/ files with a preview.
- SubAgentMiddleware -- provides the task tool for spawning child agents.
- DeepSummarizationMiddleware -- summarizes older messages when token count exceeds the threshold, saving full history to .context/history_N.md.
- PatchToolCallsMiddleware -- fixes malformed tool calls (strips code fences, deduplicates IDs, removes empty names).
- User middleware -- anything in options.middleware runs last.
Using a Real Filesystem Backend
For production use, enable the filesystem feature to work with real files:
[dependencies]
synaptic = { version = "0.2", features = ["deep", "openai"] }
synaptic-deep = { version = "0.2", features = ["filesystem"] }
Note: The filesystem feature is on the synaptic-deep crate directly because the synaptic facade does not forward it. Add synaptic-deep as an explicit dependency when you need FilesystemBackend.
use synaptic::deep::backend::FilesystemBackend;
let backend = Arc::new(FilesystemBackend::new("/path/to/workspace"));
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
FilesystemBackend sandboxes all operations to the root directory. Path traversal via .. is rejected. It also supports shell command execution via the execute tool.
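A quick way to see the sandboxing in action (a sketch that reuses the read_file signature from earlier in this tutorial):
// A path that escapes the workspace root should come back as an error.
let escape = backend.read_file("../outside.txt", 0, 100).await;
assert!(escape.is_err());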
Offline Mode (No API Key Required)
For testing and CI, combine StateBackend with ScriptedChatModel to run the entire deep agent without network access:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::deep::backend::StateBackend;
use synaptic::graph::MessageState;
use serde_json::json;
let backend = Arc::new(StateBackend::new());
// Script the model to: 1) write a file, 2) respond
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"Creating the file.",
vec![ToolCall {
id: "call_1".into(),
name: "write_file".into(),
arguments: json!({"path": "/output.txt", "content": "Hello from offline test!"}),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("Done! Created output.txt."),
usage: None,
},
]));
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
let state = MessageState::with_messages(vec![
Message::human("Create output.txt with a greeting."),
]);
let result = agent.invoke(state).await?.into_state();
// Verify the file was created in the virtual filesystem
let content = backend.read_file("/output.txt", 0, 100).await?;
assert!(content.contains("Hello from offline test!"));
This approach is ideal for:
- Unit tests -- deterministic, no API costs, fast execution
- CI pipelines -- no secrets required
- Demos -- runs anywhere without configuration
What You Built
Over the course of this tutorial you:
- Created a StateBackend as an in-memory filesystem for the agent.
- Used create_deep_agent to assemble a full agent with tools and middleware.
- Ran the agent with invoke() on a MessageState and extracted results with into_state().
- Registered built-in filesystem tools (ls, read_file, write_file, edit_file, glob, grep).
- Added domain skills via SKILL.md files with YAML frontmatter.
- Defined custom subagents with SubAgentDef for task delegation.
- Enabled persistent memory via AGENTS.md.
- Customized every option through DeepAgentOptions.
Next Steps
- Multi-Agent Patterns -- supervisor and swarm architectures
- Middleware -- write custom middleware for the agent stack
- Store -- persistent key-value storage with semantic search
Chat Models
Synaptic supports multiple LLM providers through the ChatModel trait defined in synaptic-core. Each provider lives in its own crate, giving you a uniform interface for sending messages and receiving responses -- whether you are using OpenAI, Anthropic, Gemini, or a local Ollama instance.
Providers
Each provider adapter lives in its own crate. You enable only the providers you need via feature flags:
| Provider | Adapter | Crate | Feature |
|---|---|---|---|
| OpenAI | OpenAiChatModel | synaptic-openai | "openai" |
| Anthropic | AnthropicChatModel | synaptic-anthropic | "anthropic" |
| Google Gemini | GeminiChatModel | synaptic-gemini | "gemini" |
| Ollama (local) | OllamaChatModel | synaptic-ollama | "ollama" |
use std::sync::Arc;
use synaptic::openai::OpenAiChatModel;
let model = OpenAiChatModel::new("gpt-4o");
For testing, use ScriptedChatModel (returns pre-defined responses) or FakeBackend (simulates HTTP responses without network calls).
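For example, a scripted model with one canned reply can stand in for a real provider in unit tests (a sketch following the constructor shape used in the deep agent offline example above):
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
// Returns "Scripted reply" for the first chat() call.
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![
    ChatResponse {
        message: Message::ai("Scripted reply"),
        usage: None,
    },
]));
// let response = model.chat(ChatRequest::new(vec![Message::human("Hi")])).await?;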
Wrappers
Synaptic provides composable wrappers that add behavior on top of any ChatModel:
| Wrapper | Purpose |
|---|---|
| RetryChatModel | Automatic retry with exponential backoff |
| RateLimitedChatModel | Concurrency-based rate limiting (semaphore) |
| TokenBucketChatModel | Token bucket rate limiting |
| StructuredOutputChatModel<T> | JSON schema enforcement for structured output |
| CachedChatModel | Response caching (exact-match or semantic) |
| BoundToolsChatModel | Automatically attach tool definitions to every request |
All wrappers implement ChatModel, so they can be stacked:
use std::sync::Arc;
use synaptic::models::{RetryChatModel, RetryPolicy, RateLimitedChatModel};
let model: Arc<dyn ChatModel> = Arc::new(base_model);
let with_retry = Arc::new(RetryChatModel::new(model, RetryPolicy::default()));
let with_rate_limit = RateLimitedChatModel::new(with_retry, 5);
Guides
- Streaming Responses -- consume tokens as they arrive with stream_chat()
- Control Tool Choice -- force, prevent, or target specific tool usage
- Structured Output -- get typed Rust structs from LLM responses
- Caching LLM Responses -- avoid redundant API calls with in-memory or semantic caching
- Retry & Rate Limiting -- handle transient failures and control request throughput
- Model Profiles -- query model capabilities and limits at runtime
Streaming Responses
This guide shows how to consume LLM responses as a stream of tokens, rather than waiting for the entire response to complete.
Overview
Every ChatModel in Synaptic provides two methods:
- chat() -- returns a complete ChatResponse once the model finishes generating.
- stream_chat() -- returns a ChatStream, which yields AIMessageChunk items as the model produces them.
Streaming is useful for displaying partial results to users in real time.
Basic streaming
Use stream_chat() and iterate over chunks with StreamExt::next():
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, AIMessageChunk};
async fn stream_example(model: &dyn ChatModel) -> Result<(), Box<dyn std::error::Error>> {
let request = ChatRequest::new(vec![
Message::human("Tell me a story about a brave robot"),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
print!("{}", chunk.content); // Print each token as it arrives
}
println!(); // Final newline
Ok(())
}
The ChatStream type is defined as:
type ChatStream<'a> = Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send + 'a>>;
Accumulating chunks into a message
AIMessageChunk supports the + and += operators for merging chunks together. After streaming completes, convert the accumulated result into a full Message:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, AIMessageChunk};
async fn accumulate_stream(model: &dyn ChatModel) -> Result<Message, Box<dyn std::error::Error>> {
let request = ChatRequest::new(vec![
Message::human("Summarize Rust's ownership model"),
]);
let mut stream = model.stream_chat(request);
let mut full = AIMessageChunk::default();
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
full += chunk; // Merge content, tool_calls, usage, etc.
}
let final_message = full.into_message();
println!("Complete response: {}", final_message.content());
Ok(final_message)
}
When merging chunks:
- content strings are concatenated.
- tool_calls are appended to the accumulated list.
- usage token counts are summed.
- The first non-None id is preserved.
Using the + operator
You can also combine two chunks with + without mutation:
let combined = chunk_a + chunk_b;
This produces a new AIMessageChunk with the merged fields from both.
Streaming with tool calls
When the model streams a response that includes tool calls, tool call data arrives across multiple chunks. After accumulation, the full tool call information is available on the resulting message:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message, AIMessageChunk, ToolDefinition};
use serde_json::json;
async fn stream_with_tools(model: &dyn ChatModel) -> Result<(), Box<dyn std::error::Error>> {
let tool = ToolDefinition {
name: "get_weather".to_string(),
description: "Get current weather".to_string(),
parameters: json!({"type": "object", "properties": {"city": {"type": "string"}}}),
};
let request = ChatRequest::new(vec![
Message::human("What's the weather in Paris?"),
]).with_tools(vec![tool]);
let mut stream = model.stream_chat(request);
let mut full = AIMessageChunk::default();
while let Some(chunk) = stream.next().await {
full += chunk?;
}
let message = full.into_message();
for tc in message.tool_calls() {
println!("Call tool '{}' with: {}", tc.name, tc.arguments);
}
Ok(())
}
Default streaming behavior
If a provider adapter does not implement native streaming, the default stream_chat() implementation wraps the chat() result as a single-chunk stream. This means you can always use stream_chat() regardless of provider -- you just may not get incremental token delivery from providers that do not support it natively.
Bind Tools to a Model
This guide shows how to include tool (function) definitions in a chat request so the model can decide to call them.
Defining tools
A ToolDefinition describes a tool the model can invoke. It has a name, description, and a JSON Schema for its parameters:
use synaptic::core::ToolDefinition;
use serde_json::json;
let weather_tool = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, e.g. 'Tokyo'"
}
},
"required": ["location"]
}),
};
Sending tools with a request
Use ChatRequest::with_tools() to attach tool definitions to a single request:
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition};
use serde_json::json;
async fn call_with_tools(model: &dyn ChatModel) -> Result<(), Box<dyn std::error::Error>> {
let tool_def = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name"
}
},
"required": ["location"]
}),
};
let request = ChatRequest::new(vec![
Message::human("What's the weather in Tokyo?"),
]).with_tools(vec![tool_def]);
let response = model.chat(request).await?;
// Check if the model decided to call any tools
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
Ok(())
}
Processing tool calls
When the model returns tool calls, each ToolCall contains:
- id -- a unique identifier for this call (used to match the tool result back)
- name -- the name of the tool to invoke
- arguments -- a serde_json::Value with the arguments
After executing the tool, send the result back as a Tool message:
use synaptic::core::{ChatRequest, Message, ToolCall};
use serde_json::json;
// Suppose the model returned a tool call
let tool_call = ToolCall {
id: "call_123".to_string(),
name: "get_weather".to_string(),
arguments: json!({"location": "Tokyo"}),
};
// Execute your tool logic...
let result = "Sunny, 22C";
// Send the result back in a follow-up request
let messages = vec![
Message::human("What's the weather in Tokyo?"),
Message::ai_with_tool_calls("", vec![tool_call]),
Message::tool(result, "call_123"), // tool_call_id must match
];
let follow_up = ChatRequest::new(messages);
// let final_response = model.chat(follow_up).await?;
Permanently binding tools with BoundToolsChatModel
If you want every request through a model to automatically include certain tool definitions, use BoundToolsChatModel:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition};
use synaptic::models::BoundToolsChatModel;
use serde_json::json;
let tools = vec![
ToolDefinition {
name: "get_weather".to_string(),
description: "Get weather for a city".to_string(),
parameters: json!({"type": "object", "properties": {"city": {"type": "string"}}}),
},
ToolDefinition {
name: "search".to_string(),
description: "Search the web".to_string(),
parameters: json!({"type": "object", "properties": {"query": {"type": "string"}}}),
},
];
let base_model: Arc<dyn ChatModel> = Arc::new(model); // any concrete ChatModel implementation
let bound = BoundToolsChatModel::new(base_model, tools);
// Now every call to bound.chat() will include both tools automatically
let request = ChatRequest::new(vec![Message::human("Look up Rust news")]);
// let response = bound.chat(request).await?;
Multiple tools
You can provide any number of tools. The model will choose which (if any) to call based on the conversation context:
let request = ChatRequest::new(vec![
Message::human("Search for Rust news and tell me the weather in Berlin"),
]).with_tools(vec![search_tool, weather_tool, calculator_tool]);
See also: Control Tool Choice for fine-grained control over which tools the model uses.
Control Tool Choice
This guide shows how to control whether and which tools the model uses when responding to a request.
Overview
When you attach tools to a ChatRequest, the model decides by default whether to call any of them. The ToolChoice enum lets you override this behavior, forcing the model to use tools, avoid them, or target a specific one.
The ToolChoice enum
use synaptic::core::ToolChoice;
// Auto -- the model decides whether to use tools (this is the default)
ToolChoice::Auto
// Required -- the model must call at least one tool
ToolChoice::Required
// None -- the model must not call any tools, even if tools are provided
ToolChoice::None
// Specific -- the model must call this exact tool
ToolChoice::Specific("get_weather".to_string())
Setting tool choice on a request
Use ChatRequest::with_tool_choice():
use synaptic::core::{ChatRequest, Message, ToolChoice, ToolDefinition};
use serde_json::json;
let tools = vec![
ToolDefinition {
name: "get_weather".to_string(),
description: "Get weather for a city".to_string(),
parameters: json!({
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}),
},
ToolDefinition {
name: "search".to_string(),
description: "Search the web".to_string(),
parameters: json!({
"type": "object",
"properties": {
"query": { "type": "string" }
},
"required": ["query"]
}),
},
];
let messages = vec![Message::human("What's the weather in London?")];
Auto (default)
The model chooses freely whether to call tools:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Auto);
This is equivalent to not calling with_tool_choice() at all.
Required
Force the model to call at least one tool. Useful when you know the user's intent maps to a tool call:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Required);
None
Prevent the model from calling tools, even though tools are provided. This is helpful when you want to temporarily disable tool usage without removing the definitions:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::None);
Specific
Force the model to call one specific tool by name. The model will always call this tool, regardless of the conversation context:
let request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Specific("get_weather".to_string()));
Practical patterns
Routing with specific tool choice
When building a multi-step agent, you can force a classification step by requiring a specific "router" tool:
let router_tool = ToolDefinition {
name: "route".to_string(),
description: "Classify the user's intent".to_string(),
parameters: json!({
"type": "object",
"properties": {
"intent": {
"type": "string",
"enum": ["weather", "search", "calculator"]
}
},
"required": ["intent"]
}),
};
let request = ChatRequest::new(vec![Message::human("What is 2 + 2?")])
.with_tools(vec![router_tool])
.with_tool_choice(ToolChoice::Specific("route".to_string()));
Two-phase generation
First call with Required to extract structured data, then call with None to generate a natural language response:
// Phase 1: extract data
let extract_request = ChatRequest::new(messages.clone())
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::Required);
// Phase 2: generate response (no tools)
let respond_request = ChatRequest::new(full_conversation)
.with_tools(tools.clone())
.with_tool_choice(ToolChoice::None);
Structured Output
This guide shows how to get typed Rust structs from LLM responses using StructuredOutputChatModel<T>.
Overview
StructuredOutputChatModel<T> wraps any ChatModel and instructs it to respond with valid JSON matching a schema you describe. It injects a system prompt with the schema instructions and provides a parse_response() method to deserialize the JSON into your Rust type.
Basic usage
Define your output type as a struct that implements Deserialize, then wrap your model:
use std::sync::Arc;
use serde::Deserialize;
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::StructuredOutputChatModel;
#[derive(Debug, Deserialize)]
struct MovieReview {
title: String,
rating: f32,
summary: String,
}
async fn get_review(base_model: Arc<dyn ChatModel>) -> Result<(), Box<dyn std::error::Error>> {
let structured = StructuredOutputChatModel::<MovieReview>::new(
base_model,
r#"{"title": "string", "rating": "number (1-10)", "summary": "string"}"#,
);
let request = ChatRequest::new(vec![
Message::human("Review the movie 'Interstellar'"),
]);
// Use generate() to get both the parsed struct and the raw response
let (review, _raw_response) = structured.generate(request).await?;
println!("Title: {}", review.title);
println!("Rating: {}/10", review.rating);
println!("Summary: {}", review.summary);
Ok(())
}
How it works
When you call chat() or generate() on a StructuredOutputChatModel:
- A system message is prepended to the request instructing the model to respond with valid JSON matching the schema description.
- The request is forwarded to the inner model.
- With generate(), the response text is parsed as JSON into your target type T.
The schema description is a free-form string. It does not need to be valid JSON Schema -- it just needs to clearly communicate the expected shape to the LLM:
// Simple field descriptions
let schema = r#"{"name": "string", "age": "integer", "hobbies": ["string"]}"#;
// More detailed descriptions
let schema = r#"{
"sentiment": "one of: positive, negative, neutral",
"confidence": "float between 0.0 and 1.0",
"key_phrases": "array of strings"
}"#;
Parsing responses manually
If you want to use the model as a normal ChatModel and parse later, you can call chat() followed by parse_response():
let structured = StructuredOutputChatModel::<MovieReview>::new(base_model, schema);
let response = structured.chat(request).await?;
let parsed: MovieReview = structured.parse_response(&response)?;
Handling markdown code blocks
The parser automatically handles responses wrapped in markdown code blocks. All of these formats are supported:
{"title": "Interstellar", "rating": 9.0, "summary": "..."}
```json
{"title": "Interstellar", "rating": 9.0, "summary": "..."}
```
```
{"title": "Interstellar", "rating": 9.0, "summary": "..."}
```
Complex output types
You can use nested structs, enums, and collections:
#[derive(Debug, Deserialize)]
struct AnalysisResult {
entities: Vec<Entity>,
sentiment: Sentiment,
language: String,
}
#[derive(Debug, Deserialize)]
struct Entity {
name: String,
entity_type: String,
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "lowercase")]
enum Sentiment {
Positive,
Negative,
Neutral,
}
let structured = StructuredOutputChatModel::<AnalysisResult>::new(
base_model,
r#"{
"entities": [{"name": "string", "entity_type": "person|org|location"}],
"sentiment": "positive|negative|neutral",
"language": "ISO 639-1 code"
}"#,
);
Combining with other wrappers
Since StructuredOutputChatModel<T> implements ChatModel, it composes with other wrappers:
use synaptic::models::{RetryChatModel, RetryPolicy};
let base: Arc<dyn ChatModel> = Arc::new(base_model);
let structured = Arc::new(StructuredOutputChatModel::<MovieReview>::new(
base,
r#"{"title": "string", "rating": "number", "summary": "string"}"#,
));
// Add retry logic on top
let reliable = RetryChatModel::new(structured, RetryPolicy::default());
Caching LLM Responses
This guide shows how to cache LLM responses to avoid redundant API calls and reduce latency.
Overview
Synaptic provides two cache implementations through the LlmCache trait:
- InMemoryCache -- exact-match caching with optional TTL expiration.
- SemanticCache -- embedding-based similarity matching for semantically equivalent queries.
Both are used with CachedChatModel, which wraps any ChatModel and checks the cache before making an API call.
Exact-match caching with InMemoryCache
The simplest cache stores responses keyed by the exact request content:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::cache::{InMemoryCache, CachedChatModel};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
let cache = Arc::new(InMemoryCache::new());
let cached_model = CachedChatModel::new(base_model, cache);
// First call hits the LLM
// let response1 = cached_model.chat(request.clone()).await?;
// Identical request returns cached response instantly
// let response2 = cached_model.chat(request.clone()).await?;
Cache with TTL
Set a time-to-live so entries expire automatically:
use std::time::Duration;
use std::sync::Arc;
use synaptic::cache::InMemoryCache;
// Entries expire after 1 hour
let cache = Arc::new(InMemoryCache::with_ttl(Duration::from_secs(3600)));
// Entries expire after 5 minutes
let cache = Arc::new(InMemoryCache::with_ttl(Duration::from_secs(300)));
After the TTL elapses, a cache lookup for that entry returns None, and the next request will hit the LLM again.
Semantic caching with SemanticCache
Semantic caching uses embeddings to find similar queries, even when the exact wording differs. For example, "What's the weather?" and "Tell me the current weather" could match the same cached response.
use std::sync::Arc;
use synaptic::cache::{SemanticCache, CachedChatModel};
use synaptic::openai::OpenAiEmbeddings;
let embeddings: Arc<dyn synaptic::embeddings::Embeddings> = Arc::new(embeddings_provider);
// Similarity threshold of 0.95 means only very similar queries match
let cache = Arc::new(SemanticCache::new(embeddings, 0.95));
let cached_model = CachedChatModel::new(base_model, cache);
When looking up a cached response:
- The query is embedded using the provided Embeddings implementation.
- The embedding is compared against all stored entries using cosine similarity (see the sketch below).
- If the best match exceeds the similarity threshold, the cached response is returned.
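The comparison itself is plain cosine similarity between embedding vectors. As a rough illustration of the math (not the SemanticCache internals):
// Cosine similarity in [-1, 1]; a lookup hits when the best score meets the threshold.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}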
Choosing a threshold
- 0.95 -- 0.99: Very strict. Only nearly identical queries match. Good for factual Q&A where slight wording changes can change meaning.
- 0.90 -- 0.95: Moderate. Catches common rephrasing. Good for general-purpose chatbots.
- 0.80 -- 0.90: Loose. Broader matching. Useful when you want aggressive caching and approximate answers are acceptable.
The LlmCache trait
Both cache types implement the LlmCache trait:
#[async_trait]
pub trait LlmCache: Send + Sync {
async fn get(&self, key: &str) -> Result<Option<ChatResponse>, SynapticError>;
async fn put(&self, key: &str, response: &ChatResponse) -> Result<(), SynapticError>;
async fn clear(&self) -> Result<(), SynapticError>;
}
You can implement this trait for custom cache backends (Redis, SQLite, etc.).
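As a sketch of what that might look like -- an in-memory map standing in for Redis or SQLite, and assuming ChatResponse implements Clone:
use std::collections::HashMap;
use std::sync::Mutex;
use async_trait::async_trait;
use synaptic::cache::LlmCache;
use synaptic::core::{ChatResponse, SynapticError};
// Hypothetical custom backend; swap the HashMap for a real store.
struct MyCache {
    entries: Mutex<HashMap<String, ChatResponse>>,
}
#[async_trait]
impl LlmCache for MyCache {
    async fn get(&self, key: &str) -> Result<Option<ChatResponse>, SynapticError> {
        // Assumes ChatResponse: Clone.
        Ok(self.entries.lock().unwrap().get(key).cloned())
    }
    async fn put(&self, key: &str, response: &ChatResponse) -> Result<(), SynapticError> {
        self.entries.lock().unwrap().insert(key.to_string(), response.clone());
        Ok(())
    }
    async fn clear(&self) -> Result<(), SynapticError> {
        self.entries.lock().unwrap().clear();
        Ok(())
    }
}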
Clearing the cache
Both cache implementations support clearing all entries:
use synaptic::cache::LlmCache;
// cache implements LlmCache
// cache.clear().await?;
Combining with other wrappers
Since CachedChatModel implements ChatModel, it composes with retry, rate limiting, and other wrappers:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::cache::{InMemoryCache, CachedChatModel};
use synaptic::models::{RetryChatModel, RetryPolicy};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Cache first, then retry on cache miss + API failure
let cache = Arc::new(InMemoryCache::new());
let cached = Arc::new(CachedChatModel::new(base_model, cache));
let reliable = RetryChatModel::new(cached, RetryPolicy::default());
Retry & Rate Limiting
This guide shows how to add automatic retry logic and rate limiting to any ChatModel.
Retry with RetryChatModel
RetryChatModel wraps a model and automatically retries on transient failures (rate limit errors and timeouts). It uses exponential backoff between attempts.
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::{RetryChatModel, RetryPolicy};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Use default policy: 3 attempts, 500ms base delay
let retry_model = RetryChatModel::new(base_model, RetryPolicy::default());
Custom retry policy
Configure the maximum number of attempts and the base delay for exponential backoff:
use std::time::Duration;
use synaptic::models::RetryPolicy;
let policy = RetryPolicy {
max_attempts: 5, // Try up to 5 times
base_delay: Duration::from_millis(200), // Start with 200ms delay
};
let retry_model = RetryChatModel::new(base_model, policy);
The delay between retries follows exponential backoff: base_delay * 2^attempt. With a 200ms base delay:
| Attempt | Delay before retry |
|---|---|
| 1st retry | 200ms |
| 2nd retry | 400ms |
| 3rd retry | 800ms |
| 4th retry | 1600ms |
Only retryable errors trigger retries:
- SynapticError::RateLimit -- the provider returned a rate limit response.
- SynapticError::Timeout -- the request timed out.
All other errors are returned immediately without retrying.
Streaming with retry
RetryChatModel also retries stream_chat() calls. If a retryable error occurs during streaming, the entire stream is retried from the beginning.
Concurrency limiting with RateLimitedChatModel
RateLimitedChatModel uses a semaphore to limit the number of concurrent requests to the underlying model:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::RateLimitedChatModel;
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Allow at most 5 concurrent requests
let limited = RateLimitedChatModel::new(base_model, 5);
When the concurrency limit is reached, additional callers wait until a slot becomes available. This is useful for:
- Respecting provider concurrency limits.
- Preventing resource exhaustion in high-throughput applications.
- Controlling costs by limiting parallel API calls.
Token bucket rate limiting with TokenBucketChatModel
TokenBucketChatModel uses a token bucket algorithm for smoother rate limiting. The bucket starts full and refills at a steady rate:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::TokenBucketChatModel;
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// Bucket capacity: 100 tokens, refill rate: 10 tokens/second
let throttled = TokenBucketChatModel::new(base_model, 100.0, 10.0);
Each chat() or stream_chat() call consumes one token from the bucket. When the bucket is empty, callers wait until a token is refilled.
Parameters:
- capacity -- the maximum burst size. A capacity of 100 allows 100 rapid-fire requests before throttling kicks in.
- refill_rate -- tokens added per second. A rate of 10.0 means the bucket refills at 10 tokens per second.
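For intuition, the core of a token bucket looks roughly like this (illustrative logic only, not the wrapper's internals):
use std::time::Instant;
// Refills continuously at refill_rate tokens/second, capped at capacity;
// each request consumes one token or must wait for the next refill.
struct Bucket {
    capacity: f64,
    tokens: f64,
    refill_rate: f64,
    last_refill: Instant,
}
impl Bucket {
    fn try_acquire(&mut self) -> bool {
        let elapsed = self.last_refill.elapsed().as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_rate).min(self.capacity);
        self.last_refill = Instant::now();
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false // caller waits and retries
        }
    }
}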
Token bucket vs concurrency limiting
| Feature | RateLimitedChatModel | TokenBucketChatModel |
|---|---|---|
| Controls | Concurrent requests | Request rate over time |
| Mechanism | Semaphore | Token bucket |
| Burst handling | Blocks when N requests are in-flight | Allows bursts up to capacity |
| Best for | Concurrency limits | Rate limits (requests/second) |
Stacking wrappers
All wrappers implement ChatModel, so they compose naturally. A common pattern is retry on the outside, rate limiting on the inside:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::{RetryChatModel, RetryPolicy, TokenBucketChatModel};
let base_model: Arc<dyn ChatModel> = Arc::new(model);
// First, apply rate limiting
let throttled: Arc<dyn ChatModel> = Arc::new(
TokenBucketChatModel::new(base_model, 50.0, 5.0)
);
// Then, add retry on top
let reliable = RetryChatModel::new(throttled, RetryPolicy::default());
This ensures that retried requests also go through the rate limiter, preventing retry storms from overwhelming the provider.
Model Profiles
ModelProfile exposes a model's capabilities and limits so that calling code can inspect provider support flags at runtime without hard-coding provider-specific knowledge.
The ModelProfile Struct
pub struct ModelProfile {
pub name: String,
pub provider: String,
pub supports_tool_calling: bool,
pub supports_structured_output: bool,
pub supports_streaming: bool,
pub max_input_tokens: Option<usize>,
pub max_output_tokens: Option<usize>,
}
| Field | Type | Description |
|---|---|---|
| name | String | Model identifier (e.g. "gpt-4o", "claude-3-opus") |
| provider | String | Provider name (e.g. "openai", "anthropic") |
| supports_tool_calling | bool | Whether the model can handle ToolDefinition in requests |
| supports_structured_output | bool | Whether the model supports JSON schema enforcement |
| supports_streaming | bool | Whether stream_chat() produces real token-level chunks |
| max_input_tokens | Option<usize> | Maximum context window size, if known |
| max_output_tokens | Option<usize> | Maximum generation length, if known |
Querying a Model's Profile
Every ChatModel implementation exposes a profile() method that returns Option<ModelProfile>. The default implementation returns None, so providers opt in by overriding it:
use synaptic::core::ChatModel;
let model = my_chat_model();
if let Some(profile) = model.profile() {
println!("Provider: {}", profile.provider);
println!("Supports tools: {}", profile.supports_tool_calling);
if let Some(max) = profile.max_input_tokens {
println!("Context window: {} tokens", max);
}
} else {
println!("No profile available for this model");
}
Using Profiles for Capability Checks
Profiles are useful when writing generic code that works across multiple providers. For example, you can guard tool-calling or structured-output logic behind a capability check:
use synaptic::core::{ChatModel, ChatRequest, ChatResponse, SynapticError, ToolChoice};
async fn maybe_call_with_tools(
model: &dyn ChatModel,
request: ChatRequest,
) -> Result<ChatResponse, SynapticError> {
let supports_tools = model
.profile()
.map(|p| p.supports_tool_calling)
.unwrap_or(false);
if supports_tools {
let request = request.with_tool_choice(ToolChoice::Auto);
model.chat(request).await
} else {
// Fall back to plain chat without tools
model.chat(ChatRequest::new(request.messages)).await
}
}
Implementing profile() for a Custom Model
If you implement your own ChatModel, override profile() to advertise capabilities:
use async_trait::async_trait;
use synaptic::core::{ChatModel, ModelProfile};
#[async_trait]
impl ChatModel for MyCustomModel {
// ... chat() and stream_chat() ...
fn profile(&self) -> Option<ModelProfile> {
Some(ModelProfile {
name: "my-model-v1".to_string(),
provider: "custom".to_string(),
supports_tool_calling: true,
supports_structured_output: false,
supports_streaming: true,
max_input_tokens: Some(128_000),
max_output_tokens: Some(4_096),
})
}
}
Messages
Messages are the fundamental unit of communication in Synaptic. Every interaction with a chat model is expressed as a sequence of Message values, and every response comes back as a Message.
The Message enum is defined in synaptic_core and uses a tagged union with six variants: System, Human, AI, Tool, Chat, and Remove. You create messages through factory methods rather than struct literals.
Quick example
use synaptic::core::{ChatRequest, Message};
let messages = vec![
Message::system("You are a helpful assistant."),
Message::human("What is Rust?"),
];
let request = ChatRequest::new(messages);
Guides
- Message Types -- all message variants, factory methods, and accessor methods
- Filter & Trim Messages -- select messages by type/name/id and trim to a token budget
- Merge Message Runs -- combine consecutive messages of the same role into one
Message Types
This guide covers all message variants in Synaptic, how to create them, and how to inspect their contents.
The Message enum
Message is a tagged enum (#[serde(tag = "role")]) with six variants:
| Variant | Factory method | Role string | Purpose |
|---|---|---|---|
| System | Message::system() | "system" | System instructions for the model |
| Human | Message::human() | "human" | User input |
| AI | Message::ai() | "assistant" | Model response (text only) |
| AI (with tools) | Message::ai_with_tool_calls() | "assistant" | Model response with tool calls |
| Tool | Message::tool() | "tool" | Tool execution result |
| Chat | Message::chat() | custom | Custom role message |
| Remove | Message::remove() | "remove" | Signals removal of a message by ID |
Creating messages
Always use factory methods instead of constructing enum variants directly:
use synaptic::core::{Message, ToolCall};
use serde_json::json;
// System message -- sets the model's behavior
let system = Message::system("You are a helpful assistant.");
// Human message -- user input
let human = Message::human("Hello, how are you?");
// AI message -- plain text response
let ai = Message::ai("I'm doing well, thanks for asking!");
// AI message with tool calls
let ai_tools = Message::ai_with_tool_calls(
"Let me look that up for you.",
vec![
ToolCall {
id: "call_1".to_string(),
name: "search".to_string(),
arguments: json!({"query": "Rust programming"}),
},
],
);
// Tool message -- result of a tool execution
// Second argument is the tool_call_id, which must match the ToolCall's id
let tool = Message::tool("Found 42 results for 'Rust programming'", "call_1");
// Chat message -- custom role
let chat = Message::chat("moderator", "This conversation is on topic.");
// Remove message -- used in message history management
let remove = Message::remove("msg-id-to-remove");
Accessor methods
All message variants share a common set of accessor methods:
use synaptic::core::Message;
let msg = Message::human("Hello!");
// Get the role as a string
assert_eq!(msg.role(), "human");
// Get the text content
assert_eq!(msg.content(), "Hello!");
// Type-checking predicates
assert!(msg.is_human());
assert!(!msg.is_ai());
assert!(!msg.is_system());
assert!(!msg.is_tool());
assert!(!msg.is_chat());
assert!(!msg.is_remove());
// Tool-related accessors (empty/None for non-AI/non-Tool messages)
assert!(msg.tool_calls().is_empty());
assert!(msg.tool_call_id().is_none());
// Optional fields
assert!(msg.id().is_none());
assert!(msg.name().is_none());
Tool call accessors
use synaptic::core::{Message, ToolCall};
use serde_json::json;
let ai = Message::ai_with_tool_calls("", vec![
ToolCall {
id: "call_1".into(),
name: "search".into(),
arguments: json!({"q": "rust"}),
},
]);
// Get all tool calls (only meaningful for AI messages)
let calls = ai.tool_calls();
assert_eq!(calls.len(), 1);
assert_eq!(calls[0].name, "search");
let tool_msg = Message::tool("result", "call_1");
// Get the tool_call_id (only meaningful for Tool messages)
assert_eq!(tool_msg.tool_call_id(), Some("call_1"));
Builder methods
Messages support a builder pattern for setting optional fields:
use synaptic::core::Message;
use serde_json::json;
let msg = Message::human("Hello!")
.with_id("msg-001")
.with_name("Alice")
.with_additional_kwarg("source", json!("web"))
.with_response_metadata_entry("model", json!("gpt-4o"));
assert_eq!(msg.id(), Some("msg-001"));
assert_eq!(msg.name(), Some("Alice"));
Available builder methods:
| Method | Description |
|---|---|
.with_id(id) | Set the message ID |
.with_name(name) | Set the sender name |
.with_additional_kwarg(key, value) | Add an arbitrary key-value pair |
.with_response_metadata_entry(key, value) | Add response metadata |
.with_content_blocks(blocks) | Set multimodal content blocks |
.with_usage_metadata(usage) | Set token usage (AI messages only) |
Serialization
Messages serialize to JSON with a "role" tag:
use synaptic::core::Message;
let msg = Message::human("Hello!");
let json = serde_json::to_string_pretty(&msg).unwrap();
// {
// "role": "human",
// "content": "Hello!"
// }
Note that the AI variant serializes with "role": "assistant" (not "ai"), matching the convention used by most LLM providers.
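For example, serializing an AI message (a minimal sketch; optional fields such as tool calls may also appear in the output):
use synaptic::core::Message;
let ai = Message::ai("Hi there!");
let json = serde_json::to_string(&ai).unwrap();
// The serialized role tag is "assistant", e.g. {"role":"assistant","content":"Hi there!"}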
Filter & Trim Messages
This guide shows how to select specific messages from a conversation and trim message lists to fit within token budgets.
Filtering messages with filter_messages
The filter_messages function selects messages based on their type (role), name, or ID. It supports both inclusion and exclusion filters.
use synaptic::core::{filter_messages, Message};
Filter by type
let messages = vec![
Message::system("You are helpful."),
Message::human("Question 1"),
Message::ai("Answer 1"),
Message::human("Question 2"),
Message::ai("Answer 2"),
];
// Keep only human messages
let humans = filter_messages(
&messages,
Some(&["human"]), // include_types
None, // exclude_types
None, // include_names
None, // exclude_names
None, // include_ids
None, // exclude_ids
);
assert_eq!(humans.len(), 2);
assert_eq!(humans[0].content(), "Question 1");
assert_eq!(humans[1].content(), "Question 2");
Exclude by type
// Remove system messages, keep everything else
let without_system = filter_messages(
&messages,
None, // include_types
Some(&["system"]), // exclude_types
None, None, None, None,
);
assert_eq!(without_system.len(), 4);
Filter by name
let messages = vec![
Message::human("Hi").with_name("Alice"),
Message::human("Hello").with_name("Bob"),
Message::ai("Hey!"),
];
// Only messages from Alice
let alice_msgs = filter_messages(
&messages,
None, None,
Some(&["Alice"]), // include_names
None, None, None,
);
assert_eq!(alice_msgs.len(), 1);
assert_eq!(alice_msgs[0].content(), "Hi");
Filter by ID
let messages = vec![
Message::human("First").with_id("msg-1"),
Message::human("Second").with_id("msg-2"),
Message::human("Third").with_id("msg-3"),
];
// Exclude a specific message
let filtered = filter_messages(
&messages,
None, None, None, None,
None, // include_ids
Some(&["msg-2"]), // exclude_ids
);
assert_eq!(filtered.len(), 2);
Combining filters
All filter parameters can be combined. A message must pass all active filters to be included:
// Keep only human messages from Alice
let result = filter_messages(
&messages,
Some(&["human"]), // include_types
None, // exclude_types
Some(&["Alice"]), // include_names
None, None, None,
);
Trimming messages with trim_messages
The trim_messages function trims a message list to fit within a token budget. It supports two strategies: keep the first messages or keep the last messages.
use synaptic::core::{trim_messages, TrimStrategy, Message};
Keep last messages (most common)
This is the typical pattern for chat applications where you want to preserve the most recent context:
let messages = vec![
Message::system("You are a helpful assistant."),
Message::human("Question 1"),
Message::ai("Answer 1"),
Message::human("Question 2"),
Message::ai("Answer 2"),
Message::human("Question 3"),
];
// Simple token counter: estimate ~4 chars per token
let token_counter = |msg: &Message| -> usize {
msg.content().len() / 4
};
// Keep last messages within 50 tokens, preserve the system message
let trimmed = trim_messages(
messages,
50, // max_tokens
token_counter,
TrimStrategy::Last,
true, // include_system: preserve the leading system message
);
// Result: system message + as many recent messages as fit in the budget
assert!(trimmed[0].is_system());
Keep first messages
Useful when you want to preserve the beginning of a conversation:
let trimmed = trim_messages(
messages,
50,
token_counter,
TrimStrategy::First,
false, // include_system not relevant for First strategy
);
The include_system parameter
When using TrimStrategy::Last with include_system: true:
- If the first message is a system message, it is always preserved.
- The system message's tokens are subtracted from the budget.
- The remaining budget is filled with messages from the end of the list.
This ensures your system prompt is never trimmed away, even as the conversation grows.
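To make the budget arithmetic concrete, here is a minimal sketch that reuses the messages and character-based token_counter from the example above with a deliberately tight budget (the exact number of surviving turns depends on your counter):
let trimmed = trim_messages(
    messages,
    15, // max_tokens -- deliberately tight
    token_counter,
    TrimStrategy::Last,
    true, // include_system
);
// The system prompt survives; its cost is deducted from the budget,
// and only the most recent turns that fit the remainder are kept.
assert!(trimmed[0].is_system());
assert!(trimmed.len() < 6);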
Custom token counters
The token_counter parameter is a function that takes a &Message and returns a usize token count. You can use any estimation strategy:
// Simple character-based estimate
let simple = |msg: &Message| -> usize { msg.content().len() / 4 };
// Word-based estimate
let word_based = |msg: &Message| -> usize {
msg.content().split_whitespace().count()
};
// Fixed cost per message (useful when all messages are similar size)
let fixed = |_msg: &Message| -> usize { 10 };
Merge Message Runs
This guide shows how to use merge_message_runs to combine consecutive messages of the same role into a single message.
Overview
Some LLM providers require alternating message roles (human, assistant, human, assistant). If your message history has consecutive messages from the same role, you can merge them into one message before sending the request.
Basic usage
use synaptic::core::{merge_message_runs, Message};
let messages = vec![
Message::human("Hello"),
Message::human("How are you?"), // Same role as previous
Message::ai("I'm fine!"),
Message::ai("Thanks for asking!"), // Same role as previous
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 2);
assert_eq!(merged[0].content(), "Hello\nHow are you?");
assert_eq!(merged[1].content(), "I'm fine!\nThanks for asking!");
How merging works
When two consecutive messages share the same role:
- Their content strings are joined with a newline (\n).
- For AI messages, tool_calls and invalid_tool_calls from subsequent messages are appended to the first message's lists.
- The resulting message retains the id, name, and other metadata of the first message in the run (see the sketch below).
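A small sketch of the metadata rule, using only the factory, builder, and accessor methods shown earlier:
use synaptic::core::{merge_message_runs, Message};
let messages = vec![
    Message::human("First part").with_id("msg-1").with_name("Alice"),
    Message::human("Second part").with_id("msg-2"),
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 1);
assert_eq!(merged[0].content(), "First part\nSecond part");
// The id and name come from the first message in the run
assert_eq!(merged[0].id(), Some("msg-1"));
assert_eq!(merged[0].name(), Some("Alice"));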
Merging AI messages with tool calls
Tool calls from consecutive AI messages are combined:
use synaptic::core::{merge_message_runs, Message, ToolCall};
use serde_json::json;
let messages = vec![
Message::ai_with_tool_calls("Looking up weather...", vec![
ToolCall {
id: "call_1".into(),
name: "get_weather".into(),
arguments: json!({"city": "Tokyo"}),
},
]),
Message::ai_with_tool_calls("Also checking news...", vec![
ToolCall {
id: "call_2".into(),
name: "search_news".into(),
arguments: json!({"query": "Tokyo"}),
},
]),
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 1);
assert_eq!(merged[0].content(), "Looking up weather...\nAlso checking news...");
assert_eq!(merged[0].tool_calls().len(), 2);
Preserving different roles
Messages with different roles are never merged, even if they appear to be related:
use synaptic::core::{merge_message_runs, Message};
let messages = vec![
Message::system("Be helpful."),
Message::human("Hi"),
Message::ai("Hello!"),
Message::human("Bye"),
];
let merged = merge_message_runs(messages);
assert_eq!(merged.len(), 4); // No change -- all roles are different
Practical use case: preparing messages for providers
Some providers reject requests with consecutive same-role messages. Use merge_message_runs to clean up before sending:
use synaptic::core::{merge_message_runs, ChatRequest, Message};
let conversation = vec![
Message::system("You are a translator."),
Message::human("Translate to French:"),
Message::human("Hello, how are you?"), // User sent two messages in a row
Message::ai("Bonjour, comment allez-vous ?"),
];
let cleaned = merge_message_runs(conversation);
let request = ChatRequest::new(cleaned);
// Now safe to send: roles alternate correctly
Empty input
merge_message_runs returns an empty vector when given an empty input:
use synaptic::core::merge_message_runs;
let result = merge_message_runs(vec![]);
assert!(result.is_empty());
Prompts
Synaptic provides two levels of prompt template:
- PromptTemplate -- simple string interpolation with {{ variable }} syntax. Takes a HashMap<String, String> and returns a rendered String.
- ChatPromptTemplate -- produces a Vec<Message> from a sequence of MessageTemplate entries. Each entry can be a system, human, or AI message template, or a Placeholder that injects an existing list of messages.
Both template types implement the Runnable trait, so they compose directly with chat models, output parsers, and other runnables using the LCEL pipe operator (|).
Quick Example
use synaptic::prompts::{PromptTemplate, ChatPromptTemplate, MessageTemplate};
// Simple string template
let pt = PromptTemplate::new("Hello, {{ name }}!");
let mut values = std::collections::HashMap::new();
values.insert("name".to_string(), "world".to_string());
assert_eq!(pt.render(&values).unwrap(), "Hello, world!");
// Chat message template (produces Vec<Message>)
let chat = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
Sub-Pages
- Chat Prompt Template -- build multi-message prompts with variable interpolation and placeholders
- Few-Shot Prompting -- inject example conversations for few-shot learning
Chat Prompt Template
ChatPromptTemplate produces a Vec<Message> from a sequence of MessageTemplate entries. Each entry renders one or more messages with {{ variable }} interpolation. The template implements the Runnable trait, so it integrates directly into LCEL pipelines.
Creating a Template
Use ChatPromptTemplate::from_messages() (or new()) with a vector of MessageTemplate variants:
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
Rendering with format()
Call format() with a HashMap<String, serde_json::Value> to produce messages:
use std::collections::HashMap;
use serde_json::json;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
let values: HashMap<String, serde_json::Value> = HashMap::from([
("role".to_string(), json!("helpful")),
("question".to_string(), json!("What is Rust?")),
]);
let messages = template.format(&values).unwrap();
// messages[0] => Message::system("You are a helpful assistant.")
// messages[1] => Message::human("What is Rust?")
Using as a Runnable
Because ChatPromptTemplate implements Runnable<HashMap<String, Value>, Vec<Message>>, you can call invoke() or compose it with the pipe operator:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::RunnableConfig;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::runnables::Runnable;
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
let config = RunnableConfig::default();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("role".to_string(), json!("helpful")),
("question".to_string(), json!("What is Rust?")),
]);
let messages = template.invoke(values, &config).await?;
// messages = [Message::system("You are a helpful assistant."), Message::human("What is Rust?")]
MessageTemplate Variants
MessageTemplate is an enum with four variants:
| Variant | Description |
|---|---|
MessageTemplate::system(text) | Renders a system message from a template string |
MessageTemplate::human(text) | Renders a human message from a template string |
MessageTemplate::ai(text) | Renders an AI message from a template string |
MessageTemplate::Placeholder(key) | Injects a list of messages from the input map |
Placeholder Example
Placeholder injects messages stored under a key in the input map. The value must be a JSON array of serialized Message objects. This is useful for injecting conversation history:
use std::collections::HashMap;
use serde_json::json;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are helpful."),
MessageTemplate::Placeholder("history".to_string()),
MessageTemplate::human("{{ input }}"),
]);
let history = json!([
{"role": "human", "content": "Hi"},
{"role": "assistant", "content": "Hello!"}
]);
let values: HashMap<String, serde_json::Value> = HashMap::from([
("history".to_string(), history),
("input".to_string(), json!("How are you?")),
]);
let messages = template.format(&values).unwrap();
// messages[0] => System("You are helpful.")
// messages[1] => Human("Hi") -- from placeholder
// messages[2] => AI("Hello!") -- from placeholder
// messages[3] => Human("How are you?")
Composing in a Pipeline
A common pattern is to pipe a prompt template into a chat model and then into an output parser:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::{ChatModel, ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Rust is a systems programming language."),
usage: None,
},
]);
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a {{ role }} assistant."),
MessageTemplate::human("{{ question }}"),
]);
let chain = template.boxed() | model.boxed() | StrOutputParser.boxed();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("role".to_string(), json!("helpful")),
("question".to_string(), json!("What is Rust?")),
]);
let config = RunnableConfig::default();
let result: String = chain.invoke(values, &config).await.unwrap();
// result = "Rust is a systems programming language."
Few-Shot Prompting
FewShotChatMessagePromptTemplate injects example conversations into a prompt for few-shot learning. Each example is a pair of human input and AI output, formatted as alternating Human and AI messages. An optional system prefix message can be prepended.
Basic Usage
Create the template with a list of FewShotExample values and a suffix PromptTemplate for the user's actual query:
use std::collections::HashMap;
use synaptic::prompts::{
FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
let template = FewShotChatMessagePromptTemplate::new(
vec![
FewShotExample {
input: "What is 2+2?".to_string(),
output: "4".to_string(),
},
FewShotExample {
input: "What is 3+3?".to_string(),
output: "6".to_string(),
},
],
PromptTemplate::new("{{ question }}"),
);
let values = HashMap::from([
("question".to_string(), "What is 4+4?".to_string()),
]);
let messages = template.format(&values).unwrap();
// messages[0] => Human("What is 2+2?") -- example 1 input
// messages[1] => AI("4") -- example 1 output
// messages[2] => Human("What is 3+3?") -- example 2 input
// messages[3] => AI("6") -- example 2 output
// messages[4] => Human("What is 4+4?") -- actual query (suffix)
Each FewShotExample has two fields:
- input -- the human message for this example
- output -- the AI response for this example
The suffix template is rendered with the user-provided variables and appended as the final human message.
Adding a System Prefix
Use with_prefix() to prepend a system message before the examples:
use std::collections::HashMap;
use synaptic::prompts::{
FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
let template = FewShotChatMessagePromptTemplate::new(
vec![FewShotExample {
input: "hi".to_string(),
output: "hello".to_string(),
}],
PromptTemplate::new("{{ input }}"),
)
.with_prefix(PromptTemplate::new("You are a polite assistant."));
let values = HashMap::from([("input".to_string(), "hey".to_string())]);
let messages = template.format(&values).unwrap();
// messages[0] => System("You are a polite assistant.") -- prefix
// messages[1] => Human("hi") -- example input
// messages[2] => AI("hello") -- example output
// messages[3] => Human("hey") -- actual query
The prefix template supports {{ variable }} interpolation, so you can parameterize the system message too.
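A sketch of a parameterized prefix, assuming the prefix is rendered from the same values map passed to format() (the tone variable here is illustrative):
use std::collections::HashMap;
use synaptic::prompts::{
    FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
let template = FewShotChatMessagePromptTemplate::new(
    vec![FewShotExample {
        input: "hi".to_string(),
        output: "hello".to_string(),
    }],
    PromptTemplate::new("{{ input }}"),
)
.with_prefix(PromptTemplate::new("You are a {{ tone }} assistant."));
let values = HashMap::from([
    ("tone".to_string(), "formal".to_string()),
    ("input".to_string(), "hey".to_string()),
]);
let messages = template.format(&values).unwrap();
// messages[0] => System("You are a formal assistant.")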
Using as a Runnable
FewShotChatMessagePromptTemplate implements Runnable<HashMap<String, String>, Vec<Message>>, so you can call invoke() or compose it in pipelines:
use std::collections::HashMap;
use synaptic::core::RunnableConfig;
use synaptic::prompts::{
FewShotChatMessagePromptTemplate, FewShotExample, PromptTemplate,
};
use synaptic::runnables::Runnable;
let template = FewShotChatMessagePromptTemplate::new(
vec![FewShotExample {
input: "x".to_string(),
output: "y".to_string(),
}],
PromptTemplate::new("{{ q }}"),
);
let config = RunnableConfig::default();
let values = HashMap::from([("q".to_string(), "z".to_string())]);
let messages = template.invoke(values, &config).await?;
// 3 messages: Human("x"), AI("y"), Human("z")
Note: The Runnable implementation for FewShotChatMessagePromptTemplate takes HashMap<String, String>, while ChatPromptTemplate takes HashMap<String, serde_json::Value>. This difference reflects their underlying template rendering: few-shot templates use PromptTemplate::render(), which works with string values.
Output Parsers
Output parsers transform raw LLM output into structured data. Every parser in Synaptic implements the Runnable trait, so they compose naturally with prompt templates, chat models, and other runnables using the LCEL pipe operator (|).
Available Parsers
| Parser | Input | Output | Description |
|---|---|---|---|
StrOutputParser | Message | String | Extracts the text content from a message |
JsonOutputParser | String | serde_json::Value | Parses a string as JSON |
StructuredOutputParser<T> | String | T | Deserializes JSON into a typed struct |
ListOutputParser | String | Vec<String> | Splits by a configurable separator |
EnumOutputParser | String | String | Validates against a list of allowed values |
BooleanOutputParser | String | bool | Parses yes/no/true/false strings |
MarkdownListOutputParser | String | Vec<String> | Parses markdown bullet lists |
NumberedListOutputParser | String | Vec<String> | Parses numbered lists |
XmlOutputParser | String | XmlElement | Parses XML into a tree structure |
All parsers also implement the FormatInstructions trait, which provides a get_format_instructions() method. You can include these instructions in your prompt to guide the LLM toward producing output in the expected format.
Quick Example
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::{Message, RunnableConfig};
let parser = StrOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(Message::ai("Hello world"), &config).await?;
assert_eq!(result, "Hello world");
Sub-Pages
- Basic Parsers -- StrOutputParser, JsonOutputParser, ListOutputParser
- Structured Parser -- deserialize JSON into typed Rust structs
- Enum Parser -- validate output against a fixed set of values
Basic Parsers
Synaptic provides several simple output parsers for common transformations. Each implements Runnable, so it can be used standalone or composed in a pipeline.
StrOutputParser
Extracts the text content from a Message. This is the most commonly used parser -- it sits at the end of most chains to convert the model's response into a plain String.
Signature: Runnable<Message, String>
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::{Message, RunnableConfig};
let parser = StrOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(Message::ai("Hello world"), &config).await?;
assert_eq!(result, "Hello world");
StrOutputParser works with any Message variant -- system, human, AI, or tool messages all have content that can be extracted.
JsonOutputParser
Parses a JSON string into a serde_json::Value. Useful when you need to work with arbitrary JSON structures without defining a specific Rust type.
Signature: Runnable<String, serde_json::Value>
use synaptic::parsers::JsonOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = JsonOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(
r#"{"name": "Synaptic", "version": 1}"#.to_string(),
&config,
).await?;
assert_eq!(result["name"], "Synaptic");
assert_eq!(result["version"], 1);
If the input is not valid JSON, the parser returns Err(SynapticError::Parsing(...)).
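A quick sketch of the failure path, reusing the parser and config from the snippet above:
// Invalid JSON surfaces as Err(SynapticError::Parsing(...))
let result = parser.invoke("not json at all".to_string(), &config).await;
assert!(result.is_err());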
ListOutputParser
Splits a string into a Vec<String> using a configurable separator. Useful when you ask the LLM to return a comma-separated or newline-separated list.
Signature: Runnable<String, Vec<String>>
use synaptic::parsers::{ListOutputParser, ListSeparator};
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let config = RunnableConfig::default();
// Split on commas
let parser = ListOutputParser::comma();
let result = parser.invoke("apple, banana, cherry".to_string(), &config).await?;
assert_eq!(result, vec!["apple", "banana", "cherry"]);
// Split on newlines (default)
let parser = ListOutputParser::newline();
let result = parser.invoke("first\nsecond\nthird".to_string(), &config).await?;
assert_eq!(result, vec!["first", "second", "third"]);
// Custom separator
let parser = ListOutputParser::new(ListSeparator::Custom("|".to_string()));
let result = parser.invoke("a | b | c".to_string(), &config).await?;
assert_eq!(result, vec!["a", "b", "c"]);
Each item is trimmed of leading and trailing whitespace. Empty items after trimming are filtered out.
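A short sketch of that cleanup behavior, reusing the config from above:
let parser = ListOutputParser::comma();
// Surrounding whitespace is stripped and empty items are dropped
let result = parser.invoke("apple, , banana,  ".to_string(), &config).await?;
assert_eq!(result, vec!["apple", "banana"]);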
BooleanOutputParser
Parses yes/no, true/false, y/n, and 1/0 style responses into a bool. Case-insensitive and whitespace-trimmed.
Signature: Runnable<String, bool>
use synaptic::parsers::BooleanOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = BooleanOutputParser;
let config = RunnableConfig::default();
assert_eq!(parser.invoke("Yes".to_string(), &config).await?, true);
assert_eq!(parser.invoke("false".to_string(), &config).await?, false);
assert_eq!(parser.invoke("1".to_string(), &config).await?, true);
assert_eq!(parser.invoke("N".to_string(), &config).await?, false);
Unrecognized values return Err(SynapticError::Parsing(...)).
XmlOutputParser
Parses XML-formatted LLM output into an XmlElement tree. Supports nested elements, attributes, and text content without requiring a full XML library.
Signature: Runnable<String, XmlElement>
use synaptic::parsers::{XmlOutputParser, XmlElement};
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let config = RunnableConfig::default();
// Parse with a root tag filter
let parser = XmlOutputParser::with_root_tag("answer");
let result = parser.invoke(
"Here is my answer: <answer><item>hello</item></answer>".to_string(),
&config,
).await?;
assert_eq!(result.tag, "answer");
assert_eq!(result.children[0].tag, "item");
assert_eq!(result.children[0].text, Some("hello".to_string()));
Use XmlOutputParser::new() to parse the entire input as XML, or with_root_tag("tag") to extract content from within a specific root tag.
MarkdownListOutputParser
Parses markdown-formatted bullet lists (- item or * item) into a Vec<String>. Lines not starting with a bullet marker are ignored.
Signature: Runnable<String, Vec<String>>
use synaptic::parsers::MarkdownListOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = MarkdownListOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(
"Here are the items:\n- Apple\n- Banana\n* Cherry\nNot a list item".to_string(),
&config,
).await?;
assert_eq!(result, vec!["Apple", "Banana", "Cherry"]);
NumberedListOutputParser
Parses numbered lists (1. item, 2. item) into a Vec<String>. The number prefix is stripped; only lines matching the N. text pattern are included.
Signature: Runnable<String, Vec<String>>
use synaptic::parsers::NumberedListOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = NumberedListOutputParser;
let config = RunnableConfig::default();
let result = parser.invoke(
"Top 3 languages:\n1. Rust\n2. Python\n3. TypeScript".to_string(),
&config,
).await?;
assert_eq!(result, vec!["Rust", "Python", "TypeScript"]);
Format Instructions
All parsers implement the FormatInstructions trait. You can include the instructions in your prompt to guide the model:
use synaptic::parsers::{JsonOutputParser, ListOutputParser, FormatInstructions};
let json_parser = JsonOutputParser;
println!("{}", json_parser.get_format_instructions());
// "Your response should be a valid JSON object."
let list_parser = ListOutputParser::comma();
println!("{}", list_parser.get_format_instructions());
// "Your response should be a list of items separated by commas."
Pipeline Example
A typical chain pipes a prompt template through a model and into a parser:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::{ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::StrOutputParser;
use synaptic::runnables::Runnable;
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The answer is 42."),
usage: None,
},
]);
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system("You are a helpful assistant."),
MessageTemplate::human("{{ question }}"),
]);
// template -> model -> parser
let chain = template.boxed() | model.boxed() | StrOutputParser.boxed();
let config = RunnableConfig::default();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("question".to_string(), json!("What is the meaning of life?")),
]);
let result: String = chain.invoke(values, &config).await?;
assert_eq!(result, "The answer is 42.");
Structured Parser
StructuredOutputParser<T> deserializes a JSON string directly into a typed Rust struct. This is the preferred parser when you know the exact shape of the data you expect from the LLM.
Basic Usage
Define a struct that derives Deserialize, then create a parser for it:
use synaptic::parsers::StructuredOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
use serde::Deserialize;
#[derive(Deserialize)]
struct Person {
name: String,
age: u32,
}
let parser = StructuredOutputParser::<Person>::new();
let config = RunnableConfig::default();
let result = parser.invoke(
r#"{"name": "Alice", "age": 30}"#.to_string(),
&config,
).await?;
assert_eq!(result.name, "Alice");
assert_eq!(result.age, 30);
Signature: Runnable<String, T> where T: DeserializeOwned + Send + Sync + 'static
Error Handling
If the input string is not valid JSON or does not match the struct's schema, the parser returns Err(SynapticError::Parsing(...)):
use synaptic::parsers::StructuredOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
use serde::Deserialize;
#[derive(Deserialize)]
struct Config {
enabled: bool,
threshold: f64,
}
let parser = StructuredOutputParser::<Config>::new();
let config = RunnableConfig::default();
// Missing required field -- returns an error
let err = parser.invoke(
r#"{"enabled": true}"#.to_string(),
&config,
).await.unwrap_err();
assert!(err.to_string().contains("structured parse error"));
Format Instructions
StructuredOutputParser<T> implements the FormatInstructions trait. Include the instructions in your prompt to guide the model toward producing correctly-shaped JSON:
use synaptic::parsers::{StructuredOutputParser, FormatInstructions};
use serde::Deserialize;
#[derive(Deserialize)]
struct Answer {
reasoning: String,
answer: String,
}
let parser = StructuredOutputParser::<Answer>::new();
let instructions = parser.get_format_instructions();
// "Your response should be a valid JSON object matching the expected schema."
Pipeline Example
In a chain, StructuredOutputParser typically follows a StrOutputParser step or receives the string content directly. Here is a complete example:
use synaptic::parsers::StructuredOutputParser;
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::{Message, RunnableConfig};
use serde::Deserialize;
#[derive(Debug, Deserialize)]
struct Sentiment {
label: String,
confidence: f64,
}
// Simulate an LLM that returns JSON in a Message
let extract_content = RunnableLambda::new(|msg: Message| async move {
Ok(msg.content().to_string())
});
let parser = StructuredOutputParser::<Sentiment>::new();
let chain = extract_content.boxed() | parser.boxed();
let config = RunnableConfig::default();
let input = Message::ai(r#"{"label": "positive", "confidence": 0.95}"#);
let result: Sentiment = chain.invoke(input, &config).await?;
assert_eq!(result.label, "positive");
assert!((result.confidence - 0.95).abs() < f64::EPSILON);
When to Use Structured vs. JSON Parser
- Use StructuredOutputParser<T> when you know the exact schema at compile time and want type-safe access to fields.
- Use JsonOutputParser when you need to work with arbitrary or dynamic JSON structures where the shape is not known in advance.
Enum Parser
EnumOutputParser validates that the LLM's output matches one of a predefined set of allowed values. This is useful for classification tasks where the model should respond with exactly one of several categories.
Basic Usage
Create the parser with a list of allowed values, then invoke it:
use synaptic::parsers::EnumOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let config = RunnableConfig::default();
// Valid value -- returns Ok
let result = parser.invoke("positive".to_string(), &config).await?;
assert_eq!(result, "positive");
Signature: Runnable<String, String>
Validation
The parser trims whitespace from the input before checking. If the trimmed input does not match any allowed value, it returns Err(SynapticError::Parsing(...)):
use synaptic::parsers::EnumOutputParser;
use synaptic::runnables::Runnable;
use synaptic::core::RunnableConfig;
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let config = RunnableConfig::default();
// Whitespace is trimmed -- this succeeds
let result = parser.invoke(" neutral ".to_string(), &config).await?;
assert_eq!(result, "neutral");
// Invalid value -- returns an error
let err = parser.invoke("invalid".to_string(), &config).await.unwrap_err();
assert!(err.to_string().contains("expected one of"));
Format Instructions
EnumOutputParser implements FormatInstructions. Include the instructions in your prompt so the model knows which values to choose from:
use synaptic::parsers::{EnumOutputParser, FormatInstructions};
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let instructions = parser.get_format_instructions();
// "Your response should be one of the following values: positive, negative, neutral"
Pipeline Example
A typical classification pipeline combines a prompt, a model, a content extractor, and the enum parser:
use std::collections::HashMap;
use serde_json::json;
use synaptic::core::{ChatResponse, Message, RunnableConfig};
use synaptic::models::ScriptedChatModel;
use synaptic::prompts::{ChatPromptTemplate, MessageTemplate};
use synaptic::parsers::{StrOutputParser, EnumOutputParser, FormatInstructions};
use synaptic::runnables::Runnable;
let parser = EnumOutputParser::new(vec![
"positive".to_string(),
"negative".to_string(),
"neutral".to_string(),
]);
let model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("positive"),
usage: None,
},
]);
let template = ChatPromptTemplate::from_messages(vec![
MessageTemplate::system(
&format!(
"Classify the sentiment of the text. {}",
parser.get_format_instructions()
),
),
MessageTemplate::human("{{ text }}"),
]);
// template -> model -> extract content -> validate enum
let chain = template.boxed()
| model.boxed()
| StrOutputParser.boxed()
| parser.boxed();
let config = RunnableConfig::default();
let values: HashMap<String, serde_json::Value> = HashMap::from([
("text".to_string(), json!("I love this product!")),
]);
let result: String = chain.invoke(values, &config).await?;
assert_eq!(result, "positive");
Runnables (LCEL)
Synaptic implements LCEL (LangChain Expression Language) through the Runnable trait and a set of composable building blocks. Every component in an LCEL chain -- prompts, models, parsers, custom logic -- implements the same Runnable<I, O> interface, so they can be combined freely with a uniform API.
The Runnable trait
The Runnable<I, O> trait is defined in synaptic_core and provides three core methods:
| Method | Description |
|---|---|
invoke(input, config) | Execute on a single input, returning one output |
batch(inputs, config) | Execute on multiple inputs sequentially |
stream(input, config) | Return a RunnableOutputStream of incremental results |
Every Runnable also has a boxed() method that wraps it into a BoxRunnable<I, O> -- a type-erased container that enables the | pipe operator for composition.
use synaptic::runnables::{Runnable, RunnableLambda, BoxRunnable};
use synaptic::core::RunnableConfig;
let step = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
let config = RunnableConfig::default();
let result = step.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "HELLO");
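batch() follows the same pattern. A minimal sketch, assuming it accepts a Vec of inputs and returns the outputs in the same order:
let results = step.batch(
    vec!["hello".to_string(), "world".to_string()],
    &config,
).await?;
assert_eq!(results, vec!["HELLO".to_string(), "WORLD".to_string()]);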
BoxRunnable -- type-erased composition
BoxRunnable<I, O> is the key type for building chains. It wraps any Runnable<I, O> behind a trait object, erasing the concrete type. This is what makes the | operator possible: the operator is implemented on BoxRunnable, so both sides of a pipe must be boxed into that common wrapper even though their underlying implementations differ.
BoxRunnable itself implements Runnable<I, O>, so boxed runnables compose seamlessly.
Building blocks
Synaptic provides the following LCEL building blocks:
| Type | Purpose |
|---|---|
RunnableLambda | Wraps an async closure as a runnable |
RunnablePassthrough | Passes input through unchanged |
RunnableSequence | Chains two runnables (created by | operator) |
RunnableParallel | Runs named branches concurrently, merges to JSON |
RunnableBranch | Routes input by condition, with a default fallback |
RunnableAssign | Merges parallel branch results into the input JSON object |
RunnablePick | Extracts specific keys from a JSON object |
RunnableWithFallbacks | Tries alternatives when the primary runnable fails |
RunnableRetry | Retries with exponential backoff on failure |
RunnableEach | Maps a runnable over each element in a Vec |
RunnableGenerator | Wraps a generator function for true streaming output |
Tip: For standalone async functions, you can also use the #[chain] macro to generate a BoxRunnable factory. This avoids writing RunnableLambda::new(|x| async { ... }).boxed() by hand. See Procedural Macros.
Guides
- Pipe Operator -- chain runnables with | to build sequential pipelines
- Streaming -- consume incremental output through a chain
- Parallel & Branch -- run branches concurrently or route by condition
- Assign & Pick -- merge computed keys into JSON and extract specific fields
- Fallbacks -- provide alternative runnables when the primary one fails
- Bind -- attach config transforms to a runnable
- Retry -- retry with exponential backoff on transient failures
- Generator -- wrap a streaming generator function as a runnable
- Each -- map a runnable over each element in a list
Pipe Operator
This guide shows how to chain runnables together using the | pipe operator to build sequential processing pipelines.
Overview
The | operator on BoxRunnable creates a RunnableSequence that feeds the output of the first runnable into the input of the second. This is the primary way to build LCEL chains in Synaptic.
The pipe operator is implemented via Rust's BitOr trait on BoxRunnable. Both sides must be boxed first with .boxed(), because the operator needs type-erased wrappers to connect runnables with different concrete types.
Basic chaining
use synaptic::runnables::{Runnable, RunnableLambda, BoxRunnable};
use synaptic::core::RunnableConfig;
let step1 = RunnableLambda::new(|x: String| async move {
Ok(format!("Step 1: {x}"))
});
let step2 = RunnableLambda::new(|x: String| async move {
Ok(format!("{x} -> Step 2"))
});
// Pipe operator creates a RunnableSequence
let chain = step1.boxed() | step2.boxed();
let config = RunnableConfig::default();
let result = chain.invoke("input".to_string(), &config).await?;
assert_eq!(result, "Step 1: input -> Step 2");
The types must be compatible: the output type of step1 must match the input type of step2. In this example both work with String, so the types line up. The compiler will reject chains where the types do not match.
Multi-step chains
You can chain more than two steps by continuing to pipe. The result is still a single BoxRunnable:
let step3 = RunnableLambda::new(|x: String| async move {
Ok(format!("{x} -> Step 3"))
});
let chain = step1.boxed() | step2.boxed() | step3.boxed();
let result = chain.invoke("start".to_string(), &config).await?;
assert_eq!(result, "Step 1: start -> Step 2 -> Step 3");
Each | wraps the left side into a new RunnableSequence, so a | b | c produces a RunnableSequence(RunnableSequence(a, b), c). This nesting is transparent -- you interact with the result as a single BoxRunnable<I, O>.
Type conversions across steps
Steps can change the type flowing through the chain, as long as each step's output matches the next step's input:
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
// String -> usize -> String
let count_chars = RunnableLambda::new(|s: String| async move {
Ok(s.len())
});
let format_count = RunnableLambda::new(|n: usize| async move {
Ok(format!("Length: {n}"))
});
let chain = count_chars.boxed() | format_count.boxed();
let config = RunnableConfig::default();
let result = chain.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "Length: 5");
Why boxed() is required
Rust's type system needs to know the exact types at compile time. Without boxed(), each RunnableLambda has a unique closure type that cannot appear on both sides of |. Calling .boxed() erases the concrete type into BoxRunnable<I, O>, which is a trait object that can compose with any other BoxRunnable as long as the input/output types align.
BoxRunnable::new(runnable) is equivalent to runnable.boxed() -- use whichever reads better in context.
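For illustration, both forms below produce the same type-erased wrapper (a trivial sketch):
use synaptic::runnables::{BoxRunnable, RunnableLambda};
// Method form
let via_method = RunnableLambda::new(|x: String| async move { Ok(x.len()) }).boxed();
// Constructor form
let via_constructor = BoxRunnable::new(RunnableLambda::new(|x: String| async move { Ok(x.len()) }));
// Both are BoxRunnable<String, usize> and compose identically with |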
Using RunnablePassthrough
RunnablePassthrough is a no-op runnable that passes its input through unchanged. It is useful when you need an identity step in a chain -- for example, as one branch in a RunnableParallel:
use synaptic::runnables::{Runnable, RunnablePassthrough};
let passthrough = RunnablePassthrough;
let result = passthrough.invoke("unchanged".to_string(), &config).await?;
assert_eq!(result, "unchanged");
Error propagation
If any step in the chain returns an Err, the chain short-circuits immediately and returns that error. Subsequent steps are not executed:
use synaptic::core::SynapticError;
let failing = RunnableLambda::new(|_x: String| async move {
Err::<String, _>(SynapticError::Validation("something went wrong".into()))
});
let after = RunnableLambda::new(|x: String| async move {
Ok(format!("This won't run: {x}"))
});
let chain = failing.boxed() | after.boxed();
let result = chain.invoke("test".to_string(), &config).await;
assert!(result.is_err());
Streaming through Chains
This guide shows how to use stream() to consume incremental output from an LCEL chain.
Overview
Every Runnable provides a stream() method that returns a RunnableOutputStream -- a pinned, boxed Stream of Result<O, SynapticError> items. This allows downstream consumers to process results as they become available, rather than waiting for the entire chain to finish.
The default stream() implementation wraps invoke() as a single-item stream. Runnables that support true incremental output (such as LLM model adapters or RunnableGenerator) override stream() to yield items one at a time.
Streaming a single runnable
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
let upper = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
let config = RunnableConfig::default();
let mut stream = upper.stream("hello".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("Got: {value}");
}
// Prints: Got: HELLO
Because RunnableLambda uses the default stream() implementation, this yields exactly one item -- the full result of invoke().
Streaming through a chain
When you stream through a RunnableSequence (created by the | operator), the behavior is:
- The first step runs fully via invoke() and produces its complete output.
- That output is fed into the second step's stream(), which yields items incrementally.
This means only the final component in a chain truly streams. Intermediate steps buffer their output. This matches the LangChain behavior.
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
let step1 = RunnableLambda::new(|x: String| async move {
Ok(format!("processed: {x}"))
});
let step2 = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
let chain = step1.boxed() | step2.boxed();
let config = RunnableConfig::default();
let mut stream = chain.stream("input".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("Got: {value}");
}
// Prints: Got: PROCESSED: INPUT
Streaming with BoxRunnable
BoxRunnable preserves the streaming behavior of the inner runnable. Call .stream() directly on it:
let boxed_chain = step1.boxed() | step2.boxed();
let mut stream = boxed_chain.stream("input".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("{value}");
}
True streaming with RunnableGenerator
RunnableGenerator wraps a generator function that returns a Stream, enabling true incremental output:
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableGenerator};
use synaptic::core::RunnableConfig;
let gen = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_uppercase());
}
}
});
let config = RunnableConfig::default();
let mut stream = gen.stream("hello world foo".to_string(), &config);
while let Some(result) = stream.next().await {
    let chunk = result?;
    println!("Chunk: {chunk}");
}
// Prints each word as a separate chunk:
// Chunk: HELLO
// Chunk: WORLD
// Chunk: FOO
When you call invoke() on a RunnableGenerator, it collects all streamed items into a Vec<O>.
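A minimal sketch, reusing the generator and config from above:
// invoke() drains the stream and returns every chunk at once
let all = gen.invoke("hello world foo".to_string(), &config).await?;
assert_eq!(all, vec!["HELLO", "WORLD", "FOO"]);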
Collecting a stream into a single result
If you need the full result rather than incremental output, use invoke() instead of stream(). Alternatively, collect the stream manually:
use futures::StreamExt;
let mut stream = chain.stream("input".to_string(), &config);
let mut items = Vec::new();
while let Some(result) = stream.next().await {
items.push(result?);
}
// items now contains all yielded values
Error handling in streams
If any step in a chain fails during streaming, the stream yields an Err item. Consumers should check each item:
while let Some(result) = stream.next().await {
match result {
Ok(value) => println!("Got: {value}"),
Err(e) => eprintln!("Error: {e}"),
}
}
When the first step of a RunnableSequence fails (during its invoke()), the stream immediately yields that error as the only item.
Parallel & Branch
This guide shows how to run multiple runnables concurrently with RunnableParallel and how to route input to different runnables with RunnableBranch.
RunnableParallel
RunnableParallel runs named branches concurrently on the same input, then merges all outputs into a single serde_json::Value object keyed by branch name.
The input type must implement Clone, because each branch receives its own copy. Every branch must produce a serde_json::Value output.
Basic usage
use serde_json::Value;
use synaptic::runnables::{Runnable, RunnableParallel, RunnableLambda};
use synaptic::core::RunnableConfig;
let parallel = RunnableParallel::new(vec![
(
"upper".to_string(),
RunnableLambda::new(|x: String| async move {
Ok(Value::String(x.to_uppercase()))
}).boxed(),
),
(
"lower".to_string(),
RunnableLambda::new(|x: String| async move {
Ok(Value::String(x.to_lowercase()))
}).boxed(),
),
(
"length".to_string(),
RunnableLambda::new(|x: String| async move {
Ok(Value::Number(x.len().into()))
}).boxed(),
),
]);
let config = RunnableConfig::default();
let result = parallel.invoke("Hello".to_string(), &config).await?;
// result is a JSON object:
// {"upper": "HELLO", "lower": "hello", "length": 5}
assert_eq!(result["upper"], "HELLO");
assert_eq!(result["lower"], "hello");
assert_eq!(result["length"], 5);
Constructor
RunnableParallel::new() takes a Vec<(String, BoxRunnable<I, Value>)> -- a list of (name, runnable) pairs. All branches run concurrently via futures::future::join_all.
In a chain
RunnableParallel implements Runnable<I, Value>, so you can use it in a pipe chain. A common pattern is to fan out processing and then merge the results:
let analyze = RunnableParallel::new(vec![
("summary".to_string(), summarizer.boxed()),
("keywords".to_string(), keyword_extractor.boxed()),
]);
let format_report = RunnableLambda::new(|data: Value| async move {
Ok(format!(
"Summary: {}\nKeywords: {}",
data["summary"], data["keywords"]
))
});
let chain = analyze.boxed() | format_report.boxed();
Error handling
If any branch fails, the entire RunnableParallel invocation returns the first error encountered. Successful branches that completed before the failure are discarded.
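A small sketch of a failing branch, reusing the imports and config from the basic example (the error message is illustrative):
use synaptic::core::SynapticError;
let parallel = RunnableParallel::new(vec![
    (
        "ok".to_string(),
        RunnableLambda::new(|x: String| async move {
            Ok(Value::String(x))
        }).boxed(),
    ),
    (
        "broken".to_string(),
        RunnableLambda::new(|_x: String| async move {
            Err::<Value, _>(SynapticError::Validation("branch failed".into()))
        }).boxed(),
    ),
]);
// The whole invocation fails even though the "ok" branch would have succeeded
let result = parallel.invoke("hello".to_string(), &config).await;
assert!(result.is_err());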
RunnableBranch
RunnableBranch routes input to one of several runnables based on condition functions. It evaluates conditions in order, invoking the runnable associated with the first matching condition. If no conditions match, the default runnable is used.
Basic usage
use synaptic::runnables::{Runnable, RunnableBranch, RunnableLambda, BoxRunnable};
use synaptic::core::RunnableConfig;
let branch = RunnableBranch::new(
vec![
(
Box::new(|x: &String| x.starts_with("hi")) as Box<dyn Fn(&String) -> bool + Send + Sync>,
RunnableLambda::new(|x: String| async move {
Ok(format!("Greeting: {x}"))
}).boxed(),
),
(
Box::new(|x: &String| x.starts_with("bye")),
RunnableLambda::new(|x: String| async move {
Ok(format!("Farewell: {x}"))
}).boxed(),
),
],
// Default: used when no condition matches
RunnableLambda::new(|x: String| async move {
Ok(format!("Other: {x}"))
}).boxed(),
);
let config = RunnableConfig::default();
let r1 = branch.invoke("hi there".to_string(), &config).await?;
assert_eq!(r1, "Greeting: hi there");
let r2 = branch.invoke("bye now".to_string(), &config).await?;
assert_eq!(r2, "Farewell: bye now");
let r3 = branch.invoke("something else".to_string(), &config).await?;
assert_eq!(r3, "Other: something else");
Constructor
RunnableBranch::new() takes two arguments:
- branches: Vec<(BranchCondition<I>, BoxRunnable<I, O>)> -- condition/runnable pairs evaluated in order. The condition type is Box<dyn Fn(&I) -> bool + Send + Sync>.
- default: BoxRunnable<I, O> -- the fallback runnable used when no condition matches.
In a chain
RunnableBranch implements Runnable<I, O>, so it works with the pipe operator:
let preprocess = RunnableLambda::new(|x: String| async move {
Ok(x.trim().to_string())
});
let route = RunnableBranch::new(
vec![/* conditions */],
default_handler.boxed(),
);
let chain = preprocess.boxed() | route.boxed();
When to use each
- Use RunnableParallel when you need to run multiple operations on the same input concurrently and combine all results.
- Use RunnableBranch when you need to select a single processing path based on the input value.
Assign & Pick
This guide shows how to use RunnableAssign to merge computed values into a JSON object and RunnablePick to extract specific keys from one.
RunnableAssign
RunnableAssign takes a JSON object as input, runs named branches in parallel on that object, and merges the branch outputs back into the original object. This is useful for enriching data as it flows through a chain -- you keep the original fields and add new computed ones.
Basic usage
use serde_json::{json, Value};
use synaptic::runnables::{Runnable, RunnableAssign, RunnableLambda};
use synaptic::core::RunnableConfig;
let assign = RunnableAssign::new(vec![
(
"name_upper".to_string(),
RunnableLambda::new(|input: Value| async move {
let name = input["name"].as_str().unwrap_or_default();
Ok(Value::String(name.to_uppercase()))
}).boxed(),
),
(
"greeting".to_string(),
RunnableLambda::new(|input: Value| async move {
let name = input["name"].as_str().unwrap_or_default();
Ok(Value::String(format!("Hello, {name}!")))
}).boxed(),
),
]);
let config = RunnableConfig::default();
let input = json!({"name": "Alice", "age": 30});
let result = assign.invoke(input, &config).await?;
// Original fields are preserved, new fields are merged in
assert_eq!(result["name"], "Alice");
assert_eq!(result["age"], 30);
assert_eq!(result["name_upper"], "ALICE");
assert_eq!(result["greeting"], "Hello, Alice!");
How it works
- The input must be a JSON object (Value::Object). If it is not, RunnableAssign returns a SynapticError::Validation error.
- Each branch receives a clone of the full input object.
- All branches run concurrently via futures::future::join_all.
- Branch outputs are inserted into the original object using the branch name as the key. If a branch name collides with an existing key, the branch output overwrites the original value (see the sketch below).
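A small sketch of the collision case, reusing the imports and config from the basic example above:
// A branch named after an existing key overwrites that key's value
let overwrite = RunnableAssign::new(vec![
    (
        "name".to_string(),
        RunnableLambda::new(|input: Value| async move {
            let name = input["name"].as_str().unwrap_or_default();
            Ok(Value::String(name.to_uppercase()))
        }).boxed(),
    ),
]);
let result = overwrite.invoke(json!({"name": "Alice", "age": 30}), &config).await?;
assert_eq!(result, json!({"name": "ALICE", "age": 30}));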
Constructor
RunnableAssign::new() takes a Vec<(String, BoxRunnable<Value, Value>)> -- named branches that each transform the input into a value to be merged.
Shorthand via RunnablePassthrough
RunnablePassthrough provides a convenience method that creates a RunnableAssign directly:
use synaptic::runnables::{RunnablePassthrough, RunnableLambda};
use serde_json::Value;
let assign = RunnablePassthrough::assign(vec![
(
"processed".to_string(),
RunnableLambda::new(|input: Value| async move {
// compute something from the input
Ok(Value::String("result".to_string()))
}).boxed(),
),
]);
RunnablePick
RunnablePick extracts specified keys from a JSON object, producing a new object containing only those keys. Keys that do not exist in the input are silently omitted from the output.
Basic usage
use serde_json::{json, Value};
use synaptic::runnables::{Runnable, RunnablePick};
use synaptic::core::RunnableConfig;
let pick = RunnablePick::new(vec![
"name".to_string(),
"age".to_string(),
]);
let config = RunnableConfig::default();
let input = json!({
"name": "Alice",
"age": 30,
"email": "alice@example.com",
"internal_id": 42
});
let result = pick.invoke(input, &config).await?;
// Only the picked keys are present
assert_eq!(result, json!({"name": "Alice", "age": 30}));
Error handling
RunnablePick expects a JSON object as input. If the input is not an object (e.g., a string or array), it returns a SynapticError::Validation error.
Missing keys are not an error -- they are simply absent from the output:
let pick = RunnablePick::new(vec!["name".to_string(), "missing_key".to_string()]);
let result = pick.invoke(json!({"name": "Bob"}), &config).await?;
assert_eq!(result, json!({"name": "Bob"}));
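A quick sketch of the non-object case:
// A non-object input is rejected with a SynapticError::Validation error
let pick = RunnablePick::new(vec!["name".to_string()]);
let result = pick.invoke(json!(["not", "an", "object"]), &config).await;
assert!(result.is_err());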
Combining Assign and Pick in a chain
A common pattern is to use RunnableAssign to enrich data, then RunnablePick to select only the fields needed downstream:
use serde_json::{json, Value};
use synaptic::runnables::{Runnable, RunnableAssign, RunnablePick, RunnableLambda};
use synaptic::core::RunnableConfig;
// Step 1: Enrich input with a computed field
let assign = RunnableAssign::new(vec![
(
"full_name".to_string(),
RunnableLambda::new(|input: Value| async move {
let first = input["first"].as_str().unwrap_or_default();
let last = input["last"].as_str().unwrap_or_default();
Ok(Value::String(format!("{first} {last}")))
}).boxed(),
),
]);
// Step 2: Pick only what the next step needs
let pick = RunnablePick::new(vec!["full_name".to_string()]);
let chain = assign.boxed() | pick.boxed();
let config = RunnableConfig::default();
let input = json!({"first": "Jane", "last": "Doe", "internal_id": 99});
let result = chain.invoke(input, &config).await?;
assert_eq!(result, json!({"full_name": "Jane Doe"}));
Fallbacks
This guide shows how to use RunnableWithFallbacks to provide alternative runnables that are tried when the primary one fails.
Overview
RunnableWithFallbacks wraps a primary runnable and a list of fallback runnables. When invoked, it tries the primary first. If the primary returns an error, it tries each fallback in order until one succeeds. If all fail, it returns the error from the last fallback attempted.
This is particularly useful when working with LLM providers that may experience transient outages, or when you want to try a cheaper model first and fall back to a more capable one.
Basic usage
use synaptic::runnables::{Runnable, RunnableWithFallbacks, RunnableLambda};
use synaptic::core::{RunnableConfig, SynapticError};
// A runnable that always fails
let unreliable = RunnableLambda::new(|_x: String| async move {
Err::<String, _>(SynapticError::Provider("service unavailable".into()))
});
// A reliable fallback
let fallback = RunnableLambda::new(|x: String| async move {
Ok(format!("Fallback handled: {x}"))
});
let with_fallbacks = RunnableWithFallbacks::new(
unreliable.boxed(),
vec![fallback.boxed()],
);
let config = RunnableConfig::default();
let result = with_fallbacks.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "Fallback handled: hello");
Multiple fallbacks
You can provide multiple fallbacks. They are tried in order:
let primary = failing_model.boxed();
let fallback_1 = cheaper_model.boxed();
let fallback_2 = local_model.boxed();
let resilient = RunnableWithFallbacks::new(
primary,
vec![fallback_1, fallback_2],
);
// Tries: primary -> fallback_1 -> fallback_2
let result = resilient.invoke(input, &config).await?;
If the primary succeeds, no fallbacks are attempted. If the primary fails but fallback_1 succeeds, fallback_2 is never tried.
Input cloning requirement
The input type must implement Clone, because RunnableWithFallbacks needs to pass a copy of the input to each fallback attempt. This is enforced by the type signature:
pub struct RunnableWithFallbacks<I: Send + Clone + 'static, O: Send + 'static> {
primary: BoxRunnable<I, O>,
fallbacks: Vec<BoxRunnable<I, O>>,
}
String, Vec<Message>, serde_json::Value, and most standard types implement Clone.
Streaming with fallbacks
RunnableWithFallbacks also supports stream(). When streaming, it buffers the primary stream's output. If the primary stream yields an error, it discards the buffered items and tries the next fallback's stream. This means there is no partial output from a failed provider -- you get the complete output from whichever provider succeeds.
use futures::StreamExt;
let resilient = RunnableWithFallbacks::new(primary.boxed(), vec![fallback.boxed()]);
let mut stream = resilient.stream("input".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("Got: {value}");
}
In a chain
RunnableWithFallbacks implements Runnable<I, O>, so it composes with the pipe operator:
let resilient_model = RunnableWithFallbacks::new(
primary_model.boxed(),
vec![fallback_model.boxed()],
);
let chain = preprocess.boxed() | resilient_model.boxed() | postprocess.boxed();
When to use fallbacks vs. retry
- Use RunnableWithFallbacks when you have genuinely different alternatives (e.g., different LLM providers or different strategies).
- Use RunnableRetry when you want to retry the same runnable with exponential backoff (e.g., transient network errors).
You can combine both -- wrap a retrying runnable as the primary, with a different provider as a fallback:
use synaptic::runnables::{RunnableRetry, RetryPolicy, RunnableWithFallbacks};
let retrying_primary = RunnableRetry::new(primary.boxed(), RetryPolicy::default());
let resilient = RunnableWithFallbacks::new(
retrying_primary.boxed(),
vec![fallback.boxed()],
);
Bind
This guide shows how to use BoxRunnable::bind() to attach configuration transforms and listeners to a runnable.
Overview
bind() creates a new BoxRunnable that applies a transformation to the RunnableConfig before each invocation. This is useful for injecting tags, metadata, or other config fields into a runnable without modifying the call site.
Internally, bind() wraps the runnable in a RunnableBind that calls the transform function on the config, then delegates to the inner runnable with the modified config.
Basic usage
use synaptic::runnables::{Runnable, RunnableLambda};
use synaptic::core::RunnableConfig;
let step = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
});
// Bind a config transform that adds a tag
let bound = step.boxed().bind(|mut config| {
config.tags.push("my-tag".to_string());
config
});
let config = RunnableConfig::default();
let result = bound.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "HELLO");
// The inner runnable received a config with tags: ["my-tag"]
The transform function receives the RunnableConfig by value (cloned from the original) and returns the modified config.
Adding metadata
You can use bind() to attach metadata that downstream runnables or callbacks can inspect:
use serde_json::json;
let bound = step.boxed().bind(|mut config| {
config.metadata.insert("source".to_string(), json!("user-query"));
config.metadata.insert("priority".to_string(), json!("high"));
config
});
Setting a fixed config with with_config()
If you want to replace the config entirely rather than modify it, use with_config(). This ignores whatever config is passed at invocation time and uses the provided config instead:
let fixed_config = RunnableConfig {
tags: vec!["production".to_string()],
run_name: Some("fixed-pipeline".to_string()),
..RunnableConfig::default()
};
let bound = step.boxed().with_config(fixed_config);
// Even if a different config is passed to invoke(), the fixed config is used
let any_config = RunnableConfig::default();
let result = bound.invoke("hello".to_string(), &any_config).await?;
Streaming with bind
bind() also applies the config transform during stream() calls, not just invoke():
use futures::StreamExt;
let bound = step.boxed().bind(|mut config| {
config.tags.push("streaming".to_string());
config
});
let mut stream = bound.stream("hello".to_string(), &config);
while let Some(result) = stream.next().await {
let value = result?;
println!("{value}");
}
Attaching listeners with with_listeners()
with_listeners() wraps a runnable with before/after callbacks that fire on each invocation. The callbacks receive a reference to the RunnableConfig:
let with_logging = step.boxed().with_listeners(
|config| {
println!("Starting run: {:?}", config.run_name);
},
|config| {
println!("Finished run: {:?}", config.run_name);
},
);
let result = with_logging.invoke("hello".to_string(), &config).await?;
// Prints: Starting run: None
// Prints: Finished run: None
Listeners also fire around stream() calls -- on_start fires before the first item is yielded, and on_end fires after the stream completes.
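For illustration, here is a minimal sketch combining with_listeners() with stream(), reusing the step and config values from the examples above:

```rust
use futures::StreamExt;

// Sketch: on_start fires before the first streamed item, on_end after the stream ends.
let observed = step.boxed().with_listeners(
    |config| println!("Stream starting: {:?}", config.run_name),
    |config| println!("Stream finished: {:?}", config.run_name),
);

let mut stream = observed.stream("hello".to_string(), &config);
while let Some(result) = stream.next().await {
    let value = result?;
    println!("{value}");
}
```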
Composing with bind in a chain
bind() returns a BoxRunnable, so you can chain it with the pipe operator:
let tagged_step = step.boxed().bind(|mut config| {
config.tags.push("step-1".to_string());
config
});
let chain = tagged_step | next_step.boxed();
let result = chain.invoke("input".to_string(), &config).await?;
RunnableConfig fields reference
The RunnableConfig struct has the following fields that you can modify via bind():
| Field | Type | Description |
|---|---|---|
tags | Vec<String> | Tags for filtering and categorization |
metadata | HashMap<String, Value> | Arbitrary key-value metadata |
max_concurrency | Option<usize> | Concurrency limit for batch operations |
recursion_limit | Option<usize> | Maximum recursion depth for chains |
run_id | Option<String> | Unique identifier for the current run |
run_name | Option<String> | Human-readable name for the current run |
Retry
This guide shows how to use RunnableRetry with RetryPolicy to automatically retry a runnable on failure with exponential backoff.
Overview
RunnableRetry wraps any runnable with retry logic. When the inner runnable returns an error, RunnableRetry waits for a backoff delay and tries again, up to a configurable maximum number of attempts. The backoff follows an exponential schedule: min(base_delay * 2^attempt, max_delay).
Basic usage
use std::time::Duration;
use synaptic::runnables::{Runnable, RunnableRetry, RetryPolicy, RunnableLambda};
use synaptic::core::RunnableConfig;
let flaky_step = RunnableLambda::new(|x: String| async move {
// Imagine this sometimes fails due to network issues
Ok(x.to_uppercase())
});
let policy = RetryPolicy::default(); // 3 attempts, 100ms base delay, 10s max delay
let with_retry = RunnableRetry::new(flaky_step.boxed(), policy);
let config = RunnableConfig::default();
let result = with_retry.invoke("hello".to_string(), &config).await?;
assert_eq!(result, "HELLO");
Configuring the retry policy
RetryPolicy uses a builder pattern for configuration:
use std::time::Duration;
use synaptic::runnables::RetryPolicy;
let policy = RetryPolicy::default()
.with_max_attempts(5) // Up to 5 total attempts (1 initial + 4 retries)
.with_base_delay(Duration::from_millis(200)) // Start with 200ms delay
.with_max_delay(Duration::from_secs(30)); // Cap delay at 30 seconds
Default values
| Field | Default |
|---|---|
max_attempts | 3 |
base_delay | 100ms |
max_delay | 10 seconds |
Backoff schedule
The delay for each retry attempt is calculated as:
delay = min(base_delay * 2^attempt, max_delay)
For the defaults (100ms base, 10s max):
| Attempt | Delay |
|---|---|
| 1st retry (attempt 0) | 100ms |
| 2nd retry (attempt 1) | 200ms |
| 3rd retry (attempt 2) | 400ms |
| 4th retry (attempt 3) | 800ms |
| ... | ... |
| Capped at | 10s |
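To make the schedule concrete, here is a small standalone sketch of the documented formula (illustrative only -- not Synaptic's internal code):

```rust
use std::time::Duration;

// Sketch of the documented backoff formula: min(base_delay * 2^attempt, max_delay).
fn backoff_delay(attempt: u32, base_delay: Duration, max_delay: Duration) -> Duration {
    let scaled = base_delay.saturating_mul(2u32.saturating_pow(attempt));
    scaled.min(max_delay)
}

// With the defaults (100ms base, 10s max):
assert_eq!(backoff_delay(0, Duration::from_millis(100), Duration::from_secs(10)), Duration::from_millis(100));
assert_eq!(backoff_delay(3, Duration::from_millis(100), Duration::from_secs(10)), Duration::from_millis(800));
assert_eq!(backoff_delay(8, Duration::from_millis(100), Duration::from_secs(10)), Duration::from_secs(10));
```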
Filtering retryable errors
By default, all errors trigger a retry. Use with_retry_on() to specify a predicate that decides which errors are worth retrying:
use synaptic::runnables::RetryPolicy;
use synaptic::core::SynapticError;
let policy = RetryPolicy::default()
.with_max_attempts(4)
.with_retry_on(|error: &SynapticError| {
// Only retry provider errors (e.g., rate limits, timeouts)
matches!(error, SynapticError::Provider(_))
});
When the predicate returns false for an error, RunnableRetry immediately returns that error without further retries.
Input cloning requirement
The input type must implement Clone, because the input is reused for each retry attempt:
pub struct RunnableRetry<I: Send + Clone + 'static, O: Send + 'static> { ... }
In a chain
RunnableRetry implements Runnable<I, O>, so it works with the pipe operator:
use synaptic::runnables::{Runnable, RunnableRetry, RetryPolicy, RunnableLambda};
let preprocess = RunnableLambda::new(|x: String| async move {
Ok(x.trim().to_string())
});
let retrying_model = RunnableRetry::new(
model_step.boxed(),
RetryPolicy::default().with_max_attempts(3),
);
let chain = preprocess.boxed() | retrying_model.boxed();
Combining retry with fallbacks
For maximum resilience, wrap a retrying runnable with fallbacks. The primary is retried up to its limit; if it still fails, the fallback is tried:
use synaptic::runnables::{RunnableRetry, RetryPolicy, RunnableWithFallbacks};
let retrying_primary = RunnableRetry::new(
primary_model.boxed(),
RetryPolicy::default().with_max_attempts(3),
);
let resilient = RunnableWithFallbacks::new(
retrying_primary.boxed(),
vec![fallback_model.boxed()],
);
Full example
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;
use synaptic::runnables::{Runnable, RunnableRetry, RetryPolicy, RunnableLambda};
use synaptic::core::{RunnableConfig, SynapticError};
// Simulate a flaky service that fails twice then succeeds
let call_count = Arc::new(AtomicUsize::new(0));
let counter = call_count.clone();
let flaky = RunnableLambda::new(move |x: String| {
let counter = counter.clone();
async move {
let n = counter.fetch_add(1, Ordering::SeqCst);
if n < 2 {
Err(SynapticError::Provider("temporary failure".into()))
} else {
Ok(format!("Success: {x}"))
}
}
});
let policy = RetryPolicy::default()
.with_max_attempts(5)
.with_base_delay(Duration::from_millis(10));
let retrying = RunnableRetry::new(flaky.boxed(), policy);
let config = RunnableConfig::default();
let result = retrying.invoke("test".to_string(), &config).await?;
assert_eq!(result, "Success: test");
assert_eq!(call_count.load(Ordering::SeqCst), 3); // 2 failures + 1 success
Generator
This guide shows how to use RunnableGenerator to create a runnable from a streaming generator function.
Overview
RunnableGenerator wraps a function that produces a Stream of results. It bridges the gap between streaming generators and the Runnable trait:
- invoke() collects the entire stream into a Vec<O>
- stream() yields each item individually as it is produced
This is useful when you want a runnable that naturally produces output incrementally -- for example, tokenizers, chunkers, or any computation that yields partial results.
Basic usage
use synaptic::runnables::{Runnable, RunnableGenerator};
use synaptic::core::{RunnableConfig, SynapticError};
let gen = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_uppercase());
}
}
});
let config = RunnableConfig::default();
let result = gen.invoke("hello world".to_string(), &config).await?;
assert_eq!(result, vec!["HELLO", "WORLD"]);
Streaming
The real power of RunnableGenerator is streaming. stream() yields each item as it is produced, without waiting for the generator to finish:
use futures::StreamExt;
use synaptic::runnables::{Runnable, RunnableGenerator};
use synaptic::core::RunnableConfig;
let gen = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for ch in input.chars() {
yield Ok(ch.to_string());
}
}
});
let config = RunnableConfig::default();
let mut stream = gen.stream("abc".to_string(), &config);
// Each item arrives individually wrapped in a Vec
while let Some(item) = stream.next().await {
let chunk = item?;
println!("{:?}", chunk); // ["a"], ["b"], ["c"]
}
Each streamed item is wrapped in Vec<O> to match the output type of invoke(). This means stream() yields Result<Vec<O>, SynapticError> where each Vec contains a single element.
Error handling
If the generator yields an Err, invoke() stops collecting and returns that error. stream() yields the error and continues to the next item:
use synaptic::runnables::RunnableGenerator;
use synaptic::core::SynapticError;
let gen = RunnableGenerator::new(|_input: String| {
async_stream::stream! {
yield Ok("first".to_string());
yield Err(SynapticError::Other("oops".into()));
yield Ok("third".to_string());
}
});
// invoke() fails on the error:
// gen.invoke("x".to_string(), &config).await => Err(...)
// stream() yields all three items:
// Ok(["first"]), Err(...), Ok(["third"])
In a pipeline
RunnableGenerator implements Runnable<I, Vec<O>>, so it works with the pipe operator. Place it wherever you need streaming generation in a chain:
use synaptic::runnables::{Runnable, RunnableGenerator, RunnableLambda};
let tokenize = RunnableGenerator::new(|input: String| {
async_stream::stream! {
for token in input.split_whitespace() {
yield Ok(token.to_string());
}
}
});
let count = RunnableLambda::new(|tokens: Vec<String>| async move {
Ok(tokens.len())
});
let chain = tokenize.boxed() | count.boxed();
// chain.invoke("one two three".to_string(), &config).await => Ok(3)
Type signature
pub struct RunnableGenerator<I: Send + 'static, O: Send + 'static> { ... }
impl<I, O> Runnable<I, Vec<O>> for RunnableGenerator<I, O> { ... }
The constructor accepts any function Fn(I) -> S where S: Stream<Item = Result<O, SynapticError>> + Send + 'static. The async_stream::stream! macro is the most ergonomic way to produce such a stream.
Each
This guide shows how to use RunnableEach to map a runnable over each element in a list.
Overview
RunnableEach wraps any BoxRunnable<I, O> and applies it to every element in a Vec<I>, producing a Vec<O>. It is the runnable equivalent of Iterator::map() -- process a batch of items through the same transformation.
Basic usage
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
use synaptic::core::RunnableConfig;
let upper = RunnableLambda::new(|s: String| async move {
Ok(s.to_uppercase())
});
let each = RunnableEach::new(upper.boxed());
let config = RunnableConfig::default();
let result = each.invoke(
vec!["hello".into(), "world".into()],
&config,
).await?;
assert_eq!(result, vec!["HELLO", "WORLD"]);
Error propagation
If the inner runnable fails on any element, RunnableEach stops and returns that error immediately. Elements processed before the failure are discarded:
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
use synaptic::core::{RunnableConfig, SynapticError};
let must_be_short = RunnableLambda::new(|s: String| async move {
if s.len() > 5 {
Err(SynapticError::Other(format!("too long: {s}")))
} else {
Ok(s.to_uppercase())
}
});
let each = RunnableEach::new(must_be_short.boxed());
let config = RunnableConfig::default();
let result = each.invoke(
vec!["hi".into(), "toolong".into(), "ok".into()],
&config,
).await;
assert!(result.is_err()); // fails on "toolong"
Empty input
An empty input vector produces an empty output vector:
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
use synaptic::core::RunnableConfig;
let identity = RunnableLambda::new(|s: String| async move { Ok(s) });
let each = RunnableEach::new(identity.boxed());
let config = RunnableConfig::default();
let result = each.invoke(vec![], &config).await?;
assert!(result.is_empty());
In a pipeline
RunnableEach implements Runnable<Vec<I>, Vec<O>>, so it composes with the pipe operator. A common pattern is to split input into parts, process each with RunnableEach, and then combine the results:
use synaptic::runnables::{Runnable, RunnableEach, RunnableLambda};
// Step 1: split a string into words
let split = RunnableLambda::new(|s: String| async move {
Ok(s.split_whitespace().map(String::from).collect::<Vec<_>>())
});
// Step 2: process each word
let process = RunnableEach::new(
RunnableLambda::new(|w: String| async move {
Ok(w.to_uppercase())
}).boxed()
);
// Step 3: join results
let join = RunnableLambda::new(|words: Vec<String>| async move {
Ok(words.join(", "))
});
let chain = split.boxed() | process.boxed() | join.boxed();
// chain.invoke("hello world".to_string(), &config).await => Ok("HELLO, WORLD")
Type signature
pub struct RunnableEach<I: Send + 'static, O: Send + 'static> {
inner: BoxRunnable<I, O>,
}
impl<I, O> Runnable<Vec<I>, Vec<O>> for RunnableEach<I, O> { ... }
Elements are processed sequentially in order. For concurrent processing, use RunnableParallel or the batch() method on a BoxRunnable instead.
Retrieval
Synaptic provides a complete Retrieval-Augmented Generation (RAG) pipeline. The pipeline follows five stages:
- Load -- ingest raw data from files, JSON, CSV, web URLs, or entire directories.
- Split -- break large documents into smaller chunks that fit within context windows.
- Embed -- convert text chunks into numerical vectors using an embedding model.
- Store -- persist embeddings in a vector store for efficient similarity search.
- Retrieve -- find the most relevant documents for a given query.
Key types
| Type | Crate | Purpose |
|---|---|---|
Document | synaptic_retrieval | A unit of text with id, content, and metadata: HashMap<String, Value> |
Loader trait | synaptic_loaders | Async trait for loading documents from various sources |
TextSplitter trait | synaptic_splitters | Splits text into chunks with optional overlap |
Embeddings trait | synaptic_embeddings | Converts text into vector representations |
VectorStore trait | synaptic_vectorstores | Stores and searches document embeddings |
Retriever trait | synaptic_retrieval | Retrieves relevant documents given a query string |
Retrievers
Synaptic ships with seven retriever implementations, each suited to different use cases:
| Retriever | Strategy |
|---|---|
VectorStoreRetriever | Wraps any VectorStore for cosine similarity search |
BM25Retriever | Okapi BM25 keyword scoring -- no embeddings required |
MultiQueryRetriever | Uses an LLM to generate query variants, retrieves for each, deduplicates |
EnsembleRetriever | Combines multiple retrievers via Reciprocal Rank Fusion |
ContextualCompressionRetriever | Post-filters retrieved documents using a DocumentCompressor |
SelfQueryRetriever | Uses an LLM to extract structured metadata filters from natural language |
ParentDocumentRetriever | Searches small child chunks but returns full parent documents |
Guides
- Document Loaders -- load data from text, JSON, CSV, files, directories, and the web
- Text Splitters -- break documents into chunks with character, recursive, markdown, or token-based strategies
- Embeddings -- embed text using OpenAI, Ollama, or deterministic fake embeddings
- Vector Stores -- store and search embeddings with InMemoryVectorStore
- BM25 Retriever -- keyword-based retrieval with Okapi BM25 scoring
- Multi-Query Retriever -- improve recall by generating multiple query perspectives
- Ensemble Retriever -- combine retrievers with Reciprocal Rank Fusion
- Contextual Compression -- post-filter results with embedding similarity thresholds
- Self-Query Retriever -- LLM-powered metadata filtering from natural language
- Parent Document Retriever -- search small chunks, return full parent documents
Document Loaders
This guide shows how to load documents from various sources using Synaptic's Loader trait and its built-in implementations.
Overview
Every loader implements the Loader trait from synaptic_loaders:
#[async_trait]
pub trait Loader: Send + Sync {
async fn load(&self) -> Result<Vec<Document>, SynapticError>;
}
Each loader returns Vec<Document>. A Document has three fields:
- id: String -- a unique identifier
- content: String -- the document text
- metadata: HashMap<String, Value> -- arbitrary key-value metadata
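As a quick sketch of that shape, using the Document constructors that appear later in this section:

```rust
use std::collections::HashMap;
use serde_json::json;
use synaptic::retrieval::Document;

// Construct a document directly; loaders produce the same shape.
let doc = Document::with_metadata(
    "doc-1",
    "Rust is a systems programming language.",
    HashMap::from([("source".to_string(), json!("inline"))]),
);
assert_eq!(doc.id, "doc-1");
assert_eq!(doc.content, "Rust is a systems programming language.");
assert_eq!(doc.metadata["source"], "inline");
```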
TextLoader
Wraps a string of text into a single Document. Useful when you already have content in memory.
use synaptic::loaders::{TextLoader, Loader};
let loader = TextLoader::new("doc-1", "Rust is a systems programming language.");
let docs = loader.load().await?;
assert_eq!(docs.len(), 1);
assert_eq!(docs[0].content, "Rust is a systems programming language.");
The first argument is the document ID; the second is the content.
FileLoader
Reads a file from disk using tokio::fs::read_to_string and returns a single Document. The file path is used as the document ID, and a source metadata key is set to the file path.
use synaptic::loaders::{FileLoader, Loader};
let loader = FileLoader::new("data/notes.txt");
let docs = loader.load().await?;
assert_eq!(docs[0].metadata["source"], "data/notes.txt");
JsonLoader
Loads documents from a JSON string. If the JSON is an array of objects, each object becomes a Document. If it is a single object, one Document is produced.
use synaptic::loaders::{JsonLoader, Loader};
let json_data = r#"[
{"id": "1", "content": "First document"},
{"id": "2", "content": "Second document"}
]"#;
let loader = JsonLoader::new(json_data);
let docs = loader.load().await?;
assert_eq!(docs.len(), 2);
assert_eq!(docs[0].content, "First document");
By default, JsonLoader looks for "id" and "content" keys. You can customize them with builder methods:
let loader = JsonLoader::new(json_data)
.with_id_key("doc_id")
.with_content_key("text");
CsvLoader
Loads documents from CSV data. Each row becomes a Document. All columns are stored as metadata.
use synaptic::loaders::{CsvLoader, Loader};
let csv_data = "title,body,author\nIntro,Hello world,Alice\nChapter 1,Once upon a time,Bob";
let loader = CsvLoader::new(csv_data)
.with_content_column("body")
.with_id_column("title");
let docs = loader.load().await?;
assert_eq!(docs.len(), 2);
assert_eq!(docs[0].id, "Intro");
assert_eq!(docs[0].content, "Hello world");
assert_eq!(docs[0].metadata["author"], "Alice");
If no content_column is specified, all columns are concatenated. If no id_column is specified, IDs default to "row-0", "row-1", etc.
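A minimal sketch of those defaults (the exact concatenated content format is not asserted here):

```rust
use synaptic::loaders::{CsvLoader, Loader};

// No with_content_column / with_id_column: content is the concatenated columns,
// and IDs fall back to the row-based scheme described above.
let csv_data = "title,body\nIntro,Hello world";
let loader = CsvLoader::new(csv_data);
let docs = loader.load().await?;

assert_eq!(docs.len(), 1);
assert_eq!(docs[0].id, "row-0");
```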
DirectoryLoader
Loads all files from a directory, each file becoming a Document. Use with_glob to filter by extension and with_recursive to include subdirectories.
use synaptic::loaders::{DirectoryLoader, Loader};
let loader = DirectoryLoader::new("./docs")
.with_glob("*.txt")
.with_recursive(true);
let docs = loader.load().await?;
// Each document has a `source` metadata key set to the file path
Document IDs are the relative file paths from the base directory.
MarkdownLoader
Reads a markdown file and returns it as a single Document with format: "markdown" in metadata.
use synaptic::loaders::{MarkdownLoader, Loader};
let loader = MarkdownLoader::new("docs/guide.md");
let docs = loader.load().await?;
assert_eq!(docs[0].metadata["format"], "markdown");
WebBaseLoader
Fetches content from a URL via HTTP GET and returns a single Document. Metadata includes source (the URL) and content_type (from the response header).
use synaptic::loaders::{WebBaseLoader, Loader};
let loader = WebBaseLoader::new("https://example.com/page.html");
let docs = loader.load().await?;
assert_eq!(docs[0].metadata["source"], "https://example.com/page.html");
Lazy loading
Every Loader also provides a lazy_load() method that returns a Stream of documents instead of loading all at once. The default implementation wraps load(), but custom loaders can override it for true lazy behavior.
use futures::StreamExt;
use synaptic::loaders::{DirectoryLoader, Loader};
let loader = DirectoryLoader::new("./data").with_glob("*.txt");
let mut stream = loader.lazy_load();
while let Some(result) = stream.next().await {
let doc = result?;
println!("Loaded: {}", doc.id);
}
Text Splitters
This guide shows how to break large documents into smaller chunks using Synaptic's TextSplitter trait and its built-in implementations.
Overview
All splitters implement the TextSplitter trait from synaptic_splitters:
pub trait TextSplitter: Send + Sync {
fn split_text(&self, text: &str) -> Vec<String>;
fn split_documents(&self, docs: Vec<Document>) -> Vec<Document>;
}
- split_text() takes a string and returns a vector of chunks.
- split_documents() splits each document's content, producing new Document values with preserved metadata and an added chunk_index field.
CharacterTextSplitter
Splits text on a single separator string, then merges small pieces to stay under chunk_size.
use synaptic::splitters::CharacterTextSplitter;
use synaptic::splitters::TextSplitter;
// Chunk size in characters, default separator is "\n\n"
let splitter = CharacterTextSplitter::new(500);
let chunks = splitter.split_text("long text...");
Configure the separator and overlap:
let splitter = CharacterTextSplitter::new(500)
.with_separator("\n") // Split on single newlines
.with_chunk_overlap(50); // 50 characters of overlap between chunks
RecursiveCharacterTextSplitter
The most commonly used splitter. Tries a hierarchy of separators in order, splitting with the first one that produces chunks small enough. If a chunk is still too large, it recurses with the next separator.
Default separators: ["\n\n", "\n", " ", ""]
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::splitters::TextSplitter;
let splitter = RecursiveCharacterTextSplitter::new(1000)
.with_chunk_overlap(200);
let chunks = splitter.split_text("long document text...");
Custom separators:
let splitter = RecursiveCharacterTextSplitter::new(1000)
.with_separators(vec![
"\n\n\n".to_string(),
"\n\n".to_string(),
"\n".to_string(),
" ".to_string(),
String::new(),
]);
Language-aware splitting
Use from_language() to get separators tuned for a specific programming language:
use synaptic::splitters::{RecursiveCharacterTextSplitter, Language};
let splitter = RecursiveCharacterTextSplitter::from_language(
Language::Rust,
1000, // chunk_size
200, // chunk_overlap
);
MarkdownHeaderTextSplitter
Splits markdown text by headers, adding the header hierarchy to each chunk's metadata.
use synaptic::splitters::{MarkdownHeaderTextSplitter, HeaderType};
let splitter = MarkdownHeaderTextSplitter::new(vec![
HeaderType { level: "#".to_string(), name: "h1".to_string() },
HeaderType { level: "##".to_string(), name: "h2".to_string() },
HeaderType { level: "###".to_string(), name: "h3".to_string() },
]);
let docs = splitter.split_markdown("# Title\n\nIntro text\n\n## Section\n\nBody text");
// docs[0].metadata contains {"h1": "Title"}
// docs[1].metadata contains {"h1": "Title", "h2": "Section"}
A convenience constructor provides the default #, ##, ### configuration:
let splitter = MarkdownHeaderTextSplitter::default_headers();
Note that MarkdownHeaderTextSplitter also implements TextSplitter, but split_markdown() returns Vec<Document> with full metadata, which is usually what you want.
TokenTextSplitter
Splits text by estimated token count using a ~4 characters per token heuristic. Splits at word boundaries to keep chunks readable.
use synaptic::splitters::TokenTextSplitter;
use synaptic::splitters::TextSplitter;
// chunk_size is in estimated tokens (not characters)
let splitter = TokenTextSplitter::new(500)
.with_chunk_overlap(50);
let chunks = splitter.split_text("long text...");
This is consistent with the token estimation used in ConversationTokenBufferMemory.
HtmlHeaderTextSplitter
Splits HTML text by header tags (<h1>, <h2>, etc.), adding header hierarchy to each chunk's metadata. Similar to MarkdownHeaderTextSplitter but for HTML content.
use synaptic::splitters::HtmlHeaderTextSplitter;
let splitter = HtmlHeaderTextSplitter::new(vec![
("h1".to_string(), "Header 1".to_string()),
("h2".to_string(), "Header 2".to_string()),
]);
let html = "<h1>Title</h1><p>Intro text</p><h2>Section</h2><p>Body text</p>";
let docs = splitter.split_html(html);
// docs[0].metadata contains {"Header 1": "Title"}
// docs[1].metadata contains {"Header 1": "Title", "Header 2": "Section"}
The constructor takes a list of (tag_name, metadata_key) pairs. Only the specified tags are treated as split points; all other HTML content is treated as body text within the current section.
Splitting documents
All splitters can split a Vec<Document> into smaller chunks. Each chunk inherits the parent's metadata and gets a chunk_index field. The chunk ID is formatted as "{original_id}-chunk-{index}".
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
use synaptic::retrieval::Document;
let splitter = RecursiveCharacterTextSplitter::new(500);
let docs = vec![
Document::new("doc-1", "A very long document..."),
Document::new("doc-2", "Another long document..."),
];
let chunks = splitter.split_documents(docs);
// chunks[0].id == "doc-1-chunk-0"
// chunks[0].metadata["chunk_index"] == 0
Choosing a splitter
| Splitter | Best for |
|---|---|
CharacterTextSplitter | Simple splitting on a known delimiter |
RecursiveCharacterTextSplitter | General-purpose text -- tries to preserve paragraphs, then sentences, then words |
MarkdownHeaderTextSplitter | Markdown documents where you want header context in metadata |
TokenTextSplitter | When you need to control chunk size in tokens rather than characters |
Embeddings
This guide shows how to convert text into vector representations using Synaptic's Embeddings trait and its built-in providers.
Overview
All embedding providers implement the Embeddings trait from synaptic_embeddings:
#[async_trait]
pub trait Embeddings: Send + Sync {
async fn embed_documents(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, SynapticError>;
async fn embed_query(&self, text: &str) -> Result<Vec<f32>, SynapticError>;
}
- embed_documents() embeds multiple texts in a single batch -- use this for indexing.
- embed_query() embeds a single query text -- use this at retrieval time.
FakeEmbeddings
Generates deterministic vectors based on a simple hash of the input text. Useful for testing and development without API calls.
use synaptic::embeddings::FakeEmbeddings;
use synaptic::embeddings::Embeddings;
// Specify the number of dimensions (default is 4)
let embeddings = FakeEmbeddings::new(4);
let doc_vectors = embeddings.embed_documents(&["doc one", "doc two"]).await?;
let query_vector = embeddings.embed_query("search query").await?;
// Vectors are normalized to unit length
// Similar texts produce similar vectors
OpenAiEmbeddings
Uses the OpenAI embeddings API. Requires an API key and a ProviderBackend.
use std::sync::Arc;
use synaptic::embeddings::{OpenAiEmbeddings, OpenAiEmbeddingsConfig};
use synaptic::embeddings::Embeddings;
use synaptic::models::backend::HttpBackend;
let config = OpenAiEmbeddingsConfig::new("sk-...")
.with_model("text-embedding-3-small"); // default model
let backend = Arc::new(HttpBackend::new());
let embeddings = OpenAiEmbeddings::new(config, backend);
let vectors = embeddings.embed_documents(&["hello world"]).await?;
You can customize the base URL for compatible APIs:
let config = OpenAiEmbeddingsConfig::new("sk-...")
.with_base_url("https://my-proxy.example.com/v1");
OllamaEmbeddings
Uses a local Ollama instance for embedding. No API key required -- just specify the model name.
use std::sync::Arc;
use synaptic::embeddings::{OllamaEmbeddings, OllamaEmbeddingsConfig};
use synaptic::embeddings::Embeddings;
use synaptic::models::backend::HttpBackend;
let config = OllamaEmbeddingsConfig::new("nomic-embed-text");
// Default base_url: http://localhost:11434
let backend = Arc::new(HttpBackend::new());
let embeddings = OllamaEmbeddings::new(config, backend);
let vector = embeddings.embed_query("search query").await?;
Custom Ollama endpoint:
let config = OllamaEmbeddingsConfig::new("nomic-embed-text")
.with_base_url("http://my-ollama:11434");
CacheBackedEmbeddings
Wraps any Embeddings provider with an in-memory cache. Previously computed embeddings are returned from cache; only uncached texts are sent to the underlying provider.
use std::sync::Arc;
use synaptic::embeddings::{CacheBackedEmbeddings, FakeEmbeddings, Embeddings};
let inner = Arc::new(FakeEmbeddings::new(128));
let cached = CacheBackedEmbeddings::new(inner);
// First call computes the embedding
let v1 = cached.embed_query("hello").await?;
// Second call returns the cached result -- no recomputation
let v2 = cached.embed_query("hello").await?;
assert_eq!(v1, v2);
This is especially useful when adding documents to a vector store and then querying, since the same text may be embedded multiple times across operations.
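A minimal sketch of that pattern, assuming CacheBackedEmbeddings implements the Embeddings trait just like the provider it wraps:

```rust
use std::sync::Arc;
use synaptic::embeddings::{CacheBackedEmbeddings, FakeEmbeddings};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
use synaptic::retrieval::Document;

// Sketch: share one cache between indexing and querying so repeated texts
// are not re-embedded.
let cached = CacheBackedEmbeddings::new(Arc::new(FakeEmbeddings::new(128)));
let store = InMemoryVectorStore::new();

let docs = vec![Document::new("1", "Rust is fast")];
store.add_documents(docs, &cached).await?;              // embeddings computed and cached here
let results = store.similarity_search("Rust is fast", 3, &cached).await?; // served from the cache
```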
Using embeddings with vector stores
Embeddings are passed to vector store methods rather than stored inside the vector store. This lets you swap embedding providers without rebuilding the store.
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Document;
let embeddings = FakeEmbeddings::new(128);
let store = InMemoryVectorStore::new();
let docs = vec![Document::new("1", "Rust is fast")];
store.add_documents(docs, &embeddings).await?;
let results = store.similarity_search("fast language", 5, &embeddings).await?;
Vector Stores
This guide shows how to store and search document embeddings using Synaptic's VectorStore trait and the built-in InMemoryVectorStore.
Overview
The VectorStore trait from synaptic_vectorstores provides methods for adding, searching, and deleting documents:
#[async_trait]
pub trait VectorStore: Send + Sync {
async fn add_documents(
&self, docs: Vec<Document>, embeddings: &dyn Embeddings,
) -> Result<Vec<String>, SynapticError>;
async fn similarity_search(
&self, query: &str, k: usize, embeddings: &dyn Embeddings,
) -> Result<Vec<Document>, SynapticError>;
async fn similarity_search_with_score(
&self, query: &str, k: usize, embeddings: &dyn Embeddings,
) -> Result<Vec<(Document, f32)>, SynapticError>;
async fn similarity_search_by_vector(
&self, embedding: &[f32], k: usize,
) -> Result<Vec<Document>, SynapticError>;
async fn delete(&self, ids: &[&str]) -> Result<(), SynapticError>;
}
The embeddings parameter is passed to each method rather than stored inside the vector store. This design lets you swap embedding providers without rebuilding the store.
InMemoryVectorStore
An in-memory vector store that uses cosine similarity for search. Backed by a RwLock<HashMap>.
Creating a store
use synaptic::vectorstores::InMemoryVectorStore;
let store = InMemoryVectorStore::new();
Adding documents
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Document;
let store = InMemoryVectorStore::new();
let embeddings = FakeEmbeddings::new(128);
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
// ids == ["1", "2", "3"]
Similarity search
Find the k most similar documents to a query:
let results = store.similarity_search("fast systems language", 2, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
Get similarity scores alongside results (higher is more similar):
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Search by vector
Search using a pre-computed embedding vector instead of a text query:
use synaptic::embeddings::Embeddings;
let query_vec = embeddings.embed_query("systems programming").await?;
let results = store.similarity_search_by_vector(&query_vec, 3).await?;
Deleting documents
store.delete(&["1", "3"]).await?;
Convenience constructors
Create a store pre-populated with documents:
use synaptic::vectorstores::InMemoryVectorStore;
use synaptic::embeddings::FakeEmbeddings;
let embeddings = FakeEmbeddings::new(128);
// From (id, content) tuples
let store = InMemoryVectorStore::from_texts(
vec![("1", "Rust is fast"), ("2", "Python is flexible")],
&embeddings,
).await?;
// From Document values
let store = InMemoryVectorStore::from_documents(docs, &embeddings).await?;
Maximum Marginal Relevance (MMR)
MMR search balances relevance with diversity. The lambda_mult parameter controls the trade-off:
- 1.0 -- pure relevance (equivalent to standard similarity search)
- 0.0 -- maximum diversity
- 0.5 -- balanced (typical default)
let results = store.max_marginal_relevance_search(
"programming language",
3, // k: number of results
10, // fetch_k: initial candidates to consider
0.5, // lambda_mult: relevance vs. diversity
&embeddings,
).await?;
VectorStoreRetriever
VectorStoreRetriever bridges any VectorStore to the Retriever trait, making it compatible with the rest of Synaptic's retrieval infrastructure.
use std::sync::Arc;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Retriever;
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::new());
// ... add documents to store ...
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("query", 5).await?;
MultiVectorRetriever
MultiVectorRetriever stores small child chunks in a vector store for precise retrieval, but returns the larger parent documents they came from. This gives you the best of both worlds: small chunks for accurate embedding search and full documents for LLM context.
use std::sync::Arc;
use synaptic::vectorstores::{InMemoryVectorStore, MultiVectorRetriever};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::{Document, Retriever};
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::new());
let retriever = MultiVectorRetriever::new(store, embeddings, 3);
// Add parent documents with their child chunks
let parent = Document::new("parent-1", "Full article about Rust ownership...");
let children = vec![
Document::new("child-1", "Ownership rules in Rust"),
Document::new("child-2", "Borrowing and references"),
];
retriever.add_documents(parent, children).await?;
// Search finds child chunks but returns the parent
let results = retriever.retrieve("ownership", 1).await?;
assert_eq!(results[0].id, "parent-1");
The id_key metadata field links children to their parent. By default it is "doc_id".
Score threshold filtering
Set a minimum similarity score. Only documents meeting the threshold are returned:
let retriever = VectorStoreRetriever::new(store, embeddings, 10)
.with_score_threshold(0.7);
let results = retriever.retrieve("query", 10).await?;
// Only documents with cosine similarity >= 0.7 are included
BM25 Retriever
This guide shows how to use the BM25Retriever for keyword-based document retrieval using Okapi BM25 scoring.
Overview
BM25 (Best Matching 25) is a classic information retrieval algorithm that scores documents based on term frequency, inverse document frequency, and document length normalization. Unlike vector-based retrieval, BM25 does not require embeddings -- it works directly on the text.
BM25 is a good choice when:
- You need exact keyword matching rather than semantic similarity.
- You want fast retrieval without the cost of computing embeddings.
- You want to combine it with a vector retriever in an ensemble (see Ensemble Retriever).
Basic usage
use synaptic::retrieval::{BM25Retriever, Document, Retriever};
let docs = vec![
Document::new("1", "Rust is a systems programming language focused on safety"),
Document::new("2", "Python is widely used for data science and machine learning"),
Document::new("3", "Go was designed at Google for concurrent programming"),
Document::new("4", "Rust provides memory safety without garbage collection"),
];
let retriever = BM25Retriever::new(docs);
let results = retriever.retrieve("Rust memory safety", 2).await?;
// Returns documents 4 and 1 (highest BM25 scores for those query terms)
The retriever pre-computes term frequencies, document lengths, and inverse document frequencies at construction time, so retrieval itself is fast.
Custom BM25 parameters
BM25 has two tuning parameters:
- k1 (default 1.5) -- controls term frequency saturation. Higher values give more weight to term frequency.
- b (default 0.75) -- controls document length normalization. 1.0 means full length normalization; 0.0 means no length normalization.
let retriever = BM25Retriever::with_params(docs, 1.2, 0.8);
How scoring works
For each query term, BM25 computes:
score = IDF * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * dl / avgdl))
Where:
- IDF = ln((N - df + 0.5) / (df + 0.5) + 1) -- inverse document frequency
- dl -- document length (in tokens)
- avgdl -- average document length across the corpus
- N -- total number of documents
- df -- number of documents containing the term
Documents with a total score of zero (no matching terms) are excluded from results.
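To make the formula concrete, here is a small self-contained sketch of the per-term score (illustrative only -- not Synaptic's internal implementation):

```rust
// Per-term BM25 score, following the formula above.
fn bm25_term_score(tf: f64, df: f64, n_docs: f64, dl: f64, avgdl: f64, k1: f64, b: f64) -> f64 {
    let idf = ((n_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
    idf * (tf * (k1 + 1.0)) / (tf + k1 * (1.0 - b + b * dl / avgdl))
}

// Example: a term appearing twice in a 10-token document, in a corpus of 4 documents
// where 2 contain the term and the average length is 12 tokens, with defaults k1=1.5, b=0.75.
let score = bm25_term_score(2.0, 2.0, 4.0, 10.0, 12.0, 1.5, 0.75);
assert!(score > 0.0);
```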
Combining with vector search
BM25 pairs well with vector retrieval through the EnsembleRetriever. This gives you the best of both keyword matching and semantic search:
use std::sync::Arc;
use synaptic::retrieval::{BM25Retriever, EnsembleRetriever, Retriever};
let bm25 = Arc::new(BM25Retriever::new(docs.clone()));
let vector_retriever = Arc::new(/* VectorStoreRetriever */);
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever as Arc<dyn Retriever>, 0.5),
(bm25 as Arc<dyn Retriever>, 0.5),
]);
let results = ensemble.retrieve("query", 5).await?;
See Ensemble Retriever for more details on combining retrievers.
Multi-Query Retriever
This guide shows how to use the MultiQueryRetriever to improve retrieval recall by generating multiple query perspectives with an LLM.
Overview
A single search query may not capture all relevant documents, especially when the user's phrasing does not match the vocabulary in the document corpus. The MultiQueryRetriever addresses this by:
- Using a ChatModel to generate alternative phrasings of the original query.
- Running each query variant through a base retriever.
- Deduplicating and merging the results.
This technique improves recall by overcoming limitations of distance-based similarity search.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{MultiQueryRetriever, Retriever};
let base_retriever: Arc<dyn Retriever> = Arc::new(/* any retriever */);
let model: Arc<dyn ChatModel> = Arc::new(/* any ChatModel */);
// Default: generates 3 query variants
let retriever = MultiQueryRetriever::new(base_retriever, model);
let results = retriever.retrieve("What are the benefits of Rust?", 5).await?;
When you call retrieve(), the retriever:
- Sends a prompt to the LLM asking it to rephrase the query into 3 different versions.
- Runs the original query plus all generated variants through the base retriever.
- Collects all results, deduplicates by document id, and returns up to top_k documents.
Custom number of query variants
Specify a different number of generated queries:
let retriever = MultiQueryRetriever::with_num_queries(
base_retriever,
model,
5, // Generate 5 query variants
);
More variants increase recall but also increase the number of LLM and retriever calls.
How it works internally
The retriever sends a prompt like this to the LLM:
You are an AI language model assistant. Your task is to generate 3 different versions of the given user question to retrieve relevant documents from a vector database. By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of distance-based similarity search. Provide these alternative questions separated by newlines. Only output the questions, nothing else.
Original question: What are the benefits of Rust?
The LLM might respond with:
Why should I use Rust as a programming language?
What advantages does Rust offer over other languages?
What makes Rust a good choice for software development?
Each of these queries is then run through the base retriever, and all results are merged with deduplication.
Example with a vector store
use std::sync::Arc;
use synaptic::retrieval::{MultiQueryRetriever, Retriever};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Document;
// Set up vector store
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::new());
let docs = vec![
Document::new("1", "Rust ensures memory safety without a garbage collector"),
Document::new("2", "Rust's ownership system prevents data races at compile time"),
Document::new("3", "Go uses goroutines for lightweight concurrency"),
];
store.add_documents(docs, embeddings.as_ref()).await?;
// Wrap vector store as a retriever
let base = Arc::new(VectorStoreRetriever::new(store, embeddings, 5));
// Create multi-query retriever
let model: Arc<dyn ChatModel> = Arc::new(/* your model */);
let retriever = MultiQueryRetriever::new(base, model);
let results = retriever.retrieve("Why is Rust safe?", 5).await?;
// May find documents that mention "memory safety", "ownership", "data races"
// even if the original query doesn't use those exact terms
Ensemble Retriever
This guide shows how to combine multiple retrievers using the EnsembleRetriever and Reciprocal Rank Fusion (RRF).
Overview
Different retrieval strategies have different strengths. Keyword-based methods (like BM25) excel at exact term matching, while vector-based methods capture semantic similarity. The EnsembleRetriever combines results from multiple retrievers into a single ranked list, giving you the best of both approaches.
It uses Reciprocal Rank Fusion (RRF) to merge rankings. Each retriever contributes a weighted RRF score for each document based on the document's rank in that retriever's results. Documents are then sorted by their total RRF score.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{EnsembleRetriever, Retriever};
let retriever_a: Arc<dyn Retriever> = Arc::new(/* vector retriever */);
let retriever_b: Arc<dyn Retriever> = Arc::new(/* BM25 retriever */);
let ensemble = EnsembleRetriever::new(vec![
(retriever_a, 0.5), // weight 0.5
(retriever_b, 0.5), // weight 0.5
]);
let results = ensemble.retrieve("query", 5).await?;
Each tuple contains a retriever and its weight. The weight scales the RRF score contribution from that retriever.
Combining vector search with BM25
The most common use case is combining semantic (vector) search with keyword (BM25) search:
use std::sync::Arc;
use synaptic::retrieval::{BM25Retriever, EnsembleRetriever, Document, Retriever};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
let docs = vec![
Document::new("1", "Rust provides memory safety through ownership"),
Document::new("2", "Python has a large ecosystem for machine learning"),
Document::new("3", "Rust's borrow checker prevents data races"),
Document::new("4", "Go is designed for building scalable services"),
];
// BM25 retriever (keyword-based)
let bm25 = Arc::new(BM25Retriever::new(docs.clone()));
// Vector retriever (semantic)
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::from_documents(docs, embeddings.as_ref()).await?);
let vector = Arc::new(VectorStoreRetriever::new(store, embeddings, 5));
// Combine with equal weights
let ensemble = EnsembleRetriever::new(vec![
(vector as Arc<dyn Retriever>, 0.5),
(bm25 as Arc<dyn Retriever>, 0.5),
]);
let results = ensemble.retrieve("Rust safety", 3).await?;
Adjusting weights
Weights control how much each retriever contributes to the final ranking. Higher weight means more influence.
// Favor semantic search
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever, 0.7),
(bm25_retriever, 0.3),
]);
// Favor keyword search
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever, 0.3),
(bm25_retriever, 0.7),
]);
How Reciprocal Rank Fusion works
For each document returned by a retriever, RRF computes a score:
rrf_score = weight / (k + rank)
Where:
- weight is the retriever's configured weight.
- k is a constant (60, the standard RRF constant) that prevents top-ranked documents from dominating.
- rank is the document's 1-based position in the retriever's results.
If a document appears in results from multiple retrievers, its RRF scores are summed. The final results are sorted by total RRF score in descending order.
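A standalone sketch of this fusion step (illustrative only -- not Synaptic's internal implementation):

```rust
use std::collections::HashMap;

// Weighted Reciprocal Rank Fusion over ranked lists of document IDs.
// Each entry is (weight, ranked_ids); rank is 1-based and k = 60.
fn rrf_fuse(ranked_lists: &[(f64, Vec<&str>)]) -> Vec<(String, f64)> {
    const K: f64 = 60.0;
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (weight, ids) in ranked_lists {
        for (i, id) in ids.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += weight / (K + rank);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

let fused = rrf_fuse(&[
    (0.5, vec!["1", "3", "2"]), // vector retriever ranking
    (0.5, vec!["3", "4"]),      // BM25 ranking
]);
// Document "3" appears in both lists, so its summed score ranks it first.
assert_eq!(fused[0].0, "3");
```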
Combining more than two retrievers
You can combine any number of retrievers:
let ensemble = EnsembleRetriever::new(vec![
(vector_retriever, 0.4),
(bm25_retriever, 0.3),
(multi_query_retriever, 0.3),
]);
let results = ensemble.retrieve("query", 10).await?;
Contextual Compression
This guide shows how to post-filter retrieved documents using the ContextualCompressionRetriever and EmbeddingsFilter.
Overview
A base retriever may return documents that are only loosely related to the query. Contextual compression adds a second filtering step: after retrieval, a DocumentCompressor evaluates each document against the query and removes documents that do not meet a relevance threshold.
This is especially useful when your base retriever fetches broadly (high recall) and you want to tighten the results (high precision).
DocumentCompressor trait
The filtering logic is defined by the DocumentCompressor trait:
#[async_trait]
pub trait DocumentCompressor: Send + Sync {
async fn compress_documents(
&self,
documents: Vec<Document>,
query: &str,
) -> Result<Vec<Document>, SynapticError>;
}
Synaptic provides EmbeddingsFilter as a built-in compressor.
EmbeddingsFilter
Filters documents by computing cosine similarity between the query embedding and each document's content embedding. Only documents that meet or exceed the similarity threshold are kept.
use std::sync::Arc;
use synaptic::retrieval::EmbeddingsFilter;
use synaptic::embeddings::FakeEmbeddings;
let embeddings = Arc::new(FakeEmbeddings::new(128));
// Only keep documents with similarity >= 0.7
let filter = EmbeddingsFilter::new(embeddings, 0.7);
A convenience constructor uses the default threshold of 0.75:
let filter = EmbeddingsFilter::with_default_threshold(embeddings);
ContextualCompressionRetriever
Wraps a base retriever and applies a DocumentCompressor to the results:
use std::sync::Arc;
use synaptic::retrieval::{
ContextualCompressionRetriever,
EmbeddingsFilter,
Retriever,
};
use synaptic::embeddings::FakeEmbeddings;
let embeddings = Arc::new(FakeEmbeddings::new(128));
let base_retriever: Arc<dyn Retriever> = Arc::new(/* any retriever */);
// Create the filter
let filter = Arc::new(EmbeddingsFilter::new(embeddings, 0.7));
// Wrap the base retriever with compression
let retriever = ContextualCompressionRetriever::new(base_retriever, filter);
let results = retriever.retrieve("query", 5).await?;
// Only documents with cosine similarity >= 0.7 to the query are returned
Full example
use std::sync::Arc;
use synaptic::retrieval::{
BM25Retriever,
ContextualCompressionRetriever,
EmbeddingsFilter,
Document,
Retriever,
};
use synaptic::embeddings::FakeEmbeddings;
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "The weather today is sunny and warm"),
Document::new("3", "Rust provides memory safety guarantees"),
Document::new("4", "Cooking pasta requires boiling water"),
];
// BM25 might return loosely relevant results
let base = Arc::new(BM25Retriever::new(docs));
// Use embedding similarity to filter out irrelevant documents
let embeddings = Arc::new(FakeEmbeddings::new(128));
let filter = Arc::new(EmbeddingsFilter::new(embeddings, 0.6));
let retriever = ContextualCompressionRetriever::new(base, filter);
let results = retriever.retrieve("Rust programming", 5).await?;
// Documents about weather and cooking are filtered out
How it works
- The ContextualCompressionRetriever calls base.retrieve(query, top_k) to get candidate documents.
- It passes those candidates to the DocumentCompressor (e.g., EmbeddingsFilter).
- The compressor embeds the query and all candidate documents, computes cosine similarity, and removes documents below the threshold.
- The filtered results are returned.
Custom compressors
You can implement your own DocumentCompressor for other filtering strategies -- for example, using an LLM to judge relevance or extracting only the most relevant passage from each document.
use async_trait::async_trait;
use synaptic::retrieval::{DocumentCompressor, Document};
use synaptic::core::SynapticError;
struct MyCompressor;
#[async_trait]
impl DocumentCompressor for MyCompressor {
async fn compress_documents(
&self,
documents: Vec<Document>,
query: &str,
) -> Result<Vec<Document>, SynapticError> {
// Your filtering logic here
Ok(documents)
}
}
Self-Query Retriever
This guide shows how to use the SelfQueryRetriever to automatically extract structured metadata filters from natural language queries.
Overview
Users often express search intent that includes both a semantic query and metadata constraints in the same sentence. For example:
"Find documents about Rust published after 2024"
This contains:
- A semantic query: "documents about Rust"
- A metadata filter: year > 2024
The SelfQueryRetriever uses a ChatModel to parse the user's natural language query into a structured search query plus metadata filters, then applies those filters to the results from a base retriever.
Defining metadata fields
First, describe the metadata fields available in your document corpus using MetadataFieldInfo:
use synaptic::retrieval::MetadataFieldInfo;
let fields = vec![
MetadataFieldInfo {
name: "year".to_string(),
description: "The year the document was published".to_string(),
field_type: "integer".to_string(),
},
MetadataFieldInfo {
name: "language".to_string(),
description: "The programming language discussed".to_string(),
field_type: "string".to_string(),
},
MetadataFieldInfo {
name: "author".to_string(),
description: "The author of the document".to_string(),
field_type: "string".to_string(),
},
];
Each field has a name, a human-readable description, and a field_type that tells the LLM what kind of values to expect.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{SelfQueryRetriever, MetadataFieldInfo, Retriever};
let base_retriever: Arc<dyn Retriever> = Arc::new(/* any retriever */);
let model: Arc<dyn ChatModel> = Arc::new(/* any ChatModel */);
let retriever = SelfQueryRetriever::new(base_retriever, model, fields);
let results = retriever.retrieve(
"find articles about Rust written by Alice",
5,
).await?;
// LLM extracts: query="Rust", filters: [language eq "Rust", author eq "Alice"]
How it works
- The retriever builds a prompt describing the available metadata fields and sends the user's query to the LLM.
- The LLM responds with a JSON object containing:
"query"-- the extracted semantic search query."filters"-- an array of filter objects, each with"field","op", and"value".
- The retriever runs the extracted query through the base retriever (fetching extra candidates,
top_k * 2). - Filters are applied to the results, keeping only documents whose metadata matches all filter conditions.
- The final filtered results are truncated to
top_kand returned.
Supported filter operators
| Operator | Meaning |
|---|---|
eq | Equal to |
gt | Greater than |
gte | Greater than or equal to |
lt | Less than |
lte | Less than or equal to |
contains | String contains substring |
Numeric comparisons work on both integers and floats. String comparisons use lexicographic ordering.
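For reference, a sketch of the structured output described in "How it works" for the earlier example query ("Find documents about Rust published after 2024"):

```rust
use serde_json::json;

// Sketch of the extracted query plus filter objects ({"field", "op", "value"}).
let parsed = json!({
    "query": "documents about Rust",
    "filters": [
        { "field": "year", "op": "gt", "value": 2024 }
    ]
});
assert_eq!(parsed["filters"][0]["op"], "gt");
```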
Full example
use std::sync::Arc;
use std::collections::HashMap;
use synaptic::retrieval::{
BM25Retriever,
SelfQueryRetriever,
MetadataFieldInfo,
Document,
Retriever,
};
use serde_json::json;
// Documents with metadata
let docs = vec![
Document::with_metadata(
"1",
"An introduction to Rust's ownership model",
HashMap::from([
("year".to_string(), json!(2024)),
("language".to_string(), json!("Rust")),
]),
),
Document::with_metadata(
"2",
"Advanced Python patterns for data pipelines",
HashMap::from([
("year".to_string(), json!(2023)),
("language".to_string(), json!("Python")),
]),
),
Document::with_metadata(
"3",
"Rust async programming with Tokio",
HashMap::from([
("year".to_string(), json!(2025)),
("language".to_string(), json!("Rust")),
]),
),
];
let base = Arc::new(BM25Retriever::new(docs));
let model: Arc<dyn ChatModel> = Arc::new(/* your model */);
let fields = vec![
MetadataFieldInfo {
name: "year".to_string(),
description: "Publication year".to_string(),
field_type: "integer".to_string(),
},
MetadataFieldInfo {
name: "language".to_string(),
description: "Programming language topic".to_string(),
field_type: "string".to_string(),
},
];
let retriever = SelfQueryRetriever::new(base, model, fields);
// Natural language query with implicit filters
let results = retriever.retrieve("Rust articles from 2025", 5).await?;
// LLM extracts: query="Rust articles", filters: [language eq "Rust", year eq 2025]
// Returns only document 3
Considerations
- The quality of filter extraction depends on the LLM. Use a capable model for reliable results.
- Only filters referencing fields declared in MetadataFieldInfo are applied; unknown fields are ignored.
- If the LLM cannot parse the query into structured filters, it falls back to an empty filter list and returns standard retrieval results.
Parent Document Retriever
This guide shows how to use the ParentDocumentRetriever to search on small chunks for precision while returning full parent documents for context.
The problem
When splitting documents for retrieval, you face a trade-off:
- Small chunks are better for search precision -- they match queries more accurately because there is less noise.
- Large documents are better for context -- they give the LLM more information to work with when generating answers.
The ParentDocumentRetriever solves this by maintaining both: it splits parent documents into small child chunks for indexing, but when a child chunk matches a query, it returns the full parent document.
How it works
- You provide parent documents and a splitting function.
- The retriever splits each parent into child chunks, storing a child-to-parent mapping.
- Child chunks are indexed in a child retriever (e.g., backed by a vector store).
- At retrieval time, the child retriever finds matching chunks, then the parent retriever maps those back to their parent documents, deduplicating along the way.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{ParentDocumentRetriever, Document, Retriever};
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
// Create a child retriever (any Retriever implementation)
let child_retriever: Arc<dyn Retriever> = Arc::new(/* vector store retriever */);
// Create the parent document retriever with a splitting function
let splitter = RecursiveCharacterTextSplitter::new(200);
let parent_retriever = ParentDocumentRetriever::new(
child_retriever.clone(),
move |text: &str| splitter.split_text(text),
);
Adding documents
The add_documents() method splits parent documents into children and stores the mappings. It returns the child documents so you can index them in the child retriever.
let parent_docs = vec![
Document::new("doc-1", "A very long document about Rust ownership..."),
Document::new("doc-2", "A detailed guide to async programming in Rust..."),
];
// Split parents into children and get child docs for indexing
let child_docs = parent_retriever.add_documents(parent_docs).await;
// Index child docs in the vector store
// child_docs[0].id == "doc-1-child-0"
// child_docs[0].metadata["parent_id"] == "doc-1"
// child_docs[0].metadata["chunk_index"] == 0
Each child document:
- Has an ID formatted as "{parent_id}-child-{index}".
- Inherits all metadata from the parent.
- Gets additional parent_id and chunk_index metadata fields.
Retrieval
When you call retrieve(), the retriever searches for matching child chunks, then returns the corresponding parent documents:
let results = parent_retriever.retrieve("ownership borrowing", 3).await?;
// Returns full parent documents, not individual chunks
The retriever fetches top_k * 3 child results internally to ensure enough parent documents can be assembled after deduplication.
Full example
use std::sync::Arc;
use synaptic::retrieval::{ParentDocumentRetriever, Document, Retriever};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
// Set up embeddings and vector store for child chunks
let embeddings = Arc::new(FakeEmbeddings::new(128));
let child_store = Arc::new(InMemoryVectorStore::new());
// Create the child retriever
let child_retriever = Arc::new(VectorStoreRetriever::new(
child_store.clone(),
embeddings.clone(),
10,
));
// Create parent retriever with a small chunk size for children
let splitter = RecursiveCharacterTextSplitter::new(200);
let parent_retriever = ParentDocumentRetriever::new(
child_retriever,
move |text: &str| splitter.split_text(text),
);
// Add parent documents
let parents = vec![
Document::new("rust-guide", "A comprehensive guide to Rust. \
Rust is a systems programming language focused on safety, speed, and concurrency. \
It achieves memory safety without garbage collection through its ownership system. \
The borrow checker enforces ownership rules at compile time..."),
Document::new("go-guide", "A comprehensive guide to Go. \
Go is a statically typed language designed at Google. \
It features goroutines for lightweight concurrency. \
Go's garbage collector manages memory automatically..."),
];
let children = parent_retriever.add_documents(parents).await;
// Index children in the vector store
child_store.add_documents(children, embeddings.as_ref()).await?;
// Search for child chunks, get back full parent documents
let results = parent_retriever.retrieve("memory safety ownership", 2).await?;
// Returns the full "rust-guide" parent document, even though only
// a small chunk about ownership matched the query
When to use this
The ParentDocumentRetriever is most useful when:
- Your documents are long and cover multiple topics, but you want precise retrieval.
- You need the LLM to see the full document context for generating high-quality answers.
- Small chunks alone would lose important surrounding context.
For simpler use cases where chunks are self-contained, a standard VectorStoreRetriever may be sufficient.
Tools
Tools give LLMs the ability to take actions in the world -- calling APIs, querying databases, performing calculations, or any other side effect. Synaptic provides a complete tool system built around the Tool trait defined in synaptic-core.
Key Components
| Component | Crate | Description |
|---|---|---|
Tool trait | synaptic-core | The interface every tool must implement: name(), description(), and call() |
ToolRegistry | synaptic-tools | Thread-safe collection of registered tools (Arc<RwLock<HashMap>>) |
SerialToolExecutor | synaptic-tools | Dispatches tool calls by name through the registry |
ToolNode | synaptic-graph | Graph node that executes tool calls from AI messages in a state machine workflow |
ToolDefinition | synaptic-core | Schema description sent to the model so it knows what tools are available |
ToolChoice | synaptic-core | Controls whether and how the model selects tools |
How It Works
- You define tools using the #[tool] macro (or by implementing the Tool trait manually).
- Register them in a ToolRegistry.
- Convert them to ToolDefinition values and attach them to a ChatRequest so the model knows what tools are available.
- When the model responds with ToolCall entries, dispatch them through SerialToolExecutor to get results.
- Send the results back to the model as Message::tool(...) messages to continue the conversation.
Quick Example
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
/// Add two numbers.
#[tool]
async fn add(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a + b}))
}
let registry = ToolRegistry::new();
registry.register(add())?; // add() returns Arc<dyn Tool>
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("add", json!({"a": 3, "b": 4})).await?;
assert_eq!(result, json!({"result": 7.0}));
Sub-Pages
- Custom Tools -- implement the Tool trait for your own tools
- Tool Registry -- register, look up, and execute tools
- Tool Choice -- control how the model selects tools with ToolChoice
- Tool Definition Extras -- attach provider-specific parameters to tool definitions
- Runtime-Aware Tools -- tools that access graph state, store, and runtime context
Custom Tools
Every tool in Synaptic implements the Tool trait from synaptic-core. The recommended way to define tools is with the #[tool] attribute macro, which generates all the boilerplate for you.
Defining a Tool with #[tool]
The #[tool] macro converts an async function into a full Tool implementation. Doc comments on the function become the tool description, and doc comments on parameters become JSON Schema descriptions:
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use serde_json::{json, Value};
/// Get the current weather for a location.
#[tool]
async fn get_weather(
/// The city name
location: String,
) -> Result<Value, SynapticError> {
// In production, call a real weather API here
Ok(json!({
"location": location,
"temperature": 22,
"condition": "sunny"
}))
}
// `get_weather()` returns Arc<dyn Tool>
let tool = get_weather();
assert_eq!(tool.name(), "get_weather");
Key points:
- The function name becomes the tool name (override with #[tool(name = "custom_name")]; see the sketch after this list).
- The doc comment on the function becomes the tool description.
- Each parameter becomes a JSON Schema property; doc comments on parameters become "description" fields in the schema.
- String, i64, f64, bool, Vec<T>, and Option<T> types are mapped to JSON Schema types automatically.
- The factory function (get_weather()) returns Arc<dyn Tool>.
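As a quick illustration of the name override, the following sketch renames the generated tool while keeping the factory function name (the body is a placeholder):
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
/// Look up a stock price.
#[tool(name = "stock_price")]
async fn lookup_price(
    /// Ticker symbol, e.g. "ACME"
    symbol: String,
) -> Result<Value, SynapticError> {
    // Placeholder result for illustration only
    Ok(json!({"symbol": symbol, "price": 123.45}))
}
let tool = lookup_price();
assert_eq!(tool.name(), "stock_price");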
Error Handling
Return SynapticError::Tool(...) for tool-specific errors. The macro handles parameter validation automatically, but you can add your own domain-specific checks:
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use serde_json::{json, Value};
/// Divide two numbers.
#[tool]
async fn divide(
/// The numerator
a: f64,
/// The denominator
b: f64,
) -> Result<Value, SynapticError> {
if b == 0.0 {
return Err(SynapticError::Tool("division by zero".to_string()));
}
Ok(json!({"result": a / b}))
}
Note that the macro auto-generates validation for missing or invalid parameters (returning SynapticError::Tool errors), so you no longer need manual args["a"].as_f64().ok_or_else(...) checks.
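As a sketch of that behavior (assuming the generated validation surfaces as SynapticError::Tool, as described above), calling the tool with a missing argument fails before your function body runs:
use serde_json::json;
use synaptic::core::SynapticError;
// "b" is missing, so the macro-generated validation rejects the call.
let err = divide().call(json!({"a": 1.0})).await.unwrap_err();
assert!(matches!(err, SynapticError::Tool(_)));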
Registering and Using
The #[tool] macro factory returns Arc<dyn Tool>, which you register directly:
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
use serde_json::json;
let registry = ToolRegistry::new();
registry.register(get_weather())?;
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("get_weather", json!({"location": "Tokyo"})).await?;
// result = {"location": "Tokyo", "temperature": 22, "condition": "sunny"}
See the Tool Registry page for more on registration and execution.
Full ReAct Agent Loop
Here is a complete offline example that defines tools with #[tool], then wires them into a ReAct agent with ScriptedChatModel:
use std::sync::Arc;
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::{ChatModel, ChatResponse, Message, Tool, ToolCall, SynapticError};
use synaptic::models::ScriptedChatModel;
use synaptic::graph::{create_react_agent, MessageState};
// 1. Define tools with the macro
/// Add two numbers.
#[tool]
async fn add(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a + b}))
}
// 2. Script the model to call the tool and then respond
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"",
vec![ToolCall {
id: "call_1".into(),
name: "add".into(),
arguments: r#"{"a": 3, "b": 4}"#.into(),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("The sum is 7."),
usage: None,
},
]));
// 3. Build the agent -- add() returns Arc<dyn Tool>
let tools: Vec<Arc<dyn Tool>> = vec![add()];
let agent = create_react_agent(model, tools)?;
// 4. Run it
let state = MessageState::with_messages(vec![
Message::human("What is 3 + 4?"),
]);
let result = agent.invoke(state).await?.into_state();
assert_eq!(result.messages.last().unwrap().content(), "The sum is 7.");
Tool Definitions for Models
To tell a chat model about available tools, create ToolDefinition values and attach them to a ChatRequest:
use serde_json::json;
use synaptic::core::{ChatRequest, Message, ToolDefinition};
let tool_def = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city name"
}
},
"required": ["location"]
}),
};
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(vec![tool_def]);
The parameters field follows the JSON Schema format that LLM providers expect.
Optional and Default Parameters
Use #[default = ...] to give a parameter a default value, and Option<T> for parameters the model may omit:
use synaptic::macros::tool;
use synaptic::core::SynapticError;
#[tool]
async fn search(
/// The search query
query: String,
/// Maximum results (default 10)
#[default = 10]
max_results: i64,
/// Language filter
language: Option<String>,
) -> Result<String, SynapticError> {
let lang = language.unwrap_or_else(|| "en".into());
Ok(format!("Searching '{}' (max {}, lang {})", query, max_results, lang))
}
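A minimal usage sketch, assuming the macro-generated argument handling applies the default and leaves the optional parameter as None when the caller omits them:
use serde_json::json;
// Only "query" is provided; max_results falls back to 10 and language to "en".
let tool = search();
let result = tool.call(json!({"query": "rust retrieval"})).await?;
// result: "Searching 'rust retrieval' (max 10, lang en)" wrapped as a JSON value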
Stateful Tools with #[field]
Tools that need to hold state (database connections, API clients, etc.) can use
#[field] to create struct fields that are hidden from the LLM schema:
use std::sync::Arc;
use serde_json::Value;
use synaptic::macros::tool;
use synaptic::core::SynapticError;
// `DbPool` stands in for your own connection-pool type.
#[tool]
async fn db_query(
#[field] pool: Arc<DbPool>,
/// SQL query to execute
query: String,
) -> Result<Value, SynapticError> {
let result = pool.execute(&query).await?;
Ok(serde_json::to_value(result).unwrap())
}
// Factory requires the field parameter
let tool = db_query(pool.clone());
For the full macro reference including #[inject], #[default], and middleware
macros, see the Procedural Macros page.
Manual Implementation
For advanced cases that the macro cannot handle (custom parameters() overrides, conditional logic in name() or description(), or implementing both Tool and other traits on the same struct), you can implement the Tool trait directly:
use async_trait::async_trait;
use serde_json::{json, Value};
use synaptic::core::{Tool, SynapticError};
struct WeatherTool;
#[async_trait]
impl Tool for WeatherTool {
fn name(&self) -> &'static str {
"get_weather"
}
fn description(&self) -> &'static str {
"Get the current weather for a location"
}
async fn call(&self, args: Value) -> Result<Value, SynapticError> {
let location = args["location"]
.as_str()
.unwrap_or("unknown");
Ok(json!({
"location": location,
"temperature": 22,
"condition": "sunny"
}))
}
}
The trait requires three methods:
- name() -- a &'static str identifier the model uses when making tool calls.
- description() -- tells the model what the tool does.
- call() -- receives arguments as a serde_json::Value and returns a Value result.
Wrap manual implementations in Arc::new(WeatherTool) when registering them.
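For example, a minimal registration sketch:
use std::sync::Arc;
use synaptic::tools::ToolRegistry;
let registry = ToolRegistry::new();
registry.register(Arc::new(WeatherTool))?;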
Tool Registry
ToolRegistry is a thread-safe collection of tools, and SerialToolExecutor dispatches tool calls through the registry by name. Both are provided by the synaptic-tools crate.
ToolRegistry
ToolRegistry stores tools in an Arc<RwLock<HashMap<String, Arc<dyn Tool>>>>. It is Clone and can be shared across threads.
Creating and Registering Tools
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use synaptic::tools::ToolRegistry;
/// Echo back the input.
#[tool]
async fn echo(
#[args] args: Value,
) -> Result<Value, SynapticError> {
Ok(json!({"echo": args}))
}
let registry = ToolRegistry::new();
registry.register(echo())?; // echo() returns Arc<dyn Tool>
If you register two tools with the same name, the second registration replaces the first.
Looking Up Tools
Use get() to retrieve a tool by name:
let tool = registry.get("echo");
assert!(tool.is_some());
let missing = registry.get("nonexistent");
assert!(missing.is_none());
get() returns Option<Arc<dyn Tool>>, so the tool can be called directly if needed.
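Because get() hands back the Arc<dyn Tool>, you can bypass the executor and call it yourself -- a small sketch reusing the echo tool from above:
use serde_json::json;
if let Some(tool) = registry.get("echo") {
    // Call the tool directly with raw JSON arguments.
    let result = tool.call(json!({"message": "direct"})).await?;
    assert_eq!(result, json!({"echo": {"message": "direct"}}));
}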
SerialToolExecutor
SerialToolExecutor wraps a ToolRegistry and provides a convenience method that looks up a tool by name and calls it in one step.
Creating and Using
use synaptic::tools::SerialToolExecutor;
use serde_json::json;
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("echo", json!({"message": "hello"})).await?;
assert_eq!(result, json!({"echo": {"message": "hello"}}));
The execute() method:
- Looks up the tool by name in the registry.
- Calls tool.call(args) with the provided arguments.
- Returns the result, or SynapticError::ToolNotFound if the tool does not exist.
Handling Unknown Tools
If you call execute() with a name that is not registered, it returns SynapticError::ToolNotFound:
let err = executor.execute("nonexistent", json!({})).await.unwrap_err();
assert!(matches!(err, synaptic::core::SynapticError::ToolNotFound(name) if name == "nonexistent"));
Complete Example
Here is a full example that registers multiple tools and executes them:
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
/// Add two numbers.
#[tool]
async fn add(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a + b}))
}
/// Multiply two numbers.
#[tool]
async fn multiply(
/// First number
a: f64,
/// Second number
b: f64,
) -> Result<Value, SynapticError> {
Ok(json!({"result": a * b}))
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let registry = ToolRegistry::new();
registry.register(add())?;
registry.register(multiply())?;
let executor = SerialToolExecutor::new(registry);
let sum = executor.execute("add", json!({"a": 3, "b": 4})).await?;
assert_eq!(sum, json!({"result": 7.0}));
let product = executor.execute("multiply", json!({"a": 3, "b": 4})).await?;
assert_eq!(product, json!({"result": 12.0}));
Ok(())
}
Integration with Chat Models
In a typical agent workflow, the model's response contains ToolCall entries. You dispatch them through the executor and send the results back:
use synaptic::core::{Message, ToolCall};
use serde_json::json;
// After model responds with tool calls:
let tool_calls = vec![
ToolCall {
id: "call-1".to_string(),
name: "add".to_string(),
arguments: json!({"a": 3, "b": 4}),
},
];
// Execute each tool call
for tc in &tool_calls {
let result = executor.execute(&tc.name, tc.arguments.clone()).await?;
// Create a tool message with the result
let tool_message = Message::tool(
result.to_string(),
&tc.id,
);
// Append tool_message to the conversation and send back to the model
}
See the ReAct Agent tutorial for a complete agent loop example.
Tool Choice
ToolChoice controls whether and how a chat model selects tools when responding. It is defined in synaptic-core and attached to a ChatRequest via the with_tool_choice() builder method.
ToolChoice Variants
| Variant | Behavior |
|---|---|
ToolChoice::Auto | The model decides whether to call a tool or respond with text (default when tools are provided) |
ToolChoice::Required | The model must call at least one tool -- it cannot respond with plain text |
ToolChoice::None | The model must not call any tools, even if tools are provided in the request |
ToolChoice::Specific(name) | The model must call the specific named tool |
Basic Usage
Attach ToolChoice to a ChatRequest alongside tool definitions:
use serde_json::json;
use synaptic::core::{ChatRequest, Message, ToolChoice, ToolDefinition};
let weather_tool = ToolDefinition {
name: "get_weather".to_string(),
description: "Get the current weather for a location".to_string(),
parameters: json!({
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}),
};
// Force the model to use tools
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(vec![weather_tool])
.with_tool_choice(ToolChoice::Required);
When to Use Each Variant
Auto (Default)
Let the model decide. This is the best choice for general-purpose agents that should respond with text when no tool is needed:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::human("Hello, how are you?"),
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::Auto);
Required
Force tool usage. Useful in agent loops where the next step must be a tool call, or when you know the user's request requires tool invocation:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::human("Look up the weather in Paris and Tokyo."),
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::Required);
// The model MUST respond with one or more tool calls
None
Suppress tool calls. Useful when you want to temporarily disable tools without removing them from the request, or during a final summarization step:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::system("Summarize the tool results for the user."),
Message::human("What is the weather?"),
// ... tool result messages ...
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::None);
// The model MUST respond with text, not tool calls
Specific
Force a particular tool. Useful when you know exactly which tool should be called:
use synaptic::core::{ChatRequest, Message, ToolChoice};
let request = ChatRequest::new(vec![
Message::human("Check the weather in London."),
])
.with_tools(tool_defs)
.with_tool_choice(ToolChoice::Specific("get_weather".to_string()));
// The model MUST call the "get_weather" tool specifically
Complete Example
Here is a full example that creates tools, forces a specific tool call, and processes the result:
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::{
ChatModel, ChatRequest, Message, SynapticError, Tool,
ToolChoice,
};
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
/// Perform arithmetic calculations.
#[tool]
async fn calculator(
/// The arithmetic expression to evaluate
expression: String,
) -> Result<Value, SynapticError> {
// Simplified: in production, parse and evaluate the expression
Ok(json!({"result": expression}))
}
// Register tools
let registry = ToolRegistry::new();
let calc_tool = calculator(); // Arc<dyn Tool>
registry.register(calc_tool.clone())?;
// Build the tool definition from the tool itself
let calc_def = calc_tool.as_tool_definition();
// Build a request that forces the calculator tool
let request = ChatRequest::new(vec![
Message::human("What is 42 * 17?"),
])
.with_tools(vec![calc_def])
.with_tool_choice(ToolChoice::Specific("calculator".to_string()));
// Send to the model, then execute the returned tool calls
let response = model.chat(request).await?;
for tc in response.message.tool_calls() {
let executor = SerialToolExecutor::new(registry.clone());
let result = executor.execute(&tc.name, tc.arguments.clone()).await?;
println!("Tool {} returned: {}", tc.name, result);
}
Provider Support
All Synaptic provider adapters (OpenAiChatModel, AnthropicChatModel, GeminiChatModel, OllamaChatModel) support ToolChoice. The adapter translates the Synaptic ToolChoice enum into the provider-specific format automatically.
See also: Bind Tools for attaching tools to a model permanently, and the ReAct Agent tutorial for a complete agent loop.
Tool Definition Extras
The extras field on ToolDefinition carries provider-specific parameters that fall outside the standard name/description/parameters schema, such as Anthropic's cache_control or any custom metadata your provider adapter needs.
The extras Field
pub struct ToolDefinition {
pub name: String,
pub description: String,
pub parameters: Value,
/// Provider-specific parameters (e.g., Anthropic's `cache_control`).
pub extras: Option<HashMap<String, Value>>,
}
When extras is None (the default), no additional fields are serialized. Provider adapters inspect extras during request building and map recognized keys into the provider's wire format.
Setting Extras on a Tool Definition
Build a ToolDefinition with extras by populating the field directly:
use std::collections::HashMap;
use serde_json::{json, Value};
use synaptic::core::ToolDefinition;
let mut extras = HashMap::new();
extras.insert("cache_control".to_string(), json!({"type": "ephemeral"}));
let tool_def = ToolDefinition {
name: "search".to_string(),
description: "Search the web".to_string(),
parameters: json!({
"type": "object",
"properties": {
"query": { "type": "string" }
},
"required": ["query"]
}),
extras: Some(extras),
};
Common Use Cases
Anthropic prompt caching -- Anthropic supports a cache_control field on tool definitions to enable prompt caching for tool schemas that rarely change:
let mut extras = HashMap::new();
extras.insert("cache_control".to_string(), json!({"type": "ephemeral"}));
let def = ToolDefinition {
name: "lookup".to_string(),
description: "Look up a record".to_string(),
parameters: json!({"type": "object", "properties": {}}),
extras: Some(extras),
};
Custom metadata -- You can attach arbitrary key-value pairs for your own adapter logic:
let mut extras = HashMap::new();
extras.insert("priority".to_string(), json!("high"));
extras.insert("timeout_ms".to_string(), json!(5000));
let def = ToolDefinition {
name: "deploy".to_string(),
description: "Deploy the service".to_string(),
parameters: json!({"type": "object", "properties": {}}),
extras: Some(extras),
};
Extras with #[tool] Macro Tools
The #[tool] macro does not support extras directly -- extras are a property of the ToolDefinition, not the tool function itself. Define your tool with the macro, then add extras to the generated definition:
use std::collections::HashMap;
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::SynapticError;
/// Does something useful.
#[tool]
async fn my_tool(
/// The input query
query: String,
) -> Result<Value, SynapticError> {
Ok(json!("done"))
}
// Get the tool definition and add extras
let tool = my_tool();
let mut def = tool.as_tool_definition();
def.extras = Some(HashMap::from([
("cache_control".to_string(), json!({"type": "ephemeral"})),
]));
// Use `def` when building the ChatRequest
This approach works with any tool -- whether defined via #[tool] or by implementing the Tool trait manually.
Runtime-Aware Tools
RuntimeAwareTool extends the basic Tool trait with runtime context -- current graph state, a store reference, stream writer, tool call ID, and runnable config. Implement this trait for tools that need to read or modify graph state during execution.
The ToolRuntime Struct
When a runtime-aware tool is invoked, it receives a ToolRuntime with the following fields:
pub struct ToolRuntime {
pub store: Option<Arc<dyn Store>>,
pub stream_writer: Option<StreamWriter>,
pub state: Option<Value>,
pub tool_call_id: String,
pub config: Option<RunnableConfig>,
}
| Field | Description |
|---|---|
store | Shared key-value store for cross-tool persistence |
stream_writer | Writer for pushing streaming output from within a tool |
state | Serialized snapshot of the current graph state |
tool_call_id | The ID of the tool call being executed |
config | Runnable config with tags, metadata, and run ID |
Implementing with #[tool] and #[inject]
The recommended way to define a runtime-aware tool is with the #[tool] macro. Use #[inject(store)], #[inject(state)], or #[inject(tool_call_id)] on parameters to receive runtime context. These injected parameters are hidden from the LLM schema. Using any #[inject] attribute automatically switches the generated impl to RuntimeAwareTool:
use std::sync::Arc;
use serde_json::{json, Value};
use synaptic::macros::tool;
use synaptic::core::{Store, SynapticError};
/// Save a note to the store.
#[tool]
async fn save_note(
/// The note key
key: String,
/// The note text
text: String,
#[inject(store)] store: Arc<dyn Store>,
) -> Result<Value, SynapticError> {
store.put(
&["notes"],
&key,
json!({"text": text}),
).await?;
Ok(json!({"saved": key}))
}
// save_note() returns Arc<dyn RuntimeAwareTool>
let tool = save_note();
The #[inject(store)] parameter receives the Arc<dyn Store> from the ToolRuntime at execution time. Only key and text appear in the JSON Schema sent to the model.
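For illustration, the parameters schema generated for save_note would look roughly like this (a sketch -- the exact JSON the macro emits may differ in detail):
use serde_json::json;
// Only the model-facing parameters appear; the injected store does not.
let expected_schema = json!({
    "type": "object",
    "properties": {
        "key":  { "type": "string", "description": "The note key" },
        "text": { "type": "string", "description": "The note text" }
    },
    "required": ["key", "text"]
});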
Using with ToolNode in a Graph
ToolNode automatically injects runtime context into registered RuntimeAwareTool instances. Register them with with_runtime_tool() and optionally attach a store with with_store():
use synaptic::graph::ToolNode;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
let registry = ToolRegistry::new();
let executor = SerialToolExecutor::new(registry);
let tool_node = ToolNode::new(executor)
.with_store(store.clone())
.with_runtime_tool(save_note()); // save_note() returns Arc<dyn RuntimeAwareTool>
When the graph executes this tool node and encounters a tool call matching "save_note", it builds a ToolRuntime populated with the current graph state, the store, and the tool call ID, then calls call_with_runtime().
RuntimeAwareToolAdapter -- Using Outside a Graph
If you need to use a RuntimeAwareTool in a context that expects the standard Tool trait (for example, with SerialToolExecutor directly), wrap it in a RuntimeAwareToolAdapter:
use std::sync::Arc;
use synaptic::core::{RuntimeAwareTool, RuntimeAwareToolAdapter, ToolRuntime};
use serde_json::json;
let tool = save_note(); // Arc<dyn RuntimeAwareTool>
let adapter = RuntimeAwareToolAdapter::new(tool);
// Optionally inject a runtime before calling
adapter.set_runtime(ToolRuntime {
store: Some(store.clone()),
stream_writer: None,
state: None,
tool_call_id: "call-1".to_string(),
config: None,
}).await;
// Now use it as a regular Tool
let result = adapter.call(json!({"key": "k", "text": "hello"})).await?;
If set_runtime() is not called before call(), the adapter uses a default empty ToolRuntime with all optional fields set to None and an empty tool_call_id.
create_react_agent with a Store
When building a ReAct agent via create_react_agent, pass a store through AgentOptions to have it automatically wired into the ToolNode for all registered runtime-aware tools:
use synaptic::graph::{create_react_agent, AgentOptions};
let graph = create_react_agent(
model,
tools,
AgentOptions {
store: Some(store),
..Default::default()
},
);
Memory
Synaptic provides session-keyed conversation memory through the MemoryStore trait and a family of memory strategies that control how conversation history is stored, trimmed, and summarized.
The MemoryStore Trait
All memory strategies implement the MemoryStore trait, which defines three async operations:
#[async_trait]
pub trait MemoryStore: Send + Sync {
async fn append(&self, session_id: &str, message: Message) -> Result<(), SynapticError>;
async fn load(&self, session_id: &str) -> Result<Vec<Message>, SynapticError>;
async fn clear(&self, session_id: &str) -> Result<(), SynapticError>;
}
- append -- adds a message to the session's history.
- load -- retrieves the conversation history for a session.
- clear -- removes all messages for a session.
Every operation is keyed by a session_id string, which isolates conversations from one another. You choose the session key (a user ID, a thread ID, a UUID -- whatever makes sense for your application).
InMemoryStore
The simplest MemoryStore implementation is InMemoryStore, which stores messages in a HashMap protected by an Arc<RwLock<_>>:
use synaptic::memory::InMemoryStore;
use synaptic::core::{MemoryStore, Message};
let store = InMemoryStore::new();
store.append("session-1", Message::human("Hello")).await?;
store.append("session-1", Message::ai("Hi there!")).await?;
let history = store.load("session-1").await?;
assert_eq!(history.len(), 2);
// Different sessions are completely isolated
let other = store.load("session-2").await?;
assert!(other.is_empty());
InMemoryStore is often used as the backing store for the higher-level memory strategies described below.
Memory Strategies
Each memory strategy wraps an underlying MemoryStore and applies a different policy when loading messages. All strategies implement MemoryStore themselves, so they are interchangeable wherever a MemoryStore is expected.
| Strategy | Behavior | When to Use |
|---|---|---|
| Buffer Memory | Keeps the entire conversation history | Short conversations where full context matters |
| Window Memory | Keeps only the last K messages | Chat UIs where older context is less relevant |
| Summary Memory | Summarizes older messages with an LLM | Very long conversations requiring compact history |
| Token Buffer Memory | Keeps recent messages within a token budget | Cost control and prompt size limits |
| Summary Buffer Memory | Hybrid -- summarizes old messages, keeps recent ones verbatim | Best balance of context and efficiency |
Auto-Managing History
For the common pattern of loading history before a chain call and saving the result afterward, Synaptic provides RunnableWithMessageHistory. It wraps any Runnable<Vec<Message>, String> and handles the load/save lifecycle automatically, keyed by a session ID in the RunnableConfig metadata.
Choosing a Strategy
- If your conversations are short (under 20 messages), Buffer Memory is the simplest choice.
- If you want predictable memory usage without an LLM call, use Window Memory or Token Buffer Memory.
- If conversations are long and you need the full context preserved in compressed form, use Summary Memory.
- If you want the best of both worlds -- exact recent messages plus a compressed summary of older history -- use Summary Buffer Memory.
Buffer Memory
ConversationBufferMemory is the simplest memory strategy. It keeps the entire conversation history, returning every message on load() with no trimming or summarization.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationBufferMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
// Create a backing store and wrap it with buffer memory
let store = Arc::new(InMemoryStore::new());
let memory = ConversationBufferMemory::new(store);
let session = "user-1";
memory.append(session, Message::human("Hello")).await?;
memory.append(session, Message::ai("Hi there!")).await?;
memory.append(session, Message::human("What is Rust?")).await?;
memory.append(session, Message::ai("Rust is a systems programming language.")).await?;
let history = memory.load(session).await?;
// Returns ALL 4 messages -- the full conversation
assert_eq!(history.len(), 4);
How It Works
ConversationBufferMemory is a thin passthrough wrapper. It delegates append(), load(), and clear() directly to the underlying MemoryStore without modification. The "strategy" here is simply: keep everything.
This makes the buffer strategy explicit and composable. By wrapping your store in ConversationBufferMemory, you signal that this particular use site intentionally stores full history, and you can later swap in a different strategy (e.g., ConversationWindowMemory) without changing the rest of your code.
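A small sketch of that swap, assuming the consuming code depends only on the MemoryStore trait:
use std::sync::Arc;
use synaptic::memory::{ConversationBufferMemory, ConversationWindowMemory, InMemoryStore};
use synaptic::core::MemoryStore;
// Callers that accept Arc<dyn MemoryStore> work with either strategy.
fn build_memory(keep_everything: bool) -> Arc<dyn MemoryStore> {
    let store = Arc::new(InMemoryStore::new());
    if keep_everything {
        Arc::new(ConversationBufferMemory::new(store))
    } else {
        Arc::new(ConversationWindowMemory::new(store, 10))
    }
}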
When to Use
Buffer memory is a good fit when:
- Conversations are short (under ~20 exchanges) and the full history fits comfortably within the model's context window.
- You need perfect recall of every message (e.g., for auditing or evaluation).
- You are prototyping and do not yet need a more sophisticated strategy.
Trade-offs
- Grows unbounded -- every message is stored and returned. For long conversations, this will eventually exceed the model's context window or cause high token costs.
- No compression -- there is no summarization or trimming, so you pay for every token in the history on every LLM call.
If unbounded growth is a concern, consider Window Memory for a fixed-size window, Token Buffer Memory for a token budget, or Summary Memory for LLM-based compression.
Window Memory
ConversationWindowMemory keeps only the most recent K messages. All messages are stored in the underlying store, but load() returns a sliding window of the last window_size messages.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationWindowMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
let store = Arc::new(InMemoryStore::new());
// Keep only the last 4 messages visible
let memory = ConversationWindowMemory::new(store, 4);
let session = "user-1";
memory.append(session, Message::human("Message 1")).await?;
memory.append(session, Message::ai("Reply 1")).await?;
memory.append(session, Message::human("Message 2")).await?;
memory.append(session, Message::ai("Reply 2")).await?;
memory.append(session, Message::human("Message 3")).await?;
memory.append(session, Message::ai("Reply 3")).await?;
let history = memory.load(session).await?;
// Only the last 4 messages are returned
assert_eq!(history.len(), 4);
assert_eq!(history[0].content(), "Message 2");
assert_eq!(history[3].content(), "Reply 3");
How It Works
- append() stores every message in the underlying MemoryStore -- nothing is discarded on write.
- load() retrieves all messages from the store, then returns only the last window_size entries. If the total number of messages is less than or equal to window_size, all messages are returned.
- clear() removes all messages from the underlying store for the given session.
The window is applied at load time, not at write time. This means the full history remains in the backing store and could be accessed directly if needed.
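A minimal sketch of that distinction, assuming you keep a clone of the backing store handle:
use std::sync::Arc;
use synaptic::memory::{ConversationWindowMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
let store = Arc::new(InMemoryStore::new());
// Pass a clone so the raw history stays directly accessible.
let memory = ConversationWindowMemory::new(store.clone(), 4);
for text in ["m1", "m2", "m3", "m4", "m5", "m6"] {
    memory.append("s", Message::human(text)).await?;
}
assert_eq!(memory.load("s").await?.len(), 4); // windowed view
assert_eq!(store.load("s").await?.len(), 6);  // full history still stored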
Choosing window_size
The window_size parameter is measured in individual messages, not pairs. A typical human/AI exchange produces 2 messages, so a window_size of 10 keeps roughly 5 turns of conversation.
Consider your model's context window when choosing a size. A window of 20 messages is usually safe for most models, while a window of 4-6 messages works well for lightweight chat UIs where only the most recent context matters.
When to Use
Window memory is a good fit when:
- You want fixed, predictable memory usage with no LLM calls for summarization.
- Older context is genuinely less relevant (e.g., a casual chatbot or customer support flow).
- You need a simple strategy that is easy to reason about.
Trade-offs
- Hard cutoff -- messages outside the window are invisible to the model. There is no summary or compressed representation of older history.
- No token awareness -- the window is measured in message count, not token count. A few long messages could still exceed the model's context window. If you need token-level control, see Token Buffer Memory.
For a strategy that preserves older context through summarization, see Summary Memory or Summary Buffer Memory.
Summary Memory
ConversationSummaryMemory uses an LLM to compress older messages into a running summary. Recent messages are kept verbatim, while everything beyond a buffer_size threshold is summarized into a single system message.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationSummaryMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message, ChatModel};
// You need a ChatModel to generate summaries
let model: Arc<dyn ChatModel> = Arc::new(my_model);
let store = Arc::new(InMemoryStore::new());
// Keep the last 4 messages verbatim; summarize older ones
let memory = ConversationSummaryMemory::new(store, model, 4);
let session = "user-1";
// As messages accumulate beyond buffer_size * 2, summarization triggers
memory.append(session, Message::human("Tell me about Rust.")).await?;
memory.append(session, Message::ai("Rust is a systems programming language...")).await?;
memory.append(session, Message::human("What about ownership?")).await?;
memory.append(session, Message::ai("Ownership is Rust's core memory model...")).await?;
// ... more messages ...
let history = memory.load(session).await?;
// If summarization has occurred, history starts with a system message
// containing the summary, followed by the most recent messages.
How It Works
- append() stores the message in the underlying store, then checks the total message count.
- When the count exceeds buffer_size * 2, the strategy splits messages into "older" and "recent" (the last buffer_size messages).
- The older messages are sent to the ChatModel with a prompt asking for a concise summary. If a previous summary already exists, it is included as context for the new summary.
- The store is cleared and repopulated with only the recent messages.
- load() returns the stored messages, prepended with a system message containing the summary text (if one exists): Summary of earlier conversation: <summary text>
- clear() removes both the stored messages and the summary for the session.
Parameters
| Parameter | Type | Description |
|---|---|---|
store | Arc<dyn MemoryStore> | The backing store for raw messages |
model | Arc<dyn ChatModel> | The LLM used to generate summaries |
buffer_size | usize | Number of recent messages to keep verbatim |
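You can exercise summarization offline with ScriptedChatModel -- a sketch, assuming a small buffer_size so the threshold is crossed quickly:
use std::sync::Arc;
use synaptic::core::{ChatResponse, MemoryStore, Message};
use synaptic::models::ScriptedChatModel;
use synaptic::memory::{ConversationSummaryMemory, InMemoryStore};
// The scripted model returns a canned summary when summarization triggers.
let summarizer = Arc::new(ScriptedChatModel::new(vec![
    ChatResponse {
        message: Message::ai("The user asked about Rust and ownership."),
        usage: None,
    },
]));
let store = Arc::new(InMemoryStore::new());
// buffer_size = 1: summarization triggers once more than 2 messages are stored.
let memory = ConversationSummaryMemory::new(store, summarizer, 1);
memory.append("test", Message::human("What is Rust?")).await?;
memory.append("test", Message::ai("Rust is a systems programming language.")).await?;
memory.append("test", Message::human("How does ownership work?")).await?;
let history = memory.load("test").await?;
// history[0] is a System message containing the scripted summary.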
When to Use
Summary memory is a good fit when:
- Conversations are very long and you need to preserve context from the entire history.
- You can afford the additional LLM call for summarization (it only triggers when the buffer overflows, not on every append).
- You want roughly constant token usage regardless of how long the conversation runs.
Trade-offs
- Lossy compression -- the summary is generated by an LLM, so specific details from older messages may be lost or distorted.
- Additional LLM cost -- each summarization step makes a separate ChatModel call. The model used for summarization can be a smaller, cheaper model than your primary model.
- Latency -- the append() call that triggers summarization will be slower than usual due to the LLM round-trip.
If you want exact recent messages with no LLM calls, use Window Memory or Token Buffer Memory. For a hybrid approach that balances exact recall of recent messages with summarized older history, see Summary Buffer Memory.
Token Buffer Memory
ConversationTokenBufferMemory keeps the most recent messages that fit within a token budget. On load(), the oldest messages are dropped until the total estimated token count is at or below max_tokens.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationTokenBufferMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message};
let store = Arc::new(InMemoryStore::new());
// Keep messages within a 200-token budget
let memory = ConversationTokenBufferMemory::new(store, 200);
let session = "user-1";
memory.append(session, Message::human("Hello!")).await?;
memory.append(session, Message::ai("Hi! How can I help?")).await?;
memory.append(session, Message::human("Tell me a long story about Rust.")).await?;
memory.append(session, Message::ai("Rust began as a personal project...")).await?;
let history = memory.load(session).await?;
// Only messages that fit within 200 estimated tokens are returned.
// Oldest messages are dropped first.
How It Works
- append() stores every message in the underlying MemoryStore without modification.
- load() retrieves all messages, estimates their total token count, and removes the oldest messages one by one until the total fits within max_tokens.
- clear() removes all messages from the underlying store for the session.
Token Estimation
Synaptic uses a simple heuristic of approximately 4 characters per token, with a minimum of 1 token per message:
fn estimate_tokens(text: &str) -> usize {
text.len() / 4 + 1
}
This is a rough approximation. Actual token counts vary by model and tokenizer. The heuristic is intentionally conservative (slightly overestimates) to avoid exceeding real token limits.
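For example, continuing the snippet above:
// 7 characters => 7 / 4 + 1 = 2 estimated tokens
assert_eq!(estimate_tokens("Hello!!"), 2);
// 120 characters => 120 / 4 + 1 = 31 estimated tokens
assert_eq!(estimate_tokens(&"x".repeat(120)), 31);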
Parameters
| Parameter | Type | Description |
|---|---|---|
store | Arc<dyn MemoryStore> | The backing store for raw messages |
max_tokens | usize | Maximum estimated tokens to return from load() |
When to Use
Token buffer memory is a good fit when:
- You need to control prompt size in token terms rather than message count.
- You want to stay within a model's context window without manually counting messages.
- You prefer a simple, no-LLM-call strategy for managing memory size.
Trade-offs
- Approximate -- the token estimate is a heuristic, not an exact count. For precise token budgeting, you would need a model-specific tokenizer.
- Hard cutoff -- dropped messages are lost entirely. There is no summary or compressed representation of older history.
- Drops whole messages -- if a single message is very long, it may consume most of the budget by itself.
For a fixed message count instead of a token budget, see Window Memory. For a strategy that preserves older context through summarization, see Summary Memory or Summary Buffer Memory.
Summary Buffer Memory
ConversationSummaryBufferMemory is a hybrid strategy that combines the strengths of Summary Memory and Token Buffer Memory. Recent messages are kept verbatim, while older messages are compressed into a running LLM-generated summary when the total estimated token count exceeds a configurable threshold.
Usage
use std::sync::Arc;
use synaptic::memory::{ConversationSummaryBufferMemory, InMemoryStore};
use synaptic::core::{MemoryStore, Message, ChatModel};
let model: Arc<dyn ChatModel> = Arc::new(my_model);
let store = Arc::new(InMemoryStore::new());
// Summarize older messages when total tokens exceed 500
let memory = ConversationSummaryBufferMemory::new(store, model, 500);
let session = "user-1";
memory.append(session, Message::human("What is Rust?")).await?;
memory.append(session, Message::ai("Rust is a systems programming language...")).await?;
memory.append(session, Message::human("How does ownership work?")).await?;
memory.append(session, Message::ai("Ownership is a set of rules...")).await?;
// ... as conversation grows and exceeds 500 estimated tokens,
// older messages are summarized automatically ...
let history = memory.load(session).await?;
// history = [System("Summary of earlier conversation: ..."), recent messages...]
How It Works
- append() stores the new message, then estimates the total token count across all stored messages.
- When the total exceeds max_token_limit and there is more than one message:
  - A split point is calculated: recent messages that fit within half the token limit are kept verbatim.
  - All messages before the split point are summarized by the ChatModel. If a previous summary exists, it is included as context.
  - The store is cleared and repopulated with only the recent messages.
- load() returns the stored messages, prepended with a system message containing the summary (if one exists): Summary of earlier conversation: <summary text>
- clear() removes both stored messages and the summary for the session.
Parameters
| Parameter | Type | Description |
|---|---|---|
store | Arc<dyn MemoryStore> | The backing store for raw messages |
model | Arc<dyn ChatModel> | The LLM used to generate summaries |
max_token_limit | usize | Token threshold that triggers summarization |
Token Estimation
Like ConversationTokenBufferMemory, this strategy estimates tokens at approximately 4 characters per token (with a minimum of 1). The same heuristic caveat applies: actual token counts will vary by model.
When to Use
Summary buffer memory is the recommended strategy when:
- Conversations are long and you need both exact recent context and compressed older context.
- You want to stay within a token budget while preserving as much information as possible.
- The additional cost of occasional LLM summarization calls is acceptable.
This is the closest equivalent to LangChain's ConversationSummaryBufferMemory and is generally the best default choice for production chatbots.
Trade-offs
- LLM cost on overflow -- summarization only triggers when the token limit is exceeded, but each summarization call adds latency and cost.
- Lossy for old messages -- details from older messages may be lost in the summary, though recent messages are always exact.
- Heuristic token counting -- the split point is based on estimated tokens, not exact counts.
Offline Testing with ScriptedChatModel
Use ScriptedChatModel to test summarization without API keys:
use std::sync::Arc;
use synaptic::core::{ChatResponse, MemoryStore, Message};
use synaptic::models::ScriptedChatModel;
use synaptic::memory::{ConversationSummaryBufferMemory, InMemoryStore};
// Script the model to return a summary when called
let summarizer = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The user asked about Rust and ownership."),
usage: None,
},
]));
let store = Arc::new(InMemoryStore::new());
let memory = ConversationSummaryBufferMemory::new(store, summarizer, 50);
let session = "test";
// Add enough messages to exceed the 50-token threshold
memory.append(session, Message::human("What is Rust?")).await?;
memory.append(session, Message::ai("Rust is a systems programming language focused on safety, speed, and concurrency.")).await?;
memory.append(session, Message::human("How does ownership work?")).await?;
memory.append(session, Message::ai("Ownership is a set of rules the compiler checks at compile time. Each value has a single owner.")).await?;
// Load -- older messages are now summarized
let history = memory.load(session).await?;
// history[0] is a System message with the summary
// Remaining messages are the most recent ones kept verbatim
For simpler alternatives, see Buffer Memory (keep everything), Window Memory (fixed message count), or Token Buffer Memory (token budget without summarization).
RunnableWithMessageHistory
RunnableWithMessageHistory wraps any Runnable<Vec<Message>, String> to automatically load conversation history before invocation and save the result afterward. This eliminates the boilerplate of manually calling memory.load() and memory.append() around every chain invocation.
Usage
use std::sync::Arc;
use synaptic::memory::{RunnableWithMessageHistory, InMemoryStore};
use synaptic::core::{MemoryStore, Message, RunnableConfig};
use synaptic::runnables::Runnable;
let store = Arc::new(InMemoryStore::new());
// `chain` is any Runnable<Vec<Message>, String>, e.g. a ChatModel pipeline
let with_history = RunnableWithMessageHistory::new(
chain.boxed(),
store,
);
// The session_id is passed via config metadata
let mut config = RunnableConfig::default();
config.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("user-42".to_string()),
);
// First invocation
let response = with_history.invoke("Hello!".to_string(), &config).await?;
// Internally:
// 1. Loads existing messages for session "user-42" (empty on first call)
// 2. Appends Message::human("Hello!") to the store and to the message list
// 3. Passes the full Vec<Message> to the inner runnable
// 4. Saves Message::ai(response) to the store
// Second invocation -- history is automatically carried forward
let response = with_history.invoke("Tell me more.".to_string(), &config).await?;
// The inner runnable now receives all 4 messages:
// [Human("Hello!"), AI(first_response), Human("Tell me more."), ...]
How It Works
RunnableWithMessageHistory implements Runnable<String, String>. On each invoke() call:
- Extract session ID -- reads session_id from config.metadata. If not present, defaults to "default".
- Load history -- calls memory.load(session_id) to retrieve existing messages.
- Append human message -- creates Message::human(input), appends it to both the in-memory list and the store.
- Invoke inner runnable -- passes the full Vec<Message> (history + new message) to the wrapped runnable.
- Save AI response -- creates Message::ai(output) and appends it to the store.
- Return -- returns the output string.
Session Isolation
Different session IDs produce completely isolated conversation histories:
let mut config_a = RunnableConfig::default();
config_a.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("alice".to_string()),
);
let mut config_b = RunnableConfig::default();
config_b.metadata.insert(
"session_id".to_string(),
serde_json::Value::String("bob".to_string()),
);
// Alice and Bob have independent conversation histories
with_history.invoke("Hi, I'm Alice.".to_string(), &config_a).await?;
with_history.invoke("Hi, I'm Bob.".to_string(), &config_b).await?;
Combining with Memory Strategies
Because RunnableWithMessageHistory takes any Arc<dyn MemoryStore>, you can pass in a memory strategy to control how history is managed:
use synaptic::memory::{ConversationWindowMemory, InMemoryStore, RunnableWithMessageHistory};
use std::sync::Arc;
let store = Arc::new(InMemoryStore::new());
let windowed = Arc::new(ConversationWindowMemory::new(store, 10));
let with_history = RunnableWithMessageHistory::new(
chain.boxed(),
windowed, // Only the last 10 messages will be loaded
);
This lets you combine automatic history management with any trimming or summarization strategy.
When to Use
Use RunnableWithMessageHistory when:
- You have a Runnable chain that takes messages and returns a string (the common pattern for chat pipelines).
- You want to avoid manually loading and saving messages around every invocation.
- You need session-based conversation management with minimal boilerplate.
Clearing History
Use MemoryStore::clear() on the underlying store to reset a session's history:
let store = Arc::new(InMemoryStore::new());
let with_history = RunnableWithMessageHistory::new(chain.boxed(), store.clone());
// After some conversation...
store.clear("user-42").await?;
// Next invocation starts fresh -- no previous messages are loaded
For lower-level control over when messages are loaded and saved, use the MemoryStore trait directly.
Graph
Synaptic provides LangGraph-style graph orchestration through the synaptic_graph crate. A StateGraph is a state machine where nodes process state and edges control the flow between nodes. This architecture supports fixed routing, conditional branching, checkpointing for persistence, human-in-the-loop interrupts, and streaming execution.
Core Concepts
| Concept | Description |
|---|---|
State trait | Defines how graph state is merged when nodes produce updates |
Node<S> trait | A processing unit that takes state and returns updated state |
StateGraph | Builder for assembling nodes and edges into a graph |
CompiledGraph | The executable graph produced by StateGraph::compile() |
Checkpointer | Trait for persisting graph state across invocations |
ToolNode | Prebuilt node that auto-dispatches tool calls from AI messages |
How It Works
- Define a state type that implements State (or use the built-in MessageState).
- Create nodes -- either by implementing the Node<S> trait or by wrapping a closure with FnNode.
- Build a graph with StateGraph::new(), adding nodes and edges.
- Call .compile() to validate the graph and produce a CompiledGraph.
- Run the graph with invoke() for a single result or stream() for per-node events.
use synaptic::graph::{StateGraph, MessageState, FnNode, END};
use synaptic::core::Message;
let greet = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello from the graph!"));
Ok(state)
});
let graph = StateGraph::new()
.add_node("greet", greet)
.set_entry_point("greet")
.add_edge("greet", END)
.compile()?;
let initial = MessageState::with_messages(vec![Message::human("Hi")]);
let result = graph.invoke(initial).await?;
assert_eq!(result.messages.len(), 2);
Guides
- State & Nodes -- define state types and processing nodes
- Edges -- connect nodes with fixed and conditional edges
- Graph Streaming -- consume per-node events during execution (single and multi-mode)
- Checkpointing -- persist and resume graph state
- Human-in-the-Loop -- interrupt execution for human review
- Tool Node -- auto-dispatch tool calls from AI messages
- Visualization -- render graphs as Mermaid, ASCII, DOT, or PNG
Advanced Features
Node Caching
Use add_node_with_cache() to cache node results based on input state. Cached entries expire after the specified TTL:
use synaptic::graph::{StateGraph, CachePolicy, END};
use std::time::Duration;
let graph = StateGraph::new()
.add_node_with_cache(
"expensive",
expensive_node,
CachePolicy::new(Duration::from_secs(300)),
)
.add_edge("expensive", END)
.set_entry_point("expensive")
.compile()?;
When the same input state is seen again within the TTL, the cached result is returned without re-executing the node.
Deferred Nodes
Use add_deferred_node() to create nodes that wait for ALL incoming paths to complete before executing. This is useful for fan-in aggregation after parallel fan-out with Send:
let graph = StateGraph::new()
.add_node("branch_a", node_a)
.add_node("branch_b", node_b)
.add_deferred_node("aggregate", aggregator_node)
.add_edge("branch_a", "aggregate")
.add_edge("branch_b", "aggregate")
.add_edge("aggregate", END)
.set_entry_point("branch_a")
.compile()?;
Structured Output (response_format)
When creating an agent with create_agent(), set response_format in AgentOptions to force the final response into a specific JSON schema:
use synaptic::graph::{create_agent, AgentOptions};
let graph = create_agent(model, tools, AgentOptions {
response_format: Some(serde_json::json!({
"type": "object",
"properties": {
"answer": { "type": "string" },
"confidence": { "type": "number" }
},
"required": ["answer", "confidence"]
})),
..Default::default()
})?;
When the agent produces its final answer (no tool calls), it re-calls the model with structured output instructions matching the schema.
State & Nodes
Graphs in Synaptic operate on a state value that flows through nodes. Each node receives the current state, processes it, and returns an updated state. The State trait defines how states are merged, and the Node<S> trait defines how nodes process state.
The State Trait
Any type used as graph state must implement the State trait:
pub trait State: Clone + Send + Sync + 'static {
/// Merge another state into this one (reducer pattern).
fn merge(&mut self, other: Self);
}
The merge() method is called when combining state updates -- for example, when update_state() is used during human-in-the-loop flows. The merge semantics are up to you: append, replace, or any custom logic.
MessageState -- The Built-in State
For the common case of conversational agents, Synaptic provides MessageState:
use synaptic::graph::MessageState;
use synaptic::core::Message;
// Create an empty state
let state = MessageState::new();
// Create with initial messages
let state = MessageState::with_messages(vec![
Message::human("Hello"),
Message::ai("Hi there!"),
]);
// Access the last message
if let Some(msg) = state.last_message() {
println!("Last: {}", msg.content());
}
MessageState implements State by appending messages on merge:
fn merge(&mut self, other: Self) {
self.messages.extend(other.messages);
}
This append-only behavior is the right default for conversational workflows where each node adds new messages to the history.
Custom State
You can define your own state type for non-conversational graphs:
use synaptic::graph::State;
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
struct PipelineState {
input: String,
steps_completed: Vec<String>,
result: Option<String>,
}
impl State for PipelineState {
fn merge(&mut self, other: Self) {
self.steps_completed.extend(other.steps_completed);
if other.result.is_some() {
self.result = other.result;
}
}
}
If you plan to use checkpointing, your state must also implement Serialize and Deserialize.
The Node<S> Trait
A node is any type that implements Node<S>:
use async_trait::async_trait;
use synaptic::core::SynapticError;
use synaptic::graph::{Node, NodeOutput, MessageState};
use synaptic::core::Message;
struct GreeterNode;
#[async_trait]
impl Node<MessageState> for GreeterNode {
async fn process(&self, mut state: MessageState) -> Result<NodeOutput<MessageState>, SynapticError> {
state.messages.push(Message::ai("Hello! How can I help?"));
Ok(state.into()) // NodeOutput::State(state)
}
}
Nodes return NodeOutput<S>, which is an enum:
- NodeOutput::State(S) -- a regular state update (existing behavior). The From<S> impl lets you write Ok(state.into()).
- NodeOutput::Command(Command<S>) -- a control flow command (goto, interrupt, fan-out). See Human-in-the-Loop for interrupt examples.
Nodes are Send + Sync, so they can safely hold shared references (e.g., Arc<dyn ChatModel>) and be used across async tasks.
FnNode -- Closure-based Nodes
For simple logic, FnNode wraps an async closure as a node without defining a separate struct:
use synaptic::graph::{FnNode, MessageState};
use synaptic::core::Message;
let greeter = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello from a closure!"));
Ok(state.into())
});
FnNode accepts any function with the signature Fn(S) -> Future<Output = Result<NodeOutput<S>, SynapticError>> where S: State.
Adding Nodes to a Graph
Nodes are added to a StateGraph with a string name. The name is used to reference the node in edges and conditional routing:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;
let node_a = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step A"));
Ok(state.into())
});
let node_b = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step B"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("a", node_a)
.add_node("b", node_b)
.set_entry_point("a")
.add_edge("a", "b")
.add_edge("b", END)
.compile()?;
Both struct-based nodes (implementing Node<S>) and FnNode closures can be passed to add_node() interchangeably.
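For example, a graph can combine the GreeterNode struct from above with a closure node (a minimal sketch):
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;

let follow_up = FnNode::new(|mut state: MessageState| async move {
    state.messages.push(Message::ai("Anything else I can help with?"));
    Ok(state.into())
});

let graph = StateGraph::new()
    .add_node("greet", GreeterNode)      // struct-based node
    .add_node("follow_up", follow_up)    // closure-based node
    .set_entry_point("greet")
    .add_edge("greet", "follow_up")
    .add_edge("follow_up", END)
    .compile()?;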
Edges
Edges define the flow of execution between nodes in a graph. Synaptic supports two kinds of edges: fixed edges that always route to the same target, and conditional edges that route dynamically based on the current state.
Fixed Edges
A fixed edge unconditionally routes execution from one node to another:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;
let node_a = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step A"));
Ok(state.into())
});
let node_b = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step B"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("a", node_a)
.add_node("b", node_b)
.set_entry_point("a")
.add_edge("a", "b") // a always flows to b
.add_edge("b", END) // b always flows to END
.compile()?;
Use the END constant to indicate that a node terminates the graph. Every execution path must eventually reach END; otherwise, the graph will hit the 100-iteration safety limit.
Entry Point
Every graph requires an entry point -- the first node to execute:
let graph = StateGraph::new()
.add_node("start", my_node)
.set_entry_point("start") // required
// ...
Calling .compile() without setting an entry point returns an error.
Conditional Edges
Conditional edges route execution based on a function that inspects the current state and returns the name of the next node:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::Message;
let router = FnNode::new(|state: MessageState| async move {
Ok(state.into()) // routing logic is in the edge, not the node
});
let handle_greeting = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Hello!"));
Ok(state.into())
});
let handle_question = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Let me look that up."));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("router", router)
.add_node("greeting", handle_greeting)
.add_node("question", handle_question)
.set_entry_point("router")
.add_conditional_edges("router", |state: &MessageState| {
let last = state.last_message().map(|m| m.content().to_string());
match last.as_deref() {
Some("hi") | Some("hello") => "greeting".to_string(),
_ => "question".to_string(),
}
})
.add_edge("greeting", END)
.add_edge("question", END)
.compile()?;
The router function receives an immutable reference to the state (&S) and returns a String -- the name of the next node to execute (or END to terminate).
Conditional Edges with Path Map
For graph visualization, you can provide a path_map that enumerates the possible routing targets. This gives visualization tools (Mermaid, DOT, ASCII) the information they need to draw all possible paths:
use std::collections::HashMap;
use synaptic::graph::{StateGraph, MessageState, END};
let graph = StateGraph::new()
.add_node("router", router_node)
.add_node("path_a", node_a)
.add_node("path_b", node_b)
.set_entry_point("router")
.add_conditional_edges_with_path_map(
"router",
|state: &MessageState| {
if state.messages.len() > 3 {
"path_a".to_string()
} else {
"path_b".to_string()
}
},
HashMap::from([
("path_a".to_string(), "path_a".to_string()),
("path_b".to_string(), "path_b".to_string()),
]),
)
.add_edge("path_a", END)
.add_edge("path_b", END)
.compile()?;
The path_map is a HashMap<String, String> where keys are labels and values are target node names. The compile step validates that all path map targets reference existing nodes (or END).
Validation
When you call .compile(), the graph validates:
- An entry point is set and refers to an existing node.
- Every fixed edge source and target refers to an existing node (or END).
- Every conditional edge source refers to an existing node.
- All path_map targets refer to existing nodes (or END).
If any validation fails, compile() returns a SynapticError::Graph with a descriptive message.
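A quick sketch of handling a validation failure (here the edge targets a node that was never added):
use synaptic::graph::{StateGraph, FnNode, MessageState};

let node = FnNode::new(|state: MessageState| async move { Ok(state.into()) });

let result = StateGraph::new()
    .add_node("only", node)
    .set_entry_point("only")
    .add_edge("only", "missing") // "missing" was never added -- validation fails
    .compile();

if let Err(e) = result {
    eprintln!("graph validation failed: {e}");
}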
Graph Streaming
Instead of waiting for the entire graph to finish, you can stream execution and receive a GraphEvent after each node completes. This is useful for progress reporting, real-time UIs, and debugging.
stream() and StreamMode
The stream() method on CompiledGraph returns a GraphStream -- a Pin<Box<dyn Stream>> that yields Result<GraphEvent<S>, SynapticError> values:
use synaptic::graph::{StateGraph, FnNode, MessageState, StreamMode, GraphEvent, END};
use synaptic::core::Message;
use futures::StreamExt;
let step_a = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step A done"));
Ok(state.into())
});
let step_b = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Step B done"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("a", step_a)
.add_node("b", step_b)
.set_entry_point("a")
.add_edge("a", "b")
.add_edge("b", END)
.compile()?;
let initial = MessageState::with_messages(vec![Message::human("Start")]);
let mut stream = graph.stream(initial, StreamMode::Values);
while let Some(event) = stream.next().await {
let event: GraphEvent<MessageState> = event?;
println!(
"Node '{}' completed -- {} messages in state",
event.node,
event.state.messages.len()
);
}
// Output:
// Node 'a' completed -- 2 messages in state
// Node 'b' completed -- 3 messages in state
GraphEvent
Each event contains:
| Field | Type | Description |
|---|---|---|
node | String | The name of the node that just executed |
state | S | The state snapshot after the node ran |
Stream Modes
The StreamMode enum controls what the state field contains:
| Mode | Behavior |
|---|---|
StreamMode::Values | Each event contains the full accumulated state after the node |
StreamMode::Updates | Each event contains the pre-node state (useful for computing per-node deltas) |
StreamMode::Messages | Same as Values -- callers filter for AI messages in chat UIs |
StreamMode::Debug | Same as Values -- intended for detailed debug information |
StreamMode::Custom | Events emitted via StreamWriter during node execution |
Multi-Mode Streaming
You can request multiple stream modes simultaneously using stream_modes(). Each event is wrapped in a MultiGraphEvent tagged with its mode:
use synaptic::graph::{StreamMode, MultiGraphEvent};
use futures::StreamExt;
let mut stream = graph.stream_modes(
initial_state,
vec![StreamMode::Values, StreamMode::Updates],
);
while let Some(result) = stream.next().await {
let event: MultiGraphEvent<MessageState> = result?;
match event.mode {
StreamMode::Values => {
println!("Full state after '{}': {:?}", event.event.node, event.event.state);
}
StreamMode::Updates => {
println!("State before '{}': {:?}", event.event.node, event.event.state);
}
_ => {}
}
}
For each node execution, one event per requested mode is emitted. With two modes and three nodes, you get six events total.
Streaming with Checkpoints
You can combine streaming with checkpointing using stream_with_config():
use synaptic::graph::{MemorySaver, CheckpointConfig, StreamMode};
use std::sync::Arc;
let checkpointer = Arc::new(MemorySaver::new());
let graph = graph.with_checkpointer(checkpointer);
let config = CheckpointConfig::new("thread-1");
let mut stream = graph.stream_with_config(
initial_state,
StreamMode::Values,
Some(config),
);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node: {}", event.node);
}
Checkpoints are saved after each node during streaming, just as they are during invoke(). If the graph is interrupted (via interrupt_before or interrupt_after), the stream yields the interrupt error and terminates.
Error Handling
The stream yields Result values. If a node returns an error, the stream yields that error and terminates. Consuming code should handle both successful events and errors:
while let Some(result) = stream.next().await {
match result {
Ok(event) => println!("Node '{}' succeeded", event.node),
Err(e) => {
eprintln!("Graph error: {e}");
break;
}
}
}
Checkpointing
Checkpointing persists graph state between invocations, enabling resumable execution, multi-turn conversations over a graph, and human-in-the-loop workflows. The Checkpointer trait abstracts the storage backend, and MemorySaver provides an in-memory implementation for development and testing.
The Checkpointer Trait
#[async_trait]
pub trait Checkpointer: Send + Sync {
async fn put(&self, config: &CheckpointConfig, checkpoint: &Checkpoint) -> Result<(), SynapticError>;
async fn get(&self, config: &CheckpointConfig) -> Result<Option<Checkpoint>, SynapticError>;
async fn list(&self, config: &CheckpointConfig) -> Result<Vec<Checkpoint>, SynapticError>;
}
A Checkpoint stores the serialized state and the name of the next node to execute:
pub struct Checkpoint {
pub state: serde_json::Value,
pub next_node: Option<String>,
}
MemorySaver
MemorySaver is the built-in in-memory checkpointer. It stores checkpoints in a HashMap keyed by thread ID:
use synaptic::graph::MemorySaver;
use std::sync::Arc;
let checkpointer = Arc::new(MemorySaver::new());
For production use, you would implement Checkpointer with a persistent backend (database, Redis, file system, etc.).
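As a rough skeleton, a persistent implementation only needs to map a thread's checkpoints onto your backend. The import paths below are assumptions; adapt them to the actual module layout:
use async_trait::async_trait;
use synaptic::core::SynapticError;
use synaptic::graph::{Checkpoint, CheckpointConfig, Checkpointer};

struct MyDbSaver {
    // connection pool, file handle, etc.
}

#[async_trait]
impl Checkpointer for MyDbSaver {
    async fn put(&self, config: &CheckpointConfig, checkpoint: &Checkpoint) -> Result<(), SynapticError> {
        // Append the checkpoint (state + next_node) under this thread's ID.
        todo!("write checkpoint.state and checkpoint.next_node to your backend")
    }

    async fn get(&self, config: &CheckpointConfig) -> Result<Option<Checkpoint>, SynapticError> {
        // Return the most recent checkpoint for this thread, if any.
        todo!("read the latest checkpoint from your backend")
    }

    async fn list(&self, config: &CheckpointConfig) -> Result<Vec<Checkpoint>, SynapticError> {
        // Return the full checkpoint history, oldest to newest.
        todo!("read the checkpoint history from your backend")
    }
}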
Attaching a Checkpointer
After compiling a graph, attach a checkpointer with .with_checkpointer():
use synaptic::graph::{StateGraph, FnNode, MessageState, MemorySaver, END};
use synaptic::core::Message;
use std::sync::Arc;
let node = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Processed"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("process", node)
.set_entry_point("process")
.add_edge("process", END)
.compile()?
.with_checkpointer(Arc::new(MemorySaver::new()));
CheckpointConfig
A CheckpointConfig identifies a thread (conversation) for checkpointing:
use synaptic::graph::CheckpointConfig;
let config = CheckpointConfig::new("thread-1");
The thread_id string isolates different conversations. Each thread maintains its own checkpoint history.
Invoking with Checkpoints
Use invoke_with_config() to run the graph with checkpointing enabled:
let config = CheckpointConfig::new("thread-1");
let initial = MessageState::with_messages(vec![Message::human("Hello")]);
let result = graph.invoke_with_config(initial, Some(config.clone())).await?;
After each node executes, the current state and next node are saved to the checkpointer. On subsequent invocations with the same CheckpointConfig, the graph resumes from the last checkpoint.
Retrieving State
You can inspect the current state saved for a thread:
// Get the latest state for a thread
if let Some(state) = graph.get_state(&config).await? {
println!("Messages: {}", state.messages.len());
}
// Get the full checkpoint history (oldest to newest)
let history = graph.get_state_history(&config).await?;
for (state, next_node) in &history {
println!(
"State with {} messages, next node: {:?}",
state.messages.len(),
next_node
);
}
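Putting the two together, a multi-turn conversation can reuse the same thread across invocations. A sketch (how new input is merged with the restored checkpoint follows your State::merge implementation):
let config = CheckpointConfig::new("support-chat-7");

// Turn 1
let turn1 = MessageState::with_messages(vec![Message::human("Hi, I need some help.")]);
graph.invoke_with_config(turn1, Some(config.clone())).await?;

// Turn 2 -- resumes from the checkpoint saved at the end of turn 1
let turn2 = MessageState::with_messages(vec![Message::human("It's about my last invoice.")]);
graph.invoke_with_config(turn2, Some(config.clone())).await?;

// The saved state for the thread now reflects both turns
if let Some(saved) = graph.get_state(&config).await? {
    println!("history so far: {} messages", saved.messages.len());
}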
State Serialization
Checkpointing requires your state type to implement Serialize and Deserialize (from serde). The built-in MessageState already has these derives. For custom state types, add the derives:
use serde::{Serialize, Deserialize};
use synaptic::graph::State;
#[derive(Clone, Serialize, Deserialize)]
struct MyState {
data: Vec<String>,
}
impl State for MyState {
fn merge(&mut self, other: Self) {
self.data.extend(other.data);
}
}
Human-in-the-Loop
Human-in-the-loop (HITL) allows you to pause graph execution at specific points, giving a human the opportunity to review, approve, or modify the state before the graph continues. Synaptic supports two approaches:
- interrupt_before / interrupt_after -- declarative interrupts on the StateGraph builder.
- interrupt() function -- programmatic interrupts inside nodes via Command.
Both require a checkpointer to persist state for later resumption.
Interrupt Before and After
The StateGraph builder provides two interrupt modes:
- interrupt_before(nodes) -- pause execution before the named nodes run.
- interrupt_after(nodes) -- pause execution after the named nodes run.
Example: Approval Before Tool Execution
A common pattern is to interrupt before a tool execution node so a human can review the tool calls the agent proposed:
use synaptic::graph::{StateGraph, FnNode, MessageState, MemorySaver, CheckpointConfig, END};
use synaptic::core::Message;
use std::sync::Arc;
let agent_node = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("I want to call the delete_file tool."));
Ok(state.into())
});
let tool_node = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::tool("File deleted.", "call-1"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_edge("agent", "tools")
.add_edge("tools", END)
// Pause before the tools node executes
.interrupt_before(vec!["tools".to_string()])
.compile()?
.with_checkpointer(Arc::new(MemorySaver::new()));
let config = CheckpointConfig::new("thread-1");
let initial = MessageState::with_messages(vec![Message::human("Delete old logs")]);
Step 1: First Invocation -- Interrupt
The first invoke_with_config() runs the agent node, then stops before tools:
let result = graph.invoke_with_config(initial, Some(config.clone())).await?;
// Returns GraphResult::Interrupted
assert!(result.is_interrupted());
// You can inspect the interrupt value
if let Some(iv) = result.interrupt_value() {
println!("Interrupted: {iv}");
}
At this point, the checkpointer has saved the state after agent ran, with tools as the next node.
Step 2: Human Review
The human can inspect the saved state to review what the agent proposed:
if let Some(state) = graph.get_state(&config).await? {
for msg in &state.messages {
println!("[{}] {}", msg.role(), msg.content());
}
}
Step 3: Update State (Optional)
If the human wants to modify the state before resuming -- for example, to add an approval message or to change the tool call -- use update_state():
let approval = MessageState::with_messages(vec![
Message::human("Approved -- go ahead and delete."),
]);
graph.update_state(&config, approval).await?;
update_state() loads the current checkpoint, calls State::merge() with the provided update, and saves the merged result back to the checkpointer.
Step 4: Resume Execution
Resume the graph by calling invoke_with_config() again with the same config and a default (empty) state. The graph loads the checkpoint and continues from the interrupted node:
let result = graph
.invoke_with_config(MessageState::default(), Some(config))
.await?;
// The graph executed "tools" and reached END
let state = result.into_state();
println!("Final messages: {}", state.messages.len());
Programmatic Interrupt with interrupt()
For more control, nodes can call the interrupt() function to pause execution with a custom value. This is useful when the decision to interrupt depends on runtime state:
use synaptic::graph::{interrupt, Node, NodeOutput, MessageState};
struct ApprovalNode;
#[async_trait]
impl Node<MessageState> for ApprovalNode {
async fn process(&self, state: MessageState) -> Result<NodeOutput<MessageState>, SynapticError> {
// Check if any tool call is potentially dangerous
if let Some(msg) = state.last_message() {
for call in msg.tool_calls() {
if call.name == "delete_file" {
// Interrupt and ask for approval
return Ok(interrupt(serde_json::json!({
"question": "Approve file deletion?",
"tool_call": call.name,
})));
}
}
}
// No dangerous calls -- continue normally
Ok(state.into())
}
}
The caller receives a GraphResult::Interrupted with the interrupt value:
let result = graph.invoke_with_config(state, Some(config.clone())).await?;
if result.is_interrupted() {
let question = result.interrupt_value().unwrap();
println!("Agent asks: {}", question["question"]);
}
Dynamic Routing with Command
Nodes can also use Command to override the normal edge-based routing:
use synaptic::graph::{Command, NodeOutput};
// Route to a specific node, skipping normal edges
Ok(NodeOutput::Command(Command::goto("summary")))
// Route to a specific node with a state update
Ok(NodeOutput::Command(Command::goto_with_update("next", delta_state)))
// End the graph immediately
Ok(NodeOutput::Command(Command::end()))
// Update state without overriding routing
Ok(NodeOutput::Command(Command::update(delta_state)))
interrupt_after
interrupt_after works the same way, but the specified node runs before the interrupt. This is useful when you want to see the node's output before deciding whether to continue:
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_edge("agent", "tools")
.add_edge("tools", END)
// Interrupt after the agent node runs (to review its output)
.interrupt_after(vec!["agent".to_string()])
.compile()?
.with_checkpointer(Arc::new(MemorySaver::new()));
GraphResult
graph.invoke() returns Result<GraphResult<S>, SynapticError>. GraphResult is an enum:
- GraphResult::Complete(state) -- graph ran to END normally.
- GraphResult::Interrupted { state, interrupt_value } -- graph paused.
Key methods:
| Method | Description |
|---|---|
is_complete() | Returns true if the graph completed normally |
is_interrupted() | Returns true if the graph was interrupted |
state() | Borrow the state (regardless of completion/interrupt) |
into_state() | Consume and return the state |
interrupt_value() | Returns Some(&Value) if interrupted, None otherwise |
Notes
- Interrupts require a checkpointer. Without one, the graph cannot save state for resumption.
- interrupt_before / interrupt_after return GraphResult::Interrupted (not an error).
- Programmatic interrupt() also returns GraphResult::Interrupted with the value you pass.
- You can interrupt at multiple nodes by passing multiple names to interrupt_before() or interrupt_after().
- You can combine interrupt_before and interrupt_after on different nodes in the same graph.
Command & Routing
Command<S> gives nodes dynamic control over graph execution, allowing them to override edge-based routing, update state, fan out to multiple nodes, or terminate early. Use it when routing decisions depend on runtime state.
Nodes return NodeOutput<S> -- either NodeOutput::State(S) for a regular state update (via Ok(state.into())), or NodeOutput::Command(Command<S>) for dynamic control flow.
Command Constructors
| Constructor | Behavior |
|---|---|
Command::goto("node") | Route to a specific node, skipping normal edges |
Command::goto_with_update("node", delta) | Route to a node and merge delta into state |
Command::update(delta) | Merge delta into state, then follow normal routing |
Command::end() | Terminate the graph immediately |
Command::send(targets) | Fan-out to multiple nodes via Send |
Command::resume(value) | Resume from a previous interrupt (see Interrupt & Resume) |
Conditional Routing with goto
A "triage" node inspects the input and routes to different handlers:
use synaptic::graph::{Command, FnNode, NodeOutput, State, StateGraph, END};
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct TicketState {
category: String,
resolved: bool,
}
impl State for TicketState {
fn merge(&mut self, other: Self) {
if !other.category.is_empty() { self.category = other.category; }
self.resolved = self.resolved || other.resolved;
}
}
let triage = FnNode::new(|state: TicketState| async move {
let target = if state.category == "billing" {
"billing_handler"
} else {
"support_handler"
};
Ok(NodeOutput::Command(Command::goto(target)))
});
let billing = FnNode::new(|mut state: TicketState| async move {
state.resolved = true;
Ok(state.into())
});
let support = FnNode::new(|mut state: TicketState| async move {
state.resolved = true;
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("triage", triage)
.add_node("billing_handler", billing)
.add_node("support_handler", support)
.set_entry_point("triage")
.add_edge("billing_handler", END)
.add_edge("support_handler", END)
.compile()?;
let result = graph.invoke(TicketState {
category: "billing".into(),
resolved: false,
}).await?.into_state();
assert!(result.resolved);
Routing with State Update
goto_with_update routes and merges a state delta in one step. The delta is merged via State::merge() before the target node runs:
Ok(NodeOutput::Command(Command::goto_with_update("escalation", delta)))
Update Without Routing
Command::update(delta) merges state but follows normal edges. Useful when a node contributes a partial update without overriding the next step:
Ok(NodeOutput::Command(Command::update(delta)))
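For instance, a classifier node could contribute just the category and let the existing edges decide what runs next (a sketch using the TicketState defined above):
let classifier = FnNode::new(|_state: TicketState| async move {
    let delta = TicketState { category: "billing".into(), resolved: false };
    // Merge the delta into the running state, then follow the graph's normal edges.
    Ok(NodeOutput::Command(Command::update(delta)))
});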
Early Termination
Command::end() stops the graph immediately. No further nodes execute:
let guard = FnNode::new(|state: TicketState| async move {
if state.category == "spam" {
return Ok(NodeOutput::Command(Command::end()));
}
Ok(state.into())
});
Fan-Out with Send
Command::send() dispatches work to multiple targets. Each Send carries a node name and a JSON payload:
use synaptic::graph::Send;
let targets = vec![
Send::new("worker", serde_json::json!({"chunk": "part1"})),
Send::new("worker", serde_json::json!({"chunk": "part2"})),
];
Ok(NodeOutput::Command(Command::send(targets)))
Note: Full parallel fan-out is not yet implemented. Targets are currently processed sequentially.
Commands in Streaming Mode
Commands work identically when streaming. If node "a" issues Command::goto("c"), the stream yields events for "a" and "c" but skips "b", even if an a -> b edge exists.
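A minimal sketch for a hypothetical three-node graph where node "a" returns Command::goto("c"):
use futures::StreamExt;
use synaptic::graph::StreamMode;

let mut stream = graph.stream(initial_state, StreamMode::Values);
while let Some(event) = stream.next().await {
    println!("ran: {}", event?.node); // prints "a" then "c" -- "b" never runs
}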
Interrupt & Resume
interrupt(value) pauses graph execution and returns control to the caller with a JSON value, enabling human-in-the-loop workflows where a node decides at runtime whether to pause. A checkpointer is required to persist state for later resumption.
For declarative interrupts (interrupt_before/interrupt_after), see Human-in-the-Loop.
The interrupt() Function
use synaptic::graph::{interrupt, Node, NodeOutput, MessageState};
use synaptic::core::SynapticError;
use async_trait::async_trait;
struct ApprovalGate;
#[async_trait]
impl Node<MessageState> for ApprovalGate {
async fn process(
&self,
state: MessageState,
) -> Result<NodeOutput<MessageState>, SynapticError> {
if let Some(msg) = state.last_message() {
for call in msg.tool_calls() {
if call.name == "delete_database" {
return Ok(interrupt(serde_json::json!({
"question": "Approve database deletion?",
"tool_call": call.name,
})));
}
}
}
Ok(state.into()) // continue normally
}
}
Detecting Interrupts with GraphResult
graph.invoke() returns GraphResult<S> -- either Complete(state) or Interrupted { state, interrupt_value }:
let result = graph.invoke_with_config(state, Some(config.clone())).await?;
if result.is_interrupted() {
println!("Paused: {}", result.interrupt_value().unwrap());
} else {
println!("Done: {:?}", result.into_state());
}
Full Round-Trip Example
use std::sync::Arc;
use serde::{Serialize, Deserialize};
use serde_json::json;
use synaptic::graph::{
interrupt, CheckpointConfig, FnNode, MemorySaver,
NodeOutput, State, StateGraph, END,
};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct ReviewState {
proposal: String,
approved: bool,
done: bool,
}
impl State for ReviewState {
fn merge(&mut self, other: Self) {
if !other.proposal.is_empty() { self.proposal = other.proposal; }
self.approved = self.approved || other.approved;
self.done = self.done || other.done;
}
}
let propose = FnNode::new(|mut state: ReviewState| async move {
state.proposal = "Delete all temporary files".into();
Ok(state.into())
});
let gate = FnNode::new(|state: ReviewState| async move {
Ok(interrupt(json!({"question": "Approve?", "proposal": state.proposal})))
});
let execute = FnNode::new(|mut state: ReviewState| async move {
state.done = true;
Ok(state.into())
});
let saver = Arc::new(MemorySaver::new());
let graph = StateGraph::new()
.add_node("propose", propose)
.add_node("gate", gate)
.add_node("execute", execute)
.set_entry_point("propose")
.add_edge("propose", "gate")
.add_edge("gate", "execute")
.add_edge("execute", END)
.compile()?
.with_checkpointer(saver);
let config = CheckpointConfig::new("review-thread");
// Step 1: Invoke -- graph pauses at the gate
let result = graph
.invoke_with_config(ReviewState::default(), Some(config.clone()))
.await?;
assert!(result.is_interrupted());
// Step 2: Review saved state
let saved = graph.get_state(&config).await?.unwrap();
println!("Proposal: {}", saved.proposal);
// Step 3: Optionally update state before resuming
graph.update_state(&config, ReviewState {
proposal: String::new(), approved: true, done: false,
}).await?;
// Step 4: Resume execution
let result = graph
.invoke_with_config(ReviewState::default(), Some(config))
.await?;
assert!(result.is_complete());
assert!(result.into_state().done);
Notes
- Checkpointer required. Without one, state cannot be saved between interrupt and resume. MemorySaver works for development; implement Checkpointer for production.
- State is not merged on interrupt. When a node returns interrupt(), the node's state update is not applied -- only state from previously executed nodes is preserved.
- Command::resume(value) passes a value to the graph on resumption, available via the command's resume_value field.
- State history. Call graph.get_state_history(&config) to inspect all checkpoints for a thread.
Node Caching
CachePolicy paired with add_node_with_cache() enables hash-based result caching on individual graph nodes. When the same serialized input state is seen within the TTL window, the cached output is returned without re-executing the node. Use this for expensive nodes (LLM calls, API requests) where identical inputs produce identical outputs.
Setup
use std::time::Duration;
use synaptic::graph::{CachePolicy, FnNode, StateGraph, MessageState, END};
use synaptic::core::Message;
let expensive = FnNode::new(|mut state: MessageState| async move {
state.messages.push(Message::ai("Expensive result"));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node_with_cache(
"llm_call",
expensive,
CachePolicy::new(Duration::from_secs(60)),
)
.add_edge("llm_call", END)
.set_entry_point("llm_call")
.compile()?;
How It Works
- Before executing a cached node, the graph serializes the current state to JSON and computes a hash.
- If the cache contains a valid (non-expired) entry for that
(node_name, state_hash), the cachedNodeOutputis returned immediately --process()is not called. - On a cache miss, the node executes normally and the result is stored.
The cache is held in Arc<RwLock<HashMap>> inside CompiledGraph, persisting across multiple invoke() calls on the same instance.
Example: Verifying Cache Hits
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;
use async_trait::async_trait;
use serde::{Serialize, Deserialize};
use synaptic::core::SynapticError;
use synaptic::graph::{CachePolicy, Node, NodeOutput, State, StateGraph, END};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct MyState { counter: usize }
impl State for MyState {
fn merge(&mut self, other: Self) { self.counter += other.counter; }
}
struct TrackedNode { call_count: Arc<AtomicUsize> }
#[async_trait]
impl Node<MyState> for TrackedNode {
async fn process(&self, mut state: MyState) -> Result<NodeOutput<MyState>, SynapticError> {
self.call_count.fetch_add(1, Ordering::SeqCst);
state.counter += 1;
Ok(state.into())
}
}
let calls = Arc::new(AtomicUsize::new(0));
let graph = StateGraph::new()
.add_node_with_cache("n", TrackedNode { call_count: calls.clone() },
CachePolicy::new(Duration::from_secs(60)))
.add_edge("n", END)
.set_entry_point("n")
.compile()?;
// First call: cache miss
graph.invoke(MyState::default()).await?;
assert_eq!(calls.load(Ordering::SeqCst), 1);
// Same input: cache hit -- node not called
graph.invoke(MyState::default()).await?;
assert_eq!(calls.load(Ordering::SeqCst), 1);
// Different input: cache miss
graph.invoke(MyState { counter: 5 }).await?;
assert_eq!(calls.load(Ordering::SeqCst), 2);
TTL Expiry
Cached entries expire after the configured TTL. The next call with the same input re-executes the node:
let graph = StateGraph::new()
.add_node_with_cache("n", my_node,
CachePolicy::new(Duration::from_millis(100)))
.add_edge("n", END)
.set_entry_point("n")
.compile()?;
graph.invoke(state.clone()).await?; // executes
tokio::time::sleep(Duration::from_millis(150)).await;
graph.invoke(state.clone()).await?; // executes again
Mixing Cached and Uncached Nodes
Only nodes added with add_node_with_cache() are cached. Nodes added with add_node() always execute:
let graph = StateGraph::new()
.add_node_with_cache("llm", llm_node, CachePolicy::new(Duration::from_secs(300)))
.add_node("format", format_node) // always runs
.set_entry_point("llm")
.add_edge("llm", "format")
.add_edge("format", END)
.compile()?;
Notes
- State must implement Serialize. The cache key is a hash of the JSON-serialized state.
- Cache scope. The cache lives on the CompiledGraph instance. A new compile() starts with an empty cache.
- Works with Commands. Cached entries store the full NodeOutput, including Command variants.
Deferred Nodes
add_deferred_node() registers a node that is intended to wait until all incoming edges have been traversed before executing. Use deferred nodes as fan-in aggregation points after parallel fan-out with Command::send(), where multiple upstream branches must complete before the aggregator runs.
Adding a Deferred Node
Use add_deferred_node() on StateGraph instead of add_node():
use synaptic::graph::{FnNode, State, StateGraph, END};
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
struct AggState { values: Vec<String> }
impl State for AggState {
fn merge(&mut self, other: Self) { self.values.extend(other.values); }
}
let worker_a = FnNode::new(|mut state: AggState| async move {
state.values.push("from_a".into());
Ok(state.into())
});
let worker_b = FnNode::new(|mut state: AggState| async move {
state.values.push("from_b".into());
Ok(state.into())
});
let aggregator = FnNode::new(|state: AggState| async move {
println!("Collected {} results", state.values.len());
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("worker_a", worker_a)
.add_node("worker_b", worker_b)
.add_deferred_node("aggregator", aggregator)
.add_edge("worker_a", "aggregator")
.add_edge("worker_b", "aggregator")
.add_edge("aggregator", END)
.set_entry_point("worker_a")
.compile()?;
Querying Deferred Status
After compiling, check whether a node is deferred with is_deferred():
assert!(graph.is_deferred("aggregator"));
assert!(!graph.is_deferred("worker_a"));
Counting Incoming Edges
incoming_edge_count() returns the total number of fixed and conditional edges targeting a node. Use it to validate that a deferred node has the expected number of upstream dependencies:
assert_eq!(graph.incoming_edge_count("aggregator"), 2);
assert_eq!(graph.incoming_edge_count("worker_a"), 0);
The count includes fixed edges (add_edge) and conditional edge path-map entries that reference the node. Conditional edges without a path map are not counted because their targets cannot be determined statically.
Combining with Command::send()
Deferred nodes are designed as the aggregation target after Command::send() fans out work:
use synaptic::graph::{Command, NodeOutput, Send};
let dispatcher = FnNode::new(|_state: AggState| async move {
let targets = vec![
Send::new("worker", serde_json::json!({"chunk": "A"})),
Send::new("worker", serde_json::json!({"chunk": "B"})),
];
Ok(NodeOutput::Command(Command::send(targets)))
});
let graph = StateGraph::new()
.add_node("dispatch", dispatcher)
.add_node("worker", worker_node)
.add_deferred_node("collect", collector_node)
.add_edge("worker", "collect")
.add_edge("collect", END)
.set_entry_point("dispatch")
.compile()?;
Note: Full parallel fan-out for Command::send() is not yet implemented. Targets are currently processed sequentially. The deferred node infrastructure is in place for when parallel execution is added.
Linear Graphs
A deferred node in a linear chain compiles and executes normally. The deferred marker only becomes meaningful when multiple edges converge on the same node:
let graph = StateGraph::new()
.add_node("step1", step1)
.add_deferred_node("step2", step2)
.add_edge("step1", "step2")
.add_edge("step2", END)
.set_entry_point("step1")
.compile()?;
let result = graph.invoke(AggState::default()).await?.into_state();
// Runs identically to a non-deferred node in a linear chain
Notes
- Deferred is a marker. The current execution engine does not block on incoming edge completion -- it runs nodes in edge/command order. The marker is forward-looking infrastructure for future parallel fan-out support.
- is_deferred() and incoming_edge_count() are introspection-only. They let you validate graph topology in tests without affecting execution.
Tool Node
ToolNode is a prebuilt graph node that automatically dispatches tool calls found in the last AI message of the state. It bridges the synaptic_tools crate's execution infrastructure with the graph system, making it straightforward to build tool-calling agent loops.
How It Works
When ToolNode processes state, it:
- Reads the last message from the state.
- Extracts any tool_calls from that message (AI messages carry tool call requests).
- Executes each tool call through the provided SerialToolExecutor.
- Appends a Message::tool(result, call_id) for each tool call result.
- Returns the updated state.
If the last message has no tool calls, the node passes the state through unchanged.
Setup
Create a ToolNode by providing a SerialToolExecutor with registered tools:
use synaptic::graph::ToolNode;
use synaptic::tools::{ToolRegistry, SerialToolExecutor};
use synaptic::core::{Tool, SynapticError};
use synaptic::macros::tool;
use std::sync::Arc;
// Define a tool using the #[tool] macro
/// Evaluates math expressions.
#[tool(name = "calculator")]
async fn calculator(
/// The math expression to evaluate
expression: String,
) -> Result<String, SynapticError> {
Ok(format!("Result: {expression}"))
}
// Register and create the executor
let registry = ToolRegistry::new();
registry.register(calculator()).await?;
let executor = SerialToolExecutor::new(registry);
let tool_node = ToolNode::new(executor);
Note: The #[tool] macro generates the struct, Tool trait implementation, and a factory function automatically. The doc comment becomes the tool description, and function parameters become the JSON Schema. See Procedural Macros for full details.
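Before wiring it into a graph, you can call the node directly. As noted above, when the last message carries no tool calls, the state passes through unchanged -- a quick sketch using the tool_node built in the setup:
use synaptic::graph::{Node, NodeOutput, MessageState};
use synaptic::core::Message;

let state = MessageState::with_messages(vec![Message::ai("Nothing to execute.")]);
if let NodeOutput::State(unchanged) = tool_node.process(state).await? {
    assert_eq!(unchanged.messages.len(), 1); // no tool messages appended
}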
Using ToolNode in a Graph
ToolNode implements Node<MessageState>, so it can be added directly to a StateGraph:
use synaptic::graph::{StateGraph, FnNode, MessageState, END};
use synaptic::core::{Message, ToolCall};
// An agent node that produces tool calls
let agent = FnNode::new(|mut state: MessageState| async move {
let tool_call = ToolCall {
id: "call-1".to_string(),
name: "calculator".to_string(),
arguments: serde_json::json!({"expression": "2+2"}),
};
state.messages.push(Message::ai_with_tool_calls("", vec![tool_call]));
Ok(state.into())
});
let graph = StateGraph::new()
.add_node("agent", agent)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_edge("agent", "tools")
.add_edge("tools", END)
.compile()?;
let result = graph.invoke(MessageState::new()).await?.into_state();
// State now contains:
// [0] AI message with tool_calls
// [1] Tool message with "Result: 2+2"
tools_condition -- Standard Routing Function
Synaptic provides a tools_condition function that implements the standard routing logic: returns "tools" if the last message has tool calls, otherwise returns END. This replaces the need to write a custom routing closure:
use synaptic::graph::{StateGraph, MessageState, tools_condition, END};
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_conditional_edges("agent", tools_condition)
.add_edge("tools", "agent") // tool results go back to agent
.compile()?;
Agent Loop Pattern
In a typical ReAct agent, the tool node feeds results back to the agent node, which decides whether to call more tools or produce a final answer. Use tools_condition or conditional edges to implement this loop:
use std::collections::HashMap;
use synaptic::graph::{StateGraph, MessageState, END};
let graph = StateGraph::new()
.add_node("agent", agent_node)
.add_node("tools", tool_node)
.set_entry_point("agent")
.add_conditional_edges_with_path_map(
"agent",
|state: &MessageState| {
// If the last message has tool calls, go to tools
if let Some(msg) = state.last_message() {
if !msg.tool_calls().is_empty() {
return "tools".to_string();
}
}
END.to_string()
},
HashMap::from([
("tools".to_string(), "tools".to_string()),
(END.to_string(), END.to_string()),
]),
)
.add_edge("tools", "agent") // tool results go back to agent
.compile()?;
This is exactly the pattern that create_react_agent() implements automatically (using tools_condition internally).
create_react_agent
For convenience, Synaptic provides a factory function that assembles the standard ReAct agent graph:
use synaptic::graph::create_react_agent;
let graph = create_react_agent(model, tools);
This creates a compiled graph with "agent" and "tools" nodes wired in a conditional loop, equivalent to the manual setup shown above.
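A sketch of running it end-to-end, assuming create_react_agent accepts the same model and tool arguments as create_agent and reusing the calculator tool from the setup above:
use std::sync::Arc;
use synaptic::graph::{create_react_agent, MessageState};
use synaptic::core::Message;
use synaptic::openai::OpenAiChatModel;

let model = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let graph = create_react_agent(model, vec![calculator()]);

let initial = MessageState::with_messages(vec![Message::human("What is 2 + 2?")]);
let result = graph.invoke(initial).await?.into_state();
println!("{}", result.messages.last().unwrap().content());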
RuntimeAwareTool Injection
ToolNode supports RuntimeAwareTool instances that receive the current graph state, store reference, and tool call ID via ToolRuntime. Register runtime-aware tools with with_runtime_tool():
use synaptic::graph::ToolNode;
use synaptic::core::{RuntimeAwareTool, ToolRuntime};
let tool_node = ToolNode::new(executor)
.with_store(store) // inject store into ToolRuntime
.with_runtime_tool(my_tool); // register a RuntimeAwareTool
When create_agent is called with AgentOptions { store: Some(store), .. }, the store is automatically wired into the ToolNode.
Graph Visualization
Synaptic provides multiple ways to visualize a compiled graph, from text-based formats suitable for terminals and documentation to image formats for presentations and debugging.
Mermaid Diagram
Generate a Mermaid flowchart string. This is ideal for embedding in Markdown documents and GitHub READMEs:
let mermaid = graph.draw_mermaid();
println!("{mermaid}");
Example output:
graph TD
__start__(["__start__"])
agent["agent"]
tools["tools"]
__end__(["__end__"])
__start__ --> agent
agent --> tools
tools -.-> |continue| agent
tools -.-> |end| __end__
- __start__ and __end__ are rendered as rounded nodes.
- User-defined nodes are rendered as rectangles.
- Fixed edges use solid arrows (-->).
- Conditional edges with a path map use dashed arrows (-.->) with labels.
ASCII Art
Generate a simple text summary for terminal output:
let ascii = graph.draw_ascii();
println!("{ascii}");
Example output:
Graph:
Nodes: agent, tools
Entry: __start__ -> agent
Edges:
agent -> tools
tools -> __end__ | agent [conditional]
The Display trait is also implemented, so you can use println!("{graph}") directly, which outputs the ASCII representation.
DOT Format (Graphviz)
Generate a Graphviz DOT string for use with the dot command-line tool:
let dot = graph.draw_dot();
println!("{dot}");
Example output:
digraph G {
rankdir=TD;
"__start__" [shape=oval];
"agent" [shape=box];
"tools" [shape=box];
"__end__" [shape=oval];
"__start__" -> "agent" [style=solid];
"agent" -> "tools" [style=solid];
"tools" -> "agent" [style=dashed, label="continue"];
"tools" -> "__end__" [style=dashed, label="end"];
}
PNG via Graphviz
Render the graph to a PNG image using the Graphviz dot command. This requires dot to be installed and available in your $PATH:
graph.draw_png("my_graph.png")?;
Under the hood, this pipes the DOT output through dot -Tpng and writes the resulting image to the specified path.
PNG via Mermaid.ink API
Render the graph to a PNG image using the mermaid.ink web service. This requires internet access but does not require any local tools:
graph.draw_mermaid_png("graph_mermaid.png").await?;
The Mermaid text is base64-encoded and sent to https://mermaid.ink/img/{encoded}. The returned image is saved to the specified path.
SVG via Mermaid.ink API
Similarly, you can generate an SVG instead:
graph.draw_mermaid_svg("graph_mermaid.svg").await?;
Summary
| Method | Format | Requires |
|---|---|---|
draw_mermaid() | Mermaid text | Nothing |
draw_ascii() | Plain text | Nothing |
draw_dot() | DOT text | Nothing |
draw_png(path) | PNG image | Graphviz dot in PATH |
draw_mermaid_png(path) | PNG image | Internet access |
draw_mermaid_svg(path) | SVG image | Internet access |
Display trait | Plain text | Nothing |
Tips
- Use draw_mermaid() for documentation that renders on GitHub or mdBook.
- Use draw_ascii() or Display for quick debugging in the terminal.
- Conditional edges without a path_map cannot show their targets in visualizations. If you want full visualization support, use add_conditional_edges_with_path_map() instead of add_conditional_edges().
Middleware Overview
The middleware system intercepts and modifies agent behavior at every lifecycle point -- before/after the agent run, before/after each model call, and around each tool call. Use middleware when you need cross-cutting concerns (rate limiting, retries, context management) without modifying your agent logic.
AgentMiddleware Trait
All methods have default no-op implementations. Override only the hooks you need.
#[async_trait]
pub trait AgentMiddleware: Send + Sync {
async fn before_agent(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError>;
async fn after_agent(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError>;
async fn before_model(&self, request: &mut ModelRequest) -> Result<(), SynapticError>;
async fn after_model(&self, request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError>;
async fn wrap_model_call(&self, request: ModelRequest, next: &dyn ModelCaller) -> Result<ModelResponse, SynapticError>;
async fn wrap_tool_call(&self, request: ToolCallRequest, next: &dyn ToolCaller) -> Result<Value, SynapticError>;
}
Lifecycle Diagram
before_agent(messages)
loop {
before_model(request)
-> wrap_model_call(request, next)
after_model(request, response)
for each tool_call {
wrap_tool_call(request, next)
}
}
after_agent(messages)
before_agent and after_agent run once per invocation. The inner loop repeats for each agent step (model call followed by tool execution). before_model / after_model run around every model call and can mutate the request or response. wrap_model_call and wrap_tool_call are onion-style wrappers that receive a next caller to delegate to the next layer.
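Because every hook has a default no-op, a middleware can implement the trait directly and override a single method. A minimal sketch:
use async_trait::async_trait;
use synaptic::core::SynapticError;
use synaptic::middleware::{AgentMiddleware, ModelRequest};

struct RequestLogger;

#[async_trait]
impl AgentMiddleware for RequestLogger {
    async fn before_model(&self, request: &mut ModelRequest) -> Result<(), SynapticError> {
        // Log the request size before every model call; all other hooks keep their defaults.
        println!("calling the model with {} messages", request.messages.len());
        Ok(())
    }
}
An Arc::new(RequestLogger) can then be added to AgentOptions::middleware exactly like the built-in middlewares shown below.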
MiddlewareChain
MiddlewareChain composes multiple middlewares and executes them in registration order for before_* hooks, and in reverse order for after_* hooks.
use synaptic::middleware::MiddlewareChain;
let chain = MiddlewareChain::new(vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolRetryMiddleware::new(3)),
]);
Using Middleware with create_agent
Pass middlewares through AgentOptions::middleware. The agent graph wires them into both the model node and the tool node automatically.
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::{ModelCallLimitMiddleware, ToolRetryMiddleware};
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
Built-in Middlewares
| Middleware | Hook Used | Description |
|---|---|---|
ModelCallLimitMiddleware | wrap_model_call | Limits model invocations per run |
ToolCallLimitMiddleware | wrap_tool_call | Limits tool invocations per run |
ToolRetryMiddleware | wrap_tool_call | Retries failed tools with exponential backoff |
ModelFallbackMiddleware | wrap_model_call | Falls back to alternative models on failure |
SummarizationMiddleware | before_model | Auto-summarizes when context exceeds token limit |
TodoListMiddleware | before_model | Injects a task list into the agent context |
HumanInTheLoopMiddleware | wrap_tool_call | Pauses for human approval before tool execution |
ContextEditingMiddleware | before_model | Trims or filters context before model calls |
Writing a Custom Middleware
The easiest way to define a middleware is with the corresponding macro. Each lifecycle hook has its own macro (#[before_agent], #[before_model], #[after_model], #[after_agent], #[wrap_model_call], #[wrap_tool_call], #[dynamic_prompt]). The macro generates the struct, AgentMiddleware trait implementation, and a factory function automatically.
use synaptic::macros::before_model;
use synaptic::middleware::ModelRequest;
use synaptic::core::SynapticError;
#[before_model]
async fn log_model_call(request: &mut ModelRequest) -> Result<(), SynapticError> {
println!("Model call with {} messages", request.messages.len());
Ok(())
}
Then add it to your agent:
let options = AgentOptions {
middleware: vec![log_model_call()],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
Note: The log_model_call() factory function returns Arc<dyn AgentMiddleware>. For stateful middleware, use #[field] parameters on the function. See Procedural Macros for the full reference, including all seven middleware macros and stateful middleware with #[field].
ModelCallLimitMiddleware
Limits the number of model invocations during a single agent run, preventing runaway loops. Use this when you want a hard cap on how many times the LLM is called per invocation.
Constructor
use synaptic::middleware::ModelCallLimitMiddleware;
let mw = ModelCallLimitMiddleware::new(10); // max 10 model calls
The middleware also exposes call_count() to inspect the current count and reset() to zero it out.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ModelCallLimitMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(5)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_model_call
- Before delegating to the next layer, the middleware atomically increments an internal counter.
- If the counter has reached or exceeded max_calls, it returns SynapticError::MaxStepsExceeded immediately without calling the model.
- Otherwise, it delegates to next.call(request) as normal.
This means the agent loop terminates with an error once the limit is hit. The counter persists across the entire agent invocation (all steps in the agent loop), so a limit of 5 means at most 5 model round-trips total.
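If you want to inspect the counter after a run, keep a handle to the middleware before handing it to the agent (a sketch, with model, tools, and state as in the earlier examples):
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ModelCallLimitMiddleware;

let limit = Arc::new(ModelCallLimitMiddleware::new(5));

let options = AgentOptions {
    middleware: vec![limit.clone()],
    ..Default::default()
};
let graph = create_agent(model, tools, options)?;

let _ = graph.invoke(state).await?;
println!("model calls used: {}", limit.call_count());
limit.reset(); // start fresh for the next run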
Example: Combining with Other Middleware
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
The model call limit is checked on every model call regardless of whether other middlewares modify the request or response.
ToolCallLimitMiddleware
Limits the number of tool invocations during a single agent run. Use this to cap tool usage when agents may generate excessive tool calls in a loop.
Constructor
use synaptic::middleware::ToolCallLimitMiddleware;
let mw = ToolCallLimitMiddleware::new(20); // max 20 tool calls
The middleware exposes call_count() and reset() for inspection and manual reset.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ToolCallLimitMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ToolCallLimitMiddleware::new(20)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_tool_call
- Each time a tool call is dispatched, the middleware atomically increments an internal counter.
- If the counter has reached or exceeded max_calls, it returns SynapticError::MaxStepsExceeded without executing the tool.
- Otherwise, it delegates to next.call(request) normally.
The counter tracks individual tool calls, not agent steps. If a single model response requests three tool calls, the counter increments three times. This gives you precise control over total tool usage across the entire agent run.
Combining Model and Tool Limits
Both limits can be applied simultaneously to guard against different failure modes:
use synaptic::middleware::{ModelCallLimitMiddleware, ToolCallLimitMiddleware};
let options = AgentOptions {
middleware: vec![
Arc::new(ModelCallLimitMiddleware::new(10)),
Arc::new(ToolCallLimitMiddleware::new(30)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
The agent stops as soon as either limit is hit.
Handling the Error
When the limit is exceeded, the middleware returns SynapticError::MaxStepsExceeded. You can catch this to provide a graceful fallback:
use synaptic::core::SynapticError;
let mut state = MessageState::new();
state.messages.push(Message::human("Do something complex."));
match graph.invoke(state).await {
Ok(result) => println!("{}", result.into_state().messages.last().unwrap().content()),
Err(SynapticError::MaxStepsExceeded(msg)) => {
println!("Agent hit tool call limit: {msg}");
// Retry with a higher limit, summarize progress, or inform the user
}
Err(e) => println!("Other error: {e}"),
}
Inspecting and Resetting
The middleware provides methods to inspect and reset the counter:
let mw = ToolCallLimitMiddleware::new(10);
// After an agent run, check how many tool calls were made
println!("Tool calls used: {}", mw.call_count());
// Reset the counter for a new run
mw.reset();
assert_eq!(mw.call_count(), 0);
ToolRetryMiddleware
Retries failed tool calls with exponential backoff. Use this when tools may experience transient failures (network timeouts, rate limits, temporary unavailability) and you want automatic recovery without surfacing errors to the model.
Constructor
use synaptic::middleware::ToolRetryMiddleware;
// Retry up to 3 times (4 total attempts including the first)
let mw = ToolRetryMiddleware::new(3);
Configuration
The base delay between retries defaults to 100ms and doubles on each attempt (exponential backoff). You can customize it with with_base_delay:
use std::time::Duration;
let mw = ToolRetryMiddleware::new(3)
.with_base_delay(Duration::from_millis(500));
// Delays: 500ms, 1000ms, 2000ms
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ToolRetryMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_tool_call
- When a tool call fails, the middleware waits for base_delay * 2^attempt and retries.
- Retries continue up to max_retries times. If all retries fail, the last error is returned.
- If the tool call succeeds on any attempt, the result is returned immediately.
The backoff schedule with the default 100ms base delay:
| Attempt | Delay before retry |
|---|---|
| 1st retry | 100ms |
| 2nd retry | 200ms |
| 3rd retry | 400ms |
Combining with Tool Call Limits
When both middlewares are active, the retry middleware operates inside the tool call limit. Each retry counts as a separate tool call:
let options = AgentOptions {
middleware: vec![
Arc::new(ToolCallLimitMiddleware::new(30)),
Arc::new(ToolRetryMiddleware::new(3)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
ModelFallbackMiddleware
Falls back to alternative models when the primary model fails. Use this for high-availability scenarios where you want seamless failover between providers (e.g., OpenAI to Anthropic) or between model tiers (e.g., GPT-4 to GPT-3.5).
Constructor
use synaptic::middleware::ModelFallbackMiddleware;
let mw = ModelFallbackMiddleware::new(vec![
fallback_model_1, // Arc<dyn ChatModel>
fallback_model_2, // Arc<dyn ChatModel>
]);
The fallback list is tried in order. The first successful response is returned.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::openai::OpenAiChatModel;
use synaptic::anthropic::AnthropicChatModel;
use synaptic::middleware::ModelFallbackMiddleware;
let primary = Arc::new(OpenAiChatModel::new("gpt-4o"));
let fallback = Arc::new(AnthropicChatModel::new("claude-sonnet-4-20250514"));
let options = AgentOptions {
middleware: vec![
Arc::new(ModelFallbackMiddleware::new(vec![fallback])),
],
..Default::default()
};
let graph = create_agent(primary, tools, options)?;
How It Works
- Lifecycle hook: wrap_model_call
- The middleware first delegates to next.call(request), which calls the primary model through the rest of the middleware chain.
- If the primary call succeeds, the response is returned as-is.
- If the primary call fails, the middleware tries each fallback model in order by creating a BaseChatModelCaller and sending the same request.
- The first fallback that succeeds is returned. If all fallbacks also fail, the original error from the primary model is returned.
Fallback models are called directly (bypassing the middleware chain) to avoid interference from other middlewares that may have caused or contributed to the failure.
Example: Multi-tier Fallback
let primary = Arc::new(OpenAiChatModel::new("gpt-4o"));
let tier2 = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let tier3 = Arc::new(AnthropicChatModel::new("claude-sonnet-4-20250514"));
let options = AgentOptions {
middleware: vec![
Arc::new(ModelFallbackMiddleware::new(vec![tier2, tier3])),
],
..Default::default()
};
let graph = create_agent(primary, tools, options)?;
The agent tries GPT-4o first, then GPT-4o-mini, then Claude Sonnet.
SummarizationMiddleware
Automatically summarizes conversation history when it exceeds a token limit. Use this for long-running agents where the context window would otherwise overflow, replacing older messages with a concise summary while keeping recent messages intact.
Constructor
use synaptic::middleware::SummarizationMiddleware;
let mw = SummarizationMiddleware::new(
summarizer_model, // Arc<dyn ChatModel> -- model used to generate summaries
4000, // max_tokens -- threshold that triggers summarization
|msg: &Message| { // token_counter -- estimates tokens per message
msg.content().len() / 4
},
);
Parameters:
- model -- The ChatModel used to generate the summary. Can be the same model as the agent or a cheaper/faster one.
- max_tokens -- When the estimated total tokens exceed this value, summarization is triggered.
- token_counter -- A function Fn(&Message) -> usize that estimates the token count for a single message. A common heuristic is content.len() / 4.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::SummarizationMiddleware;
use synaptic::openai::OpenAiChatModel;
let summarizer = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let options = AgentOptions {
middleware: vec![
Arc::new(SummarizationMiddleware::new(
summarizer,
4000,
|msg| msg.content().len() / 4,
)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: before_model
- Before each model call, the middleware sums the estimated tokens across all messages.
- If the total is within max_tokens, no action is taken.
- If the total exceeds the limit, it splits messages into two groups:
  - Recent messages that fit within half the token budget (kept as-is).
  - Older messages that are sent to the summarizer model.
- The summarizer produces a concise summary, which replaces the older messages as a system message prefixed with [Previous conversation summary].
- The request then proceeds with the summary plus the recent messages, staying within budget.
This approach preserves the most recent context verbatim while compressing older exchanges, keeping the agent informed about prior conversation without exceeding context limits.
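A sketch of the split under the rule described above -- the half-budget cutoff follows the description; the function works on per-message token estimates rather than real Message values:
// Returns the index at which to split: tokens[..idx] belong to messages that get
// summarized, tokens[idx..] to recent messages kept verbatim.
fn split_point(tokens: &[usize], max_tokens: usize) -> usize {
    let recent_budget = max_tokens / 2;
    let mut used = 0;
    let mut idx = tokens.len();
    // Walk backwards, keeping recent messages while they fit in half the budget.
    while idx > 0 && used + tokens[idx - 1] <= recent_budget {
        used += tokens[idx - 1];
        idx -= 1;
    }
    idx
}
let estimates = [1200, 900, 800, 400, 300]; // oldest .. newest
assert_eq!(split_point(&estimates, 4000), 2); // the two oldest messages get summarized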
Example: Budget-conscious Summarization
Use a cheaper model for summaries to reduce costs:
use synaptic::openai::OpenAiChatModel;
let agent_model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let cheap_model = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let options = AgentOptions {
middleware: vec![
Arc::new(SummarizationMiddleware::new(
cheap_model,
8000,
|msg| msg.content().len() / 4,
)),
],
..Default::default()
};
let graph = create_agent(agent_model, tools, options)?;
Offline Testing with ScriptedChatModel
Test summarization behavior without API keys:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message};
use synaptic::models::ScriptedChatModel;
use synaptic::middleware::SummarizationMiddleware;
use synaptic::graph::{create_agent, AgentOptions, MessageState};
// Script: summarizer returns a summary, agent responds
let summarizer = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Summary: discussed Rust ownership."),
usage: None,
},
]));
let agent_model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Here's more about lifetimes."),
usage: None,
},
]));
let options = AgentOptions {
middleware: vec![
Arc::new(SummarizationMiddleware::new(
summarizer,
100, // low threshold for testing
|msg| msg.content().len() / 4,
)),
],
..Default::default()
};
let graph = create_agent(agent_model, vec![], options)?;
// Build a state with enough messages to exceed the threshold
let mut state = MessageState::new();
state.messages.push(Message::human("What is Rust?"));
state.messages.push(Message::ai("Rust is a systems programming language..."));
state.messages.push(Message::human("Tell me about ownership."));
state.messages.push(Message::ai("Ownership is a set of rules that govern memory..."));
state.messages.push(Message::human("And lifetimes?"));
let result = graph.invoke(state).await?.into_state();
TodoListMiddleware
Injects task-planning state into the agent's context by maintaining a shared todo list. Use this when your agent performs multi-step operations and you want it to track progress across model calls.
Constructor
use synaptic::middleware::TodoListMiddleware;
let mw = TodoListMiddleware::new();
Managing Tasks
The middleware provides async methods to add and complete tasks programmatically:
let mw = TodoListMiddleware::new();
// Add tasks before or during agent execution
let id1 = mw.add("Research competitor pricing").await;
let id2 = mw.add("Draft summary report").await;
// Mark tasks as done
mw.complete(id1).await;
// Inspect current state
let items = mw.items().await;
Each task gets a unique auto-incrementing ID. Tasks have an id, task (description), and done (completion status).
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::TodoListMiddleware;
let todo = Arc::new(TodoListMiddleware::new());
todo.add("Gather user requirements").await;
todo.add("Generate implementation plan").await;
todo.add("Write code").await;
let options = AgentOptions {
middleware: vec![todo.clone()],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: before_model
- Before each model call, the middleware checks the current todo list.
- If the list is non-empty, it inserts a system message at the beginning of the request's message list containing the formatted task list.
- The model sees the current state of all tasks, including which ones are done.
The injected message looks like:
Current TODO list:
[ ] #1: Gather user requirements
[x] #2: Generate implementation plan
[ ] #3: Write code
This gives the model awareness of the overall plan and progress, enabling it to work through tasks methodically. You can call complete() from tool implementations or external code to update progress between agent steps.
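If you need the same rendering outside the middleware (for logging or a UI), it can be reproduced from items(); the id, task, and done field names follow the description above:
let items = mw.items().await;
let rendered: String = items
    .iter()
    .map(|item| {
        let mark = if item.done { "x" } else { " " };
        format!("[{}] #{}: {}\n", mark, item.id, item.task)
    })
    .collect();
println!("Current TODO list:\n{rendered}");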
Example: Tool-driven Task Completion
Combine with a custom tool that marks tasks as done:
let todo = Arc::new(TodoListMiddleware::new());
todo.add("Fetch data from API").await;
todo.add("Transform data").await;
todo.add("Save results").await;
// The agent sees the todo list in its context and can
// reason about which tasks remain. Your tools can call
// todo.complete(id) when they finish their work.
HumanInTheLoopMiddleware
Pauses tool execution to request human approval before proceeding. Use this when certain tool calls (e.g., database writes, payments, deployments) require human oversight.
Constructor
There are two constructors depending on the scope of approval:
use synaptic::middleware::HumanInTheLoopMiddleware;
// Require approval for ALL tool calls
let mw = HumanInTheLoopMiddleware::new(callback);
// Require approval only for specific tools
let mw = HumanInTheLoopMiddleware::for_tools(
callback,
vec!["delete_record".to_string(), "send_email".to_string()],
);
ApprovalCallback Trait
You must implement the ApprovalCallback trait to define how approval is obtained:
use synaptic::middleware::ApprovalCallback;
struct CliApproval;
#[async_trait]
impl ApprovalCallback for CliApproval {
    async fn approve(&self, tool_name: &str, arguments: &Value) -> Result<bool, SynapticError> {
        use std::io::{BufRead, Write};
        println!("Tool '{}' wants to run with args: {}", tool_name, arguments);
        print!("Approve? (y/n) ");
        std::io::stdout().flush().ok();
        // Read one line from stdin; approve only on an explicit "y" / "yes"
        let mut line = String::new();
        std::io::stdin().lock().read_line(&mut line).ok();
        Ok(matches!(line.trim().to_lowercase().as_str(), "y" | "yes"))
    }
}
Return Ok(true) to approve, Ok(false) to reject (the model receives a rejection message), or Err(...) to abort the entire agent run.
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::HumanInTheLoopMiddleware;
let approval = Arc::new(CliApproval);
let hitl = HumanInTheLoopMiddleware::for_tools(
approval,
vec!["delete_record".to_string()],
);
let options = AgentOptions {
middleware: vec![Arc::new(hitl)],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: wrap_tool_call
- When a tool call arrives, the middleware checks whether it requires approval:
  - If constructed with new(), all tools require approval.
  - If constructed with for_tools(), only the named tools require approval.
- For tools that require approval, it calls callback.approve(tool_name, arguments).
- If approved (true), the tool call proceeds normally via next.call(request).
- If rejected (false), the middleware returns a Value::String message saying the call was rejected. This message is fed back to the model as the tool result, allowing it to adjust its plan.
Example: Selective Approval with Logging
struct AuditApproval {
auto_approve: HashSet<String>,
}
#[async_trait]
impl ApprovalCallback for AuditApproval {
async fn approve(&self, tool_name: &str, arguments: &Value) -> Result<bool, SynapticError> {
if self.auto_approve.contains(tool_name) {
tracing::info!("Auto-approved: {}", tool_name);
return Ok(true);
}
tracing::warn!("Requires manual approval: {} with {:?}", tool_name, arguments);
// In production, this could send a Slack message, webhook, etc.
Ok(false) // reject by default until approved
}
}
This pattern lets you auto-approve safe operations while gating dangerous ones.
ContextEditingMiddleware
Trims or filters the conversation context before each model call. Use this to keep the context window manageable when full summarization is unnecessary -- for example, dropping old messages or stripping tool call noise from the history.
Constructor
The middleware accepts a ContextStrategy that defines how messages are edited:
use synaptic::middleware::{ContextEditingMiddleware, ContextStrategy};
// Keep only the last 10 non-system messages
let mw = ContextEditingMiddleware::new(ContextStrategy::LastN(10));
// Remove tool call/result pairs, keeping only human/AI content messages
let mw = ContextEditingMiddleware::new(ContextStrategy::StripToolCalls);
// Strip tool calls first, then keep last N
let mw = ContextEditingMiddleware::new(ContextStrategy::StripAndTruncate(10));
Convenience Constructors
let mw = ContextEditingMiddleware::last_n(10);
let mw = ContextEditingMiddleware::strip_tool_calls();
Strategies
| Strategy | Behavior |
|---|---|
LastN(n) | Keeps leading system messages, then the last n non-system messages |
StripToolCalls | Removes Tool messages and AI messages that contain only tool calls (no text) |
StripAndTruncate(n) | Applies StripToolCalls first, then LastN(n) |
Usage with create_agent
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::middleware::ContextEditingMiddleware;
let options = AgentOptions {
middleware: vec![
Arc::new(ContextEditingMiddleware::last_n(20)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
How It Works
- Lifecycle hook: before_model
- Before each model call, the middleware applies the configured strategy to request.messages.
- LastN: System messages at the start of the list are always preserved. From the remaining messages, only the last n are kept. Earlier messages are dropped.
- StripToolCalls: Messages with is_tool() == true are removed. AI messages that have tool calls but empty text content are also removed. This cleans up the tool-call/tool-result pairs while preserving the conversational content.
- StripAndTruncate: Runs both filters in sequence -- first strips tool calls, then truncates to the last N.
The original message list in the agent state is not modified; only the request sent to the model is trimmed.
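A sketch of the LastN rule in isolation -- leading system messages survive, and only the last n of the rest are kept. It operates on (is_system, content) pairs purely for illustration:
fn last_n(messages: &[(bool, &str)], n: usize) -> Vec<(bool, String)> {
    // Leading system messages are preserved unconditionally.
    let lead = messages.iter().take_while(|(is_system, _)| *is_system).count();
    let (system_prefix, rest) = messages.split_at(lead);
    // From everything else, keep only the last n entries.
    let start = rest.len().saturating_sub(n);
    system_prefix
        .iter()
        .chain(&rest[start..])
        .map(|&(is_system, content)| (is_system, content.to_string()))
        .collect()
}
let history = [(true, "You are helpful."), (false, "hi"), (false, "hello"), (false, "thanks")];
assert_eq!(last_n(&history, 2).len(), 3); // 1 system message + the last 2 others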
Example: Combining with Summarization
For maximum context efficiency, strip tool calls first, then summarize what remains:
let options = AgentOptions {
middleware: vec![
Arc::new(ContextEditingMiddleware::strip_tool_calls()),
Arc::new(SummarizationMiddleware::new(model.clone(), 4000, |msg| msg.content().len() / 4)),
],
..Default::default()
};
let graph = create_agent(model, tools, options)?;
The context editor removes tool noise before summarization runs, producing cleaner summaries.
Key-Value Store
The Store trait provides persistent key-value storage for agents, enabling cross-invocation state management.
Store Trait
use synaptic::store::Store;
#[async_trait]
pub trait Store: Send + Sync {
async fn get(&self, namespace: &[&str], key: &str) -> Result<Option<Item>, SynapticError>;
async fn search(&self, namespace: &[&str], query: Option<&str>, limit: usize) -> Result<Vec<Item>, SynapticError>;
async fn put(&self, namespace: &[&str], key: &str, value: Value) -> Result<(), SynapticError>;
async fn delete(&self, namespace: &[&str], key: &str) -> Result<(), SynapticError>;
async fn list_namespaces(&self, prefix: &[&str]) -> Result<Vec<Vec<String>>, SynapticError>;
}
Each Item returned from get() or search() contains:
pub struct Item {
pub namespace: Vec<String>,
pub key: String,
pub value: Value,
pub created_at: String,
pub updated_at: String,
pub score: Option<f64>, // populated by semantic search
}
InMemoryStore
use synaptic::store::InMemoryStore;
let store = InMemoryStore::new();
store.put(&["users", "prefs"], "theme", json!("dark")).await?;
let item = store.get(&["users", "prefs"], "theme").await?;
Semantic Search
When configured with an embeddings model, InMemoryStore uses cosine similarity for search() queries instead of substring matching. Items are ranked by relevance and Item::score is populated.
use synaptic::store::InMemoryStore;
use synaptic::openai::OpenAiEmbeddings;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = InMemoryStore::new().with_embeddings(embeddings);
// Put documents
store.put(&["docs"], "rust", json!("Rust is a systems programming language")).await?;
store.put(&["docs"], "python", json!("Python is an interpreted language")).await?;
// Semantic search — results ranked by similarity
let results = store.search(&["docs"], Some("systems programming"), 10).await?;
// results[0] will be the "rust" item with highest similarity score
assert!(results[0].score.unwrap() > results[1].score.unwrap());
Without embeddings, search() falls back to substring matching on key and value.
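For example, with no embeddings configured, a hit simply means the query string appears in the key or stored value (a sketch of the stated fallback; details such as case sensitivity are not specified here):
use serde_json::json;
use synaptic::store::InMemoryStore;
let store = InMemoryStore::new();
store.put(&["notes"], "ownership", json!("Rust ownership rules")).await?;
store.put(&["notes"], "gc", json!("Garbage-collected languages")).await?;
// Substring match on key/value -- only the "ownership" item should match.
let hits = store.search(&["notes"], Some("ownership"), 10).await?;
assert_eq!(hits.len(), 1);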
Using with Agents
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::store::InMemoryStore;
let store = Arc::new(InMemoryStore::new());
let options = AgentOptions {
store: Some(store),
..Default::default()
};
let graph = create_agent(model, tools, options)?;
When a store is provided to create_agent, it is automatically wired into ToolNode. Any RuntimeAwareTool registered with the agent will receive the store via ToolRuntime.
Multi-Agent Patterns
Synaptic provides prebuilt multi-agent orchestration patterns that compose individual agents into collaborative workflows.
Pattern Comparison
| Pattern | Coordinator | Routing | Best For |
|---|---|---|---|
| Supervisor | Central supervisor model | Supervisor decides which sub-agent runs next | Structured delegation with clear task boundaries |
| Swarm | None (decentralized) | Each agent hands off to peers directly | Organic collaboration where any agent can escalate |
| Handoff Tools | Custom | You wire the topology | Arbitrary graphs that don't fit supervisor or swarm |
When to Use Each
Supervisor -- Use when you have a clear hierarchy. A single model reads the conversation and decides which specialist agent should handle the next step. The supervisor sees the full message history and can route back to itself when done.
Swarm -- Use when agents are peers. Each agent has its own model, tools, and a set of handoff tools to transfer to any other agent. There is no central coordinator; any agent can decide to transfer at any time.
Handoff Tools -- Use when you need a custom topology. create_handoff_tool produces a Tool that signals an intent to transfer to another agent. You can register these in any graph structure you design manually.
Key Types
All multi-agent functions live in synaptic_graph:
use synaptic::graph::{
create_supervisor, SupervisorOptions,
create_swarm, SwarmAgent, SwarmOptions,
create_handoff_tool,
create_agent, AgentOptions,
MessageState,
};
Minimal Example
use std::sync::Arc;
use synaptic::graph::{
create_agent, create_supervisor, AgentOptions, SupervisorOptions, MessageState,
};
use synaptic::core::Message;
// Build two sub-agents
let agent_a = create_agent(model.clone(), tools_a, AgentOptions::default())?;
let agent_b = create_agent(model.clone(), tools_b, AgentOptions::default())?;
// Wire them under a supervisor
let graph = create_supervisor(
model,
vec![
("agent_a".to_string(), agent_a),
("agent_b".to_string(), agent_b),
],
SupervisorOptions::default(),
)?;
let mut state = MessageState::new();
state.messages.push(Message::human("Hello, delegate this."));
let result = graph.invoke(state).await?.into_state();
See the individual pages for detailed usage of each pattern.
Supervisor Pattern
The supervisor pattern uses a central model to route conversations to specialized sub-agents.
How It Works
create_supervisor builds a graph with a "supervisor" node at the center. The supervisor node calls a ChatModel with handoff tools -- one per sub-agent. When the model emits a transfer_to_<agent_name> tool call, the graph routes to that sub-agent. When the sub-agent finishes, control returns to the supervisor, which can delegate again or produce a final answer.
+------------+
| supervisor |<-----+
+-----+------+ |
/ \ |
agent_a agent_b ------+
API
use synaptic::graph::{create_supervisor, SupervisorOptions};
pub fn create_supervisor(
model: Arc<dyn ChatModel>,
agents: Vec<(String, CompiledGraph<MessageState>)>,
options: SupervisorOptions,
) -> Result<CompiledGraph<MessageState>, SynapticError>;
SupervisorOptions
| Field | Type | Description |
|---|---|---|
checkpointer | Option<Arc<dyn Checkpointer>> | Persist state across invocations |
store | Option<Arc<dyn Store>> | Shared key-value store |
system_prompt | Option<String> | Override the default supervisor prompt |
If no system_prompt is provided, a default is generated:
"You are a supervisor managing these agents: agent_a, agent_b. Use the transfer tools to delegate tasks to the appropriate agent. When the task is complete, respond directly to the user."
Full Example
use std::sync::Arc;
use synaptic::core::{ChatModel, Message, Tool};
use synaptic::graph::{
create_agent, create_supervisor, AgentOptions, MessageState, SupervisorOptions,
};
// Assume `model` implements ChatModel, `research_tools` and `writing_tools`
// are Vec<Arc<dyn Tool>>.
// 1. Create sub-agents
let researcher = create_agent(
model.clone(),
research_tools,
AgentOptions {
system_prompt: Some("You are a research assistant.".into()),
..Default::default()
},
)?;
let writer = create_agent(
model.clone(),
writing_tools,
AgentOptions {
system_prompt: Some("You are a writing assistant.".into()),
..Default::default()
},
)?;
// 2. Create the supervisor graph
let supervisor = create_supervisor(
model,
vec![
("researcher".to_string(), researcher),
("writer".to_string(), writer),
],
SupervisorOptions {
system_prompt: Some(
"Route research questions to researcher, writing tasks to writer.".into(),
),
..Default::default()
},
)?;
// 3. Invoke
let mut state = MessageState::new();
state.messages.push(Message::human("Write a summary of recent AI trends."));
let result = supervisor.invoke(state).await?.into_state();
println!("{}", result.messages.last().unwrap().content());
With Checkpointing
Pass a checkpointer to persist the supervisor's state across calls:
use synaptic::graph::MemorySaver;
let supervisor = create_supervisor(
model,
agents,
SupervisorOptions {
checkpointer: Some(Arc::new(MemorySaver::new())),
..Default::default()
},
)?;
Offline Testing with ScriptedChatModel
You can test supervisor graphs without an API key using ScriptedChatModel. Script the supervisor to emit a handoff tool call, and script the sub-agent to produce a response:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::graph::{
create_agent, create_supervisor, AgentOptions, MessageState, SupervisorOptions,
};
// Sub-agent model: responds directly (no tool calls)
let agent_model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("The research is complete."),
usage: None,
},
]);
// Supervisor model: first response transfers to researcher, second is final answer
let supervisor_model = ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"",
vec![ToolCall {
id: "call_1".into(),
name: "transfer_to_researcher".into(),
arguments: "{}".into(),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("All done. Here is the summary."),
usage: None,
},
]);
let researcher = create_agent(
Arc::new(agent_model),
vec![],
AgentOptions::default(),
)?;
let supervisor = create_supervisor(
Arc::new(supervisor_model),
vec![("researcher".to_string(), researcher)],
SupervisorOptions::default(),
)?;
let mut state = MessageState::new();
state.messages.push(Message::human("Research AI trends."));
let result = supervisor.invoke(state).await?.into_state();
Notes
- Each sub-agent is wrapped in a SubAgentNode that calls graph.invoke(state) and returns the resulting state back to the supervisor.
- The supervisor sees the full message history, including messages appended by sub-agents.
- The graph terminates when the supervisor produces a response with no tool calls.
Swarm Pattern
The swarm pattern creates a decentralized multi-agent graph where every agent can hand off to any other agent directly.
How It Works
create_swarm takes a list of SwarmAgent definitions. Each agent has its own model, tools, and system prompt. Synaptic automatically generates handoff tools (transfer_to_<peer>) for every other agent and adds them to each agent's tool set. A shared "tools" node executes regular tool calls and routes handoff tool calls to the target agent.
triage ----> tools ----> billing
^ | |
| v |
+------- support <------+
The first agent in the list is the entry point.
API
use synaptic::graph::{create_swarm, SwarmAgent, SwarmOptions};
pub fn create_swarm(
agents: Vec<SwarmAgent>,
options: SwarmOptions,
) -> Result<CompiledGraph<MessageState>, SynapticError>;
SwarmAgent
| Field | Type | Description |
|---|---|---|
name | String | Unique agent identifier |
model | Arc<dyn ChatModel> | The model this agent uses |
tools | Vec<Arc<dyn Tool>> | Agent-specific tools (handoff tools are added automatically) |
system_prompt | Option<String> | Optional system prompt for this agent |
SwarmOptions
| Field | Type | Description |
|---|---|---|
checkpointer | Option<Arc<dyn Checkpointer>> | Persist state across invocations |
store | Option<Arc<dyn Store>> | Shared key-value store |
Full Example
use std::sync::Arc;
use synaptic::core::{ChatModel, Message, Tool};
use synaptic::graph::{create_swarm, MessageState, SwarmAgent, SwarmOptions};
// Assume `model` implements ChatModel and *_tools are Vec<Arc<dyn Tool>>.
let swarm = create_swarm(
vec![
SwarmAgent {
name: "triage".to_string(),
model: model.clone(),
tools: triage_tools,
system_prompt: Some("You triage incoming requests.".into()),
},
SwarmAgent {
name: "billing".to_string(),
model: model.clone(),
tools: billing_tools,
system_prompt: Some("You handle billing questions.".into()),
},
SwarmAgent {
name: "support".to_string(),
model: model.clone(),
tools: support_tools,
system_prompt: Some("You provide technical support.".into()),
},
],
SwarmOptions::default(),
)?;
// The first agent ("triage") is the entry point.
let mut state = MessageState::new();
state.messages.push(Message::human("I need to update my payment method."));
let result = swarm.invoke(state).await?.into_state();
// The triage agent will call `transfer_to_billing`, routing to the billing agent.
println!("{}", result.messages.last().unwrap().content());
Routing Logic
- When an agent produces tool calls, the graph routes to the "tools" node.
- The tools node executes regular tool calls via the shared SerialToolExecutor.
- For handoff tools (transfer_to_<name>), it adds a synthetic tool response message and skips execution.
- After the tools node, routing inspects the last AI message for handoff calls and transfers to the target agent. If no handoff occurred, the current agent continues.
Offline Testing with ScriptedChatModel
Test swarm graphs without API keys by scripting each agent's model:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::graph::{create_swarm, MessageState, SwarmAgent, SwarmOptions};
// Triage model: transfers to billing
let triage_model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"",
vec![ToolCall {
id: "call_1".into(),
name: "transfer_to_billing".into(),
arguments: "{}".into(),
}],
),
usage: None,
},
]));
// Billing model: responds directly
let billing_model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai("Your payment method has been updated."),
usage: None,
},
]));
let swarm = create_swarm(
vec![
SwarmAgent {
name: "triage".to_string(),
model: triage_model,
tools: vec![],
system_prompt: Some("Route requests to the right agent.".into()),
},
SwarmAgent {
name: "billing".to_string(),
model: billing_model,
tools: vec![],
system_prompt: Some("Handle billing questions.".into()),
},
],
SwarmOptions::default(),
)?;
let mut state = MessageState::new();
state.messages.push(Message::human("Update my payment method."));
let result = swarm.invoke(state).await?.into_state();
Notes
- The swarm requires at least one agent. An empty list returns an error.
- All agent tools are registered in a single shared ToolRegistry, so tool names must be unique across agents.
- Each agent has its own model, so you can mix providers (e.g., a fast model for triage, a powerful model for support).
- Handoff tools are generated for all peers -- an agent cannot hand off to itself.
Handoff Tools
Handoff tools signal an intent to transfer a conversation from one agent to another.
create_handoff_tool
The create_handoff_tool function creates a Tool that, when called, returns a transfer message. The tool is named transfer_to_<agent_name> and routing logic uses this naming convention to detect handoffs.
use synaptic::graph::create_handoff_tool;
let handoff = create_handoff_tool("billing", "Transfer to the billing specialist");
// handoff.name() => "transfer_to_billing"
// handoff.description() => "Transfer to the billing specialist"
When invoked, the tool returns:
"Transferring to agent 'billing'."
Using Handoff Tools in Custom Agents
You can register handoff tools alongside regular tools when building an agent:
use std::sync::Arc;
use synaptic::graph::{create_agent, create_handoff_tool, AgentOptions};
let escalate = create_handoff_tool("human_review", "Escalate to a human reviewer");
let mut all_tools: Vec<Arc<dyn Tool>> = my_tools;
all_tools.push(escalate);
let agent = create_agent(model, all_tools, AgentOptions::default())?;
The model will see transfer_to_human_review as an available tool. When it decides to call it, your graph's conditional edges can detect the handoff and route accordingly.
Building Custom Topologies
For workflows that don't fit the supervisor or swarm patterns, combine handoff tools with a manual StateGraph:
use std::collections::HashMap;
use synaptic::graph::{
create_handoff_tool, StateGraph, FnNode, MessageState, END,
};
// Create handoff tools
let to_reviewer = create_handoff_tool("reviewer", "Send to reviewer");
let to_publisher = create_handoff_tool("publisher", "Send to publisher");
// Build nodes (agent_node, reviewer_node, publisher_node defined elsewhere)
let graph = StateGraph::new()
.add_node("drafter", drafter_node)
.add_node("reviewer", reviewer_node)
.add_node("publisher", publisher_node)
.set_entry_point("drafter")
.add_conditional_edges("drafter", |state: &MessageState| {
if let Some(last) = state.last_message() {
for tc in last.tool_calls() {
if tc.name == "transfer_to_reviewer" {
return "reviewer".to_string();
}
if tc.name == "transfer_to_publisher" {
return "publisher".to_string();
}
}
}
END.to_string()
})
.add_edge("reviewer", "drafter")
.add_edge("publisher", END)
.compile()?;
Naming Convention
The handoff tool is always named transfer_to_<agent_name>. Both create_supervisor and create_swarm rely on this convention internally when routing. If you build custom topologies, match against the same pattern in your conditional edges.
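A small sketch of matching that convention in a conditional edge, pulling the target agent name out with strip_prefix:
// Given the tool call names from the last AI message, return the handoff target, if any.
fn handoff_target(tool_names: &[&str]) -> Option<String> {
    tool_names
        .iter()
        .find_map(|name| name.strip_prefix("transfer_to_").map(str::to_string))
}
assert_eq!(handoff_target(&["search", "transfer_to_reviewer"]), Some("reviewer".to_string()));
assert_eq!(handoff_target(&["search"]), None);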
Notes
- Handoff tools take no arguments. The model calls them with an empty object {}.
- The tool itself only returns a string message -- the actual routing is handled by the graph's conditional edges, not by the tool execution.
- You can create multiple handoff tools per agent to build complex routing graphs (e.g., an agent can hand off to three different specialists).
MCP (Model Context Protocol)
The synaptic_mcp crate connects to external MCP-compatible tool servers, discovers their tools, and exposes them as standard synaptic::core::Tool implementations.
What is MCP?
The Model Context Protocol is an open standard for connecting AI models to external tool servers. An MCP server advertises a set of tools via a JSON-RPC interface. Synaptic's MCP client discovers those tools at connection time and wraps each one as a native Tool that can be used with any agent, graph, or tool executor.
Supported Transports
| Transport | Config Struct | Communication |
|---|---|---|
| Stdio | StdioConnection | Spawn a child process; JSON-RPC over stdin/stdout |
| SSE | SseConnection | HTTP POST with Server-Sent Events for streaming |
| HTTP | HttpConnection | Standard HTTP POST with JSON-RPC |
All transports use the same JSON-RPC tools/list method for discovery and tools/call method for invocation.
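The payloads are ordinary JSON-RPC 2.0 requests. A sketch of what the two request bodies look like, built with serde_json (the tool name and arguments below are examples, not part of any specific server):
use serde_json::json;
// Discovery: ask the server which tools it provides.
let list_request = json!({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
});
// Invocation: call a discovered tool by name with JSON arguments.
let call_request = json!({
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": { "name": "read_file", "arguments": { "path": "README.md" } }
});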
Quick Start
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, StdioConnection};
// Configure a single MCP server
let mut servers = HashMap::new();
servers.insert(
"my_server".to_string(),
McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@my/mcp-server".to_string()],
env: HashMap::new(),
}),
);
// Connect and discover tools
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// Use discovered tools with an agent
let agent = create_react_agent(model, tools)?;
Tool Name Prefixing
By default, discovered tool names are prefixed with the server name to avoid collisions (e.g., my_server_search). Disable this with:
let client = MultiServerMcpClient::new(servers).with_prefix(false);
Convenience Function
The load_mcp_tools function combines connect() and get_tools() in a single call:
use synaptic::mcp::load_mcp_tools;
let tools = load_mcp_tools(&client).await?;
Crate Imports
use synaptic::mcp::{
MultiServerMcpClient,
McpConnection,
StdioConnection,
SseConnection,
HttpConnection,
load_mcp_tools,
};
See the individual transport pages for detailed configuration examples.
Stdio Transport
The Stdio transport spawns a child process and communicates with it over stdin/stdout using JSON-RPC.
Configuration
use synaptic::mcp::StdioConnection;
use std::collections::HashMap;
let connection = StdioConnection {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@modelcontextprotocol/server-filesystem".to_string()],
env: HashMap::from([
("HOME".to_string(), "/home/user".to_string()),
]),
};
Fields
| Field | Type | Description |
|---|---|---|
command | String | The executable to spawn |
args | Vec<String> | Command-line arguments |
env | HashMap<String, String> | Additional environment variables (empty by default) |
How It Works
- Discovery (tools/list): Synaptic spawns the process, writes a JSON-RPC tools/list request to stdin, reads the response from stdout, then kills the process.
- Invocation (tools/call): For each tool call, Synaptic spawns a fresh process, writes a JSON-RPC tools/call request, reads the response, and kills the process.
Full Example
use std::collections::HashMap;
use std::sync::Arc;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, StdioConnection};
use synaptic::graph::create_react_agent;
// Configure an MCP server that provides filesystem tools
let mut servers = HashMap::new();
servers.insert(
"filesystem".to_string(),
McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec![
"-y".to_string(),
"@modelcontextprotocol/server-filesystem".to_string(),
"/allowed/path".to_string(),
],
env: HashMap::new(),
}),
);
// Connect and discover tools
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// tools might include: filesystem_read_file, filesystem_write_file, etc.
// Wire into an agent
let agent = create_react_agent(model, tools)?;
Testing Without a Server
For unit tests, you can test MCP client types without spawning a real server. The connection types are serializable and the client can be inspected before connecting:
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, StdioConnection};
// Create a client without connecting
let mut servers = HashMap::new();
servers.insert(
"test".to_string(),
McpConnection::Stdio(StdioConnection {
command: "echo".to_string(),
args: vec!["hello".to_string()],
env: HashMap::new(),
}),
);
let client = MultiServerMcpClient::new(servers);
// Before connect(), no tools are available
let tools = client.get_tools().await;
assert!(tools.is_empty());
// Connection types round-trip through serde
let json = serde_json::to_string(&McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec![],
env: HashMap::new(),
}))?;
let _: McpConnection = serde_json::from_str(&json)?;
For integration tests that need actual tool discovery, use a simple echo script as the MCP server command.
Notes
- Each tool call spawns a new process. This is simple but adds latency for each invocation.
- Ensure the command is available on PATH or provide an absolute path.
- The env map is merged with the current process environment -- it does not replace it.
- Stderr from the child process is discarded (Stdio::null()).
SSE Transport
The SSE (Server-Sent Events) transport connects to a remote MCP server over HTTP, using the SSE transport variant of the protocol.
Configuration
use synaptic::mcp::SseConnection;
use std::collections::HashMap;
let connection = SseConnection {
url: "http://localhost:3001/mcp".to_string(),
headers: HashMap::from([
("Authorization".to_string(), "Bearer my-token".to_string()),
]),
};
Fields
| Field | Type | Description |
|---|---|---|
url | String | The MCP server endpoint URL |
headers | HashMap<String, String> | Additional HTTP headers (e.g., auth tokens) |
How It Works
Both tool discovery (tools/list) and tool invocation (tools/call) use HTTP POST requests with JSON-RPC payloads against the configured URL. The Content-Type: application/json header is added automatically.
Full Example
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, SseConnection};
let mut servers = HashMap::new();
servers.insert(
"search".to_string(),
McpConnection::Sse(SseConnection {
url: "http://localhost:3001/mcp".to_string(),
headers: HashMap::from([
("Authorization".to_string(), "Bearer secret".to_string()),
]),
}),
);
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// tools might include: search_web_search, search_image_search, etc.
Notes
- SSE and HTTP transports share the same underlying HTTP POST mechanism for tool calls.
- The headers map is applied to every request (both discovery and invocation).
- The server must implement the MCP JSON-RPC interface at the given URL.
HTTP Transport
The HTTP transport connects to an MCP server using standard HTTP POST requests with JSON-RPC payloads.
Configuration
use synaptic::mcp::HttpConnection;
use std::collections::HashMap;
let connection = HttpConnection {
url: "https://mcp.example.com/rpc".to_string(),
headers: HashMap::from([
("X-Api-Key".to_string(), "my-api-key".to_string()),
]),
};
Fields
| Field | Type | Description |
|---|---|---|
url | String | The MCP server endpoint URL |
headers | HashMap<String, String> | Additional HTTP headers (e.g., API keys) |
How It Works
Both tool discovery (tools/list) and tool invocation (tools/call) send a JSON-RPC POST request to the configured URL. The Content-Type: application/json header is added automatically. Custom headers from the config are included in every request.
Full Example
use std::collections::HashMap;
use synaptic::mcp::{MultiServerMcpClient, McpConnection, HttpConnection};
let mut servers = HashMap::new();
servers.insert(
"calculator".to_string(),
McpConnection::Http(HttpConnection {
url: "https://mcp.example.com/rpc".to_string(),
headers: HashMap::from([
("X-Api-Key".to_string(), "my-api-key".to_string()),
]),
}),
);
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// tools might include: calculator_add, calculator_multiply, etc.
Notes
- HTTP and SSE transports use identical request/response handling for tool calls. The distinction is in how the MCP server manages the connection.
- Use HTTPS in production to protect API keys and tool call payloads.
- The headers map is applied to every request, making it suitable for static authentication tokens.
Multi-Server Client
MultiServerMcpClient connects to multiple MCP servers simultaneously and aggregates all discovered tools into a single collection.
Why Multiple Servers?
Real-world agents often need tools from several sources: a filesystem server for local files, a web search server for internet queries, and a database server for structured data. MultiServerMcpClient lets you configure all of them in one place and get back a unified Vec<Arc<dyn Tool>>.
Configuration
Pass a HashMap<String, McpConnection> where keys are server names and values are connection configs. You can mix transports freely:
use std::collections::HashMap;
use synaptic::mcp::{
MultiServerMcpClient, McpConnection,
StdioConnection, HttpConnection, SseConnection,
};
let mut servers = HashMap::new();
// Local filesystem server via stdio
servers.insert(
"fs".to_string(),
McpConnection::Stdio(StdioConnection {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@mcp/server-filesystem".to_string()],
env: HashMap::new(),
}),
);
// Remote search server via HTTP
servers.insert(
"search".to_string(),
McpConnection::Http(HttpConnection {
url: "https://search.example.com/mcp".to_string(),
headers: HashMap::from([
("Authorization".to_string(), "Bearer token".to_string()),
]),
}),
);
// Analytics server via SSE
servers.insert(
"analytics".to_string(),
McpConnection::Sse(SseConnection {
url: "http://localhost:8080/mcp".to_string(),
headers: HashMap::new(),
}),
);
Connecting and Using Tools
let client = MultiServerMcpClient::new(servers);
client.connect().await?;
let tools = client.get_tools().await;
// Tools from all three servers are combined:
// fs_read_file, fs_write_file, search_web_search, analytics_query, ...
// Pass directly to an agent
let agent = create_react_agent(model, tools)?;
Tool Name Prefixing
By default, every tool name is prefixed with its server name to prevent collisions. For example, a tool named read_file from the "fs" server becomes fs_read_file.
To disable prefixing (when you know tool names are globally unique):
let client = MultiServerMcpClient::new(servers).with_prefix(false);
load_mcp_tools Shorthand
The load_mcp_tools convenience function combines connect() and get_tools():
use synaptic::mcp::load_mcp_tools;
let client = MultiServerMcpClient::new(servers);
let tools = load_mcp_tools(&client).await?;
Notes
- connect() iterates over all servers sequentially. If any server fails, the entire call returns an error.
- Tools are stored in an Arc<RwLock<Vec<...>>> internally, so get_tools() is safe to call from multiple tasks.
- The server name is used only for prefixing tool names -- it does not need to match any value on the server side.
Deep Agent
A Deep Agent is a high-level agent abstraction that combines a middleware stack, a backend for filesystem and state operations, and a factory for creating fully-configured agents in a single call. It is designed for tasks that require reading and writing files, spawning subagents, loading skills, and maintaining persistent memory -- the kinds of workflows typically associated with coding assistants and autonomous research agents.
Architecture
A Deep Agent is assembled from layers that wrap a core ReAct agent graph:
+-----------------------------------------------+
| Deep Agent |
| +------------------------------------------+ |
| | Middleware Stack | |
| | - DeepMemoryMiddleware (AGENTS.md) | |
| | - SkillsMiddleware (SKILL.md injection) | |
| | - FilesystemMiddleware (tool eviction) | |
| | - SubAgentMiddleware (task tool) | |
| | - DeepSummarizationMiddleware | |
| | - PatchToolCallsMiddleware | |
| +------------------------------------------+ |
| +------------------------------------------+ |
| | Filesystem Tools | |
| | ls, read_file, write_file, edit_file, | |
| | glob, grep (+execute if supported) | |
| +------------------------------------------+ |
| +------------------------------------------+ |
| | Backend (State / Store / Filesystem) | |
| +------------------------------------------+ |
| +------------------------------------------+ |
| | ReAct Agent Graph (agent + tools nodes) | |
| +------------------------------------------+ |
+-----------------------------------------------+
Core Capabilities
| Capability | Description |
|---|---|
| Filesystem tools | Read, write, edit, search, and list files through a pluggable backend. An execute tool is added when the backend supports it. |
| Subagents | Spawn child agents for isolated subtasks with recursion depth control (max_subagent_depth) |
| Skills | Load SKILL.md files from a configurable directory that inject domain-specific instructions into the system prompt |
| Memory | Persist learned context in AGENTS.md and reload it across sessions |
| Summarization | Auto-summarize conversation history when context length exceeds summarization_threshold of max_input_tokens |
| Backend abstraction | Swap between in-memory (StateBackend), persistent store (StoreBackend), and real filesystem (FilesystemBackend) backends |
Minimal Example
use synaptic::deep::{create_deep_agent, DeepAgentOptions, backend::FilesystemBackend};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
use synaptic::core::Message;
use std::sync::Arc;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(FilesystemBackend::new("/path/to/workspace"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
let result = agent.invoke(MessageState::with_messages(vec![
Message::human("List the Rust files in src/"),
])).await?;
println!("{}", result.into_state().last_message_content());
create_deep_agent returns a CompiledGraph<MessageState> -- the same graph type used by create_react_agent. You invoke it with a MessageState containing your input messages and receive a GraphResult<MessageState> back.
Guides
- Quickstart -- create and run your first Deep Agent
- Backends -- choose between State, Store, and Filesystem backends
- Filesystem Tools -- reference for the built-in tools
- Subagents -- delegate subtasks to child agents
- Skills -- extend agent behavior with SKILL.md files
- Memory -- persistent agent memory via AGENTS.md
- Customization -- full DeepAgentOptions reference
When to Use a Deep Agent
Use a Deep Agent when your task involves file manipulation, multi-step reasoning over project state, or spawning subtasks. If you only need a simple question-answering loop, a plain create_react_agent is sufficient. Deep Agent adds the infrastructure layers that turn a basic ReAct loop into an autonomous coding or research assistant.
Quickstart
This guide walks you through creating and running a Deep Agent in three steps.
Prerequisites
Add the required crates to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["deep", "openai"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Step 1: Create a Backend
The backend determines how the agent interacts with the outside world. For this quickstart we use FilesystemBackend, which reads and writes real files on your machine:
use synaptic::deep::backend::FilesystemBackend;
use std::sync::Arc;
let backend = Arc::new(FilesystemBackend::new("/tmp/my-workspace"));
For testing without touching the filesystem, swap in StateBackend::new() instead:
use synaptic::deep::backend::StateBackend;
let backend = Arc::new(StateBackend::new());
Step 2: Create the Agent
Use create_deep_agent with a model and a DeepAgentOptions. The options struct has sensible defaults -- you only need to provide the backend:
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::openai::OpenAiChatModel;
use std::sync::Arc;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
create_deep_agent wires up the full middleware stack (memory, skills, filesystem, subagents, summarization, tool-call patching), registers the filesystem tools, and compiles the underlying ReAct graph. It returns a CompiledGraph<MessageState>.
Step 3: Run the Agent
Build a MessageState with your prompt and call invoke. The agent will reason, call tools, and return a final result:
use synaptic::graph::MessageState;
use synaptic::core::Message;
let state = MessageState::with_messages(vec![
Message::human("Create a file called hello.txt containing 'Hello, world!'"),
]);
let result = agent.invoke(state).await?;
println!("{}", result.into_state().last_message_content());
What Happens Under the Hood
When you call agent.invoke(state):
- Memory loading -- The DeepMemoryMiddleware checks for an AGENTS.md file via the backend and injects any saved context into the system prompt.
- Skills injection -- The SkillsMiddleware scans the .skills/ directory for SKILL.md files and adds matching skill instructions to the system prompt.
- Agent loop -- The underlying ReAct graph enters its reason-act-observe loop. The model sees the filesystem tools and decides which ones to call.
- Tool execution -- Each tool call (e.g. write_file) is dispatched through the backend. FilesystemBackend performs real I/O; StateBackend operates on an in-memory map.
- Summarization -- If the conversation grows beyond the configured token threshold (default: 85% of 128,000 tokens), the DeepSummarizationMiddleware compresses older messages into a summary before the next model call.
- Tool-call patching -- The PatchToolCallsMiddleware fixes malformed tool calls before they reach the executor.
- Final answer -- When the model responds without tool calls, the graph terminates and invoke returns the GraphResult<MessageState>.
Customizing Options
DeepAgentOptions fields can be set directly before passing to create_deep_agent:
let mut options = DeepAgentOptions::new(backend);
options.system_prompt = Some("You are a Rust expert.".to_string());
options.max_input_tokens = 64_000;
options.enable_subagents = false;
let agent = create_deep_agent(model, options)?;
Key defaults:
| Field | Default |
|---|---|
max_input_tokens | 128,000 |
summarization_threshold | 0.85 |
eviction_threshold | 20,000 |
max_subagent_depth | 3 |
skills_dir | ".skills" |
memory_file | "AGENTS.md" |
enable_subagents | true |
enable_filesystem | true |
enable_skills | true |
enable_memory | true |
Full Working Example
use std::sync::Arc;
use synaptic::core::Message;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, backend::FilesystemBackend};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(FilesystemBackend::new("/tmp/demo"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
let state = MessageState::with_messages(vec![
Message::human("What files are in the current directory?"),
]);
let result = agent.invoke(state).await?;
println!("{}", result.into_state().last_message_content());
Ok(())
}
Next Steps
- Backends -- learn about State, Store, and Filesystem backends
- Filesystem Tools -- see what each tool does
- Customization -- tune every option with DeepAgentOptions
Backends
A Deep Agent backend controls how filesystem tools interact with the outside world. Synaptic provides three built-in backends. You choose the one that matches your deployment context.
StateBackend
An entirely in-memory backend. Files are stored in a HashMap<String, String> keyed by normalized paths and never touch the real filesystem. Directories are inferred from path prefixes rather than stored as explicit entries. This is the default for tests and sandboxed demos.
use synaptic::deep::backend::StateBackend;
use std::sync::Arc;
let backend = Arc::new(StateBackend::new());
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
// After the agent runs, inspect the virtual filesystem:
let entries = backend.ls("/").await?;
let content = backend.read_file("/hello.txt", 0, 2000).await?;
StateBackend does not support shell command execution -- supports_execution() returns false and execute() returns an error.
When to use: Unit tests, CI pipelines, sandboxed playgrounds where no real I/O should occur.
StoreBackend
Persists files through Synaptic's Store trait. Each file is stored as an item with key=path and value={"content": "..."}. All items share a configurable namespace prefix. This lets you back the agent's workspace with any store implementation -- InMemoryStore for development, or a custom database-backed store for production.
use synaptic::deep::backend::StoreBackend;
use synaptic::store::InMemoryStore;
use std::sync::Arc;
let store = Arc::new(InMemoryStore::new());
let namespace = vec!["workspace".to_string(), "agent1".to_string()];
let backend = Arc::new(StoreBackend::new(store, namespace));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
The second argument is a Vec<String> namespace. All file keys are stored under this namespace, so multiple agents can share a single store without key collisions.
StoreBackend does not support shell command execution -- supports_execution() returns false and execute() returns an error.
When to use: Server deployments where you want persistence without granting direct filesystem access. Ideal for multi-tenant applications.
FilesystemBackend
Reads and writes real files on the host operating system. This is the backend you want for coding assistants and local automation.
use synaptic::deep::backend::FilesystemBackend;
use std::sync::Arc;
let backend = Arc::new(FilesystemBackend::new("/home/user/project"));
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
The path you provide becomes the agent's root directory. All tool paths are resolved relative to this root. The agent cannot escape the root directory -- paths containing .. are rejected.
FilesystemBackend is the only built-in backend that supports shell command execution. Commands run via sh -c in the root directory with an optional timeout. When this backend is used, create_filesystem_tools automatically includes the execute tool.
Feature gate:
FilesystemBackend requires the filesystem Cargo feature on synaptic-deep. The synaptic facade does not forward this feature, so add synaptic-deep as an explicit dependency:
synaptic = { version = "0.2", features = ["deep"] }
synaptic-deep = { version = "0.2", features = ["filesystem"] }
When to use: Local CLI tools, coding assistants, any scenario where the agent must interact with real files.
Implementing a Custom Backend
All three backends implement the Backend trait from synaptic::deep::backend:
use synaptic::deep::backend::{Backend, DirEntry, ExecResult, GrepOutputMode};
#[async_trait]
pub trait Backend: Send + Sync {
/// List entries in a directory.
async fn ls(&self, path: &str) -> Result<Vec<DirEntry>, SynapticError>;
/// Read file contents with line-based pagination.
async fn read_file(&self, path: &str, offset: usize, limit: usize)
-> Result<String, SynapticError>;
/// Create or overwrite a file.
async fn write_file(&self, path: &str, content: &str) -> Result<(), SynapticError>;
/// Find-and-replace text in a file.
async fn edit_file(&self, path: &str, old_text: &str, new_text: &str, replace_all: bool)
-> Result<(), SynapticError>;
/// Match file paths against a glob pattern within a base directory.
async fn glob(&self, pattern: &str, base: &str) -> Result<Vec<String>, SynapticError>;
/// Search file contents by regex pattern.
async fn grep(&self, pattern: &str, path: Option<&str>, file_glob: Option<&str>,
output_mode: GrepOutputMode) -> Result<String, SynapticError>;
/// Execute a shell command. Returns error by default.
async fn execute(&self, command: &str, timeout: Option<Duration>)
-> Result<ExecResult, SynapticError> { /* default: error */ }
/// Whether this backend supports shell command execution.
fn supports_execution(&self) -> bool { false }
}
Supporting types:
- DirEntry -- { name: String, is_dir: bool, size: Option<u64> }
- ExecResult -- { stdout: String, stderr: String, exit_code: i32 }
- GrepMatch -- { file: String, line_number: usize, line: String }
- GrepOutputMode -- FilesWithMatches | Content | Count
Implement this trait to back the agent with S3, a database, a remote server over SSH, or any other storage layer. Override execute and supports_execution if you want to enable the execute tool for your backend.
Offline Testing
Use StateBackend with ScriptedChatModel to test deep agents without API keys or real filesystem access:
use std::sync::Arc;
use synaptic::core::{ChatResponse, Message, ToolCall};
use synaptic::models::ScriptedChatModel;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::deep::backend::StateBackend;
// Script the model to write a file then finish
let model = Arc::new(ScriptedChatModel::new(vec![
ChatResponse {
message: Message::ai_with_tool_calls(
"I'll create a file.",
vec![ToolCall {
id: "call_1".into(),
name: "write_file".into(),
arguments: r#"{"path": "/hello.txt", "content": "Hello from test!"}"#.into(),
}],
),
usage: None,
},
ChatResponse {
message: Message::ai("Done! I created hello.txt."),
usage: None,
},
]));
let backend = Arc::new(StateBackend::new());
let options = DeepAgentOptions::new(backend.clone());
let agent = create_deep_agent(model, options)?;
// Run the agent...
// Then inspect the virtual filesystem:
let content = backend.read_file("/hello.txt", 0, 2000).await?;
assert!(content.contains("Hello from test!"));
This pattern is ideal for CI pipelines and unit tests. The StateBackend is fully deterministic and requires no cleanup.
Comparison
| Backend | Persistence | Real I/O | Execution | Feature gate | Best for |
|---|---|---|---|---|---|
StateBackend | None (in-memory) | No | No | None | Tests, sandboxing |
StoreBackend | Via Store trait | No | No | None | Servers, multi-tenant |
FilesystemBackend | Disk | Yes | Yes | filesystem | Local CLI, coding assistants |
Filesystem Tools
A Deep Agent ships with six built-in filesystem tools, plus a conditional seventh. These tools are automatically registered when you call create_deep_agent (if enable_filesystem is true, which is the default) and are dispatched through whichever backend you configure.
Creating the Tools
If you need the tool set outside of a DeepAgent (for example, in a custom graph), use the factory function:
use synaptic::deep::tools::create_filesystem_tools;
use synaptic::deep::backend::FilesystemBackend;
use std::sync::Arc;
let backend = Arc::new(FilesystemBackend::new("/workspace"));
let tools = create_filesystem_tools(backend);
// tools: Vec<Arc<dyn Tool>>
// 6 tools always: ls, read_file, write_file, edit_file, glob, grep
// + execute (only if backend.supports_execution() returns true)
The execute tool is only included when the backend reports that it supports execution. For FilesystemBackend this is always the case. For StateBackend and StoreBackend, execution is not supported and the tool is omitted.
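You can confirm this at runtime by checking the returned tool names (using Tool::name() as shown elsewhere in these docs):
let has_execute = tools.iter().any(|t| t.name() == "execute");
// true for FilesystemBackend, false for StateBackend and StoreBackend
println!("execute tool available: {has_execute}");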
Tool Reference
| Tool | Description | Always present |
|---|---|---|
ls | List directory contents | Yes |
read_file | Read file contents with optional line-based pagination | Yes |
write_file | Create or overwrite a file | Yes |
edit_file | Find and replace text in an existing file | Yes |
glob | Find files matching a glob pattern | Yes |
grep | Search file contents by regex pattern | Yes |
execute | Run a shell command and capture output | Only if backend supports execution |
ls
Lists files and directories at the given path.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | Directory to list |
Returns a JSON array of entries, each with name (string), is_dir (boolean), and size (integer or null) fields.
read_file
Reads the contents of a single file with line-based pagination.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | File path to read |
offset | integer | no | Starting line number, 0-based (default 0) |
limit | integer | no | Maximum number of lines to return (default 2000) |
Returns the file contents as a string. When offset and limit are provided, returns only the requested line range.
write_file
Creates a new file or overwrites an existing one.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | Destination file path |
content | string | yes | Full file contents to write |
Returns a confirmation string (e.g. "wrote path/to/file").
edit_file
Applies a targeted string replacement within an existing file.
| Parameter | Type | Required | Description |
|---|---|---|---|
path | string | yes | File to edit |
old_string | string | yes | Exact text to find |
new_string | string | yes | Replacement text |
replace_all | boolean | no | Replace all occurrences (default false) |
When replace_all is false (the default), only the first occurrence is replaced. The tool returns an error if old_string is not found in the file.
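For illustration, an edit_file call the model might emit (the path and strings here are hypothetical):
{ "name": "edit_file", "arguments": { "path": "src/parser.rs", "old_string": "fn parse(", "new_string": "fn parse_module(", "replace_all": true } }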
glob
Finds files matching a glob pattern.
| Parameter | Type | Required | Description |
|---|---|---|---|
pattern | string | yes | Glob pattern (e.g. "**/*.rs", "src/*.toml") |
path | string | no | Base directory to search from (default ".") |
Returns matching file paths as a newline-separated string.
grep
Searches file contents for lines matching a regular expression.
| Parameter | Type | Required | Description |
|---|---|---|---|
pattern | string | yes | Regex pattern to search for |
path | string | no | Directory or file to search in (defaults to workspace root) |
glob | string | no | Glob pattern to filter which files are searched (e.g. "*.rs") |
output_mode | string | no | Output format: "files_with_matches" (default), "content", or "count" |
Output modes control the format of results:
- `files_with_matches` -- Returns one matching file path per line.
- `content` -- Returns matching lines in `file:line_number:line` format.
- `count` -- Returns match counts in `file:count` format.
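As an illustration, a grep call that searches Rust source files and returns matching lines (the pattern and path are hypothetical):
{ "name": "grep", "arguments": { "pattern": "TODO", "path": "src", "glob": "*.rs", "output_mode": "content" } }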
execute
Runs a shell command in the backend's working directory. This tool is only registered when the backend supports execution (i.e. FilesystemBackend).
| Parameter | Type | Required | Description |
|---|---|---|---|
command | string | yes | The shell command to execute |
timeout | integer | no | Timeout in seconds |
Returns a JSON object with stdout, stderr, and exit_code fields. Commands are executed via sh -c in the backend's root directory.
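For example, a call and the shape of its result (the command and output values are illustrative):
{ "name": "execute", "arguments": { "command": "cargo test", "timeout": 120 } }
// result: { "stdout": "running 12 tests ...", "stderr": "", "exit_code": 0 }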
Subagents
A Deep Agent can spawn child agents -- called subagents -- to handle isolated subtasks. Subagents run in their own context, with their own conversation history, and return a result to the parent agent when they finish.
Task Tool
When subagents are enabled, create_deep_agent adds a built-in task tool. When the parent agent calls the task tool, a new child deep agent is created via create_deep_agent() with the same model and backend, runs the requested subtask, and returns its final answer as the tool result.
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend);
options.enable_subagents = true; // enabled by default
let agent = create_deep_agent(model, options)?;
// The agent can now call the "task" tool in its reasoning loop.
// Example tool call the model might emit:
// { "name": "task", "arguments": { "description": "Refactor the parse module" } }
The task tool accepts two parameters:
| Parameter | Required | Description |
|---|---|---|
description | yes | A detailed description of the task for the sub-agent |
agent_type | no | Name of a custom sub-agent type to spawn (defaults to "general-purpose") |
SubAgentDef
For more control, define named subagent types with SubAgentDef. Each definition specifies a name, description, system prompt, and an optional tool set. SubAgentDef is a plain struct -- create it with a struct literal:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, SubAgentDef};
let mut options = DeepAgentOptions::new(backend);
options.subagents = vec![
SubAgentDef {
name: "researcher".to_string(),
description: "Research specialist".to_string(),
system_prompt: "You are a research assistant. Find relevant files and summarize them.".to_string(),
tools: vec![], // inherits default deep agent tools
},
SubAgentDef {
name: "writer".to_string(),
description: "Code writer".to_string(),
system_prompt: "You are a code writer. Implement the requested changes.".to_string(),
tools: vec![],
},
];
let agent = create_deep_agent(model, options)?;
When the parent agent calls the task tool with "agent_type": "researcher", the TaskTool finds the matching SubAgentDef by name and uses its system_prompt and tools for the child agent. If no matching definition is found, a general-purpose child agent is spawned with default settings.
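For example, the parent might delegate to the researcher defined above (the task description is illustrative):
{ "name": "task", "arguments": { "description": "Find every module that parses config files and summarize them", "agent_type": "researcher" } }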
Recursion Depth Control
Subagents can themselves spawn further subagents. To prevent unbounded recursion, configure max_subagent_depth:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend);
options.max_subagent_depth = 3; // default is 3
let agent = create_deep_agent(model, options)?;
The SubAgentMiddleware tracks the current depth with an AtomicUsize counter. When the depth limit is reached, the task tool returns an error instead of spawning a new agent. The parent agent sees this error as a tool result and can adjust its strategy.
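The middleware internals are not reproduced here, but the guard amounts to a compare-against-limit check on an atomic counter. A minimal sketch of that pattern, with hypothetical names (DepthGuard is not part of the Synaptic API):
use std::sync::atomic::{AtomicUsize, Ordering};
// Hypothetical illustration of the depth-limit pattern described above.
struct DepthGuard {
    current: AtomicUsize,
    max: usize,
}
impl DepthGuard {
    // Try to descend one level; returns false once the limit is reached.
    fn try_enter(&self) -> bool {
        let depth = self.current.fetch_add(1, Ordering::SeqCst);
        if depth >= self.max {
            self.current.fetch_sub(1, Ordering::SeqCst); // roll back
            return false;
        }
        true
    }
    // Called when the subagent finishes.
    fn leave(&self) {
        self.current.fetch_sub(1, Ordering::SeqCst);
    }
}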
Context Isolation
Each subagent starts with a fresh conversation. The parent's message history is not forwarded. This keeps the subagent focused and avoids blowing the context window. The only information the subagent receives is:
- Its own system prompt (from `SubAgentDef` or the default deep agent prompt).
- The task description provided by the parent, sent as a `Message::human()`.
- The shared backend -- subagents read and write the same workspace.
The child agent is a full deep agent created via create_deep_agent(), so it has access to the same filesystem tools, skills, and middleware stack as the parent (subject to the depth limit for further subagent spawning).
When the subagent finishes, only the content of its last AI message is returned to the parent as a tool result string. Intermediate reasoning and tool calls are discarded.
Example: Delegating a Research Task
use std::sync::Arc;
use synaptic::core::Message;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
use synaptic::graph::MessageState;
let options = DeepAgentOptions::new(backend);
let agent = create_deep_agent(model, options)?;
let state = MessageState::with_messages(vec![
Message::human("Find all TODO comments in the codebase and write a summary to TODO_REPORT.md"),
]);
let result = agent.invoke(state).await?;
let final_state = result.into_state();
// Under the hood, the agent may call:
// task({ "description": "Search for TODO comments in all .rs files" })
// The subagent runs, returns results, and the parent writes the report.
Skills
Skills extend a Deep Agent's behavior by injecting domain-specific instructions into the system prompt. A skill is defined by a SKILL.md file with YAML frontmatter and a body of Markdown instructions. The SkillsMiddleware discovers skills from the backend filesystem and presents an index to the agent, which can then read the full skill file on demand via the read_file tool.
SKILL.md Format
Each skill file starts with YAML frontmatter between --- markers containing name and description fields:
---
name: search
description: Search the web for information
---
# Search Skill
Detailed instructions for how to perform web searches effectively...
The frontmatter fields:
| Field | Required | Description |
|---|---|---|
name | yes | Unique identifier for the skill |
description | no | One-line summary shown in the skill index (defaults to empty string if omitted) |
The parser extracts name and description by scanning lines between the --- markers for name: and description: prefixes. Values may optionally be quoted with single or double quotes.
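A rough sketch of that scanning approach (illustrative only, not the actual Synaptic parser):
// Illustrative frontmatter scan: pull `name:` and `description:` from the
// block between the first pair of `---` markers, stripping optional quotes.
fn parse_frontmatter(content: &str) -> (Option<String>, Option<String>) {
    let mut name = None;
    let mut description = None;
    let mut inside = false;
    for line in content.lines() {
        let line = line.trim();
        if line == "---" {
            if inside { break; } // closing marker: stop scanning
            inside = true;
            continue;
        }
        if !inside { continue; }
        let unquote = |v: &str| v.trim().trim_matches(|c| c == '"' || c == '\'').to_string();
        if let Some(value) = line.strip_prefix("name:") {
            name = Some(unquote(value));
        } else if let Some(value) = line.strip_prefix("description:") {
            description = Some(unquote(value));
        }
    }
    (name, description)
}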
Skills Directory Structure
Place skill files in a .skills/ directory at the workspace root:
my-project/
.skills/
search/SKILL.md
testing/SKILL.md
documentation/SKILL.md
src/
main.rs
Each skill lives in its own subdirectory. The SkillsMiddleware discovers them by listing directories under the configured skills_dir and reading {skills_dir}/{dir}/SKILL.md from each.
How Discovery Works
The SkillsMiddleware implements the AgentMiddleware trait. On each call to before_model(), it:
- Lists entries in the skills directory via the backend's `ls()` method.
- For each directory entry, reads the first 50 lines of `{dir}/SKILL.md`.
- Parses the YAML frontmatter to extract `name` and `description`.
- Builds an `<available_skills>` section and appends it to the system prompt.
The injected section looks like:
<available_skills>
- **search**: Search the web for information (read `.skills/search/SKILL.md` for details)
- **testing**: Guidelines for writing tests (read `.skills/testing/SKILL.md` for details)
</available_skills>
The agent sees this index and can read the full SKILL.md file via the read_file tool when it needs the detailed instructions.
Configuration
Skills are enabled by default. Configure via DeepAgentOptions:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend);
options.skills_dir = Some(".skills".to_string()); // default
options.enable_skills = true; // default
let agent = create_deep_agent(model, options)?;
To disable skills entirely, set enable_skills = false. To change the skills directory, set skills_dir to a different path within the backend.
Example: Adding a Rust Refactoring Skill
Create the file .skills/rust-refactoring/SKILL.md in your workspace:
---
name: rust-refactoring
description: Best practices for refactoring Rust code
---
When refactoring Rust code, follow these guidelines:
1. Run `cargo clippy` before and after changes.
2. Prefer extracting functions over inline complexity.
3. Use `#[must_use]` on public functions that return values.
4. Write a test for every extracted function.
Once this file is present in the backend, the SkillsMiddleware will automatically discover it and include it in the system prompt index. The agent can then read the full file for detailed instructions when it encounters a refactoring task.
There is no programmatic skill registration API. All skills are filesystem-based, discovered at runtime by scanning the backend.
More Examples
Code Review Skill
A code review skill injects a structured checklist so the agent applies consistent review standards:
---
name: code-review
description: Structured code review checklist with severity levels
---
When reviewing code, evaluate each change against this checklist:
## Severity Levels
- **Critical**: Security vulnerabilities, data loss risks, correctness bugs
- **Major**: Performance issues, missing error handling, API contract violations
- **Minor**: Style inconsistencies, missing docs, naming improvements
## Review Checklist
1. **Correctness** — Does the logic match the stated intent?
2. **Error handling** — Are all failure paths covered?
3. **Security** — Any injection, auth bypass, or data exposure risks?
4. **Performance** — Unnecessary allocations, O(n²) loops, missing indexes?
5. **Tests** — Are new paths tested? Are edge cases covered?
6. **Naming** — Do names convey purpose without needing comments?
## Output Format
For each finding, report:
- File and line range
- Severity level
- Description and suggested fix
This turns the agent into a disciplined reviewer that categorizes findings by severity rather than giving unstructured feedback.
TDD Workflow Skill
A TDD skill constrains the agent to follow a strict Red-Green-Refactor cycle:
---
name: tdd
description: Enforce test-driven development workflow
---
Follow the Red-Green-Refactor cycle strictly:
## Step 1: Red
- Write a failing test FIRST. Run it and confirm it fails.
- The test must describe the desired behavior, not the implementation.
## Step 2: Green
- Write the MINIMUM code to make the test pass.
- Do not add extra logic, optimizations, or edge case handling yet.
- Run the test and confirm it passes.
## Step 3: Refactor
- Clean up the implementation while keeping all tests green.
- Extract helpers, rename variables, remove duplication.
- Run the full test suite after each refactoring step.
## Rules
- Never write production code without a failing test.
- One behavior per test. If a test name contains "and", split it.
- Commit after each green-refactor cycle.
This prevents the agent from jumping ahead to write implementation code before tests exist.
API Design Conventions Skill
A conventions skill encodes team-wide API standards so every endpoint the agent creates follows the same patterns:
---
name: api-conventions
description: Team API design standards for REST endpoints
---
All REST endpoints must follow these conventions:
## URL Structure
- Use kebab-case for path segments: `/user-profiles`, not `/userProfiles`
- Nest resources: `/teams/{team_id}/members/{member_id}`
- Version prefix: `/api/v1/...`
## Request/Response
- Use `snake_case` for JSON field names
- Wrap collections: `{ "items": [...], "total": 42, "next_cursor": "..." }`
- Error format: `{ "error": { "code": "NOT_FOUND", "message": "..." } }`
## Status Codes
- 200 for success, 201 for creation, 204 for deletion
- 400 for validation errors, 404 for missing resources
- 409 for conflicts, 422 for semantic errors
## Naming
- List endpoint: `GET /resources`
- Create endpoint: `POST /resources`
- Get endpoint: `GET /resources/{id}`
- Update endpoint: `PATCH /resources/{id}`
- Delete endpoint: `DELETE /resources/{id}`
Any agent working on the API layer will automatically produce consistent endpoints without per-task reminders.
Multi-Skill Cooperation
When multiple skills exist in the workspace, the agent sees all of them in the index and reads the relevant ones based on the current task. Consider this layout:
my-project/
.skills/
code-review/SKILL.md
tdd/SKILL.md
api-conventions/SKILL.md
rust-refactoring/SKILL.md
src/
main.rs
The SkillsMiddleware injects the full index into the system prompt:
<available_skills>
- **code-review**: Structured code review checklist with severity levels (read `.skills/code-review/SKILL.md` for details)
- **tdd**: Enforce test-driven development workflow (read `.skills/tdd/SKILL.md` for details)
- **api-conventions**: Team API design standards for REST endpoints (read `.skills/api-conventions/SKILL.md` for details)
- **rust-refactoring**: Best practices for refactoring Rust code (read `.skills/rust-refactoring/SKILL.md` for details)
</available_skills>
The agent then selectively reads skills that match the task at hand:
- "Add a new
/usersendpoint with tests" — the agent readsapi-conventionsandtdd, then follows the TDD cycle while applying the URL and response format standards. - "Review this pull request" — the agent reads
code-reviewand produces findings with severity levels. - "Refactor the auth module" — the agent reads
rust-refactoringandcode-review(to self-check the result).
Skills are composable: each one contributes a focused set of instructions, and the agent combines them as needed. This is more maintainable than a single monolithic system prompt.
Best Practices
Keep skills focused and concise. Each skill should cover one topic. A 20–50 line SKILL.md is ideal. If a skill grows beyond 100 lines, consider splitting it.
Use action-oriented language. Write instructions as directives ("Run tests before committing", "Use kebab-case for URLs") rather than descriptions ("Tests should ideally be run").
Format with Markdown structure. Use headings, numbered lists, and bold text. The agent processes structured content more reliably than prose paragraphs.
Name directories in kebab-case. Use lowercase with hyphens: code-review/, api-conventions/, rust-refactoring/. Avoid spaces, underscores, or camelCase.
Skills vs. system prompt. Use skills for instructions that are reusable across tasks and discoverable by name. Use the system prompt directly for instructions that always apply to every interaction. If you find yourself copying the same instructions into multiple prompts, extract them into a skill.
Memory
A Deep Agent can persist learned context across sessions by reading and writing a memory file (default AGENTS.md) in the workspace. This gives the agent a form of long-term memory that survives restarts.
How It Works
The DeepMemoryMiddleware implements AgentMiddleware. On every model call, its before_model() hook reads the configured memory file from the backend. If the file exists and is not empty, its contents are wrapped in <agent_memory> tags and appended to the system prompt:
<agent_memory>
- The user prefers tabs over spaces.
- The project uses `thiserror 2.0` for error types.
- Always run `cargo fmt` after editing Rust files.
</agent_memory>
If the file does not exist or is empty, the middleware silently skips injection. The agent sees this context before processing each message, so it can apply learned preferences immediately.
Writing to Memory
The agent can update its memory at any time by writing to the memory file using the built-in filesystem tools (e.g., write_file or edit_file). A typical pattern is for the agent to append a new line when it learns something important:
Agent reasoning: "The user corrected me -- they want snake_case, not camelCase.
I should remember this for future sessions."
Tool call: edit_file({
"path": "AGENTS.md",
"old_string": "- Always run `cargo fmt` after editing Rust files.",
"new_string": "- Always run `cargo fmt` after editing Rust files.\n- Use snake_case for all function names."
})
Because the middleware re-reads the file on every model call, updates take effect on the very next turn.
Configuration
Memory is controlled by two fields on DeepAgentOptions:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("AGENTS.md".to_string()); // default
options.enable_memory = true; // default
let agent = create_deep_agent(model, options)?;
`memory_file` (`Option<String>`, default `Some("AGENTS.md")`) -- path to the memory file within the backend. You can point this at a different file if you prefer:
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("docs/MEMORY.md".to_string());
`enable_memory` (`bool`, default `true`) -- when `true`, the `DeepMemoryMiddleware` is added to the middleware stack.
Disabling Memory
To run without persistent memory, set enable_memory to false:
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_memory = false;
let agent = create_deep_agent(model, options)?;
The DeepMemoryMiddleware is not added to the stack at all, so there is no overhead.
DeepMemoryMiddleware Internals
The middleware struct is straightforward:
pub struct DeepMemoryMiddleware {
backend: Arc<dyn Backend>,
memory_file: String,
}
impl DeepMemoryMiddleware {
pub fn new(backend: Arc<dyn Backend>, memory_file: String) -> Self;
}
It implements AgentMiddleware with a single hook:
- `before_model()` -- reads the memory file from the backend. If the content is non-empty, wraps it in `<agent_memory>` tags and appends it to the system prompt. If the file is missing or empty, does nothing.
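A simplified sketch of that injection step (illustrative; the real hook works through the AgentMiddleware trait and the configured backend):
// Illustrative: how the memory contents end up in the system prompt.
fn append_memory(system_prompt: &mut String, memory_contents: &str) {
    if memory_contents.trim().is_empty() {
        return; // missing or empty memory file: inject nothing
    }
    system_prompt.push_str("\n<agent_memory>\n");
    system_prompt.push_str(memory_contents);
    system_prompt.push_str("\n</agent_memory>");
}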
Middleware Stack Position
DeepMemoryMiddleware runs first in the middleware stack (position 1 of 7), ensuring that memory context is available to all subsequent middleware and to the model itself. See the Customization page for the full assembly order.
Customization
Every aspect of a Deep Agent can be tuned through DeepAgentOptions. This page is a field-by-field reference with examples.
DeepAgentOptions Reference
DeepAgentOptions uses direct field assignment rather than a builder pattern. Create an instance with DeepAgentOptions::new(backend) to get sensible defaults, then override fields as needed:
use std::sync::Arc;
use synaptic::deep::{create_deep_agent, DeepAgentOptions};
let mut options = DeepAgentOptions::new(backend.clone());
options.system_prompt = Some("You are a senior Rust engineer.".into());
options.max_subagent_depth = 2;
let agent = create_deep_agent(model, options)?;
Full Field List
pub struct DeepAgentOptions {
pub backend: Arc<dyn Backend>, // required
pub system_prompt: Option<String>, // None
pub tools: Vec<Arc<dyn Tool>>, // empty
pub middleware: Vec<Arc<dyn AgentMiddleware>>, // empty
pub checkpointer: Option<Arc<dyn Checkpointer>>, // None
pub store: Option<Arc<dyn Store>>, // None
pub max_input_tokens: usize, // 128_000
pub summarization_threshold: f64, // 0.85
pub eviction_threshold: usize, // 20_000
pub max_subagent_depth: usize, // 3
pub skills_dir: Option<String>, // Some(".skills")
pub memory_file: Option<String>, // Some("AGENTS.md")
pub subagents: Vec<SubAgentDef>, // empty
pub enable_subagents: bool, // true
pub enable_filesystem: bool, // true
pub enable_skills: bool, // true
pub enable_memory: bool, // true
}
Field Details
backend
The backend provides filesystem operations for the agent. This is the only required argument to DeepAgentOptions::new(). All other fields have defaults.
use synaptic::deep::backend::FilesystemBackend;
let backend = Arc::new(FilesystemBackend::new("/home/user/project"));
let options = DeepAgentOptions::new(backend);
system_prompt
Override the default system prompt entirely. When None, the agent uses a built-in prompt that describes the filesystem tools and expected behavior.
let mut options = DeepAgentOptions::new(backend.clone());
options.system_prompt = Some("You are a Rust expert. Use the provided tools to help.".into());
tools
Additional tools beyond the built-in filesystem tools. These are added to the agent's tool registry and made available to the model.
let mut options = DeepAgentOptions::new(backend.clone());
options.tools = vec![
Arc::new(MyCustomTool),
Arc::new(DatabaseQueryTool::new(db_pool)),
];
middleware
Custom middleware layers that run after the entire built-in stack. See Middleware Stack for ordering details.
let mut options = DeepAgentOptions::new(backend.clone());
options.middleware = vec![
Arc::new(AuditLogMiddleware::new(log_file)),
];
checkpointer
Optional checkpointer for graph state persistence. When provided, the agent can resume from checkpoints.
use synaptic::graph::MemorySaver;
let mut options = DeepAgentOptions::new(backend.clone());
options.checkpointer = Some(Arc::new(MemorySaver::new()));
store
Optional store for runtime tool injection via ToolRuntime.
use synaptic::store::InMemoryStore;
let mut options = DeepAgentOptions::new(backend.clone());
options.store = Some(Arc::new(InMemoryStore::new()));
max_input_tokens
Maximum input tokens before summarization is considered (default 128_000). The DeepSummarizationMiddleware uses this together with summarization_threshold to decide when to compress context.
let mut options = DeepAgentOptions::new(backend.clone());
options.max_input_tokens = 200_000; // for models with larger context windows
summarization_threshold
Fraction of max_input_tokens at which summarization triggers (default 0.85). When context exceeds max_input_tokens * summarization_threshold tokens, the middleware summarizes older messages.
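For example, with the default max_input_tokens of 128_000 and the default threshold of 0.85, summarization triggers once the estimated context exceeds 108_800 tokens; lowering the threshold to 0.70 (as below) moves that point to 89_600 tokens.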
let mut options = DeepAgentOptions::new(backend.clone());
options.summarization_threshold = 0.70; // summarize earlier
eviction_threshold
Token count above which tool results are evicted to files by the FilesystemMiddleware (default 20_000). Large tool outputs are written to a file and replaced with a reference.
let mut options = DeepAgentOptions::new(backend.clone());
options.eviction_threshold = 10_000; // evict smaller results
max_subagent_depth
Maximum recursion depth for nested subagent spawning (default 3). Prevents runaway agent chains.
let mut options = DeepAgentOptions::new(backend.clone());
options.max_subagent_depth = 2;
skills_dir
Directory path within the backend to scan for skill files (default Some(".skills")). Set to None to disable skill scanning even when enable_skills is true.
let mut options = DeepAgentOptions::new(backend.clone());
options.skills_dir = Some("my-skills".into());
memory_file
Path to the persistent memory file within the backend (default Some("AGENTS.md")). See the Memory page for details.
let mut options = DeepAgentOptions::new(backend.clone());
options.memory_file = Some("docs/MEMORY.md".into());
subagents
Custom subagent definitions for the task tool. Each SubAgentDef describes a specialized subagent that can be spawned.
use synaptic::deep::SubAgentDef;
let mut options = DeepAgentOptions::new(backend.clone());
options.subagents = vec![
SubAgentDef {
name: "researcher".into(),
description: "Searches the web for information".into(),
// ...
},
];
enable_subagents
Toggle the task tool for child agent spawning (default true). When false, the SubAgentMiddleware and its task tool are not added.
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_subagents = false;
enable_filesystem
Toggle the built-in filesystem tools and FilesystemMiddleware (default true). When false, no filesystem tools are registered.
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_filesystem = false;
enable_skills
Toggle the SkillsMiddleware for progressive skill disclosure (default true).
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_skills = false;
enable_memory
Toggle the DeepMemoryMiddleware for persistent memory (default true). See the Memory page for details.
let mut options = DeepAgentOptions::new(backend.clone());
options.enable_memory = false;
Middleware Stack
create_deep_agent assembles the middleware stack in a fixed order. Each layer can be individually enabled or disabled:
| Order | Middleware | Controlled by |
|---|---|---|
| 1 | DeepMemoryMiddleware | enable_memory |
| 2 | SkillsMiddleware | enable_skills |
| 3 | FilesystemMiddleware + filesystem tools | enable_filesystem |
| 4 | SubAgentMiddleware's task tool | enable_subagents |
| 5 | DeepSummarizationMiddleware | always added |
| 6 | PatchToolCallsMiddleware | always added |
| 7 | User-provided middleware | middleware field |
The DeepSummarizationMiddleware and PatchToolCallsMiddleware are always present regardless of configuration.
Return Type
create_deep_agent returns Result<CompiledGraph<MessageState>, SynapticError>. The resulting graph is used like any other Synaptic graph:
use synaptic::core::Message;
use synaptic::graph::MessageState;
let agent = create_deep_agent(model, options)?;
let result = agent.invoke(MessageState::with_messages(vec![
Message::human("Refactor the error handling in src/lib.rs"),
])).await?;
Full Example
use std::sync::Arc;
use synaptic::core::Message;
use synaptic::deep::{create_deep_agent, DeepAgentOptions, backend::FilesystemBackend};
use synaptic::graph::MessageState;
use synaptic::openai::OpenAiChatModel;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let backend = Arc::new(FilesystemBackend::new("/home/user/project"));
let mut options = DeepAgentOptions::new(backend);
options.system_prompt = Some("You are a senior Rust engineer.".into());
options.summarization_threshold = 0.70;
options.enable_subagents = true;
options.max_subagent_depth = 2;
let agent = create_deep_agent(model, options)?;
let result = agent.invoke(MessageState::with_messages(vec![
Message::human("Refactor the error handling in src/lib.rs"),
])).await?;
Callbacks
Synaptic provides an event-driven callback system for observing agent execution. The CallbackHandler trait receives RunEvent values at key lifecycle points -- when a run starts, when the LLM is called, when tools are executed, and when the run finishes or fails.
The CallbackHandler Trait
The trait is defined in synaptic_core:
#[async_trait]
pub trait CallbackHandler: Send + Sync {
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError>;
}
A single method receives all event types. Handlers are Send + Sync so they can be shared across async tasks.
RunEvent Variants
The RunEvent enum covers the full agent lifecycle:
| Variant | Fields | When It Fires |
|---|---|---|
RunStarted | run_id, session_id | At the beginning of an agent run |
RunStep | run_id, step | At each iteration of the agent loop |
LlmCalled | run_id, message_count | When the LLM is invoked with messages |
ToolCalled | run_id, tool_name | When a tool is executed |
RunFinished | run_id, output | When the agent produces a final answer |
RunFailed | run_id, error | When the agent run fails with an error |
RunEvent implements Clone, so handlers can store copies of events for later inspection.
Built-in Handlers
Synaptic ships with four callback handlers:
| Handler | Purpose |
|---|---|
| RecordingCallback | Records all events in memory for later inspection |
| TracingCallback | Emits structured tracing spans and events |
| StdOutCallbackHandler | Prints events to stdout (with optional verbose mode) |
| CompositeCallback | Dispatches events to multiple handlers |
Implementing a Custom Handler
You can implement CallbackHandler to add your own observability:
use async_trait::async_trait;
use synaptic::core::{CallbackHandler, RunEvent, SynapticError};
struct MetricsCallback;
#[async_trait]
impl CallbackHandler for MetricsCallback {
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError> {
match event {
RunEvent::LlmCalled { message_count, .. } => {
// Record to your metrics system
println!("LLM called with {message_count} messages");
}
RunEvent::ToolCalled { tool_name, .. } => {
println!("Tool executed: {tool_name}");
}
_ => {}
}
Ok(())
}
}
Guides
- Recording Callback -- capture events in memory for testing and inspection
- Tracing Callback -- integrate with the Rust `tracing` ecosystem
- Composite Callback -- dispatch events to multiple handlers simultaneously
Recording Callback
RecordingCallback captures every RunEvent in an in-memory list. This is useful for testing agent behavior, debugging execution flow, and building audit logs.
Usage
use synaptic::callbacks::RecordingCallback;
use synaptic::core::RunEvent;
let callback = RecordingCallback::new();
// ... pass the callback to an agent or use it manually ...
// After the run, inspect all recorded events
let events = callback.events().await;
for event in &events {
match event {
RunEvent::RunStarted { run_id, session_id } => {
println!("Run started: run_id={run_id}, session={session_id}");
}
RunEvent::RunStep { run_id, step } => {
println!("Step {step} in run {run_id}");
}
RunEvent::LlmCalled { run_id, message_count } => {
println!("LLM called with {message_count} messages (run {run_id})");
}
RunEvent::ToolCalled { run_id, tool_name } => {
println!("Tool '{tool_name}' called (run {run_id})");
}
RunEvent::RunFinished { run_id, output } => {
println!("Run {run_id} finished: {output}");
}
RunEvent::RunFailed { run_id, error } => {
println!("Run {run_id} failed: {error}");
}
}
}
How It Works
RecordingCallback stores events in an Arc<RwLock<Vec<RunEvent>>>. Each call to on_event() appends the event to the list. The events() method returns a clone of the full event list.
Because it uses Arc, the callback can be cloned and shared across tasks. All clones refer to the same event storage.
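A minimal sketch of that sharing behavior:
use synaptic::callbacks::RecordingCallback;
let callback = RecordingCallback::new();
let for_agent = callback.clone();
// Pass `for_agent` to the agent (or a spawned task); keep `callback` for inspection.
// Both handles point at the same Arc-backed storage, so events recorded through
// either clone are visible from the other:
let events = callback.events().await;
println!("recorded {} events so far", events.len());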
Testing Example
RecordingCallback is particularly useful in tests to verify that an agent followed the expected execution path:
#[tokio::test]
async fn test_agent_calls_tool() {
let callback = RecordingCallback::new();
// ... run the agent with this callback ...
let events = callback.events().await;
// Verify the agent called the expected tool
let tool_events: Vec<_> = events.iter()
.filter_map(|e| match e {
RunEvent::ToolCalled { tool_name, .. } => Some(tool_name.clone()),
_ => None,
})
.collect();
assert!(tool_events.contains(&"calculator".to_string()));
}
Thread Safety
RecordingCallback is Clone, Send, and Sync. You can safely share it across async tasks and inspect events from any task that holds a reference.
Tracing Callback
TracingCallback integrates Synaptic's callback system with the Rust tracing ecosystem. Instead of storing events in memory, it emits structured tracing spans and events that flow into whatever subscriber you have configured -- terminal output, JSON logs, OpenTelemetry, etc.
Setup
First, initialize a tracing subscriber. The simplest option is the fmt subscriber from tracing-subscriber:
use tracing_subscriber;
// Initialize the default subscriber (prints to stdout)
tracing_subscriber::fmt::init();
Then create the callback:
use synaptic::callbacks::TracingCallback;
let callback = TracingCallback::new();
Pass this callback to your agent or use it with CompositeCallback.
What Gets Logged
TracingCallback maps each RunEvent variant to a tracing call:
| RunEvent | Tracing Level | Key Fields |
|---|---|---|
RunStarted | info! | run_id, session_id |
RunStep | info! | run_id, step |
LlmCalled | info! | run_id, message_count |
ToolCalled | info! | run_id, tool_name |
RunFinished | info! | run_id, output_len |
RunFailed | error! | run_id, error |
All events except RunFailed are logged at the INFO level. Failures are logged at ERROR.
Example Output
With the default fmt subscriber, you might see:
2026-02-17T10:30:00.123Z INFO synaptic: run started run_id="abc-123" session_id="user-1"
2026-02-17T10:30:00.456Z INFO synaptic: LLM called run_id="abc-123" message_count=3
2026-02-17T10:30:01.234Z INFO synaptic: tool called run_id="abc-123" tool_name="calculator"
2026-02-17T10:30:01.567Z INFO synaptic: run finished run_id="abc-123" output_len=42
Integration with the Tracing Ecosystem
Because TracingCallback uses the standard tracing macros, it works with any compatible subscriber:
- `tracing-subscriber` -- terminal formatting, filtering, layering.
- `tracing-opentelemetry` -- export spans to Jaeger, Zipkin, or any OTLP collector.
- `tracing-appender` -- write logs to rolling files.
- JSON output -- use `tracing_subscriber::fmt().json()` for structured log ingestion.
// Example: JSON-formatted logs
tracing_subscriber::fmt()
.json()
.init();
let callback = TracingCallback::new();
When to Use
Use TracingCallback when:
- You want production-grade structured logging with minimal setup.
- You are already using the `tracing` ecosystem in your application.
- You need to export agent telemetry to an observability platform (Datadog, Grafana, etc.).
For test-time event inspection, consider RecordingCallback instead, which stores events for programmatic access.
Composite Callback
CompositeCallback dispatches each RunEvent to multiple callback handlers. This lets you combine different observability strategies without choosing just one -- for example, recording events in memory for tests while also logging them via tracing.
Usage
use synaptic::callbacks::{CompositeCallback, RecordingCallback, TracingCallback};
use std::sync::Arc;
let recording = Arc::new(RecordingCallback::new());
let tracing_cb = Arc::new(TracingCallback::new());
let composite = CompositeCallback::new(vec![
recording.clone(),
tracing_cb,
]);
When composite.on_event(event) is called, the event is forwarded to each handler in order. If any handler returns an error, the composite stops and propagates that error.
How It Works
CompositeCallback holds a Vec<Arc<dyn CallbackHandler>>. On each event:
- The event is cloned for each handler (since `RunEvent` implements `Clone`).
- Each handler's `on_event()` is awaited sequentially.
- If all handlers succeed, `Ok(())` is returned.
// Pseudocode of the dispatch logic
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError> {
for handler in &self.handlers {
handler.on_event(event.clone()).await?;
}
Ok(())
}
Example: Recording + Tracing + Custom
You can mix built-in and custom handlers:
use async_trait::async_trait;
use synaptic::core::{CallbackHandler, RunEvent, SynapticError};
use synaptic::callbacks::{
CompositeCallback, RecordingCallback, TracingCallback, StdOutCallbackHandler,
};
use std::sync::Arc;
struct ToolCounter {
count: Arc<tokio::sync::RwLock<usize>>,
}
#[async_trait]
impl CallbackHandler for ToolCounter {
async fn on_event(&self, event: RunEvent) -> Result<(), SynapticError> {
if matches!(event, RunEvent::ToolCalled { .. }) {
*self.count.write().await += 1;
}
Ok(())
}
}
let counter = Arc::new(ToolCounter {
count: Arc::new(tokio::sync::RwLock::new(0)),
});
let composite = CompositeCallback::new(vec![
Arc::new(RecordingCallback::new()),
Arc::new(TracingCallback::new()),
Arc::new(StdOutCallbackHandler::new()),
counter.clone(),
]);
When to Use
Use CompositeCallback whenever you need more than one callback handler active at the same time. Common combinations:
- Development: `StdOutCallbackHandler` + `RecordingCallback` -- see events in the terminal and inspect them programmatically.
- Testing: `RecordingCallback` alone is usually sufficient.
- Production: `TracingCallback` + custom metrics handler -- structured logs plus application-specific telemetry.
Evaluation
Synaptic provides an evaluation framework for measuring the quality of AI outputs. The Evaluator trait defines a standard interface for scoring predictions against references, and the Dataset + evaluate() pipeline makes it easy to run batch evaluations across many test cases.
The Evaluator Trait
All evaluators implement the Evaluator trait from synaptic_eval:
#[async_trait]
pub trait Evaluator: Send + Sync {
async fn evaluate(
&self,
prediction: &str,
reference: &str,
input: &str,
) -> Result<EvalResult, SynapticError>;
}
- `prediction` -- the AI's output to evaluate.
- `reference` -- the expected or ground-truth answer.
- `input` -- the original input that produced the prediction.
EvalResult
Every evaluator returns an EvalResult:
pub struct EvalResult {
pub score: f64, // Between 0.0 and 1.0
pub passed: bool, // true if score >= 0.5
pub reasoning: Option<String>, // Optional explanation
}
Helper constructors:
| Method | Score | Passed |
|---|---|---|
EvalResult::pass() | 1.0 | true |
EvalResult::fail() | 0.0 | false |
EvalResult::with_score(0.75) | 0.75 | true (>= 0.5) |
You can attach reasoning with .with_reasoning("explanation").
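For example (assuming EvalResult is exported from synaptic::eval alongside the evaluators):
use synaptic::eval::EvalResult;
let result = EvalResult::with_score(0.75).with_reasoning("close, but one detail is missing");
assert!(result.passed); // 0.75 >= 0.5
assert_eq!(result.score, 0.75);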
Built-in Evaluators
Synaptic provides five evaluators out of the box:
| Evaluator | What It Checks |
|---|---|
ExactMatchEvaluator | Exact string equality (with optional case-insensitive mode) |
JsonValidityEvaluator | Whether the prediction is valid JSON |
RegexMatchEvaluator | Whether the prediction matches a regex pattern |
EmbeddingDistanceEvaluator | Cosine similarity between prediction and reference embeddings |
LLMJudgeEvaluator | Uses an LLM to score prediction quality on a 0-10 scale |
See Evaluators for detailed usage of each.
Batch Evaluation
The evaluate() function runs an evaluator across a Dataset of test cases, producing an EvalReport with aggregate statistics. See Datasets for details.
Guides
- Evaluators -- usage and configuration for each built-in evaluator
- Datasets -- batch evaluation with
Datasetandevaluate()
Evaluators
Synaptic provides five built-in evaluators, ranging from simple string matching to LLM-based judgment. All implement the Evaluator trait and return an EvalResult with a score, pass/fail status, and optional reasoning.
ExactMatchEvaluator
Checks whether the prediction exactly matches the reference string:
use synaptic::eval::{ExactMatchEvaluator, Evaluator};
// Case-sensitive (default)
let eval = ExactMatchEvaluator::new();
let result = eval.evaluate("hello", "hello", "").await?;
assert!(result.passed);
assert_eq!(result.score, 1.0);
let result = eval.evaluate("Hello", "hello", "").await?;
assert!(!result.passed); // Case mismatch
// Case-insensitive
let eval = ExactMatchEvaluator::case_insensitive();
let result = eval.evaluate("Hello", "hello", "").await?;
assert!(result.passed); // Now passes
On failure, the reasoning field shows what was expected versus what was received.
JsonValidityEvaluator
Checks whether the prediction is valid JSON. The reference and input are ignored:
use synaptic::eval::{JsonValidityEvaluator, Evaluator};
let eval = JsonValidityEvaluator::new();
let result = eval.evaluate(r#"{"key": "value"}"#, "", "").await?;
assert!(result.passed);
let result = eval.evaluate("not json", "", "").await?;
assert!(!result.passed);
// reasoning: "Invalid JSON: expected ident at line 1 column 2"
This is useful for validating that an LLM produced well-formed JSON output.
RegexMatchEvaluator
Checks whether the prediction matches a regular expression pattern:
use synaptic::eval::{RegexMatchEvaluator, Evaluator};
// Match a date pattern
let eval = RegexMatchEvaluator::new(r"\d{4}-\d{2}-\d{2}")?;
let result = eval.evaluate("2024-01-15", "", "").await?;
assert!(result.passed);
let result = eval.evaluate("January 15, 2024", "", "").await?;
assert!(!result.passed);
The constructor returns a Result because the regex pattern is validated at creation time. Invalid patterns produce a SynapticError::Validation.
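For example, an invalid pattern is rejected when the evaluator is constructed rather than at evaluation time:
use synaptic::eval::RegexMatchEvaluator;
// An unclosed character class is not a valid regex, so construction fails.
assert!(RegexMatchEvaluator::new(r"[unclosed").is_err());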
EmbeddingDistanceEvaluator
Computes cosine similarity between the embeddings of the prediction and reference. The score equals the cosine similarity, and the evaluation passes if the similarity meets or exceeds the threshold:
use synaptic::eval::{EmbeddingDistanceEvaluator, Evaluator};
use synaptic::embeddings::FakeEmbeddings;
use std::sync::Arc;
let embeddings = Arc::new(FakeEmbeddings::new());
let eval = EmbeddingDistanceEvaluator::new(embeddings, 0.8);
let result = eval.evaluate("the cat sat", "the cat sat on the mat", "").await?;
println!("Similarity: {:.4}", result.score);
println!("Passed (>= 0.8): {}", result.passed);
// reasoning: "Cosine similarity: 0.9234, threshold: 0.8000"
Parameters:
- `embeddings` -- any embeddings implementation wrapped in `Arc<dyn Embeddings>` (e.g., `OpenAiEmbeddings` from `synaptic::openai`, `OllamaEmbeddings` from `synaptic::ollama`, `FakeEmbeddings` from `synaptic::embeddings`).
- `threshold` -- minimum cosine similarity to pass. A typical value is `0.8` for semantic similarity checks.
LLMJudgeEvaluator
Uses an LLM to judge the quality of a prediction on a 0-10 scale. The score is normalized to 0.0-1.0:
use synaptic::eval::{LLMJudgeEvaluator, Evaluator};
use synaptic::openai::OpenAiChatModel;
use std::sync::Arc;
let model = Arc::new(OpenAiChatModel::new("gpt-4o"));
let eval = LLMJudgeEvaluator::new(model);
let result = eval.evaluate(
"Paris is the capital of France.", // prediction
"The capital of France is Paris.", // reference
"What is the capital of France?", // input
).await?;
println!("Score: {:.1}/10", result.score * 10.0);
// reasoning: "LLM judge score: 9.0/10"
Custom Prompt Template
You can customize the judge prompt. The template must contain {input}, {prediction}, and {reference} placeholders:
let eval = LLMJudgeEvaluator::with_prompt(
model,
r#"Evaluate whether the response is factually accurate.
Question: {input}
Expected: {reference}
Response: {prediction}
Rate accuracy from 0 (wrong) to 10 (perfect). Reply with a single number."#,
);
The default prompt asks the LLM to rate overall quality. The response is parsed for a number between 0 and 10; if no valid number is found, the evaluator returns a SynapticError::Parsing.
Summary
| Evaluator | Speed | Requires |
|---|---|---|
ExactMatchEvaluator | Instant | Nothing |
JsonValidityEvaluator | Instant | Nothing |
RegexMatchEvaluator | Instant | Nothing |
EmbeddingDistanceEvaluator | Fast | Embeddings model |
LLMJudgeEvaluator | Slow (LLM call) | Chat model |
Datasets
The Dataset type and evaluate() function provide a batch evaluation pipeline. You define a dataset of input-reference pairs, generate predictions, and score them all at once to produce an EvalReport.
Creating a Dataset
A Dataset is a collection of DatasetItem values, each with an input and a reference (expected answer):
use synaptic::eval::{Dataset, DatasetItem};
// From DatasetItem structs
let dataset = Dataset::new(vec![
DatasetItem {
input: "What is 2+2?".to_string(),
reference: "4".to_string(),
},
DatasetItem {
input: "Capital of France?".to_string(),
reference: "Paris".to_string(),
},
]);
// From string pairs (convenience method)
let dataset = Dataset::from_pairs(vec![
("What is 2+2?", "4"),
("Capital of France?", "Paris"),
]);
Running Batch Evaluation
The evaluate() function takes an evaluator, a dataset, and a slice of predictions. It evaluates each prediction against the corresponding dataset item and returns an EvalReport:
use synaptic::eval::{evaluate, Dataset, ExactMatchEvaluator};
let dataset = Dataset::from_pairs(vec![
("What is 2+2?", "4"),
("Capital of France?", "Paris"),
("Largest ocean?", "Pacific"),
]);
let evaluator = ExactMatchEvaluator::new();
// Your model's predictions (one per dataset item)
let predictions = vec![
"4".to_string(),
"Paris".to_string(),
"Atlantic".to_string(), // Wrong!
];
let report = evaluate(&evaluator, &dataset, &predictions).await?;
println!("Total: {}", report.total); // 3
println!("Passed: {}", report.passed); // 2
println!("Accuracy: {:.0}%", report.accuracy * 100.0); // 67%
The number of predictions must match the number of dataset items. If they differ, evaluate() returns a SynapticError::Validation.
EvalReport
The report contains aggregate statistics and per-item results:
pub struct EvalReport {
pub total: usize,
pub passed: usize,
pub accuracy: f32,
pub results: Vec<EvalResult>,
}
You can inspect individual results for detailed feedback:
for (i, result) in report.results.iter().enumerate() {
let status = if result.passed { "PASS" } else { "FAIL" };
let reason = result.reasoning.as_deref().unwrap_or("--");
println!("[{status}] Item {i}: score={:.2}, reason={reason}", result.score);
}
End-to-End Example
A typical evaluation workflow:
- Build a dataset of test cases.
- Run your model/chain on each input to produce predictions.
- Score predictions with an evaluator.
- Inspect the report.
use synaptic::eval::{evaluate, Dataset, ExactMatchEvaluator};
// 1. Dataset
let dataset = Dataset::from_pairs(vec![
("2+2", "4"),
("3*5", "15"),
("10/2", "5"),
]);
// 2. Generate predictions (in practice, run your model)
let predictions: Vec<String> = dataset.items.iter()
.map(|item| {
// Simulated model output
match item.input.as_str() {
"2+2" => "4",
"3*5" => "15",
"10/2" => "5",
_ => "unknown",
}.to_string()
})
.collect();
// 3. Evaluate
let evaluator = ExactMatchEvaluator::new();
let report = evaluate(&evaluator, &dataset, &predictions).await?;
// 4. Report
println!("Accuracy: {:.0}% ({}/{})",
report.accuracy * 100.0, report.passed, report.total);
Using Different Evaluators
The evaluate() function works with any Evaluator. Swap in a different evaluator to change the scoring criteria without modifying the dataset or prediction pipeline:
use synaptic::eval::{evaluate, RegexMatchEvaluator};
// Check that predictions contain a date
let evaluator = RegexMatchEvaluator::new(r"\d{4}-\d{2}-\d{2}")?;
let report = evaluate(&evaluator, &dataset, &predictions).await?;
Integrations
Synaptic provides optional integration crates that connect to external services. Each integration is gated behind a Cargo feature flag and adds no overhead when not enabled.
Available Integrations
| Integration | Feature | Purpose |
|---|---|---|
| OpenAI-Compatible Providers | openai | Groq, DeepSeek, Fireworks, Together, xAI, MistralAI, HuggingFace, Cohere, OpenRouter |
| Azure OpenAI | openai | Azure-hosted OpenAI models (chat + embeddings) |
| Anthropic | anthropic | Anthropic Claude models (chat + streaming + tool calling) |
| Google Gemini | gemini | Google Gemini models via Generative Language API |
| Ollama | ollama | Local LLM inference with Ollama (chat + embeddings) |
| AWS Bedrock | bedrock | AWS Bedrock foundation models (Claude, Llama, Mistral, etc.) |
| Cohere Reranker | cohere | Document reranking for improved retrieval quality |
| Qdrant | qdrant | Vector store backed by the Qdrant vector database |
| PgVector | pgvector | Vector store backed by PostgreSQL with the pgvector extension |
| Pinecone | pinecone | Managed vector store backed by Pinecone |
| Chroma | chroma | Open-source vector store backed by Chroma |
| MongoDB Atlas | mongodb | Vector search backed by MongoDB Atlas |
| Elasticsearch | elasticsearch | Vector store backed by Elasticsearch kNN |
| Redis | redis | Key-value store and LLM response cache backed by Redis |
| SQLite Cache | sqlite | Persistent LLM response cache backed by SQLite |
| PDF Loader | pdf | Document loader for PDF files |
| Tavily Search | tavily | Web search tool for agents |
Enabling integrations
Add the desired feature flags to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant", "redis"] }
You can combine any number of feature flags. Each integration pulls in only the dependencies it needs.
Trait compatibility
Every integration implements a core Synaptic trait, so it plugs directly into the existing framework:
- OpenAI-Compatible, Azure OpenAI, and Bedrock implement `ChatModel` -- use them anywhere a model is accepted.
- OpenAI-Compatible (MistralAI, HuggingFace, Cohere) and Azure OpenAI also implement `Embeddings`.
- Cohere Reranker implements `DocumentCompressor` -- use it with `ContextualCompressionRetriever` for two-stage retrieval.
- Qdrant, PgVector, Pinecone, Chroma, MongoDB Atlas, and Elasticsearch implement `VectorStore` -- use them with `VectorStoreRetriever` or any component that accepts `&dyn VectorStore`.
- Redis Store implements `Store` -- use it anywhere `InMemoryStore` is used, including agent `ToolRuntime` injection.
- Redis Cache and SQLite Cache implement `LlmCache` -- wrap any `ChatModel` with `CachedChatModel` for persistent response caching.
- PDF Loader implements `Loader` -- use it in RAG pipelines alongside `TextSplitter`, `Embeddings`, and `VectorStore`.
- Tavily Search implements `Tool` -- register it with an agent for web search capabilities.
Guides
LLM Providers
- OpenAI-Compatible Providers -- Groq, DeepSeek, Fireworks, Together, xAI, MistralAI, HuggingFace, Cohere, OpenRouter
- Azure OpenAI -- Azure-hosted OpenAI models
- Anthropic -- Anthropic Claude models
- Google Gemini -- Google Gemini models
- Ollama -- Local LLM inference (chat + embeddings)
- AWS Bedrock -- AWS Bedrock foundation models
Reranking
- Cohere Reranker -- document reranking for improved retrieval
Vector Stores
- Qdrant Vector Store -- store and search embeddings with Qdrant
- PgVector -- store and search embeddings with PostgreSQL + pgvector
- Pinecone Vector Store -- managed vector store with Pinecone
- Chroma Vector Store -- open-source embedding database
- MongoDB Atlas Vector Search -- vector search with MongoDB Atlas
- Elasticsearch Vector Store -- vector search with Elasticsearch kNN
Storage & Caching
- Redis Store & Cache -- persistent key-value storage and LLM caching with Redis
- SQLite Cache -- local LLM response caching with SQLite
Loaders & Tools
- PDF Loader -- load documents from PDF files
- Tavily Search Tool -- web search tool for agents
OpenAI-Compatible Providers
Many LLM providers expose an OpenAI-compatible API. Synaptic ships convenience constructors for nine popular providers so you can connect without building configuration by hand.
Setup
Add the openai feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai"] }
All OpenAI-compatible providers use the synaptic-openai crate under the hood, so only the openai feature is required.
Supported Providers
The synaptic::openai::compat module provides two functions per provider:
- `{provider}_config(api_key, model)` -- returns an `OpenAiConfig` pre-configured with the correct base URL.
- `{provider}_chat_model(api_key, model, backend)` -- returns a ready-to-use `OpenAiChatModel`.
Some providers also offer embeddings variants.
| Provider | Config function | Chat model function | Embeddings? |
|---|---|---|---|
| Groq | groq_config | groq_chat_model | No |
| DeepSeek | deepseek_config | deepseek_chat_model | No |
| Fireworks | fireworks_config | fireworks_chat_model | No |
| Together | together_config | together_chat_model | No |
| xAI | xai_config | xai_chat_model | No |
| MistralAI | mistral_config | mistral_chat_model | Yes |
| HuggingFace | huggingface_config | huggingface_chat_model | Yes |
| Cohere | cohere_config | cohere_chat_model | Yes |
| OpenRouter | openrouter_config | openrouter_chat_model | No |
Usage
Chat model
use std::sync::Arc;
use synaptic::openai::compat::{groq_chat_model, deepseek_chat_model};
use synaptic::models::HttpBackend;
use synaptic::core::{ChatModel, ChatRequest, Message};
let backend = Arc::new(HttpBackend::new());
// Groq
let model = groq_chat_model("gsk-...", "llama-3.3-70b-versatile", backend.clone());
let request = ChatRequest::new(vec![Message::human("Hello from Groq!")]);
let response = model.chat(&request).await?;
// DeepSeek
let model = deepseek_chat_model("sk-...", "deepseek-chat", backend.clone());
let response = model.chat(&request).await?;
Config-first approach
If you need to customize the config further before creating the model:
use std::sync::Arc;
use synaptic::openai::compat::fireworks_config;
use synaptic::openai::OpenAiChatModel;
use synaptic::models::HttpBackend;
let config = fireworks_config("fw-...", "accounts/fireworks/models/llama-v3p1-70b-instruct")
.with_temperature(0.7)
.with_max_tokens(2048);
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
Embeddings
Providers that support embeddings have {provider}_embeddings_config and {provider}_embeddings functions:
use std::sync::Arc;
use synaptic::openai::compat::{mistral_embeddings, cohere_embeddings, huggingface_embeddings};
use synaptic::models::HttpBackend;
use synaptic::core::Embeddings;
let backend = Arc::new(HttpBackend::new());
// MistralAI embeddings
let embeddings = mistral_embeddings("sk-...", "mistral-embed", backend.clone());
let vectors = embeddings.embed_documents(&["Hello world"]).await?;
// Cohere embeddings
let embeddings = cohere_embeddings("co-...", "embed-english-v3.0", backend.clone());
// HuggingFace embeddings
let embeddings = huggingface_embeddings("hf_...", "BAAI/bge-small-en-v1.5", backend.clone());
Unlisted providers
Any provider that exposes an OpenAI-compatible API can be used by setting a custom base URL on OpenAiConfig:
use std::sync::Arc;
use synaptic::openai::{OpenAiConfig, OpenAiChatModel};
use synaptic::models::HttpBackend;
let config = OpenAiConfig::new("your-api-key", "model-name")
.with_base_url("https://api.example.com/v1");
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
This works for any service that accepts the OpenAI chat completions request format at {base_url}/chat/completions.
Streaming
All OpenAI-compatible models support streaming. Use stream_chat() just like you would with the standard OpenAiChatModel:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![Message::human("Tell me a story")]);
let mut stream = model.stream_chat(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(text) = &chunk.content {
print!("{}", text);
}
}
Provider reference
| Provider | Base URL | Env variable (convention) |
|---|---|---|
| Groq | https://api.groq.com/openai/v1 | GROQ_API_KEY |
| DeepSeek | https://api.deepseek.com/v1 | DEEPSEEK_API_KEY |
| Fireworks | https://api.fireworks.ai/inference/v1 | FIREWORKS_API_KEY |
| Together | https://api.together.xyz/v1 | TOGETHER_API_KEY |
| xAI | https://api.x.ai/v1 | XAI_API_KEY |
| MistralAI | https://api.mistral.ai/v1 | MISTRAL_API_KEY |
| HuggingFace | https://api-inference.huggingface.co/v1 | HUGGINGFACE_API_KEY |
| Cohere | https://api.cohere.com/v1 | CO_API_KEY |
| OpenRouter | https://openrouter.ai/api/v1 | OPENROUTER_API_KEY |
Azure OpenAI
This guide shows how to use Azure OpenAI Service as a chat model and embeddings provider in Synaptic. Azure OpenAI uses deployment-based URLs and api-key header authentication instead of Bearer tokens.
Setup
Add the openai feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai"] }
Azure OpenAI support is included in the synaptic-openai crate, so no additional feature flag is needed.
Configuration
Create an AzureOpenAiConfig with your API key, resource name, and deployment name:
use std::sync::Arc;
use synaptic::openai::{AzureOpenAiConfig, AzureOpenAiChatModel};
use synaptic::models::HttpBackend;
let config = AzureOpenAiConfig::new(
"your-azure-api-key",
"my-resource", // Azure resource name
"gpt-4o-deployment", // Deployment name
);
let model = AzureOpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
The resulting endpoint URL is:
https://{resource_name}.openai.azure.com/openai/deployments/{deployment_name}/chat/completions?api-version={api_version}
API version
The default API version is "2024-10-21". You can override it:
let config = AzureOpenAiConfig::new("key", "resource", "deployment")
.with_api_version("2024-12-01-preview");
Model parameters
Configure temperature, max tokens, and other generation parameters:
let config = AzureOpenAiConfig::new("key", "resource", "deployment")
.with_temperature(0.7)
.with_max_tokens(4096);
Usage
AzureOpenAiChatModel implements the ChatModel trait, so it works everywhere a standard model does:
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("What is Azure OpenAI?"),
]);
let response = model.chat(&request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
use futures::StreamExt;
let mut stream = model.stream_chat(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(text) = &chunk.content {
print!("{}", text);
}
}
Tool calling
use synaptic::core::{ChatRequest, Message, ToolDefinition};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![Message::human("What's the weather in Seattle?")])
.with_tools(tools);
let response = model.chat(&request).await?;
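As with the other providers, you can then inspect any tool calls the model requested:
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
    println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}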
Embeddings
Use AzureOpenAiEmbeddings for text embedding with Azure-hosted models:
use std::sync::Arc;
use synaptic::openai::{AzureOpenAiEmbeddingsConfig, AzureOpenAiEmbeddings};
use synaptic::models::HttpBackend;
use synaptic::core::Embeddings;
let config = AzureOpenAiEmbeddingsConfig::new(
"your-azure-api-key",
"my-resource",
"text-embedding-ada-002-deployment",
);
let embeddings = AzureOpenAiEmbeddings::new(config, Arc::new(HttpBackend::new()));
let vectors = embeddings.embed_documents(&["Hello world", "Rust is fast"]).await?;
Environment variables
A common pattern is to read credentials from the environment:
let config = AzureOpenAiConfig::new(
std::env::var("AZURE_OPENAI_API_KEY").unwrap(),
std::env::var("AZURE_OPENAI_RESOURCE").unwrap(),
std::env::var("AZURE_OPENAI_DEPLOYMENT").unwrap(),
);
Configuration reference
AzureOpenAiConfig
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Azure OpenAI API key |
resource_name | String | required | Azure resource name |
deployment_name | String | required | Model deployment name |
api_version | String | "2024-10-21" | Azure API version |
temperature | Option<f32> | None | Sampling temperature |
max_tokens | Option<u32> | None | Maximum tokens to generate |
AzureOpenAiEmbeddingsConfig
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Azure OpenAI API key |
resource_name | String | required | Azure resource name |
deployment_name | String | required | Embeddings deployment name |
api_version | String | "2024-10-21" | Azure API version |
Anthropic
This guide shows how to use the Anthropic Messages API as a chat model provider in Synaptic. AnthropicChatModel wraps the Anthropic REST API and supports streaming, tool calling, and all standard ChatModel operations.
Setup
Add the anthropic feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["anthropic"] }
API key
Set your Anthropic API key as an environment variable:
export ANTHROPIC_API_KEY="sk-ant-..."
The key is passed to AnthropicConfig at construction time. Requests are authenticated with the x-api-key header (not a Bearer token).
Configuration
Create an AnthropicConfig with your API key and model name:
use synaptic::anthropic::{AnthropicConfig, AnthropicChatModel};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = AnthropicConfig::new("sk-ant-...", "claude-sonnet-4-20250514");
let model = AnthropicChatModel::new(config, Arc::new(HttpBackend::new()));
Custom base URL
To use a proxy or alternative endpoint:
let config = AnthropicConfig::new(api_key, "claude-sonnet-4-20250514")
.with_base_url("https://my-proxy.example.com");
Model parameters
let config = AnthropicConfig::new(api_key, "claude-sonnet-4-20250514")
.with_max_tokens(4096)
.with_top_p(0.9)
.with_stop(vec!["END".to_string()]);
Usage
AnthropicChatModel implements the ChatModel trait:
use synaptic::anthropic::{AnthropicConfig, AnthropicChatModel};
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = AnthropicConfig::new(
std::env::var("ANTHROPIC_API_KEY").unwrap(),
"claude-sonnet-4-20250514",
);
let model = AnthropicChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain Rust's ownership model in one sentence."),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
AnthropicChatModel supports native SSE streaming via the stream_chat method:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::human("Write a short poem about Rust."),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if !chunk.content.is_empty() {
print!("{}", chunk.content);
}
}
Tool calling
Anthropic models support tool calling through tool_use and tool_result content blocks. Synaptic maps ToolDefinition and ToolChoice to the Anthropic format automatically.
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition, ToolChoice};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather for a city".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(tools)
.with_tool_choice(ToolChoice::Auto);
let response = model.chat(request).await?;
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
ToolChoice variants map to Anthropic's tool_choice as follows:
| Synaptic | Anthropic |
|---|---|
Auto | {"type": "auto"} |
Required | {"type": "any"} |
None | {"type": "none"} |
Specific(name) | {"type": "tool", "name": "..."} |
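For example, to force the model to call a particular tool by name, pass the Specific variant (the exact argument construction shown here is illustrative):
// Re-using the `tools` definition from above
let request = ChatRequest::new(vec![Message::human("What is the weather in Tokyo?")])
    .with_tools(tools)
    .with_tool_choice(ToolChoice::Specific("get_weather".into()));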
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Anthropic API key |
model | String | required | Model name (e.g. claude-sonnet-4-20250514) |
base_url | String | "https://api.anthropic.com" | API base URL |
max_tokens | u32 | 1024 | Maximum tokens to generate |
top_p | Option<f64> | None | Nucleus sampling parameter |
stop | Option<Vec<String>> | None | Stop sequences |
Google Gemini
This guide shows how to use the Google Generative Language API as a chat model provider in Synaptic. GeminiChatModel wraps Google's Generative Language REST API and supports streaming, tool calling, and all standard ChatModel operations.
Setup
Add the gemini feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["gemini"] }
API key
Set your Google API key as an environment variable:
export GOOGLE_API_KEY="AIza..."
The key is passed to GeminiConfig at construction time. Unlike other providers, the API key is sent as a query parameter (?key=...) rather than in a request header.
Configuration
Create a GeminiConfig with your API key and model name:
use synaptic::gemini::{GeminiConfig, GeminiChatModel};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = GeminiConfig::new("AIza...", "gemini-2.0-flash");
let model = GeminiChatModel::new(config, Arc::new(HttpBackend::new()));
Custom base URL
To use a proxy or alternative endpoint:
let config = GeminiConfig::new(api_key, "gemini-2.0-flash")
.with_base_url("https://my-proxy.example.com");
Model parameters
let config = GeminiConfig::new(api_key, "gemini-2.0-flash")
.with_top_p(0.9)
.with_stop(vec!["END".to_string()]);
Usage
GeminiChatModel implements the ChatModel trait:
use synaptic::gemini::{GeminiConfig, GeminiChatModel};
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = GeminiConfig::new(
std::env::var("GOOGLE_API_KEY").unwrap(),
"gemini-2.0-flash",
);
let model = GeminiChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain Rust's ownership model in one sentence."),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
GeminiChatModel supports native SSE streaming via the stream_chat method. The streaming endpoint uses streamGenerateContent?alt=sse:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::human("Write a short poem about Rust."),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if !chunk.content.is_empty() {
print!("{}", chunk.content);
}
}
Tool calling
Gemini models support tool calling through functionCall and functionResponse parts (camelCase format). Synaptic maps ToolDefinition and ToolChoice to the Gemini format automatically.
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition, ToolChoice};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather for a city".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(tools)
.with_tool_choice(ToolChoice::Auto);
let response = model.chat(request).await?;
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
ToolChoice variants map to Gemini's functionCallingConfig as follows:
| Synaptic | Gemini |
|---|---|
Auto | {"mode": "AUTO"} |
Required | {"mode": "ANY"} |
None | {"mode": "NONE"} |
Specific(name) | {"mode": "ANY", "allowedFunctionNames": ["..."]} |
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Google API key |
model | String | required | Model name (e.g. gemini-2.0-flash) |
base_url | String | "https://generativelanguage.googleapis.com" | API base URL |
top_p | Option<f64> | None | Nucleus sampling parameter |
stop | Option<Vec<String>> | None | Stop sequences |
Ollama
This guide shows how to use Ollama as a local chat model and embeddings provider in Synaptic. OllamaChatModel wraps the Ollama REST API and supports streaming, tool calling, and all standard ChatModel operations. Because Ollama runs locally, no API key is needed.
Setup
Add the ollama feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["ollama"] }
Installing Ollama
Install Ollama from ollama.com and pull a model before using the provider:
# Install Ollama (macOS)
brew install ollama
# Start the Ollama server
ollama serve
# Pull a model
ollama pull llama3.1
The default endpoint is http://localhost:11434. Make sure the Ollama server is running before sending requests.
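A quick way to verify the server is reachable is to list the locally installed models:
curl http://localhost:11434/api/tags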
Configuration
Create an OllamaConfig with a model name. No API key is required:
use synaptic::ollama::{OllamaConfig, OllamaChatModel};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = OllamaConfig::new("llama3.1");
let model = OllamaChatModel::new(config, Arc::new(HttpBackend::new()));
Custom base URL
To connect to a remote Ollama instance or a non-default port:
let config = OllamaConfig::new("llama3.1")
.with_base_url("http://192.168.1.100:11434");
Model parameters
let config = OllamaConfig::new("llama3.1")
.with_top_p(0.9)
.with_stop(vec!["END".to_string()])
.with_seed(42);
Usage
OllamaChatModel implements the ChatModel trait:
use synaptic::ollama::{OllamaConfig, OllamaChatModel};
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = OllamaConfig::new("llama3.1");
let model = OllamaChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain Rust's ownership model in one sentence."),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
OllamaChatModel supports native streaming via the stream_chat method. Unlike cloud providers that use SSE, Ollama uses NDJSON (newline-delimited JSON) where each line is a complete JSON object:
use futures::StreamExt;
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::human("Write a short poem about Rust."),
]);
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if !chunk.content.is_empty() {
print!("{}", chunk.content);
}
}
Tool calling
Ollama models that support function calling (such as llama3.1) can use tool calling through the tool_calls array format. Synaptic maps ToolDefinition and ToolChoice to the Ollama format automatically.
use synaptic::core::{ChatModel, ChatRequest, Message, ToolDefinition, ToolChoice};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather for a city".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![
Message::human("What is the weather in Tokyo?"),
])
.with_tools(tools)
.with_tool_choice(ToolChoice::Auto);
let response = model.chat(request).await?;
// Check if the model requested a tool call
for tc in response.message.tool_calls() {
println!("Tool: {}, Args: {}", tc.name, tc.arguments);
}
ToolChoice variants map to Ollama's tool_choice as follows:
| Synaptic | Ollama |
|---|---|
Auto | "auto" |
Required | "required" |
None | "none" |
Specific(name) | {"type": "function", "function": {"name": "..."}} |
Reproducibility with seed
Ollama supports a seed parameter for reproducible generation. When set, the model will produce deterministic output for the same input:
let config = OllamaConfig::new("llama3.1")
.with_seed(42);
let model = OllamaChatModel::new(config, Arc::new(HttpBackend::new()));
let request = ChatRequest::new(vec![
Message::human("Pick a random number between 1 and 100."),
]);
// Same seed + same input = same output
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
Embeddings
OllamaEmbeddings provides local embedding generation through Ollama's /api/embed endpoint. Pull an embedding model first:
ollama pull nomic-embed-text
Configuration
use synaptic::ollama::{OllamaEmbeddingsConfig, OllamaEmbeddings};
use synaptic::models::HttpBackend;
use std::sync::Arc;
let config = OllamaEmbeddingsConfig::new("nomic-embed-text");
let embeddings = OllamaEmbeddings::new(config, Arc::new(HttpBackend::new()));
To connect to a remote instance:
let config = OllamaEmbeddingsConfig::new("nomic-embed-text")
.with_base_url("http://192.168.1.100:11434");
Usage
OllamaEmbeddings implements the Embeddings trait:
use synaptic::core::Embeddings;
// Embed a single query
let vector = embeddings.embed_query("What is Rust?").await?;
println!("Dimension: {}", vector.len());
// Embed multiple documents
let vectors = embeddings.embed_documents(&["First doc", "Second doc"]).await?;
println!("Embedded {} documents", vectors.len());
Configuration reference
OllamaConfig
| Field | Type | Default | Description |
|---|---|---|---|
model | String | required | Model name (e.g. llama3.1) |
base_url | String | "http://localhost:11434" | Ollama server URL |
top_p | Option<f64> | None | Nucleus sampling parameter |
stop | Option<Vec<String>> | None | Stop sequences |
seed | Option<u64> | None | Seed for reproducible generation |
OllamaEmbeddingsConfig
| Field | Type | Default | Description |
|---|---|---|---|
model | String | required | Embedding model name (e.g. nomic-embed-text) |
base_url | String | "http://localhost:11434" | Ollama server URL |
AWS Bedrock
This guide shows how to use AWS Bedrock as a chat model provider in Synaptic. Bedrock provides access to foundation models from Amazon, Anthropic, Meta, Mistral, and others through the AWS SDK.
Setup
Add the bedrock feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["bedrock"] }
AWS credentials
BedrockChatModel uses the AWS SDK for Rust, which reads credentials from the standard AWS credential chain:
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
- Shared credentials file (~/.aws/credentials)
- IAM role (when running on EC2, ECS, Lambda, etc.)
Ensure your IAM principal has bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream permissions.
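A minimal IAM policy sketch granting these two actions looks like the following (in production, scope Resource down to the specific model ARNs you use):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}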
Configuration
Create a BedrockConfig with the model ID:
use synaptic::bedrock::{BedrockConfig, BedrockChatModel};
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0");
let model = BedrockChatModel::new(config).await;
Note: The constructor is async because it initializes the AWS SDK client, which loads credentials and resolves the region from the environment.
Region
By default, the region is resolved from the AWS SDK default chain (environment variable AWS_REGION, config file, etc.). You can override it:
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0")
.with_region("us-west-2");
Model parameters
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0")
.with_temperature(0.7)
.with_max_tokens(4096);
Usage
BedrockChatModel implements the ChatModel trait:
use synaptic::core::{ChatModel, ChatRequest, Message};
let request = ChatRequest::new(vec![
Message::system("You are a helpful assistant."),
Message::human("Explain AWS Bedrock in one sentence."),
]);
let response = model.chat(&request).await?;
println!("{}", response.message.content().unwrap_or_default());
Streaming
use futures::StreamExt;
let mut stream = model.stream_chat(&request).await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(text) = &chunk.content {
print!("{}", text);
}
}
Tool calling
Bedrock supports tool calling for models that expose it (e.g. Anthropic Claude models):
use synaptic::core::{ChatRequest, Message, ToolDefinition};
let tools = vec![ToolDefinition {
name: "get_weather".into(),
description: "Get the current weather".into(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}),
}];
let request = ChatRequest::new(vec![Message::human("Weather in Tokyo?")])
.with_tools(tools);
let response = model.chat(&request).await?;
Using an existing AWS client
If you already have a configured aws_sdk_bedrockruntime::Client, pass it directly with from_client:
use synaptic::bedrock::{BedrockConfig, BedrockChatModel};
let aws_config = aws_config::from_env().region("eu-west-1").load().await;
let client = aws_sdk_bedrockruntime::Client::new(&aws_config);
let config = BedrockConfig::new("anthropic.claude-3-5-sonnet-20241022-v2:0");
let model = BedrockChatModel::from_client(config, client);
Note: Unlike the standard constructor, from_client is not async because it skips AWS SDK initialization.
Architecture note
BedrockChatModel does not use the ProviderBackend abstraction (HttpBackend/FakeBackend). It calls the AWS SDK directly via the Bedrock Runtime converse and converse_stream APIs. This means you cannot inject a FakeBackend for testing. Instead, use ScriptedChatModel as a test double:
use synaptic::models::ScriptedChatModel;
use synaptic::core::Message;
let model = ScriptedChatModel::new(vec![
Message::ai("Mocked Bedrock response"),
]);
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
model_id | String | required | Bedrock model ID (e.g. anthropic.claude-3-5-sonnet-20241022-v2:0) |
region | Option<String> | None (auto-detect) | AWS region override |
temperature | Option<f32> | None | Sampling temperature |
max_tokens | Option<u32> | None | Maximum tokens to generate |
Cohere Reranker
This guide shows how to use the Cohere Reranker in Synaptic. The reranker re-scores a list of documents by relevance to a query, improving retrieval quality when used as a second-stage filter.
Note: For Cohere chat models and embeddings, use the OpenAI-compatible constructors (cohere_chat_model, cohere_embeddings) instead. This page covers the Reranker only.
Setup
Add the cohere feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["cohere"] }
Set your Cohere API key:
export CO_API_KEY="your-cohere-api-key"
Configuration
Create a CohereRerankerConfig and build the reranker:
use synaptic::cohere::{CohereRerankerConfig, CohereReranker};
let config = CohereRerankerConfig::new("your-cohere-api-key");
let reranker = CohereReranker::new(config);
Custom model
The default model is "rerank-v3.5". You can specify a different one:
let config = CohereRerankerConfig::new("your-cohere-api-key")
.with_model("rerank-english-v3.0");
Usage
Reranking documents
Pass a query, a list of documents, and the number of top results to return:
use synaptic::core::Document;
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is popular for data science"),
Document::new("3", "Rust ensures memory safety without a garbage collector"),
Document::new("4", "JavaScript runs in the browser"),
];
let top_docs = reranker.rerank("memory safe language", &docs, 2).await?;
for doc in &top_docs {
println!("{}: {}", doc.id, doc.content);
}
// Likely returns docs 3 and 1, re-ordered by relevance
The returned documents are sorted by descending relevance score, and only the top_n highest-ranked documents are returned.
With ContextualCompressionRetriever
When the retrieval feature is also enabled, CohereReranker implements the DocumentCompressor trait. This allows it to plug into a ContextualCompressionRetriever for automatic reranking:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "cohere", "retrieval", "vectorstores", "embeddings"] }
use std::sync::Arc;
use synaptic::cohere::{CohereRerankerConfig, CohereReranker};
use synaptic::retrieval::ContextualCompressionRetriever;
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever};
use synaptic::openai::OpenAiEmbeddings;
// Set up a base retriever
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(InMemoryVectorStore::new());
// ... add documents to the store ...
let base_retriever = Arc::new(VectorStoreRetriever::new(store, embeddings, 20));
// Wrap with reranker for two-stage retrieval
let reranker = Arc::new(CohereReranker::new(
CohereRerankerConfig::new("your-cohere-api-key"),
));
let retriever = ContextualCompressionRetriever::new(base_retriever, reranker);
// Retrieves 20 candidates, then reranks and returns the top 5
use synaptic::core::Retriever;
let results = retriever.retrieve("memory safety in Rust", 5).await?;
This two-stage pattern (broad retrieval followed by reranking) often produces better results than relying on embedding similarity alone.
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Cohere API key |
model | String | "rerank-v3.5" | Reranker model name |
Qdrant Vector Store
This guide shows how to use Qdrant as a vector store backend in Synaptic. Qdrant is a high-performance vector database purpose-built for similarity search.
Setup
Add the qdrant feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant"] }
Start a Qdrant instance (e.g. via Docker):
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
Port 6333 is the REST API; port 6334 is the gRPC endpoint used by the Rust client.
Configuration
Create a QdrantConfig with the connection URL, collection name, and vector dimensionality:
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config)?;
API key authentication
For Qdrant Cloud or secured deployments, attach an API key:
let config = QdrantConfig::new("https://my-cluster.cloud.qdrant.io:6334", "docs", 1536)
.with_api_key("your-api-key-here");
let store = QdrantVectorStore::new(config)?;
Distance metric
The default distance metric is cosine similarity. You can change it with with_distance():
use qdrant_client::qdrant::Distance;
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536)
.with_distance(Distance::Euclid);
Available options: Distance::Cosine (default), Distance::Euclid, Distance::Dot, Distance::Manhattan.
Creating the collection
Call ensure_collection() to create the collection if it does not already exist. This is idempotent and safe to call on every startup:
store.ensure_collection().await?;
The collection is created with the vector size and distance metric from your config.
Adding documents
QdrantVectorStore implements the VectorStore trait. Pass an embeddings provider to compute vectors:
use synaptic::qdrant::VectorStore;
use synaptic::retrieval::Document;
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Document IDs are mapped to Qdrant point UUIDs. If a document ID is already a valid UUID, it is used directly. Otherwise, a deterministic UUID v5 is generated from the ID string.
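Purely for illustration, this is what a deterministic UUID v5 derivation looks like with the uuid crate (the namespace Synaptic uses internally may differ):
use uuid::Uuid;
// The same input ID always produces the same point UUID
let point_id = Uuid::new_v5(&Uuid::NAMESPACE_OID, b"doc-42");
println!("{point_id}");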
Similarity search
Find the k most similar documents to a text query:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
Get similarity scores alongside results:
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Search by vector
Search using a pre-computed embedding vector:
use synaptic::embeddings::Embeddings;
let query_vec = embeddings.embed_query("systems programming").await?;
let results = store.similarity_search_by_vector(&query_vec, 3).await?;
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever to use it with the rest of Synaptic's retrieval infrastructure:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::retrieval::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Using an existing client
If you already have a configured qdrant_client::Qdrant instance, you can pass it directly:
use qdrant_client::Qdrant;
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
let client = Qdrant::from_url("http://localhost:6334").build()?;
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::from_client(client, config);
RAG Pipeline Example
A complete RAG pipeline: load documents, split them into chunks, embed and store in Qdrant, then retrieve relevant context and generate an answer.
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 1. Load and split
let loader = TextLoader::new("docs/knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
// 2. Store in Qdrant
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config)?;
store.ensure_collection().await?;
store.add_documents(chunks, embeddings.as_ref()).await?;
// 3. Retrieve and answer
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings.clone(), 5);
let relevant = retriever.retrieve("What is Synaptic?", 5).await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let model = OpenAiChatModel::new(/* config */);
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(&request).await?;
Using with an Agent
Wrap the retriever as a tool so a ReAct agent can decide when to search the vector store during multi-step reasoning:
use synaptic::graph::create_react_agent;
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use std::sync::Arc;
// Build the retriever (as shown above)
let config = QdrantConfig::new("http://localhost:6334", "knowledge", 1536);
let store = Arc::new(QdrantVectorStore::new(config)?);
store.ensure_collection().await?;
let embeddings = Arc::new(OpenAiEmbeddings::new(/* config */));
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
// Register the retriever as a tool and create a ReAct agent
// that can autonomously decide when to search
let model = OpenAiChatModel::new(/* config */);
let agent = create_react_agent(model, vec![/* retriever tool */]).compile();
The agent will invoke the retriever tool whenever it determines that external knowledge is needed to answer the user's question.
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
url | String | required | Qdrant gRPC URL (e.g. http://localhost:6334) |
collection_name | String | required | Name of the Qdrant collection |
vector_size | u64 | required | Dimensionality of the embedding vectors |
api_key | Option<String> | None | API key for authenticated access |
distance | Distance | Cosine | Distance metric for similarity search |
PgVector
This guide shows how to use PostgreSQL with the pgvector extension as a vector store backend in Synaptic. This is a good choice when you already run PostgreSQL and want to keep embeddings alongside your relational data.
Prerequisites
Your PostgreSQL instance must have the pgvector extension installed. Once installed, enable it in your target database with:
CREATE EXTENSION IF NOT EXISTS vector;
Refer to the pgvector installation guide for platform-specific instructions.
Setup
Add the pgvector feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "pgvector"] }
sqlx = { version = "0.8", features = ["runtime-tokio", "postgres"] }
The sqlx dependency is needed to create the connection pool. Synaptic uses sqlx::PgPool for all database operations.
Creating a store
Connect to PostgreSQL and create the store:
use sqlx::postgres::PgPoolOptions;
use synaptic::pgvector::{PgVectorConfig, PgVectorStore};
let pool = PgPoolOptions::new()
.max_connections(5)
.connect("postgres://user:pass@localhost/mydb")
.await?;
let config = PgVectorConfig::new("documents", 1536);
let store = PgVectorStore::new(pool, config);
The first argument to PgVectorConfig::new is the table name; the second is the embedding vector dimensionality (e.g. 1536 for OpenAI text-embedding-3-small).
Initializing the table
Call initialize() once to create the pgvector extension and the backing table. This is idempotent and safe to run on every application startup:
store.initialize().await?;
This creates a table with the following schema:
CREATE TABLE IF NOT EXISTS documents (
id TEXT PRIMARY KEY,
content TEXT NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}',
embedding vector(1536)
);
The vector(N) column type is provided by the pgvector extension, where N matches the vector_dimensions in your config.
Adding documents
PgVectorStore implements the VectorStore trait. Pass an embeddings provider to compute vectors:
use synaptic::pgvector::VectorStore;
use synaptic::retrieval::Document;
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Documents with empty IDs are assigned a random UUID. Existing documents with the same ID are upserted (content, metadata, and embedding are updated).
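For example, re-adding a document under an existing ID replaces its stored content and embedding rather than creating a duplicate:
// Document "1" already exists, so this updates it in place
let updated = vec![Document::new("1", "Rust is a memory-safe systems language")];
store.add_documents(updated, &embeddings).await?;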
Similarity search
Find the k most similar documents using cosine distance (<=>):
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
Get cosine similarity scores (higher is more similar):
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Scores are computed as 1 - cosine_distance, so a score of 1.0 means identical vectors.
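Because scores are normalized this way, you can drop weak matches with a simple threshold (0.8 here is an arbitrary illustrative cutoff):
let strong_matches: Vec<_> = scored
    .into_iter()
    .filter(|(_, score)| *score >= 0.8)
    .collect();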
Search by vector
Search using a pre-computed embedding vector:
use synaptic::embeddings::Embeddings;
let query_vec = embeddings.embed_query("systems programming").await?;
let results = store.similarity_search_by_vector(&query_vec, 3).await?;
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever for use with Synaptic's retrieval infrastructure:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::openai::OpenAiEmbeddings;
use synaptic::retrieval::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Schema-qualified table names
You can use schema-qualified names (e.g. public.documents) for the table:
let config = PgVectorConfig::new("myschema.embeddings", 1536);
Table names are validated to contain only alphanumeric characters, underscores, and dots, preventing SQL injection.
Common patterns
RAG pipeline with PgVector
use synaptic::pgvector::{PgVectorConfig, PgVectorStore, VectorStore};
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::retrieval::{Document, Retriever};
use synaptic::core::{ChatModel, ChatRequest, Message};
use sqlx::postgres::PgPoolOptions;
use std::sync::Arc;
// Set up the store
let pool = PgPoolOptions::new()
.max_connections(5)
.connect("postgres://user:pass@localhost/mydb")
.await?;
let config = PgVectorConfig::new("knowledge_base", 1536);
let store = PgVectorStore::new(pool, config);
store.initialize().await?;
// Add documents
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let docs = vec![
Document::new("doc1", "Synaptic is a Rust agent framework"),
Document::new("doc2", "It supports RAG with vector stores"),
];
store.add_documents(docs, embeddings.as_ref()).await?;
// Retrieve and generate
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 3);
let context_docs = retriever.retrieve("What is Synaptic?", 3).await?;
let context = context_docs.iter()
.map(|d| d.content.as_str())
.collect::<Vec<_>>()
.join("\n");
let model = OpenAiChatModel::new("gpt-4o-mini");
let request = ChatRequest::new(vec![
Message::system(format!("Answer using this context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(request).await?;
Index Strategies
pgvector supports two index types for accelerating approximate nearest-neighbor search. Choosing the right one depends on your dataset size and performance requirements.
HNSW (Hierarchical Navigable Small World) -- recommended for most use cases. It provides better recall, faster queries at search time, and does not require a separate training step. The trade-off is higher memory usage and slower index build time.
IVFFlat (Inverted File with Flat compression) -- a good option for very large datasets where memory is a concern. It partitions vectors into lists and searches only a subset at query time. You must build the index after the table already contains data (it needs representative vectors for training).
-- HNSW index (recommended for most use cases)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- IVFFlat index (better for very large datasets)
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
| Property | HNSW | IVFFlat |
|---|---|---|
| Recall | Higher | Lower |
| Query speed | Faster | Slower (depends on probes) |
| Memory usage | Higher | Lower |
| Build speed | Slower | Faster |
| Training required | No | Yes (needs existing data) |
Tip: For tables with fewer than 100k rows, the default sequential scan is often fast enough. Add an index when query latency becomes a concern.
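Both index types also expose query-time tuning knobs. These are standard pgvector settings (defaults noted in the comments are from pgvector's documentation); tune them to your recall/latency needs:
-- HNSW: higher ef_search improves recall at the cost of latency (default 40)
SET hnsw.ef_search = 100;
-- IVFFlat: more probes improves recall at the cost of latency (default 1)
SET ivfflat.probes = 10;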
Reusing an Existing Connection Pool
If your application already maintains a sqlx::PgPool (e.g. for your main relational data), you can pass it directly to PgVectorStore instead of creating a new pool:
use sqlx::PgPool;
use synaptic::pgvector::{PgVectorConfig, PgVectorStore};
// Reuse the pool from your application state
let pool: PgPool = app_state.db_pool.clone();
let config = PgVectorConfig::new("app_embeddings", 1536);
let store = PgVectorStore::new(pool, config);
store.initialize().await?;
This avoids opening duplicate connections and lets your vector operations share the same transaction boundaries and connection limits as the rest of your application.
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
table_name | String | required | PostgreSQL table name (supports schema-qualified names) |
vector_dimensions | u32 | required | Dimensionality of the embedding vectors |
Pinecone Vector Store
This guide shows how to use Pinecone as a vector store backend in Synaptic. Pinecone is a managed vector database built for real-time similarity search at scale.
Setup
Add the pinecone feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "pinecone"] }
Set your Pinecone API key:
export PINECONE_API_KEY="your-pinecone-api-key"
You also need an existing Pinecone index. Create one through the Pinecone console or the Pinecone API. Note the index host URL (e.g. https://my-index-abc123.svc.aped-1234.pinecone.io).
Configuration
Create a PineconeConfig with your API key and index host URL:
use synaptic::pinecone::{PineconeConfig, PineconeVectorStore};
let config = PineconeConfig::new("your-pinecone-api-key", "https://my-index-abc123.svc.aped-1234.pinecone.io");
let store = PineconeVectorStore::new(config);
Namespace
Pinecone supports namespaces for partitioning data within an index:
let config = PineconeConfig::new("api-key", "https://my-index.pinecone.io")
.with_namespace("production");
If no namespace is set, the default namespace is used.
Adding documents
PineconeVectorStore implements the VectorStore trait. Pass an embeddings provider to compute vectors:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents to a text query:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever for use with Synaptic's retrieval infrastructure:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Namespace Isolation
Namespaces are a common pattern for building multi-tenant RAG applications with Pinecone. Each tenant's data lives in a separate namespace within the same index, providing logical isolation without the overhead of managing multiple indexes.
use synaptic::pinecone::{PineconeConfig, PineconeVectorStore};
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let api_key = std::env::var("PINECONE_API_KEY")?;
let index_host = "https://my-index-abc123.svc.aped-1234.pinecone.io";
// Create stores with different namespaces for tenant isolation
let config_a = PineconeConfig::new(&api_key, index_host)
.with_namespace("tenant-a");
let config_b = PineconeConfig::new(&api_key, index_host)
.with_namespace("tenant-b");
let store_a = PineconeVectorStore::new(config_a);
let store_b = PineconeVectorStore::new(config_b);
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
// Tenant A's documents are invisible to Tenant B
let docs_a = vec![Document::new("a1", "Tenant A internal report")];
store_a.add_documents(docs_a, &embeddings).await?;
// Searching in Tenant B's namespace returns no results from Tenant A
let results = store_b.similarity_search("internal report", 5, &embeddings).await?;
assert!(results.is_empty());
This approach scales well because Pinecone handles namespace-level partitioning internally. You can add, search, and delete documents in one namespace without affecting others.
RAG Pipeline Example
A complete RAG pipeline: load documents, split them into chunks, embed and store in Pinecone, then retrieve relevant context and generate an answer.
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings, VectorStore, Retriever};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::pinecone::{PineconeConfig, PineconeVectorStore};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 1. Load and split
let loader = TextLoader::new("docs/knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
// 2. Store in Pinecone
let config = PineconeConfig::new(
std::env::var("PINECONE_API_KEY")?,
"https://my-index-abc123.svc.aped-1234.pinecone.io",
);
let store = PineconeVectorStore::new(config);
store.add_documents(chunks, embeddings.as_ref()).await?;
// 3. Retrieve and answer
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings.clone(), 5);
let relevant = retriever.retrieve("What is Synaptic?", 5).await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let model = OpenAiChatModel::new(/* config */);
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(&request).await?;
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Pinecone API key |
host | String | required | Index host URL from the Pinecone console |
namespace | Option<String> | None | Namespace for data partitioning |
Chroma Vector Store
This guide shows how to use Chroma as a vector store backend in Synaptic. Chroma is an open-source embedding database that runs locally or in the cloud.
Setup
Add the chroma feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "chroma"] }
Start a Chroma server (e.g. via Docker):
docker run -p 8000:8000 chromadb/chroma
Configuration
Create a ChromaConfig with the server URL and collection name:
use synaptic::chroma::{ChromaConfig, ChromaVectorStore};
let config = ChromaConfig::new("http://localhost:8000", "my_collection");
let store = ChromaVectorStore::new(config);
The default URL is http://localhost:8000.
Creating the collection
Call ensure_collection() to create the collection if it does not already exist. This is idempotent and safe to call on every startup:
store.ensure_collection().await?;
Authentication
If your Chroma server requires authentication, pass credentials:
let config = ChromaConfig::new("https://chroma.example.com", "my_collection")
.with_auth_token("your-token");
Adding documents
ChromaVectorStore implements the VectorStore trait:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Docker Deployment
Chroma is easy to deploy with Docker for both development and production environments.
Quick start -- run a Chroma server with default settings:
# Start Chroma on port 8000
docker run -p 8000:8000 chromadb/chroma:latest
With persistent storage -- mount a volume so data survives container restarts:
docker run -p 8000:8000 -v ./chroma-data:/chroma/chroma chromadb/chroma:latest
Docker Compose -- for production deployments, use a docker-compose.yml:
version: "3.8"
services:
chroma:
image: chromadb/chroma:latest
ports:
- "8000:8000"
volumes:
- chroma-data:/chroma/chroma
restart: unless-stopped
volumes:
chroma-data:
Then connect from Synaptic:
use synaptic::chroma::{ChromaConfig, ChromaVectorStore};
let config = ChromaConfig::new("http://localhost:8000", "my_collection");
let store = ChromaVectorStore::new(config);
store.ensure_collection().await?;
For remote or authenticated deployments, use with_auth_token():
let config = ChromaConfig::new("https://chroma.example.com", "my_collection")
.with_auth_token("your-token");
RAG Pipeline Example
A complete RAG pipeline: load documents, split them into chunks, embed and store in Chroma, then retrieve relevant context and generate an answer.
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings, VectorStore, Retriever};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::chroma::{ChromaConfig, ChromaVectorStore};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 1. Load and split
let loader = TextLoader::new("docs/knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
// 2. Store in Chroma
let config = ChromaConfig::new("http://localhost:8000", "my_collection");
let store = ChromaVectorStore::new(config);
store.ensure_collection().await?;
store.add_documents(chunks, embeddings.as_ref()).await?;
// 3. Retrieve and answer
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings.clone(), 5);
let relevant = retriever.retrieve("What is Synaptic?", 5).await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let model = OpenAiChatModel::new(/* config */);
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on context:\n{context}")),
Message::human("What is Synaptic?"),
]);
let response = model.chat(&request).await?;
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
url | String | "http://localhost:8000" | Chroma server URL |
collection_name | String | required | Name of the collection |
auth_token | Option<String> | None | Authentication token |
MongoDB Atlas Vector Search
This guide shows how to use MongoDB Atlas Vector Search as a vector store backend in Synaptic. Atlas Vector Search enables semantic similarity search on data stored in MongoDB.
Setup
Add the mongodb feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "mongodb"] }
Prerequisites
- A MongoDB Atlas cluster (M10 or higher, or a free shared cluster with Atlas Search enabled).
- A vector search index configured on the target collection. Create one via the Atlas UI or the Atlas Admin API.
Example index definition (JSON):
{
"type": "vectorSearch",
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
}
]
}
Configuration
Create a MongoVectorConfig with the database name, collection name, index name, and vector dimensionality:
use synaptic::mongodb::{MongoVectorConfig, MongoVectorStore};
let config = MongoVectorConfig::new("my_database", "my_collection", "vector_index", 1536);
let store = MongoVectorStore::from_uri("mongodb+srv://user:pass@cluster.mongodb.net/", config).await?;
The from_uri constructor connects to MongoDB and is async.
Embedding field name
By default, vectors are stored in a field called "embedding". You can change this:
let config = MongoVectorConfig::new("mydb", "docs", "vector_index", 1536)
.with_embedding_field("vector");
Make sure this matches the path in your Atlas vector search index definition.
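For example, with .with_embedding_field("vector") the index definition would use that field as the path:
{
  "fields": [
    {
      "type": "vector",
      "path": "vector",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}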
Content and metadata fields
Customize which fields store the document content and metadata:
let config = MongoVectorConfig::new("mydb", "docs", "vector_index", 1536)
.with_content_field("text")
.with_metadata_field("meta");
Adding documents
MongoVectorStore implements the VectorStore trait:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Atlas Search Index Setup
Before you can run similarity searches, you must create a vector search index on your MongoDB Atlas collection. Free and shared clusters support vector search with a limited number of search indexes; an M10 or higher dedicated cluster is recommended for production workloads.
Creating an index via the Atlas UI
- Navigate to your cluster in the MongoDB Atlas console.
- Go to Search > Create Search Index.
- Choose JSON Editor and select the target database and collection.
- Paste the following index definition:
{
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
}
]
}
- Name your index (e.g. vector_index) and click Create Search Index.
Note: The path field must match the embedding_field configured in your MongoVectorConfig. If you customized it with .with_embedding_field("vector"), set "path": "vector" in the index definition. Similarly, adjust numDimensions to match your embedding model's output dimensionality.
Creating an index via the Atlas CLI
You can also create the index programmatically using the MongoDB Atlas CLI:
First, save the index definition to a file called index.json:
{
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
}
]
}
Then run:
atlas clusters search indexes create \
--clusterName my-cluster \
--db my_database \
--collection my_collection \
--file index.json
The index build runs asynchronously. You can check its status with:
atlas clusters search indexes list \
--clusterName my-cluster \
--db my_database \
--collection my_collection
Wait until the status shows READY before running similarity searches.
Similarity options
The similarity field in the index definition controls how vectors are compared:
| Value | Description |
|---|---|
cosine | Cosine similarity (default, good for normalized embeddings) |
euclidean | Euclidean (L2) distance |
dotProduct | Dot product (use with unit-length vectors) |
RAG Pipeline Example
Below is a complete Retrieval-Augmented Generation (RAG) pipeline that loads documents, splits them, embeds and stores them in MongoDB Atlas, then retrieves relevant context to answer a question.
use std::sync::Arc;
use synaptic::core::{
ChatModel, ChatRequest, Document, Embeddings, Message, Retriever, VectorStore,
};
use synaptic::mongodb::{MongoVectorConfig, MongoVectorStore};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::vectorstores::VectorStoreRetriever;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Configure embeddings and LLM
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let llm = OpenAiChatModel::new("gpt-4o-mini");
// 2. Connect to MongoDB Atlas
let config = MongoVectorConfig::new("my_database", "documents", "vector_index", 1536);
let store = MongoVectorStore::from_uri(
"mongodb+srv://user:pass@cluster.mongodb.net/",
config,
)
.await?;
// 3. Load and split documents
let raw_docs = vec![
Document::new("doc1", "Rust is a multi-paradigm, general-purpose programming language \
that emphasizes performance, type safety, and concurrency. It enforces memory safety \
without a garbage collector."),
Document::new("doc2", "MongoDB Atlas is a fully managed cloud database service. It provides \
built-in vector search capabilities for AI applications, supporting cosine, euclidean, \
and dot product similarity metrics."),
];
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&raw_docs)?;
// 4. Embed and store in MongoDB
store.add_documents(chunks, embeddings.as_ref()).await?;
// 5. Create a retriever
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 3);
// 6. Retrieve relevant context
let query = "What is Rust?";
let relevant_docs = retriever.retrieve(query, 3).await?;
let context = relevant_docs
.iter()
.map(|doc| doc.content.as_str())
.collect::<Vec<_>>()
.join("\n\n");
// 7. Generate answer using retrieved context
let messages = vec![
Message::system("Answer the user's question based on the following context. \
If the context doesn't contain relevant information, say so.\n\n\
Context:\n{context}".replace("{context}", &context)),
Message::human(query),
];
let response = llm.chat(ChatRequest::new(messages)).await?;
println!("Answer: {}", response.message.content());
Ok(())
}
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
database | String | required | MongoDB database name |
collection | String | required | MongoDB collection name |
index_name | String | required | Atlas vector search index name |
dims | u32 | required | Dimensionality of embedding vectors |
embedding_field | String | "embedding" | Field name for the vector embedding |
content_field | String | "content" | Field name for document text content |
metadata_field | String | "metadata" | Field name for document metadata |
Elasticsearch Vector Store
This guide shows how to use Elasticsearch as a vector store backend in Synaptic. Elasticsearch supports approximate kNN (k-nearest neighbors) search using dense vector fields.
Setup
Add the elasticsearch feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "elasticsearch"] }
Start an Elasticsearch instance (e.g. via Docker):
docker run -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.12.0
Configuration
Create an ElasticsearchConfig with the server URL, index name, and vector dimensionality:
use synaptic::elasticsearch::{ElasticsearchConfig, ElasticsearchVectorStore};
let config = ElasticsearchConfig::new("http://localhost:9200", "my_index", 1536);
let store = ElasticsearchVectorStore::new(config);
Authentication
For secured Elasticsearch clusters, provide credentials:
let config = ElasticsearchConfig::new("https://es.example.com:9200", "my_index", 1536)
.with_credentials("elastic", "changeme");
Creating the index
Call ensure_index() to create the index with the appropriate kNN vector mapping if it does not already exist:
store.ensure_index().await?;
This creates an index with a dense_vector field configured for the specified dimensionality and cosine similarity. The call is idempotent.
Similarity metric
The default similarity is cosine. You can change it:
let config = ElasticsearchConfig::new("http://localhost:9200", "my_index", 1536)
.with_similarity("dot_product");
Available options: "cosine" (default), "dot_product", "l2_norm".
Adding documents
ElasticsearchVectorStore implements the VectorStore trait:
use synaptic::core::{VectorStore, Document, Embeddings};
use synaptic::openai::OpenAiEmbeddings;
let embeddings = OpenAiEmbeddings::new("text-embedding-3-small");
let docs = vec![
Document::new("1", "Rust is a systems programming language"),
Document::new("2", "Python is great for data science"),
Document::new("3", "Go is designed for concurrency"),
];
let ids = store.add_documents(docs, &embeddings).await?;
Similarity search
Find the k most similar documents:
let results = store.similarity_search("fast systems language", 3, &embeddings).await?;
for doc in &results {
println!("{}: {}", doc.id, doc.content);
}
Search with scores
let scored = store.similarity_search_with_score("concurrency", 3, &embeddings).await?;
for (doc, score) in &scored {
println!("{} (score: {:.3}): {}", doc.id, score, doc.content);
}
Deleting documents
Remove documents by their IDs:
store.delete(&["1", "3"]).await?;
Using with a retriever
Wrap the store in a VectorStoreRetriever:
use std::sync::Arc;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::core::Retriever;
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("fast language", 5).await?;
Index Mapping Configuration
While ensure_index() creates a default mapping automatically, you may want full control over the index mapping for production use. Below is the recommended Elasticsearch mapping for vector search:
{
"mappings": {
"properties": {
"embedding": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine"
},
"content": { "type": "text" },
"metadata": { "type": "object", "enabled": true }
}
}
}
Creating the index via the REST API
You can create the index with a custom mapping using the Elasticsearch REST API:
curl -X PUT "http://localhost:9200/my-index" \
-H "Content-Type: application/json" \
-d '{
"mappings": {
"properties": {
"embedding": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine"
},
"content": { "type": "text" },
"metadata": { "type": "object", "enabled": true }
}
}
}'
Key mapping fields
type: "dense_vector"-- Tells Elasticsearch this field stores a fixed-length float array for vector operations.dims-- Must match the dimensionality of your embedding model (e.g. 1536 fortext-embedding-3-small, 768 for many open-source models).index: true-- Enables the kNN search data structure. Without this, you can store vectors but cannot perform efficient approximate nearest-neighbor queries. Set totruefor production use.similarity-- Determines the distance function used for kNN search:"cosine"(default) -- Cosine similarity, recommended for most embedding models."dot_product"-- Dot product, best for unit-length normalized vectors."l2_norm"-- Euclidean distance.
Mapping for metadata filtering
If you plan to filter search results by metadata fields, add explicit mappings for those fields:
{
"mappings": {
"properties": {
"embedding": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine"
},
"content": { "type": "text" },
"metadata": {
"properties": {
"source": { "type": "keyword" },
"category": { "type": "keyword" },
"created_at": { "type": "date" }
}
}
}
}
}
Using keyword type for metadata fields enables exact-match filtering in kNN queries.
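For illustration, here is a raw kNN query that restricts results to documents whose metadata.category is "guides". This uses standard Elasticsearch _search syntax directly rather than a Synaptic API, and the query vector is a shortened placeholder (a real query supplies all 1536 dimensions):
curl -X POST "http://localhost:9200/my-index/_search" \
  -H "Content-Type: application/json" \
  -d '{
    "knn": {
      "field": "embedding",
      "query_vector": [0.12, -0.03, 0.51],
      "k": 5,
      "num_candidates": 50,
      "filter": {
        "term": { "metadata.category": "guides" }
      }
    }
  }'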
RAG Pipeline Example
Below is a complete Retrieval-Augmented Generation (RAG) pipeline that loads documents, splits them, embeds and stores them in Elasticsearch, then retrieves relevant context to answer a question.
use std::sync::Arc;
use synaptic::core::{
ChatModel, ChatRequest, Document, Embeddings, Message, Retriever, VectorStore,
};
use synaptic::elasticsearch::{ElasticsearchConfig, ElasticsearchVectorStore};
use synaptic::openai::{OpenAiChatModel, OpenAiEmbeddings};
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::vectorstores::VectorStoreRetriever;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Configure embeddings and LLM
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let llm = OpenAiChatModel::new("gpt-4o-mini");
// 2. Connect to Elasticsearch and create the index
let config = ElasticsearchConfig::new("http://localhost:9200", "rag_documents", 1536);
let store = ElasticsearchVectorStore::new(config);
store.ensure_index().await?;
// 3. Load and split documents
let raw_docs = vec![
Document::new("doc1", "Rust is a multi-paradigm, general-purpose programming language \
that emphasizes performance, type safety, and concurrency. It enforces memory safety \
without a garbage collector."),
Document::new("doc2", "Elasticsearch is a distributed, RESTful search and analytics engine. \
It supports vector search through dense_vector fields and approximate kNN queries, \
making it suitable for semantic search and RAG applications."),
];
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&raw_docs);
// 4. Embed and store in Elasticsearch
store.add_documents(chunks, embeddings.as_ref()).await?;
// 5. Create a retriever
let store = Arc::new(store);
let retriever = VectorStoreRetriever::new(store, embeddings, 3);
// 6. Retrieve relevant context
let query = "What is Rust?";
let relevant_docs = retriever.retrieve(query, 3).await?;
let context = relevant_docs
.iter()
.map(|doc| doc.content.as_str())
.collect::<Vec<_>>()
.join("\n\n");
// 7. Generate answer using retrieved context
let messages = vec![
Message::system("Answer the user's question based on the following context. \
If the context doesn't contain relevant information, say so.\n\n\
Context:\n{context}".replace("{context}", &context)),
Message::human(query),
];
let response = llm.chat(ChatRequest::new(messages)).await?;
println!("Answer: {}", response.message.content());
Ok(())
}
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
url | String | required | Elasticsearch server URL |
index_name | String | required | Name of the Elasticsearch index |
dims | u32 | required | Dimensionality of embedding vectors |
username | Option<String> | None | Username for basic auth |
password | Option<String> | None | Password for basic auth |
similarity | String | "cosine" | Similarity metric (cosine, dot_product, l2_norm) |
Redis Store & Cache
This guide shows how to use Redis for persistent key-value storage and LLM response caching in Synaptic. The redis integration provides two components:
- RedisStore -- implements the Store trait for namespace-scoped key-value storage.
- RedisCache -- implements the LlmCache trait for caching LLM responses with optional TTL.
Setup
Add the redis feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "redis"] }
Ensure you have a Redis server running:
docker run -p 6379:6379 redis:7
RedisStore
Creating a store
The simplest way to create a store is from a Redis URL:
use synaptic::redis::RedisStore;
let store = RedisStore::from_url("redis://127.0.0.1/")?;
Custom key prefix
By default, all keys are prefixed with "synaptic:store:". You can customize this:
use synaptic::redis::{RedisStore, RedisStoreConfig};
let config = RedisStoreConfig {
prefix: "myapp:store:".to_string(),
};
let store = RedisStore::from_url_with_config("redis://127.0.0.1/", config)?;
Using an existing client
If you already have a configured redis::Client, pass it directly:
use synaptic::redis::{RedisStore, RedisStoreConfig};
let client = redis::Client::open("redis://127.0.0.1/")?;
let store = RedisStore::new(client, RedisStoreConfig::default());
Storing and retrieving data
RedisStore implements the Store trait with full namespace support:
use synaptic::core::Store;
use serde_json::json;
// Put a value under a namespace
store.put(&["users", "prefs"], "theme", json!("dark")).await?;
// Retrieve the value
let item = store.get(&["users", "prefs"], "theme").await?;
if let Some(item) = item {
println!("Theme: {}", item.value); // "dark"
}
Searching within a namespace
Search for items using substring matching on keys and values:
store.put(&["docs"], "rust", json!("Rust is fast")).await?;
store.put(&["docs"], "python", json!("Python is flexible")).await?;
// Search with a query string (substring match)
let results = store.search(&["docs"], Some("fast"), 10).await?;
assert_eq!(results.len(), 1);
// Search without a query (list all items in namespace)
let all = store.search(&["docs"], None, 10).await?;
assert_eq!(all.len(), 2);
Deleting data
store.delete(&["users", "prefs"], "theme").await?;
Listing namespaces
List all known namespace paths, optionally filtered by prefix:
store.put(&["app", "settings"], "key1", json!("v1")).await?;
store.put(&["app", "cache"], "key2", json!("v2")).await?;
store.put(&["logs"], "key3", json!("v3")).await?;
// List all namespaces
let all_ns = store.list_namespaces(&[]).await?;
// [["app", "settings"], ["app", "cache"], ["logs"]]
// List namespaces under "app"
let app_ns = store.list_namespaces(&["app"]).await?;
// [["app", "settings"], ["app", "cache"]]
Using with agents
Pass the store to create_agent so that RuntimeAwareTool implementations receive it via ToolRuntime:
use std::sync::Arc;
use synaptic::graph::{create_agent, AgentOptions};
use synaptic::redis::RedisStore;
let store = Arc::new(RedisStore::from_url("redis://127.0.0.1/")?);
let options = AgentOptions {
store: Some(store),
..Default::default()
};
let graph = create_agent(model, tools, options)?;
RedisCache
Creating a cache
Create a cache from a Redis URL:
use synaptic::redis::RedisCache;
let cache = RedisCache::from_url("redis://127.0.0.1/")?;
Cache with TTL
Set a TTL (in seconds) so entries expire automatically:
use synaptic::redis::{RedisCache, RedisCacheConfig};
let config = RedisCacheConfig {
ttl: Some(3600), // 1 hour
..Default::default()
};
let cache = RedisCache::from_url_with_config("redis://127.0.0.1/", config)?;
Without a TTL, cached entries persist indefinitely until explicitly cleared.
Custom key prefix
The default cache prefix is "synaptic:cache:". Customize it to avoid collisions:
let config = RedisCacheConfig {
prefix: "myapp:llm_cache:".to_string(),
ttl: Some(1800), // 30 minutes
};
let cache = RedisCache::from_url_with_config("redis://127.0.0.1/", config)?;
Wrapping a ChatModel
Use CachedChatModel to cache responses from any ChatModel:
use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::cache::CachedChatModel;
use synaptic::redis::RedisCache;
use synaptic::openai::OpenAiChatModel;
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let cache = Arc::new(RedisCache::from_url("redis://127.0.0.1/")?);
let cached_model = CachedChatModel::new(model, cache);
// First call hits the LLM; identical requests return the cached response
Clearing the cache
Remove all cached entries:
use synaptic::core::LlmCache;
cache.clear().await?;
This deletes all Redis keys matching the cache prefix.
Using an existing client
let client = redis::Client::open("redis://127.0.0.1/")?;
let cache = RedisCache::new(client, RedisCacheConfig::default());
Configuration reference
RedisStoreConfig
| Field | Type | Default | Description |
|---|---|---|---|
prefix | String | "synaptic:store:" | Key prefix for all store entries |
RedisCacheConfig
| Field | Type | Default | Description |
|---|---|---|---|
prefix | String | "synaptic:cache:" | Key prefix for all cache entries |
ttl | Option<u64> | None | TTL in seconds; None means entries never expire |
Key format
- Store keys: {prefix}{namespace_joined_by_colon}:{key} (e.g. synaptic:store:users:prefs:theme)
- Cache keys: {prefix}{key} (e.g. synaptic:cache:abc123)
- Namespace index: {prefix}__namespaces__ (a Redis SET tracking all namespace paths)
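To inspect the resulting layout on a live server, you can scan for the default prefixes with redis-cli (adjust the patterns if you customized the prefixes):
# Store entries
redis-cli --scan --pattern "synaptic:store:*"

# Cached LLM responses
redis-cli --scan --pattern "synaptic:cache:*"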
SQLite Cache
This guide shows how to use SQLite as a persistent LLM response cache in Synaptic. SqliteCache stores chat model responses locally so identical requests are served from disk without calling the LLM again.
Setup
Add the sqlite feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "sqlite"] }
No external service is required. The cache uses a local SQLite file (or an in-memory database for testing).
Configuration
File-based cache
Create a SqliteCacheConfig pointing to a database file:
use synaptic::sqlite::{SqliteCacheConfig, SqliteCache};
let config = SqliteCacheConfig::new("cache.db");
let cache = SqliteCache::new(config).await?;
The database file is created automatically if it does not exist. The constructor is async because it initializes the database schema.
In-memory cache
For testing or ephemeral use, create an in-memory SQLite cache:
let config = SqliteCacheConfig::in_memory();
let cache = SqliteCache::new(config).await?;
TTL (time-to-live)
Set an optional TTL so cached entries expire automatically:
use std::time::Duration;
let config = SqliteCacheConfig::new("cache.db")
.with_ttl(Duration::from_secs(3600)); // 1 hour
let cache = SqliteCache::new(config).await?;
Without a TTL, cached entries persist indefinitely.
Usage
Wrapping a ChatModel
Use CachedChatModel from synaptic-cache to wrap any ChatModel:
use std::sync::Arc;
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::cache::CachedChatModel;
use synaptic::sqlite::{SqliteCacheConfig, SqliteCache};
use synaptic::openai::OpenAiChatModel;
let model: Arc<dyn ChatModel> = Arc::new(OpenAiChatModel::new("gpt-4o-mini"));
let config = SqliteCacheConfig::new("llm_cache.db");
let cache = Arc::new(SqliteCache::new(config).await?);
let cached_model = CachedChatModel::new(model, cache);
// First call hits the LLM
let request = ChatRequest::new(vec![Message::human("What is Rust?")]);
let response = cached_model.chat(&request).await?;
// Second identical call returns the cached response instantly
let response2 = cached_model.chat(&request).await?;
Direct cache access
SqliteCache implements the LlmCache trait, so you can use it directly:
use synaptic::core::LlmCache;
// Look up a cached response by key
let cached = cache.lookup("some-cache-key").await?;
// Store a response
cache.update("some-cache-key", &response).await?;
// Clear all entries
cache.clear().await?;
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
path | String | required | Path to the SQLite database file (or ":memory:" for in-memory) |
ttl | Option<Duration> | None | Time-to-live for cache entries; None means entries never expire |
PDF Loader
This guide shows how to load documents from PDF files using Synaptic's PdfLoader. It extracts text content from PDFs and produces Document values that can be passed to text splitters, embeddings, and vector stores.
Setup
Add the pdf feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.3", features = ["pdf"] }
The PDF extraction is handled by the pdf_extract library, which is pulled in automatically.
Loading a PDF as a single document
By default, PdfLoader combines all pages into one Document:
use synaptic::pdf::{PdfLoader, Loader};
let loader = PdfLoader::new("report.pdf");
let docs = loader.load().await?;
assert_eq!(docs.len(), 1);
println!("Content: {}", docs[0].content);
println!("Source: {}", docs[0].metadata["source"]); // "report.pdf"
println!("Pages: {}", docs[0].metadata["total_pages"]); // e.g. 12
The document ID is set to the file path string. Metadata includes:
- source -- the file path
- total_pages -- the total number of pages in the PDF
Loading with one document per page
Use with_split_pages to produce a separate Document for each page:
use synaptic::pdf::{PdfLoader, Loader};
let loader = PdfLoader::with_split_pages("report.pdf");
let docs = loader.load().await?;
for doc in &docs {
println!(
"Page {}/{}: {}...",
doc.metadata["page"],
doc.metadata["total_pages"],
&doc.content[..80]
);
}
Each document has the following metadata:
- source -- the file path
- page -- the 1-based page number
- total_pages -- the total number of pages
Document IDs follow the format {path}:page_{n} (e.g. report.pdf:page_3). Empty pages are automatically skipped.
RAG pipeline with PDF
A common pattern is to load a PDF, split it into chunks, embed, and store for retrieval:
use synaptic::pdf::{PdfLoader, Loader};
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore, VectorStoreRetriever};
use synaptic::openai::OpenAiEmbeddings;
use synaptic::retrieval::Retriever;
use std::sync::Arc;
// 1. Load the PDF
let loader = PdfLoader::with_split_pages("manual.pdf");
let docs = loader.load().await?;
// 2. Split into chunks
let splitter = RecursiveCharacterTextSplitter::new(1000, 200);
let chunks = splitter.split_documents(&docs)?;
// 3. Embed and store
let embeddings = Arc::new(OpenAiEmbeddings::new("text-embedding-3-small"));
let store = Arc::new(InMemoryVectorStore::new());
store.add_documents(chunks, embeddings.as_ref()).await?;
// 4. Retrieve
let retriever = VectorStoreRetriever::new(store, embeddings, 5);
let results = retriever.retrieve("How do I configure the system?", 5).await?;
This works equally well with QdrantVectorStore or PgVectorStore in place of InMemoryVectorStore.
Processing multiple PDFs
Use DirectoryLoader with a glob filter, or load PDFs individually and merge the results:
use synaptic::pdf::{PdfLoader, Loader};
let paths = vec!["docs/intro.pdf", "docs/guide.pdf", "docs/reference.pdf"];
let mut all_docs = Vec::new();
for path in paths {
let loader = PdfLoader::with_split_pages(path);
let docs = loader.load().await?;
all_docs.extend(docs);
}
// all_docs now contains page-level documents from all three PDFs
How text extraction works
PdfLoader uses the pdf_extract library internally. Text extraction runs on a blocking thread via tokio::task::spawn_blocking to avoid blocking the async runtime.
Page boundaries are detected by form feed characters (\x0c) that pdf_extract inserts between pages. When using with_split_pages, the text is split on these characters and each non-empty segment becomes a document.
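A minimal sketch of that splitting step, shown standalone for illustration (this is not PdfLoader's actual source, just the same technique applied to a plain string):
// pdf_extract separates pages with form feed characters; splitting on them
// and dropping blank segments yields one chunk of text per non-empty page.
let extracted = "Page one text\x0cPage two text\x0c\x0cPage four text";
let pages: Vec<&str> = extracted
    .split('\x0c')
    .map(str::trim)
    .filter(|page| !page.is_empty())
    .collect();
assert_eq!(pages.len(), 3); // the empty third page is skipped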
Configuration reference
| Constructor | Behavior |
|---|---|
PdfLoader::new(path) | All pages combined into a single Document |
PdfLoader::with_split_pages(path) | One Document per page |
Metadata fields
| Field | Type | Present in | Description |
|---|---|---|---|
source | String | Both modes | The file path |
page | Number | Split pages only | 1-based page number |
total_pages | Number | Both modes | Total number of pages in the PDF |
Tavily Search Tool
This guide shows how to use the Tavily web search API as a tool in Synaptic. Tavily is a search engine optimized for LLM agents, returning concise and relevant results.
Setup
Add the tavily feature to your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["openai", "tavily"] }
Set your Tavily API key:
export TAVILY_API_KEY="tvly-..."
Configuration
Create a TavilyConfig and build the tool:
use synaptic::tavily::{TavilyConfig, TavilySearchTool};
let config = TavilyConfig::new("your-tavily-api-key");
let tool = TavilySearchTool::new(config);
Max results
Control how many search results are returned (default is 5):
let config = TavilyConfig::new("your-tavily-api-key")
.with_max_results(10);
Search depth
Choose between "basic" (default) and "advanced" search depth. Advanced search performs deeper crawling for more comprehensive results:
let config = TavilyConfig::new("your-tavily-api-key")
.with_search_depth("advanced");
Usage
As a standalone tool
TavilySearchTool implements the Tool trait with the name "tavily_search". It accepts a JSON input with a "query" field:
use synaptic::core::Tool;
let result = tool.call(serde_json::json!({
"query": "latest Rust programming news"
})).await?;
println!("{}", result);
The result is a JSON string containing search results with titles, URLs, and content snippets.
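If you want to post-process the results yourself, the string can be parsed with serde_json. The field names below (results, title, url) are illustrative assumptions about the response shape rather than a documented schema:
use serde_json::Value;

// `result` is the JSON string returned by tool.call(...)
let parsed: Value = serde_json::from_str(&result)?;

// Assumed shape: a "results" array of objects with "title" and "url" fields
if let Some(items) = parsed.get("results").and_then(Value::as_array) {
    for item in items {
        let title = item.get("title").and_then(Value::as_str).unwrap_or("untitled");
        let url = item.get("url").and_then(Value::as_str).unwrap_or("");
        println!("{} -- {}", title, url);
    }
}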
With an agent
Register the tool with an agent so the LLM can invoke web searches:
use std::sync::Arc;
use synaptic::tavily::{TavilyConfig, TavilySearchTool};
use synaptic::tools::ToolRegistry;
use synaptic::graph::create_react_agent;
use synaptic::openai::OpenAiChatModel;
let search = TavilySearchTool::new(TavilyConfig::new("your-tavily-api-key"));
let mut registry = ToolRegistry::new();
registry.register(Arc::new(search));
let model = OpenAiChatModel::new("gpt-4o");
let agent = create_react_agent(Arc::new(model), registry)?;
The agent can now call tavily_search when it needs to look up current information.
Tool definition
The tool advertises the following schema to the LLM:
{
"name": "tavily_search",
"description": "Search the web for current information on a topic.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query"
}
},
"required": ["query"]
}
}
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | String | required | Tavily API key |
max_results | usize | 5 | Maximum number of search results to return |
search_depth | String | "basic" | Search depth: "basic" or "advanced" |
Procedural Macros
The synaptic-macros crate ships 12 attribute macros that eliminate boilerplate
when building agents with Synaptic. Instead of manually implementing traits such
as Tool, AgentMiddleware, or Entrypoint, you annotate an ordinary function
and the macro generates the struct, the trait implementation, and a factory
function for you.
All macros live in the synaptic_macros crate and are re-exported through the
synaptic facade, so you can import them with:
use synaptic::macros::*; // all macros at once
use synaptic::macros::tool; // or pick individually
| Macro | Purpose | Page |
|---|---|---|
#[tool] | Define tools from functions | This page |
#[chain] | Create runnable chains | This page |
#[entrypoint] | Workflow entry points | This page |
#[task] | Trackable tasks | This page |
#[traceable] | Tracing instrumentation | This page |
#[before_agent] | Middleware: before agent loop | Middleware Macros |
#[before_model] | Middleware: before model call | Middleware Macros |
#[after_model] | Middleware: after model call | Middleware Macros |
#[after_agent] | Middleware: after agent loop | Middleware Macros |
#[wrap_model_call] | Middleware: wrap model call | Middleware Macros |
#[wrap_tool_call] | Middleware: wrap tool call | Middleware Macros |
#[dynamic_prompt] | Middleware: dynamic system prompt | Middleware Macros |
For complete end-to-end scenarios, see Macro Examples.
#[tool] -- Define Tools from Functions
#[tool] converts an async fn into a full Tool (or RuntimeAwareTool)
implementation. The macro generates:
- A struct named {PascalCase}Tool (e.g. web_search becomes WebSearchTool).
- An impl Tool for WebSearchTool block with name(), description(), parameters() (JSON Schema), and call().
- A factory function with the original name that returns Arc<dyn Tool>.
Basic Usage
use synaptic::macros::tool;
use synaptic::core::SynapticError;
/// Search the web for a given query.
#[tool]
async fn web_search(query: String) -> Result<String, SynapticError> {
Ok(format!("Results for '{}'", query))
}
// The macro produces:
// struct WebSearchTool;
// impl Tool for WebSearchTool { ... }
// fn web_search() -> Arc<dyn Tool> { ... }
let tool = web_search();
assert_eq!(tool.name(), "web_search");
Doc Comments as Description
The doc comment on the function becomes the tool description that is sent to the LLM. Write a clear, concise sentence -- this is what the model reads when deciding whether to call your tool.
/// Fetch the current weather for a city.
#[tool]
async fn get_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
let tool = get_weather();
assert_eq!(tool.description(), "Fetch the current weather for a city.");
You can also override the description explicitly:
#[tool(description = "Look up weather information.")]
async fn get_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
Parameter Types and JSON Schema
Each function parameter is mapped to a JSON Schema property automatically. The following type mappings are supported:
| Rust Type | JSON Schema |
|---|---|
String | {"type": "string"} |
i8, i16, i32, i64, u8, u16, u32, u64, usize, isize | {"type": "integer"} |
f32, f64 | {"type": "number"} |
bool | {"type": "boolean"} |
Vec<T> | {"type": "array", "items": <schema of T>} |
serde_json::Value | {"type": "object"} |
T: JsonSchema (with schemars feature) | Full schema from schemars |
Any other type (without schemars) | {"type": "object"} (fallback) |
Parameter doc comments become "description" in the JSON Schema, giving the LLM
extra context about what to pass:
#[tool]
async fn search(
/// The search query string
query: String,
/// Maximum number of results to return
max_results: i64,
) -> Result<String, SynapticError> {
Ok(format!("Searching '{}' (limit {})", query, max_results))
}
This generates a JSON Schema similar to:
{
"type": "object",
"properties": {
"query": { "type": "string", "description": "The search query string" },
"max_results": { "type": "integer", "description": "Maximum number of results to return" }
},
"required": ["query", "max_results"]
}
Custom Types with schemars
By default, custom struct parameters generate a minimal {"type": "object"} schema
with no field details — the LLM has no guidance about the struct's shape. To generate
full schemas for custom types, enable the schemars feature and derive JsonSchema
on your parameter types.
Enable the feature in your Cargo.toml:
[dependencies]
synaptic = { version = "0.2", features = ["macros", "schemars"] }
schemars = { version = "0.8", features = ["derive"] }
Derive JsonSchema on your parameter types:
use schemars::JsonSchema;
use serde::Deserialize;
use synaptic::macros::tool;
use synaptic::core::SynapticError;
#[derive(Deserialize, JsonSchema)]
struct UserInfo {
/// User's display name
name: String,
/// Age in years
age: i32,
email: Option<String>,
}
/// Process user information.
#[tool]
async fn process_user(
/// The user to process
user: UserInfo,
/// Action to perform
action: String,
) -> Result<String, SynapticError> {
Ok(format!("{}: {}", user.name, action))
}
Without schemars, user generates:
{ "type": "object", "description": "The user to process" }
With schemars, user generates a full schema:
{
"type": "object",
"description": "The user to process",
"properties": {
"name": { "type": "string" },
"age": { "type": "integer", "format": "int32" },
"email": { "type": "string" }
},
"required": ["name", "age"]
}
Nested types work automatically — if UserInfo contained an Address struct that
also derives JsonSchema, the address schema is included via $defs references.
Note: Known primitive types (String, i32, Vec<T>, bool, etc.) always use the built-in hardcoded schemas regardless of whether schemars is enabled. Only unknown/custom types benefit from the schemars integration.
Optional Parameters (Option<T>)
Wrap a parameter in Option<T> to make it optional. Optional parameters are
excluded from the "required" array in the schema. At runtime, missing or
null JSON values are deserialized as None.
#[tool]
async fn search(
query: String,
/// Filter by language (optional)
language: Option<String>,
) -> Result<String, SynapticError> {
let lang = language.unwrap_or_else(|| "en".into());
Ok(format!("Searching '{}' in {}", query, lang))
}
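The schema generated for this function looks roughly like the following (the exact representation of the Option type may differ); the important part is that language is absent from required:
{
  "type": "object",
  "properties": {
    "query": { "type": "string" },
    "language": { "type": "string", "description": "Filter by language (optional)" }
  },
  "required": ["query"]
}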
Default Values (#[default = ...])
Use #[default = value] on a parameter to supply a compile-time default.
Parameters with defaults are not required in the schema, and the default is
recorded in the "default" field of the schema property.
#[tool]
async fn search(
query: String,
#[default = 10]
max_results: i64,
#[default = "en"]
language: String,
) -> Result<String, SynapticError> {
Ok(format!("Searching '{}' (max {}, lang {})", query, max_results, language))
}
If the LLM omits max_results, it defaults to 10. If it omits language,
it defaults to "en".
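The resulting schema is roughly the following, with only query required and each default recorded on its property:
{
  "type": "object",
  "properties": {
    "query": { "type": "string" },
    "max_results": { "type": "integer", "default": 10 },
    "language": { "type": "string", "default": "en" }
  },
  "required": ["query"]
}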
Custom Tool Name (#[tool(name = "...")])
By default the tool name matches the function name. Override it with the name
attribute when you need a different identifier exposed to the LLM:
#[tool(name = "google_search")]
async fn search(query: String) -> Result<String, SynapticError> {
Ok(format!("Searching for '{}'", query))
}
let tool = search();
assert_eq!(tool.name(), "google_search");
The factory function keeps the original Rust name (search()), but
tool.name() returns "google_search".
Struct Fields (#[field])
Some tools need to hold state — a database connection, an API client, a backend
reference, etc. Mark those parameters with #[field] and they become struct
fields instead of JSON Schema parameters. The factory function will require
these values at construction time, and they are hidden from the LLM entirely.
use std::sync::Arc;
use synaptic::core::SynapticError;
use serde_json::Value;
#[tool]
async fn db_lookup(
#[field] connection: Arc<String>,
/// The table to query
table: String,
) -> Result<String, SynapticError> {
Ok(format!("Querying {} on {}", table, connection))
}
// Factory now requires the field parameter:
let tool = db_lookup(Arc::new("postgres://localhost".into()));
assert_eq!(tool.name(), "db_lookup");
// Only "table" appears in the schema; "connection" is hidden
The macro generates a struct with the field:
struct DbLookupTool {
connection: Arc<String>,
}
You can combine #[field] with regular parameters, Option<T>, and
#[default = ...]. Multiple #[field] parameters are supported:
#[tool]
async fn annotate(
#[field] prefix: String,
#[field] suffix: String,
/// The input text
text: String,
#[default = 1]
repeat: i64,
) -> Result<String, SynapticError> {
let inner = text.repeat(repeat as usize);
Ok(format!("{}{}{}", prefix, inner, suffix))
}
let tool = annotate("<<".into(), ">>".into());
Note: #[field] and #[inject] cannot be used on the same parameter. Use #[field] when the value is provided at construction time; use #[inject] when it comes from the agent runtime.
Raw Arguments (#[args])
Some tools need to receive the raw JSON arguments without any deserialization —
for example, echo tools that forward the entire input, or tools that handle
arbitrary JSON payloads. Mark the parameter with #[args] and it will receive
the raw serde_json::Value passed to call() directly.
use synaptic::macros::tool;
use synaptic::core::SynapticError;
use serde_json::{json, Value};
/// Echo the input back.
#[tool(name = "echo")]
async fn echo(#[args] args: Value) -> Result<Value, SynapticError> {
Ok(json!({"echo": args}))
}
let tool = echo();
assert_eq!(tool.name(), "echo");
// parameters() returns None — no JSON Schema is generated
assert!(tool.parameters().is_none());
The #[args] parameter:
- Receives the raw Value without any JSON Schema generation or deserialization
- Causes parameters() to return None (unless there are other normal parameters)
- Can be combined with #[field] parameters (struct fields are still supported)
- Cannot be combined with #[inject] on the same parameter
- At most one parameter can be marked #[args]
/// Echo with a configurable prefix.
#[tool]
async fn echo_with_prefix(
#[field] prefix: String,
#[args] args: Value,
) -> Result<Value, SynapticError> {
Ok(json!({"prefix": prefix, "data": args}))
}
let tool = echo_with_prefix(">>".into());
Runtime Injection (#[inject(state)], #[inject(store)], #[inject(tool_call_id)])
Some tools need access to agent runtime state that the LLM should not (and
cannot) provide. Mark those parameters with #[inject(...)] and they will be
populated from the ToolRuntime context instead of from the LLM-supplied JSON
arguments. Injected parameters are hidden from the JSON Schema entirely.
When any parameter uses #[inject(...)], the macro generates a
RuntimeAwareTool implementation (with call_with_runtime) instead of a plain
Tool.
There are three injection kinds:
| Annotation | Source | Typical Type |
|---|---|---|
#[inject(state)] | ToolRuntime::state (deserialized from Value) | Your state struct, or Value |
#[inject(store)] | ToolRuntime::store (cloned Option<Arc<dyn Store>>) | Arc<dyn Store> |
#[inject(tool_call_id)] | ToolRuntime::tool_call_id (the ID of the current call) | String |
use synaptic::core::{SynapticError, ToolRuntime};
use std::sync::Arc;
#[tool]
async fn save_note(
/// The note content
content: String,
/// Injected: the current tool call ID
#[inject(tool_call_id)]
call_id: String,
/// Injected: shared application state
#[inject(state)]
state: serde_json::Value,
) -> Result<String, SynapticError> {
Ok(format!("Saved note (call={}) with state {:?}", call_id, state))
}
// Factory returns Arc<dyn RuntimeAwareTool> instead of Arc<dyn Tool>
let tool = save_note();
The LLM only sees content in the schema; call_id and state are supplied
by the agent runtime automatically.
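Concretely, the parameters schema advertised for save_note contains only the content property, roughly:
{
  "type": "object",
  "properties": {
    "content": { "type": "string", "description": "The note content" }
  },
  "required": ["content"]
}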
#[chain] -- Create Runnable Chains
#[chain] wraps an async fn as a BoxRunnable. It is a lightweight way to
create composable runnable steps that can be piped together.
The macro generates:
- A private {name}_impl function containing the original body.
- A public factory function with the original name that returns a BoxRunnable<InputType, OutputType> backed by a RunnableLambda.
Output Type Inference
The macro automatically detects the return type:
| Return Type | Generated Type | Behavior |
|---|---|---|
Result<Value, _> | BoxRunnable<I, Value> | Serializes result to Value |
Result<String, _> | BoxRunnable<I, String> | Returns directly, no serialization |
Result<T, _> (any other) | BoxRunnable<I, T> | Returns directly, no serialization |
Basic Usage
use synaptic::macros::chain;
use synaptic::core::SynapticError;
use serde_json::Value;
// Value output — result is serialized to Value
#[chain]
async fn uppercase(input: Value) -> Result<Value, SynapticError> {
let s = input.as_str().unwrap_or_default().to_uppercase();
Ok(Value::String(s))
}
// `uppercase()` returns BoxRunnable<Value, Value>
let runnable = uppercase();
Typed Output
When the return type is not Value, the macro generates a typed runnable
without serialization overhead:
// String output — returns BoxRunnable<String, String>
#[chain]
async fn to_upper(s: String) -> Result<String, SynapticError> {
Ok(s.to_uppercase())
}
#[chain]
async fn exclaim(s: String) -> Result<String, SynapticError> {
Ok(format!("{}!", s))
}
// Typed chains compose naturally with |
let pipeline = to_upper() | exclaim();
let result = pipeline.invoke("hello".into(), &config).await?;
assert_eq!(result, "HELLO!");
Composition with |
Runnables support pipe-based composition. Chain multiple steps together by combining the factories:
#[chain]
async fn step_a(input: Value) -> Result<Value, SynapticError> {
// ... transform input ...
Ok(input)
}
#[chain]
async fn step_b(input: Value) -> Result<Value, SynapticError> {
// ... transform further ...
Ok(input)
}
// Compose into a pipeline: step_a | step_b
let pipeline = step_a() | step_b();
let config = RunnableConfig::default();
let result = pipeline.invoke(serde_json::json!("hello"), &config).await?;
Note: #[chain] does not accept any arguments. Attempting to write #[chain(name = "...")] will produce a compile error.
#[entrypoint] -- Workflow Entry Points
#[entrypoint] defines a LangGraph-style workflow entry point. The macro
generates a factory function that returns a synaptic::core::Entrypoint struct
containing the configuration and a boxed async closure.
The decorated function must:
- Be async.
- Accept exactly one parameter of type serde_json::Value.
- Return Result<Value, SynapticError>.
Basic Usage
use synaptic::macros::entrypoint;
use synaptic::core::SynapticError;
use serde_json::Value;
#[entrypoint]
async fn my_workflow(input: Value) -> Result<Value, SynapticError> {
// orchestrate agents, tools, subgraphs...
Ok(input)
}
let ep = my_workflow();
// ep.config.name == "my_workflow"
Attributes (name, checkpointer)
| Attribute | Default | Description |
|---|---|---|
name = "..." | function name | Override the entrypoint name |
checkpointer = "..." | None | Hint which checkpointer backend to use (e.g. "memory", "redis") |
#[entrypoint(name = "chat_bot", checkpointer = "memory")]
async fn my_workflow(input: Value) -> Result<Value, SynapticError> {
Ok(input)
}
let ep = my_workflow();
assert_eq!(ep.config.name, "chat_bot");
assert_eq!(ep.config.checkpointer.as_deref(), Some("memory"));
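The returned Entrypoint is executed through its boxed run closure, using the same calling pattern shown in the Macro Examples section (a minimal sketch; the input shape is whatever your workflow expects):
use serde_json::json;

let ep = my_workflow();
// ep.run is the boxed async closure generated by #[entrypoint]
let output = (ep.run)(json!({"message": "hello"})).await?;
assert_eq!(output["message"], "hello"); // my_workflow echoes its input back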
#[task] -- Trackable Tasks
#[task] marks an async function as a named task. This is useful inside
entrypoints for tracing and streaming identification. The macro:
- Renames the original function to {name}_impl.
- Creates a public wrapper function that defines a __TASK_NAME constant and delegates to the impl.
Basic Usage
use synaptic::macros::task;
use synaptic::core::SynapticError;
#[task]
async fn fetch_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
// Calling fetch_weather("Paris".into()) internally sets __TASK_NAME = "fetch_weather"
// and delegates to fetch_weather_impl("Paris".into()).
let result = fetch_weather("Paris".into()).await?;
Custom Task Name
Override the task name with name = "...":
#[task(name = "weather_lookup")]
async fn fetch_weather(city: String) -> Result<String, SynapticError> {
Ok(format!("Sunny in {}", city))
}
// __TASK_NAME is now "weather_lookup"
#[traceable] -- Tracing Instrumentation
#[traceable] adds tracing instrumentation to any function. It wraps the
function body in a tracing::info_span! with parameter values recorded as span
fields. For async functions, the span is propagated correctly using
tracing::Instrument.
Basic Usage
use synaptic::macros::traceable;
#[traceable]
async fn process_data(input: String, count: usize) -> String {
format!("{}: {}", input, count)
}
This generates code equivalent to:
async fn process_data(input: String, count: usize) -> String {
use tracing::Instrument;
let __span = tracing::info_span!(
"process_data",
input = tracing::field::debug(&input),
count = tracing::field::debug(&count),
);
async move {
format!("{}: {}", input, count)
}
.instrument(__span)
.await
}
For synchronous functions, the macro uses a span guard instead of Instrument:
#[traceable]
fn compute(x: i32, y: i32) -> i32 {
x + y
}
// Generates a span guard: let __enter = __span.enter();
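Put together, the synchronous expansion is roughly equivalent to:
fn compute(x: i32, y: i32) -> i32 {
    let __span = tracing::info_span!(
        "compute",
        x = tracing::field::debug(&x),
        y = tracing::field::debug(&y),
    );
    let __enter = __span.enter();
    x + y
}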
Custom Span Name
Override the default span name (which is the function name) with name = "...":
#[traceable(name = "data_pipeline")]
async fn process_data(input: String) -> String {
input.to_uppercase()
}
// The span is named "data_pipeline" instead of "process_data"
Skipping Parameters
Exclude sensitive or large parameters from being recorded in the span with
skip = "param1,param2":
#[traceable(skip = "api_key")]
async fn call_api(query: String, api_key: String) -> Result<String, SynapticError> {
// `query` is recorded in the span, `api_key` is not
Ok(format!("Called API with '{}'", query))
}
You can combine both attributes:
#[traceable(name = "api_call", skip = "api_key,secret")]
async fn call_api(query: String, api_key: String, secret: String) -> Result<String, SynapticError> {
Ok("done".into())
}
Middleware Macros
Synaptic provides seven macros for defining agent middleware. Each one generates:
- A struct named {PascalCase}Middleware (e.g. log_response becomes LogResponseMiddleware).
- An impl AgentMiddleware for {PascalCase}Middleware with the corresponding hook method overridden.
- A factory function with the original name that returns Arc<dyn AgentMiddleware>.
None of the middleware macros accept attribute arguments. However, all middleware
macros support #[field] parameters for building stateful middleware (see
Stateful Middleware with #[field] below).
#[before_agent]
Runs before the agent loop starts. The function receives a mutable reference to the message list.
Signature: async fn(messages: &mut Vec<Message>) -> Result<(), SynapticError>
use synaptic::macros::before_agent;
use synaptic::core::{Message, SynapticError};
#[before_agent]
async fn inject_system(messages: &mut Vec<Message>) -> Result<(), SynapticError> {
println!("Starting agent with {} messages", messages.len());
Ok(())
}
let mw = inject_system(); // Arc<dyn AgentMiddleware>
#[before_model]
Runs before each model call. Use this to modify the request (e.g., add headers, tweak temperature, inject a system prompt).
Signature: async fn(request: &mut ModelRequest) -> Result<(), SynapticError>
use synaptic::macros::before_model;
use synaptic::middleware::ModelRequest;
use synaptic::core::SynapticError;
#[before_model]
async fn set_temperature(request: &mut ModelRequest) -> Result<(), SynapticError> {
request.temperature = Some(0.7);
Ok(())
}
let mw = set_temperature(); // Arc<dyn AgentMiddleware>
#[after_model]
Runs after each model call. Use this to inspect or mutate the response.
Signature: async fn(request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError>
use synaptic::macros::after_model;
use synaptic::middleware::{ModelRequest, ModelResponse};
use synaptic::core::SynapticError;
#[after_model]
async fn log_usage(request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError> {
if let Some(usage) = &response.usage {
println!("Tokens used: {}", usage.total_tokens);
}
Ok(())
}
let mw = log_usage(); // Arc<dyn AgentMiddleware>
#[after_agent]
Runs after the agent loop finishes. Receives the final message list.
Signature: async fn(messages: &mut Vec<Message>) -> Result<(), SynapticError>
use synaptic::macros::after_agent;
use synaptic::core::{Message, SynapticError};
#[after_agent]
async fn summarize(messages: &mut Vec<Message>) -> Result<(), SynapticError> {
println!("Agent finished with {} messages", messages.len());
Ok(())
}
let mw = summarize(); // Arc<dyn AgentMiddleware>
#[wrap_model_call]
Wraps the model call with custom logic, giving you full control over whether and how the underlying model is invoked. This is the right hook for retries, fallbacks, caching, or circuit-breaker patterns.
Signature: async fn(request: ModelRequest, next: &dyn ModelCaller) -> Result<ModelResponse, SynapticError>
use synaptic::macros::wrap_model_call;
use synaptic::middleware::{ModelRequest, ModelResponse, ModelCaller};
use synaptic::core::SynapticError;
#[wrap_model_call]
async fn retry_once(
request: ModelRequest,
next: &dyn ModelCaller,
) -> Result<ModelResponse, SynapticError> {
match next.call(request.clone()).await {
Ok(response) => Ok(response),
Err(_) => next.call(request).await, // retry once
}
}
let mw = retry_once(); // Arc<dyn AgentMiddleware>
#[wrap_tool_call]
Wraps individual tool calls. Same pattern as #[wrap_model_call] but for tool
invocations. Useful for logging, permission checks, or sandboxing.
Signature: async fn(request: ToolCallRequest, next: &dyn ToolCaller) -> Result<Value, SynapticError>
use synaptic::macros::wrap_tool_call;
use synaptic::middleware::{ToolCallRequest, ToolCaller};
use synaptic::core::SynapticError;
use serde_json::Value;
#[wrap_tool_call]
async fn log_tool(
request: ToolCallRequest,
next: &dyn ToolCaller,
) -> Result<Value, SynapticError> {
println!("Calling tool: {}", request.call.name);
let result = next.call(request).await?;
println!("Tool returned: {}", result);
Ok(result)
}
let mw = log_tool(); // Arc<dyn AgentMiddleware>
#[dynamic_prompt]
Generates a system prompt dynamically based on the current conversation. Unlike
the other middleware macros, the decorated function is synchronous (not
async). It reads the message history and returns a String that is set as the
system prompt before each model call.
Under the hood, the macro generates a middleware whose before_model hook sets
request.system_prompt to the return value of your function.
Signature: fn(messages: &[Message]) -> String
use synaptic::macros::dynamic_prompt;
use synaptic::core::Message;
#[dynamic_prompt]
fn context_aware_prompt(messages: &[Message]) -> String {
if messages.len() > 10 {
"Be concise. The conversation is getting long.".into()
} else {
"Be thorough and detailed in your responses.".into()
}
}
let mw = context_aware_prompt(); // Arc<dyn AgentMiddleware>
Why is #[dynamic_prompt] synchronous?
Unlike the other middleware macros, #[dynamic_prompt] takes a plain fn instead of async fn. This is a deliberate design choice:
- Pure computation -- Dynamic prompt generation typically involves inspecting the message list and building a string. These are pure CPU operations (pattern matching, string formatting) with no I/O involved. Making them async would add unnecessary overhead (Future state machine, poll machinery) for zero benefit.
- Simplicity -- Synchronous functions are easier to write and reason about. No .await, no pinning, no Send/Sync bounds to worry about.
- Internal async wrapping -- The macro generates a before_model hook that calls your sync function inside an async context. The hook itself is async (as required by AgentMiddleware), but your function doesn't need to be.
If you need async operations in your prompt generation (e.g., fetching context from a database or calling an API), use #[before_model] directly and set request.system_prompt yourself:
#[before_model]
async fn async_prompt(request: &mut ModelRequest) -> Result<(), SynapticError> {
    let context = fetch_from_database().await?; // async I/O
    request.system_prompt = Some(format!("Context: {}", context));
    Ok(())
}
Stateful Middleware with #[field]
All middleware macros support #[field] parameters — function parameters that
become struct fields rather than trait method parameters. This lets you build
middleware with configuration state, just like #[tool] tools with #[field].
Field parameters must come before the trait-mandated parameters. The factory function will accept the field values, and the generated struct stores them.
Example: Retry middleware with configurable retries
use std::time::Duration;
use synaptic::macros::wrap_tool_call;
use synaptic::middleware::{ToolCallRequest, ToolCaller};
use synaptic::core::SynapticError;
use serde_json::Value;
#[wrap_tool_call]
async fn tool_retry(
#[field] max_retries: usize,
#[field] base_delay: Duration,
request: ToolCallRequest,
next: &dyn ToolCaller,
) -> Result<Value, SynapticError> {
let mut last_err = None;
for attempt in 0..=max_retries {
match next.call(request.clone()).await {
Ok(val) => return Ok(val),
Err(e) => {
last_err = Some(e);
if attempt < max_retries {
let delay = base_delay * 2u32.saturating_pow(attempt as u32);
tokio::time::sleep(delay).await;
}
}
}
}
Err(last_err.unwrap())
}
// Factory function accepts the field values:
let mw = tool_retry(3, Duration::from_millis(100));
Example: Model fallback with alternative models
use std::sync::Arc;
use synaptic::macros::wrap_model_call;
use synaptic::middleware::{BaseChatModelCaller, ModelRequest, ModelResponse, ModelCaller};
use synaptic::core::{ChatModel, SynapticError};
#[wrap_model_call]
async fn model_fallback(
#[field] fallbacks: Vec<Arc<dyn ChatModel>>,
request: ModelRequest,
next: &dyn ModelCaller,
) -> Result<ModelResponse, SynapticError> {
match next.call(request.clone()).await {
Ok(resp) => Ok(resp),
Err(primary_err) => {
for fallback in &fallbacks {
let caller = BaseChatModelCaller::new(fallback.clone());
if let Ok(resp) = caller.call(request.clone()).await {
return Ok(resp);
}
}
Err(primary_err)
}
}
}
let mw = model_fallback(vec![backup_model]);
Example: Dynamic prompt with branding
use synaptic::macros::dynamic_prompt;
use synaptic::core::Message;
#[dynamic_prompt]
fn branded_prompt(#[field] brand: String, messages: &[Message]) -> String {
format!("[{}] You have {} messages", brand, messages.len())
}
let mw = branded_prompt("Acme Corp".into());
Macro Examples
The following end-to-end scenarios show how the macros work together in realistic applications.
Scenario A: Weather Agent with Custom Tool
This example defines a tool with #[tool] and a #[field] for an API key,
registers it, creates a ReAct agent with create_react_agent, and runs a
query.
use synaptic::core::{ChatModel, Message, SynapticError};
use synaptic::graph::{create_react_agent, MessageState, GraphResult};
use synaptic::macros::tool;
use synaptic::models::ScriptedChatModel;
use std::sync::Arc;
/// Get the current weather for a city.
#[tool]
async fn get_weather(
#[field] api_key: String,
/// City name to look up
city: String,
) -> Result<String, SynapticError> {
// In production, call a real weather API with api_key
Ok(format!("72°F and sunny in {}", city))
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let tool = get_weather("sk-fake-key".into());
let tools: Vec<Arc<dyn synaptic::core::Tool>> = vec![tool];
let model: Arc<dyn ChatModel> = Arc::new(ScriptedChatModel::new(vec![/* ... */]));
let agent = create_react_agent(model, tools).compile()?;
let state = MessageState::from_messages(vec![
Message::human("What's the weather in Tokyo?"),
]);
let result = agent.invoke(state, None).await?;
println!("{:?}", result.into_state().messages);
Ok(())
}
Scenario B: Data Pipeline with Chain Macros
This example composes multiple #[chain] steps into a processing pipeline
that extracts text, normalizes it, and counts words.
use synaptic::core::{RunnableConfig, SynapticError};
use synaptic::macros::chain;
use synaptic::runnables::Runnable;
use serde_json::{json, Value};
#[chain]
async fn extract_text(input: Value) -> Result<Value, SynapticError> {
let text = input["content"].as_str().unwrap_or("");
Ok(json!(text.to_string()))
}
#[chain]
async fn normalize(input: Value) -> Result<Value, SynapticError> {
let text = input.as_str().unwrap_or("").to_lowercase().trim().to_string();
Ok(json!(text))
}
#[chain]
async fn word_count(input: Value) -> Result<Value, SynapticError> {
let text = input.as_str().unwrap_or("");
let count = text.split_whitespace().count();
Ok(json!({"text": text, "word_count": count}))
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
let pipeline = extract_text() | normalize() | word_count();
let config = RunnableConfig::default();
let input = json!({"content": " Hello World from Synaptic! "});
let result = pipeline.invoke(input, &config).await?;
println!("Result: {}", result);
// {"text": "hello world from synaptic!", "word_count": 4}
Ok(())
}
Scenario C: Agent with Middleware Stack
This example combines middleware macros into a real agent with logging, retry, and dynamic prompting.
use synaptic::core::{Message, SynapticError};
use synaptic::macros::{after_model, dynamic_prompt, wrap_model_call};
use synaptic::middleware::{AgentMiddleware, MiddlewareChain, ModelRequest, ModelResponse, ModelCaller};
use std::sync::Arc;
// Log every model call
#[after_model]
async fn log_response(request: &ModelRequest, response: &mut ModelResponse) -> Result<(), SynapticError> {
println!("[LOG] Model responded with {} chars",
response.message.content().len());
Ok(())
}
// Retry failed model calls up to 2 times
#[wrap_model_call]
async fn retry_model(
#[field] max_retries: usize,
request: ModelRequest,
next: &dyn ModelCaller,
) -> Result<ModelResponse, SynapticError> {
let mut last_err = None;
for _ in 0..=max_retries {
match next.call(request.clone()).await {
Ok(resp) => return Ok(resp),
Err(e) => last_err = Some(e),
}
}
Err(last_err.unwrap())
}
// Dynamic system prompt based on conversation length
#[dynamic_prompt]
fn adaptive_prompt(messages: &[Message]) -> String {
if messages.len() > 20 {
"Be concise. Summarize rather than elaborate.".into()
} else {
"You are a helpful assistant. Be thorough.".into()
}
}
fn build_middleware_stack() -> Vec<Arc<dyn AgentMiddleware>> {
vec![
adaptive_prompt(),
retry_model(2),
log_response(),
]
}
Scenario D: Store-Backed Note Manager with Typed Input
This example combines #[inject] for runtime access and schemars for rich
JSON Schema generation. A save_note tool accepts a custom NoteInput struct
whose full schema (title, content, tags) is visible to the LLM, while the
shared store and tool call ID are injected transparently by the agent runtime.
Cargo.toml -- enable the agent, store, and schemars features:
[dependencies]
synaptic = { version = "0.2", features = ["agent", "store", "schemars"] }
schemars = { version = "0.8", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Full example:
use std::sync::Arc;
use schemars::JsonSchema;
use serde::Deserialize;
use serde_json::json;
use synaptic::core::{Store, SynapticError};
use synaptic::macros::tool;
// --- Custom input type with schemars ---
// Deriving JsonSchema gives the LLM a complete description of every field,
// including the nested Vec<String> for tags.
#[derive(Deserialize, JsonSchema)]
struct NoteInput {
/// Title of the note
title: String,
/// Body content of the note (Markdown supported)
content: String,
/// Tags for categorisation (e.g. ["work", "urgent"])
tags: Vec<String>,
}
// --- What the LLM sees (with schemars enabled) ---
//
// The generated JSON Schema for the `note` parameter looks like:
//
// {
// "type": "object",
// "properties": {
// "title": { "type": "string", "description": "Title of the note" },
// "content": { "type": "string", "description": "Body content of the note (Markdown supported)" },
// "tags": { "type": "array", "items": { "type": "string" },
// "description": "Tags for categorisation (e.g. [\"work\", \"urgent\"])" }
// },
// "required": ["title", "content", "tags"]
// }
//
// --- Without schemars, the same parameter would produce only: ---
//
// { "type": "object" }
//
// ...giving the LLM no guidance about the expected fields.
/// Save a note to the shared store.
#[tool]
async fn save_note(
/// The note to save (title, content, and tags)
note: NoteInput,
/// Injected: persistent key-value store
#[inject(store)]
store: Arc<dyn Store>,
/// Injected: the current tool call ID for traceability
#[inject(tool_call_id)]
call_id: String,
) -> Result<String, SynapticError> {
// Build a unique key from the tool call ID
let key = format!("note:{}", call_id);
// Persist the note as a JSON item in the store
let value = json!({
"title": note.title,
"content": note.content,
"tags": note.tags,
"call_id": call_id,
});
store.put("notes", &key, value.clone()).await?;
Ok(format!(
"Saved note '{}' with {} tag(s) [key={}]",
note.title,
note.tags.len(),
key,
))
}
// Usage:
// let tool = save_note(); // Arc<dyn RuntimeAwareTool>
// assert_eq!(tool.name(), "save_note");
//
// The LLM sees only the `note` parameter in the schema.
// `store` and `call_id` are injected by ToolNode at runtime.
Key takeaways:
- NoteInput derives both Deserialize (for runtime deserialization) and JsonSchema (for compile-time schema generation). The schemars feature must be enabled in Cargo.toml for the #[tool] macro to pick up the derived schema.
- #[inject(store)] gives the tool direct access to the shared Store without exposing it to the LLM. The ToolNode populates the store from ToolRuntime before each call.
- #[inject(tool_call_id)] provides a unique identifier for the current invocation, useful for creating deterministic storage keys or audit trails.
- Because #[inject] is present, the macro generates a RuntimeAwareTool (not a plain Tool). The factory function returns Arc<dyn RuntimeAwareTool>.
Scenario E: Workflow with Entrypoint, Tasks, and Tracing
This scenario demonstrates #[entrypoint], #[task], and #[traceable]
working together to build an instrumented data pipeline.
use synaptic::core::SynapticError;
use synaptic::macros::{entrypoint, task, traceable};
use serde_json::{json, Value};
// A helper that calls an external API. The #[traceable] macro wraps it
// in a tracing span. We skip the api_key so it never appears in logs.
#[traceable(name = "external_api_call", skip = "api_key")]
async fn call_external_api(
url: String,
api_key: String,
) -> Result<Value, SynapticError> {
// In production: reqwest::get(...).await
Ok(json!({"status": "ok", "data": [1, 2, 3]}))
}
// Each #[task] gets a stable name used by streaming and tracing.
#[task(name = "fetch")]
async fn fetch_data(source: String) -> Result<Value, SynapticError> {
let api_key = std::env::var("API_KEY").unwrap_or_default();
let result = call_external_api(source, api_key).await?;
Ok(result)
}
#[task(name = "transform")]
async fn transform_data(raw: Value) -> Result<Value, SynapticError> {
let items = raw["data"].as_array().cloned().unwrap_or_default();
let doubled: Vec<Value> = items
.iter()
.filter_map(|v| v.as_i64())
.map(|n| json!(n * 2))
.collect();
Ok(json!({"transformed": doubled}))
}
// The entrypoint ties the workflow together with a name and checkpointer.
#[entrypoint(name = "data_pipeline", checkpointer = "memory")]
async fn run_pipeline(input: Value) -> Result<Value, SynapticError> {
let source = input["source"].as_str().unwrap_or("default").to_string();
let raw = fetch_data(source).await?;
let result = transform_data(raw).await?;
Ok(result)
}
#[tokio::main]
async fn main() -> Result<(), SynapticError> {
// Set up tracing to see the spans emitted by #[traceable] and #[task]:
// tracing_subscriber::fmt()
// .with_max_level(tracing::Level::INFO)
// .init();
let ep = run_pipeline();
let output = (ep.run)(json!({"source": "https://api.example.com/data"})).await?;
println!("Pipeline output: {}", output);
Ok(())
}
Key takeaways:
- #[task] gives each step a stable name ("fetch", "transform") that appears in streaming events and tracing spans, making it easy to identify which step is running or failed.
- #[traceable] instruments any function with an automatic tracing span. Use skip = "api_key" to keep secrets out of your traces.
- #[entrypoint] ties the workflow together with a logical name and an optional checkpointer hint for state persistence.
- These macros are composable -- use them in any combination. A #[task] can call a #[traceable] helper, and an #[entrypoint] can orchestrate any number of #[task] functions.
Scenario F: Tool Permission Gating with Audit Logging
This scenario demonstrates #[wrap_tool_call] with an allowlist field for
permission gating, plus #[before_agent] and #[after_agent] for lifecycle
audit logging.
use std::sync::Arc;
use synaptic::core::{Message, SynapticError};
use synaptic::macros::{before_agent, after_agent, wrap_tool_call};
use synaptic::middleware::{AgentMiddleware, ToolCallRequest, ToolCaller};
use serde_json::Value;
// --- Permission gating ---
// Only allow tools whose names appear in the allowlist.
// If the LLM tries to call a tool not in the list, return an error.
#[wrap_tool_call]
async fn permission_gate(
#[field] allowed_tools: Vec<String>,
request: ToolCallRequest,
next: &dyn ToolCaller,
) -> Result<Value, SynapticError> {
if !allowed_tools.contains(&request.call.name) {
return Err(SynapticError::Tool(format!(
"Tool '{}' is not in the allowed list: {:?}",
request.call.name, allowed_tools,
)));
}
next.call(request).await
}
// --- Audit: before agent ---
// Log the number of messages when the agent starts.
#[before_agent]
async fn audit_start(
#[field] label: String,
messages: &mut Vec<Message>,
) -> Result<(), SynapticError> {
println!("[{}] Agent starting with {} messages", label, messages.len());
Ok(())
}
// --- Audit: after agent ---
// Log the number of messages when the agent finishes.
#[after_agent]
async fn audit_end(
#[field] label: String,
messages: &mut Vec<Message>,
) -> Result<(), SynapticError> {
println!("[{}] Agent completed with {} messages", label, messages.len());
Ok(())
}
// --- Assemble the middleware stack ---
fn build_secured_stack() -> Vec<Arc<dyn AgentMiddleware>> {
let allowed = vec![
"web_search".to_string(),
"get_weather".to_string(),
];
vec![
audit_start("prod-agent".into()),
permission_gate(allowed),
audit_end("prod-agent".into()),
]
}
Key takeaways:
- #[wrap_tool_call] gives full control over tool execution. Check permissions, transform arguments, or deny the call entirely by returning an error instead of calling next.call().
- #[before_agent] and #[after_agent] bracket the entire agent lifecycle, making them ideal for audit logging, metrics collection, or resource setup/teardown.
- #[field] makes each middleware configurable and reusable. The permission_gate can be instantiated with different allowlists for different agents, and the audit middleware accepts a label for log disambiguation.
Scenario G: State-Aware Tool with Raw Arguments
This scenario demonstrates #[inject(state)] for reading graph state and
#[args] for accepting raw JSON payloads, plus a combination of both
patterns with #[field].
use serde::Deserialize;
use serde_json::Value;
use synaptic::core::SynapticError;
use synaptic::macros::tool;
// --- State-aware tool ---
// Reads the graph state to adjust its behavior. After 10 conversation
// turns the tool switches to shorter replies.
#[derive(Deserialize)]
struct ConversationState {
turn_count: usize,
}
/// Generate a context-aware reply.
#[tool]
async fn smart_reply(
/// The user's latest message
message: String,
#[inject(state)]
state: ConversationState,
) -> Result<String, SynapticError> {
if state.turn_count > 10 {
// After 10 turns, keep it short
Ok(format!("TL;DR: {}", &message[..message.len().min(50)]))
} else {
Ok(format!(
"Turn {}: Let me elaborate on '{}'...",
state.turn_count, message
))
}
}
// --- Raw-args JSON proxy ---
// Accepts any JSON payload and forwards it to a webhook endpoint.
// No schema is generated -- the LLM sends whatever JSON it wants.
/// Forward a JSON payload to an external webhook.
#[tool(name = "webhook_forward")]
async fn webhook_forward(#[args] payload: Value) -> Result<String, SynapticError> {
// In production: reqwest::Client::new().post(url).json(&payload).send().await
Ok(format!("Forwarded payload with {} keys", payload.as_object().map_or(0, |m| m.len())))
}
// --- Configurable API proxy ---
// Combines #[field] for a base endpoint with #[args] for the request body.
// Each instance points at a different API.
/// Proxy arbitrary JSON to a configured API endpoint.
#[tool(name = "api_proxy")]
async fn api_proxy(
#[field] endpoint: String,
#[args] body: Value,
) -> Result<String, SynapticError> {
// In production: reqwest::Client::new().post(&endpoint).json(&body).send().await
Ok(format!(
"POST {} with {} bytes",
endpoint,
body.to_string().len()
))
}
fn main() {
// State-aware tool -- the LLM only sees "message" in the schema
let reply_tool = smart_reply();
// Raw-args tool -- parameters() returns None
let webhook_tool = webhook_forward();
// Configurable proxy -- each instance targets a different endpoint
let users_api = api_proxy("https://api.example.com/users".into());
let orders_api = api_proxy("https://api.example.com/orders".into());
}
Key takeaways:
- #[inject(state)] gives tools read access to the current graph state without exposing it to the LLM. The state is deserialized from ToolRuntime::state into your custom struct automatically.
- #[args] bypasses schema generation entirely -- the tool accepts whatever JSON the LLM sends. Use this for proxy/forwarding patterns or tools that handle arbitrary payloads. parameters() returns None when #[args] is the only non-field, non-inject parameter.
- #[field] + #[args] combine naturally. The field is provided at construction time (hidden from the LLM), while the raw JSON arrives at call time. This makes it easy to create reusable tool templates that differ only in configuration.
Comparison with Python LangChain
If you are coming from Python LangChain / LangGraph, here is how the Synaptic macros map to their Python equivalents:
| Python | Rust (Synaptic) | Notes |
|---|---|---|
| @tool | #[tool] | Both generate a tool from a function; Rust version produces a struct + trait impl |
| RunnableLambda(fn) | #[chain] | Rust version returns BoxRunnable<I, O> with auto-detected output type |
| @entrypoint | #[entrypoint] | Both define a workflow entry point; Rust adds checkpointer hint |
| @task | #[task] | Both mark a function as a named sub-task |
| Middleware classes | #[before_agent], #[before_model], #[after_model], #[after_agent], #[wrap_model_call], #[wrap_tool_call], #[dynamic_prompt] | Rust splits each hook into its own macro for clarity |
| @traceable | #[traceable] | Rust uses tracing crate spans; Python uses LangSmith |
| InjectedState, InjectedStore, InjectedToolCallId | #[inject(state)], #[inject(store)], #[inject(tool_call_id)] | Rust uses parameter-level attributes instead of type annotations |
How Tool Definitions Reach the LLM
Understanding the full journey from a Rust function to an LLM tool call helps debug schema issues and customize behavior. Here is the complete chain:
#[tool] macro
|
v
struct + impl Tool (generated at compile time)
|
v
tool.as_tool_definition() -> ToolDefinition { name, description, parameters }
|
v
ChatRequest::with_tools(vec![...]) (tool definitions attached to request)
|
v
Model Adapter (OpenAI / Anthropic / Gemini)
| Converts ToolDefinition -> provider-specific JSON
| e.g. OpenAI: {"type": "function", "function": {"name": ..., "parameters": ...}}
v
HTTP POST -> LLM API
|
v
LLM returns ToolCall { id, name, arguments }
|
v
ToolNode dispatches -> tool.call(arguments)
|
v
Tool Message back into conversation
Key files in the codebase:
| Step | File |
|---|---|
| #[tool] macro expansion | crates/synaptic-macros/src/tool.rs |
| Tool / RuntimeAwareTool traits | crates/synaptic-core/src/lib.rs |
| ToolDefinition, ToolCall types | crates/synaptic-core/src/lib.rs |
| ToolNode (dispatches calls) | crates/synaptic-graph/src/tool_node.rs |
| OpenAI adapter | crates/synaptic-openai/src/lib.rs |
| Anthropic adapter | crates/synaptic-anthropic/src/lib.rs |
| Gemini adapter | crates/synaptic-gemini/src/lib.rs |
Testing Macro-Generated Code
Tools generated by #[tool] can be tested like any other Tool implementation. Call as_tool_definition() to inspect the schema and call() to verify behavior:
use serde_json::json;
use synaptic::core::{SynapticError, Tool};
use synaptic::macros::tool;
/// Add two numbers.
#[tool]
async fn add(
/// The first number
a: f64,
/// The second number
b: f64,
) -> Result<serde_json::Value, SynapticError> {
Ok(json!({"result": a + b}))
}
#[tokio::test]
async fn test_add_tool() {
let tool = add();
// Verify metadata
assert_eq!(tool.name(), "add");
assert_eq!(tool.description(), "Add two numbers.");
// Verify schema
let def = tool.as_tool_definition();
let required = def.parameters["required"].as_array().unwrap();
assert!(required.contains(&json!("a")));
assert!(required.contains(&json!("b")));
// Verify execution
let result = tool.call(json!({"a": 3.0, "b": 4.0})).await.unwrap();
assert_eq!(result["result"], 7.0);
}
For #[chain] macros, test the returned BoxRunnable with invoke():
use synaptic::core::{RunnableConfig, SynapticError};
use synaptic::macros::chain;
use synaptic::runnables::Runnable;
#[chain]
async fn to_upper(s: String) -> Result<String, SynapticError> {
Ok(s.to_uppercase())
}
#[tokio::test]
async fn test_chain() {
let runnable = to_upper();
let config = RunnableConfig::default();
let result = runnable.invoke("hello".into(), &config).await.unwrap();
assert_eq!(result, "HELLO");
}
What can go wrong
- Custom types without schemars: The parameter schema is {"type": "object"} with no field details. The LLM guesses (often incorrectly) what to send. Fix: Enable the schemars feature and derive JsonSchema.
- Missing as_tool_definition() call: If you construct ToolDefinition manually with json!({}) for parameters instead of calling tool.as_tool_definition(), the schema will be empty. Fix: Always use as_tool_definition() on your Tool / RuntimeAwareTool.
- OpenAI strict mode: OpenAI's function calling strict mode rejects schemas with missing type fields. All built-in types and Value now generate valid schemas with "type" specified.
Architecture
Synaptic is organized as a workspace of focused Rust crates. Each crate owns exactly one concern, and they compose together through shared traits defined in a single core crate. This page explains the layered design, the principles behind it, and how the crates depend on each other.
Design Principles
Async-first. Every trait in Synaptic is async via #[async_trait], and the runtime is tokio. This is not an afterthought bolted onto a synchronous API -- async is the foundation. LLM calls, tool execution, memory access, and embedding queries are all naturally asynchronous operations, and Synaptic models them as such from the start.
One crate, one concern. Each provider has its own crate: synaptic-openai, synaptic-anthropic, synaptic-gemini, synaptic-ollama. The synaptic-tools crate knows how to register and execute tools. The synaptic-memory crate knows how to store and retrieve conversation history. No crate does two jobs. This keeps compile times manageable, makes it possible to use only what you need, and ensures that changes to one subsystem do not cascade across the codebase.
Shared traits in core. The synaptic-core crate defines every trait and type that crosses crate boundaries: ChatModel, Tool, MemoryStore, CallbackHandler, Message, ChatRequest, ChatResponse, ToolCall, SynapticError, RunnableConfig, and more. Implementation crates depend on core, never on each other (unless composition requires it).
Concurrency-safe by default. Shared registries use Arc<RwLock<_>> (standard library RwLock for low-contention read-heavy data like tool registries). Mutable state that requires async access -- callbacks, memory stores, checkpointers -- uses Arc<tokio::sync::Mutex<_>> or Arc<tokio::sync::RwLock<_>>. All core traits require Send + Sync.
Session isolation. Memory, agent runs, and graph checkpoints are keyed by a session or thread identifier. Two concurrent conversations never interfere with each other, even when they share the same model and tool instances.
Event-driven observability. The RunEvent enum captures every significant lifecycle event (run started, LLM called, tool called, run finished, run failed). Callback handlers receive these events asynchronously, enabling logging, tracing, recording, and custom side effects without modifying application code.
The Four Layers
Synaptic's crates fall into four layers, each building on the ones below it.
Layer 1: Core
synaptic-core is the foundation. It defines:
- Traits: ChatModel, Tool, MemoryStore, CallbackHandler
- Message types: The Message enum (System, Human, AI, Tool, Chat, Remove), AIMessageChunk for streaming, ToolCall, ToolDefinition, ToolChoice
- Request/response: ChatRequest, ChatResponse, TokenUsage
- Streaming: The ChatStream type alias (Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send>>)
- Configuration: RunnableConfig (tags, metadata, concurrency limits, run IDs)
- Events: RunEvent enum with six lifecycle variants
- Errors: SynapticError enum with 19 variants spanning all subsystems
Every other crate in the workspace depends on synaptic-core, and synaptic-core itself depends on no other workspace crate. It is the single shared foundation through which the rest of the workspace interoperates.
Layer 2: Implementation Crates
Each crate implements one core concern:
| Crate | Purpose |
|---|---|
synaptic-models | ProviderBackend abstraction, test doubles (ScriptedChatModel), wrappers (RetryChatModel, RateLimitedChatModel, StructuredOutputChatModel<T>, BoundToolsChatModel) |
synaptic-openai | OpenAiChatModel + OpenAiEmbeddings |
synaptic-anthropic | AnthropicChatModel |
synaptic-gemini | GeminiChatModel |
synaptic-ollama | OllamaChatModel + OllamaEmbeddings |
synaptic-tools | ToolRegistry, SerialToolExecutor, ParallelToolExecutor, HandleErrorTool, ReturnDirectTool |
synaptic-memory | InMemoryStore and strategy types: Buffer, Window, Summary, TokenBuffer, SummaryBuffer, RunnableWithMessageHistory, FileChatMessageHistory |
synaptic-callbacks | RecordingCallback, TracingCallback, CompositeCallback |
synaptic-prompts | PromptTemplate, ChatPromptTemplate, FewShotChatMessagePromptTemplate, ExampleSelector |
synaptic-parsers | Output parsers: StrOutputParser, JsonOutputParser, StructuredOutputParser<T>, ListOutputParser, EnumOutputParser, BooleanOutputParser, MarkdownListOutputParser, NumberedListOutputParser, XmlOutputParser, RetryOutputParser, FixingOutputParser |
synaptic-cache | InMemoryCache, SemanticCache, CachedChatModel |
synaptic-eval | Evaluators (ExactMatch, JsonValidity, RegexMatch, EmbeddingDistance, LLMJudge), Dataset, batch evaluation pipeline |
Layer 3: Composition and Retrieval
These crates combine the implementation crates into higher-level abstractions:
| Crate | Purpose |
|---|---|
synaptic-runnables | The LCEL system: Runnable trait, BoxRunnable with pipe operator, RunnableSequence, RunnableParallel, RunnableBranch, RunnableWithFallbacks, RunnableAssign, RunnablePick, RunnableEach, RunnableRetry, RunnableGenerator |
synaptic-graph | LangGraph-style state machines: StateGraph builder, CompiledGraph, Node trait, ToolNode, create_react_agent(), checkpointing, streaming, visualization |
synaptic-loaders | Document loaders: TextLoader, JsonLoader, CsvLoader, DirectoryLoader, FileLoader, MarkdownLoader, WebLoader |
synaptic-splitters | Text splitters: CharacterTextSplitter, RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter, HtmlHeaderTextSplitter, LanguageTextSplitter, TokenTextSplitter |
synaptic-embeddings | Embeddings trait, FakeEmbeddings, CacheBackedEmbeddings |
synaptic-vectorstores | VectorStore trait, InMemoryVectorStore, MultiVectorRetriever |
synaptic-retrieval | Retriever trait and seven implementations: InMemory, BM25, MultiQuery, Ensemble, ContextualCompression, SelfQuery, ParentDocument |
synaptic-qdrant | QdrantVectorStore (Qdrant integration) |
synaptic-pgvector | PgVectorStore (PostgreSQL pgvector integration) |
synaptic-redis | RedisStore + RedisCache (Redis integration) |
synaptic-pdf | PdfLoader (PDF document loading) |
Layer 4: Facade
The synaptic crate re-exports everything from all sub-crates under a unified namespace. Application code can use a single dependency:
[dependencies]
synaptic = "0.2"
And then import from organized modules:
use synaptic::core::{Message, ChatRequest};
use synaptic::openai::OpenAiChatModel; // requires "openai" feature
use synaptic::anthropic::AnthropicChatModel; // requires "anthropic" feature
use synaptic::graph::{create_react_agent, MessageState};
use synaptic::runnables::{BoxRunnable, Runnable};
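The provider modules in the snippet above are gated behind feature flags. A minimal sketch of enabling the two mentioned there (feature names taken from the comments above; see Installation for the complete list):
[dependencies]
synaptic = { version = "0.2", features = ["openai", "anthropic"] }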
Crate Dependency Diagram
synaptic (facade)
|
+--------------------+--------------------+
| | |
synaptic-graph synaptic-runnables synaptic-eval
| | |
synaptic-tools synaptic-core synaptic-embeddings
| ^ |
synaptic-core | synaptic-core
|
+--------+-----------+-----------+--------+--------+
| | | | | |
synap- synap- synap- synap- synap- Provider
tic- tic- tic- tic- tic- crates:
models memory callbacks prompts parsers openai,
| | | | | anthropic,
+--------+-----------+-----------+--------+ gemini,
| ollama
synaptic-core
Retrieval pipeline (all depend on synaptic-core):
synaptic-loaders --> synaptic-splitters --> synaptic-embeddings
|
synaptic-vectorstores
|
synaptic-retrieval
Integration crates (each depends on synaptic-core):
synaptic-qdrant, synaptic-pgvector, synaptic-redis, synaptic-pdf
The arrows point downward toward dependencies. Every crate ultimately depends on synaptic-core. The composition crates (synaptic-graph, synaptic-runnables) additionally depend on the implementation crates they orchestrate.
Provider Abstraction
Each LLM provider lives in its own crate (synaptic-openai, synaptic-anthropic, synaptic-gemini, synaptic-ollama). They all use the ProviderBackend trait from synaptic-models to separate HTTP concerns from protocol mapping. HttpBackend makes real HTTP requests; FakeBackend returns scripted responses for testing. This means you can test any code that uses ChatModel without network access and without mocking at the HTTP level. You only compile the providers you actually use.
The Runnable Abstraction
The Runnable<I, O> trait in synaptic-runnables is the universal composition primitive. Prompt templates, output parsers, chat models, and entire graphs can all be treated as runnables. They compose via the | pipe operator into chains that can be invoked, batched, or streamed. See Runnables & LCEL for details.
The Graph Abstraction
The StateGraph builder in synaptic-graph provides a higher-level orchestration model for complex workflows. Where LCEL chains are linear pipelines (with branching), graphs support cycles, conditional routing, checkpointing, human-in-the-loop interrupts, and dynamic control flow via GraphCommand. See Graph for details.
See Also
- Installation -- feature flags for enabling specific crates
- Runnables & LCEL -- the composition primitive
- Graph -- state-machine orchestration
- Middleware -- cross-cutting agent concerns
- Key-Value Store -- persistent namespaced storage
Messages
Messages are the fundamental unit of communication in Synaptic. Every interaction with an LLM -- whether a simple question, a multi-turn conversation, a tool call, or a streaming response -- is expressed as a sequence of messages. This page explains the message system's design, its variants, and the utilities that operate on message sequences.
Message as a Tagged Enum
Message is a Rust enum with six variants, serialized with #[serde(tag = "role")]:
| Variant | Role String | Purpose |
|---|---|---|
System | "system" | Instructions to the model about behavior and constraints |
Human | "human" | User input |
AI | "assistant" | Model responses, optionally carrying tool calls |
Tool | "tool" | Results from tool execution, linked by tool_call_id |
Chat | custom | Messages with a user-defined role for special protocols |
Remove | "remove" | A signal to remove a message by ID from history |
This is a tagged enum, not a trait hierarchy. Pattern matching is exhaustive, serialization is automatic, and the compiler enforces that every code path handles every variant.
Why an Enum?
An enum makes it impossible to construct an invalid message. An AI message always has a tool_calls field (even if empty). A Tool message always has a tool_call_id. A System message never has tool calls. These invariants are enforced by the type system rather than by runtime checks.
Creating Messages
Synaptic provides factory methods rather than exposing struct literals. This keeps the API stable even as internal fields are added:
use synaptic::core::Message;
// Basic messages
let sys = Message::system("You are a helpful assistant.");
let user = Message::human("What is the weather?");
let reply = Message::ai("The weather is sunny today.");
// AI message with tool calls
let with_tools = Message::ai_with_tool_calls("Let me check.", vec![tool_call]);
// Tool result linked to a specific call
let result = Message::tool("72 degrees", "call_abc123");
// Custom role
let custom = Message::chat("moderator", "This message is approved.");
// Removal signal
let remove = Message::remove("msg_id_to_remove");
Builder Methods
Factory methods create messages with default (empty) optional fields. Builder methods let you set them:
let msg = Message::human("Hello")
.with_id("msg_001")
.with_name("Alice")
.with_content_blocks(vec![
ContentBlock::Text { text: "Hello".into() },
ContentBlock::Image { url: "https://example.com/photo.jpg".into(), detail: None },
]);
Available builders: with_id(), with_name(), with_additional_kwarg(), with_response_metadata_entry(), with_content_blocks(), with_usage_metadata() (AI only).
Accessing Message Fields
Accessor methods work uniformly across variants:
let msg = Message::ai("Hello world");
msg.content() // "Hello world"
msg.role() // "assistant"
msg.is_ai() // true
msg.is_human() // false
msg.tool_calls() // &[] (empty slice for non-AI messages)
msg.tool_call_id() // None (only Some for Tool messages)
msg.id() // None (unless set with .with_id())
msg.name() // None (unless set with .with_name())
Type-check methods: is_system(), is_human(), is_ai(), is_tool(), is_chat(), is_remove().
The Remove variant is special: it carries only an id field. Calling content() on it returns "", and name() returns None. The remove_id() method returns Some(&str) only for Remove messages.
Common Fields
Every message variant (except Remove) carries these fields:
- content: String -- the text content
- id: Option<String> -- optional unique identifier
- name: Option<String> -- optional sender name
- additional_kwargs: HashMap<String, Value> -- extensible key-value metadata
- response_metadata: HashMap<String, Value> -- provider-specific response metadata
- content_blocks: Vec<ContentBlock> -- multimodal content (text, images, audio, video, files, data, reasoning)
The AI variant additionally carries:
- tool_calls: Vec<ToolCall> -- structured tool invocations
- invalid_tool_calls: Vec<InvalidToolCall> -- tool calls that failed to parse
- usage_metadata: Option<TokenUsage> -- token usage from the provider
The Tool variant additionally carries:
- tool_call_id: String -- links back to the ToolCall that produced this result
Streaming with AIMessageChunk
When streaming responses from an LLM, content arrives in chunks. The AIMessageChunk struct represents a single chunk:
pub struct AIMessageChunk {
pub content: String,
pub tool_calls: Vec<ToolCall>,
pub usage: Option<TokenUsage>,
pub id: Option<String>,
pub tool_call_chunks: Vec<ToolCallChunk>,
pub invalid_tool_calls: Vec<InvalidToolCall>,
}
Chunks support the + and += operators to merge them incrementally:
let mut accumulated = AIMessageChunk::default();
accumulated += chunk1; // content is concatenated
accumulated += chunk2; // tool_calls are extended
accumulated += chunk3; // usage is summed
// Convert the accumulated chunk to a Message
let message = accumulated.into_message();
The merge semantics are:
- content is concatenated via push_str
- tool_calls, tool_call_chunks, and invalid_tool_calls are extended
- id takes the first non-None value
- usage is summed field-by-field (input_tokens, output_tokens, total_tokens)
Multimodal Content
The ContentBlock enum supports rich content types beyond plain text:
| Variant | Fields | Purpose |
|---|---|---|
Text | text | Plain text |
Image | url, detail | Image reference with optional detail level |
Audio | url | Audio reference |
Video | url | Video reference |
File | url, mime_type | Generic file reference |
Data | data: Value | Arbitrary structured data |
Reasoning | content | Model reasoning/chain-of-thought |
Content blocks are carried alongside the content string field, allowing messages to contain both a text summary and structured multimodal data.
Message Utility Functions
Synaptic provides four utility functions for working with message sequences:
filter_messages
Filter messages by role, name, or ID with include/exclude lists:
use synaptic::core::filter_messages;
let humans_only = filter_messages(
&messages,
Some(&["human"]), // include_types
None, // exclude_types
None, None, // include/exclude names
None, None, // include/exclude ids
);
trim_messages
Trim a message sequence to fit within a token budget:
use synaptic::core::{trim_messages, TrimStrategy};
let trimmed = trim_messages(
messages,
4096, // max tokens
|msg| msg.content().len() / 4, // token counter function
TrimStrategy::Last, // keep most recent
true, // always preserve system message
);
TrimStrategy::First keeps messages from the beginning. TrimStrategy::Last keeps messages from the end, optionally preserving the leading system message.
merge_message_runs
Merge consecutive messages of the same role into a single message:
use synaptic::core::merge_message_runs;
let merged = merge_message_runs(vec![
Message::human("Hello"),
Message::human("How are you?"),
Message::ai("I'm fine"),
]);
// Result: [Human("Hello\nHow are you?"), AI("I'm fine")]
For AI messages, tool calls and invalid tool calls are also merged.
get_buffer_string
Convert a message sequence to a human-readable string:
use synaptic::core::get_buffer_string;
let text = get_buffer_string(&messages, "Human", "AI");
// "System: You are helpful.\nHuman: Hello\nAI: Hi there!"
Serialization
Messages serialize as JSON with a role discriminator field:
{
"role": "assistant",
"content": "Hello!",
"tool_calls": [],
"id": null,
"name": null
}
The AI variant serializes its role as "assistant" (matching the OpenAI convention), and role() returns the same string at runtime. Empty collections and None optionals are omitted from serialization via skip_serializing_if attributes.
This serialization format is compatible with LangChain's message schema, making it straightforward to exchange message histories between Synaptic and Python-based systems.
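A minimal round-trip sketch, assuming Message derives both Serialize and Deserialize (which the tagged-enum serialization described above implies):
use synaptic::core::Message;

fn main() -> Result<(), serde_json::Error> {
    let msg = Message::ai("Hello!");

    // Serialize: the enum variant becomes the "role" discriminator.
    let json = serde_json::to_string(&msg)?;
    assert!(json.contains(r#""role":"assistant""#));

    // Deserialize back into the same variant.
    let parsed: Message = serde_json::from_str(&json)?;
    assert!(parsed.is_ai());
    assert_eq!(parsed.content(), "Hello!");
    Ok(())
}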
See Also
- Message Types -- detailed examples for each message variant
- Filter & Trim -- filtering and trimming message sequences
- Merge Runs -- merging consecutive same-role messages
- Memory -- how messages are stored and managed across sessions
Runnables & LCEL
The LangChain Expression Language (LCEL) is a composition system for building data processing pipelines. In Synaptic, this is implemented through the Runnable trait and a set of combinators that let you pipe, branch, parallelize, retry, and stream operations. This page explains the design and the key types.
The Runnable Trait
At the heart of LCEL is a single trait:
#[async_trait]
pub trait Runnable<I, O>: Send + Sync
where
I: Send + 'static,
O: Send + 'static,
{
async fn invoke(&self, input: I, config: &RunnableConfig) -> Result<O, SynapticError>;
async fn batch(&self, inputs: Vec<I>, config: &RunnableConfig) -> Vec<Result<O, SynapticError>>;
fn stream<'a>(&'a self, input: I, config: &'a RunnableConfig) -> RunnableOutputStream<'a, O>;
fn boxed(self) -> BoxRunnable<I, O>;
}
Only invoke() is required. Default implementations are provided for:
- batch() -- runs invoke() sequentially for each input
- stream() -- wraps invoke() as a single-item stream
- boxed() -- wraps self into a type-erased BoxRunnable
The RunnableConfig parameter threads runtime configuration (tags, metadata, concurrency limits, run IDs) through the entire pipeline without changing the input/output types.
BoxRunnable and the Pipe Operator
Rust's type system requires concrete types for composition, but LCEL chains can contain heterogeneous steps. BoxRunnable<I, O> is a type-erased wrapper that erases the concrete type while preserving the Runnable interface.
The pipe operator (|) connects two boxed runnables into a RunnableSequence:
use synaptic::runnables::{BoxRunnable, Runnable, RunnableLambda};
let step1 = RunnableLambda::new(|x: String| async move {
Ok(x.to_uppercase())
}).boxed();
let step2 = RunnableLambda::new(|x: String| async move {
Ok(format!("Result: {x}"))
}).boxed();
let chain = step1 | step2;
let output = chain.invoke("hello".into(), &config).await?;
// output: "Result: HELLO"
This is Rust's BitOr trait overloaded on BoxRunnable. The intermediate type between steps must match -- the output of step1 must be the input type of step2.
Key Runnable Types
RunnablePassthrough
Passes input through unchanged. Useful as a branch in RunnableParallel or as a placeholder in a chain:
let passthrough = RunnablePassthrough::new().boxed();
// invoke("hello") => Ok("hello")
RunnableLambda
Wraps an async closure into a Runnable. This is the most common way to insert custom logic into a chain:
let transform = RunnableLambda::new(|input: String| async move {
Ok(input.split_whitespace().count())
}).boxed();
Tip: For named, reusable functions you can use the #[chain] macro instead of RunnableLambda::new. It generates a factory function that returns a BoxRunnable directly. See Procedural Macros.
RunnableSequence
Created by the | operator. Executes steps in order, feeding each output as the next step's input. You rarely construct this directly.
RunnableParallel
Runs named branches concurrently and merges their outputs into a serde_json::Value object:
let parallel = RunnableParallel::new()
.add("upper", RunnableLambda::new(|s: String| async move {
Ok(Value::String(s.to_uppercase()))
}).boxed())
.add("length", RunnableLambda::new(|s: String| async move {
Ok(Value::Number(s.len().into()))
}).boxed());
let result = parallel.invoke("hello".into(), &config).await?;
// result: {"upper": "HELLO", "length": 5}
All branches receive a clone of the same input and run concurrently via tokio::join!. The output is a JSON object keyed by the branch names.
RunnableBranch
Routes input to one of several branches based on conditions, with a default fallthrough:
let branch = RunnableBranch::new(
vec![
(
|input: &String| input.starts_with("math:"),
math_chain.boxed(),
),
(
|input: &String| input.starts_with("code:"),
code_chain.boxed(),
),
],
default_chain.boxed(), // fallback
);
Conditions are checked in order. The first matching condition's branch is invoked. If none match, the default branch handles it.
RunnableWithFallbacks
Tries alternatives when the primary runnable fails:
let robust = RunnableWithFallbacks::new(
primary_model.boxed(),
vec![fallback_model.boxed()],
);
If primary_model returns an error, fallback_model is tried with the same input. This is useful for model failover (e.g., try GPT-4, fall back to GPT-3.5).
RunnableAssign
Runs a parallel branch and merges its output into the existing JSON value. The input must be a serde_json::Value object, and the parallel branch's outputs are merged as additional keys:
let assign = RunnableAssign::new(
RunnableParallel::new()
.add("word_count", count_words_runnable)
);
// Input: {"text": "hello world"}
// Output: {"text": "hello world", "word_count": 2}
RunnablePick
Extracts specific keys from a JSON value:
let pick = RunnablePick::new(vec!["name".into(), "age".into()]);
// Input: {"name": "Alice", "age": 30, "email": "..."}
// Output: {"name": "Alice", "age": 30}
Single-key picks return the value directly rather than wrapping it in an object.
RunnableEach
Maps a runnable over each element of a collection:
let each = RunnableEach::new(transform_single_item.boxed());
// Input: vec!["a", "b", "c"]
// Output: vec![transformed_a, transformed_b, transformed_c]
RunnableRetry
Retries a runnable on failure with configurable policy:
let retry = RunnableRetry::new(
flaky_runnable.boxed(),
RetryPolicy {
max_retries: 3,
delay: Duration::from_millis(100),
backoff_factor: 2.0,
},
);
RunnableGenerator
Produces values from a stream, useful for wrapping streaming sources into the runnable pipeline:
let generator = RunnableGenerator::new(|input: String, _config| {
Box::pin(async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_string());
}
})
});
Config Binding
BoxRunnable::bind() applies a config transform before delegation. This lets you attach metadata, set concurrency limits, or override run names without changing the chain's input/output types:
let tagged = chain.bind(|mut config| {
config.tags.push("production".into());
config
});
with_config() is a convenience that replaces the config entirely. with_listeners() adds before/after callbacks around invocation.
Streaming Through Pipelines
When you call stream() on a chain, the streaming behavior depends on the components:
- If the final component in a sequence truly streams (e.g., an LLM that yields token-by-token), the chain streams those chunks through.
- Intermediate steps in the pipeline run their invoke() and pass the result forward.
- RunnableGenerator produces a true stream from any async function.
This means a chain like prompt | model | parser will stream the model's output chunks through the parser, provided the parser implements true streaming.
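A small sketch of consuming a chain's stream, assuming RunnableOutputStream implements futures::Stream (as the ChatStream alias in core suggests). Because neither step below truly streams, the default stream() yields a single item containing the full result:
use futures::StreamExt;
use synaptic::core::{RunnableConfig, SynapticError};
use synaptic::runnables::{Runnable, RunnableLambda};

async fn stream_demo() -> Result<(), SynapticError> {
    let chain = RunnableLambda::new(|s: String| async move { Ok(s.to_uppercase()) }).boxed()
        | RunnableLambda::new(|s: String| async move { Ok(format!("Result: {s}")) }).boxed();

    let config = RunnableConfig::default();
    let mut stream = chain.stream("hello".into(), &config);
    while let Some(item) = stream.next().await {
        // Each item is a Result<String, SynapticError>.
        println!("chunk: {}", item?);
    }
    Ok(())
}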
Everything Is a Runnable
Synaptic's LCEL design means that many types across the framework implement Runnable:
- Prompt templates (ChatPromptTemplate) implement Runnable<Value, Vec<Message>> -- they take template variables and produce messages.
- Output parsers (StrOutputParser, JsonOutputParser, etc.) implement Runnable -- they transform one output format to another.
- Graphs produce state from state.
This uniformity means you can compose any of these with | and get type-safe, streamable pipelines.
See Also
- Pipe Operator -- composing runnables with |
- Parallel & Branch -- concurrent execution and routing
- Assign & Pick -- JSON manipulation in chains
- Fallbacks -- error recovery
- Retry -- automatic retry with backoff
- Streaming (concept) -- streaming across all layers
Agents & Tools
Agents are systems where an LLM decides what actions to take. Rather than following a fixed script, the model examines the conversation, chooses which tools to call (if any), processes the results, and decides whether to call more tools or produce a final answer. This page explains how Synaptic models tools, how they are registered and executed, and how the agent loop works.
The Tool Trait
A tool in Synaptic is anything that implements the Tool trait:
#[async_trait]
pub trait Tool: Send + Sync {
fn name(&self) -> &'static str;
fn description(&self) -> &'static str;
async fn call(&self, args: Value) -> Result<Value, SynapticError>;
}
- name() returns a unique identifier the LLM uses to refer to this tool.
- description() explains what the tool does, in natural language. This is sent to the LLM so it knows when and how to use the tool.
- call() executes the tool with JSON arguments and returns a JSON result.
The trait is intentionally minimal. A tool does not know about conversations, memory, or models. It receives arguments, does work, and returns a result. This keeps tools reusable and testable in isolation.
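For illustration, here is a minimal hand-written implementation of the trait above (the #[tool] macro generates an equivalent struct and impl for you). The weather data is hard-coded, and the async-trait crate is assumed as a direct dependency:
use async_trait::async_trait;
use serde_json::{json, Value};
use synaptic::core::{SynapticError, Tool};

struct WeatherTool;

#[async_trait]
impl Tool for WeatherTool {
    fn name(&self) -> &'static str {
        "weather"
    }

    fn description(&self) -> &'static str {
        "Get the current weather for a city."
    }

    async fn call(&self, args: Value) -> Result<Value, SynapticError> {
        let city = args["city"].as_str().unwrap_or("unknown");
        // A real tool would call a weather API here.
        Ok(json!({ "city": city, "forecast": "sunny", "temp_c": 22 }))
    }
}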
ToolDefinition
When tools are sent to an LLM, they are described as ToolDefinition structs:
pub struct ToolDefinition {
pub name: String,
pub description: String,
pub parameters: Value, // JSON Schema
pub extras: Option<HashMap<String, Value>>, // provider-specific params
}
The parameters field is a JSON Schema that describes the tool's expected arguments. LLM providers use this schema to generate valid tool calls. The ToolDefinition is metadata about the tool -- it never executes anything.
The optional extras field carries provider-specific parameters (e.g., Anthropic's cache_control). Provider adapters in synaptic-models forward these to the API when present.
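As a sketch, a hand-built definition for the weather tool above -- the #[tool] macro produces the same shape for you via as_tool_definition():
use serde_json::json;
use synaptic::core::ToolDefinition;

fn weather_definition() -> ToolDefinition {
    ToolDefinition {
        name: "weather".to_string(),
        description: "Get the current weather for a city.".to_string(),
        // Plain JSON Schema describing the expected arguments.
        parameters: json!({
            "type": "object",
            "properties": {
                "city": { "type": "string", "description": "City name" }
            },
            "required": ["city"]
        }),
        extras: None,
    }
}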
ToolCall and ToolChoice
When an LLM decides to use a tool, it produces a ToolCall:
pub struct ToolCall {
pub id: String,
pub name: String,
pub arguments: Value,
}
The id links the call to its result. When a tool finishes execution, the result is wrapped in a Message::tool(result, tool_call_id) that references this ID, allowing the LLM to match results back to calls.
ToolChoice controls the LLM's tool-calling behavior:
| Variant | Behavior |
|---|---|
Auto | The model decides whether to call tools |
Required | The model must call at least one tool |
None | Tool calling is disabled |
Specific(name) | The model must call the named tool |
ToolChoice is set on ChatRequest via .with_tool_choice().
ToolRegistry
The ToolRegistry is a thread-safe collection of tools, backed by Arc<RwLock<HashMap<String, Arc<dyn Tool>>>>:
use synaptic::tools::ToolRegistry;
let registry = ToolRegistry::new();
registry.register(Arc::new(WeatherTool))?;
registry.register(Arc::new(CalculatorTool))?;
// Look up a tool by name
let tool = registry.get("weather");
Registration is idempotent -- registering a tool with the same name replaces the previous one. The Arc<RwLock<_>> ensures safe concurrent access: multiple readers can look up tools simultaneously, and registration briefly acquires a write lock.
Tool Executors
Executors bridge the gap between tool calls from an LLM and the tool registry:
SerialToolExecutor -- executes tool calls one at a time. Simple and predictable:
let executor = SerialToolExecutor::new(registry);
let result = executor.execute("weather", json!({"city": "Tokyo"})).await?;
ParallelToolExecutor -- executes multiple tool calls concurrently. Useful when the LLM produces several independent tool calls in a single response.
Tool Wrappers
Synaptic provides wrapper types that add behavior to existing tools:
- HandleErrorTool -- catches errors from the inner tool and returns them as a string result instead of propagating the error. This allows the LLM to see the error and retry with different arguments.
- ReturnDirectTool -- marks the tool's output as the final response, short-circuiting the agent loop instead of feeding the result back to the LLM.
ToolNode
In the graph system, ToolNode is a pre-built graph node that processes AI messages containing tool calls. It:
- Reads the last message from the graph state
- Extracts all ToolCall entries from it
- Executes each tool call via a SerialToolExecutor
- Appends the results as Message::tool(...) messages back to the state
ToolNode is the standard way to handle tool execution inside a graph workflow. You do not need to write tool dispatching logic yourself.
The ReAct Agent Pattern
ReAct (Reasoning + Acting) is the most common agent pattern. The model alternates between reasoning about what to do and acting by calling tools. Synaptic provides a prebuilt ReAct agent via create_react_agent():
use synaptic::graph::{create_react_agent, MessageState};
let graph = create_react_agent(model, tools)?;
let state = MessageState::from_messages(vec![
Message::human("What is the weather in Tokyo?"),
]);
let result = graph.invoke(state).await?;
This builds a graph with two nodes:
[START] --> [agent] --tool_calls--> [tools] --> [agent] ...
\--no_tools----> [END]
- "agent" node: Calls the LLM with the current messages and tool definitions. The LLM's response is appended to the state.
- "tools" node: A
ToolNodethat executes any tool calls from the agent's response and appends results.
The conditional edge after "agent" checks if the last message has tool calls. If yes, route to "tools". If no, route to END. The edge from "tools" always returns to "agent", creating the loop.
The Agent Loop in Detail
- The user message enters the graph state.
- The "agent" node sends all messages to the LLM along with tool definitions.
- The LLM responds. If it includes tool calls:
  a. The response (with tool calls) is appended to the state.
  b. Routing sends execution to the "tools" node.
  c. Each tool call is executed and results are appended as Tool messages.
  d. Routing sends execution back to the "agent" node.
  e. The LLM now sees the tool results and can decide what to do next.
- When the LLM responds without tool calls, it has produced its final answer. Routing sends execution to END.
This loop continues until the LLM decides it has enough information to answer directly, or until the graph's iteration safety limit (100) is reached.
ReactAgentOptions
The create_react_agent_with_options() function accepts a ReactAgentOptions struct for advanced configuration:
let options = ReactAgentOptions {
checkpointer: Some(Arc::new(MemorySaver::new())),
system_prompt: Some("You are a helpful weather assistant.".into()),
interrupt_before: vec!["tools".into()],
interrupt_after: vec![],
};
let graph = create_react_agent_with_options(model, tools, options)?;
| Option | Purpose |
|---|---|
checkpointer | State persistence for resumption across invocations |
system_prompt | Prepended to messages before each LLM call |
interrupt_before | Pause before named nodes (for human approval of tool calls) |
interrupt_after | Pause after named nodes (for human review of tool results) |
Setting interrupt_before: vec!["tools".into()] creates a human-in-the-loop agent: the graph pauses before executing tools, allowing a human to inspect the proposed tool calls, modify them, or reject them entirely. The graph is then resumed via update_state().
See Also
- Custom Tools -- creating tools with the #[tool] macro
- Tool Choice -- controlling model tool-calling behavior
- Tool Definition Extras -- provider-specific parameters
- Runtime-Aware Tools -- tools with store/state access
- Tool Node -- ToolNode in graph workflows
- Graph -- the graph system that agents run within
Memory
Without memory, every LLM call is stateless -- the model has no knowledge of previous interactions. Memory in Synaptic solves this by storing, retrieving, and managing conversation history so that subsequent calls include relevant context. This page explains the memory abstraction, the available strategies, and how they trade off between completeness and cost.
The MemoryStore Trait
All memory backends implement a single trait:
#[async_trait]
pub trait MemoryStore: Send + Sync {
async fn append(&self, session_id: &str, message: Message) -> Result<(), SynapticError>;
async fn load(&self, session_id: &str) -> Result<Vec<Message>, SynapticError>;
async fn clear(&self, session_id: &str) -> Result<(), SynapticError>;
}
Three operations, keyed by a session identifier:
- append -- add a message to the session's history
- load -- retrieve the full history for a session
- clear -- delete all messages for a session
The session_id parameter is central to Synaptic's memory design. Two conversations with different session IDs are completely isolated, even if they share the same memory store instance. This enables multi-tenant applications where many users interact concurrently through a single system.
InMemoryStore
The simplest implementation -- a HashMap<String, Vec<Message>> wrapped in Arc<RwLock<_>>:
use synaptic::memory::InMemoryStore;
let store = InMemoryStore::new();
store.append("session_1", Message::human("Hello")).await?;
let history = store.load("session_1").await?;
InMemoryStore is fast, requires no external dependencies, and is suitable for development, testing, and short-lived applications. Data is lost when the process exits.
FileChatMessageHistory
A persistent store that writes messages to a JSON file on disk. Each session is stored as a separate file. This is useful for applications that need persistence without a database:
use synaptic::memory::FileChatMessageHistory;
let history = FileChatMessageHistory::new("./chat_history")?;
Memory Strategies
Raw MemoryStore keeps every message forever. For long conversations, this leads to unbounded token usage and eventually exceeds the model's context window. Memory strategies wrap a store and control which messages are included in the context.
ConversationBufferMemory
Keeps all messages. The simplest strategy -- everything is sent to the LLM every time.
- Advantage: No information loss.
- Disadvantage: Token usage grows without bound. Eventually exceeds the context window.
- Use case: Short conversations where you know the total message count is small.
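The buffer strategy needs no extra configuration. A construction sketch, assuming the same new(store) convention as the other strategies on this page:
use synaptic::memory::{ConversationBufferMemory, InMemoryStore};

let memory = ConversationBufferMemory::new(InMemoryStore::new());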
ConversationWindowMemory
Keeps only the last K message pairs (human + AI). Older messages are dropped:
use synaptic::memory::ConversationWindowMemory;
let memory = ConversationWindowMemory::new(store, 5); // keep last 5 exchanges
- Advantage: Fixed, predictable token usage.
- Disadvantage: Complete loss of older context. The model has no knowledge of what happened more than K turns ago.
- Use case: Chat UIs, customer service bots, and any scenario where recent context matters most.
ConversationSummaryMemory
Summarizes older messages using an LLM, keeping only the summary plus recent messages:
use synaptic::memory::ConversationSummaryMemory;
let memory = ConversationSummaryMemory::new(store, summarizer_model);
After each exchange, the strategy uses an LLM to produce a running summary of the conversation. The summary replaces the older messages, so the context sent to the main model includes the summary followed by recent messages.
- Advantage: Retains the gist of the entire conversation. Constant-ish token usage.
- Disadvantage: Summarization has a cost (an extra LLM call). Details may be lost in compression. Summarization quality depends on the model.
- Use case: Long-running conversations where historical context matters (e.g., a multi-session assistant that remembers past preferences).
ConversationTokenBufferMemory
Keeps as many recent messages as fit within a token budget:
use synaptic::memory::ConversationTokenBufferMemory;
let memory = ConversationTokenBufferMemory::new(store, 4096); // max 4096 tokens
Unlike window memory (which counts messages), token buffer memory counts tokens. This is more precise when messages vary significantly in length.
- Advantage: Direct control over context size. Works well with models that have strict context limits.
- Disadvantage: Still loses old messages entirely.
- Use case: Cost-sensitive applications where you want to fill the context window efficiently.
ConversationSummaryBufferMemory
A hybrid: summarizes old messages and keeps recent ones, with a token threshold controlling the boundary:
use synaptic::memory::ConversationSummaryBufferMemory;
let memory = ConversationSummaryBufferMemory::new(store, model, 2000);
// Summarize when recent messages exceed 2000 tokens
When the total token count of recent messages exceeds the threshold, the oldest messages are summarized and replaced with the summary. The result is a context that starts with a summary of the distant past, followed by verbatim recent messages.
- Advantage: Best of both worlds -- retains old context through summaries while keeping recent messages verbatim.
- Disadvantage: More complex. Requires an LLM for summarization.
- Use case: Production chat applications that need both historical awareness and accurate recent context.
Strategy Comparison
| Strategy | What It Keeps | Token Growth | Info Loss | Extra LLM Calls |
|---|---|---|---|---|
| Buffer | Everything | Unbounded | None | None |
| Window | Last K turns | Fixed | Old messages lost | None |
| Summary | Summary + recent | Near-constant | Details compressed | Yes |
| TokenBuffer | Recent within budget | Fixed | Old messages lost | None |
| SummaryBuffer | Summary + recent buffer | Bounded | Old details compressed | Yes |
RunnableWithMessageHistory
Rather than manually loading and saving messages around each LLM call, RunnableWithMessageHistory wraps any Runnable and handles it automatically:
use synaptic::memory::RunnableWithMessageHistory;
let chain_with_memory = RunnableWithMessageHistory::new(
my_chain,
store,
|config| config.metadata.get("session_id")
.and_then(|v| v.as_str())
.unwrap_or("default")
.to_string(),
);
On each invocation:
- The session ID is extracted from the RunnableConfig metadata.
- The inner runnable is invoked with the historical context prepended.
- The new messages (input and output) are appended to the store.
This separates memory management from application logic. The inner runnable does not need to know about memory at all.
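Continuing the snippet above, a hedged sketch of routing one invocation to a specific session by placing the ID where the extractor closure looks for it (this assumes metadata is a plain key-value map of JSON values and that the wrapper itself implements Runnable):
use serde_json::json;
use synaptic::core::RunnableConfig;
use synaptic::runnables::Runnable;

let mut config = RunnableConfig::default();
config.metadata.insert("session_id".to_string(), json!("alice"));
let reply = chain_with_memory.invoke("Hello again!".into(), &config).await?;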
Session Isolation
A key design property: memory is always scoped to a session. The session_id is just a string -- it could be a user ID, a conversation ID, a thread ID, or any other identifier meaningful to your application.
Different sessions sharing the same InMemoryStore (or any other store) are completely independent. Appending to session "alice" never affects session "bob". This makes it safe to use a single store instance across an entire application serving multiple users.
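A small sketch demonstrating the isolation guarantee with the InMemoryStore shown earlier:
use synaptic::core::{MemoryStore, Message, SynapticError};
use synaptic::memory::InMemoryStore;

async fn isolation_demo() -> Result<(), SynapticError> {
    let store = InMemoryStore::new();
    store.append("alice", Message::human("My favourite colour is green")).await?;
    store.append("bob", Message::human("What is the capital of France?")).await?;

    // Each session sees only its own history.
    assert_eq!(store.load("alice").await?.len(), 1);
    assert_eq!(store.load("bob").await?.len(), 1);

    // Clearing one session leaves the other untouched.
    store.clear("alice").await?;
    assert!(store.load("alice").await?.is_empty());
    assert_eq!(store.load("bob").await?.len(), 1);
    Ok(())
}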
See Also
- Buffer Memory -- keeping all messages
- Window Memory -- keeping last K turns
- Summary Memory -- LLM-based summarization
- Token Buffer Memory -- token-budget trimming
- Summary Buffer Memory -- hybrid summary + recent buffer
- RunnableWithMessageHistory -- automatic history management
- Messages -- the Message type that memory stores
Retrieval
Retrieval-Augmented Generation (RAG) grounds LLM responses in external knowledge. Instead of relying solely on what the model learned during training, a RAG system retrieves relevant documents at query time and includes them in the prompt. This page explains the retrieval pipeline's architecture, the role of each component, and the retriever types Synaptic provides.
The Pipeline
A RAG pipeline has five stages:
Load --> Split --> Embed --> Store --> Retrieve
- Load: Read raw content from files, databases, or the web into Document structs.
- Embed: Convert text chunks into numerical vectors that capture meaning.
- Store: Index the vectors for efficient similarity search.
- Retrieve: Given a query, find the most relevant chunks.
Each stage has a dedicated trait and multiple implementations. You can mix and match implementations at each stage depending on your data sources and requirements.
Document
The Document struct is the universal unit of content:
pub struct Document {
pub id: Option<String>,
pub content: String,
pub metadata: HashMap<String, Value>,
}
- content holds the text.
- metadata holds arbitrary key-value pairs (source filename, page number, section heading, creation date, etc.).
- id is an optional unique identifier used by stores for upsert and delete operations.
Documents flow through every stage of the pipeline. Loaders produce them, splitters transform them (preserving and augmenting metadata), and retrievers return them.
Loading
The Loader trait is async and returns a stream of documents:
| Loader | Source | Behavior |
|---|---|---|
TextLoader | Plain text files | One document per file |
JsonLoader | JSON files | Configurable id_key and content_key extraction |
CsvLoader | CSV files | Column-based, with metadata from other columns |
DirectoryLoader | Directory of files | Recursive, with glob filtering to select file types |
FileLoader | Single file | Generic file loading with configurable parser |
MarkdownLoader | Markdown files | Markdown-aware parsing |
WebLoader | URLs | Fetches and processes web content |
Loaders handle the mechanics of reading and parsing. They produce Document values with appropriate metadata (e.g., a source field with the file path).
Splitting
Large documents must be split into chunks that fit within embedding models' context windows and that contain focused, coherent content. The TextSplitter trait provides:
pub trait TextSplitter: Send + Sync {
fn split_text(&self, text: &str) -> Result<Vec<String>, SynapticError>;
fn split_documents(&self, documents: Vec<Document>) -> Result<Vec<Document>, SynapticError>;
}
| Splitter | Strategy |
|---|---|
CharacterTextSplitter | Splits on a single separator (default: "\n\n") with configurable chunk size and overlap |
RecursiveCharacterTextSplitter | Tries a hierarchy of separators ("\n\n", "\n", " ", "") -- splits on the largest unit that fits within the chunk size |
MarkdownHeaderTextSplitter | Splits on Markdown headers, adding header hierarchy to metadata |
HtmlHeaderTextSplitter | Splits on HTML header tags, adding header hierarchy to metadata |
TokenTextSplitter | Splits based on approximate token count (~4 chars/token heuristic, word-boundary aware) |
LanguageTextSplitter | Splits code using language-aware separators (functions, classes, etc.) |
The most commonly used splitter is RecursiveCharacterTextSplitter. It produces chunks that respect natural document boundaries (paragraphs, then sentences, then words) and includes configurable overlap between chunks so that information at chunk boundaries is not lost.
split_documents() preserves the original document's metadata on each chunk, so you can trace every chunk back to its source.
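A usage sketch -- the module path and the (chunk_size, chunk_overlap) constructor arguments are assumptions, while split_text() matches the TextSplitter trait above:
use synaptic::core::SynapticError;
use synaptic::splitters::{RecursiveCharacterTextSplitter, TextSplitter};

fn chunk_text(text: &str) -> Result<Vec<String>, SynapticError> {
    // Roughly 1000-character chunks with 200 characters of overlap (assumed constructor).
    let splitter = RecursiveCharacterTextSplitter::new(1000, 200);
    splitter.split_text(text)
}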
Embedding
Embedding models convert text into dense numerical vectors. Texts with similar meaning produce vectors that are close together in the vector space. The trait:
#[async_trait]
pub trait Embeddings: Send + Sync {
async fn embed_documents(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>, SynapticError>;
async fn embed_query(&self, text: &str) -> Result<Vec<f32>, SynapticError>;
}
Two methods because some providers optimize differently for documents (which may be batched) versus queries (single text, possibly with different prompt prefixes).
| Implementation | Description |
|---|---|
OpenAiEmbeddings | OpenAI's embedding API (text-embedding-ada-002, etc.) |
OllamaEmbeddings | Local Ollama embedding models |
FakeEmbeddings | Deterministic vectors for testing (no API calls) |
CachedEmbeddings | Wraps any Embeddings with a cache to avoid redundant API calls |
Vector Storage
Vector stores hold embedded documents and support similarity search:
#[async_trait]
pub trait VectorStore: Send + Sync {
async fn add_documents(&self, docs: Vec<Document>, embeddings: Vec<Vec<f32>>) -> Result<Vec<String>, SynapticError>;
async fn similarity_search(&self, query_embedding: &[f32], k: usize) -> Result<Vec<Document>, SynapticError>;
async fn delete(&self, ids: &[String]) -> Result<(), SynapticError>;
}
InMemoryVectorStore uses cosine similarity with brute-force search. It stores documents and their embeddings in a RwLock<HashMap>, computes cosine similarity against all stored vectors at query time, and returns the top-k results. This is suitable for small to medium collections (thousands of documents). For larger collections, you would implement the VectorStore trait with a dedicated vector database.
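An end-to-end sketch of the Embed and Store stages using the test-friendly FakeEmbeddings and the in-memory store. The module paths, the Document re-export location, and the new() constructors are assumptions; the trait methods match the signatures shown above:
use std::collections::HashMap;
use synaptic::core::{Document, SynapticError};
use synaptic::embeddings::{Embeddings, FakeEmbeddings};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStore};

async fn index_and_search() -> Result<(), SynapticError> {
    let embeddings = FakeEmbeddings::new();
    let store = InMemoryVectorStore::new();

    let docs = vec![
        Document { id: None, content: "Synaptic is a Rust agent framework.".into(), metadata: HashMap::new() },
        Document { id: None, content: "BM25 is a lexical ranking function.".into(), metadata: HashMap::new() },
    ];

    // Embed the chunks, then index them alongside their vectors.
    let texts: Vec<String> = docs.iter().map(|d| d.content.clone()).collect();
    let vectors = embeddings.embed_documents(texts).await?;
    store.add_documents(docs, vectors).await?;

    // Embed the query and fetch the single closest document.
    let query = embeddings.embed_query("What is Synaptic?").await?;
    let hits = store.similarity_search(&query, 1).await?;
    println!("top hit: {}", hits[0].content);
    Ok(())
}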
Retrieval
The Retriever trait is the query-time interface:
#[async_trait]
pub trait Retriever: Send + Sync {
async fn retrieve(&self, query: &str) -> Result<Vec<Document>, SynapticError>;
}
A retriever takes a natural-language query and returns relevant documents. Synaptic provides eight retriever implementations (seven in synaptic-retrieval, plus MultiVectorRetriever in synaptic-vectorstores), each with different strengths.
InMemoryRetriever
The simplest retriever -- stores documents in memory and returns them based on keyword matching. Useful for testing and small collections.
BM25Retriever
Implements the Okapi BM25 scoring algorithm, a classical information retrieval method that ranks documents by term frequency and inverse document frequency. No embeddings required -- purely lexical matching.
BM25 excels at exact keyword matching. If a user searches for "tokio runtime" and a document contains exactly those words, BM25 ranks it highly, while semantically similar documents that use different wording score lower.
MultiQueryRetriever
Uses an LLM to generate multiple query variants from the original query, then runs each variant through a base retriever and combines the results. This addresses the problem that a single query phrasing may miss relevant documents:
Original query: "How do I handle errors?"
Generated variants:
- "What is the error handling approach?"
- "How are errors propagated in the system?"
- "What error types are available?"
EnsembleRetriever
Combines results from multiple retrievers using Reciprocal Rank Fusion (RRF). A typical setup pairs BM25 (good at exact matches) with a vector store retriever (good at semantic matches).
The RRF algorithm assigns scores based on rank position across retrievers, so a document that appears in the top results of multiple retrievers gets a higher combined score.
ContextualCompressionRetriever
Wraps a base retriever and compresses retrieved documents to remove irrelevant content. Uses a DocumentCompressor (such as EmbeddingsFilter, which filters out documents below a similarity threshold) to refine results after retrieval.
SelfQueryRetriever
Uses an LLM to parse the user's query into a structured filter over document metadata, combined with a semantic search query. For example:
User query: "Find papers about transformers published after 2020"
Parsed:
- Semantic query: "papers about transformers"
- Metadata filter: year > 2020
This enables natural-language queries that combine semantic search with precise metadata filtering.
ParentDocumentRetriever
Stores small child chunks for embedding (which improves retrieval precision) but returns the larger parent documents they came from (which provides more context to the LLM). This addresses the tension between small chunks (better for matching) and large chunks (better for context).
MultiVectorRetriever
Similar to ParentDocumentRetriever, but implemented at the vector store level. MultiVectorRetriever stores child document embeddings in a VectorStore and maintains a separate docstore mapping child IDs to parent documents. At query time, it searches for matching child chunks and returns their parent documents. This retriever is available in synaptic-vectorstores.
Connecting Retrieval to Generation
Retrievers produce Vec<Document>. To use them in a RAG chain, you typically format the documents into a prompt and pass them to an LLM:
// Pseudocode for a RAG chain
let docs = retriever.retrieve("What is Synaptic?").await?;
let context = docs.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let prompt = format!("Context:\n{context}\n\nQuestion: What is Synaptic?");
Using LCEL, this can be composed into a reusable chain with RunnableParallel (to fetch context and pass through the question simultaneously), RunnableLambda (to format the prompt), and a chat model.
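Without LCEL, the same flow can be written as a plain async helper. The sketch below sticks to APIs shown elsewhere in this documentation (Retriever, ChatRequest, the Message factory methods, and ChatModel::chat); the import paths are assumptions:
use synaptic::core::{ChatModel, ChatRequest, Message, SynapticError};
use synaptic::retrieval::Retriever; // exact path may differ

async fn answer(
    retriever: &dyn Retriever,
    model: &dyn ChatModel,
    question: &str,
) -> Result<String, SynapticError> {
    // 1. Retrieve and format the context.
    let docs = retriever.retrieve(question).await?;
    let context = docs.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
    // 2. Ground the generation in the retrieved context.
    let request = ChatRequest::new(vec![
        Message::system(&format!("Answer using only this context:\n\n{context}")),
        Message::human(question),
    ]);
    let response = model.chat(request).await?;
    Ok(response.message.content().to_string())
}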
See Also
- Document Loaders -- loading data from files and the web
- Text Splitters -- splitting documents into chunks
- Embeddings -- embedding models for vector search
- Vector Stores -- storing and searching vectors
- BM25 Retriever -- keyword-based retrieval
- Ensemble Retriever -- combining multiple retrievers
- Self-Query Retriever -- LLM-powered metadata filtering
- Runnables & LCEL -- composing retrieval into chains
Graph
LCEL chains are powerful for linear pipelines, but some workflows need cycles, conditional branching, checkpointed state, and human intervention. The graph system (Synaptic's equivalent of LangGraph) provides these capabilities through a state-machine abstraction. This page explains the graph model, its key concepts, and how it differs from chain-based composition.
Why Graphs?
Consider a ReAct agent. The LLM calls tools, sees the results, and decides whether to call more tools or produce a final answer. This is a loop -- the execution path is not known in advance. LCEL chains compose linearly (A | B | C), but a ReAct agent needs to go from A to B, then back to A, then conditionally to C.
Graphs solve this. Each step is a node, transitions are edges, and the graph runtime handles routing, checkpointing, and streaming. The execution path emerges at runtime based on the state.
State
Every graph operates on a shared state type that implements the State trait:
pub trait State: Send + Sync + Clone + 'static {
fn merge(&mut self, other: Self);
}
The merge() method defines how state updates are combined. When a node returns a new state, it is merged into the current state. This is the graph's "reducer" -- it determines how concurrent or sequential updates compose.
MessageState
Synaptic provides MessageState as the built-in state type for conversational agents:
pub struct MessageState {
pub messages: Vec<Message>,
}
Its merge() implementation appends new messages to the existing list. This means each node can add messages (LLM responses, tool results, etc.) and they accumulate naturally.
You can define custom state types for non-conversational workflows. Any Clone + Send + Sync + 'static type that implements State (specifically, the merge method) can be used.
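For example, a non-conversational pipeline might track collected items and a completion flag. A minimal sketch, assuming the State trait is importable from synaptic::graph:
use synaptic::graph::State;

#[derive(Clone, Default)]
struct PipelineState {
    items: Vec<String>,
    done: bool,
}

impl State for PipelineState {
    fn merge(&mut self, other: Self) {
        // Updates accumulate: later nodes append items and may flip the flag.
        self.items.extend(other.items);
        self.done = self.done || other.done;
    }
}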
Nodes
A node is a unit of computation within the graph:
#[async_trait]
pub trait Node<S: State>: Send + Sync {
async fn process(&self, state: S) -> Result<NodeOutput<S>, SynapticError>;
}
A node receives the current state, does work, and returns a NodeOutput<S>:
- NodeOutput::State(S) -- a regular state update. The From<S> impl lets you write Ok(state.into()).
- NodeOutput::Command(Command<S>) -- a control flow command: dynamic routing (Command::goto), early termination (Command::end), or interrupts (interrupt()).
FnNode wraps an async closure into a node, which is the most common way to define nodes:
let my_node = FnNode::new(|state: MessageState| async move {
// Process state, add messages, etc.
Ok(state.into())
});
ToolNode is a pre-built node that extracts tool calls from the last AI message, executes them, and appends the results. The tools_condition function provides standard routing: returns "tools" if the last message has tool calls, else END.
Building a Graph
StateGraph<S> is the builder:
use synaptic::graph::{StateGraph, MessageState, END};
let graph = StateGraph::new()
.add_node("step_1", node_1)
.add_node("step_2", node_2)
.set_entry_point("step_1")
.add_edge("step_1", "step_2")
.add_edge("step_2", END)
.compile()?;
add_node(name, node)
Registers a named node. Names are arbitrary strings. Two special constants exist: START (the entry sentinel) and END (the exit sentinel). You never add START or END as nodes -- they are implicit.
set_entry_point(name)
Defines which node executes first after START.
add_edge(source, target)
A fixed edge -- after source completes, always go to target. The target can be END to terminate the graph.
add_conditional_edges(source, router_fn)
A conditional edge -- after source completes, call router_fn with the current state to determine the next node:
.add_conditional_edges("agent", |state: &MessageState| {
if state.last_message().map_or(false, |m| !m.tool_calls().is_empty()) {
"tools".to_string()
} else {
END.to_string()
}
})
The router function receives a reference to the state and returns the name of the next node (or END).
There is also add_conditional_edges_with_path_map(), which additionally provides a mapping from router return values to node names. This path map is used by visualization tools to render the conditional branches.
compile()
Validates the graph (checks that all referenced nodes exist, that the entry point is set, etc.) and returns a CompiledGraph<S>.
Executing a Graph
CompiledGraph<S> provides two execution methods:
invoke(state)
Runs the graph and returns a GraphResult<S>:
let initial = MessageState::with_messages(vec![Message::human("Hello")]);
let result = graph.invoke(initial).await?;
match result {
GraphResult::Complete(state) => println!("Done: {} messages", state.messages.len()),
GraphResult::Interrupted { state, interrupt_value } => {
println!("Paused: {interrupt_value}");
}
}
// Or use convenience methods:
let state = result.into_state(); // works for both Complete and Interrupted
stream(state, mode)
Returns a GraphStream that yields GraphEvent<S> after each node executes:
use futures::StreamExt;
use synaptic::graph::StreamMode;
let mut stream = graph.stream(initial, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node '{}' completed", event.node);
}
StreamMode::Values yields the full state after each node. StreamMode::Updates yields the per-node state changes.
Checkpointing
Graphs support state persistence through the Checkpointer trait. After each node executes, the current state and the next scheduled node are saved. This enables:
- Resumption: If the process crashes, the graph can resume from the last checkpoint.
- Human-in-the-loop: The graph can pause, persist state, and resume later after human input.
MemorySaver is the built-in in-memory checkpointer. For production use, you would implement Checkpointer with a database backend.
use std::sync::Arc;
use synaptic::graph::MemorySaver;
let checkpointer = Arc::new(MemorySaver::new());
let graph = graph.with_checkpointer(checkpointer);
Checkpoints are identified by a CheckpointConfig that includes a thread_id. Different threads have independent checkpoint histories.
get_state / get_state_history
You can inspect the current state and full history of a checkpointed graph:
let current = graph.get_state(&config).await?;
let history = graph.get_state_history(&config).await?;
get_state_history() returns a list of (state, next_node) pairs, ordered from oldest to newest.
Human-in-the-Loop
Two mechanisms pause graph execution for human intervention:
interrupt_before(nodes)
The graph pauses before executing the named nodes. The current state is checkpointed, and the graph returns GraphResult::Interrupted.
let graph = StateGraph::new()
// ...
.interrupt_before(vec!["tools".into()])
.compile()?;
After the interrupt, the human can inspect the state (e.g., review proposed tool calls), modify it via update_state(), and resume execution:
// Inspect the proposed tool calls
let state = graph.get_state(&config).await?.unwrap();
// Modify state if needed
graph.update_state(&config, updated_state).await?;
// Resume execution
let result = graph.invoke_with_config(
MessageState::default(),
Some(config),
).await?;
let final_state = result.into_state();
interrupt_after(nodes)
The graph pauses after executing the named nodes. The node's output is already in the state, and the next node is recorded in the checkpoint. Useful for reviewing a node's output before proceeding.
Programmatic interrupt()
Nodes can also interrupt programmatically using the interrupt() function:
use synaptic::graph::{interrupt, NodeOutput};
// Inside a node's process() method:
Ok(interrupt(serde_json::json!({"question": "Approve?"})))
This returns GraphResult::Interrupted with the specified value, which the caller can inspect via result.interrupt_value().
Dynamic Control Flow with Command
Nodes can override normal edge-based routing by returning NodeOutput::Command(...):
Command::goto(target)
Redirects execution to a specific node, skipping normal edge resolution:
Ok(NodeOutput::Command(Command::goto("summary")))
Command::goto_with_update(target, state_delta)
Routes to a node while also applying a state update:
Ok(NodeOutput::Command(Command::goto_with_update("next", delta)))
Command::end()
Ends graph execution immediately:
Ok(NodeOutput::Command(Command::end()))
Command::update(state_delta)
Applies a state update without overriding routing (uses normal edges):
Ok(NodeOutput::Command(Command::update(delta)))
Commands take priority over edges. After a node executes, the graph checks for a command before consulting edges. This enables dynamic, state-dependent control flow that goes beyond what static edge definitions can express.
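Putting these pieces together, here is a sketch of a node that overrides routing only when the conversation grows too long; the 20-message threshold and the "summary" node name are illustrative:
let router_node = FnNode::new(|state: MessageState| async move {
    if state.messages.len() > 20 {
        // Too much history: jump straight to a summarization node.
        Ok(NodeOutput::Command(Command::goto("summary")))
    } else {
        // Otherwise fall through to normal edge-based routing.
        Ok(state.into())
    }
});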
Send (Fan-out)
The Send mechanism allows a node to dispatch work to multiple target nodes via Command::send(), enabling fan-out (map-reduce) patterns within the graph.
Visualization
CompiledGraph provides multiple rendering methods:
| Method | Output | Requirements |
|---|---|---|
draw_mermaid() | Mermaid flowchart string | None |
draw_ascii() | Plain text summary | None |
draw_dot() | Graphviz DOT format | None |
draw_png(path) | PNG image file | Graphviz dot in PATH |
draw_mermaid_png(path) | PNG via mermaid.ink API | Internet access |
draw_mermaid_svg(path) | SVG via mermaid.ink API | Internet access |
Display is also implemented, so println!("{graph}") outputs the ASCII representation.
Mermaid output example for a ReAct agent:
graph TD
__start__(["__start__"])
agent["agent"]
tools["tools"]
__end__(["__end__"])
__start__ --> agent
tools --> agent
agent -.-> |tools| tools
agent -.-> |__end__| __end__
Prebuilt Multi-Agent Patterns
Beyond create_react_agent, Synaptic provides two multi-agent graph constructors:
create_supervisor
Builds a supervisor graph where a central LLM orchestrates sub-agents. The supervisor decides which agent to delegate to by calling handoff tools (transfer_to_<agent_name>). Each sub-agent is itself a compiled react agent graph.
use synaptic::graph::{create_supervisor, SupervisorOptions};
let agents = vec![
("researcher".to_string(), researcher_graph),
("writer".to_string(), writer_graph),
];
let graph = create_supervisor(supervisor_model, agents, SupervisorOptions::default())?;
The supervisor loop: supervisor calls LLM → if handoff tool call, route to sub-agent → sub-agent runs to completion → return to supervisor → repeat until supervisor produces a final answer (no tool calls).
create_swarm
Builds a swarm graph where agents hand off to each other peer-to-peer, without a central coordinator. Each agent has its own model, tools, and system prompt. Handoff is done via transfer_to_<agent_name> tool calls.
use synaptic::graph::{create_swarm, SwarmAgent, SwarmOptions};
let agents = vec![
SwarmAgent { name: "triage".into(), model, tools, system_prompt: Some("...".into()) },
SwarmAgent { name: "support".into(), model, tools, system_prompt: Some("...".into()) },
];
let graph = create_swarm(agents, SwarmOptions::default())?;
The first agent in the list is the entry point. Each agent runs until it either produces a final answer or hands off to another agent.
Safety Limits
The graph runtime enforces a maximum of 100 iterations per execution to prevent infinite loops. If a graph cycles more than 100 times, it returns SynapticError::Graph("max iterations (100) exceeded"). This is a safety guard, not a configurable limit -- if your workflow legitimately needs more iterations, the graph structure should be reconsidered.
See Also
- State & Nodes -- building custom nodes and state types
- Command & Routing -- dynamic control flow with Command
- Interrupt & Resume -- programmatic interrupts
- Human-in-the-Loop -- pausing for human input
- Streaming -- graph streaming with StreamMode
- Supervisor -- supervisor pattern how-to
- Swarm -- swarm pattern how-to
- Tool Node -- ToolNode and tools_condition
Streaming
LLM responses can take seconds to generate. Without streaming, the user sees nothing until the entire response is complete. Streaming delivers tokens as they are produced, reducing perceived latency and enabling real-time UIs. This page explains how streaming works across Synaptic's layers -- from individual model calls through LCEL chains to graph execution.
Model-Level Streaming
The ChatModel trait provides two methods:
#[async_trait]
pub trait ChatModel: Send + Sync {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, SynapticError>;
fn stream_chat(&self, request: ChatRequest) -> ChatStream<'_>;
}
chat() waits for the complete response. stream_chat() returns a ChatStream immediately:
pub type ChatStream<'a> =
Pin<Box<dyn Stream<Item = Result<AIMessageChunk, SynapticError>> + Send + 'a>>;
This is a pinned, boxed, async stream of AIMessageChunk values. Each chunk contains a fragment of the response -- typically a few tokens of text, part of a tool call, or usage information.
Default Implementation
The stream_chat() method has a default implementation that wraps chat() as a single-chunk stream. If a model adapter does not implement true streaming, it falls back to this behavior -- the caller still gets a stream, but it contains only one chunk (the complete response). This means code that consumes a ChatStream works with any model, whether or not it supports true streaming.
Consuming a Stream
use futures::StreamExt;
let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
print!("{}", chunk.content); // print tokens as they arrive
}
AIMessageChunk Merging
Streaming produces many chunks that must be assembled into a complete message. AIMessageChunk supports the + and += operators:
let mut accumulated = AIMessageChunk::default();
while let Some(chunk) = stream.next().await {
accumulated += chunk?;
}
let complete_message: Message = accumulated.into_message();
The merge rules:
- content: Concatenated via push_str. Each chunk's content fragment is appended to the accumulated string.
- tool_calls: Extended. Chunks may carry partial or complete tool call objects.
- tool_call_chunks: Extended. Raw partial tool call data from the provider.
- invalid_tool_calls: Extended.
- id: The first non-None value wins. Subsequent chunks do not overwrite the ID.
- usage: Summed field-by-field. If both sides have usage data, input_tokens, output_tokens, and total_tokens are added together. If only one side has usage, it is preserved.
After accumulation, into_message() converts the chunk into a Message::AI with the complete content and tool calls.
LCEL Streaming
The Runnable trait includes a stream() method:
fn stream<'a>(&'a self, input: I, config: &'a RunnableConfig) -> RunnableOutputStream<'a, O>;
The default implementation wraps invoke() as a single-item stream, similar to the model-level default. Components that support true streaming override this method.
Streaming Through Chains
When you call stream() on a BoxRunnable chain (e.g., prompt | model | parser), the behavior is:
- Intermediate steps run their invoke() method and pass the result forward.
- The final component in the chain streams its output.
This means in a prompt | model | parser chain, the prompt template runs synchronously, the model truly streams, and the parser processes each chunk as it arrives (if it supports streaming) or waits for the complete output (if it does not).
let chain = prompt_template.boxed() | model_runnable.boxed() | parser.boxed();
let mut stream = chain.stream(input, &config);
while let Some(item) = stream.next().await {
let output = item?;
// Process each streamed output
}
RunnableGenerator
For producing custom streams, RunnableGenerator wraps an async function that returns a stream:
let generator = RunnableGenerator::new(|input: String, _config| {
Box::pin(async_stream::stream! {
for word in input.split_whitespace() {
yield Ok(word.to_string());
}
})
});
This is useful when you need to inject a streaming source into an LCEL chain that is not a model.
Graph Streaming
Graph execution can also stream, yielding events after each node completes:
use synaptic::graph::StreamMode;
let mut stream = graph.stream(initial_state, StreamMode::Values);
while let Some(event) = stream.next().await {
let event = event?;
println!("Node '{}' completed. Messages: {}", event.node, event.state.messages.len());
}
StreamMode
| Mode | Yields | Use Case |
|---|---|---|
Values | Full state after each node | When you need the complete picture at each step |
Updates | Per-node state changes (what each node returned) | When you want to observe what each node changed |
GraphEvent
pub struct GraphEvent<S> {
pub node: String,
pub state: S,
}
Each event tells you which node just executed and what the state looks like. For a ReAct agent, you would see alternating "agent" and "tools" events, with messages accumulating in the state.
When to Use Streaming
Use model-level streaming when you need token-by-token output for a chat UI or when you want to show partial results to the user as they are generated.
Use LCEL streaming when you have a chain of operations and want the final output to stream. The intermediate steps run synchronously, but the user sees the final result incrementally.
Use graph streaming when you have a multi-step workflow and want to observe progress. Each node completion is an event, giving you visibility into the graph's execution.
Streaming and Error Handling
Streams can yield errors at any point. A network failure mid-stream, a malformed chunk from the provider, or a graph node failure all produce Err items in the stream. Consumers should handle errors on each next() call:
while let Some(result) = stream.next().await {
match result {
Ok(chunk) => process(chunk),
Err(e) => {
eprintln!("Stream error: {e}");
break;
}
}
}
There is no automatic retry at the stream level. If a stream fails mid-way, the consumer decides how to handle it -- retry the entire call, return a partial result, or propagate the error. For automatic retries, wrap the model in a RetryChatModel before streaming, which retries the entire request on failure.
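For example, wrapping the model first and then streaming through the wrapper; the constructor shape follows the RetryChatModel example in the Error Handling chapter, and the retry count, delay type, and delay value are illustrative:
use futures::StreamExt;
use std::time::Duration;
use synaptic::models::RetryChatModel;

// Retries the whole request on failure; each successful attempt streams from the start.
let robust_model = RetryChatModel::new(model, 3, Duration::from_millis(500));
let mut stream = robust_model.stream_chat(request);
while let Some(chunk) = stream.next().await {
    print!("{}", chunk?.content);
}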
See Also
- Chat Model Streaming -- model-level streaming how-to
- LCEL Streaming -- streaming through runnable chains
- Graph Streaming -- graph-level streaming with StreamMode
- Runnables & LCEL -- the composition system that streams run through
Middleware
Middleware intercepts and transforms agent behavior at well-defined lifecycle points. Rather than modifying agent logic directly, middleware wraps around model calls and tool calls, adding cross-cutting concerns like rate limiting, human approval, summarization, and context management. This page explains the middleware abstraction, the lifecycle hooks, and the available middleware classes.
The AgentMiddleware Trait
All middleware implements a single trait with six hooks:
#[async_trait]
pub trait AgentMiddleware: Send + Sync {
async fn before_agent(&self, state: &MessageState) -> Result<(), SynapticError> { Ok(()) }
async fn after_agent(&self, state: &MessageState) -> Result<(), SynapticError> { Ok(()) }
async fn before_model(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError> { Ok(()) }
async fn after_model(&self, response: &mut ChatResponse) -> Result<(), SynapticError> { Ok(()) }
async fn wrap_model_call(&self, messages: Vec<Message>, next: ModelCallFn) -> Result<ChatResponse, SynapticError>;
async fn wrap_tool_call(&self, name: &str, args: &Value, next: ToolCallFn) -> Result<Value, SynapticError>;
}
Each hook has a default implementation that passes through unchanged. Middleware only overrides the hooks it needs.
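As a sketch, a middleware that only watches the outgoing history can override before_model and nothing else; the import paths are assumptions, and wrap_model_call / wrap_tool_call are assumed to keep the pass-through defaults described here:
use async_trait::async_trait;
use synaptic::core::{Message, SynapticError};
use synaptic::middleware::AgentMiddleware; // path assumed

struct HistoryLogger;

#[async_trait]
impl AgentMiddleware for HistoryLogger {
    // Runs before every LLM request; the hook could also trim or rewrite `messages`.
    async fn before_model(&self, messages: &mut Vec<Message>) -> Result<(), SynapticError> {
        println!("sending {} messages to the model", messages.len());
        Ok(())
    }
}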
Lifecycle
A single agent turn follows this sequence:
before_agent → before_model → wrap_model_call → after_model → wrap_tool_call (per tool) → after_agent
- before_agent -- called once at the start of each agent turn. Use for setup, logging, or state inspection.
- before_model -- called before the LLM request. Can modify messages (e.g., inject context, trim history).
- wrap_model_call -- wraps the actual model invocation. Can retry, add fallbacks, or replace the call entirely.
- after_model -- called after the LLM responds. Can modify the response (e.g., fix tool calls, add metadata).
- wrap_tool_call -- wraps each tool invocation. Can approve/reject, add logging, or modify arguments.
- after_agent -- called once at the end of each agent turn. Use for cleanup or state persistence.
MiddlewareChain
Multiple middleware instances are composed into a MiddlewareChain. The chain applies middleware in order for "before" hooks and in reverse order for "after" hooks (onion model):
use synaptic::middleware::MiddlewareChain;
let chain = MiddlewareChain::new(vec![
Arc::new(ToolCallLimitMiddleware::new(10)),
Arc::new(HumanInTheLoopMiddleware::new(callback)),
Arc::new(SummarizationMiddleware::new(model, 4000)),
]);
Available Middleware
ToolCallLimitMiddleware
Limits the total number of tool calls per agent session. When the limit is reached, subsequent tool calls return an error instead of executing.
- Use case: Preventing runaway agents that call tools in an infinite loop.
- Configuration: ToolCallLimitMiddleware::new(max_calls)
HumanInTheLoopMiddleware
Routes tool calls through an approval callback before execution. The callback receives the tool name and arguments and returns an approval decision.
- Use case: High-stakes operations (database writes, external API calls) that require human review.
- Configuration: HumanInTheLoopMiddleware::new(callback), or .for_tools(vec!["dangerous_tool"]) to guard only specific tools.
SummarizationMiddleware
Monitors message history length and summarizes older messages when a token threshold is exceeded. Replaces distant messages with a summary while preserving recent ones.
- Use case: Long-running agents that accumulate large message histories.
- Configuration: SummarizationMiddleware::new(summarizer_model, token_threshold)
ContextEditingMiddleware
Transforms the message history before each model call using a configurable strategy:
- ContextStrategy::LastN(n) -- keep only the last N messages (preserving leading system messages).
- ContextStrategy::StripToolCalls -- remove tool call/result messages, keeping only human and AI content messages.
ModelRetryMiddleware
Wraps the model call with retry logic, attempting the call multiple times on transient failures.
ModelFallbackMiddleware
Provides fallback models when the primary model fails. Tries alternatives in order until one succeeds.
Middleware vs. Graph Features
Middleware and graph features (checkpointing, interrupts) serve different purposes:
| Concern | Middleware | Graph |
|---|---|---|
| Tool approval | HumanInTheLoopMiddleware | interrupt_before("tools") |
| Context management | ContextEditingMiddleware | Custom node logic |
| Rate limiting | ToolCallLimitMiddleware | Not applicable |
| State persistence | Not applicable | Checkpointer |
Middleware operates within a single agent node. Graph features operate across the entire graph. Use middleware for per-turn concerns and graph features for workflow-level concerns.
See Also
- Middleware How-to Guides -- detailed usage for each middleware class
- Tool Call Limit -- limiting tool calls
- Human-in-the-Loop -- approval workflows
- Summarization -- automatic context summarization
- Context Editing -- message history strategies
Key-Value Store
The key-value store provides persistent, namespaced storage for structured data. Unlike memory (which stores conversation messages by session), the store holds arbitrary key-value items organized into hierarchical namespaces. It supports CRUD operations, namespace listing, and optional semantic search when an embeddings model is configured.
The Store Trait
The Store trait is defined in synaptic-core and implemented in synaptic-store:
#[async_trait]
pub trait Store: Send + Sync {
async fn put(&self, namespace: &[&str], key: &str, value: Item) -> Result<(), SynapticError>;
async fn get(&self, namespace: &[&str], key: &str) -> Result<Option<Item>, SynapticError>;
async fn delete(&self, namespace: &[&str], key: &str) -> Result<(), SynapticError>;
async fn search(&self, namespace: &[&str], query: &SearchQuery) -> Result<Vec<Item>, SynapticError>;
async fn list_namespaces(&self, prefix: &[&str]) -> Result<Vec<Vec<String>>, SynapticError>;
}
Namespace Hierarchy
Namespaces are arrays of strings, forming a path-like hierarchy:
// Store user preferences
store.put(&["users", "alice", "preferences"], "theme", item).await?;
// Store project data
store.put(&["projects", "my-app", "config"], "settings", item).await?;
// List all user namespaces
let namespaces = store.list_namespaces(&["users"]).await?;
// [["users", "alice", "preferences"], ["users", "bob", "preferences"]]
Items in different namespaces are completely isolated. A get or search in one namespace never returns items from another.
Item
The Item struct holds the stored value:
pub struct Item {
pub key: String,
pub value: Value, // serde_json::Value
pub namespace: Vec<String>,
pub created_at: Option<DateTime<Utc>>,
pub updated_at: Option<DateTime<Utc>>,
pub score: Option<f32>, // populated by semantic search
}
The score field is None for regular CRUD operations and is populated only when items are returned from a semantic search query.
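For example, reading an item back and inspecting its JSON payload (the namespace, key, and field name are illustrative):
// `get` returns Ok(None) when the key does not exist in that namespace.
if let Some(item) = store.get(&["users", "alice", "preferences"], "theme").await? {
    let name = item.value.get("name").and_then(|v| v.as_str()).unwrap_or("default");
    println!("theme = {name}, updated at {:?}", item.updated_at);
}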
InMemoryStore
The built-in implementation uses Arc<RwLock<HashMap>> for thread-safe concurrent access:
use synaptic::store::InMemoryStore;
let store = InMemoryStore::new();
Suitable for development, testing, and applications that don't need persistence across restarts. For production use, implement the Store trait with a database backend.
Semantic Search
When an embeddings model is configured, the store supports semantic search -- finding items by meaning rather than exact key match:
use synaptic::store::InMemoryStore;
let store = InMemoryStore::with_embeddings(embeddings_model);
// Items are automatically embedded when stored
store.put(&["docs"], "rust-intro", item).await?;
// Search by semantic similarity
let results = store.search(&["docs"], &SearchQuery {
query: Some("programming language".into()),
limit: 5,
..Default::default()
}).await?;
Each returned item has a score field (0.0 to 1.0) indicating semantic similarity to the query.
Store vs. Memory
| Aspect | Store | Memory (MemoryStore) |
|---|---|---|
| Purpose | General key-value storage | Conversation message history |
| Keyed by | Namespace + key | Session ID |
| Value type | Arbitrary JSON (Value) | Message |
| Operations | CRUD + search + list | Append + load + clear |
| Search | Semantic (with embeddings) | Not applicable |
| Use case | Agent knowledge, user profiles, configuration | Chat history, context management |
Use memory for conversation state. Use the store for everything else -- agent knowledge bases, user preferences, cached computations, cross-session data.
Store in the Graph
The store is accessible within graph nodes through the ToolRuntime:
// Inside a RuntimeAwareTool
async fn call_with_runtime(&self, args: Value, runtime: &ToolRuntime) -> Result<Value, SynapticError> {
if let Some(store) = &runtime.store {
let item = store.get(&["memory"], "context").await?;
// Use stored data in tool execution
}
Ok(json!({"status": "ok"}))
}
This enables tools to read and write persistent data during graph execution without passing the store through function arguments.
See Also
- Key-Value Store How-to -- usage examples and patterns
- Runtime-Aware Tools -- accessing the store from tools
- Deep Agent Backends -- StoreBackend uses the Store trait
Integrations
Synaptic uses a provider-centric architecture for external service integrations. Each integration lives in its own crate, depends only on synaptic-core (plus any provider SDK), and implements one or more core traits.
Architecture
synaptic-core (defines traits)
├── synaptic-openai (ChatModel + Embeddings)
├── synaptic-anthropic (ChatModel)
├── synaptic-gemini (ChatModel)
├── synaptic-ollama (ChatModel + Embeddings)
├── synaptic-bedrock (ChatModel)
├── synaptic-cohere (DocumentCompressor)
├── synaptic-qdrant (VectorStore)
├── synaptic-pgvector (VectorStore)
├── synaptic-pinecone (VectorStore)
├── synaptic-chroma (VectorStore)
├── synaptic-mongodb (VectorStore)
├── synaptic-elasticsearch (VectorStore)
├── synaptic-redis (Store + LlmCache)
├── synaptic-sqlite (LlmCache)
├── synaptic-pdf (Loader)
└── synaptic-tavily (Tool)
All integration crates share a common pattern:
- Core traits — ChatModel, Embeddings, VectorStore, Store, LlmCache, and Loader are defined in synaptic-core
- Independent crates — Each integration is a separate crate with its own feature flag
- Zero coupling — Integration crates never depend on each other
- Config structs — Builder-pattern configuration with new() + with_*() methods
Core Traits
| Trait | Purpose | Crate Implementations |
|---|---|---|
ChatModel | LLM chat completion | openai, anthropic, gemini, ollama, bedrock |
Embeddings | Text embedding vectors | openai, ollama |
VectorStore | Vector similarity search | qdrant, pgvector, pinecone, chroma, mongodb, elasticsearch, (+ in-memory) |
Store | Key-value storage | redis, (+ in-memory) |
LlmCache | LLM response caching | redis, sqlite, (+ in-memory) |
Loader | Document loading | pdf, (+ text, json, csv, directory) |
DocumentCompressor | Document reranking/filtering | cohere, (+ embeddings filter) |
Tool | Agent tool | tavily, (+ custom tools) |
LLM Provider Pattern
All LLM providers follow the same pattern — a config struct, a model struct, and a ProviderBackend for HTTP transport:
use synaptic::openai::{OpenAiChatModel, OpenAiConfig};
use synaptic::models::{HttpBackend, FakeBackend};
// Production
let config = OpenAiConfig::new("sk-...", "gpt-4o");
let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));
// Testing (no network calls)
let model = OpenAiChatModel::new(config, Arc::new(FakeBackend::with_responses(vec![...])));
The ProviderBackend abstraction (in synaptic-models) enables:
- HttpBackend — real HTTP calls in production
- FakeBackend — deterministic responses in tests
Storage & Retrieval Pattern
Vector stores, key-value stores, and caches implement core traits that allow drop-in replacement:
// Swap InMemoryVectorStore for QdrantVectorStore — same trait interface
use synaptic::qdrant::{QdrantVectorStore, QdrantConfig};
let config = QdrantConfig::new("http://localhost:6334", "my_collection", 1536);
let store = QdrantVectorStore::new(config);
store.add_documents(docs, &embeddings).await?;
let results = store.similarity_search("query", 5, &embeddings).await?;
Feature Flags
Each integration has its own feature flag in the synaptic facade crate:
[dependencies]
synaptic = { version = "0.3", features = ["openai", "qdrant"] }
| Feature | Integration |
|---|---|
openai | OpenAI ChatModel + Embeddings (+ OpenAI-compatible providers + Azure) |
anthropic | Anthropic ChatModel |
gemini | Google Gemini ChatModel |
ollama | Ollama ChatModel + Embeddings |
bedrock | AWS Bedrock ChatModel |
cohere | Cohere Reranker |
qdrant | Qdrant vector store |
pgvector | PostgreSQL pgvector store |
pinecone | Pinecone vector store |
chroma | Chroma vector store |
mongodb | MongoDB Atlas vector search |
elasticsearch | Elasticsearch vector store |
redis | Redis store + cache |
sqlite | SQLite LLM cache |
pdf | PDF document loader |
tavily | Tavily search tool |
Convenience combinations: models (all LLM providers plus cohere), agent (includes openai), rag (includes openai + retrieval stack), full (everything).
Provider Selection Guide
Choose a provider based on your requirements:
| Provider | Auth | Streaming | Tool Calling | Embeddings | Best For |
|---|---|---|---|---|---|
| OpenAI | API key (header) | SSE | Yes | Yes | General-purpose, widest model selection |
| Anthropic | API key (x-api-key) | SSE | Yes | No | Long context, reasoning tasks |
| Gemini | API key (query param) | SSE | Yes | No | Google ecosystem, multimodal |
| Ollama | None (local) | NDJSON | Yes | Yes | Privacy-sensitive, offline, development |
| Bedrock | AWS IAM | AWS SDK | Yes | No | Enterprise AWS environments |
| OpenAI-Compatible | Varies | SSE | Varies | Varies | Cost optimization (Groq, DeepSeek, etc.) |
Deciding factors:
- Privacy & compliance — Ollama runs entirely locally; Bedrock keeps data within AWS
- Cost — Ollama is free; OpenAI-compatible providers (Groq, DeepSeek) offer competitive pricing
- Latency — Ollama has no network round-trip; Groq is optimized for speed
- Ecosystem — OpenAI has the most third-party integrations; Bedrock integrates with AWS services
Vector Store Selection Guide
| Store | Deployment | Managed | Filtering | Scaling | Best For |
|---|---|---|---|---|---|
| Qdrant | Self-hosted / Cloud | Yes (Qdrant Cloud) | Rich (payload filters) | Horizontal | General-purpose, production |
| pgvector | Self-hosted | Via managed Postgres | SQL WHERE clauses | Vertical | Teams already using PostgreSQL |
| Pinecone | Fully managed | Yes | Metadata filters | Automatic | Zero-ops, rapid prototyping |
| Chroma | Self-hosted / Docker | No | Metadata filters | Single node | Development, small-medium datasets |
| MongoDB Atlas | Fully managed | Yes | MQL filters | Automatic | Teams already using MongoDB |
| Elasticsearch | Self-hosted / Cloud | Yes (Elastic Cloud) | Full query DSL | Horizontal | Hybrid text + vector search |
| InMemory | In-process | N/A | None | N/A | Testing, prototyping |
Deciding factors:
- Existing infrastructure — Use pgvector if you have PostgreSQL, MongoDB Atlas if you use MongoDB, Elasticsearch if you already run an ES cluster
- Operational complexity — Pinecone and MongoDB Atlas are fully managed; Qdrant and Elasticsearch require cluster management
- Query capabilities — Elasticsearch excels at hybrid text + vector queries; Qdrant has the richest filtering
- Cost — InMemory and Chroma are free; pgvector reuses existing database infrastructure
Cache Selection Guide
| Cache | Persistence | Deployment | TTL Support | Best For |
|---|---|---|---|---|
| InMemory | No (process lifetime) | In-process | Yes | Testing, single-process apps |
| Redis | Yes (configurable) | External server | Yes | Multi-process, distributed |
| SQLite | Yes (file-based) | In-process | Yes | Single-machine persistence |
| Semantic | Depends on backing store | In-process | No | Fuzzy-match caching |
Complete RAG Pipeline Example
This example combines multiple integrations into a full retrieval-augmented generation pipeline with caching and reranking:
use synaptic::core::{ChatModel, ChatRequest, Message, Embeddings};
use synaptic::openai::{OpenAiChatModel, OpenAiConfig, OpenAiEmbeddings};
use synaptic::qdrant::{QdrantConfig, QdrantVectorStore};
use synaptic::cohere::{CohereReranker, CohereConfig};
use synaptic::cache::{CachedChatModel, InMemoryCache};
use synaptic::retrieval::ContextualCompressionRetriever;
use synaptic::splitters::RecursiveCharacterTextSplitter;
use synaptic::loaders::TextLoader;
use synaptic::vectorstores::VectorStoreRetriever;
use synaptic::models::HttpBackend;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new());
// 1. Set up embeddings
let embeddings = Arc::new(OpenAiEmbeddings::new(
OpenAiEmbeddings::config("text-embedding-3-small"),
backend.clone(),
));
// 2. Ingest documents into Qdrant
let loader = TextLoader::new("knowledge-base.txt");
let docs = loader.load().await?;
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
let chunks = splitter.split_documents(&docs)?;
let qdrant_config = QdrantConfig::new("http://localhost:6334", "knowledge", 1536);
let store = QdrantVectorStore::new(qdrant_config, embeddings.clone()).await?;
store.add_documents(&chunks).await?;
// 3. Build retriever with Cohere reranking
let base_retriever = Arc::new(VectorStoreRetriever::new(Arc::new(store)));
let reranker = CohereReranker::new(CohereConfig::new(std::env::var("COHERE_API_KEY")?));
let retriever = ContextualCompressionRetriever::new(base_retriever, Arc::new(reranker));
// 4. Wrap the LLM with a cache
let llm_config = OpenAiConfig::new(std::env::var("OPENAI_API_KEY")?, "gpt-4o");
let base_model = OpenAiChatModel::new(llm_config, backend.clone());
let cache = Arc::new(InMemoryCache::new());
let model = CachedChatModel::new(Arc::new(base_model), cache);
// 5. Retrieve and generate
let relevant = retriever.retrieve("How does Synaptic handle streaming?").await?;
let context = relevant.iter().map(|d| d.content.as_str()).collect::<Vec<_>>().join("\n\n");
let request = ChatRequest::new(vec![
Message::system(&format!("Answer based on the following context:\n\n{context}")),
Message::human("How does Synaptic handle streaming?"),
]);
let response = model.chat(request).await?;
println!("{}", response.message.content().unwrap_or_default());
This pipeline demonstrates:
- Qdrant for vector storage and retrieval
- Cohere for reranking retrieved documents
- InMemoryCache for caching LLM responses (swap with Redis/SQLite for persistence)
- OpenAI for both embeddings and chat completion
Adding a New Integration
To add a new integration:
- Create a new crate synaptic-{name} in crates/
- Depend on synaptic-core for trait definitions
- Implement the appropriate trait(s)
- Add a feature flag in the synaptic facade crate
- Re-export via pub use synaptic_{name} as {name} in the facade lib.rs
See Also
- Installation — Feature flag reference
- Architecture — Overall system design
Error Handling
Synaptic uses a single error enum, SynapticError, across the entire framework. Every async function returns Result<T, SynapticError>, and errors propagate naturally with the ? operator. This page explains the error model, the available variants, and the patterns for handling and recovering from errors.
SynapticError
#[derive(Debug, Error)]
pub enum SynapticError {
#[error("prompt error: {0}")] Prompt(String),
#[error("model error: {0}")] Model(String),
#[error("tool error: {0}")] Tool(String),
#[error("tool not found: {0}")] ToolNotFound(String),
#[error("memory error: {0}")] Memory(String),
#[error("rate limit: {0}")] RateLimit(String),
#[error("timeout: {0}")] Timeout(String),
#[error("validation error: {0}")] Validation(String),
#[error("parsing error: {0}")] Parsing(String),
#[error("callback error: {0}")] Callback(String),
#[error("max steps exceeded: {max_steps}")] MaxStepsExceeded { max_steps: usize },
#[error("embedding error: {0}")] Embedding(String),
#[error("vector store error: {0}")] VectorStore(String),
#[error("retriever error: {0}")] Retriever(String),
#[error("loader error: {0}")] Loader(String),
#[error("splitter error: {0}")] Splitter(String),
#[error("graph error: {0}")] Graph(String),
#[error("cache error: {0}")] Cache(String),
#[error("config error: {0}")] Config(String),
#[error("mcp error: {0}")] Mcp(String),
}
Twenty variants, one for each subsystem. The design is intentional:
- Single type everywhere: You never need to convert between error types. Any function in any crate can return SynapticError, and the caller can propagate it with ? without conversion.
- String payloads: Most variants carry a String message. This keeps the error type simple and avoids nested error hierarchies. The message provides context about what went wrong.
- thiserror derivation: SynapticError implements std::error::Error and Display automatically via the #[error(...)] attributes.
Variant Reference
Infrastructure Errors
| Variant | When It Occurs |
|---|---|
Model(String) | LLM provider returns an error, network failure, invalid response format |
RateLimit(String) | Provider rate limit exceeded, token bucket exhausted |
Timeout(String) | Request timed out |
Config(String) | Invalid configuration (missing API key, bad parameters) |
Input/Output Errors
| Variant | When It Occurs |
|---|---|
Prompt(String) | Template variable missing, invalid template syntax |
Validation(String) | Input fails validation (e.g., empty message list, invalid schema) |
Parsing(String) | Output parser cannot extract structured data from LLM response |
Tool Errors
| Variant | When It Occurs |
|---|---|
Tool(String) | Tool execution failed (network error, computation error, etc.) |
ToolNotFound(String) | Requested tool name is not in the registry |
Subsystem Errors
| Variant | When It Occurs |
|---|---|
Memory(String) | Memory store read/write failure |
Callback(String) | Callback handler raised an error |
Embedding(String) | Embedding API failure |
VectorStore(String) | Vector store read/write failure |
Retriever(String) | Retrieval operation failed |
Loader(String) | Document loading failed (file not found, parse error) |
Splitter(String) | Text splitting failed |
Cache(String) | Cache read/write failure |
Execution Control Errors
| Variant | When It Occurs |
|---|---|
Graph(String) | Graph execution error (compilation, routing, missing nodes) |
MaxStepsExceeded { max_steps } | Agent loop exceeded the maximum iteration count |
Mcp(String) | MCP server connection, transport, or protocol error |
Error Propagation
Because every async function in Synaptic returns Result<T, SynapticError>, errors propagate naturally:
async fn process_query(model: &dyn ChatModel, query: &str) -> Result<String, SynapticError> {
let messages = vec![Message::human(query)];
let request = ChatRequest::new(messages);
let response = model.chat(request).await?; // Model error propagates
Ok(response.message.content().to_string())
}
There is no need for .map_err() conversions in application code. A Model error from a provider adapter, a Tool error from execution, or a Graph error from the state machine all flow through the same Result type.
Retry and Fallback Patterns
Not all errors are fatal. Synaptic provides several mechanisms for resilience:
RetryChatModel
Wraps a ChatModel and retries on transient failures:
use synaptic::models::RetryChatModel;
let robust_model = RetryChatModel::new(model, max_retries, delay);
On failure, it waits and retries up to max_retries times. This handles transient network errors and rate limits without application code needing to implement retry logic.
RateLimitedChatModel and TokenBucketChatModel
Proactively prevent rate limit errors by throttling requests:
- RateLimitedChatModel limits requests per time window.
- TokenBucketChatModel uses a token bucket algorithm for smooth rate limiting.
By throttling before hitting the provider's limit, these wrappers convert potential RateLimit errors into controlled delays.
RunnableWithFallbacks
Tries alternative runnables when the primary one fails:
use synaptic::runnables::RunnableWithFallbacks;
let chain = RunnableWithFallbacks::new(
primary.boxed(),
vec![fallback_1.boxed(), fallback_2.boxed()],
);
If primary fails, fallback_1 is tried with the same input. If that also fails, fallback_2 is tried. Only if all options fail does the error propagate.
RunnableRetry
Retries a runnable with configurable exponential backoff:
use std::time::Duration;
use synaptic::runnables::{RunnableRetry, RetryPolicy};
let retry = RunnableRetry::new(
flaky_step.boxed(),
RetryPolicy::default()
.with_max_attempts(4)
.with_base_delay(Duration::from_millis(200))
.with_max_delay(Duration::from_secs(5)),
);
The delay doubles after each attempt (200ms, 400ms, 800ms, ...) up to max_delay. You can also set a retry_on predicate to only retry specific error types. This is useful for any step in an LCEL chain, not just model calls.
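If only transient failures should be retried, the predicate might look like the following sketch; the with_retry_on builder name is an assumption based on the retry_on description above, while the matched variants come from SynapticError:
let policy = RetryPolicy::default()
    .with_max_attempts(4)
    .with_base_delay(Duration::from_millis(200))
    // Builder name assumed: retry only rate limits and timeouts, fail fast otherwise.
    .with_retry_on(|e: &SynapticError| matches!(e, SynapticError::RateLimit(_) | SynapticError::Timeout(_)));
let retry = RunnableRetry::new(flaky_step.boxed(), policy);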
HandleErrorTool
Wraps a tool so that errors are returned as string results instead of propagating:
use synaptic::tools::HandleErrorTool;
let safe_tool = HandleErrorTool::new(risky_tool);
When the inner tool fails, the error message becomes the tool's output. The LLM sees the error and can decide to retry with different arguments or take a different approach. This prevents a single tool failure from crashing the entire agent loop.
Graph Interrupts (Not Errors)
Human-in-the-loop interrupts in the graph system are not errors. Graph invoke() returns GraphResult<S>, which is either Complete(state) or Interrupted { state, interrupt_value }:
use synaptic::graph::GraphResult;
match graph.invoke(state).await? {
GraphResult::Complete(final_state) => {
// Graph finished normally
handle_result(final_state);
}
GraphResult::Interrupted { state: partial_state, .. } => {
// Human-in-the-loop: inspect state, get approval, resume
// The graph has checkpointed its state automatically
}
}
To extract the state regardless of completion status, use .into_state():
let state = graph.invoke(initial).await?.into_state();
Interrupts can also be triggered programmatically via Command::interrupt() from within a node:
use synaptic::graph::Command;
// Inside a node's process() method:
Command::interrupt(updated_state)
SynapticError::Graph is reserved for true errors: compilation failures, missing nodes, routing errors, and recursion limit violations.
Matching on Error Variants
Since SynapticError is an enum, you can match on specific variants to implement targeted error handling:
match result {
Ok(value) => use_value(value),
Err(SynapticError::RateLimit(_)) => {
// Wait and retry
}
Err(SynapticError::ToolNotFound(name)) => {
// Log the missing tool and continue without it
}
Err(SynapticError::Parsing(msg)) => {
// LLM output was malformed; ask the model to try again
}
Err(e) => {
// All other errors: propagate
return Err(e);
}
}
This pattern is especially useful in agent loops where some errors are recoverable (the model can try again) and others are not (network is down, API key is invalid).
See Also
- Retry & Rate Limiting -- automatic retry for model errors
- Fallbacks -- fallback chains for error recovery
- Interrupt & Resume -- graph interrupts (not errors)
API Reference
Synaptic is organized as a workspace of focused crates. Each crate has its own API documentation generated from doc comments in the source code.
Crate Reference
| Crate | Description | Docs |
|---|---|---|
synaptic-core | Shared traits and types (ChatModel, Tool, Message, SynapticError, etc.) | docs.rs |
synaptic-models | ProviderBackend abstraction, ScriptedChatModel test double, wrappers (retry, rate limit, structured output, bound tools) | docs.rs |
synaptic-openai | OpenAI provider (OpenAiChatModel, OpenAiEmbeddings) | docs.rs |
synaptic-anthropic | Anthropic provider (AnthropicChatModel) | docs.rs |
synaptic-gemini | Google Gemini provider (GeminiChatModel) | docs.rs |
synaptic-ollama | Ollama provider (OllamaChatModel, OllamaEmbeddings) | docs.rs |
synaptic-runnables | LCEL composition (Runnable trait, BoxRunnable, pipe operator, parallel, branch, fallbacks, assign, pick) | docs.rs |
synaptic-prompts | Prompt templates (PromptTemplate, ChatPromptTemplate, FewShotChatMessagePromptTemplate) | docs.rs |
synaptic-parsers | Output parsers (string, JSON, structured, list, enum, boolean, XML, fixing, retry) | docs.rs |
synaptic-tools | Tool system (ToolRegistry, SerialToolExecutor, ParallelToolExecutor) | docs.rs |
synaptic-memory | Memory strategies (buffer, window, summary, token buffer, summary buffer, RunnableWithMessageHistory) | docs.rs |
synaptic-callbacks | Callback handlers (RecordingCallback, TracingCallback, CompositeCallback) | docs.rs |
synaptic-retrieval | Retriever implementations (in-memory, BM25, multi-query, ensemble, contextual compression, self-query, parent document) | docs.rs |
synaptic-loaders | Document loaders (text, JSON, CSV, directory, file, markdown, web) | docs.rs |
synaptic-splitters | Text splitters (character, recursive character, markdown header, token, HTML header, language) | docs.rs |
synaptic-embeddings | Embeddings trait, FakeEmbeddings, CacheBackedEmbeddings | docs.rs |
synaptic-vectorstores | Vector store implementations (InMemoryVectorStore, VectorStoreRetriever, MultiVectorRetriever) | docs.rs |
synaptic-qdrant | Qdrant vector store (QdrantVectorStore) | docs.rs |
synaptic-pgvector | PostgreSQL pgvector store (PgVectorStore) | docs.rs |
synaptic-redis | Redis store and cache (RedisStore, RedisCache) | docs.rs |
synaptic-pdf | PDF document loader (PdfLoader) | docs.rs |
synaptic-graph | Graph orchestration (StateGraph, CompiledGraph, ToolNode, create_react_agent, checkpointing, streaming) | docs.rs |
synaptic-cache | LLM caching (InMemoryCache, SemanticCache, CachedChatModel) | docs.rs |
synaptic-eval | Evaluation framework (exact match, regex, JSON validity, embedding distance, LLM judge evaluators; Dataset and evaluate()) | docs.rs |
synaptic | Unified facade crate that re-exports all sub-crates under a single namespace | docs.rs |
Note: The docs.rs links above will become active once the crates are published to crates.io. In the meantime, generate local documentation as described below.
Local API Documentation
You can generate and browse the full API documentation locally with:
cargo doc --workspace --open
This builds rustdoc for every crate in the workspace and opens the result in your browser. The generated documentation includes all public types, traits, functions, and their doc comments.
To generate docs without opening the browser (useful in CI):
cargo doc --workspace --no-deps
Using the Facade Crate
If you prefer a single dependency instead of listing individual crates, use the synaptic facade:
[dependencies]
synaptic = "0.2"
Then import through the unified namespace:
use synaptic::core::Message;
use synaptic::openai::OpenAiChatModel; // requires "openai" feature
use synaptic::models::ScriptedChatModel; // requires "model-utils" feature
use synaptic::graph::create_react_agent;
use synaptic::runnables::Runnable;
Contributing
Thank you for your interest in contributing to Synaptic. This guide covers the workflow and standards for submitting changes.
Getting Started
- Fork the repository on GitHub.
- Clone your fork locally:
git clone https://github.com/<your-username>/synaptic.git
cd synaptic
- Create a branch for your changes:
git checkout -b feature/my-change
Development Workflow
Before submitting a pull request, make sure all checks pass locally.
Run Tests
cargo test --workspace
All tests must pass. If you are adding a new feature, add tests for it in the appropriate tests/ directory within the crate.
Run Clippy
cargo clippy --workspace
Fix any warnings. Clippy enforces idiomatic Rust patterns and catches common mistakes.
Check Formatting
cargo fmt --all -- --check
If this fails, run cargo fmt --all to auto-format and commit the result.
Build the Workspace
cargo build --workspace
Ensure everything compiles cleanly.
Submitting a Pull Request
- Push your branch to your fork.
- Open a pull request against the
main branch.
- Provide a clear description of what your change does and why.
- Reference any related issues.
Guidelines
Code
- Follow existing patterns in the codebase. Each crate has a consistent structure with
src/for implementation andtests/for integration tests. - All traits are async via
#[async_trait]. Tests use#[tokio::test]. - Use
Arc<RwLock<_>>for shared registries andArc<tokio::sync::Mutex<_>>for callbacks and memory. - Prefer factory methods over struct literals for core types (e.g.,
Message::human(),ChatRequest::new()).
Documentation
- When adding a new feature or changing a public API, update the corresponding documentation page in
docs/book/en/src/. - How-to guides go in
how-to/, conceptual explanations inconcepts/, and step-by-step walkthroughs intutorials/. - If your change affects the project overview, update the README at the repository root.
Tests
- Each crate has a
tests/directory with integration-style tests in separate files. - Use
ScriptedChatModelorFakeBackendfor testing model interactions without real API calls. - Use
FakeEmbeddingsfor testing embedding-dependent features.
Commit Messages
- Write clear, concise commit messages that explain the "why" behind the change.
- Use conventional prefixes when appropriate:
feat:,fix:,docs:,refactor:,test:.
Project Structure
The workspace contains 17 library crates in crates/ plus example binaries in examples/. See Architecture Overview for a detailed breakdown of the crate layers and dependency graph.
Questions
If you are unsure about an approach, open an issue to discuss before writing code. This helps avoid wasted effort and keeps changes aligned with the project direction.
Development Setup
This page covers everything you need to build, test, and run Synaptic locally.
Prerequisites
-
Rust 1.88 or later -- Install via rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Verify with:
rustc --version   # Should print 1.88.0 or later
cargo --version
cargo -- Included with the Rust toolchain. No separate install needed.
Clone the Repository
git clone https://github.com/<your-username>/synaptic.git
cd synaptic
Build
Build every crate in the workspace:
cargo build --workspace
Test
Run All Tests
cargo test --workspace
This runs unit tests and integration tests across all 17 library crates.
Test a Single Crate
cargo test -p synaptic-tools
Replace synaptic-tools with any crate name from the workspace.
Run a Specific Test by Name
cargo test -p synaptic-core -- chunk
This runs only tests whose names contain "chunk" within the synaptic-core crate.
Run Examples
The examples/ directory contains runnable binaries that demonstrate common patterns:
cargo run -p react_basic
List all available example targets with:
ls examples/
Lint
Run Clippy to catch common mistakes and enforce idiomatic patterns:
cargo clippy --workspace
Fix any warnings before submitting changes.
Format
Check that all code follows the standard Rust formatting:
cargo fmt --all -- --check
If this fails, auto-format with:
cargo fmt --all
Pre-commit Hook
The repository ships a pre-commit hook that runs cargo fmt --check automatically before each commit. Enable it once after cloning:
git config core.hooksPath .githooks
If formatting fails the hook will run cargo fmt --all for you — just re-stage the changes and commit again.
Build Documentation Locally
API Docs (rustdoc)
Generate and open the full API reference in your browser:
cargo doc --workspace --open
mdBook Site
The documentation site is built with mdBook. Install it and serve the English docs locally:
cargo install mdbook
mdbook serve docs/book/en
This starts a local server (typically at http://localhost:3000) with live reload. Edit any .md file under docs/book/en/src/ and the browser will update automatically.
To build the book without serving:
mdbook build docs/book/en
The output is written to docs/book/en/book/.
Editor Setup
Synaptic is a standard Cargo workspace. Any editor with rust-analyzer support will provide inline errors, completions, and go-to-definition across all crates. Recommended:
- VS Code with the rust-analyzer extension
- IntelliJ IDEA with the Rust plugin
- Neovim with rust-analyzer via LSP
Environment Variables
Some provider adapters require API keys at runtime (not at build time):
| Variable | Used by |
|---|---|
OPENAI_API_KEY | OpenAiChatModel, OpenAiEmbeddings |
ANTHROPIC_API_KEY | AnthropicChatModel |
GOOGLE_API_KEY | GeminiChatModel |
These are only needed when running examples or tests that hit real provider APIs. The test suite uses ScriptedChatModel, FakeBackend, and FakeEmbeddings for offline testing, so you can run cargo test --workspace without any API keys.