Retry & Rate Limiting

This guide shows how to add automatic retry logic and rate limiting to any ChatModel.

Retry with RetryChatModel

RetryChatModel wraps a model and automatically retries on transient failures (rate limit errors and timeouts). It uses exponential backoff between attempts.

use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::{RetryChatModel, RetryPolicy};

// `model` is any existing ChatModel implementation.
let base_model: Arc<dyn ChatModel> = Arc::new(model);

// Use default policy: 3 attempts, 500ms base delay
let retry_model = RetryChatModel::new(base_model, RetryPolicy::default());
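
Because every wrapper in this guide implements ChatModel, the retrying model is a drop-in replacement wherever the base model was used:

let reliable: Arc<dyn ChatModel> = Arc::new(retry_model);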

Custom retry policy

Configure the maximum number of attempts and the base delay for exponential backoff:

use std::time::Duration;
use synaptic::models::RetryPolicy;

let policy = RetryPolicy {
    max_attempts: 5,                         // Try up to 5 times
    base_delay: Duration::from_millis(200),  // Start with 200ms delay
};

let retry_model = RetryChatModel::new(base_model, policy);

The delay between retries follows exponential backoff: base_delay * 2^attempt, where attempt is counted from zero. With a 200ms base delay:

Attempt      Delay before retry
1st retry    200ms
2nd retry    400ms
3rd retry    800ms
4th retry    1600ms
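
As a quick illustration of the formula (not library code), the delay can be computed like this, with attempt counted from zero so the first retry waits base_delay itself:

use std::time::Duration;

// Illustrative sketch of the backoff formula above.
fn backoff_delay(base_delay: Duration, attempt: u32) -> Duration {
    base_delay * 2u32.pow(attempt)
}

assert_eq!(
    backoff_delay(Duration::from_millis(200), 1),
    Duration::from_millis(400), // matches the "2nd retry" row
);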

Only retryable errors trigger retries:

  • SynapticError::RateLimit -- the provider returned a rate limit response.
  • SynapticError::Timeout -- the request timed out.

All other errors are returned immediately without retrying.
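
To make that decision concrete, here is a minimal sketch of the classification logic. The simplified error enum below is hypothetical; the real SynapticError may carry payloads and more variants:

// Hypothetical, simplified mirror of the error type for illustration.
enum SynapticError {
    RateLimit,
    Timeout,
    Other(String),
}

// Only rate limits and timeouts are worth retrying; everything
// else is surfaced to the caller immediately.
fn is_retryable(err: &SynapticError) -> bool {
    matches!(err, SynapticError::RateLimit | SynapticError::Timeout)
}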

Streaming with retry

RetryChatModel also retries stream_chat() calls. If a retryable error occurs during streaming, the entire stream is retried from the beginning.

Concurrency limiting with RateLimitedChatModel

RateLimitedChatModel uses a semaphore to limit the number of concurrent requests to the underlying model:

use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::RateLimitedChatModel;

let base_model: Arc<dyn ChatModel> = Arc::new(model);

// Allow at most 5 concurrent requests
let limited = RateLimitedChatModel::new(base_model, 5);

When the concurrency limit is reached, additional callers wait until a slot becomes available. This is useful for:

  • Respecting provider concurrency limits.
  • Preventing resource exhaustion in high-throughput applications.
  • Controlling costs by limiting parallel API calls.
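
For intuition, the waiting behavior can be sketched with tokio::sync::Semaphore. This is an illustration of the mechanism, not the library's actual internals:

use std::sync::Arc;
use tokio::sync::Semaphore;

// Sketch: each request must hold a permit while it runs.
async fn limited_call(semaphore: Arc<Semaphore>) {
    // Waits here (asynchronously) when all permits are taken,
    // i.e. the concurrency limit is reached.
    let _permit = semaphore.acquire().await.expect("semaphore closed");
    // ... forward the request to the underlying model ...
    // The permit is released when `_permit` is dropped, freeing a slot.
}

Here, Semaphore::new(5) would correspond to the concurrency limit of 5 in the example above.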

Token bucket rate limiting with TokenBucketChatModel

TokenBucketChatModel uses a token bucket algorithm for smoother rate limiting. The bucket starts full and refills at a steady rate:

use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::TokenBucketChatModel;

let base_model: Arc<dyn ChatModel> = Arc::new(model);

// Bucket capacity: 100 tokens, refill rate: 10 tokens/second
let throttled = TokenBucketChatModel::new(base_model, 100.0, 10.0);

Each chat() or stream_chat() call consumes one token from the bucket. When the bucket is empty, callers wait until a token is refilled.

Parameters:

  • capacity -- the maximum burst size. A capacity of 100 allows 100 rapid-fire requests before throttling kicks in.
  • refill_rate -- tokens added per second. A rate of 10.0 means the bucket refills at 10 tokens per second.
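
The refill logic is straightforward; the following single-threaded sketch shows the algorithm (illustrative only, not the library's implementation):

use std::time::Instant;

// Minimal token bucket sketch.
struct TokenBucket {
    capacity: f64,    // maximum burst size
    refill_rate: f64, // tokens added per second
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    fn try_acquire(&mut self) -> bool {
        // Top up based on elapsed time, capped at capacity.
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_rate).min(self.capacity);
        self.last_refill = now;
        // Each call consumes one token; an empty bucket means waiting
        // (here: returning false) until the next refill.
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}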

Token bucket vs concurrency limiting

Feature          RateLimitedChatModel                   TokenBucketChatModel
Controls         Concurrent requests                    Request rate over time
Mechanism        Semaphore                              Token bucket
Burst handling   Blocks when N requests are in-flight   Allows bursts up to capacity
Best for         Concurrency limits                     Rate limits (requests/second)

Stacking wrappers

All wrappers implement ChatModel, so they compose naturally. A common pattern is retry on the outside, rate limiting on the inside:

use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::{RetryChatModel, RetryPolicy, TokenBucketChatModel};

let base_model: Arc<dyn ChatModel> = Arc::new(model);

// First, apply rate limiting
let throttled: Arc<dyn ChatModel> = Arc::new(
    TokenBucketChatModel::new(base_model, 50.0, 5.0)
);

// Then, add retry on top
let reliable = RetryChatModel::new(throttled, RetryPolicy::default());

This ensures that retried requests also go through the rate limiter, preventing retry storms from overwhelming the provider.
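
The same composition extends to all three wrappers. For example, concurrency limiting innermost, token-bucket throttling in the middle, and retry outermost:

use std::sync::Arc;
use synaptic::core::ChatModel;
use synaptic::models::{
    RateLimitedChatModel, RetryChatModel, RetryPolicy, TokenBucketChatModel,
};

let base_model: Arc<dyn ChatModel> = Arc::new(model);

// Innermost: at most 5 requests in flight at once.
let limited: Arc<dyn ChatModel> =
    Arc::new(RateLimitedChatModel::new(base_model, 5));

// Middle: 5 requests/second on average, bursts up to 50.
let throttled: Arc<dyn ChatModel> =
    Arc::new(TokenBucketChatModel::new(limited, 50.0, 5.0));

// Outermost: retries pass back through both limiters.
let reliable = RetryChatModel::new(throttled, RetryPolicy::default());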