Together AI

Together AI provides access to leading open-source models (Llama, DeepSeek, Qwen, Mixtral) via an OpenAI-compatible API. It offers serverless inference at competitive prices, making it ideal for production workloads that require state-of-the-art open models.

Together AI is available as a compatibility submodule inside synaptic-models. No separate crate is needed.

Setup

[dependencies]
synaptic = { version = "0.4", features = ["openai"] }

Sign up at api.together.xyz to obtain an API key.
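Hard-coding the key as in the snippets below is fine for experiments, but for anything shared it is safer to read it from the environment. A minimal std-only sketch, assuming the conventional variable name TOGETHER_API_KEY (this name is a convention used here, not something synaptic requires):

```rust
use std::env;

/// Read the Together AI key from the environment instead of hard-coding it.
/// `TOGETHER_API_KEY` is an assumed convention, not required by synaptic.
fn together_api_key() -> Result<String, env::VarError> {
    env::var("TOGETHER_API_KEY")
}

fn main() {
    match together_api_key() {
        Ok(key) => println!("loaded key ({} chars)", key.len()),
        Err(_) => eprintln!("TOGETHER_API_KEY is not set"),
    }
}
```

The resulting String can be passed wherever the examples below use "your-api-key".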

Configuration

use synaptic::openai::compat::together::{self, TogetherModel};
use synaptic::models::HttpBackend;
use std::sync::Arc;

let model = together::chat_model(
    "your-api-key",
    TogetherModel::Llama3_3_70bInstructTurbo.to_string(),
    Arc::new(HttpBackend::new()),
);

Builder methods

Use OpenAiConfig builder methods for customization:

use synaptic::openai::compat::together::{self, TogetherModel};
use synaptic::openai::OpenAiChatModel;
use synaptic::models::HttpBackend;
use std::sync::Arc;

let config = together::config("your-api-key", TogetherModel::Llama3_3_70bInstructTurbo.to_string())
    .with_temperature(0.7)
    .with_max_tokens(2048)
    .with_top_p(0.9)
    .with_stop(vec!["</s>".to_string()]);

let model = OpenAiChatModel::new(config, Arc::new(HttpBackend::new()));

For models not covered by the enum, pass the model ID as a string directly:

let model = together::chat_model(
    "your-api-key",
    "custom-org/custom-model-v1",
    Arc::new(HttpBackend::new()),
);

Available Models

Enum Variant               | API Model ID                                  | Best For
Llama3_3_70bInstructTurbo  | meta-llama/Llama-3.3-70B-Instruct-Turbo       | General purpose (recommended)
Llama3_1_8bInstructTurbo   | meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo   | Fast, cost-effective
Llama3_1_405bInstructTurbo | meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | Maximum quality
DeepSeekR1                 | deepseek-ai/DeepSeek-R1                       | Reasoning tasks
Qwen2_5_72bInstructTurbo   | Qwen/Qwen2.5-72B-Instruct-Turbo               | Multilingual
Mixtral8x7bInstruct        | mistralai/Mixtral-8x7B-Instruct-v0.1          | Long-context MoE
Custom(String)             | (any)                                         | Unlisted / preview models
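The enum variants above resolve to the API model ID strings in the table, and those raw IDs can also be passed directly. As a small illustrative sketch (this helper is not part of synaptic; only the ID strings are real), picking an ID by task:

```rust
/// Illustrative only: map a task label to a Together model ID from the
/// table above. Not part of synaptic; the ID strings are the real API IDs.
fn model_for(task: &str) -> &'static str {
    match task {
        "reasoning" => "deepseek-ai/DeepSeek-R1",
        "multilingual" => "Qwen/Qwen2.5-72B-Instruct-Turbo",
        "fast" => "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        "max-quality" => "meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",
        // General-purpose default, per the table's recommendation.
        _ => "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    }
}

fn main() {
    println!("{}", model_for("reasoning")); // deepseek-ai/DeepSeek-R1
}
```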

Usage

use synaptic::openai::compat::together::{self, TogetherModel};
use synaptic::core::{ChatModel, ChatRequest, Message};
use synaptic::models::HttpBackend;
use std::sync::Arc;

let model = together::chat_model(
    "your-api-key",
    TogetherModel::Llama3_3_70bInstructTurbo.to_string(),
    Arc::new(HttpBackend::new()),
);

let request = ChatRequest::new(vec![
    Message::system("You are a concise assistant."),
    Message::human("What is Rust famous for?"),
]);

let response = model.chat(request).await?;
println!("{}", response.message.content());

Streaming

use futures::StreamExt;

let request = ChatRequest::new(vec![
    Message::human("Explain Rust's ownership model in 3 sentences."),
]);

let mut stream = model.stream_chat(request);
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    print!("{}", chunk.content);
}
println!();

Error Handling

use synaptic::core::SynapticError;

match model.chat(request).await {
    Ok(response) => println!("{}", response.message.content()),
    Err(SynapticError::RateLimit(msg)) => eprintln!("Rate limited: {}", msg),
    Err(e) => return Err(e.into()),
}
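On a rate-limit error, a common pattern is to retry with exponential backoff rather than fail immediately. A generic, synchronous sketch of that pattern, deliberately independent of synaptic's types (real code would match only on SynapticError::RateLimit and use an async-aware sleep such as tokio's):

```rust
use std::thread::sleep;
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay after each
/// failure. Generic over the operation's result; not tied to synaptic.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay *= 2;
                attempt += 1;
            }
        }
    }
}

fn main() {
    // Simulated operation that is "rate limited" twice, then succeeds.
    let mut calls = 0;
    let result = retry_with_backoff(5, Duration::from_millis(1), || {
        calls += 1;
        if calls < 3 { Err("rate limited") } else { Ok(calls) }
    });
    println!("{:?}", result); // Ok(3)
}
```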

Configuration Reference

All configuration is done through OpenAiConfig builder methods. See the OpenAI-Compatible Providers page for the full reference.

Method                  | Description
.with_temperature(f64)  | Sampling temperature (0.0-2.0)
.with_max_tokens(u32)   | Maximum tokens to generate
.with_top_p(f64)        | Nucleus sampling threshold
.with_stop(Vec<String>) | Stop sequences