Multi-Query Retriever
This guide shows how to use the MultiQueryRetriever to improve retrieval recall by generating multiple query perspectives with an LLM.
Overview
A single search query may not capture all relevant documents, especially when the user's phrasing does not match the vocabulary in the document corpus. The MultiQueryRetriever addresses this by:
- Using a
ChatModelto generate alternative phrasings of the original query. - Running each query variant through a base retriever.
- Deduplicating and merging the results.
This technique improves recall by overcoming limitations of distance-based similarity search.
Basic usage
use std::sync::Arc;
use synaptic::retrieval::{MultiQueryRetriever, Retriever};
let base_retriever: Arc<dyn Retriever> = Arc::new(/* any retriever */);
let model: Arc<dyn ChatModel> = Arc::new(/* any ChatModel */);
// Default: generates 3 query variants
let retriever = MultiQueryRetriever::new(base_retriever, model);
let results = retriever.retrieve("What are the benefits of Rust?", 5).await?;
When you call retrieve(), the retriever:
- Sends a prompt to the LLM asking it to rephrase the query into 3 different versions.
- Runs the original query plus all generated variants through the base retriever.
- Collects all results, deduplicates by document
id, and returns up totop_kdocuments.
Custom number of query variants
Specify a different number of generated queries:
let retriever = MultiQueryRetriever::with_num_queries(
base_retriever,
model,
5, // Generate 5 query variants
);
More variants increase recall but also increase the number of LLM and retriever calls.
How it works internally
The retriever sends a prompt like this to the LLM:
You are an AI language model assistant. Your task is to generate 3 different versions of the given user question to retrieve relevant documents from a vector database. By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of distance-based similarity search. Provide these alternative questions separated by newlines. Only output the questions, nothing else.
Original question: What are the benefits of Rust?
The LLM might respond with:
Why should I use Rust as a programming language?
What advantages does Rust offer over other languages?
What makes Rust a good choice for software development?
Each of these queries is then run through the base retriever, and all results are merged with deduplication.
Example with a vector store
use std::sync::Arc;
use synaptic::retrieval::{MultiQueryRetriever, Retriever};
use synaptic::vectorstores::{InMemoryVectorStore, VectorStoreRetriever, VectorStore};
use synaptic::embeddings::FakeEmbeddings;
use synaptic::retrieval::Document;
// Set up vector store
let embeddings = Arc::new(FakeEmbeddings::new(128));
let store = Arc::new(InMemoryVectorStore::new());
let docs = vec![
Document::new("1", "Rust ensures memory safety without a garbage collector"),
Document::new("2", "Rust's ownership system prevents data races at compile time"),
Document::new("3", "Go uses goroutines for lightweight concurrency"),
];
store.add_documents(docs, embeddings.as_ref()).await?;
// Wrap vector store as a retriever
let base = Arc::new(VectorStoreRetriever::new(store, embeddings, 5));
// Create multi-query retriever
let model: Arc<dyn ChatModel> = Arc::new(/* your model */);
let retriever = MultiQueryRetriever::new(base, model);
let results = retriever.retrieve("Why is Rust safe?", 5).await?;
// May find documents that mention "memory safety", "ownership", "data races"
// even if the original query doesn't use those exact terms