RAG with Vector Store
Retrieval-Augmented Generation (RAG) is a technique for grounding AI models in your own documents. Rather than relying only on knowledge learned during training, a RAG pipeline retrieves related documents (typically stored in a vector store) at query time and adds them to the prompt. This helps the model produce more accurate, up-to-date, and relevant answers.
In this example, we’ll walk through a complete RAG workflow: how to build a vector store (VectorStore) and integrate it with the Agent.
Initializing a Vector Store
Ailoy simplifies the construction of RAG pipelines through its built-in VectorStore component, which works alongside the Agent.
To initialize a vector store:
- Python
- JavaScript(Node)
from ailoy import Runtime, VectorStore
rt = Runtime()
with VectorStore(rt, "BAAI/bge-m3", "faiss") as vs:
    ...
import { startRuntime, defineVectorStore } from "ailoy-node";
const rt = await startRuntime();
const vs = await defineVectorStore(rt, "BAAI/bge-m3", "faiss");
Ailoy currently supports both FAISS and ChromaDB as vector store backends. Refer to the official configuration guide for backend-specific options.
💡 Note: At this time, the only supported embedding model is BAAI/bge-m3. Additional embedding models will be supported in future releases.
Inserting Documents into the Vector Store
You can insert text along with optional metadata into the vector store:
- Python
- JavaScript(Node)
vs.insert(
    "Ailoy is a lightweight library for building AI applications",
    metadata={"topic": "Ailoy"},
)
await vs.insert({
  document: "Ailoy is a lightweight library for building AI applications",
  metadata: {
    topic: "Ailoy",
  },
});
In practice, you should split large documents into smaller chunks before inserting them. This improves retrieval quality. You may use any text-splitting tool (e.g., LangChain), or utilize Ailoy’s low-level runtime API for text splitting. (See Calling Low-Level APIs for more details.)
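Chunking itself needs no special tooling. As a minimal sketch, a simple paragraph-merging splitter (a hypothetical helper written here for illustration, not part of Ailoy) might look like this:

```python
def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Split text on blank lines, merging adjacent paragraphs up to max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Start a new chunk when adding this paragraph would exceed the limit
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

doc = "First paragraph about Ailoy.\n\nSecond paragraph with more detail."
for i, chunk in enumerate(split_into_chunks(doc, max_chars=40)):
    # Each chunk would then be inserted individually, e.g.:
    # vs.insert(chunk, metadata={"source": "intro.md", "chunk": i})
    print(chunk)
```

Inserting each chunk separately keeps retrieval granular, and per-chunk metadata (such as the source file or chunk index shown above) makes it easier to trace answers back to their origin.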
Retrieving Relevant Documents
To retrieve documents similar to a given query:
- Python
- JavaScript(Node)
query = "What is Ailoy?"
items = vs.retrieve(query, top_k=5)
const query = "What is Ailoy?";
const items = await vs.retrieve(query, 5);
This returns a list of VectorStoreRetrieveItem instances representing the most relevant chunks, ranked by similarity. The number of results is controlled via the top_k parameter (default is 5).
Constructing an Augmented Prompt
Once documents are retrieved, you can construct a context-enriched prompt as follows:
- Python
- JavaScript(Node)
prompt = f"""
Based on the provided context, answer the user's question.
Context: {[item.document for item in items]}
Question: {query}
"""
const prompt = `
Based on the provided context, answer the user's question.
Context: ${items.map((item) => item.document)}
Question: ${query}
`;
You can then pass this prompt to the agent for inference:
- Python
- JavaScript(Node)
for resp in agent.query(prompt):
    agent.print(resp)
for await (const resp of agent.query(prompt)) {
  agent.print(resp);
}
Complete Example
- Python
- JavaScript(Node)
from ailoy import Runtime, Agent, VectorStore

# Initialize Runtime
rt = Runtime()

# Initialize Agent and VectorStore
with Agent(rt, "Qwen/Qwen3-8B") as agent, VectorStore(rt, "BAAI/bge-m3", "faiss") as vs:
    # Insert items
    vs.insert(
        "Ailoy is a lightweight library for building AI applications",
        metadata={"topic": "Ailoy"},
    )

    # Search the most relevant items
    query = "What is Ailoy?"
    items = vs.retrieve(query, top_k=5)

    # Augment user query
    prompt = f"""
Based on the provided context, answer the user's question.
Context: {[item.document for item in items]}
Question: {query}
"""

    # Invoke agent
    for resp in agent.query(prompt):
        agent.print(resp)
import { startRuntime, defineAgent, defineVectorStore } from "ailoy-node";

async function main() {
  // Initialize Runtime
  const rt = await startRuntime();

  // Initialize Agent
  const agent = await defineAgent(rt, "Qwen/Qwen3-8B");

  // Initialize VectorStore
  const vs = await defineVectorStore(rt, "BAAI/bge-m3", "faiss");

  // Insert items
  await vs.insert({
    document: "Ailoy is a lightweight library for building AI applications",
    metadata: { topic: "Ailoy" },
  });

  // Search the most relevant items
  const query = "What is Ailoy?";
  const items = await vs.retrieve(query, 5);

  // Augment user query
  const prompt = `
Based on the provided context, answer the user's question.
Context: ${items.map((item) => item.document)}
Question: ${query}
`;

  // Invoke agent
  for await (const resp of agent.query(prompt)) {
    agent.print(resp);
  }

  // Delete agent
  await agent.delete();
}

main();
For best results, ensure your documents are chunked semantically (e.g., by paragraphs or sections).
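One simple way to chunk semantically is to split on section headings. The sketch below (a hypothetical markdown-aware splitter, not part of Ailoy) produces one chunk per heading-delimited section:

```python
import re

def split_by_sections(markdown: str) -> list[str]:
    """Split a markdown document into one chunk per heading-delimited section."""
    sections, current = [], []
    for line in markdown.splitlines():
        # A new heading closes the previous section (if any)
        if re.match(r"^#{1,6} ", line) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return sections

doc = "# Intro\nAiloy overview.\n## Usage\nHow to use it."
print(split_by_sections(doc))
```

Each resulting section stays self-contained (a heading plus its body), which tends to retrieve better than fixed-size windows that cut across topic boundaries.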