RAG Using Documents
Retrieval-Augmented Generation (RAG) enables an AI model to use your own documents as part of its reasoning process. Instead of relying solely on information learned during model training, RAG allows the model to retrieve relevant knowledge dynamically from external sources.
It enables the AI to generate more accurate, up-to-date, and context-aware responses by grounding its answers in your document data.
In Ailoy, RAG is composed of three core components in addition to the Agent:
- EmbeddingModel — converts text into vector representations for similarity search.
- VectorStore — stores and retrieves those vectors along with their source text.
- Knowledge — a runtime component that performs retrieval and feeds the result into the agent’s prompt.
In the preparation phase, use the EmbeddingModel to encode your documents and populate the VectorStore. In the runtime phase, the Knowledge retrieves the documents relevant to each query and includes them in the agent’s context.
Please refer to the Architecture section for more information.
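Conceptually, the two phases can be sketched in plain Python. The character-frequency `embed` function and the in-memory `store` below are illustrative stand-ins for the real embedding model and vector store, not Ailoy APIs:

```python
import math

# Toy character-frequency embedding: an illustrative stand-in for a real
# embedding model such as BAAI/bge-m3.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def embed(text: str) -> list[float]:
    counts = [text.lower().count(ch) for ch in ALPHABET]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(a: list[float], b: list[float]) -> float:
    # vectors from embed() are unit-length, so the dot product is the cosine
    return sum(x * y for x, y in zip(a, b))

# Preparation phase: embed each document and store (vector, text) pairs.
documents = [
    "The skiff had gone eighty-four days without a fish.",
    "The boy's parents sent him to a luckier boat.",
]
store = [(embed(doc), doc) for doc in documents]

# Runtime phase: embed the query, pick the most similar document,
# and include it in the prompt as grounding context.
query = "Why did the boy leave the old man?"
best = max(store, key=lambda item: cosine(embed(query), item[0]))
prompt = f"Context: {best[1]}\n\nQuestion: {query}"
```

Ailoy packages each of these pieces behind the components listed above, so the sections below only wire them together.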
Step-by-Step Guide
Initializing an Embedding Model and a Vector Store
- Python
- JavaScript
- JavaScript(Web)
import asyncio

import ailoy as ai

async def prepare_knowledge():
    # create an embedding model
    model = await ai.EmbeddingModel.new_local(
        model_name="BAAI/bge-m3",
        progress_callback=print,
    )

    # create a vector store with dimension 1024
    vs = ai.VectorStore.new_faiss(dim=1024)

if __name__ == "__main__":
    asyncio.run(prepare_knowledge())
import * as ai from "ailoy-node";

async function prepare_knowledge() {
  // create an embedding model
  const model = await ai.EmbeddingModel.newLocal(
    "BAAI/bge-m3", // modelName
    {
      progressCallback: console.log,
    }
  );

  // create a vector store with dimension 1024
  const vs = await ai.VectorStore.newFaiss(1024);
}

prepare_knowledge().catch((err) => {
  console.error("Error:", err);
});
import * as ai from "ailoy-web";

async function prepare_knowledge() {
  // create an embedding model
  const model = await ai.EmbeddingModel.newLocal(
    "BAAI/bge-m3", // modelName
    {
      progressCallback: console.log,
    }
  );

  // create a vector store with dimension 1024
  const vs = await ai.VectorStore.newFaiss(1024);
}

prepare_knowledge().catch((err) => {
  console.error("Error:", err);
});
At this time, the only supported embedding model is BAAI/bge-m3. Additional embedding models will be supported in future releases.
Preparing Documents with a Vector Store
Before running RAG, create a vector store to store document embeddings for retrieval. This step prepares your data for semantic search and typically only needs to be performed once per dataset.
- Python
- JavaScript
- JavaScript(Web)
import asyncio

import ailoy as ai

CHUNKS = [...]

async def prepare_knowledge() -> ai.Knowledge:
    model = await ai.EmbeddingModel.new_local("BAAI/bge-m3")
    vs = ai.VectorStore.new_faiss(1024)

    for i, chunk in enumerate(CHUNKS):
        # embed chunks into semantic vectors
        embedding = await model.infer(chunk)
        # store semantic vectors
        vs.add_vector(
            ai.VectorStoreAddInput(
                embedding,
                chunk,
                {"title": "The Old Man and the Sea", "index": i}
            )
        )

    # create a knowledge based on the vector store
    return ai.Knowledge.new_vector_store(vs, model)

if __name__ == "__main__":
    asyncio.run(prepare_knowledge())
import * as ai from "ailoy-node";

const CHUNKS = [...];

async function prepare_knowledge() {
  const model = await ai.EmbeddingModel.newLocal("BAAI/bge-m3");
  const vs = await ai.VectorStore.newFaiss(1024);

  for (const [i, chunk] of CHUNKS.entries()) {
    // embed chunks into semantic vectors
    const embedding = await model.infer(chunk);
    // store semantic vectors
    await vs.addVector({
      embedding: embedding,
      document: chunk,
      metadata: {
        title: "The Old Man and the Sea",
        index: i,
      },
    });
  }

  // create a knowledge based on the vector store
  return ai.Knowledge.newVectorStore(vs, model);
}

prepare_knowledge().catch((err) => {
  console.error("Error:", err);
});
import * as ai from "ailoy-web";

const CHUNKS = [...];

async function prepare_knowledge() {
  const model = await ai.EmbeddingModel.newLocal("BAAI/bge-m3");
  const vs = await ai.VectorStore.newFaiss(1024);

  for (const [i, chunk] of CHUNKS.entries()) {
    // embed chunks into semantic vectors
    const embedding = await model.infer(chunk);
    // store semantic vectors
    await vs.addVector({
      embedding: embedding,
      document: chunk,
      metadata: {
        title: "The Old Man and the Sea",
        index: i,
      },
    });
  }

  // create a knowledge based on the vector store
  return ai.Knowledge.newVectorStore(vs, model);
}

prepare_knowledge().catch((err) => {
  console.error("Error:", err);
});
Ailoy currently supports both FAISS and ChromaDB as vector store backends. Refer to the official configuration guide for backend-specific options.
Defining the Agent with Knowledge
You can now create an Agent with the Knowledge module that integrates your
vector store and embedding model.
- Python
- JavaScript
- JavaScript(Web)
import asyncio

import ailoy as ai

async def prepare_knowledge():
    ...

async def main(knowledge: ai.Knowledge):
    # create an agent with knowledge
    agent = ai.Agent(
        await ai.LangModel.new_local("Qwen/Qwen3-0.6B"),
        knowledge=knowledge,
    )

if __name__ == "__main__":
    knowledge = asyncio.run(prepare_knowledge())
    asyncio.run(main(knowledge))
import * as ai from "ailoy-node";

async function prepare_knowledge() {...}

async function main(knowledge: ai.Knowledge) {
  // create an agent with knowledge
  const agent = new ai.Agent(
    await ai.LangModel.newLocal("Qwen/Qwen3-0.6B"),
    undefined, // tools (not used here)
    knowledge // knowledge
  );
}

prepare_knowledge()
  .then((knowledge) => main(knowledge))
  .catch((err) => {
    console.error("Error:", err);
  });
import * as ai from "ailoy-web";

async function prepare_knowledge() {...}

async function main(knowledge: ai.Knowledge) {
  // create an agent with knowledge
  const agent = new ai.Agent(
    await ai.LangModel.newLocal("Qwen/Qwen3-0.6B"),
    undefined, // tools (not used here)
    knowledge // knowledge
  );
}

prepare_knowledge()
  .then((knowledge) => main(knowledge))
  .catch((err) => {
    console.error("Error:", err);
  });
Performing RAG
To perform retrieval and generate grounded responses:
Not all models support documents natively. To enable retrieval-based reasoning with such models, make sure to apply the document polyfill, which adapts the agent’s prompt structure to include the retrieved documents.
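The exact polyfill text is model-specific and handled internally by Ailoy; conceptually, it rewrites the prompt so retrieved documents appear as plain text the model can read. A rough, hypothetical sketch of the idea (not Ailoy's actual format):

```python
def apply_document_polyfill(messages: list[dict], documents: list[str]) -> list[dict]:
    """Prepend retrieved documents as a system message, so a model without
    native 'documents' support still sees them.

    Illustrative only: the real polyfill format depends on the model's
    chat template (e.g. the Qwen3 variant used in the examples below).
    """
    doc_block = "\n\n".join(
        f"[Document {i}] {doc}" for i, doc in enumerate(documents)
    )
    system = {
        "role": "system",
        "content": "Use the following documents to answer.\n\n" + doc_block,
    }
    # original conversation is left untouched; context is simply prepended
    return [system] + list(messages)
```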
- Python
- JavaScript
- JavaScript(Web)
import asyncio

import ailoy as ai

async def prepare_knowledge():
    ...

async def main(knowledge: ai.Knowledge):
    agent = ai.Agent(
        await ai.LangModel.new_local("Qwen/Qwen3-0.6B"),
        knowledge=knowledge,
    )

    config = ai.AgentConfig.from_dict({
        # need polyfill here because Qwen3 doesn't natively support 'documents'
        "inference": {"document_polyfill": "Qwen3"},
        # set this to use the `top_k` documents with the highest similarity. default `top_k`=1
        "knowledge": {"top_k": 1},
    })

    async for resp in agent.run("Why did the boy stop fishing with the old man?", config):
        print(resp.message.contents[0].text)

if __name__ == "__main__":
    knowledge = asyncio.run(prepare_knowledge())
    asyncio.run(main(knowledge))
import * as ai from "ailoy-node";

async function prepare_knowledge() {...}

async function main(knowledge: ai.Knowledge) {
  const agent = new ai.Agent(
    await ai.LangModel.newLocal("Qwen/Qwen3-0.6B"),
    undefined,
    knowledge
  );

  const config = {
    // need polyfill here because Qwen3 doesn't natively support 'documents'
    inference: { documentPolyfill: ai.getDocumentPolyfill("Qwen3") },
    // set this to use the `topK` documents with the highest similarity. default `topK`=1
    knowledge: { topK: 1 },
  };

  for await (const resp of agent.run(
    "Why did the boy stop fishing with the old man?",
    config
  )) {
    if (resp.message.contents?.[0]?.type === "text")
      console.log(resp.message.contents?.[0]?.text);
  }
}

prepare_knowledge()
  .then((knowledge) => main(knowledge))
  .catch((err) => {
    console.error("Error:", err);
  });
import * as ai from "ailoy-web";

async function prepare_knowledge() {...}

async function main(knowledge: ai.Knowledge) {
  const agent = new ai.Agent(
    await ai.LangModel.newLocal("Qwen/Qwen3-0.6B"),
    undefined,
    knowledge
  );

  const config = {
    // need polyfill here because Qwen3 doesn't natively support 'documents'
    inference: { documentPolyfill: ai.getDocumentPolyfill("Qwen3") },
    // set this to use the `topK` documents with the highest similarity. default `topK`=1
    knowledge: { topK: 1 },
  };

  for await (const resp of agent.run(
    "Why did the boy stop fishing with the old man?",
    config
  )) {
    if (resp.message.contents?.[0]?.type === "text")
      console.log(resp.message.contents?.[0]?.text);
  }
}

prepare_knowledge()
  .then((knowledge) => main(knowledge))
  .catch((err) => {
    console.error("Error:", err);
  });
Complete Example
- Python
- JavaScript
- JavaScript(Web)
import asyncio

import ailoy as ai

CHUNKS = [
    """
He was an old man who fished alone in a skiff in the Gulf Stream and he had gone
eighty-four days now without taking a fish. In the first forty days a boy had been with him.
But after forty days without a fish the boy’s parents had told him that the old man was
now definitely and finally salao, which is the worst form of unlucky, and the boy had gone
at their orders in another boat which caught three good fish the first week. It made the
boy sad to see the old man come in each day with his skiff empty and he always went
down to help him carry either the coiled lines or the gaff and harpoon and the sail that
was furled around the mast. The sail was patched with flour sacks and, furled, it looked
like the flag of permanent defeat.
""",
    """
The old man was thin and gaunt with deep wrinkles in the back of his neck. The
brown blotches of the benevolent skin cancer the sun brings from its reflection on the
tropic sea were on his cheeks. The blotches ran well down the sides of his face and his
hands had the deep-creased scars from handling heavy fish on the cords. But none of
these scars were fresh. They were as old as erosions in a fishless desert.
""",
    """
Everything about him was old except his eyes and they were the same color as the
sea and were cheerful and undefeated.
“Santiago,” the boy said to him as they climbed the bank from where the skiff was
hauled up. “I could go with you again. We’ve made some money.”
The old man had taught the boy to fish and the boy loved him.
“No,” the old man said. “You’re with a lucky boat. Stay with them.”
""",
]

async def prepare_knowledge():
    model = await ai.EmbeddingModel.new_local("BAAI/bge-m3")
    vs = ai.VectorStore.new_faiss(1024)

    for i, chunk in enumerate(CHUNKS):
        embedding = await model.infer(chunk)
        vs.add_vector(
            ai.VectorStoreAddInput(
                embedding, chunk, {"title": "The Old Man and the Sea", "index": i}
            )
        )

    return ai.Knowledge.new_vector_store(vs, model)

async def main(knowledge: ai.Knowledge):
    agent = ai.Agent(
        await ai.LangModel.new_local("Qwen/Qwen3-0.6B"),
        knowledge=knowledge,
    )

    config = ai.AgentConfig.from_dict({"inference": {"document_polyfill": "Qwen3"}})

    async for resp in agent.run("Why did the boy stop fishing with the old man?", config):
        print(resp.message.contents[0].text)

if __name__ == "__main__":
    knowledge = asyncio.run(prepare_knowledge())
    asyncio.run(main(knowledge))
import * as ai from "ailoy-node";

const CHUNKS = [
  `
He was an old man who fished alone in a skiff in the Gulf Stream and he had gone
eighty-four days now without taking a fish. In the first forty days a boy had been with him.
But after forty days without a fish the boy’s parents had told him that the old man was
now definitely and finally salao, which is the worst form of unlucky, and the boy had gone
at their orders in another boat which caught three good fish the first week. It made the
boy sad to see the old man come in each day with his skiff empty and he always went
down to help him carry either the coiled lines or the gaff and harpoon and the sail that
was furled around the mast. The sail was patched with flour sacks and, furled, it looked
like the flag of permanent defeat.
`,
  `
The old man was thin and gaunt with deep wrinkles in the back of his neck. The
brown blotches of the benevolent skin cancer the sun brings from its reflection on the
tropic sea were on his cheeks. The blotches ran well down the sides of his face and his
hands had the deep-creased scars from handling heavy fish on the cords. But none of
these scars were fresh. They were as old as erosions in a fishless desert.
`,
  `
Everything about him was old except his eyes and they were the same color as the
sea and were cheerful and undefeated.
“Santiago,” the boy said to him as they climbed the bank from where the skiff was
hauled up. “I could go with you again. We’ve made some money.”
The old man had taught the boy to fish and the boy loved him.
“No,” the old man said. “You’re with a lucky boat. Stay with them.”
`,
];

async function prepare_knowledge() {
  const model = await ai.EmbeddingModel.newLocal("BAAI/bge-m3");
  const vs = await ai.VectorStore.newFaiss(1024);

  for (const [i, chunk] of CHUNKS.entries()) {
    const embedding = await model.infer(chunk);
    await vs.addVector({
      embedding: embedding,
      document: chunk,
      metadata: {
        title: "The Old Man and the Sea",
        index: i,
      },
    });
  }

  return ai.Knowledge.newVectorStore(vs, model);
}

async function main(knowledge: ai.Knowledge) {
  const agent = new ai.Agent(
    await ai.LangModel.newLocal("Qwen/Qwen3-0.6B"),
    undefined,
    knowledge
  );

  const config = { inference: { documentPolyfill: ai.getDocumentPolyfill("Qwen3") } };

  for await (const resp of agent.run(
    "Why did the boy stop fishing with the old man?",
    config
  )) {
    if (resp.message.contents?.[0]?.type === "text")
      console.log(resp.message.contents?.[0]?.text);
  }
}

prepare_knowledge()
  .then((knowledge) => main(knowledge))
  .catch((err) => {
    console.error("Error:", err);
  });
import * as ai from "ailoy-web";

const CHUNKS = [
  `
He was an old man who fished alone in a skiff in the Gulf Stream and he had gone
eighty-four days now without taking a fish. In the first forty days a boy had been with him.
But after forty days without a fish the boy’s parents had told him that the old man was
now definitely and finally salao, which is the worst form of unlucky, and the boy had gone
at their orders in another boat which caught three good fish the first week. It made the
boy sad to see the old man come in each day with his skiff empty and he always went
down to help him carry either the coiled lines or the gaff and harpoon and the sail that
was furled around the mast. The sail was patched with flour sacks and, furled, it looked
like the flag of permanent defeat.
`,
  `
The old man was thin and gaunt with deep wrinkles in the back of his neck. The
brown blotches of the benevolent skin cancer the sun brings from its reflection on the
tropic sea were on his cheeks. The blotches ran well down the sides of his face and his
hands had the deep-creased scars from handling heavy fish on the cords. But none of
these scars were fresh. They were as old as erosions in a fishless desert.
`,
  `
Everything about him was old except his eyes and they were the same color as the
sea and were cheerful and undefeated.
“Santiago,” the boy said to him as they climbed the bank from where the skiff was
hauled up. “I could go with you again. We’ve made some money.”
The old man had taught the boy to fish and the boy loved him.
“No,” the old man said. “You’re with a lucky boat. Stay with them.”
`,
];

async function prepare_knowledge() {
  const model = await ai.EmbeddingModel.newLocal("BAAI/bge-m3");
  const vs = await ai.VectorStore.newFaiss(1024);

  for (const [i, chunk] of CHUNKS.entries()) {
    const embedding = await model.infer(chunk);
    await vs.addVector({
      embedding: embedding,
      document: chunk,
      metadata: {
        title: "The Old Man and the Sea",
        index: i,
      },
    });
  }

  return ai.Knowledge.newVectorStore(vs, model);
}

async function main(knowledge: ai.Knowledge) {
  const agent = new ai.Agent(
    await ai.LangModel.newLocal("Qwen/Qwen3-0.6B"),
    undefined,
    knowledge
  );

  const config = { inference: { documentPolyfill: ai.getDocumentPolyfill("Qwen3") } };

  for await (const resp of agent.run(
    "Why did the boy stop fishing with the old man?",
    config
  )) {
    if (resp.message.contents?.[0]?.type === "text")
      console.log(resp.message.contents?.[0]?.text);
  }
}

prepare_knowledge()
  .then((knowledge) => main(knowledge))
  .catch((err) => {
    console.error("Error:", err);
  });
For best results, ensure your documents are chunked semantically (e.g., by paragraphs or sections).
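A simple way to chunk by paragraph is to split on blank lines and merge consecutive paragraphs up to a size budget. This helper is an illustrative sketch, not part of Ailoy:

```python
def chunk_by_paragraph(text: str, max_chars: int = 800) -> list[str]:
    """Split text on blank lines, then merge consecutive paragraphs
    until a chunk would exceed max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        # start a new chunk when adding this paragraph would exceed the budget
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Chunks produced this way can be fed directly into the preparation loop above in place of the hand-made CHUNKS list.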
Output
The boy stopped fishing with the old man because he was with a lucky boat, not because he was in possession of a boat or because he was with the old man. The boy was with a boat, not a person. The old man had taught the boy to fish, and the boy loved him, but the reason for stopping fishing was related to the boat.