RAG with Vector Store

Retrieval-Augmented Generation (RAG) is a useful technique when you want an AI model to work with your own documents. In RAG, the model draws on external knowledge, typically stored in a vector store, rather than relying only on what it learned during training: related documents are retrieved and added to the prompt at the time you ask a question. This helps the model give more accurate, up-to-date, and relevant answers.

In this example, we'll walk you through a complete RAG workflow: building a vector store (VectorStore) and integrating it with the Agent.

Initializing a Vector Store

Ailoy simplifies the construction of RAG pipelines through its built-in VectorStore component, which works alongside the Agent.

To initialize a vector store:

from ailoy import Runtime, VectorStore

rt = Runtime()
with VectorStore(rt, "BAAI/bge-m3", "faiss") as vs:
    ...

Ailoy currently supports both FAISS and ChromaDB as vector store backends. Refer to the official configuration guide for backend-specific options.
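
For instance, switching to ChromaDB is a matter of passing a different backend name. The snippet below is a sketch: the "chromadb" backend name follows the pattern above, but the url keyword for a ChromaDB server address is an assumption, so consult the configuration guide for the actual connection options:

rt = Runtime()

# Sketch: "chromadb" as the backend name; the `url` keyword argument
# for the server address is an assumption (see the configuration guide)
with VectorStore(rt, "BAAI/bge-m3", "chromadb", url="http://localhost:8000") as vs:
    ...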

💡 Note: At this time, the only supported embedding model is BAAI/bge-m3. Additional embedding models will be supported in future releases.

Inserting Documents into the Vector Store

You can insert text along with optional metadata into the vector store:

vs.insert(
"Ailoy is a lightweight library for building AI applications",
metadata={"topic": "Ailoy"}
)

In practice, you should split large documents into smaller chunks before inserting them. This improves retrieval quality. You may use any text-splitting tool (e.g., LangChain), or utilize Ailoy’s low-level runtime API for text splitting. (See Calling Low-Level APIs for more details.)
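
As a minimal sketch (plain Python, not an Ailoy API), you could split a document into paragraph-sized chunks and insert each one separately; the file name here is a placeholder:

# Read a source document (placeholder path)
with open("my_document.txt") as f:
    document = f.read()

# Naive paragraph-based chunking; production pipelines typically use
# overlap-aware splitters (e.g., LangChain's text splitters)
chunks = [p.strip() for p in document.split("\n\n") if p.strip()]

for i, chunk in enumerate(chunks):
    vs.insert(chunk, metadata={"source": "my_document.txt", "chunk": i})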

Retrieving Relevant Documents

To retrieve documents similar to a given query:

query = "What is Ailoy?"
items = vs.retrieve(query, top_k=5)

This returns a list of VectorStoreRetrieveItem instances representing the most relevant chunks, ranked by similarity. The number of results is controlled via the top_k parameter (default is 5).
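
For example, you can inspect what came back. The document attribute is used throughout this guide; the metadata and similarity attributes shown below are assumptions inferred from the insert API and the similarity ranking, so check the VectorStoreRetrieveItem reference for the exact field names:

for item in items:
    # item.document holds the stored chunk text; the metadata and
    # similarity attribute names are assumptions, not confirmed API
    print(item.document)
    print(item.metadata)
    print(item.similarity)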

Constructing an Augmented Prompt

Once documents are retrieved, you can construct a context-enriched prompt as follows:

prompt = f"""
Based on the provided contexts, try to answer the user's question.
Context: {[item.document for item in items]}
Question: {query}
"""

You can then pass this prompt to the agent for inference:

for resp in agent.query(prompt):
    agent.print(resp)
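
If you run this pattern repeatedly, it can help to wrap retrieval, prompt construction, and inference in one place. The rag_query function below is a hypothetical convenience wrapper, not part of Ailoy's API:

def rag_query(agent, vs, question, top_k=5):
    # Hypothetical helper: retrieve relevant chunks, build an
    # augmented prompt, and stream the agent's response
    items = vs.retrieve(question, top_k=top_k)
    prompt = f"""
Based on the provided contexts, try to answer the user's question.
Context: {[item.document for item in items]}
Question: {question}
"""
    for resp in agent.query(prompt):
        agent.print(resp)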

Complete Example

from ailoy import Runtime, Agent, VectorStore

# Initialize the runtime
rt = Runtime()

# Initialize the Agent and VectorStore
with Agent(rt, "Qwen/Qwen3-8B") as agent, VectorStore(rt, "BAAI/bge-m3", "faiss") as vs:
    # Insert items
    vs.insert(
        "Ailoy is a lightweight library for building AI applications",
        metadata={"topic": "Ailoy"}
    )

    # Search for the most relevant items
    query = "What is Ailoy?"
    items = vs.retrieve(query, top_k=5)

    # Augment the user query with the retrieved contexts
    prompt = f"""
Based on the provided contexts, try to answer the user's question.
Context: {[item.document for item in items]}
Question: {query}
"""

    # Invoke the agent
    for resp in agent.query(prompt):
        agent.print(resp)

💡 Note: For best results, ensure your documents are chunked semantically (e.g., by paragraphs or sections).