WebAssembly Support

warning

WebAssembly support is currently experimental, and you may encounter unexpected errors. If you do, please report them via our GitHub Issues or Discord channels so we can address them promptly.

Ailoy supports running agents entirely in modern web browsers using WebAssembly (WASM). This enables AI workloads directly in the browser with no backend required.

Key capabilities

  • Run local models accelerated by WebGPU—supports both Agents and VectorStores.
  • Use API models just like you would in native environments.
  • Access built-in tools, or register your own custom tools.
  • Connect MCP tools via Streamable HTTP, Server-Sent Events (SSE), or WebSocket transports.

This guide walks you through setting up and using ailoy-web in your browser-based applications.

Hardware Requirements

If you plan to run local models with WebGPU, make sure your system has the necessary hardware accelerators and drivers installed. See Device & Environments for more details.

  • macOS with Apple Silicon: Fully supported, typically works out of the box.
  • Windows / Linux with NVIDIA or AMD GPUs: Ensure that the latest GPU drivers are installed.

In addition, WebGPU must support certain features—such as shader-f16. You can quickly verify your setup using the isWebGPUSupported utility:

import { isWebGPUSupported } from "ailoy-web";

const { supported, reason } = await isWebGPUSupported();
if (supported) {
  // WebGPU is supported
  console.log("✅ WebGPU is available!");
} else {
  // WebGPU is not supported
  console.log(
    "❌ WebGPU is not available due to the following reason: ",
    reason
  );
}

info

If WebGPU is not supported, consider showing users a clear message in your UI that explains why—using the reason value returned by the function.
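For example, a minimal gate component along these lines (the component name and the user-facing copy are illustrative, not part of ailoy-web):

import { useEffect, useState, type ReactNode } from "react";
import { isWebGPUSupported } from "ailoy-web";

// Renders children only when WebGPU is usable;
// otherwise surfaces the reason to the user.
export function WebGPUGate({ children }: { children: ReactNode }) {
  // undefined = still checking; "" = supported; non-empty = unsupported reason
  const [reason, setReason] = useState<string | undefined>(undefined);

  useEffect(() => {
    (async () => {
      const { supported, reason } = await isWebGPUSupported();
      setReason(supported ? "" : reason ?? "unknown reason");
    })();
  }, []);

  if (reason === undefined) return <p>Checking WebGPU support…</p>;
  if (reason !== "") return <p>⚠️ Local models are unavailable: {reason}</p>;
  return <>{children}</>;
}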

Building a Simple Chat UI

In this tutorial, we're going to build a simple chat UI application using Ailoy. You can see the full example code in our repository.

Set up Vite and install ailoy-web

We recommend using Vite for a fast development environment and optimized builds. Still, you can use another build tool you prefer, such as Webpack, Rollup, or Parcel.

In this example, we use Vite to create a project with React + TypeScript configuration.

  1. Let's create a Vite project by running npm create vite@latest.

$ npm create vite@latest

> npx
> create-vite

│
◇  Project name: my-project
│
◇  Select a framework: React
│
◇  Select a variant: TypeScript
│
◇  Scaffolding project in /path/of/my-project...
│
└  Done. Now run:

  cd my-project
  npm install
  npm run dev

  2. Next, navigate to your new project and install ailoy-web. This will also install the packages preconfigured in package.json.

$ cd my-project
$ npm install ailoy-web

added 265 packages, and audited 266 packages in 4s

64 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

  3. To ensure optimal performance and compatibility, update your vite.config.ts with the following settings:

    • Exclude ailoy-web from dependency optimization (optimizeDeps.exclude)
    • Enable cross-origin isolation (required for SharedArrayBuffer in WASM threading)
    • Optimize bundle size by grouping ailoy-web into its own build chunk
import react from "@vitejs/plugin-react";
import { defineConfig } from "vite";

export default defineConfig({
  plugins: [react()],
  optimizeDeps: {
    exclude: ["ailoy-web"],
  },
  server: {
    headers: {
      "Cross-Origin-Embedder-Policy": "require-corp",
      "Cross-Origin-Opener-Policy": "same-origin",
    },
  },
  build: {
    rollupOptions: {
      output: {
        manualChunks: {
          ailoy: ["ailoy-web"],
        },
      },
    },
  },
});

Install and configure assistant-ui

For this quick demo, we'll use the assistant-ui package to create a basic chat interface.

It relies on shadcn, so we'll set that up first. Follow the guide from shadcn's documentation.

  1. Edit tsconfig.json.
{
  "files": [],
  "references": [
    {
      "path": "./tsconfig.app.json"
    },
    {
      "path": "./tsconfig.node.json"
    }
  ],
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"]
    }
  }
}
  2. Edit tsconfig.app.json.
{
  "compilerOptions": {
    // ...
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"]
    }
    // ...
  }
}
  3. Install Tailwind CSS and update vite.config.ts.

$ npm install --save-dev tailwindcss @tailwindcss/vite @types/node

added 22 packages, and audited 215 packages in 3s

51 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

import path from "node:path";
import tailwindcss from "@tailwindcss/vite";
import react from "@vitejs/plugin-react";
import { defineConfig } from "vite";

// https://vite.dev/config/
export default defineConfig({
  plugins: [react(), tailwindcss()],
  resolve: {
    alias: {
      "@": path.resolve(__dirname, "./src"),
    },
  },
  // other configs
});
  4. Update src/index.css to use Tailwind CSS.
@import "tailwindcss";
  5. Install assistant-ui and add the thread component.
    npx assistant-ui add thread

$ npx assistant-ui add thread
✔ You need to create a components.json file to add components. Proceed? … yes
✔ Which color would you like to use as the base color? › Neutral
✔ Writing components.json.
✔ Checking registry.
✔ Installing dependencies.
✔ Created 6 files:

  • src/components/assistant-ui/thread.tsx
  • src/components/assistant-ui/markdown-text.tsx
  • src/components/assistant-ui/tooltip-icon-button.tsx
  • src/components/assistant-ui/tool-fallback.tsx
  • src/components/ui/button.tsx
  • src/components/ui/tooltip.tsx
  6. Install the framer-motion package, which is required by assistant-ui's thread component.
    npm install framer-motion

$ npm install framer-motion

up to date, audited 374 packages in 696ms

141 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

  7. Due to the "verbatimModuleSyntax": true configuration, you need to fix the imports in src/components/assistant-ui/tool-fallback.tsx to prevent a compile error.
import type { ToolCallMessagePartComponent } from "@assistant-ui/react";
import { CheckIcon, ChevronDownIcon, ChevronUpIcon } from "lucide-react";

// ...

Implementation

  1. Create src/AiloyRuntimeProvider.tsx and write the code as follows:
"use client";

import { useState, useEffect, type ReactNode } from "react";
import {
AssistantRuntimeProvider,
useExternalStoreRuntime,
useExternalMessageConverter,
type AppendMessage,
} from "@assistant-ui/react";
import * as ai from "ailoy-web";

export function AiloyRuntimeProvider({
children,
}: Readonly<{ children: ReactNode }>) {
const [agent, setAgent] = useState<ai.Agent | undefined>(undefined);
const [agentLoading, setAgentLoading] = useState<boolean>(false);
const [messages, setMessages] = useState<
(ai.UserMessage | ai.AgentResponse)[]
>([]);
const [isAnswering, setIsAnswering] = useState<boolean>(false);

useEffect(() => {
(async () => {
const { supported, reason } = await ai.isWebGPUSupported();
if (!supported) {
alert(`WebGPU is not supported: ${reason!}`);
return;
}

setAgentLoading(true);
const runtime = await ai.startRuntime();
const agent = await ai.defineAgent(
runtime,
ai.LocalModel({ id: "Qwen/Qwen3-0.6B" })
);
setAgent(agent);
setAgentLoading(false);
})();
}, []);

const onNew = async (message: AppendMessage) => {
if (agent === undefined) throw new Error("Agent is not initialized yet");

if (message.content[0]?.type !== "text")
throw new Error("Only text messages are supported");

const input = message.content[0].text;
const userMessage: ai.UserMessage = {
role: "user",
content: input,
};
setMessages((prev) => [...prev, userMessage]);
setIsAnswering(true);

for await (const resp of agent.query(input)) {
if (resp.type === "output_text" || resp.type === "reasoning") {
if (resp.isTypeSwitched) {
setMessages((prev) => [...prev, resp]);
} else {
setMessages((prev) => {
const last = prev[prev.length - 1];
last.content += resp.content;
return [...prev.slice(0, -1), last];
});
}
} else {
setMessages((prev) => [...prev, resp]);
}
}
setIsAnswering(false);
};

const convertedMessages = useExternalMessageConverter({
messages,
callback: (message: ai.UserMessage | ai.AgentResponse) => {
if (message.role === "user") {
return {
role: message.role,
content: [{ type: "text", text: message.content as string }],
};
} else if (message.type === "output_text") {
return {
role: "assistant",
content: [{ type: "text", text: message.content }],
};
} else if (message.type === "reasoning") {
return {
role: "assistant",
content: [{ type: "reasoning", text: message.content }],
};
} else if (message.type === "tool_call") {
return {
role: "assistant",
content: [
{
type: "tool-call",
toolCallId: message.content.id!,
toolName: message.content.function.name,
args: message.content.function.arguments,
},
],
};
} else if (message.type === "tool_call_result") {
return {
role: "tool",
toolCallId: message.content.tool_call_id!,
result: message.content.content[0].text,
};
} else {
throw new Error(`Unknown message type: ${message.type}`);
}
},
isRunning: isAnswering,
joinStrategy: "concat-content",
});

const runtime = useExternalStoreRuntime({
isLoading: agentLoading,
isDisabled: agent === undefined,
isRunning: isAnswering,
messages: convertedMessages,
onNew,
});

return (
<AssistantRuntimeProvider runtime={runtime}>
{children}
</AssistantRuntimeProvider>
);
}
  2. Update src/App.tsx to wrap <Thread /> inside <AiloyRuntimeProvider>.
import { Thread } from "@/components/assistant-ui/thread";
import { AiloyRuntimeProvider } from "./AiloyRuntimeProvider";

function App() {
  return (
    <AiloyRuntimeProvider>
      <Thread />
    </AiloyRuntimeProvider>
  );
}

export default App;
  3. React’s <StrictMode> can cause components to mount twice during development. To avoid double state updates, remove <StrictMode> from src/main.tsx:
// import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App.tsx'

createRoot(document.getElementById('root')!).render(
  // <StrictMode>
  <App />
  // </StrictMode>,
)
  4. Start the development server with npm run dev.

$ npm run dev

VITE v7.1.1  ready in 318 ms

➜  Local:   http://localhost:5173/
➜  Network: use --host to expose
➜  press h + enter to show help

  5. Visit http://localhost:5173 — your chat UI should be live.

info

When the agent initializes for the first time, model parameters are downloaded from Ailoy’s file server. These files are stored in the browser’s Origin Private File System (OPFS), which is isolated per origin and managed internally by the browser.
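If you want to inspect those cached files or free up disk space during development, the standard OPFS API works; a minimal sketch (entry names are browser-managed and may differ):

// List the top-level entries stored in this origin's OPFS.
const root = await navigator.storage.getDirectory();
for await (const [name, handle] of root.entries()) {
  console.log(`${handle.kind}: ${name}`);
}

// During development you can remove a cached entry entirely:
// await root.removeEntry("<entry-name>", { recursive: true });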

Once initialization completes, you can start chatting with the agent.

  6. Try starting a conversation with the agent running in your web browser.

🎉 Congratulations! You now have a fully local AI agent running entirely in your browser with zero backend servers!

Additional Features

Using API models

You can easily switch to an API model by changing the model configuration as shown below.

const agent = await ai.defineAgent(
  runtime,
  // ai.LocalModel({ id: "Qwen/Qwen3-0.6B" })
  ai.APIModel({
    id: "gpt-5-mini",
    apiKey: "<YOUR_OPENAI_API_KEY>",
  })
);

You can use any of the API models listed in supported API models.

warning

The code above is for testing purposes only. You should never hardcode API keys in your frontend code! For example, consider adding an input box that collects the API key from the user and initializing the agent with that key.
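A minimal sketch of that pattern (the component, prop, and state names are illustrative, not part of ailoy-web):

import { useState } from "react";
import * as ai from "ailoy-web";

// Illustrative form: the user pastes a key, and only then is the agent created.
export function ApiKeyForm({ onReady }: { onReady: (agent: ai.Agent) => void }) {
  const [apiKey, setApiKey] = useState("");

  const initialize = async () => {
    const runtime = await ai.startRuntime();
    const agent = await ai.defineAgent(
      runtime,
      ai.APIModel({ id: "gpt-5-mini", apiKey })
    );
    onReady(agent);
  };

  return (
    <div>
      <input
        type="password"
        placeholder="OpenAI API key"
        value={apiKey}
        onChange={(e) => setApiKey(e.target.value)}
      />
      <button onClick={initialize} disabled={apiKey.length === 0}>
        Start
      </button>
    </div>
  );
}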

Multi-Modal Inputs

warning

Multi-Modal inputs are currently supported only on API models as described in Agent > Multi-Modal Inputs.

Follow the guide in assistant-ui's Attachments documentation to enable file attachment functionality.

Install the attachment UI component.

$ npx shadcn@latest add "https://r.assistant-ui.com/attachment"

Similar to what we did for the Thread component, edit src/components/assistant-ui/attachment.tsx as below to fix the compile error.

import { type PropsWithChildren, useEffect, useState, type FC } from "react";
import { CircleXIcon, FileIcon, PaperclipIcon } from "lucide-react";

Edit src/components/assistant-ui/thread.tsx to add attachment components.

import {
  ComposerAttachments,
  ComposerAddAttachment,
  UserMessageAttachments,
} from "./attachment";

// Update Composer
const Composer: FC = () => {
  return (
    <div ...>
      <ThreadScrollToBottom />
      <ThreadPrimitive.Empty>
        <ThreadWelcomeSuggestions />
      </ThreadPrimitive.Empty>
      <ThreadPrimitive.Empty>
        <ComposerAttachments />
      </ThreadPrimitive.Empty>
      <ComposerPrimitive.Root ...>
        {/* ... */}
      </ComposerPrimitive.Root>
    </div>
  );
};

// Update ComposerAction
const ComposerAction: FC = () => {
  return (
    <div ...>
      <ThreadPrimitive.If running={false}>
        <ComposerAddAttachment />
      </ThreadPrimitive.If>
      {/* ... */}
    </div>
  );
};

// Update UserMessage
const UserMessage: FC = () => {
  return (
    <MessagePrimitive.Root asChild>
      <motion.div
        // ...
      >
        <UserMessageAttachments />
        <UserActionBar />
        {/* ... */}
      </motion.div>
    </MessagePrimitive.Root>
  );
};

In src/AiloyRuntimeProvider.tsx, update onNew to handle image contents.

const onNew = async (message: AppendMessage) => {
  // ...

  let userContent: ai.UserMessage["content"] = [];

  // Add attachments
  if (message.attachments !== undefined) {
    for (const attach of message.attachments) {
      if (attach.type === "image") {
        const imageContent = await ai.ImageContent.fromFile(attach.file!);
        userContent.push(imageContent);
      }
      // other types are skipped
    }
  }

  // Add text prompt
  if (message.content[0]?.type !== "text")
    throw new Error("Only text messages are supported");
  const textContent: ai.TextContent = {
    type: "text",
    text: message.content[0].text,
  };
  userContent.push(textContent);

  // Set messages
  setMessages((prev) => [...prev, { role: "user", content: userContent }]);

  // ...

Update the message conversion callback in useExternalMessageConverter to handle multimodal contents.

const convertedMessages = useExternalMessageConverter({
  messages,
  callback: (message: ai.UserMessage | ai.AgentResponse) => {
    if (message.role === "user") {
      if (typeof message.content === "string") {
        return {
          role: message.role,
          content: [{ type: "text", text: message.content }],
        };
      } else {
        return {
          role: message.role,
          content: message.content.map((c) => {
            if (c.type === "text") return c;
            else if (c.type === "image_url")
              return { type: "image", image: c.image_url.url };
            else if (c.type === "input_audio")
              return { type: "audio", audio: c.input_audio };
            else throw Error("Unknown content type");
          }),
        };
      }
    }
    // ...
  },
});

Add adapters to useExternalStoreRuntime to handle image file attachments.

import {
  CompositeAttachmentAdapter,
  SimpleImageAttachmentAdapter,
  SimpleTextAttachmentAdapter,
} from "@assistant-ui/react";

const runtime = useExternalStoreRuntime({
  // ...
  adapters: {
    attachments: new CompositeAttachmentAdapter([
      new SimpleImageAttachmentAdapter(),
      new SimpleTextAttachmentAdapter(),
    ]),
  },
});

Now you can attach images and ask about them.

Using Built-in Tools

You can use any of the tool presets described in Out-of-the-box tools. Let's add handling for tool call and tool result messages and see how it works.

After the agent is defined, add the calculator tool preset as below.

await agent.addToolsFromPreset("calculator");

Update the code in <MessageList> as below to show the corresponding message content for each message type.

<MessageList>
  {messages.map((message, idx) => {
    let content = "";
    if (message.role === "user") {
      content = (message as ai.UserMessage).content as string;
    } else if (
      message.type === "output_text" ||
      message.type === "reasoning"
    ) {
      content = (message as ai.AgentResponse).content as string;
    } else if (message.type === "tool_call") {
      content = `Tool Call: ${message.content.function.name} (${message.content.id})`;
      content += `\nArguments: ${JSON.stringify(
        message.content.function.arguments
      )}`;
    } else if (message.type === "tool_call_result") {
      content = `Tool Result (${message.content.tool_call_id})`;
      content += `\nContent: ${message.content.content[0].text}`;
    }

    return (
      <Message
        key={`message-${idx}`}
        model={{
          direction: message.role === "user" ? "outgoing" : "incoming",
          position: "normal",
          sender: message.role,
          message: content,
        }}
      />
    );
  })}
</MessageList>

Test tool calling with a prompt that is likely to invoke the calculator tool.
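If you'd rather verify it in the console first, a minimal sketch of such a test (the prompt is illustrative; the response handling mirrors the onNew loop above):

// Ask something the model should delegate to the calculator preset.
for await (const resp of agent.query("What is 1234 * 5678?")) {
  if (resp.type === "tool_call") {
    console.log("Tool call:", resp.content.function.name);
  } else if (resp.type === "tool_call_result") {
    console.log("Tool result:", resp.content.content[0].text);
  } else if (resp.type === "output_text") {
    console.log(resp.content);
  }
}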

Using MCP Tools

You can register MCP clients and use their tools in agents via Streamable HTTP, Server-Sent Events (SSE), and WebSocket transports. Note that the stdio transport is not supported, since browsers cannot spawn local processes.

warning

When connecting to MCP servers from browsers, make sure the server is configured with CORS middleware that allows your origin and exposes the required headers, such as Mcp-Session-Id. See the official documentation for more details.

Let's create a simple MCP server for testing MCP tool availability.

Install the following packages.

$ npm install express cors zod @modelcontextprotocol/sdk
$ npm install --save-dev @types/express @types/cors

Create src/mcpServer.ts and write the code as below:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import cors from "cors";
import express, { type Request, type Response } from "express";
import * as z from "zod/v3";

// Create an MCP server
const server = new McpServer({
  name: "demo-server",
  version: "1.0.0",
});

// Add an addition tool
server.registerTool(
  "add",
  {
    title: "Addition Tool",
    description: "Add two numbers",
    inputSchema: { a: z.number(), b: z.number() },
  },
  async ({ a, b }) => ({
    content: [{ type: "text", text: String(a + b) }],
  })
);

const app = express();
app.use(express.json());
app.use(
  cors({
    origin: "*",
    exposedHeaders: "*",
  })
);

app.post("/mcp", async (req: Request, res: Response) => {
  try {
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: undefined,
    });
    res.on("close", () => {
      transport.close();
      server.close();
    });
    await server.connect(transport);
    await transport.handleRequest(req, res, req.body);
  } catch (error) {
    if (!res.headersSent) {
      res.status(500).json({
        jsonrpc: "2.0",
        error: {
          code: -32603,
          message: "Internal server error",
        },
        id: null,
      });
    }
  }
});

const PORT = 3000;
app.listen(PORT, () => {
  console.log(`MCP server listening on ${PORT}`);
});

warning

Note that we used zod/v3, since the current latest version of the MCP SDK does not work with zod v4 (related issue).

This MCP server uses the streamable HTTP transport and exposes a single tool, add, which takes two numbers and returns their sum. The server runs on localhost:3000 and uses CORS middleware to allow any origin.

Install vite-node and add a script in package.json to run this server using vite-node.

$ npm install --save-dev vite-node

  "scripts": {
// ...
"dev:mcp": "vite-node src/mcpServer.ts"
}

Run the MCP server.

$ npm run dev:mcp

> my-project@0.0.0 dev:mcp
> vite-node src/mcpServer.ts

MCP server listening on 3000

In src/AiloyRuntimeProvider.tsx, add MCP tools after the agent has been initialized.

const agent = ...;

await agent.addToolsFromMcpClient(
  "calculator",
  new ai.MCPStreamableHTTPClientTransport(
    new URL("http://localhost:3000/mcp")
  )
);

Check if MCP tools work as expected.
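If you want to verify the server independently of Ailoy, the MCP SDK ships its own client; a minimal sketch you could run with vite-node:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to the local MCP server and call the `add` tool directly.
const client = new Client({ name: "test-client", version: "1.0.0" });
await client.connect(
  new StreamableHTTPClientTransport(new URL("http://localhost:3000/mcp"))
);

const result = await client.callTool({ name: "add", arguments: { a: 2, b: 3 } });
console.log(result.content); // Expected: [{ type: "text", text: "5" }]

await client.close();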

Using Vector Stores

You can interact with Vector Stores in Ailoy Web using the same API described in RAG with Vector Store.

When combined with local embedding models and in-memory vector stores such as Faiss, you can build a fully client-side RAG application that runs entirely inside the user's browser—no backend servers required. This approach is ideal for privacy-preserving, offline-capable AI assistants.

Defining a Vector Store

import * as ai from "ailoy-web";

const runtime = await ai.startRuntime();

// Using an in-memory Faiss vector store
const vectorstore = await ai.defineVectorStore(runtime, {
  type: "faiss",
  embedding: {
    modelId: "BAAI/bge-m3",
  },
});

// Using an external ChromaDB server
const vectorstore = await ai.defineVectorStore(runtime, {
  type: "chromadb",
  url: "http://localhost:8000", // Replace with your ChromaDB endpoint
  collection: "my-collection", // Your target collection name
  embedding: {
    modelId: "BAAI/bge-m3",
  },
});

warning

When using external vector stores such as ChromaDB, make sure they're configured with CORS middleware to allow your origin. For ChromaDB, refer to the CORS configuration.

Inserting Documents

Before you can retrieve documents, you must insert them into the vector store along with optional metadata.

Typically, you'll chunk your text into smaller pieces before insertion for better retrieval accuracy.
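A minimal fixed-size chunker is enough to get started (a naive sketch; real applications often split on sentence or paragraph boundaries and add overlap, and longDocument here is an assumed input):

// Split a long text into fixed-size chunks before inserting them.
function chunkText(text: string, chunkSize = 500): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

for (const chunk of chunkText(longDocument)) {
  await vectorstore.insert({ document: chunk, metadata: { source: "my-doc" } });
}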

const items = [
  {
    document:
      "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.",
    metadata: { topic: "bge-m3" },
  },
  {
    document:
      "BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
    metadata: { topic: "bm25" },
  },
];

for (const item of items) {
  const result = await vectorstore.insert(item);
  console.log(result); // Example: {id: "1"}
}

Retrieving Documents

You can retrieve the most relevant documents by similarity score, which is computed using vector embeddings.

// Retrieve the top 1 item most similar to the query
const retrievedItems = await vectorstore.retrieve("What is BGE-M3?", 1);
console.log(retrievedItems);
// Expected: returns the document related to "bge-m3"
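Putting retrieval and generation together gives a fully client-side RAG loop. A minimal sketch, assuming each retrieved item exposes its document text under a document field (matching the shape used at insertion) and that agent is the agent defined earlier:

async function answerWithRag(question: string) {
  // 1. Retrieve the most relevant chunks for the question.
  const retrieved = await vectorstore.retrieve(question, 3);
  const context = retrieved.map((item) => item.document).join("\n---\n");

  // 2. Ask the agent, grounding it with the retrieved context.
  const prompt = `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
  for await (const resp of agent.query(prompt)) {
    if (resp.type === "output_text") console.log(resp.content);
  }
}

await answerWithRag("What is BGE-M3?");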

Clearing the Vector Store

If you need to reset or remove all entries from the vector store:

await vectorstore.clear();