Getting Started
Welcome to Ailoy’s Tutorial! 🤗
In this tutorial, we’ll explore how to run LLMs in Ailoy and extend their capabilities to build agent systems.
Installation
You can install Ailoy via a package manager. Run the command for your environment in your shell:
- Python
- JavaScript
- JavaScript(Web)
pip install ailoy-py
// Using npm
npm install ailoy-node
// Using yarn
yarn add ailoy-node
// Using npm
npm install ailoy-web
// Using yarn
yarn add ailoy-web
To use Ailoy in web browsers, see WebAssembly Supports for more details.
Running the Simplest Agent
The Language Model (LM) is the core of an AI agent. An agent system operates on top of a language model and its instructions, along with various extensions that enhance its capabilities.
API Models and Local Models
Ailoy lets you run language models in two different modes:
- API Models: run models hosted by AI vendors such as OpenAI’s GPT or Anthropic’s Claude
- Local Models: run open-source models directly on your own machine
Each approach has its own strengths and trade-offs:
- API Models: connect to external AI services over the internet. If you’re fine with API token costs and prefer cloud-hosted models, this is the simplest option.
- Local Models: run entirely on your own device using open-source models. Choose this if you want full control, offline availability, or privacy.
Ailoy supports both seamlessly — you can switch between them anytime with minimal changes.
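For instance, switching modes usually comes down to how you construct the language model. Here is a minimal sketch in Python, based on the full examples below (the constructor names and arguments are taken from those examples):

import ailoy as ai

# Inside an async function (see the complete examples below).
# API model: hosted by a vendor, requires an API key.
lm = await ai.LangModel.new_stream_api(
    spec="OpenAI", model_name="gpt-4o", api_key="YOUR_OPENAI_API_KEY"
)
# Local model: downloaded to and run on your own device, no key needed.
lm = await ai.LangModel.new_local(model_name="Qwen/Qwen3-0.6B")

# Everything from here on is identical in both modes.
agent = ai.Agent(lm)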
Example
Let’s start with a simple example — Ailoy’s version of “Hello, World!” for LLMs.
In just a few lines of code, you’ll run a model, send it a prompt, and print the response.
Agent Using API Models
- Python
- JavaScript
- JavaScript(Web)
import asyncio

import ailoy as ai

async def main():
    # Create a language model backed by the OpenAI API (streaming).
    lm = await ai.LangModel.new_stream_api(
        spec="OpenAI",
        model_name="gpt-4o",
        api_key="YOUR_OPENAI_API_KEY",
    )
    agent = ai.Agent(lm)
    async for resp in agent.run("Please give me a short poem about AI."):
        # Print only the text parts of the streamed response.
        if isinstance(resp.message.contents[0], ai.Part.Text):
            print(resp.message.contents[0].text)

if __name__ == "__main__":
    asyncio.run(main())
import * as ai from "ailoy-node";

async function main() {
  const lm = await ai.LangModel.newStreamAPI(
    "OpenAI", // spec
    "gpt-5", // modelName
    "YOUR_OPENAI_API_KEY" // apiKey
  );
  const agent = new ai.Agent(lm);
  for await (const resp of agent.run("Please give me a short poem about AI")) {
    if (resp.message.contents[0].type === "text") {
      console.log(resp.message.contents[0].text);
    }
  }
}

main().catch((err) => {
  console.error("Error:", err);
});
import * as ai from "ailoy-web";

async function main() {
  const lm = await ai.LangModel.newStreamAPI(
    "OpenAI", // spec
    "gpt-5", // modelName
    "YOUR_OPENAI_API_KEY" // apiKey
  );
  const agent = new ai.Agent(lm);
  for await (const resp of agent.run("Please give me a short poem about AI")) {
    if (resp.message.contents[0].type === "text") {
      console.log(resp.message.contents[0].text);
    }
  }
}

main().catch((err) => {
  console.error("Error:", err);
});
Agent Using Local Models
Here, we use the open-source model Qwen3-0.6B, which runs entirely on your own device. No API key is required, and once the model has been downloaded, no internet connection is needed either.
- Python
- JavaScript
- JavaScript(Web)
import asyncio

import ailoy as ai

async def main():
    # Download (on the first run) and initialize the local model,
    # reporting download progress via the callback.
    lm = await ai.LangModel.new_local(
        model_name="Qwen/Qwen3-0.6B", progress_callback=print
    )
    agent = ai.Agent(lm)
    async for resp in agent.run("Please give me a short poem about AI."):
        # Print only the text parts of the response.
        if isinstance(resp.message.contents[0], ai.Part.Text):
            print(resp.message.contents[0].text)

if __name__ == "__main__":
    asyncio.run(main())
import * as ai from "ailoy-node";

async function main() {
  const lm = await ai.LangModel.newLocal(
    "Qwen/Qwen3-0.6B", // modelName
    {
      progressCallback: console.log,
    }
  );
  const agent = new ai.Agent(lm);
  for await (const resp of agent.run("Please give me a short poem about AI")) {
    if (resp.message.contents[0].type === "text") {
      console.log(resp.message.contents[0].text);
    }
  }
}

main().catch((err) => {
  console.error("Error:", err);
});
import * as ai from "ailoy-web";

async function main() {
  const lm = await ai.LangModel.newLocal(
    "Qwen/Qwen3-0.6B", // modelName
    {
      progressCallback: console.log,
    }
  );
  const agent = new ai.Agent(lm);
  for await (const resp of agent.run("Please give me a short poem about AI")) {
    if (resp.message.contents[0].type === "text") {
      console.log(resp.message.contents[0].text);
    }
  }
}

main().catch((err) => {
  console.error("Error:", err);
});
You can use the Ailoy CLI to manage downloaded model files.
Output
Since the model needs to be downloaded and initialized, the first run may take a little time. After a short wait, you will see output similar to this:
In the digital realm, where thoughts run,
AI dreams, no more than dreams of our own.
With code and data, it creates, it learns,
But dreams, still, run in hearts, in minds.
That’s it! 🎉
You’ve just run your first AI agent with Ailoy.
For a detailed explanation of the input and output formats in Ailoy, please refer to Chat Completion Format.
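The examples above only look at the first content part of each message. Since a message's contents field is a list, you can walk all of its parts; here is a small sketch using the same types as the examples (placed inside the async main() above):

# Inside the async main() above, after creating `agent`:
async for resp in agent.run("Summarize your poem in one sentence."):
    for part in resp.message.contents:
        # A message may carry several parts; print the textual ones.
        if isinstance(part, ai.Part.Text):
            print(part.text)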
Don't be surprised if the output changes each time you run it. An LLM's output includes a certain level of randomness, controlled by the temperature and top_p parameters. In Ailoy, these can be adjusted with the InferenceConfig in AgentConfig. If both temperature and top_p are set to 0, the answer becomes deterministic.
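A minimal sketch of what that could look like follows. Note that the exact field names and the way InferenceConfig nests into AgentConfig are assumptions here, not the verified Ailoy API; check the API reference for the real signatures.

import ailoy as ai

# Hypothetical sketch: `inference_config`, `temperature`, `top_p`, and the
# `config` constructor argument are assumed names, not confirmed signatures.
config = ai.AgentConfig(
    inference_config=ai.InferenceConfig(
        temperature=0.0,  # 0 removes sampling randomness
        top_p=0.0,
    )
)
agent = ai.Agent(lm, config=config)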