Getting started
Welcome to Ailoy’s Tutorial! 🤗
In this tutorial, we’ll explore how to run LLMs with Ailoy and extend their capabilities into agent systems.
You can install Ailoy via a package manager. Just type the following command in your shell:
- Python
- JavaScript(Node)
```shell
pip install ailoy-py
```
```shell
npm install ailoy-node
yarn add ailoy-node
```
Let’s start right away with a simple code example. Below is the simplest way to run an LLM — it’s like Ailoy’s “Hello, World!”
- Python
- JavaScript(Node)
```python
from ailoy import Runtime, Agent

# The runtime must be started to use Ailoy
rt = Runtime()

# Define an agent
# During this step, the model parameters are downloaded and the LLM is set up for execution
with Agent(rt, model_name="Qwen/Qwen3-0.6B") as agent:
    # This is where the actual LLM call happens
    for resp in agent.query("Please give me a short poem about AI"):
        agent.print(resp)

# Stop the runtime
rt.stop()
```
```javascript
import { startRuntime, defineAgent } from "ailoy-node";

(async () => {
  // The runtime must be started to use Ailoy
  const rt = await startRuntime();

  // Define an agent
  // During this step, the model parameters are downloaded and the LLM is set up for execution
  const agent = await defineAgent(rt, "Qwen/Qwen3-0.6B");

  // This is where the actual LLM call happens
  for await (const resp of agent.query("Please give me a short poem about AI")) {
    agent.print(resp);
  }

  // Once the agent is no longer needed, it can be released
  await agent.delete();

  // Stop the runtime
  await rt.stop();
})();
```
Since the model needs to be downloaded and initialized, the first run may take some time. After a short wait, the poem will stream to your terminal.
All done! You've just activated an LLM. 🎉
Don't be surprised if the output changes each time you run it. An LLM's output includes a certain level of randomness based on the temperature setting.
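To build intuition for that randomness, here is a generic sketch of temperature sampling (this illustrates standard softmax sampling, not Ailoy’s actual sampler): dividing the logits by the temperature sharpens or flattens the distribution before a token is drawn, so low temperatures behave almost greedily while high temperatures produce more varied output.

```python
import math
import random

def sample(logits, temperature, rng):
    # Softmax with temperature: lower T sharpens, higher T flattens the distribution
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token index according to the probabilities
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

rng = random.Random(0)
logits = [2.0, 1.0, 0.5]          # toy "token scores"
low_t  = [sample(logits, 0.1, rng) for _ in range(20)]  # near-greedy: almost always token 0
high_t = [sample(logits, 5.0, rng) for _ in range(20)]  # flatter: a mix of tokens

print(set(low_t), set(high_t))
```

With a low temperature the samples cluster on the highest-scoring token; with a high temperature the draws spread across all tokens, which is why repeated runs of an LLM can differ.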
Now, let’s walk through the code line by line to understand what each part does.
The very first step to using Ailoy is to start a Runtime.
- Python
- JavaScript(Node)
```python
# The runtime must be started to use Ailoy
rt = Runtime()

# ...

# Stop the runtime
rt.stop()
```
```javascript
// The runtime must be started to use Ailoy
const rt = await startRuntime();

// ...

// Stop the runtime
await rt.stop();
```
The Runtime contains Ailoy’s internal engine. Most of Ailoy’s functionality is processed by this engine.
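One practical note: a plain `rt.stop()` at the end of a script will not run if an exception is raised first, so it can help to wrap runtime usage in `try`/`finally`. Below is a minimal sketch of the pattern, using a hypothetical stand-in class so the snippet runs without Ailoy installed; with Ailoy you would construct `Runtime()` and call `rt.stop()` the same way.

```python
class StubRuntime:
    """Hypothetical stand-in for ailoy's Runtime, only to illustrate the pattern."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True

rt = StubRuntime()  # with Ailoy: rt = Runtime()
try:
    # ... define agents and run queries here ...
    pass
finally:
    rt.stop()  # guaranteed to run, even if the body above raises

print(rt.stopped)  # → True
```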
- Python
- JavaScript(Node)
```python
# Define an agent.
# During this step, the model parameters are downloaded and the LLM is set up for execution
with Agent(rt, model_name="Qwen/Qwen3-0.6B") as agent:
    # ...
```
```javascript
// Define an agent.
// During this step, the model parameters are downloaded and the LLM is set up for execution
const agent = await defineAgent(rt, "Qwen/Qwen3-0.6B");

// ...

// Once the agent is no longer needed, it can be released
await agent.delete();
```
Next, you’ll see that we create an Agent. It is the simplest way to use LLMs (or agents) in Ailoy. The Agent class provides high-level APIs that abstract away the underlying runtime, allowing you to use LLM capabilities effortlessly.
In this example, we’re using Alibaba’s Qwen3 model, which runs directly on-device.
Once an Agent is defined, you can run the LLM using the query method. The output is returned as an iterator that yields the LLM’s responses; each yielded item can be considered a single step in the successive generation process.
You can use the Ailoy CLI to manage downloaded model files.
- Python
- JavaScript(Node)
```python
# This is where the actual LLM call happens
for resp in agent.query("Please give me a short poem about AI"):
    agent.print(resp)
```
```javascript
// This is where the actual LLM call happens
for await (const resp of agent.query("Please give me a short poem about AI")) {
  agent.print(resp);
}
```
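The streaming loop above follows the ordinary iterator protocol: query hands back response chunks one at a time, and the loop consumes them as they arrive. A toy illustration with a plain generator standing in for the agent (the chunks here are made up, not real Ailoy output):

```python
def fake_query(prompt: str):
    """Hypothetical stand-in for agent.query(): yields response chunks as they are generated."""
    for chunk in ["Circuits ", "hum ", "a ", "quiet ", "song."]:
        yield chunk

pieces = []
for resp in fake_query("Please give me a short poem about AI"):
    pieces.append(resp)  # with Ailoy, you would call agent.print(resp) here

print("".join(pieces))  # → Circuits hum a quiet song.
```

Because the chunks arrive incrementally, you can display partial output immediately instead of waiting for the whole response to finish.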
The response is structured according to Ailoy’s defined output format. For a detailed specification of this format, please refer to Agent Response Format.