Getting started

Welcome to Ailoy’s Tutorial! 🤗

In this tutorial, we’ll explore how to run LLMs in Ailoy and extend their capabilities into agent systems.

You can install Ailoy via a package manager. Just type the following command in your shell:

pip install ailoy-py

Let’s start right away with a simple code example. Below is the simplest way to run an LLM — it’s like Ailoy’s “Hello, World!”

from ailoy import Runtime, Agent

# The runtime must be started to use Ailoy
rt = Runtime()

# Defines an agent
# During this step, the model parameters are downloaded and the LLM is set up for execution
with Agent(rt, model_name="Qwen/Qwen3-0.6B") as agent:
    # This is where the actual LLM call happens
    for resp in agent.query("Please give me a short poem about AI"):
        agent.print(resp)

# Stop the runtime
rt.stop()

Since the model needs to be downloaded and initialized, the first run may take a little time. After a short wait, you should see output similar to this:

In the digital realm, where thoughts run,
AI dreams, no more than dreams of our own.
With code and data, it creates, it learns,
But dreams, still, run in hearts, in minds.

All done! You've just activated an LLM. 🎉

note

Don't be surprised if the output changes each time you run it. An LLM's output includes a certain level of randomness based on the temperature setting.
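
To make the role of temperature concrete, here is a minimal sketch of temperature-scaled sampling over a toy set of scores. It is a generic illustration, not Ailoy's internal sampler: the function name `sample_with_temperature` and the example logits are hypothetical and chosen only for this demonstration.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Pick an index from `logits` after temperature scaling (illustrative only)."""
    if temperature <= 0:
        # Treat a non-positive temperature as greedy decoding: always take the top score
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    # Subtract the maximum before exponentiating for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

rng = random.Random(0)      # fixed seed so the demo is repeatable
logits = [2.0, 1.0, 0.5]    # hypothetical scores for three candidate tokens

low = [sample_with_temperature(logits, 0.1, rng) for _ in range(1000)]
high = [sample_with_temperature(logits, 2.0, rng) for _ in range(1000)]

# How often the top-scoring candidate was picked at each temperature
print(low.count(0), high.count(0))
```

At a low temperature the top-scoring candidate is chosen almost every time, while a higher temperature spreads the choices across the other candidates, which is why repeated runs of the same prompt can produce different text.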

Now, let’s walk through the code line by line to understand what each part does.

The very first step to using Ailoy is to start a Runtime.

# The runtime must be started to use Ailoy
rt = Runtime()

# ...

# Stop the runtime
rt.stop()

The Runtime contains Ailoy’s internal engine, which handles most of Ailoy’s functionality.

# Defines an agent.
# During this step, the model parameters are downloaded and the LLM is set up for execution
with Agent(rt, model_name="Qwen/Qwen3-0.6B") as agent:

Next, you’ll see that we create an instance of a class called Agent. It is the simplest way to use LLMs (or agents) in Ailoy. The Agent class provides high-level APIs that abstract away the underlying runtime, allowing you to use LLM capabilities with little setup. In this example, we’re using Alibaba’s Qwen3 (specifically Qwen/Qwen3-0.6B) to run the model directly on-device.

Once an Agent is defined, you can run the LLM using the query method. The output is returned as an iterator that yields the LLM’s response. This can also be considered a single step in a successive generation process.

info

You can use the Ailoy CLI to manage downloaded model files.

# This is where the actual LLM call happens.
for resp in agent.query("Please give me a short poem about AI"):
    agent.print(resp)

The response is structured according to Ailoy’s defined output format. For a detailed specification of this format, please refer to Agent Response Format.
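
The iterator pattern used by the query loop can be sketched with a plain Python generator. This is only an illustration of how yielded results are consumed one step at a time: `fake_query` is a hypothetical stand-in, not part of Ailoy, and the plain strings it yields do not reflect Ailoy's actual response format.

```python
from typing import Iterator

def fake_query(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming query; NOT Ailoy's API."""
    # Yield the reply piece by piece instead of returning it all at once
    for piece in ["In the ", "digital ", "realm..."]:
        yield piece

chunks = []
for resp in fake_query("Please give me a short poem about AI"):
    # Each loop iteration handles one yielded step as soon as it is available
    chunks.append(resp)

full_text = "".join(chunks)
print(full_text)
```

Because the results arrive through an iterator, the caller can display or process each piece as it is produced rather than waiting for the whole reply.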