Getting Started

Welcome to Ailoy’s Tutorial! 🤗

In this tutorial, we’ll explore how to run LLMs in Ailoy and extend their capabilities to build agent systems.

Installation

You can install Ailoy via a package manager. Just type the following command in your shell:

pip install ailoy-py

Info: To use Ailoy in a web browser, see WebAssembly Supports for more details.

Running the Simplest Agent

The Language Model (LM) is the core of AI agents. An agent system operates based on a language model and its instructions, along with various extensions that enhance its capabilities.

API Models and Local Models

Ailoy lets you run language models in two different modes:

  • API Models: run vendor-hosted models such as OpenAI’s GPT or Anthropic’s Claude
  • Local Models: run open-source models directly on your own machine

Each approach has its own strengths and trade-offs:

  • API Models: connect to external AI services over the internet. If you’re fine with API token costs and prefer cloud-hosted models, this is the simplest option.
  • Local Models: run entirely on your own device using open-source models. Choose this if you want full control, offline availability, or privacy.

Ailoy supports both seamlessly — you can switch between them anytime with minimal changes.

Example

Let’s start with a simple example — Ailoy’s version of “Hello, World!” for LLMs.

In just a few lines of code, you’ll run a model, send it a prompt, and print the response.

Agent Using APIs

import asyncio

import ailoy as ai


async def main():
    lm = ai.LangModel.new_stream_api(
        spec="OpenAI",
        model_name="gpt-4o",
        api_key="YOUR_OPENAI_API_KEY"
    )
    agent = ai.Agent(lm)
    async for resp in agent.run("Please give me a short poem about AI."):
        if isinstance(resp.message.contents[0], ai.Part.Text):
            print(resp.message.contents[0].text)


if __name__ == "__main__":
    asyncio.run(main())
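
The example above passes the key as a string literal. If you would rather keep the key out of your source code, one option is to read it from an environment variable. The following is a minimal sketch, assuming the key is stored in a variable named OPENAI_API_KEY; both that name and the load_api_key helper are illustrative and are not part of Ailoy.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The helper and the default variable name are illustrative assumptions;
    Ailoy itself only needs the resulting string.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(
            f"Set the {var_name} environment variable before running the agent."
        )
    return key
```

The returned string can then be passed as api_key=load_api_key() in the call to ai.LangModel.new_stream_api shown above.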

Agent Using Local Models

Here, we use the open-source model Qwen3-0.6B, which runs entirely on your own device. No API key is required, and once the model files have been downloaded on the first run, no internet connection is needed either.

import asyncio

import ailoy as ai


async def main():
    lm = await ai.LangModel.new_local(
        model_name="Qwen/Qwen3-0.6B", progress_callback=print
    )
    agent = ai.Agent(lm)
    async for resp in agent.run("Please give me a short poem about AI."):
        if isinstance(resp.message.contents[0], ai.Part.Text):
            print(resp.message.contents[0].text)


if __name__ == "__main__":
    asyncio.run(main())

Info: You can use the Ailoy CLI to manage downloaded model files.

Output

Because the model must be downloaded and initialized, the first run may take a little time.
After a short wait, you will see output similar to this:

In the digital realm, where thoughts run,
AI dreams, no more than dreams of our own.
With code and data, it creates, it learns,
But dreams, still, run in hearts, in minds.

That’s it! 🎉

You’ve just run your first AI agent with Ailoy.

Info: For a detailed explanation of Ailoy’s input and output formats, see Chat Completion Format.

Info: Don't be surprised if the output changes each time you run it.

An LLM's output includes a degree of randomness, controlled by the temperature and top_p parameters. In Ailoy, these can be adjusted with InferenceConfig in AgentConfig.

If both temperature and top_p are set to 0, the answer becomes deterministic.
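
To see why these two settings control randomness, here is a toy sketch of temperature scaling followed by top_p (nucleus) filtering over a hand-made set of token scores. It illustrates the general technique only and is not Ailoy's implementation. The function name, the example scores, and the convention of treating a temperature of 0 as "always pick the best token" are assumptions made for illustration.

```python
import math


def next_token_distribution(logits, temperature=1.0, top_p=1.0):
    """Return {token: probability} after temperature scaling and top_p filtering."""
    if temperature <= 0:
        # Assumed convention: temperature 0 means greedy selection.
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Softmax over temperature-scaled scores (shifted by the max for stability).
    peak = max(logits.values())
    weights = {t: math.exp((s - peak) / temperature) for t, s in logits.items()}
    total = sum(weights.values())
    probs = {t: w / total for t, w in weights.items()}

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    kept, mass = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        mass += p
        if mass >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    return {t: p / mass for t, p in kept.items()}


# Hypothetical scores for three candidate next tokens.
logits = {"dreams": 2.0, "learns": 1.0, "sleeps": 0.1}
print(next_token_distribution(logits, temperature=1.0, top_p=1.0))  # all three tokens remain
print(next_token_distribution(logits, temperature=1.0, top_p=0.5))  # only the top token remains
print(next_token_distribution(logits, temperature=0.0))             # greedy: a single token
```

In this sketch, lowering the temperature concentrates probability on the highest-scoring token, and lowering top_p discards the unlikely tail. When only one candidate survives, there is nothing left to sample from, so the choice is the same on every run.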