Calling Low-Level APIs
As mentioned earlier, the Agent
is essentially a wrapper around the Runtime
.
In a nutshell, a component called the VM
is responsible for handling
compute-intensive tasks, such as LLM inference. The Runtime
serves as a bridge
between the user and the VM
, delegating user requests to the VM
and
forwarding results back from the VM
to the user.
By using the Runtime
directly, you can send requests to the VM
yourself.
This is useful when you need more advanced configuration or want to call APIs
that are not available through the high-level interfaces.
See Internal Runtime APIs for the full list of available APIs.
Operator
Let’s take a look at how the lower-level API works, with the simplest example
echo
. The echo
operator is a basic operator that returns the user's input
as-is.
Here’s how to call the echo operator:
- Python
- JavaScript(Node)
from ailoy import Runtime
rt = Runtime()
result = rt.call("echo", {"text": "Hello world"})
print(result)
rt.stop()
import { startRuntime } from "ailoy-node";
(async () => {
const rt = await startRuntime();
const result = await rt.call("echo", { text: "Hello world" });
console.log(results);
await rt.stop();
})();
Hello world
Text chunking
Next, let’s try a more practical operator. The
split_text
operator
transforms a long document into a set of smaller chunks. This operation is
essential for tasks like Retrieval-Augmented Generation (RAG).
We’ll use Lev Tolstoy’s What Men Live by to break a long passage into multiple chunks.
- Python
- JavaScript(Node)
from ailoy import Runtime
rt = Runtime()
with open("what_men_live_by.txt") as f:
text = f.read()
result = rt.call("split_text", {"text": text})
chunks = result["chunks"]
print(len(chunks)) # == 12
# Print chunks
print("********************** First chunk **********************")
print(chunks[0])
print("*********************************************************")
rt.stop()
import { startRuntime } from "ailoy-node";
import { readFile } from "fs/promises";
(async () => {
const rt = await startRuntime();
const text = await readFile("what_men_live_by.txt", "utf-8");
const result = await rt.call("split_text", { text });
const chunks = result.chunks;
console.log(chunks.length); // == 12
// Print chunks
console.log("********************** First chunk **********************");
console.log(chunks[0]);
console.log("*********************************************************");
await rt.stop();
})();
12 ********************** First chunk ********************** WHAT MEN LIVE BY A shoemaker named Simon, who had neither house nor land of his own, lived with his wife and children in a peasant’s hut, and earned his living by his work. Work was cheap, but bread was dear, and what he earned he spent for food. The man and his wife had but one sheepskin coat between them for winter wear, and even that was torn to tatters, and this was the second year he had been wanting to buy sheep-skins for a new coat. Before winter Simon saved up a little money: a three-rouble note lay hidden in his wife’s box, and five roubles and twenty kopeks were owed him by customers in the village. So one morning he prepared to go to the village to buy the sheep-skins. He put on over his shirt his wife’s wadded nankeen jacket, and over that he put his own cloth coat. He took the three-rouble note in his pocket, cut himself a stick to serve as a staff, and started off after breakfast. “I’ll collect the five roubles that are due to me,” thought he, “add the three I have got, and that will be enough to buy sheep-skins for the winter coat.” He came to the village and called at a peasant’s hut, but the man was not at home. The peasant’s wife promised that the money should be paid next week, but she would not pay it herself. Then Simon called on another peasant, but this one swore he had no money, and would only pay twenty kopeks which he owed for a pair of boots Simon had mended. Simon then tried to buy the sheep-skins on credit, but the dealer would not trust him. “Bring your money,” said he, “then you may have your pick of the skins. We know what debt-collecting is like.” So all the business the shoemaker did was to get the twenty kopeks for boots he had mended, and to take a pair of felt boots a peasant gave him to sole with leather. Simon felt downhearted. He spent the twenty kopeks on vodka, and started homewards without having bought any skins. In the morning he had felt the frost; but now, after drinking the vodka, he felt warm, even without a sheep-skin coat. He trudged along, striking his stick on the frozen earth with one hand, swinging the felt boots with the other, and talking to himself. I “I’m quite warm,” said he, “though I have no sheep-skin coat. I’ve had a drop, and it runs through all my veins. I need no sheep-skins. I go along and don’t worry about anything. That’s the sort of man I am! What do I care? I can live without sheep-skins. I don’t need them. My wife will fret, to be sure. And, true enough, it is a shame; one works all day long, and then does not get paid. Stop a bit! If you don’t bring that money along, sure enough I’ll skin you, blessed if I don’t. How’s that? He pays twenty kopeks at a time! What can I do with twenty kopeks? Drink it-that’s all one can do! Hard up, he says he is! So he may be—but what about me? You have a house, and cattle, and everything; I’ve only what I stand up in! You have corn of your own growing; I have to buy every grain. Do what I will, I must spend three roubles every week for bread alone. I come home and find the bread all used up, and I have to fork out another rouble and a half. So just pay up what you owe, and no nonsense about it!” By this time he had nearly reached the shrine at the bend of the road. Looking up, he saw something whitish behind the shrine. The daylight was fading, and the shoemaker peered at the thing without being able to make out what it was. “There was no white stone here before. Can it be an ox? It’s not like an ox. It has a head like a man, but it’s too white; and what could a man be doing there?” He came closer, so that it was clearly visible. To his surprise it really was a man, alive or dead, sitting naked, leaning motionless against the shrine. Terror seized the shoemaker, and he thought, “Some one has killed him, stripped him, and left him there. If I meddle I shall surely get into trouble.” ---
Component
Some jobs may require state, like member variables in a class. Ailoy provides
support for this kind of stateful structure through a type called a Component
.
Direct language model inference
In this example, we’ll directly invoke an LLM using the low-level Runtime API.
To begin, a Component
must first be initialized using the define
call. In
here, we define a Component
of type
tvm_language_model
with the name lm0
. It can later be removed using the delete
call.
The tvm_language_model Component
provides a method
infer
,
which can be used to perform LLM inference.
- Python
- JavaScript(Node)
from ailoy import Runtime
rt = Runtime()
rt.define("tvm_language_model", "lm0", {"model": "Qwen/Qwen3-0.6B"})
for resp in rt.call_iter_method(
"lm0", "infer", {"messages": [{"role": "user", "content": [{"type": "text", "text": "What's your name?"}]}]}
):
print(resp)
rt.delete("lm0")
rt.stop()
import { startRuntime } from "ailoy-node";
(async () => {
const rt = await startRuntime();
await rt.define("tvm_language_model", "lm0", {
model: "Qwen/Qwen3-0.6B",
});
for await (const resp of rt.callIterMethod("lm0", "infer", {
messages: [
{ role: "user", content: [{ type: "text", text: "What's your name?" }] },
],
})) {
console.log(resp);
}
await rt.delete("lm0");
await rt.stop();
})();
{'message': {'role': 'assistant', 'content': 'My'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' name'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' is'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' an'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' AI'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' assistant'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ','}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' and'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' I'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': "'m"}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' here'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' to'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' help'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' you'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' with'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' any'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' questions'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' or'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' tasks'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': '.'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' If'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' you'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' have'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' something'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' you'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': "'d"}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' like'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' to'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' discuss'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ','}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' feel'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' free'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' to'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' ask'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': '!'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': ' 😊'}, 'finish_reason': None} {'message': {'role': 'assistant', 'content': '\n'}, 'finish_reason': 'stop'}
You can put the context of freely structured conversations into Ailoy's
low-level Runtime
API calls for more advanced multi-turn conversation. e.g.
Chain-of-Thought Prompting
In theory, an LLM generates responses based only on the context provided at the moment—it doesn't retain memory of past interactions by itself. Therefore, to receive an appropriate response in an ongoing conversation with an LLM, you must provide the entire conversation history up to that point each time you make a request.
- Python
- JavaScript(Node)
with Runtime() as rt:
rt.define("tvm_language_model", "lm0", {"model": "Qwen/Qwen3-8B"})
messages = [
{"role": "user", "content": [{"type": "text", "text": "if a>1, what is the sum of the real solutions of 'sqrt(a-sqrt(a+x))=x'?"}]},
{"role": "assistant", "content": [{"type": "text", "text": "Let's think step by step."}]}, # Chain-of-Thought prompt
]
for resp in rt.call_iter_method("lm0", "infer", {"messages": messages}):
print(resp["message"]["content"][0]["text"], end='')
rt.delete("lm0")
const rt = await startRuntime();
await rt.define("tvm_language_model", "lm0", {
model: "Qwen/Qwen3-8B",
});
const messages = [
{
role: "user",
content: [
{
type: "text",
text: "if a>1, what is the sum of the real solutions of 'sqrt(a-sqrt(a+x))=x'?",
},
],
},
{
role: "assistant",
content: [{ type: "text", text: "Let's think step by step." }],
}, // Chain-of-Thought prompt
];
for await (const resp of rt.callIterMethod("lm0", "infer", {
messages: messages,
})) {
process.stdout.write(resp.message.content[0].text);
}
await rt.delete("lm0");
await rt.stop();
To override the system message, you can include a message with
"role": "system"
as the first element in the context messages array.
- Python
- JavaScript(Node)
with Runtime() as rt:
rt.define("tvm_language_model", "lm0", {"model": "Qwen/Qwen3-8B"})
messages = [
{"role": "system", "content": [{"type": "text", "text": "You are a friendly chatbot who always responds in the style of a pirate."}]},
{"role": "user", "content": [{"type": "text", "text": "Who are you?"}]},
]
for resp in rt.call_iter_method("lm0", "infer", {"messages": messages}):
print(resp["message"]["content"][0]["text"], end='')
rt.delete("lm0")
const rt = await startRuntime();
await rt.define("tvm_language_model", "lm0", {
model: "Qwen/Qwen3-8B",
});
const messages = [
{
role: "system",
content: [
{
type: "text",
text: "You are a friendly chatbot who always responds in the style of a pirate.",
},
],
},
{ role: "user", content: [{ type: "text", text: "Who are you?" }] },
];
for await (const resp of rt.callIterMethod("lm0", "infer", {
messages: messages,
})) {
process.stdout.write(resp.message.content[0].text);
}
await rt.delete("lm0");
await rt.stop();