Chat Completion Format
Conventionally, in a chat completion setup, both input and output follow
a structured format.
Messages are typically represented as follows:
[
  {
    "role": "system",
    "contents": [
      { "type": "text", "text": "You are a friendly and knowledgeable assistant." }
    ]
  },
  {
    "role": "user",
    "contents": [
      { "type": "text", "text": "Can you explain how photosynthesis works?" }
    ]
  }
]
When this request is executed, the output might look like:
{
  "role": "assistant",
  "contents": [
    {
      "type": "text",
      "text": "Photosynthesis is the process by which plants convert sunlight, water, and carbon dioxide into energy. They use sunlight to produce glucose (a form of sugar) and release oxygen as a byproduct."
    }
  ]
}
Please refer to the following sections for more details about this schema.
Message
A Message represents one conversational turn — what one participant (system,
user, or assistant) says or does.
Each message contains:
- a role, indicating who is speaking,
- a set of contents, describing what was said or sent.
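As a rough sketch (hypothetical Python types, not Ailoy's actual classes), a message is simply a role paired with a list of content items:

from dataclasses import dataclass, field
from typing import Any, Literal

# Hypothetical model of the message schema; Ailoy's real classes may differ.
@dataclass
class Message:
    role: Literal["system", "user", "assistant", "tool"]
    contents: list[dict[str, Any]] = field(default_factory=list)

msg = Message(
    role="user",
    contents=[{"type": "text", "text": "Can you explain how photosynthesis works?"}],
)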
Role
| Role | Description |
|---|---|
| System | System instructions and constraints provided to the assistant. It usually defines the model’s behavior or persona. |
| User | Contents authored by the user. |
| Assistant | Contents automatically generated by the assistant / AI model. |
| Tool | Execution results produced by external tools / functions. |
Contents
What each role says is generally referred to as content.
However, a model can operate in different modes, producing outputs that serve
different purposes. These outputs reflect the intent behind what the model says
or does.
| Intent | Description |
|---|---|
| Content | General output. |
| Thinking (Reasoning) | Some models generate intermediate thoughts or reasoning traces before producing the final answer. These are stored in the thinking field (often hidden from the user) and can help trace or visualize the model’s internal decision process. |
| Tool call | When the model decides to invoke an external function or API instead of generating plain text. These are represented as structured objects that describe which function to call (name) and with what arguments. |
These types of outputs are stored in separate fields within a message, making it possible to distinguish them from general conversational content.
Example
[
  Message {
    role: "assistant",
    thinking: "Let’s reason step by step: photosynthesis converts light energy into chemical energy...",
    contents: [
      { type: "text", text: "Photosynthesis is the process by which plants convert sunlight, water, and carbon dioxide into energy." }
    ],
    tool_calls: [
      {
        type: "function",
        function: {
          id: "func_call_1234abcd",
          name: "get_current_location",
          arguments: ...
        }
      }
    ],
  },
]
Part
While Contents describe the intention of a message, a Part defines the data
type of its content.
A Part can be considered the smallest semantic unit within a conversation.
Each Message contains a list of Part objects. A Part can represent text,
images, function calls, or any structured values, enabling rich multimodal
communication.
| Part | Description |
|---|---|
| Text | Natural-language text content |
| Image | Visual data or reference (e.g., binary, URL, or metadata) |
| Function | Structured tool or function invocation |
| Value | Arbitrary data (numbers, objects, JSON values, etc.) |
For example, if a user asks about an image, the message could look like this:
contents: [
  { type: "image", image: { data: "..." } },
  { type: "text", text: "What can you see in this image?" }
]
Together, these parts express Ailoy’s multimodal conversation.
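For instance, a question about a local image could be assembled like this (a sketch only; whether the data field expects base64 text, raw bytes, or a URL depends on the model backend and is assumed here to be base64):

import base64

# Read a local image and encode it so it can travel inside a JSON message.
with open("leaf.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

user_message = {
    "role": "user",
    "contents": [
        {"type": "image", "image": {"data": image_b64}},
        {"type": "text", "text": "What can you see in this image?"},
    ],
}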
Delta
The inference of a language model can take a significant amount of time. To improve real-time responsiveness, many AI systems stream tokens as they are generated. These streamed outputs are typically delivered in the form of deltas.
A delta (MessageDelta or PartDelta) represents the incremental output of a single
inference step during a streaming response. As the model generates text token by
token, each incremental addition is emitted as a delta, which is later merged
into a complete Message.
Ailoy provides a simple way to retrieve and aggregate deltas through its
accumulate and finish functions.
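Conceptually, accumulation works like the following sketch. This is a simplified illustration of the pattern only; the real accumulate and finish functions handle more cases, such as thinking and tool-call deltas:

def accumulate(message: dict, delta: dict) -> dict:
    """Merge one streamed delta into the partially built assistant message."""
    for part_delta in delta.get("contents", []):
        if part_delta.get("type") == "text":
            contents = message["contents"]
            # Extend the last text part if one exists, otherwise start a new one.
            if contents and contents[-1]["type"] == "text":
                contents[-1]["text"] += part_delta["text"]
            else:
                contents.append({"type": "text", "text": part_delta["text"]})
    return message

message = {"role": "assistant", "contents": []}
stream = [
    {"contents": [{"type": "text", "text": "Photo"}]},
    {"contents": [{"type": "text", "text": "synthesis converts light into chemical energy."}]},
]
for delta in stream:
    message = accumulate(message, delta)
# message["contents"][0]["text"] == "Photosynthesis converts light into chemical energy."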
Tool
A Tool enables the model to act — to execute external functions, access APIs, or perform any operation beyond plain text generation.
Each tool has two halves:
- Description (declarative) — defines what the tool is and how it can be called.
- Behavior (imperative) — defines what your code actually does when the tool is invoked.
Together, they allow the model to dynamically invoke external capabilities while keeping reasoning and execution logically separated.
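In code, the two halves can be kept side by side, roughly as in the sketch below (illustrative only; Ailoy's actual tool registration API may differ):

# Declarative half: what the tool is and how it can be called.
get_temperature_description = {
    "name": "get_temperature",
    "description": "Retrieve current temperature for a specific city.",
    "parameters": {
        "type": "object",
        "required": ["city"],
        "properties": {"city": {"type": "string", "description": "City name"}},
    },
}

# Imperative half: what your code actually does when the tool is invoked.
def get_temperature(city: str) -> float:
    # Placeholder behavior; a real implementation would call a weather API.
    return 12.3

# A "tool" is simply the two halves kept together.
tool = {"description": get_temperature_description, "behavior": get_temperature}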
Please refer to the following sections for more about the tool schema convention.
Tool Description
Ailoy follows a JSON-Schema-like convention to describe tool arguments and optional return schemas. This ensures that language models can reliably construct valid function calls — the schema precisely defines the allowed parameters, required fields, and expected return structure.
{
  "name": "get_temperature",
  "description": "Retrieve current temperature for a specific city.",
  "parameters": {
    "type": "object",
    "required": ["city"],
    "properties": {
      "city": { "type": "string", "description": "City name" },
      "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius" }
    },
    "additionalProperties": false
  },
  "returns": {
    "type": "number"
  }
}
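Because the parameters block follows JSON Schema conventions, arguments emitted by the model can be checked with any standard validator before execution. The sketch below uses the third-party jsonschema package purely as an illustration; whether Ailoy performs this validation itself is not assumed here:

from jsonschema import ValidationError, validate

parameters_schema = {
    "type": "object",
    "required": ["city"],
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "additionalProperties": False,
}

try:
    # Arguments as the model might emit them in a tool call.
    validate(instance={"city": "Seoul", "unit": "celsius"}, schema=parameters_schema)
except ValidationError as e:
    print("Invalid tool arguments:", e.message)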
Tool Call
When a model decides to use a tool instead of generating plain text, it emits a tool call. This occurs inside an assistant message — since the assistant is the one deciding to call a tool — but the call is placed inside the tool_calls field (not contents).
A typical tool call message looks like this:
{
  "role": "assistant",
  "contents": [
    { "type": "text", "text": "Let me check the current weather for you." }
  ],
  "tool_calls": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": { "city": "Seoul", "unit": "celsius" }
      },
      "id": "call_01HZX2..."
    }
  ]
}
Explanation:
- The assistant outputs a structured tool call in the tool_calls segment.
- The runtime then executes the corresponding function based on its name and arguments.
- The id field, which uniquely identifies this tool call, can optionally be set so that the tool’s response can be correctly linked back to it.
Tool Response
Once the runtime executes the tool’s behavior, it must append a new message to
the conversation with the role set to "tool". The result is stored inside the
contents field.
For example:
{
  "role": "tool",
  "tool_call_id": "call_01HZX2...",
  "name": "get_weather",
  "contents": [
    { "type": "text", "text": "12.3" }
  ]
}
When an error occurs in a tool, it should still be passed along so that the model can recognize and handle it.
{
  "role": "tool",
  "tool_call_id": "call_01HZX2...",
  "name": "get_weather",
  "contents": [
    { "type": "text", "text": "{ \"code\": \"NOT_FOUND\" }" }
  ]
}
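Putting the two halves together, a runtime loop that executes tool calls and appends tool responses could look roughly like the following sketch (the get_weather behavior and the TOOLS registry are assumptions for illustration, not part of Ailoy):

import json

# Hypothetical behavior for the get_weather tool used in the examples above.
def get_weather(city: str, unit: str = "celsius") -> float:
    return 12.3

TOOLS = {"get_weather": get_weather}  # tool name -> behavior

def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Execute each tool call and build the tool-role messages to append."""
    tool_messages = []
    for call in assistant_message.get("tool_calls", []):
        fn = call["function"]
        try:
            result = TOOLS[fn["name"]](**fn["arguments"])
            text = json.dumps(result)
        except Exception as e:
            # Errors are still passed back so the model can recognize and handle them.
            text = json.dumps({"code": type(e).__name__, "message": str(e)})
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call.get("id"),
            "name": fn["name"],
            "contents": [{"type": "text", "text": text}],
        })
    return tool_messages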
Document
A Document is the normalized representation of any retrievable knowledge item. It consists of two components: a title and a text body.
{
  "title": "...",
  "text": "..."
}
- title: A short, descriptive label that helps identify the document. The title is not used as part of model inference — it is primarily for indexing, display, and retrieval ranking. You can think of it as metadata or a summary, similar to a filename or headline.
- text: The actual content of the document. This field is fed directly into the language model during retrieval-augmented inference. It contains the meaningful information that the model can read, reason about, and use to generate responses.
All retrieved knowledge sources (e.g., from vector databases, APIs, or local
files) are normalized into this unified "document" format before being passed
to the model. This ensures that, regardless of the original source or schema,
the model always receives consistent input.
For example:
| Source | Normalized as |
|---|---|
| Web article | { "title": "Article Title", "text": "Full article content..." } |
| PDF extract | { "title": "File: research.pdf", "text": "Extracted paragraph..." } |
| Knowledge base entry | { "title": "FAQ: Model Loading", "text": "To load a model, use..." } |
This design simplifies retrieval and unifies downstream processing within Ailoy’s knowledge pipeline.
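In code, normalization can be as simple as mapping source-specific fields onto title and text. The following sketch assumes raw field names such as headline and body; they are not part of Ailoy's schema:

def normalize_web_article(article: dict) -> dict:
    # "headline" and "body" are assumed field names of the raw source.
    return {"title": article["headline"], "text": article["body"]}

def normalize_text_file(path: str) -> dict:
    # Uses the file name as the title, mirroring the "PDF extract" row above.
    with open(path, "r", encoding="utf-8") as f:
        return {"title": f"File: {path}", "text": f.read()}

documents = [
    normalize_web_article({"headline": "Article Title", "body": "Full article content..."}),
    # normalize_text_file("notes.txt"),  # local files are normalized the same way
]
# Every entry now shares the {"title": ..., "text": ...} shape the model expects.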