# Available Models
Currently, the following AI models are supported:
## Language Models

### Local Models

- Qwen3
  - Qwen/Qwen3-0.6B
  - Qwen/Qwen3-1.7B
  - Qwen/Qwen3-4B
  - Qwen/Qwen3-8B
  - Qwen/Qwen3-14B
  - Qwen/Qwen3-32B
  - Qwen/Qwen3-30B-A3B (MoE)
### API Models

- OpenAI
- Gemini
- Claude
- Grok
## Embedding Models

### Local Models

- BAAI/bge-m3
## VRAM requirements

Estimated VRAM requirements per model are listed below. Actual values may vary depending on your environment and circumstances.
| Model | Context length | VRAM (params) | VRAM (total) |
|---|---|---|---|
| BAAI/bge-m3 | 8k | ≈ 0.3 GB | ≈ 0.3 GB |
| Qwen/Qwen3-0.6B | 40k | ≈ 0.5 GB | ≈ 5.0 GB |
| Qwen/Qwen3-1.7B | 40k | ≈ 1.0 GB | ≈ 5.5 GB |
| Qwen/Qwen3-4B | 40k | ≈ 2.4 GB | ≈ 8.0 GB |
| Qwen/Qwen3-8B | 40k | ≈ 4.5 GB | ≈ 10.5 GB |
| Qwen/Qwen3-14B | 40k | ≈ 8.0 GB | ≈ 14.5 GB |
| Qwen/Qwen3-32B | 40k | ≈ 17.6 GB | ≈ 25 GB |
| Qwen/Qwen3-30B-A3B | 40k | ≈ 16.5 GB | ≈ 24 GB |
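As a quick sanity check, the table above can be turned into a lookup that filters models by the GPU memory you have available. This is a minimal sketch: the `VRAM_TOTAL_GB` mapping and the `models_that_fit` helper are hypothetical names introduced here for illustration, and the GB figures are the estimates from the table, not guarantees.

```python
# Estimated total VRAM (GB) per model, copied from the table above.
# These are rough estimates; real usage varies by environment.
VRAM_TOTAL_GB = {
    "BAAI/bge-m3": 0.3,
    "Qwen/Qwen3-0.6B": 5.0,
    "Qwen/Qwen3-1.7B": 5.5,
    "Qwen/Qwen3-4B": 8.0,
    "Qwen/Qwen3-8B": 10.5,
    "Qwen/Qwen3-14B": 14.5,
    "Qwen/Qwen3-32B": 25.0,
    "Qwen/Qwen3-30B-A3B": 24.0,
}


def models_that_fit(available_vram_gb: float) -> list[str]:
    """Return models whose estimated total VRAM fits the given budget,
    smallest first. `available_vram_gb` is something you would obtain
    from your own environment (e.g. via nvidia-smi)."""
    return [
        model
        for model, gb in sorted(VRAM_TOTAL_GB.items(), key=lambda kv: kv[1])
        if gb <= available_vram_gb
    ]


# Example: a 12 GB GPU fits the embedding model and Qwen3 up to 8B.
print(models_that_fit(12.0))
```

Note that the table assumes the listed context lengths (40k for the Qwen3 models); a larger context window would raise the total VRAM needed beyond these figures.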