Pricing
Simple, transparent pricing.
Self-Hosted
Open Source
Free Forever
Everything AI researchers need to build state-of-the-art models in a single library.
- Pre-training, fine-tuning, and evaluation at any scale
- Text and multimodal, open and closed models
- Data synthesis and curation
- Run anywhere, from your laptop to the cloud
100x Faster Development
Pro Platform
Starts free · $25/month after, with pay-as-you-go usage
Build high-performing custom models faster. No need to bring your own compute.
- Free credits upon signup: $50 (corporate) or $25 (personal).
- Automated evaluation, synthesis, and training pipelines.
- Deploy to the inference provider of your choice
- Expert support
The fastest way to start building. No account needed.
For Production at Scale
Enterprise
Custom Pricing
Oumi's team works alongside yours to build custom models and agents for your most critical use cases.
- Dedicated experts embedded with your team
- Models and agents tuned to your domain
- Bespoke engagements scoped to your goals
Used by developers at leading organizations
Hosted Platform – detailed pricing
Detailed breakdown of tools, storage, training, and inference pricing.
Tools & Storage
| Item | Rate |
|---|---|
| Evaluation | 1,000 judgments / $1 |
| Data Synthesis | 1,000 rows / $1 |
| Storage | 4 GB/month / $1 |
Supervised Fine-Tuning
Priced per 1M training tokens, where training tokens are the number of tokens in your training dataset multiplied by the number of epochs.
| Model Size | Price |
|---|---|
| Up to 16B | $0.49 |
| 16.1–32B | $2.00 |
| 32.1–80B | $3.00 |
| 80.1–300B | $6.00 |
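As a rough illustration of the formula above, the following Python sketch estimates a fine-tuning bill from dataset size, epoch count, and model size. The function names are hypothetical (they are not part of any Oumi API), and treating each tier's upper bound as inclusive is an assumption; actual billing may round or handle boundaries differently.

```python
# Prices per 1M training tokens, copied from the table above.
# Each entry: (upper bound of model size in billions of parameters, USD per 1M tokens).
# Assumption: upper bounds are inclusive.
SFT_PRICE_TIERS = [
    (16.0, 0.49),
    (32.0, 2.00),
    (80.0, 3.00),
    (300.0, 6.00),
]


def sft_price_per_million(model_size_b: float) -> float:
    """Return the listed price per 1M training tokens for a model size (in billions)."""
    for upper_bound_b, price in SFT_PRICE_TIERS:
        if model_size_b <= upper_bound_b:
            return price
    raise ValueError(f"No listed price for a {model_size_b}B model")


def estimate_sft_cost(dataset_tokens: int, epochs: int, model_size_b: float) -> float:
    """Estimate cost in USD: (dataset tokens x epochs) / 1M x price per 1M tokens."""
    training_tokens = dataset_tokens * epochs
    return training_tokens / 1_000_000 * sft_price_per_million(model_size_b)


if __name__ == "__main__":
    # 10M-token dataset, 3 epochs, 8B model: 30M training tokens at $0.49 per 1M.
    print(f"${estimate_sft_cost(10_000_000, 3, 8):.2f}")  # prints $14.70
```

Under these assumptions, the same 10M-token dataset trained for 3 epochs on a 70B model would come to 30 × $3.00 = $90.00.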
On-Policy Distillation
Priced per GPU hour. Training runs on 8 GPUs. The GPU type used is subject to availability.
| GPU | Price |
|---|---|
| A100-80GB | $2.90 |
| H100 | $4.00 |
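To make the per-GPU-hour pricing concrete, here is a minimal Python sketch of the arithmetic. It assumes all 8 GPUs are billed for the full wall-clock duration of the run, which the page implies but does not state outright; the function name and the `wall_clock_hours` parameter are hypothetical.

```python
# USD per GPU hour, copied from the table above.
GPU_HOURLY_PRICE = {
    "A100-80GB": 2.90,
    "H100": 4.00,
}

# Stated on the page: training runs on 8 GPUs.
GPUS_PER_RUN = 8


def estimate_distillation_cost(wall_clock_hours: float, gpu: str) -> float:
    """Estimate cost in USD, assuming every GPU is billed for the whole run."""
    if gpu not in GPU_HOURLY_PRICE:
        raise ValueError(f"No listed price for GPU type {gpu!r}")
    gpu_hours = wall_clock_hours * GPUS_PER_RUN
    return gpu_hours * GPU_HOURLY_PRICE[gpu]


if __name__ == "__main__":
    # A 2-hour run on H100s: 2 h x 8 GPUs = 16 GPU hours at $4.00 each.
    print(f"${estimate_distillation_cost(2, 'H100'):.2f}")  # prints $64.00
```

Because the GPU type depends on availability, it may be worth estimating both cases; the same 2-hour run on A100-80GB GPUs would come to 16 × $2.90 = $46.40 under the same assumption.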
Inference for evaluation & synthesis
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Llama 3.1 70B | $1.00 | $1.00 |
| Llama 3.1 8B | $0.22 | $0.22 |
| Qwen2.5 7B Instruct | $0.22 | $0.22 |
| Qwen3 235B A22B Instruct | $0.25 | $0.90 |
| Qwen3.5 9B | $0.11 | $0.17 |
| Qwen3.5 397B A17B | $0.70 | $4.00 |
| Mixtral 8x7B Instruct v0.1 | $0.55 | $0.55 |
| Mistral 7B Instruct v0.3 | $0.22 | $0.22 |
| Kimi K2 Instruct | $0.70 | $2.80 |
| Kimi K2 Thinking | $0.70 | $2.80 |
| Kimi K2 Instruct 0905 | $0.70 | $2.80 |
| gpt-oss-120b | $0.15 | $0.60 |
| DeepSeek V3.1 | $0.60 | $1.70 |
| DeepSeek-V4-Pro | $1.91 | $3.83 |
| GLM-4.6 | $0.60 | $2.40 |
| GLM-5 | $1.10 | $3.50 |
| GLM-5.1 | $1.55 | $4.85 |
| Gemma 4 31B | $0.22 | $0.55 |
Inference is charged only when you use Oumi-hosted models to power an action on the platform.
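The per-token rates can be combined into an estimate for a single evaluation or synthesis job. The Python sketch below covers only a few of the listed models for brevity; the function name and dictionary layout are hypothetical, and it assumes input and output tokens are billed separately at the listed rates with no rounding or minimum charge.

```python
# (input USD per 1M tokens, output USD per 1M tokens), copied from the table.
# Only a subset of the listed models is included here.
INFERENCE_PRICES = {
    "Llama 3.1 70B": (1.00, 1.00),
    "Llama 3.1 8B": (0.22, 0.22),
    "Kimi K2 Instruct": (0.70, 2.80),
    "gpt-oss-120b": (0.15, 0.60),
}


def estimate_inference_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD from input and output token counts."""
    if model not in INFERENCE_PRICES:
        raise ValueError(f"No price entered for model {model!r}")
    input_price, output_price = INFERENCE_PRICES[model]
    return (
        input_tokens / 1_000_000 * input_price
        + output_tokens / 1_000_000 * output_price
    )


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens on Llama 3.1 8B:
    # 2 x $0.22 + 0.5 x $0.22
    print(f"${estimate_inference_cost('Llama 3.1 8B', 2_000_000, 500_000):.2f}")  # prints $0.55
```

For models with asymmetric rates, output volume tends to dominate: 1M input plus 1M output tokens on Kimi K2 Instruct comes to $0.70 + $2.80 = $3.50 under these assumptions.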
Production Inference
Deploy fully fine-tuned or LoRA models for inference and pay: