Oumi AI

Pricing

Simple, transparent pricing.

Self-Hosted

Open Source

Free Forever

Everything AI researchers need to build state-of-the-art models in a single library.

  • Pre-training, fine-tuning, and evaluation at any scale
  • Text and multimodal, open and closed models
  • Data synthesis and curation
  • Run anywhere, from your laptop to the cloud

100x Faster Development

Pro Platform

Starts free · $25/month after, with pay-as-you-go

Build high-performing custom models faster. No need to bring your own compute.

  • Free credits upon signup: $50 with a corporate email, $25 with a personal email
  • Automated evaluation, synthesis, and training pipelines
  • Deploy to the inference provider of your choice
  • Expert support

The fastest way to start building. No account needed.

For Production at Scale

Enterprise

Custom Pricing

Oumi's team works alongside yours to build custom models and agents for your most critical use cases.

  • Dedicated experts embedded with your team
  • Models and agents tuned to your domain
  • Bespoke engagements scoped to your goals

Used by developers at leading organizations

Microsoft
Google
IBM
Apple
Intel
Citi
SAP
HP
DHL
Walmart
Concentrix
Johnson & Johnson
CNRS
DMG
OriginalVoices
Kaizen Gaming
Wired Informatics

Hosted Platform – detailed pricing

Detailed breakdown of tools, storage, training, and inference pricing.

Tools & Storage

Evaluation        1,000 judgments / $1
Data Synthesis    1,000 rows / $1
Storage           4 GB/month / $1
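
As a rough illustration of how the rates above combine, the following sketch estimates a monthly bill for evaluation, synthesis, and storage. The function name and the assumption that usage is prorated linearly (no minimum charge, no rounding to whole units) are ours, not something the pricing page states.

```python
# Rates taken from the Tools & Storage list above.
JUDGMENTS_PER_DOLLAR = 1_000   # evaluation
ROWS_PER_DOLLAR = 1_000        # data synthesis
GB_MONTHS_PER_DOLLAR = 4       # storage


def tools_and_storage_cost(judgments: int, rows: int,
                           storage_gb: float, months: float = 1.0) -> float:
    """Estimate cost in USD, assuming simple linear proration (an assumption)."""
    evaluation = judgments / JUDGMENTS_PER_DOLLAR
    synthesis = rows / ROWS_PER_DOLLAR
    storage = storage_gb * months / GB_MONTHS_PER_DOLLAR
    return evaluation + synthesis + storage


if __name__ == "__main__":
    # 5,000 judgments + 2,000 synthesized rows + 8 GB stored for one month.
    print(f"${tools_and_storage_cost(5_000, 2_000, 8):.2f}")  # prints $9.00
```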

Supervised Fine-Tuning

Priced per 1M training tokens, where training tokens are the number of tokens in your training dataset multiplied by the number of epochs.

Model Size     Price
Up to 16B      $0.49
16.1–32B       $2.00
32.1–80B       $3.00
80.1–300B      $6.00
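
The fine-tuning formula above (dataset tokens × epochs, billed per 1M tokens) can be sketched as a small calculator. The tier lookup treats each band as "up to and including" its upper bound, which is our reading of the table; the function and variable names are illustrative only.

```python
# (upper bound in billions of parameters, USD per 1M training tokens),
# copied from the Supervised Fine-Tuning table above.
SFT_TIERS = [(16, 0.49), (32, 2.00), (80, 3.00), (300, 6.00)]


def sft_cost(model_size_b: float, dataset_tokens: int, epochs: int) -> float:
    """Estimate SFT cost in USD for a model of `model_size_b` billion parameters."""
    training_tokens = dataset_tokens * epochs
    for max_size_b, price_per_million in SFT_TIERS:
        if model_size_b <= max_size_b:
            return training_tokens / 1_000_000 * price_per_million
    raise ValueError("model size is above the largest listed tier (300B)")


if __name__ == "__main__":
    # An 8B model trained for 3 epochs on a 50M-token dataset:
    # 150M training tokens at $0.49 per 1M.
    print(f"${sft_cost(8, 50_000_000, 3):.2f}")  # prints $73.50
```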

On-Policy Distillation

Priced per GPU hour. Training runs on 8 GPUs. GPU used is subject to availability.

GPU          Price
A100-80GB    $2.90
H100         $4.00
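
A minimal sketch of estimating a distillation run's cost follows. It assumes the listed price applies to each of the 8 GPUs for every hour of wall-clock time, so a one-hour run consumes 8 GPU hours; the page does not spell this out, and the names below are hypothetical.

```python
# USD per GPU hour, copied from the On-Policy Distillation table above.
GPU_HOURLY_PRICE = {"A100-80GB": 2.90, "H100": 4.00}

# The page states that training runs on 8 GPUs.
GPUS_PER_RUN = 8


def distillation_cost(gpu: str, wall_clock_hours: float) -> float:
    """Estimate cost in USD, assuming all 8 GPUs are billed for the whole run."""
    gpu_hours = wall_clock_hours * GPUS_PER_RUN
    return gpu_hours * GPU_HOURLY_PRICE[gpu]


if __name__ == "__main__":
    # A 3-hour run on H100s: 24 GPU hours at $4.00 each.
    print(f"${distillation_cost('H100', 3):.2f}")  # prints $96.00
```

Because the GPU type is subject to availability, it may be worth estimating both rates before starting a run.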

Inference for evaluation & synthesis

Model                         Input / 1M tokens   Output / 1M tokens
Llama 3.1 70B                 $1.00               $1.00
Llama 3.1 8B                  $0.22               $0.22
Qwen2.5 7B Instruct           $0.22               $0.22
Qwen3 235B A22B Instruct      $0.25               $0.90
Qwen3.5 9B                    $0.11               $0.17
Qwen3.5 397B A17B             $0.70               $4.00
Mixtral 8x7B Instruct v0.1    $0.55               $0.55
Mistral 7B Instruct v0.3      $0.22               $0.22
Kimi K2 Instruct              $0.70               $2.80
Kimi K2 Thinking              $0.70               $2.80
Kimi K2 Instruct 0905         $0.70               $2.80
gpt-oss-120b                  $0.15               $0.60
DeepSeek V3.1                 $0.60               $1.70
DeepSeek-V4-Pro               $1.91               $3.83
GLM-4.6                       $0.60               $2.40
GLM-5                         $1.10               $3.50
GLM-5.1                       $1.55               $4.85
Gemma 4 31B                   $0.22               $0.55

Inference is only charged when you utilize models hosted by Oumi to power an action on the platform.
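
To illustrate how the separate input and output rates add up, here is a short sketch for a few of the models listed above. The dictionary holds only a sample of the table, and the function name and token counts are illustrative assumptions rather than anything defined by the platform.

```python
# (input USD per 1M tokens, output USD per 1M tokens) for a sample of the
# models in the table above.
INFERENCE_PRICES = {
    "Llama 3.1 8B": (0.22, 0.22),
    "Qwen3 235B A22B Instruct": (0.25, 0.90),
    "gpt-oss-120b": (0.15, 0.60),
    "DeepSeek V3.1": (0.60, 1.70),
}


def inference_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate inference cost in USD from input and output token counts."""
    input_price, output_price = INFERENCE_PRICES[model]
    return (input_tokens / 1_000_000 * input_price
            + output_tokens / 1_000_000 * output_price)


if __name__ == "__main__":
    # 2M input tokens and 0.5M output tokens on Qwen3 235B A22B Instruct:
    # 2 x $0.25 + 0.5 x $0.90.
    cost = inference_cost("Qwen3 235B A22B Instruct", 2_000_000, 500_000)
    print(f"${cost:.2f}")  # prints $0.95
```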

Production Inference

Deploy fully fine-tuned or LoRA models for inference and pay:

  • Per token (auto-scales based on traffic)
  • Per GPU hour (you control capacity)

Contact us for pricing.

Sign up today with a corporate email for $50 in credits, or a personal email for $25.