Pricing
Start for free
Start experimenting with $25 in free credits
Public endpoints
For leading open-source models, pay per token while you prototype
Private endpoints
For custom models or dedicated infrastructure, pay per hour
Public endpoints
Great for prototyping and experimenting with leading off-the-shelf models.
Available via playground or API.
Prices are per 1,000 tokens and vary by model size. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph has 35 tokens.
Model size* | Price |
---|---|
Up to 4B | $0.0001 / 1k tokens |
4 - 8B | $0.0002 / 1k tokens |
8.1 - 21B | $0.0003 / 1k tokens |
21.1 - 41B | $0.0008 / 1k tokens |
41.1B and over | $0.0009 / 1k tokens |
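
As a rough illustration of the per-token arithmetic, the sketch below looks up the price tier for a model size and estimates the cost of a given number of tokens. The function names are hypothetical and not part of any API. It assumes each tier's upper bound is inclusive (so a 4B model falls under "Up to 4B"), that sizes between listed tiers (for example 8.05B) are billed at the next tier up, and that the $25 in free credits is spent only on public endpoint tokens.

```python
from decimal import Decimal

# Prices per 1,000 tokens, copied from the table above.
# Each entry is (inclusive upper bound in billions of parameters, price).
# Treating the upper bound as inclusive is an assumption; None marks the
# open-ended top tier.
PUBLIC_PRICE_PER_1K = [
    (Decimal("4"), Decimal("0.0001")),
    (Decimal("8"), Decimal("0.0002")),
    (Decimal("21"), Decimal("0.0003")),
    (Decimal("41"), Decimal("0.0008")),
    (None, Decimal("0.0009")),
]


def price_per_1k(model_size_b):
    """Return the per-1k-token price for a model size given in billions."""
    size = Decimal(str(model_size_b))
    for upper, price in PUBLIC_PRICE_PER_1K:
        if upper is None or size <= upper:
            return price


def token_cost(model_size_b, tokens):
    """Estimated cost in dollars for `tokens` tokens."""
    return price_per_1k(model_size_b) * Decimal(str(tokens)) / Decimal(1000)


def words_to_tokens(words):
    """Rough token count from a word count (1,000 tokens is about 750 words)."""
    return round(words * 1000 / 750)


def tokens_for_budget(model_size_b, budget_dollars):
    """How many tokens a dollar budget would cover at the tier's price."""
    budget = Decimal(str(budget_dollars))
    return int(budget / price_per_1k(model_size_b) * 1000)
```

Under these assumptions, `token_cost(7, 1_000_000)` comes to $0.20, and `tokens_for_budget(7, 25)` suggests the free credits would cover about 125 million tokens on a model in the 4 - 8B tier.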
Private endpoints
Great for applications with specific latency or throughput requirements.
Guarantees 24/7 availability and fixed response times.
Prices are per hour and vary by model size and performance requirements.
Start and stop your instance anytime.
Model size* | Price |
---|---|
Up to 4B | $0.07 / hour |
4 - 8B | $1.40 / hour |
8.1 - 21B | $2.00 / hour |
21.1 - 41B | $2.80 / hour |
41.1B and over | $6.17 / hour |
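
To make the hourly pricing concrete, here is a minimal sketch that estimates the cost of keeping a private endpoint running and the token volume at which an hourly rate matches a per-token rate. The function names are hypothetical. It assumes inclusive upper bounds on each size tier and billing strictly proportional to running hours. The break-even figure compares price only and ignores the latency and availability guarantees of private endpoints.

```python
from decimal import Decimal

# Hourly prices copied from the table above.
# Each entry is (inclusive upper bound in billions of parameters, price).
# The inclusive bound is an assumption; None marks the open-ended top tier.
PRIVATE_PRICE_PER_HOUR = [
    (Decimal("4"), Decimal("0.07")),
    (Decimal("8"), Decimal("1.4")),
    (Decimal("21"), Decimal("2")),
    (Decimal("41"), Decimal("2.8")),
    (None, Decimal("6.17")),
]


def hourly_price(model_size_b):
    """Return the hourly price for a model size given in billions."""
    size = Decimal(str(model_size_b))
    for upper, price in PRIVATE_PRICE_PER_HOUR:
        if upper is None or size <= upper:
            return price


def running_cost(model_size_b, hours):
    """Estimated cost in dollars of running an instance for `hours` hours."""
    return hourly_price(model_size_b) * Decimal(str(hours))


def break_even_tokens_per_hour(price_per_hour, price_per_1k_tokens):
    """Tokens per hour at which per-token billing equals the hourly rate."""
    return Decimal(str(price_per_hour)) / Decimal(str(price_per_1k_tokens)) * 1000
```

For example, `running_cost(7, 24 * 30)` gives $1,008 for a 30-day month of continuous use in the 4 - 8B tier, and `break_even_tokens_per_hour("1.4", "0.0002")` gives 7,000,000 tokens per hour for that same tier.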
Fine-tuning
Create a custom model using your data.
Any model trained on your data through our platform is yours.
Prices are per fine-tuned model and vary by model size.
Model size* | Price per model |
---|---|
Up to 4B | $500 |
4 - 8B | $800 |
8.1 - 21B | $1,000 |
21.1 - 41B | $2,000 |
41.1B and over | $5,000 |
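
Because fine-tuning is billed per model, budgeting for several custom models is a sum of flat fees. The sketch below is a hypothetical helper, not part of any API. It assumes inclusive upper bounds on each size tier, and it covers the fine-tuning fee only, not the hourly cost of serving the resulting model.

```python
from bisect import bisect_left

# Tier upper bounds in billions of parameters, treated as inclusive
# (an assumption), and the per-model fees from the table above.
TIER_UPPER_BOUNDS_B = [4, 8, 21, 41]
FINE_TUNE_FEES = [500, 800, 1000, 2000, 5000]  # last fee: 41.1B and over


def fine_tune_fee(model_size_b):
    """Return the flat fine-tuning fee in dollars for one model."""
    # bisect_left returns the index of the first bound >= model_size_b,
    # which is the tier index when upper bounds are inclusive.
    return FINE_TUNE_FEES[bisect_left(TIER_UPPER_BOUNDS_B, model_size_b)]


def fine_tune_budget(model_sizes_b):
    """Total fee in dollars for fine-tuning one model per listed size."""
    return sum(fine_tune_fee(size) for size in model_sizes_b)
```

With these assumptions, `fine_tune_budget([7, 13])` totals $1,800: $800 for the 7B model plus $1,000 for the 13B model.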