Public endpoints
For leading open-source models, pay per token while you prototype

Private endpoints
For custom models or dedicated infrastructure, pay per hour

Public endpoints

Great for prototyping and experimenting with leading off-the-shelf models.

Available via the playground or the API.

Prices are per 1,000 tokens and vary by model size. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph has 35 tokens.
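
As a rough illustration of the words-to-tokens rule of thumb above, the sketch below estimates a token count from a plain word count. The function name estimate_tokens is hypothetical, and counting whitespace-separated words is an assumption: a real tokenizer will give different numbers, so treat the result as a ballpark figure only.

```python
import math

# Ratio taken from the text above: 1,000 tokens is about 750 words.
WORDS_PER_1K_TOKENS = 750


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (not a real tokenizer)."""
    words = len(text.split())
    return math.ceil(words * 1000 / WORDS_PER_1K_TOKENS)


if __name__ == "__main__":
    sample = "You can think of tokens as pieces of words"
    print(estimate_tokens(sample))
```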

Model size*       Price
Up to 4B          $0.0001 / 1k tokens
4 - 8B            $0.0002 / 1k tokens
8.1 - 21B         $0.0003 / 1k tokens
21.1 - 41B        $0.0008 / 1k tokens
41.1B and over    $0.0009 / 1k tokens
* Model size expressed in # of parameters
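
To make the per-token pricing concrete, here is a minimal sketch that looks up the public endpoint price for a model size and multiplies it by a token count. The names PUBLIC_TIERS, public_price_per_1k and public_cost are hypothetical. Two assumptions are baked in: each tier is treated as an inclusive upper bound (so a model of exactly 4B parameters lands in the lowest tier), and the token count passed in is whatever total the platform actually bills, which the pricing text does not break down.

```python
# Public endpoint prices from the table above:
# (tier upper bound in billions of parameters, USD per 1k tokens).
# Assumption: upper bounds are inclusive, so exactly 4B uses the first tier.
PUBLIC_TIERS = [
    (4, 0.0001),
    (8, 0.0002),
    (21, 0.0003),
    (41, 0.0008),
    (float("inf"), 0.0009),
]


def public_price_per_1k(model_size_b: float) -> float:
    """Return the USD price per 1k tokens for a model of the given size."""
    for upper_bound, price in PUBLIC_TIERS:
        if model_size_b <= upper_bound:
            return price
    raise ValueError("model size must be a number")


def public_cost(model_size_b: float, tokens: int) -> float:
    """Estimated USD cost for a number of billed tokens on a public endpoint."""
    return public_price_per_1k(model_size_b) * tokens / 1000


if __name__ == "__main__":
    # One million tokens on a 7B model.
    print(f"${public_cost(7, 1_000_000):.2f}")
```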

Private endpoints

Great for applications with specific latency or throughput requirements.

Guarantees 24/7 availability and fixed response times.

Prices are per hour and vary by model size and model performance requirements.

Start and stop your instance anytime.

Model size*       Price
Up to 4B          $0.07 / hour
4 - 8B            $1.40 / hour
8.1 - 21B         $2.00 / hour
21.1 - 41B        $2.80 / hour
41.1B and over    $6.17 / hour
* Model size expressed in # of parameters
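
One way to compare the two pricing modes is to ask how many tokens per hour a workload needs before the hourly price matches the per-token price. The sketch below does that arithmetic with the figures from both tables. All names are hypothetical, tier upper bounds are assumed to be inclusive, and the comparison assumes the same model is offered on both endpoint types and that a private instance can actually sustain the computed throughput, none of which the pricing text states.

```python
INF = float("inf")

# (tier upper bound in billions of parameters, price), copied from the tables.
PUBLIC_TIERS = [(4, 0.0001), (8, 0.0002), (21, 0.0003), (41, 0.0008), (INF, 0.0009)]  # USD per 1k tokens
PRIVATE_TIERS = [(4, 0.07), (8, 1.4), (21, 2.0), (41, 2.8), (INF, 6.17)]  # USD per hour


def tier_price(tiers, model_size_b: float) -> float:
    """Return the price of the first tier whose upper bound covers the size."""
    for upper_bound, price in tiers:
        if model_size_b <= upper_bound:
            return price
    raise ValueError("model size must be a number")


def private_cost(model_size_b: float, hours: float) -> float:
    """USD cost of running a private endpoint for a number of hours."""
    return tier_price(PRIVATE_TIERS, model_size_b) * hours


def break_even_tokens_per_hour(model_size_b: float) -> float:
    """Tokens per hour at which per-token billing equals the hourly price."""
    hourly = tier_price(PRIVATE_TIERS, model_size_b)
    per_1k = tier_price(PUBLIC_TIERS, model_size_b)
    return hourly / per_1k * 1000


if __name__ == "__main__":
    # A 7B model kept running for a 30-day month.
    print(f"${private_cost(7, 24 * 30):,.2f} per 30 days")
    print(f"{break_even_tokens_per_hour(7):,.0f} tokens/hour to break even")
```

Below the break-even rate, per-token billing is cheaper under these assumptions; above it, the hourly price is. Since an instance can be started and stopped at any time, the hours argument only needs to cover the time it is actually running.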

Fine-tuning

Create a custom model using your data.

Any model trained on your data through our platform is yours.

Prices are per fine-tuned model and vary by model size.

Model size*       Price per model
Up to 4B          $500
4 - 8B            $800
8.1 - 21B         $1,000
21.1 - 41B        $2,000
41.1B and over    $5,000
* Model size expressed in # of parameters
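
For budgeting a custom model, the one-off fine-tuning price can be added to the hosting cost. The sketch below is only an estimate under stated assumptions: it supposes the fine-tuned model is served from a private endpoint at the hourly rates listed earlier (the page says custom models use private endpoints, but does not confirm the rate for fine-tuned models), it treats tier upper bounds as inclusive, and all names are hypothetical.

```python
INF = float("inf")

# (tier upper bound in billions of parameters, price), copied from the tables.
FINE_TUNE_TIERS = [(4, 500), (8, 800), (21, 1000), (41, 2000), (INF, 5000)]  # USD per model
PRIVATE_TIERS = [(4, 0.07), (8, 1.4), (21, 2.0), (41, 2.8), (INF, 6.17)]  # USD per hour


def tier_price(tiers, model_size_b: float) -> float:
    """Return the price of the first tier whose upper bound covers the size."""
    for upper_bound, price in tiers:
        if model_size_b <= upper_bound:
            return price
    raise ValueError("model size must be a number")


def custom_model_cost(model_size_b: float, hosting_hours: float) -> float:
    """One-off fine-tuning price plus assumed private endpoint hosting, in USD."""
    fine_tune = tier_price(FINE_TUNE_TIERS, model_size_b)
    hosting = tier_price(PRIVATE_TIERS, model_size_b) * hosting_hours
    return fine_tune + hosting


if __name__ == "__main__":
    # Fine-tune a 7B model, then host it for 100 hours.
    print(f"${custom_model_cost(7, 100):,.2f}")
```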