01/ Pricing
Pay for the GPU-seconds you run.
Serverless inference bills per GPU-second while your code is on the GPU. Queue, scheduling, and prepare time are not charged. Training mode bills at the underlying GCP VM rate.
Reference rates (snapshot from dev catalog; log in for live values)
| GPU | Slug | VRAM | $ / sec | $ / hr | Inference | Train |
|---|---|---|---|---|---|---|
| NVIDIA T4 | `t4` | 16 GB | $0.000223 | $0.80 | Warm pool | Supported |
| NVIDIA L4 | `l4` | 24 GB | $0.000358 | $1.29 | Warm pool | Supported |
| NVIDIA A100 40 GB | `a100-40g` | 40 GB | $0.001630 | $5.87 | Warm pool | Supported |
| NVIDIA H100 | `h100` | 80 GB | $0.005440 | $19.58 | Warm pool | Supported |
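Since inference bills only for GPU-seconds while code is on the GPU, a run's charge is just `rate × GPU-seconds`. A minimal client-side sketch, using a hypothetical `RATE_PER_SEC` map that mirrors the snapshot rates above (live values come from the catalog, so treat these as illustrative):

```typescript
// Hypothetical per-second rates mirroring the snapshot table above;
// live values should be fetched from the catalog instead.
const RATE_PER_SEC: Record<string, number> = {
  "t4": 0.000223,
  "l4": 0.000358,
  "a100-40g": 0.001630,
  "h100": 0.005440,
};

// Inference bills only GPU-seconds while code is on the GPU;
// queue, scheduling, and prepare time are excluded from gpuSeconds.
function inferenceCost(gpuSlug: string, gpuSeconds: number): number {
  const rate = RATE_PER_SEC[gpuSlug];
  if (rate === undefined) throw new Error(`unknown GPU slug: ${gpuSlug}`);
  // Round to the cent for display, as a demo charge would.
  return Math.round(rate * gpuSeconds * 100) / 100;
}
```

For example, 90 GPU-seconds on an `h100` works out to 90 × $0.005440 ≈ $0.49, and a full hour on a `t4` rounds to the $0.80/hr shown in the table.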
02/ Plans
Three plans. Zero hidden fees.
Team
$TBD
/ month
- Shared workspace (coming soon)
- Priority inference warm pool
- Usage export
- SLA response
Enterprise
Custom
- Dedicated nodes
- Commit discounts
- SSO and audit log (roadmap)
- Named support
03/ Notes
The billing pipeline (invoices, auto-recharge, spend limits) ships after the v1 console. The current run summary shows a demo charge calculated client-side from the rates above.