01/ Pricing

Pay for the GPU-seconds you run.

Serverless inference bills per GPU-second while your code is on the GPU. Queue, scheduling, and prepare time are not charged. Training mode bills at the underlying GCP VM rate.

Reference rates (snapshot from the dev catalog; log in for live values)

GPU                  ID          VRAM     $/sec        $/hr      Inference    Train
NVIDIA T4            t4          16 GB    $0.000223    $0.80     Warm pool    Supported
NVIDIA L4            l4          24 GB    $0.000358    $1.29     Warm pool    Supported
NVIDIA A100 40 GB    a100-40g    40 GB    $0.001630    $5.87     Warm pool    Supported
NVIDIA H100          h100        80 GB    $0.005440    $19.58    Warm pool    Supported
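The per-hour column is just the per-second rate multiplied by 3600. A quick sketch confirming the two columns agree (values are the dev-catalog snapshot above, not live prices):

```python
# Dev-catalog snapshot rates in USD per GPU-second (not live values).
RATES_PER_SEC = {
    "t4": 0.000223,
    "l4": 0.000358,
    "a100-40g": 0.001630,
    "h100": 0.005440,
}

# Derive the per-hour rate: per-second rate * 3600 seconds.
for gpu, per_sec in RATES_PER_SEC.items():
    per_hr = round(per_sec * 3600, 2)
    print(f"{gpu}: ${per_sec}/sec -> ${per_hr}/hr")
```

Running this reproduces the $/hr column: $0.80, $1.29, $5.87, and $19.58.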
02/ Plans

Three plans. Zero hidden fees.

Starter
$0
  • Personal workspace
  • Pay per GPU-second
  • CLI and agent SDK
  • Community support

Team
$TBD / month
  • Shared workspace (coming soon)
  • Priority inference warm pool
  • Usage export
  • SLA response

Enterprise
Custom
  • Dedicated nodes
  • Commit discounts
  • SSO and audit log (roadmap)
  • Named support
03/ Notes

The billing pipeline (invoices, auto-recharge, spend limits) ships after the v1 console. Until then, the run summary shows a demo charge calculated client-side from the rates above.
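A minimal sketch of what that client-side demo charge could look like, combining the rate snapshot with the rule that queue, scheduling, and prepare time are free. The run-summary field names here are illustrative assumptions, not the real schema:

```python
# Dev-catalog snapshot rates in USD per GPU-second (see table above).
SNAPSHOT_RATES = {
    "t4": 0.000223,
    "l4": 0.000358,
    "a100-40g": 0.001630,
    "h100": 0.005440,
}

def demo_charge(run: dict) -> float:
    """Demo charge in USD: bill only seconds actually spent on the GPU.

    Queue and prepare phases are subtracted first, matching the billing
    rule above. The dict keys are hypothetical, for illustration only.
    """
    billed = run["total_seconds"] - run["queue_seconds"] - run["prepare_seconds"]
    return round(max(billed, 0.0) * SNAPSHOT_RATES[run["gpu"]], 6)

# A 240-second A100 run that spent 20 seconds queued and preparing
# bills 220 GPU-seconds: 220 * $0.001630.
run = {"gpu": "a100-40g", "total_seconds": 240,
       "queue_seconds": 12, "prepare_seconds": 8}
print(demo_charge(run))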