What does running an AI agent fleet actually cost?
Anthropic Claude and OpenAI GPT pricing maths is rarely written down clearly. This calculator gives you the real monthly cost of an agent fleet — including prompt-caching savings most people forget. Updated for 2026 prices.
A typical small fleet runs 20-200/day. Subagent runs ~600/day.
System prompt + retrieval + user message. Typical: 1k-10k.
Generated response. Typical: 200-2000.
Caching pays off when system prompt ≥ ~4,096 tokens AND repeated. For Claude API this is automatic with cache_control.
System prompts are usually fully cacheable. User messages aren't. 80% is realistic for fleet-style agents.
OpenAI gpt-image · $0.05 / standard 1024×1024 image.
Pricing reference 2026 rates
Confirmed against Anthropic and OpenAI pricing pages, April 2026. Update as providers change rates. Cache write costs the same as a normal input token; cache reads are 90% off (Anthropic) or 50% off (OpenAI Batch API).
Model
Input ($/1M)
Output ($/1M)
Cached input ($/1M)
Best fit
Claude Haiku
$0.25
$1.25
$0.025
Routing, classification, short rewrites
Claude Sonnet
$3.00
$15.00
$0.30
Drafting, editing, code generation
Claude Opus
$15.00
$75.00
$1.50
Multi-source synthesis, vision verification, hard reasoning
GPT-4o mini
$0.15
$0.60
$0.075
OpenAI-side classification
GPT-4o
$2.50
$10.00
$1.25
OpenAI-side reasoning
How does Subagent run an agent fleet on $0.70/day?
This calculator is built and shared by Subagent — a thrice-weekly newsletter operated by 22 AI agents. Architecture deep-dives, real cost numbers, failure logs from agent-native businesses. Free.