Prompt Cache ROI Calculator

Anthropic caching — calculate real cost savings

Cache write: 1.25× input price (25% premium) · Cache read: 0.1× input price (90% discount)

Calculate exact savings from Claude prompt caching: the break-even point, plus monthly and annual savings.

How to use this tool

  1. Pick a model and cache TTL

     Claude Opus, Sonnet, or Haiku, with a 5-minute or 1-hour cache.

  2. Enter cached vs. dynamic tokens

     System prompt and examples (cached) vs. changing user input.

  3. Enter requests per hour

     Higher traffic means better cache amortization.

  4. See monthly savings

     Break-even point, annual savings, and the optimal cache strategy.
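The steps above boil down to simple arithmetic. Here is a minimal sketch of the underlying math; the function name, the one-write-per-hour miss model, and the $3/MTok example price are illustrative assumptions, not this tool's actual implementation:

```python
def monthly_costs(cached_tokens, dynamic_tokens, requests_per_hour,
                  price_per_mtok, write_mult=1.25, read_mult=0.10,
                  writes_per_hour=1, hours_per_month=730):
    """Return (cost_without_cache, cost_with_cache) in dollars per month.

    Simplified miss model: `writes_per_hour` cache writes per hour and the
    rest of the traffic hits the cache. With steady traffic inside the TTL
    the real miss rate can be even lower.
    """
    requests = requests_per_hour * hours_per_month
    per_tok = price_per_mtok / 1e6
    # Without caching, the full prompt is billed at base price every time.
    without = requests * (cached_tokens + dynamic_tokens) * per_tok
    writes = writes_per_hour * hours_per_month
    reads = requests - writes
    # With caching: premium on writes, discount on reads, dynamic part at base.
    with_cache = (writes * cached_tokens * write_mult
                  + reads * cached_tokens * read_mult
                  + requests * dynamic_tokens) * per_tok
    return without, with_cache

# Example: 10k cached + 500 dynamic tokens, 60 req/hour, $3/MTok base input.
base, cached = monthly_costs(10_000, 500, 60, 3.0)
```

At these example numbers the uncached bill is about $1,380/month versus roughly $222/month with caching, an ~84% reduction.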

Frequently Asked Questions

How does Anthropic prompt caching work?
Anthropic caches portions of your prompt (system prompt, examples, long context) up to an explicit `cache_control` breakpoint. Cache writes cost 1.25× the base input price (a 25% premium); cache reads cost 0.1× (a 90% discount). The default TTL is 5 minutes; the extended 1-hour TTL raises the write cost to 2× the base input price, while reads stay at 0.1×.
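For reference, a hedged sketch of where the breakpoint sits in a Messages API request body, built as a plain dict with no network call; the model name and prompt text are placeholders:

```python
# Request body sketch only; no API call is made. The `system` block up to
# and including the `cache_control` marker is the cached prefix; the user
# message is the dynamic, per-request part. Model name is a placeholder.
LONG_SYSTEM_PROMPT = "You are a meticulous support agent. " * 200  # static, >1024 tokens

request_body = {
    "model": "claude-example-model",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},  # cache breakpoint
        }
    ],
    "messages": [
        {"role": "user", "content": "This part changes on every request."}
    ],
}
```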
When does caching save money?
Break-even comes after a single cache hit: solving 1.25 + 0.1×N ≤ 1.0×(N + 1) gives N ≥ 0.28, so one read (1.35 units vs 2.0 uncached) already saves money. Substantial savings start around 10+ reads per write. At 100 reads per write you pay 1.25 + 100×0.1 = 11.25 units vs 101 without caching, roughly 89% savings.
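The break-even arithmetic can be checked in a few lines (one unit = the cost of sending the cacheable tokens once at base input price; the helper names are illustrative):

```python
WRITE_MULT = 1.25  # cache write: base input price + 25% premium
READ_MULT = 0.10   # cache read: 90% discount off base input price

def cached_units(n_reads):
    """Cost of 1 cache write plus n_reads cache hits, in base-price units."""
    return WRITE_MULT + READ_MULT * n_reads

def uncached_units(n_reads):
    """Cost of the same n_reads + 1 requests with no caching."""
    return n_reads + 1
```

`cached_units(1)` is 1.35 vs `uncached_units(1)` at 2.0, so the very first hit is already cheaper; at 100 reads, 11.25 vs 101 units is roughly 89% saved.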
What should I cache?
Large static content: system prompts, instruction templates, few-shot examples, long documents, tool definitions. Don't cache: short prompts (<1024 tokens on Sonnet/Opus, <2048 on Haiku), rapidly changing content, PII that varies per user.
Minimum cacheable tokens?
Claude Sonnet 4 / Opus: 1024 tokens minimum. Claude Haiku: 2048 tokens minimum. Below this, caching is not applied.
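A tiny illustrative helper for this threshold check; the dict keys are made-up family labels, not real API model identifiers:

```python
# Thresholds from the answer above; keys are informal (hypothetical) labels.
MIN_CACHEABLE_TOKENS = {
    "opus": 1024,
    "sonnet": 1024,
    "haiku": 2048,
}

def is_cacheable(model_family: str, prompt_prefix_tokens: int) -> bool:
    """True if the cacheable prefix meets the model's minimum length."""
    return prompt_prefix_tokens >= MIN_CACHEABLE_TOKENS[model_family]
```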

100% Privacy. This tool runs entirely in your browser. Your data is never uploaded to any server.