Claude Fable 5 Is Now Credit-Only: What a Real Coding Session Costs After June 22
TL;DR: As of June 23, Claude Fable 5 is no longer free on Pro, Max, or Team. Every call now bills against usage credits at the full API rate of $10/$50 per million tokens (input/output). A single light agentic task runs about $0.90, a heavy multi-turn session about $7. Opus 4.8 does the same work for half that, and GLM-5.2 for about an eighth. Keep Fable 5 for the hard 20%; route everything else cheaper.
| | Claude Fable 5 | Claude Opus 4.8 | GLM-5.2 | OpenCode + Ollama |
|---|---|---|---|---|
| Price (in / out per M) | $10 / $50 | $5 / $25 | $1.40 / $4.40 | $0 (local) |
| Light task (~50K in / 8K out) | ~$0.90 | ~$0.45 | ~$0.11 | $0 |
| Heavy session (~400K in / 60K out) | ~$7.00 | ~$3.50 | ~$0.82 | $0 |
| The catch | Now metered; double Opus | Slightly behind on gnarliest tasks | Self-host or trust Z.ai routing | Your GPU, your tok/s |
Honest take: The free window was the time to fall in love with Fable 5; the bill is the time to be disciplined. Run Opus 4.8 (or GLM-5.2 for cost-sensitive work) as your default backend and reach for Fable 5 only when a task has already defeated the cheaper model. Letting an agent loop on Fable 5 all day is the fastest way to a four-figure month.
What actually changed on June 23
From June 9 through June 22, 2026, anyone on a paid Claude plan — Pro, Max, Team, or seat-based Enterprise — could call Claude Fable 5 at no extra cost. That promotional window is over. As of June 23, Anthropic removed Fable 5 from those plan allowances. It still shows up in the model picker, but using it now draws down usage credits, and those credits are billed at the standard API rate: $10 per million input tokens and $50 per million output tokens.
The important nuance: credits are not some softer consumer rate. They meter at the exact per-token API price. So whether you hit Fable 5 through the raw API, through Claude Code, or through a usage-credit balance attached to your Pro subscription, the math is identical. The subscription buys you the cheaper models in-plan; Fable 5 is now incremental spend on top.
If you spent the last two weeks letting Fable 5 drive your agent and it felt free, that feeling ends today. The same workflow now has a meter on it.
The per-session math, with no hand-waving
“$10 per million tokens” means nothing until you turn it into the cost of one task you actually run. So here are two concrete scenarios, priced across every backend a developer would realistically consider in June 2026. Token rates below are the current public API prices for each model (verified June 23, 2026; sources at the end).
A light task is a focused request: fix a bug, write a function, add a test. In Cursor or Cline agent mode this realistically moves ~50,000 input tokens (the agent reads a few files and the conversation) and ~8,000 output tokens (the diff plus reasoning).
A heavy session is a multi-file refactor or a feature that takes the agent several turns — re-reading files, running tools, re-reading again. That cumulative traffic lands around ~400,000 input and ~60,000 output tokens once you count the full back-and-forth.
LIGHT TASK (50,000 input + 8,000 output)
Fable 5 50K×$10/M + 8K×$50/M = $0.50 + $0.40 = $0.90
Opus 4.8 50K×$5/M + 8K×$25/M = $0.25 + $0.20 = $0.45
GPT-5.5 50K×$5/M + 8K×$30/M = $0.25 + $0.24 = $0.49
GLM-5.2 50K×$1.40/M + 8K×$4.40/M = $0.07 + $0.04 = $0.11
OpenCode+Ollama $0.00
HEAVY SESSION (400,000 input + 60,000 output)
Fable 5 400K×$10/M + 60K×$50/M = $4.00 + $3.00 = $7.00
Opus 4.8 400K×$5/M + 60K×$25/M = $2.00 + $1.50 = $3.50
GPT-5.5 400K×$5/M + 60K×$30/M = $2.00 + $1.80 = $3.80
GLM-5.2 400K×$1.40/M + 60K×$4.40/M = $0.56 + $0.26 = $0.82
OpenCode+Ollama $0.00
Two things jump out. Fable 5 is the most expensive option in every row, and output is where it stings: at $50/M, the heavy session’s 60K output tokens cost $3.00, nearly as much as its 400K input tokens at $4.00. And the gap to Opus 4.8 is exactly 2×, because Opus sits at half Fable’s rate on both input and output. GLM-5.2 is in a different league on price: roughly an eighth of Fable on a light task, and under a dollar on the heavy session.
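If you want to rerun this arithmetic with your own token counts, here is a minimal Python sketch. The prices are copied from the figures above; the `PRICES` table and the `session_cost` helper are illustrative names, not part of any vendor SDK, and caching is ignored.

```python
# USD per million tokens as (input, output), copied from the article's figures.
PRICES = {
    "Fable 5": (10.00, 50.00),
    "Opus 4.8": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "GLM-5.2": (1.40, 4.40),
    "OpenCode+Ollama": (0.00, 0.00),
}


def session_cost(model, input_tokens, output_tokens):
    """Return the uncached cost in USD of one session on `model`."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Light task: 50K in / 8K out. Heavy session: 400K in / 60K out.
    for name in PRICES:
        light = session_cost(name, 50_000, 8_000)
        heavy = session_cost(name, 400_000, 60_000)
        print(f"{name:<16} light ${light:.2f}   heavy ${heavy:.2f}")
```

Swap in the token counts from your own agent logs; the two scenarios here are the article's estimates, not measurements.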
What that becomes per month
Per-task numbers are abstract until you multiply by a real workday. Say you run ten heavy sessions a day across twenty working days — 200 sessions a month. That is a believable load for someone who leans on an agent for most non-trivial changes.
| Backend | Per heavy session | 200 sessions / month |
|---|---|---|
| Claude Fable 5 | $7.00 | $1,400 |
| GPT-5.5 | $3.80 | $760 |
| Claude Opus 4.8 | $3.50 | $700 |
| GLM-5.2 (Z.ai API) | $0.82 | $164 |
| OpenCode + Ollama | $0.00 | $0 (+ electricity) |
That $1,400 figure is the one that matters now that the free window is gone. Before June 23, a power user could run Fable 5 flat-out inside a $20 Pro plan. Today the same behavior is a $1,400 line item. Even a moderate user doing three heavy sessions a day lands near $420/month on Fable 5 — well past the $20 Cursor Pro flat rate and the $100 GitHub Copilot Max tier.
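The monthly figures are just per-session cost times volume. A small sketch, assuming the article's 20 working days per month (`monthly_cost` is a hypothetical helper name), lets you plug in your own session count:

```python
def monthly_cost(cost_per_session, sessions_per_day, working_days=20):
    """Project a monthly bill in USD from a per-session cost and daily volume."""
    return cost_per_session * sessions_per_day * working_days


if __name__ == "__main__":
    # Heavy-session costs per backend, taken from the table above.
    heavy = {"Fable 5": 7.00, "GPT-5.5": 3.80, "Opus 4.8": 3.50, "GLM-5.2": 0.82}
    for name, cost in heavy.items():
        print(f"{name:<10} 10/day: ${monthly_cost(cost, 10):,.0f}   "
              f"3/day: ${monthly_cost(cost, 3):,.0f}")
```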
This is the same trap the GitHub Copilot token-billing change created earlier in June: the moment metered, output-priced agent usage replaces a flat subscription, heavy users see bills jump by an order of magnitude. Fable 5 going credit-only is that story repeated for Anthropic’s top model.
Where prompt caching changes the picture
The numbers above are raw, no caching. In real agentic loops, a large fraction of your input is the same context re-sent every turn — the system prompt, your rules file, the files already in scope. Anthropic, OpenAI, and Z.ai all discount cached input by about 90%.
For Fable 5, cached reads drop from $10/M to $1/M. On the heavy session above, if 300K of the 400K input tokens are cache hits, the input cost falls from $4.00 to roughly $1.30 (100K fresh at $10/M + 300K cached at $1/M), pulling the session from $7.00 down to about $4.30. Output is never cached, so caching cannot touch the $3.00 output cost, and that is the real reason Fable 5 stays expensive. You can cache your way out of input cost, never out of $50/M output.
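To see how the cache-hit share moves the total, here is a rough sketch of the same calculation. It assumes only the three Fable 5 rates quoted above and ignores any charge for writing to the cache, which this article does not price; `cached_session_cost` is an illustrative name.

```python
def cached_session_cost(input_tokens, output_tokens, cached_tokens,
                        price_in=10.00, price_cached=1.00, price_out=50.00):
    """Cost in USD when `cached_tokens` of the input are billed at the cached rate.

    Prices are USD per million tokens. Cache-write charges are not modeled.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * price_in
             + cached_tokens * price_cached
             + output_tokens * price_out)
    return total / 1_000_000


if __name__ == "__main__":
    # Heavy session (400K in / 60K out) at increasing cache-hit levels.
    for cached in (0, 100_000, 200_000, 300_000, 400_000):
        cost = cached_session_cost(400_000, 60_000, cached)
        print(f"{cached:>7} cached input tokens -> ${cost:.2f}")
```

Even with every input token cached, the session cannot drop below the $3.00 output cost plus the cached-read charge.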
Batch mode is the other lever: non-urgent jobs (bulk refactors, codemod-style passes, overnight test generation) run at $5/$25 per million on Fable 5, half the interactive rate. It is useless for a live agent — there is latency — but for fire-and-forget work it halves the bill.
The decision framework after June 22
Solo developer, cost-sensitive. Make Opus 4.8 your default in Cursor or Cline and you cut the bill in half versus Fable 5 for work most people cannot tell apart on a 40-line change. If you want to go further, point your editor at GLM-5.2 through an OpenAI-compatible endpoint: at $1.40/$4.40 it costs roughly an eighth of what Fable does per session, and on long-horizon coding benchmarks it trades blows with GPT-5.5. The setup is covered in GLM 5.2 as your Cursor and Cline backend.
Privacy-first or zero-marginal-cost. Run a local model behind OpenCode + Ollama. The per-session cost is genuinely $0 once the hardware is paid for; what you trade is tokens-per-second and peak quality. For the hardware reality of running a capable coding model locally, see runaihome’s best local AI models by VRAM.
Team lead picking a standard. A flat $20 Cursor Pro seat or a $100 Copilot Max seat is now dramatically cheaper than metered Fable 5 for anyone running agents all day — see the Copilot Max breakdown and Cursor vs Claude Code for where the flat plans win. Reserve Fable 5 (via credits) for the senior engineers tackling the genuinely hard refactors, and let the rest of the team run a flat-rate tool.
When Fable 5 is still worth it. The model earns its premium on the hard 20%: long-horizon, multi-file refactors where holding the whole dependency graph in context lands the change in one pass instead of three. If one Fable 5 session at $7 saves an hour of senior-engineer time untangling what a cheaper model botched, the math is trivially in its favor. The mistake is making it the default — letting an agent burn $50/M output on boilerplate it could have written on Opus or GLM for a fraction.
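One way to sanity-check the "default cheap, escalate when it fails" policy is to price it as an expected cost. The sketch below makes a simplifying assumption that is not from the article: every task runs on the cheaper model first, and a failed task is rerun from scratch on Fable 5 at the full session price, so the failed attempt is pure waste. Both function names are hypothetical.

```python
def expected_cost_with_escalation(cheap_cost, premium_cost, escalation_rate):
    """Average cost per task when a fraction of tasks is rerun on the premium model."""
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    # Every task pays the cheap run; escalated tasks also pay a premium run.
    return cheap_cost + escalation_rate * premium_cost


def breakeven_escalation_rate(cheap_cost, premium_cost):
    """Escalation rate at which the policy costs the same as always using premium."""
    return (premium_cost - cheap_cost) / premium_cost


if __name__ == "__main__":
    opus, fable = 3.50, 7.00  # heavy-session costs from the article
    avg = expected_cost_with_escalation(opus, fable, 0.20)
    print(f"20% escalated: ${avg:.2f} per session vs ${fable:.2f} always-Fable")
    print(f"break-even escalation rate: {breakeven_escalation_rate(opus, fable):.0%}")
```

Under these assumptions, escalating the hard 20% from Opus 4.8 averages $4.90 per heavy session, and the policy only loses to always-Fable if more than half of all tasks need escalation. Engineer time lost to failed first attempts is not modeled.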
A problem you’ll actually hit: the silent credit drain
The most common post-June-22 surprise is not a single big task — it is leaving Fable 5 selected as your default model and forgetting. Agent mode in Cursor and Cline will happily run dozens of small tasks a day, each one quietly metering at $10/$50 against your credit balance. Three weeks later you notice a $300 charge you cannot trace to any one session.
The fix is two minutes of config. In Cursor, set your default agent model to Opus 4.8 (or a cheaper backend) in Settings → Models, and only switch to Fable 5 deliberately for a specific hard task. In Cline, pin the model per-profile so your everyday profile never touches Fable. If your plan exposes a credit balance, set a hard spend cap — most consoles let you cap monthly usage credits, which turns a runaway loop into a hard stop instead of a surprise invoice.
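If you drive Fable 5 from your own scripts rather than an editor, the same hard-stop idea can be approximated client-side. This is a hypothetical sketch, not the API of any console, Cursor, or Cline: `SpendGuard` and `SpendCapExceeded` are made-up names, and the guard only sees usage you report to it.

```python
class SpendCapExceeded(RuntimeError):
    """Raised when a call would push tracked spend past the cap."""


class SpendGuard:
    """Tracks estimated spend in USD and refuses calls that would exceed a cap."""

    def __init__(self, monthly_cap_usd, price_in=10.00, price_out=50.00):
        self.cap = monthly_cap_usd
        self.price_in = price_in    # USD per million input tokens
        self.price_out = price_out  # USD per million output tokens
        self.spent = 0.0

    def charge(self, input_tokens, output_tokens):
        """Record one call's token usage; raise instead of exceeding the cap."""
        cost = (input_tokens * self.price_in
                + output_tokens * self.price_out) / 1_000_000
        if self.spent + cost > self.cap:
            raise SpendCapExceeded(
                f"cap ${self.cap:.2f} would be exceeded "
                f"(spent ${self.spent:.2f}, next call ${cost:.2f})"
            )
        self.spent += cost
        return cost


if __name__ == "__main__":
    guard = SpendGuard(monthly_cap_usd=2.00)
    try:
        while True:
            guard.charge(50_000, 8_000)  # one light task per iteration
    except SpendCapExceeded as stop:
        print(f"stopped: {stop}")
```

A real setup would persist `spent` across runs and reset it monthly; a provider-side cap, where available, remains the safer backstop because it also covers usage this script never sees.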
The behavior that felt fine for two free weeks is exactly the behavior that bills $1,400 now. Change the default, not just your intentions.
FAQ
Is Claude Fable 5 still free anywhere? No. The free-on-plan window ran June 9–22, 2026. As of June 23, Fable 5 usage on Pro, Max, Team, and Enterprise plans meters against usage credits at the API rate. The cheaper in-plan models (Opus 4.8, Sonnet 4.6) remain included.
How much does one Fable 5 coding task cost now? About $0.90 for a light task (~50K input + 8K output) and around $7 for a heavy multi-turn session (~400K input + 60K output), before prompt caching. Caching cuts the price of cached input tokens by about 90%, but output stays at $50/M.
Is Opus 4.8 good enough to replace Fable 5 for daily coding? For most work, yes. Opus 4.8 is half the price and close on everyday tasks. Fable 5’s lead shows up on the hardest long-horizon refactors. Default to Opus, escalate to Fable only when a task defeats it.
What’s the cheapest credible backend for Cursor or Cline? Among paid APIs, GLM-5.2 at $1.40/$4.40 per million is the strongest value, at roughly an eighth of Fable 5’s per-session cost. For zero marginal cost, a local model via OpenCode + Ollama is free after hardware, at the cost of speed and peak quality.
Does GitHub Copilot or Claude Code change the Fable 5 math? Flat-rate tiers (Cursor Pro $20, Copilot Max $100) are now far cheaper than metered Fable 5 for heavy agent users. Claude Code routes Fable 5 through the same per-token credits, so the cost is identical to direct API use.
Sources
- Introducing Claude Fable 5 and Claude Mythos 5 — Claude API Docs
- Claude API Pricing — Official
- Claude Fable 5 Pricing & Usage Credits Explained — claudefa.st
- Claude Fable 5 and Mythos 5: Pricing, API Costs, and Benchmark Comparison — Finout
- Claude Opus 4.8 Pricing 2026 — Finout
- OpenAI API Pricing 2026: GPT-5.5 and all models — AI Pricing Guru
- Z.ai GLM-5.2 API Pricing 2026 — AI Pricing Guru
- Z.ai’s open-weights GLM-5.2 beats GPT-5.5 on long-horizon coding for 1/6th the cost — VentureBeat
Last updated June 23, 2026. Pricing and features change frequently; verify current state before purchasing.