Fable 5 vs. Opus 4.8 – No Hype, Just Facts
Anthropic's Fable 5 costs double Opus 4.8 and arrives wrapped in hype. I put both to work head-to-head on the same workflow to see if the price is justified.
Anthropic has just released Fable 5, its “most powerful model” ever. Fable 5 is allegedly a Mythos model with more safeguards and guardrails around it, as it is “too dangerous to release” to the public.
Obviously, there’s now a lot of hype around Fable 5. Priced double as compared to Opus 4.8 I wanted to put them to work head-to-head to see if all the hype is justified. The outcome might surprise you!
Setting Up the Test
The Fable 5 release came in nicely as I was working on something on the side, mostly for myself, but got into a place where the initial “product planning” made with Claude didn’t really seem to find a lot of common ground with what’s feasible to implement.
Basically, I want to gather a lot of different Claude Code usage data from the local machine and create an extensive dashboard app that allows me to understand my usage patterns, real-time and historical estimated prices and a lot of other data you can aggregate from Claude Code session logs.
The first step for me was to create two different worktrees where I could put Fable 5 and Opus 4.8 to work in isolation. The entire harness was the same as both had access to the same amount of information, context, MCP servers, skills, Claude Code rules and so on.
I then created a prompt that I purposely kept vague enough, though the before mentioned setup would offer a lot of context. In the first prompt sentence I established a plan file as the entry path to the entire process and provided my frustration: Claude created a product mock that didn’t have a lot in common with what’s de facto available in Claude Code sessions.
I then defined a goal (/goal) to map out all the parts from the mocked UI and map them to what can realistically be achieved using Claude Code session files as the sole source of data. I then prompted to create a workflow that would perform all the needed investigation, analysis, research and validate it through a product and technical lens.
Finally, I asked to dump the final synthesis in an .md file.
Fable 5 vs Opus 4.8: Ways of Working
I had both running in two different terminal windows, side-by-side, and one thing I did was to observe their ways of working. To be honest, I didn’t see a lot of differences in the way they approached the instructions.
Both Fable 5 started by reading different files in the repository and comparing the outputs both looked at the same .md files, loaded the same skills and rules and tried to get an initial “understanding” of the entire setup.
The main difference was that Fable 5 just went directly to it, while Opus 4.8 came back with some additional questions using the multiple questions tool. Usually, I’d consider this a bonus point for Opus 4.8, but honestly the questions showed me that the initial understanding was probably not that great, as the questions literally had the answers in the files or skills that both loaded.
In a second phase both kicked in a workflow spawning different agents to perform different tasks. Here there were some major differences.
Fable 5 created a more elaborate sub-agent setup. It had agents individually looking at just one UI component to map out against Claude Code session information. Then it had agents analyzing the findings and creating working theories. For each working theory it used 3 more different agents for adversarial review from a product, technical and general lens. In the end it had one agent synthesizing all the information. In total, Fable 5 used 33 different agents in the workflow.
On the other hand, Opus 4.8 created a much more frugal setup. To be honest I didn’t exactly understand how sub-agents were used, but the hard fact is that it used a total of just 13 agents to perform the work. There were some agents used for adversarial review, but just 1 agent per review that seemed to tackle all the different lenses. Also, based on the final number of used agents, I think some of the sub-agents were re-used for at least 2 different UI components that were under analysis.
The question is: did the more sophisticated Fable 5 setup translate into a qualitatively superior output? I’ll come to this soon!
Speed and Resource Consumption
Speed of execution is the part where hype started to fade away. Fable 5 completed the workflow in 13min:42seconds. Opus 4.8 completed the workflow in 15min:23seconds. Obviously, Fable 5 was faster but one of the main points advertised was its speed. And the difference to Opus 4.8 was minimal.
True, technically Fable 5 performed a lot more work in the virtually same amount of time, but what counts for me in the end is how long it takes to get the job done. It’s the difference between focused work and busy work.
Resource consumption is the main highlight, though. Fable 5 consumed 3.2M tokens while Opus 4.8 consumed 1.1M tokens for the same workflow. If we think in terms of pricing and to a rough estimation considering all the tokens as output tokens, Fable 5 generated a cost of $160, while Opus 4.8 generated a cost of $27.50. It’s a difference of $132, which in my opinion is quite substantial.
But prices by themselves don’t tell the real story as it’s more about the ROI. Obviously in my case we can’t technically calculate an ROI, but what we can definitely do is analyze the quality of the outputs and understand if the difference in quality could justify the cost difference.
Output quality
This is where things became interesting.
Given the difference in resource consumption, I expected Fable 5 to uncover inaccuracies in the Opus 4.8 report. I expected it to challenge assumptions, invalidate findings or identify blind spots that the smaller investigation simply failed to see.
That didn’t happen. After reading both reports side by side, I found that they converged on most of the important conclusions. Both identified the same strengths in the mockup. Both identified the same genuinely difficult problems. Both arrived at very similar conclusions regarding what data is available, what can be computed and where the hard limitations of Claude Code’s local data model begin.
The differences appeared mostly in how they reacted to uncertainty.
Opus 4.8 took a more conservative approach. Whenever a metric could not be derived with high confidence, its tendency was to simplify the design or remove the feature entirely.
Fable 5 approached the same situations differently. Rather than discarding features, it spent more effort looking for grounded alternatives that could preserve the original intent while remaining honest about the limitations of the data.
That said, the quality difference was nowhere near proportional to the cost difference. Fable 5 consumed 3.2 million tokens. Opus 4.8 consumed 1.1 million. Yet both arrived at broadly the same destination.
The additional tokens bought a more elaborate investigation process, more validation and, in some cases, better product thinking. They did not uncover a substantially different reality.
Closing Thoughts
My conclusion is that Fable 5 is a better model than Opus 4.8. The gap, however, feels much smaller than the current hype would suggest.
Could Fable 5 perform better on significantly larger and more complex workflows? Quite possibly. Looking at how it structured the investigation, it seems optimized for exactly those kinds of tasks. The challenge is that the economics scale alongside the complexity.
This investigation was not a trivial prompt, but neither was it an exceptionally difficult engineering problem. Yet Fable 5 consumed the equivalent of roughly $160 worth of tokens to complete it. Extrapolating that cost profile to large autonomous workflows, long-running automations or enterprise-scale agent systems raises some difficult questions around sustainability and ROI.
What stood out to me is that both models ultimately arrived at broadly the same conclusions. Fable 5 invested substantially more effort in getting there. The additional effort produced a better result, a more thorough investigation and, in some areas, stronger product thinking. It did not produce a radically different understanding of reality.
For now, that feels like the more useful lens through which to evaluate Fable 5. Not as a breakthrough that leaves previous models behind, but as a meaningful incremental improvement whose cost grows considerably faster than the quality gains it delivers.
FAQ
Is Fable 5 better than Opus 4.8?
In this test, yes, but only marginally. Fable 5 ran a more elaborate investigation and showed slightly stronger product thinking, yet both models converged on broadly the same conclusions. The gap felt much smaller than the current hype suggests.
How much more expensive is Fable 5 than Opus 4.8?
On the same workflow, Fable 5 consumed 3.2M tokens for a rough cost of $160, while Opus 4.8 consumed 1.1M tokens for about $27.50. That is a difference of roughly $132 for the same task.
Is Fable 5 faster than Opus 4.8?
Slightly. Fable 5 completed the workflow in 13min:42seconds and Opus 4.8 in 15min:23seconds. Fable 5 did more work in nearly the same time, but the wall-clock difference for finishing the job was minimal.
When is Fable 5 worth the higher cost?
Possibly on significantly larger and more complex workflows, where its more structured investigation may pay off. For a medium-complexity task like this one, the quality gain did not scale anywhere near as fast as the cost.