You Don't Need the Latest Model. You Need Better Tools.
OpenCode data shows DeepSeek alone taking 58% of real coding usage, with every model on the board Chinese and the average session costing $0.05. The frontier model race is a distraction: workflow, tooling, and being intentional with your agent matter way more than which model you pick.
I keep seeing the same conversation online. "Have you tried the new Opus?" "Gemini 3.5 is unbelievable." "You need the $200/mo plan or you're leaving money on the table."
Then I look at the actual usage data from OpenCode — a terminal-based coding agent platform — and the picture is completely different.
Market share by model author over the last eight weeks:
- DeepSeek — 58.4%
- Moonshot — 24.6%
- Qwen — 5.9%
- Zhipu — 5.5%
- MiniMax — 3.5%
- Xiaomi — 2.1%
Every single one is Chinese. Every single one is cheap. Average session cost: $0.05. Token cost per million: $0.14 input, $0.28 output. Cache ratio: 97%. Tokens per session: 4.8 million.
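Those four numbers can be cross-checked against each other with a little arithmetic. This is a rough sketch, not OpenCode's billing logic. It assumes every token is billed as input and that the 97% cache ratio means 97% of tokens are cache hits; both simplifications are mine, not something the data states.

```python
# Figures quoted above.
TOKENS_PER_SESSION = 4_800_000
INPUT_PRICE_PER_M = 0.14   # USD per 1M uncached input tokens
CACHE_RATIO = 0.97         # assumption: share of tokens served from cache
SESSION_COST = 0.05        # USD, average per session


def uncached_session_cost(tokens: int, price_per_m: float) -> float:
    """Cost of a session if nothing were cached (all tokens at the input rate)."""
    return tokens / 1_000_000 * price_per_m


def effective_price_per_m(session_cost: float, tokens: int) -> float:
    """Blended price per 1M tokens implied by the average session cost."""
    return session_cost / (tokens / 1_000_000)


def implied_cached_price_per_m(
    session_cost: float, tokens: int, price_per_m: float, cache_ratio: float
) -> float:
    """Price per 1M cached tokens that would make the quoted numbers add up."""
    millions = tokens / 1_000_000
    uncached_cost = millions * (1 - cache_ratio) * price_per_m
    cached_millions = millions * cache_ratio
    return (session_cost - uncached_cost) / cached_millions


no_cache = uncached_session_cost(TOKENS_PER_SESSION, INPUT_PRICE_PER_M)
blended = effective_price_per_m(SESSION_COST, TOKENS_PER_SESSION)
cached = implied_cached_price_per_m(
    SESSION_COST, TOKENS_PER_SESSION, INPUT_PRICE_PER_M, CACHE_RATIO
)

print(f"Session cost with no caching: ${no_cache:.2f}")
print(f"Blended price per 1M tokens:  ${blended:.4f}")
print(f"Implied cached price per 1M:  ${cached:.4f}")
```

Under those assumptions the same session would cost roughly $0.67 with no caching at all, so the nickel figure only works because almost everything is a cache hit. The cache ratio is doing as much work as the list price.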
The models everyone on Twitter is hyping? Barely on the board. Real developers, doing real work, voting with their wallets.
This isn't a "China vs the world" thing. It's a cost-to-value thing. Frontier models are expensive — $2–$15 per session depending on what you're doing. But the vast majority of coding work doesn't need frontier intelligence. Writing a test, refactoring a function, debugging a type error — these are handled perfectly well by models that cost a nickel per session. The only thing holding them back was the tooling.
The Harness Is the Unlock
I've been using Pi — a minimal terminal harness for coding agents. You bring your own models, tools, and workflows. It supports fifteen-plus providers, so you can switch between a cheap model for daily work and a frontier model for the hard stuff. The harness doesn't care.
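To make the "cheap model for daily work, frontier model for the hard stuff" idea concrete, here is a minimal routing sketch. It is not Pi's configuration format or API. The provider and model labels, the task kinds, and the escalation threshold are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    provider: str  # hypothetical provider label
    model: str     # hypothetical model label


# Placeholder labels, not real model identifiers.
CHEAP = ModelChoice("cheap-provider", "cheap-model")
FRONTIER = ModelChoice("frontier-provider", "frontier-model")

# Hypothetical task kinds you might decide are worth frontier pricing.
HARD_TASKS = {"architecture_design", "cross_service_migration"}


def pick_model(
    task_kind: str, failed_attempts: int = 0, max_cheap_attempts: int = 2
) -> ModelChoice:
    """Default to the cheap model; escalate for hard tasks or repeated failures."""
    if task_kind in HARD_TASKS:
        return FRONTIER
    if failed_attempts >= max_cheap_attempts:
        return FRONTIER
    return CHEAP


print(pick_model("write_test"))
print(pick_model("architecture_design"))
print(pick_model("write_test", failed_attempts=2))
```

The design choice worth noticing is the default: the cheap model handles everything unless there is an explicit reason to escalate, which is the opposite of starting on the most expensive model and never checking the bill.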
People use Claude Code and OpenCode for the same reason: they want the agent to actually touch the code. Not suggest it. Edit it, run it, fix it. That's the real shift — terminal-native agents that operate on your project instead of just chatting about it.
But the deep truth about why people default to the most expensive tools is simple: it's the path of least resistance. Claude Code ships as a product. Pi is a harness you configure. One is an on-ramp, the other is a workshop. Most people take the on-ramp because it's there, not because they've calculated the cost.
The Insane Economics of AI Subscriptions
There's a question nobody wants to ask: why are we spending hundreds or thousands of dollars a month on AI tools to write code for projects that don't even have any users yet?
Let's look at the actual numbers.
Subscriptions:
- Claude Pro: $20/mo. Max: $100/mo.
- ChatGPT Plus: $20/mo. Pro: $100/mo.
- GitHub Copilot Pro: $10/mo. Pro+: $39/mo. Max: $100/mo.
- Cursor Pro: $20/mo. Ultra: $60/mo.
Stack two or three of the top tiers and you're at $200–$300/mo easily. Before you've written a single line of code. For a project with five GitHub stars, zero active users, no revenue.
Frontier API costs:
- GPT-5.5: $5.00 input / $30.00 output per 1M tokens.
- GPT-5.4: $2.50 input / $15.00 output per 1M tokens.
- Claude Opus class models: $5–$15 input / $25–$75 output per 1M tokens.
And what OpenCode data shows:
- Average session cost with Chinese models: $0.05.
- Token cost: $0.14 input / $0.28 output per 1M tokens.
- Tokens per session: 4.8 million.
You can run a hundred sessions on cheap models for the cost of a single frontier session. And the Chinese models are capturing 100% of real usage on the platform. Not because developers are sentimental about them. Because they work.
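The ratios are easy to work out from the figures in this post. A small sketch, using only numbers quoted here: the $0.05 average session, the $2–$15 frontier session range, a $100/mo subscription, and the per-token list prices. The helper names are mine, and the money math is done in whole cents to avoid floating-point surprises.

```python
def sessions_for(budget_usd: float, session_cost_usd: float) -> int:
    """How many whole sessions a budget buys, computed in integer cents."""
    return round(budget_usd * 100) // round(session_cost_usd * 100)


def price_ratio(expensive_per_m: float, cheap_per_m: float) -> float:
    """How many times more one per-token price is than another."""
    return expensive_per_m / cheap_per_m


CHEAP_SESSION = 0.05  # USD, average session cost quoted above

# One frontier session at the low and high end of the $2-$15 range.
print(sessions_for(2, CHEAP_SESSION), "cheap sessions per $2 frontier session")
print(sessions_for(15, CHEAP_SESSION), "cheap sessions per $15 frontier session")

# A $100/mo subscription spent on cheap sessions instead.
print(sessions_for(100, CHEAP_SESSION), "cheap sessions per $100/mo")

# List-price gap between GPT-5.5 and the cheap token prices above.
print(f"input:  {price_ratio(5.00, 0.14):.1f}x")
print(f"output: {price_ratio(30.00, 0.28):.1f}x")
```

So "a hundred" sits in the middle of the range: one frontier session buys somewhere between 40 and 300 cheap ones, and a single $100/mo subscription is worth about 2,000 of them.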
And what are you getting for that $100/mo subscription? A copy-paste workflow from a browser tab. Maybe some agentic features if you use the premium tiers. But the economics make no sense. You're burning capital that could fund a dozen cheaper models running in a proper harness.
Even worse — the more you spend, the less you learn. I see people generating thousands of lines of code they don't understand, shipping features they couldn't explain, building systems they couldn't debug. They're paying a premium to stay incompetent. The expensive model writes the code, the developer approves it blindly, and when something breaks, they have no idea where to start looking.
You know what forces you to understand your code? Reading it. Running it. Debugging it. A $200/mo model generating code you skim and accept doesn't make you productive — it makes you a manager of an intern you can't fire who writes code you can't read.
The people doing this right aren't the ones spending the most. They're the ones who pick a cheap model, configure a harness with real tools, and treat the agent like a junior developer they actually supervise. The cost is incidental. The workflow is the point.
The Real Problem With How People Use Agents
Most junior developers use agents. A lot. But there's a pattern I see over and over: they treat the agent like a magic wand.
"Build me a website."
"Make me a fullstack app."
"Write a complete ..."
One-shot prompts, no precise context, massive scope, long context windows. The agent spits out hundreds of lines. Half of it doesn't compile. The junior dev, aka the true vibe coder, gets frustrated and blames the model. They try a different model. Same result. They conclude "agents don't work."
Bro... Are you serious?
The developers who get real value from agents aren't the ones asking for the moon. They're the ones who are intentional. They say:
- "I wrote this function that takes these parameters. Fix this section's performance."
- "Plug this endpoint into that service."
- "Refactor this component to use the new API shape."
Surgical, specific, contextual. The agent isn't being asked to build something from scratch — it's being asked to do a job within an existing codebase that it can read and understand.
The best way to use an agent is not to ask it to build you a house. It's to hand it a blueprint, point at a specific wall, and say "move this window three feet to the left."
Most people aren't intentional with their agents. They don't give them enough context. They don't frame the task narrowly enough. They don't let the agent read the surrounding code before asking it to make a change. They treat it like a search engine that generates code instead of like a teammate that needs a clear brief.
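One way to force yourself into that habit is to write the brief down as a structure before you send it. The sketch below is just that: a hypothetical `TaskBrief` shape I'm using for illustration, with made-up file and function names, not a format any particular agent requires.

```python
from dataclasses import dataclass, field


@dataclass
class TaskBrief:
    goal: str                 # one narrow outcome, not "build me an app"
    target_file: str          # where the change belongs
    target_symbol: str        # the function or component to touch
    context_files: list[str] = field(default_factory=list)  # read before editing
    constraints: list[str] = field(default_factory=list)    # what must not change

    def to_prompt(self) -> str:
        """Render the brief as a plain-text prompt for the agent."""
        lines = [
            f"Task: {self.goal}",
            f"Target: {self.target_symbol} in {self.target_file}",
        ]
        if self.context_files:
            lines.append("Read these files first:")
            lines.extend(f"- {path}" for path in self.context_files)
        if self.constraints:
            lines.append("Constraints:")
            lines.extend(f"- {rule}" for rule in self.constraints)
        return "\n".join(lines)


# Hypothetical example: every path and name here is invented.
brief = TaskBrief(
    goal="Refactor this component to use the new API shape",
    target_file="src/components/UserCard.tsx",
    target_symbol="UserCard",
    context_files=["src/api/client.ts", "src/api/types.ts"],
    constraints=["Keep the public props unchanged", "Existing tests must still pass"],
)
print(brief.to_prompt())
```

If you can't fill in the target and the context files, the task probably isn't scoped tightly enough to hand off yet.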
The Models Are Already Good Enough
The OpenCode data proves it. DeepSeek v4 Flash — the top model by a landslide — is a fraction of the cost of the frontier alternatives. It's definitely not the most capable model on the bench. But it doesn't need to be, because the workflow around it does the heavy lifting.
The agent has access to the project. It reads the context, runs the tests, iterates on failures. The model just needs to be good enough to write code that passes those tests. And at $0.05/session, it's good enough to do that all day.
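That read, run, iterate cycle is simple enough to sketch. This is not how OpenCode, Pi, or any specific agent implements it. `propose_patch` and `apply_and_test` are hypothetical stand-ins for a model call and a test run, and the toy versions below exist only so the loop executes without a real model or test suite.

```python
from typing import Callable, Optional


def iterate_until_green(
    propose_patch: Callable[[str], str],
    apply_and_test: Callable[[str], Optional[str]],
    task: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Return the first patch whose tests pass, or None if the rounds run out.

    propose_patch: stand-in for a model call; receives the task plus feedback.
    apply_and_test: applies a patch and runs the tests; returns None on
                    success, or the failure output as a string.
    """
    feedback = ""
    for _ in range(max_rounds):
        prompt = task
        if feedback:
            # Feed the failure back so the next attempt can react to it.
            prompt = f"{task}\n\nPrevious attempt failed:\n{feedback}"
        patch = propose_patch(prompt)
        failure = apply_and_test(patch)
        if failure is None:
            return patch
        feedback = failure
    return None


# Toy stand-ins: the "model" gets it wrong once, then fixes it after feedback.
def fake_model(prompt: str) -> str:
    return "return a + b" if "failed" in prompt else "return a - b"


def fake_tests(patch: str) -> Optional[str]:
    return None if patch == "return a + b" else "AssertionError: add(2, 3) != 5"


result = iterate_until_green(fake_model, fake_tests, "Implement add(a, b)")
print(result)
```

The tests act as the judge, so the model only has to be good enough to converge within a few rounds. At a nickel a session, a couple of failed rounds cost almost nothing.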
Supervised Collaboration
The real paradigm, as DHH put it, is supervised collaboration. The agent does the grunt work. You review the output, guide the direction, make the calls. This only works when the agent has real tools — terminal access, file editing, build execution. And it works best when you give it a specific job, not a vague ambition.
The model race is a distraction. The data is clear: real developers don't use the most expensive models. They use the ones that are cheap enough to run constantly and good enough to get the job done. What separates a productive agent session from a frustrating one isn't the model. It's whether you asked the agent to build something vague, or handed it a specific problem to solve.