When your AI assistant starts billing by the thought

There is a moment in every technology adoption cycle when the vendor decides the honeymoon is over. The product is established, the users are hooked, the switching cost is real, and the capital that subsidized cheap access is no longer available in unlimited quantities. GitHub Copilot reached that moment in spring 2026, when GitHub announced that all Copilot plans would move to usage-based billing built around GitHub AI Credits. The important detail is not that the sticker price suddenly exploded. According to GitHub’s pricing page, the base monthly subscription prices remain in place for paid tiers. What changes is the cost model behind heavy usage.


That still changes the psychological contract. Developers who had internalized Copilot as a predictable flat-fee tool now have to think in terms of included credits, model multipliers, and incremental spend once those credits are exhausted. Across developer communities, the backlash centers on predictability as much as on price.

The math was always broken

The flat-rate AI subscription was always a convenient fiction. It let vendors acquire users at scale, let those users build habits and integrations, and postponed the uncomfortable conversation about what large language models actually cost to run. Every token sent to Copilot, every multi-step refactoring session, and every agent loop that called the model dozens of times to fix a type error had a real infrastructure cost behind it.

In 2026, that cost is no longer being hidden as aggressively. GitHub’s own explanation is straightforward: agentic workflows, longer sessions, and more capable models produce materially higher compute and inference costs. The published documentation on the billing transition and on models and pricing makes the new logic explicit. Usage now tracks token consumption rather than the simpler premium-request abstraction.
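To make the shift concrete, here is a minimal sketch of how a token-metered bill could be estimated under a model with included credits and per-model rates. Every name and number in it (`ModelRate`, `credits_per_1k_tokens`, the rates themselves) is a hypothetical placeholder, not GitHub's published pricing; substitute the figures from the vendor's pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    # Hypothetical unit price: credits charged per 1,000 tokens for one model.
    credits_per_1k_tokens: float


def credits_used(usage_tokens: dict[str, int], rates: dict[str, ModelRate]) -> float:
    """Total credits consumed across all models in a billing period."""
    return sum(
        tokens / 1000 * rates[model].credits_per_1k_tokens
        for model, tokens in usage_tokens.items()
    )


def overage_cost(used: float, included: float, price_per_credit: float) -> float:
    """Spend beyond the included credits; zero while usage stays within them."""
    return max(0.0, used - included) * price_per_credit


# Illustrative numbers only (assumed, not real rates).
rates = {"small": ModelRate(1.0), "large": ModelRate(4.0)}
usage = {"small": 200_000, "large": 50_000}

used = credits_used(usage, rates)
extra = overage_cost(used, included=300.0, price_per_credit=0.5)
print(f"credits used: {used}, overage: {extra}")
```

The point of the sketch is the shape of the function, not the values: the base subscription is a constant, and everything past the included allowance scales with tokens and with the choice of model.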

This is not a story unique to GitHub. Much of the SaaS-AI pricing experiment from 2023 to 2025 rested on the same unstable foundation: take a product with variable per-unit costs, package it as a fixed subscription, acquire users, and deal with the economics later. The cloud industry went through a similar transition years ago when teams learned to reason about API calls, egress, and execution time as billable units. AI tooling is now going through its equivalent adjustment.

In brief

Consumption-based pricing shifts cost variance beyond the included allowance to the buyer, while the vendor keeps a stable margin.
Power users, the ones generating the most value from the tool, now pay the most.
The main point of friction is whether spending controls and reporting are good enough to trust.
Agentic workflows are the multiplier: an agent loop gone wrong wastes time and increases spend.
Every AI tooling product launched in the flat-rate era is heading toward the same reckoning.

Predictability was the product

Usage-based billing is defensible for a variable-cost product. The harder question is whether the controls are granular, timely, and understandable enough for teams to manage the new reality without friction.

GitHub says those controls are coming with the transition. In the announcement and billing docs for individuals and organizations, it describes preview billing, pooled included usage, and admin budget controls at the enterprise, cost-center, and user levels. That matters. It means the conversation is no longer “there are no controls” but “will the controls be accurate and usable enough in real workflows?”

That distinction is important because predictability was part of the product people thought they had bought. A developer who can cap daily AI spend and see near-real-time usage will usually tolerate a variable model. A developer who opens an IDE or triggers an agentic workflow without a credible estimate of what the session may cost is dealing with an operational constraint that directly affects productivity. The agentic AI workflows that teams have been building over the past eighteen months make this sharper: an autonomous agent in a CI pipeline does not pause to ask whether it is over budget.
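Because an unattended agent will not check its own spend, teams can put a guard in front of it. The following is a minimal sketch under stated assumptions: `BudgetGuard`, `run_agent`, and the per-step cost estimates are all hypothetical, and `call_model` stands in for whatever client the pipeline actually uses. It is not a feature of Copilot or any vendor SDK.

```python
from typing import Callable, Iterable


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spend past the configured cap."""


class BudgetGuard:
    def __init__(self, cap_credits: float) -> None:
        self.cap = cap_credits
        self.spent = 0.0

    def charge(self, estimated_credits: float) -> None:
        # Refuse the charge up front instead of discovering the overrun later.
        if self.spent + estimated_credits > self.cap:
            raise BudgetExceeded(f"cap of {self.cap} credits would be exceeded")
        self.spent += estimated_credits


def run_agent(
    steps: Iterable[tuple[str, float]],
    guard: BudgetGuard,
    call_model: Callable[[str], str],
) -> list[str]:
    """Run (prompt, estimated_credits) steps until done or the budget runs out."""
    results: list[str] = []
    for prompt, estimated_credits in steps:
        try:
            guard.charge(estimated_credits)
        except BudgetExceeded:
            break  # stop the loop rather than silently overspending
        results.append(call_model(prompt))
    return results


# Usage with a stand-in model call (str.upper) and assumed step costs.
guard = BudgetGuard(cap_credits=10.0)
done = run_agent([("a", 4.0), ("b", 4.0), ("c", 4.0)], guard, str.upper)
print(done, guard.spent)
```

Charging before the call is a deliberate choice: a cap that is only checked after the fact is a report, not a control. The estimates will be imprecise in practice, so a real guard would also reconcile against the usage the vendor actually reports.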

The migration is real, but not simple

Competitors now have an obvious sales pitch. Tools such as Cursor and Codeium can frame pricing predictability as a differentiator, while projects such as Continue can appeal to teams that want more control over model routing and cost visibility. The same pressure is already visible in adjacent discussions about AI-native security tooling, as seen in my piece on what Claude Code Security really means. For individual developers, switching is often a simple decision.

For enterprise teams, it is considerably less simple. Switching means renegotiating integrations with internal systems, rebuilding review workflows, revisiting the security policies that were written specifically to cover Copilot, and then explaining to management why the tool they approved and budgeted last quarter is being replaced. The exit cost is largely organizational, and this is precisely the lock-in Microsoft was counting on. As a retention strategy, it is blunt but effective.

What makes this particularly interesting from a security and governance perspective is that billing unpredictability creates pressure to restrict usage, and restricting usage of a tool that has been deeply integrated into daily workflows creates its own class of problems. Teams that have used Copilot as part of their code review pipeline, integrated it into their SAST workflows, or built internal tools around it do not simply use it less when the budget tightens. They absorb the cost, find workarounds, or start routing requests through unofficial channels that sit outside the approved security perimeter. The third option is the one worth worrying about, especially for organizations that have not yet invested in forensic readiness for AI systems.

What this means for anyone signing a contract in 2026

If you are evaluating AI tooling right now, the Copilot transition is a preview. Notion AI, Google Workspace AI, Microsoft 365 Copilot in its broader form, and most productivity-AI products launched in the flat-rate era will face the same pressure. The unit economics of LLM inference do not support unlimited flat pricing at scale for advanced usage, so the only real question is timing and contract structure.

The practical response is to treat every AI tooling contract the same way a competent cloud architect treats an AWS agreement: insist on explicit spending controls, clear alerting thresholds, and renegotiation clauses from day one. The baseline asks for any consumption-based AI contract should include hard per-user and per-team caps, granular cost reporting by team or department, published unit pricing by model and capability tier, usage exports that finance and engineering can reconcile independently, and a renegotiation clause tied to market pricing on a defined interval.
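Independent reconciliation is only possible if the usage export can be processed outside the vendor's dashboard. As a rough sketch, assuming a CSV export with `user`, `team`, `model`, and `credits` columns (a hypothetical layout, not any vendor's documented format), finance or engineering could total spend per team and flag teams over their cap like this:

```python
import csv
import io
from collections import defaultdict


def spend_by_team(export_csv: str) -> dict[str, float]:
    """Sum the credits column per team from a CSV usage export."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(export_csv)):
        totals[row["team"]] += float(row["credits"])
    return dict(totals)


def teams_over_cap(totals: dict[str, float], caps: dict[str, float]) -> list[str]:
    """Teams whose spend exceeds their cap; teams with no cap are not flagged."""
    return sorted(
        team
        for team, spent in totals.items()
        if spent > caps.get(team, float("inf"))
    )


# Made-up sample rows in the assumed column layout.
sample = (
    "user,team,model,credits\n"
    "ana,platform,large,120.5\n"
    "bo,platform,small,30\n"
    "cy,data,large,40\n"
)

totals = spend_by_team(sample)
print(totals)
print(teams_over_cap(totals, {"platform": 100.0, "data": 50.0}))
```

If the per-team totals computed this way cannot be matched to the invoice, that gap is itself a finding worth raising during contract negotiation.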

None of these are standard in current enterprise agreements. All of them are negotiable, especially right now when every vendor is watching Copilot’s customer attrition numbers and trying to differentiate on terms rather than features.

The deeper issue is structural. A consumption model tends to charge the most engaged users the most money, which is rarely how a buyer wants a productivity tool to scale internally. If there is a durable lesson in the Copilot episode beyond the specifics of GitHub’s pricing page, it is this: invest in observability of AI consumption from day one, treat AI inference as infrastructure with real cost accounting, and resist long-term agreements with vendors whose unit economics are still being discovered in public.

Paying by the thought was always where this was going. The surprise is that anyone is surprised.

FAQ

Why did GitHub Copilot switch to consumption-based pricing?

GitHub says the shift reflects the higher compute cost of agentic workflows and aligns billing with actual token consumption, while base subscription prices remain unchanged.

What are the security risks of unpredictable AI billing?

Budget pressure leads teams to restrict official tool usage, which can push developers toward unofficial channels or self-hosted workarounds outside the approved security perimeter.

How should organizations negotiate AI tooling contracts in 2026?

Demand hard spending caps, granular cost reporting, clear pricing by model and capability tier, and renegotiation clauses tied to market pricing.