Overview
- Microsoft has begun canceling most employee Claude Code licenses and is redirecting engineers to GitHub Copilot CLI after internal usage drove sharply higher costs.
- Unsupervised, multi-file agent sessions can burn very large numbers of tokens — one reported 30-minute session read hundreds of thousands of tokens — which quickly inflates per-token bills.
- Uber reported that heavy employee use exhausted its 2026 AI coding budget in months, showing the cost issue is not unique to Microsoft.
- Companies and analysts are responding with billing controls, human-in-the-loop review and memory-first designs to cut repeated context re-reads and curb token consumption.
- The shift highlights a wider industry tension: per-token pricing rewards high token use, which favors well-capitalized firms and may make current agentic coding models unaffordable for most developers.