Meta ties AI token usage to performance reviews, sparking 'tokenmaxxing'

Serge Bulaev

Meta is tying how much employees use AI tokens to their performance reviews, which appears to be changing how people work there. Some workers began using more tokens to get noticed in company dashboards, a practice called 'tokenmaxxing.' Meta responded by introducing tools to watch token use, setting budgets, and saying there may be team limits by 2027. Employees may have felt mixed signals, as using more tokens was praised before but is now being watched more closely. In the future, Meta suggests it will focus on the results from using AI, not just how many tokens are used, and may adjust costs based on when tokens are used.

Meta is grappling with the unintended consequences of tying AI token usage to employee performance reviews, a policy that sparked a costly internal practice known as 'tokenmaxxing.' The move, intended to boost AI adoption, instead created an arms race among engineers to consume the most tokens, reshaping daily behavior inside the social media giant. In a recent 30-day period, employees burned through a reported 60.2 trillion tokens, a figure first highlighted by The Information and later noted in coverage from AI Weekly. At public rates, this usage could translate to a staggering $900 million bill, though Meta's actual costs are lower due to enterprise discounts.
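As a rough sanity check on those two figures, the sketch below derives the blended price per million tokens that a $900 million bill would imply. The only inputs are the numbers quoted above; the resulting rate is an inference from them, not a published price, and the function names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted in the article.
TOKENS_30_DAYS = 60.2e12      # 60.2 trillion tokens (reported)
ESTIMATED_BILL_USD = 900e6    # roughly $900 million at public rates (reported)


def implied_rate_per_million(tokens: float, bill_usd: float) -> float:
    """Return the blended price per one million tokens implied by a bill."""
    return bill_usd / (tokens / 1e6)


def bill_at_rate(tokens: float, usd_per_million: float) -> float:
    """Return the cost of `tokens` at a flat price per million tokens."""
    return tokens / 1e6 * usd_per_million


rate = implied_rate_per_million(TOKENS_30_DAYS, ESTIMATED_BILL_USD)
print(f"Implied blended rate: ${rate:.2f} per million tokens")
# prints "Implied blended rate: $14.95 per million tokens"
```

In other words, the $900 million estimate corresponds to a blended rate of about $15 per million tokens. Any enterprise discount scales the bill down proportionally.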

The Rise of 'Tokenmaxxing' and Its Staggering Cost

The phenomenon began when Meta's Performance Summary Cycle (PSC) linked numerical token counts to employee ratings. Engineers quickly learned that higher consumption led to better visibility on internal dashboards, effectively gamifying AI usage. This competition, internally dubbed "tokenmaxxing," prioritized token volume over the actual value or efficiency of the AI-driven output.

Because review scores rewarded consumption, engineers had a direct incentive to climb the internal leaderboards. The result was an unsustainable surge in operational costs that prompted an abrupt policy change.

Meta's Pivot: From Bragging Rights to Budget Alarms

Reacting to alarming cost projections, Meta initiated a significant course correction. The company is rolling out an internal control plane named AI Gateway to approximately 6,000 developers, designed to monitor prompts, assign token budgets, and issue real-time alerts for consumption spikes. According to industry reports, this new strategy signals a policy shift toward more controlled token usage, with per-team caps expected in the coming years. The guidance also steers engineers away from expensive third-party models toward MetaCode, the company's in-house AI assistant, to control both spending and data governance.
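To make the idea of token budgets and spike alerts concrete, here is a minimal sketch of how such a control could work. It is not Meta's AI Gateway, whose internals have not been published. The class name, the spike threshold, and the rejection behavior are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TeamBudget:
    """Hypothetical per-team token budget with a simple spike alert."""

    monthly_limit: int            # tokens allowed per month (assumed unit)
    spike_factor: float = 3.0     # assumed: alert when a request exceeds 3x the average
    used: int = 0
    requests: int = 0
    alerts: list = field(default_factory=list)

    def record(self, tokens: int) -> bool:
        """Record one request; return False if it would exceed the budget."""
        if self.used + tokens > self.monthly_limit:
            self.alerts.append(f"budget exceeded: {tokens} tokens rejected")
            return False
        # Compare against the average of earlier requests before updating it.
        if self.requests > 0:
            average = self.used / self.requests
            if tokens > self.spike_factor * average:
                self.alerts.append(f"spike: {tokens} tokens vs average {average:.0f}")
        self.used += tokens
        self.requests += 1
        return True


budget = TeamBudget(monthly_limit=10_000)
for request_tokens in (1_000, 1_200, 5_000, 4_000):
    budget.record(request_tokens)
print(budget.used, budget.alerts)
```

In this run the 5,000-token request triggers a spike alert because it is more than three times the running average of 1,100, and the final 4,000-token request is rejected because it would push usage past the 10,000-token limit. A production system might warn rather than reject, which is a policy choice the reports do not describe.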

Navigating Conflicting Signals and the Future of PSC

The abrupt policy shift reportedly created whiplash within the engineering culture. Teams that were previously encouraged to use verbose, chain-of-thought prompts found similar practices later scrutinized. This has developers joking on Slack about "prompt inflation" - crafting unnecessarily wordy inputs to pad counts - while leaders worry that hard caps could stifle innovation.

Looking ahead, Meta plans to evolve its evaluation metrics. A company executive quoted by Business Insider confirmed that future reviews will not use individual AI usage scores. Instead, the focus will shift to "AI-driven impact," prioritizing outcomes per token rather than raw volume. The AI Gateway team is also exploring dynamic pricing to make tokens consumed during peak hours weigh more heavily against budgets, framing the new controls as essential risk management.
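The two ideas in that paragraph, peak-hour weighting and outcomes per token, can be sketched in a few lines. Everything here is hypothetical: the peak window, the 1.5x multiplier, and the notion of a countable "outcome" (for example, a shipped change) are assumptions, since no actual formula has been reported.

```python
PEAK_HOURS = range(9, 18)   # assumed peak window: 09:00 to 17:59 local time
PEAK_MULTIPLIER = 1.5       # assumed weight; no real figure has been reported


def budget_charge(tokens: int, hour: int) -> float:
    """Return the budget units a request costs at a given hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    weight = PEAK_MULTIPLIER if hour in PEAK_HOURS else 1.0
    return tokens * weight


def outcomes_per_million_tokens(outcomes: int, tokens: int) -> float:
    """Return outcomes delivered per million tokens consumed."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return outcomes / (tokens / 1e6)


print(budget_charge(1_000, hour=10))               # peak request
print(budget_charge(1_000, hour=22))               # off-peak request
print(outcomes_per_million_tokens(12, 4_000_000))  # 12 outcomes from 4M tokens
```

Under these assumptions, the same 1,000-token request draws 1,500 units from the budget at 10:00 and 1,000 units at 22:00. A team that delivers 12 outcomes from 4 million tokens scores 3.0 per million, regardless of how it ranks on raw volume.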


Why did Meta start tying AI token usage to its Performance Summary Cycle (PSC)?

Based on available reports, the goal was to encourage AI adoption and measure productivity, but the metric quickly became a competitive scoreboard. Employees discovered that higher token counts often translated into better review scores, so they optimized prompts and workflows to maximize token consumption rather than product value. Meta has since said that performance expectations will center on 'AI-driven impact' rather than individual AI usage metrics.

What does "tokenmaxxing" look like inside Meta?

Teams created internal leaderboards that publicly ranked engineers by monthly token totals. A viral dashboard once let coworkers compete head-to-head until it was shut down. The behavior ballooned to an estimated 60.2 trillion tokens consumed in a 30-day window - enough to imply a bill of roughly $900 million at public rates, and a substantial one even after Meta's large-volume discounts.

How expensive is tokenmaxxing?

At public-market rates for frontier models, those 60.2 trillion tokens could cost roughly $900 million, though the exact figure depends on the specific models and pricing structures used. Meta pays considerably less due to volume discounts, cached-input savings, and partner-hosted inference arrangements, yet the absolute spend still triggered internal concerns about escalating AI costs.

What corrective actions has Meta taken?

Meta is now shifting toward more controlled token usage:
- AI Gateway dashboard tracks usage in real time and triggers alerts for anomalous spend
- Token budgets and quotas are being implemented
- Future reviews will focus on AI-driven impact rather than raw token volume

Could tighter token caps stifle innovation?

Meta faces a classic trade-off:
- Upside: Cost controls free budget for additional experiments and tighter model selection
- Risk: Strict caps make engineers reluctant to run exploratory prompts, multi-step reasoning, or large-context jobs that often yield the biggest breakthroughs

Internal guidance now encourages outcomes per token instead of raw volume, nudging teams toward leaner prompts and smarter orchestration rather than blanket token cuts.