Meta, AT&T throttle AI use as costs hit corporate balance sheets

Serge Bulaev

Companies like Meta and AT&T are slowing down their use of AI because costs for things like hardware, cloud, and AI specialists are rising faster than expected. Experts say businesses may need to manage AI spending the same careful way they handle other changing costs, like fuel prices. Some companies appear to be saving money by using smaller models for easier tasks and tracking spending closely. There may also be new issues with hiring and regulations, as the demand for AI talent outpaces supply and legal questions remain. Early signs suggest that businesses with clear planning and controls might be better at keeping AI costs under control while still growing.

As industry giants like Meta and AT&T throttle AI use, their experience carries a lesson for finance teams: AI costs are hitting corporate balance sheets with the same force as rent or power. Expenses for language model traffic, GPU leases, and specialized talent are arriving faster than forecast, forcing executives to impose usage throttles to protect growth plans. Businesses now need to forecast AI spending with the same discipline they apply to volatile costs like fuel or foreign exchange.

What companies are spending and why it matters

Enterprises are throttling AI adoption because costs for hardware, cloud infrastructure, and specialized talent are rising unexpectedly fast. To manage this financial pressure, firms are implementing stricter budget controls, optimizing model usage, and re-evaluating the return on investment for all artificial intelligence initiatives.

The financial impact stems from a compounding effect where hardware, cloud, and personnel costs rise in tandem. To combat intense poaching pressure for top specialists, Forge Global notes that many private AI firms now offer equity liquidity through secondary sales or tender offers, a retention strategy often more effective than simple salary increases.

Tactical playbook for limiting run-rate exposure

A consensus is emerging from enterprise cost studies: AI spending must be managed as a dynamic portfolio, not a static line item. Nearly every benchmark highlights the same key levers for control:

  • Centralize model access on a standard platform governed by an AI center of excellence. Industry reports suggest significant savings from platform standardization.
  • Apply FinOps controls: track spend by model, set hard caps, and alert teams in real time when token or GPU usage spikes.
  • Route requests to the cheapest adequate model. Smaller or fine-tuned models handle classification or search; large models stay reserved for complex synthesis.
  • Cache prompts and responses to cut duplicate token calls. Analysts say context compression alone can trim inference bills by double-digit percentages.
  • Split budgets into Run, Build, Scale, and Experiment buckets so pilots do not mask production burn.
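The routing and caching levers above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: the tier names, prices, complexity scores, and the `call_model` callable are all hypothetical placeholders, and the cache only catches exact repeats of the same prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical tier name
    cost_per_1k_tokens: float  # hypothetical price in USD
    max_complexity: int        # highest task complexity this tier handles adequately


# Hypothetical tiers, cheapest to most expensive.
TIERS = [
    ModelTier("small-finetuned", 0.0005, 1),
    ModelTier("mid-general", 0.003, 2),
    ModelTier("frontier", 0.03, 3),
]

# Hypothetical mapping from task type to required complexity.
TASK_COMPLEXITY = {"classification": 1, "search": 1, "summarization": 2, "synthesis": 3}


def pick_model(task_type: str) -> ModelTier:
    """Return the cheapest tier that is adequate for the task."""
    # Unknown task types fall back to the most demanding level.
    needed = TASK_COMPLEXITY.get(task_type, 3)
    adequate = [t for t in TIERS if t.max_complexity >= needed]
    return min(adequate, key=lambda t: t.cost_per_1k_tokens)


class CachedRouter:
    """Routes each request to the cheapest adequate tier and caches exact repeats."""

    def __init__(self, call_model):
        self._call_model = call_model  # callable(model_name, prompt) -> response
        self._cache = {}
        self.calls_made = 0            # number of billable model calls

    def ask(self, task_type: str, prompt: str) -> str:
        key = (task_type, prompt)
        if key in self._cache:
            return self._cache[key]    # duplicate request: no new tokens spent
        model = pick_model(task_type)
        self.calls_made += 1
        response = self._call_model(model.name, prompt)
        self._cache[key] = response
        return response
```

In use, `CachedRouter(my_client).ask("search", "order status")` would go to the small tier, and a second identical call would be served from the cache without another billable request.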

Reinforcing this, recent BCG guidance advises finance chiefs to demand proof of measurable economic value before approving any AI initiative for scaling. This strategy effectively limits financial exposure where cost forecasts have a high degree of variance.

Regulatory and talent friction

Beyond employee liquidity, the talent equation is complicated by new regulatory friction. Regulators are now scrutinizing talent-centric "acqui-hire" and licensing deals for potential antitrust violations, treating them as de facto mergers. This legal uncertainty threatens to extend deal timelines, increase integration costs, and disrupt the rapid talent acquisition strategies common in recent years.

This occurs against the backdrop of a severe talent shortage. Labor market data indicates a significant imbalance between open AI positions and experienced practitioners in the U.S. This imbalance drives talent costs higher, forcing companies to offer larger cash incentives, retention bonuses, and equity liquidity.

Early signs of a structured response

The most agile enterprises are adopting structured cost-control frameworks similar to the cloud FinOps programs of the last decade. This involves inventorying all AI use cases, assigning clear ownership, and conducting quarterly cost-value reviews. Advanced finance teams are also implementing scenario-based forecasting (with expected, committed, and stress cases) to buffer against unplanned cost surges.
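As a rough illustration of scenario-based forecasting, the sketch below totals annual cost for committed, expected, and stress cases and computes the reserve needed to cover the stress case. The cost model (a fixed monthly fee plus a per-token charge) and every figure in it are assumptions made for the example, not numbers from any company named here.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    monthly_tokens_m: float     # projected usage, in millions of tokens per month
    price_per_m_tokens: float   # assumed price per million tokens, USD
    fixed_monthly_cost: float   # assumed GPU leases and platform fees, USD

    def monthly_cost(self) -> float:
        return self.fixed_monthly_cost + self.monthly_tokens_m * self.price_per_m_tokens


def forecast(scenarios, months=12):
    """Total cost per scenario over the planning horizon."""
    return {s.name: s.monthly_cost() * months for s in scenarios}


def reserve_needed(totals, budget):
    """Buffer required so the approved budget still covers the stress case."""
    return max(0.0, totals["stress"] - budget)


# Hypothetical inputs: "committed" is contracted spend only, "expected" adds
# planned usage, and "stress" assumes usage runs at three times the plan.
scenarios = [
    Scenario("committed", 0, 2.0, 50_000),
    Scenario("expected", 2_000, 2.0, 50_000),
    Scenario("stress", 6_000, 2.0, 50_000),
]
totals = forecast(scenarios)
buffer = reserve_needed(totals, budget=700_000)
```

The gap between the expected and stress totals gives finance a concrete number to hold in reserve, rather than a single point forecast.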

While AI is set to remain a material item on corporate ledgers, emerging trends show that disciplined governance, right-sized model selection, and transparent talent incentives are effectively bending the cost curve for early adopters.


How are Meta and AT&T tackling runaway AI costs?

Both companies are moving from "use all you want" to "use only what you need".
- Meta has quietly introduced token limits in internal tools and throttled query frequency for non-critical models to rein in sky-high inference bills.
- AT&T is taking a networking approach: it now rate-limits AI traffic across its cloud fabric and has begun rerouting high-cost generative tasks to cheaper on-prem clusters.

Early data from enterprise FinOps dashboards suggest a significant drop in per-query cost after these controls went live.

What concrete tactics are large enterprises using to forecast and contain AI spend?

Top-performing firms treat AI like a managed financial portfolio with four action pillars:

  1. Central governance & platform standardization
    One global retailer achieved significant savings by consolidating multiple disjoint AI stacks into a single governed platform; duplicated data pipelines disappeared overnight.

  2. AI-specific FinOps controls
    Finance teams now set daily hard caps on tokens, GPU-hours, and agent runs, then push real-time alerts to Slack when burn exceeds plan.

  3. Model routing & caching
    A telecom giant routes the majority of support tickets to a fine-tuned smaller model instead of the flagship frontier model, substantially cutting inference cost per ticket.

  4. Lifecycle-based budgeting
    Budgets are split into Run / Build / Scale / Experiment buckets, ensuring pilots do not cannibalize production funding.
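The second pillar, a daily hard cap with a real-time alert, might look something like the following Python sketch. The class name, the 80% alert threshold, and the `notify` callable are hypothetical; in practice `notify` could wrap a chat webhook, but no real Slack API is used here.

```python
from collections import defaultdict


class DailyTokenCap:
    """Tracks token use per team, alerts near the cap, and blocks at the cap."""

    def __init__(self, cap_tokens, alert_ratio=0.8, notify=print):
        self.cap_tokens = cap_tokens    # hard daily limit per team
        self.alert_ratio = alert_ratio  # assumed early-warning fraction of the cap
        self.notify = notify            # callable(message); stand-in for a chat alert
        self.used = defaultdict(int)
        self._alerted = set()

    def request(self, team, tokens):
        """Return True if the request fits under the cap, False if it is blocked."""
        if self.used[team] + tokens > self.cap_tokens:
            self.notify(f"{team}: blocked {tokens} tokens, daily cap of {self.cap_tokens} reached")
            return False
        self.used[team] += tokens
        # Warn once per team per day when usage crosses the alert threshold.
        if team not in self._alerted and self.used[team] >= self.alert_ratio * self.cap_tokens:
            self._alerted.add(team)
            self.notify(f"{team}: {self.used[team]} of {self.cap_tokens} daily tokens used")
        return True

    def reset_day(self):
        """Clear counters at the start of a new budget day."""
        self.used.clear()
        self._alerted.clear()
```

The same pattern would extend to GPU-hours or agent runs by tracking a different unit; the key design choice is that the cap is enforced before the spend happens, not reported after it.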

How have AI chip megatrends impacted Broadcom and Qualcomm?

Broadcom is the clearest winner:
- Broadcom reported Q1 FY2026 AI semiconductor revenue of $8.4 billion, up 106% YoY, and management said it has line of sight to over $100 billion in AI chip revenue in 2027.
- The company continues to see strong demand for AI chips, though industry reports suggest margin pressure as AI chips may carry different gross margins than legacy networking silicon.

Qualcomm, per available sources, does not disclose discrete AI-chip revenue, and the sources reviewed show no earnings or guidance figures for 2024-2026; any impact is inferred rather than proven.

Why is AI talent liquidity suddenly a board-level risk?

  • Tight market: A significant imbalance exists between open AI roles and experienced practitioners in the U.S., keeping salaries and churn high.
  • Secondary liquidity arms race: Start-ups now offer tender offers, secondary share sales, and equity-backed loans to let talent cash out early, forcing incumbents to match or lose key staff.
  • Regulatory spotlight: Antitrust agencies are scrutinizing "hire-and-license" deals (used to acquire AI teams without a formal merger) for possible merger avoidance. The result: deals can take longer, cost more in legal fees, or be blocked outright.

What early warning metrics should CFOs monitor to avoid budget shock?

Industry best practices suggest monitoring key metrics across multiple dimensions:

Metric                        | Trigger threshold           | Action
Daily token burn              | Above forecast              | Auto-throttle non-prod keys
GPU utilization               | Low for extended periods    | Reallocate to training backlog
Cost per active user          | Above baseline              | Route to cheaper model tier
Anomaly detection spike       | Above normal range          | Alert finance + security
Vendor commitment utilization | Below pre-paid volume       | Renegotiate or resell excess
ROI per use case              | Negative after trial period | Kill or pivot pilot

Teams that have adopted comprehensive monitoring dashboards report improved forecast accuracy and faster decision-making on failing pilots.
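
The metric table above could be encoded as a small rule set that a dashboard evaluates against each day's numbers. The sketch below is one possible shape, in Python; the metric key names and the numeric cut-offs (such as seven low-utilization days) are assumptions, since the table gives only qualitative thresholds.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    metric: str
    breached: Callable[[dict], bool]  # returns True when the trigger fires
    action: str


# Each rule mirrors one table row; key names and numeric limits are hypothetical.
RULES = [
    Rule("Daily token burn",
         lambda m: m["daily_tokens"] > m["forecast_daily_tokens"],
         "Auto-throttle non-prod keys"),
    Rule("GPU utilization",
         lambda m: m["low_gpu_days"] >= 7,  # assumed meaning of "extended periods"
         "Reallocate to training backlog"),
    Rule("Cost per active user",
         lambda m: m["cost_per_user"] > m["baseline_cost_per_user"],
         "Route to cheaper model tier"),
    Rule("Anomaly detection spike",
         lambda m: m["anomaly_score"] > m["anomaly_limit"],
         "Alert finance + security"),
    Rule("Vendor commitment utilization",
         lambda m: m["commit_used"] < m["commit_prepaid"],
         "Renegotiate or resell excess"),
    Rule("ROI per use case",
         lambda m: m["trial_over"] and m["roi"] < 0,
         "Kill or pivot pilot"),
]


def triggered_actions(metrics: dict) -> list:
    """Return (metric, action) pairs for every rule whose trigger fired."""
    return [(r.metric, r.action) for r in RULES if r.breached(metrics)]


# Hypothetical daily snapshot.
snapshot = {
    "daily_tokens": 1_200_000, "forecast_daily_tokens": 1_000_000,
    "low_gpu_days": 2,
    "cost_per_user": 0.40, "baseline_cost_per_user": 0.50,
    "anomaly_score": 1.0, "anomaly_limit": 3.0,
    "commit_used": 90, "commit_prepaid": 100,
    "trial_over": False, "roi": -0.2,
}
actions = triggered_actions(snapshot)
```

With this snapshot, only the token-burn and vendor-commitment rules fire; the negative ROI is ignored because the trial period has not ended.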