AI Compute Deals Pressure Margins, Shift Valuation Metrics

Serge Bulaev

AI compute costs are rising as companies spend more on both training and running AI models. While training big models like GPT-4 may cost over $100 million, most long-term spending now goes to running these models (inference), which can compress margins if revenue growth does not keep up. Large deals to lock in chip supply can help manage risks but also create high fixed costs and reduce flexibility if demand drops or new hardware appears. Some reports suggest that missing usage targets by even a small amount may erase profits. Overall, how well these companies match their compute supply with actual usage may decide whether they stay profitable.

The economics of AI compute deals are reshaping board-level agendas, as nine-figure training budgets and billion-dollar chip reservations redefine risk for investors. Every dollar spent on compute flows into either front-loaded training costs or recurring inference expenses - a split that dictates unit economics, balance sheet structure, and the viability of long-term vendor contracts.

Training vs. Inference: Where the Lifecycle Money Goes

The majority of AI compute spending has shifted from one-time training costs to the recurring expense of inference, that is, running the model in real time to serve requests. This change forces companies to operate like capital-intensive factories, where profitability depends on managing variable costs that rise with every user interaction.

Initial model training costs are immense - analysts cited by Telnyx note that advanced model training requires substantial compute investments - but the long-term financial burden now falls on inference, which current benchmarks show consuming the majority of total compute budgets. Token prices are declining, yet surging usage still drives up total serving costs. The result is an operating profile in which revenue must grow significantly faster than the inference bill to prevent margin compression.

The High Stakes of Long-Term Chip Contracts

To manage these pressures, leading AI labs are signing massive, multi-year chip deals. These agreements, like OpenAI's reported deals with various chip providers, hedge against supply shortages but lock in substantial fixed costs (CNBC). Similarly, partnerships between major chip manufacturers and AI companies provide demand visibility at the cost of customer concentration risk (CNBC). With contracts spanning multiple years and hardware depreciating rapidly, these deals front-load costs and can depress margins just as revenue begins to scale.
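The front-loading effect described above can be illustrated with a minimal sketch. It assumes straight-line depreciation and compute as the only cost; the function name `early_year_margins` and all figures are hypothetical, chosen only to show the shape of the effect, not taken from any company's accounts.

```python
def early_year_margins(capex: float, useful_life_years: int,
                       revenue_by_year: list[float]) -> list[float]:
    """Margin per year when hardware is depreciated straight-line (an assumption)."""
    annual_depreciation = capex / useful_life_years
    margins = []
    for year, revenue in enumerate(revenue_by_year):
        # The depreciation charge is flat, while revenue is still ramping.
        charge = annual_depreciation if year < useful_life_years else 0.0
        margins.append((revenue - charge) / revenue)
    return margins


# Hypothetical inputs: 400 of capex over a 4-year life, revenue ramping from 80 to 400.
margins = early_year_margins(400.0, 4, [80.0, 160.0, 320.0, 400.0])
for year, margin in enumerate(margins, start=1):
    print(f"year {year}: margin {margin:.1%}")
```

In this toy case the flat charge exceeds revenue in the first year and weighs most heavily on the early years. A shorter assumed useful life raises the annual charge and deepens that early trough.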

Margin Sensitivity Model

Financial models show that profitability is highly sensitive to a few key levers:

  • Utilization: Significant swings in compute utilization can substantially alter gross margins.
  • Efficiency: Gains in token throughput per watt translate directly to operating margin if pricing remains stable.
  • Compute Sourcing: Pre-paid contracts stabilize costs (COGS) but risk stranding capital if demand falters or new hardware emerges.
  • Token Pricing: For high-volume APIs, small changes in token pricing can have meaningful impacts on profitability.
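As a rough illustration of how these levers interact, the sketch below models gross margin for a provider whose only cost of goods sold is a pre-paid compute commitment. That single-cost structure, the `ComputeScenario` fields, and every number are hypothetical assumptions for illustration, not figures from the reports cited here.

```python
from dataclasses import dataclass


@dataclass
class ComputeScenario:
    """All fields are hypothetical illustration inputs, not sourced figures."""
    committed_cost: float         # fixed annual cost of pre-paid capacity, USD
    capacity_tokens: float        # tokens servable per year at 100% utilization
    utilization: float            # share of capacity serving paid traffic, 0..1
    price_per_million: float      # revenue per million tokens, USD
    efficiency_gain: float = 0.0  # fractional throughput gain on the same hardware


def gross_margin(s: ComputeScenario) -> float:
    """Gross margin when compute is the only cost of goods sold (an assumption)."""
    served = s.capacity_tokens * (1.0 + s.efficiency_gain) * s.utilization
    revenue = served / 1e6 * s.price_per_million
    if revenue <= 0:
        raise ValueError("revenue must be positive to compute a margin")
    return (revenue - s.committed_cost) / revenue


base = ComputeScenario(100e6, 1e14, 0.80, 2.00)
scenarios = [
    ("base", base),
    ("utilization 60%", ComputeScenario(100e6, 1e14, 0.60, 2.00)),
    ("price -10%", ComputeScenario(100e6, 1e14, 0.80, 1.80)),
    ("efficiency +20%", ComputeScenario(100e6, 1e14, 0.80, 2.00, 0.20)),
]
for label, scenario in scenarios:
    print(f"{label:>16}: {gross_margin(scenario):.1%}")
```

Because the commitment is fixed, every lever acts through revenue alone, which is why a drop in utilization or price falls straight through to margin in this toy model. The efficiency case assumes demand exists to absorb the extra throughput.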

Investor Lens on Multi-Year Chip Contracts

From an investor's perspective, these mega-deals are a double-edged sword. While they offer chipmakers stable revenue backlogs, they saddle AI companies with sunk costs that reduce strategic flexibility. Financial forecasts become vulnerable to hardware production delays, yield problems, or a competitor's shift to more efficient in-house silicon. Furthermore, a timing gap often exists between when a provider recognizes revenue (on shipment) and when the customer generates cash (on utilization), potentially straining working capital during major infrastructure build-outs.

Valuation Impact

Consequently, valuation metrics are shifting. Private market investors now prioritize access to low-cost compute over pure model performance. However, high fixed-cost exposure can dampen exit multiples, as it limits a company's ability to adopt newer, more efficient hardware. This strategic divergence is clear in peer comparisons: OpenAI has diversified its suppliers (Nvidia, AMD, Broadcom) to mitigate technical risk at the cost of complexity, while Anthropic has concentrated on Google TPUs and Amazon Trainium, simplifying its stack but increasing supplier dependency. Each strategy presents a different tradeoff between cost, control, and risk.

The combination of inverted cost structures (inference > training), aggressive hardware depreciation, and massive fixed-cost contracts makes one metric paramount: compute utilization. Financial models indicate that failing to meet utilization targets can significantly impact projected profits. For today's leading AI operators, the central challenge is the continuous, precise alignment of committed capacity with real-time user demand.


How do training and inference costs differ, and why does the split matter to investors?

Training is a bounded, one-time capital hit: advanced model training requires substantial compute investments, with industry estimates suggesting costs in the hundreds of millions for the most sophisticated models.
Inference is an unbounded operating expense: every new user or agentic workflow adds real-time compute. Even though per-token prices have fallen dramatically since late 2022, total inference spend keeps rising because volume grows faster than unit cost drops.
For investors, this means the majority of lifetime compute dollars are now spent on inference, shifting metrics from "how expensive was training?" to "how efficiently will the model run at scale?"
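The arithmetic behind "volume grows faster than unit cost drops" can be sketched in a few lines. The growth and decline rates below are hypothetical placeholders, as is the function name `project_inference_spend`; the point is only that total spend rises whenever (1 + volume growth) × (1 - unit cost decline) exceeds 1.

```python
def project_inference_spend(tokens_year0: float, cost_per_million_year0: float,
                            volume_growth: float, unit_cost_decline: float,
                            years: int) -> list[float]:
    """Yearly inference spend under constant growth and decline rates (assumed)."""
    spend = []
    tokens = tokens_year0
    cost = cost_per_million_year0
    for _ in range(years):
        spend.append(tokens / 1e6 * cost)
        tokens *= 1.0 + volume_growth    # usage compounds upward
        cost *= 1.0 - unit_cost_decline  # per-token cost compounds downward
    return spend


# Hypothetical: volume triples each year while unit cost halves.
spend = project_inference_spend(1e12, 10.0, volume_growth=2.0,
                                unit_cost_decline=0.5, years=3)
for year, amount in enumerate(spend):
    print(f"year {year}: ${amount:,.0f}")
```

With these placeholder rates, spend grows 1.5x per year even though the per-token cost is cut in half each year.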


What do major AI compute deals tell us about margin pressure?

Large-scale agreements secure substantial dedicated capacity over multi-year periods, but they also lock companies into long-duration capex commitments just as token prices fall. Lower token prices mean less revenue per unit of capacity, so margins become more sensitive to how fully that capacity is used.
If capacity runs at lower utilization rates, these deals can pressure gross margins; at higher utilization they can still deliver meaningful margins even at current token pricing levels.
Contracts of this size therefore act as leverage on utilization: they de-risk supply but amplify downside if demand lags.
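One way to make the leverage point concrete is to compute the utilization at which a fixed commitment just breaks even. The sketch below assumes the commitment is the only cost and that price per token is uniform; the function name `break_even_utilization` and the numbers are hypothetical and only illustrate the direction of the effect.

```python
def break_even_utilization(committed_cost: float, capacity_tokens: float,
                           price_per_million: float) -> float:
    """Utilization at which token revenue equals the fixed commitment.

    A result above 1.0 means the deal loses money even at full load.
    """
    full_load_revenue = capacity_tokens / 1e6 * price_per_million
    if full_load_revenue <= 0:
        raise ValueError("capacity and price must be positive")
    return committed_cost / full_load_revenue


# Hypothetical commitment: $100M per year for 1e14 tokens of annual capacity.
for price in (2.00, 1.50, 1.25):
    u = break_even_utilization(100e6, 1e14, price)
    print(f"price ${price:.2f}/M tokens -> break-even utilization {u:.0%}")
```

As the assumed token price falls, the break-even utilization climbs, so the same contract leaves less room for a demand shortfall.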


How do Anthropic and OpenAI compare in their infrastructure risk profiles?

Company | Main compute stack | Strategic approach | Concentration risk
OpenAI | Nvidia, AMD, Cerebras, Broadcom, Microsoft, AWS | Multi-vendor diversification across substantial commitments | Lower per-supplier exposure, higher fixed-cost stack
Anthropic | Google TPUs and Amazon Trainium | Focused partnerships with major cloud providers | Fewer vendors, deeper lock-in, but more predictable unit economics

Anthropic's cloud-centric approach may front-load less absolute capex than OpenAI's multi-vendor strategy, yet leaves it more exposed to Google and Amazon pricing power. Diversification lowers single-supplier risk at OpenAI, but the aggregate fixed-cost exposure across many vendors increases overall commitment levels.


Which valuation metrics are replacing traditional P/E ratios for AI firms?

Instead of quarterly earnings, analysts now track:
- Compute capacity secured - committed infrastructure capacity relative to projected needs
- Utilization-adjusted gross margin - modeled at various load factors
- Time-to-cash conversion - months between chip delivery and first paid inference token
These forward-looking metrics bridge the gap between today's capex and tomorrow's per-token revenue stream, giving investors a clearer picture of when cash flow will actually turn positive.
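Two of these metrics reduce to simple calculations, sketched below. The article does not give exact definitions, so the formulas, function names, and sample dates here are assumptions for illustration: coverage is taken as committed capacity divided by projected need, and time-to-cash as whole calendar months elapsed.

```python
from datetime import date


def capacity_coverage(committed_capacity: float, projected_need: float) -> float:
    """Committed capacity as a multiple of projected need (assumed definition)."""
    if projected_need <= 0:
        raise ValueError("projected need must be positive")
    return committed_capacity / projected_need


def months_to_cash(delivery: date, first_paid_token: date) -> int:
    """Whole months between chip delivery and the first paid inference token."""
    if first_paid_token < delivery:
        raise ValueError("first paid token cannot precede delivery")
    months = ((first_paid_token.year - delivery.year) * 12
              + (first_paid_token.month - delivery.month))
    if first_paid_token.day < delivery.day:
        months -= 1  # the final month is not yet complete
    return months


# Hypothetical inputs in arbitrary capacity units and made-up dates.
print(capacity_coverage(committed_capacity=12.0, projected_need=10.0))
print(months_to_cash(date(2025, 1, 15), date(2025, 9, 1)))
```

A coverage ratio above 1 indicates capacity committed beyond projected need, which is where the stranded-capital risk discussed earlier comes from.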


How might multi-year chip contracts influence market-wide tail risks?

  • Concentration risk: When major hyperscalers drive a significant portion of AI chip demand, any renegotiation or shift to in-house silicon could substantially impact supplier revenue.
  • Duration supply in credit markets: AI data-center debt issuance is now large enough that it can influence overall duration supply and affect long-term rates.
  • "Chipflation" feedback loop: Industry analysts suggest that persistent chip shortages could contribute to goods inflation, feeding back into higher discount rates for AI equities.

The takeaway: massive long-term contracts improve revenue visibility but export risk into the broader macro cycle that investors cannot hedge with company-level data alone.