AI Infrastructure Costs Soar: Goldman Sachs Estimates $15-20 Million Per MW

Serge Bulaev

Serge Bulaev

AI infrastructure costs are rising quickly, with Goldman Sachs estimating new AI-ready data centers may cost $15-20 million per megawatt, much higher than traditional builds. This increase is partly due to higher power needs, advanced cooling, and more frequent hardware upgrades. Large companies like Microsoft, Amazon, Google, and Meta are expected to spend hundreds of billions on data centers and related technology in the next few years. Startups may struggle with these costs and often choose smaller models or shared resources. Experts suggest that finding ways to use resources more efficiently might become the main way companies compete as AI costs grow.

AI Infrastructure Costs Soar: Goldman Sachs Estimates $15-20 Million Per MW

The rapid escalation of AI infrastructure costs is reshaping technology budgets globally, as capital expenditure on power-hungry hardware and specialized data centers outpaces forecasts. This spending surge is driven by a competitive rush to secure scarce hardware and grid capacity, fundamentally altering deployment timelines and strategies.

Why the infrastructure price tag keeps climbing

The higher cost of AI data centers is driven by specialized needs, including greater power density for advanced GPUs, sophisticated liquid cooling systems, and more frequent hardware upgrades. Traditional data centers are commonly cited at roughly $7M-$12M per MW, while AI-optimized designs can rise to $15M-$20M+ per MW depending on power density and cooling.

Next-generation AI facilities command a much higher price per megawatt. According to industry reports, new AI-optimized designs require significantly more capital investment compared to traditional hyperscale data centers. This increase is attributed to the demands of higher power density, advanced liquid cooling, and redundant power infrastructure.

Adding to the expense are accelerated silicon refresh cycles. Data center buildings have lifespans measured in decades, but the underlying GPUs are refreshed every two to three years - a cycle that can inflate long-term capital expenditures by hundreds of billions. Furthermore, memory shortages contribute to rising prices for DRAM and NAND.

Who is spending and where the money is going

Hyperscale cloud providers are leading the investment surge, with companies like Microsoft, Amazon, Google, and Meta making substantial capital investments. Goldman Sachs estimated hyperscale cloud providers' AI-related capex at about $200 billion in 2024, and broader AI infrastructure spending at about $765 billion in 2026. This trend signifies a major shift in global IT spending patterns.

Market dynamics are also a factor. With the advanced GPU market dominated by just a few suppliers, buyers have little negotiating power. This is compounded by long lead times for essential power components like transformers and switchgear, further limiting deployment flexibility.

Strategic pressure on startups and enterprises

For startups, the capital required for large-scale AI clusters presents a significant barrier, with initial builds requiring substantial upfront investment plus considerable operating costs. Consequently, many emerging companies opt for more cost-effective strategies, such as using managed APIs, fine-tuning smaller models, or leveraging shared GPU resources.

In response, enterprises are implementing stricter procurement and FinOps controls. Budgets now feature dedicated line items for AI governance and lifecycle management, while centralized teams focus on negotiating capacity reservations and monitoring GPU utilization to prevent waste. This financial scrutiny is intensifying as the cost to build critical IT load continues to rise, with a better-supported range of about $7M to $12M per MW of commissioned IT load, with many recent summaries citing roughly $10M to $12M per MW for standard builds.

Managing the bill: practical levers

As costs climb, a clear playbook for AI cost optimization is emerging:

  • Select the most efficient model that meets performance and latency requirements.
  • Implement caching for repeated inferences and batch process non-urgent jobs.
  • Adopt a hybrid cloud strategy, routing steady workloads to on-premise systems and using the cloud for burst training.
  • Monitor spending granularly by tracking cost per 1,000 tokens, per inference, and per team.

Hybrid deployments are critical for aligning workloads with optimal power, compliance, and pricing environments. Organizations further report that predictive monitoring tools can yield operational savings between 13-25 percent. Ultimately, these efficiency tactics, not just total capital expenditure, will likely define competitive advantage in the AI era.


What is driving the jump in costs for new AI data centers?

Industry reports pin the increase on power-hungry GPUs, liquid-cooling loops, and redundant switchgear required for high-density racks. Traditional cloud facilities averaged lower costs per MW; AI-oriented facilities now need significantly more capital to support dense accelerator pods and shorter chip-refresh cycles.

How does this cost pressure change startup fundraising?

Investors increasingly ask founders to show unit-economics proof before writing checks. With hyperscalers making substantial AI capex commitments, seed decks must explain why a lean model, domain data, or workflow lock-in can survive once giants discount compute. Many teams now manage capital more carefully by fine-tuning smaller parameter models instead of training from scratch.

Where are enterprises re-writing procurement rules?

CFOs are moving AI spend under central FinOps teams that demand:
- cost per 1,000 tokens
- committed-use GPU rates
- cross-cloud showback

Many contracts increasingly tie renewals to provable ROI (cycle-time cut, revenue uplift). Vendors that cannot itemize power, silicon, and inference line items are losing RFPs to suppliers offering transparent dashboards.

Which optimization tactics deliver fast relief without hurting accuracy?

  1. Model routing - send a majority of queries to a distilled smaller model, escalate only on confidence score
  2. Aggressive caching of embeddings and frequent prompts
  3. Dynamic batching plus 8-bit quantization - cuts GPU seconds significantly
  4. Rightsized context - trim prompt length; every 1,000 tokens saved provides meaningful cost savings on enterprise-class pricing

When does a hybrid (cloud + on-prem + edge) layout make financial sense?

Use the hybrid playbook once monthly cloud AI spend becomes substantial:
- keep steady baseline inference on reserved on-prem or colo space where power costs are lower
- burst training spikes into cloud spot instances
- place latency-critical workloads at edge POPs to avoid data-egress charges

Industry estimates suggest this approach can reduce overall compute costs while meeting sovereignty and low-latency constraints.