SAVRN Infrastructure Platform
Sources & Citations: AI Inference Infrastructure
Last updated: May 13, 2026
Sources · Citations · Receipts · Piece 14
Every quantitative claim in AI Inference Infrastructure: The 2026 Operator’s Playbook mapped to the primary report that supports it. Ten primary sources organized by category. URLs verified May 13, 2026.
Editorial standard. Every URL below is one of the following: (a) the named publisher’s official research document, (b) an Independent Market Monitor or system operator’s published analysis, (c) a peer-reviewed research report from a named institution, or (d) a regulator’s primary publication. No paywalled trade-press summaries are load-bearing. Where a primary URL has rotated, the report title and publication year are sufficient to retrieve the record directly from the issuing publisher.
A · Market & spend forecasts (McKinsey · Goldman · EPRI)
| Claim | Primary source | Document / URL |
|---|---|---|
| Global data center capacity nearly triples to 219 gigawatts by 2030, with about 70 percent of new demand from AI workloads. Inference identified as the dominant AI workload by 2030. AI-equipped data centers projected to require $5.2 trillion in capital expenditures through 2030. | McKinsey & Company — “The cost of compute: A $7 trillion race to scale data centers” (2025) | mckinsey.com/…/the-cost-of-compute-a-7-trillion-dollar-race-to-scale-data-centers |
| Data center power demand grows 165 percent by 2030 versus 2023. AI workload share of total data center power consumption rises from 14 percent (today) to 27 percent (2027) to 39 percent (2030). Inference becomes the main AI requirement by 2027. | Goldman Sachs Research — “AI to drive 165% increase in data center power demand by 2030” | goldmansachs.com/…/ai-to-drive-165-increase-in-data-center-power-demand-by-2030 |
| U.S. data centers projected to consume 4.6 to 9.1 percent of total U.S. electricity generation annually by 2030, up from roughly 4 percent in 2023. Flat-profile load methodology (load distributed evenly across hours of the year). | EPRI — “Powering Intelligence: Analyzing Artificial Intelligence and Data Center Energy Consumption” (2024) | epri.com/research/products/3002028905 |
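The flat-profile assumption noted in the EPRI row amounts to dividing annual energy by the hours in a year. The Python sketch below shows that conversion under stated assumptions: the function name, the 8,760-hour (non-leap) year, and the 87.6 TWh input are illustrative choices, not figures taken from the EPRI report.

```python
HOURS_PER_YEAR = 8_760  # non-leap year; an assumption for this sketch


def flat_profile_average_gw(annual_twh: float) -> float:
    """Average load in GW if annual_twh were spread evenly over every hour."""
    if annual_twh < 0:
        raise ValueError("annual energy must be non-negative")
    # 1 TWh = 1,000 GWh, so GWh divided by hours gives average GW.
    return annual_twh * 1_000 / HOURS_PER_YEAR


# Illustrative input only (not an EPRI figure).
example_gw = flat_profile_average_gw(87.6)
print(f"{example_gw:.1f} GW")
```

A flat profile yields the average load, not the peak. Because peak load can never be lower than the average, the result is best read as a floor on the capacity that must be served.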
B · Inference cost curve (Stanford HAI)
| Claim | Primary source | Document / URL |
|---|---|---|
| Inference cost for a GPT-3.5-equivalent model (MMLU score 64.8) fell from $20 per million tokens in November 2022 to $0.07 per million tokens by October 2024 (Gemini-1.5-Flash-8B), a 280-fold reduction in approximately 18 months. | Stanford HAI — 2025 AI Index Report | hai.stanford.edu/ai-index/2025-ai-index-report |
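The arithmetic behind the fold reduction can be checked from the two price points in the row above. The Python sketch below also counts the calendar months between the two stated dates, which comes to 23, so the "approximately 18 months" wording is worth confirming against the report. All names are illustrative, and the constant monthly rate is a simplifying assumption rather than anything the report states.

```python
from datetime import date

# Price points as stated in the table row (USD per million tokens).
PRICE_START = 20.00  # November 2022
PRICE_END = 0.07     # October 2024


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


fold_reduction = PRICE_START / PRICE_END
elapsed_months = months_between(date(2022, 11, 1), date(2024, 10, 1))

# Implied constant monthly price multiplier (assumes a smooth decline).
monthly_factor = (PRICE_END / PRICE_START) ** (1 / elapsed_months)

print(f"fold reduction: {fold_reduction:.0f}x")
print(f"elapsed months: {elapsed_months}")
print(f"average monthly price change: {(monthly_factor - 1) * 100:.1f}%")
```

The ratio works out to roughly 286, consistent with the 280-fold figure once rounded down.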
C · Grid, interconnection, and electricity demand (LBNL · IEA)
| Claim | Primary source | Document / URL |
|---|---|---|
| As of year-end 2023: over 1,570 GW of generation and approximately 1,030 GW of storage active in U.S. interconnection queues (approximately 2,600 GW total). Median time from interconnection request to commercial operation reached five years for projects built in 2023, up from less than two years for the 2000-2007 cohort. | Lawrence Berkeley National Laboratory — “Queued Up: 2024 Edition, Characteristics of Power Plants Seeking Transmission Interconnection As of the End of 2023” | emp.lbl.gov/publications/queued-2024-edition-characteristics |
| AI workload load curves differ structurally from traditional industrial demand; data centers and AI are a rising share of electricity demand globally. | International Energy Agency — Electricity 2024 | iea.org/reports/electricity-2024 |
D · Density & infrastructure (Uptime · ASHRAE)
| Claim | Primary source | Document / URL |
|---|---|---|
| Typical rack density averaged approximately 8 kW across 2024 survey respondents, and only about 1 percent of operators reported racks above 100 kW. Dense racks were concentrated among hyperscalers and AI-specialized facilities. | Uptime Institute — Global Data Center Survey 2024 | intelligence.uptimeinstitute.com/resource/uptime-institute-global-data-center-survey-2024 |
| Direct-to-chip liquid cooling and immersion cooling are needed to sustain operation as rack densities climb past the 50-to-60 kW band that defines the air-cooling cliff. | ASHRAE — Technical Committee 9.9 (Mission Critical Facilities, Data Centers, Technology Spaces) thermal guidance | tc0909.ashraetcs.org |
E · Power cost & utility data (U.S. EIA)
| Claim | Primary source | Document / URL |
|---|---|---|
| Retail commercial-industrial power rates vary widely by state and utility; tracked monthly. Used as the baseline retail tariff anchor in the worked-example unit-economic comparison. | U.S. Energy Information Administration — Electric Power Monthly | eia.gov/electricity/monthly |
F · State regulatory landscape (MultiState)
| Claim | Primary source | Document / URL |
|---|---|---|
| Twelve U.S. states have introduced data center moratoria or restrictive AI-load bills as of early 2026. Carries forward from SAVRN piece 6 doctrine (“Data Center Moratorium: 12 States, 2026 Map, The Fix”). | MultiState Associates — state legislative tracker (AI / data center matrix) | multistate.us |
Cite this article
If you reference this SAVRN piece in your own research, op-ed, white paper, or trade-press writing, please use one of the citation formats below.
Chicago
Harris, Chad Everett. "AI Inference Infrastructure: The 2026 Operator's Playbook." SAVRN, May 13, 2026. https://savrn.com/ai-inference-infrastructure/.
APA
Harris, C. E. (2026, May 13). AI Inference Infrastructure: The 2026 Operator's Playbook. SAVRN. https://savrn.com/ai-inference-infrastructure/
BibTeX
@misc{savrn_ai_inference_infrastructure_2026,
  author    = {Harris, Chad Everett},
  title     = {{AI Inference Infrastructure: The 2026 Operator's Playbook}},
  publisher = {SAVRN},
  year      = {2026},
  month     = may,
  url       = {https://savrn.com/ai-inference-infrastructure/},
  note      = {Accessed: }
}