Truly useful primary sources
10
Among papers, open benchmarks, corporate statements, and auditable estimates.
Transparency / Opacity
Map of which providers publish data, which do not, and with what methodological quality.
As of April 2026, nearly all debate about AI energy consumption rests on a handful of laboratory measurements, a single granular production figure, and several corporate or academic estimates with high margins of error. The main problem is not a lack of interest: it is the lack of open, comparable telemetry by service.
Public range for a text query
0.24–0.34 Wh
Google and OpenAI mark the narrow known reference range for general chat.
Maximum observed deviation
x 27
Opaque estimation chains can inflate the difference between inferred and actual figures.
This inventory separates direct measurement, production data, and indirect estimation to answer a simple question: what do we actually know and what are we still assuming.
The conclusion is uncomfortable: most figures circulating in the press, regulation, and marketing are not verifiable telemetry. They are approximations built on assumed hardware, estimated utilization, and proprietary models that remain closed.
Logarithmic scale based on the most cited public range for text, image generation, and open-source video.
Conclusion: the central problem is no longer calculating a nice number, but distinguishing between real telemetry and speculative narrative. Without that distinction, any comparison between models remains fragile.
This section gathers the sources that genuinely contribute to the energy debate: direct laboratory measurement, a granular production case, and a small set of academic or corporate estimates that, even with limitations, help bracket orders of magnitude.
| Source | Type | Reported value | Key finding |
|---|---|---|---|
| Google — Gemini median August 2025 · arXiv:2508.15734v1 | Production | 0.24 Wh / query | The only granular production figure published, including TPU, host overhead, and PUE. |
| Sam Altman — ChatGPT June 2025 · corporate blog | Estimate | 0.34 Wh / query | Serves as a media reference, but comes without methodology, peer review, or breakdown by modality. |
| Hugging Face AI Energy Score December 2025 · Sasha Luccioni et al. | Direct | 1 to 5 stars | Compares over 200 open models and shows that reasoning can increase consumption by up to hundreds of times. |
| ML.Energy (University of Michigan) 2025-2026 · Jae-Won Chung et al. | Direct | Open leaderboard | Provides useful context for open-source models, but does not solve the black box of closed providers. |
| The Hidden Cost of an Image June 2025 · arXiv:2506.17016 | Direct | Up to x46 between models | Confirms the enormous energy dispersion in image generation and the limited usefulness of comparing by brand without technical context. |
| Video Killed the Energy Budget September 2025 · arXiv:2509.19222 | Direct | Up to x2,000 vs text | Open-source video already marks a clear physical rupture: modality matters more than the model's marketing. |
| Generative audio May 2025 · arXiv:2505.07615 | Direct | Varies by model | Nearly the only useful empirical reference for text-to-audio, and it leaves out the dominant commercial platforms. |
| How Hungry is AI? 2025 · arXiv:2505.09598 | Estimate | o3: 39.2 Wh · Claude 3.7: 17 Wh | Good snapshot of possible scenarios, but remains theoretical inference based on pricing and hardware assumptions. |
| Monte Carlo bottom-up simulation September 2025 · arXiv:2509.20241 | Estimate | Median 0.34 Wh | One of the best academic approximations, but depends on too many unobservable input assumptions. |
| Claude Code energy estimate January 2026 · Simon P. Couch | Estimate | 41 Wh / median session | Useful for sizing agents, although the author himself acknowledges a margin of error close to x3. |
The table summarizes comparable findings. Full details and methodological limitations remain in the original sources.
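The "Estimate" rows above all rest on arithmetic of roughly this shape: assumed hardware power, scaled by assumed utilization, overhead, and PUE, divided over assumed throughput. A minimal sketch follows; every input value is an illustrative assumption chosen here for demonstration, not a measured or published figure.

```python
# Hypothetical bottom-up estimate of energy per text query.
# Every number below is an illustrative assumption, not telemetry.
gpu_power_w = 700          # assumed accelerator board power (W)
utilization = 0.4          # assumed average utilization while serving
host_overhead = 1.3        # assumed host CPU/DRAM/network multiplier
pue = 1.2                  # assumed data-center PUE
tokens_per_query = 500     # assumed median tokens generated per query
throughput_tok_s = 300     # assumed per-accelerator tokens/s (batched)

seconds_per_query = tokens_per_query / throughput_tok_s
energy_wh = (gpu_power_w * utilization * host_overhead * pue
             * seconds_per_query) / 3600
print(f"{energy_wh:.3f} Wh per query")  # ~0.2 Wh with these inputs
```

Note that six plausible-looking inputs are enough to land anywhere in or far outside the published 0.24-0.34 Wh range; none of them is observable from outside a provider.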
Opacity is not uniform. There is an especially severe gap in agents, commercial video, aggregated inference, and distributed workloads within closed platforms. This table documents what key information remains unpublished and where there has been explicit refusal or sustained silence.
| Provider / service | Missing data | Status |
|---|---|---|
| OpenAI Text (GPT-5) | Actual consumption per query | No data |
| OpenAI Image (DALL-E / GPT-4o) | Actual consumption per image | No data |
| OpenAI Video (Sora 2) | Consumption per clip in production | No data |
| OpenAI Agent (Deep Research) | Actual consumption per session | No data |
| Anthropic Text (Claude) | Actual consumption per query in production | No data |
| Anthropic Agents (Claude Code / Research) | Actual consumption per automated session | No data |
| Google Agent (Gemini Deep Research) | Actual consumption per session | Request denied |
| Google Video (Veo 2/3) | Consumption per clip in production | No data |
| Meta Integrated inference | Aggregated AI consumption across Facebook, Instagram, and WhatsApp | No data |
| xAI Text (Grok 4) | Actual consumption and emissions from Colossus | No data |
| Music platforms Suno / Udio | Any public empirical data | No data |
| Commercial video Runway / Pika / Kling | Any public empirical data | No data |
The absence of data does not mean the absence of internal telemetry. It means the absence of useful publication for customers, regulators, or researchers.
The most serious opacity is no longer in training, but in recurring commercial inference: agents, video, tools integrated into productivity suites, and aggregated consumption of platforms with billions of users.
The fact that Google was able to publish a median per query and, at the same time, deny more specific data for intensive services shows that the barrier is selective. Enough is shared to shape a narrative, not enough to enable comparison.
If the industry knows the exact consumption to manage capacity, pricing, and usage limits, then the absence of publication is not ignorance: it is strategy.
Bottom-up estimates do not fail due to individual bad faith, but because of the accumulation of unobservable assumptions. Each step adds uncertainty: architecture, hardware, utilization, overhead, PUE, and cost allocation among multiple tasks or users.
When a provider does not publish per-query telemetry, the analyst reconstructs the energy cost from the outside. That work can be intellectually rigorous and still remain an informed speculation.
The problem is cumulative: if each step introduces a reasonable margin, the total error can grow until it renders commercial or regulatory comparison useless.
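The compounding can be made concrete with a small Monte Carlo sketch. Assume, purely for illustration, that each of five reconstruction steps (architecture, hardware, utilization, overhead, PUE) is individually "reasonable": the analyst's guess is within a factor of 1.5 of the true value, log-uniformly. The spread of the combined estimate is then far wider than any single step suggests.

```python
import math
import random

random.seed(0)

N_STEPS = 5      # assumed: architecture, hardware, utilization, overhead, PUE
SPREAD = 1.5     # assumed: each guess within x1.5 of truth, log-uniform


def sample_ratio():
    """Ratio of one reconstructed estimate to the true value."""
    log_r = sum(
        random.uniform(-math.log(SPREAD), math.log(SPREAD))
        for _ in range(N_STEPS)
    )
    return math.exp(log_r)


ratios = sorted(sample_ratio() for _ in range(100_000))
p5, p95 = ratios[5_000], ratios[95_000]
print(f"90% of estimates fall between x{p5:.2f} and x{p95:.2f} of truth")
```

Under these toy assumptions the 90% interval spans roughly a factor of five end to end, even though no single step was off by more than 1.5x. This is the mechanism behind the large deviations observed between inferred and published figures, not bad faith at any individual step.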
AISHA: when an energy figure depends on too many invisible assumptions, it ceases to be an operational data point and becomes a sophisticated conjecture. The regulatory goal should not be to guess better, but to measure better.
Same category

- April 1, 2026 · Analysis of the economic and strategic incentives behind the lack of transparency.
- April 1, 2026 · What can already be measured, what standards are missing, and how regulatory demands fit in.
- April 1, 2026 · What would change if the market had comparable consumption metrics per service and modality.