How Much Energy Does the AI You Use Every Day Consume?

Text queries average ~0.3 Wh. Reasoning typically ranges between x 5 and x 130. Image between x 0.3 and x 14. Current commercial video between x 133 and x 1,400. Code agents between x 20 and x 150. Efficiency improves every year, but total consumption rises because each improvement triggers a surge in usage volume (Jevons Paradox).

Series	2024	2025	2026	2027	2028
Efficiency per token (Wh)	0.45	0.3	0.26	0.22	0.18
Daily queries (billions)	0.7	1.5	3.5	6	9

Generating 10 seconds of video with Veo 3.1 can consume as much energy as a microwave running for 1-2 hours.

That sentence is not rhetorical exaggeration. It’s a measured data point. And it’s just the tip of the iceberg of a reality that AI companies prefer not to quantify publicly.

At AISHA we have compiled, cross-referenced, and verified all available measurements as of April 2026 — academic papers, production data, independent benchmarks — to build the most comprehensive guide in English to the real energy consumption of artificial intelligence.

This is what we know.

It all starts with one number: 0.3 Wh

To talk in comparable numbers, we need a starting point. The reference unit is the standard text query: approximately 0.3 Wh (watt-hours).

How much is that? The energy a 10-watt LED bulb consumes in less than two minutes. It seems insignificant. But when multiplied by the billions of daily queries worldwide, the aggregate impact is anything but trivial.

Google is the only provider that has published a direct production measurement: 0.24 Wh as the median for text queries to Gemini (August 2025, real infrastructure measurement, not an estimate). Sam Altman stated that ChatGPT consumes 0.34 Wh on average, but without publishing any methodology. Anthropic has published absolutely nothing.

With that reference of 0.3 Wh as the baseline (x1), we can compare everything else.

Text: the cheapest modality (and the most unequal)

Not all text models consume the same amount. The difference between the lightest and the heaviest exceeds 40 times. This table shows it:

Model	Consumption per query	Multiplier
Gemini 2.5 Flash-Lite	0.10 – 0.15 Wh	x0.3 – x0.5
Llama 4 Scout	0.15 – 0.30 Wh	x0.5 – x1
DeepSeek V4	0.15 – 0.35 Wh	x0.5 – x1.2
GPT-5-mini	0.20 – 0.40 Wh	x0.7 – x1.3
Mistral Large	0.25 – 0.50 Wh	x0.8 – x1.7
Claude Sonnet 4.6	0.40 – 0.90 Wh	x1.3 – x3
GPT-5.4	0.50 – 1.20 Wh	x1.7 – x4
Gemini 2.5 Ultra	0.35 – 0.70 Wh	x1.2 – x2.3
Claude Opus 4.6	~4 Wh (estimated)	~x13

“Flash” or “mini” models are between 3 and 10 times more efficient than full frontier models. For the vast majority of everyday tasks — summarizing a text, drafting an email, answering a factual question — the small model is sufficient.

Model choice is not neutral. Choosing poorly can multiply your consumption by 26 times for the same task.

Reasoning: when thinking can cost up to 130 times more

The revolution of “thinking models” — models that reason internally before responding — has radically changed the energy equation. They generate chains of thought tens of thousands of tokens long before giving an answer, and that internal process consumes energy.

The following table collects the available measurements for the main reasoning modes:

Mode	Consumption	Multiplier vs. base text
GPT-5.4 with reasoning	4 – 18 Wh	x13 – x60
Claude with Extended Thinking	2 – 8 Wh	x7 – x27
o3 (long prompts)	~39 Wh	~x130
Deep Research (any provider)	10 – 40 Wh	x33 – x133

In the worst case, a single reasoning query consumes the same as 130 normal text queries.

The Hugging Face AI Energy Score v2 (December 2025), which measured 205 open-source models on H100 GPUs, found even more extreme results:

Phi-4-reasoning-plus: multiplier of x514 when activating reasoning (from 0.018 Wh to 9.46 Wh)
DeepSeek-R1-Distill-Llama-70B: multiplier of x154 (from 0.050 Wh to 7.63 Wh)
SmolLM3-3B: 13 Wh for a single question with reasoning activated

Activating reasoning mode when it’s not necessary is like using a 40-ton truck to go buy bread.

Images: every AI photo is like charging your phone

The research by Bertazzini et al. (June 2025) measured 17 diffusion models on an RTX 4090 and found a 46-fold variation between the most efficient and the least efficient.

These are the extremes of the spectrum:

Model	Consumption per image	Equivalence
LCM_SSD_1B (most efficient)	0.086 Wh	~0.3 text queries
Ideogram 3	0.8 – 2.5 Wh	3 – 8 queries
Midjourney v7	1 – 4 Wh	3 – 13 queries
DALL-E 4	2 – 6 Wh	7 – 20 queries
Native GPT-4o image	~3 Wh	~10 queries
Lumina (least efficient)	4.08 Wh	~14 queries

The difference between the cheapest and the most expensive model is the difference between turning on a flashlight and turning on an oven.

A counterintuitive finding: int8 quantization, which is supposed to reduce consumption, actually increases it by up to 64.5% in some image models. Efficiency is not always what it seems.

700 million images in one week. That’s what users generated when OpenAI launched native image generation in GPT-4o. That’s equivalent to approximately 2,100 MWh in image generation alone, in seven days.

Video: the great energy devourer

If text is the bicycle, video is the airplane. The research by Delavande and Luccioni (September 2025) measured 7 open-source video models on H100 and documented an 800-fold range between the cheapest and the most expensive.

These numbers speak for themselves:

Model	Duration	Consumption	Multiplier vs. text
AnimateDiff (most efficient)	2 sec	0.14 Wh	x0.5
Runway Gen-3	5 sec	3 – 8 Wh	x10 – x27
WAN2.1-14B	5 sec	~109 Wh	~x363
Kling 3.0	15 sec	~400 Wh	~x1,333
Sora 2	10 sec	~1,000 Wh	~x3,333

944 Wh per 5-second clip. That’s what Sora consumed — as much energy as charging a smartphone for a month. OpenAI shut it down on March 24, 2026 after accumulating total revenue of $2.1 million against estimated operating costs of $15 million per day.

A technical detail that aggravates the problem: doubling the video duration quadruples the energy consumption. The relationship is not linear — it’s exponential.

Audio: the modality nobody measures

Passoni et al. (May 2025) published the only paper with measurements of audio generation (text-to-audio), with 7 models on NVIDIA A40 GPUs:

AudioLDM (most efficient): ~0.25 Wh per 10-second clip
Tango2 (least efficient): ~2.0 Wh per 10-second clip

The concerning finding: newer models consistently consume more energy than older ones. The industry prioritizes quality over efficiency, without exception.

One single paper. Seven models. Zero data from commercial services. That is all the transparency that exists today in generative audio.

Code agents: 136 queries in a single session

Code agents represent a new consumption paradigm. Simon P. Couch analyzed Claude Code sessions (January 2026) and found that a median session processes 592,000 tokens and consumes approximately 41 Wh — the equivalent of 136 conventional text queries.

Complex sessions can reach 50 to 200 Wh. A developer using code agents during a full workday can consume as much energy as an average European household in a day.

A developer with a code agent running for eight hours consumes the same as their refrigerator in 24 hours.

The paradox that explains everything

This is perhaps the most important data point in the entire guide: efficiency per query improves constantly, but total consumption never stops growing.

Google demonstrated a 33-fold efficiency improvement in 12 months (May 2024 to May 2025). And yet, its total carbon emissions increased by 48-50% in the same period. Its actual electricity consumption grew by 27%, even though its accounting based on renewable energy certificates (market-based) declared a “12% reduction.”

This is the Jevons Paradox applied to AI: when a resource is used more efficiently, its cost drops, it becomes more accessible, the volume of use skyrockets, and total consumption increases.

The data confirms it:

Efficiency per token: improves 15-30% annually
Volume of daily queries: grows from 0.4-1.0 billion (2024) to 2.5-5.0 billion (2026)
Net result: total consumption rises 25% annually

Efficiency is necessary but insufficient. Without demand governance — choosing the right model, avoiding unnecessary use, measuring the impact — technological improvement only accelerates the problem.

The black holes: what we DON’T know

Everything above is based on the measurements that exist. But there are entire categories for which we have no data at all:

Deep Research from any provider (estimates range between 10 and 40 Wh — a x4 range)
Commercial image generation (DALL-E, Midjourney, Ideogram are excluded from academic benchmarks)
Sora and proprietary video models (estimates varied x27: from 35 to 936 Wh)
Music generation (Suno, Udio: literally zero published data)
Proprietary inference (GPT-5, Claude in production, Grok: no independent measurements)

The barrier is not technical. NVIDIA DCGM, the GPU monitoring system, is already deployed in every data center in the world. APIs already report costs in dollars per call. Adding an energy_wh field would be trivial.

Companies choose not to do it. The barrier is political, not technical.

What can I do?

If you’re a user: Use our AI footprint calculator to estimate your consumption. As a rule of thumb: text < image < audio < code < reasoning < video. The smallest model that solves your task is always the best choice.
If you’re a company: AI consumption is already part of your carbon footprint under CSRD. Demand consumption data per service from your providers. If Google can publish 0.24 Wh, so can everyone else.
If you’re a developer: Flash/mini by default. Reasoning only when the problem requires it. Cache results. Every architecture decision has an energy cost that gets multiplied by millions of users.
If you’re a regulator: Measurement is possible today, with technology that already exists in every datacenter. Appliance energy labels reduced consumption by 60% over 30 years. AI needs its own label.

How Much Energy Does the AI You Use Every Day Consume?

Energy consumption by AI modality

Even though each query costs less, doing many more means total spending rises

It all starts with one number: 0.3 Wh

Text: the cheapest modality (and the most unequal)

Reasoning: when thinking can cost up to 130 times more

Images: every AI photo is like charging your phone

Video: the great energy devourer

Audio: the modality nobody measures

Code agents: 136 queries in a single session

The paradox that explains everything

The black holes: what we DON’T know

What can I do?

Sources

Keep exploring AISHA

El 75% de los proyectos de IA no generan retorno. Cómo no ser uno de ellos

La IA puede ser la mejor herramienta de la historia — si la usamos bien

Next step

Calculate the approximate impact of your AI usage.