[Figure: visual scale comparing the energy consumption of different types of AI: text, image, audio, code, and video.]

How Much Energy Does the AI You Use Every Day Consume?

The definitive guide to energy consumption by model and modality in 2026

By AISHA · February 12, 2026 · 8 min read


Text queries average ~0.3 Wh. Against that baseline, reasoning typically ranges between x5 and x130, image between x0.3 and x14, current commercial video between x133 and x1,400, and code agents between x20 and x150. Efficiency improves every year, but total consumption rises because each improvement triggers a surge in usage volume (the Jevons Paradox).

Energy consumption by AI modality

Multipliers are relative to the 0.3 Wh baseline text query (x1); each range is the one documented in open sources. Note the logarithmic spread.

  • Text (flash): x0.17 – x0.8
  • Text (frontier): x0.8 – x4
  • Reasoning: x5 – x130
  • Image: x0.3 – x14
  • Audio: x0.8 – x7
  • Code agent: x20 – x150
  • Video: x133 – x1,400

Even though each query costs less, doing many more means total spending rises.

This is the Jevons Paradox: if something becomes cheaper and more efficient, it gets used much more, and total consumption can grow.

| Year | Efficiency per token (Wh) | Daily queries (billions) |
|------|---------------------------|--------------------------|
| 2024 | 0.45 | 0.7 |
| 2025 | 0.30 | 1.5 |
| 2026 | 0.26 | 3.5 |
| 2027 | 0.22 | 6 |
| 2028 | 0.18 | 9 |

The headline numbers:

  • 0.24 Wh: the only direct production measurement (Google Gemini)
  • x133 – x1,400: current commercial video vs. text
  • x46: variation across image models
  • x514: extreme peak in a reasoning benchmark (Phi-4)

Generating 10 seconds of video with Veo 3.1 can consume as much energy as a microwave running for 1-2 hours.

That sentence is not rhetorical exaggeration. It’s a measured data point. And it’s just the tip of the iceberg of a reality that AI companies prefer not to quantify publicly.

At AISHA we have compiled, cross-referenced, and verified all available measurements as of April 2026 — academic papers, production data, independent benchmarks — to build the most comprehensive guide in English to the real energy consumption of artificial intelligence.

This is what we know.


It all starts with one number: 0.3 Wh

To talk in comparable numbers, we need a starting point. The reference unit is the standard text query: approximately 0.3 Wh (watt-hours).

How much is that? The energy a 10-watt LED bulb consumes in less than two minutes. It seems insignificant. But when multiplied by the billions of daily queries worldwide, the aggregate impact is anything but trivial.

Google is the only provider that has published a direct production measurement: 0.24 Wh as the median for text queries to Gemini (August 2025, real infrastructure measurement, not an estimate). Sam Altman stated that ChatGPT consumes 0.34 Wh on average, but without publishing any methodology. Anthropic has published absolutely nothing.

With that reference of 0.3 Wh as the baseline (x1), we can compare everything else.
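With the baseline fixed, the conversions used throughout this guide reduce to a couple of lines of arithmetic. This is a sketch (the function names are ours), using the 10-watt LED bulb comparison from above:

```python
# Baseline arithmetic used throughout this guide. BASELINE_WH is the
# 0.3 Wh reference text query; the LED comparison assumes a 10 W bulb.
BASELINE_WH = 0.3

def multiplier(wh: float) -> float:
    """Express a query's energy as a multiple of the 0.3 Wh baseline."""
    return wh / BASELINE_WH

def led_minutes(wh: float, bulb_watts: float = 10.0) -> float:
    """Minutes a bulb of `bulb_watts` could run on the same energy."""
    return wh / bulb_watts * 60

print(round(multiplier(0.24), 2))  # Gemini's measured median: x0.8
print(round(led_minutes(0.3), 1))  # baseline query: ~1.8 LED-minutes
```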


Text: the cheapest modality (and the most unequal)

Not all text models consume the same amount. The difference between the lightest and the heaviest exceeds 40 times. This table shows it:

| Model | Consumption per query | Multiplier |
|---|---|---|
| Gemini 2.5 Flash-Lite | 0.10 – 0.15 Wh | x0.3 – x0.5 |
| Llama 4 Scout | 0.15 – 0.30 Wh | x0.5 – x1 |
| DeepSeek V4 | 0.15 – 0.35 Wh | x0.5 – x1.2 |
| GPT-5-mini | 0.20 – 0.40 Wh | x0.7 – x1.3 |
| Mistral Large | 0.25 – 0.50 Wh | x0.8 – x1.7 |
| Claude Sonnet 4.6 | 0.40 – 0.90 Wh | x1.3 – x3 |
| GPT-5.4 | 0.50 – 1.20 Wh | x1.7 – x4 |
| Gemini 2.5 Ultra | 0.35 – 0.70 Wh | x1.2 – x2.3 |
| Claude Opus 4.6 | ~4 Wh (estimated) | ~x13 |

“Flash” or “mini” models are between 3 and 10 times more efficient than full frontier models. For the vast majority of everyday tasks — summarizing a text, drafting an email, answering a factual question — the small model is sufficient.

Model choice is not neutral. Choosing poorly can multiply your consumption by 26 times for the same task.


Reasoning: when thinking can cost up to 130 times more

The revolution of “thinking models” — models that reason internally before responding — has radically changed the energy equation. They generate chains of thought tens of thousands of tokens long before giving an answer, and that internal process consumes energy.

The following table collects the available measurements for the main reasoning modes:

| Mode | Consumption | Multiplier vs. base text |
|---|---|---|
| GPT-5.4 with reasoning | 4 – 18 Wh | x13 – x60 |
| Claude with Extended Thinking | 2 – 8 Wh | x7 – x27 |
| o3 (long prompts) | ~39 Wh | ~x130 |
| Deep Research (any provider) | 10 – 40 Wh | x33 – x133 |

In the worst case, a single reasoning query consumes the same as 130 normal text queries.

The Hugging Face AI Energy Score v2 (December 2025), which measured 205 open-source models on H100 GPUs, found even more extreme results:

  • Phi-4-reasoning-plus: multiplier of x514 when activating reasoning (from 0.018 Wh to 9.46 Wh)
  • DeepSeek-R1-Distill-Llama-70B: multiplier of x154 (from 0.050 Wh to 7.63 Wh)
  • SmolLM3-3B: 13 Wh for a single question with reasoning activated

Activating reasoning mode when it’s not necessary is like using a 40-ton truck to go buy bread.


Images: every AI photo is like charging your phone

The research by Bertazzini et al. (June 2025) measured 17 diffusion models on an RTX 4090 and found a 46-fold variation between the most efficient and the least efficient.

These are the extremes of the spectrum:

| Model | Consumption per image | Equivalence |
|---|---|---|
| LCM_SSD_1B (most efficient) | 0.086 Wh | ~0.3 text queries |
| Ideogram 3 | 0.8 – 2.5 Wh | 3 – 8 queries |
| Midjourney v7 | 1 – 4 Wh | 3 – 13 queries |
| DALL-E 4 | 2 – 6 Wh | 7 – 20 queries |
| Native GPT-4o image | ~3 Wh | ~10 queries |
| Lumina (least efficient) | 4.08 Wh | ~14 queries |

The difference between the cheapest and the most expensive model is the difference between turning on a flashlight and turning on an oven.

A counterintuitive finding: int8 quantization, which is supposed to reduce consumption, actually increases it by up to 64.5% in some image models. Efficiency is not always what it seems.

700 million images in one week. That’s what users generated when OpenAI launched native image generation in GPT-4o. That’s equivalent to approximately 2,100 MWh in image generation alone, in seven days.
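The launch-week figure is simple to check, taking the ~3 Wh per native GPT-4o image from the table above (an estimate, not a direct measurement):

```python
# Back-of-the-envelope check of the GPT-4o image launch week, using
# the article's ~3 Wh per native GPT-4o image estimate.
images = 700_000_000       # images generated in the first week
wh_per_image = 3           # ~3 Wh per image (estimate, not measured)
total_mwh = images * wh_per_image / 1_000_000  # Wh -> MWh
print(total_mwh)  # 2100.0 MWh in seven days
```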


Video: the great energy devourer

If text is the bicycle, video is the airplane. The research by Delavande and Luccioni (September 2025) measured 7 open-source video models on H100 and documented an 800-fold range between the cheapest and the most expensive.

These numbers speak for themselves:

| Model | Duration | Consumption | Multiplier vs. text |
|---|---|---|---|
| AnimateDiff (most efficient) | 2 sec | 0.14 Wh | x0.5 |
| Runway Gen-3 | 5 sec | 3 – 8 Wh | x10 – x27 |
| WAN2.1-14B | 5 sec | ~109 Wh | ~x363 |
| Kling 3.0 | 15 sec | ~400 Wh | ~x1,333 |
| Sora 2 | 10 sec | ~1,000 Wh | ~x3,333 |

944 Wh per 5-second clip. That’s what Sora consumed — as much energy as charging a smartphone for a month. OpenAI shut it down on March 24, 2026 after accumulating total revenue of $2.1 million against estimated operating costs of $15 million per day.

A technical detail that aggravates the problem: doubling the video duration roughly quadruples the energy consumption. The relationship is not linear but quadratic.
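That scaling (double the duration, roughly four times the energy) can be sketched as a rough extrapolation. The 109 Wh at 5 seconds reference point is the WAN2.1-14B figure from the table above; the function itself is illustrative:

```python
# Rough extrapolation assuming energy grows with the square of clip
# duration, as described above. Reference point: WAN2.1-14B's ~109 Wh
# for a 5-second clip (from the article's table).
def video_energy_wh(duration_s: float, ref_wh: float = 109.0,
                    ref_s: float = 5.0) -> float:
    """Quadratic rule of thumb: E(d) = E_ref * (d / d_ref)**2."""
    return ref_wh * (duration_s / ref_s) ** 2

print(video_energy_wh(5))          # reference point: 109.0 Wh
print(round(video_energy_wh(10)))  # double the length, ~4x: 436 Wh
```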


Audio: the modality nobody measures

Passoni et al. (May 2025) published the only paper with measurements of audio generation (text-to-audio), with 7 models on NVIDIA A40 GPUs:

  • AudioLDM (most efficient): ~0.25 Wh per 10-second clip
  • Tango2 (least efficient): ~2.0 Wh per 10-second clip

The concerning finding: newer models consistently consume more energy than older ones. The industry prioritizes quality over efficiency, without exception.

One single paper. Seven models. Zero data from commercial services. That is all the transparency that exists today in generative audio.


Code agents: 136 queries in a single session

Code agents represent a new consumption paradigm. Simon P. Couch analyzed Claude Code sessions (January 2026) and found that a median session processes 592,000 tokens and consumes approximately 41 Wh — the equivalent of 136 conventional text queries.

Complex sessions can reach 50 to 200 Wh, on the order of 170 to 670 conventional text queries.

A developer with a code agent running for eight hours consumes the same as their refrigerator in 24 hours.


The paradox that explains everything

This is perhaps the most important data point in the entire guide: efficiency per query improves constantly, but total consumption never stops growing.

Google demonstrated a 33-fold efficiency improvement in 12 months (May 2024 to May 2025). And yet, its total carbon emissions increased by 48-50% in the same period. Its actual electricity consumption grew by 27%, even though its accounting based on renewable energy certificates (market-based) declared a “12% reduction.”

This is the Jevons Paradox applied to AI: when a resource is used more efficiently, its cost drops, it becomes more accessible, the volume of use skyrockets, and total consumption increases.

The data confirms it:

  • Efficiency per token: improves 15-30% annually
  • Volume of daily queries: grows from 0.4-1.0 billion (2024) to 2.5-5.0 billion (2026)
  • Net result: total consumption rises 25% annually
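The arithmetic behind the paradox can be checked directly from the chart's own series (these are the article's estimates, not measurements):

```python
# Jevons arithmetic with the chart's series: per-query efficiency
# improves every year, yet multiplying by query volume shows total
# daily consumption still rising.
wh_per_query = {2024: 0.45, 2025: 0.30, 2026: 0.26}
daily_queries_bn = {2024: 0.7, 2025: 1.5, 2026: 3.5}

# Total daily consumption in MWh: Wh/query x queries, then Wh -> MWh
totals_mwh = {
    year: wh_per_query[year] * daily_queries_bn[year] * 1e9 / 1e6
    for year in wh_per_query
}
for year, mwh in totals_mwh.items():
    print(year, round(mwh), "MWh/day")  # rises despite efficiency gains
```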

Efficiency is necessary but insufficient. Without demand governance — choosing the right model, avoiding unnecessary use, measuring the impact — technological improvement only accelerates the problem.


The black holes: what we DON’T know

Everything above is based on the measurements that exist. But there are entire categories for which we have no data at all:

  • Deep Research from any provider (estimates range between 10 and 40 Wh — a x4 range)
  • Commercial image generation (DALL-E, Midjourney, Ideogram are excluded from academic benchmarks)
  • Sora and proprietary video models (estimates varied x27: from 35 to 936 Wh)
  • Music generation (Suno, Udio: literally zero published data)
  • Proprietary inference (GPT-5, Claude in production, Grok: no independent measurements)

The barrier is not technical. NVIDIA DCGM, the GPU monitoring system, is already deployed in every data center in the world. APIs already report costs in dollars per call. Adding an energy_wh field would be trivial.

Companies choose not to do it. The barrier is political, not technical.
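To make the size of the ask concrete, here is a hypothetical per-call usage report. The `energy_wh` key and the surrounding layout are illustrative, not any provider's real API schema:

```python
# Hypothetical per-call usage report with the energy field the
# article argues for. The layout and `energy_wh` key are illustrative,
# not any provider's real API schema.
import json

response = {
    "id": "chatcmpl-123",
    "usage": {
        "prompt_tokens": 420,
        "completion_tokens": 310,
        "cost_usd": 0.0041,   # already reported today
        "energy_wh": 0.27,    # the one field that is missing
    },
}
print(json.dumps(response["usage"], indent=2))
```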


What can I do?

  • If you’re a user: Use our AI footprint calculator to estimate your consumption. As a rule of thumb: text < image < audio < code < reasoning < video. The smallest model that solves your task is always the best choice.

  • If you’re a company: AI consumption is already part of your carbon footprint under CSRD. Demand consumption data per service from your providers. If Google can publish 0.24 Wh, so can everyone else.

  • If you’re a developer: Flash/mini by default. Reasoning only when the problem requires it. Cache results. Every architecture decision has an energy cost that gets multiplied by millions of users.

  • If you’re a regulator: Measurement is possible today, with technology that already exists in every datacenter. Appliance energy labels reduced consumption by 60% over 30 years. AI needs its own label.
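The developer advice above can be condensed into a few lines. This is a sketch: `ask` is a stub rather than a real API client, and the model name is a placeholder:

```python
# Sketch of "smallest model by default, cache results". The model
# name and the ask() stub are placeholders, not a real client.
from functools import lru_cache

@lru_cache(maxsize=4096)
def ask(prompt: str, model: str = "flash-mini") -> str:
    # A real implementation would call your provider here; identical
    # prompts are then served from the cache at zero marginal energy.
    return f"[{model}] answer to: {prompt}"

ask("What is 0.3 Wh in joules?")  # first call reaches the model
ask("What is 0.3 Wh in joules?")  # repeat is served from the cache
print(ask.cache_info().hits)      # 1 cache hit
```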


Next step

Calculate the approximate impact of your AI usage.

Our calculator helps you put queries, images, reasoning and agents into context.

Open calculator