Summary

An analysis estimating that a typical GPT-4o query consumes roughly 0.3 watt-hours, about one-tenth of the earlier estimate of 3 watt-hours.

Key quotes

We find that typical ChatGPT queries using GPT-4o likely consume roughly 0.3 watt-hours, which is ten times less than the older estimate.

The article details the methodology used to estimate the energy cost of LLM inference, accounting for parameter counts, hardware efficiency (H100 GPUs), and token lengths.
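
The shape of such an estimate can be illustrated with a minimal back-of-envelope sketch. It is not the article's actual calculation: the function names are hypothetical, and every input value (active parameter count, output length, utilization, per-GPU power draw) is an assumption chosen for illustration. The sketch uses the common approximation of about 2 FLOPs per active parameter per generated token and ignores the cost of processing the input prompt.

```python
# Back-of-envelope energy estimate for one LLM query.
# All input values below are illustrative assumptions, not measured figures.

def flops_per_query(active_params: float, output_tokens: int) -> float:
    """Approximate compute: ~2 FLOPs per active parameter per generated token."""
    return 2.0 * active_params * output_tokens


def energy_wh(active_params: float, output_tokens: int, peak_flops: float,
              utilization: float, power_watts: float) -> float:
    """Estimate the energy of one query in watt-hours."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    flops = flops_per_query(active_params, output_tokens)
    seconds = flops / (peak_flops * utilization)  # effective GPU time
    joules = seconds * power_watts                # energy = power * time
    return joules / 3600.0                        # 1 Wh = 3600 J


# Assumed inputs (hypothetical):
ACTIVE_PARAMS = 100e9      # assumed active parameters per token
OUTPUT_TOKENS = 500        # assumed response length
H100_PEAK_FLOPS = 989e12   # approximate dense 16-bit peak throughput of an H100
UTILIZATION = 0.10         # assumed fraction of peak achieved during inference
POWER_WATTS = 1500 * 0.7   # assumed per-GPU server power at partial load

estimate = energy_wh(ACTIVE_PARAMS, OUTPUT_TOKENS, H100_PEAK_FLOPS,
                     UTILIZATION, POWER_WATTS)
print(f"Estimated energy per query: {estimate:.2f} Wh")
```

Because the estimate is linear in token count and power draw and inversely proportional to utilization, changing any one assumption shifts the result proportionally, so the output is best read as an order-of-magnitude figure.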