Intelligence Yield

How much useful work does each minute of compute produce?

IY = (p50 time horizon × avg score) / working time

Everyone focuses on the METR Time Horizons headline number — how hard a task can a model solve? But the benchmark data also captures how much work the model needed to get there. Opus 4.6 solves harder tasks, more reliably, with considerably less compute.

Intelligence Yield by Model

Frontier models — higher is better

Intelligence Yield (IY)

Intelligence Yield Over Time

Release date vs. IY — each dot is a model

Scale:

Intelligence Yield (IY)

Raw Data