Google DeepMind CEO Demis Hassabis has highlighted a critical inconsistency in today’s AI systems: they can solve complex mathematical problems yet stumble over simple high school questions.
Key Takeaways
- AI systems show “jagged intelligence” – excelling in complex tasks while failing simple ones
- Gemini models with Deep Think can win math olympiad gold yet still make simple high school math errors
- Major reasoning and planning capabilities still need to be “cracked” for true AI progress
Speaking on the Google for Developers podcast, Hassabis expressed concern that “it shouldn’t be that easy for the average person to just find a trivial flaw in the system.” He pointed to the paradox that Google’s Gemini models equipped with Deep Think can win gold medals at the International Mathematical Olympiad, yet the system “still makes simple mistakes in high school maths.”
What is ‘Jagged Intelligence’?
Hassabis characterized today’s AI as possessing “uneven” or “jagged” intelligence – remarkably strong in some areas yet surprisingly weak in others. This aligns with Google CEO Sundar Pichai’s recently introduced term “AJI” (artificial jagged intelligence), describing systems with inconsistent capabilities.
The DeepMind CEO emphasized that improving AI’s consistency requires more than just scaling up data and computing power. “Some missing capabilities in reasoning and planning and memory still need to be cracked,” he stated.
AGI Timeline and Challenges
Despite predicting in April that artificial general intelligence (AGI) could emerge “in the next five to 10 years,” Hassabis concedes that major challenges remain. His concerns echo those of OpenAI CEO Sam Altman, who acknowledged after GPT-5’s launch that the model lacks continuous learning – a capability Altman views as crucial for achieving true AGI.
These warnings reflect a growing acknowledgment among AI leaders that issues like hallucinations, misinformation, and simple mistakes must be resolved before machines can reach human-level reasoning. The moment recalls how social media platforms initially failed to foresee the large-scale impact of their technologies.