The AlphaGo moment in the mathematical world: Google's DeepMind AI solves an IMO geometry problem in 19 seconds, just 1 point away from winning the gold medal

Google's artificial intelligence system achieved a performance equivalent to a silver medal in the International Mathematical Olympiad, demonstrating its breakthrough capabilities in solving mathematical problems. The system, combining AlphaProof and AlphaGeometry 2, answered international mathematical Olympiad questions at a high level. This is an important milestone that showcases the potential of artificial intelligence in the field of mathematics

Advanced mathematical reasoning is a key ability for modern artificial intelligence. Today, Google announced an important milestone in a long-term major challenge: Google's hybrid AI system achieved a performance equivalent to a silver medal in this year's International Mathematical Olympiad (IMO).

Specifically, Google showcased the first artificial intelligence system to solve an IMO problem at a silver medal level. It combines AlphaProof (a new breakthrough reasoning model) and AlphaGeometry 2 (an improved version of the previous system).

The International Mathematical Olympiad is the most prestigious and largest competition for young mathematicians globally, held annually since 1959. Participants are required to solve six extremely difficult problems involving algebra, combinatorics, geometry, and number theory. Many Fields Medalists have represented their countries in the IMO during their youth. In recent years, IMO has also become an important challenge in the field of machine learning, seen as a benchmark for measuring the advanced mathematical reasoning capabilities of AI systems.

In this year's competition, DeepMind's AI system scored 28 points (out of 42), equivalent to the level of a silver medalist. This score was only 1 point away from the gold standard, and out of the 609 participants this year, only 58 received a gold medal.

AlphaProof: Breakthrough in Formal Mathematical Reasoning

The AlphaProof system adopts a reinforcement learning approach, combining pre-trained language models with the AlphaZero algorithm. The advantage of this method is the ability to formally verify the correctness of proofs involving mathematical reasoning. To overcome the lack of formal language training data, the research team fine-tuned the Gemini model to create a library of formal problems of various difficulties.

AlphaProof is a self-learning system designed specifically to prove mathematical statements in the formal mathematical language Lean. Its core innovation lies in the combination of pre-trained language models and the AlphaZero reinforcement learning algorithm.

The workflow is as follows:

Problem Transformation: Firstly, using the fine-tuned Gemini model, natural language mathematical problems are automatically converted into Lean's formal language. This step creates a large library of formal problems covering different difficulty levels
Solution Generation: When faced with a new problem, AlphaProof will generate possible solutions.
Proof Search: The system searches for possible proof steps in Lean, attempting to prove or refute these solutions.
Reinforcement Learning: For each proof found and verified, it is used to reinforce AlphaProof's language model, enhancing the system's ability to tackle more challenging problems in the future.
Continuous Training: Leading up to the IMO competition, AlphaProof proved or refuted millions of questions within a few weeks, covering various difficulties and mathematical topics. During the competition, it continued this training cycle, strengthening its abilities by proving variants of the competition questions it generated until finding complete solutions.

AlphaGeometry 2

AlphaGeometry 2 is an improved version of AlphaGeometry, with its language model based on Gemini and trained on a synthetic data set one order of magnitude larger than its predecessor.

AlphaGeometry 2 is a neural-symbolic hybrid system. Key improvements include:

Enhanced Language Model: Based on Gemini, trained from scratch using a synthetic data set one order of magnitude larger than its predecessor. This significantly enhances the model's ability to handle complex geometric problems, including object motion, angle equations, proportions, or distance problems.
Faster Symbol Engine: The new version's symbol processing engine speed has increased by two orders of magnitude, greatly accelerating the problem-solving speed.
Knowledge Sharing Mechanism: Introduces a new knowledge sharing mechanism that can intelligently combine different search trees to solve more complex problems.
Performance Improvement: Prior to this year's IMO competition, AlphaGeometry 2 could solve 83% of the past 25 years' IMO geometry problems, far surpassing the predecessor's 53% solve rate.
Real-time Performance: In this year's IMO, AlphaGeometry 2 solved the formalized 4th question in just 19 seconds after receiving it.
DeepMind's research team is also exploring systems based on natural language reasoning, which do not require converting problems into formal languages and may be used in conjunction with other AI systems. This approach has shown significant potential in this year's IMO problems as well.

Article Author: AI Hanwuji, Source: AI Hanwuji, Original Title: "Mathematics' AlphaGo Moment: Google DeepMind AI Solves IMO Geometry Question in 19 Seconds, Just 1 Point Away from Winning Gold Medal"