A few years before ChatGPT started chattering away, Google developed an entirely different kind of artificial intelligence program called AlphaGo, which learned to play the board game Go with superhuman skill through relentless practice.
The company’s researchers have now published work combining the power of large language models (the AI behind today’s chatbots) with the abilities of AlphaZero, the chess-playing successor to AlphaGo, to produce proofs for extremely difficult mathematical problems.
Their new Frankenstein-like creation, “AlphaProof,” has demonstrated its capabilities by tackling several problems in the 2024 International Mathematical Olympiad (IMO), a prestigious competition for high school students.
AlphaProof uses the Gemini large language model to translate mathematical problems stated in natural language into a programming language called Lean. That translation provides the training material for a second algorithm, which learns through trial and error how to find proofs that can be verified as correct.
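To give a sense of what that translation step produces, here is a minimal, illustrative Lean snippet. It is a trivial statement, not an Olympiad problem: the point is that once a claim is written this way, any proof the system finds can be checked mechanically by the Lean kernel, which is what makes trial-and-error learning possible.

```lean
-- Illustration only (a trivial fact, not an IMO problem): a statement
-- formalized in Lean, followed by a proof term the kernel can verify.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```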
Earlier this year, Google DeepMind introduced another math-focused algorithm, called AlphaGeometry, that likewise pairs a language model with a different AI approach. AlphaGeometry uses Gemini to convert geometry problems into a form that programs built to manipulate geometric elements can work on and test. Google today also announced a new, improved version of AlphaGeometry.
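As a rough illustration of what “a form that programs can manipulate and test” might look like (this is not DeepMind’s code, and the predicates and rule are invented for the example), a symbolic solver can store facts about a figure as structured statements and apply deduction rules until new facts stop appearing:

```python
# Toy sketch of the symbolic half of a neurosymbolic geometry solver.
# Facts about a figure are stored as predicates, e.g. ("parallel", "AB", "CD"),
# and a single deduction rule is applied by forward chaining.

facts = {
    ("parallel", "AB", "CD"),
    ("parallel", "CD", "EF"),
}

def deduce_parallel_transitivity(known):
    """If AB is parallel to CD and CD is parallel to EF, conclude AB is parallel to EF."""
    new = set()
    for (p1, a, b) in known:
        for (p2, c, d) in known:
            if p1 == p2 == "parallel" and b == c and a != d:
                new.add(("parallel", a, d))
    return new - known

# Keep applying the rule until no new facts are derived.
while True:
    derived = deduce_parallel_transitivity(facts)
    if not derived:
        break
    facts |= derived

print(("parallel", "AB", "EF") in facts)  # True
```

The real systems are far more sophisticated, but the division of labor is the same idea: the language model translates the problem into symbols, and a deterministic engine manipulates and checks them.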
The researchers found that, together, the two math programs produced proofs of the IMO problems at the level of a silver medalist. They solved two algebra problems and one number theory problem out of six in total. One problem was solved within minutes, while another took days. Google DeepMind did not disclose how much computing power it devoted to solving them.
Google DeepMind calls the approach used for both AlphaProof and AlphaGeometry “neurosymbolic” because it combines the pure machine learning of artificial neural networks, the technology underpinning recent advances in AI, with the language of traditional symbolic programming.
“What we’ve seen here is that you can combine the approach that was so successful in something like AlphaGo with large language models and produce something very capable,” says David Silver, a researcher at Google DeepMind who led the AlphaZero work. Silver says the techniques demonstrated by AlphaProof should, in principle, extend to other areas of mathematics.
In fact, this research suggests that applying logic and reasoning in a more grounded way may address the worst tendencies of large language models: they can be phenomenally fluent, yet they struggle with even basic mathematics, let alone reasoning logically about problems.
In the future, neurosymbolic methods could become a way of translating questions or tasks into a form that AI systems can reason about to produce trustworthy results. OpenAI is also rumored to be working on such a system, codenamed “Strawberry.”
But as Silver acknowledges, the system unveiled today has important limitations. Mathematical solutions are either right or wrong, which gives AlphaProof and AlphaGeometry a clear target to work toward. Many real-world problems, such as figuring out an ideal travel itinerary, have many possible solutions, and which one is best can be unclear. For more ambiguous questions, Silver says, one approach might be to have the language model determine what constitutes a “right” answer during training. “There are a variety of different approaches you can try,” he says.
Silver is also careful to say that Google DeepMind’s systems won’t take over the jobs of human mathematicians. “We aim to provide a system that can prove anything, but that’s not the only job of a mathematician,” Silver said. “A big part of math is posing problems and figuring out what are interesting questions to ask. Think of this as a tool, similar to a slide rule, a calculator, or other computational aids.”