Written By: Ruby Balcombe
Edited by: Lucy Ahern and Megan Thomas
“I have always been convinced that the only way to get artificial intelligence to work is to do the computation in a way similar to the human brain.” - Geoffrey Hinton, Nobel Prize Winner for his work on neural networks and deep learning, the ‘Godfather of AI’.
Two of the greatest pursuits of our time are the effort to understand the intricacies of the human brain and the quest for artificial general intelligence (AGI). Often defined as a machine capable of performing as well as a human across virtually all cognitive domains, AGI is more than just a technical goal: it is a pursuit deeply informed by our own biological architecture [1]. From their inception, neuroscience and artificial intelligence have been intertwined, with many early pioneers contributing to both fields [2].
DeepMind researchers describe this relationship as a “virtuous circle” [3], in which neuroscience provides inspiration for novel algorithms and AI offers new tools and frameworks that help us understand the brain.

The human brain is the only existing proof that generalisable intelligence is possible. Therefore, if a particular biological mechanism in us is found to be vital for cognitive function (such as in learning or decision making), the mechanism is often considered an excellent candidate for incorporation into artificial systems to help in achieving AGI.
A landmark example of this comes from the Nobel Prize-winning work of David Hubel and Torsten Wiesel, who recorded the activity of individual neurons in the mammalian visual cortex. They discovered a clear hierarchy: in the early stages of visual processing, simple cells act like specialised filters, each responding only to an edge or line at a particular angle. As information moves deeper into the brain, complex cells combine these signals to recognise shapes and motion [4]. This layered organisation, in which simple features are progressively combined to form complex representations, directly inspired the development of convolutional neural networks (CNNs), which now underpin much of modern computer vision.
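To make the idea of a specialised filter concrete, here is a minimal sketch in plain Python. It is a toy illustration only, not a model of real neurons or of any particular CNN: the kernels, the tiny image and all function names are assumptions chosen for clarity. An oriented filter plays the role of a simple cell, and a pooling step that ignores exact position stands in, very loosely, for a complex cell.

```python
# Toy illustration: an oriented "simple cell" filter followed by pooling.
# All names, kernels and the image are hypothetical, chosen for clarity.

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Keep positive responses only, as a crude stand-in for firing."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map):
    """Report the strongest response anywhere, ignoring its position."""
    return max(max(row) for row in feature_map)

# Two filters tuned to different orientations.
VERTICAL_EDGE = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
HORIZONTAL_EDGE = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

# A 5x5 image that is dark on the left and bright on the right,
# so it contains a single vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

vertical_response = max_pool(relu(convolve2d(image, VERTICAL_EDGE)))
horizontal_response = max_pool(relu(convolve2d(image, HORIZONTAL_EDGE)))
print(vertical_response, horizontal_response)
```

On this image the vertical filter responds and the horizontal one stays silent, which is the selectivity described above. A real CNN learns its filters from data and stacks many such layers.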
While these biological blueprints gave rise to the first generation of vision algorithms, the relationship soon evolved into a two-way street, where artificial frameworks began to solve long-standing mysteries of the brain.
One important way machines learn is through reinforcement. Much like training a dog with treats, an AI agent is given a reward signal (typically a numerical score) when it performs a desired action. It doesn’t start with instructions; it learns by trial and error, repeating actions that lead to a high score and avoiding those that don’t.
The Credit Assignment Problem was a central challenge in training such systems. Suppose you are teaching an AI to play chess, a game that can run to hundreds of moves before it is won: how can the system be taught which of its actions led to the win and deserve credit?
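The trial-and-error loop can be sketched in a few lines of Python. This is a deliberately simple, hypothetical setup rather than any specific published system: the agent chooses between three actions with fixed payouts (the numbers in PAYOUTS are invented), usually picks the action it currently rates highest, and occasionally tries a random one.

```python
import random

# Hypothetical fixed reward for each of three actions (invented numbers).
PAYOUTS = [0.2, 0.8, 0.5]
EPSILON = 0.1  # assumed probability of trying a random action

rng = random.Random(0)
estimates = [0.0] * len(PAYOUTS)  # the agent's current guess for each action
counts = [0] * len(PAYOUTS)       # how often each action has been tried

for _ in range(1000):
    if rng.random() < EPSILON:
        action = rng.randrange(len(PAYOUTS))      # explore: trial and error
    else:
        action = estimates.index(max(estimates))  # exploit the best guess
    reward = PAYOUTS[action]
    counts[action] += 1
    # Move the estimate towards the reward just received (running average).
    estimates[action] += (reward - estimates[action]) / counts[action]

best_arm = estimates.index(max(estimates))
print(best_arm, estimates)
```

The agent is never told which action is best; it settles on the highest-paying one only because exploring occasionally reveals a better score.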
Richard Sutton, a computer scientist who studied psychology as an undergraduate, tackled this problem from a biological perspective. He suggested that instead of rewarding an AI system once it has won, you should reward it when it thinks it is winning, an approach inspired by his intuition of how biological learning might function. This split reinforcement learning into two components: a critic and an actor. The critic predicts the likelihood of winning at every moment during the game. The actor chooses what action to take and gets rewarded if the critic thinks that the actor’s move increased the likelihood of winning [5]. This approach is the foundation of Temporal Difference learning, a method where the system learns by comparing its predictions at different points in time.
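The critic's side of this idea can be illustrated with a minimal temporal difference sketch. It assumes a toy "game" of five positions played in a fixed order, with a reward of 1 only at the very end; the chain, the step size ALPHA and the discount GAMMA are illustrative assumptions, and the actor is left out to keep the example short.

```python
# Toy TD(0) critic on a five-step chain; reward arrives only at the end.
# The chain, ALPHA and GAMMA are illustrative assumptions.

N_STATES = 5
ALPHA = 0.1   # how far each prediction moves towards its target
GAMMA = 1.0   # no discounting of future reward

def run_episode(values):
    """Play one game from start to finish, updating predictions in place."""
    for state in range(N_STATES):
        is_last = state == N_STATES - 1
        reward = 1.0 if is_last else 0.0
        next_value = 0.0 if is_last else values[state + 1]
        # Temporal difference: compare the prediction made now with the
        # reward received plus the prediction one step later.
        td_error = reward + GAMMA * next_value - values[state]
        values[state] += ALPHA * td_error

values = [0.0] * N_STATES
run_episode(values)
after_one_episode = list(values)

for _ in range(999):
    run_episode(values)

print(after_one_episode)
print(values)
```

After a single game only the final position has gained any value. With repeated play the prediction of winning spreads backwards to earlier positions, so early moves can earn credit long before the game ends.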
Gerald Tesauro, a physicist working at IBM, was inspired by this work and used it to build TD-Gammon, a system that learned to play backgammon by playing hundreds of thousands of games against itself. By using the critic to update the value of each board position, TD-Gammon achieved a “truly staggering level of performance”. It proved that Sutton’s biological intuition wasn’t just a psychological theory, but a blueprint for the seeds of intelligence in a machine.
It turned out that Sutton had discovered a trick that evolution had already worked out over five hundred million years ago, manifested in the basal ganglia of the brain. One leading theory is that the basal ganglia is formed of two key circuits. One circuit learns to repeat behaviours that trigger dopamine release, and the other circuit learns to predict future rewards and triggers its own dopamine activation, much like the actor and the critic.
What began as an AI framework provided a precise, computational explanation for a fundamental neural mechanism.
A long-standing challenge in current AI systems is catastrophic forgetting, where learning a new task causes the network to forget how to perform a previous one. The brain, however, solves this with remarkable ease. According to the Complementary Learning Systems theory, it does so with two interacting systems: a slow-learning neocortex that gradually integrates knowledge in a structured way, and a fast-learning hippocampus that rapidly encodes the specifics of individual experiences.
More recently, this dual-system architecture inspired the experience replay mechanism used in DeepMind’s Deep Q-Network agent, which achieved human-level performance across dozens of Atari games in 2015 [6].
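The core of experience replay can be sketched as a small memory of past experiences that the learner samples from at random, so that old and new experiences are interleaved during training. The following is a minimal illustration under assumed names (ReplayBuffer, store, sample) and an arbitrary capacity; it is not DeepMind's implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    """A fixed-size memory of past experiences (hypothetical sketch)."""

    def __init__(self, capacity):
        # When the buffer is full, the oldest experience is dropped.
        self.memory = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        """Quickly record one experience, loosely like the hippocampus."""
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw a random mix of stored experiences to learn from."""
        return rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)

buffer = ReplayBuffer(capacity=100)
for step in range(250):
    # Placeholder experiences: the step number stands in for a game state.
    buffer.store(step, 0, 0.0, step + 1)

batch = buffer.sample(8, random.Random(0))
print(len(buffer), batch)
```

Training the slow-learning network on such shuffled batches, rather than only on the most recent experience, is what helps it avoid overwriting what it learned earlier.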
The next great frontier for AI is to achieve human-like generalisation and the ability to imagine. For these challenges, neuroscience remains our most valuable source of clues.
A current puzzle lies within the hippocampus. This structure must keep detailed memories of specific events separate to avoid interference, but it is also essential to connect those events for generalised knowledge. Theoretical work by Dharshan Kumaran and James McClelland proposes a solution through a mechanism they term “recurrent similarity computation”. This suggests that the brain has a set of recurrent feedback loops between the hippocampus and neocortex through which it can dynamically reactivate and compare distinct high-fidelity memories to discover relationships. This allows for powerful generalisation to emerge from a few specific memories, offering a blueprint to allow AI to achieve better generalisation from smaller datasets.
At the forefront of this effort is the Neuroscience Lab at Google DeepMind, co-led by Kevin Miller and Zeb Kurth-Nelson. Their goal is to form an understanding of how perception leads to behaviour. To explore this, they create software agents and set them the same tasks humans face, such as navigating space or predicting reward. If an agent succeeds at a task, the way it solves that task becomes a candidate for the cognitive mechanism humans use.
Historically, this field has been split between two imperfect methods.
The DeepMind lab instead uses Hybrid Recurrent Neural Networks. This involves taking a clear, understandable classic model and replacing any “mystery gaps” with neural networks.
The lab recently put this to the test with a massive dataset of over 800 participants in a complex reward-learning game. While the other models were too rigid to explain how humans weighed their options, the Hybrid approach successfully learned the hidden algorithm the human subjects were using. By placing constraints on the agent and building a structured framework, the lab not only created an “intelligent” machine, they also created a tool providing insight into the biological logic of the human mind.
We have entered an era of Co-Design, where breakthroughs in our understanding of the human brain provide new approaches to AGI, and AI architectures provide fresh hypotheses for how we think, reason, and learn. There has never been a more exhilarating time to be at the intersection of these two fields. As Hinton suspected decades ago, the path to true artificial intelligence has been hidden inside us all along.