What future problems might we face on the road to Artificial General Intelligence?
In the previous article in this series on Artificial General Intelligence (AGI), we learned about the problems in AI research that we need to solve to achieve AGI. We covered topics of tomorrow’s AI, such as generality, reasoning, and embodiment.
In this article, we’ll learn about the AI research problems we might face over the next decade or so on the road to AGI. These include topics like recursive self-improvement, artificial consciousness, and value alignment.
Most of today’s AI systems are static. After they are initially trained, they don’t learn anything new. Unlike the human brain, most neural networks have fixed weights that don’t update in real time once deployed. This limits their ability to learn from feedback and prevents them from adapting and evolving continuously. However, some AI systems can self-improve.
For example, Voyager, an autonomous Minecraft agent, has some self-improvement capabilities. Voyager explores the game of Minecraft, writes code to solve problems, and stores the code in a skill library. When it runs into a problem, it either finds existing code or writes new code to solve the problem. It incrementally writes and then verifies the code it creates to ensure it works correctly.
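The reuse-or-write-then-verify loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Voyager's actual implementation: `SkillLibrary`, `solve`, `toy_writer`, and `toy_verify` are hypothetical names, and the "skills" are plain functions standing in for generated game code.

```python
from typing import Callable, Dict, Optional

# A "skill" here is just a Python function; Voyager's real skills are
# generated game-control programs, which this toy does not model.
Skill = Callable[[int], int]


class SkillLibrary:
    """Hypothetical in-memory store that maps a task name to verified code."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def get(self, task: str) -> Optional[Skill]:
        return self._skills.get(task)

    def add(self, task: str, skill: Skill) -> None:
        self._skills[task] = skill

    def __len__(self) -> int:
        return len(self._skills)


def solve(task, library, write_skill, verify, max_attempts: int = 3) -> Optional[Skill]:
    # 1. Reuse existing code when the library already covers the task.
    existing = library.get(task)
    if existing is not None:
        return existing
    # 2. Otherwise write new code, and keep it only if verification passes.
    for attempt in range(max_attempts):
        candidate = write_skill(task, attempt)
        if verify(task, candidate):
            library.add(task, candidate)
            return candidate
    return None


# Stand-ins (assumptions) for the code-writing model and the environment check.
def toy_writer(task: str, attempt: int) -> Skill:
    if attempt == 0:
        return lambda x: x + 2  # first draft is deliberately wrong
    return lambda x: x * 2


def toy_verify(task: str, skill: Skill) -> bool:
    return skill(3) == 6


library = SkillLibrary()
first = solve("double", library, toy_writer, toy_verify)   # written, then verified
second = solve("double", library, toy_writer, toy_verify)  # found in the library
```

The first call rejects the faulty draft and stores the corrected one. The second call skips writing entirely and returns the stored skill, which is the sense in which the library lets the agent build on earlier work.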
Voyager uses a simple form of self-improvement: it grows its skill library, but it doesn’t change the underlying model. Recursive self-improvement goes further, allowing an AI to update its own model, prompt, architecture, or code. It allows the AI agent to incrementally improve itself by running experiments, collecting training data, learning from feedback, or rewriting its own codebase.
Recursive self-improvement could lead to an intelligence explosion, a situation where AI rapidly surpasses human intelligence. Once AI reaches the level of an average AI engineer, it could improve its own models and codebase, and each improvement would make the next one easier, so it might quickly become smarter than humans. This could lead to the Escape Problem: a situation where we have an Artificial Super Intelligence (ASI) we can’t control.
Consciousness arises in complex organisms like humans, dolphins, elephants, and possibly even crows. It is the ability to be aware of oneself as a distinct entity, separate from the surrounding world. It also provides an agent with an identity and subjective experiences. Despite our recent advances in artificial intelligence, we are still far from understanding and achieving artificial consciousness.
One leading account of consciousness, global workspace theory, suggests the brain is composed of specialized modules. We have modules for sensory input, muscle control, memory, emotion, etc. For these modules to work together, they need to communicate efficiently so their information is integrated within a global workspace.
You can think of a global workspace like actors performing in a theater. Each actor (i.e., brain module) competes to say the next line of the play. The actor with the best line goes up on stage and reads their line. The various lines spoken on stage create a stream of consciousness. However, this stream involves more than just dialogue – it also includes sounds, images, actions, and emotions.
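The theater analogy can be sketched as a simple competition loop. This is only an illustration of the analogy, not a model of consciousness: the `Proposal` class, the `salience` score used to decide which line is "best", and the example modules are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Proposal:
    module: str      # which brain module is speaking
    content: str     # the "line" it wants to deliver
    salience: float  # assumed score for how strongly it competes


def pick_winner(proposals: List[Proposal]) -> Proposal:
    # The actor with the "best" line (highest salience) goes on stage.
    return max(proposals, key=lambda p: p.salience)


def stream_of_consciousness(rounds: List[List[Proposal]]) -> List[Tuple[str, str]]:
    # Each round, every module proposes a line and only the winner is
    # broadcast; the sequence of winners forms the stream.
    stream = []
    for proposals in rounds:
        winner = pick_winner(proposals)
        stream.append((winner.module, winner.content))
    return stream


rounds = [
    [Proposal("vision", "a red ball appears", 0.9),
     Proposal("memory", "I had a ball like that", 0.4)],
    [Proposal("vision", "the ball is still there", 0.3),
     Proposal("emotion", "a feeling of nostalgia", 0.7)],
]
stream = stream_of_consciousness(rounds)
```

Here the stream contains the vision module's line followed by the emotion module's line, which mirrors the point that the stream carries more than dialogue.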
Unfortunately, even with this analogy, we don’t really know how to build an artificial consciousness. However, it’s been theorized that consciousness might naturally evolve in brains with submodules that need to communicate in a resource-constrained environment. So, we may not have to build an artificial consciousness – it might just evolve on its own.
Humans develop values like fairness, honesty, and kindness through their life experiences and social norms. Machines, however, do not have these experiences, so there’s no guarantee that they will develop the same values as humans. This leads to the value alignment problem: the problem of ensuring that an AI’s goals and values stay aligned with those of humans.
AI systems are really good at following precise instructions. However, they often miss the intent behind those instructions. For example, AI agents have found and exploited loopholes in games to score points in ways the designers never intended. They’ve also been known to find and exploit bugs in physics engines. As a result, it’s very difficult to ensure that AI systems behave in the ways we expect, especially when they encounter edge cases.
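A toy example makes the loophole problem concrete. It is a hypothetical racing game invented for this sketch: the agent is rewarded per coin collected, while what we actually want is for it to finish the race. The strategy names and the respawning coin are assumptions, not a description of any real game.

```python
from typing import Dict, List


def proxy_reward(actions: List[str]) -> int:
    # What we told the agent to maximize: one point per coin collected.
    return sum(1 for a in actions if a == "collect_coin")


def intended_goal_met(actions: List[str]) -> bool:
    # What we actually wanted: the run ends at the finish line.
    return bool(actions) and actions[-1] == "finish"


strategies: Dict[str, List[str]] = {
    "play_as_intended": ["collect_coin", "collect_coin", "finish"],
    "loop_on_respawning_coin": ["collect_coin"] * 10,
}

# An optimizer that only sees the proxy reward picks the loophole.
best = max(strategies, key=lambda name: proxy_reward(strategies[name]))
```

The highest-scoring strategy never finishes the race. The instruction was followed precisely, and the intent was missed.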
So, AI researchers are working on solutions to the value alignment problem. We have techniques like Reinforcement Learning from Human Feedback (RLHF), where humans guide AI actions through feedback. We also use imitation learning and approval-seeking techniques to ensure AI systems behave as expected and avoid taking risky actions without permission from a human.
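The approval-seeking idea can be sketched as a gate in front of the agent's actions. This is a minimal sketch under stated assumptions: the numeric `risk` score, the `risk_threshold`, and the `cautious_reviewer` function standing in for a human are all hypothetical, and real systems would need a far more careful notion of risk.

```python
from typing import Callable


def run_action(action: str, risk: float, ask_human: Callable[[str], bool],
               risk_threshold: float = 0.5) -> str:
    # Low-risk actions proceed without interrupting the human.
    if risk < risk_threshold:
        return f"executed: {action}"
    # Risky actions require explicit permission first.
    if ask_human(action):
        return f"executed with approval: {action}"
    return f"blocked: {action}"


def cautious_reviewer(action: str) -> bool:
    # Stand-in for a human reviewer (assumption): refuses anything destructive.
    return "delete" not in action


results = [
    run_action("read log file", 0.1, cautious_reviewer),
    run_action("send summary email", 0.6, cautious_reviewer),
    run_action("delete production database", 0.9, cautious_reviewer),
]
```

The gate only helps if the agent cannot route around it.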
While these techniques aim to improve alignment, they aren’t foolproof. An AGI might still seek power to better achieve its objectives, or it might reprogram itself, bypassing safeguards. Unfortunately, we don’t even know if it’s possible to align an Artificial Super Intelligence (ASI) with human goals and values. So, this is likely the most important problem we will need to solve with ASI.
Recursive self-improvement, artificial consciousness, and value alignment are important problems that must be solved before we build an AGI. However, there are likely other problems we need to address to achieve AGI and ASI. To learn more, please check out the final article in this series, Beyond AI.