Automating the search for entirely new “curiosity” algorithms


Driven by an innate curiosity, children pick up new skills as they explore the world and learn from their experience. Computers, by contrast, often get stuck when thrown into new environments.

To get around this, engineers have tried encoding simple forms of curiosity into their algorithms with the hope that an agent pushed to explore will learn about its environment more effectively. An agent with a child’s curiosity might go from learning to pick up, manipulate, and throw objects to understanding the pull of gravity, a realization that could dramatically accelerate its ability to learn many other things. 

Engineers have discovered many ways of encoding curious exploration into machine learning algorithms. Drawing on a long history of enlisting computers in the search for new algorithms, a research team at MIT wondered if a computer could do better.

In recent years, the design of deep neural networks, algorithms that search for solutions by adjusting numeric parameters, has been automated with software like Google’s AutoML and auto-sklearn in Python. That’s made it easier for non-experts to develop AI applications. But while deep nets excel at specific tasks, they have trouble generalizing to new situations. By contrast, algorithms expressed as code in a high-level programming language can transfer knowledge across different tasks and environments.

“Algorithms designed by humans are very general,” says study co-author Ferran Alet, a graduate student in MIT’s Department of Electrical Engineering and Computer Science and Computer Science and Artificial Intelligence Laboratory (CSAIL). “We were inspired to use AI to find algorithms with curiosity strategies that can adapt to a range of environments.”

The researchers created a “meta-learning” algorithm that generated 52,000 exploration algorithms. They found that the top two were entirely new — seemingly too obvious or counterintuitive for a human to have proposed. Both algorithms generated exploration behavior that substantially improved learning in a range of simulated tasks, from navigating a two-dimensional grid based on images to making a robotic ant walk. Because the meta-learning process generates high-level computer code as output, both algorithms can be dissected to peer inside their decision-making processes.
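
To make the idea of searching over exploration algorithms concrete, here is a toy sketch in Python. It is not the researchers' method: the four candidate "programs," the ten-cell corridor, and the coverage score are all hypothetical stand-ins chosen for illustration. Each candidate turns a cell's visit count into a curiosity bonus, a greedy agent follows that bonus, and the search simply evaluates every candidate and keeps the one that explores the most cells.

```python
from typing import Callable, Dict

# A candidate "program" maps a cell's visit count to a curiosity bonus.
Bonus = Callable[[int], float]

# Hypothetical candidate set; a real search would generate many more programs.
CANDIDATES: Dict[str, Bonus] = {
    "constant": lambda n: 0.0,                  # no curiosity signal at all
    "prefer_familiar": lambda n: float(n),      # rewards revisiting known cells
    "inverse_count": lambda n: 1.0 / (1.0 + n), # rewards rarely visited cells
    "negative_count": lambda n: -float(n),      # penalizes visited cells
}


def coverage(bonus: Bonus, n_states: int = 10, steps: int = 20) -> int:
    """Run a greedy agent in a 1-D corridor and count the distinct cells it visits."""
    counts = [0] * n_states
    pos = 0
    counts[pos] = 1
    for _ in range(steps):
        # Legal moves: one cell left or right, staying inside the corridor.
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < n_states]
        # Move to the neighbor with the highest bonus; ties go to the left cell.
        pos = max(neighbors, key=lambda p: bonus(counts[p]))
        counts[pos] += 1
    return sum(1 for c in counts if c > 0)


def search(candidates: Dict[str, Bonus]) -> Dict[str, int]:
    """Score every candidate program by how much of the corridor it explores."""
    return {name: coverage(fn) for name, fn in candidates.items()}


scores = search(CANDIDATES)
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy setup, the two candidates that favor rarely visited cells reach all ten cells, while the other two bounce between the first two. Because each candidate is ordinary code, the winner can be read and inspected directly, which is the property the article highlights.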

The paper’s senior authors are Leslie Kaelbling and Tomás Lozano-Pérez, both professors of computer science and electrical engineering at MIT. The work will be presented at the virtual International Conference on Learning Representations later this month.

The paper received praise from researchers not involved in the work. “The use of program search to discover a better intrinsic reward is very creative,” says Quoc Le, a principal scientist at Google who has helped pioneer computer-aided design of deep learning models. “I like this idea a lot, especially since the programs are...
