Researchers have developed a method known as Meta-learning for Compositionality (MLC), designed to improve the ability of artificial intelligence systems to perform “compositional generalization.” This capacity, which lets humans combine familiar concepts in novel ways, has been debated in the AI community for decades. Trained with a dedicated learning procedure, MLC not only matched human performance but, in certain instances, exceeded it. The result indicates that conventional neural networks can indeed be configured to exhibit human-like systematic generalization.
Emerging Prospects for Compositional Generalization in AI
Humans naturally understand how to combine ideas: once a person learns the concept of “skip,” for instance, they immediately understand what it means to “skip twice around the room” or “skip with your hands raised.”
The question of whether machines possess this sort of cognitive capacity is long-standing. In the late 1980s, the philosopher and cognitive scientists Jerry Fodor and Zenon Pylyshyn argued that artificial neural networks, the foundational elements of artificial intelligence and machine learning, were not capable of such compositional generalization. In the years since, there has been incremental progress toward giving neural networks this ability, but results have been mixed, keeping the debate alive.
A Pioneering Method: Meta-learning for Compositionality
A collaboration between New York University and Spain’s Pompeu Fabra University has produced a method, documented in the scientific journal Nature, that improves the ability of existing technologies such as ChatGPT to make compositional generalizations. MLC not only outperforms existing approaches but matches, and in some cases exceeds, human performance. The method centers on repeatedly training neural networks, the technology underlying systems like ChatGPT for speech recognition and natural language processing, to sharpen their compositional generalization through practice.
Developers of current systems, including large language models, have either hoped that compositional generalization would emerge from standard training methods or have built special-purpose architectures to obtain it. MLC departs from both approaches: according to the authors, it shows that explicitly practicing these skills allows standard networks to unlock capabilities they would not otherwise display.
“For over three decades, scholars from various disciplines, including cognitive science, artificial intelligence, linguistics, and philosophy, have been embroiled in debates over the capacity of neural networks to achieve human-like systematic generalization,” states Brenden Lake, an assistant professor at NYU’s Center for Data Science and Department of Psychology, and a co-author of the research. “Our findings demonstrate, for the first time, that a standard neural network can either emulate or surpass human systematic generalization when directly compared.”
The Mechanics of MLC
The researchers developed MLC as a novel learning procedure in which a neural network is continually updated over a sequence of episodes. In each episode, MLC receives a new word and must use it compositionally, for example taking the word “jump” and producing combinations like “jump twice” or “jump around right twice.” Each new episode presents a different word, progressively strengthening the network’s compositional skills.
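The paper’s actual training code is not reproduced here, but the episode structure the authors describe can be sketched in a few lines of Python. Everything below is illustrative: the pseudowords, the toy grammar (“twice,” “thrice,” “and”), and all function names are invented for this example rather than taken from the researchers’ implementation.

```python
import random

# A minimal sketch of MLC-style episode generation. Each episode draws a
# fresh mapping from pseudowords to output symbols, so a learner cannot
# memorize word meanings and must instead acquire the general skill of
# composing new words from a handful of study examples.

PRIMITIVES = ["dax", "zup", "wif", "lug"]
SYMBOLS = ["RED", "BLUE", "GREEN", "YELLOW"]

def interpret(tokens, lexicon):
    """Ground-truth compositional interpreter used to label an episode."""
    if len(tokens) == 1:
        return [lexicon[tokens[0]]]
    if tokens[-1] == "twice":
        return interpret(tokens[:-1], lexicon) * 2
    if tokens[-1] == "thrice":
        return interpret(tokens[:-1], lexicon) * 3
    if "and" in tokens:
        i = tokens.index("and")
        return interpret(tokens[:i], lexicon) + interpret(tokens[i + 1:], lexicon)
    raise ValueError(f"cannot interpret: {tokens}")

def make_episode(rng):
    """One meta-learning episode: study examples plus a compositional query."""
    words = rng.sample(PRIMITIVES, 3)
    lexicon = dict(zip(words, rng.sample(SYMBOLS, 3)))
    # Study examples show the episode's new words in isolation.
    study = [(w, interpret([w], lexicon)) for w in words]
    # The query demands a combination never shown in the study set.
    query = f"{words[0]} twice and {words[1]}"
    return study, (query, interpret(query.split(), lexicon))

rng = random.Random(0)
study, (query, target) = make_episode(rng)
print("study:", study)
print("query:", query, "->", target)
```

In the full method, a standard sequence-to-sequence network would be optimized across a long stream of such episodes, so the training signal rewards the general ability to compose a newly introduced word rather than knowledge of any fixed vocabulary.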
Evaluating the Methodology
To assess the efficacy of MLC, Brenden Lake, who is also co-director of NYU’s Minds, Brains, and Machines Initiative, and Marco Baroni, a researcher at the Catalan Institute for Research and Advanced Studies and professor in the Department of Translation and Language Sciences at Pompeu Fabra University, conducted a series of experiments with human participants that were identical to the tasks performed by MLC.
Moreover, rather than learning the meanings of real words, which humans would already know, participants had to learn the meanings of nonsense terms defined by the researchers (e.g., “zup” and “dax”) and work out how to apply them in different ways. MLC performed on par with, and in some instances better than, the human participants. Both MLC and the humans also outperformed ChatGPT and GPT-4, which, despite their broad capabilities, struggled with this particular learning task.
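To see why such pseudoword tasks separate systematic from non-systematic responders, consider the following toy illustration. It is invented for this article, not taken from the authors’ stimuli or scoring code: a responder that applies the function words compositionally answers every held-out query exactly, while one that merely translates the familiar words misses most of them.

```python
# Invented lexicon and held-out queries pairing an instruction with its
# gold output sequence; scoring is exact match, a common convention.
LEXICON = {"dax": "RED", "zup": "BLUE"}
QUERIES = [
    ("dax twice", ["RED", "RED"]),
    ("zup thrice", ["BLUE", "BLUE", "BLUE"]),
    ("dax and zup", ["RED", "BLUE"]),
]

def systematic(phrase):
    """Apply function words compositionally, folding left to right."""
    out = []
    for w in phrase.split():
        if w in LEXICON:
            out.append(LEXICON[w])
        elif w == "twice":
            out *= 2
        elif w == "thrice":
            out *= 3
        # "and" just sequences its arguments in this simple fold; queries
        # with nested scope would need a real parser.
    return out

def literal(phrase):
    """Non-systematic responder: translates known words, ignores modifiers."""
    return [LEXICON[w] for w in phrase.split() if w in LEXICON]

def accuracy(responder):
    return sum(responder(p) == gold for p, gold in QUERIES) / len(QUERIES)

print(f"systematic: {accuracy(systematic):.2f}")  # 1.00
print(f"literal:    {accuracy(literal):.2f}")     # 0.33
```

The gap between the two responders is the behavioral signature the experiments probe: producing the right outputs for combinations never seen during study, not merely recalling the studied word meanings.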
“Despite improvements over recent years, large language models like ChatGPT continue to encounter challenges with compositional generalization,” notes Marco Baroni, a member of Pompeu Fabra University’s Computational Linguistics and Linguistic Theory research group. “We believe that the MLC technique can provide further advancements in the compositional abilities of these large language models.”
Reference: “Human-like systematic generalization through a meta-learning neural network” by Brenden M. Lake and Marco Baroni, published on October 25, 2023, in Nature.
DOI: 10.1038/s41586-023-06668-3