Reinforcement Learning Enhances Performance of Generative Flow Networks

Scientists at the AI Research Centre and the AI and Digital Science Institute of the HSE Faculty of Computer Science have applied classical reinforcement learning algorithms to train generative flow networks (GFlowNets). This significantly improved the performance of GFlowNets, which for the past three years have been used to tackle complex scientific problems at the modelling, hypothesis generation, and experimental design stages. The paper ranked among the top 5% of publications at the International Conference on Artificial Intelligence and Statistics (AISTATS), held on May 2-4, 2024, in Valencia, Spain.

The study was supported by a grant for research centres in the field of AI provided by the Analytical Centre for the Government of the Russian Federation.

Generative flow networks (GFlowNets) are a machine learning method that facilitates the creation of diverse and high-quality data samples by training the model to generate varied objects with high rewards. First introduced in 2021, they have since found applications in various fields such as training language models, solving combinatorial optimisation problems (like creating complex schedules), PCB design, and modelling drug molecules with specific properties.

The functioning of GFlowNets can be likened to that of a Lego constructor: given an incomplete object and a collection of available parts, the model predicts where and with what probability each part should be added to assemble a high-quality mock-up of a car or ship.

Nikita Morozov
Research Assistant, Centre of Deep Learning and Bayesian Methods, AI and Digital Science Institute, Faculty of Computer Science, HSE University

Reinforcement learning (RL) is a machine learning paradigm in which an agent is trained to interact with an environment so as to maximise a reward function. AlphaGo, a well-known system based on reinforcement learning, was the first program to defeat a professional player at the board game Go.

Generative flow networks and reinforcement learning are similar in that both receive a reward function as a training signal. However, GFlowNets do not aim to maximise the reward directly; instead, they learn to generate objects with probabilities proportional to the reward.
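The difference between maximising a reward and sampling in proportion to it can be shown with a minimal sketch. The three objects and their reward values below are hypothetical, chosen only for illustration; the code does not train a GFlowNet, it only computes the target distribution such a model aims for.

```python
import random
from collections import Counter

# Hypothetical rewards for three candidate objects (illustrative values only).
rewards = {"A": 1.0, "B": 3.0, "C": 6.0}

def reward_maximiser(rewards):
    """Return the single highest-reward object, as a pure maximiser would."""
    return max(rewards, key=rewards.get)

def target_distribution(rewards):
    """Return the proportional target: P(x) = R(x) / sum of all rewards."""
    total = sum(rewards.values())
    return {x: r / total for x, r in rewards.items()}

def sample_proportional(rewards, n, seed=0):
    """Draw n objects with probability proportional to reward."""
    rng = random.Random(seed)
    objects = list(rewards)
    weights = [rewards[x] for x in objects]
    return Counter(rng.choices(objects, weights=weights, k=n))

best = reward_maximiser(rewards)          # always the same object
probs = target_distribution(rewards)      # every object keeps some probability
counts = sample_proportional(rewards, 10_000)
```

A maximiser returns the same object every time, whereas proportional sampling still produces lower-reward objects occasionally, which is where the diversity of generated samples comes from.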

Scientists at the AI Research Centre and the AI and Digital Science Institute of the HSE Faculty of Computer Science demonstrated that training generative flow networks closely resembles a general reinforcement learning task, and applied specialised reinforcement learning methods to the generation of discrete objects, such as molecular graphs.
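The article does not spell out how the two problems line up, so the following is only a toy illustration, not the researchers' algorithm. It assumes a hypothetical environment in which every finished object has exactly one construction path (a tree), and uses a standard result from entropy-regularised ("soft") reinforcement learning: if the only reward is the logarithm of the object's reward at the final step, the soft-optimal policy ends at each object with probability proportional to that reward. All state names and reward values are made up.

```python
import math

# Hypothetical toy environment: objects are strings built one character at a time.
# Each object has exactly one construction path, which keeps the sketch simple.
CHILDREN = {
    "": ["a", "b"],
    "a": ["aa", "ab"],
    "b": ["ba", "bb"],
}
# Hypothetical rewards for the finished (terminal) objects.
REWARD = {"aa": 1.0, "ab": 2.0, "ba": 3.0, "bb": 4.0}

def soft_value(state):
    """Soft value: log R at terminals, log-sum-exp over children otherwise."""
    if state in REWARD:
        return math.log(REWARD[state])
    return math.log(sum(math.exp(soft_value(c)) for c in CHILDREN[state]))

def policy(state):
    """Soft-optimal policy: pi(child | state) = exp(V(child) - V(state))."""
    v = soft_value(state)
    return {c: math.exp(soft_value(c) - v) for c in CHILDREN[state]}

def terminal_probabilities(state="", prob=1.0):
    """Probability of finishing at each object when following the policy."""
    if state in REWARD:
        return {state: prob}
    out = {}
    for child, p in policy(state).items():
        out.update(terminal_probabilities(child, prob * p))
    return out

probs = terminal_probabilities()
total = sum(REWARD.values())
# Each probs[x] matches REWARD[x] / total up to floating-point error.
```

When the same object can be assembled in several different orders, as with molecular graphs, this simple version would overcount such objects and an additional correction is needed; that case is outside the scope of this sketch.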

We have shown that classic reinforcement learning algorithms, when applied to GFlowNets, perform comparably to, and in some cases better than, well-known modern approaches developed specifically for training these models. For example, in the task of modelling drug molecules with specified properties, our method generated 30% more high-quality molecules during training than existing methods.

Alexey Naumov
Academic Supervisor, AI Research Centre; Director for Basic Research, AI and Digital Science Institute, Faculty of Computer Science, HSE University

The researchers emphasise that using existing reinforcement learning methods to train GFlowNets directly, without further adaptation, will accelerate the development of new methods in fields such as medicinal chemistry, materials science, energy, biotechnology, and many others where GFlowNets have been applied over the past three years.

July 01, 2024
