
Hakan Alp EREN, Nihat ADAR, Ahmet YAZAR
 





THE IMPORTANCE OF EXPERIENCE REPLAY BUFFER SIZE IN DEEP REINFORCEMENT LEARNING
 
Unlike supervised and unsupervised learning, reinforcement learning is based not on a ready-made dataset but on experiences obtained by interacting with the environment. Q-learning and many other reinforcement learning algorithms aim to obtain the optimal action-value function by using these experiences in iterative updates. Although such classical reinforcement learning methods have been successful in environments with low-dimensional state spaces, they are impractical for complex problems with high-dimensional state spaces. Using neural networks as function approximators for such problems allows generalization to states that were not observed during training. However, because the agent's experiences are sequential and consecutive samples are highly correlated, training the neural network directly on them leads to catastrophic forgetting. To prevent this, the experiences are stored in a buffer and training batches are drawn by random sampling, yielding approximately independent and identically distributed data. This method, called experience replay, stabilizes training by breaking the correlation and increases data efficiency. Although the replay buffer is a critical mechanism in many deep reinforcement learning algorithms, its size has often been an overlooked hyperparameter. In this study, experiments were carried out on five Atari 2600 games, a standard reinforcement learning benchmark, using the deep Q-learning algorithm with buffers of three different sizes: 50,000, 100,000, and 150,000. The results obtained from 45 trained agents show that a larger replay buffer does not always yield better results and that the replay buffer size is an important hyperparameter that needs to be tuned.

ORCID NO: 0000-0001-6105-158X, 0000-0002-0555-0701, 0000-0001-9348-9092

Keywords: Reinforcement learning, Deep Q-learning, Experience replay, Replay buffer, Buffer size
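
The experience replay mechanism described in the abstract can be illustrated as a fixed-capacity buffer with uniform random sampling. The following is a minimal Python sketch, not the authors' implementation: the class name, transition fields, and batch size are assumptions made for illustration, while the three capacities mirror the buffer sizes compared in the study.

import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size experience replay buffer with uniform random sampling (illustrative sketch)."""

    def __init__(self, capacity):
        # Once the deque is full, appending a new transition silently discards
        # the oldest one, keeping memory bounded at `capacity`.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one interaction with the environment.
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the temporal correlation between
        # consecutive transitions, approximating an i.i.d. training batch.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Example: the three capacities compared in the study (50,000; 100,000; 150,000).
for capacity in (50_000, 100_000, 150_000):
    buffer = ReplayBuffer(capacity)

In a deep Q-learning loop, the agent would push each transition into the buffer after every environment step and periodically call sample() to draw a minibatch for a gradient update; the capacity is the hyperparameter whose effect the study examines.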