Reinforcement Learning in Continuous State and Action Spaces

In this talk I present reinforcement learning architectures and algorithms designed to solve robot control problems with continuous state and action spaces, where function approximators are needed to represent the value functions. I start by defining an architecture for supervised reinforcement learning that allows a reinforcement learner to benefit from imperfect advice provided by an expert. Then I present one architecture and two algorithms for implementing Actor-Critic learning. Finally, I compare the performance of three kinds of function approximators: an array of radial basis functions, a multilayer perceptron, and a multilayer perceptron enhanced with a layer of radial basis functions. The experimental work shows that advice provided by an expert can accelerate the reinforcement learning process, even when the advice is imperfect. The results also show that one of the Actor-Critic algorithms presented, a hybrid of Sarsa learning and the Actor-Critic method, has characteristics that make it a more robust candidate for solving robot control problems than the standard Actor-Critic algorithm.
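To make the idea of a value function represented by an array of radial basis functions more concrete, here is a minimal sketch. It is not the architecture from the talk: it assumes a scalar state, Gaussian basis functions with a shared width, a linear critic, and a plain TD(0) update. The names `rbf_features`, `value`, and `td0_update`, and the parameters `width`, `alpha`, and `gamma`, are illustrative choices.

```python
import math


def rbf_features(state, centers, width):
    """Gaussian radial basis activations for a scalar state (assumed shared width)."""
    return [math.exp(-((state - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def value(weights, features):
    """Linear value estimate: weighted sum of the basis activations."""
    return sum(w * f for w, f in zip(weights, features))


def td0_update(weights, features, reward, next_features, alpha, gamma):
    """Apply one TD(0) step to the weights in place and return the TD error.

    This is a generic critic update, used here only as an illustration.
    """
    td_error = reward + gamma * value(weights, next_features) - value(weights, features)
    for i, f in enumerate(features):
        weights[i] += alpha * td_error * f
    return td_error
```

A critic built this way would start with `weights = [0.0] * len(centers)` and call `td0_update` once per observed transition. Because each basis function responds only near its center, an update mostly changes the estimate for nearby states, which is one reason such arrays are a common choice for continuous state spaces.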

As a rule, the talks are part of lecture series at the University of Bremen and are not open to the public. If you are interested, please contact the secretariat at sek-ric(at)dfki.de.

Last modified on 31.03.2023