As multiple studies have shown, providing feedback to autonomous learning agents can speed up learning. However, quantifying and characterizing how different aspects of feedback, such as quantity, quality, and temporal and spatial misalignment, affect learning speed, performance, and other relevant metrics is still an open question. This question not only addresses theoretical aspects of the learning algorithms but is also highly relevant for applications in real systems: although feedback is beneficial, (human) feedback is also expensive and adds complexity to the system. Thus, it is essential to know the minimal requirements that (human) feedback must meet to yield an improvement in performance, learning speed, etc., that is large enough to be worth the added complexity.
To address these challenges, this project poses a series of questions that can be addressed independently, each contributing to a deeper understanding of the role of feedback in a learning system's performance. Several assumptions and simplifications have been made to facilitate the study of these questions. These include the use of binary and low-dimensional feedback comparable to that used in the M-RoCK project. Also motivated by the M-RoCK project, the questions will be studied in a robot reaching task for a KUKA LBR iiwa, a robotic arm with 7 degrees of freedom (DoF). Configurations from 1 to 7 DoF will be used to study feedback effects at different levels of task complexity. The project will use artificial feedback and will be studied primarily in simulated environments. Eventually, once a better understanding of the effects of feedback is obtained, experiments with real users will be carried out.
Several thesis directions are possible, and these will be discussed with the candidates. These include:
≫Quantifying the Effect of Feedback Quantity in IRL Performance
≫Quantifying the Effect of Feedback Accuracy in IRL Performance
≫Quantifying the Effect of Time-Delayed Feedback in IRL Performance
≫Policy Shaping for Dynamical Systems Using Low-Dimensional Feedback
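The first three directions vary one property of the artificial feedback at a time: how often it is given, how often it is correct, and how late it arrives. A minimal sketch of such a simulated binary teacher could look as follows. The class `ArtificialFeedback` and its parameters `rate`, `accuracy`, and `delay` are hypothetical names chosen for illustration; they are not taken from the project or from M-RoCK, and the actual feedback model may differ.

```python
import random
from collections import deque


class ArtificialFeedback:
    """Simulated binary teacher (hypothetical helper, for illustration only)."""

    def __init__(self, rate=1.0, accuracy=1.0, delay=0, seed=None):
        self.rate = rate          # probability that feedback is given for a step
        self.accuracy = accuracy  # probability that given feedback is correct
        self.delay = delay        # number of steps before feedback arrives
        self._rng = random.Random(seed)
        # Pre-fill with `delay` empty slots so each signal arrives `delay` steps late.
        self._queue = deque([None] * delay)

    def step(self, action_was_good):
        """Return +1, -1, or None (no feedback) for the step taken `delay` steps ago."""
        signal = None
        if self._rng.random() < self.rate:
            # With probability `accuracy` the teacher judges correctly,
            # otherwise the binary label is flipped.
            correct = self._rng.random() < self.accuracy
            good = action_was_good if correct else not action_was_good
            signal = 1 if good else -1
        self._queue.append(signal)
        return self._queue.popleft()


# Perfect, always-available feedback that arrives two steps late.
teacher = ArtificialFeedback(rate=1.0, accuracy=1.0, delay=2)
outputs = [teacher.step(good) for good in [True, False, True, True]]
print(outputs)  # [None, None, 1, -1]
```

Sweeping one parameter while holding the other two fixed (for example `rate` from 0 to 1 with `accuracy=1.0` and `delay=0`) gives one way to isolate the effect each thesis direction asks about.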
Prior knowledge of or interest in:
≫Reinforcement Learning and Machine Learning
≫Python, LaTeX, Git, Linux
≫Stahlhut, C., Navarro-Guerrero, N., Weber, C., & Wermter, S. (2015). Interaction in Reinforcement Learning Reduces the Need for Finely Tuned Hyperparameters in Complex Tasks. Kognitive Systeme, 3(2). https://doi.org/10.17185/duepublico/40718
Deutsches Forschungszentrum für
Künstliche Intelligenz GmbH
Robotics Innovation Center
28359 Bremen, Germany
Phone: +49 421 17845 firstname.lastname@example.org