Behavior learning is a promising alternative to planning and control for behavior generation in robotics. The field is becoming increasingly popular in applications where modeling the environment and the robot is cumbersome, difficult, or even impossible.
This dissertation tackles the challenge of learning behaviors for real robots that generalize over task parameters with as few interactions with the environment as possible. It is currently not apparent which problems behavior learning algorithms can solve and which algorithms robotics needs, because there are many related fields: imitation learning, reinforcement learning, self-supervised learning, and black-box optimization.
After an extensive literature review, we decided to use methods from imitation learning and policy search to address the challenge. Specifically, we use human demonstrations recorded by motion capture systems and imitation learning with movement primitives to obtain initial behaviors, which we later generalize through contextual policy search.
Imitation from motion capture data leads to the correspondence problem: the kinematic and dynamic capabilities of humans and robots are often fundamentally different, so these differences must be compensated for. This thesis proposes a procedure for automatic embodiment mapping through optimization and policy search and evaluates it with several robotic systems.
Contextual policy search algorithms are often not sample-efficient enough to learn directly on real robots. This thesis addresses the issue with active context selection, active training set selection, surrogate models, and manifold learning. The progress is illustrated with several simulated and real robot learning tasks. This part of the thesis also reveals and exploits strong connections between policy search and black-box optimization. It demonstrates that learning manipulation behaviors is possible within a few hundred episodes directly on a real robot.
Furthermore, these new approaches to imitation learning and contextual policy search are integrated into a coherent framework that can be used to learn new behaviors from human motion capture data almost automatically. Corresponding implementations that were developed during this thesis are available as open source software.
Talk details
Doctoral defense talk: Learning and Generalizing Behaviors for Robots from Human Demonstration
As a rule, the talks are part of lecture series at the University of Bremen and are not open to the public. If you are interested, please contact the secretariat at sek-ric(at)dfki.de.