In a dynamic environment, a robot's capacity to adapt is crucial. Quality-Diversity (QD) algorithms increase adaptability by producing a repertoire of diverse, high-performing solutions, each characterized by a descriptor function. For example, if a robot loses an extremity, it can draw on the repertoire generated by the QD algorithm to quickly recover a working behavior.
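To make the repertoire idea concrete, here is a minimal MAP-Elites-style archive sketch. It is purely illustrative: the function names, the grid resolution, and the example gait solutions are hypothetical, not part of MOME-PGX itself. Each solution is stored in a grid cell indexed by its behavior descriptor, and a cell keeps only the best solution seen so far, which is how the archive ends up covering many distinct behaviors.

```python
def descriptor_to_cell(descriptor, num_cells=10):
    """Discretize a descriptor with components in [0, 1] into a grid cell index."""
    return tuple(min(int(d * num_cells), num_cells - 1) for d in descriptor)

def try_insert(archive, solution, fitness, descriptor):
    """Insert the solution if its cell is empty or it beats the incumbent's fitness."""
    cell = descriptor_to_cell(descriptor)
    incumbent = archive.get(cell)
    if incumbent is None or fitness > incumbent[1]:
        archive[cell] = (solution, fitness)
        return True
    return False

archive = {}
try_insert(archive, "gait_a", fitness=0.8, descriptor=(0.12, 0.95))
try_insert(archive, "gait_b", fitness=0.6, descriptor=(0.12, 0.95))  # rejected: same cell, lower fitness
try_insert(archive, "gait_c", fitness=0.7, descriptor=(0.50, 0.20))  # accepted: new cell
```

After a damage event (such as a lost extremity), adaptation amounts to re-evaluating the archived solutions and selecting the cell whose behavior still performs well on the real robot.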
Multi-Objective MAP-Elites with Policy-Gradient Assistance and Crowding-based Exploration (MOME-PGX) is a data-efficient multi-objective QD algorithm and will be used for this research, which aims to determine how inaccuracies in simulation models impact the algorithm's performance and how learned exploratory behaviors can accelerate model improvement.
This will be tested using a robotic model in a simulated environment, tasked with kicking an object toward a goal under two objectives: minimizing energy consumption and maximizing accuracy. Inaccuracies will then be introduced into the environment, such as changes in friction, object shape, and joint restrictions, among others, to challenge the robot as it attempts to complete its task.
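Because the two objectives pull in opposite directions (lower energy, higher accuracy), no single solution is "best"; a multi-objective QD algorithm such as MOME-PGX instead keeps a Pareto front of non-dominated trade-offs in each archive cell. The sketch below shows the dominance test under these two objectives; it is an illustrative helper under assumed conventions (energy minimized, accuracy maximized), not the MOME-PGX implementation.

```python
def dominates(a, b):
    """True if a = (energy, accuracy) Pareto-dominates b: energy is minimized,
    accuracy is maximized, so energy is negated to treat both as maximization."""
    a_obj = (-a[0], a[1])
    b_obj = (-b[0], b[1])
    at_least_as_good = all(x >= y for x, y in zip(a_obj, b_obj))
    strictly_better = any(x > y for x, y in zip(a_obj, b_obj))
    return at_least_as_good and strictly_better

def pareto_front(candidates):
    """Keep only non-dominated (energy, accuracy) pairs, as a MOME cell would."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# (2.0, 0.9) and (1.5, 0.6) are dominated by (1.0, 0.9); (3.0, 0.95) survives
# because nothing matches its accuracy.
front = pareto_front([(2.0, 0.9), (1.0, 0.9), (1.5, 0.6), (3.0, 0.95)])
```

When simulation inaccuracies (friction, object shape, joint restrictions) shift the measured energy and accuracy values, the composition of these per-cell fronts changes, which is one concrete way the proposed experiments can quantify the algorithm's sensitivity to model error.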