Model-based Direct Policy Search for Skill Learning in Continuous Domains
Jan Hendrik Metzen
In Proceedings of the 10th European Workshop on Reinforcement Learning (EWRL-12), 30.6.-01.7.2012, Edinburgh, o.A., Jun/2012.
One interesting problem domain for reinforcement learning (RL) is real-world robotic control. Such domains can be modeled as (potentially partially observable or noisy) Markov Decision Processes with continuous state and action spaces (cMDPs). Several authors (Togelius et al., 2009; Kalyanakrishnan and Stone, 2009) argue that for such continuous and noisy domains, direct policy search (DPS) methods may outperform value-function-based RL.