The challenge of controlling nonlinear systems has sparked significant interest within the research community. In particular, reinforcement learning-based control methods present promising advantages for addressing nonlinear control challenges, including reduced reliance on detailed system models and improved adaptability in the presence of model uncertainty.
Our objective is the swing-up and stabilization of two highly nonlinear, underactuated double-pendulum systems: the acrobot and the pendubot. We developed a model-free reinforcement learning approach based on the Soft Actor-Critic (SAC) algorithm, which succeeds in simulated environments. In real-world experiments, however, the method's effectiveness is limited: it performs well on the pendubot but fails on the acrobot. The gap between simulation and reality, known as the "sim2real" problem, thus remains a challenge that requires further attention.
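To illustrate the entropy-regularized objective that distinguishes SAC from standard actor-critic methods, the sketch below computes SAC's critic regression target for a batch of transitions. This is a minimal, generic illustration of the standard SAC update (clipped double-Q minimum plus an entropy bonus), not the authors' actual implementation; the function name and toy values are hypothetical.

```python
def sac_critic_target(rewards, dones, q1_next, q2_next, logp_next,
                      gamma=0.99, alpha=0.2):
    """Entropy-regularized Bellman targets for SAC's critic networks.

    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s'))

    The min over two target critics curbs Q-value overestimation, and the
    -alpha * log pi term adds the entropy bonus that makes SAC "soft",
    encouraging exploratory policies during swing-up.
    """
    targets = []
    for r, d, q1, q2, lp in zip(rewards, dones, q1_next, q2_next, logp_next):
        soft_value = min(q1, q2) - alpha * lp  # entropy-regularized value
        targets.append(r + gamma * (1.0 - d) * soft_value)
    return targets


# Toy transition: reward 1.0, non-terminal, next-state Q estimates 2.0 and
# 3.0, next-action log-probability -1.0.
y = sac_critic_target([1.0], [0.0], [2.0], [3.0], [-1.0])
# min(2.0, 3.0) - 0.2 * (-1.0) = 2.2, so y = 1.0 + 0.99 * 2.2 = 3.178
```

In practice the critics are trained by minimizing the squared error between their predictions and these targets, while the actor maximizes the same entropy-regularized value; the temperature alpha is often tuned automatically.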