Controller | Short Controller Description | #swingups | Uptime [s] | Best RealAI Score | Average RealAI Score | Username | Data |
---|---|---|---|---|---|---|---|
ar_eapo | Policy trained with average reward maximum entropy RL | 12 | 42.849 | 0.714 | 0.696 | rnilva | data plot video |
AcadosMPC | Real-time nonlinear Model Predictive Control implemented with the Acados framework. | 12 | 43.269 | 0.721 | 0.716 | maranderine | data plot video |
MC-PILCO | Controller trained with the Model-Based Reinforcement Learning algorithm MC-PILCO. | 12 | 33.545 | 0.559 | 0.542 | turcato-niccolo | data plot video |
VIMPPI | Variational Integrator Model Predictive Path Integral for direct torque planning | 11 | 24.056 | 0.401 | 0.363 | adk | data plot video |
This leaderboard shows the final results from the RealAIGym competition at ICRA 2025.
The simulation leaderboard tests how well different control methods cover the double pendulum's state space in simulation. The controller's task is to swing up and balance the pendubot, keeping the end-effector above the threshold line. At random times during the execution, the pendulum is reset to a new initial state.
The model parameters of the pendubot, which we identified with a least-squares optimization, are:
More information about the dynamic model of the double pendulum can be found here: Double Pendulum Dynamics. For a URDF file with this model, see here: URDF.
The \(0.5\,\text{Nm}\) torque limit on the passive joint can be used to compensate for the friction of the motor.
The actuators can be controlled at an arbitrary control frequency of up to \(500\,\text{Hz}\), and the experiment takes \(60\,\text{s}\). The initial pendubot configuration is \(x_0 = (0, 0, 0, 0)\) (hanging down) and the goal is the unstable fixed point at the upright configuration \(x_g = (\pi, 0, 0, 0)\). The upright position is considered reached when the end-effector is above the threshold line at \(h=0.45\,\text{m}\) (origin at the mounting point).
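The goal-region test can be sketched in a few lines of Python. This is a minimal illustration, not the competition's evaluation code: the link lengths `L1` and `L2` are hypothetical placeholders (the identified values are not listed here), and the angle convention (first angle measured from the hanging-down position, second angle relative to the first link) is an assumption chosen to be consistent with \(x_0\) and \(x_g\) above.

```python
import math

# Hypothetical link lengths in metres; replace with the identified parameters.
L1 = 0.2
L2 = 0.3
THRESHOLD = 0.45  # threshold height in metres, from the task description


def end_effector_height(q1, q2, l1=L1, l2=L2):
    """Height of the end-effector relative to the mounting point.

    Assumes q1 is measured from the hanging-down position and q2 is
    relative to the first link, so (q1, q2) = (0, 0) hangs straight down.
    """
    return -l1 * math.cos(q1) - l2 * math.cos(q1 + q2)


def in_goal_region(q1, q2, threshold=THRESHOLD):
    """True if the end-effector is above the threshold line."""
    return end_effector_height(q1, q2) > threshold


# Hanging down: end-effector is below the mounting point.
print(end_effector_height(0.0, 0.0))
# Upright configuration x_g: end-effector is above the threshold.
print(in_goal_region(math.pi, 0.0))
```

With these placeholder lengths the upright end-effector height is \(0.5\,\text{m}\), so the goal region is a narrow band just below the fully upright position.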
The score is the fraction of the \(60\,\text{s}\) runtime during which the pendulum is in the goal region, i.e. above the threshold line.
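As a rough sketch of this scoring rule, the snippet below accumulates uptime from sampled end-effector heights and divides by the runtime. The function names and the fixed-step sampling are assumptions made for illustration; the organizers' evaluation script may differ in detail.

```python
RUNTIME = 60.0    # total runtime in seconds, from the description above
THRESHOLD = 0.45  # threshold height in metres


def uptime(heights, dt, threshold=THRESHOLD):
    """Total time in seconds spent above the threshold.

    `heights` is a sequence of end-effector heights sampled every `dt` seconds
    (a hypothetical logging format).
    """
    return sum(dt for h in heights if h > threshold)


def realai_score(uptime_s, runtime=RUNTIME):
    """Fraction of the runtime spent in the goal region."""
    return uptime_s / runtime


# Cross-check against the first table row: 42.849 s of uptime.
print(round(realai_score(42.849), 3))  # 0.714
```

The uptime and best-score columns in the table are consistent with this ratio: for example, \(43.269 / 60 \approx 0.721\).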
This leaderboard only covers the results from the competition at ICRA 2025. To participate, check out the ongoing leaderboard.