ICRA 2025 Pendubot Real System Results

| Controller | Short Controller Description | #swingups | Uptime [s] | Best RealAI Score | Average RealAI Score | Username | Data |
|---|---|---|---|---|---|---|---|
| ar_eapo | Policy trained with average reward maximum entropy RL | 12 | 42.849 | 0.714 | 0.696 | rnilva | data plot video |
| AcadosMPC | Real-time nonlinear Model Predictive Control implemented with the Acados framework | 12 | 43.269 | 0.721 | 0.716 | maranderine | data plot video |
| MC-PILCO | Controller trained with the Model-Based Reinforcement Learning algorithm MC-PILCO | 12 | 33.545 | 0.559 | 0.542 | turcato-niccolo | data plot video |
| VIMPPI | Variational Integrator Model Predictive Path Integral for direct torque planning | 11 | 24.056 | 0.401 | 0.363 | adk | data plot video |
Videos from left to right: Acados_MPC, AR_EAPO, MCPILCO, VIMPPI

Rules

This leaderboard shows the final results from the RealAIGym competition at ICRA 2025.

The simulation leaderboard tests how well different control methods cover the double pendulum's state space in simulation. The task for the controller is to swing up and balance the pendubot and keep the end-effector above the threshold line. At random times during the execution, the pendulum is reset to a new initial state.

The model parameters of the pendubot, which we identified with a least-squares optimization, are:

More information about the dynamic model of the double pendulum can be found here: Double Pendulum Dynamics. A URDF file with this model is available here: URDF.

The \(0.5\,\text{Nm}\) torque limit on the passive joint can be used to compensate for the friction of the motor.

The actuators can be controlled at any control frequency up to \(500\,\text{Hz}\), and the experiment takes \(60\,\text{s}\). The initial pendubot configuration is \(x_0 = (0, 0, 0, 0)\) (hanging down) and the goal is the unstable fixed point at the upright configuration \(x_g = (\pi, 0, 0, 0)\). The upright position is considered to be reached when the end-effector is above the threshold line at \(h = 0.45\,\text{m}\) (origin at the mounting point).
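The threshold test can be sketched in a few lines of Python. This is only an illustration, not the competition's evaluation code: it assumes that the first angle is measured from the hanging-down position, that the second angle is measured relative to the first link, and it uses hypothetical link lengths of 0.3 m and 0.2 m, which are not taken from this page.

```python
import math


def end_effector_height(q1, q2, l1, l2):
    """Height of the end-effector above the mounting point, in metres.

    Assumption: q1 is measured from the hanging-down position and
    q2 is the angle of the second link relative to the first.
    """
    return -l1 * math.cos(q1) - l2 * math.cos(q1 + q2)


def in_goal_region(q1, q2, l1, l2, threshold=0.45):
    """True if the end-effector is above the threshold line."""
    return end_effector_height(q1, q2, l1, l2) > threshold


# Hypothetical link lengths, chosen only for illustration.
L1, L2 = 0.3, 0.2

print(in_goal_region(0.0, 0.0, L1, L2))      # hanging down, x_0
print(in_goal_region(math.pi, 0.0, L1, L2))  # upright, x_g
```

With these assumed lengths, the hanging configuration lies below the line and the upright configuration lies above it.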

Scores

The score is the fraction of the 60 s runtime during which the pendulum is in the goal region, i.e. above the threshold line.
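As a rough sketch of this definition, the score can be computed from a log of end-effector heights sampled at a fixed time step. The function names and the uniformly sampled log are assumptions made for illustration; the actual evaluation script may differ.

```python
def uptime_from_heights(heights, dt, threshold=0.45):
    """Time in seconds spent strictly above the threshold line.

    Assumption: heights are sampled uniformly with time step dt.
    """
    samples_above = sum(1 for h in heights if h > threshold)
    return samples_above * dt


def score_from_uptime(uptime, runtime=60.0):
    """Fraction of the runtime spent in the goal region."""
    return uptime / runtime


# Tiny hypothetical log: two of the four samples are above the line.
print(uptime_from_heights([0.50, 0.40, 0.46, 0.45], dt=0.5))

# Uptimes from the results table reproduce the listed best scores.
print(round(score_from_uptime(42.849), 3))
print(round(score_from_uptime(43.269), 3))
```

Dividing the listed uptimes by 60 s gives 0.714 for ar_eapo and 0.721 for AcadosMPC, which matches the best scores in the table.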

Participating

This leaderboard is only for the results from the competition at ICRA 2025. To participate, check out the ongoing leaderboard.