Controller | Short Controller Description | #swingups | Uptime [s] | RealAI Score | Username | Data |
---|---|---|---|---|---|---|
mcpilco | Swingup trained with MBRL algorithm MC-PILCO + stabilization with LQR. | 34 | 18.408 | 0.307 | turcato-niccolo | data plot video |
VIMPPI | Variational Integrator Model Predictive Path Integral for direct torque planning | 16 | 38.882 | 0.648 | adk | data plot video |
AcadosMpc | Real-time nonlinear Model Predictive Control implemented with the Acados framework | 21 | 37.798 | 0.63 | maranderine | data plot video |
AR-EAPO | Policy trained with average reward maximum entropy RL | 21 | 39.484 | 0.658 | rnilva | data plot video |
This leaderboard shows the simulation results from the RealAIGym competition at ICRA 2025. The simulation leaderboard tests how much of the double pendulum's state space each control method can cover in simulation. The task for the controller is to swing up and balance the acrobot and keep the end-effector above the threshold line. At random times during the execution, the pendulum is reset to a new initial state.
The model parameters of the acrobot are:
More information about the dynamic model of the double pendulum can be found here: Double Pendulum Dynamics. In the Double Pendulum Repository the parameters above are labeled as ‘designC.1/model1.1’. For a URDF file with this model see here: URDF.
The acrobot is simulated with a Runge-Kutta 4 integrator with a timestep of \(dt = 0.002 \, \text{s}\) for \(T = 60 \, \text{s}\). The initial acrobot configuration is \(x_0 = (0, 0, 0, 0)\) (hanging down) and the goal is the unstable fixed point at the upright configuration \(x_g = (\pi, 0, 0, 0)\). The upright position is considered to be reached when the end-effector is above the threshold line at \(h = 0.45 \, \text{m}\) (origin at the mounting point) and stays there until the end. At 15 random times during the execution, the controller is switched off for \(0.2 \, \text{s}\) and the pendulum is reset to a new initial state. After the reset, the controller is switched on again and is expected to swing the pendulum up from the new initial state. The leaderboard was evaluated with the numpy random seed 777.
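As a rough illustration of the setup above, the following sketch shows one classical Runge-Kutta 4 step and an end-effector height check. It is not the competition's simulator: the function names, the state ordering \((q_1, q_2, \dot q_1, \dot q_2)\), the relative-angle convention for \(q_2\), and the link-length arguments `l1` and `l2` are assumptions made for illustration. The actual model parameters are those of ‘designC.1/model1.1’ in the Double Pendulum Repository.

```python
import math

DT = 0.002   # integrator timestep in seconds (from the text)
T = 60.0     # total simulation time in seconds (from the text)
H = 0.45     # threshold height in metres, origin at the mounting point (from the text)
N_STEPS = round(T / DT)  # number of integration steps in one run


def rk4_step(f, x, u, dt):
    """One classical RK4 step for x' = f(x, u), holding the input u constant."""
    k1 = f(x, u)
    k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)], u)
    k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)], u)
    k4 = f([xi + dt * ki for xi, ki in zip(x, k3)], u)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def end_effector_height(q1, q2, l1, l2):
    """Height of the end-effector above the mounting point.

    Assumes q1 is measured from the hanging-down position and q2 is
    relative to the first link, so (0, 0) hangs down and (pi, 0) is upright.
    l1 and l2 are hypothetical link lengths supplied by the caller.
    """
    return -l1 * math.cos(q1) - l2 * math.cos(q1 + q2)


def in_goal_region(q1, q2, l1, l2, threshold=H):
    """True if the end-effector is above the threshold line."""
    return end_effector_height(q1, q2, l1, l2) > threshold
```

With any pair of link lengths, the hanging-down state gives a height of `-(l1 + l2)` and the upright state gives `l1 + l2`, so the threshold test only succeeds near the upright configuration.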
The score is the time the pendulum spends in the goal region divided by the total runtime.
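The score definition can be checked against the table with a few lines of Python. This is a minimal sketch, assuming the total runtime is the \(T = 60 \, \text{s}\) simulation time and that the listed scores are rounded to three decimals; the helper name `score` is made up for illustration.

```python
T = 60.0  # total runtime in seconds, taken from the simulation description


def score(uptime_s, total_s=T):
    """Fraction of the run spent in the goal region."""
    return uptime_s / total_s


# Uptime values in seconds, copied from the leaderboard table.
uptimes = {
    "mcpilco": 18.408,
    "VIMPPI": 38.882,
    "AcadosMpc": 37.798,
    "AR-EAPO": 39.484,
}

# Scores rounded to three decimals, as they appear in the table.
scores = {name: round(score(u), 3) for name, u in uptimes.items()}
```

Under these assumptions the computed values match the RealAI Score column (0.307, 0.648, 0.63, 0.658).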
This leaderboard covers only the results from the competition at ICRA 2025. To participate, check out the ongoing leaderboard.