Simulator

The simulator class can simulate and animate the pendulum motion forward in time. The gym environment can be used for reinforcement learning.

API

The simulator

The simulator should be initialized with a plant (here the PendulumPlant) as follows:

pendulum = PendulumPlant()
sim = Simulator(plant=pendulum)

To simulate the dynamics of the plant forward in time call:

T, X, TAU = sim.simulate(t0=0.0,
                         x0=[0.5, 0.0],
                         tf=10.0,
                         dt=0.01,
                         controller=None,
                         integrator="runge_kutta")

The inputs of the function are:

  • t0: float, start time, unit: s

  • x0: start state (dimension as the plant expects it)

  • tf: float, final time, unit: s

  • dt: float, time step, unit: s

  • controller: controller that computes the motor torque(s) to be applied. The controller should have the structure of the AbstractController class in utilities/abstract_controller. If controller=None, no controller is used and the free system is simulated.

  • integrator: string, "euler" for the Euler integrator, "runge_kutta" for the Runge-Kutta integrator

The function returns three lists:

  • T: List of time values

  • X: List of states

  • TAU: List of actuations
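To make the roles of the inputs and outputs concrete, the following is a minimal, self-contained sketch of such a simulation loop. It is not the Simulator implementation: the pendulum parameters, the function names, and the controller signature (a plain callable taking time and state) are assumptions made for illustration only.

```python
import math

# Hypothetical pendulum parameters (assumed for illustration only).
MASS, LENGTH, DAMPING, GRAVITY = 1.0, 0.5, 0.1, 9.81


def dynamics(x, tau):
    """Return [velocity, acceleration] for state x = [position, velocity]."""
    pos, vel = x
    inertia = MASS * LENGTH ** 2
    acc = (tau - DAMPING * vel - MASS * GRAVITY * LENGTH * math.sin(pos)) / inertia
    return [vel, acc]


def euler_step(x, tau, dt):
    """Explicit Euler: one derivative evaluation per step."""
    dx = dynamics(x, tau)
    return [xi + dt * di for xi, di in zip(x, dx)]


def runge_kutta_step(x, tau, dt):
    """Classical 4th-order Runge-Kutta, torque held constant over the step."""
    def shifted(k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]

    k1 = dynamics(x, tau)
    k2 = dynamics(shifted(k1, 0.5 * dt), tau)
    k3 = dynamics(shifted(k2, 0.5 * dt), tau)
    k4 = dynamics(shifted(k3, dt), tau)
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate_sketch(t0, x0, tf, dt, controller=None, integrator="runge_kutta"):
    """Roll the state forward from t0 to tf and record T, X, TAU."""
    step = {"euler": euler_step, "runge_kutta": runge_kutta_step}[integrator]
    n_steps = int(round((tf - t0) / dt))
    t, x = t0, list(x0)
    T, X, TAU = [t], [x], []
    for i in range(n_steps):
        # controller=None means the free (unactuated) system is simulated.
        tau = 0.0 if controller is None else controller(t, x)
        x = step(x, tau, dt)
        t = t0 + (i + 1) * dt
        T.append(t)
        X.append(x)
        TAU.append(tau)
    return T, X, TAU


T, X, TAU = simulate_sketch(t0=0.0, x0=[0.5, 0.0], tf=10.0, dt=0.01)
```

In this sketch T and X hold one more entry than TAU, because the initial state is recorded before any torque is applied; the lengths returned by the actual simulator may differ.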

The same simulation can be executed together with an animation of the plant (only implemented for 2d serial chains). For the simulation with animation call:

T, X, TAU = sim.simulate_and_animate(t0=0.0,
                                     x0=[0.5, 0.0],
                                     tf=10.0,
                                     dt=0.01,
                                     controller=None,
                                     integrator="runge_kutta",
                                     phase_plot=True,
                                     save_video=False,
                                     video_name="")

The additional parameters are:

  • phase_plot: bool

    Whether to show a phase plot alongside the animation

  • save_video: bool

    Whether to save the animation as mp4 video

  • video_name: string

    Name of the file where the video should be saved (only used if save_video=True)

The gym environment

The environment can be initialized with:

pendulum = PendulumPlant()
sim = Simulator(plant=pendulum)
env = SimplePendulumEnv(simulator=sim,
                        max_steps=5000,
                        target=[np.pi, 0.0],
                        state_target_epsilon=[1e-2, 1e-2],
                        reward_type='continuous',
                        dt=1e-3,
                        integrator='runge_kutta',
                        state_representation=2,
                        validation_limit=-150,
                        scale_action=True,
                        random_init="False")

The parameters are:

  • simulator:

    Simulator object

  • max_steps: int, default=5000

    Maximum steps the agent can take before the episode is terminated

  • target: array-like, default=[np.pi, 0.0]

    The target state of the pendulum

  • state_target_epsilon: array-like, default=[1e-2, 1e-2]

    Target epsilon for discrete reward type

  • reward_type: string, default='continuous'

    Selects the reward function that is used. Options: continuous, discrete, soft_binary, soft_binary_with_repellor

  • dt: float, default=1e-3

    Timestep for the simulation

  • integrator: string, default='runge_kutta'

    The integrator used by the simulator. Options: euler, runge_kutta

  • state_representation: int, default=2

    Determines how the state space of the pendulum is represented. 2 means state = [position, velocity]; 3 means state = [cos(position), sin(position), velocity].

  • validation_limit: float, default=-150

    If the reward during validation episodes surpasses this value the training stops early

  • scale_action: bool, default=True

    Whether to scale the output of the model with the torque limit of the simulator's plant. If True, the model is expected to return values in the interval [-1, 1] as action.

  • random_init: string, default="False"

    A string determining the random state initialisation. False: the pendulum is set to [0, 0]; start_vicinity: the pendulum position and velocity are set in the range [-0.31, 0.31]; everywhere: the pendulum is set to a random state in the whole possible state space.
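The state_representation and scale_action options can be illustrated with two small helper functions. This is a sketch of the mappings described in the parameter list, not the environment's own code: the function names and the clipping of out-of-range actions are assumptions.

```python
import math


def represent_state(position, velocity, state_representation=2):
    """Map a raw pendulum state to the 2- or 3-dimensional form."""
    if state_representation == 2:
        return [position, velocity]
    if state_representation == 3:
        return [math.cos(position), math.sin(position), velocity]
    raise ValueError("state_representation must be 2 or 3")


def scale_action(action, torque_limit):
    """Map a model output in [-1, 1] to a torque in [-torque_limit, torque_limit].

    Clipping values outside [-1, 1] is an assumption of this sketch.
    """
    clipped = max(-1.0, min(1.0, action))
    return clipped * torque_limit


upright = represent_state(math.pi, 0.0, state_representation=3)
torque = scale_action(0.5, torque_limit=2.0)  # torque_limit value is hypothetical
```

The three-dimensional form replaces the angle by its cosine and sine, so positions that differ by a full turn map to the same observation.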

Usage

For usage examples of the simulator class, see the scripts in the examples folder.

The gym environment is used, for example, in the DDPG training.