1: German Research Center for Artificial Intelligence
2: University of Bremen
Generating physical movement behaviours from their symbolic description is a long-standing challenge in artificial intelligence (AI) and robotics, requiring insights into numerical optimization methods as well as into formalizations from symbolic AI and reasoning. In this paper, a novel approach to finding a reward function from a symbolic description is proposed. The intended system behaviour is modelled as a hybrid automaton, which reduces the system state space to allow more efficient reinforcement learning. The approach is applied to bipedal walking by modelling the walking robot as a hybrid automaton over state space orthants, and is used with the compass walker to derive a reward that incentivizes following the hybrid automaton cycle. As a result, training times of reinforcement learning controllers are reduced while the final walking speed is increased. The approach can serve as a blueprint for generating reward functions from symbolic AI and reasoning.
The source code for the work described in the paper can be found here.
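To illustrate the general idea of a cycle-following reward, the sketch below shows a minimal, hypothetical Python implementation. It is not the paper's exact reward function: the state variables, the orthant sequence `ORTHANT_CYCLE`, and the reward values are illustrative assumptions; in the paper, the orthant cycle is derived from the hybrid automaton model of the compass walker.

```python
import numpy as np

# Hypothetical orthant cycle for a compass-gait walker: each entry is the
# sign pattern of the state (theta_stance, theta_swing,
# theta_stance_dot, theta_swing_dot) visited along one nominal step.
# The actual cycle in the paper comes from the hybrid automaton;
# this sequence is illustrative only.
ORTHANT_CYCLE = [
    (+1, -1, -1, +1),
    (+1, +1, -1, +1),
    (-1, +1, -1, -1),
    (-1, -1, +1, -1),
]

def orthant_of(state):
    """Map a continuous state to the sign pattern of its components,
    i.e. the orthant of the state space it lies in."""
    return tuple(+1 if x >= 0 else -1 for x in state)

def cycle_reward(state, current_index):
    """Reward progress along the orthant cycle:
    +1 when the state enters the next orthant of the cycle,
     0 while it remains in the current orthant,
    -1 if it leaves the intended cycle (illustrative values)."""
    orthant = orthant_of(np.asarray(state, dtype=float))
    next_index = (current_index + 1) % len(ORTHANT_CYCLE)
    if orthant == ORTHANT_CYCLE[next_index]:
        return 1.0, next_index      # progressed along the automaton cycle
    if orthant == ORTHANT_CYCLE[current_index]:
        return 0.0, current_index   # still in the expected orthant
    return -1.0, current_index      # left the intended cycle
```

Such a reward term can be added to a standard reinforcement learning loop, with `current_index` carried along as part of the environment's bookkeeping between steps.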
@inproceedings{HLGKK23,
author = { Harnack, Daniel and
L\"{u}th, Christoph and
Gross, Lukas and
Kumar, Shivesh and
Kirchner, Frank},
title = { Deriving Rewards for Reinforcement Learning from Symbolic Behaviour Descriptions of Bipedal Walking},
booktitle = {62nd {IEEE} Conference on Decision and Control ({CDC})},
address = {Marina Bay Sands, Singapore},
pages = {2135 -- 2140},
year = {2023},
publisher = {{IEEE}}
}