Autonomous Navigation of a Boat with Model-Based Reinforcement Learning

The overall objective of this project is to improve the flexibility and robustness of Model-Based Reinforcement Learning agents. We train and evaluate models in simulation, and we also deploy and evaluate them in the field. We have chosen a challenging robot, a Clearpath Kingfisher (the previous version of the HERON). Unmanned Surface Vessels have particularly challenging dynamics, which makes any navigation task more difficult.

How To Train Your HERON:

In this paper we apply Deep Reinforcement Learning (Deep RL) and Domain Randomization to solve a navigation task in a natural environment relying solely on a 2D laser scanner. We train a model-based RL agent in simulation to follow lake and river shores and apply it on a real Unmanned Surface Vehicle in a zero-shot setup. We demonstrate that even though the agent has not been trained in the real world, it can fulfill its task successfully and adapt to changes in the robot’s environment and dynamics. Finally, we show that the RL agent is more robust, faster, and more accurate than a state-aware Model-Predictive-Controller.

Richard, A., Aravecchia, S., Schillaci, T., Geist, M., & Pradalier, C. (2021). How to train your heron. IEEE Robotics and Automation Letters, 6(3), 5247–5252.
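As a rough illustration of the Domain Randomization idea mentioned in the abstract above, the sketch below resamples the simulated boat's parameters at the start of every episode, so that an agent trained on it never sees exactly the same dynamics twice. Everything in it is an assumption made for illustration: the `BoatParams` fields, the ranges in `PARAM_RANGES`, the assumed maximum thrust, and the toy one-dimensional surge model are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BoatParams:
    """Hypothetical per-episode simulation parameters (not from the paper)."""
    mass_kg: float
    linear_damping: float
    thrust_gain: float
    laser_noise_std_m: float


# Illustrative ranges only; the ranges actually used in the paper are not given here.
PARAM_RANGES = {
    "mass_kg": (25.0, 40.0),
    "linear_damping": (5.0, 20.0),
    "thrust_gain": (0.7, 1.3),
    "laser_noise_std_m": (0.0, 0.05),
}


def sample_params(rng: random.Random) -> BoatParams:
    """Draw one parameter set, uniformly and independently from each range."""
    return BoatParams(**{k: rng.uniform(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()})


def step_surge(velocity: float, throttle: float, p: BoatParams, dt: float = 0.1) -> float:
    """One explicit-Euler step of a toy surge model: m * dv/dt = gain * u * F_max - d * v."""
    max_force_n = 40.0  # assumed thruster force at full throttle
    accel = (p.thrust_gain * throttle * max_force_n - p.linear_damping * velocity) / p.mass_kg
    return velocity + dt * accel


def rollout(p: BoatParams, steps: int = 200) -> float:
    """Apply full throttle for `steps` steps from rest and return the final velocity."""
    v = 0.0
    for _ in range(steps):
        v = step_surge(v, 1.0, p)
    return v


if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        params = sample_params(rng)  # new dynamics at the start of every episode
        print(episode, params, round(rollout(params), 3))
```

In a real training loop the sampled parameters would configure the simulator before each episode; the full-throttle rollout here only shows that different draws produce visibly different behavior.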

The Heron dashing around the Symphonie Lake.

Learning Behaviors through Physics-driven Latent Imagination:

Model-based reinforcement learning (MBRL) consists in learning a so-called world model, a representation of the environment through interactions with it, then using it to train an agent. This approach is particularly interesting in the context of field robotics, as it alleviates the need to train online, and reduces the risks inherent to directly training agents on real robots. Generally, in such approaches, the world encompasses both the part related to the robot itself and the rest of the environment. We argue that decoupling the environment representation (for example, images or laser scans) from the dynamics of the physical system (that is, the robot and its physical state) can increase the flexibility of world models and open doors to greater robustness. In this paper, we apply this concept to a strong latent-agent, Dreamer. We then showcase the increased flexibility by transferring the environment part of the world model from one robot (a boat) to another (a rover), simply by adapting the physical model in the imagination. We additionally demonstrate the robustness of our method through real-world experiments on a boat.

Richard, A., Aravecchia, S., Geist, M., & Pradalier, C. (2022). Learning behaviors through physics-driven latent imagination. Conference on Robot Learning, 1190–1199.
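The decoupling described in the abstract above can be pictured with a small sketch: an imagined rollout in which the robot-specific physics model is a swappable function, while the environment transition is shared between robots. This is one possible reading of the idea, not the paper's implementation. The toy `boat_physics` and `rover_physics` models, their constants, the two-number stand-in latent, and the hand-written `policy` are all hypothetical; in Dreamer the latent transition and the actor are learned networks.

```python
import math
from typing import Callable, List, Tuple

State = Tuple[float, float]   # physical state: (forward velocity m/s, yaw rate rad/s)
Action = Tuple[float, float]  # (left, right) thruster or wheel command in [-1, 1]
Latent = Tuple[float, float]  # stand-in latent: (offset to shore m, heading error rad)
Physics = Callable[[State, Action, float], State]


def boat_physics(state: State, action: Action, dt: float) -> State:
    """Toy differential-thrust boat: slow, damped response (assumed constants)."""
    v, w = state
    left, right = action
    v_next = v + dt * (1.5 * (left + right) / 2.0 - 0.5 * v)
    w_next = w + dt * (2.0 * (right - left) / 2.0 - 1.0 * w)
    return (v_next, w_next)


def rover_physics(state: State, action: Action, dt: float) -> State:
    """Toy skid-steer rover: velocities follow the commands immediately."""
    left, right = action
    return ((left + right) / 2.0, 1.5 * (right - left) / 2.0)


def environment_step(latent: Latent, state: State, dt: float) -> Latent:
    """Placeholder for a learned latent transition, conditioned on the physical state only."""
    offset, heading = latent
    v, w = state
    heading_next = heading + dt * w
    offset_next = offset + dt * v * math.sin(heading_next)
    return (offset_next, heading_next)


def policy(latent: Latent, state: State) -> Action:
    """Hand-written placeholder for a learned actor: steer to cancel offset and heading error."""
    offset, heading = latent
    turn = max(-0.5, min(0.5, -0.5 * offset - 1.0 * heading))
    return (0.5 - turn, 0.5 + turn)


def imagine(physics: Physics, latent: Latent, state: State,
            horizon: int = 50, dt: float = 0.1) -> List[Latent]:
    """Roll out imagined latents; only `physics` changes from one robot to another."""
    trajectory = [latent]
    for _ in range(horizon):
        action = policy(latent, state)
        state = physics(state, action, dt)            # robot-specific part
        latent = environment_step(latent, state, dt)  # shared environment part
        trajectory.append(latent)
    return trajectory


if __name__ == "__main__":
    start_latent, start_state = (1.0, 0.0), (0.0, 0.0)
    for name, physics in (("boat", boat_physics), ("rover", rover_physics)):
        final_offset, final_heading = imagine(physics, start_latent, start_state)[-1]
        print(name, round(final_offset, 3), round(final_heading, 3))
```

Because `environment_step` never sees the action directly, switching from the boat to the rover only requires passing a different `physics` function to `imagine`; the environment part is reused unchanged.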

Real-world experiment. First row: overhead imagery of the deployment site, with the full trajectory of the agents in yellow. Center row: zoom on the bottom right corner of the lake; the trajectory of the agent can be seen in yellow. Last row: comparison of the forward velocities reached by the two agents. We invite the reader to refer to Appx. E for higher resolution images. Overhead imagery from Google Earth, 2021; trajectories plotted using the Google Earth KML API.