An Online Hyper-volume Action Bounding Approach for Accelerating the Process of Deep Reinforcement Learning from Multiple Controllers

Alireza Rastegarpanah; Ali Aflakian; Jamie Hathaway; Rustam Stolkin

doi:10.22541/au.168312072.24350238/v1

Ali Aflakian,
Alireza Rastegarpanah,
Jamie Hathaway,
Rustam Stolkin

Abstract

This paper fuses ideas from Reinforcement Learning (RL), Learning from Demonstration (LfD), and Ensemble Learning into a single paradigm. Knowledge from a mixture of control algorithms (experts) is used to constrain the action space of the agent, enabling faster RL refinement of a control policy by avoiding unnecessary exploratory actions. The domain-specific knowledge of each expert is exploited, yet the resulting policy is robust against errors of individual experts, since it is refined by an RL reward function without copying any particular demonstration. Our method has the potential to supplement existing RL from demonstration (RLfD) methods when multiple algorithmic approaches are available to function as experts. We illustrate our method in the context of a Visual Servoing (VS) task, in which a 7-DoF robot arm is controlled to maintain a desired pose relative to a target object. We explore four methods for bounding the actions of the RL agent during training: a hypercube with a modified loss function, a convex hull with a modified loss function, ignoring actions outside the convex hull, and projecting actions onto the convex hull. We compare the training progress of each method with and without the expert demonstrators. Our experiments show that using the convex hull with a modified loss function significantly improves training progress. Furthermore, we demonstrate faster VS error convergence while maintaining higher manipulability of the arm, compared to classical image-based VS, position-based VS, and hybrid-decoupled VS.
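As a rough illustration of two of the bounding ideas named in the abstract, the sketch below clips an agent action to the axis-aligned hypercube spanned by the expert actions and, separately, projects it onto their convex hull. It is not the authors' implementation: the function names, the use of a Frank-Wolfe iteration for the projection, and the toy 2-D expert actions are all assumptions made for illustration, and the modified loss functions mentioned in the abstract are not modelled here.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def hypercube_bounds(expert_actions: Sequence[Sequence[float]]) -> Tuple[Vector, Vector]:
    """Per-dimension min and max over the expert actions (hypothetical helper)."""
    lows = [min(column) for column in zip(*expert_actions)]
    highs = [max(column) for column in zip(*expert_actions)]
    return lows, highs


def clip_to_hypercube(action: Sequence[float], lows: Vector, highs: Vector) -> Vector:
    """Clamp each action component to the box spanned by the experts."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, lows, highs)]


def project_onto_hull(action: Sequence[float],
                      expert_actions: Sequence[Sequence[float]],
                      iters: int = 200) -> Vector:
    """Approximate Euclidean projection onto the convex hull of the expert
    actions, using Frank-Wolfe with exact line search (an assumed technique,
    chosen here only because it needs no external solver)."""
    x = list(expert_actions[0])  # start at a hull vertex, so x stays in the hull
    for _ in range(iters):
        # Gradient of 0.5 * ||x - action||^2 with respect to x.
        grad = [xi - ai for xi, ai in zip(x, action)]
        # Hull vertex that minimises the linearised objective.
        s = min(expert_actions, key=lambda e: sum(g * ei for g, ei in zip(grad, e)))
        d = [si - xi for si, xi in zip(s, x)]
        dd = sum(di * di for di in d)
        if dd == 0.0:
            break
        # Exact line-search step, clamped to [0, 1] to remain inside the hull.
        gamma = -sum(g * di for g, di in zip(grad, d)) / dd
        gamma = min(max(gamma, 0.0), 1.0)
        if gamma == 0.0:
            break  # no descent direction towards any vertex: x is optimal
        x = [xi + gamma * di for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    # Toy 2-D expert actions (illustrative values only).
    experts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    agent_action = [2.0, 2.0]
    lows, highs = hypercube_bounds(experts)
    print(clip_to_hypercube(agent_action, lows, highs))
    print(project_onto_hull(agent_action, experts))
```

In this toy case the hypercube-clipped action lands on a corner of the box that lies outside the triangle formed by the experts, whereas the projected action lies on the triangle itself. This is one way to see why a convex hull can be a tighter bound than a hypercube.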

03 May 2023: Submitted to Journal of Field Robotics

03 May 2023: Assigned to Editor

03 May 2023: Submission Checks Completed

10 May 2023: Review(s) Completed, Editorial Evaluation Pending

15 May 2023: Reviewer(s) Assigned

15 Aug 2023: Editorial Decision: Revise Major

16 Oct 2023: 1st Revision Received

16 Oct 2023: Review(s) Completed, Editorial Evaluation Pending

16 Oct 2023: Assigned to Editor

16 Oct 2023: Submission Checks Completed

18 Oct 2023: Reviewer(s) Assigned

24 Jan 2024: 2nd Revision Received

29 Jan 2024: Submission Checks Completed

29 Jan 2024: Assigned to Editor

29 Jan 2024: Review(s) Completed, Editorial Evaluation Pending

31 Jan 2024: Reviewer(s) Assigned

Peer review status: UNDER REVIEW