Abstract
Reward shaping is a well-established family of techniques that have been successfully used to improve the performance and learning speed of Reinforcement Learning agents in single-objective problems. Here we extend the guarantees of Potential-Based Reward Shaping (PBRS) by providing theoretical proof that PBRS does not alter the true Pareto front in multi-objective reinforcement learning (MORL) domains. We also contribute the first empirical studies of the effect of PBRS in MORL problems.
Original language | English (Ireland) |
---|---|
Media of output | Workshops |
Publication status | Published - 1 May 2017 |