Potential-Based Reward Shaping Preserves Pareto Optimal Policies

Research output: Other contribution (Published)

Abstract

Reward shaping is a well-established family of techniques that has been successfully used to improve the performance and learning speed of Reinforcement Learning agents in single-objective problems. Here we extend the guarantees of Potential-Based Reward Shaping (PBRS) by providing theoretical proof that PBRS does not alter the true Pareto front in multi-objective reinforcement learning (MORL) domains. We also contribute the first empirical studies of the effect of PBRS in MORL problems.
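As a minimal sketch (not taken from the paper) of how potential-based shaping carries over to the multi-objective setting described in the abstract, the snippet below applies the standard shaping term F(s, s') = γΦ(s') − Φ(s) component-wise to a vector reward. The function name `shaped_reward`, the potential `phi`, and the toy two-objective example are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def shaped_reward(reward_vec, phi, s, s_next, gamma=0.99):
    """Apply potential-based shaping component-wise to a vector reward.

    reward_vec : per-objective rewards r(s, a, s') as a NumPy array
    phi        : callable mapping a state to a vector of potentials,
                 one entry per objective (a hypothetical choice here)
    The shaping term gamma * phi(s_next) - phi(s) is the standard
    potential-based form; the paper's result is that shaping of this
    form leaves the true Pareto front of the underlying problem unchanged.
    """
    return reward_vec + gamma * phi(s_next) - phi(s)

# Toy usage: a two-objective step where only the first objective is shaped.
phi = lambda s: np.array([-abs(s - 10), 0.0])  # hand-crafted potential
r = np.array([0.0, -1.0])                      # raw vector reward
print(shaped_reward(r, phi, s=3, s_next=4))
```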
Original language: English (Ireland)
Media of output: Workshops
Publication status: Published - 1 May 2017
