Abstract
Reward shaping is a well-established family of techniques that have been successfully used to improve the performance and learning speed of Reinforcement Learning agents in single-objective problems. Here we extend the guarantees of Potential-Based Reward Shaping (PBRS) by providing a theoretical proof that PBRS does not alter the true Pareto front in Multi-Objective Reinforcement Learning (MORL) domains. We also contribute the first empirical studies of the effect of PBRS in MORL problems.
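
As a rough illustration of why a result of this kind is plausible, the standard telescoping argument for potential-based shaping can be written per objective. This is a sketch under stated assumptions and not necessarily the proof given in the paper: the vector-valued reward $\mathbf{R}$, the vector-valued potential $\boldsymbol{\Phi}$ (one component per objective), the discount factor $\gamma < 1$, and the boundedness of $\boldsymbol{\Phi}$ are all notation and assumptions introduced here.

```latex
% Shaping term added to the reward vector (assumed form, one potential per objective)
\mathbf{F}(s, a, s') = \gamma\,\boldsymbol{\Phi}(s') - \boldsymbol{\Phi}(s)

% Along any trajectory s_0, a_0, s_1, \dots the shaping terms telescope
\sum_{t=0}^{\infty} \gamma^{t}\,\mathbf{F}(s_t, a_t, s_{t+1})
  = \lim_{T \to \infty} \gamma^{T}\boldsymbol{\Phi}(s_T) - \boldsymbol{\Phi}(s_0)
  = -\boldsymbol{\Phi}(s_0)

% So the shaped value vector of every policy \pi is shifted by the same constant
\mathbf{V}'_{\pi}(s_0) = \mathbf{V}_{\pi}(s_0) - \boldsymbol{\Phi}(s_0)
```

Because every policy's value vector at $s_0$ is translated by the same amount, component-wise comparisons between policies are unchanged under these assumptions, so the set of non-dominated policies stays the same.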
| Original language | English (Ireland) |
|---|---|
| Media of output | Workshops |
| Publication status | Published - 1 May 2017 |