Abstract
The goal of multi-objective problems is to find solutions that balance different objectives. When multi-objective problems are solved using reinforcement learning, linear scalarisation techniques are generally used; however, system expertise is required to optimise the scalarisation weights. Thresholded Lexicographic Ordering (TLO) is one technique that avoids the need for an expert to specify weights; instead, a system designer can directly specify a preferred ordering over objectives, along with a desired threshold value for each objective. In this paper, we propose a novel algorithm to dynamically set thresholds for use with TLO. We also present the first evaluation of TLO in a complex multi-objective multi-agent problem, the Dynamic Economic Emissions Dispatch domain. Our empirical results demonstrate that TLO with our dynamic thresholding algorithm achieves superior results when compared with a hand-tuned linear scalarisation method from previously published work.
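To make the idea of an ordering with thresholds concrete, the following is a minimal sketch of TLO-style greedy action selection. It assumes the commonly described form of TLO, in which each objective's estimated value is clamped at its threshold and actions are then compared lexicographically in priority order. It is an illustration only, not the dynamic thresholding algorithm proposed in the paper; the function name `tlo_select`, the layout of `q_values`, and the example numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def tlo_select(q_values: Sequence[Sequence[float]],
               thresholds: Sequence[float]) -> int:
    """Return the index of the action preferred under thresholded
    lexicographic ordering (illustrative sketch, hypothetical names).

    q_values[a][i] is the estimated value of action a for objective i,
    with objectives listed in priority order and higher values better.
    thresholds[i] caps objective i; objectives beyond len(thresholds)
    are left unclamped (an assumption of this sketch).
    """
    def clamped(q: Sequence[float]) -> Tuple[float, ...]:
        # Values above a threshold are treated as equally acceptable,
        # so the comparison falls through to lower-priority objectives.
        head: List[float] = [min(v, t) for v, t in zip(q, thresholds)]
        tail = list(q[len(thresholds):])
        return tuple(head + tail)

    # Tuples compare lexicographically; ties go to the lowest index.
    return max(range(len(q_values)), key=lambda a: clamped(q_values[a]))


# Three actions, two objectives (made-up values).
q = [[5.0, 1.0], [3.0, 4.0], [2.0, 9.0]]
best = tlo_select(q, thresholds=[3.0])
```

With a threshold of 3.0 on the first objective, actions 0 and 1 both count as satisfying it, so the second objective decides between them. Raising the threshold above every value reduces the rule to a plain lexicographic ordering, while lowering it below every value hands the decision entirely to the second objective.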
| Original language | English |
|---|---|
| Publication status | Published - 2020 |
| Event | Adaptive and Learning Agents Workshop, ALA 2020 at AAMAS 2020 - Auckland, New Zealand. Duration: 9 May 2020 → 10 May 2020 |
Conference
| Conference | Adaptive and Learning Agents Workshop, ALA 2020 at AAMAS 2020 |
|---|---|
| Country/Territory | New Zealand |
| City | Auckland |
| Period | 9/05/20 → 10/05/20 |
Keywords
- Multi-agent systems
- Multi-objective
- Reinforcement Learning