TY - GEN
T1 - Learning to shoot in first person shooter games by stabilizing actions and clustering rewards for reinforcement learning
AU - Glavin, Frank G.
AU - Madden, Michael G.
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/11/4
Y1 - 2015/11/4
N2 - While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real time are beginning to receive more attention. A challenge in such environments is that the time that elapses between deciding to take an action and receiving a reward based on its outcome can be longer than the interval between successive decisions. We explore this in the context of a non-player character (NPC) in a modern first-person shooter game. Such games take place in 3D environments where players, both human and computer-controlled, compete by engaging in combat and completing task objectives. We investigate the use of RL to enable NPCs to gather experience from game-play and improve their shooting skill over time from a reward signal based on the damage caused to opponents. We propose a new method for RL updates and reward calculations, in which updates are carried out periodically, after each shooting encounter has ended, and a new weighted-reward mechanism increases the reward applied to actions that lead to damaging the opponent with successive hits, in what we term 'hit clusters'.
AB - While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real time are beginning to receive more attention. A challenge in such environments is that the time that elapses between deciding to take an action and receiving a reward based on its outcome can be longer than the interval between successive decisions. We explore this in the context of a non-player character (NPC) in a modern first-person shooter game. Such games take place in 3D environments where players, both human and computer-controlled, compete by engaging in combat and completing task objectives. We investigate the use of RL to enable NPCs to gather experience from game-play and improve their shooting skill over time from a reward signal based on the damage caused to opponents. We propose a new method for RL updates and reward calculations, in which updates are carried out periodically, after each shooting encounter has ended, and a new weighted-reward mechanism increases the reward applied to actions that lead to damaging the opponent with successive hits, in what we term 'hit clusters'.
UR - https://www.scopus.com/pages/publications/84964555950
U2 - 10.1109/CIG.2015.7317928
DO - 10.1109/CIG.2015.7317928
M3 - Conference Publication
AN - SCOPUS:84964555950
T3 - 2015 IEEE Conference on Computational Intelligence and Games, CIG 2015 - Proceedings
SP - 344
EP - 351
BT - 2015 IEEE Conference on Computational Intelligence and Games, CIG 2015 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2015 IEEE Conference on Computational Intelligence and Games, CIG 2015
Y2 - 31 August 2015 through 2 September 2015
ER -