COMBINING CURRICULUM LEARNING AND BEHAVIOUR CLONING TO TRAIN FIRST-PERSON SHOOTER AGENTS
Pedro Almeida (a17564@alunos.ipca.pt) MEDJD
Vitor Carvalho (vcarvalho@ipca.pt)
Alberto Simões (asimoes@ipca.pt)
Keywords
Reinforcement Learning, Unity, First-Person Shooter, Bot, Artificial Intelligence
Abstract
Reinforcement Learning is a machine learning paradigm in which, without labelled
data, one or more agents learn by balancing exploration and exploitation of an environment.
In first-person shooter video games, AI bots are used extensively in multiplayer modes. By
applying Reinforcement Learning techniques, we hope to improve their performance and bring
them up to human skill levels.
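As a textbook illustration of this trade-off (standard notation, not specific to this work): an ε-greedy policy selects the action the agent currently values most, a_t = argmax_a Q(s_t, a), with probability 1 − ε, and a uniformly random action with probability ε, so a small ε keeps the agent exploring while it mostly exploits what it has already learned.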
In this dissertation, we compare two Reinforcement Learning training methods applied
to first-person shooter games, Curriculum Learning and Behaviour Cloning, and combine them
to evaluate whether they yield better performance together. To do this, we use Unity's
ML-Agents toolkit to train agents with each method separately and then with both combined;
a sketch of such a combined configuration is given below. The trained agents then battle one
another, and performance is evaluated through the victories each obtains.
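As a concrete sketch of how the two methods can be combined, the trainer configuration below declares both a behavioural-cloning signal and a curriculum in ML-Agents' YAML format. The behaviour name ShooterAgent, the demonstration file Demos/ExpertFPS.demo, and the difficulty parameter enemy_count are placeholders rather than this project's actual values; the keys themselves (behavioral_cloning, environment_parameters, curriculum) follow the ML-Agents documentation.

    behaviors:
      ShooterAgent:                        # placeholder behaviour name
        trainer_type: ppo
        hyperparameters:
          batch_size: 1024
          buffer_size: 10240
          learning_rate: 3.0e-4
        network_settings:
          hidden_units: 256
          num_layers: 2
        reward_signals:
          extrinsic:
            gamma: 0.99
            strength: 1.0
        behavioral_cloning:                # imitation of pre-recorded demonstrations
          demo_path: Demos/ExpertFPS.demo  # placeholder .demo file
          strength: 0.5                    # weight of the cloning loss
          steps: 150000                    # anneal the cloning influence over these steps
        max_steps: 5.0e6

    environment_parameters:
      enemy_count:                         # placeholder difficulty parameter
        curriculum:
          - name: OneEnemy
            completion_criteria:
              measure: reward              # advance on mean episode reward
              behavior: ShooterAgent
              threshold: 0.8               # environment-specific value
              min_lesson_length: 100
            value: 1.0
          - name: ThreeEnemies             # final lesson needs no criteria
            value: 3.0

Training would then be launched with mlagents-learn config.yaml --run-id=<run name>, and on the Unity side the current lesson value is read with Academy.Instance.EnvironmentParameters.GetWithDefault("enemy_count", 1f).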
In the experiments conducted so far, curriculum learning and behaviour cloning appear
to be incompatible: behaviour cloning makes the agents too reliant on the pre-recorded
demonstration data, and they fail to progress through the curriculum's stages.
References
McPartland, M.; Gallagher, M. "Reinforcement Learning in First Person Shooter Games," in IEEE Transactions
on Computational Intelligence and AI in Games, vol. 3, no. 1, pp. 43-56, March 2011.
https://doi.org/10.1109/TCIAIG.2010.2100395
Silver, D.; Hubert, T.; Schrittwieser, J.; Antonoglou, I.; Lai, M.; Guez, A.; Lanctot, M.; Sifre, L.; Kumaran, D.;
Graepel, T.; Lillicrap, T.; Simonyan, K.; Hassabis, D. "A general reinforcement learning algorithm that masters
chess, shogi, and Go through self-play", Science, vol. 362, no. 6419, pp. 1140-1144, 2018.
https://doi.org/10.1126/science.aar6404
Jaderberg, M.; Czarnecki, W.; Dunning, I.; Marris, L.; Lever, G.; Castaneda, A.; Beattie, C.; Rabinowitz, N.;
Morcos, A.; Ruderman, A.; Sonnerat, N.; Green, T.; Deason, L.; Leibo, J.; Silver, D.; Hassabis, D.; Kavukcuoglu,
K.; Graepel, T. "Human-level performance in first-person multiplayer games with population-based deep
reinforcement learning", arXiv:1807.01281 [cs.LG], Jul. 2018. https://doi.org/10.48550/arXiv.1807.01281
Wydmuch, M.; Kempka, M.; Jaśkowski, W. "ViZDoom Competitions: Playing Doom from Pixels", arXiv:1809.03470
[cs.AI], Sep. 2018. https://doi.org/10.48550/arXiv.1809.03470