
COMBINING CURRICULUM LEARNING AND BEHAVIOUR CLONING TO TRAIN FIRST-PERSON SHOOTER AGENTS

Pedro Almeida (a17564@alunos.ipca.pt) MEDJD

Vitor Carvalho (vcarvalho@ipca.pt)

Alberto Simões (asimoes@ipca.pt)


Keywords

Reinforcement Learning, Unity, First-Person Shooter, Bot, Artificial Intelligence


Abstract

Reinforcement Learning is one of the many machine learning paradigms. It requires no labelled data and is instead concerned with balancing exploration and exploitation of an environment by one or more agents acting in it. In First-Person Shooter video games, AI bots are used extensively in multiplayer modes. By applying Reinforcement Learning techniques, we hope to improve their performance and bring them to human skill levels.
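As a simple illustration of the exploration/exploitation trade-off mentioned above, the sketch below uses the textbook epsilon-greedy rule: with probability epsilon the agent tries a random action (exploration), otherwise it takes the action with the highest estimated value (exploitation). This is a generic example, not the policy our agents use; the action names and values are hypothetical.

    # Epsilon-greedy action selection: the simplest exploration/exploitation trade-off.
    import random

    def epsilon_greedy(q_values: dict, epsilon: float = 0.1) -> str:
        """Pick an action from a mapping of action name -> estimated value."""
        if random.random() < epsilon:
            return random.choice(list(q_values))      # explore: try any action
        return max(q_values, key=q_values.get)        # exploit: best-known action

    # Hypothetical value estimates for three FPS actions
    print(epsilon_greedy({"shoot": 0.7, "take_cover": 0.4, "reload": 0.1}))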


In this dissertation, we compare two Reinforcement Learning training methods applied to First-Person Shooter games, Curriculum Learning and Behaviour Cloning, and combine them to evaluate whether the combination yields better performance. To do this, we use Unity's ML-Agents Toolkit to train agents with each method separately and then with both combined. Afterwards, we pit the trained agents against one another and evaluate performance through the victories each obtains.
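As a concrete reference point, the minimal sketch below shows how behaviour cloning and a curriculum can be declared together in a single ML-Agents training configuration, assuming a recent ML-Agents release (2.x). The behaviour name ShooterAgent, the demonstration file demos/expert.demo, the environment parameter enemy_count and all numeric thresholds are hypothetical placeholders, not values from our experiments.

    # Sketch: write an ML-Agents config that enables both behavioural cloning
    # (learning from a pre-recorded demonstration) and a curriculum over an
    # environment parameter. Requires PyYAML; all values are placeholders.
    import yaml

    config = {
        "behaviors": {
            "ShooterAgent": {                          # hypothetical behaviour name
                "trainer_type": "ppo",
                "max_steps": 5000000,
                "behavioral_cloning": {                # pre-recorded demonstration data
                    "demo_path": "demos/expert.demo",
                    "strength": 0.5,                   # weight of the cloning loss
                },
            }
        },
        "environment_parameters": {
            "enemy_count": {                           # hypothetical curriculum parameter
                "curriculum": [
                    {
                        "name": "OneEnemy",
                        "completion_criteria": {
                            "measure": "reward",
                            "behavior": "ShooterAgent",
                            "threshold": 0.8,          # advance once mean reward passes this
                        },
                        "value": 1,
                    },
                    {"name": "TwoEnemies", "value": 2},
                ]
            }
        },
    }

    with open("shooter_config.yaml", "w") as f:
        yaml.safe_dump(config, f, sort_keys=False)

    # Training would then be launched with:
    #   mlagents-learn shooter_config.yaml --run-id=combined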

With the progress made so far, our experiments suggest that curriculum learning and behaviour cloning are incompatible when combined: behaviour cloning makes the agents too reliant on the pre-recorded demonstrations, and they fail to progress through the curriculum's stages.

References

McPartland, M.; Gallagher, M. "Reinforcement Learning in First Person Shooter Games", IEEE Transactions on Computational Intelligence and AI in Games, vol. 3, no. 1, pp. 43-56, March 2011. https://doi.org/10.1109/TCIAIG.2010.2100395

Silver, D.; Hubert, T.; Schrittwieser, J.; Antonoglou, I.; Lai, M.; Guez, A.; Lanctot, M.; Sifre, L.; Kumaran, D.; Graepel, T.; Lillicrap, T.; Simonyan, K.; Hassabis, D. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play", Science, vol. 362, no. 6419, pp. 1140-1144, 2018. https://doi.org/10.1126/science.aar6404

Jaderberg, M.; Czarnecki, W.; Dunning, I.; Marris, L.; Lever, G.; Castaneda, A.; Beattie, C.; Rabinowitz, N.; Morcos, A.; Ruderman, A.; Sonnerat, N.; Green, T.; Deason, L.; Leibo, J.; Silver, D.; Hassabis, D.; Kavukcuoglu, K.; Graepel, T. "Human-level performance in first-person multiplayer games with population-based deep reinforcement learning", arXiv:1807.01281 [cs.LG], Jul. 2018. https://doi.org/10.48550/arXiv.1807.01281

Wydmuch, M.; Kempka, M.; Jaśkowski, W. "ViZDoom Competitions: Playing Doom from Pixels", arXiv:1809.03470 [cs.AI], Sep. 2018. https://doi.org/10.48550/arXiv.1809.03470

