Reinforcement Learning

Let us now instantiate our agent for the CartPole environment and train it:

env_string = 'CartPole-v0'

agent = DQN(env_string)

scores = agent.train()

The following screenshot shows the agent being trained on my system. The agent reached our set threshold of 45 in 254 steps:

Figure 2: Agent training for the CartPole environment, achieving the target threshold within 254 steps
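
The stopping rule itself is not shown on this page. As a rough sketch only, assuming the threshold is compared against the mean score over a window of recent episodes, such a check might look like the following. The function name `first_episode_reaching` and the default window size are hypothetical and are not taken from the DQN class:

```python
from collections import deque


def first_episode_reaching(scores, threshold, window=100):
    """Return the 1-based episode at which the running mean of the
    last `window` scores first exceeds `threshold`, or None if it never does.

    Hypothetical helper: the window size is an assumption, not the book's value.
    """
    recent = deque(maxlen=window)  # keeps only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if sum(recent) / len(recent) > threshold:
            return episode
    return None
```

Called on a list of per-episode scores, for example `first_episode_reaching(scores, threshold=45)`, it would report when training could have stopped under this assumed rule.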

The average reward as the agent learns can be plotted as follows:

import matplotlib.pyplot as plt

plt.plot(scores)

plt.show()

Figure 3: Average agent reward plot
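
If `scores` holds raw per-episode values rather than values that are already averaged (this page does not say which), the curve can be noisy. A minimal sketch of a trailing moving average that could be applied before plotting is given here; `moving_average` and its `window` parameter are illustrative names, not part of the agent's code:

```python
def moving_average(scores, window=10):
    """Return a list where element i is the mean of the last `window`
    scores up to and including index i (fewer at the start of the list)."""
    smoothed = []
    running_sum = 0.0
    for i, score in enumerate(scores):
        running_sum += score
        if i >= window:
            # Drop the score that just fell out of the trailing window.
            running_sum -= scores[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed
```

The result has the same length as the input, so it can be passed to plt.plot in place of scores.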

Once the training is done you can close the environment:

agent.env.close()

You can see that, starting with no information about how to balance the pole, the agent using DQN is able to balance it for longer and longer (on average) as it learns. Starting from a blank slate, the agent builds up the knowledge it needs to fulfill the required goal. Remarkable!
