16.03.2021 Views

Advanced Deep Learning with Keras


Chapter 9

Figure 9.5.1: Frozen lake environment in OpenAI Gym

Applying an action to FrozenLake-v0 returns the observation (equivalent to the next state), the reward, done (whether the episode is finished), and a dictionary of debugging information. The observable attributes of the environment, known as the observation space, are captured by the returned observation object.
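The four returned values can be illustrated without installing Gym. The sketch below is a hypothetical, deterministic stand-in for the environment, not Gym's own code: the class name `ToyFrozenLake`, the 4x4 map layout, the action numbering, and the `"tile"` key in the info dictionary are all assumptions made for illustration.

```python
class ToyFrozenLake:
    """Hypothetical deterministic 4x4 stand-in for FrozenLake-v0 (not Gym code)."""

    # Assumed layout: S=start, F=frozen, H=hole, G=goal.
    MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
    # Assumed action numbering: 0=left, 1=down, 2=right, 3=up.
    MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

    def reset(self):
        self.row, self.col = 0, 0
        return 0  # observation: index of the start state

    def step(self, action):
        d_row, d_col = self.MOVES[action]
        # Clamp to the grid so that moving into a wall leaves the agent in place.
        self.row = min(max(self.row + d_row, 0), 3)
        self.col = min(max(self.col + d_col, 0), 3)
        tile = self.MAP[self.row][self.col]
        observation = self.row * 4 + self.col  # next state as a single index
        reward = 1.0 if tile == "G" else 0.0   # reward only at the Goal
        done = tile in "HG"                    # episode ends at a hole or the Goal
        info = {"tile": tile}                  # debugging information
        return observation, reward, done, info


env = ToyFrozenLake()
state = env.reset()
observation, reward, done, info = env.step(2)  # move right from the start
```

The real environment additionally supports a slippery mode in which the executed move can differ from the requested one; this toy version omits that.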

Generalized Q-Learning can be applied to the FrozenLake-v0 environment. Table 9.5.1 shows the improvement in performance for both the slippery and non-slippery environments. One way to measure the performance of the policy is the percentage of executed episodes that reach the Goal state. The higher the percentage, the better. From a baseline of about 1.5% under pure exploration (random actions), the policy reaches the Goal state in about 76% of episodes in the non-slippery environment and about 71% in the slippery environment. As expected, the slippery environment is harder to control.
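The performance metric described here can be sketched in a few lines. The helper name `goal_percentage` is hypothetical, and the sketch assumes that an episode's final reward is positive only when the Goal state was reached.

```python
def goal_percentage(final_rewards):
    """Percentage of episodes whose final reward was positive (Goal reached)."""
    if not final_rewards:
        return 0.0
    wins = sum(1 for reward in final_rewards if reward > 0)
    return 100.0 * wins / len(final_rewards)


# Four hypothetical episodes, two of which reached the Goal.
rate = goal_percentage([1.0, 0.0, 0.0, 1.0])
```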

The code can still be implemented in Python and NumPy since it only requires a Q-Table. Listing 9.5.1 shows the implementation of the QAgent class, while Listing 9.5.2 demonstrates the agent's perception-action-learning loop. Apart from using the FrozenLake-v0 environment from OpenAI Gym, the most important change is the implementation of generalized Q-Learning, as defined by Equation 9.5.1, in the update_q_table() function.
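Since Equation 9.5.1 and the listings are not reproduced on this page, the following is only a minimal sketch of what such an update could look like. It assumes the standard temporal-difference form of Q-Learning with a learning rate. The function signature and the values of `learning_rate` and `gamma` are illustrative assumptions, not the book's code.

```python
import numpy as np


def update_q_table(q_table, state, action, reward, next_state,
                   learning_rate=0.1, gamma=0.9):
    """One tabular Q-Learning step; modifies q_table in place (assumed form)."""
    # Target: immediate reward plus discounted best value of the next state.
    td_target = reward + gamma * np.max(q_table[next_state])
    # Move the current estimate a fraction of the way toward the target.
    td_error = td_target - q_table[state, action]
    q_table[state, action] += learning_rate * td_error
    return q_table


# 16 states and 4 actions, matching a 4x4 grid with four moves (assumed sizes).
q_table = np.zeros((16, 4))
update_q_table(q_table, state=14, action=2, reward=1.0, next_state=15)
```

With a learning rate of 1, the update reduces to overwriting the entry with the target, which is why a smaller rate is the natural choice when transitions are stochastic, as in the slippery mode.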

The qagent object can operate in either slippery or non-slippery mode. The agent is trained for 40,000 iterations. After training, the agent can exploit the Q-Table to choose the action to execute given any policy, as shown in the test mode of Table 9.5.1. Using the learned policy gives a large performance boost, as demonstrated in Table 9.5.1. With Gym, much of the code for constructing the environment is no longer needed.
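Exploiting the Q-Table after training amounts to picking the highest-valued action for the current state. The sketch below assumes an epsilon-greedy rule, where `epsilon=0.0` gives pure exploitation and `epsilon=1.0` gives the pure-exploration baseline. The function name `act` and its parameters are hypothetical and not taken from the book's listings.

```python
import numpy as np


def act(q_table, state, epsilon=0.0, rng=None):
    """Pick an action: random with probability epsilon, otherwise greedy."""
    rng = rng or np.random.default_rng()
    if rng.random() < epsilon:
        # Explore: sample uniformly among all actions.
        return int(rng.integers(q_table.shape[1]))
    # Exploit: take the action with the largest Q-value for this state.
    return int(np.argmax(q_table[state]))


q_table = np.zeros((16, 4))
q_table[0, 2] = 1.0              # pretend training found "right" best in state 0
greedy_action = act(q_table, state=0)
```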

