16.03.2021 Views

Advanced Deep Learning with Keras

Q-Learning in Python

The environment and the Q-Learning algorithm discussed in the previous section can be implemented in Python. Since the policy is just a simple table, there is no need for Keras at this point. Listing 9.3.1 shows q-learning-9.3.1.py, the implementation of the simple deterministic world (environment, agent, action, and Q-Table algorithms) using the QWorld class. For conciseness, the functions dealing with the user interface are not shown.
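To make the "policy as a table" idea concrete, the following is a minimal sketch rather than code from the listing. It assumes six states and four actions, uses plain Python lists instead of NumPy to stay dependency-free, and fills in two made-up Q values. The helper name greedy_action is hypothetical.

```python
N_STATES, N_ACTIONS = 6, 4  # six states, four actions (0-Left, 1-Down, 2-Right, 3-Up)

# One row per state, one column per action; all entries start at zero.
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

# Made-up values for illustration only, not learned by any algorithm here.
q_table[0][2] = 81.0   # state 0, action Right
q_table[0][1] = 72.9   # state 0, action Down


def greedy_action(q_table, state):
    """Return the action index with the largest Q value for this state.

    Ties resolve to the lowest action index.
    """
    row = q_table[state]
    return max(range(len(row)), key=row.__getitem__)
```

Reading the policy is then a single lookup, for example greedy_action(q_table, 0). A state whose row is still all zeros falls back to action 0, which is one reason some exploration is needed early in training.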

Chapter 9

In this example, the environment dynamics are represented by self.transition_table. At every action, self.transition_table determines the next state. The reward for executing an action is stored in self.reward_table. Both tables are consulted by the step() function every time an action is executed. The Q-Learning algorithm is implemented by the update_q_table() function. Every time the agent needs to decide which action to take, it calls the act() function. The action may be drawn at random or decided by the policy using the Q-Table. The probability that the chosen action is random is stored in the self.epsilon variable, which is updated by the update_epsilon() function using a fixed epsilon_decay.
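The roles of the two tables and the four functions described above can be sketched as follows. This is not the book's QWorld class: MiniQWorld, move, reward_for, and train are hypothetical names, and the 2 x 3 grid layout, the goal and hole states, the +100/-100 rewards, and all hyperparameter values are assumptions chosen for illustration. The update rule assumes a deterministic world, where Q(s, a) can be set directly to r + gamma * max Q(s', a'). Plain lists replace NumPy to keep the sketch dependency-free.

```python
import random

LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3  # action encoding from the listing's comments
N_ROWS, N_COLS = 2, 3               # assumed layout: six states, numbered row by row
N_STATES, N_ACTIONS = N_ROWS * N_COLS, 4
GOAL, HOLE = 2, 5                   # assumed terminal states
TERMINAL = (GOAL, HOLE)


def move(state, action):
    """Next state for a deterministic move; walls keep the agent in place."""
    row, col = divmod(state, N_COLS)
    if action == LEFT:
        col = max(col - 1, 0)
    elif action == RIGHT:
        col = min(col + 1, N_COLS - 1)
    elif action == UP:
        row = max(row - 1, 0)
    elif action == DOWN:
        row = min(row + 1, N_ROWS - 1)
    return row * N_COLS + col


def reward_for(state, action):
    """Assumed rewards: +100 for entering GOAL, -100 for entering HOLE, else 0."""
    if state in TERMINAL:
        return 0.0
    nxt = move(state, action)
    return 100.0 if nxt == GOAL else -100.0 if nxt == HOLE else 0.0


class MiniQWorld:
    """Hypothetical stand-in for QWorld; only the method names follow the text."""

    def __init__(self, gamma=0.9, epsilon=0.9, epsilon_decay=0.9,
                 epsilon_min=0.1, seed=0):
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.gamma = gamma
        self.epsilon = epsilon
        self.epsilon_decay = epsilon_decay
        self.epsilon_min = epsilon_min
        actions = range(N_ACTIONS)
        self.transition_table = [[move(s, a) for a in actions]
                                 for s in range(N_STATES)]
        self.reward_table = [[reward_for(s, a) for a in actions]
                             for s in range(N_STATES)]
        self.q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
        self.state = 0

    def reset(self):
        """Put the agent back in the start state."""
        self.state = 0
        return self.state

    def step(self, action):
        """Look up the reward and next state in the two tables."""
        reward = self.reward_table[self.state][action]
        self.state = self.transition_table[self.state][action]
        return self.state, reward, self.state in TERMINAL

    def act(self):
        """Epsilon-greedy: random with probability epsilon, else greedy."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(N_ACTIONS)
        row = self.q_table[self.state]
        return max(range(N_ACTIONS), key=row.__getitem__)

    def update_q_table(self, state, action, reward, next_state):
        """Deterministic-world update: Q(s, a) = r + gamma * max Q(s', a')."""
        best_next = max(self.q_table[next_state])
        self.q_table[state][action] = reward + self.gamma * best_next

    def update_epsilon(self):
        """Shrink epsilon by a fixed factor until it reaches epsilon_min."""
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay


def train(world, episodes=500, max_steps=50):
    """Run episodes from the start state, decaying epsilon after each one."""
    for _ in range(episodes):
        state = world.reset()
        for _ in range(max_steps):
            action = world.act()
            next_state, reward, done = world.step(action)
            world.update_q_table(state, action, reward, next_state)
            state = next_state
            if done:
                break
        world.update_epsilon()
    return world
```

A possible usage is world = train(MiniQWorld()), followed by inspecting world.q_table row by row. Under these assumptions the rows of the terminal states stay at zero because every episode ends on entering them, so the value for moving right from state 1 is exactly the goal reward once that pair has been visited.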

Before executing the code in Listing 9.3.1, we need to install the termcolor package, which helps in visualizing text outputs on the terminal:

$ sudo pip3 install termcolor

The complete code can be found on GitHub at: https://github.com/PacktPublishing/Advanced-Deep-Learning-with-Keras.

Listing 9.3.1, q-learning-9.3.1.py. A simple deterministic MDP with six states:

from collections import deque
import numpy as np
import argparse
import os
import time
from termcolor import colored


class QWorld():
    def __init__(self):
        # 4 actions
        # 0 - Left, 1 - Down, 2 - Right, 3 - Up
