16.03.2021

Advanced Deep Learning with Keras

Deep Reinforcement Learning

        # terminal Hole state: every action leaves the agent in state 5
        self.transition_table[5, 0] = 5
        self.transition_table[5, 1] = 5
        self.transition_table[5, 2] = 5
        self.transition_table[5, 3] = 5

    # execute the action on the environment
    def step(self, action):
        # determine the next_state given state and action
        next_state = self.transition_table[self.state, action]
        # done is True if next_state is Goal (2) or Hole (5)
        done = next_state == 2 or next_state == 5
        # reward given the state and action
        reward = self.reward_table[self.state, action]
        # the environment is now in the new state
        self.state = next_state
        return next_state, reward, done

    # determine the next action
    def act(self):
        # 0 - Left, 1 - Down, 2 - Right, 3 - Up
        # action is from exploration
        if np.random.rand() <= self.epsilon:
            # explore - do random action
            self.is_explore = True
            return np.random.choice(4, 1)[0]

        # or action is from exploitation
        # exploit - choose action with max Q-value
        self.is_explore = False
        return np.argmax(self.q_table[self.state])

    # Q-Learning - update the Q Table using Q(s, a)
    def update_q_table(self, state, action, reward, next_state):
        # Q(s, a) = reward + gamma * max_a' Q(s', a')
        q_value = self.gamma * np.amax(self.q_table[next_state])
        q_value += reward
        self.q_table[state, action] = q_value
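
The update rule and the epsilon-greedy choice in the listing above can be traced by hand on a tiny example. The following is a minimal standalone sketch, not the book's class: it uses plain Python lists instead of NumPy arrays so that it runs without dependencies, and the discount factor (0.9), the reward of 100 for reaching the Goal, the table size of 6 states, and the two sample transitions are all assumptions made for illustration.

```python
import random

GAMMA = 0.9                 # assumed discount factor, for illustration only
N_STATES, N_ACTIONS = 6, 4  # assumed table size: states 0-5, actions 0-3

# Q-table as a list of rows; the listing uses a NumPy array instead
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def update_q(q, state, action, reward, next_state, gamma=GAMMA):
    """Apply Q(s, a) = reward + gamma * max_a' Q(s', a') in place."""
    q[state][action] = reward + gamma * max(q[next_state])
    return q[state][action]


def act(q, state, epsilon, rng):
    """Epsilon-greedy choice: random action with probability epsilon."""
    if rng.random() <= epsilon:
        return rng.randrange(N_ACTIONS)  # explore
    row = q[state]
    return row.index(max(row))           # exploit: first action with max Q


# Hypothetical transitions: Right (action 2) from state 1 reaches the Goal
# (state 2) with an assumed reward of 100; Right from state 0 reaches
# state 1 with reward 0.
v1 = update_q(q_table, state=1, action=2, reward=100.0, next_state=2)
v0 = update_q(q_table, state=0, action=2, reward=0.0, next_state=1)

rng = random.Random(0)  # seeded so the run is repeatable
greedy = act(q_table, state=0, epsilon=0.0, rng=rng)
```

After the two updates, the value learned next to the Goal has propagated one step back, discounted by gamma, so the greedy choice in state 0 is the action that leads toward state 1. With epsilon set to 1.0 instead, `act` would ignore the table and pick uniformly among the four actions.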
