
Advanced Deep Learning with Keras


Policy Gradient Methods

Train REINFORCE from previously saved weights:
python3 policygradient-car-10.1.1.py --encoder_weights=encoder_weights.h5 --actor_weights=actor_weights.h5 --train

Train REINFORCE with baseline from previously saved weights:
python3 policygradient-car-10.1.1.py --encoder_weights=encoder_weights.h5 --actor_weights=actor_weights.h5 --value_weights=value_weights.h5 -b --train

Train Actor-Critic from previously saved weights:
python3 policygradient-car-10.1.1.py --encoder_weights=encoder_weights.h5 --actor_weights=actor_weights.h5 --value_weights=value_weights.h5 -a --train

Train A2C from previously saved weights:
python3 policygradient-car-10.1.1.py --encoder_weights=encoder_weights.h5 --actor_weights=actor_weights.h5 --value_weights=value_weights.h5 -c --train

Table 10.7.1: Different options in running policygradient-car-10.1.1.py
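To make the flag combinations in the table easier to read, the following is a minimal sketch of how such options could be parsed with argparse. It is not the parser used in policygradient-car-10.1.1.py. The short flags and weight options come from the table; the long names --baseline, --actor_critic, and --a2c, the helper method_name, and the choice to make the three method flags mutually exclusive are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed in Table 10.7.1."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch only, not the book's script")
    # Weight files to restore before training resumes.
    parser.add_argument("--encoder_weights", help="saved encoder weights (.h5)")
    parser.add_argument("--actor_weights", help="saved actor weights (.h5)")
    parser.add_argument("--value_weights", help="saved value weights (.h5)")
    # Assumption: at most one method flag is given; no flag means REINFORCE.
    method = parser.add_mutually_exclusive_group()
    method.add_argument("-b", "--baseline", action="store_true")
    method.add_argument("-a", "--actor_critic", action="store_true")
    method.add_argument("-c", "--a2c", action="store_true")
    parser.add_argument("--train", action="store_true")
    return parser


def method_name(args):
    """Map the parsed flags to the method named in the table."""
    if args.baseline:
        return "REINFORCE with baseline"
    if args.actor_critic:
        return "Actor-Critic"
    if args.a2c:
        return "A2C"
    return "REINFORCE"


if __name__ == "__main__":
    example = ["--encoder_weights=encoder_weights.h5",
               "--actor_weights=actor_weights.h5",
               "--value_weights=value_weights.h5", "-c", "--train"]
    parsed = build_parser().parse_args(example)
    print(method_name(parsed), "| train:", parsed.train)
```

In this sketch, omitting -b, -a, and -c selects plain REINFORCE, which matches the first row of the table, where no value weights are passed.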

As a final note, the implementation of the policy gradient methods in Keras has some limitations. For example, training the actor model requires resampling the action. The action is first sampled and applied to the environment to observe the reward and next state. Then, another sample is taken to train the log probability model. The second sample is not necessarily the same as the first, yet the reward used for training comes from the first sampled action, which can introduce stochastic error into the computed gradients.
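The mismatch described above can be illustrated with a small, framework-free sketch. It assumes a one-dimensional Gaussian policy with hypothetical mean and standard deviation values; the function gaussian_log_prob and all variable names are illustrative and do not come from the book's code.

```python
import math
import random


def gaussian_log_prob(action, mean, stddev):
    """Log density of `action` under a 1-D Gaussian N(mean, stddev**2)."""
    var = stddev ** 2
    return (-0.5 * math.log(2.0 * math.pi * var)
            - (action - mean) ** 2 / (2.0 * var))


rng = random.Random(0)        # fixed seed so the run is repeatable
mean, stddev = 0.2, 0.5       # hypothetical policy outputs for one state

# First sample: the action actually applied to the environment.
applied_action = rng.gauss(mean, stddev)
# Second sample: drawn again when the log probability model is trained.
resampled_action = rng.gauss(mean, stddev)

logp_applied = gaussian_log_prob(applied_action, mean, stddev)
logp_resampled = gaussian_log_prob(resampled_action, mean, stddev)

# The reward belongs to applied_action, but the gradient would be taken
# through logp_resampled, so the two quantities are paired inconsistently.
mismatch = abs(logp_applied - logp_resampled)
print("applied:", applied_action, "resampled:", resampled_action)
print("log-probability mismatch:", mismatch)
```

The two draws come from the same distribution, so the error averages out over many steps, but on any single step the reward is weighted by the log probability of an action that was never executed.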

The good news is that Keras is gaining a lot of support from TensorFlow in the form of tf.keras. Transitioning from Keras to a more flexible and powerful machine learning library, like TensorFlow, has become much easier. If you started with Keras and want to build low-level custom machine learning routines, the APIs of Keras and tf.keras share strong similarities.

There is only a small learning curve in using Keras in TensorFlow. Furthermore, in tf.keras, you can take advantage of the new, easy-to-use Dataset and Estimators APIs of TensorFlow. This simplifies much of the code and model reuse, resulting in a clean pipeline. With the new eager execution mode of TensorFlow, it becomes even easier to implement and debug Python code in tf.keras and TensorFlow. Eager execution allows code to run without building a computational graph first, unlike the approach taken in this book. It also allows code structures similar to those of a typical Python program.
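As a rough illustration of what eager execution makes possible, the sketch below computes the gradient of a Gaussian log probability for one specific, already-sampled action, so no second sample is needed. It assumes TensorFlow 2.x, where eager execution is enabled by default; the function log_prob_gradient and its arguments are hypothetical and are not part of the book's code.

```python
import math

import tensorflow as tf  # assumes TensorFlow 2.x (eager by default)


def log_prob_gradient(mean, stddev, action):
    """Gradient of log N(action; mean, stddev**2) with respect to the mean.

    `action` is the value that was actually applied to the environment,
    so the gradient is tied to the same sample that produced the reward.
    """
    mean_var = tf.Variable(mean, dtype=tf.float32)
    var = tf.constant(stddev ** 2, dtype=tf.float32)
    with tf.GradientTape() as tape:
        # Operations run immediately and are recorded on the tape.
        log_prob = (-0.5 * tf.math.log(2.0 * math.pi * var)
                    - tf.square(action - mean_var) / (2.0 * var))
    return tape.gradient(log_prob, mean_var)


grad = log_prob_gradient(mean=0.0, stddev=1.0, action=0.5)
# Analytically the gradient is (action - mean) / stddev**2.
print(float(grad))
```

Because every operation executes as ordinary Python, intermediate values such as log_prob can be printed or inspected in a debugger at the point where they are computed.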
