Advanced Deep Learning with Keras
Chapter 9

References

1. Sutton and Barto. Reinforcement Learning: An Introduction, 2017 (http://incompleteideas.net/book/bookdraft2017nov5.pdf).
2. Volodymyr Mnih and others. Human-level control through deep reinforcement learning. Nature 518.7540, 2015: 529 (http://www.davidqiu.com:8888/research/nature14236.pdf).
3. Hado Van Hasselt, Arthur Guez, and David Silver. Deep Reinforcement Learning with Double Q-Learning. AAAI, Vol. 16, 2016 (http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/download/12389/11847).
4. Kai Arulkumaran and others. A Brief Survey of Deep Reinforcement Learning. arXiv preprint arXiv:1708.05866, 2017 (https://arxiv.org/pdf/1708.05866.pdf).
5. David Silver. Lecture Notes on Reinforcement Learning (http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html).
6. Tom Schaul and others. Prioritized experience replay. arXiv preprint arXiv:1511.05952, 2015 (https://arxiv.org/pdf/1511.05952.pdf).
7. Ziyu Wang and others. Dueling Network Architectures for Deep Reinforcement Learning. arXiv preprint arXiv:1511.06581, 2015 (https://arxiv.org/pdf/1511.06581.pdf).