
Q_{\max} =
\begin{cases}
r_{j+1} & \text{if episode terminates at } j+1 \\
r_{j+1} + \gamma\, Q_{\text{target}}\!\left(s_{j+1}, \operatorname*{argmax}_{a_{j+1}} Q(s_{j+1}, a_{j+1}; \theta); \theta^{-}\right) & \text{otherwise}
\end{cases}

The term $\operatorname*{argmax}_{a_{j+1}} Q(s_{j+1}, a_{j+1}; \theta)$ lets Q choose the action. This action is then evaluated by $Q_{\text{target}}$.
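To make the distinction concrete, the following is a minimal sketch (with made-up Q-value estimates, not taken from the book's networks) contrasting how DQN and DDQN form the target from the same pair of value estimates:

import numpy as np

# hypothetical Q-value estimates for three actions in state s_{j+1}
q_online = np.array([1.0, 2.5, 2.0])   # Q(s', a'; theta), online network
q_target = np.array([1.2, 1.8, 3.0])   # Q_target(s', a'; theta^-), target network

reward, gamma = 1.0, 0.95

# DQN: the target network both selects and evaluates the action
dqn_target = reward + gamma * np.max(q_target)       # 1.0 + 0.95 * 3.0 = 3.85

# DDQN: the online network selects, the target network evaluates
a_max = np.argmax(q_online)                           # action 1
ddqn_target = reward + gamma * q_target[a_max]        # 1.0 + 0.95 * 1.8 = 2.71

Because the online and target networks rarely agree on the best action, DDQN avoids always taking the maximum of the (possibly overestimated) target values.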

Listing 9.6.1 implements both DQN and DDQN. Specifically for DDQN, the modification to the Q-value computation performed by the get_target_q_value() function is highlighted:

# compute Q_max
# use of target Q Network solves the non-stationarity problem
def get_target_q_value(self, next_state, reward):
    # max Q value among next state's actions
    if self.ddqn:
        # DDQN
        # current Q Network selects the action
        # a'_max = argmax_a' Q(s', a')
        action = np.argmax(self.q_model.predict(next_state)[0])
        # target Q Network evaluates the action
        # Q_max = Q_target(s', a'_max)
        q_value = self.target_q_model.predict(next_state)[0][action]
    else:
        # DQN chooses the max Q value among next actions
        # selection and evaluation of action is on the target Q Network
        # Q_max = max_a' Q_target(s', a')
        q_value = np.amax(self.target_q_model.predict(next_state)[0])

    # Q_max = reward + gamma * Q_max
    q_value *= self.gamma
    q_value += reward
    return q_value
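For context, get_target_q_value() is used when training on an experience sampled from the replay buffer: the returned target replaces the Q-value of the action actually taken, and the online network is fit toward it. The following is a minimal sketch of that step, assuming self.q_model is the Keras online network; the helper name update_on_sample() and the terminal-state handling shown here are illustrative, not the book's exact replay code:

def update_on_sample(self, state, action, reward, next_state, done):
    # terminal transitions have no bootstrapped future value
    if done:
        target = reward
    else:
        target = self.get_target_q_value(next_state, reward)

    # current Q-value predictions for all actions in `state`
    q_values = self.q_model.predict(state)
    # only the taken action's value is moved toward the target
    q_values[0][action] = target

    # one gradient step on the online Q Network
    self.q_model.fit(state, q_values, epochs=1, verbose=0)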

For comparison, averaged over 10 runs, CartPole-v0 is solved by DDQN within 971 episodes. To use DDQN, run:

$ python3 dqn-cartpole-9.6.1.py -d

