Advanced Deep Learning with Keras
Chapter 5

Table 5.1.1: A comparison between the loss functions of GAN and WGAN

GAN loss functions:

$L^{(D)} = -\mathbb{E}_{x \sim p_{data}} \log D(x) - \mathbb{E}_{z} \log\big(1 - D(G(z))\big)$

$L^{(G)} = -\mathbb{E}_{z} \log D(G(z))$

WGAN loss functions:

$L^{(D)} = -\mathbb{E}_{x \sim p_{data}} D_w(x) + \mathbb{E}_{z} D_w(G(z))$ (Equation 5.1.21)

$L^{(G)} = -\mathbb{E}_{z} D_w(G(z))$ (Equation 5.1.22)

$w \leftarrow \mathrm{clip}(w, -0.01, 0.01)$ (Equation 5.1.20)

Algorithm 5.1.1 WGAN

The values of the parameters are $\alpha = 0.00005$, $c = 0.01$, $m = 64$, and $n_{critic} = 5$.

Require: $\alpha$, the learning rate. $c$, the clipping parameter. $m$, the batch size. $n_{critic}$, the number of critic (discriminator) iterations per generator iteration.

Require: $w_0$, initial critic (discriminator) parameters. $\theta_0$, initial generator parameters.

1. while $\theta$ has not converged do
2.     for $t = 1, \dots, n_{critic}$ do
3.         Sample a batch $\{x^{(i)}\}_{i=1}^{m} \sim p_{data}$ from the real data
4.         Sample a batch $\{z^{(i)}\}_{i=1}^{m} \sim p(z)$ from the uniform noise distribution
5.         $g_w \leftarrow \nabla_w \left[ -\frac{1}{m} \sum_{i=1}^{m} D_w(x^{(i)}) + \frac{1}{m} \sum_{i=1}^{m} D_w(G_\theta(z^{(i)})) \right]$, compute the discriminator gradients
6.         $w \leftarrow w - \alpha \times \mathrm{RMSProp}(w, g_w)$, update the discriminator parameters
7.         $w \leftarrow \mathrm{clip}(w, -c, c)$, clip discriminator weights
8.     end for
9.     Sample a batch $\{z^{(i)}\}_{i=1}^{m} \sim p(z)$ from the uniform noise distribution
10.    $g_\theta \leftarrow -\nabla_\theta \frac{1}{m} \sum_{i=1}^{m} D_w(G_\theta(z^{(i)}))$, compute the generator gradients
11.    $\theta \leftarrow \theta - \alpha \times \mathrm{RMSProp}(\theta, g_\theta)$, update generator parameters
12. end while
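As a concrete illustration of the loss functions above, the following is a minimal Keras sketch of a Wasserstein loss and critic setup. The tiny `Dense` critic is a hypothetical stand-in used only to keep the snippet self-contained; it is not the discriminator built in this chapter's examples.

```python
from keras import backend as K
from keras.layers import Dense, Flatten, Input
from keras.models import Model
from keras.optimizers import RMSprop

def wasserstein_loss(y_true, y_pred):
    # With labels y = +1 for real and y = -1 for fake samples,
    # minimizing -E[y * D(x)] implements the critic loss of
    # Equation 5.1.21 (and Equation 5.1.22 for the generator).
    return -K.mean(y_true * y_pred)

# Hypothetical stand-in critic for illustration only. Note the linear
# output: the WGAN critic scores samples instead of emitting a
# probability, so there is no sigmoid activation.
inputs = Input(shape=(28, 28, 1))
x = Flatten()(inputs)
outputs = Dense(1)(x)
critic = Model(inputs, outputs, name='critic')

# RMSProp with the small learning rate alpha = 5e-5 from Algorithm 5.1.1.
critic.compile(loss=wasserstein_loss, optimizer=RMSprop(lr=5e-5))
```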
Figure 5.1.3: Top: Training the WGAN discriminator requires fake data from the generator and real data from the true distribution. Bottom: Training the WGAN generator requires fake data from the generator pretending to be real.

Similar to GANs, a WGAN trains the discriminator and the generator alternately in an adversarial fashion. However, in a WGAN, the discriminator (also called the critic) trains for $n_{critic}$ iterations (Lines 2 to 8) before the generator trains for one iteration (Lines 9 to 11). This is in contrast to GANs, where the discriminator and the generator train for an equal number of iterations. Training the discriminator means learning its parameters (weights and biases). This requires sampling a batch from the real data (Line 3) and a batch from the fake data (Line 4), computing the gradient of the discriminator parameters (Line 5) after feeding the sampled data to the discriminator network, and optimizing the discriminator parameters using RMSProp (Line 6). Lines 5 and 6 together perform the optimization of Equation 5.1.21. Adam was found to be unstable in WGAN.
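As a rough sketch of how Algorithm 5.1.1 maps onto Keras, the loop below implements the alternating schedule just described. It assumes that `generator`, `discriminator`, and `adversarial` (the generator stacked on the frozen discriminator) are compiled Keras models built elsewhere, with the Wasserstein loss from the earlier snippet; these names and the model-building code are assumptions, not shown here.

```python
import numpy as np

def train_wgan(generator, discriminator, adversarial, x_train,
               latent_size=100, batch_size=64, n_critic=5,
               clip_value=0.01, train_steps=10000):
    """Hypothetical training loop following Algorithm 5.1.1.

    Assumes `discriminator` and `adversarial` were compiled with the
    Wasserstein loss; real samples are labeled +1 and fakes -1.
    """
    real_labels = np.ones((batch_size, 1))
    for step in range(train_steps):
        # Lines 2 to 8: train the critic n_critic times per generator step.
        for _ in range(n_critic):
            # Line 3: sample a batch of real images.
            idx = np.random.randint(0, x_train.shape[0], size=batch_size)
            real_images = x_train[idx]
            # Line 4: sample noise and generate a batch of fake images.
            noise = np.random.uniform(-1.0, 1.0,
                                      size=(batch_size, latent_size))
            fake_images = generator.predict(noise)
            # Lines 5 and 6: one RMSProp step on Equation 5.1.21, split
            # into a real batch (label +1) and a fake batch (label -1).
            discriminator.train_on_batch(real_images, real_labels)
            discriminator.train_on_batch(fake_images, -real_labels)
            # Line 7: clip every critic weight to [-c, c].
            for layer in discriminator.layers:
                clipped = [np.clip(w, -clip_value, clip_value)
                           for w in layer.get_weights()]
                layer.set_weights(clipped)
        # Lines 9 to 11: train the generator once through the adversarial
        # model (critic weights frozen), optimizing Equation 5.1.22.
        noise = np.random.uniform(-1.0, 1.0,
                                  size=(batch_size, latent_size))
        adversarial.train_on_batch(noise, real_labels)
```

Splitting the critic update into two `train_on_batch` calls optimizes the two expectations in Equation 5.1.21 one term at a time, and clipping after every critic update enforces the weight constraint of Equation 5.1.20.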