
Lastly, the Lipschitz constraint in the EM distance optimization is imposed by clipping the discriminator parameters (Line 7). Line 7 is the implementation of Equation 5.1.20. After n_critic iterations of discriminator training, the discriminator parameters are frozen. Generator training starts by sampling a batch of fake data (Line 9). The sampled data is labeled as real (1.0) in an attempt to fool the discriminator network. The generator gradients are computed in Line 10 and optimized using RMSProp in Line 11. Lines 10 and 11 perform a gradient update to optimize Equation 5.1.22.
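The clipping in Line 7 amounts to going through every layer of the discriminator model and constraining its weights. The following is a minimal sketch of that step; the function name clip_discriminator_weights and the default clip_value of 0.01 are illustrative assumptions, not the book's listing:

import numpy as np

def clip_discriminator_weights(discriminator, clip_value=0.01):
    # Enforce the Lipschitz constraint (Equation 5.1.20) by clipping
    # every discriminator weight to the range [-clip_value, +clip_value].
    for layer in discriminator.layers:
        weights = [np.clip(w, -clip_value, clip_value)
                   for w in layer.get_weights()]
        layer.set_weights(weights)

Keeping the critic weights inside a small box after each update is how WGAN approximates the 1-Lipschitz requirement on the discriminator.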

After training the generator, the discriminator parameters are unfrozen, and another n_critic iterations of discriminator training start. Note that there is no need to freeze the generator parameters during discriminator training, since the generator is only involved in the fabrication of data. Similar to GANs, the discriminator can be trained as a separate network. However, training the generator always requires the participation of the discriminator through the adversarial network, since the loss is computed from the output of the discriminator network.
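Freezing the discriminator inside the adversarial network can be sketched as follows; the function and variable names are illustrative assumptions rather than the book's listing, and the loss argument would be the wasserstein_loss function defined later in this section:

from keras.models import Model
from keras.optimizers import RMSprop

def build_adversarial(generator, discriminator, loss, lr=5e-5):
    # Freeze the discriminator so that only the generator weights are
    # updated when the adversarial network is trained (Lines 9-11).
    discriminator.trainable = False
    adversarial = Model(generator.input,
                        discriminator(generator.output),
                        name='adversarial')
    adversarial.compile(loss=loss, optimizer=RMSprop(lr=lr))
    return adversarial

The discriminator itself is compiled separately before being frozen here, so it can still be trained as a standalone network during the n_critic iterations.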

Unlike in GANs, in WGAN real data are labeled 1.0 while fake data are labeled -1.0, as a workaround in computing the gradient in Line 5. Lines 5-6 and 10-11 perform gradient updates to optimize Equations 5.1.21 and 5.1.22, respectively. Each term in Lines 5 and 10 is modeled as:

$\mathcal{L} = -\frac{1}{m}\sum_{i=1}^{m} y_{\text{label}} \, y_{\text{prediction}}$ (Equation 5.1.23)

where $y_{\text{label}} = 1.0$ for the real data and $y_{\text{label}} = -1.0$ for the fake data. We removed the superscript $(i)$ for simplicity of notation. For the discriminator, WGAN increases $y_{\text{prediction}} = D_w(x)$ to minimize the loss function when training using real data. When training using fake data, WGAN decreases $y_{\text{prediction}} = D_w(G(z))$ to minimize the loss function. For the generator, WGAN increases $y_{\text{prediction}} = D_w(G(z))$ to minimize the loss function when the fake data is labeled as real during training. Note that $y_{\text{label}}$ has no direct contribution to the loss function other than its sign. In Keras, Equation 5.1.23 is implemented as:

from keras import backend as K

def wasserstein_loss(y_label, y_pred):
    # Equation 5.1.23: negative mean of label times prediction
    return -K.mean(y_label * y_pred)
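As a usage sketch, the +1.0/-1.0 labeling described above can be wired into a single discriminator update; the function and argument names below are illustrative assumptions, and the discriminator is assumed to have been compiled with wasserstein_loss and RMSprop:

import numpy as np

def train_discriminator_on_batch(discriminator, x_real, x_fake):
    # One gradient update on real and fake batches (Lines 5-6,
    # Equation 5.1.21): real data labeled +1.0, fake data labeled -1.0.
    real_labels = np.ones((len(x_real), 1))
    fake_labels = -np.ones((len(x_fake), 1))
    loss_real = discriminator.train_on_batch(x_real, real_labels)
    loss_fake = discriminator.train_on_batch(x_fake, fake_labels)
    return loss_real, loss_fake

Only the sign of the label matters in Equation 5.1.23, so the same loss function serves both the real and fake batches.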

WGAN implementation using Keras

To implement WGAN in Keras, we can reuse the DCGAN implementation introduced in the previous chapter. The DCGAN builder and utility functions are implemented as a module in gan.py in the lib folder.
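A sketch of how the module might be pulled in from a script inside the chapter folder is shown below; the relative path, and the assumption that the chapter code sits one directory below lib, are illustrative only:

import sys
sys.path.append('..')   # assuming the lib folder sits one directory up
from lib import gan     # DCGAN builder and utility functions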

