Advanced Deep Learning with Keras


Variational Autoencoders (VAEs)

In Figures 8.2.5 and 8.2.6, we can see that the width and roundness (if applicable) of each digit change as z[0] is traced from left to right, while the tilt angle and roundness (if applicable) change as z[1] is navigated from top to bottom. As we move away from the center of the distribution, the image of the digit starts to degrade. This is expected since the latent space is a circle.

Other noticeable variations in attributes may be digit specific. For example, the horizontal stroke (arm) for digit 1 becomes visible in the upper-left quadrant, while the horizontal stroke (crossbar) for digit 7 can be seen in the right quadrants only.
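The traversal described above can be sketched in code: decode a regular grid of (z[0], z[1]) values and tile the outputs into one image of digits. This is a hypothetical illustration, not the book's listing; the `decoder` stub below stands in for the trained VAE decoder, which is not reproduced here.

```python
import numpy as np

# Stub standing in for a trained VAE decoder with a 2D latent space.
# A real decoder would map a batch of z vectors to 28x28 MNIST digits.
def decoder(z):
    return np.zeros((z.shape[0], 28, 28))

n = 15  # grid resolution per latent dimension
# Sweep each latent dimension linearly around the center of the prior.
grid = np.linspace(-2.0, 2.0, n)
# Rows sweep z[1] top to bottom, columns sweep z[0] left to right.
z = np.array([[z0, z1] for z1 in grid[::-1] for z0 in grid])

digits = decoder(z).reshape(n, n, 28, 28)
print(digits.shape)  # (15, 15, 28, 28): an n-by-n grid of decoded digits
```

Plotting `digits` as a single mosaic reproduces the kind of latent-space traversal shown in Figures 8.2.5 and 8.2.6.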

β-VAE: VAE with disentangled latent representations

In Chapter 6, Disentangled Representation GANs, we discussed the concept and importance of the disentangled representation of latent codes. Recall that a disentangled representation is one where single latent units are sensitive to changes in single generative factors while being relatively invariant to changes in other factors [3]. Varying a latent code results in changes in one attribute of the generated output while the rest of the properties remain the same.

In the same chapter, InfoGAN [4] demonstrated that, on the MNIST dataset, it is possible to control which digit to generate as well as the tilt and thickness of the writing style. Observing the results in the previous section, we can see that the VAE is intrinsically disentangling the latent vector dimensions to a certain extent. For example, looking at digit 8 in Figure 8.2.6, navigating z[1] from top to bottom decreases the width and roundness while rotating the digit clockwise. Increasing z[0] from left to right also decreases the width and roundness, but rotates the digit counterclockwise. In other words, z[1] controls the clockwise rotation, z[0] affects the counterclockwise rotation, and both alter the width and roundness.

In this section, we'll demonstrate that a simple modification to the loss function of the VAE forces the latent codes to disentangle further. The modification is a positive constant weight, β > 1, acting as a regularizer on the KL loss:

L_β-VAE = L_R + βL_KL (Equation 8.3.1)

This variation of the VAE is called β-VAE [5]. The implicit effect of β is a tighter standard deviation. In other words, β forces the latent codes in the posterior distribution, Q_φ(z|x), to be independent.
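Equation 8.3.1 can be sketched numerically as follows. This is a rough illustration, not the book's Keras listing: it assumes a binary cross-entropy reconstruction term and the standard closed-form KL term for a diagonal Gaussian posterior against a unit Gaussian prior; setting β = 1 recovers the plain VAE loss.

```python
import numpy as np

def beta_vae_loss(x, x_recon, z_mean, z_log_var, beta=4.0):
    """Per-batch β-VAE loss, Equation 8.3.1: L_R + β * L_KL."""
    eps = 1e-7  # numerical guard for log(0)
    # Reconstruction term L_R: binary cross-entropy summed over pixels
    l_r = -np.sum(x * np.log(x_recon + eps)
                  + (1.0 - x) * np.log(1.0 - x_recon + eps), axis=-1)
    # KL term L_KL: diagonal Gaussian posterior vs. standard normal prior
    l_kl = -0.5 * np.sum(1.0 + z_log_var - z_mean**2 - np.exp(z_log_var),
                         axis=-1)
    # beta > 1 penalizes the KL term more heavily than a plain VAE
    return np.mean(l_r + beta * l_kl)
```

In a Keras VAE, the only change from the plain VAE is the constant multiplying the KL term before the two losses are summed; the encoder and decoder are untouched.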

