Advanced Deep Learning with Keras


Variational Autoencoders (VAEs)

Similar to the Generative Adversarial Networks (GANs) that we've discussed in the previous chapters, Variational Autoencoders (VAEs) [1] belong to the family of generative models. The generator of a VAE is able to produce meaningful outputs while navigating its continuous latent space. The possible attributes of the decoder outputs are explored through the latent vector.

In GANs, the focus is on how to arrive at a model that approximates the input distribution. VAEs attempt to model the input distribution from a decodable continuous latent space. This is one of the possible underlying reasons why GANs are able to generate more realistic signals than VAEs. For example, in image generation, GANs are able to produce more realistic-looking images, while the images VAEs generate are comparatively less sharp.

Within VAEs, the focus is on the variational inference of latent codes. Therefore, VAEs provide a suitable framework for both learning and efficient Bayesian inference with latent variables. For example, VAEs with disentangled representations enable latent code reuse for transfer learning.

In terms of structure, VAEs bear a resemblance to autoencoders. They are also made up of an encoder (also known as a recognition or inference model) and a decoder (also known as a generative model). Both VAEs and autoencoders attempt to reconstruct the input data while learning the latent vector. However, unlike autoencoders, the latent space of a VAE is continuous, and the decoder itself is used as a generative model.
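To make the structural comparison concrete, the following is a minimal sketch of a VAE skeleton built with the tf.keras functional API. The 784-dimensional flattened MNIST input, the 2-dimensional latent space, and the 256-unit hidden layers are illustrative assumptions, not the chapter's final model. The encoder maps an input to the parameters of a Gaussian over z, a sampling layer draws z, and the decoder reconstructs the input from z:

```python
# A minimal VAE skeleton using tf.keras; the layer sizes and
# 2-dim latent space are illustrative assumptions only.
import tensorflow as tf
from tensorflow.keras import layers, Model

original_dim = 784   # flattened 28 x 28 image
latent_dim = 2       # small latent space for illustration

# Encoder (recognition/inference model): maps x to the mean and
# log-variance of a Gaussian over the latent vector z
inputs = layers.Input(shape=(original_dim,))
h = layers.Dense(256, activation='relu')(inputs)
z_mean = layers.Dense(latent_dim)(h)
z_log_var = layers.Dense(latent_dim)(h)

# Sampling z from that Gaussian (the reparameterization trick,
# discussed later in this chapter)
def sampling(args):
    z_mean, z_log_var = args
    epsilon = tf.random.normal(shape=tf.shape(z_mean))
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon

z = layers.Lambda(sampling)([z_mean, z_log_var])

# Decoder (generative model): reconstructs x from z
latent_inputs = layers.Input(shape=(latent_dim,))
h_dec = layers.Dense(256, activation='relu')(latent_inputs)
outputs = layers.Dense(original_dim, activation='sigmoid')(h_dec)

encoder = Model(inputs, [z_mean, z_log_var, z], name='encoder')
decoder = Model(latent_inputs, outputs, name='decoder')
vae = Model(inputs, decoder(z), name='vae')
```

The loss that ties the two halves together (a reconstruction term plus a KL divergence term) is deliberately omitted from this sketch; it is derived later in the chapter.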



As with the GANs that we discussed in the previous chapters, the VAE decoder can also be conditioned. For example, in the MNIST dataset, we're able to specify the digit to produce given a one-hot vector. This class of conditional VAE is called CVAE [2]. VAE latent vectors can also be disentangled by including a regularizing hyperparameter in the loss function. This is called β-VAE [5]. For example, within MNIST, we're able to isolate the latent vector that determines the thickness or tilt angle of each digit.
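The full loss derivation comes later in the chapter, but as a rough sketch of where the β hyperparameter of β-VAE enters, the helper below (an illustrative function, not the book's exact code) assumes the Gaussian encoder outputs z_mean and z_log_var from the earlier sketch and a binary cross-entropy reconstruction term:

```python
# Sketch of a beta-weighted VAE loss; beta = 1.0 corresponds to the
# plain VAE, while beta > 1 strengthens the KL term and encourages
# disentangled latent codes (beta-VAE). Illustrative only; the exact
# loss is derived later in the chapter.
from tensorflow.keras import backend as K

def beta_vae_loss(inputs, outputs, z_mean, z_log_var, beta=1.0):
    # Reconstruction term: how well the decoder rebuilds the input
    reconstruction_loss = K.sum(
        K.binary_crossentropy(inputs, outputs), axis=-1)
    # KL divergence between the encoder's Gaussian and a unit Gaussian
    kl_loss = -0.5 * K.sum(
        1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return K.mean(reconstruction_loss + beta * kl_loss)
```

For the CVAE case, the conditioning is usually achieved differently, by concatenating the one-hot label to the inputs of both the encoder and the decoder; the chapter returns to CVAE later.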

The goal of this chapter is to present:

• The principles of VAEs
• An understanding of the reparameterization trick that facilitates the use of stochastic gradient descent on VAE optimization
• The principles of conditional VAE (CVAE) and β-VAE
• An understanding of how to implement VAEs within the Keras library

Principles of VAEs

In a generative model, we're often interested in approximating the true distribution of our inputs using neural networks:

$x \sim P_\theta(x)$ (Equation 8.1.1)

In the preceding equation, θ are the parameters determined during training. For example, in the context of the celebrity faces dataset, this is equivalent to finding a distribution that can draw faces. Similarly, in the MNIST dataset, this distribution can generate recognizable handwritten digits.
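To make Equation 8.1.1 concrete: once a VAE has been trained, drawing from the approximate $P_\theta(x)$ amounts to sampling a latent vector and decoding it. The snippet below is a small sketch that reuses the `decoder` and `latent_dim` names assumed in the earlier skeleton and presumes the model has already been trained:

```python
# Draw one sample from the learned distribution: sample z from a
# standard normal prior and decode it. `decoder` and `latent_dim` are
# carried over from the earlier sketch and assumed to be trained.
import numpy as np

z_sample = np.random.normal(size=(1, latent_dim))  # z ~ N(0, I)
x_generated = decoder.predict(z_sample)            # approximate draw from P_theta(x)
digit_image = x_generated.reshape(28, 28)          # back to image form for MNIST
```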

In machine learning, to perform a certain level of inference, we're interested in finding $P_\theta(x, z)$, a joint distribution between the inputs, x, and the latent variables, z. The latent variables are not part of the dataset but instead encode certain properties observable from the inputs. In the context of celebrity faces, these might be facial expressions, hairstyles, hair color, gender, and so on. In the MNIST dataset, the latent variables may represent the digit and writing styles.

$P_\theta(x, z)$ is practically a distribution of input data points and their attributes. $P_\theta(x)$ can be computed from the marginal distribution:

$P_\theta(x) = \int P_\theta(x, z)\,dz$ (Equation 8.1.2)
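The integral in Equation 8.1.2 runs over every possible latent code, which is what makes $P_\theta(x)$ expensive to evaluate directly. As a rough intuition-builder (not the variational route this chapter actually takes), one could write $P_\theta(x, z) = P_\theta(x \mid z)P(z)$ and form a naive Monte Carlo estimate; `log_p_x_given_z` below is a hypothetical stand-in for the decoder's log-likelihood $\log P_\theta(x \mid z)$:

```python
# Naive Monte Carlo estimate of log P_theta(x):
#   P_theta(x) = ∫ P_theta(x|z) P(z) dz ≈ (1/N) Σ P_theta(x|z_i), z_i ~ P(z)
# `log_p_x_given_z` is a hypothetical callable standing in for the
# decoder's log-likelihood; it is not defined in this chapter.
import numpy as np

def mc_log_marginal(x, log_p_x_given_z, latent_dim, num_samples=10000):
    # Sample latent codes from the standard normal prior P(z) = N(0, I)
    z_samples = np.random.normal(size=(num_samples, latent_dim))
    log_p = np.array([log_p_x_given_z(x, z) for z in z_samples])
    # log of the mean of exp(log_p), computed stably (log-mean-exp)
    m = log_p.max()
    return m + np.log(np.mean(np.exp(log_p - m)))
```

In practice this estimator is hopelessly noisy when z is high-dimensional, which is one motivation for the variational inference machinery the rest of the chapter develops.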

