Advanced Deep Learning with Keras
Variational Autoencoders (VAEs)

Similar to the Generative Adversarial Networks (GANs) that we discussed in the previous chapters, Variational Autoencoders (VAEs) [1] belong to the family of generative models. The generator of a VAE is able to produce meaningful outputs while navigating its continuous latent space. The possible attributes of the decoder outputs are explored through the latent vector.

In GANs, the focus is on how to arrive at a model that approximates the input distribution. VAEs instead attempt to model the input distribution from a decodable continuous latent space. This is one possible underlying reason why GANs are able to generate more realistic signals than VAEs. For example, in image generation, GANs produce more realistic-looking images, while VAEs in comparison generate images that are less sharp.

Within VAEs, the focus is on the variational inference of latent codes. VAEs therefore provide a suitable framework for both learning and efficient Bayesian inference with latent variables. For example, VAEs with disentangled representations enable latent code reuse for transfer learning.

In terms of structure, VAEs bear a resemblance to autoencoders. They are also made up of an encoder (also known as the recognition or inference model) and a decoder (also known as the generative model). Both VAEs and autoencoders attempt to reconstruct the input data while learning the latent vector. However, unlike autoencoders, the latent space of a VAE is continuous, and the decoder itself is used as a generative model.
Following the discussion of GANs in the previous chapters, the VAE decoder can
also be conditioned. For example, on the MNIST dataset, we're able to specify the
digit to produce given a one-hot vector. This class of conditional VAE is called
CVAE [2]. VAE latent vectors can also be disentangled by including a regularizing
hyperparameter in the loss function. This is called β-VAE [5]. For example, within
MNIST, we're able to isolate the latent variable that determines the thickness or
tilt angle of each digit.
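To make the conditioning idea concrete, the sketch below shows how a CVAE-style decoder input can be formed: the latent vector z is concatenated with a one-hot class label, so the decoder models P(x | z, label). The dimensions and variable names here are illustrative assumptions for this sketch, not the chapter's final model.

```python
# Sketch: forming a conditioned decoder input (CVAE-style).
# The latent vector z is joined with a one-hot label, so the decoder
# receives both the latent code and the desired class. Sizes are illustrative.
import numpy as np

latent_dim = 2    # dimensionality of z (assumed for this sketch)
num_classes = 10  # MNIST digits 0-9

z = np.random.randn(1, latent_dim)   # a sampled latent vector
label = np.zeros((1, num_classes))
label[0, 7] = 1.0                    # one-hot vector: ask for the digit 7

# The decoder sees z and the label together, i.e. it models P(x | z, label)
decoder_input = np.concatenate([z, label], axis=-1)
print(decoder_input.shape)  # (1, 12)
```

In a Keras model, the same joining step would typically be done with a `Concatenate` layer on the decoder's two inputs.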
The goal of this chapter is to present:
• The principles of VAEs
• An understanding of the reparameterization trick that facilitates the use
of stochastic gradient descent in VAE optimization
• The principles of conditional VAE (CVAE) and β-VAE
• An understanding of how to implement VAEs with the Keras library
Principles of VAEs
In a generative model, we're often interested in approximating the true distribution
of our inputs using neural networks:
x ~ P_θ(x)    (Equation 8.1.1)
In the preceding equation, θ are the parameters determined during training. For
example, in the context of the celebrity faces dataset, this is equivalent to finding
a distribution that can draw faces. Similarly, in the MNIST dataset, this distribution
can generate recognizable handwritten digits.
In machine learning, to perform a certain level of inference, we're interested
in finding P_θ(x, z), a joint distribution between the inputs, x, and the latent
variables, z. The latent variables are not part of the dataset but instead encode
certain properties observable from the inputs. In the context of celebrity faces,
these might be facial expressions, hairstyles, hair color, gender, and so on. In
the MNIST dataset, the latent variables may represent the digit and the writing
style.
P_θ(x, z) is practically a distribution of input data points and their attributes.
P_θ(x) can be computed from the marginal distribution:

P_θ(x) = ∫ P_θ(x, z) dz    (Equation 8.1.2)
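The marginalization in Equation 8.1.2 is rarely tractable in closed form, but it can be approximated by averaging P_θ(x | z) over samples of z drawn from the prior. The toy Gaussian model below is an assumption made only so the Monte Carlo estimate can be checked against an exact answer; it stands in for the neural network, which is not tractable this way.

```python
# Sketch: Monte Carlo approximation of the marginal in Equation 8.1.2,
#   P(x) = ∫ P(x|z) P(z) dz  ≈  mean over z_i ~ P(z) of P(x|z_i).
# Toy model (an assumption, not the chapter's network):
#   z ~ N(0, 1) and x|z ~ N(z, sigma^2), so the exact marginal
#   is N(0, 1 + sigma^2) and we can verify the estimate.
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5
x = 0.3  # the data point whose marginal likelihood we estimate

def p_x_given_z(x, z, sigma):
    # Gaussian density of x given z, i.e. P(x|z)
    return np.exp(-0.5 * ((x - z) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

z_samples = rng.standard_normal(200_000)          # z ~ P(z) = N(0, 1)
p_x_mc = p_x_given_z(x, z_samples, sigma).mean()  # ≈ ∫ P(x|z) P(z) dz

var = 1.0 + sigma ** 2                            # exact marginal variance
p_x_exact = np.exp(-0.5 * x ** 2 / var) / np.sqrt(2 * np.pi * var)
print(p_x_mc, p_x_exact)  # the two values agree closely
```

For a high-dimensional z this naive sampling becomes hopelessly inefficient, which is precisely why VAEs resort to variational inference instead of direct marginalization.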