16.03.2021 Views

Advanced Deep Learning with Keras

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Chapter 8

In other words, considering all of the possible attributes, we end up with the

distribution that describes the inputs. In celebrity faces, if we consider all the facial

expressions, hairstyles, hair colors, gender, the distribution describing the celebrity

faces is recovered. In the MNIST dataset, if we consider all of the possible digits,

writing styles, and so on, we end up with the distribution of handwritten digits.

The problem is Equation 8.1.2 is intractable. the equation does not have an analytic

form or an efficient estimator. It cannot be differentiated with respect to its

parameters. Therefore, optimization by a neural network is not feasible.

Using Bayes theorem, we can find an alternative expression for Equation 8.1.2:

P ( x) = ∫ P ( x | z) P( z)

dz (Equation 8.1.3)

θ

θ

P(z) is a prior distribution over z. It is not conditioned on any observations. If z is

discrete and Pθ ( x | z)

is a Gaussian distribution, then Pθ ( x)

is a mixture of Gaussians.

If z is continuous, Pθ ( x)

is an infinite mixture of Gaussians.

In practice, if we try to build a neural network to approximate Pθ ( x | z)

without

a suitable loss function, it will just ignore z and arrive at a trivial solution P ( x | z)

Pθ ( x)

. Therefore, Equation 8.1.3 does not provide us with a good estimate of P ( x)

Alternatively, Equation 8.1.2 can also be expressed as:

( ) ( | ) ( )

P x

θ

= ∫ P z x P x dz (Equation 8.1.4)

θ

=

θ

θ

.

However, Pθ ( z | x)

is also intractable. The goal of a VAEs is to find a tractable

distribution that closely estimates P ( z | x)

Variational inference

θ

.

In order to make Pθ ( z | x)

tractable, VAE introduces the variational inference model

(an encoder):

Q ( z | x) P ( z | x)

φ

θ

(Equation 8.1.5)

Qφ ( z | x)

provides a good estimate of Pθ

( z | x)

. It is both parametric and tractable.

Qφ ( z | x)

can be approximated by deep neural networks by optimizing the

parameters φ .

[ 239 ]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!