Advanced Deep Learning with Keras

Autoencoders

Because the latent vector is a low-dimensional, compressed representation of the input distribution, it should be expected that the output recovered by the decoder can only approximate the input. The dissimilarity between the input and the output can be measured by a loss function.
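One common choice of loss function is the mean squared error (MSE) between the input, x, and the recovered output, x̃:

MSE = (1/m) Σᵢ (xᵢ − x̃ᵢ)²

where m is the input dimension (for example, m = 784 for a flattened 28 x 28 MNIST digit). The smaller the MSE, the closer the reconstruction is to the input.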

But why would we use autoencoders? Simply put, autoencoders have practical applications, both in their original form and as part of more complex neural networks. They're a key tool in understanding the advanced topics of deep learning, as they give you a low-dimensional latent vector. Furthermore, the latent vector can be efficiently processed to perform structural operations on the input data. Common operations include denoising, colorization, feature-level arithmetic, detection, tracking, and segmentation, to name just a few.

In summary, the goal of this chapter is to present:

• The principles of autoencoders

• How to implement autoencoders with the Keras neural network library

• The main features of denoising and colorization autoencoders

Principles of autoencoders

In this section, we're going to go over the principles of autoencoders, exploring them with the MNIST dataset, which we were first introduced to in the previous chapters.

Firstly, we need to be aware that an autoencoder has two operators:

• Encoder: This transforms the input, x, into a low-dimensional latent vector, z = f(x). Since the latent vector is of low dimension, the encoder is forced to learn only the most important features of the input data. For example, in the case of MNIST digits, the important features to learn may include writing style, tilt angle, roundness of stroke, thickness, and so on. Essentially, this is the most important information needed to represent digits zero to nine.

• Decoder: This tries to recover the input from the latent vector, x̃ = g(z). Although the latent vector has a low dimension, it has a sufficient size to allow the decoder to recover the input data.
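To make these two operators concrete, the following is a minimal fully connected sketch in Keras. The 784-dimensional input (a flattened 28 x 28 MNIST digit) and the latent dimension of 16 are illustrative assumptions, not requirements:

from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

input_dim = 784   # a 28 x 28 MNIST digit, flattened
latent_dim = 16   # dimension of the latent vector z (an assumed value)

# Encoder: z = f(x)
inputs = Input(shape=(input_dim,), name='encoder_input')
z = Dense(latent_dim, activation='relu', name='latent_vector')(inputs)
encoder = Model(inputs, z, name='encoder')

# Decoder: x_tilde = g(z)
latent_inputs = Input(shape=(latent_dim,), name='decoder_input')
x_tilde = Dense(input_dim, activation='sigmoid', name='reconstruction')(latent_inputs)
decoder = Model(latent_inputs, x_tilde, name='decoder')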

The goal of the decoder is to make x̃ as close as possible to x. Generally, both the encoder and the decoder are non-linear functions. The dimension of z is a measure of the number of salient features it can represent. This dimension is usually much smaller than the input dimension, both for efficiency and in order to constrain the latent code to learn only the most salient properties of the input distribution [1].
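Continuing the sketch above, the two operators chain into a single autoencoder model that is trained to reconstruct its own input, with MSE measuring the dissimilarity between x and x̃. Printing the encoder's output shape confirms that z is far smaller than the 784-dimensional input:

from tensorflow.keras.datasets import mnist

# Autoencoder: x_tilde = g(f(x)), trained against the input itself
autoencoder = Model(inputs, decoder(encoder(inputs)), name='autoencoder')
autoencoder.compile(optimizer='adam', loss='mse')

# MNIST pixels scaled to [0, 1]; the labels are unused (unsupervised training)
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0

autoencoder.fit(x_train, x_train, epochs=1, batch_size=128,
                validation_data=(x_test, x_test))

print(encoder.predict(x_test[:1]).shape)   # (1, 16): 784 inputs -> 16 latent features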
