Advanced Deep Learning with Keras
The autoencoder has a tendency to memorize the input when the dimension of the latent code is significantly bigger than the dimension of x.

A suitable loss function, L(x, x̃), is a measure of how dissimilar the input, x, is from the output, which is the recovered input, x̃. As shown in the following equation, the Mean Squared Error (MSE) is an example of such a loss function:

L(x, \tilde{x}) = MSE = \frac{1}{m} \sum_{i=1}^{m} (x_i - \tilde{x}_i)^2    (Equation 3.1.1)

In this example, m is the output dimension (for example, in MNIST m = width × height × channels = 28 × 28 × 1 = 784). x_i and x̃_i are the elements of x and x̃, respectively. Since the loss function is a measure of dissimilarity between the input and the output, we're able to use alternative reconstruction loss functions such as binary cross-entropy or the structural similarity index (SSIM).

Similar to other neural networks, the autoencoder tries to make this error or loss function as small as possible during training. Figure 3.1.1 shows the autoencoder. The encoder is a function that compresses the input, x, into a low-dimensional latent vector, z. This latent vector represents the important features of the input distribution. The decoder then tries to recover the original input from the latent vector in the form of x̃.

Figure 3.1.1: Block diagram of an autoencoder

Figure 3.1.2: An autoencoder with MNIST digit input and output. The latent vector is 16-dim.
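To make Equation 3.1.1 concrete, here is a minimal sketch (an illustration added here, not a listing from the book) that computes the MSE reconstruction loss for a flattened MNIST-sized input with NumPy; the arrays x and x_tilde are assumed stand-ins for the input and the decoder output:

import numpy as np

# m is the output dimension: 28 x 28 x 1 = 784 for MNIST
m = 28 * 28 * 1
x = np.random.rand(m)        # stand-in for the original input x
x_tilde = np.random.rand(m)  # stand-in for the recovered input x_tilde

# Equation 3.1.1: mean of the element-wise squared differences
mse = np.mean((x - x_tilde) ** 2)
print(mse)

In Keras, the same reconstruction loss is obtained by compiling the model with loss='mse'.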
To put the autoencoder in context, x can be an MNIST digit, which has a dimension of 28 × 28 × 1 = 784. The encoder transforms the input into a low-dimensional z, which can be a 16-dimension latent vector. The decoder will attempt to recover the input in the form of x̃ from z. Visually, every MNIST digit x will appear similar to x̃. Figure 3.1.2 demonstrates this autoencoding process. We can observe that the decoded digit 7, while not exactly the same, remains close enough.

Since both the encoder and decoder are non-linear functions, we can use neural networks to implement both. For example, in the MNIST dataset, the autoencoder can be implemented by an MLP or a CNN. The autoencoder can be trained by minimizing the loss function through backpropagation. Similar to other neural networks, the only requirement is that the loss function must be differentiable.

If we treat the input as a distribution, we can interpret the encoder as an encoder of distribution, p(z | x), and the decoder as the decoder of distribution, p(x | z). The loss function of the autoencoder is expressed as follows:

L = -\log p(x | z)    (Equation 3.1.2)

The loss function simply means that we would like to maximize the chances of recovering the input distribution given the latent vector distribution. If the decoder output distribution is assumed to be Gaussian, then the loss function boils down to MSE since:

L = -\log p(x | z) = -\log \prod_{i=1}^{m} N(x_i; \tilde{x}_i, \sigma^2) = -\sum_{i=1}^{m} \log N(x_i; \tilde{x}_i, \sigma^2) \propto \sum_{i=1}^{m} (x_i - \tilde{x}_i)^2    (Equation 3.1.3)

In this example, N(x_i; x̃_i, σ²) represents a Gaussian distribution with a mean of x̃_i and a variance of σ². A constant variance is assumed, and the decoder outputs x̃_i are assumed to be independent, while m is the output dimension. The proportionality holds because -\log N(x_i; \tilde{x}_i, \sigma^2) = \frac{(x_i - \tilde{x}_i)^2}{2\sigma^2} + constant, so minimizing the negative log-likelihood under a constant variance is equivalent to minimizing the squared error.

Building autoencoders using Keras

We're now going to move on to something really exciting: building an autoencoder using the Keras library. For simplicity, we'll be using the MNIST dataset for the first set of examples. The autoencoder will generate a latent vector from the input data and recover the input using the decoder. The latent vector in this first example is 16-dim.
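Before walking through the chapter's listings, the following is a minimal sketch of such an autoencoder, assuming an MLP encoder and decoder with one Dense layer each and a 16-dim latent vector; it is an illustrative simplification added here, not the chapter's exact implementation, which the listings that follow build in full:

# Minimal MLP autoencoder sketch with a 16-dim latent vector (illustrative only)
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist

(x_train, _), (x_test, _) = mnist.load_data()
image_size = x_train.shape[1]
# flatten the 28 x 28 images and normalize pixel values to [0, 1]
x_train = x_train.reshape(-1, image_size * image_size).astype('float32') / 255
x_test = x_test.reshape(-1, image_size * image_size).astype('float32') / 255

latent_dim = 16
inputs = Input(shape=(image_size * image_size,), name='encoder_input')
# encoder: compress x into the 16-dim latent vector z
z = Dense(latent_dim, activation='relu', name='latent_vector')(inputs)
encoder = Model(inputs, z, name='encoder')

# decoder: recover x_tilde from z
latent_inputs = Input(shape=(latent_dim,), name='decoder_input')
outputs = Dense(image_size * image_size, activation='sigmoid')(latent_inputs)
decoder = Model(latent_inputs, outputs, name='decoder')

# autoencoder = decoder(encoder(x)), trained with the MSE loss of Equation 3.1.1
autoencoder = Model(inputs, decoder(encoder(inputs)), name='autoencoder')
autoencoder.compile(loss='mse', optimizer='adam')
autoencoder.fit(x_train, x_train,
                validation_data=(x_test, x_test),
                epochs=1,  # increase for a real run
                batch_size=32)

After training, autoencoder.predict(x_test) returns the reconstructed digits, which can be compared visually against the inputs as in Figure 3.1.2.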