Advanced Deep Learning with Keras


Chapter 7

Implementing CycleGAN using Keras

Let us tackle a simple problem that CycleGAN can address. In Chapter 3, Autoencoders, we used an autoencoder to colorize grayscale images from the CIFAR10 dataset. Recall that the CIFAR10 dataset is made of 50,000 train and 10,000 test samples of 32 × 32 RGB images belonging to ten categories. We can convert all color images into grayscale using rgb2gray(RGB), as discussed in Chapter 3, Autoencoders.

Following on from that, we can use the grayscale train images as source domain images and the original color images as the target domain images. It's worth noting that although the dataset is aligned, the input to our CycleGAN is a random sample of color images and a random sample of grayscale images. Thus, our CycleGAN will not see the train data as aligned. After training, we'll use the test grayscale images to observe the performance of the CycleGAN.

Figure 7.1.6: The forward cycle generator G implementation in Keras. The generator is a U-Network made of an encoder and a decoder.
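This data preparation can be sketched as follows. It is a minimal sketch, assuming the standard luma-weighted rgb2gray conversion used in Chapter 3 and Keras's built-in CIFAR10 loader; the variable names are illustrative:

import numpy as np
from keras.datasets import cifar10

def rgb2gray(rgb):
    # convert RGB to grayscale using standard luma weights
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])

# target domain: the original color images
# source domain: their grayscale copies
(x_train, _), (x_test, _) = cifar10.load_data()
x_train_gray = rgb2gray(x_train)[..., np.newaxis]  # (50000, 32, 32, 1)
x_test_gray = rgb2gray(x_test)[..., np.newaxis]    # (10000, 32, 32, 1)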



As discussed in the previous section, to implement the CycleGAN, we need to build two generators and two discriminators. The generator of CycleGAN learns the latent representation of the source input distribution and translates this representation into the target output distribution. This is exactly what autoencoders do. However, typical autoencoders, similar to the ones discussed in Chapter 3, Autoencoders, use an encoder that downsamples the input until the bottleneck layer, at which point the process is reversed in the decoder. This structure is not suitable for some image translation problems, since many low-level features are shared between the encoder and decoder layers. For example, in colorization problems, the form, structure, and edges of the grayscale image are the same as in the color image. To circumvent this problem, the CycleGAN generators use a U-Net [7] structure, as shown in Figure 7.1.6.

In a U-Net structure, the output of the encoder layer e_{n-i} is concatenated with the output of the decoder layer d_i, where n = 4 is the number of encoder/decoder layers and i = 1, 2, and 3 are the layer numbers that share information.

We should note that although the example uses n = 4, problems with higher input/output dimensions may require a deeper encoder/decoder. The U-Net structure enables a free flow of feature-level information between the encoder and the decoder.

An encoder layer is made of Instance Normalization (IN)-LeakyReLU-Conv2D, while a decoder layer is made of IN-ReLU-Conv2D. The encoder/decoder layer implementation is shown in Listing 7.1.1, while the generator implementation is shown in Listing 7.1.2.
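To make the skip-connection wiring concrete, the following is a minimal sketch of a U-Net generator with n = 4, written in terms of the encoder_layer and decoder_layer builders sketched under Listing 7.1.1 below. The function name build_generator, the filter counts, and the default shapes are illustrative assumptions, not the book's exact listing:

from keras.layers import Input, Conv2DTranspose
from keras.models import Model

def build_generator(input_shape=(32, 32, 1), output_channels=3):
    """U-Net generator: decoder layer d_i is concatenated with the
    output of encoder layer e_{n-i} through a skip connection."""
    inputs = Input(shape=input_shape)
    # encoder path; the first layer keeps the spatial size
    e1 = encoder_layer(inputs, 32, strides=1)  # 32x32
    e2 = encoder_layer(e1, 64)                 # 32x32 -> 16x16
    e3 = encoder_layer(e2, 128)                # 16x16 -> 8x8
    e4 = encoder_layer(e3, 256)                # 8x8 -> 4x4 (bottleneck)
    # decoder path with skip connections: d1 <- e3, d2 <- e2, d3 <- e1
    d1 = decoder_layer(e4, e3, 128)            # 4x4 -> 8x8
    d2 = decoder_layer(d1, e2, 64)             # 8x8 -> 16x16
    d3 = decoder_layer(d2, e1, 32)             # 16x16 -> 32x32
    outputs = Conv2DTranspose(output_channels,
                              kernel_size=3,
                              strides=1,
                              padding='same',
                              activation='sigmoid')(d3)
    return Model(inputs, outputs)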

The complete code is available on GitHub:

https://github.com/PacktPublishing/Advanced-Deep-Learning-with-Keras

Instance Normalization (IN) is Batch Normalization (BN) per sample of data (that is, IN is BN per image or per feature). In style transfer, it's important to normalize the contrast per sample, not per batch. Instance normalization is equivalent to contrast normalization, while batch normalization breaks contrast normalization.
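The difference is simply which axes the statistics are computed over. The following NumPy illustration is a sketch of the idea, not the keras-contrib implementation:

import numpy as np

# a batch of 4 RGB feature maps: (batch, height, width, channels)
x = np.random.rand(4, 32, 32, 3)

# Instance Normalization: mean/std per sample and per channel
in_mean = x.mean(axis=(1, 2), keepdims=True)     # shape (4, 1, 1, 3)
in_std = x.std(axis=(1, 2), keepdims=True)
x_in = (x - in_mean) / (in_std + 1e-5)

# Batch Normalization: mean/std per channel across the whole batch
bn_mean = x.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 3)
bn_std = x.std(axis=(0, 1, 2), keepdims=True)
x_bn = (x - bn_mean) / (bn_std + 1e-5)

# after IN, each image is contrast-normalized on its own;
# after BN, the result depends on the other images in the batch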

Remember to install keras-contrib before using instance normalization:

$ sudo pip3 install git+https://www.github.com/keras-team/keras-contrib.git

Listing 7.1.1, cyclegan-7.1.1.py, shows the encoder and decoder layer implementation in Keras:
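A minimal sketch of the two builders, consistent with the layer compositions described above (IN-LeakyReLU-Conv2D for the encoder; IN-ReLU followed by a transposed convolution and the skip-connection concatenation for the decoder), is given below. Apart from the encoder_layer(inputs, ...) signature, the parameter names and defaults are assumptions; consult the GitHub repository for the exact listing:

from keras.layers import Activation, Conv2D, Conv2DTranspose
from keras.layers import LeakyReLU, concatenate
from keras_contrib.layers import InstanceNormalization

def encoder_layer(inputs,
                  filters=16,
                  kernel_size=3,
                  strides=2,
                  activation='leaky_relu',
                  instance_norm=True):
    """Builds a generic encoder layer of IN-LeakyReLU-Conv2D."""
    x = inputs
    if instance_norm:
        x = InstanceNormalization()(x)
    if activation == 'relu':
        x = Activation('relu')(x)
    else:
        x = LeakyReLU(alpha=0.2)(x)
    # a strided convolution downsamples the feature maps
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=strides,
               padding='same')(x)
    return x

def decoder_layer(inputs,
                  paired_inputs,
                  filters=16,
                  kernel_size=3,
                  strides=2,
                  instance_norm=True):
    """Builds a generic decoder layer of IN-ReLU-Conv2DTranspose and
    concatenates the result with the paired encoder output
    (the U-Net skip connection)."""
    x = inputs
    if instance_norm:
        x = InstanceNormalization()(x)
    x = Activation('relu')(x)
    # a strided transposed convolution upsamples the feature maps
    x = Conv2DTranspose(filters=filters,
                        kernel_size=kernel_size,
                        strides=strides,
                        padding='same')(x)
    x = concatenate([x, paired_inputs])
    return x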
