16.03.2021 Views

Advanced Deep Learning with Keras


Cross-Domain GANs

The main disadvantage of neural networks similar to pix2pix is that the training input and output images must be aligned. Figure 7.1.1 is an example of an aligned image pair. The sample target image is generated from the source. In most cases, aligned image pairs are not available or are expensive to generate from the source images, or we have no idea how to generate the target image from the given source image. What we have are sample data from the source and target domains. Figure 7.1.2 is an example of data from the source domain (real photo) and the target domain (Van Gogh's art style) on the same sunflower subject. The source and target images are not necessarily aligned.

Unlike pix2pix, CycleGAN learns image translation as long as there is a sufficient amount and variation of source and target data. No alignment is needed. CycleGAN learns the source and target distributions, and how to translate from the source to the target distribution, from the given sample data. No supervision is needed. In the context of Figure 7.1.2, we just need thousands of photos of real sunflowers and thousands of photos of Van Gogh's paintings of sunflowers. After training the CycleGAN, we're able to translate a photo of sunflowers to a Van Gogh painting.
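To make "no alignment is needed" concrete, here is a minimal sketch of how unpaired training batches could be drawn. It is an illustration only, not the book's data pipeline: the array names, image sizes, and the `sample_unpaired_batch` helper are all hypothetical, and random arrays stand in for real photos and paintings. The point is that source and target indices are sampled independently, so the two domains may even contain different numbers of images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 8 source images (e.g. photos) and
# 5 target images (e.g. paintings), each 32x32 RGB.
source_images = rng.random((8, 32, 32, 3))
target_images = rng.random((5, 32, 32, 3))


def sample_unpaired_batch(source, target, batch_size, rng):
    """Draw source and target batches using independent random indices.

    Because the indices are unrelated, source[i] and target[i] in the
    returned batches are not an aligned pair.
    """
    src_idx = rng.integers(0, len(source), size=batch_size)
    tgt_idx = rng.integers(0, len(target), size=batch_size)
    return source[src_idx], target[tgt_idx]


src_batch, tgt_batch = sample_unpaired_batch(source_images, target_images, 4, rng)
print(src_batch.shape, tgt_batch.shape)
```

By contrast, an aligned (pix2pix-style) dataset would use one shared index for both arrays, which requires the two domains to have matching image pairs of equal count.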

Figure 7.1.3: The CycleGAN model is made of four networks: Generator G, Generator F, Discriminator D_y, and Discriminator D_x

