



Then the two feature vectors (one for the image and one for the text) are combined into a single joint vector, which is fed to a dense network to produce the combined model:

# combine the encoded question and visual model
merged = layers.concatenate([encoded_question, visual_model])
# attach a dense network at the end
output = layers.Dense(1000, activation='softmax')(merged)
# get the combined model
vqa_model = models.Model(inputs=[image_input, question_input],
                         outputs=output)
vqa_model.summary()
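The snippet above assumes that encoded_question, visual_model, image_input, and question_input were produced by the vision and text branches built earlier in the chapter. As a self-contained reference, here is a minimal sketch of what those two branches could look like; the layer sizes, vocabulary size, and question length are illustrative assumptions, not the exact values used earlier:

from tensorflow.keras import layers

VOCAB_SIZE = 10000       # question vocabulary size (assumption)
MAX_QUESTION_LEN = 100   # maximum question length in tokens (assumption)

# Visual branch: a small CNN that encodes a 224x224 RGB image
# into a single feature vector.
image_input = layers.Input(shape=(224, 224, 3))
x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(image_input)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Flatten()(x)
visual_model = layers.Dense(512, activation='relu')(x)

# Text branch: embed the question tokens and encode the sequence
# with an LSTM, keeping only the final hidden state.
question_input = layers.Input(shape=(MAX_QUESTION_LEN,), dtype='int32')
embedded = layers.Embedding(VOCAB_SIZE, 256)(question_input)
encoded_question = layers.LSTM(256)(embedded)

With these four tensors in place, the merge code above builds a model that maps an (image, question) pair to a softmax distribution over 1,000 candidate answers, and it can be trained with categorical cross-entropy.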

For instance, if we have a set of labeled images, then we can learn which questions and answers best describe an image. The number of options is enormous! If you want to know more, I suggest that you investigate Maluuba, a start-up providing the FigureQA dataset, with 100,000 figure images and 1,327,368 question-answer pairs in the training set. Maluuba was recently acquired by Microsoft, and its lab is advised by Yoshua Bengio, one of the fathers of deep learning.

In this section, we have discussed how to implement VQA. The next section is about style transfer, a deep learning technique used to train neural networks to create art.

Style transfer

Style transfer is a fun neural network application that provides many insights into the power of neural networks. So what exactly is it? Imagine observing a painting made by a famous artist. In principle, you are observing two elements: the painting itself (say, the face of a woman, or a landscape) and something more intrinsic, the "style" of the artist. What is the style? That is more difficult to define, but humans know that Picasso had his own style, Matisse had his own style, and every artist has their own style. Now, imagine taking a famous painting by Matisse, giving it to a neural network, and letting the neural network repaint it in Picasso's style. Or imagine taking your own photo, giving it to a neural network, and having your photo painted in Matisse's or Picasso's style, or in the style of any other artist that you like. That's what style transfer does.

For instance, go to https://deepart.io/ and see a cool demo as shown in the following image, where DeepArt has been applied by taking the "Van Gogh" style as observed in the Sunflowers painting and applying it to a picture of my daughter Aurora:

Now, how can we define the process of style transfer more formally? Well, style transfer is the task of producing an artificial image x that shares the content of a source content image p and the style of a source style image a. So, intuitively, we need two distance functions: one that measures how different the content of two images is, L_content, and one that measures how different the style of two images is, L_style. Style transfer can then be seen as an optimization problem where we try to minimize these two metrics. As in Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge [7], we use a pretrained network to achieve style transfer. In particular, we can feed a VGG19 (or any suitable pretrained network) with the two images and extract features that represent them efficiently. Now we are going to define the two functions used for training the network: the content distance and the style distance.

Content distance

Given two images, a content image p and an input image x, we define the content distance as the distance in the feature space defined by a layer l of a VGG19 network receiving the two images as input. In other words, the two images are represented by the features extracted by a pretrained VGG19. These features project the images into a feature "content" space where the "content" distance can be conveniently computed as follows:

L_content(p, x, l) = 1/2 * Σ_{i,j} (F^l_{ij}(x) - P^l_{ij}(p))^2

where F^l(x) and P^l(p) denote the feature maps produced at layer l for the input image x and the content image p; this is the content loss defined in [7].
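To make this concrete, the following is a minimal sketch of how the content distance could be computed with tf.keras. The choice of block5_conv2 as the content layer and the preprocessing details are assumptions that follow common practice for Gatys-style transfer, not prescriptions from the text:

import tensorflow as tf

# Use a pretrained VGG19 as a fixed feature extractor: images go in,
# the activations of one chosen "content" layer come out.
CONTENT_LAYER = 'block5_conv2'   # a common choice (assumption)
vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
vgg.trainable = False
feature_extractor = tf.keras.Model(
    inputs=vgg.input,
    outputs=vgg.get_layer(CONTENT_LAYER).output)

def content_distance(p_image, x_image):
    # Both arguments: float32 tensors of shape (1, H, W, 3) holding RGB
    # pixels in the 0-255 range; preprocess_input converts them to the
    # BGR, mean-centered format that VGG19 expects.
    p_features = feature_extractor(
        tf.keras.applications.vgg19.preprocess_input(p_image))
    x_features = feature_extractor(
        tf.keras.applications.vgg19.preprocess_input(x_image))
    # Half the sum of squared differences between the two feature maps.
    return 0.5 * tf.reduce_sum(tf.square(x_features - p_features))

In a full style-transfer loop, this quantity is minimized, together with the style distance, by running gradient descent directly on the pixels of the input image x.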
