Advanced Convolutional Neural Networks

ResNets, HyperNets, DenseNets, Inception, and Xception are all available as pretrained nets in both tf.keras.applications and TF-Hub. The Keras Applications page (https://keras.io/applications) reports a nice summary of the performance achieved on the ImageNet dataset and the depth of each network:

Figure 24: Performance summary, shown by Keras
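
As a minimal sketch of how one of these pretrained nets might be loaded from tf.keras.applications, consider the following. The helper names build_pretrained and classify are hypothetical and introduced only for illustration; the choice of ResNet50 and of 224 x 224 inputs is likewise an assumption, not something prescribed by the text.

```python
import numpy as np
import tensorflow as tf


def build_pretrained(name="ResNet50", weights="imagenet"):
    """Look up an architecture by class name in tf.keras.applications.

    `name` must match a constructor exposed by the module, for example
    "ResNet50", "DenseNet121", "InceptionV3", or "Xception".
    weights="imagenet" downloads the pretrained ImageNet weights on first
    use; weights=None builds the same network with random initialization.
    """
    constructor = getattr(tf.keras.applications, name)
    return constructor(weights=weights, include_top=True)


def classify(model, images):
    """Return (class indices, probabilities) for a preprocessed image batch.

    `images` is a float array of shape (batch, height, width, 3) that has
    already been resized and preprocessed for the chosen architecture.
    """
    probs = model.predict(images, verbose=0)
    return np.argmax(probs, axis=-1), probs
```

A typical call would be `model = build_pretrained("ResNet50")` followed by `classify(model, batch)`. Each architecture family ships its own preprocessing function (for ResNet50, `tf.keras.applications.resnet50.preprocess_input`) and expects its own default input size, so both should be checked before feeding real images.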

In this section, we have discussed many CNN architectures. Next, we are going to see how to answer questions about images by using CNNs.

Answering questions about images (VQA)

One of the nice things about neural networks is that different media types can be combined to provide a unified interpretation. For instance, Visual Question Answering (VQA) combines image recognition and natural language processing. Training can use the VQA dataset (https://visualqa.org/), which contains open-ended questions about images. These questions require an understanding of vision, language, and common knowledge to answer. The following images are taken from a demo available at https://visualqa.org/.

In each image, the question appears at the top, followed by the answers.
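
The combination of image recognition and natural language processing described above can be sketched as a two-branch Keras model: a small CNN encodes the image, an LSTM encodes the question, and the two encodings are concatenated before a classifier. This is only an illustrative sketch under stated assumptions. The function name build_vqa_model, all layer sizes, the vocabulary size, and the simplification of treating VQA as classification over a fixed set of candidate answers are hypothetical choices, not the method used by the demo.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_vqa_model(vocab_size=1000, max_question_len=20,
                    num_answers=100, image_shape=(64, 64, 3)):
    """Two-input model: (image, tokenized question) -> answer distribution.

    All sizes are illustrative assumptions.
    """
    # Image branch: a small CNN that reduces the image to a feature vector.
    image_in = layers.Input(shape=image_shape, name="image")
    x = layers.Conv2D(32, 3, activation="relu")(image_in)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Question branch: embed the integer token ids, then summarize with an LSTM.
    question_in = layers.Input(shape=(max_question_len,), dtype="int32",
                               name="question")
    q = layers.Embedding(vocab_size, 64)(question_in)
    q = layers.LSTM(64)(q)

    # Fuse both modalities and classify over the candidate answers.
    merged = layers.concatenate([x, q])
    h = layers.Dense(128, activation="relu")(merged)
    answer = layers.Dense(num_answers, activation="softmax", name="answer")(h)

    model = tf.keras.Model(inputs=[image_in, question_in], outputs=answer)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

In practice the image branch would more plausibly be one of the pretrained nets discussed earlier with its classification head removed, and the questions would first need to be tokenized and padded to max_question_len.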
