Advanced Convolutional Neural Networks

In this chapter we will see some more advanced uses for convolutional neural networks (CNNs). We will explore how CNNs can be applied within the areas of computer vision, video, textual documents, audio, and music. We'll conclude with a section summarizing convolution operations. We'll begin our look into CNNs with image processing.

Computer vision

In this section we'll look at the ways in which CNN architecture can be utilized when applied to the area of image processing, and the interesting results that can be generated.

Composing CNNs for complex tasks

We have discussed CNNs quite extensively in the previous chapter, and at this point you are probably convinced about the effectiveness of the CNN architecture for image classification tasks. What you may find surprising, however, is that the basic CNN architecture can be composed and extended in various ways to solve a variety of more complex tasks.
In this section, we will look at the computer vision tasks in the following diagram
and show how they can be solved by composing CNNs into larger and more
complex architectures:
Figure 1: Different computer vision tasks. Source: Introduction to Artificial Intelligence and Computer
Vision Revolution (https://www.slideshare.net/darian_f/introduction-to-the-artificial-intelligence-andcomputer-vision-revolution).
Classification and localization
In the classification and localization task not only do you have to report the class of
object found in the image, but also the coordinates of the bounding box where the
object appears in the image. This type of task assumes that there is only one instance
of the object in an image.
This can be achieved by attaching a "regression head" in addition to the
"classification head" in a typical classification network. Recall that in a classification
network, the final output of convolution and pooling operations, called the
feature map, is fed into a fully connected network that produces a vector of class
probabilities. This fully connected network is called the classification head, and it is
tuned using a categorical loss function ($L_c$) such as categorical cross entropy.
Similarly, a regression head is another fully connected network that takes the feature
map and produces a vector (x, y, w, h) representing the top-left x and y coordinates,
width and height of the bounding box. It is tuned using a continuous loss function
($L_r$) such as mean squared error. The entire network is tuned using a linear
combination of the two losses, that is:
$$L = \alpha L_c + (1 - \alpha) L_r$$
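To make the two-headed setup concrete, here is a minimal sketch in plain Python of how a shared feature map can feed a classification head and a regression head, and how the two losses are combined. It is an illustration under stated assumptions rather than the chapter's implementation: each head is simplified to a single fully connected layer, the feature map is a random stand-in for the output of the convolution and pooling layers, and the sizes, the helper names (`dense`, `random_layer`, `combined_loss`), and the choice of `alpha=0.5` are all hypothetical.

```python
import math
import random

def dense(inputs, weights, biases):
    """Fully connected layer: one output per (weight row, bias) pair."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Turn raw scores into class probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """L_c: negative log-probability assigned to the true class."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot, probs))

def mean_squared_error(true_box, pred_box):
    """L_r: mean squared difference over (x, y, w, h)."""
    return sum((t - p) ** 2 for t, p in zip(true_box, pred_box)) / len(true_box)

def combined_loss(l_c, l_r, alpha=0.5):
    """L = alpha * L_c + (1 - alpha) * L_r."""
    return alpha * l_c + (1.0 - alpha) * l_r

def random_layer(rng, n_out, n_in):
    """Hypothetical helper: small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
NUM_FEATURES, NUM_CLASSES = 8, 3  # assumed sizes, for illustration only

# Stand-in for the flattened feature map produced by the convolutional layers.
feature_map = [rng.random() for _ in range(NUM_FEATURES)]

cls_w, cls_b = random_layer(rng, NUM_CLASSES, NUM_FEATURES)
reg_w, reg_b = random_layer(rng, 4, NUM_FEATURES)

# Both heads read the same feature map.
class_probs = softmax(dense(feature_map, cls_w, cls_b))  # classification head
pred_box = dense(feature_map, reg_w, reg_b)              # regression head: (x, y, w, h)

true_class = [0.0, 1.0, 0.0]     # one-hot label (made-up example)
true_box = [0.2, 0.3, 0.4, 0.5]  # made-up ground-truth box

l_c = categorical_cross_entropy(true_class, class_probs)
l_r = mean_squared_error(true_box, pred_box)
loss = combined_loss(l_c, l_r, alpha=0.5)
print(f"L_c={l_c:.4f}  L_r={l_r:.4f}  L={loss:.4f}")
```

Setting `alpha` closer to 1 makes training favor the class prediction, while values closer to 0 favor the bounding box. In a real network the gradient of the combined loss would flow back through both heads into the shared convolutional layers.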