Convolutional Neural Networks

Very deep convolutional networks for large-scale image recognition

In 2014, K. Simonyan and A. Zisserman made an interesting contribution to image recognition with the paper Very Deep Convolutional Networks for Large-Scale Image Recognition [4]. The paper showed that a "significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers." One model in the paper, denoted as D or VGG-16, has 16 weight layers: 13 convolutional and 3 fully connected. An implementation in C++ Caffe (http://caffe.berkeleyvision.org/) was used for training the model on the ImageNet ILSVRC-2012 (http://image-net.org/challenges/LSVRC/2012/) dataset, which includes images of 1,000 classes and is split into three sets: training (1.3 million images), validation (50,000 images), and testing (100,000 images). Each image is fed to the network as a 224×224 input with 3 channels. The model achieves 7.5% top-5 error on ILSVRC-2012-val and 7.4% top-5 error on ILSVRC-2012-test.
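To make the "16 weight layers" count concrete, the following is a minimal sketch in plain Python that tallies the layers of configuration D as listed in the paper. The names VGG16_CONV_CONFIG, VGG16_FC_UNITS, and the three helper functions are hypothetical, chosen only for this illustration. The sketch assumes 3×3 convolutions with biases, 2×2 max pooling with stride 2, and a 224×224×3 input.

```python
# Configuration D (VGG-16): integers are the output channels of a 3x3
# convolution, "M" marks a 2x2 max-pooling layer (which has no weights).
VGG16_CONV_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                     512, 512, 512, "M", 512, 512, 512, "M"]
# The three fully connected layers that follow the convolutional stack.
VGG16_FC_UNITS = [4096, 4096, 1000]


def count_weight_layers(conv_config, fc_units):
    """Count layers that carry learnable weights (conv + fully connected)."""
    conv_layers = sum(1 for item in conv_config if item != "M")
    return conv_layers + len(fc_units)


def final_feature_map_size(input_size, conv_config):
    """Spatial size after the conv stack: padded 3x3 convs keep the size,
    each 2x2 pooling with stride 2 halves it."""
    size = input_size
    for item in conv_config:
        if item == "M":
            size //= 2
    return size


def count_parameters(conv_config, fc_units, input_size=224, in_channels=3):
    """Total number of weights and biases under the assumptions above."""
    total = 0
    channels = in_channels
    for item in conv_config:
        if item == "M":
            continue
        total += 3 * 3 * channels * item + item  # kernel weights + biases
        channels = item
    side = final_feature_map_size(input_size, conv_config)
    features = side * side * channels  # flattened input to the first FC layer
    for units in fc_units:
        total += features * units + units
        features = units
    return total


if __name__ == "__main__":
    print(count_weight_layers(VGG16_CONV_CONFIG, VGG16_FC_UNITS))   # 16
    print(final_feature_map_size(224, VGG16_CONV_CONFIG))           # 7
    print(count_parameters(VGG16_CONV_CONFIG, VGG16_FC_UNITS))      # 138357544
```

Most of the roughly 138 million parameters sit in the first fully connected layer, which takes the flattened 7×7×512 feature map as its input.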

According to the ImageNet site, "The goal of this competition is to estimate the content of photographs for the purpose of retrieval and automatic annotation using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) as training. Test images will be presented with no initial annotation – no segmentation or labels – and algorithms will have to produce labelings specifying what objects are present in the images."
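The top-5 error figures quoted earlier measure how often the correct label is missing from a model's five highest-scoring guesses. The following is a minimal sketch of that metric in plain Python. The function name top_k_error and the toy scores are hypothetical and are not taken from the paper or from the official evaluation code.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k classes
    with the highest predicted scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the best k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


if __name__ == "__main__":
    # Toy example: 4 samples, 4 classes (hypothetical scores).
    # The true label is ranked 1st, 2nd, 3rd, and 4th respectively.
    scores = [
        [0.70, 0.20, 0.07, 0.03],
        [0.10, 0.30, 0.55, 0.05],
        [0.25, 0.05, 0.10, 0.60],
        [0.40, 0.35, 0.15, 0.10],
    ]
    labels = [0, 1, 2, 3]
    print(top_k_error(scores, labels, k=1))  # 0.75
    print(top_k_error(scores, labels, k=2))  # 0.5
```

With 1,000 classes and k=5, the same function would return a value such as 0.075 for a model that misses the correct label on 7.5% of the images.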

The weights learned by the model implemented in Caffe have been directly converted (https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3) to tf.Keras and can be used by preloading them into the tf.Keras model, which is implemented as follows, as described in the paper:

import tensorflow as tf
from tensorflow.keras import layers, models

# define a VGG16 network
def VGG_16(weights_path=None):
    model = models.Sequential()
    model.add(layers.ZeroPadding2D((1,1),input_shape=(224,224, 3)))
    model.add(layers.Convolution2D(64, (3, 3), activation='relu'))
    model.add(layers.ZeroPadding2D((1,1)))
    model.add(layers.Convolution2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2,2), strides=(2,2)))
    model.add(layers.ZeroPadding2D((1,1)))
    model.add(layers.Convolution2D(128, (3, 3), activation='relu'))
    model.add(layers.ZeroPadding2D((1,1)))
