19/19 [==============================] - 6s 291ms/step - loss: 0.0575 - accuracy: 0.9833 - masked_accuracy_fn: 0.8140 - val_loss: 0.1569 - val_accuracy: 0.9615 - val_masked_accuracy_fn: 0.5511
11/11 [==============================] - 2s 170ms/step - loss: 0.1436 - accuracy: 0.9637 - masked_accuracy_fn: 0.5786
test loss: 0.144, test accuracy: 0.963, masked test accuracy: 0.578

Here are some examples of POS tags generated for some random sentences in the test set, shown together with the POS tags in the corresponding ground truth sentences. As you can see, while the metric values are not perfect, it seems to have learned to do POS tagging fairly well:

labeled  : among/IN segments/NNS that/WDT t/NONE 1/VBP continue/NONE 2/TO to/VB operate/RB though/DT the/NN company/POS 's/NN steel/NN division/VBD continued/NONE 3/TO to/VB suffer/IN from/JJ soft/NN demand/IN for/PRP its/JJ tubular/NNS goods/VBG serving/DT the/NN oil/NN industry/CC and/JJ other/NNS
predicted: among/IN segments/NNS that/WDT t/NONE 1/NONE continue/NONE 2/TO to/VB operate/IN though/DT the/NN company/NN 's/NN steel/NN division/NONE continued/NONE 3/TO to/IN suffer/IN from/IN soft/JJ demand/NN for/IN its/JJ tubular/NNS goods/DT serving/DT the/NNP oil/NN industry/CC and/JJ other/NNS

labeled  : as/IN a/DT result/NN ms/NNP ganes/NNP said/VBD 0/NONE t/NONE 2/PRP it/VBZ is/VBN believed/IN that/JJ little/CC or/DT no/NN sugar/IN from/DT the/CD 1989/NN 90/VBZ crop/VBN has/VBN been/NONE shipped/RB 1/RB yet/IN even/DT though/NN the/NN crop/VBZ year/CD is/NNS six/JJ
predicted: as/IN a/DT result/NN ms/IN ganes/NNP said/VBD 0/NONE t/NONE 2/PRP it/VBZ is/VBN believed/NONE that/DT little/NN or/DT no/NN sugar/IN from/DT the/DT 1989/CD 90/NN crop/VBZ has/VBN been/VBN shipped/VBN 1/RB yet/RB even/IN though/DT the/NN crop/NN year/NN is/JJ

labeled  : in/IN the/DT interview/NN at/IN headquarters/NN yesterday/NN afternoon/NN both/DT men/NNS exuded/VBD confidence/NN and/CC seemed/VBD 1/NONE to/TO work/VB well/RB together/RB
predicted: in/IN the/DT interview/NN at/IN headquarters/NN yesterday/NN afternoon/NN both/DT men/NNS exuded/NNP confidence/NN and/CC seemed/VBD 1/NONE to/TO work/VB well/RB together/RB

labeled  : all/DT came/VBD from/IN cray/NNP research/NNP
predicted: all/NNP came/VBD from/IN cray/NNP research/NNP

labeled  : primerica/NNP closed/VBD at/IN 28/CD 25/NONE u/RB down/CD 50/NNS
predicted: primerica/NNP closed/VBD at/CD 28/CD 25/CD u/CD down/CD
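The masked accuracy reported above is computed only over positions that carry a real POS tag, so the PAD tokens used to bring all sentences to a common length do not inflate the score. Below is a minimal sketch of how such a metric might be written in tf.keras, assuming the PAD label is encoded as index 0; the exact implementation used here lives in gru_pos_tagger.py and may differ.

import tensorflow as tf

# Sketch of a masked accuracy metric: accuracy is computed only over
# positions whose true label is not the PAD index (assumed to be 0 here).
def masked_accuracy_fn(ytrue, ypred):
    # ytrue, ypred have shape (batch_size, timesteps, num_classes)
    ytrue_ids = tf.argmax(ytrue, axis=-1)
    ypred_ids = tf.argmax(ypred, axis=-1)
    mask = tf.cast(tf.not_equal(ytrue_ids, 0), tf.float32)
    matches = tf.cast(tf.equal(ytrue_ids, ypred_ids), tf.float32) * mask
    return tf.reduce_sum(matches) / tf.maximum(tf.reduce_sum(mask), 1.0)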



If you would like to run this code yourself, you can find the code in the code folder for this chapter. In order to run it from the command line, enter the following command. The output is written to the console:

$ python gru_pos_tagger.py
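For reference, a many-to-many tagger of this kind typically stacks an embedding layer, a recurrent layer that returns an output at every time step, and a time-distributed softmax over the tag vocabulary. The sketch below only illustrates the pattern; the vocabulary sizes and layer dimensions are placeholder assumptions rather than the values used in gru_pos_tagger.py.

import tensorflow as tf

# Illustrative many-to-many GRU tagger; all sizes are assumptions,
# not the values used in the chapter's gru_pos_tagger.py.
source_vocab_size = 10000   # word vocabulary (including PAD)
target_vocab_size = 50      # POS tag vocabulary (including PAD)
embedding_dim = 128
rnn_dim = 256

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(source_vocab_size, embedding_dim),
    # return_sequences=True emits one output per input time step
    tf.keras.layers.GRU(rnn_dim, return_sequences=True),
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(target_vocab_size, activation="softmax"))
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])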

Now that we have seen some examples of three common RNN network topologies, let us explore the most popular of them all – the seq2seq model, also known as the recurrent encoder-decoder architecture.

Encoder-Decoder architecture – seq2seq

The example of a many-to-many network we just saw was mostly similar to the many-to-one network. The one important difference was that the RNN returned an output at each time step instead of a single combined output at the end. One other noticeable feature was that the number of input time steps was equal to the number of output time steps. As you learn about the encoder-decoder architecture, which is the "other," and arguably more popular, style of many-to-many network, you will notice another difference – in the network we just saw, the output is in line with the input, that is, the network does not need to wait until all of the input has been consumed before it starts generating output.

The Encoder-Decoder architecture is also called a seq2seq model. As the name implies, the network is composed of an encoder and a decoder, both RNN-based, and each capable of consuming and returning sequences corresponding to multiple time steps. The biggest application of the seq2seq network has been in neural machine translation, although it is equally applicable to problems that can be roughly structured as translation problems. Some examples are sentence parsing [10] and image captioning [24]. The seq2seq model has also been used for time series analysis [25] and question answering.

In the seq2seq model, the encoder consumes the source sequence, which is a batch of integer sequences. The length of the sequence is the number of input time steps, which corresponds to the maximum input sequence length (padded or truncated as necessary). Thus the shape of the input tensor is (batch_size, number_of_encoder_timesteps). This is passed into an embedding layer, which will convert the integer at each time step to an embedding vector. The output of the embedding is a tensor of shape (batch_size, number_of_encoder_timesteps, encoder_embedding_dim).
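The shape transformation just described can be verified with a small snippet; the batch size, sequence length, vocabulary size, and embedding dimension below are illustrative assumptions, not values taken from the chapter's code.

import tensorflow as tf

# Illustrative shapes for the encoder input and its embedding output;
# all sizes here are assumptions made for the example.
batch_size = 64
number_of_encoder_timesteps = 30    # maximum (padded) source sequence length
source_vocab_size = 5000
encoder_embedding_dim = 128

# Batch of integer source sequences: (batch_size, number_of_encoder_timesteps)
encoder_in = tf.random.uniform(
    (batch_size, number_of_encoder_timesteps),
    maxval=source_vocab_size, dtype=tf.int32)

embedding = tf.keras.layers.Embedding(source_vocab_size,
                                      encoder_embedding_dim)
encoder_emb = embedding(encoder_in)
print(encoder_emb.shape)    # (64, 30, 128)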

