Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub

Figure 9.3 - Sequence dataset

The corners show the order in which they were drawn. In the first square, the drawing started at the top-right corner (corresponding to the blue C corner) and followed a clockwise direction (corresponding to the CDAB sequence). The source sequence for that square would include corners C and D (1 and 2), while the target sequence would include corners A and B (3 and 4), in that order.

In order to output a sequence, we need a more complex architecture; we need an…

Encoder-Decoder Architecture

The encoder-decoder is a combination of two models: the encoder and the decoder.

Encoder

The encoder's goal is to generate a representation of the source sequence; that is, to encode it.

"Wait, we've done that already, right?"

Absolutely! That's what the recurrent layers did: They generated a final hidden state that was a representation of the input sequence. Now you know why I insisted so much on this idea and repeated it over and over again in Chapter 8 :-)

Encoder-Decoder Architecture | 689
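The source/target split described above can be sketched in a few lines. This is only an illustration of the idea: the corner coordinates below are made up for this example and are not the dataset's actual values.

```python
import numpy as np

# Hypothetical coordinates for corners C, D, A, B of one square,
# listed in drawing order (CDAB, clockwise from the top-right corner)
full_seq = np.array([[ 1.0,  1.0],   # C (drawn 1st)
                     [ 1.0, -1.0],   # D (drawn 2nd)
                     [-1.0, -1.0],   # A (drawn 3rd)
                     [-1.0,  1.0]])  # B (drawn 4th)

# The first two corners form the source sequence,
# the last two form the target sequence
source_seq, target_seq = full_seq[:2], full_seq[2:]
```

The model will be fed the source sequence and trained to predict the target sequence.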

The figure below should look familiar: It is a typical recurrent neural network that we're using to encode the source sequence.

Figure 9.4 - Encoder

The encoder model is a slim version of our models from Chapter 8: It simply returns a sequence of hidden states.

Encoder

```python
class Encoder(nn.Module):
    def __init__(self, n_features, hidden_dim):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.n_features = n_features
        self.hidden = None
        self.basic_rnn = nn.GRU(self.n_features,
                                self.hidden_dim,
                                batch_first=True)

    def forward(self, X):
        rnn_out, self.hidden = self.basic_rnn(X)

        return rnn_out  # N, L, F
```

"Don't we need only the final hidden state?"

That's correct. We'll be using the final hidden state only … for now.

In the "Attention" section, we'll be using all hidden states, and that's why we're implementing the encoder like this.

Let's go over a simple example of encoding: We start with a sequence of

690 | Chapter 9 — Part I: Sequence-to-Sequence
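A quick sketch of the encoder in action may help here. The batch below contains a single source sequence of two corners (two coordinates each), and hidden_dim=2 is an arbitrary choice for illustration; the seed and input values are made up for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(21)

# Same Encoder as above: a GRU that returns the full sequence of hidden states
class Encoder(nn.Module):
    def __init__(self, n_features, hidden_dim):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.n_features = n_features
        self.hidden = None
        self.basic_rnn = nn.GRU(self.n_features,
                                self.hidden_dim,
                                batch_first=True)

    def forward(self, X):
        rnn_out, self.hidden = self.basic_rnn(X)
        return rnn_out  # N, L, F

# One source sequence with two corners, two coordinates each: N=1, L=2, F=2
source_seq = torch.tensor([[[1.0, 1.0], [1.0, -1.0]]])

encoder = Encoder(n_features=2, hidden_dim=2)
hidden_seq = encoder(source_seq)   # hidden state at every step: (1, 2, 2)
final_hidden = hidden_seq[:, -1:]  # only the last hidden state … for now
```

Notice that the encoder keeps every hidden state in `hidden_seq` even though, for the time being, only the last one (`final_hidden`) will be handed over to the decoder.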

