
Output

Hidden: tensor([[[ 0.3105, -0.5263]]], grad_fn=<SliceBackward>)

Output: tensor([[[-0.2339, 0.4702]]], grad_fn=<ViewBackward>)

Hidden: tensor([[[ 0.3913, -0.6853]]], grad_fn=<StackBackward>)

Output: tensor([[[0.2265, 0.4529]]], grad_fn=<ViewBackward>)

You may set teacher_forcing_prob to 1.0 or 0.0 to replicate either of the two outputs we generated before.
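
Under the hood, all this probability controls is a random draw made at every step of the generation loop. Here is a minimal sketch of that decision; the function and argument names are illustrative, not the book's own code:

import torch

# A minimal sketch of the teacher forcing decision made at every step of the
# decoder's loop (names are illustrative): with teacher_forcing_prob=1.0 the
# actual coordinates are always fed back, with 0.0 the model's own prediction
# always is.
def next_input(actual, predicted, teacher_forcing_prob):
    use_actual = torch.rand(1).item() <= teacher_forcing_prob
    return actual if use_actual else predicted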

Now it is time to put the two of them together…

Encoder + Decoder

The figure below illustrates the flow of information from encoder to decoder.

Figure 9.6 - Encoder + decoder

Let’s go over it once again:

• The encoder receives the source sequence (x0 and x1, in red) and generates the representation of the source sequence, its final hidden state (hf, in blue).

• The decoder receives the hidden state from the encoder (hf, in blue), together with the last known element of the sequence (x1, in red), to output a hidden state (h2, in green) that is converted into the first set of predicted coordinates (x2, in green) using a linear layer (wᵀh, in green).

• In the next iteration of the loop, the model randomly uses the predicted (x2, in green) or the actual (x2, in red) set of coordinates as one of its inputs to output the next hidden state and, from it, the next set of predicted coordinates. The sketch after this list puts the whole loop together.

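A rough sketch of how encoder and decoder can be wired together is shown below. It assumes the encoder returns its sequence of hidden states, that the decoder keeps an internal hidden state initialized from the encoder's final one (through an init_hidden() method), and that it predicts one set of coordinates per call; class, attribute, and method names are illustrative, not necessarily the book's actual implementation:

import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, encoder, decoder, input_len, target_len,
                 teacher_forcing_prob=0.5):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.input_len = input_len
        self.target_len = target_len
        self.teacher_forcing_prob = teacher_forcing_prob

    def forward(self, X):
        # split the full sequence into source (x0, x1) and target (x2, x3) parts
        source_seq = X[:, :self.input_len, :]
        target_seq = X[:, self.input_len:, :]

        # the encoder builds the representation of the source sequence (hf)
        hidden_seq = self.encoder(source_seq)
        # the decoder's hidden state is initialized with the encoder's output
        self.decoder.init_hidden(hidden_seq)

        # the last known element of the sequence (x1) is the decoder's first input
        dec_inputs = source_seq[:, -1:, :]
        outputs = []
        for i in range(self.target_len):
            # one decoder step: hidden state -> linear layer -> coordinates
            out = self.decoder(dec_inputs)
            outputs.append(out)
            # teacher forcing: randomly feed the actual coordinates instead of
            # the model's own prediction as the next input (training only)
            use_actual = (self.training and
                          torch.rand(1).item() <= self.teacher_forcing_prob)
            dec_inputs = target_seq[:, i:i+1, :] if use_actual else out
        return torch.cat(outputs, dim=1)

Note that, in this sketch, teacher forcing only applies in training mode: after calling model.eval(), the target sequence is never used and the model relies solely on its own predictions.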
