
Visualizing Attention

Let’s look at what the model is paying attention to by checking what’s stored in the alphas attribute. The scores will be different for each source sequence, so let’s try making predictions for the very first one:

inputs = full_train[:1, :2]         # first source sequence
out = sbs_seq_attn.predict(inputs)  # predicting populates model.alphas
sbs_seq_attn.model.alphas

Output

tensor([[[0.8196, 0.1804],
         [0.7316, 0.2684]]], device='cuda:0')

"How do I interpret these attention scores?"

The columns represent the elements in the source sequence, and the rows, the elements in the target sequence:

$$
\text{alphas} =
\begin{array}{c|cc}
 & \text{source}_1 & \text{source}_2 \\
\hline
\text{target}_1 & 0.8196 & 0.1804 \\
\text{target}_2 & 0.7316 & 0.2684
\end{array}
$$

Equation 9.10 - Attention score matrix
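Since the scores in each row come out of a softmax over the source elements, every row must add up to one. Here is a minimal sanity check, assuming alphas was populated by the predict() call above:

alphas = sbs_seq_attn.model.alphas
# each row is a softmax over the source elements, so it sums to one
alphas.sum(dim=-1)  # expected: tensor([[1.0000, 1.0000]], device='cuda:0')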

The attention scores we get tell us that the model mostly paid attention to the first data point of the source sequence. This is not going to be the case for every sequence, though. Let’s check what the model is paying attention to for the first ten sequences in the training set.
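Here is a minimal sketch of one way to do it, drawing each attention matrix as a heatmap with Matplotlib (the grid layout and labels below are illustrative choices, not fixed conventions):

import matplotlib.pyplot as plt

inputs = full_train[:10, :2]        # first ten source sequences
out = sbs_seq_attn.predict(inputs)  # populates model.alphas again
alphas = sbs_seq_attn.model.alphas  # shape: (10, 2, 2)

fig, axs = plt.subplots(2, 5, figsize=(15, 6))
for i, ax in enumerate(axs.flat):
    # rows = target elements, columns = source elements
    ax.imshow(alphas[i].detach().cpu().numpy(),
              vmin=0, vmax=1, cmap='gray')
    ax.set_title(f'Sequence #{i}')
    ax.set_xlabel('Source')
    ax.set_ylabel('Target')
fig.tight_layout()

Lighter cells mean higher attention scores, so comparing the panels makes it easy to spot which source data point dominates for each sequence.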
