
alphas = F.softmax(scaled_products, dim=-1)
alphas

Output

tensor([[[0.4254, 0.5746]]], grad_fn=<SoftmaxBackward>)

The computation of the context vectors using scaled dot-product attention is usually depicted like this:

Figure 9.15 - Scaled dot-product attention
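In equation form, the figure corresponds to the standard formulation of scaled dot-product attention, with Q, K, and V standing for the stacked queries, keys, and values, and d_k for the number of dimensions of the keys:

$$\text{context vectors} = \text{softmax}\left(\frac{Q K^T}{\sqrt{d_k}}\right) V$$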

If you prefer to see it in code, it looks like this:

def calc_alphas(ks, q):
    dims = q.size(-1)
    # N, 1, H x N, H, L -> N, 1, L
    products = torch.bmm(q, ks.permute(0, 2, 1))
    scaled_products = products / np.sqrt(dims)
    alphas = F.softmax(scaled_products, dim=-1)
    return alphas
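For a quick sanity check, the sketch below calls calc_alphas() on randomly generated tensors and then uses the resulting scores to compute the context vectors with a batch matrix multiplication. The shapes (N=1, L=2, H=2), the seed, and the imports are assumptions for illustration, not taken from the surrounding chapter:

import numpy as np
import torch
import torch.nn.functional as F

torch.manual_seed(11)
# illustrative shapes: N=1 (batch), L=2 (source states), H=2 (hidden dims)
q = torch.randn(1, 1, 2)   # N, 1, H - a single query
ks = torch.randn(1, 2, 2)  # N, L, H - the keys
vs = torch.randn(1, 2, 2)  # N, L, H - the values

alphas = calc_alphas(ks, q)
# the attention scores over the source states sum to 1.0
print(alphas.sum(dim=-1))

# N, 1, L x N, L, H -> N, 1, H - weighted average of the values
context = torch.bmm(alphas, vs)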

