
Now, let's retrieve the contextual embeddings for every token in each sentence:

embed1 = get_embeddings(bert, watch1)
embed2 = get_embeddings(bert, watch2)
embed2

Output

tensor([[ 0.6554, -0.3799, -0.2842,  ...,  0.8865,  0.4760],
        [-0.1459, -0.0204, -0.0615,  ...,  0.5052,  0.3324],
        [-0.0436, -0.0401, -0.0135,  ...,  0.5231,  0.9067],
        ...,
        [-0.2582,  0.6933,  0.2688,  ...,  0.0772,  0.2187],
        [-0.1868,  0.6398, -0.8127,  ...,  0.2793,  0.1880],
        [-0.1021,  0.5222, -0.7142,  ...,  0.0600, -0.1419]])
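
As a reminder, get_embeddings() runs a sentence through BERT and returns one contextual embedding per token. The book builds this helper earlier in the chapter; a minimal sketch of such a helper, assuming a Hugging Face pre-trained uncased BERT model and its matching tokenizer (both assumptions, not shown in this excerpt), could look like this:

import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: a pre-trained, uncased BERT and its matching tokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
bert = AutoModel.from_pretrained('bert-base-uncased')

def get_embeddings(model, sentence):
    # Tokenize the sentence and run it through the model
    tokens = tokenizer(sentence, return_tensors='pt')
    model.eval()
    with torch.no_grad():
        output = model(**tokens)
    # Drop the batch dimension: one 768-dim vector per token
    return output.last_hidden_state.squeeze(0)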

Then, let's compare the embeddings for the word "watch" in both sentences once again:

bert_watch1 = embed1[31]
bert_watch2 = embed2[13]
print(bert_watch1, bert_watch2)

Output

(tensor([ 8.5760e-01,  3.5888e-01, -3.7825e-01, -8.3564e-01,
          ...,
          2.0768e-01,  1.1880e-01,  4.1445e-01]),
 tensor([-9.8449e-02,  1.4698e+00,  2.8573e-01, -3.9569e-01,
          ...,
          3.1746e-01, -2.8264e-01, -2.1325e-01]))
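
By the way, the hard-coded indices above (31 and 13) point at the position of the "watch" token in each tokenized sentence. One way to locate such an index, assuming the Hugging Face tokenizer from the sketch above (the book may arrive at these indices differently):

# Map token IDs back to token strings and find "watch" in each sentence
tokens1 = tokenizer.convert_ids_to_tokens(tokenizer(watch1)['input_ids'])
tokens2 = tokenizer.convert_ids_to_tokens(tokenizer(watch2)['input_ids'])
idx1 = tokens1.index('watch')  # first occurrence of "watch" in watch1
idx2 = tokens2.index('watch')  # first occurrence of "watch" in watch2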

Well, the two "watch" embeddings look more different from one another now. But are they, really?

similarity = nn.CosineSimilarity(dim=0, eps=1e-6)
similarity(bert_watch1, bert_watch2)
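
For reference, nn.CosineSimilarity computes cos θ = u·v / (‖u‖ ‖v‖) along the given dimension, so a value close to one means the two vectors point in nearly the same direction, while a value close to zero means they are mostly unrelated. The same result can also be obtained with PyTorch's functional form (a sketch, assuming the two tensors above):

import torch.nn.functional as F

# Equivalent functional call: cosine similarity along dim 0
F.cosine_similarity(bert_watch1, bert_watch2, dim=0, eps=1e-6)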
