
There we go, 50 dimensions! It’s time to try the famous "equation": KING - MAN + WOMAN = QUEEN. We’re calling the result a "synthetic queen":

synthetic_queen = glove['king'] - glove['man'] + glove['woman']
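In case you’re coding along, the glove object above is presumably a Gensim KeyedVectors instance holding pretrained 50-dimensional GloVe vectors. One way to get it, assuming you’re using Gensim’s downloader API (the model name below is the standard 50-dimensional GloVe trained on Wikipedia and Gigaword):

import gensim.downloader as api

# Downloads the vectors on first use and caches them locally
glove = api.load('glove-wiki-gigaword-50')

glove['king'].shape  # (50,)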

These are the corresponding embeddings:

fig = plot_word_vectors(
    glove, ['king', 'man', 'woman', 'synthetic', 'queen'],
    other={'synthetic': synthetic_queen}
)

Figure 11.17 - Synthetic queen
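plot_word_vectors() is a helper defined earlier in the book. If you don’t have it handy, a rough stand-in, assuming the figure simply renders each embedding as one row of a heatmap (the signature is guessed from the call above), could look like this:

import numpy as np
import matplotlib.pyplot as plt

def plot_word_vectors(wv, words, other=None):
    # Take each word's vector from the extra dict first,
    # falling back to the vocabulary for the regular words
    other = other or {}
    vectors = np.array([other[w] if w in other else wv[w] for w in words])
    fig, ax = plt.subplots(figsize=(12, len(words)))
    ax.imshow(vectors, aspect='auto', cmap='RdBu')
    ax.set_yticks(range(len(words)))
    ax.set_yticklabels(words)
    ax.set_xlabel('Embedding Dimension')
    return fig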

How similar is the "synthetic queen" to the actual "queen," you ask. It’s hard to tell by looking at the vectors above alone, but Gensim’s word vectors have a similar_by_vector() method that computes the cosine similarity between a given vector and the whole vocabulary, and returns the top N most similar words:

glove.similar_by_vector(synthetic_queen, topn=5)

Output

[('king', 0.8859835863113403),
 ('queen', 0.8609581589698792),
 ('daughter', 0.7684512138366699),
 ('prince', 0.7640699148178101),
 ('throne', 0.7634971141815186)]
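Under the hood, similar_by_vector() is ranking every word in the vocabulary by cosine similarity. As a sanity check, a minimal reimplementation of the metric for a single pair (assuming NumPy) should roughly reproduce the "queen" score above:

import numpy as np

def cosine_similarity(u, v):
    # Dot product of the two vectors divided by the product of their norms
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

cosine_similarity(synthetic_queen, glove['queen'])  # roughly 0.8610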

"The most similar word to the 'synthetic queen' is … king?"

Yes. It’s not always the case, but it’s fairly common to find out that, after performing this kind of vector arithmetic, the word most similar to the result is one of the words you started with. In that case, the word we’re actually after, "queen," shows up in second place.
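One common way around this is Gensim’s most_similar() method, which takes lists of positive and negative words, performs the same arithmetic on normalized vectors, and excludes the input words from the results, so "queen" should come out on top:

# Same KING - MAN + WOMAN arithmetic, but the input words
# ('king', 'man', 'woman') are left out of the ranking
glove.most_similar(positive=['king', 'woman'], negative=['man'], topn=5)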
