
Word Embeddings

For example, "crucial" is virtually a synonym of "important", and it is easy to see how the words

"historical" or "valuable" could be substituted for it in certain situations:

Figure 1: Visualization of nearest neighbors of the word "important" in a word embedding dataset,

from the TensorFlow Embedding Guide (https://www.tensorflow.org/guide/embedding)

In the next section we will look at various types of distributed representations

(or word embeddings).

Static embeddings

Static embeddings are the oldest type of word embedding. The embeddings are

generated against a large corpus but the number of words, though large, is finite.

You can think of a static embedding as a dictionary, with words as the keys and

their corresponding vectors as the values. If you need to look up the embedding of

a word that was not in the original corpus, then you are out of luck.

In addition, a word has the same embedding regardless of how it is used, so static

embeddings cannot address the problem of polysemy, that is, words with multiple

meanings. We will explore this issue further when we cover non-static embeddings

later in this chapter.
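
The dictionary analogy above can be sketched in a few lines of Python. The toy four-dimensional vectors below are made up for illustration; real static embeddings are trained on a large corpus and typically have hundreds of dimensions:

```python
import numpy as np

# Toy static embedding table: words are the keys, fixed vectors the values.
# (Illustrative 4-dimensional vectors; real embeddings such as Word2Vec or
# GloVe are trained on large corpora and have 100-300 dimensions.)
embeddings = {
    "important": np.array([0.9, 0.1, 0.3, 0.7]),
    "crucial":   np.array([0.8, 0.2, 0.3, 0.6]),
    "valuable":  np.array([0.7, 0.1, 0.5, 0.6]),
}

def lookup(word):
    """Return the static vector for `word`, or None if the word was not
    in the training corpus (the out-of-vocabulary case)."""
    return embeddings.get(word)

print(lookup("crucial"))  # the same vector is returned regardless of context
print(lookup("viral"))    # None: out of vocabulary, no embedding available
```

Note that `lookup` returns the same vector for "crucial" no matter what sentence the word appears in, which is exactly why static embeddings cannot capture polysemy.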
