Word Embeddings

fastText computes embeddings for character n-grams, where n is between 3 and 6 characters (the default setting, which can be changed), as well as for the words themselves. For example, the character n-grams for n=3 for the word "green" would be "<gr", "gre", "ree", "een", and "en>". The beginning and end of a word are marked with the "<" and ">" characters respectively, to distinguish between short words and identical n-grams occurring inside other words, such as "<cat>" and "cat".
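The following snippet is a minimal sketch of this decomposition, assuming the default n-gram range of 3 to 6. The real fastText implementation also hashes n-grams into buckets and stores the full word as its own entry, so treat this only as an illustration of the idea.

def char_ngrams(word, min_n=3, max_n=6):
    # Mark the beginning and end of the word, as fastText does.
    token = "<" + word + ">"
    ngrams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(token) - n + 1):
            gram = token[i:i + n]
            if gram != token:  # the full token is handled as the word itself
                ngrams.append(gram)
    return ngrams

print(char_ngrams("green", min_n=3, max_n=3))
# ['<gr', 'gre', 'ree', 'een', 'en>']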

During lookup, you can retrieve a vector from the fastText embedding using the word as the key, provided the word exists in the embedding. However, unlike traditional word embeddings, you can still construct a fastText vector for a word that does not exist in the embedding. This is done by decomposing the word into its constituent character n-gram subwords, as shown in the preceding example, looking up the vectors for those subwords, and then taking the average of the subword vectors. The fastText Python API [19] will do this automatically, but you will need to do it manually if you use other APIs to access fastText word embeddings, such as gensim or NumPy.
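If you do have to construct the out-of-vocabulary vector yourself, the averaging step is straightforward. The sketch below assumes a hypothetical ngram_vectors dictionary mapping character n-gram strings to NumPy arrays (for example, built by reading a file of subword vectors), and reuses the char_ngrams helper sketched earlier.

import numpy as np

def oov_vector(word, ngram_vectors, dim=300, min_n=3, max_n=6):
    # Decompose the unknown word into character n-grams and average
    # the vectors of the n-grams that exist in the lookup table.
    grams = char_ngrams(word, min_n=min_n, max_n=max_n)
    known = [ngram_vectors[g] for g in grams if g in ngram_vectors]
    if not known:
        return np.zeros(dim)  # nothing matched; fall back to a zero vector
    return np.mean(known, axis=0)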

Next up, we will look at dynamic embeddings.

Dynamic embeddings

So far, all the embeddings we have considered have been static; that is, they are deployed as a dictionary of words (and subwords) mapped to fixed-dimensional vectors. The vector corresponding to a word in these embeddings is the same regardless of whether the word is being used as a noun or a verb in the sentence; for example, the word "ensure" is the name of a health supplement when used as a noun, and means "to make certain" when used as a verb. A static embedding also provides the same vector for polysemous words, or words with multiple meanings, such as "bank" (which can mean different things depending on whether it co-occurs with the word "money" or "river"). In both cases, the meaning of the word changes depending on clues available in its context, the sentence. Dynamic embeddings attempt to use these signals to provide different vectors for words based on their context.

Dynamic embeddings are deployed as trained networks that convert your input (typically a sequence of one-hot vectors) into a lower-dimensional, dense, fixed-size embedding by looking at the entire sequence, not just individual words. You can either preprocess your input into this dense embedding and then use it as input to your task-specific network, or wrap the network and treat it like the tf.keras.layers.Embedding layer used for static embeddings. Using a dynamic embedding network in this second way is usually much more expensive than generating the embeddings ahead of time (the first option) or using traditional embeddings.
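As a rough sketch of the two options, the code below uses the Universal Sentence Encoder from TensorFlow Hub as the dynamic embedding network; the module URL and its 512-dimensional output are assumptions here, and any other context-aware embedding module would be used the same way.

import tensorflow as tf
import tensorflow_hub as hub

EMBED_URL = "https://tfhub.dev/google/universal-sentence-encoder/4"

# Option 1: precompute the dense embeddings once, then train on the vectors.
embed = hub.load(EMBED_URL)
vectors = embed(["we deposited money at the bank",
                 "we sat on the river bank"]).numpy()  # shape (2, 512)

# Option 2: wrap the embedding network as a layer inside the task model,
# analogous to using tf.keras.layers.Embedding for static embeddings.
model = tf.keras.Sequential([
    hub.KerasLayer(EMBED_URL, input_shape=[], dtype=tf.string, trainable=False),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

Option 2 recomputes the embeddings on every pass through the data, which is what makes it more expensive than precomputing them once.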
