Word Embeddings

The change in validation accuracies, shown in Figure 3, illustrates the differences between the three approaches:

Figure 3: Comparison of validation accuracy across training epochs for different embedding techniques

In the learning-from-scratch case, the validation accuracy is 0.93 at the end of the first epoch, but over the next two epochs it rises to 0.98. In the vectorizer case, the network gets something of a head start from the third-party embeddings and ends up with a validation accuracy of almost 0.95 at the end of the first epoch. However, because the embedding weights are not allowed to change, it cannot customize the embeddings to the spam detection task, and its validation accuracy at the end of the third epoch is the lowest of the three. The fine-tune case, like the vectorizer, also gets a head start, but because it can also customize the embeddings to the task, it learns at the most rapid rate of the three. The fine-tune case has the highest validation accuracy at the end of the first epoch, and by the end of the second epoch it reaches the validation accuracy that the scratch case achieves only at the end of the third.
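
The three configurations differ only in how the Embedding layer is set up. The following minimal Keras sketch shows one way to express them; the vocabulary size, embedding dimension, and the pretrained matrix here are placeholders for illustration, not the chapter's actual values:

import numpy as np
import tensorflow as tf

# Placeholder sizes; the real values depend on the dataset and vectors used.
vocab_size, embedding_dim = 10000, 300

# Stand-in for a pretrained matrix (e.g., GloVe vectors mapped to the vocabulary).
pretrained = np.random.rand(vocab_size, embedding_dim).astype("float32")

# Case 1: learn from scratch. Weights start random and are trained end to end.
scratch = tf.keras.layers.Embedding(vocab_size, embedding_dim, trainable=True)

# Case 2: vectorizer. Pretrained weights, frozen during training.
vectorizer = tf.keras.layers.Embedding(
    vocab_size, embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained),
    trainable=False)

# Case 3: fine-tune. Pretrained weights as a starting point, updated during training.
finetune = tf.keras.layers.Embedding(
    vocab_size, embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained),
    trainable=True)

The only switch between the vectorizer and fine-tune cases is the trainable flag, which is why the fine-tune case keeps the head start of the pretrained vectors while still adapting them to the task.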

Neural embeddings – not just for words

Word embedding technology has evolved in various ways since Word2Vec and GloVe. One such direction is the application of word embeddings to non-word settings, also known as neural embeddings. As you will recall, word embeddings leverage the distributional hypothesis that words occurring in similar contexts tend to have similar meaning, where the context is usually a fixed-size (in number of words) window around the target word.
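
To make the notion of a fixed-size context concrete, the following sketch extracts (target, context) pairs from a token sequence; the context_windows helper is a hypothetical name used here for illustration:

def context_windows(tokens, window_size=2):
    """Yield (target, context) pairs using a fixed-size window on each side."""
    for i, target in enumerate(tokens):
        start = max(0, i - window_size)
        end = min(len(tokens), i + window_size + 1)
        # Context is the words around the target, excluding the target itself.
        context = tokens[start:i] + tokens[i + 1:end]
        yield target, context

sentence = "the quick brown fox jumps over the lazy dog".split()
for target, context in context_windows(sentence, window_size=2):
    print(target, "->", context)

Replacing words with other kinds of tokens (items, nodes, users, and so on) while keeping this windowed notion of context is what generalizes word embeddings into neural embeddings.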
