Word Embeddings

In the future, once the TensorFlow Hub team migrates its models over to TensorFlow 2.0, the code to generate embeddings from ELMo is expected to look like this. Note that the module_url is likely to change. The pattern is similar to the examples of using TensorFlow Hub that you have already seen in Chapters 2 and 5:

import tensorflow_hub as hub

module_url = "https://tfhub.dev/google/tf2-preview/elmo/2"
embed = hub.KerasLayer(module_url)
embeddings = embed([
    "i like green eggs and ham",
    "would you eat them in a box"
])["elmo"]
print(embeddings.shape)

Sentence and paragraph embeddings

A simple, yet surprisingly effective, solution for generating useful sentence and paragraph embeddings is to average the word vectors of their constituent words. Even though we will describe some popular sentence and paragraph embeddings in this section, it is generally advisable to try averaging the word vectors as a baseline, as sketched below.
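
To make this baseline concrete, here is a minimal sketch of the averaging approach. The tiny word_vectors table and its 4-dimensional vectors are invented for illustration; in practice you would load pretrained vectors such as GloVe or word2vec:

import numpy as np

# Hypothetical 4-dimensional vectors for a tiny vocabulary.
word_vectors = {
    "i":     np.array([0.1, 0.3, 0.2, 0.5]),
    "like":  np.array([0.4, 0.1, 0.6, 0.2]),
    "green": np.array([0.2, 0.8, 0.1, 0.3]),
    "eggs":  np.array([0.7, 0.2, 0.4, 0.1]),
}

def sentence_embedding(sentence, vectors):
    # Average the vectors of the words we have embeddings for,
    # skipping out-of-vocabulary words.
    words = [w for w in sentence.lower().split() if w in vectors]
    if not words:
        return np.zeros_like(next(iter(vectors.values())))
    return np.mean([vectors[w] for w in words], axis=0)

print(sentence_embedding("I like green eggs", word_vectors))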

Sentence (and paragraph) embeddings can also be created in a task-optimized way by treating a sentence as a sequence of words and representing each word using some standard word vector. The sequence of word vectors is used as input to train a network for some task. Vectors extracted from one of the later layers of the network, just before the classification layer, generally tend to produce very good vector representations for the sequence. However, they tend to be very task-specific, and are of limited use as a general vector representation.
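
As an illustration of this pattern, the following sketch trains a small Keras classifier over token sequences and then reads off the layer just before the classification head as the sequence embedding. The vocabulary size, layer sizes, and the sentence_vector layer name are illustrative assumptions, not prescriptions:

import tensorflow as tf

inputs = tf.keras.Input(shape=(None,), dtype="int32")  # token ids
x = tf.keras.layers.Embedding(input_dim=10000, output_dim=100)(inputs)
x = tf.keras.layers.LSTM(64)(x)  # encode the word vector sequence
penultimate = tf.keras.layers.Dense(
    32, activation="relu", name="sentence_vector")(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(penultimate)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
# ... train the model on the downstream task ...

# A second model over the same graph exposes the penultimate layer
# as the (task-specific) sequence embedding.
embedder = tf.keras.Model(
    inputs, model.get_layer("sentence_vector").output)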

An idea for generating general vector representations for sentences that could be used across tasks was proposed by Kiros et al. [22]. They proposed using the continuity of text from books to construct an encoder-decoder model that is trained to predict the surrounding sentences given a sentence. The vector representation of a sequence of words constructed by an encoder-decoder network is typically called a "thought vector". In addition, the proposed model works on a basis very similar to skip-gram, where we try to predict the surrounding words given a word. For these reasons, these sentence vectors were called skip-thought vectors.

The project released a Theano-based model that could be used to generate embeddings from sentences. Later, the model was re-implemented with TensorFlow by the Google Research team [23]. The Skip-Thoughts model emits vectors of size 2,048 for each sentence. Using the model is not very straightforward, but the README.md file on the repository [23] provides instructions in case you would like to use it.
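
For intuition, here is a rough Keras sketch of the skip-thought idea, not the released model: an encoder GRU summarizes the current sentence into a thought vector, and a decoder GRU conditioned on that vector predicts the words of a neighboring sentence (the actual model decodes both the previous and the next sentence). All dimensions and names are placeholders:

import tensorflow as tf

VOCAB, EMBED, THOUGHT = 10000, 128, 256  # illustrative sizes

# Encoder: current sentence -> thought vector.
enc_in = tf.keras.Input(shape=(None,), dtype="int32")
enc_emb = tf.keras.layers.Embedding(VOCAB, EMBED)(enc_in)
thought = tf.keras.layers.GRU(THOUGHT)(enc_emb)  # the sentence embedding

# Decoder: thought vector + previous tokens -> next-sentence tokens.
dec_in = tf.keras.Input(shape=(None,), dtype="int32")
dec_emb = tf.keras.layers.Embedding(VOCAB, EMBED)(dec_in)
dec_seq = tf.keras.layers.GRU(THOUGHT, return_sequences=True)(
    dec_emb, initial_state=thought)
logits = tf.keras.layers.Dense(VOCAB)(dec_seq)

model = tf.keras.Model([enc_in, dec_in], logits)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# After training, tf.keras.Model(enc_in, thought) yields the thought
# vector for any input sentence.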
