
The earliest dynamic embedding was proposed by McCann et al. [20], and was called Contextualized Word Vectors (CoVe). This involved taking the output of the encoder from the encoder-decoder pair of a machine translation network and concatenating it with word vectors for the same word. You will learn more about seq2seq networks in the next chapter. The researchers found that this strategy improved the performance of a wide variety of NLP tasks.
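The concatenation step at the heart of CoVe can be sketched with plain NumPy. The dimensions below (300-dimensional static word vectors such as GloVe, 600-dimensional encoder outputs) are illustrative assumptions for the sketch, not a prescription:

```python
import numpy as np

# Illustrative dimensions (assumptions): 300-d static word vectors
# (e.g., GloVe) and 600-d contextual encoder outputs per token.
seq_len, glove_dim, encoder_dim = 5, 300, 600

# Static word vectors for a 5-token sentence: one row per token.
glove_vectors = np.random.rand(seq_len, glove_dim)

# Outputs of the MT encoder for the same tokens; unlike the static
# vectors, these depend on the surrounding context.
encoder_outputs = np.random.rand(seq_len, encoder_dim)

# CoVe-style representation: concatenate the two along the feature axis,
# so each token gets a (300 + 600)-dimensional vector.
cove_style = np.concatenate([glove_vectors, encoder_outputs], axis=-1)

print(cove_style.shape)  # (5, 900)
```

Because the encoder outputs change with the surrounding sentence while the static vectors do not, the concatenated representation is what makes the embedding "dynamic."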

Chapter 7

Another dynamic embedding, proposed by Peters et al. [21], was Embeddings from Language Models (ELMo). ELMo computes contextualized word representations using character-based word representations and bidirectional Long Short-Term Memory (LSTM) networks. You will learn more about LSTMs in the next chapter. In the meantime, a trained ELMo network is available from TensorFlow's model repository, TF-Hub. You can access it and use it for generating ELMo embeddings as follows.

The full set of TensorFlow 2.0-compatible models available on TF-Hub can be found on the TF-Hub site for TensorFlow 2.0 [16]. Unfortunately, at the time of writing, the ELMo model is not one of them. You can invoke the older (pre-TensorFlow 2.0) model by turning off eager execution in your code, but this also means that you won't be able to wrap ELMo as a layer in your own model. The strategy will still allow you to convert your input sentences to sequences of contextual vectors, which you can then use as input to your own network. Here I have used an array of sentences, where the model will figure out tokens by using its default strategy of tokenizing on whitespace:

import tensorflow as tf
import tensorflow_hub as hub

module_url = "https://tfhub.dev/google/elmo/2"

# The pre-TensorFlow 2.0 hub.Module API requires graph mode
tf.compat.v1.disable_eager_execution()

elmo = hub.Module(module_url, trainable=False)
embeddings = elmo([
    "i like green eggs and ham",
    "would you eat them in a box"
],
    signature="default",
    as_dict=True
)["elmo"]

print(embeddings.shape)

Output is (2, 7, 1024). The first index tells us that our input contained 2 sentences. The second index refers to the maximum number of words across all sentences, in this case 7; the model automatically pads the output to the length of the longest sentence. The third index gives us the size of the contextual word embedding created by ELMo; each word is converted to a vector of size 1024.
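To see what the (2, 7, 1024) shape means in practice, here is a NumPy sketch using a toy embedding size of 4 instead of 1024. It shows how a batch of unequal-length sentences is padded with zero vectors up to the longest one, and one common way (my own illustration, not part of the ELMo API) to mean-pool the per-word vectors into a single fixed-size vector per sentence before feeding them to your own network:

```python
import numpy as np

# Toy stand-ins for ELMo output, embedding size 4 instead of 1024.
sent1 = np.ones((6, 4))   # "i like green eggs and ham"   -> 6 tokens
sent2 = np.ones((7, 4))   # "would you eat them in a box" -> 7 tokens

# Pad every sentence with zero vectors up to the longest length (7),
# mirroring what the ELMo module does internally.
max_len = max(len(sent1), len(sent2))
batch = np.stack([
    np.pad(s, ((0, max_len - len(s)), (0, 0))) for s in (sent1, sent2)
])
print(batch.shape)  # (2, 7, 4) -- same layout as ELMo's (2, 7, 1024)

# Mean-pool over the token axis to get one vector per sentence.
# Dividing by the true token counts ignores the zero padding.
lengths = np.array([len(sent1), len(sent2)]).reshape(-1, 1)
sentence_vectors = batch.sum(axis=1) / lengths
print(sentence_vectors.shape)  # (2, 4)
```

Whether you pool the token vectors or keep the full padded sequence depends on what your downstream network expects as input.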

