
        self.encoder_dim = encoder_dim
        self.embedding = tf.keras.layers.Embedding(
            vocab_size, embedding_dim, input_length=num_timesteps)
        self.rnn = tf.keras.layers.GRU(
            encoder_dim, return_sequences=True, return_state=True)

    def call(self, x, state):
        x = self.embedding(x)
        x, state = self.rnn(x, initial_state=state)
        return x, state

    def init_state(self, batch_size):
        return tf.zeros((batch_size, self.encoder_dim))
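To see why init_state() returns a tensor of shape (batch_size, encoder_dim), here is a minimal sketch (not the book's code, with hypothetical sizes) of what a GRU configured like the one above produces when return_sequences and return_state are both True:

```python
import tensorflow as tf

# hypothetical sizes, chosen only for illustration
batch_size, num_timesteps, embedding_dim, encoder_dim = 4, 10, 8, 16

# a batch of already-embedded inputs, and a zero initial state as in init_state()
x = tf.random.uniform((batch_size, num_timesteps, embedding_dim))
state = tf.zeros((batch_size, encoder_dim))

rnn = tf.keras.layers.GRU(
    encoder_dim, return_sequences=True, return_state=True)
outputs, final_state = rnn(x, initial_state=state)

print(outputs.shape)      # (4, 10, 16): one hidden state per time step
print(final_state.shape)  # (4, 16): the last hidden state only
```

The per-time-step outputs are what the attention layer below consumes as its values; the final state seeds the decoder.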

The Decoder will see bigger changes. The biggest is the declaration of attention layers, which need to be defined, so let us do that first. Consider the class definition for the additive attention proposed by Bahdanau. Recall that this combines the decoder hidden state at each time step with all the encoder hidden states to produce an input to the decoder at the next time step, which is given by the following equation:

    e = vᵀ tanh(W[s; h])
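As a quick shape check on this equation (with hypothetical dimensions, not taken from the book): if s and h each have size d, the concatenation [s; h] has size 2d, W maps it to a vector of size num_units, and v reduces it to a single scalar score per encoder time step:

```python
import numpy as np

d, num_units = 4, 8                    # hypothetical sizes
s = np.random.randn(d)                 # decoder hidden state
h = np.random.randn(d)                 # one encoder hidden state
W = np.random.randn(num_units, 2 * d)  # acts on the concatenation [s; h]
v = np.random.randn(num_units)

# e = v^T tanh(W [s; h]) -- one scalar score for this (s, h) pair
e = v @ np.tanh(W @ np.concatenate([s, h]))
print(np.ndim(e))  # 0: a single scalar
```

Repeating this for every encoder time step yields one score per step, which the softmax below turns into alignment weights.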

Chapter 8

The W[s; h] in the equation is shorthand for two separate linear transformations (of the form y = Wx + b), one on s and the other on h. The two linear transformations are implemented as Dense layers, as shown in the following implementation. We subclass a tf.keras Layer object, since our end goal is to use this as a layer in our network, but it is also acceptable to subclass a Model object. The call() method takes a query (the decoder state) and values (the encoder states), computes the score, then computes the alignment as the corresponding softmax and the context vector as given by the equation, and returns them. The shape of the context vector is given by (batch_size, num_decoder_timesteps), and the alignments have the shape (batch_size, num_encoder_timesteps, 1). The weights of the Dense layers' W1, W2, and V tensors are learned during training:

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, num_units):
        super(BahdanauAttention, self).__init__()
        self.W1 = tf.keras.layers.Dense(num_units)
        self.W2 = tf.keras.layers.Dense(num_units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # query is the decoder state at time step j
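The computation the text describes for call() (score, softmax alignment, weighted context) can be sketched as a standalone function; the shapes and sizes below are hypothetical, and W1, W2, V stand in for the layers defined above:

```python
import tensorflow as tf

def additive_attention_sketch(query, values, W1, W2, V):
    # query: (batch_size, num_units); values: (batch_size, num_timesteps, num_units)
    query = tf.expand_dims(query, axis=1)           # add a time axis for broadcasting
    score = V(tf.tanh(W1(values) + W2(query)))      # (batch_size, num_timesteps, 1)
    alignment = tf.nn.softmax(score, axis=1)        # weights sum to 1 over time steps
    context = tf.reduce_sum(alignment * values, 1)  # (batch_size, num_units)
    return context, alignment

# hypothetical layers and inputs to exercise the sketch
W1, W2 = tf.keras.layers.Dense(8), tf.keras.layers.Dense(8)
V = tf.keras.layers.Dense(1)
q = tf.random.normal((2, 8))      # decoder state for a batch of 2
vals = tf.random.normal((2, 5, 8))  # 5 encoder time steps
ctx, align = additive_attention_sketch(q, vals, W1, W2, V)
print(ctx.shape, align.shape)  # (2, 8) (2, 5, 1)
```

Broadcasting the time-expanded query against the values is what lets a single pass score every encoder time step at once.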

