

In order to verify that the two classes are drop-in replacements for each other, we run the following piece of throwaway code (commented out in the source code for this example). We just manufacture some random inputs and send them to both attention classes:

import numpy as np

batch_size = 64
num_timesteps = 100
num_units = 1024

query = np.random.random(size=(batch_size, num_units))
values = np.random.random(size=(batch_size, num_timesteps, num_units))

# check out dimensions for Bahdanau attention
b_attn = BahdanauAttention(num_units)
context, alignments = b_attn(query, values)
print("Bahdanau: context.shape:", context.shape,
    "alignments.shape:", alignments.shape)

# check out dimensions for Luong attention
l_attn = LuongAttention(num_units)
context, alignments = l_attn(query, values)
print("Luong: context.shape:", context.shape,
    "alignments.shape:", alignments.shape)

The preceding code produces the following output and shows, as expected, that the two classes produce identically shaped outputs when given the same input, and are hence drop-in replacements for each other:

Bahdanau: context.shape: (64, 1024) alignments.shape: (64, 100, 1)
Luong: context.shape: (64, 1024) alignments.shape: (64, 100, 1)
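
If you want to reproduce this shape check in isolation, the following is a minimal sketch of additive (Bahdanau) and multiplicative (Luong) attention layers. It assumes the standard scoring functions; the internal layer names (W1, W2, V, W) and the explicit casts are illustrative choices, not necessarily identical to the BahdanauAttention and LuongAttention classes defined earlier in this chapter:

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, num_units, **kwargs):
        super(BahdanauAttention, self).__init__(**kwargs)
        self.W1 = tf.keras.layers.Dense(num_units)
        self.W2 = tf.keras.layers.Dense(num_units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # query: (batch_size, num_units), values: (batch_size, num_timesteps, num_units)
        # cast so the layer also accepts float64 NumPy inputs
        query = tf.cast(query, tf.float32)
        values = tf.cast(values, tf.float32)
        query = tf.expand_dims(query, axis=1)
        # additive score: V . tanh(W1 . values + W2 . query)
        score = self.V(tf.nn.tanh(self.W1(values) + self.W2(query)))
        alignments = tf.nn.softmax(score, axis=1)   # (batch_size, num_timesteps, 1)
        context = tf.reduce_sum(alignments * values, axis=1)
        return context, alignments

class LuongAttention(tf.keras.layers.Layer):
    def __init__(self, num_units, **kwargs):
        super(LuongAttention, self).__init__(**kwargs)
        self.W = tf.keras.layers.Dense(num_units)

    def call(self, query, values):
        query = tf.cast(query, tf.float32)
        values = tf.cast(values, tf.float32)
        # multiplicative score: query . (W . values)
        query = tf.expand_dims(query, axis=-1)      # (batch_size, num_units, 1)
        score = tf.matmul(self.W(values), query)    # (batch_size, num_timesteps, 1)
        alignments = tf.nn.softmax(score, axis=1)
        context = tf.reduce_sum(alignments * values, axis=1)
        return context, alignments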

Now that we have our attention classes, let us look at the Decoder. The difference in the __init__() method is the addition of the attention class variable, which we have set to the BahdanauAttention class. We also have two additional transformations, Wc and Ws, that will be applied to the output of the decoder RNN. The first one has a tanh activation to modulate the output between -1 and +1, and the second is a standard linear transformation. Compared to the decoder in the seq2seq network without attention, this decoder takes an additional parameter encoder_output in its call() method, and returns an additional context vector:

class Decoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, num_timesteps,
            decoder_dim, **kwargs):
        super(Decoder, self).__init__(**kwargs)
        self.decoder_dim = decoder_dim
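
The listing breaks off at this point. As a rough sketch of how the rest of the constructor and the call() method could look, based only on the description above (the attention variable set to BahdanauAttention, the tanh-activated Wc, the linear Ws, and the extra encoder_output parameter), the remainder might be along these lines; the Embedding and GRU layers, the argument passed to BahdanauAttention, and the shape handling in call() are assumptions rather than the chapter's actual code:

        # hedged continuation sketch; the layer choices below are assumptions
        self.attention = BahdanauAttention(embedding_dim)
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.rnn = tf.keras.layers.GRU(
            decoder_dim, return_sequences=True, return_state=True)
        # Wc: tanh activation modulates the RNN output between -1 and +1
        self.Wc = tf.keras.layers.Dense(decoder_dim, activation="tanh")
        # Ws: standard linear transformation to vocabulary logits
        self.Ws = tf.keras.layers.Dense(vocab_size)

    def call(self, x, state, encoder_output):
        # assumption: x is a single timestep of token ids, shape (batch_size, 1)
        x = self.embedding(x)
        # the previous decoder state is the query, the encoder output the values
        context, alignments = self.attention(state, encoder_output)
        # concatenate the context vector with the embedded input along the feature axis
        x = tf.concat([tf.expand_dims(context, axis=1), x], axis=-1)
        x, state = self.rnn(x, initial_state=state)
        # Wc (tanh) followed by Ws (linear), as described above
        x = self.Ws(self.Wc(x))
        return x, state, context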
