Recurrent Neural Networks

def call(self, x):
    x = self.embedding_layer(x)
    x = self.rnn_layer(x)
    x = self.dense_layer(x)
    return x

vocab_size = len(vocab)
embedding_dim = 256
rnn_output_dim = 1024

model = CharGenModel(vocab_size, seq_length, embedding_dim,
    rnn_output_dim)
model.build(input_shape=(batch_size, seq_length))
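The call() method above uses layers created in the CharGenModel constructor, which is defined earlier in the chapter. For reference, here is a minimal sketch of what that constructor might look like, assuming an Embedding layer feeding a stateful GRU and a Dense projection back to the vocabulary size (the exact layer arguments are assumptions, not the book's verbatim code):

class CharGenModel(tf.keras.Model):
    def __init__(self, vocab_size, num_timesteps,
            embedding_dim, rnn_output_dim, **kwargs):
        super(CharGenModel, self).__init__(**kwargs)
        # map each character id to a dense embedding vector
        self.embedding_layer = tf.keras.layers.Embedding(
            vocab_size, embedding_dim)
        # stateful GRU so the hidden state carries across batches
        self.rnn_layer = tf.keras.layers.GRU(
            rnn_output_dim,
            return_sequences=True,
            stateful=True,
            recurrent_initializer="glorot_uniform")
        # project each timestep's output to logits over the vocabulary
        self.dense_layer = tf.keras.layers.Dense(vocab_size)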

Next we define a loss function and compile our model. We will use sparse categorical cross-entropy as our loss function because that is the standard loss function to use when both our inputs and outputs are sequences of integers. For the optimizer, we will choose the Adam optimizer:

def loss(labels, predictions):
    return tf.losses.sparse_categorical_crossentropy(
        labels,
        predictions,
        from_logits=True
    )

model.compile(optimizer=tf.optimizers.Adam(), loss=loss)

Normally, the character at each position of the output is found by computing the argmax of the vector at that position, that is, the character corresponding to the maximum probability value. This is known as greedy search. In the case of language models, where the output of one timestep becomes the input to the next timestep, this can lead to repetitive output. The two most common approaches to overcoming this problem are either to sample the output randomly or to use beam search, which expands the k most probable candidates at each timestep. Here we will use the tf.random.categorical() function to sample the output randomly. The following function takes a string as a prefix and uses it to generate a string whose length is specified by num_chars_to_generate. The temperature parameter controls the randomness of the predictions; lower values create a more predictable output.
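As a rough guide to how such a function can work, here is a minimal sketch, assuming char2idx and idx2char lookup tables built from vocab, and assuming the model has been rebuilt for a batch size of 1 before generation (the function name and signature are illustrative, not necessarily the book's verbatim code):

def generate_text(model, prefix_string, char2idx, idx2char,
        num_chars_to_generate=1000, temperature=1.0):
    # encode the prefix as a batch of one sequence of character ids
    input = [char2idx[s] for s in prefix_string]
    input = tf.expand_dims(input, 0)
    text_generated = []
    model.reset_states()
    for i in range(num_chars_to_generate):
        preds = model(input)
        # drop the batch dimension and scale the logits by the temperature
        preds = tf.squeeze(preds, 0) / temperature
        # sample the next character id from the scaled logits
        pred_id = tf.random.categorical(
            preds, num_samples=1)[-1, 0].numpy()
        text_generated.append(idx2char[pred_id])
        # the predicted character becomes the input for the next timestep
        input = tf.expand_dims([pred_id], 0)
    return prefix_string + "".join(text_generated)

Dividing the logits by the temperature before sampling flattens the distribution for values above 1.0 (more surprising output) and sharpens it for values below 1.0 (more predictable output).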

