

sequences = sequences.map(split_train_labels)

# set up for training
# batches: [None, 64, 100]
batch_size = 64
steps_per_epoch = len(texts) // seq_length // batch_size
dataset = sequences.shuffle(10000).batch(
    batch_size, drop_remainder=True)
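The split_train_labels function used in the map call was defined earlier in the chapter. A minimal sketch consistent with how it is used here (the exact body is an assumption) splits each (seq_length + 1)-character sequence into an input and a label offset by one character:

def split_train_labels(sequence):
    # assumed behavior: the input is the first seq_length characters,
    # the label is the same sequence shifted left by one character
    input_seq = sequence[0:-1]
    output_seq = sequence[1:]
    return input_seq, output_seq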

We are now ready to define our network. As before, we define it as a subclass of tf.keras.Model, as shown next. The network is fairly simple: it takes as input a sequence of integers of size 100 (num_timesteps) and passes them through an Embedding layer, so that each integer in the sequence is converted to a vector of size 256 (embedding_dim). So, assuming a batch size of 64, for an input sequence of shape (64, 100), the output of the Embedding layer is a tensor of shape (64, 100, 256).
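These shapes are easy to verify in isolation (a standalone sketch using the dimensions above, not part of the chapter's code):

import tensorflow as tf

# vocab_size=90 and embedding_dim=256, as in the surrounding text
embedding = tf.keras.layers.Embedding(90, 256)
batch = tf.zeros((64, 100), dtype=tf.int32)   # 64 sequences of 100 character IDs
print(embedding(batch).shape)                 # (64, 100, 256)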

The next layer is the RNN layer with 100 time steps. The RNN implementation chosen is a GRU. At each of its time steps, this GRU layer takes a vector of size (256,) and outputs a vector of shape (1024,) (rnn_output_dim). Note also that the RNN is stateful, which means that the hidden state output at the end of one batch is used as the initial state for the next batch, rather than being reset to zero. The return_sequences=True flag indicates that the RNN will produce an output at each of its time steps, rather than a single aggregate output at the last time step.
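The effect of return_sequences can be checked on its own (again a standalone sketch, using the same dimensions):

import tensorflow as tf

gru = tf.keras.layers.GRU(1024, return_sequences=True)   # rnn_output_dim=1024
embedded = tf.zeros((64, 100, 256))                      # output of the Embedding layer
print(gru(embedded).shape)   # (64, 100, 1024): one 1024-dim vector per time step
# with return_sequences=False the shape would be (64, 1024),
# i.e., only the last time step's output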

Finally, each of the time steps emits a vector of shape (1024,) into a Dense layer that outputs a vector of shape (90,) (vocab_size). The output from this layer is therefore a tensor of shape (64, 100, 90). Each position in the output vector corresponds to a character in our vocabulary, and the values are the unnormalized scores (logits) for that character occurring at that output position:

class CharGenModel(tf.keras.Model):

    def __init__(self, vocab_size, num_timesteps,
                 embedding_dim, rnn_output_dim, **kwargs):
        super(CharGenModel, self).__init__(**kwargs)
        # map each character ID to a dense vector of size embedding_dim
        self.embedding_layer = tf.keras.layers.Embedding(
            vocab_size,
            embedding_dim
        )
        # GRU that emits a vector of size rnn_output_dim at every time step
        self.rnn_layer = tf.keras.layers.GRU(
            rnn_output_dim,
            recurrent_initializer="glorot_uniform",
            recurrent_activation="sigmoid",
            stateful=True,
            return_sequences=True)
        # project each time step's output to one score per vocabulary character
        self.dense_layer = tf.keras.layers.Dense(vocab_size)
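The listing ends at the layer definitions; a minimal call method, assuming the forward pass simply chains the three layers in order, together with a check of the (64, 100, 90) output shape, might look like this:

    def call(self, x):
        x = self.embedding_layer(x)   # (64, 100) -> (64, 100, 256)
        x = self.rnn_layer(x)         # -> (64, 100, 1024)
        return self.dense_layer(x)    # -> (64, 100, 90)

model = CharGenModel(vocab_size=90, num_timesteps=100,
                     embedding_dim=256, rnn_output_dim=1024)
model.build(input_shape=(64, 100))   # stateful GRU needs a fixed batch size
print(model(tf.zeros((64, 100), dtype=tf.int32)).shape)   # (64, 100, 90)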
