Recurrent Neural Networks

This is a common technique used to train seq2seq networks, called Teacher Forcing, where the input to the decoder is the ground truth output instead of the prediction from the previous time step. Teacher Forcing is preferred because it makes training faster, but it also results in some degradation in prediction quality. To offset this, techniques such as Scheduled Sampling can be used, where the input is sampled randomly either from the ground truth or from the prediction at the previous time step, based on some threshold (this depends on the problem, but usually varies between 0.1 and 0.4):

@tf.function
def train_step(encoder_in, decoder_in, decoder_out, encoder_state):
    with tf.GradientTape() as tape:
        encoder_out, encoder_state = encoder(encoder_in, encoder_state)
        decoder_state = encoder_state
        # Teacher Forcing: the decoder is fed the ground truth tokens
        # (decoder_in), not its own predictions from the previous step
        decoder_pred, decoder_state = decoder(
            decoder_in, decoder_state)
        loss = loss_fn(decoder_out, decoder_pred)

    variables = (encoder.trainable_variables +
        decoder.trainable_variables)
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
    return loss
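
The train_step() function above always uses Teacher Forcing, feeding the ground truth decoder_in to the decoder. The following is a minimal sketch of how Scheduled Sampling could be added instead; it assumes the decoder can be called one time step at a time, and the sampling_threshold name, the step-wise loop, and the undecorated function are illustrative choices, not part of the original code:

# Scheduled Sampling sketch (not from the book): at each decoding step,
# feed either the ground truth token or the model's previous prediction.
# @tf.function is omitted here because the Python-level np.random call
# would be frozen at trace time.
def train_step_scheduled(encoder_in, decoder_in, decoder_out,
                         encoder_state, sampling_threshold=0.3):
    with tf.GradientTape() as tape:
        encoder_out, encoder_state = encoder(encoder_in, encoder_state)
        decoder_state = encoder_state
        seq_len = decoder_in.shape[1]
        step_input = decoder_in[:, 0:1]      # start with the BOS tokens
        step_preds = []
        for t in range(seq_len):
            step_pred, decoder_state = decoder(step_input, decoder_state)
            step_preds.append(step_pred)
            if t + 1 < seq_len:
                if np.random.random() < sampling_threshold:
                    # use the model's own previous prediction
                    step_input = tf.argmax(step_pred, axis=-1)
                else:
                    # use the ground truth token (Teacher Forcing)
                    step_input = decoder_in[:, t + 1:t + 2]
        decoder_pred = tf.concat(step_preds, axis=1)
        loss = loss_fn(decoder_out, decoder_pred)

    variables = (encoder.trainable_variables +
        decoder.trainable_variables)
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
    return loss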

The predict() method randomly samples a single English sentence from the dataset and uses the model trained so far to predict the French sentence. For reference, the label French sentence is also displayed. The evaluate() method computes the BiLingual Evaluation Understudy (BLEU) score [35] between the label and the prediction across all records in the test set. BLEU scores are generally used where multiple ground truth labels exist (here we have only one); the metric compares up to 4-grams (n-grams with n=4) in both the reference and candidate sentences. Both the predict() and evaluate() methods are called at the end of every epoch:

def predict(encoder, decoder, batch_size,
            sents_en, data_en, sents_fr_out,
            word2idx_fr, idx2word_fr):
    # pick a random English sentence and show its French label
    random_id = np.random.choice(len(sents_en))
    print("input : ", " ".join(sents_en[random_id]))
    print("label : ", " ".join(sents_fr_out[random_id]))

    encoder_in = tf.expand_dims(data_en[random_id], axis=0)
    decoder_out = tf.expand_dims(sents_fr_out[random_id], axis=0)

    # encode the input sentence and use the final encoder state
    # to initialize the decoder
    encoder_state = encoder.init_state(1)
    encoder_out, encoder_state = encoder(encoder_in, encoder_state)
    decoder_state = encoder_state
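
The body of evaluate() is not reproduced here. As a rough, standalone illustration of the metric it reports, a per-sentence BLEU score can be computed with NLTK's sentence_bleu; the tokenized sentences below are made up, and smoothing is used because a single short sentence pair often has no matching higher-order n-grams:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# hypothetical tokenized label (reference) and prediction (candidate)
reference = ["je", "suis", "heureux", "de", "vous", "voir"]
candidate = ["je", "suis", "content", "de", "vous", "voir"]

# sentence_bleu takes a list of references (we have only one label) and
# compares up to 4-grams by default, with equal weights of 0.25 each
smoothie = SmoothingFunction().method1
score = sentence_bleu([reference], candidate,
                      smoothing_function=smoothie)
print("BLEU:", score)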
