We train the autoencoder for 20 epochs using the following code. 20 epochs were chosen because the MSE loss converges within this time:

num_train_steps = len(Xtrain) // BATCH_SIZE
num_test_steps = len(Xtest) // BATCH_SIZE

history = autoencoder.fit_generator(train_gen,
    steps_per_epoch=num_train_steps,
    epochs=NUM_EPOCHS,
    validation_data=test_gen,
    validation_steps=num_test_steps)

Over these 20 epochs, the training MSE reduces from 0.1161 to 0.0824 and the validation MSE reduces from 0.1097 to 0.0820.

Since we are feeding in a matrix of embeddings, the output will also be a matrix of word embeddings. Since the embedding space is continuous and our vocabulary is discrete, not every output embedding will correspond to a word. The best we can do is to find the word that is closest to each output embedding in order to reconstruct the original text. This is a bit cumbersome, so we will evaluate our autoencoder in a different way.

Since the objective of the autoencoder is to produce a good latent representation, we compare the latent vectors that the encoder produces from the original input with those it produces from the output of the autoencoder.
First, we extract the encoder component into its own network:

encoder = Model(autoencoder.input,
    autoencoder.get_layer("encoder_lstm").output)

Then we run the autoencoder on the test set to return the predicted embeddings. We then send both the input embedding and the predicted embedding through the encoder to produce sentence vectors from each, and compare the two vectors using cosine similarity. Cosine similarities close to "one" indicate high similarity and those close to "zero" indicate low similarity. The following code runs against a random subset of 500 test sentences and produces some sample values of cosine similarities between the sentence vectors generated from the source embedding and the corresponding target embedding produced by the autoencoder:

def compute_cosine_similarity(x, y):
    return np.dot(x, y) / (np.linalg.norm(x, 2) * np.linalg.norm(y, 2))

k = 500
cosims = np.zeros((k))
i = 0
for bid in range(num_test_steps):
    xtest, ytest = test_gen.next()
    ytest_ = autoencoder.predict(xtest)
    Xvec = encoder.predict(xtest)
    Yvec = encoder.predict(ytest_)
    for rid in range(Xvec.shape[0]):
        if i >= k:
            break
        cosims[i] = compute_cosine_similarity(Xvec[rid], Yvec[rid])
        if i <= 10:
            print(cosims[i])
        i += 1
    if i >= k:
        break

The first values of the cosine similarities are shown as follows. As we can see, the vectors seem to be quite similar:

0.984686553478241
0.9815746545791626
0.9793671369552612
0.9805112481117249
0.9630994200706482
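As an aside, the word-by-word reconstruction that we dismissed as cumbersome above can be sketched in a few lines. The following is only an illustrative sketch and is not part of the chapter's code; it assumes a NumPy array embedding_matrix of shape (vocab_size, embed_dim) holding the word vectors and a dictionary index2word mapping row indices back to words, both of which are hypothetical names:

import numpy as np

def nearest_words(predicted_embeddings, embedding_matrix, index2word):
    # Normalize rows so that a plain dot product equals cosine similarity;
    # the small epsilon guards against all-zero rows such as a PAD vector.
    norm_vocab = embedding_matrix / (
        np.linalg.norm(embedding_matrix, axis=1, keepdims=True) + 1e-8)
    norm_pred = predicted_embeddings / (
        np.linalg.norm(predicted_embeddings, axis=1, keepdims=True) + 1e-8)
    # For every predicted embedding, pick the vocabulary word whose
    # embedding has the highest cosine similarity.
    sims = np.dot(norm_pred, norm_vocab.T)   # (seq_len, vocab_size)
    best_ids = np.argmax(sims, axis=1)       # (seq_len,)
    return [index2word[i] for i in best_ids]

Under these assumptions, nearest_words(ytest_[0], embedding_matrix, index2word) would return one candidate word per timestep of the first predicted sentence, which is how the original text could be approximately reconstructed.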