Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide


Output

tensor([[0.3924, 0.8146]], grad_fn=<TanhBackward>)

That's the updated hidden state!

Now, let's run a quick sanity check, feeding the same input to the original RNN cell:

rnn_cell(X[0:1])

Output

tensor([[0.3924, 0.8146]], grad_fn=<TanhBackward>)

Great, the values match.

We can also visualize this sequence of operations, assuming that every hidden state "lives" in a feature space delimited by the boundaries given by the hyperbolic tangent. So, the initial hidden state (0, 0) sits at the center of this feature space, depicted in the left-most plot in the figure below:

Figure 8.8 - Evolution of the hidden state

The transformed hidden state (the output of linear_hidden()) is depicted in the second plot: it went through an affine transformation. The point in the center corresponds to t_h. In the third plot, we can see the effect of adding t_x (the output of linear_input()): the whole feature space was translated to the right and up. And then, in the right-most plot, the hyperbolic tangent works its magic and brings the whole feature space back to the (-1, 1) range. That was the first step in the journey of a hidden state. We'll do it once again, using the full sequence, after training a model.
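
The linear_hidden() and linear_input() layers were built earlier in the chapter by copying the RNN cell's own weights. As a refresher, here is a minimal, self-contained sketch of the manual step that Figure 8.8 depicts (the seed and the sample point x0 are stand-ins, so the numbers won't match the outputs above):

import torch
import torch.nn as nn

torch.manual_seed(19)  # stand-in seed; any value works

n_features, hidden_dim = 2, 2
rnn_cell = nn.RNNCell(input_size=n_features, hidden_size=hidden_dim)
state = rnn_cell.state_dict()  # keys: weight_ih, bias_ih, weight_hh, bias_hh

# Two linear layers that replicate the cell's internal affine transformations
linear_input = nn.Linear(n_features, hidden_dim)
linear_hidden = nn.Linear(hidden_dim, hidden_dim)
with torch.no_grad():
    linear_input.weight = nn.Parameter(state['weight_ih'])
    linear_input.bias = nn.Parameter(state['bias_ih'])
    linear_hidden.weight = nn.Parameter(state['weight_hh'])
    linear_hidden.bias = nn.Parameter(state['bias_hh'])

initial_hidden = torch.zeros(1, hidden_dim)  # the (0, 0) point, left-most plot
x0 = torch.tensor([[1.0, 1.0]])              # stand-in first data point

th = linear_hidden(initial_hidden)  # second plot: affine transformation
tx = linear_input(x0)               # third plot: translate by transformed input
new_hidden = torch.tanh(th + tx)    # right-most plot: squashed back into (-1, 1)

Calling rnn_cell(x0, initial_hidden) should produce exactly the same new_hidden, which is what the sanity check above verified.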

I guess it is time to feed the full sequence to the RNN cell, right? You may be tempted to do it like this:

# WRONG!
rnn_cell(X)

Output

tensor([[ 0.3924,  0.8146],
        [ 0.7864,  0.5266],
        [-0.0047, -0.2897],
        [-0.6817,  0.1109]], grad_fn=<TanhBackward>)

This is wrong! Remember, the RNN cell has two inputs: one hidden state and one data point.

"Where is the hidden state, then?"

That's exactly the problem! If not provided, it defaults to the zeros corresponding to the initial hidden state. So, the call above is not processing four steps of one sequence, but rather processing the first step of what it assumes to be four sequences.

To effectively use the RNN cell on a sequence, we need to loop over the data points and provide the updated hidden state at each step:

hidden = torch.zeros(1, hidden_dim)
for i in range(X.shape[0]):
    out = rnn_cell(X[i:i+1], hidden)
    print(out)
    hidden = out

Output

tensor([[0.3924, 0.8146]], grad_fn=<TanhBackward>)
tensor([[ 0.4347, -0.0481]], grad_fn=<TanhBackward>)
tensor([[-0.1521, -0.3367]], grad_fn=<TanhBackward>)
tensor([[-0.5297, 0.3551]], grad_fn=<TanhBackward>)
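
To make the batch-versus-sequence point concrete, here is a small, self-contained check (the cell and the data points are stand-ins, so the numbers will differ from the outputs above): the "WRONG!" call produces exactly the same result as feeding all four points as a batch of first steps with a zero hidden state.

import torch
import torch.nn as nn

torch.manual_seed(19)  # stand-in seed
n_features, hidden_dim = 2, 2
rnn_cell = nn.RNNCell(input_size=n_features, hidden_size=hidden_dim)

# Stand-in "sequence" of four two-dimensional points: (seq_len, n_features)
X = torch.tensor([[ 1.0,  1.0],
                  [-1.0,  1.0],
                  [-1.0, -1.0],
                  [ 1.0, -1.0]])

# The "WRONG!" call treats the first dimension as a batch of four sequences,
# each at its first step, with the hidden state defaulting to zeros
wrong = rnn_cell(X)
first_steps = rnn_cell(X, torch.zeros(4, hidden_dim))
print(torch.allclose(wrong, first_steps))  # True

# The correct way: loop over the steps, carrying the hidden state along
hidden = torch.zeros(1, hidden_dim)
for i in range(X.shape[0]):
    hidden = rnn_cell(X[i:i+1], hidden)
print(hidden)  # final hidden state after processing the full sequence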

