Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner’s Guide




2. An RNN cell takes two inputs: the hidden state representing the sequence so far, and a data point from the sequence (like the coordinates of one of the corners from a given square).

3. The two inputs are used to produce a new hidden state (h_0 for the first data point), representing the updated state of the sequence now that a new point was presented to it.

4. The new hidden state is both the output of the current step and one of the inputs of the next step.

5. If there is yet another data point in the sequence, it goes back to Step #2; if not, the last hidden state (h_1 in the figure above) is also the final hidden state (h_f) of the whole RNN (see the sketch right after this list).
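To make these steps concrete, here is a minimal sketch of the loop using PyTorch’s nn.RNNCell. The sizes are assumptions for illustration only: two features per data point (the corners’ x and y coordinates) and a hypothetical hidden state of size three.

import torch
import torch.nn as nn

torch.manual_seed(19)

n_features, hidden_dim = 2, 3  # assumed sizes: (x, y) per corner, hidden state of size 3
cell = nn.RNNCell(input_size=n_features, hidden_size=hidden_dim)

# a sequence of data points: the four corners of a square, shape (4, 2)
X = torch.as_tensor([[-1., -1.], [-1., 1.], [1., 1.], [1., -1.]])

h = torch.zeros(1, hidden_dim)   # Step #1: initial hidden state
for x in X:                      # Steps #2 to #5, one data point at a time
    h = cell(x.unsqueeze(0), h)  # new hidden state: output of this step, input to the next

h_f = h  # the last hidden state is the final hidden state of the whole RNN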

Since the final hidden state is a representation of the full sequence, that’s what we’re going to use as features for our classifier.
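Continuing the sketch above, the classifier itself can be as simple as a single linear layer on top of h_f (a hypothetical two-class setup):

# a hypothetical binary classifier on top of the final hidden state
classifier = nn.Linear(hidden_dim, 2)
logits = classifier(h_f)  # shape (1, 2): the final hidden state works as a feature vector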

In a way, that’s not so different from the way we used CNNs:

There, we’d run the pixels through multiple convolutional blocks (convolutional layer + activation + pooling) and flatten them into a vector at the end to use as features for a classifier.

Here, we run a sequence of data points through RNN cells and use the final hidden state (also a vector) as features for a classifier.

There is a fundamental difference between CNNs and RNNs, though: While there are several different convolutional layers, each learning its own filters, the RNN cell is one and the same. In this sense, the "unrolled" representation is misleading: It definitely looks like each input is being fed to a different RNN cell, but that’s not the case.

There is only one cell, which will learn a particular set of weights and biases, and which will transform the inputs exactly the same way in every step of the sequence. Don’t worry if this doesn’t completely make sense to you just yet; I promise it will become clearer soon, especially in the "Journey of a Hidden State" section.
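We can verify this weight sharing directly. Building on the sketch above, unrolling two steps by hand runs the inputs through the very same cell object, and inspecting its parameters shows there is only one set of weights and biases, no matter how many steps we unroll:

# two "unrolled" steps: same cell, same weights and biases
h0 = torch.zeros(1, hidden_dim)
h1 = cell(X[0:1], h0)  # first step
h2 = cell(X[1:2], h1)  # second step, through the exact same cell

# a single set of parameters, regardless of the sequence length
for name, param in cell.named_parameters():
    print(name, param.shape)  # weight_ih, weight_hh, bias_ih, bias_hh (once each)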

RNN Cell

Let’s take a look at some of the internals of an RNN cell:

Figure 8.6 - Internals of an RNN cell

On the left, we have a single RNN cell. It has three main components:

• A linear layer to transform the hidden state (in blue)

• A linear layer to transform the data point from the sequence (in red)

• An activation function, usually the hyperbolic tangent (TanH), which is applied to the sum of both transformed inputs

We can also represent them as equations:

$$
\begin{aligned}
t_h &= W_{hh} \, h_{t-1} + b_{hh} \\
t_x &= W_{ih} \, x_t + b_{ih} \\
h_t &= \tanh(t_h + t_x)
\end{aligned}
$$

Equation 8.1 - RNN

I chose to split the equation into smaller colored parts to highlight the fact that these are simple linear layers producing both a transformed hidden state (t_h) and a transformed data point (t_x). The updated hidden state (h_t) is both the output of this particular cell and one of the inputs of the "next" cell.

But there is no other cell, really; it is just the same cell over and over again, as depicted on the right side of the figure above. So, in the second step of the sequence, the updated hidden state will run through the very same linear layer the initial hidden state ran through. The same goes for the second data point.
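To tie Equation 8.1 back to code, here is a minimal sketch that reproduces one step of the cell using two plain linear layers and a TanH. It reuses the weights of the cell from the earlier sketch so both computations match; the layer names are my own, not PyTorch’s.

# Equation 8.1 by hand: two linear layers plus TanH
linear_hidden = nn.Linear(hidden_dim, hidden_dim)  # transforms the hidden state (blue)
linear_input = nn.Linear(n_features, hidden_dim)   # transforms the data point (red)

# copy the cell's weights so the manual computation matches it exactly
with torch.no_grad():
    linear_hidden.weight.copy_(cell.weight_hh)
    linear_hidden.bias.copy_(cell.bias_hh)
    linear_input.weight.copy_(cell.weight_ih)
    linear_input.bias.copy_(cell.bias_ih)

h = torch.zeros(1, hidden_dim)   # initial hidden state
th = linear_hidden(h)            # transformed hidden state (t_h)
tx = linear_input(X[0:1])        # transformed data point (t_x)
updated_h = torch.tanh(th + tx)  # h_t: same result as cell(X[0:1], h)

Running cell(X[0:1], h) should produce exactly the same updated_h, which is what "one and the same cell" means in practice.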
