

Let’s take a look at the RNN’s arguments:

• input_size: It is the number of features in each data point of the sequence.

• hidden_size: It is the number of hidden dimensions you want to use.

• bias: Just like any other layer, it includes the bias in the equations.

• nonlinearity: By default, it uses the hyperbolic tangent, but you can change it to ReLU if you want (see the short sketch right after this list).
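
Both optional arguments can be passed straight to the constructor. Here is a minimal sketch (the choice of ReLU and bias=False is just for illustration, not something we need for our example):

import torch.nn as nn

# Same kind of layer, but using ReLU and dropping the bias terms
relu_rnn = nn.RNN(input_size=2, hidden_size=2, bias=False, nonlinearity='relu')
print(list(relu_rnn.state_dict().keys()))  # ['weight_ih_l0', 'weight_hh_l0']: no biases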

The four arguments above are exactly the same as those in the RNN cell, so we can easily create a full-fledged RNN like this:

n_features = 2
hidden_dim = 2

torch.manual_seed(19)
rnn = nn.RNN(input_size=n_features, hidden_size=hidden_dim)
rnn.state_dict()

Output

OrderedDict([('weight_ih_l0', tensor([[ 0.6627, -0.4245],
                                      [ 0.5373,  0.2294]])),
             ('weight_hh_l0', tensor([[-0.4015, -0.5385],
                                      [-0.1956, -0.6835]])),
             ('bias_ih_l0', tensor([0.4954, 0.6533])),
             ('bias_hh_l0', tensor([-0.3565, -0.2904]))])

Since the seed is exactly the same, you’ll notice that the weights and biases have exactly the same values as our former RNN cell. The only difference is in the parameters' names: Now they all have an _l0 suffix to indicate they belong to the first "layer."

"What do you mean by layer? Isn’t the RNN itself a layer?"

Yes, the RNN itself can be a layer in our model. But it may have its own internal layers! You can configure those with the following four extra arguments:

• num_layers: The RNN we’ve been using so far has one layer (the default value), but if you use more than one, you’ll be creating a stacked RNN, which we’ll see in its own section.

• bidirectional: So far, our RNNs have been handling sequences in the left-to-right direction (the default), but if you set this argument to True, you’ll be creating a bidirectional RNN, which we’ll also see in its own section.

• dropout: This introduces an RNN’s own dropout layer between its internal layers, so it only makes sense if you’re using a stacked RNN.

And I saved the best (actually, the worst) for last:

• batch_first: The documentation says, "if True, then the input and output tensors are provided as (batch, seq, feature)," which makes you think that you only need to set it to True and it will turn everything into your nice and familiar tensors, where different batches are concatenated together as the first dimension. You’d be sorely mistaken.

"Why? What’s wrong with that?"

The problem is, you need to read the documentation very literally: Only the input and output tensors are going to be batch first; the hidden state will never be batch first. This behavior may bring complications you need to be aware of.

Shapes

Before going through an example, let’s take a look at the expected inputs and outputs of our RNN:

• Inputs:

◦ The input tensor containing the sequence you want to run through the RNN:

▪ The default shape is sequence-first; that is, (sequence length, batch size, number of features), which we’re abbreviating to (L, N, F).

▪ But if you choose batch_first, it will flip the first two dimensions, and then it will expect an (N, L, F) shape, which is what you’re likely getting from a data loader (see the sketch after this list).

▪ By the way, the input can also be a packed sequence; we’ll get back to that in a later section.
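
To make the batch-first caveat concrete, here is a minimal sketch with made-up dimensions (N=3 sequences, each of length L=4, with F=2 features): it runs a random batch through a batch-first RNN and prints the shapes it gets back.

import torch
import torch.nn as nn

# Made-up dimensions: N=3 sequences of length L=4 with F=2 features each
N, L, F = 3, 4, 2
batch = torch.randn(N, L, F)  # (N, L, F), like a batch coming from a data loader

rnn_bf = nn.RNN(input_size=F, hidden_size=2, batch_first=True)
output, hidden = rnn_bf(batch)

print(output.shape)  # torch.Size([3, 4, 2]) -> (N, L, H): batch first, as promised
print(hidden.shape)  # torch.Size([1, 3, 2]) -> (1, N, H): the hidden state is NOT batch first

If you stack layers or make the RNN bidirectional, that first dimension of the hidden state grows to num_layers times the number of directions, but the batch still sits in the second dimension.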
