Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)


Before moving on to packed sequences, though, let's just check the (permuted, batch-first) final hidden state:

hidden_padded.permute(1, 0, 2)

Output

tensor([[[ 0.3161, -0.1675]],
        [[-0.0642,  0.6012]],
        [[-0.1007,  0.5349]]], grad_fn=<PermuteBackward>)

Packing

Packing works like a concatenation of sequences: instead of padding them to have equal-length elements, it lines the sequences up, one after the other, and keeps track of the lengths, so it knows the indices corresponding to the start of each sequence.

Let's work through an example using PyTorch's nn.utils.rnn.pack_sequence(). First, it takes a list of tensors as input. If your list is not sorted by decreasing sequence length, you'll need to set its enforce_sorted argument to False.

Sorting the sequences by their lengths is only necessary if you're planning on exporting your model using the ONNX format, which allows you to import the model in different frameworks.

packed = rnn_utils.pack_sequence(seq_tensors, enforce_sorted=False)
packed


Output

PackedSequence(data=tensor([[ 1.0349,  0.9661],
        [-1.1247, -0.9683],
        [-1.0911,  0.9254],
        [ 0.8055, -0.9169],
        [ 0.8182, -0.9944],
        [-1.0771, -1.0414],
        [-0.8251, -0.9499],
        [ 1.0081,  0.7680],
        [-0.8670,  0.9342]]), batch_sizes=tensor([3, 3, 2, 1]),
sorted_indices=tensor([0, 2, 1]), unsorted_indices=tensor([0, 2, 1]))
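As a sanity check, the packed structure above can be reproduced end to end. The sketch below uses three randomly generated sequences with the same lengths as ours (four, two, and three two-dimensional points), so the data values will differ, but batch_sizes and sorted_indices come out the same:

```python
import torch
import torch.nn.utils.rnn as rnn_utils

# Three sequences of lengths 4, 2, and 3 (each step is a 2-D point);
# random values stand in for the actual sequences used in the text.
torch.manual_seed(23)
seq_tensors = [torch.randn(n, 2) for n in (4, 2, 3)]

packed = rnn_utils.pack_sequence(seq_tensors, enforce_sorted=False)

print(packed.batch_sizes)     # tensor([3, 3, 2, 1])
print(packed.sorted_indices)  # tensor([0, 2, 1])
print(packed.data.shape)      # torch.Size([9, 2]) -- 4 + 2 + 3 rows
```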

The output is a bit cryptic, to say the least. Let's decipher it, piece by piece, starting with the sorted_indices tensor. Even though we didn't sort the list ourselves, PyTorch did it internally, and it found that the longest sequence is the first (four data points, index 0), followed by the third (three data points, index 2), and then by the shortest one (two data points, index 1).
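We can double-check that reading with a tiny sketch: sorting the same lengths ourselves, by decreasing length, should reproduce the sorted_indices shown in the output above (the lengths 4, 2, and 3 are taken from our three sequences; everything else here is just illustrative):

```python
# Sorting the sequence lengths by decreasing length should match
# sorted_indices=tensor([0, 2, 1]) from the PackedSequence output.
lengths = [4, 2, 3]  # lengths of our three sequences
sorted_indices = sorted(range(len(lengths)),
                        key=lambda i: lengths[i], reverse=True)
print(sorted_indices)  # [0, 2, 1]
```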

Once the sequences are listed in order of decreasing length, like in Figure 8.27, the number of sequences that are at least t steps long (corresponding to the number of columns in the figure below) is given by the batch_sizes attribute:

Figure 8.27 - Packing sequences

For example, the batch_size for the third column is two because two sequences have at least three data points. Then, it goes through the data points in the same
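The "at least t steps long" reading can be confirmed by computing batch_sizes by hand from the sequence lengths; this is just a sketch of the counting rule, not how PyTorch implements it internally:

```python
# batch_sizes[t] counts how many sequences have at least t+1 steps.
lengths = [4, 2, 3]  # lengths of our three sequences
max_len = max(lengths)
batch_sizes = [sum(1 for n in lengths if n >= t + 1)
               for t in range(max_len)]
print(batch_sizes)  # [3, 3, 2, 1] -- matches batch_sizes above
```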

