
Data Preparation

import torch
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset):
    def __init__(self, x, y):
        # Each sequence becomes its own float tensor, so lengths may differ
        self.x = [torch.as_tensor(s).float() for s in x]
        # Labels as a float column vector, one row per sequence
        self.y = torch.as_tensor(y).float().view(-1, 1)

    def __getitem__(self, index):
        return (self.x[index], self.y[index])

    def __len__(self):
        return len(self.x)

train_var_data = CustomDataset(var_points, var_directions)
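For reference, here is a minimal sketch of the kind of input this dataset expects, using made-up stand-ins for var_points and var_directions (which are built earlier in the chapter):

import numpy as np

# Hypothetical stand-ins: three sequences of 2D points with different
# lengths (3, 4, and 5 steps), and one direction label per sequence
var_points = [np.random.randn(n, 2) for n in (3, 4, 5)]
var_directions = [0, 1, 0]

train_var_data = CustomDataset(var_points, var_directions)
x0, y0 = train_var_data[0]
print(x0.shape, y0.shape)  # torch.Size([3, 2]) torch.Size([1])

Retrieving individual samples works just fine; the trouble only starts when we try to batch them together.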

But this is not enough; if we create a data loader for our custom dataset and try to retrieve a mini-batch out of it, it will raise an error:

train_var_loader = DataLoader(
    train_var_data, batch_size=16, shuffle=True
)
next(iter(train_var_loader))

Output

-----------------------------------------------------------------
RuntimeError                      Traceback (most recent call last)
<ipython-input-34-596b8081f8d1> in <module>
      1 train_var_loader = DataLoader(train_var_data, batch_size=16,
                                      shuffle=True)
----> 2 next(iter(train_var_loader))
...
RuntimeError: stack expects each tensor to be equal size, but got
[3, 2] at entry 0 and [4, 2] at entry 2

It turns out, the data loader is trying to stack() together the sequences, which, as we know, have different sizes and thus cannot be stacked together.
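The default collate function boils down to calling torch.stack() on the list of samples in a mini-batch, so we can reproduce the failure in isolation (a minimal sketch with two sequences of different lengths):

seq_a = torch.randn(3, 2)  # a sequence with three points
seq_b = torch.randn(4, 2)  # a sequence with four points
# This is, in essence, what the default collate function attempts:
torch.stack([seq_a, seq_b])  # RuntimeError: stack expects each tensor to be equal size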

We could simply pad all the sequences and move on with a TensorDataset and regular data loader. But, in that case, the final hidden states would be affected by the padded data points.
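To make the padding idea concrete, this is one possible way to do it, using a custom collate function built on torch.nn.utils.rnn.pad_sequence (a sketch of the workaround just described; the pad_collate name is our own):

from torch.nn.utils.rnn import pad_sequence

def pad_collate(batch):
    # Pad every sequence in the mini-batch with zeros up to the
    # length of the longest one, yielding a (N, max_len, 2) tensor
    x, y = zip(*batch)
    return pad_sequence(x, batch_first=True), torch.stack(y)

train_var_loader = DataLoader(
    train_var_data, batch_size=16, shuffle=True, collate_fn=pad_collate
)

The zero rows appended to the shorter sequences would still be fed to the recurrent layer, though, which is exactly the hidden-state issue raised above.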

