Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner's Guide

Model Configuration & Training

Once again, we create both encoder and decoder models and use them as arguments in the large EncoderDecoderSelfAttn model that handles the boilerplate, and we’re good to go:

Model Configuration

torch.manual_seed(23)
encself = EncoderSelfAttn(n_heads=3, d_model=2,
                          ff_units=10, n_features=2)
decself = DecoderSelfAttn(n_heads=3, d_model=2,
                          ff_units=10, n_features=2)
model = EncoderDecoderSelfAttn(encself, decself,
                               input_len=2, target_len=2)
loss = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

Model Training

sbs_seq_selfattn = StepByStep(model, loss, optimizer)
sbs_seq_selfattn.set_loaders(train_loader, test_loader)
sbs_seq_selfattn.train(100)

fig = sbs_seq_selfattn.plot_losses()
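
As a reminder of what a call like train(100) roughly amounts to, here is a minimal sketch of a plain mini-batch training loop. It assumes StepByStep follows the usual pattern (forward pass, loss, backward pass, optimizer step) and it leaves out the validation pass. The names toy_x, toy_y, toy_loader, and toy_model are hypothetical stand-ins: a linear layer and synthetic data replace the encoder-decoder model and the real loaders, so the snippet runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(23)

# Synthetic stand-in data (an assumption): 32 points, 2 features, 2 targets
toy_x = torch.randn(32, 2)
toy_y = 2 * toy_x + 1
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=16, shuffle=True)

# Stand-in model; the loss and optimizer mirror the configuration above
toy_model = nn.Linear(2, 2)
loss_fn = nn.MSELoss()
optimizer = optim.Adam(toy_model.parameters(), lr=0.01)

losses = []
for epoch in range(100):
    toy_model.train()
    batch_losses = []
    for x_batch, y_batch in toy_loader:
        yhat = toy_model(x_batch)       # forward pass
        loss = loss_fn(yhat, y_batch)   # compute the loss
        loss.backward()                 # compute gradients
        optimizer.step()                # update parameters
        optimizer.zero_grad()           # reset gradients for the next batch
        batch_losses.append(loss.item())
    # Average mini-batch loss for this epoch
    losses.append(sum(batch_losses) / len(batch_losses))

print(f"first epoch: {losses[0]:.4f}, last epoch: {losses[-1]:.4f}")
```

The list of per-epoch losses is the kind of data a loss plot is built from, one value per epoch.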

Even though we did our best to ensure the reproducibility of the results, you may still find some differences in the loss curves (and, consequently, in the attention scores as well). PyTorch’s documentation about reproducibility states the following:

"Completely reproducible results are not guaranteed across PyTorch releases, individual commits, or different platforms. Furthermore, results may not be reproducible between CPU and GPU executions, even when using identical seeds."
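
Within a single platform and PyTorch release, we can still reduce run-to-run variation. The snippet below is a minimal sketch of one common approach, not a guarantee: it seeds Python's and PyTorch's random number generators and asks cuDNN for deterministic behavior. The helper name set_seed is hypothetical, and the cuDNN flags only matter when running on a GPU.

```python
import random

import torch


def set_seed(seed=23):
    """Seed Python's and PyTorch's generators and request deterministic cuDNN."""
    random.seed(seed)
    torch.manual_seed(seed)
    # These flags only affect GPU runs that go through cuDNN
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Drawing after re-seeding yields the same values on the same machine
set_seed(23)
first = torch.randn(3)
set_seed(23)
second = torch.randn(3)
print(torch.equal(first, second))
```

Calling set_seed right before building the model and the data loaders keeps weight initialization and shuffling consistent between runs on the same setup; differences across platforms or releases may still remain, as the quote above warns.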

