
• training the Transformer to tackle our sequence-to-sequence problem
• understanding that the validation loss may be much lower than the training loss due to the regularizing effect of dropout
• training another model using PyTorch’s (norm-last) Transformer class (see the first sketch at the end of this recap)
• using the Vision Transformer architecture to tackle an image classification problem
• splitting an image into flattened patches by either rearranging or embedding them
• adding a special classifier token to the embeddings
• using the encoder’s output corresponding to the special classifier token as features for the classifier (see the second sketch at the end of this recap)

Congratulations! You’ve just assembled and trained your first Transformer (and even a cutting-edge Vision Transformer!): This is no small feat. Now you know what "layers" and "sub-layers" stand for and how they’re brought together to build a Transformer. Keep in mind, though, that you may find slightly different implementations around. They may be norm-first or norm-last, or maybe yet another customization. The details may be different, but the overall concept remains: It is all about stacking attention-based "layers."

"Hey, what about BERT? Shouldn’t we use Transformers to tackle NLP problems?"

I was actually waiting for this question: Yes, we should, and we will, in the next chapter. As you have seen, it is already hard enough to understand the Transformer even when it’s used to tackle such a simple sequence-to-sequence problem as ours. Trying to train a model to handle a more complex natural language processing problem would only make it even harder.

In the next chapter, we’ll start with some NLP concepts and techniques like tokens, tokenization, word embeddings, and language models, and work our way up to contextual word embeddings, GPT-2, and BERT. We’ll be using several Python packages, including the famous HuggingFace :-)
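Just to jog your memory, here is a minimal, self-contained sketch of calling PyTorch’s (norm-last) Transformer class directly on batch-first tensors. The dimensions and variable names below are made up for illustration only; they are not necessarily the ones we used for our own model in this chapter:

import torch
import torch.nn as nn

# tiny, made-up dimensions for illustration only
torch.manual_seed(42)
transf = nn.Transformer(d_model=6, nhead=3,
                        num_encoder_layers=1, num_decoder_layers=1,
                        dim_feedforward=20, dropout=0.1,
                        batch_first=True)   # inputs shaped (N, L, D)

src = torch.randn(1, 2, 6)   # source sequence: two data points, six features each
tgt = torch.randn(1, 2, 6)   # (shifted) target sequence

# causal mask so the decoder cannot peek at future target points
tgt_mask = transf.generate_square_subsequent_mask(tgt.size(1))

out = transf(src, tgt, tgt_mask=tgt_mask)
print(out.shape)   # torch.Size([1, 2, 6])

In practice, you would still wrap it with input projections, positional encoding, and a final linear layer on top, as we did when assembling the full model.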
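And here is a similarly minimal sketch of the Vision Transformer pipeline: splitting an image into flattened patches using einops’ rearrange, embedding them, prepending a special classifier token, and using the encoder’s output for that token as features for the classifier. The image size, model size, and variable names are, once again, made up for illustration, and positional encoding is left out to keep it short:

import torch
import torch.nn as nn
from einops import rearrange

patch_size, d_model, n_classes = 4, 16, 3
imgs = torch.randn(8, 1, 12, 12)   # mini-batch of eight single-channel 12x12 images

# rearranging: each 4x4 patch becomes one flattened "token" of length c * patch_size^2
patches = rearrange(imgs, 'n c (h p1) (w p2) -> n (h w) (p1 p2 c)',
                    p1=patch_size, p2=patch_size)                      # (8, 9, 16)
embedded = nn.Linear(patch_size * patch_size * 1, d_model)(patches)   # patch embeddings

# special classifier token, prepended to every sequence of patch embeddings
cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
embedded = torch.cat([cls_token.expand(imgs.size(0), -1, -1), embedded], dim=1)

# (norm-last) Transformer encoder; positional encoding omitted in this sketch
enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=2,
                                       dim_feedforward=32, batch_first=True)
encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
states = encoder(embedded)

# the output corresponding to the classifier token is the feature vector for the classifier
logits = nn.Linear(d_model, n_classes)(states[:, 0])
print(logits.shape)   # torch.Size([8, 3])

In the chapter, the patch embedding, classifier token, and classification head were, of course, bundled into a proper module instead of being called one by one like this.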

[146] https://github.com/dvgodoy/PyTorchStepByStep/blob/master/Chapter10.ipynb

[147] https://colab.research.google.com/github/dvgodoy/PyTorchStepByStep/blob/master/Chapter10.ipynb

[148] https://arxiv.org/abs/1906.04341

[149] https://arxiv.org/abs/1706.03762

[150] http://nlp.seas.harvard.edu/2018/04/03/attention

[151] https://arxiv.org/abs/1607.06450

[152] https://arxiv.org/abs/2010.11929

[153] https://github.com/arogozhnikov/einops

[154] https://amaarora.github.io/2021/01/18/ViT.html

[155] https://github.com/lucidrains/vit-pytorch

