Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)
Frequently Asked Questions (FAQ)

Why PyTorch?

First, coding in PyTorch is fun :-) Really, there is something to it that makes it very enjoyable to write code in. Some say it is because it is very pythonic, or maybe there is something else, who knows? I hope that, by the end of this book, you feel like that too!

Second, maybe there are even some unexpected benefits to your health: check Andrej Karpathy's tweet [3] about it!

Jokes aside, PyTorch is the fastest-growing [4] framework for developing deep learning models, and it has a huge ecosystem. [5] That is, there are many tools and libraries developed on top of PyTorch. It is already the preferred framework [6] in academia and is making its way into industry.

Several companies are already powered by PyTorch; [7] to name a few:

• Facebook: The company is the original developer of PyTorch, released in October 2016.
• Tesla: Watch Andrej Karpathy (AI director at Tesla) speak about "how Tesla is using PyTorch to develop full self-driving capabilities for its vehicles" in this video. [8]
• OpenAI: In January 2020, OpenAI decided to standardize its deep learning framework on PyTorch (source [9]).
• fastai: fastai is a library [10] built on top of PyTorch to simplify model training and is used in its "Practical Deep Learning for Coders" [11] course. The fastai library is deeply connected to PyTorch and "you can't become really proficient at using fastai if you don't know PyTorch well, too." [12]
• Uber: The company is a significant contributor to PyTorch's ecosystem, having developed libraries like Pyro [13] (probabilistic programming) and Horovod [14] (a distributed training framework).
• Airbnb: PyTorch sits at the core of the company's dialog assistant for customer service (source [15]).

This book aims to get you started with PyTorch while giving you a solid understanding of how it works.
Why This Book?
If you’re looking for a book where you can learn about deep learning and PyTorch
without having to spend hours deciphering cryptic text and code, and one that’s
easy and enjoyable to read, this is it :-)
The book covers everything from the basics of gradient descent all the way up to
fine-tuning large NLP models (BERT and GPT-2) using HuggingFace. It is divided
into four parts:
• Part I: Fundamentals (gradient descent, training linear and logistic regressions
in PyTorch)
• Part II: Computer Vision (deeper models and activation functions,
convolutions, transfer learning, initialization schemes)
• Part III: Sequences (RNN, GRU, LSTM, seq2seq models, attention, self-attention,
Transformers)
• Part IV: Natural Language Processing (tokenization, embeddings, contextual
word embeddings, ELMo, BERT, GPT-2)
This is not a typical book: most tutorials start with some nice and pretty image
classification problem to illustrate how to use PyTorch. It may seem cool, but I
believe it distracts you from the main goal: learning how PyTorch works. In this
book, I present a structured, incremental, and from-first-principles approach to
learning PyTorch.
Moreover, this is not a formal book in any way: I am writing this book as if I were
having a conversation with you, the reader. I will ask you questions (and give you
answers shortly afterward), and I will also make (silly) jokes.
My job here is to make you understand the topic, so I will avoid fancy
mathematical notation as much as possible and spell it out in plain English.
In this book, I will guide you through the development of many models in PyTorch,
showing you why PyTorch makes it much easier and more intuitive to build models
in Python: autograd, dynamic computation graph, model classes, and much, much
more.
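To give you a tiny taste of what is ahead, here is a minimal sketch of autograd and the dynamic computation graph in action (assuming you have PyTorch installed; we will cover all of this properly in Part I):

```python
import torch

# A tensor that tracks every operation performed on it
w = torch.tensor(2.0, requires_grad=True)

# The computation graph is built dynamically, as the code runs:
# y = w^2 + 3w
y = w ** 2 + 3 * w

# Autograd walks the graph backward and computes dy/dw = 2w + 3
y.backward()

print(w.grad)  # tensor(7.) since 2 * 2.0 + 3 = 7
```

No sessions, no static graph compilation: you write plain Python, and PyTorch differentiates it for you.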
We will build, step-by-step, not only the models themselves but also your
understanding as I show you both the reasoning behind the code and how to avoid
some common pitfalls and errors along the way.