
Preface

The complexity of deep learning models is also increasing. ResNet-50 is an image recognition model (see Chapters 4 and 5) with about 26 million parameters. Every single parameter is a weight that is adjusted during training. Transformers, GPT-1, BERT, and GPT-2 [7] are natural language processing models (see Chapter 8, Recurrent Neural Networks) able to perform a variety of tasks on text. These models progressively grew from 340 million to 1.5 billion parameters. Recently, Nvidia claimed that it has been able to train the largest-known model, with 8.3 billion parameters, in just 53 minutes. This training allowed Nvidia to build one of the most powerful models to process textual information (https://devblogs.nvidia.com/training-bert-with-gpus/).
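As a quick sanity check of the ResNet-50 figure above, the following minimal sketch (assuming TensorFlow 2 is installed) instantiates the model with tf.keras and prints its parameter count; it reports roughly 25.6 million, which rounds to the 26 million quoted here:

```python
import tensorflow as tf

# Build ResNet-50 with randomly initialized weights; we only need
# the architecture to count parameters, not trained weights.
model = tf.keras.applications.ResNet50(weights=None)

# count_params() sums the sizes of all weight tensors in the model.
print(f"ResNet-50 parameters: {model.count_params():,}")
```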

Figure 3: Growth in number of parameters for various deep learning models

Besides that, computational capacity is significantly increasing. GPUs and TPUs (see Chapter 16, Tensor Processing Unit) are deep learning accelerators that have made it possible to train large models in a very short amount of time. TPU3s, announced in May 2018, are about twice as powerful (360 teraflops) as the TPU2s announced in May 2017. A full TPU3 pod can deliver more than 100 petaflops of machine learning performance, while TPU2 pods can get to 11.5 petaflops.
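To make these scales concrete, here is a back-of-the-envelope comparison using the figures quoted above; the 180-teraflop per-chip value for TPU2 is an assumption inferred from the "about twice as powerful" claim, not a figure from this text:

```python
# Figures quoted above; TPU2_CHIP_TFLOPS is assumed from
# "about twice as powerful" (360 / 2 = 180).
TPU2_CHIP_TFLOPS = 180
TPU3_CHIP_TFLOPS = 360
TPU2_POD_PFLOPS = 11.5
TPU3_POD_PFLOPS = 100.0   # "more than 100 petaflops"

# 1 petaflop = 1,000 teraflops, so each pod aggregates many chips.
print(f"TPU3 chip vs TPU2 chip: {TPU3_CHIP_TFLOPS / TPU2_CHIP_TFLOPS:.1f}x")
print(f"TPU3 pod vs TPU2 pod:   {TPU3_POD_PFLOPS / TPU2_POD_PFLOPS:.1f}x")
```

The pod-to-pod ratio (roughly 9x) is larger than the chip-to-chip ratio (2x), reflecting that pod generations grew in chip count as well as per-chip speed.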

