Tensor Processing Unit

Many people believe that we are currently in an era where this trend cannot be sustained for long, and indeed it has already declined during the past few years. Therefore, we need some additional technology if we want to support the demand for faster and faster computation to process the ever-growing amount of data that is available out there.

One improvement came from so-called GPUs: special-purpose chips that are perfect for fast graphics operations such as matrix multiplication, rasterization, frame buffer manipulation, texture mapping, and many others. In addition to computer graphics, where matrix multiplications are applied to the pixels of images, GPUs also turned out to be a great match for deep learning. This is a funny story of serendipity: a great example of a technology created for one goal and then meeting staggering success in a domain completely unrelated to the one it was originally envisioned for.

Serendipity is the occurrence and development of events by chance in a happy or beneficial way.

TPUs

One problem encountered in using GPUs for deep learning is that these chips are made for graphics and gaming, not solely for fast matrix computations. This is of course to be expected, given that the G in GPU stands for Graphics! GPUs led to unbelievable improvements for deep learning but, in the case of tensor operations for neural networks, large parts of the chip are not used at all. For deep learning, there is no need for rasterization, no need for frame buffer manipulation, and no need for texture mapping. The only thing that is necessary is a very efficient way to compute matrix and tensor operations. It should be no surprise that GPUs are not necessarily the ideal solution for deep learning, since CPUs and GPUs were designed long before deep learning became successful.
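To make this point concrete, here is a minimal NumPy sketch (the layer sizes are illustrative, not taken from the text) showing that the forward pass of a dense neural network layer reduces to exactly the kind of operation discussed above: one matrix multiplication plus a bias add.

```python
import numpy as np

# Illustrative sketch: a dense (fully connected) layer's forward pass
# is essentially a single matrix multiplication plus a bias add --
# the tensor operation that matters for deep learning hardware.
rng = np.random.default_rng(0)

batch, n_in, n_out = 32, 128, 64
x = rng.standard_normal((batch, n_in))   # input activations
W = rng.standard_normal((n_in, n_out))   # layer weights
b = rng.standard_normal(n_out)           # bias vector

y = x @ W + b   # the core operation: matmul + bias
print(y.shape)  # one (32, 64) output activation matrix
```

A chip that does nothing but this operation very fast, with no silicon spent on rasterization or texture mapping, is the idea behind the TPU.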

Before going into the technical details, let's first discuss the fascinating genesis of Tensor Processing Unit version 1, or TPU v1. In 2013, Jeff Dean, then head of the Google Brain division, estimated (see Figure 1) that if all the people owning a mobile phone talked for only three minutes more per day, Google would have needed two to three times more servers to process the resulting data. This would have been an unaffordable case of success-disaster, that is, one where great success leads to problems that cannot be properly managed.

