
TensorFlow 1.x and 2.x

Note that each batch of the given input is divided equally among the multiple GPUs. For instance, if using MirroredStrategy() with two GPUs, each batch of size 256 will be divided among the two GPUs, with each of them receiving 128 input examples for each step. In addition, note that each GPU will optimize on the received batches and the TensorFlow backend will combine all these independent optimizations on our behalf. If you want to know more, you can have a look at the notebook online (https://colab.research.google.com/drive/1mf-PK0a20CkObnT0hCl9VPEje1szhHat#scrollTo=wYar3A0vBVtZ), where I explain how to use GPUs in Colab with a Keras model built for MNIST classification. The notebook is available in the GitHub repository.

In short, using multiple GPUs is very easy and requires minimal changes to the tf.keras code used for a single server.
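
As a minimal sketch of this setup (assuming two visible GPUs and the standard Keras MNIST workflow, not the exact code of the notebook above), the model only needs to be built and compiled inside the strategy scope:

import tensorflow as tf

# The strategy picks up all visible GPUs by default.
strategy = tf.distribute.MirroredStrategy()
print('Number of replicas:', strategy.num_replicas_in_sync)

# Load and normalize MNIST.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 255.0

# Build and compile the model inside the strategy scope so that the
# variables are mirrored on every GPU.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

# With two GPUs, each batch of 256 is split into two batches of 128.
model.fit(x_train, y_train, batch_size=256, epochs=5)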

MultiWorkerMirroredStrategy

This strategy implements synchronous distributed training across multiple workers, each one with potentially multiple GPUs. As of September 2019, the strategy works only with Estimators and has experimental support for tf.keras. It should be used if you are aiming to scale beyond a single machine with high performance. Data must be loaded with tf.data.Dataset and shared across workers so that each worker can read a unique subset.
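
As a hedged sketch (the host addresses and the worker index below are placeholders for illustration, and the experimental namespace reflects the TF 2.0-era API), each worker sets TF_CONFIG before creating the strategy, and the tf.data pipeline is then sharded so that each worker reads a different subset:

import json
import os
import tensorflow as tf

# Placeholder cluster definition; each worker sets its own 'index'.
os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {'worker': ['host1:12345', 'host2:12345']},
    'task': {'type': 'worker', 'index': 0}
})

# TF_CONFIG must be set before the strategy is created.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

# Data must come from tf.data so that it can be shared across workers.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
dataset = tf.data.Dataset.from_tensor_slices(
    (x_train.astype('float32') / 255.0, y_train)).batch(256)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

model.fit(dataset, epochs=5)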

TPUStrategy

This strategy implements synchronous distributed training on TPUs. TPUs are Google's specialized ASIC chips, designed to significantly accelerate machine learning workloads in a way that is often more efficient than GPUs. We will talk more about TPUs in Chapter 16, Tensor Processing Unit. According to this public information (https://github.com/tensorflow/tensorflow/issues/24412):

"the gist is that we intend to announce support for TPUStrategy alongside

Tensorflow 2.1. Tensorflow 2.0 will work under limited use-cases but has many

improvements (bug fixes, performance improvements) that we're including in

Tensorflow 2.1, so we don't consider it ready yet."
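
Once that support is available, the usual pattern (a sketch based on the TF 2.1-era experimental API, assuming a Colab or Cloud TPU environment) is to resolve the TPU cluster, initialize it, and then build the model inside the strategy scope:

import tensorflow as tf

# Locate and initialize the TPU system; on some environments the TPU
# address may have to be passed explicitly to the resolver.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

strategy = tf.distribute.experimental.TPUStrategy(resolver)

# As with the other strategies, the model is built inside the scope.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])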

ParameterServerStrategy

This strategy implements either multi-GPU synchronous local training or asynchronous multi-machine training. For local training on one machine, the variables of the model are placed on the CPU and operations are replicated across all local GPUs.
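
A minimal sketch of how this strategy is wired in (assuming the TF 2.0-era experimental API, where it is consumed through Estimators and any cluster definition is read from TF_CONFIG) looks like this:

import tensorflow as tf

# With no TF_CONFIG cluster defined, this is the local case described
# above: variables on the CPU, computation replicated on local GPUs.
strategy = tf.distribute.experimental.ParameterServerStrategy()

# At the time of writing, the strategy is used through Estimators,
# by passing it to the RunConfig.
config = tf.estimator.RunConfig(train_distribute=strategy)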

