TensorFlow for Mobile and IoT and TensorFlow.js

In this chapter we will learn the basics of TensorFlow for Mobile and IoT (Internet of Things). We will briefly present TensorFlow Mobile and we will introduce TensorFlow Lite in more detail. TensorFlow Mobile and TensorFlow Lite are open source deep learning frameworks for on-device inference. Some examples of Android, iOS, and Raspberry Pi applications will be discussed, together with examples of deploying pretrained models such as MobileNet v1, v2, and v3 (image classification models designed for mobile and embedded vision applications), PoseNet for pose estimation (a vision model that estimates the poses of people in images or video), DeepLab segmentation (an image segmentation model that assigns semantic labels (for example, dog, cat, car) to every pixel in the input image), and MobileNet SSD object detection (an image classification model that detects multiple objects with bounding boxes). This chapter will conclude with an example of federated learning, a new machine learning framework, distributed over millions of mobile devices, that is designed to respect user privacy.

TensorFlow Mobile

TensorFlow Mobile is a framework for running TensorFlow models on iOS and Android. The key idea is to have a platform that allows you to run light models that don't consume too many device resources such as battery or memory. Typical examples of applications are on-device image recognition, speech recognition, or gesture recognition. TensorFlow Mobile was quite popular until 2018, but then became progressively less and less adopted in favor of TensorFlow Lite.

TensorFlow Lite

TensorFlow Lite is a lightweight platform designed by TensorFlow. This platform is focused on mobile and embedded devices such as Android, iOS, and Raspberry Pi. The main goal is to enable machine learning inference directly on the device by putting a lot of effort into three main characteristics: (1) a small binary and model size to save memory, (2) low energy consumption to save battery, and (3) low latency for efficiency. It goes without saying that battery and memory are two important resources for mobile and embedded devices. In order to achieve these goals, Lite uses a number of techniques such as quantization, FlatBuffers, a mobile interpreter, and a mobile converter, which we are going to review briefly in the following sections.
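Before looking at those techniques one by one, it helps to see the basic workflow end to end. The following is a minimal sketch of converting a model to the TensorFlow Lite FlatBuffer format; the tiny placeholder model here is an assumption made only so the example is self-contained, and stands in for any trained tf.keras model:

import tensorflow as tf

# Placeholder standing in for any trained tf.keras model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,))
])

# Convert the Keras model into the TensorFlow Lite FlatBuffer format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# The result is a byte string that can be bundled with a mobile app.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)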

Quantization

Quantization refers to a set of techniques that constrain an input made of continuous values (such as real numbers) into a discrete set (such as integers). The key idea is to reduce the space occupancy of Deep Learning (DL) models by representing the internal weights with integers instead of real numbers. Of course, this buys space savings at the cost of some amount of model performance. However, it has been empirically shown in many situations that a quantized model does not suffer from a significant decay in performance. TensorFlow Lite is internally built around a set of core operators supporting both quantized and floating-point operations.
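To make the idea concrete, here is a minimal NumPy sketch of the affine quantization scheme commonly used for 8-bit weights. This is an illustration of the underlying arithmetic, not TensorFlow Lite's internal code:

import numpy as np

weights = np.array([-0.4, 0.0, 0.7, 1.2], dtype=np.float32)

# Affine quantization to unsigned 8-bit integers:
# q = round(w / scale) + zero_point, so that w ~ (q - zero_point) * scale.
w_min, w_max = weights.min(), weights.max()
scale = (w_max - w_min) / 255.0
zero_point = int(round(-w_min / scale))

quantized = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
dequantized = (quantized.astype(np.float32) - zero_point) * scale

print(quantized)    # integer codes, 1 byte each instead of 4
print(dequantized)  # close to the original weights, up to rounding error

Each weight now occupies one byte instead of four, and dequantizing recovers the original values up to a small rounding error.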

Model quantization is a toolkit for applying quantization. This operation is applied to the representations of the weights and, optionally, to the activations, for both storage and computation. There are two types of quantization available:

• Post-training quantization quantizes weights and activations after training (a minimal sketch is shown after this list).

• Quantization-aware training allows for the training of networks that can be quantized with a minimal accuracy drop (this is only available for specific CNNs). Since this is a relatively experimental technique, we are not going to discuss it in this chapter, but the interested reader can find more information in [1].
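As anticipated above, post-training quantization is driven entirely by the converter. The following sketch reuses the placeholder model from the earlier example and enables weight quantization; the representative dataset uses random data as a stand-in for a few real input samples, which the converter needs in order to also calibrate activation ranges for full integer quantization:

import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Dynamic-range quantization: weights are stored as 8-bit integers.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Optional: supplying a representative dataset lets the converter
# calibrate activation ranges and quantize activations as well.
def representative_dataset():
    for _ in range(100):
        yield [np.random.rand(1, 784).astype(np.float32)]

converter.representative_dataset = representative_dataset

quantized_tflite_model = converter.convert()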

TensorFlow Lite supports reducing the precision of values from full floats to half-precision floats (float16) or 8-bit integers. TensorFlow reports multiple trade-offs in terms of accuracy, latency, and space for selected CNN models (see Figure 1, source: https://www.tensorflow.org/lite/performance/model_optimization).
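Choosing half-precision floats instead of 8-bit integers is a one-line change on the same converter. A minimal sketch, again assuming the model object from the earlier examples:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Restrict quantized weights to half-precision floats: roughly half the
# size of the float32 model, typically with minimal accuracy loss.
converter.target_spec.supported_types = [tf.float16]

float16_tflite_model = converter.convert()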
