
Chapter 3

• crossed_column: Used when we want to combine two columns into one feature. For example, with geolocation-based data it makes sense to combine longitude and latitude values into a single feature.
• numeric_column: Used when the feature is numeric; it can be a single value or even a matrix.
• indicator_column: We do not use this directly. Instead, it is used with the categorical column, but only when the number of categories is limited and can be represented as one-hot encoded.
• embedding_column: We do not use this directly. Instead, it is used with the categorical column, but only when the number of categories is very large and cannot be represented as one-hot encoded.
• bucketized_column: Used when, instead of a specific numeric value, we split the data into different categories depending upon its value.

The first six functions inherit from the Categorical Column class, the next three inherit from the Dense Column class, and the last one inherits from both classes. In the following example we will use the numeric_column and categorical_column_with_vocabulary_list functions.

Input functions

The data for training, evaluation, and prediction needs to be made available through an input function. The input function returns a tf.data.Dataset object; the object yields a tuple containing features and labels.

MNIST using TensorFlow Estimator API

Let us build a simple TensorFlow estimator with a simple dataset for a multiple regression problem. We continue with the home price prediction, but now have two features, that is, we are considering two independent variables on which we presume the price depends: the area of the house and its type (bungalow or apartment):

1. We import the necessary modules. We will need TensorFlow and its feature_column module. Since our dataset contains both numeric and categorical data, we need the functions to process both types of data:

    import tensorflow as tf
    from tensorflow import feature_column as fc

    numeric_column = fc.numeric_column
    categorical_column_with_vocabulary_list = fc.categorical_column_with_vocabulary_list

Regression

2. Now, we define the feature columns we will be using to train the regressor. Our dataset, as we mentioned, consists of two features: "area", a numeric value signifying the area of the house, and "type", telling whether it is a "bungalow" or an "apartment":

    featcols = [
        tf.feature_column.numeric_column("area"),
        tf.feature_column.categorical_column_with_vocabulary_list(
            "type", ["bungalow", "apartment"])
    ]

3. In the next step, we define an input function to provide input for training. The function returns a tuple containing features and labels:

    def train_input_fn():
        # Note: "house" is not in the vocabulary defined in step 2. It is left
        # as in the source because the result in step 6 was produced with it.
        features = {"area": [1000, 2000, 4000, 1000, 2000, 4000],
                    "type": ["bungalow", "bungalow", "house",
                             "apartment", "apartment", "apartment"]}
        labels = [500, 1000, 1500, 700, 1300, 1900]
        return features, labels

4. Next, we use the premade LinearRegressor estimator and fit it on the training dataset:

    model = tf.estimator.LinearRegressor(featcols)
    model.train(train_input_fn, steps=200)

5. Now that the estimator is trained, let us see the result of the prediction:

    def predict_input_fn():
        # Note: neither "house" nor "apt" is in the vocabulary
        # ["bungalow", "apartment"]; the values are kept as in the source.
        features = {"area": [1500, 1800],
                    "type": ["house", "apt"]}
        return features

    predictions = model.predict(predict_input_fn)
    print(next(predictions))
    print(next(predictions))

6. The result:

    {'predictions': array([692.7829], dtype=float32)}
    {'predictions': array([830.9035], dtype=float32)}
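To make concrete what kind of function the LinearRegressor in the steps above is fitting, here is a minimal pure-Python sketch of a linear model over one numeric feature and one one-hot encoded categorical feature. It is not the estimator's implementation: the helper names (one_hot, predict_price) and the weights and bias are made-up illustrative values, not learned ones. Letting a type outside the vocabulary contribute nothing is an assumption, modelled on the vocabulary column's default handling of unknown values.

```python
# Vocabulary for the categorical "type" feature, as in step 2.
VOCABULARY = ["bungalow", "apartment"]


def one_hot(value, vocabulary):
    """Return a one-hot list; all zeros if value is not in the vocabulary."""
    return [1.0 if value == entry else 0.0 for entry in vocabulary]


def predict_price(area, house_type, w_area, w_type, bias):
    """Linear model: w_area * area + (weight of the matching type) + bias."""
    encoded = one_hot(house_type, VOCABULARY)
    type_term = sum(w * x for w, x in zip(w_type, encoded))
    return w_area * area + type_term + bias


# Hypothetical parameters chosen for illustration only (not trained values).
W_AREA = 0.5
W_TYPE = [50.0, 150.0]  # one weight per vocabulary entry
BIAS = 100.0

print(predict_price(1000, "bungalow", W_AREA, W_TYPE, BIAS))   # 650.0
print(predict_price(1000, "apartment", W_AREA, W_TYPE, BIAS))  # 750.0
# "house" is out of vocabulary, so only the area term and bias remain.
print(predict_price(1000, "house", W_AREA, W_TYPE, BIAS))      # 600.0
```

Under this assumption, an out-of-vocabulary type such as "house" or "apt" is priced from the area and bias alone, which is one reason to keep the training and prediction labels consistent with the declared vocabulary.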

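Returning to the feature column descriptions earlier in this section, the following standard-library sketch illustrates what bucketizing, one-hot (indicator) encoding, and crossing do to raw values. It deliberately avoids the TensorFlow API: the function names, the area boundaries, and the bucket count are assumptions chosen for illustration, and the CRC32 hash merely stands in for whatever hashing the real crossed_column applies.

```python
from bisect import bisect_right
import zlib


def bucketize(value, boundaries):
    """Index of the bucket for value; boundaries are sorted, left-inclusive."""
    return bisect_right(boundaries, value)


def one_hot(index, depth):
    """Encode a category index as a list of 0s with a single 1."""
    return [1 if i == index else 0 for i in range(depth)]


def cross(a, b, num_buckets):
    """Combine two values into one categorical id via a stand-in hash."""
    key = f"{a}_x_{b}".encode("utf-8")
    return zlib.crc32(key) % num_buckets


# Hypothetical boundaries: three buckets, <1500, [1500, 3000), >=3000.
AREA_BOUNDARIES = [1500, 3000]

bucket = bucketize(2000, AREA_BOUNDARIES)
print(bucket, one_hot(bucket, len(AREA_BOUNDARIES) + 1))  # 1 [0, 1, 0]

# Cross a (bucketized) latitude and longitude into a single feature id.
cell = cross(bucketize(12.9, [0, 30, 60]), bucketize(77.6, [0, 60, 120]), 10)
print(cell)
```

With only three buckets the one-hot form is compact, which is the situation the indicator_column description covers; when the number of categories grows very large, a learned embedding is the usual alternative.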
