End-to-End Machine Learning With TensorFlow On GCP
Advanced ML with TensorFlow on GCP
● End-to-End Lab on Structured Data ML
● Production ML Systems
● Image Classification Models
● Sequence Models
● Recommendation Systems
Learn how to...
● Explore large datasets for features
● Create training and evaluation datasets
● Build models with the Estimator API in TensorFlow
● Train at scale and deploy models into production with GCP ML tools
7 hands-on machine learning labs
Summary of ML labs
● Create training and evaluation datasets
Effective ML
Steps involved in doing ML on GCP
Cloud ML Engine provides tf.layers, tf.losses, and tf.metrics for building custom NN models.
To build effective ML, you need to train your model (inputs → model), tune hyperparameters, and deploy the same model to serve predictions. Then, scale it out to GCP using serverless technology (store / improve / hypertune).
The end-to-end machine learning set of labs
Type of network   # of network layers   # of weights   % of deployed models
MLP0              5                     20M            61%
MLP1              4                     5M
LSTM0             58                    52M            29%
LSTM1             56                    34M
CNN0              16                    8M             5%
CNN1              89                    100M
https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu
Our goal is to predict the weight of newborns so that all newborns can get the care they need.
https://bigquery.cloud.google.com/table/bigquery-public-data:samples.natality
The dataset includes details about the pregnancy:
● Date of birth
● Duration of pregnancy
Run a query from the BigQuery web UI (https://bigquery.cloud.google.com/): Run, Save/Share, Validate, and Export are available, along with query cost and performance analysis under "More options".
Demo: Query large datasets in seconds
#standardSQL
https://bigquery.cloud.google.com/savedquery/663413318684:781a98ddf2264505af2b6a8fc398a80e
Cloud Datalab notebooks are developed in an iterative, collaborative process:
Phase 1: Write code in Python
Phase 2: Run cell (Shift+Enter)
Phase 3: Examine output
Phase 4: Write commentary in markdown
Phase 5: Share and collaborate
You can develop locally with Cloud Datalab and then scale out data processing to the cloud: work on CSV files with Pandas DataFrames in Cloud Datalab, then move to Apache Beam and TensorFlow to improve the model serverlessly and hypertune it. The Datalab VM saves/reads a 10 GB persistent disk, and notebook files are backed up to a cloud repository.
Starting Cloud Datalab in Cloud Shell is simple
datalab create my-datalab-vm \
    --machine-type n1-highmem-8 \
    --zone us-central1-a
Preprocessing data at scale with BigQuery + Cloud Datalab
Use BigQuery from Python to get a Pandas DataFrame. Pandas + BigQuery in a notebook rocks!
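For instance, here is a minimal sketch of pulling a BigQuery result into a Pandas DataFrame from a notebook (it assumes the google-cloud-bigquery client library and application-default credentials; the query itself is illustrative):

from google.cloud import bigquery

client = bigquery.Client()             # uses application-default credentials
sql = """
SELECT weight_pounds, gestation_weeks, mother_age
FROM `bigquery-public-data.samples.natality`
LIMIT 1000
"""
df = client.query(sql).to_dataframe()  # query job result -> Pandas DataFrame
print(df.describe())                   # explore the sample with Pandas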
Lab
Explore a BigQuery dataset (publicdata.samples.natality) to find features to use in an ML model
2. Features must be known at prediction time.
Will we know all these things (e.g., today's transactions in the sales data) at prediction time?
Develop your TensorFlow code on a small subset of data, then scale it out to the cloud.
Solution: sample the split of the full dataset so that we have a small dataset to develop our code on.
#standardSQL
SELECT
  date,
  airline,
  departure_airport,
  departure_schedule,
  arrival_airport,
  arrival_delay
FROM
  `bigquery-samples.airline_ontime_data.flights`
WHERE
  -- FARM_FINGERPRINT(date) gives a repeatable 80% split on the date field;
  -- RAND() < 0.01 then samples roughly 1% of that split for development.
  MOD(ABS(FARM_FINGERPRINT(date)),10) < 8 AND RAND() < 0.01
Lab
Creating a sampled dataset
https://www.oreilly.com/learning/repeatable-sampling-of-data-sets-in-bigquery-for-machine-learning
The end-to-end process
In a TensorFlow graph, nodes represent mathematical operations and edges represent arrays of data. A tensor is an N-dimensional array of data.
Cloud ML Engine: tf.layers, tf.losses, tf.metrics for building custom NN models
1. Regression or classification?
Structure of an Estimator API ML model

import tensorflow as tf

# Define input feature columns
featcols = [
    tf.feature_column.numeric_column("sq_footage")]

# Instantiate a model mapping square footage -> price
# (a LinearRegressor, matching the regression example on the slide)
model = tf.estimator.LinearRegressor(featcols)

# Train
def train_input_fn():
    ...
    return features, labels
model.train(train_input_fn, steps=100)

# Predict
def pred_input_fn():
    ...
    return features
out = model.predict(pred_input_fn)
Encoding categorical data to supply to a DNN

1. Create a categorical column, either from a known vocabulary list:
tf.feature_column.categorical_column_with_vocabulary_list('zipcode',
    vocabulary_list=['83452', '72345', '87654', '98723', '23451'])
or, if the data is already indexed, by identity:
tf.feature_column.categorical_column_with_identity('stateId',
    num_buckets=50)

2. To pass a categorical column into a DNN, one option is to one-hot encode it (see the sketch below):
tf.feature_column.indicator_column(my_categorical_column)
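As a sketch of step 2 in practice (feature names and hidden-unit sizes are illustrative, not from the lab):

import tensorflow as tf

# Categorical column from a known vocabulary, one-hot encoded for the DNN
zipcode = tf.feature_column.categorical_column_with_vocabulary_list(
    'zipcode', vocabulary_list=['83452', '72345', '87654', '98723', '23451'])
feature_columns = [
    tf.feature_column.numeric_column('sq_footage'),
    tf.feature_column.indicator_column(zipcode),  # one-hot encoding
]
model = tf.estimator.DNNRegressor(
    feature_columns=feature_columns, hidden_units=[64, 32])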
To read CSV files, create a TextLineDataset, giving it a function to decode the CSV into features and labels

CSV_COLUMNS = ['sqfootage','city','amount']
LABEL_COLUMN = 'amount'
DEFAULTS = [[0.0], ['na'], [0.0]]

def decode_csv(value_column):
    columns = tf.decode_csv(value_column, record_defaults=DEFAULTS)  # parse one CSV line
    features = dict(zip(CSV_COLUMNS, columns))
    label = features.pop(LABEL_COLUMN)
    return features, label

dataset = tf.data.TextLineDataset(filename).map(decode_csv)
...
return ...
Shuffling is important for distributed training

dataset = tf.data.TextLineDataset(filename).map(decode_csv)
if mode == tf.estimator.ModeKeys.TRAIN:
    num_epochs = None  # loop indefinitely
    dataset = dataset.shuffle(buffer_size=10 * batch_size)
else:
    num_epochs = 1  # end-of-input after one pass
dataset = dataset.repeat(num_epochs).batch(batch_size)
return dataset.make_one_shot_iterator().get_next()
The Estimator API comes with a method that handles distributed training and evaluation

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

Pass in:
1. Estimator
2. TrainSpec
3. EvalSpec

It handles machine failures, creates checkpoint files, recovers from failures, and saves summaries for TensorBoard.
TrainSpec consists of the things that used to be passed into the train() method:

train_spec = tf.estimator.TrainSpec(
    input_fn=read_dataset('gs://.../train*',
                          mode=tf.estimator.ModeKeys.TRAIN),
    max_steps=num_train_steps)
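The matching EvalSpec is similar; here is a hedged sketch (the validation path, exporter wiring, and steps=None are assumptions consistent with the read_dataset function above):

exporter = tf.estimator.LatestExporter('exporter', serving_input_fn)
eval_spec = tf.estimator.EvalSpec(
    input_fn=read_dataset('gs://.../valid*',
                          mode=tf.estimator.ModeKeys.EVAL),
    steps=None,          # evaluate on the full evaluation set
    exporters=exporter)  # export a SavedModel for serving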
An example JSON record and the corresponding feature row:

{
  "transactionId": 42,
  "name": "Ice Cream",
  "price": 2.50,
  "tags": ["cold", "dessert"],
  "servedBy": {
    "employeeId": 72365,
    "waitTime": 1.4,
    "customerRating": 4
  },
  "storeLocation": {
    "latitude": 35.3,
    "longitude": -98.7
  }
},

price   8345   72345   87654   98723   wait
2.50    0      1       0       0       1.4
DNNs are good for dense, highly-correlated inputs such as pixel_values. By contrast, one-hot encoded rows like the following are sparse (a single 1 per row):

[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[ 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
Wide-and-deep models let you handle both:
● Wide: memorization, relevance
● Deep: generalization, diversity
Wide-and-deep network in Estimator API

model = tf.estimator.DNNLinearCombinedClassifier(
    model_dir=...,
    linear_feature_columns=wide_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 50])
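For context, a minimal sketch of what wide_columns and deep_columns could contain (feature names are illustrative placeholders, not the lab's actual features):

import tensorflow as tf

# Sparse, categorical features feed the wide (linear) part
dayofweek = tf.feature_column.categorical_column_with_identity(
    'dayofweek', num_buckets=7)
wide_columns = [dayofweek]

# Dense numeric features and learned embeddings feed the deep (DNN) part
deep_columns = [
    tf.feature_column.numeric_column('distance'),
    tf.feature_column.embedding_column(dayofweek, dimension=3),
]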
The end-to-end process: BigQuery → Cloud Dataflow → Cloud Storage
Open-source API, Google infrastructure: the open-source API (Apache Beam) can also be executed on Flink, Spark, etc. Parallel tasks are autoscaled by the execution framework.

p = beam.Pipeline()
(p
   | beam.io.ReadFromText('gs://..')   # Read
   | beam.Map(Transform)               # Transform
   | beam.GroupByKey()                 # Group
   | beam.FlatMap(Filter)              # Filter
)
The same model works for streaming: read from Cloud Pub/Sub, window, transform in Cloud Dataflow, and write to BigQuery or Cloud Storage.

p = beam.Pipeline()
(p
   | beam.io.ReadStringsFromPubSub('project/topic')
   | beam.WindowInto(SlidingWindows(60))
   | beam.Map(Transform)
   | beam.GroupByKey()
   | beam.FlatMap(Filter)
   | beam.io.WriteToBigQuery(table)
)
p.run()
An example Beam pipeline for BigQuery->CSV on cloud

import sys
import apache_beam as beam

def transform(rowdict):
    import copy
    result = copy.deepcopy(rowdict)
    if rowdict['a'] > 0:
        result['c'] = result['a'] * result['b']
    yield ','.join([str(result[k]) if k in result else 'None'
                    for k in ['a', 'b', 'c']])

if __name__ == '__main__':
    p = beam.Pipeline(argv=sys.argv)
    selquery = 'SELECT a,b FROM someds.sometable'
    (p
     | beam.io.Read(beam.io.BigQuerySource(query=selquery,
                                           use_standard_sql=True))  # read input
     | beam.FlatMap(transform)          # do some processing (FlatMap: transform yields)
     | beam.io.WriteToText('gs://...')  # write output
    )
    p.run()  # run the pipeline
Executing pipeline (Python)
python ./etl.py
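Running locally like this uses Beam's default DirectRunner; to execute the same pipeline on Cloud Dataflow you would pass pipeline options such as these (a sketch; project, bucket, and job names are placeholders):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=DataflowRunner',             # execute on Cloud Dataflow
    '--project=my-project',                # placeholder project id
    '--temp_location=gs://my-bucket/tmp',  # placeholder staging bucket
    '--job_name=etl-job',                  # placeholder job name
])
p = beam.Pipeline(options=options)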
The model.py contains the ML model in TensorFlow (Estimator API); task.py parses the command-line parameters and passes them along (see Lab #3).

task.py:
parser.add_argument('--train_data_paths', required=True)
parser.add_argument('--train_steps', ...)

model.py:
def train_and_evaluate(args):
    estimator = tf.estimator.DNNRegressor(
        model_dir=args['output_dir'],
        feature_columns=feature_cols,
        hidden_units=args['hidden_units'])
    train_spec = tf.estimator.TrainSpec(
        input_fn=read_dataset(args['train_data_paths'],
                              batch_size=args['train_batch_size'],
                              mode=tf.estimator.ModeKeys.TRAIN),
        max_steps=args['train_steps'])
    exporter = tf.estimator.LatestExporter('exporter', serving_input_fn)
    eval_spec = tf.estimator.EvalSpec(...)
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
taxifare/
taxifare/PKG-INFO
taxifare/setup.cfg
taxifare/setup.py
taxifare/trainer/
taxifare/trainer/__init__.py
taxifare/trainer/task.py
taxifare/trainer/model.py

Python packages need to contain an __init__.py in every folder.
Verify that the model works as a Python package

export PYTHONPATH=${PYTHONPATH}:/somedir/babyweight
python -m trainer.task \
    --train_data_paths="/somedir/datasets/*train*" \
    --eval_data_paths=/somedir/datasets/*valid* \
    --output_dir=/somedir/output \
    --train_steps=100 --job-dir=/tmp
You use distributed TensorFlow on Cloud ML Engine: tf.estimator is the high-level API for distributed training, and Cloud ML Engine runs TF at scale while you focus on building custom NN models.
The end-to-end set of labs, working from a Cloud Datalab notebook against the Natality dataset in BigQuery:
#1 Explore and visualize a dataset
#2 Create a sampled dataset
#3 Develop a BQML model
#4 Create training and evaluation datasets (export the data)
#5 Execute training
Lab
Predicting baby weight with BigQuery ML
(The model is created with a CREATE MODEL … AS statement, with separate SQL queries for the training and eval data, as sketched below.)
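A hedged sketch of what that statement could look like, executed from Python (the dataset name, model name, and feature list are illustrative, not the lab's exact query):

from google.cloud import bigquery

client = bigquery.Client()
sql = """
CREATE OR REPLACE MODEL demo.babyweight_model   -- illustrative dataset.model name
OPTIONS (model_type='linear_reg',
         input_label_cols=['weight_pounds']) AS
-- SQL query with training data
SELECT weight_pounds, is_male, gestation_weeks, mother_age
FROM `bigquery-public-data.samples.natality`
WHERE weight_pounds IS NOT NULL
"""
client.query(sql).result()  # blocks until the model finishes training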
Deploy the model for serving: client requests pass through preprocessing (the serving input function) to the model.
You can't reuse the training input function for serving: the training input function returns both features and labels, while the serving input function receives only features.
1. The serving_input_fn specifies what the caller of the predict() method must provide:

def serving_input_fn():
    feature_placeholders = {
        'pickuplon': tf.placeholder(tf.float32, [None]),
        'pickuplat': tf.placeholder(tf.float32, [None]),
        'dropofflat': tf.placeholder(tf.float32, [None]),
        'dropofflon': tf.placeholder(tf.float32, [None]),
        'passengers': tf.placeholder(tf.float32, [None]),
    }
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features,
                                                    feature_placeholders)
2. Deploy a trained model to GCP

MODEL_NAME="taxifare"
MODEL_VERSION="v1"
MODEL_LOCATION="gs://${BUCKET}/taxifare/smallinput/taxi_trained/export/exporter/.../"
3. Client code can then call the prediction REST API:

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery

credentials = GoogleCredentials.get_application_default()
api = discovery.build('ml', 'v1', credentials=credentials,
    discoveryServiceUrl='https://storage.googleapis.com/cloud-ml/discovery/ml_v1beta1_discovery.json')

request_data = [
    {'pickup_longitude': -73.885262,
     'pickup_latitude': 40.773008,
     'dropoff_longitude': -73.987232,
     'dropoff_latitude': 40.732403,
     'passenger_count': 2}]

parent = 'projects/%s/models/%s/versions/%s' % (
    'cloud-training-demos', 'taxifare', 'v1')
response = api.projects().predict(body={'instances': request_data},
                                  name=parent).execute()
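The response is a plain Python dict; for the ML Engine v1 predict API, the results come back under a 'predictions' key:

print(response['predictions'])  # one entry per instance in request_data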
Lab
Deploying and predicting with Cloud ML Engine
(An App Engine application invokes predict() on the Cloud ML Engine model.)
You can also invoke the ML service from Cloud Dataflow and save predictions to BigQuery: events, metrics, etc. stream in through Cloud Pub/Sub, while raw logs, files, assets, Google Analytics data, etc. arrive via Cloud Storage for batch prediction. Cloud Dataflow processes both, invokes the Cloud ML Engine model, saves predictions to BigQuery, and the stored results can be used to retrain the model.
Summary: An end-to-end process to operationalize ML models

#1 Explore and visualize a dataset in Cloud Datalab
#2 Create a sampled dataset
#3 Develop a model utilizing TensorFlow techniques
#4 Preprocess and create .csv files in Cloud Dataflow for the training and evaluation datasets
#5 Execute training in the cloud
#6 Deploy the model
#7 Invoke ML predictions (deploy a Flask application using Python and App Engine)
Machine learning on Google Cloud Platform:
1. How Google does ML
2. Creating ML Datasets ● Introduction to TensorFlow
3. Feature Engineering ● The Art and Science of ML
4. End-to-End Lab on Structured ML Data ● Production ML Systems
5. Image Classification Models ● Sequence Models ● Recommendation Systems