End-to-End Machine Learning With TensorFlow On GCP
Advanced ML with TensorFlow on GCP
● End-to-End Lab on Structured Data ML
● Production ML Systems
● Image Classification Models
● Sequence Models
● Recommendation Systems
Learn how to...
● Explore large datasets for features
● Create training and evaluation datasets
● Build models with the Estimator API in TensorFlow
● Train at scale and deploy models into production with GCP ML tools
7 hands-on machine learning labs
Summary of ML labs
● Create training and evaluation datasets
Effective ML
Steps involved in doing ML on GCP
Cloud ML Engine provides tf.layers, tf.losses, and tf.metrics for building custom NN models.
To build effective ML, you need to train your model (inputs → model), tune hyperparameters, and deploy the same model to serve predictions. Then, scale it out to GCP using serverless technology (store / improve / hypertune).
The end-to-end machine learning set of labs
Type of network   # of network layers   # of weights   % of deployed models
MLP0              5                     20M            61%
MLP1              4                     5M
LSTM0             58                    52M            29%
LSTM1             56                    34M
CNN0              16                    8M             5%
CNN1              89                    100M
https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu
Our goal is to predict the weight of newborns so that all newborns can get the care they need.
https://bigquery.cloud.google.com/table/bigquery-public-data:samples.natality
The dataset includes details about the pregnancy:
● Date of birth
● Duration of pregnancy
Run a query from the BigQuery web UI (https://bigquery.cloud.google.com/): Run, Save/Share, Validate, and Export are available, along with query cost and performance analysis under "More options".
Demo: Query large datasets in seconds
#standardSQL
https://bigquery.cloud.google.com/savedquery/663413318684:781a98ddf2264505af2b6a8fc398a80e
Cloud Datalab notebooks are developed in an iterative, collaborative process:
Phase 1: Write code in Python
Phase 2: Run cell (Shift+Enter)
Phase 3: Examine output
Phase 4: Write commentary in markdown
Phase 5: Share and collaborate
You can develop locally with Cloud Datalab and then scale out data processing to the cloud: work on CSV files with Pandas DataFrames in Cloud Datalab, then move to Apache Beam and TensorFlow to improve the model serverlessly and hypertune it. The Datalab VM saves/reads a 10 GB persistent disk, and notebook files are backed up to a cloud repository.
Starting Cloud Datalab in Cloud Shell is simple
datalab create my-datalab-vm \
    --machine-type n1-highmem-8 \
    --zone us-central1-a
Preprocessing data at scale with BigQuery + Cloud Datalab
Use BigQuery from Python to get a Pandas DataFrame. Pandas + BigQuery in a notebook rocks!
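For instance, here is a minimal sketch of pulling a BigQuery result into a Pandas DataFrame from a notebook (it assumes the google-cloud-bigquery client library and application-default credentials; the query itself is illustrative):

from google.cloud import bigquery

client = bigquery.Client()             # uses application-default credentials
sql = """
SELECT weight_pounds, gestation_weeks, mother_age
FROM `bigquery-public-data.samples.natality`
LIMIT 1000
"""
df = client.query(sql).to_dataframe()  # query job result -> Pandas DataFrame
print(df.describe())                   # explore the sample with Pandas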
Lab
Explore a BigQuery dataset (publicdata.samples.natality) to find features to use in an ML model
2. Features must be known at prediction time.
Will we know all these things (e.g., today's transactions in the sales data) at prediction time?
Develop your TensorFlow code on a small subset of data, then scale it out to the cloud.
Solution: sample the split of the full dataset so that we have a small dataset to develop our code on.
#standardSQL
SELECT
  date,
  airline,
  departure_airport,
  departure_schedule,
  arrival_airport,
  arrival_delay
FROM
  `bigquery-samples.airline_ontime_data.flights`
WHERE
  -- FARM_FINGERPRINT(date) gives a repeatable 80% split on the date field;
  -- RAND() < 0.01 then samples roughly 1% of that split for development.
  MOD(ABS(FARM_FINGERPRINT(date)),10) < 8 AND RAND() < 0.01
Lab
Creating a sampled dataset
https://www.oreilly.com/learning/repeatable-sampling-of-data-sets-in-bigquery-for-machine-learning
The end-to-end process
In a TensorFlow graph, nodes represent mathematical operations and edges represent arrays of data. A tensor is an N-dimensional array of data.
Cloud ML Engine: tf.layers, tf.losses, tf.metrics for building custom NN models
1. Regression or classification?
Structure of an Estimator API ML model

import tensorflow as tf

# Define input feature columns
featcols = [
    tf.feature_column.numeric_column("sq_footage")]

# Instantiate a model mapping square footage -> price
# (a LinearRegressor, matching the regression example on the slide)
model = tf.estimator.LinearRegressor(featcols)

# Train
def train_input_fn():
    ...
    return features, labels
model.train(train_input_fn, steps=100)

# Predict
def pred_input_fn():
    ...
    return features
out = model.predict(pred_input_fn)
Encoding categorical data to supply to a DNN

1. Create a categorical column, either from a known vocabulary list:
tf.feature_column.categorical_column_with_vocabulary_list('zipcode',
    vocabulary_list=['83452', '72345', '87654', '98723', '23451'])
or, if the data is already indexed, by identity:
tf.feature_column.categorical_column_with_identity('stateId',
    num_buckets=50)

2. To pass a categorical column into a DNN, one option is to one-hot encode it (see the sketch below):
tf.feature_column.indicator_column(my_categorical_column)
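As a sketch of step 2 in practice (feature names and hidden-unit sizes are illustrative, not from the lab):

import tensorflow as tf

# Categorical column from a known vocabulary, one-hot encoded for the DNN
zipcode = tf.feature_column.categorical_column_with_vocabulary_list(
    'zipcode', vocabulary_list=['83452', '72345', '87654', '98723', '23451'])
feature_columns = [
    tf.feature_column.numeric_column('sq_footage'),
    tf.feature_column.indicator_column(zipcode),  # one-hot encoding
]
model = tf.estimator.DNNRegressor(
    feature_columns=feature_columns, hidden_units=[64, 32])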
To read CSV files, create a TextLineDataset, giving it a function to decode the CSV into features and labels

CSV_COLUMNS = ['sqfootage','city','amount']
LABEL_COLUMN = 'amount'
DEFAULTS = [[0.0], ['na'], [0.0]]

def decode_csv(value_column):
    columns = tf.decode_csv(value_column, record_defaults=DEFAULTS)  # parse one CSV line
    features = dict(zip(CSV_COLUMNS, columns))
    label = features.pop(LABEL_COLUMN)
    return features, label

dataset = tf.data.TextLineDataset(filename).map(decode_csv)
...
return ...
Shuffling is important for distributed training

dataset = tf.data.TextLineDataset(filename).map(decode_csv)
if mode == tf.estimator.ModeKeys.TRAIN:
    num_epochs = None  # loop indefinitely
    dataset = dataset.shuffle(buffer_size=10 * batch_size)
else:
    num_epochs = 1  # end-of-input after one pass
dataset = dataset.repeat(num_epochs).batch(batch_size)
return dataset.make_one_shot_iterator().get_next()
The Estimator API comes with a method that handles distributed training and evaluation

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

Pass in:
1. Estimator
2. TrainSpec
3. EvalSpec

It handles machine failures, creates checkpoint files, recovers from failures, and saves summaries for TensorBoard.
TrainSpec consists of the things that used to be passed into the train() method:

train_spec = tf.estimator.TrainSpec(
    input_fn=read_dataset('gs://.../train*',
                          mode=tf.estimator.ModeKeys.TRAIN),
    max_steps=num_train_steps)
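The matching EvalSpec is similar; here is a hedged sketch (the validation path, exporter wiring, and steps=None are assumptions consistent with the read_dataset function above):

exporter = tf.estimator.LatestExporter('exporter', serving_input_fn)
eval_spec = tf.estimator.EvalSpec(
    input_fn=read_dataset('gs://.../valid*',
                          mode=tf.estimator.ModeKeys.EVAL),
    steps=None,          # evaluate on the full evaluation set
    exporters=exporter)  # export a SavedModel for serving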
An example JSON record and the corresponding feature row:

{
  "transactionId": 42,
  "name": "Ice Cream",
  "price": 2.50,
  "tags": ["cold", "dessert"],
  "servedBy": {
    "employeeId": 72365,
    "waitTime": 1.4,
    "customerRating": 4
  },
  "storeLocation": {
    "latitude": 35.3,
    "longitude": -98.7
  }
},

price   8345   72345   87654   98723   wait
2.50    0      1       0       0       1.4
DNNs are good for dense, highly-correlated inputs such as pixel_values. By contrast, one-hot encoded rows like the following are sparse (a single 1 per row):

[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[ 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
Wide-and-deep models let you handle both:
● Wide: memorization, relevance
● Deep: generalization, diversity
Wide-and-deep network in Estimator API

model = tf.estimator.DNNLinearCombinedClassifier(
    model_dir=...,
    linear_feature_columns=wide_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 50])
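For context, a minimal sketch of what wide_columns and deep_columns could contain (feature names are illustrative placeholders, not the lab's actual features):

import tensorflow as tf

# Sparse, categorical features feed the wide (linear) part
dayofweek = tf.feature_column.categorical_column_with_identity(
    'dayofweek', num_buckets=7)
wide_columns = [dayofweek]

# Dense numeric features and learned embeddings feed the deep (DNN) part
deep_columns = [
    tf.feature_column.numeric_column('distance'),
    tf.feature_column.embedding_column(dayofweek, dimension=3),
]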
The end-to-end process: BigQuery → Cloud Dataflow → Cloud Storage
Open-source API, Google infrastructure: the open-source API (Apache Beam) can also be executed on Flink, Spark, etc. Parallel tasks are autoscaled by the execution framework.

p = beam.Pipeline()
(p
   | beam.io.ReadFromText('gs://..')   # Read
   | beam.Map(Transform)               # Transform
   | beam.GroupByKey()                 # Group
   | beam.FlatMap(Filter)              # Filter
)
The same model works for streaming: read from Cloud Pub/Sub, window, transform in Cloud Dataflow, and write to BigQuery or Cloud Storage.

p = beam.Pipeline()
(p
   | beam.io.ReadStringsFromPubSub('project/topic')
   | beam.WindowInto(SlidingWindows(60))
   | beam.Map(Transform)
   | beam.GroupByKey()
   | beam.FlatMap(Filter)
   | beam.io.WriteToBigQuery(table)
)
p.run()
An example Beam pipeline for BigQuery->CSV on cloud

import sys
import apache_beam as beam

def transform(rowdict):
    import copy
    result = copy.deepcopy(rowdict)
    if rowdict['a'] > 0:
        result['c'] = result['a'] * result['b']
    yield ','.join([str(result[k]) if k in result else 'None'
                    for k in ['a', 'b', 'c']])

if __name__ == '__main__':
    p = beam.Pipeline(argv=sys.argv)
    selquery = 'SELECT a,b FROM someds.sometable'
    (p
     | beam.io.Read(beam.io.BigQuerySource(query=selquery,
                                           use_standard_sql=True))  # read input
     | beam.FlatMap(transform)          # do some processing (FlatMap: transform yields)
     | beam.io.WriteToText('gs://...')  # write output
    )
    p.run()  # run the pipeline
Executing pipeline (Python)
python ./etl.py
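Running locally like this uses Beam's default DirectRunner; to execute the same pipeline on Cloud Dataflow you would pass pipeline options such as these (a sketch; project, bucket, and job names are placeholders):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=DataflowRunner',             # execute on Cloud Dataflow
    '--project=my-project',                # placeholder project id
    '--temp_location=gs://my-bucket/tmp',  # placeholder staging bucket
    '--job_name=etl-job',                  # placeholder job name
])
p = beam.Pipeline(options=options)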
The model.py contains the ML model in TensorFlow (Estimator API); task.py parses the command-line parameters and passes them along (see Lab #3).

task.py:
parser.add_argument('--train_data_paths', required=True)
parser.add_argument('--train_steps', ...)

model.py:
def train_and_evaluate(args):
    estimator = tf.estimator.DNNRegressor(
        model_dir=args['output_dir'],
        feature_columns=feature_cols,
        hidden_units=args['hidden_units'])
    train_spec = tf.estimator.TrainSpec(
        input_fn=read_dataset(args['train_data_paths'],
                              batch_size=args['train_batch_size'],
                              mode=tf.estimator.ModeKeys.TRAIN),
        max_steps=args['train_steps'])
    exporter = tf.estimator.LatestExporter('exporter', serving_input_fn)
    eval_spec = tf.estimator.EvalSpec(...)
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
taxifare/
taxifare/PKG-INFO
taxifare/setup.cfg
taxifare/setup.py
taxifare/trainer/
taxifare/trainer/__init__.py
taxifare/trainer/task.py
taxifare/trainer/model.py

Python packages need to contain an __init__.py in every folder.
Verify that the model works as a Python package

export PYTHONPATH=${PYTHONPATH}:/somedir/babyweight
python -m trainer.task \
    --train_data_paths="/somedir/datasets/*train*" \
    --eval_data_paths=/somedir/datasets/*valid* \
    --output_dir=/somedir/output \
    --train_steps=100 --job-dir=/tmp
You use distributed TensorFlow on Cloud ML Engine: tf.estimator is the high-level API for distributed training, and Cloud ML Engine runs TF at scale while you focus on building custom NN models.
The end-to-end set of labs, working from a Cloud Datalab notebook against the Natality dataset in BigQuery:
#1 Explore and visualize a dataset
#2 Create a sampled dataset
#3 Develop a BQML model
#4 Create training and evaluation datasets (export the data)
#5 Execute training
Lab
Predicting baby weight with BigQuery ML
(The model is created with a CREATE MODEL … AS statement, with separate SQL queries for the training and eval data, as sketched below.)
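A hedged sketch of what that statement could look like, executed from Python (the dataset name, model name, and feature list are illustrative, not the lab's exact query):

from google.cloud import bigquery

client = bigquery.Client()
sql = """
CREATE OR REPLACE MODEL demo.babyweight_model   -- illustrative dataset.model name
OPTIONS (model_type='linear_reg',
         input_label_cols=['weight_pounds']) AS
-- SQL query with training data
SELECT weight_pounds, is_male, gestation_weeks, mother_age
FROM `bigquery-public-data.samples.natality`
WHERE weight_pounds IS NOT NULL
"""
client.query(sql).result()  # blocks until the model finishes training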
Deploy the model for serving: client requests pass through preprocessing (the serving input function) to the model.
You can't reuse the training input function for serving: the training input function returns both features and labels, while the serving input function receives only features.
1. The serving_input_fn specifies what the caller of the predict() method must provide:

def serving_input_fn():
    feature_placeholders = {
        'pickuplon': tf.placeholder(tf.float32, [None]),
        'pickuplat': tf.placeholder(tf.float32, [None]),
        'dropofflat': tf.placeholder(tf.float32, [None]),
        'dropofflon': tf.placeholder(tf.float32, [None]),
        'passengers': tf.placeholder(tf.float32, [None]),
    }
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features,
                                                    feature_placeholders)
2. Deploy a trained model to GCP

MODEL_NAME="taxifare"
MODEL_VERSION="v1"
MODEL_LOCATION="gs://${BUCKET}/taxifare/smallinput/taxi_trained/export/exporter/.../"
3. Client code can then call the prediction REST API:

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery

credentials = GoogleCredentials.get_application_default()
api = discovery.build('ml', 'v1', credentials=credentials,
    discoveryServiceUrl='https://storage.googleapis.com/cloud-ml/discovery/ml_v1beta1_discovery.json')

request_data = [
    {'pickup_longitude': -73.885262,
     'pickup_latitude': 40.773008,
     'dropoff_longitude': -73.987232,
     'dropoff_latitude': 40.732403,
     'passenger_count': 2}]

parent = 'projects/%s/models/%s/versions/%s' % (
    'cloud-training-demos', 'taxifare', 'v1')
response = api.projects().predict(body={'instances': request_data},
                                  name=parent).execute()
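The response is a plain Python dict; for the ML Engine v1 predict API, the results come back under a 'predictions' key:

print(response['predictions'])  # one entry per instance in request_data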
Lab
Deploying and predicting with Cloud ML Engine
(An App Engine application invokes predict() on the Cloud ML Engine model.)
You can also invoke the ML service from Cloud Dataflow and save predictions to BigQuery: events, metrics, etc. stream in through Cloud Pub/Sub, while raw logs, files, assets, Google Analytics data, etc. arrive via Cloud Storage for batch prediction. Cloud Dataflow processes both, invokes the Cloud ML Engine model, saves predictions to BigQuery, and the stored results can be used to retrain the model.
Summary: An end-to-end process to operationalize ML models

#1 Explore and visualize a dataset in Cloud Datalab
#2 Create a sampled dataset
#3 Develop a model utilizing TensorFlow techniques
#4 Preprocess and create .csv files in Cloud Dataflow for the training and evaluation datasets
#5 Execute training in the cloud
#6 Deploy the model
#7 Invoke ML predictions (deploy a Flask application using Python and App Engine)
Machine learning on Google Cloud Platform:
1. How Google does ML
2. Creating ML Datasets ● Introduction to TensorFlow
3. Feature Engineering ● The Art and Science of ML
4. End-to-End Lab on Structured ML Data ● Production ML Systems
5. Image Classification Models ● Sequence Models ● Recommendation Systems