Professional Documents
Culture Documents
Deploying ML PDF
Deploying ML PDF
Deploying ML PDF
1
Who is Svetlana Levitan?
Originally from Moscow, Russia
2
1500 Drones Are Ready To Fly
IBM Developer Drone Challenge
Challenge runs from November 12 through December 16 and those 18 years of
age or older residing in the US and Canada are eligible to enter
Enter at https://developer.ibm.com/contest
Once per week during the 5 weeks of the contest a random drawing
Will be held to determine the winners (watch for the drawing on Twitch)
Winners will also receive a special code to unlock more challenges and given the
opportunity to contribute back to the challenge
#IBMDroneDrop
3
Agenda
• Why Machine Learning?
• Some History
• PMML Internals
• PMML in Python and R
• PMML in IBM products
• PFA
• Other tools
4
Why is machine learning (ML) so popular now?
Data Science = ML +
Business Domain Knowledge
ML is everywhere!
Weather forecast
www.cubicsol.com/
machine-learning-
algorithms/
7
Some ML models
Neural network (Multi-Layer perceptron)
Clustering
Linear regression
Logistic regression
Deep learning
Decision tree
(C) 2018 IBM Corp 8
The Iris dataset
Monitor and
Deploy the
update as Build a Model
model
needed
dmg.org
13
What is PMML?
Transformations
First Six Naïve Bayes,
models Sequence
15
Main Components of PMML
Header
Data Dictionary
Transformation Dictionary
Model(s)
© 2018 IBM Corporation
PMML under the hood
• Application name and version
Header • Timestamp and copyright
• Mining Schema
Model(s)
• Specific model contents
17
Transformations
• NormContinuous: piece-wise linear transform
• Discretize: binning
• MapValues: map one or more categorical fields into another categorical one
• Functions: built-in and user-defined
• Other transformations
PMML Models
o Association Rules Model o Baseline Model
• RegressionTable
Regression • NumericPredictor, Categorical Predictor
22
Example PMML – Neural Network MiningSchema and inputs
Predictors
23
Example PMML - Neural Network hidden layer and outputs
Hidden layer neuron
Output
Layer
Neurons
Connecting
target to the
neurons
24
Example PMML for a Tree Model
<Node id=“0"> <True/>
<Node id=“1" score="Iris-setosa" recordCount="50.0">
<SimplePredicate field="petal_length" operator="lessOrEqual“
value=“2.6"/>
<ScoreDistribution value="Iris-setosa" recordCount="50.0"/>
<ScoreDistribution value="Iris-versicolor" recordCount="0.0"/>
<ScoreDistribution value="Iris-virginica" recordCount="0.0"/>
</Node>
<Node id=“2">
<SimplePredicate field="petal_length" operator="greaterThan“
value=“2.6"/>
<Node id=“3“score="Iris-versicolor" recordCount=“40.0">
<SimplePredicate field="petal_length"
operator="lessOrEqual" value=“4.75"/>
Mining Model
Predicate, weight
Segment
MODEL
Predicate, weight
Segmentation
Segment
MODEL
28
PMML in Python
JPMML package is created and maintained by Villu Ruusmann.
From https://stackoverflow.com/questions/33221331/export-python-scikit-learn-models-into-pmml
pip install git+https://github.com/jpmml/sklearn2pmml.git
Example of how to export a classifier tree to PMML. First grow the tree:
# example tree & viz from http://scikit-learn.org/stable/modules/tree.html
from sklearn import datasets, tree
iris = datasets.load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
SkLearn2PMML conversion takes 2 arguments: an estimator (our clf) and a mapper for preprocessing.
Our mapper is pretty basic, since no transformations.
>library(XML);
>library(pmml);
> data(iris);
Build and save a decision tree (C&RT) model predicting Species class:
> irisTree <- rpart( Species~., iris )
> saveXML( pmml( irisTree ), "IrisTree.xml" )
Agenda
• Why Machine Learning?
• Some History
• PMML Internals
• PMML in Python and R
• PMML in IBM products
• PFA
• Other tools
32
IBM SPSS Statistics
1968
Statistical Package for Social Sciences
Integration with
Python and R
33
IBM SPSS Modeler
34
IBM SPSS Statistics IBM SPSS Modeler
Transformation PMML from: Apriori, CARMA, Association Rules
ADP (Automatic Data Preparation) C5, CART, Chaid decision trees
TMS Begin/TMS End Cox regression
GENLIN
Model PMML from: Decision List
COXREG, CSCOXREG K-Means Cluster
CSGLM, CSLOGISTIC, CSORDINAL
GENLIN, Logistic regression, NOMREG KNN
GENLINMIXED LINEAR, Regression
LINEAR, KNN Logistic Regression
MLP, RBF neural networks MLP and RBF
NAÏVE BAYES NOMREG
REGRESSION Random Trees
TREE, TSMODEL Regression
TWOSTEP CLUSTER Two Step Cluster
35
Requesting PMML export in a decision tree analysis
and Bayesian regression in IBM SPSS Statistics
Score PMML in IBM SPSS
Statistics
Utilities->Scoring Wizard
Watson Studio (formerly Data Science Experience)
PMML export
enabled in Jupyter
notebooks, also
possible in R
Studio.
PMML scoring can
be done in Flows,
notebooks,
Watson Machine
Learning
38
(C) 2018 IBM Corp
Using SPSS Two Step Cluster model in Python notebook
clusterModel = cluster.fit(df_data_1)
cePMML = clusterModel.toPMML()
39
Watson Studio Flows
TODO
40
Scoring PMML in Watson Machine Learning
41
Benefits of PMML
43
Portable Format for Analytics - PFA
Info: dmg.org/pfa
Jim Pivarski
44
PFA details
• Type and function system means PFA can be fully & statically
verified on load and run by any compliant execution engine
45
A Simple Example of PFA (copied from Nick Pentreath’s presentation)
46
(C) 2018 IBM Corp 46
Managing State in PFA (copied from Nick Pentreath’s presentation)
47
(C) 2018 IBM Corp 47
Known Support for PFA
49
ONNX use pattern
Tools
Netron visualizer
Net Drawer visualizer
Checker
Shape Inferencer
Graph Optimizer
Opset Version Converter
Inference
Training
50
Docker containers: https://docs.docker.com/engine/docker-overview/
Isolated and secure environments
51
52
Conclusions
53
Questions?
54
Connect with me:
LinkedIn: https://www.linkedin.com/in/svetlanalevitan/
Twitter: @SvetaLevitan Email: slevitan@us.ibm.com
Useful links:
56