PPA-Building Prediction Model ML

Uploaded by

Daffa Ammarul


Classification is usually used to predict labels (categories).

Regression is usually used to predict numerical values.
Groups items using a hierarchical clustering algorithm.
Groups items using the k-Means clustering algorithm.
Building Basic Models of Learnings
This is where we load our data table into Orange and define the data type of each feature. In this case we are loading an Excel file (.xlsx) with 94 features (95 columns if we include the target variable ‘ROCK’). As mentioned previously, feature data can be numeric, categorical, text or time series. To use the widget, double-click it, navigate to the file and open it, then define the data type of each feature (column). In this case ROCK is categorical and the remaining features are numeric. Under the ‘Role’ column, ROCK must also be set as the target variable, because it is what we are trying to classify.
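Outside Orange, the same idea of assigning roles can be sketched in a few lines of plain Python. This is only an illustration, not what the File widget does internally: the three sample rows, the rock names and the helper `split_roles` are made up for the example, and only two of the 94 features are shown.

```python
# Hypothetical miniature of the geochemistry table: one dict per sample.
# The values and rock names are invented for illustration only.
rows = [
    {"ROCK": "basalt", "SiO2": 49.5, "MgO": 7.8},
    {"ROCK": "granite", "SiO2": 72.1, "MgO": 0.6},
    {"ROCK": "basalt", "SiO2": 50.2, "MgO": 8.1},
]

TARGET = "ROCK"  # the categorical column we want to classify


def split_roles(rows, target):
    """Separate the table into numeric features (X) and target labels (y)."""
    feature_names = [name for name in rows[0] if name != target]
    X = [[float(row[name]) for name in feature_names] for row in rows]
    y = [str(row[target]) for row in rows]
    return feature_names, X, y


feature_names, X, y = split_roles(rows, TARGET)
```

Every column except ROCK is treated as a numeric feature, and ROCK is kept aside as the label the models will try to predict.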
Rank scores the relationship between each feature and the target variable and tells us how well they correlate. As geologists we know that the biggest differentiating features between mafic and felsic rocks are the magnesium and silica content. Rank tells us that MgO varies the most between the 4 lithologies, followed by SiO2, Al2O3 and so on. Here we can decide which features go into our model. There is a Goldilocks zone for how many features to include in an optimum model, and it is determined case by case: you don’t want too few and you don’t want too many. Thankfully, in Orange it is easy to choose how many pass through the workflow: just highlight them. In this case I’ve selected the top 10 features.
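To make the ranking idea concrete, here is a minimal pure-Python sketch that scores each feature with a one-way ANOVA F-statistic (between-class variance divided by within-class variance) and keeps the top k. ANOVA is one of several scoring methods Rank offers; the slides do not say which one was used, so treat the choice as an assumption. The function names and the tiny data set are made up for illustration.

```python
from collections import defaultdict


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a numeric feature against class labels."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}
    ss_between = sum(len(g) * (means[label] - grand_mean) ** 2
                     for label, g in groups.items())
    ss_within = sum((v - means[label]) ** 2
                    for label, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf")  # classes are perfectly separated by this feature
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_features(columns, labels, k):
    """Return the names of the k highest-scoring features, best first."""
    scores = {name: anova_f(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented values: two mafic and two felsic samples.
labels = ["mafic", "mafic", "felsic", "felsic"]
columns = {
    "MgO": [8.0, 7.5, 0.5, 0.8],
    "SiO2": [50.0, 49.0, 72.0, 74.0],
    "Noise": [1.0, 2.0, 1.5, 1.6],
}
best = top_features(columns, labels, k=2)
```

A feature whose values differ strongly between rock types and only slightly within each type gets a high score, which is why MgO and SiO2 rank above the uninformative column here.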
Train and Test Data
THE JUICY BITS
These 5 pink widgets are Machine Learning algorithms, each with its own way of mathematically classifying/predicting our target variable from the geochem data. In the beer example we used a single regression algorithm to create our model; in Orange we can run many different algorithms at once and compare them. There is an ever-growing list of ML algorithms, but in this workflow I have used k-Nearest-Neighbour (kNN), Support Vector Machines (SVM), Naive Bayes, Random Forest and Adaptive Boosting (AdaBoost). Each algorithm has parameters that can be tweaked in an attempt to increase model accuracy. Machine Learning algorithms can be quite (definitely are) mathematically intense and difficult to break down into simple terms. Many documents online outline what each specific algorithm does, and the Orange documentation is usually pretty good. This is a good summary for selecting the right algorithm too.
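Of the five algorithms, kNN is the easiest to write down, so it makes a reasonable illustration of what "classifying from the chemistry" means. The sketch below is a bare-bones version in plain Python, not Orange's implementation; the function name, the default `k=3` and the sample values (MgO, SiO2 pairs) are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query sample by majority vote of its k nearest training samples."""
    # Euclidean distance from the query to every training sample.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Invented training samples: [MgO, SiO2] for each rock.
train_X = [[8.0, 50.0], [7.5, 49.0], [7.9, 51.0],
           [0.5, 72.0], [0.8, 74.0], [0.6, 73.0]]
train_y = ["mafic", "mafic", "mafic", "felsic", "felsic", "felsic"]

prediction_a = knn_predict(train_X, train_y, [7.0, 52.0])
prediction_b = knn_predict(train_X, train_y, [1.0, 70.0])
```

In practice features on different scales would usually be normalised before measuring distances, and `k` is one of the parameters that can be tweaked.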
TEST AND SCORE
When you create a Machine Learning model you need a way to make sure it actually works. We can do this by randomly splitting the data set into ‘training’ and ‘test’ data. The training data is used to create the model and the test data is used to determine its accuracy. The 5 different models created from the training data ignore the target variable in the test data (they look only at the chemistry, not the rock type) and attempt to classify/predict what rock type each instance is. The predicted rock type is then compared to the actual known value in the test data and each model is scored on accuracy. In this case the split is 80% training and 20% test, representing 320 and 80 instances respectively.
To put it simply: we, the operators, know the target variable for both the training and the test data, but the model only knows it for the training data. E.g. the model has learned from the training data what Homer Simpson looks like and now has to predict whether each test instance is Homer Simpson or not.
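The split-then-score procedure described above can be sketched in plain Python as follows. The helper names `train_test_split` and `accuracy`, the fixed seed and the placeholder data are assumptions for illustration; Orange's Test and Score widget handles this for you and offers other sampling schemes as well.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Randomly split rows into (train, test) without modifying the input."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of predictions that match the known labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# 400 placeholder instances, standing in for the geochemistry samples.
instances = list(range(400))
train, test = train_test_split(instances, test_fraction=0.2)

# Scoring: compare predicted rock types with the known ones (made-up labels).
score = accuracy(["mafic", "felsic", "mafic", "mafic"],
                 ["mafic", "felsic", "felsic", "mafic"])
```

With 400 instances and a 20% test fraction this gives the 320/80 split quoted above. The model is fitted on `train` only, and `accuracy` is computed on `test` predictions.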
PREDICTION