
Evaluating Model Accuracy and Bias-Variance Tradeoff
Bias check:
How well do the predicted values fit the actual values? (Ideally, a low-bias model is the best model.)

Variance check (error variance): Σ(actual − predicted)² / n

Model error check between Training vs Test/Validation
(Ideally, low error variance in both Train and Test/Validation indicates the best-fit model.)
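A minimal Python sketch of these two checks, assuming NumPy and an already-fitted model with the usual train/test splits (the names model, X_train, y_train, X_test, y_test are placeholders, not from the slides):

import numpy as np

def bias_check(actual, predicted):
    # Average signed difference between actual and predicted values;
    # close to zero means the predictions fit the actuals well (low bias).
    return float(np.mean(np.asarray(actual) - np.asarray(predicted)))

def error_variance(actual, predicted):
    # (actual - predicted)^2 / n, i.e. the mean of the squared residuals.
    residuals = np.asarray(actual) - np.asarray(predicted)
    return float(np.mean(residuals ** 2))

# Comparing the splits for an already-fitted model (hypothetical names):
# low, similar error variance on both Train and Test/Validation points to the
# best-fit model; low train error but high test error signals overfitting.
# train_err = error_variance(y_train, model.predict(X_train))
# test_err  = error_variance(y_test,  model.predict(X_test))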
MACHINE LEARNING ALGORITHMS
Decision Trees
Decision tree algorithms are also called Top-Down Induction of Decision Trees (TDIDT).

Important Terminology:
Root Node : Test/decision points
Branch : A collection of nodes and the leaf they lead to
Leaves : End/terminal nodes; final decisions/conclusions

Famous TDIDT algorithms are:
- C5.0 (Quinlan)
- CART (Breiman)
Trees are rules expressed:
- Within a branch, nodes are connected with “AND”
- Branches with similar outcomes are connected with “OR”
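For illustration only, a short sketch assuming scikit-learn (the slides do not name a library) that prints a fitted tree as nested conditions, which read exactly as these AND/OR rules:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each printed root-to-leaf path is one rule: the conditions along the path
# are joined with "AND", and leaves predicting the same class are alternative
# rules joined with "OR".
print(export_text(tree, feature_names=list(iris.feature_names)))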

What is the best Tree?

The smallest Tree (least number of nodes) with the smallest error (least number of incorrectly classified records)

Advantages:
• Fast
• Robust
• Explicable
Regression Trees

It turns out that we are collecting very similar records at each leaf, so we can use the mean (or median) of the records at a leaf as the predicted value for all new records that satisfy similar conditions. Such trees are called Regression Trees.
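A small sketch, assuming scikit-learn and a toy dataset made up for this example, showing that a regression tree's prediction for a leaf is simply the mean of the training targets that fell into that leaf:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1], [2], [3], [10], [11], [12]])
y = np.array([5.0, 6.0, 7.0, 50.0, 52.0, 54.0])

reg = DecisionTreeRegressor(max_depth=1).fit(X, y)

# New records falling in the left leaf get mean(5, 6, 7) = 6.0,
# records falling in the right leaf get mean(50, 52, 54) = 52.0.
print(reg.predict([[2.5], [11.5]]))   # -> [ 6. 52.]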

It follows two aspects for both Regression & Classification problems:

• Which attribute to choose (where to start)?
• Where to stop (to avoid overfitting)?
Attribute Selection Criteria
• Main principle:
- Select the attribute which partitions the learning set (dataset) into subsets that are as “PURE” as possible

• Various measures of Purity:
- Entropy
- Information Gain
- Gini Index

Note: The lower the Entropy/Gini Index value, the higher the purity of a node (for Information Gain, the higher the gain, the better the split)


We can measure the purity of a Leaf/Node using the methods below:

For Classification Trees : Entropy / Gini Index
For Regression Trees : RMSE / MAPE
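A pure-Python sketch of these purity measures, using made-up class counts per node; purer subsets score lower on Entropy/Gini, and the attribute whose split yields the purest subsets gives the highest Information Gain:

import math

def entropy(counts):
    # counts: number of records of each class at a node, e.g. [9, 5]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def information_gain(parent_counts, child_counts_list):
    # Gain = parent entropy minus the weighted entropy of the child subsets;
    # the attribute whose split leaves the purest subsets has the highest gain.
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * entropy(child) for child in child_counts_list)
    return entropy(parent_counts) - weighted

print(entropy([10, 0]), gini([10, 0]))   # pure node  -> both 0
print(entropy([5, 5]), gini([5, 5]))     # 50/50 node -> 1.0 and 0.5 (maximum impurity)
print(information_gain([9, 5], [[6, 2], [3, 3]]))   # gain of a candidate split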
Two Most Popular Decision Tree Algorithms

• C5.0:
- Multi split
- Information Gain (Measure of Purity)
- Pessimistic pruning (To avoid overfitting)

• CART:
- Binary Split
- Gini Index (Measure of Purity)
- Cost Complexity Pruning (To avoid overfitting)
C5.0 Algorithm
MEASURE OF PURITY
Entropy(S) = −Σ pᵢ log₂(pᵢ)
Entropy becomes zero when the probability of any class (pᵢ) = 1, since log₂(1) = 0.
We can grow the tree until we exhaust the data. But is that the right time to stop?

HOW TO MINIMIZE OVERFITTING?


Why Prune? : To avoid Overfitting
REDUCED ERROR PRUNING/PESSIMISTIC PRUNING
Here ‘f’ is #bad/total → 2/6, 1/2, 2/6
The ‘e’ value comes from the formula.

The weighted sum of errors (0.51) for the lowest layer is calculated as follows:
(6/14)*0.47 + (2/14)*0.72 + (6/14)*0.47 = 0.51
Since the lower layer’s error is higher than that of its parent branch, prune the complete layer.
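A tiny sketch reproducing the weighted-error arithmetic above (the ‘e’ values and record counts are the ones quoted on the slide):

leaf_errors  = [0.47, 0.72, 0.47]    # pessimistic 'e' values for the three leaves (from the slide)
leaf_counts  = [6, 2, 6]             # records reaching each leaf
parent_count = sum(leaf_counts)      # 14

# Each child's error is weighted by its share of the parent's records;
# if the combined child error exceeds the parent's own estimate, prune.
weighted_error = sum(n / parent_count * e for n, e in zip(leaf_counts, leaf_errors))
print(round(weighted_error, 2))      # 0.51 -> higher than the parent branch, so prune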
CART Algorithm
MEASURE OF PURITY
Gini(S) = 1 − Σ pᵢ²  (** here ‘S’ is the total set of records)
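A small sketch with made-up class counts showing how CART scores a candidate binary split: the split whose two children have the lowest weighted Gini index is preferred:

def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def weighted_gini(left_counts, right_counts):
    # Weighted Gini of the two child nodes produced by a binary split.
    n = sum(left_counts) + sum(right_counts)
    return (sum(left_counts) / n) * gini(left_counts) + \
           (sum(right_counts) / n) * gini(right_counts)

# Candidate split A separates the classes almost perfectly; split B barely helps.
print(weighted_gini([9, 1], [1, 9]))   # 0.18 -> preferred split
print(weighted_gini([5, 5], [5, 5]))   # 0.50 -> no purity gained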
