What is WEKA?
WEKA Explorer
Preprocessing the data
Association Rules
Attribute Selection
Data Visualization
What is WEKA?
Waikato Environment for Knowledge Analysis

Developed by the Department of Computer Science, University of Waikato, New Zealand.
Weka is also a bird found only on the
islands of New Zealand.
A collection of machine learning algorithms
for data mining tasks.

Download and Install WEKA


Platform independent
WEKA GUI
Explorer: exploratory data analysis
KnowledgeFlow: new process-model-inspired interface
Simple CLI: command line interface


WEKA Explorer
Preprocessing the data



Association Rules

Attribute Selection

Data Visualization
Pre-Processing the data

Data can be imported from a file in various formats:

ARFF - Attribute-Relation File Format
CSV - Comma Separated Values

Data can be read from a URL or from an SQL database


Filters are used for pre-processing

WEKA only deals with flat files

@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
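To make the header format concrete, here is a minimal Python sketch that reads the declarations above into a dictionary. It is not WEKA's own loader: the function name parse_arff_header is hypothetical, and the sketch assumes simple unquoted names and ignores the @data section.

```python
ARFF_HEADER = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
"""


def parse_arff_header(text):
    """Return (relation, attributes); attributes maps each attribute name
    to its declared type string or to its list of nominal values."""
    relation = None
    attributes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # '%' starts an ARFF comment
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # Nominal attribute: the allowed values are listed in braces.
                attributes[name] = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                attributes[name] = spec.lower()
    return relation, attributes


relation, attributes = parse_arff_header(ARFF_HEADER)
print(relation)
for name, spec in attributes.items():
    print(name, "->", spec)
```

Numeric attributes come back as the string "numeric"; nominal attributes come back as the list of their allowed values.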
University of Waikato
4/13/17 13
Building Classifiers

Classifiers in WEKA are models for predicting nominal or numeric quantities

Implemented learning schemes include:

Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, and Bayes nets
Decision Tree Induction: Training Dataset
Output: A Decision Tree for buys_computer

age?
    <=30: student?
        no: no
        yes: yes
    31..40: yes
    >40: credit rating?
        excellent: no
        fair: yes
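A decision tree like the one above can be applied by walking from the root to a leaf. The Python sketch below hard-codes the tree as nested tuples and classifies one instance. It assumes the usual textbook reading of this example (age at the root, and "excellent" credit rating leading to "no"); the names TREE and classify are illustrative, not part of WEKA.

```python
# Internal nodes are (attribute, {branch_value: subtree}); leaves are class labels.
TREE = ("age", {
    "<=30": ("student", {"no": "no", "yes": "yes"}),
    "31..40": "yes",
    ">40": ("credit_rating", {"excellent": "no", "fair": "yes"}),  # assumed leaves
})


def classify(tree, instance):
    """Follow the branch matching the instance's value at each node
    until a leaf (a plain string) is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[instance[attribute]]
    return tree


customer = {"age": "<=30", "student": "yes", "credit_rating": "fair"}
print(classify(TREE, customer))
```

Learning such a tree from training data is the job of the induction algorithm; this sketch only shows how a finished tree produces a prediction.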
Clustering data

Finding groups of similar instances in a dataset

Implemented schemes in WEKA include:

k-Means, EM, Cobweb, X-means
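As an illustration of the first scheme in the list, the following Python sketch runs plain k-means (Lloyd's algorithm) on a handful of 2-d points. It is a toy version, not WEKA's implementation: the points are made up, and the initial centroids are passed in by hand so the result is reproducible.

```python
def kmeans(points, centroids, max_iter=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its points, until nothing changes."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: recompute each centroid as the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Made-up data: two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

In practice the number of clusters and the starting centroids have a strong effect on the result, which is why they are explicit parameters here.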
Finding Associations
WEKA contains an implementation of the Apriori
algorithm for learning association rules

Works only with discrete data

Can identify statistical dependencies between groups of attributes
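The idea behind Apriori can be sketched in a few lines of Python: find all itemsets whose support reaches a threshold, level by level, then derive rules whose confidence is high enough. This is a simplified illustration, not WEKA's implementation; the transactions, thresholds, and function names are made up for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is at least min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, support, confidence))
    return found


# Made-up transactions over discrete items.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"milk", "bread", "butter"},
]
frequent = apriori(transactions, min_support=0.5)
for antecedent, consequent, support, confidence in rules(frequent, min_confidence=0.7):
    print(sorted(antecedent), "=>", sorted(consequent), round(support, 2), round(confidence, 2))
```

The requirement for discrete data shows up directly: each item is either present in a transaction or not, so numeric attributes would have to be discretized first.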
Attribute Selection
Used to determine the most predictive attributes

Consists of two parts:

1. A search method: best-first, forward selection, random, exhaustive, genetic algorithm, etc.

2. An evaluation method: correlation-based, wrapper, information gain, etc.
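To show how the two parts fit together, the Python sketch below uses information gain as the evaluation method and the simplest possible search: score every attribute on its own and rank them. The data rows and attribute names are made up, and this is an illustration of the idea rather than WEKA's code.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in class entropy obtained by splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Made-up data: 'student' determines the class, 'credit' carries no information.
rows = [
    {"student": "yes", "credit": "fair", "buys": "yes"},
    {"student": "yes", "credit": "excellent", "buys": "yes"},
    {"student": "no", "credit": "fair", "buys": "no"},
    {"student": "no", "credit": "excellent", "buys": "no"},
]

# "Search": evaluate each attribute individually and sort by score.
ranking = sorted(
    ((a, information_gain(rows, a, "buys")) for a in ("student", "credit")),
    key=lambda pair: pair[1],
    reverse=True,
)
for attribute, gain in ranking:
    print(attribute, round(gain, 3))
```

Search methods such as best-first or forward selection differ from this ranking in that they score subsets of attributes rather than single attributes.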
Data Visualization
WEKA can visualize single attributes (1-d) and
pairs of attributes (2-d)

Color-coded class values

Use of jitter option

Zoom-in function
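Jitter is useful when many instances share the same coordinates and would be drawn on top of each other: a small random offset spreads them out so that dense spots become visible. A minimal Python sketch of the idea follows; how WEKA scales its own jitter is not covered here, and the function name and the amount parameter are assumptions for illustration.

```python
import random


def jitter(values, amount, seed=0):
    """Return a copy of values with uniform noise in [-amount, amount] added.
    The seed makes the offsets reproducible."""
    rng = random.Random(seed)
    return [v + rng.uniform(-amount, amount) for v in values]


# Five instances with identical x-values would overlap in a scatter plot.
x = [3.0, 3.0, 3.0, 3.0, 3.0]
x_jittered = jitter(x, amount=0.1)
print(x_jittered)
```

The offsets affect only the display; the underlying data values are unchanged.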
The End
