Seminar On Thesis Writing Report


Principal Component Analysis

Seminar for Thesis/Dissertation Writing


February 26, 2017
Starr Clyde L. Sebial, PhD.Sci.Ed. Math – II
An Exploratory Analysis On The Key Performance
Indicators (KPI) Of Philippine Regions
STARR CLYDE L. SEBIAL
DATA MINING!!!
The Variables
Indicators:
• HEI PRC Performance
• Labor Force
• Employment Rate
• Unemployment Rate
• Underemployment Rate
• Crime
• Completion Rate
• Functional Literacy Rate
• Infant Mortality Rate
• Prevalence of Underweight Children
• Per Capita Poverty Threshold
• Poverty Incidence Among Families
• Average Income
• Average Expenditure
• Access to Electricity
• Access to Potable Water
• Access to Sanitary Toilet Facility
• Proportion of Families Living in Makeshift Housing
• Proportion of Families Living in Informal Settlements
Indices:
• Poverty Index
• Employment Index
• Income Index
• Education Index
• Basic Housing Index
• Health Index
PRINCIPAL COMPONENTS ANALYSIS
Purpose: To reduce the dimension of the data
from n variables to a few linear combinations
of those variables.

Data Form: X = (x1, x2, ..., xn)

Desired Output: Only one or two linear
combinations, such as
Y1 = ax1 + bx2 + ... + cxn
which explain as much of the variance in the
original data as possible.
USES OF PCA
1. To obtain a single numerical index representing a
concept.
Example: Concept = Employability
Indicators: x1 = length of job search,
x2 = hiring rate,
x3 = type of course,
x4 = grade in college.
PCA will produce an index of employability:
Index = ax1 + bx2 + cx3 + dx4
2. To determine weights that should be placed on
certain variables.
Example
• x1 = prelims,
x2 = midterm,
x3 = finals.
How much weight should you put on
each kind of test?

• PCA will produce:
Grade = ax1 + bx2 + cx3
Data Requirements for PCA
Data are multivariate observations, each recording the same n variables:
X1 = (x1, x2, x3, ..., xn); X2 = (x1, ..., xn); etc.

We can compute a covariance matrix from the data:
S = (s(i,j)) where:
s(i,j) = covariance between variables xi and xj
= a measure of the linear relationship between xi and xj
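As a rough illustration of how S is built, the sketch below computes the sample covariance matrix for the simulated employability data that appears later in this deck. It uses only the Python standard library; the function name `covariance_matrix` is a hypothetical helper introduced here, and the divisor m - 1 (sample covariance) is an assumption about the convention in use.

```python
# Sample employability data from this deck: (LENGTH, RATE in P1,000, GRADE).
data = [
    (3, 14, 90), (3, 14, 91), (6, 8, 83), (5, 10, 86), (6, 8, 78),
    (4, 12, 87), (8, 4, 75), (4, 12, 86), (8, 4, 76), (5, 10, 87),
]


def covariance_matrix(rows):
    """Return the sample covariance matrix (divisor m - 1) as nested lists."""
    m = len(rows)        # number of observations
    n = len(rows[0])     # number of variables
    means = [sum(r[j] for r in rows) / m for j in range(n)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (m - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]


S = covariance_matrix(data)
for row in S:
    print(["%.3f" % value for value in row])
```

The diagonal entries are the variances of the individual variables, and their sum is the total variance that PCA later splits among the principal components.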
The Maximization Problem of PCA
Maximize: Variance(ax1 + bx2 + ... + cxn)
Subject to: a^2 + b^2 + ... + c^2 = 1

The solution (a, b, ..., c) turns out to be an eigenvector
of the covariance matrix; the maximum is attained by the
eigenvector corresponding to the largest eigenvalue λ.
Each eigenvalue λ is the variance of its linear
combination (called a Principal Component), so it shows
how much of the total variance that component represents.
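A short derivation may help show why the solution is an eigenvector. It is a sketch using a Lagrange multiplier, with w = (a, b, ..., c) written as a column vector and S the covariance matrix; the symbols w and L are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
\operatorname{Var}(w^\top x) &= w^\top S w, \qquad w = (a, b, \dots, c)^\top, \\
L(w, \lambda) &= w^\top S w - \lambda\,(w^\top w - 1), \\
\nabla_w L &= 2 S w - 2 \lambda w = 0 \;\Longrightarrow\; S w = \lambda w, \\
\operatorname{Var}(w^\top x) &= w^\top (\lambda w) = \lambda .
\end{aligned}
```

Every stationary point is therefore an eigenvector of S, and the variance it achieves equals its eigenvalue, so the largest eigenvalue gives the first principal component.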
STEPS
1. GO TO “STAT”
2. GO TO “MULTIVARIATE”
3. GO TO “PRINCIPAL COMPONENTS”
4. Input “number of components = 3”
5. Input “type of matrix = covariance”
6. Click “OK”
AN EXAMPLE
We want to find an index of employability for
new graduates of a program based on:
X1 = length of job search (months)
X2 = hiring rate per month (pesos)
X3 = grade in college
Simulated data are given on the next page.
SAMPLE EMPLOYABILITY DATA
LENGTH   RATE (P1,000)   GRADE
   3          14           90
   3          14           91
   6           8           83
   5          10           86
   6           8           78
   4          12           87
   8           4           75
   4          12           86
   8           4           76
   5          10           87
MINITAB RESULTS
• Principal Component Analysis: LENGTH, RATE, GRADE

Eigenanalysis of the Covariance Matrix

Eigenvalue   47.980   1.009   -0.000
Proportion    0.979   0.021   -0.000
Cumulative    0.979   1.000    1.000

(Results show that the first principal component
represents 97.9% of the total variance. This is
sufficient to represent all three variables.)
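The eigenvalues above can be cross-checked outside Minitab. The sketch below is one possible way to do so with only the Python standard library: it uses power iteration with deflation, which is a swapped-in technique and not necessarily what Minitab does internally. The helper names (`covariance_matrix`, `top_eigenpair`, `deflate`) and the iteration count are assumptions made for illustration.

```python
import math

# Sample employability data from this deck: (LENGTH, RATE in P1,000, GRADE).
data = [
    (3, 14, 90), (3, 14, 91), (6, 8, 83), (5, 10, 86), (6, 8, 78),
    (4, 12, 87), (8, 4, 75), (4, 12, 86), (8, 4, 76), (5, 10, 87),
]


def covariance_matrix(rows):
    """Sample covariance matrix (divisor m - 1) as nested lists."""
    m, n = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / m for j in range(n)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (m - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]


def top_eigenpair(A, iterations=200):
    """Power iteration: largest eigenvalue of symmetric A and a unit eigenvector."""
    n = len(A)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Av; v has unit length, so no division is needed.
    lam = sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))
    return lam, v


def deflate(A, lam, v):
    """Remove the component (lam, v) from A so the next eigenpair becomes dominant."""
    n = len(A)
    return [[A[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]


S = covariance_matrix(data)
total_variance = sum(S[i][i] for i in range(len(S)))

lam1, pc1 = top_eigenpair(S)
lam2, pc2 = top_eigenpair(deflate(S, lam1, pc1))

print("eigenvalues:", round(lam1, 3), round(lam2, 3))
print("proportions:", round(lam1 / total_variance, 3), round(lam2 / total_variance, 3))
print("PC1 loadings:", [round(x, 3) for x in pc1])
```

The third eigenvalue is zero because, in the simulated data, RATE equals 20 - 2 × LENGTH in every row, so the three variables really span only two dimensions. The sign of an eigenvector is arbitrary, so loadings from other software may come out with all signs flipped.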
MINITAB CONTINUED
Variable    PC1      PC2      PC3
LENGTH    -0.256    0.366    0.894
RATE       0.513   -0.733    0.447
GRADE      0.819    0.573   -0.000

Index of employability =
-0.256 Length + 0.513 Rate + 0.819 Grade

• The longer the student waits for a job, the smaller is the
index; the higher the hiring rate, the greater is the index;
the higher the grade in college, the higher is the index.
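To make the interpretation concrete, the sketch below applies the index formula to two rows of the sample data. It assumes the index is computed on the raw (uncentered) values exactly as the formula is written on the slide; the function name `employability_index` is hypothetical.

```python
# Loadings of the first principal component, as reported in the Minitab output.
PC1 = {"length": -0.256, "rate": 0.513, "grade": 0.819}


def employability_index(length, rate, grade):
    """Weighted sum of the raw values using the PC1 loadings."""
    return PC1["length"] * length + PC1["rate"] * rate + PC1["grade"] * grade


# First row of the sample data: short job search, high rate, high grade.
fast_hire = employability_index(3, 14, 90)
# Seventh row of the sample data: long job search, low rate, low grade.
slow_hire = employability_index(8, 4, 75)

print(round(fast_hire, 3), round(slow_hire, 3))
```

The graduate with the shorter search, higher rate, and higher grade receives the larger index, which matches the reading given above.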
WORKSHOP 4
• We want to develop a “Management Proficiency Index” based
on the following scores:
X1 = leadership score
X2 = decision-making score
X3 = interpersonal relations score
X4 = task-orientation score
• Data are shown on the next page. Perform a PCA and interpret
your index. Compare the indices for the following two people:
PERSON 1: X1 = 1, X2 = 2, X3 = 2, X4 = 3
PERSON 2: X1 = 3, X2 = 3, X3 = 2, X4 = 1
WORKSHOP DATA
LEADER   DECISION   PERSON   TASK
   3       3.10       4.1    6.18
   3       3.20       4.3    6.12
   4       3.40       4.4    6.85
   5       3.55       4.5    7.00
   4       3.40       4.4    6.90
   3       3.19       4.2    6.14
   5       3.50       4.5    6.90
   5       3.50       4.6    7.00
   5       3.50       4.5    6.99
   5       3.50       4.7    7.20
