Course 4.6.1. Van Dam PDF
PRESENTATION
MULTIVARIATE ANALYSIS
• What is it?
  - Why/when use it
  - Classification of techniques
• Examples
  - multiple regression analysis
  - factor analysis (Dr. Peter Kelderman)
Multivariate data analysis – what is it?
[Figure: frequency histogram of DO concentration (mg/l); y-axis: no. of observations; mean = 4.78, S.D. = 0.95]
Multivariate data analysis: example
Predicting the phosphorus concentrations of lakes
What is a variate?
Definition:
a variate is a linear combination of variables with
empirically determined weights
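As a minimal sketch of this definition, a variate can be computed as a weighted sum of one observation's variables. The function name `variate` and all numbers below are made up for illustration; in practice the weights are estimated from the data by the multivariate technique.

```python
def variate(values, weights):
    """Return the linear combination w1*X1 + w2*X2 + ... + wn*Xn."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values))

# Hypothetical observation (X1, X2, X3) and hypothetical weights (w1, w2, w3).
sample = [2.0, 5.0, 1.0]
weights = [0.5, -0.2, 1.5]

score = variate(sample, weights)  # 0.5*2.0 - 0.2*5.0 + 1.5*1.0
print(score)
```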
Metric or quantitative:
differing in amount or degree
Example: temperature
Interval scale: arbitrary zero point (e.g., temperature in Celsius versus
Fahrenheit)
Ratio scale: absolute zero point (e.g., weight or length)
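The difference between the two metric scales can be illustrated with a small sketch: on an interval scale a ratio of two values changes with the unit, on a ratio scale it does not. The temperatures and weights below are arbitrary example values, and the helper names are chosen for this illustration only.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9.0 / 5.0 + 32.0

def kg_to_lb(kg):
    """Convert a weight from kilograms to pounds."""
    return kg * 2.20462

# Interval scale: "20 is twice 10" holds in Celsius but not in Fahrenheit,
# because the zero point is arbitrary.
ratio_celsius = 20.0 / 10.0
ratio_fahrenheit = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

# Ratio scale: the zero point is absolute, so the ratio survives a unit change.
ratio_kg = 20.0 / 10.0
ratio_lb = kg_to_lb(20.0) / kg_to_lb(10.0)

print(ratio_celsius, ratio_fahrenheit)  # ratios differ
print(ratio_kg, ratio_lb)               # ratios agree
```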
Classification of multivariate techniques
Dependence techniques:
variable or set of variables is identified as the dependent
variable to be predicted or explained by independent
variables
Example: multiple regression analysis
Interdependence techniques:
set of variables is analysed simultaneously without
defining dependence relationships
Example: factor analysis
Multivariate dependence methods
(one dependent variable)
• Analysis of variance
Y1 = X1 + X2 + X3 + ... + Xn
(metric) (nonmetric)
• Canonical correlation
Y1 + Y2 + Y3 + ... + Yn = X1 + X2 + X3 + ... + Xn
(metric, nonmetric) (metric, nonmetric)
dependence | interdependence
Dependence: how many variables are predicted?
• Multiple relationships of dependent and independent variables
• Several dependent variables in a single relationship
  - canonical correlation with dummy variables
  - Measurement scale of predictor variable?
    metric: canonical correlation analysis
    nonmetric: multivariate analysis of variance
• One dependent variable in a single relationship
  - multiple regression
  - conjoint analysis
  - multiple discriminant analysis
  - linear probability models
Source: Hair et al. 1998. Multivariate data analysis, 5th ed.
Type of relationship? dependence | interdependence
Interdependence: structure of relationships among:
  - metric or nonmetric: multidimensional scaling
  - nonmetric: correspondence analysis
Exploratory methods:
Main objective is to identify interrelationships and
structures among variables. Reduction of large number
of variables to a few key components
Examples: principal component and factor analysis; cluster analysis;
multidimensional scaling
Confirmatory methods:
Main objective is to test hypothesized relationships
between variables. Researcher has a priori
understanding of relationships
Examples: correlation analysis; multiple regression; canonical correlation;
analysis of variance; discriminant analysis; conjoint analysis; structural
equation modelling
Structured approach to multivariate analysis
Stage 1: Define problem, objectives, and choose technique
Nile tilapia
(Oreochromis niloticus L.)
Multiple regression analysis of rice-fish data
Analysis plan / methodology / assumptions
• Database management
• Transformations/re-expression
Multiple regression analysis of rice-fish data
Database management
Data plot; trends
[Figure: recovery (%) and growth rate (g/day) plotted against fish stocking size (g)]
Data plot; trends
[Figure: net fish yield (kg/ha) plotted against recovery (%), with fitted trend line y = 1.8516x - 50.993 (R² = 0.4605)]
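The trend line reported for this plot can be read as a simple prediction rule. The sketch below only evaluates the published slope, intercept and R²; the function name and the chosen recovery values are illustrative, and the underlying data points are not reproduced here.

```python
# Coefficients as reported on the plot: y = 1.8516x - 50.993, R2 = 0.4605
SLOPE = 1.8516        # kg/ha of net fish yield per percentage point of recovery
INTERCEPT = -50.993   # kg/ha
R_SQUARED = 0.4605    # share of the variation in yield explained by recovery

def predicted_net_yield(recovery_pct):
    """Net fish yield (kg/ha) predicted by the trend line for a given recovery (%)."""
    return SLOPE * recovery_pct + INTERCEPT

# Recovery at which the line predicts zero net yield (about 27.5 %).
break_even_recovery = -INTERCEPT / SLOPE

# For a one-predictor line, the correlation is the square root of R2,
# taking the sign of the slope (positive here).
correlation = R_SQUARED ** 0.5

print(predicted_net_yield(100.0), break_even_recovery, correlation)
```

With R² below 0.5, more than half of the variation in net yield is left unexplained by recovery alone, which is one motivation for moving on to several predictors.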
Descriptive statistics of the rice-fish dataset. No. of cases (N) = 198

Name                          Mean      SD        Min.    Max.
Rice yield (kg ha-1)          4337.46   1689.08   600     8250
Independent variables
Plot size (m2)                201.52    27.55     100     400
Period (d)                    78.97     17.73     50      114
Log period (d)                1.89      0.0995    1.70    2.06
Stocking density (no. ha-1)   5878.79   1883.80   2000    10000
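The Period and Log period entries illustrate the re-expression step. A short check, assuming a base-10 logarithm (an assumption, but one that reproduces the tabulated minimum and maximum), could look like this:

```python
import math

# Values for "Period (d)" taken from the descriptive statistics.
period_min, period_mean, period_max = 50, 78.97, 114

log_min = math.log10(period_min)   # rounds to 1.70, as tabulated
log_max = math.log10(period_max)   # rounds to 2.06, as tabulated

# The log of the mean is close to, but in general not identical to,
# the mean of the logged values (tabulated as 1.89).
log_of_mean = math.log10(period_mean)

print(round(log_min, 2), round(log_max, 2), round(log_of_mean, 3))
```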
Y = a + B1·X1 + B2·X2 + ... + Bk·Xk + ε
with
Y : dependent variable
X1..k : independent or explanatory variables
B1..k : partial regression coefficients (slopes)
a : constant (intercept)
ε : residual
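A model with an intercept a, slopes B1..k and a residual ε is usually fitted by ordinary least squares. The following is a minimal pure-Python sketch via the normal equations, not the procedure used for the rice-fish data; the data are synthetic and noise-free (generated from Y = 2 + 3·X1 - X2), and the helper names `solve` and `fit_ols` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(xs, ys):
    """Least-squares fit of Y = a + B1*X1 + ... + Bk*Xk; returns [a, B1, ..., Bk]."""
    design = [[1.0] + list(row) for row in xs]  # leading 1.0 carries the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)

# Synthetic, noise-free example data (hypothetical).
xs = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
ys = [2 + 3 * x1 - 1 * x2 for x1, x2 in xs]

a, b1, b2 = fit_ols(xs, ys)
residuals = [y - (a + b1 * x1 + b2 * x2) for (x1, x2), y in zip(xs, ys)]
print(a, b1, b2)
```

Because the example data contain no noise, the fit recovers the generating coefficients and the residuals vanish; with real data the residuals carry the unexplained variation.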
Conclusion:
significant models; 35-71% of variation in Y’s explained
Multiple regression analysis of rice-fish data
Multiple regression models for gross fish yield (kg ha-1). All partial regression coefficients (bs) were significant at the
0.1% level, except when marked * (5%) or ** (1%). Also given are the standard errors (SE) of the bs and the
standardized bs or beta weights (rankings in brackets). Number of cases = 198

                                   Model 1                       Model 2
Independent variables              b         SE      beta        b         SE      beta
Period (d)                         1.57      0.225   0.359(4)
Log period (d)                                                   230.26    42.23   0.295(5)
Stocking density (no. ha-1)        0.012     0.002   0.279(6)
Log stocking density (no. ha-1)                                  136.31    27.31   0.225(6)
Stocking size (g)                  3.78      0.432   0.458(1)    3.82      0.423   0.463(2)
Basal N application (kg ha-1)      1.74      0.283   0.296(5)
Log basal N application (kg ha-1)                                276.07    36.36   0.342(4)
Basal P application (kg ha-1)      -2.05     0.318   -0.361(3)
Log basal P application (kg ha-1)                                -166.37   21.16   -0.439(3)
No. of insecticide sprayings       -10.03*   4.15    -0.114(7)
Maximum air temperature (°C)       26.97     3.21    0.452(2)    28.67     3.17    0.481(1)
βk = bk • (sdXk / sdYk)
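The beta-weight formula rescales a slope by the ratio of the standard deviations of predictor and response, which makes slopes of differently scaled predictors comparable. A small sketch with made-up data (one predictor only, so that the result can be checked against the Pearson correlation) might look like this; `beta_weight` is a hypothetical helper name.

```python
from statistics import mean, stdev

def beta_weight(b, sd_x, sd_y):
    """Standardized regression coefficient: b * (sd of X / sd of Y)."""
    return b * sd_x / sd_y

# Hypothetical data: culture period (d) and fish yield (kg/ha).
x = [50, 60, 75, 80, 95, 114]
y = [110, 150, 140, 190, 200, 260]

mx, my = mean(x), mean(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)

b = sxy / sxx                               # unstandardized slope
beta = beta_weight(b, stdev(x), stdev(y))   # standardized slope
r = sxy / (sxx * syy) ** 0.5                # Pearson correlation

# With a single predictor the beta weight equals the correlation coefficient.
print(b, beta, r)
```

In a model with several predictors the beta weights no longer equal simple correlations, but they can still be ranked by absolute size, as done in the table of models.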
Multiple regression analysis of rice-fish data
Beta weight
Figure 1. Beta weights of variables in all models. Pesticides (dotted bars)
were of minor importance for yield and recovery, had a strong negative effect
on fish growth rate and positive effects on rice yields. Phosphorus fertilization
(striped bars) showed a negative effect on all fish variables.
Abbreviations: PER = length of the culture period, SD = stocking density,
SS = stocking size, N = basal nitrogen application, P = basal phosphorus
application, H = herbicide application, B = basal insecticide application,
I = number of insecticide sprayings, T = maximum air temperature.
Factor analysis
• Analyze interrelationships among large no. of variables
• Explain in terms of underlying relationships (= factors = variates)
• Data reduction (reduce large no. of variables to 2-4 factors)
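The data-reduction idea can be sketched with the first step of a principal component extraction, a close relative of factor analysis that is swapped in here because it fits in a few lines. The correlation matrix below is hypothetical, and power iteration is just one simple way to obtain the largest eigenvalue; statistical packages use more complete routines and usually rotate the factors afterwards.

```python
def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def first_component(corr, iterations=500):
    """Largest eigenvalue and unit eigenvector of a correlation matrix (power iteration)."""
    n = len(corr)
    v = [1.0 / n ** 0.5] * n
    for _ in range(iterations):
        w = mat_vec(corr, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(corr, v)))  # Rayleigh quotient
    return eigenvalue, v

# Hypothetical correlation matrix of three variables; the first two are strongly related.
corr = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

eigenvalue, vector = first_component(corr)
variance_explained = eigenvalue / len(corr)          # share of total variance
loadings = [x * eigenvalue ** 0.5 for x in vector]   # correlation of each variable with the component

print(round(100 * variance_explained, 1), [round(l, 2) for l in loadings])
```

Variables with large loadings on the same component move together, which is what allows many measured variables to be summarized by a few factors.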
Variance explained (%)   21     16      14     12
                         0.81   -1.00   0.23   1.08

temp     31.3
secchi   31
alk      154
pH       8.03
DO       9.8
NHx      0.8
NO2      0.08
NO3      2.0
ChlA     177
PO4      1.5
SPSS http://www.spss.com
SAS http://www.sas.com
SYSTAT http://www.systat.com
Statistica http://www.statsoftinc.com
Minitab http://www.minitab.com
LISREL http://www.ssicentral.com/lisrel/mainlis.htm
Etcetera !!!!!