
Subject: Statistics

Paper: Multivariate Analysis


Module: Introduction to Multivariate Analysis

Development Team

Principal investigator: Dr. Bhaswati Ganguli, Professor,
Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Sugata SenRoy, Professor, Department of
Statistics, University of Calcutta
Content writer: Souvik Bandyopadhyay, Senior Lecturer, Indian
Institute of Public Health, Hyderabad
Content reviewer: Dr. Kalyan Das, Professor, Department of
Statistics, University of Calcutta

What is Multivariate Analysis?

Multivariate analysis is the simultaneous study of several variables.

- Rather than considering each variable individually, we consider
  all the variables under study together.
- Formally, in multivariate analysis, we have a set of m related
  variables
  $X = (X_1, X_2, \ldots, X_m)^\top$
  which are studied together.
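
As a small illustration of this set-up, the sketch below stores n observations on m variables as rows of a data matrix and computes the mean vector. The weather values and the helper name `mean_vector` are made up for illustration and are not part of the original module.

```python
# Hypothetical data: n = 4 days, m = 3 variables
# (minimum temperature, maximum temperature, rainfall). Values are invented.
data = [
    [21.0, 30.0, 0.0],
    [22.0, 31.0, 2.0],
    [20.0, 28.0, 10.0],
    [23.0, 33.0, 0.0],
]


def mean_vector(rows):
    """Return the column-wise mean of a list of equal-length rows."""
    n = len(rows)
    m = len(rows[0])
    return [sum(row[j] for row in rows) / n for j in range(m)]


# One mean per variable: the multivariate analogue of a single sample mean.
xbar = mean_vector(data)
print(xbar)
```

Each row is one realisation of the vector X, so every summary computed from the matrix is itself a vector or a matrix rather than a single number.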

Why Multivariate Analysis?

- A major advantage of multivariate analysis is that, unlike
  univariate studies, it takes into account the interdependences
  between the variables under study and hence is more
  informative.
- However, this also makes the analyses more complex.
- Despite this, a multivariate analysis of the m variables is
  usually preferred to m separate univariate analyses.
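
The interdependence mentioned above is what a correlation matrix summarises, and it is exactly what m separate univariate analyses cannot see. The following is a minimal sketch using only the Python standard library; the blood pressure readings and the function names are assumptions made for illustration.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of rows."""
    n = len(rows)
    m = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [
        [
            sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
            for k in range(m)
        ]
        for j in range(m)
    ]


def correlation_matrix(rows):
    """Scale each covariance by the two standard deviations involved."""
    cov = covariance_matrix(rows)
    m = len(cov)
    sd = [math.sqrt(cov[j][j]) for j in range(m)]
    return [[cov[j][k] / (sd[j] * sd[k]) for k in range(m)] for j in range(m)]


# Hypothetical (systolic BP, diastolic BP) readings for five people.
bp = [
    [110.0, 70.0],
    [120.0, 80.0],
    [130.0, 85.0],
    [140.0, 90.0],
    [150.0, 100.0],
]

corr = correlation_matrix(bp)
# The off-diagonal entry measures how the two variables move together.
print(corr[0][1])
```

The diagonal of the covariance matrix holds the univariate variances; the off-diagonal entries are the additional information a joint analysis uses.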

Why Multivariate Analysis?

- Most univariate methods can be extended to multivariate
  methods, with more complexities but better results.
- However, there are several multivariate techniques which have
  no real univariate counterpart. These are
  - either too trivial in univariate studies
  - or do not arise in univariate problems
- It is mainly these latter techniques with which we will be
  concerned here.

Some Examples

Let us next look at a few examples of multivariate data in different
fields:
- Social Science: Characteristics of an individual like
  (Gender, Age, Nationality)
- Climatology: Weather on a particular day like
  (Minimum temperature, Maximum temperature, Rainfall,
  Humidity)
- Economics: Management of a firm like
  (Input costs, Production, Profit)
- Socio-Demographic: Profile of a country like
  (GDP, Life expectancy, Literacy rate)

Some Examples

There can be several examples from Health Sciences:
- General Health: Health profile of an individual like
  (Systolic BP, Diastolic BP, Pulse rate)
- Pathology: The pathological profile of a patient like
  (Blood sugar level, Uric acid concentration, Hemoglobin
  count)
- Administrative: Day-to-day administration in a hospital
  (Admissions, Operations, Discharges, Deaths)
- Pharmaceutical: Quantity of sales per day in a pharmacy
  (Drug A, Drug B, ..., Drug Z)

Visualization
Unlike univariate, bivariate or trivariate data, multivariate data
with m ≥ 4 variables cannot be plotted directly, since a single
display offers at most three spatial axes. Even for trivariate data
(as shown in the diagram), the plots are often difficult to understand.

Example

Problem: In this problem, each face has several features. It is
impossible to quantify and then plot these features to analyse their
characteristics.

Different aspects of Multivariate Analysis

A major aspect of multivariate analysis is to extend the univariate
results to the multivariate set-up.

This is mostly done to study the inter-relationships between the
variables under study, which are lost in univariate studies. Hence
emphasis here is on joint rather than marginal studies.

- This may mean the extension of the Binomial or Normal
  distributions to their multivariate counterparts.
- It can also mean the extension of the inference techniques to
  multivariate data. This can relate both to the estimation and
  hypothesis testing problems.
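
To make the idea of a multivariate counterpart concrete, the standard univariate and m-variate Normal densities can be set side by side. The symbols $\mu$, $\sigma^2$, $\boldsymbol{\mu}$ and $\Sigma$ are notation assumed here rather than taken from the slides: $\boldsymbol{\mu}$ is the m-dimensional mean vector and $\Sigma$ a positive definite m × m covariance matrix.

```latex
% Univariate Normal density with mean \mu and variance \sigma^2
f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}
       \exp\left\{ -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right\}

% m-variate Normal density with mean vector \boldsymbol{\mu}
% and covariance matrix \Sigma
f(\mathbf{x}) = (2\pi)^{-m/2} \, |\Sigma|^{-1/2}
       \exp\left\{ -\tfrac{1}{2}
       (\mathbf{x}-\boldsymbol{\mu})^{\top} \Sigma^{-1}
       (\mathbf{x}-\boldsymbol{\mu}) \right\}
```

Setting m = 1 and $\Sigma = \sigma^2$ in the second expression recovers the first. The off-diagonal entries of $\Sigma$ carry the joint information that the marginal densities alone do not.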

Cause-Effect Relationships

- A major extension is in the study of cause-effect relationships.
  The multiple regression model with one response, the ANOVA
  model and the ANCOVA model can all be extended to their
  multivariate counterparts where several responses are studied
  simultaneously.
- The question asked here is whether one set of variables has an
  effect on another set of variables, and if so, how?
- As the model equations are linear in the parameters, the
  models are broadly referred to as Multivariate Linear Models.
- Special cases of this, depending on the nature of the
  covariates, are Multivariate Regression, MANOVA and
  MANCOVA.
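
One common way to write such a model is sketched below. The notation is an assumption made for illustration and does not come from the slides: $Y$ holds n observations on p responses, $Z$ holds the q covariates (or design columns), $B$ is the coefficient matrix and $E$ the error matrix.

```latex
% Multivariate linear model: p responses, q covariates, n observations
\underset{n \times p}{Y} \;=\; \underset{n \times q}{Z}\,
\underset{q \times p}{B} \;+\; \underset{n \times p}{E}

% Least-squares estimate when Z has full column rank
\hat{B} = (Z^{\top} Z)^{-1} Z^{\top} Y
```

With p = 1 this is the usual multiple regression model. The model is linear because each response is a linear function of the entries of $B$, whatever form the columns of $Z$ take.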

Different aspects of Multivariate Analysis

But there are aspects of multivariate analysis which are unique to
it. Two such broad aspects are
- Classification of Individuals
- Dimension Reduction
The problem of Classification of Individuals is too trivial for
univariate data, while the Dimension Reduction problem does not
arise for single variable studies.

Classification of Individuals

At the exploratory stage in the classification problem, the question
often asked is

Can a group of individuals or units be separated into smaller
subgroups according to similarity or dissimilarity?

- The aim is usually to segregate similar units into
  homogeneous groups, so that the remaining heterogeneity is
  treated as between-group variability.
- The greater this between-group variability, the more distinctly
  defined are the groups.
- The technique to do this is known as Cluster Analysis.
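
Cluster analysis covers many methods, and the slides do not single one out. As one possible illustration, the sketch below uses k-means (Lloyd's algorithm) with starting centroids supplied by the caller; the data points and the function names are invented for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm from given starting centroids.

    Returns (labels, centroids), where labels[i] is the index of the
    centroid closest to points[i].
    """
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical bivariate observations forming two visibly separate groups.
points = [
    [1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
    [8.0, 8.0], [9.0, 8.5], [8.5, 9.0],
]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
```

The further apart the final centroids are relative to the spread within each group, the more distinctly defined the clusters, which is the between-group variability referred to above.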

Classification of Individuals

Following the formation of such clusters, it is necessary to
characterize these clusters. Thus the next natural question is

Why and how are these clusters different?

- Such answers are generally provided by Discriminant Analysis,
  which helps to distinguish between the characteristics of the
  clusters.
- Classification techniques are then used to assign new
  individuals into one of these well-defined clusters.
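
The assignment of a new individual can be sketched with a nearest-mean rule, a deliberately simple stand-in for discriminant analysis rather than the method the course develops. It ignores the covariance structure that a full discriminant analysis would use. The group labels, data and function names are assumptions made for illustration.

```python
def group_means(points, labels):
    """Mean vector of each labelled group, returned as {label: mean}."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {
        lab: [sum(col) / len(members) for col in zip(*members)]
        for lab, members in groups.items()
    }


def classify(x, means):
    """Assign x to the group whose mean is nearest in Euclidean distance."""
    return min(
        means,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, means[lab])),
    )


# Hypothetical individuals already sorted into two clusters, "A" and "B".
points = [
    [1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
    [8.0, 8.0], [9.0, 8.5], [8.5, 9.0],
]
labels = ["A", "A", "A", "B", "B", "B"]

means = group_means(points, labels)
new_individual = [2.0, 1.0]
print(classify(new_individual, means))
```

Comparing the group means variable by variable also gives a first, informal answer to how the clusters differ.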

Dimension Reduction

- Very often, a data set contains a large number of variables.
- This can make both the analysis complicated and the
  interpretations difficult.
- A question thus asked is

Is it necessary to look at all the variables, or is it possible to
capture the same information through a smaller set of variables?

- The solutions to this problem are obtained through Dimension
  Reduction methods.

Dimension Reduction

There are several Dimension Reduction techniques. Four of the
important ones are
- Principal Component Analysis
- Factor Analysis
- Canonical Correlation methods
- Multidimensional Scaling
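
As a rough illustration of the first technique in this list, the sketch below finds the leading principal component of a small data set by power iteration on the sample covariance matrix, and reports the share of total variance it captures. The data are invented and nearly collinear by construction, and power iteration is just one convenient way to obtain the leading eigenvector without external libraries.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of rows."""
    n = len(rows)
    m = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [
        [
            sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
            for k in range(m)
        ]
        for j in range(m)
    ]


def first_principal_component(cov, iterations=200):
    """Power iteration: returns (largest eigenvalue, unit eigenvector)."""
    m = len(cov)
    v = [1.0 / math.sqrt(m)] * m  # starting direction, an arbitrary choice
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(m)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[j] * sum(cov[j][k] * v[k] for k in range(m)) for j in range(m)
    )
    return eigenvalue, v


# Hypothetical data on three variables that move almost in lockstep.
data = [
    [2.0, 1.9, 4.1],
    [4.0, 4.2, 7.9],
    [6.0, 5.8, 12.2],
    [8.0, 8.1, 15.8],
    [10.0, 10.0, 20.0],
]

cov = covariance_matrix(data)
lam, direction = first_principal_component(cov)
total_variance = sum(cov[j][j] for j in range(len(cov)))
share = lam / total_variance  # proportion of variance on the first component
print(round(share, 4))
```

When this share is close to 1, a single derived variable carries almost all the variation of the original three, which is the sense in which the same information is captured by a smaller set of variables.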

Summary

In summary, in this course we will be restricting ourselves to
- The extensions of univariate inferential techniques to
  multivariate data
- The study of Multivariate Linear Models
- The study of Dimension Reduction techniques
- The study of different Classification techniques


Thank You
