Multivariate Techniques Assignment

KWAME NKRUMAH UNIVERSITY OF SCIENCE AND
TECHNOLOGY, KNUST; KUMASI.
GEOLOGICAL ENGINEERING DEPARTMENT
TOPIC: MINOR AND TRACE ELEMENTS GEOCHEMISTRY OF SOILS
IN BROFOYEDU AREA, ASHANTI REGION.
SUPERVISOR: Dr. O.B. Nuamah
MEMBERS: Acheampong Solomon, Baku Aremiyaw and Owusu-Bempong
Theophilus
Assignment 2: Multivariate Techniques
DATE: 16th March, 2022.

Table of Contents
CORRELATION ANALYSIS
Positive correlation
Negative Correlation
No Correlation
Principal Components Analysis
Factor Analysis
Differences between the Principal Component Analysis and the Factor Analysis
CLUSTER ANALYSIS
SPATIAL DISTRIBUTION PLOTS
Table of figures
Figure 1. Positive correlation
Figure 2. Negative correlation
Figure 3. No Correlation
Figure 4. The spatial distribution of earthquake stress rotations following large subduction zone
earthquakes
Figure 5. Spatial distribution of heavy metal concentrations surrounding a cement factory and its
effect on Astragalus gossypinus and wheat in Kurdistan Province, Iran
Figure 6. Spatial distribution of neighborhood-level housing prices and its association with
all-cause mortality in Seoul, Korea (2013–2018): A spatial panel data analysis
CORRELATION ANALYSIS
Correlation analysis is a statistical method used to discover whether there is a relationship
between two variables or datasets, and how strong that relationship may be. It is used to
analyse qualitative and quantitative geochemical data gathered from exploration methods, to
identify whether there are any significant connections, patterns or trends between the two.
Essentially, correlation analysis is used for spotting patterns within datasets. A positive
correlation means that both variables increase in relation to each other, while a negative
correlation means that as one variable increases, the other decreases.
Two basic methods are used, depending on whether assumptions are made about the parameters of
the distribution from which the data were gathered. The two terms to watch out for are:
• Parametric (Pearson’s coefficient): the data must be handled in relation to the
parameters of population or probability distributions. Typically used with quantitative
data already set out within said parameters.
• Non-parametric (Spearman’s coefficient): no assumptions can be made about
the probability distribution. Typically used with qualitative data that can be ranked
(ordinal data), but can be used with quantitative data if Pearson’s coefficient proves
inadequate.
In cases where both are applicable, statisticians recommend using parametric methods
such as Pearson’s coefficient, because they tend to be more precise. That does not mean
the non-parametric methods should be discounted, particularly when there is too little data
to check the assumptions that the parametric methods rely on.
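As a rough illustration of the two coefficients described above, the sketch below computes both in plain Python. The element names and concentration values are hypothetical and are not taken from the Brofoyedu dataset; Spearman’s coefficient is computed here as Pearson’s coefficient applied to the ranks of the data, with tied values given their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical concentrations (ppm) of two elements in six soil samples.
cu = [12.0, 15.0, 21.0, 30.0, 44.0, 95.0]
zn = [40.0, 48.0, 55.0, 61.0, 70.0, 72.0]

r_pearson = pearson(cu, zn)
r_spearman = spearman(cu, zn)
print(f"Pearson r  = {r_pearson:.3f}")
print(f"Spearman r = {r_spearman:.3f}")
```

Because the made-up values rise together but not along a straight line, the rank-based coefficient comes out higher than Pearson’s, which is the kind of case where the non-parametric method is worth checking.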
Interpreting results
Typically, the best way to gain a generalized but more immediate interpretation of the results
of a set of data is to visualize it on a scatter graph, as in the figures below.
Positive correlation
A coefficient from +0.5 to +1 indicates a strong positive correlation, which means that both
variables increase at the same time. The line of best fit is placed to best represent the data
on the graph. In this case, it follows the data upwards to indicate the positive correlation.
Figure 1. Positive correlation
Negative Correlation
A coefficient from -0.5 to -1 indicates a strong negative correlation, which means that as one
variable increases, the other decreases. The line of best fit can be seen here to
indicate the negative correlation: in these cases it slopes downwards from left to right.
Figure 2. Negative correlation.
No Correlation
Very simply, a coefficient of zero indicates that there is no correlation, or linear relationship,
between two variables. The larger the sample, the more reliable the result. No matter which formula
is used, this holds true: the more data that is put into the formula, the more accurate the end
result will be.
Figure 3. No Correlation
The cause of any relationship discovered through correlation analysis is for the researcher to
determine through other means of analysis. Further statistics, such as the coefficient of
determination, show how much of the variation in one variable is accounted for by the other.
Nevertheless, correlation analysis provides a great amount of value: for example, the degree of
dependency between the variables can be estimated.
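The line of best fit and the coefficient of determination mentioned above can be sketched in a few lines of plain Python. This is only an illustration under assumed, made-up data; it uses ordinary least squares for the line and computes the coefficient of determination as one minus the ratio of residual to total sum of squares.

```python
def best_fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y):
    """Coefficient of determination of the fitted line (1 - SS_res / SS_tot)."""
    slope, intercept = best_fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: y falls as x rises (a negative correlation).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.8, 8.1, 6.2, 3.9, 2.0]

slope, intercept = best_fit_line(x, y)
r2 = r_squared(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r2:.3f}")
```

A negative slope corresponds to the downward-sloping line of a negative correlation, and a value of R^2 close to 1 means the line accounts for most of the variation in y.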
Principal Components Analysis
Principal components analysis (PCA) is a multivariate technique, widely applied to geochemical
data, that removes dependency or redundancy in the data by replacing features that carry the same
information as other attributes with a smaller set of derived components. In this sense, a
component is a derived new dimension (or variable), constructed so that the derived variables are
linearly independent of (uncorrelated with) each other.
The approach of PCA is to reduce the unnecessary features present in the data by creating or
deriving new dimensions (also referred to as components). These components are linear
combinations of the original variables.
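The idea that components are uncorrelated linear combinations of the original variables can be illustrated with a short sketch. It assumes NumPy is available and uses a made-up table of element concentrations; the eigen-decomposition of the correlation matrix is one common way to obtain the components, not necessarily the procedure used for this assignment's data.

```python
import numpy as np

# Hypothetical concentrations (ppm) of three elements in six soil samples.
# Columns: Cu, Zn, Pb. Cu and Zn are deliberately made to vary together.
X = np.array([
    [12.0, 40.0, 8.0],
    [15.0, 47.0, 5.0],
    [21.0, 58.0, 9.0],
    [30.0, 77.0, 6.0],
    [44.0, 105.0, 7.0],
    [52.0, 120.0, 4.0],
])

# Standardise so that every element has mean 0 and unit variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Covariance of standardised data is the correlation matrix of the raw data.
corr = np.cov(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# eigh returns eigenvalues in ascending order; put the largest first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Each component score is a linear combination of the standardised variables.
scores = Z @ eigenvectors
explained = eigenvalues / eigenvalues.sum()

print("Loadings (columns are components):")
print(np.round(eigenvectors, 3))
print("Share of total variance per component:", np.round(explained, 3))
```

Since Cu and Zn carry nearly the same information in this made-up table, the first component absorbs most of the total variance, which is the redundancy that PCA is meant to remove.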
Factor Analysis
Factor analysis is another multivariate technique, in which a factor is a common or underlying
element with which several other variables are correlated. It is used to understand the
underlying causes in the data: the factors capture much of the information held by a set of
variables in the dataset. The primary aim of factor analysis is to unravel the latent variables
(also known as the factors) that account for the variables’ spread (or information).
Factor analysis is performed to reduce a large number of attributes to a smaller set of
factors. When analyzing data with many predictors, some of the features may share a common
theme. Features with a similar underlying meaning could be influencing the target variable
through this shared cause, and hence such features are combined into one factor. Thus, a factor
is a common element with which several other variables are correlated.
Differences between the Principal Component Analysis and the Factor Analysis
• In principal components analysis, the goal is to explain as much of the total variance in
the variables as possible, whereas the goal of factor analysis is to explain the
covariances or correlations between the variables.
• In principal components analysis, the components are calculated as linear combinations
of the original variables. In factor analysis, the original variables are defined as linear
combinations of the factors.
• Principal component analysis is used to reduce the data to a smaller number of components,
but factor analysis is used to understand what constructs underlie the data.
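The second difference can be written out as a pair of equations. The symbols are illustrative and do not come from the assignment: x_1, ..., x_p are the observed variables, z_j a principal component with weights a_{jk}, f_1, ..., f_m the common factors with loadings \lambda_{ik}, and \varepsilon_i the part of x_i not shared with the other variables.

```latex
% PCA: each component is a linear combination of the observed variables.
z_j = a_{j1} x_1 + a_{j2} x_2 + \dots + a_{jp} x_p, \qquad j = 1, \dots, p

% Factor analysis: each observed variable is a linear combination of
% m < p common factors plus a term unique to that variable.
x_i = \lambda_{i1} f_1 + \lambda_{i2} f_2 + \dots + \lambda_{im} f_m + \varepsilon_i, \qquad i = 1, \dots, p
```

The direction of the relationship is reversed between the two: components are built from the variables, while variables are modelled as being built from the factors.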
CLUSTER ANALYSIS
Another interdependence technique, cluster analysis is used to group similar items within
a geochemical dataset into clusters. When grouping data into clusters, the aim is for the
items in one cluster to be more similar to each other than they are to items in other
clusters. This is measured in terms of intracluster and intercluster distance.
Intracluster distance looks at the distance between data points within one cluster. This
should be small.
Intercluster distance looks at the distance between data points in different clusters. This
should be large; that is, the intercluster distances are maximized. Cluster analysis helps you to
understand how the data in your sample are distributed and to find patterns.
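The two distances described above can be computed directly. The sketch below assumes Euclidean distance and uses the average over all pairs of points, which is only one of several possible definitions; the two clusters are made-up points, not real samples.

```python
from itertools import combinations
from math import dist

def mean_intracluster_distance(cluster):
    """Average distance between all pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

def mean_intercluster_distance(cluster_a, cluster_b):
    """Average distance between points taken from two different clusters."""
    total = sum(dist(p, q) for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

# Hypothetical samples described by two standardised element concentrations.
cluster_1 = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4)]
cluster_2 = [(2.9, 3.1), (3.2, 2.8), (3.0, 3.3)]

intra_1 = mean_intracluster_distance(cluster_1)
intra_2 = mean_intracluster_distance(cluster_2)
inter = mean_intercluster_distance(cluster_1, cluster_2)
print(f"intracluster: {intra_1:.2f}, {intra_2:.2f}; intercluster: {inter:.2f}")
```

A good grouping shows small intracluster values and a much larger intercluster value, as in this well-separated example.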
SPATIAL DISTRIBUTION PLOTS
A spatial distribution in statistics is a distribution or set of geographic observations representing
the values or behaviour of a particular phenomenon or characteristic across many locations on the
surface of the Earth. A graphical display of a spatial distribution may summarize raw data
directly or may reflect the outcome of a more sophisticated data analysis. A graphical display
that shows the spatial distribution of the raw data directly is called a spatial distribution plot.
Many different aspects of a phenomenon can be shown in a single graphical display by using a
suitable choice of colors to represent differences.
One example of such a display could be observations made to describe the geographic patterns of
features, both physical and human, across the Earth. The information included could be where
units of something are, how many units there are per unit of area, and how sparsely
or densely packed they are.
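The "units per unit of area" idea can be sketched by counting observations in the cells of a regular grid. The coordinates and the 100 m cell size below are assumptions chosen for illustration; in a plot, each cell would then be coloured according to its density.

```python
from collections import Counter

def density_grid(points, cell_size):
    """Count observations per square grid cell and convert to a density."""
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    cell_area = cell_size * cell_size
    return {cell: n / cell_area for cell, n in counts.items()}

# Hypothetical sample locations as (easting, northing) in metres.
sample_points = [
    (10.0, 15.0), (40.0, 30.0), (80.0, 90.0),
    (120.0, 20.0), (130.0, 60.0), (210.0, 250.0),
]

# Density in observations per square metre, on a 100 m grid.
density = density_grid(sample_points, cell_size=100.0)
for cell, value in sorted(density.items()):
    print(f"cell {cell}: {value * 10_000:.1f} samples per hectare")
```

Cells with no observations are simply absent from the result, so sparsely sampled areas show up as gaps.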
Examples are shown in Fig. 4, Fig. 5 and Fig. 6 below.
Figure 4. The spatial distribution of earthquake stress rotations following large subduction zone
earthquakes.
Figure 5. Spatial distribution of heavy metal concentrations surrounding a cement factory and
its effect on Astragalus gossypinus and wheat in Kurdistan Province, Iran.
Figure 6. Spatial distribution of neighborhood-level housing prices and its association with
all-cause mortality in Seoul, Korea (2013–2018): A spatial panel data analysis.