Interpreting loading plots

Recall that the loadings plot is a plot of the direction vectors that define the model.
Returning to a previous illustration:

In this system the first component, p1, is oriented primarily in the x2 direction, with
smaller contributions from the other directions. A loadings plot would show a large
coefficient (negative or positive) for the x2 variable and smaller coefficients for the
others. Imagine this were the only component in the model, i.e. a one-component
model. We would then correctly conclude that the other measured variables have
little importance or relevance in understanding the total variability in the system.
Say these 3 variables represented the quality of our product, and we had been
getting complaints about its variability. This model indicates we should focus on
whatever aspect causes the variance in x2, rather than on the other variables.
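To make this concrete, here is a minimal sketch in Python (a hypothetical simulation using numpy and scikit-learn, not a dataset from the text) that fits a one-component PCA model to such a system and prints the loading vector:

```python
# Minimal sketch: simulate a 3-variable system where x2 dominates the
# variability, then inspect the loadings of a one-component PCA model.
# All scales and coefficients below are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(scale=3.0, size=n)              # dominant source of variance
x1 = 0.2 * x2 + rng.normal(scale=0.5, size=n)   # small, partly related signal
x3 = rng.normal(scale=0.5, size=n)              # mostly noise
X = np.column_stack([x1, x2, x3])

pca = PCA(n_components=1).fit(X)                # sklearn mean-centers internally
print(pca.components_[0])                       # p1: large |weight| on x2,
                                                # small weights on x1 and x3
```

The printed p1 should have a coefficient near ±1 for x2, mirroring the loadings plot described above.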

Let’s consider another visual example where two variables, x1 and x2, are the
predominant directions in which the observations vary, and the x3 variable is only
“noise”. Further, let x1 and x2 be negatively correlated. A model of such a system
would have a loading vector with roughly equal weight in the +x1 direction as in
the −x2 direction. The direction could be represented as p1=[+1,−1,0], or rescaled
as a unit vector: p1=[+0.707,−0.707,0]. An equivalent representation, with exactly
the same interpretation, would be p1=[−0.707,+0.707,0].
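A similar sketch (again a simulation with arbitrary noise levels, not data from the text) recovers this loading; note that the overall sign of the fitted vector is arbitrary:

```python
# Sketch: x1 and x2 negatively correlated, x3 pure noise. The fitted
# first loading should be close to [+0.707, -0.707, 0], up to a sign flip.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n = 500
t = rng.normal(scale=2.0, size=n)               # shared latent driver
x1 = t + rng.normal(scale=0.2, size=n)
x2 = -t + rng.normal(scale=0.2, size=n)         # negatively correlated with x1
x3 = rng.normal(scale=0.2, size=n)              # noise only
X = np.column_stack([x1, x2, x3])

p1 = PCA(n_components=1).fit(X).components_[0]
print(p1)   # approximately [0.707, -0.707, 0] or its negative
```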

This example illustrates three points:

•  Variables which have little contribution to a direction have almost zero weight in
that loading.

•  Strongly correlated variables will have approximately the same weight value
when they are positively correlated. In a loadings plot of pi vs pj they will appear
near each other, while negatively correlated variables will appear diagonally
opposite each other.

•  The signs of the loading weights are useful for comparing variables within a
direction vector; however, the entire vector can be rotated by 180° (all signs
flipped) and still carry the same interpretation.

This is why they are called loadings: they show how the original variables load
onto (i.e., contribute to) the component.

Another issue to consider is the case when one has many highly correlated
variables. Consider the room temperature example, where the four temperatures are
highly correlated with each other. The first component from the PCA model is shown
here:
Notice how the model spreads the weights out evenly over all the correlated
variables. Each variable is individually important. The model could well have
assigned a weight of 1.0 to one of the variables and 0.0 to the others, but it did
not: a common feature of latent variable models is that variables which have
roughly equal influence on defining a direction are correlated with each other and
will have roughly equal numeric weights.
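This behaviour is easy to reproduce with simulated data; the four “temperature” readings below are illustrative stand-ins for the room temperature example, not the actual dataset:

```python
# Sketch: four highly correlated "temperature" variables sharing one
# underlying signal. The first loading spreads its weight roughly evenly.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
n = 500
room = 20.0 + 2.0 * rng.normal(size=n)          # shared room-temperature signal
temps = np.column_stack(
    [room + 0.1 * rng.normal(size=n) for _ in range(4)]
)

p1 = PCA(n_components=1).fit(temps).components_[0]
print(p1)   # each weight close to +/-0.5 = 1/sqrt(4): evenly spread
```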

Finally, one way to locate unimportant variables in the model is to find the
variables which have small weights in all components. These variables can generally
be removed, as they show little correlation with any of the components or with the
other variables.
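One possible way to automate this screening is sketched below; the function name small_weight_variables and the 0.1 threshold are arbitrary choices for illustration, not from the text:

```python
# Sketch: flag variables whose loading weights are small in every one of
# the A retained components. Threshold and data are illustrative only.
import numpy as np
from sklearn.decomposition import PCA

def small_weight_variables(X, A, threshold=0.1):
    """Indices of variables with |weight| < threshold in all A components."""
    P = PCA(n_components=A).fit(X).components_   # shape (A, K); row a is p_a
    max_abs_weight = np.abs(P).max(axis=0)       # largest |weight| per variable
    return np.where(max_abs_weight < threshold)[0]

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
X[:, 0] = X[:, 1] + 0.1 * rng.normal(size=200)  # variables 0 and 1 correlate
X[:, 3] = 0.01 * rng.normal(size=200)           # variable 3: near-constant noise
print(small_weight_variables(X, A=2))           # expected to flag variable 3
```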
