Factor Analysis Nature Mechanism Uses in Social and Management Researches
net/publication/200564629
2 authors, including:
Nimalathasan Balasundaram
University of Jaffna
All content following this page was uploaded by Nimalathasan Balasundaram on 02 June 2014.
Factor Analysis: Nature, Mechanism and Uses in Social and Management Science Research
Abstract
Factor Analysis (FA) attempts to simplify complex and diverse relationships that exist among
a set of observed variables by uncovering common dimensions or factors that link together
the seemingly unrelated variables and consequently provides insight into the significance of
underlying structure of the data. This state of affairs might produce some frustration among
social and management science practitioners and even among some academicians, who find it
difficult to comprehend and interpret the mechanism and results of FA. Therefore the study is
designed to assist those who need to read and comprehend research articles on FA, as well as
* Professor of Management Studies, Department of Management Studies, University of Chittagong.
** Lecturer, Department of Commerce, University of Jaffna, Sri Lanka & Ph.D. Research
Fellow (SAARC), Department of Management Studies, University of Chittagong.
Prelude
Thousands of variables are proposed to explain the complex situation and their
interconnections and interrelationships. In this regard the few basic variables and propositions
correlations among these variables are charted only on presence - absence or rank order
scales. And to take the data on any one variable at face value is questionable in terms of their
unknown interdependencies, masses of qualitative and quantitative variables, and bad data,
many social scientists are turning towards Factor Analysis (FA) to uncover characteristic
FA can simultaneously manage over a hundred variables, compensate for random error and
invalidity, and disentangle complex interrelationships into their major and distinct
diverse and numerous considerations in application. Its technical vocabulary includes new
‘communality’.
Many studies have been conducted in the field of FA, but most of them address the concerns
of western countries. A few abridged studies are available in Bangladesh and other similar
countries, but no detailed study has been found. The authors therefore set out to help fill
this research gap. The study was undertaken to understand the FA
Objectives of the Study
Conceptual Overview
Factor Analysis
FA is a part of the General Linear Model (GLM) family of procedures and bears the same
assumptions as multiple regression, e.g. linear relationships, interval or near-interval data,
latent variables, proper specification including relevant variables and excluding extraneous
ones, lack of high multicollinearity, and multivariate normality. It is useful for the purpose of
testing significance of results of the research. Further, FA is one of the most commonly used
methods for summarizing and reducing data to their significant dimensions in social science research.
phenomena.
inconvenient even with a small number of variables and data, and the varied statistical
packages, viz., Statistical Package for the Social Sciences (SPSS), Analysis of MOment Structures
used in data reduction to identify a small number of factors that explain most of the variances
observed in a much larger number of manifest variables. FA can also be used to generate
hypotheses regarding causal mechanisms or to screen variables for subsequent analysis (e.g.
to identify collinearity prior to performing a linear regression analysis). It is a generic term for
a family of statistical techniques concerned with the reduction of a set of observable variables
in terms of a small number of latent factors. It has been developed primarily for analysing
relationships among a number of measurable entities (such as survey items or test scores).
The underlying assumption of FA is that there exist a number of unobserved independent
variables (or “latent variables”) that account for the correlations among the dependent
variables, such that once the latent variables are held constant the partial correlations among
the dependent variables all become zero. In other words, the latent variables determine the
values of the dependent variables (The University of Texas at Austin, 1995). Each dependent
variable (Y) can be expressed as a
$$Y = \alpha_1 F_1 + \alpha_2 F_2 + \cdots + \alpha_n F_n$$
Where:
$Y$ = Dependent variable
$\alpha_i$ = A constant ($i = 1, \ldots, n$)
$F_i$ = Independent variable ($i = 1, \ldots, n$)
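As a minimal sketch of this assumption (not the authors' procedure), the simulation below builds two dependent variables from a single latent factor using the linear expression above. The unique error term added to each variable is an assumption of this sketch, and the constants 0.8 and 0.7, the 5,000 simulated cases and the helper name `pearson` are all hypothetical. The two variables correlate because they share the factor; once the factor's contribution is removed, the correlation of the remainders is close to zero.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 5000                 # hypothetical number of cases
a1, a2 = 0.8, 0.7        # hypothetical constants (alpha) on one factor F

# Latent factor F and two dependent variables Y1, Y2.
# The unique error term is an assumption added for the simulation.
F = [rng.gauss(0, 1) for _ in range(n)]
Y1 = [a1 * f + math.sqrt(1 - a1 ** 2) * rng.gauss(0, 1) for f in F]
Y2 = [a2 * f + math.sqrt(1 - a2 ** 2) * rng.gauss(0, 1) for f in F]

# Observed correlation: in expectation a1 * a2.
r_observed = pearson(Y1, Y2)

# Hold the factor constant by removing its contribution, then correlate
# what is left of each variable.
E1 = [y - a1 * f for y, f in zip(Y1, F)]
E2 = [y - a2 * f for y, f in zip(Y2, F)]
r_partial = pearson(E1, E2)

print(f"observed r = {r_observed:.3f}, r with F held constant = {r_partial:.3f}")
```

In real data the factor is not observed, so FA works in the opposite direction: it starts from the observed correlations and looks for constants that would reproduce them.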
On the other hand Dillon and Goldstein (1984) pointed out that FA is essentially a method of
meaningful reduction of data. It tries to simplify complex and diverse relationships that exist
among a set of observed variables by uncovering common dimensions or factors that link
together the seemingly unrelated variables and consequently provides insight into the
underlying structure of the data. FA has the ability to produce descriptive summaries of data
matrices, which aid in detecting the presence of meaningful patterns among a set of variables
FA has most frequently been used to study relationships among the attributes. This use of FA
is called ‘R-type’, and is well known in the social sciences. Examples exist in many areas
including intelligence (Guilford, 1967; Thurstone, 1938) and personality (Cattell, Eber, and
Tatsuoka, 1970; Costa and McCrae, 1992). When FA is used to identify relationships among
the entities it is called ‘Q-type’. Q-type FA has been employed with less frequency than R-
type, and is not usually used by the social science researchers. It is of use in situations where
the profile of the attributes (the pattern of scores that an individual obtains over a relevant set
of measures) is of interest rather than focusing upon the analysis or description of a single
variable at a time. The use of FA may be demonstrated in two different data analysis
contexts. In one instance, the data analyst may have no theoretical hypothesis in mind when
using FA and is simply searching for a common structure underlying the data. The use of FA
in this way is called exploratory. On the other hand, FA may also be used in a case where the
data analyst may have some prior theoretical information on the common structure
underlying the data and wishes to confirm the hypothesized structure. The use of FA in this
Most applications of FA have been in psychology and in the social and management sciences.
For example, suppose that information is collected from a wide range of people as to their
occupation, type of education, whether or not they own their own home, and so on. Then one
Recommendations of the Sample Size
A wide range of recommendations regarding sample size in FA has been made. The
recommendations are usually stated in terms of either the minimum sample size (N) for a
particular analysis or the minimum ratio of N to the number of variables, p, i.e. the number of
Gorsuch (1983) recommended five subjects per item, with a minimum of 100 subjects,
regardless of the number of items. Guilford (1954) argued that N ‘should be at least’ 200,
while Cattell (1978) recommended three to six subjects per item, with a minimum of 250.
Comrey and Lee (1992) provided the following guidance in determining the adequacy of
sample size: 100 = poor, 200 = fair, 300 = good, 500 = very good, 1000 or more = excellent.
A more demanding recommendation is that the sample size should ideally be several hundred
(Cureton and D’Agostino, 1983).
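The rules of thumb quoted above can be compared side by side with a short sketch. The function names are hypothetical, and two choices are assumptions of this sketch rather than statements from the cited authors: a sample size that falls between two Comrey and Lee thresholds takes the label of the highest threshold it reaches, and sizes below 100 are reported as falling below the quoted scale.

```python
def comrey_lee_label(n):
    """Label a sample size on the Comrey and Lee (1992) scale quoted in the text.

    Assumption: sizes between two thresholds take the lower label.
    """
    scale = [(1000, "excellent"), (500, "very good"), (300, "good"),
             (200, "fair"), (100, "poor")]
    for threshold, label in scale:
        if n >= threshold:
            return label
    return "below the quoted scale"


def gorsuch_minimum(n_items):
    """Gorsuch (1983): five subjects per item, with a minimum of 100."""
    return max(100, 5 * n_items)


def cattell_minimum(n_items, per_item=3):
    """Cattell (1978): three to six subjects per item, with a minimum of 250."""
    if not 3 <= per_item <= 6:
        raise ValueError("per_item should be between 3 and 6")
    return max(250, per_item * n_items)


# Hypothetical study: 30 questionnaire items answered by 220 respondents.
n_items, n_cases = 30, 220
print(comrey_lee_label(n_cases))               # label on the Comrey and Lee scale
print(n_cases >= gorsuch_minimum(n_items))     # meets Gorsuch's rule?
print(n_cases >= cattell_minimum(n_items, 6))  # meets Cattell's stricter end?
print(n_cases >= 200)                          # meets Guilford's (1954) minimum?
```

Because the recommendations disagree, a given sample may satisfy one rule and fail another, which is why it helps to report which rule was applied.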
Factor model
In practice, there are several factor models which differ in significant respects. A model most
often applied in psychology is called “common FA”. Indeed, psychologists usually reserve
the term “FA” for just this model. Common FA is concerned with defining the patterns of
contrast, another factor model called “component FA” is concerned with covering all the
variations in a set of variables, whether common or unique. Other factor models are “image
analysis”, “canonical analysis”, and “alpha analysis”. Image analysis has the same purpose
as common FA, but with more elegant mathematical properties. Canonical analysis defines
common factors for a sample of cases that are the best estimates of those for the population; it
enables test of significance. Alpha analysis defines common factors for a sample of variables
Factor loadings
The factor loadings, also called component loadings in Principal Component Analysis (PCA),
are the correlation coefficients between the variables (rows) and factors (columns). PCA is
the commonly used method for grouping the variables under a few unrelated factors. Variables
with a factor loading higher than 0.5 are grouped under a factor. A factor loading is the
correlation between the original variable and the specific factor and the key to understanding
Communality
by the factors. It is the amount of variance an original variable shares with all other variables
included in the analysis. A relatively higher communality indicates that a variable has much
in common with the other variables taken as a group (Islam and Mamun, 2005). Further, the
communality measures the percent of variance in a given variable explained by all the
The eigenvalue for a given factor measures the variance in all the variables which is
accounted for by that factor. The ratio of eigenvalues is the ratio of explanatory importance
of the factors with respect to the variables. If a factor has a low eigenvalue, then it is
contributing little to the explanation of variances in the variables and may be ignored as
redundant as compared to more important factors. Perhaps the most frequently used
extraction approach is the “root greater than one” criterion. Originally suggested by Kaiser
(1958), this criterion retains those components whose eigenvalues are greater than 1. The
rationale for this criterion is that any component should account for more “variance” than any
Scree plots
Scree plots are formed by plotting the number of factors against their respective eigenvalues
(Hackett and Foxall, 1999). It is a graph of the eigenvalues against all the factors. The graph
According to Figure 1, the plot looks like the side of a mountain, and "scree" refers to the
debris fallen from a mountain and lying at its base. Therefore, the scree plot proposes to stop
analysis at the point the mountain ends and the debris (error) begins. In this instance, that
Factor extraction
There are several ways to conduct FA (e.g., principal components; unweighted least squares;
generalized least squares; maximum likelihood; principal axis factoring; alpha factoring;
image factoring) and a choice of input matrix (i.e., a correlation matrix or a covariance
matrix).
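Whichever extraction method is chosen, its output is a matrix of loadings, and the quantities defined in the preceding sections can be read off that matrix. The sketch below assumes orthogonal factors and uses a hypothetical loading matrix for four variables on two factors; the function names are also hypothetical. Communality is taken as the row sum of squared loadings, and the variance accounted for by each factor as the column sum of squared loadings, which equals the eigenvalue for unrotated principal components of a correlation matrix.

```python
# Hypothetical loadings of four variables (rows) on two orthogonal factors (columns).
loadings = {
    "V1": (0.85, 0.10),
    "V2": (0.80, 0.20),
    "V3": (0.15, 0.90),
    "V4": (0.30, 0.40),
}


def communalities(loadings):
    """Variance of each variable explained by all factors: row sum of squares."""
    return {name: sum(a * a for a in row) for name, row in loadings.items()}


def eigenvalues(loadings):
    """Variance accounted for by each factor: column sum of squares."""
    n_factors = len(next(iter(loadings.values())))
    return [sum(row[k] ** 2 for row in loadings.values()) for k in range(n_factors)]


def percent_of_variance(eigs, n_variables):
    """Share of total variance per factor, for standardized variables."""
    return [100.0 * e / n_variables for e in eigs]


def group_variables(loadings, cutoff=0.5):
    """Assign each variable to the factor on which it loads above the cutoff."""
    groups = {}
    for name, row in loadings.items():
        k = max(range(len(row)), key=lambda j: abs(row[j]))
        if abs(row[k]) > cutoff:
            groups.setdefault(k, []).append(name)
    return groups


h2 = communalities(loadings)
eigs = eigenvalues(loadings)
retained = [k for k, e in enumerate(eigs) if e > 1.0]  # "root greater than one"
print(h2)
print(eigs, percent_of_variance(eigs, len(loadings)))
print(retained, group_variables(loadings))
```

In this made-up example V4 loads below 0.5 on both factors, so it is not grouped under either, and its low communality shows that the two factors explain little of its variance.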
Factor rotation
The interpretability of factors can be improved through rotation. Rotation maximizes the
loading of each variable on one of the extracted factors whilst minimizing the loading on all
other factors. Rotation works through changing the absolute values of the variables whilst
keeping their differential values constant. Varimax, quartimax and equimax are the variant
The varimax method is the most popular among these techniques and is often used to make
principal components analysis (PCA). The procedure seeks to rotate factors so that the
variation of the squared factor loadings for a given factor is made large. The exact choice of
rotation depends largely on whether or not the researcher should choose one of the orthogonal
orthogonal alternative which minimizes the number of factors needed to explain each
variable. This type of rotation often generates a general factor on which most variables are
loaded to a high or medium degree. Such factor structure is usually not helpful for the
research purpose. Finally, the equimax method attempts to achieve simple structure with
respect to both the rows and columns of the factor loading matrix.
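The varimax idea can be illustrated for the simplest case of two factors, where the rotation angle that maximizes the variance of the squared loadings has a closed form. The sketch below is a minimal illustration under stated assumptions: it uses raw varimax without Kaiser normalization, handles only two factors, and the loading values and function names are hypothetical. Statistical packages generally iterate over pairs of factors and may normalize the rows first, so their output can differ.

```python
import math


def varimax_criterion(loadings):
    """Sum over the two factors of the variance of the squared loadings."""
    n = len(loadings)
    total = 0.0
    for k in range(2):
        squared = [row[k] ** 2 for row in loadings]
        mean = sum(squared) / n
        total += sum((s - mean) ** 2 for s in squared) / n
    return total


def varimax_two_factors(loadings):
    """Rotate a two-column loading matrix to maximize the raw varimax criterion."""
    n = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2 * x * y for x, y in loadings]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2 * sum(ui * vi for ui, vi in zip(u, v))
    # The criterion varies with cos(4*phi) and sin(4*phi); atan2 picks the maximum.
    phi = math.atan2(d - 2 * a * b / n, c - (a * a - b * b) / n) / 4
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    return [(x * cos_p + y * sin_p, -x * sin_p + y * cos_p) for x, y in loadings]


# Hypothetical simple structure, deliberately blurred by a 30-degree rotation.
simple = [(0.8, 0.0), (0.7, 0.0), (0.0, 0.6), (0.0, 0.9)]
t = math.radians(30)
unrotated = [(x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))
             for x, y in simple]

rotated = varimax_two_factors(unrotated)
for before, after in zip(unrotated, rotated):
    print([round(value, 3) for value in before], "->",
          [round(value, 3) for value in after])
```

Rotation leaves each variable's communality unchanged; it only redistributes the explained variance between the factors so that each variable loads mainly on one of them.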
Oblimax, quartimin, covarimin, biquartimin, and oblimin are oblique rotation methods.
Oblimax seeks to rotate the factors so that the numbers of high and low loadings are
increased by decreasing those in the middle range; quartimin minimizes the sum of inner
products of the (reference) structure loadings; covarimin is the varimax analog of the oblique
quartimin and covarimin methods; and oblimin is similar to the biquartimin method in that it
Factor scores
The methods of principal components and FA are both data reduction techniques.
Consequently, the researcher may want to calculate the projection of each observation on
each of the factors. Factor scores give the location of each observation in the space of the
common factors.
Statistical validity
not the distribution of value is adequate for conducting FA. A measure of >0.9 is marvellous,
>0.8 is meritorious, >0.7 is middling, >0.6 is mediocre, >0.5 is miserable and <0.5 is
indicates that the data DO NOT produce an identity matrix and are thus appropriately
Norusis (1993) described the process of FA as follows. The first step in FA is to
produce a correlation matrix for all variables. Variables that do not appear to be related to
other variables can be identified from this matrix. The number of factors necessary to
represent the data and the method for calculating them must then be determined. Principal
Component Analysis (PCA) is the most widely used method of extracting factors. In PCA,
linear combinations of variables are formed. The first principal component is that which
accounts for the largest amount of variance in the sample, the second principal component is
that which accounts for the next largest amount of variance and is not correlated with the first
data. Next coefficients called ‘factor loadings’ that relate variables to identified factors are
calculated. Factor models are then often ‘rotated’ to ensure that each factor has non-zero
loadings for only some of the variables. Rotation makes the factor matrix more interpretable.
Following rotation, scores for each factor can be computed for each case in a sample. These
Research Methodology
Sampling design
The sample was derived from the Bangladesh Garment Manufacturers and Exporters Association
(BGMEA). Twenty-five ready-made garment (RMG) entrepreneurs in Chittagong were selected by
the convenience sampling method.
Data collection
Primary and secondary data were used for the study. Primary data were collected through a
written questionnaire administered by direct personal interview. The secondary data
Measures
The questionnaire was administered to RMG entrepreneurs in Chittagong port city. A
seven-point Likert-type summated rating scale, ranging from strongly disagree (-3) to
The present study used FA with varimax rotation to analyse the data collected. In order to
obtain interpretable characteristics and simple structure solutions, the initial factor matrices
were subjected to varimax rotation procedures (Kaiser, 1958). The varimax rotated factor
matrix provides orthogonal common factors. Finally, the indicators were ranked on the basis
of factor scores.
The reliability value of our surveyed data was 0.787 for characteristics. Whether we compare
this value with the standard alpha value of 0.7 advocated by Cronbach (1951), with a more
accurate recommendation (Nunnally and Bernstein, 1994), or with the standard value of 0.6
recommended by Bagozzi and Yi (1988), we find that the scales used are sufficiently
reliable for data analysis. Regarding validity, the Kaiser-Meyer-Olkin (KMO) measure of
Sampling Adequacy is a measure of whether or not the distribution of values is adequate for
conducting FA. As per the KMO measure, a value of >0.9 is marvellous, >0.8 is meritorious,
According to Table 1, the data returned a sampling adequacy value of 0.772, indicating
middling adequacy. Bartlett’s test of Sphericity is described as a measure of the multivariate
normality of the set of distributions; strictly, it assumes multivariate normality and tests
whether the correlation matrix analysed in the FA is an identity matrix. FA would be
meaningless with an identity matrix. A significance value <0.05 indicates that the data do not
produce an identity matrix and are thus treated as approximately multivariate normal and
acceptable for FA (George and Mallery, 2003). The data within this study returned a
significance value of 0.000 (i.e., p < 0.001), indicating that the data were acceptable for FA.
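The two validity checks described above can be sketched in a few lines. The verbal KMO labels follow the cut-offs quoted in the text; the label for values below 0.5 is not recoverable from the text, so the sketch simply reports "below 0.5". The Bartlett statistic uses the standard chi-square approximation based on the determinant of the correlation matrix. The 3 x 3 correlation matrix, the 25 cases and the function names are hypothetical and are not the study's data.

```python
import math


def kmo_label(kmo):
    """Verbal label for a KMO value, using the cut-offs quoted in the text."""
    for cutoff, label in [(0.9, "marvellous"), (0.8, "meritorious"),
                          (0.7, "middling"), (0.6, "mediocre"), (0.5, "miserable")]:
        if kmo > cutoff:
            return label
    return "below 0.5"


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    size = len(m)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def bartlett_sphericity(corr, n_cases):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = len(corr)
    chi_square = -(n_cases - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    return chi_square, p * (p - 1) // 2


# Hypothetical correlation matrix for three variables measured on 25 cases.
corr = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.4],
        [0.3, 0.4, 1.0]]
chi_square, df = bartlett_sphericity(corr, 25)
print(kmo_label(0.772))
print(round(chi_square, 2), df)  # compare with the 0.05 critical value for df = 3 (7.815)
```

An identity matrix has a determinant of 1, so its statistic is zero; the more the variables intercorrelate, the smaller the determinant and the larger the statistic.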
When the original ten characteristics were analysed by Principal Component Analysis
(PCA) with varimax rotation, three groups of characteristics were extracted with eigenvalues
of ≥1, which together explained 80.862 percent of the total variance. The result of the FA is
presented in Table 2. The factor loadings ranged from 0.956 to 0.674. The higher a factor
loading, the more strongly the variable reflects or measures the characteristic. The
characteristic with the highest loading gives its name to each group of characteristics, e.g.
risk taking is the title of characteristics group I, and so on. Further, the present study has
interpreted the characteristics loaded by variables having significant loadings of the
magnitudes of 0.50 and
Table 2: Principal Component Analysis – Varimax Rotation of Characteristics
Characteristics
Name of the characteristics Characteristics group – I Characteristics group - II Characteristics group -III
Persistence .887
contract
Explained
characteristics with factor loadings ranging from .893 to .674. They were risk taking;
information seeking; persistence; systematic planning; commitment to work contract;
persuasion; self confidence and goal setting. This characteristic accounted for 56.480% of the
rated variance.
Characteristics group II: Demand for work contract – One characteristic, with a loading of
.956, belonged to this group. It explained 13.068% of the rated variance.
Characteristics group III: Opportunity Seeking –Only one characteristic with .936, it
Ranking of the above characteristics in order of their importance, along with factor score, is
contract’ and ‘Risk taking’ got the ranks of first, second and third respectively and constitute
Uses of FA in Social and Management Science Research
The following are the applications of FA relevant to various scientific and policy concerns.
relationships into their separate patterns. Each pattern will appear as a factor delineating a
economical description.
revolution, liberal voting, and authoritarianism. It can be used to classify nation profiles into
Scaling: The scale may refer to such phenomena as political participation, voting behaviour,
variation (factors). Each factor then represents a scale based on the empirical relationship
personality, group, social behaviour, voting, and conflict. Since the meaning usually
characteristics or behaviour, FA may be used to test for their empirical existence. Which
characteristics or behaviour should, by theory, be related to which dimensions can be
postulated in advance and statistical tests of significance can be applied to the FA results.
Besides those relating to dimensions, there are other kinds of hypotheses that may be tested
e.g. if the concern is with a relationship between economic development and instability,
holding other things constant, a FA can be done of economic and instability variables along
with other variables that may affect (hide, mediate, depress) their relationship. The resulting
factors can be so defined (rotated) that the first several factors involve the mediating
measures (to the maximum allowed by the empirical relationships). A remaining independent
factor can be calculated to best define the postulated relationships between the economic and
instability measures. The magnitude of involvement of both variables in this pattern enables
the scientist to see whether an economic development instability pattern actually exists when
Data transformation: FA can be used to transform data to meet the assumptions of other
techniques. For example, the multiple regression technique assumes (if tests of
significance are to be applied to the regression coefficients) that the predictors, the so-called
independent variables, are statistically unrelated. If the predictor variables are correlated in
uncorrelated factor scores. The scores may be used in the regression analysis in place of the
original variables, with the knowledge that the meaningful variation in the original data has
not been lost. Likewise, a large number of dependent variables can also be reduced through
FA.
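The data transformation use can be illustrated with the smallest possible case, two correlated predictors. The sketch below is not the authors' procedure: it uses principal component scores rather than common factor scores, and it relies on the fact that for two standardized variables the components of the correlation matrix are their scaled sum and difference. The simulated data and helper names are hypothetical.

```python
import math
import random


def standardize(values):
    """Rescale a sequence to mean 0 and standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 500                 # hypothetical number of cases

# Two hypothetical predictors that are strongly correlated with each other.
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x1]

# Component scores for two standardized variables: scaled sum and difference.
z1, z2 = standardize(x1), standardize(x2)
score1 = [(a + b) / math.sqrt(2) for a, b in zip(z1, z2)]
score2 = [(a - b) / math.sqrt(2) for a, b in zip(z1, z2)]

print(f"correlation of original predictors: {pearson(x1, x2):.3f}")
print(f"correlation of component scores:    {pearson(score1, score2):.3f}")
```

The two score columns carry the same information as the original pair but are uncorrelated, so they can stand in for the predictors in a regression whose significance tests assume unrelated predictors.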
Exploration: The unknown domain may be explored through FA. It can reduce complex
Mapping: Besides facilitating exploration FA also enables a scientist to map the social
environment. Mapping means the systematic attempt to chart major empirical concepts and
sources of variation.
Conclusion
FA refers to a collection of statistical methods for reducing correlated data into a smaller
number of dimensions or factors. Often in the social or management sciences, indeed in many
relevant measures from that domain. This paper focused primarily on the approaches to FA
from the perspective of its nature, mechanism and uses in the social and management sciences.
Bagozzi R.P., and Yi, Y. (1988). “On the Evaluation of Structural Equation Models”, Journal
Cattell, R.B., Eber, H.W. and Tatsuoka, M.M. (1970). The Handbook for the Sixteen Personality
Factor Questionnaire. Champaign, IL: Institute for Personality and Ability Testing.
Comrey, A.L. and Lee, H.B. (1992). A first course in Factor Analysis, Hillsdale, New Jersey:
Erlbaum.
Cronbach, L.J. (1951). “Coefficient Alpha and the Internal Structure of Tests”, Psychometrika,
16(3): 297-334.
NJ: Erlbaum.
46-482.
Debasish, S.S.(2004). Exploring Customer Preference for Life Insurance in India: Factor
Dillon and Goldstein. (1984). The Essential of Factor Analysis. New York: Holt, Rinehart
and Winston.
George, D. and Mallery,P.(2003). SPSS for Windows Step by Step: A Simple Guide and
Guilford, J.P. (1954). Psychometric methods, 2nd edition, New York: McGraw Hill.
Gorsuch, R.L. (1983). Factor Analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
Location Specific Values: A Traditional High Street and a Modern Shopping Mall: In Hooley
Islam Nariel, and Mamun Muhammad,Z.(2005). Factors for Not Buying Life Insurance
1 &2:31
Kaiser, H.F. (1958). “The Varimax Criterion for Analytic Rotation in Factor Analysis”,
Psychometrika, 23: 187-200.
Norusis (1993) as cited in Helen C.L and Steve,R.(---). Sample size in factor analysis: why
Nunnally, J.C. and Bernstein, I.H. (1994). Psychometric Theory, McGraw-Hill, New York,
Pal, Y. (1986). A Theoretical study of Some Factor Analysis Problems and Pal,Y. and Bagai,
O.P. (1987). A Common Factor Battery Reliability Approach to Determine the Number of
Interpretable Factors”, a paper presented at the IX Annual Conference of the Indian Society
The University of Texas at Austin (1995). Factor Analysis Using SAS PROC FACTOR,