BRM Report
PROJECT
Using Factor Analysis to find the attributes for
successful career growth in the software industry.
- Submitted by
- Varuntej Jainapur PGP/0013/04
- Praveen Piridi PGP/0024/04
- Abhishek Panda PGP/0040/04
- Priyanshu Raj PGP/0043/04
- Navneet Kumar Singh PGP/0053/04
- Prajot Telang PGP/0069/04
Factor Analysis:
Factor analysis is a statistical technique used to transform the original correlated
variables into a set of uncorrelated variables.
The main focus of factor analysis is to summarize the information contained in a
large number of variables into a small number of factors.
PROBLEM STATEMENT:
Using Factor Analysis to find the attributes for successful career growth in the software
industry.
We aim to find the underlying factors that govern career growth in the software
industry, and factor analysis is the method we use to arrive at them. Factor analysis is a
correlational method used to find and describe the underlying factors driving data values
for a large set of variables.
QUALITATIVE ANALYSIS:
We began with general open-ended questions put to software professionals and people
inclined towards the IT sector, and came up with a list of possible variables. Upon further
exploratory research, we identified the gaps and narrowed our analysis to the 11 variables
mentioned below, moving towards greater precision as we went. The variables were not
defined in advance; preliminary analysis was an inherent part of the data collection.
Age
Gender
Marital Status
Graduation/Post Graduation in Computer Science
Hot Skills Possession (AI/Data Analytics/ML)
Relevant Certifications
Significance of First Job (Role/Salary/Company)
Onsite Opportunities
Open Source Contribution
Location Preference
Macroeconomic/ Govt. Policies
Active Presence on Linkedin
Niche Skill (over broad base of programming languages)
Intelligence Quotient
QUESTIONNAIRE AND DATA COLLECTION:
The questionnaire has been designed to ask about the relevance of the 11 major variables
which we consider to have an impact on career growth in the software industry. It takes
little time to complete, which keeps the respondents interested. We have used a
Multiple Choice Grid with the variables in the rows and the corresponding ratings in the
columns, from 1 (Least Preferred) to 5 (Most Preferred).
The questions included cover some of the primary factors which have been formulated after
multiple discussions among people working in the IT industry. One of our group members
interacted with his ex-colleagues and other relatives working in the IT industry to come up
with the variables.
We used stratified random sampling, since it divides the elements of the population into
small subgroups (strata) based on similarity, in such a way that the elements within a group
are homogeneous and the groups are heterogeneous with respect to one another. The
elements are then randomly selected from each of these strata. We considered only
pursuing/prospective MBA students and IT professionals. The link to the questionnaire was
shared on platforms like CAT groups on Facebook and Pagalguy, and with people working in
the IT industry.
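The sampling idea described above (split the population into homogeneous strata, then draw at random within each stratum) can be sketched as follows. This is only an illustration: the sampling frame, the two stratum labels, the per-stratum sample size and the function name `stratified_sample` are all assumptions, since the report does not state how the strata were sized.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Draw up to `per_stratum` members at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = {}
    for name, members in strata.items():
        k = min(per_stratum, len(members))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical sampling frame: (respondent id, group) pairs.
frame = [(i, "MBA student" if i % 3 else "IT professional") for i in range(30)]
picked = stratified_sample(frame, stratum_of=lambda p: p[1], per_stratum=5)
for group, members in picked.items():
    print(group, [pid for pid, _ in members])
```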
Data was collected through the following channels:
• Internet (Floating a Google Form with the relevant questions – Facebook/Pagalguy)
• Telephone Surveys (One of our group members called his ex-colleagues)
Here is the data collected from the respondents.
We use factor analysis to meet the stated objective. Factor analysis is usually
done in order to group the variables into underlying factors according to how
strongly they depend on one another.
Figure (a)
Figure (a) is a simple table that shows the descriptive statistics for the variables under
study. In this figure, the second column shows the mean value of each variable across the 56
participants, the third column shows the standard deviation (the degree of variability in the
scores for each item), and the fourth column shows the number of observations (sample size).
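The three columns described for Figure (a) can be reproduced with a few lines of code. The ratings below are hypothetical placeholders, not the responses collected for this report, and `describe` is a helper name introduced only for this sketch.

```python
from statistics import mean, stdev

def describe(values):
    """Return (mean, sample standard deviation, N) for one variable."""
    return mean(values), stdev(values), len(values)

# Hypothetical 1-5 ratings; NOT the responses collected for the report.
ratings = {
    "Relevant Certifications": [4, 5, 3, 4, 2, 5],
    "Onsite Opportunities": [3, 3, 4, 2, 5, 4],
}

print(f"{'Variable':<26}{'Mean':>6}{'Std. Dev.':>11}{'N':>4}")
for name, values in ratings.items():
    m, sd, n = describe(values)
    print(f"{name:<26}{m:>6.2f}{sd:>11.3f}{n:>4}")
```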
Figure (b)
Figure (b) is the SPSS-produced correlation matrix for the descriptor variables. This is the
initial stage of factor analysis, and it gives some initial clues about the patterns among
the variables. It is important to note that for factor analysis to be appropriate, the
variables must be correlated. Figure (b) shows that a few variables are correlated with each
other and can therefore form factors.
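A correlation matrix like the one in Figure (b) holds the Pearson correlation between every pair of variables. A minimal sketch follows, using made-up ratings and shortened variable labels (both are assumptions for illustration only).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Pairwise Pearson correlations for a list of data columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Hypothetical ratings, one list per variable (not the report's data).
data = {
    "Hot Skills": [5, 4, 4, 2, 3, 5],
    "Certifications": [4, 4, 5, 1, 3, 5],
    "Location Preference": [2, 3, 1, 4, 3, 2],
}
names = list(data)
matrix = correlation_matrix([data[n] for n in names])
for name, row in zip(names, matrix):
    print(f"{name:<20}" + "".join(f"{r:>8.3f}" for r in row))
```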
Figure (c)
Figure (c) shows two important statistics for judging the appropriateness of a factor model:
the KMO measure of sampling adequacy and Bartlett’s test of sphericity. A high value of
the KMO statistic (from 0.5 to 1) indicates that factor analysis is appropriate. Kaiser has
presented the range as follows: >0.9 is marvelous, >0.8 meritorious, >0.7 middling, >0.6
mediocre, >0.5 miserable and <0.5 unacceptable. Figure (c) shows that the KMO statistic is
computed as 0.651, which places the value in the acceptance region for factor
analysis.
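For readers who want to see where a KMO value comes from, the sketch below computes the overall KMO statistic from a correlation matrix. It compares the squared correlations with the squared partial correlations, which are obtained from the inverse of the correlation matrix. The 3x3 matrix is a made-up example, not the matrix in Figure (b), and the helper names `invert` and `kmo` are introduced here only for illustration.

```python
from math import sqrt

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Overall KMO: sum r^2 / (sum r^2 + sum partial^2) over off-diagonal pairs."""
    n = len(corr)
    inv = invert(corr)
    r2 = a2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                a2 += partial ** 2
    return r2 / (r2 + a2)

# Hypothetical correlation matrix for three variables.
example = [[1.0, 0.6, 0.5],
           [0.6, 1.0, 0.4],
           [0.5, 0.4, 1.0]]
print(f"KMO = {kmo(example):.3f}")
```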
Bartlett’s test of sphericity tests the hypothesis that the population correlation matrix is
an identity matrix. If it were an identity matrix, the variables would be uncorrelated and
the correctness of the factor analysis would be under suspicion. A significance value less
than 0.05 indicates that the data in hand do not produce an identity matrix. Figure (c)
shows that the value is significant at the 0.01 level.
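The test statistic behind Bartlett’s test can be sketched as below, using the standard formula chi-square = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom, where n is the number of respondents, p the number of variables and |R| the determinant of the correlation matrix. The correlation matrix here is hypothetical; only the sample size of 56 is taken from the report. The significance value would then come from the chi-square distribution (for example via `scipy.stats.chi2.sf`, if SciPy is available), which this sketch leaves out.

```python
from math import log

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return det

def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * log(determinant(corr))
    dof = p * (p - 1) // 2
    return chi_square, dof

# Hypothetical correlation matrix; 56 is the sample size stated in the report.
example = [[1.0, 0.6, 0.5],
           [0.6, 1.0, 0.4],
           [0.5, 0.4, 1.0]]
chi_square, dof = bartlett_sphericity(example, n_obs=56)
print(f"chi-square = {chi_square:.2f}, df = {dof}")
```

With the report's 11 rated variables, the same formula gives 55 degrees of freedom.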
Both the results, KMO statistic and Bartlett’s test of sphericity, indicate an appropriate
factor analysis model.
Figure (d)
Figure (d) shows the initial and extracted communalities. The communalities describe the
amount of variance a variable shares with all other variables in the study. A relatively
small communality suggests that the concerned variable is a misfit for the
factor solution and can (should) be dropped from the factor analysis. The extracted
communalities shown in Figure (d) are the estimates of the variance in each variable that
can be attributed to the factors in the factor solution.
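An extracted communality is the sum of a variable's squared loadings on the retained factors. A small sketch with invented loadings follows (these are not the values from Figure (d) or Figure (g), and the 0.5 cut-off is only a commonly used rule of thumb, not something the report states).

```python
def communalities(loadings):
    """Sum of squared loadings per variable across the retained factors."""
    return {name: sum(l * l for l in row) for name, row in loadings.items()}

# Hypothetical loadings on three retained factors.
loadings = {
    "Relevant Certifications": [0.78, 0.15, 0.10],
    "Onsite Opportunities": [0.12, 0.74, 0.05],
    "Location Preference": [0.20, 0.31, 0.18],
}

for name, h2 in communalities(loadings).items():
    flag = "  <- low, candidate for removal" if h2 < 0.5 else ""
    print(f"{name:<26}{h2:.3f}{flag}")
```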
Decision regarding the Number of factors to be retained in the final
solution:
Figure (e)
Figure (e) presents the initial eigenvalues (total, % of variance, and cumulative %) and the
extraction sums of squared loadings (total, % of variance, and cumulative %).
Eigenvalue: An eigenvalue is the amount of variance in the variables taken for the study that
is associated with a factor. According to the eigenvalue criterion, the factors with an
eigenvalue greater than one are included in the model.
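The columns of a total-variance table and the eigenvalue criterion can be sketched as follows. The eigenvalues are invented for illustration (they sum to 11 so that they are consistent with 11 standardized variables) and are not the values in Figure (e).

```python
def variance_table(eigenvalues):
    """Rows of (component, eigenvalue, % of variance, cumulative %, retained?)."""
    total = sum(eigenvalues)  # for a correlation matrix this equals the number of variables
    rows, cumulative = [], 0.0
    for k, ev in enumerate(eigenvalues, start=1):
        pct = 100 * ev / total
        cumulative += pct
        rows.append((k, ev, pct, cumulative, ev > 1))  # retain if eigenvalue > 1
    return rows

# Hypothetical eigenvalues in order of extraction.
eigenvalues = [3.2, 2.1, 1.4, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.1]
rows = variance_table(eigenvalues)
for k, ev, pct, cum, keep in rows:
    print(f"{k:>2}  {ev:>5.2f}  {pct:>6.2f}  {cum:>6.2f}  {'retain' if keep else ''}")
print("factors retained:", sum(1 for row in rows if row[4]))
```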
Figure (f)
Scree plot: A scree plot is a plot of the eigenvalues against the components (factors) in
order of extraction. The shape of the plot is used to determine the optimal number of factors
to be retained in the final solution. The plot looks like the intersection of two lines: the
factors on the steep slope should be retained, and the factors on the shallow slope can be
excluded.
Figures (e) and (f) show that the optimal number of factors to be retained in the final
solution is 3.
Figure (g)
Figure (g) shows the component matrix, which presents the factor loading of each
variable on the unrotated factors (components). As can be seen from the figure, each entry
under a component (1, 2 or 3) represents the correlation between a variable and that
unrotated factor.
Figure (h)
Figure (h) shows the reproduced correlation and residuals for the factor analysis solution.
Factor rotation for enhancing the Interpretability of the Solution:
After the factors have been selected, the next step is to rotate them. A rotated solution
with simple structure is easy to interpret, whereas the original unrotated solution is
difficult to interpret.
Figure (i)
Figure (i) presents the rotated component matrix; its counterpart under an oblique rotation
is often referred to as the “pattern matrix”. The columns in this figure give the factor
loading of each variable on the concerned factor after rotation.
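To see why rotation helps interpretation, the sketch below rotates a set of two-factor loadings by a fixed angle. This is a plain orthogonal rotation with a hand-picked angle, not the rotation procedure used for Figure (i); rotation criteria such as varimax choose the rotation automatically. The loadings are invented. Note that each variable's communality is unchanged by the rotation, while the loadings move towards a simpler pattern.

```python
from math import cos, sin, radians

def rotate(loadings, degrees):
    """Rotate two-factor loadings by a fixed angle (an orthogonal rotation)."""
    t = radians(degrees)
    return [(a * cos(t) + b * sin(t), -a * sin(t) + b * cos(t)) for a, b in loadings]

# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [(0.7, 0.5), (0.6, 0.6), (0.6, -0.5), (0.7, -0.6)]
rotated = rotate(unrotated, 45)

for (a, b), (ra, rb) in zip(unrotated, rotated):
    print(f"before ({a:+.2f}, {b:+.2f})  after ({ra:+.2f}, {rb:+.2f})  "
          f"communality {a * a + b * b:.2f} -> {ra * ra + rb * rb:.2f}")
```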
Figure (k)
Figure (k), the component plot in rotated space, helps to identify which variables are the
constituents of each factor.
Substantive Interpretation:
At this stage, Figures (i) and (k), the rotated component matrix and the component plot in
rotated space, provide the opportunity to identify the factors with the concerned variables
as their constituents. A variable is assigned to the component on which it has a high
loading. Figure (i) shows that the variables (Graduation/Post
Graduation in Computer Science, Hot Skills Possession (AI/Data Analytics/ML), Relevant
Certifications & Significance of First Job (Role/Salary/Company)) are highly loaded on
Component 1 or Factor 1. Similarly, the variables (Onsite Opportunities, Open Source
Contribution, Location Preference, Macroeconomic/ Govt. Policies & Active Presence on
Linkedin) are highly loaded on Component 2 or Factor 2, and the variables (Niche Skill (over
broad base of programming languages) & Intelligence Quotient) are highly loaded on
Component 3 or Factor 3.
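The assignment rule used above (each variable goes to the component on which its loading is largest in absolute value) can be sketched as follows. The loadings are illustrative numbers only, not the values in Figure (i), and only four of the variables are shown.

```python
def assign_to_factors(loadings):
    """Group variables by the component with the largest absolute loading."""
    groups = {}
    for name, row in loadings.items():
        best = max(range(len(row)), key=lambda k: abs(row[k]))
        groups.setdefault(best + 1, []).append(name)  # components numbered from 1
    return groups

# Hypothetical rotated loadings on components 1, 2 and 3.
loadings = {
    "Relevant Certifications": [0.81, 0.12, 0.05],
    "Onsite Opportunities": [0.10, 0.77, -0.08],
    "Intelligence Quotient": [0.15, 0.02, 0.84],
    "Hot Skills Possession": [0.74, 0.20, 0.11],
}

for component, names in sorted(assign_to_factors(loadings).items()):
    print(f"Component {component}: {', '.join(names)}")
```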
Component 1 or Factor 1 consists of Variables (Graduation/Post Graduation in Computer
Science, Hot Skills Possession (AI/Data Analytics/ML), Relevant Certifications & Significance
of First Job (Role/Salary/Company)). Factor 1 can be named as ‘Entry level merit’.
Component 2 or Factor 2 consists of Variables (Onsite Opportunities, Open Source
Contribution, Location Preference, Macroeconomic/ Govt. Policies & Active Presence on
Linkedin). Factor 2 can be named as ‘External support’.
Component 3 or Factor 3 consists of Variables (Niche Skill (over broad base of programming
languages) & Intelligence Quotient). Factor 3 can be named as ‘Knowledge’.
Figure (L)
Figure (L) shows the component score coefficient matrix. For each subject, the factor score
is calculated by multiplying the standardized values of the variables by the corresponding
coefficients and summing the products.
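A factor score of this kind is a weighted sum: each variable is standardized, multiplied by its score coefficient, and the products are added up for each respondent. The sketch below assumes made-up ratings and made-up coefficients for a single factor; neither comes from Figure (L).

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw ratings to z-scores using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def factor_scores(data, coefficients):
    """data: {variable: ratings per respondent}; coefficients: {variable: weight}."""
    z = {name: standardize(col) for name, col in data.items()}
    n = len(next(iter(data.values())))
    return [sum(coefficients[name] * z[name][i] for name in data) for i in range(n)]

# Hypothetical ratings from four respondents and hypothetical coefficients.
data = {
    "Relevant Certifications": [4, 5, 2, 3],
    "Hot Skills Possession": [5, 4, 1, 3],
}
coefficients = {"Relevant Certifications": 0.45, "Hot Skills Possession": 0.40}

for respondent, score in enumerate(factor_scores(data, coefficients), start=1):
    print(f"respondent {respondent}: {score:+.3f}")
```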
Check for Model Fit
The last step in factor analysis is to determine the fit of the factor analysis model.
Figure (h) shows the reproduced correlations and residuals; for an appropriate factor analysis
solution, the difference between the reproduced and observed correlations should be small.
From Figure (h) we can say that it indicates an appropriate factor analysis model.
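The fit check can be sketched as below: the reproduced correlation between two variables is the sum of the products of their loadings, and the residual is the observed correlation minus the reproduced one. The observed correlations and one-factor loadings are hypothetical, and the 0.05 cut-off is the conventional threshold for counting large residuals, assumed here rather than taken from the report.

```python
def reproduced_correlations(loadings):
    """Reproduced correlation matrix: loadings multiplied by their transpose."""
    n = len(loadings)
    return [[sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(n)]
            for i in range(n)]

def residual_summary(observed, loadings, threshold=0.05):
    """Count off-diagonal residuals whose absolute value exceeds the threshold."""
    reproduced = reproduced_correlations(loadings)
    n = len(observed)
    residuals = [observed[i][j] - reproduced[i][j]
                 for i in range(n) for j in range(i + 1, n)]
    large = sum(1 for r in residuals if abs(r) > threshold)
    return large, len(residuals)

# Hypothetical observed correlations and loadings on a single factor.
observed = [[1.00, 0.58, 0.45],
            [0.58, 1.00, 0.50],
            [0.45, 0.50, 1.00]]
loadings = [[0.8], [0.7], [0.6]]

large, total = residual_summary(observed, loadings)
print(f"{large} of {total} residuals exceed 0.05 in absolute value")
```

The smaller the share of large residuals, the better the factor solution reproduces the observed correlations.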
Conclusion
Source: