Employee Attrition Prediction using Machine Learning
AN INTERNSHIP REPORT
Submitted by
MOHAMMED AHTESHAM AHMED
(3GN19IS010)
CERTIFICATE
The project report has been approved as it satisfies the academic requirements in respect
of Internship prescribed for the course Internship / Professional Practice (18CSI85)
External Viva:
1)
2)
DECLARATION
Date : 28-02-2023
Place : BIDAR
USN : 3GN19IS010
NAME : MOHAMMED AHTESHAM AHMED
INTERNSHIP CERTIFICATE PROVIDED BY THE COMPANY
OFFER LETTER PROVIDED BY THE COMPANY
ACKNOWLEDGEMENT
This Internship is a result of accumulated guidance, direction and support of several important
persons. We take this opportunity to express our gratitude to all who have helped us to complete the
Internship.
We express our sincere thanks to our Principal, Dr. Dhananjay M for providing us adequate facilities
to undertake this Internship.
We would like to thank our Head of Department, Prof. Ramesh Patil, for providing us an opportunity
to carry out the Internship and for his valuable guidance and support.
We would like to thank our (Lab assistant name) Software Services for guiding us during the period
of internship.
We express our deep and profound gratitude to our guide, Prof. Anil K, for his keen interest and
encouragement at every step in completing the Internship.
We would like to thank the internship coordinator, Prof. Laxman Singh, and all the faculty members
of our department for the support extended during the course of the Internship.
We would like to thank the non-teaching members of our department for helping us during the Internship.
Last but not least, we would like to thank our parents and friends, without whose constant help
the completion of the Internship would not have been possible.
USN : 3GN19IS010
EMPLOYEE ATTRITION PREDICTION USING MACHINE LEARNING
ABSTRACT
Employee attrition is a major concern for organizations, as it can impact their bottom
line and disrupt operations. Predicting which employees are likely to leave can help
organizations take proactive measures to retain them. In recent years, machine learning
algorithms have been used to predict employee attrition based on factors such as job
satisfaction. By analyzing large datasets, these algorithms can identify patterns and predict
which employees are at risk of leaving. This can help organizations develop targeted
retention strategies and improve employee satisfaction.
This work aims to predict whether an employee of a company will leave or not, using the
k-Nearest Neighbours algorithm. We use evaluation of employee performance, average monthly
hours at work and number of years spent in the company, among others, as our features. Other
approaches to this problem include the use of ANNs, decision trees and logistic regression. The
dataset was split, using 70% for training the algorithm and 30% for testing it, achieving an
accuracy of 94.32%.
TABLE OF CONTENT
CHAPTER 1
COMPANY PROFILE
EasiliTech was set up directly by registering the company with the RoC, Ministry of Corporate
Affairs. Its registered office address is EasiliTech, Linganand Nagar, Nandi Colony, Bidar,
Karnataka, India. EasiliTech's registered state is Karnataka.
At EasiliTech, we are passionate about using technology to improve education and enhance
learning outcomes. With a team of experienced professionals and a commitment to innovation,
we are dedicated to providing engaging, personalized, and effective educational and
development experiences for students of all ages and streams.
Our goal is to empower students to reach their full potential and succeed in their academic and
professional pursuits, and we work closely with educators, parents, and the wider community
to create a supportive and inclusive learning environment. We believe that education should be
accessible and impactful for all. As a leading EdTech company, we use cutting-edge
technology to create innovative solutions that improve the learning experience for students
everywhere.
Our aim is to empower learners with the tools they need to succeed in their academic and
professional pursuits, and we work tirelessly to provide affordable, personalized, and effective
educational experiences with a team of dedicated professionals. If you're looking for
innovative and effective education technology solutions, look no further than EasiliTech.
At EasiliTech, we also assist job aspirants by providing them the stepping-stone to a long and
fruitful career in the fast-booming realm of Information Technology. We ensure that our
training is conducted by pioneers in the technology with proven track records for over a decade.
Our trainers are highly skilled with immense focus and dedication towards the professional
evolution of our students. In addition to training, we also provide state of the art lab facilities
and placement assistance to our students, leading the corporates to hire from our talent pool.
Our Vision
Our Mission
Our mission is to bridge the digital divide between colleges and industries and provide quality
education to all students, regardless of their location or stream.
Our services
CHAPTER 2
INTRODUCTION
Organizations must understand the reasons for employee attrition and take proactive
measures to reduce its impact. This can include identifying the root causes of dissatisfaction or
disengagement among employees, offering better compensation and benefits packages,
creating a positive work environment, providing opportunities for career growth and
development, and improving communication and feedback channels.
In recent years, with the increasing availability of data and advanced analytics
techniques, organizations have been able to predict and prevent employee attrition through
machine learning algorithms. By analyzing large datasets, these algorithms can identify
patterns and predict which employees are at risk of leaving, allowing organizations to take
targeted actions to retain them. Overall, managing employee attrition is critical for
organizational success and employee satisfaction.
Employee attrition is a huge problem across industries and generally costs a company
a great deal in hiring, retraining, and lost productivity and work for each employee who
leaves. Price and Waters, a boutique data science consulting firm, is looking to build a
machine learning model to predict whether an employee might quit. Using this model, the firm
could plan human intervention to alleviate the issues faced by the employee. The firm is also
interested in specific features that are highly indicative of attrition. In a pilot program, the
company recorded employee data, collecting performance data for a random selection of months
for each employee in order to understand it in the context of attrition. The task is to predict
whether an employee will quit in the near future, given the data, and to discover features
indicative of attrition. ‘Left_Company’ is the target variable, and the prediction is either
‘1’ (Left) or ‘0’ (Retained) for each unique employee id in the test dataset.
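Because the data holds several monthly records per employee while one prediction is needed per unique employee id, the records have to be collapsed to one row per employee at some point. The sketch below shows one possible way to do this, using only the Python standard library. Apart from Left_Company, every field name and value here is a hypothetical placeholder, since the attribute list of the actual dataset is not reproduced in this report.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical monthly records: (employee_id, performance_score, Left_Company).
# The ids and scores are invented purely for illustration.
records = [
    ("E1", 3.0, 0),
    ("E1", 4.0, 0),
    ("E2", 2.0, 1),
    ("E2", 1.0, 1),
    ("E2", 3.0, 1),
    ("E3", 5.0, 0),
]


def aggregate_by_employee(rows):
    """Collapse several monthly rows into one summary row per employee id."""
    scores = defaultdict(list)
    labels = {}
    for emp_id, score, left in rows:
        scores[emp_id].append(score)
        # Assumption: the label is the same on every row of one employee.
        labels[emp_id] = left
    return {
        emp_id: {
            "mean_score": mean(vals),
            "months_observed": len(vals),
            "Left_Company": labels[emp_id],
        }
        for emp_id, vals in scores.items()
    }


per_employee = aggregate_by_employee(records)
for emp_id, summary in per_employee.items():
    print(emp_id, summary)
```

Averaging is only one choice of summary; the most recent month or the trend across months could serve equally well, depending on what the recorded attributes turn out to be.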
Attribute details:
CHAPTER 3
PROBLEM STATEMENT
Machine learning can be a powerful tool in predicting employee attrition by analyzing various
factors that contribute to employee turnover. These factors may include job satisfaction, work-
life balance, compensation and benefits, job role and responsibilities, performance metrics, and
other demographic and psychographic factors.
The goal of employee attrition prediction using machine learning is to develop a model that
can accurately predict which employees are likely to leave the organization, based on historical
data on employee turnover and related factors. This model can then be used by organizations
to identify at-risk employees and take appropriate measures to retain them, such as offering
additional training, improving work conditions, or increasing compensation.
CHAPTER 4
LITERATURE SURVEY
Summary of the models used for prediction of employee attrition
Study | Technique | Objective | Publisher | Data source | Target | Features / findings
--- | --- | --- | --- | --- | --- | ---
Raman, et al. (2019) | Correlation and Hypothesis Testing | Finding out whether the features of email and the sentiments reflect whether the faculty will leave or not | Emerald | Email Data, Sample Survey | Attrition of faculty of B-schools | General and specific email communication features, communication after office hours, internal communication, email sentiment
Setiawan, et al. (2020) | Logistic Regression | To find the level of attrition and its dependence on particular variables | IOP Publishing | Survey data | Employee Attrition | 11 were significant; some of them are job satisfaction, frequency of business travel, years worked in the company, years with a manager
Usha and Balaji (2021) | Naïve Bayes, Decision Tree, J48, Random Forest, and K Means Clustering | To identify which algorithm could predict attrition with better accuracy | IOP Publishing | Survey Data | Employee Attrition | Around 20 variables were considered, which consisted of demographic details and satisfaction level
El-Rayes, et al. (2020) | Decision Tree, Random Forest, Gradient Boosted Tree | Identifying the probability of an employee leaving the job during the transition | Emerald | Anonymous Resumes from Glassdoor | Employee Switch | Salary increase, firm rating, firm foundation year
Al-Darraji, et al. (2021) | Deep Learning (Neural Network) | Predicting who will leave the company and who will not | MDPI | IBM Analytics dataset | Employee Attrition | A total of 35 variables were used
Khera and Divya (2019) | Support Vector Machine | To develop a predictive model for employee attrition in the IT sector | Sage | Survey Data | Employee Attrition | A total of 22 variables were taken, and gender, business travel and the total number of companies worked earlier were found to be irrelevant
Alduayj and Rajpoot (2018) | Support Vector Machine, K-Nearest Neighbour, Adaptive Synthetic Approach (ADASYN) | To compare different machine learning algorithms for attrition prediction and identify the best algorithm | IEEE | IBM Watson Analytics Synthetic dataset | Employee Attrition | A total of 32 features were selected and the top 12 features were identified as the important ones (overtime, years of experience, job level, income, etc.)
Fallucchi, et al. (2020) | Gaussian Naïve Bayes, Logistic Regression, K-nearest neighbor, Decision tree classifier, Random Forest classifier, Support Vector Machine | Different algorithms were compared to analyze and identify the best algorithm | MDPI | IBM Analytics dataset | Employee Attrition | 35 features from the dataset were used and Gaussian Naïve Bayes was found to have the best recall among them
Studies of employee turnover are reviewed using meta-analytic techniques. The findings indicate
that almost all of the 26 variables studied relate to turnover. The findings also indicate that study
variables including population, nationality, and industry moderate relationships between many of
the variables and turnover. It is suggested that future research on employee turnover: (1) report study
variables, (2) continue model testing rather than simply correlating variables with turnover, and (3)
incorporate study variables into future models.[1]
We aim to predict whether an employee of a company will leave or not, using the k-Nearest
Neighbours algorithm. We use evaluation of employee performance, average monthly hours at
work and number of years spent in the company, among others, as our features. Other
approaches to this problem include the use of ANNs, decision trees and logistic regression. The
dataset was split, using 70% for training the algorithm and 30% for testing it, achieving an
accuracy of 94.32%.[2]
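The k-Nearest Neighbours approach with a 70/30 split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation behind the reported 94.32%: the data is synthetic and generated inside the script, the three features only mimic evaluation score, average monthly hours and years in the company, the value k = 5 is an arbitrary choice, and everything is written with the Python standard library so that it runs without extra packages.

```python
import math
import random
from collections import Counter


def make_synthetic_employees(n, seed=0):
    """Generate toy rows of the form ([evaluation, monthly_hours, years], left)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        left = rng.random() < 0.3
        # Assumption for the toy data only: leavers score lower and work longer hours.
        evaluation = rng.gauss(0.45 if left else 0.75, 0.08)
        hours = rng.gauss(260 if left else 180, 15)
        years = rng.gauss(4.5 if left else 3.0, 0.7)
        rows.append(([evaluation, hours, years], int(left)))
    return rows


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_scaler(train):
    """Per-feature mean and standard deviation, computed on the training rows only."""
    cols = list(zip(*(x for x, _ in train)))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(x, means, stds):
    """Standardise one feature vector so no feature dominates the distance."""
    return [(v - m) / s for v, m, s in zip(x, means, stds)]


def knn_predict(train, query, k=5):
    """Majority vote among the k training rows closest to the query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


rows = make_synthetic_employees(300)
train, test = train_test_split(rows, train_fraction=0.7)
means, stds = fit_scaler(train)
train_scaled = [(scale(x, means, stds), y) for x, y in train]

correct = sum(
    knn_predict(train_scaled, scale(x, means, stds), k=5) == y for x, y in test
)
accuracy = correct / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.4f}")
```

Scaling matters here because monthly hours are numerically far larger than an evaluation score, so without it the distance would be decided almost entirely by hours. The accuracy printed for this toy data says nothing about real HR data.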
This study takes a dynamic multilevel approach to examine how the relationship between
an employee's job satisfaction trajectory and subsequent turnover may change depending on
the employee's unit's job satisfaction trajectory and its dispersion. Analyses of longitudinal
multilevel data collected from 5,270 employees in 175 business units of a hospitality company
demonstrate a significant three-way interactive effect of unit-level job satisfaction trajectory
and its dispersion and individual job satisfaction trajectory on individual job exit. In particular,
in the presence of a negative unit-level job satisfaction trajectory and low dispersion, a positive
change in individual-level job satisfaction does not affect the odds of a person leaving an
organization. Put differently, an employee's being out of step with prevailing unit-level
attitudes appears to alter the relationship between his or her job satisfaction trajectory and
turnover propensity. Further, unit-level job-satisfaction change and its dispersion jointly
influence the overall turnover rate in a unit. The results indicate unit-level and individual-level
job satisfaction trajectories have unique multilevel influences on turnover above and beyond
static levels of job satisfaction. Accounting for these dynamics substantially increases the
explained variance in turnover behaviour. The findings increase understanding of the job
satisfaction-turnover link over time and across levels.[4]
CHAPTER 5
METHODOLOGY
1. Data Collection: The first step is to collect relevant data about the employees, such as
their performance ratings, job satisfaction levels, engagement scores, tenure, salary, and
demographic information.
3. Feature Engineering: The next step is to select the most relevant features that are highly
correlated with employee attrition. This involves identifying the factors that contribute
to employee attrition, such as salary, job satisfaction, and performance.
4. Model Building: Once the features are selected, various machine learning algorithms
can be applied to the data to build a predictive model. The algorithms can include
logistic regression, decision trees, random forest, or neural networks.
5. Model Evaluation: The model's accuracy and performance need to be evaluated using
various metrics, such as accuracy, precision, recall, F1 score, and ROC curves.
6. Model Deployment: The final step is to deploy the model in a production environment
to predict employee attrition. The model can be integrated with HR systems to provide
real-time insights and alerts about employees who are at risk of leaving.
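The evaluation metrics named in step 5 can be computed directly from the four confusion-matrix counts. The sketch below uses only the Python standard library; the label lists are made-up examples, with 1 standing for an employee who left and 0 for one who was retained.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 score as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 3 of 10 employees left (1), 7 were retained (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_model = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]   # a model that finds 2 of the 3 leavers
y_naive = [0] * 10                          # always predicts "retained"

model_metrics = classification_metrics(y_true, y_model)
naive_metrics = classification_metrics(y_true, y_naive)
print(model_metrics)
print(naive_metrics)
```

The always-retained baseline still reaches 70% accuracy on these labels while its recall is zero, which is why precision, recall and F1 are worth reporting alongside accuracy when most employees in a dataset did not leave.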
CHAPTER 6
ADVANTAGES
3. Cost savings: If the departing employees are replaced with less experienced or less
senior employees, the company may save on salary and benefit expenses, at least in the
short term.
However, it's important to note that the potential advantages listed above are
heavily dependent on the specific circumstances of the employee attrition and should
be weighed against the potential negative impacts.
CHAPTER 7
FUTURE SCOPE
One of the main factors that will shape the future scope of employee attrition is the
changing nature of work. As new technologies and business models emerge, the types of jobs
and skills required are likely to evolve, leading to greater competition for talent. This
competition can drive up turnover rates as employees seek out new opportunities that better
match their skills and interests.
Another factor that may influence employee attrition in the future is demographic shifts.
As baby boomers retire, younger generations will take their place in the workforce, bringing
with them different expectations and priorities. Younger workers tend to be more mobile and
less likely to stay with one employer for their entire career, which could contribute to higher
turnover rates.
Finally, the COVID-19 pandemic has significantly impacted the world of work, with
many employees reassessing their priorities and looking for new opportunities. As the
pandemic continues to evolve, its impact on employee attrition rates remains to be seen.
Overall, the future scope of employee attrition will be shaped by a variety of factors,
including technological advancements, demographic shifts, and the ongoing impact of the
COVID-19 pandemic. Organizations will need to adapt to these changes and develop strategies
to retain their top talent in order to remain competitive in the marketplace.
CONCLUSION
The conclusion is based on the analysis and discussion of the empirical findings and theory, and
tries to answer the two main research questions. The questions were:
1. How do predictions from a machine learning model perform compared to a simpler regression
analysis model when applied to HR data sets to predict employee attrition?
2. How does understanding an ML model that predicts employee attrition affect people's willingness
to use it? 2b. What do practitioners believe that such a model can contribute to an organization?
The main findings were:
a) The partitioning within the datasets that each model used was heavily skewed toward employees
who did not quit their job after a pulse survey, which could have affected the results
negatively. Therefore, the machine learning models used, random forest and support vector
machine, could not be compared in terms of performance with a multiple logistic regression model.
b) For the second research question, it was shown that individual employees do not need to
understand how a machine learning model works in order to use it, or to have it used on them.
Instead, the important part was that employees need to be able to trust a model to accept
its use.
c) To trust a model, it was enough that someone the employees trusted also trusted the model, or
that an organization with a good reputation used the model. Trusting a model was the most
important part, but this indirectly means that the models used have to be interpretable and
understandable by someone who can first understand the model and then start a chain reaction
of trust in it.
d) Lastly, transparency in how the models are used and what they are used for was a key factor
for employees to accept the use of ML models to predict employee attrition. This has a direct
impact on employee performance and therefore also on organizational performance.
Lowering employee turnover would also have a direct positive impact on the finances of the
organization. Lowering employee attrition would reduce the costs of hiring and training
new employees. This could also increase the attractiveness of the organization, which could
attract new talented employees. Predicting employee attrition with a machine learning model
could also allow organizations to start recruitment processes earlier and
therefore minimize the dip in productivity when switching between old and new employees.
This could help the organization to maintain production at normal levels and lower the
chance of disruptions in productivity.
REFERENCES
[1] Cotton, J.L. and Tuttle, J.M., 1986. “Employee turnover: A meta-analysis and review with
implications for research”. Academy of Management Review, pp.55-70.
[2] Heckert, T.M. and Farabee, A.M., 2006. “Turnover intentions of the faculty at a teaching-
focused university”. Psychological reports, pp.39-45.
[3] Rish, Irina, “An empirical study of the naive bayes classifier”, IJCAI Workshop on
Empirical Methods in AI.
[4] Liu, D., Mitchell, T.R., Lee, T.W., Holtom, B.C. and Hinkin, T.R., 2012. “When employees
are out of step with coworkers: How job satisfaction trajectory and dispersion influence
individual-and unit-level voluntary turnover”. Academy of Management Journal, pp.1360-
1380.
[5] Shaw, Jason D. 2011. Turnover rates and organizational performance: Review, critique,
and research agenda. Organizational Psychology Review 1(3). 187-213.
doi:10.1177/2041386610382152
[6] Ajit, P. & Punnoose, R. 2016. Prediction of Employee Turnover in Organizations using
Machine Learning Algorithms: A case for Extreme Gradient Boosting. International Journal
of Advanced Research in Artificial Intelligence 5(9). 22-26. doi:10.14569/IJARAI.2016.050904
[7] Baxter, Pamela & Jack, Susan. 2020 Qualitative Case Study Methodology: Study Design
and