
PREDICTION OF EMPLOYEE ATTRITION
Managing People in Organization Project

Submitted By:
AnanthaKrishnan Mavelil B2019004
Krishna Kumar B2019022
Leslie Korah Chally B2019023
Reethika Reddy B2019024
Mikita Hiraou B2019029
Srihari K R B2019053
Contents
Objective:
Reasons for Predicting:
Approach:
Variables in the Dataset:
Exploratory data analysis:
Attrition vs Overtime
Department wise attrition
Job Satisfaction Vs attrition
Age vs Department vs attrition
Cleaning the dataset:
Selection of Prediction Models:
Weights of the attributes:
Conclusion of Analysis:
References:
Objective:
To identify the attributes that cause attrition in the organization, and to use these
attributes to build a model that predicts whether a newly joined or valuable employee
is likely to fall into the attrition category.

Reasons for Predicting:


Predicting attrition will help the organization reduce direct costs (replacement,
recruitment and selection, temporary staff, management time), indirect costs (morale,
pressure on remaining staff, costs of learning, product/service quality, organizational
memory), and the time and effort spent onboarding a new employee into a vacant position
in the organization.

Approach:
• We obtained five years of employee data from an organization. Each employee record
is labeled with whether the employee left (attrition is yes) or stayed (attrition is
no).
• Perform exploratory data analysis by plotting graphs to understand the data and get
insights.
• Pre-process the data to handle NA values and remove any outliers in the dataset.
• Split the data into training and testing sets in the ratio of 80:20.
• Build various models on the training dataset.
• Evaluate each model against the test dataset to check its accuracy.
• Recommend the best model based on the accuracy score.
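
The split and evaluation steps above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual code: the helper names split_train_test and accuracy, the random seed, and the toy records are all assumptions made for the example.

```python
import random

def split_train_test(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)

# Hypothetical toy records (overtime, attrition); not the project's data.
records = [("Yes", "Yes"), ("No", "No"), ("Yes", "No"), ("No", "No")] * 25
train, test = split_train_test(records)
print(len(train), len(test))  # 80 20

# A trivial stand-in "model": predict attrition whenever overtime is "Yes".
y_true = [attrition for _, attrition in test]
y_pred = [overtime for overtime, _ in test]
print(f"test accuracy: {accuracy(y_true, y_pred):.2f}")
```

Fixing the seed keeps the split reproducible, so that different models are compared on the same test rows.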

Variables in the Dataset:


Employee Number
Age
Attrition
Monthly Income
Department
Education
Distance from Home
Gender
Job Satisfaction
Marital Status
Overtime
Percentage of Salary Hike
Performance Rating
Relationship Satisfaction
Stock Option Level
Total Working Years
Work Life Balance
Years at Company
Years in Current Role
Years since Last Promotion
Years with Current Manager
Exploratory data analysis:

Attrition vs Overtime

The plot shows that employees who work overtime have a higher chance of attrition
than those who do not.

Department wise attrition


The graph shows that the attrition rate is highest in the Research and Development
department, second highest in Sales, and lowest in the Human Resources department.
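
Group-wise comparisons such as the overtime and department plots can be backed by a small summary of leavers, headcount, and rate per group. The sketch below is an illustration only: the function name attrition_by_group and the rows are hypothetical, and the numbers are not the project's data. Reporting both the count and the rate is useful because a large department can have the most leavers without having the highest rate.

```python
from collections import defaultdict

def attrition_by_group(rows, group_key):
    """Return {group: (leavers, headcount, rate)} for one categorical column."""
    counts = defaultdict(lambda: [0, 0])  # group -> [leavers, headcount]
    for row in rows:
        counts[row[group_key]][1] += 1
        if row["Attrition"] == "Yes":
            counts[row[group_key]][0] += 1
    return {g: (left, n, left / n) for g, (left, n) in counts.items()}

# Hypothetical rows for illustration only.
rows = (
    [{"Department": "Research and Development", "Attrition": "Yes"}] * 6
    + [{"Department": "Research and Development", "Attrition": "No"}] * 34
    + [{"Department": "Sales", "Attrition": "Yes"}] * 4
    + [{"Department": "Sales", "Attrition": "No"}] * 16
)

summary = attrition_by_group(rows, "Department")
for dept, (leavers, headcount, rate) in summary.items():
    print(f"{dept}: {leavers}/{headcount} left ({rate:.0%})")
```

In this toy data, Research and Development has more leavers (6 against 4) but a lower rate (15% against 20%), which is why the two measures should be read separately.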

Job Satisfaction Vs attrition

The graph shows that the higher an employee's job satisfaction, the lower the chance
of attrition, and vice versa.

Age vs Department vs attrition

The boxplot shows that younger employees are more likely to leave than older
employees.
Cleaning the dataset:
• The dataset contained NA values, which were replaced with 0. The attrition attribute
has 2466 ‘No’ and 474 ‘Yes’ records. Since the data is biased towards ‘No’, we need to
up-sample the dataset.
• The dataset was checked for outliers, and the outliers found were removed. Plots
were also drawn to check that the data follows a normal distribution.
• After the outlier analysis, we drew a correlation plot to understand how the
variables correlate with one another and with the dependent variable.
• The correlation plot showed strong correlation between some attributes (monthly
income and job level, job level and total working years, total working years and
monthly income). However, the target attribute, attrition, showed poor correlation
with the other attributes.
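
The up-sampling step mentioned above can be sketched as random resampling of the minority class with replacement. This is one common way to do it and an assumption on our part, since the report does not state which method was used; the function name upsample_minority and the seed are hypothetical. Only the class counts (2466 and 474) come from the report.

```python
import random

def upsample_minority(rows, label_key="Attrition", seed=0):
    """Resample smaller classes with replacement until all classes are equal in size."""
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        # k is 0 for the majority class, so it is kept unchanged.
        extra = rng.choices(group, k=target - len(group))
        balanced.extend(group + extra)
    rng.shuffle(balanced)
    return balanced

# Class counts taken from the report: 2466 "No" and 474 "Yes".
rows = [{"Attrition": "No"}] * 2466 + [{"Attrition": "Yes"}] * 474
balanced = upsample_minority(rows)
n_yes = sum(1 for r in balanced if r["Attrition"] == "Yes")
print(n_yes, len(balanced))  # 2466 4932
```

If up-sampling is applied before the train/test split, copies of the same employee can land in both sets and inflate the test accuracy, so it is usually safer to up-sample the training set only.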

Selection of Prediction Models:


Four classification models (Support Vector Machine, Random Forest, Naïve Bayes, and
Logistic Regression) were applied to the data, and each model was evaluated against the
test dataset. The following results were obtained. No overfitting was observed in the
final model.

Logistic Regression achieved the highest accuracy, which is close to the
industry-accepted accuracy, so we selected it as the best classifier model in this
case.
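
When models are compared on accuracy alone, it helps to know what a trivial model would score. The sketch below computes the accuracy of always predicting ‘No’ from the class counts reported in the cleaning step (2466 ‘No’, 474 ‘Yes’), and the recall of that trivial model on the ‘Yes’ class. The function names are hypothetical, and the figures apply to the original class balance, not to an up-sampled test set.

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy of a model that always predicts the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())

def recall(y_true, y_pred, positive="Yes"):
    """Share of actual positives that were predicted as positive."""
    predicted_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    hits = sum(1 for p in predicted_for_positives if p == positive)
    return hits / len(predicted_for_positives)

# Class counts taken from the report.
counts = {"No": 2466, "Yes": 474}
baseline = majority_baseline_accuracy(counts)
print(f"{baseline:.1%}")  # 83.9%

# The always-"No" model finds none of the employees who actually left.
y_true = ["No"] * 2466 + ["Yes"] * 474
y_pred = ["No"] * 2940
print(recall(y_true, y_pred))  # 0.0
```

A classifier is only informative about attrition if it beats this baseline, and recall on the ‘Yes’ class is a useful companion to the accuracy score.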

Weights of the attributes:


Based on the standardized beta values obtained from logistic regression, the following
attributes, in descending order of weight, are the reasons for employee attrition.

Job_Satisfaction
Monthly_Income
Stock_Option_Level
Work_Life_Balance
OverTime
Total_Working_Years
Business Travel
Marital_Status
Distance_From_Home
Years_In_CurrentRole
Years_At_Company
Years_With_Current_Manager
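
A ranking like the one above can be produced by standardizing each numeric attribute, fitting a logistic regression, and sorting the attributes by the absolute value of their coefficients. The sketch below is a minimal plain-Python illustration under stated assumptions: the gradient-descent fit, the learning rate, and the toy data are ours, and the data is constructed so that only job satisfaction carries signal. It is not the project's model or data.

```python
import math

def standardize(column):
    """Z-score a numeric column (mean 0, standard deviation 1)."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Hypothetical toy data for illustration only.
names = ["Job_Satisfaction", "Distance_From_Home"]
job_satisfaction = [1, 1, 2, 2, 3, 3, 4, 4]
distance_from_home = [5, 20, 5, 20, 5, 20, 5, 20]
attrition = [1, 1, 1, 0, 0, 1, 0, 0]  # 1 = left, 0 = stayed

X = list(zip(standardize(job_satisfaction), standardize(distance_from_home)))
weights, bias = fit_logistic(X, attrition)

# Rank attributes by the magnitude of their standardized coefficient.
ranked = sorted(zip(names, weights), key=lambda pair: abs(pair[1]), reverse=True)
for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
```

Because the inputs are standardized, the coefficient magnitudes are comparable across attributes. The sign gives the direction: a negative coefficient for Job_Satisfaction means higher satisfaction lowers the predicted chance of attrition.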
Conclusion of Analysis:
From the standardized beta values of the logistic regression model, we infer that the
attrition rate depends mainly on Business Travel, Distance_From_Home, Job_Satisfaction,
Marital_Status, Monthly_Income, OverTime, Stock_Option_Level, Total_Working_Years,
Work_Life_Balance, Years_At_Company, Years_In_CurrentRole and
Years_With_Current_Manager.
From the weights of the attributes, we can say that job satisfaction plays a major role
in determining employee attrition in the organization. Money also plays a role:
employees with higher salaries have a lower attrition rate than those with lower
salaries. Stock option level also affects the likelihood of attrition, as an employee
with no option to purchase company stock probably has less interest in the company's
overall success than one who can.

References:
https://medium.com/@srimalashish/predicting-employee-churn-with-python-4e665a449a20
http://rtuttinsights.com/portfolio/hr-analytics-predicting-employee-attrition/
https://medium.com/@MLJARofficial/human-resources-analytics-predict-employee-attrition-5ddc3ed781c
