Research Paper 102
Existing Method
SMOTE
SMOTE (Synthetic Minority Oversampling Technique) is a technique used when the data is lacking or imbalanced. Its main purpose is to increase the number of minority-class cases in the dataset so that proper classification can take place. Unlike general/random oversampling, which reuses cases already present in the data to fill in the gaps, SMOTE creates synthetic data from the dataset and uses the newly produced synthetic samples to fill the lacking parts. This mitigates the overfitting problem caused by random oversampling, where the model fails because it becomes too accustomed to the training data.

Fig.4 About page for our application

In this page one can access information about the particulars of our application to better understand what
they are going to use. This page also highlights the
importance of proper skillsets and a good sense of
responsibility that helps the user better understand the
significance of a necessary promotion which further
improves the growth rate of the employee and in turn that
of the organization.
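The interpolation step at the heart of SMOTE can be sketched in a few lines. This is a pure-Python illustration, not the paper's implementation; production code would typically use a library such as imblearn, and the `k`, `seed`, and squared-Euclidean distance choices here are illustrative assumptions:

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority-class samples by interpolating
    between a random minority sample and one of its k nearest
    minority-class neighbours (the core idea of SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(base, neighbour)))
    return synthetic
```

Each synthetic point lies on the line segment between two existing minority samples, so the minority class grows without duplicating any row verbatim.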
3. /login: It leads to the login page.

Fig.7 Login Home page of our application

Users who have entered their user name and password get access to this home page of the application. From here they have other routes which they can use to navigate further through the application.

4. /registration: It leads new users to register themselves.

6. /upload: This leads to the page where one can upload the dataset to train the model.

Here the users can upload any dataset they wish so that they can use our predictive model to analyze which employee is likely to be promoted on the basis of their skills, achievements and current trends.
7. /viewdata: Here one can view the data they have uploaded.

Here the uploaded dataset goes through the preprocessing phases of the application, which is absolutely essential for the prediction analysis of the model to take place correctly. The user can select a split ratio, which is used to train the model by dividing the dataset into two parts: one part deals with the training of the model, whereas the other is used for testing purposes. A 30% test split is preferred to avoid the issue of overfitting. Afterwards, SMOTE is applied to deal with class imbalance.

9. /model: Here one can select the model they want to work the data with.

Fig.13 The employee is not promoted

After the model that will be used to predict the chances of a specific employee being promoted or not is generated with a particular accuracy, the user can use that model. This is done by accessing the prediction page, where there are particular fields that need to be filled with the information of the employee whose promotion is to be predicted.

After entering the required details of the employee, such as the id, age, department, education level, whether they were promoted beforehand, their KPI achievements, their performance in training and their years of service to the company, a result is generated that tells the user whether the employee is likely to be promoted or not.
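The user-selected split described above can be sketched as follows. This is a stdlib-only illustration; the ratio and seed defaults are assumptions, and the application itself presumably relies on a library routine such as scikit-learn's `train_test_split`:

```python
import random

def train_test_split(rows, test_ratio=0.3, seed=42):
    """Shuffle the rows and hold out `test_ratio` of them for testing,
    mirroring the split-ratio choice the /viewdata page offers."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    # First part trains the model, second part evaluates it.
    return shuffled[n_test:], shuffled[:n_test]
```

With `test_ratio=0.3` this yields the 70/30 division the text recommends; SMOTE would then be applied to the training part only, so no synthetic samples leak into the test set.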
Architecture
Here the user can select the desired model for the required needs, which is done by selecting an algorithm. The selected algorithm then displays an accuracy percentage for the said model, which can be used to judge how likely the given output is to be correct.
10. /prediction: Here the user can input all the details of an employee and find out whether they are getting promoted or not.
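The /model and /prediction steps can be sketched together: the accuracy percentage the model page reports, and the form fields the prediction page feeds to the trained model. The field names, their encoding, and the `predict_fn` callable are illustrative assumptions, not the paper's actual schema:

```python
def accuracy(y_true, y_pred):
    """Percentage of correct predictions -- the figure the /model page
    reports for the algorithm the user selects."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)

def predict_promotion(predict_fn, employee):
    """Assemble the fields the /prediction page asks for and pass them
    to a trained model's prediction function (1 = promoted)."""
    features = [
        employee["age"],
        employee["previously_promoted"],
        employee["kpi_met"],
        employee["training_score"],
        employee["years_of_service"],
    ]
    label = predict_fn(features)
    return "Promoted" if label == 1 else "Not promoted"
```

The returned string corresponds to the result screen shown in Fig.13 when the model predicts that the employee is not promoted.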
The main purpose of a use case diagram is to show what system functions are performed for which actor. The roles of the actors in the system can also be depicted.

Conclusion

In conclusion, this paper can be used to predict employee promotions by understanding the relationship between various factors such as KPI, training and its scores, and the time employees have spent in the organization. This was done by incorporating various machine learning algorithms like naïve Bayes, SVM and XGBoost, and techniques like SMOTE.