PBL Report II
By
Guide
Prof. Minal Jungare
CERTIFICATE
With immense pleasure, we are presenting this Project Based Learning II as a part of
the curriculum of S.E Computer Engineering. We wish to thank all the people who gave us
endless support right from the stage the idea was conceived.
This project would not have been possible without the help of the library department, who helped us
gather information from various sources. Lastly, we offer our regards to all those who
supported us in any respect during the completion of the PBL II project.
1 Introduction
1.1 Motivation
1.2 Problem Definition
2 Literature Survey
3 Software Requirements
3.1 Introduction
3.1.1 Project Scope
3.1.2 Purpose
3.1.3 Objectives
3.2 Functional Requirements
3.2.1 Resume tracking functionality
3.3 External Interface Requirements
3.3.1 User Interfaces
3.3.2 Hardware Interfaces
3.3.3 Software Interfaces
3.3.4 Communication Interfaces
3.4 Non-functional Requirements
3.4.1 Performance Requirements
3.4.2 Safety Requirements
3.4.3 Security Requirements
3.4.4 Software Quality Attributes
3.5 System Requirements
3.5.1 Database Requirements
3.5.2 Software Requirements
3.5.3 Hardware Requirements
3.6 Analysis Models
3.6.1 Waterfall Model
3.7 System Implementation Plan
The utility of JRS extends beyond mere convenience; it addresses several key pain points
encountered by both job seekers and employers. For job seekers, the process of sifting
through countless job listings can be daunting and time-consuming. Moreover, the lack of
visibility into suitable opportunities may lead to frustration and suboptimal career
choices. JRS alleviates these challenges by providing curated recommendations tailored
to each user's unique profile, thereby streamlining the job search process and enhancing
the likelihood of finding a fulfilling career path.
Similarly, employers stand to benefit from the adoption of JRS, as these systems facilitate
more efficient talent acquisition processes. By leveraging data-driven insights to identify
candidates whose profiles align with the job requirements, employers can streamline
recruitment efforts, reduce time-to-hire, and enhance the quality of candidate matches.
ICEM, Department of Computer Engineering 2021-22
1.1 MOTIVATION
In today’s society, the process of finding a job can be long and complex, causing some
people to lose the will to continue. This thesis project is therefore an
opportunity to contribute as much as possible to making the entire process simpler for people who need a
job. Helping even one person would be of great importance and would make a difference.
Although there are already some similar services online, using the right combination of tools and
machine learning algorithms can have an influence on this matter. Bringing the idea to life would benefit
everyone, from companies who look for new employees to potential employees themselves. Also,
Pitchler AB is a very young company whose main project requires the component which this thesis
project focuses on. Consequently, the solution proposed by this thesis project is planned to be put into
use in the real-world environment as soon as it reaches the level required for that.
The increased need for new machine learning-based solutions yields an increased number of tools, techniques,
and specialists. The specific problem to be solved in this thesis is training a machine learning model for
the recruitment of new employees. The thesis is carried out as part of the Pitchler AB platform, in association
with the company Pitchler AB. The platform is meant to let anyone find the most
suitable job ads with one click. The focus of this thesis project is on training the machine learning
model, using machine teaching and learning techniques and algorithms to obtain a machine learning-based
recommendation system. The problem of the thesis can be divided into the following subproblems:
• Select which data attributes would be used for both candidates’ and jobs’ representation.
• Extract the relevant data from informal candidates’ and jobs’ representations.
• Divide the data into the training and the test set.
• Present the first prototype recommendation system.
• Recommend a better approach for future use.
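The third subproblem, dividing the data into a training and a test set, can be sketched as follows. This is only a minimal illustration using the Python standard library; the function name `split_records`, the 80/20 ratio, the fixed seed, and the sample records are all assumptions made for the example and are not taken from the project.

```python
import random


def split_records(records, test_ratio=0.2, seed=42):
    """Shuffle a copy of the records and split it into (train, test) lists."""
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    shuffled = list(records)               # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical (candidate_id, skills) records
candidates = [(i, {"python", "sql"}) for i in range(10)]
train, test = split_records(candidates)
```

Holding out the test records before any model is trained is what allows the later evaluation to reflect performance on unseen candidates.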
Since there are already numerous machine learning and data processing techniques for this purpose, a
good combination of them can result in a very good and accurate model that could have an impact on the
real world by enabling many people to find a job faster. There already are some similar systems, but the
one being developed in this thesis project is meant to serve as a very important component.
A literature review is a method in which data is collected from what other authors have
published on the same or a similar topic. In this case, books and papers were explored to gain
insight into several matters. For example, the method was helpful in choosing
methods of data preprocessing. It was also used to better understand which sorts of
problems require which types of machine learning approach, such as supervised or
unsupervised learning. It also informed model selection, since different models
perform better than others in different situations. The literature review also influenced the
tools used during the project. For example, multiple Python libraries were discovered and
used through the process.
The concept of recommender systems traces its roots back to the early days of
e-commerce, where platforms like Amazon pioneered the use of collaborative filtering to
suggest products based on user preferences. Building upon this foundation, researchers
began applying similar techniques to the domain of job recommendation, recognizing
the parallels between recommending products and recommending career opportunities.
Over time, JRS have evolved from simple rule-based systems to sophisticated models
powered by machine learning algorithms, capable of analyzing vast amounts of data to
generate personalized recommendations.
More recently, advancements in deep learning have led to the emergence of neural
network-based recommender systems, capable of capturing complex patterns and
relationships in the data. These models, which include techniques such as deep
collaborative filtering and neural collaborative filtering, have demonstrated promising
results in improving recommendation accuracy and addressing cold start problems.
The application of JRS extends beyond traditional job search platforms, encompassing a
wide range of domains including recruitment, talent management, and workforce
planning. In the realm of recruitment, JRS empower employers to identify and attract
top talent by providing personalized recommendations tailored to the specific
requirements of each job opening. Similarly, job seekers benefit from JRS by receiving
targeted recommendations that align with their skills, experiences, and career
aspirations, thereby streamlining the job search process and enhancing job satisfaction.
Despite their potential benefits, JRS face several challenges that warrant further
investigation. One key challenge lies in the mitigation of algorithmic biases, which may
inadvertently perpetuate inequalities or disadvantage certain demographic groups.
Furthermore, the dynamic nature of the job market presents another challenge, as
trends, preferences, and skill requirements evolve over time. To remain effective, JRS
must adapt to these changes by continuously learning from user feedback, updating
their recommendation models, and incorporating real-time data sources.
Looking ahead, future research in the field of JRS is poised to explore innovative
approaches for improving recommendation accuracy, addressing algorithmic biases,
and enhancing user engagement. Collaborative efforts between researchers, industry
practitioners, and policymakers will be essential to realize the full potential of JRS and
create a more inclusive, efficient, and equitable job market for all.
Job Recommender Systems represent a powerful tool for matching job seekers with
suitable employment opportunities and empowering organizations to make informed
talent management decisions. Through the application of advanced algorithms and
methodologies, JRS have the potential to streamline the job search process, enhance
recruitment outcomes, and facilitate workforce planning initiatives. However,
addressing challenges related to algorithmic biases and the dynamic nature of the job
market remains a critical area for future research and development. By leveraging
interdisciplinary approaches and fostering collaboration across academia, industry, and
government, we can unlock the full potential of JRS to create a more efficient,
equitable, and fulfilling job market for individuals and organizations alike.
3.1 INTRODUCTION
The software requirement specification of our project will have the entire necessary
requirement which will be a baseline of our project. The software requirement
specification will incorporate functional and nonfunctional requirements, system
architecture, data flow diagrams, UML diagrams, experimental setup requirements and
performance metrics.
During the work on this project, there were limitations which made it impossible to do
everything that was initially planned. The first limitation was the quality of data. It was
significantly lower than expected. The data that Pitchler AB managed to collect was
related only to job seekers. Furthermore, the dataset contained only a short resume for
each candidate. It listed only the skills that candidates claimed to have,
without any indication of skill level. Also, only candidates within the IT area were
available. Consequently, only IT skills were present, such as knowledge of
programming languages, frameworks, and operating systems.
Candidates’ data did not contain experience, skill level, or any other
information about the skills that could help provide better results. The reason for
this was either the low quality of the questionnaire or candidates’ lack of interest in filling in
the information. Due to the latest GDPR regulations, obtaining existing (candidate, job)
pairs was not possible. The data related to job postings was
retrieved from Arbetsförmedlingen, the Swedish unemployment agency, using their free
API. One of the limitations was the fact that most of these records were in Swedish.
This was a limitation because I could not understand jobs’ descriptions written in
Swedish. To change the language, a script was written to translate the records to
Job recommendation systems are a type of recommendation system: software tools designed to provide
suggestions for items that a user might find useful. A suggestion can relate to any decision-making
process, such as buying a product, watching a movie, or applying for a job. The term “item” generally
refers to any entity that a system recommends to a user. Similarly, the term “user” refers to any entity
to whom an item is recommended.
The popularity of recommendation systems has grown in recent years, as shown by
the fact that many globally popular platforms such as Netflix, Amazon and YouTube use
recommendation systems to provide their users with higher-quality content. For example, Netflix uses
recommendation systems to suggest shows and movies to its users based on their viewing
history. Amazon also employs recommendation systems to recommend products or books to its
users based on ratings or previously liked products. Many more examples could be listed, since the
application of recommendation systems is very wide.
Recommendation systems tend to be beneficial to both service providers and users. They can
reduce the transaction costs of finding and selecting items in, say, an online shopping
environment.
Since 2010, much research has been conducted on recommendation systems. Most of it
focused on exploring and designing new algorithms and techniques to improve existing solutions. As a
result, the designer of an application that uses a recommendation system must now carefully
choose among the available algorithms in order to obtain the best possible results.
Job recommendation systems are mostly classified as content-based (CB) or collaborative filtering
(CF) systems. Content-based recommendation systems match items to users by comparing representations of
items to representations of users’ profiles, where a profile expresses the user’s interests.
Collaborative filtering, by contrast, uses peer opinions to predict the interests of others.
In collaborative filtering, a target user is matched against all other users in order to discover which users
have interests similar to those of the target user.
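The matching of a target user against all other users can be illustrated with a small user-based collaborative filtering sketch. It assumes implicit feedback (the set of jobs each user applied to) and cosine similarity between users; the data, the names `cosine_similarity` and `recommend_jobs`, and the scoring rule are assumptions chosen for illustration, not the method used in the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend_jobs(target, interactions, top_n=3):
    """Rank jobs the target has not applied to by the similarity of users who did."""
    seen = interactions[target]
    scores = {}
    for user, jobs in interactions.items():
        if user == target:
            continue
        sim = cosine_similarity(seen, jobs)
        if sim == 0.0:
            continue                      # users with nothing in common contribute nothing
        for job in jobs - seen:
            scores[job] = scores.get(job, 0.0) + sim
    # Highest score first; ties are broken alphabetically for a stable order
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [job for job, _ in ranked[:top_n]]


# Hypothetical interaction data: user -> set of job ids the user applied to
interactions = {
    "alice": {"job1", "job2"},
    "bob": {"job1", "job2", "job3"},
    "carol": {"job2", "job4"},
    "dave": {"job5"},
}
recommendations = recommend_jobs("alice", interactions)
```

In this toy data, bob shares two jobs with alice and carol shares one, so bob's remaining job is ranked above carol's. A user such as dave, who overlaps with nobody, receives no recommendations, which is a small instance of the cold start problem.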
Objective 2 (O2): Prepare data to expose underlying patterns to machine learning algorithms.
Once the data has been thoroughly explored, the focus shifts to data preparation, where the goal is to
preprocess and transform the data to make it suitable for machine learning algorithms. This involves
tasks such as cleaning data, handling missing values, encoding categorical variables, and scaling
features. By preparing the data effectively, the team ensures that underlying patterns and relationships
are more readily discernible to machine learning models, ultimately enhancing the system's predictive
capabilities.
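The preparation tasks named above (cleaning, handling missing values, encoding categorical variables, and scaling features) could look roughly like this for skill data. The sketch uses only the Python standard library; the comma-separated input format, the helper names, and the sample records are assumptions made for the example.

```python
def clean_skills(raw):
    """Normalise a free-text skill list: lower-case, trim, drop blanks and duplicates."""
    if raw is None:                      # missing value -> empty skill list
        return []
    seen, cleaned = set(), []
    for skill in raw.split(","):
        skill = skill.strip().lower()
        if skill and skill not in seen:
            seen.add(skill)
            cleaned.append(skill)
    return cleaned


def multi_hot(skills, vocabulary):
    """Encode a skill list as a 0/1 vector over a fixed vocabulary."""
    present = set(skills)
    return [1 if term in present else 0 for term in vocabulary]


def min_max_scale(values):
    """Scale numeric values to the range [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw candidate records (one is missing)
raw_skills = ["Python, SQL , python", None, "Java,Linux"]
cleaned = [clean_skills(r) for r in raw_skills]
vocabulary = sorted({s for skills in cleaned for s in skills})
encoded = [multi_hot(skills, vocabulary) for skills in cleaned]
scaled_counts = min_max_scale([len(skills) for skills in cleaned])
```

The multi-hot vectors give every candidate a representation of the same length, which is what most machine learning models expect as input.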
Objective 3 (O3): Explore as many different machine learning models as possible and focus on those that
provide the best results.
With the data prepared, the team embarks on an iterative process of model exploration and evaluation.
This involves experimenting with a diverse range of machine learning algorithms, including but not
limited to collaborative filtering, content-based filtering, and hybrid approaches. By systematically
testing various models and tuning hyperparameters, the team identifies the most effective algorithms that
yield optimal performance metrics such as accuracy, precision, recall, and F1-score. The selection of the
best-performing models lays the foundation for building the recommendation engine.
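As a small illustration of the metrics mentioned above, precision, recall, and F1-score for one user's recommendation list can be computed as below. The function name and the example job identifiers are hypothetical; the report does not state how the metrics were calculated in practice.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 of a recommendation list against the relevant items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four jobs were recommended, three jobs were actually relevant
precision, recall, f1 = precision_recall_f1(
    ["job1", "job2", "job3", "job4"],
    ["job2", "job4", "job7"],
)
```

Here two of the four recommended jobs are relevant (precision 0.5) and two of the three relevant jobs were found (recall 2/3). Averaging such values over all test users gives one number per model, which makes candidate models directly comparable.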
In our system, the user (applicant) uploads his or her information and
resumes. The admin can then review the applicants and select the best
resume/applicant from the pool.
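One simple way the admin's selection could be supported is to rank resumes by how many of a vacancy's required skills they cover. The sketch below is an assumption made for illustration: the report does not specify a scoring rule, and the names `skill_match` and `rank_applicants` and the sample data are hypothetical.

```python
def skill_match(resume_skills, required_skills):
    """Fraction of the vacancy's required skills that the resume lists."""
    required = {s.lower() for s in required_skills}
    if not required:
        return 0.0
    offered = {s.lower() for s in resume_skills}
    return len(offered & required) / len(required)


def rank_applicants(resumes, required_skills):
    """Return (applicant, score) pairs, best match first; ties broken by name."""
    scored = [(name, skill_match(skills, required_skills))
              for name, skills in resumes.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical uploaded resumes, reduced to their skill lists
resumes = {
    "applicant_a": ["Python", "MySQL", "Git"],
    "applicant_b": ["Java", "MySQL"],
    "applicant_c": ["Python", "MySQL", "HTML", "CSS"],
}
vacancy_skills = ["python", "mysql", "html"]
ranking = rank_applicants(resumes, vacancy_skills)
```

A coverage score like this ignores skill level and experience, so it would be a starting point for review rather than a replacement for the admin's judgement.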
The hardware requirements include a minimum of 180 GB of hard disk space, 4 GB of
RAM, and a processor running at 2 GHz or faster. The primary requirement is 4 GB of memory
for the Android application development and MySQL.
• RAM - 4 GB (min)
This is the software configuration in which the project was developed. The programming
language, tools, and other software used are described here.
• Database : MySQL
• The overall performance of the software will enable the users to work efficiently.
The application is designed in modules where errors can be detected and fixed easily.
This makes it easier to install and update new functionality if required.
All data will be encrypted using a strong encryption algorithm, and encryption is applied
according to location.
Our software has many quality attributes, which are given below:
• Availability: The software is freely available to all users and is easy for everyone
to obtain.
• Maintainability: If any error occurs after deployment, it can be
easily fixed by the software developer.
• Reliability: The software performs well, which increases its
reliability.
• User Friendliness: Since the software is a GUI application, the output generated is
user friendly.
• Security: Users are authenticated in several security phases, so reliable security is
provided.
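The report does not describe how user credentials are stored or checked. As one possible building block for the authentication mentioned under Security, the sketch below hashes passwords with PBKDF2 from the Python standard library. Note that this is salted password hashing, not the data encryption mentioned earlier; the iteration count and function names are assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, not a value from the report


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)            # fresh random salt for every new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Registration stores (salt, stored); login recomputes and compares
salt, stored = hash_password("correct horse")
login_ok = verify_password("correct horse", salt, stored)
login_bad = verify_password("wrong password", salt, stored)
```

Storing only the salt and digest means that a leaked user table does not directly reveal any password.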
MySQL is free and open-source software under the terms of the GNU General
Public License and is also available under a variety of proprietary licenses. MySQL was
owned and sponsored by the Swedish company MySQL AB, which was bought by Sun
Microsystems (now Oracle Corporation). In 2010, when Oracle acquired Sun, MySQL co-founder
Michael Widenius forked the open-source MySQL project to create MariaDB.
• IDE : VS Code
• RAM : 4 GB
The Waterfall approach was the first SDLC model to be used widely in software engineering
to ensure the success of a project. In the Waterfall approach, the whole process of
software development is divided into separate phases. Typically,
the outcome of one phase acts as the input for the next phase in sequence.
Implementation − With inputs from the system design, the system is first
developed in small programs called units, which are integrated in the next
phase. Each unit is developed and tested for its functionality, which is referred
to as Unit Testing.
Integration and Testing − All the units developed in the implementation phase
are integrated into a system after testing of each unit. Post integration the entire
system is tested for any faults and failures.
Maintenance − There are some issues which come up in the client environment.
To fix those issues, patches are released. Also, to enhance the product some
better versions are released. Maintenance is done to deliver these changes in
the customer environment.
In this step of the Waterfall model, we identify the various requirements of our
project, such as the software and hardware required, the database, and the interfaces.
In the system design phase, we design a system that is easy for the end
user to understand, i.e., user friendly. We draw UML diagrams and a data flow diagram to
understand the system flow, the system modules, and the sequence of execution.
3. Implementation:
4. Testing:
Different test cases are run to check whether each project module gives the
expected outcome in the expected time. All the units developed in the implementation
phase are integrated into a system after each unit has been tested. After integration, the entire
system is tested for any faults and failures.
5. Deployment of System:
Once the functional and non-functional testing is done, the product is deployed in
the customer environment or released into the market.
6. Maintenance:
Some issues come up in the client environment. To fix those issues,
patches are released. Maintenance is done to deliver these changes in the customer
environment. Each phase starts only after the defined set of goals for the previous phase
has been achieved and signed off, hence the name "Waterfall Model". In this model,
phases do not overlap.
4. SYSTEM DESIGN
It is an important tool as it provides an overall view of the physical deployment of the software system
and its evolution roadmap.
In the presentation tier there are two entities: Applicant and HR. The website contains the different
features of the system, namely Resume Tracking and Review and Ratings, followed finally by the Database Tier.
HR: The HR user logs into the system, manages applicants’ logins, manages the
resumes, creates vacancies, and scrutinizes the applicants.
Users: The user logs into the system, updates the basic information
assigned to him or her, and uploads resumes.
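The two roles described above suggest a small relational model behind the database tier. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL, so that it runs without a server. All table names, column names, and sample rows are hypothetical and are not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: applicants upload resumes against vacancies created by HR
SCHEMA = """
CREATE TABLE applicant (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE vacancy (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE resume (
    id INTEGER PRIMARY KEY,
    applicant_id INTEGER NOT NULL REFERENCES applicant(id),
    vacancy_id INTEGER NOT NULL REFERENCES vacancy(id),
    file_path TEXT NOT NULL,
    rating INTEGER
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# User side: an applicant registers
conn.execute("INSERT INTO applicant (id, name, email) VALUES (1, 'Asha', 'asha@example.com')")
# HR side: a vacancy is created
conn.execute("INSERT INTO vacancy (id, title) VALUES (1, 'Backend Developer')")
# User side: the applicant uploads a resume for that vacancy
conn.execute(
    "INSERT INTO resume (applicant_id, vacancy_id, file_path) VALUES (?, ?, ?)",
    (1, 1, "resumes/asha.pdf"),
)
# HR side: the resume is reviewed and rated
conn.execute("UPDATE resume SET rating = 4 WHERE applicant_id = 1 AND vacancy_id = 1")

rows = conn.execute(
    "SELECT a.name, v.title, r.rating FROM resume r "
    "JOIN applicant a ON a.id = r.applicant_id "
    "JOIN vacancy v ON v.id = r.vacancy_id "
    "ORDER BY r.rating DESC"
).fetchall()
```

Keeping resumes in their own table, linked to both applicant and vacancy, lets HR list and rate all resumes for one vacancy with a single query.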
Conclusion
The findings of this project have shown that a variety of algorithms and approaches can be used for
building a recommendation system. Collaborative filtering is just one of the methods that are useful in
this field. An understanding of the algorithms and technologies can be used to build very powerful tools that
can do a lot of good. The solution prototype proposed in this thesis
can, with some modification, be applied to other areas as well. Any area that could use a
recommendation system is a potential target. Also, these results are likely to be used as
a prototype for the recommendation engine of a recruitment platform owned by Pitchler AB. After
improvements, and with some good fortune, this prototype will benefit society by helping people
find employment. The prototype will also contribute to the business of Pitchler AB.
As for what could have been done differently, data acquisition is the first matter. As
already mentioned, data quality was significantly lower than average. Better data would yield better
results, since the models would learn more about users’ preferences. More time should have been
dedicated to data acquisition. Because of the GDPR regulations and strict policies on obtaining
personal data, anonymous surveys should have been conducted a few months before the start of the project.
These steps would have had a great influence on the results and their quality.
Future work
Although this was a nine-month project, like every other degree project it has a limited scope. Collaboration
with Pitchler AB on this project is already planned. Recognizing the data issue, they are willing to
continue improving this prototype and eventually use it to meet their business goals. An important step in
improving the prototype is following the suggestions from Chapter 6. Acquiring data in the suggested format will
make it easier to improve this solution. Also, since time did not allow other approaches to be tried,
such as content-based filtering, this will be one of the goals for the future as well. The proposed prototype recommends
jobs to candidates. However, with slight modifications, it can be reversed, meaning that it is possible
to modify the prototype to recommend candidates for a certain job posting. This could be useful for
companies looking for new employees. Depending on the needs of Pitchler AB, this might be done in
the future.