Efficient and Scalable Job Recommender System
Ravita Mishra and Sheetal Rathi
Thakur College of Engineering and Technology, Kandivali (E) Mumbai 400101,
India
m.ravita@gmail.com, sheetal.rathi@thakureducation.org
1 Introduction
Nowadays, social media is a very common platform for sharing data and day-to-day
activities. With the enormous use of internet-connected devices such as mobile phones and
smart devices, Internet users receive huge amounts of information about shopping, social
activities and online learning.
When the volume and variety of data increase tremendously, individual users face an
excess of information that makes it hard to reach correct decisions. This situation is
called information overload. Recommender systems were introduced to resolve users'
information overload: they effectively find users' probable requirements and select
interesting items from a vast amount of candidate information. Recommender systems are
mainly categorized into three forms: content-based (CB), collaborative filtering (CF) and
hybrid recommender systems, which combine both to overcome the drawbacks of content-based
and collaborative filtering. Hybrid filtering is created in three ways: ensemble design
(off-the-shelf methods are combined into a single, more powerful output), monolithic design
(an integrated recommendation algorithm is created) and mixed design (multiple
recommendation algorithms are used). Knowledge-based (KB) systems rely on specific domain
knowledge to recommend items to their users. They map item features to user requirements
to determine whether an item is useful for the user. Demographic (DG) systems recommend
items on the basis of the user's demographic profile, which helps in marketing different
items [1]. Fig. 1 depicts the different approaches of recommender systems.
Fig. 1. Different approaches of recommender systems (diagram labels: Recommender system;
Model based; Memory based; Classification; Clustering)
on a daily basis, and social networks, because of their large size and structure, are an
integral and growing field of research.
The value of this data can be leveraged by recommender systems, and the resulting
findings can help to solve interesting problems related to social obligation, recruitment
and friend suggestion. One important application of recommender systems is job
recommendation, which recruits suitable candidates through a job portal. Every day, many
candidates browse online job portals to find effective, meaningful and transparent jobs.
Many online job portals are available today, and they use content-based or collaborative
recommendation. One such job recommender is Work4, a San Francisco based company that uses
a content-based approach and offers a Facebook recruitment solution based on a few user
attributes. This system faces many challenges, such as sparsity of jobs and users,
overspecialization and limited content analysis. Existing systems also do not resolve
unmatched job/candidate pairs, or the many underqualified applicants that match the search
criteria [2]. The best fit between job and candidate pairs depends on underlying aspects
that are hard to measure. The problems of existing recommendation systems can be addressed
by merging the important features of content-based and collaborative filtering, so that
each approach compensates for the other's weaknesses. In hybridization we use the features
of both approaches while dropping their limitations, which gives better results than the
earlier approaches. Content-based systems use algorithms such as k-nearest neighbour,
clustering, Bayesian clustering and deep learning. Collaborative filtering uses user-based
and item-based algorithms such as support vector machines, principal component analysis
and neural networks. Hybrid systems use existing algorithms such as feature combination,
maximum entropy and deep learning.
This paper surveys accuracy measures, application domains and challenges of existing job
recommender systems, and compares different job recommender systems. We compare the
methods, approaches and evaluation metrics of recommendation systems on datasets from three
companies and discuss the results. The paper is organized in five parts: Sect. 1 introduces
recommender systems, Sect. 2 gives the state of the art, Sect. 3 presents the research
questions, Sect. 4 gives the job recommendation comparison, and Sect. 5 presents the
approaches and the comparison of the different algorithms and concludes the paper.
2 Literature Survey
A social-aware group recommendation framework that uses two important characteristics of
group members, tolerance and altruism, is presented. The problems of mobile internet users
can be handled with collaborative filtering (CF) and hybrid filtering, which also predict
the interests of mobile users for effective recommendation.
Anandhan et al. [6] present different types of recommendation systems for online
resources such as blogs, forums, social networking websites, bookmarking websites, and
video and chat portals. The authors also present recommendation system approaches, research
domains, the datasets used in each domain, mining algorithms, recommendation types and
different performance measures. Wu et al. [7] present Kaleido, a real-time mobile system
that uses a clustering approach and a latent bias model to give recommendations by taking
affective text into account. Adomavicius and Tuzhilin [8] and Yang et al. [9] present the
state of the art of recommender systems and different techniques, such as matrix
factorization based and neural network based systems, used for online applications such as
voting systems. The authors also explore users' social connections and group affiliation
information.
Aghasian et al. [10] survey privacy concerns, measurements and privacy-preserving
techniques used in online social networks and recommender systems. The authors discuss
privacy-preservation models for users and describe information sanitization and data
obfuscation to assure the anonymity of individuals, and they address resource consumption,
computation time and modeling attacks. Taghavi et al. [11] present a deep investigation of
real-world recommenders, focusing on statistical research of published recommender systems,
and introduce taxonomies driven by several phases: initial, design and development,
estimation, and function phases. Valverde-Rebaza et al. [12] present a model that uses a
new, publicly available dataset formed by a set of job seeker profiles and a set of job
vacancies collected from different job search engine sites, and propose a framework for job
recommendation based on the professional skills of job seekers. Diaby et al. [13] present
another type of recommender system, a job recommender system that offers jobs based on
fields extracted from Facebook. Using support vector machines and content-based filtering,
different fields of resumes and Facebook profiles are extracted and analysed.
Geyik et al. [14] present LinkedIn Talent Solutions, which helps job recruiters find
suitable candidates and helps job aspirants search for the best career opportunities. The
authors also address problems of traditional search and recommendation: the best job
opportunity requires mutual interest between the recruiter and the candidate, and
recruiters are often unable to express their hiring needs because of a lack of domain
knowledge and the time and manual effort needed to find the best search criteria. Jalili
et al. [15] and Kenthapadi et al. [16] present a statistical modeling system and several
challenges in the design and implementation stages of the existing system, and describe a
few modeling components (Bayesian hierarchical and smoothing techniques) whose main task is
to present robust compensation insights to users. The authors also highlight different
challenges and their solutions, such as data collection and processing, improvements in
statistical smoothing and improved outlier detection at the initial stages. The LinkedIn
Talent system addresses all these issues.
3 Research Questions
RQ1: What kinds of approaches are used for generating recommendations by Recommender System?
The recommendation technologies widely used in commercial applications can be broadly
characterized as content-based and rating-based filtering.
Collaborative filtering (CF) gives recommendations for a target user based on items and on
the past history of similar users with similar choices [15]. CF techniques are further
categorized into two main sub-types: memory-based techniques, which make recommendations by
comparing a user's historical records to other records in the database, and model-based
techniques, which use statistical learning to fit a model to the user data and then use it
to generate predictions. Content-based filtering (CB) recommends items similar to those the
user has liked in the past. Here a profile is created for each individual user or item that
describes its important characteristics. For example, the attributes of a movie profile may
be its genre, director, story, actors, box office popularity, effects, etc. Hybrid
filtering (HF) follows a blended approach that combines the other basic approaches to
achieve some synergy among them.
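As an illustration of the memory-based idea described above, the following minimal Python
sketch predicts a user's rating for a job as a similarity-weighted average of the ratings
given by other users. The toy ratings, the user and job identifiers and the function names
are assumptions made for this example and do not come from the systems surveyed here;
cosine similarity over co-rated items is only one common choice of similarity.

```python
from math import sqrt

# Hypothetical toy data (user -> {job: rating}); not taken from the paper.
ratings = {
    "u1": {"j1": 5.0, "j2": 3.0, "j3": 4.0},
    "u2": {"j1": 4.0, "j2": 2.0, "j3": 5.0, "j4": 4.0},
    "u3": {"j1": 1.0, "j2": 5.0, "j4": 2.0},
}


def cosine(a, b):
    """Cosine similarity computed over the items that both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user has rated the item.
    """
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine(ratings[user], their_ratings)
        num += sim * their_ratings[item]
        den += abs(sim)
    return num / den if den else None


# u1 has not rated j4; the estimate leans towards u2, who is more similar to u1.
print(predict("u1", "j4", ratings))
```

A model-based technique would instead fit a model (for example a matrix factorization) to
the same ratings and predict from the fitted parameters.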
Hybrid recommendation techniques aim for the best performance by mitigating the
shortcomings of one technique with the strengths of the others. Many different ways have
been proposed in the literature to create a hybrid system by merging two or more basic
approaches. Several other approaches in the literature include data filtering techniques,
which extract information from data and can be used to improve RS performance. In 2002,
Burke presented several hybridization techniques, classified as Weighted, Switching, Mixed,
Feature Combination, Feature Augmentation, Cascade and Meta-level [17].
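The Weighted scheme in this list can be sketched as follows. This is a minimal
illustration, not the method of any system discussed in this paper: the score
dictionaries, the min-max rescaling and the weight `alpha` are all assumptions chosen for
the example.

```python
def min_max(scores):
    """Rescale a non-empty {item: score} dict to [0, 1] so both sources are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def weighted_hybrid(cb_scores, cf_scores, alpha=0.5):
    """Rank items by alpha * content-based score + (1 - alpha) * collaborative score.

    Items missing from one source contribute 0 for that source.
    Ties are broken alphabetically by item id.
    """
    cb, cf = min_max(cb_scores), min_max(cf_scores)
    items = set(cb) | set(cf)
    combined = {i: alpha * cb.get(i, 0.0) + (1 - alpha) * cf.get(i, 0.0) for i in items}
    return sorted(combined, key=lambda i: (-combined[i], i))


# Hypothetical scores: content-based similarities and predicted CF ratings.
cb_scores = {"j1": 0.9, "j2": 0.2, "j3": 0.5}
cf_scores = {"j1": 2.0, "j2": 4.5, "j3": 4.0}

print(weighted_hybrid(cb_scores, cf_scores))             # ['j3', 'j1', 'j2']
print(weighted_hybrid(cb_scores, cf_scores, alpha=1.0))  # ['j1', 'j3', 'j2']
```

In this toy case the job that is moderately good under both views (j3) is ranked first,
which is the kind of synergy a weighted hybrid is meant to provide; in practice `alpha`
would have to be tuned on held-out data.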
RQ2: What are the strengths and weaknesses of Recommendation System?
Table 1 summarizes the strengths and weaknesses of the algorithms and data used in
recommendation systems; each entry is marked as possible (Y) or not possible (N).
Table 1. Open challenges of different recommendation approaches
Open challenges Content Collaborative Context Knowledge Demographic
RQ3: Issues and Challenges encountered in Recommender System.
This section describes the most common issues and challenges encountered in deploying RSs.
1. Sparse RSs: Users generally do not rate most products, especially items that are new
to the market, so the resulting ratings matrix becomes very sparse. This sparsity reduces
the chances of finding a set of users with similar ratings.
2. Cold-start problem: It is also referred to as the new-item and new-user problem. In job
recommendation it corresponds to a new job and a new candidate: a recent job cannot be
recommended when it is first introduced to a CF system with no ratings.
3. Scalability problem: Scalability problems mainly arise in huge and dynamic datasets
produced by interactions between users and items, such as preferences, ratings and
reviews. Some recommendation algorithms provide the best results on relatively small
datasets but behave inefficiently or poorly on very large datasets.
4. Privacy issue: To produce high-quality personalized recommendations, RSs tend to
gather as much user data as possible and to exploit it to the fullest. On the other
side, this may create a negative impression in users' minds about their privacy,
because the system knows too much about them. Techniques therefore need to be designed
that use user data sensibly and carefully, ensuring that information about users' true
preferences is not freely accessible to malevolent users.
5. Diversity of items: A user is more likely to find an item of interest in a
recommendation list if the list reflects some diversity in the recommended items.
Recommendations restricted to one narrow type of product have no value unless the user
explicitly desires it or has a narrow set of preferences.
6. Gray sheep problem: Black sheep are groups of users that have very few or no
correlating users, and gray sheep are users that have their own unusual tastes and low
correlation with others.
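The first two issues in this list can be made concrete with a small Python sketch that
measures the sparsity of a ratings matrix and lists the jobs and candidates a CF system
cannot yet handle. The data and function names are hypothetical and only serve to
illustrate the definitions.

```python
def sparsity(ratings, n_users, n_items):
    """Fraction of the user-item matrix that has no observed rating."""
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1.0 - observed / (n_users * n_items)


def cold_start_items(ratings, all_items):
    """Items (e.g. newly posted jobs) that no user has rated yet."""
    rated = {item for user_ratings in ratings.values() for item in user_ratings}
    return sorted(set(all_items) - rated)


def cold_start_users(ratings, all_users):
    """Users (e.g. new candidates) with no ratings at all."""
    return sorted(u for u in all_users if not ratings.get(u))


# Hypothetical toy data: u3 is a new candidate, j3 is a newly posted job.
ratings = {"u1": {"j1": 5}, "u2": {"j1": 3, "j2": 4}, "u3": {}}
users, items = ["u1", "u2", "u3"], ["j1", "j2", "j3"]

print(round(sparsity(ratings, len(users), len(items)), 3))  # 0.667
print(cold_start_items(ratings, items))                     # ['j3']
print(cold_start_users(ratings, users))                     # ['u3']
```

Real job portals typically show far higher sparsity than this toy matrix, which is why
pure CF struggles with new jobs and new candidates.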
RQ4: Which evaluation methods are used to evaluate the quality of Recommender System?
The quality of a recommender system can be evaluated with evaluation measures, whose
results depend on the type of collaborative filtering application.
1. Precision: It evaluates the portion of relevant items retrieved out of all items
retrieved and measures the exactness of the results, e.g. the proportion of recommended
jobs that are actually suitable.
\[ F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{3} \]
\[ p(a \mid b) = \frac{c(a, b)}{c(b)} \tag{4} \]
\[ \mathrm{pmi}^2 = \log \frac{c(a, b)^2}{c(a)\, c(b)} \tag{5} \]
6. Scalability: The system must navigate huge collections of item/job data and scale up
to the size of actual datasets; efficiency is considered together with Eqs. (1), (2) and
(3), and the analysis must be accelerated for massive datasets. These six evaluation
metrics are used to evaluate and measure the performance of recommendation systems [18].
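To make the accuracy measures concrete, the sketch below computes precision, recall and
the F1 score for a single recommendation list. It assumes the standard set-based
definitions of precision and recall; the job identifiers and the function name are
hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall and F1 for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: four recommended jobs, three jobs actually suitable.
recommended_jobs = ["j1", "j2", "j3", "j4"]
suitable_jobs = ["j2", "j4", "j5"]

# Two of four recommendations are suitable and two of three suitable jobs are found:
# precision = 1/2, recall = 2/3, F1 = 4/7.
print(precision_recall_f1(recommended_jobs, suitable_jobs))
```

F1 is the harmonic mean of the two measures, so it stays low unless both precision and
recall are reasonably high.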
RQ5: In which application domains are RSs being adopted?
Recommender systems are used in web recommendation, e-commerce marketing, entertainment,
movies, music, e-services, e-learning (education), news media, tourism and many more [19].
Web recommenders: A web recommender learns users' personal interests; Personal WebWatcher
keeps track of the web pages and contents users visit and gathers profile data in the form
of a weighted interpretation network [20].
E-commerce: Product recommendations depend on demographic information or the past
purchasing behaviour of the customer. A hybrid recommendation system computes the
similarities between product details and user profiles. It also increases competition
amongst sellers [21].
Movies: Movie RSs attract almost all categories of users and researchers. MovieLens
recommends movies to people based on their movie ratings, and Netflix recommends movies and
TV shows for streaming [22].
Music: Last.fm and Pandora are the best-known examples of music recommendation systems that
use CF. FOAFing (friend of a friend) discovers and explores music content based on user
profiling via FOAF descriptions and content-based and context-based information [23].
News: News RSs provide breaking news to potential consumers. YourNews42 is an RS for
personalized news access in which a distinct interest profile is maintained for topics such
as Business, National, World, etc. [24].
Tourism: Tourism RSs employ two types of interfaces: web-based, which are very useful prior
to the visit (for example e-Tourism and City Trip Planner), and mobile-based, which
recommend attractions during the visit (for example MapMobyRek and GeOasis) [22, 23].
Entertainment and education: Various TV program recommender systems are based on
content-based filtering [25]. The contents analysed mainly include online text documents,
images, videos, web page contents, e-mail contents and news articles. Recommender
technologies [26–28] give recommendations with the help of probabilistic models. E-learning
RSs help to explore a wide range of research and maintain digital libraries [1, 19, 29].
RQ6: What gaps exist in the existing Job Recommender System?
Recommender systems face several challenges that are difficult to solve with the available
approaches (content-based and collaborative filtering) [18]. A few challenges that have not
been addressed by researchers are shown in Fig. 2.
1. In the existing system, a balance between privacy and modeling needs is not included.
Outlier detection is also not done at the submission stage of user profile and behaviour.
2. The sparsity problem is faced by traditional CF methods. Scalable algorithms are not
designed.
3. The cold-start problem appears: sometimes new users and new jobs have no interactions,
and even if they interact frequently they have to wait for the next turn. In a big-scale,
intensive system this problem is difficult to solve.
4. E-learning and job recommendation involve uncontrolled vocabulary, tag ambiguity, tag
redundancy, and tags with little semantics but many variations. Algorithms such as Likemind
for prediction and Horting for sparsity issues are not used.
5. Tree and ensemble models increase sparsity. The current model uses tree ensemble
methods, and this model cannot work with sparse id features; sparse features only work
for non-zero numbers and a small number of examples.
6. Jobs in the current system are not uniformly distributed; they are highly biased across
job types and user interests. Matrix factorization and KNN cannot solve this problem
because interactions are rare.
Fig. 2. Gaps in the existing system
Table 2. Comparison of existing job recommender system
User profiling and Layout User Advantages Drawbacks
cosine similarity and the All dataset (Work4). It serves millions of jobs and approximately
65 million actively searched resumes. Work4 uses a few attributes of Facebook or LinkedIn
users, for which access is specially granted [24, 32]. Jobs whose descriptions match a
profile are considered suitable and are used for hiring. For comparison we used three types
of dataset and their aggregate: (1) Random dataset: a few users and jobs are selected at
random and manually annotated from the Work4 databases; more than 3,494 entries are used
for evaluation. (2) Feedback dataset: feedback data from application users is stored;
6,650 entries are used for evaluation. (3) Candidates dataset: candidates apply only to
jobs that match their profile; more than 15,625 entries are used for evaluation. (4) All
dataset: created artificially as the aggregation of the random, feedback and candidate
data, with approximately 26,669 entries. These datasets are tested with the cosine
similarity and linear SVM techniques, and the results show that linear SVM gives better
results than cosine similarity (Fig. 3 depicts the comparison between the Work4 datasets
and the different techniques).
Fig. 3. Comparison between the Work4 datasets and different techniques (y-axis: accuracy, %)
Data set size       3,494   6,650   15,625   26,669
Cosine Similarity   4       5       7        8.5
Linear SVM          5.5     6       7.5      9
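The cosine-similarity baseline used in this comparison can be sketched as follows. This is
a minimal illustration under strong assumptions: profiles and job descriptions are reduced
to bag-of-words vectors over whitespace tokens, and the texts, identifiers and function
names are invented for the example. It is not the feature pipeline used by Work4.

```python
from collections import Counter
from math import sqrt


def bag_of_words(text):
    """Term-frequency vector of a text, using lowercase whitespace tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_jobs(profile_text, jobs):
    """Job ids ordered by decreasing similarity to the profile (ties by id)."""
    profile = bag_of_words(profile_text)
    scored = [(cosine_similarity(profile, bag_of_words(desc)), job_id)
              for job_id, desc in jobs.items()]
    return [job_id for _, job_id in sorted(scored, key=lambda x: (-x[0], x[1]))]


# Hypothetical profile and job descriptions.
profile = "python developer machine learning"
jobs = {
    "j1": "senior python developer",
    "j2": "sales manager retail",
    "j3": "machine learning engineer python",
}

print(rank_jobs(profile, jobs))  # ['j3', 'j1', 'j2']
```

A linear SVM differs in that it learns a weight for each feature from labelled
job/candidate pairs instead of treating all terms equally, which is one plausible reason
for its better results in this comparison.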
We compare the performance of four different models (Work4, CareerBuilder, LinkedIn Job
Ecosystem and LinkedIn Talent Search) with the following algorithms: classification with
support vector machines (SVM), k-nearest neighbour (KNN), the graph-based approach (GBA)
and deep structured semantic models (DSSM). The results show that the graph-based approach
gives the best results compared with the other approaches. LinkedIn Talent Search uses a
deep learning approach, which improves accuracy.
In this section we compare different algorithms: matrix factorization (MF) for evaluation,
SVM for evaluation, neural networks (NN) for analysis, KNN for finding similarity and naive
Bayes (NB) for finding probability. HIS (hybrid immunizing solution) is a job recommender
system that uses a collaborative filtering approach and recommends the best-matching job
openings for an applicant (user). GBA adopts a graph theory approach and link analysis for
page ranking, which overcomes the major shortcomings of the collaborative filtering
approach. Graph-based approaches are of two types, homogeneous and heterogeneous. The DSSM
uses different layers: word hashing is done in the first layer, and the next layer reduces
the dimension of the document vectors. Here Eq. (4) gives the probability of interaction
between two jobs, and Eq. (5) gives the chance of a least popular job appearing at the top
of the related job list (Fig. 4 depicts the comparison between different datasets and
algorithms).
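The interaction probability of Eq. (4) can be illustrated with a short Python sketch. It
assumes that c(b) counts the user sessions in which job b was interacted with and that
c(a, b) counts the sessions containing both jobs, so that the probability of a given b is
c(a, b) divided by c(b); this reading of the counts, the sessions and the function names
are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sessions):
    """Count sessions per job, c(a), and sessions per unordered job pair, c(a, b)."""
    single, pair = Counter(), Counter()
    for session in sessions:
        jobs = set(session)  # a job counts once per session
        single.update(jobs)
        pair.update(frozenset(p) for p in combinations(sorted(jobs), 2))
    return single, pair


def conditional(a, b, single, pair):
    """Estimate p(a | b) = c(a, b) / c(b); 0.0 when b was never seen."""
    return pair[frozenset((a, b))] / single[b] if single[b] else 0.0


# Hypothetical interaction sessions (jobs viewed or applied to together).
sessions = [["j1", "j2"], ["j1", "j2", "j3"], ["j2", "j3"], ["j1"]]
single, pair = cooccurrence_counts(sessions)

# j2 appears in 3 sessions and j1 co-occurs with it in 2 of them, so p(j1 | j2) = 2/3.
print(conditional("j1", "j2", single, pair))
```

Ranking the jobs a by this estimate for a fixed job b gives a simple related-jobs list,
which tends to favour popular jobs; this is the bias the text attributes to Eq. (5) the
role of counteracting.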
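Fig. 4. Comparison between different datasets and algorithms (y-axis: precision, 0 to 100;
x-axes: company dataset W4, CB, LE, LT and algorithm MF, NN, HIS, GBA, DPSS; legend entries:
KNN, DL, Dataset 1 (3,494), Dataset 2 (6,650), Dataset 3 (15,625))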
6 Conclusion
References
1. Lu J, Wu D, Mao M, Wang W, Zhang G (2015) Recommender system application
developments: a survey. Decis Supp Syst 74:12–32
2. Yang Z, Wu B, Zheng K, Wang X, Lei L (2016) A survey of collaborative filtering-based
recommender systems for mobile internet applications.
https://doi.org/10.1109/access.2016.2573314
3. Kumar B, Sharma N (2016) Approaches, issues and challenges in recommender systems: a
systematic review. Indian J Sci Technol 9(47). https://doi.org/10.17485/ijst/2015/v8i1/94892
4. Eirinaki M, Gao J, Varlamis I, Tserpes K (2017) Recommender systems for large-scale
social networks: a review of challenges and solutions. Elsevier B.V.
http://dx.doi.org/10.1016/j.future.2017.09.015
5. Sun L, Wang X, Wang Z, Zhao H, Zhu W (2017) Social aware video recommendation for
online social groups. IEEE Trans Multimed 19(3)
6. Anandhan A, Shuib L, Ismail MA, Mujtaba G (2018) Social media recommender systems:
review and open research. IEEE 6:2169–3536. https://doi.org/10.1109/access.2018.2810062
7. Wu C, Zhang Y, Jia J, Zhu W (2015) Mobile contextual recommender system for online
social media. IEEE Trans Mob Comput 14(8). https://doi.org/10.1109/tmc.2017.2694830
8. Adomavicius G, Tuzhilin A (2005) Towards the next generation of recommender systems:
a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng
17:734–749
filtering-based recommendation of online social voting. IEEE Trans Comput Soc Syst 4(1).
https://doi.org/10.1109/tcss.2017.2665122
10. Aghasian E, Garg S, Montgomery J (2018) User’s privacy in recommendation systems
applying online social network data: a survey and taxonomy.
http://arxiv.org/abs/1806.07629v1 [cs.CR]. Springer
11. Taghavi M, Bentahar J, Bakhtiyari K, Hanachi C (2017) New insights towards developing
recommender systems. The British Computer Society. https://doi.org/10.1093/comjnl/bxx056
12. Valverde-Rebaza J, Puma R, Bustios P, Nathalia (2018) Job recommendation based on job
seeker skills: an empirical study. In: Proceedings of the Text2StoryIR’18 workshop,
Grenoble, France, 26 Mar 2018. http://ceur-ws.org
13. Diaby M, Viennet E, Launay T (2013) Toward the next generation of recruitment tools: an
online social network-based job recommender system. In: 2013 IEEE/ACM international
conference on advances in social networks analysis and mining
14. Geyik SC, Guo Q, Hu B, Ozcaglar C, Thakkar K, Wu X, Kenthapadi K (2018) Talent
search and recommendation systems at LinkedIn: practical challenges and lessons learned.
In: Proceedings of SIGIR’18, 8–12 July 2018, Ann Arbor, MI, USA. ACM, New York,
NY, USA, 2 pages. https://doi.org/10.1145/3209978.3210205
15. Jalili M, Ahmadian S, Izadi M, Moradi P, Saleh M (2018) Evaluating collaborative
filtering recommender algorithms: a survey. Digital Object Identifier 6. IEEE Access.
https://doi.org/10.1109/access.2018.2883742
16. Kenthapadi K, Chudhary A, Stuart A (2017) LinkedIn salary: a system for secure
collection and presentation of structured compensation insights to job seekers.
http://arxiv.org/abs/1705.06976v2 [cs.SI]
17. Ricci F, Rokach L, Shapira B, Kantor PB (2016) Recommender systems handbook.
Springer, US
18. Mukamakuza C, Sacharidis D, Werthner H (2018) Mining user behavior in social
recommender systems. In: WIMS’18 proceedings of the 8th international conference on
web intelligence, mining and semantics. https://doi.org/10.1145/3227609.3227651. Article
no. 37. ISBN: 978-1-4503-5489-9
19. Ghazanfar MA, Prugel-Bennett A (2010) Building switching hybrid recommender system
using machine learning classifiers and collaborative filtering. IAENG Int J Comput Sci 37:3
20. Gomez-Uribe CA, Hunt N (2016) The Netflix recommender system: algorithms, business
value, and innovation. ACM Trans Manag Inf Syst (TMIS) 6(4)
21. Billsus D, Pazzani MJ, Chen J (2002) A learning agent for wireless news access. In:
Proceedings of 5th ACM international conference on intelligent user interfaces, pp 33–36
22. Thiengburanathum P, Cang S, Yu H (2016) An overview of travel recommendation
system. In: IEEE 22nd international conference on automation and computing
23. Vansteenwegen P, Souffriau W, Berghe GV, Van Oudheusden D (2011) The city trip
planner: an expert system for tourists. Expert Syst Appl 38(6):6540–6546
24. Mishra R (2019) Entity resolution in online social networks (@Facebook and LinkedIn).
In: Proceedings of IEMIS 2018, vol 2. https://doi.org/10.1007/978-981-13-1498-8_20.
25. Franke M, Geyer-Schulz A, Neumann AW (2008) Recommender services in scientific
digital libraries. In: Tsihrintzis GA, Jain LC (eds) Multimedia services in intelligent
environments. Studies in computational intelligence, vol 120, pp 377–417. Springer,
Berlin, Heidelberg