
Computer Science and Information Systems 00(0):0000–0000 https://doi.org/10.2298/CSIS123456789X

NLP and ML-based model architecture for recruiting IT professionals

Juan Rolando Eneque Pisfil¹, Hugo Calderón-Vilca¹

¹ Universidad Nacional Mayor de San Marcos, Lima, Perú
juan.eneque@unmsm.edu.pe, hcalderonv@unmsm.edu.pe

Abstract. Personnel recruitment is a key process for companies, since through it they try to find the best-qualified person to perform a job or occupy a position. The process involves defining the job offer, which sets out the qualities (age, knowledge, experience, among others) that a candidate must meet for the position. Several platforms act as intermediaries between people and companies, but they do not provide the ability to assess whether an applicant meets the requirements of a job offer. The objective of this research is to propose a model that helps automate the recruitment process for IT professionals.

Keywords: Recruitment, NLP, ML

1. Introduction

Personnel selection is the process of obtaining the quantity and quality of employees needed for the business and involves a large number of activities (planning, recruitment, selection and incorporation of new employees).
[16] indicates that one of the disadvantages of the recruitment process is the operating cost of applying appropriate selection techniques. Choosing the candidate who meets the requirements of the position offered is a complicated task because the Human Resources area must invest considerable resources in activities such as reviewing profiles, filtering and personal interviews.
Human resources management and its problems are being addressed by Artificial Intelligence (AI) and its branches. For example, the literature review of [6] presents a diverse set of suggestions on how specific AI techniques could be applied to specific Human Resources tasks.
An example of this is the proposal of [4], which addresses the problem of candidate classification with the help of Machine Learning (ML). The authors evaluated supervised learning algorithms (linear regression, M5 model tree, REP decision tree and support vector machine) in combination with a semantic skill matching mechanism to achieve automated electronic recruitment.
Another ML-based proposal is [3], which presents a microservices-based framework to recommend the best job offers for a candidate.

On the other hand, [19] proposes a system with a hybrid approach (NLP and regular expressions) that seeks to solve the problem of resume categorization and resume-job offer matching.
Finally, [11] presents a bidirectional recommender system for candidates in job search. The authors' proposal implements a microservices-based, scalable and stateless architecture to drive automation through recommendation using Machine Learning and statistical methods.
An electronic recruitment (e-recruitment) strategy that also implements "intelligent" mechanisms or AI techniques offers great advantages when evaluating hundreds of profiles, since such techniques yield faster (depending on the technique and processing resources) and more accurate results.
Based on the review of recruitment research, the objective of the present research is to design an architecture based on Natural Language Processing (NLP) and Machine Learning to address the problem of recruiting IT professionals.
The rest of the paper is organized as follows. Section 2 reviews research on the recruitment problem. Section 3 details the architecture design and, finally, Section 4 presents and discusses the results.

2. State of the art

We analyzed a total of 20 studies and divided them into 3 categories according to the techniques applied: Machine Learning, Natural Language Processing and Semantic Correspondence.

2.1. IT personnel recruitment using Machine Learning techniques

[4] proposes a system for candidate selection through the analysis of the candidate's LinkedIn profile and blog. For this purpose, the authors evaluated supervised learning algorithms (linear regression, M5 model tree, REP decision tree and support vector machine) and combined them with a semantic skill matching mechanism.
Drawing on the strengths of semantic knowledge (concept similarity) and of Machine Learning methods, [3] proposes a scalable and stateless architecture for an automated Human Capital Management system that recommends jobs to a candidate and, conversely, candidates to a company.
[17] proposes a recommendation system that uses a Gradient Boosting Decision Tree (GBDT) and a hybrid convolutional neural network model to compute the correlation between a job seeker and a job offer, with the goal of improving the quality of human resource recommendation.
[20] proposes a convolutional neural network model to solve the person-job matching problem. The network learns the joint representation of person-job fit from historical job applications.
[11] proposes an architecture for automation through recommendation using machine learning and statistical methods. The authors' proposal extends the research of [3] and aims to achieve better system robustness and recommendation quality by implementing features such as candidate career interests, scoring functions for academic information and professional experience, string matching, etc.
[15] presents an automated Machine Learning-based model for CV recommendation, in which a CV is preprocessed for cleaning and feature extraction using the TF-IDF approach and is then assigned to a category by the classification model.
In the recruitment process, recruiters do not focus exclusively on a person's technical skills to determine their suitability for an offered position, but also take into account characteristics such as education, personality and experience.

Table 1. Characteristics to consider for personnel selection

Author(s)                              Considered characteristics
[Error: Reference source not found]    Personality
[Error: Reference source not found]    Location, experience and education
[Error: Reference source not found]    Employment history
[Error: Reference source not found]    Work experience, company (Glassdoor parameters), location and education

As Table 1 shows, the authors of the proposals analyzed consider characteristics other than a candidate's technical skills when assessing suitability for an offered position.
Machine Learning algorithms were evaluated by [4] and [15]. [4] evaluated linear regression, the M5 model tree, the REP decision tree and Support Vector Regression (SVR) with two nonlinear kernels (polynomial kernel and universal PUK kernel) for the evaluation of total experience and relative experience, concluding that the tree models and the SVR model with the PUK kernel produced better correlation results for their proposal. On the other hand, [15] evaluated Random Forest (RF), Multinomial Naive Bayes (NB), Logistic Regression (LR) and Linear Support Vector Machine (SVM) for CV classification, and found that the latter achieved the best accuracy.

[17] and [20] go deeper into Machine Learning and propose solutions based on Convolutional Neural Networks. The first proposes a recommendation system with a GBDT model and a hybrid convolutional neural network model for regularization and recommendation. The second relies exclusively on convolutional neural networks and applies cosine similarity to measure the similarity between job offers and a candidate's CV.

2.2. IT personnel recruitment using semantic correspondence techniques

[8] proposes a job recommendation system based on the user profile, which also seeks to predict career advancement from the user's work history.
[1] proposes a content-based recommendation algorithm that extends and updates the Minkowski distance, with the objective of matching people and jobs. The proposal quantifies the suitability of a job seeker by analyzing structured forms of the job and of the candidate's profile, built through content analysis of their unstructured forms.
[7] proposes a resume matching system called RésuMatcher, which determines the suitability of a job by calculating the similarity between the models generated from the resume and the job description.
[13] proposes a career path recommendation system that relies on text mining and collaborative filtering techniques and also recommends skills based on related job offers generated from the skills in the user's profile.
[12] proposes a candidate recommendation system called Smart Applicant Ranker, which uses ontologies to compare CV models (consisting of education, work experience and skills) with job requirement models and finds the best candidates based on the similarity of the generated ontological models.
[2] proposes a bidirectional semantic correspondence system to measure the degree of semantic similarity between the skills and qualifications of a job seeker and an offered vacancy. In addition, the authors apply machine learning techniques for bidirectional matching of job vacancies and occupational standards, in order to improve the content of job vacancies and job seeker profiles based on social network analysis and occupational standards.
[18] proposes the use of weighted tree algorithms to calculate the similarity between job advertisements and the keywords or criteria used by job seekers.
[14] proposes an ontology-based job recommendation system that returns the most relevant jobs and is built from the basic information collected and the user's lists of favorite and viewed jobs.
The proposals of [8], [1], [12], [2], [18] and [14] require the information to be analyzed to have a certain structure. In contrast, the proposals of [7] and [13] apply unstructured analysis, taking into account that the information contained in a CV does not follow a single style or format.

Table 2. Format of the information to be processed

Author(s)                              Information to process (Structured / Not structured)
[Error: Reference source not found]    X
[Error: Reference source not found]    X
[Error: Reference source not found]    X
[Error: Reference source not found]    X
[Error: Reference source not found]    X
[Error: Reference source not found]    X
[Error: Reference source not found]    X
[Error: Reference source not found]    X

[8], [7] and [2] approach the selection problem from the perspective of similarity between a candidate's CV or profile and the vacancy or position offered. In contrast, [18] addresses the problem through the similarity between the content of a job offer and the search keywords used by a user.
Although the proposals of [8], [7] and [2] share the same similarity approach, each has its own particularity. [8] applies recommendation based on the content of the candidate's work history. [7] relies on the qualifications, skills and work experience described in the candidate's CV and those required in the job offer, and generates recommendations based on the similarity between them. Finally, [2] considers the similarity of qualifications and skills as well as the candidate's connections, since their testimony strengthens the evaluation of whether a candidate is suitable for a vacancy.

Table 3. Data source

Author(s)                              Data source                        Quantity
[Error: Reference source not found]    LinkedIn                           2400
[Error: Reference source not found]    Kaggle                             100
[Error: Reference source not found]    Indeed                             1000
[Error: Reference source not found]    Universidad Estatal de San José    1000
[Error: Reference source not found]    -                                  -
[Error: Reference source not found]    Not specified                      175
[Error: Reference source not found]    Not specified                      100
[Error: Reference source not found]    -                                  -

2.3. IT personnel recruitment using Natural Language Processing techniques

[10] proposes an online recruitment system that exploits multiple semantic resources and uses statistical measures of concept relatedness. It also relies on NLP to identify and extract candidate concept lists from job postings and CVs.
[9] proposes a solution focused on job matching for older workers. In this solution, keywords are extracted from the description entered in the system's search engine after tokenizing sentences and filtering words based on morphological analysis. The top 10 keywords are then used to search for related job offer documents.
To solve the resume-job offer matching problem of job portals, [19] proposes a hybrid approach that uses resume categorization to reduce the dataset to be analyzed: instead of evaluating all resumes, the analysis is applied only to resumes that fall within the category described in the job offer.

To address CV retrieval based on the description of a job offer, [5] proposes the use of the average word embedding (AWE) model and the Principal Component Analysis (PCA) algorithm to solve the dimensionality problem that AWE can present.

Table 4. Weighting techniques applied in proposals using NLP

Author(s)                              Weighting technique/approach
[Error: Reference source not found]    TF-IDF
[Error: Reference source not found]    BM25
[Error: Reference source not found]    TF-IDF
[Error: Reference source not found]    AWE

The proposals of [10], [9], [19] and [5] apply different information retrieval techniques, as shown in Table 4. [10] applies the TF-IDF weighting scheme to eliminate concepts that do not carry significant value. [9] uses the Solr/Lucene scores of the BM25 algorithm, which performs scoring based on term frequency and document length normalization. [19] relies on the TF-IDF technique and then filters and refines the concept list by removing concepts with low weights. Finally, [5] indicates that classical information retrieval models such as Bag of Words (BOW) and BM25 have certain weaknesses and require complementary techniques such as latent semantic indexing (LSI), and therefore relies on average word embedding (AWE) models.

3. Architecture design

We propose an architecture based on NLP and ML whose objective is to generate a list of candidates that fully or largely meet the requirements of a job offer or, conversely, a list of job offers that best match the IT skills in a person's profile.
To do this, the information contained in a person's CV or in a job offer is entered into a structured form. This information goes through a cleaning stage that eliminates words that carry no relevant meaning or that may hinder the detection of IT skills.
The IT skills obtained during pre-processing pass through a model that detects those with the greatest semantic similarity. With these skills we obtain a list of professional categories by consulting a dictionary that maps IT skills to related professions; this list becomes one more feature of the processed profile or job offer.
Finally, in the matching module, when a job offer is being evaluated, we retrieve the CVs that share a category with the job offer in order to reduce the volume of data to be evaluated and, through a clustering algorithm, generate a ranking of the best candidates for the job offer. When a profile is being evaluated, the same process applies, except that the profile is clustered with a set of job offers.

Figure 1. Model Architecture

Figure 1 shows the architecture of our model and its components:
• Data form
• Pre-processing module
• Categorization module
• Matching module

3.1. Data form

The data form is the core of the system and the component that receives the information the model needs to work. Through it, the actors (applicant and candidate) trigger the model, since they provide the data that pass through each component and ultimately produce a ranking of candidates for the job offer entered or a ranking of job offers for the CV entered.

3.2. Pre-processing module

In this component, the text entered in the skills section goes through a cleaning process that detects and removes punctuation marks or symbols that carry no context-relevant meaning or that would prevent an IT skill from being detected.

Figure 2. Skills corpus cleaning

Figure 2 presents the proposed flowchart for data cleaning. Since our skill detection process relies on an IT dictionary, we must ensure that an IT skill (contained in the skills section of each form) does not include characters that would cause it to be missed during detection. Therefore, the first step is to convert the text of the skills section into a list of characters. We then parse each element of the list and remove signs and symbols. Finally, we rejoin the list of characters and obtain a clean corpus to process.
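The three cleaning steps can be sketched as follows. This is only an illustration under stated assumptions: the function name `clean_skills_text` and the set of characters that are kept (`#`, `+`, `.`, so that names such as `c#` or `vue.js` survive) are hypothetical choices, and replacing a removed symbol with a space instead of deleting it is also an assumption made so that adjacent skills do not fuse.

```python
import string

# Assumption: these punctuation characters are kept because they occur
# inside skill names such as "c#", "c++", "vue.js" or ".net".
KEEP = set("#+.")
STRIP = set(string.punctuation) - KEEP


def clean_skills_text(text: str) -> str:
    """Return a lower-cased, symbol-free version of the skills text."""
    chars = list(text)                                 # 1. text -> list of characters
    kept = [" " if c in STRIP else c for c in chars]   # 2. blank out signs and symbols
    cleaned = "".join(kept)                            # 3. rejoin the characters
    # Drop dots left at the end of a token (sentence punctuation) and
    # collapse repeated whitespace.
    tokens = [t.rstrip(".") for t in cleaned.lower().split()]
    return " ".join(t for t in tokens if t)


print(clean_skills_text("HTML/CSS, JavaScript (Vue.js); C#, .NET!"))
```

The cleaned string can then be scanned against the IT dictionary, which is the reason the symbols are removed in the first place.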
An important element in this module is Word2vec, a neural network composed of an input layer, a hidden layer and an output layer that allows us to calculate the semantic relationship between words in a given context. We take advantage of this tool and train it with IT skills.
This model helps us fulfill the objective of this module, which is to obtain a subset of skills with a strong semantic relationship and thus reduce the number of queries to be made later in the categorization module. The premise is that a set of strongly related skills will yield an equally related set of IT occupations.

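How the strongly related subset might be selected can be illustrated with a small sketch. The text does not state the selection rule, so everything here is an assumption: the vectors are toy values standing in for embeddings produced by the trained Word2vec model, and a skill is kept when its mean cosine similarity to the other detected skills reaches a hypothetical `threshold`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_subset(skills, vectors, threshold=0.5):
    """Keep skills whose mean similarity to the other skills is >= threshold."""
    kept = []
    for skill in skills:
        others = [t for t in skills if t != skill]
        if not others:          # a single skill has nothing to be compared with
            return list(skills)
        mean_sim = sum(cosine(vectors[skill], vectors[t]) for t in others) / len(others)
        if mean_sim >= threshold:
            kept.append(skill)
    return kept


# Toy 2-dimensional embeddings (hypothetical values, not model output).
toy_vectors = {
    "html": [0.9, 0.1],
    "css": [0.8, 0.2],
    "javascript": [0.7, 0.3],
    "datastage": [0.0, 1.0],
}
print(related_subset(list(toy_vectors), toy_vectors))
```

With these toy vectors the web-related skills stay together and the unrelated one is dropped, which mirrors the reduction in detected skills the module aims for.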
3.3. Categorization module
With this module we obtain the IT occupations related to each of the skills detected in the previous module. These occupations help us categorize the document (job offer or CV) being processed and also reduce the volume of data handled by the next module.
Table 5. IT dictionary excerpt

IT skill     IT professions
Expressjs    backend, js developer
Extjs        frontend, js developer
Firebase     backend, mobile developer
Flask        python developer, backend, web developer

Table 5 shows a small excerpt of the IT dictionary.
An IT skill is not exclusive to one profession, so when consulting the dictionary several IT skills may share one or more IT professions or occupations.
For this reason, during each query to the dictionary we record how often each profession appears. At the end of the query process, we calculate the average frequency and categorize the document under evaluation (job offer or CV) with the professions whose frequency is greater than or equal to the average.
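The frequency rule described above can be sketched as follows, using only the four dictionary entries shown in Table 5. The names `IT_DICTIONARY` and `categorize` are hypothetical, and returning the categories in alphabetical order is an assumption made for readability.

```python
from collections import Counter

# Excerpt of the IT dictionary taken from Table 5 (keys lower-cased).
IT_DICTIONARY = {
    "expressjs": ["backend", "js developer"],
    "extjs": ["frontend", "js developer"],
    "firebase": ["backend", "mobile developer"],
    "flask": ["python developer", "backend", "web developer"],
}


def categorize(skills, dictionary=IT_DICTIONARY):
    """Return the professions whose frequency is >= the average frequency."""
    frequency = Counter()
    for skill in skills:
        # Each dictionary query adds one occurrence per related profession.
        frequency.update(dictionary.get(skill, []))
    if not frequency:
        return []
    average = sum(frequency.values()) / len(frequency)
    return sorted(p for p, n in frequency.items() if n >= average)


print(categorize(["expressjs", "extjs", "flask"]))
```

For the three skills in the usage line, backend and js developer each appear twice against an average of 1.4, so they become the categories of the document.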
3.4. Matching module
In this module, when a job offer is being processed, the list of professional categories obtained is taken and, for each category, the CVs of the same category are extracted from the database. When a profile or CV is being processed, the documents extracted from the database are job offers.
A data table is built from the set of documents obtained. Its column headers are the IT skills detected in the filtered set and in the item being processed, and each row represents a profile or CV. Each row-column intersection takes a value according to the following conditions:
• 0 if the CV does not possess the IT skill described in the column.
• 1 if the CV possesses the IT skill described in the column.
• 2 if the CV possesses the IT skill described in the column and it matches one of the requirements of the job offer.
When a profile or CV is being processed, the criteria are the same, except that each row represents a job offer.
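A minimal sketch of how such a table could be built for the job-offer case is shown below. The function name `build_match_table`, the alphabetical column order and the sample skills are assumptions for illustration only.

```python
def build_match_table(offer_skills, cvs):
    """Encode each CV as a row of 0/1/2 values, one column per IT skill.

    offer_skills: skills required by the job offer being processed.
    cvs: mapping of CV id -> list of skills detected in that CV.
    """
    offer = set(offer_skills)
    # Columns: every skill seen in the offer or in any retrieved CV.
    columns = sorted(offer.union(*map(set, cvs.values())))
    table = {}
    for cv_id, skills in cvs.items():
        owned = set(skills)
        table[cv_id] = [
            0 if col not in owned        # CV lacks the skill
            else 2 if col in offer       # CV has it and the offer requires it
            else 1                       # CV has it, offer does not require it
            for col in columns
        ]
    return columns, table


columns, table = build_match_table(
    ["java", "spring"],
    {"cv_a": ["java", "android"], "cv_b": ["python"]},
)
print(columns)
print(table)
```

Weighting a required skill with 2 moves CVs that cover the offer's requirements closer together in the table, which is what the later grouping step relies on.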
This data table is the input for clustering. The unsupervised Mean-shift algorithm analyzes this set and assigns a group or cluster number to each element. Unlike other algorithms, Mean-shift does not require the number of clusters to be specified in advance; it iterates over the elements of the set and determines the number of clusters itself. Once the process is finished, we know the cluster to which each element belongs. The elements in the same cluster as the document (job offer or CV) being processed are the output of the clustering component.
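The behaviour of Mean-shift on such a table can be illustrated with a small pure-Python sketch that uses a flat kernel. It is not the implementation used in this work (a library implementation such as scikit-learn's `MeanShift` would normally be preferred); the `bandwidth` value, the sample rows and the way the job offer itself is encoded as a row of 2s are all assumptions.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift. Returns one cluster label per input point."""
    modes = []
    for p in points:
        center = list(p)
        for _ in range(max_iter):
            # Neighbours of the current center within the bandwidth.
            near = [q for q in points if math.dist(center, q) <= bandwidth]
            if not near:
                break
            new_center = [sum(c) / len(near) for c in zip(*near)]
            shift = math.dist(center, new_center)
            center = new_center
            if shift < tol:      # the center stopped moving: a mode was found
                break
        modes.append(center)
    # Points whose modes lie within one bandwidth share a cluster label.
    labels, centers = [], []
    for m in modes:
        for k, c in enumerate(centers):
            if math.dist(m, c) <= bandwidth:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels


# Columns: java, spring, python, django, android (hypothetical data).
ids = ["offer", "cv_1", "cv_2", "cv_3", "cv_4"]
rows = [
    [2, 2, 0, 0, 0],   # the job offer being processed (assumed encoding)
    [2, 2, 0, 0, 1],
    [2, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
labels = mean_shift(rows, bandwidth=2.1)
same_cluster = [i for i, lab in zip(ids, labels) if lab == labels[0] and i != "offer"]
print(labels, same_cluster)
```

Note that the number of clusters is never passed in; only the bandwidth is, and the two groups emerge from the data. The CVs that share the offer's label are the ones passed on to the ranking step.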
3.5. Model output
Our final objective is to obtain a ranking of candidates; therefore, we order the CVs obtained during clustering by the percentage of skills each CV fulfills with respect to those specified in the job offer. Formally, let CVi, with i ∈ N, be a CV that contains a list of skills HCV, and let the job offer contain the required skills (RS) and the desirable skills (DS). The percentage of RS (%RS) is the number of RS contained in HCV divided by the total number of items in HCV; the percentage of DS (%DS) is calculated in the same way.

As an example, given a CV and a job offer with RS and DS, the percentages are calculated as follows:
HCV = [Java, Spring, JSF, Oracle, Android, Flutter, Spring Boot]
• n(HCV) = 7
• RS = [Java, Android, React, Flutter] → %RS = 3/7 ≈ 42.86%
• DS = [Spring, Spring Boot] → %DS = 2/7 ≈ 28.57%
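The percentage calculation and the resulting ordering can be reproduced with a short sketch. The function names are hypothetical, skills are compared case-insensitively as an assumption, and using %DS to break ties between equal %RS values is also an assumption, since the text only states that CVs are ordered by the percentage of skills fulfilled.

```python
def skill_percentage(cv_skills, offer_skills):
    """Offer skills found in the CV, as a percentage of the CV's skill count."""
    cv = {s.lower() for s in cv_skills}
    if not cv:
        return 0.0
    matched = {s.lower() for s in offer_skills} & cv
    return 100.0 * len(matched) / len(cv)


def rank_cvs(cvs, required, desirable):
    """Return (cv_id, %RS, %DS) tuples, best first (%DS only breaks ties)."""
    scored = [
        (cv_id, skill_percentage(skills, required), skill_percentage(skills, desirable))
        for cv_id, skills in cvs.items()
    ]
    return sorted(scored, key=lambda row: (row[1], row[2]), reverse=True)


hcv = ["Java", "Spring", "JSF", "Oracle", "Android", "Flutter", "Spring Boot"]
rs = ["Java", "Android", "React", "Flutter"]
ds = ["Spring", "Spring Boot"]
print(round(skill_percentage(hcv, rs), 2))   # 42.86
print(round(skill_percentage(hcv, ds), 2))   # 28.57
```

Because the denominator is the size of the CV's own skill list, a short CV that matches a single required skill can outrank a longer CV that matches several; this follows directly from the definition given above.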
4. Results and Discussion
For the evaluation and discussion of results, we used 200 job offers and 50 profiles or CVs. In addition, we rely on an IT dictionary that consists of 225 skills and the occupations associated with each of them.
As indicated in the architecture design section, our model consists of 3 components: pre-processing, categorization and clustering. In this section we show the results of processing a document (job offer or CV) with each of these components.
4.1. Pre-processing results
4.1.1. Case 1: CV
When a CV is registered through the web system form, the section containing IT knowledge or skills is processed to detect the skills with the highest semantic similarity:

Table 6. CV: pre-processing results

CV        IT skills detected / Most similar IT skills
cv_0001   Detected (9): ['html', 'css', 'javascript', 'java', 'php', 'laravel', 'vuejs', 'rxjava', 'spring']
          Most similar (8): ['html5', 'css3', 'javascript', 'php', 'vue.js', 'java', 'spring', 'laravel']
cv_0002   Detected (13): ['html', 'css', 'javascript', 'typescript', 'java', 'php', 'python', 'angular', 'nodejs', 'azure', 'react', 'js', 'nestjs']
          Most similar (9): ['html5', 'css3', 'javascript', 'php', 'typescript', 'angular', 'python', 'react', 'nodejs']
...
cv_0049   Detected (16): ['java', 'hibernate', 'jpa', 'mybatis', 'spring', 'spring', 'javascript', 'c', 'python', 'flask', 'html', 'css', 'datastage', 'sql', 'linux', 'android']
          Most similar (10): ['java', 'spring', 'android', 'hibernate', 'mybatis', 'javascript', 'html5', 'css3', 'python', 'linux']
cv_0050   Detected (11): ['python', 'django', 'drf', 'flask', 'angular', 'android', 'java', 'css', 'html', 'js', 'net']
          Most similar (5): ['android', 'java', 'css3', 'html5', 'javascript']

Table 6 shows the results obtained by pre-processing the skills section. At this point, Word2vec helps us reduce the skills detected in that section, and as a result we obtain the IT skills that are most related to each other, that is, those with the strongest semantic relationship.

4.1.2. Case 2: job offer
In the case of a job offer, the sections that go through pre-processing are the required skills and the desirable skills, since these contain IT skills.

Table 7. Job offer: pre-processing results

Offer        IT skills detected / Most similar IT skills
Oferta_1     Detected (9): ['html', 'css', 'javascript', 'nodejs', 'angular', 'php', 'laravel', 'aws', 'azure']
             Most similar (8): ['html5', 'css3', 'javascript', 'php', 'nodejs', 'laravel', 'aws', 'azure']
Oferta_2     Detected (13): ['php', 'javascript', 'typescript', 'c#', 'xamarin', 'python', 'symfony', 'django', 'html', 'css', 'aws', 'dynamo', 'angular']
             Most similar (10): ['php', 'python', 'symfony', 'css3', 'javascript', 'html5', 'typescript', 'angular', 'c#', 'xamarin']
...
Oferta_199   Detected (14): ['android', 'html', 'css', 'javascript', 'typescript', 'java', 'php', 'python', 'angular', 'nodejs', 'azure', 'react', 'js', 'nestjs']
             Most similar (9): ['html5', 'css3', 'javascript', 'php', 'typescript', 'angular', 'python', 'react', 'nodejs']
Oferta_200   Detected (10): ['java', 'spring', 'android', 'hibernate', 'mybatis', 'javascript', 'html5', 'css3', 'python', 'linux']
             Most similar (10): ['java', 'spring', 'android', 'hibernate', 'mybatis', 'javascript', 'html5', 'css3', 'python', 'linux']

Table 7 shows the results obtained from pre-processing the skills sections (required and desirable). As in the case of the CVs, the output indicates which IT skills have the strongest semantic relationship.
4.2. Categorization results
4.2.1. Case 1: CV
After obtaining the IT skills with the highest semantic similarity, each skill is queried in a dictionary to obtain the associated IT occupations.

Table 8. CV: categorization results

CV        Assigned categories
cv_0001   ['frontend', 'web developer']
cv_0002   ['frontend', 'web developer', 'js developer']
...
cv_0049   ['java developer', 'backend']
cv_0050   ['java developer', 'frontend']

As mentioned above, the skills obtained as output from pre-processing are looked up in the IT dictionary, which yields the data shown in Table 8.
4.2.2. Case 2: job offer
For a job offer, the same process is applied as in case 1, but the skills looked up in the IT dictionary are those detected in the required skills section, since these best describe the required profile.

Table 9. Job offer: categorization results

Offer        Assigned categories
Oferta_1     ['frontend', 'web developer', 'js developer', 'php developer']
Oferta_2     ['php developer', 'backend', 'web developer', 'frontend', '.net developer']
...
Oferta_199   ['web developer']
Oferta_200   ['java developer', 'web developer']

As a result, Table 9 shows the categories obtained for each job offer that goes through the categorization process.
4.3. Clustering results
4.3.1. Case 1: CV
In the clustering stage, if the document entered is a CV, the job offers that share a category with it are extracted from the database in order to reduce the volume of data to be processed. This set of data (CV and job offers) forms the input dataset for the Mean-shift algorithm, which returns a cluster containing the CV and a subset of job offers (the offers most suitable for the profile).

Table 10. CV: clustering results

CV        Best offers    % of required skills met   % of desired skills met
cv_0001   oferta_0057    60.0                       0
          oferta_0048    30.77                      0
          oferta_0021    30.77                      0
cv_0002   oferta_0084    66.67                      0
          oferta_0048    34.62                      0
          oferta_0021    34.62                      0
...
cv_0049   oferta_0105    41.18                      100
          oferta_0007    33.34                      0
          oferta_0048    30.77                      0
          oferta_0021    30.77                      0
cv_0050   oferta_0084    55.56                      0
          oferta_0003    45.45                      0
          oferta_0073    45.45                      0
          oferta_0110    27.78                      0
          oferta_0007    27.78                      0
          oferta_0108    25.0                       0
          oferta_0048    23.08                      0
          oferta_0021    23.08                      0
          oferta_0051    18.75                      0

With the subset obtained, as shown in Table 10, we created a ranking of the best job offers for each CV.
The proposal of [4] employs semantic matching to calculate the distance between the skills and experience in the candidate's profile and the job offer requirements. On the other hand, [2] uses string matching to evaluate the correspondence between a vacancy (job offer) and a profile. Methods of this type do not consider that some IT skills can be written in more than one form (e.g., JavaScript may appear in some offers or profiles as JS). Therefore, in our proposal we created an IT dictionary to deal with this problem. The dictionary not only provides the occupations related to a skill, but also covers the various spellings with which an IT skill can be represented. This broadens the detection of skills and thus yields a better-quality result.
[19] proposes a method to automatically match CVs to their corresponding job offers by categorizing (labeling) the documents, so that only elements of the same category are compared. To this end, the authors combined two knowledge bases (DICE and O*NET) from which they obtained the occupation associated with each skill. In our proposal, by contrast, we constructed an IT dictionary with 226 skills, in which each skill has a set of associated IT occupations according to the current market. This is what differentiates our proposal from [19]: it focuses exclusively on the IT area and uses a knowledge base built manually for this purpose.
Conclusions and future work

We proposed an architecture based on Natural Language Processing and Machine Learning to address the problem of recruiting IT personnel.
As the cited references show, in addition to skills or knowledge, other qualities are assessed to determine which person best meets the requirements of a job offer. Among these is the work history, from which we can obtain years of experience and positions held, among others. As future work, we want to build on this architecture to design a generalized architecture for recruitment.


References

1. Almalis, N. D., Tsihrintzis, G. A., Karagiannis, N., & Strati, A. D. (2016). FoDRA - A new content-based job recommendation algorithm for job seeking and recruiting. IISA 2015 - 6th International Conference on Information, Intelligence, Systems and Applications.
2. Chala, S. A., Ansari, F., Fathi, M., & Tijdens, K. (2018). Semantic matching of job seeker to vacancy: a bidirectional approach. International Journal of Manpower, 39(8), 1047–1063.
3. Chaudhary, A., Jobanputra, M., Shah, S., Gandhi, R., Chaudhary, S., & Goswami, R. (2018). Automated human capital management system. 12th Annual IEEE International Systems Conference, SysCon 2018 - Proceedings, 1–8.
4. Faliagka, E., Iliadis, L., Karydis, I., Rigou, M., Sioutas, S., Tsakalidis, A., & Tzimas, G. (2014). On-line consistent ranking on e-recruitment: Seeking the truth behind a well-formed CV. Artificial Intelligence Review, 42(3), 515–528.
5. Fernández-Reyes, F. C., & Shinde, S. (2019). CV Retrieval System based on job description matching using hybrid word embeddings. Computer Speech and Language, 56, 73–79.
6. Figueroa-García, J. C., Kalenatic, D., & López-Bello, C. A. (2015). Artificial Intelligent Techniques in Human Resource Management. Intelligent Systems Reference Library, 87, 623–643.
7. Guo, S., Alamudun, F., & Hammond, T. (2016). RésuMatcher: A personalized résumé-job matching system. Expert Systems with Applications, 60, 169–182.
8. Heap, B., Krzywicki, A., Wobcke, W., Bain, M., & Compton, P. (2014). Combining career progression and profile matching in a job recommender system. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8862, 396–408.
9. Kaoru, S., Kenichi, S., Masatomo, K., & Atsuhi, H. (2017). Towards extracting recruiters' tacit knowledge based on interactions with a job matching system. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10298, 557–568.
10. Kmail, A. B., Maree, M., Belkhatir, M., & Alhashmi, S. M. (2016). An automatic online recruitment system based on exploiting multiple semantic resources and concept-relatedness measures. Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI, 2016-Janua, 620–627.
11. Mehta, M., Derasari, R., Patel, S., Kakadiya, A., Gandhi, R., Chaudhary, S., & Goswami, R. (2019). A service-oriented human capital management recommendation platform. SysCon 2019 - 13th Annual IEEE International Systems Conference, Proceedings, 1–8.
12. Mohamed, A., Bagawathinathan, W., Iqbal, U., Shamrath, S., & Jayakody, A. (2018). Smart Talents Recruiter - Resume Ranking and Recommendation System. 2018 IEEE 9th International Conference on Information and Automation for Sustainability, ICIAfS 2018, 1–5.
13. Patel, B., Kakuste, V., & Eirinaki, M. (2017). CaPaR: A career path recommendation framework. Proceedings - 3rd IEEE International Conference on Big Data Computing Service and Applications, BigDataService 2017, 23–30.
14. Rimitha, S. R., Abburu, V., Kiranmai, A., Marimuthu, C., & Chandrasekaran, K. (2019). Improving Job Recommendation Using Ontological Modeling and User Profiles. 2019 15th International Conference on Information Processing: Internet of Things, ICINPRO 2019 - Proceedings.
15. Roy, P. K., Chowdhary, S. S., & Bhatia, R. (2020). A Machine Learning approach for automation of Resume Recommendation system. Procedia Computer Science, 167(2019), 2318–2327.
16. Vallejo Chávez, L. M. (2016). Gestión del talento humano ESPOCH 2016.
17. Wang, H., Liang, G., & Zhang, X. (2018). Feature Regularization and Deep Learning for Human Resource Recommendation. IEEE Access, 6, 39415–39421.
18. Wierfi, A. D., Utami, E., & Sunyoto, A. (2019). The application of extended weighted tree similarity algorithm for similarity searching. 2019 International Conference on Information and Communications Technology, ICOIACT 2019, 428–433.
19. Zaroor, A., Maree, M., & Sabha, M. (2018). A Hybrid Approach to Conceptual Classification and Ranking of Resumes and Their Corresponding Job Posts. International Conference on Intelligent Decision Technologies, 2, 13–21.
20. Zhu, C., Zhu, H., Xiong, H., Ma, C., Xie, F., Ding, P., & Li, P. (2018). Person-Job Fit. ACM Transactions on Management Information Systems, 9(3), 1–17.
