Labor Market Prediction Using Machine Learning Methods A Systematic Literature Review


2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS)

Labor Market Prediction Using Machine Learning Methods: A Systematic Literature Review
2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS) | 979-8-3503-7222-9/24/$31.00 ©2024 IEEE | DOI: 10.1109/ICETSIS61505.2024.10459632

Fawzah Alharbi
Department of Management & Marketing
University of Bahrain
Sakhair, Kingdom of Bahrain
fawzah.alharbi@gmail.com
202100129@stu.uob.edu.bh

Prof. Adel Ismail Al-Alawi
Department of Management & Marketing
University of Bahrain
Sakhair, Kingdom of Bahrain
aialalawi@uob.edu.bh
adel.alalawi@gmail.com

Abstract— This study systematically reviews the literature in the field of labor market prediction. The main aim is to identify the most recently applied machine learning methods used to predict the job market’s future. A comprehensive search was conducted across multiple databases, including Scopus, Google Scholar, IEEE, and Web of Science, resulting in a final selection of 10 relevant articles published between 2018 and 2023. The review found that the machine learning methods used are Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), Long Short-Term Memory-Gated Recurrent Unit (LSTM-GRU), and Autoregressive Integrated Moving Average (ARIMA), alongside text mining techniques such as word embedding and sentiment analysis. These findings suggest that, in general, hybrid models have the potential to outperform individual models, especially when multiple types of data are involved. The results of this review also emphasize the importance of using integrated datasets when applying machine learning methods. Another important finding was that the key applications of machine learning in labor market prediction are predicting unemployment, predicting needed educational programs, and predicting market demand. This study contributes to labor market forecasting by providing an extensive synthesis and analysis of the existing literature on the future of the labor market and the use of machine learning methods. However, future research must include qualitative and external factors affecting the prediction of the job market's future. The major limitation of this study is the number of reviewed publications.

Keywords— Artificial intelligence, Machine Learning, Prediction, Labor Market, Job Forecasting

I. INTRODUCTION

Recently, Artificial Intelligence (AI) and employment have become a contemporary research issue, since automation, advanced technologies, and digitization are expected to alter the global labor market [1]. AI's impact on the workforce is expected to displace jobs in many sectors, particularly after the COVID-19 pandemic, which accelerated technology adoption [2]. According to [3], 22.0% of tasks currently have the potential to be automated; this rises to 40.0% after five years and to 60.0% after ten years. However, this transformation will create new job opportunities and encourage upskilling [2]. Consequently, predicting the future of employment is essential to set strategies and plan processes and policies [4]. This paper aims to review the latest research in forecasting future jobs that applies machine learning methods, giving a broad perspective on the latest applied machine learning and big data models and technologies. Using such models allows researchers to discover patterns and emerging trends that may not be apparent through traditional methods [4]. Machine learning models can provide insights into which jobs will likely grow or decline in the future [1, 3]. This study's objective is to answer the following research questions:

Question 1: What are the recent machine learning methods that have been utilized in predicting the future of the labor market?
Question 2: What datasets are used for predicting the future of the labor market?
Question 3: What are the main applications of machine learning in labor market prediction?

By answering the research questions, this study aims to provide a comprehensive overview of recently used machine learning methods in predicting the labor market's future and highlight the data types used in the applications. Moreover, it provides a perspective on the research trends in the field of labor market forecasting.

The paper is structured as follows: Section II explains the methodology used; Section III presents an overview of the selected papers; Section IV outlines and discusses the results obtained; and Section V concludes the paper.

II. METHODOLOGY

The Systematic Literature Review (SLR) method is used to review and analyze the literature on predicting the future of the labor market. This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) method as a guideline, which is a standardized approach for conducting and reporting systematic literature reviews [5]. To identify the relevant articles, a search strategy was employed based on the following criteria:

A. Article Inclusion Criteria

• Conference and journal papers published from 2018 to 2023.
• The included papers focused on machine learning, big data, and AI applications in predicting the future of the labor market.
• Search terms and keywords included the titles, abstracts, “Artificial intelligence,” “Big data,”


“Machine learning,” “future jobs,” and “labor market.”
• Papers written in English.

B. Exclusion of Articles

• Papers published before 2018, and books.
• Papers that use traditional research approaches for prediction (focus groups, statistical analysis, surveys).

C. Article Selection Process

• Define appropriate keywords, which are “forecasting jobs,” “machine learning,” “predicting jobs,” and “Big data.”
• Use multiple electronic databases, such as Scopus, Google Scholar, IEEE, and Web of Science.
• Screen Titles and Abstracts: Evaluate the titles and abstracts using the previously set inclusion and exclusion criteria. The irrelevant articles are excluded during this screening phase.

The steps of PRISMA are illustrated in Fig. 1. The selected papers for this systematic literature review are ten conference and journal papers. Conference papers have several advantages over journal articles, providing researchers with in-depth insights into their specialized areas. They present the latest research findings before publication and allow the sharing of scientific work in progress or initial results. Also, they show new ideas and early-stage studies that may not be fully explored or published in journals yet [6].

Fig. 1: Preferred Reporting Items for Systematic Reviews and Meta-Analyses method (PRISMA) Flow Chart: Adapted from [5]

III. OVERVIEW

This section summarizes recent studies that used machine learning tools to predict the future of the labor market. The reviewed papers used various techniques, data sources, and models. All the selected studies aim to provide valuable insights into the latest trends in forecasting job market needs and the skills that will be in demand.

The study by [7] used machine learning methods to predict the future of jobs by using patent data to indicate technological trends. The machine learning model employed in this study, fastText, was developed by Facebook's AI lab. It is used for learning word embeddings and text classification. The researchers first linked patent data to job descriptions provided in O*NET, the primary source of occupational information in the US. They predicted the future of the required talents using a combination of patent analysis, word embedding techniques, and job classification data. One of the potential limitations of this study is that the approach primarily relies on patent data, which may not always accurately reflect the demand for jobs associated with a specific technology. Additionally, bias could exist in the choice of patent grouping codes, which could affect the validity of the job forecast. The authors suggest comparing the results of the job forecast with other research and assessment indicators to ensure the precision of the findings.

Other research [8] utilized machine learning methods to forecast the future of jobs in the software sector. The data collected is job openings data obtained through a combination of web scraping, manual data collection, and government data resources. After applying different machine learning models, the study revealed that the bidirectional LSTM model had the highest accuracy among all the applied models. A downside of the methodology is that the dataset is based on software job postings from the last two years, which may not be representative of other sectors or regions. In addition, the model is limited to the technological solutions and skills included in the training dataset. However, this study has contributed to job forecasting and trend analysis research using combined data and machine learning models.

[9] developed Python-based open-source software tools and used them to collect and analyze data from job websites. The goal is to design educational programs in information technology that consider the labor market's expectations, abilities, and needed talents. They concluded that the developer is the most demanded IT profession and focused their analysis on the skills required for developers. Therefore, they extracted the top 20 skills from the Russian job portal “HeadHunter” to construct the data, which includes programming languages and technologies. The authors used the dataset to create two educational programs, Web Development and Data Science, that align with job market demands.

[10] indicated the potential advantages of text mining methods in identifying emerging job market trends and forecasting future demand. The study used various techniques, including web scraping, Elasticsearch, sentiment


analysis, and R, to automate the data collection and analysis. The study findings identified the most in-demand specific skills in the Information Technology sector and emphasized the importance of soft skills. This demonstrates how text-mining tools can be used to forecast job market trends and future employment demand.

To produce early forecasts of unemployment rates in the United States, [11] applied a multidimensional approach to online search data with LSTM neural networks. They broadened the Google keyword components to include behavioral and psychological features. The study found that integrating Google Trends keywords from dimensions other than job search improves prediction accuracy.

[12] aimed to identify emerging jobs in Morocco’s information technology industry. The data was extracted from the national occupation directory, recruitment websites, and university websites using R and Python algorithms. Text mining techniques were used to collect data on labor market demand for every job. The authors found that there is a gap between the universities' curricula and the employment opportunities. The research findings can assist universities and students in better understanding job market needs and the skills required for each position. Furthermore, the study can aid universities in recognizing rising professions and revising educational programs accordingly.

[13] combined two types of Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), into a hybrid LSTM-GRU model. The hybrid model gives more accurate prediction results; the LSTM-GRU model captures both long-term and short-term dependencies. The hybrid model was used to predict future unemployment rates. The datasets were obtained from OECD reports that are available online and consist of monthly unemployment data for the US, UK, France, and Italy.

[14] presented insights into the potential of word embedding and machine learning techniques for tracking trends in the labor market and predicting needed skills. Natural language processing (NLP) methods were used to predict job classifications from the job description. The study aimed to build an updated job classification system that represents the trends in the labor market.

In a study by [15], the researchers applied three machine learning models to predict online job postings in the Netherlands. A comparison of the three models' performance reveals that the Support Vector Regressor (SVR) model and Random Forest Regressor (RFR) outperform the baseline ARIMA model. Comparing SVR and RFR with each other showed that RFR is better suited to prediction on larger samples, whereas SVR has a higher potential on small datasets.

In the study by [16], the researchers constructed a Chinese occupational dataset based on the United States occupational data O*NET. The data created was then used as input to the prediction tool, the light gradient-boosting machine (LightGBM) algorithm. According to the study findings, more than half of the jobs in China are expected to be taken over by artificial intelligence in the next few decades.

IV. RESULTS AND DISCUSSION

The studies reviewed above shared a common goal: to predict the future of the labor market using machine learning methods. The studies focused mainly on using machine learning algorithms, including LSTM neural networks, BiLSTM, ARIMA, LSTM-GRU, RFR, SVR, LightGBM, and NLP techniques and sentiment analysis algorithms. These methods are used to analyze big data and discover patterns that can be used in predicting future job trends. Some studies additionally provided perspectives on the talents and skills that will be in demand in the future. This enables individuals and organizations to make informed decisions about education, training, and hiring. Taken together, the reviewed studies apply ML methods to three areas: market demand, educational programs, and unemployment. The machine learning methods and datasets used are summarized in Table 1.

A. Market demand prediction

• In the study by [7], the researchers employed the fastText tool to connect patent data with job descriptions and predict future labor market trends.
• [8] applied bidirectional LSTM models to predict jobs in the software industry using historical datasets.
• [10] used text-mining techniques to analyze job postings and identify the most frequently mentioned technical skills.
• The research by [14] used word embedding and Natural Language Processing (NLP) techniques to track changes in the labor market and predict the demand for skills.
• Research was conducted by [15] to predict employment and future market skills. The researchers used the Random Forest Regressor (RFR), Support Vector Regressor (SVR), and Autoregressive Integrated Moving Average (ARIMA) models. The results showed that RFR and SVR outperform the ARIMA model.

The main findings of these studies indicate that machine learning tools, especially text mining, can be used to predict market demand for jobs with a high degree of accuracy. These machine learning methods can analyze big data to identify patterns and trends that can be used to forecast emerging market demand.

B. Educational program prediction

• The research conducted by [9] proposed software tools developed by the researchers for collecting data from job website postings. The data was analyzed using machine learning algorithms developed by the researchers. The results are used to design


educational programs that consider labor market expectations, abilities, and needed talents.
• The study by [12] used R and Python algorithms to identify emerging jobs in the information technology industry. The study extracted data from the national occupation directory, recruitment websites, and university websites.

The main result of these studies is that machine learning tools can be used to design educational programs that align with job market demands.

C. Unemployment prediction

• The study by [13] used a hybrid LSTM-GRU model to predict future unemployment rates. The study applied the model using monthly unemployment data from the US, UK, France, and Italy obtained from the OECD website.
• The research by [11] used multi-dimensional Google Trends data and applied the neural network model LSTM to predict unemployment rates.
• The study by [16] used constructed data for occupations in China and applied the LightGBM algorithm to predict future jobs in China.

The studies showed that ML tools can predict future unemployment rates. These predictions can aid decision-making, including policy adjustment, workforce planning, and identifying job market risks and opportunities.

Table 1 summarizes the machine learning tools used in the reviewed papers, the datasets, and the studies’ limitations.
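The model comparisons reported in this section, such as the hybrid LSTM-GRU model in [13] or RFR and SVR against an ARIMA baseline in [15], rest on scoring each model's forecasts against observations it has not yet seen. The Python sketch below illustrates only that evaluation step, under loud assumptions: the monthly rates are made-up values, the two forecasters (a last-value rule and a moving average) are simple stand-ins rather than the models used in the reviewed studies, and the function names are hypothetical.

```python
from statistics import mean


def naive_forecast(history):
    """Predict the next value as the last observed value."""
    return history[-1]


def moving_average_forecast(history, window=3):
    """Predict the next value as the mean of the last `window` observations."""
    return mean(history[-window:])


def walk_forward_mae(series, forecaster, start):
    """One-step-ahead walk-forward evaluation.

    For each t >= start, forecast series[t] from series[:t] only,
    then return the mean absolute error over all forecasts.
    """
    errors = [
        abs(series[t] - forecaster(series[:t]))
        for t in range(start, len(series))
    ]
    return mean(errors)


# Hypothetical monthly unemployment rates in percent (illustrative values only).
rates = [5.0, 5.2, 5.1, 5.3, 5.6, 5.8, 5.7, 5.9, 6.2, 6.0, 6.1, 6.3]

scores = {
    "naive": walk_forward_mae(rates, naive_forecast, start=3),
    "moving_average": walk_forward_mae(rates, moving_average_forecast, start=3),
}
best = min(scores, key=scores.get)  # model with the lowest mean absolute error
print(scores, best)
```

Any forecaster that maps a history to a single next value, including a trained neural network wrapped in a function, could be scored in the same loop, which is what makes a shared baseline comparison possible.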

TABLE I. SUMMARY OF SELECTED STUDIES' FINDINGS

Paper | Year | Data Collection/ML Tool | Dataset | Limitation/Comment
[7] | 2022 | Data collection: The data collection method is not mentioned; the fastText model trains on job and patent description data and generates job representation vectors. ML Tool: Word embedding. | Patent data and job information data from the O*NET database, the main data source of US occupational information | The study depends on patent data as an indicator of technological trends. Patents do not necessarily lead to new technology or new markets.
[8] | 2023 | Data collection: Web crawling method. ML Tool: Bidirectional Long Short-Term Memory (BiLSTM). | Software industry historical data obtained by web scraping | The researchers depend on one source of data: job vacancy ads. Integrating other dataset sources will provide a comprehensive view of market demand.
[9] | 2021 | Data collection: Open-source software developed by the researchers. ML Tool: Interactive Analytic Service tool based on Python using JupyterLab. | A dataset of 3.7 million job advertisements built from the Russian job site “HeadHunter” | The researchers have developed open-source prediction tools. However, testing them on different datasets is imperative to ensure their effectiveness.
[10] | 2018 | Data collection: Web scraping, R scripts. ML Tool: Text mining techniques, keyword extraction, topic modeling, and sentiment analysis. | Online job postings, companies’ websites, and social media platforms | The dataset relies only on open-access job ads as data sources. It is recommended to include other data sources to improve the prediction results.
[11] | 2023 | Data collection: The tool is not mentioned, only the multidimensional approach of Google search query data to expand the keyword dimensions. ML Tool: Long Short-Term Memory (LSTM). | Google Trends data and United States Department of Labor jobless data | There are challenges in identifying whether the searched keywords are job-related.
[12] | 2022 | Data collection: Python and R scripts. ML Tool: Naive Bayes; Logistic Regression; SVM; Word2vec and Logistic Regression; Doc2vec and SVM. | National occupation directories, job portals, and university websites | The data collected are job descriptions in a government directory. Therefore, the study may not detect new market trends and needs.
[13] | 2023 | Data collection: The tool is not mentioned. ML Tool: Combination of two Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), in an LSTM-GRU hybrid model. | Data from the OECD website for the United States, United Kingdom, France, and Italy | The model's performance varies for different datasets.
[14] | 2020 | Data collection: The tool is not mentioned. ML Tools: Combination of Natural Language Processing (NLP) and machine learning methods: BiLSTM, BERT, GloVe, FastText, word2vec, TF-IDF-PCA, TF-IDF. | O*NET® database | The BERT model achieved the highest accuracy of 65%. However, the authors did not apply other metrics, especially precision, recall, F1 score, or area under the curve (AUC), which will accurately evaluate the model's effectiveness.
[15] | 2023 | Data collection: The tool is not mentioned. ML Tool: Autoregressive integrated moving average (ARIMA); Support Vector Regressor (SVR) model; Random Forest Regressor (RFR). | Online job posting ads in the Netherlands | The study did not consider job taxonomy methods to achieve more accurate prediction results.
[16] | 2023 | Data collection: The tool is not mentioned. ML Tool: Light Gradient-Boosting Machine (LightGBM). | China’s Seventh National Population Census across six major categories and O*NET | The study did not consider external factors such as demographic indicators, including age distribution and population size.
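Several of the reviewed studies derive skill demand from job postings, for example the top 20 developer skills extracted in [9] and the most frequently mentioned technical skills identified in [10]. As a rough illustration of that counting step only, the Python sketch below tallies how many postings mention each skill. The skill vocabulary, the postings, and the function names are all hypothetical and are not taken from any reviewed dataset; the reviewed studies used their own tooling (such as R scripts and Elasticsearch in [10]).

```python
import re
from collections import Counter

# Hypothetical skill vocabulary and postings (illustrative only).
SKILLS = ["python", "sql", "java", "javascript", "communication"]

POSTINGS = [
    "Backend developer: Python, SQL and good communication skills.",
    "Data scientist with Python and SQL experience.",
    "Frontend developer: JavaScript required; Java is a plus.",
]


def count_skill_mentions(postings, skills):
    """Count how many postings mention each skill.

    Matching is case-insensitive and on whole tokens, so "java"
    is not counted inside "javascript".
    """
    counts = Counter()
    for text in postings:
        # Tokens may contain letters, '+' and '#' (e.g. "c++", "c#").
        tokens = set(re.findall(r"[a-z+#]+", text.lower()))
        for skill in skills:
            if skill in tokens:
                counts[skill] += 1
    return counts


def top_skills(postings, skills, n=3):
    """Return the n most frequently mentioned skills with their counts."""
    return count_skill_mentions(postings, skills).most_common(n)


counts = count_skill_mentions(POSTINGS, SKILLS)
ranking = top_skills(POSTINGS, SKILLS)
print(ranking)
```

Whole-token matching is a deliberate simplification here: it avoids counting "java" inside "javascript" but would miss multi-word skills, which a real pipeline would need to handle separately.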
V. CONCLUSION

This study set out to review ten papers that applied machine learning techniques to predict the future of the labor market. The results of this study identified some of the recent machine-learning tools and datasets used in predicting the future of the labor market. The reviewed studies showed that using multiple and hybrid machine learning methods leads to better prediction results. Recurrent Neural Networks (RNNs) and text mining were the most employed machine learning methods. Moreover, using only one data source, like publicly available job postings, may limit the prediction results. Integrating multiple data sources can provide a comprehensive view of the job market's future. Another important finding was that the researchers predicted the future of the labor market from three perspectives: predicting future unemployment, predicting needed educational programs, and predicting market demands. Further research should be done to consider external features relating to economic trends, demographic indicators, changes in GDP, and technological advancements.

ACKNOWLEDGMENT

We want to express our sincere gratitude to the University of Bahrain. We thank all those who have contributed to this research paper.

REFERENCES

[1] Y. K. Dwivedi, A. Sharma, N. P. Rana, M. Giannakis, P. Goel, and V. Dutot, “Evolution of artificial intelligence research in Technological Forecasting and Social Change: Research topics, trends, and future directions,” Technological Forecasting and Social Change, vol. 192, p. 122579, Jul. 2023, doi: 10.1016/j.techfore.2023.122579.
[2] Z. Zhang, “The impact of the artificial intelligence industry on the number and structure of employment in the digital economy environment,” Technological Forecasting and Social Change, vol. 197, p. 122881, Dec. 2023, doi: 10.1016/j.techfore.2023.122881.
[3] R. Gruetzemacher, D. Paradice, and K. B. Lee, “Forecasting extreme labor displacement: A survey of AI practitioners,” Technological Forecasting and Social Change, vol. 161, p. 120323, Dec. 2020, doi: 10.1016/j.techfore.2020.120323.
[4] L. Turulja, L. M. Glavan, and M. P. Bach, “Big Data and Labour Markets: A review of research topics,” Procedia Computer Science, vol. 217, pp. 526–535, Jan. 2023, doi: 10.1016/j.procs.2022.12.248.
[5] M. J. Page et al., “PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews,” The BMJ, vol. 372. BMJ Publishing Group, Mar. 29, 2021. doi: 10.1136/bmj.n160.
[6] L. I. Meho, “Using Scopus’s CiteScore for assessing the quality of computer science conferences,” Journal of Informetrics, vol. 13, no. 1, pp. 419–433, Feb. 2019, doi: 10.1016/j.joi.2019.02.006.
[7] T. Ha, M. Lee, B. Yun, and B. Y. Coh, “Job Forecasting Based on the Patent Information: A Word Embedding-Based Approach,” IEEE Access, vol. 10, pp. 7223–7233, 2022, doi: 10.1109/ACCESS.2022.3141910.
[8] S. Senthurvelautham and N. Senanayake, “A Machine Learning-Based Job Forecasting and Trend Analysis System to Predict Future Job Markets Using Historical Data,” in 2023 IEEE 8th International Conference for Convergence in Technology, I2CT 2023, Institute of Electrical and Electronics Engineers Inc., 2023. doi: 10.1109/I2CT57861.2023.10126233.
[9] A. Sozykin, A. Koshelev, A. Bersenev, D. Shadrin, A. Aksenov, and E. Kuklin, “Developing Educational Programs Using Russian IT Job Market Analysis,” in Proceedings - 2021 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology, USBEREIT 2021, Institute of Electrical and Electronics Engineers Inc., May 2021, pp. 391–394. doi: 10.1109/USBEREIT51232.2021.9454998.
[10] R. B. Mbah, M. Rege, and B. Misra, “Discovering Job Market Trends with Text Analytics,” in Proceedings - 2017 International Conference on Information Technology, ICIT 2017, Institute of Electrical and Electronics Engineers Inc., Jul. 2018, pp. 137–142. doi: 10.1109/ICIT.2017.29.
[11] A. Grybauskas, V. Pilinkienė, M. Lukauskas, A. Stundžienė, and J. Bruneckienė, “Nowcasting Unemployment Using Neural Networks and Multi-Dimensional Google Trends Data,” Economies, vol. 11, no. 5, May 2023, doi: 10.3390/economies11050130.
[12] I. Rahhal, K. Carley, K. Ismail, and N. Sbihi, “Education Path: Student orientation based on the job market needs,” in IEEE Global Engineering Education Conference, EDUCON, IEEE Computer Society, 2022, pp. 1365–1373. doi: 10.1109/EDUCON52537.2022.9766771.
[13] M. Yurtsever, “Unemployment rate forecasting: LSTM-GRU hybrid approach,” Journal for Labour Market Research, vol. 57, no. 1, Jun. 2023, doi: 10.1186/s12651-023-00345-8.
[14] C. M. Jaramillo, P. Squires, H. G. Kaufman, A. Mendes Da Silva, and J. Togelius, “Word embedding for job market spatial representation: Tracking changes and predicting skills demand,” in Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020, Institute of Electrical and Electronics Engineers Inc., Dec. 2020, pp. 5713–5715. doi: 10.1109/BigData50022.2020.9377850.
[15] P. Gajewski, B. Čule, and N. Rankovic, “Unveiling the Power of ARIMA, Support Vector and Random Forest Regressors for the Future of the Dutch Employment Market,” Journal of Theoretical and Applied Electronic Commerce Research, vol. 18, no. 3, pp. 1365–1403, Aug. 2023, doi: 10.3390/jtaer18030069.
[16] C. Wang, M. Zheng, X. Bai, Y. Li, and W. Shen, “Future of jobs in China under the impact of artificial intelligence,” Finance Research Letters, vol. 55, p. 103798, Jul. 2023, doi: 10.1016/j.frl.2023.103798.
