
OPEN ACCESS
Journal of Financial Management, Markets and Institutions

Vol. 11, No. 1 (2023) 2330001 (43 pages)
© The Author(s)
DOI: 10.1142/S2282717X23300015
BIBLIOMETRIC ANALYSIS ON BIG DATA APPLICATIONS
IN INSURANCE SECTOR: PAST, PRESENT, AND FUTURE
RESEARCH DIRECTIONS
SUNITA MALL*,§, TUSHAR RANJAN PANIGRAHI†,¶ and SUSHMA VERMA‡,||

*Department of Statistics & Data Analytics
MICA, Ahmedabad, Gujarat, India
†Department of Finance, Unitedworld School of Business
Karnavati University, Ahmedabad, Gujarat, India
‡Department of Finance
Vivekanand Education Society's Institute of Management Studies and Research
Mumbai, India
§Sunita.mall@micamail.in
¶dr.tusharpanigrahi@gmail.com
||sushma.verma@ves.ac.in
Received 10 September 2022

Revised 10 March 2023
Accepted 11 April 2023
Published 29 May 2023
This study identifies the key areas and current trends in big data applications in the
insurance industry and suggests directions for future research. Using bibliometric analysis
on a sample of 191 articles retrieved from Scopus and published from 1976 to 2021, we
identified the most prominent authors, journals, organizations, and countries based on their
total publications and citations, which indicate their significance within the network.
VOSviewer and R-Biblioshiny were used to generate the bibliometric output for these papers.
The findings showed that, although a good number of authors from other parts of the world
contributed to the literature on big data applications in the insurance industry during this
time, most research papers listed the United States, India, and China as their affiliated
countries. From 1976 to 2011, only one or two papers were published per year, with some
discontinuity, but since 2012 the yearly output has increased, exhibiting an exponential
growth tendency. The three journals "Risks," "Applied Stochastic Models in Business and
Industry," and "Expert Systems with Applications" have published the largest number of papers
on big data technologies in the insurance sector. Each of the top 10 authors in this field
published two research papers during these 46 years. Research papers on big data applications
in the insurance business focused mainly on seven areas: fraud detection and prevention, risk
assessment, pricing and rate making, technology utilization, risk management, claim processing
and prediction, and digitalization. Human-centered AI system development, adoption of
wearable technology, personalization, and other topics were found to have received very
little attention. Researchers may therefore direct future research toward these areas. This
study is the first of its kind in the domain of insurance, though a few documents are
available on the broader concept of finance.
This is an Open Access article published by World Scientific Publishing Company. It is distributed under
the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 (CC BY-NC-ND)
License which permits use, distribution and reproduction, provided that the original work is properly cited,
the use is non-commercial and no modifications or adaptations are made.
Keywords: Bigdata; insurance sector; bibliometric analysis; citation network; bibliometric coupling.
1. Introduction
Insurance firms create a fund by collecting money from customers, known as policyholders, to
safeguard and assist them when necessary. These businesses promise to pay policyholders a
certain amount of money when they need it. Like many other industries, the insurance sector
has moved towards digital platforms (Churchill 2007), and its use of technology has grown
exponentially. Advanced technology and digital platforms have allowed insurance firms to test
new methods of tracking, assessing, and reducing risk. The Internet of Things, artificial
intelligence, blockchain, machine learning, big data analytics, and insurance management
platforms are some of the important technologies being deployed for insurance analytics and
insurance technology (Eckert & Osterrieder 2020). The insurance sector has long used data
analytics successfully to target clients. Accident statistics, policyholder personal
information, and outside sources make it possible to classify persons into risk groups,
prevent fraud losses, and optimize costs (Carfora et al. 2019). The move to digital platforms
has created new information sources that may be used to identify a customer's segment
precisely and to understand that customer's intricate behavioral patterns. In the insurance
industry, big data refers to the use of unstructured and/or structured data to affect
underwriting, rating, pricing, forms, marketing, and claims management. Big data applications
have already begun helping insurance businesses. Big data analysis and the state of business
are hot topics right now. Applications utilizing big data, artificial intelligence, and
machine learning result in decisions and business processes that are specifically tailored to
each person's wants and expectations, enhancing the expansion and effectiveness of commercial
operations. Over the past couple of decades, big data analytics has been utilized extensively
across industries (Giannakis 2019, Villars & Olofson 2011, Ogbuokiri et al. 2015). Big data
is produced as a result of frequent interactions with users and among users in the digital
world and the evolution of technology (Hussain & Cambria 2018). Huge amounts of structured
and unstructured data are being produced because of technological advancements on the
internet, mobile devices, cloud computing, wearable technology, embedded sensors, etc. If
companies can extract value from these data, they will have a competitive advantage
(Giannakis 2019, Villars & Olofson 2011). The majority of sectors now use data-driven
decision-making, and big data and artificial intelligence (AI) are the two main tools for
this process (Liang & Liu 2018). The recent rapid development of big data and the
availability of vast amounts of digitized data have attracted growing attention from
scholars, business, government, and practitioners (Zhang 2018). Researchers are eager to
investigate all facets of how data are present in every element of human life (Williams &
Burnap 2017, Williamson 2015). Previous studies have shown that using big data analytics
helps firms extract value from enormous amounts of unstructured data, boosts productivity,
creativity, and competitiveness, increases customer loyalty, and generally improves business
decision-making (Gandomi & Haider 2015, Mishra et al. 2016, Vecchio 2017, Verma &
Bhattacharyya 2017a). Every discipline has seen an increase in big data research (Özköse
et al. 2015). Since the inception of big data and AI, a plethora of research articles on
their theory, technology, and approaches has been published. Due to the vast amount of data
generated by mobile usage, the research world has also seen an increase in publications based
on applications of big data and AI in several disciplines of study (Eastin et al. 2016, Liao
et al. 2015, Paul et al. 2017).
Big data offers findings that are focused on strategy, and decision-makers across sectors
therefore use it (Labrinidis 2015; Jagadish 2015). Big data makes it possible to transform
business processes (Mishra et al. 2016). By streamlining all areas of operations, big data,
AI, and machine learning have radically changed the insurance sector, transforming its
present and future. Organizations process and analyze huge amounts of data in order to
extract value for their consumers and businesses (George et al. 2014). Several studies have
shown that businesses with data-centric business strategies outperform their rivals in terms
of productivity (Ernest 2011). Big data analytics not only makes it possible to revolutionize
business operations but also to successfully address significant business difficulties
(Wamba et al. 2015).
Big data, AI, and machine learning are employed in the insurance industry in a variety of
contexts for diverse business decisions. A few studies have highlighted the application of
these cutting-edge methodologies in insurance consumer studies, particularly in tracking and
predicting the behavior of insurance customers and segmenting them according to similar
behavior (Carfora et al. 2019, Meyers & Hoyweghen 2020, Zhang & Banerji 2017, Zhang
et al. 2019). Every insurer must make a few typical decisions regarding the anticipation and
processing of claims. Examples include the overall amount of the policyholder's claims or
losses, which claims should be denied, and the level of risk involved with an insurance
policy. Big data, AI, and machine learning approaches are well suited to answering these
questions (Ding et al. 2020, Jain et al. 2019, Johnson et al. 2021). The insurance sector has
greatly benefited from digitalization and technical development in terms of developing the
capacity to manage massive amounts of data, data privacy, data warehousing, risk management,
etc. (Eckert & Osterrieder 2020, Marabelli et al. 2017, Nayak et al. 2019). The insurance
business has been quite concerned about fraud since it causes enormous financial losses.
Several researchers have advocated big data analytics, AI, and machine learning approaches as
cutting-edge methods to identify and anticipate insurance fraud (Dua & Bais 2014, Major &
Riedinger 2002, Mall et al. 2018, Song et al. 2019, Wang & Xu 2018). Setting the premium is
an important decision for an insurer, since customers must pay a premium for the insurance
policies they buy from an insurance provider. Every insurance firm has a key decision-making
process called pricing or rate-making, whose main goal is to charge the proper premium for
the correct kind of coverage. Some past studies have shown that big data, artificial
intelligence, and machine learning approaches can be utilized to control insurance rate
making (Barry & Charpentier 2020, Christmann et al. 2007, Huang & Meng 2019).
Previous researchers such as Boyd & Crawford (2012), Chen & Zhang (2014), and Hashem et al.
(2015) have worked on the theoretical development of big data analytics. Another strand of
research has examined how big data analytics can enhance a firm's capabilities and
performance by optimizing its resources (Davenport et al. 2012, Ernest 2011, Murdoch &
Detsky 2013, Sharma et al. 2014). It is noteworthy that a few researchers focused on the
management transition to big data analytics (Chen et al. 2012, Davenport et al. 2012, George
et al. 2014, Manyika et al. 2017). Nevertheless, a bibliometric analysis of the application
of big data technology in the insurance industry has not yet been done.
In Table 1, certain research papers concentrating on big data applications across the
financial industry are contrasted with our study, which is primarily focused on big data
applications within the insurance business.
Our study differs from the existing ones in many ways. The aim of this research is to
identify the influence, usage, and benefits of big data analytics in the insurance sector.
Considering the major clusters, the evolving applicability of big data tools in insurance,
and potential future research fields, this study pinpoints the use of big data technology in
the insurance business. Through a bibliometric investigation, we determine the publishing
trends and the conceptual and intellectual structure of this field. This paper offers
suggestions for future research topics and identifies some of the important dynamics of big
data applications in the insurance industry. To the best of our knowledge, this study is the
first to use bibliometric analysis to summarize big data's use in the insurance industry. On
this topic, we address the following six research questions (RQs):
RQ1: What is the current publishing trend in the domain of big data applications in
the insurance industry?
The volume of output is a key indicator of the direction of a research area's growth
(Fuad et al. 2020, Fusco et al. 2020).
RQ2: Who are the major authors, publications, and organizations that have made
substantial contributions to the literature on applications of big data technology in
the insurance industry?
Table 1. Comparison of earlier studies on big data in insurance and our study.

Santoso et al. (2022)
  Title: Insurance Underwriting and Technology Relationship: A Bibliometric Analysis
  Source: Journal of Theoretical and Applied Information Technology
  Time period: 1987–2021
  Keywords: Underwriting, Insurance, premium estimation, premium calculation, risk
    assessment, machine learning, classification, technology, chatbot, artificial
    intelligence, big data, internet of things, blockchain, cloud computing, mobile computing.
  Focus of the study: To identify the goals and direction of research literature conducted
    on insurance underwriting and how it relates to the technology field.
  Methodology: Cluster analysis of Authors Keywords and Citation Analysis.

Hasan et al. (2020)
  Title: Current landscape and influence of big data on finance
  Source: Journal of Big Data
  Time period: Not Defined
  Keywords: Big data finance, Big data in financial services, Big data in risk management,
    Data management.
  Focus of the study: Examines the research on the impact of big data on different financial
    markets and institutions, as well as its interactions with online credit services,
    internet finance, financial management, fraud detection, risk analysis, and financial
    application management.
  Methodology: Bibliometric analysis, citation analysis, and keyword mapping analysis.

Nobanee et al. (2021)
  Title: Big Data - Applications the Banking Sector: A Bibliometric Analysis Approach.
  Source: SAGE Open
  Time period: 2012–2020
  Keywords: Bigdata and Banking
  Focus of the study: Highlighted big data's significance, application, and role in the
    banking and finance industry. Also examined the potential areas for future research in
    big data analytics for the banking sector.
  Methodology: Bibliometric Analysis and Thematic Analysis.

Altaf (2021)
  Title: Two Decades of Big Data in Finance: Systematic Literature Review and Future
    Research Agenda
  Source: (Book) Big Data Analytics for Internet of Things.
  Time period: 2000–2019
  Keywords: Financial markets, internet finance, financial services, Big data, Internet of
    things, Financial technology, Fintech and Financial Analytics.
  Focus of the study: To examine the literature on big data in finance, identify knowledge
    gaps, and discuss potential future study fields.
  Methodology: Bibliometric Analysis

Our study
  Title: Bibliometric analysis on bigdata applications in Insurance sector: Past, present,
    and future research directions
  Time period: 1976–2021
  Keywords: Big Data, Artificial Intelligence, Deep Learning, Machine Learning, insurance
    Plan, Insurance Policy, structured claim data, unstructured claim data, Insurance Sector,
    Insurance company, Life Insurance, Health Insurance, Mediclaim, General Insurance.
  Focus of the study: To identify influence, usage, and the benefits of using big data
    analytics in insurance sector.
  Methodology: Bibliometric Analysis and Thematic Analysis.
It is important for scholars to know who the most active authors in a given field are, as
this facilitates future collaborations and publications (Rey-Martí et al. 2016, Van Eck &
Waltman 2014). This is determined using the number of publications per entity, the number of
citations obtained, and bibliographic coupling (Baker et al. 2020, Patel et al. 2022, Khanra
et al. 2020). In addition, prominent authors are the best candidates to be contacted for
developing policies as well as for conducting further studies in that specific area.
RQ3: What are the most important pieces of literature in this field?
In a bibliometric study, it is crucial to know which documents are most widely acknowledged
among academics in the selected field (e.g. Baker et al. 2020, Khanra et al. 2020).
Researchers can find several study directions by using well-known and popular documents
(Bahoo et al. 2020). These documents are identified through citation analysis based on the
number of citations a document has received (Caviggioli & Ughetto 2019, Khanra et al. 2020),
citations per document (Patel et al. 2022), and bibliographic coupling (Khanra et al. 2020),
which is based on a document's relationship with other papers.
RQ4: Which are the most influential countries, and what is the present state of research
collaboration between authors from these countries?
By conveying the data through map charts and network visualization, the study of country
collaboration provides an unbiased view of the representativeness and development of global
research on any given issue. It enables researchers to increase the effectiveness of their
work and gives detailed insight into the connections between various nations/regions and the
various sorts of research accomplishments they have made (Li et al. 2019). The results may
give readers insight into how big data technology applications have evolved over time and
across countries, as reflected in the published literature.
RQ5: What are the prevailing themes in the body of published literature in the domain of
big data applications in the insurance sector?
Discovering the most prevalent topics and areas of study for academics working in this area
is the major goal of this research question. This is accomplished using the following
methods: thematic map, keyword analysis, and bibliographic coupling (Ferreira 2018, Karakus
et al. 2019, Aria & Cuccurullo 2017, Cobo et al. 2011).
RQ6: What would the plan of action be for further study in this area?
Bibliographic coupling results in discrete clusters that represent different topics (Karakus
et al. 2019). A thorough content analysis of the articles within each cluster leads to the
identification of research gaps. Presenting plans for future research on this basis offers a
chance for this field of study to progress (Khanra et al. 2020, Kumar et al. 2020).
We have answered our RQs and evaluated the progress of research in this domain. In our study,
we have considered 46 years of data for the bibliometric analysis, from 1976 to 2021. Our
study also discusses future research directions in this domain and thereby aims to inspire
researchers to pursue innovative research in this field. The study combines a systematic
literature review (SLR), bibliometric analysis, and content analysis to address these
research questions. The preferred reporting items for systematic reviews and meta-analyses
(PRISMA) recommendations defined by Moher et al. (2009) guide the use of the SLR to retrieve
suitable literature for further analysis. In addition to choosing keywords, the SLR specifies
inclusion and exclusion criteria so that pertinent documents may be extracted (Kumar et al.
2020). Following that, bibliometric analysis is used to map the existing literature and group
documents. The potential of bibliometric analysis to statistically synthesize the body of
information relevant to a certain study topic is widely established (Bhatt et al. 2020, Goyal
et al. 2021). The literature relevant to the selected research subject has a large variation
in vocabulary. When this occurs, bibliometric analysis, as opposed to a standard literature
review, aids in detecting and analyzing relationships more effectively through visualization
(Shome et al. 2023). Within each cluster, a content analysis of key papers is also offered,
which helped in determining the theme of each cluster (Rodrigues & Mendes 2018). In the
social sciences, content analysis is a frequently used method for conducting systematic
reviews of the body of current knowledge (Gaur & Kumar 2018, Goyal & Kumar 2021). Along with
text analysis, bibliometric analysis helps uncover research gaps and suggest future research
areas (Paltrinieri et al. 2019).
In addressing the research questions raised by this study, we found that discussions on
technological applications in the insurance business, particularly by researchers from the
USA, India, and China, have grown considerably since 2008. Wang, Y., Major, J. A., and
Zhang, J. are a few of the most well-known authors in this field. The three most influential
publications, by number of citations, are Gerson & Star (1986), Wang & Xu (2018), and
Riikkinen et al. (2018). The terms that most often co-occur with the searched keywords, such
as machine learning, big data, insurance, artificial intelligence, and deep learning, include
classification, data mining, fraud detection, health insurance, and insurtech. The research
works examined in this study can each be classified into one of seven categories: fraud
detection and prevention, pricing and ratemaking, technology use, customization, risk
assessment, claim processing and prediction, and digitalization.
The remainder of the paper is organized as follows. Section 2 explains the research
methodology, Sec. 3 presents the bibliometric analysis performed to answer the RQs along with
a summary of the findings, and Sec. 4 presents the conclusions and contributions of this
study.
2. Data and Methodology

A total of 191 research articles in the domain of applications of big data in the insurance
sector were retrieved from the Scopus bibliometric database. We formulated six RQs to
understand the current and future research trends in this domain, and these are answered
using bibliometric analysis. A bibliometric investigation quantitatively analyzes the
existing research in a given field to identify its historical and current research trends
(Bhatt et al. 2020, Verma & Bhattacharyya 2017b). The Biblioshiny tool in RStudio and
VOSviewer were used to analyze these 191 research articles.
Bibliometric analysis uses scientific mapping to review and classify the existing literature
on a topic and to summarize the development of research in a domain (Bartolini et al. 2019).
It provides a summary of the most influential authors, articles, journals, countries,
keywords, etc. on the topic under study (Bhukya et al. 2022). This study uses a combination
of bibliometric analysis and content analysis to decipher the structure of the chosen field
of study. Content analysis helps in understanding the current intellectual research structure
that shapes the future direction of research (Baker et al. 2020).
In bibliometric reviews, the Scopus database is mostly chosen over other bibliometric
databases (Pranckutė 2021, Sweileh et al. 2016). Consistent quality and continuous
improvement in the standard of research articles are the central focus of the journals
indexed in Scopus. The Scopus database is more inclusive and comprehensive than the Web of
Science, PubMed, and Dimensions (Bartol & Budimir 2013). Further, earlier studies indicate
that, for bibliometric studies of research articles published after 1995, it is better to
retrieve the bibliometric data from Scopus, as Scopus has stated that documents published
before 1996 lack complete citation information in its database (Jacso 2011, Vieira &
Gomes 2009, Worthington & Higgs 2006). Thus, supported by various prior bibliometric studies,
this paper relies on bibliometric data from the Scopus database.
In this study, bibliometric analysis along with content analysis is performed on the existing
literature pertaining to big data applications in the insurance sector. Figure 1 shows the
methodological framework of this study:
(1) Framing of Questions
(2) Selection of Database
(3) Literature Selection
(4) Bibliometric study of Selected Literature
(5) Conclusions and Implications
Fig. 1. Methodological framework used in our study.
2.1. Method of analysis

Research defines the structure of a scientific field (Ronda-Pupo 2017). We have performed
bibliometric analysis to identify the structure of research on applications of big data in
the insurance sector (Castriotta et al. 2018). A study by Kalantari et al. (2017) highlighted
the value of the bibliometric approach and discussed the latest research trends in big data
in different domains. Bibliometric analysis is used to unveil research collaborations amongst
researchers on big data applications (Xian & Madhavan 2014). Current trends and future
research scope are also identified using bibliometric analysis (Li et al. 2017).
2.2. Bibliometric tool selection

Bibliometric analysis can be performed using CRExplorer, Publish or Perish, ScientoPyUI,
Bibexcel, BiblioMaps, and many other software tools. Prior bibliometric studies in various
domains have mostly been conducted with VOSviewer and Biblioshiny, followed by Bibexcel and
Gephi (Hafeez et al. 2019). In this study, we have used VOSviewer and the Biblioshiny
interface of R (Bibliometrix 3.0) to analyze the articles retrieved from the Scopus
bibliometric database. Among the tools considered, Bibliometrix has a larger collection of
methodologies and is accessible to practitioners via Biblioshiny. VOSviewer can import and
export data from a variety of sources and offers excellent visualization (Moral-Muñoz et al.
2020). VOSviewer is used to find the top contributors and top influencers in terms of
authors, sources, organizations, countries, etc. Biblioshiny is used to obtain the overall
summary of the dataset, research trends, and topic clustering based on author keywords.
Biblioshiny, a web-based tool accessible through RStudio, requires little to no coding
experience. Visualization analysis has additionally been done using VOSviewer, a tool for
bibliometric analysis. VOSviewer's UI is very user-friendly, and the default layout options
are adequate (Donthu et al. 2021). With its low-dimensional visualization, VOSviewer places
objects so that the distance between any two objects reflects as closely as possible how
similar they are. In addition to the standard network view, each map can be viewed with a
density overview or an overlay overview (van Eck & Waltman 2007, p. 299).
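The distance-based layout can be made concrete with the objective commonly given for the VOS mapping technique. This is an illustrative sketch and is not stated in this paper: the symbols are assumptions, where $x_i$ is the position of item $i$ on the map, $s_{ij} \ge 0$ is the similarity between items $i$ and $j$, and $n$ is the number of items.

```latex
\min_{x_1,\dots,x_n} \; V(x_1,\dots,x_n) \;=\; \sum_{i<j} s_{ij}\,\lVert x_i - x_j \rVert^{2}
\qquad \text{subject to} \qquad
\frac{2}{n(n-1)} \sum_{i<j} \lVert x_i - x_j \rVert \;=\; 1
```

Under this formulation, strongly related items (large $s_{ij}$) are pulled close together, while the constraint fixes the average pairwise distance at 1 so that the items cannot all collapse onto a single point.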
The Biblioshiny tool of the Bibliometrix R package supports the entire range of statistical
approaches and visualizations, in contrast to programs like VOSviewer, which often specialize
in a few visualization types (network graphs). Bibliometrix provides a number of methods for
performing temporal or longitudinal analysis (Mougenot & Doussoulin 2022). Its features
include simple line graphs to display frequency variations, historiographic visualization,
thematic map development, and Reference Publication Year Spectroscopy (Moral-Muñoz et al.
2020, Saikia et al. 2020).
2.3. Bibliometric database selection

Scopus detects slightly more citations than the ISI Web of Science (WoS) databases, indexes
around 8000 more journals, and may be a more reliable database for business and economics
research than WoS. Scopus is unquestionably a more useful tool than WoS for many publications
that are not judged deserving of an ISI ranking (Levine-Clark & Gil 2008). In a sample test
of both databases, non-ISI publications received more citations than the worst five ISI
business and economics journals (Levine-Clark & Gil 2008).

Identification
  Records identified from Scopus (n = 1729)
  Records removed before screening: records published in year 2022 removed for citation
    bias reasons (n = 121)
Screening
  Records screened (n = 1608)
  Records considered from subject area: Computer Science (n = 618), Engineering (n = 273),
    Decision Science (n = 165), Mathematics (n = 177), Business Management (n = 118),
    Social Science (n = 96), Economics, Econometrics & Finance (n = 78), and
    Arts & Humanities (n = 10)
  Records excluded from subject areas other than the above-mentioned disciplines (n = 73)
  Reports sought for retrieval (n = 1535)
  Document types excluded: Conference paper (n = 445), Book chapter (n = 40), Conference
    review (n = 22), and Only Review papers (n = 19)
  Not in English language (n = 4)
  Total excluded reports (n = 840)
  Only articles in English script are considered (n = 1225)
  Reports assessed for eligibility (n = 695)
  Reports manually excluded due to non-retrieval of full text (n = 119)
  Retrieved reports manually excluded for not matching the searched topic in the abstract
    and full text (n = 385)
Included
  Reports included in this study (n = 191)
Fig. 2. Documents selection.
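The stage totals reported in the document-selection flow (Fig. 2) can be checked with simple arithmetic. The following is a minimal sketch in Python; the variable names are illustrative assumptions, and only the counts themselves are taken from Fig. 2.

```python
# Counts as reported in the document-selection flow (Fig. 2).
# Variable names are hypothetical labels chosen for this sketch.
identified = 1729              # records identified from Scopus
removed_2022 = 121             # records published in 2022, removed before screening
excluded_subject = 73          # records outside the listed subject areas
excluded_type_language = 840   # total excluded by document type and language
excluded_no_full_text = 119    # full text could not be retrieved
excluded_off_topic = 385       # abstract/full text did not match the topic

# Each stage is the previous stage minus the records removed at that step.
screened = identified - removed_2022
sought = screened - excluded_subject
assessed = sought - excluded_type_language
included = assessed - excluded_no_full_text - excluded_off_topic

# Subject-area counts listed for the screened records.
subject_counts = {
    "Computer Science": 618,
    "Engineering": 273,
    "Decision Science": 165,
    "Mathematics": 177,
    "Business Management": 118,
    "Social Science": 96,
    "Economics, Econometrics & Finance": 78,
    "Arts & Humanities": 10,
}

print(screened, sought, assessed, included)
print(sum(subject_counts.values()))
```

Run as written, the stage totals reproduce the figures in the flow (1608, 1535, 695, and 191), and the subject-area counts sum to the 1535 reports sought for retrieval.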
In every field of research, Google Scholar (GS), a free scholarly resource, yields
considerably more citations, as it indexes not only the peer-reviewed journals that make up
the content of the other sources but also many titles that WoS and Scopus do not. This latter
aspect is crucial, since mentions of articles in widely circulated newsletters or numerous
course syllabi appear to be more indicative of their relevance than citations in the
scholarly literature. Although GS citation counts are often higher, comparing them with the
GS counts of other works in the same area seems a useful way to gauge impact. For scholars
without access to the relatively pricey WoS or Scopus, GS might be a handy tool (Levine-Clark
& Gil 2008). Among the three databases, however, GS does not provide a well-connected core
citation network.
WoS essentially reflects the well-connected core of the citation network for basic research,
whereas Scopus also shows some transfer from the core to the applied research periphery. WoS
has a restrictive indexation strategy, whereas Scopus has a selective indexation policy. In
addition to better metadata quality, Dimensions' lax indexation approach conveys a similar,
if less pronounced, picture to Scopus regarding coverage (Stahlschmidt & Stephen 2022). Even
though each database has pros and cons of its own, we primarily used the Scopus database
since it indexes applied research journals more thoroughly (in comparison to WoS) and because
papers published in journals on the Scopus list also undergo a rigorous peer review procedure
(in comparison to GS). Furthermore, because our research focuses on the application of big
data in the insurance sector and analyzes application-based research papers in the area, the
Scopus database is the ideal option.
2.4. Search string of keywords

The Scopus bibliometric database was searched using the keyword string (("Big Data" OR
"Artificial Intelligence" OR "AI" OR "Deep Learning" OR "Machine Learning") AND
("insurance Plan" OR "Insurance Policy" OR "structured claim data" OR
"unstructured claim data" OR "Insurance Sector" OR "Insurance company" OR "Life
Insurance" OR "Health Insurance" OR "Mediclaim" OR "General Insurance")). The initial search
retrieved a total of 1729 documents, including book chapters, conference papers, and
documents in all languages. To obtain relevant research papers, the results were filtered by
limiting the search to English-language articles published in journals in the subject areas
of Computer Science, Mathematics, Economics, Decision Science, and Business and Management.
The final number of documents studied in this paper after screening and filtering is 191. The
searching and screening process, including the inclusion and exclusion of research documents,
is depicted in Fig. 2 following the PRISMA technique.
Initial Search String on Scopus Citation Database.
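The two-part structure of the search string, one OR-group of technology terms combined by AND with one OR-group of insurance terms, can be assembled programmatically. The sketch below is an illustration only: the helper names `or_group` and `build_query` are hypothetical, and no Scopus field code (such as a title, abstract, or keyword restriction) is applied because the text does not state which fields were searched.

```python
# Keyword groups copied from the search string in Sec. 2.4.
TECHNOLOGY_TERMS = [
    "Big Data", "Artificial Intelligence", "AI",
    "Deep Learning", "Machine Learning",
]
INSURANCE_TERMS = [
    "insurance Plan", "Insurance Policy", "structured claim data",
    "unstructured claim data", "Insurance Sector", "Insurance company",
    "Life Insurance", "Health Insurance", "Mediclaim", "General Insurance",
]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(technology_terms, insurance_terms):
    """Combine the two OR-groups with AND (hypothetical helper for this sketch)."""
    return "(" + or_group(technology_terms) + " AND " + or_group(insurance_terms) + ")"


query = build_query(TECHNOLOGY_TERMS, INSURANCE_TERMS)
print(query)
```

Keeping the two term lists separate makes it easy to rerun the search with an added synonym in either group without retyping the whole string.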
Bibliometrics is a quantitative method that evaluates the number and quality of publications
using relational, evaluative, and descriptive methods. Descriptive techniques produce simple
descriptions of bibliographic data, whereas evaluative methods assess the impact of
publications (McBurney & Novak 2002). Relational approaches are used to examine the
relationships between units, such as authors, documents, sources, organizations, and
countries, and to evaluate the structure of a field of study. The following bibliometric
techniques are used in this study to answer the RQs:
(1) Bibliographic Coupling,
(2) Citation Analysis,
(3) Co-authorship and Collaboration Analysis,
(4) Co-word Analysis.
Bibliographic coupling and citation analysis are seen as indicators of influence and resemblance (Niñerola et al. 2019, Zupic & Čater 2015). Co-word analysis uses keywords to show the relationships between different concepts, whereas co-authorship and collaboration analysis evaluates the collaboration between scholars, organizations, and countries (Hui & Fong 2004, Wang et al. 2012). These techniques have been widely used in a variety of past bibliometric studies (Liu et al. 2005, Xu et al. 2018, Zamore et al. 2018). The research first provides the descriptive statistics for the bibliometric data. The dynamics of the subject field are then examined using bibliographic coupling, citation analysis, co-authorship analysis, and co-word analysis. The analytical framework of our study is displayed in Fig. 3.
[Figure 3: block diagram. Evaluation of big data, AI and ML application in the insurance sector; bibliometric descriptive analysis (annual trend, most active journal, leading countries, leading authors and organization of affiliation); bibliometric and network analysis (citation analysis, keyword and co-occurrence analysis, co-authorship analysis, bibliographic coupling analysis); cluster analysis; thematic analysis; gap analysis.]
Fig. 3. Analytical framework of our study.
In our study, we performed bibliometric analysis on the literature of big data applications in the insurance sector using citation analysis, co-authorship analysis, keyword and co-occurrence analysis, bibliographic coupling, etc. to answer the RQs (Castriotta et al. 2018, Korom 2019, Xu & Yu 2019).
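As an illustration of bibliographic coupling, the sketch below uses the standard definition: two documents are coupled when their reference lists overlap, and the coupling strength is the number of shared references. The document identifiers and reference sets are hypothetical toy data, not records from the study, and the sketch does not reproduce the normalization or clustering that VOSviewer applies.

```python
from itertools import combinations

# Hypothetical toy data: each document maps to the set of references it cites.
references = {
    "doc_A": {"r1", "r2", "r3"},
    "doc_B": {"r2", "r3", "r4"},
    "doc_C": {"r5"},
}


def coupling_strength(refs):
    """Return {(doc_i, doc_j): shared reference count} for every coupled pair."""
    strengths = {}
    for doc_i, doc_j in combinations(sorted(refs), 2):
        shared = len(refs[doc_i] & refs[doc_j])
        if shared > 0:  # pairs with no shared reference are not coupled
            strengths[(doc_i, doc_j)] = shared
    return strengths


links = coupling_strength(references)
print(links)
```

In this toy collection only doc_A and doc_B are coupled, because they share two references; doc_C shares none and stays unlinked.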
3. Analysis and Findings

RQ1 of our study asks: what is the current publication trend in the domain of big data applications in the insurance sector? To answer RQ1, we considered several parameters of the publication trend, namely publications by year, author, journal, organization, and country. The data for this analysis were collected from the Scopus database, and we performed bibliometric analysis to draw relevant insights.
3.1. Data summary

Figure 4 presents the descriptive statistics for the 191 research articles, which were published at a growth rate of 8.98% per annum since 1976. In total, 550 authors have contributed to this research domain, of whom 28 are single authors. The figure shows that the number of co-authors per document is 3.04, whereas the international co-authorship percentage is 18.32%. The average number of citations per document is 10.73, and the average document age is 6.04 years. The total number of references in these articles is 6954.
Fig. 4. Descriptive statistics.
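The annual growth rate can be illustrated with a compound annual growth rate calculation. This is a sketch under two assumptions that the paper does not state: that the reported rate is a compound rate between the first and last year, and that the yearly counts are 1 document in 1976 and 48 documents in 2021 (values read off the data labels of Fig. 5).

```python
def compound_annual_growth_rate(first_count, last_count, first_year, last_year):
    """Percentage growth per year that turns first_count into last_count."""
    periods = last_year - first_year
    return ((last_count / first_count) ** (1 / periods) - 1) * 100


# Assumed endpoint counts (read from Fig. 5): 1 article in 1976, 48 in 2021.
rate = compound_annual_growth_rate(1, 48, 1976, 2021)
print(f"{rate:.2f}% per annum")  # 8.98% per annum
```

Under these assumptions the result agrees with the 8.98% reported in Fig. 4, which suggests the figure summarizes growth between the two endpoint years rather than an average of year-on-year changes.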
3.2. Publication by year

The number of publications on the topic between 1976 and 2021 is presented in Fig. 5. The figure shows a sharp increase in publications after 2016. Advanced technology, innovation, and digitalization have transformed work processes in every sector, including the insurance sector (Manyika et al. 2017), which has broadened the scope of research on this topic.
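Fig. 5 also reports a fitted cubic trend line, y = 0.0025x^3 - 0.1414x^2 + 2.128x - 6.5887 (R² = 0.8067). The sketch below evaluates that polynomial. It assumes that x is the position of the year in the series (x = 1 for 1976 up to x = 46 for 2021); the paper does not state this convention, so the year-to-index mapping is hypothetical.

```python
# Coefficients of the trend line printed in Fig. 5 (cubic, quadratic, linear, constant).
COEFFS = (0.0025, -0.1414, 2.128, -6.5887)


def trend(x):
    """Evaluate the cubic trend line with Horner's scheme."""
    a, b, c, d = COEFFS
    return ((a * x + b) * x + c) * x + d


def year_index(year, first_year=1976):
    """Assumed mapping from calendar year to the x value used in the fit."""
    return year - first_year + 1


for year in (2011, 2016, 2021):
    print(year, round(trend(year_index(year)), 2))
```

Comparing the fitted values for successive years with the bars in Fig. 5 gives a quick sense of how closely the cubic follows the late surge in publications.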
3.3. Publication by country

Big data application in the insurance sector is an emerging research area that has attracted many researchers to explore its benefits and challenges and to study the application of big data and AI in different contexts of insurance industry decisions. Table 2 shows the top publishing countries in this domain based on both number of documents and citations. The top three countries by documents are the United States, India, and China, respectively, whereas the top three countries by citations are the United States, China, and India.
[Figure 5: bar chart "Annual Production of Documents"; x-axis: Year (1976 to 2021); y-axis: No. of Documents; cubic trend line y = 0.0025x^3 - 0.1414x^2 + 2.128x - 6.5887, R² = 0.8067.]
Fig. 5. Publication by year.
Table 2. Leading countries by documents and citation.
Rank | Country | No. of articles | Rank | Country | Citations
1 | United States | 52 | 1 | United States | 576
2 | India | 24 | 2 | China | 345
3 | China | 16 | 3 | India | 199
4 | Germany | 16 | 4 | Germany | 188
5 | Taiwan | 12 | 5 | Turkey | 111
6 | Italy | 9 | 6 | Australia | 111
7 | United Kingdom | 9 | 7 | Spain | 60
8 | Spain | 6 | 8 | Italy | 59
9 | France | 5 | 9 | Taiwan | 57
10 | Belgium | 5 | 10 | Slovenia | 54
3.4. Publications by journal

The 191 articles appeared in 148 journals. The journals with the most articles on big data applications in the insurance sector are listed in Table 3. The themes of these journals are in line with the topic considered. Table 3 shows that the leading journals publishing articles in this domain are Risks, Applied Stochastic Models in Business and Industry, Expert Systems with Applications, Big Data and Society, and the International Journal of Advanced Computer Science and Applications, among others.
Table 3. Top publishing journals on big data applications in insurance sector.
Rank | Sources | Publisher | Articles | SJR score | Q rating
1 | Risks | MDPI | 9 | 0.4 | Q2
2 | Applied Stochastic Models in Business and Industry | Wiley-Blackwell Publishing | 5 | 0.46 | Q2
3 | Expert Systems with Applications | Elsevier | 4 | 2.07 | Q1
4 | Big Data and Society | Sage Journals | 3 | 2.04 | Q1
5 | International Journal of Advanced Computer Science and Applications | Science and Information Organization | 3 | 0.28 | Q3
6 | Lecture Notes in Computer Science | Springer Science | 3 | 0.41 | Q2
7 | Decision Support Systems | Elsevier | 2 | 1.97 | Q1
8 | Journal of Big Data | Springer Open | 2 | 2.59 | Q1
9 | Applied Soft Computing Journal | Elsevier | 2 | 1.96 | Q2
10 | Journal of Ambient Intelligence & Humanized Computing | Springer | 2 | 0.91 | Q1
3.5. Publication by author and organization

According to the data extracted from Scopus, 628 authors from 280 organizations have contributed research papers in this domain. The researchers who have contributed high-impact research work and the organizations that have published articles on big data applications in the insurance sector are listed in Table 4, sorted by citation count. The citation counts of the listed authors vary from 29 to 121, whereas the citation counts for organizations vary from 2 to 181.
Each of the top 10 ranked researchers has published only two documents on the topic considered. Wang contributed two documents, which have the highest citation score of 121. The first paper focused on analyzing textual information in claims to detect fraud; more precisely, it discussed a process for detecting automobile insurance fraud using text mining methods. The second paper discussed predicting driving risk through in-depth analysis using machine learning techniques. The output of this research helps the insurer to set the premium accordingly. The research work of Zhang has the second highest citation score of 81. The paper proposed a novel hybrid model for customer relationship management (CRM) in the insurance industry. The proposed model helps the insurance company to identify clusters of similar customers by processing linguistic terms and crisp number data. Major and Riedinger have a citation score of 67 each. Their paper discussed the detection of healthcare provider fraud and provides a machine learning-based process that detects such fraud by integrating expert knowledge and statistical information.
The most cited organizations in this domain are Tremont Research Institute of USA; School of Information, Renmin University of China; Smart City Research Centre, China; and Capital Markets Cooperative Research Centre of Australia, with 1 document each and 181, 120, 120, and 105 citations, respectively. The network of co-authorship of authors and countries is addressed by our RQ4 in the later sections of our study.
Table 4. Top contributing authors and organizations.
Rank | Author | TP | TC | Organization | TP | TC
1 | Wang Y. | 2 | 121 | School of Statistics and Mathematics, Zhejiang Gongshang University, China | 2 | 3
2 | Zhang J. | 2 | 81 | Department of Business & Management, Webster Vienna Private University, Austria | 1 | 2
3 | Major J. A. | 2 | 67 | Department of Economics and Business, Saint Anselm College, Manchester, United States | 1 | 2
4 | Riedinger D. R. | 2 | 67 | Department of Tourism, Faculty of Economic Sciences, Ionian University, Greece | 1 | 2
5 | Alcañiz M. | 2 | 35 | Research Institute of Energy Management and Planning, University of Tehran, Iran | 1 | 2
6 | Guillen M. | 2 | 35 | CMR Institute of Technology Bangalore, India | 1 | 1
7 | Khoshgoftaar T. M. | 2 | 34 | Visveswaraya Technological University, India | 1 | 1
8 | Brockett P. L. | 2 | 33 | Department Of Business And Management, University Of Sussex, United Kingdom | 1 | 16
9 | Bhattacharyya S. S. | 2 | 29 | Information And Process Management Department, Bentley University, United States | 1 | 16
10 | Krishnamoorthy B. | 2 | 29 | Saunders College Of Business, Rochester Institute of Technology, United States | 1 | 16
Note: TP = Total Publications and TC = Total Citations.
3.6. Citation network analysis

The second RQ of our study is to explore the most influential articles in the domain of big data applications in the insurance sector. We performed citation network analysis of 216 articles to answer this RQ, using R Biblioshiny for global and local citations and VOSviewer for the citation network map.
A citation is the decision of an author to link his or her document to another author's work at a particular point (Kampis et al. 2009). Citation analysis is one of the most relevant ways to measure the impact of a research article and to build intellectual linkages (Appio et al. 2014, Ding & Cronin 2011). An article's impact depends on the citations it receives from other works. Niñerola et al. (2019) remarked that citations measure the influence and degree of recognition of an author, an article, or a journal.
The top 10 research articles by global and local citation are displayed in Table 5. Local citation indicates an article's impact within the collection of articles considered on this topic, whereas global citation is the number of times an article in the database is cited by other works across research disciplines. Table 5 shows that Gerson & Star (1986) has the highest global citation count of 181, followed by Wang & Xu (2018) with 120 and Srinivasan & Arunasalam (2013) with 105. Riikkinen et al. (2018) has the highest local citation count, followed by Wang & Xu (2018), Ince & Aktan (2009), and Kose et al. (2015) with three local citations each.
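The difference between the two counts can be made concrete with a small sketch. The records below are hypothetical: "global_citations" stands for the count reported by the database, and "cites" lists the works each document references. Local citations are then the citations a document receives from other documents inside the same collection.

```python
# Hypothetical records; identifiers starting with "X" are works outside the collection.
collection = {
    "P1": {"global_citations": 120, "cites": ["P3", "X9"]},
    "P2": {"global_citations": 53, "cites": ["P1", "P3"]},
    "P3": {"global_citations": 181, "cites": ["X1", "X2"]},
}


def local_citations(docs):
    """Count, for each document, how many other documents in the collection cite it."""
    counts = {doc_id: 0 for doc_id in docs}
    for doc_id, record in docs.items():
        for cited in set(record["cites"]):
            if cited in counts and cited != doc_id:  # ignore outside works and self-citations
                counts[cited] += 1
    return counts


local = local_citations(collection)
for doc_id, record in collection.items():
    print(doc_id, record["global_citations"], local[doc_id])
```

A document can therefore rank high globally and still have zero local citations when the works citing it lie outside the collection, which matches the pattern for several entries in Table 5.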
3.7. Keyword and co-occurrence analysis

Publication clustering requires a notion of relatedness. In bibliometrics, relatedness is frequently calculated using either word relationships or citation relationships (Van Eck & Waltman 2017). Authors' Keywords, Index Keywords, or Keywords Plus are used as relatedness indicators when the relationship is determined from words (Boyack & Klavans 2010). Authors' Keywords are a collection of terms that, in the authors' opinion, best capture the essence of their work and draw attention to any overarching themes, whereas Publisher's Keywords are used by the publisher to index the documents. The database-generated extended keywords or phrases known as "Keywords Plus" are drawn from the references of papers rather than from the titles or keywords of those publications (Tripathi et al. 2018, Zhang et al. 2015). Citation-based relationships are further categorized into direct citation, co-citation, and bibliographic coupling relationships (Klavans & Boyack 2017).
Table 5. Top 10 articles based on Global and Local citation.
Sl. No. | Article | Authors | Source | Publication Year | Global Citation | Local Citation
1 | Analyzing Due Process in the Workplace | Gerson & Star | ACM Transactions on Information Systems | 1986 | 181 | 0
2 | Leveraging Deep Learning with LDA-Based Text Analytics to Detect Automobile Insurance Fraud | Wang & Xu | Decision Support Systems | 2018 | 120 | 3
3 | Leveraging Big Data Analytics to Reduce Healthcare Costs | Srinivasan & Arunasalam | IT Professional | 2013 | 105 | 1
4 | Body Area Network BAN–A Key Infrastructure Element for Patient-Centered Medical Applications | Schmidt, Norgall, Mörsdorf, Bernhard & von der Grün T | Biomedizinische Technik | 2002 | 100 | 0
5 | Lightweight RFID Protocol for Medical Privacy Protection in IoT | Fan, Jiang, Li & Yang Y | IEEE Transactions on Industrial Informatics | 2018 | 85 | 0
6 | Predictive Modeling of Hospital Readmissions Using Metaheuristics and Data Mining | Zheng, Zhang, Yoon, Lam, Khasawneh & Poranki | Expert Systems with Applications | 2015 | 81 | 0
7 | Online Clinical Decision Support System Using Optimal Deep Neural Networks | Lakshmanaprabu, Mohanty & Krishnamoorthy S | Applied Soft Computing | 2019 | 55 | 0
8 | Using Artificial Intelligence to Create Value in Insurance | Riikkinen M, Saarijärvi H, Sarlin P & Lähteenmäki I | International Journal of Bank Marketing | 2018 | 53 | 5
9 | A Comparison of Data Mining Techniques for Credit Scoring in Banking: A Managerial Perspective | Ince H & Aktan B | Journal of Business Economics and Management | 2009 | 52 | 3
10 | An Interactive Machine-Learning-Based Electronic Fraud and Abuse Detection System in Healthcare Insurance | Kose I, Gokturk M & Kilic K | Applied Soft Computing | 2015 | 51 | 3
For assessing the relatedness of publications, a combined approach often considers both citation linkages and word relations (Boyack & Klavans 2010). Co-word analysis and bibliographic coupling of texts are used in this study to highlight themes and group publications together, since they can reveal current topics (Chang et al. 2015). Citation analysis and co-citation analysis may also be used to map the intellectual structure; however, these techniques cannot identify emergent ideas.
Co-word analysis is a technique for examining significant word co-occurrences and for identifying relationships and interactions between research themes and contemporary research trends. Keywords are the phrases and words that authors regularly employ in a paper's title, abstract, and body. Co-word analysis is used in this paper to illustrate the interactions that take place across the different phases of the innovation process and to show whether fundamental or applied research is the primary force (Callon et al. 1991). Two keywords, i and j, are said to co-occur when they appear together in the summary of a single document. Simply counting co-occurrences, however, does not by itself reveal the strength of the links between words: terms that are used often (indeed, almost systematically) when indexing the documents in the file under evaluation gain an advantage over words that are used less frequently (Callon et al. 1991).
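The point about raw counts can be illustrated with a short sketch. It counts keyword occurrences and pairwise co-occurrences over hypothetical documents and then applies the equivalence index, e_ij = c_ij^2 / (c_i c_j), a normalization used in the co-word literature to offset the advantage of frequent terms. The keyword lists are toy data, and the paper does not state which normalization, if any, was applied in its own analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per document.
documents = [
    ["machine learning", "fraud detection", "health insurance"],
    ["machine learning", "fraud detection"],
    ["machine learning", "big data"],
    ["big data", "health insurance"],
]

occurrences = Counter()      # c_i: number of documents containing keyword i
co_occurrences = Counter()   # c_ij: number of documents containing both i and j
for keywords in documents:
    unique = sorted(set(keywords))
    occurrences.update(unique)
    co_occurrences.update(combinations(unique, 2))


def equivalence_index(i, j):
    """e_ij = c_ij^2 / (c_i * c_j); ranges from 0 (never together) to 1 (always together)."""
    pair = tuple(sorted((i, j)))
    return co_occurrences[pair] ** 2 / (occurrences[i] * occurrences[j])


print(co_occurrences[("fraud detection", "machine learning")])
print(round(equivalence_index("machine learning", "fraud detection"), 3))
print(round(equivalence_index("big data", "health insurance"), 3))
```

Dividing by the individual frequencies means that a pair of rare keywords that always appear together can score higher than a pair of very common keywords that meet only occasionally.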
Co-word analysis uses the interactions between groups of keywords to map the relationships between objects in textual data and ideas in the literature (Wang et al. 2012). It is predicated on the idea that keywords reflect a document's main ideas and give a quick overview of the material already available on a given subject (Khanra et al. 2020). This approach is frequently used to understand trends and hot issues in a particular scientific field. According to Garfield (1990), Keywords Plus can capture a document's content in depth, although few studies have demonstrated this (Zhang et al. 2015). Additionally, authors' keywords and index keywords place greater emphasis on identifying a document with a particular concept or topic, whereas the terms in Keywords Plus place more stress on research methodologies, tools, and approaches (Garfield 1990, Garfield & Sher 1993, Zhang et al. 2015). This study analyzes authors' keywords and index keywords, together referred to as keyword information. The keyword data reveal which terms are often used in the articles. According to the bibliometric data, the authors of the publications included in this study contributed a total of 457 keywords, while the publishers indexed the materials using 409 keywords. The first stage was the standardization of keywords, which followed the principle of simplicity (Valderrama-Zurian et al. 2017).
Table 6. Most frequent keywords.
Rank | Words | Frequency | Rank | Words | Frequency
1 | Machine Learning | 44 | 11 | Big Data Analytics | 6
2 | Big Data | 30 | 12 | Clinical Decision Support System | 6
3 | Artificial Intelligence | 13 | 13 | Natural Language Processing | 5
4 | Insurance | 13 | 14 | Chronic Disease | 4
5 | Neural Network | 13 | 15 | Clustering | 4
6 | Data Mining | 12 | 16 | Decision Support System | 4
7 | Classification | 11 | 17 | Insurtech | 4
8 | Deep Learning | 11 | 18 | Medicare | 4
9 | Health Insurance | 9 | 19 | Prediction | 4
10 | Fraud Detection | 7 | 20 | Privacy | 4
For instance, full forms and acronyms ("Artificial Intelligence" and "AI") as well as singular and plural forms ("Decision Support System" and "Decision Support Systems") were standardized. This was done manually for the top 100 most frequently occurring terms. In total, 685 author keywords and 1614 index keywords were produced. Table 6 lists the top 20 authors' keywords by frequency and shows that the keywords authors used most are machine learning, big data, artificial intelligence, and insurance. A fuller picture of these authors' keywords is given in Fig. 6. Authors have highlighted "machine learning", "supervised learning", and "artificial neural network" as used for "classification" of insurance customers. Similarly, they have used "support vector machine", "neural network", and "deep neural network" to boost "customer relationship" in insurance companies. The frequency of the authors' keywords validates the search term used to select documents. The perspective of indexers, however, seems to be broader, covering the insurance industry, information technology, and financial technology in insurance management. Authors' keywords have been used for further analysis since they are carefully selected by the writers and are thought to be the most precise description of an article's content (Song et al. 2019, Zhang et al. 2015).
Fig. 6. Keyword co-occurrence network map.
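The keyword standardization described above was done manually by the authors; the sketch below only illustrates the same two rules (acronym to full form, plural to singular) in code. Both lookup tables are hypothetical examples, not the authors' actual lists.

```python
import re

# Hypothetical mapping of acronyms to full forms.
ACRONYMS = {"ai": "artificial intelligence", "ml": "machine learning"}
# Hypothetical mapping of plural forms to singular forms.
PLURALS = {
    "decision support systems": "decision support system",
    "neural networks": "neural network",
}


def standardize(keyword):
    """Lower-case, collapse whitespace, expand acronyms, then singularize."""
    key = re.sub(r"\s+", " ", keyword.strip().lower())
    key = ACRONYMS.get(key, key)
    return PLURALS.get(key, key)


raw = ["AI", "Artificial  Intelligence", "Decision Support Systems", "decision support system"]
standardized = sorted({standardize(keyword) for keyword in raw})
print(standardized)
```

Explicit lookup tables are used instead of automatic stemming so that every merge of two keyword variants remains a visible, reviewable decision, which is closer in spirit to a manual procedure.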
RQ3 of our study asks which themes are the most popular in the domain of big data applications in the insurance sector. We answered this RQ using the keyword and co-occurrence analysis performed in R Biblioshiny. This section highlights the most frequently used authors' keywords in the relevant publications. Keyword analysis helps to identify prominent research topics in this area, since an article's content can be represented by its keywords (Comerio & Strozzi 2019). Keyword co-occurrence is the link between two keywords that appear together in an article, which indicates a relationship between them.
The keyword analysis output is displayed in Table 6, and the keyword co-occurrence network map, produced with the VOSviewer network visualization, is displayed in Fig. 6; together they characterize the literature on big data applications in the insurance sector. The top 10 most frequently used keywords are machine learning, big data, insurance, artificial intelligence, deep learning, classification, data mining, fraud detection, health insurance, and insurtech. This further indicates that research on big data applications in the insurance sector is mostly centered on insurance claims, health insurance, fraud detection, and technological innovations. To build the keyword co-occurrence network, the minimum number of occurrences of a keyword was set to 2. Out of 685 authors' keywords, 100 met this threshold.
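The thresholding step can be sketched as a frequency count followed by a filter. The keyword lists are hypothetical; only the minimum of 2 occurrences comes from the text.

```python
from collections import Counter

# Hypothetical author-keyword lists, one per article.
articles = [
    ["machine learning", "insurance"],
    ["machine learning", "fraud detection"],
    ["insurance", "insurtech"],
    ["telematics"],
]
MIN_OCCURRENCES = 2  # threshold stated in the text

# Count each keyword once per article, then keep those that reach the threshold.
frequency = Counter(kw for keywords in articles for kw in set(keywords))
retained = {kw: n for kw, n in frequency.items() if n >= MIN_OCCURRENCES}
print(sorted(retained.items()))
```

Only retained keywords become nodes in the co-occurrence network, so raising the threshold yields a smaller, less cluttered map at the cost of dropping rare terms.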
Fig. 6 also shows that insurtech, risk, digitalization, fraud detection, and health care have the most prominent nodes, indicating their relative importance in this research domain. The co-occurrence of authors' keywords shows that AI technology and IoT are applied especially to autonomous or automated vehicles, to understanding personal data and the risk profile of the insurance buyer, and to the innovation of insurance products for insurance companies. Techniques such as artificial neural networks, Bayesian networks, decision support systems, decision trees, random forests, logistic regression, support vector machines, and spatial analysis are used by insurance companies to analyze the health records of health insurance buyers. Product feature selection, the analysis of health care disparities, health insurance claims, and decisions on variable annuities or premiums are based on ensemble learning, ensemble modeling, and machine learning. In automobile insurance and health insurance, big data analytics, predictive analytics, and topic modeling are used for fraud detection and preventive care.
Fig. 7 shows that ensemble learning, artificial neural network, support vector regression, imbalanced data, and supervised learning are the newer big data analytics tools used by insurance companies in the most recent research publications. Risk assessment, reinsurance, and innovation are the fields of research emerging in the most recent documents.
Fig. 7. Keyword co-occurrence overlay visualization map.
3.8. Country co-authorship analysis

In this section, we answer our RQ4, which is defined as follows:
RQ4: Which are the most influential countries, and what is the present state of collaboration between the authors belonging to these countries?
In scientific research, collaboration amongst researchers brings intellectual association (Cisneros et al. 2018). The literature on a research topic is influenced by the references made to certain publications by the network of co-authors (Caviggioli & Ughetto 2019, Racherla & Hu 2010). Song et al. (2019) documented that the collaboration between individuals, organizations, and countries is explained by their social network. We identified the most influential authors and countries, and the network of collaboration of authors from these countries, by analyzing the current state and extent of collaboration amongst them.
The strength of association between the countries is displayed in Fig. 8. We set a minimum of five documents per country as the criterion, and fourteen countries met the threshold. We found that the US has the maximum number of documents, with 58 documents and 577 citations, followed by China with 26 documents and 203 citations and India with 24 documents and 348 citations. Fig. 8 shows that the most significant and frequent collaboration is found amongst the scholars of the US, China, and India. A decent degree of collaboration is seen between authors from Germany, Taiwan, the UK, and Italy. Italy has co-authorship with the US and the UK; Taiwan has co-authorship with the US, China, and India; and Germany has co-authorship only with the US.
Fig. 8. Country co-authorship network map.
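A country co-authorship network of this kind can be sketched as follows. Each document contributes one link between every pair of countries that appear among its authors' affiliations, and only countries that reach a minimum number of documents are kept. The data are hypothetical, and the threshold is lowered to 2 so that the toy example produces links; the study itself uses a minimum of five documents.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the set of author-affiliation countries on each document.
doc_countries = [
    {"United States", "China"},
    {"United States", "China", "India"},
    {"United States"},
    {"Germany", "United States"},
    {"India"},
]
MIN_DOCUMENTS = 2  # illustrative only; the study sets this to five

documents_per_country = Counter(c for countries in doc_countries for c in countries)
eligible = {c for c, n in documents_per_country.items() if n >= MIN_DOCUMENTS}

# Link strength: number of documents co-authored by a given pair of eligible countries.
link_strength = Counter()
for countries in doc_countries:
    for pair in combinations(sorted(countries & eligible), 2):
        link_strength[pair] += 1

print(sorted(eligible))
print(dict(link_strength))
```

Germany is dropped here because it appears on a single document, which mirrors how a document threshold removes weakly represented countries from the map before links are drawn.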
3.9. Bibliometric mapping and research themes

In our study, RQ5 is framed as "What is the intellectual structure of current research in the domain of big data applications in the insurance sector?" Bibliometric mapping has been widely used in research domains across disciplines, and it summarizes the commonalities in the content of research documents (Donthu et al. 2020, Homrich et al. 2018, Lee et al. 2014, Zhang & Banerji 2017). In our study, bibliographic coupling of documents is used to develop the bibliometric network (Fig. 9). In this network diagram, the articles are represented by nodes and the edges signify the links between the articles. Bibliographic coupling of documents yields various clusters of documents. The minimum number of citations of a document was set to 6; out of 216 documents, 74 met the threshold.
In our study, seven clusters are formed, comprising 37 articles. Table 7 shows the number of publications in each of these seven clusters from 1992 to April 2022.
In this section, we discuss the research themes identified from these seven clusters, referring to the most relevant articles in each. Table 8 lists the articles in these clusters along with their RQs and future implications. These articles represent various applications of big data, AI, and machine learning in the insurance sector. Table 8 also shows that big data analytics plays a vital role in insurance data handling, business decisions, transformation, and growth. These themes will enable researchers to identify gaps and to explore future research in this domain.
Theme 1 concerns fraud detection and prevention. Fraud has been one of the biggest challenges across the globe. Fraud is an act to achieve gains or benefits illegally on false grounds, which badly affects human morals, law and society, and the economic growth of a country (Alexopoulos et al. 2007). Gill (2016) remarked that insurance fraud can be defined as a fictitious claim, made either individually or in a group, that overstates a claim with the motive of gaining more than the entitled amount. Fraud is therefore a cybercrime that causes huge financial losses. It is evident from Table 7 that the articles related to this theme mainly explain insurance fraud detection using machine learning techniques.
Table 7. Number of articles on big data, AI, and ML application in Insurance industry in each cluster between 1992 and 2022.
Year Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5 Cluster 6 Cluster 7 Grand total
1986 0
1992 1 1
1997 0
2002 1 1
2007 1 1
2008 0
2009 1 1
2010 1 1
2012 0
2013 0
2014 1 1
2015 1 1 2
2016 0
2017 1 1 2
2018 1 1 2
2019 5 2 1 3 4 15
2020 2 1 3 6
2021 2 1 1 4
Total documents of the cluster 10 7 5 5 4 4 2 37
Total citations of the cluster 231 141 233 118 83 71 67 944
Total links of the cluster 84 28 13 18 15 44 30 232
Healthcare provider fraud and health insurance fraud can be effectively identified by using machine learning algorithms (Dua & Bais 2014, Johnson & Khoshgoftaar 2019, Major & Riedinger 2002, Song et al. 2019). Claim analysis using machine learning algorithms is used to detect insurance fraud and to classify claims as genuine or fraudulent (Rawat et al. 2021, Wang & Xu 2018). Fraud can be damaging to the insurance business, resulting in huge financial losses, and therefore needs to be prevented.
Pricing or ratemaking is another theme. Pricing or ratemaking is the process of determining what prices or rates an insurance company must charge. Big data and data mining techniques are used for insurance ratemaking; they help to determine the right amount of premium to be charged to customers (Barry & Charpentier 2020, Christmann et al. 2007). A study by Huang & Meng (2019) discussed classification ratemaking for usage-based insurance products.
Technology utilization is the next theme. Advanced technology and digital platforms have reshaped the insurance landscape. Artificial intelligence, the internet of things, blockchain, machine learning, and big data analytics are some of the technologies that have enriched the insurance industry by providing tech solutions to track, measure, and control risk. A study by Nayak et al. (2019) discussed about
Table 8. Research themes.
Cluster Research question

number Title of the article Author Year Name of the journal addressed TC TLS Theme
1 EFD: A hybrid knowledge/ John A. Major and 1992 International Journal How Electronic Fraud De- 20 15 Fraud Detection and
Statistical-based system Dan R. Riedinger of Intelligent Sys- tection (EFD) assists Prevention
for detection of Fraud tems Investigative Con-
(Major & Rie- sultants in the Managed
dinger 2002) Care and Employee
Bene¯ts Security Unit of
The Travelers Insurance
Companies in the detec-
tion and pre-investiga-

tive analysis of
healthcare provider
fraud.
Supervised learning meth- Prerna Dua and 2014 Intelligent Systems How to detect healthcare 11 2
ods for fraud detection Sonali Bais Reference Library fraud by using super-
in healthcare insurance vised machine learning
2330001-24
(Dua & Bais 2014) techniques.
Application of machine Seema Rawat, 2021 International Journal How can the claim analysis 11 1
learning and data visu- Aakankshu of Information help to understand the
alization techniques for Rawat, Deepak Management Data client strata in a sys-
decision support in the Kumar, A. Sai Insights tematic manner and
insurance sector (Rawat Sabitha helps in identifying
et al. 2021) fraud claims and genu-
ine claims by using ma-
chine learning
algorithm.
Development of a medical Chang-Woo Song, 2019 Cluster Computing How to detect fraudulent 40 1
big-data mining process Hoill Jung & and abnormal cases in
using topic modelling Kyungyong healthcare services by
(Song et al. 2019) Chung using machine learning
techniques.
Table 8. (Continued )

Leveraging deep learning Yibo Wang, Wei Xu 2018 Decision Support Sys- What process can be fol- 120 4
with LDA-based text tems lowed to analyze textual
analytics to detect au- information in the
tomobile insurance claims to detect insur-
fraud (Wang & ance fraud
Xu 2018) What process can be used to
detect automobile in-
surance fraud by using
text mining methods
where the experience of
human experts is
hidden.
Medicare fraud detection Justin M. Johnson & 2019 Journal of Big Data How the Medicare fraud, 26 8
using neural networks Taghi M. waste and abuse can be
(Johnson & Khoshgof- Khoshgoftaar detected using neural
2330001-25
taar 2019) networks?
2 On a strategy to develop Andreas Christmann 2007 Acta Mathematicae How data mining techni- 8 5 Pricing and Rate-
robust and simple tari®s Applicatae Sinica ques can be used to de- making
from motor vehicle in- termine the actual
surance data (Christ- premium to be charged
mann et al. 2007) to the customer.
Personalization as a prom- Laurence Barry & 2020 Big Data and Society What is the impact of big 13 27
ise: Can Big Data Arthur data technologies for in-
change the practice of Charpentier surance ratemaking?
insurance? (Barry & Does telematics removed
Charpentier 2020) or not risk apprehen-
sions and pricing in
motor insurance?

Automobile insurance clas- Yifan Huang & 2019 Decision Support Sys-
How to decide on classi¯ca- 30 17
si¯cation ratemaking Shengwang Meng tems tion of ratemaking for
based on telematics usage-based insurance
driving data (Huang & (UBI) product.
Meng 2019) How to predict the risk
probability and claim
frequency of an insured
vehicle?
3 Democratizing health in- Bishwajit Nayak, 2019 b Business Strategy and Which factors should a 13 8 Technology
surance services; accel- Som Sekhar Development health insurance ¯rm Utilization
erating social inclusion Bhattacharyya & consider developing its
through technology pol- Bala Krishna- technology policy?
icy of health insurance moorthy How can the technology
¯rms (Nayak et al. 2019) policy of a health insur-
ance ¯rm enhance social
2330001-26
inclusivity?
Integrating wearable tech- Bishwajit Nayak1 j 2019a Journal of Systems What are the key dynamic 16 5
nology products and big Som Sekhar and Information capabilities that health
data analytics in busi- Bhattacharyya2 j Technology insurance ¯rms should
ness strategy: A study of Bala Krishna- build to manage big
health insurance ¯rms moorthy (2019) data generated by wear-
able technology to at-
tain a competitive
advantage?
What is the impact of the
adoption of wearable
technology products for
Indian health insurance
¯rms?

|  | DeepReco: Deep learning based health recommender system using collaborative filtering (Sahoo et al. 2019) | Abhaya Kumar Sahoo, Chittaranjan Pradhan, Rabindra Kumar Barik and Harishchandra Dubey (2019) | 2019 | Computation | How big data analytics can be used for the implementation of an effective health recommender system/engine in health care industry. | 34 | 1 |  |
|  | The light and dark side of the black box: Sensor-based technology in the automotive industry (Marabelli et al. 2017) | Marco Marabelli, Sean Hansen, Sue Newell, Chiara Frigerio | 2017 | Communications of the Association for Information Systems | What are the uses of sensor-based technologies in the automotive insurance industry. | 16 | 5 |  |
|  | Smart services in healthcare: A risk-benefit-analysis of pay-as-you-live services from customer perspective in Germany (Wiegard & Breitner 2019) | Rouven-B. Wiegard & Michael H. Breitner | 2019 | Electronic Markets | What are the significant determinants of an insured's intention to use wearable devices in pay-as-you-live services by comparing perceived privacy risks and perceived benefits? | 20 | 1 |  |
| 4 | 'Happy failures': Experimentation with behavior-based personalization in car insurance (Meyers & Hoyweghen 2020) | Gert Meyers and Ine Van Hoyweghen | 2020 | Big Data and Society | What is the role of experimentation for the making of big data enabled personalization in insurance market? | 11 | 13 | Personalization |

|  | A Novel Hybrid Correlation Measure for Probabilistic Linguistic Term Sets and Crisp Numbers and Its Application in Customer Relationship Management (Zhang et al. 2019) | Xiaofang Zhang, Zeshui Xu, Peijia Ren | 2019 | International Journal of Information Technology and Decision Making | How to segment similar customers from massive insurance customer data using correlation measures and clustering algorithm. | 12 | 1 |  |
|  | A "pay-how-you-drive" car insurance approach through cluster analysis (Carfora et al. 2019) | Maria Francesca Carfora, Fabio Martinelli, Francesco Mercaldo, Vittoria Nardone, Albina Orlando | 2019 | Soft Computing | How to identify the driver's behavior and segment them using unsupervised machine learning techniques. | 27 | 8 |  |

| 5 | Assessing risk in life insurance using ensemble learning (Jain et al. 2019) | Rachna Jain, Jafar A. Alzubi, Nikita Jain and Pawan Joshi | 2019 | Journal of Intelligent and Fuzzy Systems | How to evaluate the risk associated with an insurance policy applicant by using ensemble learning. | 9 | 3 | Risk Management |
| 6 | Predicting motor insurance claims using telematics data: XGboost versus logistic regression (Pesantez-Narvaez et al. 2019) | Jessica Pesantez-Narvaez, Montserrat Guillen and Manuela Alcañiz | 2019 | Risks | How to predict the occurrence of an accident claim using machine learning techniques. | 35 | 9 | Claim Processing and Prediction |

|  | Machine learning improves accounting estimates: Evidence from insurance payments (Ding et al. 2020) | Kexing Ding, Baruch Lev, Xuan Peng & Ting Sun | 2020 | Review of Accounting Studies | Are the loss estimates generated by machine learning superior to actual managerial estimates? How to predict the total losses or claims by policyholders. | 15 | 3 |  |
|  | Covariate selection from telematics car driving data (Wüthrich 2017) | Mario V. Wüthrich | 2017 | European Actuarial Journal | How to estimate the driving habits and generate pattern in driving style by using telematics data and using machine learning techniques. | 24 | 3 |  |
|  | Responsible Artificial Intelligence in Healthcare: Predicting and Preventing Insurance Claim Denials for Economic and Social Wellbeing (Johnson et al. 2021) | Marina Johnson, Abdullah Albizri & Antoine Harfouche | 2021 | Information Systems Frontiers | How to identify potentially denied claims by using a responsible artificial intelligence approach. | 7 | 2 |  |
| 7 | How digitalization affects insurance companies: Overview and use cases of digital technologies (Eckert & Osterrieder 2020) | Christian Eckert, Katrin Osterrieder | 2020 | Zeitschrift fur die gesamte Versicherungswissenschaft | Which digital technologies have high strategic relevance for the digital transformation of insurance companies. What is the impact of digital technologies on the insurer's information technology (IT) system. | 8 | 3 | Digitalization |
Note: TLS – Total Link Strength and TC – Total Citation.

Fig. 9. Network map of bibliographic coupling of documents.
establishing a few technology factors that are essential for health insurance companies
in risk management, data warehousing, and data privacy. Another strand of research
documented the impact of the adoption of wearable technology products on Indian
health insurance firms (Nayak et al. 2019, Wiegard & Breitner 2019). Trust, privacy,
and risk are the main concerns associated with technology usage. Sensor-based
technologies are helping insurance companies gain a competitive advantage in risk
assessment and behavior-based pricing (Marabelli et al. 2017, Zarifi et al. 2018).
The next theme is personalization. In the insurance sector, personalization is
defined as developing a strong understanding of customers, simplifying customer
interactions, and providing the right kind of services according to customer needs.
Digitalization has made it easier for insurers, reinsurers, and insurance brokers to
drive personalization. To gain a competitive advantage, insurance companies are using
big data-enabled personalization to offer personalized insurance prices, services, and
products (Meyers & Hoyweghen 2020). Understanding customers plays a vital role in
business decisions, and big data analytics is used to track customer behavior
(Carfora et al. 2019, Zhang 2018).
Risk assessment is the next theme. One of the key objectives of insurance companies
is to diversify risk. Insurers verify customer information to assess risk and, based
on their behavior, segment customers into different risk classes. Big data analytics
has improved the efficiency of the risk assessment process in the insurance industry.
The insurer should offer the right kind of policy at the right premium, and to set
that premium it must evaluate the risks associated with the policy. Jain et al. (2019)
proposed a method for evaluating the risk associated with an insurance policy
applicant by using ensemble learning.
The next theme identified is claim processing and prediction. When a claimant files
a claim on a policy, the insurer checks the adequacy of the information and the
authenticity of the claim and then reimburses the money in part or in whole
accordingly. Big data and machine learning algorithms have made it easier to handle
huge volumes of insurance claim data. By using big data analytics to predict loss
estimates, authentic claims, and potentially denied claims, insurance companies can
reduce operational costs, increase the efficiency of the claim process, and gain a
competitive advantage (Ding et al. 2020, Pesantez-Narvaez et al. 2019).
Digitalization is another theme. Digitalization is driving significant changes in
the insurance sector. It enables insurance companies to use different digital channels
and advanced big data analytics for two-way interaction with customers. Insurance
operations such as claim processing and prediction, risk assessment, and pricing are
handled effectively because of digital transformation in the insurance industry.
Eckert & Osterrieder (2020) highlighted the benefits and opportunities of digital
technology for insurance companies in various operational and strategic decisions.
3.10. Development of an integrative framework

We have developed an integrative conceptual framework summarizing our cluster
analysis. The main objective of this bibliometric analysis is to understand the trend
and structure of research in the domain of big data applications in the insurance
sector. The conceptual framework displayed in Fig. 10 shows different big data tools
and techniques and their application in insurance decisions.
The documents in the clusters highlighted two broad segments of big data analytics:
(a) tools and algorithms and (b) infrastructure. The algorithms used in the extracted
papers are clustering, pattern recognition, classification, and regression. Big data
infrastructure supports big data management and processing, including data collection,
data storage, data transfer, and data backup (Tozzi et al. 2019).
Fig. 10. Conceptual framework of bigdata applications in insurance sector.
This research presents an integrated conceptual model that describes the big data
tools and techniques used in insurance business decisions (Fig. 10).
The insurance sector has experienced significant changes due to the evolution of
technology. Large amounts of insurance data generated from sources such as web
servers, sensors, health care data, telematics, and wearable technology are managed
and processed effectively by big data tools and techniques. The documents included in
the clusters also highlighted a few applications of big data tools, algorithms, and
infrastructure in the insurance sector. Our bibliometric study revealed that big data
tools, techniques, and infrastructure are widely used for insurance business decisions
such as fraud detection and prevention, tracking consumer behavior, technology
utilization, pricing and ratemaking, claim processing and prediction, risk management,
handling digital platforms, and personalization, which ultimately help insurance
companies in broad activities such as marketing, operations, and strategy building.
4. Findings and Conclusions

This section discusses the overall research output and suggests directions for future
research. We also identify some impediments that researchers face while working in
this domain. The descriptive analysis answers RQ1 and depicts the current research
trends in the domain of big data applications in the insurance sector. It is evident
from the bibliometric analysis that publications on this topic have been increasing
since 2008, and that a steep upward trend has been visible since 2019. One possible
reason is the global pandemic: organizations across sectors had to create digital
innovations and adopt data-driven, cloud-based business models, which gave academicians
and practitioners wider scope to pursue research on related aspects. We found that
authors and organizations across the globe have contributed to the literature on big
data applications in the insurance sector. The leading countries contributing to the
literature on the sample topic are the USA, India, and China. The citation network
analysis answers RQ2 and shows that Gerson & Star (1986) has the highest global
citation count (181), followed by Wang & Xu (2018) with 120 citations. Riikkinen
et al. (2018) received the highest local citation count, followed by Wang & Xu (2018).
The prominent keywords are identified and RQ3 is answered through keyword and
co-occurrence analysis. The prominent keywords in the sample domain are machine
learning, big data, insurance, artificial intelligence, deep learning, classification,
data mining, fraud detection, health insurance, and insurtech. The studies have focused
on insurance claim analysis and prediction, data handling, technology, innovation,
etc. RQ4 asks which countries are the most influential and what the current state of
collaboration is between authors from these countries. Our results show that the most
frequent collaborations on the topic under study are among scholars from the US, China,
and India, and that the US has the most citations, followed by China and India. The
current intellectual structure of the sample topic (RQ5) is addressed by the
bibliographic coupling analysis. From the clusters formed by bibliographic coupling,
we identified the following research themes: fraud detection and prevention, pricing
and ratemaking, technology utilization, personalization, risk assessment, claim
processing and prediction, and digitalization.
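To make the counting behind RQ3 and RQ5 concrete, the following minimal Python sketch computes keyword co-occurrence counts and bibliographic coupling strength on a few toy records. It is an illustration under stated assumptions, not the VOSviewer or Biblioshiny implementation used in the study: the record layout, the identifiers, and the function names are hypothetical, and coupling strength is taken here simply as the number of references two documents share.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count how often each unordered keyword pair appears in the same document."""
    counts = Counter()
    for record in records:
        # Deduplicate and sort so each pair is counted once per document.
        keywords = sorted(set(record["keywords"]))
        counts.update(combinations(keywords, 2))
    return counts


def bibliographic_coupling(records):
    """Return the number of shared references for each coupled pair of documents."""
    strength = {}
    for a, b in combinations(records, 2):
        shared = set(a["references"]) & set(b["references"])
        if shared:
            strength[(a["id"], b["id"])] = len(shared)
    return strength


# Hypothetical toy records; these are not data from the study.
records = [
    {"id": "doc1", "keywords": ["machine learning", "fraud detection"],
     "references": ["r1", "r2", "r3"]},
    {"id": "doc2", "keywords": ["machine learning", "telematics"],
     "references": ["r2", "r3", "r4"]},
    {"id": "doc3", "keywords": ["fraud detection", "machine learning"],
     "references": ["r5"]},
]

cooccurrence = keyword_cooccurrence(records)
coupling = bibliographic_coupling(records)
print(cooccurrence[("fraud detection", "machine learning")])  # 2
print(coupling)  # {('doc1', 'doc2'): 2}
```

Under these assumptions, summing a document's coupling weights over all of its partners would give a quantity analogous to the total link strength (TLS) reported in the cluster table, although the tools used in the study may apply additional normalization.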
The insurance sector is moving towards digitalization thanks to big data, machine
learning, artificial intelligence (AI), and neural networks. Insurance companies are
embracing these technologies to undertake digital transformation, improve sales,
eliminate fraud, and settle insurance claims on time. The bibliographic coupling of
research publications (RQ5) also shows applications of machine learning in fraud
detection and prevention and in precise risk analysis for premium calculation through
actuarial services. These services have an impact by boosting sales and customer
satisfaction, accelerating manual tasks, enhancing the path to purchase, streamlining
procedures and ensuring system dependability, analyzing financial performance, and
managing the growth of insurance companies. Despite these ground-breaking service
transformations, the insurance business, as part of the finance industry, still faces
a number of serious big data problems. One of the most important and pressing issues
with big data services is privacy and data protection, as observed in the co-word
analysis. Even though every financial service and product depends entirely on data and
generates data every second, big data research in insurance services has not yet
reached its pinnacle. In this light, it makes sense to set out the next research
directions based on the discussion in this work.
4.1. Impediments to current research

A systematic review of the literature suggests that the growth of research in this
domain is still limited. The factors that impede research growth are summarized as
follows:
(a) Lack of data availability: In most developing countries, it is difficult for
independent researchers to access insurance data, which restricts further and
better research in this domain.
(b) Lack of theory development: Most of the studies in this domain are empirical.
Therefore, more conceptual research built on a theoretical framework should be
encouraged.
(c) Lack of academic collaboration: More collaborations are needed across the globe
to develop better research frameworks.
4.2. Research gaps and future areas of research

The analysis shows that most of the research in the sample domain comes mainly from
the USA, India, and China (RQ4). However, the authors of these documents mostly belong
to the same country. Insurance sectors across the globe differ with respect to
technology adoption, digitalization, innovation, and overall operations. Therefore,
cross-country studies would provide better insights and open up a vast research scope
in this domain.
Most of the research works discussing the application of big data are empirical in
nature. Conceptual research papers with a robust framework that other researchers can
study are needed in this domain. From the literature analysis, we found that most
studies have proposed solutions to complex insurance business decisions such as claim
processing, data warehousing, fraud detection and prediction, risk assessment, and
ratemaking (RQ5). However, very few studies address personalization, digitalization,
human-centered AI system development, adoption of wearable technology, and telematics
data usage for insurance decisions, which the authors of the studied research papers
identify as avenues for future research.
The implementation of big data technology that supports the insurance industry's
business in numerous ways was the main theme of every research paper we studied.
Nevertheless, none of the research documents focuses on the benefits for customers,
such as how personalized insurance services may increase customer satisfaction and
wellbeing or how customers can safeguard themselves from the mis-selling tactics of
insurance agents and from being duped by insurance companies. Future research in this
field may also analyze consumer engagement and experiences with AI-enabled insurance
services. Therefore, we recommend that future research be more customer oriented.
The reviewed literature revealed that the sensors and internet of things (IoT)
frameworks that are part of wearable technology have not been extensively researched
in relation to a number of health management services (HMS), which have a significant
impact on the determination of rates and pricing for health insurance and life
insurance products. In particular, wearable devices used in HMS operations can enable
effective and smart diagnosis, supervision, and treatment of diseases and disorders,
which may lower the risk borne by insurance companies. Research materials on data
quality, ethics, and privacy concerns connected to big data technologies in insurance
businesses are hard to come by, possibly because of implementation issues (cost and
user adoption) and regulatory rules that prohibit the deployment of blockchain in HMS.
Future digital transformation will make it possible for clients and insurers to work
together on loss prevention, benefiting both sides. By investing in the avoidance of
detrimental events rather than concentrating on managing claims after the damage has
already occurred, insurers may gain a significant competitive edge in the market and
lower their costs. Analytics based on real data from devices that generate large
amounts of data, such as mobile phones, wearables, and telematics devices, offer
significant opportunities to prevent unwanted events, either by warning the insured in
case of danger or by highlighting the advantages of altering their behavior and
lifestyle.
The next step in personalization will be to pinpoint policyholders' future demands
considering their prospective life stages. Insurance firms strive to be their clients'
enduring partners and allies. As people use the Internet and social networks more
often, they produce an enormous amount of unstructured data. By examining this data,
insurance providers may develop marketing efforts that are specifically aimed at
attracting new customers. Because surveys are a significantly less reliable source of
information about user demands than the online activity of the insured, future
researchers interested in insurance technology can concentrate more on wearable
technology and customization to maximize the benefits to both insurance companies and
policyholders.
4.3. Theoretical contribution

Our bibliometric study on big data applications in the insurance sector will help
future researchers identify themes on big data applications in the insurance sector
and related domains for research and collaboration. Our study has structured the
current research in this domain and discussed the major contributions based on
authors, journals, documents, universities, countries, etc.
Earlier studies have discussed the impact of big data analytics on enterprises (Khanra
et al. 2020), big data applications in the banking sector (Nobanee et al. 2021), the use
of big data in product development (Zhan et al. 2018), the usefulness of big data
analysis in the healthcare industry for effective diagnosis and treatment (Mahajan &
Mehta 1984), and big data applications in the stock market (Hasan et al. 2020). Our
study extends the existing research to big data applications in the insurance sector.
We have proposed a few emerging themes and a conceptual framework that explains the
scope and direction for future research across different application areas of the
insurance sector.
4.4. Practical implications

Our study provides an overview of the emerging themes studied earlier in insurance
sector research using big data analytics. The outcomes of this bibliometric study will
help researchers explore more insurance application areas using alternative big data
tools and technology. This study should also motivate practitioners and firms to
invest more in big data tools and technology to identify and resolve complex business
problems with robust solutions, gain a competitive advantage, and ensure consistent
growth of the firm.
References
P. Alexopoulos, K. Kafentzis, X. Benetou, T. Tagaris & P. Georgolios (2007) Towards a
generic fraud ontology in e-government. ICE-B 2007 Proceedings of the 2nd International
Conference on e-Business, January, 269–276.
N. Altaf (2021) Two decades of big data in finance: Systematic literature review and future
research agenda. In: Big Data Analytics for Internet of Things, 351–365, Wiley.
F. P. Appio, F. Cesaroni & A. Di Minin (2014) Visualizing the structure and bridges of the
intellectual property management and strategy literature: A document co-citation
analysis, Scientometrics 101 (1), 623–661, doi: 10.1007/s11192-014-1329-0.
M. Aria & C. Cuccurullo (2017) Bibliometrix: An R-tool for comprehensive science mapping
analysis, Journal of Informetrics 11 (4), 959–975, doi: 10.1016/j.joi.2017.08.007.
S. Bahoo, I. Alon & A. Paltrinieri (2020) Corruption in international business: A review and
research agenda, International Business Review 29 (4), 101660.
H. K. Baker, N. Pandey, S. Kumar & A. Haldar (2020) A bibliometric analysis of board
diversity: Current status, development, and future research directions, Journal of
Business Research 108 (November 2019), 232–246, doi: 10.1016/j.jbusres.2019.11.025.
L. Barry & A. Charpentier (2020) Personalization as a promise: Can Big Data change the
practice of insurance? Big Data and Society 7 (1), 1–12, doi: 10.1177/2053951720935143.
T. Bartol & G. Budimir (2013) Assessment of research fields in Scopus and Web of Science in the
view of national research evaluation. Scientometrics 98, 1491–1504,
doi: 10.1007/s11192-013-1148-8.
M. Bartolini, E. Bottani & E. H. Grosse (2019) Green warehousing: Systematic literature
review and bibliometric analysis, Journal of Cleaner Production 226, 242–258, doi:
10.1016/j.jclepro.2019.04.055.
Y. Bhatt, K. Ghuman & A. Dhir (2020) Sustainable manufacturing. Bibliometrics and content
analysis, Journal of Cleaner Production 260, 120988,
doi: 10.1016/j.jclepro.2020.120988.
R. Bhukya, J. Paul, M. Kastanakis & S. Robinson (2022) Forty years of European management
journal: A bibliometric overview, European Management Journal 40 (1), 10–28,
doi: 10.1016/j.emj.2021.04.001.
K. W. Boyack & R. Klavans (2010) Co-citation analysis, bibliographic coupling, and direct
citation: Which citation approach represents the research front most accurately?
Journal of the American Society for Information Science and Technology 61 (12),
2389–2404, doi: 10.1002/asi.21419.
D. Boyd & K. Crawford (2012) Critical questions for big data: Provocations for a cultural,
technological, and scholarly phenomenon, Information, Communication & Society
15 (5), 662–679.
M. Callon, J. P. Courtial & F. Laville (1991) Co-word analysis as a tool for describing the
network of interactions between basic and technological research: The case of polymer
chemistry. Scientometrics 22, 155–205.
M. F. Carfora, F. Martinelli, F. Mercaldo, V. Nardone, A. Orlando, A. Santone & G. Vaglini
(2019) A "pay-how-you-drive" car insurance approach through cluster analysis, Soft
Computing 23 (9), 2863–2875, doi: 10.1007/s00500-018-3274-y.
M. Castriotta, M. Loi, E. Marku & L. Naitana (2018) What's in a name? Exploring the
conceptual structure of emerging organizations, Scientometrics 118, 407–437, doi:
10.1007/s11192-018-2977-2.
F. Caviggioli & E. Ughetto (2019) A bibliometric analysis of the research dealing with the
impact of additive manufacturing on industry, business and society, International
Journal of Production Economics 208, 254–268, doi: 10.1016/j.ijpe.2018.11.022.
Y. W. Chang, M. H. Huang & C. W. Lin (2015) Evolution of research subjects in library and
information science based on keyword, bibliographical coupling, and cocitation analyses,
Scientometrics 105 (3), 2071–2087, doi: 10.1007/s11192-015-1762-8.
C. P. Chen & C. Y. Zhang (2014) Data-intensive applications, challenges, techniques and
technologies: A survey on Big Data, Information Sciences 275, 314–347.
H. Chen, R. H. L. Chiang, V. C. Storey, C. H. Lindner & J. M. Robinson (2012) Business
intelligence and analytics: From big data to big impact, MIS Quarterly 36 (4), 1165–1188.
A. Christmann, I. Steinwart & M. Hubert (2007) Robust learning from bites for data mining,
Computational Statistics and Data Analysis 52 (1), 347–361,
doi: 10.1016/j.csda.2006.12.009.
C. Churchill (2007) Insuring the low-income market: Challenges and solutions for commercial
insurers, The Geneva Papers on Risk and Insurance-Issues and Practice 32, 401–412.
L. Cisneros, M. Ibanescu, C. Keen, O. Lobato-Calleros & J. Niebla-Zatarain (2018) Bibliometric
study of family business succession between 1939 and 2017: Mapping and
analyzing authors' networks, Scientometrics 117 (2), 919–951,
doi: 10.1007/s11192-018-2889-1.
M. J. Cobo, A. G. Lopez-Herrera, E. Herrera-Viedma & F. Herrera (2011) An approach for
detecting, quantifying, and visualizing the evolution of a research field: A practical
application to the fuzzy sets theory field, Journal of Informetrics 5 (1), 146–166, doi:
10.1016/j.joi.2010.10.002.
N. Comerio & F. Strozzi (2019) Tourism and its economic impact: A literature review using
bibliometric tools, Tourism Economics 25 (1), 109–131,
doi: 10.1177/1354816618793762.
T. H. Davenport, P. Barth & R. Bean (2012) How "Big Data" is different, MIT Sloan
Management Review 54 (1), 21–24.
Y. Ding & B. Cronin (2011) Popular and/or prestigious? Measures of scholarly esteem,
Information Processing and Management 47 (1), 80–96, doi: 10.1016/j.ipm.2010.01.002.
K. Ding, B. Lev, X. Peng, T. Sun & M. A. Vasarhelyi (2020) Machine learning improves
accounting estimates: Evidence from insurance payments, Review of Accounting Studies
25 (3), 1098–1134, doi: 10.1007/s11142-020-09546-9.
N. Donthu, S. Kumar & D. Pattnaik (2020) Forty-five years of Journal of Business Research:
A bibliometric analysis, Journal of Business Research 109, 1–14,
doi: 10.1016/j.jbusres.2019.10.039.
N. Donthu, S. Kumar, D. Mukherjee, N. Pandey & W. M. Lim (2021) How to conduct a
bibliometric analysis: An overview and guidelines, Journal of Business Research 133,
285–296, doi: 10.1016/j.jbusres.2021.04.070.
P. Dua & S. Bais (2014) Supervised learning methods for fraud detection in healthcare
insurance, Intelligent Systems Reference Library 56, 261–285,
doi: 10.1007/978-3-642-40017-9_12.
M. S. Eastin, N. H. Brinson, A. Doorey & G. Wilcox (2016) Living in a big data world:
Predicting mobile commerce activity through privacy concerns, Computers in Human
Behavior 58, 214–220, doi: 10.1016/j.chb.2015.12.050.
C. Eckert & K. Osterrieder (2020) How digitalization affects insurance companies: Overview
and use cases of digital technologies, Zeitschrift Fur Die Gesamte
Versicherungswissenschaft 109 (5), 333–360, doi: 10.1007/s12297-020-00475-9.
C. R. Ernest (2011) Big data, analytics and the path from insights to value, MIT Sloan
Management Review 52 (2), 3–14.
F. A. Ferreira (2018) Mapping the field of arts-based management: Bibliographic coupling and
co-citation analyses, Journal of Business Research 85, 348–357,
doi: 10.1016/j.jbusres.2017.03.026.
A. Fuad, Y. C. G. Lee & C. Y. Hsu (2020) Bibliometric analysis of bioscience trends journal
(2007–2017): Knowledge dynamics and visualization, Library Philosophy and Practice
1–18.
F. Fusco, M. Marsilio & C. Guglielmetti (2020) Co-production in health policy and management:
A comprehensive bibliometric review, BMC Health Services Research 20, 1–16.
M. Gagolewski (2011) Bibliometric impact assessment with R and the CITAN package.
Journal of Informetrics, 5 (4), 678–692.
A. Gandomi & M. Haider (2015) Beyond the hype: Big data concepts, methods, and analytics,
International Journal of Information Management 35 (2), 137–144,
doi: 10.1016/j.ijinfomgt.2014.10.007.
E. Garfield & I. H. Sher (1993) Key words plus [TM]-algorithmic derivative indexing, Journal
of the American Society for Information Science 44, 298.
E. Garfield (1990) Keywords Plus: ISI's breakthrough retrieval method. Part 1. Expanding
your searching power on current contents on diskette, Current Contents 32, 5–9.
A. Gaur & M. Kumar (2018) A systematic approach to conducting review studies: An assessment
of content analysis in 25 years of IB research, Journal of World Business
53 (2), 280–289.
G. George, M. R. Haas & A. Pentland (2014) Institutional Knowledge at Singapore Management
University Big Data and Management: From the Editors. Research Collection
Lee Kong Chian School of Business 2014 (4), 321–326, doi: 10.5465/amj.2014.4002.
E. M. Gerson & S. L. Star (1986) Analyzing due process in the workplace. ACM Transactions
on Office Information Systems 4 (3), 257–270.
M. Giannakis (2019) A cloud-based supply chain management system: Effects on supply chain
responsiveness. Journal of Enterprise Information Management 32 (4), 585–607, doi:
10.1108/JEIM-05-2018-0106.
M. Gill (2016) Crime at Work: Studies in Security and Crime Prevention (Vol. 1). Palgrave
Macmillan. doi: 10.1007/978-1-349-23551-3.
K. Goyal & S. Kumar (2021) Financial literacy: A systematic review and bibliometric analysis,
International Journal of Consumer Studies 45 (1), 80–105.
K. Goyal, S. Kumar & J. J. Xiao (2021) Antecedents and consequences of personal financial
management behavior: A systematic literature review and future research agenda,
International Journal of Bank Marketing 39 (7), 1166–1207.
D. M. Hafeez, S. Jalal & F. Khosa (2019) Bibliometric analysis of manuscript characteristics
that influence citations: A comparison of six major psychiatry journals, Journal of
Psychiatric Research 108, 90–94, doi: 10.1016/j.jpsychires.2018.07.010.
M. M. Hasan, J. Popp & J. Olah (2020) Current landscape and influence of big data on finance,
Journal of Big Data 7 (1), 21, doi: 10.1186/s40537-020-00291-z.
I. A. T. Hashem, I. Yaqoob, N. B. Anuar, S. Mokhtar, A. Gani & S. U. Khan (2015) The rise of
"big data" on cloud computing: Review and open research issues, Information Systems
47, 98–115.
A. S. Homrich, G. Galvão, L. G. Abadia & M. M. Carvalho (2018) The circular economy
umbrella: Trends and gaps on integrating pathways, Journal of Cleaner Production
175, 525–543, doi: 10.1016/j.jclepro.2017.11.064.
Y. Huang & S. Meng (2019) Automobile insurance classification ratemaking based on telematics
driving data, Decision Support Systems 127, 113156,
doi: 10.1016/j.dss.2019.113156.
S. C. Hui & A. C. M. Fong (2004) Document retrieval from a citation database using conceptual
clustering and co-word analysis, Online Information Review 28 (1), 22–32.
A. Hussain & E. Cambria (2018) Semi-supervised learning for big social data analysis,
Neurocomputing 275, 1662–1673, doi: 10.1016/j.neucom.2017.10.010.
H. Ince & B. Aktan (2009) A comparison of data mining techniques for credit scoring in
banking: A managerial perspective, Journal of Business Economics and Management
10 (3), 233–240, doi: 10.3846/1611-1699.2009.10.233-240.
H. V. Jagadish (2015) Big data and science: Myths and reality. Big Data Research 2 (2),
49–52.
P. Jacso (2011) The h-index, h-core citation rate and the bibliometric profile of the
Scopus database, Online Information Review 35 (3), 492–501.
R. Jain, J. A. Alzubi, N. Jain & P. Joshi (2019) Assessing risk in life insurance using ensemble
learning, Journal of Intelligent and Fuzzy Systems 37 (2), 2969–2980,
doi: 10.3233/JIFS-190078.
J. M. Johnson & T. M. Khoshgoftaar (2019) Medicare fraud detection using neural networks,
Journal of Big Data 6 (1), 63, doi: 10.1186/s40537-019-0225-0.
M. Johnson, A. Albizri & A. Harfouche (2021) Responsible artificial intelligence in healthcare:
Predicting and preventing insurance claim denials for economic and social wellbeing.
Information Systems Frontiers, doi: 10.1007/s10796-021-10137-5.
A. Kalantari, A. Kamsin, H. S. Kamaruddin, N. Ale Ebrahim, A. Gani, A. Ebrahimi &
S. Shamshirband (2017) A bibliometric approach to tracking big data research trends,
Journal of Big Data 4 (1), 30, doi: 10.1186/s40537-017-0088-1.
G. Kampis, L. Gulyás, Z. Szászi, Z. Szakolczi & S. Soós (2009) Dynamic Social Networks and
the Textrend/CIShell Framework. In Applied Social Network Analysis Conference,
27–28.
M. Karakus, A. Ersozlu & A. C. Clark (2019) Augmented reality research in education:
A bibliometric study, Eurasia Journal of Mathematics, Science and Technology
Education 15 (10), Article em1755, doi: 10.29333/ejmste/103904.
S. Khanra, A. Dhir, N. Islam & M. Mäntymäki (2020) Big data analytics in healthcare:
A systematic literature review, Enterprise Information Systems, 14 (7), 878–912, doi:
10.1080/17517575.2020.1812005.
R. Klavans & K. W. Boyack (2017) Research portfolio analysis and topic prominence, Journal
of Informetrics 11 (4), 1158–1174.
P. Korom (2019) A bibliometric visualization of the economics and sociology of wealth in-
equality: A world apart? Scientometrics 118 (3), 849–868, doi: 10.1007/s11192-018-
03000-z.
I. Kose, M. Gokturk & K. Kilic (2015) An interactive machine-learning-based electronic fraud
and abuse detection system in healthcare insurance, Applied Soft Computing Journal
36, 283–299, doi: 10.1016/j.asoc.2015.07.018.
S. Kumar, R. Sureka & S. Colombage (2020) Capital structure of SMEs: A systematic liter-
ature review and bibliometric analysis, Management Review Quarterly 70 (4), 535–565,
doi: 10.1007/s11301-019-00175-4.
A. Labrinidis (2015) The big data-same humans problem. In CIDR.
C. I. S. G. Lee, W. Felps & Y. Baruch (2014) Toward a taxonomy of career studies through
bibliometric visualization. Journal of Vocational Behavior 85 (3), 339–351, doi:
10.1016/j.jvb.2014.08.008.
M. Levine-Clark & E. L. Gil (2008) A comparative citation analysis of Web of Science, Scopus,
and Google Scholar. Journal of Business & Finance Librarianship 14 (1), 32–46.
X. Li, P. Wu, G. Q. Shen, X. Wang & Y. Teng (2017) Mapping the knowledge domains of
Building Information Modeling (BIM): A bibliometric approach. Automation in Con-
struction, 84, 195–206, doi: 10.1016/j.autcon.2017.09.011.
J. Li, B. Zou, Y. H. Yeo, Y. Feng, X. Xie, D. H. Lee, . . . & M. H. Nguyen (2019) Prevalence,
incidence, and outcome of non-alcoholic fatty liver disease in Asia, 1999–2019: A sys-
tematic review and meta-analysis, The Lancet Gastroenterology & Hepatology 4 (5),
389–398.
T. Liang & Y. Liu (2018) Research landscape of business intelligence and big data analytics: A
bibliometrics study, Expert Systems with Applications 111 (128), 2–10, doi: 10.1016/j.
eswa.2018.05.018.
Z. Liao, Q. Yin, Y. Huang & L. Sheng (2015). Management and application of mobile big data,
International Journal of Embedded Systems 7 (1), 63–70.
X. Liu, J. Bollen, M. L. Nelson, & H. Van de Sompel (2005). Co-authorship networks in the
digital library research community, Information Processing & Management 41 (6),
1462–1480, doi: 10.1016/j.ipm.2005.03.012.
A. Mahajan & D. Mehta (1984). Strong form efficiency of the foreign exchange market and
bank positions, Journal of Financial Research 7 (3), 197–207, doi: 10.1111/j.1475-
6803.1984.tb00370.x.
J. A. Major & D. R. Riedinger (2002). EFD: A hybrid knowledge/statistical-based system for
the detection of fraud, Journal of Risk and Insurance 69 (3), 309–324.
S. Mall, P. Ghosh & P. Shah (2018). Management of fraud: Case of an Indian insurance
company, Accounting and Finance Research 7 (3), 18, doi: 10.5430/afr.v7n3p18.
J. Manyika, S. Francisco, J. Remes, J. Mischke & M. Krishnan (2017). The Productivity Puzzle:
A Closer Look at the United States, Discussion Paper, www.mckinsey.com/mgi.
M. Marabelli, S. Hansen, S. Newell & C. Frigerio (2017). The light and dark side of the black
box: Sensor-based technology in the automotive industry, Communications of the
Association for Information Systems 40 (1), 351–374, doi: 10.17705/1cais.04016.
M. K. McBurney & P. L. Novak (2002). What is bibliometrics and why should you care?,
In: Proc. IEEE Int. Professional Commun. Conf., 108–114, IEEE.
G. Meyers & I. V. Hoyweghen (2020). 'Happy failures': Experimentation with behaviour-based
personalisation in car insurance, Big Data and Society 7 (1), doi: 10.1177/
2053951720914650.
D. Mishra, A. Gunasekaran, T. Papadopoulos & S. J. Childe (2016). Big data and supply chain
management: A review and bibliometric analysis, Annals of Operations Research
270 (1), 313–336, doi: 10.1007/s10479-016-2236-y.
D. Moher, A. Liberati, J. Tetzlaff, D. G. Altman & Prisma Group (2009). Reprint - preferred
reporting items for systematic reviews and meta-analyses: The PRISMA statement,
Physical Therapy 89 (9), 873–880.
J. A. Moral-Muñoz, E. Herrera-Viedma, A. Santisteban-Espejo & M. J. Cobo (2020) Software
tools for conducting bibliometric analysis in science: An up-to-date review, El
profesional de la información 29 (1), 4.
B. Mougenot & J. P. Doussoulin (2022) Conceptual evolution of the bioeconomy: A biblio-
metric analysis, Environment, Development and Sustainability 24 (1), 1031–1047.
T. B. Murdoch & A. S. Detsky (2013) The inevitable application of big data to health care,
JAMA 309 (13), 1351–1352.
B. Nayak, S. S. Bhattacharyya & B. Krishnamoorthy (2019) Democratizing health insurance
services; accelerating social inclusion through technology policy of health insurance
firms, Business Strategy and Development 2 (3), 242–252, doi: 10.1002/bsd2.59.
A. Niñerola, M. V. Sanchez-Rebull & A. B. Hernandez-Lara (2019) Tourism research on
sustainability: A bibliometric analysis, Sustainability 11 (5), 1377.
H. Nobanee, M. N. Dilshad, M. Al Dhanhani, M. Al Neyadi, S. Al Qubaisi & S. Al Shamsi
(2021) Big data applications the banking sector: A bibliometric analysis approach,
SAGE Open 11 (4), doi: 10.1177/21582440211067234.
B. O. Ogbuokiri, C. N. Udanor & M. N. Agu (2015) Implementing bigdata analytics for small
and medium enterprise (SME) regional growth, IOSR Journal of Computer Engineering
17 (6), 35–43.
H. Özköse, E. S. Arı & C. Gencer (2015) Yesterday, today and tomorrow of big data,
Procedia - Social and Behavioral Sciences 195, 1042–1050, doi:
10.1016/j.sbspro.2015.06.147.
A. Paltrinieri, M. K. Hassan, S. Bahoo & A. Khan (2019) A bibliometric review of sukuk
literature, International Review of Economics & Finance, doi:
10.1016/j.iref.2019.04.004.
R. Patel, M. Migliavacca & M. Oriani (2022) Blockchain in banking and finance: Is the best yet
to come? A bibliometric review, Research in International Business and Finance
101718.
P. V. Paul, K. Monica & M. Trishanka (2017) A Survey on Big Data Analytics Using Social
Media Data. 2017 Innovations in Power and Advanced Computing Technologies
(i-PACT), 1–4.
J. Pesantez-Narvaez, M. Guillen & M. Alcañiz (2019) Predicting motor insurance claims using
telematics data: XGBoost versus logistic regression, Risks 7 (2), 70, doi:
10.3390/risks7020070.
R. Pranckutė (2021) Web of Science (WoS) and Scopus: The titans of bibliographic information
in today's academic world, Publications 9 (1), 12.
P. Racherla & C. Hu (2010) A social network perspective of tourism research collaborations,
Annals of Tourism Research 37 (4), 1012–1034, doi: 10.1016/j.annals.2010.03.008.
S. Rawat, A. Rawat, D. Kumar & A. S. Sabitha (2021) Application of machine learning and
data visualization techniques for decision support in the insurance sector, International
Journal of Information Management Data Insights 1 (2), 100012, doi:
10.1016/j.jjimei.2021.100012.
A. Rey-Martí, D. Ribeiro-Soriano & D. Palacios-Marqués (2016) A bibliometric analysis of
social entrepreneurship, Journal of Business Research 69 (5), 1651–1655, doi: 10.1016/j.
jbusres.2015.10.033.
M. Riikkinen, H. Saarijärvi, P. Sarlin & I. Lähteenmäki (2018) Using artificial intelligence to
create value in insurance, International Journal of Bank Marketing, 36 (6), 1145–1168,
doi: 10.1108/IJBM-01-2017-0015.
M. Rodrigues & L. Mendes (2018) Mapping of the literature on social responsibility in the
mining industry: A systematic literature review, Journal of Cleaner Production 181,
88–101, doi: 10.1016/j.jclepro.2018.01.163.
G. A. Ronda-Pupo (2017) Relationship between citations and co-authorship patterns in
management journals, Scientometrics, 110, 1191–1207.
A. K. Sahoo, C. Pradhan, R. K. Barik & H. Dubey (2019) DeepReco: Deep learning based
health recommender system using collaborative filtering, Computation 7 (2), 25, doi:
10.3390/computation7020025.
B. K. Saikia, S. M. Benoy, M. Bora, J. Tamuly, M. Pandey & D. Bhattacharya (2020) A brief
review on supercapacitor energy storage devices and utilization of natural carbon
resources as their electrode materials, Fuel 282, 118796.
C. B. Santoso, H. L. H. S. Warnars, A. N. Fajar & H. Prabowo (2022) Insurance underwriting
and technology relationship: A bibliometric analysis, Journal of Theoretical and Applied
Information Technology 100 (13).
S. Sharma, U. S. Tim, J. Wong, S. Gadia & S. Sharma (2014) A brief review on leading big
data models, Data Science Journal 13, 138–157.
S. Shome, M. K. Hassan, S. Verma & T. R. Panigrahi (2023) Impact investment for sus-
tainable development: A bibliometric analysis, International Review of Economics &
Finance 84, 770–800.
C. W. Song, H. Jung & K. Chung (2019) Development of a medical big-data mining process using
topic modeling, Cluster Computing 22, 1949–1958, doi: 10.1007/s10586-017-0942-0.
Y. Song, X. Chen, T. Hao, Z. Liu & Z. Lan (2019) Exploring two decades of research on
classroom dialogue by using bibliometric analysis, Computers & Education 137, 12–31,
doi: 10.1016/j.compedu.2019.04.002.
U. Srinivasan & B. Arunasalam (2013) Leveraging big data analytics to reduce healthcare
costs. IT Professional 15 (6), 21–28, doi: 10.1109/MITP.2013.55.
S. Stahlschmidt & D. Stephen (2022) From indexation policies through citation networks to
normalized citation impacts: Web of Science, Scopus, and Dimensions as varying
resonance chambers, Scientometrics 127 (5), 2413–2431.
W. M. Sweileh, N. Y. Shraim, S. W. Al Jabi, A. F. Sawalha, A. S. Abutaha & S. H. Zyoud
(2016) Bibliometric analysis of global scientific research on carbapenem resistance
(1986–2015), Annals of Clinical Microbiology and Antimicrobials 15 (1), 56, doi:
10.1186/s12941-016-0169-6.
C. Tozzi, N. Walani & M. Arroyo (2019) Out-of-equilibrium mechanochemistry and
self-organization of fluid membranes interacting with curved proteins, New Journal of
Physics 21 (9), 093004, doi: 10.1088/1367-2630/ab3ad6.
M. Tripathi, S. Kumar, S. K. Sonker & P. Babbar (2018) Occurrence of author keywords and
keywords plus in social sciences and humanities research: A preliminary study,
COLLNET Journal of Scientometrics and Information Management 12 (2), 215–232.
J. C. Valderrama-Zurian, C. Navarro-Molina, R. Aguilar-Moya, D. Melero-Fuentes &
R. Aleixandre-Benavent (2017) Trends in scientific research in Online Information
Review. Part 2. Mapping the scientific knowledge through bibliometric and social
network analyses, preprint, arXiv:1709.07817.
N. J. Van Eck & L. Waltman (2007) Bibliometric mapping of the computational intelligence
field, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
15 (5), 625–645.
N. J. Van Eck & L. Waltman (2014) Visualizing bibliometric networks. In Measuring Scholarly
Impact, 285–320, Cham: Springer.
N. J. Van Eck & L. Waltman (2017) Citation-based clustering of publications using CitNe-
tExplorer and VOSviewer, Scientometrics 111 (2), 1053–1070, doi: 10.1007/s11192-
017-2300-7.
P. D. Vecchio (2017) Creating value from social big data: Implications for smart tourism
destinations, Information Processing and Management, January, 0–1, doi: 10.1016/j.
ipm.2017.10.006.
S. Verma & S. S. Bhattacharyya (2017a) Perceived strategic value based adoption of big data
analytics in emerging economy: A qualitative approach for Indian firms. Journal of
Enterprise Information Management.
S. Verma & S. S. Bhattacharyya (2017b) Perceived strategic value-based adoption of Big Data
Analytics in emerging economy: A qualitative approach for Indian firms. Journal of
Enterprise Information Management, 30(3), 354–382, doi: 10.1108/JEIM-10-2015-
0099.
E. S. Vieira & J. A. N. F. Gomes (2009) A comparison of Scopus and Web of Science for a
typical university, Scientometrics, 81(2), 587–600, doi: 10.1007/s11192-009-2178-0.
R. L. Villars & C. W. Olofson (2011) WHITE PAPER Big Data: What It Is And Why You
Should Care, IDC Analyze the Future.
M. V. Wüthrich (2017) Covariate selection from telematics car driving data, European
Actuarial Journal 7 (1), 89–108, doi: 10.1007/s13385-017-0149-z.
F. S. Wamba, S. Akter, A. Edwards, G. Chopin & D. Gnanzou (2015) How "big data" can
make big impact: Findings from a systematic review and a longitudinal case study,
International Journal of Production Economics, 165, 234–246, doi: 10.1016/j.
ijpe.2014.12.031.
Y. Wang & W. Xu (2018) Leveraging deep learning with LDA-based text analytics to detect
automobile insurance fraud, Decision Support Systems 105, 87–95, doi: 10.1016/j.
dss.2017.11.001.
Z. Y. Wang, G. Li, C. Y. Li & A. Li (2012) Research on the semantic-based co-word analysis,
Scientometrics 90 (3), 855–875.
R. B. Wiegard & M. H. Breitner (2019). Smart services in healthcare: A risk-benefit-analysis of
pay-as-you-live services from customer perspective in Germany, Electronic Markets
29 (1), 107–123, doi: 10.1007/s12525-017-0274-1.
M. L. Williams & P. Burnap (2017) Towards an ethical framework for publishing Twitter data
in social research: Taking into account users' views, online context and algorithmic
estimation, Sociology 51, 1149–1168, doi: 10.1177/0038038517708140.
B. Williamson (2015) Smarter learning software: Education and the big data imaginary. Big
Data - Social Data, University of Warwick.
A. C. Worthington & H. Higgs (2006) Weak-form market efficiency in Asian emerging and
developed equity markets: Comparative tests of random walk behaviour, Accounting
Research Journal 19 (1), 54–63, doi: 10.1007/s13398-014-0173-7.2.
H. Xian & K. Madhavan (2014) Anatomy of scholarly collaboration in engineering education:
A big-data bibliometric analysis, Journal of Engineering Education 103 (3), 486–514,
doi: 10.1002/jee.20052.
Z. Xu & D. Yu (2019) A Bibliometrics analysis on big data research (2009–2018). Journal of
Data, Information and Management 1 (1–2), 3–15, doi: 10.1007/s42488-019-00001-2.
X. Xu, X. Chen, F. Jia, S. Brown, Y. Gong & Y. Xu (2018) Supply chain finance: A systematic
literature review and bibliometric analysis, International Journal of Production Eco-
nomics 204, 160–173, doi: 10.1016/j.ijpe.2018.08.003.
S. Zamore, K. Ohene Djan, I. Alon & B. Hobdari (2018) Credit risk research: Review and
agenda, Emerging Markets Finance and Trade 54 (4), 811–835.
M. H. Zarifi, H. Sadabadi, S. H. Hejazi, M. Daneshmand & A. Sanati-Nezhad (2018)
Non-contact and nonintrusive microwave-microfluidic flow sensor for energy and biomedical
engineering, Scientific Reports 8 (1), 139, doi: 10.1038/s41598-017-18621-2.
Y. Zhan, K. H. Tan, Y. Li & Y. K. Tse (2018) Unlocking the power of big data in new product
development, Annals of Operations Research, 270 (1–2), 577–595, doi: 10.1007/s10479-
016-2379-x.
W. Zhang & S. Banerji (2017). Challenges of servitization: A systematic literature review. In:
Industrial Marketing Management, Vol. 65, pp. 217–227, Elsevier Inc. doi: 10.1016/j.
indmarman.2017.06.003.
P. Zhang, F. Yan & C. Du (2015) A comprehensive analysis of energy management strategies
for hybrid electric vehicles based on bibliometrics, Renewable and Sustainable Energy
Reviews 48, 88–104, doi: 10.1016/j.rser.2015.03.093.
X. F. Zhang, Z. S. Xu & P. J. Ren (2019) A novel hybrid correlation measure for probabilistic
linguistic term sets and crisp numbers and its application in customer relationship
management, International Journal of Information Technology and Decision Making
18 (2), 673–694, doi: 10.1142/S021962201950007X.
D. Zhang (2018) Big data security and privacy protection, 2018 International Conference on
Virtual Reality and Intelligent Systems, 14–15 September, Jishou, China, pp. 275–278.
I. Zupic & T. Čater (2015) Bibliometric methods in management and organization, Organi-
zational Research Methods 18 (3), 429–472.
Copyright of Journal of Financial Management, Markets & Institutions is the property of
World Scientific Publishing Company and its content may not be copied or emailed to
multiple sites or posted to a listserv without the copyright holder's express written permission.
However, users may print, download, or email articles for individual use.
