
INTRODUCTION

Healthcare is a booming sector of the economy in many countries [1]. With its growth come
challenges including rising costs, inefficiencies, poor quality, and increasing complexity [2].
U.S. healthcare expenditures increased by about 23% between 2010 and 2015, from $2.6
trillion to $3.2 trillion [3]. Inefficient, non-value-added tasks (e.g., readmissions,
inappropriate use of antibiotics, and fraud) constitute 21–47% of this enormous expenditure [4].
Healthcare institutions are adopting information technology in their management systems [5].
Better decision-making based on available information could mitigate these challenges
and facilitate the transition to a value-based healthcare industry [6]. A large volume of data
is collected through these systems on a regular basis. Analytics provides tools and techniques
to extract information from this complex and voluminous data and translate it into knowledge
that assists decision-making in healthcare [7]. It can generate fact-based decisions for
“planning, management, measurement, and learning” purposes [8]. The use of analytics,
including data mining, text mining, and big data analytics, is assisting healthcare
professionals in disease prediction, diagnosis, and treatment, resulting in improved service
quality and reduced cost [9]. According to some estimates, the application of data mining can
save the U.S. healthcare system $450 billion each year.
Rural Health Care System – The structure and current scenario.
The health care infrastructure in rural areas has been developed as a three tier system and is
based on the following population norms:

Centre                      Population Norms
                            Plain Area    Hilly/Tribal/Difficult Area
Sub-Centre                  5,000         3,000
Primary Health Centre       30,000        20,000
Community Health Centre     1,20,000      80,000
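As a rough illustration of how these norms translate into facility counts, the sketch below estimates the minimum number of centres of each tier for a given rural population. The function name `centres_required` and the choice to round up to whole centres are assumptions made for this example; the norms themselves come from the table above.

```python
import math

# Population served per centre, taken from the population norms table.
POPULATION_NORMS = {
    "Sub-Centre": {"plain": 5_000, "hilly": 3_000},
    "Primary Health Centre": {"plain": 30_000, "hilly": 20_000},
    "Community Health Centre": {"plain": 120_000, "hilly": 80_000},
}


def centres_required(population: int, area: str = "plain") -> dict:
    """Estimate the minimum number of centres per tier for a population.

    `area` is "plain" or "hilly" (hilly/tribal/difficult areas).
    Rounding up to a whole centre is an assumption of this sketch.
    """
    if area not in ("plain", "hilly"):
        raise ValueError("area must be 'plain' or 'hilly'")
    return {
        tier: math.ceil(population / norm[area])
        for tier, norm in POPULATION_NORMS.items()
    }


if __name__ == "__main__":
    # Hypothetical block of 240,000 people in a plain area.
    print(centres_required(240_000, "plain"))
```

For a hypothetical population of 240,000, this gives 48 Sub-Centres, 8 PHCs and 2 CHCs in a plain area, and 80, 12 and 3 respectively in a hilly/tribal/difficult area.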
Sub-Centres (SCs)
The Sub-Centre is the most peripheral and first contact point between the primary health care
system and the community. Each Sub-Centre is required to be manned by at least one
Auxiliary Nurse Midwife (ANM) / Female Health Worker and one Male Health Worker (for
details of the staffing pattern, see Box 1; for the recommended staffing structure, see the
Indian Public Health Standards (IPHS)).
Under NRHM, there is a provision for one additional second ANM on contract basis. One
Lady Health Visitor (LHV) is entrusted with the task of supervision of six Sub-Centres. Sub-
Centres are assigned tasks relating to interpersonal communication in order to bring about
behavioural change and provide services in relation to maternal and child health, family
welfare, nutrition, immunization, diarrhoea control and control of communicable diseases
programmes. The Sub-Centres are provided with basic drugs for minor ailments needed for
taking care of essential health needs of men, women and children. The Ministry of Health &
Family Welfare is providing 100% Central assistance to all the Sub-Centres in the country
since April 2002 in the form of salary of ANMs and LHVs, rent at the rate of Rs 3000/- per
annum and contingency at the rate of Rs 3200/- per annum, in addition to drugs and
equipment kits. The salary of the Male Health Worker is borne by the State Governments
(Also see para 2.4 for NRHM additionalities). Under the Swap Scheme, the Government of
India has taken over an additional 39,554 Sub Centres from State Governments / Union
Territories since April, 2002 in lieu of 5,434 Rural Family Welfare Centres transferred to the
State Governments / Union Territories. There are 1,48,124 Sub Centres functioning in the
country as on March 2011, an increase from 1,46,026 in 2005.
There is a significant increase in the number of Sub Centres in the States of Chhattisgarh,
Haryana, Jammu & Kashmir, Karnataka, Maharashtra, Orissa, Punjab, Rajasthan, Tamil
Nadu, Tripura and Uttarakhand. As on March, 2011, there are 1,48,124 Sub Centres, 23,887
Primary Health Centres (PHCs) and 4,809 Community Health Centres (CHCs) functioning in
the country.
Primary Health Centres (PHCs)
The PHC is the first contact point between the village community and the Medical Officer. The PHCs
were envisaged to provide an integrated curative and preventive health care to the rural
population with emphasis on preventive and promotive aspects of health care. The PHCs are
established and maintained by the State Governments under the Minimum Needs Programme
(MNP)/ Basic Minimum Services (BMS) Programme. As per minimum requirement, a PHC
is to be manned by a Medical Officer supported by 14 paramedical and other staff. Under
NRHM, there is a provision for two additional Staff Nurses at PHCs on contract basis. It acts
as a referral unit for 6 Sub Centres and has 4 - 6 beds for patients. The activities of PHC
involve curative, preventive, promotive and Family Welfare Services. There were 23,887
PHCs functioning in the country as on March 2011.
Community Health Centres (CHCs)
CHCs are being established and maintained by the State Government under MNP/BMS
programme. As per minimum norms (Box-1), a CHC is required to be manned by four
Medical Specialists i.e. Surgeon, Physician, Gynecologist and Pediatrician supported by 21
paramedical and other staff (See Annexure-D for IPHS norms). It has 30 in-door beds with
one OT, X-ray, Labour Room and Laboratory facilities. It serves as a referral centre for 4
PHCs and also provides facilities for obstetric care and specialist consultations (Also see para
2.4 for NRHM additionalities). As on March, 2011, there are 4,809 CHCs functioning in the
country. 1.5. The details of the population norms for each level of rural health infrastructure
and current status against these norms.
First Referral Units (FRUs)
An existing facility (district hospital, sub-divisional hospital, community health centre etc.)
can be declared a fully operational First Referral Unit (FRU) only if it is equipped to provide
round-the-clock services for Emergency Obstetric and New Born Care, in addition to all
emergencies that any hospital is required to provide. It should be noted that there are three
critical determinants of a facility being declared as an FRU: i) Emergency Obstetric Care
including surgical interventions like Caesarean Sections; ii) New-born Care; and iii) Blood
Storage Facility on a 24-hour basis.
Strengthening of Rural Health Infrastructure under National Rural Health Mission
The National Rural Health Mission (2005-12) seeks to provide effective healthcare to rural
population throughout the country with special focus on 18 states, which have weak public
health indicators and/or weak infrastructure. These 18 States are Arunachal Pradesh, Assam,
Bihar, Chhattisgarh, Himachal Pradesh, Jharkhand, Jammu & Kashmir, Manipur, Mizoram,
Meghalaya, Madhya Pradesh, Nagaland, Orissa, Rajasthan, Sikkim, Tripura, Uttarakhand and
Uttar Pradesh. The Mission is an articulation of the commitment of the Government to raise
public spending on Health from 0.9% of GDP to 2-3% of GDP. NRHM aims to undertake
architectural correction of the health system to enable it to effectively handle increased
allocations as promised under the National Common Minimum Programme and promote
policies that strengthen public health management and service delivery in the country. It has
as its key components provision of a female health activist in each village; a village health
plan prepared through a local team headed by the Health & Sanitation Committee of the
Panchayat; strengthening of the rural hospital for effective curative care and made
measurable and accountable to the community through Indian Public Health Standards
(IPHS); integration of vertical Health & Family Welfare Programmes, optimal utilization of
funds & infrastructure, and strengthening delivery of primary healthcare. It seeks to revitalize
local health traditions and mainstream AYUSH into the public health system. It further aims
at effective integration of health concerns with determinants of health like sanitation &
hygiene, nutrition, and safe drinking water through a District Plan for Health. It seeks
decentralization of programmes for district management of health and to address the inter-
State and inter-district disparities, especially among the 18 high focus States, including unmet
needs for public health infrastructure. It also seeks to improve access of rural people,
especially poor women and children, to equitable, affordable, accountable and effective
primary healthcare.
Following are the core and supplementary strategies of NRHM:
Core Strategies:
• Train and enhance capacity of Panchayati Raj Institutions (PRIs) to own, control and
manage public health services.
• Promote access to improved healthcare at household level through the female health
activist (ASHA).
• Health Plan for each village through Village Health Committee of the Panchayat.
• Strengthening sub-centre through an untied fund to enable local planning and action and
more Multi-Purpose Workers (MPWs).
• Strengthening existing PHCs and CHCs, and provision of 30-50 bedded CHC per lakh
population for improved curative care to a normative standard (Indian Public Health
Standards defining personnel, equipment and management standards).
• Preparation and Implementation of an inter-sectoral District Health Plan prepared by the
District Health Mission, including drinking water, sanitation & hygiene and nutrition.
• Integrating vertical Health and Family Welfare programmes at National, State, District,
and Block levels.
• Technical Support to National, State and District Health Missions, for Public Health
Management.
• Strengthening capacities for data collection, assessment and review for evidence based
planning, monitoring and supervision.
• Formulation of transparent policies for deployment and career development of Human
Resources for health.
• Developing capacities for preventive health care at all levels for promoting healthy life
styles, reduction in consumption of tobacco and alcohol etc.
• Promoting non-profit sector particularly in under-served areas.
Supplementary Strategies:
• Regulation of Private Sector including the informal rural practitioners to ensure availability
of quality service to citizens at reasonable cost.
• Promotion of Public Private Partnerships for achieving public health goals.
• Mainstreaming AYUSH - revitalizing local health traditions.
• Reorienting medical education to support rural health issues including regulation of
Medical care and Medical Ethics.
NRHM Plan of Action relating to Infrastructure and Manpower Strengthening
Component (A): Accredited Social Health Activists
• Every village/large habitation will have a female Accredited Social Health Activist
(ASHA), chosen by and accountable to the panchayat, to act as the interface between the
community and the public health system. States to choose State specific models.
• ASHA would act as a bridge between the ANM and the village and be accountable to the
Panchayat.
• She will be an honorary volunteer, receiving performance-based compensation for
promoting universal immunization, referral and escort services for RCH, construction of
household toilets, and other healthcare delivery programmes.
• She will be trained on a pedagogy of public health developed and mentored through a
Standing Mentoring Group at National level incorporating best practices and active
involvement of community health resource organizations.
• She will facilitate preparation and implementation of the Village Health Plan along with
Anganwadi Worker, ANM, functionaries of other Departments, and Self Help Group
members, under the leadership of the Village Health Committee of the Panchayat.
• ASHA will be promoted all over the country, with special emphasis on the 18 high focus
States. The Government of India will bear the cost of training, incentives and medical kits.
The remaining components will be funded under the Financial Envelope given to the States
under the programme.
• She will be given a Drug Kit containing generic AYUSH and allopathic formulations for
common ailments. The drug kit would be replenished from time to time.
• Induction training of ASHA to be of 23 days in all, spread over 12 months. On the job
training would continue throughout the year.
• Prototype training material to be developed at National level, subject to State level
modifications.
• Cascade model of training proposed through Training of Trainers including contract plus
distance learning model.
• Training would require partnership with NGOs/ICDS Training Centres and State health
institutes.
Component (B): Strengthening Sub-Centres (SC)
• Each sub-centre will have an Untied Fund for local action at the rate of Rs 10,000 per
annum. This Fund will be deposited in a joint Bank Account of the ANM & Sarpanch and
operated by the ANM, in consultation with the Village Health Committee.
• Supply of essential drugs, both allopathic and AYUSH, to the Sub-centres.
• In case of additional Outlays, Multipurpose Workers (Male)/ Additional ANMs wherever
needed, sanction of new Sub-centres as per 2001 population norm, and upgrading existing
Sub-centres, including buildings for Sub-centres functioning in rented premises, will be
considered.

Component (C): Strengthening Primary Health Centres (PHCs)
The Mission aims at strengthening PHCs for quality preventive, promotive, curative,
supervisory and outreach services, through:
• Adequate and regular supply of essential quality drugs and equipment (including supply
of Auto Disabled Syringes for immunisation) to PHCs.
• Provision of 24 hour service in at least 50% of PHCs by addressing shortage of doctors,
especially in high focus States, through mainstreaming AYUSH manpower.
• Observance of Standard treatment guidelines & protocols.
• In case of additional Outlays, intensification of ongoing communicable disease control
programmes, new programmes for control of non-communicable diseases, upgradation of
100% PHCs for 24 hours referral service, and provision of 2nd doctor at PHC level (1 male, 1
female) would be undertaken on the basis of felt need.
Component (D): Strengthening Community Health Centres (CHCs) for First Referral Care
A key strategy of the Mission is:
• Operationalising existing Community Health Centres (30-50 beds) as 24 hour First
Referral Units, including posting of anaesthetists.
• Codification of new Indian Public Health Standards setting norms for infrastructure, staff,
equipment, management etc. for CHCs.
• Promotion of Stakeholder Committees (Rogi Kalyan Samitis) for hospital management.
• Developing standards of services and costs in hospital care.
• Develop, display and ensure compliance to Citizen's Charter at CHC/PHC level.
• In case of additional Outlays, creation of new Community Health Centres (30-50 beds) to
meet the population norm as per Census 2001, and bearing their recurring costs for the
Mission period could be considered.
3. Rural Health Infrastructure - a statistical overview
The Centres Functioning
3.1. The Primary Health Care Infrastructure has been developed as a three tier system with
Sub Centre, Primary Health Centre (PHC) and Community Health Centre (CHC) being the
three pillars of the Primary Health Care System. Progress of Sub Centres, which are the most
peripheral contact point between the Primary Health Care System and the community, is a
prerequisite for the overall progress of the entire system. A look at the number of Sub Centres
functioning over the years reveals that at the end of the Sixth Plan (1981-85) there were
84,376 Sub Centres, which increased to 1,30,165 at the end of the Seventh Plan (1985-90) and
to 1,45,272 at the end of the Tenth Plan (2002-2007). As on March, 2011, 1,48,124 Sub
Centres are functioning in the country.
Similar progress can be seen in the number of PHCs which was 9115 at the end of Sixth Plan
(1981-85) and almost doubled to 18671 at the end of Seventh Plan (1985-90). Number of
PHCs rose to 22370 at the end of Tenth Plan (2002-2007). As on March, 2011, there are
23887 PHCs functioning in the country. A number of PHCs have been upgraded to the level
of CHCs in many States. In accordance with the progress in the number of Sub Centres and
PHCs, the number of CHCs has also increased from 761 at the end of Sixth Plan (1981-85) to
1910 at the end of Seventh Plan (1985-90) and 4045 at the end of Tenth Plan (2002-2007). As
on March, 2011, 4809 CHCs are functioning in the country.
3.2. Statement 1 presents the number of Sub Centres, PHCs and CHCs existing in 2011 as
compared to those reported existing in 2005. As may be seen from Statement 1, at the
national level there is an increase of 2098 Sub Centres, 651 PHCs and 1463 CHCs in 2011 as
compared to those existing in 2005. This implies an increase of about 43% in the number of
CHCs, about 2.8% in the number of PHCs and about 1.4% in the number of Sub Centres in
2011 as compared to 2005.
There is a significant increase in the number of Sub Centres in the States of Chhattisgarh,
Haryana, Jammu & Kashmir, Maharashtra, Orissa, Punjab, Rajasthan, Tamil Nadu, Tripura
and Uttarakhand. Significant increase is also observed in the number of PHCs in the States of
Andhra Pradesh, Assam, Bihar, Chhattisgarh, Haryana, Jammu & Kashmir, Karnataka,
Maharashtra, Nagaland, Uttarakhand, Uttar Pradesh. In case of CHCs, significant increase is
observed in the States of Arunachal Pradesh, Chhattisgarh, Gujarat, Haryana, Himachal
Pradesh, Jammu & Kashmir, Jharkhand, Kerala, Madhya Pradesh, Odisha, Punjab, Rajasthan,
Tamil Nadu, Uttarakhand, Uttar Pradesh and West Bengal. The average population covered
by a Sub Centre, PHC and CHC was 5624, 34876 and 173235, respectively.
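The percentage increases quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it derives the 2005 counts by subtracting the reported increases from the March 2011 counts, and the helper name `pct_increase` is an assumption of this example.

```python
def pct_increase(old: int, new: int) -> float:
    """Percentage increase from `old` to `new`."""
    return 100.0 * (new - old) / old


# Counts as on March 2011 and the increases over 2005 reported in the text.
counts_2011 = {"Sub Centres": 148124, "PHCs": 23887, "CHCs": 4809}
increase = {"Sub Centres": 2098, "PHCs": 651, "CHCs": 1463}

# The 2005 baseline is implied by the two figures above.
counts_2005 = {name: counts_2011[name] - increase[name] for name in counts_2011}

if __name__ == "__main__":
    for name in counts_2011:
        pct = pct_increase(counts_2005[name], counts_2011[name])
        print(f"{name}: {counts_2005[name]} -> {counts_2011[name]} (+{pct:.1f}%)")
```

With these inputs the increases come out at roughly 1.4% for Sub Centres, 2.8% for PHCs and 43.7% for CHCs, in line with the approximate figures given in the text.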
Building Status
3.3. As on March, 2011, 62.7% of Sub Centres, 86.7% of PHCs and 95.3% of CHCs are
located in Government buildings. The rest are located either in rented buildings or in rent
free Panchayat/Voluntary Society buildings.
3.4. Statement 2, Statement 3 and Statement 4 give the comparative picture of the status of
buildings for Sub Centres, PHCs and CHCs, respectively, in 2011 as compared to that in
2005. As may be seen, the percentage of Sub Centres functioning in Government buildings
has increased from 50% in 2005 to 62.7% in 2011, mainly due to a substantial increase in
government buildings in the States of Assam, Chhattisgarh, Goa, Haryana, Karnataka,
Madhya Pradesh, Maharashtra, Manipur, Meghalaya, Mizoram, Orissa, Punjab, Rajasthan,
Sikkim, Tripura, Uttarakhand, Uttar Pradesh and West Bengal.
3.5. Similarly, the percentage of PHCs functioning in Government buildings has also
increased significantly, from 78% in 2005 to 86.7% in 2011. This is mainly due to an increase
in Government buildings in the States of Assam, Chhattisgarh, Gujarat, Haryana, Himachal
Pradesh, Karnataka, Madhya Pradesh, Maharashtra, Nagaland and Uttar Pradesh.
3.6. The number of CHCs functioning in Government buildings has increased appreciably in
2011 as compared to 2005. The percentage of CHCs in Govt. buildings has increased from
90% in 2005 to 95.3% in 2011.
3.7. Comparative State-wise status of buildings for Sub Centres, PHCs and CHCs in 2010
and 2011.

Manpower
3.8. The availability of manpower is one of the important prerequisites for the efficient
functioning of the Rural Health services. As on March, 2011 the overall shortfall (which
excludes the existing surplus in some of the states) in the posts of HW(F) / ANM was 3.8% of
the total requirement as per the norm of one HW(F) / ANM per Sub Centre and PHC. The
overall shortfall is mainly due to shortfall in the States of Chhattisgarh, Gujarat, Himachal
Pradesh, Kerala, Tamil Nadu, Tripura and Uttar Pradesh. Similarly, in case of HW (M), there
was a shortfall of 64.7% of the requirement. In case of Health Assistant (Female)/LHV, the
shortfall was 38% and that of Health Assistant (Male) was 43.3%. For allopathic Doctors at
PHCs, there was a shortfall of 12.0% of the total requirement for existing infrastructure as
compared to manpower in position. This is again mainly due to significant shortfall in Doctors
at PHCs in the States of Chhattisgarh, Gujarat, Karnataka, Madhya Pradesh, Nagaland, Orissa,
Rajasthan and Uttar Pradesh. Even out of the sanctioned posts, a significant percentage of
posts are vacant at all the levels. For instance, 5% of the sanctioned posts of HW (Female)/
ANM were vacant as compared to 42.2% of the sanctioned posts of Male Health Worker.
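The phrase "overall shortfall (which excludes the existing surplus in some of the states)" can be read as follows: each State's shortfall is counted separately, and a surplus in one State does not offset a deficit in another. The sketch below illustrates that reading. The State names and figures are hypothetical, and the formula is an interpretation of the wording rather than the published method.

```python
def overall_shortfall_pct(states: dict) -> float:
    """Overall shortfall as a percentage of the total requirement.

    `states` maps a State name to (required, in_position).
    A State with a surplus contributes zero, not a negative shortfall.
    """
    total_required = sum(required for required, _ in states.values())
    total_shortfall = sum(
        max(0, required - in_position) for required, in_position in states.values()
    )
    return 100.0 * total_shortfall / total_required


def vacancy_pct(sanctioned: int, in_position: int) -> float:
    """Vacant posts as a percentage of sanctioned posts."""
    return 100.0 * (sanctioned - in_position) / sanctioned


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    example = {
        "State A": (1000, 900),  # deficit of 100
        "State B": (500, 550),   # surplus of 50, ignored
        "State C": (500, 480),   # deficit of 20
    }
    print(f"Overall shortfall: {overall_shortfall_pct(example):.1f}%")
    print(f"Vacancy: {vacancy_pct(200, 116):.1f}%")
```

In this hypothetical example the overall shortfall is 6.0% (120 of 2,000 required posts), whereas simply netting the surplus against the deficits would give 3.5%.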
At the Sub Centre level the extent of existing manpower can be assessed from the fact that
3.2% of the Sub Centres were without a Female Health Worker / ANM, 49.1% of Sub
Centres were without a Male Health Worker and 2% of Sub Centres were without both Female
Health Worker / ANM as well as Male Health Worker.
3.10. PHC is the first contact point between village community and the Medical Officer.
Manpower in a PHC includes a Medical Officer supported by paramedical and other staff.
As on March, 2011, 4.6% of the PHCs were without a doctor, 36.9% were without a Lab
technician and 24.6% were without a Pharmacist.
The Community Health Centres provide specialized medical care in the form of facilities of
Surgeons, Obstetricians & Gynaecologists, Physicians and Paediatricians. The current
position of specialist manpower at CHCs reveals that as on March, 2011, out of the
sanctioned posts, 55.9% of Surgeons, 46.7% of Obstetricians & Gynaecologists, 58.9% of
Physicians and 40% of Paediatricians were vacant. Overall 39.5% of the sanctioned posts of
specialists at CHCs were vacant. Moreover, as compared to requirement for existing
infrastructure, there was a shortfall of 75% of Surgeons, 65.9% of Obstetricians &
Gynaecologists, 80.1% of Physicians and 74.4% of Paediatricians. Overall, there was a
shortfall of 63.9% specialists at the CHCs as compared to the requirement for existing CHCs.
The shortfall in Specialists is significantly high in most of the States.

Review of literature
The selected literature and the selection process for the review are described in this section.
Initially, a two-phase advanced keyword search was conducted on the Web of Science database
and a one-phase (Phase 2) search in PubMed and Google Scholar, with the time filter 1 January
2005 to 31 December 2016 in “All Fields”. Restriction to journal articles written in English
was added as an additional filter. Keywords listed in Table 2 were used in the different
phases. The complete search was conducted using the following procedure:

• Inclusion criteria: The phase 1 search resulted in thousands of articles, which were then
narrowed down using the phase 2 keywords within the initial search space. The second phase
resulted in 129 articles in Web of Science and 5255 articles in PubMed. The search in Google
Scholar was conducted with the phase 2 keywords and resulted in 700 articles.
The title, abstract, and keywords of those articles were screened, and those discussing the
application of data mining and big data in the healthcare decision-making process were
retained for full-text review. To make the screening process efficient, duplicate articles were
removed at the eligibility phase instead of the screening phase of PRISMA.

• Exclusion criteria: Articles were excluded if they reported results of a qualitative study,
survey, focus group study, feasibility study, monitoring device, team relationship
measurement, job satisfaction, work environment, “what-if” analysis, or data collection
technique; were editorials or short reports; merely mentioned data mining; or were not
published in international journals.
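The screening order described above (keyword screening first, duplicate removal afterwards) could be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used by the reviewers: the record fields (`title`, `abstract`), the title-normalization rule and the function names are all hypothetical.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace so that
    near-identical records from different databases compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def screen(records: list, keywords: list) -> list:
    """Keep records whose title or abstract mentions any keyword."""
    lowered = [k.lower() for k in keywords]
    kept = []
    for record in records:
        text = (record["title"] + " " + record.get("abstract", "")).lower()
        if any(k in text for k in lowered):
            kept.append(record)
    return kept


def deduplicate(records: list) -> list:
    """Keep the first occurrence of each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


if __name__ == "__main__":
    # Hypothetical records standing in for database exports.
    records = [
        {"title": "Data Mining for Readmission Prediction", "abstract": ""},
        {"title": "Data mining for readmission prediction.", "abstract": ""},
        {"title": "Nurse job satisfaction", "abstract": "A staff survey."},
    ]
    screened = screen(records, ["data mining", "big data"])
    eligible = deduplicate(screened)
    print(len(records), len(screened), len(eligible))
```

Matching on normalized titles is a simple choice for the sketch; in practice a DOI or other identifier, where available, would be a more reliable key.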
2.2. Quality Assessment and Processing Steps

The full text of each of the 117 articles was reviewed separately by two researchers to
eliminate bias. To assess the quality of the cross-sectional studies, we applied the JBI Critical
Appraisal Checklist for Analytical Cross Sectional Studies. For theoretical papers, we applied
the Critical Appraisal Skills Programme (CASP) qualitative research checklist. We modified
the checklist items, as not all items specified in the JBI or CASP checklists were applicable to
the studies. For cross-sectional studies, we evaluated each article’s quality based on inclusion
of: (1) clear objective and inclusion criteria; (2) detailed description of sample population and
variables; (3) data source (e.g., hospital, survey) and format (e.g., structured Electronic
Medical Record (EMR), International Classification of Diseases code, unstructured text,
survey response); (4) valid and reliable data collection; (5) consideration of ethical issues;
(6) detailed discussion of findings and implications; (7) valid and reliable measurement of
outcomes; and (8) use of an appropriate data mining tool. For theoretical papers, the criteria
were: (1) clear statement of aims; (2) appropriateness of qualitative methodology;
(3) appropriateness of research design; (4) clearly stated findings; and (5) value of research.
Summary characteristics from any study fulfilling these criteria were included in the final
discussion.

To summarize the body of knowledge, we adopted the three-step processing
methodology outlined by Levy and Ellis [31]. During the review process, information was
extracted by identifying and defining the problem, understanding the solution process,
and listing the important findings (“Know the literature”). We summarized and
compared each article with the articles associated with similar problems
(“Comprehend the literature”). This simultaneously ensured that irrelevant
information was not considered in the analysis. The summarized information was
stored in a spreadsheet in the form of a concept matrix as described by Webster and
Watson [32]. We updated the concept matrix periodically, after completing every
20% of the articles (approximately 23 articles), to include new findings
(“Apply”). Based on the concept matrix, we developed a classification scheme for
further comparison and contrast. We established an operational definition for each
class, and same-class articles were separated from the pool (“Analyse and Synthesis”).
We compared classifications between researchers and resolved disagreements (on
six articles) by discussion. The final classification provided distinct groups of
articles with summaries, facts, and remarks made by the reviewers.
Three stages of the effective literature review process, adapted from Levy and Ellis [31]
(Levy Y., Ellis T.J. A systems approach to conduct an effective literature review in support of
information systems research. Inf. Sci. 2006;9:181–212. doi: 10.28945/479).
RESEARCH METHODOLOGY

4. Application of Analytics in Healthcare

It provides the operational definitions of the six application areas (i.e., clinical decision
support, healthcare administration, privacy and fraud detection, mental health, public health,
and pharmacovigilance) identified in this review. Figure 10 shows the percentage of articles
in each area. Among the classes of healthcare analytics, data mining is
most often applied to clinical decision support (42%) and administrative purposes (32%). This
section discusses the application of data mining in these areas and identifies the main aims of
these studies, performance gaps, and key features.

4.1. Clinical Decision Support

Clinical decision support consists of descriptive and/or predictive analysis mostly related to
cardiovascular disease (CVD), cancer, diabetes, and emergency/critical care unit patients.
Some studies developed novel data mining algorithms, which we also review. Table 5 describes
the topics investigated and data sources used by papers on clinical decision support,
organized by major disease category.

4.2. Healthcare Administration


Data mining was applied to administrative purposes in healthcare in 32% (29 articles) of the
articles reviewed. Researchers have applied data mining to: data warehousing and cloud
computing; quality improvement; cost reduction; resource utilization; patient management;
and other areas. It provides a list of these articles with major focus areas, problems analyzed
and the data source.

4.3. Healthcare Privacy and Fraud Detection

Health data privacy and medical fraud are issues of prominent importance. We reviewed four
articles that discussed healthcare privacy and fraud detection. The
challenges of privacy protection have been addressed by a group of researchers [122] who
proposed a new anonymization algorithm for both distributed and centralized anonymization.
Their proposed model performed better than the K-anonymization model in terms of retaining
data utility without losing much data privacy (for K = 20, the discernibility ratio, a
normalized measure of data quality, of the proposed approach and the traditional
K-anonymization method were 0.1 and 0.4, respectively).
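To make the K-anonymity and discernibility ideas concrete, the sketch below checks whether a small table is K-anonymous over a set of quasi-identifiers and computes a discernibility-style score. The review does not state how [122] normalizes the ratio, so dividing the sum of squared equivalence-class sizes by the squared number of records is an assumption of this example, as are the column names and sample records.

```python
from collections import Counter


def equivalence_class_sizes(records: list, quasi_identifiers: list) -> Counter:
    """Count records sharing the same combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records: list, quasi_identifiers: list, k: int) -> bool:
    """True if every equivalence class contains at least k records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(size >= k for size in sizes.values())


def discernibility_ratio(records: list, quasi_identifiers: list) -> float:
    """Sum of squared class sizes divided by n**2 (assumed normalization).

    Lower values mean records stay more distinguishable, i.e. more data
    utility is retained; 1.0 means all records fall into a single class.
    """
    n = len(records)
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return sum(size * size for size in sizes.values()) / (n * n)


if __name__ == "__main__":
    # Hypothetical generalized records; "age_band" and "zip" are made-up columns.
    table = [
        {"age_band": "20-29", "zip": "123**", "diagnosis": "A"},
        {"age_band": "20-29", "zip": "123**", "diagnosis": "B"},
        {"age_band": "30-39", "zip": "123**", "diagnosis": "A"},
        {"age_band": "30-39", "zip": "123**", "diagnosis": "C"},
    ]
    qi = ["age_band", "zip"]
    print(is_k_anonymous(table, qi, k=2))
    print(discernibility_ratio(table, qi))
```

In this hypothetical table the two equivalence classes each hold two records, so the table is 2-anonymous and the assumed ratio is 0.5. Under this reading, a lower ratio at the same K, as reported for the proposed approach, indicates less loss of data utility.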

4.4. Mental Health

Mental illness is a global and national concern. According to the National Survey on Drug
Use and Health (NSDUH) data from 2010 to 2012, 52.2% of the U.S. population had either
mental illness or substance abuse/dependence. Additionally, nearly 30 million people in the
U.S. suffer from anxiety disorders. We reviewed four articles that apply data
mining in analyzing, diagnosing, and treating mental health issues.

4.5. Public Health

Seven articles addressed issues that were not limited to any specific disease or demographic
group, which we classified as public health problems. These papers are listed together with
their data sources.

4.6. Pharmacovigilance

Pharmacovigilance involves post-marketing monitoring and detection of adverse drug
reactions (ADRs) to ensure patient safety. The estimated annual social cost of ADR events
exceeds one billion dollars, making pharmacovigilance an important part of the healthcare
system. Nine of the papers reviewed addressed pharmacovigilance.

Future Research and Challenges

Data mining has been applied in many fields including finance, marketing, and
manufacturing [172]. Its application in healthcare is becoming increasingly popular [173]. A
growing literature addresses the challenges of data mining, including noisy data,
heterogeneity, high dimensionality, dynamic nature, and computational time. In this section, we
focus on future research directions specific to data mining application and integration in
healthcare: personalized care, information loss in preprocessing, collecting healthcare data
for research purposes, automation for non-experts, interdisciplinarity of study and domain
expert knowledge, integration into the healthcare system, and prediction error.

• Personalized care

The electronic medical record (EMR) is increasingly used to document demographic and
clinical patient information [1]. EMR data can be utilized to develop personalized care
plans, enhancing patient experience [162] and improving care quality.

• Loss of information in pre-processing

Pre-processing of data, including handling missing data, is the most time-consuming and
costly part of data mining. The most common method used in the papers reviewed was
deletion or elimination of missing data. In one study, approximately 46.5% of the data and
363 of 410 features were eliminated due to missing values [49]. In another, researchers
were only able to use 2064 of 4948 observations (42%) [98]. By eliminating missing-value
cases and outliers, we lose a significant amount of information. Future research should
focus on finding better ways to handle missing values than elimination, such as estimating
them. Moreover, data collection techniques should be developed or modified to avoid this issue.
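
The cost of deletion can be seen in a small example. The sketch below contrasts listwise deletion with mean imputation on a hypothetical four-row table. Mean imputation is used only as the simplest stand-in for "estimation"; the reviewed papers do not prescribe it, and the column names and values are invented for illustration. The sketch assumes every column is numeric and has at least one observed value.

```python
from statistics import mean


def listwise_delete(rows):
    """Keep only rows in which no value is missing (None)."""
    return [r for r in rows if all(v is not None for v in r.values())]


def mean_impute(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    columns = list(rows[0].keys())
    col_means = {
        c: mean(r[c] for r in rows if r[c] is not None) for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else col_means[c]) for c in columns}
        for r in rows
    ]


# Hypothetical patient records; None marks a missing measurement.
rows = [
    {"age": 54, "sbp": 130.0},
    {"age": 61, "sbp": None},
    {"age": None, "sbp": 142.0},
    {"age": 47, "sbp": 118.0},
]

print(len(listwise_delete(rows)))  # 2 rows survive deletion
print(len(mean_impute(rows)))      # all 4 rows are retained
```

Deletion discards half of the rows here even though each discarded row is missing only one value. Imputation keeps them, at the price of introducing estimated values whose uncertainty a real analysis would need to account for.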

As with missing data, deletion or elimination is a common way to handle outliers [174].
However, as illustrated in one of the studies we reviewed [48], outliers can be used to gain
information about rare forms of diseases. Instead of neglecting the outliers, future research
should analyze them to gain insight.
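
One way to keep outliers available for analysis is to flag them rather than drop them. The sketch below uses the conventional 1.5 x IQR rule as an assumed detection criterion; neither the rule nor the sample values come from the reviewed studies, and the function name is hypothetical.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Pair each value with a flag that is True when it lies outside the IQR fences."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles (Python 3.8+)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(v, v < low or v > high) for v in values]


# Hypothetical lab measurements with one unusual case.
values = [4.1, 4.3, 4.0, 4.2, 4.4, 4.1, 9.8]

flagged = flag_outliers(values)
unusual = [v for v, is_outlier in flagged if is_outlier]
print(unusual)  # [9.8]
```

The flagged cases stay in the dataset with a marker, so they can be reviewed separately, for instance as candidate rare presentations, instead of being silently removed.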

• Collecting healthcare data for research purposes

Traditionally, the primary objective of data collection in healthcare is documentation of
patient condition and care planning [109]. Including research objectives in the data collection
process through structured fields could yield more structured data with fewer errors
and missing values [64]. A successful example of data collection for research purposes is the
Study of Health in Pomerania (SHIP) [175]. The objective of SHIP was to identify common
diseases, population-level risk factors, and the overall health of people living in the north-east
region of Germany. This study suffered from only one “mistake” for every 1000 data entries
[175], yielding structured data with high reliability, less noise, and fewer
missing values. We can take advantage of current documentation processes (EMR or EHR)
by modifying them to collect more reliable and structured data. Long-term vision and
planning are required to introduce research purposes into healthcare data collection.

• Automation of data mining process for non-expert users

The end users of data mining in healthcare are doctors, nurses, and healthcare professionals
with limited training in analytics. One solution for this problem is to develop an automated
(i.e., without human supervision) system for the end users [134]. A cloud-based automated
structure to prevent medical errors could also be developed [95], but the task would be
challenging because it spans different application areas, and no single algorithm will be
equally accurate for all of them [134].

• Interdisciplinary nature of study and domain expert knowledge

Healthcare analytics is an interdisciplinary research field [134]. As a form of analytics, data
mining should be used in combination with expert opinion from the relevant domains, both
healthcare in general and the specific problem (e.g., an oncologist for a cancer study, a
cardiologist for CVD) [106]. Approximately 32% of the articles in analytics did not utilize
expert opinion in any form. Future research should include members from different
disciplines, including healthcare.

• Integration in healthcare system

Very few of the articles reviewed made an effort to integrate the data mining process into the
actual decision-making framework. The impact of knowledge discovery through data mining on
healthcare professionals’ workload and time is unclear. Future studies should consider the
integration of the developed system and explore its effect on work environments.

• Prediction error and “The Black Swan” effect

In healthcare, it is better not to predict than to make an erroneous prediction [46]. A little
under half of the literature we identified in analytics is dedicated to prediction, but none of
the articles discussed the consequences of a prediction error. High prediction accuracy for
cancer or any other disease does not ensure an accurate application to decision-making.

Moreover, prediction models may be better at predicting commonplace events than rare ones.
Researchers should develop more sophisticated models to address the unpredictable,
“The Black Swan”. Other researchers addressed a similar issue in evidence-based
recommendations for medical prescriptions. Their concern was how much evidence should be
sufficient to make a recommendation.
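
The principle that abstaining is preferable to an erroneous prediction can be sketched as a reject option: the model returns a label only when its confidence clears a threshold and otherwise defers to a clinician. The function name, the threshold of 0.9, and the probabilities below are hypothetical, and the sketch assumes the classifier outputs reasonably calibrated, non-empty class probabilities.

```python
def predict_or_abstain(probabilities, threshold=0.9):
    """Return the most probable label, or None when confidence is below the threshold."""
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence >= threshold:
        return label
    return None  # abstain: leave the decision to a domain expert


# Hypothetical classifier outputs for two patients.
confident_case = {"malignant": 0.95, "benign": 0.05}
uncertain_case = {"malignant": 0.55, "benign": 0.45}

print(predict_or_abstain(confident_case))  # malignant
print(predict_or_abstain(uncertain_case))  # None
```

How high the threshold should be depends on the cost of each kind of error, which is the consequence of prediction error that the reviewed articles left undiscussed.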

DATA ANALYSIS

The distribution of articles published on data mining and big data analytics in
healthcare across the timeline of the study (2005–2016) shows
an upward trend, with at least two articles in each year and more than ten articles in the last
four years.

We identified three types of analytics in the literature: descriptive (i.e., exploration and
discovery of information in the dataset), predictive (i.e., prediction of upcoming events based
on historical data), and prescriptive (i.e., utilization of scenarios to provide decision support).
Five of the 92 studies employed both descriptive and predictive analytics. By
percentage of healthcare articles using each analytics type, descriptive
analytics is the most commonly used in healthcare (48%). Descriptive analytics was
dominant in all the application areas except clinical decision support. Among the
application areas, pharmacovigilance studies used only descriptive analytics, as this
application area is focused on identifying associations between adverse drug effects and
medication. Predictive analytics was used in 43% of articles. Among application areas, clinical
decision support had the highest application of predictive analytics, as many studies in this
area are involved in risk and morbidity prediction of chest pain, heart attack, and other
diseases. In contrast, use of prescriptive analytics was very uncommon (only 9%), as most of
these studies were focused on either a specific population base or a specific disease scenario.
However, some evidence of prescriptive analytics was found in public healthcare,
administration, and mental health. These studies create a data repository and/or analytical
platform to facilitate decision-making for different scenarios.

To identify types of data, we adopted the classification scheme of Raghupathi and
Raghupathi, which takes into account the nature (i.e., text, image, number, and electronic
signal), source, and collection method of data together. Following the operational definitions
of this taxonomy, we determined the percentage of each data type used and the number of
uses by application area. As expected, human generated (HG) data, including EMR,
Electronic Health Record (EHR), and Electronic Patient Record (EPR), is the most
commonly used form (77%). Web or Social media (WS) data is the second most common type
(11%), as more people are using social media and the digital
revolution in the healthcare sector continues. In addition, recent developments in Natural Language
Processing (NLP) techniques are making the use of WS data easier than before [36]. The other
three types of data (SD, BT, and BM) account for only about 12% of total data usage, but the
popularity and market growth of wearable personal health tracking devices [37] may increase
the use of SD and BM data.

Data mining techniques used in the articles reviewed include classification, clustering,
association, anomaly detection, sequential pattern mining, regression, and data warehousing.
While an elaborate description of each technique and the available algorithms is out of the
scope of this review, we report the frequency of each technique and its sector-wise distribution.
Among the articles included in the review, 57 used classification techniques
to analyse data.

Figure: Utilization of data mining techniques, (a) by percentage and (b) by application area.

CONCLUSION

The development of an informed decision-making framework stems from the growing
concern of ensuring a high-value and patient-focused health care system. Concurrently, the
availability of big data has created a promising research avenue for academicians and
practitioners. As highlighted in our review, the increased number of publications in recent
years corroborates the importance of health care analytics in building improved health care
systems world-wide. The ultimate goal is to facilitate coordinated and well-informed health
care systems capable of ensuring maximum patient satisfaction.

This paper adds to the literature on healthcare and data mining (Table 1) as it is the first, to
our knowledge, to take a comprehensive review approach and offer a holistic picture of
health care analytics and data mining. The comprehensive and methodologically rigorous
approach we took covers the application and theoretical perspectives of analytics and data
mining in healthcare. Our systematic approach, starting with the review process and
categorizing the output as analytics or theoretical, provides readers with a broader
review with reference to specific fields.

We also shed light on some promising recommendations for future areas of research,
including integration of domain-expert knowledge, approaches to decrease prediction error,
and integration of predictive models in actual work environments. Future research should
recommend ways for analytic decisions to adapt effectively to predictive models that are
subject to errors and unpredictable incidents. Despite these insightful outcomes, we must
acknowledge some limitations of our review approach. The prime limitation is the sole
consideration of academic journals and the exclusion of conference papers, which may have
some good coverage in this sector. In addition,
the search span was narrowed to three databases and 12 years, which may have missed some
prior works in this area, although the increasing trend since 2005 and the small number of
publications before 2008 mitigate this limitation. The omission of articles published in
languages other than English can also restrict the scope of this review, as related papers
written in other languages may exist in the literature. Moreover, we did not conduct
forward (reviewing the papers which cited the selected paper) and backward (reviewing the
references in the selected paper and authors’ prior works) searches as suggested by Levy and
Ellis [31].

METHODOLOGY

Methodological Quality of the Studies 

Out of 117 papers included in this review, 92 applied analytics and 25 were
qualitative/conceptual. The methodological quality of the 92 analytical studies
was evaluated with a modified version of the 8 yes/no questions suggested in the JBI Critical
Appraisal Checklist for Analytical Cross Sectional Studies [29]. Each question is worth 1
point (1 if the answer is Yes, 0 if No). The score achieved by each paper is provided in the
final column of Supplementary Materials Table S3. On average, each paper applying
analytics scored 7.6 out of 8, with a range of 6–8 points. The major drawbacks were the absence
of the data source and of performance measures for the data mining algorithms. Out of 92
papers, 23 did not evaluate or mention the performance of the applied algorithms and eight
did not mention the source of the data. However, all the papers in healthcare analytics had a
clear objective and a detailed discussion of the sample population and variables. Data used in
each paper were either de-identified/anonymized or approved by an institutional ethics
committee to ensure patient confidentiality.
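
The scoring rule described above amounts to counting Yes answers. The short sketch below shows the arithmetic for individual papers and for an average; the answer lists are hypothetical and are not taken from Supplementary Materials Table S3, and the function name is an illustrative choice.

```python
from statistics import mean


def checklist_score(answers, n_questions=8):
    """Score one paper: 1 point per Yes (True), 0 per No (False)."""
    if len(answers) != n_questions:
        raise ValueError(f"expected {n_questions} answers, got {len(answers)}")
    return sum(1 for answer in answers if answer)


# Hypothetical appraisals of three papers on the 8-question checklist.
appraisals = [
    [True] * 8,                  # all criteria met
    [True] * 7 + [False],        # e.g. performance measure not reported
    [True] * 6 + [False] * 2,    # e.g. data source and performance both missing
]

scores = [checklist_score(a) for a in appraisals]
print(scores)        # [8, 7, 6]
print(mean(scores))  # 7
```

The same function covers the five-question scale applied to the theoretical papers by passing n_questions=5.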

We applied the Critical Appraisal Skills Programme (CASP) qualitative research checklist
[30] to evaluate the quality of the 25 theoretical papers. Five questions (out of ten) in that
checklist were not applicable to the theoretical studies. Therefore, we evaluated the papers in
this section on a five-point scale (1 if the answer is Yes, 0 if No). Papers included in this
review showed high methodological quality, as 21 papers (out of 25) scored 5. The last
column in Supplementary Materials Table S3 provides the score achieved by individual
papers.

References

1. Cortada J.W., Gordon D., Lenihan B. The Value of Analytics in Healthcare. IBM Institute
for Business Value; Armonk, NY, USA: 2012. Report No.: GBE03476-USEN-00.
2. Center for Medicare and Medicaid Services. [(Accessed on 1 August 2017)]; Available
online: https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/NationalHealthExpendData/NationalHealthAccountsHistorical.html
3. Berwick D.M., Hackbarth A.D. Eliminating waste in US health care. J. Am. Med.
Assoc. 2012;307:1513–1516. doi: 10.1001/jama.2012.362.
4. Makary M.A., Daniel M. Medical error-the third leading cause of death in the US. Br.
Med. J. 2016;353:i2139. doi: 10.1136/bmj.i2139.
5. Prokosch H.-U., Ganslandt T. Perspectives for medical informatics. Methods Inf.
Med. 2009;48:38–44. doi: 10.3414/ME9132.
6. Simpao A.F., Ahumada L.M., Gálvez J.A., Rehman M.A. A review of analytics and
clinical informatics in health care. J. Med. Syst. 2014;38:45. doi: 10.1007/s10916-014-0045-x.
7. Ghassemi M., Celi L.A., Stone D.J. State of the art review: The data revolution in critical
care. Crit. Care. 2015;19:118. doi: 10.1186/s13054-015-0801-4.
8. Tomar D., Agarwal S. A survey on Data Mining approaches for Healthcare. Int. J. Bio-Sci.
Bio-Technol. 2013;5:241–266. doi: 10.14257/ijbsbt.2013.5.5.25.
