Chapter Two Final..sited 2
LITERATURE REVIEW
2.1 Introduction
Data mining derives its name from the similarities between searching for
valuable information in a large database and mining a mountain for a vein of
valuable ore. Both processes require either sifting through an immense amount
of material or intelligently probing it to find where the value resides [37, 38]. It is
the computer-assisted process of digging through and analyzing enormous
sets of data and then extracting the meaning of the data. Data mining tools
predict behaviors and future trends, allowing businesses to make proactive,
knowledge-driven decisions. Data mining tools can answer business
questions that were traditionally too time consuming to resolve. They scour
databases for hidden patterns, finding predictive information that experts
may miss because it lies outside their expectations[39].
Exploratory Data Analysis (EDA): As the name implies, the objective here is
simply to explore the data without a specific notion of what we are trying to
find. EDA approaches are typically interactive and visual, and for small,
low-dimensional data sets there are many effective graphical display methods.
Discovering Patterns and Rules: The three types of tasks listed above are
concerned with model building. The discovery of patterns is the focus of other
data mining applications.
One example is spotting fraudulent behavior by detecting regions of the space
defining the different types of transactions where the data points are significantly
different from the rest. Another use is in astronomy, where detection of unusual
stars or galaxies may lead to the discovery of previously unknown phenomena.
Yet another is the task of finding combinations of items that occur frequently
in transaction databases (e.g. grocery products that are often purchased
together).
Data mining experts have given this issue a lot of attention, and algorithmic
methods based on association rules have been used to solve it.
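The frequent-itemset task described above can be illustrated with a minimal sketch. It counts how often pairs of items occur together (support) and how often one item appears given another (confidence). The basket contents, the 0.4 support threshold, and the function names are assumptions made for illustration; they do not come from the cited works.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each unordered item pair."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` appears in baskets containing `antecedent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(transactions)
# Keep only pairs that occur in at least 40% of baskets (assumed threshold).
frequent_pairs = {pair for pair, s in support.items() if s >= 0.4}
bread_to_milk = confidence(transactions, "bread", "milk")
```

Practical association rule algorithms avoid enumerating every pair by pruning infrequent items first; the brute-force count here is only meant to make the two measures concrete.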
Retrieval by Content: Here the user has a pattern of interest and wishes to find
similar patterns in the data set. This task is most commonly used for text and
image data sets. For text, the pattern may be a set of keywords, and the user
may wish to find relevant documents within a large set of possibly relevant
documents (e.g., Web pages). For images, the user may have a sample image, a
sketch of an image, or a description of an image, and wish to find similar
images from a large set of images. In both cases the definition of similarity
is critical, but so are the details of the search strategy.
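As a hedged illustration of keyword-based retrieval, the sketch below ranks a few short documents against a query using cosine similarity over word counts. Cosine similarity is only one possible definition of similarity, chosen here as an assumption; the documents, the query, and the function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split text into lowercase whitespace-separated tokens."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (score, index) pairs, most similar document first."""
    q = Counter(tokenize(query))
    scored = [
        (cosine_similarity(q, Counter(tokenize(doc))), i)
        for i, doc in enumerate(documents)
    ]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Hypothetical document collection, invented for illustration only.
documents = [
    "neonatal jaundice risk factors",
    "association rules for market basket analysis",
    "risk factors for neonatal mortality",
]
ranking = rank_documents("neonatal jaundice", documents)
```

Here the search strategy is a full scan of every document, which is adequate for a handful of texts; large collections normally need an index so that only candidate documents are scored.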
One of the greatest strengths of data mining is reflected in its wide range of
methodologies and techniques that can be applied to a host of problem sets[37].
Data mining tools perform data analysis and uncover important data patterns,
contributing greatly to business strategy and to fields such as medical research.
The widening gap between data and information calls for the systematic
development of data mining tools that will turn data tombs into golden nuggets
of knowledge. Thus, patterns and knowledge obtained from data mining are used for
sound judgment and proactive decision making in different organizations, including
the health care sector.
3 Data selection (where data relevant to the analysis task are retrieved from the
database)
5 Data mining (a crucial procedure that uses clever techniques to extract data
patterns)
6 Pattern evaluations (to identify the truly interesting patterns representing
knowledge based on interestingness measure)
2.5.2 CRISP-DM
1 Sample: the first step is to create one or more data tables by sampling data
from the data warehouse. Mining a representative sample instead of the entire
volume drastically reduces the processing time required to obtain business
information.
2 Explore: after sampling the data, the next step is to explore the data visually
or numerically for trends or groupings. Exploration helps to refine the discovery
process. Techniques such as factor analysis, correlation analysis and clustering
are often used in the discovery process.
3 Modify: modifying the data refers to creating, selecting, and transforming
one or more variables to focus the model selection process in a particular
direction, or to modify the data for clarity or consistency.
4 Model: creating a data model involves using the data mining software to
search automatically for a combination of data that predicts the desired outcome
reliably.
5 Assess: the last step is to assess the model to determine how well it
performs. A common means of assessing a model is to set aside a portion of the
data during the sampling stage. If the model is valid, it should work for both the
reserved sample and for the sample that was used to develop the model[44].
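The Sample and Assess steps above can be sketched as a holdout check: a portion of the data is reserved before modeling, and the fitted model is then scored on both portions. The single-threshold model, the synthetic records, the 30% holdout fraction, and all function names are assumptions made for illustration and are not taken from the cited source.

```python
import random


def split_holdout(records, holdout_fraction=0.3, seed=0):
    """Shuffle a copy of the records and reserve a fraction for assessment."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def accuracy(threshold, sample):
    """Fraction of records where (x >= threshold) matches the label."""
    return sum((x >= threshold) == label for x, label in sample) / len(sample)


def fit_threshold(train):
    """Pick the cut-off on x that classifies the training sample best."""
    candidates = sorted({x for x, _ in train})
    return max(candidates, key=lambda t: accuracy(t, train))


# Hypothetical data: the label is True exactly when x is at least 50.
records = [(x, x >= 50) for x in range(100)]
train, holdout = split_holdout(records)
threshold = fit_threshold(train)
train_accuracy = accuracy(threshold, train)
holdout_accuracy = accuracy(threshold, holdout)
```

A large gap between `train_accuracy` and `holdout_accuracy` would suggest that the model has fitted peculiarities of the development sample rather than a pattern that generalizes.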
Predictive data mining tasks build a model of the system described by the
given dataset, one that permits the value of an unknown variable to be
predicted from the known values of other variables [45]. Predictive mining
uses some variables or fields in the dataset to predict unknown or
previously unseen future values of other variables of interest. It is usually used to
create a model that relates a set of predictors to the dependent variables.
Examples of predictive modeling include classification and prediction.
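A minimal sketch of a predictive task is given below: a k-nearest-neighbor classifier predicts the unknown label of a new record from the known labels of the most similar records. The choice of k-nearest neighbor, the two numeric features, the labels, and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest records."""
    neighbors = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training records of the form (features, label).
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"),
    ((7.0, 6.0), "high"),
    ((6.5, 7.0), "high"),
]

prediction_a = knn_predict(train, (1.2, 1.4))
prediction_b = knn_predict(train, (6.8, 6.4))
```

The known feature values play the role of the predictors, and the label is the dependent variable whose value is predicted for records not seen before.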
The second category of data mining function is the descriptive mining task, which
is used to characterize the general properties of the
data in the database [37]. It produces new, nontrivial information based on
the available dataset and aims to gain an understanding of the analyzed system
by uncovering patterns and relationships in large datasets. The goal of a
descriptive model is to describe all of the data or the process generating the
data (13). Examples of descriptive data mining are clustering, summarization,
association rule discovery, and sequence discovery. The following are some
examples from both kinds of data mining task, showing how they work in a real
pattern discovery process.
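As a hedged illustration of a descriptive task, the sketch below groups one-dimensional values with a basic k-means procedure (Lloyd's algorithm). No labels are predicted; the output simply describes the structure of the data. The values, the starting centers, and the function name are assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster scalar values with Lloyd's algorithm from the given starting centers."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assignment step: attach each value to its nearest center.
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # A center keeps its old position if its cluster is empty.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical measurements forming two visibly separate groups.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
```

The result depends on the starting centers, so practical use usually repeats the procedure from several starting points and keeps the tightest grouping.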
2.7 Summary of related work
In Ethiopia, neonatal mortality and morbidity are among the highest in the world,
and more than one-third of childhood deaths occur within the first 28 days of
life. Studies have shown that the incidence, etiology and risk factors of neonatal
jaundice vary according to ethnicity, economic status, and geographical differences
between countries.
Many studies have been conducted to address these problems, but they still have
some limitations.
6 (22) Method: prospective cohort study. Aim: to build and validate a prediction
model for severe hyperbilirubinemia using umbilical cord blood bilirubins (CBB).
7 (25) Method: case-control study with cross-sectional. Aim: to identify the
possible factors associated with neonatal jaundice and assess maternal knowledge
level of this condition.
8 (8) Method: statistical analysis; systematic review and meta-analysis using
meta-analytical technique. Aim: determining the burden of severe neonatal jaundice,
defined by clinical jaundice associated with clinical outcomes including acute
bilirubin encephalopathy/kernicterus and/or exchange transfusion (ET) and/or
jaundice-related death. Limitation: being a retrospective study, it did not afford
the authors the opportunity to actively enquire about the application of dusting
powder on the subject as a possible cause of neonatal jaundice (NNJ).
Study one: This study shows a high rate of neonatal mortality. Neonatal mortality
was highly associated with primiparity, prematurity, low birth weight, perinatal
asphyxia, respiratory distress syndrome, congenital anomaly, neonatal sepsis and
duration of hospital stay.
Study two: This study aimed to determine the prevalence of neonatal jaundice and,
secondly, to explore its risk factors in healthy term neonates. The main findings of
this study showed that data mining techniques are important and valid approaches
for the prediction of neonatal hyperbilirubinemia.
Study four: The babies in the study facility frequently suffered from
neonatal jaundice, which was more common among mothers with lower educational
levels and in infants whose parents had separated. The sample comprised 272 babies
(aged 1-30 days) seen at the Neonatal Clinic of the Department of Child Health,
Central Hospital, Warri, Delta State, between 2009. The mothers' sociodemographic
information was collected using a semi-structured questionnaire. The data were
obtained by random sampling of newborn children and were then analyzed.
Study five: The magnitude of neonatal jaundice among neonates was found to be
high. Duration of labor, time of delivery, sex of the neonate, sepsis, maternal blood
group, and blood type incompatibility were significantly associated with neonatal
jaundice. Therefore, improving newborn care and timely intervention for neonates
with ABO/Rh incompatibility are recommended.
Study six: This study's primary aim was to build and validate a prediction
model for severe hyperbilirubinemia using umbilical cord blood bilirubins
(CBB). Using a prospective cohort design, the study examined whether the
combination of CBB with gestational age and maternal race predicts neonatal
hyperbilirubinemia.
Study seven: Low neonatal birth weight and prolonged duration of labour are
associated with neonatal jaundice. Mothers had inadequate knowledge of
neonatal jaundice and its causes. Therefore, during routine prenatal visits,
healthcare professionals should focus primarily on providing more education about
the illness and its causes.
For #1: This study had limitations because it was based on a small sample and
a significant number of sample charts were incomplete. It was done in a
single center; as a result, the prevalence may not reflect the overall prevalence in
the community. In addition, as a cross-sectional study, it does not
show cause-and-effect relationships.
For #2: As a limitation, this study did not consider some essential predictors,
such as thyroid-stimulating hormone and glucose-6-phosphate dehydrogenase. A
further limitation is that convenience sampling was used, which contributed to
a small study sample and the high percentage of babies born via C-section. The
low numbers of participants with certain risk factors (e.g. smoking or alcohol use)
made it difficult to investigate associations. Data on the gravidity of the mothers
and the gender of the babies, which were identified as risk factors in some
studies, were not collected.
For #3: A bigger sample could have improved the result (70 variables were used), and
the study considered a limited set of associated risk factors (father and mother
information, sibling information, gestational information, delivery information,
and clinical information for the complete hospital stay).
For #4: The study is limited to a small sample drawn only from health care center
deliveries, and the gathered data were analyzed without any tool and with limited
variable analysis.
For #5: This study covered a total of 209 neonates, which is a small number, and it
considered only babies in the intensive care units of public hospitals. The study did
not consider babies born at home and later brought to a health center.
For #6: This is a single-site study, and further validation in a larger, multi-site
study is warranted. The study considered only the combination of umbilical cord
blood bilirubins (CBB) with gestational age and maternal race to predict neonatal
hyperbilirubinemia.
For #7: The sample was limited to one hundred and fifty (150) neonates,
comprising 100 with clinically evident jaundice and 50 without jaundice, who were
conveniently recruited from the Trauma and Specialist Hospital in the Effutu
Municipality. Blood samples were collected to determine only a few risk
factors: serum bilirubin, glucose-6-phosphate dehydrogenase (G6PD) status, and
blood group (ABO and Rhesus).
For #8: Being a retrospective study, it did not afford the authors the opportunity to
actively enquire about the application of dusting powder on the subject as a possible
cause of neonatal jaundice (NNJ).
The amount of data used by the second researchers was very small, and data
collection was limited to a small number of patients born during the data
collection period; data on jaundice patients born earlier were not analyzed.
The study of the second algorithm used only the variables gestational age and
newborn blood group (ABO).
Even though there are various factors for jaundice, the studies are limited in
explaining the factors related to it. The etiology and risk factors of neonatal
jaundice vary according to ethnicity, economic status, and geographical differences
between countries.
Most of the studies were conducted using cross-sectional designs and systematic
data analysis. The studies that used classification relied on decision tree
algorithms; other types of classification algorithms were not used by the
researchers for comparative analysis.
To my knowledge, no study has used data mining to predict jaundice from the
factors responsible for its occurrence.
Therefore, this study will apply data mining techniques to predict the jaundice
status of newborns. Specifically, it will identify the determinant attributes of the
jaundice status of newborn babies and build the best prediction model.
References
1. Mitra, S. and J. Rennie, Neonatal jaundice: aetiology, diagnosis and treatment. British Journal of
Hospital Medicine, 2017. 78(12): p. 699-704.
2. Kleigman, B., Jenson. Stanton Saunders International edition, Ed 18th, 2008: p. p2666.
3. Cohen, R.S., R.J. Wong, and D.K. Stevenson, Understanding neonatal jaundice: a perspective on
causation. Pediatrics & Neonatology, 2010. 51(3): p. 143-148.
4. Mwaniki, M.K., et al., Long-term neurodevelopmental outcomes after intrauterine and neonatal
insults: a systematic review. The Lancet, 2012. 379(9814): p. 445-452.
5. Olusanya, B.O., S. Teeple, and N.J. Kassebaum, The contribution of neonatal jaundice to global
child mortality: findings from the GBD 2016 study. Pediatrics, 2018. 141(2).
6. Lawn, J.E., et al., Every Newborn: progress, priorities, and potential beyond survival. The Lancet,
2014. 384(9938): p. 189-205.
7. Olusanya, B.O., T.A. Ogunlesi, and T.M. Slusher, Why is kernicterus still a major cause of death
and disability in low-income and middle-income countries? Archives of disease in childhood,
2014. 99(12): p. 1117-1121.
8. Slusher, T.M., et al., Burden of severe neonatal jaundice: a systematic review and meta-analysis.
BMJ paediatrics open, 2017. 1(1).
9. Greco, C., et al., Diagnostic performance analysis of the point-of-care bilistick system in
identifying severe neonatal hyperbilirubinemia by a multi-country approach. EClinicalMedicine,
2018. 1: p. 14-20.
10. Sreedha, B., P.R. Nair, and R. Maity, Non-invasive early diagnosis of jaundice with computer
vision. Procedia Computer Science, 2023. 218: p. 1321-1334.
11. Bhutani, V., R. Vilms, and L. Hamerman-Johnson, Universal bilirubin screening for severe
neonatal hyperbilirubinemia. Journal of perinatology, 2010. 30(1): p. S6-S15.
12. Maisels, M.J. Screening and early postnatal management strategies to prevent hazardous
hyperbilirubinemia in newborns of 35 or more weeks of gestation. in Seminars in fetal and
neonatal medicine. 2010. Elsevier.
13. Vodret, S., et al., Attenuation of neuro-inflammation improves survival and neurodegeneration in
a mouse model of severe neonatal hyperbilirubinemia. Brain, behavior, and immunity, 2018. 70:
p. 166-178.
14. Bhutani, V.K., et al., Neonatal hyperbilirubinemia and Rhesus disease of the newborn: incidence
and impairment estimates for 2010 at regional and global levels. Pediatric research, 2013. 74(1):
p. 86-100.
15. Onyearugha, C., B. Onyire, and H. Ugboma, Neonatal jaundice: Prevalence and associated
factors as seen in Federal medical centre Abakaliki, Southeast Nigeria. J Clin Med Res, 2011. 3(3):
p. 40-45.
16. Tette, E.M., et al., The pattern of neonatal admissions and mortality at a regional and district
hospital in the Upper West Region of Ghana; a cross sectional study. PloS one, 2020. 15(5): p.
e0232406.
17. Greco, C., et al., Neonatal jaundice in low-and middle-income countries: lessons and future
directions from the 2015 Don Ostrow Trieste Yellow Retreat. Neonatology, 2016. 110(3): p. 172-
180.
18. Tewabe, T., et al., Neonatal mortality in the case of Felege Hiwot referral hospital, Bahir Dar,
Amhara Regional State, North West Ethiopia 2016: a one year retrospective chart review. Italian
journal of pediatrics, 2018. 44: p. 1-5.
19. Yismaw, A.E. and A.A. Tarekegn, Proportion and factors of death among preterm neonates
admitted in University of Gondar comprehensive specialized hospital neonatal intensive care
unit, Northwest Ethiopia. BMC research notes, 2018. 11: p. 1-7.
20. Demography, E., Health Survey: Addis Ababa. Ethiopia and Rockville, Maryland, USA: Central
statistics agency and ICF. EDHS, 2016.
21. Castillo, A., et al., Umbilical cord blood bilirubins, gestational age, and maternal race predict
neonatal hyperbilirubinemia. PLoS One, 2018. 13(6): p. e0197888.
22. Scrafford, C.G., et al., Incidence of and risk factors for neonatal jaundice among newborns in
southern Nepal. Tropical Medicine & International Health, 2013. 18(11): p. 1317-1328.
23. Garosi, E., F. Mohammadi, and F. Ranjkesh, The relationship between neonatal jaundice and
maternal and neonatal factors. Iranian Journal of Neonatology, 2016. 7(1): p. 37-40.
24. Omekwe, D.E., et al., Survey and management outcome of neonatal jaundice from a developing
tertiary health centre, Southern Nigeria. IOSR Journal of Dental and Medical Sciences, 2014.
13(4): p. 35-39.
25. Adoba, P., et al., Knowledge level and determinants of neonatal jaundice: a cross-sectional study
in the Effutu Municipality of Ghana. International journal of pediatrics, 2018. 2018.
26. Birhanu, M.Y., et al., Rate and predictors of neonatal jaundice in northwest Ethiopia: prospective
cohort study. Journal of Multidisciplinary Healthcare, 2021: p. 447-457.
27. Brits, H., et al., The prevalence of neonatal jaundice and risk factors in healthy term neonates at
National District Hospital in Bloemfontein. African Journal of Primary Health Care and Family
Medicine, 2018. 10(1): p. 1-6.
28. Kavehmanesh, Z., et al., Prevalence of readmission for hyperbilirubinemia in healthy newborns.
2008.
29. Tavakolizadeh, R., et al., Maternal risk factors for neonatal jaundice: a hospital-based cross-
sectional study in Tehran. European journal of translational myology, 2018. 28(3).
30. Khedmat, L., S.Y. Mojtahedi, and A. Moienafshar, Recent clinical evidence in the herbal therapy
of neonatal jaundice in Iran: A review. Journal of Herbal Medicine, 2021. 29: p. 100457.
31. Olusanya, B.O., F.B. Osibanjo, and T.M. Slusher, Risk factors for severe neonatal
hyperbilirubinemia in low and middle-income countries: a systematic review and meta-analysis.
PloS one, 2015. 10(2): p. e0117229.
32. Ogunlesi, T.A. and O.B. Ogunfowora, Predictors of acute bilirubin encephalopathy among
Nigerian term babies with moderate-to-severe hyperbilirubinaemia. Journal of tropical
pediatrics, 2011. 57(2): p. 80-86.
33. Lake, E.A., et al., Magnitude of neonatal jaundice and its associated factor in neonatal intensive
care units of Mekelle city public hospitals, Northern Ethiopia. International journal of pediatrics,
2019. 2019.
34. Fanello, C., et al., Prevalence and Risk Factors of Neonatal Hyperbilirubinemia in a Semi-Rural
Area of the Democratic Republic of Congo: A Cohort Study. The American Journal of Tropical
Medicine and Hygiene, 2023. 109(4): p. 965.
35. Hansen, T.W.R., Narrative review of the epidemiology of neonatal jaundice. Pediatric Medicine,
2021. 4.
36. Ip, S., et al., An evidence-based review of important issues concerning neonatal
hyperbilirubinemia. Pediatrics, 2004. 114(1): p. e130-e153.
37. Kantardzic, M., Data mining: concepts, models, methods and algorithms. John Wiley & Sons,
Inc., Hoboken, New Jersey, 2011.
38. Han, J. and M. Kamber, Data mining: concepts and techniques. Morgan Kaufmann, 2006.
39. Sumathi, S. and S. Sivanandam, Evolution and Scaling of Data Mining Algorithms. Introduction to
Data Mining and its Applications, 2006: p. 151-164.
40. Witten, I.H., et al. Practical machine learning tools and techniques. in Data mining. 2005. Elsevier
Amsterdam, The Netherlands.
41. Adeniyi, D.A., Z. Wei, and Y. Yongquan, Automated web usage data mining and recommendation
system using K-Nearest Neighbor (KNN) classification method. Applied Computing and
Informatics, 2016. 12(1): p. 90-108.
42. Cios, K.J., et al., Text Mining. Data Mining: A Knowledge Discovery Approach, 2007: p. 453-465.
43. Kantardzic, M.M. and J. Gant. Mining Sequences in Distributed Sensors Data for Energy
Production. in FLAIRS. 2007.
44. Larose, D.T., An introduction to data mining. Traduction et adaptation de Thierry Vallaud, 2005.
45. Niakšu, O. and O. Kurasova, Data mining applications in healthcare: research vs practice.
Databases Inf. Syst. BalticDB&IS, 2012. 58: p. 2012.