International Journal of Strategic Information Technology and Applications
ABSTRACT
Data mining has been gaining attention in complex business environments as data volumes grow rapidly and data become ubiquitous in this age of the internet and social media. Organizations want to make informed decisions using a complete set of data, structured and unstructured, that originates both internally and externally. A variety of data mining techniques has evolved over the last two decades to solve a wide range of business problems, and practitioners and researchers in industry and academia continue to develop and experiment with them. This article provides an overview of the data mining techniques that are widely used in different fields to discover knowledge and solve business problems, offering an update based on extant literature as of 2018 that may help practitioners and researchers form a holistic view of data mining techniques.
Keywords
Business Environment, Data Mining Applications, Data Mining Techniques, Data Mining
1. INTRODUCTION
Data mining techniques (Liao et al., 2012) have been applied in the retail industry, marketing, customer relationship management (CRM), finance and banking, insurance, scientific discovery, and healthcare, to name a few. They are used to address business scenarios such as customer recommendations, anomaly detection, development of customer profiles, mining of unstructured data, discovery of new insights, accurate prediction, exploration of complex data patterns, predictive analytics, identification of interesting patterns in data, and modeling of customer behavior (Hart et al., 2003), as well as medical diagnosis and scientific discovery.
Researchers have reviewed individual data mining techniques to report progress in terms of research and problem-solving (Rahman, 2018a). They have also compared data mining techniques to understand their problem-solving capability and performance.
This paper reviews data mining techniques, their applications, and their problem-solving capability. It first reviews prominent data mining techniques and then provides a list of problems solved by them. The author searched the EBSCO databases, which returned thousands of papers related to each data mining technique; a separate search was conducted for each technique, and the papers were screened by title to short-list articles relevant to this research.
The research on data mining as of 2018 suggests that several data mining techniques are widely used. They include Bayesian networks, neural networks (NN), decision trees, association rules, clustering techniques, support vector machines (SVM), logistic regression, and K-nearest neighbors. Based on an extensive review of the extant literature, it was found that a handful of real-world data mining problems are solved by the data mining techniques mentioned above.
DOI: 10.4018/IJSITA.2018010104
Copyright © 2018, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Volume 9 • Issue 1 • January-March 2018
2. LITERATURE REVIEW
Data mining is a vast field of research. Processing, transforming, aggregating, and finding hidden information in data demand a great deal from computer applications in terms of algorithms, techniques, and experiments. During the last two decades a good number of research studies, surveys of techniques, and literature reviews have been conducted (Rahman, 2018b). This section of the paper provides an account of that research. In most cases researchers conducted such studies on a particular algorithm or data mining technique. This research attempts to provide a holistic overview of data mining techniques, some comparative analysis, their advantages and limitations, and problem classifications.
Wu et al. (2008) conducted a survey to identify the top ten data mining algorithms that are influential in the research community, drawing on ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners. This is an important source on the most widely used algorithms. Based on their 2006 survey the authors identified ten algorithms: C4.5, k-means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. Later, Li (2015) explained these algorithms and their associated data mining techniques, with examples of their real-world use.
Liao et al. (2012) conducted a survey of past research on data mining techniques and applications. In their survey of papers published between 2000 and 2011, the authors identified the keywords that appeared most frequently as data mining techniques, including decision tree, artificial neural network, clustering, association rule, artificial intelligence, bioinformatics, customer relationship management, and fuzzy logic. The authors also suggested that the social sciences, including psychology, cognitive science, and human behavior, might find data mining an alternative methodology besides qualitative, quantitative, and scientific methods for understanding their subject areas.
Prieto et al. (2016) provide an overview of research in neural networks. The authors state that, as one of the prominent data mining techniques, the neural network has reached maturity and consolidation in solving real-world problems. They also point out that neural networks have contributed significantly to different disciplines, including computational neuroscience, neuro-engineering, computational intelligence, and machine learning, and that several national and multinational project initiatives are underway to understand the human brain using neural-network research.
Hotho et al. (2005) performed a survey on text mining and provided a list of data mining techniques for it. Text mining aims at knowledge discovery, extracting meaningful information from unstructured data; because the data are unstructured, it requires a great deal of preprocessing, classification, clustering, and filtering.
Jain (2010) published a paper entitled 'Data clustering: 50 years beyond K-means' in Pattern Recognition Letters. The author noted that numerous clustering algorithms have been published over the decades, yet the k-means algorithm, proposed in 1955, remains the most widely used. The paper concludes with open problems and research directions in designing clustering algorithms, proposing benchmark data for the research community to test and evaluate clustering algorithms, and calling for a tighter integration between clustering algorithms and real-world application needs.
Phyu (2009) provides a survey of data mining techniques for classification. Classification is one of the prominent kinds of data mining problem being solved by different data mining techniques. The author states that decision trees and Bayesian networks are notable techniques for accuracy.
Sapankevych and Sankar (2009) survey the support vector machine's capability for time-series prediction. The authors state that SVM can accurately forecast time-series data and report that it outperforms other data mining techniques such as neural-network-based non-linear prediction. They also highlight advantages of and challenges in using SVM for time-series prediction.
Wu et al. (2014) provide an overview of data mining in the big data space. Big data is a new class of data that organizations try to utilize these days to gain business value; it is created by both internal and external sources, and most of it is unstructured. The big data framework Hadoop gave rise to a data mining library called Mahout, and the other big data processing engine, Spark, provides a machine learning library called MLlib. Both Mahout and MLlib are based on prominent data mining algorithms and techniques. Wu et al. (2014) also discuss the challenges of data mining and propose a big data processing model from a data mining perspective.
Tosun et al. (2017) conducted an extensive review of applications of the Bayesian networks technique. The authors report limited use of BN and note that the current literature does not provide enough insight to replicate studies. They propose a framework with contextual and methodological details that could be used to replicate and expand work on Bayesian networks techniques.
In this research, the author lists all prominent data mining techniques, their applications, and their advantages and limitations. Based on a comprehensive review of data mining papers published in leading journals over the last two decades, the author also discusses the problems currently solved by different data mining techniques.
With the invention of microcomputers and the wide use of computers by business organizations and household users, data has been growing exponentially. Lately, because of the incredible reach of the Internet and the web, data has been growing monumentally. Business organizations have started thinking about extracting more business value from this huge volume of business data. Different kinds of data mining techniques have evolved over the last two decades to identify patterns in those data, solve business problems, and increase business revenue. This paper reviews prominent data mining techniques and identifies the problems they solve. The underlying algorithms for these techniques include the top ten algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART (Wu et al., 2008).
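Of the top-ten algorithms listed above, Apriori illustrates the level-wise search typical of association-rule mining. The sketch below, using an invented market-basket example, counts itemset support and prunes candidates that fall below a minimum-support threshold; it shows only the frequent-itemset stage, not rule generation.

```python
def apriori(transactions, min_support):
    """Level-wise frequent-itemset mining: grow candidate itemsets one
    item at a time and prune any whose support is below min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # fraction of transactions containing every item in the itemset
        return sum(itemset <= t for t in transactions) / n

    items = {i for t in transactions for i in t}
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    frequent = {s: support(s) for s in level}
    k = 2
    while level:
        # candidate generation: unions of frequent (k-1)-itemsets of size k
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = [c for c in candidates if support(c) >= min_support]
        frequent.update({c: support(c) for c in level})
        k += 1
    return frequent

# Invented market-basket data
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
freq = apriori(baskets, min_support=0.5)
# freq[frozenset({"bread"})] == 0.75; {milk, butter} is pruned (support 0.25)
```

The anti-monotonicity of support (a superset can never be more frequent than its subsets) is what makes the level-wise pruning sound.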
The neural network was originally conceived to solve problems the way the human brain does, but it has never been possible to make it work that way. Researchers and practitioners have since applied NN to other problems. According to the Wikipedia entry, over time NN has been used to perform tasks in computer vision, speech recognition, filtering of social network texts, and medical diagnosis. Table 2 shows the current trends in the use of neural networks, chiefly to tackle prediction problems; it also shows neural networks being used to solve classification, optimization, and pattern recognition problems.
Certain limitations of neural networks have been reported (Pradhan, 2016): NNs are very much a black box, with no way to know what caused a given output, which makes them difficult to train; consequently, a model based on a training data set can be nondeterministic (Quora, 2014). To get accurate results an NN needs a large data set, and the hardware requirements can be substantial.
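To make the training mechanics concrete, the following is a minimal sketch of a 2-2-1 feed-forward network trained by hand-written backpropagation on the XOR problem, in plain Python. The architecture, learning rate, and epoch count are illustrative choices only, and convergence is not guaranteed for every initialization.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_xor_net(epochs=8000, lr=0.7, seed=0):
    """Train a 2-2-1 sigmoid network on XOR with stochastic gradient
    descent, updating the weights one training sample at a time."""
    random.seed(seed)
    w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # hidden: 2 inputs + bias
    w2 = [random.uniform(-1, 1) for _ in range(3)]                      # output: 2 hidden + bias
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    def forward(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
        o = sigmoid(w2[0] * h[0] + w2[1] * h[1] + w2[2])
        return h, o

    for _ in range(epochs):
        for x, t in data:
            h, o = forward(x)
            d_o = (o - t) * o * (1 - o)                       # output-layer delta
            d_h = [d_o * w2[j] * h[j] * (1 - h[j]) for j in range(2)]
            for j in range(2):
                w2[j] -= lr * d_o * h[j]
            w2[2] -= lr * d_o                                 # output bias
            for j in range(2):
                for i in range(2):
                    w1[j][i] -= lr * d_h[j] * x[i]
                w1[j][2] -= lr * d_h[j]                       # hidden bias
    return lambda x: forward(x)[1]

net = train_xor_net()
outputs = [net(x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

The opacity noted above is visible even at this scale: the learned weights offer no direct explanation of why a given input maps to a given output.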
3.5. Clustering
Clustering includes techniques for grouping data objects that are similar to each other and different from (or unrelated to) the objects in other groups. The clustering task segments a heterogeneous population into groups of homogeneous items (Ngai et al., 2013). For example, clustering could be used to group customers according to income, age, profession, purchase policies, and prior claims experience. Saxena et al. (2017) provide a comprehensive study of clustering, listing existing methods and the developments made over time; the authors highlight that clustering techniques have been applied in the fields of pattern recognition and image segmentation. Since the data categories are unspecified, clustering is sometimes described as unsupervised learning (Cornuejols et al., 2018).
Data mining applications use clustering to discover similarities, e.g., to segment a client or customer base, and it can be used to generate profiles in target marketing. The k-means is one of the most popular clustering algorithms (Jain, 2010); it is used to partition a data set into a specified number of clusters (Wu et al., 2008). The user sets the number of clusters needed and k-means returns results accordingly. The k-means is generally faster and more efficient than other algorithms when dealing with a large dataset (Li, 2015). In data mining, the Expectation-Maximization (EM) algorithm is also used in cluster analysis for knowledge discovery (Li, 2015; Wu et al., 2008). Zhang et al. (2016) propose a random-walk algorithm for big graph data clustering and assert that their method outperforms previous random-walk-based algorithms in solving graph clustering problems; their technique is built on a parallel computing paradigm because big data volumes are huge. Cornuejols et al. (2018) present a collaborative clustering model in which a set of clustering algorithms is applied in parallel to a given data set to obtain a better overall solution. Yassouridis and Leisch (2017) present comparative performance benchmarks of different clustering algorithms on functional data. Table 5 provides the most recent uses of clustering in problem-solving.
Limitations of clustering have been reported. With a simple k-means approach it is sometimes difficult to find the optimal number of clusters, and some algorithms end up with only a local rather than a global optimum, in which case the solution may be far from perfect (Koelbl, 2018).
Kernel functions introduce non-linearity into the hypothesis space without explicitly requiring a non-linear algorithm (Burbidge, 2012). The SVM is a training algorithm for learning classification and regression rules from data. A comparison between SVM and NN showed that SVM outperformed NN in terms of accuracy (Agrawal and Agrawal, 2015). The SVM performs tasks similar to the C4.5 algorithm, although it does not produce a decision tree (Li, 2015). Ougiaroglou et al. (2018) present experimental results on SVM's ability to reduce training dataset size, with the goal of alleviating high memory requirements and operational costs; the authors show that their model effectively reduced training dataset size with only a small performance degradation. Table 6 shows the latest trends in applications of SVM in problem solving.
Research suggests that SVMs have limitations: they need the features prepared as real-valued vectors, they are computationally intensive, and an increase in training data can strain machine capability or processing power. In addition, non-linear SVMs are expensive to train (Quora, 2017).
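As a minimal illustration of the ideas above, the sketch below trains a linear SVM by sub-gradient descent on the regularized hinge loss, in the style of the Pegasos algorithm (which is not one of the methods surveyed here). The toy data and hyperparameters are invented; a real application would use an optimized library implementation.

```python
import random

def train_linear_svm(data, lam=0.01, epochs=500, seed=1):
    """Pegasos-style sub-gradient descent on the regularized hinge loss.
    data: list of ((x1, x2), y) with labels y in {-1, +1}.
    A constant feature is appended so the bias is learned inside w."""
    random.seed(seed)
    w = [0.0, 0.0, 0.0]
    t = 0
    for _ in range(epochs):
        random.shuffle(data)
        for (x1, x2), y in data:
            t += 1
            eta = 1.0 / (lam * t)                     # decaying step size
            x = (x1, x2, 1.0)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1 - eta * lam) for wi in w]    # shrink (regularizer)
            if margin < 1:                            # hinge loss is active
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return lambda p: 1 if w[0] * p[0] + w[1] * p[1] + w[2] >= 0 else -1

# Invented, linearly separable toy data
train = [((2, 2), 1), ((3, 3), 1), ((2, 3), 1),
         ((-2, -2), -1), ((-3, -2), -1), ((-2, -3), -1)]
clf = train_linear_svm(train)
```

Note that the sketch is linear only; the kernel trick mentioned above would replace the inner products with kernel evaluations, at the computational cost the limitations paragraph describes.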
Logistic regression models a binary outcome, coded for example as 0/1, Y/N, or F/T.
Logistic regression is easy to implement and efficient to train, and the resulting model is relatively simple and easy to inspect. It has limitations, however: a non-linear problem cannot be solved with logistic regression because its decision surface is linear (Raschka, 2016).
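The linear decision surface noted above can be seen in a minimal sketch: logistic regression fitted by batch gradient descent on the log loss, in plain Python. The one-feature pass/fail data set and the hyperparameters are invented for illustration.

```python
import math

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss.
    X: list of feature tuples; y: list of 0/1 labels."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    n = len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for x, t in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))        # sigmoid probability
            err = p - t                            # gradient of log loss wrt z
            for i in range(d):
                gw[i] += err * x[i]
            gb += err
        w = [wi - lr * gi / n for wi, gi in zip(w, gw)]
        b -= lr * gb / n
    return lambda x: 1.0 / (1.0 + math.exp(
        -(sum(wi * xi for wi, xi in zip(w, x)) + b)))

# Invented toy example: pass/fail as a function of hours studied
X = [(0.5,), (1.0,), (1.5,), (3.0,), (3.5,), (4.0,)]
y = [0, 0, 0, 1, 1, 1]
model = train_logreg(X, y)
```

Because the model is linear in z, the 0.5-probability boundary is a single threshold on the feature, which is exactly the linear decision surface the limitation refers to.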
4. CONCLUSION
This study reviewed the progress of different data mining techniques, including Bayesian networks, neural networks, decision trees, association rules, clustering, support vector machines, logistic regression, and k-nearest neighbors. The author discussed a variety of real-world business problems solved by these techniques. The study found that data mining research and applications are widespread: a search of scientific publication databases returned thousands of papers from the last decade.
This study provides an overview of prominent data mining techniques, along with the strengths and limitations of each, which is expected to give users a good overview of every technique. The author asserts that no single technique will be sufficient to solve a problem in all use cases; the context and nature of the data sets need to be taken into consideration when choosing a particular technique. The author hopes this work will provide readers with insights into both techniques and problem solving, and into future research directions.
REFERENCES
Abdallah, A., Maarof, M. A., & Zainal, A. (2016). Fraud detection system: A survey. Journal of Network and
Computer Applications, 68, 90–113. doi:10.1016/j.jnca.2016.04.007
Agrawal, S., & Agrawal, J. (2015). Survey on anomaly detection using data mining techniques. Procedia Computer
Science, 60, 708–713. doi:10.1016/j.procs.2015.08.220
Aibinu, A.M., Salau, B.H., Rahman, N.A., Nwohu, M.N., & Akachukwu, C.M. (2016). A novel clustering
based genetic algorithm for route optimization. Engineering Science and Technology, an International Journal,
19(4), 2022-2034.
Albuquerque, M. T. D., Gerassis, S., Sierra, C., Taboada, J., Martín, J. E., Antunes, I. M. H. R., & Gallego, J.
R. (2017). Developing a new Bayesian Risk Index for risk evaluation of soil contamination. The Science of the
Total Environment, 603/604, 167–177. doi:10.1016/j.scitotenv.2017.06.068 PMID:28624637
Alimjan, G., Sun, T., Jumahun, H., Guan, Y., Zhou, W., & Sun, H. (2017). A hybrid classification approach based
on support vector machine and k-nearest neighbor for remote sensing data. International Journal of Pattern
Recognition and Artificial Intelligence, 31(10), 1–22. doi:10.1142/S0218001417500343
Alizadeh, M., Shahheydari, H., Kavianpour, M., Shamloo, H., & Barati, R. (2017). Prediction of longitudinal
dispersion coefficient in natural rivers using a cluster-based Bayesian network. Environmental Earth Sciences,
76(2), 1–11. doi:10.1007/s12665-016-6379-6
Anbari, M. J., Tabesh, M., & Roozbahani, A. (2017). Risk assessment model to prioritize sewer pipes inspection
in wastewater collection networks. Journal of Environmental Management, 190, 91–101. doi:10.1016/j.
jenvman.2016.12.052 PMID:28040592
Andrejiova, M., Grincova, A., & Marasova, D. (2018). Failure analysis of rubber composites under dynamic impact
loading by logistic regression. Engineering Failure Analysis, 84, 311–319. doi:10.1016/j.engfailanal.2017.11.019
Apollo, M., Grzyl, B., & Miszewska-Urbanska, E. (2017). Application of BN in risk diagnostics arising from
the degree of urban regeneration area degradation. In Proceedings of the 2017 Baltic Geodetic Congress (BGC
Geomatics), Gdansk, Poland, June 22-25. doi:10.1109/BGC.Geomatics.2017.47
Aswani, R., Ghrera, S. P., Kar, A. K., & Chandra, S. (2017). Identifying buzz in social media: A hybrid approach
using artificial bee colony and k-nearest neighbors for outlier detection. Social Network Analysis and Mining,
7(1), 38. doi:10.1007/s13278-017-0461-2
Bahmani, A., & Mueller, F. (2016). Efficient clustering for ultra-scale application tracing. Journal of Parallel
and Distributed Computing, 98, 25–39. doi:10.1016/j.jpdc.2016.08.001
Baldominos, A., Saez, Y., & Isasi, P. (2018). Evolutionary convolutional neural networks: An application to
handwriting recognition. Neurocomputing, 283, 38–52. doi:10.1016/j.neucom.2017.12.049
Banda, O. A. V., Goerlandt, F., Kuzmin, V., Kujala, P., & Montewka, J. (2016). Risk management model of
winter navigation operations. Marine Pollution Bulletin, 108(1/2), 242–262. doi:10.1016/j.marpolbul.2016.03.071
PMID:27207023
Banghart, M., Bian, L., Strawderman, L., & Babski-Reeves, K. (2017). Risk assessment on the EA-6B aircraft
utilizing Bayesian networks. Quality Engineering, 29(3), 499–511. doi:10.1080/08982112.2017.1319957
Banuls, V. A., Lopez, C., Turoff, M., & Tejedor, F. (2017). Predicting the impact of multiple risks
on project performance: A scenario-based approach. Project Management Journal, 48(5), 95–114.
doi:10.1177/875697281704800507
Barua, S., Gao, X., Pasman, H., & Mannan, M. S. (2016). Bayesian network based dynamic operational risk
assessment. Journal of Loss Prevention in the Process Industries, 41, 399–410. doi:10.1016/j.jlp.2015.11.024
Basu, S., Mukhopadhyay, S., Karki, M., DiBiano, R., Ganguly, S., Nemani, R., & Gayaka, S. (2018). Deep
neural networks for texture classification-A theoretical analysis. Neural Networks, 97, 173–182. doi:10.1016/j.
neunet.2017.10.001 PMID:29126070
Baya, A. E., Larese, M. G., & Namias, R. (2017). Clustering stability for automated color image segmentation.
Expert Systems with Applications, 86, 258–273. doi:10.1016/j.eswa.2017.05.064
Belharbi, S., Herault, R., Chatelain, C., & Adam, S. (2018). Deep neural networks regularization for structured
output prediction. Neurocomputing, 281, 169–177. doi:10.1016/j.neucom.2017.12.002
Ben‐Gal, I. (2007). Bayesian networks. Encyclopedia of statistics in quality and reliability. John Wiley & Sons,
Ltd.
Bentes, C., Velotto, D., & Tings, B. (2018). Ship classification in TerraSAR-X images with convolutional neural
networks. IEEE Journal of Oceanic Engineering, 43(1), 258–266. doi:10.1109/JOE.2017.2767106
Bisson, C., & Gurpinar, F. (2017). A Bayesian approach to developing a strategic early warning system for the
French milk market. Journal of Intelligence Studies in Business, 7(3), 25–34.
Bouallegue, W., Bouabdallah, S. B., & Tagina, M. (2017). Robust fault detection and isolation in bond graph
modelled processes with Bayesian networks. International Journal of Computer Applications in Technology,
55(1), 46–54. doi:10.1504/IJCAT.2017.082261
Boukhris, I., Elouedi, Z., & Ajabi, M. (2017). Toward intrusion detection using belief decision trees for big data.
Knowledge and Information Systems, 53(3), 671–698. doi:10.1007/s10115-017-1034-4
Burbidge, R., & Buxton, B. (2012). An introduction to support vector machines for data mining. Retrieved from
http://datamining.martinsewell.com/BuBu.pdf
Cai, L., Thornhill, N. F., Kuenzel, S., & Pal, B. C. (2017). Real-time detection of power system disturbances
based on k-nearest neighbor analysis. IEEE Access, 5, 5631–5639.
Chan, G.-Y., Chua, F.-F., & Lee, C.-S. (2016). Intrusion detection and prevention of web service attacks for
software as a service: Fuzzy association rules vs fuzzy associative patterns. Journal of Intelligent & Fuzzy
Systems, 31(2), 749–764. doi:10.3233/JIFS-169007
Changbao, X., Lijin, Z., Yu, W., Liang, H., Yongtian, J., & Liming, Y. (2017). Risk assessment model of relay
protection system based on multi-state Bayesian networks. In Proceedings of the 2017 IEEE Conference and Exp.
on Transportation Electrification Asia-Pacific (ITEC Asia-Pacific), Harbin, China, August 7-10. doi:10.1109/
ITEC-AP.2017.8081034
Chen, L., Liu, Y., Zhao, J., Wang, W., & Liu, Q. (2016). Prediction intervals for industrial data with incomplete
input using kernel-based dynamic Bayesian networks. Artificial Intelligence Review, 46(3), 307–326. doi:10.1007/
s10462-016-9465-y
Chen, W. (2016). What are the disadvantages of using a decision tree for classification? Quora. Retrieved from
https://www.quora.com/What-are-the-disadvantages-of-using-a-decision-tree-for-classification
Chen, Y., & Hao, Y. (2017). A feature weighted support vector machine and K-nearest neighbor algorithm for
stock market indices prediction. Expert Systems with Applications, 80, 340–355. doi:10.1016/j.eswa.2017.02.044
Chitra, K., & Subashini, B. (2013). Data mining techniques and its applications in banking sector. International
Journal of Emerging Technology and Advanced Engineering, 3(8), 219–226.
Chugh, S., Selvan, K. A., & Nadesh, R. K. (2017). Prediction of heart disease using apache spark analysing
decision trees and gradient boosting algorithm. IOP Conference Series. Materials Science and Engineering.
Cornuejols, A., Wemmert, C., Gancarski, P., & Bennani, Y. (2018). Collaborative clustering: Why, when, what
and how. Information Fusion, 39, 81–95. doi:10.1016/j.inffus.2017.04.008
Cui, X., Liu, Y., Zhang, Y., & Wang, C. (2018). Tire defects classification with multi-contrast convolutional
neural networks. International Journal of Pattern Recognition and Artificial Intelligence, 32(4), 1850011.
doi:10.1142/S0218001418500118
D’Addona, D. M., & Teti, R. (2013). Image data processing via neural networks for tool wear prediction. Procedia
CIRP, 12, 252–257. doi:10.1016/j.procir.2013.09.044
Das, N., Kalita, K., Boruah, P. K., & Sarma, U. (2018). Prediction of moisture loss in withering process of tea
manufacturing using artificial neural network. IEEE Transactions on Instrumentation and Measurement, 67(1),
175–184. doi:10.1109/TIM.2017.2754818
Deng, Z., Zhu, X., Cheng, D., Zong, M., & Zhang, S. (2017). Efficient kNN classification algorithm for big data. Neurocomputing, 195, 143–148.
Ding, C., Wang, D., Ma, X., & Li, H. (2016). Predicting short-term subway ridership and prioritizing its influential
factors using gradient boosting decision trees. Sustainability, 8(11), 1100. doi:10.3390/su8111100
Dominguez-Morales, J. P., Jimenez-Fernandez, A. F., Dominguez-Morales, M. J., & Jimenez-Moreno, G. (2018).
Deep neural networks for the recognition and classification of heart murmurs using neuromorphic auditory sensors.
IEEE Transactions on Biomedical Circuits and Systems, 12(1), 24–34. doi:10.1109/TBCAS.2017.2751545
PMID:28952948
Dreiseitl, S., & Ohno-Machado, L. (2002). Logistic regression and artificial neural network classification models: A methodology review. Journal of Biomedical Informatics, 35(5–6), 352–359. doi:10.1016/S1532-0464(03)00034-0 PMID:12968784
Duca, A. L., Bacciu, C., & Marchetti, A. (2017). A K-nearest neighbor classifier for ship route prediction. In
Proceedings of the OCEANS 2017, Aberdeen, UK, June 19-22. doi:10.1109/OCEANSE.2017.8084635
Dudek, G., & Pelka, P. (2017). Forecasting monthly electricity demand using k nearest neighbor method. Przeglad
Elektrotechniczny, 93(4), 62–65.
du Jardin, P. (2017). Dynamics of firm financial evolution and bankruptcy prediction. Expert Systems with
Applications, 75, 25–43. doi:10.1016/j.eswa.2017.01.016
El Khiyari, H., & Wechsler, H. (2017). Age invariant face recognition using convolutional neural networks and
set distances. Journal of Information Security, 8(3), 174–185. doi:10.4236/jis.2017.83012
Emsia, E., & Coskuner, C. (2016). Economic Growth Prediction Using Optimized Support Vector Machines.
Computational Economics, 48(3), 453–462. doi:10.1007/s10614-015-9528-1
Fisher, W. D., Camp, T. K., & Krzhizhanovskaya, V. V. (2017). Anomaly detection in earth dam and levee
passive seismic data using support vector machines and automatic feature selection. Journal of Computational
Science, 20, 143–153. doi:10.1016/j.jocs.2016.11.016
Gerassis, S., Saavedra, A., Garcia, J. F., Martin, J. E., & Taboada, J. (2017). Risk analysis in tunnel construction
with Bayesian networks using mutual information for safety policy decisions. WSEAS Transactions on Business
and Economics, 14, 215–224.
Ghaddar, B., & Naoum-Sawaya, J. (2018). High dimensional data classification and feature selection using support
vector machines. European Journal of Operational Research, 265(3), 993–1004. doi:10.1016/j.ejor.2017.08.040
Han, W., Borges, J., Neumayer, P., Ding, Y., Riedel, T., & Beigl, M. (2017). Interestingness classification of
association rules for master data. In ICDM 2017: Advances in Data Mining. Applications and Theoretical
Aspects (pp. 237-245).
Hart, P. E., Stork, D. G., & Duda, R. O. (2003). Pattern classification (2nd ed.). John Wiley and Sons, Inc.
Heckerman, D. (1997). Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1),
79–119. doi:10.1023/A:1009730122752
Henry, F., Herwindiati, D. E., Mulyono, S., & Hendryli, J. (2017). Sugarcane Land Classification with Satellite
Imagery using Logistic Regression Model. IOP Conference Series. Materials Science and Engineering.
Ho, S. H., Speldewinde, P., & Cook, A. (2017). Predicting arboviral disease emergence using Bayesian networks:
A case study of dengue virus in Western Australia. Epidemiology and Infection, 145(1), 54–66. doi:10.1017/
S0950268816002090 PMID:27620510
Hosmer, D. W. Jr, Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (3rd ed.). USA: Wiley.
doi:10.1002/9781118548387
Hotho, A., Nürnberger, A., & Paaß, G. (2005). A brief survey of text mining. GLDV Journal for Computational
Linguistics and Language Technology. Retrieved from http://www.kde.cs.uni-kassel.de/hotho/pub/2005/
hotho05TextMining.pdf
Hou, H.-R., Meng, Q.-H., Zeng, M., & Sun, B. (2018). Improving classification of slow cortical potential signals
for BCI systems with polynomial fitting and voting support vector machine. IEEE Signal Processing Letters,
25(2), 283–287. doi:10.1109/LSP.2017.2783351
Hou, J., Liu, W., Xu, E., & Cui, H. (2016). Towards parameter-independent data clustering and image
segmentation. Pattern Recognition, 60, 25–36. doi:10.1016/j.patcog.2016.04.015
Huang, B., Liu, Z., Chen, J., Liu, A., Liu, Q., & He, Q. (2017). Behavior pattern clustering in blockchain networks.
Multimedia Tools and Applications, 76(19), 20099–20110. doi:10.1007/s11042-017-4396-4
Ican, O., & Çelik, T. B. (2017). Stock market prediction performance of neural networks: A literature review.
International Journal of Economics & Finance, 9(11), 100–108. doi:10.5539/ijef.v9n11p100
Iturriaga, F. J. L., & Sanz, I. P. (2015). Bankruptcy visualization and prediction using neural networks: A study
of U.S. commercial banks. Expert Systems with Applications, 42(6), 2857–2869. doi:10.1016/j.eswa.2014.11.025
Jabeur, S. B. (2017). Bankruptcy prediction using Partial Least Squares Logistic Regression. Journal of Retailing
and Consumer Services, 36, 197–202. doi:10.1016/j.jretconser.2017.02.005
Jain, A. K. (2010). Data clustering: 50 years beyond K-means. Pattern Recognition Letters, 31(8), 651–666.
doi:10.1016/j.patrec.2009.09.011
Jamshidi, A., Ait-Kadi, D., & Ruiz, A. (2017). An advanced dynamic risk modeling and analysis in projects management. Journal of Modern Project Management, (May-August), 6-11.
Jiang, Y.-G., Wu, Z., Wang, J., Xue, X., & Chang, S.-F. (2018). Exploiting feature and class relationships in
video categorization with regularized deep neural networks. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 40(2), 352–364. doi:10.1109/TPAMI.2017.2670560 PMID:28221992
John, A., Yang, Z., Riahi, R., & Wang, J. (2016). A risk assessment approach to improve the resilience of a
seaport system using Bayesian networks. Ocean Engineering, 111, 136–147. doi:10.1016/j.oceaneng.2015.10.048
Joo, J. H., Bang, S. W., & Park, G. D. (2016). Implementation of a recommendation system using association rules and collaborative filtering. Procedia Computer Science, 91, 944–952. doi:10.1016/j.procs.2016.07.115
Juneja, N. (2015). What are the disadvantages of using a decision tree for classification? Quora. Retrieved from
https://www.quora.com/What-are-the-disadvantages-of-using-a-decision-tree-for-classification
Kabir, M. H. (2016). Data mining framework for generating sales decision making information using association
rules. International Journal of Advanced Computer Science and Applications, 7(5), 378–385.
Kaminski, B., Jakubczyk, M., & Szufel, P. (2018). A framework for sensitivity analysis of decision trees. Central
European Journal of Operations Research, 26(1), 135–159. doi:10.1007/s10100-017-0479-6 PMID:29375266
Kamran, M., Haider, S. A., Akram, T., Naqvi, S. R., & He, S. K. (2016). Prediction of IV curves for a
superconducting thin film using artificial neural networks. Superlattices and Microstructures, 95, 88–94.
doi:10.1016/j.spmi.2016.04.018
Kanes, R., Ramirez Marengo, M. C., Abdel-Moati, H., Cranefield, J., & Vechot, L. (2017). Developing a
framework for dynamic risk assessment using Bayesian networks and reliability data. Journal of Loss Prevention
in the Process Industries, 50, 142–153. doi:10.1016/j.jlp.2017.09.011
Khlif, A., & Mignotte, M. (2017). Segmentation data visualizing and clustering. Multimedia Tools and
Applications, 76(1), 1531–1552. doi:10.1007/s11042-015-3148-6
Kim, H.-J., Jo, N.-O., & Shin, K.-S. (2016). Optimization of cluster-based evolutionary undersampling for the
artificial neural networks in corporate bankruptcy prediction. Expert Systems with Applications, 59, 226–234.
doi:10.1016/j.eswa.2016.04.027
Koelbl, M. (2018). What are the disadvantage of clustering in data mining? Quora. Retrieved March 9, 2018, from https://www.quora.com/What-are-the-disadvantage-of-clustering-in-data-mining
International Journal of Strategic Information Technology and Applications
Volume 9 • Issue 1 • January-March 2018
Kourou, K., Rigas, G., Exarchos, K. P., Papaloukas, C., & Fotiadis, D. I. (2016). Prediction of oral cancer
recurrence using dynamic Bayesian networks. In Proceedings of the 2016 IEEE 38th Annual International
Conference of the Engineering in Medicine and Biology Society (EMBC), Orlando, FL, August 16-20. doi:10.1109/
EMBC.2016.7591917
Koziarski, M., & Cyganek, B. (2017). Image recognition with deep neural networks in presence of noise -
Dealing with and taking advantage of distortions. Integrated Computer-Aided Engineering, 24(4), 337–349.
doi:10.3233/ICA-170551
Lawi, A., La Wungo, S., & Manjang, S. (2017). Identifying irregularity electricity usage of customer behaviors
using logistic regression and linear discriminant analysis. In Proceedings of the 2017 3rd International
Conference on Science in Information Technology (ICSITech), Bandung, Indonesia, October 25-26. doi:10.1109/
ICSITech.2017.8257174
Lee, I., Kwak, M., & Han, D. (2016). A dynamic k-nearest neighbor method for WLAN-based position systems. Journal of Computer Information Systems, 56(4), 295–300. doi:10.1080/08874417.2016.1164000
Leon Blanco, J. M., Gonzalez-R, P. L., Arroyo Garcia, C. M., Cozar-Bernal, M. J., Calle Suarez, M., Canca Ortiz, D., & Gonzalez Rodriguez, M. L. et al. (2018). Artificial neural networks as alternative tool for minimizing error predictions in manufacturing ultra deformable nanoliposome formulations. Drug Development and Industrial Pharmacy, 44(1), 135–143. doi:10.1080/03639045.2017.1386201 PMID:28967285
Levashenko, V., Zaitseva, E., Kvassay, M., & Deserno, T. M. (2016). Reliability estimation of healthcare
systems using Fuzzy Decision Trees. In Proceedings of the 2016 Federated Conference on Computer Science
and Information Systems (FedCSIS), Gdansk, Poland, September 11-14.
Li, B., Liu, B., Lin, W., & Zhang, Y. (2017). Performance analysis of clustering algorithm under two kinds of
big data architecture. Journal of High Speed Networks, 23(1), 49–57. doi:10.3233/JHS-170556
Li, J., Li, M., Wu, D., Dai, Q., & Song, H. (2016). A Bayesian networks-based risk identification approach
for software process risk: The context of Chinese trustworthy software. International Journal of Information
Technology & Decision Making, 15(6), 1391–1412. doi:10.1142/S0219622016500401
Li, N., Feng, X., & Jimenez, R. (2017). Predicting rock burst hazard with incomplete data using Bayesian
networks. Tunnelling and Underground Space Technology, 61, 61–70. doi:10.1016/j.tust.2016.09.010
Li, R. (2015). Top 10 data mining algorithms, explained. KDnuggets News. Retrieved from http://www.kdnuggets.
com/2015/05/top-10-data-mining-algorithms-explained.html
Li, X., Chen, G., & Zhu, H. (2016). Quantitative risk analysis on leakage failure of submarine oil and gas pipelines using Bayesian network. Process Safety & Environmental Protection: Transactions of the Institution of Chemical Engineers, 103, 163–173. doi:10.1016/j.psep.2016.06.006
Li, Y. P., Nie, S., Huang, C. Z., McBean, E. A., Fan, Y. R., & Huang, G. H. (2017). An integrated risk analysis method for planning water resource systems to support sustainable development of an arid region. Journal of Environmental Informatics, 29(1), 1–15. doi:10.3808/jei.200900148
Li, Z., Zhang, Q., & Zhao, X. (2017). Performance analysis of K-nearest neighbor, support vector machine, and
artificial neural network classifiers for driver drowsiness detection with different road geometries. International
Journal of Distributed Sensor Networks, 13(9), 1–12. doi:10.1177/1550147717733391
Liao, S.-H., Chu, P.-H., & Hsiao, P.-Y. (2012). Data mining techniques and applications – A decade review
from 2000 to 2011. Expert Systems with Applications, 39(12), 11303–11311. doi:10.1016/j.eswa.2012.02.063
Liu, S., Hu, Y., Li, C., Lu, H., & Zhang, H. (2017). Machinery condition prediction based on wavelet and support
vector machine. Journal of Intelligent Manufacturing, 28(4), 1045–1055. doi:10.1007/s10845-015-1045-5
Lockamy, A. (2017). An examination of external risk factors in Apple Inc.’s supply chain. Supply Chain Forum: An International Journal, 18(3), 177–188.
Luo, Y., Cheng, Y., Uzuner, O., Szolovits, P., & Starren, J. (2018). Segment convolutional neural networks
(Seg-CNNs) for classifying relations in clinical notes. Journal of the American Medical Informatics Association,
25(1), 93–98. doi:10.1093/jamia/ocx090 PMID:29025149
Ma, H., Gou, J., Wang, X., Ke, J., & Zeng, S. (2017). Sparse coefficient-based k-Nearest neighbor classification.
IEEE Access, 5, 16618–16634. doi:10.1109/ACCESS.2017.2739807
Ma, X., Lu, H., Gan, Z., & Zeng, J. (2017). An explicit trust and distrust clustering based collaborative filtering
recommendation approach. Electronic Commerce Research and Applications, 25, 29–39. doi:10.1016/j.
elerap.2017.06.005
Mahdavi, G., Maharluie, M. S., & Shokrolahi, A. (2017). The use of artificial neural networks for quantifying
the relative importance of the firms’ performance determinants. International Journal of Economics & Financial
Issues, 7(3), 119–127.
Mathioulakis, E., Panaras, G., & Belessiotis, V. (2018). Artificial neural networks for the performance prediction
of heat pump hot water heaters. International Journal of Sustainable Energy, 37(2), 173–192. doi:10.1080/14
786451.2016.1218495
Mikolajczyk, T., Nowicki, K., Bustillo, A., & Yu Pimenov, D. (2018). Predicting tool life in turning operations
using neural networks and image processing. Mechanical Systems and Signal Processing, 104, 503–513.
doi:10.1016/j.ymssp.2017.11.022
Moltchanova, E., Avila, R., Horn, B., Moriarty, E., & Hodson, R. (2018). Evaluating statistical model
performance in water quality prediction. Journal of Environmental Management, 206, 910–919. doi:10.1016/j.
jenvman.2017.11.049 PMID:29207304
Muralitharan, K., Sakthivel, R., & Vishnuvarthan, R. (2018). Neural network based optimization approach for
energy demand prediction in smart grid. Neurocomputing, 273, 199–208. doi:10.1016/j.neucom.2017.08.017
Ngai, E. W. T., Xiu, L., & Chau, D. C. K. (2009). Application of data mining techniques in customer relationship
management: A literature review and classification. Expert Systems with Applications, 36(2), 2592–2602.
doi:10.1016/j.eswa.2008.02.021
Nivolianitou, Z. S., Koromila, I. A., & Giannakopoulos, T. (2016). Bayesian network to predict environmental
risk of a possible ship accident. International Journal of Risk Assessment and Management, 19(3), 228–239.
doi:10.1504/IJRAM.2016.077381
Noh, S., & An, K. (2017). Risk assessment for automatic lane change maneuvers on highways. In Proceedings
of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, May 29-June 3.
doi:10.1109/ICRA.2017.7989031
Noroozian, A., Kazemzadeh, R. B., Niaki, S. T. A., & Zio, E. (2018). System risk importance analysis using Bayesian networks. International Journal of Reliability Quality and Safety Engineering, 25(1), 1–26. doi:10.1142/S0218539318500043
Nwulu, N. I. (2017). A decision trees approach to oil price prediction. In Proceedings of the 2017 International
Artificial Intelligence and Data Processing Symposium (IDAP), Malatya, Turkey, September 16-17. doi:10.1109/
IDAP.2017.8090313
Oliveira, C. M., & Pereira, D. A. (2017). An association rules based method for classifying product offers from
e-shopping. Intelligent Data Analysis, 21(3), 637–660. doi:10.3233/IDA-150444
Olson, D. L., Delen, D., & Meng, Y. (2012). Comparative analysis of data mining methods for bankruptcy
prediction. Decision Support Systems, 52(2), 464–473. doi:10.1016/j.dss.2011.10.007
Oracle Corporation. (2012). Oracle data mining concepts 11g Release 1. Retrieved from http://docs.oracle.com/
cd/B28359_01/datamine.111/b28129/algo_svm.htm
Ougiaroglou, S., Diamantaras, K. I., & Evangelidis, G. (2018). Exploring the effect of data reduction on
Neural Network and Support Vector Machine classification. Neurocomputing, 280, 101–110. doi:10.1016/j.
neucom.2017.08.076
Park, Y. W., & Baskiyar, S. (2017). Adaptive scheduling on heterogeneous systems using support vector machine.
Computing, 99(4), 405–425. doi:10.1007/s00607-016-0513-x
Peralta, D., Triguero, I., Garcia, S., Saeys, Y., Benitez, J. M., & Herrera, F. (2018). On the use of convolutional
neural networks for robust classification of multiple fingerprint captures. International Journal of Intelligent
Systems, 33(1), 213–230. doi:10.1002/int.21948
Petre, R. (2013). Data mining solutions for the business environment. Database Systems Journal, 4(4), 21–29.
Phyu, T. N. (2009). Survey of classification techniques in data mining. In Proceedings of the International
Multi-Conference of Engineers and Computer Scientists, IMECS 2009, Hong Kong, March 18 - 20.
Pozi, M. S. M., Sulaiman, M. N., Mustapha, N., & Perumal, T. (2016). Improving anomalous rare attack detection
rate for intrusion detection system using support vector machine and genetic programming. Neural Processing
Letters, 44(2), 279–290. doi:10.1007/s11063-015-9457-y
Pradhan, C. (2016). What are the limitations of Neural Networks? Quora. Retrieved from https://www.quora.
com/What-are-the-limitations-of-Neural-Networks
Purnamaningtyas, E., & Utami, E. (2017). Implementation of k-nearest neighbor algorithm analysis in predicting
regular hajj applicant failure. Journal of Theoretical and Applied Information Technology, 95(20), 5494–5505.
Quintana, D., Cervantes, A., Saez, Y., & Isasi, P. (2017). Clustering technique for large-scale home care crew
scheduling problems. The International Journal of Artificial Intelligence, Neural Networks, and Complex
Problem-Solving Technologies, 47(2), 443–455.
Quora. (2017). Are association rules still a useful technique? Quora. Retrieved from https://www.quora.com/
Are-association-rules-still-a-useful-technique
Quora. (2014). What are the pros and cons of neural networks from a practical perspective? Retrieved from
https://www.quora.com/What-are-the-pros-and-cons-of-neural-networks-from-a-practical-perspective-Personal-
comments-from-heavy-users-welcome
Quora. (2017). Why is SVM not popular nowadays? Also, when did SVM perform poorly? Retrieved from https://
www.quora.com/Why-is-SVM-not-popular-nowadays-Also-when-did-SVM-perform-poorly
Rahman, N. (2018a). A taxonomy of data mining problems. International Journal of Business Analytics (IJBAN), 5(2), 73–86.
Rahman, N. (2018b). Data mining problems classification and techniques. International Journal of Big Data and Analytics in Healthcare (IJBDAH), 3(1), 38–57.
Rama, K., Shekhar, S., Kiran, J., Rau, R., Pritchett, S., Bhandari, A., & Chitalia, P. (2016). List price optimization using customized decision trees. In Machine Learning and Data Mining in Pattern Recognition (pp. 88–97).
Raschka, S. (2016). What are the pros and cons of using logistic regression with one binary outcome and several
binary predictors? Quora. Retrieved from https://www.quora.com/What-are-the-pros-and-cons-of-using-logistic-
regression-with-one-binary-outcome-and-several-binary-predictors
Razanamahandry, L. C., Andrianisa, H. A., Karoui, H., Podgorski, J., & Yacouba, H. (2018). Prediction model
for cyanide soil pollution in artisanal gold mining area by using logistic regression. Catena, 162, 40–50.
Ristolainen, K. (2018). Predicting banking crises with artificial neural networks: The role of nonlinearity and
heterogeneity. The Scandinavian Journal of Economics, 120(1), 31–62. doi:10.1111/sjoe.12216
Rodriguez, M. Z., Comin, C. H., Casanova, D., Bruno, O. M., Amancio, D. R., Rodrigues, F. A., & Costa, L. D. F. (2016). Clustering algorithms: A comparative approach.
Ross, P. (2000). Rule induction: Ross Quinlan’s ID3 algorithm. Retrieved from http://www.soc.napier.ac.uk/~peter/
vldb/dm/node11.html
Rouse, M. (2011). Association rules (in data mining). Retrieved from http://searchbusinessanalytics.techtarget.
com/definition/association-rules-in-data-mining
Salmam, F. Z., Madani, A., & Kissi, M. (2016). Facial expression recognition using decision trees. In Proceedings
of the 2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV), Beni
Mellal, Morocco, March 29-April 1. doi:10.1109/CGiV.2016.33
Santos, K. C. P., & Barrios, E. B. (2017). Improving predictive accuracy of logistic regression model using
ranked set samples. Communications in Statistics. Simulation and Computation, 46(1), 78–90. doi:10.1080/0
3610918.2014.955113
Sapankevych, N. I., & Sankar, R. (2009). Time series prediction using support vector machines: A survey. IEEE Computational Intelligence Magazine, 4(2), 24–38. doi:10.1109/MCI.2009.932254
Satyanarayana, S., Kumar, P. S., & Sridevi, G. (2017). Improved process scheduling in real-time operating systems using support vector machines. In Proceedings of the 2nd International Conference on Micro-Electronics, Electromagnetics and Telecommunications (pp. 603–611).
Saxena, A., Prasad, M., Gupta, A., Bharill, N., Patel, O. P., Tiwari, A., & Lin, C.-T. et al. (2017). A review of
clustering techniques and developments. Neurocomputing, 267, 664–681. doi:10.1016/j.neucom.2017.06.053
Scalabrin, M., Gadaleta, M., Bonetto, R., & Rossi, M. (2017). A Bayesian forecasting and anomaly detection
framework for vehicular monitoring networks. In Proceedings of the 2017 IEEE 27th International Workshop
on Machine Learning for Signal Processing (MLSP), Tokyo, Japan, September 25-28. doi:10.1109/
MLSP.2017.8168151
Shahbaba, M., & Beheshti, S. (2016). Signature test as statistical testing in clustering. Signal, Image and Video
Processing, 10(7), 1343–1351. doi:10.1007/s11760-016-0926-1
Simeunovic, N., Kamenko, I., Bugarski, V., Jovanovic, M., & Lalic, B. (2017). Improving workforce scheduling
using artificial neural networks model. Advances in Production Engineering & Management, 12(4), 337–352.
doi:10.14743/apem2017.4.262
Su, X., & Khoshgoftaar, T. M. (2009). A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 1–19. doi:10.1155/2009/421425
Tang, C., Xiang, Y., Wang, Y., Qian, J., & Qiang, B. (2016). Detection and classification of anomaly intrusion
using hierarchy clustering and SVM. Security and Communication Networks, 9(16), 3401–3411. doi:10.1002/
sec.1547
Tang, Y., Ji, J., Gao, S., Dai, H., Yu, Y., & Todo, Y. (2018). A pruning neural network model in credit classification analysis. Computational and Mathematical Methods in Medicine, 21–22. PMID:29606961
Tarvin, T. R. (2017). Combatting professional error in bankruptcy analysis through the design and use of decision
trees in clinical pedagogy. St. John’s Law Review, 91(2), 427–504.
Tavana, M., Abtahi, A.-R., Di Caprio, D., & Poortarigh, M. (2018). An Artificial Neural Network and Bayesian
Network model for liquidity risk assessment in banking. Neurocomputing, 275, 2525–2554. doi:10.1016/j.
neucom.2017.11.034
Thenmozhi, M., & Chand, G. S. (2016). Forecasting stock returns based on information transmission across
global markets using support vector machines. Neural Computing & Applications, 27(4), 805–824. doi:10.1007/
s00521-015-1897-9
Tosun, A., Bener, A. B., & Akbarinasaji, S. (2017). A systematic literature review on the applications of Bayesian
networks to predict software quality. Software Quality Journal, 25(1), 273–305. doi:10.1007/s11219-015-9297-z
Triepels, R., Daniels, H., & Feelders, A. (2018). Data-driven fraud detection in international shipping. Expert
Systems with Applications, 99, 193–202. doi:10.1016/j.eswa.2018.01.007
Tsai, F.-M., & Huang, L. J. W. (2017). Using artificial neural networks to predict container flows between the
major ports of Asia. International Journal of Production Research, 55(17), 5001–5010. doi:10.1080/0020754
3.2015.1112046
Tylman, W., Waszyrowski, T., Napieralski, A., Kaminski, M., Trafidlo, T., Kulesza, Z., & Wenerski, M. et al.
(2016). Real-time prediction of acute cardiovascular events using hardware-implemented Bayesian networks.
Computers in Biology and Medicine, 69, 245–253. doi:10.1016/j.compbiomed.2015.08.015 PMID:26456181
Vaidya, A. (2017). Predictive and probabilistic approach using logistic regression: application to prediction of
loan approval. In Proceedings of the 2017 8th International Conference on Computing, Communication and
Networking Technologies (ICCCNT), Delhi, India, July 3-5. doi:10.1109/ICCCNT.2017.8203946
Valle, M. A., Ruz, G. A., & Morras, R. (2018). Market basket analysis: Complementing association rules with
minimum spanning trees. Expert Systems with Applications, 97, 146–162. doi:10.1016/j.eswa.2017.12.028
Varshney, D., Kumar, S., & Gupta, V. (2017). Predicting information diffusion probabilities in social networks:
A Bayesian networks based approach. Knowledge-Based Systems, 133, 66–76. doi:10.1016/j.knosys.2017.07.003
Villarrubia, G., De Paz, J. F., Chamoso, P., & De la Prieta, F. (2018). Artificial neural networks used in optimization problems. Neurocomputing, 272, 10–16. doi:10.1016/j.neucom.2017.04.075
Watanabe, T., Monden, A., Kamei, Y. K., & Morisaki, S. (2016). Identifying recurring association rules in
software defect prediction. In Proceedings of the 2016 IEEE/ACIS 15th International Conference on Computer
and Information Science (ICIS), Okayama, Japan, June 26-29. doi:10.1109/ICIS.2016.7550867
Widodo, A., & Handoyo, S. (2017). The classification performance using logistic regression and support vector
machine (SVM). Journal of Theoretical and Applied Information Technology, 95(19), 5184–5193.
Williams, D. A. (2016). Can Neural networks predict business failure? Evidence from small high tech firms in
the U.K. Journal of Developmental Entrepreneurship, 21(1), 1–17. doi:10.1142/S1084946716500059
Wu, X., Kumar, V., Quinlan, J. R., Ghosh, J., Yang, Q., Motoda, H., & Steinberg, D. et al. (2008). Top 10
algorithms in data mining. Knowledge and Information Systems, 14(1), 1–37. doi:10.1007/s10115-007-0114-2
Wu, X., Zhu, X., Wu, G.-Q., & Ding, W. (2014). Data mining with big data. IEEE Transactions on Knowledge
and Data Engineering, 26(1), 97–107. doi:10.1109/TKDE.2013.109
Xia, Y., Nie, L., Zhang, L., Yang, Y., Hong, R., & Li, X. (2016). Weakly supervised multilabel clustering and its applications in computer vision. IEEE Transactions on Cybernetics, 46(12), 3220–3232. doi:10.1109/TCYB.2015.2501385 PMID:27046858
Xu, G., Shen, C., Liu, M., Zhang, F., & Shen, W. (2017). A user behavior prediction model based on parallel
neural network and k-nearest neighbor algorithms. Cluster Computing, 20(2), 1703–1715. doi:10.1007/s10586-
017-0749-z
Yan, L., Huang, Z., Zhang, Y., Zhang, L., Zhu, D., & Ran, B. (2017). Driving risk status prediction using Bayesian
networks and logistic regression. Intelligent Transport Systems, 11(7), 431–439. doi:10.1049/iet-its.2016.0207
Yassouridis, C., & Leisch, F. (2017). Benchmarking different clustering algorithms on functional data. Advances
in Data Analysis and Classification, 11(3), 467–492. doi:10.1007/s11634-016-0261-y
Yeo, B., & Grant, D. (2018). Predicting service industry performance using decision tree analysis. International Journal of Information Management, 38(1), 288–300. doi:10.1016/j.ijinfomgt.2017.10.002
Yuan, C., & Malone, B. (2013). Learning optimal Bayesian networks: A shortest path perspective. Journal of Artificial Intelligence Research, 48, 23–65. doi:10.1613/jair.4039
Yuan, S., Huang, H., & Wu, L. (2016). Use of word clustering to improve emotion recognition from short text. Journal of Computing Science and Engineering, 10(4), 103–110. doi:10.5626/JCSE.2016.10.4.103
Yusra, M. F., Trilaksono, B. R., Yendra, R., & Fudholi, A. (2017). Music interest classification of twitter users
using support vector machine. Journal of Theoretical and Applied Information Technology, 95(11), 2352–2358.
Zetai, W., Rengin, C., Shuyan, X., Xiaosi, W., & Yuli, F. (2017). Research and improvement of WiFi positioning
based on k nearest neighbor method. Computer Engineering, 43(3), 289–293.
Zhang, D., Lee, K., & Lee, I. (2018). Hierarchical trajectory clustering for spatio-temporal periodic pattern
mining. Expert Systems with Applications, 92, 1–11. doi:10.1016/j.eswa.2017.09.040
Zhang, H., Raitoharju, J., Kiranyaz, S., & Gabbouj, M. (2016). Limited random walk algorithm for big graph
data clustering. Journal of Big Data, 3(26), 1–22. doi:10.1186/s40537-016-0060-5
Zhang, X., Ding, S., & Xue, Y. (2017). An improved multiple birth support vector machine for pattern
classification. Neurocomputing, 225, 119–128. doi:10.1016/j.neucom.2016.11.006
Zhang, X.-D., Li, A., & Pan, R. (2016). Stock trend prediction based on a new status box method and AdaBoost
probabilistic support vector machine. Applied Soft Computing, 49, 385–398. doi:10.1016/j.asoc.2016.08.026
Zhou, L., Si, Y.-W., & Fujita, H. (2017). Predicting the listing statuses of Chinese-listed companies using
decision trees combined with an improved filter feature selection method. Knowledge-Based Systems, 128,
93–101. doi:10.1016/j.knosys.2017.05.003
Nayem Rahman is an Information Technology (IT) professional. He has implemented several large projects using data warehousing and big data technologies. He is currently working toward a Ph.D. in the Department of Engineering and Technology Management at Portland State University, USA. He holds an M.S. in Systems Science (Modeling & Simulation) from Portland State University, Oregon, USA, and an MBA in Management Information Systems (MIS), Project Management, and Marketing from Wright State University, Ohio, USA. His most recent publications appeared in the International Journal of Big Data and Analytics in Healthcare (IJBDAH). His principal research interests include Big Data Analytics, Big Data Technology Acceptance, Data Mining for Business Intelligence, and Simulation-based Decision Support Systems (DSS).