Comparative Analysis of Meta-Heuristic Feature Selection and Feature Extraction Approaches For Enhanced Chronic Kidney Disease Prediction
Abstract—Chronic Kidney Disease (CKD) has garnered significant attention over the past decades, primarily due to its lack of symptoms in the early stages. The objective of this research paper is to evaluate and contrast the effects of various feature extraction methods, such as Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), and Independent Component Analysis (ICA), and meta-heuristic feature selection methods like Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and Artificial Bee Colony (ABC). The classification models employed for evaluating the selected features include Artificial Neural Network (ANN), Random Forest Classifier (RF), Multilayer Perceptron Classifier (MLP), and K-Nearest Neighbors (KNN). The issue of overfitting and underfitting has been addressed. The results are reported as accuracy on both the training and testing sets and as AUC-ROC scores, which have been visualized. We found that the meta-heuristic feature selection algorithms improve the performance of the models drastically compared to feature extraction techniques.

Keywords—chronic kidney disease, prediction, deep learning, machine learning, meta-heuristic feature selection, feature extraction.

I. INTRODUCTION

CKD silently impacts millions globally, necessitating early detection and intervention [1]. Globally, in 2017, 1.2 million people died from CKD [17]. Traditional diagnostic markers often miss early signs, but deep learning, leveraging extensive patient data, reveals elusive patterns. With advancements in data analytics and computing, machine learning enhances our understanding of CKD risk factors, enabling early detection and management, potentially transforming lives [10]. Meta-heuristic algorithms for feature selection have proven helpful in building robust models for disease prediction [11]. Our work compares meta-heuristic feature selection algorithms (PSO, ACO, ABC) and feature extraction algorithms (LDA, PCA, ICA) in terms of the resulting CKD prediction models' performance. These algorithms select or extract relevant features for a refined data representation, which is then used as input for the classification models (ANN, RF, MLP, KNN). All models share hyperparameters, enabling a fair evaluation of the effectiveness of the feature selection/extraction methods in improving CKD prediction accuracy and reliability. The evaluation of results in this study centres around two key metrics: the accuracy score and the Area Under the Receiver Operating Characteristic (AUC-ROC) curve. The accuracy score provides a holistic measure of the overall correctness of the predictions made by the models, while the AUC-ROC curve offers a more nuanced evaluation by considering the trade-off between true positive and false positive rates across various classification thresholds [12].

II. BACKGROUND

Numerous approaches for chronic kidney disease prediction using intelligent algorithms have been developed. Iliyas Ibrahim et al. [2] utilized a Deep Neural Network (DNN) on Bade General Hospital's dataset to predict CKD with 98% accuracy; the study highlights creatinine and bicarbonate as key attributes for effective CKD detection. Saurabh Pal [3] investigated CKD prediction using a machine learning model incorporating categorical and non-categorical attributes; the approach combines baseline classifiers with a majority voting method for a 3% accuracy improvement. Ibomoiye Domor Mienye et al. [4] introduced a novel method that integrates PSO to optimize the parameters of a Stacked Sparse Autoencoder (SSAE), tackling internal covariate shift; the SSAE network connects the last autoencoder's hidden layer to a softmax classifier. Vijendra Singh et al. [5] proposed a deep learning model for early chronic disease diagnosis that uses Recursive Feature Elimination and outperforms five classifiers (SVM, KNN, Logistic Regression, Random Forest, Naive Bayes) with 100% accuracy; surpassing recent studies, whose reported accuracies range from 85% to 98.5%, the model's perfect accuracy positions it as a promising tool for nephrologists. Hanyu Zhang et al. [6] addressed challenges in evaluating CKD patients' conditions by employing data preprocessing and ANN techniques; the study compares a classical MLP model with a LASSO-preselected MLP model, showing comparably high accuracy in mapping clinical factors to survivability. Chaity Mondol et al. [7] introduced a high-accuracy neural network method for detecting CKD, offering a promising tool for risk assessment; the study preprocesses the raw dataset to enhance detection efficiency, and the optimized neural network models (OCNN, OANN, OLSTM) outperform traditional methods, with OCNN reaching 98.75% accuracy. Manonmani M. et al. [8] enhanced the Teaching-Learning-Based Optimization (TLBO) algorithm for high-dimensional medical data analysis, addressing CKD diagnosis accuracy; the proposed ITLBO achieves a 36% reduction in features, surpassing TLBO's 25%, and experimental results show improved performance metrics, including accuracy boosts of 6.75%, 6.25%, and 4.75% for SVM, Gradient Boosting, and CNN respectively. S. Belina et al. [9] aimed for optimal predictability of CKD by combining ACO-based feature selection and an Extreme Learning Machine (ELM); the proposed ACO algorithm minimizes features efficiently, improving CKD prediction accuracy and streamlining the diagnostic process. In the above-mentioned studies, various feature selection and feature extraction methods were used to improve the accuracy of models. We have selected some of these feature extraction and selection methods to perform a comparative analysis.
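To make the wrapper idea behind such meta-heuristic selectors concrete, the sketch below uses a deliberately simplified search: greedy single-bit flips of a binary feature mask stand in for the swarm-based exploration that PSO, ACO, or ABC perform, and the data and classifier are synthetic stand-ins, not the paper's setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in data: 21 features, only a few of them informative
X, y = make_classification(n_samples=200, n_features=21, n_informative=5,
                           random_state=0)

def fitness(mask):
    # Wrapper fitness: cross-validated accuracy on the selected columns only
    if not mask.any():
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

# Greedy hill-climbing over binary feature masks: a simplified stand-in
# for the population-based search of PSO/ACO/ABC.
mask = rng.random(X.shape[1]) < 0.5
best = fitness(mask)
for _ in range(50):
    candidate = mask.copy()
    flip = rng.integers(X.shape[1])
    candidate[flip] = ~candidate[flip]   # toggle one feature in/out
    score = fitness(candidate)
    if score >= best:                    # swarm methods explore more broadly
        mask, best = candidate, score
print(mask.sum(), "features selected, CV accuracy", round(best, 3))
```

Real swarm variants keep a population of such masks and share information between them (velocities in PSO, pheromones in ACO, scout bees in ABC), but the selected-subset-as-fitness loop is the same.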
III. DATASET
The dataset used in this paper is a CKD dataset collected from the DY Patil hospital. There are a total of 22 attributes, including the class attribute. The dataset is a binary classification dataset, with 'ckd' representing the presence of disease and 'notckd' representing its absence. The dataset has 400 records of CKD patients and 400 records of non-CKD patients. The details about the attributes of the dataset are given in TABLE 1.
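A minimal sketch of loading such a table and producing the 560/240 train/test split used later in the paper; the column names and values below are placeholders, not the actual DY Patil attributes.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for the real CKD table: 400 'ckd' and 400 'notckd' records
# with two placeholder attributes (the actual dataset has 22 attributes).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "creatinine": rng.normal(1.2, 0.4, 800),
    "bicarbonate": rng.normal(24.0, 3.0, 800),
    "class": ["ckd"] * 400 + ["notckd"] * 400,
})

X = df.drop(columns="class")
y = (df["class"] == "ckd").astype(int)   # 1 = ckd, 0 = notckd

# Stratified split matching the 560 train / 240 test record counts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
print(len(X_train), len(X_test))  # 560 240
```

Stratifying on the class keeps the 400/400 balance intact in both partitions.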
$x_{kj}^{t}$ is a randomly selected bee and $\phi_{ij}$ is a random number in $[-1, 1]$.

C. Models

ANN is a computational model that draws inspiration from the architecture and operations of the human brain. An artificial neural network (ANN) is made up of layers of networked nodes, or neurons, that process information through weighted connections to convert input signals into output. The network learns by fine-tuning these weights during training, which improves its capacity for precise classifications or predictions [6]. The network is composed of input and output layers, with hidden layers enabling intricate representations. ANNs are useful for a variety of tasks, such as pattern recognition and regression, since they are excellent at identifying complex patterns and relationships in data.

RF is an ensemble learning method used for classification and regression tasks. During training, it builds several decision trees and combines their predictions to increase overall resilience and accuracy [5]. A random subset of the data is used to train each tree, and the final prediction is either the average (regression) or the majority vote (classification) of the individual trees' predictions. By comparing the relative importance of each variable among the trees in the forest, Random Forest reduces overfitting, manages noisy data effectively, and offers insights into feature relevance.

Multi-Layer Perceptron is an artificial neural network containing an input layer, one or more hidden layers, and an output layer [13]. Each connection between neurons in one layer and those in a subsequent layer is weighted. The network modifies these weights during training in order to identify patterns in the incoming data. Each neuron's activation function adds non-linearity, which enables the network to simulate intricate interactions. Because of their adaptability, MLPs can be applied to a wide range of tasks, such as regression and classification, by varying the number of layers and neurons in relation to the difficulty of the problem.

KNN is a simple and intuitive machine learning algorithm used for classification and regression tasks. To make a prediction, a data point is either classified into the majority class of its K nearest neighbours (classification) or assigned the average of their values (regression) in the feature space [14]. The number of neighbours taken into account depends on the selection of K. KNN is predicated on the idea that comparable instances lie close to one another in the feature space.

The AUC-ROC score is computed from the true positive rate (TPR) and false positive rate (FPR) using the trapezoidal rule:

$AUC = \sum_{i=1}^{n-1} \frac{(TPR_{i+1} + TPR_i)(FPR_{i+1} - FPR_i)}{2}$  (6)

$FPR = \frac{FP}{FP + TN}$  (7)

In all the classification models, the dataset is split into training and testing sets: among the 800 records, 560 (70%) are used for training and 240 (30%) for testing.

Initially, the base models, which are ANN, RF, MLP, and KNN, are trained using all the features in the dataset. The AUC-ROC curves for all the base models are given in Figure 2.

Figure 2. AUC-ROC curves for the base models

The following are a few important parameters for each of the base models. The parameters for ANN are given in TABLE 3.

TABLE 3: Parameters for ANN

Parameter      Value
optimizer      adam
dropout rate   0.5
no. of layers  3
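The base-model training and AUC-ROC evaluation described in this section can be sketched with scikit-learn as follows. The data here is a synthetic stand-in, single values are picked from the search ranges given in the RF and KNN parameter tables, and the Keras-style ANN (optimizer, dropout rate, number of layers) is omitted because dropout is not part of scikit-learn's MLP.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_curve

# Synthetic stand-in for the 800-record, 21-feature CKD dataset
X, y = make_classification(n_samples=800, n_features=21, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    # RF: single values chosen from the [10,200] / [1,20] search intervals
    "RF": RandomForestClassifier(n_estimators=100, max_depth=10,
                                 criterion="gini", random_state=0),
    # MLP: hidden_layer_size 50, max_iter 1000, solver adam, activation relu
    "MLP": MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000,
                         solver="adam", activation="relu", random_state=0),
    # KNN: K chosen from the [1,30] interval
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    fpr, tpr, _ = roc_curve(y_te, proba)  # FPR = FP/(FP+TN), as in Eq. (7)
    # Trapezoidal rule over the ROC curve, matching Eq. (6)
    auc = np.sum((tpr[1:] + tpr[:-1]) * (fpr[1:] - fpr[:-1]) / 2)
    print(f"{name}: AUC = {auc:.3f}")
```

The hand-written trapezoidal sum is numerically identical to scikit-learn's own `roc_auc_score`, which uses the same rule.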
The key parameters for RF are n_estimators, max_depth, and criterion; their values are given in TABLE 4.

TABLE 4: Parameters for RF

Parameter      Value
n_estimators   [10,200]
max_depth      [1,20]
criterion      ['gini','entropy']

The parameters for MLP are:

Parameter          Value
hidden_layer_size  50
max_iter           1000
solver             adam
activation         relu

The key parameter for KNN is:

Parameter    Value
n_neighbors  [1,30]

…selected by using feature extraction algorithms. PCA is the best performing feature extraction algorithm.

Figure 3. AUC-ROC for LDA
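The feature-extraction comparison (LDA vs. PCA vs. ICA feeding a common classifier) might be set up as sketched below; the data is a synthetic stand-in and the component counts are illustrative, not values taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA, FastICA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=800, n_features=21, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

# LDA on a binary target yields at most 1 component;
# the PCA/ICA component counts are illustrative choices.
extractors = {
    "LDA": LinearDiscriminantAnalysis(n_components=1),
    "PCA": PCA(n_components=10),
    "ICA": FastICA(n_components=10, random_state=0),
}

for name, ext in extractors.items():
    # LDA is supervised (uses the labels); PCA and ICA are unsupervised
    Z_tr = ext.fit_transform(X_tr, y_tr) if name == "LDA" else ext.fit_transform(X_tr)
    Z_te = ext.transform(X_te)
    clf = KNeighborsClassifier(n_neighbors=5).fit(Z_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(Z_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```

Fitting each extractor on the training split only, then applying the learned transform to the test split, keeps the comparison free of test-set leakage.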