Methods For Class-Imbalanced Learning With Support Vector Machines - A Review and An Empirical Evaluation
Abstract
This paper presents a review on methods for class-imbalanced learn-
ing with the Support Vector Machine (SVM) and its variants. We first
explain the structure of SVM and its variants and discuss their inef-
ficiency in learning with class-imbalanced data sets. We introduce a
hierarchical categorization of SVM-based models with respect to class-
imbalanced learning. Specifically, we categorize SVM-based models into
re-sampling, algorithmic, and fusion methods, and discuss the prin-
ciples of the representative models in each category. In addition, we
conduct a series of empirical evaluations to compare the performances
of various representative SVM-based models in each category using
benchmark imbalanced data sets, ranging from low to high imbalanced
ratios. Our findings reveal that while algorithmic methods are less time-consuming owing to no data pre-processing requirements, fusion methods, which combine both re-sampling and algorithmic approaches, generally achieve better classification performance at the cost of training multiple models.
1 Introduction
Learning from class-imbalanced data, i.e., data in which one or more (majority) classes contain far more samples than the other (minority) classes, poses a major challenge in supervised learning. In many real-world applications, such as medical diagnosis [1], credit card fraud detection [2], condition monitoring systems [3] and natural disasters [4], it is difficult and expensive to obtain data samples of the minority (e.g., abnormal or rare) classes [5–7].
As such, the number of samples belonging to the majority or normal classes
is often larger than that of the minority classes. Standard supervised classification algorithms, which work well with less skewed data distributions, can become biased towards the majority classes, thereby failing to yield favorable accuracy scores for the minority classes [8].
Over the years, many methods have been developed to enhance the per-
formance of standard classification algorithms in learning class-imbalanced
data. In general, the available methods to solve this issue can be catego-
rized into three groups: re-sampling, algorithmic, and fusion methods [9].
Re-sampling methods, which focus on data pre-processing, change the prior
distributions of both minority and majority classes [10]. Algorithmic methods
modify the algorithm structure to enhance their robustness in learning imbal-
anced data [11]. Fusion methods combine different techniques or ensemble
models for addressing class imbalanced data [12].
The support vector machine (SVM) [13, 14] is an effective kernel-
based learning algorithm for solving data classification problems. It aims to
find a hyperplane to optimally separate data samples with maximal mar-
gins [15–17]. Due to its structural risk minimization capability, SVM can
effectively reduce overfitting and avoid local minima [18]. Since its introduc-
tion, many SVM variants have been proposed, e.g., fuzzy SVM (FSVM) [19],
twin SVM (TSVM) [20], fuzzy TSVM (FTSVM) [21], intuitionistic FTSVM (IFTSVM) [16], etc. Despite their success in handling balanced data sets, their performance is severely affected by imbalanced data distributions [22]. This is caused by treating all samples equally and ignoring the differences between the majority and minority classes. As a result, a decision boundary biased toward the majority class samples is established when the underlying SVM models learn from imbalanced data.
There are several reviews of imbalanced classification methods in the litera-
ture [6, 23–26]. In 2004, the study in [23] discussed the weaknesses of traditional
SVM models in solving imbalanced classification problems and explained why
the conventional under-sampling methods are not the best choice. In 2009,
the study in [6] provided a critical review of imbalanced learning methods
along with several assessment metrics. Moreover, challenging issues and future
research directions were presented. In 2012, the study in [24] introduced a
taxonomy of ensemble-based methods to address class imbalanced issues and
provided empirical evaluations with respect to the representative ones of each
category. On the other hand, the study in [25] presented an in-depth review
of rare event detection in imbalanced learning problems between 2006 and
2016. In 2013, the study in [27] reviewed SVM-based re-sampling and algo-
rithmic methods. These surveys, however, were conducted years ago, and none of them provides an in-depth and comprehensive review of SVM-based methods for addressing class-imbalanced learning problems. As such, our review presented in this paper aims to fill this gap. Firstly, we explain SVM and its
variants and discuss why they cannot effectively learn from class imbalanced
data. Then, we provide a hierarchical categorization of SVM-related methods
for tackling class imbalanced problems and present the representative ones of
each category. In addition, we conduct a thorough empirical comparison of
these methods. In short, the main contributions of this paper are as follows:
• a comprehensive review of SVM-based methods for tackling class imbalanced
problems;
• a hierarchical categorization of imbalanced SVM methods along with the
representative ones of each category;
• a comprehensive performance evaluation of SVM-based class imbalanced
learning methods;
• a discussion on the key research gaps and future research directions.
This review paper consists of five sections. Section 2 explains the structures
of SVM, FSVM and TSVM, and discusses why these methods are not able
to handle imbalanced classification problems. Section 3 reviews imbalanced
SVM-based methods. Specifically, a hierarchical categorization of imbalanced
SVM-based methods is presented, and each category is further divided into
several constituents. Section 4 presents an empirical study on the performance
of various SVM-based models. The results are discussed and analyzed. Finally,
Section 5 presents the concluding remarks and suggestions for future research.
2 Preliminaries
In this section, we explain the structures and principles of SVM, FSVM
and TSVM. Importantly, we discuss the issues of these models in handling
imbalanced data classification problems.
Given a training set {(xi, yi)}, i = 1, ..., N, with labels yi ∈ {+1, −1} and a feature mapping Φ, SVM seeks a separating hyperplane:
$$ w \cdot \Phi(x) + b = 0, \quad (1) $$
where $w \in \mathbb{R}^D$ and $b \in \mathbb{R}$ are the weight vector and bias, respectively. For linear classification problems, the maximal margin optimization problem is solved to obtain the hyperplane, as follows:
$$ \min \; \frac{1}{2}\, w \cdot w \quad (2) $$
$$ \text{s.t. } y_i (w \cdot \Phi(x_i) + b) \ge 1, \; i = 1, 2, \dots, N. $$
For non-linear classification problems, even though the input samples are
transformed into a high-dimensional feature space, they are not completely
separable by linear functions. To alleviate this issue, a set of slack variables
ξi ≥ 0 is integrated into (2):
$$ \min \; \frac{1}{2}\, w \cdot w + C \sum_{i=1}^{N} \xi_i \quad (3) $$
$$ \text{s.t. } y_i (w \cdot \Phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \; i = 1, 2, \dots, N, $$
where $\sum_{i=1}^{N} \xi_i$ is the penalty factor that measures the total error, and C is a
parameter to balance the trade-off between maximizing the margin and mini-
mizing the error. To solve this quadratic optimization problem, the Lagrangian
optimization is constructed, as follows:
$$ \max \; \Big\{ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \Phi(x_i) \cdot \Phi(x_j) \Big\} \quad (4) $$
$$ \text{s.t. } \sum_{i=1}^{N} y_i \alpha_i = 0, \quad 0 \le \alpha_i \le C, \; i = 1, 2, \dots, N, $$
Replacing the inner product $\Phi(x_i) \cdot \Phi(x_j)$ with a kernel function $K(x_i, x_j)$, the dual problem becomes:
$$ \max \; \Big\{ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big\} \quad (7) $$
$$ \text{s.t. } \sum_{i=1}^{N} y_i \alpha_i = 0, \quad 0 \le \alpha_i \le C, \; i = 1, 2, \dots, N. $$
Then, b is computed using (6). Finally, the decision function with respect
to x can be obtained as follows:
$$ f(x) = \mathrm{sign}(w \cdot \Phi(x) + b) = \mathrm{sign}\Big( \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b \Big). \quad (9) $$
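As a concrete illustration, the following minimal Python sketch evaluates the decision function (9) for a new sample; the support vectors, labels, multipliers αi, and bias b are assumed to have been obtained by solving the dual problem (7), and the Gaussian kernel is one possible choice.

```python
import numpy as np

def gaussian_kernel(x1, x2, sigma=1.0):
    # K(x1, x2) = exp(-||x1 - x2||^2 / sigma^2)
    return np.exp(-np.sum((x1 - x2) ** 2) / sigma ** 2)

def svm_decision(x, support_vectors, labels, alphas, b, sigma=1.0):
    # Eq. (9): f(x) = sign(sum_i alpha_i * y_i * K(x_i, x) + b)
    score = sum(a * y * gaussian_kernel(sv, x, sigma)
                for a, y, sv in zip(alphas, labels, support_vectors))
    return np.sign(score + b)
```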
FSVM [19] assigns a fuzzy membership si ∈ (0, 1] to each training sample and weights the corresponding slack variable accordingly:
$$ \min \; \frac{1}{2}\, w^T w + C \sum_{i=1}^{N} s_i \xi_i \quad (10) $$
The corresponding Lagrangian is:
$$ L(w, b, \xi, \alpha, \beta) = \frac{1}{2}\, w^T w + C \sum_{i=1}^{N} s_i \xi_i - \sum_{i=1}^{N} \alpha_i \big( y_i (w \cdot z_i + b) - 1 + \xi_i \big) - \sum_{i=1}^{N} \beta_i \xi_i, \quad (11) $$
where $z_i = \Phi(x_i)$ denotes the feature-space image of $x_i$.
To find the saddle point of L(w, b, ξ, α, β), the following conditions should
be satisfied:
$$ \frac{\partial L(w, b, \xi, \alpha, \beta)}{\partial w} = w - \sum_{i=1}^{N} \alpha_i y_i z_i = 0 \quad (12) $$
$$ \frac{\partial L(w, b, \xi, \alpha, \beta)}{\partial b} = - \sum_{i=1}^{N} \alpha_i y_i = 0 \quad (13) $$
$$ \frac{\partial L(w, b, \xi, \alpha, \beta)}{\partial \xi_i} = s_i C - \alpha_i - \beta_i = 0. \quad (14) $$
Substituting (12)–(14) into (11) yields the following dual problem:
$$ \max \; W(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad (15) $$
$$ \text{s.t. } \sum_{i=1}^{N} y_i \alpha_i = 0, \quad 0 \le \alpha_i \le s_i C, \; i = 1, \dots, N. $$
The input sample xi with ᾱi > 0 is denoted as a support vector. FSVM contains two types of support vectors: (i) those with 0 < ᾱi < si C, which lie on the margin of the hyperplane, and (ii) those with ᾱi = si C, which correspond to misclassified samples. In general, fuzzy-based SVM models can reduce the sensitivity of the conventional SVM to noisy samples.
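To illustrate, the following minimal sketch supplies such membership values to an off-the-shelf soft-margin solver; scikit-learn's SVC accepts per-sample weights that rescale the box constraint to 0 ≤ αi ≤ si C, as in (15). The toy data and membership values are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy imbalanced data: 180 majority (-1) samples vs. 20 minority (+1) samples.
X = np.vstack([rng.normal(0.0, 1.0, (180, 2)), rng.normal(2.0, 1.0, (20, 2))])
y = np.r_[-np.ones(180), np.ones(20)]
s = np.where(y == 1, 1.0, 0.3)  # assumed membership values s_i in (0, 1]

# sample_weight rescales each sample's penalty, i.e., the effective cost s_i * C.
clf = SVC(kernel="rbf", C=1.0).fit(X, y, sample_weight=s)
```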
TSVM [20] constructs two non-parallel hyperplanes, one closer to each class:
$$ w_1 \cdot x + b_1 = 0, \quad w_2 \cdot x + b_2 = 0, \quad (17) $$
where wj and bj are the weight vector and bias of the j-th hyperplane, respectively. A pair of quadratic optimization problems is formulated to obtain both hyperplanes, as follows:
$$ \min_{w_1, b_1, \xi_2} \; \frac{1}{2} (A w_1 + e_1 b_1)^T (A w_1 + e_1 b_1) + p_1 e_2^T \xi_2 \quad (18) $$
$$ \text{s.t. } -(B w_1 + e_2 b_1) + \xi_2 \ge e_2, \quad \xi_2 \ge 0, $$
and
$$ \min_{w_2, b_2, \xi_1} \; \frac{1}{2} (B w_2 + e_2 b_2)^T (B w_2 + e_2 b_2) + p_2 e_1^T \xi_1 \quad (19) $$
$$ \text{s.t. } (A w_2 + e_1 b_2) + \xi_1 \ge e_1, \quad \xi_1 \ge 0, $$
where A and B indicate the data samples belonging to classes +1 and −1, respectively, e1 and e2 are vectors of ones, and p1 and p2 are penalty parameters. Once the optimal parameters, i.e., (w1*, b1*) and (w2*, b2*), are obtained, a test sample x can be classified as follows:
$$ f(x) = \arg\min_{j \in \{1, 2\}} \frac{| (w_j^*)^T x + b_j^* |}{\| w_j^* \|}. \quad (20) $$

Fig. 1: (a) A hyperplane formed by SVM for an imbalanced data set; (b) an ideal hyperplane that is expected to be formed by SVM-based methods for an imbalanced data set.
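Assuming (w1, b1) and (w2, b2) have been obtained by solving (18) and (19), the rule (20) reduces to a nearest-hyperplane test, as the short sketch below transcribes.

```python
import numpy as np

def tsvm_predict(x, w1, b1, w2, b2):
    # Eq. (20): assign x to the class whose hyperplane lies closest.
    d1 = abs(w1 @ x + b1) / np.linalg.norm(w1)
    d2 = abs(w2 @ x + b2) / np.linalg.norm(w2)
    return 1 if d1 <= d2 else -1
```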
A second reason is the soft-margin trade-off: the penalty term in (3) weighs the errors of both classes equally, so minimizing the total error is dominated by the majority class. One solution to alleviate this issue is to change the trade-off based on the imbalance ratio [31, 32].
The third reason is the imbalanced ratio of support vectors [30]. When the imbalance ratio of the samples increases, the ratio of support vectors belonging to the majority and minority classes becomes more imbalanced. This causes the neighborhood of a test sample near the decision boundary to be dominated by support vectors of the majority class; hence, SVM tends to classify it as a majority class sample. Due to the KKT conditions in (5) and (6), the summation of the α's related to the support vectors of the majority class should be equal to the summation of the α's related to those of the minority class. Since the number of support vectors belonging to the minority class is far smaller than that of the majority class, their α's are larger than those of the majority class. These α's operate as weights during prediction. As such, the support vectors of the minority class receive higher weights than those of the majority class, offsetting the effect of the support vector imbalance.
3.1 Re-sampling Methods

Fig. 3: Examples of (a) an imbalanced data set, (b) under-sampling, (c) over-sampling, and (d) combined re-sampling.
Many re-sampling methods have been investigated with SVM and its variants as a base classifier. The study in [41] evaluated popular re-sampling methods and concluded that under-sampling produces better results.
Support cluster machines (SCMs) [48] adopt the k-means algorithm to cluster the majority class samples into several disjoint clusters in the feature space. An SVM is then trained using the minority class samples. Using the trained SVM, the support vectors are approximately identified. A shrinkage technique is used to remove data samples that are not support vectors. This procedure is repeated until a termination condition is satisfied. In diversified sensitivity-based under-sampling (DSUS) [49], the data samples from the majority and minority classes are separately grouped into k clusters using the k-means algorithm. In each cluster, the sample closest to the cluster center is identified for both classes, and the sensitivity measure (SM) score of each sample is computed using the stochastic sensitivity measure. The sample with the highest SM score is selected from each class to form the initial data set for training the base classifier. This procedure is repeated until a balanced set of samples that can yield a high sensitivity rate is obtained.
The study in [53] introduced two strategies: in the first, the cluster centers are used to represent the majority class samples, while in the second, their nearest neighbours are used to represent them. The study in [52] adopted the fuzzy adaptive resonance theory (ART) neural network [54] for data clustering. Operating as an incremental learning algorithm, fuzzy ART is used to cluster all training samples and identify clusters that contain only majority class samples. Then, the cluster centers, instead of the data samples, are utilized to train an intuitionistic FTSVM (IFTSVM) [16] model for performing classification.
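A minimal sketch of the first (cluster-center) strategy in [53] follows; choosing the number of clusters equal to the minority class size is an illustrative assumption rather than a prescription from the original study.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_center_undersample(X_maj, X_min):
    # Replace the majority samples by k-means centers, one per minority sample
    # (assumes len(X_maj) >= len(X_min)).
    k = len(X_min)
    centers = KMeans(n_clusters=k, n_init=10).fit(X_maj).cluster_centers_
    X = np.vstack([centers, X_min])
    y = np.r_[-np.ones(k), np.ones(len(X_min))]
    return X, y
```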
In [55, 56], an active learning-based selection strategy was developed. Firstly, an SVM is trained using the imbalanced data. Then, the samples closest to the hyperplane are identified and added to the training set. This procedure continues until a termination condition is satisfied.
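A hedged sketch of this selection loop is shown below; the linear kernel, batch size, and number of rounds are illustrative assumptions, not the settings of [55, 56].

```python
import numpy as np
from sklearn.svm import SVC

def active_selection(X_train, y_train, X_pool, y_pool, rounds=10, batch=5):
    for _ in range(rounds):
        clf = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
        # Pool samples closest to the hyperplane are the most informative.
        idx = np.argsort(np.abs(clf.decision_function(X_pool)))[:batch]
        X_train = np.vstack([X_train, X_pool[idx]])
        y_train = np.r_[y_train, y_pool[idx]]
        X_pool = np.delete(X_pool, idx, axis=0)
        y_pool = np.delete(y_pool, idx)
    return X_train, y_train
```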
Under-sampling reduces the size of the training set and, therefore, leads to a shorter execution duration. However, it can cause the loss of useful information by removing data samples located along the decision boundary. In particular, random under-sampling does not take informative samples along the decision boundary into consideration. Moreover, since the data distribution is normally unknown, most classifiers attempt to approximate it through the sample distribution; under-sampling distorts this approximation because the retained samples are no longer a random draw from the original data. In this regard, the study in [23] empirically analyzed the factors associated with this limitation and explained why under-sampling may not be the best choice for SVM.
In contrast, over-sampling can result in over-fitting owing to duplicating the available samples or generating new samples based on existing ones. It also increases the computational burden during training. To alleviate these shortcomings, the study in [34] empirically showed that combining random under-sampling and SMOTE can perform better than using either under-sampling or over-sampling alone. Combining under-sampling and over-sampling reverses the initial bias of the classification algorithm towards the majority class in favour of the minority class.
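A minimal sketch of such a combination, assuming the imbalanced-learn library is available, chains SMOTE with random under-sampling before an SVM; the sampling ratios are illustrative assumptions.

```python
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import Pipeline
from sklearn.svm import SVC

# Over-sample the minority class to half the majority size, then under-sample
# the majority class down to a 1:1 ratio, then train an SVM on the result.
model = Pipeline([
    ("smote", SMOTE(sampling_strategy=0.5, random_state=0)),
    ("rus", RandomUnderSampler(sampling_strategy=1.0, random_state=0)),
    ("svm", SVC(kernel="rbf", C=1.0)),
])
# model.fit(X_train, y_train)  # training data assumed given
```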
3.2 Algorithmic Methods
Many modifications from the algorithmic perspective have been developed to enable SVM and its variants to learn from imbalanced data [71–77]. The key modifications are discussed in the following sub-sections.
A common strategy is to assign different misclassification costs, C+ and C−, to the minority and majority classes, respectively [32]:
$$ \min \; \frac{1}{2}\, w \cdot w + C^+ \sum_{\{i | y_i = +1\}} \xi_i + C^- \sum_{\{i | y_i = -1\}} \xi_i \quad (21) $$
The corresponding dual problem becomes:
$$ \max \; \Big\{ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big\} \quad (23) $$
$$ \text{s.t. } \sum_{i=1}^{N} y_i \alpha_i = 0, \quad 0 \le \alpha_i^+ \le C^+, \quad 0 \le \alpha_i^- \le C^-, \; i = 1, \dots, N. $$
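In practice, (21) can be realized with standard solvers through class weights, which multiply C separately for each class; the inverse class-size ratio used below is a common heuristic and an assumption of this sketch.

```python
from sklearn.svm import SVC

n_pos, n_neg = 20, 180  # assumed class sizes
# class_weight multiplies C per class: C+ = C * w(+1) and C- = C * w(-1).
clf = SVC(kernel="linear", C=1.0,
          class_weight={1: n_neg / n_pos, -1: 1.0})
# clf.fit(X_train, y_train)  # training data assumed given
```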
The decision function can be rewritten by separating the positive and negative support vectors:
$$ f(x) = \mathrm{sign}\Big( \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b \Big) = \mathrm{sign}\Big( \sum_{i=1}^{n^+} \alpha_i^+ y_i K(x, x_i) + \sum_{j=1}^{n^-} \alpha_j^- y_j K(x, x_j) + b \Big), \quad (24) $$
where αj+ (αi− ) is the coefficient of the positive (negative) support vectors.
zSVM increases αi+ in the positive support vectors by multiplying them with a
positive value, z. The resulting decision function in (25) reduces bias towards
the majority class samples. The weights of the positive support vectors in (25)
are increased by adjusting αi+ .
$$ f(x) = \mathrm{sign}\Big( z \sum_{i=1}^{n^+} \alpha_i^+ y_i K(x, x_i) + \sum_{j=1}^{n^-} \alpha_j^- y_j K(x, x_j) + b \Big). \quad (25) $$
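A sketch of the zSVM adjustment on top of a trained scikit-learn SVC follows: the dual coefficients αi yi of the positive support vectors are rescaled by z before applying (25). Fixing an explicit kernel width gamma is an assumption so that the kernel can be re-evaluated outside the solver, and the minority class is assumed to be labelled +1.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def zsvm_predict(clf, X, z, gamma):
    # dual_coef_ holds alpha_i * y_i for every support vector; positive
    # entries belong to the class encoded as +1 (assumed to be the minority).
    dual = clf.dual_coef_[0].copy()
    dual[dual > 0] *= z  # boost the positive support vector weights
    K = rbf_kernel(X, clf.support_vectors_, gamma=gamma)
    return np.sign(K @ dual + clf.intercept_[0])  # Eq. (25)

# gamma = 0.5
# clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_train, y_train)
# y_pred = zsvm_predict(clf, X_test, z=2.0, gamma=gamma)
```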
Two variants of SVM, i.e., CS-SVM and support vector data description [88], are utilized to handle high-dimensional class imbalanced data.
Owing to their lower sensitivity to noisy samples, fuzzy-based SVM models are leveraged to tackle imbalanced classification problems. A membership function is assigned to each class according to the importance of different training samples [19, 29], i.e.,
$$ s_i^+ = f(x_i^+)\, r^+, \quad (26) $$
$$ s_i^- = f(x_i^-)\, r^-, \quad (27) $$
where s_i^+ (s_i^−) indicates the membership value of the minority (majority) class samples, 0 ≤ f(xi) ≤ 1 indicates the significance of xi in its class, while r+ and r− reflect the class imbalance setting. The misclassification cost for the minority (majority) class samples is set to s_i^+ C (s_i^− C), which takes a value in [0, 1] ([0, r]), with r < 1.
Over the years, various definitions of f(xi) in (26) and (27) have been introduced, based on the distance of each sample from the class center [29] or the hyperplane [89]. In particular, the study in [29] defined f(xi) according to the distance from the class center. Eqs. (28) and (29) present f(xi) for the linear and non-linear cases, respectively:
$$ f_{lin}(x_i) = 1 - \frac{d_i^{cen}}{\max(d_i^{cen}) + \delta}, \quad (28) $$
$$ f_{exp}(x_i) = \frac{2}{1 + \exp(\beta\, d_i^{cen})}, \quad (29) $$
where d_i^{cen} denotes the distance of xi from its class center, δ is a small positive constant, and β controls the decay.
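Both membership functions are straightforward to reproduce; the sketch below computes (28) and (29) for the samples of a single class, and the resulting values, multiplied by r+ or r−, can be passed as per-sample weights as in the earlier FSVM sketch.

```python
import numpy as np

def f_lin(X_class, delta=1e-6):
    # Eq. (28): 1 - d_i / (max(d) + delta), d_i = distance to the class center.
    d = np.linalg.norm(X_class - X_class.mean(axis=0), axis=1)
    return 1.0 - d / (d.max() + delta)

def f_exp(X_class, beta=1.0):
    # Eq. (29): 2 / (1 + exp(beta * d_i)), which decays from 1 toward 0.
    d = np.linalg.norm(X_class - X_class.mean(axis=0), axis=1)
    return 2.0 / (1.0 + np.exp(beta * d))
```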
The first group modifies the kernel matrix directly. For instance, kernel boundary alignment (KBA) [95] changes the spatial resolution around the decision boundary using a conformal transformation (CT) [96]. CT maps elements X ∈ D to elements Y ∈ T(D) in a way that preserves the local angles between elements after mapping. It adaptively controls the transformation based on the skewness of the decision boundary, i.e., the kernel matrix is modified when the data samples do not have a vector-space representation.
The second group leverages kernel alignment [97, 98]. The affinity and class probability-based FSVM (ACFSVM) [99] uses the affinity computed by support vector domain description (SVDD) to identify outliers and the majority class samples that are close to the decision boundary. The kernel k-nearest neighbour algorithm computes the class probability of all majority class samples. ACFSVM reduces the contribution of data samples with lower class probabilities, i.e., noisy samples, and focuses on the majority class samples with higher affinities and class probabilities, therefore skewing the decision boundary toward the majority class. The enhanced automatic twin support vector machine (EATWSVM) [100] learns a kernel function through Gaussian similarity and improves data separability by incorporating a centered alignment-based technique. The study in [98] extended the kernel alignment technique [97] to measure the degree of agreement between a kernel and a classification task; the definition of alignment is modified using different weights for the majority and minority class samples. Other kernel alignment-related methods, including regularized orthogonal weighted least squares (ROWLS) [101], kernel neural gas (KNG) [102], and P2PKNNC [28], have also been applied to tackle data imbalance issues.
The third group focuses on margin alignment. The maximum margin of twin spheres SVM (MMTSSVM) [103] identifies two homocentric spheres for tackling imbalanced data: a small sphere captures as many majority class samples as possible, while the distance between the two margins is increased to separate most minority class samples using a large sphere.
$$ L_p = \frac{\| w \|^2}{2} + C^+ \sum_{\{i | y_i = +1\}} \xi_i + C^- \sum_{\{j | y_j = -1\}} \xi_j $$
3.3 Fusion Methods
Fusion methods often combine the decisions of an ensemble of M base classifiers, f1, ..., fM, e.g., through majority voting:
$$ f_{majority}(x) = \arg\max_{c} \sum_{i=1}^{M} I(f_i(x) = c). \quad (32) $$
Bagging [110] and boosting [111] are two popular ensemble methods. Bagging trains M independent classifiers on M different subsets of samples randomly drawn from the original data set. In [112], bagging was used to generate several re-sampled ensemble input data sets and training subsets. The training subsets are used to train a number of base classifiers, i.e., SVMs, while the ensemble input data samples are employed by all base classifiers. Then, a deep belief network (DBN) [113] is adopted to produce the final output by combining the decisions generated by the ensemble of base classifiers.
Boosting aims to train a sequence of classifiers, each correcting the errors made in previous iterations. Adaptive boosting (AdaBoost) [114] and its variants [115–117] have been extensively used with SVM to deal with class imbalanced problems. The data samples are re-weighted at each training iteration: the weights of correctly classified samples are reduced, while those of misclassified samples are increased. Therefore, the classifier pays more attention to the misclassified samples from the previous iteration. Other useful boosting techniques for tackling class imbalanced problems include AdaCost [118], RareBoost [119], and SMOTEBoost [120].
Over the years, many ensemble models that adopt SVM or its variants as the base classifiers have been devised to deal with class imbalanced problems. The hybrid kernel machine ensemble (HKME) [121] consists of two base classifiers, namely a standard SVM and ν-SVC [122], which is a one-class SVM classifier. HKME trains the SVM with a balanced data set achieved by re-sampling, and trains ν-SVC with the majority class samples. Then, an averaging strategy is used to combine the decisions and reach the final decision. The study in [123] proposed an ensemble of one-class classifiers [88] for hypertension detection, i.e., a multi-class imbalanced problem. A one-class classifier is trained to distinguish the majority class samples from outliers. The original multi-class problem is decomposed using an ensemble of one-class classifiers to distinguish between five types of hypertension. Then, error-correcting output codes [124] are employed to reconstruct the multi-class problem from multiple single-class responses.
The study in [125] developed an ensemble model based on CS-SVM
for recognizing cardiotocography signal abnormality. Specifically, weighting,
selecting, and voting techniques are used for producing predictions by combin-
ing the outcomes of the ensemble members. EasyEnsemble [43, 126] and the methods in [127–135] divide the majority class samples into several subsets, each containing the same number of samples as the minority class, using different strategies, e.g., random sampling, clustering, or bootstrapping. Then, several SVM classifiers are trained, and ensemble techniques are used to yield the final prediction.
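The following sketch illustrates this family of methods in the spirit of EasyEnsemble: the majority class is split into balanced random subsets, one SVM is trained per subset, and predictions are combined by majority voting, cf. (32). The subset count and kernel settings are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def balanced_svm_ensemble(X_maj, X_min, n_models=5, seed=0):
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        # Pair all minority samples with an equal-size random draw from the
        # majority class (assumes len(X_maj) >= len(X_min)).
        idx = rng.choice(len(X_maj), size=len(X_min), replace=False)
        X = np.vstack([X_maj[idx], X_min])
        y = np.r_[-np.ones(len(X_min)), np.ones(len(X_min))]
        models.append(SVC(kernel="rbf", C=1.0).fit(X, y))
    return models

def vote(models, X):
    # Majority voting over the ensemble members, cf. Eq. (32).
    preds = np.array([m.predict(X) for m in models])
    return np.sign(preds.sum(axis=0))
```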
The study in [136] leveraged an active learning strategy for training a boosted SVM that exploits the most informative samples. Each ensemble member is trained using the data samples that have a significant impact on the previous classifier; specifically, data samples located along the decision boundary are considered. The penalty terms are computed based on the local cardinalities of the majority and minority class samples.
4 Empirical Studies
In this section, two experiments are conducted to compare the performance of
various SVM-based models for imbalanced data learning in binary classification
settings. The first evaluates different algorithmic methods, while the second
assesses several re-sampling and fusion methods. A total of 36 benchmark classification problems from the UCI¹ and KEEL² repositories are used. As shown in Table 1, these data sets have imbalance ratios from 1 up to close to 130. Multi-class data sets are converted into binary classification tasks, as shown in Table 1. All data samples are normalized between 0 and 1, and the results of five-fold cross-validation are reported.

Table 1: Details of the benchmark data sets, including the numbers of samples of the minority and majority classes, total numbers of samples, dimensions, and imbalance ratios. (Columns: Data #, Data set, Minority, Majority, Instance, Dimension, Imbalance ratio.)
The SVM parameters are set as follows. The best C value is determined using a grid search over $\{2^i \,|\, i = -10, -9, \dots, 9, 10\}$; ϵ, which is a small value, is selected from {0, 0.1, 0.2, 0.3}. The Gaussian kernel function is utilized to deal with non-linear cases, i.e., $K(x_1, x_2) = \exp(- \| x_1 - x_2 \|^2 / \sigma^2)$, with $\sigma \in \{2^i \,|\, i = \sigma_{min}, \dots, \sigma_{max}\}$, where σmin = −10 and σmax = 10.
The geometric mean (G-mean) and the area under the ROC curve (AUC) are computed for performance evaluation. G-mean measures the trade-off between the classification performance on the majority and minority classes, while AUC indicates the ability of a classifier to distinguish between the majority and minority classes.
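A minimal sketch of both metrics for the binary setting used here (minority labelled +1, majority −1) is given below; AUC is computed from continuous decision scores such as the SVM decision function.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def g_mean(y_true, y_pred):
    # G-mean = sqrt(sensitivity * specificity).
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[-1, 1]).ravel()
    return np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

# scores = clf.decision_function(X_test)  # continuous SVM outputs
# auc = roc_auc_score(y_test, scores)
```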
¹ http://archive.ics.uci.edu/ml/datasets.html
² https://sci2s.ugr.es/keel/imbalanced.php
Table 2: AUC (%) rates of SVM-based models for imbalanced data sets with a linear kernel function. (Columns: Data #, TSVM [20], FTSVM [21], EFSVM [92], FSVM-CIL [29], zSVM [137], EFTWSVM-CIL [93].)

Table 3: G-mean (%) rates of SVM-based models for imbalanced data sets with a linear kernel function. (Columns as in Table 2.)

Table 4: AUC (%) rates of SVM-based models for imbalanced data sets with a nonlinear kernel function. (Columns as in Table 2.)

Table 5: G-mean (%) rates of SVM-based models for imbalanced data sets with a nonlinear kernel function. (Columns as in Table 2.)

Table 6: AUC (%) rates of re-sampling and fusion-based SVM models for imbalanced data sets. (Columns: Data #, FSVM [19], SVM-SMOTE [62], SVM-OSS [138], SVM-RUS [138], SVM [13], EasyEnsemble [47], AdaBoost [24], 1-NN [24].)
In the second experiment, eight methods, i.e., FSVM [19], SVM-SMOTE [62], SVM-OSS [138], SVM-RUS [138], SVM [13], EasyEnsemble [47], AdaBoost [24], and 1-NN [24], are compared. Note that SVM [13], FSVM [19] and 1-NN [24] were proposed for tackling balanced classification problems. All SVMs use linear kernel functions. Tables 6 and 7 present the AUC and G-mean scores along with their standard deviations (SD). Overall, EasyEnsemble [47] and SVM-SMOTE [62] outperform the other methods in terms of average AUC (86.18%) and G-mean (85.22%), respectively.
5 Discussion
This section analyzes and discusses the main findings of this review and highlights future research directions along with several research questions.
Methods for learning imbalanced data with SVM and its variants can be broadly categorized into three groups: re-sampling, algorithmic, and fusion methods. Re-sampling
methods pre-process data samples by attempting to balance between the
numbers of majority and minority class samples. This can be achieved by
under-sampling the majority class samples, over-sampling the minority class
samples, or their combination. Despite the success of re-sampling methods,
several limitations exist. These methods require pre-processing of data sam-
ples before model training. Although under-sampling strategies can reduce the
computational burden, useful information could be eliminated by removing
data samples close to the decision boundary. On the other hand, over-sampling
strategies could lead to the overfitting problem, and they require longer
execution durations.
Compared with re-sampling methods, algorithmic models do not require data pre-processing and, therefore, incur a lower computational burden. In general, two
algorithmic strategies that enable the SVM and its variants to effectively deal
with imbalanced data are: (i) applying different cost functions to the majority
and minority class samples; (ii) modifying the underlying kernel functions. On
the other hand, fusion models combine re-sampling and algorithmic methods or
employ an ensemble of learning models, usually weak learners, to tackle class-
imbalanced learning problems. As reported in the literature, fusion models
usually outperform algorithmic methods. However, they are computationally
expensive, as it is necessary to train multiple models in fusion methods. Table
8 presents a comparison of the three categories of SVM-based models from
different aspects. The following sub-section presents
several directions that need further investigation.
Table 8: A comparison of the three categories of SVM-based models.
- Description. Re-sampling: pre-process data samples to balance the numbers of majority and minority class samples. Algorithmic: modify the underlying algorithm of SVM and its variants directly, without data pre-processing. Fusion: integrate two or more methods into a single framework.
- Techniques. Re-sampling: under-sampling or over-sampling, or a combination of both. Algorithmic: use different cost functions; modify kernel functions. Fusion: combine re-sampling and algorithmic techniques; ensemble of learning models.
- Execution time. Re-sampling: varies subject to the extent of re-sampling. Algorithmic: generally faster, as no data pre-processing is needed. Fusion: time-consuming owing to the complexity and multiple model training.
The data that support the findings of this study are openly available in the UCI and KEEL repositories at http://archive.ics.uci.edu/ml/datasets.html and https://sci2s.ugr.es/keel/imbalanced.php, respectively.
References
[1] Wang, J., He, Z., Huang, S., Chen, H., Wang, W., Pourpanah, F.: Fuzzy
measure with regularization for gene selection and cancer prediction.
International Journal of Machine Learning and Cybernetics, 1–17 (2021).
https://doi.org/10.1007/s13042-021-01319-3
[2] Randhawa, K., Loo, C.K., Seera, M., Lim, C.P., Nandi, A.K.: Credit
card fraud detection using adaboost and majority voting. IEEE Access
6, 14277–14284 (2018). https://doi.org/10.1109/ACCESS.2018.2806420
[3] Pourpanah, F., Zhang, B., Ma, R., Hao, Q.: Anomaly detection and
condition monitoring of uav motors and propellers. In: IEEE SENSORS,
pp. 1–4 (2018). https://doi.org/10.1109/ICSENS.2018.8589572
[4] Maalouf, M., Trafalis, T.B.: Robust weighted kernel logistic regression
in imbalanced and rare events data. Computational Statistics & Data
Analysis 55(1), 168–183 (2011). https://doi.org/10.1016/j.csda.2010.06.
014
[5] Pourpanah, F., Abdar, M., Luo, Y., Zhou, X., Wang, R., Lim, C.P.,
Wang, X.-Z., Wu, Q.M.J.: A review of generalized zero-shot learning
methods. IEEE Transactions on Pattern Analysis and Machine Intel-
ligence 45(4), 4051–4070 (2023). https://doi.org/10.1109/TPAMI.2022.
3191696
[6] He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Transactions
on Knowledge and Data Engineering 21(9), 1263–1284 (2009). https:
//doi.org/10.1109/TKDE.2008.239
[7] Wang, X., Zhao, Y., Pourpanah, F.: Recent advances in deep learning.
International Journal of Machine Learning and Cybernetics 11, 747–750
(2020). https://doi.org/10.1007/s13042-020-01096-5
[9] Liu, J.: Fuzzy support vector machine for imbalanced data with border-
line noise. Fuzzy Sets and Systems 413, 64–73 (2021). https://doi.org/
10.1016/j.fss.2020.07.018. Data Science
[10] Barua, S., Islam, M.M., Yao, X., Murase, K.: Mwmote–majority
weighted minority oversampling technique for imbalanced data set learn-
ing. IEEE Transactions on knowledge and data engineering 26(2),
405–425 (2012). https://doi.org/10.1016/j.eswa.2020.113504
[11] Khan, S.H., Hayat, M., Bennamoun, M., Sohel, F.A., Togneri, R.: Cost-
sensitive learning of deep feature representations from imbalanced data.
IEEE Transactions on neural networks and learning systems 29(8), 3573–
3587 (2017). https://doi.org/10.1109/TNNLS.2017.2732482
[12] Sun, Z., Song, Q., Zhu, X., Sun, H., Xu, B., Zhou, Y.: A novel ensem-
ble method for classifying imbalanced data. Pattern Recognition 48(5),
1623–1637 (2015). https://doi.org/10.1016/j.patcog.2014.11.014
[14] Vapnik, V.: The Nature of Statistical Learning Theory, (2000). https:
//doi.org/10.1007/978-1-4757-3264-1
[15] Tao, D., Jin, L., Liu, W., Li, X.: Hessian regularized support vector
machines for mobile image annotation on the cloud. IEEE Transactions
on Multimedia 15(4), 833–844 (2013). https://doi.org/10.1109/TMM.
2013.2238909
[16] Rezvani, S., Wang, X., Pourpanah, F.: Intuitionistic fuzzy twin support
vector machines. IEEE Transactions on Fuzzy Systems 27(11), 2140–
2151 (2019). https://doi.org/10.1109/TFUZZ.2019.2893863
[18] Kang, Q., Shi, L., Zhou, M., Wang, X., Wu, Q., Wei, Z.: A distance-
based weighted undersampling scheme for support vector machines and
its application to imbalanced classification. IEEE Transactions on Neural
Networks and Learning Systems 29(9), 4152–4165 (2018). https://doi.
org/10.1109/TNNLS.2017.2755595
[19] Lin, C.-F., Wang, S.-D.: Fuzzy support vector machines. IEEE trans-
actions on neural networks 13(2), 464–471 (2002). https://doi.org/10.
1109/72.991432
[21] Gao, B.-B., Wang, J.-J., Wang, Y., Yang, C.-Y.: Coordinate descent
fuzzy twin support vector machine for classification. In: IEEE Inter-
national Conference on Machine Learning and Applications, pp. 7–12
(2015). https://doi.org/10.1109/ICMLA.2015.35
[23] Akbani, R., Kwek, S., Japkowicz, N.: Applying support vector machines
to imbalanced datasets. in Proceedings of the European Confer-
ence on Machine Learning, 39–50 (2004). https://doi.org/10.1007/
978-3-540-30115-8_7
[24] Galar, M., Fernandez, A., Barrenechea, E., Herrera, F.: A review
on ensembles for the class imbalance problem: bagging, boosting,
and hybrid-based approaches. IEEE Transactions on Systems, Man,
and Cybernetics, Part C 42, 463–484 (2012). https://doi.org/10.1109/
TSMCC.2011.2161285
[25] Haixiang, G., Yijing, L., Shang, J., Mingyun, G., Yuanyue, H., Bing,
G.: Learning from class-imbalanced data: Review of methods and appli-
cations. Expert Systems with Applications 73, 220–239 (2017). https:
//doi.org/10.1016/j.eswa.2016.12.035
[26] He, H., Ma, Y.: Imbalanced Learning: Foundations, Algorithms, and
Applications, 1st edn. Wiley-IEEE Press (2013). https://doi.org/
10.1002/9781118646106
[27] Batuwita, R., Palade, V.: Class imbalance learning methods for sup-
port vector machines. Imbalanced learning: Foundations, algorithms, and
applications, 83–99 (2013). https://doi.org/10.1002/9781118646106.ch5
[28] Yu, X.-P., Yu, X.-G.: Novel text classification based on k-nearest neigh-
bor. in Proceedings of the International Conference on Machine Learning
and Cybernetics, 3425–3430 (2007). https://doi.org/10.1109/ICMLC.
2007.4370740
[29] Batuwita, R., Palade, V.: Fsvm-cil: fuzzy support vector machines for
class imbalance learning. IEEE Transactions on Fuzzy Systems 18, 558–
571 (2010). https://doi.org/10.1109/TFUZZ.2010.2042721
[30] Wu, G., Chang, E.Y.: Class-boundary alignment for imbalanced dataset
learning. In: ICML Workshop on Learning from Imbalanced Data
Sets, pp. 49–56 (2003)
[32] Veropoulos, K., Campbell, C., Cristianini, N.: Controlling the sensitiv-
ity of support vector machines. Proceedings of the International Joint
Conference on AI, 55–60 (1999)
[33] Estabrooks, A., Jo, T., Japkowicz, N.: A multiple resampling method for
learning from imbalanced data sets. Computational intelligence 20(1),
18–36 (2004). https://doi.org/10.1111/j.0824-7935.2004.t01-1-00228.x
[34] Chawla, N., Bowyer, K., Hall, L., Kegelmeyer, W.: Smote: Synthetic
minority over-sampling technique. Journal of Artificial Intelligence
Research 16, 321–357 (2002). https://doi.org/10.1613/jair.953
[35] Chen, J., Casique, M., Karakoy, M.: Classification of lung data by
sampling and support vector machine. In Proceedings of the Annual
International Conference of the IEEE Engineering in Medicine and Biol-
ogy Society 2, 3194–3197 (2004). https://doi.org/10.1109/IEMBS.2004.
1403900
[36] Fu, Y., Ruixiang, S., Yang, Q., Simin, H., Wang, C., Wang, H., Shan,
S., Liu, J., Gao, W.: A block-based support vector machine approach to
the protein homology prediction task in kdd cup 2004. SIGKDD Explo-
ration Newsletters 6, 120–124 (2004). https://doi.org/10.1145/1046456.
1046475
[41] Tyagi, S., Mittal, S.: Sampling approaches for imbalanced data classi-
fication problem in machine learning. Proceedings of ICRIC, 209–221
(2019). https://doi.org/10.1007/978-3-030-29407-6_17
[42] Kubat, M., Matwin, S.: Addressing the curse of imbalanced training sets:
One-sided selection. In: Proceedings of the Fourteenth International
Conference on Machine Learning, pp. 179–186. Morgan Kaufmann (1997)
[43] Liu, X., Wu, J., Zhou, Z.: Exploratory undersampling for class-imbalance
learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B
(Cybernetics) 39(2), 539–550 (2009). https://doi.org/10.1109/TSMCB.
2008.2007853
[44] Tang, X., Hong, H., Shu, Y., Tang, H., Li, J., Liu, W.: Urban waterlog-
ging susceptibility assessment based on a pso-svm method using a novel
repeatedly random sampling idea to select negative samples. Journal of
Hydrology 576, 583–595 (2019). https://doi.org/10.1016/j.jhydrol.2019.
06.058
[45] Tang, X., Machimura, T., Li, J., Liu, W., Hong, H.: A novel opti-
mized repeatedly random undersampling for selecting negative samples:
A case study in an svm-based forest fire susceptibility assessment. Jour-
nal of Environmental Management 271, 111014 (2020). https://doi.org/
10.1016/j.jenvman.2020.111014
[46] Hengyu, Z.: An under-sampling algorithm based on svm. In: IEEE Inter-
national Conference on Artificial Intelligence and Industrial Design, pp.
64–68 (2021). https://doi.org/10.1109/AIID51893.2021.9456573
[47] Liu, X.-Y., Wu, J., Zhou, Z.-H.: Exploratory undersampling for class-
imbalance learning. IEEE Transactions on Systems, Man, and Cyber-
netics, Part B (Cybernetics) 39(2), 539–550 (2008). https://doi.org/10.
1109/TSMCB.2008.2007853
[48] Yuan, J., Li, J., Zhang, B.: Learning concepts from large scale imbal-
anced data sets using support cluster machines. in Proceedings of the
annual ACM international conference on Multimedia, 441–450 (2006).
https://doi.org/10.1145/1180639.1180729
[49] Ng, W.W.Y., Hu, J., Yeung, D.S., Yin, S., Roli, F.: Diversified
sensitivity-based undersampling for imbalance classification problems.
IEEE Transactions on Cybernetics 45(11), 2402–2412 (2015). https:
//doi.org/10.1109/TCYB.2014.2372060
[50] Di, Z., Kang, Q., Peng, D., Zhou, M.: Density peak-based pre-clustering
support vector machine for multi-class imbalanced classification. In:
IEEE International Conference on Systems, Man and Cybernetics, pp.
27–32 (2019). https://doi.org/10.1109/SMC.2019.8914451
[52] Rezvani, S., Wang, X.: Class imbalance learning using fuzzy art and
intuitionistic fuzzy twin support vector machines. Information Sciences
578, 659–682 (2021). https://doi.org/10.1016/j.ins.2021.07.010
[53] Lin, W.-C., Tsai, C.-F., Hu, Y.-H., Jhang, J.-S.: Clustering-based under-
sampling in class-imbalanced data. Information Sciences 409, 17–26
(2017). https://doi.org/10.1016/j.ins.2017.05.008
[54] Carpenter, G.A., Grossberg, S., Rosen, D.B.: Fuzzy art: Fast stable
learning and categorization of analog patterns by an adaptive resonance
system. Neural Networks 4(6), 759–771 (1991). https://doi.org/10.1016/
0893-6080(91)90056-B
[55] Ertekin, S., Huang, J., Giles, L.: Active learning for class imbalance prob-
lem. in Proceedings of the Annual International ACM SIGIR Conference
on Research and Development in Information Retrieval, 823–824 (2007).
https://doi.org/10.1145/1277741.1277927
[56] Ertekin, S., Huang, J., Bottou, L., Giles, L.: Learning on the border:
active learning in imbalanced data classification. in Proceedings of the
ACM Conference on Information and Knowledge Management, 127–136
(2007). https://doi.org/10.1145/1321440.1321461
[58] Wang, J., Yao, Y., Zhou, H., Leng, M., Chen, X.: A new over-sampling
technique based on svm for imbalanced diseases data. In: Proceedings
of the International Conference on Mechatronic Sciences, Electric Engi-
neering and Computer, pp. 1224–1228 (2013). https://doi.org/10.1109/
MEC.2013.6885254
[59] Liang, X.W., Jiang, A.P., Li, T., Xue, Y.Y., Wang, G.T.: Lr-smote —
an improved unbalanced data set oversampling based on k-means and
svm. Knowledge-Based Systems 196, 105845 (2020). https://doi.org/10.
1016/j.knosys.2020.105845
[60] Sreejith, S., Khanna Nehemiah, H., Kannan, A.: Clinical data clas-
sification using an enhanced smote and chaotic evolutionary feature
selection. Computers in Biology and Medicine 126, 103991 (2020).
https://doi.org/10.1016/j.compbiomed.2020.103991
[61] Mathew, J., Pang, C.K., Luo, M., Leong, W.H.: Classification of imbal-
anced data by oversampling in kernel space of support vector machines.
IEEE Transactions on Neural Networks and Learning Systems 29(9),
4065–4076 (2018). https://doi.org/10.1109/TNNLS.2017.2751612
[62] Nguyen, H.M., Cooper, E.W., Kamei, K.: Borderline over-sampling for
imbalanced data classification. Int. J. Knowl. Eng. Soft Data Paradigm.
3(1), 4–21 (2011). https://doi.org/10.1504/IJKESDP.2011.039875
[63] Barua, S., Islam, M.M., Yao, X., Murase, K.: Mwmote–majority
weighted minority oversampling technique for imbalanced data set learn-
ing. IEEE Transactions on Knowledge and Data Engineering 26(2),
405–425 (2014). https://doi.org/10.1016/j.eswa.2020.113504
[64] Wei, J., Huang, H., Yao, L., Hu, Y., Fan, Q., Huang, D.: New imbalanced
fault diagnosis framework based on cluster-mwmote and mfo-optimized
ls-svm using limited and complex bearing data. Engineering Applications
of Artificial Intelligence 96, 103966 (2020). https://doi.org/10.1016/j.
engappai.2020.103966
[65] Batuwita, R., Palade, V.: Efficient resampling methods for training sup-
port vector machines with imbalanced datasets. in Proceedings of the
International Joint Conference on Neural Networks, 1–8 (2010). https:
//doi.org/10.1109/IJCNN.2010.5596787
[66] Piri, S., Delen, D., Liu, T.: A synthetic informative minority over-
sampling (simo) algorithm leveraging support vector machine to enhance
learning from imbalanced datasets. Decision Support Systems 106, 15–29
(2018). https://doi.org/10.1016/j.dss.2017.11.006
[67] Rivera, W.A.: Noise reduction a priori synthetic over-sampling for class
imbalanced data sets. Information Sciences 408, 146–161 (2017). https:
//doi.org/10.1016/j.ins.2017.04.046
[68] Liu, A., Martin, C., La Cour, B., Ghosh, J.: Effects of oversampling ver-
sus cost-sensitive learning for bayesian and svm classifiers. In: Data Min-
ing, pp. 159–192 (2010). https://doi.org/10.1007/978-1-4419-1280-0_8
[69] Himaja, D., Maruthi Padmaja, T., Radha Krishna, P.: Oversample based
large scale support vector machine for online class imbalance problem.
In: Mondal, A., Gupta, H., Srivastava, J., Reddy, P.K., Somayajulu,
D.V.L.N. (eds.) Big Data Analytics, pp. 348–362. Springer (2018).
https://doi.org/10.1007/978-3-030-04780-1_24
[70] Wei, J., Huang, H., Yao, L., Hu, Y., Fan, Q., Huang, D.: New imbalanced
bearing fault diagnosis method based on sample-characteristic oversam-
pling technique (scote) and multi-class ls-svm. Applied Soft Computing
101, 107043 (2021). https://doi.org/10.1016/j.asoc.2020.107043
[71] Ganaie, M.A., Tanveer, M., Suganthan, P.N.: Regularized robust fuzzy
least squares twin support vector machine for class imbalance learning.
In: International Joint Conference on Neural Networks, pp. 1–8 (2020).
https://doi.org/10.1109/IJCNN48605.2020.9207724
[72] Yanze, L., Ogai, H.: Quasi-linear svm with local offsets for high-
dimensional imbalanced data classification. In: Annual Conference of
the Society of Instrument and Control Engineers of Japan, pp. 882–887
(2020). https://doi.org/10.23919/SICE48898.2020.9240376
[73] Cao, B., Liu, Y., Hou, C., Fan, J., Zheng, B., Yin, J.: Expediting the
accuracy-improving process of svms for class imbalance learning. IEEE
Transactions on Knowledge and Data Engineering, 1–1 (2020). https:
//doi.org/10.1109/TKDE.2020.2974949
[74] Liu, J.: Fuzzy support vector machine for imbalanced data with border-
line noise. Fuzzy Sets and Systems 413, 64–73 (2021). https://doi.org/
10.1016/j.fss.2020.07.018. Data Science
[75] Tanveer, M., Sharma, A., Suganthan, P.N.: Least squares knn-based
weighted multiclass twin svm. Neurocomputing (2020). https://doi.org/
10.1016/j.neucom.2020.02.132
[76] Wu, G., Zheng, R., Tian, Y., Liu, D.: Joint ranking svm and binary rele-
vance with robust low-rank learning for multi-label classification. Neural
Networks 122, 24–39 (2020). https://doi.org/10.1016/j.neunet.2019.10.
002
[77] Cao, P., Zhao, D., Zaiane, O.: An optimized cost-sensitive svm for imbal-
anced data learning. In: Pei, J., Tseng, V.S., Cao, L., Motoda, H., Xu, G.
(eds.) Advances in Knowledge Discovery and Data Mining, pp. 280–292
(2013). https://doi.org/10.1007/978-3-642-37456-2_24
[78] Bach, F., Heckerman, D., Horvitz, E.: Considering cost asymmetry in
learning classifiers. J. MACHINE LEARNING RESEARCH 7, 1713–
1741 (2006). https://doi.org/10.5555/1248547.1248610
[79] Kowalczyk, A., Raskutti, B.: One class svm for yeast regulation predic-
tion. SIGKDD Exploration Newsletters 4, 99–100 (2002). https://doi.
org/10.1145/772862.772878
[80] Raskutti, B., Kowalczyk, A.: Extreme re-balancing for svms: a case
study. SIGKDD Exploration Newsletters 6, 60–69 (2004). https://doi.
org/10.1145/1007730.1007739
[81] Imam, T., Ting, K., Kamruzzaman, J.: z-svm: an svm for improved
classification of imbalanced data. in Proceedings of the Australian joint
conference on Artificial Intelligence: advances in Artificial Intelligence,
264–273 (2006). https://doi.org/10.1007/11941439_30
[82] Gu, B., Sheng, V.S., Tay, K.Y., Romano, W., Li, S.: Cross validation
through two-dimensional solution surface for cost-sensitive svm. IEEE
Transactions on Pattern Analysis and Machine Intelligence 39(6), 1103–
1121 (2017). https://doi.org/10.1109/TPAMI.2016.2578326
[83] Yang, C.Y., Yang, J.S., Wang, J.-J.: Margin calibration in svm class
imbalanced learning. Neurocomputing 73, 397–411 (2009). https://doi.
org/10.1016/j.neucom.2009.08.006
[84] Yu, X., Liu, J., Keung, J.W., Li, Q., Bennin, K.E., Xu, Z., Wang, J., Cui,
X.: Improving ranking-oriented defect prediction using a cost-sensitive
ranking svm. IEEE Transactions on Reliability 69(1), 139–153 (2020).
https://doi.org/10.1109/TR.2019.2931559
[85] Xu, G., Zhou, H., Chen, J.: Cnc internal data based incremental cost-
sensitive support vector machine method for tool breakage monitoring
in end milling. Engineering Applications of Artificial Intelligence 74,
90–103 (2018). https://doi.org/10.1016/j.engappai.2018.05.007
[86] Ma, Y., Zhao, K., Wang, Q., Tian, Y.: Incremental cost-sensitive support
[90] Ma, H., Wang, L., Shen, B.: A new fuzzy support vector machines
for class imbalance learning. International Conference on Electrical and
Control Engineering, 3781–3784 (2011). https://doi.org/10.1155/2014/
536434
[91] Wang, Y., Wang, S., Lai, K.K.: A new fuzzy support vector machine to
evaluate credit risk. IEEE Transactions on Fuzzy Systems 13, 820–831
(2005). https://doi.org/10.1109/TFUZZ.2005.859320
[92] Fan, Q., Wang, Z., Li, D., Gao, D., Zha, H.: Entropy-based fuzzy support
vector machine for imbalanced datasets. Knowledge-Based Systems 115,
87–99 (2017). https://doi.org/10.1016/j.knosys.2016.09.032
[93] Gupta, D., Richhariya, B., Borah, P.: A fuzzy twin support vector
machine based on information entropy for class imbalance learning. Neu-
ral Computing and Applications 31, 7153–7164 (2019). https://doi.org/
10.1007/s00521-018-3551-9
[95] Wu, G., Chang, E.: Kba: Kernel boundary alignment considering imbal-
anced data distribution. IEEE Transactions on Knowledge and Data
Engineering 17, 786–795 (2005). https://doi.org/10.1109/TKDE.2005.
95
[96] Amari, S., Wu, S.: Improving support vector machine classifiers by
modifying kernel functions. Neural Networks 12(6), 783–789 (1999).
https://doi.org/10.1016/S0893-6080(99)00032-5
[97] Cristianini, N., Kandola, J., Elissee, A., Shawe-Taylor, J.: On kernel-
target alignment. in Advances in Neural Information Processing Systems
14, MIT Press, 367–373 (2002). https://doi.org/10.7551/mitpress/1120.
001.0001
[98] Kandola, J., Shawe-Taylor, J.: Refining kernels for regression and uneven
classification problems. in Proceedings of International Conference on
Artificial Intelligence and Statistics (2003)
[99] Tao, X., Li, Q., Ren, C., Guo, W., He, Q., Liu, R., Zou, J.: Affinity
and class probability-based fuzzy support vector machine for imbalanced
data sets. Neural Networks 122, 289–307 (2020). https://doi.org/10.
1016/j.neunet.2019.10.016
[101] Hong, X., Chen, S., Harris, C.: A kernel-based two-class classifier for imbal-
anced data sets. IEEE Transactions on Neural Networks 18, 28–41
(2007). https://doi.org/10.1109/TNN.2006.882812
[102] Qin, A., Suganthan, P.: Kernel neural gas algorithms with application
to cluster analysis. in Proceedings of the 17th International Conference
on Pattern Recognition, IEEE Computer Society, 617–620 (2004). https:
//doi.org/10.1109/ICPR.2004.1333848
[103] Xu, Y.: Maximum margin of twin spheres support vector machine for
imbalanced data classification. IEEE Transactions on Cybernetics 47(6),
1540–1550 (2017). https://doi.org/10.1109/TCYB.2016.2551735
[104] Liu, J., Zio, E.: Feature vector regression with efficient hyperparame-
ters tuning and geometric interpretation. Neurocomputing 218, 411–422
(2016). https://doi.org/10.1016/j.neucom.2016.08.093
[105] Liu, J., Zio, E.: Integration of feature vector selection and support vector
machine for classification of imbalanced data. Applied Soft Comput-
ing Journal 79, 702–7011 (2019). https://doi.org/10.1016/j.asoc.2018.
11.045
[106] Kim, K.H., Sohn, S.Y.: Hybrid neural network with cost-sensitive sup-
port vector machine for class-imbalanced multimodal data. Neural
Networks 130, 176–184 (2020). https://doi.org/10.1016/j.neunet.2020.
06.026
[107] Rtayli, N., Enneya, N.: Enhanced credit card fraud detection based on
[108] Wu, W., Xu, Y., Pang, X.: A hybrid acceleration strategy for nonparal-
lel support vector machine. Information Sciences 546, 543–558 (2021).
https://doi.org/10.1016/j.ins.2020.08.067
[109] Tian, Y., Qi, Z., Ju, X., Shi, Y., Liu, X.: Nonparallel support vector
machines for pattern classification. IEEE Transactions on Cybernetics
44(7), 1067–1079 (2014). https://doi.org/10.1109/TCYB.2013.2279167
[112] Yu, L., Zhou, R., Tang, L., Chen, R.: A dbn-based resampling svm
ensemble learning paradigm for credit classification with imbalanced
data. Applied Soft Computing 69, 192–202 (2018). https://doi.org/10.
1016/j.asoc.2018.04.049
[113] Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep
belief nets. Neural Comput 18(7), 1527–1554 (2006). https://doi.org/10.
1162/neco.2006.18.7.1527
[114] Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm.
In: Proceedings of the International Conference on Machine Learning,
pp. 148–156 (1996). https://doi.org/10.5555/3091696.3091715
[116] Sun, J., Li, H., Fujita, H., Fu, B., Ai, W.: Class-imbalanced dynamic
financial distress prediction based on adaboost-svm ensemble combined
with smote and time weighting. Information Fusion 54, 128–144 (2020).
https://doi.org/10.1016/j.inffus.2019.07.006
[117] Mehmood, Z., Asghar, S.: Customizing svm as a base learner with
adaboost ensemble to learn from multi-class problems: A hybrid
approach adaboost-msvm. Knowledge-Based Systems, 106845 (2021).
https://doi.org/10.1016/j.knosys.2021.106845
[118] Fan, W., Stolfo, S., Zhang, J., Chan, P.: Adacost: Misclassification cost-
sensitive boosting. in Proceedings of the 16th International Conference
on Machine Learning, 97–105 (1999). https://doi.org/10.1145/3373509.
3373548
[119] Joshi, M., Kumar, V., Agarwal, C.: Evaluating boosting algorithms to
classify rare classes: Comparison and improvements. in Proceedings of
the IEEE International Conference on Data Mining, 257–264 (2001).
https://doi.org/10.1109/ICDM.2001.989527
[120] Chawla, N., Lazarevic, A., Hall, L., Bowyer, K.: Smoteboost: Improv-
ing prediction of the minority class in boosting. in Proceedings of
the Principles of Knowledge Discovery in Databases, 107–119 (2003).
https://doi.org/10.1007/978-3-540-39804-2_12
[121] Li, P., Chan, K., Fang, W.: Hybrid kernel machine ensemble for imbal-
anced data sets. in Proceedings of the IEEE International Conference on
Pattern Recognition, 1108–1111 (2006). https://doi.org/10.1109/ICPR.
2006.643
[122] Schölkopf, B., Platt, J.C., Shawe-Taylor, J., Smola, A.J., Williamson,
R.C.: Estimating the support of a high-dimensional distribution. Neu-
ral Computation 13(7), 1443–1471 (2001). https://doi.org/10.1162/
089976601750264965
[124] Wilk, T., Wozniak, M.: Soft computing methods applied to combination
of one-class classifiers. Neurocomputing 75(1), 185–193 (2012). https:
//doi.org/10.1016/j.neucom.2011.02.023
[125] Zeng, R., Lu, Y., Long, S., Wang, C., Bai, J.: Cardiotocography signal
abnormality classification using time-frequency features and ensemble
cost-sensitive svm classifier. Computers in Biology and Medicine 130,
104218 (2021). https://doi.org/10.1016/j.compbiomed.2021.104218
[126] Liu, X.-Y., Wu, J., Zhou, Z.-H.: Exploratory under-sampling for class-
imbalance learning. In: International Conference on Data Mining, pp.
965–969 (2006). https://doi.org/10.1109/ICDM.2006.68
[127] Lin, Z., Hao, Z., Yang, X., Liu, X.: Several svm ensemble methods inte-
grated with under-sampling for imbalanced data learning. in Proceedings
of the International Conference on Advanced Data Mining and Applica-
tions, 536–544 (2009). https://doi.org/10.1007/978-3-642-03348-3_54
[128] Kang, P., Cho, S.: Eus svms: ensemble of under-sampled svms for
data imbalance problems. in Proceedings of the international conference
on Neural Information Processing, 837–846 (2006). https://doi.org/10.
1007/11893028_93
[129] Liu, Y., An, A., Huang, X.: Boosting prediction accuracy on imbalanced
datasets with svm ensembles. in Proceedings of the 10th Pacific-Asia
conference on Advances in Knowledge Discovery and Data Mining, 107–
118 (2006). https://doi.org/10.1007/11731139_15
[130] Wang, B., Japkowicz, N.: Boosting support vector machines for imbal-
anced data sets. Knowledge and Information Systems 25, 1–20 (2010).
https://doi.org/10.1007/s10115-009-0198-y
[131] Liu, X., Yi, G.Y., Bauman, G., He, W.: Ensembling imbalanced-spatial-
structured support vector machine. Econometrics and Statistics 17, 145–
155 (2021). https://doi.org/10.1016/j.ecosta.2020.02.003
[132] Bao, L., Juan, C., Li, J., Zhang, Y.: Boosted near-miss under-sampling
on svm ensembles for concept detection in large-scale imbalanced
datasets. Neurocomputing 172, 198–206 (2016). https://doi.org/10.
1016/j.neucom.2014.05.096
[133] Lee, W., Jun, C.-H., Lee, J.-S.: Instance categorization by support vector
machines to adjust weights in adaboost for imbalanced data classifica-
tion. Information Sciences 381, 92–103 (2017). https://doi.org/10.1016/
j.ins.2016.11.014
[134] Tashk, A., Faez, K.: Boosted bayesian kernel classifier method for face
detection. in Proceedings of the Third International Conference on
Natural Computation, IEEE Computer Society, 533–537 (2007). https:
//doi.org/10.1109/ICNC.2007.287
[135] Badrinath, N., Gopinath, G., Ravichandran, K.S., Soundhar, R.G.: Esti-
mation of automatic detection of erythemato-squamous diseases through
adaboost and its hybrid classifiers. Artificial Intelligence Review 45,
471–488 (2016). https://doi.org/10.1007/s10462-015-9436-8
[136] Zieba, M., Tomczak, J.M.: Boosted svm with active learning strategy
for imbalanced data. Soft Computing 19, 3357–3368 (2015). https://doi.
org/10.1007/s00500-014-1407-5
[137] Imam, T., Ting, K.M., Kamruzzaman, J.: z-svm: An svm for improved
classification of imbalanced data. In: Sattar, A., Kang, B.-h. (eds.) AI
2006: Advances in Artificial Intelligence, pp. 264–273. Springer, Berlin,
Heidelberg (2006). https://doi.org/10.1007/11941439_30
[138] Balcázar, J., Dai, Y., Watanabe, O.: A random sampling technique for
training support vector machines. In: Algorithmic Learning Theory, pp.
119–134 (2001). https://doi.org/10.1007/3-540-45583-3_11