
2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE)

Ensembles Based Combined Learning for Improved Software Fault Prediction: A Comparative Study

Chubato Wondaferaw Yohannese, Tianrui Li, Macmillan Simfukwe, Faisal Khurshid


School of Information Science and Technology
Southwest Jiaotong University
Chengdu 611756, China
freewwin@yahoo.com, trli@swjtu.edu.cn, macsims85@gmail.com, faisalnit@gmail.com

Abstract—Software Fault Prediction (SFP) research has made an enormous endeavor to accurately predict the fault proneness of software modules in order to maximize precious software test resources, reduce maintenance cost, help deliver software products on time and satisfy customers, which ultimately contributes to producing quality software products. In this regard, Machine Learning (ML) has been successfully applied to solve classification problems for SFP. Moreover, within ML, Ensemble Learning Algorithms (ELA) are known to improve the performance of single learning algorithms. However, no ELA alone handles the challenges created by redundant and irrelevant features and the class imbalance problem in software defect datasets. Therefore, the objective of this paper is to independently examine and compare prominent ELA and improve their performance combined with Feature Selection (FS) and Data Balancing (DB) techniques, so as to identify the more efficient ELA that better predict the fault proneness of software modules. Accordingly, a new framework that efficiently handles those challenges in a combined form is proposed. The experimental results confirm that the proposed framework exhibits the robustness of combined techniques; in particular, it achieves high performance when combining the bagging ELA with DB on selected features. Therefore, as shown in this study, ensemble techniques used for SFP must be carefully examined and combined with both FS and DB in order to obtain robust performance.

Keywords-Software Fault Prediction, Ensemble Learning Algorithms, Feature Selection, Data Balancing.

I. INTRODUCTION

The growing demand for quality software in different industries has been igniting the Software Fault Prediction (SFP) research area, whereby quality can be cautiously inspected before releasing the software. SFP aims to inspect and detect the fault proneness of software modules and helps to focus on those modules predicted as faulty, so as to manage resources efficiently and reduce the number of faults occurring during operation. In this regard, statistical and Machine Learning (ML) techniques have been employed for SFP in most studies [2-14]. Among ML techniques, Ensemble Learning Algorithms (ELA) have been demonstrated to be useful in different areas of research [7, 15-18], where all of them have confirmed that ELA can effectively solve classification problems with better performance when compared with an individual classifier. As illustrated in the literature [6, 14, 32, 35], this is much like how, to make wise decisions, people may consult many experts in an area and take their opinions into consideration rather than depend only on their own judgment. In fault prediction, a predictive model generated by ML can be considered an expert. Therefore, a good approach to make decisions more accurately is to combine the outputs of different predictive models, so that the combination can improve on, or at least equal, the predictive performance of the individual models [6, 14, 32, 35]. Therefore, in this study, we develop a new framework to compare eminent ELA, namely bagging [16, 30, 32, 35] and AdaBoost.M1 [16, 31, 32, 35], with the J48 Decision Tree (DT) as the base classifier. In addition, we use McCabe and Halstead Static Code Metrics [19, 22] datasets for the experimental analysis.

In ML, ELA are known to improve the predictive performance of individual classifiers, but neither of these ensemble techniques alone solves the data skewness (class imbalance) problem [16, 26] or the existence of redundant and irrelevant features, specifically in defect datasets [26]. Thus, to deal with these issues, an ensemble based combined framework has to be designed specifically. Therefore, in this study, we combine ELA with Feature Selection (FS) [9, 12-14, 20, 21] and Data Balancing (DB) [11, 20, 23-25] techniques. FS is carried out by removing less important and redundant features, so that only important features are left for training the predictive models and the performance of ELA can be improved. Moreover, as software defect datasets are composed of Not Fault Prone (NFP) instances with only a small percentage of Fault Prone (FP) instances, DB is carried out to resolve this skewed nature of defect datasets, so that building SFP models on balanced data can improve ELA performance.

Therefore, this paper aims to independently examine and compare ELA and realize their performance improvement when combined with FS and DB, to identify efficient techniques that perform better for SFP. Hence, the main contribution of this study is the empirical analysis of multiple ELA in combination with FS and DB. Interestingly, the proposed framework has exhibited the robustness of combined techniques; in particular, it achieves high performance when combining ensemble techniques with DB on selected features, which constitutes the primary contribution of this study.
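The expert-combination idea described above can be illustrated with a minimal majority-voting sketch. This is our own illustrative Python, not the WEKA-based setup used in this paper; `majority_vote` is a hypothetical helper name.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the class labels predicted by several base models for one module.

    predictions: a list of labels (e.g. 'FP' or 'NFP'), one per base model.
    Returns the label that receives the highest total vote.
    """
    votes = Counter(predictions)
    return votes.most_common(1)[0][0]

# Three hypothetical base classifiers disagree on a module;
# the ensemble follows the majority:
print(majority_vote(["FP", "NFP", "FP"]))  # prints FP
```

Bagging and boosting refine this idea by also controlling how each base model is trained, rather than only how their outputs are combined.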

978-1-5386-1829-5/17/$31.00 ©2017 IEEE
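The DB technique adopted later in this study, SMOTE, synthesizes new FP instances by interpolating between a minority instance x_i and one of its k nearest minority neighbors x̂_i: x_new = x_i + δ(x̂_i − x_i), with δ drawn uniformly from [0, 1]. The sketch below is our own simplified illustration of that interpolation, assuming purely numeric features and Euclidean distance; the study itself uses the standard SMOTE algorithm [11].

```python
import math
import random

def smote_sample(minority, k=5, seed=0):
    """Create one synthetic minority (FP) instance per original instance.

    minority: list of numeric feature vectors of the minority class.
    For each x, pick one of its k nearest minority neighbors x_hat and
    interpolate: x_new = x + delta * (x_hat - x), with delta ~ U[0, 1].
    """
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # distances from x to every other minority instance
        neighbors = sorted((math.dist(x, y), j)
                           for j, y in enumerate(minority) if j != i)
        _, j = neighbors[rng.randrange(min(k, len(neighbors)))]
        delta = rng.random()
        synthetic.append([a + delta * (b - a)
                          for a, b in zip(x, minority[j])])
    return synthetic
```

A full SMOTE implementation additionally controls how many synthetic instances are created per original instance (the rate), which this study sets so that the NFP/FP ratio reaches 65%/35%.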


The remainder of this paper is organized as follows: related works are presented in Section II. Section III discusses the details of the proposed framework and the algorithms. Section IV presents the experimental design. Section V reports our results and discussion, together with a comparison of ELA when combined with FS as well as with both FS and DB techniques. Section VI discusses threats to validity with regard to our experimental setup. Section VII presents final conclusions based on the obtained results, and future works.

II. RELATED WORK

This section focuses on the studies that have tried to address the problems in SFP using ELA. These studies typically build SFP models to help software engineers focus development activities on FP modules, make better use of test resources and produce quality software products [9, 26, 28, 29]. Khoshgoftaar et al. [18] investigated FS and under-sampling with the AdaBoost algorithm; their main drive was to compare repetitive-sample FS with individual FS, and they confirmed that using boosting is effective in improving classification performance. On the other hand, Shanthini et al. [15] investigated a bagging ensemble with a Support Vector Machine (SVM) as the base learner for SFP. They showed that the proposed ensemble of SVM is superior to the individual approach for SFP. As noticed, neither of these studies considers employing different ensemble approaches so as to identify the techniques that are more efficient for SFP.

As discussed earlier, ELA are known to increase the performance of individual classifiers, but no ensemble technique alone solves the existence of irrelevant features and the class imbalance problem of defect datasets. To deal with these issues, we have realized that the performance of ELA can be increased by keeping up the quality of software defect datasets, which can be done by applying FS and/or resolving the class imbalance problem [26]. For instance, Shivaji et al. [12] investigated multiple FS using NB and SVM classifiers. Wang et al. [13] made a comprehensive empirical study to assess ensembles of feature ranking methods. They demonstrated that ensembles of few rankers are effective and even better than ensembles of many or all rankers. As noticed, these studies neither compare different ensemble techniques nor perform DB to resolve class imbalance issues. Therefore, to the best of our knowledge, no study has attempted a combined comparison of ELA with FS and DB (in our case, Information Gain (IG) and the Synthetic Minority Over-sampling Technique (SMOTE)) together for SFP.

Having this gap in mind, we design a new framework that follows two strategies step by step. Details of our new framework for improved SFP based on combined ensembles are presented in Section III.

III. ENSEMBLE BASED COMBINED FRAMEWORK FOR IMPROVING SOFTWARE FAULT PREDICTION

An ensemble based combined framework for improved SFP is shown in Figure 1. We follow two strategies to attain the objective of the study.

In strategy one, we perform FS using the IG evaluation method with the Ranker search method on all datasets used in this study, to identify useful features for ensemble learning. The input of the framework in this strategy is the top ranked features from all datasets. We then build the SFP models on the selected features using two prominent ELA (bagging and AdaBoost.M1, with the J48 DT as the base classifier). Experiments are carried out by running a 10-fold cross-validation, and each experiment is repeated 10 times. The results are then captured using the AUC and Accuracy Performance Evaluation (PE) criteria. As shown in Figure 1, this strategy serves to compare combined ELA; we also use the most efficiently performing techniques as a reference in the subsequent performance experiments.

In strategy two, we resolve the data skewness problem by applying the SMOTE algorithm to the selected features of the defect datasets. To get reasonably balanced data for ensemble classification, we set the target ratio for NFP

Figure 1. Ensembles Based Combined Framework For Improved SFP


and FP modules to 65% and 35%, respectively, as recommended by Khoshgoftaar et al. [5]. As shown in Figure 1, this strategy serves as a demonstration of the performance improvement gained by combining both IG and SMOTE with ELA, and as a final point to realize the efficient ensemble classifier. This will be used for comparison with the previous performance experiment.

Finally, we make a comparison between the selected combined ELA from the strategy one and strategy two categories, based on the AUC PE criterion, to realize the performance improvement and the most efficiently performing combined ELA for SFP.

Overall Algorithm: Experimental Procedure

Datasets ← {D1, D2, ..., Dn};  /* defect datasets */
FS ← IG;  /* redundant and irrelevant feature removal method */
DB ← SMOTE;  /* data balancing method */
ELA ← {BaG, AdB};  /* ensemble learning algorithms */
R ← 10;  /* number of repetitions of each experiment */
N ← 10;  /* number of folds */
for each D ∈ Datasets do
    pi ← |Ci,D| / |D|;  /* probability that an arbitrary tuple belongs to class Ci */
    Info(D) ← −Σi pi log2(pi);  /* entropy of D */
    InfoA(D) ← Σj=1..v (|Dj| / |D|) × Info(Dj);  /* expected information required to classify a tuple based on the partitioning by attribute A */
    InfoGain(A) ← Info(D) − InfoA(D);  /* information gained by A */
    for each A ∈ D do  /* attribute (feature) selection */
        rankOfA ← InfoGain(A);  /* the highest ranked attribute A is the one with the largest information gain */
        rankOfA'[ ] ← {rankOfA1, rankOfA2, ..., rankOfAn};
    endfor
    for I = 0 to length − 1 do  /* sort A based on InfoGain(A) */
        highRank = I;
        for J = I + 1 to length − 1 do
            if rankOfA'[J] > rankOfA'[highRank]
                highRank = J;
            endif
        endfor
        Temp = rankOfA'[I]; rankOfA'[I] = rankOfA'[highRank]; rankOfA'[highRank] = Temp;
    endfor
    selectedFeatures ← top ⌈log2 n⌉ + class A;  /* selecting the top ranked features */
    for each times ∈ [1, R] do  /* R times N-fold cross-validation */
        D' ← selectedFeatures;
        foldData ← generate N folds from D';
        for each fold ∈ [1, N] do
            testData ← foldData[fold];
            trainingData ← D' − testData;
            for each combFSELA ∈ ELA do  /* evaluate combined ELA with FS */
                faultPredictor ← combFSELA(trainingData);
                V ← evaluate combined ELA on testData;  /* obtain the total vote received by each class (FP and NFP) */
                predictorPerformance ← choose the class that receives the highest total V as the final classification;
            endfor
        endfor
    endfor
    Smino ⊂ D';  /* Smino is the subset of FP instances */
    Smajo ⊂ D';  /* Smajo is the subset of NFP instances */
    Pmajo ← recommended/required percentage of Smajo;
    xi ∈ Smino;  /* the FP instance under consideration; x̂i ∈ Smino is one of the k nearest neighbors of xi */
    k ← 5;  /* set 5 as the number of nearest neighbors for each example xi ∈ Smino */
    δ ∈ [0, 1];  /* a uniformly distributed random variable */
    rate = ((Smajo × 100 / Pmajo) − (Smajo + Smino)) / Smino × 100;  /* determine the rate at which to synthetically create instances from Smino */
    for xi ∈ Smino
        x̂i ← one randomly chosen FP instance from the k nearest neighbors;
        xnewMino ← xi + (x̂i − xi) × δ;  /* new FP instance creation */
    endfor
    for each times ∈ [1, R] do  /* R times N-fold cross-validation */
        balancedData ← D' + xnewMino;  /* balanced data on selected features */
        foldData ← generate N folds from balancedData;
        for each fold ∈ [1, N] do
            testData ← foldData[fold];
            trainingData ← balancedData − testData;
            for each combDBFSELA ∈ ELA do  /* evaluate combined ELA with DB and FS */
                faultPredictor ← combDBFSELA(trainingData);
                V ← evaluate combined ELA on testData;  /* obtain the total vote received by each class (FP and NFP) */
                predictorPerformance ← choose the class that receives the highest total V as the final classification;
            endfor
        endfor
    endfor
endfor

IV. EXPERIMENTAL DESIGN

During the experiments in this study, performance evaluation is carried out by running a 10-fold cross-validation [7]. First, we rank the attributes using the IG FS technique for each dataset independently. After ranking the attributes, following the recommendation from the literature [13], we select the top ⌈log2 n⌉ attributes (n is the total number of independent features). The class attribute is then included to yield the final datasets.

The performances of the ensemble techniques are evaluated against each other for each dataset. The employed algorithms are run using WEKA version 3.8 [27]. The final comparison is based on the AUC measure.

The proposed ensemble based combined framework is implemented step by step following the two strategies and is evaluated using eight publicly available software defect datasets; the experimental procedure is summarized in the pseudo-code above. The selected eight datasets are the latest, cleaned versions from the publicly accessible PROMISE repository of NASA software projects [1]. These datasets are popular in SFP studies and have been used by many studies [29]. Table I summarizes the main characteristics of the selected datasets. All the datasets consist of McCabe and Halstead Static Code Metrics. Note that data imbalance is consistently observed in all datasets. The value of using static code metrics to build SFP models has been empirically illustrated by Menzies et al. [10].

TABLE I. A DESCRIPTION OF DATASETS.

Dataset  #Attr.  #Ins.  #NFP  #FP   %NFP    %FP
ar1      29      121    112   9     92.56%  7.44%
ar4      29      107    87    20    81.31%  18.69%
JM1'     21      9593   7834  1759  81.66%  18.34%
KC2      21      522    415   107   79.50%  20.50%
MC1''    38      1988   1942  46    97.69%  2.31%
MW1'     37      264    237   27    89.77%  10.23%
PC3'     37      1125   985   140   87.56%  12.44%
PC4''    37      1287   1110  177   86.25%  13.75%

V. ANALYSIS AND DISCUSSIONS

This section presents the experimental analysis and discussions based on the objective of the study. The performance comparison in terms of AUC and Accuracy among the prominent ELA combined with the FS technique (in our case IG) and with the data imbalance solution (in our case SMOTE) is presented in the following figures and tables. The results are based on the performance of ELA combined with IG FS, and of ELA combined with both IG and SMOTE.
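The IG computation and top-⌈log2 n⌉ selection in the procedure above can be sketched in Python as follows. This is a simplified illustration of our own assuming discrete attribute values; in the study itself, WEKA's IG evaluator with the Ranker search performs this step on the numeric metrics.

```python
import math
from collections import Counter

def entropy(labels):
    """Info(D) = -sum(p_i * log2(p_i)) over the class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels):
    """InfoGain(A) = Info(D) - Info_A(D) for one attribute's value column."""
    n = len(labels)
    expected = 0.0
    for v in set(values):
        subset = [l for x, l in zip(values, labels) if x == v]
        expected += len(subset) / n * entropy(subset)
    return entropy(labels) - expected

def select_top_features(columns, labels):
    """Rank attributes by IG and keep the top ceil(log2 n) of them."""
    ranked = sorted(columns, key=lambda a: info_gain(columns[a], labels),
                    reverse=True)
    return ranked[:math.ceil(math.log2(len(columns)))]
```

Here `columns` maps each attribute name to its value column; the class attribute would be appended to the selected set afterwards, as in the procedure.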
Figure 2. Comparison of ELA Combined with IG FS using Accuracy, and AUC
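For reference, the two PE criteria used in the comparisons can be computed as follows. This is an illustrative sketch of our own; in the study both measures are obtained from WEKA.

```python
def accuracy(y_true, y_pred):
    """Percentage of modules classified correctly."""
    return 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores, positive="FP"):
    """Area under the ROC curve in its rank-sum (Mann-Whitney) form:
    the probability that a randomly chosen FP module is scored above a
    randomly chosen NFP module, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

Unlike accuracy, AUC is insensitive to the class distribution, which is why it is used for the final comparison on these imbalanced datasets.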

A. Comparison: ELA Performance Combined with IG

The performance comparison of bagging and AdaBoost.M1 using IG FS is given in Figure 2 (a) and (b) and Table II. In terms of both indexes used in this study, AdaBoost.M1 tends to perform lower, and bagging demonstrates the highest values. Thus, the result reflects the better performance of bagging over AdaBoost.M1. The exceptions, out of the eight datasets, are MC1'' and KC2, where AdaBoost.M1 shows better accuracy, as well as a better AUC on the MC1'' dataset. However, considering the average performance over all datasets, bagging still outperforms AdaBoost.M1.

TABLE II. CLASSIFICATION RESULTS OF ELA COMBINED WITH IG

         IGDTBagging        IGDTAdaBoost.M1
Dataset  Accuracy  AUC      Accuracy  AUC
JM1'     81.766    0.720    80.568    0.696
MC1''    97.712    0.800    98.305    0.821
MW1'     88.799    0.678    86.373    0.669
PC3'     86.587    0.803    85.024    0.777
PC4''    88.874    0.908    87.887    0.894
ar1      90.404    0.755    87.929    0.744
ar4      84.545    0.833    80.709    0.794
KC2      81.934    0.833    82.718    0.801
Average  87.58     0.791    86.19     0.775

B. Comparison: ELA Performance Combined with IG and SMOTE

Figure 3 (a) and (b) and Table III give the performance comparison of bagging and AdaBoost.M1 combined with both IG and SMOTE. In terms of both indexes, AdaBoost.M1 again tends to perform lower, and bagging demonstrates the highest values. However, on the MC1'' (97.303, 0.992), MW1' (85.595), PC3' (83.834, 0.911) and PC4'' (90.439, 0.963) datasets, AdaBoost.M1 outperforms the bagging ensemble in both accuracy and AUC (except the AUC on MW1').

TABLE III. CLASSIFICATION RESULTS OF ELA COMBINED WITH BOTH IG AND SMOTE

         SMOTEIGDTBagging   SMOTEIGDTAdaBoost.M1
Dataset  Accuracy  AUC      Accuracy  AUC
JM1'     80.773    0.855    78.926    0.835
MC1''    96.596    0.988    97.303    0.992
MW1'     84.966    0.916    85.595    0.907
PC3'     83.266    0.905    83.834    0.911
PC4''    90.158    0.962    90.439    0.963
ar1      82.092    0.901    81.931    0.896
ar4      77.401    0.854    76.923    0.846
KC2      80.390    0.871    79.078    0.836
Average  84.46     0.907    84.25     0.898

Nevertheless, considering the average

Figure 3. Comparison of ELA Combined with both IG and SMOTE using Accuracy and AUC
performance of all datasets, bagging still outperforms AdaBoost.M1. Thus, the results reflect the better performance of combined bagging, closely followed by combined AdaBoost.M1, on software defect datasets. On the other hand, based on these performance results, we can say that, after resolving the class imbalance problem, AdaBoost.M1 competitively shows good performance on some datasets, which clearly needs further investigation with more datasets from other software metrics.

C. Comparison: IGDTBagging with SMOTEIGDTBagging

As expected, selecting useful features and resolving the class imbalance problem has proved useful and improves ELA performance. In this regard, based on our proposed framework, the experimental results in Sections V(A) and V(B) show the achieved performance improvements, and the most efficiently performing ELA on average is found to be combined bagging in both strategy one and strategy two. Therefore, this section points out the performance improvement achieved by the bagging ELA when combined with IG alone as well as with both IG and SMOTE, using the AUC PE. Accordingly, as shown in Figure 4, the combined bagging algorithm gives better results on all datasets when combined with both IG and SMOTE than when combined with IG only. This affirms the contribution of combined preprocessing, which removes irrelevant and redundant features as well as resolving the class imbalance problem, and its power to improve the performance of ELA.

Figure 4. Comparison of Bagging ELA Combined with IG and both IG and SMOTE using AUC

VI. THREATS TO VALIDITY

There are threats that may have an effect on our experimental results. The proposed prediction models were created without changing the parameter settings, except that the DT algorithm was used with the ensemble techniques, which was not the default in either case. Thus, investigations were not made by changing the default parameter settings to see how the variation affects model performance. In addition, as many software metrics are defined in the literature, different software metrics might be better indicators of the defectiveness of modules; however, we used the static code software metrics available in the selected datasets. Conclusions were drawn based on the important features selected using IG. In terms of the total instances and number of classes, the datasets may not be good representatives, but this practice is common in the fault prediction research area.

VII. CONCLUSION AND FUTURE WORKS

This study made an empirical evaluation of the capability of ELA in predicting FP software modules and compared their performance combined with FS, and with both FS and DB techniques, using eight NASA software defect datasets. Our objective in using FS and DB was that, by combining these filtering techniques with ELA, we would be able to prune non-relevant features and balance the classes, and then learn an ELA that performs better than one learned on the whole feature set and on imbalanced classes. Accordingly, the experimental results reveal that our combined technique assures the performance improvement. Thus, by dealing with the challenges of SFP mentioned in this study, our proposed framework confirms remarkable classification performance and lays a pathway to software quality assurance.

As future work, we plan to explore more ELA, including vote and stacking, and more data preprocessing techniques, with more defect datasets consisting of different software metrics, to realize how the proposed framework helps to identify the more efficient combined ensemble techniques and to improve their classification performance for accurately predicting FP software modules.

ACKNOWLEDGEMENT

This work is supported by the Fundamental Research Funds for the Central Universities (No. 2682015QM02).

REFERENCES

[1] T. Menzies, R. Krishna, and D. Pryor. (2016). The Promise Repository of Empirical Software Engineering Data. Available: http://openscience.us/repo
[2] E. Arisholm, L. C. Briand, and E. B. Johannessen, "A systematic and comprehensive investigation of methods to build and evaluate fault prediction models," Journal of Systems and Software, vol. 83, pp. 2–17, 2010.
[3] K. O. Elish and M. O. Elish, "Predicting defect-prone software modules using support vector machines," Journal of Systems and Software, vol. 81, pp. 649–660, 2008.
[4] I. Gondra, "Applying machine learning to software fault-proneness prediction," Journal of Systems and Software, vol. 81, pp. 186–195, 2008.
[5] T. M. Khoshgoftaar, C. Seiffert, J. V. Hulse, A. Napolitano, and A. Folleco, "Learning with limited minority class data," in the Sixth International Conference on Machine Learning and Applications, Cincinnati, OH, 2007.
[6] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., 2011.
[7] R. Kohavi, "A study of cross-validation and bootstrap for accuracy estimation and model selection," in the International Joint Conference on Artificial Intelligence, 1995.
[8] I. H. Laradji, M. Alshayeb, and L. Ghouti, "Software defect prediction using ensemble learning on selected features," Information and Software Technology, vol. 58, pp. 388–402, 2015.
[9] R. Malhotra, "A systematic review of machine learning techniques for software fault prediction," Applied Soft Computing, vol. 27, pp. 504–518, 2015.
[10] T. Menzies, J. Greenwald, and A. Frank, "Data mining static code attributes to learn defect predictors," IEEE Transactions on Software Engineering, vol. 33, pp. 2–13, 2007.
[11] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
[12] S. Shivaji, E. J. Whitehead, R. Akella, and S. Kim, "Reducing features to improve code change-based bug prediction," IEEE Transactions on Software Engineering, vol. 39, pp. 552–569, 2013.
[13] H. Wang, T. M. Khoshgoftaar, and A. Napolitano, "A comparative study of ensemble feature selection techniques for software defect prediction," in the Ninth International Conference on Machine Learning and Applications, IEEE, Washington, DC, 2010.
[14] E. Frank, M. A. Hall, and I. H. Witten, The WEKA Workbench. Online Appendix for "Data Mining: Practical Machine Learning Tools and Techniques," 4th ed. Morgan Kaufmann, 2016.
[15] A. Shanthini and R. M. Chandrasekaran, "Analyzing the effect of bagged ensemble approach for software fault prediction in class level and package level metrics," in the IEEE International Conference on Information Communication and Embedded Systems (ICICES), India, 2014.
[16] M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera, "A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches," IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 42, pp. 463–484, 2012.
[17] S. K. Mathanker, P. R. Weckler, T. J. Bowser, N. Wang, and N. O. Maness, "AdaBoost classifiers for pecan defect classification," Computers and Electronics in Agriculture, vol. 77, pp. 60–68, 2011.
[18] T. M. Khoshgoftaar, K. Gao, and A. Napolitano, "Improving software quality estimation by combining feature selection strategies with sampled ensemble learning," in the IEEE 15th International Conference on Information Reuse and Integration (IRI), San Francisco, California, USA, 2014.
[19] D. Radjenovic, M. Hericko, R. Torkar, and A. Zivkovic, "Software fault prediction metrics: A systematic literature review," Information and Software Technology, vol. 55, pp. 1397–1418, 2013.
[20] H. Liu, H. Motoda, and L. Yu, "A selective sampling approach to active feature selection," Artificial Intelligence, vol. 159, pp. 49–74, 2004.
[21] S. Liu, X. Chen, W. Liu, J. Chen, Q. Gu, and D. Chen, "FECAR: A feature selection framework for software defect prediction," in the 38th Annual International Computers, Software and Applications Conference, Vasteras, 2014.
[22] T. J. McCabe, "A complexity measure," IEEE Transactions on Software Engineering, vol. SE-2, pp. 308–320, 1976.
[23] V. García, J. S. Sánchez, and R. A. Mollineda, "On the effectiveness of preprocessing methods when dealing with different levels of class imbalance," Knowledge-Based Systems, vol. 25, pp. 13–21, 2012.
[24] H. He and E. A. Garcia, "Learning from imbalanced data," IEEE Transactions on Knowledge and Data Engineering, vol. 21, pp. 1263–1284, 2009.
[25] P. Sarakit, T. Theeramunkong, and C. Haruechaiyasak, "Improving emotion classification in imbalanced YouTube dataset using SMOTE algorithm," in the 2nd International Conference on Advanced Informatics: Concepts, Theory and Applications, Chonburi, 2015.
[26] W. Y. Chubato and T. Li, "A combined-learning based framework for improved software fault prediction," International Journal of Computational Intelligence Systems, vol. 10, pp. 647–662, 2017.
[27] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software: an update," SIGKDD Explorations, retrieved 01 Sep. 2017.
[28] C. Catal, "Software fault prediction: A literature review and current trends," Expert Systems with Applications, vol. 38, pp. 4626–4636, 2011.
[29] T. Hall, S. Beecham, D. Bowes, D. Gray, and S. Counsell, "A systematic literature review on fault prediction performance in software engineering," IEEE Transactions on Software Engineering, vol. 38, pp. 1276–1304, 2012.
[30] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, pp. 123–140, 1996.
[31] Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm," in the Thirteenth International Conference on Machine Learning, San Francisco, 1996, pp. 148–156.
[32] R. Polikar, "Ensemble learning," Scholarpedia, 2009.
[33] F. Provost and T. Fawcett, "Robust classification for imprecise environments," Machine Learning, vol. 42, pp. 203–231, 2001.
[34] C. Catal, "Performance evaluation metrics for software fault prediction studies," Acta Polytechnica Hungarica, vol. 9, pp. 193–206, 2012.
[35] R. Polikar, "Ensemble based systems in decision making," IEEE Circuits and Systems Magazine, vol. 6, pp. 21–45, 2006.
