Software Fault Prediction Metrics: A Systematic Literature Review

Title: Unraveling Software Fault Prediction Metrics: A Systematic Literature Review

In the ever-evolving landscape of software development, ensuring the reliability and quality of
software systems remains paramount. Among the myriad of methodologies employed, software fault
prediction metrics stand out as a crucial tool for preemptively identifying potential flaws within
software systems. However, navigating the vast array of available metrics and understanding their
implications can be a daunting task for researchers and practitioners alike.

Undoubtedly, crafting a comprehensive literature review on software fault prediction metrics poses
its own set of challenges. From scouring through numerous academic databases to critically
evaluating the relevance and reliability of selected studies, the process demands meticulous attention
to detail and a nuanced understanding of the subject matter.

One of the primary obstacles encountered in this endeavor is the sheer volume of literature available.
With a multitude of research papers, journal articles, and conference proceedings addressing various
aspects of software fault prediction metrics, synthesizing the existing body of knowledge into a
cohesive narrative requires a significant investment of time and effort.

Moreover, the complexity inherent in the subject matter itself adds another layer of difficulty.
Software fault prediction metrics encompass a diverse range of methodologies, from
statistical techniques to machine learning algorithms. As such, comprehending the
intricacies of each metric and discerning their applicability in real-world scenarios necessitates a deep
understanding of both software engineering principles and statistical analysis.

In light of these challenges, it's no wonder that many researchers and practitioners find themselves
overwhelmed when attempting to navigate the labyrinthine landscape of software fault prediction
metrics. However, amidst the sea of complexity, there exists a beacon of hope for those seeking
clarity and guidance in their endeavors.

At StudyHub.vip, we understand the challenges associated with crafting a literature review on
software fault prediction metrics. That's why we offer comprehensive writing services tailored to
meet your specific needs. Our team of experienced writers possesses the expertise and insights
necessary to distill complex research findings into concise and coherent narratives.

Whether you're looking to explore the latest advancements in software fault prediction metrics or
seeking guidance on selecting the most appropriate methodologies for your research,
StudyHub.vip is here to assist you every step of the way. With our commitment to quality,
reliability, and customer satisfaction, you can trust us to deliver exceptional results that exceed your
expectations.

Don't let the complexities of software fault prediction metrics overwhelm you. Let StudyHub.vip
be your trusted partner in navigating the intricacies of this dynamic field. Contact us today to
learn more about our services and discover how we can help you achieve your academic and
professional goals.
The SEs that obtained High and Very High scores were considered for the mapping. (Figure: distribution of DeP studies on a year-by-year basis.) In this paper, we perform a systematic mapping to analyze all the software defect prediction literature available from 1995 to 2018 using a multi-stage process. In the 98 primary studies selected for this mapping, 12 statistical tests have been used by 18 studies (Figure 10). Once all the primary studies have been obtained, carefully designed inclusion-exclusion criteria are applied to the resultant set in order to eliminate entities that do not match the objectives of the mapping. Conflicts of Interest: The authors declare that they do not have any conflict of interest. Funding: This research received no external funding. Table 4 summarizes the percentage of SEs and the number of SEs in each of the four categories. The Friedman test has been used in three studies and the t-test has been used in only one study. Exclusion criteria include studies that do not use DeP as the dependent variable. Severity is a basic component of any defect detected in a software system. QQ16: Does the abstract provide sufficient information about the content of the study? Hence, a study can have a maximum of 18 points and a minimum of 0 points.
These trained models can then be used to identify defect-prone parts of running projects. The literature hence obtained was processed further using carefully designed inclusion-exclusion criteria and quality analysis criteria, as described in the following sections. The scores obtained by SEs in the quality analysis step are separated into four groups: Low, Average, High, and Very High.
Once the questions have been identified and evaluated, a search query is designed, which is used to extract studies from digital libraries. To test the effectiveness of such a model and to simulate this real-life application scenario, cross-project defect prediction models are constructed. RQ5: What is the performance of various learning techniques across datasets? It should be noted that while the number of studies that apply multicollinearity analysis is only seven (7.14%), the number of studies that apply feature sub-selection is 44 (50%).
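To make this gap concrete, multicollinearity screening of metric data can be scripted in a few lines. Below is a minimal sketch that computes variance inflation factors (VIFs) with plain NumPy; the metric names, the synthetic data, and the 10.0 cutoff are illustrative assumptions, not values drawn from the mapped studies.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of a design matrix X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all remaining columns (ordinary least squares).
    """
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        vifs.append(1.0 / max(1.0 - r2, 1e-12))     # guard against R^2 == 1
    return np.array(vifs)

# Hypothetical module-level metrics: LOC, WMC, CBO, RFC.
rng = np.random.default_rng(0)
loc = rng.normal(200, 50, 100)
wmc = 0.05 * loc + rng.normal(0, 0.5, 100)          # strongly tied to LOC
cbo = rng.normal(10, 3, 100)
rfc = rng.normal(30, 8, 100)
X = np.column_stack([loc, wmc, cbo, rfc])

for name, v in zip(["LOC", "WMC", "CBO", "RFC"], vif(X)):
    flag = "  <- collinear" if v > 10.0 else ""     # common rule-of-thumb cutoff
    print(f"{name}: VIF = {v:.2f}{flag}")
```

A metric whose VIF exceeds the chosen cutoff is largely explained by the other metrics and is a candidate for removal before model building.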
The proposed model predicts
the fault density at the end of each phase of software development using relevant software metrics.
The Eclipse dataset has been used by 10 studies, while Apache datasets are used by four studies. (Table: quality scores of primary studies obtaining at least 13 points.) The findings of this research are useful not only to the software engineering domain but also to empirical studies that focus on symmetry, as they provide step-by-step solutions to the questions raised in the article. Achieving reliability is a necessity in today's globally competitive market. RQ9: This question does not use any diagramming method.

3. Results and Discussion

3.1. Description of SEs

This study is very useful for developers. FS can reduce the dimensionality of data by removing irrelevant and redundant features. One of the most
important goals of such techniques is to accurately predict the modules where faults are likely to
hide as early as possible in the development lifecycle. If two studies by the same author(s) exist,
where one is an extension of the previous work, the former is discarded. RQ4: The fourth question deals with the performance measures and statistical tests used in DeP studies. The most frequently used technique for feature selection is correlation-based feature selection (CFS); 19 studies make use of it.
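CFS scores a candidate subset by rewarding features that correlate with the class while penalizing features that correlate with each other. The sketch below is a simplified greedy version using Pearson correlation on synthetic data; Hall's original CFS uses symmetrical uncertainty on discretized attributes, so treat this as an approximation of the idea rather than the exact algorithm.

```python
import numpy as np

def merit(X, y, subset):
    """CFS-style merit: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    # average absolute feature-class correlation
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # average absolute feature-feature correlation over all pairs
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return (k * r_cf) / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(X, y):
    """Greedy forward search: add the feature that improves merit most."""
    remaining = list(range(X.shape[1]))
    selected, best = [], -np.inf
    while remaining:
        score, j = max((merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return selected

# Toy data: 5 metrics, binary defect label (entirely hypothetical).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.8 * X[:, 2] + rng.normal(0, 1, 200) > 0).astype(float)
print("selected feature indices:", cfs_forward(X, y))
```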
The resultant data was converted to an Excel (.xlsx) workbook for usage during the data synthesis process. Here, we analyze how many studies have made use of statistical tests in their evaluation of the results. Out of eight techniques, only three have been used more than once.
It is necessary that researchers explore the applicability of DeP models specifically to security-related defects and vulnerabilities. The approach of Bayesian Belief Nets is discussed at the end of the paper as a candidate for improved decision making in the industry whenever uncertainty reigns. After CFS, the most used technique is Principal Component Analysis (PCA).
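As an illustration of PCA as a pre-processing step, the following sketch standardizes a hypothetical metric matrix and keeps the smallest number of components explaining 95% of the variance; the 0.95 target is an assumed choice, not one reported by the mapped studies.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 10))          # hypothetical module metrics

# Standardize first: PCA is sensitive to the scale of each metric.
X_std = StandardScaler().fit_transform(X)

# Keep the smallest number of components explaining 95% of the variance.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_std)

print("components kept:", pca.n_components_)
print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))
```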
Evolutionary Programming (EP), Evolutionary Subgroup Discovery (ESD), Genetic Algorithm (GA), Gene Expression Programming (GeP), and Particle Swarm Optimization (PSO) have each been studied only once. In the last five years, considerable work has been done on evaluating the applicability of these techniques to the defect prediction problem. Figure 11 provides a detailed description of the performance analysis. We try to cover every aspect of security-related defects in our response to this question. The final results reported by the prediction model are affected by such factors.
It shows that the research in DeP picked up in the year 2005, when five studies were conducted. This mapping found that a few search-based techniques, like AIRS and GP, have good performance in predicting defects, but the number of studies that support this finding is very small. Most of the studies made use of public-domain data and standard performance measures like accuracy, error rate, precision, recall, and ROC analysis.
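All of these measures are straightforward to compute; the snippet below evaluates them with scikit-learn for a hypothetical set of module-level predictions (error rate is simply one minus accuracy).

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

# Hypothetical ground truth and model output for 10 modules
# (1 = defect-prone, 0 = clean).
y_true  = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred  = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3, 0.2, 0.95]

acc = accuracy_score(y_true, y_pred)
print("accuracy  :", acc)
print("error rate:", 1 - acc)
print("precision :", precision_score(y_true, y_pred))
print("recall    :", recall_score(y_true, y_pred))
print("ROC area  :", roc_auc_score(y_true, y_score))  # needs scores, not labels
```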
3.1.1. Quality Assessment Questions

Table 3 shows the results of the quality analysis. QQ7: Is the data collection procedure clearly defined?
Studies have not made much use of multicollinearity analysis techniques, and only half of the studies selected for mapping have used feature sub-selection techniques. Since the number of studies is limited, tables are used to summarize which study uses what search-based technique. We have selected the following digital libraries to perform our search: IEEE Xplore, Springer Link, Science Direct, Wiley Online Library, ACM Digital Library, and Google Scholar. The Search String: A search string is the combination of characters and words entered by a user into a search engine to find desired results.
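The paper's exact query is not reproduced here, but a search string for this topic typically combines synonyms with Boolean operators, along these illustrative lines:

```
("software defect" OR "software fault" OR "fault-prone") AND
("prediction" OR "prediction model") AND ("metric" OR "metrics")
```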
When a defect is detected in a software system, the severity assigned to it helps project managers identify the level of impact it can have on the system. Notably, only one out of three studies used statistical tests.

3.9. RQ8: How Many DeP Studies Attempt to Identify Defects of Varying Severity Levels?

DeP models are built using two approaches: first, by
using measurable properties of the software system called Software Metrics and second, by using
fault data from a similar software project. Details of the techniques used to answer the selected research questions are given below. RQ1: To answer this question, we use a bar chart that shows the number of studies using machine learning, search-based, statistical, and threshold-based techniques. Factors such
as researcher bias, selection of test parameters, etc., affect the performance of a technique greatly.
These account for 77 out of the selected 98 studies. All studies were examined independently on the basis of the criteria defined in the inclusion-exclusion criteria. Both studies have reported the prediction accuracy of the resultant models. For organizations looking to start new projects, it is important that they can make use of training data from existing DeP models. Application of the inclusion-exclusion criteria resulted in 98 studies out of the total 156 being selected for quality analysis. On the basis of the fault density at the end of the testing phase, the total number of faults in the software is predicted.
The following future guidelines are provided on the basis of the results of this study: datasets used for DeP studies should undergo thorough pre-processing that includes multicollinearity analysis and feature sub-selection. For the final analysis, we imposed the following restriction: only those techniques were selected that were used in five or more studies. This is a very small number and reflects the fact that very little emphasis has been laid on the analysis of multicollinearity among data elements present in the datasets used for DeP studies in the past.

3.3.2. What are Different Techniques Used for Feature Sub-Selection in DeP Studies?

The objective of this constraint is to eliminate those datasets that have been used only once and that could, if included in the analysis, severely distort the overall performance figures of a learning algorithm. (Figure: feature sub-selection procedures applied to DeP studies.)
We explored each aspect of the process, ranging from data collection, data pre-processing, and the techniques used to build DeP models, to the measures used to evaluate model performance and the statistical evaluation schemes used to mathematically validate the results of a DeP model. Comparison of the performance of each technique pair across all identified datasets. Estimate the effectiveness of search-based techniques for DeP. To identify lacking areas and facilitate the building and benchmarking of new DeP models, it is important to study the previous works in a systematic manner and extract meaningful information. Software measurement is a way to track the process. Estimate the number of studies that work on defects of various severity levels. The first part is concerned with multicollinearity analysis techniques and the second part deals with the feature sub-selection techniques used. A software
quality metric highlights the quality aspects of product, process, and project. Such tests are used to evaluate the difference between two samples (for defect prediction studies, the sample is the result(s) generated by an algorithm). The analysis of the 98 primary studies suggests that k-sample tests have been rarely used in the literature. Techniques like C4.5, J48, CART, and Random Forest come under the decision tree class.
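Where k-sample tests do appear, the Friedman test is the standard choice for comparing several techniques over the same datasets, optionally followed by paired two-sample tests. A minimal SciPy sketch with made-up AUC values:

```python
from scipy import stats

# Hypothetical ROC-AUC of three techniques on the same five datasets.
auc_rf  = [0.82, 0.79, 0.85, 0.74, 0.88]   # random forest
auc_svm = [0.80, 0.75, 0.83, 0.70, 0.86]   # support vector machine
auc_nb  = [0.76, 0.73, 0.78, 0.69, 0.81]   # naive Bayes

# H0: all techniques perform the same across datasets.
stat, p = stats.friedmanchisquare(auc_rf, auc_svm, auc_nb)
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")

# A paired two-sample follow-up (here: Wilcoxon signed-rank) can then
# compare individual technique pairs if H0 is rejected.
stat, p = stats.wilcoxon(auc_rf, auc_nb)
print(f"Wilcoxon statistic = {stat:.3f}, p = {p:.4f}")
```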
Other learning techniques, as can be seen in the graph, had an average performance. To provide a strong mapping, it is essential that the selection of primary studies for mapping be done carefully. The focus of their study is on model building methods, metrics used, and datasets used. Identification of metrics that are found to be relevant for DeP. RQ3.3: Which metrics are found to be insignificant predictors for DeP? It shows the percentage and number of SEs that Agree with, Disagree with, or stand Neutral towards each quality question. In this question,
we deal with various questions relating to the data used in a DeP model. A total of 22 studies obtained at least 13 points in the quality analysis process. From Figure 9 it is clear that many of these popular performance measures have been used in DeP studies. Inclusion Criteria: Empirical studies for DeP using software metrics. Identification of metrics that are found to be irrelevant for DeP. RQ4: What are the methods used for performance evaluation in DeP models? The ability of a model to learn from data that does not come from the same project or organization will help organizations that do not have sufficient training data or are going to start work on new projects. The focus is on independent variables and techniques used to build models. It is evident that search-based
techniques have a good predictive capability, and this domain should be explored further. The relation given below summarizes the comparison procedure. The average, maximum, and minimum ROC area for the above learning techniques are given in Table 10.

3.7. RQ6: What Is the Performance of Search-Based Techniques in Defect Prediction?

Only seven out of 98 studies have stated in their work that they make use of multicollinearity procedures.
In this part, we identify the metrics that are found to be significant predictors of software defects. A DeP model that can estimate the severity of a possible defect is very helpful to practitioners, because it makes them aware of the modules that can have defects of high, moderate, and low severity. Ideally, the order of action should be to test modules with high-severity defects first, then modules with moderate-severity defects, and finally those modules with low-severity defects. Also, the results of QQ5, QQ10, QQ11, and QQ17 are well distributed
on the positive and negative ends. This ensures that only those techniques are evaluated which have been widely used in the literature. RQ3: This research question deals with the data used
in SEs. ROC area, accuracy, and precision have been used in 28, 25, and 27 studies, respectively.

3.5.2. What are Different Statistical Tests of Significance Used in DeP Studies?

Regression, discriminant analysis, and threshold-based classification have been performed in 35, 8, and 4 studies, respectively.
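Threshold-based classification, the least used of the three, simply flags a module as fault-prone when a metric exceeds a cutoff. The sketch below derives a hypothetical WMC threshold from ROC analysis (Youden's J); the data and the choice of WMC are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(3)
wmc = np.concatenate([rng.normal(10, 3, 80),      # clean modules
                      rng.normal(18, 4, 20)])     # defective modules
y = np.array([0] * 80 + [1] * 20)

# Pick the threshold that maximizes Youden's J = TPR - FPR.
fpr, tpr, thresholds = roc_curve(y, wmc)
best = thresholds[np.argmax(tpr - fpr)]
print(f"derived WMC threshold: {best:.1f}")

# Threshold-based classifier: fault-prone iff WMC exceeds the threshold.
y_pred = (wmc > best).astype(int)
print("flagged fault-prone:", y_pred.sum(), "modules")
```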
Cross-project defect prediction means that the DeP model is trained on one set of data and validated on data obtained from different software. To ensure that all the primary studies that our mapping plans to address are covered, we need to be careful in the selection and placement of keywords used in the search string. The question related to the identification of security-related defects has not been
discussed in any systematic mapping to the best of the authors' knowledge. In machine learning, feature selection entails selecting a subset of relevant
features (i.e., predictors) to be used in a particular model. Various papers have been studied, and the various proposed methods have been analyzed for their possible shortcomings and enhancements. It has been widely recognized that the inheritance metrics are the weakest predictors of
defects. In this section, we elaborate
on the distribution of model building techniques used in DeP studies.
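As a concrete instance of the dominant model-building style, the sketch below trains a random forest (one of the decision-tree-family techniques noted earlier) on synthetic module metrics and reports ROC area; the column semantics are hypothetical stand-ins for a PROMISE-style dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical module-level data: [LOC, WMC, CBO, RFC] per row.
rng = np.random.default_rng(4)
X = rng.normal(size=(500, 4))
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 1, 500) > 1).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_tr, y_tr)

scores = model.predict_proba(X_te)[:, 1]           # P(defect-prone)
print("ROC area:", round(roc_auc_score(y_te, scores), 3))
```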
This paper describes a novel software reliability growth model based on a non-homogeneous Poisson process that allows for imperfect debugging. Design, code, and most recently, requirements metrics have been successfully used for predicting fault-prone modules. The number of studies that use statistical tests for model comparison is also limited, at 23 studies. A software reliability growth model is used to estimate reliability through a mathematical expression and is also used to interpret software failures as a random process. For this, a new approach is discussed to develop a fuzzy profile of software metrics that are more relevant for software fault prediction.
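The NHPP idea can be illustrated with the classic Goel-Okumoto mean value function m(t) = a(1 - e^(-bt)) fitted to cumulative fault counts; imperfect-debugging variants add further parameters, which this simplified sketch omits, and the fault counts below are made up.

```python
import numpy as np
from scipy.optimize import curve_fit

def goel_okumoto(t, a, b):
    """NHPP mean value function: expected cumulative faults by time t."""
    return a * (1.0 - np.exp(-b * t))

# Hypothetical cumulative fault counts per test week.
t = np.arange(1, 11)
faults = np.array([12, 22, 30, 36, 41, 45, 48, 50, 52, 53])

(a, b), _ = curve_fit(goel_okumoto, t, faults, p0=(60.0, 0.2))
print(f"estimated total faults a = {a:.1f}, detection rate b = {b:.3f}")
print(f"predicted residual faults after week 10: {a - goel_okumoto(10, a, b):.1f}")
```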
Both support vector machine and neural network have been used in 21 studies.

3.3. RQ2: What Are the Different Data Pre-Processing Techniques Used in DeP Models?

Based on the above facts, we strongly suggest that statistical tests should be
part of all research studies that involve large datasets. Estimate the effectiveness of learning techniques for DeP. Data Extraction: To extract meaningful information from each study such that the research questions can be answered, the data extraction method should be objectively defined. The complexity metric WMC is also reported as significant by 15 studies. Since DeP studies are heavily dependent on data and their primary aim is to identify metrics
that are relevant to DeP, the result of our analysis towards RQ2 is not very encouraging. Essentially, this is what makes a DeP model applicable in a real-life situation. Out of the total 98 studies, only four studies (SE20, SE41, SE50, and SE62) have attempted to build DeP models that identify multiple severity levels.

3.10. RQ9: How Many DeP Studies Perform Cross-Project DeP?

Some studies have reported other metrics, like LCOM3 and LCOM, as weak predictors
of defects. The review committee filled in the data extraction card, and any conflicts raised during the data extraction process were resolved by taking suggestions from other researchers. Finally, we assess the performance of 10 learning techniques over 11 datasets (KC1 is used at the method level as well as the class level). Only six studies (SE46, SE54, SE56, SE84, SE85, and SE96) of the total 98 studies have attempted to build cross-project DeP models. RQ2: What are the different data pre-processing techniques used in DeP models?
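In code, cross-project DeP amounts to fitting on one project's metric rows and evaluating on a different project's rows, sharing nothing between the two. A minimal sketch under that assumption, with entirely synthetic "projects":

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler

def make_project(seed, n, shift):
    """Hypothetical project: 4 metrics per module plus a defect label."""
    rng = np.random.default_rng(seed)
    X = rng.normal(loc=shift, size=(n, 4))
    y = (X[:, 0] + X[:, 2] + rng.normal(0, 1, n) > 2 * shift).astype(int)
    return X, y

X_src, y_src = make_project(0, 400, shift=1.0)   # training project
X_tgt, y_tgt = make_project(1, 300, shift=1.5)   # different target project

# Scale with *source* statistics only: the target project is unseen.
scaler = StandardScaler().fit(X_src)
model = LogisticRegression(max_iter=1000).fit(scaler.transform(X_src), y_src)

scores = model.predict_proba(scaler.transform(X_tgt))[:, 1]
print("cross-project ROC area:", round(roc_auc_score(y_tgt, scores), 3))
```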
