
Software maintainability prediction by data mining of software code metrics

Arvinder Kaur, Kamaldeep Kaur, Kaushal Pathak
Computer Science Department, USICT, GGS Indraprastha University, Delhi, India
arvinderkaurtakkar@yahoo.com, kdkaur99@gmail.com, kkaushal1991@gmail.com

Abstract— Software maintainability is a key quality attribute that determines the success of a software product. Since software maintainability is an important attribute of software quality, accurate prediction of it can help to improve overall software quality. This paper utilizes data mining of some new predictor metrics, apart from traditionally used software metrics, for predicting the maintainability of software systems. The prediction models are constructed using static code metric datasets of four different open source software (OSS) systems: Lucene, JHotdraw, JEdit, and JTreeview. Lucene contains 385 classes and 135241 lines of code (LOC), JHotdraw contains 159 classes and 21802 LOC, JEdit contains 275 classes and 104053 LOC, and JTreeview contains 60 classes and 11988 LOC. The metrics were collected using two different metric extraction tools, the Chidamber and Kemerer Java Metrics (CKJM) tool and IntelliJ IDEA. Naïve Bayes, Bayes Network, Logistic, Multilayer Perceptron and Random Forest classifiers are used to identify the software modules that are difficult to maintain. Random Forest models are found to be the most useful for software maintainability prediction by data mining of software code metrics, as they have higher recall, precision and area under the ROC curve (AUC).

Keywords— Data mining; software code metrics; software maintainability prediction

I. INTRODUCTION

Present day software systems have high complexity, and as the size of a software system increases it becomes more difficult to maintain, so the overall cost of the software increases. The IEEE standard glossary of software engineering defines maintainability as "The ease with which a software system or component can be modified to correct faults, improve performance or other attributes, or adapt to a changed environment" [1]. Maintainability is a highly significant factor in the economic success of software products. It is an important quality attribute, but it is difficult to estimate, because it involves making predictions about future changes that will occur in a software module once it has been deployed. A software metrics based maintainability prediction model helps an organization to predict the probability of change of software modules, and thus supports better management of a software system during the maintenance phase. If the maintainability prediction model is accurate, design corrections can be adopted, which in turn helps to reduce future maintenance effort [2].

Programming in an object-oriented (OO) system is different from a non-OO system, due to concepts that are specific to the OO paradigm such as classes, objects, inheritance and encapsulation. Because of this difference, well known non-OO effort prediction models cannot be applied to OO software effort prediction. Therefore a number of software metrics have been proposed to predict the maintainability of OO systems, such as the Li and Henry (L&H) [3] metrics and the Chidamber and Kemerer (C&K) metrics [4]. In this paper, maintainability is measured as the number of changes (CHANGE) made to the source code during the maintenance phase. A change can be an addition, deletion or modification of a line of code. Addition of a line is counted as one change; deletion of a line is also counted as one change; while modification of a line is counted as two changes (one for the addition and one for the deletion).

There are some previous studies on software maintainability prediction that mine the software metrics proposed by Li and Henry [3] and Chidamber and Kemerer [4]. The drawback of these studies is that they mine a small and limited set of metrics. In this study we perform data mining of an enhanced set of 25 metrics for software maintainability prediction. To the best of our knowledge, such a large metrics set has not been considered in previous research. This paper first selects an enhanced set of software metrics that are significant predictors of maintainability, using static metric datasets of four different open source software (OSS) systems, viz. Lucene, JHotdraw, JEdit and JTreeview, by calculating the Pearson correlation coefficient between these metrics and CHANGE. It then builds OO software maintainability prediction models using five data mining classifiers - Naïve Bayes, Bayes Network, Logistic, Multilayer Perceptron and Random Forest - from the data mining toolkit WEKA [5].

The remainder of this paper is organized as follows. Section II presents related work in this area. Section III describes the OO software metric datasets, the correlation analysis and the sampling method used in this study. Section IV describes the data mining classifiers used in this study. Section V presents the results and discussion. Section VI presents threats to validity of this work. Finally, Section VII presents conclusions and directions for future work.
II. RELATED WORK

A number of research studies have used linear regression models to build software maintainability prediction models. For example, multiple linear regression (MLR) models were used by Fioravanti and Nesi [6], De Lucia et al. [7] and Li and Henry [3] to predict maintainability.

Some machine learning algorithms have also been applied to predicting software maintainability. van Koten and Gray [8] used a Bayesian network to predict software maintainability. Thwin and Quah [9] used neural networks to build object oriented software prediction models. Many researchers [8,9,10,11,12,13,14,15] have used the two datasets UIMS and QUES proposed by Li and Henry [3] for model building and evaluation. In the last decade some new machine learning approaches have also been proposed and evaluated. Aggarwal et al. [16] suggested the use of a fuzzy model; Kaur et al. [14] suggested the use of soft computing approaches such as ANN (Artificial Neural Network), FIS (Fuzzy Inference System) and ANFIS (Adaptive Neuro Fuzzy Inference System). Elish et al. [13] used TreeNet and showed that it can be used for predicting maintainability and provides competitive results when compared to other models. Ping [17] used a Hidden Markov Model (HMM) to define a health index of a product and suggested that it can be used to weight a maintenance process over a period of time. In this study we contribute to software maintainability prediction by data mining an enhanced set of 25 software metrics of open source systems, as compared to the 8 or 10 software metrics used in previous studies [8,9,12,13,14,15].

III. OO SOFTWARE DATASETS

A. Characteristics of Datasets

This paper uses OO software datasets that are extracted by two metric extraction tools, namely Chidamber and Kemerer Java Metrics (CKJM) [18] and IntelliJ IDEA [19]. These metric datasets are collected from a total of 879 classes in four open source OO software systems: Lucene, JHotdraw, JEdit and JTreeview. The code is written in the Java language. The Lucene dataset contains 385 classes, JHotdraw contains 159 classes, JEdit contains 275 classes and JTreeview contains 60 classes. Maintainability was measured as the CHANGE metric by counting the number of additions, deletions or modifications of lines of code. Addition of a line is counted as one change; deletion of a line is also counted as one change; while modification of a line is counted as two changes (one for the addition and one for the deletion). The description of each metric is shown in Table I and Table II. Table I shows the metrics that are chosen from the existing OO software dataset published by Li and Henry [3]. Table II presents the additional set of metrics used in this study for prediction of maintainability.
this study that can be used for prediction of maintainability. Number of Query defined as a method that returns a value.
[19]
(Query) Constructors and inherited methods are
not counted for the purposes of this
metric.
This metric reports the Halstead software a class or interface depends on.
Difficulty metric for a class. The Classes with large level orders may be
Halsted Difficulty
Halstead Difficulty is intended to [19] difficult to test and maintain.
metric (D)
correspond to the level of difficulty of This metric reports the number of
Number of transitive
understanding a class. classes or interfaces which each class [19]
dependencies(Dcy*)
This metric reports the number of directly or indirectly depends on.
Number of
classes or interfaces which each class [19] This metric reports the level order of a
dependencies (Dcy)
directly depends on. class. The level order is defined as 0 for
The coupling between object classes which are dependent on no other
classes (CBO) metric represents the project classes. Conceptually, level
Coupling between
number of classes coupled to a given [18] order measure how many "layers" of
Objects (CBO)
class (efferent couplings and afferent software a class or interface depends on.
Level order (Level*) [19]
couplings). Classes with large level orders may be
A class's efferent coupling is a measure difficult to test and maintain. In designs
of how many other classes is used by the with large cyclic dependencies, level
Efferent Coupling
specific class. Coupling has the same [18] order may give an unreasonably rosy
(Ce )
definition in context of Ce as that used picture of how many layers of software
for calculating CBO. depends on.
Number of attributes This metric reports the total number of This metric reports the number of inner
[19]
added (NAA) attributes (or fields) added by this class. Number of inner classes or interface which each class
[19]
This metric reports the Halstead Effort classes (Inner*) contains. Anonymous inner classes are
metric for a class. The Halstead Effort is not counted for purposes of this metric.
Halsted Effort (E) [19]
intended to correspond to the level of This metric reports the total number of
effort necessary to understand a class. operations (or methods) and attributes
Class Size
This metric reports the number of (or fields) for each class. Static fields
(Operations + [19]
Number of commands for each class. A query is and methods inherited from super
Attributes) (CSOA)
Commands defined as a method that returns void. [19] classes are not counted for purposes of
(Command) Inherited methods are not counted for this metric.
the purposes of this metric. This metric reports the total number of
This metric reports the number of lines Class Size attributes (or fields) for each class. Static
[19]
of code in each class which contain (Attributes) (CSA) fields inherited from super classes are
comments. Anonymous inner classes are not counted for purposes of this metric.
Comments Lines of included in their containing class for This metric reports the number of
[19] Number of
Code (CLOC) purposes of this metric, while named classes or interfaces which directly
dependents (Dpt)
inner classes are evaluated separately. depend on each class.
Whitespace lines are not counted for This metric reports the total number of
purposes of this metric. operations (or methods) for each class.
Class Size
This metric reports the total number of Static methods inherited from super [19]
Number of operations (operations) (CSO)
operations (or methods) overridden by [19] classes are not counted for purposes of
overridden (NOOC)
this classb. this metric.
The NPM metric simply counts all the This metric reports the average
Average cyclomatic
Number of Public methods in a class that are declared as cyclomatic complexity of the non- [18]
[18] complexity (Avg cc)
methods (NPM) public. It can be used to measure the size abstract methods in each class.
of an API provided by a package A change can be addition, deletion or
This metric reports the number of lines modification of line of code. Addition of
JavaDoc lines of Number of Line of
of code in each class which contain [19] a line is counted as one; Deletion of line
Code (JLOC) code changed
javadoc comments. is also counted as one; while the
(CHANGE)
This metric reports the total number of modification of line is counted as two
operations (or methods) added by this (one for addition and other for deletion)
Number of class. Methods which are inherited from
Operations added super classes are not counted for [19]
(NOAC) purposes of this metric, nor are methods
which override (non-abstract) inherited
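The Halstead measures listed in Table II are all derived from operator and operand counts, but only the Volume formula is stated explicitly in the paper. The sketch below shows one common formulation of these measures; the exact formulas implemented by IntelliJ IDEA are an assumption here, not something the paper specifies.

```java
/**
 * One common formulation of the Halstead measures listed in Table II,
 * computed from distinct/total operator and operand counts.
 * The exact formulas used by IntelliJ IDEA are assumed, not confirmed by the paper.
 */
public final class HalsteadMetrics {

    public static int vocabulary(int n1, int n2) {          // n = n1 + n2 (distinct operators + distinct operands)
        return n1 + n2;
    }

    public static int length(int bigN1, int bigN2) {        // N = N1 + N2 (total operators + total operands)
        return bigN1 + bigN2;
    }

    public static double volume(int length, int vocabulary) {      // V = N * log2(n)
        return length * (Math.log(vocabulary) / Math.log(2));
    }

    public static double difficulty(int n1, int n2, int bigN2) {   // D = (n1 / 2) * (N2 / n2)
        return (n1 / 2.0) * ((double) bigN2 / n2);
    }

    public static double effort(double difficulty, double volume) { // E = D * V
        return difficulty * volume;
    }

    public static double bugs(double volume) {               // B is roughly V / 3000 (one common estimate)
        return volume / 3000.0;
    }
}
```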
For our study, we mine the source code for changes occurring to the source lines of code of the common software modules of two versions of four open source software systems. The four open source systems, along with their versions, are: Lucene 2.0.0 and 2.1.0; JHotdraw 5.2 and 5.3; JEdit 3.1.0 and 3.2.0; and JTreeview 1.03 and 1.1.6. We also mine the software code metrics presented in Table I and Table II.
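Given this counting rule, CHANGE for a class follows directly from a line-level diff of the two versions. A minimal sketch is shown below, assuming the added, deleted and modified line counts have already been obtained from a diff tool (the paper uses the Eclipse compare plug-in, whose output format is not shown here).

```java
/** CHANGE metric for a class, computed from line-level diff counts between two versions. */
public final class ChangeMetric {

    /**
     * @param addedLines    lines present only in the newer version
     * @param deletedLines  lines present only in the older version
     * @param modifiedLines lines present in both versions but with different content
     */
    public static int change(int addedLines, int deletedLines, int modifiedLines) {
        // An addition or a deletion counts as one change;
        // a modification counts as two (one deletion plus one addition).
        return addedLines + deletedLines + 2 * modifiedLines;
    }

    public static void main(String[] args) {
        // Example: 12 added, 4 deleted and 3 modified lines give CHANGE = 12 + 4 + 2 * 3 = 22.
        System.out.println(change(12, 4, 3));
    }
}
```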
B. Data Extraction, Correlation Analysis and Sampling

For model building, we first extracted the metrics of all four OSS systems (Lucene, JHotdraw, JEdit and JTreeview) using CKJM and IntelliJ IDEA, and then merged the metrics calculated by the two tools. Next, we selected each class present in both versions of an OSS system and calculated the CHANGE metric (see Table II) between them using the Eclipse compare plug-in. Then, Pearson's correlation coefficient between CHANGE and the software code metrics is calculated. The results of the correlation analysis are presented in Table III. These results indicate that several new software metrics, such as Halstead Bugs (B), Comment Lines of Code (CLOC), Number of Commands (Command), Number of Inner Classes (Inner*) and Number of Transitive Dependencies (Dcy*), are also useful in software maintainability prediction, compared to only the C&K metrics used in previous studies [8,9,12,13,14,15]. After this, approximately two-thirds of the cases of each dataset are chosen by random sampling. This subset of cases forms a learning set, which is used to construct the maintainability prediction model. The remaining one-third of the cases forms a test dataset, which is used to evaluate the accuracy of the model built. Next, the maintainability of the software classes of all four software systems is predicted using five classifiers from the WEKA [5] data mining toolkit. The classifiers chosen are Naïve Bayes, BayesNet, Logistic, Multilayer Perceptron and Random Forest. These classifiers are chosen because they have been found to be accurate and useful in studies based on data mining of static code metrics [20].

TABLE III. CORRELATION BETWEEN CHANGE AND OO METRICS (PEARSON'S CORRELATION COEFFICIENT)

Metric    Lucene dataset   JHotdraw dataset   JEdit dataset   JTreeview dataset
WMC       .419**           .472**             .909**          .566**
CBO       .264**           .472**             .604**          .871**
RFC       .414**           .516**             .743**          .822**
LCOM      .482**           .527**             .286**          .557**
Ce        .264**           .472**             .604**          .871**
NPM       .365**           .266**             .654**          .356*
avg_cc    -0.010           0.054              0.164           -0.167
B         .302**           .480**             .985**          .535**
CLOC      .231**           .567**             .390**          0.209
Command   .459**           .372**             .682**          .395*
Cons      .156**           0.039              .320**          0.197
CSA       .241**           0.170              0.029           .395*
CSO       .319**           0.125              0.063           .347*
CSOA      .339**           0.131              0.058           .355*
D         .343**           .314**             .962**          .500**
Dcy       .302**           .541**             .615**          .882**
Dcy*      .118*            0.187              -0.016          .603**
Dpt       0.061            0.080              .355**          0.103
E         .297**           .488**             .975**          .369*
Inner*    .119*            NA                 .211*           .478**
JLOC      .327**           .561**             .311**          0.147
Level     0.109            0.160              -0.036          .605**
Level*    0.053            .204*              -0.012          .546**
LOC       .198**           .570**             .889**          .523**
MPC       .228**           .549**             -0.028          0.199
N         .225**           .465**             .954**          .526**
N         .211**           .528**             .735**          .813**
NAA       .260**           .514**             .440**          .455**
NOAC      .407**           .407**             .928**          .429*
NOOC      .133*            .566**             0.034           .733**
PDcy      0.112            .247*              0.093           .739**
Query     .295**           .514**             .958**          .466**
SLOC      .184**           .537**             .931**          .615**
STAT      .198**           .508**             .973**          .534**
NOC       0.105            .312**             -0.006          0.275
V         .216**           .502**             .962**          .566**

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
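The coefficients in Table III are ordinary Pearson correlations between a metric vector and the CHANGE vector over the classes of a system. A minimal sketch of that computation is given below; the significance markers in the table come from a two-tailed test, which is omitted here.

```java
/** Pearson's correlation coefficient between a metric vector and the CHANGE vector. */
public final class PearsonCorrelation {

    public static double correlate(double[] metric, double[] change) {
        if (metric.length != change.length || metric.length == 0) {
            throw new IllegalArgumentException("Vectors must be non-empty and of equal length");
        }
        int n = metric.length;

        // Means of both vectors.
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) { meanX += metric[i]; meanY += change[i]; }
        meanX /= n;
        meanY /= n;

        // Covariance and variances, then r = cov / (sdX * sdY).
        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++) {
            double dx = metric[i] - meanX;
            double dy = change[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }
}
```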
IV. OVERVIEW OF CLASSIFIERS

• Naïve Bayes [5] - The Naïve Bayes classifier is based on Bayes' theorem of conditional probability. The algorithm performs classification by building an independence-based probability model and uses a maximum a posteriori decision rule to select the hypothesis that is most probable.

• Bayes Network [5] - A Bayes network is a machine learning classifier based on a fusion of graph theory and probability theory. The nodes of the graph in the network represent random variables or attributes, and the edges represent the joint probability distribution of the random variables.

• Logistic Regression [5] - Logistic regression is a statistical approach to classification based on the logit transformation of the dependent variable. It approximates the dependent variable using a linear function of the independent variables by searching for the weights that best fit the training data. The selected weights are those that maximize the log-likelihood.

• Multilayer Perceptron (MLP) [5] - An MLP is a neural network consisting of layers of input nodes, hidden nodes and output nodes connected through weighted links. An error-correction based back-propagation algorithm is used for training the network.
• Random Forest [5] - A random forest is a classifier based on numerous decision trees. The prediction is arrived at by a voting mechanism over the individual trees. Random forests are fast ensemble learning methods.
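A minimal sketch of how the sampling and model-building steps of Section III-B could be driven through the WEKA Java API is shown below: load a merged metric dataset, hold out roughly one third of the classes for testing, and evaluate each of the five classifiers. The ARFF file name and the use of a binary (difficult / not difficult to maintain) class attribute in the last column are illustrative assumptions, not details taken from the paper.

```java
import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.BayesNet;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.Logistic;
import weka.classifiers.functions.MultilayerPerceptron;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public final class MaintainabilityPrediction {

    public static void main(String[] args) throws Exception {
        // Hypothetical ARFF file holding the selected metrics plus a binary maintainability class.
        Instances data = DataSource.read("lucene-metrics.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Roughly two-thirds of the classes for training, one-third for testing.
        data.randomize(new Random(1));
        int trainSize = (int) Math.round(data.numInstances() * 2.0 / 3.0);
        Instances train = new Instances(data, 0, trainSize);
        Instances test = new Instances(data, trainSize, data.numInstances() - trainSize);

        Classifier[] classifiers = {
            new NaiveBayes(), new BayesNet(), new Logistic(),
            new MultilayerPerceptron(), new RandomForest()
        };

        for (Classifier classifier : classifiers) {
            classifier.buildClassifier(train);
            Evaluation eval = new Evaluation(train);
            eval.evaluateModel(classifier, test);
            // Class index 1 is assumed to be the "difficult to maintain" label.
            System.out.printf("%s  recall=%.3f  precision=%.3f  AUC=%.3f%n",
                classifier.getClass().getSimpleName(),
                eval.recall(1), eval.precision(1), eval.areaUnderROC(1));
        }
    }
}
```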
V. RESULTS AND DISCUSSION

In this section we present the results of the analysis performed to classify the modules which are difficult to maintain. Modules are considered difficult to maintain if they are more change-prone. For this purpose we use the classifiers presented in Section IV. The input software metrics to the classifiers are those presented in Table III; only the metrics found to be significant through the correlation analysis are used as inputs. The maintainability prediction models obtained with the five classifiers are presented in Table IV. Since we use five classifiers and four datasets, there are 20 models in Table IV. The software maintainability models are evaluated in terms of the following model evaluation measures (summarized in confusion-matrix form after the list):

• Recall - Recall is the proportion of modules which are correctly predicted as difficult to maintain out of the total modules that are actually known to be difficult to maintain.

• Precision - Precision is the proportion of modules which are correctly predicted as difficult to maintain out of the total modules that are predicted to be difficult to maintain by a classifier.

• ROC Area - An ROC curve is a two-dimensional plot in which the x-axis denotes the false positive rate (FPR) and the y-axis denotes the true positive rate (TPR) of a classifier. As the area under the curve (AUC) of the ROC gets larger, the classifier gets better.
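In standard confusion-matrix terms (true positives TP, false positives FP, false negatives FN and true negatives TN for the "difficult to maintain" class, a formulation not spelled out in the paper), these measures correspond to: Recall = TPR = TP / (TP + FN); Precision = TP / (TP + FP); FPR = FP / (FP + TN).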

A classification result is considered to be accurate if performance measures such as recall, precision and AUC are equal to or above 0.70 [21]. The results in Table IV indicate that, in terms of AUC of the ROC curve, software maintainability can be accurately predicted by data mining of static code metrics, as 16 out of the 20 models have an AUC greater than or equal to 0.7 [21]. Furthermore, the Random Forest models on all four datasets have recall, precision and AUC greater than or equal to 0.7. Random Forests are state of the art classifiers used in software fault prediction studies [21] and are also found to be useful in software maintainability prediction. The other four classifiers, Naïve Bayes, Bayes Network, Logistic and Multilayer Perceptron, are not as accurate on all three performance measures (recall, precision and AUC of the ROC curve).

TABLE IV. MODEL EVALUATION MEASURES OF SOFTWARE MAINTAINABILITY PREDICTION MODELS

Dataset     Classifier              Recall   Precision   ROC Area
Lucene      Naïve Bayes             0.535    0.635       0.726
            Bayes Network           0.702    0.706       0.787
            Logistic                0.741    0.748       0.810
            Multilayer Perceptron   0.724    0.723       0.783
            Random Forest           0.732    0.737       0.796
JHotdraw    Naïve Bayes             0.667    0.825       0.806
            Bayes Network           0.595    0.768       0.817
            Logistic                0.762    0.762       0.748
            Multilayer Perceptron   0.762    0.770       0.721
            Random Forest           0.762    0.754       0.716
JEdit       Naïve Bayes             0.493    0.804       0.740
            Bayes Network           0.797    0.859       0.844
            Logistic                0.595    0.634       0.624
            Multilayer Perceptron   0.623    0.632       0.720
            Random Forest           0.768    0.803       0.823
JTreeview   Naïve Bayes             0.467    0.711       0.550
            Bayes Network           0.578    0.729       0.764
            Logistic                0.667    0.778       0.698
            Multilayer Perceptron   0.600    0.680       0.611
            Random Forest           0.700    0.701       0.710

VI. THREATS TO VALIDITY

Like other empirical studies, this study has limitations, which are given below:

i. The datasets used in this study are based on the Lucene, JHotdraw, JEdit and JTreeview software systems, which are written in the Java language. The models built in this study are likely to be valid for object oriented languages, for example C++ or Java; however, further research is needed to establish their usefulness in predicting maintainability effort in other paradigms.

ii. During the selection of predictor variables for the proposed models, utmost care has been taken to consider only those variables that have a strong impact on maintainability effort, but there may be other metrics, specific to certain application domains, whose effect on the maintainability of software still needs to be determined. Examples of such metrics could be the number of database connections for a database application or the number of connections for mobile applications.
VII. CONCLUSION AND FUTURE WORK

This paper utilizes data mining of some new software code metrics for prediction of software maintainability, apart from the traditionally used Li and Henry [3] and C&K metrics [4]. The new predictor software code metrics are presented in Table II. Naïve Bayes, Bayes Network, Logistic, Multilayer Perceptron and Random Forest models are constructed for prediction of software maintainability using OO software metric datasets that are extracted by two metric extraction tools, namely Chidamber and Kemerer Java Metrics (CKJM) and IntelliJ IDEA. These metric datasets are collected from a total of 879 classes in four open source OO software systems: Lucene, JHotdraw, JEdit and JTreeview. Our results indicate that accurate models can be constructed for software maintainability prediction by data mining of software code metrics. Our future work will consist of data mining of software code metrics to predict the maintainability of web based and mobile applications.

VIII. REFERENCES

[1]. IEEE, "IEEE Standard Glossary of Software Engineering Terminology," IEEE Std. 610.12-1990, Institute of Electrical and Electronics Engineers, 1990.
[2]. J. Saraiva, "A roadmap for software maintainability measurement," International Conference on Software Engineering (ICSE), pp. 1453-1455, 2013.
[3]. W. Li and S. Henry, "Object-Oriented Metrics that Predict Maintainability," Journal of Systems and Software, vol. 23, no. 2, pp. 111-122, 1993.
[4]. S.R. Chidamber and C.F. Kemerer, "A metrics suite for object-oriented design," IEEE Transactions on Software Engineering, vol. 20, no. 6, pp. 476-493, 1994.
[5]. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I.H. Witten, "The WEKA Data Mining Software: An Update," SIGKDD Explorations, vol. 11, no. 1, 2009.
[6]. F. Fioravanti and P. Nesi, "Estimation and prediction metrics for adaptive maintenance effort of object oriented systems," IEEE Transactions on Software Engineering, vol. 27, no. 12, pp. 1062-1084, 2001.
[7]. A. De Lucia, E. Pompella, and S. Stefanucci, "Assessing effort estimation models for corrective maintenance through empirical studies," Information and Software Technology, vol. 47, no. 1, pp. 3-15, 2005.
[8]. C. van Koten and A.R. Gray, "An application of Bayesian network for predicting object-oriented software maintainability," Information and Software Technology, vol. 48, pp. 59-67, 2006.
[9]. M. Thwin and T. Quah, "Application of neural networks for software quality prediction using object-oriented metrics," Journal of Systems and Software, vol. 76, no. 2, pp. 147-156, 2005.
[10]. K.K. Aggarwal, Y. Singh, and J.K. Chhabra, "An Integrated Measure of Software Maintainability," Annual Reliability and Maintainability Symposium, pp. 235-241, 2002.
[11]. K.K. Aggarwal, Y. Singh, P. Chandra, and M. Puri, "Measurement of Software Maintainability Using a Fuzzy Model," Journal of Computer Sciences, vol. 1, no. 4, pp. 538-542, 2005.
[12]. Y. Zhou and H. Leung, "Predicting object-oriented software maintainability using multivariate adaptive regression splines," Journal of Systems and Software, vol. 80, no. 8, pp. 1349-1361, 2007.
[13]. M.O. Elish and K.O. Elish, "Application of TreeNet in Predicting Object-Oriented Software Maintainability: A Comparative Study," European Conference on Software Maintenance and Reengineering, pp. 1534-5351, 2009.
[14]. A. Kaur, K. Kaur, and R. Malhotra, "Soft Computing Approaches for Prediction of Software Maintenance Effort," International Journal of Computer Applications, vol. 1, no. 16, pp. 69-75, 2010.
[15]. A. Kaur and K. Kaur, "Statistical Comparison of Modelling Methods for Software Maintainability Prediction," International Journal of Software Engineering and Knowledge Engineering, vol. 23, 2013.
[16]. K.K. Aggarwal, Y. Singh, P. Chandra, and M. Puri, "Measurement of Software Maintainability Using a Fuzzy Model," Journal of Computer Sciences, pp. 538-542, 2005.
[17]. L. Ping, "A Quantitative Approach to Software Maintainability Prediction," International Forum on Information Technology and Applications, vol. 1, no. 1, pp. 105-108, July 2010.
[18]. D. Spinellis and M. Jureczko, "Metric Description," May 2011. [Online]. Available: http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/
[19]. B. Meyer, Object-Oriented Software Construction, Second Edition, Prentice Hall, 1988.
[20]. T. Menzies, J. Greenwald, and A. Frank, "Data mining static code attributes to learn defect predictors," IEEE Transactions on Software Engineering, vol. 33, no. 1, pp. 2-13, 2007.
[21]. S. Lessmann, B. Baesens, C. Mues, and S. Pietsch, "Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings," IEEE Transactions on Software Engineering, vol. 34, no. 4, pp. 485-496, July/Aug. 2008.
