Software Metrics
Data Collection
Data Analysis
06/11/2024
“We cannot make good decisions with bad data”
Some characteristics of good data:
How To Define The Data?
Direct Measurement
The measurement of an attribute which does not depend
on the measurement of any other attribute.
Example:
Length of a program is measured by counting the lines of
code.
Indirect Measurement
The measurement of an attribute which depends on the
measurement of one or more attributes.
Example:
module defect density = (number of defects) / (module size)
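A minimal sketch of this indirect measure follows. The function name, the choice of KLOC (thousands of lines of code) as the size unit, and the sample numbers are illustrative assumptions, not part of the original slides.

```python
def defect_density(defect_count, module_size_kloc):
    """Indirect measure: defects divided by module size.

    Assumes size is expressed in KLOC (an illustrative choice of unit).
    """
    if module_size_kloc <= 0:
        raise ValueError("module size must be positive")
    return defect_count / module_size_kloc


# Hypothetical module: 12 recorded defects in 4.0 KLOC.
density = defect_density(12, 4.0)
print(density)
```

Both inputs are direct measurements (a count of defects and a count of lines), while the density depends on both, which is what makes it indirect.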
[Figure: data collection gathers raw data from processes, products, and resources; extraction turns raw data into refined data; analysis turns refined data into derived attribute values.]
Terminology
An error is a human mistake that causes a fault.
What do we need to record of a problem?
Example
In the 1980s, problems with a radiation-therapy machine were discovered at the East
Texas Cancer Center. The machine administered two types of radiation therapy:
x-ray and electron.
Classification of failure based on severity
Changes
How to collect data
Manual data collection: Managers, systems analysts, programmers,
testers, and users must record raw data on forms. This manual recording is
subject to bias, error, omission, and delay. Unfortunately, in many instances,
there is no alternative to manual data collection.
When to collect data and how to store data
Data collection planning must begin when project planning begins. The
actual data collection takes place during many phases of development.
For example, some data relating to project personnel, such as qualifications
or experience, can be collected at the start of the project, while other
data collection, such as effort, begins at project start and continues
through operation and maintenance.
Data analysis and terminology
To perform the analysis, we use statistical techniques to describe
the distribution of attribute values, as well as the relationships
between or among attributes.
Data analysis techniques
Box plots
Scatter plots
Control charts
Measures of association
Robust correlation
Linear regression
Robust regression
Multivariate regression
The nature of the data
Normal Distribution
Skewed Distribution
Non-normal Distribution
Purpose of the experiment
To confirm a theory
To explore a relationship
Decision Tree
Box Plots
A box plot depicts a summary of the range of a set of data. It
shows where most of the data are clustered and whether there are
any outliers.
Median (m)
Upper quartile (u)
Lower quartile (l)
Box length (d) = u - l
Upper tail = u + 1.5d
Lower tail = l - 1.5d
Box Plots
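System  MOD
A       15
B       43
C       61
D       10
E       43
F       57
G       58
H       65
I       50
J       60
K       50
L       96
M       51
N       61
P       32
Q       78
R       48

Sorted values: 10, 15, 32, 43, 43, 48, 50, 50, 51, 57, 58, 60, 61, 61, 65, 78, 96

[Figure: box plot of MOD with lower tail 16, lower quartile 43, median 51, upper quartile 61, upper tail 88]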
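The box plot values for the MOD column can be checked with a short sketch. It assumes each quartile is the median of the values on its side of the overall median, and that the lower tail is measured from the lower quartile (l - 1.5d). The function and variable names are illustrative.

```python
from statistics import median


def box_plot_summary(values):
    """Return the median, quartiles, box length, tails, and outliers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    m = median(data)
    # Assumption: quartiles are medians of the lower and upper halves,
    # leaving out the middle value when n is odd.
    l = median(data[:half])
    u = median(data[half + (n % 2):])
    d = u - l
    lower_tail = l - 1.5 * d
    upper_tail = u + 1.5 * d
    outliers = [v for v in data if v < lower_tail or v > upper_tail]
    return {
        "median": m,
        "lower_quartile": l,
        "upper_quartile": u,
        "box_length": d,
        "lower_tail": lower_tail,
        "upper_tail": upper_tail,
        "outliers": outliers,
    }


# MOD values for systems A through R, taken from the table above.
mod = [15, 43, 61, 10, 43, 57, 58, 65, 50, 60, 50, 96, 51, 61, 32, 78, 48]
summary = box_plot_summary(mod)
print(summary)
```

Under these assumptions the sketch gives a median of 51, quartiles of 43 and 61, and tails at 16 and 88, so the values 10, 15, and 96 fall outside the tails and are candidates for outliers.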
Scatter Plots
A box plot shows information about one variable; a scatter plot depicts the
relationship between two variables.
Control Charts
A control chart helps us see when our data are within acceptable
bounds. If the data fall out of bounds, we can take action to prevent
problems before they occur.
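The slides do not say how the bounds are set. One common convention, assumed here, places the control limits a fixed number of standard deviations (k) on either side of the mean of a baseline sample. The sketch below uses that convention with hypothetical data; the function names and the default k = 3 are illustrative choices.

```python
from statistics import mean, pstdev


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from a baseline sample.

    Assumption: limits sit k population standard deviations from the mean.
    """
    center = mean(baseline)
    spread = pstdev(baseline)
    return center - k * spread, center, center + k * spread


def out_of_bounds(observations, lower, upper):
    """Return (index, value) pairs that fall outside the limits."""
    return [(i, v) for i, v in enumerate(observations) if v < lower or v > upper]


# Hypothetical data: defects found per inspection in past and new inspections.
baseline = [10, 12, 11, 13, 12, 11, 10, 13]
new_observations = [12, 25, 11]

lower, center, upper = control_limits(baseline)
flagged = out_of_bounds(new_observations, lower, upper)
print(lower, center, upper, flagged)
```

A tighter setting such as k = 2 flags more points for investigation, at the cost of more false alarms.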
Measures of association
Scatter plots show the behavior of two attributes, and
sometimes we can determine that the two attributes are related.
A change in one attribute usually seems to provoke a
predictable change in the other, but we do not know for
certain that a similar change will take place in the future.
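Pearson correlation coefficient:
r = sum((x_i - x_mean)(y_i - y_mean)) / sqrt(sum((x_i - x_mean)^2) * sum((y_i - y_mean)^2))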
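The coefficient r is commonly the Pearson correlation coefficient, which measures the strength of a linear relationship between two attributes on a scale from -1 to 1. A minimal sketch under that assumption follows; the function name and sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: module size (KLOC) against number of faults found.
size = [1, 2, 3, 4]
faults = [2, 4, 6, 8]
r = pearson_r(size, faults)
print(r)
```

A value of r near 1 or -1 indicates a strong linear association in the sample, but, as noted above, it does not guarantee the same relationship in future data.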
Thank You