Prediction of Suitability of Agile Development Process For Software Projects


Prediction of Suitability of Agile Development Process for Software Projects

Name: Masood Ahmad
Registration #: Fa018-MSCE-001
Research Supervisor: Dr. Imran Ashraf
Co-Supervisor: Mr. Nouman Noor

HITEC University, Taxila

1


Agile Development
• A process model
• Introduces dynamic strategies
• Consists of a number of smaller cycles, referred to as sprints or iterations
• Allows the team to adapt to change quickly

2
Why We Use Agile Development
About 50% to 80% of software projects fail [1].

• Projects fail because:
  • Fixed and separate stages
  • No feedback until testing
  • The end product is defined up front
  • Testing only of the end product
  • A structured way (waterfall) is used to develop the software
  • The project did not meet the real requirements
  • Late delivery to market

• Agile development primarily focuses on:
  • Easily and quickly adapting to change
  • A higher quality product
  • Unit testing
  • A short feedback loop
  • User-focused testing
  • Rapid delivery
  • Continuous development

3
Research Problem
• Predicting the suitability of the agile software development process for a given software development project

4
Proposed Solution

Software Metrics → Machine Learning Model → Pass / Fail

5
Literature Review

6
Paper: An analysis of factors affecting software reliability [2]
Year: 2000 | Technique: Identified factors
Findings: Identified 32 factors that affect software reliability, e.g. software complexity, testing effort, programmer skill, testing environment, testing coverage, frequency of program specification, etc.

Paper: A Systematic Approach for Selection and Adoption of Agile Practices in Component-Based Projects [3]
Year: 2010 | Technique: Statistical qualitative and quantitative techniques
Findings: The authors state that every adoption brings some unique challenges, so qualitative and quantitative methods of data analysis should be applied depending on the project environment.

Paper: Factors that Affect Software Systems Development Project Outcomes: A Survey of Research [4]
Year: 2011 | Technique: Survey
Findings: On the basis of a survey of research, the author presents a new classification framework that gives an abstracted and synthesized view of the factors that affect project outcomes.

Paper: What Software Test Approaches, Methods, and Techniques are Actually Used in Software Industry [5]
Year: 2018 | Technique: Survey
Findings: Shows that the IT industry mostly uses only the functional testing approach, and only 52.63% of respondents were doing non-functional testing.

7
Methodology

• Collect data from IT professionals
• Apply statistical techniques (t-test and Chi-square test)
• Indicate significant factors
• Input training data
• Train and test the model
• Input testing data
• Results
8
Data Collection

9
Data Collection
• The data set was collected from 47 IT companies in Pakistan, including:
  • Pakistan Revenue Automation (PVT) Limited
  • Askari Bank
  • Vizteck Solutions
  • United Vision
  • Vector Coder
  • and many more
• We used the snowball technique: we first approached target respondents through our known contacts and requested them to approach their contacts in turn.
10
Data Parameters
• Project Cost: total budget or cost available
• Project Time: time required to complete the project
• Project Requirements Nature: frequently evolving/changing or static requirements
• Project UI Design Provision: GUI of the project
11
Data Parameters
• Project Domain: web based, mobile based, desktop based, or all three
• Project Development Tool: name of the tool, such as Android Studio, Visual Studio, Xcode, etc.
• Project Complexity: in terms of lines of code or function points
• Project Team Locality: working from different locations; geographically dispersed or same location
• Project Targeted Audience: intended users of the developed project; general public, special community, business users, technical professionals, etc.
• Team Size: total people working on the project; (0-10), (0-30), greater than 30
12
Data Preprocessing

13
Data Cleaning
• Removed redundant records
• Removed records from respondents who are not software testers, scrum masters, or project managers
• Converted header attributes and long statements into single words
• Converted all data into the same format

• Total number of records (before cleaning) = 47
• Total number of records (after cleaning) = 41
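As a rough illustration of these cleaning steps, the sketch below uses pandas; the file name and column names (e.g. job_role) are hypothetical, not the actual survey headers.

import pandas as pd

df = pd.read_csv("survey_responses.csv")  # assumed file name

# Remove redundant (duplicate) records.
df = df.drop_duplicates()

# Keep only responses from testers, scrum masters, and project managers.
allowed_roles = {"software tester", "scrum master", "project manager"}
df = df[df["job_role"].str.strip().str.lower().isin(allowed_roles)]

# Convert everything to the same format (trimmed, title-cased strings).
df = df.apply(lambda col: col.str.strip().str.title() if col.dtype == "object" else col)

print(len(df), "records after cleaning")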

14
Data Cleaning

Example record before cleaning (raw survey response with long question headers and verbose answers):
• Company name: Contour Softwares
• Job role: Software architect
• Company size: Large (250+ employees)
• Project domain: All three
• Development tool: Visual Studio
• Complexity (lines of code): Greater than 7000 LOC
• Nature of requirements: Frequently changed throughout the project
• Testing effort (hours): Medium (between 49 to 96 hours)
• Budget or cost: 1001$ to 5000$
• Team size: Medium (10 to 30 members)
• Target audience: Special community (for a specific community only)
• Scrum meetings held: Daily Scrum
• Team locality: Geographically dispersed
• Project status: Successful

Same record after cleaning (single-word attributes and values):
Project Domain: All three | Development Tool: Visual Studio | Project Complexity: Large | Nature of Requirement: Frequently Changed | Testing Effort: Medium Effort | Budget: Medium | Team Size: Medium | Target Audience: Special Community | Scrum Meeting: Daily Scrum | Team Locality: Geographically Dispersed | Class: Successful
15
Data Significance

16
T-Test
• The test is used to check data validity
• It confirms that our collected data is not random
• Data coding is needed to apply the t-test
  • Convert High, Medium and Low to 3, 2 and 1, etc.
• Check the significance level by generating the p-value
• If the p-value is less than or equal to 0.05, the data is significant enough to continue the research
• T-test results: the p-values of all parameters of our data were less than 0.05, which shows the data is suitable for this research
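A minimal sketch of the coding step plus a t-test in SciPy; the slides do not specify which t-test variant was run, so a one-sample test against the scale midpoint is assumed here, and the response values are made up for illustration.

import pandas as pd
from scipy import stats

# Ordinal coding from the slide: High/Medium/Low -> 3/2/1.
coding = {"Low": 1, "Medium": 2, "High": 3}

# Hypothetical survey answers for one parameter.
responses = pd.Series(["High", "Medium", "Medium", "Low", "High", "Medium"])
coded = responses.map(coding)

# One-sample t-test against the scale midpoint (2); an assumption, not the thesis setup.
t_stat, p_value = stats.ttest_1samp(coded, popmean=2)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
print("significant" if p_value <= 0.05 else "not significant")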

17
Significant Factors

18
Chi-Square Test
• The Chi-square test measures the presence or absence of an association between two variables
• The formula for Chi-square is:
  χ² = Σ (O − E)² / E
• O is the observed frequency and E is the expected frequency
• Expected frequency = (row total × column total) / grand total
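For illustration, the same test can be run with SciPy's chi2_contingency; the contingency counts below are invented, not the thesis data.

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows are levels of one factor
# (e.g. team size Small/Medium/Large), columns are project outcome
# (Successful, Unsuccessful).
observed = np.array([
    [12, 3],
    [15, 2],
    [8, 1],
])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, dof = {dof}")
print("expected frequencies (row total * column total / grand total):")
print(expected)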

19
Significant Factors and Significance Levels

Factor                      P-Value
Company Size                .0379
Project Domain              .0467
Project Development Tool    .031
Project Complexity          .0422
Requirement Nature          .0342
Testing Effort              .0428
Budget                      .0464
Project Size                .0357
Target Audience             .018
Scrum Meeting Held          .018
Management Framework        .0303
Project Locality            .0224

All these factors are significant because we took them from research papers.
20
Training and Testing the Model

21
Software Tool
• We have used Weka 3.8.5 for classification
• Easy to use
• Provides the accuracy and error rate of our results in the form of a
confusion matrix

22
Selection of Training/Testing Data
• "Cross-validation" gives a reliable estimate of a model's accuracy
• We used the K-fold cross-validation technique
• This technique partitions the data randomly into K folds/groups
• One fold/group is used for testing the model
• Training of the model is done on the remaining K-1 folds/groups
• This step repeats K times, with a different fold/group used for testing each time
• The value of K is 10
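A minimal 10-fold cross-validation sketch with scikit-learn, standing in for the Weka workflow described here; the feature matrix and labels below are placeholder values, not the collected records.

import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# Placeholder data: 41 records with 12 encoded factors, 30 successful / 11 not.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(41, 12))
y = np.array([1] * 30 + [0] * 11)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(GaussianNB(), X, y, cv=cv)
print("per-fold accuracy:", np.round(scores, 3))
print("mean accuracy:", round(scores.mean(), 4))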

23
Classifiers
• We have used about 15 classifiers
• The following 5 gave the top results:
• Naive Bayes
• SMO (Sequential Minimal Optimization)
  • Used to solve the quadratic programming (QP) problem in SVM training
  • Decomposes the QP problem into a series of smaller sub-problems
• ZeroR
  • Predicts the majority category
  • Useful for determining baseline performance
• Hoeffding Tree
  • An incremental decision tree learner for large data streams
• Decision Stump
  • Makes a prediction based on the value of just a single input feature
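As a rough scikit-learn analogue of the ZeroR baseline next to Naive Bayes (the thesis itself used Weka's implementations), with the same kind of placeholder data as in the cross-validation sketch above:

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Placeholder data: 41 records, 12 encoded factors, 30 successful / 11 not.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(41, 12))
y = np.array([1] * 30 + [0] * 11)

# ZeroR equivalent: always predict the majority class.
zero_r = DummyClassifier(strategy="most_frequent")
print("ZeroR baseline accuracy:", cross_val_score(zero_r, X, y, cv=10).mean())
print("Naive Bayes accuracy:   ", cross_val_score(GaussianNB(), X, y, cv=10).mean())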

24
Comparison of Classifiers

Classifier        Accuracy     Error Rate
Naïve Bayes       90.2439 %    0.2291 %
ZeroR             95.122 %     0.2212 %
SMO               87.8049 %    0.3492 %
Decision Stump    95.122 %     0.2222 %
Hoeffding Tree    87.804 %     0.2735 %


25
Confusion matrix
A confusion matrix describes the performance of a classification model.
• True Positive Rate: When it's actually yes, how often does it predict yes?
• TP/actual yes
• False Positive Rate: When it's actually no, how often does it predict yes?
• FP/actual no
• True Negative Rate: When it's actually no, how often does it predict no?
• TN/actual no
• Precision: When it predicts yes, how often is it correct?
• TP/predicted yes
• Prevalence: How often does the yes condition actually occur in our sample?
• actual yes/total
• Accuracy: Overall, how often is the classifier correct?
• (TP+TN)/total
• Misclassification Rate: Overall, how often is it wrong?
• (FP+FN)/total
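As a worked example of these formulas, the sketch below plugs in the ZeroR confusion matrix shown on a later slide (39 successful projects all predicted successful, 2 unsuccessful projects misclassified):

# ZeroR matrix from the slides: rows are actual classes, columns are predictions.
TP, FN = 39, 0   # actual successful: predicted successful / unsuccessful
FP, TN = 2, 0    # actual unsuccessful: predicted successful / unsuccessful

total = TP + FN + FP + TN
actual_yes, actual_no, predicted_yes = TP + FN, FP + TN, TP + FP

print("True Positive Rate :", TP / actual_yes)      # 1.0
print("False Positive Rate:", FP / actual_no)       # 1.0
print("True Negative Rate :", TN / actual_no)       # 0.0
print("Precision          :", TP / predicted_yes)   # ~0.951
print("Prevalence         :", actual_yes / total)   # ~0.951
print("Accuracy           :", (TP + TN) / total)    # ~0.951, matching the 95.122% above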

26
ZeroR Classifier Confusion Matrix
A B Classified as

39 0 A= successful

2 0 B= unsuccessful

27
Sequential Minimal Optimization (SMO)
Classifier Confusion Matrix
A B Classified as

39 3 A= successful

2 0 B= unsuccessful

28
Naive Bayes Classifier Confusion Matrix
A B Classified as

36 3 A= successful

1 1 B= unsuccessful

29
Decision Stump Classifier Confusion
Matrix
A B Classified as

39 0 A= successful

2 0 B= unsuccessful

30
Hoeffding Tree Classifier Confusion
Matrix
A B Classified as

36 3 A= successful

2 0 B= unsuccessful

31
Conclusion and Future Work
• We gathered data about successful software projects on the basis of factors identified from the literature.
• We then statistically analyzed the data to find the significant factors.
• We then built a machine learning model on that data.
• In the future, the model can be rebuilt after gathering more data.
• Additional factors can be identified, not only from the literature but also from industry, to check which factors software success depends on, and a model can then be built on them.

32
References
1. Amjad Hussain Zahid, "A Critical Analysis of Software Failure Causes From Project Management Perspectives," VFAST Transactions on Software Engineering, Volume 13, Number 3, September-December 2018, ISSN(e): 2309-6519, ISSN(p): 2411-6327. http://vfast.org/journals/index.php/vtse
2. Xuemei Zhang and Hoang Pham, "An Analysis of Factors Affecting Software Reliability," Department of Industrial Engineering, Rutgers University, Piscataway, NJ, USA, 1999.
3. Iva Krasteva and Sylvia Ilieva, "A Systematic Approach for Selection and Adoption of Agile Practices in Component-Based Projects," Sofia University St. Kliment Ohridski, Sofia, Bulgaria.
4. "Factors that Affect Software Systems Development Project Outcomes: A Survey of Research," ACM Computing Surveys, 43(4):24, October 2011.
5. Laura Strazdina, Vineta Arnicane, G. Arnicans, J. Bicevskis, Juris Borzovs, and Ivans Kulesovs, "What Software Test Approaches, Methods, and Techniques are Actually Used in Software Industry," Doctoral Consortium/Forum @ DB&IS, 2018.

33
Thank You!!!
Any Questions?

34
