Prediction of Suitability of Agile Development Process For Software Projects


Prediction of Suitability of Agile Development Process for Software Projects

Name: Masood Ahmad
Registration #: Fa018-MSCE-001
Research Supervisor: Dr. Imran Ashraf
Co-Supervisor: Mr. Nouman Noor

HITEC University, Taxila

1


Agile Development
• A process model
• Introduces dynamic strategies
• Consists of a number of smaller cycles, referred to as sprints or iterations
• Allows the team to adapt to change quickly

2
Why We Use Agile Development
About 50% to 80% of software projects fail [1].

• Projects fail because:
  • Fixed and separate stages
  • No feedback until testing
  • The end product is defined up front
  • Testing only of the end product
  • A structured way (waterfall) is used to develop the software
  • The project did not meet the real requirements
  • Late delivery to market

• Agile development primarily focuses on:
  • Easily and quickly adapting to change
  • A higher quality product
  • Unit testing
  • A short feedback loop
  • User-focused testing
  • Rapid delivery
  • Continuous development

3
Research Problem
• Predicting the suitability of the agile software development process for a given software development project

4
Proposed Solution

Software Metrics → Machine Learning Model → Pass / Fail

5
Literature Review

6
Paper: An analysis of factors affecting software reliability [2]
Year: 2000 | Technique: Identified factors
Findings: Identified 32 factors that affect software reliability, e.g. software complexity, testing effort, programmer skill, testing environment, testing coverage, frequency of program specification, etc.

Paper: A Systematic Approach for Selection and Adoption of Agile Practices in Component-Based Projects [3]
Year: 2010 | Technique: Statistical qualitative and quantitative techniques
Findings: The authors state that every adoption brings some unique challenges, so qualitative and quantitative methods of data analysis should be applied depending on the project environment.

Paper: Factors that Affect Software Systems Development Project Outcomes: A Survey of Research [4]
Year: 2011 | Technique: Survey
Findings: On the basis of a survey of research, the author presents a new classification framework that gives an abstracted and synthesized view of the factors that affect project outcomes.

Paper: What Software Test Approaches, Methods, and Techniques are Actually Used in Software Industry [5]
Year: 2018 | Technique: Survey
Findings: Shows that the IT industry mostly uses only the functional testing approach, and only 52.63% of respondents were doing non-functional testing.

7
Methodology

• Collect data from IT professionals
• Apply statistical techniques (t-test and Chi-square test)
• Indicate significant factors
• Input training data
• Train and test the model
• Input testing data
• Results
8
Data Collection

9
Data Collection
• The data set was collected from 47 IT companies in Pakistan, including:
  • Pakistan Revenue Automation (PVT) Limited
  • Askari Bank
  • Vizteck Solutions
  • United Vision
  • Vector Coder
  • and many more
• We used the snowball technique: we first approached target respondents through our known contacts and requested them to approach their contacts in turn.
10
Data Parameters
• Project Cost: total budget or cost available
• Project Time: time required to complete the project
• Project Requirements Nature: frequently evolving/changing or static requirements
• Project UI Design Provision: GUI of the project
11
Data Parameters
• Project Domain: web based, mobile based, desktop based, or all three
• Project Development Tool: name of the tool, such as Android Studio, Visual Studio, Xcode, etc.
• Project Complexity: in terms of lines of code or function points
• Project Team Locality: working from different locations; geographically dispersed or same location
• Project Targeted Audience: intended users of the developed project; general public, special community, business users, technical professionals, etc.
• Team Size: total people working on the project; (0-10), (0-30), greater than 30
12
Data Preprocessing

13
Data Cleaning
• Removed redundant records
• Removed records from respondents who are not software testers, scrum masters, or project managers
• Converted header attributes and long statements into single words
• Converted all data into the same format

• Total number of records (before cleaning) = 47
• Total number of records (after cleaning) = 41
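As a rough illustration of these cleaning steps, the sketch below uses pandas; the file name and column names (e.g. job_role) are hypothetical, not the actual survey headers.

import pandas as pd

df = pd.read_csv("survey_responses.csv")  # assumed file name

# Remove redundant (duplicate) records.
df = df.drop_duplicates()

# Keep only responses from testers, scrum masters, and project managers.
allowed_roles = {"software tester", "scrum master", "project manager"}
df = df[df["job_role"].str.strip().str.lower().isin(allowed_roles)]

# Convert everything to the same format (trimmed, title-cased strings).
df = df.apply(lambda col: col.str.strip().str.title() if col.dtype == "object" else col)

print(len(df), "records after cleaning")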

14
Data Cleaning

Example record before cleaning (raw survey response with long question headers and verbose answers):
• Company name: Contour Softwares
• Job role: Software architect
• Company size: Large (250+ employees)
• Project domain: All three
• Development tool: Visual Studio
• Complexity (lines of code): Greater than 7000 LOC
• Nature of requirements: Frequently changed throughout the project
• Testing effort (hours): Medium (between 49 to 96 hours)
• Budget or cost: 1001$ to 5000$
• Team size: Medium (10 to 30 members)
• Target audience: Special community (for a specific community only)
• Scrum meetings held: Daily Scrum
• Team locality: Geographically dispersed
• Project status: Successful

Same record after cleaning (single-word attributes and values):
Project Domain: All three | Development Tool: Visual Studio | Project Complexity: Large | Nature of Requirement: Frequently Changed | Testing Effort: Medium Effort | Budget: Medium | Team Size: Medium | Target Audience: Special Community | Scrum Meeting: Daily Scrum | Team Locality: Geographically Dispersed | Class: Successful
15
Data Significance

16
T-Test
• The test is used to check data validity
• It confirms that our collected data is not random
• Data coding is needed to apply the t-test
  • Convert High, Medium and Low to 3, 2 and 1, etc.
• Check the significance level by generating the p-value
• If the p-value is less than or equal to 0.05, the data is significant enough to continue the research
• T-test results: the p-values of all parameters of our data were less than 0.05, which shows the data is suitable for this research
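A minimal sketch of the coding step plus a t-test in SciPy; the slides do not specify which t-test variant was run, so a one-sample test against the scale midpoint is assumed here, and the response values are made up for illustration.

import pandas as pd
from scipy import stats

# Ordinal coding from the slide: High/Medium/Low -> 3/2/1.
coding = {"Low": 1, "Medium": 2, "High": 3}

# Hypothetical survey answers for one parameter.
responses = pd.Series(["High", "Medium", "Medium", "Low", "High", "Medium"])
coded = responses.map(coding)

# One-sample t-test against the scale midpoint (2); an assumption, not the thesis setup.
t_stat, p_value = stats.ttest_1samp(coded, popmean=2)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
print("significant" if p_value <= 0.05 else "not significant")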

17
Significant Factors

18
Chi-Square Test
• The Chi-square test measures the presence or absence of an association between two variables
• The formula for Chi-square is:
  χ² = Σ (O − E)² / E
• O is the observed frequency and E is the expected frequency
• Expected frequency = (row total × column total) / grand total
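For illustration, the same test can be run with SciPy's chi2_contingency; the contingency counts below are invented, not the thesis data.

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows are levels of one factor
# (e.g. team size Small/Medium/Large), columns are project outcome
# (Successful, Unsuccessful).
observed = np.array([
    [12, 3],
    [15, 2],
    [8, 1],
])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, dof = {dof}")
print("expected frequencies (row total * column total / grand total):")
print(expected)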

19
Significant Factors and Significance Levels

Factor                      P-Value
Company Size                .0379
Project Domain              .0467
Project Development Tool    .031
Project Complexity          .0422
Requirement Nature          .0342
Testing Effort              .0428
Budget                      .0464
Project Size                .0357
Target Audience             .018
Scrum Meeting Held          .018
Management Framework        .0303
Project Locality            .0224

All these factors are significant because we took them from research papers.
20
Training and Testing the Model

21
Software Tool
• We have used Weka 3.8.5 for classification
• Easy to use
• Provides the accuracy and error rate of our results in the form of a
confusion matrix

22
Selection of Training/Testing Data
• "Cross-validation" gives a reliable estimate of a model's accuracy
• We used the K-fold cross-validation technique
• This technique partitions the data randomly into K folds/groups
• One fold/group is used for testing the model
• Training of the model is done on the remaining K-1 folds/groups
• This step repeats K times, with a different fold/group used for testing each time
• The value of K is 10
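A minimal 10-fold cross-validation sketch with scikit-learn, standing in for the Weka workflow described here; the feature matrix and labels below are placeholder values, not the collected records.

import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# Placeholder data: 41 records with 12 encoded factors, 30 successful / 11 not.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(41, 12))
y = np.array([1] * 30 + [0] * 11)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(GaussianNB(), X, y, cv=cv)
print("per-fold accuracy:", np.round(scores, 3))
print("mean accuracy:", round(scores.mean(), 4))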

23
Classifiers
• We have used about 15 classifiers
• The following 5 gave the top results:
• Naive Bayes
• SMO (Sequential Minimal Optimization)
  • Used to solve the quadratic programming (QP) problem in SVM training
  • Decomposes the QP problem into a series of smaller sub-problems
• ZeroR
  • Predicts the majority category
  • Useful for determining baseline performance
• Hoeffding Tree
  • An incremental decision tree learner for large data streams
• Decision Stump
  • Makes a prediction based on the value of just a single input feature
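As a rough scikit-learn analogue of the ZeroR baseline next to Naive Bayes (the thesis itself used Weka's implementations), with the same kind of placeholder data as in the cross-validation sketch above:

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Placeholder data: 41 records, 12 encoded factors, 30 successful / 11 not.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(41, 12))
y = np.array([1] * 30 + [0] * 11)

# ZeroR equivalent: always predict the majority class.
zero_r = DummyClassifier(strategy="most_frequent")
print("ZeroR baseline accuracy:", cross_val_score(zero_r, X, y, cv=10).mean())
print("Naive Bayes accuracy:   ", cross_val_score(GaussianNB(), X, y, cv=10).mean())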

24
Comparison of Classifiers

Classifier        Accuracy     Error Rate
Naïve Bayes       90.2439 %    0.2291 %
ZeroR             95.122 %     0.2212 %
SMO               87.8049 %    0.3492 %
Decision Stump    95.122 %     0.2222 %
Hoeffding Tree    87.804 %     0.2735 %


25
Confusion matrix
A confusion matrix describes the performance of a classification model.
• True Positive Rate: When it's actually yes, how often does it predict yes?
• TP/actual yes
• False Positive Rate: When it's actually no, how often does it predict yes?
• FP/actual no
• True Negative Rate: When it's actually no, how often does it predict no?
• TN/actual no
• Precision: When it predicts yes, how often is it correct?
• TP/predicted yes
• Prevalence: How often does the yes condition actually occur in our sample?
• actual yes/total
• Accuracy: Overall, how often is the classifier correct?
• (TP+TN)/total
• Misclassification Rate: Overall, how often is it wrong?
• (FP+FN)/total
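As a worked example of these formulas, the sketch below plugs in the ZeroR confusion matrix shown on a later slide (39 successful projects all predicted successful, 2 unsuccessful projects misclassified):

# ZeroR matrix from the slides: rows are actual classes, columns are predictions.
TP, FN = 39, 0   # actual successful: predicted successful / unsuccessful
FP, TN = 2, 0    # actual unsuccessful: predicted successful / unsuccessful

total = TP + FN + FP + TN
actual_yes, actual_no, predicted_yes = TP + FN, FP + TN, TP + FP

print("True Positive Rate :", TP / actual_yes)      # 1.0
print("False Positive Rate:", FP / actual_no)       # 1.0
print("True Negative Rate :", TN / actual_no)       # 0.0
print("Precision          :", TP / predicted_yes)   # ~0.951
print("Prevalence         :", actual_yes / total)   # ~0.951
print("Accuracy           :", (TP + TN) / total)    # ~0.951, matching the 95.122% above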

26
ZeroR Classifier Confusion Matrix
A B Classified as

39 0 A= successful

2 0 B= unsuccessful

27
Sequential Minimal Optimization (SMO)
Classifier Confusion Matrix
A B Classified as

39 3 A= successful

2 0 B= unsuccessful

28
Naive Bayes Classifier Confusion Matrix
A B Classified as

36 3 A= successful

1 1 B= unsuccessful

29
Decision Stump Classifier Confusion
Matrix
A B Classified as

39 0 A= successful

2 0 B= unsuccessful

30
Hoeffding Tree Classifier Confusion
Matrix
A B Classified as

36 3 A= successful

2 0 B= unsuccessful

31
Conclusion and Future Work
• We gathered data about successful software projects on the basis of factors identified from the literature.
• We then statistically analyzed the data to find the significant factors.
• We then built a machine learning model on that data.
• In the future, the model can be rebuilt after gathering more data.
• Additional factors can be identified, not only from the literature but also from industry, to check which factors software success depends on, and a model can then be built on them.

32
References
1. Amjad Hussain Zahid, "A Critical Analysis of Software Failure Causes From Project Management Perspectives," VFAST Transactions on Software Engineering, Volume 13, Number 3, September-December 2018, ISSN(e): 2309-6519, ISSN(p): 2411-6327. http://vfast.org/journals/index.php/vtse
2. Xuemei Zhang and Hoang Pham, "An Analysis of Factors Affecting Software Reliability," Department of Industrial Engineering, Rutgers University, Piscataway, NJ, USA, 1999.
3. Iva Krasteva and Sylvia Ilieva, "A Systematic Approach for Selection and Adoption of Agile Practices in Component-Based Projects," Sofia University St. Kliment Ohridski, Sofia, Bulgaria.
4. "Factors that Affect Software Systems Development Project Outcomes: A Survey of Research," ACM Computing Surveys, 43(4):24, October 2011.
5. Laura Strazdina, Vineta Arnicane, G. Arnicans, J. Bicevskis, Juris Borzovs, and Ivans Kulesovs, "What Software Test Approaches, Methods, and Techniques are Actually Used in Software Industry," Doctoral Consortium/Forum @ DB&IS, 2018.

33
Thank You!!!
Any Questions?

34
