Software Process Mining: Annotation
ANNOTATION
Nowadays, in the era of social, mobile and cloud computing, different
business information systems regularly produce, log and trace terabytes of
data. Process mining deals with transforming this data into valuable
knowledge, which is used for improving business processes.
However, process mining can also be successfully applied to the
development of information systems. It can be used for deriving the model
of a software development process. Mining end-user behavior can help
improve the functionality and usability of software. And mining the
software system at runtime is beneficial for improving the software
architecture and performance.
Here, we introduce software process mining:
29.01.2014
Slide 2
VLADIMIR RUBIN
Lead IT Architect and Consultant
Collaboration with msg systems AG
Slide 3
Slide 4
* http://www.projectcartoon.com
HOW DOES PROCESS MINING HELP IN DEALING WITH SOFTWARE ENGINEERING CHALLENGES?
Slide 5
Slide 6
AGENDA
Slide 7
MOTIVATION: QUALITY
Idea
Software Process Quality
CMM (CMMI)
Company Process Model
~50% of companies
Product Quality
deriving software development processes
Process Engineer
Slide 8
Slide 9
HYPOTHESIS
Document logs from software repositories can be used for discovering process models.
Mining Approach
Slide 10
Document log (activity, role), one case per line:
DES designer | CODE developer | TEST qaengineer | REV manager
DES designer | TEST qaengineer | CODE developer | REV designer
DES designer | VER qaengineer | CODE designer | REV manager
SVN log:
Revision 569362, modified Fri Aug 24 12:09:09 2007 UTC (6 weeks, 1 day ago) by bayard
Revision 567258, modified Sat Aug 18 11:14:52 2007 UTC (7 weeks, 1 day ago) by tetsuya
Other Examples:
Bug Tracking (Bugzilla, ...)
Issue Tracking (Jira, ...)
...
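As a rough illustration of how repository logs such as the SVN excerpt above could be turned into events for mining, the following sketch parses the two entries shown on the slide. The regular expression, the function name `parse_svn_log` and the field names are assumptions made for this example; the slides do not describe the actual preprocessing code.

```python
import re
from datetime import datetime

# The two entries from the slide's SVN log excerpt (web-view text layout).
RAW_LOG = """\
Revision 569362 - (view) (download) (as text) (annotate) [select for diffs]
Modified Fri Aug 24 12:09:09 2007 UTC (6 weeks, 1 day ago)
by bayard
Revision 567258 - (view) (download) (as text) (annotate) [select for diffs]
Modified Sat Aug 18 11:14:52 2007 UTC (7 weeks, 1 day ago)
by tetsuya
"""

# Hypothetical pattern: revision number, UTC timestamp, committer.
ENTRY = re.compile(
    r"Revision (?P<rev>\d+).*?\n"
    r"Modified (?P<ts>\w{3} \w{3} +\d+ [\d:]+ \d{4}) UTC.*?\n"
    r"by (?P<author>\S+)"
)


def parse_svn_log(text):
    """Return one event dict per log entry, ordered by timestamp."""
    events = []
    for match in ENTRY.finditer(text):
        events.append({
            "revision": int(match.group("rev")),
            "timestamp": datetime.strptime(
                match.group("ts"), "%a %b %d %H:%M:%S %Y"
            ),
            "author": match.group("author"),
        })
    events.sort(key=lambda event: event["timestamp"])
    return events


if __name__ == "__main__":
    for event in parse_svn_log(RAW_LOG):
        print(event["timestamp"].isoformat(), event["revision"], event["author"])
```

Bug-tracker or issue-tracker exports would need their own parsers, but could feed the same event structure.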
2. Process Mining
Constructing a transition system (TS)
Modification strategies for the TS
Properties:
flexible, supports generalization
deals with complex constructs
generates consistent models
Apply the theory of regions: synthesis algorithms of Cortadella et al.
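One way to picture the first step, constructing a transition system from a log, is the sketch below. It is only an illustration under assumptions: the three activity sequences are read off the slide's example log (with the first DES assumed), and the state abstractions used here (the set of activities seen so far, or the full prefix) are common choices rather than the specific strategies of the presented approach. The region-based synthesis step is not shown.

```python
# Activity sequences as read from the slide's example log (first DES assumed).
TRACES = [
    ["DES", "CODE", "TEST", "REV"],
    ["DES", "TEST", "CODE", "REV"],
    ["DES", "VER", "CODE", "REV"],
]


def build_transition_system(traces, abstraction=frozenset):
    """Build (states, transitions) from traces.

    Each state is abstraction(prefix) for a trace prefix; each transition is
    a triple (source_state, activity, target_state).
    """
    states = set()
    transitions = set()
    for trace in traces:
        prefix = []
        state = abstraction(prefix)
        states.add(state)
        for activity in trace:
            prefix.append(activity)
            next_state = abstraction(prefix)
            transitions.add((state, activity, next_state))
            states.add(next_state)
            state = next_state
    return states, transitions


if __name__ == "__main__":
    # Set abstraction: order of earlier activities is forgotten (generalizes).
    set_states, set_transitions = build_transition_system(TRACES, frozenset)
    # Sequence abstraction: full prefix is kept (no generalization).
    seq_states, seq_transitions = build_transition_system(TRACES, tuple)
    print(len(set_states), len(set_transitions))
    print(len(seq_states), len(seq_transitions))
```

Swapping the abstraction function is what makes such a construction configurable: a coarser abstraction merges more prefixes into one state and so generalizes more.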
2. Process Mining
3. Model Analysis
Performance Perspective: process model over DES, CODE, TEST, VER and REV with edge labels 0.67, 0.33, 0.25 and 0.75
Organizational Perspective: handover graph between designer, developer, qaengineer and manager with edge weights 0.111 (seven edges) and 0.222 (one edge)
Verification (LTL)
Example trace: DES designer, CODE developer, TEST qaengineer, REV manager
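The figures in the two perspectives can be approximated with simple counting. The sketch below assumes the three-case example log read from the earlier slide (with the first DES assumed) and uses hypothetical function names; it is not the ProM implementation. Under this reading it yields 0.67 and 0.33 for the successors of CODE, and handover weights of 0.222 for designer to qaengineer and 0.111 for the seven other pairs.

```python
from collections import Counter

# (activity, role) pairs per case, as read from the slide's example log
# (first DES assumed).
LOG = [
    [("DES", "designer"), ("CODE", "developer"),
     ("TEST", "qaengineer"), ("REV", "manager")],
    [("DES", "designer"), ("TEST", "qaengineer"),
     ("CODE", "developer"), ("REV", "designer")],
    [("DES", "designer"), ("VER", "qaengineer"),
     ("CODE", "designer"), ("REV", "manager")],
]


def transition_probabilities(log):
    """Probability that activity b directly follows activity a, given a."""
    follows = Counter()
    outgoing = Counter()
    for trace in log:
        for (a, _), (b, _) in zip(trace, trace[1:]):
            follows[(a, b)] += 1
            outgoing[a] += 1
    return {pair: n / outgoing[pair[0]] for pair, n in follows.items()}


def handover_of_work(log):
    """Share of all direct handovers that go from one role to another."""
    handovers = Counter()
    for trace in log:
        for (_, source), (_, target) in zip(trace, trace[1:]):
            handovers[(source, target)] += 1
    total = sum(handovers.values())
    return {pair: n / total for pair, n in handovers.items()}


if __name__ == "__main__":
    for pair, p in sorted(transition_probabilities(LOG).items()):
        print(pair, round(p, 2))
    for pair, w in sorted(handover_of_work(LOG).items()):
        print(pair, round(w, 3))
```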
IMPLEMENTATION
1. Preprocessing
2. Process Mining
ProM Import Framework
3. Model Analysis
ProM
Slide 14
EVALUATION
Case Studies:
Main Results:
Softwaretechnikpraktikum (software engineering lab course; SCM systems CVS and Subversion)
FG Softwaretechnik (Software Engineering Group), University of Paderborn
Slide 15
CONTRIBUTIONS
A Workflow Mining Approach for Deriving Software Process Models
Tool Support
Theory of Regions
configurable
consistent
Evaluation
Sources of Experimental Data
Slide 16
AGENDA
Slide 17
Slide 18
Slide 19
Slide 20
Slide 21
MINING: INPUT
~ 30 MB Logs per Day per Environment (PROD, TEST, INT, DEV)
Logs are preprocessed and converted to CSV (30 KB per Day)
Slide 22
95 cases
482 events
50 activities
Mean duration: 6.5 min; Median duration: 26.5 s
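Summary figures of this kind (cases, events, activities, mean and median case duration) can be derived from a CSV event log with a few lines of code. The sketch below is an assumption-laden illustration: the column names `case_id`, `activity` and `timestamp` and the six sample rows are invented for the example and are not taken from the real logs.

```python
import csv
import io
from datetime import datetime
from statistics import mean, median

# Invented sample rows; the real CSV layout is not shown in the slides.
SAMPLE_CSV = """\
case_id,activity,timestamp
c1,Flight Search,2014-01-01T10:00:00
c1,Hotel Quote,2014-01-01T10:00:20
c2,Hotel Quote,2014-01-01T11:00:00
c2,Hotel Book,2014-01-01T11:00:30
c3,Show Reservation,2014-01-01T12:00:00
c3,Hotel Book,2014-01-01T12:10:00
"""


def summarize(csv_text):
    """Count cases, events and activities; compute case durations in seconds."""
    first_seen, last_seen, activities = {}, {}, set()
    events = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.fromisoformat(row["timestamp"])
        case = row["case_id"]
        first_seen[case] = min(ts, first_seen.get(case, ts))
        last_seen[case] = max(ts, last_seen.get(case, ts))
        activities.add(row["activity"])
        events += 1
    durations = [
        (last_seen[case] - first_seen[case]).total_seconds()
        for case in first_seen
    ]
    return {
        "cases": len(durations),
        "events": events,
        "activities": len(activities),
        "mean_duration_s": mean(durations),
        "median_duration_s": median(durations),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_CSV))
```

A mean far above the median, as in the slide's 6.5 min versus 26.5 s, usually indicates that a small number of long-running cases dominate the average.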
Slide 23
Frequent activities:
Hotel Quote
Hotel Book
Flight Search
Show Reservation
Slide 24
Slide 25
FOCUS ON FAILURES
Problems with:
Hotel Search
Hotel Quote
Show Reservation
Slide 26
SOME STATISTICS
Most frequent travelling directions:
Slide 27
4. We aligned the failure cases with the exceptions and created
issues for further bug fixing.
Slide 28
Slide 29
Slide 30
Slide 31
MINING: INPUT
~ 5 GB of Traces per Day per Environment (PROD, TEST, INT, DEV)
Logs are preprocessed and converted to CSV (20 MB per Day) as input for Disco
Slide 32
758 cases
61844 events
508 activities
Mean duration: 5 s; Median duration: 30 ms
Computation of the whole graph takes more than 30 minutes
Slide 33
Slide 34
Slide 35
Slide 36
SOME STATISTICS
Payloads:
Frequency of activities:
Slide 37
Slide 38
OVERVIEW
Slide 39