Professional Documents
Culture Documents
1 s2.0 S0950584917300472 Main
1 s2.0 S0950584917300472 Main
1 s2.0 S0950584917300472 Main
a r t i c l e i n f o a b s t r a c t
Article history: Context: Software Engineering (SE) is an evolving discipline with new subareas being continuously de-
Received 29 September 2015 veloped and added. To structure and better understand the SE body of knowledge, taxonomies have been
Revised 13 January 2017
proposed in all SE knowledge areas.
Accepted 14 January 2017
Available online 16 January 2017 Objective: The objective of this paper is to characterize the state-of-the-art research on SE taxonomies.
Keywords: Method: A systematic mapping study was conducted, based on 270 primary studies.
Taxonomy Results: An increasing number of SE taxonomies have been published since 20 0 0 in a broad range of
Classification
venues, including the top SE journals and conferences. The majority of taxonomies can be grouped into
Software engineering
the following SWEBOK knowledge areas: construction (19.55%), design (19.55%), requirements (15.50%)
Systematic mapping study
and maintenance (11.81%). Illustration (45.76%) is the most frequently used approach for taxonomy vali-
dation. Hierarchy (53.14%) and faceted analysis (39.48%) are the most frequently used classification struc-
tures. Most taxonomies rely on qualitative procedures to classify subject matter instances, but in most
cases (86.53%) these procedures are not described in sufficient detail. The majority of the taxonomies
(97%) target unique subject matters and many taxonomy-papers are cited frequently. Most SE taxonomies
are designed in an ad-hoc way. To address this issue, we have revised an existing method for developing
taxonomies in a more systematic way.
Conclusion: There is a strong interest in taxonomies in SE, but few taxonomies are extended or revised.
Taxonomy design decisions regarding the used classification structures, procedures and descriptive bases
are usually not well described and motivated.
© 2017 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license.
(http://creativecommons.org/licenses/by-nc-nd/4.0/)
http://dx.doi.org/10.1016/j.infsof.2017.01.006
0950-5849/© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license. (http://creativecommons.org/licenses/by-nc-nd/4.0/)
44 M. Usman et al. / Information and Software Technology 85 (2017) 43–59
[11] to group and classify organisms by using a fixed number of 2.2. Subject matter
hierarchical levels. Nowadays, different classification structures
(e.g. hierarchy, tree and faceted analysis [12]) have been used The first step in the design of a new taxonomy is to clearly
to construct taxonomies in different knowledge fields, such as define the units of classification. In software engineering this could
Education [13], Psychology [14] and Computer Science [15]. be requirements, design patterns, architectural views, methods and
Taxonomies have contributed to mature the SE knowledge field. techniques, defects etc. This requires a thorough understanding of
Nevertheless, likewise the taxonomy proposed by Carolus Linnaeus the subject matter to be able to define clear taxonomy classes or
that keeps being extended [16], SE taxonomies are expected to categories that are commonly accepted within the field [19,20].
evolve over time incorporating new knowledge. In addition, due
to the wide spectrum of SE knowledge, there is still a need to 2.3. Descriptive bases / terminology
classify the knowledge in many SE subareas.
Although many SE taxonomies have been proposed in the lit- Once the subject matter is clearly defined or an existing def-
erature, it appears that taxonomies have been designed or evolved inition is adopted, the descriptive terms, which can be used to
without following particular patterns, guidelines or processes. A describe and differentiate subject matter instances, must also be
better understanding of how taxonomies have been designed and specified. An appropriate description of this bases for classification
applied in SE could be very useful for the development of new is important to perform the comparison of subject matter in-
taxonomies and the evolution of existing ones. stances. Descriptive bases can also be viewed as a set of attributes
To the best of our knowledge, no systematic mapping or that can be used for the classification of the subject matter
systematic literature review has been conducted to identify and instances [19,20].
analyze the state-of-the-art of taxonomies in SE. In this paper, we
describe a systematic mapping study [17,18] aiming to characterize 2.4. Classification procedure
the state-of-the-art research on SE taxonomies.
The main contribution of this paper is a characterization of Classification procedures define how subject matter instances
the state-of-the-art of taxonomies in SE. Our results also show (e.g., defects) are systematically assigned to classes or categories.
that most taxonomies are developed in an ad-hoc way. We Taxonomy’s purpose, descriptive bases and classification proce-
therefore revised a taxonomy development method in the light dures are related and dependent on each other. Depending upon
of the findings of this mapping study, our own experience and the measurement system used, the classification procedure can
literature from other research fields with more maturity regarding be qualitative or quantitative. Qualitative classification procedures
taxonomies (e.g., psychology and computer science). are based on nominal scales. In the qualitative classification sys-
The remainder of this paper is organized as follows: tems, the relationship between the classes cannot be determined.
Section 2 describes related background. Section 3 presents the Quantitative classification procedures, on the other hand, are
employed research methodology. The current state-of-the-art on based on numerical scales [20].
taxonomies in SE, as well as the validity threats associated with
the mapping study, are presented in Section 4. In Section 5, we 2.5. Classification structure
present a revised method for developing SE taxonomies, along with
an illustration of the revised method and its limitations. Finally, As aforementioned, a taxonomy is mainly a classification
our conclusions and view on future work are provided in Section 6. mechanism. According to Rowley and Farrow [21] there are two
main approaches to classification: enumerative and faceted. In
2. Background enumerative classification all classes are fixed, making a classifi-
cation scheme intuitive and easy to apply. It is, however, difficult
In this section, we discuss important aspects related to tax- to enumerate all classes in immature or evolving domains. In
onomy design that serve as motivation for the research questions faceted classification aspects of classes are described that can
described in Section 3. be combined and extended. Kwasnik [12] describes four main
approaches to structure a classification scheme (classification
2.1. Taxonomy definition and purpose structures): hierarchy, tree, paradigm and faceted analysis.
Hierarchy [12] leads to taxonomies with a single top class
Taxonomy is neither a trivial nor a commonly used term. that “includes” all sub- and sub-sub classes, i.e. a hierarchical
According to the most cited English dictionaries, a taxonomy is relationship with inheritance (“is-a” relationship). Consider, for
mainly a classification mechanism: example, the hierarchy of students in an institution wherein the
top class “student” has two sub-classes of “graduate student”
• The Cambridge dictionary1 defines taxonomy as “a system for and “undergraduate student”. These sub-classes can further have
naming and organizing things, especially plants and animals, into sub-sub classes and so forth. A true hierarchy ensures the mu-
groups that share similar qualities”. tual exclusivity property, i.e an entity can only belong to one
• The Merriam-Webster dictionary2 defines taxonomy as “Orderly class. Mutual exclusivity makes hierarchies easy to represent and
classification of plants and animals according to their presumed understand; however, it cannot represent multiple inheritance
natural relationships”. relationships though. Hierarchy is also not suitable in situations
• The Oxford dictionaries3 define taxonomy as “The classification when researchers have to include multiple and diverse criteria
of something, especially organisms” or “A scheme of classification”. for differentiation. To define a hierarchical classification, it is
mandatory to have good knowledge on the subject matter to be
Since taxonomy is mainly defined as a classification system,
classified; the classes and differentiating criteria between classes
one of the main purposes to develop a taxonomy should be to
must be well defined early on.
classify something.
Tree [12] is similar to the hierarchy, however, there is no
inheritance relationship between the classes of tree-based tax-
1
www.dictionary.cambridge.org. onomies. In this kind of classification structure, common types of
2
www.merriam-webster.com. relationships between classes are “part-whole”, “cause-effect” and
3
www.oxforddictionaries.com. “process-product”. For example, a tree representing a whole-part
M. Usman et al. / Information and Software Technology 85 (2017) 43–59 45
3.4. Extraction process 1. The first author independently re-extracted the data of 50% (70)
of the studies originally extracted by the second author (ran-
The extraction process employed in this work is summarized domly selected) and vice-versa. Five disagreements were iden-
in Fig. 4 and consists of four main steps: Define a classification tified and all of them were related to the item “classification
scheme, define an extraction form, extract data, and validate the structure”.
extracted data. 2. Eighteen studies were randomly selected from the studies orig-
We designed classification scheme by following Petersen et al.’s inally extracted by the first and second authors (9 studies from
guidelines [18]. It has the following facets: each author). Those studies were independently re-extracted by
48 M. Usman et al. / Information and Software Technology 85 (2017) 43–59
Table 4
Data extraction form.
result of the data analysis (see Section 4), along with information Table 5
Publication venues with more than two taxonomy papers.
from additional literature ([12,19,20]), was used to revise an exist-
ing method previously proposed to design SE taxonomies [25], as Publication venue F
detailed in Section 5.
IEEE Intl. Conference on Software Maintenance (ICSM) 10
Intl. Conference on Requirements Engineering (RE) 6
4. Results Intl. Conference on Software Engineering (ICSE) 5
Hawaii Intl. Conference on Systems Sciences (HICSS) 4
In this section, we describe the results of the mapping study Asia Pacific Software Engineering Conference (APSEC) 4
European Conference on Software Maintenance and Reengineering 4
reported herein, which are based on the data extracted from
(CSMR)
270 papers reporting 271 taxonomies (one paper presented two Intl. Conference on Software Engineering and Knowledge Engineering 4
taxonomies). The percentages in Sections 4.1 and 4.7 reflect the (SEKE)
number of papers (270), whereas the percentages in all other Intl. Symposium on Empirical Software Engineering and Measurement 4
subsections reflect the number of taxonomies (271). (ESEM)
Americas Conference on Information Systems (AMCIS) 3
Other Conferences 101
4.1. General results Conference papers total 145
IEEE Transactions on Software Engineering (TSE) 11
Fig. 6 shows that SE taxonomies have been proposed since Information and Software Technology (IST) 9
1987, with an increasing number of these published after the year ACM Computing Surveys (CSUR) 7
Journal of System and Software (JSS) 6
20 0 0, which suggests a higher interest in this research topic.
Journal of Software: Evolution and Process 5
Table 5 displays that 53.7% (145) of the studies were published IEEE Computer 5
in relevant conferences in the topics of maintenance (International Empirical Software Engineering (ESE) 4
Conference on Software Maintenance), requirements engineering IEEE Software 3
Communications of the ACM 3
(Requirements’ Engineering Conference) or general SE topics (e.g.
Requirements Engineering 3
International Conference on Software Engineering). Taxonomies Other Journals 35
were published at 99 unique conferences with 78 featuring only Journal papers total 91
a single SE taxonomy publication. These results further indicate a Workshop papers total 34
broad interest in SE taxonomies in a wide range of SE knowledge
areas.
Table 5 also shows that 33.7% (91) of the primary studies
were published as journal articles in 44 unique journals. Tax-
onomies have been published frequently in relevant SE journals an increasing interest in SE taxonomies in a broad range of SE
(e.g. IEEE Transactions on Software Engineering and Information knowledge areas.
and Software Technology). We believe that this has been the case Fig. 7a–h depict the yearly distribution of SE taxonomies by
because the scope of these journals is not confined to a specific knowledge area for the KAs with 10 or more taxonomies. Note
SE knowledge area. that most knowledge areas follow an increasing trend after 20 0 0,
Primary studies were published also in 28 unique workshops with many taxonomies for construction, design, and quality in the
(34 – 12.6%). As for journals and conferences, the results indicate 1980s and 1990s.
50 M. Usman et al. / Information and Software Technology 85 (2017) 43–59
Fig. 7. Yearly distribution of primary studies by KAs. Horizontal axes represent the years (starting 1987), while vertical axes denote the number of taxonomies.
4.2. Classification scheme results Ninety one taxonomies (33.58%) are reported in “philosophical
papers”, wherein authors propose a taxonomy, but do not provide
In this section, we present the results corresponding to the any kind of validation, evaluation or illustration. Relatively fewer
three facets of the classification scheme described in Section 3, i.e. taxonomies are reported in “evaluation papers” (34 – 12.54%) and
SE knowledge area (KA), research type and presentation approach. “validation papers” (11 – 4.06%).
The vertical axis in Fig. 8 depicts the SE knowledge areas in Fig. 8 also depicts the classification of the taxonomies using
which taxonomies have been proposed. Construction and design 2 aspects of the classification scheme, i.e. SE knowledge area and
are the leading SE knowledge areas each with 53 (19.55%) tax- research type.
onomies. These are relatively mature SE fields with a large body Taxonomies in the knowledge areas construction and design
of knowledge and a high number of subareas. are mostly reported either as solution proposals (construction –
A high number of taxonomies have also been proposed in 27; design – 31) or philosophical papers (construction – 20; design
the requirements (42 – 15.50%), maintenance (32 – 11.81%) and – 17). Taxonomies in the knowledge areas requirements, mainte-
testing (27 – 9.96%) knowledge areas. Few taxonomies have been nance and testing are better distributed across different research
proposed in economics (3 – 1.11%) and professional practice (3 – types, wherein besides the solution proposal and the philosophical
1.11%), which are more recent knowledge areas. research types, a reasonable percentage of taxonomies are reported
The results show that most SE taxonomies (76.37%) are pro- as evaluation or validation papers.
posed in the requirements, design, construction, testing and The horizontal axis in Fig. 9 shows the distribution of tax-
maintenance knowledge areas, which correspond to the main onomies by presentation approach. Most taxonomies (57.93%) are
activities in a typical software development process [26]. presented purely as text or table, while 42.07% of the taxonomies
The horizontal axis in Fig. 8 shows the distribution of tax- are presented through some graphical notation in combination
onomies by research types, according to Wieringa et al. [24]. with text.
Most taxonomies are reported in papers that are classified as Fig. 9 also displays the classification of the identified tax-
“solution proposals” (135 – 49.82%), wherein the authors propose onomies in terms of SE knowledge area and presentation approach.
a taxonomy and explain or apply it with the help of an illustration. The results show 2 different trends:
M. Usman et al. / Information and Software Technology 85 (2017) 43–59 51
Table 6 Table 7
Approaches for utility demonstration. Classification structure.
Descriptive basis F %
Fleishman et al. [28]. For the remaining 93.7% (254) taxonomies,
no definition of “taxonomy” was provided. Sufficiently clear description 248 91.51
To identify the purpose of each taxonomy, we extracted the Without sufficient description 23 8.49
relevant text, referred here as purpose descriptions, from each
of the primary studies, using a process similar to open coding Table 9
[29,30]. As codes we used the keywords used in the primary Classification procedure types.
studies to describe a taxonomy’s purpose.
Classification procedure F %
For about 56% of the taxonomies, the authors used “classify”
(48.80%) or “categorize” (7.74%) to describe the purpose of their Qualitative 262 96.68
taxonomy. For 5.9% of the taxonomies it was not possible to Quantitative 7 2.58
Both 2 0.74
identify a specific purpose. For the remaining taxonomies (38.37%),
we found 41 different terms for describing the purpose, e.g.,
“identify”, “understand”, and “describe”. Table 10
Classification procedure descriptions.
Table 11
Number of primary studies (Frequency) by number of citations.
Citations F %
≥300 8 3.0
≥200 16 5.9
≥100 30 11.1
≥50 61 22.6
≥20 105 38.9
≥10 143 53.0
≥1 244 90.4
0 26 9.6
– 83.76%) and only 44 taxonomies (16.24%) have an explicit Fig. 10. Distribution of average number of citations per year. The x-axis shows
description. ranges of average citations per year. The y-axis shows the number of primary stud-
ies.
The results clearly indicate that hierarchical classification struc-
tures (hierarchy, tree and paradigm) are the ones mostly used to Table 12
design SE taxonomies. This is not surprising, since taxonomies Number of citations (mean and median) for primary papers by knowledge area.
were originally designed using hierarchy as classification structure
Knowledge area Mean Median
(see Introduction).
Nevertheless, SE is considered as a young discipline that is in Construction 92.7 17
constant and fast evolution. For this reason, the existing knowl- Design 41.2 9
Requirements 38.1 10.5
edge is sometimes incomplete or unstable. This is probably the
Maintenance 40.9 21
reason faceted analysis is also frequently employed as classification Testing 28.8 8
structure to design SE taxonomies; such classification structure Quality 46.6 14.5
is the one that is most appropriate for knowledge fields with Models & methods 68.2 9
incomplete and unstable knowledge [12]. It was interesting to Process 10.1 5
12 13
The number of citations for each primary study was fetched from Google Average is computed by dividing citation count of a study by number of years
Scholar during Jan–Feb 2015. from publication date to 2015.
54 M. Usman et al. / Information and Software Technology 85 (2017) 43–59
Table 13
Comparison between the original and the revised versions of Bayona-Oré et al.’s method.
Phase Activity according to Bayona-Oré et al. [25] Revised activity Rationale for change
Planning A1: Identify the area of the study B1: Define SE knowledge area Similar to A1
A2: Define the objectives of the taxonomy B2: Describe the objectives of the taxonomy Combination A2 and A4
A3: Develop a survey of user needs — Deleted due to Issue 3
A4: Define the scope of the proposed taxonomy — Part of B1 and B2
A5: Define the team in charge of developing the — Deleted due to Issue 3
taxonomy
A6: Identify the required resources — Moved to next “phase”
A7: Document the plan — Deleted due to Issue 3
A8: Obtain commitment and support — Deleted due to Issue 3
— B3: Describe the subject matter to be classified Based on results of this study, from Wheaton
[20] and Glass and Vessey [19]
— B4: Select classification structure type Added to address Issue 4. Based on results of
this study,from Kwasnik [12] and Glass and
Vessey [19]
— B5: Select classification procedure type Added to address Issue 5. Based on results of
this study, from Wheaton [20] and Glass and
Vessey [19]
— B6: Identify the sources of information Same as A9; considered as a planning activity
Identification A9: Identify the sources of information — Moved to planning phase
and A10: Extract all terms and identify candidate B7: Extract all terms Partly corresponds to A10; categories’
categories identification is covered by B10
extraction — B8: Perform terminology control Combines A11 and A13
Design A11: Check the list of terms and define the criteria — Corresponds to B8
and A12: Define the first level of the taxonomy design — Part of B9 and B10
Construction A13: Perform terminology control — Moved to identification and extraction phase
A14: Define the subsequent levels of the taxonomy — Covered in B10
A15: Review and approve the taxonomy by — Deleted due to Issue 3
stakeholders and experts
— B9: Identify and describe taxonomy dimensions Added to address Issue 4. Based on results of
this study, from Kwasnik [12] and Glass and
Vessey [19]
— B10: Identify and describe categories of each Added to address Issue 4. Based on results of
dimension this study, from Wheaton [20], Kwasnik [12],
Glass and Vessey [19] and A10, A12, A14
— B11: Identify and describe the relationships Added to address Issue 4. Based on results of
this study, from Wheaton [20], Kwasnik [12]
and Glass and Vessey [19]
A16: Define the guidelines for using and updating B12: Define the guidelines for using and Same as A16
the taxonomy updating the taxonomy
Testing A17: Test the taxonomy B13: Validate the taxonomy Combination of A17 and A18
and Validation A18: Incorporate improvements as a result of the — Part of B13
tests
Deployment A19 : Prepare the training plan — Deleted due to Issue 3
A20: Train users — Deleted due to Issue 3
A21: Collect evidence of learning — Deleted due to Issue 3
A22: Use the selected technology and make the — Deleted due to Issue 3
taxonomy available
A23: Develop the management and maintenance — Deleted due to Issue 3
plan
A24: Manage and maintain the taxonomy — Deleted due to Issue 3
manner. Only Bayona-Oré et al. [25] described a systematic ap- • Planning - In this phase, plans are established to define the
proach to develop their taxonomy. However, the method proposed activities for the design and implementation of the taxonomy.
by them has some issues and thus can be enhanced. Therefore, in It has eight activities.
the rest of this section we describe Bayona-Oré et al.’s taxonomy • Identification and extraction - In this phase, the work plan
design method and its limitations. and the information required by the organization are aligned.
It has two activities.
Bayona-Oré et al.’s method • Design and construction - In this phase, the taxonomy is de-
Bayona-Oré et al. proposed a taxonomy development method signed and constructed using the inventory of terms in order to
that was used to create a taxonomy of critical success factors for determine the final structure of the taxonomy. It has six activi-
software process deployment. They have developed the method ties.
based on a literature review of methods and guidelines used • Testing and validation - In this phase, it is ensured that the de-
for developing taxonomies. Each author identified, reviewed and signed taxonomy will be useful for users to achieve their goals.
analyzed the activities included in the reviewed literature. It has two activities.
The identified activities were aggregated according to their • Deployment - In the final phase, the taxonomy is deployed
similarities. As a result, the authors organized the activities in throughout the organization. It has six activities.
five groups, which are considered as the phases of the proposed
method. The resulting method consists of five phases and 24
activities. The phases are described as follows (the activities are The activities of the method are to be carried out in sequence.
presented in Table 13): Furthermore, the method is built upon the assumption that the
M. Usman et al. / Information and Software Technology 85 (2017) 43–59 55
taxonomy to be developed has hierarchical-based classification • Data extraction – This threat refers to the possibility that dif-
structure. ferent reviewers interpret and handle data extraction in differ-
ent ways. This could lead to different data extracted from the
Bayona-Oré et al.’s method issues same primary study, depending on the reviewer who carried
Based on literature from more mature fields (e.g., [12,19,20]), out the data extraction. The first action to mitigate this threat
the lessons learned during the conduction of the mapping was to design a spreadsheet to be used by all authors during
study, and our previous experience with designing SE tax- the data extraction. The spreadsheet, as well as the definitions
onomies/classification schemes [31–34], we identified the follow- of the data that had to be extracted was discussed by the re-
ing issues with Bayona-Oré et al.’s method: viewers to ensure a common understanding. The second miti-
• Issue 1 – Six out of nine sources on which Bayona-Oré et al.’s gation action was to perform a cross-validation of the extracted
method is based are either gray literature or are not available data, as detailed in Subsection 3.4. Cross-validation showed that
in English. the item “classification structure” was specially difficult to ex-
• Issue 2 – The proposed method is a simple aggregation of all tract, since none of the papers provided that item explicitly.
the activities suggested in the nine sources, thus some of these It had to be inferred from the text, which led to a high level
activities are only used in one of the peer-reviewed sources. of disagreement between reviewers. Thus, the third mitigation
• Issue 3 – The scope of some of the suggested activities goes action, performed jointly by the first three authors, was to re-
beyond taxonomy design, as they cover steps related to project screen all 280 studies, focusing on the “classification structure”,
planning, recruitment, training and deployment, which are not as described in Subsection 3.4.
necessarily relevant when designing taxonomies.
• Issue 4 – The method does not have activities to deal with clas- 5. A revised taxonomy development method
sification structures, i.e. it is only able to deal with hierarchical-
based classification structures. As per the results of our map- As aforementioned, we identified some issues associated with
ping study, 39.48% of the identified taxonomies have a facet- Bayona-Oré et al.’s taxonomy design method. Thus, we propose
based classification structure, which suggests the need for a the following changes to address the method’s limitations:
method that allows for designing facet-based taxonomies.
1. Exclude phases and activities that were out of the scope of tax-
• Issue 5 – The method does not have activities to deal with
onomy development (Issue 3).
classification procedures. As per existing literature [20], it is
2. Exclude activities related to Issues 1 and 2 that were not sup-
mandatory to define clear classification criteria, which can be
ported by the results of our mapping study and literature from
better done if a method is used to systematize the classifica-
other research fields.
tion procedure design.
3. Incorporate new activities to address Issues 4 and 5.
We addressed the aforementioned issues by revising Bayona- 4. Analyze the phases and activities to identify additional aspects
Oré et al.’s method. The revised method is described in Section 5. that could be further improved, such as the sequence of the
activities and whether activities could be merged to facilitate
4.9. Threats to validity related to the mapping study the use of the method.
The following three major validity threats were identified and As a result, we revised Bayona-Oré et al.’s method, which
mitigated in relation to the systematic mapping study: resulted in a revised method that has 4 phases with 13 activities
in total (the original method has five phases and 24 activities).
• Coverage of the search string – This threat refers to the effec-
A side-by-side comparison between the original and the revised
tiveness of the applied search string to find a sufficiently large
versions of Bayona-Oré et al.’s method is shown in Table 13. A
number of relevant primary studies. To mitigate this threat, the
more detailed description of each activity of the revised method is
search string was designed to be as comprehensive as possible.
provided in Subsections 5.1–5.1.
To achieve this goal, we not only included the SWEBOK knowl-
edge areas related to SE; but also main synonyms of the in-
cluded knowledge areas. The search string still cannot be con- 5.1. Phases and activities of the revised method
sidered complete as we did not include an exhaustive list of
synonyms for all knowledge areas. The search string returned a Here we describe the phases and activities of the revised
high number of primary studies (1517 unique studies). The ac- method. Note that we adjusted the description of the phases and
curacy of the search string was considered as fair, since 18,46% activities to reflect the changes we made in the method.
of the 1517 were included after the selection process. An addi-
tional mitigation action was to apply the search string on mul- Planning
tiple databases. This mapping study includes only those works The planning phase has 6 activities that define the taxonomy’s
as primary studies whose authors explicitly claim their work context and initial setting, i.e. SE knowledge area, objective, sub-
as taxonomies. While this is a limitation, we decided to do ject matter, classification structure type, classification procedure
so to avoid misclassifying studies as taxonomies when authors type and sources of information.
themselves did not name their classifications explicitly as tax- In activity B1, one selects and makes clear the SE knowledge
onomies. area for the new taxonomy. This makes it easier to understand the
• Study selection – This threat refers to the likelihood of dis- context of the taxonomy and thus to apply it.
agreements between reviewers regarding study inclusion or ex- In activity B2, the objectives and scope of the taxonomy are de-
clusion. The first mitigation strategy was to define clear selec- fined clearly, so that it is applied within the intended boundaries.
tion criteria. The selection criteria were discussed by all the In activity B3, the subject matter for the classification is
authors to ensure shared understanding of the criteria. When- described in detail. This is a fundamental step in taxonomy
ever a paper was excluded, a reviewer had to provide a reason development as discussed in Sections 2 and 4.
for the exclusion. The second mitigation action was to perform In activity B4, an appropriate classification structure type is
cross-validations in both level-1 and level-2 screenings, as de- selected, which can be hierarchy, tree, paradigm or facet-based
scribed in Subsection 3.3. (see Sections 2 and 4).
56 M. Usman et al. / Information and Software Technology 85 (2017) 43–59
In the design and construction phase, we mapped four activ- able approach for validating the proposed taxonomy considering
ities (B9, B10, B11 and B12). As a result of B9, eight dimensions factors such as taxonomy purpose and audience.
were identified, and as result of B10, 25 categories were identi- We also identified that hierarchy (53.14%) and facet-based
fied (see Table 14 for the identified dimensions and categories). (39.48%) were the most frequently used classification structures.
Regarding B11, due to the classification structure employed (facet- Majority of the studies used a qualitative procedure to assign
based), there is no relationship between the dimensions. Finally, subject matter instances to classes.
the authors did not carry out B12, i.e. they did not design any The overall result of this paper indicates that, except for one
guidelines for using and evolving the taxonomy. Nevertheless, study, SE taxonomies have been developed without following any
they have illustrated the taxonomy, which provides some guidance systematic approach. Bayona-Oré et al. have recently proposed a
about how to use the proposed taxonomy. method to develop SE taxonomies. Nevertheless, in the light of
In the validation phase, they only demonstrated the utility results presented in this paper, we identified a need to revise
of the proposed taxonomy by classifying size metrics reported in this method. Therefore, we revised Bayona-Oré et al.’s method,
existing literature (outcome of B13). which is another contribution of this paper. We intend to apply
this revised method to validate and demonstrate its usefulness in
5.3. Limitations of the revised method developing SE taxonomies.
[24] R. Wieringa, N. Maiden, N. Mead, C. Rolland, Requirements engineering paper [32] R. Britto, C. Wohlin, E. Mendes, An extended global software engineering tax-
classification and evaluation criteria: a proposal and a discussion, Requir. Eng. onomy, J. Softw. Eng. Res. Dev. 4 (3) (2016b) 1–24.
11 (1) (2005) 102–107. [33] E. Mendes, S. Counsell, N. Mosley, Towards a taxonomy of hypermedia and
[25] S. Bayona-Oré, J.A. Calvo-Manzano, G. Cuevas, T. San-Feliu, Critical success fac- web application size metrics, in: D. Lowe, M. Gaedke (Eds.), Web Engineering,
tors taxonomy for software process deployment, Softw. Qual. Control 22 (1) Lecture Notes in Computer Science, 3579, Springer Berlin Heidelberg, 2005,
(2014) 21–48. pp. 110–123.
[26] R. Pressman, Software Engineering: A Practitioner’s Approach, 7, McGraw-Hill, [34] J. Börstler, Feature-oriented classification for software reuse, in: Proceedings
Inc., New York, NY, USA, 2010. of the 7th International Conference on Software Engineering and Knowledge
[27] D.H. Doty, W.H. Glick, Typologies as a unique form of theory building: to- Engineering, 1995, pp. 204–211.
ward improved understanding and modeling, Acad.Manage. Rev. 19 (2) (1994) [35] C. Wohlin, A. Aurum, Towards a decision-making structure for selecting a re-
230–251. search design in empirical software engineering, Empirical Softw. Eng. (2014)
[28] E.A. Fleishman, M.K. Quaintance, L.A. Broedling, Taxonomies of Human Perfor- 1–29.
mance: The Description of Human Tasks., Academic Press, 1984. [36] M. Usman, Supporting Effort Estimation in Agile Software Development,
[29] G.R. Gibbs, Analyzing Qualitative Data (Book 6 of The SAGE Qualitative Re- Blekinge Institute of Technology, 2015.
search Kit), London: Sage, 2007. [37] R. Britto, M. Usman, E. Mendes, A web effort predictors taxonomy, in revision
[30] B. Glaser, A. Strauss, The Discovery of Grounded Theory: Strategies for Quali- with Journal of Web Engineering (2017).
tative Research, Observations (Chicago, Ill.), Aldine Transaction, 2009.
[31] R. Britto, E. Mendes, C. Wohlin, A specialized global software engineering tax-
onomy for effort estimation, in: Proceedings of the 11th IEEE International
Conference on Global Software Engineering (ICGSE), 2016a, pp. 154–163.
M. Usman et al. / Information and Software Technology 85 (2017) 43–59 59
Muhammad Usman is a PhD student at Department of Software Engineering, Blekinge Institute of Technology, Sweden. He completed his Masters in Computer Science in
2006 from Mohammad Ali Jinnah University, Islamabad. From 2007 to 2013, Usman worked as lecturer and researcher at International Islamic University Islamabad, Pakistan.
His research interests include empirical SE, evidence based software process improvement, agile and global software development. Contact him at muhammad.usman@bth.se.
Ricardo Britto is a PhD candidate of the Department of Software Engineering at Blekinge Institute of Technology. Britto received an MSc in Electrical Engineering from Federal
University of Rio Grande do Norte in 20 08. Between 20 09 and 2013, Britto worked as researcher and project manager at Federal University of Piaui-Brazil. His research
interests include large-scale agile software development, global software engineering, search-based software engineering and software process improvement. Contact him at
rbr@bth.se.
Jürgen Börstler is a professor of software engineering at Blekinge Institute of Technology, Sweden, where he also is Head of the Department of Software Engineering.
Previously, he has been a professor of computer science at Umea University, Sweden. Börstler received a PhD in computer science from Aachen University of Technology
(RWTH), Germany. He is a member of SERL-Sweden, the Software Engineering Research Lab at BTH and a senior member of the Swedish Requirements Engineering Network.
His main research interests are in requirements engineering, object-oriented methods, software process improvement, software measurement, software comprehension, and
computer science education. He also is a founding member of the Scandinavian Pedagogy of Programming Network.
Emilia Mendes is a professor of the Department of Computer and Engineering Science at Blekinge Institute of Technology, Sweden, and also a Finish distinguished professor
at the University of Oulu, Finland. She has previously been an associate professor at universities in Auckland (NZ) and Zayed (UAE). Mendes received a PhD in Computer
Science from the University of Southampton (UK) in 1999. Her research interests include Computer Science, Empirical Software Engineering and Web Engineering. To date
she has published over 200 refereed publications, which include three books. She worked in the ICT industry for ten years as programmer, business analyst and project
manager prior to moving to the UK in the end of 1995 to initiate her PhD studies. Contact her at emilia.mendes@bth.se.