2014 Icmcs

Towards an Automatic Evaluation of UML Class
Diagrams by Measuring Graph Similarity
Anas Outair1, Abdelouahid Lyhyaoui1, Mariam Tanana1

1
National School of Applied Science, Tangier, Morocco
{ anas.outair@gmail.com , lyhyaoui@gmail.com , Mariam.tanana@hotmail.fr }
Abstract. This paper aims to develop models, methods and tools for the design
dedicated to learning Oriented Object Modeling. We address the problem of
analysis of student answers and in particular the construction of a UML class
diagram from textual specifications. The main objective is to propose a method
to automatically analyze the diagrams conducted by students. This proposal is
applied to analyze the UML class diagrams and focuses primarily on the
structural and semantic aspects of the models to match.
Keywords: Learner assessment, UML Diagrams, Transformation of graphs,

Matching and measuring graph similarity
1 Introduction
UML is one of the most important courses in higher education, and modeling Object
Oriented which aims to understand the models of design concepts in the different
phases of the software development cycle. Modeling is an activity open; it has no
general method of resolution. It requires the control of object-oriented concepts and
training on examples. As part of our work, we focus on one of the basic activities
involved in teaching object-oriented modeling: construction from textual
specifications of an UML class diagram analysis by students. The general problem on
which we focus is the analysis of learner productions. This problem leads to new
questions: how to validate a response from the learner in the absence of educational
resolver? What feedback can be provided? How to answer these questions preserving
flexibility and generic?
The main objective of this thesis is to propose a method for automatic analysis of
diagrams of the learner in the modeling business environment conducted by students.
This method should be independent for educational needs, to ensure some generic so
that the results can be used to produce synchronous feedback for Human Learning. To
meet this goal, we want to investigate existing learning environments for modeling
and analysis of student productions and finally the relating works on transformation,
matching and similarity measure. From this study, we adopted the principle to design
a tool based on the comparison and matching components of several diagrams. To
describe this article, we organized into seven parts. The first section describes some
methods and tools to evaluate the learner in general. The second section describes the
specific concepts and features in UML language; we define first what the UML
diagram, and then focus on the class diagram. We conclude this section with an
example of construction of an exercise class diagram produced by students and
analysis of several representations as graph of the same example. The third part
relates the consistency check of diagrams, analysis of the responses of the learner, the
production of educational feedback by focusing on the notion of differences between
the student diagram and a reference diagram constructed by a expert. The fourth
section specifies the concepts inherent in the model matching problem, their
application contexts and classical techniques employed to respond. We clarify first
the concepts and terminology related to the matched model.
2 Learning Assessment
The learning assessments occupy a very important place in the education. The
knowledge acquired by the students can be tested by the teacher in the form of a
formative assessment and summative assessment [1].
When a teacher wants to evaluate a learner, he begins by determining the aimed
educational objective. Next, according to the cognitive level of this objective, he will
choose one or several types of exercises. The higher the cognitive level to evaluate is,
the more it is necessary to give freedom to the learner. The aim of this work is to
evaluate the skills of the learner, which corresponds to the cognitive level
« application » [2]. We are interested in tools adapted to this choice. Indeed, the main
objective is to develop tools helping the teacher to realize an evaluation, which is the
most objective possible, of the students’ know-how/ skills. The chosen field of
application is the UML diagrams, where the implementation of case studies is
necessary for a good understanding of the basic notions of the UML design and
modeling. The students have to conceive these case studies. The teacher’s work will
consist at evaluating the obtained results. It is a tedious task because the correction of
an UML diagram created by a student is difficult to understand, especially if there are
several possible solutions. Since UML does not provide the methodology for
modeling, the students have difficulties during the construction of a class diagram.
When students construct an UML diagram, which has several solutions, it might be
presented in different ways and point of views. For this reason, this paper offer
assistance to the teacher in the task of evaluating the UML diagrams produced by
students. Firstly, we are interested in class diagrams because it is the most
complicated part in the diagram design. In the following section, we will present the
different view of a system model, and the categories of elements of class diagram.
3 Assessment of Differences between Class Diagrams
We start by focusing on the evaluation of the class diagram, because it is the most
used, and considered as the most important of object oriented modeling. To
demonstrate this approach, an example model containing a teacher’s class diagram
and a student’s class diagram, to detect all differences between them.
Thereafter, we will illustrate the use of class diagrams with an example of a case
study on «Library management” produced by teacher and student.
Figure 1 shows a class diagram of the teacher modeled from a text statement, this
reference diagram is not optimal, there may be alternative solutions with semantic
equivalence but structurally different. First of all the focus is on a single solution of
teacher’s class diagram to find the differences with the class diagram produced by a
single student.
3.2 Class Diagrams
Fig. 1. The diagram of the student and the reference diagram constructed by the teacher are
shown respectively in the left and right.
5 Taxonomy Differences
The taxonomy of differences was constructed from the manual comparison diagram
of the student with a reference diagram for exercise of our case study. The differences
are grouped into eight categories:
Omission of an element: the student has omitted an element of the teacher class
diagram. Adding of an element: the student added in this diagram an element that is
not represented in the teacher class diagram. Transfer of an element: an element has
been moved. Duplication of an element: an element of the teacher class diagram is
shown in the class diagram of the student by several elements of the same type.
Fusion of elements: several elements of the same type are represented in the student
class diagram by a single element. Reversing the direction of a relationship: the sense
of a relationship (inheritance, aggregation or composition) was reversed by the
student.
The differences were developed from a manual comparison; several differences
have been found in class, attribute, method, relationship, orientation relationships and
multiplicities. The differences can be expressed as insertion, omission, inversion and
replacement. The student’s class diagram is incomplete; he omitted to represent some
elements of the diagram. These omissions are for instance the absence of a class
which implies that the attributes, operations and relations that connect them to other
classes in the diagram are also absents. It may be that the student has omitted to
represent attribute inherited by subclasses of the super class, then these classes will
also be incomplete. However, the insertion of an element or relationship in the
diagram constructed by the student refers to the fact that he did not respect the case
study, or that he made a mistake. Table 1 illustrates an example of general differences
between the teacher’s class diagrams and the student’s class diagram.
Table 1. The general differences between two class diagrams.
Differences noted by the teacher Feedback

Omission {Librarian} Omission of class and
Omission {works (Librarian  Library)} elements associated with
Omission {registered (Librarian  Member)} this class
Omission {Borrow} Omission of association
Omission {do (Librarian  Borrow)} class and elements
Omission {know (Library  Borrow)} associated with this
association class
{ have (Library  Document) } REVERSE Reversing the direction
{ have (Document  Library) } of a aggregation
relationship
{ Dictionary Document} REVERSE Reversing the direction
{ Document Dictionary) } of a generalization
relationship
Those differences have been done manually, if we want to do it automatically or

semi-automatically, it will be difficult with the graphic form of these diagrams.
Thereafter, we would like to represent it in an easier and handle able format. Indeed,
the class diagrams contain several links between classes and each class has several
attributes or operations. Links can be of different types (combination of inheritance,
aggregation, composition and simple association) and be labeled differently (role,
multiplicity, and navigability) [3]. We have shown in this section that the solution
produced by the student does not automatically infer whether the student develops the
correct or erroneous constructions in relation to the case study requested. The use of a
valid solution or several solutions defined by a teacher's necessary for a system to be
able to automatically analyze student productions. Oversights and errors that the
student commits are identified from a diagram constructed. At a more general level,
the problem of comparison of several different models created by students has been
studied outside of a learning environment. This problem is similar to a model
matching process, which is why we present the model matching problem and the
classical approaches that have been developed to treat it.
6 Graph Matching
Graph matching plays a central role in solving correspondence problems in computer

vision. Graph matching problems that incorporate pair-wise constraints can be cast as
a quadratic assignment problem [4]. Matching graph labeled is finding semantic
correspondences between two graphs [5]. The matching can be considered as an
operation or an operator which takes two graphs as input and produces a mapping
between the elements of two graphs corresponds semantically to each other [6]. The
graph matching problems consisting in mapping the vertices of two graphs, the
objective being to compare the objects modeled by graphs.
6.1 Match Result
The matching result may associate one or more elements of the first graph to one or
more of the other graph, and vice versa. Cardinality relationships between paired
elements are introduced to describe the matching products. Cardinality relationships
correspond to four scenarios:
1: 1 (one element associated with an element),
1: n (an element associated with n elements),
n: 1 (n elements associated with an element),
m: n (m elements associated with n elements).
The last three scenarios correspond to elements of multiple matches. The cardinality
of the matching is described as:
Local: one or more elements of a graph can be matched to one or more of the other
graph.
Overall: an element of a graph may intervene in zero, one or more matches of
elements of alignment produced between two graphs.
Most matching approaches are restricted to the local cardinality 1: 1 because they
select as candidate matching for an element of a model, the element most similar in
the other model [7]. So the products alignments generally have a global cardinality 1:
1 or 1: n.
6.2 Matching Process
The matching process is shown in the figure 3 can be seen as a function f which from:
A pair of models to match n and n';
An input alignment M;
A set of parameters p;
A set of external resources r.
Returns an alignment M' between these graphs.
M’ = f (n, n’, M, p, r). (1)
Fig. 3. Matching process.
6.3 Problems matching and similarity measure
The matching processes several graphs and their constituents that they are also
compared according to various criteria. The comparison of several objects based on
the evaluation of their similarities and their differences. Measuring the similarity of
two objects is to identify commonalities as the distance measure between two objects
is the identification of the differences. So the distance and similarity are two concepts
that refer to a common goal. The calculation of the distance or similarity of two graph
structures allows you to find the best ² of vertices of a graph, that is to say the one that
preserves the most characteristic vertices and edges [8]. A graph matching is a
matching:
Univalent when each vertex of a graph is associated with at most one vertex of the
other graph (matching cardinality 1: 1).
Multivalent when each vertex of a graph can be associated to a set of vertices of
the other graph [15]. If the matching cardinality (1: n), then it is a burst of vertices,
and if the matching cardinality (n: 1) then it is a fusion of vertices.
6.4 Syntactic, Structural and Semantic Matching Techniques
The structural matching techniques based on graphs consider the input graphs in the
form of labeled directed graphs. Comparing the similarities between the node pairs is
focused on the analysis of their position within the graph to compare. Calculating the
similarity in some algorithm may be more specifically oriented on the child nodes, the
leaves or the relationships between the nodes. The matching system can be syntactic
and semantic. The syntactic matching approaches do not directly analyze the
significance of the terms, and therefore the semantics. In these approaches, semantic
correspondences are determined using syntactic similarity measures, usually
expressed in an interval [0..1] or directed techniques that consider the syntax labels as
strings [9]. The figure 4 shows the difference between a syntactic approach and a
semantic approach.
Fig. 4. Matching process [16].

6.5 Similarity Measure for Matching Graph
We studied the existing approaches of the parameterized distance measurement

depending on the type of graph to match for different graph matching problems. This
measure of similarity between two graphs is based on their common characteristics
throughout all their characteristics. It models the measurements of distance and
similarity existing graphs specific to the application domain and type of desired match
(unique and / or multivalent), to calculate the similarity score of each of the vertices
and arcs [8], [10].We recall that a graph is a data structure used in particular to model
objects in terms of components (called vertices) and binary relations between
components (called arcs). A multi digraph is a directed graph which is permitted to
have multiple arcs, i.e., arcs with the same source and target nodes. A multi
digraph G is an ordered pair G= (V, A) with:
V is a set of vertices. A multi set of ordered pairs of vertices called directed
edges, arcs or arrows [11].
A labeled multi digraph G is a multi graph with labeled vertices and arcs [12]. A
labeled graph is a directed graph such that vertices and edges are associated with
labels. Without loss of generality, we shall assume that every vertex and edge is
associated with at least one label: if some vertices (resp. edges) have no label, one can
add an extra anonymous label that is associated with every vertex (resp. edge) [13].
More formally, given a finite set of vertex labels LV , and a finite set of edge labels LE,
a labeled graph will be defined by a triple G =( V, RV , RE) such that:
V is a finite set of vertices. RV ⊂ V x LV is the relation that associates vertices with
labels, i.e., RV is the set of couples (vi , l) such that vertex vi has label l. RE ⊂ V x V x
LE is the relation that associates edges with labels, i.e., RE is the set of triples (vi , vj , l)
such that edge (vi , vj) has label l. Note that from this edge relation RE, one can define
the set E of edges.
The similarity of two graphs G and G 'with respect to a mapping m vertices and arc
is defined by:
The function f weighted characteristics of graphs G and G '. The split function
calculates the set of m bursts. The function g weights these bursts. The two functions f
and g are customizable to the needs of the application. The maximum similarity
sim (G, G ') of two graphs G and G' is the best pairing of vertices and m arcs.
7 Conclusion
The graph matching problem is complex and can be approached from various
techniques and algorithms. It is apparent that the use of a single technique is not
satisfactory to meet the matching problem of graphs. The use of several techniques
and several matching increases the calculations and therefore the time to produce
alignment, but requires to think about how they will be combined and configured.
The main entrances of graph matching systems are directed acyclic graphs whose
alignment of the components will be identified. Auxiliary data in addition to the
graphs will facilitate the matching process by clarifying semantic graphs and allow in
some cases to remove ambiguities and to direct or accelerate the process.
Correspondence of the proposed alignment by a matched model system focused on
the similarities and usually qualified by a semantic equivalence relation or a real score
between 0 and 1. The semantic relationships are more advanced in the semantic
matching techniques. In the studied techniques, we note that the matching problem is
difficult to treat automatically by the system. The intervention of a human operator
may be required. In addition, the result produced is processed by a domain expert to
check its relevance. The approaches proposed in the area are mostly semi-automatic.
References
1. Cowie, Bronwen, B. Bell.: A model of formative assessment in science

education. Assessment in Education: Principles, Policy & Practice Vol. 6, n°1, P. 101-116,
(1999)
2. C. Hadji, The evaluation demystified. ESF, 2th edn. p. 126, (1999).
3. RENSINK, Arend and KLEPPE, Anneke. On a graph-based semantics for uml class and
objectdi agrams. Electronic Communications of the EASST, 2008, vol. 10.
4. E. M. Loiola, N. M. De Abreu, P. O. Boaventura, P. Hahn, T. M. Querido, A Survey for the
Quadratic Assignment Problem European Journal of Operational Research, (2007)
5. Do H.-H., Rahm E., Matching Large Schemas: Approaches and Evaluation, In: the Journal
on Information Systems, vol. 32, n°6, 857-885, (2007)
6. Rahm E., Bernstein P. A., A survey of approaches to automatic schema matching, In: The
international Very Large DataBases Journal (VLDB Journal), vol. 10, n°4, Springer Berlin /
Heidelberg, 334-350, (2001)
7. Do H.-H., Melnik S., Rahm E., Comparison of Schema Matching evaluations, In: Web,
Web-Services, and Database Systems, NODe 2002 Web and Database-Related Workshops,
Erfurt, Germany, October 7-10 2002, Revised Papers,LNCS 2593, Springer Berlin /
Heidelberg, 221-237, (2003)
8. Sorlin S., Solnon C., Jolion J.-M., A Generic Graph Distance Measure Based on Multivalent
Matchings, In: Applied Graph Theory in Computer Vision and Pattern Recognition, 151–182,
(2007)
9. Shvaiko P., A Classification of Schema-Based Matching Approaches, In: Proceedings of
Meaning Coordination and Negotiation workshop at ISWC’04, Hiroshima, Japan, 119-130,
(2004)
10. Sorlin S., Measure the similarity of graphs, Laboratory for Computer Science Thesis in the
Image and Information Systems, University Claude Bernard Lyon 1, France, p. 142, (2006)
11. Sorlin. S and Solnon. C, "Reactive tabu search for measuring graph similarity." Graph-
Based Representations in Pattern Recognition. Springer Berlin Heidelberg, 172-182, (2005)
12. Diestel, Reinhard; Graph Theory, Springer; 2nd edition, ISBN 0-387-98976-5, February 18,
(2000 )
13. Champin. P. A and Solnon. C, Measuring the similarity of labeled graphs. In Case-Based
Reasoning Research and Development (pp. 80-95). Springer Berlin Heidelberg, (2005)

2014 Icmcs

Uploaded by

Copyright:

Available Formats

You might also like

2014 Icmcs

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

2014 Icmcs

Uploaded by

Copyright:

Available Formats

Towards an Automatic Evaluation of UML Class

Diagrams by Measuring Graph Similarity

Anas Outair1, Abdelouahid Lyhyaoui1, Mariam Tanana1

Keywords: Learner assessment, UML Diagrams, Transformation of graphs,

3 Assessment of Differences between Class Diagrams

3.2 Class Diagrams

Table 1. The general differences between two class diagrams.

Differences noted by the teacher Feedback

Those differences have been done manually, if we want to do it automatically or

Graph matching plays a central role in solving correspondence problems in computer

6.1 Match Result

6.2 Matching Process

6.3 Problems matching and similarity measure

6.4 Syntactic, Structural and Semantic Matching Techniques

Fig. 4. Matching process [16].

We studied the existing approaches of the parameterized distance measurement

1. Cowie, Bronwen, B. Bell.: A model of formative assessment in science

You might also like