2016 Iceit

2nd International Conference on Electrical and Information Technologies ICEIT’2016
Towards an automatic assessment system in introductory programming courses
Soundous Zougari, Mariam Tanana, Abdelouahid Lyhyaoui
Lab. LTI ENSA of Tangier, University Abdelmalek Essaadi, Tetouan, Morocco
978-1-4673-8469-8/16/$31.00 ©2016 IEEE

Abstract: Automatic assessment of programming assignments has become an important topic in academic research. The increasing number of students enrolled in programming courses has created the need for a system that provides immediate feedback to learners and saves teachers from manually managing all the students' solutions. This paper presents an attempt to assess programming exercises both dynamically and statically in order to ensure a reliable and objective evaluation. While the dynamic analysis is based on the xUnit framework, the static analysis is performed on the graph representation of the programs.

Keywords: programming assessment; computer aided assessment; graph representation

I. INTRODUCTION
Research on Technology Enhanced Learning (TEL) forms a transdisciplinary scientific field that investigates how to develop computing tools that support, improve or even increase learning, whether face-to-face or remote [1]. The issues addressed include, for example, design, implementation and learner assessment in TEL. In this context, this work presents a reliable and objective method of assessing learners' productions that will not only reduce the workload of teachers but also provide useful feedback to students throughout their learning process. As a practical domain, we opted for introductory programming courses for several reasons. Besides the fact that these courses are at the core of any engineer's training, this is a domain where assessment is highly complex, mainly because a given problem admits a multitude of solutions.

This contribution is organized as follows: Section 2 presents the necessary background information. Section 3 gives an overview of the main methods adopted by program analysis systems. Section 4 describes the proposed approach, with a special emphasis on the transformation of programs into intermediate graph structures. Section 5 concludes the paper and outlines directions for future work.

II. THE RESEARCH STATEMENT
To learn and master a new programming language, students need to solve a large number of exercises in order to practice the syntax and semantics of the language. Teachers' comments and feedback on the mistakes students make are crucial to improving their knowledge. However, it is difficult for teachers to manage all students' solutions manually. Indeed, correcting programming exercises by hand can involve a lot of work and consume much time. It is often a complex and daunting task, since each program must be tested and its source code analyzed. In addition, the correction process is prone to errors or omissions due to fatigue and the repetitive nature of the task [2].

Furthermore, the advantages of automatic assessment are especially appreciated in the context of e-learning [3]. Several universities worldwide offer numerous online courses, and the number of students enrolled in these courses is on the order of thousands. In online courses, teaching is carried out via the computer, with minimal or no contact with the teacher. Therefore, fast and reliable automated assessments are particularly desirable. These reasons, among many others, have motivated many researchers to automate the process of assessing learners' productions. The first reference comes from Hollingsworth, who published on the subject in 1960 [4]. The idea caught on quickly and several assessment systems have been developed [5]. Unfortunately, these systems are neither generic nor configurable, and most of them are not available to the general public. That is why we seek to develop our own assessment system.

In what follows, we focus on the two approaches mainly adopted by these systems, namely dynamic and static analysis.

III. PROGRAM ANALYSIS METHODS
The validation of computer programs is a crucial part of their development cycle. Two verification and validation techniques have stood out in recent years: dynamic analysis and static analysis. The main difference between the two is that dynamic analysis requires executing the program to check its correctness, whereas static analysis examines a program without executing it. This section presents both.

A. Dynamic Analysis
Dynamic analysis involves running the code to verify its correctness. This is achieved by using varied test cases that allow maximum path coverage of the program. It is intended to detect errors by comparing the results obtained with those expected by the specification. This method is adopted by most automatic programming assessment systems, such as Ceilidh [6], TRY [7], BAGS [8] and Kassandra [9]. There are two different approaches within dynamic analysis:
• Black-box testing: testing is carried out without knowledge of the internal structure and the objects handled by the system. We are only interested in the
inputs and outputs of the system. Functional testing or manual testing, for example, belongs to this category.
• Grey-box testing: consists in validating the results of each function in the program and generating the final score from these results.

1) Advantages of the dynamic analysis:
• Dynamic analysis is easy to implement, and the tester can be non-technical.
• It makes it possible to evaluate the performance of any program in terms of the generated results compared with the expected outputs of the test case.
• It allows the analysis of applications for which we do not have access to the actual code.

2) Disadvantages of the dynamic analysis:
• Risks related to the execution of the source code, e.g. a buffer overflow that can cause a sudden breakdown of a server and put the data at risk.
• The major drawback of this method is that if the program does not compile, it cannot be assessed.
• The generated feedback is limited to the expected outputs of the test case.
• It cannot check whether the program conforms to the requirements defined by the instructor.

B. Static Analysis
Static program analysis is a family of techniques that collect information about a program without running it, and therefore eliminate the risks associated with its execution. Tools that rely on static analysis include lint [10] and AutoLEP [11]. We can distinguish various methods within the static analysis approach:
• Style analysis: analyses program readability (meaningful variable names, the presence of comments, indentation, the scope of variables, etc.) so that the program can be understood by other users. Among the systems using program style analysis we find Style++ for C++ [12].
• Error detection: the errors detected through this method are not reported at compile time but cause problems at run time (division by zero, infinite loops, etc.). These bugs may lead to a blockage or a malfunction of the system.
• Metric analysis: measures program properties to test its complexity and reliability (average size of instructions, frequency of comments, number of instructions in a function, number of classes, number of functions, etc.).
• Keyword analysis: examines the source code in order to determine the presence of the keywords required by the evaluator, for example in a program where the evaluator requires the use of the switch structure although it could also be written using a series of if statements. This method makes it possible to check whether the program meets the requirements set by the evaluator.
• Structural analysis: measures the degree of similarity by comparing the assessed program to the solutions expected by the teacher or expert. As a static analysis method, it consists in measuring the degree of equivalence between programs through their graph representation.

1) Advantages of the static analysis:
• It can find weaknesses in the code at their exact location.
• It takes into account all possible execution paths.
• It can analyze the program even if the code contains errors, as opposed to dynamic analysis.

2) Disadvantages of the static analysis:
• Structural analysis is limited by the fact that the same problem admits a variety of solutions.
• It is intricate and difficult to apply to complex programs.
• Automated tools can produce false positives and false negatives.

IV. OUR PROPOSITION
We can deduce that the strengths and weaknesses of the two approaches are complementary. We therefore propose an original combination of these two techniques. In this combination, the dynamic analysis reports errors at runtime, then the static analysis evaluates the structural properties of the programs.

To perform dynamic analysis, we suggest the use of a dedicated framework to automate and conduct tests in a given language. This not only separates the test code from the code under test, which makes the tests easier to reuse, but also allows the tests to be run without manual intervention or human interpretation. Finally, the analysis of the results obtained can be automated, since each test result has a status, generally ok or error [13]. This is addressed in Section IV-A.

On the other hand, to evaluate the structural properties of the programs (static analysis), we measure the degree of similarity by comparing the assessed program to the programs belonging to the solution space provided by the teacher or expert. A solution space is a set of paths representing the different possible approaches to the same exercise. It can contain correct solutions as well as incorrect ones. It is built by an expert and contains the approaches deemed pedagogically interesting. This method requires a graph representation of the compared programs. More details are given in Section IV-B.

A. Dynamic Analysis with xUnit
Our aim being to remain as generic as possible, we chose xUnit, a family of similar unit-testing frameworks. The JUnit
framework was the first to be widely known, but other development platforms and programming languages followed, including NUnit (.NET), DUnit (Delphi) and CppUnit (C++). In what follows, we explain some of the basic features and architecture components that JUnit shares with the other frameworks of the xUnit family.

JUnit is an open source framework for the development and execution of automated unit tests, created by Kent Beck and Erich Gamma [14], [15]. Its main interest is to ensure that the code still meets the requirements even after modifications. JUnit offers:
• A framework for the development of unit tests based on assertions that check the expected results.
• Applications that run the tests and display the results.

The tests to automate are expressed in classes, in the form of test cases with their expected outcomes. JUnit runs these tests and compares their outcomes with the expected results. In this way, the code of a class is separated from the code that tests it. A common shortcut for testing a class is to write a main() method that contains the tests. The downside is that this superfluous code is then included in the class and, moreover, must be executed manually.

Writing test cases can have an immediate effect on detecting bugs, but above all it has a long-term effect: it facilitates the detection of side effects during modifications. JUnit test cases are Java classes that contain one or more test methods and that are grouped into test suites, as shown in Fig. 1.

Fig. 1. JUnit Architecture.

B. Static analysis
In order to analyse computer programs statically, it is useful to choose a representation that is accurate but still abstract with respect to the programming language. There are different forms to represent and visualize a program:

1) Syntax Trees: an Abstract Syntax Tree or AST [16] is a tree whose internal nodes are labeled by operators and whose leaves are the operands. In other words, a leaf is generally a variable or a constant. This representation is commonly used in the search for similarity between different programs or within the same program. After creating the AST, we look for similar subtrees within the generated tree using dedicated algorithms, and finally return the matched trees.

2) Graphs: graphs are another way of representing programs. Graphs display the program structure so that it can be analysed and compared with other programs [17]. There are several types of graphs:

a) The Control Flow Graph (CFG) is a directed graph where each node represents a basic block, i.e. a straight-line piece of code without any jumps or jump targets; jump targets start a block, and jumps end a block. Directed edges represent jumps in the control flow. The CFG highlights loops, conditional statements and branches. A path in this graph represents one possible execution scenario of the program. The program shown in Fig. 2 is used as an example for all the types of graphs. Fig. 3 displays the corresponding CFG.

    #include <stdio.h>

    int main(void) {
        int x = 0;
        int y = 1;
        while (y < 10) {
            y = y * 2;
            x = x + 1;
        }
        printf("%d", x);
        printf("%d", y);
    }

Fig. 2. Program example.

Fig. 3. Control Flow Graph.

b) The Control Dependence Graph (CDG) shows which instructions will be executed based on the value of an expression in the program. The nodes of the graph are the same as those of the control flow graph. An edge goes from node p to node q if the value of the expression at p has an impact on the execution of the instruction at q. See the corresponding CDG for our program in Fig. 4. In this example, the loop on the expression y < 10 is present because the expression must be re-evaluated each time it is true, in other words until it becomes false.

c) The Data Dependence Graph (DDG) is a directed graph representing the data dependencies of several objects on each other. It contains the same set of nodes as the previous graphs. The edges incident to a node represent data values on which the node's operations might depend. In fact,
an edge exists between two nodes p and q in a program if p defines a variable and q uses it. Fig. 5 shows the DDG for the above-mentioned program.

Fig. 4. Control Dependence Graph.

Fig. 5. Data Dependence Graph.

We can deduce that each graph-based method of program representation has its own distinctive features. Since our intention is to assess students' productions in introductory programming courses, we choose to represent programs with control flow graphs. Indeed, in the control flow approach, the focus is on the sequencing of operations in a process. Control flow graphs are used as models to describe the structure of computer programs. They are used both for static analysis [18] and as a model for program coverage. Therefore, the CFG is a suitable representation for structural comparison. However, we do not definitively exclude the other types of graphs, because they may be of interest for future modifications of our system.

V. CONCLUSION
In this paper, we propose a new assessment system for students' productions, aimed at introductory programming courses. The system is based on merging information from two different evaluation methods, dynamic and static analysis. The first is carried out using the xUnit framework, which provides features that not only ease the dynamic analysis process but also make it flexible and reusable.

The second analysis evaluates how close a student program is to the expert solutions. This process consists of two parts: the transformation of the student program into an intermediate representation, and the examination of similarity measurements. The intermediate representation in our approach is based on Control Flow Graphs (CFG), and the similarity measurements will be our concern in the next stage of our study. We will focus on finding the most accurate similarity measure technique for our domain of interest, in order to get a reliable similarity percentage between the students' answers and the instructor's solutions.

As soon as the new system is developed, we intend to test it with real users. We plan to design and implement an experiment in real learning environments to assess the usability and performance of the proposed system. This experiment will also allow us to evaluate its weaknesses and therefore improve it.

REFERENCES
[1] P. Tchounikine, "Précis de recherche en ingénierie de EIAH", June 2009.
[2] S. Higgins, E. Hall, V. Baumfield, D. Moseley (2005). A meta-analysis of the impact of the implementation of thinking skills approaches on pupils. In: Research Evidence in Education Library. London: EPPICentre, Social Science Research Unit, Institute of Education, University of London.
[3] I. E. Allen and J. Seaman (2010). Class Differences: Online Education in the United States, 2010. Newburyport, MA: Babson Survey Research Group and The Sloan Consortium Green.
[4] J. Hollingsworth. Automatic graders for programming classes. Communications of the ACM, 3:528–529, October 1960.
[5] C. Douce, D. Livingstone, and J. Orwell. Automatic test-based assessment of programming: A review. Journal on Educational Resources in Computing (JERIC) 5, 3 (2005), 4.
[6] S. Benford, E. Burke, E. Foxley, N. Gutteridge, and A. M. Zin. Experiences with the Ceilidh system. In Proceedings of the International Conference in Computer Based Learning in Science, Vienna, 1993.
[7] Kenneth A. Reek. The TRY system -or- how to avoid testing student programs. In Proceedings of the twentieth SIGCSE technical symposium on Computer science education, SIGCSE '89, pages 112–116, New York, NY, USA, 1989. ACM.
[8] J. B. Hext and J. W. Winings. An automatic grading scheme for simple programming exercises. Commun. ACM, 12(5):272–275, May 1969.
[9] Urs Von Matt. Kassandra: the automatic grading system. SIGCUE Outlook, 22(1):26–40, January 1994.
[10] Ian F. Darwin, "Checking C Programs with Lint", O'Reilly Media, October 1988.
[11] Wang Tiantian, Su Xiaohong, Ma Peijun, Wang Yuying, Wang Kuanquan, "AutoLEP: An Automated Learning and Examination System for Programming and its Application in Programming Course," in Education Technology and Computer Science, 2009. ETCS '09. First International Workshop on, vol. 1, pp. 43-46, 7-8 March 2009.
[12] K. Ala-Mutka, T. Uimonen and H.-M. Järvinen (2004). Supporting students in C++ programming courses with automatic program style assessment. Journal of Information Technology Education, 3, 245–262.
[13] HTTP://WWW.JMDOUDOUX.FR/JAVA/DEJ/CHAP-FRAMEWORKS-TEST.HTM
[14] https://fr.wikipedia.org/wiki/JUnit
[15] Petar Tahchiev, Felipe Leme, Vincent Massol, Gary Gregory, JUnit in Action, Second Edition, 2011.
[16] Ronald Cohn and Jesse Russell, Abstract Syntax Tree, Jan. 2012.
[17] Vincent Mathieu. Outils d'analyse statique. 2001.
[18] S. Rao Kosaraju. Analysis of structured programs. J. Comput. Syst. Sci., 9(3):232-255, 1974.
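
As an illustration of the control flow graph defined in Section IV-B, the following sketch encodes the program of Fig. 2 as a small adjacency map in Python and recovers the two features that a CFG is said to highlight: conditional branches and loops. The block labels B0 to B3, the partition into basic blocks, and the helper names branch_nodes and back_edges are assumptions made for this illustration only; they are not taken from Fig. 3 or from the proposed system.

```python
# Basic blocks of the Fig. 2 program. The labels B0..B3 are hypothetical.
BLOCKS = {
    "B0": ["x = 0", "y = 1"],
    "B1": ["while (y < 10)"],                    # loop condition
    "B2": ["y = y * 2", "x = x + 1"],            # loop body
    "B3": ['printf("%d", x)', 'printf("%d", y)'],
}

# Directed edges of the CFG: block -> list of successor blocks.
CFG = {
    "B0": ["B1"],
    "B1": ["B2", "B3"],  # condition true -> body, condition false -> exit
    "B2": ["B1"],        # jump back to the loop condition
    "B3": [],
}


def branch_nodes(cfg):
    """Return the blocks with more than one successor (conditional jumps)."""
    return [node for node, succs in cfg.items() if len(succs) > 1]


def back_edges(cfg, entry):
    """Return edges that target a block still on the DFS stack.

    Each such edge closes a cycle in the graph, i.e. a loop in the program.
    """
    found, on_stack, visited = [], set(), set()

    def visit(node):
        visited.add(node)
        on_stack.add(node)
        for succ in cfg[node]:
            if succ in on_stack:
                found.append((node, succ))
            elif succ not in visited:
                visit(succ)
        on_stack.discard(node)

    visit(entry)
    return found


print(branch_nodes(CFG))      # ['B1']
print(back_edges(CFG, "B0"))  # [('B2', 'B1')]
```

In this encoding, the single branch node and the single back edge correspond to the while loop of Fig. 2. A structural comparison between a student program and an expert solution could start from features of this kind, although the choice of similarity measure is left open in this paper.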
