
 

Automated Cheating Detection in Exams using Posture and Emotion Analysis

Mr. Nishchal J
CSE, R.V. College of Engineering
Bengaluru, India
nishchalj.cs18@rvce.edu.in

Ms. Sanjana Reddy
CSE, R.V. College of Engineering
Bengaluru, India
sanjanasr.cs18@rvce.edu.in

Ms. Navya Priya N
CSE, R.V. College of Engineering
Bengaluru, India
navyapriyan.cs18@rvce.edu.in

Abstract- Cheating in exams has become a serious issue these days. Exams play an important role in every student’s life, and cheating in them is a widespread phenomenon all over the world regardless of the level of development. Manual cheating detection methods may no longer be wholly successful in preventing cheating during examinations. There is a need to make this process automated and efficient. The suggested method automates the process of invigilation and cheating detection, so that there is no need to rely on manual methods.

Keywords- Posture detection, face recognition, emotion analysis, cheating activity detection.

I. INTRODUCTION

Cheating in exams has become a widespread phenomenon in the world regardless of the level of development. Many studies have been conducted over the past decade on cheating activities performed by students and the means by which universities could attempt to combat this problem. In the U.S., it was revealed that 95% of secondary school students who admitted cheating said that they had not been caught, and 51% of secondary school students did not believe cheating was wrong.

There are three main reasons why students cheat in exams: being afraid of failure, wanting to take risks and having no fear of getting caught. While the first factor is purely under the student’s control, the other factors can be controlled.[1] This mainly happens due to poor invigilation or too few invigilators. Hence manual invigilation has many demerits, and automating it is intended to make the system foolproof and reliable.

The aim of this model is to detect abnormal or cheating activities in an exam in an automated way. In our model, this is done by detecting the body posture of the student during the examination using the CCTV footage of the classroom. Actions like turning back, bending etc. are detected and considered possible cases of cheating. If the number of such cases detected exceeds a predefined threshold, it is determined as cheating and a report is sent to the examiners. The situation can be reviewed again by the examiner to make a final decision. Faces are registered to a database using a dataset creator which is implemented using OpenCV. The student who is caught cheating is recognized using facial recognition, and a report about their activities along with a timestamp is sent to the examiners, following which action can be taken after reviewing the report.

To the best of our knowledge, this kind of system has not been implemented before. Conducting an exam without the presence of an invigilator has so far been considered unthinkable, yet it is needed to make examinations strict and efficient. The time and energy of the invigilators can then be used productively elsewhere.

II. LITERATURE SURVEY

There are models which classify abnormal activities in examinations using Multi-class Markov Chain Latent Dirichlet Allocation. These models are used as feature detectors to detect arm joints, shoulders and the position of the head. This model classifies the activities into 5 possible classes and uses a supervised dynamic and hierarchical Bayesian model for the same [2]. Many works propose methods to detect abnormal behavior from videos. The main task in detecting abnormal activity is to establish what abnormality is. The method proposed to detect abnormal activity is to detect normal activities and define the rest as abnormal. A supervised model is used to classify the activities.[3]

Gesture recognition is used to generate a textual description of the cheating activity. The gesture

978-1-7281-6828-9/20/$31.00 ©2020 IEEE 

Authorized licensed use limited to: University of Liverpool. Downloaded on September 17,2020 at 04:50:42 UTC from IEEE Xplore. Restrictions apply.
 

recognition model works with 3DCNN and XGBoost, and a language generation model is based on an LSTM network. The textual description is used to keep a record of all such activities[4]. There are works which detect cheating activity in an exam through the whispering of the cheater, using features such as energy, root mean square, time duration and spectral features. An alarm rings whenever the whispering level crosses a threshold which is calculated using Z-score.[5]

Work has been done to detect cheating in online exams as well. These works mainly rely on a webcam, wherein a test taker’s behavior is analysed by checking the time delay and the variation of the student’s head pose relative to the computer screen[6]. Usage of RGB cameras followed by analysis and classification by algorithms such as Support Vector Machine, depth sensors and wearable devices are common for activity recognition. It is observed that the use of a Kinect sensor or depth sensor in a human activity recognition system gives accurate results.[7]

The amount of work done to detect cheating activity in offline exams is limited. This model attempts to detect cheating activity in an exam by incorporating both posture and emotion analysis, which to our knowledge has not been attempted before. Nonverbal communication provides a significant amount of information about the posture of a human. Body posture significantly affects the emotions of a human, which should also be a factor to consider while designing models involving detection of abnormal activities. This idea is incorporated in this model.

Inclusion of a facial recognition system makes it an appropriate model to be used in public exams where a large database of participating students is readily available. This model also provides an alert system and leaves room for the authority to modify the generated report in case of false alerts. Incorporating this model is recommended to make the process of exam invigilation automatic.

III. METHODOLOGY

The model consists of 5 parts:

A. Posture detection using OpenPose
B. ALEXNET model to detect type of cheating
C. Emotion Analysis
D. Face Recognition
E. Report Generation

A. Posture detection using OpenPose

The first stage is posture detection, for which we use the OpenPose model. OpenPose can detect the key points of the human body. The coordinates of the key points can be used to estimate the sitting posture of the person. 0 to 18 are the key points detected by the model as seen in Fig 1 [8].

i. Dataset

For this the presence of a CCTV camera is required, since the footage from the camera is passed to the model. The camera must be placed such that the lateral view of the student is captured. The video captured from the CCTV is sent to the model as input directly in real-time for analysis.

Fig 1. Key points detected by OpenPose

ii. Model Implementation and Training

Fig 2. Network architecture of OpenPose model

The VGG Block consists of 5 blocks as seen in Fig 2. In Stage 1 a 2-branch multistage CNN is used. The first branch predicts a set of 2D Confidence Maps of body part locations. The second branch predicts a set of 2D vector fields of Part Affinities (PAF), which encode the degree of association between parts.[10]
Stage 0 creates feature maps for the input image and has two branches. Each branch has 5 convolution layers. The branches have 128 and 19 outputs respectively. Next, the stage T block, iterated for 6 stages, has two branches.

 

Each branch consists of 7 convolution layers. The branches have 38 and 19 outputs respectively.[9]

iii. Results And Analysis

Fig 3. a) Possible case of cheating where the student is turning back b) Output of the OpenPose model.

The factors considered in our model to detect possible cases of cheating, given that the image is taken in lateral view, are: [10][11]

1. When a person turns back, both their eyes are captured by the camera as shown in Fig 3. The output is shown in Fig 4.
2. If a person is not turning back and is sitting straight, the image captured in the lateral view will contain only one eye as shown in Fig 5. The output is shown in Fig 6.
3. In a lateral view either the right ear or left ear is captured. The angle between either ear and the hip is calculated. If that angle is less than 70 degrees, the person is sitting with a hunchback. If it is greater than 110 degrees, the person is reclined. If the ears are not visible, the person is not in a lateral view.[11]

When any of the abnormal cases above is detected, the model terms it a possible case of cheating.
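
The third rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the referenced repository [11]: the key points are taken to be (x, y) pixel coordinates with y growing downward, the student is assumed to face the positive x direction, and the angle is assumed to be that of the hip-to-ear line measured from the horizontal, so that an upright torso gives roughly 90 degrees. The function names `ear_hip_angle` and `classify_sitting_posture` are hypothetical.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels; y grows downward


def ear_hip_angle(ear: Point, hip: Point) -> float:
    """Angle in degrees of the hip-to-ear line, measured from the horizontal.

    Assumption: the student faces the positive x direction, so leaning
    forward gives angles below 90 and leaning back gives angles above 90.
    """
    dx = ear[0] - hip[0]
    dy = hip[1] - ear[1]  # flip the image y axis so that "up" is positive
    return math.degrees(math.atan2(dy, dx))


def classify_sitting_posture(ear: Optional[Point], hip: Optional[Point]) -> str:
    """Apply the 70 and 110 degree thresholds stated in the text."""
    # The text only covers a missing ear; a missing hip is treated the
    # same way here, which is an assumption of this sketch.
    if ear is None or hip is None:
        return "not lateral view"
    angle = ear_hip_angle(ear, hip)
    if angle < 70:
        return "hunchback"
    if angle > 110:
        return "reclined"
    return "straight"
```

Under these assumptions, an ear directly above the hip is classified as straight, while an ear well ahead of the hip falls below 70 degrees and is reported as a hunchback.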

Fig 4. Output of the posture detector, which shows that both the eyes are seen; hence the student is not sitting straight, which is a possible case of cheating.

B. ALEXNET model to detect type of cheating

Human activity recognition and analysis is an active research area in video perception and analysis. Cheating activity can be detected if a student’s posture seems to be abnormal or not regular for a long period of time, or if this abnormality is detected frequently over a time frame. Hence, student posture analysis and detection is of prime importance. In the proposed technique, video is captured continuously from the CCTV monitoring the students writing the examination. When there is a signal from the key point detection model, this model does a frame-by-frame analysis of the CCTV footage, so as to clearly extract the type of activity the student is engaged in. [7]
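
The decision rule described above, in which a posture counts as suspicious when it stays abnormal for a long stretch or recurs often within a time frame, could be sketched as follows. The per-frame flags are assumed to come from the key point detection stage. The function name `find_alert_frames` and the parameters `window`, `max_flags_in_window` and `max_run` are hypothetical, since the paper does not state the actual threshold values.

```python
from collections import deque
from typing import Iterable, List


def find_alert_frames(
    frame_flags: Iterable[bool],
    window: int = 30,              # hypothetical: frames per time frame
    max_flags_in_window: int = 10, # hypothetical: allowed abnormal frames per window
    max_run: int = 15,             # hypothetical: allowed consecutive abnormal frames
) -> List[int]:
    """Return the indices of frames at which an alert would be raised.

    frame_flags holds one boolean per video frame: True when the posture
    detector marked that frame as abnormal.
    """
    recent = deque(maxlen=window)  # sliding window over the latest flags
    run = 0                        # length of the current abnormal streak
    alerts = []
    for index, abnormal in enumerate(frame_flags):
        recent.append(abnormal)
        run = run + 1 if abnormal else 0
        too_long = run > max_run
        too_frequent = sum(recent) > max_flags_in_window
        if too_long or too_frequent:
            alerts.append(index)
    return alerts
```

Only the frames returned by such a rule would then need the frame-by-frame analysis, which keeps a single odd frame from triggering a report.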

Fig 5. a) Student sitting straight in an exam b) Output of the OpenPose model.

Fig 6. Output of the posture detector, which shows that only one eye is seen in the lateral view.

i. Dataset

The 4 main target classes here are Bending back, Stretching the arms behind, Bending down and Facing the camera, as shown in Fig 7. A ground truth dataset was created in which the team members of this project sat in different positions and were recorded on video. The videos were later converted into individual frames. About 1000 images for every class were filtered and curated. The images were resized to 224*224 pixels, the colour jitter (brightness, hue, saturation and contrast) was set to 0.1% and the images were normalised.

 

Fig 7. Sample Dataset.

ii. Model Implementation and Training

The model was built using a transfer learning approach. The AlexNet model, pre-trained on the ImageNet dataset of 1000 classes, was trained further on the posture detection data. AlexNet was developed by Alex Krizhevsky and was the winning entry in ILSVRC 2012. It is a lightweight neural network having 5 convolutional layers and 3 fully connected layers as shown in Fig 8. The first two convolutional layers use 11*11 and 5*5 filters with strides of 4 and 1 units respectively. An overlapping max pooling layer follows each of the two convolution layers. These max pool layers are similar to ordinary max pool layers, except that the adjacent windows over which the max is computed overlap each other.

Fig 8. Architecture of ALEXNET Model

AlexNet uses a pooling window of 3*3 with a stride of 2. The next two layers are densely connected with another overlapping max pool layer. The last two densely connected layers feed into softmax for classification labels. ReLU nonlinearity is applied to the output of every convolution layer. In the cheat detection model the last softmax layer is altered such that the number of output classes is reduced to 4, as shown in Fig 7.[8]

The training was done with a PyTorch backend on the Google Colab platform, which provided free GPU usage for research. The training was done for 20 epochs, after which an accuracy of 0.96 was reached. The high accuracy was achieved mainly because of using a pre-trained model which was already trained on a huge dataset called ImageNet for about 1000 classes.

iii. Testing and Results

An accuracy of 0.96 was maintained on the test dataset as well. The graph between the number of epochs and accuracy is shown in Fig 9. Thus, this model was well trained for detecting the posture of the student and sending it for further analysis to the facial recognition model.

Fig 9. Accuracy on the validation set during training

C. Emotion Analysis

Facial expressions, like body language, play a very important role in human interactions. Hence, by using a sentiment analyzer in our code, we can systematically assess whether the student actually intends to cheat in the exam. Once the type of cheating is detected by the previous model, the same images, i.e., the set of frames obtained from the video, are sent to the sentiment analyzer model, which detects the emotion on the face of the student. As many frames are sent, a number of emotions can be detected (one for each frame). Hence, the emotion which has the maximum frequency in that particular period of time is considered for further evaluation. In order to be more accurate and not falsely accuse a person of malpractice, the second most frequent emotion is also considered.[16]

i. Dataset

The datasets considered here are the Facial Expression Recognition Challenge (FERC-2013), Extended Cohn-Kanade (CK+), and Radboud Faces Database (RaFD). These datasets mainly differ in quality, quantity and ‘cleanness’ of the images. The FERC-2013 set has around 32000 low resolution images, whereas RaFD provides 8000 high resolution images. Also, facial expressions in CK+ and RaFD are clean, but the FERC-2013 set shows emotions ‘in the wild’. Considering three networks, training is done

 

using 9000 samples from FERC-2013 along with 1000 new samples for validation as shown in Fig 10. Testing is done with 1000 images from the RaFD set to get an indication of performance on clean high quality data. In order to improve the accuracy, further training was done with 20,000 images from the FERC-2013 set. Newly composed validation (2000 images) and test sets (1000 images) from the FERC-2013 dataset are used as well, together with the well-balanced RaFD test set.[17]

Fig 10. Number of images per emotion in the final training set.

The emotions taken into consideration here are: happy, neutral, surprised, sad, fearful, angry and disgust. When a student is mostly fearful, it is likely that the student is cheating.[17]

ii. Model Implementation and Training

The network consists of three convolutional layers and two fully connected layers, combined with max pooling layers to reduce the image size and a dropout layer so that overfitting does not occur. The three convolutional layers are referred to as network A, network B and network C.
All data is preprocessed using the Haar Feature-Based Cascade Classifier inside the OpenCV framework. For every frame obtained from the previous model, only the square part containing the face is taken, which is rescaled and then converted to an array of 48*48 grey-scale values. This 48*48 input is sent to the first convolutional layer A, which has a 5*5 filter and gives 64 outputs, reducing the image to 44*44. After applying max pooling with a stride of 2, 22*22 images are obtained. These are sent to the next layer B, which also has a 5*5 filter and gives 64 outputs of size 18*18. These are sent to the third layer C, where a 4*4 filter is used which gives 128 outputs of size 15*15. A final softmax layer gives 7 outputs for multiclass classification. The network gives softmax scores for all the seven classes of emotions. The emotion with the maximum score is considered to be the overall emotion[17].

On the whole, after evaluating it on the test dataset, an accuracy of 63% was achieved.

D. Face Recognition

Once a possible case of cheating is detected, the person caught cheating has to be recognised and then reported to the respective authorities. This model has three parts: Dataset Creator, Trainer and Detector.

i. Dataset

The dataset creator is needed to create a database of all the students in an institute. All the faces detected are compared with all the images in the database. SQLite is used as the relational database management system; it does not require a separate server process to operate. The student id, name, age and gender are stored in the database. The image is converted to grayscale. The face detected in the image can be inserted in the database as a new entry when a face is recognised which is not present in the database, or it can be used to update existing data. This part is mainly used when an institute wants to register students in the database.[14]

ii. Model Implementation and Training

The detectMultiScale method of the Cascade classifier is used to detect a face in an image. The trainer uses Local Binary Patterns Histograms (LBPH) as the face recognition algorithm. We use OpenCV's LBPH Face Recognizer to train on the dataset.[15] The new image is passed through the detector to detect the face and fetch the details of the student.

E. Report generation

The details of the student caught cheating are sent via mail to the respective authorities. The CCTV footage can be reviewed again if needed. We use Python’s standard library module “smtplib” to send emails. It creates a Simple Mail Transfer Protocol client session object which is used to send emails to any valid email id on the internet.

IV. LIMITATIONS

One of the requirements of this model is that the CCTV camera is to be placed in a lateral position. This is one of the main limitations of the model. This model fails to detect cheating activity among multiple people in the camera feed. The model requires the input to be in lateral view only. The front view of the person doesn’t

 

give the desired output. When there are multiple people in a row, the lateral view captures only one person. The camera must be placed at a sufficient height so that all the people can be detected. When this is put into practice, the angle thresholds through which this model classifies the sitting posture will need slight changes.

V. RESULTS

This paper follows a modular approach to build a Cheat Detection Model. The Posture Recognition model using the OpenPose model could successfully locate 18 different points on a human body. A small adaptation to the use case was made such that it could successfully predict if a person is turning back or sitting straight from the position of eyes and arms. The Cheating posture detection model trained on the AlexNet architecture could predict the type of cheating committed (Bending down, Arms extending back, Fully turning back) with a high accuracy of 96% on test data. The Emotion Analysis model was used to analyse the emotion of the student during the abnormal activity with an accuracy of 63% on test data.

Combining the Cheating posture analysis model and the Emotional analysis model by taking the geometric mean of their accuracies (the square root of their product), we obtain the net possible accuracy of our proposed model, i.e.

√(96 × 63) ≈ 77.77 ≈ 77.8%

Thus, by combining the results of all the models, a prediction is made on the cheating activity, after which Facial recognition is used to recognise the student’s identity. The Report generation model is used to record all the observations made by the above models, and the same is sent to the authority's email for further action.

VI. CONCLUSION

Cheating in exams continues to take place in the world regardless of significant developments in technology. Though online exam monitoring is automated to some extent, offline exams still stick to the traditional way of manual invigilation. Thus, this paper has presented and put into action a state-of-the-art approach to automate manual invigilation. The four models developed here have all contributed to the successful working of the Cheat Detection System with a decent accuracy, thus moving towards a new advancement in the educational system.

VII. REFERENCES

1. Razan Bawarith, Abdullah Basuhail, Anas Fattouh and Shehab GamalelDin, "E-exam cheating detection system," King AbdulAziz University, Saudi Arabia, International Journal of Advanced Computer Science and Applications (2017).
2. Janson Hendryli and Mohamad Ivan Fanany, "Classifying Abnormal Activities in Exam Using Multi-class Markov Chain LDA Based on MODEC Features," Faculty of Computer Science, Universitas Indonesia.
3. N. M. Nayak, "Modeling and recognition of complex human activities," in Visual Analysis of Humans: Looking at People, T. Moeslund, A. Hilton, V. Krüger, & L. Sigal, Springer (2011) 289-309.
4. Arinaldi, A., & Fanany, M. I. "Cheating video description based on sequences of gestures." 2017 5th International Conference on Information and Communication Technology (2017).
5. Asadullah, M., & Nisar, S. "An automated technique for cheating detection." 2016 Sixth International Conference on Innovative Computing Technology (2016).
6. Chia Yuan Chuang, Scotty D. Craig & John Femiani, "Detecting probable cheating during online assessments based on time delay and head pose" (2017) 1123-1137.
7. Ann, O. C., & Theng, L. B. "Human activity recognition: A review." IEEE International Conference on Control System, Computing and Engineering (2014).
8. Shih-En Wei, Varun Ramakrishna, Takeo Kanade and Yaser Sheikh, "Convolutional Pose Machines," The Robotics Institute, Carnegie Mellon University, arXiv:1602.00134v4 (2016).
9. Zhe Cao, Tomas Simon, Shih-En Wei and Yaser Sheikh, "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields," The Robotics Institute, Carnegie Mellon University, arXiv:1611.08050v2 (2017).
10. Gines Hidalgo, Yaadhav Raaj, Haroon Idrees, Donglai Xiang, Hanbyul Joo, Tomas Simon and Yaser Sheikh, "Single-Network Whole-Body Pose Estimation," Carnegie Mellon University, arXiv:1909.13423v1 (2019).
11. Github repository: https://github.com/nvinayvarma189/Sitting-Posture-Recognition (posture detection using OpenPose)
12. Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," University of Toronto.
13. Manu Bali and Devendran V, "Human Body Posture Recognition Using Artificial Neural Networks," International Journal of Innovative Technology and Exploring Engineering (2019), 2278-3075.
14. Github repository: https://github.com/ChiragSaini/Simple-Facial-recognition-with-Database (face recognition model using LBPH algorithm)
15. Zhao, X., & Wei, C. "A real-time face recognition system based on the improved LBPH algorithm." IEEE 2nd International Conference on Signal and Image Processing (ICSIP) (2017).
16. Github repository: https://github.com/atulapra/Emotion-detection (Emotion detection model)
17. Enrique Correa, Arnoud Jonker, Michael Ozo and Rob Stolk, "Emotion Recognition using Deep Convolutional Neural Networks" (2016).
