
2024 3rd International Conference on Control, Instrumentation, Energy & Communication (CIEC)
IEEE Conference Record Number 59440
Jointly organized by: Department of Applied Physics, University of Calcutta, India and IEEE Joint Control Systems Society and Instrumentation & Measurement Society (CSS-IMS), Kolkata Chapter, India

A NONLINEAR PROJECTION BASED HUMAN EMOTION RECOGNITION APPROACH EMPLOYING FACE POINT DATA

Authors: Arka Bairagi*, Bed Prakash Das*, Anirban Dey*, Kaushik Das Sharma*
(*Department of Applied Physics, University of Calcutta, Kolkata, India)

Date: 25-01-2024
Slide Overview

 Literature Review
 Motivation
 Materials and Methods
   Human Emotion Dataset Creation
   Data Preprocessing
   Nonlinear Dimensionality Reduction
   Classification of Data – Classifiers
 Performance Evaluation
   Performance of PSO
   Performance Parameters of Classifiers
 Conclusion & Future Work
Literature Review

| Author | Techniques used | Demerits |
|---|---|---|
| H. Soyel et al. [1] | Probabilistic neural network (PNN) classifiers | Low recognition rate; confusion in recognizing sadness |
| Sujono et al. [2] | Active Appearance Model (AAM), fuzzy logic | Information on the level of muscle activation cannot be retrieved; three-class classification; compatible with a single person only |
| Q. Mao et al. [3] | Support vector machine (SVM) | Low recognition rate |
| Z. Zhang et al. [4] | Decision Tree, SVM with radial basis function (RBF), Random Forest, Random Tree, BayesNet | Three-class classification; the points should be selected on the eyes, eyebrows and lips, which could be more convincing for recognition |
| B. Adil et al. [5] | Gabor filter bank feature extractor, PCA, SVM with RBF kernel | Gabor and SVM parameters are chosen manually |
| E. Pranav et al. [6] | Deep convolutional neural network (DCNN) | Video images are captured, so user privacy is violated |
| Z. Rzayeva et al. [7] | Convolutional neural network | Video images are captured, so user privacy is violated |

Fig 1. Radar chart representing the accuracy of the reviewed methods.
Motivation

 Propose a user-privacy-secured facial expression recognition system.
 Develop a new framework with higher recognition accuracy.
 Analyze and create an in-house Austro-Mongoloid sub-race based facial information dataset.
 Deal with a high-dimensional feature dataset.

Proposed workflow: Human Emotion Dataset Creation → Feature scaling using normalization → Data projection into a second order polynomial space → Nonlinear dimensionality reduction using UMAP → Classification using different classifiers → Check the accuracy, while tuning the coefficients of the second order polynomial space using PSO.
Materials and Methods

 Human Emotion Dataset Creation

Pipeline: data acquisition setup → user interface development → facial data acquisition from human subjects → storage of facial expression data in CSV format → manual annotation of stored data → subject-wise dataset prepared for the different classes.

The Kinect detects 121 face points in Cartesian coordinates. Each point is first translated to the nose tip point:

(X_int, Y_int, Z_int) = (X_i, Y_i, Z_i) − (X_nt, Y_nt, Z_nt)

Spatial distance with respect to the nose tip point:

ED_np = √(X_int² + Y_int² + Z_int²)   …(1)

Spatial distance between symmetrical face points on both sides of the face:

ED_h = √((X_Rin − X_Lin)² + (Y_Rin − Y_Lin)² + (Z_Rin − Z_Lin)²)   …(2)

Fig 2(a). Lines joining selected face points to the nose tip point. Fig 2(b). Lines joining selected face points to their corresponding face points on the other side of the face.
Fig 3. Components – Kinect Xbox 360 [6]
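A minimal NumPy sketch of the two distance features in Eqs. (1) and (2); the point indices used below (nose tip at index 0, a simple right/left pairing) are illustrative assumptions, not the paper's actual Kinect index map.

```python
import numpy as np

# 121 Kinect-style face points, each (x, y, z); random stand-in data.
rng = np.random.default_rng(0)
points = rng.normal(size=(121, 3))

# Assume point 0 is the nose tip (this index choice is hypothetical).
nose_tip = points[0]

# Translate every point to nose-tip coordinates: (X_int, Y_int, Z_int).
centered = points - nose_tip

# Eq. (1): Euclidean distance of each point from the nose tip.
ed_np = np.linalg.norm(centered, axis=1)

# Eq. (2): Euclidean distance between symmetric right/left point pairs
# (this pairing is illustrative only).
right, left = points[1:61], points[61:121]
ed_h = np.linalg.norm(right - left, axis=1)

print(ed_np.shape, ed_h.shape)  # (121,) (60,)
```

These per-point and per-pair distances form the rows of the feature matrix described on the next slides.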
 Human Emotion Dataset Creation (contd.)

Physiological expression of the different emotions:

Happy: 1. Eyes are narrowed and there is some wrinkling around the eyes. 2. Cheeks are raised. 3. Lips are pulled back symmetrically.
Sad: 1. Inner corners of the eyebrows pulled up and together. 2. Upper eyelids drooped and eyes looking down. 3. Lip corners pulled downward.
Surprise: 1. Eyebrows raised, more curved than seen in fear, but not drawn together. 2. Upper eyelids raised, lower eyelids neutral. 3. Jaw drooped down.
Fear: 1. Eyebrows raised and pulled together. 2. Raised upper eyelids. 3. Tensed lower eyelids. 4. Jaw drooped open and lips stretched horizontally backwards.
Anger: 1. Eyebrows pulled down and together. 2. Eyes opened wide, staring hard. 3. Lips pressed tightly together.
 Human Emotion Dataset Creation (contd.)

An annotated database has been created with the Kinect Xbox 360; the annotation is entered manually against each row of the facial expression dataset.
Fig 4. Face tracking by the WPF application.

 Subjects of the dataset: 30 human subjects have been recorded for the different classes of facial expression.
 Groups of subjects: sorted according to gender (male, female) and according to age (18-25, 25-35, 35-45).
 Dataset dimension: number of rows 30 × 5 × 10 = 1500 (five expressions per person, each expression repeated 10 times); number of columns 243 + 3 = 246; size of the database 1500 × 246.
 Dataset classes (five distinct classes): Happy, Sad, Surprise, Fear, Angry.
 Dataset orientation (four types): Face XY Positions, Eyebrow near ends, Face Points, Face Angles.
 Data Pre-processing

 Data Projection
The data in ℝ^(1500×243) (X-space) are projected into a second order polynomial space:

f(X) = X² + αX + β   …(3)

 Feature Scaling
The probability density function of a feature is not a bell-shaped function (Fig 5), so normalization is chosen as the feature scaling technique.

Table I: Accuracy, model precision, model recall and model f1-score of different classifiers for normalization as the feature scaling technique (coefficient values α = 1.6033, β = 1.6451)

| Performance parameter | Random Forest | K-Nearest Neighbor | Decision Tree |
|---|---|---|---|
| Accuracy | 96% | 95.33% | 92.67% |
| Model Precision | 0.958 | 0.954 | 0.926 |
| Model Recall | 0.958 | 0.952 | 0.926 |
| Model F1-Score | 0.96 | 0.952 | 0.926 |

Table II: Accuracy, model precision, model recall and model f1-score of different classifiers for standardization as the feature scaling technique (coefficient values α = 1.6033, β = 1.6451)

| Performance parameter | Random Forest | K-Nearest Neighbor | Decision Tree |
|---|---|---|---|
| Accuracy | 88.5% | 88.67% | 85.33% |
| Model Precision | 0.884 | 0.888 | 0.854 |
| Model Recall | 0.884 | 0.888 | 0.852 |
| Model F1-Score | 0.886 | 0.888 | 0.852 |

Fig 5. Probability density function of a feature.
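The projection of Eq. (3) followed by min-max normalization can be sketched as below; the toy matrix and the reuse of Table I's coefficients are illustrative only (in the full pipeline α and β are later tuned by PSO).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))       # toy stand-in for the 1500 x 243 matrix

# Eq. (3): element-wise second order polynomial projection.
alpha, beta = 1.6033, 1.6451      # coefficient values quoted in Table I
fX = X**2 + alpha * X + beta

# Min-max normalization per feature column (the chosen scaling technique).
fmin, fmax = fX.min(axis=0), fX.max(axis=0)
X_scaled = (fX - fmin) / (fmax - fmin)

print(X_scaled.min(), X_scaled.max())  # 0.0 1.0
```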
 Nonlinear Dimensionality Reduction

The features are in ℝ²⁴³; due to this high dimensionality, the data have a non-linear structure and a visualization problem occurs, so a manifold learning technique such as UMAP has been introduced.

 Manifold Learning
 Given points x₁, x₂, …, xₙ ∈ ℝᴰ that lie on a d-dimensional manifold M, learn a mapping f which gives y₁, y₂, …, yₙ ∈ ℝᵈ, where yᵢ = f(xᵢ).

 Uniform Manifold Approximation and Projection (UMAP)
 Learning the manifold structure in the high dimensional space: searching nearest neighbors; graph construction with (a) varying distance, (b) local connectivity, (c) fuzzy area, (d) merging of edges.
 Low dimensional representation: minimum distance; cost function minimization.

 Classification of Data – Classifiers
 Random Forest
 K-Nearest Neighbor (KNN)
 Decision Tree (D-Tree)

 Optimize the Projection Plane Coefficients using PSO:

| Inertia weight w | Acceleration coefficients c1, c2 | Number of particles | Number of iterations |
|---|---|---|---|
| 0.7 | c1 = 2.0, c2 = 2.0 | 20 | 100 |
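Of the three classifiers listed, KNN is simple enough to sketch from scratch; the toy 2-D points below stand in for the UMAP-reduced features (the study itself would presumably use library implementations).

```python
import numpy as np

# Minimal KNN classifier: majority vote among the k nearest training points.
def knn_predict(X_train, y_train, X_test, k=3):
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)   # distances to all train points
        nearest = y_train[np.argsort(d)[:k]]      # labels of k nearest neighbors
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])     # majority vote
    return np.array(preds)

# Toy 2-class data in a low-dimensional (post-reduction) space.
X_train = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.2, 0.5], [5.1, 5.5]])
print(knn_predict(X_train, y_train, X_test, k=3))  # [0 1]
```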
Performance Evaluation

 Performance of PSO
Fig 6. Variation of the cost function with increasing iterations for the PSO based random forest classification method.
Fig 7. Variation of the cost function with increasing iterations for the PSO based KNN classification method.
Fig 8. Variation of the cost function with increasing iterations for the PSO based D-Tree classification method.

 Performance Parameters of Classifiers
 Confusion Matrix
Fig 9. Confusion matrix for the PSO based random forest classification method for α = −0.27087, β = −72.4287.
Fig 10. Confusion matrix for the PSO based KNN classification method for α = 0.66211, β = 14.81579.
Fig 11. Confusion matrix for the PSO based D-Tree classification method for α = 0.71594, β = 11.41197.
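The cost-function curves in Figs 6-8 come from PSO searching over the projection coefficients (α, β). A minimal PSO sketch with the hyperparameters listed earlier (w = 0.7, c1 = c2 = 2.0, 20 particles, 100 iterations) is given below; a simple quadratic bowl stands in for the actual classifier cost so the sketch is self-contained.

```python
import numpy as np

# Stand-in cost: quadratic bowl with minimum at (0.5, -2.0). In the study
# this would be the classification error for coefficients (alpha, beta).
def cost(p):
    return np.sum((p - np.array([0.5, -2.0]))**2, axis=1)

rng = np.random.default_rng(42)
w, c1, c2 = 0.7, 2.0, 2.0          # inertia and acceleration coefficients
n_particles, n_iter = 20, 100

pos = rng.uniform(-5, 5, size=(n_particles, 2))   # candidate (alpha, beta)
vel = np.zeros_like(pos)
pbest, pbest_cost = pos.copy(), cost(pos)         # personal bests
gbest = pbest[np.argmin(pbest_cost)]              # global best
init_best = pbest_cost.min()

for _ in range(n_iter):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    c = cost(pos)
    improved = c < pbest_cost
    pbest[improved], pbest_cost[improved] = pos[improved], c[improved]
    gbest = pbest[np.argmin(pbest_cost)]

print(np.round(gbest, 3))  # best (alpha, beta) found
```

Because personal bests only ever improve, the global-best cost decreases monotonically, which is the behaviour the figures plot.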
 Performance Parameters of Classifiers (contd.)

Table III: Accuracy, model precision, model recall and model f1-score of different classifiers for particular coefficient values

| Performance parameter | Random Forest (α = −0.27087, β = −72.4287) | KNN (α = 0.66211, β = 14.81579) | D-Tree (α = 0.71594, β = 11.41197) |
|---|---|---|---|
| Accuracy | 98% | 97.5% | 96.67% |
| Model Precision | 0.98 | 0.972 | 0.966 |
| Model Recall | 0.978 | 0.976 | 0.968 |
| Model F1-Score | 0.98 | 0.974 | 0.968 |

Table IV: Performance parameters for the PSO based random forest classification method for α = −0.27087, β = −72.4287

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Class 1 – Happy | 0.98 | 0.98 | 0.98 |
| Class 2 – Sad | 0.98 | 0.99 | 0.99 |
| Class 3 – Surprise | 0.98 | 0.97 | 0.97 |
| Class 4 – Fear | 0.98 | 0.98 | 0.98 |
| Class 5 – Anger | 0.98 | 0.98 | 0.98 |

Table V: Performance parameters for the PSO based KNN classification method for α = 0.66211, β = 14.81579

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Class 1 – Happy | 0.98 | 0.96 | 0.97 |
| Class 2 – Sad | 0.94 | 1.00 | 0.97 |
| Class 3 – Surprise | 0.98 | 0.97 | 0.98 |
| Class 4 – Fear | 0.98 | 0.97 | 0.97 |
| Class 5 – Anger | 0.98 | 0.98 | 0.98 |

Table VI: Performance parameters for the PSO based D-Tree classification method for α = 0.71594, β = 11.41197

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Class 1 – Happy | 0.95 | 0.94 | 0.95 |
| Class 2 – Sad | 0.96 | 0.99 | 0.98 |
| Class 3 – Surprise | 0.97 | 0.96 | 0.96 |
| Class 4 – Fear | 0.97 | 0.97 | 0.97 |
| Class 5 – Anger | 0.98 | 0.98 | 0.98 |

Fig 12. Area under curve (AUC) for the PSO based random forest, KNN and D-Tree classifiers.
Conclusion & Future Work
 Conclusion
 The PSO based, second order polynomial space data projected random forest method outperforms the other classifiers for facial emotion recognition, with 98% accuracy.
 Among the memory based classifiers only KNN performed satisfactorily, and the voting based D-Tree classifier performed slightly worse than KNN; hence another voting based classifier, Random Forest, was chosen for classification.
 Random Forest is a collection of decision trees, and the majority vote of the forest is selected as the predicted output.
 The working principle of Random Forest is bagging: it creates different training subsets from the sample training data with replacement, and the final output is based on majority voting. (Boosting, by contrast, combines weak learners into strong learners by creating sequential models so that the final model has the highest accuracy.)
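A minimal scikit-learn sketch of the bagging-and-majority-vote behaviour described above, on synthetic data rather than the paper's dataset (scikit-learn's API is assumed available):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic multi-class data as a stand-in for the facial-feature dataset.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# bootstrap=True: each tree is fit on a bootstrap sample of the training
# data (bagging); the forest predicts by majority vote over its trees.
clf = RandomForestClassifier(n_estimators=100, bootstrap=True, random_state=0)
clf.fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 2))
```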

 Future Work
 Metaheuristic optimization based projection can be extended to polynomial spaces of order higher than two, and the performance of the classification model examined.
 This emotion recognition method can be implemented in a room temperature and illumination controller that adjusts the room temperature or light intensity based on the occupant's mood, thereby reducing energy consumption.
References
1. H. Soyel, H. Demirel, "3D facial expression recognition with geometrically localized facial features," 2008 23rd International Symposium on Computer and Information
Sciences, pp. 1-4, 2008.
2. Sujono, A. Gunawan, “Face Expression Detection on Kinect Using Active Appearance Model and Fuzzy Logic”, International Conference on Computer Science and
Computational Intelligence (ICCSCI 2015), pp. 268-274, vol.59, 2015.
3. Q. Mao, X. Pan, Y. Zhan, X. Shen, “Using Kinect for real-time emotion recognition via facial expressions”, Frontiers Inf Technol Electronic Eng, pp. 272–282, vol.16, 2015.
4. Z. Zhang, L. Cui, X. Liu, T. Zhu, "Emotion Detection Using Kinect 3D Facial Points," 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pp. 407-
410, 2016.
5. B. Adil, K. M. Nadjib, L. Yacine, "A novel approach for facial expression recognition," 2019 International Conference on Networking and Advanced Systems (ICNAS), pp. 1-
5, 2019.
6. E. Pranav, S. Kamal, C. S. Chandran, M. H. Supriya, "Facial Emotion Recognition Using Deep Convolutional Neural Network," 2020 6th International Conference on
Advanced Computing and Communication Systems (ICACCS), pp. 317-320, 2020.
7. Z. Rzayeva, E. Alasgarov, "Facial Emotion Recognition using Convolutional Neural Networks," 2019 IEEE 13th International Conference on Application of Information and
Communication Technologies (AICT), pp. 1-5, 2019.
8. F. Piat, N. Tsapatsoulis, "Exploring the time course of facial expressions with a fuzzy system," 2000 IEEE International Conference on Multimedia and Expo. ICME2000.
Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat.No.00TH8532), pp. 615-618 vol.2, 2000.
9. A. F. Bobick, J. W. Davis, "The recognition of human movement using temporal templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 257-267,
vol. 23, no. 3, 2001.
10. A. J. Izenman, “Introduction to manifold learning”, WIREs Computational Statistics, vol. 4, pp. 439-446, 2012.
11. L. McInnes, J. Healy, J. Melville, “UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction”, arXiv:1802.03426v3 [stat.ML], Sep. 2020.
12. B. K. Tripathy, A. Sundareswaran, S. Ghela, Unsupervised Learning Approaches for Dimensionality Reduction and Data Visualization, 1st ed., United States: CRC Press, 2022.
13. J. Kennedy and R. Eberhart, "Particle swarm optimization," Proceedings of ICNN'95 - International Conference on Neural Networks, Perth, WA, Australia, vol. 4, pp. 1942-1948, 1995.
14. A. Dey, K. D. Sharma, T. Sanyal, P. Bhattacharjee Jr, P. Bhattacharjee , "Population based study on arsenic induced blood samples employing hybrid metaheuristic
optimization based ML approach," 2019 IEEE Region 10 Symposium (TENSYMP), Kolkata, India, pp. 599-604, 2019.

