Stress Detection Report
Introduction:
The objective of this use case is to predict a person's stress level and categorize it as
amusement, baseline, or stress.
WESAD stands for WEarable Stress and Affect Detection. As the name indicates, the
data is used to detect the stress level of a person. The data was collected by conducting an
experiment on 17 subjects. Of these 17, 2 subjects (S1 and S12) were discarded due to
sensor malfunction. For each subject, there are five sub-files.
Explanation of Columns:
Column            Explanation
net_acc_mean      Mean of the net accelerometer signal
net_acc_std       Standard deviation of the net accelerometer signal
net_acc_min       Minimum of the net accelerometer signal
net_acc_max       Maximum of the net accelerometer signal
ACC_x_mean        Mean of the accelerometer X signal
ACC_x_std         Standard deviation of the accelerometer X signal
ACC_x_min         Minimum of the accelerometer X signal
ACC_x_max         Maximum of the accelerometer X signal
ACC_y_mean        Mean of the accelerometer Y signal
ACC_y_std         Standard deviation of the accelerometer Y signal
ACC_y_min         Minimum of the accelerometer Y signal
ACC_y_max         Maximum of the accelerometer Y signal
ACC_z_mean        Mean of the accelerometer Z signal
ACC_z_std         Standard deviation of the accelerometer Z signal
ACC_z_min         Minimum of the accelerometer Z signal
ACC_z_max         Maximum of the accelerometer Z signal
BVP_mean          Mean of the BVP signal
BVP_std           Standard deviation of the BVP signal
BVP_min           Minimum of the BVP signal
BVP_max           Maximum of the BVP signal
EDA_mean          Mean of the EDA signal
EDA_std           Standard deviation of the EDA signal
EDA_min           Minimum of the EDA signal
EDA_max           Maximum of the EDA signal
EDA_phasic_mean   Mean of the EDA phasic signal
EDA_phasic_std    Standard deviation of the EDA phasic signal
EDA_phasic_min    Minimum of the EDA phasic signal
EDA_phasic_max    Maximum of the EDA phasic signal
EDA_smna_mean     Mean of the EDA smna signal
EDA_smna_std      Standard deviation of the EDA smna signal
EDA_smna_min      Minimum of the EDA smna signal
EDA_smna_max      Maximum of the EDA smna signal
EDA_tonic_mean    Mean of the EDA tonic signal
EDA_tonic_std     Standard deviation of the EDA tonic signal
EDA_tonic_min     Minimum of the EDA tonic signal
EDA_tonic_max     Maximum of the EDA tonic signal
Resp_mean         Mean of the Respiration signal
Resp_std          Standard deviation of the Respiration signal
Resp_min          Minimum of the Respiration signal
Resp_max          Maximum of the Respiration signal
subject           Subject ID
The data from the SX_readme file is also added to the file. The data in SX_Quest is added as
well and denotes the label, i.e. the value to be predicted.
The WESAD data is inherently unbalanced because the experimental protocol dictated a
different duration for each condition. The SMOTE upsampling technique was used to deal with
the unbalanced data.
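The report does not show how SMOTE was configured, so the following is only a minimal sketch of
the core idea: each synthetic row is placed at a random point on the line between a minority-class
row and one of its nearest minority-class neighbours. The function name smote_oversample, the
parameters k and seed, and the toy rows are all assumptions made for illustration. In practice a
maintained implementation (for example the one in the imbalanced-learn package) would normally be
used instead.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours.

    Hypothetical helper for illustration; not the report's actual code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the chosen row (excluding itself)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature rows standing in for one under-represented class (made-up values).
minority_rows = [[0.10, 1.0], [0.20, 1.1], [0.15, 0.9], [0.30, 1.2]]
new_rows = smote_oversample(minority_rows, n_new=6)
```

Because every synthetic row is a convex combination of two real rows, it always stays inside the
range already covered by the minority class.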
EDA is controlled by the sympathetic nervous system (SNS), and hence it is particularly
sensitive to high arousal states. First, a 5 Hz lowpass filter was applied to the raw EDA
signal. Then, statistical features were computed (e.g. mean, standard deviation, dynamic
range, etc.). Furthermore, the raw EDA signal consists of a tonic component (referred to as
skin conductance level (SCL)) and a phasic component (skin conductance response (SCR)).
The cvxEDA library was used to process the EDA biosignal and yields the phasic, SMNA, and
tonic signals listed in the column table (reference: lciti/cvxEDA: Algorithm for the analysis
of electrodermal activity (EDA) using convex optimization).
On the raw ACC signal, different statistical features, e.g. the mean µacc,i and standard
deviation σacc,i, were computed. These features were computed both for each axis separately
(i ∈ {x, y, z}) and for the net accelerometer signal (the net_acc columns).
On the raw TEMP signal common statistical features (mean, standard deviation, min, max,
etc.) were computed.
Statistical features of the biosignals, such as min, max, standard deviation, and quartiles,
were calculated. The signals were recorded at different sampling frequencies, so sample
positions were converted to time in seconds. A window size of 30 seconds was used. Each
30-second window of statistical features was flattened to form a single row. This produced
1177 rows, each representing the processed biosignals for one 30-second period. The result
was saved in the cleantable.csv file. The final WESAD data set contains 59 columns.
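The windowing step described above can be sketched as follows. This is an illustration under
stated assumptions, not the report's actual code: the helper names window_features and build_rows,
the toy signals, the 4 Hz and 64 Hz sampling rates, the use of non-overlapping windows, and the
choice of population standard deviation are all assumed here. The sampling rate is what converts a
30-second window into a number of samples for each signal.

```python
import statistics


def window_features(signal, sampling_rate_hz, window_seconds=30):
    """Split a 1-D signal into fixed, non-overlapping windows; return stats per window."""
    samples_per_window = int(sampling_rate_hz * window_seconds)
    rows = []
    for start in range(0, len(signal) - samples_per_window + 1, samples_per_window):
        window = signal[start:start + samples_per_window]
        rows.append({
            "mean": statistics.fmean(window),
            "std": statistics.pstdev(window),  # population std; an assumption
            "min": min(window),
            "max": max(window),
        })
    return rows


def build_rows(signals, window_seconds=30):
    """signals maps a name to (samples, sampling_rate_hz); returns one flat row per window."""
    per_signal = {
        name: window_features(samples, rate, window_seconds)
        for name, (samples, rate) in signals.items()
    }
    n_windows = min(len(rows) for rows in per_signal.values())
    table = []
    for i in range(n_windows):
        row = {}
        for name, rows in per_signal.items():
            for stat, value in rows[i].items():
                row[f"{name}_{stat}"] = value  # e.g. EDA_mean, BVP_max
        table.append(row)
    return table


# Toy signals: 60 s of "EDA" at 4 Hz and 60 s of "BVP" at 64 Hz (made-up values).
signals = {
    "EDA": ([float(i % 4) for i in range(4 * 60)], 4),
    "BVP": ([float(i % 8) for i in range(64 * 60)], 64),
}
table = build_rows(signals)
```

With 60 seconds of data and 30-second windows, table holds two rows, and each row carries the
flattened statistics of every signal under column names that follow the pattern used in the
column table.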
Idea for processing shown in block diagram:
Feature Selection:
The correlation matrix is plotted for the dataset and the following features are selected as they
are highly correlated.
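The report does not list the selected features or the threshold, so the following is only a sketch
of one common way to select features by correlation. It assumes that "highly correlated" means a
high absolute Pearson correlation with the label. The function names, the 0.5 threshold, and the
toy values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, label, threshold=0.5):
    """Keep the columns whose absolute correlation with the label reaches the threshold."""
    scores = {name: pearson(values, label) for name, values in columns.items()}
    return [name for name, r in scores.items() if abs(r) >= threshold]


# Toy columns (made-up values); label 0 = baseline, 1 = stress.
columns = {
    "EDA_mean": [0.1, 0.2, 0.8, 0.9],
    "Resp_std": [0.5, 0.4, 0.5, 0.4],
    "subject": [1.0, 1.0, 1.0, 1.0],
}
label = [0, 0, 1, 1]
selected = select_features(columns, label)
```

Here only EDA_mean is kept: Resp_std varies independently of the label, and the constant column is
treated as uncorrelated.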
Model Summary:
The extracted features, detailed above, serve as input for the classification step. Five machine
learning algorithms were applied and compared within our benchmark: Decision Tree (DT),
Random Forest (RF), AdaBoost (AB), Linear Discriminant Analysis (LDA), and k-Nearest
Neighbour (kNN). As the entire data processing chain was implemented in Python, we used
the scikit-learn implementation of the aforementioned classifiers.
A benchmark is created using a large number of well-known features (extracted from
physiological and motion signals) and common machine learning methods: Decision Tree
(DT), Random Forest (RF), AdaBoost (AB), Linear Discriminant Analysis (LDA), and
k-Nearest Neighbour (kNN).
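To make the classification step concrete, here is a minimal pure-Python sketch of the idea behind
one of the listed methods, k-Nearest Neighbour: a feature row is assigned the majority label of its
k closest training rows. This is not the scikit-learn implementation the report used. The function
name knn_predict, the value of k, and the toy feature rows are assumptions for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(
        range(len(train_rows)),
        key=lambda i: math.dist(train_rows[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy two-feature rows (made-up values) for the three classes in this report.
train_rows = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.5, 0.1], [0.6, 0.2]]
train_labels = ["baseline", "baseline", "stress", "stress", "amusement", "amusement"]
prediction = knn_predict(train_rows, train_labels, [0.85, 0.85])
```

The query row lies closest to the two "stress" rows, so they outvote the single "amusement"
neighbour.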
The subjects (n = 15) were exposed to different affective stimuli (stress and amusement).
Finally, the data of each subject is linked to several self-reports, which represent the
subjective experience during an affective stimulus.
Input & Output
Input: Output:
Logistic Regression:
Logistic Regression estimates its parameters by maximum likelihood rather than least
squares, which gives better results, where better means unbiased with lower variance.
Naive Bayes:
It is easy and fast to predict the class of the test data set. It also performs well in
multi-class prediction.
Decision Tree:
A significant advantage of a decision tree is that it forces the consideration of all
possible outcomes of a decision and traces each path to a conclusion.
Random Forest:
Random Forest increases predictive power of the algorithm and also helps prevent
overfitting.
AdaBoost:
AdaBoost can be used to boost the performance of any machine learning algorithm. It is
best used with weak learners. These are models that achieve accuracy just above random
chance on a classification problem.
GradBoost:
The GRADBOOST procedure creates a predictive model by fitting a set of additive
trees.
UI:
The UI is designed so that anyone who uses a smartphone or laptop can understand it. A
first-time user has to sign up; a returning user logs in.
The main page displays the button “check your stress level”. When the user clicks the button,
the browser navigates to a new page where the user fills in some preliminary details and
presses the button “connect the sensor”. The website then shows an emoji and text that
indicate the stress level of the user. For further understanding, the website also displays a
graph.
The user can access the information provided by the website by purchasing the API.
UI depiction through block model:
40% of workers reported their job was very or extremely stressful. 25% view their jobs as
the number one stressor in their lives. Through our project, we took the initiative to help
bring these percentages down by detecting stress as early as possible.
The data was handled carefully and the models were built for prediction. We are able to
categorize a person’s stress level as stress, amusement, or baseline. Finally, an API concept
is included in order to attract users.