Journal of Neuroscience Methods 358 (2021) 109209

Contents lists available at ScienceDirect

Journal of Neuroscience Methods


journal homepage: www.elsevier.com/locate/jneumeth

A major depressive disorder classification framework based on EEG signals using statistical, spectral, wavelet, functional connectivity, and nonlinear analysis
Reza Akbari Movahed, Gila Pirzad Jahromi *, Shima Shahyad, Gholam Hossein Meftahi
Neuroscience Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran

ARTICLE INFO

Keywords: Depression; Major depressive disorder (MDD); Machine learning; Electroencephalogram (EEG); Computer-aided diagnosis (CAD)

ABSTRACT

Background: Major depressive disorder (MDD) is a prevalent mental illness that is diagnosed through questionnaire-based approaches; however, these methods may not lead to an accurate diagnosis. In this regard, many studies have focused on using electroencephalogram (EEG) signals and machine learning techniques to diagnose MDD.
New method: This paper proposes a machine learning framework for MDD diagnosis, which uses different types of
EEG-derived features. The features are extracted using statistical, spectral, wavelet, functional connectivity, and
nonlinear analysis methods. The sequential backward feature selection (SBFS) algorithm is also employed to
perform feature selection. Various classifier models are utilized to select the best one for the proposed
framework.
Results: The proposed method is validated with a public EEG dataset, including the EEG data of 34 MDD patients
and 30 healthy subjects. The evaluation of the proposed framework is conducted using 10-fold cross-validation,
providing the metrics such as accuracy (AC), sensitivity (SE), specificity (SP), F1-score (F1), and false discovery
rate (FDR). The best performance of the proposed method has provided an average AC of 99%, SE of 98.4%, SP of
99.6%, F1 of 98.9%, and FDR of 0.4% using the support vector machine with RBF kernel (RBFSVM) classifier.
Comparison with existing methods: The obtained results demonstrate that the proposed method outperforms other
approaches for MDD classification based on EEG signals.
Conclusions: According to the obtained results, a highly accurate MDD diagnosis would be provided using the
proposed method, while it can be utilized to develop a computer-aided diagnosis (CAD) tool for clinical purposes.
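
The five reported metrics all follow from the confusion-matrix counts of a binary classifier. The snippet below is a minimal sketch in plain Python, assuming MDD is the positive class (label 1) and HC the negative class (label 0). The function name `classification_metrics` and the label encoding are illustrative assumptions, not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return AC, SE, SP, F1, and FDR for binary labels.

    Assumption: `positive` marks the MDD class; every other label is HC.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive predictions).
        return num / den if den else 0.0

    return {
        "AC": ratio(tp + tn, tp + tn + fp + fn),   # accuracy
        "SE": ratio(tp, tp + fn),                  # sensitivity (recall)
        "SP": ratio(tn, tn + fp),                  # specificity
        "F1": ratio(2 * tp, 2 * tp + fp + fn),     # F1-score
        "FDR": ratio(fp, fp + tp),                 # false discovery rate
    }
```

Under 10-fold cross-validation, such a function would be applied to the predictions of each test fold and the results averaged across folds.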

1. Introduction

Major Depressive Disorder (MDD) is a common psychiatric illness that causes persistent feelings of sadness, loss of pleasure, guilt, and impairment of cognitive abilities. MDD patients suffer mostly from these feelings and, in the worst case, might think about suicide as well (Seligman, 1975). According to the World Health Organization (WHO) estimation, more than 350 million people in the world are suffering from MDD as the fourth most common cause of disability (Marcus et al., 2012; Organization, 2001). Currently, psychologists diagnose depression based on scale-based interview standards, such as the Diagnostic and Statistical Manual for depression (DSM-IV) (Castillo et al., 2007), Mini-Mental State Examination (MMSE) (Folstein et al., 1983), Beck depression inventory (BDI) (Folstein et al., 1983), and Hamilton Depression Rating Scale (HDRS) (Hamilton, 1960). Unfortunately, since these methods do not use any accepted biomarkers, their results depend on various factors, including the physician's skill and the patient's cooperation. Consequently, these methods are laborious, time-consuming, and subjective, so many depressed patients may not be diagnosed accurately. Hence, finding and developing MDD diagnosis approaches based on biological indicators seems necessary to reduce the subjectivity of MDD diagnosis. Since the electroencephalogram (EEG) signal reflects human brain activity, it is an objective and reliable biological indicator for diagnosing mental and cognitive disorders. Notably, recording of EEG signals is cost-effective, relatively easy, and non-invasive. It can also provide high temporal resolution information of the brain bioelectrical activity, while any alteration of brain functioning and mental state could be reflected in

* Corresponding author.
E-mail address: g_pirzad_jahromi@yahoo.com (G.P. Jahromi).

https://doi.org/10.1016/j.jneumeth.2021.109209
Received 30 November 2020; Received in revised form 22 April 2021; Accepted 26 April 2021
Available online 4 May 2021
0165-0270/© 2021 Elsevier B.V. All rights reserved.

these signals. However, manual interpretation of EEG signals is complicated and tedious due to their intrinsic characteristics, such as intricacy, non-linearity, and non-stationarity. The development of machine learning and computer technology has encouraged many researchers to investigate EEG-based machine learning techniques to establish computer-aided diagnosis (CAD) systems for facilitating the diagnosis of neurological disorders such as epilepsy (Behnam and Pourghassem, 2017; Song et al., 2012; Xiang et al., 2015), seizure (Acharya et al., 2018; Direito et al., 2017; Wei et al., 2019; Ghaderyan et al., 2014), Parkinson's disease (Hirschauer et al., 2015; Yuvaraj et al., 2016), schizophrenia (Shim et al., 2016), dementia (Durongbhan et al., 2019), and sleep disorders (Hassan and Bhuiyan, 2016; Lajnef et al., 2015; Mousavi et al., 2019; Chinara et al., 2020; Lachner-Piza et al., 2018).

Until now, several approaches have been proposed to diagnose MDD based on EEG signals using computerized methods. For instance, Hosseinifard et al. presented a framework for classifying MDD and healthy control (HC) subjects using some linear and nonlinear features of EEG signals (Hosseinifard et al., 2013). In the mentioned study, the power of alpha, beta, theta, and delta EEG frequency bands and four nonlinear features, including detrended fluctuation analysis, Higuchi fractal, correlation dimension, and Lyapunov exponent, were extracted from EEG signals. A feature selection technique based on the Genetic algorithm was used to choose the best subset of features. For classifying MDD and HC subjects, some classifiers such as k-nearest neighbor (KNN), linear discriminant analysis (LDA), and logistic regression (LR) were utilized, and a classification accuracy of 90% was obtained by all nonlinear features and the LR classifier. Besides that, Puthankattil et al. presented an approach to classify the EEG signals of healthy and depressed cases using relative wavelet energy (RWE) and signal entropy as extracted features and the artificial neural network (ANN) as the classification model (Puthankattil and Joseph, 2012). They applied the bipolar EEG signals of healthy and depressed subjects to the method. The reported accuracy, sensitivity, and specificity of this method were 98.11%, 98.73%, and 97.5%, respectively. In 2015, Acharya et al. proposed an automated depression diagnosis method based on bipolar EEG signals using nonlinear feature extraction methods such as fractal dimension, Lyapunov exponent, sample entropy, detrended fluctuation analysis, Hurst's exponent, higher-order spectra, and recurrence quantification analysis (Acharya et al., 2015). These features were ranked with a t-test feature-ranking process and fed to a Support Vector Machine (SVM) classifier, which achieved an average accuracy of about 98%, sensitivity of about 97%, and specificity of about 98.5%. Mumtaz et al. proposed a machine learning scheme based on EEG-derived measures such as the power of alpha, beta, theta, and delta EEG frequency bands and EEG alpha interhemispheric asymmetry to predict depression (Mumtaz et al., 2017). In this scheme, a rank-based feature selection method was used to reduce the feature space's redundancy. The classifier models used in this paper were LR, SVM, and Naive Bayesian (NB), among which the SVM classifier achieved the highest average classification accuracy of 98.6%. Furthermore, Mumtaz et al. proposed EEG-based functional connectivity features to classify MDD and HC subjects (Mumtaz et al., 2018). To this end, synchronization likelihood features were extracted as input data for the classification framework, rank-based feature selection was used to choose the best subset of features, and classifiers such as SVM, LR, and NB were employed to classify MDD and HC subjects. The highest classification accuracy of 98% was obtained using the selected synchronization likelihood features and the SVM classifier. They also performed a time-frequency decomposition of EEG signals using wavelet transform for automatic MDD diagnosis in another study (Mumtaz et al., 2017). The average classification accuracy reported in that study using selected wavelet coefficients and the LR classifier was 89.6%. Sharma et al. used a bandwidth-duration localized (BDL) three-channel orthogonal wavelet filter bank (TCOWFB) to extract features from bipolar EEG signals of healthy and depressed cases (Sharma et al., 2018). They also employed a t-test feature-ranking process to select the best features. An accuracy of 99.58% was reported in their study using the least square support vector machine (LS-SVM) classification model. Mahato et al. utilized the band power of delta, theta, alpha, and beta frequency bands, interhemispheric asymmetry, RWE, and wavelet entropy (WE) as EEG-derived features to diagnose MDD (Mahato and Paul, 2019). In this framework, the principal component analysis (PCA) technique was used to reduce the dimensionality of features for improving computational cost and classification efficiency. The multi-layered perceptron neural network (MLPNN), radial basis function network (RBFN), LDA, and quadratic discriminant analysis (QDA) were used to classify MDD and HC subjects. The highest classification accuracy of 93.33% reported in their paper was achieved when the combination of alpha-band power and RWE features was applied to the MLPNN and RBFN classifiers. Acharya et al. presented an automated EEG-based depression recognition method using a convolutional neural network (CNN) model (Acharya et al., 2018). They used a CNN model with 13 layers consisting of 5 convolutional layers, 5 pooling layers, and 3 fully-connected layers to classify each bipolar EEG signal into the healthy and depression classes. This approach obtained accuracies of 96.0% and 93.5% using EEG signals from the right and left hemispheres, respectively. Ay et al. proposed a learning-based technique for automatic depression diagnosis using deep representation and sequence learning with bipolar EEG signals (Ay et al., 2019). They employed a combination model of CNN and long short-term memory (LSTM) techniques to diagnose depressed patients. Classification accuracies of 99.12% and 97.66% were reported in this paper for the right and left hemisphere bipolar EEG signals, respectively. In addition, Mumtaz et al. proposed two methods based on deep learning techniques to diagnose depression using unipolar EEG signals (Mumtaz and Qayyum, 2019). The first proposed model was a CNN model, and the second model was a combination of CNN and LSTM techniques. The reported classification accuracies using the CNN and CNN-LSTM techniques were 98.32% and 95.97%, respectively. Mahato et al. investigated the automatic classification of depressed patients and healthy subjects based on alpha power and theta asymmetry EEG features (Mahato and Paul, 2020). In this study, multi-cluster feature selection was used to select the most effective features. The classifiers used here were SVM, LR, NB, and Decision Tree (DT). According to the reported results, this approach obtained an average accuracy of 88.33% using the SVM classifier.

One of the common issues in the mentioned studies is that they do not consider the combination of different types of EEG-derived features for MDD diagnosis. To address this issue, the present study aimed to propose a diagnostic approach for MDD based on machine learning that uses the combination of different types of EEG-derived features to classify the EEG samples into the MDD and HC classes. The approach consists of EEG signal preprocessing, data augmentation, feature extraction, feature selection, classification, and validation. Here, we used various analytic methods to extract EEG-derived features, including statistical, spectral, wavelet, functional connectivity, and nonlinear analysis methods. The sequential backward feature selection (SBFS) was also employed to select the best subset of features and enhance classification performance. In the experimental setup of the proposed method, different classifiers were evaluated to select the best one for the proposed framework. Besides that, the performance of each feature set, the effect of data augmentation, EEG signal power differences between MDD and HC subjects in common EEG frequency bands, and the most significant functional connectivity features were also investigated.

The remainder of the paper is organized as follows. In Section 2, the dataset and the proposed framework are explained. The results of the study are reported in Section 3. Finally, the discussion and conclusion are provided in Sections 4 and 5, respectively.


2. Materials and methods

2.1. Subjects

A public data set provided by Mumtaz et al. (Mumtaz et al., 2017) was utilized to evaluate the proposed method of depression diagnosis based on EEG signals. The participants of this study were the outpatients of Hospital Universiti Sains Malaysia (HUSM). This dataset was acquired from 34 MDD patients (17 females + 17 males, mean age (yr) = 40.3 ± 12.9) and 30 age-matched HC subjects (9 females + 21 males, mean age (yr) = 38.3 ± 15.6). The MDD patients met the diagnostic criteria according to the DSM-IV to confirm the diagnosis (Spitzer et al., 1994). All participants signed the consent forms of participation and were informed about the experimental procedure adopted for experimental data acquisition. The ethics committee of HUSM approved the experimental setup (Mumtaz et al., 2017). The current study was then approved by the ethics committee of Baqiyatallah University of Medical Sciences, Tehran, Iran (ID:IR.BMSU.REC.1398.263).

2.2. EEG signal recording

The resting-state EEG signals were acquired from MDD and HC subjects in the eye-closed (EC) and eye-opened (EO) conditions. The procedure was performed using a 19-channel EEG cap. The cap's sensors were placed according to the 10-20 electrode placement standard (Jasper, 1958) and linked-ear (LE) reference (Dien, 1998). In other words, EEG signals were recorded from frontal (Fp1, Fp2, F3, F4, F7, F8, Fz), temporal (T3, T4, T5, T6), parietal (P3, P4, Pz), occipital (O1, O2), and central (C3, C4, Cz) regions. These signals were sampled at 256 Hz and filtered with a 0.5 Hz to 70 Hz bandpass filter and an additional 50 Hz notch filter using an amplifier from Brain Master Systems.

2.3. Proposed classification method

Fig. 1 illustrates the overview of the proposed framework for diagnosing MDD based on EEG signals. As shown in Fig. 1, the proposed method contains EEG signal preprocessing, data augmentation, feature extraction, feature selection, classification, and validation steps. In the first step, the typical noises and artifacts in the signals were suppressed using the proposed preprocessing method. Next, the number of samples was increased using a data augmentation procedure. In the feature extraction step, some measures based on statistical, spectral, wavelet, functional connectivity, and nonlinear analyses were extracted from each sample and arranged column-wise in a matrix called the EEG feature matrix. As exhibited in Fig. 1, the feature matrix was then divided randomly into the training and testing sets using 10-fold cross-validation. To improve the classification performance and reduce the dimensionality of the feature matrix, the SBFS method was utilized as the feature selection technique. In this step, the training set was applied to the SBFS algorithm, so that it returned the most discriminative feature subset between the HC and MDD classes. It is worth mentioning that SBFS returned a specific subset in each iteration of the 10-fold cross-validation. Next, the classification model was trained and validated by the training and testing sets with the selected features. Finally, the classification performance of the proposed method was evaluated based on the classification results of the testing set during each iteration. All simulations and implementations were conducted using MATLAB™ R2019b on a system with an Intel® Xeon® Processor E5-2697 v2 CPU at 2 GHz and 16 GB memory. More details of each step are provided in the following.

2.3.1. EEG signal preprocessing

During recording, EEG signals get inherently contaminated with different types of noises and artifacts. The origins of these artifacts are various biological and non-biological sources such as eye blinks and movements, muscular activities, heartbeat, channel noise, and power line noise. As a result, EEG signals might not truly represent the underlying neuronal activity. Therefore, an EEG signal preprocessing step is considered necessary for noise reduction and suppression of destructive artifacts, to ensure that preprocessed signals represent pure brainwave activity and to avoid subsequent erroneous analysis.

In the present study, the proposed EEG signal preprocessing pipeline was implemented using the EEGLAB toolbox (Delorme and Makeig, 2004) of MATLAB software. Firstly, each EEG signal was re-referenced to the A1-A2 channel. Next, all re-referenced EEG signals were high-pass filtered with a 0.5 Hz cutoff frequency and low-pass filtered with a 32 Hz cutoff frequency. This procedure suppresses muscle activity and power line noise since most of these artifacts' power is

Fig. 1. Overview of the proposed framework for MDD diagnosis using 19-channel EEG signals.
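
The cross-validated loop summarized in Fig. 1, in which feature selection runs on the training portion of every fold, can be sketched as follows. This is a minimal illustration in plain Python rather than the authors' MATLAB implementation. It assumes plain greedy backward elimination for SBFS, since the stopping rule and selection criterion are not specified in this section. The names `k_fold_indices`, `sequential_backward_selection`, and `score_fn` are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def sequential_backward_selection(n_features, score_fn, min_features=1):
    """Greedy backward elimination (assumed form of SBFS).

    score_fn(subset) -> float, higher is better; `subset` is a tuple of
    feature indices. Returns the best-scoring subset seen and its score.
    """
    current = tuple(range(n_features))
    best_subset, best_score = current, score_fn(current)
    while len(current) > min_features:
        # Score every subset obtained by dropping one remaining feature.
        candidates = [tuple(f for f in current if f != drop) for drop in current]
        scored = [(score_fn(c), c) for c in candidates]
        cand_score, current = max(scored, key=lambda sc: sc[0])
        # On ties, prefer the smaller subset.
        if cand_score >= best_score:
            best_subset, best_score = current, cand_score
    return best_subset, best_score
```

In each fold, `score_fn` would typically wrap a classifier trained and scored on the training indices only, so that the test fold never influences which features are kept.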


concentrated in higher frequencies. To remove other artifacts, independent components of filtered signals were calculated using the independent component analysis (ICA) algorithm. Since this method assumes that the EEG signal is a mixture of independent components, it decomposes the signal into these parts so that each part could belong to the cerebral or artifactual sources. After applying ICA on a signal, each component was identified as an artifact or cerebral component using a voting classification process with the aid of the ICLabel (Pion-Tonachini et al., 2019) and MARA (Winkler et al., 2011) plugins and manual inspection of the component in the time and frequency domains. In the voting classification process, each automatic plugin and the manual inspection method made a prediction (vote) for each component, and the final output prediction was the one that received more than half of the votes. Then, artifact components were removed, and the pruned signal was reconstructed. Fig. 2 illustrates the scalp topography, time-series, and power spectrum of some instances of the obtained independent components of the used dataset, which belong to the brain, eye blinking, heartbeat, and muscular activities. Finally, the pruned EEG signal was visually inspected in the time domain to eliminate the remaining noisy intervals.

2.3.2. Data augmentation

In machine learning applications, data augmentation is used to increase the number of samples without collecting new samples. A typical time-series data augmentation is based on signal slicing, which segments a signal into smaller slices of equal length, each carrying the label of the original EEG signal. In this study, an EEG sample was segmented into slices with 1-min lengths to generate new samples and increase the dataset's diversity. It is worth mentioning that 1-min slicing led to better results for the proposed method compared with other slicing times. After performing data augmentation, an EEG dataset of HC and MDD subjects with more samples was obtained, consisting of 249 MDD and 261 HC samples, so that each sample included 19 channels. The reasons why there were more HC samples than MDD samples after data augmentation were the differing durations of the dataset's EEG signals, the elimination of noisy intervals, and the unavailability of some EEG signals.

2.3.3. Feature extraction

Feature extraction aims to derive meaningful parameters from a dataset, providing informative features, reducing the number of variables, and facilitating the subsequent steps of machine learning frameworks. In this study, the feature extraction step involved statistical, spectral, wavelet, functional connectivity, and nonlinear analysis methods. This step constructed a feature matrix called the EEG feature matrix consisting of 510 rows and 735 columns, where each row represents a sample and each column one of its features. In the following, each category of the proposed features for classifying MDD and HC signals is described in detail.

(1) Statistical analysis: In this study, some statistical measures were extracted from artifact-free EEG segments as statistical features. The statistical features include average, skewness, kurtosis, minimum, maximum, and Hjorth parameters extracted from each channel of EEG segments. The Hjorth parameters proposed by Hjorth (1970) consist of three main measures called activity ($h_0$), mobility ($h_1$), and complexity ($h_2$), which are defined as follows:

$$h_0 = \mathrm{var}(x(t)), \tag{1}$$

$$h_1 = \sqrt{\frac{h_0(x'(t))}{h_0(x(t))}}, \tag{2}$$

$$h_2 = \frac{h_1(x'(t))}{h_1(x(t))}, \tag{3}$$

where $x(t)$ is the time-series, $\mathrm{var}$ is the variance function, and $x'(t)$ is the first-order derivative of $x(t)$. The activity parameter represents the variance of a time-series signal. The mobility and complexity parameters indicate the proportion of the standard deviation of the power spectrum and the change in frequency of the signal, respectively.

(2) Spectral analysis: EEG spectral analysis intends to interpret the power of fluctuations of the EEG time-series in EEG frequency bands at different scalp regions. Numerous studies indicated significant relationships between the characteristics of EEG spectral features and neurological disorders, cognitive state, and mental illnesses. Therefore, EEG spectral features may be useful for characterizing EEG signals and analyzing cognitive states or neurological dysfunctions. The proposed method employed the band power of EEG signals in the typical EEG frequency bands and interhemispheric asymmetry as EEG spectral features. These frequency bands are associated with different cognitive tasks, mental states, and neurological brain mechanisms, and could be interpreted as depression biomarkers because of their relation with various mental states. For instance, the beta frequency band is associated with expectancy, consciousness, memory, and problem-solving, and an excess of beta activity may lead to excessive stress and anxiety (Freeman and Quiroga, 2012; Abhang et al., 2016; Evans and Abarbanel, 1999). The alpha frequency band is related to the relaxation state, and its prominence causes daydreaming, inability to focus, and deep relaxation. In contrast, its suppression can lead to anxiety, high stress, and insomnia (Abhang et al., 2016; Evans and Abarbanel, 1999; Rao, 2013). The theta frequency band is involved in the shallow sleep state, emotional processing, creativity, memory, and perceptual functions. An imbalance of this activity may result in anxiety, poor emotional awareness, stress, hyperactivity, and impulsivity (Abhang et al., 2016; Evans and Abarbanel, 1999; Rao, 2013; Li et al., 2019; Aftanas et al., 2002). The delta frequency band is typically associated with the deepest levels of relaxation and restorative and healing sleep. If delta activity is abnormal, a person may experience learning impairment, inability to think, or difficulties maintaining conscious awareness (Freeman and Quiroga, 2012; Abhang et al., 2016; Evans and Abarbanel, 1999; Harmony et al., 1996; Knyazev, 2012). Furthermore, many studies have shown relationships between EEG frequency bands and MDD with different results. For example, it was observed that decreased theta and delta activity is related to MDD (Saletu et al., 2010; Knott et al., 2001; Coutin-Churchman and Moreno, 2008). In contrast, it was reported that increased delta and theta activity was associated with MDD (Liu et al., 2017; Nystrom et al., 1986). Some studies determined relations between MDD and variation of alpha and beta activity (Coutin-Churchman and Moreno, 2008; Lee et al., 2018; Roh et al., 2016; Begić et al., 2011). Although many studies have been conducted on the relation between MDD and EEG frequency bands, a consistent finding has not been obtained due to methodological differences across studies and the inherent heterogeneity of the populations under investigation. Herein, the power of four common EEG frequency bands, i.e., delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), and beta (13–32 Hz), was computed. To compute the power of the frequency bands, the Welch periodogram method was used to estimate the power spectral density of signals (Oppenheim et al., 1989). In this work, the Hamming window with 50% overlap between the segments was used to estimate the Welch power spectral density. Interhemispheric asymmetry is a spectral feature that measures EEG signal power differences between the left and right hemispheres in the common EEG frequency bands (Hinrikus et al., 2009). The equation of interhemispheric asymmetry can be modeled as follows:


Fig. 2. The scalp topography, time-series, and power spectrum of some instances of the obtained independent components of the EEG dataset signals. (a): Brain
component, (b): eye blinking component, (c): heart activity component, (d): muscular activity component.
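
The voting step that labels each independent component (such as those in Fig. 2) as cerebral or artifactual can be sketched as below. This is a minimal illustration in plain Python. It assumes that the ICLabel, MARA, and manual-inspection decisions have already been reduced to one string label per component, and it does not call either plugin. The function names and the label strings `"brain"` and `"artifact"` are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that received more than half of the votes, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def classify_components(iclabel, mara, manual):
    """Combine three per-component label lists by majority vote.

    Each argument holds one label per independent component,
    e.g. "brain" or "artifact" (assumed encoding).
    """
    return [majority_vote(v) for v in zip(iclabel, mara, manual)]


def components_to_keep(final_labels, keep_label="brain"):
    """Indices of components retained when the pruned signal is reconstructed."""
    return [i for i, label in enumerate(final_labels) if label == keep_label]
```

With three voters and two possible labels a strict majority always exists; the `None` branch only matters if more voters or more label categories are used.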


$$A_{mn} = \log(P_{RH}) - \log(P_{LH}), \tag{4}$$

where $A_{mn}$, $P_{RH}$, and $P_{LH}$ are the interhemispheric asymmetry, the power of the EEG signal in the right hemisphere, and the power of the EEG signal in the left hemisphere, respectively. Here, the interhemispheric asymmetry was computed for the delta, theta, alpha, and beta frequency bands for each EEG channel pair. The channel pairs in this study were Fp2-Fp1; F4-F3; F8-F7; C4-C3; T4-T3; P4-P3; T6-T5; O2-O1.

(3) Wavelet analysis: Wavelet transform is a time-frequency decomposition technique that provides better time-frequency localization than other similar methods such as empirical mode decomposition and short-time Fourier transform (Rosso et al., 2006). It utilizes time windows with different lengths, decomposing a signal into different frequency resolutions. The continuous wavelet transform (CWT) of a time-series signal is defined as follows:

$$CWT(a, b) = \int_{-\infty}^{+\infty} x(t)\, \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{t - b}{a}\right) dt, \tag{5}$$

where $x(t)$ is the time-series signal, $\psi$ is the shifted and scaled wavelet basis, and $a$ and $b$ are the scaling and shifting parameters, respectively. Unfortunately, the information obtained by CWT may be highly redundant and requires a high computational load. The discrete wavelet transform (DWT) is proposed to address this problem, which is defined as follows:

$$DWT(a, b) = \frac{1}{\sqrt{|2^j|}} \int_{-\infty}^{+\infty} x(t)\, \psi\!\left(\frac{t - 2^j k}{2^j}\right) dt, \tag{6}$$

where $j$ and $k$ represent the frequency and time localization, respectively. In the optimum DWT, the signal is passed through quadrature mirror filters, consisting of a series of high-pass and low-pass filter pairs. These filter pairs decompose a signal into the approximate ($A_i$) and detail ($D_i$) coefficients, which represent low- and high-frequency components of the signal, respectively. This decomposing procedure can be applied to the $A_i$ coefficients several times, creating a hierarchical structure. In this research, the three-level DWT decomposition was conducted using the Coiflet 5 wavelet.

After that, RWE and WE were computed as wavelet features using the wavelet coefficients. The energy at the $k$th decomposition level ($E_k$) can be obtained using (7), which is defined as follows:

$$E_k = \sum_{l} \left| C_{k,l} \right|^2, \tag{7}$$

where $C_{k,l}$ are the wavelet coefficients at the $k$th decomposition level, $l$ indexes the coefficients at that level, and $k = 1, 2, \ldots, N$ denotes the decomposition level. The total energy ($E_T$) over all decomposition levels is obtained using (8), formulated as follows:

$$E_T = \sum_{k=1}^{N} E_k. \tag{8}$$

Then, RWE at the $k$th resolution level ($RWE_k$) can be computed using the following equation:

$$RWE_k = \frac{E_k}{E_T}. \tag{9}$$

WE is computed as follows:

$$WE = -\sum_{i=1}^{N} RWE_i \log RWE_i, \tag{10}$$

where $i = 1, 2, \ldots, N$ indicates the level of the wavelet decomposition.

(4) Nonlinear analysis: EEG signals are inherently non-stationary and stochastic, and contain some nonlinear characteristics. These properties limit the ability of linear analysis to describe these signals completely. Therefore, many studies related to EEG signal processing employ nonlinear analysis to investigate the complexity and dynamics of these signals. In this study, some nonlinear methods such as detrended fluctuation analysis, Higuchi, correlation dimension, Lyapunov exponent, C0-complexity, Kolmogorov entropy, Shannon entropy, and approximate entropy were applied to the preprocessed EEG segments to extract nonlinear features.

(11) Detrended fluctuation analysis: Detrended fluctuation analysis is a mathematical approach for analyzing stochastic processes that estimates the correlation properties of a time-series signal (Jospin et al., 2007). In the first step of this analysis, given a finite time-series signal $x(t)$ of length $N$, its cumulative sum ($X(k)$) is computed using the following equation:

$$X(k) = \sum_{i=1}^{k} \left( x(i) - \bar{x} \right), \tag{11}$$

where $\bar{x}$ denotes the average value of $x(t)$. Then, $X(k)$ is divided into time windows of equal length $n$, and a least-squares line is fitted to the data within each window. Let $Y_n(k)$ indicate the resulting least-squares line fitting. Next, the fluctuation ($F(n)$) is computed using the following equation:

$$F(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left( X(k) - Y_n(k) \right)^2}. \tag{12}$$

Finally, the calculation process of (12) is repeated for time windows with different sizes to construct a logarithmic scale of $F(n)$ against $n$. The relation between the logarithm of $F(n)$ and $n$ can be expressed by $F(n) \propto n^{\alpha}$, where $\alpha$ represents the correlation properties of the time-series signal.

(12) Higuchi: In 1988, Higuchi introduced a method for estimating the fractal dimension of a set of points (Higuchi, 1988). Suppose $x(t)$ is a time-series signal with a length of $N$ samples. Given this signal, $T$ new time-series signals are generated using (13),

$$X_{\tau}^{T} = \left\{ x(\tau),\ x(\tau + T),\ \ldots,\ x\!\left(\tau + \left[\frac{N - \tau}{T}\right] T\right) \right\}, \tag{13}$$

where $\tau = 1, 2, \ldots, T$ and $[r]$ is the integer part of $r$. The length of each time-series ($L_{\tau}(T)$) is defined as follows:

$$L_{\tau}(T) = \frac{1}{T} \left[ \left( \sum_{i=1}^{\left[\frac{N - \tau}{T}\right]} \left| x(\tau + iT) - x(\tau + (i - 1)T) \right| \right) \frac{N - 1}{\left[\frac{N - \tau}{T}\right] T} \right]. \tag{14}$$

In this algorithm, an average length is calculated for each time-
series using the below equation:
The equation of WE is based on Shannon entropy formulation,
which is represented as follows:

R.A. Movahed et al. Journal of Neuroscience Methods 358 (2021) 109209


$$L(T) = \frac{1}{T}\sum_{\tau=1}^{T} L_{\tau}(T). \tag{15}$$

The calculation of (15) is repeated for all $T$ values ranging from $T_{min}$ to $T_{max}$. Finally, the slope of the linear fit of $\ln L(T)$ versus $\ln(1/T)$ is taken as the fractal dimension of $x(t)$. In this research, $T_{min}$ and $T_{max}$ were set to 1 and 30, respectively.

(13) Correlation dimension: Correlation dimension is a method for estimating the dimensionality of the space occupied by a set of random points. Grassberger and Procaccia proposed the most commonly used correlation dimension algorithm in 1983 (Grassberger and Procaccia, 2004). In the first step, this algorithm generates an $m$-dimensional vector using time delay ($\tau$) and embedding dimension ($m$), which can be represented as:

$$X(i) = \left[x(i),\, x(i+\tau),\, \ldots,\, x(i + (m-1)\tau)\right], \tag{16}$$

where $i = 1, 2, \ldots, N - (m-1)\tau$, $x$ is the time-series signal with $N$ samples, and $X$ is the $m$-dimensional vector. After that, the correlation integral of $X$ is computed using the following equation:

$$C(r) = \frac{2}{N(N-1)}\sum_{i \neq j} \phi\left(r - |X(i) - X(j)|\right), \tag{17}$$

where $C(r)$ is the correlation integral and $\phi$ is the Heaviside step function. Next, the raw correlation dimension is estimated using (18), which is denoted as follows:

$$D = \lim_{r \to 0} \frac{\ln C(r)}{\ln(r)}. \tag{18}$$

The computation of (18) is repeated by increasing $m$, resulting in a gradual increase of $D$ until it saturates. The saturated value of $D$ is the estimated correlation dimension of the signal $x(t)$.

(14) Lyapunov exponent: The Lyapunov exponent is a measure of dynamic systems that characterizes the rate of convergence or divergence of close trajectories in phase space (Röschke et al., 1995). For a dynamic system of dimension $d$, $d$ Lyapunov exponents can be computed. However, in most practical applications only the largest Lyapunov exponent (LLE) is calculated instead of all exponents. For a dynamical system, the maximum Lyapunov exponent ($\lambda_1$) can be defined as follows:

$$d_j(i) = d_j(0)\exp(\lambda_1 i \Delta t), \tag{19}$$

where $d_j(i)$ is the Euclidean distance between the $j$th pair of nearest-neighbor trajectories after $i$ time steps, $d_j(0)$ is the initial distance between that pair, and $\Delta t$ is the sampling period. The LLE is calculated using (20), which is defined as follows:

$$y(i) = \frac{1}{\Delta t}\left\langle \ln d_j(i) \right\rangle, \tag{20}$$

where $y(i)$ is the approximated LLE and $\langle \ln d_j(i) \rangle$ is the mean value of the natural logarithm of $d_j(i)$ over all values of $j$.

(15) C0-complexity: C0-complexity is a measure to quantify irregularities in a time-series signal proposed by Shen et al. (En-hua et al., 2005). In summary, a time-series signal can be divided into regular and stochastic components, and the C0-complexity measures the proportion of irregularity in the signal. In other words, it indicates the complexity and randomness of a time-series. Given a time-series signal $x(n)$ with $N$ samples, the mean amplitude of the power spectrum of $x(n)$ ($M$) can be obtained using (21):

$$M = \frac{1}{N}\sum_{k=0}^{N-1} |X(k)|^{2}, \tag{21}$$

where $X(k)$ is the fast Fourier transform of $x(n)$. A new spectrum is constructed using $X(k)$ and $M$ as follows:

$$Y(k) = \begin{cases} X(k), & |X(k)|^{2} > M, \\ 0, & |X(k)|^{2} \le M. \end{cases} \tag{22}$$

By calculating the inverse Fourier transform of $Y(k)$, denoted $y(n)$, the C0-complexity ($C0$) of $x(n)$ can be computed as follows:

$$C0 = \frac{A_1}{A_0} = \frac{\sum_{n=0}^{N-1} |x(n) - y(n)|^{2}}{\sum_{n=0}^{N-1} |x(n)|^{2}}, \tag{23}$$

where $A_1$ is the power of the irregular part of $x(n)$ and $A_0$ is the total power of $x(n)$.

(16) Kolmogorov entropy: Kolmogorov entropy (KE) is another measure for characterizing the chaotic degree of a system (Aftanas et al., 1997). It also reflects the rate of loss of information per unit of time for a time-series signal. Mathematically, it is defined based on the average rate of loss of information of a signal with $n$ samples as follows:

$$KE = -\lim_{\tau \to 0}\lim_{\varepsilon \to 0}\lim_{n \to \infty} \frac{1}{n\tau}\sum_{i_0 \ldots i_{n-1}} P_{i_0 \ldots i_{n-1}} \ln P_{i_0 \ldots i_{n-1}}, \tag{24}$$

where $P_{i_0 \ldots i_{n-1}}$ is the joint probability that the trajectory visits the partition cells $i_0, \ldots, i_{n-1}$ (of size $\varepsilon$) at successive times separated by $\tau$. A positive and finite value of KE indicates that the dynamic phenomena in the time-series are chaotic. A zero value means that the time-series contains regular phenomena, and an infinite KE refers to the existence of non-deterministic phenomena in the signal.

(17) Shannon entropy: Shannon entropy is a quantity that measures the uncertainty of a random time-series, introduced by Shannon (Shannon, 1948). A larger value of this measure indicates more uncertainty and randomness in a signal. The Shannon entropy of a random time-series with $N$ samples can be defined as:

$$H = -\sum_{i=1}^{N} p_i \ln(p_i), \tag{25}$$

where $H$ and $p_i$ are the Shannon entropy of the signal and the probability of the $i$th sample in the time-series, respectively.

(18) Approximate entropy: In 1991, Pincus proposed approximate entropy as an algorithm to quantify the unpredictability of a time-series signal (Pincus, 1991). A larger value of approximate entropy indicates a more irregular time-series. Firstly, given a time-series $x(n)$ with $N$ data points, a new vector $X(i)$ is constructed using (16) on the assumption that $\tau = 1$. Next, the distance between $X(i)$ and $X(j)$ ($D[X(i), X(j)]$) is calculated as follows:

$$D[X(i), X(j)] = \max_{k=1,2,\ldots,m} \left[\,|x(i+k-1) - x(j+k-1)|\,\right], \tag{26}$$

where $|\cdot|$ is the absolute value. After that, $C_i^m(r)$ is computed for each $i$, $i = 1, 2, \ldots, N-m$, as follows:


$$C_i^m(r) = \frac{\text{number of } D[X(i), X(j)] \le r}{N-m-1}, \tag{27}$$

where $r$ is the threshold for $D[X(i), X(j)]$. Finally, the approximate entropy (ApEn) is defined as follows:

$$ApEn = \Phi^{m}(r) - \Phi^{m+1}(r), \tag{28}$$

where the equation of $\Phi^{m}(r)$ is:

$$\Phi^{m}(r) = \frac{1}{N-m-1}\sum_{i=1}^{N-m-1} \ln\left(C_i^m(r)\right). \tag{29}$$

In this research, the values of $m$ and $r$ were set to 2 and 0.2var(x), respectively.

(5) Functional connectivity analysis: Different perceptual and cognitive tasks require a coordinated flow of information distributed across the brain regions. Functional connectivity is a method for analyzing the dynamic coordination of neuronal activities in the brain. In other words, functional connectivity investigates the statistical relationships between neurobiological activities and connections between different brain regions such as frontal, temporal, central, parietal, and occipital (Squire et al., 2009). Since some MDD-related cognitive tasks such as emotional regulation, thinking, attention, problem-solving, and memory-related functions are associated with frontal and temporal lobes and their connection, functional connectivity features could be interpreted as MDD biomarkers using brain network representation (Smith and Kosslyn, 2007; Uttal, 2011). Many studies have also shown that the pathogenesis of MDD is related to abnormalities in the structures and networks of the brain regions (Zhu et al., 2012; Wu et al., 2013; Avery et al., 2014; Sheline, 2003; Koolschijn et al., 2009; Olbrich et al., 2014; Leuchter et al., 2012). Therefore, functional connectivity features could be effective for MDD diagnosis. There are different metrics for quantifying functional connectivity, such as coherence, mutual information, and synchronization likelihood. In this study, the synchronization likelihood metric was used to extract a set of features based on the functional connectivity analysis. Synchronization likelihood characterizes synchronization between two time-series signals (Stam and Van Dijk, 2002). The value of synchronization likelihood between two time-series lies in the interval [0,1]. A value of zero represents complete non-synchronization, while a value of one expresses complete synchronization between the two signals. Consider $M$ simultaneously recorded EEG channels ($x_{k,i}$), where $k \in \{1, 2, \ldots, M\}$ and $i \in \{1, 2, \ldots, N\}$ denote the channel number and the index of each discrete sample, respectively. According to (30), an embedded vector ($X_{k,i}$) is constructed using the EEG signal corresponding to a channel as follows:

$$X_{k,i} = \left[x_{k,i},\, x_{k,i+l},\, \ldots,\, x_{k,i+(m-1)l}\right], \tag{30}$$

where $m$ and $l$ are the embedding dimension and lag parameters, respectively. To estimate how likely the embedded vectors are to be closer to each other than a distance $\varepsilon$, a probability ($P_{k,i}^{\varepsilon}$) is considered for each channel ($k$) and each sample ($i$). The formulation of $P_{k,i}^{\varepsilon}$ is defined as follows:

$$P_{k,i}^{\varepsilon} = \frac{1}{2(\omega_2 - \omega_1)}\sum_{j=1}^{N} \phi\left(\varepsilon - |X_{k,i} - X_{k,j}|\right), \quad \omega_1 < |i-j| < \omega_2, \tag{31}$$

where $\phi$, $|\cdot|$, $\omega_1$, and $\omega_2$ are the Heaviside step function, the Euclidean distance, the Theiler correction window, and the sharpening window, respectively. The Theiler correction window and sharpening window are used to correct for autocorrelation effects and to sharpen the time resolution of the synchronization measure. These parameters are chosen such that $\omega_1 \ll \omega_2 \ll N$. Next, the critical distance ($\varepsilon_{k,i}$) is computed for each $k$ and each $i$ for which $P_{k,i}^{\varepsilon_{k,i}} = p_{ref}$, where $p_{ref} \ll 1$. Now, the number of channels where $X_{k,i}$ and $X_{k,j}$ are closer together than $\varepsilon_{k,i}$ ($H_{i,j}$) is determined for each sample pair $(i, j)$ within the considered window ($\omega_1 < |i-j| < \omega_2$) as follows:

$$H_{i,j} = \sum_{k=1}^{M} \phi\left(\varepsilon_{k,i} - |X_{k,i} - X_{k,j}|\right). \tag{32}$$

In other words, $H_{i,j}$ indicates how many of the embedded signals resemble each other. In the next step, the synchronization likelihood for each channel ($k$) and each discrete sample pair $(i, j)$, ($S_{k,i,j}$), is defined as follows:

$$S_{k,i,j} = \begin{cases} \dfrac{H_{i,j} - 1}{M - 1}, & \text{if } |X_{k,i} - X_{k,j}| < \varepsilon_{k,i}, \\[2mm] 0, & \text{if } |X_{k,i} - X_{k,j}| \ge \varepsilon_{k,i}. \end{cases} \tag{33}$$

Finally, the synchronization likelihood ($S_{k,i}$) is computed as follows:

$$S_{k,i} = \frac{1}{2(\omega_2 - \omega_1)}\sum_{j=1}^{N} S_{k,i,j}, \quad \omega_1 < |i-j| < \omega_2. \tag{34}$$

$S_{k,i}$ describes the rate of synchronization between channel $k$ at sample $i$ and the other $M-1$ channels. If $S_{k,i} = p_{ref}$, all $M$ signal channels are uncorrelated, whereas $S_{k,i} = 1$ indicates the maximum synchronization of all $M$ signal channels. In this study, $p_{ref}$, $l$, $m$, and the sizes of $\omega_1$ and $\omega_2$ were set to 0.01, 10, 10, 100, and 410, respectively. It is worth mentioning that the synchronization likelihood of each channel with itself was removed, and the synchronization likelihood between different channels was retained as the functional connectivity features.

2.3.4. Feature selection
In machine learning or statistical pattern recognition applications, the features extracted from a dataset may contain redundant or irrelevant features. In addition, high-dimensional feature sets increase the computational load and can lead to overfitting of the models. Feature selection is a technique to select the desired subset of features, reduce the dimension of the feature space, and improve the classification performance of the pattern recognition model. In this work, SBFS was utilized to select a subset of features for improving the classification performance. The steps of this algorithm are summarized in Algorithm 1. In this study, the objective criterion ($J$) was defined as the mean of the misclassification rates during the 10-fold cross-validation process. This algorithm initially considers the whole feature set and then sequentially removes features from the feature set until the elimination of further features leads to an increase of the objective criterion (Pudil et al., 1994). It should be mentioned that the classifier model in the SBFS step is the same as the classifier model in the classification step of the proposed method.

Algorithm 1. The SBFS algorithm
Input: The set of all features, $Y = \{y_1, y_2, \ldots, y_d\}$
Output: An optimum subset of features, $S_k = \{s_j \mid j = 1, 2, \ldots, k;\ s_j \in Y\}$ where $k = 1, 2, \ldots, d$
1. Start with the full set, $S_0 = Y$.
2. Remove the worst feature, $s^{*} = \arg\min_{s \in S_k} J(S_k - s)$, i.e. the feature whose removal yields the lowest misclassification rate.
3. Update $S_{k+1} = S_k - s^{*}$; $k = k + 1$.
4. If $J(S_k) > J(S_{k-1})$, go to step 6.
5. Else go to step 2.
6. Stop.
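The elimination loop of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `objective` is a hypothetical callable standing in for the criterion J (lower is better, for example a mean cross-validated misclassification rate), and the sketch returns the last subset before a removal would have increased J.

```python
from typing import Callable, FrozenSet, Iterable


def backward_selection(
    features: Iterable[str],
    objective: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy backward elimination in the spirit of Algorithm 1.

    `objective` is a hypothetical stand-in for J: it maps a feature subset
    to a score where lower is better (e.g. a misclassification rate).
    """
    current = frozenset(features)          # Step 1: start with the full set.
    current_score = objective(current)
    while len(current) > 1:
        # Step 2: the "worst" feature is the one whose removal gives the lowest J.
        candidate = min((current - {f} for f in sorted(current)), key=objective)
        candidate_score = objective(candidate)
        # Step 4: stop once removing any further feature increases J.
        if candidate_score > current_score:
            break
        # Step 3: accept the removal and continue.
        current, current_score = candidate, candidate_score
    return current


# Toy usage with a made-up criterion: noise features add error,
# missing informative features add more error.
INFORMATIVE = {"a", "b"}
NOISE = {"n1", "n2"}


def toy_objective(subset: FrozenSet[str]) -> float:
    return 0.1 * len(subset & NOISE) + 0.5 * len(INFORMATIVE - subset)


selected = backward_selection(INFORMATIVE | NOISE, toy_objective)
```

In practice each call to `objective` would retrain and cross-validate the classifier, so the cost grows quadratically with the number of features; caching scores per subset is a common mitigation.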


2.3.5. Classification
Classification is a fundamental part of the automatic identification of patterns in many statistical pattern recognition problems. In the present study, different classifiers such as Support Vector Machine with linear (LINSVM) and radial basis function (RBFSVM) kernels, LR, DT, NB, and some ensemble classifiers like RusBoost (RB), GentleBoost (GB), and Random Forest (RF) were employed to select the best one for the proposed machine learning framework. A data standardization based on z-score transformation was performed on the training and testing sets to modify unequal distribution of each feature and eliminate outliers. It should be mentioned that the hyperparameters of the LINSVM, RBFSVM, LR, DT, NB, RB, and GB classification models were optimized using the Bayesian optimizer. It is worth mentioning that the number of trees in the RF ensemble classifier ranged from 2 to 50 during the training procedure to find the optimum value of this hyperparameter.

2.3.6. Validation
In the present study, 10-fold cross-validation was employed to fairly assess the classification performance of the proposed method. In this procedure, the dataset is initially divided at random into 10 folds, of which 9 folds are used as a training set, and the remaining fold is employed as a testing set. This process is repeated 10 times until each fold has been utilized as a testing set. During each iteration, the testing set is applied to the trained model, which results in 10 different values of each evaluation metric. For investigating the overall classification performance, the average and the standard deviation of the evaluation metrics were considered. The evaluation metrics used in this study were accuracy (AC), sensitivity (SE), specificity (SP), F1-score (F1), and false discovery rate (FDR), which are defined as follows:

$$AC = \frac{TP + TN}{TP + FN + TN + FP}, \tag{35}$$

$$SE = \frac{TP}{TP + FN}, \tag{36}$$

$$SP = \frac{TN}{FP + TN}, \tag{37}$$

$$F1 = \frac{2TP}{2TP + FP + FN}, \tag{38}$$

$$FDR = \frac{FP}{FP + TP}, \tag{39}$$

where TP is the number of MDD samples that are correctly classified, FN is the number of MDD samples that are incorrectly classified as HC samples, FP is the number of HC samples that are incorrectly classified as MDD cases, and TN is the number of HC cases that are correctly classified.

3. Results

In this section, the performance of the proposed machine learning method to diagnose MDD based on EEG signals is evaluated in several aspects. In the first part, the obtained numerical results of various classifiers in the proposed framework are provided. In the second part, the effect of the data augmentation procedure on the proposed framework is analyzed. Next, the obtained results of the proposed method are compared with the previous approaches. Then, each set of features is individually used as a feature matrix to assess their performance for classifying MDD and HC signals. In the fifth part, the EEG signal power differences between MDD and HC subjects are investigated. The most significant functional connectivity features are analyzed in the sixth part of this section. Next, the intersection of the returned subsets by SBFS in 10-fold cross-validation execution of the proposed method is reported and investigated. Finally, the proposed method was evaluated using another independent dataset, and its results were reported in the last part of this section.

3.1. Results per classifier models

Table 1 shows the numerical results of the proposed method using different classifier models. As demonstrated, the RBFSVM classifier model provided the best performance among all classifiers with the highest mean of AC, SP, and F1 metrics and the lowest mean of the FDR measure. The average AC, SE, SP, F1, and FDR metrics obtained by the RBFSVM classifier were 99.0%, 98.4%, 99.6%, 98.9%, and 0.4%, respectively. The RBFSVM classifier obtained the lowest standard deviation of AC, SP, F1, and FDR metrics compared to the other utilized classifiers, which indicates that this classifier's performance is the most robust in the proposed method. According to the reported results in Table 1, the second and third best classifiers in the proposed method were the LINSVM and RF classifiers, respectively. The NB classifier exhibited the worst performance with the lowest mean of AC, SP, and F1 metrics. The other classifiers provided a similar performance. Overall, the obtained results show that all of the utilized classifiers except the NB classifier achieved reliable and accurate performances for the classification of MDD and HC cases. Fig. 3 illustrates the box plots of the obtained values of AC, SE, SP, and F1 metrics using different classifiers. As shown in Fig. 3, the obtained AC, SE, SP, and F1 metric values of all classifiers except the NB classifier were relatively high, which confirms the acceptable performance of the proposed method. However, the AC, SE, SP, and F1 metric values of the RBFSVM classifier were higher than the computed values of these metrics using other classifiers. Also, the boxplots of the AC, SE, SP, and F1 metrics of the RBFSVM and LINSVM classifiers had the lowest height compared to the other classifiers, indicating that the performance metric values of the RBFSVM and LINSVM classifiers were closer to the average performance compared to the other methods. Therefore, it can be concluded that the RBFSVM and LINSVM classifiers provided the best trade-off between robustness and performance compared to the other classification models for the proposed classification framework.

3.2. The effect of data augmentation

In order to analyze the effect of data augmentation on the performance of the proposed method, EEG signals with different segment lengths were applied to the proposed classification framework. Firstly, the proposed framework was validated without data augmentation. Then, it was tested with 1- and 2-min EEG segments. The obtained results of these simulations based on the three best classifiers are reported in Table 2. As shown in Table 2, the proposed method with both data augmentation strategies, 1- and 2-min slicing, achieved a better classification performance. In other words, the presented framework with both data augmentation strategies achieved higher means of AC, SE, SP, and F1 metrics and lower means of the FDR measure compared to the proposed method without data augmentation. The only exception was 2-min slicing with the RBFSVM classifier, which led to a higher FDR mean than the 1-min and without-slicing conditions. According to the results, it can be interpreted that the data augmentation procedure with 1-min slicing led to the best classification performance.

3.3. Comparison with other methods

In order to compare the performance of the proposed method with other approaches, we implemented the methods described in (Mahato and Paul, 2020; Mahato and Paul, 2019; Mumtaz et al., 2017; Mumtaz et al., 2018; Mumtaz et al., 2017). For a fair comparison, all of the methods were validated using the 10-fold cross-validation technique. It is worth mentioning that the generated indices for training and testing sets using 10-fold cross-validation were kept the same for all approaches. Table 3 provides the obtained numerical results of the


Table 1
The classification results of the proposed method using different classifier models in terms of the percentage (%) of the mean and standard deviation of AC, SE, SP, F1,
and FDR metrics.
Classifier AC SE SP F1 FDR
(Mean ± Std) (Mean ± Std) (Mean ± Std) (Mean ± Std) (Mean ± Std)

LINSVM 98.8 ± 1.6 98.6 ± 3.0 98.9 ± 2.3 98.6 ± 1.9 1.2 ± 2.8
RBFSVM 99.0 ± 1.3 98.4 ± 2.5 99.6 ± 1.0 98.9 ± 1.3 0.4 ± 1.3
LR 91.9 ± 2.9 97.1 ± 2.0 86.6 ± 6.1 92.1 ± 3.1 12.1 ± 5.9
RF 98.0 ± 2.6 99.1 ± 2.7 97.0 ± 4.8 97.9 ± 2.6 3.0 ± 4.8
DT 93.3 ± 3.3 91.8 ± 4.8 94.6 ± 5.8 92.9 ± 3.6 5.4 ± 6.2
GB 95.2 ± 2.9 93.0 ± 6.3 97.5 ± 3.2 94.7 ± 3.7 3.0 ± 4.2
NB 87.2 ± 6.8 93.3 ± 6.9 81.1 ± 9.8 87.6 ± 6.8 1.71 ± 8.2
RB 93.5 ± 3.3 93.2 ± 5.4 93.7 ± 5.9 93.4 ± 3.1 6.0 ± 5.5
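Each entry of Table 1 is a mean and standard deviation, over the cross-validation folds, of a metric derived from confusion-matrix counts. The sketch below shows one way to compute the five metrics from TP, FN, FP, and TN, with MDD as the positive class. It is an illustration only: FDR uses the conventional definition FP / (FP + TP), the values are returned as fractions rather than percentages, and the use of the sample standard deviation in `summarize` is an assumption, since the text does not state which estimator was used.

```python
import statistics


def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute AC, SE, SP, F1 and FDR (as fractions) from confusion-matrix counts.

    Assumption: every denominator is non-zero (no guard for empty classes).
    """
    return {
        "AC": (tp + tn) / (tp + fn + tn + fp),
        "SE": tp / (tp + fn),
        "SP": tn / (fp + tn),
        "F1": 2 * tp / (2 * tp + fp + fn),
        # Share of samples predicted as MDD that are actually HC.
        "FDR": fp / (fp + tp),
    }


def summarize(per_fold_values):
    """Mean and sample standard deviation of one metric over the folds."""
    return statistics.mean(per_fold_values), statistics.stdev(per_fold_values)


# Toy usage with made-up counts for a single fold.
fold_metrics = classification_metrics(tp=45, fn=5, fp=10, tn=40)
mean_ac, std_ac = summarize([0.90, 1.00, 0.95])
```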

Fig. 3. The box plots of the achieved values of AC (a), SE (b), SP (c), and F1 (d) metrics per each classifier using 10-fold cross-validation method.

proposed method as well as the previous ones for automatic diagnosis of MDD based on EEG signals. According to the summarized results in Table 3, the proposed framework achieved the highest mean of AC, SE, SP, and F1 metrics and the lowest mean of the FDR metric compared to the other methods, indicating that the proposed method is more accurate for the classification of MDD and HC subjects based on EEG signals. From another point of view, the proposed method, compared with (Mumtaz et al., 2018; Mahato and Paul, 2020; Mahato and Paul, 2019; Mumtaz et al., 2017; Mumtaz et al., 2017), improved the AC mean by 7.60%, 19.56%, 17.85%, 15.78%, and 28.73%, respectively. Furthermore, the proposed method achieved the lowest standard deviation across all evaluation metrics compared to the previous approaches for the automatic classification of MDD and HC subjects based on EEG signals. Compared to (Mumtaz et al., 2018; Mahato and Paul, 2020; Mahato and Paul, 2019; Mumtaz et al., 2017; Mumtaz et al., 2017), it has been observed that the AC standard deviations were reduced by 68.29%, 80.59%, 87.25%, 81.42%, and 84.88%, respectively, using the proposed method. These results demonstrate that the classification performance of the proposed method is relatively stable and more reliable than other methods.

3.4. Results per feature set

In this section, each set of the proposed features was used as the feature matrix individually in the proposed method using RBFSVM, LINSVM, and RF classification models. Table 4 lists the classification


Table 2
The comparison of the classification results of the proposed method using data augmentation procedure with 1- and 2-min slicing and without data augmentation, in terms of the percentage (%) of the mean and standard deviation (Mean ± Std) of AC, SE, SP, F1, and FDR metrics.
Method            Classifier   AC           SE            SP             F1           FDR
1-min slicing     RBFSVM       99.0 ± 1.3   98.4 ± 2.5    99.6 ± 1.0     98.9 ± 1.3   0.4 ± 1.3
1-min slicing     LINSVM       98.8 ± 1.6   98.6 ± 3.0    98.9 ± 2.3     98.6 ± 1.9   1.2 ± 2.8
1-min slicing     RF           98.0 ± 2.6   99.1 ± 2.7    97.0 ± 4.8     97.9 ± 2.6   3.0 ± 4.8
2-min slicing     RBFSVM       92.1 ± 4.9   97.5 ± 4.0    86.4 ± 11.7    92.7 ± 4.1   11.2 ± 7.3
2-min slicing     LINSVM       93.1 ± 5.1   93.9 ± 7.8    93.1 ± 11.1    93.4 ± 4.3   6.0 ± 8.0
2-min slicing     RF           94.8 ± 4.4   95.9 ± 5.5    94.0 ± 9.1     94.8 ± 4.3   5.5 ± 8.2
Without slicing   RBFSVM       90.7 ± 8.2   95.3 ± 7.7    84.8 ± 18.5    92.0 ± 6.7   9.7 ± 11.8
Without slicing   LINSVM       92.7 ± 7.5   95.0 ± 8.1    88.7 ± 17.1    92.0 ± 7.3   9.3 ± 13.0
Without slicing   RF           91.4 ± 7.0   93.0 ± 12.3   84.1 ± 18.6    92.3 ± 6.7   6.8 ± 7.8
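The slicing-based augmentation compared in Table 2 amounts to cutting each recording into fixed-length segments. A minimal sketch follows, under assumptions that the text does not confirm: segments are non-overlapping, a trailing remainder shorter than one segment is discarded, and the 256 Hz sampling rate in the usage lines is purely hypothetical.

```python
def slice_into_segments(signal, fs, segment_seconds):
    """Split a 1-D sequence of samples into non-overlapping fixed-length segments.

    fs is the sampling rate in Hz; any trailing samples that do not fill a
    whole segment are dropped (an assumption made for this sketch).
    """
    seg_len = int(fs * segment_seconds)
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


# Hypothetical usage: a 5-min single-channel recording at 256 Hz.
recording = [0.0] * (5 * 60 * 256)
one_min_segments = slice_into_segments(recording, fs=256, segment_seconds=60)
two_min_segments = slice_into_segments(recording, fs=256, segment_seconds=120)
```

Shorter segments yield more training samples per recording, which is the augmentation effect the table quantifies. Segments cut from the same recording are correlated, which matters when they are distributed across training and testing folds.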

Table 3
The comparison of the classification results between the proposed method and previous works for identifying MDD and HC subjects based on the EEG signals.
Method AC SE SP F1 FDR
(Mean ± Std) (Mean ± Std) (Mean ± Std) (Mean ± Std) (Mean ± Std)

Proposed method 99.0 ± 1.3 98.4 ± 2.5 99.6 ± 1.0 98.9 ± 1.3 0.4 ± 1.3
(Mumtaz et al., 2018) 92.0 ± 4.1 94.8 ± 4.6 89.6 ± 4.1 91.9 ± 4.2 10.6 ± 4.2
(Mahato and Paul, 2020) 82.8 ± 6.7 86.9 ± 12.3 81.4 ± 14.3 82.1 ± 9.6 19.1 ± 16.5
(Mahato and Paul, 2019) 84.0 ± 10.2 84.3 ± 15.1 84.0 ± 16.2 83.4 ± 12.2 14.7 ± 15.5
(Mumtaz et al., 2017) 85.5 ± 7.0 89.5 ± 9.1 81.7 ± 11.9 86.0 ± 7.4 16.1 ± 10.7
(Mumtaz et al., 2017) 76.9 ± 8.6 76.0 ± 9.2 77.7 ± 7.2 76.0 ± 5.8 24.0 ± 9.7
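The relative improvements quoted in Section 3.3 follow directly from the AC column of Table 3. The short sketch below reproduces that arithmetic; the helper names are illustrative, and the inputs are the AC means and standard deviations exactly as listed in the table, in row order.

```python
def relative_increase(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`, relative to `old`."""
    return 100.0 * (new - old) / old


def relative_reduction(new: float, old: float) -> float:
    """Percentage by which `new` is smaller than `old`, relative to `old`."""
    return 100.0 * (old - new) / old


# AC mean and standard deviation of the proposed method (Table 3, first row).
proposed_mean, proposed_std = 99.0, 1.3
# AC mean and standard deviation of the five previous methods, in table order.
previous_means = [92.0, 82.8, 84.0, 85.5, 76.9]
previous_stds = [4.1, 6.7, 10.2, 7.0, 8.6]

mean_gains = [relative_increase(proposed_mean, m) for m in previous_means]
std_cuts = [relative_reduction(proposed_std, s) for s in previous_stds]
```

For example, the first gain is (99.0 − 92.0) / 92.0 ≈ 7.61%, which agrees with the quoted 7.60% once the value is truncated to two decimals.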

Table 4
The classification results of the proposed method based on RBFSVM, LINSVM, and RF classification models using different EEG feature sets in terms of the percentage (%) of the mean and standard deviation (Mean ± Std) of AC, SE, SP, F1, and FDR metrics.
Feature set               Classifier   AC           SE           SP            F1           FDR
Combining feature sets    LINSVM       98.8 ± 1.6   98.6 ± 3.0   98.9 ± 2.3    98.6 ± 1.9   1.2 ± 2.8
Combining feature sets    RBFSVM       99.0 ± 1.3   98.4 ± 2.5   99.6 ± 1.0    98.9 ± 1.3   0.4 ± 1.3
Combining feature sets    RF           98.0 ± 2.6   99.1 ± 2.7   97.0 ± 4.8    97.9 ± 2.6   3.0 ± 4.8
Statistical               LINSVM       86.6 ± 6.2   86.8 ± 8.2   86.5 ± 4.8    83.6 ± 6.4   14.1 ± 5.2
Statistical               RBFSVM       84.3 ± 3.9   81.6 ± 2.0   86.8 ± 6.8    83.7 ± 3.6   13.8 ± 6.9
Statistical               RF           86.6 ± 6.5   84.6 ± 4.8   88.4 ± 8.4    86.1 ± 6.4   12.0 ± 8.7
Spectral                  LINSVM       92.1 ± 4.7   91.7 ± 6.0   92.4 ± 4.5    92.1 ± 4.6   7.4 ± 4.1
Spectral                  RBFSVM       92.5 ± 7.6   91.9 ± 7.5   93.2 ± 8.2    92.6 ± 7.4   6.5 ± 7.9
Spectral                  RF           91.7 ± 4.7   89.5 ± 5.7   93.7 ± 6.6    91.4 ± 4.5   6.3 ± 6.3
Wavelet                   LINSVM       85.4 ± 3.7   82.4 ± 7.0   88.9 ± 5.4    84.4 ± 4.9   12.7 ± 7.7
Wavelet                   RBFSVM       84.9 ± 6.0   84.3 ± 1.0   85.3 ± 5.7    84.0 ± 8.2   16.0 ± 7.8
Wavelet                   RF           85.8 ± 6.0   81.2 ± 9.3   90.5 ± 6.8    84.5 ± 6.6   10.9 ± 7.9
Nonlinear                 LINSVM       86.6 ± 4.1   86.9 ± 5.6   86.2 ± 5.3    86.8 ± 3.9   13.0 ± 3.7
Nonlinear                 RBFSVM       87.4 ± 3.9   87.3 ± 6.8   87.8 ± 3.5    86.9 ± 3.8   13.1 ± 4.1
Nonlinear                 RF           86.6 ± 6.5   84.6 ± 4.8   88.4 ± 8.4    86.1 ± 6.4   12.0 ± 8.7
Functional connectivity   LINSVM       92.1 ± 8.1   95.3 ± 6.2   89.0 ± 10.2   92.3 ± 8.0   10.4 ± 9.7
Functional connectivity   RBFSVM       93.3 ± 8.2   93.9 ± 7.5   92.8 ± 9.2    93.2 ± 8.4   7.3 ± 9.4
Functional connectivity   RF           93.2 ± 7.8   95.0 ± 5.3   91.6 ± 11.0   93.4 ± 7.6   7.8 ± 10.3

results based on each feature set and integrated feature sets using the mentioned classifiers.
It is clear from the provided results in Table 4 that the integrated EEG feature sets achieved a better performance than each individual EEG feature set. By comparing the results, it can be concluded that the integrated feature sets obtained the highest average of AC, SE, SP, and F1 parameters and the lowest mean of the FDR metric. In addition, the integrated feature sets provided the classification results with the lowest standard


deviation of evaluation metrics. It is worth mentioning that the best feature set obtained the highest mean of AC (AC=93.3%). The spectral,
classification performance of the integrated feature sets was obtained by nonlinear, and statistical feature sets achieved the second, third, and
RBFSVM classifier, which provided a classification performance with an fourth highest mean of AC, respectively. It is worth mentioning that the
average AC of 99.0%, SE of 98.4%, SP of 99.6%, F1 of 98.9%, and FDR of highest average of AC by spectral, nonlinear, and statistical feature sets
0.4%. were 92.5%, 87.4%, and 86.6%, respectively. The wavelet feature set
As demonstrated in Table 4, the functional connectivity feature set achieved the lowest AC mean compared to the other EEG feature sets by
achieved the highest classification accuracy compared to the other EEG providing the average AC of 85.8%.
feature sets. In other words, it obtained the highest mean of AC, SE, and
F1 measures among all EEG feature sets. These results show the supe­ 3.5. EEG signal power analysis
riority of the functional connectivity feature set compared to the other
feature sets for the automatic classification of MDD and HC subjects The main goal of this section is to investigate the EEG signal power
based on EEG signals. For this feature set, RBFSVM classifier provided differences in common EEG frequency bands at different scalp regions
the best classification performance with an average AC of 93.3%, SE of between MDD and HC samples. To this end, the alpha, theta, beta, and
93.9%, SP of 92.8%, F1 of 93.2%, and FDR of 7.3%. delta band powers of MDD and HC cases were analyzed by t-test and f-
According to Table 4, the second best feature set for the automatic test methods to measure the difference between the two groups by the
classification of MDD and HC subjects based on EEG signals was the mentioned features. Table 5 lists the t-test and f-test results on the alpha,
spectral feature set. By comparing the functional connectivity and theta, beta, and delta EEG signal powers of MDD and HC cases in each
spectral feature sets results, the spectral feature set was determined to channel. The delta power provided the most significant difference be­
obtain the higher means of SP metric and lower means of FDR measure. tween MDD and HC samples in all brain regions based on the reported
It indicated that this feature set is more accurate to classify HC samples. results. The second best EEG power for MDD and HC discrimination was
For this feature set, RBFSVM classification model obtained the best classification performance by providing an average AC of 92.5%, SE of 91.9%, SP of 93.2%, F1 of 92.6%, and FDR of 6.5%.

According to the reported results in Table 4, the third best feature set in terms of classification performance was the nonlinear feature set, and its best performance was achieved by RBFSVM classifier (AC=87.4%, SE=87.3%, SP=87.8%, F1=86.9%, and FDR=13.1%).

Among these feature sets, the statistical and wavelet feature sets obtained the worst classification performance, respectively. For the statistical feature set, LINSVM and RF classifiers performed almost identically and achieved the highest classification performance. The best classification results with the wavelet feature set were obtained by RF classification model (AC=85.8%, SE=81.2%, SP=90.5%, F1=84.5%, and FDR=10.9%). Therefore, it can be interpreted that the wavelet feature set obtained the worst classification performance compared to the other EEG feature sets.

Fig. 4 exhibits the bar plot of the average accuracies achieved by each EEG feature set and the combination of them using LINSVM, RBFSVM, and RF classification models. As depicted in Fig. 4, the combination of all feature sets achieved the highest mean of AC metric (AC=99.0%). Among the feature sets, the functional connectivity

the theta power. Nevertheless, the alpha and beta powers could not provide a significant difference between MDD and HC samples. However, the alpha power could provide more discrimination between MDD and HC samples compared to the beta power, especially in occipital and temporal regions. In terms of brain regions, the frontal, temporal, and parietal provided the most discrimination between HC and MDD samples using delta and theta powers. The best scalp regions for alpha power were temporal and occipital regions. It was also observed that the beta powers of the temporal region, especially on the right side, provided more discrimination than the beta powers of other brain regions. Fig. 5 illustrates the boxplots of alpha, theta, beta, and delta EEG signal powers of MDD and HC samples at the frontal, temporal, parietal, occipital, and central scalp regions. As demonstrated in Fig. 5, the most discriminative EEG signal powers between MDD and HC subjects in all regions were delta, theta, alpha, and beta powers, respectively. Fig. 5 shows that the delta power provided the most significant difference between the MDD and HC classes. It has been observed that the theta and alpha signal powers provided a slight difference between the MDD and HC samples. However, the theta signal power provided more discrimination between MDD and HC samples than the alpha signal power. Nonetheless, the beta signal power in all regions did not provide significant differences between the MDD and HC cases.
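The significance statements in this subsection rest on per-feature two-sample tests between the HC and MDD groups. The sketch below shows, under stated assumptions, how the underlying statistics can be computed with the Python standard library: a pooled-variance (Student) t statistic and a variance-ratio F statistic. It assumes the "f-test" refers to the two-sample variance-ratio test, which this section does not spell out. The function names and the sample values are hypothetical and are not data from the study. Turning the statistics into p-values needs the t and F distributions, which this stdlib-only sketch omits; when every feature has the same group sizes, ordering by |t| (largest first) gives the same order as sorting by two-sided p-value (smallest first).

```python
from statistics import mean, variance


def t_statistic(a, b):
    """Pooled-variance (Student) two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def f_statistic(a, b):
    """Ratio of the two sample variances (assumed meaning of 'f-test')."""
    return variance(a) / variance(b)


def rank_by_t(hc, mdd, top=10):
    """Order feature names by |t|, largest first, and keep the first `top`."""
    scores = {name: abs(t_statistic(hc[name], mdd[name])) for name in hc}
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Made-up values for two hypothetical features; not data from the study.
hc = {"feat_a": [4.1, 3.8, 4.4, 4.0, 4.2], "feat_b": [1.0, 1.6, 0.7, 1.3, 0.9]}
mdd = {"feat_a": [3.0, 2.7, 3.3, 2.9, 3.1], "feat_b": [1.1, 1.5, 0.6, 1.4, 1.0]}

t_a = t_statistic(hc["feat_a"], mdd["feat_a"])
f_a = f_statistic(hc["feat_a"], mdd["feat_a"])
ranking = rank_by_t(hc, mdd)
print(ranking)
```

In this toy case `feat_a` separates the groups far more strongly than `feat_b`, so it is ranked first.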
Fig. 6 shows the constructed scalp topographic maps of HC and MDD
samples in terms of average powers of alpha, delta, beta, and theta
frequency bands in each EEG channel. As depicted in Fig. 6, the delta
power in all brain regions provided significant differences between HC
and MDD cases. In addition, the theta power showed significant differences
between HC and MDD cases, especially in temporal and frontal
regions. In other words, it was observed that MDD cases had lower delta
and theta activity compared to HC cases. Also, the MDD class had lower
average alpha powers than the HC class in frontal, occipital, parietal,
and central regions. In terms of beta power, the right temporal region
provided more discrimination between HC and MDD cases than other
regions. However, the beta power did not significantly differentiate HC
and MDD cases in other scalp regions.
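As an illustration of the quantity compared in this subsection, the following minimal sketch computes the power of a single-channel signal in the four frequency bands from a plain periodogram. The band edges are conventional values and an assumption here, since the exact limits used in the study are not restated in this section. The function name `band_powers`, the sampling rate, and the synthetic 10 Hz test signal are likewise hypothetical, and the naive DFT is chosen only to keep the example free of external dependencies.

```python
import cmath
import math

# Conventional band edges in Hz (an assumption, not taken from the paper).
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def band_powers(signal, fs):
    """Sum the periodogram power of `signal` (sampled at `fs` Hz) per band."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        # Naive DFT coefficient for frequency bin k.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        power = abs(coeff) ** 2 / n
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += power
    return powers


# Synthetic one-second 10 Hz sine sampled at 128 Hz (hypothetical test signal).
fs = 128
toy_signal = [math.sin(2 * math.pi * 10 * i / fs) for i in range(fs)]
powers = band_powers(toy_signal, fs)
print(max(powers, key=powers.get))
```

For this toy signal all of the power falls in the alpha band, which is the kind of per-band value that the boxplots and topographic maps summarize across channels and subjects.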

Fig. 4. The bar plot of the obtained AC metrics achieved by each EEG feature set and the combination of them using RBFSVM, LINSVM, and RF classification models.

3.6. Functional connectivity analysis

According to the mentioned results, the best individual EEG-derived feature set for discrimination between HC and MDD cases is the functional connectivity feature set. In this subsection, the most significant functional connectivity features which discriminate HC and MDD cases are investigated. To this end, the functional connectivity features of MDD and HC cases were ranked using the p-value metric of t-test method, and ten top of them were determined. For more investigation, these ten top features were also analyzed by f-test method. Table 6

R.A. Movahed et al. Journal of Neuroscience Methods 358 (2021) 109209

Table 5
The t-test results on the alpha, theta, beta, and delta EEG band powers in each EEG channel. Bolded items indicate p-value < 0.01.

                   T-test p-value                              F-test p-value
Region    Channel  Alpha      Beta       Delta      Theta      Alpha      Beta       Delta      Theta
Frontal   Fp1      0.0280     0.0106     1.44e-26   1.46e-09   0.1263     0.1835     2.65e-57   7.86e-67
Frontal   Fp2      0.0110     0.0767     2.90e-28   4.91e-14   0.0763     0.0002     2.86e-32   5.80e-05
Frontal   F7       0.0008     0.0019     3.17e-24   1.13e-19   0.3364     0.9139     1.04e-115  5.40e-09
Frontal   F3       0.1003     0.1058     1.28e-32   3.75e-11   0.0741     0.7695     5.48e-75   0.1676
Frontal   Fz       0.1780     0.6787     1.13e-32   2.37e-07   0.0190     0.0025     2.17e-54   3.92e-51
Frontal   F4       0.1103     0.0472     4.20e-36   2.13e-10   0.0004     0.2796     2.30e-55   2.79e-18
Frontal   F8       0.0011     0.0001     1.56e-41   1.15e-13   0.0206     0.0001     1.15e-82   6.71e-66
Central   C3       0.0556     0.0707     2.08e-34   6.89e-12   0.0079     3.03e-05   3.19e-60   5.46e-22
Central   Cz       0.2725     0.0522     4.69e-35   1.56e-10   0.0002     3.59e-07   1.53e-50   0.0173
Central   C4       0.4135     0.5613     3.91e-37   2.35e-11   0.0001     8.31e-08   1.66e-50   0.2253
Occipital O1       5.50e-06   0.0011     2.36e-37   1.15e-11   8.56e-21   0.0018     1.02e-70   9.34e-67
Occipital O2       9.07e-06   0.0044     5.12e-36   2.12e-16   2.51e-39   0.0038     1.08e-61   1.26e-08
Parietal  P3       0.0094     0.0099     2.95e-36   6.26e-12   0.1228     7.51e-04   1.64e-61   0.0109
Parietal  P4       0.0103     0.0384     1.17e-40   4.00e-09   0.5274     7.44e-09   8.72e-53   7.42e-60
Parietal  Pz       0.0185     0.0656     6.36e-39   5.70e-12   0.6026     1.16e-05   7.15e-54   2.62e-12
Temporal  T3       1.22e-05   0.2344     8.37e-33   2.46e-15   0.0125     0.2013     1.27e-76   1.99e-30
Temporal  T5       1.65e-05   0.3583     9.07e-39   2.20e-17   0.0089     0.8226     3.69e-71   2.56e-07
Temporal  T4       1.07e-05   0.0001     4.01e-42   1.48e-28   0.9287     2.02e-06   1.23e-72   2.16e-20
Temporal  T6       1.37e-09   0.0001     7.89e-46   1.41e-11   1.15e-27   0.3326     1.91e-66   6.45e-122

reports ten top significant functional connectivity features with their t-test and f-test results. Also, the boxplot of ten top significant functional connectivity features of MDD and HC cases is illustrated in Fig. 7. According to these results, the computed statistical significance of ten top functional connectivity features using t-test method for the difference between MDD and HC cases is less than 1e-6, which indicates the significant difference between the classes using each of these features. These ten top features were the F4-F8, T5-P4, T5-T6, Fz-Fp2, Fp1-Fz, T5-T4, Fp1-Fp2, T5-C4, Fz-F8, and T5-Cz, which provided the most significant difference between HC and MDD cases among functional connectivity features. Additionally, the exhibited boxplots in Fig. 7 show that these ten top features can lead to significant differences between MDD and HC cases. Moreover, it can be interpreted that the functional connectivity between frontal and temporal scalp regions provided the most significant differences between MDD and HC classes.

3.7. Selected features

In the experimental setup of this study, SBFS method returns a specific subset of features as selected features in each iteration of 10-fold cross-validation execution, which resulted in 10 subsets of features for all iterations. In this subsection, the intersection of all these 10 subsets is reported and investigated. Table 7 lists these features with their details. The number of these features was 506, of which 108 belonged to the spectral set, 171 belonged to the functional connectivity set, 77 belonged to the statistical set, 54 belonged to the wavelet set, and 96 belonged to the nonlinear set. It was observed that all functional connectivity and spectral features were selected in all iterations and had the largest share in the selected features, while statistical and wavelet features had the least share in the selected features.

3.8. Validation on independent dataset

In order to evaluate the proposed method on another independent dataset, we applied the MODMA EEG dataset of MDD and HC subjects (Cai et al., 2020) to the proposed method using 10-fold cross-validation. This dataset contains 128-channel resting-state EEG signals acquired from 24 MDD patients and 29 HC subjects. It should be mentioned that the 19 EEG channels compatible with the proposed framework were extracted from each signal to prepare the mentioned dataset for validating the proposed framework. Table 8 reports the obtained results of this assessment using the three best classification models. As shown in Table 8, LINSVM provided the best classification performance by obtaining an average AC of 97.3%, SE of 98.3%, SP of 97.0%, F1 of 97.6%, and FDR of 2.5%. By comparing the results in Table 8 and the results of LINSVM, RBFSVM, and RF classifiers in Table 1, it can be interpreted that the obtained results by the proposed method were close in both cases, and it provided an acceptable and robust performance on both datasets.

4. Discussion

This study proposes an EEG-based machine learning framework to discriminate MDD and HC cases automatically. The proposed framework utilized various EEG-derived features such as statistical, spectral, wavelet, functional connectivity, and nonlinear features. In other words, the main objective of this study was to analyze the combination of different types of EEG-derived features to classify MDD and HC subjects. Also, different classification models were evaluated and compared to select the best one for the proposed framework. In order to select the best subset of features, SBFS algorithm was used. According to the reported results in Table 1, the best classification performance on the Mumtaz et al. (Mumtaz et al., 2017) EEG dataset, was provided by RBFSVM model by obtaining an average AC of 99%, SE of 98.4%, SP of 99.6%, F1 of 98.9%, and FDR of 0.4%. Also, the best classification performance on MODMA dataset (Cai et al., 2020) as the independent dataset, was provided by LINSVM model by providing an average AC of 97.3%, SE of 98.3%, SP of 97.0%, F1 of 97.6%, and FDR of 2.5%. These results on both datasets indicate the accurate and robust classification performance of the proposed framework. Furthermore, each set of EEG-derived features was individually evaluated in the proposed framework for discriminating the EEG signals of MDD and HC cases.
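The performance figures quoted for both datasets (AC, SE, SP, F1, and FDR, reported as averages over the 10 cross-validation folds) can be illustrated with a short sketch. It uses the standard confusion-matrix definitions of these metrics, which are assumed here to match the ones used in the study, and it treats MDD as the positive class, which is also an assumption. The function name and the per-fold counts are hypothetical and are not taken from the paper; the sample standard deviation is used, although the source does not state which estimator was applied.

```python
from statistics import mean, stdev


def binary_metrics(tp, fn, tn, fp):
    """Return AC, SE, SP, F1, and FDR in percent for one confusion matrix."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "AC": 100 * (tp + tn) / (tp + fn + tn + fp),
        "SE": 100 * sensitivity,
        "SP": 100 * tn / (tn + fp),
        "F1": 100 * 2 * precision * sensitivity / (precision + sensitivity),
        "FDR": 100 * fp / (tp + fp),
    }


# Hypothetical per-fold counts (tp, fn, tn, fp); not results from the paper.
folds = [(10, 0, 10, 0), (9, 1, 10, 0), (10, 0, 9, 1), (10, 0, 10, 0)]
per_fold = [binary_metrics(*counts) for counts in folds]

# Mean and sample standard deviation of every metric across the folds.
summary = {
    name: (
        mean([m[name] for m in per_fold]),
        stdev([m[name] for m in per_fold]),
    )
    for name in per_fold[0]
}
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg:.1f} ± {sd:.1f}")
```

The sketch does not guard against folds with no positive predictions, where precision and FDR are undefined.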


Fig. 5. The boxplots of the delta, alpha, theta, and beta signal powers of MDD and HC subjects at frontal (a), central (b), occipital (c), temporal (d), and parietal (e) areas of the scalp.

Fig. 6. The scalp topographic plots of average powers of delta (a), alpha (b), theta (c), and beta (d) frequency bands.

Although the obtained results confirm the high potential of all proposed EEG-based feature sets to distinguish MDD patients and HC subjects, the combination of all proposed EEG-based feature sets provided the best classification performance. Among the feature sets, the functional connectivity and spectral feature sets achieved the best performance, respectively. Another advantage of these feature sets is that they could also provide some biological information about MDD and some MDD-related mental states and cognitive tasks. In addition, these feature sets had the largest share in the intersection of returned feature subsets by SBFS during 10-fold cross-validation and were selected in all iterations. It is worth mentioning that the wavelet feature set could obtain better results by changing its hyperparameters, such as window function, decomposition level, and combining the wavelet features of different decomposition levels. Besides, analyzing EEG signal powers of


MDD and HC samples was also done in alpha, delta, beta, and theta frequency bands. Based on the obtained results, the delta and theta powers provided the most significant difference between MDD and HC cases in all brain regions, respectively. The distinction made by theta activity may be due to the relation between theta activity and emotional processing because MDD patients suffer from feelings of sadness, emptiness, or hopelessness. On the other side, the distinction made by delta activity may be associated with the inability to think and relaxation disturbances of MDD patients. Moreover, it was found that MDD cases had lower theta and delta activities compared to HC subjects. This finding is consistent with the results of some previous studies (Mumtaz et al., 2017; Saletu et al., 2010; Knott et al., 2001; Coutin-Churchman and Moreno, 2008). It was also observed that the theta and delta band powers of frontal and temporal regions provided more difference between MDD and HC cases than these powers of other regions. This is because the frontal and temporal regions are involved in emotional regulation, thinking, attention, and other higher executive functions, and MDD patients are different from HC participants in these functions. It was also found that MDD cases had a lower average of alpha activity in frontal, occipital, parietal, and central regions than HC cases. It is in accordance with the results of some studies related to MDD (Mumtaz et al., 2017; Begić et al., 2011). Additionally, the functional connectivity features of MDD and HC cases were statistically investigated, and the ten top of them for discrimination between HC and MDD cases were reported. The obtained results show that the functional connectivity coefficients between frontal and temporal regions provide the most significant discrimination between MDD and HC subjects. These results are consistent with the finding of some studies related to the abnormalities in brain regions of MDD patients (Sheline, 2003; Koolschijn et al., 2009; Mumtaz et al., 2018; Olbrich et al., 2014; Leuchter et al., 2012). This discrimination may be due to the frontal and temporal regions’ functions and their connections, which are associated with some MDD-related mental states and cognitive tasks such as elicitation and recognition of emotions, attention, thinking, problem-solving, and memory-related functions. The execution time of the proposed method for a 1-min EEG signal with 19 channels was about 3 min and 48 s in MATLAB using a laptop pc with 12 GB RAM and Intel Core i7-3537U @ 2.00 GHz CPU. This performance is approximately acceptable for real-time applications. However, a C/C++ implementation could improve the computational cost and the execution time of the proposed method for CAD applications.

Table 6
Ten top functional connectivity features in terms of discrimination between HC and MDD classes with their p-values of t-test and f-test methods.

Functional connectivity feature  t-test p-value  f-test p-value
F4-F8                            4.62e-17        7.00e-10
T5-P4                            4.73e-12        7.79e-04
T5-T6                            2.76e-11        0.0153
Fz-Fp2                           1.28e-10        3.02e-11
Fp1-Fz                           5.92e-10        1.99e-07
T5-T4                            9.73e-10        0.2088
Fp1-Fp2                          2.59e-09        0.2008
T5-C4                            3.71e-09        7.21e-06
Fz-F8                            4.56e-09        2.33e-04
T5-Cz                            4.28e-08        0.0272

Fig. 7. The boxplot of ten top functional connectivity features of MDD and HC subjects.

The comparison between the proposed framework and other state-of-the-art methods for diagnosing MDD based on EEG signals is provided in Table 9. Based on the summarized results in Table 9, the proposed framework with the highest classification accuracy among state-of-the-art methods outperforms the other state-of-the-art techniques implemented on the Mumtaz et al. (Mumtaz et al., 2017) EEG dataset. The main contribution of the proposed method compared to the other state-of-the-art methods is using different EEG-based features extracted by various analytical methods, while previous works have not utilized the integration of various features in such a manner. The reported results show that integrating all proposed feature sets led to the best classification performance, outperforming the previous works of MDD diagnosis based on EEG signals. Also, SBFS algorithm was used in this study to select the best subset of features to improve the classification performance and reduce the computational cost. It is noteworthy that this paper utilized a sequential feature selection method for the first time, while other state-of-the-art approaches used other feature selection methods such as rank-based feature selection, PCA, and multi-cluster feature selection methods (Mumtaz et al., 2017; Mumtaz et al., 2017; Mumtaz et al., 2018; Mahato and Paul, 2019; Mahato and Paul, 2020). Despite the fact that SBFS algorithm positively impacts the classification performance of the proposed framework, it increases the computational load of the model during the training phase for selecting the best subset of features as well. From another point of view, this study utilized some ensemble classifiers such as RB, RF, and GB that had not been used in the


previous studies. The obtained results indicate the acceptable performances of these classifiers to classify MDD and HC subjects based on EEG signals.

It should be mentioned that the present study has a few limitations. The major limitation is that the used datasets have a small number of samples. Therefore, the reported high classification accuracies may not generalize well. However, we attempted to compensate for this limitation by using the data augmentation process. Nonetheless, the generalization of this method and similar approaches requires further EEG datasets of depression disorder. On the other side, most of the studies related to the automatic EEG-based MDD diagnosis methods used their private datasets, and even these private databases have a small amount of data. Therefore, the lack of public EEG datasets of MDD with more subjects is very prominent. Generally, public datasets provide new opportunities for collaborations, and they are very useful for generalizing the validation of the proposed approaches. Another limitation of the proposed method is its high computational load. Although the combination of all feature sets provided the highest classification accuracy, it generated a high dimensional feature matrix and increased the computational load of the method as well. In addition, it complicated the interpretation of the framework in terms of physiological and biomarker variables. Moreover, the clinical applications of this study and all EEG-based machine learning techniques for MDD diagnosis are not very clear. The clinical effectiveness of these methods requires more clinical experimental evidence. For instance, it has not been tested in this and previous studies whether the detection was correct when the treatment of MDD has succeeded. Furthermore, the generalization of this paper and similar studies needs further EEG datasets of depression disorder. Future studies can also focus on automatic severity scaling of depression based on EEG signals, because few studies address automatic depression severity scaling using EEG signals.

However, this paper presents a machine learning scheme for diagnosing MDD based on EEG signals, which can be used by clinical psychiatrist centers as a CAD tool. Compared with conventional diagnosis techniques for depression, the proposed approach does not suffer from subjectivity in treatment conditions, and it is less prone to human errors. Unlike the existing clinical diagnostic approaches that may be laborious and time-consuming, the proposed method provides a quick MDD diagnostic performance without direct human intervention.

As future work, providing new EEG datasets of depression patients with different severity levels could be considered. The concatenation of the conventional machine learning methods with deep learning techniques could provide a novel EEG-based approach for the automatic depression diagnosis. Furthermore, the merging of psychophysiological characteristics and other biological signals with EEG may provide a novel depression diagnosis approach with a more clinical perspective. In addition, proposing machine learning techniques for automatic severity scaling of depression could be considered in future works.

Table 7
The intersection of the selected features in all iterations of 10-fold cross-validation.

Feature set              Selected features
Functional connectivity  All features
Spectral                 All features
Statistical              Average (Fp1, Fp2, F3, F8, Fz, T3, T6, P3, P4, O2, C3, C4)
                         Skewness (Fp1, F7, F8, Fz, T3, T6, P4, Pz, O1, C3, C4, Cz)
                         Kurtosis (F7, F8, Fz, T3, T6, P4, Pz, O1, C3, C4, Cz)
                         Minimum (Empty)
                         Maximum (Empty)
                         Activity (F4, F7, F8, Fz, T3, T4, T5, P4, Pz, O2, C3, C4, Cz)
                         Mobility (Fp1, Fp2, F3, F7, F8, Fz, T4, T5, T6, P3, Pz, O1, C3, Cz)
                         Complexity (Fp1, F3, F4, F7, Fz, T3, T4, T5, T6, P4, Pz, O1, O2, C3, C4)
Wavelet                  RWE1 (Fp1, Fp2, F4, F8, T3, T4, T5, P3, P4, O2, C3, C4, Cz)
                         RWE2 (Fp1, Fp2, F3, F4, F8, Fz, T4, T5, T6, Pz, O1, O2, C3, C4, Cz)
                         RWE3 (Fp1, Fp2, F4, F7, F8, T3, T6, P3, P4, Pz, O2, C4)
                         WE (Fp2, F3, F4, Fz, T4, T5, T6, P4, Pz, O1, O2, C3, C4, Cz)
Nonlinear                Detrended fluctuation analysis (F3, F4, Fz, T3, T4, T5, T6, P4, Pz, O1, C3)
                         Higuchi (Fp1, Fp2, F3, F7, Fz, T4, T6, P3, P4, Pz, O2, C4, Cz)
                         Correlation dimension (Fp1, Fp2, F4, F8, Fz, T4, T5, T6, P3, P4, C3, C4, Cz)
                         Lyapunov exponent (Fp1, F3, F4, F8, Fz, T3, T5, P4, O2, C3, C4, Cz)
                         C0-complexity (Fp1, Fp2, F4, F7, T3, T6, P3, P4, Pz, O1, C4, Cz)
                         Kolmogorov entropy (Fp1, Fp2, F3, F4, F8, T3, T5, T6, Pz, O1, O2, C3, C4, Cz)
                         Shannon entropy (F3, F4, F7, Fz, T3, T6, P3, O1, Cz)
                         Approximate entropy (Fp1, Fp2, F3, F7, Fz, T3, P3, P4, O1, O2, C3, C4)

Table 8
The classification results of the proposed method on MODMA dataset based on three best classifiers in terms of the percentage (%) of the mean and standard deviation of AC, SE, SP, F1, and FDR metrics. All values are Mean ± Std.

Classifier  AC          SE          SP           F1          FDR
LINSVM      97.3 ± 5.1  98.3 ± 3.5  97.0 ± 9.4   97.6 ± 4.5  2.5 ± 7.9
RBFSVM      95.7 ± 4.1  95.9 ± 6.9  96.1 ± 6.8   95.6 ± 4.1  4.0 ± 6.7
RF          95.7 ± 6.4  96.3 ± 6.2  95.0 ± 12.6  95.6 ± 5.8  4.1 ± 5.8

Table 9
Comparison of the classification results between the proposed method and previous works for identifying MDD and HC subjects based on EEG signals.

Study                    Year  EEG features                                                                     Classifiers                                 Reported results
(Mumtaz et al., 2017)    2017  Spectral Features                                                                LR, SVM, and NB                             85.5%
(Mumtaz et al., 2017)    2017  Wavelet Coefficients                                                             LR                                          76.9%
(Mumtaz et al., 2018)    2018  Functional Connectivity                                                          LR, SVM, and NB                             92.0%
(Mahato and Paul, 2019)  2019  Spectral and Wavelet-based Features                                              MLPNN, RBFN, LDA, and QDA                   84.0%
(Mahato and Paul, 2020)  2020  Spectral Features                                                                SVM, LR, NB, and DT                         82.8%
Proposed framework       2020  Statistical, Spectral, Wavelet, Functional Connectivity and Nonlinear Features  LINSVM, RBFSVM, LR, DT, RB, NB, GB, and RF  99%

5. Conclusion

In this paper, a machine learning framework for MDD diagnosis was introduced using the integration of different types of EEG-derived


features. These features were extracted using statistical, spectral, wavelet, functional connectivity, and nonlinear analysis methods. To select the best subset of the extracted features and improve the classification performance, SBFS was utilized in this study. In addition, various classifiers were employed in the proposed framework, of which RBFSVM achieved the best classification performance. The results confirm that the proposed EEG feature sets provide high classification performances. The results also indicate that the integration of all EEG feature sets leads to the best classification performance. According to the obtained results, the combination of all EEG feature sets along with using RBFSVM classifier achieved an average AC of 99%, SE of 98.4%, SP of 99.6%, F1 of 98.9%, and FDR of 0.4%, indicating the better performance of the combined features compared to the performance of each EEG feature set. Besides, the obtained results demonstrate that the proposed method outperforms the state-of-the-art approaches for EEG-based automatic MDD diagnosis. The high potential of the proposed framework to identify MDD patients, based on the obtained results, would encourage the medical equipment industries to develop a CAD system for MDD diagnosis using the proposed framework. Future research can focus on providing more EEG datasets of depression disorder with different severity levels, the concatenation of the conventional machine learning methods with novel deep learning techniques, proposing machine learning approaches for automatic severity scaling of depression, and merging psychophysiological properties and other physiological signals with EEG for automatic diagnosis of depression.

Declaration of interests

None.

References

Seligman P, M., 1975. Helplessness: On Depression, Development, and Death.
Marcus, M., Yasamy, M.T., van Ommeren, M.v., Chisholm, D., Saxena, S., 2012. Depression: A Global Public Health Concern.
Organization, W.H., 2001. The World Health Report 2001: Mental Health: New Understanding, New Hope. World Health Organization.
Castillo, R., Carlat, D., Millon, T., Millon, C., Meagher, S., Grossman, S., Association, A.P., et al., 2007. Diagnostic and Statistical Manual of Mental Disorders. American Psychiatric Association Press, Washington, DC.
Folstein, M.F., Robins, L.N., Helzer, J.E., 1983. The mini-mental state examination. Arch. Gen. Psychiatry 40 (7), 812–812.
Hamilton, M., 1960. A rating scale for depression. J. Neurol. Neurosurg. Psychiatry 23 (1), 56.
Behnam, M., Pourghassem, H., 2017. Seizure-specific wavelet (seizlet) design for epileptic seizure detection using correntropy ellipse features based on seizure modulus maximas patterns. J. Neurosci. Methods 276, 84–107.
Song, Y., Crowcroft, J., Zhang, J., 2012. Automatic epileptic seizure detection in EEGs based on optimized sample entropy and extreme learning machine. J. Neurosci. Methods 210 (2), 132–146.
Xiang, J., Li, C., Li, H., Cao, R., Wang, B., Han, X., Chen, J., 2015. The detection of epileptic seizure signals based on fuzzy entropy. J. Neurosci. Methods 243, 18–25.
Acharya, U.R., Oh, S.L., Hagiwara, Y., Tan, J.H., Adeli, H., 2018a. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med. 100, 270–278.
Direito, B., Teixeira, C.A., Sales, F., Castelo-Branco, M., Dourado, A., 2017. A realistic seizure prediction study based on multiclass SVM. Int. J. Neural Syst. 27 (03), 1750006.
Wei, X., Zhou, L., Zhang, Z., Chen, Z., Zhou, Y., 2019. Early prediction of epileptic seizures using a long-term recurrent convolutional network. J. Neurosci. Methods 327, 108395.
Ghaderyan, P., Abbasi, A., Sedaaghi, M.H., 2014. An efficient seizure prediction method using KNN-based undersampling and linear frequency measures. J. Neurosci. Methods 232, 134–142.
Hirschauer, T.J., Adeli, H., Buford, J.A., 2015. Computer-aided diagnosis of Parkinson’s disease using enhanced probabilistic neural network. J. Med. Syst. 39 (11), 179.
Yuvaraj, R., Murugappan, M., Acharya, U.R., Adeli, H., Ibrahim, N.M., Mesquita, E., 2016. Brain functional connectivity patterns for emotional state classification in Parkinson’s disease patients without dementia. Behav. Brain Res. 298, 248–260.
Shim, M., Hwang, H.-J., Kim, D.-W., Lee, S.-H., Im, C.-H., 2016. Machine-learning-based diagnosis of schizophrenia using combined sensor-level and source-level EEG features. Schizophr. Res. 176 (2–3), 314–319.
Durongbhan, P., Zhao, Y., Chen, L., Zis, P., De Marco, M., Unwin, Z.C., Venneri, A., He, X., Li, S., Zhao, Y., et al., 2019. A dementia classification framework using frequency and time-frequency features based on EEG signals. IEEE Trans. Neural Syst. Rehabil. Eng. 27 (5), 826–835.
Hassan, A.R., Bhuiyan, M.I.H., 2016. A decision support system for automatic sleep staging from EEG signals using tunable q-factor wavelet transform and spectral features. J. Neurosci. Methods 271, 107–118.
Lajnef, T., Chaibi, S., Ruby, P., Aguera, P.-E., Eichenlaub, J.-B., Samet, M., Kachouri, A., Jerbi, K., 2015. Learning machines and sleeping brains: automatic sleep stage classification using decision-tree multi-class support vector machines. J. Neurosci. Methods 250, 94–105.
Mousavi, Z., Rezaii, T.Y., Sheykhivand, S., Farzamnia, A., Razavi, S., 2019. Deep convolutional neural network for classification of sleep stages from single-channel EEG signals. J. Neurosci. Methods 324, 108312.
Chinara, S., et al., 2020. Automatic classification methods for detecting drowsiness using wavelet packet transform extracted time-domain features from single-channel EEG signal. J. Neurosci. Methods 347, 108927.
Lachner-Piza, D., Epitashvili, N., Schulze-Bonhage, A., Stieglitz, T., Jacobs, J., Dümpelmann, M., 2018. A single channel sleep-spindle detector based on multivariate classification of EEG epochs: Mussdet. J. Neurosci. Methods 297, 31–43.
Hosseinifard, B., Moradi, M.H., Rostami, R., 2013. Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal. Comput. Methods Programs Biomed. 109 (3), 339–345.
Puthankattil, S.D., Joseph, P.K., 2012. Classification of EEG signals in normal and depression conditions by ANN using RWE and signal entropy. J. Mech. Med. Biol. 12 (04), 1240019.
Acharya, U.R., Sudarshan, V.K., Adeli, H., Santhosh, J., Koh, J.E., Puthankatti, S.D., Adeli, A., 2015. A novel depression diagnosis index using nonlinear features in EEG signals. Eur. Neurol. 74 (1–2), 79–83.
Mumtaz, W., Xia, L., Ali, S.S.A., Yasin, M.A.M., Hussain, M., Malik, A.S., 2017a. Electroencephalogram (EEG)-based computer-aided technique to diagnose major depressive disorder (MDD). Biomed. Signal Process. Control 31, 108–115.
Mumtaz, W., Ali, S.S.A., Yasin, M.A.M., Malik, A.S., 2018. A machine learning framework involving EEG-based functional connectivity to diagnose major depressive disorder (MDD). Med. Biol. Eng. Comput. 56 (2), 233–246.
Mumtaz, W., Xia, L., Yasin, M.A.M., Ali, S.S.A., Malik, A.S., 2017b. A wavelet-based technique to predict treatment outcome for major depressive disorder. PLoS One 12 (2).
Sharma, M., Achuth, P., Deb, D., Puthankattil, S.D., Acharya, U.R., 2018. An automated diagnosis of depression using three-channel bandwidth-duration localized wavelet filter bank with EEG signals. Cognit. Syst. Res. 52, 508–520.
Mahato, S., Paul, S., 2019. Detection of major depressive disorder using linear and non-linear features from EEG signals. Microsyst. Technol. 25 (3), 1065–1076.
Acharya, U.R., Oh, S.L., Hagiwara, Y., Tan, J.H., Adeli, H., Subha, D.P., 2018b. Automated EEG-based screening of depression using deep convolutional neural network. Comput. Methods Programs Biomed. 161, 103–113.
Ay, B., Yildirim, O., Talo, M., Baloglu, U.B., Aydin, G., Puthankattil, S.D., Acharya, U.R., 2019. Automated depression detection using deep representation and sequence learning with EEG signals. J. Med. Syst. 43 (7), 205.
Mumtaz, W., Qayyum, A., 2019. A deep learning framework for automatic diagnosis of unipolar depression. Int. J. Med. Inform. 132, 103983.
Mahato, S., Paul, S., 2020. Classification of depression patients and normal subjects based on electroencephalogram (EEG) signal using alpha power and theta asymmetry. J. Med. Syst. 44 (1), 28.
Spitzer, R.L., Gibbon, M.E., Skodol, A.E., Williams, J.B., First, M.B., 1994. DSM-IV Casebook: A Learning Companion to the Diagnostic and Statistical Manual of Mental Disorders. American Psychiatric Association.
Jasper, H.H., 1958. The ten-twenty electrode system of the international federation. Electroencephalogr. Clin. Neurophysiol. 10, 370–375.
Dien, J., 1998. Issues in the application of the average reference: review, critiques, and recommendations. Behav. Res. Methods Instrum. Comput. 30 (1), 34–43.
Delorme, A., Makeig, S., 2004. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods 134 (1), 9–21.
Pion-Tonachini, L., Kreutz-Delgado, K., Makeig, S., 2019. ICLabel: an automated electroencephalographic independent component classifier, dataset, and website. NeuroImage 198, 181–197.
Winkler, I., Haufe, S., Tangermann, M., 2011. Automatic classification of artifactual ICA-components for artifact removal in EEG signals. Behav. Brain Funct. 7 (1), 30.
Hjorth, B., 1970. EEG analysis based on time domain properties. Electroencephalogr. Clin. Neurophysiol. 29 (3), 306–310.
Freeman, W., Quiroga, R.Q., 2012. Imaging Brain Function with EEG: Advanced Temporal and Spatial Analysis of Electroencephalographic Signals. Springer Science & Business Media.
Abhang, P.A., Gawali, B.W., Mehrotra, S.C., 2016. Introduction to EEG-and Speech-Based Emotion Recognition. Academic Press.
Evans, J.R., Abarbanel, A., 1999. Introduction to Quantitative EEG and Neurofeedback. Elsevier.
Rao, R.P., 2013. Brain-Computer Interfacing: An Introduction. Cambridge University Press.
Li, T.-M., Chao, H.-C., Zhang, J., 2019. Emotion classification based on brain wave: a survey. Human Centric Comput. Inf. Sci. 9 (1), 1–17.
Aftanas, L.I., Varlamov, A.A., Pavlov, S.V., Makhnev, V.P., Reva, N.V., 2002. Time-dependent cortical asymmetries induced by emotional arousal: EEG analysis of event-related synchronization and desynchronization in individually defined frequency bands. Int. J. Psychophysiol. 44 (1), 67–82.
Harmony, T., Fernández, T., Silva, J., Bernal, J., Díaz-Comas, L., Reyes, A., Marosi, E., Rodríguez, M., Rodríguez, M., 1996. EEG delta activity: an indicator of attention to

18
R.A. Movahed et al. Journal of Neuroscience Methods 358 (2021) 109209

internal processing during performance of mental tasks. Int. J. Psychophysiol. 24 En-hua, S., Zhi-jie, C., Fan-ji, G., 2005. Mathematical foundation of a new complexity
(1–2), 161–171. measure. Appl. Math. Mech. 26 (9), 1188–1196.
Knyazev, G.G., 2012. EEG delta oscillations as a correlate of basic homeostatic and Aftanas, L.I., Lotova, N.V., Koshkarov, V.I., Pokrovskaja, V.L., Popov, S.A., Makhnev, V.
motivational processes. Neurosci. Biobehav. Rev. 36 (1), 677–695. P., 1997. Non-linear analysis of emotion EEG: calculation of Kolmogorov entropy
Saletu, B., Anderer, P., Saletu-Zyhlarz, G., 2010. EEG topography and tomography and the principal lyapunov exponent. Neurosci. Lett. 226 (1), 13–16.
(LORETA) in diagnosis and pharmacotherapy of depression. Clin. EEG Neurosci. 41 Shannon, C.E., 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27 (3),
(4), 203–210. 379–423.
Knott, V., Mahoney, C., Kennedy, S., Evans, K., 2001. EEG power, frequency, asymmetry Pincus, S.M., 1991. Approximate entropy as a measure of system complexity. Proc. Natl.
and coherence in male depression. Psychiatry Res. Neuroimaging 106 (2), 123–140. Acad. Sci. U.S.A. 88 (6), 2297–2301.
Coutin-Churchman, P., Moreno, R., 2008. Intracranial current density (LORETA) Squire, L.R., Dronkers, N., Baldo, J., 2009. Encyclopedia of Neuroscience. Elsevier.
differences in QEEG frequency bands between depressed and non-depressed Smith, E.E., Kosslyn, S.M., 2007. Cognitive Psychology: Mind and Brain (International
alcoholic patients. Clin. Neurophysiol. 119 (4), 948–958. Edition).
Liu, M., Zhou, L., Wang, X., Jiang, Y., Liu, Q., 2017. Deficient manipulation of working Uttal, W.R., 2011. Mind and Brain: A Critical Appraisal of Cognitive Neuroscience.
memory in remitted depressed individuals: behavioral and electrophysiological Zhu, X., Wang, X., Xiao, J., Liao, J., Zhong, M., Wang, W., Yao, S., 2012. Evidence of a
evidence. Clin. Neurophysiol. 128 (7), 1206–1213. dissociation pattern in resting-state default mode network connectivity in first-
Nystrom, C., Matousek, M., Hallstrom, T., 1986. Relationships between EEG and clinical episode, treatment-naive major depression patients. Biol. Psychiatry 71 (7),
characteristics in major depressive disorder. Acta Psychiatr. Scand. 73 (4), 390–394. 611–617.
Lee, P.F., Kan, D.P.X., Croarkin, P., Phang, C.K., Doruk, D., 2018. Neurophysiological Wu, D., Yuan, Y., Bai, F., You, J., Li, L., Zhang, Z., 2013. Abnormal functional
correlates of depressive symptoms in young adults: a quantitative EEG study. J. Clin. connectivity of the default mode network in remitted late-onset depression. J. Affect.
Neurosci. 47, 315–322. Disord. 147 (1–3), 277–287.
Roh, S.-C., Park, E.-J., Shim, M., Lee, S.-H., 2016. EEG beta and low gamma power Avery, J.A., Drevets, W.C., Moseman, S.E., Bodurka, J., Barcalow, J.C., Simmons, W.K.,
correlates with inattention in patients with major depressive disorder. J. Affect. 2014. Major depressive disorder is associated with abnormal interoceptive activity
Disord. 204, 124–130. and functional connectivity in the insula. Biol. Psychiatry 76 (3), 258–266.
Begić, D., Popović-Knapić, V., Grubiš in, J., Kosanović-Rajačić, B., Filipčić, I., Sheline, Y.I., 2003. Neuroimaging studies of mood disorder effects on the brain. Biol.
Telarović, I., Jakovljević, M., 2011. Quantitative electroencephalography in Psychiatry 54 (3), 338–352.
schizophrenia and depression. Psychiatria Danubina 23 (4), 355–362. Koolschijn, P.C.M., van Haren, N.E., Lensvelt-Mulders, G.J., Hulshoff Pol, H.E., Kahn, R.
Oppenheim, A., Schafer, R., Buck, J., 1989. Discrete-Time Signal Processing. Prentice- S., 2009. Brain volume abnormalities in major depressive disorder: a meta-analysis
Hall, Englewood cliffs. of magnetic resonance imaging studies. Hum. Brain Mapp. 30 (11), 3719–3735.
Hinrikus, H., Suhhova, A., Bachmann, M., Aadamsoo, K., Võhma, Ü., Lass, J., Tuulik, V., Olbrich, S., Tränkner, A., Chittka, T., Hegerl, U., Schönknecht, P., 2014. Functional
2009. Electroencephalographic spectral asymmetry index for detection of connectivity in major depression: increased phase synchronization between frontal
depression. Med. Biol. Eng. Comput. 47 (12), 1291. cortical eeg-source estimates. Psychiatry Res. 222 (1–2), 91–99.
Rosso, O., Martin, M., Figliola, A., Keller, K., Plastino, A., 2006. Eeg analysis using Leuchter, A.F., Cook, I.A., Hunter, A.M., Cai, C., Horvath, S., 2012. Resting-state
wavelet-based information tools. J. Neurosci. Methods 153 (2), 163–182. quantitative electroencephalography reveals increased neurophysiologic
Jospin, M., Caminal, P., Jensen, E.W., Litvan, H., Vallverdú, M., Struys, M.M., connectivity in depression. PLoS One 7 (2), e32508.
Vereecke, H.E., Kaplan, D.T., 2007. Detrended fluctuation analysis of EEG as a Stam, C.J., Van Dijk, B., 2002. Synchronization likelihood: an unbiased measure of
measure of depth of anesthesia. IEEE Trans. Biomed. Eng. 54 (5), 840–846. generalized synchronization in multivariate data sets. Physica D 163 (3–4), 236–251.
Higuchi, T., 1988. Approach to an irregular time series on the basis of the fractal theory. Pudil, P., Novovičová, J., Kittler, J., 1994. Floating search methods in feature selection.
Physica D 31 (2), 277–283. Pattern Recognit. Lett. 15 (11), 1119–1125.
Grassberger, P., Procaccia, I., 2004. Measuring the strangeness of strange attractors. The Cai, H., Gao, Y., Sun, S., Li, N., Tian, F., Xiao, H., Li, J., Yang, Z., Li, X., Zhao, Q., et al.,
Theory of Chaotic Attractors. Springer, pp. 170–189. 2020. Modma Dataset: A Multi-Modal Open Dataset for Mental-Disorder Analysis
Röschke, J., Fell, J., Beckmann, P., 1995. Nonlinear analysis of sleep EEG data in arXiv preprint arXiv:2002.09283.
schizophrenia: calculation of the principal lyapunov exponent. Psychiatry Res. 56
(3), 257–269.

19
some degree of corruption was inevitable in all political
organisations, he held that they should be regarded by the voter in
exactly the same light as bidders for a contract. Government should
simply be handed over to the organisation making, all things
considered, the lowest bid, which in New York city, Mr. Thompson
thought, would usually be Tammany Hall. The argument is so
thoroughly feudal in its conception of politics that one finds it difficult
to believe in the author’s entire sincerity, although this is flatly
asseverated throughout the book. Moral objections similar to those
employed against the doctrine of the inviolability of a “ten per cent
rake-off” thoroughly dispose of any rational claim it may make to
attention. Political experience is also against it. Reform movements
particularly in municipalities may be laughed at as “spasms,” but
these movements, which are usually based largely on charges of
corruption, occur so frequently as to discredit the belief that purely
prudential considerations on the part of corruptionists will restrain
effectively the excesses of their demands. Supine acceptance by the
electorate of the “lowest bidder” theory would speedily result in the
submission of none but extortionately high bids. In the long run
“millions for defence but not one cent for tribute” is a sentiment quite
as justifiable economically as ethically.
To recapitulate the preceding argument,—the structure of society,
no matter how completely evolved and generally beneficial to the
highest human interests, is nevertheless such that when brought into
contact with natural human egoism it offers access at many points to
the onslaughts of corruption. The evil consequences may be
extreme, or only severe, or in time they may be completely
overcome. History furnishes examples of all three eventualities. It
also bears witness to the fact that many gross and threatening forms
of corruption that were once prevalent have been eliminated from the
life of civilised nations. Those which remain to afflict us are the
object of vigorous corrective measures which are constantly being
extended and strengthened. Corrupt practices are found to be limited
in some cases to certain branches or spheres of government with
consequences of varying degrees of danger to the national life. Or
they may be limited in amount or percentage by various prudential
considerations on the part of political leaders who, however, are far
from being sufficiently restrained in this way as social welfare
requires. While corruption thus appears to be a persistent problem of
social and political life it is far from being a hopeless one. In the
words of Professor Henry C. Adams,[39] its solution “is a continuous
task, like the cleansing of the streets of a great city, or the renewing
of a right purpose within the human heart.”

FOOTNOTES:
[24] It would, of course, be absurd to assume that every victor
in such contests is free from all taint of corruption. A very large
and powerful state may, although extremely corrupt, succeed in
overcoming a small and weak state which is relatively free from
corruption. Something akin to this occurred when Finnish
autonomy was suppressed by Russia in 1902. On the other hand
it is evident that in such a struggle the honesty of the small state
would be in its favour while the corruption of the great state would
be a source of weakness.
[25] Although most of the references to historic forms of
corruption presented in the following pages are taken from the
comparatively recent annals of nations which are still living, it is
worth noting that the subject could also be illustrated abundantly
from ancient history. Even prior to the Christian era Rome
suffered from various kinds of political corruption that exist in very
similar forms at the present day. Readers of the Old Testament
find, particularly in the books of Isaiah and Micah, denunciations
of social evils not unlike those published in contemporary
magazines.
[26] Herbert Spencer shows “that from propitiatory presents,
voluntary and exceptional to begin with but becoming as political
power strengthens less voluntary and more general, there
eventually grow up universal and involuntary contributions—
established tribute; and that with the rise of a currency this
passes into taxation” (“Principles of Sociology,” vol. ii, pt. iv, ch. iv,
p. 371), and further that “In our own history the case of Bacon
exemplifies not a special and late practice, but an old and usual
one” (p. 372). Bribe giving may, therefore, be regarded as a lineal
descendant of an old practice once regarded as legitimate, but
now fallen under the ban. Given a social state in which public
dues are open, regular, and fixed in amount, and in which bribery
is distinctly reprobated, as contrasted with a social state in which
present giving is common and tolerated or defended by public
opinion, the higher moral standard of the former would seem
beyond question.
[27] Op. cit., pp. 44-45.
[28] “The Diary of Samuel Pepys,” edited by Henry B. Wheatley,
vol. i, p. 207, entry of date of August 16, 1660.
[29] Op. cit., vol. vii, p. 49, entry dated July 30, 1667.
[30] “Samuel Pepys and the World He Lived In,” by Henry B.
Wheatley, p. 62.
[31] Op. cit., pp. 161-162, note.
[32] Ibid., p. 15.
[33] Ibid., p. 42.
[34] Ibid., p. 16.
[35] “The Shame of the Cities,” p. 152.
[36] “Japan, Its History, Arts, and Literature,” by Captain F.
Brinkley, vol. iv, p. 250 et seq.
[37] New York Times, March 9, 1900.
[38] “Politics in a Democracy,” New York, 1893.
[39] “Public Debts,” p. 358.

IV
CORRUPTION IN THE PROFESSIONS, JOURNALISM, AND
THE HIGHER EDUCATION

The wisdom of some quasi-philosophic counsellors of ambitious
youth expresses itself in the aphorism that in this world there are as
many doors labelled “pull” as there are labelled “push.” Without
admitting the equality in ratio of the two kinds of avenues to material
well being, it is undeniable that a great many of our social
relationships are very commonly exploited by interests of a more or
less directly personal character. Church membership, for example,
may be maintained chiefly as a stepping stone to business,
professional, or social success. Business men are overrun with
solicitations for aid to church and charitable purposes under
circumstances which suggest the discreet advertisement of their
delinquency in case they do not contribute “according to their
means,” and the probable loss of custom in consequence. The
charitable organisations themselves are imposed upon by unworthy
applicants for relief who display a pertinacity and ingenuity
calculated to destroy all faith in any trait of human nature except
universal parasitism. Of course one should not look a gift horse in
the mouth, but in the case of many presentations from inferiors to
superiors or from favour-seekers to men of influence the motives of
the givers, and also at times of the recipients, are certainly not
beyond suspicion. The ethics of the petty tipping system are dubious
at best. Labourers “soldier on their jobs”; clerks appropriate office
supplies as “perquisites”; there are “tricks in all trades.” To avoid
conflicts in the kitchen good housewives frequently send bad
servants away with excellent “characters.” During hard time winters
newspapers maintain free soup stations and publish the harrowing
details of the poverty which they are relieving in such a sensational
fashion that even the most guileless reader finds himself wondering
whether any motive connected with self-advertisement or circulation
reinforces the charitable sentiments of the journalist. On the other
hand many a queer and clever scheme is devised to secure
newspaper notoriety for some presumably deserving person or
cause. The ways of authors with critics, and of critics with authors for
that matter, are said at times to stand in need of criticism
themselves. “Dead easy” professors and “snap” courses (of which,
be it said with grief and contrition, every institution seems to have a
few samples) are exploited by college students whose mental efforts
in other directions are hopelessly inhibited by chronic brain fag. In
short every person charged with administrative duties in connection
with any social organisation, be it a business house, a club, a
church, a school, a charity, or what not, is familiar ad nauseam with
the fact that tacit or overt efforts are constantly being made both by
outsiders and insiders to procure suspensions of the rules or other
unwarranted privileges and favours.
It would, however, be an unnecessarily harsh judgment to
condemn all actions of the foregoing character as corrupt. If criticism
is to be attempted it must be based on a full knowledge of motives in
given cases, and these are not always apparent. Then, too, customs
have grown up under the influence of which men act without
analyzing the real nature of their conduct. Reflection would show,
however, that, with the exception of conscious evil intent, the
elements of corruption are present not only in the cases cited above,
but in many others which are constantly being encountered in the
course of the day’s experiences. It is certainly an error to assume
that all the grafters are engaged in “big” business or “big” politics. Let
us not excuse in the slightest degree the misdeeds of great
corporations, but, on the other hand, let us not forget that conduct of
a precisely similar ethical colour is sometimes indulged in by
labourers, clerks, small retailers, farmers, and others. The fact that
corrupt or “near” corrupt practices are more common than people are
ordinarily inclined to believe is significant in another way. There is
always a direct relationship between the characteristic petty offences
of a people and its characteristic major crimes. Thus in a country
given over to brawling, crimes of violence will be numerous. Chicane
largely prevalent in every day affairs will certainly breed an
atmosphere favourable to the perpetration of gigantic frauds. For this
reason the minor forms of corruption which occur in the daily life of a
people are worthy of much more attention than they ordinarily
receive.
Let us turn now from the petty and dubious manifestations of a
corrupt spirit to those larger and more directly threatening practices
which have become subject to public criticism and in some cases to
repressive legislation. The field thus ventured upon is so extensive
and its features are so involved that no progress can be made in its
discussion without classification. Yet any scheme of classification
that may be attempted must encounter great difficulties. Individual
judgments vary widely regarding the importance or degree of danger
to the public interest of various anti-social developments. Along
certain lines corrupt practices have been exploited by journalistic
enterprise with great pertinacity, while other suspicious areas are still
largely neglected. As a consequence of the very difficulties which
embarrass it, however, there is a certain justification even for a
confessedly imperfect classification. A service of considerable
importance may be rendered merely by bringing together in the form
of an outline all or nearly all the more threatening forms of corruption
in such a way that some of their salient characteristics and
interrelations are more clearly developed. Without therefore claiming
finality for the following arrangement it would seem desirable to
distinguish roughly two great fields of corrupt practices: first,
corruption in professional life generally; and second, corruption in
business and politics. The divisions and subdivisions of these two
groups will be indicated later. Corruption in professional life will be
discussed with some detail in the present study.[40] Business and
political corruption, the interrelations of which are very numerous and
close, will form the subject of the following paper.
Corruption in professional life may be held to involve virtually all of
our social leadership outside of business and politics. Apart from the
specific services rendered by the various professions their principal
practitioners are instinctively looked up to by the community for
guidance. In a broad sense all professional men are teachers.
Corruption in the professions is thus equivalent to the defilement of
the sources of public instruction. Yet precisely on this ground very
sweeping and bitter accusations are made. Law, journalism, and the
higher education are more frequently attacked, but medicine,
philanthropy, and theology also come in for criticism. To cite specific
instances:—editors are accused of wholesale misrepresentation and
suppression of news in behalf of sinister interests; college
professors, assumed to be subtly bribed by munificent endowments,
are reproached as the crafty inventors of philosophic excuses for
menacing public evils; lawyers are denounced as servile hirelings
who “justify the wicked for reward” and who accept crooked
corporation or political work without demur; ministers, philanthropic
workers, and other leaders of thought are said to be purchased by
large contributions, gifts of parks, playgrounds, hospitals, and so on.
[41] There are many modern Micahs who go about saying of our
people that “the heads thereof judge for reward, and the priests
thereof teach for hire, and the prophets thereof divine for money.”
Corruption of the sources of public instruction is manifestly replete
with the potency of evil. If a nation’s “men of light and leading” fail in
their function the case is hopeless indeed. Moreover the regulation
of the various sources of public instruction is a task the complexity of
which far excels that of any problem presented by the other forms of
corruption. No insuperable technical difficulty is involved, for
example, in prescribing the standard of pure milk, the proper safety
devices for theatres, the best method of fencing dangerous
machinery in mills, the adequate safeguarding of the interests of
policy holders in life insurance companies. But who will tell us with
authority exactly what is news and what isn’t; who will define
explicitly the standard of orthodoxy for university instruction in
economics and political science; who will provide ministers of the
gospel with a social creed drawn up with the precision and free from
the dogmatic differences of their theological creeds? It is not strange,
therefore, that although there has been much vague talk of “tainted
money,” proposals for the legal definition and regulation of its alleged
pernicious consequences have been wanting. We already have
extended and complicated legal systems of inspection and regulation
of many of the material goods of life, while but little has been done or
even concretely outlined in the direction of state supervision of ideal
goods and services.
Great as are the technical difficulties in the way of the latter policy,
the real reason for its lack of advocates would seem to lie in the
partial efficiency of the various ancient and highly socialised codes of
professional ethics. Competition in the economic world has not been
similarly safeguarded from within. With the breakdown of the guild
system and the sudden changes introduced by the industrial
revolution business found itself upon an uncharted sea. Laisser faire,
laisser aller seemed perfectly obvious in this spacious time of
untouched world markets, but latterly distances have dwindled,
density has increased, and collisions with social norms have become
increasingly frequent. Too often and too easily competition has been
pushed beyond the limits of social safety. In the economic struggle
the “twentieth mean man” has been able to wield compulsory power
over his nineteen decent competitors and to force them on pain of
bankruptcy to adopt his own lower standards. The professional
“mean men,” on the other hand, knew from the start that they were
derogating from the ethics of their fellow practitioners, and in many
cases were brought quickly to book for it. Here rather than in any
differences of personal integrity must be found the reason for the
higher moral reputation enjoyed by professional as compared with
business men. It is impossible to believe that of the brothers of the
family the black sheep always went into business and the good boys
into medicine or the ministry. Finally we may expect the general
immunity of the professions from state regulation to continue just so
long as they develop progressively their own police systems. In this
connection it is significant that that one of them which has been most
frequently and severely accused of abetting corruption in economic
and political fields, namely the law, is precisely the one which has
shown the most concern recently in the reformation of its code of
ethics.[42] Obviously such sanitary processes may be materially
hastened by the pressure from without of a forceful and honest
popular feeling in opposition to abuses which have grown up in
professional practice.
The greatest immediate influence upon public opinion is exerted,
of course, by journalism. The question of its corruption or
corruptibility is, therefore, one of prime importance. Accusations
against the press on this score are common enough, but few of them
are so sweeping as the following attributed to the late John Swinton,
formerly of the New York Sun and Tribune.[43] At a banquet of the
New York Press Association in 1895, in response to a toast on “The
Independent Press” he is reported to have said:
“There is no such thing in America as an independent press unless
it is in the country towns. You know it, and I know it. There is not one
of you who dare express an honest opinion. If you express it, you
know beforehand that it would never appear in print. I am paid $150
per week for keeping my honest opinions out of the paper I am
connected with. Others of you are paid similar salaries for doing
similar things. If I should permit honest opinions to be printed in one
issue of my paper, like Othello, before twenty-four hours my
occupation would be gone. The man who would be so foolish as to
write honest opinions would be out on the street hunting for another
job. The business of the New York journalist is to distort the truth, to lie
outright, to pervert, to vilify, to fawn at the feet of Mammon, and to sell
his country and race for his daily bread; or for what is about the same
thing, his salary. You know this, and I know it; and what foolery to be
toasting an ‘independent press.’ We are tools, and the vassals of rich
men behind the scenes. We are jumping jacks. They pull the string
and we dance. Our time, our talents, our lives, our possibilities, all are
the property of other men. We are intellectual prostitutes.”
It is hardly probable that any one not himself accustomed to
drafting headlines could have so far exaggerated a situation, even
under post-prandial influences, as did the author of the above
paragraph. Whatever may be the measure of the sinning of any
newspaper, certainly no single sheet has ever been the corrupt
apologist for all anti-social interests. A paper which at any one time
should attempt to stand for unsanitary tenement houses, for child
labour, for quack medicines, for “embalmed” beef, for “tainted
money” colleges, for monopoly tactics in beating down small
competitors, for life insurance frauds, for the spoils system, the
stealing of elections, and franchise grabbing,—or for any
considerable number of these,—would certainly lose its influence
with extreme suddenness. Newspapers are of all kinds, of course.
They differ even more in character than do individuals. As the focal
points of every interest in a community the interests of a newspaper
are much more diverse than those of the individual, and, as in the
case of the individual, these interests are shot through and through
with the noble and the base. Few people who are unfamiliar with the
practical making of newspapers realise what a constant and bitter
struggle is being waged in many cases to keep them free from
selfish and dishonest influences. In other instances, of course, the
partial triumph of the counting-room is palpable. Advertising columns
still carry, although with much less frequency than formerly, the
insertions of get-rich-quick schemes, of bucket-shops, of salary-loan
sharks, of quack doctors, quack medicines, and clairvoyants. Of
course these are frankly presented as paid matter, and every reader
of intelligence understands that they are inspired by the directly
selfish motives of the advertiser. When one thinks of the poor, the
ignorant, and the sick, who are exploited through such agencies,
however, the despicable character of the abuse is manifest. In some
papers, also, the reader finds abundant evidence of the activities of
press and publicity bureaus working in the interest of certain forms of
business. Morally this abuse is much worse than the foregoing, for it
throws off the form of advertising and clothes itself as news or
editorial opinion.
Large advertisers, particularly since the development of daily full
page announcements by department stores, also insist at times, and
not always ineffectually, upon exerting influence over news and
editorial columns. A pitch of absurdity seldom realised in this
connection was exemplified by the silence or approval with which the
press of one of our largest cities, a single paper honourably
excepted, treated the clearly mistaken philanthropy of a certain
wealthy merchant who had established many distributing stations for
sterilised, rather than Pasteurised, milk. The paralysing effect of box
office influence upon sincere and vigorous dramatic criticism is
another deplorable instance of the same sort.
Finally there are papers which, however free they may keep
themselves from outside interests, nevertheless represent the
immediate political or economic ambitions of their owners. It is easy
to exaggerate this abuse not only with regard to its present extent
absolutely considered, but also with reference to its contemporary
development as compared with the press of the past. In its earlier
periods journalism was almost universally the tool of party. During
the civil war,—the epoch of great editorial personalities,—political
ambitions constantly invaded the sanctum with the result that the
gross unfairness and bitter partisanship engendered by the times
were doubly and trebly emphasised in the columns of the press. The
new journalism which began its career about 1875 not only prints
more news but prints it more fairly than the old school. Of course
most of our papers are still the recognised organs of some party, but
they are far from being servile and characterless advocates of every
party policy. Moreover there is a considerable number of politically
independent papers, some of which are avowedly so, while others
are really so although they may still wear lightly some party emblem.
Fearless, continued criticism of public abuses is more and more
coming to be recognised as good policy both for a paper and for the
commonweal.
Unfortunately there is another side to this record of improvement
and achievement. Perhaps the most important single difference
between the old personal journalism and the journalism of to-day is
the large capitalistic character of the latter. When the mechanical
outfit of a city paper could be supplied with a comparatively small
sum of money, the personality of the editor was all important,
although, as we have seen, even this favouring economic condition
did not by any means produce uncorrupted journalism. At the
present time large capital is necessary not only to provide the
equipment, but also to meet the heavy losses of the few inevitable
lean years at the outset. In most cases the money is contributed by
one man or by a comparatively small number of men whose other
business interests are likely to be very harmonious if not already
consolidated. In consequence there is a common, and withal very
human, tendency on the part of the paper thus established and
owned to deal favourably under all circumstances with the financial
interest or group of interests back of it. This is the typical journalistic
danger of the present period, just as the political bee in the editor’s
bonnet was the typical evil of the old personal journalism. Legislation
requiring newspapers to print the names of their principal owners,
and to deposit full lists of stockholders in some state office of record
where they could be made available to all comers, ought to limit
considerably the possibility of capitalistic manipulation of the press.
By revealing facts regarding financial control which at best can only
be suspected at the present time, publicity of this character would
enable readers to make the necessary allowances for any undue
form of counting-room control which might manifest itself in the
editorial or news columns of a given paper. In spite of this and other
shortcomings, however, most observers agree that the American
press as a whole is more independent to-day than ever before.
In considering abuses which affect our journalism one should not
forget certain conditions which set a limit to the corrupt manipulation
of the greatest single agency of public instruction. A modern
newspaper is a large capitalistic enterprise, of course, but its
business is peculiar in that it must sell its product to tens of
thousands of people every day at the price of a cent or two per copy.
However plutocratic a paper may be at one end it always represents
the extreme of democracy at the other. Our press is occasionally
prostituted by large moneyed interests, but it is in much more
constant danger of that directly opposite form of corruption, namely
demagogy. Reform of the press depends ultimately upon the reform
of its readers. Even on the latter side, however, we have to note an
increasing and very gratifying readiness on the part of our papers to
tell the American people the truth about themselves and about
foreign peoples regardless of all our old time prejudices and
antipathies.[44]
Reverting to the plutocratic influences affecting the press,
however, we have seen that in the nature of things no single
newspaper can become the tool of all the anti-social interests. It can
defend effectively only the few which for one reason or another are
approved by the managers of its policy. Usually a newspaper which
is thus silent or mildly unctuous on certain abuses endeavours to
rehabilitate itself by the condemnation, sometimes in a sensational
and even hysterical fashion, of other abuses, thus conducting, so to
speak, a vigorous department of moral foreign affairs. As a result the
position taken by the press as a whole on most points is strongly
favourable to the public interest. On this ground one may find a
philosophic justification for the sentiment so compactly phrased by
Mr. George William Curtis to the effect that “no abuse of a free press
can be so great as the evil of its suppression.”[45]
Even in dealing with those subjects concerning which a given
paper is not honest with its readers great care must be exercised. So
far as possible it must conceal the evidences of selfish interest and
present its case on grounds of public policy. Now arguments based
on such grounds are always worthy at least of consideration. A very
large part of political discussion, not only journalistic but of other
kinds, is “inspired” in this fashion, and it not infrequently happens
that what may be in accord with the self interest of individuals and
groups is also in accord with public interest. If this is not the case a
competing paper ought to be able to expose pretty effectively the
false assertions of its wily contemporary. In dealing with national
questions which are discussed by newspapers in every part of the
country this function of mutual criticism is in general well performed.
Cases occur, however, especially in connection with municipal
issues, where practically every paper of wide local circulation is
either silenced or actively engaged in the support of a crooked deal.
Under such circumstances a fight in defence of public interest is
almost hopeless. The more nearly the press of a given district
approaches this condition of corrupt paralysis, however, the brighter
are the opportunities for an opposition paper. In journalism as
everywhere in the world of social phenomena the inviolable law
prevails that a function cannot be abused without corresponding
harm to the agency which allows itself to be perverted. If it should
ever happen,—although at the present time the prospect seems
remote enough,—that a thoroughgoing control embracing the daily
papers of the whole country should be established in defence of
consolidated interests, it is certain that some new agency of publicity
would spring up in the interest of the people as a whole. In the end
the daily papers themselves would be the worst sufferers from a
general perversion of their activities. As a matter of fact a new and
powerful journalistic organ has already developed an influence not
incomparable with that of the daily press. The wonderful growth of
the low priced monthly and weekly magazines during the decade just
past has been explained on various grounds:—the cheapening of
paper and of illustrations, the second-class mailing privilege, the
effectiveness of such media for advertisement, and so on. No doubt
these factors go far toward explaining the great expansion of
magazine circulation, but in spite of much journalistic prejudice to the
contrary circulation and influence are not necessarily correlative. And
the influence, as distinct from the circulation, of the magazines has
been due very largely to the boldness and effectiveness with which
they assailed many public abuses with regard to which for one
reason or another the daily press was silent or even favourable. Of
course the detached situation of the magazines made it easy and
even profitable for them to pursue policies which might have cost the
newspapers dear. In any event a new way was found for the
effective journalistic presentation of the public interest.
In discussing the alleged corruption of the learned professions as
a whole reference was made to the powerful influence of
professional codes of ethics. One must recognise the journalistic
instinct and journalistic traditions as strong factors of similar
character. Even where editorial and reportorial staffs have given way,
for purely bread and butter reasons, to what they knew were the
selfish suggestions of controlling financial interests these same
interests must sometimes have wondered at the lukewarmness of
their paper’s support, and also, perhaps, at the enthusiasm which it
manifested for some good cause indifferent to them. Moreover
professional standards are rising in this field as well as elsewhere.
No one has given clearer or more forcible expression to the highest
of these newer ideals of journalism than Mr. George Harvey of the
North American Review, whose words, by the way, present the
extreme of contrast to those quoted earlier from Mr. Swinton. After
pointing out that the great editorial leaders of the past generation,—
Greeley, Raymond, Dana, Bennett,—were shackled by their own
political ambitions, Mr. Harvey asks:
“What, then, shall we conclude? That an editor shall bar acceptance
of public position under any circumstances? Yes, absolutely, and any
thought or hope of such preferment, else his avowed purpose is not
his true one, his policy is one of deceit in pursuance of an
unannounced end; his guidance is untrustworthy, his calling that of a
teacher false to his disciples for personal advantage, his conduct a
gross betrayal not only of public confidence, but also of the faith of
every true journalist jealous of a profession which should be of the
noblest and the farthest removed from base uses in the interests of
selfish men.” ...
“He [the journalist] is, above all, a teacher who, through daily
appeals to the reason and moral sense of his constituency, should
become a real leader.... Above capital, above labour, above wealth,
above poverty, above class, and above people, subservient to none,
quick to perceive and relentless in resisting encroachments by any,
the master journalist should stand as the guardian of all, the vigilant
watchman on the tower ever ready to sound the alarm of danger, from
whatever source, to the liberties and the laws of this great union of
free individuals.”[46]

Discussion of the “tainted money” charge so far as it affects our
universities and colleges can not, of course, be presented with
complete objectivity by the present writer. Nothing can be promised
beyond an earnest effort to attain detachment and impartiality. On
the other hand, a decade spent in the active teaching of the principal
debatable subjects in three institutions of widely different character
may furnish a basis of experience of some value.[47]
First of all there must be no blinking of the importance of the
subject. “It is manifest,” wrote the acute Hobbes, “that the Instruction
of the people, dependeth wholly, on the right teaching of Youth in the
Universities.” Quaint as is the language in which he defends this
proposition the argument which it contains is applicable with few
changes to modern conditions.
“They whom necessity, or covetousnesse keepeth attent on their
trades, and labour; and they, on the other side, whom superfluity, or
sloth carrieth after their sensuall pleasures, (which two sorts of men
take up the greatest part of Man-kind,) being diverted from the deep
meditation, which the learning of truth, not onely in the matter of
Natural Justice, but also of all other Sciences necessarily requireth,
receive the Notions of their duty, chiefly from Divines in the Pulpit, and
partly from such of their Neighbours, or familiar acquaintance, as
having the Faculty of discoursing readily, and plausibly, seem wiser
and better learned in cases of Law, and Conscience, than themselves.
And the Divines, and such others as make shew of Learning, derive
their knowledge from the Universities, and from the Schooles of Law,
or from the Books, which by men eminent in those Schooles, and
Universities have been published.”[48]
In spite of the development of other intermediate agencies of
public instruction since the seventeenth century, and particularly of
the press and our elementary school system, the influence of
universities and colleges was never greater than it is at present, and
it is an influence which is constantly increasing in strength. The
number of universities and colleges is larger, their work is more
efficient, their curricula are broader, the number of college bred men
in the community is greater, and their leadership therein more
perceptible than ever before. Professors are enlisting in industrial,
scientific, and social activities outside academic walls in a way
undreamed of so long as the old monastic ideals held sway. By
extension lectures and still more by books and articles they are
reaching larger and larger masses of the people. Newspapers
formulate current public opinion, but to the writer, at least, it seems
plainly apparent that the best thought of the universities and colleges
to-day is the thought that in all likelihood will profoundly influence
both press and public opinion in the near future. Academic observers
of the sound money struggle of 1896, for example, must have smiled
frequently to themselves at the arguments employed during the
campaign. There was not one of them which had not been the
commonplace of economic seminars for years. The newspapers and
the abler political leaders on both sides simply filled their quivers with
arrows drawn from academic arsenals. Extreme cleverness was
shown by many journalists and campaign orators in popularising this
material, in adapting it to local conditions, and in placing it broadcast
before the people, but of original argumentation on their part there
was scarcely a scintilla. It is significant also that the battle of the
ballots was decided in favour of the contention which commanded
the majority of scientific supporters. Subsequent political issues,
great and small, have developed very similar phenomena, although
of course it would be absurd to assert that in all cases the dominant
opinion of the literati prevailed at the ballot. There are also certain
academic ideals of the day with which practical politics and business
are demonstrably and crassly at variance. Not until the fate of many
future battles is decided can we estimate the full strength of the
university influence on such pending questions. Victory would seem
assured in a sufficient number of cases, however, to make it clear
that just as the wholesomeness of the public opinion of to-day is
conditioned by the independence of the press, so the
wholesomeness of the public opinion of to-morrow will be
determined largely by the independence of our colleges and
universities.
As compared with the press, universities possess certain great
advantages which justify the public in demanding from them higher
standards of accuracy and impartiality. The professor enjoys some
measure of leisure; the editor is always under the lash of production
on the stroke of the event. It is also a very considerable advantage
that the editorial “we” and the anonymity of the newspaper are
foreign to college practice. There is, of course, a pretty well
recognised body of opinion on methods and ideals common to the
faculties of our learned institutions, but in the separate fields of
departmental work any opinion that may be expressed is primarily
the opinion of the professor expressing it. His connection with a
given institution is, indeed, a guaranty of greater or less weight as to
his general scholarly ability, and he will, of course, be mindful of this
in all that he says or writes. But beyond this his personal reputation
is directly involved. Those who make a newspaper suffer collectively
and more or less anonymously for any truckling to corrupt interests.
The college president or teacher guilty of an offence of the same sort
must suffer in his own person the contempt of his colleagues, his
students, and the public generally.
Newspapers, moreover, are usually managed by private
corporations frankly seeking profit as one of their ends. Universities
and colleges, on the other hand, are much more free from the
directly economic motive. There are, however, certain large
qualifications to the advantages which institutions of learning thus
enjoy. Every university and college is constantly perceiving new
means of increasing its usefulness and persistently seeking to
secure them. The demands made in behalf of such purposes may
seem excessive at times, but it is clear that an educational institution
which does not appreciate the vital importance of the work it is doing,
and consequently the importance of expanding that work, is simply
not worth its salt. In a great many cases the readiest means of
securing the necessary funds is by appeal to rich men for large gifts
and endowments. As the number of munificent Mæcenases is
always limited and the number of needy institutions always very
considerable, a competitive struggle ensues, different in most of its
incidents from the directly profit seeking struggles of the business
world, but essentially competitive none the less. In the campaign of a
university or college for expansion a large body of students makes a
good showing; hence too often low entrance requirements weakly
enforced and low standards of promotion. At times even the springs
of discipline are relaxed lest numbers should be reduced by a
salutary expulsion or two. Courses are divided and subdivided
beyond the real needs of an institution and salaries are reduced in
order to secure a sufficient number of teachers to give the large
number of courses advertised with great fulness in the catalogue. A
large part of crooked collegiate athletics is due to an indurated belief
in the advertising efficacy of gridiron victories as a means of
attracting first, students, and then endowments. So far as charges of
corruption against our higher educational institutions are at all
justified they are justified chiefly by the practices just described.
Fairness requires the statement, however, that a marked change of
heart is now taking place. Public criticism has placed athletic graft in
the pillory to such an extent that enlightened self-interest, if no better
motive, should bring about its speedy abolition by responsible
college managements. Many sincere efforts have been made by
members of faculties singly and through organisations covering
certain fields of study to raise and properly enforce entrance and
promotion standards. Finally in the Carnegie Foundation for the
Advancement of Teaching there has been developed an agency of
unparalleled efficiency for detecting and exposing low standards. A
college may continue to publish fake requirements, to crowd its class
rooms with students who belong to high schools, to pad its courses,
to underpay and overwork its instructing staff, but if it does these
things it cannot, even if otherwise qualified, secure pensions for its
professors, and in any event its derelictions will be advertised
broadcast in the reports of the Foundation with a precision and a
conviction beyond all hope of rebuttal. Let cynics smile at a process
which they may describe as bribing the colleges to be good by
pensioning their superannuates, but unquestionably the work of the
Foundation has resulted in a new uprightness, a new firmness of
standards, a higher efficiency that bodes well for the future of
American education. Parents may give material encouragement to
this movement by reading the publications of the Carnegie
Foundation, as well as college catalogues and advertisements,
before they determine upon an institution for the education of their
children.
Although the conditions just described are the principal evil results
of the competitive struggle for college and university expansion, the
accusations of corruption against institutions of learning have usually
dealt with their teaching of the doctrines of economics, sociology,
and political science. Endowments must be secured; as a rule they
can be had only from the very rich; among the very rich are
numbered most of the “malefactors of great wealth”;—ergo university
and college teaching on such subjects must be made pleasing or at
least void of all offence to plutocratic interests.
There is a certain disproportion between the means and the ends
considered by the foregoing argument which is worth notice. To
found or endow a college or university requires a great deal of
money. Any institution worthy of either name is made up of
numerous departments,—languages, literature, the natural sciences,
history, and the social sciences,—of which only the last named are
concerned with the moot questions of the day. If one cherished the
Machiavellian notion of corrupting academic opinion to his economic
interest he would be obliged, therefore, to support an excessively
large number of departments the work of which would be absolutely
indifferent to him. Endowment of the social sciences alone would be
rather too patent. That they are not over-endowed at the present
