IC3I-2022 Paper 929
Department of Computer Science and Engineering
JSS Academy of Technical Education
Noida, India
kakoli.banerjee@jssaten.ac.in, harshakg@jssaten.ac.in, vinooth@jssaten.ac.in
Abstract: Prediction models aim to use available data to predict a health state or outcome that has not yet been observed. Prediction is primarily relevant to clinical practice but is also used in research and administration. While prediction modeling involves estimating the relationship between patient factors and outcomes, it is distinct from causal inference. Prediction modeling thus requires unique considerations for development, validation, and updating. Journals are witnessing an increase in submissions related to prediction modeling. This stands to seed rapid advancement in research and practice but also comes at the risk of pursuing false leads.

Keywords: treatment, diagnosis, statistics, prediction model

I. INTRODUCTION

In our project, we will use machine learning algorithms such as regression and SVM to classify our data, train and test on our data set, and create a model. The ultimate goal of our project is to create a predictive system that can successfully predict whether a person has a psychiatric disorder, or the degree of the disorder, based on their EEG report and symptoms. Electroencephalography (EEG) is a method to measure the electrical activity of the brain, i.e. brain signals. Parts of the brain communicate through electrical impulses and are active all the time, even during sleep. Hence the EEG of a person diagnosed with a psychiatric disorder will show distinctive brain activity that can be used in the identification and detection of the disorder.

The first step of our project is data gathering and data preprocessing. Data preprocessing is an important step in the machine learning process, as it transforms raw data into a usable format. The second step is data standardization and splitting the data into training data and test data. This helps the machine learning process by improving the accuracy score of the model. The third step is selecting an appropriate machine learning algorithm and applying it in the training and testing of the data. The higher the accuracy of the model, the more successful the predictive system. The accuracy of the model depends on factors such as the type of data set, the integrity of the data, and the machine learning algorithm. Please refer to Figure 1 for a better understanding of the process.

A. DATASET

The data set was gathered online from the Kaggle website. The data set presents reliable information, as it has
been used, analyzed, and gathered by researchers after experimentation. The data set includes EEG reports (Table 1) from patients whose records were examined and who were later determined to have a specific psychiatric illness. It contains 1000 such records. Electroencephalography (EEG) is a method to measure the electrical activity of the brain, i.e. brain signals. Parts of the brain communicate through electrical impulses and are active all the time, even during sleep.

The data set has [] columns and [] entries. The first step is data cleaning: the process of finding incomplete, unnecessary, or missing data and then altering, replacing, or removing it as needed. We discovered that three columns lacked data. In data frames and NumPy arrays, Not a Number (NaN) is a special value that signifies a cell with no value. The next stage is data encoding. We use label encoding, a categorical encoding approach suited to features identified as ordinal. In this case the order of the categories matters, so the encoding must reflect that order. During label encoding, each label is converted into an integer value.

Preprocessing then removes duplicates, missing entries, and values with incorrect formatting. The cleaned data is used to train and evaluate the model, which helps ensure accuracy. The data set was then divided into two parts: training and testing. The integrity of the data set is a crucial determinant of the likelihood of high accuracy. Following that, a range of machine learning models are evaluated, including logistic regression, K-nearest neighbor classifier, decision tree classifier, Support Vector Machine, and random forest classifier. The accuracy of a classifier on a given test set is the proportion of test set instances that the classifier classifies correctly.

... has evolved into a trustworthy tool for analyzing this data. Machine learning is the use of advanced probabilistic and statistical approaches to build computers that can learn on their own from data.

This enables data patterns to be detected more simply and correctly, and yields more accurate predictions from data sources. Similar analytic approaches are being used to study mental health data, with the potential to improve patient outcomes as well as the understanding of psychiatric illnesses and their management.

There are numerous machine learning algorithms available today, each designed to address a specific task. In summary, they aid in the resolution of real-world problems.

There are four main types of machine learning algorithms:

1. Reinforcement learning
2. Unsupervised learning
3. Semi-supervised learning
4. Supervised learning
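
The dataset preparation and evaluation steps described above (cleaning, label encoding, train/test split, standardization, and accuracy) can be sketched as follows. This is a minimal, dependency-free illustration rather than the implementation used in the study, which does not state its library or parameters. It uses k-nearest neighbors, one of the supervised classifiers listed for the evaluation. The synthetic data, the feature names band_a, band_b, and level, the 80/20 split, and k=3 are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def clean(rows):
    """Drop rows containing missing values (None/NaN) and exact duplicates."""
    seen, kept = set(), []
    for row in rows:
        missing = any(v is None or (isinstance(v, float) and math.isnan(v)) for v in row)
        if missing or row in seen:
            continue
        seen.add(row)
        kept.append(row)
    return kept


def label_encode(values, order):
    """Map ordinal labels to integers that preserve the given category order."""
    mapping = {label: i for i, label in enumerate(order)}
    return [mapping[v] for v in values]


def split(X, y, test_ratio=0.2, seed=0):
    """Shuffle indices reproducibly and split into training and test parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(X) * test_ratio))
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def fit_scaler(X):
    """Per-column mean and standard deviation, computed on training data only."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def standardize(X, means, stds):
    """Rescale each column to zero mean and unit variance using the fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Proportion of test instances classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def demo():
    rng = random.Random(42)
    order = ["low", "medium", "high"]  # hypothetical ordinal feature
    rows = []
    for _ in range(100):
        label = rng.randint(0, 1)
        # Synthetic stand-ins for mean wave values; NOT real EEG data.
        band_a = rng.gauss(10 + 5 * label, 1.0)
        band_b = rng.gauss(20 - 5 * label, 1.0)
        rows.append((band_a, band_b, order[rng.randint(0, 2)], label))
    rows.append(rows[0])                          # inject a duplicate row
    rows.append((float("nan"), 1.0, "low", 0))    # inject a row with a missing value
    rows = clean(rows)
    levels = label_encode([r[2] for r in rows], order)
    X = [[r[0], r[1], lv] for r, lv in zip(rows, levels)]
    y = [r[3] for r in rows]
    X_train, y_train, X_test, y_test = split(X, y)
    means, stds = fit_scaler(X_train)             # fit on training data only
    X_train = standardize(X_train, means, stds)
    X_test = standardize(X_test, means, stds)
    preds = [knn_predict(X_train, y_train, x) for x in X_test]
    return accuracy(y_test, preds)


if __name__ == "__main__":
    print(f"Test accuracy: {demo():.3f}")
```

The scaler is fitted on the training portion only and then applied to the test portion, so no information from the test set influences the model before evaluation.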
The second page has the section About the project, which covers information about psychiatric disorders, their forms and symptoms, measures to be taken, and the doctor visits required. The disorders included in this study relate to depression, bipolar disorder, anxiety, insomnia, eating disorders, and many more.

The third page holds the main section, EEG Report-Based Detection. Here the user must enter their EEG report, primarily the graphical values of the six primary brain waves recorded during the test. For a more accurate outcome, each wave value should be the mean value recorded in the graph. The Random Forest machine learning algorithm is used to train the model. The model's output is either 1 or 0, indicating that the person may or may not have a psychological condition. The accuracy obtained is 89.43661 %, as shown in Figure 5.

Figure 6: Symptom database

Finally, the last section covers Treatment, which is an optional feature in our project. Here, the user inputs the psychiatric disorder predicted in the previous step, i.e. the fourth section, Diagnosis based on Symptoms, and the model outputs various treatment methods for the user. All the treatment methods are the result of online studies and research. The person can then get assistance by accessing the specified web URL or by consulting a medical doctor.

This concludes the foundation of this study; the following section discusses the software requirements for completing the research.

C. TECHNOLOGY USED
V. REFERENCES
[3] Ozkan, Birgul & Arguvanlı, Sibel & Sarac, Bayise &
Medik, Kadriye. (2015). Sleep Quality and Affecting Factors in
Patients With Chronic Psychiatric Disorders. Erciyes Tıp
Dergisi/Erciyes Medical Journal. 37. 10.5152/etd.2015.7837.
[4] https://www.kaggle.com/datasets/shashwatwork/eeg-psychiatric-disorders-dataset