Parkinson's Disease Prediction Using SVM and Logistic Regression Algorithm
A PROJECT REPORT
Submitted by
of
BACHELOR OF ENGINEERING
in
APRIL 2021
ANNA UNIVERSITY :: CHENNAI 600 025
BONAFIDE CERTIFICATE
……………………….. …………………………
SIGNATURE SIGNATURE
Dr.A.VELAYUDHAM M.E., Ph.D Ms.M.PAVITHRA M.E
HEAD OF THE DEPARTMENT SUPERVISOR
………………………. ………………………
INTERNAL EXAMINER EXTERNAL EXAMINER
ACKNOWLEDGEMENT
We would like to express our sincere thanks to the honourable Chairman Rtn.
MPHF. Shri. T.S. NATARAJAN and Vice Chairmen Mr. T.N. KALAIMANI &
Mr. T.N. THIRUKUMAR for providing all the facilities to do the project in the
college campus.
We extend our sincere thanks to all technical and non-technical staff
members of our department who helped us in all aspects throughout this project.
We also thank the GOD ALMIGHTY for giving us the courage and all that was
needful to fulfil this project.
TABLE OF CONTENTS
CHAPTER NO. TITLE PAGE NO.
ABSTRACT
LIST OF FIGURES
1 INTRODUCTION 1
1.1 OBJECTIVE 5
1.2 PROBLEM DEFINITION 5
2 LITERATURE SURVEY 6
3 SYSTEM ANALYSIS 11
3.2 DISADVANTAGES 11
4 SYSTEM DESIGN 12
5 SYSTEM REQUIREMENTS AND LANGUAGE 33
6 CONCLUSION 43
REFERENCES 46
APPENDIX 47
SAMPLE CODE 48
LIST OF FIGURES
FIGURE NO. TITLE PAGE NO.
1 ARCHITECTURE DIAGRAM 13
2 DATASET 35
3 GRAPHICAL REPRESENTATION OF DISEASE’S PREDICTION 36
ABSTRACT
This project aims at detecting Parkinson’s disease (PD) through data mining.
Since there is no standard test to detect parkinsonism, we propose a statistical
approach using the most common symptoms of PD, which are gait disturbance,
tremors and micrographia. This includes analysing the correlation between the
symptoms and classifying the resulting data using different classification
algorithms, in order to find the algorithm that gives the highest accuracy in
diagnosing PD patients.
CHAPTER 1
INTRODUCTION
Machine learning is computational learning that uses algorithms to learn from,
and make predictions on, data. Machine learning is a powerful technology that
lets analysts focus on the most important information in their data warehouse.
The support vector machine (SVM) is used here to extract the featured attributes
from the dataset.
Sequential Minimal Optimization (SMO) with polynomial kernels is used to predict
the rate of disease in a graphical manner. A type of machine learning, SVM
allows categorization of an individual's previously unseen data into a
predefined group using a classification algorithm developed on a training data set.
In recent years, SVM has been successfully applied in the context of disease
diagnosis, transition prediction and treatment prognosis, using both structural and
functional neuroimaging data.
Standard univariate analysis of neuroimaging data has revealed a host of
neuroanatomical and functional differences between healthy individuals and
patients suffering from a wide range of neurological and psychiatric disorders.
However, because these findings are significant only at the group level, they
have had limited clinical translation, and recent attention has turned toward
alternative forms of analysis, including the support vector machine.
Parkinson's disease (PD) is a neurodegenerative disease which often
affects patients' movement. Currently, PD is diagnosed via various
neurological examinations by specialists. The most common symptoms of PD
are tremor, gait disturbance, stiffness, and slowness.
Through this project we try to correlate different symptoms in order to
increase the accuracy of diagnosing Parkinson’s. The dataset includes
features such as jitter and stride. These data are analysed using different
classification techniques, thus providing a reliable and accurate approach to
diagnosing Parkinson’s at an early stage.
1.1 OBJECTIVE
CHAPTER 2
LITERATURE SURVEY
Mehrbakhsh Nilashi et al [3]
Dragana Miljkovic et al [6]
Ramzi M. Sadek et al [9]
“Parkinson’s Disease Prediction using Artificial Neural Network”: In this
system, the 195 samples in the dataset were divided into 170 training samples
and 25 validation samples. After importing the dataset into the Just Neural
Network (JNN) environment, the Artificial Neural Network model was trained and
validated. The attributes contributing most to the ANN model were identified.
The ANN model was reported to be 100% accurate. Table 1: Summary of Literature.
2.2 Saykin, A. 1., Shen, L., Foroud, T. M., Potkin, S. G., Swaminathan,
S., Kim, S., et al. "Alzheimer's disease Neuroimaging Initiative
biomarkers as quantitative phenotypes: genetics core aims, progress,
and plans". Alzheimers Dement. Vol. 6, No.3, pp. 265- 273, 2010.
Genome-wide array data have been publicly released and updated, and
several neuroimaging GWAS have recently been reported examining
baseline magnetic resonance imaging measures as quantitative phenotypes.
Other preliminary investigations include copy number variation in mild
cognitive impairment and Alzheimer’s disease and GWAS of baseline
cerebrospinal fluid biomarkers and longitudinal changes on magnetic
resonance imaging.
Blood collection for RNA studies is a new direction. Genetic studies of
longitudinal phenotypes hold promise for elucidating disease mechanisms
and risk, development of therapeutic strategies, and refining selection
criteria for clinical trials.
CHAPTER 3
SYSTEM ANALYSIS
Clinical decisions are often made based on doctors’ intuition and experience
rather than on the knowledge-rich data hidden in the database. This practice
leads to unwanted biases, errors and excessive medical costs, which affect the
quality of service provided to patients.
❖ Use of wearable technologies through the implementation of Internet of
things.
3.2 DISADVANTAGES
CHAPTER 4
SYSTEM DESIGN
❖ Individual analysis of every symptom has some drawback attached to it: for
example, handwriting is a complex activity in which other factors can influence
motor movement; speech recognition requires additional steps such as noise
removal and speech segmentation; and using breath samples has proved to fail to
meet clinically relevant results.
ADVANTAGES
❖ No requirement for any special sensors and no need to solve the typical
problems of acoustic signal acquisition and processing.
ARCHITECTURAL DESIGN
SYSTEM ARCHITECTURE
Figure: Overall Architecture (Dataset → Pre-process → Apply ML → SVM algorithm /
Logistic regression → Predicted result)
Introduction
Computer Aided Diagnosis is a rapidly growing, dynamic area of research in the
medical industry. Recent research in Machine Learning promises improved
prediction accuracy. Here, computers are enabled to think by developing
intelligence through learning. There are many types of Machine Learning
techniques, which are used to classify data sets.
Requirement Analysis
Functional Requirements
Product Perspective
The application is developed in such a way that any future enhancement can be
easily implemented, and the project requires minimal maintenance. The software
used is open source and easy to install. The application developed should be
easy to install and use.
Product features
User characteristics
❖ Easy to use
❖ Error free
Domain Requirements
This document is the only one that describes the requirements of the system. It
is meant for use by the developers, and will also be the basis for validating
the final delivered system. Any changes made to the requirements in the future
will have to go through a formal change approval process.
User Requirements
Efficiency:
Less time for detection and prediction
Reliability:
Maturity, fault tolerance and recoverability
Portability:
How easily the software can be transferred to another environment,
including installability.
Usability:
How easy it is to understand, learn and operate the software system
Organizational Requirements:
Some of the available ports should not be blocked by the Windows firewall.
An Internet connection should be available.
Implementation Requirements
User Interfaces
The user interface is developed in Python, and takes inputs such as the
patient dataset.
Hardware Interfaces
Ethernet
Ethernet on the AS/400 supports TCP/IP, Advanced Peer-to-Peer
Networking (APPN) and advanced program-to-program communications (APPC).
ISDN
An Integrated Services Digital Network (ISDN) connection lets the AS/400 achieve
faster, more accurate data transmission. An ISDN is a public or private digital
communications network that can support data, fax, image, and other services
over the same physical interface. Other protocols, such as IDLC and X.25, can
also be used on ISDN.
Software Interfaces
No specific software interface is used.
Operational Requirements
• Economic
The developed product is economical, as it does not require any hardware
interface.
• Environmental
Statements of fact and assumptions that define the expectations of the
system in terms of mission objectives, environment, constraints, and measures of
effectiveness and suitability (MOE/MOS). The customers are those that perform the
eight primary functions of systems engineering, with special emphasis on the
operator as the key customer.
If a system accommodates software of a high integrity level, then that system
should not at the same time accommodate software of a lower integrity level.
Systems with different requirements for safety levels must be separated.
Otherwise, the highest level of integrity required must be applied to all
systems in the same environment.
MODULE DESCRIPTION:
DATA PRE-PROCESSING
We have taken multiple symptoms in our case study, combining the patients’
dataset with the speech and keystroke datasets. Pre-processing of the dataset
converts the string attributes to numerals, and records with missing data are
dropped. The pre-processed data is stored in the “newdata.csv” file, which is
given as input to the machine learning models.
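This pre-processing step can be sketched as follows; the column names, values
and label used here are invented for illustration and are not the project’s
actual schema:

```python
import pandas as pd

# Hypothetical raw records: one string attribute plus numeric measurements
raw = pd.DataFrame({
    "hand": ["L", "R", None, "L"],          # string attribute to convert
    "hold_time": [120.0, 95.5, 110.2, None],
    "status": [1, 0, 1, 0],                 # label column
})

# Convert the string attribute to numerals; unmapped/missing values become NaN
raw["hand"] = raw["hand"].map({"L": 0, "R": 1})

# Drop records with missing data
clean = raw.dropna()

# Persist the pre-processed data for the learning models
clean.to_csv("newdata.csv", index=False)
```

Only the two complete records survive the `dropna()` call, and the `hand`
column is now numeric.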
Support Vector Machine
Split our dataset into train and test sets and fit the dataset to the LinearSVC
model, as given below.
X_train, X_test, Y_train, Y_test = train_test_split(X_all, y_all)
clf = svm.LinearSVC()
clf.fit(X_train,Y_train)
pred = clf.predict(X_test)
Logistic Regression
Split our dataset to train and test set and fit the dataset to Logistic regression
model, as given below.
clf = LogisticRegression()
clf.fit(X_train,Y_train)
pred = clf.predict(X_test)
result2=open("Output/resultLogisticRegression.csv","w")
result2.write("ID,Predicted Value" + "\n")
for j in range(len(pred)):
    result2.write(str(j+1) + "," + str(pred[j]) + "\n")
result2.close()
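The split-and-fit step described above can be sketched end to end as follows,
using synthetic two-class data in place of the project’s pre-processed dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic features standing in for the pre-processed Parkinson's data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # linearly separable labels

# Split the dataset into train and test sets, then fit the model
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
clf = LogisticRegression()
clf.fit(X_train, Y_train)
pred = clf.predict(X_test)

print(clf.score(X_test, Y_test))  # accuracy on the held-out split
```

Because the labels here are a linear function of the features, the held-out
accuracy is high; real symptom data will be noisier.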
Output is stored in csv as below:
Flow chart of the Logistic Regression algorithm: Start → Sigmoid Function → End
DATA FLOW DIAGRAM
Level 0: Input values → Regressor/classifier → Trained model; Test input →
Trained model → Predicted result
Level 1: Input values → Regression, SVM → Trained model; Test input →
Trained model → Predicted result
UML DIAGRAM
Dataset CSV → Train model → Test input → Apply algorithm → Predicted results
SEQUENCE DIAGRAM
Load → Apply → Analyze
ACTIVITY DIAGRAM
Input dataset → Data pre-process → Data split (Trainset / Testset) →
Train model (ML) → Predicted result
ER DIAGRAM
The Dataset entity undergoes a vocal frequency check (MDVP:Fo, MDVP:Fhi,
MDVP:Flo), a frequency variation check (MDVP:Jitter, MDVP:Shimmer) and a vocal
component check (HNR, NHR); the keystroke attributes (latency, hold time, hand,
direction) feed the Result.
IMPLEMENTATION DETAILS
DATASET
We considered multiple symptoms of patients, such as speech and keystroke data,
for Parkinson’s disease prediction.
Date : YYMMDD
Timestamp : HH:MM:SS.SSS
Hand : L or R key pressed
Hold time : Time between press and release for the current key (milliseconds)
Direction : Previous to current key: LL, LR, RL, RR (and S for a space key)
Latency time : Time between pressing the previous key and pressing the current key (milliseconds)
Flight time : Time between release of the previous key and press of the current key (milliseconds)
CHAPTER 5
SYSTEM REQUIREMENTS
HARDWARE REQUIREMENTS
Processor : Any Processor above 500 MHz.
Ram : 4 GB
Hard Disk : 4 GB
Input device : Standard Keyboard and Mouse.
Output device : VGA and High Resolution Monitor.
SOFTWARE REQUIREMENTS
Operating System : Windows 7 or higher
Programming : Python 3.6 and related libraries
5.2 The Software Description
PYTHON
Python is an interpreted high-level programming language for general-
purpose programming. Created by Guido van Rossum and first released in 1991,
Python has a design philosophy that emphasizes code readability, notably using
significant whitespace. It provides constructs that enable clear programming on both
small and large scales.
Python features a dynamic type system and automatic memory
management. It supports multiple programming paradigms, including object-
oriented, imperative, functional and procedural, and has a large and comprehensive
standard library.
Python interpreters are available for many operating systems. CPython, the
reference implementation of Python, is open source software and has a community-
based development model, as do nearly all of its variant implementations. CPython is
managed by the non-profit Python Software Foundation.
Python’s design offers some support for functional programming in the Lisp
tradition. It has filter(), map(), and reduce() functions; list comprehensions,
dictionaries, and sets; and generator expressions. The standard library has two
modules (itertools and functools) that implement functional tools borrowed from
Haskell and Standard ML.
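These functional tools can be exercised in a few lines:

```python
from functools import reduce
from itertools import accumulate

nums = [1, 2, 3, 4, 5]

evens = list(filter(lambda n: n % 2 == 0, nums))   # keep even elements
squares = list(map(lambda n: n * n, nums))         # apply a function elementwise
total = reduce(lambda a, b: a + b, nums)           # fold the list to one value
comp = [n * n for n in nums if n % 2 == 1]         # list comprehension
running = list(accumulate(nums))                   # itertools: running sums

print(evens, squares, total, comp, running)
```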
The language's core philosophy is summarized in the document The Zen of
Python (PEP 20), which includes aphorisms such as:
➢ Readability counts
Rather than having all of its functionality built into its core, Python was
designed to be highly extensible. This compact modularity has made it particularly
popular as a means of adding programmable interfaces to existing applications. Van
Rossum's vision of a small core language with a large standard library and easily
extensible interpreter stemmed from his frustrations with ABC, which espoused the
opposite approach.
Python's developers strive to avoid premature optimization, and reject patches to
non-critical parts of CPython that would offer marginal increases in speed at the cost
of clarity. When speed is important, a Python programmer can move time-critical
functions to extension modules written in languages such as C, or use PyPy, a just-
in-time compiler. Cython is also available, which translates a Python script into C
and makes direct C-level API calls into the Python interpreter.
Since the name's storage location doesn't contain the indicated value, it is
improper to call it a variable. Names may be subsequently rebound at any time to
objects of greatly varying types, including strings, procedures, complex objects with
data and methods, etc. Successive assignments of a common value to multiple
names, e.g., x = 2; y = 2; z = 2 result in allocating storage to (at most)
three names and one numeric object, to which all three names are bound. Since a
name is a
generic reference holder it is unreasonable to associate a fixed data type with it.
However at a given time a name will be bound to some object, which will have a
type; thus there is dynamic typing.
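The rebinding behaviour described above can be demonstrated directly; whether
the three names share one int object is a CPython implementation detail, not a
language guarantee:

```python
x = 2; y = 2; z = 2
# CPython typically caches small integers, so all three names may end up
# bound to a single int object (an implementation detail)
same_object = x is y is z

x = "spam"               # the same name rebound to an object of another type
current_type = type(x)   # the object, not the name, carries the type

print(same_object, current_type)
```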
Expressions
Some Python expressions are similar to languages such as C and Java, while
some are not:
Addition, subtraction, and multiplication are the same, but the behavior of
division differs. There are two types of division in Python: floor division
(//) and true division (/). Python also added the ** operator for exponentiation.
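For example:

```python
print(7 / 2)     # true division always yields a float: 3.5
print(7 // 2)    # floor division: 3
print(-7 // 2)   # floors toward negative infinity: -4
print(2 ** 10)   # exponentiation: 1024
```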
From Python 3.5, the new @ infix operator was introduced. It is intended to
be used by libraries such as NumPy for matrix multiplication.
In Python, == compares by value, versus Java, which compares numerics
by value and objects by reference. (Value comparisons in Java on objects can be
performed with the equals() method.) Python's is operator may be used to compare
object identities (comparison by reference). In Python, comparisons may be chained,
for example a <= b <= c.
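For example:

```python
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b)        # True  — comparison by value
print(a is b)        # False — identity comparison: two distinct list objects
x = 5
print(1 <= x <= 10)  # True  — chained comparison, like 1 <= x and x <= 10
```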
Python uses the words and, or, not for its boolean operators rather than the
symbolic &&, ||, ! used in Java and C.
Python has a type of expression termed a list comprehension. Python 2.4
extended list comprehensions into a more general expression termed a generator
expression.
Python makes a distinction between lists and tuples. Lists are written as [1,
2, 3], are mutable, and cannot be used as the keys of dictionaries (dictionary keys
must be immutable in Python). Tuples are written as (1, 2, 3), are immutable and
thus can be used as the keys of dictionaries, provided all elements of the tuple are
immutable. The + operator can be used to concatenate two tuples, which does not
directly modify their contents, but rather produces a new tuple containing the
elements of both provided tuples. Thus, given the variable t initially equal to (1, 2,
3), executing t = t + (4, 5) first evaluates t + (4, 5), which yields (1, 2, 3, 4, 5), which
is then assigned back to t, thereby effectively "modifying the contents" of t, while
conforming to the immutable nature of tuple objects. Parentheses are optional for
tuples in unambiguous contexts.
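The tuple behaviour described above looks like this in practice:

```python
t = (1, 2, 3)
t = t + (4, 5)       # t + (4, 5) builds a new tuple; t is rebound to it
print(t)             # (1, 2, 3, 4, 5)

d = {t: "ok"}        # a tuple of immutables is a valid dictionary key
try:
    d[[1, 2]] = "no" # a list is mutable, hence unhashable
except TypeError:
    print("lists cannot be dictionary keys")
```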
Triple-quoted strings begin and end with a series of three single or double
quote marks. They may span multiple lines and function like here documents in
shells, Perl and Ruby. Raw strings are denoted by prefixing the string literal
with an r. Escape sequences are not interpreted; hence raw strings are useful
where literal backslashes are common, such as regular expressions and
Windows-style paths. Compare "@-quoting" in C#.
Python has array index and array slicing expressions on lists, denoted as
a[key], a[start:stop] or a[start:stop:step]. Indexes are zero-based, and negative
indexes are relative to the end. Slices take elements from the start index up
to, but not including, the stop index. The third slice parameter, called step
or stride, allows
elements to be skipped and reversed. Slice indexes may be omitted, for example a[:]
returns a copy of the entire list. Each element of a slice is a shallow copy.
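For example:

```python
a = [10, 20, 30, 40, 50]
print(a[1])      # 20 — zero-based indexing
print(a[-1])     # 50 — negative indexes count from the end
print(a[1:4])    # [20, 30, 40] — stop index excluded
print(a[::2])    # [10, 30, 50] — step parameter skips elements
print(a[::-1])   # [50, 40, 30, 20, 10] — negative step reverses
b = a[:]         # shallow copy of the entire list
```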
Methods
Methods on objects are functions attached to the object's class; the syntax
instance.method(argument) is, for normal methods and functions, syntactic sugar for
Class.method(instance, argument). Python methods have an explicit self parameter to
access instance data, in contrast to the implicit self (or this) in some other object-
oriented programming languages.
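A minimal illustration of the explicit self parameter and the desugared call
(the Greeter class is invented for this example):

```python
class Greeter:
    def __init__(self, name):
        self.name = name          # explicit self gives access to instance data

    def greet(self, salutation):
        return f"{salutation}, {self.name}"

g = Greeter("Ada")
print(g.greet("Hello"))            # normal method call
print(Greeter.greet(g, "Hello"))   # the desugared equivalent
```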
Typing
Python uses duck typing and has typed objects but untyped variable names.
Type constraints are not checked at compile time; rather, operations on an object
may fail, signifying that the given object is not of a suitable type. Despite being
dynamically typed, Python is strongly typed, forbidding operations that are not well-
defined (for example, adding a number to a string) rather than silently attempting to
make sense of them.
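Both behaviours can be shown in a few lines (the Duck class is invented for
this example):

```python
def describe(obj):
    # duck typing: any object providing a quack() method is acceptable
    return obj.quack()

class Duck:
    def quack(self):
        return "quack"

print(describe(Duck()))   # the name obj is untyped; the object is typed

try:
    3 + "spam"            # strong typing: no silent coercion
except TypeError:
    print("TypeError")    # the ill-defined operation fails loudly
```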
Python allows programmers to define their own types using classes, which are
most often used for object-oriented programming. New instances of classes are
constructed by calling the class (for example, SpamClass() or EggsClass()), and the
classes are instances of the metaclass type (itself an instance of itself), allowing
metaprogramming and reflection.
CHAPTER 6
CONCLUSION
Algorithm : Accuracy (%)
SVM : 81.81
Logistic Regression : 88.63
The following charts show the error values: mean squared error (MSE), mean
absolute error (MAE), root mean square error (RMSE) and the R-squared value.
CONCLUSION
FUTURE ENHANCEMENTS
In future work, we plan to increase our subject pool and utilize optimal
feature selection strategies under MIL frameworks for developing robust person-
specific models. These techniques can potentially be adapted to various other
physiological sensing and monitoring applications as well.
REFERENCES
[1] Dragana Miljkovic et al, “Machine Learning and Data Mining Methods for
Managing Parkinson’s Disease” LNAI 9605, pp 209-220, 2016.
[2] Arvind Kumar Tiwari, “Machine Learning based Approaches for
Prediction of Parkinson’s Disease,” Machine Learning and Applications- An
International Journal (MLAU) vol. 3, June 2016.
[3] Dr. Anupam Bhatia and Raunak Sulekh, “Predictive Model for
Parkinson’s Disease through Naive Bayes Classification” International Journal
of Computer Science & Communication vol. 9, March 2018.
[4] M. Abdar and M. Zomorodi-Moghadam, “Impact of Patients’ Gender on
Parkinson’s disease using Classification Algorithms” Journal of AI and Data
Mining, vol. 6, 2018.
[5] Md. Redone Hassan et al, “A Knowledge Base Data Mining based on
Parkinson’s Disease” International Conference on System Modelling &
Advancement in Research Trends, 2019.
APPENDIX
SAMPLE SCREENSHOT
Fig: Dataset
SAMPLE CODE
DATA PREPROCESS:
import os
import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn import linear_model, datasets
from sklearn import svm
newdata=[]
#print(newdata)
df_main = pd.read_table("Output/newdata.csv", sep=',')
df_main = df_main.astype(float)
# Normalize values to range [0:1]
df_main /= df_main.max()
# split data into independent and dependent variables
y_all=df_main.iloc[:,16]
X_all = df_main.drop(df_main.columns[[16]], axis=1)
ncols=3
plt.clf()
f = plt.figure(1)
f.suptitle(" Data Histograms", fontsize=12)
vlist = list(df_main.columns)
nrows = len(vlist) // ncols
if len(vlist) % ncols > 0:
    nrows += 1
for i, var in enumerate(vlist):
    plt.subplot(nrows, ncols, i+1)
    plt.hist(df_main[var].values, bins=15)
    plt.title(var, fontsize=10)
    plt.tick_params(labelbottom=False, labelleft=False)
plt.tight_layout()
plt.subplots_adjust(top=0.88)
plt.show()
import os
import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn import linear_model, datasets
from sklearn import svm
mse=[]
mae=[]
rsq=[]
rmse=[]
acy=[]
df_main = pd.read_table("Output/newdata.csv", sep=',')
df_main = df_main.astype(float)
# Normalize values to range [0:1]
df_main /= df_main.max()
# split data into independent and dependent variables
y_all=df_main.iloc[:,16]
X_all = df_main.drop(df_main.columns[[16]], axis=1)
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score, accuracy_score
X_train, X_test, Y_train, Y_test = train_test_split(X_all, y_all)
clf = svm.LinearSVC()
clf.fit(X_train,Y_train)
pred = clf.predict(X_test)
result2=open("Output/resultSVM.csv","w")
result2.write("ID,Predicted Value" + "\n")
for j in range(len(pred)):
    result2.write(str(j+1) + "," + str(pred[j]) + "\n")
result2.close()
print("---------------------------------------------------------")
print("MSE VALUE FOR SVM IS %f " % mean_squared_error(Y_test, pred))
print("MAE VALUE FOR SVM IS %f " % mean_absolute_error(Y_test, pred))
print("R-SQUARED VALUE FOR SVM IS %f " % r2_score(Y_test, pred))
rms = np.sqrt(mean_squared_error(Y_test, pred))
print("RMSE VALUE FOR SVM IS %f " % rms)
ac=accuracy_score(Y_test,pred) * 100
print ("ACCURACY VALUE SVM IS %f" % ac)
print("---------------------------------------------------------")
mse.append(mean_squared_error(Y_test, pred))
mae.append(mean_absolute_error(Y_test, pred))
rsq.append(r2_score(Y_test, pred))
rmse.append(rms)
acy.append(ac)
# Fit Logistic Regression on the same split before computing its metrics
clf = LogisticRegression()
clf.fit(X_train,Y_train)
pred = clf.predict(X_test)
print("---------------------------------------------------------")
print("MSE VALUE FOR Logistic Regression IS %f " % mean_squared_error(Y_test, pred))
print("MAE VALUE FOR Logistic Regression IS %f " % mean_absolute_error(Y_test, pred))
print("R-SQUARED VALUE FOR Logistic Regression IS %f " % r2_score(Y_test, pred))
rms = np.sqrt(mean_squared_error(Y_test, pred))
print("RMSE VALUE FOR Logistic Regression IS %f " % rms)
ac=accuracy_score(Y_test,pred) * 100
print ("ACCURACY VALUE Logistic Regression IS %f" % ac)
print("---------------------------------------------------------")
mse.append(mean_squared_error(Y_test, pred))
mae.append(mean_absolute_error(Y_test, pred))
rsq.append(r2_score(Y_test, pred))
rmse.append(rms)
acy.append(ac)
al = ['SVM','Logistic Regression']
result2=open('Output/MSE.csv', 'w')
result2.write("Algorithm,MSE" + "\n")
for i in range(0,len(mse)):
    result2.write(al[i] + "," +str(mse[i]) + "\n")
result2.close()
result2=open('Output/MAE.csv', 'w')
result2.write("Algorithm,MAE" + "\n")
for i in range(0,len(mae)):
    result2.write(al[i] + "," +str(mae[i]) + "\n")
result2.close()
colors = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#8c564b"]
fig = plt.figure(0)
df = pd.read_csv('Output/MAE.csv')
acc = df["MAE"]
alc = df["Algorithm"]
plt.bar(alc,acc,align='center', alpha=0.5,color=colors)
plt.xlabel('Algorithm')
plt.ylabel('MAE')
plt.title('MAE Value')
fig.savefig('Output/MAE.png')
plt.show()
result2=open('Output/R-SQUARED.csv', 'w')
result2.write("Algorithm,R-SQUARED" + "\n")
for i in range(0,len(rsq)):
    result2.write(al[i] + "," +str(rsq[i]) + "\n")
result2.close()
fig = plt.figure(0)
df = pd.read_csv('Output/R-SQUARED.csv')
acc = df["R-SQUARED"]
alc = df["Algorithm"]
colors = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#8c564b"]
plt.bar(alc,acc,align='center', alpha=0.5,color=colors)
plt.xlabel('Algorithm')
plt.ylabel('R-SQUARED')
plt.title('R-SQUARED Value')
fig.savefig('Output/R-SQUARED.png')
plt.show()
result2=open('Output/RMSE.csv', 'w')
result2.write("Algorithm,RMSE" + "\n")
for i in range(0,len(rmse)):
    result2.write(al[i] + "," +str(rmse[i]) + "\n")
result2.close()
fig = plt.figure(0)
df = pd.read_csv('Output/RMSE.csv')
acc = df["RMSE"]
alc = df["Algorithm"]
plt.bar(alc, acc, align='center', alpha=0.5,color=colors)
plt.xlabel('Algorithm')
plt.ylabel('RMSE')
plt.title('RMSE Value')
fig.savefig('Output/RMSE.png')
plt.show()
result2=open('Output/Accuracy.csv', 'w')
result2.write("Algorithm,Accuracy" + "\n")
for i in range(0,len(acy)):
    result2.write(al[i] + "," +str(acy[i]) + "\n")
result2.close()
fig = plt.figure(0)
df = pd.read_csv('Output/Accuracy.csv')
acc = df["Accuracy"]
alc = df["Algorithm"]
plt.bar(alc, acc, align='center', alpha=0.5,color=colors)
plt.xlabel('Algorithm')
plt.ylabel('Accuracy')
plt.title('Accuracy Value')
fig.savefig('Output/Accuracy.png')
plt.show()