
INTRODUCTION TO

DATA ANALYTICS
LAB ASSIGNMENT 1

NAME – ANKUR KUNDU


REG.NO. – 17BIS0020
COURSE CODE – ECE – 2033
FACULTY – Prof. DHARANI BAI G.
1. Univariate analysis of the iris data set using the basic statistical measures
ii) Results in visual form :-
> data=iris
> boxplot(data$Sepal.Length, data$Sepal.Width, data$Petal.Length, data$Petal.Width,
    main="Boxplot",
    names=c("Sepal Length","Sepal Width","Petal Length","Petal Width"),
    col=c("yellow","red","blue","purple"))
Visualization –

> data=iris
> boxplot(data$Sepal.Length[data$Species=="setosa"],
    data$Sepal.Length[data$Species=="versicolor"],
    data$Sepal.Length[data$Species=="virginica"],
    main="Sepal Length", names=c("Setosa","Versicolor","Virginica"),
    col=c("red","green","orange"))
Visualization –

> data=iris
> boxplot(data$Sepal.Width[data$Species=="setosa"],
    data$Sepal.Width[data$Species=="versicolor"],
    data$Sepal.Width[data$Species=="virginica"],
    main="Sepal Width", names=c("Setosa","Versicolor","Virginica"),
    col=c("red","green","orange"))
Visualization –

> data=iris
> boxplot(data$Petal.Length[data$Species=="setosa"],
    data$Petal.Length[data$Species=="versicolor"],
    data$Petal.Length[data$Species=="virginica"],
    main="Petal Length", names=c("Setosa","Versicolor","Virginica"),
    col=c("red","green","orange"))
Visualization –

> data=iris
> boxplot(data$Petal.Width[data$Species=="setosa"],
    data$Petal.Width[data$Species=="versicolor"],
    data$Petal.Width[data$Species=="virginica"],
    main="Petal Width", names=c("Setosa","Versicolor","Virginica"),
    col=c("red","green","orange"))
Visualization –
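The four per-species boxplots above repeat the same call with a different column each time. As a minimal sketch (not part of the original assignment), the same figures could be drawn in one loop using the formula interface of `boxplot`. The names `features`, `box_stats` and `old_par` are introduced here for illustration only, and the 2 x 2 panel layout is an assumption.

```r
# Sketch: one boxplot per measurement, grouped by species.
# `features`, `box_stats` and `old_par` are illustrative names.
features <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

old_par <- par(mfrow = c(2, 2))   # assumed layout: 2 x 2 grid of panels
box_stats <- list()
for (f in features) {
  # reformulate() builds the formula <feature> ~ Species
  box_stats[[f]] <- boxplot(reformulate("Species", response = f),
                            data = iris, main = f,
                            col = c("red", "green", "orange"))
}
par(old_par)                      # restore the previous layout
```

`boxplot` returns the five-number summaries it draws, so `box_stats$Petal.Length$stats` holds the values behind the Petal Length panel.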

iii) a) Density Plots


> data=iris
> d1=density(data$Sepal.Length[data$Species=="setosa"])
> d2=density(data$Sepal.Length[data$Species=="versicolor"])
> d3=density(data$Sepal.Length[data$Species=="virginica"])
> plot(d1, main="Sepal Length pdf", col="blue", lty=1, xlim=c(0,10))
> points(d2, col="red", pch=".")
> lines(d2, col="red", lty=2)
> points(d3, col="green", pch=".")
> lines(d3, col="green", lty=3)
> legend(0.5, 1, legend=c("setosa","versicolor","virginica"),
    col=c("blue","red","green"), lty=1:3, cex=0.8)

Visualization –

> data=iris
> d1=density(data$Sepal.Width[data$Species=="setosa"])
> d2=density(data$Sepal.Width[data$Species=="versicolor"])
> d3=density(data$Sepal.Width[data$Species=="virginica"])
> plot(d1, main="Sepal Width pdf", col="blue", lty=1, xlim=c(0,5), ylim=c(0,3))
> points(d2, col="red", pch=".")
> lines(d2, col="red", lty=2)
> points(d3, col="green", pch=".")
> lines(d3, col="green", lty=3)
> legend(0.5, 1, legend=c("setosa","versicolor","virginica"),
    col=c("blue","red","green"), lty=1:3, cex=0.8)

Visualization –
> data=iris
> d1=density(data$Petal.Length[data$Species=="setosa"])
> d2=density(data$Petal.Length[data$Species=="versicolor"])
> d3=density(data$Petal.Length[data$Species=="virginica"])
> plot(d1, main="Petal Length pdf", col="blue", lty=1, xlim=c(0,8))
> points(d2, col="red", pch=".")
> lines(d2, col="red", lty=2)
> points(d3, col="green", pch=".")
> lines(d3, col="green", lty=3)
> legend(6, 2, legend=c("setosa","versicolor","virginica"),
    col=c("blue","red","green"), lty=1:3, cex=0.8)
Visualization –

> data=iris
> d1=density(data$Petal.Width[data$Species=="setosa"])
> d2=density(data$Petal.Width[data$Species=="versicolor"])
> d3=density(data$Petal.Width[data$Species=="virginica"])
> plot(d1, main="Petal Width pdf", col="blue", lty=1, xlim=c(0,4))
> points(d2, col="red", pch=".")
> lines(d2, col="red", lty=2)
> points(d3, col="green", pch=".")
> lines(d3, col="green", lty=3)
> legend(3, 6, legend=c("setosa","versicolor","virginica"),
    col=c("blue","red","green"), lty=1:3, cex=0.8)

Visualization –
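The density plots above set `xlim`, `ylim` and the legend position by hand for each measurement. A possible alternative, sketched here under the assumption that the same colours and line types are wanted, is a small helper that derives the axis limits from the estimated densities. The function name `plot_species_density` is hypothetical and not part of the original work.

```r
# Sketch: overlay the per-species density estimates of one measurement.
# `plot_species_density` is an illustrative helper, not a base R function.
plot_species_density <- function(feature, data = iris) {
  # One kernel density estimate per species
  dens <- lapply(split(data[[feature]], data$Species), density)

  # Axis limits wide enough to show every curve completely
  x_range <- range(sapply(dens, function(d) range(d$x)))
  y_max   <- max(sapply(dens, function(d) max(d$y)))

  cols <- c("blue", "red", "green")
  plot(dens[[1]], main = paste(feature, "pdf"), col = cols[1], lty = 1,
       xlim = x_range, ylim = c(0, y_max))
  for (i in 2:length(dens)) {
    lines(dens[[i]], col = cols[i], lty = i)
  }
  legend("topright", legend = names(dens), col = cols,
         lty = seq_along(dens), cex = 0.8)
  invisible(dens)   # return the estimates for further inspection
}

dens <- plot_species_density("Petal.Length")
```

Computing the limits from all three estimates avoids clipping a curve that is taller or wider than the first one plotted.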
iii) b) Histogram Plots
> data=iris
> h1=hist(data$Sepal.Length[data$Species=="setosa"])
> h2=hist(data$Sepal.Length[data$Species=="versicolor"])
> h3=hist(data$Sepal.Length[data$Species=="virginica"])
> plot(h1, main="Sepal Length", xlim=c(0,10), ylim=c(0,20), col=rgb(0,0,1,1/4))
> plot(h2, xlim=c(0,10), col=rgb(1,0,0,1/4), add=TRUE)
> plot(h3, xlim=c(0,10), ylim=c(0,20), col=rgb(0,1,0,1/4), add=TRUE)
> data=iris
> h1=hist(data$Sepal.Width[data$Species=="setosa"])
> h2=hist(data$Sepal.Width[data$Species=="versicolor"])
> h3=hist(data$Sepal.Width[data$Species=="virginica"])
> plot(h1, main="Sepal Width", xlim=c(0,5), ylim=c(0,30), col=rgb(0,0,1,1/4))
> plot(h2, xlim=c(0,5), col=rgb(1,0,0,1/4), add=TRUE)
> plot(h3, xlim=c(0,5), ylim=c(0,20), col=rgb(0,1,0,1/4), add=TRUE)
> data=iris
> h1=hist(data$Petal.Length[data$Species=="setosa"])
> h2=hist(data$Petal.Length[data$Species=="versicolor"])
> h3=hist(data$Petal.Length[data$Species=="virginica"])
> plot(h1, main="Petal Length", xlim=c(0,10), ylim=c(0,20), col=rgb(0,0,1,1/4))
> plot(h2, xlim=c(0,10), col=rgb(1,0,0,1/4), add=TRUE)
> plot(h3, xlim=c(0,10), ylim=c(0,20), col=rgb(0,1,0,1/4), add=TRUE)
> h1=hist(data$Petal.Width[data$Species=="setosa"])
> h2=hist(data$Petal.Width[data$Species=="versicolor"])
> h3=hist(data$Petal.Width[data$Species=="virginica"])
> plot(h1, main="Petal Width", xlim=c(0,3), ylim=c(0,35), col=rgb(0,0,1,1/4))
> plot(h2, xlim=c(0,3), col=rgb(1,0,0,1/4), add=TRUE)
> plot(h3, xlim=c(0,3), col=rgb(0,1,0,1/4), add=TRUE)
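In the histogram code above, each call to `hist` picks its own bin edges, so the overlaid bars of the three species are not necessarily aligned. The following is a sketch of one way to share the bins across species. The bin width of 0.25 cm and the names `feature`, `breaks`, `hists` and `fills` are assumptions made for illustration.

```r
# Sketch: overlaid histograms with a common set of bin edges.
feature <- "Petal.Length"
values  <- split(iris[[feature]], iris$Species)

# Assumed bin width of 0.25 cm, spanning the full range of the data
breaks <- seq(floor(min(iris[[feature]])), ceiling(max(iris[[feature]])),
              by = 0.25)

# Compute the histograms without drawing them yet
hists <- lapply(values, hist, breaks = breaks, plot = FALSE)
y_max <- max(sapply(hists, function(h) max(h$counts)))

# Semi-transparent fills, as in the original plots
fills <- c(rgb(0, 0, 1, 1/4), rgb(1, 0, 0, 1/4), rgb(0, 1, 0, 1/4))
plot(hists[[1]], main = feature, xlab = "cm",
     xlim = range(breaks), ylim = c(0, y_max), col = fills[1])
for (i in 2:length(hists)) {
  plot(hists[[i]], col = fills[i], add = TRUE)
}
legend("topright", legend = names(hists), fill = fills, cex = 0.8)
```

With `plot = FALSE` the counts are available before anything is drawn, which is what lets `ylim` be set from the tallest bar of any species.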
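From these plots we see that petal width and petal length separate the
flower species more effectively than sepal length or sepal width: the
setosa values do not overlap the other two species on either petal
measurement, whereas the sepal measurements of the three species overlap
heavily. If only one parameter may be selected, petal length is the best
choice for classifying the flowers.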
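The comparison between petal and sepal measurements can also be checked numerically. The sketch below, which is an addition and not part of the original assignment, tabulates the minimum and maximum of every measurement per species. The names `features` and `species_ranges` are illustrative.

```r
# Sketch: per-species minimum and maximum of each measurement.
features <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

species_ranges <- lapply(features, function(f) {
  # Rows "min" and "max", one column per species
  sapply(split(iris[[f]], iris$Species),
         function(x) c(min = min(x), max = max(x)))
})
names(species_ranges) <- features

print(species_ranges)
```

Two species are fully separable on a measurement when the maximum of one lies below the minimum of the other, so comparing the `max` row of one column with the `min` row of the next shows where the ranges overlap.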
iv) We will consider Petal length for classifying
the flowers using univariate analysis by the
density plot.

From the graph we can conclude that if the petal length is less than
2.2 cm, the flower can be classified as setosa. If the petal length is
between 2.2 cm and 5 cm, it can be classified as versicolor; otherwise it
is classified as virginica.
The points above the green line correspond to virginica, the points below
the red line correspond to setosa, and the points between the two lines
correspond to versicolor.
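The threshold rule described above can be written as a short function and compared against the true species labels. This is a minimal sketch: the function name `classify_by_petal_length` is hypothetical, and treating a petal length of exactly 5 cm as versicolor is an assumption, since the text does not say which side the boundary belongs to.

```r
# Sketch: classify iris flowers from petal length alone.
# Thresholds 2.2 cm and 5 cm come from the text; assigning exactly 5 cm
# to versicolor is an assumption.
classify_by_petal_length <- function(petal_length) {
  ifelse(petal_length < 2.2, "setosa",
         ifelse(petal_length <= 5, "versicolor", "virginica"))
}

predicted <- factor(classify_by_petal_length(iris$Petal.Length),
                    levels = levels(iris$Species))

# Rows: true species; columns: species predicted by the rule
confusion <- table(actual = iris$Species, predicted = predicted)
accuracy  <- sum(diag(confusion)) / sum(confusion)

print(confusion)
print(accuracy)
```

The off-diagonal entries of `confusion` count the flowers the rule gets wrong, which is where the versicolor and virginica densities overlap.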
