
RUBAN RAJ M

20MIY0018

MAT1031 – Bio-Statistics
Embedded Lab Using R Statistical Software

FALL SEMESTER – 2022-2023


Slot: L53 + L54

E-RECORD
Experiment No.: 1

Submitted By

M.RUBAN RAJ
20MIY0018
M.Sc. Integrated (MIY) – III
SAS (Mathematics)

DEPARTMENT OF MATHEMATICS
SCHOOL OF ADVANCED SCIENCES
VELLORE INSTITUTE OF TECHNOLOGY
VELLORE – 632 014
TAMIL NADU
INDIA

Date: 21-SEPTEMBER-2022
Program / Problem No: 1
Experiment No.: 1
Statistical Data Analysis
Experiment Date – 15-September-2022

1. Problem Statement
Write the R code to compute the mean, median, mode, quartile deviation,
variance, standard deviation and coefficient of variation for the following
frequency distribution.
Wages (in Rs):    170-180  180-190  190-200  200-210  210-220  220-230  230-240  240-250
No. of persons:        52       68       85       92      100       95       70       28

2. R Programming

# Class mid-points (170-180, ..., 240-250) and their frequencies
x = seq(175,245,10)
f = c(52,68,85,92,100,95,70,28)

#Mean

avg = sum(f*x)/sum(f)
avg

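#Median

cf = cumsum(f)            # cumulative frequencies
N = sum(f)                # total frequency
ml = min(which(cf>N/2))   # index of the median class
h = 10                    # class width
fr = f[ml]                # frequency of the median class
c = cf[ml-1]              # cumulative frequency before the median class
l = x[ml] - h/2           # lower boundary of the median class
med = l + (((N/2)-c)/fr)*h
med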

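#Mode

h=10
m = which.max(f)    # index of the modal class (highest frequency)
f0 = f[m]           # frequency of the modal class
f1 = f[m-1]         # frequency of the preceding class
f2 = f[m+1]         # frequency of the succeeding class
l = x[m] - h/2      # lower boundary of the modal class
mod = l + ((f0-f1)/(2*f0-f1-f2))*h
mod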

#Standard Deviation

y = rep(x,f)
sd(y)

#Variance

var = (sd(y))^2   # variance is the square of the standard deviation
var

#Coefficient of Variation

coe = ((sd(y))/avg) * 100
coe

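#Quartile Deviation

ml = min(which(cf>N/4))      # index of the first-quartile class
fr = f[ml]
c = cf[ml-1]
l = x[ml] - h/2
q1 = l + (((N/4)-c)/fr)*h    # first quartile
q1
ml = min(which(cf>3*N/4))    # index of the third-quartile class
fr = f[ml]
c = cf[ml-1]
l = x[ml] - h/2
q3 = l + (((3*N/4)-c)/fr)*h  # third quartile
q3
qd = (q3-q1)/2               # quartile deviation (semi-interquartile range)
qd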

avg
med
mod
sd(y)
var
coe
qd
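
For reference, the program above applies the usual grouped-data formulas, written out here as a summary rather than a derivation. The symbols follow the code: l is the lower boundary of the relevant class, c the cumulative frequency before that class, f its frequency (fr in the code), h the class width, and f_0, f_1, f_2 the frequencies of the modal class and its two neighbours. Here s is the standard deviation returned by sd(y) on the expanded data, which uses the n-1 divisor.

```latex
\begin{aligned}
\bar{x} &= \frac{\sum f_i x_i}{N}, \qquad N = \sum f_i \\
\text{Median} &= l + \frac{N/2 - c}{f}\, h \\
\text{Mode} &= l + \frac{f_0 - f_1}{2 f_0 - f_1 - f_2}\, h \\
Q_k &= l + \frac{kN/4 - c}{f}\, h \quad (k = 1, 3), \qquad \text{QD} = \frac{Q_3 - Q_1}{2} \\
\text{Variance} &= s^2, \qquad \text{CV} = \frac{s}{\bar{x}} \times 100
\end{aligned}
```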

Output
> avg
[1] 208.9831
> med
[1] 209.7826
> mod
[1] 216.1538
> sd(y)
[1] 19.71528
> var
[1] 388.6924
> coe
[1] 9.433915
> qd
[1] 15.77709
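
The median and the two quartiles in this program use the same interpolation with different targets (N/2, N/4 and 3N/4). As a sketch only, that step could be wrapped in one function. The name grouped_quantile and its arguments are hypothetical and not part of the submitted program, and the guard for the first class is an added assumption for the case where cf[ml-1] would be empty.

```r
# Hypothetical helper: interpolated quantile for grouped data
# with equal class widths (mid = class mid-points, freq = frequencies).
grouped_quantile <- function(mid, freq, width, p) {
  cum_freq <- cumsum(freq)
  total <- sum(freq)
  k <- min(which(cum_freq > p * total))         # class containing the quantile
  before <- if (k == 1) 0 else cum_freq[k - 1]  # cumulative frequency before it
  lower <- mid[k] - width / 2                   # lower class boundary
  lower + ((p * total - before) / freq[k]) * width
}

mid  <- seq(175, 245, 10)
freq <- c(52, 68, 85, 92, 100, 95, 70, 28)

q1_chk  <- grouped_quantile(mid, freq, 10, 0.25)
med_chk <- grouped_quantile(mid, freq, 10, 0.50)
q3_chk  <- grouped_quantile(mid, freq, 10, 0.75)
qd_chk  <- (q3_chk - q1_chk) / 2

print(c(Q1 = q1_chk, Median = med_chk, Q3 = q3_chk, QD = qd_chk))
```

With the wage data, the median and quartile deviation from this helper should agree with the med and qd values printed above.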

3. Proof for Output Execution in Lab Hours

Program / Problem No: 2

Experiment No.: 1
Statistical Data Analysis
Experiment Date – 15-September-2022
1. Problem Statement
Write the R code to compute the coefficient of correlation between X and
Y from the following data.

X 21 23 30 54 57 58 72 78 87 90

Y 60 71 72 83 110 84 100 92 113 135

2. R Programming

#Q2
x = c(21,23,30,54,57,58,72,78,87,90)
y = c(60,71,72,83,110,84,100,92,113,135)
r = cor(x,y)
r

Output:
> r = cor(x,y)
> r
[1] 0.8775417
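
As a cross-check on cor(), the coefficient can also be computed directly from the deviations about the means. This is a minimal sketch of Pearson's formula; the names dx, dy and r_manual are introduced here for illustration only.

```r
x <- c(21, 23, 30, 54, 57, 58, 72, 78, 87, 90)
y <- c(60, 71, 72, 83, 110, 84, 100, 92, 113, 135)

dx <- x - mean(x)   # deviations of X from its mean
dy <- y - mean(y)   # deviations of Y from its mean

# Pearson's r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

print(r_manual)
print(all.equal(r_manual, cor(x, y)))   # compares the two results
```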

3. Proof for Output Execution in Lab Hours


Program / Problem No: 3
Experiment No.: 1
Statistical Data Analysis
Experiment Date – 15-September-2022

1. Problem Statement
Write the R code to obtain the equation of the regression line of Y on X from the
following data:
X 4.7 8.2 12.4 15.8 20.7 24.9 31.9 35.1 39.1 38.8
Y 4.0 8.0 12.5 16.0 20.0 25.0 31.0 36.0 40.0 40.0

2. R Programming

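#Q3
# Note: the eighth X value is entered as 35.0 here, while the problem
# statement lists 35.1; the output below corresponds to 35.0.
x = c(4.7,8.2,12.4,15.8,20.7,24.9,31.9,35.0,39.1,38.8)
y = c(4,8,12.5,16,20,25,31,36,40,40)
fit = lm(y~x)
summary(fit)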

Output:
> fit = lm(y~x)
> summary(fit)

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-1.3140 -0.1191  0.2321  0.3803  0.5383

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.73087    0.42971  -1.701    0.127
x            1.03589    0.01646  62.946 4.51e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6286 on 8 degrees of freedom
Multiple R-squared: 0.998, Adjusted R-squared: 0.9977
F-statistic: 3962 on 1 and 8 DF, p-value: 4.511e-12
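
Reading the Estimate column of the coefficient table, the fitted line of Y on X is approximately Y = -0.73087 + 1.03589 X. As a sketch, the same equation can be assembled from the fitted object with coef() instead of being read off the summary. The variable b and the print format are illustrative choices, and the data are entered exactly as in the program above.

```r
x <- c(4.7, 8.2, 12.4, 15.8, 20.7, 24.9, 31.9, 35.0, 39.1, 38.8)
y <- c(4, 8, 12.5, 16, 20, 25, 31, 36, 40, 40)
fit <- lm(y ~ x)

b <- coef(fit)   # named vector with entries "(Intercept)" and "x"

# Print the regression line of Y on X from the estimated coefficients
cat(sprintf("Y = %.5f + %.5f X\n", b[["(Intercept)"]], b[["x"]]))
```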
3. Proof for Output Execution in Lab Hours

Program / Problem No: 4


Experiment No.: 1
Statistical Data Analysis
Experiment Date – 15-September-2022

1. Problem Statement
Write the R code to obtain the equation of multiple regression plane of Y on
X1 and X2 from the following data:

X1 30 40 20 50 60 40 20 60

X2 11 10 7 15 19 12 8 14

Y 110 80 70 120 150 90 70 120

2. R Programming

#Q4
x1 = c(30,40,20,50,60,40,20,60)
x2 = c(11,10,7,15,19,12,8,14)
y = c(110,80,70,120,150,90,70,120)
fit = lm(y~x1+x2)
summary(fit)

Output:
> fit = lm(y~x1+x2)
> summary(fit)

Call:
lm(formula = y ~ x1 + x2)

Residuals:
      1       2       3       4       5       6       7       8
 14.157  -5.552   3.110  -2.355  -1.308 -11.250  -4.738   7.936

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.8314    11.8290   1.423   0.2140
x1           -0.2442     0.5375  -0.454   0.6687
x2            7.8488     2.1945   3.577   0.0159 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.593 on 5 degrees of freedom
Multiple R-squared: 0.9191, Adjusted R-squared: 0.8867
F-statistic: 28.4 on 2 and 5 DF, p-value: 0.001862
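
From the Estimate column, the fitted regression plane is approximately Y = 16.8314 - 0.2442 X1 + 7.8488 X2. The sketch below builds that equation from coef() and uses predict() for one new point. The values x1 = 45 and x2 = 13 are hypothetical and chosen only to illustrate the call; they do not come from the problem.

```r
x1 <- c(30, 40, 20, 50, 60, 40, 20, 60)
x2 <- c(11, 10, 7, 15, 19, 12, 8, 14)
y  <- c(110, 80, 70, 120, 150, 90, 70, 120)
fit <- lm(y ~ x1 + x2)

b <- coef(fit)   # "(Intercept)", "x1", "x2"
cat(sprintf("Y = %.4f + (%.4f) X1 + (%.4f) X2\n",
            b[["(Intercept)"]], b[["x1"]], b[["x2"]]))

# Hypothetical new observation, used only to show how predict() is called
new_point <- data.frame(x1 = 45, x2 = 13)
y_hat <- predict(fit, newdata = new_point)
print(y_hat)
```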

3. Proof for Output Execution in Lab Hours


Program / Problem No: 5
Experiment No.: 1
Statistical Data Analysis
Experiment Date – 15-September-2022

1. Problem Statement
Write the R code to construct the general linear regression model for Y on X1
and X2 from the following data:

X1 30 40 20 50 60 40 20 60

X2 11 10 7 15 19 12 8 14

Y 0.10 0.80 0.70 0.30 0.50 0.90 0.70 0.20

2. R Programming

#Q5
x1 = c(30,40,20,50,60,40,20,60)
x2 = c(11,10,7,15,19,12,8,14)
y = c(0.10,0.80,0.70,0.30,0.50,0.90,0.70,0.20)
fit = glm(y~x1+x2,family="poisson")
summary(fit)

Output:
> fit = glm(y~x1+x2,family="poisson")
> summary(fit)
Call:
glm(formula = y ~ x1 + x2, family = "poisson")

Deviance Residuals:
       1         2         3         4         5         6         7         8
-0.73820   0.27001   0.01094  -0.20714   0.26451   0.48777   0.06313  -0.43284

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.0546672 1.7414786 0.031 0.975
x1 0.0009346 0.0800928 0.012 0.991
x2 -0.0633062 0.3422224 -0.185 0.853

(Dispersion parameter for poisson family taken to be 1)


Null deviance: 1.3474 on 7 degrees of freedom
Residual deviance: 1.1601 on 5 degrees of freedom
AIC: Inf
Number of Fisher Scoring iterations: 4
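
The Poisson family is intended for non-negative integer counts. With the fractional Y values here, R typically warns about non-integer responses and reports the AIC as Inf, as in the output above. If "general linear regression model" in the problem statement is read as the ordinary (Gaussian) linear model, which is an assumption about the intended meaning, a minimal sketch would use the gaussian family. The names fit_gauss and fit_lm are introduced here for illustration.

```r
x1 <- c(30, 40, 20, 50, 60, 40, 20, 60)
x2 <- c(11, 10, 7, 15, 19, 12, 8, 14)
y  <- c(0.10, 0.80, 0.70, 0.30, 0.50, 0.90, 0.70, 0.20)

# Gaussian family with its default identity link:
# this gives the same coefficients as lm(y ~ x1 + x2)
fit_gauss <- glm(y ~ x1 + x2, family = gaussian)
fit_lm    <- lm(y ~ x1 + x2)

print(coef(fit_gauss))
print(all.equal(coef(fit_gauss), coef(fit_lm)))
print(AIC(fit_gauss))   # finite, unlike the Poisson fit on fractional data
```

Which family is appropriate depends on what Y represents; if the values are proportions, a different family may be a better fit than either of these.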

3. Proof for Output Execution in Lab Hours
