Digital Designing

EXPERIMENT 2

CORRELATION AND REGRESSION


REVATHI V
21BEC0877

AIM:
To compute correlation coefficients and fit regression models in R for the data given in
the following problems.

MATHEMATICAL FORMULA:
Karl Pearson’s Coefficient of Correlation:

Spearman's Rank Correlation Coefficient:

Kendall's Coefficient of Concurrent Deviations:
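
The formulas under these three headings are not shown here. As a hedged reference, the standard textbook forms are sketched below, assuming n paired observations (x_i, y_i), rank differences d_i with no ties, and n_c and n_d as the numbers of concordant and discordant pairs. The Kendall form shown is the tau statistic that cor.test reports with method="kendall"; the symbols are assumptions and may differ from the notation used in the course.

```latex
% Karl Pearson's coefficient of correlation
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}

% Spearman's rank correlation coefficient (no tied ranks)
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)}

% Kendall's tau (no ties)
\tau = \frac{n_c - n_d}{\tfrac{1}{2}\, n (n - 1)}
```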

R SYNTAX:
1. plot(x,y) : plots y against x (x on the horizontal axis)
2. cor.test(x,y) : tests the correlation between x and y and gives the correlation coefficient
(by default, Pearson's correlation coefficient)
3. cor.test(x,y,method="spearman") : gives Spearman's rank correlation coefficient
4. cor.test(x,y,method="kendall") : gives Kendall's correlation coefficient
5. lm(X~Y) : fits the regression of X on Y and gives the regression coefficients
6. data.frame(Y, X1, X2) : builds a data table with Y, X1 and X2 as columns
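
A minimal sketch of how the calls listed above fit together. The vectors hours and marks are hypothetical illustration data and are not taken from any problem in this experiment.

```r
# Hypothetical illustration data (an assumption, not from the lab problems)
hours <- c(1, 2, 3, 4, 5, 6, 7, 8)
marks <- c(35, 41, 48, 50, 61, 64, 72, 79)

# Scatter plot of marks against hours
plot(hours, marks)

# cor.test() returns a list; the coefficient itself is in $estimate
pearson  <- cor.test(hours, marks)                      # Pearson (default)
spearman <- cor.test(hours, marks, method = "spearman") # Spearman's rho
kendall  <- cor.test(hours, marks, method = "kendall")  # Kendall's tau

# Regression of marks on hours; coef() gives intercept and slope
fit <- lm(marks ~ hours)
coef(fit)

# Collect the variables as columns of one table
study <- data.frame(marks, hours)
```

Storing the result of cor.test() in a variable makes it easy to pull out the coefficient ($estimate) and the p-value ($p.value) separately.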
PROBLEM 1:

R CODE:
> x = c(15,25,35,45,55,65)
> y = c(302.38, 193.63, 185.46, 198.49, 224.30, 288.71)
> plot(x, y, main="Average age vs. time spent in the library", xlab="Age",
+      ylab="Time spent in the library", col="purple")

OUTPUT:
PROBLEM 2:

R CODE:
> selection = c(44,49,52,54,47,76,65,60,63,58,50,67)
> proficiency = c(48,55,45,60,43,80,58,50,77,46,47,65)
> cor.test(selection, proficiency, method="spearman")

OUTPUT:
Spearman's rank correlation coefficient, rho = 0.7202797
There is a fairly strong positive correlation between selection and proficiency. In other
words, higher selection scores tend to go with higher proficiency scores, and vice versa.
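
As a cross-check, the same coefficient can be reproduced from the rank-difference formula, which applies here because neither variable has tied values. This is a sketch; the names d, n, rho_manual and rho_r are introduced only for illustration.

```r
selection   <- c(44, 49, 52, 54, 47, 76, 65, 60, 63, 58, 50, 67)
proficiency <- c(48, 55, 45, 60, 43, 80, 58, 50, 77, 46, 47, 65)

# Difference between the ranks of each pair
d <- rank(selection) - rank(proficiency)
n <- length(selection)

# Spearman's formula (valid when there are no tied ranks)
rho_manual <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))

# Built-in computation for comparison
rho_r <- cor(selection, proficiency, method = "spearman")

print(c(manual = rho_manual, builtin = rho_r))
```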
PROBLEM 3:

R CODE:
> x=c(34,37,36,32,32,36,35,34,29,35)
> y=c(37,37,34,34,33,40,39,37,36,35)
> fit=lm(x~y)
> fit

OUTPUT:
The equation of the line of regression of X on Y: X = 18.9167 + 0.4167Y.
The required score of the student in Zoology is 30.5843.
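
The prediction step can be sketched with predict(). The given score y_given = 28 is an assumption: it is the value of Y that reproduces the reported figure from the fitted equation, and it is not stated in the text above.

```r
x <- c(34, 37, 36, 32, 32, 36, 35, 34, 29, 35)
y <- c(37, 37, 34, 34, 33, 40, 39, 37, 36, 35)

# Regression of X on Y
fit <- lm(x ~ y)
coef(fit)  # intercept and slope of X = a + bY

# Assumed known score in the other subject (not given in the text)
y_given <- 28

# Predicted X for that Y
x_pred <- predict(fit, newdata = data.frame(y = y_given))
print(unname(x_pred))
```

With unrounded coefficients the prediction is about 30.5833; the reported 30.5843 is what the equation gives when the coefficients are first rounded to four decimal places.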

PROBLEM 4:

R CODE:
> bmr = c(1459.3,1474.6,1413.4,1451.6,1551.1,1597,1352.2,1466.9,1581.1,1535.8,
+   1505.2,1566.4,1581.7,1558.8,1453.2,1470.6,1505.4,1528.6,1569.2,1482.2,
+   1401,1493.8,1447.4,1459,1470.6,1098.7,1201.6,1157.5,1054.6,1157.5)
> age = c(21,21,22,23,24,25,25,25,26,27,27,28,29,29,30,30,32,35,37,38,40,41,44,46,49,
+   18,19,19,20,21)
> ht = c(158,152,159,157,157,153,156,158,159,158,154,158,161,159,157,156,155,158,154,
+   157,159,160,156,155,150,158,159,152,146,155)
> wt = c(51,52,48,50.5,57,60,44,51.5,57,56,54,58,59,57.5,49.5,51,54,56,59.5,52,45,53,
+   49,50,51,41,48,45,38,45)
> bmi = c(20.43,22.51,18.99,20.49,23.12,25.63,18.08,20.63,22.55,22.43,22.77,23.23,
+   22.76,22.74,20.08,20.96,22.48,22.43,25.09,21.1,17.8,20.7,20.13,20.81,22.67,
+   16.42,18.99,19.48,17.83,18.73)
> input_data = data.frame(bmr,age,ht,wt,bmi)
> input_data
> RegModel = lm(bmr ~ age + ht + wt + bmi, data=input_data)
> RegModel
> summary(RegModel)
OUTPUT:
Regression model:
bmr = -2500.492 + 4.021(age) + 17.293(height) + 50.553(weight) + 1.1019(bmi)
R^2 is 0.8701, so about 87% of the variation in BMR is explained by the age, HT, WT and BMI
of a person through this linear model. All the explanatory variables have a positive
relationship with BMR. However, apart from that of age, these regression coefficients are not
statistically significant, although the F-test in the ANOVA shows that the overall regression
is significant at the 0.01 level (the p-value is almost zero). Each regression coefficient is
interpreted with the other factors held fixed: for example, if age increases by one year at
fixed values of HT, WT and BMI, the predicted BMR increases by 4.021 on average.
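
The quantities discussed above (R^2, the coefficient table and the overall F-test) can be read from the object returned by summary(). The sketch below uses R's built-in mtcars data as a stand-in, so its numbers are unrelated to the BMR model; the same calls apply to summary(RegModel).

```r
# Stand-in model on a built-in data set (not the BMR data)
demo_fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(demo_fit)

# Proportion of variation explained by the model
r2 <- s$r.squared

# Coefficient table: estimate, standard error, t value, p-value per term
coef_table <- coef(s)

# Overall F-test p-value, computed from the stored F statistic
f <- s$fstatistic
p_overall <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

print(r2)
print(coef_table)
print(p_overall)
```

The last column of the coefficient table holds the per-coefficient p-values, which is where the significance of each individual term is judged.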

PROBLEM 5:
R CODE:
> x = c(3.545,2.6,3.245,3.93,3.995,3.115,3.235,3.225,2.44,3.24,2.29,2.5,4.02)
> y = c(30,32,30,24,26,30,33,27,37,32,37,34,26)
> cor.test(x,y,method="pearson")

OUTPUT:
Correlation coefficient of X and Y : -0.8977642
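
The reported value can be checked against the definition of Pearson's coefficient, the sum of products of deviations divided by the square root of the product of the sums of squared deviations. This is a sketch; the names sxy, sxx, syy and r_manual are introduced only for illustration.

```r
x <- c(3.545, 2.6, 3.245, 3.93, 3.995, 3.115, 3.235, 3.225, 2.44, 3.24, 2.29, 2.5, 4.02)
y <- c(30, 32, 30, 24, 26, 30, 33, 27, 37, 32, 37, 34, 26)

# Sums of products and squares of deviations from the means
sxy <- sum((x - mean(x)) * (y - mean(y)))
sxx <- sum((x - mean(x))^2)
syy <- sum((y - mean(y))^2)

# Pearson's coefficient from the definition
r_manual <- sxy / sqrt(sxx * syy)

# Built-in computation for comparison
r_builtin <- cor(x, y)

print(c(manual = r_manual, builtin = r_builtin))
```

The negative sign indicates that Y tends to decrease as X increases.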
LAB NOTEBOOK:
