
RMC FILE

INDEX

S.NO. Topic
UNIT-1 Research design & Data Presentation
1 Basic commands for R studio
2 Variables and Data Types
3 Different types of operators
4 Simple programming constructs: if, else
5 Simple programming constructs: for, while, break
6 User defined and Built-in functions
7 Lists
8 Vectors
9 Matrices
10 Arrays
UNIT-2 Data analysis using R
11 Data Frames
12 Import from Spreadsheet
13 Data Exploration
14 Summary statistics
15 Referring specific rows & columns
16 Quick-Plots
17 R-Notebook
UNIT-3 Graphical analysis of data
18 Histogram, Density plot, Whisker plots
19 Pie Charts, Cleveland dot Charts, Pair plots
20 Factor Data type
21 Date data type
UNIT-4 Statistical Tests
22 Correlation
23 t-test, two-sample independent t-test, paired t-test
24 Chi-square test
25 ANOVA
26 Regression- Linear
27 Multiple Linear Regression
28 Logistic Regression
29 Stepwise regression: Forward, Backward and Both
30 Tables used:
- Input
- Cardata
- Sample data
- Pairedtestdata
- ANN2
- Twowayann
- Wtloss


ASSIGNMENT 1
BASIC COMMANDS FOR R STUDIO

Variables: A variable is a name for a value, such as x, current_temperature, or subject.id. We
create a new variable by assigning a value to it with an assignment operator.
Assignment Operator: Assignment operators bind a value to a variable name. R supports two
directions of assignment: leftward (x <- 20 or x = 20) and rightward (20 -> x).
The operators <- and = are the ones used in this file; <- is the conventional choice in R code.
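
The three assignment forms can be compared side by side. This is a minimal sketch; the values 20 and 30 are the ones used in the transcript that follows.

```r
# Leftward assignment with <- (the conventional form)
x <- 20

# Leftward assignment with = (works at the top level of a script)
y = 30

# Rightward assignment with ->: the value on the left is stored in z
x + y -> z

print(z)
```

All three forms create ordinary variables; they differ only in which side the name is written on.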
ASSIGN VALUES TO VARIABLES, PASTE AND READLINE FUNCTIONS
> x <- 20
> x
[1] 20
> y <- 30
> y
[1] 30
> z <- x + y
> z
[1] 50
> a <- "Hello"
> a
[1] "Hello"
> b <- "Kunal"
> b
[1] "Kunal"
> paste (a, b)
[1] "Hello Kunal"
> readline("what is your name?")
what is your name? Lakshay
[1] "Lakshay"
> urname <- readline("what is your name?")
what is your name? Kunal


> urname
[1] "Kunal"
> paste(a, urname)
[1] "Hello Kunal"
> m <- x+y+z
> m
[1] 100
> m <- x+y+z #Adding 3 numeric variables
> m # display value of m
[1] 100
> # "Good Night"
> "Good Morning!" # "Good night"
[1] "Good Morning!"


ASSIGNMENT 2
VARIABLES AND DATA TYPES
age <- 5
age

# [1] 5

Age <- 10
Age

# [1] 10

x <-10
y <- 12.5
z <- 13L
class(z)

# [1] "integer"

class(x)

# [1] "numeric"

class(y)

# [1] "numeric"

urname <- "Hello"


class(urname)

# [1] "character"

name1 <- "12.5"


class(name1)

# [1] "character"

name2 <- "13.5"


name3 <- 14.5
name4 <- 15.5
m <- name3 + name4
m

# [1] 30

class(name3)

# [1] "numeric"

class(name2)

# [1] "character"
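
A quoted number such as "13.5" is stored as character, so it cannot be used in arithmetic until it is converted. The sketch below shows the conversion with as.numeric(); the variable names total and not_a_number are illustrative and do not appear in the assignment.

```r
name2 <- "13.5"   # character, because of the quotes
name3 <- 14.5     # numeric

# Convert the character value before doing arithmetic with it
total <- as.numeric(name2) + name3
print(total)
print(class(as.numeric(name2)))

# Text that is not a number converts to NA (R also emits a warning)
not_a_number <- suppressWarnings(as.numeric("abc"))
print(is.na(not_a_number))
```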


l <- "true"
l

# [1] "true"

class(l)

# [1] "character"

v <- TRUE
v

# [1] TRUE

class(v)

# [1] "logical"

str <- "R is a case-sensitive language. It is a very good language. It includes"
str

# [1] "R is a case-sensitive language. It is a very good language. It includes"

cat(str)

# R is a case-sensitive language. It is a very good language. It includes

a <- "Hello!!! , How are you..,
+ i m fine."
a

# [1] "Hello!!! , How are you..,\n+ i m fine."

cat(a)

# Hello!!! , How are you..,


# + i m fine.

nchar(a)

# [1] 37

grepl("Hello",a)

# [1] TRUE

q <- "Hello"
w <-"How are you!!!!"
paste(q,w)

# [1] "Hello How are you!!!!"


## Built-in Math Functions


max(5,10,15)

# [1] 15

min(5,10,15)

# [1] 5

sqrt(16)

# [1] 4

abs(-4.7)

# [1] 4.7

ceiling(1.4)

# [1] 2

floor(1.4)

# [1] 1


ASSIGNMENT 3
DIFFERENT TYPES OF OPERATORS

# Arithmetic Operators
> x <- 5
> y <- 2
> x
[1] 5
> y
[1] 2
> x+y
[1] 7
> x-y
[1] 3
> x*y
[1] 10
> x/y
[1] 2.5
> y^x
[1] 32
> x%%y
[1] 1
# Comparison operators
x

## [1] 5

y

## [1] 2

x == y

## [1] FALSE

x != y

## [1] TRUE

x > y

## [1] TRUE

x < y

## [1] FALSE

x >= y

## [1] TRUE

# Logical Operators

a <- 200
b <- 33
c <- 500
a

## [1] 200

b

## [1] 33

c

## [1] 500

# And operator (&)


z <- a >b & c>a
z

## [1] TRUE

z <- a >b & c<a


z

## [1] FALSE

# OR operator (|)
z <- a>b | a > c
z

## [1] TRUE


z <- a<b | a > c


z

## [1] FALSE

# Not operator (!)


x <- ! a >c
x

## [1] TRUE

x <- ! a < c
x

## [1] FALSE


ASSIGNMENT 4
IF CONSTRUCT AND NESTED IF

IF CONSTRUCT:
a <- 33
b <- 200
if (b>a){ print("b is greater than a")}

## [1] "b is greater than a"

a <- 33
b <- 33
if (b>a)
{print("b is greater than a")} else if (a==b)
{print("a and b are equal")}

## [1] "a and b are equal"

a <- 245
b <- 33
if (b > a)
{print("b is greater than a")} else if (a==b)
{print("a and b are equal")} else
{print("a is greater than b")}

## [1] "a is greater than b"

a <- 200
b <- 33
if (b > a)
{print("b is greater than a")} else
{print("b is not greater than a")}

## [1] "b is not greater than a"

NESTED IF:
# readline() returns a character string, so convert it to a number before comparing
myage <- as.numeric(readline("Enter your age"))

## Enter your age

if(myage <18)
{print("You are not a major, u are not eligible to work")} else
{
if(myage >=18 & myage <=60) {
print("u are eligible to work, please fill application form and email
us")} else


{
print("As per govt rules, you are too old to work, please collect
your pension")
}}

## [1] "You are not a major, u are not eligible to work"


ASSIGNMENT 5
LOOP
While Loop:
i <- 1
while (i < 6) {
print(i)
i <- i + 1
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5

i <- 1
while (i < 6) {
print(i)
i <- i + 1
if (i == 4) {
break
}
}

## [1] 1
## [1] 2
## [1] 3

i <- 0
while (i < 6) {
i <- i + 1
if (i == 3) {
next
}
print(i)
}

## [1] 1
## [1] 2
## [1] 4
## [1] 5
## [1] 6

dice <- 1
while (dice <= 6) {
if (dice < 6) {
print("No Yahtzee")
} else {
print("Yahtzee!")
}


dice <- dice + 1


}

## [1] "No Yahtzee"


## [1] "No Yahtzee"
## [1] "No Yahtzee"
## [1] "No Yahtzee"
## [1] "No Yahtzee"
## [1] "Yahtzee!"

For Loop:
for (x in 1:10) {
print(x)
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10

dice <- 1:6

for(x in dice) {
if (x == 6) {
print(paste("The dice number is", x, "Yahtzee!"))
} else {
print(paste("The dice number is", x, "Not Yahtzee"))
}
}

## [1] "The dice number is 1 Not Yahtzee"


## [1] "The dice number is 2 Not Yahtzee"
## [1] "The dice number is 3 Not Yahtzee"
## [1] "The dice number is 4 Not Yahtzee"
## [1] "The dice number is 5 Not Yahtzee"
## [1] "The dice number is 6 Yahtzee!"


ASSIGNMENT 6
USER DEFINED AND BUILT-IN FUNCTIONS

BUILT IN FUNCTION:
> print(seq(32,44))

#[1] 32 33 34 35 36 37 38 39 40 41 42 43 44

> print(mean(25:82))

#[1] 53.5

> print(sum(41:68))

#[1] 1526

DEFINING AND CALLING A FUNCTION:
new.function <- function(a) {
for(i in 1:a) {
b <- i^2
print(b)
}
}

new.function(6)
##new.function(6)
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36

CALLING A FUNCTION WITHOUT AN ARGUMENT


new.function <- function() {
for(i in 1:5) {
print(i^2)
}


}
new.function()

## new.function()
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25

CALLING A FUNCTION WITH ARGUMENT VALUES (BY POSITION AND BY NAME)

new.function <- function(a,b,c) {


result <- a * b + c
print(result)
}
new.function(5,3,11)
new.function(a = 11, b = 5, c = 3)

# [1] 26
# [1] 58

CALLING A FUNCTION WITH DEFAULT ARGUMENT


new.function <- function(a = 3, b = 6) {
result <- a * b
print(result)
}
new.function()
new.function(9,5)

# [1] 18
# [1] 45

LAZY EVALUATION OF FUNCTION

new.function <- function(a, b) {

print(a^2)

print(a)

print(b)
}

new.function(6)

# [1] 36
# [1] 6
# Error in print(b) : argument "b" is missing, with no default
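
Lazy evaluation means an argument is only looked up when the function body actually uses it, which is why the error above appears only at print(b). The sketch below makes the same point without an error; lazy.demo and its messages are hypothetical names chosen for illustration.

```r
lazy.demo <- function(a, b) {
  # When a is negative the function returns before b is ever used
  if (a < 0) {
    return("b was never evaluated")
  }
  # missing() checks whether the caller supplied b, without evaluating it
  if (missing(b)) {
    return("b is missing")
  }
  a + b
}

r1 <- lazy.demo(-1)     # b omitted, but never needed
r2 <- lazy.demo(1)      # b omitted and detected with missing()
r3 <- lazy.demo(1, 2)   # both arguments supplied

print(r1)
print(r2)
print(r3)
```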


ASSIGNMENT 7
LISTS
LISTS OF STRING:
thislist <- list("apple", "banana", "cherry")
thislist

## [[1]]
## [1] "apple"
##
## [[2]]
## [1] "banana"
##
## [[3]]
## [1] "cherry"

thislist <- list("apple", "banana", "cherry")

thislist[1]

## [[1]]
## [1] "apple"

CHANGING VALUE OF SPECIFIC ITEM:


thislist <- list("apple", "banana", "cherry")
thislist[1] <- "blackcurrant"
thislist

## [[1]]
## [1] "blackcurrant"
##
## [[2]]
## [1] "banana"
##
## [[3]]
## [1] "cherry"

LENGTH OF LIST:
thislist <- list("apple", "banana", "cherry")
length(thislist)

## [1] 3

ADD ITEM AT THE END OF LIST:


thislist <- list("apple", "banana", "cherry")
append(thislist, "orange")

## [[1]]
## [1] "apple"


##
## [[2]]
## [1] "banana"
##
## [[3]]
## [1] "cherry"
##
## [[4]]
## [1] "orange"

TO ADD ITEM AFTER A SPECIFIED ITEM:


thislist <- list("apple", "banana", "cherry")
append(thislist, "orange", after = 2)

## [[1]]
## [1] "apple"
##
## [[2]]
## [1] "banana"
##
## [[3]]
## [1] "orange"
##
## [[4]]
## [1] "cherry"

REMOVING LIST ITEMS:


thislist <- list("apple", "banana", "cherry")
newlist <- thislist[-1]

newlist

## [[1]]
## [1] "banana"
##
## [[2]]
## [1] "cherry"

SPECIFYING RANGE OF INDEXES:


thislist <- list("apple", "banana", "cherry", "orange", "kiwi", "melon",
"mango")
(thislist)[2:5]

## [[1]]
## [1] "banana"
##
## [[2]]
## [1] "cherry"
##
## [[3]]
## [1] "orange"


##
## [[4]]
## [1] "kiwi"

thislist <- list("apple", "banana", "cherry")


for (x in thislist) {
print(x)
}

## [1] "apple"
## [1] "banana"
## [1] "cherry"

COMBINING LISTS:
list1 <- list("a", "b", "c")
list2 <- list(1,2,3)
list3 <- c(list1,list2)

list3

## [[1]]
## [1] "a"
##
## [[2]]
## [1] "b"
##
## [[3]]
## [1] "c"
##
## [[4]]
## [1] 1
##
## [[5]]
## [1] 2
##
## [[6]]
## [1] 3


ASSIGNMENT 8
VECTORS
VECTOR OF STRINGS:
fruits <- c("banana", "apple", "orange")
fruits
## [1] "banana" "apple" "orange"

numbers <- c(1, 2, 3)

numbers

## [1] 1 2 3

VECTOR WITH NUMERICAL VALUES WITH A SEQUENCE:


numbers <- 1:10
numbers

## [1] 1 2 3 4 5 6 7 8 9 10

numbers1 <- 1.5:6.5


numbers1

## [1] 1.5 2.5 3.5 4.5 5.5 6.5

numbers2 <- 1.5:6.3


numbers2

## [1] 1.5 2.5 3.5 4.5 5.5

VECTORS OF LOGICAL VALUES :


log_values <- c(TRUE, FALSE, TRUE, FALSE)
log_values

## [1] TRUE FALSE TRUE FALSE

TO FIND THE NUMBER OF ITEMS IN A VECTOR:
fruits <- c("banana", "apple", "orange")
length(fruits)

## [1] 3

SORTING A VECTOR:
fruits <- c("banana", "apple", "orange", "mango", "lemon")
numbers <- c(13, 3, 5, 7, 20, 2)
sort(fruits)

## [1] "apple" "banana" "lemon" "mango" "orange"


sort(numbers)

## [1] 2 3 5 7 13 20

ACCESSING VECTORS:
fruits[1]

## [1] "banana"

fruits <- c("banana", "apple", "orange", "mango", "lemon")

fruits[c(1, 3)]

## [1] "banana" "orange"

fruits <- c("banana", "apple", "orange", "mango", "lemon")


fruits[c(-1)]

## [1] "apple" "orange" "mango" "lemon"

fruits <- c("banana", "apple", "orange", "mango", "lemon")

fruits[1] <- "pear"

fruits

## [1] "pear" "apple" "orange" "mango" "lemon"


ASSIGNMENT 9
MATRICES

CREATING A MATRIX:
thismatrix <- matrix(c(1,2,3,4,5,6), nrow = 3, ncol = 2)
thismatrix

## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6

CREATING A MATRIX WITH STRING:


thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2,
ncol = 2)
thismatrix

## [,1] [,2]
## [1,] "apple" "cherry"
## [2,] "banana" "orange"

ACCESSING THE ITEMS:


thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2,
ncol = 2)
thismatrix[1, 2]

## [1] "cherry"

thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2,


ncol = 2)
thismatrix[2,]

## [1] "banana" "orange"

thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2,


ncol = 2)
thismatrix[,2]

## [1] "cherry" "orange"

thismatrix <- matrix(c("apple", "banana", "cherry", "orange","grape",


"pineapple", "pear", "melon", "fig"), nrow = 3, ncol = 3)
thismatrix[c(1,2),]

## [,1] [,2] [,3]


## [1,] "apple" "orange" "pear"
## [2,] "banana" "grape" "melon"


thismatrix <- matrix(c("apple", "banana", "cherry", "orange","grape",


"pineapple", "pear", "melon", "fig"), nrow = 3, ncol = 3)
thismatrix[, c(1,2)]

## [,1] [,2]
## [1,] "apple" "orange"
## [2,] "banana" "grape"
## [3,] "cherry" "pineapple"

ADD ROWS AND COLUMNS:


C-BIND:

thismatrix <- matrix(c("apple", "banana", "cherry", "orange","grape",


"pineapple", "pear", "melon", "fig"), nrow = 3, ncol = 3)
newmatrix <- cbind(thismatrix, c("strawberry", "blueberry",
"raspberry"))
newmatrix

## [,1] [,2] [,3] [,4]


## [1,] "apple" "orange" "pear" "strawberry"
## [2,] "banana" "grape" "melon" "blueberry"
## [3,] "cherry" "pineapple" "fig" "raspberry"

R-BIND:

thismatrix <- matrix(c("apple", "banana", "cherry", "orange","grape",


"pineapple", "pear", "melon", "fig"), nrow = 3, ncol = 3)
newmatrix <- rbind(thismatrix, c("strawberry", "blueberry",
"raspberry"))
newmatrix

## [,1] [,2] [,3]


## [1,] "apple" "orange" "pear"
## [2,] "banana" "grape" "melon"
## [3,] "cherry" "pineapple" "fig"
## [4,] "strawberry" "blueberry" "raspberry"

REMOVING ROWS AND COLUMNS:


thismatrix <- matrix(c("apple", "banana", "cherry", "orange", "mango",
"pineapple"), nrow = 3, ncol =2)

thismatrix <- thismatrix[-c(1), -c(1)]


thismatrix

## [1] "mango" "pineapple"

NUMBER OF ROWS AND COLUMNS, AND LENGTH:


thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2,
ncol = 2)
dim(thismatrix)

## [1] 2 2


thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2,


ncol = 2)
length(thismatrix)

## [1] 4

COMBINING MATRICES:
Matrix1 <- matrix(c("apple", "banana", "cherry", "grape"), nrow = 2, ncol
= 2)
Matrix2 <- matrix(c("orange", "mango", "pineapple", "watermelon"), nrow =
2, ncol = 2)
Matrix_Combined <- rbind(Matrix1, Matrix2)
Matrix_Combined

## [,1] [,2]
## [1,] "apple" "cherry"
## [2,] "banana" "grape"
## [3,] "orange" "pineapple"
## [4,] "mango" "watermelon"

Matrix_Combined <- cbind(Matrix1, Matrix2)


Matrix_Combined

## [,1] [,2] [,3] [,4]


## [1,] "apple" "cherry" "orange" "pineapple"
## [2,] "banana" "grape" "mango" "watermelon"


ASSIGNMENT 10
ARRAYS
ARRAY WITH ONE DIMENSION:
thisarray <- c(1:24)
thisarray

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

ARRAY WITH MORE THAN ONE DIMENSION:


multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray

## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 13 17 21
## [2,] 14 18 22
## [3,] 15 19 23
## [4,] 16 20 24

ACCESSING ELEMENTS IN ARRAYS:


thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray[2, 3, 2]

## [1] 22

thisarray <- c(1:24)


multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray[c(1),,1]

## [1] 1 5 9

multiarray <- array(thisarray, dim = c(4, 3, 2))


multiarray[,c(1),1]

## [1] 1 2 3 4

TO FIND IF AN ELEMENT IS PRESENT:


thisarray <- c(1:24)

multiarray <- array(thisarray, dim = c(4, 3, 2))


2 %in% multiarray

## [1] TRUE

NUMBER OF ROWS AND COLUMNS:


thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))
dim(multiarray)

## [1] 4 3 2

ARRAY LENGTH:
thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))
length(multiarray)

## [1] 24

LOOP THROUGH AN ARRAY:


thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))
for(x in multiarray){
print(x)
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11
## [1] 12
## [1] 13
## [1] 14
## [1] 15
## [1] 16
## [1] 17
## [1] 18
## [1] 19
## [1] 20
## [1] 21
## [1] 22


## [1] 23
## [1] 24


ASSIGNMENT 11
DATA FRAMES
CREATING DATA FRAME:
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
Data_Frame

## Training Pulse Duration


## 1 Strength 100 60
## 2 Stamina 150 30
## 3 Other 120 45

SUMMARIZE DATA FRAME:


Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

Data_Frame

summary(Data_Frame)

##    Training             Pulse          Duration
##  Length:3           Min.   :100.0   Min.   :30.0
##  Class :character   1st Qu.:110.0   1st Qu.:37.5
##  Mode  :character   Median :120.0   Median :45.0
##                     Mean   :123.3   Mean   :45.0
##                     3rd Qu.:135.0   3rd Qu.:52.5
##                     Max.   :150.0   Max.   :60.0

ACCESSING COLUMNS FROM DATA FRAME:

Data_Frame <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
Data_Frame[1]

Data_Frame[["Training"]]

Data_Frame$Training

## Training
## 1 Strength


## 2 Stamina
## 3 Other

## [1] "Strength" "Stamina" "Other"

## [1] "Strength" "Stamina" "Other"

ADDING ROWS :

Data_Frame <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
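# Add the row as a one-row data frame so Pulse and Duration stay numeric
New_row_DF <- rbind(Data_Frame, data.frame(Training = "Strength", Pulse = 110, Duration = 110))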
New_row_DF

## Training Pulse Duration


## 1 Strength 100 60
## 2 Stamina 150 30
## 3 Other 120 45
## 4 Strength 110 110

ADD NEW COLUMNS:

Data_Frame <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
New_col_DF <- cbind(Data_Frame, Steps = c(1000, 6000, 2000))
New_col_DF
## Training Pulse Duration Steps
## 1 Strength 100 60 1000
## 2 Stamina 150 30 6000
## 3 Other 120 45 2000

REMOVING ROWS AND COLUMNS:


Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
Data_Frame_New <- Data_Frame[-c(1), -c(1)]
Data_Frame_New
## Pulse Duration
## 2 150 30
## 3 120 45

NUMBER OF ROWS AND COLUMNS:


Data_Frame <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
dim(Data_Frame)

## [1] 3 3

Data_Frame <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
ncol(Data_Frame)

## [1] 3

nrow(Data_Frame)

## [1] 3

Data_Frame <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
length(Data_Frame)

## [1] 3

COMBINING TWO OR MORE DATA FRAMES:

R-BIND:

Data_Frame1 <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
Data_Frame2 <- data.frame (
Training = c("Stamina", "Stamina", "Strength"),
Pulse = c(140, 150, 160),
Duration = c(30, 30, 20)
)
New_Data_Frame <- rbind(Data_Frame1, Data_Frame2)
New_Data_Frame

## Training Pulse Duration


## 1 Strength 100 60
## 2 Stamina 150 30
## 3 Other 120 45
## 4 Stamina 140 30


## 5 Stamina 150 30
## 6 Strength 160 20

C-BIND:

Data_Frame3 <- data.frame (


Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
Data_Frame4 <- data.frame (
Steps = c(3000, 6000, 2000),
Calories = c(300, 400, 300)
)
New_Data_Frame1 <- cbind(Data_Frame3, Data_Frame4)
New_Data_Frame1

## Training Pulse Duration Steps Calories


## 1 Strength 100 60 3000 300
## 2 Stamina 150 30 6000 400
## 3 Other 120 45 2000 300


ASSIGNMENT 12
IMPORTING EXCEL FILE

STEP 1: Go to the Packages tab in the lower-right pane and install the readxl
package.
STEP 2: Create an Excel file and import it via Import Dataset > From Excel.

The Excel file is now imported.


ASSIGNMENT 13
DATA EXPLORATION

install.packages("readxl")   # read_excel() below comes from readxl
any(grepl("readxl",installed.packages()))
library(readxl)
INPUT <- read_excel("INPUT.XLSX")
print(INPUT)
class(INPUT)
head(INPUT)
tail(INPUT)
head(INPUT,3)
tail(INPUT,3)
names(INPUT)
dim(INPUT)
ncol(INPUT)
nrow(INPUT)

# class(INPUT)
[1] "tbl_df"     "tbl"        "data.frame"
# head(INPUT)
# A tibble: 6 × 4
     ID NAME   SALARY DEPT
  <dbl> <chr>   <dbl> <chr>
1     1 Rick    45000 IT
2     2 Dan     78000 Finance
3     3 Michel  45000 HR
4     4 Ryan    89000 Operations
5     5 Gary    48000 Finance
6     6 Nina    92000 IT
# tail(INPUT)
# A tibble: 6 × 4
     ID NAME   SALARY DEPT
  <dbl> <chr>   <dbl> <chr>
1     3 Michel  45000 HR
2     4 Ryan    89000 Operations
3     5 Gary    48000 Finance
4     6 Nina    92000 IT
5     7 Simon   65000 Operations
6     8 Guru    65000 HR
# head(INPUT,3)
# A tibble: 3 × 4
     ID NAME   SALARY DEPT
  <dbl> <chr>   <dbl> <chr>
1     1 Rick    45000 IT
2     2 Dan     78000 Finance
3     3 Michel  45000 HR
# tail(INPUT,3)
# A tibble: 3 × 4
     ID NAME  SALARY DEPT
  <dbl> <chr>  <dbl> <chr>
1     6 Nina   92000 IT
2     7 Simon  65000 Operations
3     8 Guru   65000 HR
# names(INPUT)
[1] "ID"     "NAME"   "SALARY" "DEPT"
# dim(INPUT)
[1] 8 4
# ncol(INPUT)
[1] 4
# nrow(INPUT)
[1] 8
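
If INPUT.XLSX is not at hand, the same eight rows can be typed in directly, since the head() and tail() output above shows all of them. This is a sketch for practice only: it builds a plain data.frame rather than the tibble that read_excel() returns, so the printed layout differs slightly.

```r
# Rebuild the INPUT table from the rows shown in the head()/tail() output
INPUT <- data.frame(
  ID     = as.numeric(1:8),
  NAME   = c("Rick", "Dan", "Michel", "Ryan", "Gary", "Nina", "Simon", "Guru"),
  SALARY = c(45000, 78000, 45000, 89000, 48000, 92000, 65000, 65000),
  DEPT   = c("IT", "Finance", "HR", "Operations", "Finance", "IT",
             "Operations", "HR"),
  stringsAsFactors = FALSE
)

print(dim(INPUT))          # rows and columns
print(head(INPUT, 3))      # first three rows
print(names(INPUT))        # column names
print(mean(INPUT$SALARY))  # should agree with the summary statistics
```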


ASSIGNMENT 14
SUMMARY STATISTICS

str(INPUT)
summary(INPUT)
# str(INPUT)
tibble [8 × 4] (S3: tbl_df/tbl/data.frame)
$ ID : num [1:8] 1 2 3 4 5 6 7 8
$ NAME : chr [1:8] "Rick" "Dan" "Michel" "Ryan" ...
$ SALARY: num [1:8] 45000 78000 45000 89000 48000 92000 65000 65000
$ DEPT : chr [1:8] "IT" "Finance" "HR" "Operations" ...
# summary(INPUT)
       ID           NAME               SALARY          DEPT
 Min.   :1.00   Length:8           Min.   :45000   Length:8
 1st Qu.:2.75   Class :character   1st Qu.:47250   Class :character
 Median :4.50   Mode  :character   Median :65000   Mode  :character
 Mean   :4.50                      Mean   :65875
 3rd Qu.:6.25                      3rd Qu.:80750
 Max.   :8.00                      Max.   :92000


ASSIGNMENT 15
REFERRING SPECIFIC ROWS AND COLUMNS

summary(INPUT[,3])
summary(INPUT[,2:3])
summary(INPUT[,c(2,4)])
min(INPUT$SALARY)
max(INPUT$SALARY)
mean(INPUT$SALARY)
median(INPUT$SALARY)
var(INPUT$SALARY)
sd(INPUT$SALARY)
quantile(INPUT$SALARY,0.25)
quantile(INPUT$SALARY,0.50)
quantile(INPUT$SALARY,0.75)

# summary(INPUT[,3])
     SALARY
 Min.   :45000
 1st Qu.:47250
 Median :65000
 Mean   :65875
 3rd Qu.:80750
 Max.   :92000
# summary(INPUT[,2:3])
     NAME               SALARY
 Length:8           Min.   :45000
 Class :character   1st Qu.:47250
 Mode  :character   Median :65000
                    Mean   :65875
                    3rd Qu.:80750
                    Max.   :92000
# summary(INPUT[,c(2,4)])
     NAME               DEPT
 Length:8           Length:8
 Class :character   Class :character
 Mode  :character   Mode  :character
# min(INPUT$SALARY)
[1] 45000
# max(INPUT$SALARY)
[1] 92000


# mean(INPUT$SALARY)
[1] 65875
# median(INPUT$SALARY)
[1] 65000
# var(INPUT$SALARY)
[1] 365267857
# sd(INPUT$SALARY)
[1] 19111.98
# quantile(INPUT$SALARY,0.25)
25%
47250
# quantile(INPUT$SALARY,0.50)
50%
65000
# quantile(INPUT$SALARY,0.75)
75%
80750
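
The commands above pick out columns; rows are selected with the first index inside the square brackets. The sketch below re-creates the eight rows shown in Assignment 13 as a plain data.frame so it runs on its own; the names second_row, first_three, high_paid and it_names are illustrative.

```r
INPUT <- data.frame(
  ID     = as.numeric(1:8),
  NAME   = c("Rick", "Dan", "Michel", "Ryan", "Gary", "Nina", "Simon", "Guru"),
  SALARY = c(45000, 78000, 45000, 89000, 48000, 92000, 65000, 65000),
  DEPT   = c("IT", "Finance", "HR", "Operations", "Finance", "IT",
             "Operations", "HR"),
  stringsAsFactors = FALSE
)

# A single row: row index before the comma, nothing after it
second_row <- INPUT[2, ]

# A block of rows and named columns
first_three <- INPUT[1:3, c("NAME", "SALARY")]

# Rows that satisfy a condition
high_paid <- INPUT[INPUT$SALARY > 60000, ]

# One column, filtered by another column
it_names <- INPUT$NAME[INPUT$DEPT == "IT"]

print(second_row)
print(first_three)
print(high_paid)
print(it_names)
```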


ASSIGNMENT 16
QUICK PLOTS

STEP 1: In the lower-right pane, open the Packages tab, click Install, and install
GGally and ggplot2. Once installed, both packages appear in the Packages list.

STEP 2: Create a new R script and draw a quick plot of the data file with qplot().
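
The original screenshots are not reproduced here, so the following is only a sketch of what a quick plot might look like. It assumes the built-in mtcars dataset as a stand-in for the data file, and it falls back to base plot() when ggplot2 or qplot() is unavailable (qplot() is deprecated in recent ggplot2 releases).

```r
# Check that ggplot2 is installed and still provides qplot()
has_qplot <- requireNamespace("ggplot2", quietly = TRUE) &&
  exists("qplot", envir = asNamespace("ggplot2"), inherits = FALSE)

if (has_qplot) {
  # Quick scatter plot of weight against mileage
  p <- suppressWarnings(
    ggplot2::qplot(wt, mpg, data = mtcars,
                   xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
  )
  print(p)
} else {
  # Base-graphics equivalent
  plot(mtcars$wt, mtcars$mpg,
       xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
}
```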


ASSIGNMENT 18
HISTOGRAM, DENSITY PLOT, WHISKER PLOT

BOX PLOT:
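
The plot image is missing from this copy. A minimal base-R sketch of a box-and-whisker plot follows, assuming the built-in mtcars dataset in place of the assignment's data file.

```r
# One box per cylinder count; boxplot() also returns the plotted statistics
bp <- boxplot(mpg ~ cyl, data = mtcars,
              xlab = "Number of cylinders", ylab = "Miles per gallon",
              main = "Mileage by cylinder count")

print(bp$n)      # number of cars in each group
print(bp$stats)  # whisker ends, hinges and median for each box
```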

HISTOGRAM:
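
As the figure did not survive, here is a hedged stand-in using base hist() on the built-in mtcars data (an assumption; the original dataset is not shown).

```r
# breaks is only a suggestion; R picks "pretty" bin edges near that count
h <- hist(mtcars$mpg, breaks = 10,
          xlab = "Miles per gallon", main = "Distribution of mileage",
          col = "lightblue")

print(h$breaks)  # bin edges actually used
print(h$counts)  # number of cars in each bin
```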


SCATTER PLOT:
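
A scatter plot along these lines can be drawn with base plot(). The sketch assumes mtcars as the data and colours the points by transmission type; the colour choice and the name point_cols are illustrative.

```r
# Blue for manual (am == 1), red for automatic (am == 0)
point_cols <- ifelse(mtcars$am == 1, "blue", "red")

plot(mtcars$wt, mtcars$mpg,
     col = point_cols, pch = 19,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Mileage against weight")
legend("topright", legend = c("manual", "automatic"),
       col = c("blue", "red"), pch = 19)
```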


DENSITY PLOT:
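
A density plot is a smoothed version of the histogram. The sketch below uses density() from base R on mtcars (assumed here because the original figure and data are not available).

```r
# Estimate the density, then plot the resulting curve
d <- density(mtcars$mpg)

plot(d, xlab = "Miles per gallon", main = "Density of mileage")
polygon(d, col = "lightgrey", border = "black")  # shade the area under the curve
```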


ASSIGNMENT 19
PIE CHARTS, CLEVELAND DOT CHARTS, PAIR PLOTS

PIE CHARTS:
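
The chart itself is missing, so this sketch shows one possible pie chart: the share of cars per cylinder count in mtcars (an assumed dataset). The label format is illustrative.

```r
# Count the cars in each cylinder group
cyl_counts <- table(mtcars$cyl)

# Label each slice with the group and its count
slice_labels <- paste0(names(cyl_counts), " cyl (", cyl_counts, ")")

pie(cyl_counts, labels = slice_labels,
    main = "Cars by number of cylinders")
```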

PAIR PLOTS:
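
Assignment 16 installs GGally, whose ggpairs() function was presumably used for the lost figure. As a dependency-free sketch, base pairs() draws a comparable scatter-plot matrix; the four mtcars columns are an arbitrary illustrative choice.

```r
# Pick a few numeric columns to compare pairwise
vars <- mtcars[, c("mpg", "wt", "hp", "disp")]

# One scatter plot for every pair of columns
pairs(vars, main = "Pairwise relationships")
```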


CLEVELAND DOT CHART:
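
A Cleveland dot chart places one labelled dot per observation. The sketch uses base dotchart() on mtcars (assumed) and sorts the rows first so the dots form a readable ranking.

```r
# Sort by mileage so the chart reads from lowest to highest
sorted <- mtcars[order(mtcars$mpg), ]

dotchart(sorted$mpg, labels = row.names(sorted), cex = 0.7,
         xlab = "Miles per gallon", main = "Mileage by car model")
```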


ASSIGNMENT 20
FACTOR DATA TYPE
CREATING A VECTOR:
apple_colors <- c('green','green','yellow','red','red','red','green')
class(apple_colors)

# [1] "character"

CREATING FACTOR OBJECT:


factor_apple <- factor(apple_colors)
print(factor_apple)
print(nlevels(factor_apple))

# [1] green green yellow red red red green


# Levels: green red yellow

# [1] 3

EXAMPLE:
gender <- factor(c("female", "female", "male", "female", "male"))
gender
levels(gender)

# [1] female female male female male


# Levels: female male

## [1] "female" "male"

text <- c("test1", "test2", "test1", "test1") # create a character vector


class(text)
text_factor <- factor(text)
class(text_factor)

# [1] "character"
# [1] "factor"

day_vector <- c('evening', 'morning', 'afternoon', 'midday', 'midnight',


'evening')
factor_day <- factor(day_vector, ordered = TRUE, levels = c('morning',
'midday', 'afternoon', 'evening', 'midnight'))
factor_day

# [1] evening morning afternoon midday midnight evening


# Levels: morning < midday < afternoon < evening < midnight
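
Ordered factors can be counted, compared and converted to their underlying integer codes. A short sketch with a hypothetical sizes factor:

```r
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)

size_counts <- table(sizes)       # frequency of each level
size_codes  <- as.integer(sizes)  # position of each value in the level order
is_bigger   <- sizes > "small"    # comparisons follow the level order

print(size_counts)
print(size_codes)
print(is_bigger)
```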


ASSIGNMENT 21
DATE DATA TYPE

today <- Sys.Date()


date()

# [1] "Mon Dec 5 11:56:02 2022"

format(Sys.Date(), format = "%d %B, %Y")


format(Sys.Date(), format = "Today is a %A!")

## [1] "21 November, 2022"

## [1] "Today is a Monday!"

CONVERTING STRING TO DATES:


strDates <- c("01/05/1965","08/16/1975")

dates <- as.Date(strDates, "%m/%d/%Y")

strDates <- as.character(dates)
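
Once strings are converted to Date objects they support arithmetic. The sketch reuses the two dates above; gap and later are illustrative names.

```r
strDates <- c("01/05/1965", "08/16/1975")
dates <- as.Date(strDates, format = "%m/%d/%Y")
print(dates)

# Subtracting two dates gives a time difference in days
gap <- dates[2] - dates[1]
print(gap)

# Adding a number to a date moves it forward by that many days
later <- dates[1] + 30
print(format(later, "%Y-%m-%d"))
```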


ASSIGNMENT 22
CORRELATION

PEARSON CORRELATION:
res <- cor.test(cardata$wt, cardata$mpg, method = "pearson")
res

Pearson's product-moment correlation

data: cardata$wt and cardata$mpg


t = -9.559, df = 30, p-value = 1.294e-10
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.9338264 -0.7440872
sample estimates:
cor
-0.8676594

KENDALL CORRELATION:

res1 <- cor.test(cardata$wt, cardata$mpg, method = "kendall")


res1

Kendall's rank correlation tau

data: cardata$wt and cardata$mpg


z = -5.7981, p-value = 6.706e-09
alternative hypothesis: true tau is not equal to 0
sample estimates:
tau
-0.7278321

SPEARMAN CORRELATION:

res2 <- cor.test(cardata$wt, cardata$mpg, method = "spearman")


res2

Spearman's rank correlation rho

data: cardata$wt and cardata$mpg


S = 10292, p-value = 1.488e-11
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
-0.886422
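
cor.test() returns a list, so individual numbers can be pulled out instead of read off the printout. The sketch assumes cardata holds the same values as R's built-in mtcars dataset, which the column names and the Pearson estimate above suggest but the file does not confirm.

```r
res <- cor.test(mtcars$wt, mtcars$mpg, method = "pearson")

r  <- unname(res$estimate)  # correlation coefficient
p  <- res$p.value           # p-value of the test
ci <- res$conf.int          # 95 percent confidence interval

print(r)
print(p)
print(ci)
print(r^2)  # share of variance in mpg associated with wt
```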


ASSIGNMENT 23
T-TEST, TWO SAMPLE INDEPENDENT T-TEST, PAIRED T-TEST

ONE SAMPLE T-TEST:


res <- t.test(sample_data$HEIGHT, mu = 65)
res

#One Sample t-test

data: sample_data$HEIGHT
t = -3.8005, df = 19, p-value = 0.001208
alternative hypothesis: true mean is not equal to 65
95 percent confidence interval:
57.16883 62.73117
sample estimates:
mean of x
59.95

PAIRED T TEST:

t.test(pairedttestdata$Before,pairedttestdata$After, paired = TRUE,


alternative = "two.sided")

#Paired t-test

data: pairedttestdata$Before and pairedttestdata$After


t = -20.804, df = 9, p-value = 6.411e-09
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-215.7375 -173.4225
sample estimates:
mean difference
-194.58

TWO SAMPLE INDEPENDENT T-TEST

t.test(sample_data$HEIGHT~sample_data$GENDER)

#Welch Two Sample t-test

data: sample_data$HEIGHT by sample_data$GENDER


t = -3.956, df = 11.223, p-value = 0.002165

alternative hypothesis: true difference in means between group FEMALE and group MALE is not equal to 0
95 percent confidence interval:
-12.284618 -3.515382
sample estimates:
mean in group FEMALE mean in group MALE
56.0 63.9
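
The t value reported by a one-sample test is the distance between the sample mean and mu, measured in standard errors. The sketch checks this by hand on a small hypothetical vector of heights (not the sample_data file, which is not reproduced here).

```r
heights <- c(58, 62, 60, 55, 63, 61, 59, 64, 57, 60)  # hypothetical values
mu0 <- 65
n <- length(heights)

# t = (sample mean - hypothesised mean) / (sd / sqrt(n))
t_manual <- (mean(heights) - mu0) / (sd(heights) / sqrt(n))

res <- t.test(heights, mu = mu0)

print(t_manual)
print(unname(res$statistic))  # same value as t_manual
print(res$parameter)          # degrees of freedom, n - 1
```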


ASSIGNMENT 24
CHI-SQUARE TEST

table(cardata$am,cardata$cyl)
chisq.test(table(cardata$am,cardata$cyl))
chisq.test(cardata$am,cardata$cyl)

## table(cardata$am,cardata$cyl)

     4  6  8
  0  3  4 12
  1  8  3  2
## chisq.test(table(cardata$am,cardata$cyl))

Pearson's Chi-squared test

data: table(cardata$am, cardata$cyl)


X-squared = 8.7407, df = 2, p-value = 0.01265

## chisq.test(cardata$am,cardata$cyl)

Pearson's Chi-squared test

data: cardata$am and cardata$cyl


X-squared = 8.7407, df = 2, p-value = 0.01265
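
The statistic can be reproduced by hand from the contingency table printed above: compute the expected count for each cell, then sum (observed - expected)^2 / expected. This is a sketch of the standard Pearson calculation; the counts are typed in from the table output.

```r
# Observed counts: rows are am (0, 1), columns are cyl (4, 6, 8)
observed <- matrix(c(3, 8, 4, 3, 12, 2), nrow = 2,
                   dimnames = list(am = c("0", "1"), cyl = c("4", "6", "8")))

# Expected count = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

x_squared <- sum((observed - expected)^2 / expected)
dof <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(x_squared, df = dof, lower.tail = FALSE)

print(round(expected, 2))
print(x_squared)
print(dof)
print(p_value)
```

Several expected counts are below 5, so chisq.test() warns that the approximation may be inaccurate for this table.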


ASSIGNMENT 25
ANOVA
ONE WAY ANOVA TEST:
summary(aov(wtloss$grps~wtloss$df_wt))

## library(readxl)
> wtloss <- read_excel("D:/Lakshay/Bcom/RMC/wtloss.xlsx")
> View(wtloss)
> summary(aov(wtloss$grps~wtloss$df_wt))
             Df Sum Sq Mean Sq F value  Pr(>F)
wtloss$df_wt  1  12.27  12.273   17.36 0.00058 ***
Residuals    18  12.73   0.707
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

TWO WAY ANOVA TEST:


table(ANN2$supp, ANN2$dose)
res.aov2 <- aov(len ~ supp + dose, data = ANN2)
summary(res.aov2)
## library(readxl)
> ANN2 <- read_excel("D:/Lakshay/Bcom/RMC/ANN2.xlsx")
> View(ANN2)
> table(ANN2$supp, ANN2$dose)

0.5 1 2
OJ 3 1 3
VC 1 1 1
> res.aov2 <- aov(len ~ supp + dose, data = ANN2)
> summary(res.aov2)
            Df Sum Sq Mean Sq F value  Pr(>F)
supp         1   38.7    38.7   1.506 0.25935
dose         1  690.5   690.5  26.910 0.00127 **
Residuals    7  179.6    25.7
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

EXAMPLE 2:
res.aov2 <- aov(plant_height ~ Watering_freq + sunlight_exp, data =
twowayann)
summary(res.aov2)
## library(readxl)
> twowayann <- read_excel("D:/Lakshay/Bcom/RMC/twowayann.xlsx")
> View(twowayann)
> res.aov2 <- aov(plant_height ~ Watering_freq + sunlight_exp, data =
twowayann)
> summary(res.aov2)
              Df Sum Sq Mean Sq F value Pr(>F)
Watering_freq  2  1.281  0.6405   1.694 0.2931
sunlight_exp   3  7.028  2.3427   6.198 0.0552 .
Residuals      4  1.512  0.3780
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


ASSIGNMENT 26
REGRESSION-LINEAR

x <- c(153, 169, 140, 186, 128,


136, 178, 163, 152, 133)
y <- c(64, 81, 58, 91, 47, 57,
75, 72, 62, 49)
model <- lm(y~x)
print(model)

## print(model)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept) x
-39.7137 0.6847

EXAMPLE 2:

df <- data.frame(x = 182)


res <- predict(model, df)
cat("\nPredicted value for a person with height = 182:\n")
print(res)

## Predicted value for a person with height = 182:
##       1
## 84.9098

PLOT:
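
The figure is missing from this copy. The sketch below draws the data with the fitted line and also checks the prediction for x = 182 by hand from the two coefficients; the axis labels and the names coefs, pred_manual and pred_model are illustrative.

```r
x <- c(153, 169, 140, 186, 128, 136, 178, 163, 152, 133)
y <- c(64, 81, 58, 91, 47, 57, 75, 72, 62, 49)
model <- lm(y ~ x)

# Scatter plot of the data with the fitted regression line
plot(x, y, pch = 16, xlab = "x (height)", ylab = "y (response)",
     main = "Simple linear regression")
abline(model, col = "blue")

# Prediction by hand: intercept + slope * new x
coefs <- coef(model)
pred_manual <- unname(coefs[1] + coefs[2] * 182)

# Same prediction through predict()
pred_model <- unname(predict(model, data.frame(x = 182)))

print(pred_manual)
print(pred_model)
```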


ASSIGNMENT 27
MULTIPLE LINEAR REGRESSION

input <- airquality[1:50,


c("Ozone","Wind","Temp")]
input

## input
Ozone Wind Temp
1 41 7.4 67
2 36 8.0 72
3 12 12.6 74
4 18 11.5 62
5 NA 14.3 56
6 28 14.9 66
7 23 8.6 65
8 19 13.8 59
9 8 20.1 61
10 NA 8.6 69
11 7 6.9 74
12 16 9.7 69
13 11 9.2 66
14 14 10.9 68
15 18 13.2 58
16 14 11.5 64
17 34 12.0 66
18 6 18.4 57
19 30 11.5 68
20 11 9.7 62
21 1 9.7 59
22 11 16.6 73
23 4 9.7 61
24 32 12.0 61
25 NA 16.6 57
26 NA 14.9 58
27 NA 8.0 57
28 23 12.0 67
29 45 14.9 81
30 115 5.7 79
31 37 7.4 76
32 NA 8.6 78
33 NA 9.7 74
34 NA 16.1 67
35 NA 9.2 84
36 NA 8.6 85
37 NA 14.3 79
38 29 9.7 82
39 NA 6.9 87
40 71 13.8 90


41 39 11.5 87
42 NA 10.9 93
43 NA 9.2 92
44 23 8.0 82
45 NA 13.8 80
46 NA 11.5 79
47 21 14.9 77
48 37 20.7 72
49 20 9.2 65
50 12 11.5 73

CREATE REGRESSION MODEL:

model <- lm(Ozone~Wind + Temp,


data = input)
print(model)

## print(model)

Call:
lm(formula = Ozone ~ Wind + Temp, data = input)

Coefficients:
(Intercept) Wind Temp
-58.239 -0.739 1.329
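
The input table contains NA values in Ozone, and lm() silently drops those rows before fitting. The sketch counts how many rows were actually used and then predicts Ozone for one new day; the Wind and Temp values in new_day are hypothetical.

```r
input <- airquality[1:50, c("Ozone", "Wind", "Temp")]
model <- lm(Ozone ~ Wind + Temp, data = input)

# Rows with a missing value are excluded from the fit
n_used    <- nobs(model)
n_dropped <- sum(!complete.cases(input))
print(c(used = n_used, dropped = n_dropped))

# Predict Ozone for a new observation (illustrative values)
new_day <- data.frame(Wind = 10, Temp = 75)
pred <- predict(model, new_day)
print(pred)
```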


ASSIGNMENT 28
LOGISTIC REGRESSION

input <- cardata[,c("am","cyl","hp","wt")]


data1 = glm(formula = am ~ cyl + hp + wt, data = input, family =
binomial)
print(summary(data1))

## print(summary(data1))

Call:
glm(formula = am ~ cyl + hp + wt, family = binomial, data = input)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.17272  -0.14907  -0.01464   0.14116   1.27641

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 19.70288    8.11637   2.428   0.0152 *
cyl          0.48760    1.07162   0.455   0.6491
hp           0.03259    0.01886   1.728   0.0840 .
wt          -9.14947    4.15332  -2.203   0.0276 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 43.2297 on 31 degrees of freedom


Residual deviance: 9.8415 on 28 degrees of freedom
AIC: 17.841

Number of Fisher Scoring iterations: 8
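
The coefficients above are on the log-odds scale. To read the model as probabilities, predict with type = "response" and compare the predicted classes with the actual values. The sketch assumes cardata matches the built-in mtcars dataset; the 0.5 cut-off is a common but arbitrary choice.

```r
input <- mtcars[, c("am", "cyl", "hp", "wt")]
fit <- suppressWarnings(
  glm(am ~ cyl + hp + wt, data = input, family = binomial)
)

# Predicted probability that am == 1 for each car
prob <- predict(fit, type = "response")

# Classify with a 0.5 cut-off and compare with the observed values
pred_class <- ifelse(prob > 0.5, 1, 0)
confusion  <- table(actual = input$am, predicted = pred_class)
accuracy   <- mean(pred_class == input$am)

print(confusion)
print(accuracy)
print(exp(coef(fit)))  # odds ratios
```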


ASSIGNMENT 29
STEPWISE REGRESSION

xl <- lm(mpg ~ 1, data = cardata)


x2 <- lm(mpg~cyl+disp+hp+drat+wt+qsec+vs+am+gear+carb, data = cardata)
summary(x2)

## summary(x2)

Call:
lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + vs +
am + gear + carb, data = cardata)

Residuals:
Min 1Q Median 3Q Max
-3.4506 -1.6044 -0.1196 1.2193 4.6271

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 12.30337   18.71788   0.657   0.5181
cyl         -0.11144    1.04502  -0.107   0.9161
disp         0.01334    0.01786   0.747   0.4635
hp          -0.02148    0.02177  -0.987   0.3350
drat         0.78711    1.63537   0.481   0.6353
wt          -3.71530    1.89441  -1.961   0.0633 .
qsec         0.82104    0.73084   1.123   0.2739
vs           0.31776    2.10451   0.151   0.8814
am           2.52023    2.05665   1.225   0.2340
gear         0.65541    1.49326   0.439   0.6652
carb        -0.19942    0.82875  -0.241   0.8122
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.65 on 21 degrees of freedom


Multiple R-squared: 0.869, Adjusted R-squared: 0.8066

P a g e 60 | 72
RMC FILE

F-statistic: 13.93 on 10 and 21 DF, p-value: 3.793e-07
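
The adjusted R-squared and the F-statistic reported for the full model can both be recovered from the multiple R-squared. The sketch below is a check on the printed summary, not part of the original assignment. It takes n = 32 from the 21 residual degrees of freedom plus 11 estimated coefficients; adj_r2 and f_stat are helper names introduced here for illustration.

```r
# Adjusted R-squared: penalises R-squared for the number of predictors p
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic expressed through R-squared
f_stat <- function(r2, n, p) (r2 / p) / ((1 - r2) / (n - p - 1))

n  <- 32      # 21 residual df + 10 predictors + 1 intercept
p  <- 10      # predictors in the full model x2
r2 <- 0.869   # Multiple R-squared as printed (rounded)

adj_full <- adj_r2(r2, n, p)
f_full   <- f_stat(r2, n, p)

print(round(adj_full, 4))
print(round(f_full, 2))
```

Because the printed R-squared is rounded, the recomputed values agree with the summary only up to rounding.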

FORWARD STEP WISE

forwardnew <- step(xl, scope = list(lower = xl, upper = x2), direction = "forward")
forwardnew$anova
summary(forwardnew)
forwardnew$coefficients

## summary(forwardnew)

Call:
lm(formula = mpg ~ wt + cyl + hp, data = cardata)

Residuals:
Min 1Q Median 3Q Max
-3.9290 -1.5598 -0.5311 1.1850 5.8986

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 38.75179    1.78686  21.687  < 2e-16 ***
wt          -3.16697    0.74058  -4.276 0.000199 ***
cyl         -0.94162    0.55092  -1.709 0.098480 .
hp          -0.01804    0.01188  -1.519 0.140015
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.512 on 28 degrees of freedom
Multiple R-squared: 0.8431, Adjusted R-squared: 0.8263
F-statistic: 50.17 on 3 and 28 DF, p-value: 2.184e-11

## forwardnew$coefficients
(Intercept)          wt         cyl          hp
 38.7517874  -3.1669731  -0.9416168  -0.0180381
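
The forward search can be tried without the cardata spreadsheet. The sketch below assumes that cardata has the same columns as R's built-in mtcars data set; that is an assumption made here, not something the assignment states. It uses mtcars as a stand-in, writes the scope as formulas, and sets trace = 0 to suppress the step-by-step printout. The names null_fit and fwd are illustrative.

```r
# Stand-in data: built-in mtcars (ASSUMED to match the cardata table)
null_fit <- lm(mpg ~ 1, data = mtcars)   # intercept-only starting model

# Forward selection: start from the null model and add one predictor at a
# time, keeping the addition that lowers AIC the most, until none helps.
fwd <- step(
  null_fit,
  scope = list(
    lower = ~ 1,
    upper = ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
  ),
  direction = "forward",
  trace = 0
)

# Predictors in the order they entered the model
selected <- attr(terms(fwd), "term.labels")
print(selected)
print(coef(fwd))
```

Forward selection never drops a term once it has entered, which is one reason it can end at a different model from a backward or bidirectional search over the same predictors.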

BACKWARD STEP WISE

backwardnew <- step(x2, direction = "backward")
backwardnew$anova
summary(backwardnew)
backwardnew$coefficients


## summary(backwardnew)

Call:
lm(formula = mpg ~ wt + qsec + am, data = cardata)

Residuals:
Min 1Q Median 3Q Max
-3.4811 -1.5555 -0.7257 1.4110 4.6610

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   9.6178     6.9596   1.382 0.177915
wt           -3.9165     0.7112  -5.507 6.95e-06 ***
qsec          1.2259     0.2887   4.247 0.000216 ***
am            2.9358     1.4109   2.081 0.046716 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.459 on 28 degrees of freedom
Multiple R-squared: 0.8497, Adjusted R-squared: 0.8336
F-statistic: 52.75 on 3 and 28 DF, p-value: 1.21e-11

## backwardnew$coefficients
(Intercept)          wt        qsec          am
   9.617781   -3.916504    1.225886    2.935837

BOTH STEP WISE:

bothnew <- step(x2, direction = "both")
bothnew$anova
bothnew$coefficients
summary(bothnew)

## bothnew <- step(x2,direction = "both")


Start: AIC=70.9
mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb

       Df Sum of Sq    RSS    AIC
- cyl   1    0.0799 147.57 68.915
- vs    1    0.1601 147.66 68.932
- carb  1    0.4067 147.90 68.986
- gear  1    1.3531 148.85 69.190
- drat  1    1.6270 149.12 69.249
- disp  1    3.9167 151.41 69.736
- hp    1    6.8399 154.33 70.348
- qsec  1    8.8641 156.36 70.765
<none>              147.49 70.898
- am    1   10.5467 158.04 71.108
- wt    1   27.0144 174.51 74.280

Step: AIC=68.92
mpg ~ disp + hp + drat + wt + qsec + vs + am + gear + carb

       Df Sum of Sq    RSS    AIC
- vs    1    0.2685 147.84 66.973
- carb  1    0.5201 148.09 67.028
- gear  1    1.8211 149.40 67.308
- drat  1    1.9826 149.56 67.342
- disp  1    3.9009 151.47 67.750
- hp    1    7.3632 154.94 68.473
<none>              147.57 68.915
- qsec  1   10.0933 157.67 69.032
- am    1   11.8359 159.41 69.384
+ cyl   1    0.0799 147.49 70.898
- wt    1   27.0280 174.60 72.297

Step: AIC=66.97
mpg ~ disp + hp + drat + wt + qsec + am + gear + carb

       Df Sum of Sq    RSS    AIC
- carb  1    0.6855 148.53 65.121
- gear  1    2.1437 149.99 65.434
- drat  1    2.2139 150.06 65.449
- disp  1    3.6467 151.49 65.753
- hp    1    7.1060 154.95 66.475
<none>              147.84 66.973
- am    1   11.5694 159.41 67.384
- qsec  1   15.6830 163.53 68.200
+ vs    1    0.2685 147.57 68.915
+ cyl   1    0.1883 147.66 68.932
- wt    1   27.3799 175.22 70.410

Step: AIC=65.12
mpg ~ disp + hp + drat + wt + qsec + am + gear

       Df Sum of Sq    RSS    AIC
- gear  1     1.565 150.09 63.457
- drat  1     1.932 150.46 63.535
<none>              148.53 65.121
- disp  1    10.110 158.64 65.229
- am    1    12.323 160.85 65.672
- hp    1    14.826 163.35 66.166
+ carb  1     0.685 147.84 66.973
+ vs    1     0.434 148.09 67.028
+ cyl   1     0.414 148.11 67.032
- qsec  1    26.408 174.94 68.358
- wt    1    69.127 217.66 75.350

Step: AIC=63.46
mpg ~ disp + hp + drat + wt + qsec + am

       Df Sum of Sq    RSS    AIC
- drat  1     3.345 153.44 62.162
- disp  1     8.545 158.64 63.229
<none>              150.09 63.457
- hp    1    13.285 163.38 64.171
+ gear  1     1.565 148.53 65.121
+ cyl   1     1.003 149.09 65.242
+ vs    1     0.645 149.45 65.319
+ carb  1     0.107 149.99 65.434
- am    1    20.036 170.13 65.466
- qsec  1    25.574 175.67 66.491
- wt    1    67.572 217.66 73.351

Step: AIC=62.16
mpg ~ disp + hp + wt + qsec + am

       Df Sum of Sq    RSS    AIC
- disp  1     6.629 160.07 61.515
<none>              153.44 62.162
- hp    1    12.572 166.01 62.682
+ drat  1     3.345 150.09 63.457
+ gear  1     2.977 150.46 63.535
+ cyl   1     2.447 150.99 63.648
+ vs    1     1.121 152.32 63.927
+ carb  1     0.011 153.43 64.160
- qsec  1    26.470 179.91 65.255
- am    1    32.198 185.63 66.258
- wt    1    69.043 222.48 72.051

Step: AIC=61.52
mpg ~ hp + wt + qsec + am

       Df Sum of Sq    RSS    AIC
- hp    1     9.219 169.29 61.307
<none>              160.07 61.515
+ disp  1     6.629 153.44 62.162
+ carb  1     3.227 156.84 62.864
+ drat  1     1.428 158.64 63.229
- qsec  1    20.225 180.29 63.323
+ cyl   1     0.249 159.82 63.465
+ vs    1     0.249 159.82 63.466
+ gear  1     0.171 159.90 63.481
- am    1    25.993 186.06 64.331
- wt    1    78.494 238.56 72.284


Step: AIC=61.31
mpg ~ wt + qsec + am

       Df Sum of Sq    RSS    AIC
<none>              169.29 61.307
+ hp    1     9.219 160.07 61.515
+ carb  1     8.036 161.25 61.751
+ disp  1     3.276 166.01 62.682
+ cyl   1     1.501 167.78 63.022
+ drat  1     1.400 167.89 63.042
+ gear  1     0.123 169.16 63.284
+ vs    1     0.000 169.29 63.307
- am    1    26.178 195.46 63.908
- qsec  1   109.034 278.32 75.217
- wt    1   183.347 352.63 82.790

## bothnew$anova
    Step Df   Deviance Resid. Df Resid. Dev      AIC
1        NA         NA        21   147.4944 70.89774
2  - cyl  1 0.07987121        22   147.5743 68.91507
3   - vs  1 0.26852280        23   147.8428 66.97324
4 - carb  1 0.68546077        24   148.5283 65.12126
5 - gear  1 1.56497053        25   150.0933 63.45667
6 - drat  1 3.34455117        26   153.4378 62.16190
7 - disp  1 6.62865369        27   160.0665 61.51530
8   - hp  1 9.21946935        28   169.2859 61.30730
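
The AIC column that step() reports for a linear model can be reproduced from the residual sum of squares as n * log(RSS / n) + 2 * k, where k is the number of estimated coefficients. This is the form used by extractAIC() for lm fits, and it differs from AIC() by an additive constant. The sketch below checks the first and last rows of the table; step_aic is a helper name introduced here, and n = 32 follows from 21 residual degrees of freedom with 11 coefficients.

```r
# AIC as computed during the stepwise search (up to an additive constant)
step_aic <- function(rss, n, k) n * log(rss / n) + 2 * k

n <- 32   # observations: 21 residual df + 11 coefficients in the full model

# Row 1: full model, 10 predictors + intercept
aic_full  <- step_aic(rss = 147.4944, n = n, k = 11)

# Row 8: final model mpg ~ wt + qsec + am, 3 predictors + intercept
aic_final <- step_aic(rss = 169.2859, n = n, k = 4)

print(round(c(full = aic_full, final = aic_final), 3))
```

Dropping a predictor always raises the RSS, but the 2 * k penalty falls by 2 for each coefficient removed, so a term is dropped whenever the rise in n * log(RSS / n) is smaller than 2.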

## bothnew$coefficients
(Intercept)          wt        qsec          am
   9.617781   -3.916504    1.225886    2.935837

## summary(bothnew)

Call:
lm(formula = mpg ~ wt + qsec + am, data = cardata)

Residuals:
Min 1Q Median 3Q Max
-3.4811 -1.5555 -0.7257 1.4110 4.6610

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   9.6178     6.9596   1.382 0.177915
wt           -3.9165     0.7112  -5.507 6.95e-06 ***
qsec          1.2259     0.2887   4.247 0.000216 ***
am            2.9358     1.4109   2.081 0.046716 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.459 on 28 degrees of freedom
Multiple R-squared: 0.8497, Adjusted R-squared: 0.8336
F-statistic: 52.75 on 3 and 28 DF, p-value: 1.21e-11


ASSIGNMENT 30
TABLES USED

INPUT TABLE:

CARDATA TABLE:


SAMPLE DATA TABLE:

PAIRED T TEST:

ANN2 TABLE:


TWOWAYANN:

WTLOSS:


ASSIGNMENT 17

R Notebook
summary(INPUT[,3])
summary(INPUT[,2:3])

quantile(INPUT$SALARY, 0.25)
quantile(INPUT$SALARY, 0.50)
quantile(INPUT$SALARY, 0.75)


SALARY
Min. :45000
1st Qu.:47250
Median :65000
Mean :65875
3rd Qu.:80750
Max. :92000

NAME SALARY
Length:8 Min. :45000
Class :character 1st Qu.:47250
Mode :character Median :65000
Mean :65875
3rd Qu.:80750
Max. :92000

min(INPUT$SALARY)
max(INPUT$SALARY)
mean(INPUT$SALARY)

median(INPUT$SALARY)
var(INPUT$SALARY)
sd(INPUT$SALARY)


[1] 45000
[1] 92000
[1] 65875
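
The summary functions used in this notebook can be tried on a small vector without the INPUT table. The salary values below are hypothetical and are not the eight rows of INPUT; they are chosen so the results are easy to verify by hand.

```r
# Hypothetical salaries (NOT the INPUT table), evenly spaced for easy checking
salary <- c(30000, 40000, 50000, 60000, 70000)

# Quartiles: quantile() accepts several probabilities in one call
q <- quantile(salary, probs = c(0.25, 0.50, 0.75))

# Spread: sample variance (divides by n - 1) and its square root
v <- var(salary)
s <- sd(salary)

print(q)
print(c(min = min(salary), max = max(salary),
        mean = mean(salary), median = median(salary)))
print(c(var = v, sd = s))
```

Passing a vector of probabilities to quantile() replaces the three separate calls, and sd() is always the square root of var().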

This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you
execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the *Run* button within the chunk or by
placing your cursor inside it and pressing *Ctrl+Shift+Enter*.

```{r}
plot(cars)
```

Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing
*Ctrl+Alt+I*.

When you save the notebook, an HTML file containing the code and output will be saved
alongside it (click the *Preview* button or press *Ctrl+Shift+K* to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently,
unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when
it was last run in the editor is displayed.
