SEC Notes
UNIT 1
Overview of R, R data types and objects, reading and writing data, Essentials of R
language, Running R, Packages in R, Variables names and assignment, Operators,
Integers, Factors, Logical operations, Operations of Scalars, Vectors, Lists, Arrays,
Matrices, Data frames, Control structures and function
Basic Syntax
z<-"Good morning"
To print
print(z)
output:
[1] "Good morning"
#To compare two objects
x="raju"
y="RAJU"
z="RAJU"
x==y
output:
[1] FALSE
x==z
output:
[1] FALSE
y==z
output:
[1] TRUE
#To clear the environment window
rm(list = ls())
#R as calculator
2+2
[1] 4
2*4
[1] 8
585*5
[1] 2925
2/2
[1] 1
Arithmetic operations
x=5
y=6
add = x+y
add
[1] 11
multiple=x*y
multiple
[1] 30
sub=x-y
sub
[1] -1
z=9
x*z
[1] 45
x^y
[1] 15625
#modulo operator (%%), which gives us the remainder
x%%2
[1] 1
Relational Operators
x<y
[1] TRUE
x>y
[1] FALSE
x<=y
[1] TRUE
x>=y
[1] FALSE
Data Types
Output:
[1] "numeric"
[1] "double"
Output:
[1] "complex"
[1] "complex"
3. Character Data Type
fruit="apple"
print(class(fruit))
[1] "character"
4. Integer Data Type
x=350L
print(class(x))
[1] "integer"
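Only the outputs of the first two data types (numeric and complex) survive above. A minimal sketch that reproduces those outputs is given here; the variable names and the values 5.6 and 4+3i are assumed examples, not taken from the original notes.

```r
# 1. Numeric data type (the value 5.6 is an assumed example)
num_value <- 5.6
print(class(num_value))    # [1] "numeric"
print(typeof(num_value))   # [1] "double"

# 2. Complex data type (the value 4+3i is an assumed example)
cplx_value <- 4 + 3i
print(class(cplx_value))   # [1] "complex"
print(typeof(cplx_value))  # [1] "complex"
```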
ls() function
This built-in function is used to know all the present variables in the workspace.
Syntax:
ls()
rm() function
This is again a built-in function used to delete an unwanted variable within your workspace.
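A short sketch of how ls() and rm() work together. The variable names marks_total and marks_count are hypothetical and used only for illustration.

```r
# Hypothetical variables created only for this illustration
marks_total <- 250
marks_count <- 5

before <- ls()     # names of the objects present before deletion
rm(marks_count)    # delete one unwanted variable
after <- ls()      # names of the objects present after deletion

print("marks_count" %in% before)  # [1] TRUE
print("marks_count" %in% after)   # [1] FALSE
```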
Find sum of numbers 4 to 6.
print(sum(4:6))
Vector
A vector is an ordered collection of elements of the same data type.
x = c(1, 3, 5, 7, 8)
print(x)
Output:
[1] 1 3 5 7 8
Subsetting a Vector
x <- c("a", "b", "c", "c", "d", "a")
x[1] # Extract the first element
[1] "a"
x[2] # Extract the second element
[1] "b"
List
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4
empList = list(empId, empName, numberOfEmp)
print(empList)
Output:
[[1]]
[1] 1 2 3 4
[[2]]
[1] "Debi" "Sandeep" "Subham" "Shiba"
[[3]]
[1] 4
Dataframes
print(df)
Output:
Name Language Age
1 Amiya R 22
2 Raj Python 25
3 Asish Java 45
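The statement that creates df is missing above. A sketch that would produce the printed table, with the column names and values read back from that output:

```r
# Columns reconstructed from the printed output above
df <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45)
)
print(df)
```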
Matrices
A = matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)
print(A)
Output:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
Maths=c(55,64,86)
Maths
[1] 55 64 86
#No of elements
length(Maths)
[1] 3
#To read the 2nd value from the 'Maths' vector
Maths[2]
[1] 64
Maths[2:3]
[1] 64 86
Arrays
A = array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2))
print(A)
Output:
, , 1
     [,1] [,2]
[1,]    1    3
[2,]    2    4
, , 2
     [,1] [,2]
[1,]    5    7
[2,]    6    8
Factors
fac = factor(c("Male", "Female", "Male",
"Male", "Female", "Male", "Female"))
print(fac)
Output:
[1] Male Female Male Male Female Male Female
Levels: Female Male
Sorting elements of a Vector
x <- c(8, 2, 7, 1, 11, 2)
A <- sort(x)
cat('ascending order', A, '\n')
B <- sort(x, decreasing = TRUE)
cat('descending order', B)
Output:
ascending order 1 2 2 7 8 11
descending order 11 8 7 2 2 1
Creating a vector by seq() function
V = seq(1, 40, by= 4)
# Printing the vector
print(V)
# Printing the fifth element of the vector
print(V[5])
Output:
[1] 1 5 9 13 17 21 25 29 33 37
[1] 17
To sort the numbers in the vector
x=c(5,-2,3,-7)
sort(x)
[1] -7 -2 3 5
sort(x,decreasing = T)
[1] 5 3 -2 -7
Reverse function
rev(x)
[1] -7 3 -2 5
Concatenate the Strings
u<-paste("abc","de","f")
u
[1] "abc de f"
a<-paste("abc","de","f",u) #Concatenate the Strings
a
[1] "abc de f abc de f"
Separating
w<-paste("abc","de","f",sep = "/") #Concatenate the Strings
w
[1] "abc/de/f"
Splitting
x<-strsplit(w,split = "/")
x
[[1]]
[1] "abc" "de"  "f"
Install packages
install.packages("package_name")
Example : install.packages("dplyr")
Read the data
dt<-read.csv("Loan_data_new.csv")
names(dt)
head(dt)
tail(dt)
View(dt)
To Know the Current working path
getwd()
To change the Current working path
setwd("E:/BBA R code")
Data Frames
# Create the data frame.
BMI <- data.frame(
gender = c("Male", "Male","Female"),
height = c(152, 171.5, 165),
weight = c(81,93, 78),
Age = c(42,38,26) )
print(BMI)
  gender height weight Age
1   Male  152.0     81  42
2   Male  171.5     93  38
3 Female  165.0     78  26
Strings
# R program to access
# characters in a string
# Accessing characters
# using substr() function
substr("Learn Code Tech", 1, 1)
# Create a string
string <- "Hello, Universe!"
print(string)
Output
[1] "Hello, Universe!"
With the sprintf() function we can format a string: the %d format specifier is used for the
integer value x and the %.2f format specifier (two decimal places) for the floating-point
value y. The formatted string is saved in the variable result and then written to the console
with the print function. For x = 42 and y = 3.14159 the output is "The solution is 42, and pi
is 3.14", which is the format string with the values of x and y substituted for the format
specifiers.
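The code for the formatting example described above is not present in these notes. A minimal sketch consistent with that description is given here; the names x, y and result come from the text, while the exact value of y is an assumption.

```r
x <- 42L        # integer value, formatted with %d
y <- 3.14159    # floating-point value (assumed), formatted with %.2f
result <- sprintf("The solution is %d, and pi is %.2f", x, y)
print(result)
# [1] "The solution is 42, and pi is 3.14"
```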
#-----------------------------------------------------------------------------------------------------------
Loops
R if statement
if (test_expression) {
statement
}
If the test_expression is TRUE, the statement gets executed. But if it's FALSE, nothing
happens.
Here, test_expression should evaluate to a single logical or numeric value. Since R 4.2.0, a
condition of length greater than one is an error; older versions used only the first element
and gave a warning.
In the case of a numeric value, zero is taken as FALSE and any other number as TRUE.
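A small sketch of how a numeric condition behaves in an if statement. The variable name status is hypothetical.

```r
status <- "not set"                  # hypothetical variable for illustration
if (0) status <- "zero ran"          # 0 is taken as FALSE, so this is skipped
if (3) status <- "non-zero ran"      # any non-zero number is taken as TRUE
print(status)
# [1] "non-zero ran"
```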
if else statement
The syntax of if else statement is:
if (test_expression) {
statement1
} else {
statement2
}
if else Ladder
The if else ladder (if..else..if) statement allows you execute a block of code among more than
2 alternatives
if ( test_expression1) {
statement1
} else if ( test_expression2) {
statement2
} else if ( test_expression3) {
statement3
} else {
statement4
}
Only one statement will get executed depending upon the test_expressions.
#-----------------------------------------------------------------------------------------------------------
Break
With the break statement, we can stop the loop before it has looped through all the items:
Example
fruits <- c("apple", "banana", "cherry") # example vector (values assumed)
for (x in fruits) {
if (x == "cherry") {
break
}
print(x)
}
The loop will stop at "cherry" because we have chosen to finish the loop by using
the break statement when x is equal to "cherry" (x == "cherry").
#-----------------------------------------------------------------------------------------------------------
Next
With the next statement, we can skip an iteration without terminating the loop:
Example
Skip "banana":
for (x in fruits) {
if (x == "banana") {
next
}
print(x)
}
#----------------------------------------------------------------------------------------------------------
x=0
if (x < 0) {
print("Negative number")
} else if (x > 0) {
print("Positive number")
} else {
print("Zero")
}
#-----------------------------------------------------------------------------------------------------------
x <- -10
# check if x is positive
if (x > 0) {
# check if x is even or odd
if (x %% 2 == 0) {
print("x is a positive even number")
} else {
print("x is a positive odd number")
}
}
#-----------------------------------------------------------------------------------------------------------
for (val in 1: 5)
{
# statement
print(val)
}
#-----------------------------------------------------------------------------------------------------------
val = 1
# using while loop
while (val <= 5)
{
# statements
print(val)
val = val + 1
}
#----------------------------------------------------------------------------------------------------------
# R program to illustrate
# application of while loop
#----------------------------------------------------------------------------------------------------------
# R program to illustrate
# the application of repeat loop
#---------------------------------------------------------------------------------------------------------
i <- 0 # counter (starting value assumed; the start of the original loop is missing)
repeat
{
print(i)
i <- i + 1
# checking the stop condition
if (i == 5)
{
# using break statement
# to terminate the loop
break
}
}
#-----------------------------------------------------------------------------------------------------------
# R program to illustrate
# the use of next statement
#----------------------------------------------------------------------------------------------------------
for (x in fruits) {
if (x == "banana") {
next
}
print(x)
}
#----------------------------------------------------------------------------------------------------------
# R program to access
# characters in a string
# Accessing characters
# using substr() function
a=substr("Learn Code Tech", 1, 2)
a
#_______________________________________________________________________
R Switch Statement
A switch statement is a selection control mechanism that lets the value of an expression
change the control flow of program execution by looking that value up among a list of cases.
The switch statement is used in place of long if statements that compare a variable with
several values. It is a multi-way branch statement that provides an easy way to
dispatch execution to different parts of the code based on the value of the expression.
This statement allows a variable to be tested for equality against a list of values. The key
points are as follows:
o If the expression is a character string, the string is matched against the names of the listed cases.
o If there is more than one match, the first matching element is used.
o There is no dedicated default keyword.
o If no case is matched, an unnamed case (if one is supplied) is used as the default; otherwise NULL is returned.
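The key points above can be checked with a short sketch. The function name pick and the case names and values are hypothetical.

```r
# Hypothetical helper: the case names and values are for illustration only
pick <- function(key) {
  switch(key,
    "a" = "first a",
    "a" = "second a",   # never reached: the first match is used
    "b" = "only b",
    "no match"          # unnamed case, used when nothing matches
  )
}

print(pick("a"))     # [1] "first a"
print(pick("b"))     # [1] "only b"
print(pick("zzz"))   # [1] "no match"

# Without an unnamed case, an unmatched string gives NULL
no_default <- switch("zzz", "a" = "A")
print(is.null(no_default))   # [1] TRUE
```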
1) Based on Index
If the cases are plain values, such as the elements of a character vector, and the expression
evaluates to a number, then the expression's result is used as an index to select the case.
2) Based on Matching Value
When the cases have both a case value and an output value, like "case_1"="value1", then the
expression value is matched against the case values. If there is a match with a case, the
corresponding value is the output.
The basic syntax of the switch statement is as follows:
switch(expression, case1, case2, case3, ...)
x <- switch(
3,
"Shubham",
"Nishka",
"Gunjan",
"Sumit"
)
print(x)
#________________________________________________________________________
ax= 1
bx = 2
y = switch(
ax+bx,
"Hello, Shubham",
"Hello Arpita",
"Hello Vaishali",
"Hello Nishka"
)
print (y)
#________________________________________________________________________
y = "18"
x = switch(
y,
"9"="Hello Arpita",
"12"="Hello Vaishali",
"18"="Hello Nishka",
"21"="Hello Shubham"
)
print (x)
#_________________________________________________________________________
x= "2"
y="1"
a = switch(
paste(x,y,sep=""),
"9"="Hello Arpita",
"12"="Hello Vaishali",
"18"="Hello Nishka",
"21"="Hello Shubham"
)
print (a)
#________________________________________________________________________
R Packages
R packages are collections of R functions, sample data, and compiled code. In the R
environment, these packages are stored under a directory called "library." During installation,
R installs a set of packages. We can add packages later when they are needed for some specific
purpose. Only the default packages are available when we start the R console. Other
packages that are already installed must be loaded explicitly before an R program can use them.
The following commands are used to check, verify, and use R packages.
Check Available R Packages
To check the available R Packages, we have to find the library location in which R packages
are contained. R provides the .libPaths() function to find the library locations.
.libPaths()
When the above code executes, it prints the library paths, which may vary depending
on the local settings of our PCs and laptops.
R provides the library() function, which, when called without arguments, lists all the
installed packages.
library()
When we execute the above function, it produces a list that may vary
depending on the local settings of our PCs or laptops.
#_____________________________________________________________________
Like library() function, R provides search() function to get all packages currently loaded in the
R environment.
search()
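A small sketch that stores the results of these functions in variables so they can be inspected. The variable names lib_dirs and loaded are hypothetical, and the printed values will differ from machine to machine.

```r
lib_dirs <- .libPaths()   # character vector of library directories
loaded <- search()        # packages and environments currently attached

print(lib_dirs)
print(loaded)
print("package:base" %in% loaded)   # the base package is always attached
```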
#________________________________________________________________________
Install a New Package
In R, there are two techniques to add new R packages. The first technique is installing package
directly from the CRAN directory, and the second one is to install it manually after
downloading the package to our local system.
The following command is used to get the packages directly from CRAN webpage and install
the package in the R environment. We may be prompted to choose the nearest mirror. Choose
the one appropriate to our location.
install.packages("Package Name")
install.packages("XML")
#__________________________________________________________________________
Load Package to Library
We cannot use a package in our code until it is loaded into the current R
environment. This also applies to a package that was installed previously but is not yet
available in the current environment. A package is loaded with the library() function.
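A minimal sketch of loading a package. The tools package is chosen here only because it ships with R and needs no installation; for a package such as XML, install.packages() would have to be run first.

```r
library(tools)   # attach the package to the current session

# The package now appears on the search path
print("package:tools" %in% search())

# Its functions can be called directly, e.g. file_ext() from tools
print(file_ext("Loan_data_new.csv"))
# [1] "csv"
```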
#---------------------------------------------------------------------------------------------------------
• Transpose of a Matrix
• Joining Rows and Columns
• Merging of Data Frames
• Melting and Casting
Why R – Data Reshaping is Important?
When doing an analysis or using an analytic function, the data obtained from an
experiment or study is often not in the shape that the function expects. The obtained data
usually has one or more columns that identify a row, followed by a number of columns that
hold the measured values. The columns that identify a row act like the composite key of
a table in a database.
Transpose of a Matrix
We can easily calculate the transpose of a matrix in R with the help of the t()
function. The t() function takes a matrix or data frame as input and returns the transpose of
that matrix or data frame as its output.
Syntax:
t(Matrix/ Data frame)
# R program to find the transpose of a matrix
#--------------------------------------------------------------------------------------------------------------
# Example matrix (values assumed; the original definition is missing)
first <- matrix(1:12, nrow = 4, byrow = TRUE)
print("Original Matrix")
first
# Transpose (this step is reconstructed)
first <- t(first)
print("Transpose of the Matrix")
first
#--------------------------------------------------------------------------------------------------------------
Joining Rows and Columns in Data Frame
In R, we can join two vectors or merge two data frames using functions. There are basically
two functions that perform these tasks:
cbind():
We can combine vectors, matrices or data frames by columns using the cbind() function.
Syntax: cbind(x1, x2, x3)
where x1, x2 and x3 can be vectors or matrices or data frames.
rbind():
We can combine vectors, matrices or data frames by rows using the rbind() function.
Syntax: rbind(x1, x2, x3)
where x1, x2 and x3 can be vectors or matrices or data frames.
#--------------------------------------------------------------------------------------------------------------
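# Example vectors (values assumed; the original definitions are missing)
name <- c("Asha", "Ravi")
age <- c("24", "53")
address <- c("delhi", "mumbai")
# Cbind function
info <- cbind(name, age, address)
print(info)
# New rows to append (the name values are assumed)
newd <- data.frame(name=c("Sita", "Mohan"),
age=c("28", "87"),
address=c("bangalore", "kolkata"))
#--------------------------------------------------------------------------------------------------------------
# Rbind function
new.info <- rbind(as.data.frame(info), newd)
print(new.info)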
#--------------------------------------------------------------------------------------------------------------
Merging two Data Frames
In R, we can merge two data frames using the merge() function, provided the data
frames share at least one column name. The two data frames are merged on the values of this
common key column.
Syntax: merge(dfA, dfB, …)
ID=c("114", "115"))
print(total)
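The merge code above is incomplete. A self-contained sketch of the same idea is given here; the data frames and all their values are hypothetical, except the ID values "114" and "115", which appear in the fragment above.

```r
# Hypothetical data frames sharing the key column ID
dfA <- data.frame(ID = c("113", "114", "115"),
                  name = c("Asha", "Ravi", "Meena"))
dfB <- data.frame(ID = c("114", "115"),
                  city = c("delhi", "mumbai"))

# Keep only the rows whose ID occurs in both data frames
total <- merge(dfA, dfB, by = "ID")
print(total)
```

Rows with no partner in the other data frame (here ID "113") are dropped by default; merge() has all, all.x and all.y arguments to keep them.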
#--------------------------------------------------------------------------------------------------------------
Melting and Casting
Data reshaping involves many steps in order to obtain the desired or required format. One of
the popular methods is melting the data, which converts each row into unique id-variable
combinations, and then casting it. The two functions used for this process, both from the
reshape2 package, are:
melt():
It is used to convert a data frame into a molten data frame.
Syntax: melt(data, …, na.rm=FALSE, value.name="value")
where,
data: data to be melted
… : further arguments passed to or from other methods
na.rm: if TRUE, NA values are removed, converting explicit missings into implicit missings
value.name: name of the column used to store the values
dcast():
It is used to aggregate the molten data frame into a new form.
Syntax: dcast(data, formula, fun.aggregate)
where,
data: molten data to be cast
formula: formula that defines how to cast
fun.aggregate: aggregation function, used if the cast requires the data to be aggregated
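A minimal sketch of melting and casting, assuming the reshape2 package is installed (the code skips the reshaping step if it is not). The data frame wide and its values are hypothetical.

```r
# Hypothetical marks table in wide form
wide <- data.frame(
  student = c("A", "B"),
  maths = c(55, 64),
  science = c(70, 81)
)

if (requireNamespace("reshape2", quietly = TRUE)) {
  # melt: one row per (student, variable) combination
  molten <- reshape2::melt(wide, id.vars = "student", value.name = "marks")
  print(molten)
  # dcast: back to one row per student and one column per variable
  recast <- reshape2::dcast(molten, student ~ variable, value.var = "marks")
  print(recast)
} else {
  message("reshape2 is not installed; run install.packages(\"reshape2\") first")
}
```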
UNIT 2
x<-c(65,81,72,59,71,53,85,66,66,70,72,71,79,76,77,68,65,73,64,
72,82,73,77,75,80,85,89,74,86,83,87,77,67,80,78,69,64,67,79,
60,62,78,59,92,74,68,63,69,67,67,84,83,69,72,62,74,73,68,74, 65)
Arithmetic Mean
AM=mean(x)
AM
[1] 72.66667
cat("Arithmetic Mean=",AM)
Median
Md=median(x)
Md
[1] 72
cat("Median=",Md)
Median= 72
Mode
Mo=table(x)
Mode=names(Mo)[Mo==max(Mo)]
cat("Mode=",Mode)
Mode= 67 72 74
Measures of Dispersion
Range
R=diff(range(x))
cat("Range=",R)
Range= 39
Quartile Deviation
Q1=quantile(x,0.25)
cat("First Quartile=",Q1)
First Quartile= 67
Q3=quantile(x,0.75)
cat("Third Quartile=",Q3)
QD=(Q3-Q1)/2
cat("Quartile Deviation=",QD)
Mean Deviation
n=length(x)
MD=sum(abs(x-AM))/n
cat("Mean Deviation=",MD)
Variance and Standard Deviation
V=var(x)
variance=((n-1)/n)*V
cat("Variance=",variance)
Variance= 67.62222
Sd=sqrt(variance)
cat("Standard Deviation=",Sd)
Quartiles
quartiles=quantile(x)
quartiles
Summary
summary(x)
Coefficient of Range
CR=(max(x)-min(x))/(max(x)+min(x))
CR
[1] 0.2689655
Q1=quantile(x,0.25)
cat("First Quartile=",Q1)
First Quartile= 67
Q3=quantile(x,0.75)
cat("Third Quartile=",Q3)
QD=(Q3-Q1)/2
cat("Quartile Deviation=",QD)
CQD=(Q3-Q1)/(Q3+Q1)
n=length(x)
AM=mean(x)
MD=sum(abs(x- AM))/n
cat("Mean Deviation=",MD)
CMD=MD/AM
V=var(x)
variance=((n-1)/n)*V
cat("Variance=",variance)
Variance= 67.62222
Sd=sqrt(variance)
cat("Standard Deviation=",Sd)
Stud_A=c(82,73,95,46,54,61)
Stud_B=c(41,84,66,75,93,82)
For Student A
n=length(Stud_A)
Stud_A_AM=mean(Stud_A)
V=var(Stud_A)
variance=((n-1)/n)*V
cat("Variance=",variance)
Variance= 279.5833
Sd=sqrt(variance)
cat("Standard Deviation=",Sd)
For Student B
n=length(Stud_B)
Stud_B_AM=mean(Stud_B)
V1=var(Stud_B)
variance1=((n-1)/n)*V1
cat("Variance=",variance1)
Variance= 279.5833
Std=sqrt(variance1)
cat("Standard Deviation=",Std)
CV1=Sd/Stud_A_AM
cat("Coefficient of variation=",CV1)
CV2=Std/Stud_B_AM
cat("Coefficient of variation=",CV2)
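The same comparison can be sketched with small helper functions, so that n is always taken from the vector being analysed rather than from an earlier data set. The helper names pop_sd and cv are hypothetical.

```r
# Hypothetical helpers: population standard deviation and coefficient of variation
pop_sd <- function(v) {
  n <- length(v)                # size of this vector
  sqrt(var(v) * (n - 1) / n)    # rescale the sample variance to the population variance
}
cv <- function(v) pop_sd(v) / mean(v)

Stud_A <- c(82, 73, 95, 46, 54, 61)
Stud_B <- c(41, 84, 66, 75, 93, 82)

cat("CV of Student A =", cv(Stud_A), "\n")
cat("CV of Student B =", cv(Stud_B), "\n")
```

The student with the smaller coefficient of variation is the more consistent of the two.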
install.packages("moments")
library(moments)
x=c(6,8,17,21,15,11)
central_moments<-all.moments((x),order.max=4,central=T)
central_moments
[1] 1 0 27 18 1235
raw_moments<-all.moments((x),order.max=4,central=F)
raw_moments
Skewness
sk<-skewness(x)
sk
[1] 0.1283001
n=length(x)
AM=mean(x)
Md=median(x)
sd=sqrt(var(x)*(n-1)/n)
Skp=3*(AM-Md)/sd
Q1=quantile(x,0.25)
Q2=quantile(x,0.50)
Q3=quantile(x,0.75)
Skb=(Q3+Q1-2*Q2)/(Q3-Q1)
Kurtosis
Ku=kurtosis(x)
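As a cross-check that does not need the moments package, the moment-based skewness and kurtosis can be sketched in base R from the central moments. The variable names are hypothetical; the results should agree with skewness(x) and kurtosis(x) from moments, which use the same population-moment ratios.

```r
x <- c(6, 8, 17, 21, 15, 11)
d <- x - mean(x)     # deviations from the mean

m2 <- mean(d^2)      # second central moment
m3 <- mean(d^3)      # third central moment
m4 <- mean(d^4)      # fourth central moment

skew_value <- m3 / m2^1.5   # moment coefficient of skewness
kurt_value <- m4 / m2^2     # moment coefficient of kurtosis

cat("Skewness =", skew_value, "\n")
cat("Kurtosis =", kurt_value, "\n")
```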
lb<-seq(5,13,2);lb
[1] 5 7 9 11 13
Data visualization is the representation of data or information in the form of graphs,
charts, maps and plots. Using these visual elements, it is easy to understand large,
complex data. For employees and business people it is an easy way to present data to
non-technical people. The popular tools for data visualization are Tableau, Plotly, R,
Google Charts, Infogram, etc. R programming is designed for statistical computing and
graphics; it is flexible and requires minimal code when using different packages.
Advantages
R offers a set of built-in functions and libraries for data visualization. Some of them are
mentioned below:
• ggplot2
• lattice
• highcharter
• leaflet
• RColorBrewer
• plotly
• sunburstR
• rgl
• dygraphs
Data Visualization- Diagrammatic Presentation (Bar and Pie)
Bar Charts
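Bar charts are the pictorial representation of data with rectangular bars whose heights (or
lengths) are proportional to the values they represent. The function barplot() is used in R
to create bar charts.
Syntax
barplot(height, names.arg, xlab, ylab, main, col, horiz)
where
• height is a vector or matrix containing the numeric values used in the bar chart.
• names.arg is a vector of names appearing under each bar.
• xlab and ylab are the labels for the x-axis and y-axis.
• main is the title of the chart.
• col is used to give colors to the bars.
• horiz is a logical value; if TRUE, the bars are drawn horizontally.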
1. The total number of runs scored by a few players in one-day match is given.
Players 1 2 3 4 5 6
No. of runs 30 60 10 50 70 40
#Frequencies
runs<-c(30,60,10,50,70,40)
#Plotting
barplot(runs, horiz = TRUE, xlab = "X-axis", ylab = "Y-axis", main ="Bar-Chart",
col ="green")
2. The following table gives the value(in crores) of contracts secured from abroad, in
respect of civil construction, industrial turnkey projects and software consultancy in
three financial years. Construct a suitable bar diagram to denote the share of activity in
total export earnings from the three projects.
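Multiple Bar
#Inputting data
years<-c("1994-95","1995-96","1996-97")
years
projects<-c("Civil construction","Industrial turnkey projects","Software consultancy")
#Frequencies (assumed layout: one row per project, one column per year)
values<-matrix(c(260,312,338,442,712,861,1740,1800,2000),
nrow=3,ncol=3,byrow=TRUE)
#Assigning colors (color choices are assumed)
colors<-c("green","orange","brown")
#Plotting (axis labels are assumed)
barplot(values, main = "Contracts", names.arg = years,
xlab = "Year", ylab = "Value (in crores)",
col = colors, beside = TRUE)
legend("topleft", projects, cex = 0.7, fill = colors)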
#Assigning Colours (color choices are assumed)
colors<-c("green","orange")
#Inputting data
family<-c("family A","family B")
items<-c("Food","Clothing","Rent","Light and fuel","Miscellaneous")
#Frequencies
values<-matrix(c(1600,800,600,200,800,1200,600,500,100,600),
nrow=2,ncol=5,byrow=TRUE)
#Plotting
barplot(values, main = "Monthly Expenditure", names.arg = items,
xlab = "Items of Expenditure", ylab = "Expenditure",
col = colors)
legend("topleft", family, cex = 0.7, fill = colors)
4. The number of hours spent by a school student on various activities on a working day,
is given below. Construct a pie chart.
Activity     Sleep  School  Play  Homework  Others
No of hours  8      6       3     3         4
Pie Chart
A pie chart is a representation of values as slices of a circle with different colors. In R, the
pie() function is used, which takes a vector of non-negative numbers.
Syntax
pie(x, labels, radius, main, col, clockwise)
where
• x is a vector containing the numeric values used in the pie chart.
• labels is used to give description to the slices.
• radius indicates the radius of the circle of the pie chart (value between −1 and +1).
• main indicates the title of the chart.
• col indicates the color palette.
• clockwise is a logical value indicating if the slices are drawn clockwise or
anti-clockwise.
Example
#Frequencies
hours<-c(8,6,3,3,4)
activity<-c("Sleep","School","Play","Homework","Others")
#Calculating Percentages
piepercent<- round(100 * hours / sum(hours), 1)
# Plot the chart.
pie(hours, labels = piepercent,
main = "Pie chart", col = rainbow(length(hours)))
legend("topright", activity,
cex = 0.5, fill = rainbow(length(hours)))
Data visualization - Graphical Presentation (Histogram, frequency polygon, Ogives)
and their interpretations
Histogram
A histogram is used to plot a continuous variable. It breaks the data into bins and shows
the frequency of the values falling in each bin.
Syntax
hist(v, main, xlab, xlim, ylim, breaks, col, border)
where
• v: This parameter contains numerical values used in histogram.
• main: This parameter main is the title of the chart.
• col: This parameter is used to set color of the bars.
• xlab: This parameter is the label for horizontal axis.
• border: This parameter is used to set border color of each bar.
• xlim: This parameter specifies the range of values shown on the x-axis.
• ylim: This parameter specifies the range of values shown on the y-axis.
• breaks: This parameter gives the break points between bins, or a single number suggesting
how many bins to use.
Example
#Frequencies
heights <- c(155,155,159,160,153,156,155,160,161,150,154,156,153,160,153,162,150,156,160,154,
157,157,157,157,157)
#Histogram
hist(heights, xlab = "Heights in cms", col = "green",border = "black", xlim =
c(140,170),ylim = c(0,10), breaks = 5)
2. In a batch of 400 students, the heights of students are given in the following table.
Represent it through a histogram.
Heights (in cms.)  140-150  150-160  160-170  170-180  180-190
No. of Students    74       163      135      28       25
Example
#Frequencies
f=c(74,163,135,28,25)
ubx
[1] 100 120 140 160 180 200
#Plotting
plot(ubx,lcf1,type="o",xlim=c(100,200),main="Ogive curve", xlab = "Heights in
cms",ylab="No. of Students",lwd=2)
lines(lbx,gcf1,type="o", lwd=2)
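The code above is incomplete: ubx, lbx, lcf1 and gcf1 are never defined, and the printed ubx values belong to a different data set. A self-contained sketch for the heights table of example 2 is given here. It assumes that ubx and lbx are the upper and lower class boundaries and that lcf1 and gcf1 are the "less than" and "more than" cumulative frequencies; drawing the histogram with barplot() is one possible choice for grouped data.

```r
# Class boundaries and frequencies from the heights table
bounds <- seq(140, 190, by = 10)
f <- c(74, 163, 135, 28, 25)
lbx <- head(bounds, -1)   # lower class boundaries (assumed meaning)
ubx <- tail(bounds, -1)   # upper class boundaries (assumed meaning)

# Histogram from grouped data: adjacent bars, one per class
barplot(f, space = 0, names.arg = paste(lbx, ubx, sep = "-"),
        xlab = "Heights in cms", ylab = "No. of Students", main = "Histogram")

# Cumulative frequencies
lcf1 <- cumsum(f)             # number of students below each upper boundary
gcf1 <- rev(cumsum(rev(f)))   # number of students at or above each lower boundary

# Ogive curves: "less than" (solid) and "more than" (dashed)
plot(ubx, lcf1, type = "o", xlim = c(140, 190), ylim = c(0, sum(f)),
     main = "Ogive curve", xlab = "Heights in cms",
     ylab = "No. of Students", lwd = 2)
lines(lbx, gcf1, type = "o", lwd = 2, lty = 2)
```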
Data visualization- Stem & leaf Plot, Box-Whiskers Plot and their interpretation.
Box plot
A box plot, also known as a box-and-whisker plot, is a type of chart used in exploratory
data analysis to visually show the distribution of data. A box plot shows the minimum,
lower (first) quartile, median, upper (third) quartile and maximum.
It is also useful in comparing the distribution of data across data sets by drawing boxplots for
each of them.
Boxplots are created in R by using the boxplot() function.
Syntax
boxplot(x, data, notch, varwidth, names, main)
where
x is a vector or a formula.
data is the data frame.
notch is a logical value. Set as TRUE to draw a notch.
varwidth is a logical value. Set as TRUE to draw the width of each box proportional to the
square root of the sample size.
names are the group labels which will be printed under each boxplot.
main is used to give a title to the graph.
Example
Store1<-c(350,460,20,160,580,250,210,120,200,510,290,380)
Store1
Store2<-c(520,180,260,380,80,500,630,420,210,70,440,140)
Store2
boxplot(Store2, Store1,notch=TRUE, main="boxplot",col="Orange")
Stem and Leaf Plot
[1] 110 175 161 157 155 108 164 128 114 178 165 133 195 151 71 94 97 42 30 62
[21] 138 156 167 124 164 146 116 149 104 141 103 150 162 149 79 113 69 121 93 143
Output :
   3 | 0
   4 | 02
   5 |
   6 | 29
   7 | 19
   8 | 7
   9 | 347
  10 | 348
  11 | 0346
  12 | 1248
  13 | 38
  14 | 01346899
  15 | 01567
  16 | 124457
  17 | 58
  18 | 47
  19 | 57
  20 | 3
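A display like the one above is produced in base R by the stem() function. A minimal sketch, reusing the Store1 values from the box plot example because the full data set behind the output above is not available here:

```r
# Store1 values taken from the box plot example
Store1 <- c(350, 460, 20, 160, 580, 250, 210, 120, 200, 510, 290, 380)

# Print the stem-and-leaf display to the console
stem(Store1)
```

stem() also has a scale argument that controls the length of the plot.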
Scatter Plot
Scatter plot is also known as scatter graph or scatter chart or scatter diagram. Scatter plot is a
type of graph which gives the relationship between the variables in a data set.
Syntax
plot(x, y, main, xlab, ylab, xlim, ylim, axes)
where
• x is the data set whose values are the horizontal coordinates.
• y is the data set whose values are the vertical coordinates.
• main is the title of the graph.
• xlab is the label in the horizontal axis.
• ylab is the label in the vertical axis.
• xlim is the limits of the values of x used for plotting.
• ylim is the limits of the values of y used for plotting.
• axes indicates whether both axes should be drawn on the plot.
Example
weight<-c(2.620,2.875,2.320,3.215,3.440,3.460,3.783,4.067,4.333,4.578,2.567,3.554,2.678,2.569,
3.5567,4.321,3.4567)
mpg<-c(21.0,21.0,22.8,21.4,18.7,18.1,19.2,17.3,16.4,19.5,15.64,16.57,15.89,17.54,16.877,15.776,
20.167)
plot(x = weight,y = mpg,
xlab = "Weight",
ylab = "Mileage",
xlim = c(2.5,5),
ylim = c(15,30),
main = "Weight vs Mileage"
)
Scatter Plot Matrices
When there are more than two variables, a scatter plot matrix is used. The function pairs() is
used in R.
Syntax
pairs(formula, data)
where
• formula represents the series of variables used in pairs.
• data represents the data set from which the variables will be taken.
Example
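head(iris)
# One color per species (color values assumed; the original definition is missing)
my_cols <- c("red", "green3", "blue")
pairs(iris[, 1:4], pch = 19,
col = my_cols[iris$Species],
lower.panel = NULL)
# Custom upper panel (body assumed; the original is truncated)
upper.panel <- function(x, y) {
points(x, y, pch = 19, col = my_cols[iris$Species])
}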
Line Chart
A line chart is a graph that connects a series of points by drawing line segments between them.
These points are ordered in one of their coordinate (usually the x-coordinate) value. Line charts
are usually used in identifying the trends in data.
The plot() function in R is used to create the line graph.
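Syntax
The basic syntax to create a line chart in R is −
plot(v,type,col,xlab,ylab)
where
• v is a vector containing the numeric values.
• type takes the value "p" to draw only the points, "l" to draw only the lines and "o" to draw
both points and lines.
• xlab is the label for the x-axis.
• ylab is the label for the y-axis.
• col is used to give colors to both the points and lines.
A title can be added to the chart with the main parameter, and further parameters give
more colors and features to the points and lines.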
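A minimal sketch of a single-line chart with the syntax above. The values of v are the ones used in the example further below; the axis labels and title are assumed.

```r
v <- c(17, 25, 38, 13, 41)

# Points joined by lines (type = "o"), drawn in red
plot(v, type = "o", col = "red",
     xlab = "Index", ylab = "Value",   # axis labels are assumed
     main = "Line chart")
```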
#-------------------------------------------------------
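A basic line chart has only one line per graph. More than one line can be drawn on the same
graph by calling the lines() function after plot().
#Example:
v<-c(17,25,38,13,41)
t<-c(22,19,36,19,23)
m<-c(25,14,16,34,29)
# First line with plot(), further lines with lines() (labels and colors assumed)
plot(v, type = "o", col = "red", xlab = "Index", ylab = "Value", main = "Multiple lines")
lines(t, type = "o", col = "blue")
lines(m, type = "o", col = "green")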
Violin plots help us to visualize numerical variables from one or more categories. They are
similar to box plots in the way they show a numerical distribution using five summary-level
statistics. But violin plots also have the density information of the numerical variables. It
allows visualizing the distribution of several categories by displaying their densities.
Syntax: geom_violin()
#----------------------------------------------
library(ggplot2)
# Basic violin plot (the iris data set is assumed here; the original data is missing)
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
geom_violin()
#----------------------------------------------
library(ggplot2)
# Horizontal violin plot
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
geom_violin()+
coord_flip()