DM File - Merged

Experiment 1: Basic fundamentals, installation and use of
software, data editing
AIM: Basic fundamentals of R

 Variables: A variable is a named memory location that stores a
value. Creating a variable reserves space in memory for that value.
 Data operators:
• Arithmetic operators
• Assignment operators
• Relational operators
• Logical operators
• Special operators
 Data type: R is dynamically typed. A variable takes the type of the
value assigned to it, so there is no need to declare variables before
using them.

 Vectors: Sequence of data elements of the same type
 Lists: Sequence of data elements that may be of different types
 Arrays: Store elements of the same type in any number of
dimensions; typically used for more than 2 dimensions
 Matrices: Arrange elements of the same type in a two-dimensional
rectangular layout
 Factors: Used to categorize data and store it as levels
 Data frame: Data frame is a table or two-dimensional array-like
structure in which each column contains values of one variable and
each row contains one set of values from each column.
 Flow control statements:
• If
• If-else
• Switch
• Repeat
• For
• While
• Break
• Next
AIM: Installation of R
 For Windows
• Go to CRAN R project website.
• Click on the Download R for Windows link.
• Click on the base subdirectory link or the install R for the first
time link.
• Click Download R X.X.X for Windows (X.X.X stands for the
latest version of R) and save the executable .exe file.
• Run the .exe file and follow the installation instructions.
AIM: Applications of R
• Statistical research
• Machine learning
• Deep learning
• Finding genetic anomalies and patterns
• Perform various statistical computations and analysis
• Clustering
• Used by universities like Cornell University and UCLA
• Used by top IT companies- Accenture, IBM, TCS, Paytm, Wipro,
Google, Microsoft
AIM: Data editing in R
DataEditR is an R package that makes it easy to interactively view,
enter, filter and edit data. Its data_edit() function opens the editor.
The package can be installed from CRAN by:
install.packages("DataEditR")
Experiment 2: Use of R as a calculator, functions and
assignments
AIM: Use of R as calculator

 Addition
5+4
9
 Subtraction
5-4
1
 Multiplication
5*4
20
 Division
35/8
4.375
 Exponentials
3^ (1/2)
1.732051
AIM: Functions in R
 Built-in
print(mean(22:80))
51
print(sum(41:70))
1665
 User defined
my_function <- function(x) {
  return(5 * x)
}
print(my_function(3))
15
AIM: Assignment in R
x <- 3
x <<- 3
3 -> x
3 ->> x
y=5
print(x)
print(y)
3
5
Experiment 3: Use of R for matrix operations, missing data and
logical operators
AIM: Matrix operations

 Create
m <- matrix(c(1, 5, 6, 8), nrow = 2, ncol = 2)
n <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
print(m)
     [,1] [,2]
[1,]    1    6
[2,]    5    8
print(n)
     [,1] [,2]
[1,]    1    3
[2,]    2    4
 Addition
m+n
     [,1] [,2]
[1,]    2    9
[2,]    7   12
 Element-wise multiplication (use %*% for the matrix product)
m*n
     [,1] [,2]
[1,]    1   18
[2,]   10   32
AIM: Missing data

 Finding Missing values
x<- c(NA, 3, 4, NA, NA, NA)
is.na(x)
TRUE FALSE FALSE TRUE TRUE TRUE
 Removing NA or NaN values
x <- c(1, 2, NA, 3, NA, 4)
d <- is.na(x)
x[! d]
1 2 3 4
AIM: Logical operators
vec1 <- c(0,2)
vec2 <- c(TRUE,FALSE)
 Performing operations on Operands
cat ("Element wise AND :", vec1 & vec2, "\n")
cat ("Element wise OR :", vec1 | vec2, "\n")
# && and || compare single values; from R 4.3.0 longer vectors are an error
cat ("Logical AND :", vec1[1] && vec2[1], "\n")
cat ("Logical OR :", vec1[1] || vec2[1], "\n")
cat ("Negation :", !vec1,"\n")
Element wise AND : FALSE FALSE
Element wise OR : TRUE TRUE
Logical AND : FALSE
Logical OR : TRUE
Negation : TRUE FALSE
Experiment 4: Conditional executions and loops, data
management with sequences
AIM: If-else
a <- 33
b <- 20
if (b > a) {
  print("b is greater than a")
} else {
  print("b is not greater than a")
}
"b is not greater than a"
AIM: Switch
x <- switch(
3,
"a",
"b",
"c",
"d"
)
print(x)
"c"
AIM: For loop

for (x in 1:5) {
  print(x)
}
1
2
3
4
5
AIM: While loop

i <- 1
while (i < 4) {
  print(i)
  i <- i + 1
}
1
2
3
AIM: Sequences
seq(10)
1 2 3 4 5 6 7 8 9 10
Sys.Date()
"2023-04-15"
Experiment 5: Data management with repeats, sorting,
ordering, and lists
AIM: Repeats
rep(3.5, times = 4)
3.5 3.5 3.5 3.5
rep(1:4, 2)
1 2 3 4 1 2 3 4
AIM: Sorting
y <- c(8,5,7,6)
sort(y)
5 6 7 8
sort(y, decreasing = TRUE)
8 7 6 5
AIM: Ordering
y <- c(8,5,7,6)
order(y)
2 4 3 1
AIM: Lists
x <- list("apple", "banana", "cherry")
print(x)
[[1]]
[1] "apple"
[[2]]
[1] "banana"
[[3]]
[1] "cherry"
Experiment 6: Vector indexing, factors, Data management with
strings, display and formatting
AIM: Vector indexing

letters[1:3]
"a" "b" "c"
letters[c(2, 4, 6)]
"b" "d" "f"
AIM: Factors
genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock",
"Jazz"))
print(genre)
Jazz Rock Classic Classic Pop Jazz Rock Jazz
Levels: Classic Jazz Pop Rock
AIM: Strings formatting

print(pi)
3.141593
print(pi, digits = 5)
3.1416
print(format(pi, digits = 10))
"3.141592654"
Python Libraries
NumPy- NumPy stands for Numerical Python. NumPy is a Python library used
for working with arrays. It also has functions for linear algebra, Fourier
transforms, and matrices.
Matplotlib- Matplotlib is a Python library used to create 2D graphs and plots
from Python scripts. Its pyplot module simplifies plotting by providing control
over line styles, font properties, axis formatting, etc. It supports a wide
variety of graphs and plots, such as histograms, bar charts, power spectra and
error charts. Used along with NumPy, it provides an effective open-source
alternative to MATLAB.
Pandas- Pandas is a Python library used for working with data sets. It has
functions for analyzing, cleaning, exploring, and manipulating data. Pandas allows us
to analyze large data sets and draw conclusions based on statistical theory. Pandas
can clean messy data sets and make them readable and relevant.
SciPy- SciPy is a scientific computation library that uses NumPy underneath.
SciPy stands for Scientific Python. It provides more utility functions for optimization,
statistics and signal processing. Like NumPy, SciPy is open source so we can use it freely.
Scikit-learn- Scikit-learn is a widely used, robust library for machine
learning in Python. It provides a selection of efficient tools for machine learning and
statistical modeling, including classification, regression, clustering and dimensionality
reduction, via a consistent interface in Python. This library, which is largely written in
Python, is built upon NumPy, SciPy and Matplotlib.
Seaborn- Seaborn is a library for statistical data visualization in Python.
It provides many color palettes and built-in styles for creating attractive
statistical plots. It is built on top of Matplotlib and provides
dataset-oriented APIs.
 Factorial
 Fibonacci
 Array insertion
 Linear search
 Sorting array
 Sum of elements of array
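The six basic programs listed above (factorial through sum of elements) can be sketched in plain Python as follows. This is a minimal illustration, not the original lab code; the function names and the sample list marks are assumptions made for the example.

```python
def factorial(n):
    """Return n! for a non-negative integer n, computed iteratively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 0, 1."""
    series = []
    a, b = 0, 1
    for _ in range(count):
        series.append(a)
        a, b = b, a + b
    return series


def linear_search(values, target):
    """Return the index of the first match, or -1 if target is absent."""
    for index, value in enumerate(values):
        if value == target:
            return index
    return -1


marks = [40, 10, 30]              # hypothetical sample data
marks.insert(1, 20)               # array insertion: put 20 at index 1
print(marks)                      # [40, 20, 10, 30]
print(linear_search(marks, 10))   # 2
print(sorted(marks))              # [10, 20, 30, 40]
print(sum(marks))                 # 100
print(factorial(5))               # 120
print(fibonacci(7))               # [0, 1, 1, 2, 3, 5, 8]
```

A Python list stands in for the array here; sorted() returns a new list and leaves marks unchanged.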
 Shape and dimension of matrix using NumPy
 Matrix full of zeros and ones
 Matrix reshape and flatten
 Appending data vertically and horizontally
 Indexing
 Mean, median, standard deviation, min, max of an array
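The NumPy items listed above can be illustrated with one short script. This is a sketch assuming NumPy is installed; the 2 x 3 matrix m is made-up sample data, not the data used in the original lab.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical 2 x 3 sample matrix
print(m.shape, m.ndim)                 # (2, 3) 2

zeros = np.zeros((2, 3))               # matrix full of zeros
ones = np.ones((2, 3))                 # matrix full of ones

reshaped = m.reshape(3, 2)             # same 6 elements laid out as 3 x 2
flat = m.flatten()                     # 1-D copy of all elements, row by row

stacked_v = np.vstack((m, m))          # append vertically: shape (4, 3)
stacked_h = np.hstack((m, m))          # append horizontally: shape (2, 6)

print(m[1, 2])                         # indexing: row 1, column 2 gives 6
print(m[:, 0])                         # indexing: first column

# Summary statistics over all elements of the array.
print(np.mean(m), np.median(m), np.std(m), np.min(m), np.max(m))
```

np.std computes the population standard deviation by default; pass ddof=1 for the sample version.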
 Data frame using pandas
 Reshaping data by categorizing into numbers 0,1
 Dealing with missing values by replacing them with column mean
 Filtering data, printing student name with marks>75 and dropping
one column
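The pandas steps listed above might look like the following. This is a hedged sketch assuming pandas and NumPy are installed; the student names, the column names and the F/M to 0/1 mapping are invented for the example and do not come from the original lab data.

```python
import numpy as np
import pandas as pd

# Hypothetical student records; the column names are illustrative only.
df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "Karan"],
    "Gender": ["F", "M", "F", "M"],
    "Marks": [82.0, np.nan, 91.0, 64.0],
})

# Categorize a text column into the numbers 0 and 1.
df["Gender"] = df["Gender"].map({"F": 0, "M": 1})

# Replace missing marks with the column mean (mean() skips NaN by default).
df["Marks"] = df["Marks"].fillna(df["Marks"].mean())

# Names of students with marks above 75.
toppers = df.loc[df["Marks"] > 75, "Name"].tolist()
print(toppers)                    # ['Asha', 'Ravi', 'Meena']

# Drop one column; drop() returns a new frame and leaves df unchanged.
without_gender = df.drop(columns=["Gender"])
print(without_gender)
```

The missing mark is filled with 79.0, the mean of the three known marks, which is why that student passes the marks>75 filter.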
 Line graph
 Bar Graph
 Scatter Graph
 Creating 2 tables S1, S2
 Merging 2 tables on common column Rollno
 Creating a data frame from a dictionary
 sum()
 mean()
 sum(1) #describe()
 std()
 describe(include = 'all')
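The merge and summary items above can be sketched with pandas as follows. The tables S1 and S2 share the Rollno column as in the text; the subject columns and their values are assumptions made for illustration.

```python
import pandas as pd

# Two hypothetical tables that share the Rollno column.
s1 = pd.DataFrame({"Rollno": [1, 2, 3], "Maths": [70, 85, 90]})
s2 = pd.DataFrame({"Rollno": [1, 2, 3], "Science": [60, 75, 80]})

merged = pd.merge(s1, s2, on="Rollno")   # inner join on the common column
print(merged)

# A data frame built directly from a dictionary of columns.
scores = pd.DataFrame({"Maths": [70, 85, 90], "Science": [60, 75, 80]})

print(scores.sum())                      # column totals
print(scores.mean())                     # column means
print(scores.sum(axis=1))                # row totals, the sum(1) item above
print(scores.std())                      # sample standard deviation per column
print(scores.describe(include="all"))    # count, mean, std, min, quartiles, max
```

axis=1 is spelled out here for readability; it asks for one total per row instead of one per column.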

 Machine Learning extension (mlxtend)
 Apriori
 Get unique values from a column
 Iterate over rows
 Apriori frequent items
 Apriori
 Read data
 Grouping
 Transaction encoder
 Apriori
 Sorting
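To show what the Apriori steps above compute, here is a plain-Python sketch of frequent-itemset mining. It does not use mlxtend, so it is a stand-in for the library call rather than the lab's actual code; the function name apriori, the basket data and the 0.5 support threshold are all assumptions for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is at least min_support."""
    baskets = [frozenset(t) for t in transactions]
    total = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / total

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    size = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        size += 1
        # Join step: unions of frequent sets that have exactly `size` items.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every (size - 1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, size - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical transactions, one basket per row.
data = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = apriori(data, min_support=0.5)

# Sorting: most frequent itemsets first.
for itemset, sup in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), sup)
```

With this data the three single items have support 0.75, the three pairs have support 0.5, and the full triple (support 0.25) is dropped.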
 ECLAT
 Read dataset
 Shape of data
 Scatter plot
 KMeans
 Clustering
 Scatter graph
 Line graph
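The clustering step in the list above can be illustrated with a small plain-Python version of k-means (Lloyd's algorithm). This is a sketch of the idea only, not scikit-learn's KMeans; the function name kmeans, the sample points and the starting centroids are assumptions chosen for the example.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied starting centroids.

    Returns (labels, centroids), where labels[i] is the cluster index of points[i].
    """
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break                                   # converged: nothing moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: two well-separated groups of three points each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centers = kmeans(points, [(1, 1), (8, 8)])
print(labels)    # [0, 0, 0, 1, 1, 1]
print(centers)
```

Passing the starting centroids explicitly keeps the run deterministic; library implementations usually pick them at random or with a seeding heuristic.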
R LAB STATISTICS
PRACTICAL FILE
Name : Vinay
Roll No : 21001016070
Branch : B.Tech CS (with specialization in Data Science)
Year : 2nd
Submitted to: Dr. Neelam

