DM File - Merged


Experiment 1: Basic fundamentals, installation and use of software, data editing

AIM: Basic fundamentals of R


 Variables: A variable is a named memory location that stores a value.
Creating a variable reserves space in memory for that value.

 Data operators:
• Arithmetic operators
• Assignment operators
• Relational operators
• Logical operators
• Special operators

 Data type: Variables do not need to be declared before use; a variable
takes the type of the value assigned to it.


 Vectors: Sequence of data elements of same type
 Lists: Sequence of data elements of different types
 Arrays: Used to store data in more than 2 dimensions
 Matrices: Used to arrange elements in a 2-dimensional rectangular layout
 Factors: Used to categorize data and store it as levels
 Data frame: Data frame is a table or two-dimensional array-like
structure in which each column contains values of one variable and
each row contains one set of values from each column.

 Flow control statements:
• If
• If-else
• Switch
• Repeat
• For
• While
• Break
• Next
AIM: Installation of R
 For Windows
• Go to CRAN R project website.
• Click on the Download R for Windows link.
• Click on the base subdirectory link or install R for the first
time link.
• Click Download R X.X.X for Windows (X.X.X stands for the
latest version of R) and save the executable .exe file.
• Run the .exe file and follow the installation instructions.

AIM: Applications of R
• Statistical research
• Machine learning
• Deep learning
• Finding genetic anomalies and patterns
• Perform various statistical computations and analysis
• Clustering
• Used by universities like Cornell University and UCLA
• Used by top IT companies- Accenture, IBM, TCS, Paytm, Wipro,
Google, Microsoft

AIM: Data editing in R

DataEditR is an R package that makes it easy to interactively view, enter,
filter and edit data through its data_edit() function.
This package can be installed by:
install.packages("DataEditR")
Experiment 2: Use of R as a calculator, functions and assignments

AIM: Use of R as calculator


 Addition
5+4
9
 Subtraction
5-4
1
 Multiplication
5*4
20
 Division
35/8
4.375
 Exponentials
3^(1/2)
1.732051

AIM: R as Function
 Built-in
print(mean(22:80))
51
print(sum(41:70))
1665
 User defined
my_function <- function(x) {
  return(5 * x)
}
print(my_function(3))
15

AIM: R as Assignment
x <- 3
x <<- 3
3 -> x
3 ->> x
y=5
print(x)
print(y)
3
5

Experiment 3: Use of R for matrix operations, missing data and logical operators

AIM: Matrix operations


 Create
m <- matrix(c(1, 5, 6, 8), nrow = 2, ncol = 2)
n <- matrix(c(1,2,3,4), nrow = 2, ncol = 2)
print(m)
[,1] [,2]
[1,] 1 6
[2,] 5 8
print(n)
[,1] [,2]
[1,] 1 3
[2,] 2 4
m+n
[,1] [,2]
[1,] 2 9
[2,] 7 12
m*n
[,1] [,2]
[1,] 1 18
[2,] 10 32

AIM: Missing data


 Finding Missing values
x<- c(NA, 3, 4, NA, NA, NA)
is.na(x)
TRUE FALSE FALSE TRUE TRUE TRUE
 Removing NA or NaN values
x <- c(1, 2, NA, 3, NA, 4)
d <- is.na(x)
x[! d]
1 2 3 4
AIM: Logical operators
vec1 <- c(0,2)
vec2 <- c(TRUE,FALSE)
 Performing operations on Operands
cat ("Element wise AND :", vec1 & vec2, "\n")
cat ("Element wise OR :", vec1 | vec2, "\n")
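# && and || compare single values; since R 4.3.0 they raise an error for longer vectors
cat ("Logical AND :", vec1[1] && vec2[1], "\n")
cat ("Logical OR :", vec1[1] || vec2[1], "\n")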
cat ("Negation :", !vec1,"\n")

Element wise AND : FALSE FALSE
Element wise OR : TRUE TRUE
Logical AND : FALSE
Logical OR : TRUE
Negation : TRUE FALSE

Experiment 4: Conditional executions and loops, data management with sequences

AIM: If-else
a <- 33
b <- 20
if (b > a) {
  print("b is greater than a")
} else {
  print("b is not greater than a")
}
"b is not greater than a"
AIM: Switch
x <- switch(
3,
"a",
"b",
"c",
"d"
)
print(x)
"c"

AIM: For loop


for (x in 1:5) {
  print(x)
}
1
2
3
4
5

AIM: While loop


i <- 1
while (i < 4) {
  print(i)
  i <- i + 1
}
1
2
3

AIM: Sequences
seq(10)
1 2 3 4 5 6 7 8 9 10

Sys.Date()
"2023-04-15"

Experiment 5: Data management with repeats, sorting, ordering, and lists

AIM: Repeats
rep(3.5, times = 4)
3.5 3.5 3.5 3.5
rep(1:4, 2)
1 2 3 4 1 2 3 4

AIM: Sorting
y <- c(8,5,7,6)
sort(y)
5 6 7 8
sort(y, decreasing = TRUE)
8 7 6 5
AIM: Ordering
y <- c(8,5,7,6)
order(y)
2 4 3 1

AIM: Lists
x <- list("apple", "banana", "cherry")
print(x)
[[1]]
[1] "apple"
[[2]]
[1] "banana"
[[3]]
[1] "cherry"

Experiment 6: Vector indexing, factors, Data management with strings, display and formatting

AIM: Vector indexing


letters[1:3]
"a" "b" "c"
letters[c(2, 4, 6)]
"b" "d" "f"

AIM: Factors
genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock",
"Jazz"))
print(genre)
Jazz Rock Classic Classic Pop Jazz Rock Jazz
Levels: Classic Jazz Pop Rock

AIM: Strings formatting


print(pi)
3.141593
print(pi, digits = 5)
3.1416
print(format(pi, digits = 10))
"3.141592654"
Python Libraries
NumPy- NumPy stands for Numerical Python. It is a Python library for working
with arrays, and it also provides functions for linear algebra, Fourier
transforms, and matrices.

Matplotlib- Matplotlib is a Python library used to create 2D graphs and plots
from Python scripts. Its pyplot module simplifies plotting by providing control
over line styles, font properties, axis formatting, and so on. It supports a wide
variety of graphs and plots, including histograms, bar charts, power spectra, and
error charts. Used together with NumPy, it provides an effective open-source
alternative to MATLAB.

Pandas- Pandas is a Python library used for working with data sets. It has
functions for analyzing, cleaning, exploring, and manipulating data. Pandas allows
us to analyze large data sets and draw conclusions based on statistical theory. It
can also clean messy data sets to make them readable and relevant.

SciPy- SciPy (Scientific Python) is a scientific computation library built on
NumPy. It provides additional utility functions for optimization, statistics, and
signal processing. Like NumPy, SciPy is open source, so it can be used freely.

Scikit-learn- Scikit-learn is a widely used, robust library for machine
learning in Python. It provides efficient tools for machine learning and
statistical modeling, including classification, regression, clustering, and
dimensionality reduction, through a consistent interface. The library is largely
written in Python and is built upon NumPy, SciPy, and Matplotlib.

Seaborn- Seaborn is a Python library for statistical data visualization. It
provides many color palettes and default styles for creating attractive
statistical plots. It is built on top of Matplotlib and provides dataset-oriented
APIs.
 Factorial

 Fibonacci
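
A minimal sketch of the factorial and Fibonacci exercises in plain Python. The function names and sample inputs are illustrative assumptions, not taken from the original listing.

```python
def factorial(n):
    """Return n! for a non-negative integer n, computed iteratively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0 and 1."""
    terms = []
    a, b = 0, 1
    for _ in range(count):
        terms.append(a)
        a, b = b, a + b
    return terms


print(factorial(5))   # 120
print(fibonacci(8))   # [0, 1, 1, 2, 3, 5, 8, 13]
```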

 Array insertion
 Linear search
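
Array insertion and linear search could look like the sketch below, which uses a plain Python list as the array. The helper name `linear_search` and the sample values are assumptions made for illustration.

```python
def linear_search(items, target):
    """Return the index of the first element equal to target, or -1 if absent."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


numbers = [10, 20, 40, 50]
numbers.insert(2, 30)               # insert 30 at index 2, shifting later items right
print(numbers)                      # [10, 20, 30, 40, 50]
print(linear_search(numbers, 40))   # 3
print(linear_search(numbers, 99))   # -1
```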

 Sorting array
 Sum of elements of array
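
Sorting an array and summing its elements can be sketched with Python built-ins. The sample values below are assumed (they mirror the vector used in the R sorting experiment).

```python
values = [8, 5, 7, 6]

ascending = sorted(values)                  # new list, original is unchanged
descending = sorted(values, reverse=True)
total = sum(values)

print(ascending)    # [5, 6, 7, 8]
print(descending)   # [8, 7, 6, 5]
print(total)        # 26
```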

 Shape and dimension of matrix using NumPy


 Matrix full of zeroes and ones

 Matrix reshape and flatten
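
The NumPy matrix items above (shape and dimension, zero and one matrices, reshape and flatten) can be illustrated as follows. This is a sketch with an assumed 2 x 3 sample matrix; it requires NumPy to be installed.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape)   # (2, 3): 2 rows, 3 columns
print(matrix.ndim)    # 2: number of dimensions

zeros = np.zeros((2, 3))   # 2 x 3 matrix filled with 0.0
ones = np.ones((3, 2))     # 3 x 2 matrix filled with 1.0

reshaped = matrix.reshape(3, 2)   # same six elements laid out as 3 rows x 2 columns
flat = matrix.flatten()           # one-dimensional copy of the elements
print(reshaped.tolist())   # [[1, 2], [3, 4], [5, 6]]
print(flat.tolist())       # [1, 2, 3, 4, 5, 6]
```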


 Appending data vertically and horizontally

 Indexing
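
Appending data vertically and horizontally, and basic indexing, might be done with `np.vstack`, `np.hstack` and bracket indexing as in this sketch. The arrays `a` and `b` are assumed sample data.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

stacked_rows = np.vstack((a, b))   # append b below a, giving shape (4, 2)
stacked_cols = np.hstack((a, b))   # append b to the right of a, giving shape (2, 4)

element = a[0, 1]             # row 0, column 1
first_column = a[:, 0]        # every row, column 0
last_row = stacked_rows[-1]   # negative index counts from the end

print(stacked_cols.tolist())  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```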
 Mean, median, standard deviation, min, max of an array
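
A short sketch of these summary statistics with NumPy, using an assumed array of marks. Note that `np.std` computes the population standard deviation by default (it divides by n, not n - 1).

```python
import numpy as np

marks = np.array([70, 85, 90, 65, 80])

mean_value = np.mean(marks)       # 78.0
median_value = np.median(marks)   # 80.0
std_value = np.std(marks)         # population standard deviation, about 9.27
lowest = marks.min()              # 65
highest = marks.max()             # 90

print(mean_value, median_value, lowest, highest)
```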

 Data frame using pandas


 Reshaping data by categorizing into numbers 0,1

 Dealing with missing values by replacing them with column mean
 Filtering data, printing student name with
marks>75 and dropping one column
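
The pandas steps listed above (building a data frame, recoding a category as 0/1, filling missing values with the column mean, filtering on marks, and dropping a column) could be sketched as follows. The column names and student records are hypothetical sample data; pandas must be installed.

```python
import pandas as pd

students = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "John"],
    "Gender": ["F", "M", "F", "M"],
    "Marks": [82.0, None, 91.0, 68.0],   # one missing value
})

# Recode the two-level category as the numbers 0 and 1
students["GenderCode"] = students["Gender"].map({"F": 0, "M": 1})

# Replace missing marks with the mean of the non-missing marks
students["Marks"] = students["Marks"].fillna(students["Marks"].mean())

# Names of students whose marks exceed 75
top_names = students.loc[students["Marks"] > 75, "Name"].tolist()
print(top_names)   # ['Asha', 'Ravi', 'Meena']

# Drop one column; drop() returns a new data frame
trimmed = students.drop(columns=["Gender"])
```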

 Line graph
 Bar Graph
 Scatter Graph
 Creating 2 tables S1, S2

Merging 2 tables on common column Rollno
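
A sketch of creating the two tables and merging them on the common column Rollno with `pd.merge`. The table contents are assumed sample data; by default the merge is an inner join, so only roll numbers present in both tables are kept.

```python
import pandas as pd

s1 = pd.DataFrame({"Rollno": [1, 2, 3], "Name": ["Asha", "Ravi", "Meena"]})
s2 = pd.DataFrame({"Rollno": [2, 3, 4], "Marks": [75, 88, 91]})

# Inner join on the shared key column
merged = pd.merge(s1, s2, on="Rollno")
print(merged)
```

Passing `how="outer"` instead would keep roll numbers 1 and 4 as well, with missing values in the unmatched columns.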

 Creating a data frame for dictionary


 Sum ()

 Mean ()

 Sum (1) #Describe ()


 Std ()

 Describe (include = 'all')
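
The aggregate methods named above can be tried on a data frame built from a dictionary, as in this sketch. The subject names and scores are assumed; `sum(axis=1)` is taken here to be what "Sum (1)" refers to, which is an interpretation rather than something the file confirms.

```python
import pandas as pd

scores = pd.DataFrame({"Maths": [80, 90, 70], "Science": [60, 75, 90]})

column_totals = scores.sum()          # per-column sums
column_means = scores.mean()          # per-column means
row_totals = scores.sum(axis=1)       # per-row sums
column_std = scores.std()             # sample standard deviation (divides by n - 1)
summary = scores.describe(include="all")

print(column_totals["Maths"])   # 240
print(row_totals.tolist())      # [140, 165, 160]
print(summary)
```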


 Machine Learning extension (Mlxtend)

 Apriori

 Get unique values from a column


 Iterate over rows

 Apriori frequent items


 Apriori

 Read data
 Grouping

 Transaction encoder

 Apriori
 Sorting
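
To make the Apriori steps concrete, here is a plain-Python sketch of frequent itemset mining followed by sorting on support. It is not the mlxtend API mentioned above; the function name `apriori`, the sample transactions and the support threshold are all illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    total = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / total

    # Start with single-item candidates
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that are frequent at this level
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent itemsets into candidates of size k
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
]
frequent = apriori(baskets, min_support=0.5)

# Sort by descending support, then alphabetically by items
for itemset, s in sorted(frequent.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), s)
```

With these baskets, each single item has support 0.8 and each pair 0.6, while the three-item set (support 0.4) falls below the threshold.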

 ECLAT
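
ECLAT finds frequent itemsets by intersecting transaction-id sets (a vertical data layout) instead of rescanning the transactions. The following is a minimal plain-Python sketch under assumed sample data; the function name `eclat` and the `min_count` threshold are hypothetical and not taken from the original listing.

```python
def eclat(transactions, min_count):
    """Return {itemset tuple: support count} using tid-set intersections."""
    # Vertical layout: item -> set of ids of the transactions containing it
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            if len(tids) < min_count:
                continue
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect with the tid-sets of the remaining items to grow the itemset
            suffix = [(other, tids & other_tids)
                      for other, other_tids in candidates[i + 1:]]
            extend(itemset, suffix)

    extend((), sorted(tidsets.items()))
    return frequent


baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
]
itemsets = eclat(baskets, min_count=3)
for itemset, count in sorted(itemsets.items()):
    print(itemset, count)
```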
 Read dataset

 Shape of data

 Scatter plot

 KMeans

 Clustering
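
The clustering step can be illustrated with a small NumPy implementation of Lloyd's k-means algorithm. This is a sketch, not scikit-learn's `KMeans` class; the function name `kmeans`, its parameters and the sample points are assumptions.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster points into k groups; return (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k)
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its assigned points
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


points = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 0.5],
                   [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]])
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Which group gets label 0 depends on the random initialisation, so only the grouping itself is meaningful, not the label numbers.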
 Scatter graph

 Line graph
R LAB STATISTICS
PRACTICAL FILE

Name : Vinay
Roll No : 21001016070
Branch : Btech CS (with specialization
in Data Science)
Year : 2nd

Submitted to: Dr. Neelam


INDEX

s.no. Content
