50 R Exercises
Daniel Christofis
BEGINNER EXERCISES
1.1 Hello, R!
Write a simple R script that prints the message "Hello, R!" to
the console.
1.2 Basic Arithmetic Operations
Create a script that performs the following arithmetic
operations:
1.2.1 Addition of 5 and 7
1.2.2 Subtraction of 10 from 15
1.2.3 Multiplication of 3 by 4
1.2.4 Division of 20 by 5
1.3 Variable Declaration
Declare a variable called my_number and assign it the value
42. Print the value to the console.
1.4 Vector Creation
Create a numeric vector named my_vector containing the
numbers 1 to 5. Print the vector.
1.5 Vector Operations
Perform the following operations on the vector my_vector:
1.5.1 Multiply each element by 2
1.5.2 Add 3 to each element
Print the updated vector.
1.6 Logical Operations
Declare two variables, x and y, with values 5 and 10,
respectively. Check if x is less than y and print the result.
1.7 Data Types
Create variables of different data types: numeric, character,
logical, and print them.
1.8 Basic Data Frame
Create a data frame named my_dataframe with two
columns: "Name" and "Age," and enter information for three
people.
1.9 Indexing in Vectors
Given the vector letters_vector <- c("A", "B", "C", "D", "E"),
print the third element.
1.10 Subsetting Data Frames
From my_dataframe, select and print the row where the age
is greater than 25.
1.11 Conditional Statements
Write a script that checks if a variable num is even or odd
and prints an appropriate message.
1.12 For Loop
Use a for loop to print the numbers from 1 to 5.
1.13 While Loop
Implement a while loop that prints the squares of numbers
from 1 to 4.
1.14 Functions
Create a function called square that takes a number as an
argument and returns its square.
1.15 Packages
Install and load the "dplyr" package. Create a simple data
frame and use the filter function to subset it.
1.16 Reading Data
Read a CSV file named "data.csv" into a data frame called
my_data.
1.17 Plotting
Create a scatter plot with my_dataframe where "Age" is on
the x-axis and "Name" is on the y-axis.
1.18 Random Numbers
Generate a vector of 5 random numbers between 1 and 10.
1.19 Strings and Manipulation
Declare a character variable my_string with the value
"Hello, World!" and print its length.
1.20 Basic Error Handling
Write a script that attempts to divide a number by zero and
handle the resulting error with an informative message.
INTERMEDIATE EXERCISES
# 1.2.2 Subtraction
result_subtraction <- 15 - 10
print(result_subtraction)
# 1.2.3 Multiplication
result_multiplication <- 3 * 4
print(result_multiplication)
# 1.2.4 Division
result_division <- 20 / 5
print(result_division)
Reference Output
#1-1
[1] "Hello, R!"
#1-2-1
[1] 12
#1-2-2
[1] 5
#1-2-3
[1] 12
#1-2-4
[1] 4
#1-3
[1] 42
#1-4
[1] 1 2 3 4 5
#1-5
[1] 5 7 9 11 13
#1-6
[1] TRUE
#1-7
> numeric_var <- 42
> character_var <- "Hello, R!"
> logical_var <- TRUE
> print(numeric_var)
[1] 42
> print(character_var)
[1] "Hello, R!"
> print(logical_var)
[1] TRUE
#1-8
Name Age
1 Person1 25
2 Person2 30
3 Person3 35
#1-9
[1] "C"
#1-10
Name Age
2 Person2 30
3 Person3 35
#1-11
[1] "Odd"
#1-12
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
#1-13
[1] 1
[1] 4
[1] 9
[1] 16
#1-14
[1] 9
#1-15
Name Age
1 Alice 28
2 Bob 30
#1-16
First, create a csv file for example:
Student,Math,English,Science
Alice,78,92,87
Bob,95,76,60
Charlie,82,88,95
#1-18
[1] 3 9 2 8 4
#1-19
[1] 13
#1-20
> numerator <- 10
> denominator <- 0
> result <- tryCatch({
+ numerator / denominator
+ }, error = function(e) {
+ print(paste("Error:", e))
+ })
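Note that R does not signal an error when a number is divided by zero: 10 / 0 evaluates to Inf, so the error handler in the transcript above is never called. A minimal sketch of one way to get an informative message, assuming a hypothetical helper named safe_divide (not part of the original exercise) that raises the error explicitly:

```r
# Hypothetical helper (an assumption for illustration): signal an error
# explicitly, because R itself returns Inf for 10 / 0.
safe_divide <- function(numerator, denominator) {
  if (denominator == 0) {
    stop("Division by zero is not allowed.")
  }
  numerator / denominator
}

result <- tryCatch(
  safe_divide(10, 0),
  error = function(e) {
    # conditionMessage() extracts only the message text from the condition
    message("Error: ", conditionMessage(e))
    NA
  }
)
print(result)
```

Returning NA from the handler is a design choice for this sketch; it lets the script continue with a clearly missing value.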
INTERMEDIATE SOLUTIONS
# 2.1 Functions with Default Arguments
power <- function(base, exponent = 2) {
return(base^exponent)
}
print(power(3)) # Should print 9
# 2.2 Data Frame Manipulation
grades <- data.frame(
Name = c("Alice", "Bob", "Charlie"),
Math = c(60, 45, 92),
English = c(88, 20, 95)
)
# Data frame manipulation
grades$Total <- grades$Math + grades$English
print(grades)
# 2.3 Conditional Data Frame Manipulation (include the code from 2.2)
filtered_grades <- grades[grades$Total > 150, ]
print(filtered_grades)
# 2.4 List Manipulation
my_list <- list(
numeric_vector = c(1, 2, 3),
character_vector = c("a", "b", "c"),
logical_vector = c(TRUE, FALSE, TRUE)
)
print(my_list)
# 2.5 List Operations
my_list$factor_vector <- factor(c("low", "medium", "high"))
print(my_list)
# 2.6 Apply Functions (include the code from 2.2)
column_means <- apply(grades[, c("Math", "English",
"Total")], 2, mean)
print(column_means)
# 2.7 Matrix Operations
matrix_data <- matrix(rnorm(9), nrow = 3)
row_sums <- apply(matrix_data, 1, sum)
col_sums <- apply(matrix_data, 2, sum)
print(row_sums)
print(col_sums)
# 2.8 Subsetting with Conditions (include the code from 2.2)
subset_grades <- grades[grades$Math >
mean(grades$Math) & grades$English >
mean(grades$English), ]
print(subset_grades)
# 2.9 File Writing
write.csv(grades, "student_grades.csv")
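By default write.csv() also writes the row names as an extra unnamed first column. The sketch below, which assumes a temporary file path instead of the working directory, shows how row.names = FALSE omits that column and how the file can be read back to check the result:

```r
grades <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Math = c(60, 45, 92),
  English = c(88, 20, 95)
)
grades$Total <- grades$Math + grades$English

# Temporary path (an assumption, to avoid touching the working directory)
csv_path <- tempfile(fileext = ".csv")

# row.names = FALSE leaves out the row-number column
write.csv(grades, csv_path, row.names = FALSE)

# Read the file back to confirm the columns survived the round trip
grades_read <- read.csv(csv_path)
print(grades_read)
```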
# 2.10 Error Handling in Functions
power <- function(base, exponent = 2) {
if (base < 0) {
stop("Base must be non-negative.")
}
return(base^exponent)
}
print(power(3)) # Should print 9
# Uncomment the line below to test the error handling
# print(power(-2))
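Instead of uncommenting the failing call, the error can be caught so the script keeps running. A minimal sketch, in which the variable name outcome is only illustrative:

```r
power <- function(base, exponent = 2) {
  if (base < 0) {
    stop("Base must be non-negative.")
  }
  base^exponent
}

# tryCatch() runs the call and hands any error to the handler,
# which here returns just the error message as a string
outcome <- tryCatch(
  power(-2),
  error = function(e) conditionMessage(e)
)
print(outcome)
```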
# 2.11 Loading External Data
data(iris)
print(head(iris, 5))
# 2.12 Data Frame Joining
# Creating data frames
students <- data.frame(StudentID = c(1, 2, 3), Name =
c("Alice", "Bob", "Charlie"))
courses <- data.frame(CourseID = c(1, 2, 3), Course =
c("Math", "English", "Science"))
# Merging data frames
joined_data <- merge(students, courses, by.x =
"StudentID", by.y = "CourseID")
# Printing the result
print(joined_data)
# 2.13 Reshaping Data
# Sample data creation
grades <- data.frame(
Name = c("Alice", "Bob", "Charlie"),
Math = c(60, 45, 92),
English = c(88, 20, 95),
Total = c(148, 65, 187)
)
# Install and Load reshape2 package
install.packages("reshape2")
library(reshape2)
# Melt the data frame
melted_grades <- melt(grades, id.vars = "Name",
measure.vars = c("Math", "English", "Total"))
print(melted_grades)
# 2.14 Time Series Creation
start_date <- as.Date("2023-01-01")
end_date <- as.Date("2023-01-10")
time_series <- seq(start_date, end_date, by = "day")
print(time_series)
# 2.15 Plotting with ggplot2
install.packages("ggplot2")
library(ggplot2)
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color =
Species)) +
geom_point()
# 2.16 String Manipulation with Regular Expressions
email_addresses <- c("user1@gmail.com",
"user2@yahoo.com", "user3@hotmail.com")
domain_names <- gsub(".*@(.*)", "\\1", email_addresses)
print(domain_names)
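The pattern ".*@(.*)" matches everything up to the "@", captures the remainder in a group, and "\\1" replaces the whole match with that group. A shorter variant, offered only as a sketch, deletes the part before the domain instead of capturing the domain (the name domain_names_alt is illustrative):

```r
email_addresses <- c("user1@gmail.com", "user2@yahoo.com", "user3@hotmail.com")

# Remove everything up to and including the "@"
domain_names_alt <- sub("^.*@", "", email_addresses)
print(domain_names_alt)
```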
# 2.17 Creating a Shiny App
install.packages("shiny")
library(shiny)
ui <- fluidPage(
numericInput("input_num", "Enter a number:", value = 5),
plotOutput("output_plot")
)
server <- function(input, output) {
output$output_plot <- renderPlot({
plot(input$input_num^2)
})
}
shinyApp(ui, server)
# 2.18 Building a Function Pipeline
install.packages("magrittr")
library(magrittr)
numeric_vector <- c(1, 2, 3, 4, 5)
result <- numeric_vector %>%
mean() %>%
`^`(2) %>%
sum()
print(result)
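The pipeline reads left to right, with each step receiving the previous result as its first argument. Written as ordinary nested calls, the same computation might look like the sketch below (result_nested is an illustrative name). Because mean() collapses the vector to a single value, the final sum() has only one number to add.

```r
numeric_vector <- c(1, 2, 3, 4, 5)

# Innermost call runs first: mean, then square, then sum
result_nested <- sum(mean(numeric_vector)^2)
print(result_nested)
```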
# 2.19 Hierarchical Clustering
iris_subset <- iris[, 1:4]
hc_result <- hclust(dist(iris_subset))
plot(hc_result)
# 2.20 Parallel Processing
library(parallel)
grades <- data.frame(
Name = c("Alice", "Bob", "Charlie"),
Math = c(60, 45, 92),
English = c(88, 20, 95),
Total = c(148, 65, 187)
)
grades_parallel <- mclapply(grades, mean)
print(grades_parallel)
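Applying mean() to every column includes the character column Name, which yields NA with a warning. A possible variant, sketched under the assumption that only the numeric columns are of interest, filters the columns first. It also sets mc.cores explicitly, since mclapply() relies on forking, which is not available on Windows:

```r
library(parallel)

grades <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Math = c(60, 45, 92),
  English = c(88, 20, 95),
  Total = c(148, 65, 187)
)

# Keep only numeric columns so mean() is not applied to Name
numeric_cols <- grades[sapply(grades, is.numeric)]

# Fall back to a single core on Windows, where forking is unsupported
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
numeric_means <- mclapply(numeric_cols, mean, mc.cores = n_cores)
print(numeric_means)
```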
Reference Output
#2-1
[1] 9
#2-2
Name Math English Total
1 Alice 60 88 148
2 Bob 45 20 65
3 Charlie 92 95 187
#2-3
Name Math English Total
3 Charlie 92 95 187
#2-4
$numeric_vector
[1] 1 2 3
$character_vector
[1] "a" "b" "c"
$logical_vector
[1] TRUE FALSE TRUE
#2-5
[1] low medium high
Levels: high low medium
#2-6
Math English Total
65.66667 67.66667 133.33333
#2-7
> print(row_sums)
[1] 0.4316654 -1.2885164 -0.1953087
> print(col_sums)
[1] -2.0720193 0.6168985 0.4029610
#2-8
Name Math English Total
3 Charlie 92 95 187
#2-9
This code will create a CSV file named "student_grades.csv" in your working
directory with the specified data.
#2-10
Error in power(-2) : Base must be non-negative.
#2-11
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
#2-12
StudentID Name Course
1 1 Alice Math
2 2 Bob English
3 3 Charlie Science
#2-13
Name variable value
1 Alice Math 60
2 Bob Math 45
3 Charlie Math 92
4 Alice English 88
5 Bob English 20
6 Charlie English 95
7 Alice Total 148
8 Bob Total 65
9 Charlie Total 187
#2-14
[1] "2023-01-01" "2023-01-02" "2023-01-03" "2023-01-04" "2023-01-05"
[6] "2023-01-06" "2023-01-07" "2023-01-08" "2023-01-09" "2023-01-10"
#2-15
#2-16
[1] "gmail.com" "yahoo.com" "hotmail.com"
#2-17
#2-18
[1] 9
#2-19
#2-20
$Name
[1] NA
$Math
[1] 65.66667
$English
[1] 67.66667
$Total
[1] 133.3333
ADVANCED SOLUTIONS
# 3.1 Advanced Function
# Function to generate the Fibonacci sequence up to term n
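fibonacci <- function(n) {
  fib_seq <- numeric(n)
  # Fill the first one or two terms; avoids overrunning the vector when n < 2
  fib_seq[seq_len(min(n, 2))] <- 1
  # 3:n counts downwards when n < 3, so only loop for n >= 3
  if (n >= 3) {
    for (i in 3:n) {
      fib_seq[i] <- fib_seq[i - 1] + fib_seq[i - 2]
    }
  }
  return(fib_seq)
}
print(fibonacci(10)) # Should print the first 10 Fibonacci numbers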
# 3.2 Advanced Data Frame Manipulation
library(dplyr)
# Create data frame with consistent number of rows for each product
sales <- data.frame(
  Date = rep(
    seq(as.Date("2023-01-01"), as.Date("2023-01-31"), by = "days"),
    each = 2
  ),
  Product = rep(c("A", "B"), each = 31),
  Revenue = rnorm(62, mean = 50, sd = 10)
)
# Calculate Rolling_Avg
sales <- sales %>%
group_by(Product) %>%
arrange(Date) %>% # Ensure proper ordering
mutate(Rolling_Avg = zoo::rollmean(Revenue, k = 7, fill =
NA, align = "right"))
# Print the result
print(sales)
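With k = 7, align = "right" and fill = NA, each Rolling_Avg value is the mean of the current row and the six rows before it within the group, and the first six rows of each group are NA. The sketch below illustrates the same trailing window on a small made-up vector using base R's stats::filter(). It is not the zoo implementation, only a way to see what the window computes.

```r
# Made-up revenue values, chosen so the means are easy to check by hand
revenue <- c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
window <- 7

# sides = 1 uses only the current and earlier values (a trailing window);
# stats:: is spelled out because dplyr also exports a filter() function
rolling_avg <- as.numeric(
  stats::filter(revenue, rep(1 / window, window), sides = 1)
)
print(rolling_avg)
```

The first six entries are NA because a full seven-value window is not yet available; the seventh entry is the mean of 10 through 70.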
# 3.3 Advanced List Manipulation
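# simplify = FALSE keeps each level as a list instead of collapsing it into an array
nested_list <- replicate(3, replicate(3, rnorm(3), simplify = FALSE), simplify = FALSE)
print(nested_list[[2]][[1]][1]) # Accessing a specific element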
# 3.4 Advanced Apply Functions
install.packages("purrr")
install.packages("e1071")
library(purrr)
custom_function <- function(x) {
return(list(mean = mean(x), sd = sd(x), skewness =
e1071::skewness(x)))
}
numeric_vector <- rnorm(100)
result_list <- map(numeric_vector, custom_function)
print(result_list)
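Because numeric_vector is a plain vector of 100 numbers, map() calls custom_function once per element. Each call therefore sees a single number, and the standard deviation of a single value is NA. A sketch of applying a summary to whole vectors instead, assuming the vectors are stored in a list; it uses base R's lapply() and leaves out skewness to avoid the package dependency, and the names summarise_vector and samples are hypothetical:

```r
# Hypothetical summary function: mean and standard deviation only
summarise_vector <- function(x) {
  list(mean = mean(x), sd = sd(x))
}

set.seed(1) # arbitrary seed, used only to make the sketch repeatable
samples <- list(a = rnorm(100), b = rnorm(100), c = rnorm(100))

# One call per list element, so each call receives a full vector
summary_list <- lapply(samples, summarise_vector)
print(summary_list$a)

# For comparison: the standard deviation of a single number is NA
print(sd(0.5))
```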
# 3.5 Advanced Matrix Operations
matrix_multiply <- function(mat1, mat2) {
if (ncol(mat1) != nrow(mat2)) {
stop("Number of columns in mat1 must equal the number of rows in mat2.")
}
result_mat <- matrix(0, nrow = nrow(mat1), ncol =
ncol(mat2))
for (i in 1:nrow(mat1)) {
for (j in 1:ncol(mat2)) {
result_mat[i, j] <- sum(mat1[i, ] * mat2[, j])
}
}
return(result_mat)
}
mat1 <- matrix(1:6, nrow = 2)
mat2 <- matrix(7:12, ncol = 2)
print(matrix_multiply(mat1, mat2))
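The hand-written loop can be checked against R's built-in matrix product operator %*%. A short sketch using the same two matrices (builtin_product is an illustrative name):

```r
mat1 <- matrix(1:6, nrow = 2)  # 2 x 3, filled column by column
mat2 <- matrix(7:12, ncol = 2) # 3 x 2, filled column by column

# %*% is R's built-in matrix multiplication operator
builtin_product <- mat1 %*% mat2
print(builtin_product)
```

For example, the top-left entry is 1 * 7 + 3 * 8 + 5 * 9 = 76.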
# 3.6 Advanced Subsetting with Conditions (using the "zoo" package)
install.packages("zoo")
library(dplyr)
library(zoo)
# Create data frame with consistent number of rows for each product
sales <- data.frame(
  Date = rep(
    seq(as.Date("2023-01-01"), as.Date("2023-01-31"), by = "days"),
    each = 2
  ),
  Product = rep(c("A", "B"), each = 31),
  Revenue = rnorm(62, mean = 50, sd = 10)
)
# Calculate Rolling_Avg
sales <- sales %>%
group_by(Product) %>%
arrange(Date) %>% # Ensure proper ordering
mutate(Rolling_Avg = zoo::rollmean(Revenue, k = 7, fill =
NA, align = "right"))
# Create a new data frame for filtering
sales_filtered <- sales %>%
filter(Revenue <= 0.8 * Rolling_Avg)
# Print the result
print(sales_filtered)
# 3.7 Advanced File Writing
library(jsonlite)
json_data <- toJSON(sales, pretty = TRUE)
writeLines(json_data, "sales_data.json")
# 3.8 Advanced Error Handling in Functions
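fibonacci <- function(n) {
  if (!is.numeric(n) || n < 1) {
    stop("Input must be a positive numeric value.")
  }
  fib_seq <- numeric(n)
  # Fill the first one or two terms; avoids overrunning the vector when n is 1
  fib_seq[seq_len(min(n, 2))] <- 1
  # 3:n counts downwards when n < 3, so only loop for n >= 3
  if (n >= 3) {
    for (i in 3:n) {
      fib_seq[i] <- fib_seq[i - 1] + fib_seq[i - 2]
    }
  }
  return(fib_seq)
}
print(fibonacci(-5)) # Should trigger an error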
# 3.9 Advanced Machine Learning
library(e1071)
iris_svm <- svm(Species ~ ., data = iris)
print(summary(iris_svm))
# 3.10 Advanced Parallel Processing
install.packages("future")
install.packages("furrr")
install.packages("randomForest")
library(randomForest)
library(furrr)
# Set up parallel processing with five background R sessions
future::plan("multisession", workers = 5)
# Create random forests with different seeds in parallel
iris_rf <- furrr::future_map(1:5, ~ {
  set.seed(.x)
  randomForest(Species ~ ., data = iris)
}, .options = furrr::furrr_options(seed = TRUE)) # declare that the workers use random numbers
# Print the random forests
print(iris_rf)
Reference Output
#3-1
[1] 1 1 2 3 5 8 13 21 34 55
#3-2
Date Product Revenue
1 2023-01-01 A 53.35887
2 2023-01-01 A 63.90985
3 2023-01-02 A 50.64605
4 2023-01-02 A 38.39446
5 2023-01-03 A 51.66800
6 2023-01-03 A 50.33330
7 2023-01-04 A 58.92325
8 2023-01-04 A 44.40268
9 2023-01-05 A 62.85888
10 2023-01-05 A 66.73350
11 2023-01-06 A 57.52942
…
#3-3
[1] -0.2010459
#3-4
[[1]]
[[1]]$mean
[1] 0.699673
[[1]]$sd
[1] NA
[[1]]$skewness
[1] NaN
[[2]]
[[2]]$mean
[1] 0.5873104
[[2]]$sd
[1] NA
[[2]]$skewness
[1] NaN
[[3]]
[[3]]$mean
[1] -0.2093998
[[3]]$sd
[1] NA
[[3]]$skewness
[1] NaN
…
#3-5
[,1] [,2]
[1,] 76 103
[2,] 100 136
#3-6
# A tibble: 2 × 4
# Groups: Product [2]
Date Product Revenue Rolling_Avg
<date> <chr> <dbl> <dbl>
1 2023-01-13 A 40.1 52.5
2 2023-01-23 B 27.4 48.5
#3-7
JSON file is created:
[
{
"Date": "2023-01-01",
"Product": "A",
"Revenue": 58.7994
},
{
"Date": "2023-01-01",
"Product": "A",
"Revenue": 49.3781
},
{
"Date": "2023-01-02",
"Product": "A",
"Revenue": 41.3328
},
{
"Date": "2023-01-02",
"Product": "A",
"Revenue": 47.1283
},
{
"Date": "2023-01-03",
"Product": "A",
"Revenue": 60.7428
},
{
"Date": "2023-01-03",
"Product": "A",
"Revenue": 38.9607
},
…
#3-8
Error in fibonacci(-5) : Input must be a positive numeric value.
#3-9
Call:
svm(formula = Species ~ ., data = iris)
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 1
( 8 22 21 )
Number of Classes: 3
Levels:
setosa versicolor virginica
#3-10
[[1]]
Call:
Confusion matrix:
setosa 50 0 0 0.00
versicolor 0 47 3 0.06
virginica 0 4 46 0.08
[[2]]
Call:
Confusion matrix:
setosa 50 0 0 0.00
versicolor 0 47 3 0.06
virginica 0 3 47 0.06
[[3]]
Call:
Confusion matrix:
setosa 50 0 0 0.00
versicolor 0 47 3 0.06
virginica 0 4 46 0.08
[[4]]
Call:
Confusion matrix:
setosa 50 0 0 0.00
versicolor 0 47 3 0.06
virginica 0 3 47 0.06
[[5]]
Call:
Confusion matrix:
setosa 50 0 0 0.00
versicolor 0 47 3 0.06
virginica 0 4 46 0.08