Cac Assignment 2340874: R Markdown
2340874
2024-04-17
R Markdown
INTRODUCTION: This assignment demonstrates R programming for data manipulation and
analysis using packages such as dplyr and tidyr. It covers reading data from CSV files with the
`readr` package and printing, summarizing, and manipulating data frames. Basic data
visualizations are created with both base R (`plot`) and ggplot2 to support data exploration.
Conditional statements (`if-else`) and loops (`for`) are used for tasks such as checking
conditions and classifying data. The control structures `break` and `next` are used to control
loop execution flow based on certain conditions. Lastly, the `dplyr` and `tidyr` packages are
used for data manipulation tasks such as filtering rows, creating new columns, and reshaping data.
print(content)
## # A tibble: 176 × 17
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 166 more rows
## # ℹ 11 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending <dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <dbl>,
## # Financial_Freedom <dbl>
## # A tibble: 6 × 17
##   Serial Country Region  Year Overall_Score Property_Rights Government_Integrity
##    <dbl> <chr>   <chr>  <dbl>         <dbl>           <dbl>                <dbl>
## 1      0 Singap… Asia-…  2024          83.5            94.2                 88.3
## 2      1 Switze… Europe  2024          83              94.2                 91.3
## 3      2 Ireland Europe  2024          82.6            93.5                 83.4
## 4      3 Taiwan  Asia-…  2024          80              82.2                 73.4
## 5      4 Luxemb… Europe  2024          79.2            96.9                 84.9
## 6      5 Denmark Europe  2024          77.8            98.6                 97.4
## # ℹ 10 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom <dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>
## # A tibble: 6 × 17
##   Serial Country Region  Year Overall_Score Property_Rights Government_Integrity
##    <dbl> <chr>   <chr>  <dbl>         <dbl>           <dbl>                <dbl>
## 1    170 Burundi Sub-S…  2024          38.4            28.2                 12
## 2    171 Zimbab… Sub-S…  2024          38.2            20.2                 19.8
## 3    172 Sudan   Sub-S…  2024          33.9            12.5                 19.5
## 4    173 Venezu… Ameri…  2024          28.1             0                    6.4
## 5    174 Cuba    Ameri…  2024          25.7            30.1                 36.2
## 6    175 North … Asia-…  2024           2.9            16                    3.6
## # ℹ 10 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom <dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>
4. head(content): Displays the first few rows of the data frame. By default,
it shows the first 6 rows.
5. tail(content): Displays the last few rows of the data frame. By default,
it shows the last 6 rows.
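As a minimal sketch of how `head()` and `tail()` behave, the block below uses a small hypothetical data frame named `scores`. Its values are copied from the first eight rows printed above, but the object itself is an assumption made for illustration and is not the `content` tibble. Both functions accept an `n` argument to change how many rows are returned.

```r
# Hypothetical toy data frame; values taken from the first eight printed rows.
scores <- data.frame(
  Country = c("Singapore", "Switzerland", "Ireland", "Taiwan",
              "Luxembourg", "Denmark", "Estonia", "New Zealand"),
  Overall_Score = c(83.5, 83, 82.6, 80, 79.2, 77.8, 77.8, 77.8),
  stringsAsFactors = FALSE
)

first_six <- head(scores)                  # default: first 6 rows
first_two <- head(scores, n = 2)           # first 2 rows only
last_three <- tail(scores, n = 3)          # last 3 rows only
all_but_last_two <- head(scores, n = -2)   # negative n drops rows from the end

print(first_two)
print(last_three)
```

A negative `n` is convenient when the rows to exclude are easier to count than the rows to keep.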
library(ggplot2)
plot(content$Serial, content$`Overall_Score`)
plot(content$`Government_Spending`, content$`Overall_Score`,
     xlab = "Government Spending",
     ylab = "Overall Score",
     main = "Government Spending vs Overall Score")
mean(content$`Overall_Score`)
## [1] 58.64318
median(content$`Overall_Score`)
## [1] 58.8
mode(content$`Overall_Score`) # returns the storage mode of the vector, not the statistical mode
## [1] "numeric"
1. plot(): This function is a generic plotting function in R that can create various types of
plots, such as scatter plots, line plots, histograms, etc. It is often used to visualize
relationships between two variables.
Example use: `mean(data$variable)` calculates the mean of the variable in the data.
4. median(): Computes the median of a numeric vector, which is the middle value when the
values are sorted in ascending order.
Example use: `median(data$variable)` calculates the median of the variable in the data.
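5. mode(): Base R's `mode()` returns the storage mode of an object (here "numeric"), not the
statistical mode. There is no built-in function in base R to compute the statistical mode
directly. However, you can define a custom function to compute it if needed.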
These functions are essential for both basic data visualization and statistical analysis in R.
They help in understanding the distribution and characteristics of the data, as well as in
creating visual representations of the data.
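The custom mode function mentioned above could look like the following sketch. The name `stat_mode` is hypothetical, and the vector `top_scores` is an assumption that holds only the first ten Overall Score values printed earlier, not the full column.

```r
# Hypothetical helper: returns the most frequent value(s) of a vector.
# If several values tie for the highest count, all of them are returned.
stat_mode <- function(x) {
  x <- x[!is.na(x)]                       # ignore missing values
  values <- unique(x)                     # distinct values, in order of appearance
  counts <- tabulate(match(x, values))    # how often each distinct value occurs
  values[counts == max(counts)]
}

# First ten Overall Score values from the printed tibble.
top_scores <- c(83.5, 83, 82.6, 80, 79.2, 77.8, 77.8, 77.8, 77.5, 77.5)

mean(top_scores)        # arithmetic mean
median(top_scores)      # middle value of the sorted vector
mode(top_scores)        # "numeric": the storage mode, not the statistical mode
stat_mode(top_scores)   # 77.8: the value that occurs most often
```

Returning every tied value, rather than only the first, avoids silently hiding a multimodal distribution.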
2. for loop: Used to iterate over a sequence (like a vector, list, or data frame) and perform
an operation for each element in the sequence.
3. Indexing: Used to access or manipulate elements in a data structure like a vector or data
frame. In the provided code, indexing is used to access and modify elements of the `content`
data frame based on the condition defined in the for loop.
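The combination of a `for` loop, `if-else`, and indexing described above can be sketched as follows. The data frame `scores`, the cutoff of 60, and the labels "High" and "Low" are all assumptions chosen for illustration; the original classification rule is not shown in this document.

```r
# Hypothetical toy data; scores copied from rows printed earlier.
scores <- data.frame(
  Country = c("Singapore", "Taiwan", "Burundi", "Cuba"),
  Overall_Score = c(83.5, 80, 38.4, 25.7),
  stringsAsFactors = FALSE
)

# Classify each row with a for loop, if-else, and indexing.
scores$Category <- NA_character_
for (i in seq_len(nrow(scores))) {
  if (scores$Overall_Score[i] >= 60) {   # assumed cutoff
    scores$Category[i] <- "High"
  } else {
    scores$Category[i] <- "Low"
  }
}

# Vectorised equivalent that avoids the explicit loop.
scores$Category_vec <- ifelse(scores$Overall_Score >= 60, "High", "Low")

print(scores)
```

The loop makes each step explicit, while `ifelse()` is the more idiomatic choice for a simple two-way classification.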
## [1] 83.5
## [1] 83
## [1] 82.6
## [1] 80
## [1] 79.2
## [1] 77.8
## [1] 77.8
## [1] 77.8
## [1] 77.5
## [1] 77.5
## [1] 77.3
## [1] 76.3
## [1] 76.2
## [1] 73.1
## [1] 72.9
## [1] 72.4
## [1] 72.2
## [1] 72.1
## [1] 71.5
## [1] 71.5
## [1] 71.4
## [1] 71.1
## [1] 70.5
## [1] 70.2
## [1] 70.1
## [1] 70.1
## [1] 83.5
## [1] 83
## [1] 82.6
## [1] 80
## [1] 79.2
## [1] 77.8
## [1] 77.8
## [1] 77.8
## [1] 77.5
## [1] 77.5
## [1] 77.3
## [1] 76.3
## [1] 76.2
## [1] 73.1
## [1] 72.9
## [1] 72.4
## [1] 72.2
## [1] 72.1
## [1] 71.5
## [1] 71.5
## [1] 71.4
## [1] 71.1
## [1] 70.5
## [1] 70.2
## [1] 69.8
## [1] 68.8
## [1] 68.7
## [1] 68.6
## [1] 68.5
## [1] 68.4
## [1] 68.4
## [1] 68.1
## [1] 68.1
## [1] 68
## [1] 67.7
## [1] 67.5
## [1] 67.2
## [1] 67.2
## [1] 66.8
## [1] 66
## [1] 65.9
## [1] 65.9
## [1] 65.7
## [1] 65.6
## [1] 64.9
## [1] 64.8
## [1] 64.8
## [1] 64.5
## [1] 64.4
## [1] 64.1
## [1] 63.5
## [1] 63.4
## [1] 63.3
## [1] 62.9
## [1] 62.9
## [1] 62.9
## [1] 62.8
## [1] 62.7
## [1] 62.5
## [1] 62.5
## [1] 62.4
## [1] 62.2
## [1] 62.2
## [1] 62
## [1] 62
## [1] 62
## [1] 61.9
## [1] 61.6
## [1] 61.4
## [1] 61.2
## [1] 61.2
## [1] 61
## [1] 60.6
## [1] 60.6
## [1] 60.5
## [1] 60.4
## [1] 60.4
## [1] 60.1
## [1] 60.1
## [1] 59.8
## [1] 59.7
## [1] 59.2
## [1] 59.2
## [1] 59.1
## [1] 59
## [1] 59
## [1] 58.6
## [1] 58.5
## [1] 58.4
## [1] 58.3
## [1] 58.2
## [1] 58
## [1] 57.7
## [1] 57.5
## [1] 57.3
## [1] 57.3
## [1] 57.1
## [1] 56.9
## [1] 56.8
## [1] 56.2
## [1] 55.9
## [1] 55.8
## [1] 55.8
## [1] 55.6
## [1] 55.6
## [1] 55.4
## [1] 55.4
## [1] 55.3
## [1] 55.3
## [1] 55.2
## [1] 55.1
## [1] 55
## [1] 55
## [1] 54.4
## [1] 54.4
## [1] 54.3
## [1] 54
## [1] 53.6
## [1] 53.6
## [1] 53.4
## [1] 53.3
## [1] 53.2
## [1] 53.1
## [1] 52.9
## [1] 52.5
## [1] 52.3
## [1] 52.1
## [1] 52.1
## [1] 52
## [1] 52
## [1] 51.9
## [1] 51.9
## [1] 51.6
## [1] 51.4
## [1] 51.3
## [1] 51.3
## [1] 50.9
## [1] 50.7
## [1] 50.7
## [1] 50.6
## [1] 50.2
## [1] 49.9
## [1] 49.9
## [1] 49.7
## [1] 49.5
## [1] 49.4
## [1] 49.2
## [1] 48.8
## [1] 48.5
## [1] 48.4
## [1] 48.4
## [1] 48.3
## [1] 48.2
## [1] 47.9
## [1] 47.8
## [1] 47.8
## [1] 47.7
## [1] 47.6
## [1] 46.7
## [1] 46.3
## [1] 44.6
## [1] 43.9
## [1] 43.5
## [1] 42.7
## [1] 42.2
## [1] 41.3
## [1] 41.2
## [1] 39.5
## [1] 38.4
## [1] 38.2
## [1] 33.9
## [1] 28.1
## [1] 25.7
## [1] 2.9
2. next: Skips the remaining code within a loop for the current iteration and moves to the
next iteration. In Example 2, Overall Score values are printed except for those equal to 70.1,
where the loop skips printing and moves to the next iteration.
3. return: Terminates the execution of a function and returns a value to the calling
environment. In Example 3, a function named `calculate_mean_judicial_eff` is defined to
calculate the mean Judicial Effectiveness for a given group (for example, a region). If the
group is not found in the dataset, a message is printed, and NULL is returned. Otherwise, the
mean Judicial Effectiveness for the group is calculated and returned.
These control structures provide mechanisms for controlling the flow of execution in the
code, while the function definition and usage demonstrate how to create reusable pieces of
code for specific tasks.
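A minimal sketch of `break`, `next`, and `return` follows. The stopping rule (a score below 70), the function name `mean_judicial_eff_sketch`, the grouping by `Region`, and the Judicial Effectiveness values are all assumptions made for illustration; they are not taken from the original examples or dataset.

```r
# A few consecutive Overall Score values from the printed output.
scores <- c(70.5, 70.2, 70.1, 70.1, 69.8, 68.8)

# break: leave the loop at the first score below 70 (assumed rule).
kept_until_break <- c()
for (s in scores) {
  if (s < 70) break
  kept_until_break <- c(kept_until_break, s)
}

# next: skip scores equal to 70.1 and continue with the following value.
kept_after_next <- c()
for (s in scores) {
  if (s == 70.1) next
  kept_after_next <- c(kept_after_next, s)
}

# return: hand a value back to the caller, with an early exit for unknown groups.
mean_judicial_eff_sketch <- function(data, region) {
  rows <- data[data$Region == region, ]
  if (nrow(rows) == 0) {
    message("Region not found: ", region)
    return(NULL)
  }
  return(mean(rows$Judicial_Effectiveness))
}

# Hypothetical values, not taken from the real dataset.
toy <- data.frame(
  Region = c("Europe", "Europe", "Asia-Pacific"),
  Judicial_Effectiveness = c(90, 80, 70),
  stringsAsFactors = FALSE
)

print(kept_until_break)
print(kept_after_next)
print(mean_judicial_eff_sketch(toy, "Europe"))
```

Returning `NULL` for an unknown group lets the caller test the result with `is.null()` instead of handling an error.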
##
## Attaching package: 'dplyr'
## # A tibble: 81 × 17
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 71 more rows
## # ℹ 11 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending <dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <dbl>,
## # Financial_Freedom <dbl>
library(dplyr)
mutated_row <- mutate(content, overall_score = Overall_Score / 10)
mutated_row
## # A tibble: 176 × 18
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 166 more rows
## # ℹ 12 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending <dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <dbl>,
## # Financial_Freedom <dbl>, overall_score <dbl>
library(dplyr)
mutate(content, Investment_Freedom = "60") # "60" is a string, so the column becomes <chr>; use 60 to keep it numeric
## # A tibble: 176 × 17
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 166 more rows
## # ℹ 11 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending <dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <chr>,
## # Financial_Freedom <dbl>
## # A tibble: 352 × 17
## Serial Year Overall_Score Property_Rights Government_Integrity
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 2024 83.5 94.2 88.3
## 2 1 2024 83 94.2 91.3
## 3 2 2024 82.6 93.5 83.4
## 4 3 2024 80 82.2 73.4
## 5 4 2024 79.2 96.9 84.9
## 6 5 2024 77.8 98.6 97.4
## 7 6 2024 77.8 92.8 81.2
## 8 7 2024 77.8 87.4 95.9
## 9 8 2024 77.5 98.8 95.6
## 10 9 2024 77.5 96.2 93.2
## # ℹ 342 more rows
## # ℹ 12 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom <dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>, `Country&Region` <chr>,
## # Country <chr>
library(tidyr)
spread_data <- spread(gathered, key = "Country&Region", value = "Country")
spread_data
## # A tibble: 176 × 17
## Serial Year Overall_Score Property_Rights Government_Integrity
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 2024 83.5 94.2 88.3
## 2 1 2024 83 94.2 91.3
## 3 2 2024 82.6 93.5 83.4
## 4 3 2024 80 82.2 73.4
## 5 4 2024 79.2 96.9 84.9
## 6 5 2024 77.8 98.6 97.4
## 7 6 2024 77.8 92.8 81.2
## 8 7 2024 77.8 87.4 95.9
## 9 8 2024 77.5 98.8 95.6
## 10 9 2024 77.5 96.2 93.2
## # ℹ 166 more rows
## # ℹ 12 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom <dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>, Country <chr>,
## # Region <chr>
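The gather-then-spread round trip shown in the output above can be reproduced on a small table. This is a sketch under assumptions: the data frame `toy` and the column names `field` and `label` are hypothetical, and the call that created `gathered` is not shown in this document, so the `gather()` call below is a guess at its shape rather than the original code.

```r
library(tidyr)

# Hypothetical two-row table; values copied from the printed tibble.
toy <- data.frame(
  Serial = c(0, 1),
  Country = c("Singapore", "Switzerland"),
  Region = c("Asia-Pacific", "Europe"),
  Overall_Score = c(83.5, 83),
  stringsAsFactors = FALSE
)

# gather(): stack the Country and Region columns into key/value pairs.
# Each original row becomes two rows, so the row count doubles.
long <- gather(toy, key = "field", value = "label", Country, Region)

# spread(): turn the key/value pairs back into separate columns.
wide <- spread(long, key = "field", value = "label")

print(long)
print(wide)
```

The doubling of rows after `gather()` matches the jump from 176 to 352 rows seen above, and `spread()` restores the original row count. Current tidyr documentation recommends `pivot_longer()` and `pivot_wider()` for new code, although `gather()` and `spread()` remain available.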
1. filter(): This function is from the dplyr package and is used to filter
rows from a dataframe based on specified conditions.
2. mutate(): Also from the dplyr package, this function is used to create new
columns or modify existing columns in a dataframe.
These functions are commonly used for data manipulation and restructuring,
making them essential tools in the data analysis workflow in R.
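As a minimal sketch of `filter()` and `mutate()`, the block below works on a hypothetical four-row data frame named `toy` whose values are copied from the printed tibble. The filter condition and the new column name `score_out_of_10` are assumptions chosen for illustration, not the conditions used in the original chunks.

```r
library(dplyr)

# Hypothetical toy data; values copied from the printed tibble.
toy <- data.frame(
  Country = c("Singapore", "Switzerland", "Ireland", "Taiwan"),
  Region = c("Asia-Pacific", "Europe", "Europe", "Asia-Pacific"),
  Overall_Score = c(83.5, 83, 82.6, 80),
  stringsAsFactors = FALSE
)

# filter(): keep only the rows that satisfy a condition.
europe <- filter(toy, Region == "Europe")

# mutate(): add a new column computed from an existing one.
scaled <- mutate(toy, score_out_of_10 = Overall_Score / 10)

# mutate() can also overwrite a column; the type follows the new value.
as_number <- mutate(toy, Overall_Score = 60)     # stays numeric
as_string <- mutate(toy, Overall_Score = "60")   # becomes character

print(europe)
print(scaled)
```

Neither function modifies `toy` in place, so the result must be assigned to a name if it is needed later.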
## [1] 58.64318
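max_score <- max(content$Overall_Score) # a distinct name avoids masking the max() function
max_score
## [1] 83.5
min_score <- min(content$Overall_Score)
min_score
## [1] 2.9
sd_score <- sd(content$Overall_Score)
sd_score
## [1] 11.15323
first_scores <- head(content$Overall_Score)
first_scores
str(content$Overall_Score) # str() prints its summary and returns NULL, so the result is not assigned
## num [1:176] 83.5 83 82.6 80 79.2 77.8 77.8 77.8 77.5 77.5 ...
dim(content) # nrow() returns NULL for a plain vector; dim() on the data frame gives rows and columns
## [1] 176 17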
ANALYSIS AND CONCLUSION: We first import a data set and apply various functions
to inspect it, modify it, correct it, and apply conditions to it. We then analyse it using plots
and use different packages to perform their specific operations on the data set.