
CAC ASSIGNMENT

2340874

2024-04-17

R Markdown
INTRODUCTION: This assignment demonstrates R programming for data manipulation and
analysis using packages such as dplyr and tidyr. It covers reading data from a CSV file with the
`readr` package, then printing, summarizing, and manipulating the resulting data frame. Basic
data visualization with base R (`plot`) supports the data exploration, and ggplot2 is introduced
alongside it. Conditional statements (`if-else`) and loops (`for`) are used to check conditions
and classify data, and the control structures `break` and `next` are used to control the flow of
loop execution. Finally, the `dplyr` and `tidyr` packages are used to filter rows, create new
columns, and reshape data.

OBJECTIVE: To understand R programming concepts and techniques, and to develop the
ability to apply them to real-world data analysis tasks.

Q1. Using different basic functions of R


library("readr")
path<-("C:/Users/akash/Downloads/freedom_index.csv")
content<-read_csv(path, col_names = TRUE)

## Rows: 176 Columns: 17


## ── Column specification
────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Country, Region
## dbl (15): Serial, Year, Overall_Score, Property_Rights,
Government_Integrity...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this
message.

print(content)
## # A tibble: 176 × 17
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 166 more rows
## # ℹ 11 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending
<dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <dbl>,
## # Financial_Freedom <dbl>

head(content) #prints the first few rows

## # A tibble: 6 × 17
## Serial Country Region Year Overall_Score Property_Rights
Government_Integrity
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
<dbl>
## 1 0 Singap… Asia-… 2024 83.5 94.2
88.3
## 2 1 Switze… Europe 2024 83 94.2
91.3
## 3 2 Ireland Europe 2024 82.6 93.5
83.4
## 4 3 Taiwan Asia-… 2024 80 82.2
73.4
## 5 4 Luxemb… Europe 2024 79.2 96.9
84.9
## 6 5 Denmark Europe 2024 77.8 98.6
97.4
## # ℹ 10 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom
<dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>

tail(content) #prints the last few rows

## # A tibble: 6 × 17
## Serial Country Region Year Overall_Score Property_Rights
Government_Integrity
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
<dbl>
## 1 170 Burundi Sub-S… 2024 38.4 28.2
12
## 2 171 Zimbab… Sub-S… 2024 38.2 20.2
19.8
## 3 172 Sudan Sub-S… 2024 33.9 12.5
19.5
## 4 173 Venezu… Ameri… 2024 28.1 0
6.4
## 5 174 Cuba Ameri… 2024 25.7 30.1
36.2
## 6 175 North … Asia-… 2024 2.9 16
3.6
## # ℹ 10 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom
<dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>

summary(content) # summarizes each column: minimum, maximum, quartiles, mean, and median for numeric columns

## Serial Country Region Year


## Min. : 0.00 Length:176 Length:176 Min. :2024
## 1st Qu.: 43.75 Class :character Class :character 1st Qu.:2024
## Median : 87.50 Mode :character Mode :character Median :2024
## Mean : 87.50 Mean :2024
## 3rd Qu.:131.25 3rd Qu.:2024
## Max. :175.00 Max. :2024
## Overall_Score Property_Rights Government_Integrity
Judicial_Effectiveness
## Min. : 2.90 Min. : 0.00 Min. : 3.60 Min. : 3.30

## 1st Qu.:51.98 1st Qu.: 37.27 1st Qu.:28.10 1st Qu.:28.80

## Median :58.80 Median : 49.50 Median :40.90 Median :45.80

## Mean :58.64 Mean : 54.59 Mean :44.43 Mean :49.80

## 3rd Qu.:65.75 3rd Qu.: 72.55 3rd Qu.:58.48 3rd Qu.:71.88

## Max. :83.50 Max. :100.00 Max. :97.40 Max. :98.10

## Tax_Burden Government_Spending Fiscal_Health Business_Freedom


## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 5.00
## 1st Qu.: 72.42 1st Qu.:48.58 1st Qu.: 19.20 1st Qu.:49.30
## Median : 78.95 Median :70.25 Median : 62.50 Median :65.45
## Mean : 78.10 Mean :64.04 Mean : 52.18 Mean :62.18
## 3rd Qu.: 86.75 3rd Qu.:82.85 3rd Qu.: 82.45 3rd Qu.:75.35
## Max. :100.00 Max. :97.50 Max. :100.00 Max. :92.70
## Labor_Freedom Monetary_Freedom Trade_Freedom Investment_Freedom
## Min. : 5.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:51.25 1st Qu.:66.47 1st Qu.:63.55 1st Qu.:45.00
## Median :57.05 Median :70.90 Median :71.70 Median :60.00
## Mean :56.11 Mean :67.59 Mean :69.81 Mean :56.31
## 3rd Qu.:62.65 3rd Qu.:74.42 3rd Qu.:79.20 3rd Qu.:70.00
## Max. :78.90 Max. :81.90 Max. :95.00 Max. :95.00
## Financial_Freedom
## Min. : 0.00
## 1st Qu.:30.00
## Median :50.00
## Mean :48.58
## 3rd Qu.:60.00
## Max. :90.00

# Uses and definition of functions in Q1


1. library("readr"): Loads the readr package, which provides functions for
reading delimited files into R.

2. read_csv(path, col_names = TRUE): Reads the CSV file at the specified path
into R, treating the first row as column names. It returns a tibble, which is
the data frame type used by readr.

3. print(content): Prints the data frame to the console. Because `content` is
a tibble, only the first 10 rows and the columns that fit the console width
are shown.

4. head(content): Displays the first few rows of the data frame. By default,
it shows the first 6 rows.

5. tail(content): Displays the last few rows of the data frame. By default,
it shows the last 6 rows.

6. summary(content): Generates summary statistics for each column of the data
frame. This includes the minimum, maximum, quartiles, mean, and median values
for numerical columns. For character columns it reports the length, class,
and mode instead.

7. write: No write function is called in the code above. Functions of this
kind, such as write.csv() in base R or write_csv() in readr, write a data
frame to a file.

These functions are fundamental for data manipulation and analysis in R,
allowing users to read, inspect, summarize, and export data efficiently.
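
Since no write call appears in Q1, the following is a minimal sketch of the export step using base R's write.csv(). The `scores` data frame is a small stand-in built from the first three rows printed above, and the temporary file path is an assumption made only for illustration.

```r
# Small stand-in for `content`, using the first three rows shown in the output above.
scores <- data.frame(
  Country = c("Singapore", "Switzerland", "Ireland"),
  Overall_Score = c(83.5, 83.0, 82.6)
)

# Write to a temporary file (illustrative path) without the row-name column.
out_path <- tempfile(fileext = ".csv")
write.csv(scores, out_path, row.names = FALSE)

# Read the file back to confirm that the export round-trips.
round_trip <- read.csv(out_path)
print(round_trip)
```

Setting `row.names = FALSE` keeps write.csv() from adding an extra unnamed column of row numbers to the file.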

Q2. Basic data visualization and exploration.


Data Visualisation using the ggplot2 package as well as plot.
content<-content[complete.cases(content),]

library(ggplot2)
plot(content$Serial, content$`Overall_Score`)

plot(content$`Government_Spending`, content$`Overall_Score`,
     xlab = "Government Spending",
     ylab = "Overall Score",
     main = "Government Spending vs Overall Score")
mean(content$`Overall_Score`)

## [1] 58.64318

median(content$`Overall_Score`)

## [1] 58.8

mode(content$`Overall_Score`)

## [1] "numeric"

# Uses and definition of functions in Q2

1. plot(): This function is a generic plotting function in R that can create various types of
plots, such as scatter plots, line plots, histograms, etc. It is often used to visualize
relationships between two variables.

Example use: `plot(x, y)` creates a scatter plot of y vs. x.


2. ggplot2: It is a popular R package for creating data visualizations. It provides a more
flexible and powerful approach to creating plots compared to the base R plotting system.

Example use: `ggplot(data, aes(x = variable1, y = variable2)) + geom_point()` creates a
scatter plot using ggplot2.

3. mean(): Calculates the arithmetic mean (average) of a numeric vector.

Example use: `mean(data$variable)` calculates the mean of the variable in the data.

4. median(): Computes the median of a numeric vector, which is the middle value when the
values are sorted in ascending order.

Example use: `median(data$variable)` calculates the median of the variable in the data.

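5. mode(): In base R, mode() returns the storage mode of an object rather than the statistical
mode, which is why the call above prints "numeric". Base R has no built-in function that
computes the statistical mode directly, but a custom function can be defined if needed.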

These functions are essential for both basic data visualization and statistical analysis in R.
They help in understanding the distribution and characteristics of the data, as well as in
creating visual representations of the data.
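
A custom function for the statistical mode could look like the sketch below. The name `stat_mode` is hypothetical (it is not a base R function), and the sample vector holds the first ten Overall_Score values printed in Q1.

```r
# Hypothetical helper: returns the most frequent value(s) of a vector.
stat_mode <- function(x) {
  x <- x[!is.na(x)]          # ignore missing values
  counts <- table(x)         # frequency of each distinct value
  top <- names(counts)[counts == max(counts)]
  # table() keeps the values as character names, so convert back for numeric input
  if (is.numeric(x)) as.numeric(top) else top
}

# First ten Overall_Score values from the printed tibble; 77.8 occurs three times.
scores <- c(83.5, 83.0, 82.6, 80.0, 79.2, 77.8, 77.8, 77.8, 77.5, 77.5)
print(stat_mode(scores))
```

If several values tie for the highest count, this version returns all of them rather than picking one arbitrarily.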

Q3. Using conditional statements and looping


# Example 1: Checking Overall Scores using if Statement
if(mean(content$Overall_Score) > 55) {
  print("Average Overall Score is greater than 55")
} else {
  print("Average Overall Score is not greater than 55")
}

## [1] "Average Overall Score is greater than 55"


# Example 2: Classifying Overall Scores
# A for loop always returns NULL, so its result is not assigned to a variable
for(i in 1:nrow(content)) {
  if(content$Overall_Score[i] < 30) {
    content$class[i] <- "Low"
  } else if(content$Overall_Score[i] >= 30 && content$Overall_Score[i] <= 60) {
    content$class[i] <- "Satisfactory"
  } else {
    content$class[i] <- "Excellent"
  }
}

## Warning: Unknown or uninitialised column: `class`.

# Uses and definition of functions in Q3

1. if-else statement: A conditional statement used for decision-making in R. It evaluates a
condition and executes a block of code if the condition is true; otherwise, it executes
another block of code.

2. for loop: Used to iterate over a sequence (like a vector, list, or data frame) and perform
an operation for each element in the sequence.

3. Indexing: Used to access or manipulate elements in a data structure like a vector or data
frame. In the provided code, indexing is used to access and modify elements of the `content`
data frame based on the condition defined in the for loop.

4. Assignment operator `<-`: Used to assign values to variables or elements in a data
structure.

5. Warning message: Indicates that an unknown or uninitialized column (`class`) is being
used in the assignment operation. This happens because the `class` column does not exist in
the `content` data frame before the loop runs. To resolve this warning, you can pre-allocate
the `class` column before the loop or use a different approach to classifying the overall
scores.

These concepts and functions are fundamental in programming and are commonly used for
data manipulation and decision-making tasks in R.
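
The two remedies mentioned for the warning can be sketched as follows. The `scores` data frame is a hypothetical stand-in for `content` with one value per class boundary, and the column names `class` and `class_vec` are chosen only for illustration.

```r
# Hypothetical stand-in for `content`.
scores <- data.frame(Overall_Score = c(83.5, 58.8, 30, 2.9))

# Option 1: pre-allocate the column, then fill it inside the loop.
scores$class <- NA_character_
for (i in seq_len(nrow(scores))) {
  s <- scores$Overall_Score[i]
  if (s < 30) {
    scores$class[i] <- "Low"
  } else if (s <= 60) {
    scores$class[i] <- "Satisfactory"
  } else {
    scores$class[i] <- "Excellent"
  }
}

# Option 2: classify the whole column at once with nested ifelse(), no loop needed.
scores$class_vec <- ifelse(scores$Overall_Score < 30, "Low",
                    ifelse(scores$Overall_Score <= 60, "Satisfactory", "Excellent"))

print(scores)
```

Both options apply the same thresholds as the loop in Example 2; the vectorized form is usually the more idiomatic choice in R.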

Q4. Use Control structures: (break, next, return)


# Example 1: Using break
# Print the Overall Score values until the first value less than 70 is encountered
for (i in seq_along(content$Overall_Score)) {
  if (content$Overall_Score[i] < 70) {
    break # Exit the loop at the first value less than 70
  }
  print(content$Overall_Score[i])
}

## [1] 83.5
## [1] 83
## [1] 82.6
## [1] 80
## [1] 79.2
## [1] 77.8
## [1] 77.8
## [1] 77.8
## [1] 77.5
## [1] 77.5
## [1] 77.3
## [1] 76.3
## [1] 76.2
## [1] 73.1
## [1] 72.9
## [1] 72.4
## [1] 72.2
## [1] 72.1
## [1] 71.5
## [1] 71.5
## [1] 71.4
## [1] 71.1
## [1] 70.5
## [1] 70.2
## [1] 70.1
## [1] 70.1

# Example 2: Using next


# Print Overall Scores except those which are equal to 70.1
for (i in seq_along(content$Overall_Score)) {
  if (content$Overall_Score[i] == 70.1) {
    next # Skip to the next iteration if the value is equal to 70.1
  }
  print(content$Overall_Score[i])
}

## [1] 83.5
## [1] 83
## [1] 82.6
## [1] 80
## [1] 79.2
## [1] 77.8
## [1] 77.8
## [1] 77.8
## [1] 77.5
## [1] 77.5
## [1] 77.3
## [1] 76.3
## [1] 76.2
## [1] 73.1
## [1] 72.9
## [1] 72.4
## [1] 72.2
## [1] 72.1
## [1] 71.5
## [1] 71.5
## [1] 71.4
## [1] 71.1
## [1] 70.5
## [1] 70.2
## [1] 69.8
## [1] 68.8
## [1] 68.7
## [1] 68.6
## [1] 68.5
## [1] 68.4
## [1] 68.4
## [1] 68.1
## [1] 68.1
## [1] 68
## [1] 67.7
## [1] 67.5
## [1] 67.2
## [1] 67.2
## [1] 66.8
## [1] 66
## [1] 65.9
## [1] 65.9
## [1] 65.7
## [1] 65.6
## [1] 64.9
## [1] 64.8
## [1] 64.8
## [1] 64.5
## [1] 64.4
## [1] 64.1
## [1] 63.5
## [1] 63.4
## [1] 63.3
## [1] 62.9
## [1] 62.9
## [1] 62.9
## [1] 62.8
## [1] 62.7
## [1] 62.5
## [1] 62.5
## [1] 62.4
## [1] 62.2
## [1] 62.2
## [1] 62
## [1] 62
## [1] 62
## [1] 61.9
## [1] 61.6
## [1] 61.4
## [1] 61.2
## [1] 61.2
## [1] 61
## [1] 60.6
## [1] 60.6
## [1] 60.5
## [1] 60.4
## [1] 60.4
## [1] 60.1
## [1] 60.1
## [1] 59.8
## [1] 59.7
## [1] 59.2
## [1] 59.2
## [1] 59.1
## [1] 59
## [1] 59
## [1] 58.6
## [1] 58.5
## [1] 58.4
## [1] 58.3
## [1] 58.2
## [1] 58
## [1] 57.7
## [1] 57.5
## [1] 57.3
## [1] 57.3
## [1] 57.1
## [1] 56.9
## [1] 56.8
## [1] 56.2
## [1] 55.9
## [1] 55.8
## [1] 55.8
## [1] 55.6
## [1] 55.6
## [1] 55.4
## [1] 55.4
## [1] 55.3
## [1] 55.3
## [1] 55.2
## [1] 55.1
## [1] 55
## [1] 55
## [1] 54.4
## [1] 54.4
## [1] 54.3
## [1] 54
## [1] 53.6
## [1] 53.6
## [1] 53.4
## [1] 53.3
## [1] 53.2
## [1] 53.1
## [1] 52.9
## [1] 52.5
## [1] 52.3
## [1] 52.1
## [1] 52.1
## [1] 52
## [1] 52
## [1] 51.9
## [1] 51.9
## [1] 51.6
## [1] 51.4
## [1] 51.3
## [1] 51.3
## [1] 50.9
## [1] 50.7
## [1] 50.7
## [1] 50.6
## [1] 50.2
## [1] 49.9
## [1] 49.9
## [1] 49.7
## [1] 49.5
## [1] 49.4
## [1] 49.2
## [1] 48.8
## [1] 48.5
## [1] 48.4
## [1] 48.4
## [1] 48.3
## [1] 48.2
## [1] 47.9
## [1] 47.8
## [1] 47.8
## [1] 47.7
## [1] 47.6
## [1] 46.7
## [1] 46.3
## [1] 44.6
## [1] 43.9
## [1] 43.5
## [1] 42.7
## [1] 42.2
## [1] 41.3
## [1] 41.2
## [1] 39.5
## [1] 38.4
## [1] 38.2
## [1] 33.9
## [1] 28.1
## [1] 25.7
## [1] 2.9

# Example 3: Using return


# Function to calculate the mean Judicial Effectiveness of the rows matching a given value
calculate_mean_judicial_eff <- function(Judicial_Effectiveness) {
  # Check if the value exists in the dataset
  if (!(Judicial_Effectiveness %in% unique(content$Judicial_Effectiveness))) {
    print("Judicial Effectiveness not found in the dataset")
    return(NULL) # Return NULL if the value is not found
  }
  # Calculate and return the mean Judicial Effectiveness of the matching rows
  mean_judicial_eff <- mean(
    content$Judicial_Effectiveness[content$Judicial_Effectiveness == Judicial_Effectiveness])
  return(mean_judicial_eff)
}
# Call the function; the column name passed here is not a value in the column,
# so the "not found" branch runs
mean_judicial_eff <- calculate_mean_judicial_eff("Judicial_Effectiveness")

## [1] "Judicial Effectiveness not found in the dataset"


# Uses and definition of functions in Q4

1. break: Used to terminate a loop prematurely based on a certain condition. In Example 1,
the loop prints Overall Score values until the first value less than 70 is encountered, after
which the loop breaks.

2. next: Skips the remaining code within a loop for the current iteration and moves to the
next iteration. In Example 2, Overall Score values are printed except for those equal to 70.1,
where the loop skips printing and moves to the next iteration.

3. return: Terminates the execution of a function and returns a value to the calling
environment. In Example 3, a function named `calculate_mean_judicial_eff` takes a Judicial
Effectiveness value and first checks whether it occurs in the dataset. If it does not, a
message is printed and NULL is returned early. In the call shown, the argument is the column
name "Judicial_Effectiveness" rather than one of the column's numeric values, so this early
return is the branch that runs.

These control structures provide mechanisms for controlling the flow of execution in the
code, while the function definition and usage demonstrate how to create reusable pieces of
code for specific tasks.
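
As an illustration of an early return in a more typical lookup, the sketch below computes a mean score for one region. This is an assumed variation, not the function from Example 3: the name `mean_score_for_region`, the small `freedom` data frame (built from the first four rows printed in Q1), and the region "Atlantis" are all hypothetical.

```r
# Small stand-in data set using the first four rows printed in Q1.
freedom <- data.frame(
  Country = c("Singapore", "Switzerland", "Ireland", "Taiwan"),
  Region = c("Asia-Pacific", "Europe", "Europe", "Asia-Pacific"),
  Overall_Score = c(83.5, 83.0, 82.6, 80.0)
)

# Hypothetical helper: mean Overall_Score for one region, or NULL if the region is absent.
mean_score_for_region <- function(data, region) {
  if (!(region %in% data$Region)) {
    message("Region not found in the dataset: ", region)
    return(NULL) # early exit: nothing to average
  }
  # The last evaluated expression is returned implicitly
  mean(data$Overall_Score[data$Region == region])
}

europe_mean <- mean_score_for_region(freedom, "Europe")    # mean of 83.0 and 82.6
missing_mean <- mean_score_for_region(freedom, "Atlantis") # triggers the early return
print(europe_mean)
```

Passing the data frame as an argument, instead of reading a global object, keeps the function reusable with other data sets.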

Q5. Analysing the data using dplyr & tidyr package


library(dplyr)

##
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':


##
## filter, lag

## The following objects are masked from 'package:base':


##
## intersect, setdiff, setequal, union
filtered_rows<-filter(content,Overall_Score>60) # keep only countries with a score greater than 60
filtered_rows

## # A tibble: 81 × 17
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 71 more rows
## # ℹ 11 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending
<dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <dbl>,
## # Financial_Freedom <dbl>

library(dplyr)
mutated_row<-mutate(content, overall_score = (Overall_Score/10))
mutated_row

## # A tibble: 176 × 18
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 166 more rows
## # ℹ 12 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending
<dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <dbl>,
## # Financial_Freedom <dbl>, overall_score <dbl>
library(dplyr)
mutate(content, Investment_Freedom = "60") # overwrites the column with the text "60", so its type becomes <chr>

## # A tibble: 176 × 17
## Serial Country Region Year Overall_Score Property_Rights
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 0 Singapore Asia-Pacific 2024 83.5 94.2
## 2 1 Switzerland Europe 2024 83 94.2
## 3 2 Ireland Europe 2024 82.6 93.5
## 4 3 Taiwan Asia-Pacific 2024 80 82.2
## 5 4 Luxembourg Europe 2024 79.2 96.9
## 6 5 Denmark Europe 2024 77.8 98.6
## 7 6 Estonia Europe 2024 77.8 92.8
## 8 7 New Zealand Asia-Pacific 2024 77.8 87.4
## 9 8 Norway Europe 2024 77.5 98.8
## 10 9 Sweden Europe 2024 77.5 96.2
## # ℹ 166 more rows
## # ℹ 11 more variables: Government_Integrity <dbl>,
## # Judicial_Effectiveness <dbl>, Tax_Burden <dbl>, Government_Spending
<dbl>,
## # Fiscal_Health <dbl>, Business_Freedom <dbl>, Labor_Freedom <dbl>,
## # Monetary_Freedom <dbl>, Trade_Freedom <dbl>, Investment_Freedom <chr>,
## # Financial_Freedom <dbl>

Using tidyr package to analyse data


library(tidyr)
gathered<-gather(content, key = "Country&Region", value = "Country",
Country:Region)
gathered

## # A tibble: 352 × 17
## Serial Year Overall_Score Property_Rights Government_Integrity
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 2024 83.5 94.2 88.3
## 2 1 2024 83 94.2 91.3
## 3 2 2024 82.6 93.5 83.4
## 4 3 2024 80 82.2 73.4
## 5 4 2024 79.2 96.9 84.9
## 6 5 2024 77.8 98.6 97.4
## 7 6 2024 77.8 92.8 81.2
## 8 7 2024 77.8 87.4 95.9
## 9 8 2024 77.5 98.8 95.6
## 10 9 2024 77.5 96.2 93.2
## # ℹ 342 more rows
## # ℹ 12 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom
<dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>, `Country&Region`
<chr>,
## # Country <chr>

library(tidyr)
spread_data<-spread(gathered, key = "Country&Region", value = "Country")
spread_data

## # A tibble: 176 × 17
## Serial Year Overall_Score Property_Rights Government_Integrity
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 2024 83.5 94.2 88.3
## 2 1 2024 83 94.2 91.3
## 3 2 2024 82.6 93.5 83.4
## 4 3 2024 80 82.2 73.4
## 5 4 2024 79.2 96.9 84.9
## 6 5 2024 77.8 98.6 97.4
## 7 6 2024 77.8 92.8 81.2
## 8 7 2024 77.8 87.4 95.9
## 9 8 2024 77.5 98.8 95.6
## 10 9 2024 77.5 96.2 93.2
## # ℹ 166 more rows
## # ℹ 12 more variables: Judicial_Effectiveness <dbl>, Tax_Burden <dbl>,
## # Government_Spending <dbl>, Fiscal_Health <dbl>, Business_Freedom
<dbl>,
## # Labor_Freedom <dbl>, Monetary_Freedom <dbl>, Trade_Freedom <dbl>,
## # Investment_Freedom <dbl>, Financial_Freedom <dbl>, Country <chr>,
## # Region <chr>

# Uses and definition of functions in Q5

1. filter(): This function is from the dplyr package and keeps the rows of a
dataframe that satisfy the specified conditions.

Example use: `filter(dataframe, condition)` keeps the rows of the
dataframe where the condition is true.

2. mutate(): Also from the dplyr package, this function is used to create new
columns or modify existing columns in a dataframe.

Example use: `mutate(dataframe, new_column = expression)` creates a new
column in the dataframe based on the provided expression.

3. gather(): This function is from the tidyr package and is used to convert
wide-format data to long-format data by gathering columns into key-value
pairs. Current tidyr versions mark it as superseded by pivot_longer().

Example use: `gather(dataframe, key, value, columns_to_gather)` gathers
the specified columns into key-value pairs.

4. spread(): Also from the tidyr package, spread() is the opposite of
gather(). It is used to convert data from long format to wide format by
spreading unique values in a key-value pair column into multiple columns.
Current tidyr versions mark it as superseded by pivot_wider().

Example use: `spread(dataframe, key_column, value_column)` spreads the
key-value pairs into separate columns based on the key column.

These functions are commonly used for data manipulation and restructuring,
making them essential tools in the data analysis workflow in R.
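
The same steps can be sketched with the pipe operator and with pivot_longer() and pivot_wider(), which tidyr recommends in place of gather() and spread(). The `freedom` tibble is a small stand-in built from the first four rows of the data set, and the column names `field`, `value`, and `score_out_of_10` are chosen only for illustration.

```r
library(dplyr)
library(tidyr)

# Small stand-in for `content`, using the first four rows printed in Q1.
freedom <- tibble(
  Serial = 0:3,
  Country = c("Singapore", "Switzerland", "Ireland", "Taiwan"),
  Region = c("Asia-Pacific", "Europe", "Europe", "Asia-Pacific"),
  Overall_Score = c(83.5, 83.0, 82.6, 80.0)
)

# filter() and mutate() chained with the pipe.
high <- freedom %>%
  filter(Overall_Score > 82) %>%
  mutate(score_out_of_10 = Overall_Score / 10)

# Wide to long: Country and Region become name/value pairs (two rows per country).
long <- freedom %>%
  pivot_longer(cols = c(Country, Region), names_to = "field", values_to = "value")

# Long back to wide: one row per country again.
wide <- long %>%
  pivot_wider(names_from = field, values_from = value)

print(high)
print(long)
print(wide)
```

The round trip doubles the row count in `long` and restores the original number of rows in `wide`, which mirrors the 352-row and 176-row outputs of gather() and spread() above.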

Q6. Statistical Analysis


mean_score<-mean(content$Overall_Score) # named mean_score so the variable does not shadow mean()
mean_score

## [1] 58.64318

max_score<-max(content$Overall_Score)
max_score

## [1] 83.5

min_score<-min(content$Overall_Score)
min_score

## [1] 2.9

sd_score<-sd(content$Overall_Score)
sd_score

## [1] 11.15323
head_scores<-head(content$Overall_Score)
head_scores

## [1] 83.5 83.0 82.6 80.0 79.2 77.8

str(content$Overall_Score) # str() prints the structure and returns NULL, so there is nothing to assign

## num [1:176] 83.5 83 82.6 80 79.2 77.8 77.8 77.8 77.5 77.5 ...

n_rows<-nrow(content) # nrow() needs the data frame; applied to a single column vector it returns NULL

dim(content) #returns the dimensions of the dataframe

## [1] 176 17

names(content) #returns the names of the various dataframe columns

## [1] "Serial" "Country" "Region"

## [4] "Year" "Overall_Score" "Property_Rights"

## [7] "Government_Integrity" "Judicial_Effectiveness" "Tax_Burden"

## [10] "Government_Spending" "Fiscal_Health" "Business_Freedom"

## [13] "Labor_Freedom" "Monetary_Freedom" "Trade_Freedom"

## [16] "Investment_Freedom" "Financial_Freedom"

# Uses and definition of functions in Q6

1. mean(): Calculates the mean (average) of a numeric vector.
2. max(): Returns the maximum value in a numeric vector.
3. min(): Returns the minimum value in a numeric vector.
4. sd(): Computes the standard deviation of a numeric vector.
5. head(): Returns the first few elements of a vector or the first few rows of a data frame.
6. str(): Displays the structure of an R object, including its type, length, and first few values.
7. nrow(): Returns the number of rows in a data frame. Applied to a plain vector such as
`content$Overall_Score`, it returns NULL; use length() for vectors.

These functions are commonly used for basic statistical analysis and data exploration in R.
They help in understanding the distribution and characteristics of the data.
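
Reusing a function's name for a variable (for example `mean <- mean(x)`) works in R but makes later code harder to read. One alternative, sketched here under the assumption that a single summary object is wanted, is to collect the statistics in a named vector. The name `score_stats` is hypothetical, and the sample values are the six scores returned by head() above.

```r
# The six Overall_Score values returned by head() in Q6.
scores <- c(83.5, 83.0, 82.6, 80.0, 79.2, 77.8)

# Collect the statistics in one named vector instead of separate variables
# that share their names with the functions producing them.
score_stats <- c(
  mean = mean(scores),
  max = max(scores),
  min = min(scores),
  sd = sd(scores),
  n = length(scores) # length() counts the elements of a vector
)

print(round(score_stats, 2))
```

Individual values can then be retrieved by name, for example `score_stats[["sd"]]`.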

ANALYSIS AND CONCLUSION: We first import a data set and apply various functions
to inspect it, modify it, apply conditions to it, and summarize it. We then explore it using plots,
and use the dplyr and tidyr packages to filter, transform, and reshape the data set.
