Muthayammal College of Arts and Science Rasipuram: Assignment No - 3

Muthayammal College of Arts And Science
Rasipuram
Assignment No - 3
Name : K.Haritha
Roll no : 21UST004
Department : III- B.Sc., Statistics
Subject : R Programming For Data Analysis
Date :
K.Haritha
Student Signature Staff Signature

R PROGRAMMING FOR DATA ANALYSIS
UNIT -3
1. MEASURES OF CENTRAL TENDENCY

MEAN
In R, the mean() function is used to calculate the arithmetic mean, which is a
measure of central tendency. The arithmetic mean is the sum of all values in a dataset divided
by the number of observations.
PROGRAM
# Defining vector
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)
# Print mean
print(mean(x))
OUTPUT
21.5
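As a quick check of the definition above, the mean can also be computed by hand as the sum divided by the number of observations. This is only a sketch that reuses the vector from the program; the names manual_mean and with_missing are illustrative and not part of the assignment.

```r
# Same vector as in the program above
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)

# Arithmetic mean by definition: sum of values / number of observations
manual_mean <- sum(x) / length(x)
print(manual_mean)                        # 21.5
print(all.equal(manual_mean, mean(x)))    # TRUE

# mean() returns NA when a value is missing, unless na.rm = TRUE is given
with_missing <- c(x, NA)
print(mean(with_missing))                 # NA
print(mean(with_missing, na.rm = TRUE))   # 21.5
```

The na.rm argument is worth remembering for real datasets, where missing values are common.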
MEDIAN
In R, the median() function is used to calculate the median, which is another measure
of central tendency. The median is the middle value of a dataset when it is ordered. If the
dataset has an odd number of observations, the median is the middle value. If the dataset has
an even number of observations, the median is the average of the two middle values.
PROGRAM
# Defining vector
x <- c(3, 7, 5, 13, 20, 23, 39,
23, 40, 23, 14, 12, 56, 23)
# Print Median
median(x)
OUTPUT
21.5
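The odd and even cases described above can be sketched with a small helper that works on the sorted values. The function name manual_median is hypothetical; in practice median() should be used.

```r
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)

# Hypothetical helper: median taken from the sorted values
manual_median <- function(v) {
  s <- sort(v)
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                   # odd n: the single middle value
  } else {
    (s[n / 2] + s[n / 2 + 1]) / 2    # even n: average of the two middle values
  }
}

print(sort(x))
print(manual_median(x))       # 21.5, the average of 20 and 23
print(manual_median(x[-1]))   # 23, after dropping one value n is odd
```

With 14 observations the two middle values are 20 and 23, which explains why the median is 21.5 even though that number does not appear in the data.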
MODE
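The mode of a given set of values is the value that occurs most often in the set. There
can be more than one mode when two or more values share the highest frequency. R has no
built-in function for the statistical mode (the base mode() function reports an object's
storage mode instead), so a user-defined function is used below.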
PROGRAM
# vector of marks
marks <- c(97, 78, 57,78, 97, 66, 87, 64, 87, 78)
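# define mode() function (this masks base R's mode() for the session)
mode <- function() {
  # calculate mode of marks: name of the most frequent value, as a character string
  return(names(sort(-table(marks)))[1])
}
# call mode() function
mode()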
OUTPUT
78
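The program above returns only the first value with the highest frequency. A minimal sketch of a variant that returns every tied mode is given below; the name all_modes is hypothetical, and the helper assumes a numeric input vector.

```r
# Hypothetical helper: return every value that attains the highest frequency
all_modes <- function(v) {
  counts <- table(v)
  # names() of a table are character strings, so convert back to numbers
  as.numeric(names(counts)[counts == max(counts)])
}

marks <- c(97, 78, 57, 78, 97, 66, 87, 64, 87, 78)
print(all_modes(marks))              # 78

# Two values tie for the highest frequency, so both are returned
print(all_modes(c(1, 2, 2, 3, 3)))   # 2 3
```

Choosing a name other than mode also avoids masking the base R function of that name.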
2. MEASURES OF DISPERSION
RANGE
We can find the minimum and the maximum of a vector using the min() and max()
functions. The range() function is also available; it returns the minimum and
maximum as a two-element vector. As a measure of dispersion, the range is the
difference between the maximum and the minimum.
PROGRAM
# Sample numeric vector
data <- c(2, 5, 1, 8, 4, 10)
# Calculate the range
data_range <- range(data)
# Print the range
print(data_range)
OUTPUT
1 10
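Since range() returns the two extremes rather than a single number, the spread itself has to be computed from them. One way to do this, sketched here with the same data, is diff(); the name range_width is illustrative.

```r
data <- c(2, 5, 1, 8, 4, 10)

# range() returns c(minimum, maximum)
data_range <- range(data)

# Spread between the extremes: maximum minus minimum
range_width <- diff(data_range)
print(range_width)              # 9
print(max(data) - min(data))    # 9, the same quantity computed directly
```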
VARIANCE
Variance measures how far the values in a dataset spread around the mean. It is based on
the sum of squared differences between each value and the mean. The var() function in R
computes the sample variance,
s^2 = sum((x_i - mean)^2) / (N - 1)
where N is the total number of elements.
PROGRAM
# Taking a vector of elements
# (named values so that the built-in list() function is not masked)
values <- c(2, 4, 4, 4, 5, 5, 7, 9)
# Calculating variance using var()
print(var(values))
OUTPUT
4.571429
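The output 4.571429 can be reproduced step by step, which also shows that var() divides by N - 1 rather than N. The sketch below uses the same eight numbers; the names ss, sample_var and population_var are illustrative.

```r
values <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(values)

# Sum of squared deviations from the mean
ss <- sum((values - mean(values))^2)
print(ss)                                    # 32

# Sample variance (divides by n - 1); this is what var() returns
sample_var <- ss / (n - 1)
print(sample_var)                            # 4.571429
print(all.equal(sample_var, var(values)))    # TRUE

# Population variance (divides by n), shown for comparison
population_var <- ss / n
print(population_var)                        # 4
```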
STANDARD DEVIATION
Standard Deviation is the square root of variance. It is a measure of the extent to
which data varies from the mean. One can calculate the standard deviation by using the
sd() function in R.
PROGRAM
# R program to get
# standard deviation of a vector
# Taking a vector of elements
values <- c(2, 4, 4, 4, 5, 5, 7, 9)
# Calculating standard
# deviation using sd()
print(sd(values))
OUTPUT
2.13809
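The relationship between the two measures can be checked directly: taking the square root of var() should give the same number as sd(). A short sketch with the same eight numbers follows; manual_sd is an illustrative name.

```r
values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Standard deviation as the square root of the sample variance
manual_sd <- sqrt(var(values))
print(manual_sd)                           # 2.13809
print(all.equal(manual_sd, sd(values)))    # TRUE
```

Unlike the variance, the standard deviation is expressed in the same units as the data.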
3. SKEWNESS
Skewness is a measure of the asymmetry of a distribution. This value can be positive
or negative. A negative skew indicates that the tail is on the left side of the distribution, which
extends towards more negative values. A positive skew indicates that the tail is on the right
side of the distribution, which extends towards more positive values. A value of zero
indicates that there is no skewness in the distribution at all, meaning the distribution is
perfectly symmetrical.
PROGRAM
# install.packages("e1071")
# Load the e1071 package
library(e1071)
# Create a numeric vector (replace this with your own data)
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)
# Calculate skewness
skew_value <- skewness(data)
# Print the result
print(paste("Skewness:", skew_value))
OUTPUT
-0.317740168714651
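To see where the value comes from, skewness can be sketched in base R from the central moments, without any package. The variable names below are illustrative. Two common versions are computed; the second one agrees with the output shown above, which suggests it matches the default estimator in e1071, but that link is an observation from the numbers and should be checked against the package documentation.

```r
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)
n <- length(data)
d <- data - mean(data)      # deviations from the mean

# Second and third central moments (both divide by n)
m2 <- sum(d^2) / n
m3 <- sum(d^3) / n

# Moment coefficient of skewness: third moment over m2^(3/2)
g1 <- m3 / m2^1.5
print(g1)                   # about -0.362

# Third moment over the cube of sd(), which divides by n - 1
b1 <- m3 / sd(data)^3
print(b1)                   # about -0.3177
```

Both values are negative, which is consistent with the longer tail on the left: most of the data sits at 3, 4 and 5, with a single low value at 1.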
KURTOSIS
Kurtosis is a measure of whether a distribution is heavy-tailed or light-tailed
relative to a normal distribution. The kurtosis of a normal distribution is 3. If a given
distribution has a kurtosis less than 3, it is said to be platykurtic, which means it tends to
produce fewer and less extreme outliers than the normal distribution. If a given distribution
has a kurtosis greater than 3, it is said to be leptokurtic, which means it tends to produce more
outliers than the normal distribution. Note: Some formulas (Fisher's definition) subtract 3
from the kurtosis to make it easier to compare with the normal distribution. Using this
definition, a distribution has heavier tails than a normal distribution if its kurtosis value is
greater than 0. The kurtosis() function from the moments package used below reports the
value on the first scale, where the normal distribution has kurtosis 3.
PROGRAM
# Install the moments package if not already installed
# install.packages("moments")
# Load the moments package
library(moments)
# Create a numeric vector (replace this with your own data)
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)
# Calculate kurtosis
kurtosis_value <- kurtosis(data)
# Print the result
print(paste("Kurtosis:", kurtosis_value))
OUTPUT
2.256
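The same number can be reproduced in base R as the fourth central moment divided by the square of the second, and subtracting 3 gives the excess kurtosis mentioned in the note. This is a minimal sketch with illustrative variable names, not a replacement for the package function.

```r
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)
n <- length(data)
d <- data - mean(data)      # deviations from the mean

# Second and fourth central moments (both divide by n)
m2 <- sum(d^2) / n
m4 <- sum(d^4) / n

# Kurtosis on the scale where a normal distribution has the value 3
kurt <- m4 / m2^2
print(kurt)                 # 2.256

# Excess kurtosis (Fisher's definition): subtract 3
excess <- kurt - 3
print(excess)               # -0.744
```

Since 2.256 is below 3 (equivalently, the excess is negative), this sample is platykurtic.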