
Muthayammal College of Arts And Science

Rasipuram
Assignment No - 3

Name : K.Haritha

Roll no : 21UST004

Department : III- B.Sc., Statistics

Subject : R Programming For Data Analysis

Date :

K.Haritha

Student Signature Staff Signature


R PROGRAMMING FOR DATA ANALYSIS
UNIT -3

1.MEASURES OF CENTRAL TENDENCY


MEAN
In R, the mean() function is used to calculate the arithmetic mean, which is a
measure of central tendency. The arithmetic mean is the sum of all values in a dataset divided
by the number of observations.

PROGRAM

# Defining vector

x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)

# Print mean

print(mean(x))

OUTPUT

21.5
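
As a cross-check on the definition above, the mean can also be computed by hand as the sum divided by the count. The following is a minimal sketch; the vector y with a missing value is an assumption added only to illustrate the na.rm argument of mean().

```r
# Same vector as in the program above
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)

# Arithmetic mean from its definition: sum of values / number of values
manual_mean <- sum(x) / length(x)
print(manual_mean)                       # 21.5

# The hand-computed value agrees with the built-in function
print(all.equal(manual_mean, mean(x)))   # TRUE

# Assumed example: the same data with one missing value appended
y <- c(x, NA)
print(mean(y))                 # NA, because one value is missing
print(mean(y, na.rm = TRUE))   # 21.5, the NA is dropped before averaging
```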

MEDIAN

In R, the median() function is used to calculate the median, which is another measure
of central tendency. The median is the middle value of a dataset when it is ordered. If the
dataset has an odd number of observations, the median is the middle value. If the dataset has
an even number of observations, the median is the average of the two middle values.

PROGRAM

# Defining vector

x <- c(3, 7, 5, 13, 20, 23, 39,

23, 40, 23, 14, 12, 56, 23)

# Print Median

median(x)
OUTPUT

21.5
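
The odd and even cases described above can be traced by hand. The sketch below sorts the same vector and picks the middle value or values; x_odd is a hypothetical odd-length variant made by dropping the last element, used only for illustration.

```r
# Same vector as in the program above (14 values, an even count)
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)

sorted_x <- sort(x)
n <- length(sorted_x)

if (n %% 2 == 1) {
  # Odd count: the single middle value
  manual_median <- sorted_x[(n + 1) / 2]
} else {
  # Even count: average of the two middle values
  manual_median <- (sorted_x[n / 2] + sorted_x[n / 2 + 1]) / 2
}
print(manual_median)    # 21.5, the average of 20 and 23

# Hypothetical odd-length example: drop the last element (13 values)
x_odd <- x[-length(x)]
print(median(x_odd))    # 20, the 7th of the 13 sorted values
```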

MODE

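The mode of a given set of values is the value that occurs most often in the set. A set
can have more than one mode when two or more values share the same maximum frequency.
R has no built-in function for the statistical mode (the base mode() function reports the
storage type of an object), so the program below defines its own function.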

PROGRAM

# vector of marks

marks <- c(97, 78, 57,78, 97, 66, 87, 64, 87, 78)

# define mode() function

mode = function() {

  # calculate mode of marks

  return(names(sort(-table(marks)))[1])

}

# call mode() function

mode()

OUTPUT

78
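
Since a dataset can have several modes, a variant that returns every value tied for the highest frequency may be useful. This is only a sketch: find_modes is a hypothetical helper name, chosen so that base R's mode() is not masked, and the tied vector is an assumed example.

```r
# Same marks as in the program above
marks <- c(97, 78, 57, 78, 97, 66, 87, 64, 87, 78)

# Hypothetical helper: returns all values that share the highest frequency
find_modes <- function(v) {
  counts <- table(v)
  # Keep every value whose count equals the maximum count
  as.numeric(names(counts)[counts == max(counts)])
}

print(find_modes(marks))              # 78 (it occurs three times)

# Assumed example with a tie: 2 and 3 both occur twice
print(find_modes(c(1, 2, 2, 3, 3)))   # 2 3
```
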
2.MEASURES OF DISPERSION

RANGE

We can find the minimum and the maximum of a vector using the min() and max()
functions. The range() function is also available; it returns the minimum and the
maximum, in that order, as a two-element vector.

PROGRAM

# Sample numeric vector

data <- c(2, 5, 1, 8, 4, 10)

# Calculate the range

data_range <- range(data)

# Print the range

print(data_range)

OUTPUT

1 10
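
As a measure of dispersion, the range is usually reported as a single number, the maximum minus the minimum. A minimal sketch using the same data is shown here; the name range_width is chosen only for illustration.

```r
# Same vector as in the program above
data <- c(2, 5, 1, 8, 4, 10)

# Minimum and maximum taken separately
print(min(data))    # 1
print(max(data))    # 10

# range() returns c(min, max); diff() gives max - min
data_range <- range(data)
range_width <- diff(data_range)
print(range_width)  # 9
```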

VARIANCE

Variance measures how far the values spread around their mean. It is the sum of the
squared differences between each value and the mean, divided by N - 1, where N is the
total number of elements:

s^2 = sum((x_i - mean)^2) / (N - 1)

The variance can be calculated by using the var() function in R, which uses this
sample (N - 1) form.

PROGRAM

# Taking a vector of elements (named values so that base R's list() is not masked)

values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Calculating variance using var()

print(var(values))

OUTPUT

4.571429
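
The output above can be reproduced step by step from the definition. The sketch below uses the same eight numbers and also shows the population form (dividing by N), which is an addition for comparison only; var() itself divides by N - 1.

```r
# Same eight numbers as in the program above
values <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(values)

# Squared differences between each value and the mean (the mean is 5)
squared_dev <- (values - mean(values))^2

# Sample variance: divide by n - 1, as var() does
sample_var <- sum(squared_dev) / (n - 1)
print(sample_var)        # 4.571429
print(var(values))       # 4.571429

# Population variance: divide by n (shown only for comparison)
population_var <- sum(squared_dev) / n
print(population_var)    # 4
```
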
STANDARD DEVIATION

Standard deviation is the square root of the variance. It is a measure of the extent to
which data varies from the mean. One can calculate the standard deviation by using the sd()
function in R.

PROGRAM

# R program to get the standard deviation of a vector

# Taking a vector of elements (named values so that base R's list() is not masked)

values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Calculating standard deviation using sd()

print(sd(values))

OUTPUT

2.13809
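
The relationship to the variance can be checked directly, and the standard deviation can be read as a typical distance from the mean. The following sketch assumes the same eight numbers; counting the values within one standard deviation of the mean is added only as an illustration.

```r
# Same eight numbers as in the program above
values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# sd() is the square root of var()
print(sqrt(var(values)))    # 2.13809
print(sd(values))           # 2.13809

# Illustration: which values lie within one standard deviation of the mean?
centre <- mean(values)      # 5
spread <- sd(values)
within_one_sd <- abs(values - centre) <= spread
print(values[within_one_sd])   # 4 4 4 5 5 7
print(sum(within_one_sd))      # 6 of the 8 values
```
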
3.SKEWNESS

Skewness is a measure of the asymmetry of a distribution. This value can be positive
or negative. A negative skew indicates that the tail is on the left side of the distribution, which
extends towards more negative values. A positive skew indicates that the tail is on the right
side of the distribution, which extends towards more positive values. A value of zero
indicates that there is no skewness in the distribution at all, meaning the distribution is
perfectly symmetrical.

PROGRAM

# install.packages("e1071")

# Load the e1071 package

library(e1071)

# Create a numeric vector (replace this with your own data)

data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)

# Calculate skewness

skew_value <- skewness(data)

# Print the result

print(paste("Skewness:", skew_value))

OUTPUT

"Skewness: -0.317740168714651"
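
The same figure can be obtained in base R, which helps show what is being measured. This sketch assumes that skewness() in e1071 uses its default setting (type = 3), that is, the third central moment divided by the cube of the sample standard deviation; under that assumption it reproduces the value above without loading the package.

```r
# Same vector as in the program above
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)
n <- length(data)

# Third central moment: average cubed deviation from the mean
deviations <- data - mean(data)
m3 <- sum(deviations^3) / n

# Assumed formula (e1071 default, type = 3): m3 / s^3,
# where s is the sample standard deviation from sd()
skew_manual <- m3 / sd(data)^3
print(skew_manual)    # about -0.3177

# A negative value means the longer tail is on the left
print(skew_manual < 0)    # TRUE
```
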
KURTOSIS

Kurtosis is a measure of whether a distribution is heavy-tailed or light-tailed
relative to a normal distribution. The kurtosis of a normal distribution is 3. If a given
distribution has a kurtosis less than 3, it is said to be platykurtic, which means it tends to
produce fewer and less extreme outliers than the normal distribution. If a given distribution
has a kurtosis greater than 3, it is said to be leptokurtic, which means it tends to produce more
outliers than the normal distribution. Note: Some formulas (Fisher's definition) subtract 3
from the kurtosis to make it easier to compare with the normal distribution. Using this
definition, a distribution would have kurtosis greater than a normal distribution if it had a
kurtosis value greater than 0.

PROGRAM

# Install the moments package if not already installed

# install.packages("moments")

# Load the moments package

library(moments)

# Create a numeric vector (replace this with your own data)

data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)

# Calculate kurtosis

kurtosis_value <- kurtosis(data)

# Print the result

print(paste("Kurtosis:", kurtosis_value))

OUTPUT

2.256
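
The value above can also be computed in base R from the central moments. The sketch below assumes the non-excess definition, the fourth central moment divided by the squared second central moment, which is consistent with the output of 2.256; the excess kurtosis under Fisher's definition is shown alongside it.

```r
# Same vector as in the program above
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5)
n <- length(data)

# Second and fourth central moments (both divide by n)
deviations <- data - mean(data)
m2 <- sum(deviations^2) / n
m4 <- sum(deviations^4) / n

# Kurtosis on the scale where a normal distribution scores 3
kurt_manual <- m4 / m2^2
print(kurt_manual)        # 2.256

# Excess kurtosis (Fisher's definition): subtract 3
excess_kurt <- kurt_manual - 3
print(excess_kurt)        # -0.744

# Below 3 (below 0 in excess form), so this sample is platykurtic
print(kurt_manual < 3)    # TRUE
```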
