Assignment-3 Key-Graphical Representations

Q9) Calculate Skewness, Kurtosis & draw inferences on the following data
Cars speed and distance
Ans) #using e1071 package

> skewness(ex2_csv$speed)
[1] -0.7983898
> kurtosis(ex2_csv$speed)
[1] -0.2260851
> skewness(ex2_csv$dist)
[1] 1.150886
> kurtosis(ex2_csv$dist)
[1] 1.466731
#Using moments package
> skewness(ex2_csv$speed)
[1] -0.8448909
> kurtosis(ex2_csv$speed)
[1] 2.991396
> skewness(ex2_csv$dist)
[1] 1.217917
> kurtosis(ex2_csv$dist)
[1] 4.816933
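Inferences: Both packages agree on the direction of skew: speed is negatively (left) skewed and
dist is positively (right) skewed. The kurtosis values differ sharply between e1071 and moments
because the packages use different definitions. moments reports Pearson's kurtosis, for which a
normal distribution scores 3, while e1071 by default reports excess kurtosis, which subtracts 3 so
that a normal distribution scores 0. On the moments scale, speed (2.99) is close to normal and dist
(4.82) is heavier-tailed than normal.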
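The relationship between the two definitions can be checked in base R. This is a rough sketch, not either package's source code: the names pearson_kurtosis and excess_kurtosis_type3 are made up for illustration, and it assumes that moments::kurtosis returns m4/m2^2 and that e1071::kurtosis with its default type = 3 returns m4/s^4 - 3, where s is the sample standard deviation. The test scores from Q12 serve as sample data.

```r
# Sample data: the test scores listed in Q12 (used here only for illustration)
x <- c(34, 36, 36, 38, 38, 39, 39, 40, 40, 41, 41, 41, 41, 42, 42, 45, 49, 56)
n <- length(x)

# Central moments with divisor n
m2 <- mean((x - mean(x))^2)
m4 <- mean((x - mean(x))^4)

# Pearson kurtosis: equals 3 for a normal distribution
# (assumed to correspond to moments::kurtosis)
pearson_kurtosis <- m4 / m2^2

# Excess kurtosis using the sample sd (divisor n - 1): equals 0 for a normal
# distribution (assumed to correspond to e1071::kurtosis with type = 3)
excess_kurtosis_type3 <- m4 / sd(x)^4 - 3

# The two values are linked by a fixed formula, so they describe the same shape
all.equal(excess_kurtosis_type3, pearson_kurtosis * (1 - 1/n)^2 - 3)
```

Under these assumptions, adding 3 to an e1071 value gives a number somewhat below the corresponding moments value; the remaining gap comes from the (1 - 1/n)^2 factor.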
SP and Weight(WT)
Ans)
#using e1071 package

> skewness(ex3_csv$SP)
[1] -0.3898407
> skewness(ex3_csv$WT)
[1] -1.230919
> kurtosis(ex3_csv$SP)
[1] -1.034207
> kurtosis(ex3_csv$WT)
[1] 0.5979244
#using moments package

> skewness(ex3_csv$SP)
[1] -0.4076944
> skewness(ex3_csv$WT)
[1] -1.287292
> kurtosis(ex3_csv$SP)
[1] 2.086738
> kurtosis(ex3_csv$WT)
[1] 3.819284
Q10) Draw inferences about the following boxplot & histogram

[Figure: boxplot and histogram, not reproduced here]
Ans: The boxplot suggests that the distribution has many outliers toward the upper extreme.
Q11) Suppose we want to estimate the average weight of an adult male in Mexico. We draw a random
sample of 2,000 men from a population of 3,000,000 men and weigh them. We find that the average
person in our sample weighs 200 pounds, and the standard deviation of the sample is 30 pounds.
Calculate 94%, 98%, 96% confidence interval?
Ans: n = 2000, X̄ = 200, s = 30
Confidence interval estimate = X̄ ± Z * s/√n = 200 ± Z * 30/√2000
For a two-sided interval at confidence level C, Z = qnorm(1 - (1 - C)/2).
94% Confidence:
> qnorm(0.97)
[1] 1.880794
200 ± 1.88 * 30/√2000 = 198.74 to 201.26
98% Confidence:
> qnorm(0.99)
[1] 2.326348
200 ± 2.33 * 30/√2000 = 198.44 to 201.56
96% Confidence:
> qnorm(0.98)
[1] 2.053749
200 ± 2.05 * 30/√2000 = 198.62 to 201.38
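The three intervals can also be computed in one pass. The following is a minimal sketch using the figures given in Q11; the variable names (se, conf, ci) are chosen here for illustration only.

```r
n    <- 2000   # sample size
xbar <- 200    # sample mean (pounds)
s    <- 30     # sample standard deviation (pounds)

se <- s / sqrt(n)                    # standard error of the mean

conf  <- c(0.94, 0.98, 0.96)         # confidence levels asked for in Q11
z     <- qnorm(1 - (1 - conf) / 2)   # two-sided critical values
lower <- xbar - z * se
upper <- xbar + z * se

ci <- data.frame(conf,
                 z     = round(z, 3),
                 lower = round(lower, 2),
                 upper = round(upper, 2))
print(ci)
```

Each row should agree with the hand calculation for the same confidence level.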
Q12) Below are the scores obtained by a student in tests
34,36,36,38,38,39,39,40,40,41,41,41,41,42,42,45,49,56
1) Find mean, median, variance, standard deviation.

2) What can we say about the student marks?
Ans: 1)
> mean(ex$scores)
[1] 41
> median(ex$scores)
[1] 40.5
> var(ex$scores)
[1] 25.52941
> sd(ex$scores)
[1] 5.052664
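2) Mean (41) > median (40.5), which implies that the distribution is slightly skewed to the right.
The two highest scores, 49 and 56, lie above the upper boxplot fence (upper hinge + 1.5 × IQR =
42 + 6 = 48), so they are outliers.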
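The summary statistics and an outlier check can be reproduced from the scores listed in the question. This is a small sketch: the vector name scores is an assumption standing in for ex$scores, and the outlier check uses boxplot.stats(), which applies the 1.5 × IQR rule based on hinges.

```r
scores <- c(34, 36, 36, 38, 38, 39, 39, 40, 40, 41, 41, 41, 41, 42, 42, 45, 49, 56)

mean(scores)     # 41
median(scores)   # 40.5
var(scores)      # sample variance (divisor n - 1), about 25.53
sd(scores)       # about 5.05

# boxplot.stats() applies the same 1.5 * IQR rule that boxplot() uses
bs <- boxplot.stats(scores)
bs$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out     # points beyond the whiskers: 49 and 56
```

By this rule the scores 49 and 56 fall beyond the upper whisker, which is also what pulls the mean above the median.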
Q13) What is the nature of skewness when the mean and median of the data are equal?
Ans) No skewness; the distribution is approximately symmetric.
Q14) What is the nature of skewness when mean > median?
Ans) Right skewed (tail on the right side).
Q15) What is the nature of skewness when median > mean?
Ans) Left skewed (tail on the left side).
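Q16) What does a positive kurtosis value indicate for the data?
Ans) Positive (excess) kurtosis indicates heavier tails than a normal distribution, so extreme
values are more likely; this often comes with a sharper peak.
Q17) What does a negative kurtosis value indicate for the data?
Ans) Negative (excess) kurtosis indicates lighter tails than a normal distribution, so extreme
values are less likely; this often comes with a broader, flatter peak.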
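The sign of excess kurtosis can be illustrated by comparing three reference shapes. Kurtosis is driven mainly by tail weight, so the sketch below evaluates quantiles of a uniform, a normal, and a Student t distribution on an evenly spaced probability grid. The helper excess_kurtosis is written here for illustration (divisor-n moments minus 3) and is not taken from either package.

```r
# Excess kurtosis with divisor-n moments: 0 for a normal distribution
excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

p <- ppoints(10000)        # evenly spaced probabilities in (0, 1)

flat  <- qunif(p)          # uniform: light tails
bell  <- qnorm(p)          # normal: reference shape
heavy <- qt(p, df = 5)     # Student t with 5 df: heavy tails

k <- sapply(list(uniform = flat, normal = bell, t5 = heavy), excess_kurtosis)
print(k)   # negative for uniform, near 0 for normal, positive for t5
```

Because the grid is deterministic, the result does not depend on a random seed; the exact values are approximations of the theoretical ones.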
Q18) Answer the below questions using the below boxplot visualization.
What can we say about the distribution of the data?
Ans) It is not a normal distribution.
What is nature of skewness of the data?
Ans) It is left skewed.
What will be the IQR of the data (approximately)?

Ans) Interquartile range = upper quartile - lower quartile = 18 - 10 = 8
Q19) Comment on the below Boxplot visualizations?
Draw an inference from the distribution of data for Boxplot 1 with respect to Boxplot 2.
Ans) 1) The medians of the two boxplots are approximately the same, about 260.
2) Neither boxplot is skewed in the +ve or -ve direction.
3) Neither boxplot shows any outliers.
1. Answer the following three questions based on the box-plot above.
(i) What is inter-quartile range of this dataset? (please approximate the numbers) In one
line, explain what this value implies.
Ans) The inter-quartile range is the distance between the upper quartile (Q3) and the lower quartile (Q1).
IQR = Q3 - Q1 = 12 - 5 = 7
The middle 50% of the data lies within this range, between Q1 = 5 and Q3 = 12.
(ii) What can we say about the skewness of this dataset?

Ans) From the above boxplot we can say that the distribution of X is right-skewed or
positively skewed.
(iii) If it was found that the data point with the value 25 is actually 2.5, how would the new
box-plot be affected?
Ans) If the data point is actually 2.5 instead of 25, the outlier disappears from the boxplot
and the upper whisker ends at the next-largest value.
Whether the median shifts depends on the size of the data and on where 2.5 falls relative to the median.
The right skewness of the data is reduced.
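The effect of the correction can be tried on made-up numbers. The data below are hypothetical, invented so that Q1 = 5, Q3 = 12, and a single outlier sits at 25; they are not the data behind the assignment's boxplot.

```r
# Hypothetical data, NOT the data behind the assignment's boxplot
x_recorded  <- c(1, 3, 5, 5, 6, 7, 8, 9, 10, 12, 12, 13, 25)
x_corrected <- replace(x_recorded, x_recorded == 25, 2.5)

boxplot.stats(x_recorded)$out    # 25 is flagged as an outlier
boxplot.stats(x_corrected)$out   # no outliers remain

median(x_recorded)               # 8
median(x_corrected)              # 7: the median shifts in this small sample
```

With only 13 values, moving one point from the top to the bottom of the sample is enough to shift the median; in a large dataset the same change would usually leave it almost untouched.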
2. Answer the following three questions based on the histogram above.
(i) Where would the mode of this dataset lie?
Ans) The actual data are needed to get the exact value of the mode. The mode probably lies between 4
and 10, because most values fall in this range, but this is only an assumption. Two bars of the
same height do not necessarily identify the mode, since a bar counts all values in its bin.
(ii) Comment on the skewness of the dataset.
Ans) It is right skewed or +ve skewed.
(iii) Suppose that the above histogram and the box-plot in question 2 are plotted for the
same dataset. Explain how these graphs complement each other in providing
information about any dataset.
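Ans) The boxplot summarizes the median, quartiles, and outliers, while the histogram shows the shape
of the distribution and where the values concentrate. Taken together, the histogram and the boxplot
confirm an outlier at 25 in Y, and both plots indicate the +ve skewness of the dataset.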
