
Laboratory Activity No. 3
Describing Data Numerically

1. Objective(s):

1.1 Use Google Sheets in summarizing data.


1.2 Interpret the results.

2. Intended Learning Outcomes (ILOs):

The students shall be able to:


2.1 Demonstrate scientific thinking and the ability to approach scientific resources intelligently.
2.2 Utilize Google Sheets.
2.3 Infer appropriate conclusions based upon the results of the activity.

3. Discussion:

Google Sheets is free software that can help you analyze data. Its statistical functions can be used
to generate the computations needed for statistical interpretation.

This laboratory activity assesses students’ understanding of important concepts in describing data
graphically. It also assesses students’ understanding of generating the simple descriptive statistics
necessary for reports.

Graphs are important because they present data visually, highlight the most important facts, and are
easy to interpret.

The following are the most commonly used graphs.


1. Scatter plot - used to show the relationship between pairs of quantitative measurements
made for the same object/individual.
2. Pie Charts - the usual way of displaying how the total data are distributed between
different categories.
3. Bar Graph - used to display and compare number, frequency, or other measure for
discrete categories.
4. Histogram - special form of bar chart where the data represent continuous rather than
discrete categories. It is used to display and compare numbers, frequencies, or other
measures.
5. Line Graphs - show trends and changes in different measurements.
6. Stem and Leaf Display - helps to visualize the shape of a distribution of quantitative data.

4. Resources:

Google Sheets

5. Procedure:

Exercise 1

1.1 Use the ‘ClassData_FA2016.XLS’ data set provided in Canvas.


1.2 First, we will investigate the variable ‘Height’.
1.3 Create a histogram for ‘Height’. Compute the class width (bucket size in Google Sheets) using
Sturges’ approximation formula.
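
As a rough illustration of the class-width calculation, the sketch below applies Sturges' approximation, k = 1 + 3.322 log10(n), and divides the range by k. The function name `sturges_class_width`, the choice to round k up, and the sample heights are assumptions made for illustration; they are not taken from ClassData_FA2016.XLS.

```python
import math

def sturges_class_width(values):
    """Return (number of classes, class width) using Sturges' approximation."""
    n = len(values)
    # Sturges' approximation: k = 1 + 3.322 * log10(n).
    # Rounding up to a whole number of classes is an assumed convention.
    k = math.ceil(1 + 3.322 * math.log10(n))
    # Class width = range / number of classes.
    width = (max(values) - min(values)) / k
    return k, width

# Hypothetical heights in inches, for illustration only.
sample_heights = [62, 64, 65, 66, 67, 67, 68, 69, 70, 72, 74, 75]
k, width = sturges_class_width(sample_heights)
print(k, round(width, 2))
```

The resulting width can be entered as the bucket size when customizing the histogram; in practice it is often rounded to a convenient value.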

1.5 Now, let us remove some of the ridiculous data points! Make a boxplot of ‘Height’.

1.6 How many outliers can you see? 5

1.7 Find out which observations represent the five outliers that have a height of 10 inches or less –
(5.085, 5.4, 5.5, 5.5, 10)

1.8 After removing the outliers, create a new histogram again. How would you describe the shape
of the distribution? Right-skewed

1.9 Compute the other numerical measures as well.

What is the mean? 67.338

What is the standard deviation? 3.992

Complete the 5-number summary below:


Minimum = 54
Q1 = 65
Median = 67
Q3 = 70
Maximum = 78

Write a sentence that interprets the median.

1.10 Give the value that completes the following sentence.

About 1/4 of the students are less than _65 inches tall.

1.11 Give the value that completes the following sentence.

About 1/4 of the students are more than _70 inches tall.

1.12 What is an interval that describes the middle one half of the students’ heights?

1.13 Calculate the Inter-Quartile Range: IQR = Q3 – Q1
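
The last two items follow directly from the quartiles in the 5-number summary reported in 1.9. A minimal sketch of the arithmetic, using those reported values (the variable names are illustrative only):

```python
# Quartiles reported in the 5-number summary for 'Height' (inches).
q1, median, q3 = 65, 67, 70

# The middle one half of the heights lies between Q1 and Q3.
middle_half = (q1, q3)

# IQR = Q3 - Q1
iqr = q3 - q1

print(middle_half, iqr)
```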

1.14 Now let’s compare the variable ‘Height’ for the different genders. Create side by side
boxplots:
Now, considering only the boxplots from ‘Male’ and ‘Female’:

a. Which gender, M or F, has the higher median?

b. Which gender, M or F, has a larger middle box?

c. Which gender, M or F, has more outliers?

d. Are there any other noticeable differences between genders in their distribution
of height?

1.15 Creating a side-by-side boxplot like this one is one of the first steps in answering the
following question: Is there a statistically significant difference in height between college-aged
men and women? More on this later in the semester.

Exercise 2

2.1 The following table shows the film lengths (in minutes) of a sample of videotape versions of
n = 22 films directed by Alfred Hitchcock. Films are listed in alphabetical order.

Film                         Length (min)   Film                     Length (min)
The Birds                    119            Psycho                   108
Dial M for Murder            105            Rear Window              113
Family Plot                  120            Rebecca                  132
Foreign Correspondent        120            Rope                     81
Frenzy                       116            Shadow of a Doubt        108
I Confess                    108            Spellbound               111
The Man Who Knew Too Much    120            Strangers on a Train     101
Marnie                       130            To Catch a Thief         103
North by Northwest           136            Topaz                    126
Notorious                    103            Under Capricorn          117
The Paradine Case            116            Vertigo                  128

2.2 Calculate the five-number summary statistics (minimum, maximum, first quartile (Q1), second
quartile (Q2), and third quartile (Q3)) for this data. The data below are ordered from minimum to
maximum.
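
One way to check the five-number summary outside of Google Sheets is sketched below. It uses the `inclusive` method of Python's `statistics.quantiles`, which interpolates linearly between order statistics; this is assumed to match the spreadsheet's QUARTILE function, and other quartile conventions can give slightly different values. The helper name `five_number_summary` is illustrative only.

```python
import statistics

# Film lengths (minutes) from the table in 2.1.
film_lengths = [119, 105, 120, 120, 116, 108, 120, 130, 136, 103, 116,
                108, 113, 132, 81, 108, 111, 101, 103, 126, 117, 128]

def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) using inclusive quartiles."""
    ordered = sorted(values)
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

print(five_number_summary(film_lengths))
```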

2.3 Calculate the interquartile range (IQR) for this data.

2.4 Calculate the lower and upper fences for a boxplot of this data using the IQR from part
2.3. Recall that the lower fence is at position Q1 – 1.5×IQR with the upper fence at Q3 +
1.5×IQR.

2.5 Use the fences from part 2.4 to determine the data values for the endpoint of the lower
whisker, the endpoint of the upper whisker, and outliers (if any). Outliers are points beyond the
fences.
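
The fence and whisker rules in 2.4 and 2.5 can be sketched as follows. The quartiles again use the `inclusive` interpolation method, which is an assumption about the convention in use; the variable names are illustrative.

```python
import statistics

# Film lengths (minutes) from the table in 2.1.
film_lengths = [119, 105, 120, 120, 116, 108, 120, 130, 136, 103, 116,
                108, 113, 132, 81, 108, 111, 101, 103, 126, 117, 128]

ordered = sorted(film_lengths)
q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
iqr = q3 - q1

# Fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Outliers are points beyond the fences.
outliers = [x for x in ordered if x < lower_fence or x > upper_fence]

# Each whisker ends at the most extreme data value still inside the fences.
inside = [x for x in ordered if lower_fence <= x <= upper_fence]
lower_whisker, upper_whisker = inside[0], inside[-1]

print(iqr, lower_fence, upper_fence)
print(lower_whisker, upper_whisker, outliers)
```

Note that the whiskers stop at actual data values, not at the fences themselves.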

2.6 Print the descriptive statistics values such as the mean, median, mode (if there are
any), standard deviation, and variance.
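
A minimal sketch of these descriptive statistics using Python's standard `statistics` module is given below. It assumes the sample (n - 1) versions of the standard deviation and variance, which correspond to STDEV and VAR in a spreadsheet; use `pstdev` and `pvariance` if the population versions are wanted.

```python
import statistics

# Film lengths (minutes) from the table in 2.1.
film_lengths = [119, 105, 120, 120, 116, 108, 120, 130, 136, 103, 116,
                108, 113, 132, 81, 108, 111, 101, 103, 126, 117, 128]

mean = statistics.mean(film_lengths)
median = statistics.median(film_lengths)
# multimode returns every value tied for the highest frequency.
modes = sorted(statistics.multimode(film_lengths))
stdev = statistics.stdev(film_lengths)        # sample standard deviation
variance = statistics.variance(film_lengths)  # sample variance

print(round(mean, 3), median, modes)
print(round(stdev, 3), round(variance, 3))
```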

2.7 Construct a boxplot using the statistics from parts 2.1 – 2.4. In the plot, make sure to include
and label the following: Q1, Q2, Q3, IQR, lower whisker endpoint, upper whisker endpoint, and
outliers (if any).

Exercise 3

5.1 Temperatures in the cities of Math Village and Stat Village are greatest in the month of
August. The highest temperature, in degrees Fahrenheit, in Math Village for each August from
1980 to 2021 is given below. The temperatures are sorted from minimum to maximum over this
42-year period.

69.1 72.3 75.6 77.6 78.1 79.0 79.1 79.7 81.8 82.5 83.1 83.5 83.5 83.8

84.1 84.4 84.6 84.8 85.7 86.5 86.7 86.8 87.3 87.3 87.5 87.7 87.8 88.0

88.2 88.3 88.4 88.5 88.7 89.1 89.2 89.3 89.5 89.5 89.5 89.7 89.8 89.8

The highest temperature, in degrees Fahrenheit, in Stat Village for each August from 1980 to 2021
is given below. The temperatures are sorted from minimum to maximum over this 42-year period.

70.1 70.1 70.1 70.4 71.7 72.4 72.7 72.9 73.7 75.0 77.4 78.2 78.7 78.9
79.2 79.5 79.6 79.8 79.8 80.0 80.1 80.1 80.2 84.8 85.0 86.1 86.4 88.3

89.1 90.4 90.4 91.6 92.2 93.2 94.5 97.7 98.6 98.7 98.7 99.5 100.6 102.0

5.2 Create comparison boxplots for the highest temperature in Math Village versus Stat Village in
August from 1980 to 2021. Use a meaningful title and correctly label the axes with units.
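
Before drawing the comparison boxplots, it can help to tabulate the summary values each box will display. The sketch below does this for both villages; the helper name `summarize` and the `inclusive` quartile method are assumptions, and a different quartile convention may shift Q1 and Q3 slightly.

```python
import statistics

math_village = [
    69.1, 72.3, 75.6, 77.6, 78.1, 79.0, 79.1, 79.7, 81.8, 82.5, 83.1, 83.5, 83.5, 83.8,
    84.1, 84.4, 84.6, 84.8, 85.7, 86.5, 86.7, 86.8, 87.3, 87.3, 87.5, 87.7, 87.8, 88.0,
    88.2, 88.3, 88.4, 88.5, 88.7, 89.1, 89.2, 89.3, 89.5, 89.5, 89.5, 89.7, 89.8, 89.8,
]
# The sixth value is taken as 72.4, consistent with the sorted order.
stat_village = [
    70.1, 70.1, 70.1, 70.4, 71.7, 72.4, 72.7, 72.9, 73.7, 75.0, 77.4, 78.2, 78.7, 78.9,
    79.2, 79.5, 79.6, 79.8, 79.8, 80.0, 80.1, 80.1, 80.2, 84.8, 85.0, 86.1, 86.4, 88.3,
    89.1, 90.4, 90.4, 91.6, 92.2, 93.2, 94.5, 97.7, 98.6, 98.7, 98.7, 99.5, 100.6, 102.0,
]

def summarize(temps):
    """Return the values a boxplot displays, plus the IQR."""
    q1, q2, q3 = statistics.quantiles(temps, n=4, method="inclusive")
    return {"min": min(temps), "Q1": q1, "median": q2,
            "Q3": q3, "max": max(temps), "IQR": q3 - q1}

for name, temps in (("Math Village", math_village), ("Stat Village", stat_village)):
    print(name, {key: round(value, 2) for key, value in summarize(temps).items()})
```

The mean is deliberately left out of `summarize`, since a standard boxplot does not show it.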

5.3 Given the comparison boxplots in part 5.2, answer the following true/false questions about the
data from both villages.

a. The temperatures are more variable for Stat Village than Math Village.

b. The temperatures in Stat Village are positively skewed.

c. Stat Village has a greater median temperature for those 42 years than Math Village.

d. Stat Village has a smaller IQR than Math Village.

e. It is obvious from the boxplots that Stat Village’s mean temperature for those 42 years
is less than Math Village’s mean temperature.

f. The lower whisker endpoint for Stat Village is less than the lower whisker endpoint
for Math Village.

g. Stat Village’s second quartile is less than Math Village’s first quartile.

h. If you prefer August high temperatures that are consistently around 85 degrees Fahrenheit,
then you should move to Stat Village.

6. Data and Results

Exercise 1

1.5
1.1

1.8 1.9

Exercise 2

2.2 – 2.6

Exercise 3

2.7

5.3: A. TRUE B. TRUE C. FALSE D. FALSE E. FALSE F. TRUE G. TRUE H. FALSE

7. Conclusion

Describing data numerically means computing summary statistics that capture the essential
features of a dataset: its central tendency, variability, and distribution shape. The mean, which is
the sum of all data points divided by the number of points, describes the average or typical value
of the dataset, while the median is the middle value when the data are sorted in ascending or
descending order. The mode is the value or values that appear most frequently in the dataset. The
range, variance, and standard deviation quantify the spread or dispersion of the data, with larger
values indicating higher variability. Skewness and kurtosis measure the distribution's asymmetry
and tailedness, respectively, providing insight into its shape. Numerical summaries make it
possible to communicate complicated datasets effectively, and they make it easier to compare
data, spot trends, and draw defensible conclusions from the quantitative characteristics of the
information.

Name: Acosta, Arthur James U.        Laboratory Activity No.: 3
Program: BSEE                        Section: MATH027A-EE22S1
Date Performed: March 13, 2024       Date Submitted: March 15, 2024
Instructor: Engr. Raffy Garcia

8. Assessment (Rubric for Laboratory Performance):

CRITERIA, scored as BEGINNER (1), ACCEPTABLE (2), or PROFICIENT (3), with a SCORE for each criterion

I. Laboratory Skills

Manipulative Skills
    Beginner (1): Members do not demonstrate needed skills.
    Acceptable (2): Members occasionally demonstrate needed skills.
    Proficient (3): Members always demonstrate needed skills.

Experimental Set-up
    Beginner (1): Members are unable to set up the materials.
    Acceptable (2): Members are able to set up the materials with supervision.
    Proficient (3): Members are able to set up the materials with minimum supervision.

Process Skills
    Beginner (1): Members do not demonstrate targeted process skills.
    Acceptable (2): Members occasionally demonstrate targeted process skills.
    Proficient (3): Members always demonstrate targeted process skills.

Safety Precautions
    Beginner (1): Members do not follow safety precautions.
    Acceptable (2): Members follow safety precautions most of the time.
    Proficient (3): Members follow safety precautions at all times.

II. Work Habits

Time Management / Conduct of Experiment
    Beginner (1): Members do not finish on time with incomplete data.
    Acceptable (2): Members finish on time with incomplete data.
    Proficient (3): Members finish ahead of time with complete data and time to revise data.

Cooperative and Teamwork
    Beginner (1): Members do not know their tasks and have no defined responsibilities. Group
    conflicts have to be settled by the teacher.
    Acceptable (2): Members have defined responsibilities most of the time. Group conflicts are
    cooperatively managed most of the time.
    Proficient (3): Members are on tasks and have defined responsibilities at all times. Group
    conflicts are cooperatively managed at all times.

Neatness and Orderliness
    Beginner (1): Messy workplace during and after the experiment.
    Acceptable (2): Clean and orderly workplace with occasional mess during and after the
    experiment.
    Proficient (3): Clean and orderly workplace at all times during and after the experiment.

Ability to do independent work
    Beginner (1): Members require supervision by the teacher.
    Acceptable (2): Members require occasional supervision by the teacher.
    Proficient (3): Members do not need to be supervised by the teacher.

Other Comments / Observations:
TOTAL SCORE

Rating = (Total Score / 24) × 50 + 50%
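
As a small worked example of the rating formula, the sketch below assumes the result is read as a percentage, so that a perfect total of 24 (eight criteria at 3 points each) maps to 100%. The function name `rating` is illustrative only.

```python
def rating(total_score, max_score=24):
    """Rating in percent: (Total Score / 24) x 50 + 50."""
    if not 0 <= total_score <= max_score:
        raise ValueError("total_score must be between 0 and max_score")
    return total_score / max_score * 50 + 50

print(rating(24))  # perfect score
print(rating(12))  # half of the available points
```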
