
Lab – Basic Data Analysis

Objectives
Use very simple methods to describe existing data, fill in missing data values, and make simple predictions.
Part 1: Learn how to Use Data as Information
Part 2: Plot data and predict values

Background/Scenario
Data is meaningless in and of itself. Information is meaningful and useful. Data only becomes information
when it is used in context to answer specific questions. In this lab, you will use graphs of existing data to
fill in missing values and to predict values based on trends.

Required Resources
• PC or mobile device with Internet access
• Browser capable of playing a video from the Internet
• Audio capability to listen to video narration

Part 1: Learn how to Use Data as Information

Data analysis can occur in many different ways. The ultimate goal is to discover something in the data that
gives insight into what has happened or to predict what may happen in the future. Descriptive statistics
summarize what has happened and present the data numerically or graphically. Predictive analytics
answers the question of what may happen in the future based on past data.

Step 1: Describe the data.


a. Review the data chart shown on Worksheet 1.
b. Over the 110-year period for the data set, what is the range (lower and upper limits) of median ages for
males and females at first marriage?
Male median age: 22.8 to 26.8
Female median age: 20.3 to 25.1
c. The range is a type of descriptive statistic that summarizes the data. It has been presented above in a
numerical format. But if you need to look at the trend of the data, it may be better to graph or plot the
data.

Step 2: Plot the data.


a. Plot the data on the chart that is found on Worksheet 1 at the end of this lab. Use different colors for the
male and female data. Do not connect the data points until you are told to do so in the video in Step
3.

© 2020 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page 1 of 7
Lab – Explore Sources of Open Data

Step 3: Perform simple data prediction.


a. Khan Academy is an excellent source for lessons on a wide range of topics including statistics and
statistical concepts. You will watch a short video and follow along with the narrator using the data table
and chart provided at the end of this lab. Navigate to https://youtu.be/aVDiAGZmcPo and follow the
Predicting with Linear Models video provided by Khan Academy. Pause the video and complete the
activities along with the video instructor. Use the plot you created.
Watch the entire video. In this lab, you will work only with the first dataset that the instructor discusses. In the
video, the instructor demonstrates how to use data points as information to create new estimated data points.

[Figure: Median age of males and females at first marriage. Series: Males, Females. Vertical axis: 10 to 30.]

What is a linear model?


Using a line to describe a trend in the data.

The instructor in the video teaches the processes of interpolation and extrapolation as tools for estimating or
predicting data in a linear model. Define each term.

Interpolation: estimating what happened between two known data points.

Extrapolation: looking at the trend in the existing data points (for example, the last two), continuing
that trend beyond the known data, and estimating what might happen if the trend continues.
What are two interesting observations that the instructor in the video makes regarding the trends in the
median age of marriage and the ages of the males and females who marry?
1- The median age at first marriage for both males and females fell until about 1960 and then rose again.
That means that around 1960, both males and females were marrying at their youngest median ages.
2- The difference between the median ages of males and females became smaller and smaller up to 2000.


Part 2: Plot data and Predict Values


The simple linear model demonstrated in the video can also be applied to other social trends. In this
section of the lab, you will interpolate and extrapolate values from a new data set.
a. Plot the data on Worksheet 2 at the end of the lab. Use different colors for the two different variables.
b. Note that the data was collected at different intervals. Before 2000, the data was collected every ten
years; after 2000, it was collected every five years. Interpolate values for the three missing years.
Plot your values.

Missing Year    Women hours    Men hours
1970            27.8           5
1980            23             8
1990            20             10
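
As a cross-check on values read from the plot, the missing years can also be interpolated numerically. The sketch below assumes straight-line interpolation between the two neighboring survey years, using the 1965 to 1995 figures from the table in step d. The function name `interpolate` is illustrative and not part of the lab. Values read from a hand-drawn plot will usually differ slightly from the computed ones.

```python
# Observed hours per week, copied from the housework table in step d.
women = {1965: 31.9, 1975: 23.6, 1985: 20.7, 1995: 18.9}
men = {1965: 4.4, 1975: 6.0, 1985: 10.2, 1995: 10.2}


def interpolate(series, year):
    """Linearly interpolate a value for `year` between its two neighbors."""
    if year in series:
        return series[year]
    # Closest known years on either side of the target year.
    before = max(y for y in series if y < year)
    after = min(y for y in series if y > year)
    fraction = (year - before) / (after - before)
    return series[before] + fraction * (series[after] - series[before])


for missing_year in (1970, 1980, 1990):
    w = interpolate(women, missing_year)
    m = interpolate(men, missing_year)
    print(f"{missing_year}: women {w:.2f} h/week, men {m:.2f} h/week")
```

Because each missing year lies exactly halfway between two survey years, the interpolated value here is simply the average of its two neighbors.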

c. Extrapolate values for the year 2020 by creating a line that best summarizes the values for the previous
five periods.
For men it will be about 12.2 hours per week, and for women about 15.5 hours per week.
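
One way to make "a line that best summarizes the values" concrete is an ordinary least-squares fit. The sketch below is an assumption about method, not the lab's prescribed procedure: it fits a straight line to the five most recent observations (1995 to 2015) from the table in step d and evaluates that line at 2020. The helper name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# The five most recent observed periods from the housework table.
years = [1995, 2000, 2005, 2010, 2015]
men_hours = [10.2, 10.0, 9.2, 10.0, 9.8]
women_hours = [18.9, 18.6, 19.1, 17.4, 17.8]

for label, hours in (("men", men_hours), ("women", women_hours)):
    slope, intercept = fit_line(years, hours)
    estimate = slope * 2020 + intercept
    print(f"{label}: slope {slope:+.3f} h/week per year, 2020 estimate {estimate:.1f} h/week")
```

With these inputs the fitted lines are nearly flat, giving roughly 9.6 hours per week for men and 17.3 for women in 2020. A line drawn by eye, a fit over a different set of years, or a different trendline type can give noticeably different estimates, so an extrapolated value should always be reported together with the method used.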
d. Another kind of information that can be derived from this data is the gap between the number of
hours of housework done by men and the number done by women. This reveals another trend, regarding
equality between men and women over this period. Complete the table below by subtracting the number
of hours per week that men do housework from the number of hours per week that women do
housework.

Date    Men housework    Women housework    Women hours -
        hours/week       hours/week         Men hours
1965    4.4              31.9               27.5
1975    6                23.6               17.6
1985    10.2             20.7               10.5
1995    10.2             18.9               8.7
2000    10               18.6               8.6
2005    9.2              19.1               9.9
2010    10               17.4               7.4
2015    9.8              17.8               8
2020    12.2             15.5               3.3
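
The difference column can be checked with a few lines of code. This is a minimal sketch that recomputes the gap from the two hours columns in the table above; the name `housework_gap` is illustrative, and the 2020 row uses the extrapolated estimates rather than observed data.

```python
# (year, men hours/week, women hours/week) copied from the table above.
# The 2020 row is an extrapolated estimate, not an observation.
rows = [
    (1965, 4.4, 31.9),
    (1975, 6.0, 23.6),
    (1985, 10.2, 20.7),
    (1995, 10.2, 18.9),
    (2000, 10.0, 18.6),
    (2005, 9.2, 19.1),
    (2010, 10.0, 17.4),
    (2015, 9.8, 17.8),
    (2020, 12.2, 15.5),
]


def housework_gap(men, women):
    """Women's weekly hours minus men's, rounded to one decimal place."""
    return round(women - men, 1)


gaps = {year: housework_gap(men, women) for year, men, women in rows}
for year, gap in gaps.items():
    print(year, gap)
```

Rounding to one decimal place matches the precision of the source data and hides small floating-point artifacts from the subtraction.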


[Figure: HouseWork chart, 1960 to 2020, vertical axis 0 to 35. Series: Males and Females, each with an exponential trendline.]

e. Graph the calculated values on the chart provided on Worksheet 3 at the end of this lab.

What has been the trend for equality between men and women doing housework?
I don't believe there is equality in this area yet, since women still do more housework than men. The gap
has narrowed, though, from 27.5 hours per week in 1965 to about 8 hours per week in 2015.
If men and women did exactly the same amount of housework per week in 2020, where would the next
data point be plotted?
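At 0 on Worksheet 3, because the difference between women's and men's hours would be zero. (On the
housework chart, both lines would meet at the same value, roughly 13.5 hours per week.)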

In the IoT, Big Data comes from many sources. Sometimes values are missing because a sensor temporarily
lost connectivity or data points were lost in transmission. Interpolation is one strategy for replacing
missing data. Extrapolation is used to predict values for events that have not yet occurred. Because the IoT
yields so much data, predictive analytics models can be built that estimate future values by extrapolating
trends from historical data.
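
The idea of replacing missing sensor values by interpolation can be sketched as follows. The readings below are hypothetical and exist only for illustration; the function `fill_missing` is not from the lab, and it assumes evenly spaced samples with known first and last readings.

```python
def fill_missing(readings):
    """Replace None entries with values linearly interpolated from neighbors.

    Assumes evenly spaced samples and that the first and last readings are present.
    """
    filled = list(readings)
    for i, value in enumerate(readings):
        if value is None:
            # Indexes of the nearest known readings on each side of the gap.
            left = max(j for j in range(i) if readings[j] is not None)
            right = min(j for j in range(i + 1, len(readings)) if readings[j] is not None)
            fraction = (i - left) / (right - left)
            filled[i] = readings[left] + fraction * (readings[right] - readings[left])
    return filled


# Hypothetical temperature readings in degrees Celsius; None marks a lost sample.
temperatures = [21.0, 21.4, None, 22.2, None, None, 23.4]
print(fill_missing(temperatures))
```

Linear interpolation is only reasonable when the quantity changes smoothly between samples. For long outages or rapidly changing signals, the filled values may be misleading and are better flagged as estimates.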

Worksheet 1

Median Age of Males and Females at First Marriage By Year

[Blank chart for plotting. Horizontal axis: Year, 1890 to 2010. Vertical axis: Median Age, 0 to 30.]


Worksheet 2

[Blank chart for plotting. Horizontal axis: 1960 to 2020. Vertical axis: 0 to 35.]


Worksheet 3

Women hours - Men hours

[Blank chart for plotting. Horizontal axis: 1960 to 2020. Vertical axis: 0 to 30.]
