
UNIT 3

TOPIC 1
BIVARIATE
DATA ANALYSIS
Miss Perry
BIVARIATE DATA
Analysing bivariate data is the study of two
variables (explanatory and response) at the same
time to determine if a relationship exists.
CATEGORICAL DATA

Data in categories
e.g. model of car, response to a question

Two-way frequency tables are used to organise the data of
two categorical variables.
EXAMPLE 4

A sports carnival had 125 attendees, who were each given a
free snack at morning tea: a choice of an ice block or an
ice cream. Of the 115 students, 28 chose an ice block,
while 2 teachers chose ice cream.

What is the explanatory variable?


What is the response variable?
EXAMPLE 5

A sports carnival had 125 attendees, who were each given a free snack
at morning tea: a choice of an ice block or an ice cream.
Of the 115 students, 28 chose an ice block, while 2 teachers
chose ice cream.

Explanatory variable: Type of attendee (student/teacher)

Response variable: Type of snack (ice block/ice cream)
EXAMPLE 6

A sports carnival had 125 attendees, who were each given a free snack
at morning tea: a choice of an ice block or an ice cream.
Of the 115 students, 28 chose an ice block, while 2 teachers
chose ice cream.

Fill in the information we know.


Students Teachers Total

Ice Cream

Ice Block

Total
EXAMPLE 7

A sports carnival had 125 attendees, who were each given a free snack
at morning tea: a choice of an ice block or an ice cream.
Of the 115 students, 28 chose an ice block, while 2 teachers
chose ice cream.

Calculate and find the missing data.


            Students   Teachers   Total
Ice Cream      ?          2         ?
Ice Block      28         ?         ?
Total          115        ?         125
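
Each missing cell can be found by subtracting from a row or column total. A minimal Python sketch of that arithmetic, using only the four numbers given in the problem (the variable names are illustrative and not from the slides):

```python
# Known values from the problem statement; everything else is derived.
total_attendees = 125
total_students = 115
students_ice_block = 28
teachers_ice_cream = 2

# Each missing cell follows from a row or column total.
total_teachers = total_attendees - total_students          # 125 - 115
students_ice_cream = total_students - students_ice_block   # 115 - 28
teachers_ice_block = total_teachers - teachers_ice_cream
total_ice_cream = students_ice_cream + teachers_ice_cream
total_ice_block = students_ice_block + teachers_ice_block

# Rows hold (students, teachers, total).
table = {
    "Ice Cream": (students_ice_cream, teachers_ice_cream, total_ice_cream),
    "Ice Block": (students_ice_block, teachers_ice_block, total_ice_block),
    "Total": (total_students, total_teachers, total_attendees),
}

print(f"{'':<10}{'Students':>10}{'Teachers':>10}{'Total':>8}")
for label, (students, teachers, total) in table.items():
    print(f"{label:<10}{students:>10}{teachers:>10}{total:>8}")
```

A quick check is that the two snack totals add up to the 125 attendees.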
EXAMPLE 8

Let’s convert this data into percentages so that the two
groups can be compared proportionally.

            Students   Teachers   Total
Ice Cream      87         2         89
Ice Block      28         8         36
Total          115        10        125
EXAMPLE 9

Use the totals of the explanatory variable (the column
totals) to calculate the percentages.

            Students   Teachers   Total
Ice Cream                           89
Ice Block                           36
Total          115        10        125
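
Because the explanatory variable (type of attendee) defines the columns, each count is divided by its own column total. A minimal Python sketch, assuming the counts from the completed two-way table (87, 28, 2, 8); the helper name `column_percentages` is illustrative only:

```python
# Counts from the completed two-way table, grouped by the explanatory variable.
counts = {
    "Students": {"Ice Cream": 87, "Ice Block": 28},
    "Teachers": {"Ice Cream": 2, "Ice Block": 8},
}

def column_percentages(column):
    """Convert one column of counts to percentages of that column's total."""
    column_total = sum(column.values())
    return {snack: round(100 * n / column_total, 2) for snack, n in column.items()}

percentages = {group: column_percentages(col) for group, col in counts.items()}

for group, col in percentages.items():
    for snack, pct in col.items():
        print(f"{group:<9}{snack:<10}{pct:>6.2f}%")
```

Each column of percentages should sum to 100%, which makes the student and teacher columns directly comparable even though the group sizes differ.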
EXAMPLE 10

Draw a conclusion: Can you identify any patterns and
whether an association exists?

            Students   Teachers
Ice Cream    75.65%      20%
Ice Block    24.35%      80%
Total        100%        100%
EXAMPLE 11

In this sample of 125 attendees:

• A higher proportion of teachers (80%) than students
(24.35%) chose ice blocks.
• A higher proportion of students (75.65%) than teachers
(20%) chose ice cream.
Therefore, this data indicates that there is an association
between the type of attendee and snack choice.

            Students   Teachers
Ice Cream    75.65%      20%
Ice Block    24.35%      80%
Total        100%        100%
NUMERICAL DATA

Data in numbers
e.g. height and time

Linear regression analysis



LINEAR REGRESSION ANALYSIS


1. Construct a scatterplot
2. Interpret the scatterplot
3. Calculate Pearson’s correlation coefficient ($r$)
4. Fit a least-squares line
5. Make predictions
6. Check validity with the coefficient of determination ($r^2$)
7. Check the linear assumption with a residual plot

1. CONSTRUCT A SCATTERPLOT
Hours of Study  5     0.75  2.5   4     5.5   9     3     7     3.5   0     6.5   1.5   2     7.5   8.25
Results (%)     62    48    54    61    79    100   64    85    72    38    75    44    52    81    93

X-AXIS: EXPLANATORY VARIABLE
Y-AXIS: RESPONSE VARIABLE

1. CONSTRUCT A SCATTERPLOT
Hours of Study  5     0.75  2.5   4     5.5   9     3     7     3.5   0     6.5   1.5   2     7.5   8.25
Results (%)     62    48    54    61    79    100   64    85    72    38    75    44    52    81    93

X-AXIS: Amount of Study (hours)
Y-AXIS: Results (%)

1. CONSTRUCT A SCATTERPLOT
[Scatterplot: The relationship between the amount of hours students
study per week and their results in a General Maths Exam.
x-axis: Hours of Study per week (0 to 10); y-axis: Result (%) (0 to 120)]



2. INTERPRET THE SCATTERPLOT

Form: linear/non-linear
Strength: strong/moderate/weak/no association
Direction: positive/negative
Outliers: yes/no

2. INTERPRET THE SCATTERPLOT

[Scatterplot: The relationship between the amount of hours students
study per week and their results in a General Maths Exam.
x-axis: Hours of Study per week (0 to 10); y-axis: Result (%) (0 to 120)]

Form: linear
Strength: strong
Direction: positive
Outliers: no


3. CALCULATE PEARSON’S CORRELATION COEFFICIENT ($r$)
The value of $r$ is between -1 and 1 and measures the strength and
direction of linear association.

$n$ = total number of data points
$x$ = explanatory variable
$y$ = response variable
$s_x$ = standard deviation of the x variable
$s_y$ = standard deviation of the y variable
$\bar{x}$ = mean of the x values
$\bar{y}$ = mean of the y values
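
The symbols listed above match the usual sample form of Pearson's correlation coefficient. Assuming that is the formula this slide presented, and writing $x_i$ and $y_i$ for the individual values of the explanatory and response variables, it can be written as:

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each bracket is a standardised value, so $r$ is the sum of the products of the standardised x and y values divided by $n - 1$. This form assumes $s_x$ and $s_y$ are sample standard deviations.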
3. CALCULATE PEARSON’S CORRELATION COEFFICIENT ($r$)

Hours of Study  5     0.75  2.5   4     5.5   9     3     7     3.5   0     6.5   1.5   2     7.5   8.25
Results (%)     62    48    54    61    79    100   64    85    72    38    75    44    52    81    93

$n$ = total number of data points =

$\bar{x}$ = mean of the x values =

$\bar{y}$ = mean of the y values =


3. CALCULATE PEARSON’S CORRELATION COEFFICIENT ($r$)

$n$ = total number of data points = 15

$x$ = explanatory variable

$y$ = response variable

$s_x$ = standard deviation of the x variable =

$s_y$ = standard deviation of the y variable =

$\bar{x}$ = mean of the x values = 4.4

$\bar{y}$ = mean of the y values = 67.2
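
The remaining quantities can be checked with a short script. This is a minimal Python sketch, assuming the sample form r = 1/(n - 1) × Σ((x - x̄)/s_x)((y - ȳ)/s_y) with sample standard deviations (n - 1 divisor). The function name `pearson_r` is illustrative, and a graphics calculator or spreadsheet would normally be used in class:

```python
from statistics import mean, stdev

# Data from the slide: hours of study (x) and exam results in % (y).
hours = [5, 0.75, 2.5, 4, 5.5, 9, 3, 7, 3.5, 0, 6.5, 1.5, 2, 7.5, 8.25]
results = [62, 48, 54, 61, 79, 100, 64, 85, 72, 38, 75, 44, 52, 81, 93]

def pearson_r(x, y):
    """Pearson's r as the sum of products of standardised values over n - 1."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # Assumption: sample standard deviations (n - 1 divisor).
    s_x, s_y = stdev(x), stdev(y)
    products = (((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
                for xi, yi in zip(x, y))
    return sum(products) / (n - 1)

n = len(hours)
x_bar, y_bar = mean(hours), mean(results)
s_x, s_y = stdev(hours), stdev(results)
r = pearson_r(hours, results)

print(f"n = {n}, mean x = {x_bar:.1f}, mean y = {y_bar:.1f}")
print(f"s_x = {s_x:.3f}, s_y = {s_y:.3f}, r = {r:.3f}")
```

A value of r close to 1 agrees with the scatterplot reading of a strong, positive, linear association.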



4. FIT A LEAST-SQUARES LINE

$y = a + bx$

The intercept ($a$) estimates the average
value of the response variable ($y$) when
the explanatory variable ($x$) = 0.

$x$ = explanatory variable
$a$ = the y-intercept of the line
$b$ = the slope/gradient of the line, where $b = r \frac{s_y}{s_x}$
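
The slope formula can be applied directly to the study-hours data. A minimal Python sketch, assuming the line has the form y = a + bx with slope b = r × s_y / s_x, and that the intercept is found from a = ȳ - b × x̄ (the usual companion formula, which does not appear in the surviving slide text). The function names are illustrative:

```python
from statistics import mean, stdev

# Data from the slides: hours of study (x) and exam results in % (y).
hours = [5, 0.75, 2.5, 4, 5.5, 9, 3, 7, 3.5, 0, 6.5, 1.5, 2, 7.5, 8.25]
results = [62, 48, 54, 61, 79, 100, 64, 85, 72, 38, 75, 44, 52, 81, 93]

def least_squares_line(x, y):
    """Return (a, b, r) for the line y = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations
    r = sum((xi - x_bar) * (yi - y_bar)
            for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x        # slope, as on the slide
    a = y_bar - b * x_bar    # intercept (assumed companion formula)
    return a, b, r

a, b, r = least_squares_line(hours, results)

def predict(hours_of_study):
    """Predicted result (%) for a given number of study hours."""
    return a + b * hours_of_study

print(f"Result = {a:.2f} + {b:.2f} x Hours of study")
print(f"r^2 = {r ** 2:.3f}")
print(f"Predicted result for 4 hours: {predict(4):.1f}%")
```

Predictions are most trustworthy for x values inside the observed range (0 to 9 hours here). The intercept is the predicted result for 0 hours of study, and r² gives the proportion of variation in results explained by the line.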