Parth Sharma (A1004820106) Vardan Malik (A1004820147) Kanishka Bisht (A1004820145) Harshita Gupta (A1004820132)
BY:-
Parth Sharma(A1004820106)
Vardan Malik(A1004820147)
Kanishka Bisht(A1004820145)
Harshita Gupta(A1004820132)
Module 1
Measures of Central Tendency
• What is a measure of central tendency?
• Measures of Central Tendency
– Mode
– Median
– Mean
• Shape of the Distribution
• Considerations for Choosing an Appropriate Measure of Central Tendency
What is a Measure of Central Tendency?
TOTAL 21
Median Exercise #2 (N is even)
Calculate the median for this hypothetical
distribution:
Satisfaction with Health    Frequency
Very High                   5
High                        7
Moderate                    6
Low                         7
Very Low                    3
TOTAL                       28
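One way to check the exercise above is to walk the cumulative frequencies until the middle cases are reached. The sketch below is illustrative only; the function name `ordinal_median` is an assumption, not part of the original slides. With N even there are two middle cases, and the function reports the category of each.

```python
def ordinal_median(categories, freqs):
    """Return the category of each middle case.

    categories must be ordered from lowest to highest.
    With N odd there is one middle case; with N even there are two.
    """
    n = sum(freqs)
    # 1-based positions of the middle case(s)
    positions = [(n + 1) // 2] if n % 2 else [n // 2, n // 2 + 1]
    result = []
    for pos in positions:
        cumulative = 0
        for category, f in zip(categories, freqs):
            cumulative += f
            if pos <= cumulative:
                result.append(category)
                break
    return result


# Frequencies from the exercise table, reordered from lowest to highest
categories = ["Very Low", "Low", "Moderate", "High", "Very High"]
freqs = [3, 7, 6, 7, 5]

middle = ordinal_median(categories, freqs)
print(middle)  # categories of the 14th and 15th cases
```

Here N = 28, so the middle cases are the 14th and 15th. Both fall in the same category, so that category is the median.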
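Finding the Median in Grouped Data
$$\text{Median} = L + \left(\frac{N(0.5) - Cf}{f}\right) w$$
where L = lower limit of the interval containing the median, Cf = cumulative frequency below that interval, f = frequency of that interval, w = interval width, and N = total number of cases.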
Percentiles
• A score below which a specific percentage of
the distribution falls.
• Finding percentiles in grouped data:
$$25\% = L + \left(\frac{N(0.25) - Cf}{f}\right) w$$
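The percentile formula can be sketched in code as follows. This is a minimal illustration, not the slides' own method: the function name `grouped_percentile` and the interval data are hypothetical, and L is taken as the lower limit of the interval that contains the target case. Setting p = 0.5 gives the grouped median.

```python
def grouped_percentile(intervals, p):
    """Percentile from grouped data: L + ((N*p - Cf) / f) * w.

    intervals: list of (lower_limit, width, frequency), sorted ascending.
    p: proportion between 0 and 1 (0.25 for the 25th percentile).
    """
    n = sum(f for _, _, f in intervals)
    target = n * p
    cf = 0  # cumulative frequency below the current interval
    for lower, width, f in intervals:
        if f > 0 and cf + f >= target:
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("p must be between 0 and 1")


# Hypothetical grouped distribution (not from the slides), N = 20
data = [(0, 10, 4), (10, 10, 6), (20, 10, 8), (30, 10, 2)]

q1 = grouped_percentile(data, 0.25)
median = grouped_percentile(data, 0.5)
print(q1, median)
```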
The Mean
• The arithmetic average obtained by
adding up all the scores and dividing
by the total number of scores.
Formula for the Mean
$$\bar{Y} = \frac{\sum Y}{N}$$
• “Y bar” equals the sum of all the scores, Y, divided by the number of scores, N.
Calculating the mean with grouped scores
$$\bar{Y} = \frac{\sum fY}{N}$$
• where: fY = a score multiplied by its frequency
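A short sketch of the grouped-score calculation, using hypothetical scores and frequencies (not taken from the slides). The function name `grouped_mean` is illustrative. The result matches the ordinary mean of the same scores written out one by one.

```python
def grouped_mean(scores, freqs):
    """Mean of grouped scores: sum of f*Y divided by N, where N = sum of f."""
    n = sum(freqs)
    sum_fy = sum(f * y for y, f in zip(scores, freqs))
    return sum_fy / n


# Hypothetical data: each score Y with its frequency f
scores = [1, 2, 3, 4, 5]
freqs = [2, 3, 5, 6, 4]

mean_grouped = grouped_mean(scores, freqs)

# Same scores listed individually, for comparison with the ungrouped formula
expanded = [y for y, f in zip(scores, freqs) for _ in range(f)]
mean_raw = sum(expanded) / len(expanded)

print(mean_grouped, mean_raw)
```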
Mean: Grouped Scores
Grouped Data: the Mean & Median
• Calculate the median and mean for the grouped frequency distribution below.
04/22/22 18
Illustrative Data (Doll, 1955)
Per capita cigarette consumption (X) vs. lung cancer mortality per 100,000 in 1950 (Y)
n = 11
Scatterplot
Assess:
• Form
• Direction of association
• Outliers
• Strength of relation
Doll, 1955
• Form: linear
• Direction: positive association
• Outliers: no clear outliers
• Strength: difficult to determine by eye
Correlation Coefficient r
• r ≡ Pearson’s product-moment correlation coefficient (Karl Pearson, 1857-1936)
• Measures degree to which X and Y “go together”
• Always between −1 and 1
• r ≈ 0 no correlation
• r > 0 positive correlation
• r < 0 negative correlation
• Closer r is to 1 or −1, the stronger the correlation
Coefficient of determination (r²)
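As an illustration of how r and its square could be computed, here is a minimal sketch from the sums of squares and cross-products. The data are hypothetical (the Doll data table is not reproduced here), and `pearson_r` is an illustrative name.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination

print(round(r, 3), round(r_squared, 3))
```

In simple linear regression, the squared value is read as the proportion of the variance in Y that is accounted for by X.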
Cautions
• Outliers
• Non-linear relations
• Confounding (correlation is NOT causation)
• Randomness
Outliers
Outliers can have profound influence on r
Linear Relations Only
r = 0.00
This strong relationship is missed by r because it is not linear.
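The point can be reproduced with a small constructed example. In the sketch below (hypothetical data, illustrative function name), Y is determined exactly by X through Y = X², yet r comes out as zero because the relation is symmetric around the mean of X rather than linear.

```python
import math


def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Constructed data: Y depends perfectly on X, but not linearly
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r_curved = pearson_r(x, y)
print(r_curved)
```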
Least Squares Line
Residual ≡ distance of data point from regression line (dotted)
ŷ = 6.756 + 0.0228 ∙ X
Slope = “rise over run”: 0.0228 increase in Y per unit X
Intercept = 6.756
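A least squares line can be fitted by hand from the same sums used for r. The sketch below uses hypothetical data rather than the Doll (1955) values, and the name `least_squares` is illustrative. It computes the slope as the sum of cross-products divided by the sum of squares of X, then the intercept, then the residuals.

```python
def least_squares(x, y):
    """Return (intercept a, slope b) of the least squares line y-hat = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope: change in y-hat per unit of x
    a = mean_y - b * mean_x  # intercept: y-hat when x = 0
    return a, b


# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares(x, y)

# Residual = observed y minus predicted y (vertical distance from the line)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(round(a, 3), round(b, 3))
print([round(e, 3) for e in residuals])
```

The residuals of a least squares line with an intercept sum to zero, which is a quick check on the arithmetic.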
Population Regression Model
$$y_i = \alpha + \beta x_i + \varepsilon_i$$
where
• α ≡ intercept parameter
• β ≡ slope parameter
• εi ≡ residual error, observation i
Objective:
To estimate β with (1 – α)100% confidence (here α is the significance level, not the intercept parameter)
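The slides do not show the interval itself. As a sketch, the usual textbook confidence interval for the slope takes the following form, where b is the sample slope, n is the number of paired observations, and the symbols $SE_b$ and $s_{Y|X}$ are notation assumed here rather than taken from the slides:

```latex
b \pm t_{n-2,\;1-\alpha/2} \cdot SE_b,
\qquad
SE_b = \frac{s_{Y|X}}{\sqrt{\sum_i (x_i - \bar{x})^2}},
\qquad
s_{Y|X} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n - 2}}
```

The α in the t quantile is the significance level (for example, 0.05 for 95% confidence), and the degrees of freedom are n − 2 because both the intercept and the slope are estimated from the data.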
THANK YOU!