
BIOSTATISTICS

Professor Dr. Ejaz Ahmed Khan


Department of Community Medicine
Rahbar Medical and Dental College
Learning Outcome

• After the End of this Session Students will be able to:


– Explain Statistical Averages.
– Discuss Measures of Dispersion.

– Illustrate Normal Distribution.

– Explain Tests of Significance.


– Discuss Chi-Square Test.

– Explain Correlation and Regression.


Biostatistics
Statistical Averages
• Definition:
– An Average is Defined as the Number that Measures the Central
Tendency of a Given Set of numbers.
• Mean

• Median
• Mode
Mean

• Mean is the Average of the Given Numbers.

• It is also Referred to as an Expected Value.


• Calculated by Dividing the Sum of Given Numbers by the Total
Number of Numbers.
• Mean = (Sum of All the Observations/Total Number of Observations)
Find MEAN of: 5, 7, 11, 20, 10

MEAN = (5 + 7 + 11 + 20 + 10) / 5

= 53 / 5

= 10.6
Median

• It is the Middle Value of the Given List of Data when Arranged in Order.
• Half of the Data Points are Smaller than the MEDIAN and Half of the Data Points are Larger.
• To find the Median:
– Arrange the Data Points from Smallest to Largest.
– If the Number of Data Points is Odd, the Median is the Middle Data Point in the List.
– If the Number of Data Points is Even, the Median is the Average of the Two Middle Data Points.
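FIND MEDIAN

Data = 1, 7, 5, 4, 11, 3, 20, 10, 8
Arranged in Order = 1, 3, 4, 5, 7, 8, 10, 11, 20
MEDIAN = 7 (the 5th of 9 values)

or

Data = 1, 7, 5, 4, 11, 3, 6, 20, 10, 8
Arranged in Order = 1, 3, 4, 5, 6, 7, 8, 10, 11, 20
MEDIAN = (6 + 7) / 2 = 6.5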
Mode

• A Mode is Defined as the Value that has the Highest Frequency in a Given Set of Values.
• It is the Value that Appears the Most Number of Times.
• Place All Numbers in a Given Set in Order (Lowest to Highest or Highest to Lowest), and then Count How Many Times Each Number Appears in the Set. The One that Appears More Often than the Others is the Mode.
Find The Mode

Data = 1, 7, 5, 7, 1, 6, 7, 1, 4, 1
MODE = 1 (appears 4 times)

or

Data = 1, 7, 5, 7, 1, 6, 7, 1, 4, 1, 7
MODE = 1 and 7 (each appears 4 times)
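The three averages can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module on the example data from these slides; the variable names are illustrative only.

```python
from statistics import mean, median, multimode

# Mean: sum of the observations divided by their count.
values = [5, 7, 11, 20, 10]
average = mean(values)

# Median: the data are sorted internally before the middle is taken.
odd_list = [1, 7, 5, 4, 11, 3, 20, 10, 8]       # 9 values, one middle value
even_list = [1, 7, 5, 4, 11, 3, 6, 20, 10, 8]   # 10 values, two middle values
median_odd = median(odd_list)
median_even = median(even_list)

# Mode: multimode returns every value that shares the highest frequency.
one_mode = [1, 7, 5, 7, 1, 6, 7, 1, 4, 1]
two_modes = [1, 7, 5, 7, 1, 6, 7, 1, 4, 1, 7]
modes_a = multimode(one_mode)
modes_b = multimode(two_modes)

print(average, median_odd, median_even, modes_a, modes_b)
```

Note that the median is taken from the sorted data, not from the order in which the values were written down.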
Measures of Dispersion

• In Statistics, the Measures of Dispersion Help to Interpret the Variability of Data, i.e. to Know How Homogeneous or Heterogeneous the Data are.
• They Help in Understanding How Much the Data are Spread Around a Central Value such as the Mean.
• A Measure of Dispersion is Always a Non-Negative Real Number that is Zero When All the Data are the Same and Rises as the Data Get More Varied.
Types of Measures of Dispersion
• Measures of Dispersion:
– Algebraic:
• Absolute Measures: Range, Mean Deviation, Quartile Deviation, Standard Deviation.
• Relative Measures: Coefficient of Range, Coefficient of Variation, Coefficient of Quartile Deviation, Coefficient of Mean Deviation.
– Graphical.
Absolute Dispersion

• “An Absolute Measure of Dispersion Measures the Variability in Terms of the Same Units as the Data.”
• The Units of the Measure of Dispersion will therefore be, e.g., Rs, Meters or Kg.
• Uses Numerical Variations to Determine the Degree of Error of Figures in Context.
• They are Used for Measuring Variability.
Relative Dispersion

• “A Relative Measure of Dispersion Compares the Variability of Two or More Data Sets Independently of the Units of Measurement.”
• Or “A Relative Measure of Dispersion Expresses the Absolute Measure of Dispersion Relative to the Relevant Average, Multiplied by 100.”
• Uses Statistical Variations Based on Percentages to Determine the Reality of the Figures in Context.
• They are Used for Comparison of Two or More Distributions.
Range

• The Range is the Difference Between the Lowest and Highest Values.

• To Find the Range, Simply Subtract the Lowest Value From the
Greatest Value, Ignoring the Others.
• The Sample Range is Used to Compare Variability Between Different Distributions of Data. It Depends Only on the Maximum and the Minimum.
• When Writing a Range, Use an en Dash (Not a Hyphen or em Dash) With No Spaces Either Side
– 2–10
Find The Range

Range = 11, 7, 15, 16, 12, 4, 8, 9, 14, 10

= 16-4 = 12
or

Range = 21, 17, 15, 16, 9, 11, 14, 12

= 21-9 = 12
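As a quick check of both range examples, here is a minimal Python sketch. The helper name `data_range` is an illustrative choice, not something from the slides.

```python
def data_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)

first = [11, 7, 15, 16, 12, 4, 8, 9, 14, 10]
second = [21, 17, 15, 16, 9, 11, 14, 12]

range_first = data_range(first)     # 16 - 4
range_second = data_range(second)   # 21 - 9
print(range_first, range_second)
```

Both data sets give the same range although their other values differ, which illustrates that the range ignores everything except the two extremes.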
Mean Deviation

• It is the Average of the Absolute Deviations from the Arithmetic Mean.

• Formula: M.D. = Ʃ |x − x̄| / n
Where:
n = No. of Observations
x = Each Point Value
x̄ = Sample Mean
Example
• The Weights of 10 Individuals are 83, 75, 81, 79, 71, 95, 75, 77, 84, 90. Calculate the Mean Deviation.
x      Arithmetic Mean (x̄)   (x − x̄)
83     81                    2
75     81                    −6
81     81                    0
79     81                    −2
71     81                    −10
95     81                    14
75     81                    −6
77     81                    −4
84     81                    3
90     81                    9
Total = 810                  Total of Absolute Deviations = 56
Mean = 810 / 10 = 81 (n = 10)
Mean Deviation = Sum of Absolute Deviations from Mean / No. of Observations
= 56 / 10
= 5.6
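The mean deviation of the weights can be reproduced in a few lines. This is a minimal sketch that takes absolute deviations, as the total of 56 in the example requires; the function name `mean_deviation` is illustrative.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)

weights = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]
md = mean_deviation(weights)
print(md)
```

Without the absolute value the signed deviations would cancel and always sum to zero, so the sign must be dropped before averaging.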
Standard Deviation
• Standard Deviation is a Measure of How Much the Data is Dispersed
From its Mean.
– A High Standard Deviation Implies that the Data Values are More
Spread Out From the Mean.
– A Low Standard Deviation Means that the Data Values are Clustered More Closely Around the Mean.
– Standard Deviation can be Zero (if all the values in the variable are the
same)
• Standard Deviation is Denoted by:
– “SD” and Greek symbol “σ” is used for Population Standard
Deviation.
– Latin letter “s” is used for Sample Standard Deviation.
Standard Deviation
• Formula for Population Standard Deviation (also Used If Sample Size is More than 30):
σ = √( Ʃ (x − μ)² / n )
Standard Deviation
• Formula for Sample Standard Deviation (Used If Sample Size is Less than 30):
s = √( Ʃ (x − x̄)² / (n − 1) )
Where:
s = Symbol for Sample Standard Deviation
√ = Symbol of Square Root
Ʃ = Symbol of Sum
x = Individual Raw Point/Score
x̄ = Sample Mean (μ = Population Mean)
( )² = Square of Result
n = Number of Observations
Steps For Calculating Standard Deviation
• First Take the Deviation of Each Value From the Mean.
(x − x̄)
• Square Each Deviation.
(x − x̄)²
• Add the Squared Deviations.
Ʃ (x − x̄)²
• Divide the Result by the Number of Observations.
For More than 30 Observations Divide by n; For Less than 30 Divide by n − 1.
• Last, Take the Square Root, which Gives the Standard Deviation.
Example
• The Weights of 10 Individuals are 83, 75, 81, 79, 71, 95, 75, 77, 84, 90. Calculate the Standard Deviation.
x      (x − x̄)   (x − x̄)²
83     2         4
75     −6        36
81     0         0
79     −2        4
71     −10       100
95     14        196
75     −6        36
77     −4        16
84     3         9
90     9         81
Total = 810, Mean = 810/10 = 81 (n = 10)      Total of (x − x̄)² = 482
Example

• Sample Standard Deviation:

s = √( 482 / (10 − 1) )

= √( 482 / 9 )

= √53.56

= 7.32
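The same result can be obtained with Python's standard `statistics` module. This is a minimal sketch: `stdev` divides by n − 1 (sample formula) and `pstdev` divides by n (population formula).

```python
from statistics import pstdev, stdev

weights = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

# Sum of squared deviations from the mean, as in the worked table.
m = sum(weights) / len(weights)
sum_sq = sum((x - m) ** 2 for x in weights)

sample_sd = stdev(weights)        # sqrt(sum_sq / (n - 1))
population_sd = pstdev(weights)   # sqrt(sum_sq / n)
print(sum_sq, round(sample_sd, 2), round(population_sd, 2))
```

For these 10 weights the sample formula gives the larger value, because dividing by n − 1 compensates for estimating the mean from the same small sample.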
Quartile Deviation

• The Quartiles are Defined as the 25th Percentile and the 75th Percentile.

• For the Normal Distribution, these Define a Narrower Interval Than Does
One Standard Deviation on Each Side of the Mean.
• Quartile Deviation Measures the Deviation in the Middle of the Data.

• Quartile Deviation is Half of the Difference Between the Third Quartile


and the First Quartile Value.
• The Formula for Quartile Deviation of the Data is:

Q.D = (Q3 - Q1)/2


Quartile Deviation

• Here n (or N) represents the total number of observations in the given data set.
• Q2 is the median of the given data set, Q1 is the median of the lower half of the data set and Q3 is the median of the upper half of the data set.
• Position of Lower Quartile (25th Percentile) (Q1) = (N+1) × 1/4 th observation.
• Position of Middle Quartile (50th Percentile) (Q2) = (N+1) / 2 th observation.
• Position of Upper Quartile (75th Percentile) (Q3) = (N+1) × 3/4 th observation.
• Interquartile Range = Q3 – Q1.
Example of Quartile
• Find the Quartiles and Quartile Deviation of the following data:
• 17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28
Solution:
• Ascending order of the given data is:
• 2, 5, 7, 7, 8, 8, 10, 10, 14, 15, 17, 18, 24, 27, 28, 48(n = 16)
• Q2 = Median of the given data set
• n is even, so median = (1/2) [(n/2)th observation + (n/2 + 1)th observation]
• = (1/2)[8th observation + 9th observation]
• = (10 + 14)/2 = 24/2 = 12
• Q2 = 12
Example

• Now, the Lower Half of the data (used to find Q1, the 25th percentile) is:


• 2, 5, 7, 7, 8, 8, 10, 10 (even number of observations)

• Q1 = Median of lower half of the data

= (1/2)[4th observation + 5th observation]


= (7 + 8)/2
= 15/2
= 7.5
Example

• Also, the Upper Half of the data (used to find Q3, the 75th percentile) is:


• 14, 15, 17, 18, 24, 27, 28, 48 (even number of observations)

• Q3 = Median of upper half of the data

= (1/2)[4th observation + 5th observation]


= (18 + 24)/2
= 42/2

= 21
Example

• Quartile Deviation = (Q3 – Q1)/2

= (21 – 7.5)/2
= 13.5/2

= 6.75
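The quartile example can be reproduced with the "median of each half" method used in the solution. This is a minimal sketch; the function name `quartiles_by_halves` is illustrative, and it assumes at least two observations. Other quartile conventions, such as the (N+1)/4 position rule, can give slightly different values.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]   # for odd n the middle value is in neither half
    return median(lower), median(data), median(upper)

data = [17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28]
q1, q2, q3 = quartiles_by_halves(data)
quartile_deviation = (q3 - q1) / 2
print(q1, q2, q3, quartile_deviation)
```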
Normal Distribution
• Normal distribution, also known as the Gaussian distribution, is a
probability distribution that is symmetric about the mean.
• It shows that data near the mean are more frequent in occurrence
than data far from the mean.
• In graphical form, the normal distribution appears as a "bell
curve".
• In a Normal Curve:
– The Area within One Standard Deviation on either side of the Mean will include about 68% of the Values in the Distribution.
– The Area within Two Standard Deviations on either side of the Mean will cover about 95% of the Values in the Distribution.
– The Area within Three Standard Deviations on either side of the Mean will include about 99.7% of the Values in the Distribution.
Characteristic of Normal Distribution

• Three Characteristics of a Normal Distribution.


– Symmetric.
• The Right side of the Center is a Mirror Image of the Left side
– Unimodal.
• Only one Mode, or Peak, in a normal distribution
– Asymptotic.
• Continuous, with Tails that Approach but Never Touch the X-axis.
• The Mean, Median, and Mode are all Equal.
Normal Distribution
Standard Normal Curve
• The Standard Normal Curve Extends Indefinitely in both Directions,
Approaching, but Never Touching, the Horizontal Axis.
• The Standard Normal Curve is Smooth and Bell Shaped, with its Mean at z=0. Almost all the Area Under the Standard Normal Curve Lies Between z=−3 and z=3.
• The Difference Between a Normal Distribution and Standard Normal Distribution is:
– A Normal Distribution Can Take on Any Value as its Mean and Standard Deviation.
– A Standard Normal Distribution Always has a Mean of 0 and a Standard Deviation of 1.
Standard Normal Curve

• The Distance of a Value (x) from the Mean (µ) of the Curve in Units of Standard Deviation (σ) is Called the “Relative Deviate or Standard Normal Variate” and is Denoted by “Z”.

• Formula for Standard Normal Deviate:

Z = (x − µ) / σ
Negative z Score Table

• Use the Negative z Score Table to Find Values on the Left of the Mean as can be seen in the Graph.
• Corresponding Values which are Less than the Mean are Marked with a Negative Score in the “z-Table”.
• These Represent the Area Under the Bell Curve to the Left of “z”.
Negative z Score Graph
Negative Z Table
Positive z Score

• Use the Positive z Score Table to Find Values on the Right of the Mean as can be seen in the Graph.
• Corresponding Values which are Greater Than the Mean are Marked with a Positive Score in the “z-Table”.
• These Represent the Area Under the Bell Curve to the Left of “z”.
Positive z Score Graph
Positive Z Table
Example:
Mean Pulse of a Group of Normal Healthy Males was 72.
A Randomly Chosen Male has a Pulse of 80.
Standard Deviation is 2.
z = (x − x̄) / σ
= (80 − 72) / 2
= 8 / 2
= 4
Example
• Mean Hb of Selected Group is 12 gm
• Standard Deviation of 2 gm
• Find the Probability that a Person Picked has Hb of 16 gm or More.
z = (x − x̄) / σ
z = (16 − 12) / 2 = 4 / 2 = 2
For z = 2, the Area from the Mean is 0.4772.
We are dealing with half the curve (area 0.5), so the area beyond z = 2 would be:
0.5 − 0.4772 = 0.0228 = 228/10000
About 228 persons out of 10000 would be expected to have Hb of 16 gm or more.
Example
• Mean Hb of Selected Group is 12 gm
• Standard Deviation of 2 gm
• Cut-off Value for Anaemia is 10 gm.
z = (x − x̄) / σ
z = (10 − 12) / 2 = −2 / 2 = −1
For z = −1, the Area from the Mean is 0.3413.
We are dealing with half the curve (area 0.5), so the area beyond z = −1 would be:
0.5 − 0.3413 = 0.1587 = 1587/10000
About 1587 persons out of 10000 would be expected to have Hb below 10 gm.
Table 01

Areas of Standard Normal Curve with Mean 0 and Standard Deviation 1

Relative Deviate (Z)      Proportion of the Area From the Middle of
Z = (x − x̄) / σ           the Curve to the Designated Deviation
0.00                      0.0000
0.50                      0.1915
1.00                      0.3413
1.50                      0.4332
2.00                      0.4772
3.00                      0.4987
4.00                      0.49997
5.00                      0.4999998
Interpretation of Example

• Area of Normal Curve for Deviate 4 is: 0.49997
• We are Dealing with Half of the Total Area, that is 0.5.
• The Area Beyond z = 4 is equal to: 0.5 − 0.49997 = 0.00003, i.e. 3 in 100,000
• Interpretation:
– The Probability is that only 3 out of 100,000 would be likely to have a Pulse Rate of 80 or Higher.
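The z-score examples above can be checked without a printed table. This is a minimal sketch that computes the area under the standard normal curve with `math.erf` from the Python standard library; the helper names `z_score` and `area_left_of` are illustrative.

```python
import math

def z_score(x, mean, sd):
    """Distance of x from the mean in units of standard deviation."""
    return (x - mean) / sd

def area_left_of(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Pulse example: mean 72, SD 2, observed 80.
z_pulse = z_score(80, 72, 2)
p_pulse_80_or_more = 1.0 - area_left_of(z_pulse)

# Hb example: mean 12 gm, SD 2 gm, value 16 gm or more.
z_hb = z_score(16, 12, 2)
p_hb_16_or_more = 1.0 - area_left_of(z_hb)

# Anaemia example: same mean and SD, Hb below the cut-off of 10 gm.
z_cut = z_score(10, 12, 2)
p_hb_below_10 = area_left_of(z_cut)

print(z_pulse, z_hb, z_cut)
print(round(p_hb_16_or_more, 4), round(p_hb_below_10, 4))
```

The area from the mean to z that Table 01 lists is `area_left_of(z) - 0.5` for a positive z, so the table and the function agree up to rounding.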
Sampling

• When the Population under Study is Large, a Sample is Taken.
• A Sample Reduces the Cost of the Study.
• A Sample makes the Study Easier to Conduct.
• The Sample must be Representative of the Whole Population or Items that have to be Studied.
• A Sampling Frame is the List of Members of the Universe from which the Sample has to be Taken for Study.
• Sampling Frame Accuracy and Completeness must be Ensured.
Sampling Method
• Three Methods:
– Simple Random Sample:
• Each household is assigned a number and a table of random numbers is used to select households. Every household is given an equal chance of selection, so Personal Bias in selection is eliminated.
– Systematic Random Sample:
• Units are picked at a regular interval, e.g. every 5th or 10th unit. Households are assigned numbers, a starting number is selected at random from 1–10, and then every 5th household after that number is selected.
– Stratified Random Sample:
• A Separate Portion of the Sample is drawn deliberately to represent each group or stratum. It is useful when we want to study separate strata like age groups or religions in a sample.
Sampling Error

• Results of Repeated Samples from the Same Population may Differ from Each Other to Some Extent; this Variation is called Sampling Error.
• This difference arises because Results are Drawn from Samples, not from the Entire Population.
• Factors Influencing Sampling Error:
– Sample Size.
– Natural Variability of Individuals.
Hypothesis
• Any Statement about the Population in a Research or Study is Called a Hypothesis.
• Two types of Hypothesis:
– Null Hypothesis (H0) Means No Relationship or Difference Between the Two groups being compared. Researchers usually aim to Reject the Null Hypothesis.
• Null Hypothesis is either rejected or failed to be rejected during a study.
• e.g. Comparison between Abdominal Hysterectomy (AH) and Vaginal Hysterectomy (VH).
• H0: No difference between the two procedures AH and VH.
– Alternative Hypothesis (HA) Means either:
• Difference in complications between AH and VH (Two-tailed or Non-Directional).
• VH has fewer complications than AH (One-tailed or Directional).
• AH has fewer complications than VH (One-tailed or Directional).
Situation: Failure to Reject the H0
• Two Possibilities:
– In the True Situation H0 should not have been rejected, hence we are right in doing so and our decision is Correct.
– In the True Situation H0 should have been rejected, so we are wrong in doing so and our decision is Incorrect (a Type II Error).
Standard Error

• If we take Random Samples of size n from the Population and repeat this over and over again, we find that every Sample will have a Different Mean (x̄).
• The Distribution of the Sample Means is almost a Normal Distribution, and its Mean is the same as the Population Mean (μ).
• The Standard Deviation of the Sample Means is a Measure of Sampling Error; its Formula is SE = σ / √n and it is called the Standard Error of the Mean.
• Nearly 95% of the Sample Means will lie within the limits of Two Standard Errors [μ ± 2 (σ / √n)].
Confidence Interval
• We can Construct an Interval within which the Population Mean (μ) is likely to lie, but we cannot determine the Population Mean (μ) exactly with the help of a Sample.
• We cannot be 100% Sure; rather, we state how Confident we are (e.g. 95% Confident for ± 2 SE).
• Formula for the Confidence Interval of the Population Mean (μ) is:
μCI = X̄ ± 2 SE or X̄ ± 2 SD/√N
Example: Sample Mean of 100 women is 12 gm with a SD of 2.
μCI = 12 ± 2 (2/√100) = 12 ± 4/10
= 12 ± 0.4 = 11.6 to 12.4
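The confidence interval arithmetic can be sketched in Python as follows. The function name `confidence_interval` and its `multiplier` parameter are illustrative; the multiplier of 2 follows the slides (a rounded form of the usual 1.96 for 95% confidence).

```python
import math

def confidence_interval(sample_mean, sd, n, multiplier=2.0):
    """Return (lower, upper) limits as mean ± multiplier × standard error."""
    se = sd / math.sqrt(n)   # standard error of the mean
    return sample_mean - multiplier * se, sample_mean + multiplier * se

# Example from the slide: 100 women, mean 12 gm, SD 2 gm.
low, high = confidence_interval(12, 2, 100)
print(round(low, 2), round(high, 2))
```

Because the standard error shrinks with √n, quadrupling the sample size halves the width of the interval.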
Chance of Error or p Value
• Chance of Error in research is always present.
• Chances of Error in research are kept at the minimum (Lower than 5% or 0.05 in terms of Probability).
• Chance of Error (5%) that is Set prior to the Start of the Study is known as the Level of Significance or Alpha (α).
• Definition of Level of Significance (α):
– Probability of Committing a Type I Error.
– Probability of Rejecting H0 when it is True (Falsely Rejecting H0).
• Definition of p Value:
– Probability, Calculated from the Study Data, of Getting a Result at least as Extreme as the One Observed By Chance alone if H0 is True.
• Alpha (α) is Fixed before the Study and the p Value is Calculated after it; H0 is Rejected when the p Value is Less than α.
Test of Significance
• Standard Error of Mean:
– To Answer How Accurate the Sample Mean is and What can be said about the True Mean of the Universe, we Calculate the Standard Error.
– We Take an Example:
• Random Sample of 25 Males, Age 20-24 Years, Mean Temp. is 98.14 deg. F, Standard Deviation is 0.6. Identify the True Mean of the Universe.
• SE (x̄) = s / √n
= 0.6 / √25
= 0.12
Confidence Interval = 98.14 ± (2 × 0.12) = 97.90 to 98.38 deg. F
The Chances would be only 1 in 20 (P = 0.05) that the Population Mean would be Outside these Limits.
Standard Error of Proportion
• Proportion of Males in a Village is 52%. A Random Sample of 100 people was taken and the Male proportion in the Sample was found to be 40%. What Conclusion can be drawn from the Sample, and what is the Possible Range of Males in a Sample with 95% Confidence Limits?
– SE of Proportion = √( p × q / n ) (Where “p” is % Males and “q” is % Females)
= √( 52 × 48 / 100 )
= √24.96
= 5 (approximately)
We Take two Standard Errors on Either side of 52 as our Criterion.
Value Range in a True Representative Sample = 52 + 2(5) = 62
to 52 − 2(5) = 42
The Observed Proportion of Males was only 40%, which is outside these Confidence Limits.
Relative Deviate = (52 − 40) / 5 = 2.4
The Relative Deviate Exceeds Two, therefore the Deviation is Significant.
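The village example can be recomputed in Python. This is a minimal sketch with proportions expressed as percentages, as on the slide; the function name `se_of_proportion` is illustrative.

```python
import math

def se_of_proportion(p_percent, n):
    """Standard error of a percentage p in a sample of size n."""
    q_percent = 100 - p_percent
    return math.sqrt(p_percent * q_percent / n)

se = se_of_proportion(52, 100)            # about 5
lower, upper = 52 - 2 * se, 52 + 2 * se   # two standard errors either side
relative_deviate = (52 - 40) / se         # observed 40% versus expected 52%
print(round(se, 2), round(lower, 1), round(upper, 1), round(relative_deviate, 1))
```

The unrounded standard error is slightly below 5, so the limits are marginally narrower than 42 to 62, but the conclusion is unchanged: 40% lies outside them.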
Standard Error of Differences Between Two Means

• Often Researchers need to Compare the Means of Two Groups:
– Group A (Control) having 12 Mice.
– Group B (Intervention) having 12 Mice.
– Each Mouse is sacrificed after a specific Period and its Kidney is weighed in milligrams.

Group   Number   Mean   Standard Deviation
A       12       318    10.2
B       12       371    24.1

– SE (d) b/w Means = √( σ1²/n1 + σ2²/n2 )
Standard Error of Differences Between Two Means

– SE (d) b/w Means = √( (10.2)²/12 + (24.1)²/12 )

= √( 8.67 + 48.40 )

= √57.07

= 7.55
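The standard error of the difference between the two group means can be verified with a short function. This is a minimal sketch; `se_diff_means` is an illustrative name.

```python
import math

def se_diff_means(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Group A: SD 10.2, n = 12. Group B: SD 24.1, n = 12.
se_d = se_diff_means(10.2, 12, 24.1, 12)
print(round(se_d, 2))
```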
Standard Error of Differences Between Two Proportions

• Often Researchers need to Compare Proportions between Two Groups:
• Trial of 02 Whooping Cough Vaccines:
Vaccine   No. Vaccinated   No. Exposed   No. of Cases   Attack Rate (%)
A         2400             90            22             24.4
B         2400             86            14             16.2

– SE (d) b/w Proportions = √( p1q1/n1 + p2q2/n2 )
Standard Error of Differences Between Two Proportions

– SE (d) b/w Proportions = √( p1q1/n1 + p2q2/n2 )

= √( (24.4 × 75.6)/90 + (16.2 × 83.8)/86 )

= √( 20.50 + 15.79 ) = √36.28

= 6.02
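The same check for two proportions, again as a minimal Python sketch with percentages as on the slide. The function name `se_diff_proportions` is illustrative, and n is the number exposed in each group.

```python
import math

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two percentages."""
    q1, q2 = 100 - p1, 100 - p2
    return math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)

# Vaccine A: attack rate 24.4% of 90 exposed. Vaccine B: 16.2% of 86 exposed.
se_d = se_diff_proportions(24.4, 90, 16.2, 86)
print(round(se_d, 2))
```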
Chi Square Test

• It is a statistical test used to examine the differences between categorical variables from a random sample by comparing expected and observed results.
• It is mostly used while analysing survey response data.
• It tells us whether two variables are independent of one another.
• It gives a test statistic from which the p value is found.
• It is used for qualitative Data where Two Proportions need to be compared.
Example
• We have Vaccine A and B for Prevention of Measles.
• Vaccine A given to 100 children, 20 later developed Measles.
• Vaccine B given to 100 children, 15 later developed Measles.
• Test statistically whether the two vaccines differ in effectiveness.
Observed Values:
Measles Vaccine   Attacked   Not Attacked   Total
A                 20         80             100
B                 15         85             100
Total             35         165            200
Proportion of children attacked = No. of Children attacked / No. of Exposed
= 35/200 = 0.175 = 17.5%
Proportion of children not attacked = No. of Children not attacked / No. of Exposed
= 165/200 = 0.825 = 82.5%
Continue
Expected Values (if both vaccines were equally effective):
Measles Vaccine   Attacked   Not Attacked   Total
A                 17.5       82.5           100
B                 17.5       82.5           100
Total             35         165            200
• Expected Value (EV) = Row Total (RT) × Column Total (CT) / Grand Total (GT)
For cell a: EV = (100 × 35) / 200 = 17.5
Cell   Observed − Expected (O − E)   (O − E)²   (O − E)²/E
a      20 − 17.5 = 2.5               6.25       6.25/17.5 = 0.357
b      80 − 82.5 = −2.5              6.25       6.25/82.5 = 0.076
c      15 − 17.5 = −2.5              6.25       6.25/17.5 = 0.357
d      85 − 82.5 = 2.5               6.25       6.25/82.5 = 0.076
Total = 0.357 + 0.076 + 0.357 + 0.076 = 0.866, i.e. about 0.87
Continue
• Chi Square Value Calculated as 0.87.

• Degrees of Freedom df = (r − 1) × (c − 1) (r = rows and c = columns)

As in a 2 × 2 table we have 2 rows and 2 columns:

df = (2 − 1) × (2 − 1) = 1 × 1 = 1
Hence the Degrees of Freedom of a 2 × 2 Table is 1.

The critical value of Chi Square for 1 degree of freedom at the 5% level of significance in the table is 3.84.
We compare the calculated Chi Square Value 0.87 with the cut-off value 3.84.

Since 0.87 is less than 3.84, the difference is too small to be significant; there is no evidence that Vaccines A and B differ in effectiveness, and the observed difference may be due to chance.
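The chi-square calculation for the vaccine table can be reproduced with a small function. This is a minimal sketch without continuity correction; the name `chi_square` is illustrative, and the critical value 3.84 is taken from the slide rather than computed.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected value = row total × column total / grand total.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Rows: vaccine A, vaccine B. Columns: attacked, not attacked.
statistic, df = chi_square([[20, 80], [15, 85]])
significant = statistic > 3.84   # critical value for df = 1 at the 5% level
print(round(statistic, 3), df, significant)
```

With unrounded cell contributions the statistic is about 0.87, well below 3.84, so the null hypothesis of no difference is not rejected.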
Student's t Test
• It is used to compare Two Means (Continuous or Numerical Data).
• Formula of t Test: t = (X̄ − μ) / (SD / √n)

Example:
A Population has a mean Hb of μ = 12 gm.
Sample of 46
X̄ = 11.2 gm
Standard Deviation = 2 gm
Confidence Level 95%
Continue

t = (X̄ − μ) / (SD / √n) = (11.2 − 12) / (2 / √46) = −0.8 / 0.295 = −2.71

df = n − 1 = 46 − 1 = 45
Critical Value of t is ±2.014 at df = 45 (two-tailed, 5% level).
Since the Calculated Value of −2.71 lies beyond the critical value of −2.014, we conclude that there is a significant difference between the Sample Mean and the Population Mean, and the p Value is less than 0.05.
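The one-sample t statistic is simple enough to compute directly. The sketch below uses illustrative inputs (sample mean 11.2 gm, population mean 12 gm, SD 2 gm, n = 46) that are assumptions chosen to match the −0.8 numerator and 0.29 denominator shown in the calculation; the function name `one_sample_t` is also illustrative, and the critical value 2.014 is taken from the slide.

```python
import math

def one_sample_t(sample_mean, population_mean, sd, n):
    """t = (sample mean − population mean) / (SD / √n)."""
    return (sample_mean - population_mean) / (sd / math.sqrt(n))

# Illustrative inputs (assumed, see the note above).
t_value = one_sample_t(11.2, 12, 2, 46)
df = 46 - 1
significant = abs(t_value) > 2.014   # two-tailed critical value at df = 45
print(round(t_value, 2), df, significant)
```

Comparing the absolute value of t with the critical value handles both tails at once, which is why the sign of t does not affect the decision.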
Charts and Diagrams
• Presentation of Categorical Data
– Pie Chart
– Simple Bar Diagram
– Multiple Bar Diagram
– Component Bar Diagram or Subdivided Bar Diagram
• Presentation of Continuous Data
– Histogram
– Frequency Polygon
– Line Diagram
– Pictogram
– Statistical Map
– Scatter Diagram
– Ogive Curve
Pie Chart
• Most Popular when Presenting Different (4-5) Categories.
• Used for Categorical data
• Calculate the Category Angle:
– The Value of each Category is Divided by Sum of All Values
– Then Multiply by 360°.
– Then Each Category is Allocated the Respective Angles to Present.
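The angle calculation for a pie chart can be sketched as below. The category names and counts are made-up illustrative data, and `pie_angles` is a hypothetical helper name.

```python
def pie_angles(counts):
    """Map each category to its angle in degrees: value / total × 360."""
    total = sum(counts.values())
    return {name: value * 360 / total for name, value in counts.items()}

# Hypothetical data: number of cases in three categories.
angles = pie_angles({"A": 20, "B": 30, "C": 50})
print(angles)
```

The angles always add up to 360°, which is a convenient check on the arithmetic.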
Simple Bar Diagram

• Represent Data involving only One Variable.


• For Categorical Data, especially with More than 5 Categories.
• Height represents Frequency or Magnitude.
• Bars are Separated by space which is visible.
• Suitable Scale is Drawn to Present Bars.
Multiple Bar Diagram
• Used for Subdivided Categories of Categorical Data.
• It is used to present Ordinal Data.
• It is used when One Variable Cross Tabulated Against Other.
Component Bar or Subdivided Bar Diagram
• This Diagram is Used When More than 5 Categories have further
Sub Categories.
Histogram
• It is used to Present Continuous Variables, so there are No Gaps between Blocks.
• It consists of a Series of Blocks, unlike a Bar Chart, which has Gaps.
• Intervals are given on the Horizontal Axis and Frequencies on the Vertical Axis.
• Area of Each Block is Proportionate to the Frequency.
Frequency Polygon
• Obtained by Joining the Mid Points of the Tops of the Histogram Blocks.
Line Diagram
• It is Used to Show Trends of Events with the Passage of Time.
Pictogram
• It is used for Presenting Data to the Layman.
• Pictures are used to explain the Data.
• e.g. Picture of a Child to represent Under 5 Children Mortality.
• It is a form of Bar Chart.
Statistical Maps

• It is Data Representation by Geographical and Administrative Areas.
• Called Statistical Maps or dot maps.
• Areas are shaded with Different Colours or Intensities to differentiate among them.
Scatter Diagram
• Shows Relationship Between Two Variables.
• If Dots Cluster Round a Straight Line, It is a Linear Relation.
• If Dots do Not Cluster around a Straight Line, the Variables have No Linear Relationship.
Ogive Curve
• Used for Accumulated Information.
• Population at a Particular Point can Easily be Calculated.
• Shows Cumulative Frequency.
Correlation and Regression

• Both are used to analyze Associations involving Continuous (Interval / Ratio) Data.
• They Enable Researchers to Determine:
– Statistical Significance of observed associations.
– Magnitude of the associations.
– Amount of Variation in the Response Variable that is attributable to a Putative Risk Factor.
Continue:

• Correlation Analysis:
– Correlation is a single statistic, or data point.
– Measures the Strength of the Association between two Study Variables.
– Both Variables must be Random or Determined by Nature.
– Correlation refers to Pearson’s Product Moment Coefficient (Pearson’s r).
Continue:

• Regression Analysis:

– It is the entire equation with all of the data points that are
represented with a line.
– Derives a Prediction equation for estimating the value of one
variable given the value of the Second.
– Regression allows us to see how one affects the other.

– One Variable is fixed and one is Random or Both Random.
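To make the contrast concrete, the sketch below computes Pearson's r (a single number) and a least-squares regression line (a prediction equation) for the same paired data. The data are made up for illustration, and the function names `pearson_r` and `regression_line` are hypothetical helpers, not taken from the slides.

```python
import math

def _sums(xs, ys):
    """Return means and sums of squares and cross-products about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy

def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope × x."""
    mx, my, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical paired measurements on five subjects.
x = [1, 2, 3, 4, 5]   # independent variable (X axis)
y = [2, 4, 5, 4, 5]   # dependent or response variable (Y axis)

r = pearson_r(x, y)
slope, intercept = regression_line(x, y)
predicted_at_6 = intercept + slope * 6   # prediction for a new x value
print(round(r, 3), round(slope, 2), round(intercept, 2), round(predicted_at_6, 2))
```

Correlation treats the two variables symmetrically, whereas the regression line changes if X and Y are swapped, which is why the choice of independent variable matters for regression.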


Continue:

• Data Representation:
– Pairs of Measurements made on the same study subject.
– E.g. Two study Variables measured on the Same Subject form a pair of Measurements.
• One Study Subject
– Hypertension and Serum Cholesterol
– Systolic Blood Pressure and Cholesterol.
– Notation:
• Independent Variable on X axis.
• Dependent or Response Variable on Y axis
Continue: Study Types

• In Epidemiological studies:

– Independent X Variable is the Risk Factor.
– Dependent Y Variable is the Occurrence of Disease or Event.
• In Experimental studies:
– Independent Variable X is fixed by Investigators.
– E.g. Investigators fix the dose of a new drug (Independent Variable).


Thank You
