Biostatistics and Orthodontics
Department of Orthodontics
Pankaj Lakhanpan
MDS 1st year
INTRODUCTION
• Italian word “statista” means “statesman”.
• German word “Statistik” means “political state”.
• John Graunt (1620-1674) is regarded as the father of health statistics.
USES OF BIOSTATISTICS
• Some data are numeric, such as height (5’6”) and systolic
B.P. (112 mm Hg), and some are non-numeric, such as sex
(female, male) and the patient’s level of pain (no pain,
moderate pain, severe pain).
POPULATION
• The collection of all elements of interest having one or
more common characteristics is called a population.
VARIABLE
E.g.:
Age
Sex
Waiting time in clinic
Diabetic levels
Types of variables
• Quantitative / Numerical
   - Continuous
   - Discrete
• Qualitative / Categorical
QUALITATIVE VARIABLE
It is a characteristic of people or objects that cannot be
naturally expressed in a numeric value.
E.g.:
Sex – male, female
Facial type – Brachyfacial, Dolichofacial, Mesiofacial
Level of oral hygiene – poor, fair, good
QUANTITATIVE VARIABLE
E.g.:
Age
Height
Bond strength
DISCRETE VARIABLE
It is a random variable that can take on a finite number
of values or a countable infinite number (as many as
there are whole numbers) of values.
E.g.:
• The size of a family
• The number of DMFT teeth, which can be any one of the 33
values 0, 1, 2, 3, …, 32.
CONTINUOUS VARIABLE
It is a random variable that can take on a range of values
on a continuum, i.e., its range is uncountably infinite.
E.g.:
Treatment time
Temperature
Torque value on tightening an implant abutment
CONFOUNDING VARIABLE
Statistical results are said to be confounded when they
can have more than one explanation.
E.g.:
In a study, smoking is found to be the most important etiological
factor in the development of oral squamous cell carcinoma. However,
alcohol has also been suggested as one of the major causes of
squamous cell carcinoma, and alcohol consumption is known to be
closely related to smoking. Therefore, in this study, alcohol is a
confounding variable.
SCALES OF MEASUREMENT
Nominal Measurement Scale:
E.g.:
• Sex (F, M)
• Blood type (A, B, AB and O)
Ordinal Measurement Scale:
E.g.:
Pain after separator placement
0 - no pain
1 - mild pain
2 - moderate pain
3 - severe pain
4 - extremely severe pain
The numeric codes are assigned only for statistical convenience.
Interval Measurement Scale:
Observations can be ordered, and precise differences
between units of measure exist. However, there is no
meaningful absolute zero.
E.g.:
• IQ score representing the level of intelligence.
IQ score 0 is not indicative of no intelligence.
Ratio Measurement Scale:
It is the same as the interval scale in every aspect except that
measurement begins at a true or absolute zero.
E.g.:
• Weight in pounds.
• Height in meters.
OBSERVATIONS
DATA
Data are a set of values of one or more variables recorded on
one or more individuals.
Types of Data
Primary data:
It is the data obtained directly from an individual.
Advantages
1. Precise information
2. Reliable
Disadvantages
1. Time consuming
Secondary data:
It is the data obtained from outside sources.
Quantitative data:
Data that measure something with a number.
E.g.: the amount of crowding, overjet, incisor
inclination, and maxillomandibular skeletal discrepancy.
Qualitative data:
Data collected on the basis of attributes or qualities.
E.g.: the sex of the patient, severity of mandibular plane
angle (high, normal, low), likelihood of compliance with
headgear or elastics (yes/no).
Uses Of Data:
Methods of collection of data
Presentation of Data:
Methods of presentation of data
• Tabulation
• Diagrams/graphs
Types of Tables
• Simple table
• Master table
• Frequency distribution table
1. SIMPLE TABLE: in this, the characteristic under observation is fixed and the number or
frequency of events is small.
2. MASTER TABLE: in this table all initial readings as per the designed proforma are serially
recorded. When the number of observations is large and several attributes have to be
studied, the master table is a must.
BAR DIAGRAMS
1) Simple bar
• Used to represent and compare the frequency distribution of
discrete variables.
• All the bars must have equal width and only the length varies
according to the frequency in each category.
2) Multiple Bar
• Used to compare qualitative data with respect to a single
variable.
• Each category of the variable has a set of bars of the same width,
one for each section, drawn without any gap between them; the
length of each bar corresponds to the frequency.
3) Proportional Bar Diagram
• When it is desired to compare only the proportion of
subgroups between different major groups of observations,
bars of the same length are drawn for each group.
• These are then divided according to the subgroup proportion
in each major group.
Pie chart
Pictogram
• A diagram that uses pictures to represent the amount
or number of a particular thing
Spot Map
Histogram
Line diagram
Frequency polygon
Scatter diagram
CENTRAL TENDENCY / STATISTICAL AVERAGES:
• Central tendency refers to the center of the distribution of data
points.
• The statistics/parameters commonly used to describe it are:
Mean (the arithmetic average)
Median (the middle datum)
Mode (the most frequent score).
Objectives
• To condense the entire mass of data.
Ideal properties of central tendency
Mean:
• This measure implies the arithmetic average or arithmetic
mean.
• It is obtained by summing up all the observations and
dividing the total by number of observations.
E.g. The following are the fasting blood glucose levels of a
sample of 10 children.
Child:          1  2  3  4  5  6  7  8  9  10
Glucose level:  56 62 63 65 65 65 65 68 70 71
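As a worked check of the example above, the three averages can be computed for the ten glucose values. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only.

```python
import statistics

# Fasting blood glucose levels of the 10 children from the example
glucose = [56, 62, 63, 65, 65, 65, 65, 68, 70, 71]

mean_value = statistics.mean(glucose)      # sum of observations / number of observations
median_value = statistics.median(glucose)  # middle value of the ordered data
mode_value = statistics.mode(glucose)      # most frequent value

print(mean_value, median_value, mode_value)
```

For this sample the sum of the observations is 650, so the mean is 650/10 = 65; the median and the mode are also 65.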
Advantages:
Easy to calculate
Easily understood
Utilizes entire data
Affords good comparison
Disadvantages:
The mean is affected by extreme values; in such cases it can
lead to misleading interpretation.
Median:
• To find the median, the data are arranged in ascending or
descending order of magnitude and the value of the middle
observation is located.
Advantages:
1. It is more representative than the mean when the data are skewed.
2. It does not depend on every observation.
3. It is not affected by extreme values.
Mode:
Mode is that value which occurs with the greatest
frequency.
A distribution may have more than one mode.
Disadvantages:
3. With a small number of cases there may be no mode at all
because no value may be repeated; it is therefore rarely used
in medical or biological statistics.
DISPERSION:
Dispersion is the degree of spread or variation of the
variable about a central value. The measures of dispersion
help us to study the spread of the values about the central
value.
Commonly used measures of variation are:
• Range
• Standard deviation
• Standard error
• Coefficient of variation
• Z score
The Range:
Advantage:
• Easy to calculate
Disadvantages:
• Unstable
• It is affected by one extremely high or low score.
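The instability of the range can be illustrated with a short sketch. The range is taken here as the highest value minus the lowest value. The first list reuses the ten fasting blood glucose values from the mean example; the extra value of 140 is a hypothetical extreme score added only for illustration.

```python
def value_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)

glucose = [56, 62, 63, 65, 65, 65, 65, 68, 70, 71]
with_outlier = glucose + [140]  # hypothetical extreme score

range_original = value_range(glucose)       # 71 - 56
range_outlier = value_range(with_outlier)   # 140 - 56
print(range_original, range_outlier)
```

A single extreme observation changes the range from 15 to 84, although the other ten values are unchanged.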
Standard deviation
• Calculation of standard deviation
$$\mathrm{S.D.} = \sqrt{\frac{\sum (x - x_i)^2}{n}}$$
x = individual value
x_i = mean value
n = total number of observations
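A minimal sketch of this calculation, applied to the ten fasting blood glucose values from the mean example. It divides by n as in the formula above; many textbooks divide by n - 1 when the data are a sample, so the choice of denominator here simply follows the slide.

```python
import math

glucose = [56, 62, 63, 65, 65, 65, 65, 68, 70, 71]

n = len(glucose)
mean_value = sum(glucose) / n

# Sum of squared deviations of each individual value from the mean
squared_deviations = sum((x - mean_value) ** 2 for x in glucose)

# Standard deviation with n in the denominator
sd = math.sqrt(squared_deviations / n)
print(round(sd, 2))
```

The squared deviations add up to 164, so the standard deviation is the square root of 164/10 = 16.4, roughly 4.05.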
Uses of standard deviation
Standard Error
• It is not an error in the sense of a mistake.
• It measures the variation between a sample value and the corresponding population value.
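For a sample mean, the standard error is usually computed as SD/√n, the same quantity that appears in the denominator of the one sample t test. The sketch below applies this to the ten fasting blood glucose values from the mean example; using the sample standard deviation (n - 1 denominator) is an assumption made here.

```python
import math
import statistics

glucose = [56, 62, 63, 65, 65, 65, 65, 68, 70, 71]

sample_sd = statistics.stdev(glucose)  # sample SD (n - 1 denominator)
standard_error = sample_sd / math.sqrt(len(glucose))
print(round(standard_error, 2))
```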
Coefficient of variation
Z Score
• Standard score
• Signed fractional number of standard deviations by which the
value of an observation is above the mean value of what is
being observed or measured.
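In symbols this is z = (x - mean)/SD. A minimal sketch, with a hypothetical mean and standard deviation chosen only for illustration:

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical values: mean 65, standard deviation 4
z_above = z_score(71, 65, 4)  # observation above the mean
z_below = z_score(59, 65, 4)  # observation below the mean
print(z_above, z_below)
```

The sign shows the direction: 71 lies 1.5 standard deviations above the mean and 59 lies 1.5 standard deviations below it.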
Standard Curve
Properties:
Properties of normal distribution
PROBABILITY
Laws of probability
1) Law of addition
• If A and B are mutually exclusive events, then the probability of
A or B = P(A) + P(B)
2) Law of multiplication
• If A and B are independent events, then the probability of A and B =
P(A) × P(B)
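Both laws can be checked on a fair six-sided die, which is a hypothetical example not taken from the slides. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

p_face = Fraction(1, 6)  # probability of any single face of a fair die

# Law of addition: rolling a 1 or a 2 on one throw (mutually exclusive events)
p_one_or_two = p_face + p_face

# Law of multiplication: rolling a 6 on each of two independent throws
p_two_sixes = p_face * p_face

print(p_one_or_two, p_two_sixes)
```

The addition gives 1/3 and the multiplication gives 1/36.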
TESTS OF SIGNIFICANCE
Null hypothesis
• Hypothesis: a tentative prediction or
explanation of the relationship between
two or more variables.
• Every test of significance begins with a null hypothesis
H0.
• For example:
Alternative hypothesis
• The final conclusion once the test has been carried out is
always given in terms of the null hypothesis.
• If we conclude "do not reject H0", this does not
mean that the null hypothesis is true; it only suggests
that there is not sufficient evidence against H0 in
favor of Ha.
LEVEL OF SIGNIFICANCE
CONFIDENCE LIMITS
• Certain limits can be set up on both sides of the population
mean, on the basis of the fact that the means of samples are
normally distributed around the population mean.
• These limits are called confidence limits and the range
between the two is called the confidence interval.
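A minimal sketch of 95% confidence limits for a mean, computed as mean ± 1.96 × standard error. The data are the ten fasting blood glucose values from the mean example; the multiplier 1.96 assumes a normal sampling distribution, and for a sample this small a t-based multiplier would normally be preferred.

```python
import math
import statistics

glucose = [56, 62, 63, 65, 65, 65, 65, 68, 70, 71]

mean_value = statistics.fmean(glucose)
standard_error = statistics.stdev(glucose) / math.sqrt(len(glucose))

z_multiplier = 1.96  # normal multiplier for 95% confidence (assumption)
lower_limit = mean_value - z_multiplier * standard_error
upper_limit = mean_value + z_multiplier * standard_error

print(round(lower_limit, 2), round(upper_limit, 2))
```

The two printed values are the confidence limits; the distance between them is the confidence interval.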
Types of errors
• TYPE-I ERROR:
When a true null hypothesis is rejected, a Type I error occurs.
• TYPE-II ERROR:
When a false null hypothesis is not rejected, a Type II error occurs.
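The meaning of a Type I error can be illustrated by simulation. In the sketch below the null hypothesis is true by construction (every sample is drawn from a population with mean 0 and known standard deviation 1), so every rejection is a Type I error. The function name, the number of trials, the sample size, and the 5% level are all assumptions chosen for illustration.

```python
import random
import statistics
from statistics import NormalDist

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of two-sided z tests that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z = (sample mean - hypothesised mean) / (sigma / sqrt(n)), with sigma = 1
        z = statistics.fmean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_critical:
            rejections += 1  # true null hypothesis rejected: Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

The observed rate should lie close to the chosen level of 0.05, since that level is the probability of a Type I error the test is designed to allow.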
Pearson coefficient of correlation
Steps involved in testing of a hypothesis:
Parametric Tests and Non Parametric Tests:
Parameters.
Difference between parametric and non parametric
tests:
Various test of significance:
Parametric tests Non parametric tests
Spearman rank correlation
Z test (normal test):
Sample size > 30
Used for
1. Comparison of sample mean and population mean
2. Difference between two sample proportions
t test:
• It was first described by W. S. Gosset, whose pen name
was “Student”.
• The t test is used for small samples (generally sample size < 30); the z
test is used for large samples.
CRITICAL RATIO:
• For t test
$$\text{Critical ratio} = t = \frac{\text{Difference between two means}}{\text{SE of the difference between two means}}$$
• For Z test
$$\text{Critical ratio} = z = \frac{\text{Difference between two proportions}}{\text{SE of the difference between two proportions}}$$
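A minimal sketch of the critical ratio for two proportions. The counts are hypothetical, and the standard error uses the pooled proportion, which is one common choice and an assumption here.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Critical ratio z = difference between two proportions / SE of the difference."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled proportion under the null hypothesis
    se_diff = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se_diff

# Hypothetical data: the outcome occurs in 30 of 100 patients in group 1
# and in 10 of 100 patients in group 2
z = two_proportion_z(30, 100, 10, 100)
print(round(z, 2))
```

Here the difference is 0.30 - 0.10 = 0.20 and the standard error is about 0.057, giving a critical ratio of about 3.54.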
Types of t-tests
One sample t test:
It is used to compare the mean of a single group of
observations with a specified value.
$$t = \frac{\bar{X} - \mu}{\mathrm{SD}/\sqrt{n}}$$
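A minimal sketch of this statistic, taking X as the sample mean, μ as the specified value, SD as the sample standard deviation, and n as the sample size. The data are the ten fasting blood glucose values from the mean example; the specified value of 60 is hypothetical.

```python
import math
import statistics

def one_sample_t(values, mu):
    """t = (sample mean - mu) / (SD / sqrt(n)), with SD computed using n - 1."""
    n = len(values)
    mean_value = statistics.fmean(values)
    sd = statistics.stdev(values)
    return (mean_value - mu) / (sd / math.sqrt(n))

glucose = [56, 62, 63, 65, 65, 65, 65, 68, 70, 71]
t = one_sample_t(glucose, 60)  # 60 is a hypothetical reference value
print(round(t, 2))
```

The resulting t is then compared with the tabulated value for n - 1 = 9 degrees of freedom.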
Paired t test
Unpaired t test
Analysis of Variance(ANOVA)
• Indications:
• When two or more groups are studied in terms of two or more
factors.
• ANOVA tests all the groups simultaneously with a single F test
instead of relying on a series of pair-wise t tests. For example, if
groups I, II, and III are compared and the F test is significant, post hoc
comparisons of I to II, I to III, and II to III then show which groups differ.
One way Anova-test:
• Used when the various experimental groups differ in terms of
only one factor at a time.
e.g. testing the statistical significance of the difference in heights of
school children among three socio-economic groups.
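A minimal sketch of the one-way ANOVA F ratio for an example of this kind. The heights are hypothetical, and the function computes only the F statistic (between-group mean square divided by within-group mean square), not the p value.

```python
def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    k, n_total = len(groups), len(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical heights (cm) of school children in three socio-economic groups
heights = [[120, 121, 122], [121, 122, 123], [122, 123, 124]]
f_ratio = one_way_anova_f(heights)
print(f_ratio)
```

With these numbers the between-group and within-group sums of squares are both 6, so F = (6/2)/(6/6) = 3, to be compared with the tabulated F for 2 and 6 degrees of freedom.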
Non Parametric Tests
Chi square test
Applications:
1. Test for goodness of fit
2. Test of association (independence)
• Binomial
Smoking and lung cancer.
Vaccination and immunity.
Weight and diabetes.
• Multinomial
Association between number of cigarettes smoked per day (up to 10,
10-20, or more than 20) and incidence of lung cancer.
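A minimal sketch of the chi-square statistic for a test of association. Expected counts are taken as row total × column total / grand total, and the 2 × 2 table of counts is hypothetical.

```python
def chi_square_statistic(table):
    """Chi-square = sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    return chi_square

# Hypothetical 2 x 2 table: rows = smokers / non-smokers,
# columns = disease / no disease
observed_counts = [[30, 70], [10, 90]]
chi_square = chi_square_statistic(observed_counts)
degrees_of_freedom = (2 - 1) * (2 - 1)  # (rows - 1) x (columns - 1)
print(chi_square, degrees_of_freedom)
```

The expected counts are 20, 80, 20, and 80, so the statistic is 5 + 1.25 + 5 + 1.25 = 12.5 with 1 degree of freedom. The same function accepts larger (multinomial) tables.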
3. Test of homogeneity or population variance
Mann- Whitney U test
Kruskal–Wallis test
• McNemar’s test: a variant of the chi-squared test, used when the data
are paired
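A minimal sketch of the McNemar statistic, which uses only the discordant pairs (pairs whose two results disagree). The counts are hypothetical, and the version shown has no continuity correction.

```python
def mcnemar_statistic(b, c):
    """McNemar chi-square (no continuity correction) from the two discordant counts."""
    return (b - c) ** 2 / (b + c)

# Hypothetical paired data: 15 patients changed from positive to negative and
# 5 changed from negative to positive; concordant pairs do not enter the statistic
statistic = mcnemar_statistic(15, 5)
print(statistic)
```

Here the statistic is (15 - 5)² / (15 + 5) = 5, which is referred to the chi-square distribution with 1 degree of freedom.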
Parametric test | Non parametric test | Use
Two sample t test | Mann-Whitney U test (Wilcoxon rank sum test) | To compare two independent samples for equality of means/medians
- | χ² analysis | To compare nominal data: to compare two or more samples for equivalence in proportion
SPSS Statistics:
• It is a software package used for statistical analysis.
Conclusion:
• Advancing technology has enabled us to collect and safeguard
a wide variety of data with minimal effort, from patients’
demographic information to treatment regimens.