Biostatistics and Experimental Design: Yadgar Ali Mahmood University of Garmian
1. Introduction
1.1 Statistics
Statistics is defined in two senses:
1. As statistical data: a numerical
representation of things.
2. As statistical method: a field of study
that deals with the mathematical formulas, models,
and techniques used in the statistical analysis of
raw research data.
It helps us understand the object under study better
– Statistical methods include:
1. Designing studies
2. Collecting data
3. Presenting data
4. Summarizing data
5. Drawing inferences
What is Biostatistics?
• It is the application of statistical methods to
the biological and life sciences.
Limitations of statistics
As a science, statistics has its own limitations
– Deals with only quantitative information
• The main divisions are qualitative (categorical)
and quantitative (numerical) variables.
Scales of measurement…
• Qualitative variable: a variable that cannot be
measured in quantitative form but can only be
identified by name or category
– E.g. place of birth, types of drug, stages of breast
cancer (I, II, III, or IV), degree of pain (minimal,
moderate, severe). …
• Quantitative variable: a variable that can be
measured and expressed numerically. It can be of
two types: discrete or continuous.
– The values of a discrete variable are usually whole
numbers, e.g. the number of episodes of diarrhea in the
first five years of life.
– A continuous variable is a measurement on a
continuous scale, e.g. weight, height, blood …
– E.g.
Stages in…
• Presentation of the data: The process of
reorganization, classification, compilation… of data to
present it in a meaningful form.
• Analysis of data: The process of extracting
relevant information from the summarized data
• Inference of data: The interpretation and further
observation of the various statistical measures
through the analysis of the data
– And by implementing those methods by which …
2. Married/living together
3. Separated/divorced/widowed
2. No
Steps in designing questionnaire
1. Content
Decide what questions will be needed to
2. Formulating Questions
5. Translation
If the interview will be conducted in one or
more local languages, translate
2. Data presentation
• Having collected and edited the data, the next
step is to organize it.
• That is, to present it in a clear,
condensed form
• The presentation of data is classified into two types:
1. Tabulation
2. Diagrammatic
Tabular presentation
• Frequency distribution: is the organization of
raw data in table form using classes and
frequencies
• There are three basic types of frequency
distributions
• Categorical frequency distribution
• Ungrouped frequency distribution
• Grouped frequency distribution
Categorical frequency distribution
• Used for data that can be placed in specific
categories, such as nominal or ordinal data.
E.g. a researcher collected the following
data on marital status for 25 patients.
(M=married, S=single, W=widowed and
D=divorced)
S D W D M
S M M M S
D S M M S
D D S S W
W W D D W
Solution
Make a table as shown
Class   Tally     Frequency   Percent
M       //////    6           24%
S       ///////   7           28%
D       ///////   7           28%
W       /////     5           20%
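The tally above can be reproduced programmatically. The following is a minimal sketch in Python, assuming the 25 codes are typed in exactly as listed in the example; the variable names are illustrative only.

```python
from collections import Counter

# The 25 marital-status codes from the example, row by row.
statuses = "S D W D M S M M M S D S M M S D D S S W W W D D W".split()

counts = Counter(statuses)   # frequency of each category
n = len(statuses)            # total number of patients

# Build (frequency, percent) pairs in the order used by the table.
table = {code: (counts[code], 100 * counts[code] / n) for code in "MSDW"}

for code, (freq, pct) in table.items():
    print(f"{code}  {freq:>2}  {pct:.0f}%")
```

The percentages are each frequency divided by the total of 25, multiplied by 100.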
Ungrouped frequency distribution
• Is a table of all the potential raw score values
• Often constructed for a small data set or for data on a
discrete variable.
E.g. The following data represent the weights of
12 clients in a nutrition consulting clinic.
80 76 90
70 60 62
63 60 63
76 70 70
Solution
Make a table as shown
Weight   Tally   Frequency
60       //      2
62       /       1
63       //      2
70       ///     3
76       //      2
80       /       1
90       /       1
Grouped frequency Distribution
• When the range of the data is large, the data
must be grouped into classes that are more
than one unit in width
Q: Construct a frequency distribution for the
following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solution
• Then, draw
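One possible way to group the 20 values of this exercise is sketched below in Python. The class width of 6 and the choice to start the first class at the smallest value are illustrative assumptions, not taken from the slide, and `grouped_frequency` is a hypothetical helper name.

```python
def grouped_frequency(values, width, start=None):
    """Count integer values falling in consecutive classes of equal width."""
    lower = min(values) if start is None else start
    largest = max(values)
    table = []
    while lower <= largest:
        upper = lower + width - 1  # inclusive upper class limit for integer data
        freq = sum(lower <= v <= upper for v in values)
        table.append((lower, upper, freq))
        lower += width
    return table

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]

# Assumed class width of 6, starting at the minimum value (6).
classes = grouped_frequency(data, width=6)
for lower, upper, freq in classes:
    print(f"{lower:>2}-{upper:<2}  {freq}")
```

A different width or starting point gives a different but equally valid table, as long as the classes do not overlap and together cover every observation.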
Bar chart
• Is the most widely used graphical method for
describing qualitative data.
• A set of bars representing some magnitude
over time or space.
• The common types of bar chart
– Simple
– Multiple
– Component … etc
Simple bar chart
E.g. Distribution of decayed teeth among
children of a primary school
Multiple bar chart
E.g. Distribution of marital status by sex
[Figure: multiple bar chart of marital status by sex; vertical axis in percent (%); legend: Male, Female]
• Objectives
• To understand the data easily
• To facilitate comparison
• To enable further statistical analysis
Types of measures of central tendency (MCT)
• The Mean (Arithmetic, Geometric and
Harmonic)
• The Mode
• The Median
• Quantiles (Quartiles, deciles and percentiles)
Mean for Ungrouped data
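For reference, the arithmetic mean of $n$ ungrouped observations $X_1, X_2, \dots, X_n$ is usually written as follows. The notation $\bar{X}$ is the standard textbook convention and is assumed here rather than taken from the slide.

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{X_1 + X_2 + \dots + X_n}{n}$$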
Mean for grouped data
• If data are given in the shape of a continuous
frequency distribution, then the mean is
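In standard notation (the symbols are assumed here, not taken from the slide), the mean of grouped data is the frequency-weighted average of the class midpoints, where $X_i$ is the midpoint of class $i$, $f_i$ is its frequency, and $k$ is the number of classes:

$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$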
Example: calculate the mean for the
following data
The Mode ($\hat{X}$)
• The mode is the value that occurs most frequently in
a set of values
• The mode may not exist, and even if it does
exist, it may not be unique.
• In the case of a discrete distribution, the value with
the maximum frequency is the modal value.
• The mode of a set of numbers X1, X2, X3,…Xn is
The Median ($\tilde{X}$)
• In a distribution, the median is the value of the
variable that divides it into two equal
halves.
• In an ordered series of data, the median is the
observation lying exactly in the middle of the
series.
Example:
Find the median of the following numbers.
a) 6, 5, 2, 8, 9, 4
b) 2, 1, 8, 3, 5
Solution:
a) First order the data: 2, 4, 5, 6, 8, 9
Here n=6, which is even
b) Order the data: 1, 2, 3, 5, 8
Here n=5, which is odd
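The two cases can be completed with the usual positional rule: for odd n the median is the middle ordered value, and for even n it is the mean of the two middle ordered values. The sketch below applies that rule in Python and cross-checks it against the standard library; `median_by_position` is a hypothetical helper name.

```python
from statistics import median

def median_by_position(values):
    """Median via the middle position(s) of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: mean of the two middle values

a = [6, 5, 2, 8, 9, 4]
b = [2, 1, 8, 3, 5]

median_a = median_by_position(a)  # (5 + 6) / 2 = 5.5
median_b = median_by_position(b)  # middle value of 1, 2, 3, 5, 8 is 3

print(median_a, median_b)
print(median(a), median(b))  # the standard library gives the same results
```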
Measures of variation (MV)
• The spread of items of a distribution is known
as dispersion or variation.
• In other words, the degree to which numerical
data tend to spread about an average value is
called dispersion or variation of the data.
Objectives of measures of variation
• To judge the reliability of MCT
• To control variability itself
• To compare two or more groups of numbers in terms
of their variability
• To enable further statistical analysis
Types of Measures of Dispersion
• The most commonly used measures of
dispersions are:
– Range and relative range
– Standard deviation and coefficient of
variation
– Quartile deviation and coefficient of
Quartile deviation
The Range
• The range is the largest score minus the
smallest score.
• It is a quick and dirty measure of variability.
• It is greatly affected by extreme scores.
Example: 32 35 36 42 42 43 43 45
Range is 45-32=13
Mean Deviation
• Is the arithmetic mean of the absolute deviations
of the values from a given average
• Depending on the type of average used,
we have different mean deviations
• The mean deviation for raw data and for a
frequency distribution, respectively, is as follows:
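In standard notation, with $A$ denoting the chosen average (mean, median, or mode) and $f_i$ the frequency of value or class midpoint $X_i$, the two mean deviations are commonly written as below. The symbols $M.D$ and $A$ are assumed here for illustration.

$$M.D(A) = \frac{\sum_{i=1}^{n} \lvert X_i - A \rvert}{n} \qquad \text{(raw data)}$$

$$M.D(A) = \frac{\sum_{i=1}^{k} f_i \lvert X_i - A \rvert}{\sum_{i=1}^{k} f_i} \qquad \text{(frequency distribution)}$$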
The variance and standard deviation
Population Variance:
• If we divide the variation (the sum of squared
deviations from the mean) by the number of
values in the population, we get the
population variance.
• This variance is the "average squared
deviation from the mean"
• And for frequency distribution
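Written out in standard notation (assumed here, with $\mu$ the population mean and $N$ the population size), the population variance for raw data and for a frequency distribution is:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \qquad \text{(raw data)}$$

$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{N}, \quad N = \sum_{i=1}^{k} f_i \qquad \text{(frequency distribution)}$$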
Sample Variance
• One might expect the sample variance simply to be the population
variance with the population mean replaced by the sample mean.
• However, one of the major uses of statistics is
to estimate the corresponding parameter, and dividing
by the sample size tends to underestimate the population variance.
• To counteract this, the sum of the squares of
the deviations is divided by one less than the
sample size
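Sample variance formula
For raw data:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Or, as a shorthand formula:
$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1}$$
For a frequency distribution:
$$S^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{n - 1}, \quad n = \sum_{i=1}^{k} f_i$$
Or, as a shorthand formula:
$$S^2 = \frac{\sum_{i=1}^{k} f_i X_i^2 - n\bar{X}^2}{n - 1}$$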
Standard deviation
• It is the square root of variance
• Population standard deviation
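In symbols, taking $\sigma^2$ as the population variance and $S^2$ as the sample variance (standard notation, assumed here), the two standard deviations are:

$$\sigma = \sqrt{\sigma^2} \qquad \text{and} \qquad S = \sqrt{S^2}$$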
Examples:
• Find the variance and standard deviation of
the following sample data
1. 5, 17, 12, 10.
2. The data is given in the form of frequency
distribution
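For the first sample (5, 17, 12, 10), the calculation can be traced step by step. The sketch below divides the sum of squared deviations by n - 1, as described for the sample variance, and cross-checks the result against Python's `statistics` module; the variable names are illustrative.

```python
from statistics import mean, stdev, variance

sample = [5, 17, 12, 10]
n = len(sample)

xbar = mean(sample)                                # sample mean: 44 / 4 = 11
squared_devs = [(x - xbar) ** 2 for x in sample]   # 36, 36, 1, 1
ss = sum(squared_devs)                             # sum of squared deviations: 74

s2 = ss / (n - 1)   # sample variance: 74 / 3, approximately 24.67
s = s2 ** 0.5       # sample standard deviation, approximately 4.97

print(f"mean = {xbar}, variance = {s2:.2f}, sd = {s:.2f}")

# statistics.variance and statistics.stdev also use the n - 1 divisor.
print(f"library: variance = {variance(sample):.2f}, sd = {stdev(sample):.2f}")
```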
Coefficient of Variation (C.V)
• Is defined as the ratio of the standard deviation
to the mean, usually expressed as a percentage.
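In standard notation (assumed here), with $S$ the standard deviation and $\bar{X}$ the mean of the same data set:

$$C.V = \frac{S}{\bar{X}} \times 100\%$$

Because the units cancel, the coefficient of variation allows the variability of groups with different means or different units to be compared.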
Example:
• An analysis of the monthly wages paid to
workers in two departments, Pedi (A) and Ortho (B),
belonging to the same campus gives the
following results
Value Dep’t A Dep’t B
Standard scores (Z-scores)
• Z gives the deviation from the mean in units
of standard deviation
• Z gives the number of standard deviations a
particular observation lies above or below the
mean.
• It is used to compare two observations
coming from different groups
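The properties listed above correspond to the usual definition of the standard score. In standard notation (assumed here), for an observation $X$ from a sample with mean $\bar{X}$ and standard deviation $S$:

$$Z = \frac{X - \bar{X}}{S}$$

For a population, $\mu$ and $\sigma$ take the place of $\bar{X}$ and $S$. A positive $Z$ means the observation lies above the mean, and a negative $Z$ means it lies below.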
Examples:
1. Two sections were given Biostatistics
examinations. The following information was
given.
Value HO (Sec1) Nursing (Sec2)
Mean 78 90
Sd 6 5
• Measures of kurtosis
– Leptokurtic
– Mesokurtic
– Platykurtic