1 Biostatistics LECTURE 1
3. Probability Distributions (Discrete & Continuous)
Sampling distribution
• Normal distribution
• Standard normal distribution
5. Hypothesis testing
6. Parametric and Non-parametric statistics
7. Introduction to statistical software (STATA)
Article for Review
Introduction to Biostatistics
BIOSTATISTICS
• It is the science which deals with development and application of the
most appropriate methods for the:
Collection of data.
Analysis and interpretation of the results.
Making decisions on the basis of such analysis
Sources of data
Comprehensive Sample
Sources of Data
2- External sources:
The data needed to answer a question may already exist in the form of published
reports, commercially available data banks (e.g. GenBank), or the research literature, i.e.
someone else has already asked the same question.
3- Surveys:
A survey may be the source when the data needed are answers to specific questions.
4- Experiments:
Frequently the data needed to answer a question are available only as the result of an
experiment.
Methods of presentation of data
1. Numerical presentation
2. Graphical presentation
3. Mathematical presentation
1- Numerical presentation
Tabular presentation (simple – complex)
(Table layout: one row per category, followed by a Total row.)
Table: Distribution of 50 patients at the surgical department of KATH in May 2010 according to their ABO blood groups
Table: Distribution of 20 lung cancer patients at the chest department of KATH and 40 controls in May 2***

Smoking        Lung cancer cases    Controls        Total
               No.    %             No.    %        No.    %
Smoker         15     75%           8      20%      23     38.33%
Non-smoker     5      25%           32     80%      37     61.67%
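The percentages in the smoking table can be reproduced from the counts alone. The following is a minimal sketch: the dictionary layout and the function name `column_percentages` are illustrative choices, not part of the lecture. Each group percentage is taken relative to its own column total (20 cases, 40 controls), and the Total percentage relative to all 60 subjects.

```python
# Counts taken from the smoking table above.
smoking_counts = {
    "Smoker": {"cases": 15, "controls": 8},
    "Non-smoker": {"cases": 5, "controls": 32},
}


def column_percentages(table):
    """Return row totals and each cell as a percentage of its column total."""
    n_cases = sum(row["cases"] for row in table.values())
    n_controls = sum(row["controls"] for row in table.values())
    n_all = n_cases + n_controls
    result = {}
    for label, row in table.items():
        row_total = row["cases"] + row["controls"]
        result[label] = {
            "cases_pct": round(100 * row["cases"] / n_cases, 2),
            "controls_pct": round(100 * row["controls"] / n_controls, 2),
            "total": row_total,
            "total_pct": round(100 * row_total / n_all, 2),
        }
    return result


percentages = column_percentages(smoking_counts)
for label, row in percentages.items():
    print(label, row)
```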
• Data (singular): The value of the variable associated with one element of a population or sample.
This value may be a number, a word, or a symbol.
• Data (plural): The set of values collected for the variable from each of the elements belonging to
the sample.
Variable
  Qualitative: Nominal, Binary, Ordinal
  Quantitative: Discrete, Continuous
Nominal Variable: A qualitative variable that categorizes (or describes, or
names) an element of a population.
• A Parameter:
It is a descriptive measure computed from the data of a
population.
Because a parameter is usually difficult to measure for the whole
population, a sample of size n is drawn, with values
$x_1, x_2, \ldots, x_n$. From these data we compute the statistic.
Measures of Central Tendency
The Mean (average), the Median (midpoint), and the Mode (most frequently occurring
number).
The Mean:
It is the average of the data.
The Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where N is the population size.
Example:
Here is a random sample of size 10 of ages, where
$x_1 = 42$, $x_2 = 28$, $x_3 = 28$, $x_4 = 61$, $x_5 = 31$,
$x_6 = 23$, $x_7 = 50$, $x_8 = 34$, $x_9 = 32$, $x_{10} = 37$.
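The mean of this sample can be checked with a few lines of Python. This is a small sketch; the variable names `ages` and `sample_mean` are illustrative.

```python
# The ten ages from the example above.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

# Sample mean: the sum of the observations divided by the sample size n.
sample_mean = sum(ages) / len(ages)

print(sum(ages))     # 366
print(sample_mean)   # 36.6
```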
Example: Assume the values are 115, 110, 119, 117, 121 and 126.
The mean = 118.
But assume that the values are 75, 75, 80, 80 and 280. The mean =
118, a value that is not representative of the set of data as a
whole.
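The effect of the extreme value 280 can be seen by comparing the mean with the median for both sets. This is a sketch using Python's standard `statistics` module; the list names are illustrative.

```python
from statistics import mean, median

typical = [115, 110, 119, 117, 121, 126]
with_outlier = [75, 75, 80, 80, 280]

# Both sets have the same mean ...
print(mean(typical), mean(with_outlier))

# ... but the median of the second set stays close to most of its values.
print(median(typical), median(with_outlier))
```

Both means equal 118, while the medians are 118 and 80. The median is therefore the more representative summary for the second set.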
The Median:
When the data are ordered, the median is the observation that divides the set into
two equal parts: half of the observations lie before it and half lie after it.
* If n is odd, the median is the middle observation, i.e. the (n+1)/2 th
ordered observation.
When n = 11, the median is the 6th observation.
* If n is even, there are two middle observations, and the median is the mean of
these two. Its position is again (n+1)/2.
When n = 12, the position is 6.5, i.e. the median is the value
halfway between the 6th and 7th ordered observations.
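The position rule can be sketched as a short function. The name `median_by_position` is hypothetical. The function sorts the data, uses the middle observation when n is odd, and averages the two middle observations when n is even.

```python
def median_by_position(values):
    """Median via the (n + 1) / 2 position rule on the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the (n + 1) / 2 th observation (1-based) is the middle one.
        return ordered[(n + 1) // 2 - 1]
    # Even n: position (n + 1) / 2 falls halfway between the
    # (n / 2) th and (n / 2 + 1) th observations, so average them.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


# n = 11: the 6th ordered observation.
print(median_by_position(range(1, 12)))   # 6
# n = 12: halfway between the 6th and 7th ordered observations.
print(median_by_position(range(1, 13)))   # 6.5

# The sample of ten ages used earlier in the lecture.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
print(median_by_position(ages))           # 33.0
```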
The Mode:
It is the most frequently occurring value.
Example:
For the same random sample, the value 28 occurs twice and every other
value occurs once, so 28 is the mode.
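The mode of the age sample can be found by counting how often each value occurs. This is a sketch using `collections.Counter`; the variable names are illustrative. It returns a list because a data set may have more than one mode.

```python
from collections import Counter

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

# Count occurrences of each value, then keep the value(s) with the top count.
frequency = Counter(ages)
highest_count = max(frequency.values())
modes = [value for value, count in frequency.items() if count == highest_count]

print(modes, highest_count)   # [28] 2
```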
Note:
• The range depends on only two values: the largest and the smallest.
• Data: 43, 66, 61, 64, 65, 38, 59, 57, 57, 50.
• Find the range.
• Range = 66 - 38 = 28
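The range calculation above, as a short sketch. The second part shows that the range ignores everything except the two extreme values. The reduced list `extremes_only` is an illustration, not part of the lecture.

```python
data = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]

# Range = largest value - smallest value.
data_range = max(data) - min(data)
print(data_range)   # 28

# Dropping all values except the two extremes leaves the range unchanged.
extremes_only = [38, 66]
print(max(extremes_only) - min(extremes_only))   # 28
```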
Boxplot
The Variance (how far the values of x are spread out)
• It measures dispersion relative to the scatter of the values about the mean.
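• a) Sample Variance ($s^2$):
• $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$, where $\bar{x}$ is the sample mean
• b) Population Variance ($\sigma^2$):
• $\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$, where µ is the population mean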
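A worked sketch of both variances, assuming the usual definitions: the sum of squared deviations from the mean is divided by n - 1 for a sample and by N for a population. The data are the ten ages used earlier in the lecture. Treating them as a whole population in the second calculation is purely for illustration.

```python
from statistics import pvariance, variance

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
n = len(ages)
x_bar = sum(ages) / n

# Sum of squared deviations about the mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in ages)

# Sample variance divides by n - 1; population variance divides by N.
sample_var = sum_sq_dev / (n - 1)
population_var = sum_sq_dev / n

print(round(sample_var, 4), round(population_var, 4))

# Cross-check against the standard library implementations.
print(round(variance(ages), 4), round(pvariance(ages), 4))
```

The sample variance is always the larger of the two, because the same sum is divided by a smaller number.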
• If mean = median = mode, the distribution is symmetric (as in a normal distribution)
• If mean < median < mode, the distribution is negatively skewed
• If mean > median > mode, the distribution is positively skewed
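These three rules can be sketched as a small classifier. The function name `skew_from_averages` and the label strings are hypothetical. The equal case is labelled "symmetric" because equality of the three averages suggests symmetry but does not by itself prove normality. Any other ordering is reported as having no clear pattern.

```python
from statistics import mean, median, mode


def skew_from_averages(mean_value, median_value, mode_value):
    """Rule-of-thumb skewness label from the ordering of the three averages."""
    if mean_value == median_value == mode_value:
        return "symmetric"
    if mean_value < median_value < mode_value:
        return "negatively skewed"
    if mean_value > median_value > mode_value:
        return "positively skewed"
    return "no clear pattern"


# The sample of ten ages used earlier: mean 36.6 > median 33 > mode 28.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
label = skew_from_averages(mean(ages), median(ages), mode(ages))
print(label)
```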
Standard Normal Distribution
• $\bar{x}$: Sample mean.