

Moataza Mahmoud Abdel Wahab, Lecturer of Biostatistics, High Institute of Public Health, University of Alexandria

Important statistical terms

Population: the set of all measurements of interest to the researcher (the collection of all responses, measurements, or counts that are of interest)


Sample: A subset of the population

Why sampling?
Get information about large populations
Lower cost
Less field time
More accuracy, i.e. can do a better job of data collection
When it's impossible to study the whole population

Target population: the population to be studied, to which the investigator wants to generalize the results
Sampling unit: the smallest unit from which the sample can be selected
Sampling frame: the list of all sampling units from which the sample is drawn
Sampling scheme: the method of selecting sampling units from the sampling frame

Types of sampling

Non-probability samples

Probability samples

Non probability samples

Convenience samples (ease of access): the sample is selected from elements of the population that are easily accessible
Snowball sampling (friend of a friend, etc.)
Purposive sampling (judgemental): you choose who you think should be in the study
Quota sample

Non probability samples

Probability of being chosen is unknown
Cheaper, but unable to generalise
Potential for bias

Probability samples

Random sampling

Each subject has a known probability of being selected

Allows application of statistical sampling theory to results to:

Generalise
Test hypotheses


Probability samples are the best

Representativeness
Precision

Methods used in probability samples

Simple random sampling
Systematic sampling
Stratified sampling
Multi-stage sampling
Cluster sampling

Simple random sampling

Table of random numbers

684257954125632140 582032154785962024 362333254789120325 985263017424503686
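As a minimal sketch of simple random sampling, the draw below uses Python's pseudo-random generator in place of a printed table of random numbers. The function name `simple_random_sample`, the frame of 500 numbered units, and the sample size of 20 are assumptions made for illustration only.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    # sample() draws without replacement, so each unit in the frame
    # has the same chance of being selected.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Hypothetical sampling frame of 500 numbered units, sample of 20.
chosen = simple_random_sample(500, 20, seed=1)
print(chosen)
```

Each number in `chosen` identifies one unit in the numbered sampling frame.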

Systematic sampling
Sampling fraction: the ratio between sample size and population size
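A common way to carry out systematic sampling, sketched here under stated assumptions, is to take the sampling interval k as roughly the inverse of the sampling fraction, pick a random start between 1 and k, and then select every k-th unit. The function name `systematic_sample` and the frame of 1000 units are hypothetical.

```python
import random


def systematic_sample(population_size, sample_size, seed=None):
    """Select every k-th unit after a random start (units numbered from 1)."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    rng = random.Random(seed)
    # Sampling interval k: the whole-number inverse of the sampling fraction.
    interval = population_size // sample_size
    start = rng.randint(1, interval)  # random start within the first interval
    return [start + i * interval for i in range(sample_size)]


# Hypothetical frame: 1000 units, sample of 50, so the sampling fraction is 0.05.
units = systematic_sample(1000, 50, seed=7)
sampling_fraction = 50 / 1000
print(sampling_fraction, units[:5])
```

With a sampling fraction of 0.05 the interval is 20, so consecutive selected units are 20 apart.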

Systematic sampling

Cluster sampling
Cluster: a group of sampling units close to each other, i.e. located together in the same area or neighborhood

Cluster sampling
[Figure: an area divided into Section 1, Section 2, Section 3, Section 4 and Section 5]
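One form of cluster sampling can be sketched as follows: whole clusters are chosen at random and every unit inside a chosen cluster is included. The five sections mirror the figure labels, while the household identifiers, the six households per section, and the function name `cluster_sample` are invented for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and return every unit inside them."""
    rng = random.Random(seed)
    chosen = sorted(rng.sample(sorted(clusters), n_clusters))
    # All units of each chosen cluster enter the sample.
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical data: five sections, six households in each.
sections = {
    f"Section {i}": [f"S{i}-H{j}" for j in range(1, 7)] for i in range(1, 6)
}
chosen_sections, households = cluster_sample(sections, 2, seed=3)
print(chosen_sections, len(households))
```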

Stratified sampling

Multi-stage sampling

Errors in sample
Systematic error (or bias): inaccurate response (information bias), selection bias
Sampling error (random error)

Type 1 error

The probability of finding a difference between our sample and the population when there really isn't one. Known as α (or type 1 error)

Usually set at 5% (or 0.05)
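The meaning of the 5% level can be illustrated with a small simulation, offered as a sketch rather than a formal demonstration. Samples are drawn repeatedly from a population whose true mean is 0 and whose SD is 1, so any "significant" difference is a false positive. The sample size of 30, the 2000 repetitions, and the two-sided z-test with known SD are assumptions of this example.

```python
import random
from statistics import NormalDist


def type1_error_rate(n=30, alpha=0.05, trials=2000, seed=0):
    """Share of samples that wrongly show a difference when none exists."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    false_positives = 0
    for _ in range(trials):
        # The sample really does come from the population (mean 0, SD 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # (mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            false_positives += 1
    return false_positives / trials


rate = type1_error_rate()
print(f"Observed type 1 error rate: {rate:.3f}")
```

The observed rate should fall close to the chosen α of 0.05.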

Type 2 error

The probability of not finding a difference that actually exists between our sample and the population
Known as β (or type 2 error)
Power is (1 − β) and is usually 80%

Sample size
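Estimating a mean:
$$n = \frac{Z^2 \sigma^2}{D^2}$$

Estimating a proportion:
$$n = \frac{Z^2 \pi (1 - \pi)}{D^2}$$

Comparing two means:
$$n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$$

Comparing two proportions:
$$n = \frac{2 \bar{P} (1 - \bar{P}) F}{D^2}$$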

Problem 1
A study is to be performed to determine a certain parameter in a community. A previous study obtained an SD of 46. If a sampling error of up to 4 is acceptable, how many subjects should be included in this study at the 99% level of confidence?

$$n = \frac{Z^2 \sigma^2}{D^2}$$

$$n = \frac{2.58^2 \times 46^2}{4^2} = 880.3 \approx 881$$

Problem 2

A study is to be done to determine the effect of 2 drugs (A and B) on blood glucose level (BGL). Previous studies using those drugs obtained SDs of BGL of 8 and 12 mg/dl respectively. A confidence level of 95% and a power of 90% are required to detect a mean difference of 3 mg/dl between the two groups. How many subjects should be included in each group?


$$n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$$

$$n = \frac{(8^2 + 12^2) \times 10.5}{3^2} = 242.7 \approx 243 \text{ in each group}$$
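The slides take the factor F as a given value. A reasonable reading, stated here as an assumption rather than as the author's definition, is that F equals the squared sum of the standard normal deviates for the two-sided significance level and for the power. The sketch below uses a hypothetical helper `f_factor` and reproduces the tabulated values for 95% confidence at 90% and 80% power.

```python
from statistics import NormalDist


def f_factor(alpha, power):
    """(z_alpha/2 + z_beta)^2, an assumed definition of the slides' factor F."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided deviate, about 1.96 at 5%
    z_beta = z.inv_cdf(power)  # about 1.28 at 90% power, 0.84 at 80%
    return (z_alpha + z_beta) ** 2


f_90 = f_factor(0.05, 0.90)
f_80 = f_factor(0.05, 0.80)
print(f"F at 90% power: {f_90:.2f}")
print(f"F at 80% power: {f_80:.2f}")
```

Rounded to one decimal place, these match the values 10.5 and 7.8 used in the worked problems.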

Problem 3
It was desired to estimate the proportion of anaemic children in a certain preparatory school. A similar study at another school found a proportion of 30%. Compute the minimal sample size required at a confidence level of 95%, accepting a difference of up to 4% from the true population proportion.

$$n = \frac{Z^2 \pi (1 - \pi)}{D^2}$$

$$n = \frac{1.96^2 \times 0.3 \times (1 - 0.3)}{(0.04)^2} = 504.21 \approx 505$$


Problem 4
In previous studies in a certain community, the percentage of hypertensives was 70% among diabetics and 40% among non-diabetics. A researcher wants to perform a comparative study of hypertension among diabetics and non-diabetics at a confidence level of 95% and a power of 80%. What is the minimal sample to be taken from each group, with an accepted difference of 4% from the true value?

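$$n = \frac{2 \bar{P} (1 - \bar{P}) F}{D^2}, \qquad \bar{P} = \frac{0.70 + 0.40}{2} = 0.55$$

$$n = \frac{2 \times 0.55 \times (1 - 0.55) \times 7.8}{(0.04)^2} \approx 2413$$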
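The arithmetic in the four problems can be re-checked with a short script. This is a sketch only: the function names are hypothetical, the values of Z and F are taken directly from the worked problems, and the functions return the unrounded n so that rounding is left to the reader.

```python
def n_single_mean(z, sd, d):
    """Sample size to estimate a mean: Z^2 * sigma^2 / D^2."""
    return z ** 2 * sd ** 2 / d ** 2


def n_two_means(sd1, sd2, f, d):
    """Sample size per group to compare two means."""
    return (sd1 ** 2 + sd2 ** 2) * f / d ** 2


def n_single_proportion(z, p, d):
    """Sample size to estimate a proportion: Z^2 * p * (1 - p) / D^2."""
    return z ** 2 * p * (1 - p) / d ** 2


def n_two_proportions(p1, p2, f, d):
    """Sample size per group to compare two proportions."""
    p_bar = (p1 + p2) / 2  # average of the two expected proportions
    return 2 * p_bar * (1 - p_bar) * f / d ** 2


results = {
    "Problem 1": n_single_mean(2.58, 46, 4),
    "Problem 2": n_two_means(8, 12, 10.5, 3),
    "Problem 3": n_single_proportion(1.96, 0.30, 0.04),
    "Problem 4": n_two_proportions(0.70, 0.40, 7.8, 0.04),
}
for name, n in results.items():
    print(f"{name}: n = {n:.1f}")
```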

Precision Cost
