Introduction To Biostatistics: Data Collection Descriptive Statistics
Introduction to Biostatistics
Data Collection
Descriptive Statistics
Sample 1: Representative? Y N
Sample 2: Representative? Y N
Sample 3: Representative? Y N
Sampling Approaches
Convenience Sampling: select the most accessible and available subjects in the target population. Inexpensive and less time-consuming, but the sample is nearly always non-representative of the target population.
Hypothesis
Data collection
Presentation of data
Data analysis
Interpretation of data
Polgar, Thomas
Types of Data Collection
Surveys/Questionnaires
Self-report
Interviewer-administered
Proxy
Direct medical examination
Direct measurement (e.g. blood draws)
Administrative records
Understanding and Presenting Data
Types of Data
Frequency Table
Frequency Histogram
Relative Frequency Histogram
Frequency polygon
Relative Frequency polygon
Bar chart
Pie chart
Box plot
Frequency Table
Generally, the first approach to examining your data.
Identifies distribution of variables overall
Identifies potential outliers
Investigate outliers as possible data entry errors
Investigate a sample of others for data entry errors
Frequency Table
The measurements are: 42, 38, 51, 53, 40, 68, 62,
36, 32, 45, 51, 67, 53, 59, 47, 63, 52, 64, 61, 43, 56,
58, 66, 54, 56, 52, 40, 55, 72, 69.
Age group    Frequency    Relative frequency
32-36 yr     2            2/30 = 0.067
37-41 yr     3            3/30 = 0.100
42-46 yr     3            3/30 = 0.100
47-51 yr     3            3/30 = 0.100
52-56 yr     8            8/30 = 0.267
57-61 yr     3            3/30 = 0.100
62-66 yr     4            4/30 = 0.133
67-72 yr     4            4/30 = 0.133
Total        n = 30
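As a rough illustration of how such a table can be tallied, the sketch below counts the 30 listed measurements into the same age groups. The function name `frequency_table` and the variable names are illustrative choices, not part of the slides.

```python
from collections import Counter

# The 30 measurements listed on the slide.
ages = [42, 38, 51, 53, 40, 68, 62, 36, 32, 45, 51, 67, 53, 59, 47,
        63, 52, 64, 61, 43, 56, 58, 66, 54, 56, 52, 40, 55, 72, 69]

# Age groups as inclusive (lower, upper) bounds, one per table row.
groups = [(32, 36), (37, 41), (42, 46), (47, 51),
          (52, 56), (57, 61), (62, 66), (67, 72)]


def frequency_table(values, bins):
    """Return (label, frequency, relative frequency) for each bin."""
    counts = Counter()
    for v in values:
        for lo, hi in bins:
            if lo <= v <= hi:
                counts[(lo, hi)] += 1
                break
    n = len(values)
    return [(f"{lo}-{hi} yr", counts[(lo, hi)], counts[(lo, hi)] / n)
            for lo, hi in bins]


table = frequency_table(ages, groups)
for label, freq, rel in table:
    print(f"{label:<10} {freq:>3} {rel:>7.3f}")
print(f"Total n={sum(freq for _, freq, _ in table)}")
```

Tallying by program is also a convenient check for data entry errors: a value that falls in no group, or a total that differs from the expected n, shows up immediately.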
Frequency Polygon
Used to identify the distribution of your data.
[Figure: frequency polygon. Y-axis: Frequency (0 to 9). X-axis: Age in years (groups from 20- to 60-69). Separate lines for Female and Male.]
Table 1 in a paper
Describe your study population in a frequency table.
Table Title
Name of variable (units of variable)    Frequency (n)    %    Mean (SD)
  -
  - Categories
  -
Total
Measures of Central Tendency
1. Mean
2. Median
3. Mode
Sample Mean
The arithmetic mean (or, simply, mean) is
computed by summing all the observations in the
sample and dividing the sum by the number of
observations.
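A minimal sketch of the three measures of central tendency, using the five observations from the variance example in these slides. The variable names are illustrative, and the median and mode come from Python's standard `statistics` module.

```python
import statistics

# Five observations taken from the variance example in these slides.
values = [6000, 10000, 10000, 14000, 50000]

# Mean: sum of the observations divided by the number of observations.
mean = sum(values) / len(values)

# Median: middle value of the sorted observations.
median = statistics.median(values)

# Mode: most frequently occurring value.
mode = statistics.mode(values)

print(f"mean={mean}, median={median}, mode={mode}")
```

In this sample the single large observation (50000) pulls the mean well above the median, which is one reason to report both.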
Measures of non-central locations
Quartiles
Quintiles
Percentiles
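These cut points can be sketched with `statistics.quantiles` from the Python standard library, applied here to the 30 measurements from the frequency table example. Several interpolation conventions exist for quantiles; this sketch simply uses the function's default method, which is an assumption rather than something the slides specify.

```python
import statistics

# The 30 measurements from the frequency table example.
ages = [42, 38, 51, 53, 40, 68, 62, 36, 32, 45, 51, 67, 53, 59, 47,
        63, 52, 64, 61, 43, 56, 58, 66, 54, 56, 52, 40, 55, 72, 69]

# n equal-sized groups are separated by n - 1 cut points.
quartiles = statistics.quantiles(ages, n=4)      # Q1, Q2, Q3
quintiles = statistics.quantiles(ages, n=5)      # 4 cut points
percentiles = statistics.quantiles(ages, n=100)  # 99 cut points

print("Quartiles:", quartiles)
print("Quintiles:", quintiles)
print("90th percentile:", percentiles[89])
```

The second quartile and the 50th percentile are both the median, so they give a quick consistency check on any quantile routine.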
Measures of Dispersion or Variability
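Variance
$$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} $$
where $x_i$ are the observations, $\bar{x}$ is the sample mean, and $n$ is the number of observations.
s = standard deviation (square root of variance)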
Calculation of Variance and Standard deviation
$$ s^2 = \frac{(6000-18000)^2 + (10000-18000)^2 + (10000-18000)^2 + (14000-18000)^2 + (50000-18000)^2}{5-1} = 328{,}000{,}000 $$
$$ s = \sqrt{328{,}000{,}000} \approx 18110.77 $$
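The same calculation can be sketched step by step in Python. The variable names are illustrative; the five observations and the mean of 18000 are the ones used in the calculation above.

```python
import math

# The five observations from the worked calculation.
values = [6000, 10000, 10000, 14000, 50000]

n = len(values)
mean = sum(values) / n

# Squared deviation of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]

# Sample variance divides by n - 1, not n.
variance = sum(squared_deviations) / (n - 1)

# Standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print(f"mean = {mean:,.0f}")
print(f"variance = {variance:,.0f}")
print(f"SD = {sd:.2f}")
```

Because the deviations are squared, the variance is in squared units; taking the square root returns the standard deviation to the units of the original observations.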
Mean and Standard deviation (SD)
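Three samples of six observations, each with the same mean but a different spread:
7, 7, 7, 7, 7, 7      Mean = 7, SD = 0
8, 7, 7, 7, 7, 6      Mean = 7, SD = 0.63
3, 2, 7, 8, 13, 9     Mean = 7, SD = 4.05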
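Taking the slide's values as three samples of six observations each (the grouping is an assumption read off the figure), the sketch below shows that samples with an identical mean can have very different standard deviations. The dictionary keys are illustrative labels, not names from the slides.

```python
import statistics

# Three samples with the same mean; the grouping of values is an
# assumption reconstructed from the slide's figure.
samples = {
    "no spread": [7, 7, 7, 7, 7, 7],
    "small spread": [8, 7, 7, 7, 7, 6],
    "large spread": [3, 2, 7, 8, 13, 9],
}

summary = {}
for name, data in samples.items():
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample SD, with n - 1 in the denominator
    summary[name] = (mean, sd)
    print(f"{name:<13} mean = {mean}  SD = {sd:.2f}")
```

The mean alone cannot distinguish these samples, which is why a measure of dispersion is normally reported alongside it.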
Empirical Rule
For a Normal distribution approximately,