8 Sampling


Sampling
Probability sampling: based on random selection
Non-probability sampling: based on convenience

Sampling Miscues: Alf Landon for President (1936)


Literary Digest: sent post cards to voters in 6 states
Correctly predicted elections from 1920 to 1932
Names selected from telephone directories and automobile registrations
In 1936, they sent out 10 million post cards
Results picked Landon over Roosevelt, 57% to 43%

Election: Roosevelt won in one of the largest landslides in U.S. history
Roosevelt took 61% of the vote and won the Electoral College 523-8

Why so inaccurate?
Poor sampling frame
Led to selection of wealthy respondents

Sampling Miscues: Thomas E. Dewey for President (1948)


Gallup used quota sampling to pick the winner from 1936 to 1944
Quota sampling: matches sample characteristics to characteristics of the population
Gallup built quota samples on the basis of income

In 1948, Gallup picked Dewey to defeat Truman
Reasons:
1. Most pollsters quit polling in October
2. Undecided voters went for Truman
3. Unrepresentative samples: WWII had changed society since the census

Non-probability Sampling
Used in situations where a sampling frame for randomization doesn't exist

Types of non-probability samples:
1. Reliance on available subjects (convenience sampling)
2. Purposive or judgmental sampling
3. Snowball sampling
4. Quota sampling

Reliance on Available Subjects


Respondents who are easily accessible
Examples: mall intercepts, college students, people on the street

Frequently used, but usually biased
Notoriously inaccurate
Especially in making inferences about a larger population

Purposive or Judgmental Sampling


Dictated by the purpose of the study
Situational judgments about which individuals should be surveyed to make for a useful or representative sample

E.g., using college students to study third-person effects regarding rap and metal music
Third-person effect (3pe): belief that others are more affected by exposure than oneself
Assessing effects on self and others
Using college students makes for homogeneity of self

Snowball Sampling
Used when population of interest is difficult to locate
E.g., homeless people
Researcher collects data from a few people in the targeted group
Initially surveyed individuals are asked to name other people to contact
Good for exploration
Bad for generalizability

Quota Sampling
Begins with a table of relevant characteristics of the population
Proportions of gender, age, education, ethnicity from census data
Selecting a sample to match those proportions

Problems:
1. Quota frame must be accurate
2. Sample is not random
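
A minimal sketch of how a quota table might turn into per-cell recruiting targets. The proportions in `population_share` and the function name `quota_targets` are made up for illustration; they are not census figures.

```python
# Hypothetical population proportions for two characteristics (illustrative only).
population_share = {
    ("female", "under 40"): 0.26,
    ("female", "40 and over"): 0.26,
    ("male", "under 40"): 0.25,
    ("male", "40 and over"): 0.23,
}

def quota_targets(shares, sample_size):
    """Return how many respondents to recruit in each cell of the quota table."""
    return {cell: round(share * sample_size) for cell, share in shares.items()}

targets = quota_targets(population_share, 400)
for cell, count in targets.items():
    print(cell, count)
```

With other proportions the rounded targets may not add up exactly to the sample size, and matching the table still does not make the selection random.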

Probability Sampling
Goal: Representativeness
Sample resembles larger population

Random selection
Enhancing likelihood of representative sample
Each unit of the population has an equal chance of being selected into the sample

Population Parameters
Parameter: Summary statistic for the population
E.g., Mean age of the population

Sample is used to make parameter estimates


E.g., Mean age of the sample
Used as an estimate of the population parameter

Sampling Error
Every time you draw a sample from the population, the parameter estimate will fluctuate slightly
E.g.:
Sample 1: Mean age = 37.2
Sample 2: Mean age = 36.4
Sample 3: Mean age = 38.1

If you draw lots of samples, the estimates form a normal curve of values
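
The fluctuation can be seen in a small simulation. This is a sketch under stated assumptions: the population of ages is invented (uniform between 18 and 90), and the sample size of 500 and the 1,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 ages (illustrative, not real data).
population = [random.randint(18, 90) for _ in range(100_000)]
parameter = statistics.fmean(population)  # the population parameter

def sample_mean(pop, n):
    """Draw one simple random sample of size n and return its mean."""
    return statistics.fmean(random.sample(pop, n))

# Repeat the sampling many times and keep each estimate.
estimates = [sample_mean(population, 500) for _ in range(1_000)]

print(f"population mean: {parameter:.1f}")
print("first three sample means:", [round(e, 1) for e in estimates[:3]])
print(f"mean of all estimates: {statistics.fmean(estimates):.1f}")
```

Each sample mean differs a little from the others, while the estimates as a group center on the population mean.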

Normal Curve of Sample Estimates

[Figure: normal curve of sample estimates. Vertical axis: frequency of estimated means from multiple samples. Horizontal axis: estimated mean. Label: likely population parameter.]

Standard Error
The standard deviation of the sample estimates around the population parameter, roughly their typical distance from it
About 68% of sample estimates will fall within one standard error of the population parameter
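
For reference, the usual textbook formulas for the standard error under simple random sampling are given here; the slide itself does not state them. The symbols are assumptions introduced for this note: \sigma is the population standard deviation, n the sample size, and p the population proportion.

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}},
\qquad
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p\,(1 - p)}{n}}
```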

Normal Curve of Sample Estimates

[Figure: normal curve of sample estimates. Vertical axis: frequency of estimated means from multiple samples. Horizontal axis: estimated mean. Labels: population parameter; 1 standard error unit; 2/3 of samples.]

Standard Error Estimates and Sample Size


As the sample size increases:
The standard error decreases
In other words, our sample estimate is likely to be closer to the population parameter
As the sample size increases, we get more confident in our parameter estimate
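
A short sketch of the relationship, assuming the textbook standard error of a proportion under simple random sampling. The function name and the sample sizes are chosen only for illustration.

```python
import math

def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size at each step, with a 50/50 split.
for n in (100, 400, 1600):
    se = standard_error_of_proportion(0.5, n)
    print(f"n = {n:>4}: standard error = {se:.4f}")
```

Because n sits under a square root, the sample has to be four times as large to cut the standard error in half.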

Confidence Levels
About two-thirds (68%) of sample estimates will fall within one standard error of the population parameter
Therefore: a single sample has a 68% chance of being within one standard error

Confidence levels:
68% sure estimate is within 1 s.e. of parameter
95% sure estimate is within 2 s.e. of parameter
99.7% sure estimate is within 3 s.e. of parameter
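
These shares can be checked by simulation. The sketch below assumes ages spread uniformly between 18 and 90 (an invented population) and counts how many of 5,000 sample means land within 1, 2, and 3 standard errors of the true mean.

```python
import math
import random

random.seed(7)  # fixed seed so the sketch is repeatable

LOW, HIGH = 18, 90   # hypothetical age range; ages assumed uniform (illustrative)
N = 100              # size of each sample
DRAWS = 5_000        # number of repeated samples

true_mean = (LOW + HIGH) / 2
sigma = (HIGH - LOW) / math.sqrt(12)   # standard deviation of a uniform distribution
se = sigma / math.sqrt(N)              # standard error of the sample mean

def one_estimate():
    """Mean age in one sample of N independent draws."""
    return sum(random.uniform(LOW, HIGH) for _ in range(N)) / N

estimates = [one_estimate() for _ in range(DRAWS)]

def share_within(k):
    """Fraction of sample means within k standard errors of the true mean."""
    return sum(abs(e - true_mean) <= k * se for e in estimates) / DRAWS

for k in (1, 2, 3):
    print(f"within {k} s.e.: {share_within(k):.3f}")
```

The simulated shares should land close to the normal-curve values, with small differences from run to run if the seed is changed.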

Confidence Interval
The range of values that we are 95% confident contains the population parameter

For example, we predict that Candidate X will receive 45% of the vote, plus or minus 3 percentage points
We are 95% sure the parameter will be between 42% and 48%

Confidence interval shrinks as:
Standard error gets smaller
Sample size gets larger
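
A worked version of the Candidate X example, as a sketch. The slide gives no sample size, so the 1,100 respondents below are an assumption, and the interval uses the rule of thumb of 2 standard errors for 95% confidence.

```python
import math

def confidence_interval_95(p_hat, n):
    """Approximate 95% interval for a proportion: estimate plus or minus 2 standard errors."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = 2 * se
    return p_hat - margin, p_hat + margin

# Assumed poll: 45% support among 1,100 respondents (sample size is hypothetical).
low, high = confidence_interval_95(0.45, 1100)
print(f"95% interval: {low:.1%} to {high:.1%}")
```

With these inputs the standard error is 1.5 percentage points, so the interval runs from 42% to 48%.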

Sample Size & Confidence Interval


How precise does the estimate have to be?
More precise: larger sample size

Larger samples increase precision


But at a diminishing rate
Each unit you add to your sample contributes to the accuracy of your estimate
But the amount it adds shrinks with each additional unit

95% Confidence Intervals


Interval half-width in percentage points (plus or minus), by sample size and % split

Sample Size    50/50    70/30    90/10
N = 100        10.0     9.2      6.0
N = 200         7.1     6.5      4.2
N = 300         5.8     5.3      3.5
N = 400         5.0     4.6      3.0
N = 500         4.5     4.1      2.7
N = 700         3.8     3.5      2.3
N = 1000        3.2     2.9      1.9
N = 1500        2.6     2.4      1.5
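
The table's values appear to follow the rule of plus or minus 2 standard errors of a proportion. The sketch below recomputes them on that assumption; the function name `margin_95` is introduced here for illustration. For a 90/10 split at N = 100 this formula gives 6.0.

```python
import math

def margin_95(split, n):
    """95% margin in percentage points: 2 standard errors of a proportion."""
    p = split / 100
    return 2 * math.sqrt(p * (1 - p) / n) * 100

sample_sizes = (100, 200, 300, 400, 500, 700, 1000, 1500)
print("N      50/50  70/30  90/10")
for n in sample_sizes:
    row = "  ".join(f"{margin_95(s, n):5.1f}" for s in (50, 70, 90))
    print(f"{n:<5}  {row}")
```

The margin is widest for a 50/50 split, which is why that column is often used as the conservative case when planning a sample.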

Sampling Frame
List of units from which sample is drawn
Defines your population
E.g., list of members of an organization or community
Ideally you'd like to list all members of your population as your sampling frame
Randomly select your sample from that list

Often impractical to list entire population

Sampling Frames for Surveys


Limitations of the telephone book:
Misses unlisted numbers
Class bias:
Poor people may not have a phone
Less likely to have multiple phone lines

Most studies use a technique such as Random Digit Dialing as a surrogate for a sampling frame

Types of Sampling Designs


Simple Random Sampling
Systematic Sampling
Stratified Sampling
Multi-stage Cluster Sampling

Simple Random Sampling


Establish a sampling frame
A number is assigned to each element
Numbers are randomly selected into the sample
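
The three steps can be sketched with Python's standard `random` module. The frame of 1,000 numbered elements and the sample size of 100 are assumptions for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: element numbers 1..1000.
frame = list(range(1, 1001))

# Draw 100 distinct elements; every element has the same chance of selection.
sample = random.sample(frame, 100)
print(sorted(sample)[:10])
```

`random.sample` draws without replacement, so no element can be selected twice.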

Systematic Sampling
Establish sampling frame
Select every kth element with a random start
E.g., with 1000 on the list, choosing every 10th name yields a sample size of 100

Sampling interval: standard distance between units on the sampling frame
Sampling interval = population size / sample size

Sampling ratio: proportion of the population that is selected
Sampling ratio = sample size / population size
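
A minimal sketch of the 1,000-name example. The function `systematic_sample` and the generated names are hypothetical, and the code assumes the frame size is an exact multiple of the sample size.

```python
import random

def systematic_sample(frame, sample_size, rng=random):
    """Select every k-th element of the frame after a random start within the first interval."""
    interval = len(frame) // sample_size   # sampling interval k
    start = rng.randrange(interval)        # random start among the first k positions
    return frame[start::interval][:sample_size]

frame = [f"name_{i}" for i in range(1, 1001)]   # hypothetical list of 1,000 names
sample = systematic_sample(frame, 100, random.Random(3))

print("sampling interval:", len(frame) // len(sample))
print("sampling ratio:", len(sample) / len(frame))
```

If the list itself has a repeating pattern that lines up with the interval, a systematic sample can be badly unrepresentative, so the ordering of the frame is worth checking first.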

Stratified Sampling
Modification used to reduce potential for sampling error
Researcher ensures that certain groups are represented proportionately in the sample
E.g., if the population is 60% female, a stratified sample selects 60% females into the sample
E.g., stratifying by region of the country to make sure that each region is proportionately represented

Two Methods of Stratification


1. Sort population into groups
Randomly select within groups in proportion to relative group size

2. Sort population into groups
Systematically select within groups using a random start

Disproportionate stratification:
Some stratification groups can be over-sampled for sub-group analysis
Samples are then weighted to restore population proportions
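
A sketch of the first method (random selection within groups, in proportion to group size), followed by the weighting arithmetic for a disproportionate design. The frame of 600 women and 400 men and the 50/50 over-sample are assumptions for illustration.

```python
import random

random.seed(5)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 600 women and 400 men (a 60% female population).
frame = [("female", i) for i in range(600)] + [("male", i) for i in range(400)]

def stratified_sample(units, sample_size):
    """Proportionate stratification: sample each group in proportion to its size."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[0], []).append(unit)
    sample = []
    for members in strata.values():
        k = round(sample_size * len(members) / len(units))
        sample.extend(random.sample(members, k))
    return sample

sample = stratified_sample(frame, 100)
share_female = sum(1 for g, _ in sample if g == "female") / len(sample)
print("female share of sample:", share_female)

# Disproportionate design: if men were over-sampled to 50 of 100 respondents,
# weights of (population share / sample share) would restore the 60/40 proportions.
weights = {"female": 0.60 / 0.50, "male": 0.40 / 0.50}
print(weights)
```

In the weighted design each woman counts as 1.2 respondents and each man as 0.8, so totals again reflect the population split.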

Cluster Sampling
Frequently, there is no convenient way of listing the population for sampling purposes
E.g., sample of Dane County or Wisconsin
Hard to get a list of the population members

Cluster sample
Sample of census blocks
List of people for each selected census block
Select sub-sample of people living on each block

Multi-stage Cluster Sample


Cluster sampling done in a series of stages:
List, then sample within

Example:
Stage 1: List zip codes; randomly select zip codes
Stage 2: List census blocks within selected zip codes; randomly select census blocks
Stage 3: List households on selected census blocks; randomly select households
Stage 4: List residents of selected households; randomly select person to interview
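
The four stages can be sketched as nested "list, then sample" steps. Everything in the frame below is invented for illustration (8 zip codes, 10 blocks each, 20 households per block, 3 residents per household), as are the numbers selected at each stage.

```python
import random

random.seed(11)  # fixed seed so the sketch is repeatable

# Hypothetical nested frame: zip code -> census block -> household -> residents.
frame = {
    f"zip_{z}": {
        f"block_{z}_{b}": {
            f"house_{z}_{b}_{h}": [f"person_{z}_{b}_{h}_{p}" for p in range(3)]
            for h in range(20)
        }
        for b in range(10)
    }
    for z in range(8)
}

def multistage_sample(zips, n_zips, n_blocks, n_households):
    """List, then sample, at each stage; interview one resident per household."""
    respondents = []
    for z in random.sample(sorted(zips), n_zips):                      # stage 1
        blocks = zips[z]
        for b in random.sample(sorted(blocks), n_blocks):              # stage 2
            households = blocks[b]
            for h in random.sample(sorted(households), n_households):  # stage 3
                respondents.append(random.choice(households[h]))       # stage 4
    return respondents

respondents = multistage_sample(frame, n_zips=3, n_blocks=2, n_households=5)
print(len(respondents), "respondents, e.g.", respondents[:2])
```

Only the selected clusters ever need to be listed in full, which is the practical appeal of the design.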

Multi-stage Sampling and Sampling Error


Error is introduced at each stage
One solution is to use stratification at each stage to try to reduce sampling error
