Week 4 - CH7 Sampling and Sampling Distributions
Sampling frame
Sampling frame: list, map or directory from which to take a sample to represent the population.
- Frames that have over-registration contain all the target population units plus some additional
units.
- Frames that have under-registration contain fewer units than in the target population.
- Sampling is done from the frame, not the target population.
- In theory, the target population and the frame are the same. In practice they usually differ, so a researcher’s goal is to
minimise the differences between the frame and the target population.
Nonrandom sampling: not every unit of the population has a known probability of being selected in the
sample. Nonrandom sampling is sometimes called nonprobability sampling.
- Members of nonrandom samples are not selected by chance. For example, they might be selected
because they are at the right place at the right time or because they know the people conducting
the research.
- Because units of the population have an unknown probability of being selected, it is impossible to
assign a probability of occurrence in nonrandom sampling.
- Nonrandom sampling methods are not appropriate techniques for gathering data to be analysed
by most statistical methods.
Disproportionate stratified random sampling: occurs whenever the proportions of the strata in a sample are different from the proportions of
the strata in the population.
Systematic sampling
Systematic sampling: every kth item is selected to produce a sample of size n from a population of size N.
- k = N/n is sometimes called the sampling cycle.
- If k is not an integer value, its whole number value should be used.
- Systematic sampling methodology is based on the assumption that the source of population
elements is random.
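The selection rule above can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name `systematic_sample`, the numbered frame and the random starting point within the first cycle are assumptions added for the example.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select every k-th element after a random start, with k = N // n."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the population size")
    k = N // n                    # whole-number part of N / n (the sampling cycle)
    rng = random.Random(seed)
    start = rng.randrange(k)      # assumed: random start inside the first cycle
    return [population[start + i * k] for i in range(n)]

frame = list(range(1, 101))       # hypothetical frame of 100 numbered units
sample = systematic_sample(frame, n=8, seed=1)   # k = 100 // 8 = 12
```

With N = 100 and n = 8, N/n = 12.5, so the whole number value k = 12 is used and consecutive selected units are 12 positions apart.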
Two-stage sampling: used when clusters are too large; a second set of clusters is taken from each original cluster.
Nonrandom sampling
Nonrandom sampling techniques: techniques used to select elements from a population by any
mechanism that does not involve a random selection process.
- Convenience sampling: elements for the sample are selected for the convenience of the
researcher. The researcher typically chooses elements that are readily available, nearby or willing
to participate.
- Judgement sampling: occurs when elements selected for the sample are chosen by the
judgement of the researcher.
- Quota sampling: certain population subclasses, such as age group, gender and geographical
region, are used as strata; the researcher uses a nonrandom sampling method to gather data from
one stratum until the desired quota of samples is filled. Quotas are described by quota controls,
which set the sizes of the samples to be obtained from the subgroups.
- Snowball sampling: survey subjects are selected based on referral from other survey respondents.
Sampling error
Sampling error: difference between the value computed from a sample (a statistic) and the corresponding
value for the population (a parameter).
- Occurs because the sampling process involves selecting a subset of the population and not the
entire population.
- We can reduce sampling error by taking a larger sample or using stratified random sampling.
- This has to be weighed against the extra cost involved in doing so.
- Sampling error formulas can be derived for each of the random sampling designs, which can then
be incorporated into the expression for the sampling distribution.
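A small simulation can illustrate the definition and the effect of sample size. This is a rough sketch under stated assumptions: the population (the integers 1 to 1000), the sample sizes, the number of repetitions and the helper names are all hypothetical choices made for the example.

```python
import random
from statistics import mean

def sampling_error(sample, parameter):
    """Statistic minus parameter: sample mean minus the population mean."""
    return mean(sample) - parameter

def average_abs_error(population, n, reps=1000, seed=0):
    """Average size of the sampling error over many simple random samples of size n."""
    rng = random.Random(seed)
    mu = mean(population)         # the population parameter
    return mean(abs(sampling_error(rng.sample(population, n), mu))
                for _ in range(reps))

population = list(range(1, 1001))                 # hypothetical population
err_small = average_abs_error(population, n=5)    # small samples
err_large = average_abs_error(population, n=50)   # larger samples
```

On average the error is noticeably smaller for n = 50 than for n = 5, which is the trade-off against cost described above.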
Nonsampling errors
- Missing data
- Recording errors
- Input processing errors and analysis errors
- Errors of unclear definitions
- Defective questionnaires
- Poorly conceived concepts
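Sampling distribution: the distribution of a statistic, such as the sample mean, across all possible samples of the same size; it describes how the sample means are spread out when plotted.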
- Samples that are randomly selected, even when selected and not replaced into the population,
have a range of sample mean values that are spread out and follow some kind of distribution.
Standard error of the mean: the standard deviation of the sample means, equal to the population standard deviation divided by the square
root of the sample size.
$SE_{\bar{x}} = \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
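As a quick numerical check of the standard error formula, here is a minimal sketch. The function name and the values σ = 20 and n = 64 are made up for illustration.

```python
import math

def standard_error_of_mean(sigma, n):
    """Population standard deviation divided by the square root of the sample size."""
    return sigma / math.sqrt(n)

se = standard_error_of_mean(sigma=20, n=64)   # 20 / 8 = 2.5
```

Quadrupling the sample size halves the standard error, because n enters through its square root.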
Central limit theorem: as long as a sample of size 30 or more (a common rule of thumb) is selected from a population distribution of any shape,
the sampling distribution of the sample mean will be approximately normally distributed.
- Creates potential to apply our knowledge of the normal distribution to many problems when the
sample size is sufficiently large.
- Used when the distribution of a variable in a population is unknown.
- To find $\mu_{\bar{x}}$ without using a mathematical derivation, all possible samples of the same size would
have to be selected randomly from a population. Then each sample average would have to be
calculated. Finally, the average of all the sample averages would be calculated to find $\mu_{\bar{x}}$.
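The brute-force procedure in the last point can be tried on a tiny example. The sketch below assumes a hypothetical population of five values and enumerates every possible sample of size 2 drawn without replacement; the variable names are illustrative only.

```python
from itertools import combinations
from statistics import mean

population = [1, 2, 3, 4, 5]      # hypothetical tiny population
n = 2                             # sample size

# Every possible sample of size n (without replacement) and its average.
sample_means = [mean(s) for s in combinations(population, n)]

mu = mean(population)             # population mean
mu_xbar = mean(sample_means)      # average of all the sample averages
```

Here there are 10 possible samples, and the average of their means equals the population mean of 3, which is the result the enumeration is meant to show.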
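Using the central limit theorem with $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$, and then making these substitutions into the
z formula for the sampling distribution of the sample means, gives the z formula for sample means: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$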
- When a population variable is normally distributed and the sample size is one (n = 1), the z formula
for the sampling distribution of the sample means becomes exactly the same as the z formula for
individual values in the population.
- If n = 1, the sample mean of a single value is the same as that value, and the value of $SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{\sigma}{\sqrt{1}} = \sigma$.
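The z formula for sample means, and its n = 1 special case, can be sketched as follows. The numbers (μ = 85, σ = 9, n = 81 and a sample mean of 87) are hypothetical, and the function name is an assumption made for the example.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(x_bar, mu, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

# Sample of 81 values: standard error = 9 / 9 = 1, so z = 2.
z = z_for_sample_mean(x_bar=87, mu=85, sigma=9, n=81)
p_above = 1 - NormalDist().cdf(z)   # probability of a sample mean above 87

# With n = 1 the standard error is sigma itself, so this is the
# ordinary z score of a single value.
z_single = z_for_sample_mean(x_bar=87, mu=85, sigma=9, n=1)
```

The same gap of 2 units is two standard errors for a sample of 81 but only about 0.22 standard deviations for a single value.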
Whenever researchers are working with a finite population, they can use the finite population correction (fpc) factor.
- Many researchers check to see if the sample size is less than 5% of the finite population size
(meaning n/N < 0.05).
- If this is the case, then the fpc factor will have little effect on the standard error of the mean
and so can be disregarded in any calculations.
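The notes do not state the factor itself. The sketch below assumes the usual textbook form, fpc = √((N − n)/(N − 1)), which multiplies the standard error; the function names and the population and sample sizes are illustrative.

```python
import math

def fpc(N, n):
    """Finite population correction factor (assumed textbook form)."""
    return math.sqrt((N - n) / (N - 1))

def standard_error_finite(sigma, n, N):
    """Standard error of the mean adjusted by the fpc factor."""
    return sigma / math.sqrt(n) * fpc(N, n)

small_fraction = fpc(N=10_000, n=100)   # n/N = 0.01, factor close to 1
large_fraction = fpc(N=500, n=100)      # n/N = 0.20, factor clearly below 1
```

When n/N is under 0.05 the factor is so close to 1 that omitting it changes the standard error very little, which is the basis of the 5% check above.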
- The central limit theorem applies to sample proportions in that the normal distribution
approximates the shape of the distribution of sample proportions.
- The approximation is adequate if both np > 5 and nq > 5 (where p is the population proportion and q = 1 − p).
- The mean of all possible sample proportions of the same size n randomly drawn from a population
is p (the population proportion).
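The check on np and nq, together with a z value for a sample proportion, can be sketched as below. The standard error √(pq/n) is the usual textbook form and is not stated in the notes above; the function names and the numbers are hypothetical.

```python
import math

def normal_approx_ok(n, p):
    """Rule from the notes: both n*p and n*q must exceed 5, with q = 1 - p."""
    q = 1 - p
    return n * p > 5 and n * q > 5

def z_for_sample_proportion(p_hat, p, n):
    """z = (p_hat - p) / sqrt(p*q/n), assuming the usual standard error."""
    q = 1 - p
    return (p_hat - p) / math.sqrt(p * q / n)

ok = normal_approx_ok(n=100, p=0.5)                     # np = 50, nq = 50
z = z_for_sample_proportion(p_hat=0.6, p=0.5, n=100)    # 0.1 / 0.05, about 2
```

A sample of 20 with p = 0.1 would fail the check, since np = 2.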
Key Equations