Stats and Sampling

Lecture 5 Statistics and Sampling

Topics:

1. Statistics related to sampling.

Handouts/Readings:

1. Chapter 2 - sections 2.1-2.14 - Avery and Burkhart
2. Chapter 3 - sections 3.1-3.12 - Avery and Burkhart

Assignment:

Complete problem set #3 before next Monday's (Feb. 11) lecture.

Notes:

Review of basic statistics:

The variance and standard deviation provide measures of the
dispersion of individual observations about their arithmetic mean.

Ex. - if we calculated the estimated standard deviation for n
measurements of tree heights to be 1.2 cm, then we can expect
about 2/3 of the n measurements of individual tree heights to
fall within ± 1.2 cm of the estimated mean (assuming the heights
are approximately normally distributed).

The standard error measures the variation among sample means, just
as the standard deviation measures the variation of individual
observations about their mean. It is essentially the standard
deviation among means of samples of a fixed size n.

We use the standard error to calculate confidence intervals and
to determine the sample size needed for a specified sampling
precision.

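The standard error for an infinite population (or a population of
unknown size) is:

$s_{\bar{x}} = \frac{s}{\sqrt{n}}$

For a finite population (i.e., we know how many total individuals
there are), the standard error is:

$s_{\bar{x}} = \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N}}$

Where,
s = the sample standard deviation
n = the number of sampling units in the sample
N = the number of sampling units in the population

The finite population correction factor, (N - n)/N, reduces the
standard error.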
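The two standard-error calculations can be sketched in Python as follows. This is a minimal sketch, not part of the original lecture: the function name `standard_error`, the tree heights, and the population size of 40 are all hypothetical, and the finite population correction is assumed to take the form (N - n)/N under the square root.

```python
import math
import statistics


def standard_error(values, population_size=None):
    """Standard error of the mean.

    If population_size (N) is given, the finite population correction
    sqrt((N - n) / N) is applied; otherwise the population is treated
    as infinite or of unknown size.
    """
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / population_size)
    return se


# Hypothetical tree heights (illustrative values, not from the lecture).
heights = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 9.5, 10.7]

se_infinite = standard_error(heights)
se_finite = standard_error(heights, population_size=40)  # assumed N = 40
print(round(se_infinite, 3), round(se_finite, 3))
```

With 8 of 40 units sampled, the corrected value is the uncorrected one multiplied by sqrt(32/40), so it is always the smaller of the two.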

Confidence intervals establish an interval which, at some
specified probability level, would be expected to include the
population mean.

We use the standard error and t values to establish these limits:

$\bar{x} \pm t \, s_{\bar{x}}$

Ex. - A 95% confidence interval says that if the population was
repeatedly sampled, 95% of all possible samples would produce
confidence intervals that contain the true population mean. If
20 samples were taken, only 1 of 20 (P = 0.05) would be expected to
produce a confidence interval that misses the population mean.
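The repeated-sampling interpretation can be checked with a small simulation. This is a rough sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 20 and standard deviation of 6, each sample has n = 25 units, and the t value 2.064 (two-sided 95%, 24 degrees of freedom) is typed in from a t table rather than computed. The function names are illustrative.

```python
import math
import random
import statistics

T_95_DF24 = 2.064  # two-sided 95% t value for 24 degrees of freedom (t table)


def confidence_interval(values, t):
    """Return (lower, upper) limits: sample mean +/- t * standard error."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean - t * se, mean + t * se


def coverage(true_mean, true_sd, n, t, trials, seed=0):
    """Fraction of simulated samples whose interval contains true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = confidence_interval(sample, t)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials


# Hypothetical population parameters, chosen only for illustration.
rate = coverage(true_mean=20.0, true_sd=6.0, n=25, t=T_95_DF24, trials=2000)
print(f"share of intervals containing the population mean: {rate:.3f}")
```

The printed share should land near 0.95, which corresponds to roughly 1 interval in 20 missing the population mean.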

Sampling designs:

Sample design - the method employed to select non-overlapping
sampling units.

There are many different kinds of sampling designs (overhead -
fig. 3-1 - Avery and Burkhart). One of the most common is simple
random sampling.

Simple random sampling - provides for an equal and independent
chance of every possible combination of sampling units being
selected.

Sampling units can be selected with or without replacement.
Sampling with replacement allows each unit to appear as often as
it is selected. Sampling without replacement allows each unit
selected to appear only once.
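The difference between the two selection modes can be illustrated with Python's standard `random` module. This is only a sketch: the population of 100 numbered plots and the sample of 10 are hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the draw is repeatable
plot_ids = list(range(1, 101))  # hypothetical population of 100 numbered plots

# Without replacement: each plot can be selected at most once.
without_replacement = rng.sample(plot_ids, k=10)

# With replacement: the same plot can be selected more than once.
with_replacement = rng.choices(plot_ids, k=10)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

`random.sample` never repeats a unit, while `random.choices` draws each unit independently, so repeats are possible.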

Determining sampling intensity:

When planning an inventory we want to select enough sampling
units to achieve the desired precision, so that the sample is
both statistically sound and practically efficient.

We can calculate the required sampling intensity using a formula
based on the relationship of the confidence limits on the mean
(assuming an infinite population):

$n = \frac{t^2 s^2}{E^2}$

Where,
n = the number of sampling units
t = the critical value from the t-distribution table
s = the standard deviation
E = the desired half-width of the confidence interval

The standard deviation, s, can be estimated by (a) obtaining a small
preliminary sample of the population or (b) using information
obtained from previous sampling of the same or a similar
population.

Ex. - Suppose we have conducted a preliminary inventory of 25
plots to estimate volume per acre of a timber stand. From that
sample we estimated the mean to be 4,400 bd ft per acre and the
standard deviation to be 2,000 bd ft per acre. We want to
determine the sampling intensity needed to be within ± 500 bd ft
per acre of the mean, with a confidence level of 95%.
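The timber example can be worked through with a short script. This sketch assumes the infinite-population relationship n = t² s² / E² and rounds up to a whole plot. The function name is hypothetical, and both t values are assumptions: t = 2 is a common approximation, and 2.064 is the tabulated two-sided 95% value for the 24 degrees of freedom of the preliminary sample.

```python
import math


def sample_size_infinite(t, s, e):
    """n = t^2 * s^2 / E^2, rounded up to a whole sampling unit."""
    return math.ceil((t * s / e) ** 2)


s = 2000.0  # standard deviation from the preliminary sample, bd ft per acre
e = 500.0   # desired half-width of the confidence interval, bd ft per acre

n_approx = sample_size_infinite(t=2.0, s=s, e=e)    # t approximated as 2
n_table = sample_size_infinite(t=2.064, s=s, e=e)   # t for 24 df from a t table
print(n_approx, n_table)  # 64 69
```

Because t depends on the degrees of freedom, the calculation can be repeated with the t value for the newly computed n until the result stops changing.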

When sampling from a finite population, the sample-size formula
is:

$n = \frac{1}{\frac{E^2}{t^2 s^2} + \frac{1}{N}}$

Where,
n = the number of sampling units required
N = the number of sampling units in the population
t = the critical value from the t-distribution table
s = the standard deviation
E = the desired half-width of the confidence interval
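A minimal sketch of the finite-population calculation follows, assuming the form n = 1 / (E² / (t² s²) + 1/N). The standard deviation and half-width reuse the timber example; the population size of 500 plots and the function name are hypothetical.

```python
import math


def sample_size_finite(t, s, e, population_size):
    """n = 1 / (E^2 / (t^2 * s^2) + 1 / N), rounded up to a whole unit."""
    n = 1.0 / (e ** 2 / (t ** 2 * s ** 2) + 1.0 / population_size)
    return math.ceil(n)


# s and E come from the timber example; N = 500 is an assumed population size.
n_finite = sample_size_finite(t=2.0, s=2000.0, e=500.0, population_size=500)
print(n_finite)  # 57
```

As N grows large the 1/N term vanishes and the result approaches the infinite-population answer (64 plots with t = 2).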

If we know the allowable error and have an estimate of the
coefficient of variation (CV), the required sample size (for an
infinite population) is:

$n = \left(\frac{t \cdot CV}{AE}\right)^2$

Where,
n = the number of sampling units required
AE = the allowable error, in percent of the mean
t = the critical value from the t-distribution table
CV = the coefficient of variation, in percent

This allows you to estimate the number of observations needed to
estimate a population mean within ± X percent at a defined
probability level.

For a finite population the formula is:

$n = \frac{1}{\left(\frac{AE}{t \cdot CV}\right)^2 + \frac{1}{N}}$
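The CV-based calculations for both cases can be sketched in one helper. This is an illustrative sketch, not the lecture's own code: it assumes n = (t · CV / AE)² for an infinite population and n = 1 / ((AE / (t · CV))² + 1/N) for a finite one, with CV and AE both in percent. The inputs (CV = 30%, AE = 10%, t = 2, N = 200) and the function name are hypothetical.

```python
import math


def sample_size_cv(t, cv, ae, population_size=None):
    """Sample size from the coefficient of variation and allowable error.

    cv and ae are both in percent. If population_size (N) is given,
    the finite-population form is used.
    """
    n = (t * cv / ae) ** 2
    if population_size is not None:
        n = 1.0 / (1.0 / n + 1.0 / population_size)
    return math.ceil(n)


# Hypothetical inputs: CV = 30%, allowable error = 10%, t approximated as 2.
n_inf = sample_size_cv(t=2.0, cv=30.0, ae=10.0)
n_fin = sample_size_cv(t=2.0, cv=30.0, ae=10.0, population_size=200)
print(n_inf, n_fin)  # 36 31
```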

Effects of plot size on variability:

Small sample plots generally exhibit more variability than larger
plots. Large plots tend to average out the effect of clumping and
openings.

If the coefficient of variation (CV) has been estimated for plots
of a given size, we can approximate the CV for different sized
plots using the following formula:

$CV_2^2 = CV_1^2 \sqrt{\frac{P_1}{P_2}}$

Where,
CV2 = estimated CV for new plot size
CV1 = known CV for plots of known size
P1 = plot size used to estimate CV1
P2 = new plot size

Ex. - If the CV for 1/5-acre plots is 30%, the estimated CV for
1/10-acre plots would be:

$CV_2 = \sqrt{30^2 \sqrt{\frac{0.2}{0.1}}} \approx 36\%$

You can then compare the number of plots needed for each plot
size.
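The plot-size conversion can be sketched as a small function. It assumes the relationship CV2² = CV1² · sqrt(P1 / P2), which is the form that reproduces the 36% figure (0.36) used for 1/10-acre plots later in these notes. The function name is hypothetical.

```python
import math


def cv_for_new_plot_size(cv1, p1, p2):
    """Approximate CV for plot size p2 from a known cv1 at plot size p1.

    Assumes CV2^2 = CV1^2 * sqrt(P1 / P2); sizes must share the same unit.
    """
    return math.sqrt(cv1 ** 2 * math.sqrt(p1 / p2))


# CV of 30% for 1/5-acre plots, converted to 1/10-acre plots.
cv2 = cv_for_new_plot_size(cv1=30.0, p1=1 / 5, p2=1 / 10)
print(round(cv2, 1))  # 35.7, i.e. about 36%
```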

Ex. - Assume the 1/5-acre plots produced a sample mean of 4 cd
per plot (20 cd per acre) and the sample standard deviation is 6
cd per acre (30% from previous example). We want to estimate
the total number of 1/5-acre plots needed to estimate the mean
volume per acre within ± 2 cd per acre at a 95% probability level
(approximating t = 2):

$n = \frac{t^2 s^2}{E^2} = \frac{(2)^2 (6)^2}{(2)^2} = 36$ plots

Compared to 1/10-acre plots, which would have a standard
deviation of 7.2 cd per acre (0.36 * 20 = 7.2):

$n = \frac{(2)^2 (7.2)^2}{(2)^2} = 51.84 \approx 52$ plots
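The comparison between the two plot sizes can be reproduced with a few lines of Python. This sketch assumes the infinite-population formula n = t² s² / E² with t approximated as 2 and results rounded up to whole plots. The total-area figures are an added illustration, not part of the original example, and the function name is hypothetical.

```python
import math


def plots_needed(t, s, e):
    """n = t^2 * s^2 / E^2, rounded up to a whole plot."""
    return math.ceil((t * s / e) ** 2)


t = 2.0  # approximate 95% t value, as in the example
e = 2.0  # allowable error, cd per acre

n_fifth = plots_needed(t, s=6.0, e=e)   # 1/5-acre plots, s = 6 cd per acre
n_tenth = plots_needed(t, s=7.2, e=e)   # 1/10-acre plots, s = 7.2 cd per acre

# Total area measured under each option (illustrative addition).
area_fifth = n_fifth * (1 / 5)
area_tenth = n_tenth * (1 / 10)

print(n_fifth, n_tenth)  # 36 52
print(round(area_fifth, 1), round(area_tenth, 1))  # 7.2 5.2
```

More of the smaller plots are needed, but under these assumptions they cover less total area, which is one way to weigh the two options.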
