TRB Gharaibeh Garber Liu 2010 - Determining Optimum Sample Size Percent-Within-Limits Specifications
Highway construction and materials acceptance plans use a sample size that is often established on the basis of practical considerations such as personnel and time constraints. Commonly used sample sizes range between three and seven units. While a sample size within this range may be practical, it may not be economically optimal. If this sample size is too small, the probability of making erroneous acceptance or pay adjustment decisions (and thus the expected cost consequences of these decisions) would be too high for state departments of transportation (DOTs). If this sample size is too large, the cost of sampling and testing would be unnecessarily high, especially where destructive testing is used. A computational model for determining the optimum sample size was developed and is presented in this paper. This model is intended to help highway agencies determine how much to sample to minimize their total acceptance cost (cost of sampling and testing plus the cost of erroneously accepting poor-quality materials and construction). Inputs to this model can be obtained from an agency’s specifications book, historical data on quality, prevalent unit bid prices, and prevalent sampling and testing prices. The developed model was applied to determine the optimum sample size for the AASHTO acceptance plan for binder content and density of hot-mix asphalt concrete pavements. The model shows that, when historical quality levels are satisfactory, the state DOT may consider reducing sample size as much as practically possible (in most cases, a sample size of three per lot for each acceptance quality characteristic is optimal). Only in the case of large lot size, combined with historically extremely poor quality and high unit bid price, was a larger sample size found to be optimal (n = 7 to 8).

of these decisions) would be too high for state DOTs. If the sample size is too large, the cost of sampling and testing would be unnecessarily high, especially where destructive testing is used.

This paper presents a methodology for determining the optimum sample sizes for acceptance plans that use the percent-within-limits (PWL) quality measure. PWL is considered a preferred measure of quality because it considers both the central tendency and variability in a statistically sound way (3). The developed methodology was used for determining the optimum sample sizes for asphalt binder content and hot-mix asphalt core density in the AASHTO Quality Assurance Guide Specification for hot-mix asphalt concrete (HMAC) pavement. When one considers the large amount of highway construction that takes place every year throughout the United States [e.g., $66 billion of highway and street construction took place in 2005 (4)], identifying optimum sample sizes for commonly used AQCs could translate to a considerable cost savings to state DOTs.

The remainder of the paper is organized as follows: first is a review of the literature on methods for sample size determination. Second, the computational model for determining optimum sample size for AQCs under PWL-based acceptance plans is presented. Third, the application of the developed model to AASHTO’s Quality Assurance Guide Specification is discussed. Finally, key conclusions and recommendations about sample size are presented for consideration by state DOTs.
78 Transportation Research Record 2151
Equation 1 is widely used in statistics textbooks and is adopted by Standard Practice for Calculating Sample Size to Estimate, With Specified Precision, the Average for a Characteristic of a Lot or Process (ASTM Standard E122-07). [ASTM Standard E 122-07 represents (Zα + Zβ) as a multiplier. For example, with a multiplier of 3.0, it is practically certain that the sampling error will not exceed E.] Efforts have been made to improve this formula for determining sample size. Some of these key efforts are discussed in the following paragraphs.

McCabe et al. (5) developed a model to account for concerns about confidence in the sample standard deviation obtained from a small sample size. McCabe’s model provides a conservative approach for estimating the population’s standard deviation (and thus the sample size) by resorting to a modified population standard deviation. The modified population standard deviation represents a confidence interval instead of a mere deterministic and uncertain single value, as shown in Equation 2.

$$\sigma' = \sigma \pm Z_{\alpha/2} \left( \frac{\sigma}{2} \right) c \qquad (2)$$

where

σ′ = modified estimate of population standard deviation,
c = gamma function parameter, determined as a function of sample size n, and
σ = deterministic estimate of population standard deviation.

Velivelli et al. (6) applied this conservative approach to determine the sample size for concrete strength in the specifications for airfield concrete pavement (FAA P-501). This has resulted in a sample size of five beams per lot for flexural strength and seven cylinders per lot for compressive strength. Sheng and Fan (7) used Bayesian techniques to apply prior distribution of quality to the statistical problem of determining sample size. They concluded that when historical data indicate “good” quality then the prior probability density function is “optimistic” and the sample size can be reduced. For PWL-based acceptance plans, sample size can be determined by using operating-characteristic (OC) curves. OC curves allow for identifying sample size for any desired levels of buyer’s and seller’s risks.

While the above methods for determining sample size consider the probability of accepting poor-quality product (i.e., buyer’s risk) and the probability of rejecting high-quality product (i.e., seller’s risks), they ignore the consequences of these probabilities. Furthermore, these methods ignore the historical trends in delivered quality. The optimum sample size model presented in this paper addresses these shortcomings by considering the following factors:

1. Cost consequences of accepting poor-quality materials as well as the cost of sampling and testing and
2. Proportion of prior lots of rejectable quality throughout the state.

South (8) and Govindaraju (9) suggested that a desirable feature of any sampling plan is a clear consideration of the economic impact of the quality levels used in the plan. The main question to be addressed is whether the acceptance plan is worth its cost.

SAMPLE SIZE OPTIMIZATION MODEL

The model presented in the following paragraphs allows for computing the sample size that minimizes the agency’s total cost of accepting a construction or material lot. The total cost of lot acceptance consists of two components:

1. Cost of sampling and testing and
2. Cost associated with making erroneous acceptance decisions.

Mathematically, the agency’s total cost of lot acceptance can be formulated as follows:

total cost of lot acceptance = costT(n) + costD(n) (3)

where costT(n) is the agency’s cost of sampling and testing, which is an increasing function of sample size n, and costD(n) is the agency’s expected cost from erroneous acceptance decisions, which is a decreasing function of n.

The optimum sample size is determined as the sample size that minimizes the lot’s total acceptance cost to the agency, as defined in Equation 3. This sample size represents the agency’s sample size (not the combined sample size from the contractor and agency). This model appears graphically in Figure 1.
[Figure 1 (graphical form of the model); recoverable curve labels: costD(n) and costT(n).]
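The minimization in Equation 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `optimum_sample_size` and the two example cost curves are hypothetical and do not come from the paper, and the candidate range of 3 to 30 simply mirrors the lower cap of n = 3 and the range of n tabulated in the appendix.

```python
from typing import Callable, Iterable


def optimum_sample_size(
    cost_t: Callable[[int], float],
    cost_d: Callable[[int], float],
    candidates: Iterable[int] = range(3, 31),
) -> int:
    """Return the candidate n with the smallest cost_t(n) + cost_d(n) (Equation 3)."""
    return min(candidates, key=lambda n: cost_t(n) + cost_d(n))


# Hypothetical cost curves, chosen only to make the trade-off visible.
def example_cost_t(n: int) -> float:
    return 200.0 * n      # sampling and testing cost rises with n


def example_cost_d(n: int) -> float:
    return 5000.0 / n     # expected cost of wrong decisions falls with n


best_n = optimum_sample_size(example_cost_t, example_cost_d)
print(best_n)
```

When the expected cost of erroneous decisions is negligible at every n, the search returns the smallest candidate, which corresponds to the lower cap of three.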
The costT(n) component of Equation 3 is relatively straightforward and can be computed as follows:

costT(n) = n × m × c (4)

where m is the number of replicates and c is the unit cost of testing. The relationship between n and the cost of sampling and testing may not be exactly linear in some cases; in those cases, where a linearity assumption is not deemed to be appropriate, a discrete function can be applied.

The costD(n) component of Equation 3, in contrast, is fairly complex, especially when multiple AQCs are considered.

To balance the complexity of the developed method with the level of accuracy needed to achieve the study’s objectives, the following assumptions were made:

• An “erroneous acceptance decision” is defined as accepting a lot that should have been rejected. It is assumed that erroneous decisions associated with assigning the wrong pay adjustment cancel each other out (i.e., in the long run and on a statewide basis, the cost to the state DOT of paying more for a lot than it is worth is offset by the gain to the state DOT of paying less for a lot than it is worth).
• AQCs are independent or weakly dependent. This assumption is not unrealistic for most AQCs (including those addressed in this paper) because contractors tend to pay attention to individual AQCs rather than to combined measures of quality. In other words, if one AQC has poor quality, it does not necessarily indicate that other AQCs also have poor quality.
• Because the minimum sample size for variables acceptance plans with unknown standard deviation is three, a lower cap of n = 3 was placed on optimum sample size. Thus, under this method, the optimum sample size cannot be less than three.
• An agency’s expected cost due to erroneous acceptance decisions is based on bid price and does not consider other costs such as user costs or future maintenance and rehabilitation costs.

When the above assumptions are considered, costD(n) is computed for acceptance plans with a single AQC and for those with multiple AQCs, as discussed in the following sections.

Cost of Acceptance Decisions for a Single AQC

For acceptance plans with a single AQC, a lot is considered of poor quality if that particular AQC is rejectable [i.e., PWL is at (or below) the rejectable quality level (RQL)]. Thus, the agency’s expected cost due to erroneous acceptance decisions is computed as follows:

costD(n) = βRQL × PRQL × B × S (5)

where

βRQL = buyer’s risk [i.e., probability of erroneously accepting a lot that has a true PWL equal to or less than the RQL, computed at the midpoint between zero and RQL (the concept of buyer’s (agency’s) and seller’s (contractor’s) risk is shown in Figure 2)],
PRQL = proportion of prior lots of rejectable quality throughout the state (this prior distribution of quality can be obtained from the agency’s construction quality databases or paper records of past construction projects),
B = unit bid price, and
S = lot size.

β is computed by using the closed-form solution provided in Appendix A of this paper. This procedure has been verified and found to agree with the OCPLOT software (a well-established simulation-based computer program for developing operating characteristic curves) (10).

Cost of Acceptance Decisions for Multiple AQCs

For an acceptance plan that considers k AQCs, a lot is considered poor quality if at least one of the AQCs is rejectable (i.e., PWL is at or below RQL). Thus, the agency’s expected cost due to erroneous acceptance decisions is computed as follows:

$$\text{cost}_D(n) = \sum_{j=1}^{l} \left[ \beta_i P_i \times \beta_{i+1} P_{i+1} \times \cdots \times \beta_k P_k \right]_j \times B \times S \qquad (6)$$

where l is the number of combinations of β and P for all AQCs and β = βRQL and P = PRQL for at least one AQC. Combinations that
[Figure 2: Probability of Acceptance, % (0 to 100) versus Percent within Limits (PWL) (0 to 100), with seller’s risk α, buyer’s risk β, RQL, and AQL marked.]
TABLE 1 Probability of Accepting Poor Quality Lot When Two AQCs Are Considered

| Scenario | PWL | Probability of Acceptance for Any Given Lot | Statewide Historical Occurrence | Statewide Probability of Acceptance |
|---|---|---|---|---|
| Both AQC1 and AQC2 are rejectable. | PWL1 ≤ RQL1 & PWL2 ≤ RQL2 | β1RQL × β2RQL | P1RQL × P2RQL | β1RQL × β2RQL × P1RQL × P2RQL |
| AQC1 is rejectable, and AQC2 is acceptable without pay increase. | PWL1 ≤ RQL1 & RQL2 < PWL2 < AQL2 | β1RQL × β2RQL-AQL | P1RQL × P2RQL-AQL | β1RQL × β2RQL-AQL × P1RQL × P2RQL-AQL |
| AQC1 is rejectable, and AQC2 is acceptable with pay increase. | PWL1 ≤ RQL1 & PWL2 ≥ AQL2 | β1RQL × β2AQL-100 | P1RQL × P2AQL-100 | β1RQL × β2AQL-100 × P1RQL × P2AQL-100 |
| AQC1 is acceptable without pay increase, and AQC2 is rejectable. | RQL1 < PWL1 < AQL1 & PWL2 ≤ RQL2 | β1RQL-AQL × β2RQL | P1RQL-AQL × P2RQL | β1RQL-AQL × β2RQL × P1RQL-AQL × P2RQL |
| AQC1 is acceptable with pay increase, and AQC2 is rejectable. | PWL1 ≥ AQL1 & PWL2 ≤ RQL2 | β1AQL-100 × β2RQL | P1AQL-100 × P2RQL | β1AQL-100 × β2RQL × P1AQL-100 × P2RQL |
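The scenario enumeration behind Table 1 and Equation 6 can be sketched in Python as follows. This is a minimal illustration rather than the authors’ implementation: the function names, the level labels, and all numeric inputs are hypothetical. Each AQC is assigned one of three quality bands, and only combinations with at least one AQC in the RQL band are summed, which gives the five scenarios of Table 1 when two AQCs are considered.

```python
from itertools import product
from math import prod

# Quality bands for each AQC: rejectable, acceptable without pay increase,
# acceptable with pay increase.
LEVELS = ("RQL", "RQL-AQL", "AQL-100")


def rejectable_scenarios(k):
    """Level combinations for k AQCs in which at least one AQC is at RQL."""
    return [combo for combo in product(LEVELS, repeat=k) if "RQL" in combo]


def statewide_acceptance_probability(beta, prior):
    """Sum over rejectable scenarios of the product of beta_i * P_i."""
    return sum(
        prod(beta[i][level] * prior[i][level] for i, level in enumerate(combo))
        for combo in rejectable_scenarios(len(beta))
    )


def cost_d(beta, prior, unit_bid_price, lot_size):
    """Expected cost of erroneous acceptance (Equation 6)."""
    return statewide_acceptance_probability(beta, prior) * unit_bid_price * lot_size


# Hypothetical inputs for two AQCs (not values from the paper).
beta = [
    {"RQL": 0.30, "RQL-AQL": 0.80, "AQL-100": 0.98},
    {"RQL": 0.25, "RQL-AQL": 0.75, "AQL-100": 0.97},
]
prior = [
    {"RQL": 0.05, "RQL-AQL": 0.25, "AQL-100": 0.70},
    {"RQL": 0.10, "RQL-AQL": 0.30, "AQL-100": 0.60},
]
expected_cost = cost_d(beta, prior, unit_bid_price=100.0, lot_size=4000.0)
print(len(rejectable_scenarios(2)), round(expected_cost, 2))
```

With a single AQC the only rejectable scenario is the RQL band itself, so the same function reduces to Equation 5.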
do not have all AQCs at the RQL are influenced by the acceptable quality level (AQL) and include βRQL−AQL PRQL−AQL [where RQL < PWL < AQL (i.e., accept without pay increase)], and βAQL−100 PAQL−100 [where AQL ≤ PWL ≤ 100 (i.e., accept with pay increase)].

Equation 6 may be further explained by the following example. If it is supposed that an acceptance plan includes two AQCs, a rejectable-quality lot can occur under any one of the five scenarios in Table 1. For each scenario, the probability of acceptance is computed as the multiplication of the individual probabilities of acceptance (i.e., βi). The prior proportion of built lots for each scenario is computed as the multiplication of the individual prior proportions (i.e., Pi). Finally, the statewide probability of accepting a rejectable-quality lot is computed as the summation of \(\prod_{i=1}^{k} \beta_i P_i\) for all scenarios (i.e., sum of last column in Table 1).

APPLICATION OF DEVELOPED OPTIMIZATION MODEL FOR SAMPLE SIZE

The optimum sample sizes for asphalt binder content and HMA core density under the AASHTO Quality Assurance Guide Specification for HMAC pavement were determined by using the developed model.

Relationships between sample size (n) and cost of sampling and testing [costT(n)] for these AQCs (see Figures 3 and 4) were developed on the basis of an interview with a commercial laboratory manager from Austin, Texas, in 2008. Austin’s prices were adjusted to represent the national average prices by using a ratio of 0.803 (11). A 0.803 city index indicates that sampling and testing prices in Austin are 80.3% of the national average prices.

The following assumptions were made in estimating the sampling and testing costs for asphalt binder content:

• Testing is performed in the plant.
• Asphalt binder content and gradation are tested at the same time and on the same unmolded HMA specimen.
• Cost of sample preparation and testing is $57.50 per technician-hour.
• Cost of transportation is $34.00.
• Testing duration is 3 h per set of tests.
• Production rate is four sets of tests per day, assuming that the plant will produce for 10 to 12 h and the technician stays at the plant during that time period.

The following assumptions were made in estimating the sampling and testing costs for HMA core density:
FIGURE 3 National average cost of sampling and testing for unmolded HMA (binder content and gradation together), 2008 prices. [Plot: Sampling and Testing Cost, $ (0 to 3,000) versus Sample Size, Samples per Lot (1 to 15).]

FIGURE 4 National average cost of sampling and testing for HMA density cores, 2008 prices. [Plot: Sampling and Testing Cost, $ (0 to 3,000) versus Sample Size, Samples per Lot (1 to 15).]
• Cost of coring (including coring, preparation, and patching) is $99.60 per core.
• Cost of maximum theoretical specific gravity testing is $52.30 per core.
• Cost of density testing is $26.20 per core.

Cost of Sampling and Testing Example

This example explains how Figures 3 and 4 were developed from the cost assumptions presented above. The cost of sampling and testing of five HMA cores for density is computed as follows:

| Sampling or Testing Item | Cost ($) |
|---|---|
| Transportation between lab and project site | 115.10 |
| Coring ($99.60/core) | 498.00 |
| Maximum theoretical specific gravity test ($52.30/core) | 261.50 |
| Density test ($26.20/core) | 131.00 |
| Total (for five cores) | 1,005.50 |

Model Outputs and Sensitivity

The optimum sample sizes for asphalt binder content and HMA core density (when both of these AQCs are included in the acceptance plan) are shown in Table 3 for various scenarios of lot size, unit bid price, and historical quality levels. Table 3 shows that, for most practical cases, the optimum sample size is three tests per lot. Only in the case of historically poor quality, high unit bid price, and large lot size is a larger sample size warranted (n = 7 or 8 tests per lot for a 4,000-ton lot with a unit bid price of $100/ton or $120/ton, respectively; optimum sampling frequency = optimum n ÷ lot size). That model does not preclude agencies from replacing outliers with new sample units if there is evidence that the outliers occurred because of sampling and testing errors.

The AASHTO Quality Assurance Guide Specification specifies a minimum sample size of five for density and a minimum sample size of four for binder content. Based on the analysis performed here, the AASHTO-recommended minimum sample sizes are unnecessarily high for most practical cases. A sample size of three is closer to optimal.

The effect of the number of AQCs considered in the acceptance plan on the optimum sample size is demonstrated here by using the same AASHTO Quality Assurance Guide Specification for asphalt binder content and HMA core density (discussed in the earlier section on model inputs). To determine the effect of the number of AQCs considered in the acceptance plan on optimum sample size, the optimum sample sizes for the following scenarios were computed by using the optimization model for sample size:

• Acceptance plan considers binder content only,
• Acceptance plan considers HMA core density only, and
• Acceptance plan considers both binder content and HMA core density.

Figure 5 shows that, as the number of tested AQCs increases, the optimum sample size for each AQC decreases. Furthermore, the total amount of testing decreases. Thus, for any given total amount of testing, it is more cost-effective to test more AQCs than to increase the sample size for any individual AQC. This result is expected from both theoretical and practical standpoints. From a theoretical standpoint, the joint probability of occurrence for all scenarios involving the acceptance of rejectable quality [i.e., the term (Pi βi × Pi+1 βi+1 × . . . × Pk βk) in Equation 6] decreases as the number of AQCs (i.e., k) increases. From a practical standpoint, the chances of detecting and rejecting
| Scenario | Historically Poor Quality | Historically Regular Quality | Historically Good Quality |
|---|---|---|---|
| Lot = 2,000 tons, unit bid price = $80 per ton | Binder content: 3 tests per lot; HMA density: 3 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot |
| Lot = 3,000 tons, unit bid price = $100 per ton | Binder content: 3 tests per lot; HMA density: 3 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot |
| Lot = 4,000 tons, unit bid price = $120 per ton | Binder content: 8 tests per lot; HMA density: 8 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot |
| Lot = 4,000 tons, unit bid price = $100 per ton | Binder content: 7 tests per lot; HMA density: 7 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot | Binder content: 3 tests per lot; HMA density: 3 cores per lot |
FIGURE 5 Optimum sample size for single and double AQCs of HMA (lot = 4,000 tons, unit bid price = $100/ton). [Plot: x-axis, Historical Quality (Poor, Regular, Good); y-axis label only partly recoverable, ending in "each AQC".]
poor quality lots increase as more AQCs are tested, leading to lower optimal sample size.

both the computational model and the practical aspects of sampling and testing.
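The arithmetic of the five-core example in the section Cost of Sampling and Testing Example can be checked with a short Python sketch. The function name and the assumption that transportation is charged once per lot, independent of the number of cores, are illustrative and not stated in the paper. With the listed unit prices the items sum to $1,005.60, ten cents above the printed total of $1,005.50; the difference may come from rounding of the unit prices.

```python
TRANSPORTATION_COST = 115.10  # between lab and project site, assumed once per lot
PER_CORE_COSTS = {
    "coring": 99.60,
    "maximum theoretical specific gravity test": 52.30,
    "density test": 26.20,
}


def core_density_cost(n_cores: int) -> float:
    """Sampling and testing cost for n_cores HMA density cores, in dollars."""
    return TRANSPORTATION_COST + n_cores * sum(PER_CORE_COSTS.values())


print(f"{core_density_cost(5):,.2f}")
```

Evaluating the same function for n = 1 to 15 gives a straight-line cost curve; any steps in the published Figure 4 (for example, extra trips) are not modeled here.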
$$z_1 = \phi^{-1}\left(\frac{AQL}{100}\right) \qquad (A1)$$

and

$$z_2 = \phi^{-1}\left(\frac{RQL}{100}\right) \qquad (A2)$$

where ϕ−1 is the inverse of the standard normal cumulative distribution function.

To determine z1 and z2, one may use the standard charts for cumulative probabilities of the normal distribution function (found in most statistics and probability books) or Microsoft Excel functions as follows:

$$z_1 = \text{NormSInv}\left(\frac{AQL}{100}\right) \qquad (A3)$$

$$z_2 = \text{NormSInv}\left(\frac{RQL}{100}\right) \qquad (A4)$$

where NormSInv is a Microsoft Excel function that returns the inverse of the standard normal cumulative distribution (AQL and RQL here are expressed in PD).

2. Compute the beta distribution value k equivalent to M. This can be determined through a trial-and-error procedure. Several k values are tried to calculate PD by using the following beta distribution function. The trial-and-error operation is stopped when a k value results in a PD value that is equal to M (such k is considered equivalent to M):

$$PD = \int_{x=0}^{x=\max\left[0,\ \frac{1}{2} - \frac{k\sqrt{n}}{2(n-1)}\right]} \beta(a, b, x)\, dx \qquad (A5)$$

where a = b = (n/2) − 1.

The above function is implemented in many statistical software tools. For example, it can be solved using Microsoft Excel’s function BetaDist, as follows:

PD = BetaDist(x, a, b, 0, 1)

where BetaDist returns cumulative beta probability density function and x is computed as follows:

$$x = \max\left[0,\ \frac{1}{2} - \frac{k\sqrt{n}}{2(n-1)}\right] \qquad (A6)$$

3. Now that k, z1, and z2 are known, compute Zα and Zβ (which are the standardized normal variables of the α and β risks) as follows:

$$Z_\beta = \left(\frac{\sqrt{n}\,(z_1 - z_2)}{\sqrt{1 + \frac{k^2}{2}}}\right) \times \frac{1}{1 + \frac{k - z_1}{z_2 - k}} \qquad (A7)$$

$$k = \frac{Z_\alpha z_2 + Z_\beta z_1}{Z_\alpha + Z_\beta} \qquad (A9)$$

$$n = \left(1 + \frac{k^2}{2}\right) \left(\frac{Z_\alpha + Z_\beta}{z_1 - z_2}\right)^2 \qquad (A10)$$

4. Because Zα and Zβ are the standardized normal variables of the α and β probabilities, respectively, α and β can be computed by using Excel’s NormSDist (which returns the standard normal cumulative probability of Zα and Zβ) as follows:

seller’s risk = α = 1 − NormSDist(Zα) (A11)

buyer’s risk = β = 1 − NormSDist(Zβ) (A12)

α and β are computed for sample size n = 3, 4, 5, . . . , 30 and the results (i.e., n and its associated α and β) are stored in the spreadsheet for use in the cost optimization model.

REFERENCES

1. Russell, J. S., A. S. Hanna, E. V. Nordheim, and R. L. Schmitt. NCHRP Report 447: Testing and Inspection Levels for Hot-Mix Asphaltic Concrete Overlays. TRB, National Research Council, Washington, D.C., 2001.
2. Mahoney, J. P., and A. W. Backus. QA Specifications Practices. WA-RD 498.1. Washington State Department of Transportation, Seattle, 2000.
3. Burati, J. L., R. M. Weed, C. S. Hughes, and H. S. Hill. Evaluation of Procedures for Quality Assurance Specifications. FHWA-HRT-04-046. FHWA, U.S. Department of Transportation, 2004.
4. Construction Spending November 2008. Census Bureau, U.S. Department of Commerce. http://www.census.gov/const/www/c30index.html. Accessed Feb. 16, 2010.
5. McCabe, B., S. AbouRizk, and J. Gavin. Sample Size Analysis for Asphalt Pavement Quality Control. Journal of Infrastructure Systems, Vol. 5, No. 4, 1999, pp. 118–123.
6. Velivelli, K. L., N. G. Gharaibeh, and S. Nazarian. Sample Size Requirements for Seismic and Traditional Testing of Concrete Pavement Strength. In Transportation Research Record: Journal of the Transportation Research Board, No. 1946, Transportation Research Board of the National Academies, Washington, D.C., 2006, pp. 33–38.
7. Sheng, Z., and D.-Y. Fan. Bayes Attribute Acceptance Sampling Plan. IEEE Transactions on Reliability, Vol. 41, No. 2, 1992, pp. 307–309.
8. South, J. B. Selecting an Acceptance Sampling Plan That Minimizes Expected Error and Sampling Cost. Quality Progress, Vol. 15, No. 10, 1982, pp. 18–22.
9. Govindaraju, K. Certain Observations on Lot-Sensitive Sampling Plan. Communications in Statistics: Theory and Method, Vol. 19, No. 2, 1990, pp. 617–627.
10. Weed, R. M. Quality Assurance Software for the Personal Computer. FHWA-SA-96-026. FHWA, U.S. Department of Transportation, 1996.
11. Heavy Construction Cost Data. R. S. Means Publishing, 18th ed. (E. R. Spencer, ed.), Kingston, Mass., 2008.
12. Duncan, A. J. Quality Control and Industrial Statistics, 5th ed. R. D. Irwin, Homewood, Ill., 1986.

The Management of Quality Assurance Committee peer-reviewed this paper.
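As a companion to Steps 3 and 4 of the appendix, the risk calculation can be sketched in Python with the standard library’s `NormalDist` standing in for Excel’s NormSInv and NormSDist. Several points are assumptions of this sketch rather than statements from the paper: the function name is hypothetical; the acceptance constant k is taken as a given input (the trial-and-error search of Step 2 is not reproduced); Zα is obtained by rearranging Equation A10, because an explicit equation for Zα does not survive in the appendix text; and the inputs AQL = 90, RQL = 50, and k = 0.5 are illustrative only. The sketch covers only the case z2 < k < z1, which the illustrative inputs produce; inputs on a scale that reverses the ordering of z1 and z2 would need a matching sign convention for k, which is not attempted here.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def acceptance_risks(aql, rql, k, n):
    """Return (alpha, beta, z_alpha, z_beta) for acceptance constant k and sample size n."""
    z1 = STD_NORMAL.inv_cdf(aql / 100)  # Equations A1 and A3
    z2 = STD_NORMAL.inv_cdf(rql / 100)  # Equations A2 and A4
    if not z2 < k < z1:
        raise ValueError("this sketch assumes z2 < k < z1")
    # Z_alpha + Z_beta, obtained by rearranging Equation A10.
    z_sum = sqrt(n) * (z1 - z2) / sqrt(1 + k**2 / 2)
    z_beta = z_sum / (1 + (k - z1) / (z2 - k))  # Equation A7
    z_alpha = z_sum - z_beta
    alpha = 1 - STD_NORMAL.cdf(z_alpha)  # seller's risk, Equation A11
    beta = 1 - STD_NORMAL.cdf(z_beta)    # buyer's risk, Equation A12
    return alpha, beta, z_alpha, z_beta


# Tabulate (alpha, beta) for n = 3, 4, ..., 30 with illustrative inputs.
risk_table = {n: acceptance_risks(90, 50, 0.5, n)[:2] for n in range(3, 31)}
for n in (3, 5, 10, 30):
    alpha, beta = risk_table[n]
    print(n, round(alpha, 4), round(beta, 4))
```

Holding k fixed while n varies is a simplification; in the appendix procedure k is re-derived from M for each n before the risks are computed.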