

• Classification types that are difficult to separate;


• Length of shift and time since last break;
• Enumerator ability, willingness, vision quality, and reliability;
• Weather and lighting conditions;
• Vehicle flow rates.

With so many factors at play, enumerators will inevitably make mistakes, and the resulting variation is likely to be unpredictable.
If enumerators are to be assessed, they should be assessed either against a video or other permanent record, or against each other in live situations. In the latter case, it is reasonable to assume initially that each enumerator has the same performance variance, so the error standard deviation attributable to each is 1/√2 times that of the difference between the two counts.
Unless further quantified information is at hand, a manual enumeration
error between 1% and 5% is not an unreasonable assumption.
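The 1/√2 relationship above can be sketched numerically. The following is a minimal illustration with hypothetical paired counts, assuming both enumerators have equal, independent error variance:

```python
import math
import statistics

def per_enumerator_sd(counts_a, counts_b):
    """Estimate each enumerator's error SD from paired live counts.

    Assumes both enumerators have equal, independent error variance,
    so Var(A - B) = 2 * Var(each) and SD(each) = SD(A - B) / sqrt(2).
    """
    diffs = [a - b for a, b in zip(counts_a, counts_b)]
    return statistics.stdev(diffs) / math.sqrt(2)

# Hypothetical interval counts by two enumerators at the same site:
sd_each = per_enumerator_sd([100, 102, 98, 101], [101, 100, 99, 99])
```

The division by √2 simply reverses the addition of the two equal variances in the difference of the counts.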

3.16.2 Typical Blunders


The following error sources are common to all technologies:

• Incorrect dimensions are entered into the TME.


• Sensors installed are not perpendicular to the flow of traffic at the site.
Often traffic flow is not parallel to the curb or in the center of the lane.
Before installation, observe the flow of traffic, and place the sensors in the
center of this track and perpendicular to the flow.
• Data is mistakenly collected from the wrong site or assigned the wrong site ID. This sounds like an obvious error, but it happens about 1 in 100 times on temporary surveys.
• Technician is not following procedures or manuals or is inadequately
trained.
• A relevant quality manual and/or appropriate attitude by the employing
organization is lacking.

Equipment should be calibrated for speed measurement before error analysis. This is because most equipment uses the speed measurement to calculate other measurements (e.g., vehicle length, wheelbases, gap).
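To see why the speed calibration matters, consider typical dual-loop arithmetic. This is only a sketch under assumed geometry, not any particular TME's algorithm; an error in the measured speed propagates directly into every derived quantity:

```python
def dual_loop_measurements(t1, t2, on_time, loop_spacing_m, loop_length_m):
    """Illustrative dual-loop arithmetic (a sketch; real TMEs differ).

    t1, t2: arrival times (s) at the upstream and downstream loops;
    on_time: time (s) for which the vehicle occupies one loop;
    loop_spacing_m: leading-edge to leading-edge distance between loops.
    """
    speed = loop_spacing_m / (t2 - t1)  # m/s
    # The loop is occupied while any metal is over it, so the occupied
    # distance is vehicle length plus loop length.
    vehicle_length = speed * on_time - loop_length_m
    return speed, vehicle_length

# Hypothetical vehicle: 4 m loop spacing, 2 m loops, 0.2 s transit time.
speed, length = dual_loop_measurements(0.0, 0.2, 0.35, 4.0, 2.0)
```

If the speed is wrong by some percentage, the derived length is wrong by the same percentage of the occupied distance, which is why speed calibration comes first.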

Dalgleish, Michael, and Neil Hoose. Highway Traffic Monitoring and Data Quality, Artech House, 2008. ProQuest Ebook
Central, http://ebookcentral.proquest.com/lib/utranscomm-ebooks/detail.action?docID=456896.
Created from utranscomm-ebooks on 2018-01-30 00:53:03.

3.16.3 Equipment Parameter Settings


The metadata field for equipment and parameters does not allow an array of values. It follows that equipment and parameters must not change within a single data file. It is permissible for a piece of equipment and/or sensor to be
replaced with an identical unit during a survey. Any data lost or affected by the
process should be recorded. Under no condition should equipment or sensors be
tweaked or adjusted during the course of the construction of a data file without
an appropriate record being made, or the data recording being restarted.

3.16.4 Loop Detector Error Sources


A TME utilizing inductive loops as vehicle sensors is often the most accurate and
reliable method of traffic monitoring. Its main disadvantage is the capital cost
and inconvenience of installation.
Errors can accumulate from a number of sources:

• Loops installed are not exactly square and central to lanes of traffic flow.
• Loops installed are the wrong size or different sizes when paired.
• Feeders are too long. Ideally feeders should be 50m or less, but sometimes
practical equipment location considerations make this impossible.

These error sources result partly from technological limitations and partly
from blunders. If they are a feature of a site, the documentation or file note for
the site should record the reasons.

3.16.5 Errors in Length Measurement Using Loops


There is a high random-error component in loop length measurements because a
loop detector measures the metallic length of the vehicle, not its physical length.
This is a basic technology limitation. Usually, the two characteristics are fairly
closely related, but the assumption is not perfect because

1. The front and rear of the vehicle may be largely constructed of plastic, and the distance from the physical edges to the main metal mass is variable.
2. The height of the metal at the front and rear of the vehicle varies
considerably.


The key point is that the loop detector measures the length of the con-
tiguous conductive area of the vehicle, not its true physical length. The con-
ductive length is a proxy for physical length but satisfactory for most traffic
applications.
The typical variation between the actual physical length and the actual
conductive length for cars is about ±20 cm at a 95% confidence level. The actual
errors in the detection technology need to be added to these values. This then
gives the gross error.
In other words, there are two types of error: proxy error, or the degree to
which the measurand is indicative of the parameter being measured, and detec-
tor error, or the error that the detector makes when measuring the proxy. The
upshot of all this for length measurement is that an optical technique is going to
get results much closer to the truth because the proxy error is so low.
Going back to the loop length, ±4 cm repeatability of length measurement
is routine for current loop detectors. Note this is not the accuracy but the repeat-
ability, or “precision,” of the repeated measurement of a single vehicle.
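As a numerical check on the figures above, independent random error components are conventionally combined in quadrature. This is a sketch assuming independence of the proxy and detector errors, using the book's illustrative values:

```python
import math

def gross_length_error(proxy_err_cm, detector_err_cm):
    """Combine independent random error components in quadrature."""
    return math.sqrt(proxy_err_cm ** 2 + detector_err_cm ** 2)

# ±20 cm proxy error combined with ±4 cm detector repeatability:
gross = gross_length_error(20, 4)  # just over 20 cm
```

The result barely exceeds the proxy error alone, which illustrates the point in the text: for length, the proxy error dominates, so a low-proxy-error technique (such as an optical one) gains far more than a better loop detector would.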
Microwave detectors also use more approximate methods to estimate
length, which will show much greater variation than loop detection.

3.16.6 Tube Detector Error Sources


Tube detectors are the most common form of axle detector and are ideal for
temporary surveys. Potential blunders include the following:

• Tubes are not at a right angle to vehicle flow;


• When two tubes are used, they are not parallel;
• A systematic error is caused by unequal tube lengths when two tubes are used.

Technology limitations include:

• Tubes that move about on the road surface, especially with large vehicles;
• The finite sampling resolution of the machine when measuring speed;
• Undercounting when vehicle speeds fall;
• Overcounting when vehicles stop or turn on the tubes and at high
speed.

Axle detectors also cause undercounting due to masking when placed over
two lanes. This may be modeled statistically. If tube detectors are used to count


vehicles, they require axle-to-vehicle calibration or the determination of correction factors.

3.16.7 Microwave Sensor Error Sources


Microwave sensors are subject to the following error sources:

• Multiple targets are in view at once (i.e., very heavy traffic).


• Stationary or very slow traffic results in undercounting because the
Doppler shift effect is too small for reliable detection.
• Obscuration of far targets by large objects near the detector will cause
undercounting in multilane situations.

3.16.8 Number Plate Reader Error Sources


The error rates of ANPR are affected by a number of factors. For example, the
reading of individual number plates by an ANPR is affected by:

• The age of the plate, since old plates are read less reliably;
• Special and custom number plates, which will fail syntax checks more
frequently.

Obscuration due to traffic flow and dynamics is another performance issue


for ANPR. The camera cannot see some or all of the number plate when:

• There is heavy or very slow traffic, for example, during peak congested
hours;
• A vehicle in a near lane obstructs the view of vehicles in the target lane;
• A goods vehicle ahead obscures a vehicle behind;
• A goods vehicle behind obscures a vehicle in front;
• Lane changing avoids camera sensing.

In addition, site location factors affect the reading rate. Systems may be installed:

• Badly due to the cost of support structures;


• Near queuing traffic, leading to bad lane discipline;
• Near vehicular joining or departing access;


• At a difficult viewing angle;


• Near a reflective, heated, or nonporous (spray causes obscuration) road
surface;
• Next to surface markings in the region of interest;
• Near “dirty operations” (e.g., near quarries, refuse or landfill sites, logging
and other forest sites, farms that burn straw, incinerators), where more
dirt and grime will accumulate on the camera and affect performance.

The following environmental conditions can also affect the read rate:

• Direct or reflected sunlight shining onto the camera glass;
• Bad weather, wind, precipitation, and/or bad visibility.

Gantry sites are used wherever possible to avoid obscuration effects.


In the case of pole-mount sites, you should only target the closer running
lanes to realize the performance of which the equipment is capable. This is because
ANPR equipment capture and recognition rates will be reduced when the line of
sight is steeply angled and through the lorry lane of the motorway. The pole should
always be as close as possible to the path of the vehicles in the lane under survey.

3.16.9 Bias in Number Plate Readers


Most number plate readers detect and recognize between 50% and 90% of plates,
depending on the factors listed above. The errors present at a particular site can be systematic, biased, and time varying (i.e., the bias changes over time). Because they are nonrandom, most statistical theory cannot account for them. For example, a system may ignore any plate that does not meet a fixed aspect ratio
between its width and height, so the percentage of successful reads will depend
on the proportion of the population of vehicles not conforming to the aspect
ratio. Environmental influences vary over time (e.g., position of the sun during
the day), so their effect on accurate plate reading will also vary over time.
Most ANPR systems also need to record the time at which license plates are
captured. In contrast with the above, all ANPR systems have reliable and accurate clocks,
and it is relatively easy to check remotely for any time drift and correct it before
it becomes significant. For example, where journey times are being measured by
matching ANPR data from two locations, the significance of an error of a few
seconds depends on the distance between ANPR sites.
The significance of any bias in ANPR systems depends on the application
for which the data is collected. Enforcement applications need those plates that are read to be correct but can accept a lower number of successful reads provided
that the credibility of the systems is not undermined. Journey time monitoring
needs consistent readings of plates at different sites to allow matches to be maxi-
mized, although the individual reads do not necessarily have to be correct.
Therefore, quantifying bias in ANPR means verifying its performance
against all vehicles that traverse the section during the verification period. This
entails a manual process in most cases. Note that due to the high prevalence of
time-varying sources of error, the selection of verification periods becomes im-
portant in designing a verification regime.
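Such a manual verification might be scored as follows. This is a sketch only: it assumes plate strings are compared exactly against a complete ground-truth list for the verification period, whereas a real regime would also track partial reads and syntax failures:

```python
from collections import Counter

def anpr_read_rate(ground_truth_plates, anpr_plates):
    """Fraction of ground-truth plates that the ANPR reported correctly.

    A crude multiset match: each ANPR read can account for at most one
    passing vehicle carrying that exact plate string.
    """
    truth = Counter(ground_truth_plates)
    reads = Counter(anpr_plates)
    matched = sum(min(truth[p], reads[p]) for p in truth)
    return matched / sum(truth.values())

# Hypothetical verification period: 4 vehicles pass, 3 reads reported,
# one of them a misread ("XY34ZZ2" instead of "XY34ZZZ").
rate = anpr_read_rate(
    ["AB12CDE", "XY34ZZZ", "AB12CDE", "QQ11QQQ"],
    ["AB12CDE", "XY34ZZ2", "AB12CDE"],
)
```

Running the verification at several different times of day, as the text suggests, exposes the time-varying component of the bias rather than a single snapshot.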

3.17 Meaning of Capability


When the term capability is used in connection with a TME, it refers to the
performance that the TME is capable of in good or faultless conditions. Actual
performance at a particular site will vary from this optimum capability level due
to the reality of the external factors that influence field performance. For ex-
ample, all the following factors have the potential to reduce performance below
capability:

• The characteristics and positioning of the sensor array;


• The composition of the vehicular stream at various times in the day;
• Environmental conditions, such as rain, fog, snow, wind, and temperature.

3.18 Relevance of Quality Assurance


Every traffic monitoring site should be the subject of a permanent procedure, and its data should be accompanied by sufficient information to properly detail what the data is, as well as its bias and variability.
It is essential that all manufacturer-originated and local user information
about the TME be made available to the installation personnel at the time of
installation and to the application user at the time of use.
All TME devices require careful installation and calibration to give best
results. If devices are installed without calibration, then the results will be of
unknown quality.
We take limited error surveys because it is impractical to survey the entire
population for errors. Taking a sample introduces doubt in assuming that the
survey results are representative of a survey of the whole population. Much of
this document is concerned with the quantification of the additional uncertainty
due to sampling (i.e., sampling error).

3.19 Summary
This chapter has focused on the systematic error rate and its determination. In
particular, it has highlighted a few critical data items:

• The mean error;


• The confidence interval for the mean error;
• The minimum sample needed to get the confidence interval of the mean
error within specified bounds.
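The third item, the minimum sample size, follows from inverting the confidence-interval half-width formula. A sketch, assuming the error standard deviation is already estimated from a pilot survey:

```python
import math

def min_sample_size(sd_percent, half_width_percent, z=1.96):
    """Smallest n such that z * sd / sqrt(n) <= desired CI half-width.

    Derived from half_width = z * sd / sqrt(n), i.e. n = (z * sd / half_width)^2.
    """
    return math.ceil((z * sd_percent / half_width_percent) ** 2)

# Pilot SD of 0.71% error; require the CI of the mean within ±0.50%:
n = min_sample_size(0.71, 0.50)
```

Halving the required half-width quadruples the sample size, which is why tightening a specification is expensive in survey effort.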

This chapter has given worked examples to illustrate the differences be-
tween different assumptions and the associated calculations. Finally, the main
sources of error for a variety of data collection techniques have been discussed.
4
Accuracy Assessments

4.1 Introduction
Up to now, we have made all our assessments in terms of systematic and ran-
dom error components. This is the preferred way to assess equipment. However,
some specifications refer to accuracy. Unfortunately, accuracy is not a universally
agreed upon term and has a number of different meanings.
In its simplest form, accuracy refers to the difference between a reported
measurement and the accepted reference value. For a number of reports, accura-
cy refers to either a confidence interval or a lack of systematic error or bias. While
these definitions sound straightforward, the data needed to support a claim of a
particular level of accuracy (e.g., ±1%) is more complicated. Most of this chapter is about that data and how it should be interpreted.


All measuring systems, including traffic monitoring systems, have sys-
tematic error or bias. In the case of traffic monitoring, the systematic error can
be regarded as a fixed ratio. Sometimes it may be a function of environmental
conditions and traffic flow, but this variance is ignored for the present analysis.
Systematic error or bias can be regarded as the opposite to accuracy; an accurate
machine has a small bias, while a machine with a large bias has a low accuracy.
When a specification refers simply to an accuracy requirement without further
qualification (e.g., “the accuracy requirement is ±1%”), such a requirement can
be assumed to be a maximum systematic error or bias.


The difference in accuracy specifications is only the degree of accuracy required (e.g., ±1%) and the thoroughness with which accuracy is determined to lie within this range. For example, the tests in this chapter include:

• An even-probability test, where the accuracy is confirmed just to be probable;
• A two-sigma test, where the accuracy is confirmed with a 95% confi-
dence level;
• A three-sigma test, where accuracy is confirmed to a 99.7% confidence
level;
• Additional conditions, such as the restricted mean (see Section 4.11).

Each of these tests confirms that systematic errors or bias lie within a cer-
tain range with a certain confidence level.
Minimizing sample size is a key requirement to keep costs down and reduce time on-site. This chapter describes mathematical methods to calculate
the minimum sample size when determining compliance with accuracy speci-
fications. It focuses on counting accuracy, but much of the material applies equally to continuous measurements such as vehicle speed, length, and so forth. However, before starting that discussion, we need to deal
with the more difficult issue of combining the fixed, but unknown, systematic
error with the random error for an interval count.

4.2 Interval Counting Variance


The previous chapter dealt with the most common question of long-term bias or systematic error in the data produced by the TMS. This addressed the issue of
the long-term difference between true and reported counting. But, as the exam-
ple showed, the error in the TMS is not regular but varies from count to count.
On occasion, it is required to determine these random errors in interval
counts. For example, if tolls are being paid based on hourly traffic counts, then
the variation to be seen in hourly figures will be of interest in addition to the
mean error, because counts in one period may attract different charges from those in another period.
Due to the random nature of miscounts, the random errors in small in-
tervals (e.g., 20 seconds or 1 minute) may be quite high compared with the
mean error rate. The random errors become more significant as the interval decreases but conversely approach zero for very long periods. These errors turn up randomly as the name suggests (sometimes in threes, like buses!), and it is this
variance for given intervals that we wish to quantify.

4.3 Confidence Interval for Individual Counts


When the multiple-sample survey method described in Chapter 3 is used, the
confidence interval including random errors for all such intervals can be esti-
mated. The example survey in Table 3.1 consists of six 10 minute measurement
samples. In this case, we can estimate the random error for the population of 10
minute samples based on the errors seen in the survey.
Refer to (2.8) and assume that a 95% confidence level is required. The
standard deviation (SD) of the percentage error of all the samples in the example
is 0.71%. We also know that the confidence interval of the mean is ±0.74% from
the multiple-sample survey. Thus, the confidence interval for all individual 10
minute count reports (CII) is estimated as

CII95% = CIM95% ± z95% × SD
= −0.61% ± (0.74% + 1.96 × 0.71%)
= −0.61% ± 2.13%

where z95% is the standard normal deviate (1.96) for a 95% confidence level.


In this formula we simply added the confidence interval of the mean to the
random error determined from the individual samples. Statisticians debate this
combining of systematic error with random errors. The simple additive combi-
nation in this manner feels right, is relatively conservative, and is easy to use.
The reason is as follows. The mean or systematic error of the TMS is a
fixed but unknown amount. We only know from the work in Chapter 3 that its best estimate is –0.61% and that there is a 95% chance that it lies somewhere
between +0.13% and –1.35%. It may well lie near the middle of this interval,
but we don’t know that. So, we have to be impartial about where the mean might
actually be.
Now, if the mean error varied with every report, we would be justified in
combining it in quadrature with the random error. But since it doesn’t, the addi-
tion of the two deviations needs to be a straight arithmetic addition. Accepting
this approach means, in the case of continuing 10 minute counts with about 200
vehicles in each sample, that:

• In 95% of cases, the true count of vehicles will lie within (–0.61% –
2.13%) to (–0.61% + 2.13%) (i.e., –2.74% to +1.48%) of what was
reported.


• If all counts are adjusted by being multiplied by 1.0061, they will then be
accurate to ±2.13% with a confidence level of 95%.

A useful implication of the central limit theorem is that longer sample periods will have smaller individual error variations. Therefore, the result applies
to all counts with a time period of more than 10 minutes duration at a higher
confidence level.
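The calculation in this section can be sketched as follows, using the figures from the worked example (the SD and the CI of the mean are taken as given, since Table 3.1 itself is not reproduced here):

```python
Z95 = 1.96  # two-sided standard normal deviate at the 95% level

def cii(mean_err_pct, cim_half_width_pct, sd_pct, z=Z95):
    """CI for individual counts: the half-width of the CI of the mean
    (systematic part) is added arithmetically to z * SD (random part),
    as argued in the text."""
    return mean_err_pct, cim_half_width_pct + z * sd_pct

# Worked example: mean error -0.61%, CIM half-width 0.74%, SD 0.71%.
centre, half = cii(-0.61, 0.74, 0.71)  # roughly (-0.61, 2.13)
```

The arithmetic (rather than quadrature) addition of the two parts reflects the argument above: the systematic component is fixed, not a second random variable.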

4.4 Calculating the Confidence Interval for Different Periods


The CII is different for different count periods. This is because the random variations, plus and minus, tend to even out over longer periods, eventually converging on the mean.
If CIIn (CII for an n-minute interval) is known, then the confidence interval
for an m-minute period can be calculated according to the following formula:

CIIm = CIIn / √(m/n)  (4.1)

For example, the CII for 10 minute intervals in Section 4.3 was 2.13%. To
calculate the CII for 60 minute intervals, use m = 60 and n = 10:

CII60 = CII10 / √(60/10) = ±2.13% / 2.45 = ±0.87%

Thus, the confidence interval for all individual 60 minute count reports is
estimated as

CII95% = CIM95% ± z95% × SD
= −0.61% ± (0.74% + 1.96 × 0.87%) = −0.61% ± 2.45%

It is probably obvious that as the time period expands toward infinity, then
the confidence interval approaches that set for the mean. The confidence interval
for individual counts for any period can be determined using this method.
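Equation (4.1) and the worked example translate directly into a brief sketch:

```python
import math

def cii_for_period(cii_n, n_minutes, m_minutes):
    """Scale the individual-count CI from an n-minute base period to an
    m-minute period per (4.1): CII_m = CII_n / sqrt(m / n)."""
    return cii_n / math.sqrt(m_minutes / n_minutes)

# The section's example: scale the 10-minute CII of 2.13% to 60 minutes.
cii_60 = cii_for_period(2.13, 10, 60)  # roughly 0.87%
```

As m grows without bound the result tends to zero, leaving only the confidence interval of the mean, which is the limiting behavior described in the text.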


4.5 Some Words about Systematic Error


Systematic error is a characteristic to be understood and incorporated into down-
stream confidence intervals. Of course, all systematic error or bias should be mini-
mized, if at all possible, by the selection of appropriate-quality equipment and
properly engineered site installations. But since it cannot be eliminated, it must be
monitored and included in data-quality statements wherever appropriate.
To illustrate the permanent presence of systematic error components, con-
sider the case of a perfect TME using loop sensors to count traffic that includes
motorcycles. Motorcycles that drive along the longitudinal edge of the loop will
be missed as a result of the inherent characteristics of the loop and the layout.
Thus, because of basic sensor limitations, a loop-based TMS will always exhibit
bias in undercounting motorcycles. If there are no other systematic errors, this
machine will always return an undercount of total vehicles.
To illustrate an opposite permanent bias component, consider the same
loop installation with respect to high-chassis vehicles and caravans. The TMS
designer has to make certain decisions about whether a gap between vehicles in
slow traffic is interpreted as separating two vehicles or is assumed to indicate a
single, high-chassis vehicle. Such a TMS will overcount high-chassis vehicles and
undercount close-following vehicles in dense, slow-moving traffic, which may
routinely happen every day. Thus, in given traffic stream conditions, a positive
bias component will always be present.
These are two examples of the permanent nature of bias components. The
TMS designer will endeavor to balance these overcounts and undercounts, but
these and other underlying bias components will always be present in certain
traffic stream and site conditions.
The best approach to the specification of error rates is to define permissible
systematic and random error rates. For example, a specification could call for an overall mean error rate of less than 0.50% and a systematic error of less than
0.50%, plus a random error of less than ±1.00%, at a 95% confidence level for
interval recordings. (The precise figures given here are for illustration only.)
A specification may sometimes call for the TMS to show “no systematic
bias,” “no bias,” “no long-term cumulative error,” or something similar. As ex-
plained, all TMSs have systematic errors or bias; hence, none of these points is
actually achievable. The purpose of such a requirement is usually to disallow the
calibration of the equipment in such a manner as to establish a permanent bias
in favor of one party or another. There is an alternative way of approaching these
types of requirement through a deliberately biased piece of equipment; this is
discussed further in Section 4.16.
If such a requirement is confirmed, it is customarily interpreted as meaning
“no significant bias,” where “significant” refers to the performance specification.


As a working rule, this may be taken to mean that the bias should be less than
some proportion of the stated accuracy requirement, for example, 50% or 33%.

4.6 Even-Probability Accuracy Test


The even-probability accuracy test returns a judgment of acceptable bias and/or
accuracy if CIM50% is contained wholly within the specified accuracy limits.
The logic of this test is that bias outside the specification is only a 50/50
probability (i.e., is not proven beyond reasonable doubt). In other words, the ac-
curacy specification is met at a 50% confidence level. This is a relaxed assessment
of bias and/or accuracy. It can be remembered by the fact that the even in the test
represents the odds that it is correct.

4.7 Two-Sigma Probability Accuracy Test


The two-sigma accuracy test returns a verdict of acceptable bias if CIM95% is
contained wholly within the specified accuracy limits.
This test extends the even-probability test by mandating that the estimated
true population mean must lie within the specification at a 95% confidence
level. It is a medium-strength assessment of bias. It can be remembered by the
fact that the two is close to the z factor of 1.96.

4.8 Three-Sigma Probability Accuracy Test


The three-sigma test returns a judgment of acceptable bias and/or accuracy if CIM99.7% is contained wholly within the specified accuracy limits. This test further extends the even- and two-sigma probability tests by mandating that the
true population mean must lie within the specification at a 99.7% confidence
level (i.e., will be incorrect only 3 times in 1,000). It is a rigorous assessment
of bias since the mean from this machine will be within the specification 997
times out of every 1,000. In other words, it is practically never in error. It can be
remembered by the fact that the three is the z factor of 3.00.
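All three tests share one mechanism and differ only in the z value applied. A compact sketch, assuming the standard error of the mean has already been computed from the survey:

```python
def passes_accuracy_test(mean_err, se_of_mean, spec_limit, z):
    """True if the CI of the mean (mean ± z * SE) lies wholly within
    ±spec_limit. Use z = 0.6745 for the even-probability (50%) test,
    1.96 for two-sigma (95%), and 3.0 for three-sigma (99.7%)."""
    lo = mean_err - z * se_of_mean
    hi = mean_err + z * se_of_mean
    return -spec_limit <= lo and hi <= spec_limit

# Example figures: mean error -0.61%, SE 0.3776%, spec limit ±1.0%.
# Passes the relaxed even-probability test but fails two-sigma.
even_ok = passes_accuracy_test(-0.61, 0.3776, 1.0, 0.6745)
two_sigma_ok = passes_accuracy_test(-0.61, 0.3776, 1.0, 1.96)
```

The z value of 0.6745 for the 50% two-sided interval is an added detail not stated in the text, but it follows from the standard normal distribution.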

4.9 Discussion of the Tests


These tests are best understood through a graphical analysis as shown in Figures 4.1 to 4.3. The graphs show some of the distributions of errors that would pass the accuracy test despite having very different accuracy (i.e., the size of the mean error) and precision (i.e., the range of errors around the mean error).

Figure 4.1 Distribution of errors for a TME with high accuracy and low precision.

Figure 4.2 Distribution of errors for a TME with low accuracy and medium precision.

The position of the mean is irrelevant; all that matters is that the 95% confidence interval lies inside the specification. Figure 4.1 shows the widest distribution that would pass, albeit using a quite unbiased machine. Figure 4.2 shows how the distribution can be quite biased and quite wide (i.e., imprecise too) and yet still pass. The final graph in Figure 4.3 shows a biased but precise machine. All passes are on the limit.


Figure 4.3 Distribution of errors for a TME with low accuracy and high precision.

4.10 Additional Conditions to the Basic Tests


In addition to the three basic accuracy tests described above, additional condi-
tions may be added. These are all designed to further restrict the position of the
mean, the ratio of the mean to random errors, or the inclusion of zero in the con-
fidence interval. They are generally mutually exclusive and simply added to the
basic test. For example, the two-sigma test combined with the restricted mean is
known as the two-sigma restricted mean accuracy test.

4.11 Restricted Mean


The restricted mean test reflects the view that a mean well away from the accu-
racy limit will be more robust than a mean close to the limit. The restricted mean
test adds the additional criterion that the mean must be within ±50% of the specified accuracy limit. The logic of this addition is that the most likely systematic
error is restricted to lying within one-half of the accuracy requirement, which
acknowledges the fact that systematic errors are much more inconvenient than
random errors, whose effect tends to cancel out.
Figure 4.4 shows three notable distributions. The central distribution shows
a wide range for the mean, extending from the lower to the upper accuracy lim-
its, with 95% of the range lying exactly at the specification limit. The two other
distributions show results at each extremity of the range allowed for by the mean.
All distributions between these two will satisfy the restricted mean test.
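The pass/fail logic of the basic and restricted mean tests can be sketched in a few lines. This is our own illustration, not the book's: the helper names and example figures are assumptions, and all quantities are percent errors.

```python
# Minimal sketch of the basic and restricted mean accuracy tests. `limit` is
# the accuracy specification (e.g., 1.0 for +/-1%), `mean` the mean error, and
# `cim` the half-width of the 95% confidence interval of the mean.

def basic_accuracy_pass(mean, cim, limit):
    """Basic test: the whole 95% confidence interval lies inside +/-limit."""
    return -limit <= mean - cim and mean + cim <= limit

def restricted_mean_pass(mean, cim, limit):
    """Restricted mean: basic test plus a mean within 50% of the limit."""
    return basic_accuracy_pass(mean, cim, limit) and abs(mean) <= 0.5 * limit

# A biased but precise TMS against a +/-1% specification:
print(basic_accuracy_pass(-0.61, 0.30, 1.0))      # True: -0.91%..-0.31% is inside spec
print(restricted_mean_pass(-0.61, 0.30, 1.0))     # False: |mean| exceeds 0.50%
```

The second call shows the extra bite of the restricted mean condition: a machine can pass the basic test and still fail because its bias uses up more than half of the tolerance.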


Figure 4.4 Error distributions for restricted mean test.

4.12 Zero Included in Range


The zero-included test introduces the further condition that the confidence interval of the mean (CIM) must contain zero. The logic of this is that if bias is not proven at the chosen confidence level
(e.g., 95%), then it is not possible to say at this confidence level (i.e., 95% in this
example) that bias does exist. Because CIM contains zero, then zero is one of the
possible values of the mean error with a 95% certainty.
Figure 4.5 shows two contrasting distributions. The less precise machine with the broad (wide) distribution passes the test, while another machine with about the same mean error fails the test because the confidence interval of the mean does not include zero.

4.13 Sample Size Trap


The zero-included test is unreliable because different sample sizes give different
results for the same machine or TMS. Ironically, the larger sample, most prob-
ably made at a higher cost, may fail, while a smaller sample size may pass. In
fact, the output of the test is not a sole function of TMS performance but more
directly, in many cases, of sample size. It thus may reasonably be described as a
test having poor reliability.


Figure 4.5 Error distributions for zero-included-in-range test.

The main problem is that this test will produce a failure result as sample
size increases beyond a critical point. In fact, the test is the result of a misun-
derstanding of the nature and meaning of the confidence interval of the mean.
Since all machines have a bias or systematic error, zero will eventually lie outside
the confidence interval of the mean of any machine, given a large enough sample
size. Therefore, ultimately all machines can be made to fail this test.
Figure 4.6 illustrates this point. When just a few samples are taken, the confidence interval is very flat and wide, as shown on the diagram for n = 5 samples.

Figure 4.6 Diagram illustrating the sample size trap.

At this number of samples, the TMS fails the specification since the
95% confidence interval extends outside the upper limit. After 30 samples, the
estimate of the population mean has not moved much, but the confidence inter-
val has now reduced in width, and the (same) TMS can be stated to be within
specification for accuracy, including both zero-included and restricted mean re-
quirements. Finally, after 100 samples have been taken, again the mean has not
moved much, but the confidence interval is now much narrower and therefore
excludes zero, making the TMS now fail the zero-included test.
All three results apply to the same TMS; all that has changed is the number
of samples in the error survey. This hopefully gives a better understanding of
what the sample number does to the estimate of the confidence interval of the
mean and how the zero-included test is really not reliable. This is further elabo-
rated in the next section.
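The trap can be checked numerically with a short sketch (our own illustration, not from the book). The bias and standard deviation are held fixed at the Table 3.1 values; only the sample count changes, and the zero-included verdict flips.

```python
import math

# Illustrative sketch of the sample size trap. The two-tailed 95% Student's t
# values are standard table entries for n - 1 degrees of freedom.
T95 = {5: 2.571, 99: 1.984}

def cim_95(sd, n):
    """Half-width of the 95% confidence interval of the mean."""
    return T95[n - 1] * sd / math.sqrt(n)

def zero_included(mean, sd, n):
    """True if the 95% CIM contains zero."""
    half = cim_95(sd, n)
    return mean - half <= 0.0 <= mean + half

mean, sd = -0.61, 0.71   # percent errors from the Table 3.1 example
print(zero_included(mean, sd, 6))     # True: -1.36%..+0.14% contains zero
print(zero_included(mean, sd, 100))   # False: -0.75%..-0.47% excludes zero
```

Because the machine is genuinely biased, any sufficiently large sample will exclude zero, which is precisely the unreliability the section describes.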

4.14 Random Error Trap


The zero-included test has been shown above to be unreliable due to the influ-
ence of sample size. This was the case for a single TMS with fixed systematic and
random error rates. But that is not the only reason this test is unreliable.
It might be thought that reducing the basic systematic and/or random er-
rors in a TMS would increase its likelihood of satisfying all applicable perfor-
mance requirements, particularly the zero-included requirement. However, this
is not the case for the zero-included test.
The so-called random error trap, which we shall now describe, is a further unreliable aspect of the zero-included test: it appears when the random error rate is improved more than the systematic error rate, causing a previously passing TMS to fail the zero-included test.
Consider the example in Table 3.1. The systematic error test is assessed
as –0.61% ± 0.74% (i.e., –1.35% to +0.13%). Therefore, 0.00% is included
in the range. This TMS is thus assessed as passing the zero-included test. Now
assume the supplier makes a large improvement to the TMS random error rate.
Assume that the standard deviation test result is improved to ±0.25%, which is
almost three times as good as the previous ±0.71%. This means that the readings
coming from the machine are more precise or consistent from reading to read-
ing than previously. At the same time, assume that the supplier also manages to
improve the systematic error rate to –0.50%, nearly a 20% improvement on the previous value of –0.61%.
Taking the improvement in precision and reworking the calculation for the
confidence interval, we have


CIM_95 = t_95,n × SD / √n = ±2.57 × 0.25 / √6 = ±0.26%

Now, using the revised mean error of –0.50%, the confidence interval of the mean is assessed as –0.50% ± 0.26% (i.e., –0.76% to –0.24%). Since the confidence interval of the mean
no longer includes 0.00%, the TMS is now declared as failing the zero-included
test. This is shown graphically in Figure 4.7, where the taller curve represents the
error distribution from the improved TMS.
In other words, by improving the random error rate more than the system-
atic, we caused the TMS to fail the zero-included-in-range test. In the same way,
it can be shown that increasing the random error rate, either alone or in conjunc-
tion with a smaller increase in the systematic error rate, will make passing the
zero-included test more probable.
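The worked figures above can be reproduced with a few lines (an illustration of ours, using the book's numbers and the table value t = 2.571 for 5 degrees of freedom):

```python
import math

# Sketch of the random error trap with six samples in both surveys.
T95_DF5 = 2.571   # two-tailed 95% Student's t for 5 degrees of freedom

def zero_included(mean, sd, n=6):
    half = T95_DF5 * sd / math.sqrt(n)
    return mean - half <= 0.0 <= mean + half

print(zero_included(-0.61, 0.71))   # True: -1.35%..+0.13%, original TMS passes
print(zero_included(-0.50, 0.25))   # False: -0.76%..-0.24%, improved TMS fails
```

The improved machine has both a smaller bias and a much smaller spread, yet its tighter interval no longer straddles zero.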

4.15 Test Failure Options


If the client insists on the zero-included test and it fails, the following procedures
can be applied to assist a pass:

• Repeat the test using a subset of the data collected to reduce the sample size. This will in effect reduce the divisor (√n) in the confidence interval estimate and produce a wider interval. This will clearly only work if the mean is 50% or less of the tolerance or accuracy requirement after the mean from the reduced sample set is considered.
• The technique implied above in Section 4.14 can be applied to the TMS. This involves introducing random errors into the raw data (inside or outside the TMS) and producing adjusted data. This will increase the standard deviation (the numerator of the confidence interval) and again produce a wider interval. Again, it will only work if the mean is 50% or less of the accuracy requirement.

Figure 4.7 Diagram illustrating the random error trap.

Both of these actions aim at exploiting the weakness in this test. This is
permissible as long as the repeated tests performed on the adjusted data are then
in compliance with the systematic and random error requirements. However, the
client could be asked whether he would rather adopt the restricted mean condi-
tion, which achieves the same goal (i.e., halving bias from the specified limit),
while not having sample size problems. If the restricted mean test is adopted,
there may be implications for the cost of the TMS, as well as its maintenance
and verification.
In any event, the verification team would do well to watch the sample size
and keep it to the minimum necessary to enable a pass at the specification limit,
either positive or negative. As soon as this point has been reached, no more
samples should be taken. Incidentally, the remarks in this section, while focused
on vehicle counting, also apply to most vehicle and traffic stream parameters.

4.16 One-Sided Accuracy Requirements


TMS installations are normally arranged to provide a best estimate of actual traf-
fic flows and parameters. This is appropriate for most engineering applications.
But in the specific case of “shadow tolling,” a problem arises: when the system
turns in systematic overcounts, the shadow toll payer can become dissatisfied
that he is continually paying for vehicles that don’t exist. If the payer is the
taxpayer, then the political aspect of paying for vehicles that don’t exist can be
problematic.
This problem is similar to the measurement of any commodity subject to weights and measures control. The criterion changes from best estimate to guaranteed
goods for a given price. In other words, the specification becomes, say, +0.00%
– 2.00%.
This of course relies on equipment being available to achieve such a goal,
which is not necessarily straightforward for a counting device, although it is pos-
sibly easier for measuring continuous variables. However, the advantage is that


the payer will (at, say, a 95% confidence level) never pay for vehicles that weren’t
there or didn’t meet some other variable criterion.

4.17 Minimizing Sample Size by Combining Mean and CIM Data


With the above accuracy specifications, sample size can be further minimized,
depending on the systematic and random error rates.

4.17.1 Minimum Multiple Sample for Determining Accuracy within Specification


Determining compliance with an accuracy specification at a certain confidence
level is frequently required. A TME with an accuracy specification of ±1% will
thus be required to have a mean error plus an error in the mean equal to or less
than the accuracy specification at the required confidence level. In the example
in Table 3.1, having taken six samples, we determined the bias to be –0.61%
and the standard deviation to be 0.71%. Therefore, 0.39% (1.00% – 0.61%) is
available for the confidence interval of the mean. We can calculate how many 10-minute samples we need to get a confidence interval wholly within the overall
accuracy requirement of ±1% as follows:
n = t²_p,n × (SD / CIM_p)² = 2.57² × (0.71 / 0.39)² = 6.6 × 3.31 = 21.9

Rounding this up to 22, which is now larger than the 6 in the first survey, we split the difference and add 10% (i.e., 14 + 1 = 15). Substituting Student's t of 2.15 for n = 15, we get


n = t²_p,n × (SD / CIM_p)² = 2.15² × (0.71 / 0.39)² = 4.6 × 3.31 = 15.3

This figure is close to the assumption of 15, so we accept this result and
round up to the nearest integer (i.e., 16 samples are required). From this may be
deducted the 6 samples already taken, leaving another 10 to be done (i.e., 1 hour,
40 minutes more). At the rate of 1,157 vehicles per hour, this means a minimum
sample size of about 3,085 vehicles.
After the total 16 observations are made, the mean error and standard deviation should be recalculated and then reassessed for the minimum samples. For example, let's assume that after 16 samples we have a result of a mean error of 0.60% and a standard deviation of 0.73%:
n = t²_p,n × (SD / CIM_p)² = 2.15² × (0.73 / 0.40)² = 4.6 × 3.33 = 15.3

This confirms that the number of samples is safely 16, after taking account
of all the data.
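The two-pass calculation above can be sketched as follows (the helper and its name are ours; the Student's t values are passed in as table entries rather than computed, to keep the sketch dependency-free):

```python
import math

# Sketch of the Section 4.17.1 procedure using the book's numbers.

def required_samples(t, sd, cim_available):
    """n = t^2 * (SD / CIM_available)^2, rounded up to a whole sample."""
    return math.ceil(t ** 2 * (sd / cim_available) ** 2)

# First pass: 6 samples, SD 0.71%, bias -0.61% against a +/-1% spec,
# leaving 0.39% available for the confidence interval of the mean.
n1 = required_samples(2.571, 0.71, 0.39)   # t for 5 degrees of freedom
# Second pass with the intermediate guess of 15 samples (t for 14 d.f.):
n2 = required_samples(2.145, 0.71, 0.39)
print(n1, n2)   # 22 16
```

The first call reproduces the 21.9 (rounded up to 22) above, and the second the 15.3 (rounded up to 16) that the text accepts as the final sample count.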

4.17.2 Minimum P and M Sample for Determining Accuracy within Specification


As in Section 4.17.1 above, we are required to determine compliance with an
accuracy specification at a certain confidence level. A TME with an accuracy
specification of ±1% will thus be required to achieve a mean error plus an er-
ror in the mean equal to or less than the accuracy specification at the required
confidence level.
In the example, having taken 1,157 samples, we determined the bias to be
–0.61%. As above, 0.39% (1.00% – 0.61%) is available for the confidence inter-
val of the mean. We can calculate how many more vehicle samples we need to get
a confidence interval wholly within the overall accuracy requirement of ±1% as
follows. From (3.25), we developed the formula for sample size where the error
rate and CIM are both expressed as percentages. Hence,

n = t²_p,n × E%(100 − E%) / CIM%²_p = 1.96² × 0.61(100 − 0.61) / 0.39² = 1,531
This means just another 375 (1,531 – 1,157) vehicles need be surveyed. As
before, the calculation should be rechecked when that number of observations is
complete and the updated value for mean error is available.
For example, let’s assume the new mean error rate is 0.62%:

E % (100 − E %) 0.64 (100 − 0.64 )


n = t p2,n × 2
= 1.96 2 × = 1, 606
CIM % p 0.36 2

This means another 75 (1,606 – 1,531) samples need to be taken to bring


the CIM within the new range of 0.36%. Again, the check should be redone at
that point to ensure the number is sufficient.
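The per-vehicle sample size formula used in this section can be sketched directly (the helper name is ours; rounding up gives 1,532 where the text quotes the unrounded 1,531):

```python
import math

# Sketch of the proportion-based (per-vehicle) sample size formula:
# n = t^2 * E%(100 - E%) / CIM%^2, with t = 1.96 for a large sample at 95%.

def required_vehicles(error_pct, cim_pct, t=1.96):
    return math.ceil(t ** 2 * error_pct * (100.0 - error_pct) / cim_pct ** 2)

# Bias of 0.61% leaves 0.39% available inside a +/-1% specification:
print(required_vehicles(0.61, 0.39))   # 1532 vehicles (the text quotes 1,531)
```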


4.17.3 K Ratio and P or M Equal to Zero Minimum Sample Size


The same principles can be used to assess minimum sample size for K ratio and
P or M equal to zero, similar to the explanation in Sections 3.13.3 and 3.13.4.
However, due to the assumptions involved in these two methods and the fact
that we are calculating a tight compliance with an overall accuracy statement,
using this technique is not recommended.

4.18 Semiautomated Minimum Sample Sizing


Modern count and classification data collection systems use automated methods to collect accepted reference value data for later
enumeration and calculation of accuracy compliance. From the preceding sec-
tions of this and the previous chapter, it should be clear that as data is sequential-
ly analyzed and/or enumerated, the confidence interval of the mean reduces with
sample size, the mean tends toward its central value, and the standard deviation
of the sample also tends toward its central value. Critically, there comes a point
when an accuracy requirement is met and no further samples or enumeration are
required. If work stops at that exact point, savings can be made from the avoid-
ance of further, redundant data collection or processing.
Two examples of such potential savings are:

1. A system has fixed CCTV cameras on-site, which are connected to Moving Picture Experts Group (MPEG) encoders and a hard disk
recording system. At each verification date, the system is set to record up
to, say, 3 hours of passing vehicle video overlaid with data from the TMS
at this site. The resulting MPEG files are passed to enumerators to enumerate the vehicle count and/or classification against the TMS count and
compile a pass/fail for compliance with the accuracy specification.
2. A speed-measuring TMS is to be audited using a calibrated handheld
radar gun. The process is manual, with each pair of readings, the value
from the TMS, and the accepted reference value from the speed gun
being entered into a spreadsheet program on-site. After the survey is
complete, the resulting mean error and standard deviation are used to
calculate compliance with the specification.

In both cases, the manual element of the work (i.e., the enumeration in
the first case and the data collection in the second case) can be stopped if the
mean and the confidence interval of the mean are calculated after each vehicle is
assessed. In the case of the vehicle counts, this would be one of the automated procedures in Section 4.17; speed measurement would use the principles in Chapter 11. The only requirement is that the analysis be performed after each
data entry and the process stopped as soon as the sum of the mean and the con-
fidence interval of the mean lie inside the required accuracy specification. Mod-
ern laptop or desktop computers can be programmed to perform this function
quickly and conveniently.
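A minimal sketch of such a stopping rule follows (our own illustration: errors are assumed to be entered as signed percentages, the spec is ±1%, and z = 1.96 stands in for the exact Student's t for n − 1 degrees of freedom, which a real implementation would use):

```python
import math
import statistics

# Sketch of the semiautomated stopping rule: recompute the mean and the CIM
# after each paired reading and stop once |mean| + CIM fits inside the spec.

def should_stop(errors_pct, limit_pct=1.0, z=1.96):
    n = len(errors_pct)
    if n < 2:                      # need at least two readings for a spread
        return False
    mean = statistics.fmean(errors_pct)
    cim = z * statistics.stdev(errors_pct) / math.sqrt(n)
    return abs(mean) + cim <= limit_pct

survey = []
for reading in (-0.2, -1.1, -0.5, -0.6, -0.55, -0.65, -0.5, -0.6):
    survey.append(reading)
    if should_stop(survey):
        print("stop after", len(survey), "samples")   # stops after 4 here
        break
```

Stopping at exactly this point avoids the redundant enumeration or data collection the section describes.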

4.19 Accuracy Test Failures


If an accuracy test fails upon first attempt, it is acceptable to assume that the
test has been affected by a rogue event and to repeat the test. If the repeated test
passes, no further action is normally taken. Verification tests, conducted under
the laws of probability and statistics, may be expected to fail occasionally. This
test result may simply be one of the expected, but low-probability, individual
failures in the range beyond the confidence interval.

4.20 Calibration
Calibration is a process whereby a TME or the reported data is adjusted such that
the data provides the best estimate of the true values. Calibration may involve a
process with the TME (e.g., speed calibration) and/or a process in a downstream
computer system (e.g., count adjustment).
The derivation of calibration factors is a nontrivial exercise, especially
where the accuracy of the raw data is good, for example, ±1% or better. Many
hours of manually enumerated (24 hour) data from video-recorded observations
are required in order to determine calibration factors with suitable confidence
intervals. And this work must be redone if there are significant changes in traffic
flow at the site.
It is possible that the confidence interval for the calibration factors may be
wider than their effect. In this case, a decision not to use the calibration factors
may be made since the application of the factors to the raw data adds insignifi-
cant value at the required confidence level. This will often be the case with very
accurate TMS installations.
Calibration factors have to be reviewed and resurveyed at appropriate time
intervals and involve considerable survey efforts. In general, the more accurate the
raw TMS data is, the greater the cost of providing usable adjustment factors.
Calibration is a different process from assessment or verification. The de-
sign of an assessment process is such that qualitative data arising may be un-
suitable for calibration use. When verification is performed with nonrandom


samples, the sample data should not be used for calibration without careful
consideration.

4.20.1 An Example of Calibration for Vehicle Length


If the TMS is consistently showing a systematic speed, length, weight, or other
error, the configuration parameters in the device should be altered to correct this,
particularly if the error takes the equipment reports outside the specification.
However, calibration should only be performed after substantial data is available
from an appropriate number of error surveys, and all sources of blunder have
been excluded.
Assume an example where an average error of underreporting by 0.59% of
a vehicle parameter is determined after a large number of random samples. An
adjustment should be made to increase the appropriate configuration parameter
for the TME, for example, the loop separation parameter (LPSEP) in the case of
length measurements.
Assume that the current value of LPSEP for the lane in question is 450 cm and that the average systematic error is –0.59%. Then the adjustment is calculated as follows:

New_LPSEP = current_LPSEP / (1.0 + systematic_error) = 450 / 0.9941 = 452.7 cm (4.2)

Since LPSEP can usually only be entered to the nearest centimeter, the
value of 453 cm would be used. A small residual systematic error would remain,
and subsequent tests should be expected to yield a continuing variation in the
order of this difference. This is then an example of a systematic error that cannot
be removed.
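The adjustment in (4.2), including the rounding to whole centimeters and the residual error it leaves, can be sketched as:

```python
# Sketch of the LPSEP calibration in (4.2). The helper name is ours; the
# systematic error is a signed percentage (underreporting is negative).

def new_lpsep(current_cm, systematic_error_pct):
    exact = current_cm / (1.0 + systematic_error_pct / 100.0)
    return exact, round(exact)   # exact value, and nearest-cm value to enter

exact, entered = new_lpsep(450, -0.59)
print(round(exact, 1), entered)   # 452.7 453

# Residual systematic error left by rounding to the nearest centimeter:
residual_pct = (entered / exact - 1.0) * 100.0
print(round(residual_pct, 2))     # about +0.07%, which cannot be removed
```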
5
Collecting Data in Groups

5.1 Introduction
Binning is the process that collects data into groups by incrementing a count
(bin) based on a vehicle parameter. For example, a vehicle measured at a speed
of 48 kph would be placed in a speed bin with limits of 40 to 50 kph. Binning is
often used to conserve memory and keep communication charges low. It is most
commonly used for:

• Speed band counting;
• Length band counting;
• Vehicle type classification;
• Headway and gap band counting.

Each of these binning counts uses the same binning error theory, but the different distributions of the parameters being measured will require different detailed analysis.
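A minimal sketch of binning follows (the bin edges and labels are illustrative assumptions of ours, not taken from the book):

```python
import bisect

# Sketch of speed binning: each vehicle increments the count for the band
# its measured speed falls in. Edges are upper bounds in kph; bisect_left
# finds the first band whose upper edge is >= the measured speed.

edges = [40, 50, 65, 80]
labels = ["<=40", "41-50", "51-65", "66-80", ">80"]
counts = [0] * len(labels)

for speed_kph in (48, 37, 52, 90, 48):
    counts[bisect.bisect_left(edges, speed_kph)] += 1

print(dict(zip(labels, counts)))   # {'<=40': 1, '41-50': 2, '51-65': 1, '66-80': 0, '>80': 1}
```

A vehicle measured at 48 kph lands in the 40-50 kph band, as in the example above; only the binned counts, not the individual readings, need be stored or transmitted.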
In a modern machine, binning itself is unlikely to be done in error. For example, it is unlikely that a vehicle reported to be traveling at a speed of 48 kph will be placed in the speed bin with limits from 51 to 65 kph. But because of the


