
“There are three kinds of lies: lies, damned lies, and statistics.”
(Benjamin Disraeli, 1804-1881)

Transportation Engineering
ENCV605015
Lecture Note 3. Introduction to Statistical Methods
for Transportation Engineering
Teaching Team:
Andyka Kusuma, ST., MSc., PhD.
andyka.k@eng.ui.ac.id
Expected Learning Outcome
• Students must be able to apply statistical theory to determine various aspects of transportation engineering, particularly for analyzing traffic variables.
• Assessment:
  • Mid term test: 10%
  • Laboratory activity 2: 5%
• Learning experiences: Students will carry out a number of survey activities, interpret and present the survey data, calculate descriptive statistics of the traffic parameters, and carry out hypothesis tests of traffic performance.
• Level of knowledge:
  • Develop problem statements and solve well-defined fundamental civil engineering problems by applying appropriate techniques and tools. (ASCE-8 / Cognitive 3/C3 and Psychomotor 3/P3)
  • Analyze and solve well-defined engineering problems in at least four technical areas appropriate to civil engineering. (ASCE-14 / C4)
  • Design a system or process to meet desired needs within such realistic constraints as economic, environmental, social, political, ethical, health and safety, constructability, and sustainability. (ASCE-9 / C5)

10/19/2016 2
Introduction
Understanding of the Measurements

• Measurements are made on one of the following scales:
1. Nominal classifications: e.g. bus, truck, passenger car; or, in a questionnaire, answers such as yes, no, don’t know, not applicable. In particular, binary or dichotomous variables have only two categories: male, female; dead, alive; or link (segment) and node (junction).
2. Ordinal classifications, in which there is some natural order or ranking between categories: e.g. young, middle aged, old; or age groups: <5, 5-9, 10-14, 15-19 and so on.
3. Continuous measurements, where observations may, at least in theory, fall anywhere on a continuum: e.g. weight, length, time, or speed measurements.
• Nominal and ordinal data are usually recorded as the number of observations in each category. These counts or frequencies are called discrete variables.
• For continuous data the individual measurements are recorded.
• The term quantitative is often used for a variable measured on a continuous scale, and the term qualitative for nominal and sometimes for ordinal measurements.
• In multivariate models, a qualitative explanatory variable is called a factor and its categories are called the levels of the factor. A quantitative explanatory variable is called a covariate.
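The three scales can be illustrated with a minimal Python sketch. The survey records below are invented for illustration only; they are not data from this lecture.

```python
from collections import Counter

# Hypothetical survey records: (vehicle class, driver age group, spot speed in km/h)
records = [
    ("passenger car", "15-19", 42.5),
    ("bus", "20-24", 38.0),
    ("passenger car", "20-24", 51.2),
    ("truck", "25-29", 35.4),
    ("passenger car", "15-19", 47.9),
]

# Nominal scale: categories have no order, so we record counts per category.
vehicle_counts = Counter(vehicle for vehicle, _, _ in records)

# Ordinal scale: categories have a natural order, which we state explicitly.
age_order = ["15-19", "20-24", "25-29"]
age_counts = Counter(age for _, age, _ in records)
age_table = [(group, age_counts[group]) for group in age_order]

# Continuous scale: the individual measurements themselves are kept.
speeds = [speed for _, _, speed in records]

print(vehicle_counts)
print(age_table)
print(speeds)
```

The nominal and ordinal variables end up as frequencies (discrete), while the speeds stay as individual measurements.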

Introduction
Understanding of the Measurements

• Nominal, ordinal and continuous data must be presented carefully when using charts.
• For example, a line chart is more appropriate for presenting continuous data and time series data.
• Two kinds of data can be gathered: time series data and cross-sectional data.
• A time series is a series of data points listed (or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Example: the number of accidents in successive years.
• Cross-sectional data, or a cross section of a study population, is a type of data collected by observing many subjects (such as individuals, firms, countries, or regions) at the same point in time, or without regard to differences in time.

Introduction
Summary of Scale Types

Introduction
Component of Time Series

A time-dependent process may consist of four components:
• Trend (Tr). This is long-term change in average
quantities, such as the growth of traffic over
time;
• Seasonal variation (Se). These result from
different levels of flow at different times of the
year;
• Cyclic variation (Cy). These result from cycles in
activity, and
• Random component (Ra). This may stem from
short term behavior or special events
The components may be summed to give an overall effect, e.g. the variable X(t) at time t is given by:
X(t) = Tr(t) + Se(t) + Cy(t) + Ra(t)
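The additive model can be sketched in Python by generating a synthetic monthly traffic flow series. All numeric values (base flow, growth rate, amplitudes, cycle lengths, noise level) are assumptions chosen for illustration, not values from the lecture.

```python
import math
import random

def trend(t):
    # Tr(t): long-term growth, assumed 2 vehicles/hour added per month
    return 1000.0 + 2.0 * t

def seasonal(t):
    # Se(t): assumed 12-month seasonal pattern
    return 50.0 * math.sin(2.0 * math.pi * t / 12.0)

def cyclic(t):
    # Cy(t): assumed 60-month activity cycle
    return 30.0 * math.sin(2.0 * math.pi * t / 60.0)

def simulate(months, seed=0):
    """Return X(t) = Tr(t) + Se(t) + Cy(t) + Ra(t) for t = 0 .. months-1."""
    rng = random.Random(seed)
    series = []
    for t in range(months):
        ra = rng.gauss(0.0, 10.0)  # Ra(t): random component, assumed sd of 10
        series.append(trend(t) + seasonal(t) + cyclic(t) + ra)
    return series

flows = simulate(120)  # ten years of monthly values
print(flows[:3])
```

Because the components are simply summed, any one of them can be switched off (returned as zero) to see its contribution to the overall series.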

Introduction
Understanding of the measurements

• Example of using a pie chart: percentage of fatalities by mode in the Municipality of Depok in 2013 (439 deaths).
• Example of using a line graph to show the roughness of a road along a 20 km segment. One measure of roughness is the IRI (International Roughness Index).

Introduction
Understanding of the measurements

• Speed data for individual cars. The data usually follow the normal distribution, and they are presented as bars ordered from the lowest value on the left to the highest on the right (this is also called a histogram). The normal distribution can be added as a continuous line to approximate the data.
• Example of an ordinal classification: age group at fault in Indonesia between 2011 and 2013, presented as a bar chart.

Introduction
Descriptive Statistics

Example from 30 observations of speed data

• Data (a datum is a single observation or point) can be summarized by several descriptive numbers. The word "data" is plural, not singular.
• For example, the speed data in the histogram can be examined through the average (mean), median, minimum, maximum, and standard deviation. The data also yield the 85th percentile speed, which traffic engineers commonly treat as a guide for setting the posted speed limit.
• If there is a maximum speed limit sign, other derived measures can be established, such as the mean of the speeds above the posted maximum (mean speeding speed).

Introduction
85th percentile speed

• The maximum speed limits posted as the result of a study should be based primarily on the 85th percentile speed (the speed at or below which 85 percent of the observed vehicles travel), when adequate speed samples can be secured. The 85th percentile speed is a value that is used by many states and cities for establishing regulatory speed zones.
• Speed checks should be made as quickly as possible, but it is not necessary to check the speed of every car. In many cases, traffic will be much too heavy for the observer to check all cars.
Homework:
Using the above data, could you make an 85th percentile graph? What posted speed limit would you suggest?
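One possible way to build the cumulative curve behind an 85th percentile graph is sketched below. The function names and the 20 sample speeds are hypothetical, so this shows the method only and does not answer the homework for the lecture's data.

```python
def cumulative_percent(speeds):
    """Return (speed, cumulative % of vehicles at or below that speed) pairs."""
    ordered = sorted(speeds)
    n = len(ordered)
    points = []
    for i, v in enumerate(ordered, start=1):
        if points and points[-1][0] == v:
            # Repeated speed: keep one point with the updated cumulative share
            points[-1] = (v, 100.0 * i / n)
        else:
            points.append((v, 100.0 * i / n))
    return points

def percentile_speed(speeds, pct=85.0):
    """Lowest observed speed whose cumulative percentage reaches pct."""
    for v, cum in cumulative_percent(speeds):
        if cum >= pct:
            return v
    raise ValueError("empty speed list")

# Hypothetical spot speeds in km/h
sample = [38, 40, 41, 43, 44, 45, 45, 46, 47, 48,
          49, 50, 50, 51, 52, 53, 55, 57, 60, 65]

for speed, cum in cumulative_percent(sample):
    print(f"{speed:3d} km/h  {cum:5.1f} %")
print("85th percentile speed:", percentile_speed(sample))
```

Plotting the printed pairs (speed on the horizontal axis, cumulative percentage on the vertical axis) gives the cumulative curve; the 85th percentile is read where the curve reaches 85%.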

Introduction
Population and Sample

• Two very important terms in statistics are population and sample.
• POPULATION (defined in the statistical sense) is the complete collection of items or things that can be identified (i.e. events).
• SAMPLE is a portion or a subset of a population.

Statistical Experiment
• A contractor specifies the use of a certain street lighting system which requires a certain bulb. A manufacturer produces 10,000 LED bulbs as a special order. The management of this firm wishes to determine the average service life of the bulbs. The average can be calculated by one of two statistical experiments:
• POPULATION APPROACH - test all the light bulbs; the average will be calculated with 100% confidence.
• SAMPLE APPROACH - take a representative sample of the population. A representative sample is one where all the light bulbs in the production run have an equal chance of being selected.
• If an average is calculated from a sample, then the question is how close the sample average is to the true average (i.e. the population mean). The larger the representative sample, the more confidence we will have in the sample average being close to the true average.
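The two approaches can be compared with a small simulation. The service lives below are randomly generated under assumed values (mean 25,000 hours, standard deviation 3,000 hours); they are purely illustrative and not measured bulb data.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: service life (hours) of all 10,000 bulbs.
population = [rng.gauss(25000.0, 3000.0) for _ in range(10000)]
true_mean = statistics.mean(population)  # population approach: test every bulb

def sample_estimate(n):
    """Sample approach: return (sample mean, estimated standard error)."""
    # random.sample gives every bulb an equal chance of being selected
    sample = rng.sample(population, n)
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / n ** 0.5
    return mean, std_error

for n in (25, 100, 2500):
    mean, se = sample_estimate(n)
    print(f"n={n:5d}  sample mean={mean:9.1f}  std. error={se:7.1f}  "
          f"true mean={true_mean:9.1f}")
```

The standard error shrinks roughly with the square root of the sample size, which is the quantitative form of "the larger the sample, the more confidence".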
Traffic Survey

Cross Tabulation

[Figures: two one-dimensional tables and one two-dimensional table. The two-dimensional table is also the form used for cross-sectional studies.]

• Two-way classification tables can be obtained by subdividing the stub. These are sometimes referred to as cross-tabulations.
• Three-dimensional tables can be created by subdividing the subdivision stub.
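A one-dimensional table and a two-way cross-tabulation can be sketched with `collections.Counter`. The crash records are invented for illustration.

```python
from collections import Counter

# Hypothetical crash records: (vehicle type, severity)
records = [
    ("motorcycle", "fatal"), ("motorcycle", "injury"), ("car", "injury"),
    ("car", "injury"), ("truck", "fatal"), ("motorcycle", "injury"),
    ("car", "fatal"), ("motorcycle", "fatal"),
]

# One-dimensional table: counts by a single classification
by_vehicle = Counter(vehicle for vehicle, _ in records)

# Two-dimensional table (cross-tabulation): counts by both classifications
cross_tab = Counter(records)

rows = sorted({vehicle for vehicle, _ in records})
cols = sorted({severity for _, severity in records})

# Print the table with the stub (row labels) on the left and a total column
print(f"{'':12s}" + "".join(f"{c:>8s}" for c in cols) + f"{'total':>8s}")
for r in rows:
    cells = [cross_tab[(r, c)] for c in cols]
    print(f"{r:12s}" + "".join(f"{n:8d}" for n in cells) + f"{sum(cells):8d}")
```

A three-dimensional table follows the same pattern with three-element keys, for example (vehicle type, severity, year).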

Background of t-test

• The t-test is used to test hypotheses about means when the population variance is unknown (the usual case). It is closely related to z, the unit normal.
• Developed by Gosset for the quality control of beer.
• Comes in 3 varieties: single sample, independent samples, and dependent samples.
Background of t-test
What kind of t is it?

• Single sample t: we have only 1 group and want to test its mean against a hypothetical mean.
• Independent samples t: we have 2 means from 2 groups, with no relation between the groups, e.g., each person is randomly assigned to exactly one group.
• Dependent t: we have two means. Either the same people are in both groups, or the people are related, e.g., husband-wife, left hand-right hand, hospital patient and visitor.
Background of t-test
Degrees of Freedom
For the t distribution, degrees of freedom are always a simple function of the sample size, e.g., (N-1).

One way of explaining df is that if we know the total or mean, and all but one score, the last score is not free to vary; only (N-1) scores are free. The last score is fixed by the others: 4+3+2+X = 10, so X = 1.
Background of t-test
Single-sample t-test
With a small sample size, we compute the same numbers
as we did for z, but we compare them to the t distribution
instead of the z distribution.
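$H_0: \mu = 10; \quad H_1: \mu \neq 10; \quad s_X = 5; \quad N = 25$

$\text{est. } \sigma_M = \dfrac{s_X}{\sqrt{N}} = \dfrac{5}{\sqrt{25}} = 1$

$\bar{X} = 11 \;\Rightarrow\; t = \dfrac{\bar{X} - \mu}{\text{est. } \sigma_M} = \dfrac{11 - 10}{1} = 1$

$t_{(.05,\,24)} = 2.064$ (c.f. z = 1.96); 1 < 2.064, n.s.

Interval $= \bar{X} \pm t\,\hat{\sigma}_M = 11 \pm 2.064(1) = [8.936, 13.064]$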
Interval is about 9 to 13 and contains 10, so n.s.
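The arithmetic of this worked example can be checked with a short Python sketch. The function names are illustrative, and the critical value 2.064 is taken from the example above (a t table) rather than computed.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (standard error, t statistic, degrees of freedom)."""
    std_error = sample_sd / math.sqrt(n)
    t_stat = (sample_mean - hypothesized_mean) / std_error
    return std_error, t_stat, n - 1

def confidence_interval(sample_mean, std_error, t_critical):
    """Return the interval: sample mean plus or minus t_critical * std_error."""
    half_width = t_critical * std_error
    return sample_mean - half_width, sample_mean + half_width

# Values from the worked example: mean 11, H0 mean 10, s = 5, N = 25
se, t_stat, df = one_sample_t(11.0, 10.0, 5.0, 25)

# Two-tailed critical t for alpha = .05 and df = 24, as given in the example
T_CRIT = 2.064
low, high = confidence_interval(11.0, se, T_CRIT)
significant = abs(t_stat) > T_CRIT

print(f"SE = {se}, t = {t_stat}, df = {df}")
print(f"interval = [{low:.3f}, {high:.3f}], significant = {significant}")
```

The test and the interval agree: |t| is below the critical value, and the interval contains the hypothesized mean of 10.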
