An Unsupervised Learning Framework For Track Quality Index and Safety

Transportation Infrastructure Geotechnology
https://doi.org/10.1007/s40515-019-00087-6
TECHNICAL PAPER
An Unsupervised Learning Framework for Track Quality Index and Safety
Ahmed Lasisi 1 & Nii Attoh-Okine 1
Accepted: 24 July 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2019
Abstract
Rail track geometry defects remain a primary cause of train accidents in the USA. With
over $1 billion lost annually to geometry-related accidents, there is an urgent need to
reassess the analysis and treatment of geometry defects. The track quality index (TQI)
takes a narrow view of track assessment by focusing only on quality, without any safety
consideration. Track geometry safety limits are set by the Federal Railroad Administration
based on raw track geometry data. Since track geometry parameters vary in different ways,
there is skepticism about the effectiveness of the current two-part approach that analyzes
track quality and safety separately. This results in two maintenance regimes: regular and
spot maintenance. This study aims to create a framework through which a hybrid index that
combines both safety and quality can effectively eliminate costly spot maintenance
practices. This index would be used to create a data-driven maintenance scheme that
maximizes the time between two maintenance cycles and minimizes disruptions. This technical
note describes the proposed framework for creating such an index using unsupervised
machine learning.
Keywords Rail track engineering · Safety · Data science · Track quality
* Nii Attoh-Okine
okine@udel.edu
1 Department of Civil & Environmental Engineering, University of Delaware, Newark, DE, USA
1 Introduction
Data analytical tools and novel data science applications are increasingly applied in the
area of rail track safety (Sharma et al. 2018). A previous study focused on methods of
improving Weibull rail defect analysis (Lasisi and Attoh-Okine 2019a; Lasisi and
Attoh-Okine 2019b), while this article considers the role of track quality indices in rail
track safety.
A track quality index (TQI) is used to obtain average-based assessments of discrete rail
track segments (or sections) for the purpose of maintenance planning (Lee 2005; Esveld
2016). Federal Railroad Administration (FRA) data show that over $1 billion is annually
lost to track geometry-related accidents in the USA (Appendix). These accidents repeatedly
involve fatalities and casualties (FRA 2018). The role of TQIs in track safety needs a
close re-examination. For example, if TQIs are used to assess track quality (and determine
geometry maintenance regimes), why are they not sensitive to safety exceptions? In other
words, an index that measures the quality of track should inherently be able to indicate
when the track is not in the best shape for operation and, hence, for safety. Current
practice handles geometry exceptions by issuing slow-down orders on a track section that
violates FRA safety thresholds until the exception is corrected. Can this “spot and fix”
approach be improved by developing TQIs that are sensitive to both track quality and
safety? The aim of this article is to re-evaluate the purpose of TQIs and suggest
techniques that can help modify them to integrate several elements of track quality, ride
comfort, and geometry exceptions (safety). The literature contains many TQIs that have been
developed with varying focus, approach, and effectiveness. A few of them are highlighted in
this note to acquaint readers with them. Figure 1a depicts the relationship between an
aggregate track quality measure (TQI) over a defined length, D, and the occurrence of
geometry exceptions within the same length (El-Sibaie and Zhang 2004). Railroad
professionals argue that it is difficult to examine TQI per unit length of track because
the track measurements are hypersensitive. Maintenance decision-making based on these
measurements therefore becomes a non-trivial task. However, by obtaining a collective
measure of the per-unit measurements (e.g., surface, gage, cross-level, alignment) in the
form of a mean, standard deviation, or power spectral density, the variability in the track
parameter/measurement over the chosen length, D, can be summarized by a single scalar
value. While this approach simplifies the problem of hypersensitivity, it falls short of
picking up threshold violations, exceptions, or defects.
Typical examples of the measurements shown in Fig. 1b are long- and short-wave
profile measurements for a 4-km section of in-service freight track. As can be observed,
the stochastic nature of the per-unit measurements is likely to be lost if measurements
are averaged or aggregated as suggested by the TQI approach.
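The loss of exceptions under aggregation can be illustrated with a minimal sketch. The profile below is synthetic, and the segment length and the 25 mm limit are hypothetical values chosen for illustration only (they are not FRA thresholds). The standard deviation per segment stands in for a generic aggregate TQI.

```python
import math
from statistics import pstdev

def segment_tqi(values, seg_len):
    """Standard deviation of each consecutive block of seg_len samples."""
    return [pstdev(values[i:i + seg_len])
            for i in range(0, len(values) - seg_len + 1, seg_len)]

def segment_exceptions(values, seg_len, limit):
    """Number of samples per block whose magnitude exceeds the limit."""
    return [sum(abs(v) > limit for v in values[i:i + seg_len])
            for i in range(0, len(values) - seg_len + 1, seg_len)]

SEG_LEN = 200     # samples per segment (hypothetical)
LIMIT_MM = 25.0   # illustrative exception limit, NOT an FRA value

# Synthetic profile: a rough segment followed by a smooth one.
wave = [math.sin(2 * math.pi * k / 20) for k in range(SEG_LEN)]
rough = [10.0 * w for w in wave]    # large scatter, no exceedance
smooth = [2.0 * w for w in wave]    # small scatter ...
smooth[50] = 30.0                   # ... plus one isolated exception
profile = rough + smooth

tqi = segment_tqi(profile, SEG_LEN)
exceptions = segment_exceptions(profile, SEG_LEN, LIMIT_MM)
print("TQI per segment:", tqi)
print("Exceptions per segment:", exceptions)
```

In this sketch the segment containing the exception has the lower aggregate value, so an index based on scatter alone would rank it as the better segment.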
1.1 Objective Track Quality Index
An objective track quality index is also known as a single or individual TQI because it
aggregates measurements of only one parameter (e.g., profile or surface) over a fixed
length of track. Examples include the FRA length-based TQI, the Canadian National (CN) TQI,
and the Amtrak roughness index, summarized in Table 1 (Lasisi and Attoh-Okine 2018).
These were considered among the most popular methods of track quality evaluation until it
became difficult to treat the TQIs of different track geometry parameters separately for
the purpose of collective maintenance decision-making. The need to combine TQIs then arose.
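The three objective indices listed in Table 1 can be sketched as follows. This is a minimal illustration that follows the symbols of Table 1; the function names are hypothetical, and the measurement units (and therefore the scale of the results) are an assumption, since they are not stated here.

```python
import math
from statistics import pstdev

def fra_length_tqi(measurements, dx):
    """FRA length-based TQI: (L_s / L_o - 1) * 10^6.

    L_s is the traced length of the space curve and L_o the theoretical
    (straight) length of the section; dx is the sampling space.
    """
    steps = [math.hypot(dx, b - a)
             for a, b in zip(measurements, measurements[1:])]
    traced_length = sum(steps)             # L_s
    theoretical_length = dx * len(steps)   # L_o
    return (traced_length / theoretical_length - 1.0) * 1e6

def cn_tqi(measurements, c=700.0):
    """CN polynomial-based TQI: 1000 - C * sigma^2 (C = 700 for mainline)."""
    sigma = pstdev(measurements)
    return 1000.0 - c * sigma ** 2

def amtrak_roughness(measurements):
    """Amtrak roughness index: mean of the squared measurements."""
    return sum(x * x for x in measurements) / len(measurements)

# Usage on a small made-up series of one geometry parameter.
series = [0.0, 3.0, 0.0, 3.0, 0.0]
print(fra_length_tqi(series, dx=4.0))
print(cn_tqi([1.0, -1.0, 1.0, -1.0]))
print(amtrak_roughness([1.0, 2.0, 3.0]))
```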
Table 1 Popular objective track quality indices in North America

| TQI | Mathematical illustration | Variable description |
|---|---|---|
| FRA length-based | $L_s = \sum_{i=1}^{n} \sqrt{\Delta x_i^2 + \Delta y_i^2}$; $\mathrm{TQI}_s = \left(\frac{L_s}{L_o} - 1\right) \times 10^6$ | $L_s$, traced length of space curve; $\Delta x$, sampling space; $\Delta y$, difference of two consecutive measures; $\mathrm{TQI}_s$, computed TQI for each parameter; $L_o$, theoretical length of section |
| CN polynomial-based | $\mathrm{TQI}_i = 1000 - C\,\sigma_i^2$ | $\sigma_i$, standard deviation of each parameter; $C$, constant, 700 for mainline track; $\mathrm{TQI}_i$, computed TQI for each parameter |
| Amtrak roughness index | $\mathrm{TRI}_i = \sum_{j=1}^{n} \frac{x_{ij}^2}{n}$ | $x$, measurements; $i$, for a given parameter (e.g., surface); $j$, at a given track section of chosen length |

Fig. 1 A track quality index and the nature of track geometry measurements. a Track quality index and defects
(exceptions) (Ciobanu 2016); b long (20-m) and short (10-m) wave profile measurements for a sample 4-km
stretch of track

1.2 Artificial or Combined TQI
With the CN polynomial-based TQI approach, it was easy to combine all TQIs by a
simple average of all parameters (gage, surface (long and short chords), alignment (long
and short chords), and cross-level) except warp. The China Railway (CR) approach to
combining TQIs is to sum the standard deviations for each parameter (Xu et al. 2011).
The Australian Rail Track Corporation (ARTC) and Swedish National Railway take slightly
different approaches to combining different parameters, as follows:
1. ARTC five parameter track defectiveness

$$W_5 = 1 - (1 - w_{\text{profile}})(1 - w_{\text{alignment}})(1 - w_{\text{gage}})(1 - w_{\text{cant}})(1 - w_{\text{twist}})$$

where w is the calculated defectiveness for each parameter.

2. Swedish Q index (Muinde 2018)

$$Q = 150 - \frac{100}{3}\left[\frac{\sigma_H}{\sigma_{H\mathrm{lim}}} + 2\,\frac{\sigma_S}{\sigma_{S\mathrm{lim}}}\right]$$

where σH is the average standard deviation of right and left level, σS is the average
standard deviation of alignment, and σHlim and σSlim are the corresponding threshold limits
(Muinde 2018).
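The two combined indices above can be evaluated with a short sketch. The function and argument names are hypothetical, and the input values are made up for illustration; only the formulas follow the definitions given above.

```python
import math

def artc_w5(defectiveness):
    """ARTC five-parameter defectiveness: W5 = 1 - prod(1 - w_k)."""
    return 1.0 - math.prod(1.0 - w for w in defectiveness.values())

def swedish_q(sigma_h, sigma_s, sigma_h_lim, sigma_s_lim):
    """Swedish Q index: 150 - 100 * (sH/sHlim + 2 * sS/sSlim) / 3."""
    return 150.0 - 100.0 * (sigma_h / sigma_h_lim
                            + 2.0 * sigma_s / sigma_s_lim) / 3.0

# Made-up defectiveness values for the five parameters.
w = {"profile": 0.1, "alignment": 0.1, "gage": 0.1, "cant": 0.1, "twist": 0.1}
print(artc_w5(w))

# Standard deviations exactly at their limits give Q = 50;
# a perfectly smooth track (zero scatter) gives Q = 150.
print(swedish_q(1.5, 1.2, 1.5, 1.2))
print(swedish_q(0.0, 0.0, 1.5, 1.2))
```

Note that the factor of 2 on the second term is a fixed weight built into the Q index.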
The artificial TQIs above rest on interesting mathematical and statistical methods, behind
which are intrinsic assumptions that are obvious only to the data savvy. Firstly, the
simple average and summation approaches used by CN and CR, respectively, imply that all
geometry parameters are equally important and hence carry similar weights. However, studies
have shown that surface condition should be given priority over other track geometry
parameters because track quality can be maximized by reducing surface irregularities
(Audley and Andrews 2013; Soleimanmeigouni et al. 2016; Martey and Attoh-Okine n.d.).
Secondly, the lack of consensus in approach (rather than in measurements or parameters)
leaves room for misinterpretation of what the combined TQIs represent. For instance, the
CN TQI, which was previously thought to be a measure of overall track quality, was found to
measure only longitudinal parameters (Lasisi and Attoh-Okine 2018).
2 Flaws and Shortcomings
While some of the shortcomings of artificial TQIs are highlighted above, single/objective
TQIs are not free of limitations either. The very fact that they aggregate parameter
measurements and miss the sensitivities per unit length (or exceptions) makes their use
questionable. Nevertheless, because most objective TQIs are expressed in the same units as
the parent parameters (inch or mm), their dimensional representation leaves room for
further analysis. Combined TQIs, by contrast, are ambiguous, dimensionless, and often
arbitrary. While this is also true of other engineering indices, such as the pavement
condition index or the international roughness index for highway pavements, the latter has
a universally accepted scale with meaningful connotations. The same cannot be said of many
of the artificial/combined track quality indices in the railway world.
The subjective nature of the weight assignment for individual parameters in combined TQIs
like the Swedish Q index, Polish J coefficient, Indian TGI, etc. (Lasisi and Attoh-Okine
2018) further lowers the objectivity of any potentially universal combined TQI. In the
following sections, the authors attempt to address some of the identified shortcomings with
novel applications of data science techniques in rail track engineering.
3 Potential Enhancements
Some railroad professionals acknowledge these shortcomings, while others are resistant
because change takes time. The dilemma of combining/aggregating parameters without losing
geometry exceptions can be reframed as a dimension reduction problem that maximizes the
variance in the data set, in other words, projecting high-dimensional data onto a plane
that exhibits most of the variability in the data. This plane is defined by latent
characteristics of track (or any matrix-form) data known as eigenvectors. The process of
transforming or decomposing mutually correlated track geometry measurements into a set of
linearly uncorrelated variables or components is called principal component analysis (a
technique closely related to factor analysis). This approach can be used to decompose
multi-dimensional, multi-observation track geometry data into simple components that can
describe the vertical, horizontal, and transverse track irregularities. To verify the
consistency of this approach, several tests were conducted using different lengths and
classes of track. It was found that three principal components were sufficient to summarize
track geometry data.
Figure 2 shows that three components effectively summarize 95% of the variance in track
geometry. This observation was confirmed with three different track geometry data sets, as
follows:
1. 1 mile of passenger track

2. 84 km of class 4 freight track (South American)
3. 28 miles of class 5 mixed-use track
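The cumulative-variance check described above can be sketched with a principal component decomposition on synthetic data. The six channels below are generated from three made-up latent irregularities plus small noise, so the data are an assumption for illustration and not any of the data sets listed above. The decomposition uses the singular value decomposition from NumPy.

```python
import numpy as np

def cumulative_explained_variance(data):
    """Cumulative share of variance captured by successive principal components."""
    centered = data - data.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2 / (len(data) - 1)
    return np.cumsum(variances / variances.sum())

# Synthetic stand-in for track geometry: three latent irregularities
# (vertical, lateral, transverse) observed through six noisy channels.
rng = np.random.default_rng(0)
n = 1000
vertical = rng.normal(0.0, 3.0, n)
lateral = rng.normal(0.0, 2.0, n)
transverse = rng.normal(0.0, 1.0, n)
channels = np.column_stack(
    [vertical, vertical, lateral, lateral, transverse, transverse]
) + rng.normal(0.0, 0.05, (n, 6))

cumulative = cumulative_explained_variance(channels)
n_needed = int(np.searchsorted(cumulative, 0.95) + 1)
print("Cumulative variance:", np.round(cumulative, 3))
print("Components needed for 95%:", n_needed)
```

On real measurements the number of components needed depends on how strongly the channels are correlated, which is why the check is repeated on several data sets.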
We have demonstrated this approach using the aforementioned data sets and found a way to
solve the problem of subjective weight assignment as well as to combine multivariate track
geometry data in a way that correlates with sensitivities or exceptions in raw geometry
measurements (Lasisi and Attoh-Okine 2018). Figure 3 shows a correlation plot of objective
TQIs and combined artificial TQIs as well as the transformed or decomposed principal
components (PC1, PC2, and PC3). The combined TQIs CNTQI, J_Coeff, CHTQI, and TGI represent
the Canadian National, Polish, Chinese, and Indian TQIs, respectively, while ALI_R_124
indicates the long chord (124 ft) right alignment measurement for a sample mile of track.
Further details can be found in the cited paper (Lasisi and Attoh-Okine 2018).
Fig. 2 Cumulative variance plot versus number of principal components (Lasisi and Attoh-okine 2019)
Fig. 3 indicates that PC1 correlates most strongly with surface parameters and can be seen
as a measure of vertical irregularities. PC3, however, combines elements of gage and
cross-level, indicating that it captures transverse irregularities. Because all components
(PC1, PC2, and PC3) are orthogonal transformations of raw track geometry data, PC2 is left
to represent the longitudinal direction. Hence, it can be deduced that CNTQI, J_Coeff, and
TGI actually measure longitudinal track properties rather than overall track quality.
Therefore, none of these combined TQIs assesses the vertical track irregularities that
often drive most track geometry maintenance work (Soleimanmeigouni et al. 2016).
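The kind of reading made from the correlation plot can be sketched by correlating principal component scores with the original channels. The data and channel names below are synthetic assumptions chosen so that each latent irregularity has a distinct variance; they do not reproduce the values in Fig. 3.

```python
import numpy as np

def pc_channel_correlations(data, n_components=3):
    """Pearson correlation between the leading PC scores and each raw channel."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    full = np.corrcoef(np.column_stack([scores, data]), rowvar=False)
    # Rows: PC1..PCk, columns: original channels.
    return full[:n_components, n_components:]

# Hypothetical channel names for the synthetic columns.
names = ["surface_L", "surface_R", "align_L", "align_R", "gage", "crosslevel"]
rng = np.random.default_rng(0)
n = 1000
vertical = rng.normal(0.0, 3.0, n)
lateral = rng.normal(0.0, 2.0, n)
transverse = rng.normal(0.0, 1.0, n)
data = np.column_stack(
    [vertical, vertical, lateral, lateral, transverse, transverse]
) + rng.normal(0.0, 0.05, (n, 6))

corr = pc_channel_correlations(data)
for k, row in enumerate(corr, start=1):
    strongest = names[int(np.argmax(np.abs(row)))]
    print(f"PC{k} correlates most strongly with {strongest}")
```

The sign of a principal component is arbitrary, so only the magnitude of each correlation is meaningful when interpreting which irregularity a component represents.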
3.1 Stochastic Embedding
The stochastic nature of track geometry measurements, with geometry exceptions residing in
the same dimensional space as non-defect data (Galván-Núnez and Attoh-Okine 2018), requires
the application of stochastic embedding techniques to cluster track geometry measurements
in such a way that similar measurements (non-defect observations) are assigned high
probability while dissimilar points (or exceptions) are isolated. This procedure can be
used to combine individual TQIs while also considering safety (isolating exceptions). This
approach would effectively combine TQIs and isolate exceptions, but it is not guaranteed to
measure ride comfort or safety. To compensate for this, a framework has been developed,
shown in Fig. 5. The framework describes a four-step procedure that begins with data
collection and ends with hybrid index development. Track geometry data is collected in two
major formats:
1. Distance-based measurements; and

2. Time-based data.
Fig. 3 Correlation matrix of objective and artificial TQIs
The former is the more conventional method that most railways use. Per-unit-length
observations of gage, cross-level, surface, and alignment are collected for a given track
location. Time-based observations of three-dimensional acceleration (vertical, horizontal,
and transverse) are also collected. The effect of different train speeds is reflected in
the number of time-based observations per unit length. Figure 4 presents the structure of
time-based data per foot.
Depending on the speed of the track geometry inspection vehicle, there can be 5 to 15
observations for every geometry measurement taken per unit length. The higher the speed,
the greater the frequency of time-based observations taken, and vice versa. Both data
sources are connected by a common index, which can be used to extract the exact track
acceleration tensor (vectors) corresponding to a given track geometry measurement. Hence,
the difference in speeds is accounted for by the number of time-based observations produced
per unit length.
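The use of the common index can be sketched as a grouping step: every time-based observation carries the index of the distance-based measurement it belongs to, and the group is reduced to one acceleration vector. The record layout and the function name are hypothetical, and a plain block average per foot is used here as a simple stand-in for a moving average.

```python
from collections import defaultdict
from statistics import fmean

def sync_to_distance(time_observations):
    """Average the acceleration vectors that share a common foot index.

    time_observations: iterable of (foot_index, a_x, a_y, a_z) tuples.
    Returns {foot_index: (mean a_x, mean a_y, mean a_z)}.
    """
    groups = defaultdict(list)
    for foot_index, *acceleration in time_observations:
        groups[foot_index].append(acceleration)
    return {
        foot_index: tuple(fmean(axis) for axis in zip(*groups[foot_index]))
        for foot_index in sorted(groups)
    }

# Made-up records: two observations at foot 0 and one at foot 1.
records = [
    (0, 1.0, 0.0, 2.0),
    (0, 3.0, 0.0, 4.0),
    (1, 5.0, 1.0, 1.0),
]
synced = sync_to_distance(records)
print(synced)
```

The number of records per index may differ from foot to foot, which the grouping handles without any assumption about vehicle speed.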
Fig. 4 Data structure of time-observations versus distance-based data
Due to the relatively high frequency of time-based data, it is important to spatially
synchronize these measurements with the distance-based readings. One way to condense the
measurements and achieve a one-to-one mapping is to compute moving averages or perform time
series analysis on the high-frequency (time-based) data. Once both data sources are
synchronized in Step 1, objective TQIs are developed from distance-based data while an
exploratory data analysis is carried out on time-based measurements (Step 2). Step 3
enables the elimination of redundant acceleration parameters and gyroscope readings based
on the outcome of Step 2 (Fig. 5). The embedding operations previously discussed will also
be implemented in this phase to reduce multivariate distance-based measurements to two or
three critical components C1, C2, C3.
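The idea behind the stochastic embedding discussed at the start of this section, namely that similar observations receive high probability while exceptions are isolated, can be sketched with Gaussian neighbor probabilities. This is only the affinity step used in stochastic neighbor embedding methods, with a fixed bandwidth and without the low-dimensional optimization; the specific algorithm, the bandwidth, and the points are assumptions for illustration.

```python
import math

def neighbor_probabilities(points, sigma=1.0):
    """Row i holds p(j | i): the probability that i picks j as a neighbor."""
    probabilities = []
    for i, p_i in enumerate(points):
        weights = [
            0.0 if i == j
            else math.exp(-math.dist(p_i, p_j) ** 2 / (2.0 * sigma ** 2))
            for j, p_j in enumerate(points)
        ]
        total = sum(weights)
        probabilities.append([w / total for w in weights])
    return probabilities

def incoming_probability(probabilities):
    """Total probability that the other points assign to each point."""
    n = len(probabilities)
    return [sum(row[j] for row in probabilities) for j in range(n)]

# Five similar (non-defect) observations and one exception, in two
# made-up measurement dimensions.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5), (0.25, 0.25),
          (10.0, 10.0)]
incoming = incoming_probability(neighbor_probabilities(points))
isolated = min(range(len(points)), key=incoming.__getitem__)
print("Incoming probability per point:", incoming)
print("Most isolated observation:", isolated)
```

A full embedding would additionally place the points in two or three dimensions so that these probabilities are preserved as well as possible.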
The last phase of this framework features a unidirectional decomposition of acceleration
and gyro pairs ({αx, γx}, {αy, γy}, {αz, γz}) into a trio of hybrid parameters, τx, τy, τz,
to represent vertical, horizontal, and transverse track stabilities. Correlation analysis
of C1, C2, C3 with τx, τy, τz is then conducted to extract indices for track irregularities
in three dimensions as well as ride comfort or quality assessment. This phase serves as a
mutual validation of the developed critical components and stability parameters. A summary
of results will be shared in a follow-up article in this rail track safety data science
series.
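The correlation analysis between the critical components and the stability parameters can be sketched as a cross-correlation matrix. The arrays below are synthetic placeholders, since no results are reported here, and the way the stability parameters are generated is purely an assumption so that the example has a known answer.

```python
import numpy as np

def cross_correlation(components, stabilities):
    """Pearson correlation of every component column with every stability column."""
    z_c = (components - components.mean(axis=0)) / components.std(axis=0)
    z_t = (stabilities - stabilities.mean(axis=0)) / stabilities.std(axis=0)
    return z_c.T @ z_t / len(components)

# Placeholder data: columns C1..C3 and tau_x..tau_z for 500 track locations.
rng = np.random.default_rng(1)
components = rng.normal(size=(500, 3))
stabilities = 0.8 * components + 0.2 * rng.normal(size=(500, 3))

r = cross_correlation(components, stabilities)
print(np.round(r, 2))
```

Entry (i, j) of the result relates component Ci to the j-th stability parameter, so a strong diagonal would indicate that the two sets of quantities validate each other direction by direction.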
3.2 Safety and Probability Thresholds
In this section, the rationale for safety is considered using dimension reduction (Fodor
2002) and machine learning.
In order to integrate safety into the proposed index, the original track geometry data is
reduced using specific unsupervised learning techniques (linear and nonlinear dimension
reduction). The observed measurements are then checked against safety threshold limits.
Every observation (vector) is labeled according to which track geometry limit is exceeded
(e.g., “Surface Defect,” “Profile Defect,” or “No defect”). A supervised learning technique
(Hastie et al. 2009) or machine learning algorithm (Martey et al. 2017) is then used to
predict the labeled column using three sets of predictors:
a. Original track geometry parameters

b. Linearly reduced dimensions
c. Nonlinearly reduced dimensions
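The labeling step that produces the target column for the supervised model can be sketched as follows. The limit values are hypothetical placeholders in millimetres and are not FRA thresholds; the rule that the first exceeded parameter determines the label is also an assumption.

```python
def label_observation(observation, limits):
    """Return the label of the first exceeded limit, or "No defect"."""
    for parameter, limit in limits.items():
        if abs(observation[parameter]) > limit:
            return f"{parameter.capitalize()} Defect"
    return "No defect"

# Hypothetical limits (mm), for illustration only.
LIMITS = {"surface": 25.0, "profile": 30.0}

observations = [
    {"surface": 4.0, "profile": 6.0},
    {"surface": 27.5, "profile": 6.0},
    {"surface": 4.0, "profile": -31.0},
]
labels = [label_observation(obs, LIMITS) for obs in observations]
print(labels)
```

The resulting label column can then be predicted from each of the three predictor sets listed above.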
Fig. 5 A framework for developing Hybrid Indices that combine TQIs, safety, and ride quality
The performance of the predictors is evaluated for the linearly and nonlinearly reduced
dimensions using the original track geometry parameters as the baseline. Metrics such as
sensitivity and specificity (Marsland 2015) can be used to assess how well observations
with safety violations are learned by the hybrid index (or indices) (Zarembski et al. 2016;
Fazio 1986). Once this is established, a softmax layer can be used to export the
probability distribution for each observation and to examine how the reduced components or
the original geometry measurement thresholds correlate with the probability of defects. If
a correlation is established, a threshold can be determined based on the observations with
established geometry defects. Figure 6 shows a principal component threshold consideration
for profile defect. This limit already accounts for the different wavelengths of profile,
since the thresholds for each wavelength were learned from the labels during the supervised
learning process. Through this approach, several spot corrective maintenance activities can
be eliminated because the hybrid TQI presents an index that sets an artificial limit on
measured geometry values (Sharma et al. 2018). This will result in cost savings and further
supports the potential of machine learning in rail infrastructure management (Wright et al.
2016).
Fig. 6 Principal component 1 profile threshold consideration for safety limit
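The evaluation metrics and the softmax step mentioned in this section can be sketched as follows. The labels, predictions, and class scores are made up for illustration, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw class scores into a probability distribution."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity and specificity for one class treated as positive.

    Assumes y_true contains at least one positive and one negative case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

D, N = "Profile Defect", "No defect"
y_true = [D, D, D, N, N, N, N, N]
y_pred = [D, D, N, N, N, N, N, D]
sens, spec = sensitivity_specificity(y_true, y_pred, positive=D)
print("Sensitivity:", sens, "Specificity:", spec)

# Probability of each class for one observation, from made-up scores.
probabilities = softmax([2.0, 0.5])
print(dict(zip([D, N], probabilities)))
```

Sensitivity is the more safety-relevant of the two metrics here, because a missed defect corresponds to an exception that the hybrid index failed to learn.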
4 Concluding Remarks
Track quality indices have been the industry standard for assessing track condition and
making maintenance decisions for a couple of decades, and they continue to gain currency
among railroad practitioners across the world. Unfortunately, there have been no rational,
repeatable ways of verifying the efficacy of these TQIs; practitioners can only trust that
their decision-making is supported by some numerical track analysis. Classical data science
techniques have shown promising potential in railway engineering and beyond (Lasisi et al.
2018; Ghofrani et al. 2018) and can potentially uncover the latent properties of the TQI
methods that have been developed in the past to address track quality. Hence, this article
is one in a series of initial steps toward an exposition of the potential of data science
in track quality assessment and rail track safety engineering.
Acknowledgments The authors would like to recognize the contributions of Dr. Joseph Palese and Dr.
Emmanuel Martey.
Appendix
Table 2 FRA track geometry-related accidents (2008–2018) (FRA 2018)
Column groups: Total (count, %); Type of accident (collision, derailment, other); Reportable damage (amount, %); Casualty (killed, non-fatal)

| Specific causes | Total count | Total % | Collision | Derailment | Other | Damage amount | Damage % | Killed | Non-fatal |
|---|---|---|---|---|---|---|---|---|---|
| T001-roadbed settled or soft | 204 | 0.9 | 2 | 198 | 4 | 59,049,393 | 1.9 | 0 | 18 |
| T002-washout/rain/slide/etc. damage-track | 61 | 0.3 | – | 56 | 5 | 38,881,187 | 1.2 | 2 | 31 |
| T099-other roadbed defects | 20 | 0.1 | – | 19 | 1 | 4,724,996 | 0.1 | 0 | 0 |
| T101-cross-level of track irregular (joints) | 135 | 0.6 | – | 135 | – | 12,516,909 | 0.4 | 0 | 0 |
| T102-cross-level track irreg. (not at joints) | 166 | 0.7 | – | 163 | 3 | 39,754,862 | 1.3 | 0 | 0 |
| T103-deviate from uniform top of rail profile | 42 | 0.2 | – | 40 | 2 | 3,816,745 | 0.1 | 0 | 0 |
| T104-disturbed ballast section | 3 | 0 | – | 1 | 2 | 60,388 | 0 | 0 | 0 |
| T105-insufficient ballast section | 8 | 0 | – | 8 | – | 2,470,131 | 0.1 | 0 | 0 |
| T106-super elevation improper, excessive, etc. | 39 | 0.2 | – | 38 | 1 | 8,665,070 | 0.3 | 0 | 0 |
| T107-super elevation runoff improper | 5 | 0 | – | 5 | – | 195,703 | 0 | 0 | 0 |
| T108-track alignment irregular-not buckled/sun kink | 116 | 0.5 | – | 116 | – | 34,976,654 | 1.1 | 0 | 1 |
| T109-track alignment irregular (buckled/sun kink) | 223 | 1 | 1 | 220 | 2 | 127,461,269 | 4 | 2 | 7 |
| T110-wide gage (defective/missing crossties) | 1122 | 4.9 | 1 | 1118 | 3 | 90,207,840 | 2.8 | 0 | 9 |
| T111-wide gage (spikes/other rail fasteners) | 293 | 1.3 | – | 293 | – | 32,682,671 | 1 | 0 | 4 |
| T112-wide gage (loose, broke, etc., gage rods) | 42 | 0.2 | – | 42 | – | 2,531,146 | 0.1 | 0 | 0 |
| T113-wide gage (due to worn rails) | 114 | 0.5 | – | 114 | – | 6,966,800 | 0.2 | 0 | 4 |
| T199-other track geometry defects | 90 | 0.4 | – | 89 | 1 | 15,220,247 | 0.5 | 0 | 1 |
| Total | 2683 | 11.8 | 4 | 2655 | 24 | 480,182,011 | 15.1 | 4 | 75 |
References
Audley, M., Andrews, J.: The effects of tamping on railway track geometry degradation. Proc. Inst. Mech.
Eng. Part F J. Rail Rapid Transit. 227(4), 376–391 (2013)
Ciobanu, C.: Evaluation of the track quality. Pway Blog, Permanent Way Institution. (2016)
El-Sibaie, M., Zhang, Y.-J.: Objective track quality indices. Transp. Res. Rec., no. 1863, 81–87 (2004)
Esveld, C.: Modern Railway Track, 2nd edn. MRT-Productions (2016)
Fazio, J.L.C.: Track quality index for high speed tracks. Transp. Eng. 1(112), 46–61 (1986)
Fodor, I.K.: A survey of dimension reduction techniques (2002)
FRA: FRA Office of Safety Analysis, 3.10 - FRA accident causes (2018). http://safetydata.fra.dot.gov/officeofsafety/default.aspx
Galván-Núnez, S., Attoh-Okine, N.: A threshold-regression model for track geometry degradation (2018)
Ghofrani, F., He, Q., Goverde, R.M.P., Liu, X.: Recent applications of big data analytics in railway
transportation systems: a survey. Transp. Res. Part C. 90, 226–246 (2018)
Hastie, T., Tibshirani, R., Friedman, J.: The elements of statistical learning, vol. 1. Springer (2009)
Lasisi, A., Attoh-Okine, N.: Principal components analysis and track quality index: a machine learning
approach. Transp. Res. Part C Emerg. Technol. 91, 230–248 (Jun. 2018)
Lasisi, A., Attoh-okine, N.: Nonlinear dimension reduction technique for hybrid rail track quality index.
Transp. Res. Part C Emerg. Technol. (2019)
Lasisi, A., Attoh-Okine, N.: Research: future of Weibull defects analysis in the railway industry - railway track
and structures. Railway Track & Structures. (2019a)
Lasisi, A., Attoh-Okine, N.: Machine learning ensembles and rail defects prediction : a multi-layer stacking
methodology. ASCE-ASME J. Risk Uncertain. Eng. Syst. Part A Civ. Eng. (2019b)
Lasisi, A., Martey, E.N., Guillot, D., Attoh-okine, N.: A three-step agglomerated machine learning : an
alternative to Weibull defect analysis of rail infrastructure. IEEE Trans. Intell. Transp. Syst. (2018)
Lee, S.: Development of objective track quality indices (2005)
Marsland, S.: Machine Learning: An Algorithmic Perspective, 2nd edn. Chapman & Hall/CRC Press,
Boca Raton (2015)
Martey, E.N., Attoh-Okine, N.: Modeling tamping recovery of track geometry using the copula-based
approach (n.d.)
Martey, E.N., Ahmed, L., Attoh-Okine, N.: Track geometry big data analysis : a machine learning approach.
In: 2017 IEEE Int. Conf. Big Data (Big Data), pp. 3718–3727 (2017)
Muinde, M.S.: Railway track geometry inspection optimization. Luleå University of Technology. (2018)
Sharma, S., Cui, Y., He, Q., Mohammadi, R., Li, Z.: Data-driven optimization of railway maintenance for
track geometry. Transp. Res. Board Part C. 90, 34–58 (2018)
Soleimanmeigouni, I., Ahmadi, A., Kumar, U.: Track geometry degradation and maintenance modelling: a
review. Proc. Inst. Mech. Eng. Part F J. Rail Rapid Transit. 0(0), 1–30 (2016)
Wright, N.P., Gan, R., McVae, C.: Software and machine learning tools for monitoring railway track switch
performance. In: 7th IET Conf. Railw. Cond. Monit. 2016 (RCM 2016), pp. 1–7 (2016)
Xu, P., Sun, Q., Liu, R., Wang, F.: A short-range prediction model for track quality index. Proc. Inst. Mech.
Eng. Part F J. Rail Rapid Transit. 225, 277–285 (2011)
Zarembski, A.M., Einbinder, D., Attoh-Okine, N.: Using multiple adaptive regression to address the impact of
track geometry on development of rail defects. Constr. Build. Mater. 127, 546–555 (2016)
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
