
Statistica Neerlandica (2006) Vol. 60, nr. 2, pp. 206–224

Econometric software development: past, present and future
Marius Ooms*
Department of Econometrics, Vrije Universiteit Amsterdam,
De Boelelaan 1105, NL-1081 HV Amsterdam, The Netherlands

Jurgen A. Doornik
Nuffield College, University of Oxford, Oxford OX1 1NF, UK

We give a short international history of econometric software development,
with an emphasis on the origin of the main existing econometric
packages. We provide a Dutch perspective on this development.
We identify the characteristics of econometric software in comparison
with mathematical and statistical software. Finally, a number of recent
developments connected with the reuse of code across econometric
softwares are discussed.

Keywords and Phrases: econometric software package, modelling
language, user interface, reproducibility.

1 Introduction
Econometrics is an empirical science which develops and applies sophisticated and
realistic statistical models to economic phenomena. Existing econometric models
and methods are constantly tested against new observations and new phenomena.
New types of data constantly require new models. Econometricians were very early
adopters of the computer for economic analysis and modern econometrics still
requires state-of-the-art computer hardware and specialized software.
Over the last 50 years, econometric software has developed from complicated sets
of computer-specific instructions into widespread easy-to-use software packages and
programming languages, extensively used in academic research and education, in
official institutions, and in business.
In this paper, we first describe the history of econometric software. This part
draws heavily on the extensive account of Renfro (2004b), who corresponded with
many econometric software developers, including ourselves, in preparing his article
and in editing Renfro (2004a). In this short history, we make a small digression to
the Dutch situation. As the use of general statistical software by econometricians
increased markedly over the last decades, we then try to answer the question what

*mooms@feweb.vu.nl
 VVS, 2006. Published by Blackwell Publishing, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.
distinguishes econometric software from statistical software in our present situation.
Finally, we briefly discuss recent challenges in econometric software development
and issues connected with the reproducibility of econometric computations and the
reuse of code across econometric softwares.

2 Short history of econometric software development


Econometric software development started around 50 years ago. Renfro (2004b)
gives a detailed and up-to-date account of the history of economic software devel-
opment in the English-speaking world and states: “as a general phenomenon the
programmable electronic computer became an economic research tool only during
the 1960s” (p. 10).
Early econometric software development was labour-intensive and served only a
few institutions that could manage and pay the substantial capital input for the
required programmable computers. Moreover, software was very computer-specific.
Today, this situation has completely changed. Modern econometric software is writ-
ten by a few individuals and thousands of users perform econometric estimations,
forecasts and tests on thousands of machines. The joint cost of econometric software
and hardware is of the order of magnitude of an economist’s monthly salary,
with low depreciation and maintenance costs, which have dropped even further in
recent years. Thanks to a concentration in hardware and software development, a few
developers now serve an entire community. How did the current situation arise? The
early establishment of a few centres of econometric software development by lead-
ing econometricians has been instrumental. It has been common since the 1960s to
use the computer when developing estimators and methodologies. Software devel-
opment has reflected the perceived individual needs of the econometricians doing
the development (Renfro 2004b, pp. 39, 44).

2.1 Econometric software


The origins of existing econometric software packages can be traced back to the
1960s and 1970s, when it became easier to move computer code around. The geo-
graphical distribution of econometric software development reflects the moves of
individual econometricians, who took their code with them during their academic
careers.
Not surprisingly, early academic development of econometric software in the US
started in Boston at the Massachusetts Institute of Technology (MIT), more spe-
cifically at the Center for Computational Research in Economics and Management
Science, where Edwin Kuh led the development of Time-shared Reactive OnLine
Laboratory (TROLL), a modelling tool and a testbed for the development of new
algorithms and techniques, including nonlinear least squares and regression influ-
ence diagnostics. TROLL was initiated by Mark Eisner and developed by
econometricians like David Belsley, Roy Welsch and Robert Pindyck. Also at MIT,
Robert Hall laid the foundations of Time Series Processor software (TSP), in coop-
eration with, inter alia, Ray Fair. Common elements of TROLL and TSP were early
versions of a matrix language and the use of symbolic differentiation. Symbolic
differentiation was an important step for the 1970s implementation of the
gradient-based BHHH algorithm of Berndt, Hall, Hall and Hausman (1974) for maximum
likelihood estimation of nonlinear econometric models, which is still used in
TSP-based softwares.
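The idea behind BHHH can be illustrated with a minimal sketch. It is not the TSP or TROLL implementation: the model (an i.i.d. normal sample with parameters mu and s = log of the variance), the function names, the starting values and the simple backtracking line search are all assumptions chosen for illustration, and the scores are coded by hand rather than obtained by symbolic differentiation. The defining feature is that the Hessian is replaced by the sum of outer products of the per-observation score vectors.

```python
import math

def loglik_and_scores(theta, data):
    """Log-likelihood and per-observation scores for an i.i.d. normal sample.

    theta = (mu, s) with s = log(sigma^2); this parametrization is an
    illustrative assumption that keeps the variance positive.
    """
    mu, s = theta
    inv_var = math.exp(-s)
    loglik = 0.0
    scores = []
    for x in data:
        e = x - mu
        loglik += -0.5 * (math.log(2.0 * math.pi) + s + e * e * inv_var)
        # Derivatives of the observation's log-density w.r.t. mu and s.
        scores.append((e * inv_var, 0.5 * (e * e * inv_var - 1.0)))
    return loglik, scores

def bhhh(data, start=(0.0, 0.0), tol=1e-6, max_iter=200):
    """Maximize the log-likelihood with BHHH steps and step halving."""
    theta = list(start)
    loglik, scores = loglik_and_scores(theta, data)
    for _ in range(max_iter):
        # Gradient: sum of the per-observation scores.
        g0 = sum(sc[0] for sc in scores)
        g1 = sum(sc[1] for sc in scores)
        if max(abs(g0), abs(g1)) < tol:
            break
        # Outer-product-of-gradients matrix B, used in place of the Hessian.
        b00 = sum(sc[0] * sc[0] for sc in scores)
        b01 = sum(sc[0] * sc[1] for sc in scores)
        b11 = sum(sc[1] * sc[1] for sc in scores)
        det = b00 * b11 - b01 * b01
        # Search direction d = B^{-1} g (explicit 2x2 inverse).
        d0 = (b11 * g0 - b01 * g1) / det
        d1 = (b00 * g1 - b01 * g0) / det
        slope = g0 * d0 + g1 * d1
        # Backtracking: halve the step until the log-likelihood rises enough.
        step, improved = 1.0, False
        while step > 1e-10:
            trial = [theta[0] + step * d0, theta[1] + step * d1]
            new_loglik, new_scores = loglik_and_scores(trial, data)
            if new_loglik >= loglik + 1e-4 * step * slope:
                theta, loglik, scores = trial, new_loglik, new_scores
                improved = True
                break
            step *= 0.5
        if not improved:
            break
    return theta[0], math.exp(theta[1]), loglik

# Hypothetical sample, used only to exercise the sketch.
sample = [2.1, 1.9, 3.4, 2.8, 1.2, 2.5, 3.0, 2.3]
mu_hat, var_hat, final_loglik = bhhh(sample)
print(mu_hat, var_hat, final_loglik)
```

For this model the maximum likelihood estimates have a closed form (the sample mean and the uncorrected sample variance), so the output can be checked directly. Only first derivatives are needed, which is why automatic generation of scores by symbolic differentiation made the algorithm practical for nonlinear models.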
TROLL later developed into a commercial package. Online database management
software was developed by Data Resources Inc. (Lexington, MA, USA), which is
now part of Global Insight with headquarters in Boston. TROLL is still used by
governmental agencies around the world. This cannot be said of XSIM, a once suc-
cessful commercial offspring of TROLL, which was much more user-friendly, strong
in mixing multiperiod data, including daily data, a feature introduced only recently
in modern econometric packages.
Robert Hall later moved to UC Berkeley and continued the development of TSP,
still in cooperation with Ray Fair and others. In the PC era of the 1980s, TSP was
split into two separate programs, MicroTSP, headed by David Lilien and PC-TSP,
headed by Bronwyn Hall. MicroTSP later became the Windows-program Eviews,
Econometric Views, whereas PC-TSP is now simply called TSP, and available for
different operating systems. Both programs continue to be developed in Califor-
nia, TSP at Palo Alto and Eviews at Irvine, see Hall and Cummins (2005) and
Eviews (2004). One of the main attractions of MicroTSP and Eviews was the timely
interface for the popular generalized autoregressive conditional heteroskedasticity
(GARCH) models, which were developed in close cooperation with Robert Engle
at UC San Diego.
Naturally, official institutions in Washington DC needed and developed econo-
metric software at a very early stage. TSP code developed in the 1960s and 1970s
at MIT was not really copy(right) protected. In Washington it was also used at the
FED and at the Brookings Institution, where Charles Renfro developed the software
MODLER.
At the Census Bureau in DC, the first software for seasonal adjustment of eco-
nomic time series, Census X-11, was developed, implementing a methodology that
is now an international standard, see Ladiray and Quenneville (2001).
There was close cooperation between MIT, the Washington FED and U Penn
in the development of the influential FRB-MIT-Penn Econometric model, and code
and ideas were shared. In Philadelphia, at the University of Pennsylvania, Lawrence
Klein founded the Wharton Econometric Forecasting Associates (WEFA),
which generated research funds by selling forecasts from the Wharton model. Lawrence
Klein used the experience he had gained earlier in computer-aided econometric
model building at the University of Michigan, where he implemented the first Michigan
models together with Arthur Goldberger. Software developed for WEFA was
also used for the FRB-MIT-Penn model. WEFA is now part of Global Insight and
markets the econometric software AREMOS, which is still influenced by Klein’s
modelling methodology. Precursors of AREMOS were developed in Southampton,
in the UK. Another econometric computer package which can be traced back to the
developments at WEFA, is Modeleasy, a modelling program built on the Speakeasy
language, and currently used by several Central Banks, see http://www.modeleasy.com.
Ray Fair later moved to Yale, where he developed the Fair–Parke program, which
is still actively used for the analysis of his national and international models, see
Fair and Parke (2003). The program allows the estimation, simulation and eval-
uation of large nonlinear dynamic macromodels with (model consistent) rational
expectations.
The University of Chicago, home of the Cowles Commission in the 1940s and
early 1950s, was another centre of early software development. In the 1960s, Hous-
ton Stokes began developing his general econometric package B34S. Arnold Zellner
started the software called Bayesian Regression Analysis Package (BRAP), which
survived until the 1990s.
At the University of Wisconsin, under the direction of George Box, the first software for
ARMA analysis was written by David Pack, later transformed into AUTOBOX by
David Reilly at U. Penn. Likewise at Wisconsin, starting with code for multinomial
logit models by Nerlove and Press, William Greene developed LIMDEP for LIM-
ited DEPendent variable econometrics, see Greene (2002). He moved with his code
to Cornell and subsequently to NYU. The main development of SHAZAM by Ken
White also started at Wisconsin. White later moved with his code to Rice, Michigan
and UBC in Vancouver. Whistler et al. (2004) describe the latest version.
At Minneapolis, Minnesota, Chris Sims developed SPECTRE at the end of the
1970s. This was one of the first econometrics programs offering spectral analysis.
Subsequently, Chris Sims’s innovative macroeconometric methodology of Vector
AutoRegressive (VAR) modelling, published in Sims (1980), was implemented in a
new more general package, Regression Analysis of Time Series (RATS) by Thomas
Doan, see Doan (2004).
In the UK, Cambridge, London and Oxford were the natural places for economet-
ric software to be developed. At the Department of Applied Economics of the Uni-
versity of Cambridge, Richard Stone, the pioneer in estimation methods for national
accounts, supervised the building of the sizeable disaggregated Cambridge Multisec-
toral Model of the British economy, which involved considerable software develop-
ment. A limited company, Cambridge Econometrics, was set up by members of the
Cambridge Growth project and now maintains several models. Also at the DAE,
Hashem and Bahram Pesaran used their expertise in econometric estimation and
testing for the development of Microfit econometric software.
At the department of Statistics at the London School of Economics, economet-
ric software development was inspired by the hands-on tradition of Denis Sargan.
David Hendry, a student and later a colleague of Sargan, developed the pro-
grams AUTOREG and GIVE. Hendry took his programs to Oxford and developed
PCGIVE (Generalized Instrumental Variable Estimator) and PCFIML (Full Information
Maximum Likelihood) on the IBM PC. PCGIVE was seen as the software
implementing the influential LSE-methodology of dynamic econometric modelling.
PCFIML was the first user-friendly software to include Søren Johansen’s likelihood-
based analysis of cointegration in VAR models, see Johansen (1995). Jurgen Door-
nik continued the development. He later built a Windows interface, GiveWin, and
introduced the object-oriented econometric matrix programming language Ox, Door-
nik (1998), which allowed independent development of new packages and which was
later integrated with PcGive, Doornik and Hendry (2001). GiveWin is also avail-
able as an interface for TSP, a link that was initiated when Bronwyn Hall, owner
of TSP, was a professor at Oxford.
Also at the LSE in the 1980s, Andrew Harvey initiated the development of STAMP,
for structural time series modelling, implementing an econometric methodology
which serves both as an alternative to Box–Jenkins forecasting models and as an
alternative to Census X-11 seasonal adjustment. Later, the main development of
STAMP was taken over by Siem Jan Koopman. Under the Windows operating sys-
tem, STAMP uses the same GiveWin user interface as PcGive, Doornik and Hendry
(1999). Both programs are now part of OxMetrics, http://www.oxmetrics.com.
Somehow Princeton, New Jersey, does not appear in this brief history of today’s
econometric software. At Princeton University, Richard Quandt, a pioneer in regime
switching and disequilibrium models, developed GQOPT. This is a flexible and com-
prehensive econometrics program. However, it is only useful for skilled FORTRAN
programmers, as this is the only programming language in which GQOPT can be
used. Princeton is also the current affiliation of Chris Sims, who, after using RATS
and MATLAB, now develops his free software in R, a statistical programming lan-
guage with an increasing number of econometric applications. The decreasing impor-
tance of FORTRAN is not a local phenomenon. FORTRAN is still being replaced
by Java and C++ as the lower level programming language of choice for econome-
tricians and financial engineers, both for education and research, see e.g. the Effi-
cient Method of Moments (EMM) software of Ron Gallant and George Tauchen
at Duke University, Durham, North Carolina, but even free availability in C++ is
not enough for widespread use of econometric procedures.
It is clear that the origins of current econometric softwares can be traced back
to locations which have a long-lasting reputation for econometric excellence. Soft-
ware development simply has been a necessary condition for innovative economet-
ric methods. Only a few specific econometric programs have survived, because they
have been supplying the important new econometric methods and because they have
been providing the necessary updates for the major changes in computer hardwares,
operating systems, and user interfaces.
Hendry and Doornik (2000) discuss and illustrate the necessary changes of the
time-series econometrics program PcGive in the 1980s and 1990s: from command
interaction to menu interaction and IDE (Integrated Development Environment), from
text menus to mouse-pointer driven drop down menus and dialogs of a WIMP
(Windows, Icons, Menus, Pointing) graphical user interface (GUI), from black and
white text graphs to coloured bitmap to high quality adjustable publication ready
figures, from a static manual to a context-sensitive help system, from static presen-
tation to live presentations of simulation exercises, from basically one program code
in FORTRAN, and later in C, to a modular architecture allowing user-built exten-
sions with a user interface with the same look and feel as the standard applications.
Other softwares have had to provide similar updates in order to survive.

2.2 Econometric applications in statistical software


Some programs which originated in econometrics like RATS and Ox are also used
for statistical (time series) research outside econometrics, see e.g. Cribari-Neto and
Zarkos (2003), but in the last two decades several statistical programs also became
more geared towards econometrics and subsequently widely used by econometri-
cians. The beginning of the PC era witnessed the start of GAUSS, see Gauss (2005),
developed by Sam Jones in Maple Valley, Washington State and the birth of Stata,
by William Gould, in College Station, Texas, see Stata (2005). Although GAUSS
did not offer a new econometric methodology, it turned out to have a combina-
tion of price and features that was particularly appealing to econometricians and
economists. It soon became popular. It combined a simple macro language with short
matrix expressions, decent graphs, fast numerical algorithms, tools to handle large data
sets with limited memory, and a wide range of free and powerful packages implementing
econometric applications for cross-sectional models and time series. Schoenberg
(1997), affiliated with Washington University, developed early procedures for
constrained maximum likelihood for GAUSS, which found widespread application
in the estimation of GARCH models. Ron Schoenberg also wrote FANPAC, a finan-
cial time series analysis package with early applications of multivariate GARCH
models.
On the other hand, Stata was not an instant success among econometricians. At
first, it did not have extensive programming facilities and specialized in applications
for survival data and the analysis of complicated survey samples. Later it introduced
more programming tools and eventually a matrix language and added more and
more econometric models. Stata’s data management features made it well suited for
the econometric analysis of panel data. Time series procedures have been added.
Stata is increasingly popular and a number of introductory econometric textbooks
present examples using Stata. Kit Baum at Boston College maintains a large archive
within RePEc, http://www.repec.org, with nearly 1000 free open source Stata mod-
ules for economics and econometrics.
Three large firms, Mathworks in Boston MA, Insightful in Seattle WA, and SAS
(Statistical Analysis System) in Cary, NC, provide econometric packages for MAT-
LAB, S-PLUS and SAS, respectively. MATLAB (2004), S-PLUS and correspond-
ing packages cater for financial econometrics and operations research: financial time
series analysis, modelling credit risks and optimizing asset allocation. SAS (2004)
has a long tradition of implementing macroeconometric and microeconometric procedures
for large data sets. Econometric software innovation is not usually initiated
by these firms, but they do implement and support profitable econometric applica-
tions.
The matrix programming language and signal processing tools of MATLAB are
used by many econometricians to implement model solvers and estimation meth-
ods. A comprehensive archive of econometric tools, http://spatial-econometrics.com,
is administered by James P. LeSage at the University of Toledo, Ohio. Although the
archive is set up for spatial econometrics procedures, LeSage and Pace (2004), it
contains many “estimation functions that provide printed and graphical output sim-
ilar to that found in RATS, SAS or TSP”.
S-PLUS, originally a product of StatSci, founded by R. Douglas Martin in Seat-
tle, Washington, is a commercial version of the object-oriented statistical program-
ming language S, which Martin learned at Bell Laboratories in Murray Hill, New
Jersey, now Lucent Technologies. The software was primarily developed for statistical
data analysis of many types, see Venables and Ripley (2002). Martin himself
added robust estimation procedures, inspired by John Tukey, inventor of the terms
“bit”, FFT (Fast Fourier Transform) and EDA (Exploratory Data Analysis). The
current owner of S-PLUS, Insightful, focuses on data mining and risk management.
Zivot and Wang (2005), also in Seattle, Washington, develop the S-PLUS FinMet-
rics software for financial econometric time series analysis. Andrew Bruce and Doug
Martin provided the robust estimation methods for this package. The package also
includes financial engineering procedures developed by Carmona (2004) and recent
state space procedures by Siem Jan Koopman, see Koopman, Shephard and Door-
nik (1999).
More and more econometricians are switching from S-PLUS to the freely available
statistical system R. Free procedure libraries are available for R, http://www.r-project.org,
an Open Source statistical system, not unlike S, which was initiated
by statisticians Ross Ihaka and Robert Gentleman, see Cribari-Neto and Zarkos
(1999) for an early review. A comprehensive archive of R for econometrics does not
exist yet. A comprehensive package for financial engineering, http://www.rmetrics.org,
which encompasses many econometric time series functions, has recently been
built by Diethelm Würtz at the ETH in Zürich.
In academic research in econometrics, SAS (2004) has lost ground from its strong
position at the end of the 1980s, though its econometrics features are still being
developed, recently in state space procedures and in generalized maximum entropy
estimation. Of course, SAS is widely used in official institutions and in business
applications, but few modern econometrics textbooks continue to use SAS exam-
ples. The developers of the econometric procedures in SAS are not well known in
the econometrics community. They visit Allied Social Science Association (ASSA)
meetings and econometric conferences and courses as participants, but not as exhib-
itors or presenters, whereas developers of Eviews, TSP, GAUSS and OxMetrics
do.
[Figure 1: timeline with horizontal axis 1960 to 2005, in three panels.
Econometrics USA/Can: TROLL, Census X-11, X-12, ModelEasy, ModelEasy+, TSP, PC-TSP,
Micro-TSP, Eviews, AUTOBOX, MODLER, AREMOS, SHAZAM, B34T, B34S, LIMDEP, NLOGIT,
SPECTRE, RATS.
Math/statistics: GAUSS, SAS, Stata, S-PLUS, MATLAB, R.
Econometrics UK: AUTOREG, GIVE, PCGIVE, PcGive, PCFIML, PcFiml, PcGets, G@RCH, Ox,
OxMetrics, STAMP, Microfit.]

Fig. 1. Econometric softwares and related statistical software in time. Top: Products developed in
the USA, but Shazam also in Canada. Middle: Statistics softwares with Econometric Applications,
mostly US, but R international. Bottom: Products developed in UK, but G@RCH
in Belgium, STAMP also in the Netherlands. Release dates only approximate.

Figure 1 depicts the main software products discussed in this section. The hor-
izontal axis represents the time of release of the different softwares. The first new
product wave in the 1960s can be connected with the availability of FORTRAN.
The second wave in the 1970s corresponds with the appearance of computer ter-
minal interfaces. The third wave in the beginning of the 1980s was connected with
the development of the first micro-computers and IBM-PCs. Finally, the graphical
interface for microcomputers that became widely available in the 1990s led to new
names in econometric software.

2.3 Dutch econometric software


As this article is written to commemorate the founding of the Econometric Insti-
tute in Rotterdam, it befits this occasion to add a few paragraphs on the history of
econometric software development in the Netherlands. Before the widespread avail-
ability of standard econometric software like TSP, econometric researchers devel-
oped their own code, using the tools of the time: FORTRAN, punch cards, card
readers and mainframe computers. Note that econometrics predates computer sci-
ence, so econometricians also organized, bought and tuned the computers and even
wrote computer manuals, see Merkies and Boas (1963). Dutch undergraduate
studies in econometrics, which were introduced in the 1960s, soon included compul-
sory courses in computer programming, numerical mathematics and mathematical
programming. To this day, these subjects constitute an important part of undergraduate
econometrics education. “Program it yourself” became the motto for
graduates in econometrics. This was even more true when econometric programming
languages like TSP and GAUSS became widely available. Little effort was made in
documentation and distribution of software outside a small circle of co-workers.
Therefore, most of the old code is no longer used, but the algorithmic ideas have
been documented. Not only the econometricians, but also scientific programmers of
the Econometric Institute, like Adrie Louter, Peter Hop and Gerrit Draisma made
significant contributions to algorithms, see, for example, Louter and Dubbelman
(1973), Van Dijk and Hop (1988), and Draisma and De Haan (1996). Regrettably,
the positions of scientific programmers had to be discontinued. Research assistants
were employed, but the short-term contracts did not allow serious long-term soft-
ware development any longer.
The CPB, formerly the Central Planning Bureau, now the Netherlands Bureau for
Economic Policy Analysis in The Hague, has a long tradition of econometric model
building, right from the days of Jan Tinbergen and Henri Theil. The CPB still
develops its own software, but this is now only for internal use. Henk Don, part-time
professor at the University of Amsterdam and former director of the CPB, distributed
his model simulation software package SIMPC in the beginning of the 1990s
and published some of his algorithms, see e.g. Don (1990). Also at the University
of Amsterdam, Jurgen Doornik developed LogitJD for discrete choice models, see
Doornik (1985), a program which is now integrated in PcGive.
At the University of Tilburg and later at the Vrije Universiteit, Siem Jan Koopman
continued his development of the STAMP software for structural time series anal-
ysis and prediction, see Koopman et al. (2000). Herman Bierens, part-time profes-
sor in Tilburg until 2004, continues to develop EasyReg International at Penn State
University in the USA. EasyReg is a free international software package, primarily
developed for econometrics education. It is built in Visual Basic and implements
more advanced procedures than the other free interactive econometrics package,
GRETL, by Allin Cottrell of Wake Forest University, http://gretl.sourceforge.net.
GRETL is open source (in C), and even more international, with menus in French,
Italian, Spanish, Polish and German as well as English.

3 What makes software econometric software?


As the discussion in the previous section illustrates, many econometric techniques
can now be implemented using existing mathematical and statistical software pack-
ages. This raises the question what presently distinguishes econometric software. A
first requirement is that the package should be affordable. The cost is usually borne
by comparatively poor departments. In practice, this means that the distinguishing
econometric features that we discuss below have to be implemented by small teams
of developers. We discuss four topics in turn: econometric ideas, econometric documentation,
econometric modelling features and econometric extendibility.

3.1 Econometric ideas


As software development continues to reflect the needs of the econometricians
doing the development, it is the development of econometric modelling ideas – prob-
ably inspired by innovations in computer science – that primarily drives the econo-
metric software innovation. This goes beyond the application of new algorithms to
solve sets of differential equations or the introduction of novel optimization tech-
niques from mathematical programming or the introduction of new types of statisti-
cal inference from mathematical statistics. Implementing these purely mathematical
or statistical ideas is clearly relevant, but this still does not make econometric soft-
ware.
Important, influential and long-lasting econometric ideas can be rewarded with
the Nobel prize in economics and nowadays econometric ideas can only become
widespread if useful applications are carefully implemented in software. Nobel prize
winning ideas are our most important illustrations of relevant econometric ideas. A
Nobel prize for mathematics or statistics does not exist, so purely statistical ideas
do not qualify for this honour.
Econometric Nobel prizes were awarded as follows, see http://www.nobelprize.org.
In 1969 Ragnar Frisch and Jan Tinbergen received the prize for having developed
and applied dynamic models for the analysis of economic processes. In 1980 Law-
rence R. Klein was awarded the prize for the creation of econometric models and the
application to the analysis of economic fluctuations and economic policies and Rich-
ard Stone was a laureate in 1984 for having made fundamental contributions to the
development of systems of national accounts and hence greatly improving the basis
for empirical economic analysis. Trygve Haavelmo received the prize in 1989 for his
clarification of the probability theory foundations of econometrics and his analyses of
simultaneous economic structures. Tjalling Koopmans was a laureate in 1975. His
distinguished work in the field of econometric methods was recognized, but he received
the honour for his contributions to the theory of optimum allocation of resources,
which does not fall in the narrow definition of econometrics that we use today.
Stone and Klein were pioneers in managing econometric software development,
which involved procedures for data management, economic modelling, mathematical
solution and statistical estimation and testing, culminating in the likelihood-based
analysis of dynamic simultaneous equation models implemented in, e.g., TROLL,
TSP and PcGive. These methods are not widely available in statistical software.
Four econometricians have already received the Nobel Prize in the new millennium.
The first microeconometricians were awarded the prize only in 2000: James J. Heckman
for his development of theory and methods for analysing selective samples
and Daniel L. McFadden for his development of theory and methods for analysing
discrete choice. Their microeconometric ideas were implemented in the first widely
available microeconometric software, LIMDEP, of William Greene. McFadden
established Berkeley’s Econometrics Laboratory (EML) “dedicated to education and
research in the field of computationally intensive econometrics, utilizing and advancing
state-of-the-art methods, software, and hardware”. As the estimation of many
of the more realistic (mixed) discrete choice models is computationally intensive
indeed, notably because of the simulation-based inference, advanced methods have
only recently become available in easy-to-use software. These econometric ideas are
not confined to economics applications. They are also applied in transportation
science and other social sciences.
In 2003, the most recent econometricians to receive a Nobel Prize were Robert F. Engle,
for methods of analysing economic time series with time-varying volatility (ARCH),
and Clive W. J. Granger, for methods of analysing economic time series with com-
mon trends (cointegration). Eviews (formerly MicroTSP) was the first software to
implement easy-to-use GARCH models. GARCH models have been implemented
in many softwares outside econometrics, notably in statistical time series analysis
for management science and in financial engineering. The more advanced statistical
analysis of cointegration, based on Vector Autoregressions, was first implemented
in PCFIML (now part of PcGive) and soon taken up by RATS and Eviews, which
made the application of these ideas widely available in a short period of time. Basic
versions of GARCH models and Cointegration are available in statistical packages,
but serious empirical applications and up-to-date inference still require specialized
econometric software.
It is not easy to forecast which econometric idea will qualify for a future Nobel
Prize, but we can predict that it will have been implemented in econometric software
before the prize is awarded.

3.2 Econometric documentation


Over the last decades, it has become much easier and less time-consuming to do
complicated econometric computations. The software also makes the production of
empirical reports with mathematical model formulation, tables and graphs of empirical
results faster and less complicated. But this has not made econometric empirical
analysis easier per se. Applied econometrics cannot be taught by an online Wizard;
the main concepts have to be known before the software is useful.
Econometrics textbooks still need to be studied, and the tutorials, user guides,
manuals and online help of econometric software all refer in some way to textbook
terminology. This makes the software easier to understand and use for economists who
are used to the econometric terminology and notation, and more difficult and less
useful for researchers educated in other, related disciplines like political science
and psychology. The interfaces of many computer programs for data input, programming,
text processing, formula and graph editing are becoming more and more similar,
due to the worldwide concentration in operating systems and standardization of
other scientific applications like LaTeX. Yet, the terminology and notation remain
different from discipline to discipline. The Durbin–Watson statistic is understood
in all disciplines with a Statistics 101 course, while the Breusch–Godfrey test and
Hansen’s J-test are specific to econometrics.
The connection between a priori terminology knowledge and econometric software
usability is not problematic when the textbook writing and software development
are led by the same person. William Greene updates his standard textbook,
Greene (2003), and LIMDEP software regularly. TSP and Eviews extensively use
terminology and examples of Pindyck and Rubinfeld (1998) and Greene (2003) in
their documentation. David Hendry and Jurgen Doornik wrote extensive tutorials
to accompany PcGive, which can be considered textbooks, see Hendry and Door-
nik (2001). Recent influential econometric textbooks like Wooldridge (2006) and
Verbeek (2004) present empirical examples using Eviews and Stata. Recent Dutch
econometrics textbooks, which are more time-series oriented, Heij et al. (2004), from
the Econometric Institute in Rotterdam and Vogelvang (2005) from the Vrije Uni-
versiteit in Amsterdam, use Eviews in their explicit applications, following a number
of other textbooks. Although most textbook authors no longer produce their own
software, there is still a strong connection between specific econometric software and
applied econometric methods, both in education and research.
A recent textbook on Bayesian econometric investigation by Geweke (2005),
closely connected with the free BACC software for Bayesian Analysis, Computation,
and Communication by Chen, McCausland and Stevens (2003), probably fills the
econometric documentation gap for Bayesian software. BACC is a library of routines
which can work in GAUSS, MATLAB, S-PLUS and R. Unlike Bayesian software
developed for medical and spatial statistics, such as BUGS, see
http://www.mrc-bsu.cam.ac.uk/bugs, it does not provide a Bayesian User Interface.
Mainstream statistical software like SPSS simply does not fit econometrics edu-
cation and research.

3.3 Econometric modelling features


Another aspect which distinguishes econometric software is the standard availability
of features for the interactive modelling cycle: models are not only easily specified
and estimated, but diagnostic tests, easy respecification, and re-estimation facilities
are provided in order to make the interpretation of parameter estimates and fore-
casts as credible as possible. Today, this requires a graphical (WIMP) interface that
is sufficiently intuitive and easy to learn and remember for new users.
This recursive modelling is especially relevant for the econometric analysis of time
series, where new observations become available in a natural order, with associated
testing possibilities and possible adaptations of existing models. In the context of
dynamic linear regression models PcGive was the first program to cater for the influ-
ential general-to-specific methodology of econometric model selection. A “Progress”
menu in PcGive simplifies the interactive model selection process. Although this
feature per se has not been copied in other packages, a wide range of standard
specification tests and diagnostics for estimated models has now become a crucial
ingredient of every econometric software package.
The model selection process can also be automated. Successful automated model
selection has long been available for pure Box–Jenkins time series modelling for
forecasting in the AUTOBOX software by David Reilly and in the Census X-11-
ARIMA program for seasonal adjustment of the US Census. Automated linear
dynamic model selection for economic analysis, based on a wide range of diagnostic
tests and multiple-path general-to-specific modelling, is available in PcGets, see Hendry
and Krolzig (2001). Yet, automated methods still require a “most general” well-specified
model, for which extensive tests should be available.
Nowadays, stochastic simulation and bootstrap analysis of econometric models
should be available as a matter of course, both for the interpretation of nonlinear
models, and for associated statistical inference. If the inference is simulation based,
one also needs diagnostics on the efficacy and reliability of the associated simulation
methods.
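
The point about diagnostics for simulation-based inference can be made concrete with a small sketch. The Python code below runs a residual bootstrap for the slope of a simple regression and reports, next to the bootstrap standard error, a crude batch-based estimate of how much that standard error itself varies because of the finite number of replications. All function names, the batch diagnostic and the default settings are assumptions chosen for this illustration; they do not reproduce the implementation of any package discussed here.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) of a least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residual_bootstrap_slope(x, y, n_boot=1000, seed=12345):
    """Residual bootstrap: resample residuals, rebuild y, re-estimate the slope."""
    rng = random.Random(seed)  # fixed seed keeps the simulation reproducible
    intercept, slope = ols_fit(x, y)
    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(n_boot):
        y_star = [fi + rng.choice(resid) for fi in fitted]
        slopes.append(ols_fit(x, y_star)[1])
    return slopes


def bootstrap_se_with_mc_error(slopes, n_batches=10):
    """Bootstrap standard error and a rough measure of its simulation noise."""
    se = statistics.stdev(slopes)
    size = len(slopes) // n_batches
    batch_ses = [
        statistics.stdev(slopes[i * size:(i + 1) * size]) for i in range(n_batches)
    ]
    # Spread of the batch estimates, scaled to the full number of replications.
    mc_error = statistics.stdev(batch_ses) / n_batches ** 0.5
    return se, mc_error


if __name__ == "__main__":
    # Hypothetical data: a linear relation plus a small deterministic disturbance.
    x = [float(i) for i in range(20)]
    y = [1.0 + 0.5 * xi + (0.3 if i % 2 else -0.3) for i, xi in enumerate(x)]
    se, mc_error = bootstrap_se_with_mc_error(residual_bootstrap_slope(x, y))
    print(se, mc_error)
```

If `mc_error` is not small relative to `se`, the number of replications is too low for the reported standard error to be trusted, which is the kind of reliability check the text asks for.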

3.4 Econometric extendibility


A large, well documented set of easy-to-use excellent models and methodologies may
suffice for econometrics education and repetitive research, but innovative economet-
ric research requires adaptability and extendibility of the models for specific data
sets and specific economic questions. Evaluation and improvement of existing imple-
mentations for nontrivial models should also be a constant concern, see e.g. the
discussion of numerical precision of econometric packages by McCullough and
Vinod (1999), the evaluation of multivariate GARCH models in different packages
by Brooks, Burke and Persand (2003) and the discussion on this topic in
Renfro (2004b). Improvements and work-arounds for weaknesses in current soft-
ware require adaptability.
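A standard textbook illustration of why numerical precision deserves this attention is the sample variance of data with a large mean. The Python sketch below is not taken from the studies cited above; the data and function names are assumptions for this example. It compares the one-pass "calculator" formula with a two-pass formula that centres the data first.

```python
import math


def variance_one_pass(xs):
    """Textbook formula (sum of squares minus n * mean^2) / (n - 1); numerically fragile."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


def variance_two_pass(xs):
    """Centre on the mean first, then sum the squared deviations."""
    n = len(xs)
    mean = math.fsum(xs) / n
    return math.fsum((x - mean) ** 2 for x in xs) / (n - 1)


if __name__ == "__main__":
    # Deviations from the mean are -6, -3, 3, 6, so the sample variance is 30.
    data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
    print("one pass:", variance_one_pass(data))
    print("two pass:", variance_two_pass(data))
```

In double precision the squares in the one-pass formula are of order 1e18 and lose their low-order digits, so the subtraction cancels almost everything that matters and the result can even be negative. The two-pass version recovers the correct value because the deviations are computed before squaring. A user can only apply such a work-around if the software lets them replace the built-in routine.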
Large models require an extendible modelling language and new models require
an efficient programming language in which to code new algorithms to estimate
and evaluate new model types. The programming language should at least cater
for effective data management, fast matrix operations, robust optimization methods,
state-of-the-art stochastic simulation, and decent graphical and textual output facilities.
It should have a well-defined syntax without too many idiosyncrasies, allowing for
efficient maintenance of the code.
For business use, the econometric programming language should be applicable as
an engine within other software, so that econometric procedures can be called by
and feed results to programs like Excel, Access, or commercial front-office and
back-office applications written in lower-level languages like C or Perl. Ideally, the
econometric software should have an interface to the Structured Query Language (SQL),
a standard language that provides an interface to many relational database systems,
and to specific economic, financial, and energy data management software, like
FAME, http://www.fame.com, or HAVER, http://www.haver.com. Finally, in higher-level
econometric languages, one should be able to integrate existing numerical procedures
from low-level languages. In very computationally intensive simulation-based
methods, one might want to optimize code for parallel computing or for specific
hardware at a low level to increase speed.
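
What an SQL interface buys the econometrician can be sketched in a few lines. The Python example below uses the standard-library `sqlite3` module as a stand-in for a relational database; the table layout, the series identifiers and the function names are hypothetical and chosen only for this illustration. It pulls one series in time order through an SQL query and hands the values to a small numerical routine.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a hypothetical 'observations' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations (series_id TEXT, period TEXT, value REAL)"
    )
    rows = [
        ("gdp", "2004Q3", 121.0),  # deliberately inserted out of order
        ("gdp", "2004Q1", 100.0),
        ("gdp", "2004Q2", 110.0),
        ("cpi", "2004Q1", 99.0),
    ]
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def load_series(conn, series_id):
    """Return (period, value) pairs for one series, sorted by period."""
    cursor = conn.execute(
        "SELECT period, value FROM observations "
        "WHERE series_id = ? ORDER BY period",
        (series_id,),
    )
    return cursor.fetchall()


def growth_rates(values):
    """Period-on-period growth rates of a list of levels."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]


if __name__ == "__main__":
    connection = make_demo_db()
    series = load_series(connection, "gdp")
    print(series)
    print(growth_rates([value for _, value in series]))
    connection.close()
```

The selection and ordering are left to the database, and only the numerical step runs in the econometric language. The same division of labour applies when the data sit in a commercial data management system rather than in SQLite.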

4 Present econometric software developments


New econometric ideas develop into new challenges for econometric software development
and will remain the driving force behind innovations. The storage, management,
and economic analysis of financial transaction data require new software capabilities.
Whereas features for monthly data have been available since the 1960s, facilities
for analyzing daily time series data have only recently been added to econometric
software packages. It will be a long time before standard models for the main dynamic
features of daily data are developed and implemented. For unaggregated transaction
prices there is an even longer way to go. For example, the univariate time series analysis
of realized volatility, see inter alia Andersen et al. (2003), is far from standard
using existing software, let alone its multivariate analysis.
It will always remain difficult to estimate and analyze a nonlinear unbalanced
dynamic panel data model with a combination of (latent) continuous and categorical
variables and random effects, but appropriate data and effective modelling strategies
may become available (and implemented) before very long.
User interfaces will have to be updated. Following Microsoft, Google and GRETL,
users will expect econometric software to deal with labels and numbers in their own
language and application menus will have to be presented in different character sets
as well. The graphical interface will also need reconstruction as the current graphical
Windows interface is replaced in future Microsoft products. The new interface will
help to make better use of the many options that programs have, most of which are
ineffective because they are hard to find in the current, very extensive menu struc-
tures.
The market for econometric software seems to be too small to develop one program
which not only keeps up with all recent scientific developments in econometrics,
to keep advanced knowledgeable customers interested in buying updates, but
also implements the changes in user interface needed to keep
attracting new customers. It does not seem likely that new fully-fledged econometric
software packages with high academic standards are going to be developed. Econometric
software patents are nearly unknown. This is probably also a distinguishing
feature, but note White (2000), who patented the computer implementation of his
reality check for data snooping. Academic returns on high quality, robust, versatile,
and well documented econometric software development are low. Today, as in the
past, academic econometric software development has to be combined with commercial
consulting to make ends meet.
In the 1990s, the increasing popularity of the Internet generated optimism about
cooperation in development of software to make advanced econometric computa-
tions more easily reproducible. Through the establishment of well documented
econometric method archives, the development of common platform-independent
compilers, user interfaces, and even computation and database centres – all within
easy reach through the Internet, the future for easily reproducible advanced econo-
metric academic computing development looked bright, according to Härdle and
Horowitz (2000). Unfortunately, only one of their suggested Method and Data
technology centres has been created. Nevertheless, a web interface, called XploRe
Quantlet client (XQC), has been realized for the statistical software XploRe at the
Statistics department of Humboldt University in Berlin, see http://www.xplore-stat.de.
Online electronic books with advanced econometric and financial time series applications
are provided for educational purposes. The XploRe system still does not seem
to be well known among economists and econometricians outside Germany and Spain.
It is primarily a package for (nonparametric) statistics and quantitative finance.
Some other interesting, open source, platform-independent graphical user inter-
faces for econometric computing have recently been created in Germany. Markus
Krätzig built an interface in Java which runs GAUSS code for multivariate time
series analysis, see Lütkepohl and Krätzig (2004). Merten Joost developed the
Java Application Programming Interface JAPI, see http://www.japi.de, which runs
James Davidson’s Ox code for his package Time Series Modelling, see Davidson
(2005). This interface also involves OxJapi, by Choirat and Seri (2002). None of
the Java-built interfaces mentioned above is used by more than one globally distributed
econometric software package.
Of course, thanks to the search engine Google and free specific Internet aggregators
of economic and econometric research (papers, articles, books, citations, data
and software) like RePEc, http://www.repec.org, it is now easy to find properly documented
econometric source code written for one of the main econometric software
packages on the web. However, it is still difficult to assess the quality of this code if
one does not have access to the corresponding econometric software for which it
was developed. As most of these codes developed for academic research papers are
available free of charge, authors cannot be expected to set up a helpdesk, and one
has to resort to mailing lists, Usenet and http://groups.google.com to get necessary
information, which also may be unreliable. Unsurprisingly, given the background of
most econometricians, robust, high-quality econometric procedures seldom come
for free. An exception is the Census X-12 procedure for seasonal adjustment of the
US Census Bureau, whose development is financed by the U.S. Government. With some lag,
these standard procedures have been integrated in the major econometric software packages,
but the basic version is also available for free.
Thanks to the modular structure of econometric software, it is increasingly becoming
possible to use econometric code outside its original environment. For example,
Laurent and Urbain (2003) provide an interface called M@ximize for Ox,
based on OxGauss, to test GAUSS programs which involve constrained maximum
likelihood estimation; the interface does not require GAUSS or GAUSS packages to
run. This helps the reproducibility required in academic econometrics. On the other
hand, Diethelm Würtz, author of RMetrics, recently provided an interface in R to
the G@RCH package that Laurent and Peters (2005) developed for Ox, but this
still requires the availability of Ox. Cameron Rookley provides a resource, GTOML
(GAUSStoMATLAB), to translate GAUSS code to MATLAB. This requires the
open source language Perl and it does not allow translation of GAUSS constrained
maximum likelihood (CML) code, see http://www.cameronrookley.com. Robert Henson
provides MATLAB R-Link with functions for calling R from within MATLAB,
see Henson (2004). This naturally requires R to work. These are just a few examples.
More links between software packages are being developed. As these transformation
tools and compilers are not really supported by the respective software packages, it is
uncertain whether they are “upgrade resistant”. They are primarily useful for short pilot
projects, not so much for long term software development.
In sum, new econometric ideas and new types of data will require new software.
Basic procedures will have to be developed in order to make these ideas known
and to make computational results reproducible. Sophisticated software will still be
needed for user-friendly education and presentation in economics and finance, both
in academia and business, and for fast, robust and scalable computation in aca-
demic and business applications. These extensions will only be created on demand
and are not likely to be inexpensive. A high quality, unified platform for economet-
ric computing still does not exist and is not likely to arise in the near future. This
may not be desirable anyway, because competition between the few remaining software
providers is beneficial for econometric consumers. Useful links between existing
econometric software packages are being developed, but they are usually experimental. Their
structural maintenance is doubtful.
Internet links to OxMetrics products developed by Jurgen Doornik are available
via http://www.oxmetrics.com. An annotated list of links to software mentioned in
this article, edited by Marius Ooms, is available via the Econometric Links of the
Econometrics Journal of the Royal Economic Society at http://www.econometriclinks.com.
This website started in 1995 as the Econometric Links of the Econometric
Institute at the Erasmus University in Rotterdam. Marius thanks Eelco van Aspe-
ren, who established http://www.cs.eur.nl, one of the first www-servers in the world,
for his software and advice which provided a timely start of our exposure on the
World Wide Web. We apologize to all econometric software developers who are not
mentioned in this article. We acknowledge it is not comprehensive.

References
Andersen, T. G., T. Bollerslev, F. Diebold and P. Labys (2003), Modeling and forecasting
realized volatility, Econometrica 71, 579–625.
Berndt, E. R., B. H. Hall, R. Hall and J. A. Hausman (1974), Estimation and inference in
nonlinear structural models, Annals of Economic and Social Measurement 3, 653–665.
Brooks, C., S. P. Burke and G. Persand (2003), Multivariate GARCH models: software
choice and estimation issues, Journal of Applied Econometrics 18, 725–734.
Carmona, R. A. (2004), Statistical analysis of financial data in S-Plus, Springer-Verlag, New
York, USA.
Chen, W., W. McCausland and J. J. Stevens (2003), User manual for the Windows R
version of BACC (Bayesian Analysis, Computation, and Communication),
http://www2.cirano.qc.ca/~bacc/.
Choirat, C. and R. Seri (2002), OxJapi: an Ox version of Merten Joost’s Java Application
Programming Interface, Department of Economics, University of Insubria, Varese, Italy.
Cribari-Neto, F. and S. Zarkos (1999), R: yet another econometric programming environ-
ment, Journal of Applied Econometrics 14, 319–329.
Cribari-Neto, F. and S. G. Zarkos (2003), Econometric and statistical computing using Ox,
Computational Economics 21, 277–295.
Davidson, J. (2005), TSM time series modelling 4.14, University of Exeter, Exeter, UK,
http://www.timeseriesmodelling.com.
Doan, T. A. (2004), User’s manual RATS, Version 5, Estima, Evanston, IL, USA,
http://www.estima.com.
Don, F. J. H. (1990), Some issues in solving large sparse systems of equations, Journal of Eco-
nomic Dynamics and Control 14, 313–325.
Doornik, J. A. (1985), LOGITJD, een computerprogramma voor het schatten van onafhankelijke
logitmodellen, AE Note N5/85, Interfaculteit der Actuariële Wetenschappen en Econometrie
der Universiteit van Amsterdam, The Netherlands.
Doornik, J. A. (1998), Object-oriented matrix programming using Ox, Timberlake Consultants
Press, London, UK, http://www.oxmetrics.com.
Doornik, J. A. and D. F. Hendry (1999), GiveWin: an interface to empirical modelling, 2nd
edn, Timberlake Consultants Press, London, UK.
Doornik, J. A. and D. F. Hendry (2001), Econometric modelling using PcGive, Vols I–III, Tim-
berlake Consultants Press, London, UK, http://www.pcgive.com.
Draisma, G. and L. De Haan (1996), An estimator for the extreme-value index, Communica-
tions in Statistics – Theory and Methods 25, 685–694.
Eviews (2004), Eviews 5 user’s guide, Quantitative Micro Software, Irvine, CA, USA,
http://www.eviews.com.
Fair, R. C. and W. R. Parke (2003), The Fair–Parke program for the estimation and analysis
of nonlinear econometric models, user’s guide, Yale University, New Haven, CT, USA,
http://fairmodel.econ.yale.edu.
GAUSS (2005), GAUSS 7.0 user’s guide, Aptech Systems Inc./Trafford Publishing, Maple Val-
ley, WA, USA, http://www.aptech.com.
Geweke, J. (2005), Contemporary Bayesian econometrics and statistics, John Wiley & Sons,
New York.
Greene, W. H. (2002), LIMDEP 8.0 econometric modeling guide, Econometric Software Inc.,
New York, NY, USA, http://www.limdep.com.
Greene, W. H. (2003), Econometric analysis, 5th edn, Prentice-Hall, Englewood Cliffs, NJ.
Hall, B. H. and C. Cummins (2005), TSP 5.0 user’s guide, TSP International, Palo Alto, CA,
USA, http://www.tspintl.com.
Härdle, W. and J. Horowitz (2000), Internet-based econometric computing, Journal of Econo-
metrics 95, 333–345.
Heij, C., P. de Boer, P. H. Franses, T. Kloek and H. K. van Dijk (2004), Econometric methods
with applications in business and economics, Oxford University Press, Oxford, UK.
Hendry, D. F. and J. A. Doornik (2000), The impact of computational tools on time-series
econometrics, in: T. Coppock (ed.), Information technology and scholarship applications in
the humanities and social sciences, Oxford University Press/British Academy, Oxford, UK,
257–269.
Hendry, D. F. and J. A. Doornik (2001), Empirical econometric modelling using PcGive, Vol.
I, 3rd edn, Timberlake Consultants Press, London, UK.
Hendry, D. F. and H.-M. Krolzig (2001), Automatic econometric model selection using
PcGets, Timberlake Consultants Press, London, UK.
Henson, R. (2004), MATLAB R-Link, MATLAB Central, Boston, MA, USA,
http://www.mathworks.com/matlabcentral.
Johansen, S. (1995), Likelihood-based inference in cointegrated vector autoregressive models,
Oxford University Press, Oxford, UK.
Koopman, S. J., A. C. Harvey, J. A. Doornik and N. Shephard (2000), STAMP, structural
time series analyser, modeller and predictor, Timberlake Consultants Press, London, UK.
Koopman, S. J., N. Shephard and J. A. Doornik (1999), Statistical algorithms for models
in state space using SsfPack 2.2, The Econometrics Journal 2, 107–160,
http://www.ssfpack.com.
Ladiray, D. and B. Quenneville (2001), Seasonal adjustment with the X-11 method, Springer-
Verlag, New York, NY, USA.
Laurent, S. and J.-P. Peters (2005), G@RCH 4.0, estimating and forecasting ARCH models,
Timberlake Consultants Press, London, UK.
Laurent, S. and J.-P. Urbain (2003), Bridging the gap between Ox and Gauss using OxGauss,
paper presented at the first Oxmetrics user conference, London, Center for Econometrics and
Operations Research (CORE), Louvain-la-Neuve, Belgium.
LeSage, J. P. and R. K. Pace (eds.) (2004), Advances in econometrics, Volume 18: Spatial and
spatiotemporal econometrics, Elsevier, Oxford, UK.
Louter, A. S. and C. Dubbelman (1973), An exact autocorrelation test for small n and k, a
computer program and a table of significance points, Report 1973-04, Econometric Institute,
Rotterdam, The Netherlands.
Lütkepohl, H. and M. Krätzig (2004), The software JMulti, in: H. Lütkepohl and M. Krät-
zig (eds.), Applied time series econometrics, Chapter 8, Cambridge University Press, Cam-
bridge, UK, http://www.jmulti.de.
MATLAB (2004), MATLAB 7.1, The Mathworks Inc., Boston, MA, USA,
http://www.mathworks.com.
McCullough, B. D. and H. D. Vinod (1999), The numerical reliability of econometric soft-
ware, Journal of Economic Literature 37, 633–665.
Merkies, A. H. Q. and J. Boas (1963), The institute’s computer, a guided tour, Report 6325,
Econometric Institute, Rotterdam, The Netherlands.
Pindyck, R. S. and D. L. Rubinfeld (1998), Econometric models and economic forecasts, inter-
national edition (4th edn), McGraw-Hill Inc., New York, USA.
Renfro, C. G. (ed.) (2004a), Computational econometrics: its impact on the development of
quantitative economics, IOS Press, Amsterdam, The Netherlands, http://www.iospress.com.
Renfro, C. G. (2004b), Econometric software: the first fifty years in perspective, Journal of
Economic and Social Measurement 29, 9–107.
SAS (2004), SAS/ETS econometrics and time series 9.1 user’s guide, SAS Publishing, Cary, NC,
USA, http://www.sas.com.
Schoenberg, R. (1997), Constrained maximum likelihood, Computational Economics 10, 251–
266.
Sims, C. A. (1980), Macroeconomics and reality, Econometrica 48, 1–48.
Stata (2005), Stata user’s guide, Stata Press, College Station, TX, USA, http://www.stata.com.
Van Dijk, H. K. and J. P. Hop (1988), User’s guide for the computer programs Sisam and Mixin,
Report 8810/C, Econometric Institute, Erasmus University, Rotterdam, The Netherlands.
Venables, W. and B. Ripley (2002), Modern applied statistics with S, 4th edn, Springer-Verlag,
New York.
Verbeek, M. (2004), A guide to modern econometrics, 2nd edn, John Wiley & Sons, New York.
Vogelvang, B. (2005), Econometrics, theory and applications with EViews, Pearson Education,
Harlow, UK.
Whistler, D. K., K. J. White, S. D. Wong and D. Bates (2004), SHAZAM version 10 user’s
reference manual, Northwest Econometrics, Vancouver, Canada,
http://www.econometrics.com.
White, H. (2000), A reality check for data snooping, Econometrica 68, 1097–1126.
Wooldridge, J. (2006), Introductory econometrics: a modern approach, 3rd edn, South-Western
College Publishing, Mason, OH, USA.
Zivot, E. and J. Wang (2005), Modeling financial time series with S-PLUS, 2nd edn, Springer-
Verlag, New York, USA.

Received: November 2005. Revised: December 2005.

 VVS, 2006
