PDF Probability Theory and Statistical Inference Empirical Modeling With Observational Data 2Nd Edition Spanos A Ebook Full Chapter
https://textbookfull.com/product/probability-theory-and-
statistical-inference-empirical-modeling-with-observational-data-
aris-spanos/
https://textbookfull.com/product/probably-not-future-prediction-
using-probability-and-statistical-inference-lawrence-n-dworsky/
https://textbookfull.com/product/exact-statistical-inference-for-
categorical-data-1st-edition-shan/
https://textbookfull.com/product/applied-statistical-inference-
with-minitab-second-edition-lesik/
Theory of stochastic objects probability stochastic
processes and inference 1st Edition Micheas
https://textbookfull.com/product/theory-of-stochastic-objects-
probability-stochastic-processes-and-inference-1st-edition-
micheas/
https://textbookfull.com/product/theory-of-stochastic-objects-
probability-stochastic-processes-and-inference-first-edition-
micheas/
https://textbookfull.com/product/statistical-independence-in-
probability-analysis-number-theory-kac/
https://textbookfull.com/product/computer-age-statistical-
inference-algorithms-evidence-and-data-science-1st-edition-
bradley-efron/
https://textbookfull.com/product/computer-age-statistical-
inference-algorithms-evidence-and-data-science-1st-edition-
bradley-efron-2/
Probability Theory and Statistical Inference
Aris Spanos
Virginia Tech
(Virginia Polytechnic Institute & State University)
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906
www.cambridge.org
Information on this title: www.cambridge.org/9781107185142
DOI: 10.1017/9781316882825
© Aris Spanos 2019
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 1999
Third printing 2007
Second edition 2019
Printed in the United Kingdom by TJ International Ltd. Padstow Cornwall
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Spanos, Aris, 1952– author.
Title: Probability theory and statistical inference : empirical modelling
with observational data / Aris Spanos (Virginia College of Technology).
Description: Cambridge ; New York, NY : Cambridge University Press, 2019. |
Includes bibliographical references and index.
Identifiers: LCCN 2019008498 (print) | LCCN 2019016182 (ebook) | ISBN 9781107185142 | ISBN
Subjects: LCSH: Probabilities – Textbooks. | Mathematical
statistics – Textbooks.
Classification: LCC QA273 (ebook) | LCC QA273 .S6875 2019 (print) | DDC 519.5–dc23
LC record available at https://lccn.loc.gov/2019008498
ISBN 978-1-107-18514-2 Hardback
ISBN 978-1-316-63637-4 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
To my grandchildren Nicholas, Jason, and Evie,
my daughters Stella, Marina, and Alexia, and my
wife Evie for their unconditional love and support
Contents
References 736
Index 752
Preface to the Second Edition
The original book, published 20 years ago, has been thoroughly revised with two objectives
in mind. First, to make the discussion more compact and coherent by avoiding repetition and
many digressions. Second, to improve the methodological coherence of the proposed empirical
modeling framework by including material pertaining to foundational issues that has
been published by the author over the last 20 years or so in journals on econometrics, statistics,
and philosophy of science. In particular, this revised edition brings out more clearly
several crucial distinctions that elucidate empirical modeling, including (a) the statistical
vs. the substantive information/model, (b) the modeling vs. the inference facet of statistical
analysis, (c) testing within and testing outside the boundary of a statistical model, and (d)
pre-data vs. post-data error probabilities. These distinctions shed light on several foundational
issues and suggest solutions. In addition, the comprehensiveness of the book has been
improved by adding Chapter 14 on the linear regression and related models.
The current debates on the “replication crises” render the methodological framework
articulated in this book especially relevant for today’s practitioner. A closer look at the
debates (Mayo, 2018) reveals that the non-replicability of empirical evidence problem is,
first and foremost, a problem of untrustworthy evidence routinely published in prestigious
journals. The current focus of that literature on the abuse of significance testing is rather
misplaced, because it is only a part of a much broader problem relating to the mechanical
application of statistical methods without a real understanding of their assumptions,
limitations, proper implementation, and interpretation of their results. The abuse and
misinterpretation of the p-value is just symptomatic of the same uninformed implementation
that contributes majorly to the problem of untrustworthy evidence. Indeed, the same
uninformed implementation often ensures that untrustworthy evidence is routinely replicated,
when the same mistakes are repeated by equally uninformed practitioners! In contrast to the
current conventional wisdom, it is argued that a major contributor to the untrustworthy
evidence problem is statistical misspecification: invalid probabilistic assumptions imposed on
one’s data, another symptom of the same uninformed implementation. The primary objective
of this book is to provide the necessary probabilistic foundation and the overarching
modeling framework for an informed and thoughtful application of statistical methods, as
well as the proper interpretation of their inferential results. The emphasis is placed less on
the mechanics of the application of statistical methods, and more on understanding their
assumptions, limitations, and proper interpretation.
NOTE: All sections marked with an asterisk (∗) can be skipped at first reading without any
serious interruption in the flow of the discussion.
Acknowledgments
More than any other person, Deborah G. Mayo, my colleague and collaborator on many
foundational issues in statistical inference, has helped to shape my views on several methodological
issues addressed in this book; for that and the constant encouragement, I’m most
grateful to her. I’m also thankful to Clark Glymour, the other philosopher of science with
whom I had numerous elucidating and creative discussions on many philosophical issues
discoursed in the book. Thanks are also due to Sir David Cox for many discussions that
helped me appreciate the different perspectives on frequentist inference. Special thanks are
also due to my longtime collaborator, Anya McGuirk, who contributed majorly in puzzling
out several thorny issues discussed in this book. I owe a special thanks to Julio Lopez for
his insightful comments, as well as his unwavering faith in the coherence and value of the
proposed approach to empirical modeling. I’m also thankful to Jesse Bledsoe for helpful
comments on chapter 13 and Mariusz Kamienski for invaluable help on the front cover
design.
I owe special thanks to several of my former and current students over the last 20 years,
who helped to improve the discussion in this book by commenting on earlier drafts and
finding mistakes and typos. They include Elena Andreou, Andros Kourtellos, Carlos Elias,
Maria Heracleous, Jason Bergtold, Ebere Akobundu, Andreas Koutris, Alfredo Romero,
Niraj Pouydal, Michael Michaelides, Karo Solat, and Mohammad Banasaz.
Symbols
N – set of natural numbers N:={1, 2, ..., n, ...}
R – the set of real numbers; the real line (−∞, ∞)
Rⁿ := R × R × ··· × R (n times)
R+ – the set of positive real numbers; the half real line (0, ∞)
f(x; θ) – density function of X with parameters θ
F(x; θ) – cumulative distribution function of X with parameters θ
N(μ, σ²) – Normal distribution with mean μ and variance σ²
E – Random Experiment (RE)
S – outcomes set (sample space)
ℑ – event space (a σ-field)
P(.) – probability set function
σ(X) – minimal sigma-field generated by X
Acronyms
AR(p) – Autoregressive model with p lags
CAN – Consistent, Asymptotically Normal
cdf – cumulative distribution function
CLT – Central Limit Theorem
ecdf – empirical cumulative distribution function
GM – Generating Mechanism
IID – Independent and Identically Distributed
LS – Least-Squares
ML – Maximum Likelihood
M-S – Mis-Specification
N-P – Neyman-Pearson
PMM – Parametric Method of Moments
SLLN – Strong Law of Large Numbers
WLLN – Weak Law of Large Numbers
UMP – Uniformly Most Powerful
1 An Introduction to Empirical Modeling
1.1 Introduction
Empirical modeling, broadly speaking, refers to the process, methods, and strategies
grounded on statistical modeling and inference whose primary aim is to give rise to “learning
from data” about stochastic observable phenomena, using statistical models. Real-world
phenomena of interest are said to be “stochastic,” and thus amenable to statistical modeling,
when the data they give rise to exhibit chance regularity patterns, irrespective of whether
they arise from passive observation or active experimentation. In this sense, empirical
modeling has three crucial features:
the statistical systematic information in the data. Section 1.6 discusses briefly the connection
between a statistical model and the substantive information of interest.
At first sight these two attributes might appear to be contradictory, since “chance” is often
understood as the absence of order and “regularity” denotes the presence of order. However,
there is no contradiction because the “disorder” exists at the level of individual outcomes
and the order at the aggregate level. The two attributes should be viewed as inseparable for
the notion of chance regularity to make sense.
Example 1.1 To get some idea about “chance regularity” patterns, consider the data given
in Table 1.1.
3 10 11 5 6 7 10 8 5 11 2 9 9 6 8 4 7 6 5 12
7 8 5 4 6 11 7 10 5 8 7 5 9 8 10 2 7 3 8 10
11 8 9 5 7 3 4 9 10 4 7 4 6 9 7 6 12 8 11 9
10 3 6 9 7 5 8 6 2 9 6 4 7 8 10 5 8 7 9 6
5 7 7 6 12 9 10 4 8 6 5 4 7 8 6 7 11 7 8 3
A glance at Table 1.1 suggests that the observed data constitute integers between 2 and 12,
but no real patterns are apparent, at least at first sight. To bring out any chance regularity
patterns we use a graph as shown in Figure 1.1, t-plot: {(t, x_t), t = 1, 2, ..., n}.
The first distinction to be drawn is that between chance regularity patterns and deterministic
regularities that are easy to detect.
Deterministic regularity. When a t-plot exhibits a clear pattern which would enable one
to predict (guess) the value of the next observation exactly, the data are said to exhibit
deterministic regularity. The easiest way to think about deterministic regularity is to visualize
[Figure: t-plot of x against Index (1 to 100)]
[Figure: t-plot of x against Index (1 to 100), with values between −1.5 and 1.5]
[Figure: histogram of the data]
(%). Each bar of the histogram represents the frequency of each of the integers 2–12.
For example, since the value 3 occurs five times in this data set, its relative frequency is
RF(3)=5/100 = .05. The relative frequency of the value 7 is RF(7)=17/100 = .17, which is
the highest among the values 2–12. For reasons that will become apparent shortly, we name
this discernible distribution regularity.
[1] Distribution: After a large enough number of trials, the relative frequency of the
outcomes forms a seemingly stable distribution shape.
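The relative frequencies quoted above can be recomputed directly from the 100 observations in Table 1.1. The following is a minimal Python sketch; the names TABLE_1_1 and relative_frequencies are illustrative assumptions and do not come from the text.

```python
from collections import Counter

# The 100 observations of Table 1.1, read row by row.
TABLE_1_1 = [
    3, 10, 11, 5, 6, 7, 10, 8, 5, 11, 2, 9, 9, 6, 8, 4, 7, 6, 5, 12,
    7, 8, 5, 4, 6, 11, 7, 10, 5, 8, 7, 5, 9, 8, 10, 2, 7, 3, 8, 10,
    11, 8, 9, 5, 7, 3, 4, 9, 10, 4, 7, 4, 6, 9, 7, 6, 12, 8, 11, 9,
    10, 3, 6, 9, 7, 5, 8, 6, 2, 9, 6, 4, 7, 8, 10, 5, 8, 7, 9, 6,
    5, 7, 7, 6, 12, 9, 10, 4, 8, 6, 5, 4, 7, 8, 6, 7, 11, 7, 8, 3,
]


def relative_frequencies(data):
    """Return {value: count / n} for each distinct value in data (hypothetical helper)."""
    n = len(data)
    counts = Counter(data)
    return {value: counts[value] / n for value in sorted(counts)}


rf = relative_frequencies(TABLE_1_1)
for value, freq in rf.items():
    # Crude text histogram: one '#' per observation (n = 100, so one per 1%).
    print(f"{value:>2}  {freq:.2f}  {'#' * round(freq * len(TABLE_1_1))}")
```

Reading the printed column of bars from top to bottom gives a rough text version of the histogram described above, with RF(3) = .05 and the tallest bar at the value 7.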
Thought experiment 2. In Figure 1.1, one would hide the observations beyond a certain
value of the index, say t = 40, and try to guess the next outcome on the basis of the observations
up to t = 40. Repeat this along the x-axis for different index values and if it turns out
that it is more or less impossible to use the previous observations to narrow down the potential
outcomes, conclude that there is no dependence pattern that would enable the modeler
to guess the next observation (within narrow bounds) with any certainty. In this experiment
one needs to exclude the extreme values of 2 and 12, because following these values one
is almost certain to get a value greater and smaller, respectively. This type of predictability
is related to the distribution regularity mentioned above. For reference purposes we name
the chance regularity associated with the unpredictability of the next observation given the
previous observations:
[2] Independence: In a sequence of trials, the outcome of any one trial does not
influence and is not influenced by the outcome of any other.
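One rough numerical counterpart to Thought experiment 2 is to ask whether the current observation helps to narrow down the next one. The sketch below computes the lag-1 sample autocorrelation of the Table 1.1 data. This statistic is an assumption of the sketch (the text relies on visual inspection of the t-plot only), and the names TABLE_1_1 and lag1_autocorrelation are hypothetical.

```python
# The 100 observations of Table 1.1, read row by row.
TABLE_1_1 = [
    3, 10, 11, 5, 6, 7, 10, 8, 5, 11, 2, 9, 9, 6, 8, 4, 7, 6, 5, 12,
    7, 8, 5, 4, 6, 11, 7, 10, 5, 8, 7, 5, 9, 8, 10, 2, 7, 3, 8, 10,
    11, 8, 9, 5, 7, 3, 4, 9, 10, 4, 7, 4, 6, 9, 7, 6, 12, 8, 11, 9,
    10, 3, 6, 9, 7, 5, 8, 6, 2, 9, 6, 4, 7, 8, 10, 5, 8, 7, 9, 6,
    5, 7, 7, 6, 12, 9, 10, 4, 8, 6, 5, 4, 7, 8, 6, 7, 11, 7, 8, 3,
]


def lag1_autocorrelation(data):
    """Sample correlation between x_t and x_{t+1} (hypothetical helper)."""
    n = len(data)
    mean = sum(data) / n
    # Sum of cross-products of deviations one step apart.
    num = sum((data[t] - mean) * (data[t + 1] - mean) for t in range(n - 1))
    # Total sum of squared deviations from the overall mean.
    den = sum((x - mean) ** 2 for x in data)
    return num / den


r1 = lag1_autocorrelation(TABLE_1_1)
print(f"lag-1 sample autocorrelation: {r1:.3f}")
```

A value close to zero is consistent with the absence of first-order linear dependence; it does not rule out other forms of dependence, so it complements rather than replaces the visual check described above.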
Thought experiment 3. In Figure 1.1 take a wide enough frame (to cover the spread of the
fluctuations) that is also long enough (roughly less than half the length of the horizontal axis)
and let it slide from left to right along the horizontal axis, looking at the picture inside the
frame as it slides along. In cases where the picture does not change significantly, the data
exhibit the chance regularity we call homogeneity, otherwise heterogeneity is present; see
by one into the arms of gendarmes below. The palaces along the
Riva were a broad ribbon of color with a binding of black coats and
hats. The wall of San Giorgio fronting the barracks was fringed with
the yellow legs and edged with the white fatigue caps of two
regiments. Even over the roofs and tower of the church itself specks
of sight-seers were spattered here and there, as if the joyous wind in
some mad frolic had caught them up in very glee, and as suddenly
showered them on cornice, sill, and dome.
Beyond all this, away out on the lagoon, toward the islands, the red-
sailed fishing-boats hurried in for the finish, their canvas aflame
against the deepening blue. Over all the sunlight danced and blazed
and shimmered, gilding and bronzing the roof-jewels of San Marco,
flashing from oar blade, brass, and ferro, silvering the pigeons
whirling deliriously in the intoxicating air, making glad and gay and
happy every soul who breathed the breath of this joyous Venetian
day.
None of all this was lost upon the Professor. He stood in the bow
drinking in the scene, sweeping his glass round like a weather-vane,
straining his eyes up the Giudecca to catch the first glimpse of the
coming boats, picking out faces under flaunting parasols, and waving
aloft his yellow rag when some gondola swept by flying Pietro’s
colors, or some boat-load of friends saluted in passing.
Suddenly there came down on the shifting wind, from far up the
Giudecca, a sound like the distant baying of a pack of hounds, and
as suddenly died away. Then the roar of a thousand throats, caught
up by a thousand more about us, broke on the air, as a boatman,
perched on a masthead, waved his hat.
“Here they come! Viva Pietro! Viva Pasquale!—Castellani!—Nicoletti!
—Pietro!”
The dense mass rose and fell in undulations, like a great carpet
being shaken, its colors tossing in the sunlight. Between the thicket
of ferros, away down the silver ribbon, my eye caught two little
specks of yellow capping two white figures. Behind these, almost in
line, were two similar dots of blue; farther away other dots, hardly
distinguishable, on the horizon line.
The gale became a tempest—the roar was deafening; women waved
their shawls in the air; men, swinging their hats, shouted themselves
hoarse. The yellow specks developed into handkerchiefs bound to
the heads of Pietro and his brother Marco; the blues were those of
Pasquale and his mate.
Then, as we strain our eyes, the two tails of the sea-monster twist
and clash together, closing in upon the string of rowers as they
disappear in the dip behind San Giorgio, only to reappear in full
sight, Pietro half a length ahead, straining every sinew, his superb
arms swinging like a flail, his lithe body swaying in splendid,
springing curves, the water rushing from his oar blade, his brother
bending aft in perfect rhythm.
“Pietro! Pietro!” came the cry, shrill and clear, drowning all other
sounds, and a great field of yellow burst into flower all over the
lagoon, from San Giorgio to the Garden. The people went wild. If
before there had been only a tempest, now there was a cyclone. The
waves of blue and yellow surged alternately above the heads of the
throng as Pasquale or Pietro gained or lost a foot. The Professor
grew red and pale by turns, his voice broken to a whisper with
continued cheering, the yellow rag streaming above his head, all the
blood of his ancestors blazing in his face.
The contesting boats surged closer. You could now see the rise and
fall of Pietro’s superb chest, the steel-like grip of his hands, and
could outline the curves of his thighs and back. The ends of the
yellow handkerchief, bound close about his head, were flying in the
wind. His stroke was long and sweeping, his full weight on the oar;
Pasquale’s stroke was short and quick, like the thrust of a spur.
Now they are abreast. Pietro’s eyes are blazing—Pasquale’s teeth
are set. Both crews are doing their utmost. The yells are demoniac.
Even the women are beside themselves with excitement.
Suddenly, when within five hundred yards of the goal, Pasquale
turns his head to his mate; there is an answering cry, and then, as if
some unseen power had lent its strength, Pasquale’s boat shoots
half a length ahead, slackens, falls back, gains again, now an inch,
now a foot, now clear of Pietro’s bow, and on, on, lashing the water,
surging forward, springing with every gain, cheered by a thousand
throats, past the red tower of San Giorgio, past the channel of spiles
off the Garden, past the red buoy near the great warship,—one
quick, sustained, blistering stroke,—until the judge’s flag drops from
his hand, and the great race is won.
“A true knight, a gentleman every inch of him,” called out the
Professor, forgetting that he had staked all his soldi on Pietro. “Fairly
won, Pasquale.”
In the whirl of the victory, I had forgotten Pietro, my gondolier of the
morning. The poor fellow was sitting in the bow of his boat, his head
in his hands, wiping his forehead and throat, the tears streaming
down his cheeks. His brother sat beside him. In the gladness and
disappointment of the hour, no one of the crowd around him seemed
to think of the hero of five minutes before. Not so Giorgio, who was
beside himself with grief over Pietro’s defeat, and who had not taken
his eyes from his face. In an instant more he sprang forward, calling
out, “No! no! Brava Pietro!” Espero joining in as if with a common
impulse, and both forcing their gondolas close to Pietro’s.
A moment more and Giorgio was over the rail of Pietro’s boat,
patting his back, stroking his head, comforting him as you would
think only a woman could—but then you do not know Giorgio. Pietro
lifted up his face and looked into Giorgio’s eyes with an expression
so woe-begone, and full of such intense suffering, that Giorgio
instinctively flung his arm around the great, splendid fellow’s neck.
Then came a few broken words, a tender caressing stroke of
Giorgio’s hand, a drawing of Pietro’s head down on his breast as if it
had been a girl’s, and then, still comforting him—telling him over and
over again how superbly he had rowed, how the next time he would
win, how he had made a grand second—
Giorgio bent his head—and kissed him.
When Pietro, a moment later, pulled himself together and stood erect
in his boat, with eyes still wet, the look on his face was as firm and
determined as ever.
Nobody laughed. It did not shock the crowd; nobody thought Giorgio
unmanly or foolish, or Pietro silly or effeminate. The infernal Anglo-
Saxon custom of always wearing a mask of reserve, if your heart
breaks, has never reached these people.
As for the Professor, who looked on quietly, I think—yes, I am quite
sure—that a little jewel of a tear squeezed itself up through his
punctilious, precise, ever exact and courteous body, and glistened
long enough on his eyelids to wet their lashes. Then the bright sun
and the joyous wind caught it away. Dear old relic of a by-gone time!
How gentle a heart beats under your well-brushed, threadbare coat!
SOME VENETIAN CAFFÈS
EVERY one in Venice has his own particular caffè, according
to his own particular needs, sympathies, or tastes. All the
artists, architects, and musicians meet at Florian’s; all the
Venetians go to the Quadri; the Germans and late
Austrians, to the Bauer-Grünwald; the stay-over-nights, to the
Oriental on the Riva; the stevedores, to the Veneta Marina below the
Arsenal; and my dear friend Luigi and his fellow-tramps, to a little
hole in the wall on the Via Garibaldi.