Understanding Bayesian History

BY RICHARD CARRIER / ON OCTOBER 9, 2012 / 46 COMMENTS

So far I know of only two critiques of my argument in Proving History that actually
exhibit signs of having read the book (all other critiques can be rebutted with three words:
read the book; although in all honesty, even the two critiques that engage the book can be
refuted with five words: read the book more carefully).

As to the first of those two, I have already shown why the criticisms of James McGrath
are off the mark (in McGrath on Proving History), but they at least engage with some
of the content of my book and are thus helpful to address. I was then directed to a series
of posts at Irreducible Complexity, a blog written by an atheist and evolutionary
scientist named Ian who specializes in applying mathematical analyses to evolution, but
who also has a background and avid interest in New Testament studies.

Ian’s critiques have been summarized and critiqued in turn by MalcolmS in comments on my reply to McGrath, an
effort I appreciate greatly. I have added my own observations to those in that same thread. All of that is a bit clunky and
out of order, however, so I will here replicate it all in a more linear way. (If anyone knows of any other critiques of Proving
History besides these two, which actually engage the content of the book, please post links in comments here. But only
articles and blog posts. I haven’t time to wade through remarks buried in comment threads; although you are welcome to
pose questions here, which may be inspired by comments elsewhere.)

Ian’s posts (there are now two, A Mathematical Review of “Proving History” by Richard Carrier and An
Introduction to Probability Theory and Why Bayes’s Theorem is Unhelpful in History; he has promised a third) are
useful at least in covering a lot of the underlying basics of probability theory, although in terms that might lose a humanities
major. But when he gets to discussing the argument of my book, he ignores key sections of Proving History where I
actually already refute his arguments (since they aren’t original; I was already well aware of these kinds of arguments and
addressed them in the book).

When Ian isn’t ignoring the refutations of his own arguments in the very book he’s critiquing, he is ignoring how applications
of Bayes’ Theorem in the humanities must necessarily differ from applications in science (again for reasons I explain in the
book), or he is being pointlessly pedantic and ignoring the fact that humanities majors need a more colloquial instruction and
much simpler techniques than, for instance, a mathematical evolutionist employs.

To illustrate these points I will reproduce in bold the observations of MalcolmS on what Ian argues (observations which also do a fine job of summarizing Ian’s substantive points; my thanks to him for all of it), and then follow with my own remarks, which I have also expanded upon here (saying a bit more than I did in the original comments).


1. Your form of Bayes’s theorem is “confusing and unnecessarily complex.” His preferred form of BT
is P(H|E) = P(E|H)P(H)/P(E).

He has 2 objections to your form of the formula: (1) The denominator has been expanded, which he
feels is unnecessary. [I pointed out to him that most textbooks actually state BT with the expanded
denominator and that most applications of the theorem that one encounters use it in that form as well,
but he replied that in his own field (AI) he only needs P(E), so maybe this is just his personal
preference from his own experience. Moreover, you give both forms of BT in the appendix.]

And (2) adding the background [i.e., the term b] explicitly is “highly idiosyncratic,” “condescending,”
and “irksome,” reminiscent of William Lane Craig. [I agree that it is unusual and unnecessary, but it
is not wrong, as even he acknowledges. Moreover, it is easier to transition to the form of the equation
that you actually are using, whereby some of the evidence is incorporated into the background, but
you never make this explicit.]

2. He also criticizes you for failing to explain the derivation of BT or discussing the definition of
conditional probability generally: “While Carrier devotes a lot of ink to describing the terms of his
long-form BT, he nowhere attempts to describe what Bayes’s Theorem is doing. Why are we dividing
probabilities? What does his long denominator represent?” Consequently BT becomes “a kind of
magic black box.”

He then states cryptically: “In this Carrier allows himself to sidestep the question whether these
necessarily true conclusions are meaningful in a particular domain. A discussion both awkward for
his approach, and one surely that would have been more conspicuously missing if he’d have described
why BT is the way it is.”

[I’m not sure what point Ian is making here, but I think he is alluding to the difficulty of calculating
P(E), which he discusses in his 2nd post. As I’ll point out later, his criticism is based on a
misunderstanding of how you are applying BT.]

MalcolmS finds these two objections trivial. I think they are outright pedantic and betray an abandonment of
pedagogical goals, and therefore are not really worthwhile criticism. For example, a humanities major is not going to
understand what P(E) is or how it derives from the sum of all probabilities; they are also going to have a much easier time
estimating P(E|H) and P(E|~H), because estimating those probabilities is something they have already been doing their
whole lives; they just didn’t know that that’s what they were doing. Much of my book is about pointing this out.

This is also why I keep the term b [for background knowledge] in all equations (as I even explain in an endnote, which
clearly Ian did not read: see note 10, p. 301): so that laymen won’t lose sight of its role at every stage. Mathematicians like
Ian don’t need it there. But historians are not mathematicians like Ian. This is also why I don’t waste the reader’s time by
explaining how Bayes’ Theorem (or BT) was derived or proved; instead, I refer interested readers to the literature that
does that (e.g., note 9, pp. 300-01). That’s how progress works: you don’t repeat, but build on existing work. I don’t have
to prove BT or explain how it was derived; that’s already been done. I just have to reference that work and then show how
BT can be applied to my field (history).
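The two forms at issue can also be checked against each other numerically. The following minimal sketch (with made-up probabilities, purely for illustration) shows that expanding the denominator changes nothing but the bookkeeping:

```python
# Bayes' Theorem in its short and expanded-denominator forms.
# All probabilities here are illustrative, not taken from any real case.
p_h = 0.3            # prior, P(H)
p_e_given_h = 0.8    # likelihood, P(E|H)
p_e_given_nh = 0.2   # P(E|~H)

# Expanded denominator: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * p_h + p_e_given_nh * (1 - p_h)

posterior_short = p_e_given_h * p_h / p_e
posterior_expanded = (p_e_given_h * p_h /
                      (p_e_given_h * p_h + p_e_given_nh * (1 - p_h)))

assert abs(posterior_short - posterior_expanded) < 1e-12
print(round(posterior_short, 3))  # 0.632
```

The practical difference is only which quantities you have to estimate: the short form demands P(E) directly, while the expanded form lets you build it from P(E|H) and P(E|~H), which (as noted above) is what people in the humanities are already implicitly estimating.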


3. His next criticism is one that I partially share: “Carrier correctly states that he is allowed to divide
content between evidence and background knowledge any way he chooses, provided he is consistent.
But then fails to do so throughout the book.” He cites as an example, p. 51, where the prior is defined
to explicitly include evidence in it. [The prior should be the probability that the hypothesis is true
before any consideration of the evidence.] He continues with this quote from your book [which I also
find objectionable]: “For example, if someone claims they were struck by lightning five times … the
prior probability they are telling the truth is not the probability of being struck by lightning five
times, but the probability that someone in general who claims such a thing would be telling the
truth.”

This is his response: “This is not wrong, per se, but highly bizarre. One can certainly bundle the claim
and the event like that, but if you do so Bayes’s Theorem cannot be used to calculate the probability
that the claim is true based on that evidence. The quote is valid, but highly misleading in a book
which is seeking to examine the historicity of documentary claims.”

Ian’s point is that BT is defined as P(H|E) = P(E|H)P(H)/P(E), with the prior by definition being P(H),
i.e., without any conditioning on the evidence. In the example of the claim of being struck by
lightning 5 times, the hypothesis H would normally be “someone was struck by lightning 5 times” and
the evidence E would be “he claims to have been struck by lightning 5 times.” Then the prior would
indeed be the probability of being struck by lightning 5 times. You instead have as your prior the
conditional probability that someone who claims such a thing in general is telling the truth,
which would be (part of) the evidence for the claim.

[Effectively what you are doing is treating the evidence E (“someone claims to have been struck by
lightning 5 times”) as if it were an intersection of E with a larger set, F (“someone claims such a
thing”), that is more general, and then absorbing F into B. The form of BT that you are using is then:
P(H|EB) = P(E|HFB)P(H|FB)/P(E|FB).

Now, this trick may be useful for actually calculating P(H|EB), since then you avoid having to
calculate P(H|B) or P(E|B), but you haven’t been entirely upfront with the reader about what you are
doing.] Moreover, as he points out in the quote above, you still are left with P(H|FB), which is very
similar to P(H|EB), so you haven’t really used BT to solve the problem.

[In the actual applications in your book, what you generally do is use BT in this way to reduce the
problem to conditional probabilities with more general evidence, and then use an empirical frequency
to estimate them. This avoids the problem that he raises about having to estimate, say, the probability
of the NT existing.]

Indeed. And this likewise ignores the fact that historians need to do different things than scientists. Thus the way I
demarcate b from e is what is most useful to historians–and again, I even explain this explicitly in an endnote (note 10, p.
301) and discuss it several other times in the book (see “background knowledge vs. evidence” in the index, p. 333). Ian is
simply ignoring all of that, and thus not responding to what my book actually argues.

Historians are testing two competing hypotheses: that a claim is true vs. that it is fabricated (or in error, etc.). But to a
historian that means the actual hypotheses being tested are “the event happened” vs. “a mistake/fabrication happened,”
which gives us the causal model “the claim exists because the event happened” vs. “the claim exists because a
mistake/fabrication happened.” In this model, b contains the background evidence relating to context (who is making this
claim, where, to what end, what kind of claim it is, etc.). That context gives us a reference class, and the reference class
gives us a ratio of how often such claims typically turn out to be true vs. fabricated (etc.), which historians can estimate
better than anyone, because they’ve been dealing with this kind of data for years. We can then introduce additional
indicators that distinguish this claim from those others, to update our priors. And we can do that anywhere in the chain of
indicators. So you can start with a really general reference class, or a really narrow one; which you should prefer depends
on the best data you have for building a prior, which historians rarely have any control over, so they need more flexibility
in deciding that (I discuss this extensively in chapter 6, pp. 229-56).

You could, if you wanted, build out the whole Bayesian chain (e.g. see endnote 11, page 301), all the way from raw data,
but why should historians trouble themselves with that? They already have context-determined estimates of the global
reliability of statements based on their experience. If they get into an argument over conflicting estimates there, then they can
dig into the underlying assumptions and build out the whole Bayesian case from raw data, or at least from further down the
chain of underlying assumptions. But it’s a massively inefficient waste of their time to ask them to do that all the time, or
even a lot of the time.

Ultimately, all Bayesian arguments start in the middle somewhere. If they didn’t, they’d all have priors of 0.5 (or whatever
evenly distributes probability across all possible hypotheses). Ian might prefer to start somewhere past the assembly of raw
sensory data and toward the frequency-sets of basic event-occurrences (so maybe he would try to answer the question
“Did Winston Churchill exist?” by starting with questions like “What is the physical probability that a man named Winston
Churchill would be born in pre-WWII England?,” which would be a ridiculous way to proceed). But even that is doing
what I am doing (he, too, is skipping a step: in this case, how we know the frequency data about names is correct, given a
certain body of sensations Ian experiences, and so on). Historians usually skip all the science steps. Because they’re doing
history. Not science (in the narrow sense; I discuss the relation between history and science on pp. 45-49). But one can
always go back in and check those steps. If you had to for some reason.

In short, historians need to be more flexible in how they model questions in Bayes’ Theorem. Ian’s pedantry wouldn’t help
them at all. Because it really doesn’t matter how you build the model–as long as you can articulate what you are doing, and
it’s correct (as I explain in chapter six, especially). Because then it can be vetted and critiqued. Which is all we want. And
that is all my book arms the historian to do. And that’s all she needs in order to get started.

Indeed, having these conversations (about what models to use and what frequencies fall out of them, and thus how to
define h and demarcate e from b in any given case) is precisely what historians need to be doing. My book gives them the
starting point for doing that. Because otherwise it won’t be the same for every question, because the data-availability differs
for each case and thus historians have to demarcate differently in different cases. Scientists don’t face this problem, because
they always have old data (which entails a prior) and new data (which gives likelihoods) and then only address problems
that have tons of precise data to work from. Historians can almost never do any of those things. They have to adapt their
application of Bayesian reasoning to the conditions they are actually in. Proving History explains why. A lot. Ian,
apparently, just ignores that.

MalcolmS also observes how pedantic and insubstantial these criticisms are…


So far all of his criticisms have been stylistic (either about how equations were expressed or how they
were explained), rather than truly mathematical. The rest of his post is no different.

4. He takes you to task for using Bayes’s formula as a synonym for Bayesian reasoning. In particular,
he ridicules this quote from your book: “any historical reasoning that cannot be validly described by
Bayes’s Theorem is itself invalid.” His objection seems to be that, while BT is of course true, there are
other equations that one could derive from the definition of conditional probability that couldn’t be
derived from BT.

[Actually, one could derive the formula for conditional probability from BT if one had some sort of
definition of conditional probability that implied P(AB|B) = P(A|B) and P(A|AB) = 1: If one assumes
BT (i.e., P(H|E) = P(E|H)P(H)/P(E)) then P(H|E) = P(HE|E) = P(E|EH)P(HE)/P(E) = P(HE)/P(E). But
this is beside the point, since you only limited your claim to “historical reasoning,” a term which you
unfortunately didn’t define.]

He further states your attempt to prove this “laughable” assertion is not credible, but gives no other
reason than what I just stated above.

Which is an example of a non-critique critique: saying something is wrong, but giving no reason why, nor even interacting
with the argument you are gainsaying at all. Ian’s overall claim is that Bayes’ Theorem can’t be used to reach historical
conclusions because the probabilities are all unknown. But if that’s true for BT, it’s true for all probabilistic reasoning about
history, which means all reasoning about history whatever.

I demonstrate (with a formal deductive syllogism even: pp. 106-14; supporting the informal arguments on pp. 62-65 and
81-93) that all historical arguments are fundamentally Bayesian (whether historians realize this or not), so if Ian were correct
that no conclusions about history can be reached by Bayesian reasoning, then he is saying no conclusions about history can
be reached. Period. Such radical skepticism about history I have refuted before (in Rosenberg on History, where I also
show how, if that were true, science is also impossible, as it depends on historical facts, i.e. data and reports about things
that happened or were observed, so if you can’t do history, you can’t do science).

That Ian totally ignores this, and doesn’t address my syllogistic argument at all, makes his critique here useless. Indeed, that
is the point of my formalizing an argument: so critics will be able to identify any errors that invalidate the conclusion. If he’s
not even going to do that, then he isn’t taking the book’s argument seriously. And so neither should we take his critique
seriously.

One wonders what method he thinks would replace Bayes’ Theorem, that historians can use. Since all historical arguments
consist of deriving conclusions from statements of probability, is there any logically valid way for them to derive those
conclusions other than Bayes’ Theorem? (Or anything that reduces to it? See pp. 92-93; and again, pp. 106-14.) If you
want to be a useful critic, you have to answer that question. I suspect any sincere effort to do so will result in realizing the
answer is no.


5. He then goes on to discuss your “cheeky” proposal to unify frequentist and Bayesian interpretations
of probability. His criticisms here are that your proposal is “unnuanced” and presented as if it were
original, when it is not. (Not that he is accusing you of taking credit for others’ ideas but rather of
being possibly unaware of previous work in the field.) He also states that this “hubris” is typical of “a
tone of arrogance and condescension that I consistently perceived throughout the book.”

Which is just ad hominem. I’m quite sure I don’t know all the arguments published on the debate between Frequentists
and Bayesians (it must be in the thousands, counting books and articles), as I’m sure neither does Ian. Or any living person
probably. But certainly, if anyone has articulated the same conclusion as mine before, I’d love to accumulate those
references (it seems Ian claims they exist, but then fails to adduce a single example). So by all means, if anyone knows, post
them in comments here.

That is neither here nor there. The real issue is whether my resolution of that debate is correct. Whether Ian dislikes my
tone or thinks it’s arrogant or condescending is not a valid rebuttal to whether it is correct. I also don’t think he’s making
an objective assessment, since I am responding to the debate as framed in recent literature by leading professors of
mathematics (some of which I actually cite in the book), so he is here actually critiquing them for not knowing the solution
I propose. After all, if even they don’t know about this supposedly condescendingly unoriginal argument of mine (and if
they did, they’d have resolved the debate with it in the literature I cite), then why is it condescending for anyone like me to
suggest it?


6. As a final comment on the mathematics he raises 2 issues but doesn’t elaborate on either: “I felt
there were severe errors with his arguments a fortiori…and his set-theoretic treatment of reference
classes was likewise muddled (though in the latter case it coincidentally did not seem to result in
incorrect conclusions).” This is the entire extent of his discussion of these, from his perspective,
problems.

I readily concede that my colloquial discourse will lead to ambiguities that chafe at mathematicians; but this is precisely the
kind of shit they need to get over, because they are simply not going to be able to communicate with people in the
humanities if they don’t learn how to strategically use ambiguity to increase the intelligibility of the concepts they want to
relate.

It’s like Heisenberg’s Uncertainty Principle: you can have precision with unintelligibility to almost everyone but extremely
erudite specialists, or you can have ambiguity but with intelligibility to everyone else. The more ambiguity, the greater the
clarity, but the lower the precision. This is a fundamental principle of all nonfiction literature, especially any popularization of
scientific or mathematical concepts to a nonscientific, nonmathematical public.

I have a particular audience. I am writing for them. And they are not mathematicians or scientists. However, I always think
there are several points where I could be a better writer. Because I always know there is room for improvement. It would
be more helpful to see someone articulate a point I make in my book better than I did. I would love that. And if anyone
points me to any examples of that, I’ll definitely blog about it.


7. In his conclusion he has some positive things to say about the book:

“Outside the chapters on the mathematics, I enjoyed the book, and found it entertaining to consider
some of the historical content in mathematical terms….History and biblical criticism would be better
if historians had a better understanding of probability….

I am also rather sympathetic to many of Carrier’s opinions, and therefore predisposed towards his
conclusions. So while I consistently despaired of his claims to have shown his results mathematically,
I agree with some of the conclusions, and I think that gestalts in favour of those conclusions can be
supported by probability theory.”

But here is his final critique:

“But ultimately I think the book is disingenuous. It doesn’t read as a mathematical treatment of the
subject, and I can’t help but think that Carrier is using Bayes’s Theorem in much the same way that
apologists such as William Lane Craig use it: to give their arguments a veneer of scientific rigour that
they hope cannot be challenged by their generally more math-phobic peers.”

As you can see, he hasn’t presented any concrete objection to the mathematics in the book – just the
way the mathematics was presented and explained and the overall tone of the book.

So it would seem. That’s not really a substantive critique.

Moreover, the difference between me and W.L. Craig is revealed by all of Ian’s qualifying remarks–like “coincidentally did
not seem to result in incorrect conclusions,” a backhanded way to admit I’m actually using it correctly, unlike Craig, thus
negating his analogy. The whole point of my book is to prevent Craig-style abuses by making it clear how to use BT
correctly and how to spot its being used incorrectly. And indeed, I repeatedly emphasize that anyone who wants to use it
needs to be clear in how they are using it so it can be vetted and critiqued, thus avoiding the “dazzling with numbers” tactic
by arming the reader with the ability to see through it. (Hence I somehow managed to “psychically” refute Ian’s argument in
note 33, p. 305, before he had even made it; likewise my remarks on pp. 90-92. In other words, Ian didn’t really read the
book very carefully, as he is clearly unaware of my rebuttals to his own arguments.)

From the whole of his initial critique, Ian doesn’t seem to have as much experience as I do in trying to explain Bayesian
reasoning to nonmathematicians. Much of my book was formed in response to the difficulties I faced when doing that.
Things Ian thinks would be a better way to proceed, I have discovered first-hand are often the worst way to proceed. In
communicating ideas to humanities majors especially, I have learned you have to approach explanations in very different
ways than trained mathematicians do; and that often, mathematicians do not understand this.


Now I’ll turn my attention to Ian’s 2nd post…

Most of his post is taken up with an explanation of conditional probability and Bayes’s theorem,
which is actually pretty good; it would’ve been a good idea to devote a few pages in your book to
something like this. But so far there’s no criticism of your book, or even really much mention of it. I’ll
start the numbering over again from 1 to list his criticisms, which eventually start to appear.

1. For historical questions, there usually is no easy way to calculate, or even estimate in many cases,
P(E). He says, “I’ve never seen any credible way of doing so. What would it mean to find the
probability of the New Testament, say?…I’m not sure I can imagine a way of calculating either
P(H∩E) or P(E|H) for a historical event. How would we credibly calculate the probability of the New
Testament, given the Historical Jesus? Or the probability of having both New Testament and Historical
Jesus in some universe of possibilities?”
[He writes as if he didn’t actually read your book, although I know that he has, because he is going by
his own knowledge of using Bayes’s theorem and not looking at the examples where you apply it. As I
pointed out in a previous post, you get around the issue of estimating P(E) and P(H) by conditioning
on general statements of the evidence, so that you’re calculating P(H|F) and P(E|(~H)&F). These
need to be estimated somehow, but may be easier since the evidence F is more commonly encountered.
It’s funny that he observed you doing this in his first post but then never thought through the
implications for the rest of his posts. I suggest that if you explained what you were doing
mathematically and how it differed from the way scientists usually state and apply Bayes’s theorem
there would not be so much confusion.]

MalcolmS is right: my book is actually articulating ways to get around the very problem Ian’s talking about (which I
certainly acknowledge: note, for example, my discussion of it on pp. 110-14). Where we can’t, we can’t claim to know
anything (Axiom 3, page 23; also Axiom 5, page 28). His question about how we derive “the probability of the New
Testament” is unclear (what exactly does he mean?), but I address something quite close to it on pp. 77-79, using the
Gospel of Mark rather than the whole NT (and I get even more specific in my discussion of emulation criteria later on: pp.
192ff.), which appears to completely answer his question. So why, then, does he not know that I answered his question? If
he is ignoring my answer, then he is not critiquing my book, but some straw man of it.

In any event, the problem he is talking about (and that I also talk about in the book) is addressed by (a) being less
ambitious in what you will attempt to prove (a lesson historians often need to learn anyway) and (b) being more clear and
precise in laying out what evidence it is that you think produces a differential in the consequent probabilities (most evidence
simply will not, and therefore can be ignored). Thus “the whole NT” is irrelevant to historicity; likewise even “the exact
content of Mark.” We will need to get much more specific than that. What is it in Mark that makes any significant
difference? And why? And how much? Whatever your answers to those questions are–literally, whatever they are–we can
model your answer using BT. And in my next book I do that.

Ironically, as I noted before, Ian is committing the very mistake here that I warn against in the book: if we cannot estimate
P(E|H), then historical knowledge is simply impossible. Because all historical conclusions implicitly rely on estimates of
P(E|H) (and/or P(E|~H)), or their differential (using the “Odds Form” of BT: see “Bayes’ Theorem, odds form” in the
index, p. 333). That is how all historical conclusions have ever been reached before now, and how they will ever be
reached by anyone. Thus, if BT can’t solve this problem, no method can. And if Ian thinks otherwise, it’s his task to produce that method,
a method by which (a) a historian can get a conclusion about history without (b) ever relying on any implicit assumption
about any P(E|H) or P(E|~H); or, for that matter, P(H). Good luck with that. Because it can’t be done. If he’d tried it, he’d
know.


2. He then asks whether using the expanded-denominator version of the formula [which he again
annoyingly attributes to you and Craig as if you are the only ones who write it this way, and he’s not
talking about the inclusion of the background here either] could ameliorate this problem with
estimating P(E):

“This is just a further manipulation. The bottom of this equation is still just P(E), we’ve just come up
with a different way to calculate it, one involving more terms. We’d be justified in doing so, only if
these terms were obviously easier to calculate, or could be calculated with significantly lower error
than P(E).”
[He doesn’t seem to think that P(E|~H) is often easier to calculate than P(E), nor does he notice the
advantage of using a variable, P(E|~H), that is independent of the other 2, P(H) and P(E|H), unlike
P(E), which isn’t. Perhaps if he looked at your examples or even the standard example of medical
testing he’d see why the expanded form often works better.]

Here MalcolmS has already effectively rebutted this point. Ian just fails to grasp the way BT has to be employed in the
humanities, and what it takes to translate how people in the humanities already reason into terms definable within BT. The
utility of knowing the whole structure of Bayes’ Theorem is that it lets us understand the logical structure of our own
thought, and thus make better arguments, and better identify the flaws in the arguments of others.
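The medical-testing case MalcolmS mentions is worth spelling out, since it shows exactly why P(E|~H), a false-positive rate, is usually easier to come by than P(E) itself. The numbers below are the standard textbook illustration, not drawn from my book:

```python
# Standard medical-testing illustration of the expanded denominator.
# P(E|~H) (the false-positive rate) is known from test validation;
# P(E) (the overall positive rate) is not directly known.
prevalence = 0.01       # P(H): base rate of the condition
sensitivity = 0.99      # P(E|H): true-positive rate
false_positive = 0.05   # P(E|~H): false-positive rate

p_h_given_positive = (sensitivity * prevalence) / (
    sensitivity * prevalence + false_positive * (1 - prevalence))
print(round(p_h_given_positive, 3))  # 0.167
```

Even with a 99% accurate test, a positive result here only makes the condition about 17% likely, precisely the kind of counterintuitive result that estimating P(E|H) and P(E|~H) separately makes visible.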


3. His next criticism is a bit bizarre, as he complains about having to use estimates: “If these terms are
estimates, then we’re just using more estimates that we haven’t justified. We’re still having to
calculate P(E|H), and now P(E|~H) too. I cannot conceive of a way to do this that isn’t just
unredeemable guesswork. And it is telling nobody I’ve seen advocate Bayes’s Theorem in history has
actually worked through such a process with anything but estimates.”

[Of course you use estimates – even in the sciences one does. Unless you’re doing a problem with dice
or cards, the numbers one plugs in are always estimates. And you present several examples where you
attempt to justify your estimates. It would help if he actually addressed one to show why it was
nothing but “unredeemable [sic] guesswork.”]

He sums up in a similar vein: “So ultimately we end up with this situation. Bayes’s Theorem is used
in these kind of historical debates to feed in random guesses and pretend the output is meaningful.”

Here MalcolmS has already effectively rebutted this point, too. Ian seems to be conflating “not knowing x precisely” with
“not knowing x at all.” I explicitly address this fallacy several times in the book (early in chapter six, and in my discussion of
arguing a fortiori: pp. 85ff.). In short, historians don’t need the kind of precision Ian seems to want. In fact, as I explain in
the book, that they can’t get it even if they wanted it is precisely what demarcates history from science (pp. 45-49).


4. As a teaser for what he intends to write about in a later post, he states why he thinks a fortiori
reasoning doesn’t work:

“But, you might say, in Carrier’s book he pretty much admits that numerical values are unreliable,
and suggests that we can make broad estimates, erring on the side of caution and do what he calls an
a fortiori argument – if a result comes from putting in unrealistically conservative estimates, then that
result can only get stronger if we make the estimates more accurate. This isn’t true, unfortunately, but
for that, we’ll have to delve into the way these formulas impact errors in the estimates. We can
calculate the accuracy of the output, given the accuracy of each input, and it isn’t very helpful for a
fortiori reasoning.”
[His characterization of your a fortiori argument – “if a result comes from putting in unrealistically
conservative estimates, then that result can only get stronger if we make the estimates more accurate”
– is easily demonstrated to be true: P(H|E) is monotonically increasing in P(H) and P(E|H) and decreasing
in P(E|~H), so it follows immediately that if one takes a maximum estimate for P(H) and P(E|H) and a
minimum one for P(E|~H) then the estimate for P(H|E) using Bayes’s theorem is a maximum, and
similarly one derives a minimum for P(H|E). Furthermore, tightening the possible ranges for these
variables yields a tighter range for P(H|E), so the a fortiori argument is in fact valid.

I think what he is intending to say is that while one may in principle get a possible range for P(H|E)
from possible ranges for the other probabilities, in practice, the range turns out to be too wide to be
useful. Whether that turns out to be the case will be seen when you actually try to apply it.]
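MalcolmS’s monotonicity point is easy to check numerically. Here is a minimal Python sketch (the input ranges are invented purely for illustration) showing that feeding the extreme ends of each range into Bayes’s theorem brackets the posterior, exactly as the a fortiori argument requires:

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """Bayes's theorem: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    numerator = p_e_given_h * p_h
    return numerator / (numerator + p_e_given_not_h * (1 - p_h))

# Hypothetical a fortiori ranges (illustrative values only).
p_h_range = (0.3, 0.6)       # prior P(H)
p_eh_range = (0.5, 0.8)      # likelihood P(E|H)
p_enh_range = (0.05, 0.2)    # likelihood P(E|~H)

# Worst case for H: lowest prior and likelihood, highest alternative likelihood.
low = posterior(p_h_range[0], p_eh_range[0], p_enh_range[1])
# Best case for H: the reverse.
high = posterior(p_h_range[1], p_eh_range[1], p_enh_range[0])

# Any estimate drawn from inside the ranges must land between the bounds.
mid = posterior(0.45, 0.65, 0.1)
assert low <= mid <= high
print(f"P(H|E) lies in [{low:.3f}, {high:.3f}]")
```

Tightening any of the three input ranges can only narrow the resulting interval for P(H|E), which is the validity claim being made here.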

His final comment supports my reading of him: “It doesn’t take much uncertainty on the input before
you loose [sic] any plausibility for your output.”

[If this were true it would be true for all uses of BT, so how does he account for its use in science? It’s
not like those estimates for false positives in DNA testing (in his example) lack errors.]
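The DNA illustration can be made concrete. In this sketch the test characteristics and prior are invented (they come from no real assay), but they show that even a tenfold error in the estimated false-positive rate barely moves the posterior:

```python
def p_match_is_true(prior, true_pos_rate, false_pos_rate):
    # P(hypothesis | match) by Bayes's theorem.
    hit = true_pos_rate * prior
    return hit / (hit + false_pos_rate * (1 - prior))

prior = 0.01  # hypothetical prior probability

# False-positive rate known only to within an order of magnitude:
for fp in (1e-6, 1e-5):
    print(f"false-positive rate {fp:g}: posterior = {p_match_is_true(prior, 0.99, fp):.4f}")
```

Both posteriors come out above 0.99, so an imprecise input does not by itself make the output meaningless.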

Here MalcolmS has already effectively rebutted this point as well. Like he says, we’ll have to see what Ian comes up with.
But I suspect he is ignoring everything I explained in the book about the fact that historians have to live with some measure
of ambiguity and uncertainty, far beyond what scientists deal with. That’s what makes history different from science (again:
pp. 45-49; and again, Axioms 3 and 5, per above). Ian seems either to want history to be physics, or to think I want
history to be physics. The one is foolish; the other betrays a failure to really read my book (e.g. pp. 60-67).


5. He hints at another problem here but says he’ll explain some other time:

“[I]n subjective historical work, sets that seem not to overlap can be imagined to overlap in some
situations. This is another problem for historical use of probability theory, but to do it justice we’ll
need to talk about philosophical vagueness and how we deal with that in mathematics.”

As MalcolmS says, Ian’s criticism on this point is too vague to even know what he means. At present no response is
required.


6. There are 4 footnotes to the post, only the last of which could be taken as a criticism of using BT
with uncertainty (specifically the form of BT with the denominator expanded). Here it is in full:

“You’ll notice, however, that P(E|H)P(H) is on both the top and the bottom of the fraction now. So it
may seem that we’re using the same estimate twice, cutting down the number of things to find. This is
only partially helpful, though. If I write a follow up post on errors and accuracy, I’ll show that errors
on top and bottom pull in different directions, and so while you have fewer numbers to estimate, any
errors in those estimates are compounded.”

[Since P(E|H)P(H) appears in both the numerator and denominator, an increase in it would increase
both the numerator and denominator, so the effects from each would offset, not compound! But P(H)
appears in the second term in the denominator, so it is not quite that simple. Changes in P(E|H)
would partially cancel in the numerator and denominator, making P(H|E) less sensitive to changes in
it, but for P(H) depending on the ratio of P(E|H) and P(E|~H), the effect of changes in the prior on the
top and bottom of the fraction can indeed compound, but not for the reason he stated.]
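MalcolmS’s cancellation point can be illustrated with invented numbers: perturbing P(E|H), which appears in both numerator and denominator, moves the posterior less than the same relative perturbation of P(H), which also shifts the P(E|~H)P(~H) term in the opposite direction:

```python
def posterior(p_h, p_eh, p_enh):
    # Expanded form: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)].
    return p_eh * p_h / (p_eh * p_h + p_enh * (1 - p_h))

base = posterior(0.5, 0.6, 0.3)

# +10% to P(E|H): it rises on top and bottom, so the effects partially cancel.
shift_likelihood = posterior(0.5, 0.66, 0.3) - base

# +10% to P(H): the P(E|H)P(H) terms partially cancel, but P(~H) also falls,
# shrinking the second denominator term, so the net effect is larger.
shift_prior = posterior(0.55, 0.6, 0.3) - base

print(f"shift from likelihood bump: {shift_likelihood:.4f}")
print(f"shift from prior bump:      {shift_prior:.4f}")
```

With these (purely illustrative) values the prior bump moves the posterior roughly twice as far as the likelihood bump, matching the bracketed analysis above.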

[That is his whole criticism. Basically he doubts that BT can be practically used in history because he feels
the inputs are too subjective and have uncertainties that are too wide. We’ll have to wait for your next
book to see if you can pull it off.]

The quoted argument doesn’t seem mathematically informed, unless he means errors in P(E|H) and P(E|~H) can pull in
different directions; otherwise, P(H) and P(~H) always sum to 1, so they actually consist of a single estimate, not two, so
they can’t pull against each other. If you estimate P(H) against yourself, you have also estimated P(~H) against yourself, by
definition. And if you do the same with both P(E|H) and P(E|~H), they can’t ever pull in opposite directions, either. The
compounded error between them will then only make your argument more a fortiori. So there isn’t any discernible
criticism here that I can make out. (You can see how I already address this whole issue on pp. 85-93.)

Overall Conclusion: So far I am not impressed. It doesn’t look like Ian’s taking any argument in the book seriously. His
critiques are almost like some Cato Institute effort at refuting a policy that they don’t really have any valid argument against
but they have to refute it anyway so they come up with whatever trivia and handwaving they can think of. The fact that
almost all of Ian’s critique is already rebutted in the book itself (often directly and explicitly) only strengthens the analogy.

46 comments
AUSTERITY • OCTOBER 9, 2012, 4:07 PM

I have one narrow criticism that, after reading the above, might be related to your discussion of point 3 in Ian’s
first post. It might best be summarized as advice to prefer clarity when it comes to stating hypotheses.

I thought about posting this a couple of times but decided both times that maybe the distinction I am
recommending is so subtle as to be no distinction at all. That perhaps it was just me being momentarily dense. It
was something I had to stop and think about for a minute though, to make sure you were saying something
sensible.

I have the Kindle edition and the Kindle tells me it is on page 133. The discussion regards Mark’s narrative
following the Psalms.

You write, “In Bayesian terms, the probability of all these coincidences with the Psalms is much lower on the
hypothesis that they really happened than on the hypothesis that Mark is creating a narrative out of the Psalms.”

So the evidence, e, is “coincidences with the Psalms are in Mark,” and the hypotheses are h1=”they really
happened,” and h2=”Mark invented them by following the Psalms.” Shortly after you say that P(e|INVENTED)
>>> P(e|HAPPENED).

This gave me pause. My thinking went in a few steps.

1) With the wording of the paragraph, P(e|HAPPENED) should represent how likely our evidence is given the
hypothesis that the events really happened.

It wasn’t clear to me why it would be significantly lower than the INVENTED case. Especially if our
background includes the fact that a Jesus cult formed. That seems like exactly the kind of thing someone would
want to include. “You’re telling me the way he died followed the Psalms? I’m definitely including that!”

Now, HAPPENED might have a low prior but you are clearly talking about the consequent here, so I don’t
think this is what you could really mean.

2) So I thought maybe you just meant that some-passion-or-other HAPPENED. Then P(e|HAPPENED) would
be lower because there are a lot of ways that some-passion-or-other could happen and one that follows the
Psalms is only one among many.

The problem here is that then HAPPENED isn’t mutually exclusive with INVENTED. Some-passion-or-other
could happen and still have Mark invent his particular narrative. (i.e. P(H) + P(~H) won’t add up to 1)

3) Finally I decided you must mean the first hypothesis is h1=”Mark accurately recorded real events.” (I told
you it was subtle!) This has the advantage that, by being non-specific about what exactly he is recording (other
than it being true), it implicitly pulls in all the other ways some-passion-or-other could happen, while still being
exclusive with INVENTED (or, technically, with the hypothesis that at least some of Mark was invented).

You are probably thinking something like, “So Mark accurately recorded real events? In other words, ‘they
really happened’!” But, I think this distinction makes it more clear what is really going on, which might be
important since your audience is people without any Bayes background. I think maybe being colloquial, in this
case, gets in the way of making the use of BT clear.

Or maybe not. I know it is pretty subtle but, like I said, it gave me a bit of pause, and at first I thought you might
have made a blunder. I was thinking there was a later instance of this kind of potentially confusing language, but I
couldn’t locate it. It isn’t easy to thumb through a book on a Kindle.


RICHARD CARRIER • OCTOBER 10, 2012, 10:27 AM

You’re right, my wording is confusing there. This is an example of where my writing could be
improved (as I mentioned in the article above).

If we assume h is literally that the coincidences all happened as described, then P(e|h) = 1
and (as you note) the problem is moved to the prior (where the prior probability of such
coincidences is then low; the math then works out the same–but the model should still then be
described differently).

So I should have explained “happened” means simply that Jesus was crucified etc. (and not
the presumption of any specifics; which throughout the book I explain is the better way to
form a hypothesis: otherwise, as in the scenario above, we are gerrymandering–see
“gerrymandering” in the index, but esp. the basic explanation of the problem of trying to move
improbabilities from the consequent to the prior on pp. 80-81).

Then P(e|h) = the probability that “Jesus was crucified etc.” would generate any such set of
coincidences, which is low; whereas P(e|~h) = the probability that “the author is inventing
some set of coincidences” would generate some comparable set of coincidences, which is
high. Here, in parallel, we don’t have ~h assert the exact coincidences, only the general
hypothesis that the author is inventing coincidences (regardless of what they turn out to be;
hence, pp. 77-79, with “coefficient of contingency” in the index).
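To make the shape of that model concrete, here is a minimal numeric sketch in Python; the values are invented purely for illustration and are not estimates I am defending:

```python
# Odds form of Bayes's theorem: posterior odds = prior odds * likelihood ratio.
prior_odds = 1.0     # hypothetical: h and ~h treated as equally likely a priori
p_e_h = 0.01         # low: a real crucifixion generating such Psalms coincidences
p_e_not_h = 0.5      # high: an author inventing coincidences generating them

posterior_odds = prior_odds * (p_e_h / p_e_not_h)
p_h_given_e = posterior_odds / (1 + posterior_odds)
print(f"P(h|e) = {p_h_given_e:.3f}")
```

The point is only that when P(e|h) is much smaller than P(e|~h), the posterior for h is driven down regardless of the exact figures chosen.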

You note the problem of excluded middle: that “Jesus was crucified etc.” is compatible with
“the author is inventing some set of coincidences” and therefore there is a third hypothesis
that has to be considered. This would be P(e|h_special): that “Jesus was crucified etc.” and
that Mark’s account of it is completely (or mostly) fabricated. But P(e|h) = the probability
that “Jesus was crucified etc.” would generate any such set of coincidences, which probability
is the same whether Mark is fabricating his account or not (although by definition it is here
assumed not); and P(e|~h) = the probability that “the author is inventing some set of
coincidences” would generate some comparable set of coincidences, which probability is the
same whether Jesus was crucified or not (and here, no assumption is stated regarding that).
So there is no middle being excluded; the model simply isn’t even testing whether Jesus was
crucified, but whether Jesus being crucified would cause e.

This means P(e|h_special) is already implicitly included in P(e|~h). In other words, ~h only
asserts that Mark fabricated, not that Jesus wasn’t crucified; so h can then be restated as
“Mark didn’t fabricate” rather than “Jesus was crucified etc.” (it’s just that “Mark didn’t
fabricate” entails that “Jesus was crucified etc.” must have caused e). And that is essentially
the solution you worked out. You have provided a guideline for a much better rewrite of that
page.

(And the cool thing is how well you thought like a Bayesian to work out the correct model!
That’s exactly the sort of thing I want historians to know how to do.)
BTW, one could still unpack that third hypothesis if one wanted to and have a three-hypothesis
test. P(e|h_special) would differ from P(e|~h) only by the degree to which it would
be improbable for Jesus to actually have been crucified etc. and for Mark to fabricate an
account of it anyway, which might not be very much different (i.e. on minimal historicity,
whereby we assume most information was lost, thereby explaining the widespread
contemporary silence about it even among Christians, it would not be very unlikely that Mark
would fabricate a crucifixion account even for a real crucifixion; whereas theories that Mark
would have good information, which would make the probability of his “fabricating anyway”
lower, would then struggle to explain why we had to wait for Mark to write about it, which
silence would in turn be less probable, and so on–but this then gets into more complex arrays
of hypotheses to test, and over-complicates the problem for the purposes of the point being
made on p. 133).


YOUNGALEXANDER • OCTOBER 9, 2012, 4:41 PM

MalcolmS: “3. His next criticism is a bit bizarre, as he complains about having to use estimates …”

Indeed.

“Of course you use estimates – even in the sciences one does. Unless you’re doing a problem with dice or
cards, the numbers one plugs in are always estimates.”

Of course they are. An observation of some property of a physical phenomenon results in an estimate of its value
within certain limits. It would be meaningless otherwise. Such errors are simply an integral part of the
measurement procedure. When the data is subsequently processed the errors are as well.

As you reply, Ian appears to be complaining of the size of the errors in history, but that is irrelevant. He gives the
impression of one who is recoiling from the ‘shock’ of the new, i.e. the application of BT to history.


RICHARD CARRIER • OCTOBER 10, 2012, 10:42 AM

That or he is recoiling from the shock of realizing that historical knowledge is a lot less certain
than scientific knowledge. Most of us already got that memo a long time ago. But perhaps it’s
startling to scientists?

I know the general public has a hard time with this–the levels of uncertainty that actually exist
for historical claims (often even what we consider well nigh certain claims) would disturb
people if the risk of being wrong were greater. For example, if we said the odds were 1 in
100 that Jesus didn’t exist, we’d confidently say it was certain that Jesus existed. But if we
said the odds were 1 in 100 that your car will explode the next time you sit in it, you would
not confidently say it was certain your car was safe. To the contrary, you sure as hell would
never get into it again. Translating one’s willingness to grant 1 in 100 as a real risk from the
one case to the other is intuitively difficult for people. And notably, a 1 in 100 chance of being
wrong would be rejected by every scientific peer review process there is.

Thus, historians, and people generally, have become complacent with a double standard,
assuming their probabilities equate to certainties, when in fact they don’t really, not by
scientific standards, nor by any standard that relates to a real risk. It only gets worse when
we are looking at claims in history that have, say, a 1 in 5 chance of being false. Which would
not be unusual. But that is still different from a 4 in 5 chance of being false. Thus, even highly
uncertain outcomes in historical analysis make a difference to what we assert–a concept that
might be hard for scientists to grasp, who are accustomed to simply throwing out all such
uncertain results–being, as they are, unpublishable. But “unpublishable in a science journal”
does not equate to “communicating no knowledge about the world at all.”

You just have to be more comfortable with ambiguity and uncertainty than that. Obviously,
not when the risk of being wrong is high. But that isn’t commonly the case in history.
Although it is sometimes the case. So one has to be careful before using a historical claim on
which to base a philosophy or social policy, for example. And that is why scientists have such
high standards of documentation, since science depends on historical claims, and thus it must
ensure those claims maintain extremely low probabilities of being false…the fact that people
didn’t have that sensibility in antiquity is precisely what puts us in realms of uncertainty about
such things as whether Jesus existed or how Christianity began, something you’d think God
would anticipate and fix right from the get-go; just one more evidence Christianity had no
God behind it.


ALEXANDER JOHANNESEN • OCTOBER 9, 2012, 4:49 PM

That settles it; I’ll buy your book right now. Ian’s criticism is damn outright puzzling, but I think he actually
explained his error the best;

“It doesn’t read as a mathematical treatment of the subject”

Um, yes? It isn’t, nor intended to be, nor advertised as such. Someone is, indeed, being pedantic, nitpicking at
the straw outline of a straw man.


IRRCOIAN • OCTOBER 10, 2012, 5:31 AM

There’s a style of rebuttal by exhaustive quoting. Which is fair enough, but degenerates into picking fault with the
character of what is said. So although there are lots of things in this response that in my opinion are wrong,
misunderstood or unfair (discussing through an intermediary is probably not useful), I want to focus on the book.

It might be worth me summing up my conclusions on the book.

1. I agree with a lot of your conclusions. Particularly with regard to criteria. I probably liked the book partly
because the topics you chose were ones where you confirmed many of my biases.
2. I think probabilistic reasoning under a Bayesian interpretation can help build stronger intuitions about history,
and it is a valuable skill to have for anyone in the arts. I’m not convinced it is the most important thing a historian
should learn, but if learned, and learned rigorously, it can be nothing but helpful. Another way to state this:
Malcolm on my blog and you here (and in the book) make the point that, even with the loosest assumptions and
the worst approximations, we’re no worse off with probability theory than making the same assumptions and
approximations narratively. Which is what most historians do anyway. I agree, totally, but that doesn’t mean it is
anything more than trivially more accurate. Can you quantify the increase in accuracy? I don’t think so, not
without quantifiable test cases.

3. Pedagogy: the book seemed to be aimed at helping folks interested in this area understand the applicability of
Bayes’s Theorem to the field. It didn’t read as a book that assumed that knowledge and suggested an
application, but read as an introduction to the topic. So I think it is a valid and non-pedantic criticism to say that
its lack of any explanation of what probability is, what conditional probability is, what Bayes’s theorem is doing
and why, and how it is derived, is important. The fact that such foundational content is relegated to footnote
references to other books made me doubt your sincerity in actually wanting to help your readers get to grips with
the topic.

4. Technical accuracy: It is understandable if you’re approaching the math as an amateur, but you give the
impression you simply don’t understand certain things you discuss. Issues like Frequentist/Bayesian issues: where
the controversy isn’t even discussed in recognizable terms, let alone your solution being valid. It isn’t that I’m
claiming your solution is not novel, but that you haven’t even stated the problem properly. Similarly with Bayes
under sets of evidence: the problem of independence of multiple pieces of evidence. Statistical sensitivity to
reference classes. Changes of reference class when updating for new evidence. These are tricky, tricky issues,
that when you do this for real, you have to spend time on. Ignoring them has sent innocent people to jail.

5. The upshot: Doing this non-rigorously (you make the point that you don’t want to use this scientifically, or
mathematically) gives the massive danger of choosing inputs to the process (probability estimates, error ranges,
reference classes, choice of evidential features, phrasing of the evidence classes, phrasing of the hypothesis class
— all of which are under your control) which give the output you want. There are simply so many people
who’ve used Bayes’s Theorem as a black box to give them back their pre-conceptions. This happens *a lot* in
science, where statistical analyses demonstrate the researchers’ biases, which then disappear when repeated by
others without the same bias. I agree with you that being clear about what those inputs are is important, but the
choosing of the actual numbers is perhaps the most obvious and least important. From what I’ve seen over years
of this, you are falling into the trap of thinking you’ve done more than restate your conclusions.

So tl;dr – I liked the book; but I got the sense it was more polemic and less pedagogic than you wanted it to
appear; you displayed a tendency to use terminology sloppily, over-extend your knowledge (clearly you have a
good grasp of some elements of the math), and make generalizations without enumerating the conditions for them
to hold; and the book did very little to show that you’d properly wrestled with the errors inherent in your
selection biases and numeric estimates, and therefore that your conclusions were anything more than what you put
in.

There is a big literature dealing with some of these issues, and while I would never expect you to have mastered
it (nor have I, or perhaps anyone), it is important to know where the main problems are.

Unfortunately, the general style of internet arguments degenerates rapidly into people insisting others prove their
misunderstandings wrong. Which by definition is hard. So I’ve struggled to address, for example, Malcolm, on
my blog, without saying “go away and do some of this for real problems where you can quantitatively and
objectively check your answers, and then you’ll see how tricky it is.”

RICHARD CARRIER • OCTOBER 10, 2012, 11:43 AM

Thank you for these clarifications. They didn’t come through in your blog, so it’s helpful to
have them.


Malcolm on my blog and you here (and in the book) make the point that, even
with the loosest assumptions and the worst approximations, we’re no worse
off with probability theory than making the same assumptions and
approximations narratively. Which is what most historians do anyway. I
agree, totally, but that doesn’t mean it is anything more than trivially more
accurate. Can you quantify the increase in accuracy? I don’t think so, not
without quantifiable test cases.

I think you are confusing here two different kinds of improvement: improvement in probability
estimates for correctly reasoned arguments; and improvement from incorrectly reasoned
arguments to correctly reasoned arguments.

I don’t claim much in the way of the former (that can perhaps arise from the process of using
BT as a tool for better mediating disagreement, as I explain, for example, in chapter 6, pp.
208-14, and chapter 3, pp. 88-93; and also by an understanding of BT teaching historians
the importance of getting more data, as has happened in archaeology; and in the simple sense
of forcing historians to actually confront what their probability estimates are, and what that
means as far as odds of being wrong and the attending humility that often will entail), but I am
much more concerned with the latter (hence re-read pp. 92-93 in particular).

Historians are notorious for making illogical arguments and thinking they’ve made a good
case (not only is this massively documented in Fischer’s Historians’ Fallacies but it’s what
I essentially document throughout chapter 5), or rejecting logical arguments by resorting to
fallacious dismissals of them (rejecting sound arguments from silence, or not knowing when
an argument from silence is sound, or how to test the soundness of an argument from silence,
is a very common example, hence understanding BT can help historians do all of those things,
and thus get out of the rut of relying on their uninformed and often illogical gut intuition: hence
pp. 117-19).

This is where BT can cause tremendous improvement in how historians reason, argue, and
debate. Fischer notes in his book that someone needs to discover the logic of history, and
that (as of his writing) no one had done that yet. Proving History does that. Although I
wasn’t the first to think of it, as essentially the same case is made in Aviezer Tucker’s Our
Knowledge of the Past: A Philosophy of Historiography, only he doesn’t convert it into
practical advice or explain much of how historians can use BT or even write for historians, he
just shows that historical reasoning is already necessarily Bayesian (sadly, I was unaware of
his book until after mine came out, so I didn’t get it included in the endnotes, where I’d surely
want it; it probably wouldn’t have influenced PH, though, as everything useful in it I had
already thought of independently, which IMO makes him Wallace to my Darwin…or vice
versa).


So I think it is a valid and non-pedantic criticism to say that its lack of any
explanation of what probability is, what conditional probability is, what
Bayes’s theorem is doing and why, and how it is derived, is important. The
fact that such foundational content is relegated to footnote references to
other books made me doubt your sincerity in actually wanting to help your
readers get to grips with the topic.

Except I do explain what conditional probability is (pp. 79-81), and I explain as much as a
historian needs to know about what BT is doing (chapter 3). The practical fact is that
historians don’t care about the history of BT or its logical foundations or indeed almost any of
what you wrote about on your blog. You and I find that interesting. But they don’t, and
quickly get bored with it. I know this from experience. And the fact is, they really don’t need
to know it. All they need to know is what I provide in chapters three and four. Anyone who
wants to know more, can follow the references provided.

I would love to find a really good textbook on probability theory that isn’t unintelligible or
almost entirely useless or irrelevant to humanities majors, so I can recommend it as a twin to
mine. So far the best I have is McKellar’s Math Doesn’t Suck, which is perfect except that
it only treats probability very briefly and rudimentarily. If she ever produces a book just on
statistics (which will certainly cover all the basics of probability theory), that would be the
dream. Or if anyone else does that. If you know of any already, do let me know. But it really
has to be something comparable (as in, aimed at teaching nonmathematicians; a blizzard of
differential equations doesn’t do that).


It is understandable if you’re approaching the math as an amateur, but you
give the impression you simply don’t understand certain things you discuss.

If there are better ways to argue a point in my book, please improve on my work by
producing it. I am actively looking for blogs and blog articles that do that, and if they are
good enough and accomplish the task, I will definitely blog about them (if I know of them; so
usually, someone has to tell me about them).

Otherwise, all I need to know is what I try to convey. Errors I need to correct; but someone
voicing impressions that don’t relate to any actual errors to correct, is not useful.



Issues like Frequentist/Bayesian issues: where the controversy isn’t even
discussed in recognizable terms, let alone your solution being valid. It isn’t
that I’m claiming your solution is not novel, but that you haven’t even stated
the problem properly. Similarly with Bayes under sets of evidence: the
problem of independence of multiple pieces of evidence. Statistical sensitivity
to reference classes. Changes of reference class when updating for new
evidence. These are tricky, tricky issues, that when you do this for real, you
have to spend time on. Ignoring them has sent innocent people to jail.

I don’t think historians are worried about going to jail over any of this, so the hyperbole is
unwarranted. I didn’t write a book for lawyers or risk managers.

As to the more specific points, I would love–really love–good blog posts covering every one
of those issues you list, in terms a nonmathematician can understand. Produce them, and I’ll
blog them. Produce enough of them, and you’ll have a book you can market to historians
who want to go beyond what I cover or to understand it better.

That’s how a field makes progress. And having scientists working to help historians is
precisely the kind of interdisciplinary work I want to see more of. You just have to
understand that they speak a different language and lack almost all the assumptions and
mathematical background you take for granted, and have different needs and interests than
you (as for example our different interest in how BT was proved and what it is doing
mathematically, vs. that being of no use or interest to historians who just want to know how
to use it and benefit from it; although a good blog post/book chapter on that subject would
still be great, in any work that aims to expand on mine and help historians dig deeper into the
whole matter of Bayesian reasoning and the pitfalls and wonders of probability theory).


Doing this non-rigorously (you make the point that you don’t want to use this
scientifically, or mathematically) gives the massive danger of choosing inputs
to the process (probability estimates, error ranges, reference classes, choice
of evidential features, phrasing of the evidence classes, phrasing of the
hypothesis class — all of which are under your control) which give the output
you want.

That’s true of all methods whatever. Historians are already doing this, all the time. It’s
precisely because they don’t know how to logically vet or model what they are doing that
they (a) don’t know they are doing it or (b) don’t know how to detect or prove when it’s
being done. Knowing BT arms you against exactly those problems, arming you against
misuses of BT. If you understand the mechanics of BT, you can then spot and therefore
defend yourself against exactly the dangers you refer to. And if you still miss a problem or
mistake, for its complexity perhaps, someone else can come along and catch it and explain
the problem. Then you redo. And progress has occurred. I fully expect that to happen to me.
It’s precisely what I want.

But if we don’t even get started on this process, we will experience no progress whatever.
Thus, we have to start somewhere. The potential for errors is irrelevant, precisely because
that potential exists now (no matter what method you use; again, as Fischer and I
demonstrate), yet we have no tool to get beyond it. But now we do. That’s why we need to
get started on learning and applying that tool.

Indeed, I expect there are many ways mathematicians could help historians do this even
better, by covering the logic of everything you mention here, in terms understandable to and
usable by historians (rather than in esoteric terms only intelligible or useful to scientists or
mathematicians; for example, you have to cast aside science-level quality standards and ask
instead how we deal with highly uncertain information and small datasets, since the latter is
what historians face–in fact, the latter is what most people face, most of the time, so really
scientists and mathematicians are leaving neglected a vast realm in great need of improved
reasoning, all because of a misapplied standard that disregards everything that wouldn’t pass
scientific peer review as being moot, when in fact it’s not moot, it’s just not that certain, and
yet nevertheless is far from being moot, but is often essential to daily life and a great many
professions).


There are simply so many people who’ve used Bayes’s Theorem as a black
box to give them back their pre-conceptions.

And when you understand BT, you can call them out on this when it happens. It’s when you
don’t understand BT that you are in danger of being misled or bamboozled by this.

And again, this trick is used even without BT, thus avoiding BT won’t avoid the problem
anyway (see, again, note 33, p. 305, and my remarks on pp. 90-92). Whereas since any
argument can be modeled with BT, you can use BT to vet even other methods to see if the
same tricks are being pulled there as well.

And again, even my own errors can be more easily detected (and thus, once pointed out,
more easily corrected) the more people who understand the mechanics of BT. I consider that
a tremendously useful feature. Because I actually like catching and correcting my own errors.

But that actually has to be done. That is, you can’t just vaguely handwave about there being
errors, and then never point out where these errors are or why they are errors (or how to fix
them, which would be useful, too).

IRRCOIAN • OCTOBER 10, 2012, 1:32 PM

Thanks Richard.

I wonder where the most constructive area to focus is. Some of your response seems fair enough, other bits I
still just don’t buy. But no point just restating my objections.

not “improvement in probability estimates for correctly reasoned arguments;” but “improvement from incorrectly
reasoned arguments to correctly reasoned arguments.”

So, am I understanding you right: you don’t see using Bayes’s Theorem as a way of arriving at estimates of
confidence in conclusions starting from estimates of its inputs, but rather as a tool to look at both sides
of the equation at the same time to figure out if they are consistent?

A fan of yours who interacted on my blog said this today:

“I think I understand the debate well enough to say that Carrier states that he doesn’t view or explain BT as “a
strict input-output process” and I agree with him.”

I replied that, if that is what you’re saying, I agree too.

I got the sense from the book you thought that we could estimate the inputs independent of estimates of the
conclusion and derive important insights on the conclusion. If you’re not saying that, then a chunk of my
objections are rendered moot.

“That is, you can’t just vaguely handwave about there being errors, and then never point out where these errors”

Yeah, it is tough to write in detail in a review. So I picked issues that were maybe more pedantic, but I wanted
to communicate my feeling that you were using the math polemically rather than mathematically.

In the comments to the two posts there are discussions about sensitivity to reference classes, the way BT is ill-
conditioned for certain probability bounds, the problem of adding evidence to small probabilities. I’d also say my
issue with your characterization of Bayesian/Frequentist positions is quite specific: you do not describe them in
terms that explain why they are not trivially unified (if the problem were as you describe it, you wouldn’t be
the first to describe your solution to it!).

If you want to unpack any of these, can I suggest we pick one and discuss, because I find it quite tough to
interleave multiple issues.


RICHARD CARRIER • OCTOBER 12, 2012, 8:49 AM


So, am I understanding you right: you don’t see using Bayes’s Theorem as a
way of arriving at estimates of confidence in conclusions starting from
estimates of its inputs, but rather as a tool to look at both sides of
the equation at the same time to figure out if they are consistent?

That’s not quite what I said. But yes, that is also one other useful thing we can do with it (as I
explain on p. 214).

But more exactly, I said two things: (1) that in some cases, BT can be used to improve our
estimates (the kinds of cases I spelled out in my reply, for example) and (2) that the main
thing understanding BT does is help us argue correctly instead of incorrectly. In a very
abstract sense you can say that any fallacious argument (even by someone who knows
nothing about BT) can be modeled with BT and that will show that there is an inconsistency
as you suggest, and then understanding BT can help a historian fix or avoid that inconsistency.
But doing either requires understanding BT, and doing both is badly needed in the field of
history.

And of course once historians can do (2), they can do (1), through debate. That’s the other
benefit: once a historian sees the Bayesian model for what an opponent is arguing, he can
start challenging the premises even of a consistent argument, and a debate can ensue as to
how credible those estimates are and whether they should be changed; and once changed,
what effect that has on the conclusion. This is something historians, right now, cannot
coherently do at all. They do it in some sort of vague, hard-to-pin-down, get-nowhere battle
of intuitions. BT will allow them to actually see how to proceed and make arguments in this
process that can actually derive from observations and not just intuitions.

This is what I explain in the sections on mediating disagreement (in chapters 3 and 6).


I got the sense from the book you thought that we could estimate the inputs
independent of estimates of the conclusion and derive important insights on
the conclusion. If you’re not saying that, then a chunk of my objections are
rendered moot.

If you mean, you thought historians approach hypotheses with no idea whether they may or
may not be true until they run them through BT, then that doesn’t even correctly describe
how scientists behave. Even scientists often have a good idea of how the results of a study
or equation will come out, but do it anyway just to be sure (because sometimes they find they
are wrong; yet usually, depending on their experience, they were ballpark right, certainly with
the same accuracy as historians, who almost never get results more accurate than ballpark
anyway).

Yes, historians, like scientists, sometimes don’t have any idea what a result will be until they
run the numbers. But usually, once they see the data, they have an idea. (That’s how any
history has ever been done successfully until now: obviously historians have been using some
arguments that BT would verify as correct.) It’s then just a question of making sure they’re
right. And BT can do that; although it can also give you surprises (since people are often bad
at estimating the effect on a larger conclusion of a firm probability belief they have).

Above all, even if all you do is use BT to get the result you want, by using it you’ve let the
cat out of the bag. Now others can come in and point out that your inputs are implausible.
Which will result in a debate over how we can know that (or not know it’s not that, as the
case may be). Which will result in progress toward a consensus that will be less dependent
on what a proponent wants to be the case and more dependent on what the data can actually
support. Again, I cover this in the sections on mediating disagreement in chapters 3 and 6.

But this is all within the paradigm of historians who use BT. I’m trying to get historians into
that paradigm. Right now, they are using not BT but a slew of, let’s say, “wonky theorems”:
often partly or wholly fallacious formulas for getting from premises to conclusions (I give examples all
through chapter 5), so that even sound inputs aren’t getting them sound outputs; and because
their reasoning’s logic is not made explicit with any model, critics have no clear way to
identify what they are doing wrong, or even to detect that something wrong was done. Thus,
“mediating disagreement” has no objective procedure, and progress is near impossible
(beyond random drift).

By contrast, BT reduces all arguments to a debate over just three numbers. That’s a
tremendous advance.

Indeed, usually historians do not disagree egregiously over one or two of those numbers, so
often BT can reduce an argument to a debate over just one number, or two. It’s easy to see
how progress can then ensue–or even at worst, how historians can settle on where their
disagreement actually is (e.g. faith-based historians will over-estimate the prior probability of
miracles, secular historians will not; but even there, the only way faith-based historians can
get the results they want is by so hugely over-estimating the prior probability of miracles that
their position can easily be exposed as ridiculous…by anyone who understands BT: see my
deployment of this point in my use of BT to refute the McGrews’ use of BT to defend the
resurrection of Jesus in The Christian Delusion; although I never name the McGrews, I
address their every argument, as well as the standard arguments of Craig, Habermas, and Licona,
Licona, and put the effect on the Bayesian model in my endnotes; I expand that argument to
even more devastating effect in my Bayesian analysis of the origins of Christianity in The End
of Christianity, where I don’t even need probability estimates, just what could be reduced
to completely nonnumerical statements of relative probability [like C > D and C = D],
another example of how the logic of BT can be used to good effect without even using
numbers).
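That ordinal point is easy to demonstrate: purely relative constraints on the inputs can bound the posterior without any exact numbers. Here is a minimal sketch in Python (the function and the particular rankings are my illustration, not the book's actual argument): if P(E|H) <= P(E|~H) and P(H) <= P(~H), then P(H|E) <= 1/2 no matter what the exact values are.

```python
# Sketch: ordinal ("nonnumerical") Bayesian reasoning. Knowing only the
# rankings P(E|H) <= P(E|~H) and P(H) <= P(~H) already forces
# P(H|E) <= 1/2, whatever the exact numbers. Spot-checked by random search.
import random

def posterior(p_h, p_e_h, p_e_nh):
    """P(H|E) by Bayes' Theorem, with P(~H) = 1 - P(H)."""
    num = p_e_h * p_h
    return num / (num + p_e_nh * (1.0 - p_h))

random.seed(0)
for _ in range(10_000):
    p_h = random.uniform(0.001, 0.5)        # P(H) <= P(~H)
    p_e_nh = random.uniform(0.001, 1.0)
    p_e_h = random.uniform(0.0, p_e_nh)     # P(E|H) <= P(E|~H)
    # tiny tolerance only guards against float rounding
    assert posterior(p_h, p_e_h, p_e_nh) <= 0.5 + 1e-12
print("relative rankings alone bound the posterior at 1/2")
```

The algebra behind the search: P(H|E) <= 1/2 exactly when P(E|H)P(H) <= P(E|~H)P(~H), which follows from the two rankings since all terms are nonnegative.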


In the comments to the two posts there are discussions about sensitivity to
reference classes, the way BT is ill-conditioned for certain probability bounds,
the problem of adding evidence to small probabilities. I’d also say my issue
with your characterization of Bayesian/Frequentist positions is quite specific:
you do not describe them in terms that explain why they are not trivially
unified (if the problem were as you describe it, you wouldn’t be the first to
describe your solution to it!). If you want to unpack any of these, can I
suggest we pick one and discuss, because I find it quite tough to interleave
multiple issues.

I like them all (as well as the others you listed before). You should pick the one you are most
enthusiastic or comfortable blogging on, do a blog post on it, and let me know the URL.
Since you are certainly more expert in these details, there could be great benefit in this.
I’m most interested in how to improve on PH. So a blog post that shows how to frame a problem
correctly, or what the pitfalls are and how to avoid them, would be ideal (even if you aren’t sure how a
historian would do that, just outline what in general would need to be done, maybe give an
example of how it’s done in a scientific case–I might then be able to give examples from the
historical field that are analogous).


RICHARD MARTIN • OCTOBER 10, 2012, 5:02 PM

Hi Richard,

In response to your point about the lack of good introductory texts on probability and statistics for the
humanities, I would like to propose the following as providing considerable insight into concepts of everyday
probability and risk, both in business and life in general.

Phil Rosenzweig, The Halo Effect
Douglas W. Hubbard, How to Measure Anything
Leonard Mlodinow, The Drunkard’s Walk – How Randomness Rules Our Lives
Dan Gardner, Risk
Dan Gardner, Future Babble: Why Expert Predictions Fail – and Why We Believe Them Anyway
Ian Hacking, Probability and Inductive Logic
Morris Kline, Mathematics for the Non-mathematician (there are chapters on probability and statistics).

A book that just came out is by Nate Silver, The Signal and the Noise: Why So Many Predictions Fail – But
Some Don’t

Dan Ariely’s book Predictably Irrational provides a look at why people have such a hard time with probability
and rational thinking, as does Daniel Kahneman’s Thinking, Fast and Slow.

More advanced but more complete is Jonathan Baron’s Thinking and Deciding, 4th edition.

I hope that helps,

Richard Martin

RICHARD CARRIER • OCTOBER 12, 2012, 9:01 AM

I’ll look into those. Thanks.


JT512 • OCTOBER 10, 2012, 9:06 PM

I’m not sure what either Ian or MalcolmS is getting at with P(E|H)P(H) being in both the top and the bottom of
the fraction in Bayes’ Theorem. Errors in P(E|H)P(H) neither “pull in opposite directions” nor “offset.”

The effect of each term in Bayes’ Theorem on the posterior probability of H becomes clear by examining the
odds form of Bayes’ Theorem, because each term only appears once in the equation:

odds(H|E) = P(E|H)/P(E|~H) × odds(H)

Jay


RICHARD CARRIER • OCTOBER 12, 2012, 9:11 AM

Well, no, the same terms are all there (the numerator and denominator of a fraction can pull in
opposite directions). But that’s the same thing I mentioned: odds(H), which is P(H)/P(~H),
can look like it has two terms that can be independently in error, but as we know, there is
really only one term here, since P(H) and P(~H) must sum to one, so there is only one error
that can happen here, and if it’s against you, you are arguing a fortiori, and all is well. Then
there is P(E|H)/P(E|~H), and those can vary independently of each other, and thus could, in
principle, create a compound error (if each errs in your favor, you have a double error in
your favor), but if you are arguing a fortiori, you are going to make sure they both err in the
other direction, and thus the compound error only makes your argument even more a
fortiori, which is what you want.
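The relationship between the two forms is easy to verify numerically. A quick sketch (the function names are mine): the odds form JT512 cites and the standard form of BT always yield the same posterior, so the "pull" of each term can be read off directly from the odds form.

```python
# Sketch: the standard form and the odds form of Bayes' Theorem
# always agree (function names are mine, not from the book).

def posterior_standard(p_h, p_e_h, p_e_nh):
    """P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    num = p_e_h * p_h
    return num / (num + p_e_nh * (1.0 - p_h))  # P(~H) = 1 - P(H)

def posterior_odds_form(p_h, p_e_h, p_e_nh):
    """odds(H|E) = [P(E|H)/P(E|~H)] * odds(H), converted back to a probability."""
    post_odds = (p_e_h / p_e_nh) * (p_h / (1.0 - p_h))
    return post_odds / (1.0 + post_odds)

# Example inputs: P(H) = 0.3, P(E|H) = 0.9, P(E|~H) = 0.2
a = posterior_standard(0.3, 0.9, 0.2)
b = posterior_odds_form(0.3, 0.9, 0.2)
assert abs(a - b) < 1e-12
print(round(a, 4))  # → 0.6585
```

Because odds(H) collapses P(H) and P(~H) into a single term, the odds form makes it visible that there are only three independent inputs: the prior and the two consequent probabilities.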

When the errors are as much against you as you can reasonably believe them to be, then
the conclusion is as much against you as you can reasonably believe it to be (by commutative
logic). Thus, you simply cannot reasonably believe the result is any more against you than
that. Now, yes, there are lots of ways you can be so badly misinformed or ignorant that even
a reasonable belief is false, but that’s true for all beliefs whatever, whether derived by BT or
not. We can only do the best we can do, whatever our methods. And BT reminds us that
there is always some probability of being wrong–since the conclusion of BT is not “the
probability of h is x” but “the probability of h is x given b and e,” where b plus e represents
the sum of all you know at that point in time. The resulting probability has a converse that is
the probability that you are still wrong (and that will be because of some fact you did not
know about and thus did not include in b or e…but that’s the whole point of empirical
methods being tentative and revisable, and why science, and thus also history, can advance in
knowledge).

MALCOLMS • OCTOBER 10, 2012, 11:23 PM

I have a couple of quick remarks before I move onto something more substantive in my next comment.

1. I feel it is bad form for you to have responded to Ian’s 2 posts only via the filter of my comments on your
post. Originally you had said that you would reply to his criticisms only if he posted a comment on your blog, so
I summarized his arguments in the comments for you to respond, and you did so. Nothing wrong with that. But
when you decided to devote a blog post to his review, I think it would’ve been better to directly quote from his
blog. As it is now, one can’t even tell for sure from your post if you’ve even read his blog. Maybe I’ve
mischaracterized or misrepresented him, or at least Ian might feel that way. If the situation were reversed you’d
probably want that courtesy as well.

I realize that you said, “I will reproduce in bold the observations of MalcolmS on what Ian argues (which also
does a fine job of summarizing Ian’s substantive points),” but I still don’t think this goes far enough.

2. I also feel that we (I’m guilty of it, too) may have been too harsh on Ian in describing his criticisms of
the way you’ve presented the mathematics as not being “mathematical.” That is, originally we became aware of
his posts through some commenters on your blog who felt that Ian had demolished the mathematical content in
your book. A read through of the review, though, revealed that this was not the case, as most (if not all) of the
criticisms did not involve any actual mathematical errors. However, his post was merely titled a “Mathematical
Review” of the book, and even in mathematics, a review of a book (such as you’ll find, e.g., in the “Bulletin of
the AMS,” or even Amazon book reviews), usually addresses the author’s choice of content and way of
presenting it, not the accuracy of the assertions, so while many of his criticisms may have been merely stylistic,
they still would be considered mathematical.


RICHARD CARRIER • OCTOBER 12, 2012, 9:21 AM

1. I hope Ian will correct me if I (or you) mistook any of his arguments. You did such a good
job, I could not improve on what you said (beyond what I said in turn, if that even counts as
improvement). But yes, we could both be wrong about that. And Ian is welcome to point out
any instances of it. I would be more than comfortable with our roles reversed in this instance.

2. I agree. We both approached this from a context Ian might not have intended. Our points
remained valid, but mine at least were phrased a little too harshly, for thinking he was
accomplishing what naysayers said he was, when in fact he himself didn’t say that’s what he
was doing (or at least not in the posts themselves…I didn’t read his comments elsewhere). I
have fallen victim to this effect before.

To you and Ian, I apologize for the mistake. I don’t know how to avoid it in future (since once
an argument has been framed by a third party, there is no obvious way to guarantee you’ll
realize that that has been done and you are still looking at it from that POV), but I’ll try to be
more mindful of it.

IRRCOIAN • OCTOBER 12, 2012, 12:33 PM

“mine at least were phrased a little too harshly”

Definitely don’t worry about tone towards me. As long as we’re all willing to communicate, a
seasoning of gentle hostility is fine by me. It only becomes a problem, imho, when used as an
excuse to prevent discussion.

I didn’t have a problem with this post, or how it was structured.

I would say that Malcolm was very useful in pointing out sloppy mistakes on my blog, and
helping me communicate more clearly. So kudos for that. Due to his critique, some of the
things quoted in this post, I’ve now rephrased and improved.


MALCOLMS • OCTOBER 11, 2012, 5:24 AM

A few of us were continuing the conversation over on Ian’s blog after I had posted my comments on yours. A
couple of problems with applying BT came up, which I would like to illustrate with some examples you discuss
in your book.

First of all, there’s the reference-class problem. You of course discuss this in your book, and Ian mentions it in
his comment above, but I think it would help to examine it in a particular case, say, the case of a claim that there
was a global 3-hour darkness during the day either in 1983 or 2000 years ago. The hypothesis is that such a
darkness actually occurred (i.e., the claim of it having occurred is true), so your prior, employing the method you
prefer for solving these problems, would be the a priori probability that such a claim would be true, before any
consideration of evidence used to support or refute this claim.

But what other claims would be in this reference class? Only claims of 3-hour worldwide darkness? How many
of them have there been? You would of course agree that this class is too narrowly defined. OK, then how broad
should we make it? All claims of darkness over a large area for an extended period of time? There are still not
that many cases (can you think of any?). Then we would have to expand it to include “comparable” cases, but
how do we know that another claim is comparable? Do we mean claims of events that are equally improbable?
That would require estimating the probability of the 3-hour worldwide darkness, without conditioning on the
claim, with all its concomitant difficulties (recall, you’re trying to avoid this in the first place). Moreover, one
would have to estimate the probability of all other claims to determine whether they lie within this class or not, a
probably impossible task. Your example of a person who claims to have been struck by lightning 5 times also
suffers from this deficiency.

In your worked example you just picked a number out of thin air – 1% – without any real attempt to argue
for it, other than to say that it is “small” (apparently because you are implicitly doing some sort of application of
BT involving the raw prior probability of such a darkness). The exact value was not of concern to you because
you were trying to show that the same prior could give rise to different results depending on the likelihood of the
evidence, conditioning on various hypotheses, but you seem to have overlooked that in at least one case the
value of P(H|B) is crucial (which will be my next point below).

So how would you go about trying to estimate, to the best of your ability, P(H|B) for either of these examples? If
you can’t do this, how could you possibly hope to use BT to estimate how likely it is that Jesus existed?

Moreover, the case of the claimed darkness in 1983 illustrates another difficulty in applying this method: A
fortiori arguments only work if you can bound both the numerator and denominator away from zero. I stated in
my earlier comment, which you quoted above, that a fortiori reasoning should always be valid, but that is only
partially true. While it is true that bounds on the 3 inputs to BT will always yield a bound on P(H|E), it is not
always true that improving those bounds will improve the final result; in particular, if P(H) or P(E|H) has a lower
bound of zero, changing the upper bound of P(E|~H) won’t have any impact on the lower bound for P(H|E)
(which will also be 0), and similarly if P(E|~H) has a lower bound of zero, changing the upper bound of P(E|H)
or P(H) won’t have any impact on the upper bound for P(H|E), which would be 1. The worst-case situation would
be where both P(E|~H) and either P(E|H) or P(H) are so small that they can’t be bounded from below. In that
case we will know absolutely nothing (0 <= P(H|E) <= 1) and changing the nonzero bounds won’t help.
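This failure mode can be sketched numerically (the helper functions below are hypothetical, not from the book): propagating interval bounds through BT shows that a zero lower bound on P(H) pins the posterior's lower bound at zero no matter how much the upper bound on P(E|~H) is tightened.

```python
# Sketch: propagate interval bounds on the three inputs through
# Bayes' Theorem. P(H|E) rises with P(H) and P(E|H) and falls with
# P(E|~H), so the posterior's lower bound uses the worst case for H.

def posterior(p_h, p_e_h, p_e_nh):
    num = p_e_h * p_h
    den = num + p_e_nh * (1.0 - p_h)
    return num / den if den > 0 else float('nan')  # 0/0: no information at all

def posterior_bounds(p_h, p_e_h, p_e_nh):
    """Each argument is a (low, high) interval; returns (low, high) for P(H|E)."""
    low = posterior(p_h[0], p_e_h[0], p_e_nh[1])
    high = posterior(p_h[1], p_e_h[1], p_e_nh[0])
    return low, high

# With a zero lower bound on P(H), tightening the upper bound on
# P(E|~H) tenfold does nothing to lift the posterior's lower bound:
print(posterior_bounds((0.0, 0.01), (0.5, 1.0), (0.001, 0.1)))   # low stays 0.0
print(posterior_bounds((0.0, 0.01), (0.5, 1.0), (0.001, 0.01)))  # low still 0.0
```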

We can see this in the example from 1983, where both P(H) (the prior probability that the claim of the darkness
is true) and P(E|~H) (the probability that all this photographic, etc., evidence existed despite there having been
no such darkness) would both be very small. Could you put a lower limit on either and say, "well, the probability
has to be larger than X"? If not, then you will know nothing. (Moreover, your estimate of P(E|H) ~ 1 is way too
large as well, since the evidence would include your own experience of never having heard of this event before,
despite having been alive at that time.) It was only by plugging in an artificial value of 0.01 for P(H) that you
managed to get a value for P(H|E) close to 1. In your book you claimed that you were being conservative, but
this is only true for the Gospel example. In the 1983 case a conservative estimate would be a smaller one, not a
larger one. How small would P(E|~H) have to be before you would really start to believe in such an event?
Everybody would probably have a different answer – I don't even know what mine would be.

I'll mention here one other point with this example (in the Gospel case) that Ian raised concerning the miraculous.
In particular, if your hypothesis includes the possibility of a supernatural darkness, which I think you allowed in
your book, then when calculating P(E|H) you would have to consider the possibility that a supernatural agent
structured the evidence to be the way that it is, with reports only appearing in one source. How would one
calculate the probability that God prevented people all over the earth from recording this event, if one is
conditioning on God having caused the darkness?


RICHARD CARRIER • OCTOBER 12, 2012, 10:28 AM


But what other claims would be in this reference class? Only claims of 3-hour
worldwide darkness? How many of them have there been? You would of
course agree that this class is too narrowly defined. OK, then how broad
should we make it?

As broadly as you can reasonably believe acceptable. Or even more broadly than that, if the
conclusion still comes out a fortiori in your favor (and that is obviously what I did in the
book, since the prior I chose was vastly larger than anyone could ever reasonably
believe it to be, no matter what reference class they preferred…you call it “artificial,” which is
correct, but moot, since it still works, so it doesn’t matter how artificial it is, as long as it is
defensible as an a fortiori estimate).


All claims of darkness over a large area for an extended period of time?
There are still not that many cases (can you think of any?).

Of course Christian apologists do: hence the claim that the Evangelists just meant a dark
cloud front. Certainly, if you start from that assumption, the math comes out differently; it’s
just that if you did a Bayesian analysis on what the Evangelists meant by their words, it comes
out as almost certainly not this.


Then we would have to expand it to include “comparable” cases, but how do
we know that another claim is comparable? Do we mean claims of events
that are equally improbable? That would require estimating the probability of
the 3-hour worldwide darkness, without conditioning on the claim, with all its
concomitant difficulties (recall, you’re trying to avoid this in the first place).
Moreover, one would have to estimate the probability of all other claims to
determine whether they lie within this class or not, a probably impossible
task.

Indeed. That’s why historians can’t and don’t need to be that precise. They can argue a
fortiori and get along fine. Hence my example of the asteroid (p. 85).

Really, scientists do this, too: when they exclude, for example, “magic” as an explanation of a
drug study’s results. Try getting an exact estimate of the prior probability that magic has
relevantly affected the data of any scientific study. Scientists, if pressed, will admit they have
no idea what that prior is or how to calculate it, but that they have more than enough reason
to believe it is sufficiently low that they can safely disregard it–unless a difference in
consequent probabilities came along that was superbly huge; then we’d have to start thinking
more seriously about the prior probability of magic.


So how would you go about trying to estimate, to the best of your ability,
P(H|B) for either of these examples? If you can’t do this, how could you
possibly hope to use BT to estimate how likely it is that Jesus existed?

That last is actually a lot easier than you think. But you’ll have to wait for my next book to
see why.

If we ignore what I will actually do and ask instead what we would do with no clear data set
to start from, then even the worst case would leave us only to build out what the difference in
consequent probabilities is (which means, what it is at best; I have found the odds form
approach is most agreeable here) and then explain what prior you would have to accept in
order to accept historicity (and what prior is needed to be highly confident in historicity,
which is not the same thing). This then re-frames the debate around whether any of those
priors is really credible or not, and thus whether asserting historicity requires an act of
desperation, or whether asserting mythicism does, or whether agnosticism is what falls out as
the most credible position. Either way, material progress…and finally some facts and
numbers historians can start debating.


Moreover, the case of the claimed darkness in 1983 illustrates another
difficulty in applying this method: A fortiori arguments only work if you can
bound both the numerator and denominator away from zero. I stated in my
earlier comment, which you quoted above, that a fortiori reasoning should
always be valid, but that is only partially true. While it is true that bounds on
the 3 inputs to BT will always yield a bound on P(H|E), it is not always true
that improving those bounds will improve the final result; in particular, if
P(H) or P(E|H) has a lower bound of zero, changing the upper bound of
P(E|~H) won’t have any impact on the lower bound for P(H|E) (which will
also be 0) and similarly if P(E|~H) has a lower bound of zero, changing the
upper bound of P(E|H) or P(H) won’t have any impact on the upper bound for
P(H|E), which would be 1. The worst case situation would be where both
P(E|~H) and either P(E|H)or P(H) are so small that they can’t be bounded
from below. In that case we will know absolutely nothing (0<=P(H|E)<=1)
and changing the nonzero bounds won't help.

This seems to confuse physical with epistemic probabilities. No epistemic probability can
ever be zero (except in very specific cases not applicable here: see Axiom 4, pp. 23-26, with
endnotes; also, see my “nonzero” remarks on pp. 55, 62, 80, 83, 94, 246-50, 260, 268).

Certainly, when we know only that (0<=P(H|E)<=1), we know nothing. Historians often do
face that reality, and usually accept it (although for some reason religiously charged subjects
seem impervious to that humility, sometimes even when approached by secular historians).
But we're often not in so dire a state of ignorance. In all other cases, we only care about one
bound, not the other. Because we only care what the probability is that we are wrong. In
particular, how high that probability can reasonably be. The other bound (how much more
probably right we might be than that) is generally of no use knowing, even if it could be
known (and it usually can't). For example, I would guess the probability of supernatural
phenomena (suitably defined) is actually in fact zero (see The God Impossible), but I don’t
know that for sure. What I want to know is how likely it is that I am wrong about that (or in
particular, how likely it is that I am wrong to say there is no supernatural phenomena in this
universe). And that only relates to the other bound: which, arguing a fortiori, is the highest
probability I can reasonably believe that bound to be.

I haven’t explored that, since I’ve never had to–nothing has come even close to it, so I can
use wildly exaggerated bounds and still conclude no miracle occurred in any given case (the
Gospel darkness, for example). But if evidence started getting strong, then I’d have to start
seriously examining what that bound is, which amounts to examining at what point I would
believe in the supernatural (in effect, what differential in consequent probabilities would it
take), and I can certainly countenance there being such a point (since I believe the resort to
excuses would certainly get implausible at some point, as I discuss in Defining the
Supernatural).

So if we only ask what that one bound is for all three key terms (P(E|H), P(E|~H), and
P(H)), and indeed even allow exaggerations beyond that bound, always a fortiori, then there
is never a problem–unless the result comes out ambiguous or close and we want to be more
certain, then we can take away the exaggerations and try to generate some numbers closer to
the actual data, whatever that happens to be.

For example, if h is “my wallet was stolen” and I want to argue for that a fortiori, I would
pick a prior I know is too low (let’s say that turns out to be 0.6) and a ratio of consequents I
knew was as much against h as I could reasonably believe possible (let’s say, P(E|H) =
P(E|~H)). The result would be my wallet was probably stolen, but I couldn’t be entirely
certain (since there would be a 40% chance it wasn’t). Any improvement of the numbers
toward what their actual values were would increase that probability (and thus make h even
more likely) but at the cost of less confidence (since my model is getting less a fortiori and
thus approaching greater possibility for error; cf. p. 87). Unless, of course, I could improve
those numbers with minimal loss of confidence: e.g., if my starting estimates were not merely
a fortiori, but exaggeratedly a fortiori, such that moving them over would actually keep
them a fortiori (as we could surely do in the Gospel darkness case). But any further, and I
can no longer have confidence in my numbers, and thus in the premises, and thus in the
conclusion. The resulting probability would therefore be useless to me (hence: pp. 111-14).
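For concreteness, the wallet example can be run as a quick sketch (the numbers are those just given; the function is my own illustration): with the likelihoods set equal, the likelihood ratio is 1, so the posterior simply equals the deliberately low prior.

```python
# Sketch of the wallet example: a deliberately low prior of 0.6 and
# P(E|H) = P(E|~H), the most hostile likelihood ratio reasonably
# believable, which leaves the posterior equal to the prior.

def posterior(p_h, p_e_h, p_e_nh):
    """P(H|E) by Bayes' Theorem, with P(~H) = 1 - P(H)."""
    num = p_e_h * p_h
    return num / (num + p_e_nh * (1.0 - p_h))

p = posterior(0.6, 0.5, 0.5)  # equal likelihoods: ratio = 1
print(round(p, 2))  # → 0.6: probably stolen, with a 40% chance it wasn't
```

Raising the likelihood ratio above 1 (e.g. posterior(0.6, 1.0, 0.5)) pushes the posterior above the prior, which is the "improvement of the numbers toward their actual values" described above.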


I’ll mention here one other point with this example (in the Gospel case) that
Ian raised concerning the miraculous. In particular, if your hypothesis
includes the possibility of a supernatural darkness, which I think you allowed
in your book, then when calculating P(E|H) you would have to consider the
possibility that a supernatural agent structured the evidence to be the way
that it is, with reports only appearing in one source. How would one calculate
the probability that God prevented people all over the earth from recording
this event, if one is conditioning on God having caused the darkness?
That’s called Cartesian Demon reasoning and has been a standard question in epistemology

for hundreds of years (thus it is not a problem at all unique to BT, but affects all methods and
epistemologies whatever). You can always ask that question. For example, “what if God
arranged all my drug study data to hide the fact that the drug killed everyone who took it?”;
“What if God arranged all my Large Hadron Collider data to hide the fact that it generated a
dozen unicorns and a flying turtle?” Etc.

The short answer is that Cartesian Demons have vanishingly small priors and therefore can be
ruled out (to see why, revisit my discussion of ad hoc theory enhancement: pp. 80-81, and
“gerrymandering” in the index; also, think of the many degrees of CD: from Punked, to The
Truman Show, to The Matrix, to the scenario which you just described, which must
necessarily have a prior even lower than those).

That does mean CDs can always evade detection, but then they are defined in exactly that
way (a CD is by definition an entity that always evades detection). It remains the case that
they are very unlikely to exist. That does not mean the probability of their existing is zero,
however. People often have a hard time grasping the distinction. And a lot of it has to do with
problems in the definition of knowledge standardly used in philosophy, as justified true belief.
We can only ever have justified true belief in the improbability of CDs; we can never have
justified true belief in the impossibility of CDs (as such knowledge is logically impossible;
unless someone, someday, proves CDs to be logically impossible).

And, of course, weaker CDs can be exposed eventually or in principle (as they are in
Punked, The Truman Show, and The Matrix, and God could likewise expose himself or if
God is not perfect, someone else might manage to expose him–there are some amusing
ancient Jewish legends along those lines). But the stronger the CD, the more a priori
improbable that CD is (since you have to heap on more and more undocumented
assumptions to shield that CD from detection). You can also always invent a CD to
explain away the exposure of another CD (The Truman Show inside The Matrix…think,
Inception), but you can see why the prior probability of that scenario is vastly less (since
when two very small prior probabilities multiply, the result is an improbability many, many,
many times smaller than either).
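That last point is trivially checkable; the 1e-6 prior for a single Cartesian-Demon scenario below is purely illustrative:

```python
# Stacking one Cartesian-Demon scenario inside another multiplies their
# (already tiny) priors together.
p_one_layer = 1e-6                          # illustrative prior for one CD layer
p_two_layers = p_one_layer * p_one_layer    # a CD explaining away another CD
print(p_two_layers)  # ≈ 1e-12: a million times less probable than one layer
```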


MalcolmS • October 13, 2012, 11:00 PM

Dr. Carrier,

You still don’t seem to grasp the full extent of this a fortiori reasoning problem. I’ll try to
make my point again, hopefully more clearly.

First of all, even though probabilities are not 0 (unless something really is logically impossible,
in which case it probably would not be an interesting historical question), when one does an a
fortiori argument one needs to use ranges for the inputs (P(H), P(E|~H), P(E|H)). In the case
where one cannot give a minimum estimate for a probability the lower bound would be 0. For
example, if we say that we know P(H) is very small, say, lower than 1%, then our range for
P(H) would be 0<P(H)<0.01, so one still has to deal with a zero bound in calculations, even
though the probability itself would never be zero.

So if the lower bound for P(H) or P(E|H) is zero, then the lower bound for P(H|E) is zero.
So far no problem. But what if the lower bound for P(E|~H) is also zero? In that case, this
gives an upper bound for P(H|E) of 1 (because a lower bound for P(E|~H) gives an upper
bound for P(H|E)). Thus in the case where both P(H) (or P(E|H)) and P(E|~H) are known to
be small, but we can't say which one is larger, we will get a range for P(H|E) of 0 to 1.

Here's a numerical example. Let's say we can say with confidence that P(H) < 5% and
P(E|~H) < 2%. (P(E|H) doesn't matter so much in this example, so let's say it is exactly 1.)
What is our possible range for P(H|E)? 0 to 1, i.e., we know nothing. Now let's improve our
bounds on P(H) and P(E|~H) to, say, 1% for both. Now what is our possible range for
P(H|E)? Still 0 to 1. So tightening our range didn't help at all.
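The bound arithmetic here can be checked with a short sketch (the endpoint values chosen inside the ranges are illustrative):

```python
def posterior(p_h, p_e_h, p_e_nh):
    """P(H|E) = P(H)P(E|H) / [P(H)P(E|H) + P(~H)P(E|~H)]."""
    return (p_h * p_e_h) / (p_h * p_e_h + (1 - p_h) * p_e_nh)

# Stipulated bounds: 0 < P(H) < 0.05, 0 < P(E|~H) < 0.02, P(E|H) = 1.
# Pushing the inputs to opposite ends of their ranges drives P(H|E)
# arbitrarily close to 0 or to 1, so the bounds alone tell us nothing.
near_zero = posterior(1e-9, 1.0, 0.019)  # tiny prior, comparatively large P(E|~H)
near_one = posterior(0.049, 1.0, 1e-9)   # larger prior, tiny P(E|~H)
print(near_zero, near_one)  # ≈ 0.0 and ≈ 1.0
```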

When I made my previous post I had claimed that your example of a claim of worldwide
darkness in 1983 falls into this type of situation, but I had overlooked the fact that in your
book you explicitly stated that P(H) < P(E|~H)/1000. In that kind of case, where one has a
bound of the ratio P(H)/P(E|~H), then this problem is avoided. But having such an estimate
for the ratio is crucial; without it one couldn't say anything about P(H|E).

For that specific example in your book I also would dispute the estimate of P(E|H) ~ 1, since
one must take into account that one had never heard about this claim until recently, and also
the fact that one had not witnessed the darkness (or a report of it) even having been alive at
the time. For example, if one said that P(E|H) = 0.1%, then one gets about 50% for P(H|E),
and this is true no matter how small you choose P(H) and P(E|~H) so long as their ratio is
1:1000. Thus whether H is almost certain, fifty-fifty, or even improbable depends critically on
the exact ratio of P(H)/P(E|~H) to P(E|H), which would require a more elaborate argument
than the one you present in your book.

Therefore I assert that this case is actually one of those where BT can't help us much in
getting a good handle on whether the claim is true or not. Of course it's not just BT that has
this problem – any attempt to reason this out logically would founder on the same problems
as well. In fact, BT has an advantage here in that it allows one to see precisely how and why
this claim is difficult to evaluate, whereas if one were just to respond intuitively one might not
grasp how sensitive one's answer is to small changes in assumptions.
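That sensitivity can be shown numerically. One assumption to flag: the 50% figure only comes out if the stipulated 1:1000 ratio is read as P(E|~H) : P(H), i.e. P(H) = 1000 × P(E|~H); the likelihood values below are otherwise illustrative:

```python
def posterior(p_h, p_e_h, p_e_nh):
    """P(H|E) = P(H)P(E|H) / [P(H)P(E|H) + P(~H)P(E|~H)]."""
    return (p_h * p_e_h) / (p_h * p_e_h + (1 - p_h) * p_e_nh)

# Hold P(H) = 1000 * P(E|~H) and shrink both toward zero: the posterior
# barely moves, but changing P(E|H) alone flips the verdict entirely.
for p_e_nh in (1e-5, 1e-7, 1e-9):
    p_h = 1000 * p_e_nh
    print(posterior(p_h, 0.001, p_e_nh))  # ≈ 0.5 at every scale
    print(posterior(p_h, 1.0, p_e_nh))    # ≈ 1.0 once P(E|H) is high
```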


Richard Carrier • October 15, 2012, 11:41 AM


MalcolmS: …when one does an a fortiori argument one
needs to use ranges for the inputs (P(H), P(E|~H), P(E|H)).

You don’t need to give a range. Because the other end of the margin is
irrelevant. Only one margin is relevant: the one that draws the line between
whether you are right or wrong, and does so beyond where that line
actually is, in favor of your being wrong. Hence producing a conclusion
a fortiori (“from the stronger reason”). The other margin does not
produce an a fortiori argument but an a tenuiori argument, which is a
fallacy (because that line between whether you are right or wrong is the
weakest, being much less likely correct, making your argument look the
stronger but actually making it weaker). You therefore don’t need to
bother with that other bound.

That doesn’t mean you can’t work with different ranges, if there is
something useful in doing so, e.g. to show what the conclusion is with
different assumptions.

For example, I estimate the prior probability of a passage about Jesus in a
non-Christian source before the 4th century being an interpolation is a
fortiori better than 1 in 200. We could say the other bound is then 1 in 1
(i.e. that they are all interpolations and no evidence could ever show
otherwise), but what use is that? None. Obviously if the most against-
interpolation bound (1 in 200) produces a result, in the face of the
evidence for a specific passage, that that passage is probably an
interpolation, then using the other bound (1 in 1) will make the probability
that that passage is an interpolation much higher (in fact, it automatically
becomes 100% no matter what evidence you did or didn’t produce). But
of what use is knowing that? None. Only the a fortiori bound produces a
conclusion that can be at all persuasive. Moreover, only the a fortiori
bound produces a conclusion that we can have a high confidence in. I
cannot have a high confidence in a conclusion produced from the other
bound (of 1 in 1, or even anything close to 1 in 1). So the conclusion that
comes from that bound produces no appreciable confidence. It’s
therefore useless even to us, much less for the task of persuading others.
And this is true even in scenarios like you describe.


But what if the lower bound for P(E|~H) is also zero?

It depends on whether the “lower” bound here is a fortiori or not. If not,
it’s moot (since, as I just explained, such a bound cannot produce an a
fortiori argument nor generate confidence and is therefore useless). But if
it is a fortiori, as are all the other bounds being used, and the scenario is
as you describe:



Thus in the case where both P(H) (or P(E|H)) and P(E|~H)
are known to be small, but we can’t say which one is larger,
we will get a range for P(H|E) of 0 to 1.

If we assume P(E|H) is high, then this translates to: we don’t know. You
have just mathematically modeled a class of historical claims whose truth
is unknowable on present evidence. That is not a problem. Because we
already know that most historical claims are such. Unless you want to
claim all historical claims are described by this scenario; but clearly you
can’t be saying that. So what use is this as an objection to anything I argue
in PH? That some claims are undecidable is already affirmed repeatedly in
PH.

But let’s instead imagine a scenario in which all three are low (P(H),
P(E|H) and P(E|~H)), that is, the a fortiori bound for each is somewhere
uncertainly close to zero.

Note then that the key statement here would be “we can’t say which one
is larger.” In the case of the likelihoods, that translates to: we do not know
what the ratio of likelihoods would be in the Odds Form of BT–in fact, not only
do we not know them, but we don’t even know them a fortiori. Which
means, for all we know, that ratio is 1 to 1: we have no knowledge
establishing it is any higher, nor any knowledge establishing it is any
lower–because if we had either, “we can’t say which one is larger” would
be false, and the scenario would not apply.

But if for all we know that ratio is 1/1, and the ratio of priors is
approaching 0, then we should conclude H is probably false. Until we get
information that allows us to argue that the ratio of likelihoods (i.e. the
evidence) favors H over ~H.

Indeed, that’s the definition of “having evidence for H”: having a ratio of
likelihoods that favors H.
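The reasoning in the last few paragraphs is just the odds form of Bayes’ theorem; a minimal sketch, with all numbers purely illustrative:

```python
# Odds form of BT: posterior odds = prior odds × likelihood ratio.
def posterior_odds(prior_odds, likelihood_ratio):
    return prior_odds * likelihood_ratio

# A likelihood ratio of 1:1 (no evidence either way) with a prior
# approaching zero leaves H probably false:
print(posterior_odds(1e-4, 1.0))  # ≈ 1e-4
# "Having evidence for H" means a likelihood ratio above 1; a large
# enough ratio can flip the odds in H's favor:
print(posterior_odds(1e-4, 5e4))  # ≈ 5.0
```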

The question then is how much evidence favoring H do we need in order
to argue that H is true when H has a very small prior. That is the unusual
scenario you are describing. And that gets into a whole slew of other
questions.

If we are dealing with an absurd claim, then we will have a vanishingly
small prior and a vanishingly small likelihood ratio. We should conclude
against H.

That leaves only one scenario to be concerned with: one where the a
fortiori prior for H approaches very near to 0 but the ratio of likelihoods
is vastly in favor of H. In that case we can’t use “1” and “0” because now
we are dealing with a case where we have to start taking seriously the
boundaries of our epistemic certainty. For example, as I said before,
though I suspect the prior probability of miracles is 0, my confidence in
that bound is not epistemically high. It therefore can never be an a
fortiori bound.

Thus, if I were faced with a case where the prior is vanishingly small but
the evidence is extraordinarily strong (this has never happened, and is a
very bizarre scenario–a red flag for any philosophical argument: if you
have to create a completely unrealistic scenario in order to make a point,
odds are the point wasn’t worth making, especially if your aim is to
discuss how to approach reality), then I have to ask what I think the
epistemic prior probability of (let’s say) a miracle really is, in other words
how much evidence will finally convince me miracles exist (which
translates to: how improbable P(E|~H) must be, relative to P(E|H)).

I know there is some such probability, as I have described scenarios that
would persuade me before (in Why I Am Not a Christian, for
example), so all I need do is figure what my probability estimates really
translated to in those cases, in particular the a fortiori bounds, and then
benchmark back to the case at hand. And that can then be open to further
debate, if someone wants to insist I’m wrong to set that benchmark. And
so progressive debate ensues.

I actually discuss these kinds of bizarre scenarios (and how they differ
from actual scenarios of low P(H) and P(E|~H)) in PH, pp. 246-55 (but
see again pp. 243-46 for context). The example of Matthias the Galilean
industrial mechanic, and what evidence it would take to persuade a
historian that such a man existed, is exactly on this issue (being a realistic
example): of starting with very low priors, but then getting good evidence
(yet notice here the most plausible a fortiori prior, the lowest we can
reasonably believe it to be, will not be zero, or anywhere near as low as
in the case of successful alchemy or sorcery, the counter-examples I
explore).

So, to adapt the Matthias the Galilean industrial mechanic example to your
numbers:


Here’s a numerical example. Let’s say we can say with
confidence that P(H) < 5% and P(E|~H) < 2%.

This cannot be a fortiori. Because your P(H) < 5% is the wrong bound
(the useless one). The bound we want is the lowest we can reasonably
believe P(H) to be, which I proposed is P(H) > 0.000001. I also
explored the highest reasonable prior and found it to be 0.002, but that is
too high, because I know the actual prior is less than that, and so I cannot
use that with any confidence. I can accept debate over the other bound of
0.000001, however, since one might make an evidence-based case that
that is too low (in fact, I actually do believe it is too low), but even then all
they would do is end up making a case that the a fortiori bound is
somewhere else (the whole point of that section: these are the kinds of
debates historians should be having), although I suspect it will still be
closer to 0.000001 than to 0.002.

Your P(E|~H) < 2% however would then be the right bound, since to
make an a fortiori argument we want to know the highest this probability
could reasonably be. And if that’s what we had, a 1 in 50 chance a
source is lying or in error (or whatever), and we were confident the odds
were at least that high, then we wouldn’t have sufficient evidence to
believe in Matthias the Galilean industrial mechanic; we would believe that
the source probably made him up. But of course that assumes that’s what
we premised, that the source is that unreliable on claims like this. And
that might be very arguable; indeed, it might not be a reasonable belief at
all, no confidence being warranted in so high an estimate of the likelihood
of fabrication on that kind of point (and so on).
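The arithmetic behind those two outcomes can be sketched (P(E|H) = 1 is assumed here for simplicity, and 1e-8 is an illustrative stand-in for odds of error of “many millions to one”):

```python
def posterior(p_h, p_e_h, p_e_nh):
    """P(H|E) = P(H)P(E|H) / [P(H)P(E|H) + P(~H)P(E|~H)]."""
    return (p_h * p_e_h) / (p_h * p_e_h + (1 - p_h) * p_e_nh)

prior = 0.000001  # the a fortiori prior proposed for Matthias

# A source with a 1 in 50 chance of lying or erring cannot overcome it:
print(posterior(prior, 1.0, 1 / 50))  # ≈ 0.00005: probably made up
# Evidence whose odds of being forged or in error are ~millions to one
# more than overwhelms the same tiny prior:
print(posterior(prior, 1.0, 1e-8))    # ≈ 0.99
```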

For example, finding a sarcophagus in the Palestinian region for Matthias
the Galilean industrial mechanic, much like the one we have found in
Turkey (cf. n. 36, p. 331), would not have a 1 in 50 chance of being
forged or in error; the odds of that would be many millions to one. It
would therefore more than overwhelm even an a fortiori prior of
0.000001. When we turn to the case of a historian referring to him, then
the matter may be more complicated, and may end in uncertainty–for
example we could conclude that the historian’s reliability on such details
must be at least X (X being a frequency of correctness on such points,
and thus the prior) in order for us to be confident that Matthias the
Galilean industrial mechanic existed (rather than “was made up” etc.).

This X might be inside the range of uncertainty and thus not capable of
making an argument a fortiori. In which case we would state as much: in
colloquial terms, we would say he might have existed, that it’s plausible
but we’re not sure; in exact terms, we’d say that we can be highly
confident he existed only if we adopt assumptions (about the prior
probability and/or the likelihood of fabrication or error) in which we are
not highly confident, which means we can be confident neither that he did
exist, nor that he didn’t (that latter distinguishing a case like this, from a
more absurd case like alchemy: see again the distinction drawn between
plausible unknowns and effective impossibilities in Axiom 5, pp. 26-29).


Now what is our possible range for P(H|E)? Still 0 to 1. So
tightening our range didn’t help at all.

Here you are making a moot point. That the “possible range” includes
many other values is of no use knowing. Because that “possible range”
will include things like “the frequency of interpolated passages about Jesus
in non-Christian literature before the 4th century is 100%” and no a
fortiori argument can proceed from a premise like that. So that our
“possible range” includes it is irrelevant. We don’t care what the “possible
range” is. We only care what the a fortiori result is. And that only uses
one bound for each value. It therefore does not produce a range, other
than “X or less” or “X or more” (depending on whether we are arguing
a fortiori for or against H).


When I made my previous post I had claimed that your
example of a claim of worldwide darkness in 1983 falls into
this type of situation, but I had overlooked the fact that in
your book you explicitly stated that P(H) < P(E|~H)/1000.
In that kind of case, where one has a bound of the ratio
P(H)/P(E|~H), then this problem is avoided. But having
such an estimate for the ratio is crucial; without it one
couldn't say anything about P(H|E).

Indeed. This is why I also discuss the Odds Form in the book (someone
having rightly convinced me of its importance) and why I discuss the tactic
of employing artificial ratios even when using the straight form (as in my
discussion of neighbors with criminal records on pp. 74-76).

So that can’t be the problem.

That leaves only this…


For that specific example in your book I also would dispute
the estimate of P(E|H) ~ 1, since one must take into
account that one had never heard about this claim until
recently, and also the fact that one had not witnessed the
darkness (or a report of it) even having been alive at the
time. For example, if one said that P(E|H) = 0.1%, then
one gets about 50% for P(H|E), and this is true no matter
how small you choose P(H) and P(E|~H) so long as their
ratio is 1:1000. Thus whether H is almost certain, fifty-fifty,
or even improbable depends critically on the exact ratio of
P(H)/P(E|~H) to P(E|H), which would require a more
elaborate argument than the one you present in your book.

You are introducing elements not stipulated in the analogy. That makes
this a straw man argument. I never said anything about “one had not
witnessed the darkness despite being alive at the time” nor does “one had
never heard about this claim until recently” make a difference if, for
example, you are in school and hearing all sorts of things for the first time.
In other words, if “hearing about this claim for the first time” is not
unexpected, it makes no difference to the consequent probability; you
have to stipulate that it is unexpected, which changes the scenario.

Obviously, if you change the scenario that has been stipulated, then you
change how it gets modeled in BT. That is not an argument against a
fortiori reasoning.

If we stipulated the scenario you do, that we have no personal memory of
the event even though we should have, and only just now are hearing
about it and that this would be strange, then indeed we may be in a state
of indecision, given all the other evidence there is. That simply isn’t the
scenario I posited. But we could posit it, as an example of a bizarre
scenario where we might not be able to know what is true. That just
wouldn’t be relevant to the point I was making with the analogy there
(which was to illustrate what it would take to convince us, not what it
would take to produce uncertainty). Nor would that scenario be
analogous to any actual situation we are ever in (since I cannot think of a
single “comparably incredible claim” for which I have such a comparably
vast scale of evidence contradicting my own memory, outside of a Philip
K. Dick novel).

In short, I see no objection here to a fortiori reasoning. All I see is a
recognition that some claims are unknowable (both in practice, and in
extremely bizarre fiction). Which is already argued in Proving History.


SC_7BCA544D596F84A5F56D0F9674C0E22E • October 11, 2012, 6:48 AM

Professor Carrier, maybe you should change your main claim, and assert that you are not doing Bayesian
statistics at all, but instead what you are doing is formally correct fuzzy logic, as defined by Lotfi Zadeh and
others.

As soon you assert that you are manipulating “truth values” (subjective statistics), and not actual statistics, many
of the objections raised by statisticians may disappear.

Just a thought.

Richard Carrier • October 12, 2012, 10:36 AM

Because it’s not exactly the same thing. Fuzzy logic involves a much broader and more
complex system of rules, terms, and procedures, almost none of which is of practical use to
historians. Although I would expect any Bayesian model can be described in fuzzy logic, that
does not mean doing so is useful (it’s just too much work for no practical gain: see my
discussion of this problem in the case of Dempster-Shafer theory, in note 19, page 303). This
is rather like trying to drive to work by building a car, rather than just using the car already
parked in your driveway.

And I never claim to be doing statistics, either. I occasionally use analogies from statistics, but
I never say I am doing “Bayesian statistics” (that phrase appears nowhere in Proving
History, for example; except in the titles of other people’s books I cite). I am talking about
Bayesian reasoning. There is a significant difference (not only between that and “Bayesian
statistics,” but also between that and “fuzzy logic,” even though they might conceptually
overlap).


irrcoian • October 11, 2012, 3:39 PM

I added another post on error, its causes and effects: http://irrco.wordpress.com/2012/10/11/the-effect-of-error-in-bayess-theorem/


Richard Carrier • October 12, 2012, 11:06 AM

Thank you.

In effect I have responded to the point of that post already upthread.

But overall, the point you make there doesn’t contradict anything I argue in Proving History,
where I repeatedly note the fact that there will be cases too close to call and therefore in
which nothing can be known (and that in fact this happens a lot in history, ancient history
especially). You essentially just built out a detailed mathematical demonstration of why and
when that’s true.

Your examples are also useful starting points, and similar to examples I gave myself (e.g., pp.
250-56 and 212-14, and my whole discussion of Axiom 5, pp. 26-29). If historians were
concerned, for example, they can now start discussing whether your method of deriving the
prior probability that Caesar was in Alexandria at a given date is actually the right way to go
about it, or if there is a much more practical approach (having more to do with the prior
probability that a source claiming he was there is reliable). To see this problem magnified,
look at what I consider to be the wholly incorrect way Michael Martin goes about adducing a
prior probability that Jesus was God’s Atonement Sacrifice in The Empty Tomb, pp. 43-
54, which is similar to what you were attempting for Caesar, only much more obviously
incorrect.

The one main flaw in your post is what seems to be an unjustified inference from (a) we will
have all these errors [true], to (b) therefore these errors will always stack up to such a large
cumulative error as to make knowledge of historical facts impossible [false].

When that does happen, then yes, historical knowledge is impossible. And indeed, you just
defined how historians can determine what is unknowable (like Caesar dicing with
Maxsuma). But it simply doesn’t always happen. We obviously can know some of the dates
that Caesar was in Alexandria, for example, to at least some reasonably high probability.
Even after accounting for all likely sources of error.

You also seem to think that this problem is only introduced by using BT. In fact, BT makes
exactly zero difference to any of this. These errors (and their effects on our uncertainty) exist
no matter what. Thus, historians are saddled with them even if they never use BT. Their only
recourse is to argue fallaciously, or nonfallaciously. If they choose the former, they are writing
fiction. If they choose the latter, they are going to be following BT (whether they are aware of
it or not). Best to just get everyone on the latter page, so they know what they are doing and
how to do it consistently and better. In other words, BT forces historians to confront what
their sources of error really are, and how to compensate for them (or if, in a particular case,
they can).

Which brings me to this:


In reality Carrier (and anyone else doing this) will also be choosing the
definitions, and choosing the reference classes, and there is no similar a
fortiori process for determining which are the least favourable definitions to
one’s cause, and which reference classes are the most troubling, and adopting
those

There actually is. I discuss the criteria of better and worse reference classes in chapter six
(and show how sometimes starting with a different reference class won’t logically get a
different result anyway, since all the data has to go in eventually). Likewise I frequently talk
about the relative merits of simple vs. complicated hypotheses and the way definitions (i.e.
context) affects estimates, and so on. Moreover, once we get to this point, of arguing over
whether we are using a valid definition or reference class, or whether there is a better way to
model the problem, we’re making progress. It is precisely a knowledge and understanding of
BT that makes that possible.

And as soon as I (or anyone else) chooses a definition and a reference class, others can then
start examining whether I am making a sound or unsound start, and argue for a better
definition or reference class, or ask if different ones get different results and why (and
likewise start looking at what possible sources of error there are and what effect they have).
In other words, this starts a useful dialogue that can make progress on almost any question in
the field, as now we know what we’re supposed to be arguing about.

Finally, your endnote 3 seems to ignore my discussion on pp. 240-43 (or any of my other
discussions of staged iteration: see “iteration, method of” in the index). It seems like you are
saying I never discuss this procedure, when in fact I recommend it several times.


irrcoian • October 12, 2012, 11:38 AM

“The one main flaw in your post is what seems to be an unjustified inference from (a) we will
have all these errors [true], to (b) therefore these errors will always stack up to such a large
cumulative error as to make knowledge of historical facts impossible [false].”

But unfortunately, unless you actually do the error analysis numerically, you’ll never know the
difference. And you don’t. Unless I’ve missed it.


Richard Carrier • October 12, 2012, 12:15 PM


But unfortunately, unless you actually do the error analysis
numerically, you’ll never know the difference…

If that were true, then all knowledge would be impossible. Humans don’t
do numerical analyses of daily or professional judgments, for example.
Nor need they. Only when the probabilities get dangerously close to
uncertainty do we need to take greater care figuring out what they are.
Otherwise, no amount of accumulated error is likely to make me wrong to
say “the odds of an asteroid hitting my house today are less than 1 in
1000,” for example. So, too, in historical judgments. Where there is
knowledge to be had.

In general, we can look at all possible known sources of error and judge
at-sight whether they will make a difference or not, were we to work up
their numerical effect. And when it’s obvious they won’t, we don’t need
to waste the time doing it. Just as scientists don’t waste any time trying to
work out the prior probability that magic is affecting their data.

In less clear cases, we just adopt less certain conclusions. For example, I
can show that the actual rate of interpolated references to Jesus in pagan
literature of the first three centuries is higher than 1 in 10 and of
interpolated passages in the NT over that same period is higher than 1 in
400, and if we include interpolated phrases in existing passages, it’s over
1 in 200. So if I adopt a prior for interpolated references to Jesus in
pagan literature of 1 in 200, there is no amount of error that is going to
make me wrong to say “the prior probability a reference to Jesus in pagan
literature is interpolated is no less than 1 in 200” (because in actual fact,
it’s 1 in 10, so any adjustment of the prior toward reality would increase
my result twenty times over; there is no likely accumulation of errors that
is going to make me wrong by a factor of twenty; one would have to
imagine an extremely implausible set of circumstances to get that kind of
effect, which borders on Cartesian Demon reasoning).
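The robustness of that a fortiori prior is easy to demonstrate. The likelihood values in this sketch are invented for illustration, not taken from the argument above:

```python
def posterior(p_h, p_e_h, p_e_nh):
    """P(H|E) = P(H)P(E|H) / [P(H)P(E|H) + P(~H)P(E|~H)]."""
    return (p_h * p_e_h) / (p_h * p_e_h + (1 - p_h) * p_e_nh)

# Illustrative likelihoods, held fixed while the prior is corrected:
a_fortiori = posterior(1 / 200, 0.5, 0.05)  # cautious prior: 1 in 200
actual = posterior(1 / 10, 0.5, 0.05)       # actual rate: 1 in 10
assert actual > a_fortiori  # moving the prior toward reality raises the result
print(a_fortiori, actual)
```

So no plausible accumulation of error in the cautious estimate can overturn a conclusion already stated as “no less than” the a fortiori figure.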

Likewise any other questions. It is simply not necessary to know exact
numbers in order to know general facts like these. That’s why human
reasoning is even possible in the first place. We couldn’t find our way to
the bathroom otherwise.


irrcoian • October 12, 2012, 12:15 PM

“You also seem to think that this problem is only introduced by using BT. In fact, BT makes
exactly zero difference to any of this.”

Not quite. My central point is that your numeric results cannot honestly be interpreted as
probabilities of anything meaningful, because you are running BT informally, on poorly
specified data, with no way of properly quantifying and therefore controlling for errors, and
no way of empirically confirming your inputs.

To the extent that you want to use BT to show flaws in bad arguments, that’s fine.

But you seem to want to make positive arguments too, based on probabilistic calculations.

If that is the case, then the sensitivity of your conclusion to errors in your input data is
crucially important, it makes a big difference.

The problem is, of course, you’ll never actually see the errors if you can’t do proper
quantification. Because you’re never going to use it on anything you can go and check
quantitatively.

Your results are essentially unfalsifiable, at least quantitatively. And because there are a small
number of boolean hypotheses you’re interested in, and you’re generating probability results,
you’re never going to have enough results to confirm that your method works.

As such I think using BT and all this messing around with numbers is at best unnecessary, and
at worst tendentious.

You could have achieved the same methodological critiques in your book without having to
dress them up in the language of probability theory. And dressing them up in probability theory
means you’re here defending your use of probabilistic reasoning, rather than defending your
critiques, which is a shame.

If it helps your intuition to think that way, that’s fine. But I don’t think you are doing what you
seem to want to claim to be doing.


Richard Carrier • October 12, 2012, 12:53 PM


“You also seem to think that this problem is only introduced
by using BT. In fact, BT makes exactly zero difference to
any of this.”

Not quite. My central point is that your numeric results
cannot honestly be interpreted as probabilities of anything
meaningful, because you are running BT informally, on
poorly specified data, with no way of properly quantifying
and therefore controlling for errors, and no way of
empirically confirming your inputs.

Which is a description of all historical reasoning generally. In fact, of
almost all human reasoning generally (in daily life and professions).

The bottom line is: we are already estimating priors and consequents and
coming to conclusions from those estimates. Every historical argument
ever made does that. We’re just being even more imprecise than we need
to be and often illogical about it. We should recognize what we are doing,
do it correctly, and get better at doing it. That's the sum of Proving
History’s argument.
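To make the point concrete, here is a toy calculation (the numbers are invented solely for illustration, not estimates from my book): even when priors and consequents are only known within wide margins, Bayes' theorem still yields a usefully bounded conclusion.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for a simple hypothesis h versus its negation."""
    joint_h = prior * p_e_given_h
    joint_not_h = (1 - prior) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

# Suppose we can only bound our estimates within wide intervals:
# prior in [0.5, 0.8], P(e|h) in [0.7, 0.9], P(e|~h) in [0.1, 0.3].
lo = posterior(0.5, 0.7, 0.3)  # every input set against the hypothesis
hi = posterior(0.8, 0.9, 0.1)  # every input set in its favor
print(f"posterior lies in [{lo:.2f}, {hi:.2f}]")  # roughly [0.70, 0.97]
```

Even with that much imprecision, the hypothesis comes out more probable than not on every admissible combination of inputs, which is the a fortiori style of reasoning at issue here.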


To the extent that you want to use BT to show flaws in bad
arguments, that's fine.

But you seem to want to make positive arguments too,
based on probabilistic calculations.

If that is the case, then the sensitivity of your conclusion to
errors in your input data is crucially important; it makes a
big difference.

That’s always true, and just as true, whether we use BT or not. The
advantage of BT is that now we know we’re doing it, and what we can
do to do it better, or how much uncertainty we should actually accept
knowing that we can’t.


Your results are essentially unfalsifiable, at least
quantitatively.

If that were true, then all historical knowledge would be impossible (this
follows from what I demonstrate on pp. 106-14).

For example, the statement “there is less than a 1 in 1000 chance an
asteroid will hit my house today” is certainly falsifiable, even though it isn't
anywhere near exact. I give other examples relating to history in the book.
For example, regarding silent reading in antiquity (Axiom 11, pp. 33-34,
with note 10, pp. 298-99); and the libraries example (pp. 229ff.), and my
discussion of Galilean industrial mechanics (pp. 250ff.), and the deleting of
the sun (pp. 41ff.), and so on.

I even discuss the problem of falsifiability in the graverobbing analogy (pp.
212-14).


And because there are a small number of boolean
hypotheses you’re interested in, and you’re generating
probability results, you’re never going to have enough
results to confirm that your method works.

That depends on what you mean by “works.” How do you think any
reasoning about history works? If correct reasoning about history is not
Bayesian reasoning, then what is it? By what “working” method can
historians claim to know anything?
Can they do it without probabilistic reasoning?

I’d love to see you demonstrate that.

And if they have to use probabilistic reasoning (and they do), are they not
then logically bound by Bayes’ rule? Is there any possible way they can
argue without following Bayes’ rule?

Let’s see an example of how a historian would do that.

Until then, I don’t think you understand what is going on here.

IRRCOIAN • OCTOBER 12, 2012, 12:46 PM

“It is simply not necessary to know exact numbers in order to know general facts like these.
That’s why human reasoning is even possible in the first place. We couldn’t find our way to
the bathroom otherwise.”

Conversations like this can easily slip into violent agreement!

Of course this is true. But I simply don’t see how it addresses the point.

RICHARD CARRIER • OCTOBER 15, 2012, 11:51 AM


Carrier: “It is simply not necessary to know exact numbers
in order to know general facts like these. That’s why human
reasoning is even possible in the first place. We couldn’t
find our way to the bathroom otherwise.”

Ian: Of course this is true. But I simply don't see how it
addresses the point.

That depends on what the point was.

If it was that some cases are undecidable on BT, that point is moot.
Because that simply translates what is already true generally: with or
without BT, some cases are undecidable. BT just tells us when and why.

But if it was that BT cannot usefully model historical reasoning because
the probabilities are always fuzzy, it certainly addresses the point. Because
“being fuzzy” does not translate to “are not known to any useful degree.”

Almost all human reasoning, and historical reasoning especially, deals in
fuzzy probabilities. That does not make such knowledge impossible–not
generally, nor in the hands of BT.

So on neither the first point nor the second do I see any relevant objection
being made.

That leaves the possibility that your point was something other than those
two things. In which case, you’ll need to explain what your point is, in
such terms as to make clear how it is not either of those.

F • OCTOBER 11, 2012, 9:16 PM


I readily concede that my colloquial discourse will lead to ambiguities that chafe at
mathematicians; but this is precisely the kind of shit they need to get over, because they are
simply not going to be able to communicate with people in the humanities if they don’t
learn how to strategically use ambiguity to increase the intelligibility of the concepts they
want to relate.

T-shirt.

BRETTON GARCIA • OCTOBER 12, 2012, 12:44 AM

Oddly enough, Historical Jesus SUPPORTERS have recently used Bayes. And I noted some problems with
their application.

A few months ago I interfaced on a blog with Dr. James Tabor, the archeologist who is doing the “Jesus
Tomb”/Talpiot A and B excavations. He's been looking at the fact that the names on the six or seven bone
boxes in the tombs seem to rather exactly match Jesus' family. Tabor cites Bayes mathematicians, who
asserted that, given the popularity of each individual name and hence the unlikelihood of all of them coming
together at random, the chance of this particular grouping of names not being of the Jesus family is extremely
small. The conclusion being that these two tombs are the authentic tombs of Jesus and/or his family.

But here was one objection I posed to the input into this application of Bayes: though to be sure, it was
extremely unlikely at first that these particular names (and no others?) would come together at random, the odds
of names like “Jesus” and “Joseph” and “Mary” and so forth occurring in one family … go up astronomically …
shortly after the death of Jesus. As Christianity became somewhat more popular, there would be many more
parents naming their children, after New Testament figures.

Applying Bayes to historical situations is quite complex; and much depends on how we think of each situation,
and how many different scenarios we consider.

RICHARD CARRIER • OCTOBER 12, 2012, 12:00 PM

I’ve blogged about that before: see The Jesus Tomb and Bayes’ Theorem.

Although their error was not the one you suggest (which would not likely affect their data: the
find unmistakably predates 70 AD, and such an effect on name frequencies would be too
small to have any observable effect by then, e.g. given there were over a million Jews in
Judea, even assuming ten thousand Christians in Judea by the year 70, an absurd over-
estimate, Christian names could have only a hundredth part effect on name frequencies in
Judea, too small to matter).
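The back-of-envelope arithmetic there is easy to verify (using the same rough figures stated above, which are deliberate over-estimates):

```python
# Rough figures from the comment above, deliberately generous to the objection.
jews_in_judea = 1_000_000       # population of Judea, order of magnitude
christians_in_judea = 10_000    # an absurd over-estimate for the year 70
max_effect = christians_in_judea / jews_in_judea
print(f"Maximum possible shift in name frequencies: {max_effect:.0%}")  # 1%
# Even if every Christian family used New Testament names exclusively,
# overall name frequencies could shift by at most about one part in a hundred.
```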

SAWELLS • OCTOBER 12, 2012, 1:42 AM

I think you mean Wallace to your Darwin, not Watson; not that it isn’t a lovely mental image (Astounding, my
dear Charles! Elementary, my dear Watson…).

RICHARD CARRIER • OCTOBER 12, 2012, 12:03 PM

Oh doi. Thank you. Fixed!

MALCOLMS • OCTOBER 12, 2012, 9:41 AM

BTW, Ian has followed up with that promised blog post that expands on his previous hints about mathematical
difficulties in applying BT in history. The main 2 points he makes, about reference classes and error ranges when
the inputs are very small, are what I was discussing in my previous post (which I see is still awaiting moderation).

Here’s the link: http://irrco.wordpress.com/2012/10/11/the-effect-of-error-in-bayess-theorem/

These are actually much more serious issues than the ones he concentrated on in his first 2 posts.

IRRCOIAN • OCTOBER 12, 2012, 11:41 AM

“There actually is. I discuss the criteria of better and worse reference classes in chapter six”

Yes, you do. I respectfully suggest you haven’t grasped what I’m talking about in that section. I go on to refer to
your method for choosing a reference class, but that’s not a solution to the *sensitivity* of the result to changes
in reference class.

RICHARD CARRIER • OCTOBER 12, 2012, 12:28 PM

You didn’t give any relevant examples of that being a fatal problem, though. You discuss
miracle claims, but those have vanishingly small priors and terrible evidence, two facts
together that entail the only way errors could make us wrong to reject them is if the
accumulated error is causing our estimates to be off by over a thousand times (in
coincidentally the one convenient direction), and there is simply no plausible litany of errors
that can do that in any relevant case (at least, none you list), much less an undetectable
mistake in selecting a reference class (since we can rule out detectable mistakes, so those
aren’t at issue).

Why not apply your argument to the actual example I employ in chapter six: the probability
that a newly excavated Roman city in Italy had a public library. I provide various possible
numbers there that you can work with, and discuss a variety of possible sources of error that
you could expand on. Then you can show whether it is therefore impossible to know whether
any Roman city had a library, because the possible accumulation of errors that aren't being
numerically estimated makes the prior probability indeterminate.

And do that without proposing anything like a Cartesian Demon.

Even better if you can come up with a better way historians can approach that problem than I
map out.
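To illustrate the kind of sensitivity test at issue, here is a sketch (all numbers hypothetical, not drawn from the book or from Ian's post): push every input in a low-prior claim's favor by an order of magnitude and see whether the conclusion flips.

```python
def posterior(prior, p_e_h, p_e_nh):
    """Bayes' theorem for hypothesis h versus its negation."""
    return prior * p_e_h / (prior * p_e_h + (1 - prior) * p_e_nh)

# A miracle-style claim: tiny prior, evidence nearly as expected either way.
base = posterior(prior=1e-6, p_e_h=0.5, p_e_nh=0.4)

# Now suppose our estimates were badly wrong, every error favoring the claim:
# prior ten times higher, evidence ten times less expected on ~h.
skewed = posterior(prior=1e-5, p_e_h=0.5, p_e_nh=0.04)

print(f"base posterior:   {base:.2e}")    # about 1.2e-06
print(f"skewed posterior: {skewed:.2e}")  # about 1.2e-04
# A hundredfold combined bias still leaves the claim at roughly 8,000 to 1
# against: the decision to reject it is insensitive to errors of this size.
```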

IRRCOIAN • OCTOBER 12, 2012, 12:51 PM

“The short answer is that Cartesian Demons have vanishingly small priors and therefore can be ruled out ”

Not in situations where we're explicitly being asked to determine if a supernatural event occurred. It's a different
kettle of fish to finding correlations in gene sequences and ignoring the possibility that a deceptive God is playing
with our instruments. We're not talking about a Cartesian Demon here, but a teleological purpose for the
evidence, one that is related to the hypothesis.

I think there’s a very good reason to enforce methodological naturalism on the discussion in all cases and to
explicitly disregard any supernatural event.

Sure, you lose any leverage over true believers, then, but I suspect you didn’t have much of that to start with.

RICHARD CARRIER • OCTOBER 15, 2012, 11:57 AM


Carrier: “The short answer is that Cartesian Demons have vanishingly small
priors and therefore can be ruled out ”

Ian: Not in situations where we're explicitly being asked to determine if a
supernatural event occurred.

Untrue. The nature of the claim makes no difference. Either way you are still positing an entity
with an extremely low prior. Indeed, a lower one, since a God who acts like a Cartesian
Demon is inherently less likely than a God in general (this is thus just another instance of
gerrymandering that halves the prior, at best: as I explain on pp. 80-81, and that’s even
assuming that there is a straight 50/50 chance that if God exists, he acts like this, which IMO
is an absurdly generous hypothesis, especially if one tries to make it compatible with the
definition of any God anyone actually believes in).


It's a different kettle of fish to finding correlations in gene sequences and
ignoring the possibility that a deceptive God is playing with our instruments.
We're not talking about a Cartesian Demon here, but a teleological purpose
for the evidence, one that is related to the hypothesis. I think there's a very good
reason to enforce methodological naturalism on the discussion in all cases
and to explicitly disregard any supernatural event. Sure, you lose any
leverage over true believers, then, but I suspect you didn't have much of that
to start with.

I fully agree with all of this. But that last point has to have a logically valid reason. BT
provides that reason. And the first point makes no difference to what I was saying: you do
not know the prior probability of a meddling god, yet you know it is low enough to exclude it.
Thus “not knowing the prior” is not a valid argument against modeling reasoning with BT,
whether in science or history.

MARK ERICKSON • OCTOBER 13, 2012, 8:59 PM

I’m the “fan” Ian mentions above. To define myself, and in regards to Ian’s most recent post saying we’re all just
tribalists, I really enjoyed reading Proving History and the same goes for this blog. Dr. Carrier is a good writer,
both for clarity and entertainment, and his insights are valuable in my opinion. But I don’t uncritically accept
whatever he says, and I’m open to correction, as I think I’ve shown in comments on Ian’s blog.

More importantly, this thread is great stuff, really digging in and mucking around in the weeds, but, it’s still in the
weeds. I’d like to ask Carrier to rise up out of the weeds and give a big picture summary of this debate. It seems
clear to me from both the book and lots on this thread that Carrier’s main point is to improve the results of
historical debate, by exposing fallacious reasoning and providing a framework to moderate disputes. BT
accomplishes this not so much based on mathematical rigor, but from explicitly stating premises and arguments.
Put another way, BT for history is more logical and less statistical. Ian seems stuck in the statistical weeds. He
may have very good points, but viewed from afar, it is just a small rustle of the grass. Is that right? I also suggest
including this in one of your upcoming posts, as very few will have gotten through all the weeds above. Thanks.

RICHARD CARRIER • OCTOBER 15, 2012, 9:39 AM


BT accomplishes this not so much based on mathematical rigor, but from
explicitly stating premises and arguments.

Spot on.

ELLE87 • NOVEMBER 3, 2012, 10:05 AM

Just to let you know, one of the main reviews on amazon is a pretty negative one. Figured you may want to have
a brief look at it since its author claims to be a mathematician.

http://www.amazon.com/review/R392IPXC3QP131/ref=cm_cr_pr_viewpnt#R392IPXC3QP131

RICHARD CARRIER • NOVEMBER 6, 2012, 9:34 AM

That’s a pretty lame critique.

He gives no examples of what he claims to find when he says “his methods press well beyond
the confines of any form of axiomatic probability theory” … okay, where exactly do I do
that? And in what way would that even be bad, since not everything is about axiomatic
probability theory?

I have no idea what he means by “in general, heuristic arguments are not only invalid but also
useless from a purely logical framework.” What is he critiquing with that statement? What
heuristic arguments? What in my book is he claiming is a heuristic argument? And what does
he mean by saying all heuristic arguments are invalid and useless? (Really? All heuristic
arguments, by definition?)

His claim that “most irritating was Carrier’s insistence that proving something to be unlikely is
equivalent to proving something false” seems to be ignorant of basic epistemology (if being
unlikely is not what we mean by saying something is false, then nothing can ever be claimed to
be false, since everything has a nonzero probability of being true: Axiom 4, pp. 23-25).

His reference to the Banach-Tarski Paradox has no discernible relevance to anything I
argue in my book. He doesn't even provide an explanation of what relevance he thinks it has
to anything I actually argue in my book.

As far as his claiming “I would consider his entire premise suspect, due to his insistence on
applying subjective quantities to an objective theorem and general lack of mathematical
rigour,” that suggests he didn’t actually read the book, which addresses that objection
extensively and in detail (so if he has no argument against what my book says about that, and
it appears he does not, then it appears he did not actually read the book).

And as far as his vague and undefended complaint against my “fast-and-loose treatment of
mathematics and logic,” I think my article here (above) addresses that more than adequately.

BLOTONTHELANDSCAPE • NOVEMBER 5, 2012, 7:38 AM

New book on Bayes: http://www.amazon.co.uk/The-BUGS-Book-Introduction-Statistical/dp/1584888490. It's a
companion to BUGS (http://www.mrc-bsu.cam.ac.uk/bugs/), free Bayes software, although it might be a bit
advanced for the lay-user.

RICHARD CARRIER • NOVEMBER 6, 2012, 9:48 AM

Yes. Far too advanced for most.


About The Author

Richard Carrier is the author of many books and numerous articles online and in print. His avid readers span the world from Hong Kong to
Poland. With a Ph.D. in ancient history from Columbia University, he specializes in the modern philosophy of naturalism and humanism, and the
origins of Christianity and the intellectual history of Greece and Rome, with particular expertise in ancient philosophy, science and technology. He
is also a noted defender of scientific and moral realism, Bayesian reasoning, and historical methods.
