Thursday, August 11, 2005

Absence of Evidence is Evidence of Absence

I know Carl Sagan said the opposite, but he was clearly wrong.

Suppose that a priori X has probability p of being true. We now look for evidence for X of a certain type. Suppose that there is a probability q that we find this evidence if X is true and probability q' that we don't find this evidence if it is false. We will assume p<1 (otherwise we wouldn't bother looking for evidence) and that q>q' (otherwise it couldn't be said that the evidence we're looking for is evidence for X).

So we have four possibilities:

  1. X is true and we find evidence for X: probability pq
  2. X is true and we don't find evidence for X: probability p(1-q)
  3. X is false and we find evidence for X: probability (1-p)q'
  4. X is false and we don't find evidence for X: probability (1-p)(1-q')

Under the hypotheses above, the conditional probability that X is true given that we failed to find the evidence is p(1-q)/(p(q'-q)+1-q').

(This follows from Bayes' Theorem, using cases 2 and 4 above.)
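As a quick numerical check of the formula above, here is a minimal Python sketch. The function names and the sample values p = 0.5, q = 0.8, q' = 0.1 are illustrative assumptions, not taken from the post. Following the four cases listed above, q' is treated as the probability of finding the evidence when X is false.

```python
def posterior_given_no_evidence(p, q, q_prime):
    """P(X true | evidence not found), built from cases 2 and 4."""
    true_and_missing = p * (1 - q)               # case 2
    false_and_missing = (1 - p) * (1 - q_prime)  # case 4
    return true_and_missing / (true_and_missing + false_and_missing)


def closed_form(p, q, q_prime):
    """The same quantity written as p(1-q)/(p(q'-q)+1-q')."""
    return p * (1 - q) / (p * (q_prime - q) + 1 - q_prime)


# Illustrative values only (assumed, not from the post).
p, q, q_prime = 0.5, 0.8, 0.1
post = posterior_given_no_evidence(p, q, q_prime)
print(post, closed_form(p, q, q_prime))
```

With these assumed numbers the two expressions agree, and the result is below the prior p.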

Some elementary rearrangement shows this is always less than p given the above hypotheses. It doesn't matter if we are unable to assign an a priori probability; this holds whatever value p has, as long as it's greater than 0 and less than 1. And if we don't know that q>q' then we shouldn't be in the business of looking for evidence. If the experiment we're doing is any good then q'=0, but as I have shown, the result holds even if we relax this condition.
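One way the rearrangement might go is sketched below, with the denominator left in its unexpanded form p(1-q) + (1-p)(1-q'). It assumes 0 < p < 1 (the p > 0 part is an added assumption, needed to divide by p) and q > q'. The denominator is positive because p < 1 and q' < 1, and the third step divides by 1-p, which is positive, so no inequality is reversed.

```latex
\begin{align*}
\frac{p(1-q)}{p(1-q) + (1-p)(1-q')} < p
&\iff 1-q < p(1-q) + (1-p)(1-q') \\
&\iff (1-p)(1-q) < (1-p)(1-q') \\
&\iff 1-q < 1-q' \\
&\iff q > q'.
\end{align*}
```

So under these assumptions the posterior is below p exactly when q > q'.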

So clearly failing to find evidence for X should lower our estimate of the probability that X is true.

I wonder what made Sagan say this. I think that maybe he meant to say "absence of evidence is not proof of absence". The theorem shows that under the original hypotheses the conditional probability that X is false is never 1 (provided q<1, so that a true X can still go undetected), and so while we have evidence of absence, we don't have a proof. But if we can look for enough independent types of evidence, it's quite possible for that conditional probability to get close to 1.
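To illustrate the point about several independent types of evidence, here is a rough Python sketch. It assumes k searches that are independent given whether X is true or false, and that each search has the same q and q'. Both are simplifying assumptions made for illustration, and the numbers are made up.

```python
def posterior_after_failures(p, q, q_prime, k):
    """P(X true | k independent searches all found nothing)."""
    true_and_missing = p * (1 - q) ** k
    false_and_missing = (1 - p) * (1 - q_prime) ** k
    return true_and_missing / (true_and_missing + false_and_missing)


# Illustrative values only (assumed, not from the post).
p, q, q_prime = 0.5, 0.6, 0.1
posteriors = [posterior_after_failures(p, q, q_prime, k) for k in range(6)]

for k, post in enumerate(posteriors):
    # Columns: number of failed searches, P(X true), P(X false)
    print(k, round(post, 4), round(1 - post, 4))
```

Each failed search lowers the probability that X is true, so the probability of absence climbs toward 1 without ever reaching it.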


Kenny said...

I suppose what you're saying is fairly intuitive, but it's good to see mathematical proof of it. I assume (without looking at the original) that Sagan's point might have been that absence of evidence often stems from something other than having looked for evidence once and not found any. If in any significant number of cases, absence of evidence stems in fact from not having looked for it, then the prior probability of presence will overshadow the probability rendered by evidence or its lack.

sigfpe said...

If you do a Google News search on "absence of evidence" you can see it's been used quite a bit by politicians lately, especially with reference to WMD. Again I think we have to interpret what they say as "absence of evidence isn't proof of absence", otherwise it makes no sense.

Anonymous said...

I think Sagan was definitely thinking of it from a deductive logic framework; one wouldn't be surprised if Sagan cautioned one against using ad hominem arguments, but couldn't we also show that a speaker's honesty and other personal characteristics should affect our priors about that speaker's statements?

Gwenhwyfaer said...

"Suppose that there is a probability q that we find this evidence if X is true and probability q' that we don't find this evidence if it is false." - surely the "don't" should be deleted?

I'm also not convinced about your maths. Unpicking your rearrangements and stating it in the simplest form, your core assertion is that
p(1-q)/(p(1-q) + (1-p)(1-q')) < p
But cancelling p gives us
(1-q)/(p(1-q) + (1-p)(1-q')) < 1
and rearranging to gather p terms
(1-q)/(p((1-q)-(1-q')) + (1-q')) < 1
which can be expressed as
(1-q) < p((1-q)-(1-q')) + (1-q')
(1-q)-(1-q') < p((1-q)-(1-q'))
or, cancelling for commons,
1 < p
which contradicts our initial condition of p < 1 .

(Admittedly my maths is very rusty, and I could well have screwed up something basic; if so, please don't spare my blushes!)

Anonymous said...

Gwenhwyfaer: I ended up with 1-p < q-q'.

However, if q' is the probability that you don't find evidence for false X, isn't it always 1?

It's not clear what kind of things we're talking about here. So, I assume X is a logical formula and in this system logical axioms are randomly generated initially. Then, we can say that X has a probability p of being true.

The evidence we're looking for is a deduction. We don't know the axioms, and we're going to carry out the same steps with some assumptions. However, we have a machine that can halt our deduction sequence if at a given step our assumption is not satisfied by the axiom. Then we can speak of a probability q that we complete the proof. q could be 0 for some true X due to Gödel's incompleteness, but q' has to be 1 because our deduction just can't terminate with X as the result, or else our logic is inconsistent.

Furthermore, I don't think P(X is true and we find evidence for it) is just pq because they're not independent: an axiom that invalidates X will not let you find evidence for a false X.
