## Atheism: A Bayesian Approach*

#### by Peter D. Wilson

Bayesian inference is a method of analysis that is very useful when comparing hypotheses. It assigns probabilities to all the possible outcomes of an experiment, combines these with all the knowledge one has before performing the experiment, and then calculates the probability of each hypothesis being true given the actual observation. The method is fully contained in Bayes' theorem, which is

p(H|UI) = p(H|I) * p(U|HI) / p(U|I)

where p(H|I) is the prior probability of the hypothesis, H, being true given the prior information, I; p(U|I) is the global likelihood of the observations, U, occurring given I and is simply a normalization constant independent of H; and p(U|HI) is the likelihood of U given both H and I. The quantity p(H|UI) is the posterior probability of the hypothesis, H, being true given the observations, U, and the prior knowledge, I. There will be a separate probability equation for each proposed hypothesis.
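As a minimal sketch of the calculation, the function below applies the theorem to a set of competing hypotheses. It assumes the hypotheses are mutually exclusive and exhaustive, so that p(U|I) can be computed as the sum of p(H|I) * p(U|HI) over all of them. The function name, the hypothesis labels, and the numbers are illustrative assumptions, not part of the original essay.

```python
def posterior(priors, likelihoods):
    """Return p(H|UI) for every hypothesis H.

    priors:      dict mapping hypothesis name -> p(H|I)
    likelihoods: dict mapping hypothesis name -> p(U|HI)
    Assumes the hypotheses are mutually exclusive and exhaustive.
    """
    # Numerator of Bayes' theorem for each hypothesis: p(H|I) * p(U|HI)
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    # Global likelihood p(U|I): the sum of the numerators over all hypotheses
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the observation is impossible under every hypothesis")
    return {h: value / evidence for h, value in joint.items()}


# Hypothetical numbers, chosen only to show the mechanics
priors = {"H1": 0.5, "H2": 0.5}
likelihoods = {"H1": 0.8, "H2": 0.2}
result = posterior(priors, likelihoods)
print(result)
```

With equal priors the posteriors are simply the likelihoods rescaled to sum to 1, which is why p(U|I) can be treated as a normalization constant.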

Occam's Razor states that given two hypotheses for a set of observations, the better choice is the simpler hypothesis. A more complicated model can always be made to fit the observations at least as well and will be capable of explaining a wider range of observations, but if these complications are unnecessary the simpler model should be preferred. Bayesian inference has this built into it because it deals with probabilities. We have the constraint that each of these probabilities, integrated over all hypotheses or over all observations, must equal 1. This requires that at least one of our hypotheses be true and that one observation actually occur. One must therefore distribute a finite amount of probability among all possible observations; the more observations one must distribute over, the less probable any single observation becomes.
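The dilution of probability can be seen in a toy example. The sketch below assumes ten possible outcomes, a simple model that allows only two of them, and a flexible model that allows all ten, each spreading its probability evenly over what it allows. The outcome labels and model names are hypothetical.

```python
from fractions import Fraction


def uniform_likelihood(allowed_outcomes, observed):
    """p(observed | model) for a model that spreads its probability evenly
    over the outcomes it allows and assigns zero to everything else."""
    if observed not in allowed_outcomes:
        return Fraction(0)
    return Fraction(1, len(allowed_outcomes))


# Hypothetical outcome labels 0..9, purely for illustration
simple_model = {3, 4}              # can explain only two outcomes
flexible_model = set(range(10))    # can explain all ten

observed = 3
p_simple = uniform_likelihood(simple_model, observed)
p_flexible = uniform_likelihood(flexible_model, observed)

# Ratio of likelihoods: how strongly the observation favors the simple model
bayes_factor = p_simple / p_flexible
print(p_simple, p_flexible, bayes_factor)
```

Both models explain the observed outcome, but the simple model assigned it probability 1/2 and the flexible model only 1/10, so the observation favors the simple model by a factor of 5. Had the outcome been one the simple model forbids, its likelihood would have been zero and the flexible model would win outright.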

Let's start by considering the hypotheses "Reindeer can fly" (H1=R) and "Reindeer can't fly" (H2=notR) to illustrate how the equations work. Our experiment will be to throw reindeer one by one off a tall building and count how many fly and how many die. Before doing the experiment we need to assign prior probabilities to each hypothesis. This will set up a quantified form of the statement "Extraordinary claims require extraordinary evidence." Because flying reindeer would appear to violate everything we know about how flying works, we would assign a very low, but non-zero, probability to R and a near-one probability to notR. Therefore, it will take a very strong piece of evidence, U, incapable of being explained with notR, to counter this prior weighting. The necessary evidence is seeing a reindeer fly.

There are two outcomes to each experiment: the reindeer falls to its death or it flies. What is the probability of a reindeer dying if reindeer can't fly? 100%. And the probability of a reindeer flying is absolutely zero! If reindeer can fly, the probabilities aren't so easy to assign. There are reasons a reindeer that is able to fly might still fail to do so and die. But there is also a probability of it flying, so the probability is shared by the two outcomes.

Now do the experiment. If the reindeer flew, the posterior probability of notR being true, p(notR|U=flying,I), is zero because p(U=flying|notR,I)=0. The probability of R being true, p(R|U=flying,I), is non-zero and becomes 1 after normalization. This is the case regardless of how biased we were at the start. (Including the possibility of hallucinations in notR will prevent its probability from going to zero, but given sufficient evidence it will quickly approach zero.)

If the reindeer died, neither hypothesis goes to zero. We are left with the relative probability of the two hypotheses. Because the probability of a reindeer dying with notR being true is 1 while the probability of it dying with R being true is less than 1, the posterior probability of R will decrease (and that of notR increase) after each experiment. This occurs regardless of the prior weightings and the distribution of probability between flying and dying with R. Because R can explain both flying and dying reindeer but notR can only explain dying reindeer, when only dying reindeer are observed the better hypothesis becomes notR. The preference for notR over R will never be absolute because the probability of R won't ever become zero, although it may come close. How strong must the preference be before notR can be accepted and we move on to more important questions? 3-to-1? 100-to-1? Or a billion-to-one?
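The two cases can be traced numerically. The sketch below is only an illustration: the prior of one in a million for R and the 50% chance that a flight-capable reindeer fails to fly are assumed values that the essay does not specify, and the function and variable names are hypothetical.

```python
def update(p_R, p_obs_given_R, p_obs_given_notR):
    """Apply Bayes' theorem once and return the new p(R|U,I)."""
    num_R = p_R * p_obs_given_R
    num_notR = (1.0 - p_R) * p_obs_given_notR
    return num_R / (num_R + num_notR)


# Assumed values, not taken from the essay
PRIOR_R = 1e-6                        # very low but non-zero prior for R
P_FLY_GIVEN_R = 0.5                   # chance a flight-capable reindeer flies
P_DIE_GIVEN_R = 1.0 - P_FLY_GIVEN_R   # chance it dies anyway
P_FLY_GIVEN_NOTR = 0.0                # flightless reindeer never fly
P_DIE_GIVEN_NOTR = 1.0                # flightless reindeer always die

# Case 1: one reindeer flies. notR cannot explain this, so R goes to 1
# no matter how small its prior was.
after_flight = update(PRIOR_R, P_FLY_GIVEN_R, P_FLY_GIVEN_NOTR)
print("p(R) after one flight:", after_flight)

# Case 2: ten reindeer in a row die, starting from an even prior.
p_R = 0.5
history = []
for _ in range(10):
    p_R = update(p_R, P_DIE_GIVEN_R, P_DIE_GIVEN_NOTR)
    history.append(p_R)
print("p(R) after each death:", history)
```

Under these assumptions each death halves the odds of R, so starting from even odds the preference for notR passes 100-to-1 after seven dead reindeer and reaches 1024-to-1 after ten. A different value of the assumed failure rate changes how fast the odds move, not their direction.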

For theism vs. atheism we have two hypotheses: "God exists" (H1=G) and "God doesn't exist" (H2=notG). U is the sum total of our observations of the universe and I is whatever constraints are placed on the universe before looking at it. One such constraint could be that life must form in order for someone to pose the question. As I see no convincing argument that the existence of life is more consistent with a god existing than with one not existing, I set p(G|I)=p(notG|I)=0.5. In the end, so long as one doesn't impose the constraint that life is inconsistent with a godless universe (which would make p(notG|I) very near 0), the weightings won't significantly affect the result. We are therefore left with the calculation of p(U|GI) and p(U|notG,I) (the probabilities of the universe we observe occurring if God does and does not exist) determining the probabilities of the two hypotheses.

The question now becomes how many universes are consistent with a god and how many are consistent with no god. Because a god can always create a naturalistic universe, the number of god universes must be equal to or greater than the number of godless universes. In addition, I think it is obvious that a god greatly expands the number of possibilities over the no-god case. A 6000-year-old Earth is allowed with a god but not without. Violations of the conservation of energy are allowed with a god but not without. The same goes for virgin births, "burning bushes", flying reindeer, faith healing, magic carpets, Santa Claus, etc. A godless universe must be limited to a narrow range of possibilities, each based on natural laws, while a universe with a god has no such restriction. Therefore, p(U|GI) is a very broad function of U with a very small value for any particular U. For an omnipotent god, p(U|GI) goes to zero since all universes are possible. Our universe appears to fall into the category of being run by natural laws so

p(U|notG,I) >> p(U|GI)

and for reasonable weights of p(G|I) and p(notG|I) it follows

p(notG|UI) >> p(G|UI).
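The step from the likelihood inequality to the posterior inequality can be made explicit by writing Bayes' theorem once for each hypothesis and dividing the two equations, so that the normalization p(U|I) cancels. In the notation used above, this gives:

```latex
\frac{p(\mathrm{notG} \mid U I)}{p(G \mid U I)}
  = \frac{p(\mathrm{notG} \mid I)}{p(G \mid I)}
    \times
    \frac{p(U \mid \mathrm{notG}, I)}{p(U \mid G I)}
```

With p(G|I)=p(notG|I)=0.5 the first factor on the right is 1, so the ratio of posteriors equals the ratio of likelihoods. Unequal priors only rescale the result by a fixed factor.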

One can illustrate this with the following symbolic figure, in which the area under each curve is 1.

[Figure: p(U|HI), the probability of universe U occurring, plotted against U (type of universe), with "Creationist Universe" and "Naturalistic Universe" marked on the horizontal axis. The curve for H = notG (+++) is a narrow peak; the curve for H = G (---) is a low, flat line.]

Hundreds of years ago, before the most basic physical laws were discovered, the ordered workings of the universe could be seen as implying an intelligent hand. A godless universe was necessarily disordered and p(U|notG,I) would be nearly zero for the observed universe. The existence of a god would then be the better choice. Morning glories needed a divine nudge to open after every sunrise. The planets needed to be pushed across the sky by angels. Eventually, science could explain these things without resorting to a god and the godless explanation became the better choice.

This is the "god of the gaps" and can still be argued for today. When an observation can't be explained by the current theories, its location on the graph is shifted away from the godless peak. The size of the offset depends on the seriousness of the discrepancy. The solar neutrino problem is only slightly offset since the problem deals with a difference between predicted and observed quantities, not with unknown processes. The offset does increase, however slightly, the relative probability that a god exists and is interfering with neutrino production, but the universe is still much more likely to be well-ordered, just ordered a little differently than we thought. The existence of ESP, on the other hand, would be well off the curve. There is no known process by which ESP could work. If G and notG were the only choices for explaining ESP, then G would be the best hypothesis and would remain so until a physical explanation was found. Once a cause other than G was found, notG would revert to being the best hypothesis.

The strength of the inequality and the difference in peak values depend on how the probability is distributed. I drew the figure such that there is approximately equal probability for any god-consistent universe. If one starts putting constraints on the activities of the god such that some universes become less likely, the probability of a well-ordered universe with a god can increase. For example, if one limits the god to having no more than one human offspring every few millennia, an ancient Greek mythological universe can be ruled out. Restricting the god to no more than one divine intervention per person narrows the choices even further. Eventually, when so many restrictions are placed that the god can *never* interfere with the workings of the universe, we have reduced the god to a *roi fainéant* and

p(U|GI)=p(U|notG,I).

One is therefore left with the prior probabilities p(G|I) and p(notG|I) and the question of whether a universe with life implies a divine designer, possibly an unknowable.
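The effect of restricting the god can be sketched with a simple counting model. It assumes a finite number of naturalistic universes, a number of extra universes that only a god would allow, and equal probability for every universe a hypothesis permits. The counts, the function name, and the parameter names are hypothetical choices for illustration only.

```python
from fractions import Fraction


def posterior_not_g(n_natural, n_extra, prior_not_g=Fraction(1, 2)):
    """p(notG|UI) when the observed universe U is a naturalistic one.

    n_natural: number of universes allowed without a god (assumed count)
    n_extra:   additional universes allowed only with a god (assumed count)
    Each hypothesis spreads its probability evenly over what it allows.
    """
    like_not_g = Fraction(1, n_natural)        # p(U|notG,I)
    like_g = Fraction(1, n_natural + n_extra)  # p(U|GI)
    num_not_g = prior_not_g * like_not_g
    num_g = (1 - prior_not_g) * like_g
    return num_not_g / (num_not_g + num_g)


N_NATURAL = 10  # assumed count of naturalistic universes
# Shrinking n_extra mimics placing more and more restrictions on the god
for n_extra in (10_000, 100, 10, 0):
    print(n_extra, float(posterior_not_g(N_NATURAL, n_extra)))
```

As the extra universes are ruled out, the posterior for notG falls from near 1 toward the prior, and with no extra universes left it equals the prior of 0.5 exactly: the observations no longer distinguish the two hypotheses.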

#### Note:

* Originally published on the Cornell University website, where it is no longer available. As I have been unable to locate that author, it is made available here without permission.