## Friday, August 17, 2012

### Unspecified hypotheses make false predictions

Note: This post draws ideas liberally from A Technical Explanation of Technical Explanation.

A common accusation against pseudoscientific claims is that they explain everything. Not that they explain every bit of evidence we see in this world, mind, but that they explain every bit of evidence in all possible worlds. They frequently don't exclude any events from happening. For example, in astrology, a Taurus could be either an introvert or an extrovert; could like sports or hate sports; could like punk rock or classical (or both or neither). Astrologers don't take any of these facts as refutations of astrology. If they predict that, as a Taurus, you will have a dominant personality, but it turns out you're submissive, the astrologer will claim that things can actually change down to the millisecond level, and that without knowing the precise time of your birth, astrology's predictions can't be refuted.

It's obvious that the predictive power of such hypotheses is basically zero. Some people--especially the people who advance such claims--don't seem to have a problem with this. But if you consider the issue in Bayesian terms--in terms of assigning probabilities to worlds with different bits of evidence--you might notice that such theories are worse than useless.

Consider a simple example. Say we're trying to predict the output of an algorithm that always produces a number between 0 and 99, without access to the algorithm. There are several hypotheses we can try. "It always produces something ending with an 8." "It always produces a number greater than 80." "It only produces the number 1." When we see an output, we can refine which hypotheses we believe to be more correct. If it gives us an 8, we can exclude "it always produces a number greater than 80" and "it only produces the number 1."

In probability terms, each hypothesis is assigning probabilities to each individual number. Without further specification, "it always produces a number greater than or equal to 90" assigns 10% probability to 90, 10% probability to 91, etc.
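As a rough sketch of what "assigning probabilities to each individual number" means, the hypotheses can be written as explicit distributions and updated on an observation. This assumes the outputs are the integers 0 through 99, that each hypothesis spreads its probability evenly over the numbers it allows, and that we start with equal belief in every hypothesis. The names `uniform_over` and `update` are made up for this illustration.

```python
from fractions import Fraction

OUTCOMES = range(100)  # assumption: outputs are the integers 0..99


def uniform_over(predicate):
    """Spread probability evenly over the outcomes a hypothesis allows."""
    allowed = {n for n in OUTCOMES if predicate(n)}
    share = Fraction(1, len(allowed))
    return {n: (share if n in allowed else Fraction(0)) for n in OUTCOMES}


# Each hypothesis becomes a full probability distribution over 0..99.
hypotheses = {
    "ends in 8": uniform_over(lambda n: n % 10 == 8),
    "greater than 80": uniform_over(lambda n: n > 80),
    "only 1": uniform_over(lambda n: n == 1),
    "at least 90": uniform_over(lambda n: n >= 90),
}


def update(prior, observed):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {
        name: prior[name] * dist[observed] for name, dist in hypotheses.items()
    }
    total = sum(unnormalized.values())
    return {name: weight / total for name, weight in unnormalized.items()}


# Assumption: equal prior belief in each hypothesis.
prior = {name: Fraction(1, len(hypotheses)) for name in hypotheses}
posterior = update(prior, observed=8)

print(hypotheses["at least 90"][90])  # probability this hypothesis gives to 90
for name, belief in posterior.items():
    print(name, belief)
```

Seeing an 8 wipes out every hypothesis that gave 8 zero probability. Note that `update` would divide by zero if no hypothesis allowed the observation; a fuller version would need to handle that case.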

Now consider a hypothesis that predicts all numbers with equal probability: "It produces a number between 0 and 99."

Imagine that the algorithm only outputs numbers between 90 and 99. The hypothesis in question assigns these numbers 1% probability each. It also assigns the same probability to each of the 90 other numbers. Under this hypothesis, every number should show up equally often, but we only see the same 10 numbers over and over. This means that our unspecified hypothesis--all numbers are equally likely--isn't just useless in predicting what numbers will actually appear. It places 90% of its probability on outcomes that never happen! How? It does indeed predict that numbers between 90 and 99 will appear, but it also predicts that all the other numbers will as well. All of those other predictions are completely wrong.
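To put numbers on this, here is a small sketch comparing the catch-all hypothesis with a sharper one, "it produces a number between 90 and 99." It assumes both hypotheses are uniform over the numbers they allow. The observation list is invented for illustration, and `wasted_mass` and `likelihood` are hypothetical helper names.

```python
from fractions import Fraction

# The catch-all hypothesis: every number from 0 to 99 is equally likely.
vague = {n: Fraction(1, 100) for n in range(100)}
# A sharper hypothesis: only 90..99 can appear, each equally likely.
sharp = {n: (Fraction(1, 10) if n >= 90 else Fraction(0)) for n in range(100)}


def wasted_mass(hypothesis, possible_outputs):
    """Total probability a hypothesis puts on outcomes that never occur."""
    return sum(p for n, p in hypothesis.items() if n not in possible_outputs)


def likelihood(hypothesis, observations):
    """Probability the hypothesis assigns to the whole observed sequence."""
    result = Fraction(1)
    for n in observations:
        result *= hypothesis[n]
    return result


actual_outputs = set(range(90, 100))  # what the algorithm really produces
observations = [93, 97, 90, 99, 95]   # made-up sample of five outputs

print(wasted_mass(vague, actual_outputs))  # mass on numbers that never appear
print(wasted_mass(sharp, actual_outputs))
ratio = likelihood(sharp, observations) / likelihood(vague, observations)
print(ratio)  # how much better the sharp hypothesis predicted the data
```

Under these assumptions, each observation favors the sharper hypothesis by a factor of 10, so the gap grows exponentially with the number of outputs seen.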

If the algorithm actually does output all numbers between 0 and 99 with equal probability, then this hypothesis is exactly right and all other hypotheses are wrong at least some of the time. But we're considering a very simplified example, with only 100 possible outcomes, none of them logically contradictory. In the real world, the space of possibilities is vastly larger, and many of them are logically contradictory.

Consider two simple hypotheses about lemons. One of them says that, because of inherited traits, lemons are typically yellow. This theory predicts that, barring mutation, hybridization, or decay, lemons will be yellow. Another theory says that lemons are yellow because the lemon fairy makes them yellow. Without specifying why the lemon fairy makes them yellow, it seems she could change her mind at any time and make them any color she wants.

The first hypothesis predicts yellow lemons. The second hypothesis, because it refuses to specify details about the lemon fairy, predicts lemons of every color. So, when we only find yellow lemons, the second hypothesis has predicted countless events that have not taken place and never will.

So unspecified hypotheses that can explain all possible outcomes are not just useless at making predictions. They actively and constantly make false predictions.

Keep this in mind when considering theories. When scientists ask, "Can this be falsified?" they're not just doing so for the sake of a cult of scientific method. They're trying to avoid theories that make innumerable false predictions. I doubt I need to convince many people that's a worthy goal.