Sunday, May 07, 2006

The 7 Major Thinking Errors of Highly Amusing Pseudoscientists

In the spirit of the "Seven Habits of Highly Effective..." series of psychobabble books, I have collected the top seven (believe me, there are way more than seven!) thinking errors that I have seen used by quacks and pseudoscientists. These thinking errors are all logical fallacies of one type or another, which leads to a short explanation of what logic is and why it is important.

Logic, despite its rather erudite and ethereal reputation, is not just about scoring points on the Debate Team or sounding like Spock on Star Trek. Logic is about thinking straight. It's about not letting the words we use get in the way of what is being said. Logic is about seeing how people use language to try to fool you - intentionally or inadvertently - in the day-to-day world.

A few of the more passionate pseudoscientists - and their apologists - denounce logic as "mere wordplay" or sophistry. Nothing could be further from the truth. It was the Sophists who developed the use of logical fallacies (the Logic 101 word for "thinking errors") in order to prove anything right or wrong, and logic is the tool for catching them at it. They were the ancient ancestors of today's "Spinmeisters" and would often engage in "debates" where they would prove both sides of an argument true - or false - just to show off their skills.

Nor is logic just for academics and ivy-covered professors - anyone who listens to an argument, be it a political debate or a television advertisement, can find the logical fallacies in those arguments. It's all about seeing how the other person is trying to fool you into agreeing with them. The fact that they may have also fooled themselves into believing it makes it all the more important to understand how it's done.

Thinking Error 1 - Association is Causation:

Boy, has this one been beaten to death! You can't open a newspaper, turn on the telly or browse the Internet without somebody trying to tell you that, since X is associated with Y, X causes Y.

A correlation or association merely means that two (or more) things have been found together more often than you would expect if they had nothing to do with each other. Even so, their correlation or association might still be due to chance, just as it is possible to get a long string of "heads" or "tails" when flipping coins.

My favorite example of correlation abuse has to be the very strong correlation between reading ability and shoe size seen in every elementary school. Try it - go to your local school and measure the shoe size of any group of children (get the Principal's permission first, or you may have to do a lot of explaining to the police) and compare that to their reading ability. You will be immediately impressed by the correlation - children with larger shoes read better!

Now, does this mean that the way to handle slow readers is to get them larger trainers? Will an oversize pair of Reeboks help a child to advance their reading skills? Well, of course not! As you might have guessed, the children with larger shoes also happened to be (on average) older, and older children (on average) tend to have better reading skills.
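If you would rather not explain yourself to the police, the schoolyard can be roughly simulated instead. The sketch below is only an illustration: the age range, the shoe-size and reading-score formulas, and the noise levels are all numbers I made up, chosen so that age drives both shoe size and reading score while shoe size has no direct effect on reading at all.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_school(n_children=2000, seed=1):
    """Return (age, shoe_size, reading_score) tuples.

    All coefficients here are invented for illustration. Note that
    reading_score depends on age only, never on shoe_size.
    """
    rng = random.Random(seed)
    children = []
    for _ in range(n_children):
        age = rng.randint(5, 11)
        shoe_size = 0.8 * age + rng.gauss(0, 1.0)
        reading_score = 10 * age + rng.gauss(0, 10.0)
        children.append((age, shoe_size, reading_score))
    return children

children = simulate_school()

# Whole school: shoe size and reading score look strongly related.
overall_r = pearson([c[1] for c in children], [c[2] for c in children])

# Hold age fixed (eight-year-olds only) and the relationship fades.
eight_year_olds = [c for c in children if c[0] == 8]
within_age_r = pearson([c[1] for c in eight_year_olds],
                       [c[2] for c in eight_year_olds])

print(f"all children:     r = {overall_r:.2f}")
print(f"eight-year-olds:  r = {within_age_r:.2f}")
```

Comparing children of the same age is the key move here: once the hidden variable is held steady, the shoes stop "predicting" anything.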

In this example, the chosen variable (shoe size) was a surrogate for a variable (age) that actually drives reading ability. This is one possibility for the association. But let's take this one step further.

What about an association that has nothing to do with the outcome? What if we randomly pick people off the street, weigh them and record their eye colour? If our sample is small enough or if random chance intervenes, we could find that a certain eye colour - green, for example - is associated with obesity. Of course, our baloney-meter tells us that eye colour has nothing to do with obesity, but our "random survey" established that very thing.

Or did it?

One way to enhance the possibility that chance will favor our venture (by giving us a correlation we can publish) is to measure a larger number of variables in our study. We can then find those that correlate and publish them. This is what happens in data mining, where huge surveys are done, collecting data on dozens (or hundreds) of variables and then looking for correlations. Some of the results have been quite hilarious (in retrospect).

The problem with data mining is that the people doing it don't "play fair" when they go to analyse the data. They should correct the correlation statistics for the number of variables they studied, but that would lead to the correlations looking like random chance (which they are), and nobody is going to publish that, not even the National Enquirer (headline: "Eye colour associated with obesity by random chance!").
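To put rough numbers on "playing fair", here is a small sketch. It assumes the usual 0.05 significance cutoff, assumes the tests are independent, and uses the Bonferroni correction (divide the cutoff by the number of variables tested), which is just one common way of correcting and is my choice for illustration. The p-values are stand-ins: when nothing real is going on, p-values are spread evenly between 0 and 1, so random numbers are used in their place.

```python
import random

def chance_of_false_alarm(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests of pure noise
    comes up 'significant' at the given cutoff."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Stricter per-test cutoff that keeps the overall false-alarm
    rate at or below alpha."""
    return alpha / n_tests

# Hypothetical survey: 100 variables, none truly related to the outcome.
n_variables = 100
rng = random.Random(0)
p_values = [rng.random() for _ in range(n_variables)]

uncorrected_hits = sum(p < 0.05 for p in p_values)
corrected_hits = sum(p < bonferroni_threshold(n_variables) for p in p_values)

print(f"chance of at least one false alarm: "
      f"{chance_of_false_alarm(n_variables):.3f}")
print(f"'significant' before correction: {uncorrected_hits}")
print(f"'significant' after correction:  {corrected_hits}")
```

With 100 unrelated variables, at least one "publishable" correlation is close to guaranteed before correction, which is exactly why the uncorrected version is the one that gets written up.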

X might cause Y, X and Y might have a common cause, Y might cause X, or the result might simply have been a random "clustering", with X and Y having nothing to do with each other at all. There is no way of knowing without further investigation.

So, when you read about X being said to cause Y because of a "correlation", "association" or even a "strong association", remember to read that as "No causal connection shown between X and Y."

How about the reverse? Does a lack of correlation prove a lack of causation? Sadly, it is not that simple. Which leads us to:

Thinking Error 2 - "A" Cause is "THE" Cause:

This is really an extension of the previous logical error, but it deserves its own listing because it pervades the pseudoscience and quackery communities.

Having established that X causes Y (if that has, in fact, been established) does not mean that X is the only cause for Y. This is a pseudoscience favorite, because it is relatively easy to find well-done studies showing that, for example, mercury impairs the function of a certain enzyme (enzyme "Z"). The quacks and pseudoscientists then proceed to claim (without doing any more research) that finding impaired enzyme Z is a "biomarker" of mercury poisoning.

Unfortunately for them (and the people who believe them), they have established no such thing. There may be one, two or three hundred causes for enzyme Z impairment - many of which may not yet have been investigated - so finding impaired function of enzyme Z "proves" nothing.

Zip, zilch, nil, nada.

However, the ever-confident pseudoscientist will usually not share that little tidbit with the public. The news release will be, "Impaired enzyme Z proof of mercury poisoning!" Doubts rarely make the headlines.

The really annoying part is that the discovered cause - again, one of potentially very many - may not even be the most common cause of the effect. This may lead people to avoid one cause and run - unknowingly - into another.

So, without knowing how many possible causes a certain effect may have, how can you assess whether something is a significant cause of something else?

One simple way is to resort to epidemiology, an admittedly blunt instrument in finding causation, but one that has a particular utility in this instance. By looking at the incidence of - in this example - enzyme Z impairment in a large population and comparing that with the mercury exposure within that population, you can get a pretty good idea of whether or not the hypothesis of causation holds together.

If you find that enzyme Z impairment tracks well with mercury exposure - i.e. the segments of the population with higher mercury exposure have greater impairment of enzyme Z - then you have a piece of supporting data. If, however, the levels of enzyme Z function don't track with increasing exposure, then your hypothesis has a serious problem.
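A bare-bones version of that check might look like the sketch below. Everything in it is hypothetical: the populations are simulated, the exposure scale and impairment scores are invented, and splitting the population into four equal bands is simply one convenient way to look for a trend. One population is built so that impairment rises with exposure, the other so that it does not.

```python
import random

def mean_impairment_by_band(records, n_bands=4):
    """Sort (exposure, impairment) records by exposure, split them into
    n_bands roughly equal groups, and return each group's mean impairment,
    ordered from lowest-exposure band to highest."""
    ordered = sorted(records, key=lambda r: r[0])
    size = len(ordered) // n_bands
    means = []
    for i in range(n_bands):
        start = i * size
        # The last band absorbs any leftover records.
        end = (i + 1) * size if i < n_bands - 1 else len(ordered)
        band = ordered[start:end]
        means.append(sum(r[1] for r in band) / len(band))
    return means

def simulate_population(effect, n_people=4000, seed=2):
    """Invented data: 'effect' is how much each unit of exposure adds to
    the impairment score. effect=0 means exposure does nothing."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_people):
        exposure = rng.uniform(0, 10)
        impairment = effect * exposure + rng.gauss(50, 10)
        records.append((exposure, impairment))
    return records

tracking = mean_impairment_by_band(simulate_population(effect=2.0))
flat = mean_impairment_by_band(simulate_population(effect=0.0))

print("exposure matters:   ", [round(m, 1) for m in tracking])
print("exposure irrelevant:", [round(m, 1) for m in flat])
```

In the first population the band averages climb steadily from the lowest exposure band to the highest; in the second they hover around the same value. Real data would be messier, and a rising trend is still only supporting evidence, not proof.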

You may have noticed that many of the quacks and pseudoscientists avoid getting to the point of actually having any data like this - data that can test their causation hypothesis. Or, if they have and the data didn't "pan out", they have a ready explanation.

Tune in next time for "Thinking Error 3 - The Post Hoc Correction."