How to Lie with Statistics by Darrell Huff

Summary

  1. The many ways data and statistics can be manipulated to fit the story an author wants to tell, and how to guard against this chicanery.

Key Takeaways

  1. Overview
    1. So it is with much that you read and hear. Averages and relationships and trends and graphs are not always what they seem. There may be more in them than meets the eye, and there may be a good deal less. The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify. Statistical methods and statistical terms are necessary in reporting the mass data of social and economic trends, business conditions, “opinion” polls, the census. But without writers who use the words with honesty and understanding and readers who know what they mean, the result can only be semantic nonsense.
    2. This book is a sort of primer in ways to use statistics to deceive. It may seem altogether too much like a manual for swindlers.
    3. The fact is that, despite its mathematical base, statistics is as much an art as it is a science. A great many manipulations and even distortions are possible within the bounds of propriety.
    4. Not all the statistical information that you may come upon can be tested with the sureness of chemical analysis or of what goes on in an assayer’s laboratory. But you can prod the stuff with five simple questions, and by finding the answers avoid learning a remarkable lot that isn’t so.
      1. Who Says So?
      2. How Does He Know?
      3. What’s Missing?
      4. Did Somebody Change the Subject?
      5. Does It Make Sense?
  2. Sample Size
    1. It is sad truth that conclusions from such samples, biased or too small or both, lie behind much of what we read or think we know.
    2. A river cannot, we are told, rise above its source. Well, it can seem to if there is a pumping station concealed somewhere about. It is equally true that the result of a sampling study is no better than the sample it is based on. By the time the data have been filtered through layers of statistical manipulation and reduced to a decimal-pointed average, the result begins to take on an aura of conviction that a closer look at the sampling would deny. To be worth much, a report based on sampling must use a representative sample, which is one from which every source of bias has been removed. The test of the random sample is this: Does every name or thing in the whole group have an equal chance to be in the sample? The purely random sample is the only kind that can be examined with entire confidence by means of statistical theory, but there is one thing wrong with it. It is so difficult and expensive to obtain for many uses that sheer cost eliminates it. A more economical substitute, which is almost universally used in such fields as opinion polling and market research, is called stratified random sampling.
    3. The importance of using a small group is this: With a large group any difference produced by chance is likely to be a small one and unworthy of big type. A two-percent-improvement claim is not going to sell much tooth-paste.
    4. The point is that when there are many reasonable explanations you are hardly entitled to pick one that suits your taste and insist on it. But many people do.
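
To make the small-group point concrete, here is a rough simulation sketch (not from the book's text as quoted here): it repeats a fair-coin experiment many times and compares how widely the share of heads swings with 10 tosses versus 1000 tosses per experiment. The function names, trial count, and seed are assumptions chosen for illustration.

```python
import random


def heads_share(n_tosses, rng):
    """Fraction of heads in n_tosses of a simulated fair coin."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


def spread(n_tosses, n_trials=1000, seed=0):
    """Gap between the highest and lowest heads share seen across n_trials experiments."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    shares = [heads_share(n_tosses, rng) for _ in range(n_trials)]
    return max(shares) - min(shares)


small = spread(10)    # small groups: chance alone produces big differences
large = spread(1000)  # large groups: chance differences shrink

print(f"10 tosses per experiment:   spread of heads share = {small:.2f}")
print(f"1000 tosses per experiment: spread of heads share = {large:.2f}")
```

With small groups, someone who wants a headline-sized result only has to keep rerunning the experiment until chance delivers one.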
  3. Averages
    1. When you are told that something is an average you still don’t know very much about it unless you can find out which of the common kinds of average it is—mean, median, or mode. So when you see an average-pay figure, first ask: Average of what? Who’s included?
    2. Only when there is a substantial number of trials involved is the law of averages a useful description or prediction.
    3. If the source of your information gives you also the degree of significance, you’ll have a better idea of where you stand. This degree of significance is most simply expressed as a probability.
    4. Comparisons between figures with small differences are meaningless. You must always keep that plus-or-minus in mind, even (or especially) when it is not stated.
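
A small sketch of why "Average of what?" matters: on a skewed pay list, the mean, median, and mode can all be honestly reported as "the average" and still differ widely. The payroll figures below are made up for illustration, not taken from the book.

```python
import statistics

# Hypothetical payroll: seven ordinary salaries and one very large one.
payroll = [20_000, 30_000, 30_000, 30_000, 40_000, 50_000, 60_000, 500_000]

mean_pay = statistics.mean(payroll)      # pulled upward by the single large salary
median_pay = statistics.median(payroll)  # half earn more, half earn less
mode_pay = statistics.mode(payroll)      # the most common salary

print(f"mean:   {mean_pay:,.0f}")
print(f"median: {median_pay:,.0f}")
print(f"mode:   {mode_pay:,.0f}")
```

Here one large salary pulls the mean well above what most people on the list earn, which is why an unqualified "average pay" figure says little until you know which average it is and who was included.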
  4. Other
    1. In the end it was found that if you wanted to know what certain people read it was no use asking them. You could learn a good deal more by going to their houses and saying you wanted to buy old magazines and what could be had?
    2. To say “almost one and one-half” and to be heard as “three”—that’s what the one-dimensional picture can accomplish.
    3. If you can’t prove what you want to prove, demonstrate something else and pretend that they are the same thing. In the daze that follows the collision of statistics with the human mind, hardly anybody will notice the difference. The semiattached figure is a device guaranteed to stand you in good stead. It always has.
    4. More people were killed by airplanes last year than in 1910. Therefore modern planes are more dangerous? Nonsense. There are hundreds of times more people flying now, that’s all.
    5. The fallacy is an ancient one that, however, has a powerful tendency to crop up in statistical material, where it is disguised by a welter of impressive figures. It is the one that says that if B follows A, then A has caused B.
    6. Percentages offer a fertile field for confusion. And like the ever-impressive decimal they can lend an aura of precision to the inexact.
    7. It is the illusion of the shifting base that accounts for the trickiness of adding discounts. When a hardware jobber offers “50% and 20% off list,” he doesn’t mean a seventy percent discount. The cut is sixty percent since the twenty percent is figured on the smaller base left after taking off fifty percent.
    8. Author Louis Bromfield is said to have a stock reply to critical correspondents when his mail becomes too heavy for individual attention. Without conceding anything and without encouraging further correspondence, it still satisfies almost everyone. The key sentence: “There may be something in what you say.”
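
The shifting-base arithmetic behind the “50% and 20% off list” item above can be checked in a few lines. This is a minimal sketch; the helper name `combined_discount` is a hypothetical one chosen for illustration. Each discount is applied to whatever price is left after the previous one, so the discounts multiply rather than add.

```python
def combined_discount(*discounts):
    """Total discount when each rate applies to the price left by the previous one.

    Rates are fractions, e.g. 0.50 for 50%.
    """
    remaining = 1.0  # fraction of the list price still being paid
    for rate in discounts:
        remaining *= 1.0 - rate  # each cut is figured on the smaller base
    return 1.0 - remaining


total = combined_discount(0.50, 0.20)
naive = 0.50 + 0.20  # the tempting but wrong way to combine them

print(f"stacked discount: {total:.0%}")
print(f"naive sum:        {naive:.0%}")
```

After the first cut half the list price remains, and the second cut takes 20% of that half, which is 10% of list. That gives the sixty percent total the passage describes, not seventy.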

What I got out of it

  1. Written decades ago, but even more important today.