In The Seven Pillars of Statistical Wisdom, published in March by Harvard University Press, Stephen Stigler identifies seven fundamental principles of statistics, a largely interdisciplinary field.
Stigler, the Ernest DeWitt Burton Distinguished Service Professor of Statistics, explained that statistics is not a field that feeds on itself. Rather, statistics addresses quantitative questions in a variety of fields, such as philosophy, literature, medicine, physics, economics, and sociology. In his book, Stigler aims to distinguish statistics, as a data science, from mathematics and computer science, and to point out what makes it unique.
“The pillars are the support, not the substance, of statistics,” said Stigler. “This book is a taxonomy of the intellectual terrain of statistics.”
While writing The Seven Pillars of Statistical Wisdom, Stigler embraced the challenge of trying to communicate to a broad audience and make clear concepts that took 100 years to develop.
Stigler outlined the seven pillars as aggregation, information measurement, likelihood, intercomparison, regression, experimental design, and the residual. Here is a sampling of stories and examples from Stigler’s book that illustrate some of these ideas.
The first pillar, aggregation (taking an average, for example), is paradoxical.
“By aggregating, you lose the identity of the individual, so you’re throwing away information, but you’re also gaining information of a different sort,” said Stigler. “No one wants to be reduced to a statistic, but by losing the identity of the individual, you are producing information about the group.”
Information measurement, the second pillar, focuses on measuring the information available. “People often assume that the more data you have, the more information you have, but data and information are not proportional,” said Stigler. “If you double your data, you don’t double your information. In fact, sometimes you’re better off throwing data away.”
Stigler called on an example from John Venn, the English logician and philosopher remembered for the Venn diagram, to explain this paradox.
An army general has laid siege to a fort, and the people in the fort have run out of provisions and ammunition. They are ready to surrender, but the general knows he’ll have to replenish the fort to fight another approaching army, so he sends a spy to gauge what size cannonballs they’ll need.
The spy returns and says the army will need 8-inch cannonballs, but a second spy reports that the army will need 9-inch cannonballs. In this situation, it doesn’t make sense to take the average and bring 8.5-inch cannonballs, which would work in neither case. Instead, the general would be better off throwing out some of the information and choosing between eight and nine.
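Stigler’s remark that doubling your data does not double your information can be illustrated with a small calculation. The sketch below assumes the textbook case of averaging independent measurements, where the uncertainty of the average (its standard error) shrinks with the square root of the number of measurements. The function name and the numbers are illustrative assumptions, not taken from the book.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent measurements,
    each with standard deviation sigma."""
    return sigma / math.sqrt(n)


sigma = 1.0  # hypothetical spread of a single measurement

base = standard_error(sigma, 100)
doubled = standard_error(sigma, 200)      # twice the data
quadrupled = standard_error(sigma, 400)   # four times the data

# Doubling the data improves precision by a factor of about 1.41, not 2.
print(base / doubled)
# It takes four times the data to cut the uncertainty in half.
print(base / quadrupled)
```

Under these assumptions, each additional observation adds less than the one before it, which is one way to read the claim that data and information are not proportional.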
Likelihood, the third pillar, uses numerical probability to calibrate the value of data. To explain this idea, Stigler elaborated on philosopher David Hume’s claim that a miracle is a violation of natural law.
“According to Hume, if someone reports that the sun did not rise or that the tides didn’t come in, there are two distinct possibilities,” he said. “One is that a miracle actually occurred. Or, the person who witnessed the miracle is lying or misunderstood. Hume argued that it is far more likely that the person is not telling the truth or is unclear. Hume’s likelihood argument inspired Thomas Bayes and Richard Price to offer a counter argument that was the first appearance of Bayesian inference.”
Stigler described regression, the fifth pillar, as basically a relativity principle for statistics. “Depending on the data you select, you will get different answers that may even seem compatible,” he said.
For example, say you select an extremely tall person from a crowd. You might assume that this extremely tall person has equally tall parents, or equally tall children, on average. This would be incorrect, Stigler explained.
Height has two components: factors such as genetics that affect all family members equally, and variation from factors that are unrelated across family members and have no average effect on others in the family. The parents and children share the first component with the individual but not the second. An extremely tall individual will therefore, on average, have parents and children who are shorter than he or she is: it is more likely that an extremely tall person has only moderately tall parents and children.
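The two-component description above can be tried out in a short simulation. This is a minimal sketch under assumed numbers: the population mean, the two spreads, and the cutoff for "extremely tall" are all hypothetical, and both components are drawn from normal distributions for convenience.

```python
import random
import statistics

# All values below are hypothetical, chosen only for illustration (cm).
POP_MEAN = 170.0       # average height in the population
SHARED_SD = 5.0        # spread of the component shared by a family
INDIVIDUAL_SD = 5.0    # spread of the component unique to each person
CUTOFF = 185.0         # threshold for calling someone extremely tall


def simulate(n_families: int = 200_000, seed: int = 0):
    """Return (mean height of the selected tall people,
    mean height of one relative of each selected person)."""
    rng = random.Random(seed)
    tall, relatives = [], []
    for _ in range(n_families):
        shared = rng.gauss(0.0, SHARED_SD)  # same for both family members
        person = POP_MEAN + shared + rng.gauss(0.0, INDIVIDUAL_SD)
        relative = POP_MEAN + shared + rng.gauss(0.0, INDIVIDUAL_SD)
        if person >= CUTOFF:  # select only the extremely tall
            tall.append(person)
            relatives.append(relative)
    return statistics.mean(tall), statistics.mean(relatives)


tall_mean, relative_mean = simulate()
print(f"selected tall people: {tall_mean:.1f} cm")
print(f"their relatives:      {relative_mean:.1f} cm")
```

In this sketch the relatives of the selected people come out taller than the population average but shorter than the selected people themselves, because they inherit the shared component but not the individual one. The relative can stand for a parent or a child, since the model treats them symmetrically.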
Stigler has also written two other books, The History of Statistics: The Measurement of Uncertainty Before 1900 and Statistics on the Table: The History of Statistical Concepts and Methods.