Wednesday Wisdom

Simpson's Paradox

WHO?

Edward Simpson was born in London, England in 1922 and studied at the University of Cambridge, where he earned his PhD in mathematics in 1949. He spent most of his career at the University of London, where he held various academic positions in the Department of Statistics at the London School of Economics and Political Science. In 1951, while working on statistical analysis, he described the anomaly in data that became known as "Simpson's Paradox". Simpson's work has had a lasting impact on the field of statistics and continues to be widely referenced and studied today. He is remembered as a pioneering statistician and a key figure in the development of modern statistical methods, including multivariate analysis and methods for analyzing complex data sets.

What he produced

"Simpson's Paradox" is a statistical concept that refers to a situation where a trend observed in multiple groups of data disappears or reverses when the groups are combined. It is a reminder that correlation does not always imply causality, and it highlights the importance of considering the context and underlying factors when interpreting statistical relationships. The paradox shows that statistics are a powerful tool but that, without logic, they can produce false narratives. The effect tends to appear in larger data sets such as populations, where the statistic for the whole may point to one conclusion while every subgroup tells another story. The paradox is useful in understanding demographics, medicine, college admissions, and even baseball, where statistics (think of "Moneyball") help in decision-making.

As we embark on another baseball season, a great example of this paradox can be found in comparing two players' batting averages. In 2007 and 2008, Jacoby Ellsbury of the Boston Red Sox batted .353 and .280 while his teammate Mike Lowell batted .324 and .274, which seemingly shows Ellsbury having two better seasons. However, when the two seasons are looked at in aggregate, Lowell batted .304 while Ellsbury's two-year average was .293. The reversal is possible because each player's at-bats were split unevenly between the two seasons, so each combined average is weighted toward a different year. This reversal of the relationship is known as "Simpson's Paradox".
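For readers who want to check the arithmetic, here is a minimal sketch in Python. The hit and at-bat counts are assumptions: the article quotes only the averages, and these counts were chosen because they reproduce those averages when rounded to three places. The names `seasons`, `average`, and `combined_average` are made up for the illustration.

```python
# Hits and at-bats per season, as (hits, at_bats).
# ASSUMPTION: these counts are not from the article; they were picked
# because they reproduce the batting averages quoted above.
seasons = {
    "Ellsbury": {2007: (41, 116), 2008: (155, 554)},
    "Lowell": {2007: (191, 589), 2008: (115, 419)},
}


def average(hits, at_bats):
    """Batting average for a single (hits, at_bats) pair."""
    return hits / at_bats


def combined_average(player):
    """Pool hits and at-bats across seasons, then divide once."""
    hits = sum(h for h, _ in seasons[player].values())
    at_bats = sum(ab for _, ab in seasons[player].values())
    return average(hits, at_bats)


for player, by_year in seasons.items():
    per_year = ", ".join(
        f"{year}: {average(*counts):.3f}" for year, counts in by_year.items()
    )
    print(f"{player}: {per_year}, combined: {combined_average(player):.3f}")
```

Under these assumed counts, Ellsbury leads in each season, yet most of his at-bats fall in his weaker year while most of Lowell's fall in his stronger one, so the pooled averages swap places.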

2023: Why do we care?

Simpson's Paradox can arise due to various factors, such as hidden variables that differ between groups, poor or non-random sampling, or biased data. It highlights the importance of considering the context and underlying factors when interpreting statistical relationships, and it is a reminder that correlation does not always imply causality. This is where statistics and logic intersect and an epistemological approach is needed. For example, there is a high correlation between ice cream consumption and shark bites. Huh? They certainly have nothing to do with each other. More people swim in hot weather and also eat ice cream: two statistics that correlate but, as logic dictates, do not cause one another.

In late 2021, headlines across social media stated that vaccinated people were dying at twice the rate of the unvaccinated. Conspiracists jumped on this statistic without any understanding of the numbers. While this may have been true in aggregate, when the data was broken down by age group, mortality and hospitalization rates among the vaccinated were 5-10x lower. The aggregate was skewed by older and more vulnerable people, who were vaccinated at higher rates, while younger demographics with extremely low mortality rates were more often unvaccinated. Once again, a prime example of "Simpson's Paradox", and one more salient than a baseball statistical anomaly.
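The mechanism is easier to see with numbers. The sketch below uses a made-up population: every count in it is invented purely to show how the skew works and is not real epidemiological data. The names `cohort`, `rate_per_100k`, and `aggregate_rate` are hypothetical as well.

```python
# HYPOTHETICAL counts, invented only to show the mechanism (not real data).
# age group -> vaccination status -> (people, deaths)
cohort = {
    "older": {"vaccinated": (900_000, 1_800), "unvaccinated": (100_000, 1_000)},
    "younger": {"vaccinated": (100_000, 2), "unvaccinated": (900_000, 90)},
}


def rate_per_100k(people, deaths):
    """Deaths per 100,000 people."""
    return 100_000 * deaths / people


def aggregate_rate(status):
    """Death rate for one status with all age groups pooled together."""
    people = sum(cohort[group][status][0] for group in cohort)
    deaths = sum(cohort[group][status][1] for group in cohort)
    return rate_per_100k(people, deaths)


for group, by_status in cohort.items():
    vax = rate_per_100k(*by_status["vaccinated"])
    unvax = rate_per_100k(*by_status["unvaccinated"])
    print(f"{group}: vaccinated {vax:.1f}, unvaccinated {unvax:.1f} per 100k")

print(
    f"all ages: vaccinated {aggregate_rate('vaccinated'):.1f}, "
    f"unvaccinated {aggregate_rate('unvaccinated'):.1f} per 100k"
)
```

In this invented population the vaccinated rate is five times lower within each age group, yet the pooled vaccinated rate comes out higher, simply because most of the vaccinated are in the high-risk older group.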

Simpson's Paradox can be used intentionally to mislead, or it can feed confirmation bias (finding data to confirm a previously held belief). Logic and philosophy fill in the blanks to get to the truth, especially when statistics are misinterpreted or used to push a false narrative.

Simpson has been quoted as saying, "The theory of statistics is intimately connected with the theory of knowledge, and therefore with philosophy."

 

And now you know...

Philosophy is the art of thinking, the building block of progress that shapes critical thinking across economics, ethics, religion, and science.

METAPHYSICS: Literally, the term metaphysics means ‘beyond the physical.’ Typically, this is the branch that most people think of when they picture philosophy. In metaphysics, the goal is to answer the what and how questions in life. Who are we, and what are time and space?

LOGIC: The study of reasoning. Much like metaphysics, logic helps us understand and appreciate how we perceive the rest of our world. More than that, it provides a foundation on which to build and interpret arguments and analyses.

ETHICS: The study of morality, right and wrong, good and evil. Ethics tackles difficult conversations by adding weight to actions and decisions. Politics takes ethics to a larger scale, applying it to a group (or groups) of people. Political philosophers study political governments, laws, justice, authority, rights, liberty, ethics, and much more.

AESTHETICS: What is beautiful? Philosophers try to understand, qualify, and quantify what makes art what it is. Aesthetics also takes a deeper look at the artwork itself, trying to understand the meaning behind it, both art as a whole and art on an individual level. A question an aesthetics philosopher would seek to address is whether or not beauty truly is in the eye of the beholder.

EPISTEMOLOGY: This is the study and understanding of knowledge. The main question is: how do we know? We can question the limitations of logic, how comprehension works, and our ability (or perceived ability) to be certain.