The Footy Blog’s coverage of soccer analytics has taken a more “meta” approach, in keeping with the editor’s layman’s knowledge of statistical methods in sports. My wheelhouse is picking apart conventional media narratives, not Pythagorean tables. To that end, I’m not the only one to have noticed that US media criticism of the methodology of Nate Silver, FiveThirtyEight’s election stats blogger (and a former baseball statistician), mirrors the traditional sports pundit’s criticism of analytics.
David Roher’s defense of Silver and the use of analytics in sports on Deadspin yesterday is a must-read for anyone interested in this topic. It’s hard to single out one passage from Roher’s thorough account, but this gets to the core of his argument:
Just like their colleagues in the sports section, the political pundits see the wrong kind of uncertainty in Nate Silver. They associate statistics with mathematical proof, as if a confidence interval were the same thing as the Pythagorean Theorem. Silver isn’t more sure of himself than his detractors, but he’s more rigorous about demonstrating his uncertainty. He’s bad news for the worst members of the punditry, who obscure the truth so their own ignorance looks better by comparison and who make their money on the margin of uncertainty, too.
This, to me, gets at the heart of the fundamental misunderstanding of how analytics works. I can’t speak for all soccer analytics experts, but I think it can be broadly said that the eventual hope in the field is for a) a set of models and metrics with reliable predictive quality, over an entire season or the course of a single game, that can be isolated for regional tactical preferences and player skill (i.e. what a team should do to win more games across all possible worlds); and b) a set of metrics for scouts to help reasonably evaluate the long-term utility of a position player.
But while those goals would truly set the field apart and give it a strong dollar value, it’s likely they are chimerical. In reality, analytics experts just want to make better cases about how individual players should behave together in a football match against a particular opposition in order to win. And they want to make accurate qualitative statements about why a player played well on the day.
Most, if not all, football pundits have done exactly the same thing for years. This team should have done this but didn’t and they lost. This player was good because he tackled well, or was featured in a lot of plays or took a lot of shots.
The difference is that statisticians (well, the good ones) ground their statements in empirical evidence. For pundits this is poison, because they associate any form of rigorous scientific empiricism with certainty. In reality, analytics is just a long list of “shoulds.” This is what the Premier League table “should” look like based on transfer spending. Teams with higher shots-on-goal ratios “should” win more games over the course of a season. Mistaking these “shoulds” for “wills” is a bit like saying a forecast calling for a 60% chance of showers means it will certainly rain.
Weather forecasting is a good example here. That 60% is a probability, not a certainty. But that probability was determined by empirical evidence offered by meteorological science, not by the way the air smells, or grandpa’s swollen knee, or the fact that it rained yesterday, or the clouds in the sky. We don’t expect pundits to make weather forecasts; why do we accept their pronouncements on something as serious as football, let alone something as frivolous as politics?
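For anyone who wants to see the 60% point in numbers, here is a minimal sketch in Python. It is a toy simulation, not how meteorologists (or Silver) actually build forecasts: it assumes a perfectly calibrated forecaster, and the function names `simulate_forecast_days` and `rainy_fraction` are made up for this illustration. The idea is simply that across many days carrying a 60% forecast, it rains on roughly 60% of them and stays dry on the rest.

```python
import random


def simulate_forecast_days(rain_probability, n_days, seed=0):
    """Simulate n_days that each carry the same forecast probability of rain.

    Assumption: the forecast is perfectly calibrated, so each day is an
    independent draw that comes up "rain" with exactly rain_probability.
    Returns a list of booleans (True means it rained that day).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.random() < rain_probability for _ in range(n_days)]


def rainy_fraction(days):
    """Share of simulated days on which it actually rained."""
    return sum(days) / len(days)


if __name__ == "__main__":
    days = simulate_forecast_days(rain_probability=0.6, n_days=10_000)
    dry_days = len(days) - sum(days)
    print(f"Rained on {rainy_fraction(days):.1%} of days with a 60% forecast")
    print(f"Stayed dry on {dry_days} of {len(days)} of those days")
```

The forecast is a good one if the rainy share lands near 60% over many days. A single dry day with a 60% forecast doesn’t prove the forecaster wrong, any more than a single upset proves a shots-ratio model wrong.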