In the process of advancing the cause of quantitative analysis in hockey, I have come across a number of fallacies and a good deal of duplicitous rhetoric deployed to undermine or dismiss the endeavor. This isn’t to say those skeptical of the utility of things like Corsi, zone starts and the like are necessarily wrong – we’re only taking the first few hesitant steps toward properly modelling what happens on the ice, after all. Skepticism is always warranted in matters of statistical analysis, particularly in sport, where control groups don’t exist and truly controlled scientific inquiry is impossible.

That said, many of the common arguments against advanced stats are mere sophistry: either the groping of someone too ignorant of the subject to advance a meaningful counterclaim, or the dismissal of evidence that is inconvenient or doesn’t square with populist notions.

Recently, Robert Cleave of Flamesnation discussed the disappointing season and apparent decline of Miikka Kiprusoff here. A rebuttal recently appeared at My Hockeybuzz by Flames blogger saneopinion. His post is an apt example of many of the fallacies that are typically brandished to attack advanced stats, so I have decided to give it a thorough fisking as a way to answer many of these criticisms:

Stats will never tell a whole story, they will never tell you about the compete levels in players, their motivations, their bad or good luck. They won’t tell you humidity factor in different arenas or noise level, they don’t tell you about the chirping of one player to another or about hidden injuries that players play through.

First off, the notion that “stats try to tell the whole story” is a strawman, albeit one universally mentioned during such discussions. Statistics are measures of performance. They aren’t plot devices nor thematic overviews. Stats help answer certain questions, track certain aspects of the game and inform decisions about the future. That is their relevance and their function.

The other factors mentioned – compete level, motivations, rink conditions – can all be captured or cancelled out statistically. If they are persistent, their effects can be detected in various outcomes. If they are transient, they will most likely cancel out over the long run, which is why sample size is always a primary consideration in quantitative analysis. The more games and seasons in the sample, the more powerful it becomes and the more meaningful the analysis. For example, if a player really is a good goal scorer, then over time he will score a lot of goals. It won’t matter that a couple of nights a year his skates may not have been sharp enough. His results may be poor on those particular nights, but over the long run the truth will out, as it were.
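The long-run logic above can be illustrated with a toy simulation. To be clear, the shooting rate and shot volume below are made-up numbers chosen for illustration, not real data: the point is only that a handful of games produces a noisy observed shooting percentage, while many seasons’ worth of shots converges toward the player’s underlying talent.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

TRUE_TALENT = 0.12     # hypothetical "true" shooting percentage
SHOTS_PER_GAME = 3     # hypothetical shot volume per game

def observed_pct(games):
    """Observed shooting percentage over a simulated stretch of games."""
    shots = games * SHOTS_PER_GAME
    goals = sum(1 for _ in range(shots) if random.random() < TRUE_TALENT)
    return goals / shots

ten_games = observed_pct(10)     # 30 shots: can land far from true talent
ten_seasons = observed_pct(820)  # 2,460 shots: hugs the true rate closely
print(ten_games, ten_seasons)
```

Transient factors (dull skates, bad ice) act like the random draws here: they push individual nights around, but they wash out as the shot count grows.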

Corsi, the way I understand it, is basically the plus minus stat on roids. It deals with 5 on 5 play and the shots for and against while a player is on the ice. Here again we could look at so many different variables it makes a head spin, ice condition, how well the trainer sharpened skates on and on… nothing is ever perfect.

This is a red herring. If every tiny contributor to error or variability in a player’s performance had to be accounted for in every performance metric, then all metrics could be dismissed as too hopelessly confounded to be meaningful. The reason Corsi is actually more useful than conventional plus/minus is that there are about ten times more Corsi events than goals. That means the sample size is much bigger and the results much more meaningful. Again, transient issues like how well a guy’s skates were sharpened get factored out over time.
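The sample-size advantage can be sketched with the standard error of a proportion. The event counts below are rough, illustrative figures (not measured totals for any player): the takeaway is simply that ten times the events shrinks the noise by a factor of √10, roughly three.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from n events."""
    return math.sqrt(p * (1 - p) / n)

P = 0.5             # an illustrative 50% on-ice share of events
GOAL_EVENTS = 60    # ballpark goal events a skater might see in a season
CORSI_EVENTS = 600  # ~10x as many shot-attempt events

se_goals = standard_error(P, GOAL_EVENTS)   # plus/minus-style sample
se_corsi = standard_error(P, CORSI_EVENTS)  # Corsi-style sample
print(round(se_goals, 3), round(se_corsi, 3))
```

The same player, the same season, but the goal-based estimate carries about three times the uncertainty of the shot-attempt-based one.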

I like stats as a basic outline that show how a player does in general. For instance if a player is minus 27 he is probably a defensive liability or if a player has 100 points he is pretty damn good player.

There is a difference between principled analysis and analysis via rules of thumb, or heuristics. As I noted in my recent article on the theory behind current advanced stats:

True analysis is understanding the variables and agents that give rise to an outcome. The means to the end rather than just the end itself. The coal before the diamond, if you will.

This is perhaps the crossroads at which conventional thought and so-called “advanced stats” most frequently clash. Every hockey fan’s perception of a player is inevitably anchored by heuristics; “rules of thumb” which evolve from information that is perceptually impactful or easily available. Human cognition works in general by conjuring habitual psychological markers that act as lighthouses in the maelstrom of data that is life in general. The problem is, a heuristic is not a principle: it is a quick-start guide at best, an inherent bias at worst. It isn’t the template. It is the stereotype.

In general, players at the tail ends of just about any performance marker are probably, as saneopinion says, “good or bad”. In particular, though, there are a number of contextual issues that can be examined quantitatively to determine the degree to which a given player’s output truly represents him. And detailed analysis of this type becomes even more useful (ie; rules of thumb become less indicative) the further from the extremes we get.

The real and only true way to judge hockey…is to watch hockey. Kipper was fantastic last year and yes there was the occasional bad goal but is he really the weakest link? I remember so many goals that I was flabbergasted at the defence being atrocious yet the stats say that five on five they do a decent job.

Observational analysis can certainly yield a lot of useful information. However, the human mind is notoriously innumerate and tends to detect illusory patterns and correlations and to fabricate narratives. In short, observation and memory are prone to certain framing and encoding biases that can lead to hopelessly subjective judgements. This is one of the main reasons principled quantitative analysis is absolutely essential if understanding and predicting performance is the goal.

For example, the issue at play here is almost certainly confirmation bias. Because Kiprusoff was once an elite goalie and is commonly held as a folk hero in Calgary owing to his outstanding performances earlier in his career as a Flame, Flames fans are now primed to interpret his performances as necessarily good ones. The cognitive dissonance between “Kipper is an elite goalie” and “Kipper isn’t very good anymore” leads many to dismiss the ample evidence of his very real decline over the last four years. Never mind that his reputation as an elite ‘tender is built almost entirely on the very evidence that is now denigrated in his defense: had he not previously saved a lot more pucks than the average goalie in the league, nobody would be up on the battlements defending his honor currently.

It’s hockey. One mental slip by a player and the opposition is in the slot in a prime scoring area.

Sure. But this is only relevant if the Flames were provably worse than the average NHL team at preventing scoring chances. There is no such evidence beyond subjective assertion. Calgary was actually one of the league leaders in limiting shots against at five-on-five, suggesting just the opposite.

The other issue with this argument is the implicit premise that NHL team quality is a strong causal force when it comes to a goalie’s overall even-strength save percentage. The logical problem for folks who advance this theory, particularly in defense of someone like Kiprusoff, is that it implies his prior elite performances were probably due to the strength of the team’s defense as well. It is inconsistent to blame relatively poor save rates on the abilities of the skaters but to celebrate high save rates as evidence of a goalie’s superior talent. Either teams have a strong influence on a goalie’s performance over time or they don’t.

The stats used in this blog say five on five when the opposition beat the Flames five on five it’s Kipper’s fault. C’mon.

“C’mon” isn’t an argument. A rhetorical expression of incredulity, perhaps. Believe it or not, the fact that some evidence or data seems unlikely or runs counter to established preconceptions doesn’t make it incorrect.

To summarize I think stats are only meant to round out an opinion and to maybe provide some general justification, if you rely solely on them you blind yourself to the beauty of hockey.

Perhaps the second most common strawman argument wielded against stats is the claim that quantitative analysis somehow robs the sport of its “beauty” or “excitement” or some other such inherent quality. To me, this is like claiming that watching, say, race cars isn’t as fun if you know how fast they’re going or how much horsepower each engine has. Not only is it a non sequitur; in truth, having greater information and context about what’s going on improves the experience for many. Stats and analysis certainly aren’t required to enjoy watching hockey (or, indeed, any sport), but the two are by no means mutually exclusive.

Besides if one really wanted to look at Kipper’s stats I could argue to one that matters most, wins, he was third. In losses he has less than bums like Lundqvist, Ward, Brodeur…you know statistically better goalies than Kipper 5 on 5… How is that possible…because hockey is played on the ice not on a calculator.

The difference between a valid argument and a rationalization can often be detected in the consistency of the premises. Here, saneopinion contradicts his stated proposition (stats aren’t overly useful!) by…referring to stats (wins) as somehow meaningful. That’s because the buried purpose of the piece is to defend the Flames starter (and therefore avoid the cognitive dissonance discussed above) by undermining the evidence that suggests his best days are behind him. The other evidence, the evidence that suggests he’s still good (win totals), must therefore be the more relevant metric. Not because it is more accurate in principle – wins are probably the very worst way to judge puck stoppers – but because it accords with existing preconceptions and is therefore less threatening.

There is still a long way to go for quantitative analysis in the NHL. However, the arguments forwarded by saneopinion – and many similar ones – are not legitimate criticisms of the current state of hockey theory or knowledge. At best, they are rationalizations erected to shield folks from evidence that seems either counter-intuitive or unpleasant.