The Mona Lisa

“Sometimes Messier’s skates feel like broken galoshes; sometimes they feel like Mercury’s wings, and he doesn’t know why. It has nothing to do with their sharpening, tightening, the make of the skates, or his health or attitude. It is simply true.”
-Ken Dryden, Home Game

It began with Jaroslav Halak.

You remember this story. In 2010 the Montreal Canadiens, as is their custom in recent years, squeezed into the playoffs as an 8th seed. In doing so, they drew as their opponent the Washington Capitals. The Capitals were not then the troubled, fragmented franchise they’ve since become, but an offensive powerhouse that ripped up the League in storm and fury. As a first-round opponent, Montreal seemed likely to be hardly more than a speed bump. The smart money and virtually all the pundits said the same thing: Washington in 4 or 5. No contest.

So obviously when the Canadiens won the series in seven games it caused some cognitive dissonance for the hockey world. Journalists scrambled for explanations, and a lot of previously untested narratives were floated in quick succession. There was talk of dressing room camaraderie, of steely determination and unshakable resolve. There were tales of Martin’s cold, calculating defensive system and the players’ willingness to buy into it. Although both the shot counts and the scoring chances went heavily against Montreal, there was speculation that the Canadiens won perhaps by keeping the Capitals to the perimeter, or stifling breakaways, or somehow getting inside Ovechkin’s head.

You know where this is going, right? Team with a run of improbable short-term success? Narratives about brilliant coaching and exceptionally dedicated players? Invocations of shot quality? These phenomena attract hockey statisticians the way blood draws sharks. They’re the surface signatures that indicate the presence of unsustainable percentages. They’re the omens of regression. And, sure enough, the venerable Desjardins of what was then Behind the Net and is now Arctic Ice Hockey did a little digging and found that- shock and horror- there was no evidence that the Canadiens were getting better shots or restricting chances against or playing some remarkable system. They were benefiting from high percentages, specifically the .977 save percentage that Jaro Halak amassed over games 5-6-7 of the series. You do not have to be a mathematician to know that .977 is not a sustainable SV%, no matter what the system played, no matter what the defense is doing. Ergo, said the fancystats gentlemen, the Habs got lucky. Case closed.
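For anyone who does want the mathematician's version, here is a minimal sketch of how rare a .977 stretch would be for a goalie whose long-run level is lower. The shot total and the long-run save percentage are assumptions chosen for illustration, not figures from the series, and the sketch treats every shot as an independent event of equal difficulty, which real shots are not.

```python
import math

def prob_at_least(saves_needed, shots, true_save_pct):
    """Exact binomial chance of stopping at least `saves_needed` of `shots`."""
    return sum(
        math.comb(shots, k) * true_save_pct**k * (1 - true_save_pct)**(shots - k)
        for k in range(saves_needed, shots + 1)
    )

SHOTS = 130            # assumed three-game workload, not from the article
TRUE_SAVE_PCT = 0.920  # assumed long-run talent level, not from the article

# Smallest number of saves that yields .977 or better on SHOTS attempts.
saves_needed = math.ceil(0.977 * SHOTS)

hot_run_probability = prob_at_least(saves_needed, SHOTS, TRUE_SAVE_PCT)
print(f"Chance of a .977 or better stretch: {hot_run_probability:.4f}")
```

Under these assumptions the run is rare but not impossible, which is all "unsustainable" claims: it says nothing about how the saves were made.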

One day, after several posts demonstrating the inadequacy of the conventional explanations and the statistical evidence that such a run could potentially be nothing but randomness, after the requisite lengthy comment thread battles with recalcitrant Canadiens fans, the following survey was posted under the title: Questionnaire for Montreal Believers. The answer choices for every question were the same, incorporating what Desjardins must have imagined was the entirety of beliefs it was possible for Canadiens fans to hold about their unexpected defeat of the Capitals. They were as follows:

a. Halak is the next big-game goaltender.
b. The system is suited to the players and has proven it works, the unification aspect comes in, bonds the team, and makes it greater than the sum of its parts.
c. Stop pissing on my parade, Gabe
d. Montreal’s luck is bound to turn.

I looked at this survey far longer than it warranted. It was mostly a joke, of course, a time-and-space filler rather than a sincere question, but nevertheless it rankled that there was no answer there I could wholly accept. Each of the responses had an element of truth, each of them was also partly wrong. This was not the whole range of possible beliefs on the matter. Something was missing, although I couldn’t define what exactly.

I’ve been thinking about it for two years.

***

Regression (properly called regression to the mean) is not a difficult concept. There’s no complicated math involved, hardly even any simple math in fact. There’s no necessary averaging; you don’t have to do long division. You don’t even have to add anything. Understanding regression involves nothing more sophisticated than knowing whether a given number is higher or lower than another.

The numbers, in this case, are one of two percentages: shooting percentage or save percentage. Sometimes you have to consider both added together, a stat referred to as PDO, but for the time being we’ll stick with the component parts. The essential fact about percentages is that they do not vary all that dramatically over the long term. Shooting percentages, either for individuals or teams, will over long periods of time cleave close to certain averages- League averages in the case of teams, career averages in the case of individuals. The range of sustainable percentages, percentages that might be expected to continue into the bright blue future, is quite narrow. Therefore, any extraordinarily high shooting percentage or save percentage over a short run is destined to fall, and when it does, team results will fall accordingly.
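A rough sketch of why short runs wander and long runs do not, assuming each shot is an independent coin flip with a fixed chance of going in. That is a simplification, and the 9 percent baseline and the goal and shot counts are invented for illustration. The `pdo` function just adds the two percentages as described above.

```python
import math

def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO as described above: shooting percentage plus save percentage."""
    shooting_pct = goals_for / shots_for
    save_pct = 1 - goals_against / shots_against
    return shooting_pct + save_pct

def two_sigma_band(true_pct, shots):
    """Approximate range holding about 95% of observed percentages over
    `shots` attempts, under the independent-shots assumption."""
    sd = math.sqrt(true_pct * (1 - true_pct) / shots)
    return (true_pct - 2 * sd, true_pct + 2 * sd)

# Hypothetical hot stretch: 36 goals on 300 shots for, 18 goals on 300 against.
hot_pdo = pdo(36, 300, 18, 300)
print(f"PDO over the hot stretch: {hot_pdo:.3f}")

# The same assumed 9% shooter, watched over longer and longer samples.
for shots in (300, 2500, 10000):
    low, high = two_sigma_band(0.09, shots)
    print(f"{shots:>6} shots: observed shooting % mostly between {low:.3f} and {high:.3f}")
```

The invented 12 percent stretch sits inside the band for a 9 percent shooter over 300 shots, but well outside the band over 10,000. The high number is plausible in the short run and implausible as a long-run level.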

Regression is not a hypothesis, a conjecture, or a supposition. This is not some radical avant-garde theory. It’s true. It works. It is, in fact, pretty much the most consistently useful predictive principle thus far discovered in hockey stats. More than that, though, regression is intuitive. Or it should be, anyway. Hockey is a game of streaks and slumps. Fans of long standing know this; we’ve seen it happen dozens of times, hundreds even, on both the team and individual levels. Players who have seasons that shatter their career averages usually, superstars excepted, return to their customary form the next year. Teams that vastly overperform expectations in the early fall usually fall back to earth by late winter. Hot goalies eventually go cold; cold shooters eventually warm up again. Regression is nothing more difficult, really, than putting a handy measurement on a perennial truism: what goes up must come down. It ought to be about as controversial as gravity.

And yet, of all the battles I have ever seen across the interwebs over advanced stats in hockey, almost all of the most vicious, vitriolic, protracted conflicts have been over regression. It happens once or twice every season, the unfortunate result of two things: 1) regression is the easiest statistical principle from which to make an accurate prediction, and 2) the predictions it makes are invariably hard to hear. Some team will be riding high percentages and the fans getting all fluttery with silvery Stanley Cup-daydreams, and along will come some man with the numbers prophesying the fall and the blogs will just explode in spasms of rage. No matter how much regression happens, no matter how many case studies we have all witnessed, no one ever wants to believe it when it comes to their team.

Part of the problem is certainly bad narratives. Unsustainable percentages generate bad narratives the way damp logs generate mushrooms. Bad narratives, in this case, are almost always tales that give overwhelming explanatory power to factors of personality and disposition. In the absence of any concrete justification for success, there are invariably some people in every market who will want to spin a hot run as the result of some particular awesomeness of their own or one of their favored team members. There will always be a GM ready to say that it’s all due to his grand plan in action, or a coach happy to imply that he’s got a special secret system. There’s always a journalist ready to talk about a guy he thinks is great in the room, and Lord knows there are plenty of guys who are very happy to be called inspiring leaders.

Narratives of personality are almost always inaccurate and lacking in predictive power because they rely on evidence that cannot be surely known or proven. Issues of team chemistry, motivation, leadership, and effort almost certainly play some sort of role in influencing outcomes, but the role they play is tiny compared to on-ice issues of strategy, tactics, talent, and ability. Moreover, as outsiders- outside the room, outside the heads of the relevant parties- we cannot know anything about the real emotional and psychological characteristics of players. Narratives that locate the critical differences between winning and losing in feelings and character traits are at best exaggerations and at worst completely made-up shit that turn real people into fictional characters in the fantasy psychodramas of fans.

People who are deeply committed to self-created narratives about things they cannot possibly know, things like passion level and work ethic and chemistry in the room, will always be blindsided by regression. It’s their destiny, the price they pay for not being interested in tangible evidence that has proven itself true repeatedly in the past. But, while the bad-narrative believers may be the most aggressive shock-troops of the anti-regression crowd, I don’t think the backlash against the way the advanced stats community talks about regression and luck is entirely composed of such people. Because I know how regression works. I know what constitutes an unsustainable shooting percentage and an unsustainable save percentage. I am a PDO believer. I understand these concepts, I support them, I believe wholly in their validity and legitimacy, and I do not rely on narratives of personality to provide meaning in hockey for me. And yet, despite all of that, I still get the impulse to lash back every now and then.

***

When the statisticians speak of regression, they often resort to two words: luck and randomness. By this, they mean things that are unsustainable, that cannot be expected to hold as true over the next three years as they have for the past three weeks. Anything that is reflected in the average of many years of mature play is true talent. Anything that isn’t is luck.

The problem is that luck, in the common tongue, is not the same thing as unsustainability. Calling something unsustainable is purely descriptive: based on the evidence of the percentages, it cannot be expected to persist. Calling it lucky or random is adding a value judgment. Luck implies not just a result but a cause, and that cause is the chaos of the universe. Luck is the opposite of skill.

But spikes in performance* are still an element of skill. They’re not accidents; they’re not weird bounces or freak uninjuries. A player on a hot run is not having a series of encounters with enchanted pucks that just leap unbidden from his stick to the twine. Except in very rare cases, he’s not usually having an extra puck go in off his ass every other night. He’s not benefitting from anything that a layperson or a layplayer would identify as ‘luck’. Luck is winning the lottery. Luck is getting goals you never expected or intended. A player on a hot streak is still doing the things that lead to his success. He is still getting in position, he is still taking the shots, he is still in a very substantive way making his luck. He is, often, actually having a run of making good plays. The ability is real. It’s just unsustainable.

Unsustainability in hockey is not only due to luck; it’s also an organic result of the difficulty and complexity of the game. Hockey talent may well be a fixed thing, but its expression is invariably erratic. Consistency is the game’s great unrealized ideal, and though players struggle after it for their entire careers, hardly any of them ever achieve it. The statistician looks at the average over the longue durée, many seasons’ worth of work amalgamated into a reasonable anticipated contribution, but the player in the playing does not experience his abilities as an average. What he experiences is a series of peaks and valleys, streaks and slumps, heat and cold. Sometimes Messier’s skates feel like broken galoshes. Sometimes they feel like Mercury’s wings. He doesn’t know why.

It is the nature of the game. As I’ve talked about in this space before, hockey is a constant struggle to perfect something that is not perfectable, to achieve a level of precision that is virtually impossible for human beings to realize. The level of physical and mental discipline it takes to play the game well is so extraordinary and so exhausting that most players are forever, in some essential way, underperforming. Despite all the training and all the practice and his very best possible effort, on most game nights a man might make a dozen small lapses and errors, slight misjudgments of time and distance, moments of panic, milliseconds of distraction. There is a gap between what one knows one should be able to do and what one actually can do in hockey that begins the first time you pick up a stick and never really goes away, even at the pinnacle of the game. Hockey is played in the shadow between the idea and the reality, the conception and the creation.

But every now and then, for a short period of time, a player manages to tighten that gap between what he should do and what he can do a little bit more than usual. He gets to the right spot a half-step quicker, sees the lane opening up a half-second earlier, gets the shot off just a hair cleaner. Sometimes, for just a few games or a few weeks, things click together just a little more smoothly than usual, and sometimes, that shows up on the scoreboard. These are the peaks of the erratic expression of talent, and they are amazing, exhilarating phases. They also never, ever last.

The pursuit of hockey is partly the pursuit of more consistency and better averages, the slow workmanlike improvement of one’s game, but it is also the pursuit of these unsustainable highs. There is some serendipity in these peaks, in that they do rely on a confluence of internal and external factors for their expression, but it is the serendipity born of long and difficult labor, the emergent property that can only come out of something made ready for it. The point of the constant training and conditioning, the coaching and the video review, the rehearsing of shots and plays thousands of times, is not just to make the normal better but, insha’allah, to make the best better. When people dream hockey dreams, they don’t dream of long columns full of respectable career numbers. They dream of being able to have one perfect game.

Hockey is a little bit schizophrenic in this way. It holds two mutually contradictory values, the short-term and the long-term, and this is reflected in the structure of the season itself. The regular season is constructed to prize durable, sustainable talents, to average out streaks and slumps and give the favorable position to the most consistent teams. But then that reward is followed up by a tournament of ridiculously small sample sizes where any little surge of awesomeness, from anyone, no matter what their true talent level, might make the difference between a Cup ring and a golf cart. The game is designed, in the end, to give out its highest prizes based on unsustainable streaks.
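The contrast between the long season and the short tournament can be sketched with a little arithmetic. This assumes each game is an independent trial with a fixed win probability for the underdog, which is a simplification; the 40 percent figure is invented, and the hypothetical 81-game "series" stands in loosely for a regular-season-sized sample.

```python
import math

def series_win_prob(game_win_prob, wins_needed=4):
    """Chance of being first to `wins_needed` wins (best-of-seven by default).

    Sums over the number of losses suffered before the clinching win.
    """
    p, q = game_win_prob, 1 - game_win_prob
    return sum(
        math.comb(wins_needed - 1 + losses, losses) * p**wins_needed * q**losses
        for losses in range(wins_needed)
    )

UNDERDOG = 0.40  # assumed per-game win probability, for illustration only

best_of_seven = series_win_prob(UNDERDOG)
best_of_81 = series_win_prob(UNDERDOG, wins_needed=41)

print(f"Underdog wins a best-of-seven: {best_of_seven:.3f}")
print(f"Underdog wins a best-of-81:    {best_of_81:.3f}")
```

Under these assumptions the weaker team takes a seven-game series far more often than it would come out ahead over a season-length sample, which is the structural point: the short format leaves room for the unsustainable.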

Fans, of course, value both kinds of hockey success, in differing proportions according to the individual. As fans, that’s part of our privilege- we don’t have to think like GMs. We’re allowed to love someone out of measure for something they did once and never replicated. We’re allowed to see an unsustainable run of luck and cross our fingers and pray that it hold just a few games longer. Neither of those things is necessarily idiocy. Neither of those things is necessarily wrong. Condemn people, if you will, for their bad narratives but don’t denigrate the validity of enjoying and hoping for the unsustainable. That hope isn’t always delusion, it’s just a different kind of hockey value.

And so the dichotomy that often emerges from statistical work between ‘luck’ and ‘skill’, where the unsustainable is contemptuously dismissed as chance and the sustainable valorized as ‘true’ talent, is not wholly in sync with the values of the game. Sustainable talent may be the most essential thing as far as general management goes, it may be the foundation of good contracts and sensible cap allocations, but from the cultural standpoint it is only one part of hockey. We do not have hockey only to see who has the best averages over the duration of their career. We have hockey, in part, to see who can turn out the most ridiculously, improbably, unsustainably incredible performances. We have hockey to show us strange things that cannot possibly be maintained- eight-point nights and shutout streaks and Travis Moen scoring runs- and we love these things so very much that we deliberately create situations where they count for more than sustainable talent.

There must be a golden mean here, a compromise position, some way of recognizing that the game’s utmost moments are both real achievements and transient ones, things that cannot be permanent but are nevertheless whole and remarkable and worthy of appreciation.

***

Every art, every skill is like this at its highest point. Great novels, great musical compositions, great paintings- people struggle through long years of mediocrity and produce much that is dull, uninspiring, or merely ‘good’ in order to occasionally, every now and then, for reasons they cannot clearly define, produce something extraordinary. We do not dismiss a brilliant book as somehow less brilliant because its author hasn’t produced something equivalently brilliant every year. We don’t dismiss a beautiful composition as less beautiful because the composer also wrote a lot of less moving things. Similarly, we should not dismiss a transient hockey achievement as somehow less real because it isn’t replicated. No matter what the field, the underlying principle is the same: you struggle and train and practice and discipline your whole long life in order to be able to achieve these moments where you transcend the probable and push right up against the limits of the possible.

Those three games, at the end of the first round playoff series against the Capitals, they were Jaro Halak’s To Kill A Mockingbird, his Mona Lisa. That was his point of peak performance. It was incredible. It was remarkable. It was an achievement, and even if he never achieves such a thing again the fact that he did it once remains a significant and abiding part of his career as a goaltender. It was and is a part of his ‘true’ talent, although it is not a part of his sustainable talent.

This is the fifth answer to Desjardins’ quiz: Halak achieved an insanely impressive feat of goaltending that he will never in a million seasons be able to sustain. And also, stop pissing on my parade, Gabe.

*Unsustainably high percentages virtually always coincide with a spike in observed performance, but not every transient spike in observed performance is driven by percentages. All Huskies are dogs, not all dogs are Huskies.

Comments (9)

  1. I definitely voted C on that Gabe poll.

    I’ve actually tried making that argument before: Halak was known even in that regular season to have an extraordinarily high level of play for a period of up to a month, I believe we had Chris Boyle breaking down Halak vs. Price by month that year and his December numbers were incredible, essentially stopping everything from the secondary range (the outer limits of the ‘home plate’ scoring chance area) and only getting beat on the crease crashing plays/rebounds/deflections. It was a great year for Halak, but even that performance vs. Washington and then to a lesser extent Pittsburgh wasn’t necessarily unique, he had simply ‘found’ that level of his game at the exact right time.

    Just as we all are in various pursuits, we can’t always find that level, it’s fleeting, it is random (and hence the luck arguments), but we have potential for greatness. When we achieve it, we should celebrate it. When Guns N Roses released Appetite for Destruction, no one was saying “Yeah, this is great, but wait until they release a terrible record in 20 years. No one can be this good forever.” Or how, over the course of 20+ years, the average Simpsons episode has become quite dull when after 7 years the average episode was better than 95% of TV shows’ best episode.

    And Cammalleri did this as well in 2010, especially vs. the Penguins in Round 2.

  2. Two things I love about this post:

    1) I love this post. Fantastic job breaking down what luck and random actually mean when speaking about regression.

    2) The placement of this post right after Bourne’s post about how the Bruins’ mythical “punk test” was the reason they were amazing earlier in the year and that teams figured it out and that is why they suck now. I am a huge fan of Bourne, but that post reeks of a “tale that gives overwhelming explanatory power to factors of personality and disposition.”

    • I don’t think that regression is mutually exclusive from “other factors”. Regression tells us that over the long term, we expect certain stats to ebb and flow around an average, but it doesn’t tell us when and why those ebbs take place. Regression tells us that Boston will, at some point, play at a lower level. It’s predictive, but not explanatory. Bourne’s article is purely explanatory, and it’s a hypothesis at that. It’s an opinion. You can disagree on why Boston isn’t playing as well as they were, but citing regression as the “reason” doesn’t fly, because regression cannot explain. Will they play bad tomorrow? Will they play bad for a week? How about a month? Regression might tell us to expect an upswing, but it wouldn’t tell when, for how long, or how big that upswing will be.

      • And where exactly did I say any of what you just said? I never said regression is the reason they have tailed off. I merely pointed out that Bourne’s post was the exact type of explanation that has nothing to do with the actual on-ice statistics that determine whether a team wins or loses. I would imagine that the “punk test” may have played something of a role in the Bruins’ drop off, but it’s more likely that injuries, a drop off in goaltending and players scoring at a higher than career rate early this year were much bigger factors. But it’s a much better read to talk about the “punk test” instead. Which is basically exactly what Ellen was talking about in this article.

        • Yeah sorry, wasn’t trying to put words in your mouth, but my point was that his explanation and this article’s discussion of regression are not mutually exclusive. Everything factors into changing stats, and personality and disposition can have much larger impacts than this article seems willing to admit. Citing the stats doesn’t explain the “why”, and I’m arguing that, at times, players play better or worse than their career averages precisely because of things like swagger and punk tests.

          The article largely devalues factors that cannot be directly measured. It’s a scientific approach, and it has its uses, but I think it’s an incomplete approach once you leave the lab.

  3. What a great, great article. I think you captured something here that is really hard to get across.

  4. I don’t quite understand why we need statistics to let us know that players on hot streaks won’t stay on hot streaks:

    You mean that the Flyers won’t keep getting shut-outs? Really? Wow!

    I don’t mean to write off the statistics stuff, but a little more nuance might be nice. For instance, does playoff scoring regress to the regular season mean?

    In other words, maybe some players are chokers and some come up big.

    • “I don’t quite understand why we need statistics to let us know that players on hot streaks won’t stay on hot streaks”

      It tells us people are on a hot streak in the first place. Martin Brodeur wouldn’t be on much of a hot streak allowing 1 goal per game in the early 2000s on the Devils, but Halak was super hot. Why? We know about the quantity and quality (not 100%, but a good deal) of the shots faced. Gabe has commented on this before–it’s not so much “stats” as “counts.”

      As for the second question, penalties, goals per game, etc, are very, very close to identical between regular season and playoffs. With the playoffs at an individual level, though, you run into sample size issues.

  5. This is an excellent article to describe randomness and luck. It is the base to great team improvement because once we know randomness and luck happen and that we cannot explain causality on every event, there are some things that we can do to increase the “mean of a team”.
    That is, by the coaching staff and the athletes they coach being much, much better prepared and also doing things differently from what has ever been done before.
    If all things were equal (talent, salary cap etc.) then teams should take turns winning championships, which would average out over, say, a 30-year period. This doesn’t happen. Sports with no salary cap show that the spend on talent outweighs any other factor in success (Soccernomics). Issues such as hot hands in basketball are actually just examples of randomness in action (Thinking, Fast and Slow).
    However, great performers increase their own “Mean” performance (and subsequently the team’s too) because they work harder, practice more and use imagination to be the best. Those teams that outperform randomness and money do so because of preparation. (Hence the difference in the mean % of free throws of professional athletes versus college players.) It is many, many more hours of quality practice and performance under pressure.
    To those who talk about choking: well, that is another factor. Most clubs have lots of staff to help with physical training and rehab but very few have great sport psychs who deal with the mental weight training. When performers are helped with overcoming anxiety with specific actions and greater preparation is given to psychological training and selection, then players will stay within their adjustment capacity and you will NOT see such a problem in big games.
    So my take: there is regression to the mean, and there are factors such as salary caps, but there is also a way of increasing an individual’s “Mean”: getting a team to understand Mastery, Preparation, Imagination & Persistence and Never giving up. This can increase a team’s mean, which will lead to more success than randomness dictates.
    My example would be this: if I take the car to work here in Adelaide SA it generally takes 22 minutes to get into the city. Some days it is 18, and in fact it may be 18 several days in a row, but the other day it was 36 because of an accident and roadworks. Regression to the mean tells me that it will average out at about 22. I can be content with this as the time it takes to get to town.
    If I wanted to decrease the mean of 22 minutes I suggest that it must be by creativity or preparation. I could prepare to go earlier (much less traffic, so mean time will decrease). I could also always look up a roadworks or accident app and avoid problems that contribute to much longer times (preparation). Or I might even take the car halfway with the bike in the back and cut the mean time, as the last 3 km are gridlocked and I could park the bike more easily (imagination and preparation).
    Love putting the maths with the human factors of motivation and my experience as an elite coach, player and now psychologist. The maths is a wonderful explanation of anomalies and averages, but some of us know that if you don’t want to be average then you have to do much, much more than what average does and be smarter in recruiting.
