Toronto FC v Montreal Impact

The Montreal Impact had, up until June 20th, 2013, a 9-3-2 record in the MLS Eastern Conference. In their last four games, however, they’ve lost two and drawn two. Their latest game was a 4-0 loss to top-of-the-table rivals New York Red Bulls, and the two preceding draws came against Chivas USA and Toronto FC, two of the worst teams in the league. If you were a writer under deadline tasked with writing a story on the Montreal Impact season to date, what would you have to say about the team?

One very attractive option would be to use the two draws against bad teams and the big loss against a very good team as evidence that Montreal are “dropping down” to their level of true talent. After all, Montreal don’t have the players of the Red Bulls, nor do they have the “experience” of Sporting Kansas City in fighting for top spot. Finally, they didn’t make the playoffs last season, finishing 7th in the East on 42 points. This all feels vaguely like a sign Montreal must be overachieving.

You see this argument made often in the Premier League, for example, particularly when applied to teams like Spurs. I don’t want to be uncharitable, but it often seems that this argument rests on nothing much more than, “Spurs don’t regularly qualify for the Champions League, don’t tend to finish above Arsenal, and certainly never challenge for titles, ergo, the best they can hope for is 5th place.” And when the prediction comes true, even when it comes down to a matter of a single, solitary point, everyone feels pretty vindicated.

This past weekend I wrote about the Impact (and other teams) for the Guardian, and prefaced my little blurb with this:

What’s the difference between a great team going through a temporary slump, and a mediocre team regressing to its true level of talent after a “lucky” streak? If you know the answer, you should consider taking up a career in sports gambling.

Long-time readers and analytics enthusiasts will know that we have a pretty good indicator of a team’s “luck” in PDO, which is just shot percentage plus save percentage, multiplied (perhaps arbitrarily) by 1000. Because these statistics regress heavily to the mean, they seem to be, for the most part, random noise. That means they’re not great at measuring an underlying baseline of skill or talent, but they’re pretty good at making a judgment call over whether a team might be under or overachieving (insert your own pedantic caveats here).

I’m not privy to a lot of MLS shot data, sadly, but I can reverse engineer a rough answer from whoscored.com game averages. What follows should therefore be taken with a grain of salt. It doesn’t take into consideration quality of competition or game states or any of that shit, for example. And, so far as I know, no one has done a historical linear regression between PDO and points totals in MLS, but I would guess they regress just as quickly as anywhere else.

Anyway, Montreal’s shot percentage is 13.2%. Their save percentage is 89.24%. This yields a PDO number of 1024. Is this high? I don’t know the context in the East, but Ben Massey has done a little work on this in the West as recently as June 20th. Seattle were rocking a sky-high PDO of 1119, and Vancouver were in the basement at 952. Seattle were clearly set for a correction (and recent results suggest as much), and Vancouver were going to improve (five wins and one draw in their last six games ain’t bad!).
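For anyone who wants to check the arithmetic, here is a minimal sketch of the PDO sum in Python. The function name `pdo` is my own, and it assumes the two inputs are percentages as quoted above. The 1000 baseline is the usual reference point for "neither lucky nor unlucky"; treat the distances from it as rough indicators only.

```python
def pdo(shot_pct: float, save_pct: float) -> float:
    """Return PDO from shot and save percentages (e.g. 13.2 means 13.2%).

    PDO = (shot % + save %) expressed as fractions, multiplied by 1000.
    """
    return (shot_pct / 100 + save_pct / 100) * 1000


# Montreal's numbers as quoted in the text.
montreal = pdo(13.2, 89.24)
print(round(montreal))  # 1024

# PDO values quoted in the text, shown as distance from the 1000 baseline.
quoted = {"Montreal": round(montreal), "Seattle": 1119, "Vancouver": 952}
for team, value in quoted.items():
    print(f"{team}: {value} ({value - 1000:+d} vs 1000)")
```

On these numbers Montreal sit 24 points above the baseline, compared with 119 above for Seattle and 48 below for Vancouver.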

And, again, while we don’t really know if Total Shots on Target Ratios have as high an R squared correlation to points totals as they do in the Premier League, Montreal maintains an impressive TSotR of .625.
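As a quick illustration of what a ratio like that means, here is a small Python sketch. The function name `tsotr` is mine, and the per-game figures in the example are hypothetical numbers picked only because they produce .625; they are not Montreal’s actual shot counts.

```python
def tsotr(sot_for: float, sot_against: float) -> float:
    """Total Shots on Target Ratio: a team's share of all shots on target
    taken in its games (its own plus its opponents')."""
    total = sot_for + sot_against
    if total == 0:
        raise ValueError("no shots on target recorded")
    return sot_for / total


# Hypothetical per-game averages, chosen only to reproduce the quoted .625.
print(tsotr(5.0, 3.0))  # 0.625
```

In other words, a TSotR of .625 means a team is taking five of every eight shots on target in its games.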

So, while the picture isn’t definitive (had I the resources I would isolate for GS and qualcomp), there’s some evidence that Montreal is just about where they should be. That’s not to say the picture may not drastically change between now and the end of the season. But if I were Joey Saputo, I might want to think seriously about the wisdom of replacing head coach Marco Schällibaum with Juan Carlos Osorio at this delicate stage.

I’m writing this because of something I read by the great sabermetrician Tom Tango this week, on Bill James and the importance of recognizing and appreciating random variation when you’re looking at things like streaks. Tango quotes Bill James:

So, Bill says:

The question is, to what extent, in watching the games, are we seeing what is real, and to what extent are we seeing an illusion created by random clusters? … But for the most part, those studies always show that the variance in the real-life performance is identical to the variance that would be expected if nothing was operating except the normal randomization.

Now, I wouldn’t say “identical”, but the spread in real-life performance is only slightly larger than you’d expect from random variation, and so, makes it a virtually unactionable property. The Book does document the existence of real streaks (cold hitters, hot pitchers, clutch skill, etc), but it’s barely visible in the most extreme of conditions that for all intents and purposes, you’d only be able to use this information in tie-breaker-type scenarios.

So, Bill is right that the question is not whether something exists, but by how much. And that’s what the job of a saberist really is about, to figure out how much signal is in all that noise. Because humans are involved, there’ll always be a signal.
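To get a feel for how much streakiness pure chance produces, here is a rough simulation sketch in Python. Everything in it is an assumption for illustration: a team whose chance of winning never changes from game to game, a win probability of 9/14 (reading the 9-3-2 record as nine wins in fourteen games), a 34-game season, and function names I made up. It counts how often such a team still goes four or more games without a win at some point in the season.

```python
import random


def longest_winless_run(results):
    """Length of the longest run of consecutive non-wins.

    `results` is a sequence of booleans, True meaning a win.
    """
    longest = current = 0
    for won in results:
        current = 0 if won else current + 1
        longest = max(longest, current)
    return longest


def share_with_winless_run(p_win, games=34, run_length=4,
                           n_seasons=10_000, seed=0):
    """Share of simulated seasons containing a winless run of at least
    `run_length` games, for a team with a constant win probability."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    hits = 0
    for _ in range(n_seasons):
        season = [rng.random() < p_win for _ in range(games)]
        if longest_winless_run(season) >= run_length:
            hits += 1
    return hits / n_seasons


# Assumed inputs: win probability 9/14, 34-game season, four-game winless run.
print(share_with_winless_run(p_win=9 / 14))
```

The simulated team has no form, no slumps and no luck running out, so whatever share it prints is the rate at which four-game winless runs show up from random clustering alone.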

The problem of streaks is usually applied to individual player actions, like hot streaks and that sort of thing, but I think it should be applied to winning streaks as well. That’s again why I think that, despite the enormous strides being made, this area of football analytics still needs to be further explored, and shots data needs a better marriage with CMS for us bozos to be able to quickly judge, say, a team’s tied GS PDO against a particular quality of competition, without having to stare at an Excel table.