
Last week I wrote a little bit on Expected Goals. While not a new metric, it’s starting to come into vogue as more and more analysts demonstrate its predictive value.

The idea behind Expected Goals or ExpG is simple: it uses average conversion rates by shot type–whether by location, foot or head, distance etc.–to add another layer of analysis to raw shot data, like Total Shots Ratio.
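To make that concrete, here is a minimal sketch of the arithmetic, not any particular analyst's model. The shot categories and the conversion rates in `CONVERSION_RATES` are hypothetical placeholders chosen for illustration, not measured values; a real model would estimate them from historical shot data. Total Shots Ratio is computed the usual way, as shots for divided by all shots in the match.

```python
# Hypothetical average conversion rates (goals per shot) by shot type.
# Placeholder numbers for illustration only, NOT measured values.
CONVERSION_RATES = {
    "six_yard_box_foot": 0.30,
    "penalty_area_foot": 0.12,
    "penalty_area_header": 0.08,
    "outside_area_foot": 0.03,
}


def total_shots_ratio(shots_for, shots_against):
    """Share of all shots in the match taken by one team."""
    return shots_for / (shots_for + shots_against)


def expected_goals(shots):
    """Sum the average conversion rate for each shot a team took."""
    return sum(CONVERSION_RATES[shot_type] for shot_type in shots)


# An invented match: the home side takes fewer shots from better positions.
home_shots = [
    "six_yard_box_foot",
    "penalty_area_foot",
    "outside_area_foot",
    "outside_area_foot",
]
away_shots = ["outside_area_foot"] * 6

print("Home TSR: ", total_shots_ratio(len(home_shots), len(away_shots)))
print("Home ExpG:", round(expected_goals(home_shots), 2))
print("Away ExpG:", round(expected_goals(away_shots), 2))
```

In this invented match the away side wins the raw shot count, but the home side comes out ahead once each shot is weighted by how often shots of that type go in. That is the extra layer ExpG adds on top of shot totals.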

There has been an outpouring of work on this subject over a short period of time, so it’s hard to keep track of every new development. My tentative opinion, based on the very early returns, is that the truly “repeatable” element (that is, the part of ExpG that involves skill rather than random variation) is, as with shots, the volume of higher-probability chances rather than the ratio of ExpG to actual goals. The latter, I think, comes down to as yet unknown variables, much like total team shot percentages, and skews a lot higher for the “best” teams.

Now as I’ve discovered over the years, whenever you write about what you think is clearly an exciting development, someone inevitably leaves a comment like this one:

“Wait. So you’re telling me the team that creates more, high quality chances, will win?”


(This is a real comment).

And this very real commenter has a point. What do football clubs do except try to create as many high quality scoring chances in a game as they can with the players they can afford? What is the added value of this kind of statistical analysis?

Well, as Daniel Altman eloquently argued last week, it depends on just what exactly you’re looking to do with the data. If you’re a gambler, this kind of data can help improve your predictive model and put more money in your pocket. But what if you’re a manager?

Before I answer, I would urge you to take a quick look at Michael Caley’s Premier League table, which incorporates a host of data on Expected Goals for and against, shots from various areas of the pitch, etc, and reveals a fairly distinct correlation (this season at least) between the ratio of ExpGs for and against and place in the table.

Done? Good.

Now imagine you’re a Premier League manager and you don’t have access to this kind of league-wide information. All you know is you have to improve your club’s performance or you’re going to lose your job (and maybe get your team relegated in the process). This leaves you with a vague set of hunches about what to do. Sure, you have your team analyst giving you sophisticated performance data about your squad. But what you know exists in a vacuum. You don’t have a clear idea of how your side is performing against other teams in the league in various areas, or even if improving these aspects of your team’s performance makes a significant difference over the long term.

So you’re left to tinker, to adjust your tactics, to play around with formations, to try out better team talks. Here’s hoping!

This is of course a caricature. Perhaps Premier League clubs already have access to detailed, league-wide ExpG data and every team has already made any and all tactical adjustments to try to improve. Perhaps what we’re seeing in the league right now is in fact perfect competitive equilibrium, solely influenced by luck and filthy lucre. Maybe, but I have my doubts.

“Okay fine,” says our interlocutor. “But what good is analytics unless it tells managers WHAT they should do, tactically speaking, to improve their team?” And at this point they’ll send you Kirk Goldsberry’s Grantland article on the NBA using SportVU data to develop an Expected Possession Value.

This is where I hear the ghost of Nassim Nicholas Taleb scream “Platonicity!” in my ear. The idea of using extremely complex mathematical probability models to establish “idealistic” attacking scenarios sounds cool and sci fi and stuff, but at some point you have to translate these abstract, momentarily ideal decisions to human players who are human and have years of experience playing a particular way and, if I haven’t said it already, are human.

Why do we need analytics to be an oracle? Why not just use it for old fashioned diagnosis? Your team is allowing too many chances directly in front of goal–what do you do? Go back to the video, see exactly what is happening, and adjust. Try a different formation. Play a deeper defensive line. Switch to the counter against aggressive teams. There’s no magic answer. You don’t need a Harvard mathematician to figure it out.

That we can do this at all is incredible. If you can’t see the value in it, well, good luck to you in your struggle to stay up.

Comments (4)

  1. I’d like for somebody to do a Silverman on top European strikers (getting averages and mapping a decline when you hit 29/30) so that when RVP and Rooney stink things up in 2-3 years, nobody can claim they’re surprised.

  2. I went through the same thought process when I read the Goldsberry article. that was great stuff but still is probably a couple years or so from being able to work out the kinks and tell us anything in the NBA, in my opinion. soccer it’s going to be a lot longer than that. the largeness of the pitch compared to the court, 11 players instead of 5, and the lack of clear possession change add so much to the calculations that it will be very hard to make significant progress soon.

  3. Struggling football managers need to get on their crazy owner’s good side to maintain their jobs, regardless of their results. If the guy in charge likes you, you’re golden.

  4. Thanks for acknowledging my post.

    Here’s my beef:

    I am trying, TRYING, to sympathize with this school of football analysis (punditry?). In any event, this practice of putting statistics in context to derive actionable prescriptions and/or unique insights otherwise skipped over by simple observation, which is called ‘Analytics’. Bout right?

    However, why on earth am I (or is anyone) supposed to get excited over this development? As far as I can tell, and without being “deliberately obtuse” the following 2 overarching points can be gleaned from your recent posts on ExpG:

    1. We can use ExpG to empirically evaluate a team’s recent performance(s).

    Presumably if a team is generating a higher ExpG then they’re playing better. This, used in conjunction with other metrics, can show if a team is playing well or not. “Each metric is a tool used alongside one another to paint a more comprehensive picture” but this is essentially useless since we watch the match and come to our own conclusions regardless.

    It’s: “WOW what a chance! Did you see that?? He had a go from the half way line and it dinged the post!!”
    “Blimey that was a good chance, if only it had been half an inch lower!”

    Not: “you idiot, everyone knows he should have played it into the winger to work it across the danger zone so that striker could capitalize on the relative merits of shooting within a 5 yard radius of the penalty spot”

    You say that: “something doesn’t feel intuitively right” about the raw numbers and yet that’s exactly what observation is predicated on- feelings (sometimes inexplicable) and insight that is subjective and unfounded. Analytics will never supersede this. Fans, journalists (and professional players) will always base their emotional responses/evaluations on what they see and feel during and after the game. They will use their own judgements and make up their minds irrespective of “randomness” and “small sample size” and consequently owners will make decisions accordingly, managers will be sacked as a result of these unfounded observations.

    2. “ExpG has predictive value”. “The idea behind ExpG is simple: it uses average conversion rates by shot type to add another layer of analysis to raw shot data”. But to what end?

    A. So fair enough, it depends on what you’re looking to do with it: finally we get to the real point of it all, the real value Analytics has to bear- the WHY and HOW.

    It can put $ in a gambler’s pocket? How, exactly? By past shooting statistics serving to construct a model for future results? Sounds a bit like using a stock’s past volatility to forecast its price on November 14 2037, but hey! Maybe so? Especially when specific and ever-changing variables like players, opponents, age, weather, day of the week and kit are involved, how will these models help? At least by taking on these questions your endless pursuit of extolling the virtues of Analytics may garner some more followers, as that seems to be your primary modus operandi with Counter Attack.

    B. It can help a manager improve club performance and potentially reduce his chances of losing his job? Without Analytics you’re left with a vague set of hunches? What a load of absolute horseshit. You mean to tell me that without statistical analysis the insight/observation/diagnosis of a manager is without merit? It equates to vagueness? Or perhaps that it pales in comparison to the raw, incontestable numbers? That the experience built up playing at a high level of competition over 25 years and managerial experience that creates compassion, genuine empathy and intimate understanding of the subtleties of professional footballers’ lives, counts for less than statistical models? WHY should one rely on these numbers? “Tinker. Try better team talks. Here’s to hoping!”. That’s both snarky and simplistic, par for the course when it comes to your write-ups. A manager with unique insights into the human/professional footballer-specific factors that bear on performance is far more valuable and likely to succeed in reversing a team’s fortunes than a loose statistical strategy cobbled together under the guise of “layers of analysis”.

    So here’s my final ask: Why does it have to be all or nothing? And why, when it comes down to it, are you so damn hell-bent on pushing this crock of Analytics as if it’s heaven sent and the only thing that matters in football analysis? It is NOT the be-all and end-all that you make it out to be.
