
Last week I wrote a little bit on Expected Goals. While not a new metric, it’s starting to come into vogue as more and more analysts demonstrate its predictive value.

The idea behind Expected Goals or ExpG is simple: it uses average conversion rates by shot type (location, foot or head, distance, etc.) to add another layer of analysis to raw shot data, like Total Shots Ratio.
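For the curious, here is a minimal Python sketch of that idea. The shot categories and conversion rates are placeholder numbers I made up for illustration, not real league averages, and the function name `expected_goals` is hypothetical; actual ExpG models use finer-grained shot types.

```python
# Illustrative average conversion rates by shot type.
# ASSUMPTION: these categories and numbers are placeholders, not real data.
CONVERSION_RATES = {
    ("six_yard_box", "foot"): 0.30,
    ("penalty_area", "foot"): 0.12,
    ("penalty_area", "head"): 0.08,
    ("outside_area", "foot"): 0.03,
}


def expected_goals(shots):
    """Sum the average conversion rate for each shot's (location, body part) type."""
    return sum(CONVERSION_RATES[shot] for shot in shots)


# A hypothetical team's shots in one match.
team_shots = [
    ("six_yard_box", "foot"),
    ("penalty_area", "head"),
    ("outside_area", "foot"),
    ("outside_area", "foot"),
]

print(round(expected_goals(team_shots), 2))
```

Four shots go in, but the ExpG total weights the close-range effort far more heavily than the two speculative ones from distance, which is exactly the layer a raw shot count misses.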

There has been an outpouring of work on this subject over a short period of time, so it’s hard to keep track of every new development. My tentative opinion, based on the very early returns, is that the truly “repeatable” element (that is, the part of ExpG that involves skill rather than random variation) is, as with shots, the volume of higher-probability chances. The gap between ExpG and actual goals scored, like total team shot percentages, comes down, I think, to as yet unknown variables and skews a lot higher for the “best” teams.

Now as I’ve discovered over the years, whenever you write about what you think is clearly an exciting development, someone inevitably leaves a comment like this one:

“Wait. So you’re telling me the team that creates more, high quality chances, will win?

BRILLIANT!”

(This is a real comment).

And this very real commenter has a point. What do football clubs do except try to create as many high-quality scoring chances in a game as they can with the players they can afford? What is the added value of this kind of statistical analysis?

Well, as Daniel Altman eloquently argued last week, it depends on just what exactly you’re looking to do with the data. If you’re a gambler, this kind of data can help improve your predictive model and put more money in your pocket. But what if you’re a manager?

Before I answer, I would urge you to take a quick look at Michael Caley’s Premier League table, which incorporates a host of data on Expected Goals for and against, shots from various areas of the pitch, etc., and reveals a fairly distinct correlation (this season at least) between the ratio of ExpGs for and against and place in the table.
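To make "the ratio of ExpGs for and against" concrete, here is a rough Python sketch. It assumes the ratio is computed the same way as Total Shots Ratio, as ExpG for divided by ExpG for plus against; that definition is my assumption and may not be the one used in Caley’s table. The team names and totals are invented for illustration.

```python
def expg_ratio(expg_for, expg_against):
    """Share of total ExpG in a team's matches that belongs to that team.

    ASSUMPTION: defined by analogy with Total Shots Ratio, for / (for + against).
    """
    total = expg_for + expg_against
    if total == 0:
        return 0.5  # no chances either way: treat as an even split
    return expg_for / total


# Hypothetical season totals: (ExpG for, ExpG against). Not real data.
season_totals = {
    "Team A": (60.0, 30.0),
    "Team B": (45.0, 45.0),
    "Team C": (30.0, 50.0),
}

# Rank teams from best to worst ratio, for comparison with league position.
ranked = sorted(
    season_totals,
    key=lambda team: expg_ratio(*season_totals[team]),
    reverse=True,
)

for team in ranked:
    print(team, round(expg_ratio(*season_totals[team]), 3))
```

Laying a ranking like this next to the actual league table is the kind of comparison the correlation refers to.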

Done? Good.

Now imagine you’re a Premier League manager and you don’t have access to this kind of league-wide information. All you know is that you have to improve your club’s performance or you’re going to lose your job (and maybe get your team relegated in the process). This leaves you with a vague set of hunches about what to do. Sure, you have your team analyst giving you sophisticated performance data about your squad. But what you know exists in a vacuum. You don’t have a clear idea of how your side is performing against other teams in the league in various areas, or even whether improving these aspects of your team’s performance makes a significant difference over the long term.

So you’re left to tinker, to adjust your tactics, to play around with formations, to try out better team talks. Here’s hoping!

This is of course a caricature. Perhaps Premier League clubs already have access to detailed, league-wide ExpG data and every team has already made any and all tactical adjustments to try to improve. Perhaps what we’re seeing in the league right now is in fact perfect competitive equilibrium, solely influenced by luck and filthy lucre. Maybe, but I have my doubts.

“Okay fine,” says our interlocutor. “But what good is analytics unless it tells managers WHAT they should do, tactically speaking, to improve their team?” And at this point they’ll send you Kirk Goldsberry’s Grantland article on the NBA using SportVU data to develop an Expected Possession Value.

This is where I hear the ghost of Nassim Nicholas Taleb scream “Platonicity!” in my ear. The idea of using extremely complex mathematical probability models to establish “idealistic” attacking scenarios sounds cool and sci-fi and stuff, but at some point you have to translate these abstract, momentarily ideal decisions to human players who are human and have years of experience playing a particular way and, if I haven’t said it already, are human.

Why do we need analytics to be an oracle? Why not just use it for old-fashioned diagnosis? Your team is allowing too many chances directly in front of goal. What do you do? Go back to the video, see exactly what is happening, and adjust. Try a different formation. Play a deeper defensive line. Switch to the counter against aggressive teams. There’s no magic answer. You don’t need a Harvard mathematician to figure it out.

That we can do this at all is incredible. If you can’t see the value in it, well, good luck to you in your struggle to stay up.