
So I didn’t quite anticipate the flood of responses when I opened the floor on Twitter to help me out with today’s post.

While I anticipated the usual hang-ups, I also received a flurry of well-thought-out, serious objections to the application of analytics to the sport. Because of the sheer volume of both, today I’ll stick to common reader objections to stats analysis, and next week I’ll look at serious objections from those within the field itself. Apologies if I missed something obvious, but please do let me know.

1. Analytics and stats in general do not take context into consideration often enough

This one from @Dbaser92. I think this encapsulates a lot of what people don’t like about the use and application of advanced stats in football, so I’ll use this point to get a common problem with analytics writing out of the way.

I think as readers we need to pay close attention to what analysts are actually saying when they publish their findings. For example, there’s a big difference between “Joe Allen had a great performance because he had a 91% pass completion rate” and “shot dominance tends to correlate with a higher points total.” Both require context, but only one oversteps its bounds.

Simply posting a stat under the assumption that it is a good or bad number without having done the work to see if it’s repeatable or significant is bad analysis, but it doesn’t make analytics bad per se, much in the same way Adrian Durham doesn’t make football writing bad per se. In my experience, the majority of analysts don’t make huge empirical leaps, and if they do, they’re usually called out on it within minutes of posting.

One needs to be vigilant, both as writers (Twitter means we’re all writers now) and readers, in ensuring that analysis writes cheques that its butt can cash. That’s on you, man.

2. “Stats are like miniskirts, they don’t reveal everything”

That one from Prozone analyst and all-around smart guy @OmarChaudhuri. It’s a direct quote from a comment on an excellent post on Man United he wrote a few weeks ago.

Again, you as the reader need to be careful to judge what the analyst is telling you. Analytics isn’t about ‘revealing everything.’ It’s a method. Some use it to build better betting models, some use it to make better decisions in player recruitment, some use it to see if there’s a better way to play the game. It’s far better to view analytics as a tool, rather than a world-explaining philosophy. This is why I cringe when I read about the “Analytics Movement.” It’s like referring to the “Mathematics Movement.” Math is a method, a means of measurement. It’s not a world view. Your job is simply to judge whether the analyst accomplished what they set out to do.

3. Analytics doesn’t take into consideration the ‘intangibles’ like heart and romance

You know? It doesn’t. But the intangibles vs tangibles thing is a false dilemma. If you run a regression and you notice a strong correlation between, say, final third touches and goal differential, that is literally all you know. You can make all sorts of inferences from that information: possession-based teams win more games, build up play is superior to counter-attacking football, the big teams in Europe prefer short passing etc. But in the end, all you have is a sample size and a strong correlation between two variables. None of this discounts the importance of things like teammanship, or heart, or passion, or desire.
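To make the “that is literally all you know” point concrete, here is a minimal sketch of what a correlation actually hands you. The numbers are made up purely for illustration (they are not real team data), and the `pearson_r` helper is my own hypothetical name, not anyone’s published method.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# ASSUMPTION: invented figures for six imaginary teams, for illustration only.
final_third_touches = [120, 150, 170, 200, 230, 260]  # per match
goal_difference = [-12, -5, 1, 4, 10, 15]             # over a season

r = pearson_r(final_third_touches, goal_difference)
print(f"sample size = {len(final_third_touches)}, r = {r:.2f}")
```

What comes out is a sample size and one number between -1 and 1. Nothing in it says why the two move together, which is exactly why the inferences are on you.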

I think what gets analysts’ knickers in a twist is when people make huge empirical leaps based on those intangibles. “Chelsea lost the game because they weren’t confident.” This is an exaggeration, but pundits say this kind of thing all the time. Well, single matches involve a lot of complex interplay, unforeseeable outcomes, lucky bounces, fortunate decisions, incredible technique, momentary lapses in concentration, bad decisions from the refs etc. etc. To single out something as abstract as a ‘state of mind,’ one that the pundit presumes is equally shared by all eleven players, with no evidence to support the claim, is to open yourself to reasonable criticism.

This isn’t to say confidence doesn’t matter, or trust in the manager isn’t important. But we have no clear idea if they matter more or less than the weather or the state of the pitch. It sure feels like they do.

4. A lot of analytics is just pointing out the obvious

This is probably the most common complaint I hear about stats analysis. “Oh, teams that shoot more than they concede will win more games? Bravo Einstein.” “Wow, so you’re saying if teams shoot from better positions they’ll score more goals? Here’s your Nobel!”

Couple things here. First, imagine someone asked me how far it was from New York to Los Angeles. I could say, “it’s far.” Or I could say, “4,491.0 km.” Or I could say, “one day and 16 hours of driving.” The value of the answer depends on what you’re looking to do.

If the analyst set out to find a revolutionary new way to win at football, well, the correlation between shot dominance and table position may not help.

But if they’re building a betting model? Or giving managers a concrete means to tell if their team is in okay shape? Well, measuring an “obvious” data point is going to be very helpful indeed.

5. It’s never going to be like baseball

Oh man this, from @Rui_xu, is defo a common one. Baseball is a sport of discrete events, soccer is a complex game of flow. Ergo advanced stats work in one and not the other.

That soccer is the way soccer is puts analysts at a disadvantage, if the aim of the analyst is to find market inefficiency in how players and teams are evaluated. If your goal is to do something else, like predict which team will likely finish first and which team will likely be relegated, or which player is creating better chances on a regular basis than another player, analytics can be very helpful indeed. Plus X,Y positioning data can offer more depth, though I’m not sure it will foment a revolution in how teams play the game. I doubt it.

But again, it all depends on what analysts are trying to do. This is the running theme, in case you missed it.

6. Stats are boring and stat blogs are poorly written

When you see something stats related, don’t read it. Don’t know what else to tell you. It’s the Internet. You have personal agency, comrade. Don’t like it, avoid it.

7. I’m not good at math

Well, getting your head around certain statistical concepts is tricky for us all. I think though at the core it’s just common sense. Teams that tend to do this over a long period of time tend to score this many goals or earn this many points. Here is a graph showing a regression analysis. Google ‘regression analysis.’ I would also recommend reading Michael Mauboussin’s The Success Equation, which is a great primer on most of this stuff. But if you don’t want to be bothered, see number 6.
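If ‘regression analysis’ still sounds scary, here is roughly what the simplest version does: draw the straight line that best fits a scatter of points, then read a prediction off it. This is only a sketch. The shot share and points figures are invented for illustration, and `fit_line` is a hypothetical helper name of mine.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# ASSUMPTION: made-up season figures for five imaginary teams.
shot_share = [0.42, 0.46, 0.50, 0.54, 0.58]  # share of all shots in their matches
points = [38, 47, 52, 61, 70]                # league points

slope, intercept = fit_line(shot_share, points)

# "Teams that tend to do this ... tend to earn this many points."
predicted_points = intercept + slope * 0.52
print(f"a team taking 52% of the shots would be expected to earn about {predicted_points:.0f} points")
```

That is the whole trick: teams that tend to do this much of X tend to end up with about this much of Y. The graph you see on a stats blog is usually just this line drawn through the dots.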