Now that we’re approaching three weeks since the end of the domestic season, the analytics blogging community has simply exploded in output. It’s becoming extremely difficult to keep up with all the developments, particularly as the number of analysts seems to grow almost every week. Please bear with me if this gets…dense.
Last week’s State of Analytics column on shot conversion as a function of luck touched off a bit of controversy.
Coincidentally, many of the analytics blogs this week have focused on different ways to analyse shot conversion. The danger, however, is that the more you attempt to isolate the factors outside of individual skill that influence shot percentages, factors like game state and pitch area, the more you risk drawing false conclusions from a dangerously small sample size.
I’ll try to explain what I mean by this by going through some of the ways soccer analytics bloggers have approached this question over the past week.
From the Simple to the Complex
The simplest way to analyze shot conversion is to calculate the number of goals as a percentage of total team shots. As James Grayson helpfully reminds us, it’s important to ensure that penalties don’t skew your data sets here: the average conversion rate for penalties in the Premier League is 78%, whereas the shot-on-target conversion rate is around 20%.
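For anyone who wants to see the penalty adjustment spelled out, here is a minimal sketch. The function name and the team numbers are made up for illustration; they are not taken from Grayson’s data.

```python
def conversion_rate(goals, shots, penalty_goals=0, penalties_taken=0):
    """Goals as a percentage of shots, with penalties stripped out of both."""
    non_penalty_shots = shots - penalties_taken
    if non_penalty_shots <= 0:
        raise ValueError("no non-penalty shots to evaluate")
    return 100.0 * (goals - penalty_goals) / non_penalty_shots


# Hypothetical team: 60 goals from 500 shots, including 8 scored from 10 penalties.
raw = conversion_rate(60, 500)
adjusted = conversion_rate(60, 500, penalty_goals=8, penalties_taken=10)

print(f"raw: {raw:.1f}%  penalties removed: {adjusted:.1f}%")
```

Even a handful of penalties moves the headline number, which is why they are usually removed before teams are compared.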
Now Grayson and others over the last few seasons have gone to great lengths to demonstrate that team and individual shot conversion rates are very much a product of random variation: they regress to the mean very quickly, a sign they’re more a product of dumb luck than deliberate skill. Some, however, have taken issue with the notion that shot efficiency is a chimera and that volume of shots counts for everything.
The Differentgame blog for example looked at number of shots and shot conversion percentages from the centre of the box, an area with a generally high conversion rate, over three seasons. Some familiar names in both players and teams managed to maintain an above-average number in both categories. This analysis isn’t at all conclusive as far as conversion as a measure of skill is concerned, but Manchester United’s trend in both categories doesn’t appear at first glance to be merely accidental. This seems to match in part what we know of their tremendous PDO last season.
One can go deeper with this question by breaking down average shot conversion rates by area of the pitch. This is exactly what 11tegen11 did this past week when, armed with two seasons’ worth of data from the Eredivisie, he broke down the pitch into four ‘zones,’ each with its own average conversion rate.
More than that, he further broke down how game states (whether the score is -2 against, -1 against, 0, +1 for, +2 for, etc.) affect those conversion rates in each zone. This seems to hint at how extraneous factors, like whether a team is chasing a lead, influence shot quality.
One could apply this analysis to the Differentgame’s breakdown to further explain the consistency of certain teams and players, except as Simon Gleave warns:
— Simon Gleave (@SimonGleave) June 10, 2013
Remember: the touchstone of whether shot conversion is a function of luck or skill comes down to whether an individual player is more likely to score from a similar position on the pitch than a fictional (and as yet statistically non-existent) ‘replacement level striker.’ Broken down this way (shot from X position in Y game state), there simply isn’t enough data to make any kind of definitive statement that one player is better at finishing than another. I’ve heard that some data exists on this behind the proprietary wall, but I’ll defer to Wittgenstein and say, “Whereof one cannot speak, thereof one must be silent.”
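To put a rough number on the sample size problem, here is a sketch using the Wilson score interval, a standard way to put error bars on a proportion. This is my own illustration rather than anything the bloggers above used, and the two strikers and their shot counts are invented.

```python
import math


def wilson_interval(goals, shots, z=1.96):
    """Approximate 95% Wilson score interval for a conversion rate."""
    if shots <= 0:
        raise ValueError("need at least one shot")
    p = goals / shots
    denom = 1 + z * z / shots
    centre = (p + z * z / (2 * shots)) / denom
    half_width = z * math.sqrt(p * (1 - p) / shots + z * z / (4 * shots * shots)) / denom
    return centre - half_width, centre + half_width


# Hypothetical: two strikers, 20 shots each from the same zone and game state.
low_a, high_a = wilson_interval(5, 20)  # striker A scores 5
low_b, high_b = wilson_interval(3, 20)  # striker B scores 3

print(f"A: {low_a:.2f} to {high_a:.2f}")
print(f"B: {low_b:.2f} to {high_b:.2f}")
print("intervals overlap:", low_a < high_b and low_b < high_a)
```

With only 20 shots apiece the two intervals overlap heavily, so the gap between 25% and 15% says very little about who the better finisher is.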
Okay, so another way to approach this is to look at where on the pitch players are shooting from as a percentage of their total shots. We’ve already ascertained that shooting from certain areas of the playing surface will produce a higher conversion rate in general (duh duh duh). The statsbettor blog for example looked at where strikers shot from on the pitch, stripping out penalties and only looking at players with more than 70 shots for the season. This is interesting to look at (some players have a far more diverse shot repertoire than others), but I think the winner in terms of how to really analyze shot conversion goes to Ted Knutson.
Are Football Tactics the Missing Link in Understanding Shot Conversion?
I’m betting most of you who don’t like analytics might be looking at the above analysis as total gobbledygook. Football isn’t a game of just racking up more shots from the right position on the pitch as if it were darts. The game is dynamic, and shot conversion is as much a function of where a striker is relative to the opposition as of where he is on the pitch. Knutson hits the nail on the head on this point:
That’s the issue with data abstraction. As a “shot,” they all get lumped in to the same areas and they look the same, despite the fact that one shot will be twice as valuable as the other, depending on where the defenders are located.
That’s also one reason why I say positioning is everything.
Knutson argues the next step in data analysis is to try and isolate positional sets. There are of course a number of ways we could do this, and Knutson makes his case for several situations in a separate post that is well worth your time. This second post is particularly interesting because it reads much more like a tactics blog than an analytics piece. His conclusion is searching but worth considering:
It’s not enough to simply create a lot of shots in the modern game. In order to win the biggest leagues and the Champions League, offensive systems now need to overcome packed, organized defences nearly every week of the season, and they need to be efficient at converting the chances they create. In order to create better offensive chances, teams need to have the ability to either attack fewer defenders, defences on the move, or develop interesting ways of moving defenders around in the penalty box until someone loses concentration for a moment, at which point they strike.
Teams have developed a variety of ways to deal with improved defense, some (like Barcelona and Manchester United) far better than others, but employing those systems may have a high cost in terms of player skill and/or personnel consistency. This also goes a little way toward explaining how some math models like TSR have difficulty analysing teams who create fewer chances, but whose chances have a higher probability of yielding goals.
Knutson is alluding to Manchester United’s weird numbers this past season, in which they had a relatively low Total Shots Ratio (shots for / (shots for + shots against), a measure of shot dominance), which should have put them nowhere near their 89-point total. Their PDO ((sh% + sv%) * 1000), however, was very high.
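For the curious, both metrics are simple enough to sketch in a few lines. I’m assuming here that the shooting and save percentages behind PDO are counted on shots on target, and the season totals are invented for illustration; they are not United’s actual figures.

```python
def total_shots_ratio(shots_for, shots_against):
    """TSR: a team's share of all shots taken in its matches."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded")
    return shots_for / total


def pdo(goals_for, sot_for, goals_against, sot_against):
    """PDO: (shooting % + save %) * 1000, with both taken as fractions.

    Assumption: both percentages are measured on shots on target.
    """
    shooting_pct = goals_for / sot_for
    save_pct = 1 - goals_against / sot_against
    return 1000 * (shooting_pct + save_pct)


# Hypothetical season: outshot overall, but clinical at one end and stingy at the other.
tsr = total_shots_ratio(shots_for=450, shots_against=550)
team_pdo = pdo(goals_for=60, sot_for=200, goals_against=40, sot_against=200)

print(f"TSR: {tsr:.2f}  PDO: {team_pdo:.0f}")
```

Across a whole league the shooting and save percentages mirror each other, so the average PDO sits at 1000. A team well above that while posting a TSR under 0.50 is the kind of profile being described here.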
Which brings us back to Grayson. You might remember he had his own explanation for this discrepancy this season. He wrote:
So why isn’t TSR capturing [United's] deviation? Well it certainly doesn’t encapsulate a teams playing style – I’ve noticed for a while that those teams with a reputation for free-flowing football under-perform compared to shots metrics, whereas others, such as Stoke, repeatedly outperform what is expected from them.
An initial conclusion that lines up well with Knutson’s “position” thesis.
Grayson goes on to theorize that gamestates might have further skewed United’s TSR, particularly as teams that are ahead by a goal take fewer shots (United, I believe, were far and away ahead last season in time spent holding a 1-0 lead). Ben Pugsley, though, looked at these numbers in April and noted that United had a simply extraordinary shot conversion rate and save percentage in a tied gamestate. His off-the-cuff theory?
In short it’s all about the scoring and save%’s for man United. We know these are luck driven, but there may well be a shot location/tactical element to consider on these percentages over a single season, although we need to see more info to be absolutely sure on that.
This would underline Knutson’s point about the importance of positioning. Perhaps this kind of shot analysis could be the bridge between analytics and tactics, the first major link between coaching preferences and shot efficiency. Maybe the better thing to do would be to avoid the question of “good finishers,” and look instead at the more interesting question of “good football tactics.”