I was planning to have a book review ready to go on The Numbers Game but sleep deprivation courtesy of my two year old curtailed my reading time this past week. So instead, I want to touch on two topics today.

Why Chelsea parked the bus against Hull

Some skeptics believe that data analytics can’t tell us anything about football that we don’t know already using just our eyes in concert with common sense. And to some degree, that’s true. But sometimes the best part about data analysis comes not in revealing “hidden secrets” but simply in providing an empirical framework to better understand how football teams behave in given circumstances.

Take Chelsea’s performance against Hull. Here (courtesy of the must-have Stats Zone app) are Chelsea’s passing stats in the first half compared with the second:


As a visual illustration, this obviously works rather well. Chelsea barely passed out of their own 18 yard box in the first half, but in the second they were far wider and deeper, sending long balls down the flanks with little success and passing less frequently.

Several analysts watched this second half and surmised that Hull were not the Premier League pushovers many believed them to be. Others criticized specific areas of the Chelsea set up, including the two makeshift “defensive” midfielders in Ramires and Lampard, and concluded they could not keep up the attacking pace of the first half.

Of course, others remembered this tale of two halves was a habit of the Old Mourinho, a manager sometimes content to sit on a lead and then “park the bus” rather than smash the opposition to pieces. This raises several questions, however: why sit in a defensive posture against a team like Hull, when there is an opportunity to run up the goal differential at home? Was it a fitness measure?

Ted Knutson over at StatsBomb has an answer:

The reason for shelling comes because counterattacking is one of the best ways to create better chances (link). It provides a situation where your attackers are going up against a limited number of defenders, who are also on the move. And we know from a reasonably large sample of statistics that it’s far easier to score in these situations than going up against an entrenched defense. Thus creating these situations to exploit has a huge benefit.

I would recommend clicking on the link in there. You’ll note Ted Knutson’s off-hand reference to a “large sample of statistics”—he’s referring in part to work on Game States. This refers to the behaviour, measurable over a broad sample size, of teams depending on the in-game scoreline. In short, teams which lead by a goal shoot less often but generally more accurately. Here’s an illustrative post on the subject over at 11tegen11, and another one.

Now, while shot dominance eventually evens out to its tied-state level at +2 (the game state Chelsea enjoyed in the second half against Hull), 11tegen11 discovered in a large data set collected from the Eredivisie that at +2 the shot conversion rate actually increases from the league average of 11.7% to 17.1%. That’s not a small jump.

You don’t have to be a genius to guess why that happens. The trailing team must push further forward in an attempt to get back in the game. In doing so, they leave themselves exposed at the back, with fewer defenders able to track back in time, which would increase the conversion rate of the team leading by two goals.
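
To put the two Eredivisie figures quoted above in perspective, here is a quick back-of-the-envelope sketch in Python. The function name is my own invention for illustration; the only real inputs are the 11.7% and 17.1% conversion rates cited from 11tegen11.

```python
def conversion_uplift(baseline_pct, leading_pct):
    """Return the absolute gain (percentage points) and the relative gain
    (as a fraction of the baseline) between two shot conversion rates."""
    abs_gain = leading_pct - baseline_pct
    rel_gain = abs_gain / baseline_pct
    return abs_gain, rel_gain


# League-average conversion vs. conversion at +2, as quoted in the post.
abs_gain, rel_gain = conversion_uplift(11.7, 17.1)
print(f"{abs_gain:.1f} percentage points, {rel_gain:.0%} relative increase")
# prints "5.4 percentage points, 46% relative increase"
```

In other words, taken at face value, a side two goals up would score roughly five or six more goals per 100 shots than the league average. That says nothing about how many shots they actually take in that game state.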

So now we have a pretty good picture of Chelsea’s strategy in the second half. Sure, they could have played “entertainers” like Pellegrini’s City against Newcastle on Monday and tried to rip Hull apart in the second 45 minutes. But in “shelling”, Mourinho can keep his players in shape for the midweek fixture against Aston Villa; it is much easier after all to play deep rather than play high up the pitch, press the opposition for possession, and then track back at full steam should things go wrong.

Even so, I think to the degree that Chelsea’s long passes were pretty errant in the second half, there is reason to believe that Mourinho’s counter attacking approach needs work:


But the logic of the approach, based on what we know about the benefits of playing on the counter after securing a lead, is sound.

Note that analytics wasn’t necessary in arguing Chelsea’s motivations for bunkering down. After all, any tactics person could plausibly make a similar argument without reference to the data. But the data grounds it in reality; the improvement in shot conversion when leading is no longer just supposition, or speculation, but correlative with a large sample of matches. There’s no reductionism here, no death of common sense, no ruination of football. Analytics just is what it is.

Player churn

Yesterday Gab Marcotti Tweeted out a link to a story he wrote for the Wall Street Journal. In it, Marcotti wonders whether leaving crucial signings late could negatively impact the ability of a manager to instill his vision in the team for the coming season. He writes:

Familiarity, chemistry and tactical cohesion are only some of the benefits to acquiring players early. There are others, such as allowing them and their families more time to settle in a new environment (and, often, a new country) and allowing the club to use the newly acquired star to drive ticket sales and sponsorships.

And yet, evidently, all this is valued somewhat less at certain clubs. Or, rather, the benefits of having a new signing on board early are outweighed by the premium you might have to pay to lock him up before training camp begins. Either that, or they’re just not very good at closing deals early.

Now Marcotti is really focusing on the upheaval from new stars jumping into the first team late in August having not participated in pre-season training, but I think this overlaps a bit with a brief Twitter convo I participated in on Friday over the possible effect of “player churn” on using last season’s team statistics to make predictions about the coming season. “Player churn” is a measurement of squad turnover. It’s an important concept, because many analysts use statistics from the previous season like TSR (total shots ratio, a predictive measure of the ability of a team to control the game) and PDO (sh% + sv%, a measure of luck) to predict outcomes for the following year.
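
For anyone who hasn’t run into these two numbers before, here is a minimal Python sketch of how they are usually worked out. Some loud caveats: the class and field names are my own, the season totals are made up purely for illustration, and basing sh% and sv% on shots on target is an assumption about the common football convention (PDO is also often quoted multiplied by 1000).

```python
from dataclasses import dataclass


@dataclass
class SeasonTotals:
    """Hypothetical season totals for one team (illustrative fields only)."""
    shots_for: int
    shots_against: int
    shots_on_target_for: int
    shots_on_target_against: int
    goals_for: int
    goals_against: int


def tsr(t: SeasonTotals) -> float:
    """Total shots ratio: the team's share of all shots taken in its matches."""
    return t.shots_for / (t.shots_for + t.shots_against)


def pdo(t: SeasonTotals) -> float:
    """PDO = sh% + sv%, both computed here on shots on target (an assumption)."""
    shooting_pct = t.goals_for / t.shots_on_target_for
    save_pct = 1 - t.goals_against / t.shots_on_target_against
    return shooting_pct + save_pct


# Made-up numbers, not any real club's season.
example = SeasonTotals(
    shots_for=550, shots_against=450,
    shots_on_target_for=180, shots_on_target_against=160,
    goals_for=60, goals_against=48,
)
print(f"TSR: {tsr(example):.3f}")
print(f"PDO: {pdo(example):.3f}")
```

A TSR above 0.5 means a team outshoots its opponents. A PDO well above or below 1 (or 1000 on the scaled version) is usually read as a run of good or bad luck. Both figures describe last season’s squad, which is exactly why churn matters when you carry them into a new season.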

While I think player churn has very real effects (duh: if an okay team buys a lot of really good players chances are they will improve), year-to-year squad turnover isn’t as dramatic as you might think. The nifty new Press Association infographic site Match Story ran a visual on it last week. The only team to lose a significant number of its first team regulars from last season to this was Sunderland.

In reference to Marcotti’s piece, I suspect any effect from one or two late signings would be slim to none, and if it existed, it could be easily mitigated by efforts to intelligently transition new signings into the first team. What of course is far more important is that new signings fit in with the club’s overall tactical needs.