I mentioned at the start of these columns that I know next to nothing about even basic statistical science. But if I, or indeed any of us, are going to help develop useful metrics for understanding best practices in football, that might have to change.
The reason is that one of the major blind spots among amateur soccer analytics writers is statistical power, which depends heavily on sample size (along with the size of the effect being measured and the significance threshold chosen). The name of the game is minimizing the risk of Type I and Type II errors: wrongly rejecting a null hypothesis (no statistical relationship between factor X and result Y) that is actually true, or failing to reject one that is actually false.
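To make the idea concrete, here is a minimal Python sketch of how power grows with sample size when comparing two rates, such as two players' pass completion. It uses a textbook normal approximation for a two-sided test of two proportions. The function name and every number in it are hypothetical and chosen purely for illustration. Power is the chance of detecting a real difference, so one minus power is the risk of a Type II error (missing a difference that exists), while alpha caps the risk of a Type I error (seeing a difference that is not there).

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the normal approximation with an unpooled standard error, so it
    is only rough for very small samples. All inputs are hypothetical.
    """
    std_normal = NormalDist()
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # cutoff that fixes Type I risk
    shift = abs(p1 - p2) / se                   # true gap, in standard errors
    # Probability the test statistic lands beyond either critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical true rates of 70% and 80%, at growing sample sizes.
    for n in (30, 100, 300, 1000):
        power = two_proportion_power(0.70, 0.80, n, n)
        print(f"n={n:4d} per player: power ~ {power:.2f}, "
              f"Type II risk ~ {1 - power:.2f}")
```

Under these assumed rates, a few dozen observations per player leave the test far more likely to miss the gap than to find it, which is the practical meaning of low statistical power.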
Now, as of writing, I can only use common sense in determining which conclusions can safely be drawn within a particular sample size and which conclusions cannot, and there will come a time when that alone won’t be good enough for me or anybody purporting to be an amateur soccer stats person. But it’s clear some work being done under the guise of meaningful analysis of soccer statistics is woefully inaccurate.
The amateur soccer analytics community is still fairly small, so I don’t want to ruffle too many feathers. Here is one example from the EPL Index. It contains a sampling error so basic, it’s a wonder this was ever published:
Using EPL Index Opta Stats we can analyse Cole and Carroll’s performances this season. We’ll start by looking at how many games they’ve played and how much pitch time they’ve accumulated.
Cole’s extra minutes are because Carroll joined West Ham late and suffered a hamstring injury.
From this data, the author goes on to compare pass completion statistics, aerials won, and possession statistics in order to draw broad conclusions about the value of the two players to Sam Allardyce’s West Ham.
Now, the author’s conclusions about Carroll’s skill in comparison to that of Carlton Cole are probably correct, but these statistics don’t qualify as evidence. You simply cannot take an entirely unrepresentative sample (TWO starts for Carroll!) to make extraordinary claims about a player’s comparative value. The playing time here is not large enough to control for position, type of opposition, Allardyce’s tactical instructions on the day, or whether Carroll’s and Cole’s roles remained static, were interchangeable, or differed. Moreover, the lopsided sample size for Cole compared to Carroll makes the conclusions even less reliable, regardless of the “per minute” adjustments, which involve a total playing time so small as to be practically meaningless.
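One way to see why a handful of appearances cannot support this kind of comparison is to put a confidence interval around a rate measured on few attempts. The sketch below uses the Wilson score interval, a standard method for proportions. The pass counts are invented for illustration and are not taken from the EPL Index piece or from Opta, and the function name is my own.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, attempts, confidence=0.95):
    """Wilson score interval for a proportion such as pass completion."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / attempts
    denom = 1 + z * z / attempts
    centre = (p_hat + z * z / (2 * attempts)) / denom
    half_width = z * sqrt(
        p_hat * (1 - p_hat) / attempts + z * z / (4 * attempts ** 2)
    ) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical counts: the same 70% completion rate on 40 and 400 passes.
    for completed, attempted in ((28, 40), (280, 400)):
        low, high = wilson_interval(completed, attempted)
        print(f"{completed}/{attempted} passes: "
              f"plausible true rate {low:.1%} to {high:.1%}")
```

With the smaller count, the range of plausible true rates is wide enough that two quite different players could easily produce the same figure, and this still ignores opposition, role, and tactics.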
This article isn’t an outlier. Less egregious forms of these errors seem to occur in even the most cursory analytics articles, or tactics pieces that attempt to use measurements with no statistical power to draw conclusions.
One of the more common defenses for this kind of abuse is the ‘author caveat,’ in which a writer will defend the general lack of statistical power with a warning—e.g. “We should be careful not to read too much into this apparent correlation”—before they dive headlong in and do just that. It’s hard to be charitable to this approach, particularly when the “new information” in the piece is drawn from a sample group so paltry it tells us absolutely nothing.
Not all analytics writers are guilty of this, and many have long corrected for reversion to the mean to determine just what team behaviours tend to produce more goals over the long-term. Nor should tactics writers be afraid to look at in-game data to make observations about the tactical approach of a given team, particularly if they’re backed up by the manager himself (that said, there is still a danger of the post hoc fallacy rearing its head from limited data).
Of course, this is sports, and moreover, it’s a team invasion sport. It’s all impossibly vague, so why not just use the data to tell us what we already know? This approach, however, simply plays into the hands of those who would sneer at analytics as an odious bit of Americanism, rather than the future of football.
For dumb ass journalists who don’t want to be caught out on this sort of thing, a resource.