The perception of soccer analytics in the public imagination began to crystallize (in part; these things can never be pinned down to a single event) with the publication of Simon Kuper and Stefan Szymanski’s popular book, Why England Lose, released in North America as Soccernomics in 2009. The book purported to do for soccer what Michael Lewis’ book Moneyball did for baseball: look at ways statistical analysis can reveal hitherto unknown facts about football that can be used either to purchase players below the market rate or to improve team and player performance in ways conventional coaching could not.

For a while, Kuper’s book was a kind of interesting literary gewgaw, more Freakonomics than Moneyball, despite some fairly common-sense advice on everything from player transfers to the way European football clubs operate. Kuper then profiled Liverpool FC’s director of football Damien Comolli for an article in the Financial Post. Over time, helped by the renewed popularity of sabermetrics following the release of the film version of Lewis’ book, many came to regard Comolli, the apparent Liverpool disciple of Oakland A’s GM and sabermetrics true believer Billy Beane, as some sort of metrics guru.

After Comolli left the club by mutual consent earlier this year, many in the media declared Moneyball in soccer “dead” without having read a single paper on the subject, analysed a single graph, or spoken with a single employee of Prozone, StatDNA, or Opta. Because they believed Comolli used some sort of statistical metric to purchase Andy Carroll, they concluded that “soccernomics” didn’t work.

Often the debate over the use of analytics in soccer goes like this. Someone parps up about using numbers and stuff to find good players below market rate. Someone else then says that soccer is a free-flowing team game where a single mistake can cost an entire game, unlike baseball, in which players are essentially walking on-base percentages. Ergo, no amount of analysis will ever yield any useful information about the game.

The reality is a lot more boring. The question of soccer’s lack of “discrete events” to analyse, like those in baseball (pitching, batting), was addressed by Leeds University professor Dr Bill Gerrard in an influential paper for the International Journal of Sport Finance titled “Is the Moneyball Approach Transferable to Complex Invasion Team Sports?” His thinking broadly sums up the conventional view of the potential of soccer analytics:

Replicating Moneyball and Scully’s pay-and-performance analysis in invasion team sports such as the various codes of football, field and ice hockey, and basketball is much more problematic. Invasion team sports are much more complex and hence the separability of individual player contributions is considerably more difficult.

Difficult, yes, and largely prohibitive to Bill James-esque casual fan analysis, but not impossible for an interested company with reliable technology, useful metrics, and the financial resources to secure both.

Statistical and positional analysis in football has been around for a while now. Prozone made its debut in the UK in 1998 and has since found more than a few champions within football, including the much-maligned manager Kevin Keegan. In the years that followed, companies like Opta, Infostrada, and StatDNA emerged, gathering and analysing player statistics to sell on to clubs and, in a more rudimentary form, to the general public via third-party applications like FourFourTwo’s Opta app and whoscored.com.

Jen Chang’s article today on Everton’s use of advanced data gives a good picture of how far the game has come:

As teams throughout the league become increasingly progressive in their use of match data, the level of detail available to teams to assist in their pre and postmatch analysis and preparation has become ever more complex. Stat providers such as Prozone and StatDNA supply data in various forms to an estimated 15 of the 20 Premier League clubs.

These companies, having paid a pretty penny for the information-gathering techniques and analysis they sell on to elite football clubs, are understandably secretive about counterintuitive statistical correlations. Despite this, a small cottage industry of blogs like A Beautiful Numbers Game and Soccer by the Numbers has quietly developed its own metrics based on the available data. So while we don’t know as much as we’d know if we paid a gajillion dollars to big companies for some sexy metrics, there is still a lot of information out there.

I am not a statistician. I dropped out of high school calculus. I was a professional singer before I started writing about football. So it’s of no interest to me or use to you to post various mathematical models or acronym-laden charts. Part of the aim of this appointment post, therefore, is to break down various soccer metrics for a broader public and to help dispel some of the myths out there about hardcore analytics. I’ll be talking to experts, looking at blog posts, and reading boring PDFs, all in an attempt to popularize what is a misunderstood but very modern aspect of the game. Stay tuned…

Comments (2)

  1. First and foremost, I believe that statistics about soccer are fascinating. I love the game and geek out frequently with analytics of all sorts. I believe that the use of statistics in soccer is very important on a few fronts.

    At the elite level (EPL, La Liga, Serie A, etc.) there isn’t much that separates the champion from a mid-table team. Collecting data (as chronicled in Soccernomics) can provide an advantage (the size of which can be debated). Even if you assume a small advantage, at that level, it may mean the difference between playing in the Champions League or having to find a new manager.

    When dealing with the public, it gets a little more complicated. Soccer is a storied sport that spans every ethnic, socioeconomic and generational grouping.

    Here in the US, the “millennial” generation outnumbers the baby boomers, which is very significant. The “millennial” generation has a very technological way of life. They want info here, they want info now. They want to see things in a different way than they are used to and they want it in 160-character bits. Statistics are very useful in this way. The “millennial” generation is also very visual, which lends itself to infographics and the like.

    This isn’t to say that we should tailor an approach/product for the new generation, but we must recognize that they will be the new consumer of the game.

    We must also carry the previous generations and those that may not care about such things as how fast Messi runs or how many tackles John Terry made last weekend. This is the population you risk alienating if you fill broadcasts with too many statistics.

    Anyway, just a few thoughts. Love the topic, though.

  2. Football as an “invasion team sport” is still frustrating for fans who are interested in the stats side of things but aren’t statisticians.

    That companies (understandably) protect their data means it’s hard for the layperson to get involved, in turn making it harder for the early adopters to get the masses interested.

    If there’s no Bill James-esque reporting, why would the general football fan care what goes on behind the scenes in the training ground offices?
