Archive for the ‘Soccer Analytics’ Category

Views Of German Sports Betting Offices

The season is over, and some of the model assessments among the soccer analytics community are starting to trickle in. First out of the gate is Martin Eastwood, whose eponymous index was included in Counter Attack’s Friday Football Predictions model comparison. It turns out the Index did rather well:

Looking at Figure 1 you can see that the Eastwood Index has consistently outperformed the bookmakers all season – and this isn’t just one bookmaker that the Eastwood Index has beaten but the combined knowledge of the industry as I’ve aggregated multiple bookmakers’ odds together and stripped out the overround to make the comparison as tough as possible.

Interestingly, the difference in accuracy seems to be greatest at both ends of the season. I expected the start of the season to be difficult to forecast as new teams have been promoted, players have been bought and sold, and managers may have changed clubs but the Eastwood Index seems to have coped with these variables better than the bookmakers’ odds have.
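
For readers who haven't met the term, "stripping out the overround" just means removing the bookmaker's built-in margin so the implied probabilities sum to one. Here is a minimal sketch of one common way to do it (proportional normalisation, then a plain average across bookmakers). Eastwood doesn't spell out his exact method in the passage above, so treat the approach, the function names and the odds below as assumptions for illustration only.

```python
def implied_probabilities(decimal_odds):
    """Turn decimal odds (home, draw, away) into probabilities that sum to 1.

    Assumption: the margin is removed proportionally, which is only one of
    several ways to strip an overround.
    """
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)  # greater than 1 when the bookmaker has a margin
    return [p / overround for p in raw]


def aggregate_bookmakers(odds_by_bookmaker):
    """Average the margin-free probabilities across several bookmakers."""
    stripped = [implied_probabilities(o) for o in odds_by_bookmaker]
    n = len(stripped)
    return [sum(column) / n for column in zip(*stripped)]


# Hypothetical odds for one match from two bookmakers: (home, draw, away)
odds = [(2.10, 3.40, 3.60), (2.05, 3.50, 3.70)]
consensus = aggregate_bookmakers(odds)
print([round(p, 3) for p in consensus])
```

The consensus probabilities are what a model would then be scored against, match by match.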

While Eastwood published his promising results today, the post follows on nicely from a Twitter discussion I had earlier in the weekend. One prominent analytics blogger was calling out a popular stats site for posting game recommendations from professional “tipsters,” those guys who have a lock on this, that, or the next thing. All he needed to do was post up their success percentage. A bitchy thing to do perhaps, but necessary.

I don’t think it needs too much stressing, but if you’re hoping to make money on the subjective recommendations of a guy on the Internet relying on nothing but his heuristic hunch and his self-proclaimed expert status, you’re going to have a bad time. But this got me thinking about ways to address sceptical attitudes about the predictive efficacy of analytics, and about the use and application of football analytics in general.

Let’s imagine for a moment a weird and implausible dystopian future in which governments forced citizens to bet all their earnings on football matches. Chances are, a lot more people would have a vested interest in football, and Bill Shankly’s claim that football is more than life and death would take on a whole new meaning.

Within this world, experts would come out of the woodwork with advice on how to make your weekly, government-enforced football wagers. Some opportunist characters might claim you could make a quick buck by betting against the odds on one or two matches based on their advice. Others might go on the basis of pundits and opinion-shapers in the big papers. Still others would argue for a safe investment strategy that doesn’t stray far from the official betting lines. And then you’d get people like Eastwood publishing predictions based on an internal, empirically based betting model.

Remember: these bets are enforced. You don’t have the choice here not to play. Your entire livelihood depends on your ability to make safe wagers, and if you’re lucky, you can earn a little when the games break against the odds. Which approach do you choose?

Well, the smart investor would likely choose a proven, ever-improving predictive model provided by one of the analysts, right? Particularly as the tipsters’ success rates are all noise and no signal. While this should be obvious, even in our normal, everyday world, many investors make decisions based on gut feelings, some of which turn out well, and some don’t. This isn’t to completely dismiss subjective impressions in making investment choices, but to show that they cannot compete with a solid statistical model built for long-term success.

Perhaps over time, those model forecasters would show the most consistently stable return on investment. If you were a football betting advisor, you know which portfolio you’d choose. You probably wouldn’t look too much into the model’s methods, and you’d stop watching football, confident it will all balance out in the long run.

This should be self-evident, and yet time and time again many in football choose to trust the subjective impression of journalists or TV pundits over the boring, hard evidence provided by good statistical analysis. Remember: in betting analytics, statistics are not used to prescribe how football should be played. Rather, for the most part they are concerned with football as it is already played. No one is currently, or at least credibly, calling for wholesale changes to how the game is played based on this kind of analysis, or at least not yet.

As long as football is won or lost on tactical decisions and good players with luck thrown in to make things interesting, there will always be a place for storytelling provided by chumps like yours truly. The problem is when chumps like me start sniffing their own farts and trusting their subjective impressions over the record of evidence. If your livelihood depended on it though, the choice should be obvious.


The subtitle of Chris Anderson and David Sally’s new book The Numbers Game, following the inevitable colon that must be attached to all non-fiction titles, is Everything You Know About Football Is Wrong. This is obviously a selling tactic.

“Oi, Maggie, this book says everything I know about football is wrong.”

“Does it Frank?”

“Yeah, it does. It can’t be anything more than a game of 11-a-side trying to put a ball in a net, could it?”

“Well, buy it and find out. We have a gift certificate.”

But I kind of wish they’d gone with Everything You Know About Football is Right.

Let me explain.

Yesterday I watched a Tweeter react to the initial excerpt release of the book in the Times. His basic point, made over several 140-character posts, was that nothing in the excerpt was particularly revelatory. Of course teams that score a lot of goals and don’t concede a lot of goals do well. Of course possession is important, and of course it matters that teams don’t turn over the ball too much. Sure, it’s kind of interesting that corners are by and large a waste of time as a set-piece, but the banal truths on display here are comprehensive proof that advanced statistics is a waste of time (this, by the way, is the corollary to the argument from the alternate approach to football stats, which holds that the game is far too complex to analyze and that any attempt to learn from it via statistical analysis is therefore also, surprise, a waste of time).

Perhaps part of the problem was the sense of expectation foisted on readers by the copy-editor. The introduction boldly states that the book presents “a sea change not just in what we think we know about the game, but — as shown here, in this exclusive first extract from the book — how we think we should play it.”

No doubt there will be some eyebrow-raising statistics in here, and some major challenges to our perception of the game. But the key phrase in this excerpt is this:

“We are not concerned with theory. We are concerned with facts.”

In university I was fascinated by the writings of medieval Catholic philosopher Thomas Aquinas, more for the form than the content. His Summa Theologica was written in a set, repeated format. He poses a question, presents counter-arguments, then makes his argument and addresses the objections in turn. Once he’s done, he moves on as if the argument is settled, and builds upon its implications, constructing his castle in the sky brick by brick.

Of course it’s a massively flawed approach that flatters Aquinas’ faculty of reason to the point of absurdity. That’s because when it comes to matters of fact, naked reason is no substitute for empiricism. And that’s where Aquinas’ approach is instructive.

Despite the old joke about lies, damned lies and statistics, the numbers themselves presented by Sally and Anderson, and others doing interesting work in the football analytics field at the moment, simply are what they are. They are not a journalistic cliche, they’re of no bias or clique, they don’t have an agenda. How they’re interpreted can sometimes be a subject of debate, and that’s certainly where the fun lies. But in and of themselves, if the method is sound, the numbers are as close to the fact of the matter in football as one can get. At the moment, they point to some broad truths that seem so obvious in retrospect that one wonders why anyone went to the trouble of finding them.


A couple of quickies this week:

Judging the Value of Goals

To borrow a canned expression from my old music teacher, my preferred approach to analytics involves the KISS method: Keep It Simple Stupid.

This isn’t to ward off trying something ambitious with complex data sets and counter-intuitive key performance indicators and whatnot, but more to encourage those outside the proprietary data wall to just sit and stare at what’s right in front of their noses.

I thought of this when I was directed to 11tegen11’s article on De Zestien this last week on a means to rate a league’s “most valuable scorer.” The article is here, but author Sander IJtsma has been using this approach for a while.

The explanation of his method can be found on his old blog, but it runs on the idea that not all goals are created equal. Obviously a goal scored in extra-time against a superior opponent away is going to be more valuable than the sixth goal in a 6-0 rout against minnows at home. IJtsma however figured out a lovely, simple way of expressing that value, based on match outcome probabilities provided by bookmakers, who adjust their odds mid-match depending on the scoreline:

At any given moment during the match, an expected value can be computed for the amount of points any team wins from the match. Simply multiply the chance of a win by 3 and the chance of a draw by 1. Should a team at any point during the match have a 30% (or 0.3) chance of winning and a 35% (or 0.35) chance of drawing the match, the expected value for the amount of points won from that match would be 3 * 0.3 + 1 * 0.35 = 1.25.

The value of scoring a goal at that point in the match can simply be computed by taking the difference between the expected value for the amount of points won from the match just before the goal and immediately after the goal.
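
The arithmetic in the quoted passage is simple enough to sketch in a few lines. The function names and the "after the goal" probabilities below are hypothetical, invented for illustration; only the 30%/35% example comes from IJtsma's explanation, and his real inputs are the bookmakers' in-play odds.

```python
def expected_points(p_win, p_draw):
    """Expected league points: 3 for a win, 1 for a draw, 0 for a loss."""
    return 3 * p_win + 1 * p_draw


def goal_value(before, after):
    """Value of a goal for the scoring team.

    `before` and `after` are (p_win, p_draw) tuples taken just before and
    immediately after the goal.
    """
    return expected_points(*after) - expected_points(*before)


# The worked example from the quoted text: 30% win, 35% draw
print(round(expected_points(0.30, 0.35), 2))  # 1.25

# Hypothetical probabilities just before and just after a goal
value = goal_value(before=(0.30, 0.35), after=(0.62, 0.25))
print(round(value, 2))
```

Summing these values over every goal a player scores in a season gives the kind of ranking described here.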

Based on this calculation, IJtsma was able to compile a list of the Eredivisie’s MVPs for the season (Jozy Altidore ranks 4th). Now some immediate things jump out here. For one, goals are fairly randomly distributed (Poisson curve), so it’s not clear whether players exercise much control over this value, i.e. when they score, against which type of opponents, and in which particular game state (0, -1, +1, etcetera). It shouldn’t be too difficult though to look into whether certain players have scored high on this list consistently, without quick mean regression.

Then there’s the element of chance creation as well; no player scores in isolation, and the value of a particular player to score in a valuable situation in terms of expected points could be a function of team as much as player.

Still, for what it is, it’s pretty interesting. And, as with all good things in the analytics field, it opens itself up to more scrutiny, which could uncover some pretty cool things eventually.

Experience

Last night during the Sunderland-Stoke match, ESPN colour guy and former Liverpool player Steven McManaman started banging on about how players needed experience in a relegation battle to better succeed in a relegation battle.

I’m a little tired of this “experience” thing being bandied about with abandon from pundits and fans alike as if it’s a necessity, like skill. For one, it’s hard to hate on Alan Hansen for telling the world “You can’t win anything with kids,” and then in the next breath talk about the importance of “experience” at the end of the season, or in a cup final.

This isn’t to discount the value of experience per se; it’s surely a good thing to have a better idea of what to expect in a high pressure situation, calmed nerves, insight into how to wind down a game or defend a 1-0 lead etc. But this presumes that all players learn from their experiences equally. Moreover, it also presumes that the value of things like better confidence and the ability to better handle stress in a high pressure situation makes a meaningful difference in a game determined by luck, differences in talent, tactical preparation, etcetera.

This is one of those problems that doesn’t necessarily have a simple solution, either. You could look at a team’s progression in the Champions League based on the number of players with experience in the competition, but then this could just as easily be explained by their innate talent. As in good players tend to have Champions League experience because they’re good players.

There are exceptions of course; Dortmund have a whole slate of players in the big show for the first time. The annoying thing, though, is that should Dortmund lose against Bayern, lack of experience may be cited as a “factor.” If they win, it could be “despite their inexperience.” Which is essentially the same thing as saying, “Lack of experience can get a team to the CL final, just not allow them to win it.” This isn’t to say experience can’t influence how a team performs, but it’s very difficult to isolate from other factors. In fact, I think it’s pointless to make the attempt.

That’s because “experience” as a means to explain a particular result is a self-fulfilling theory that cannot be verified. That doesn’t mean experience isn’t important, but it does mean that you can’t say beyond a doubt it was the determining factor in the outcome of a match. But this is exactly what McManaman is implying when he says that a team “needs” experience in a relegation battle to perform better in that situation. Unless you can definitively show that when two teams of equal talent play each other the more experienced team will win at a statistically meaningful rate, you’re not saying anything at all.


Football analytics means different things to different people. This is particularly true within the field itself. For some, analytics is the search for ways to improve team performance. For others, it’s a means to scout better players below market rate. For still others (like myself, more and more), it’s a means to make better match predictions.

In practice, compared to other sports it’s not a particularly exciting field. There are no football sabermetricians (sefermatricians?) waiting in the wings with revolutionary theories that will upend the sport as we know it…yet. There are however some really cool things being done at the basic level that tell us a bit about things like scoring effects, or how to distinguish luck from talent in team performance. Slowly, over time, these broad metrics are being improved via an informal process of online peer review. In my experience as an outsider looking in, the most convincing analysts have incredible bullshit detectors, and are far more aware of the limits of analytics than their critics in the wider world of football.

It’s this latter group I’d like to write about today, however—analytics’ critics (kind of rhymes). We read about them all the time. Often they’re in the media, or higher up the food chain within football clubs. They either write dismissively about analytics and get it “totally wrong” (usually by writing about Moneyball in some way shape or form), or they rail about how they refuse to use ProZone to coach because computers “can’t tell you anything meaningful about football.”

I can already tell you’re rolling your eyes—not another goddamn inside baseball, masturbatory meta post on soccer statistics. Either show us a graph or GTFO. I know, right?

Except addressing analytics’ doubters seems to be the major preoccupation among many leading analysts right now. It was the central theme of the soccer analytics panel at Sloan, and has been identified as the premier challenge for most would-be club analysts: how to convince coaches and players that analytics can help improve performance?


SPOILER: I don’t know. If you could find out and tell me, that would be great.

One of my favourite things about covering a wide swath of independent analytics sites is the moments when they coalesce around a single topic of interest. My little niche (at least in terms of what I find interesting and what I understand) has been the correlation between a high total shots ratio (TSR = ShotsFor / (ShotsFor + ShotsAgainst)) and table finish. Remember, the correlation isn’t with goals scored or individual games won, just total points.

James Grayson has already noted a positive relationship between the two (R^2 = 0.66), which is what makes Manchester United’s season so weird. With a TSR of .539 at this point in the season, Grayson writes, United should be “a 5,000 to 1 long-shot” to score more than 80 points. And yet here we are, with Sir Alex Ferguson already targeting a 96 point finish. So what gives?

Grayson points to game states, an intriguing concept again imported from ice hockey, that tempers both TSR and PDO. PDO, you’ll recall, is simply shot% + save%; because both regress heavily to the mean, PDO is considered primarily an indication of luck. So, in theory, a team with a high PDO (luck) and low TSR (skill) will eventually fall back down to earth, while a team with a low PDO but a decent TSR should in theory return at some point to the median. Man United are in the former camp, and yet they haven’t dropped a single bit. Could the answer be hidden in the game state?
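
Both metrics are easy to compute from season totals, so here is a rough sketch. The function names and every number in it are made up for illustration (they are not United's actual figures), and I'm assuming shot% and save% are measured on shots on target, which is one common convention rather than something the definitions above pin down.

```python
def total_shots_ratio(shots_for, shots_against):
    """TSR = ShotsFor / (ShotsFor + ShotsAgainst)."""
    return shots_for / (shots_for + shots_against)


def pdo(goals_for, sot_for, goals_against, sot_against):
    """PDO = shot% + save%.

    Assumption: both percentages use shots on target (sot) as the base.
    """
    shooting_pct = goals_for / sot_for
    save_pct = 1 - goals_against / sot_against
    return shooting_pct + save_pct


# Hypothetical season totals for one team
tsr = total_shots_ratio(shots_for=420, shots_against=360)
team_pdo = pdo(goals_for=70, sot_for=180, goals_against=35, sot_against=120)
print(round(tsr, 3), round(team_pdo, 3))
```

On this scale an average PDO sits at 1.0, so a figure well above that alongside a middling TSR is the "high luck, modest skill" profile described above.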

This isn’t soccer-related at all really, but it was the best panel by a mile at the Sloan Sports Analytics conference last month in Boston. I think there are some lessons about the potential to beat popular betting lines, as evidenced earlier today in my little predictions post.

Also, read Theodore Knutson’s blog on the subject, which has some really ace insights.

Last week, I had a minor freak out over David Conn’s Tweets which revealed that several clubs advertised full-time unpaid positions for performance analyst interns, with no travel costs covered.

Immediately after, several in-the-know analysts like Omar Chaudhuri were quick to give the news some context. This wasn’t necessarily a case of clubs trying to get analytics on the cheap; all Premier League clubs have full-time analysts (although some strictly in video analysis, which isn’t strictly the same thing as analytics). Chaudhuri proffered this background article on ProZone to give the reader a good idea of the current culture of club analysts in football.

It provides context, but also reveals some fundamental flaws in the current approach to sports analytics in English football. It seems clubs believe the supply of potential sports analytics staff exceeds the demand, and that they can therefore ‘test drive’ potential analysts with internships as long as a year. This market reality however does not make the practice acceptable. In any field, the prevalence of unpaid positions lasting as long as a year demeans the profession and over time dissuades potential all-star candidates from going down a soccer analyst route. Moreover, comments from Paul Brand, Head of Performance Analysis at Blackburn Rovers, indicate an almost Dickensian attitude to interns:

“You want them to be in before you’re in and you want them to leave after you leave. When we’ve got a reserve game, I need to know that I’ve got an intern who’ll be up until midnight preparing the work for the coaching staff and players to go through the following morning.”

In this kind of environment, analysts have even less reason to challenge the status quo, particularly if the Head of Performance answers to the manager or coach. Brand reiterates how “a lot of routes into Performance Analysis are opened up through the creation and maintenance of good relationships.” One would hope this doesn’t trump their ability to create effective models and pinpoint inefficiencies both in the transfer market and on the pitch, but the optics aren’t good.