## Today in meaningless graphs

The following graph, depicting in which periods of the game certain teams scored in the Premier League so far this season, is currently trending first on Reddit/r/soccer:

Here, ladies and gentlemen, is another example of something that looks like it’s telling us something when it isn’t. At all.

First, goals, as we’ve discussed in previous posts on this subject, are unevenly distributed over a number of games. In other words, they’re comparatively rare events in football. And we’ve only had 11 games this season.

The graph seems to presume that some teams have “trouble scoring” at certain moments of the game. But the sample, given how few goals are scored relative to shots and shots on target, is far, far too small to support any general statement about which periods of the game teams are scoring in.
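To put a rough number on the sample-size problem, here is a minimal Python sketch, not a model of any real team. It assumes a hypothetical side averaging 1.4 goals a game (an illustrative figure, not taken from the graph), ignores stoppage time, and makes every minute equally likely to produce a goal, so by construction there is no timing effect at all. The names `simulate_team` and `share_with_empty_bin` are made up for this illustration.

```python
import random

GAMES = 11            # matches played so far, as in the post
BINS = 6              # six 15-minute periods per match
GOALS_PER_GAME = 1.4  # ASSUMED average for one team; illustrative only


def simulate_team(rng, games=GAMES, goals_per_game=GOALS_PER_GAME):
    """Return goals per 15-minute bin for a team with NO real timing effect."""
    p_goal = goals_per_game / 90  # chance of a goal in any single minute
    counts = [0] * BINS
    for _ in range(games):
        for minute in range(90):
            if rng.random() < p_goal:
                counts[minute // 15] += 1
    return counts


def share_with_empty_bin(n_teams=5_000, seed=0):
    """Fraction of simulated teams with at least one goalless 15-minute bin."""
    rng = random.Random(seed)
    empty = sum(1 for _ in range(n_teams) if 0 in simulate_team(rng))
    return empty / n_teams


if __name__ == "__main__":
    rng = random.Random(1)
    print("one simulated team:", simulate_team(rng))
    print("share with a goalless period:", share_with_empty_bin())
```

Under those assumptions, the chance that a team shows at least one completely goalless fifteen-minute period after 11 games works out to roughly 37%. In other words, more than a third of teams with no timing weakness whatsoever would still appear to have “trouble scoring” in some period.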

Even if the sample were adequately expanded (and it would have to be big, following the law of large numbers), it would also have to correct for in-game situations in order to yield any interesting information. For example, a team may not have scored in the latter half of a game because it’s protecting a lead and has changed to a more defensive formation.

Trying to point out goal deficiencies in fifteen-minute intervals in an eleven-game season is only a little less stupid than wondering why there weren’t more plane crashes on August 24th at 10:30 in the morning this year.

1. Only 3 types of lies: lies, damn lies, and statistics. Always going to be true. And a thumbs up for the Picard meme.

2. Well, dude. Small sample size means it’s not repeatable or predictive. This graph is still a perfectly fine way of evaluating previous performance, which I think is how most people looked at it.

Quick plug of my new soccer analytics website, http://www.disciplesofbeane.com, if I may.

• Except in this case it would be far, far better to glean information on this subject by analyzing several individual games with similar trends based on tactics, player mistakes, substitutions etc. This chart by itself doesn’t yield anything of value.

• Yeah, there’s definitely something of value to be analyzed in the success of teams later in games and after their substitutions which does have real correlation with a team’s depth.

This graph doesn’t really show that. You’d need to display it relative to a team’s general performance, and then you’d get half the teams performing relatively better post-substitutions and half performing relatively worse. I don’t know if I’d be confident with even a full season’s data for goals, though; I’d probably refer to Corsi (shots directed at net).

3. Plus, the 30-45 and 75-90 categories presumably include stoppage time at the end of each half. So – leaving aside the small sample size – it’s not even comparing data across equal time intervals.

4. Goals scored is nice… but goals conceded, in my opinion, tells us more about teams than when the goals are scored.