## Winning the War Against Batting Average

You’ll have to excuse the inflammatory title to this piece. There is no war on batting average and it isn’t an inherently evil way to discuss baseball. But its time — as a measure of a baseball player’s skill and/or contribution to his team’s chances — is over. Like compact discs and chastity vows, it doesn’t tell us anything we didn’t already know and doesn’t hold up to even the slightest scrutiny.

Just last week Parkes posted about questions he receives as the big boss of a baseball blog. Batting average was a main topic of discussion there, as it was in yesterday’s post on Manny Ramirez possibly joining the Blue Jays. “We need a .300 hitter!” is a common refrain for fans still making their way around the advanced stats learning curve.

That’s okay, we’re here to discuss and help everyone come away with a better understanding of the way our favorite game works. Allow me to gently take your hand and lead you away from the tyranny of batting average.

Let me reiterate what I said above: batting average isn’t inherently evil. When Stevie from Oshawa calls in to the local radio station decrying a lack of .300 hitters, it isn’t because hitting .300 is the ultimate measure of a player’s worth; it is because guys who hit .300 are often good baseball players. That is what every team wants, more good baseball players. But a .300 average only goes so far.

I don’t want to get too nerdy here, but permit me a moment to use a little intuition and some pretty simple math to explain. What are we trying to capture or explain when we quote a player’s batting average? Well, what are the only two things that really matter in baseball? Runs and outs.

That’s it. Runs and outs. Keep scoring runs until all your outs are gone. So if we’re trying to distill a player’s worth down to a percentage, let’s use the number that best relates to the creation of runs. Let’s look at the relationship between these numbers and runs scored.

Take the traditional triple slash line (batting average/on base/slugging) of all 30 major league baseball teams and plot it against their runs scored. Then — with the help of our friendly neighbourhood spreadsheet program — examine the relationship between the two. Which number has the strongest relationship with runs scored? Which teams score more runs: those with high batting average, high on base, high slugging, or high OPS (on base plus slugging)?
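If you would rather try this outside a spreadsheet, here is a rough sketch of the same idea in Python. It measures how tightly each rate hugs a straight trendline against runs scored, a number usually called R-squared, where closer to 1 means a tighter fit. The function name `r_squared` and the `made_up_teams` list are my own inventions for illustration, and the five rows are placeholder numbers, not real team stats. Swap in actual team totals to repeat the exercise.

```python
from statistics import mean


def r_squared(xs, ys):
    """R-squared of a least-squares straight-line fit of ys against xs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a column with no variation has no trendline to measure")
    # For a single predictor, R-squared is the squared correlation coefficient.
    return (sxy * sxy) / (sxx * syy)


# Placeholder rows for five imaginary teams: (AVG, OBP, SLG, runs).
# These numbers are made up for illustration and are NOT real MLB data.
made_up_teams = [
    (0.250, 0.310, 0.380, 620),
    (0.274, 0.320, 0.400, 680),
    (0.248, 0.335, 0.450, 760),
    (0.260, 0.340, 0.420, 730),
    (0.268, 0.325, 0.390, 665),
]

runs = [team[3] for team in made_up_teams]
columns = {
    "AVG": [team[0] for team in made_up_teams],
    "OBP": [team[1] for team in made_up_teams],
    "SLG": [team[2] for team in made_up_teams],
    "OPS": [team[1] + team[2] for team in made_up_teams],  # on base plus slugging
}

# One R-squared per rate, each measured against the same runs column.
results = {name: r_squared(values, runs) for name, values in columns.items()}
for name, value in results.items():
    print(f"{name}: R-squared = {value:.3f}")
```

With only five invented rows the printed values mean nothing on their own. The point is the shape of the comparison: one column of runs, four candidate rates, and one number per rate telling you how well a straight line describes the relationship.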

Again, I took every team and graphed their batting average, on base percentage, slugging, and OPS against their total runs scored. Not a perfect measurement but a good starting point. You can see the lowly Mariners and their off-the-charts bad offensive season at the bottom of each. Teams like the Royals stick out due to their high batting average but very low runs scored.

The Blue Jays, conversely, had extremely low on base and batting average numbers but their explosive offense (high slugging percentage) propped up their runs scored. The Jays are an interesting study in the necessary balance between on base and slugging. The homer-happy Jays led baseball in home runs by a long distance, but they still scored fewer runs than the year before.

The “R-squared” number you see (beside the trendline of each data group) indicates the strength of the relationship. The closer to 1 the better. Remember, just because a relationship exists doesn’t mean one causes the other. But like shots of Jägermeister and fistfights, you can safely assume one has a lot to do with the other. Let’s put that another way.

This shows just how strong a relationship exists between OPS and scoring runs. If we follow that line of thinking, OPS is a better indicator of skill and value than batting average could ever hope to be. Is it perfect? Of course not.

Other, more advanced numbers like wOBA (weighted on base average) have an even stronger relationship to scoring runs while incorporating speed and base running. But OPS is a good start for us here at Getting Blanked.

Use the comment section to air your beefs or pose a question to the crack staff if you’re not sure of anything you see here, or if you wish for us to get off your lawn.

1. Keep it coming dude! Hopefully the better educated the fans get, the better fans they’ll become. And you know, maybe won’t come to games and heckle Carl Crawford for no reason…

2. Hahaha. This is an awesome post. Well written, not derogatory in any way, exactly what all baseball fans need.

3. My eyes frequently glaze over when people talk about simple math, not gonna lie. But then you bring me back with things like: “But like shots of Jägermeister and fistfights, you can safely assume one has a lot to do with the other,” and it all makes sense again!

4. Drew: I hope you realize that people like you are the reason Joe Morgan got fired. How can you sleep at night?

5. Excellent post. Never apologize for being nerdy, especially if you are right.

6. I must be new to the whole stats thing because I had to google what OPS was (since this article did not define it…..). Interesting article all the same.

7. hey definately well thought out, but terribly written, bloggers would be taken more serioulsy if they could learn to write, stop using spell check and try proof reading, it gets freaking annoying trying to understand a complex issue when the language has to be re-read over and over.

9. I’m as big a fan of advanced metrics and statistical analysis as anyone, but I’m not really buying in to what you’re trying to demonstrate here.

I don’t really understand how a measure of the lack of outliers in a linear regression of OBP vs. the lack of outliers in a linear regression of AVG is showing that OBP is more important for scoring runs. It shows that there is a lot less variation in the statistical set, and the linear fit is a better model for predicting future results. (I.E, you know that if your team OBP for 2011 is .325 you can be confident you’re going to score ~700 runs, but if your team average is .250 you’re going to score somewhere between 600 and 675 runs) Showing that there is a stronger correlation isn’t demonstrating that OBP has more value in scoring runs than AVG, it’s just showing that there is less statistical variance.

The characteristic of the linear fit which demonstrates the value of one statistic compared to another is the slope of the line. The extremely high slope of the OBP fit shows that a small amount of change in OBP results in a larger change in runs than a similar amount of change in AVG or SLG.

I’d be curious to see what this analysis would look like over a much larger set of data. It seems to follow the mentality that a 1 point increase in OBP is worth 3 points of SLG.

10. Wow, Matt. Now that’s a comment.

I agree that using a linear regression doesn’t provide the depth of insight of which you speak. My goal here wasn’t to predict future results (which would be nice) but more of a primer demonstrating the relative strengths of each rate.

Showing a strong correlation (in a tiny one year sample) between OPS and runs scored might cause someone still getting a handle on this flurry of new numbers to pause and consider alternatives to tried and true batting average.

I feel as though showing the respective correlations between each of the well-known slash numbers serves as a better foundation to move into more predictive advanced numbers in the future.

Comments like yours are amazing and push us all to keep doing better. Thanks for your feedback and thanks for reading.

11. Apologies. I got a little bit ahead of myself with where you were going with the post.

Indeed OPS is a more accurate way of comparing team runs, and by extension, a more accurate way of comparing a player’s run contribution to other players.

I was getting ahead of myself with a comparison of the value of the different statistics on a point by point basis.

12. No need to apologize, your position is important and valued. Keep it up!