Pasta Lunging O-Dog for a single
The value and interpretation of in-play averages (or BABIP) is one of the more contentious topics here on Getting Blanked. Dave Gershman posted a graphic that sparked a hearty debate, which continued when discussing the hot starts of Matt Garza and Brett Wallace as well as an uncharacteristically slow start from Ichiro!

The basic (yet hotly contested) idea is as follows: both hitters and pitchers are susceptible to forces beyond their control when the ball leaves the bat. Pitchers can expect balls to find gaps and holes at a somewhat steady rate (as a collective), while hitters' individual profiles and skill sets impact their personal rates much more.

In both cases an element of luck and random chance joins the mix. Hard hit balls wind up being caught, dribblers make their way through. Group a few weeks (or months) of this together and suddenly you have a career season or a year to forget.

“Famed” SABR researcher (and poet, I assume) Tom Tango wrote a post on his The Book blog recently about this very subject. He mentioned battling with Angels fans when discussing the in-play average of pitcher Jered Weaver. Weaver got off to a torrid start, helped along by a very low BABIP. Angels fans on Halos Heaven believed it would stay low; Tango stated it would come up (it did). Here is the money quote on the subject:

It’s not like we saberists just decided to believe that a low BABIP is unsustainable. We studied the issue. We relied on actual past performance. We relied on the data that the players themselves produced. All we are saying is: here, take a look at the weight of history, and before you decide to bet against what’s always happened, YOU tell us why this time, it’s “different”.

I realize some feel as though this is pseudoscience, an overcomplication created by fart-sniffing tall foreheads in love with the sound of their own fingers dancing across the calculator keys. Others might be wary, wondering how all pitchers could surrender hits at similar rates.

One key comment from a previously linked comment section echoed this concern, wondering how getting a hit off Roy Halladay could be as likely as reaching base against a Jo-Jo Reyes-style tomato can. To you I offer one man: Pedro Martinez.

At the turn of the century, Pedro Martinez turned in two of the greatest seasons in baseball history, especially when one considers the balloon-ball era in which he competed. In 1999 he was at the top of his game.

Even playing in the notorious Fenway bandbox, Pedro dominated. He struck out more than 13 batters per 9 innings, walking fewer than 2. He gave up just 9 home runs in 213 innings pitched. He amassed 13 WAR, a staggering number I can barely comprehend.

In a year where the league average BABIP was .298 and his team’s in-play average was .287, Pedro Martinez posted .323.

Pedro faced 835 batters. 313 struck out¹ and 37 walked; he hit nine batters and served up 9 home runs. 12 batters reached on errors. The rest reached on 132 singles and 28 doubles. For the season he gave up 56 runs².

The very next year, Martinez posted similarly ridiculous numbers in terms of strikeouts (284) and walks (32). He pitched just four and a third additional innings. He gave up 8 more home runs. For the 2000 season he gave up just 42 runs. He allowed only 92 singles and 18 doubles. How? His BABIP? .235.

Was Pedro Martinez easier to hit in 1999 than in 2000? I think we can assume not, though the extra home runs might say otherwise.

It is important not to fear a key principle of life: [GETTING BLANKED] Happens. Even across 400 balls hit weakly off one of the greatest pitchers of our era, sometimes the ball finds some green and sometimes it doesn’t. Every year some players experience dips in their numbers fueled by little more than random chance.
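For a rough sense of how much room random chance has to work with, here is a minimal sketch. It assumes every ball in play is an independent event with the same chance of falling for a hit (a simplification, not a claim about how any real pitcher works), and it borrows the .298 league average and the round figure of 400 balls in play from above. The function name is made up for illustration.

```python
import math

def babip_standard_deviation(true_rate: float, balls_in_play: int) -> float:
    """Binomial standard deviation of an observed in-play average.

    Assumes each ball in play is an independent trial with the same
    probability of becoming a hit (a simplifying assumption).
    """
    return math.sqrt(true_rate * (1.0 - true_rate) / balls_in_play)

# Assumed inputs: the .298 league average and roughly 400 balls in play.
league_rate = 0.298
sd = babip_standard_deviation(league_rate, 400)
low, high = league_rate - 2 * sd, league_rate + 2 * sd

print(f"one standard deviation: {sd:.3f}")
print(f"rough two-sigma band:   {low:.3f} to {high:.3f}")
```

Under those assumptions one standard deviation works out to roughly 23 points of in-play average, so a swing of a couple dozen points from one season to the next needs no explanation beyond chance.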

Ducks snort. Texas Leaguers are lost in the Texas sun. Seeing-eye singles get blind drunk. Outfielders like Franklin Gutierrez turn gappers into Dead Flying Things while designated hitters masquerading as left fielders play fly outs into doubles. It happens, sometimes in bunches and sometimes to players deserving much better fates.

The sooner we embrace this randomness, the sooner we can dispense with semantic arguments on the merits of in-play averages and the like. The vast majority of advanced stats are designed to isolate the true talent elements of performance. In-play average is just another piece of the puzzle which helps us better understand and enjoy the game we all love.

1 - lol.


Comments (22)

  1. But…. How can I go to the strip club after work without going home to my girlfriend smelling like strippers?

  2. holy shit, drew, you purposely wrote this article just to provoke me right? you are the self proclaimed expert on babip aren’t you? why is it every one of your posts begins with, “this guy was unlucky because his babip was this”. NO there isn’t randomness. you can’t be unlucky for “months” otherwise jojo reyes is just unlucky and he’s a real good pitcher. nice how you conveniently left out pedro’s BA? was he easier to hit in 1999? yes he was. his avg against was .203 in 1999 and dropped to .166 in 2000. that is what caused his babip to drop. he was that much better, not because of randomness.

    also combine the fact that in 2000 he gave up more home runs, which lowers babip, and struck out less, which also lowers babip.

    you act like players are the same every year. they aren’t. they have good years and bad years based on their skill, not based on randomness. the sooner you can embrace this, the better off you’ll be.

  3. Before this goes any further, explain how he was easier to hit in 1999. Explain how, in a season in which he struck out more, walked fewer, and gave up fewer home runs, he was easier to hit.

    I’ll wait.

  4. what are you asking? these are all separate issues that have nothing to do with each other. striking out hitters doesn’t have anything to do with how many hits you give up. does aj burnett give up fewer hits than roy halladay because he strikes out more? does francisco give up few hits since he strikes out a lot? walks have nothing to do with hits either. home runs fluctuate from year to year. why are you asking me to explain why? the BA against tells you so. players are not static. their numbers fluctuate from year to year. there doesn’t have to be a meaning behind it

    the fact is you keep contradicting yourself. you try to espouse the value of babip but you CAN’T explain why pedro’s babip dropped in 2000 can you? so instead you say it’s just randomness. I CAN and have explained to you why it’s a useless stat and why it fluctuates. it doesn’t measure what you think it’s measuring. I can explain it to you further if you actually want me to.

    • You said he was easier to hit in 1999, a season in which he struck out more, gave up fewer home runs, and walked fewer. He was in many (all) ways a better pitcher in 1999 except that he surrendered more singles. You haven’t explained anything. I’m still waiting for you to tell me how he registered fewer strike outs, walked more hitters, gave up more home runs and yet pitched better. Please, I’m on pins and needles.

      You said “his batting average was higher so his BABIP will be higher.” This is a chicken/egg situation. Does his batting average drive his BABIP or does the rate at which balls in play are converted into outs drive his batting average? Please explain why BATTING AVERAGE AGAINST is the ultimate arbiter of talent, skill, and performance. Tell me why a number which treats singles and home runs the same while completely ignoring walks and the impact of defense and the ballpark tells YOU, a guy unable to use his own fucking name on the internet, all you need to know about pitching evaluation. Tell me why DIPS theory is meaningless and randomness doesn’t exist when a round bat meets a round ball thrown at 95 miles per hour. Explain it again, since I don’t see where you explained a single thing here or in any previous exchanges. You say “HIS AVERAGE AGAINST WENT UP, ERGO HE PITCHED WORSE.” No proof, no reasoning. That is all you’ve “explained.”

      You’re right, players aren’t static. Their talent is what it is. Some years they pitch better, some years they pitch worse. It is nearly impossible to predict. We can, however, look at the things they do control and say “he pitched pretty well this year. He didn’t give up many home runs or walk many guys yet he gave up more hits than I would expect. What gives?” Some numbers fluctuate year to year, but others do not. Strike outs, walks, home runs, all tend to stabilize and remain consistent from year to year. These are widely acknowledged to be the things pitchers have the most control over, the best representation of their skills.

      I can’t explain why his BABIP dropped, you’re right. Shit happens. I can flip a coin ten times and have it come up heads ten times. Does that mean I’m good at flipping that coin or that shit just happened? I have about as much control over that coin flip as Pedro Martinez or Roy Halladay or anyone has over where and when a ball they’ve thrown will go when and if it meets a bat, especially as it relates to the ball’s position relative to the fielders standing behind them.

      That isn’t good enough for you. I’m sorry. But please, do explain. Dumb it down as far as you can. These high concepts are obviously lost on a try-hard like me. The emperor has no clothes, clearly.

  5. you asked the question, “Was Pedro Martinez easier to hit than 1999 versus 2000?”

    the answer is yes, because he gave up MORE HITS. his BA was higher, his whip was higher, his era was higher, his bb/9 was higher. is this too difficult for you to understand? hr, walks, and ko’s are UNRELATED to each other. the only thing he did better in 1999 was k ratio and hr ratio. THAT’S IT. big fucking deal. how is that “in all ways makes him a better pitcher”?

    “Strike outs, walks, home runs, all tend to stabilize and remain consistent from year to year.”

    no they don’t. just look at pedro’s numbers. they are all over the place every year. show me how they stabilized.

    “I can’t explain why his BABIP dropped, you’re right.”

    if you can’t explain it, then you shouldn’t be using it at all to explain any player’s performance since you have no fucking idea why it moves up or down. isn’t this what I have been saying from the very beginning?

  6. I don’t buy BABIP over very long periods of time, like a full season. Like Chewy posted, “you can’t be unlucky for ‘months’”. Career years are not the product of luck alone, as luck is a single event phenomenon.

    As far as whether Pedro pitched better in 1999 or 2000; well, they are both great years. His stats for 2000 indicate that he was harder to hit, generally, but, seeing that he gave up twice as many HR’s and found himself in 17% less 3 ball counts than in 1999, maybe he made a conscious effort to improve upon his sick 1999 K/BB rate. That sounds like the Pedro I grew to love. But it doesn’t matter how great a pitcher is, he tries to blow it by a hitter for strike 3, he’s gonna give up more HR’s.

    Whatever the case, I laugh at any conclusion indicating the difference between those two years was strictly luck.

    • I think we’re overstating the “luck” aspect. It is more about random chance. Can you really say that a pitcher has “control” over where balls are hit? If he makes a concerted effort to “pitch to contact”, what can a pitcher do to ensure those balls are hit at his defenders?

      The answer is not a lot. A pitcher’s results can change year-to-year based on the distribution of his batted balls. Surely better hit balls get through but weak hits get through too.

  7. The problem I have with BABIP is when you take it to its extreme: As an example, imagine I make the Jays rotation as a starter. Now, all the years of hot dogs and beer, combined with my 50 MPH fastball would make my start resemble BP. So, I somehow get through the inning after allowing 10 runs to score, all on towering homeruns. My four walks didn’t help matters either. My BA against is 0.666, my era is 90.0, but my BABIP is zero.
    Thus, after the game, Drew Fairservice writes an article about how, “Ray had some troubles out there, but he was helped by some luck. Expect his numbers to normalize as his BABIP returns to normal.”
    I have a very basic problem with a statistic that does that. Having said that, I still find pitcher wins oddly comforting, so I’m probably the wrong person to ask…

    • @Ray – nobody would say that as you, like Jo-Jo Reyes and Josh Towers and any other shitballers you could name, do not have the talent levels to realistically expect a workable in play average.
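Ray's hypothetical inning can be worked through with the standard BABIP formula, (H - HR) / (AB - K - HR + SF). The sketch below is only an illustration: it assumes the inning consisted of 6 home runs, 4 walks and 3 outs recorded on balls in play with no strikeouts, which is one line that reproduces his .666 average, 90.0 ERA and zero BABIP. The function and variable names are made up for the example.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Standard BABIP: (H - HR) / (AB - K - HR + SF).

    Returns None when there were no balls in play at all.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        return None
    return (hits - home_runs) / balls_in_play

# Assumed reconstruction of the inning: 6 HR, 4 BB, 3 outs on balls in play.
hits, home_runs, walks, outs_in_play = 6, 6, 4, 3
runs_allowed = 10                      # every run scores on a homer
at_bats = hits + outs_in_play          # walks do not count as at-bats

batting_average_against = hits / at_bats
era = 9 * runs_allowed / 1             # one inning pitched, all runs earned
inning_babip = babip(hits, home_runs, at_bats, strikeouts=0)

print(f"BA against: {batting_average_against:.3f}")
print(f"ERA:        {era:.1f}")
print(f"BABIP:      {inning_babip:.3f}")
```

The zero comes from only three balls staying in the park, all of them outs. Home runs drop out of both the numerator and the denominator, so the formula simply has nothing to say about them.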

  8. And just to head you off at the sabermetrics pass, I know “you need a larger sample size for numbers to…” bla bla bla. But you can’t have it both ways. If BABIP doesn’t tend to normalize over a large sample size (as you helpfully proved with Pedro) then my point is valid.
    Not to get too “mathy” on you here either, but if you derive BABIP by its component parts (taking into account that HR is a function of H) you can easily see that high HR rate players (a stat which can’t be explained by randomness) will trend towards lower BABIP naturally. The same is true of high HR, walk or strikeout pitchers. There’s no doubt the stat has some value, but there are just SO many factors which can skew it that I find it difficult to use as a measure of anything quite frankly.

    • But BABIP does tend to normalize. A very clear baseline for BABIP exists for all pitchers. When a pitcher experiences a wild variation in this department but not the others, don’t you have to think something is up? In the Pedro example, how could he be more hittable yet surrender fewer home runs, especially when his home ballpark is considered? How could he clearly be a superior pitcher to his teammates yet his batted balls yield hits at such a greater rate while playing in front of the same defense?

      Here’s a quote from a Fangraphs article on the subject:

      Using binomial distribution, we can see that the odds of a pitcher with a true talent level BABIP of .300 randomly posting a .350+ BABIP in any given month (of 115 BIP) is about 10 percent. Thus, the odds of that same pitcher posting a .350+ BABIP in any four out of five months is 1 in 2,200.

      These things happen. It’s okay. They even happen to great pitchers like Pedro. When we see Pedro post a higher than expected BABIP over one season, we can reasonably expect that number to regress back to established baselines based on his talent level. And it did. He didn’t pitch much better or worse; his results slightly improved when his in-play average returned to its normal levels. If anything, it went to the other extreme and fueled his eye-popping ERA.

      Again, it happens. Embrace it warmly, the universe implores you.
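The arithmetic in the quoted Fangraphs passage can be checked with a short script. This is a sketch under the quote's own assumptions (a true-talent .300 BABIP, 115 balls in play per month, each one an independent trial); the helper name is invented for the example, and a .350+ month is taken to mean at least 41 hits on 115 balls in play.

```python
from math import ceil, comb

def binomial_tail(trials, p, at_least):
    """P(X >= at_least) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(at_least, trials + 1)
    )

balls_in_play = 115
true_babip = 0.300
hits_needed = ceil(0.350 * balls_in_play)   # 41 hits gives a .350+ month

# Chance of a single .350+ month for a true .300 pitcher.
p_month = binomial_tail(balls_in_play, true_babip, hits_needed)

# Chance of at least four such months out of five, using the rounded
# "about 10 percent" figure from the quote and then the exact value.
p_four_of_five_rounded = binomial_tail(5, 0.10, 4)
p_four_of_five_exact = binomial_tail(5, p_month, 4)

print(f"one .350+ month:            {p_month:.3f}")
print(f"four of five (10% rounded): 1 in {1 / p_four_of_five_rounded:,.0f}")
print(f"four of five (exact month): 1 in {1 / p_four_of_five_exact:,.0f}")
```

Plugging a flat 10 percent into the four-of-five step gives about 1 in 2,170, which lines up with the quoted 1 in 2,200. The last line shows how sensitive that figure is to the rounding of the monthly probability.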

  9. “hr, walks, and ko’s are UNRELATED to each other”

    Did I say that? Actually, they’re all related to each other, coming from the same pitcher. I think maybe you misread my post. Pedro found himself in 17% LESS 3 ball counts in 2000 but gave up more than twice the HR’s, compared to 1999. If that doesn’t lend proof that he was pitching differently, albeit with the same awesome talent, I don’t know what to say.

    17% less 3-ball counts. That’s a different approach.

    • @Boom – that was in one of Chewey’s comments.

      I agree, that is a different approach to attacking the strikezone. He clearly threw more strikes and yes, surrendered more home runs. Yet he surrendered hits on balls in play at a lesser rate. Again, this isn’t about the number of hits, it’s the rate at which they came. He threw more hittable pitches (in the zone) and gave up more hard hit balls (i.e. home runs), yet hits fell in at a 10% lesser rate. It doesn’t add up, not in the way you’re arguing in this case.

  10. I see two major flaws in that argument right away (not the me being incapable of establishing an in play average, that’s spot on)
    1. The idea of each pitcher having a “baseline level” to which his BABIP will normalize indicates that BABIP IS within a pitcher’s control to the extent that his talent sets the baseline. This would imply that a pitcher COULD alter his BABIP to a new baseline by improving, pitching more or less to contact, losing speed or spin on one or many pitches. Most importantly, it makes the claim of randomness at best uncertain.
    2. Assuming that the results of balls in play would follow a binomial distribution, especially in a sample as small as a month, seems unlikely. In fact, in a sampling of at most 6 starts the chance of outliers should be higher than the base binomial model. As well, setting a fixed number (10%) implies that the standard deviation of every pitcher is the same, which seems unlikely given that some pitchers are more consistent than others.
    None of this proves that BABIP is useless or even that I’m correct in mistrusting it. I just find BABIP to be a statistic with too many variables to apply effectively or confidently, especially in the case of pitchers.

    • @Ray – not each pitcher, all major league quality pitchers baseline around .298.

      Consider these pitchers BABIP since 2006:

      • Roy Halladay .292, CC Sabathia .293, Dan Haren .288, Felix Hernandez .297, Justin Verlander .290, Tim Lincecum .294.
      • Cliff Lee .303, AJ Burnett .297, Mike Pelfrey .306, Jesse Litsch .282, Joe fucking Saunders .295, Nick Blackburn .300, Oliver Perez .288.

      There are some guys with numbers consistently lower (fly ball pitchers like Shaun Marcum for example) and higher but even shitty pitchers see their numbers land on the same spot.

      If a good pitcher sees his number vary greatly in either direction from that baseline, it probably won’t last.

  11. @drew so the question you’re asking is why did pedro hr total jump by 9 in 2000 when he was so much a better pitcher? well it’s probably just blind luck, but you assume everything is static every year. it isn’t. maybe he faced some power hitters more often in 2000?

    jeter was 0/6 in 1999 but was 1hr in 14 ab in 2000

    tony batista was 0/3 in 1999 but had 2hr in 11 ab in 2000

    bernie williams was 0/6 in 1999 but had 1hr in 10ab in 2000

    mike cameron was 1hr in 6ab in 1998 but never faced pedro in 1999, then had 1 hr in 9ab in 2000

    so 5 of the extra 9 hr were due simply to extra chances against good hitters. the rest are due to just random fluctuations. pitchers dont face the same hitters the exact same amt every year.

  12. jose bautista vs david price

    he had 0hr in 10ab last year but 2 hr in 3ab this year.

    is david price so much worse this year? of course not. his whip is even lower. hr can fluctuate yearly, it doesn’t mean anything

  13. Did you really just try to use a bunch of sample sizes 10 and under to prove a point?
