Extracting Luck From BABIP (2024)

Balls in play are subject to lucky bounces, bloops, and exquisite defensive plays. Are some great hitting seasons and breakout performances just a player getting lucky on more than their fair share of balls? Is there any way to tell if a player is truly lucky or good, or if his batting average on balls in play is higher than we would expect? Could building a better expected BABIP help us find over- or undervalued players?

In the hopes of better understanding players’ true abilities, I looked specifically at the correlation between BABIP and launch characteristics. A player’s BABIP viewed across a short timeframe, such as a single season, can be highly influenced by luck, because BABIP doesn’t converge well over a small sample. By the law of large numbers, given enough balls in play, a player’s BABIP should converge to their “true” BABIP. Fortunately, launch characteristics like exit velocity and launch angle (both vertical and horizontal) converge more quickly. My goal was to build a model for expected BABIP based on those launch characteristics that removes as much luck as possible and more closely reflects a player’s true skill.
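To make the small-sample problem concrete, here is a minimal sketch that computes BABIP with the standard formula, (H - HR) / (AB - K - HR + SF), and estimates how noisy an observed BABIP is at a given number of balls in play. The function names are hypothetical, and the noise estimate assumes each ball in play is an independent trial with the same hit probability, which is a simplification.

```python
import math

def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Standard BABIP: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("player has no balls in play")
    return (hits - home_runs) / balls_in_play

def babip_standard_error(true_babip, balls_in_play):
    """Binomial standard error of an observed BABIP.

    Assumption: every ball in play is an independent trial with the
    same hit probability (true_babip).
    """
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)

# Hypothetical stat line, used only to exercise the formula.
example_babip = babip(hits=150, home_runs=20, at_bats=500,
                      strikeouts=100, sac_flies=5)

# Noise around a .300 true-talent hitter at two sample sizes.
se_100 = babip_standard_error(0.300, 100)
se_400 = babip_standard_error(0.300, 400)
```

Under these assumptions, a .300 true-talent hitter carries a standard error of about 46 points of BABIP after 100 balls in play and about 23 points after 400, so quadrupling the sample only halves the noise.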

This project started as work I did along with Eric Langdon, Kwasi Efah, and Jordan Genovese for Safwan Wshah’s machine learning class at the University of Vermont. We used launch characteristics (exit velocity, vertical launch angle, and derived horizontal launch angle) to predict whether a ball in play would land for a hit. We initially tried a support vector machine classifier but found that a random forest model delivered more accurate predictions.
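The article does not spell out how per-ball predictions become a player-level number. One plausible approach, assumed here purely for illustration, is to average the predicted hit probability over all of a player's balls in play. The function name and input shape are hypothetical, and the probabilities could come from any classifier.

```python
from collections import defaultdict

def expected_babip(predictions):
    """Average per-ball hit probabilities into a player-level expected BABIP.

    predictions: iterable of (player, hit_probability) pairs, one per
    ball in play. Assumption: the probabilities come from some fitted
    classifier; this sketch does not train one.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for player, prob in predictions:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"hit probability out of range: {prob}")
        totals[player] += prob
        counts[player] += 1
    return {player: totals[player] / counts[player] for player in totals}

# Made-up probabilities for two hypothetical players.
sample = [("Player A", 0.5), ("Player A", 0.25), ("Player B", 1.0)]
xbabip = expected_babip(sample)
```

Averaging probabilities rather than hard hit/no-hit labels keeps the information in a 55% ball versus a 95% ball, which matters when a player has only a few hundred balls in play.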

Comparing My Results

Statcast has a “Batting Average using Speed Angle” stat that incorporates exit velocity and vertical launch angle, which I used as a comparison point. In terms of predicting a player’s BABIP for a season from their launch characteristics in that same season, my model delivered 38% less error than Statcast’s model (using 2019 data for players with over 300 balls in play). When using a player’s 2019 data to predict their 2020 BABIP, my model performed similarly to Statcast’s model, giving average errors of 3.94% and 3.90%, respectively. Both models predicted a player’s 2020 BABIP better than that player’s actual 2019 BABIP did, which had an average error of 4.39%. The 2020 predictions have to be taken with a grain of salt, as I expect progression and regression to occur between seasons, and neither of the two models nor the data accounts for that.
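The comparison above can be sketched as follows. The article reports "average error" without naming the metric, so this sketch assumes mean absolute error; the helper names are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual BABIP.

    Assumption: "average error" in the text means mean absolute error.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def relative_error_reduction(model_error, baseline_error):
    """Fraction by which model_error undercuts baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 1 - model_error / baseline_error

# Toy predictions for two hypothetical players.
toy_mae = mean_absolute_error([0.300, 0.320], [0.310, 0.300])

# Reported 2019 -> 2020 errors: model 3.94, prior-year actual BABIP 4.39.
vs_prior_year = relative_error_reduction(3.94, 4.39)
```

Plugging in the reported figures, the model's 3.94 against the 4.39 from using 2019's actual BABIP works out to roughly 10% less error.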

Cross-Validation To Verify Accuracy

I sought to verify the accuracy of my model’s predictions using cross-validation. I split the 2019 season data into two groups, randomly assigning each ball in play to Box A or Box B. This allowed me to predict a player’s BABIP in Box A using three estimates from Box B (my model’s expected BABIP, Statcast’s expected BABIP, and the player’s actual BABIP), and vice versa. Regression/progression was not an issue since I was drawing two random samples from the same population. After running the comparison on each player with at least 125 balls in play in both groups A and B (so at least 250 BIP for 2019), my model produced an average error of just 3.2% while Statcast had an average error of 3.5% and actual BABIP had an average error of 3.7%.
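A minimal sketch of the random split described above, using only the standard library. The input shape, function name, and seed are assumptions made for illustration; the 125-ball threshold comes from the text.

```python
import random
from collections import defaultdict

def split_half_babip(balls, min_bip=125, seed=0):
    """Randomly split each player's balls in play into Box A and Box B.

    balls: iterable of (player, is_hit) pairs, one per ball in play.
    Returns {player: (babip_a, babip_b)} for players with at least
    min_bip balls in play in each box.
    """
    rng = random.Random(seed)
    boxes = defaultdict(lambda: ([], []))
    for player, is_hit in balls:
        # randrange(2) picks box 0 (A) or box 1 (B) with equal odds.
        boxes[player][rng.randrange(2)].append(1 if is_hit else 0)

    result = {}
    for player, (box_a, box_b) in boxes.items():
        if len(box_a) >= min_bip and len(box_b) >= min_bip:
            result[player] = (sum(box_a) / len(box_a),
                              sum(box_b) / len(box_b))
    return result

# Synthetic data: one hitter with 600 balls in play, one with only 10.
balls = [("regular", i % 3 == 0) for i in range(600)]
balls += [("callup", True) for _ in range(10)]
halves = split_half_babip(balls, min_bip=125, seed=1)
```

The same split can be applied to any per-ball estimate: compute it in Box B, compare it with actual BABIP in Box A, then swap the boxes and average the errors.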

What’s Luck Got to Do With It?

If a player’s BABIP is higher than his expected BABIP, we can deem him lucky. Luck, as I define it, is the gap between actual and expected BABIP. The luckiest player in 2019 per my analysis was Nolan Arenado, who outperformed his expected BABIP of .325, achieving an actual BABIP for the season of .368. Unfortunately for Arenado, his luck didn’t follow him into 2020, when his expected BABIP exactly matched his actual BABIP at .277. On the flip side, 2019’s unluckiest player was Marcell Ozuna, who posted a BABIP of .314 despite an expected BABIP of .370. After signing a one-year deal with the Braves in 2020, Ozuna exploded and outperformed his expected BABIP of .429 to the tune of a .456 BABIP en route to a sixth-place finish in MVP voting.
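The luck definition above reduces to a subtraction, sketched here with the two 2019 seasons quoted in the text. The function and variable names are hypothetical; the BABIP figures are the article's.

```python
def luck(actual_babip, expected_babip):
    """Luck as defined in the text: actual minus expected BABIP.

    Positive means lucky, negative means unlucky. Rounded to three
    decimals, the precision BABIP is usually quoted at.
    """
    return round(actual_babip - expected_babip, 3)

# (actual, expected) BABIP for 2019, as quoted in the article.
seasons_2019 = {
    "Nolan Arenado": (0.368, 0.325),
    "Marcell Ozuna": (0.314, 0.370),
}

luck_2019 = {name: luck(actual, expected)
             for name, (actual, expected) in seasons_2019.items()}

# Luckiest first.
ranked = sorted(luck_2019, key=luck_2019.get, reverse=True)
```

By this measure Arenado ran 43 points above his expected BABIP in 2019, while Ozuna ran 56 points below his.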

I have listed the top 10 unluckiest and luckiest players from 2020 based on the difference between their expected and actual BABIPs. This year the expected BABIPs may be especially useful given the shortened 2020 season. With a greatly truncated sample size, the law of large numbers had little chance to take hold, so 2020’s actual BABIPs may deviate significantly from players’ true-talent levels. The table includes players’ actual 2020 hits, balls in play, and BABIPs as well as my model’s expected BABIP (xBABIP), MLB’s Statcast Expected Batting Average on Balls in Play using SpeedAngle (mlbBABIP), and the difference between expected BABIP and actual BABIP (diff).

Further Refining Expected BABIP

While the results are encouraging, there is still much more that can be done to improve predictions of a player’s true BABIP. I am exploring some ideas on how to do that and hope to outline them, as well as the anticipated challenges, in future writing. The code I used for this project and the final report from the machine learning class are available upon request for those interested in the gory math details. I plan to continue this research, so any questions, comments, or suggestions would be greatly appreciated.

Jack Olszewski graduated from the University of Vermont and has interned as a video scout for Baseball Info Solutions, a statistician for several college baseball and hockey teams, and a data analyst for a national publisher. He is currently pursuing entry-level positions in baseball operations and can be reached via email or LinkedIn.
