Over the last few days I finally managed to start reading one of my Christmas gifts: Analyzing Baseball Data with R. (Can anyone really read an academic text? It’s more like reading a few sentences and banging your head on a desk.) If you’re reading this, you’re already familiar with baseball; it’s a game that loves its numbers. Fortunately, R is a programming language that loves its numbers, too.
By no means am I an intermediate user, much less an expert, but even some basic tinkering with R and RStudio has let me quickly generate simple visuals that raise questions (and suggest answers) as we all approach the draft, or free agency if you participate in dynasty leagues.
Using R and FanGraphs exports, it is quite simple to visualize basic categories. Perhaps in the future it will be just as simple to do more than that.
Once upon a time, I had no clue about BABIP. Now that I do, it is one of the first things I like to look at during the pre-season.
2015 Hitters with >400 PA; BABIP vs. AVG
As expected, the general trend is that BABIP and AVG rise together.
Bottom 5 BABIP: S. Drew, A. Pujols, C. Utley, B. McCann, L. Valbuena
Top 5 BABIP: O. Herrera, M. Cabrera, D. Gordon, P. Goldschmidt, K. Bryant
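For anyone who wants to try the same thing, here is a rough sketch of how the top and bottom BABIP lists might be pulled in base R. It is not the exact code behind the chart above. The helper name `babip_extremes`, the column names (`Name`, `PA`, `AVG`, `BABIP`), and the toy rows are all assumptions for illustration; the toy numbers are made up and are not real 2015 stats.

```r
# Hypothetical helper: keep hitters above a PA cutoff, then rank them by BABIP.
# Column names (Name, PA, AVG, BABIP) are assumed to match the export.
babip_extremes <- function(hitters, min_pa = 400, n = 5) {
  qualified <- hitters[hitters$PA > min_pa, ]
  ranked <- qualified[order(qualified$BABIP), ]  # lowest BABIP first
  list(
    bottom = head(ranked$Name, n),
    top = rev(tail(ranked$Name, n)),             # highest BABIP first
    correlation = cor(qualified$BABIP, qualified$AVG)
  )
}

# Made-up rows purely to exercise the function; not real player data.
toy <- data.frame(
  Name = c("Hitter A", "Hitter B", "Hitter C", "Hitter D"),
  PA = c(650, 520, 310, 480),
  AVG = c(0.310, 0.255, 0.290, 0.230),
  BABIP = c(0.360, 0.290, 0.340, 0.260),
  stringsAsFactors = FALSE
)

result <- babip_extremes(toy, n = 2)
print(result)
```

With a real export, the toy frame would be swapped for something like `read.csv()` on the downloaded file, and `plot(qualified$BABIP, qualified$AVG)` would give the basic scatter.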
It’s pretty common knowledge that stolen bases are drying up relative to home runs. Simple distributions show the disparity in the number of players who can contribute more than just a few in each category.
Home runs for comparison:
Only four players had 20/20 seasons last year (though J. Upton just missed with 19 SB):
Pollock (20 HR / 39 SB)
Goldschmidt (33 HR / 21 SB)
Machado (35 HR / 20 SB)
Braun (25 HR / 24 SB)
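The 20/20 check itself is just a two-condition filter. Below is a minimal sketch in base R using the four lines listed above; the function name `is_20_20` is my own, and the HR value in the near-miss example is a placeholder, since only the 19 SB figure is given here.

```r
# HR and SB totals exactly as listed in the post.
seasons <- data.frame(
  Name = c("Pollock", "Goldschmidt", "Machado", "Braun"),
  HR = c(20, 33, 35, 25),
  SB = c(39, 21, 20, 24),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when both totals reach the threshold.
is_20_20 <- function(hr, sb, threshold = 20) {
  hr >= threshold & sb >= threshold
}

club <- seasons[is_20_20(seasons$HR, seasons$SB), ]
print(club$Name)

# A 19-steal season misses the cut whatever the HR total
# (hr = 25 is a placeholder, not a real stat line).
print(is_20_20(hr = 25, sb = 19))
```

Run against a full export, the same filter would replace the hand-typed data frame with the HR and SB columns from the file.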
Last for now is a basic presentation of ERA vs. xFIP, including labels (which I finally learned how to add) for the three biggest under- and over-achievers.
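One way to pick which points to label, sketched in base R: take ERA minus xFIP for each pitcher, sort, and label only the extremes at each end. This is an illustration under assumptions, not necessarily how the chart here was built. The helper name `era_xfip_outliers`, the column names (`Name`, `ERA`, `xFIP`), and the toy pitchers are all made up for the example.

```r
# Hypothetical helper: find the pitchers whose ERA differs most from xFIP.
# Column names (Name, ERA, xFIP) are assumptions about the export.
era_xfip_outliers <- function(pitchers, n = 3) {
  pitchers$diff <- pitchers$ERA - pitchers$xFIP
  ranked <- pitchers[order(pitchers$diff), ]  # most negative diff first
  list(
    over = head(ranked$Name, n),        # ERA furthest below xFIP
    under = rev(tail(ranked$Name, n))   # ERA furthest above xFIP
  )
}

# Made-up rows purely to exercise the function; not real pitcher data.
toy <- data.frame(
  Name = paste("Pitcher", LETTERS[1:6]),
  ERA = c(3.00, 4.50, 3.80, 2.90, 5.10, 3.60),
  xFIP = c(4.20, 3.70, 3.75, 3.40, 4.00, 3.65),
  stringsAsFactors = FALSE
)

out <- era_xfip_outliers(toy, n = 2)
print(out)

# Scatter plot with a reference line where ERA equals xFIP,
# labelling only the selected outliers.
labelled <- toy[toy$Name %in% c(out$over, out$under), ]
plot(toy$xFIP, toy$ERA, xlab = "xFIP", ylab = "ERA")
abline(0, 1, lty = 2)
text(labelled$xFIP, labelled$ERA, labels = labelled$Name, pos = 3)
```

Labelling only the extremes keeps the plot readable; labelling every pitcher tends to bury the points under text.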
Cursory review reveals:
- Chris Young’s ERA typically outperforms his xFIP
- Marco Estrada’s does not
- Hector Santiago’s ERA typically outperforms his xFIP
- Mat Latos’ ERA does not typically underperform xFIP
- Ditto Rick Porcello
- Ditto Michael Pineda
Quick thoughts (after all, it was indeed cursory):
Ignore Marco Estrada unless you are counting on the ~4 ERA. Young and Santiago are not much more than afterthoughts in most leagues. But in deeper(est?) leagues, they may be worth a cheap annual contract / late pick.
One can point to Porcello’s and Pineda’s higher-than-career-average HR/FB% as a major reason why they underperformed. Nothing quite so immediate stands out for Mat Latos. He warrants future consideration, or maybe not, considering that even the deeper leagues may be Latos-intolerant.