Converting OBP/SLG to wOBA

Statistical analysis has crept into the mainstream consciousness. It is no longer difficult to find fans who look at slash lines before Triple Crown stats or who can properly choose (after looking it up, of course) between Brian Giles' and Carlos Lee's careers at the plate (just try to do it without considering walks). Unfortunately, the face of this new wave of stats has become OPS. As frustrating as it can be that the past few decades of work have been essentially boiled down to the equivalent of rummaging through the pantry, picking out a mishmash of the best ingredients, and throwing them into a casserole rather than looking up a real recipe, it is a step forward. It may be tiring having to explain that, even without considering park effects, Kenny Lofton still hit better than Vinny Castilla, OPS be damned, but it's infinitely better than trying to have a discussion with someone who insists that Gary Gaetti and his 360 HR and 1341 RBI are Lofton's superior. Plus, it's nice to be able to go to a ball game and look up to the scoreboard when an unfamiliar hitter comes up and see something that will give you a better idea of how well he's hit than AVG/HR/RBI.

Luckily for those of us who don't care to stop at OPS, linear weights and similar run estimators have become widely available. Baseball-Reference has Pete Palmer's Batting Runs. Baseball Prospectus has EqA. FanGraphs and StatCorner have wOBA. All these are designed to model runs, not to stab in the dark at them. These are the stats that looked up the recipe first. But can we still make OPS work? The Book says, "for you OPS lovers, you will note that (OBPx2+SLG)/3 is a close approximation of wOBA." The problem is that if you run that calculation (call it 2OPS/3) with the wOBA scale in mind, you will find a lot of hitters to be much better than you thought, as average for that calculation is a good .030-.035 points higher than for wOBA. So if all you have are a player's OBP and SLG, and you want to know how good a hitter he is, 2OPS/3 will be on an unfamiliar scale. You know how good a .350 wOBA is, but not how good a .350 2OPS/3 is.
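To see the scale problem in numbers, here is a minimal Python sketch of The Book's approximation. The function name and its keyword parameters are made up for illustration; the .330/.420 input is the combination examined later in this article, where hitters with that line averaged about a .326 wOBA.

```python
def obp_slg_blend(obp, slg, obp_weight=2.0, divisor=3.0):
    """(obp_weight*OBP + SLG) / divisor; the defaults give The Book's (OBPx2+SLG)/3."""
    return (obp_weight * obp + slg) / divisor


# A .330 OBP / .420 SLG hitter on the 2OPS/3 scale.
blend = obp_slg_blend(0.330, 0.420)
print(f"{blend:.3f}")  # 0.360, versus roughly .326 on the wOBA scale
```

The weight and divisor are left as parameters so that other blends of OBP and SLG can be tried without rewriting the function.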

The upside of 2OPS/3 is that it retains the simplicity of OPS. It's easy to calculate, and, because it is still arbitrary, it looks clean and simple. So if you want to use it, know that average (for this decade) is around .365, not .330. That's decent enough, I guess. We can still do better, though. Of course, to be honest, should we? The previous paragraph listed several alternatives that are readily available and, frankly, already do everything we can hope to with OPS, but better. So why bother? There is a reason, after all, that The Book wraps up its lone paragraph on OPS rather succinctly: "This is the last time we will talk about OPS."

But say for some reason all you have are OBP and SLG. Maybe you're at a game and the scoreboard flashes a player's slash line and you want to know how good that is beyond what OPS can tell you (and you brought a calculator or pen and paper, or at least picked up some spare napkins at the concession stand to jot down your calculations on). Maybe you're having a real-life, offline discussion where the stats come up. Maybe you have a projection that only gives you the traditional slash line, or you are looking up some sort of split that isn't presented on the sites that give you linear weights. Whatever the reason, you find yourself trying to decide if a .400/.420 line is better than, worse than, or about the same as a .370/.460 line, and just how much difference there is. How do you do it?

One way is to look at all players who have a specific breakdown of OBP/SLG and see what those players' corresponding wOBA was. You can look at all players who had roughly a .330 OBP and .420 SLG and see that they had, on average, a .326 wOBA. And then you can do the same for every semi-common combination of OBP and SLG. Sound like a plan?

Now take every player-season from 1993-2008, round the OBP and SLG to the nearest hundredth, and group all player-seasons with the same rounded OBP and SLG together (trOBP and trSLG in the table below; actOBP and actSLG are each group's actual averages). Limit combinations to only those with at least 2500 combined PA. Then, you can look at the average wOBAs for all slugging percentages with a set OBP to see how much each point of SLG is worth, and vice versa to determine the value of a point of OBP. You end up with something like this:

wOBA by SLG, OBP=.330

trOBP trSLG actOBP actSLG wOBA PA
.3300 .3200 .3289 .3192 .2930 2755
.3300 .3300 .3296 .3295 .3007 6635
.3300 .3400 .3299 .3398 .3019 8002
.3300 .3500 .3296 .3499 .3061 15144
.3300 .3600 .3297 .3599 .3077 13466
.3300 .3700 .3297 .3701 .3124 11015
.3300 .3800 .3299 .3800 .3132 16022
.3300 .3900 .3302 .3908 .3181 15032
.3300 .4000 .3300 .3999 .3208 23948
.3300 .4100 .3309 .4101 .3243 19832
.3300 .4200 .3290 .4195 .3256 17874
.3300 .4300 .3301 .4303 .3296 17440
.3300 .4400 .3302 .4393 .3321 16495
.3300 .4500 .3302 .4506 .3364 14757
.3300 .4600 .3298 .4601 .3392 17508
.3300 .4700 .3293 .4701 .3416 10955
.3300 .4800 .3303 .4794 .3476 13828
.3300 .4900 .3306 .4899 .3501 7501
.3300 .5000 .3308 .4987 .3528 6732
.3300 .5100 .3310 .5092 .3571 5783
.3300 .5200 .3309 .5207 .3620 4285
.3300 .5400 .3307 .5392 .3637 3024
.3300 .5500 .3332 .5494 .3751 2722
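
The grouping step behind a table like this could be sketched in Python roughly as follows. The tuple layout, the function name, and the four sample player-seasons are all hypothetical, and weighting each group's wOBA by PA is an assumption, since the article does not say how the group averages were taken.

```python
from collections import defaultdict


def group_by_rounded_slash(seasons, min_pa=2500):
    """Group (obp, slg, woba, pa) tuples by OBP and SLG rounded to the hundredth.

    Returns {(rounded_obp, rounded_slg): (pa_weighted_woba, total_pa)},
    keeping only groups with at least min_pa combined PA.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of wOBA*PA, sum of PA]
    for obp, slg, woba, pa in seasons:
        key = (round(obp, 2), round(slg, 2))
        totals[key][0] += woba * pa
        totals[key][1] += pa
    return {
        key: (woba_pa / pa, pa)
        for key, (woba_pa, pa) in totals.items()
        if pa >= min_pa
    }


# Hypothetical player-seasons, invented purely to exercise the function.
sample_seasons = [
    (0.331, 0.418, 0.325, 1500),
    (0.328, 0.422, 0.327, 1200),
    (0.334, 0.421, 0.329, 300),
    (0.352, 0.498, 0.365, 600),  # this group falls short of 2500 PA and is dropped
]
groups = group_by_rounded_slash(sample_seasons)
for (obp, slg), (woba, pa) in sorted(groups.items()):
    print(f"{obp:.2f} {slg:.2f} {woba:.4f} {pa}")
```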

And so on for every other rounded OBP, and then repeat grouping by rounded SLG instead of OBP. From here, we can look at how much wOBA changed for each .010 points of SLG when we hold OBP constant and see that .010 points of SLG is worth about .003 points wOBA. Note from the following graph that the relationship is roughly linear:



We can also do the same to see that .010 points of OBP is worth .005-.006 points wOBA (the graph for OBP vs. wOBA when SLG is held constant looks similar, just with a steeper slope). This is pretty close to the 2:1 rule of thumb for OBP:SLG (it's actually around 1.8 by this method). This value is relatively constant whether OBP and SLG are high, low, or average, as illustrated by the following graph:


Again, the graph is similar whether you hold SLG or OBP constant. There is a slight downward trend as SLG and OBP rise, but nothing major.
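
As a rough check on the SLG slope, the actSLG and wOBA columns of the OBP=.330 table can be run through an ordinary least-squares fit. This is only a sketch: the unweighted fit and the helper name are assumptions, since the article does not say exactly how the slopes were estimated. The result should come out in the neighborhood of the .003 wOBA per .010 SLG quoted above.

```python
# (actSLG, wOBA) pairs copied from the OBP=.330 table.
slg_woba = [
    (0.3192, 0.2930), (0.3295, 0.3007), (0.3398, 0.3019), (0.3499, 0.3061),
    (0.3599, 0.3077), (0.3701, 0.3124), (0.3800, 0.3132), (0.3908, 0.3181),
    (0.3999, 0.3208), (0.4101, 0.3243), (0.4195, 0.3256), (0.4303, 0.3296),
    (0.4393, 0.3321), (0.4506, 0.3364), (0.4601, 0.3392), (0.4701, 0.3416),
    (0.4794, 0.3476), (0.4899, 0.3501), (0.4987, 0.3528), (0.5092, 0.3571),
    (0.5207, 0.3620), (0.5392, 0.3637), (0.5494, 0.3751),
]


def ols_slope(points):
    """Slope of the unweighted least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


slope = ols_slope(slg_woba)
print(f"wOBA per .010 of SLG at OBP=.330: {slope * 0.010:.4f}")
```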

That is the first result to note: .010 points of SLG are worth roughly .003 points wOBA, while .010 points of OBP are worth .005-.006 points wOBA. You can use this rule of thumb to compare two players by taking the differences between their OBPs and SLGs.
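
Applied to the earlier question of a .400/.420 line versus a .370/.460 line, the rule of thumb can be sketched like this. The function name is made up for illustration, and the default weights are the .56 and .31 the article settles on further down; rounder values such as .55 and .30 give nearly the same answer.

```python
def woba_gap(obp_a, slg_a, obp_b, slg_b, obp_weight=0.56, slg_weight=0.31):
    """Approximate wOBA difference (hitter A minus hitter B) from OBP and SLG gaps."""
    return obp_weight * (obp_a - obp_b) + slg_weight * (slg_a - slg_b)


# Hitter A: .400 OBP / .420 SLG.  Hitter B: .370 OBP / .460 SLG.
gap = woba_gap(0.400, 0.420, 0.370, 0.460)
print(f"{gap:+.4f}")  # +0.0044, so A comes out about four points of wOBA ahead
```

The extra .030 of OBP is worth about .017 points of wOBA and the missing .040 of SLG costs about .012, so the two lines are close, with a slight edge to the higher OBP.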

What if you want to actually replicate a wOBA figure, though? This is a bit messier. Really, this isn't worth it unless you just don't have access to wOBA itself for whatever reason. But say you need to do it. We want to complete the formula:

wOBA = A*OBP + B*SLG + C

From the slopes above, we already know A and B to be about .56 and .31, but we don't yet know C. So we go back to our original table and calculate .56*OBP + .31*SLG for each combination of OBP and SLG, and then subtract that from the wOBA for each combination:

C = wOBA - (A*OBP + B*SLG)
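
A minimal sketch of that subtraction, using three rows from the OBP=.330 table. Using each group's actual average OBP and SLG rather than the rounded values is a choice made here for illustration, and the function name is hypothetical.

```python
A, B = 0.56, 0.31

# (actOBP, actSLG, wOBA) for the lowest, a middle, and the highest SLG group
# in the OBP=.330 table.
table_rows = [
    (0.3289, 0.3192, 0.2930),
    (0.3290, 0.4195, 0.3256),
    (0.3332, 0.5494, 0.3751),
]


def residual_c(obp, slg, woba, a=A, b=B):
    """C = wOBA - (A*OBP + B*SLG) for one OBP/SLG group."""
    return woba - (a * obp + b * slg)


c_values = [residual_c(obp, slg, woba) for obp, slg, woba in table_rows]
for (obp, slg, woba), c in zip(table_rows, c_values):
    print(f"OBP {obp:.4f}  SLG {slg:.4f}  wOBA {woba:.4f}  C {c:.4f}")
```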

Here, we introduce a problem. C is not really a constant. It changes when OBP and SLG change. Honestly, did we really expect anything related to OPS that wasn't arbitrary to be mathematically simple? Here's what the graph of C looks like for each predicted wOBA:


Lovely. The formula for the line of best fit is printed on the graph. That is our value for C. The x in that equation is really (A*OBP + B*SLG). Since wOBA = x + C, adding that fitted C back to x and collecting the x terms gives our formula for converting OBP and SLG to wOBA:

x = .56*OBP + .31*SLG
wOBA = -.53x^2 + 1.35x - .045

This can be combined into one equation with substitution if you prefer, but it looks a bit ugly, so we'll just leave it be for now. This is now to the point where anyone who would care to do the calculation would almost be better off just calculating wOBA directly from raw stats, but whatever. Go only as far into the calculations as you need. If you want to go this far, this is how you do it.
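
For anyone who would rather let a computer do the arithmetic, the two-step formula above translates directly into a few lines of Python. The function name is made up for illustration; the coefficients are the ones given in the article.

```python
def estimate_woba(obp, slg):
    """Estimate wOBA from OBP and SLG using the article's quadratic fit."""
    x = 0.56 * obp + 0.31 * slg
    return -0.53 * x ** 2 + 1.35 * x - 0.045


# The two slash lines from earlier in the article.
line_a = estimate_woba(0.400, 0.420)
line_b = estimate_woba(0.370, 0.460)
print(f"{line_a:.3f}")  # 0.367
print(f"{line_b:.3f}")  # 0.362
```

By this estimate the .400/.420 hitter comes out roughly four points of wOBA ahead of the .370/.460 hitter.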

Does this formula work? For the most part, yeah. The scale and league average match pretty closely with wOBA, and for most players, it works out to be pretty close. This estimate is within .010 points of the actual wOBA for over 95% of player-seasons since 1993. About three quarters of player-seasons are within .005 points wOBA. Half of them are within .0027 points wOBA. The average absolute difference between predicted and actual wOBA is .0036 points. Not bad, especially considering wOBA counts stolen bases and our estimate doesn't.

Obviously, the players this works most poorly for are those with a large effect from either stolen bases or from intentional walks, as wOBA handles those in a fundamentally different way from OBP and SLG (in that it considers SB/CS at all and that it differentiates IBB from nIBB). For example, the two biggest discrepancies between predicted and actual wOBA were Bonds in 2004 (120 IBB) and Willy Taveras in 2008 (68/75 in SB attempts). Both were over .025 points off.

So there you have it. Can you convert OBP and SLG to a reasonable estimate of wOBA? Yes. Should you? Probably not, unless it's all you have and you really need a reliable way to convert to actual runs. If all you want to do is get an idea of how good someone is at the plate or who is better than whom, there's little point in going all the way through the conversion. But you could do it. So take that, OPS.

Now I just need to convince the scoreboard operators at my local stadium to substitute my definition of OPS for theirs. Then we'll really be in business.

2 comments:

Anonymous said...

The biggest problem with (2*OBP+SLG)/3 is that the "2" is an oversimplification. It's been known for a while that the best fit is a multiplier around 1.75. Just use (1.75*OBP+SLG)/3 and you will find the league average is very close to .330.

Of course your fitted curve will give a better fit than a linear equation, but it's not always worth the extra math.

Kincaid said...

Right, it's mostly just a matter of how much math you're willing to do and how precise you want to be. (2*OBP+SLG)/3 is good because it is simple to calculate, whereas 1.75*OBP is a pain to try to estimate in your head, so it loses a little functionality. (1.75*OBP+SLG)/3 is better if you have a spreadsheet or calculator or something, and it's still simple to remember the formula and type it in, which is probably the biggest advantage over something more precise like the equation in the article. If you are doing a whole list of players and are running the calculations in Excel, the complex formula wouldn't be much more cumbersome than the simpler formula, except there's no way someone will have the more precise formula memorized for immediate use. Also, as long as you are using 1.75 instead of 2, you'd probably want to use a divisor closer to 3.05 to scale the mean a bit better, which goes back to how precise do you want to be compared to how much you want to have to remember or rely on computational aids.

Honestly, I don't think the formula in this article is of much practical use except in a few rare cases (though if you do want to be more precise and don't mind a little extra work, I guess it's up to you to decide if it's worth using). If there is something of use to take from this exercise, it's probably that OPS (and its other incarnations) is a messy operation that doesn't work off a clean set of logic the way something like wOBA does. Combining unlike terms such as OBP and SLG distorts value to some extent even if you weight them properly, and they happen to work well enough at valuing most players mostly by accident.

Not that you should really care about that if you are just trying to get a quick and decent estimate of a hitter's value. The shortcut works well enough, so you might as well take advantage of that.
