Recipe for Regression: a case study in bullpens

A good bullpen is a wonderful thing: look at where it got the Rays and the Phillies this year. The bullpen plays an undeniably major role in many teams' success and is often the piece that vaults them to the next level. It's also true, however, that the bullpen provides far less value than the starting rotation or the line-up. Tom Tango estimates that only about 10% of the value provided to a team comes from the bullpen, which is right about the value Win Shares gives it. The relatively low value of the bullpen makes it difficult to build a team around long-term, but it is still possible to ride a hot bullpen to short-term success. The problem is that the high returns teams get out of these bullpens are usually not sustainable over multiple years, and the value of even a good bullpen is bound to come back down to earth.

Over the past 5 paired seasons, teams that ranked in the top 5 in baseball in bullpen WPA (win probability added, essentially how many wins the bullpen contributed over the year) averaged 7.43 wins from the bullpen in their top-5 year and dropped an average of 6.03 wins the following year. So teams with great bullpens one year carried over less than a fifth of that above-average win value the next. Of the 25 teams to be in the top 5 in bullpen WPA from 2003-07, only 5 (20%) were able to stay in the top 5 the following year (for comparison, 17% would stay in the top 5 just by random chance), and 21 (84%) regressed the following year. So a great bullpen, while often a key ingredient to a successful team, generally does not hold extreme value from one year to the next.
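For anyone who wants to check the arithmetic in that paragraph, here is a quick sketch. The variable names are just mine, and the 30-team league size behind the "random chance" figure is an assumption on my part (5 spots out of 30 teams), not something taken from the data set.

```python
# Figures quoted in the paragraph above; variable names are illustrative only.
top5_wpa = 7.43   # average bullpen WPA in the top-5 year
avg_drop = 6.03   # average decline in bullpen WPA the following year
n_teams = 30      # assumption: 30 MLB teams
top_n = 5         # size of the "top 5" group

# What the average top-5 bullpen was worth the next year.
next_year_wpa = top5_wpa - avg_drop

# Share of the original above-average value that carried over.
carry_over = next_year_wpa / top5_wpa

# Chance of landing in the top 5 again if the rankings were purely random.
chance_repeat = top_n / n_teams

# Observed repeat rate: 5 of the 25 top-5 teams stayed in the top 5.
observed_repeat = 5 / 25

print(f"next-year WPA:   {next_year_wpa:.2f}")
print(f"carry-over:      {carry_over:.1%}")
print(f"random repeat:   {chance_repeat:.1%}")
print(f"observed repeat: {observed_repeat:.1%}")
```

The observed repeat rate ends up only a few points above what random shuffling would give, which is the point of the comparison.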

This is of particular interest to this year's big surprise team, the aforementioned Rays. There was a lot of talk about the Rays' "different hero every night" on offense and stellar starting pitching, but the #1 reason they were where they were was their bullpen. The rotation, while it was good, wasn't even the best in the division (Toronto's staff earned that distinction) and was pretty comparable to Boston's staff. The offense was pretty much average and certainly worse than the Sox' line-up. Where they really distinguished themselves and made up that ground was in the bullpen.

Fangraphs lists the Rays' team bullpen WPA as 9.30, the best in baseball. That's more than double the wins contributed by their offense and starting pitching combined. With an average bullpen, they were basically the Blue Jays (about an 86 win team). Luckily for the Rays, they didn't have an average bullpen. Wins are wins no matter where you get them, and 9 or 10 wins from the bullpen are just as good to the Rays as getting those wins from anywhere else.

To the 2008 Rays, that is. What we want to know is what that does for the 2009 Rays. Can they repeat their '08 performance? Chances are, as shown above, no. Where does all that production go, though? It can't just disappear into thin air. So let's look at where the production came from and try to figure out what can happen to turn a great bullpen into a fairly pedestrian one.

The Rays' bullpen did abnormally well last year in high-leverage situations, when it counted most. Their WPA/LI, which measures what the pen did with every at bat equally weighted, was significantly lower than what you'd expect from a bullpen that won that many games, meaning the bullpen was not pitching well enough in general to win nearly that many games. By multiplying the bullpen's WPA/LI by its LI*, we can get how many wins they should have contributed based on how well they actually pitched, regardless of situation. Doing this leaves the expected win value of the Rays' bullpen 4 wins shy of their actual win value. Again, great for the 2008 Rays, but not such a good sign for the 2009 Rays.

Chances are, if you believe in clutch performance, you're shaking your head by now. The Rays won with a great clutch bullpen performance in 2008, and maybe they just have great clutch pitchers. You know, guys like Jason Hammel, J.P. Howell, Scott Dohmann, and Gary Glover (their top 4 relievers in outperforming their season numbers in high-leverage situations). Not exactly the best group of pitchers, but part of the Rays' success was that they got so many contributions from the weaker back end of their pen, so maybe these guys' best skill is their clutch ability, and maybe they really can repeat it. Let's check out their career WPA - (WPA/LI) x LI* before this year, and then in 2008:

WPA over expected


It doesn't look like any of these pitchers has any repeatable clutch skill. In fact, every one of them had done worse in higher-leverage situations over his career before this year, when they combined to contribute 3 wins over the expected value based on how well they pitched. Glover, who somehow managed to contribute .21 wins above average in 34 innings for the Rays, was cut mid-season and went to Detroit, where he managed to post a WPA of -1.07 in just 20 innings. That, in case you're not familiar with the relative scale, is horrible. In part, it was avoiding that kind of performance from their worst pitchers that made the Rays so successful.

Judging from where their clutch performances came from, it doesn't seem likely that they can come close to repeating them. So that's bad news to the tune of 4 wins for the '09 Rays, even if they pitch just as well. Even that could be difficult for them, though. They not only had pitchers contributing significantly beyond their performances, they had pitchers performing well beyond their career norms. The most exaggerated example of this is Grant Balfour, who went from being a walking punch-line (7.3 BB/9 in 24.2 IP in 2007 for a guy with that name?) to one of the best relievers in baseball in 2008.


                 IP     K/9     BB/9   K/BB   ERA           exWins
Bill James '09   74.0   12.04   4.50   2.68   3.01   xx     1.28
Marcel '09       57.0    9.16   3.95   2.32   3.55   xx     0.57

Marcel is the more pessimistic projection here, but both projections have Balfour regressing quite a bit next year (though not nearly to his pre-2008 self). His regression, based on his '09 projections for ERA and IP, is likely to knock 1-1.5 wins off his 2008 value. He also picked up an extra half a win by pitching disproportionately well in higher-leverage situations, as discussed above, so it would be well within reason for Balfour to be worth a full 2 wins less next season. As a player with little Major League history, injuries that sidetracked his career, and one big break-out season under his belt, on top of just being a reliever in general, he is about as difficult a player to project as you'll find, though, so he could also be better than that. Still, there is very little chance of him posting another 1.54 ERA.

If we apply the same exWins** formula to the 2009 projections of Tampa's top 3 relievers from 2008, we can see how big a difference in wins to expect from all 3 of them. Here are the differences between their 2008 seasons and '09 projections, along with the difference in WPA in 2008 when distributed evenly across all leverage situations (dist.).

exWin differences b/w 2008 and '09 projections


Between the three of them, Tampa can likely expect a drop-off in the range of 5.5-6 wins. The specific distribution among the three could change if Balfour or Howell gets more closing opportunities and Wheeler gets moved to a set-up role, but that's basically just shifting their leverage indexes around without changing the overall picture much.

Throughout their bullpen, they had guys outperforming their histories, and they didn't really have anyone flop on them. The chances of their whole bullpen repeating that are pretty slim. Guys are going to fall on the wrong side of their projections next year, and they aren't going to get such great over-performances. That doesn't mean they can't compete next year, but if they are going to contend with the Sox and the rest of the AL, they'd better start finding production in other places, and fast.

*Multiplying WPA/LI by LI would seem like it would just give you WPA, but it doesn't, because WPA/LI is its own stat, calculated separately from the season WPA total and average LI. WPA/LI takes the WPA of every individual play and divides it by the LI for that play, and then adds those together. LI on its own is just the average LI of all plays. WPA/LI is a better measure of the intrinsic value of a player's performance, as it considers all situations equally, whereas WPA gives you a better idea of the actual effect a player had on his team's games. WPA/LI x LI is a good balance between the two.
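To make the footnote concrete, here is a minimal sketch of the calculation as I've described it. The four plays are made up purely for illustration and the function names are my own, so don't read the numbers as anyone's real stats.

```python
# Hypothetical plays: (WPA of the play, leverage index of the play).
# These values are invented for illustration, not real play-by-play data.
plays = [
    (0.10, 2.0),   # good result in a high-leverage spot
    (-0.03, 0.5),  # bad result in a low-leverage spot
    (0.04, 1.0),
    (-0.06, 1.5),
]

def wpa_total(plays):
    """Season WPA: the plain sum of each play's WPA."""
    return sum(wpa for wpa, _ in plays)

def wpa_li(plays):
    """WPA/LI: divide each play's WPA by that play's LI, then add them up."""
    return sum(wpa / li for wpa, li in plays)

def average_li(plays):
    """LI on its own: the average LI over all plays."""
    return sum(li for _, li in plays) / len(plays)

def expected_wpa(plays):
    """WPA/LI x LI: context-neutral performance scaled back up by average leverage."""
    return wpa_li(plays) * average_li(plays)

def wpa_over_expected(plays):
    """WPA - (WPA/LI) x LI: the portion of WPA that came from timing."""
    return wpa_total(plays) - expected_wpa(plays)

print(f"WPA:               {wpa_total(plays):.4f}")
print(f"WPA/LI:            {wpa_li(plays):.4f}")
print(f"average LI:        {average_li(plays):.4f}")
print(f"WPA/LI x LI:       {expected_wpa(plays):.4f}")
print(f"WPA over expected: {wpa_over_expected(plays):.4f}")
```

In this toy sample the pitcher comes out positive in WPA but slightly negative in WPA/LI, because his good plays happened to land in the high-leverage spots. That gap is the same kind of thing the Rays' pen benefited from.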

**exWins is the number of wins above average expected from a pitcher using a Pythagorean approach with his ERA as the runs allowed and the league average ERA as the runs scored. The formula is explained in one of Tango's posts on the linked article.
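The exact formula is the one in Tango's post, which I'm not reproducing here. As a rough sketch of the general idea only, a Pythagorean version might look like the following. The exponent of 2, the conversion of innings to "games" by dividing by 9, and the 4.30 league ERA in the usage line are all assumptions of mine, so the output won't necessarily match the exWins figures quoted in this article.

```python
def ex_wins(era, innings, league_era, exponent=2.0):
    """Rough wins above average from a Pythagorean win percentage.

    Assumptions (mine, not necessarily the formula in Tango's post):
      - the pitcher's ERA stands in for runs allowed,
      - the league average ERA stands in for runs scored,
      - the Pythagorean exponent is 2,
      - every 9 innings pitched counts as one 'game'.
    """
    if era < 0 or innings < 0 or league_era <= 0:
        raise ValueError("era and innings must be >= 0, league_era must be > 0")
    runs_scored = league_era ** exponent
    runs_allowed = era ** exponent
    win_pct = runs_scored / (runs_scored + runs_allowed)
    games = innings / 9.0
    # Wins above what a .500 pitcher would earn over the same innings.
    return (win_pct - 0.5) * games

# Usage with a hypothetical league ERA of 4.30 (placeholder, not a real figure).
print(round(ex_wins(era=3.01, innings=74.0, league_era=4.30), 2))
```

A league-average ERA gives exactly zero wins above average, and the further the ERA drops below league average, the more each inning is worth.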


hostile postulate said...

really good stuff here.

i'm curious, though - judging by the results you found on the best bullpens in recent history, it would seem logical that building a good, stable bullpen is impractical and is usually left to luck. would you agree with this? i'm also curious as to who the few teams were that stayed at least somewhat consistent, especially in regard to whether or not there are any trends that distinguish those teams from the less consistent ones (ie, solid closer, gaudy k/bb ratios, etc.).

Kincaid said...

I wouldn't say that building a good, stable bullpen is impractical, but I would say that expecting it to stay at a very high level is impractical. The nature of bullpens as having relatively small sample seasons with widely varying leverage situations means that there will invariably be some that skew to the high side in value, and the top bullpens in a given year usually benefit from those upward skews, but they don't seem to be repeatable. Say the Rays' bullpen's true value was somewhere in the 3-4 win range (the top bullpens in baseball the past 5 years averaged 9.38 wins in their ML-leading year and dropped to 3.33 wins the next year). That would still be a pretty good bullpen and very useful to them if they can pick up the slack in their offense. But it's not going to carry them to the playoffs again without picking up the slack elsewhere. I would guess from the data I've looked at that expecting more than 4 or so wins from a bullpen without the help of luck is pushing it in most cases, but good bullpens may be able to consistently give you close to that.

The 5 teams that stayed in the top 5 were the '03-'04 Dodgers, the '04-'05 Cardinals, the '05-'06 Twins and Angels, and the '06-'07 Mariners. All had notable closers, but Putz and Gagne were both notable mostly for those good seasons and have been fairly pedestrian in other years, so I would hesitate to credit those teams' success to a shut-down closer. Izzy is a sort of in-between case: he has had a very distinguished career, but he also had horrible years in '06 and '08, right after the years in question. Nathan and K-Rod were definitive anchors.

In the sample, there was virtually no correlation between a bullpen's K/9, BB/9, or K/BB and how much the bullpen changed in WPA the following year. The 5 that stayed in the top 5 did have a moderate advantage in all 3 of those areas, though, and the 5 teams that lost more than 10 wins in their bullpen performance did worse than the average, mostly in walks:

average of sample: 7.57 K/9, 3.42 BB/9, 2.22 K/BB
5 repeat teams: 7.79 K/9, 3.35 BB/9, 2.32 K/BB
5 biggest drops: 7.70 K/9, 3.62 BB/9, 2.06 K/BB

But, like I said, there was really no correlation throughout the sample (all three variables had a correlation coefficient between -.08 and .08 with change in WPA). It could be useful in predicting the teams most likely to drop a lot or to stay near the top by finding the best and worst bullpens in the sample, but it's probably not all that helpful. The Rays don't look good there, though, and they're far enough on the low end it could be significant:

2008 Rays: 8.03 K/9, 4.11 BB/9, 1.95 K/BB

There were 5 teams in the sample with K/BB ratios below 2, and they did significantly worse than the rest of the sample. One actually improved the next year (from 5.22 to 6.78), but even with that, the 5 teams still dropped an average of 7.87 wins the following year, and the 5 on average (including the '07 Mariners' 6.78 WPA) were actually 1.12 wins worse than average the following year. So they went from being in the top 5 to being over a win worse than an average bullpen the next year. The sample isn't big enough to say anything definitive, but that is a big enough drop that it is probably significant.

The Rays' bullpen also had the second largest gap between FIP and ERA in baseball, which is another bad sign for next year.
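If anyone wants to run the same kind of correlation check I mentioned above on their own numbers, here is a bare-bones sketch. I'm assuming the ordinary Pearson coefficient here, and the two short lists are placeholder values I invented to show the call, not the actual 25-team sample.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Placeholder data, invented for illustration only (not the real sample):
# each bullpen's K/BB in year one and its change in WPA the following year.
k_bb = [2.4, 1.9, 2.1, 2.6, 2.0, 2.3]
wpa_change = [-5.0, -8.5, -2.0, -6.5, -4.0, -7.0]

print(round(pearson_r(k_bb, wpa_change), 2))
```

A result near zero, like the -.08 to .08 range I found, means the stat tells you next to nothing about how far a bullpen will fall.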

