The first thing to note is that FIP is not dependent on strikeouts in the way most people think. As we learned in last month's primer, FIP is dependent on 4 different values, one of which is strikeouts, and another of which is balls in play. Both are important to FIP, as are BB and HR. FIP is a balance of all 4 areas for a pitcher, with each area weighted according to its observed value in actual MLB games. It does not favour strikeouts as a style, and in fact does not favour them at all when they come with high walk and HR totals. It is entirely possible to have a poor FIP as a high-strikeout pitcher.
FIP, in general, does not favour strikeouts in any way other than how their observed value relates to the other 3 areas included in the formula. Now that that is out of the way, FIP does overvalue strikeouts if a team's defense is very good, but only slightly, and, conversely, it actually undervalues strikeouts if a team has poor defense. To see how this works, let's revisit the table of values for the 4 types of events covered by FIP:
Remember that the coefficients used by FIP are derived by subtracting the value of a BIP from the value of each other event and multiplying by 9. FIP is dependent on the value of these 4 types of events being close to those in the above table. When team defense differs significantly from average, it has no effect on the value of HR, BB, or SO (actually, that's not true, because those values are all sensitive to the run-environment, so when a team allows more or fewer runs, those values will change slightly, but for our purposes, we'll ignore that). Defense does, however, have an obvious effect on the value of a ball in play. With a good defense, a BIP will be worth less (to the offense) than -.04. With a bad defense, a BIP will be worth more. So the coefficients in FIP are in fact slightly off when a defense is not close to average, because FIP is tuned to fit the average value of a ball in play.
Here enters the beauty of FIP. Because of how the coefficients are derived, the formula can be easily tuned to fit any level of team defense. FIP, it turns out, is not wrong for pitchers in front of good or bad defenses; it just has to be tuned differently. All we have to do is recalculate the value of a ball in play and re-derive the coefficients.
Imagine a pitcher had as good a defense as he could possibly have. Say, for example, he had the 2009 Mariners' defense. This defense was worth 85 runs above average, or roughly .02 runs per BIP. An average BIP is worth -.04 runs, so an average BIP allowed in front of this defense is worth -.06 runs (remember that lower is better for the defense). We recalculate our coefficients and get:
13.13*HR + 3.23*BB - 1.90*SO (*)
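If you want to see the re-derivation in action, here is a minimal Python sketch. The run values for HR, BB, and SO are illustrative assumptions picked to land close to the coefficients quoted here (they are not the exact table values), and the function name is hypothetical. Only the BIP values of -.04 and -.06 and the "subtract the BIP value, multiply by 9" rule come from this article.

```python
# ASSUMED run values per event, chosen to roughly reproduce the article's
# coefficients. They are illustrative, not the article's exact table.
EVENT_RUN_VALUES = {"HR": 1.40, "BB": 0.30, "SO": -0.27}


def fip_coefficients(bip_value, event_values=EVENT_RUN_VALUES):
    """Coefficient for each event = (event run value - BIP run value) * 9."""
    return {event: (value - bip_value) * 9 for event, value in event_values.items()}


average_defense = fip_coefficients(-0.04)  # average BIP value from the text
elite_defense = fip_coefficients(-0.06)    # BIP value behind a +85 run defense

for event in EVENT_RUN_VALUES:
    print(f"{event}: {average_defense[event]:6.2f} -> {elite_defense[event]:6.2f}")
```

Whatever run values you plug in, lowering the BIP value by .02 raises every coefficient by .02 * 9 = .18. That is why the strikeout coefficient moves toward zero while the HR and BB coefficients grow.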
Strikeouts did indeed lose value, and walks and home runs both became more costly. How much difference does this make, though? Let's revisit our hypothetical pitcher who strikes out 6.7 per 9. Let's say he also walks 2.1 per 9 (after adding in HBP and subtracting out IBB) and allows .33 HR per 9.
Using the traditionally derived coefficients, he'll have an FIP of about 2.84. Keeping in mind that we also need to calculate a new league constant to scale FIP to ERA since we changed the coefficients, we find that he would have an FIP of about 2.79 using our defense-specific coefficients. Traditional FIP underestimated him by about .05 ER/9, or about 1.2 runs per 200 innings, compared to defense-sensitive FIP.
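Here is a rough Python sketch of that comparison. The defense-specific coefficients, the pitcher's rates, and the 4.32 league ERA come from this article. The traditional coefficients are backed out by undoing the .18 shift, and the league per-9 rates are placeholder assumptions, so the output should land close to the figures above without matching them exactly.

```python
# Defense-specific coefficients quoted in this article (+85 run defense).
DEFENSE = {"HR": 13.13, "BB": 3.23, "SO": -1.90}
# Traditional coefficients, backed out by undoing the .02 BIP shift (.02 * 9 = .18).
TRADITIONAL = {event: coef - 0.18 for event, coef in DEFENSE.items()}

# ASSUMED league rates per 9 innings: placeholders, not figures from this article.
LEAGUE = {"HR": 1.05, "BB": 3.50, "SO": 7.00}
LEAGUE_ERA = 4.32  # league average ERA used in this article

# The hypothetical pitcher's rates per 9 innings.
PITCHER = {"HR": 0.33, "BB": 2.10, "SO": 6.70}


def weighted_events_per_inning(coefs, per9):
    """Coefficient-weighted events, converted from per-9 rates to per inning."""
    return sum(coefs[event] * per9[event] for event in coefs) / 9


def fip(coefs, per9):
    """FIP with the constant recalculated so league-average FIP equals league ERA."""
    constant = LEAGUE_ERA - weighted_events_per_inning(coefs, LEAGUE)
    return weighted_events_per_inning(coefs, per9) + constant


traditional_fip = fip(TRADITIONAL, PITCHER)
defense_fip = fip(DEFENSE, PITCHER)
gap = traditional_fip - defense_fip

print(f"traditional FIP:       {traditional_fip:.2f}")
print(f"defense-sensitive FIP: {defense_fip:.2f}")
print(f"gap: {gap:.3f} per 9, or {gap * 200 / 9:.1f} runs per 200 innings")
```

Because every coefficient shifts by the same .18, the gap works out to .02 times the difference between the league's and the pitcher's combined HR, BB, and SO per 9. The fewer non-BIP events a pitcher has relative to the league, the more a good defense helps him.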
Since we are already adjusting for defense in our calculations, we can go a step further in incorporating defensive context into our valuation. The league average ERA this year was 4.32, but we know that won't be the case given a +85 run defense. A league average pitcher, given +85 defense, will have about a 3.84 ERA. Using that figure, we can recalculate our constant for FIP and calculate a new number that is an estimate of actual ERA, not of ERA minus defensive support. This means that we would expect our 2.79 FIP pitcher to have an actual ERA of about 2.31. This is, of course, not as valuable as an ERA of 2.31 in front of an average defense, so we have to account for that as well. If average is 3.84, then a replacement level starter (using a .380 winning percentage as replacement level) will have an ERA right at 5.
A 2.31 ERA is good for a .749 winning percentage against a league average 4.32 ERA. Our replacement level pitcher, who is normally .380, is not .380 against the league with his +85 defense, however. He is .437. That means that our pitcher is worth about 6.9 WAR per 200 innings. Using his traditional FIP, we would give him a .679 winning percentage over a .380 replacement level, which comes out to 6.6 WAR per 200 innings. Our hypothetical** pitcher actually gained .3 wins once we considered the nuances of pitching to contact in front of a stellar defense. That's actually quite a bit. It's worth over a million dollars to the pitcher on the open market.
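The WAR arithmetic here can be checked with a short Python sketch. It assumes WAR is simply the gap between the pitcher's winning percentage and the replacement level winning percentage, multiplied by the number of 9-inning games pitched. The function name is hypothetical; the winning percentages are the ones quoted above.

```python
def war_per_innings(win_pct, replacement_win_pct, innings=200):
    """Wins above replacement: win% edge times the number of 9-inning games."""
    return (win_pct - replacement_win_pct) * innings / 9


# Defense-sensitive: .749 pitcher vs. a .437 replacement behind the same defense.
defense_sensitive = war_per_innings(0.749, 0.437)
# Traditional: .679 pitcher vs. the usual .380 replacement level.
traditional = war_per_innings(0.679, 0.380)

print(f"defense-sensitive: {defense_sensitive:.1f} WAR per 200 IP")
print(f"traditional:       {traditional:.1f} WAR per 200 IP")
print(f"difference:        {defense_sensitive - traditional:.1f} wins")
```

Under that assumption the two figures round to 6.9 and 6.6, matching the .3 win gap described above.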
At the beginning of this article, I said that the traditional coefficients were only slightly off with an extreme defense. Here, we find that they can be off by as much as .3 wins if we take a Cy Young caliber contact pitcher and put him in front of the best defense on the planet. Can we really write off .3 wins as slight enough to use traditional FIP as a stand-in for defense-sensitive FIP if we want to capture the value of pitchers separate from, but in the context of, their defense?
If the value were ever really that high, I'd say no. It isn't, though, at least not if what we want to measure is how defense affects a pitcher's approach. Everything we plugged into the calculations above was a purely after-the-fact measurement, but the only things a pitcher can leverage in adjusting his approach are expected values. That means that if our +85 defense only projects to be worth 60 runs a year going forward (I'm making that number up for illustration purposes), then the pitcher can only leverage about 70% of those 85 runs by adjusting his approach. Even though the defense ended up saving 85 runs, there is no way the pitcher could have leveraged the 25 they saved over their projection without knowing in advance that they would outperform the projection (which, by the loose definition of a projection, you can't). He also can't leverage his full home run rate, which in this case is probably at least to some extent anomalous. If he knew ahead of time that he would only allow .33 HR per 9 giving up that much contact, he could leverage contact quite a bit (the .3 wins arrived at above being "quite a bit" in this case), but knowing only his projected home run rate, he can leverage up to his projection, not beyond.
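To put a number on that, here is a small Python sketch that scales the after-the-fact BIP adjustment down to what the pitcher could have known in advance. It assumes the per-BIP value scales in direct proportion to the defense's total runs saved, and the function name is hypothetical. The 60-run projection is the made-up figure from the paragraph above.

```python
def leverageable_bip_shift(observed_shift_per_bip, observed_runs, projected_runs):
    """Scale the observed per-BIP adjustment down to the projected defense."""
    return observed_shift_per_bip * projected_runs / observed_runs


observed_runs = 85    # runs the defense actually saved
projected_runs = 60   # made-up projection used for illustration
share = projected_runs / observed_runs
shift = leverageable_bip_shift(0.02, observed_runs, projected_runs)

print(f"share of the defense the pitcher can leverage: {share:.0%}")
print(f"usable BIP adjustment: {shift:.3f} runs per BIP instead of 0.020")
```

Feeding the smaller adjustment back into the coefficient derivation shrinks the gap between traditional and defense-sensitive FIP by the same proportion.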
For these reasons, our hypothetical pitcher is never going to actually be undervalued by .3 wins per 200 innings using traditional FIP just because he pitches to contact, even if we give him by far the best defense in baseball.
This also means that just because a pitcher's ERA is better than his FIP, even if that difference is because of defensive support, it does not mean the pitcher was utilizing a better defense if his team defense was not far above average overall. Let's create a new hypothetical pitcher who has the same FIP as the one above and an ERA in the 2.2s, but whose team defense we measure to be a bit below average. In this case, we don't know that the difference between the pitcher's FIP and ERA is because the pitcher got better than average defensive support, but he might have. Let's assume that he did. Does he get credit for pitching to contact and using that good defensive support? In this case, no, because whatever his defensive support ends up being, we expect it to be below average, so deciding to pitch to contact is a bad choice. In terms of how this pitcher can leverage his defense, strikeouts are actually slightly underrated (slightly enough that we can basically ignore it, but they are underrated) even though the pitcher's ERA over-credits him for good defensive support, because he has no way to leverage that defensive support based on decisions about pitching approach made before the fact. Our pitcher is now far overrated by ERA because of defensive support and not at all underrated by FIP because of an ability to leverage good defense.
We return to our initial question about using FIP for pitchers who receive good defensive support: should a pitcher be punished for pitching to contact in order to leverage a good defense? The answer is somewhat complicated. No, a pitcher should not be punished for leveraging good defense if he is doing it properly, but FIP can actually be tweaked to account for that pretty easily because the methodology for deriving the coefficients lends itself perfectly to adjusting the formula for differing values of balls in play. Traditional FIP and defense-sensitive FIP track very closely together, though, to the point that the difference is mostly negligible and is not a reason to stop using FIP in almost any conceivable case. Even in cases where defense-sensitive FIP is a bit off from FIP, FIP will still capture the context of pitching to defense, while still separating the actual value of the defense, better than ERA (note how much closer defense-sensitive FIP, after we recalculated the coefficients to take defensive context into account, was to the traditional measure than it was to the predicted ERA once we also added in the value of the defense). Furthermore, you can't tell if a pitcher even had the opportunity to properly leverage his defensive support just by comparing FIP to ERA, even if you assume that the difference is due to defensive support. A pitcher with a 2.8 FIP and a 2.2 ERA, even assuming that his ERA includes a lot of defensive support, did not necessarily ever have the opportunity to leverage that support by choosing to pitch to contact. In fact, the degree to which a pitcher can leverage his defense has nothing to do with his defensive support itself, but with the projected value of his defensive support before the fact and with his expected rates of HR, BB, and SO given a certain pitching strategy.
*NOTE: You won't be able to exactly replicate any of these values from the numbers given here because of rounding discrepancies, so if you are trying to work through the math on your own and find some differences, that is probably why.
**NOTE: This pitcher truly is hypothetical. Don't believe me? What real life pitcher threw in a 4.32 ERA league in 2009? That's mostly why the value doesn't match up at all with the pitcher you looked up, by the way.