3-D Baseball: On Miguel Cabrera, Value, and the Triple Crown

“In ’67, the triple crown was never even mentioned once. We were so involved in the pennant race, I didn’t know I won the triple crown until the next day, when I read it in the paper.”

-Carl Yastrzemski to the Boston Herald, published September 26, 2012

“Is it too early to say that [Cabrera] has a legitimate shot at a Triple Crown this season hitting in front of Fielder? I don't think so.”

-Fox News sports article, published April 13, 2012

The Triple Crown has grown in stature over the years. That’s not to say it wasn’t a big deal before, but reporters now are asking Carl Yastrzemski about someone else winning it faster than they ever asked him about winning it himself. In 1942, when Ted Williams won it, no one even had a list of previous winners compiled. An AP reporter had to research it for his story on Williams’ feat, and he still missed the most recent occurrence (Joe Medwick, whose Triple Crown just five years earlier escaped detection).

Back then, it was a cool thing. It wasn’t necessarily the historic thing it’s become. It didn’t yet carry the mythical ethos of the pantheon-dwellers -- Williams, Mantle, Yaz, Frank Robinson, etc -- who could once do what for so long escaped their modern counterparts. When someone won it, it didn’t carry the weight of a whole generation of fans who grew up hearing about it and never seeing it. It was just a cool thing.

I can see getting excited about it. It’s an impressive feat. It’s something we’ve waited for for a long time. It's something only a handful of the greats have even done.

And yet, I have a hard time getting excited. It was a great season, sure. A wonderful season at the plate. But the best season I’ve ever seen? Not close. Which means I’ve seen a lot of non-Triple-Crown seasons that were better, because this is the first Triple Crown of my lifetime. You don’t even have to look that hard to find a better season. There’s another one right in front of our noses.

I’m talking, of course, about Miguel Cabrera’s 2011 season.

I know that seems, at least on the surface, like a bit of a contrarian statement. How could he have been better when he hit 14 fewer home runs and drove in 34 fewer runs and didn’t, I don’t know, win the first Triple Crown in four and a half decades? I don’t mean it as a contrarian viewpoint, though. I just think Cabrera hit better in 2011 than in 2012.

Let me explain myself. First, we need to establish what we mean by “better”.

I grew up with a fairly traditional baseball upbringing. I was the son of a catcher who was the son of a catcher, saved only from the tools of ignorance myself by a bad case of sinistrality (a condition my dad only fully forgave me for when my younger sister took up softball and inherited his old gear). I learned the game from proud field generals who would rather hold their ground to a hard-charging runner than hit a home run, even if they dropped the ball in the process.

That’s not a bad way to learn the game. It was a great way to learn it. But part of that upbringing was growing up thinking that Rickey Henderson was Lou Brock-Lite, and that Ted Sizemore was the ideal #2 hitter, and that Tony Gwynn was the best hitter in the game. Part of that was drafting Ozzie Smith in my first fantasy league, a league only three teams deep.

It’s not that those things are necessarily wrong. I don’t remember or care what happened in that fantasy league, other than that I remember drafting my favourite player. I don’t remember or care how many runs the Padres scored with Tony Gwynn anchoring their lineup, or how many games they won. I remember that watching Tony Gwynn was unlike watching anyone else in baseball, because you felt like you knew you were going to see something happen. He was going to put the ball in play, and the defense was going to scramble to field it. When Tony won, it felt like he won because he could almost place the ball at the spot where it landed. When the defense won, it felt like they got away with one. It was exciting to someone who learned the game the way I did.

As far as baseball is a game of entertainment, maybe Tony Gwynn was the best hitter in the game. Arguing for Tony Gwynn over Frank Thomas, or Barry Bonds, or Fred McGriff, or a handful of other guys as a hitter, though, isn’t really an argument of value or production. It’s an argument of what “best” means to begin with. He was better at some things, yeah. Maybe better at the things that are most important to you. At some point, though, it started to hit me that, whatever abstract ideals I might hold about what a hitter should be, the very concrete objective of all hitters is the same. They hit as best they can to win games, and they do so by helping to score runs.

That’s something that’s hard to measure when your statistical upbringing comes mostly from Topps and Donruss. How many runs is Gwynn’s AVG worth? How many runs are Thomas’ walks and extra base hits worth? I don’t know. It doesn’t say on the back of the card. We all know when we watch a game that getting on base is important, that making outs is bad, and that getting to second or third is better than getting to first. How much better? I don’t know. And so the argument becomes about what best actually means, because the units of measurement are not helpful.

The thing is, baseball engenders a strange kind of fanatic. The kind of people who meticulously record every Major League at bat in every inning of every game. Henry Chadwick devised a system for doing so way back in the 19th Century, and people have been doing it ever since. As a result of having all that data, you can actually go back and look at every play and see how many runs scored when a batter singled vs how many runs scored when he walked, and so on. And people did this. They got that same sense that the stats on the backs of the cards were not capturing the value we could sense on the field, only instead of arguing about what meant what, they went and looked at the records to see what meant what.

After looking at thousands and thousands of innings, you can see patterns emerge. A walk adds about .55 runs over making an out. A single adds about .7 runs. A home run, about 1.65 runs. It depends on the situation, of course, and the offensive environment, but on average the event values are pretty stable. This is where linear weights come from. People started picking up on the pattern and noticing that you could tie a hitter’s batting line to actual, concrete terms of run production, rather than getting stuck arguing about what is and isn’t important.

So that is what I mean by better: what does more to produce the runs that help the team win games? In that sense, there’s really no argument for Gwynn over Thomas. You put early-90s Frank Thomas in a lineup, and you’re going to score more runs than if you put early-90s Gwynn in that same lineup. Maybe this doesn’t hold in, say, 19th Century style baseball, or some other extreme environment. I don’t know. But it definitely holds in early-1990s MLB.

In the same way, Miguel Cabrera hit better in 2011 than in 2012.

The batting title means something in the same l'art pour l'art kind of way that Tony Gwynn was awesome. Cabrera's .330 average this year looks just as cool in the bold font on the back of a baseball card as his .344 average last year. A hundred years from now, there will still be fanatics thumbing through the pages of books (or whatever form information will take in 100 years) and seeing Bill Mueller's and Freddy Sanchez' names, the same way I saw George Sisler's and Lefty O’Doul's when I was a kid. The black ink still means something in the aesthetic of the art of hitting.

Aesthetically, it matters that Cabrera hit .330 in 2012 instead of 2011, when Adrian Gonzalez, Michael Young, and Victor Martinez all hit higher than that. It matters that he hit .330 in the AL instead of the NL, where Buster Posey hit .336 and Melky Cabrera a mathematically-exempt .346.

None of those things matter much to the Tigers’ chances of winning, though (well, what Victor Martinez does obviously matters to the Tigers, but you get the point). To the Tigers, what matters is that .330 instead of .344 is 9 fewer hits over the course of the year. It means dropping a handful of runs, and maybe losing a win somewhere along the line. The difference isn’t the point where the black ink turns grey, but anywhere that hits are lost.

What matters even more is that he drew 42 fewer walks (37 of them unintentional). A 14-point drop in batting average is not that huge, but his OBP dropped more than 50 points, from .448 to .393. A .393 OBP is still great. Only five hitters had a higher OBP this year.

.448 is a phenomenal OBP. That's about how often Barry Bonds got on base in his career. Barry Lamar Bonds. The greatest hitter I've ever seen, and getting on base was his greatest skill. And Cabrera matched that, albeit for one year and not twenty-two. .393 is really good, but it doesn't conjure up Barry Bonds. It conjures up Bobby Abreu or John Olerud or Mark McGwire. Great hitters, on-base machines. But not Barry Bonds.

Cabrera hit 14 more homers in 2012 than 2011. That matters too, quite a bit. It adds back a large chunk of the value lost from all those walks and hits that turned into outs. The question is, how much of that lost value does it make up?

Turning an out into a home run adds as many runs, on average, as turning two or three outs into walks. In fact, 14 is just enough home runs to offset the loss of 37 unintentional walks, if the 37 walks turn into 14 home runs and 23 outs.
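That arithmetic is easy to check with a quick sketch in Python, using the approximate over-an-out values quoted earlier (.55 for a walk, 1.65 for a home run). The variable names are made up for illustration, and the weights are rounded averages, so treat the result as a ballpark figure.

```python
# Approximate run values relative to making an out, as quoted earlier in the post.
WALK_OVER_OUT = 0.55
HR_OVER_OUT = 1.65

# Hypothetical trade: 37 unintentional walks become 14 home runs and 23 outs.
walks_lost = 37
homers_gained = 14

# Each lost walk gives back the value a walk had over an out.
runs_lost = walks_lost * WALK_OVER_OUT
# 14 of those plate appearances become home runs; the other 23 are plain outs,
# which are worth zero on this over-an-out scale.
runs_gained = homers_gained * HR_OVER_OUT

net = runs_gained - runs_lost
walks_per_homer = HR_OVER_OUT / WALK_OVER_OUT

print(f"runs lost: {runs_lost:.2f}, runs gained: {runs_gained:.2f}, net: {net:+.2f}")
print(f"one out-to-homer swap is worth about {walks_per_homer:.1f} out-to-walk swaps")
```

Under those rounded weights the trade comes out a little under 3 runs ahead, which is about as close to break-even as whole numbers of home runs allow.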

That’s not what happened, though. Cabrera hit into 43 more outs* in 2012 than 2011. He also hit eight fewer doubles. Turning a double into a home run is about as good as turning one out into a walk. When the extra home runs start coming at the cost of other types of hits instead of outs, then those 14 home runs are no longer enough to cover the big on-base drop.

*I am just using AB-H+SF as shorthand for outs; it does not include outs on the bases or double plays, and it does not remove reach-on-errors, etc.

Linear weights illustrate this. The following table shows how many of each event Cabrera gained or lost going from 2011 to 2012, along with the average impact of that event on team scoring (i.e. its linear weights value) and the total run impact:

event   difference   lwts    total
outs    +43          -0.26   -11.1
1B      +2            0.45     0.9
2B      -8            0.76    -6.1
3B       0            1.04     0.0
HR      +14           1.40    19.6
niBB    -37           0.30   -11.2
HBP      0            0.33     0.0
total                         -7.9

On top of that, you’ve got small differences in intentional walks and double plays, both of which favour 2011 Cabrera, so the total gap comes to roughly 10 runs. FanGraphs puts the difference at 9 runs. Baseball-Reference has it at 13.
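The table's arithmetic can be reproduced in a few lines of Python. This is only a sketch: the differences and weights are copied from the table as printed, and the dictionary layout is an illustrative choice. Individual rows can land a tenth of a run away from the printed ones, presumably because the printed weights are rounded, but the total agrees.

```python
# (change in count from 2011 to 2012, linear weight in runs) for each event,
# copied from the table in the post.
events = {
    "outs": (43, -0.26),
    "1B": (2, 0.45),
    "2B": (-8, 0.76),
    "3B": (0, 1.04),
    "HR": (14, 1.40),
    "niBB": (-37, 0.30),
    "HBP": (0, 0.33),
}

# Run impact of each event type: change in count times run value per event.
run_impact = {name: diff * weight for name, (diff, weight) in events.items()}
total = sum(run_impact.values())

for name, runs in run_impact.items():
    print(f"{name:>5} {runs:+6.1f}")
print(f"total {total:+6.1f}")
```

The home runs add back close to 20 runs, but the extra outs and lost walks cost about 11 each, and the lost doubles another 6.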

In cruder terms, there’s just no way a 20-point gain in SLG can offset the runs you lose with a 50-point drop in OBP. At least not if the run environment stays about the same (which, in this case, it did).

But, like we said earlier, hitting is about scoring runs. Cabrera may have done more positive things at the plate in 2011 than in 2012, but his positive things drove in 139 officially sanctioned* runs in 2012, 34 more than in 2011. That's a lot of runs. How can he have been more productive in 2011 when his at bats produced so many more runs in 2012?

*143 runs actually scored during his PAs in 2012, but 4 were not credited as RBIs due to either errors or double plays

It is important to understand that runs are the fruits of a team effort. The majority of runs scored in MLB result from a combination of successes by multiple hitters or baserunners. They require runners getting on base and advancing closer to home. They require avoiding outs so that your team gets more chances to drive those runners in. All of those things contribute to run scoring.

RBIs only tell you who was at bat at the end of the chain. They don’t tell you about the hitters and baserunners who set up good scoring opportunities. They don’t tell you about the kinds of scoring opportunities a hitter sets up for the following hitters. They don’t tell you when the hitter creates runs out of difficult situations rather than simply converting good chances set up by his teammates.

It’s the back-of-the-baseball-card problem all over again. RBIs are simple record keeping, a description of what happened when a hitter was at bat. They don’t measure the actual impact of the plays, and they don’t attempt to apportion the credit among all the players who contributed to the scoring sequence. Did Miguel Cabrera drive in more runs in 2012 because he was more productive with his at bats, or did he drive in more runs because the runners in front of him were more productive in setting up scoring chances? We don’t know. RBIs don’t tell us.

We can check, though. We can go through the record of every Miguel Cabrera at bat in 2011 and 2012 and see exactly what the hitters in front of him produced. The simplest thing to check, because it is shown on both Baseball-Reference and Baseball Prospectus, would be the number of baserunners Cabrera had in each season.

The number of runners a hitter has on base to drive in is a major factor in his RBI total. Take, for example, Coco Crisp. He had 213 runners on base in 508 PAs. Ryan Doumit had 398 runners on base in his 528 PAs. Doumit drove in 75 runs this year to Crisp’s 46, even though Crisp hit much better with runners on base and with runners in scoring position (Doumit did hit a bit better overall because of his better bases-empty performance). The huge gap in baserunners, and not their actual performance, drove the difference in RBIs between the two players. This is probably RBI’s biggest blind spot.

That is not what happened with Cabrera, though. Cabrera actually had fewer runners on base in 2012 than he did in 2011 (444 to 460). In 2012, he drove in 21% of all runners who were on base when he came to bat, compared to just 16% in 2011. At first glance, Cabrera not only did more in 2012, he actually did more with less.
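Percentages like these can be reached by counting only the runners other than the batter himself, which means subtracting his home runs from his RBIs before dividing by runners on base. That definition is an assumption about how the figure is built, and the season home run totals used below (30 in 2011, 44 in 2012) are Cabrera's actual totals but are not stated in the text above. The function name is hypothetical.

```python
def others_driven_in_rate(rbi, home_runs, runners_on):
    """Share of runners on base (not counting the batter) that the hitter drove in.

    Assumes every home run accounts for exactly one RBI that was the batter himself.
    """
    return (rbi - home_runs) / runners_on


rate_2011 = others_driven_in_rate(rbi=105, home_runs=30, runners_on=460)
rate_2012 = others_driven_in_rate(rbi=139, home_runs=44, runners_on=444)

print(f"2011: {rate_2011:.0%}  2012: {rate_2012:.0%}")
```

With those inputs the rates round to 16% and 21%, matching the figures quoted.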

The total number of baserunners isn't the only issue, though. Overall baserunners is still a very general figure, a conglomeration of 24 base-out states that present a wide range of RBI opportunities. It is much easier to drive in a runner from second than from first. It’s easier to drive in a runner from second with two outs than with no outs. Counting the overall baserunners is a good first step, but where those baserunners were still makes a big difference.

The most extreme example is a runner on third with less than two outs. It is still just one baserunner, but the hitter gets him in about half the time. Given the same number of PAs in each situation, a hitter would get about 10 RBIs with a runner on third and less than two outs for every 1 RBI with a runner on first. He’d get 3 or 4 for every RBI with a runner on second. He’d get about as many RBIs as he would with the bases loaded and two outs.

In 2011, Cabrera came to bat 30 times with a runner on 3rd and less than 2 outs. He was intentionally walked 5 of those times, which leaves 25 opportunities. In 2012, he came up 58 times with a runner on third and less than 2 outs, and was again IBBed 5 of those times. That's 53 opportunities - 28 more than in 2011.

When you count up not just the total baserunners Cabrera had, but how many of them were on first, second, or third, and how many outs there were when he came to bat, you see he actually had better RBI opportunities in 2012 than in 2011. Even though he had more baserunners in 2011, his opportunities in 2012 were worth about 12% more RBIs (an average performance would yield 82 RBIs given Cabrera’s 2012 opportunities, compared to 73 given his 2011 opportunities).

Still, an increase from 105 to 139 is a bigger jump than we’d expect from the difference in opportunities alone. Part of the difference is a result of the hitters in front of him creating more runs in 2012 than they did in 2011, but part of it is also that Cabrera was more productive at driving them in.
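To see how the jump splits between opportunity and performance, here is a small sketch using only the four numbers above. The variable names are illustrative, and dividing actual RBIs by expected RBIs is just one simple way to express performance relative to opportunity.

```python
# RBIs an average hitter would get given Cabrera's opportunities, and his actual RBIs.
expected_rbi = {2011: 73, 2012: 82}
actual_rbi = {2011: 105, 2012: 139}

# How much richer the 2012 opportunities were, and how much the actual total grew.
opportunity_gain = expected_rbi[2012] / expected_rbi[2011] - 1
actual_gain = actual_rbi[2012] / actual_rbi[2011] - 1

# Actual RBIs per expected RBI in each season.
rbi_vs_expected = {year: actual_rbi[year] / expected_rbi[year] for year in actual_rbi}

print(f"opportunities up {opportunity_gain:.0%}, RBIs up {actual_gain:.0%}")
for year, ratio in rbi_vs_expected.items():
    print(f"{year}: {ratio:.2f} RBIs per expected RBI")
```

Opportunities rose about 12% while RBIs rose about 32%, so the better chances explain only part of the increase.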

Driving in runs is only one element in the chain, though. You still need hitters getting on base and moving runners over. You still need hitters preserving outs so that the team can string more at bats into each inning. You still need baserunners taking extra bases and avoiding getting thrown out. All of those are important to setting up and creating runs. Just like the hitters in front of Cabrera are responsible for setting him up with good scoring opportunities, so too is Cabrera responsible for setting up the guys behind him.

In the subset of run creation that deals with driving runners in, Cabrera performed better in 2012. In the subset involving setting up opportunities for his team to score, he was better in 2011. So how do we balance these into one overall comparison?

Well, that’s basically what linear weights do. They assign values to events with each aspect of the run-creation sequence in mind. Linear weights don’t really know how well Cabrera did or didn’t take advantage of his specific opportunities, though. They don’t know that he did so well at driving in the guys who got on base for him, beyond just knowing that people who hit really well are usually good at driving in runs. If he was just more clutch or more productive with his timing in 2012, linear weights will miss that.

Rather than using linear weights, we can instead use something that assigns value on a case-by-case basis, something that measures the impact of each play based on its specific context. That’s where run expectancy comes in.

Run expectancy is like linear weights, except instead of measuring the values of events (walks, singles, etc), it measures the values of situations. How many runs does a team score when it gets a runner on first with no outs? How about runners on second and third with one out? You can go through the records of thousands and thousands of innings and find out. And once you do that, you can measure how much value is added in each PA by seeing how much run expectancy was added in going from one situation to the next.
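As a rough sketch of the mechanics, assume the play-by-play records have already been reduced to pairs of (base-out state, runs scored from that point to the end of the inning). Averaging the runs for each state gives the run expectancy table, and each plate appearance is then credited with the change in expectancy plus any runs that scored on the play. The function names, the state labels, and the tiny sample are all invented for illustration; a real table comes from thousands of innings.

```python
from collections import defaultdict


def run_expectancy(observations):
    """Average runs scored through the end of the inning from each base-out state."""
    totals = defaultdict(lambda: [0, 0])  # state -> [total runs, number of observations]
    for state, runs_rest_of_inning in observations:
        totals[state][0] += runs_rest_of_inning
        totals[state][1] += 1
    return {state: runs / count for state, (runs, count) in totals.items()}


def runs_added(re_table, before, after, runs_on_play):
    """Value of one plate appearance: change in run expectancy plus runs scored.

    Pass after=None when the play ends the inning (expectancy drops to zero).
    """
    re_after = 0.0 if after is None else re_table[after]
    return re_after - re_table[before] + runs_on_play


# Toy sample with made-up numbers. A state is (runners, outs): "---" is bases
# empty, "1--" is a runner on first.
sample = [
    (("---", 0), 0), (("---", 0), 1), (("---", 0), 0), (("---", 0), 1),
    (("1--", 0), 1), (("1--", 0), 0), (("1--", 0), 2), (("1--", 0), 1),
]
table = run_expectancy(sample)

# A leadoff walk moves the inning from bases empty to a runner on first, no outs.
walk_value = runs_added(table, before=("---", 0), after=("1--", 0), runs_on_play=0)
print(table, walk_value)
```

Summing `runs_added` over every plate appearance in a hitter's season gives a season total of the kind discussed here.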

As such, it provides a context-sensitive look at production that linear weights lack. It weights at bats based on their impact on run scoring, so that a bases loaded at bat might be worth two or three normal at bats, while a bases empty, two out at bat would be less than half as important as a normal at bat. It measures the clutch or timing aspects of run production, such as producing well with men on base and driving in lots of runs.

When you break down Cabrera’s seasons using run expectancy, he was still more productive in 2011 than 2012. In fact, the margin approximately doubled to over 20 runs (+67 in 2011, +46 in 2012). Even with all that extra RBI production in 2012, including measurements of Cabrera’s timing and clutch and performance with runners on base just pushes his 2011 season further ahead.

The difference is in how well he set up scoring opportunities for the hitters behind him. It’s the getting on base, the moving runners over even when he doesn’t score them, the leaving his team more outs to work with. Those things all add up to runs, and in 2011, Cabrera did those things well enough to more than cover the difference in driving runners in.

You can break down run expectancy to reflect this by splitting it into the number of runs expected to score in that plate appearance vs how many are expected to score over the remainder of the inning. In 2012, Cabrera scored 54 more runners during his PAs than you’d expect from an average performance. In terms of setting up the rest of the inning, though, Cabrera actually left the guys behind him with less to work with than an average hitter would have. His +46 run expectancy breaks down to +54 at driving runners in and -8 at setting up the rest of the inning.

That doesn’t mean Cabrera was a negative force in any way in 2012. The main reason he left the hitters behind him less to work with is that he erased a lot of runners who would have been on base by driving them in himself. That’s obviously a good thing. But you can drive runners in while still doing things to contribute elsewhere. You can get on base and extend innings and move runners over: things that don’t drive in runs directly, but which create a lot of runs for the guys behind you to drive in.

Those were things Cabrera did well in 2011. He was still driving in runs, albeit at a less torrid pace than in 2012 (+25 runs above average in driving runners in), but he was also improving his team’s outlook for the rest of the inning. Cabrera added another 42 runs above average by setting up the inning for the hitters behind him, a full 50 run improvement from his 2012 performance. That’s more than enough to make up for the lower RBI production in 2011.
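Putting the two halves side by side makes the bookkeeping easier to follow. This sketch only adds up the figures already quoted; the key names are illustrative labels, not official stat names.

```python
# Runs above average, split into runs scored during Cabrera's own plate
# appearances ("driving_in") and the change in expected runs over the rest
# of the inning ("setting_up").
splits = {
    2011: {"driving_in": 25, "setting_up": 42},
    2012: {"driving_in": 54, "setting_up": -8},
}

totals = {year: sum(parts.values()) for year, parts in splits.items()}
driving_in_edge_2012 = splits[2012]["driving_in"] - splits[2011]["driving_in"]
setting_up_edge_2011 = splits[2011]["setting_up"] - splits[2012]["setting_up"]
margin = totals[2011] - totals[2012]

print(totals)
print(f"2012 edge driving runners in: {driving_in_edge_2012}")
print(f"2011 edge setting up the inning: {setting_up_edge_2011}")
print(f"net margin for 2011: {margin}")
```

The 29-run edge in driving runners in for 2012 is outweighed by the 50-run edge in setting up the inning for 2011, leaving 2011 ahead by 21.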

All of this may sound very theoretical compared to all those factual runs Cabrera drove in. Setting up opportunities may or may not lead to runs. They’re not guaranteed. They still depend on teammates to continue the chain. The fact is, though, those runs happen, just like the runs that happened because someone got on base for Cabrera. And Cabrera deserves credit for creating those runs.

Following Miguel Cabrera’s 688 PAs in 2011, the Tigers scored 317 runs over the remainder of the inning. That’s a lot of runs, thanks in large part to Cabrera’s role in setting the table. Following Cabrera’s 695* PAs in 2012, the Tigers scored 270 runs. Those are the extra runs we see in Cabrera’s 2011 performance, actually turning up on the scoreboard.

*excluding the two times he came up for the second time in an inning, to avoid double counting (though that makes very little difference)

Granted, Cabrera isn’t the only factor in those runs, just like he’s not the only factor in the runs he drives in. The other hitters in the lineup factor in as well. If the guys behind him hit worse this year, of course the Tigers would likely score fewer runs over the remainder of the inning, no matter what Cabrera does.

It would be a bit odd if that were the case, since they added Prince Fielder behind Cabrera in 2012 and moved Cabrera up to third in the lineup, but we can still check. Here’s how the hitters following Cabrera in an inning hit in 2011 and 2012:

year   R     AVG     OBP     SLG     wOBA    PA*
2011   317   0.284   0.338   0.423   0.326   1376
2012   270   0.282   0.355   0.437   0.339   1244

*PA excludes IBB and SH

We know from the play-by-play records that the hitters in front of Cabrera set up better scoring chances in 2012 than 2011. Entering Cabrera’s PAs, the Tigers would have expected to score about 348 runs with average hitting performances through the end of the inning in 2012, compared to 339 in 2011. And we know that the hitters behind Cabrera hit better in 2012 than in 2011. And still, the Tigers managed to score more runs in those innings in 2011 than in 2012, 424 to 413.
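A back-of-the-envelope way to line those totals up is to subtract the expected runs from the actual runs in each season. This is a sketch with illustrative field names, and it is not the same calculation as run expectancy added, since the runs that actually scored also depend on what the hitters behind Cabrera did.

```python
innings = {
    # expected: runs an average lineup would score from Cabrera's PA through
    #           the end of the inning, given the situations he came up in
    # actual:   runs the Tigers actually scored over that span
    # after_pa: runs scored after Cabrera's PA was over
    2011: {"expected": 339, "actual": 424, "after_pa": 317},
    2012: {"expected": 348, "actual": 413, "after_pa": 270},
}

summary = {}
for year, row in innings.items():
    summary[year] = {
        "above_expected": row["actual"] - row["expected"],
        "during_pa": row["actual"] - row["after_pa"],
    }
    print(year, summary[year])
```

The Tigers beat expectation by 85 runs in those innings in 2011 and by 65 in 2012. The 143 runs during Cabrera's own PAs in 2012 matches the footnote on his RBI total, and the 20-run gap between the two seasons is in the same neighborhood as the run expectancy margin.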

So maybe it’s more than a little plausible that Cabrera was, in fact, a major reason the Tigers scored more in 2011. Maybe he really was more productive at the plate in 2011, just like the numbers suggest. Maybe all those extra runs he set up in 2011 really did cross the plate. If we accept that we are rating hitters based solely on how much they contribute to team run production, then maybe Cabrera really did hit better in 2011 than in 2012.

2012 still has the bold font on the back of the baseball card. It will still live on as a memorable season, and a really cool statistical performance. It will mean a lot to a lot of people. In an abstract, personal way, maybe it’s the better season. In the concrete terms of the results on the field, though, it’s the Tony Gwynn to 2011’s Frank Thomas. A great, great season. Just not a better season.
