Searching for Steroids: a Critique of the Evidence

At this point, steroids are not really the hot topic they once were. The more time that passes and the more names that come out, the harder it is to maintain genuine outrage. Revelations that once shook fans with anger now draw barely a shrug from many. There is no longer a Pete Rose figure to heap shame upon, no lone, brash figure heading the scandal, never failing to oblige us with enough asinine behaviour to keep our vitriol fresh. The face of the steroid scandal has shifted ever more quickly in a revolving lineup of star after star falling from grace. We've been submerged in scandal for long enough that we've learned to breathe in it, learned to live with the heartbreak of losing heroes as little more than annoyance. Of course, there are still plenty of fans who maintain a hard-line stance against steroids. I know some who would still have records and stats wiped and users kept far from Cooperstown. On the whole, though, while disapproval remains, a numbness has set in, and most fans just want to accept the prevalence of steroids as an inevitability of the past, as long as we keep it clean going forward. So while the latest news on the issue isn't insignificant, it's lighter on the stigma than it has been in years, and I think that makes this, in some ways at least, a better time than ever to discuss it.

I have, of course, discussed (I use the civil term liberally here) the issue many times already. Coming from a family of die-hard Cardinal fans as I do, I got to see the full fury on both sides from early on, when the whispers started about McGwire and then when they rose into all-out shouting. I had my opinions, and I expressed them. My feelings have evolved over time, though, and things have now settled enough that I feel comfortable writing publicly about steroids. So that's what I'm going to do, starting with a complication illustrated by the latest fallen star, David Ortiz.

I'm not going to write as explicitly about my feelings as John did; rather, I am going to write about wading through the confusion and trying to make sense of the mess (which is basically what John did too, now that I think about it). There are a few things that have come up in the wake of the recent leak of Manny's and Ortiz' names that I think are important to address, but for now, I'll stick with the Ortiz revelation. Specifically, I want to write not so much about Ortiz, but about the statistical implications of this news and the timing with Ortiz' rise to prominence.

Ortiz has always raised some suspicion just for his transformation from the guy who was released to make roster space for Jose Morban to a perennial MVP candidate at the plate upon arrival in Boston, and the news of his cameo in The 104 has several fans connecting the dots. Media personalities have cited his Minnesota numbers and the timing of the positive test to imply that Ortiz' emergence as a hitter was strongly linked to steroids. The problem is that these dots are a bit like the stars of Ursa Major that, I'm told, look like a bear when you connect them properly. The issue is that there are countless ways to connect the dots, and no one really knows how to do it properly or without injecting a bit of whimsical imagination to help paint the picture.

We don't know, no matter what anyone tells you, how to identify steroid users by the numbers. We have a good idea that they improve performance. We have an inkling that they are related to either injuries or healing or both. Everything is just ambiguous enough, though, that anyone can be made to seem suspect through the wrong-coloured glasses. One player is suspect if he suddenly succumbs to a rash of injuries, while another is suspected for being too durable. If someone muscles up and gains weight, it's said that he must be going on the juice. If someone slims down and loses weight, he must be coming off it. A career year? Evidence of doping. A sudden loss of production? The same. Someone who developed young and was hitting moonshots in his early 20s against much older competition is more mature than anyone could naturally be at that age. Someone who develops power and strength later in his 20s can't have changed his body without steroids. I've heard all of these used to implicate specific players. There's just so much that gets tossed around as supposed evidence that doesn't amount to anything. Too infrequently is the question asked whether steroids are the only feasible explanation, or even the most likely one; the only question asked is whether they are a possible one.

Every time a Bret Boone or a Brady Anderson is exposed, it fuels the fervor with which people pursue these hunches. They were right about such obvious signs indicating use before, after all. But what about the Jason Grimsleys and Alex Cabreras and Paul Byrds? We know that some guys got big and had power surges in this era, and we know that some guys did neither, and we now know that players from both groups juiced. Most likely, there are also players from both groups who didn't juice. Essentially, the issue I want to raise, and how it relates to David Ortiz, is that it's far too easy to link evidence that seems obvious to steroids when the reasons for believing a connection to exist are tentative at best. It's easy to see that Ortiz tested positive in 2003 and see that as the driving force for the transformation in his game that year, but does this news really suggest that?

First, the idea that Ortiz transformed overnight is far overblown. One prominent sports talk show was discussing Ortiz' time in Minnesota as if he were putting up utility infield power numbers. They cited facts like his hitting only 58 HR in 6 years and that he was 6th on the team in HR over his time in Minnesota, behind hitters like Corey Koskie and Jacque Jones. They did not cite facts like his having fewer than 50 ABs in 2 of those years or his being plagued by injuries in Minnesota and never getting a full season's worth of ABs in any year there. They neglected to mention his leading the team in AB/HR over his 6 years in Minnesota. There was no mention of two broken wrists, both requiring surgery, while with the Twins, which would seem particularly relevant given what his latest wrist injury has done to his power, or of another surgery to remove bone chips from his knee. No indication that the power had grown toward the end of his stay in Minnesota to the point that he was putting up ISOs of .241 and .228 and AB/HR of 16.8 and 20.6 in his final 2 seasons with Minnesota despite dealing with injuries both years. This was not the single-digit per year HR threat who was portrayed.
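
For anyone unfamiliar with the two rate stats quoted above, here is a minimal sketch of how they are computed. AB/HR is simply at-bats divided by home runs, and ISO (isolated power) is slugging percentage minus batting average, which reduces to extra bases per at-bat. The 32 doubles and 20 HR are the 2002 figures mentioned in this piece; the at-bat and triples counts are my own assumed inputs, chosen to be consistent with the .228 and 20.6 quoted above, so treat the stat line as illustrative rather than official.

```python
def ab_per_hr(at_bats: int, home_runs: int) -> float:
    """At-bats per home run: lower means more frequent home runs."""
    return at_bats / home_runs


def isolated_power(at_bats: int, doubles: int, triples: int, home_runs: int) -> float:
    """ISO = SLG - AVG, which simplifies to extra bases per at-bat."""
    extra_bases = doubles + 2 * triples + 3 * home_runs
    return extra_bases / at_bats


# Illustrative 2002 line. AB and triples are assumptions (not from the article).
AT_BATS, DOUBLES, TRIPLES, HOME_RUNS = 412, 32, 1, 20

print(round(ab_per_hr(AT_BATS, HOME_RUNS), 1))                          # 20.6
print(round(isolated_power(AT_BATS, DOUBLES, TRIPLES, HOME_RUNS), 3))   # 0.228
```

Both are rate stats, which is why they tell a different story than a raw total like 58 HR in 6 years when playing time was limited.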

Now, consider David Ortiz going to Boston in 2003, but take steroids out of the picture. Is it unreasonable to see him finding such a groove there? He had had surgeries in each of the past two seasons and was finally healthy. He may have just been fully recovering his wrist strength. He was finally getting regular playing time. He was entering his prime years. He was already coveted by wunderkind Theo Epstein and courted by stars like Pedro and Manny. He had teammates telling him he would hit 30 HR and 30 doubles in Fenway even as he got off to a slow start. His first year there, he hit 31 HR and 39 doubles, showing only the power his teammates and GM saw signs of already. Is that an unreasonable jump for a player going from his age 26 to age 27 year who hit 32 doubles and 20 HR a year earlier in a season interrupted by knee surgery?

Consider this: in 2001, despite breaking his wrist and having surgery that year, Ortiz' HR/FB rate, according to Baseball-Reference, was 14.4%. In 2003, when Ortiz allegedly morphed into a monster, his HR/FB rate was 14.3%. What we saw in Ortiz was definitely an improved hitter in 2003, but it was not out of nowhere. Were his numbers in 2003 likely aided by steroids? Yes, that would be a reasonable assumption. But did steroids turn Ortiz from a run-of-the-mill hitter into Big Papi? There is no indication that that is the case.

The most alluring trap here is the timing of the positive test. It came in 2003, presumably in Spring Training, when all players in MLB were tested. It's also possible that it came later if he was one of the 240 players chosen for additional testing that season. The seeming connection is that he did steroids just as he emerged as a power hitter, so they must be connected. This assumption requires a few leaps of faith, however. First, you have to assume that Ortiz was not doing steroids before 2003, that he resisted the temptation while fighting through multiple major injuries and when there was no testing, and then only started when MLB began testing and was on the horizon of a new policy with random testing and penalties. If he were already doing steroids before that, after all, then they would do nothing to explain his improvement. Then, you have to assume that after not taking steroids throughout his 6 years in the Majors with no testing, he not only started taking them in 2003 and got caught by the survey tests, but that he kept juicing throughout the next several years and managed to thwart all subsequent tests along the way. Steroids in early 2003 can't explain how he kept improving in succeeding years and had huge seasons well beyond even his 2003 levels later on if he didn't keep taking them. If his improvements were due to steroids and he didn't keep taking them, then his numbers should not have gotten even better in subsequent years. Following from this assumption is the implication that MLB's testing policy does not work, since it didn't catch him despite his presumed consistent use of steroids for years. Finally, you have to assume that steroids even have the ability to turn someone from a run-of-the-mill hitter into a superstar slugger despite knowing of dozens of examples of players who used steroids and turned into, well, Nook Logan.
It is much more likely, given the environment of testing, that 2003 marks the end of a period of steroid use for Ortiz dating back to his Minnesota days than the beginning of a period dating forward into his most productive years, in which case, his statistics would have told us virtually nothing about his likelihood of steroid use.

The point of all this is not to exonerate Ortiz, but rather to try to put into perspective the complications of trying to tie PED use to production, or, even more dangerously, production to PED use. Take, for instance, a player like Luis Gonzalez. Is it possible for someone like Gonzo to hit 57 HR at age 33 when he never topped 31 the rest of his career? Gonzalez, like many players who had one exceptional year beyond their career norms, has come under much suspicion. Again, the assumption here asks us to strain logic in believing that the only reasonable explanation is that Gonzalez, in the middle of a 5-year stretch where he hit at least 26 HR each year, decided to start taking steroids in one of those years, saw exceptional results, and then either promptly decided to quit using them or they just stopped helping him. If the only reason Gonzo's home runs spiked to 57 is that he was on steroids, and he was already ok with taking them for a whole year and winning a WS using steroids, then wouldn't it make more sense that he would keep using them, at least for the next couple years when MLB wasn't really doing anything about steroid use? But if we assume that he did keep using them, then they can no longer explain the one-year spike in 2001. The assumption that Gonzo will be on the list of 104 names from 2003 just because of his 2001 makes little sense for this reason: the point of being suspicious is that his HR total in 2001 was so much higher than any other year for Gonzo, but if he were also taking them in surrounding years (like 2003), then there's nothing different in 2001 that would lead to a home run explosion. He may or may not have taken steroids, but his 2001 season is not evidence either way.

Of course, Gonzalez' 2001 season was unusual. That does not mean steroids are the explanation, any more than Davey Johnson's 1973, when he hit 43 home runs after hitting 5 the year before and without topping 18 in any other year of his MLB career, requires us to conclude steroids must have been in play. Any more than Willard Marshall's 1947, when, at age 26, he followed up a 13 HR 1946 campaign with 36, good for 3rd in the NL, and then proceeded into his prime years dropping back to 14 and 12 over the next two years. Is it rare for a consistent 10-15 HR threat to hit 36 one year and never even hit half that total in the rest of his career? Yeah, but that doesn't mean it doesn't happen without steroids. Hack Wilson's 1930, Roger Maris' 1961, Andre Dawson's 1987, heck, Ned Williamson's 1884; steroids? If they happen in 2001, the whispers are sure there.

Players have career years. Sometimes, they even have large shifts in talent levels. Occasionally, those shifts are extreme, especially when a player's career year happens to line up with a big offensive year in MLB. The improved environment can exaggerate the home run totals of players. Consider the following graph of the MLB-wide HR/ball-in-play for every year since 1910:

[Graph: MLB-wide HR per ball in play by season, 1910 onward]

Notice especially the spikes occurring in 1930, 1961, and 1987 that helped Wilson, Maris, and Dawson reach new highs. Now, imagine any of the above players having career years in Bank One Ballpark in 2001, a home run park in the second biggest year for home run rates in MLB history. How could Luis Gonzalez have hit more home runs than anyone in the NL had ever hit until just a few years earlier? Chase Field circa 2001, plus a career year from a pretty good hitter is a good place to start. Steroids may or may not have helped him hit 168 HR in 5 years, but, barring additional solid evidence that says he juiced in 2001 and not in other years, they don't explain how so many more got distributed into 2001 than any other year.

There is another trend I want to highlight from the above graph. Namely, I want to look at the consistently higher home run rates starting in 1993/94. Conventional wisdom says steroids took over, but how true is that? Tom Tango looked at this very trend on the Hardball Times last year and found that explanation to be unsatisfactory. I highly recommend reading his article, especially the quote from the guy responsible for testing MLB's baseballs. Consider that another significant event occurred between the 1992 and 1993 seasons: a new Commissioner took office, one who has prided himself on instituting any change that is perceived as increasing fan interest, which, throughout the mid- to late-90s, meant the longball.

This sudden and significant shift rivals even the transition out of the Deadball Era and the return of WWII vets, except this shift has no obvious trigger. Unless everyone just started doing steroids all together in 1993 and 1994, that's not a good explanation (actually, considering the difference in MLB's drug enforcement starting these years from what it had been under Kuhn, Ueberroth, Giamatti, and Vincent, I guess this is mildly conceivable). Notice also the three coloured dots on the graph: these represent 2003-2005, the years MLB instituted survey testing for steroids, random testing with penalties, and then strengthened the policy to include suspensions and public release of names after a first positive test. MLB reported significant drops in positive tests from the survey testing period in these years, and the drop from outrageous home run rates was noted widely in the media (including irresponsible reporting at times of the projected home run totals for the league based on totals from the first month or two of the season without attempting to account for the fact that home run rates always rise in the warmer months). However, the graph above shows that HR rates stayed in the same range they had occupied in the '90s. There were rises from the year before in both the year survey testing was instituted and again in the first year of an enforced testing policy under Selig. In 2006, when the current policy that further strengthened penalties took effect, the HR rate rose again. The past 2 years have seen HR rates comparable to those of 1998. So if the current policy has indeed gone a long way toward cleaning up the game from steroid use, then steroids weren't the primary factor in the increase of HR rates in this era. They may have played a role in rates peaking in 2000 as high as they did, but there seems to be a lot more to the power surge than steroids.

The stigma attached to steroids has made it very difficult to see many of the complications involved. It eventually became obvious that the power numbers we were seeing were not under the same conditions as in years past once the Gonzos and Greg Vaughns began doing things Ted Williams and Hank Aaron had never done, and steroids became a very easy target for those frustrated with that idea. The premise that, until recently, power numbers from this era were not comparable to those of eras past but now somehow are, however, is not well founded. We are still in the same power era we have been in for a decade and a half. It's difficult to make out the differences even looking at the whole league as a sample. That makes it virtually impossible to tell anything conclusive from the numbers of just one player, particularly when it is one big year that calls suspicion. Seeing someone like Bonds or McGwire hit the twilight of his career and suddenly start churning out home runs at an unprecedented and sustained rate for years might warrant some suspicion, especially given the surrounding circumstances, but the things that pass for evidence against too many are much, much more murky than that.

The more we learn about who used what, and the more the stigma fades away, the clearer it becomes that two things are true: that nearly anyone can be guilty, regardless of what we thought, and that the one reason to be convinced anyone took steroids is to have actual evidence (not just suspicions) linking him to them. Short of that, there's really no way to tell.

Javier Vazquez: Stats vs. the Standings

It would seem that our favourite pitcher here at 3-D Baseball is Javier Vazquez. We do, after all, have a tracker (and by tracker I mean I go in and manually update it when I think about it) devoted to monitoring his quest for 3000 strikeouts at the top of the margin. We've written one article about him already and are now featuring another Javy Vazquez piece, and we don't have that many player-exclusive articles. To be perfectly honest, though, he's not really a favourite of anyone here. Personal favourites of our writers include Greg Maddux, Satchel Paige, Bob Gibson, even Barry Zito; Vazquez is just another pitcher, most notable to us for being another in the hilarious pattern of high-profile moves that for whatever reason never seemed to pan out for the Yankees. Vazquez, however, is among the most illustrative players in the gap between common sense baseball perception and sabermetric digging into the depths of player values, one of the more difficult puzzles in the enigmatic endeavor of relating player production to the printed standings lying across your lap as you sip your morning coffee. As such, he's the perfect subject for someone such as me who is interested in those sorts of things.

So the question before us today: how can a pitcher as good as statisticians claim Vazquez is be struggling to crack .500 this far into his career?

To try to answer this, I want to take a look at Vazquez' teams' W-L record when he pitches and see just how many of those wins they should have had with an average starter instead of Vazquez. Common sense says that a good pitcher should win games, but what if we don't know how many games his team could have won without him? We need to establish a baseline that tells us how good Vazquez' teams were without him before we can say his mediocre record is not adding wins. Once we do that, we should have an idea of what kind of real-life value Vazquez added in the standings which we can then compare to his credited statistical value.

I guess this would be a good time to detail how good statisticians claim Vazquez is. I would, after all, like to try to reconcile that with the actual wins and losses for his teams, so it would be a good starting point to know what kind of value we're talking about for Vazquez beyond just good K:BB ratios, FIPs, tRAs, etc. Sean Smith of BaseballProjection has Vazquez rated as 32 wins above replacement through 2008. Fangraphs, which bases its win values on FIP and innings pitched, has him at about 31 WAR from 2002-2008 (this would be higher if it included his whole career, as BaseballProjection does). StatCorner, the home of tRA, likes him even better, crediting him with about 31 WAR from 2003-2008.

If you aren't familiar with the concept of replacement level, don't worry. It seems like an ambiguous concept and can be hard to gauge at first since it's not inherently defined and it's up to each statistician to figure what replacement level he or she will work with. For this article, we will be dealing with the simpler and more rigidly defined practice of comparing to the league average, so you'll only need to have a basic idea of what the above WAR numbers mean. Replacement level is generally considered to be somewhere around 2 wins per year below average, so a full season's worth of average production is worth somewhere around 2 WAR. StatCorner's runs above average stat would have Vazquez' 31 WAR worth about 16-17 wins above average over the years it covers, if that is a more comfortable baseline. Additionally, it would be helpful to have some context as to how good Vazquez' WAR figures are. Adding in Vazquez' whole career to the FIP and tRA estimates of win value would likely put him into the 40s, but since they only go back so far, it's hard to get historical comparisons. Instead, you can get a good idea of Vazquez' career value by seeing who he compares to in Sean Smith's database. His 32 wins are already right around the career production of notables Preacher Roe, John Tudor, and Sal Maglie. Sabathia, in a few years fewer than Vazquez, has accumulated 33 wins. Keeping in mind that WAR adds about 2 wins for every year of average production and thus tends to keep accumulating throughout a player's career, the company Javy is already in at his age is pretty good, and certainly better than his record would suggest.

This brings us to the primary issue. Are the statistical metrics missing the mark on Vazquez? While this could bring up a discussion of the merits of those metrics and the validity of their methodology, I want to keep the technicality to a minimum here. I don't want to use this article to call for us to ignore what we see in the standings and take the metrics as gospel; rather, I want to simply look at the actual wins and losses and see if they add up.

Vazquez' W-L record through 2008 was 127-129. Taking that a step further, we can look at what his team did whenever he started, regardless of whether he was credited with a decision, and see that his teams went 175-178. That's not so great. I'd say it's downright average, if not a little worse. Baseball is a team sport, though, and going 175-178 for otherwise bad teams can be pretty good. To see how good that 175-178 is in Vazquez' case, I considered 3 primary factors relating to the quality of the team outside (for the most part) of Vazquez' influence: the offense, the bullpen, and the defense. In doing so, I tried to find an estimate of how the same teams would have done in those same games with a league average pitcher starting instead of Vazquez.

The first factor I addressed was the offense. In Vazquez' 353 starts through 2008, his teams scored an average of 4.44 runs per game. Assuming average run prevention (basically, ignoring the effects of Vazquez, the bullpen, and the defense), we can take the average R/G for the league Vazquez pitched in each year, park adjust it to his home park, and use that as our runs allowed to get a Pythagorean record to see how good the offenses were.

If that sounds like I'm not keeping the technicality to a minimum, let us step back for a moment. Basically, all it means is that I'm comparing what Vazquez' offenses did in games he pitched to what an average offense would have done. It turns out that based on the strength of the offense alone, an otherwise average team would have been expected to win about 162 of those 353 games. So Vazquez has worked in front of well below average offensive support over his career to date.
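
For the curious, the offense-only step can be sketched in a few lines. This is a rough illustration, not the exact calculation behind the numbers above: it assumes the classic Pythagorean exponent of 2, and the 4.82 league runs per game is a hypothetical park-adjusted figure I back-solved so that the output lines up with the 162 wins quoted above. The 4.44 runs per game and the 353 starts are the real inputs.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


GAMES = 353          # Vazquez' starts through 2008
OFFENSE_RPG = 4.44   # runs per game his teams scored in those starts
LEAGUE_RPG = 4.82    # HYPOTHETICAL park-adjusted league average (assumption)

expected_wins = GAMES * pythagorean_win_pct(OFFENSE_RPG, LEAGUE_RPG)
print(round(expected_wins))  # 162
```

Other exponents (1.83 is a common alternative) shift the result slightly, which is one reason to treat the output as an estimate.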

Moving on to the bullpen, I took Vazquez' innings out of every game he pitched and replaced them with the same number of innings by the average starting pitcher in the league he was pitching in at the time (once again adjusted to Vazquez' park, of course). The innings pitched by the bullpen were left intact. This allowed me to get an estimate of how many runs these teams would have allowed with an average starter based on how good or bad their bullpens were. As it turns out, the bullpens, on the whole, were bad as well. Their runs allowed per 9 innings of 4.96 (this is higher than their ERA since it includes unearned runs) was nearly as bad as the average starter's mark, adjusted to Vazquez' parks, of 5.03. Generally speaking, bullpens should have better run prevention than the average starter, and it's a bad sign if they don't. Divvying the innings appropriately and combining the bullpen's mark with that of the average starter's mark gives us a new runs allowed figure for our teams and lets us recalculate their record with the effects of the bullpen as well as the offense. This drops the number of expected wins to 158.
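
Here is a minimal sketch of the "divvying the innings" step. The 5.03 and 4.96 runs per 9 innings come from the paragraph above; the innings split per game is a made-up assumption for illustration, since the real calculation works game by game with the actual innings. Because of that, this sketch shows the shape of the blend and is not meant to reproduce the 158-win figure.

```python
def blended_runs_allowed(starter_ra9: float, bullpen_ra9: float,
                         starter_ip: float, bullpen_ip: float) -> float:
    """Runs allowed per game when the starter and bullpen split the innings.

    Each rate is runs per 9 innings, so each part contributes rate * innings / 9.
    """
    return (starter_ra9 * starter_ip + bullpen_ra9 * bullpen_ip) / 9.0


AVG_STARTER_RA9 = 5.03   # average starter, adjusted to Vazquez' parks
BULLPEN_RA9 = 4.96       # his teams' actual bullpens

# HYPOTHETICAL innings split for a 9-inning game (assumption, not from the data).
STARTER_IP, BULLPEN_IP = 6.5, 2.5

runs_per_game = blended_runs_allowed(AVG_STARTER_RA9, BULLPEN_RA9,
                                     STARTER_IP, BULLPEN_IP)
print(round(runs_per_game, 2))
```

With the two rates this close together, the blend barely moves no matter how the innings are split, which is exactly the problem: a good bullpen would pull the number down noticeably.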

Finally, I looked at defense. From 2002 on, I used team UZR to measure defensive value. Before 2002, I used Total Zone, as UZR is not currently available for those years. This final adjustment was as simple as converting the fielding runs measurements to a per-game figure and prorating it to the number of games Vazquez started, and then adding or subtracting those runs from the average starter in our previous adjustment. The runs allowed by the bullpen need no adjustment since the runs they gave up were already in front of the defenses we are measuring. Once again, Vazquez' defenses were substandard, adding .02 runs per game to the expected runs allowed. This modest adjustment takes off 1 further win from our previous total.

That brings us to a final total of 157 wins if we remove Javier Vazquez from the 353 games he started and replace his innings with a league average starting pitcher. Remember that the actual record of these teams in front of Vazquez is 175-178. Considering that these teams generally sucked and were expected to go somewhere around 157-196 in those games without Javy's influence, 175-178 is pretty good. It's about 18 wins above average, meaning that if he were pitching on a mostly average team, it would be reasonable to expect that team to be close to 36 games over .500 instead of 3 games below. It's not hard to imagine the implications for Vazquez' W-L record.
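
The arithmetic behind the 18 wins and the 36 games over .500 is simple enough to spell out. This sketch uses only the numbers already given; the variable names are mine.

```python
actual_wins, actual_losses = 175, 178
expected_wins_avg_starter = 157            # same teams, league average starter

games = actual_wins + actual_losses                            # 353
wins_above_average = actual_wins - expected_wins_avg_starter   # 18

# Put the same 18 extra wins on an otherwise average (.500) team.
average_team_wins = games / 2                                  # 176.5
wins_on_average_team = average_team_wins + wins_above_average  # 194.5
losses_on_average_team = games - wins_on_average_team          # 158.5
games_over_500 = wins_on_average_team - losses_on_average_team

print(wins_above_average)   # 18
print(games_over_500)       # 36.0
```

Every win added is also a loss removed, so wins above average count double when measured as games over .500.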

The 18 wins above average matches up reasonably well with the statistical estimates of Vazquez' win value. In this case, despite the surface tension between the stats and the record, the metrics are in fact in harmony with common sense perception that they need to translate to real life wins that show up in the W-L columns. They do just that; they just don't start with a baseline value of .500. Vazquez was not pitching for otherwise average teams, so comparing his record to the record of an average team clearly understates his value. Just as common sense dictates, it takes a very good pitcher to bring otherwise substantially below-average teams up to average performance, which is exactly what we're seeing with Vazquez.

Fun with Retrosheet (Magic Loogie edition)

June 14, 1987; Mets, Phillies. It's a beautiful afternoon to spend in the right field stands. Then it happens: a crucial Keith Hernandez error opens the door for a five-run Phillies' ninth that costs the Mets the game. Nice game, Pretty Boy!

We've all heard the story, I'm sure. It goes on, of course, outside the players' entrance, where Hernandez proceeds to let fly one magic loogie complete with right turns, left turns, ricochets, force enough to displace a baseball cap, and a pause, in mid-air, mind you, before inflicting its final damage. The jury is still out on the existence of a second spitter.

We, like Seinfeld, know better than to trust the word of an unsavory character such as Newman. And we, like Seinfeld, are doing a little digging.

Exhibit A: June 14, 1987, no game was played in Shea. The Mets were in the midst of a 10-game road trip.

Exhibit B: The Mets were not playing the Phillies that day. They played the Pirates instead.

Exhibit C: The Mets did not give up 5 runs nor lose in the 9th. In fact, they won 7-3.

Exhibit D: Keith Hernandez did not make an error that day. He was a perfect 12 for 12 in his fielding opportunities. He also had a home run and a double along with 2 runs scored and 2 RBIs. Nice game, Pretty Boy.

So there you have it. Magic loogie or not, that's the way it happened.
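
If you want to do your own digging, here is a rough sketch of checking Exhibits A through C against a Retrosheet-style game log. It assumes the comma-separated game log layout as I understand it, with the date in the first field, the visiting and home teams in the fourth and seventh fields, and the visiting and home scores in the tenth and eleventh. The sample row is hand-typed from the facts above with placeholder game-number fields, not copied from a real log file, and `find_games` is a hypothetical helper of my own.

```python
import csv
import io

# Hand-typed illustrative row (NOT copied from Retrosheet); the two 0s in the
# team game-number fields are placeholders.
SAMPLE_LOG = '"19870614","0","Sun","NYN","NL",0,"PIT","NL",0,7,3\n'


def find_games(log_text: str, date: str, team: str) -> list[dict]:
    """Return basic facts for every game on `date` involving `team`."""
    games = []
    for row in csv.reader(io.StringIO(log_text)):
        visitor, home = row[3], row[6]
        if row[0] == date and team in (visitor, home):
            games.append({
                "visitor": visitor,
                "home": home,
                "visitor_score": int(row[9]),
                "home_score": int(row[10]),
            })
    return games


for game in find_games(SAMPLE_LOG, "19870614", "NYN"):
    print(game["visitor"], game["visitor_score"], "at",
          game["home"], game["home_score"])
```

Fielding lines like the 12 for 12 in Exhibit D live in the box scores and event files rather than the game logs, so that one takes a bit more work.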
