Garciaparra and Arroyo Speak up on Steroids

I mentioned in my recent post that there were a few topics I wanted to address in regard to steroids. Two of them were comments from former teammates of Manny and Ortiz following the leak of their names in connection to the survey tests conducted by MLB in 2003. The first came from Nomar and dealt with one potential problem with listing the names of players who failed an anonymous survey test. The second came from Bronson Arroyo and addressed the possibility of lack of intent involved in testing positive for the survey tests.

Nomar's remarks raised the issue of whether there could be players named on the list who never even legitimately tested positive. For instance, he raises the possibility that, because this was only survey testing, the standards for identifying false positives may not have been as high (since the tests were not actually trying to identify individual steroid users). By far the more intriguing possibility he talks about, however, is that someone could be on the list for refusing to participate in the survey test. Typically, a refusal to test implies an act of hiding something, and, as such, is generally treated the same as a positive test. With anonymity guaranteed, there was no reason for a user to refuse testing; however, Nomar claims there were players who wanted to refuse testing in the survey (and thus be counted as a positive) simply to raise the number of positive tests and increase the chances of implementing a testing policy. Could these players be on the list?

If this is the case, and if names keep leaking out, then it is possible that players could end up being vilified specifically because they took a stand against steroid use. If Nomar's assertion that players would actually submit automatic positives to help trigger a testing policy sounds far-fetched, consider that reports of plans to do just that surfaced during Spring Training of 2003. While the organized effort to coordinate refusal of testing throughout an organization was stopped, it is conceivable that some of the individual players involved in the plan, as well as others throughout MLB who felt the same way, may have followed through on refusal to test. Given that the testing was merely for survey purposes, there would have been no deterrent to anyone wishing to do so. Again the question becomes, could these players be on the same list as the players who tested and came up positive?

The tests were purely intended to survey MLB for the prevalence of steroid use; it was never their goal to identify players who tested positive. The only data of note from such a survey was how many positive results were returned. No other details were relevant, including who tested positive or what drugs were discovered in samples. As such, there would be no reason to separate the list of positives into any further classification than "positive" or "not positive". Since refusal to test counted as a positive, it is reasonable to assume that it is at least possible a player who refused to test would be listed with the other positive results on the list.

Presumably, the union has or had more details regarding the tests, including what drugs were detected in the tests, based on reports that players were eventually notified of such information after the results were seized. If this is the case, then we may eventually get confirmation of a player's non-guilt if such a player is named; however, what are currently leaking are simply names with no further details. We are relying on the players themselves to fill us in on the rest of the details.

Given this possibility, what if a name like Frank Thomas is leaked as being on the list? He was an avid opponent to steroids at the time of the survey tests, and while the identities of the veterans who hatched the plan to refuse testing were never revealed, we do know that the most outspoken player on the issue was playing for the White Sox at the time. We have to at least consider the possibility that we may hear a name connected to that list without an actual positive test associated with it, and it would behoove us to be prepared to wait until all the facts come out before passing judgment from now on.

On a different note, Bronson Arroyo's comments related to those who actually did test positive. Quote Arroyo:
Before 2004, none of us paid any attention to anything we took. Now they don't want us to take anything unless it's approved. But back then, who knows what was in stuff? The FDA wasn't regulating stuff, not unless it was killing people or people were dying from it.
-Boston Herald

His claim is that players could have essentially tested positive for steroids without intentionally taking steroids because lax regulation of legal supplements led to tainted products and because the players had little incentive to be as discerning with what they were taking as long as it was legally available over the counter. He adds that he took andro prior to 2004 and believes that he may be one of the players to test positive because of rumours that andro, which was legal at the time and not on baseball's banned supplement list, was potentially tainted with steroids.

If players really did have steroids in their system only because they were unknowingly taking legal supplements that were tainted, I believe that is a significant difference from deliberately juicing. In the anonymous survey testing, there was no deterrent to get players taking legal supplements to find out if they could be tainted and make sure their samples were clean. Someone who tripped a positive test with a legal supplement may have been surprised to learn as much, but the only repercussion would be that they learned that they would have to be careful about even legal supplements in the future. It was never to be an issue as long as they were aware of such dangers going forward. The leaking of names from the list of positive tests in the survey test improperly lumps these players in under the canopy of deliberate cheaters.

How plausible are Arroyo's claims, though? As with Garciaparra's assertions, contemporary reports back Arroyo. The FDA claimed a year later that andro was a potentially dangerous supplement that could be converted to steroids in the body and stopped distribution of the product. As a dietary supplement, andro required no FDA approval to hit the market, but the Dietary Supplement Health and Education Act of 1994 described in the linked article gave the FDA both the responsibility to ensure its safety and an avenue to act against it; however, it was only 10 years later, after the period of survey testing in MLB, that the supplement was regulated.

The issue could be further exacerbated for players who live and train in countries with less oversight over supplements than the U.S. The possibility that players could have introduced steroids into their bodies unknowingly certainly seems plausible. With the only regulation in place being that players take legal supplements, I can't blame players for going only so far as to determine whether what they are taking is a legally available and legally acquired substance as the rules required. Neither the law nor the league dictated that they were responsible for knowing whether their legal actions could inadvertently lead to detection of steroids.

Is it reasonable to believe that not all of the 104 positive tests were the result of deliberate juicing? The number of positive tests in MLB dropped from 104 in 2003 to 12 in 2004. If we are to expect that all of those positives were the result of deliberate juicing, and we believe both that steroids have a significant effect on performance and that the number of positive tests is in any way indicative of the prevalence of deliberate use, then that is a sizeable drop that should lead to detectable changes in league-wide numbers. That's not the case, however, at least not for two of the measures most commonly assumed to be linked to steroids (HR power for hitters and fastball velocity for pitchers). As we saw in the previous steroid article, the rate of home runs per ball in play rose a bit from 2003 to 2004 and has not been down significantly since. Similarly, the average fastball velocity, as estimated by averaging velocities from FanGraphs' team pages, weighted by percentage of fastballs thrown for each team, is virtually unchanged.
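For anyone who wants to repeat the velocity estimate, here is a minimal sketch of the weighting described above: each team's average fastball velocity counts in proportion to that team's fastball percentage. The function name and the three sample teams are made up for illustration; they are not FanGraphs figures, and this is only one plausible reading of the calculation.

```python
def weighted_mean_velocity(teams):
    """Average fastball velocity across teams, weighted by fastball share.

    `teams` is a list of (avg_fastball_mph, fastball_pct) pairs, one per team.
    """
    total_weight = sum(pct for _, pct in teams)
    if total_weight == 0:
        raise ValueError("fastball percentages sum to zero")
    # Each team's velocity counts in proportion to how often it throws fastballs.
    return sum(velo * pct for velo, pct in teams) / total_weight


# Placeholder values for three hypothetical teams (NOT real data).
example_teams = [(90.0, 60.0), (91.0, 50.0), (89.0, 65.0)]
league_velocity = weighted_mean_velocity(example_teams)
print(round(league_velocity, 2))
```

Running the same function on each season's team table and comparing the results is all the year-to-year comparison amounts to.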

This does not, of course, mean that a large chunk of the positives in 2003 did not represent deliberate steroid use. There are a number of possible reasons those stats didn't drop. It could be that our understanding of how steroids affect players is simply wrong or misguided. It's possible that the drop in positive tests represents players getting better at beating the system once the tests are punitive. It's also reasonable, though, that part of the explanation could be that a number of those positive tests were from tainted legal supplements that did not have the same effects as steroids. After all, despite Arroyo's comment that andro made him feel "like [he] could jump and hit [his] head on the basketball rim," his velocity didn't drop at all from 2002-2003 to the years after. This sample of one does nothing to establish a reliable effect of andro or other potentially tainted then-legal supplements, but it is in line with the idea that just because something could introduce steroids into a player's urine does not mean it is giving him the benefits commonly perceived as resultant of steroid use.

Neither of these players' statements gives us anything definitive about the list of 104. Both, however, introduce complexities far beyond the simple binary experiment of on or off being played out in the media. Simply assuming as much as we do based on this list, which was never intended to be released and which is being released now illegally (a separate but probably more important issue), is oversimplifying the issue and eschewing an understanding of the era we've borne witness to. If we really want to understand what was going on all these years and what it meant to the game, as Senator Mitchell so wanted when he conducted his report, then we need to be listening to players like Garciaparra and Arroyo when they speak like this, and we need to take what they say seriously. To me, that is much more important than having names to vilify.

Stat of the Day

Andrew McCutchen joined the company of Pirate greats like Willie Stargell, Roberto Clemente, Ralph Kiner, and Frank Thomas (ok, maybe this Frank Thomas wasn't that great) last night when he became the organization's first hitter to homer thrice in one game since Aramis Ramirez did it in 2001, the first whose name doesn't make Pirates fans cringe since Darnell Coles in 1987, and the first who hasn't been traded away entering his prime years (yet) since Bill Robinson in 1976. The real history McCutchen made last night, though, happened in the third inning when he reached on a bunt single. In Retrosheet's PBP files, which date back to 1954, only one other player has homered three times and bunted for a hit in the same game: Dan Ford on July 20, 1983 for Baltimore. Coincidentally, both Ford and McCutchen led off their half of the first with a home run.


Searching for Steroids: a Critique of the Evidence

At this point, steroids are not really the hot topic they once were. The more time that passes and the more names that come out, the harder it is to maintain genuine outrage. Revelations that once shook fans in anger now barely shrug shoulders for many. There is no longer a Pete Rose-figure to pile with shame, no lone, brash figure heading the scandal, never failing to oblige us with enough asinine behaviour to keep our vitriol fresh. The face of the steroid scandal has shifted increasingly quickly in a revolving lineup of star after star falling from grace. We've been submerged in scandal for long enough that we've learned to breathe in it, learned to live with the heartbreak of losing heroes as little more than annoyance. Of course, there are still plenty of fans who maintain a hard-line stance against steroids. I know some who would still have records and stats wiped and users kept far from Cooperstown. On the whole, though, while disapproval remains, a numbness has grown that has fans just wanting to take the prevalence of steroids as an inevitability of the past as long as we keep it clean going forward. So while the latest news on the issue isn't insignificant, it's lighter on the stigma than it has been in years, and I think that makes this, in some ways at least, a better time than ever to discuss it.

I have, of course, discussed (I use the civil term liberally here) the issue many times already. Coming from a family of die-hard Cardinal fans as I do, I got to see the full fury on both sides from early on when the whispers started about McGwire and then when they rose into all-out shouting. I had my opinions, and I expressed them. My feelings have evolved over time, though, and now things have settled enough that I feel comfortable writing publicly about steroids. So that's what I'm going to do, starting with a complication illustrated by the latest fallen star, David Ortiz.

I'm not going to write as explicitly about my feelings as John did; rather, I am going to write about wading through the confusion and trying to make sense of the mess (which is basically what John did too, now that I think about it). There are a few things that have come up in the wake of the recent leak of Manny's and Ortiz' names that I think are important to address, but for now, I'll stick with the Ortiz revelation. Specifically, I want to write not so much about Ortiz, but about the statistical implications of this news and the timing with Ortiz' rise to prominence.

Ortiz has always raised some suspicion just for his transformation from the guy who was released to make roster space for Jose Morban to a perennial MVP candidate at the plate upon arrival in Boston, and the news of his cameo in The 104 has several fans connecting the dots. Media personalities have cited his Minnesota numbers and the timing of the positive test to imply that Ortiz' emergence as a hitter was strongly linked to steroids. The problem is that these dots are a bit like the stars of Ursa Major that, I'm told, look like a bear when you connect them properly. The issue is that there are countless ways to connect the dots, and no one really knows how to do it properly or without injecting a bit of whimsical imagination to help paint the picture.

We don't know, no matter what anyone tells you, how to identify steroid users by the numbers. We have a good idea that they improve performance. We have an inkling that they are related to either injuries or healing or both. Everything is just ambiguous enough, though, that anyone can be made to seem suspect through the wrong coloured glasses. A player is suspect if he suddenly succumbs to a rash of injuries while another player is suspected for being too durable. If someone muscles up and gains weight, it's said that they must be going on the juice. If someone slims down and loses weight, they must be coming off it. A career year? Evidence of doping. A sudden loss of production? The same. Someone who developed young and was hitting moonshots in his early 20s against much older competition is too mature for someone to naturally be at that age. Someone who develops power and strength later in his 20s can't have changed his body without steroids. I've heard all of these used to implicate specific players. There's just so much that gets tossed around as supposed evidence that doesn't amount to anything. Too infrequently is the question asked whether steroids are the only feasible explanation or even the most likely, only whether they are a possible one.

Every time a Bret Boone or a Brady Anderson is exposed, it fuels the fervor with which people pursue these hunches. They were right about such obvious signs indicating use before, after all. But what about the Jason Grimsleys and Alex Cabreras and Paul Byrds? We know that some guys got big and had power surges in this era, and we know that some guys did neither, and we now know that players from both groups juiced. Most likely, there are also players from both groups who didn't juice. Essentially, the issue I want to raise, and how it relates to David Ortiz, is that it's far too easy to link evidence that seems obvious to steroids when the reasons for believing a connection to exist are tentative at best. It's easy to see that Ortiz tested positive in 2003 and see that as the driving force for the transformation in his game that year, but does this news really suggest that?

First, the idea that Ortiz transformed overnight is far overblown. One prominent sports talk show was discussing Ortiz' time in Minnesota as if he were putting up utility infield power numbers. They cited facts like his hitting only 58 HR in 6 years and that he was 6th on the team in HR over his time in Minnesota, behind hitters like Corey Koskie and Jacque Jones. They did not cite facts like his having fewer than 50 ABs in 2 of those years or his being plagued by injuries in Minnesota and never getting a full season's worth of ABs in any year there. They neglected to mention his leading the team in AB/HR over his 6 years in Minnesota. There was no mention of two broken wrists, both requiring surgery, while with the Twins, which would seem particularly relevant given what his latest wrist injury has done to his power, or of another surgery to remove bone chips from his knee. There was no indication that the power had grown toward the end of his stay in Minnesota to the point that he was putting up ISOs of .241 and .228 and AB/HR of 16.8 and 20.6 in his final 2 seasons with Minnesota despite dealing with injuries both years. This was not the single-digit per year HR threat who was portrayed.
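For readers less familiar with the two rate stats quoted above, here is a small sketch of how they are computed. AB/HR is simply at-bats divided by home runs, and ISO (isolated power) is slugging percentage minus batting average, which works out to extra bases per at-bat. The function names are my own. The sample line uses the 20 HR and 32 doubles mentioned elsewhere in this article for Ortiz' last year in Minnesota; the 412 at-bats are back-solved from the 20.6 AB/HR figure and the single triple is an assumption, so treat the inputs as illustrative rather than as an official stat line.

```python
def ab_per_hr(at_bats, home_runs):
    """At-bats per home run; lower means more frequent home runs."""
    if home_runs == 0:
        raise ValueError("AB/HR is undefined with zero home runs")
    return at_bats / home_runs


def isolated_power(at_bats, doubles, triples, home_runs):
    """ISO = SLG - AVG, i.e. extra bases per at-bat.

    A double adds 1 extra base beyond a single, a triple 2, a home run 3.
    """
    extra_bases = doubles + 2 * triples + 3 * home_runs
    return extra_bases / at_bats


# Illustrative inputs: HR and doubles come from the article; the at-bat
# total and the triple count are assumptions (see the paragraph above).
sample_ab, sample_2b, sample_3b, sample_hr = 412, 32, 1, 20

sample_ab_hr = ab_per_hr(sample_ab, sample_hr)
sample_iso = isolated_power(sample_ab, sample_2b, sample_3b, sample_hr)
print(round(sample_ab_hr, 1), round(sample_iso, 3))
```

Both are rate stats, which is the point: they are not dragged down by the missed playing time the way a raw home run total is.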

Now, consider David Ortiz going to Boston in 2003, but take steroids out of the picture. Is it unreasonable to see him finding such a groove there? He had had surgeries in each of the past two seasons and was finally healthy. He may have just been fully recovering his wrist strength. He was finally getting regular playing time. He was entering his prime years. He was already coveted by wunderkind Theo Epstein and courted by stars like Pedro and Manny. He had teammates telling him he would hit 30 HR and 30 doubles in Fenway even as he got off to a slow start. His first year there, he hit 31 HR and 39 doubles, showing only the power his teammates and GM saw signs of already. Is that an unreasonable jump for a player going from his age 26 to age 27 year who hit 32 doubles and 20 HR a year earlier in a season interrupted by knee surgery?

Consider this: in 2001, despite breaking his wrist and having surgery that year, Ortiz' HR/FB rate, according to Baseball-Reference, was 14.4%. In 2003, when Ortiz allegedly morphed into a monster, his HR/FB rate was 14.3%. What we saw in Ortiz was definitely an improved hitter in 2003, but it was not out of nowhere. Were his numbers in 2003 likely aided by steroids? Yes, that would be a reasonable assumption. But did steroids turn Ortiz from a run-of-the-mill hitter into Big Papi? There is no indication that that is the case.

The most alluring trap here is the timing of the positive test. It came in 2003, presumably in Spring Training, when all players in MLB were tested. It's also possible that it came later if he was one of the 240 players chosen for additional testing that season. The seeming connection is that he did steroids just as he emerged as a power hitter, so they must be connected. This assumption requires a few leaps of faith, however. First, you have to assume that Ortiz was not doing steroids before 2003, that he resisted the temptation while fighting through multiple major injuries and when there was no testing, and then only started when MLB began testing and was on the horizon of a new policy with random testing and penalties. If he were already doing steroids before that, after all, then they would do nothing to explain his improvement. Then, you have to assume that after not taking steroids throughout his 6 years in the Majors with no testing, he not only started taking them in 2003 and got caught by the survey tests, but that he kept juicing throughout the next several years and managed to thwart all subsequent tests along the way. Steroids in early 2003 can't explain how he kept improving in succeeding years and had huge seasons well beyond even his 2003 levels later on if he didn't keep taking them. If his improvements were due to steroids and he didn't keep taking them, then his numbers should not have gotten even better in subsequent years. Following from this assumption is the implication that MLB's testing policy does not work, since it didn't catch him despite his presumed consistent use of steroids for years. Finally, you have to assume that steroids even have the ability to turn someone from a run-of-the-mill hitter into a superstar slugger despite knowing of dozens of examples of players who used steroids and turned into, well, Nook Logan.
It is much more likely, given the environment of testing, that 2003 marks the end of a period of steroid use for Ortiz dating back to his Minnesota days than the beginning of a period dating forward into his most productive years, in which case, his statistics would have told us virtually nothing about his likelihood of steroid use.

The point of all this is not to exonerate Ortiz, but rather to try to put into perspective the complications of trying to tie PED use to production, or, even more dangerously, production to PED use. Take, for instance, a player like Luis Gonzalez. Is it possible for someone like Gonzo to hit 57 HR at age 33 when he never topped 31 the rest of his career? Gonzalez, like many players who had one exceptional year beyond their career norms, has come under much suspicion. Again, the assumption here asks us to strain logic in believing that the only reasonable explanation is that Gonzalez, in the middle of a 5-year stretch where he hit at least 26 HR each year, decided to start taking steroids in one of those years, saw exceptional results, and then either promptly decided to quit using them or they just stopped helping him. If the only reason Gonzo's home runs spiked to 57 is that he was on steroids, and he was already ok with taking them for a whole year and winning a WS using steroids, then wouldn't it make more sense that he would keep using them, at least for the next couple years when MLB wasn't really doing anything about steroid use? But if we assume that he did keep using them, then they can no longer explain the one-year spike in 2001. The assumption that Gonzo will be on the list of 104 names from 2003 just because of his 2001 makes little sense for this reason: the point of being suspicious is that his HR total in 2001 was so much higher than any other year for Gonzo, but if he were also taking them in surrounding years (like 2003), then there's nothing different in 2001 that would lead to a home run explosion. He may or may not have taken steroids, but his 2001 season is not evidence either way.

Of course, Gonzalez' 2001 season was unusual. That does not mean steroids are the explanation, no more than Davey Johnson's 1973, when he hit 43 home runs after hitting 5 the year before and without topping 18 for the rest of his MLB career, requires us to conclude steroids must have been in play. No more than Willard Marshall's 1947, when, at age 26, he followed up a 13 HR 1946 campaign with 36, good for 3rd in the NL, and then proceeded into his prime years dropping back to 14 and 12 over the next two years. Is it rare for a consistent 10-15 HR threat to hit 36 one year and never even hit half that total in the rest of his career? Yeah, but that doesn't mean it doesn't happen without steroids. Hack Wilson's 1930, Roger Maris' 1961, Andre Dawson's 1987, heck, Ned Williamson's 1884; steroids? If they happen in 2001, the whispers are sure there.

Players have career years. Sometimes, they even have large shifts in talent levels. Occasionally, those shifts are extreme, especially when a player's career year happens to line up with a big offensive year in MLB. The improved environment can exaggerate the home run totals of players. Consider the following graph of the MLB-wide HR/ball-in-play for every year since 1910:

Notice especially the spikes occurring in 1930, 1961, and 1987 that helped Wilson, Maris, and Dawson reach new highs. Now, imagine any of the above players having career years in Bank One Ballpark in 2001, a home run park in the second biggest year for home run rates in MLB history. How could Luis Gonzalez have hit more home runs than anyone in the NL had ever hit until just a few years earlier? Chase Field circa 2001, plus a career year from a pretty good hitter is a good place to start. Steroids may or may not have helped him hit 168 HR in 5 years, but, barring additional solid evidence that says he juiced in 2001 and not in other years, they don't explain how so many more got distributed into 2001 than any other year.

There is another trend I want to highlight from the above graph. Namely, I want to look at the consistently higher home run rates starting in 1993/94. Conventional wisdom says steroids took over, but how true is that? Tom Tango looked at this very trend on the Hardball Times last year and found that explanation to be unsatisfactory. I highly recommend reading his article, especially the quote from the guy responsible for testing MLB's baseballs. Consider that another significant event occurred between the 1992 and 1993 seasons: a new Commissioner took office, one who has prided himself on instituting any change that is perceived as increasing fan interest, which, throughout the mid- to late-90s, meant the longball.

This sudden and significant shift rivals even the transition out of the Deadball Era and the return of WWII vets, except this shift has no obvious trigger. Unless everyone just started doing steroids all together in 1993 and 1994, that's not a good explanation (actually, considering the difference in MLB's drug enforcement starting these years from what it had been under Kuhn, Ueberroth, Giamatti, and Vincent, I guess this is mildly conceivable). Notice also the three coloured dots on the graph: these represent 2003-2005, the years MLB instituted survey testing for steroids, random testing with penalties, and then strengthened the policy to include suspensions and public release of names after a first positive test. MLB reported significant drops in positive tests from the survey testing period in these years, and the drop from outrageous home run rates was noted widely in the media (including irresponsible reporting at times of the projected home run totals for the league based on totals from the first month or two of the season without attempting to account for the fact that home run rates always rise in the warmer months). However, the graph above shows that HR rates stayed in the same range they had been in during the '90s. There were rises from the year before in both the year survey testing was instituted and again in the first year of an enforced testing policy under Selig. In 2006, when the current policy that further strengthened penalties took effect, the HR rate rose again. The past 2 years have seen HR rates comparable to those of 1998. So if the current policy has indeed gone a long way toward cleaning up the game from steroid use, then steroids weren't the primary factor in the increase of HR rates in this era. They may have played a role in rates peaking in 2000 as high as they did, but there seems to be a lot more to the power surge than steroids.

The stigma attached to steroids has made it very difficult to see many of the complications involved. It eventually became obvious that the power numbers we were seeing were not under the same conditions as in years past once the Gonzos and Greg Vaughns began doing things Ted Williams and Hank Aaron had never done, and steroids became a very easy target for those frustrated with that idea. The premise that, until recently, power numbers from this era were not comparable to those of eras past but now somehow are, however, is not well founded. We are still in the same power era we have been in for a decade and a half. It's difficult to make out the differences even looking at the whole league as a sample. That makes it virtually impossible to tell anything conclusive from the numbers of just one player, particularly when it is one big year that calls suspicion. Seeing someone like Bonds or McGwire hit the twilight of his career and suddenly start churning out home runs at an unprecedented and sustained rate for years might warrant some suspicion, especially given the surrounding circumstances, but the things that pass for evidence against too many are much, much more murky than that.

The more we learn about who used what, and the more the stigma fades away, the clearer it becomes that two things are true: that nearly anyone can be guilty, regardless of what we thought, and that the one reason to be convinced anyone took steroids is to have actual evidence (not just suspicions) linking him to them. Short of that, there's really no way to tell.

Javier Vazquez: Stats vs. the Standings

It would seem that our favourite pitcher here at 3-D Baseball is Javier Vazquez. We do, after all, have a tracker (and by tracker I mean I go in and manually update it when I think about it) devoted to monitoring his quest for 3000 strikeouts at the top of the margin. We've written one article about him already and are now featuring another Javy Vazquez piece, and we don't have that many player-exclusive articles. To be perfectly honest, though, he's not really a favourite of anyone here. Personal favourites of our writers include Greg Maddux, Satchel Paige, Bob Gibson, even Barry Zito; Vazquez is just another pitcher, most notable to us for being another in the hilarious pattern of high-profile moves that for whatever reason never seemed to pan out for the Yankees. Vazquez, however, is among the most illustrative players in the gap between common sense baseball perception and sabermetric digging into the depths of player values, one of the more difficult puzzles in the enigmatic endeavor of relating player production to the printed standings lying across your lap as you sip your morning coffee. As such, he's the perfect subject for someone such as me who is interested in those sorts of things.

So the question before us today: how can a pitcher as good as statisticians claim Vazquez is be struggling to crack .500 this far into his career?

To try to answer this, I want to take a look at Vazquez' teams' W-L record when he pitches and see just how many of those wins they should have had with an average starter instead of Vazquez. Common sense says that a good pitcher should win games, but what if we don't know how many games his team could have won without him? We need to establish a baseline that tells us how good Vazquez' teams were without him before we can say his mediocre record is not adding wins. Once we do that, we should have an idea of what kind of real-life value Vazquez added in the standings which we can then compare to his credited statistical value.

I guess this would be a good time to detail how good statisticians claim Vazquez is. I would, after all, like to try to reconcile that with the actual wins and losses for his teams, so it would be a good starting point to know what kind of value we're talking about for Vazquez beyond just good K:BB ratios, FIPs, tRAs, etc. Sean Smith of BaseballProjection.com has Vazquez rated as 32 wins above replacement through 2008. FanGraphs, which bases its win values on FIP and innings pitched, has him at about 31 WAR from 2002-2008 (this would be higher if it included his whole career, as BaseballProjection does). StatCorner.com, the home of tRA, likes him even better, crediting him with about 31 WAR from 2003-2008.

If you aren't familiar with the concept of replacement level, don't worry. It seems like an ambiguous concept and can be hard to gauge at first since it's not inherently defined and it's up to each statistician to figure out what replacement level he or she will work with. For this article, we will be dealing with the simpler and more rigidly defined practice of comparing to the league average, so you'll only need to have a basic idea of what the above WAR numbers mean. Replacement level is generally considered to be somewhere around 2 wins per year below average, so a full season's worth of average production is worth somewhere around 2 WAR. StatCorner's runs above average stat would have Vazquez' 31 WAR worth about 16-17 wins above average over the years it covers, if that is a more comfortable baseline. Additionally, it would be helpful to have some context as to how good Vazquez' WAR figures are. Adding in Vazquez' whole career to the FIP and tRA estimates of win value would likely put him into the 40s, but since they only go back so far, it's hard to get historical comparisons. Instead, you can get a good idea of Vazquez' career value by seeing who he compares to in Sean Smith's database. His 32 wins are already right around the career production of notables Preacher Roe, John Tudor, and Sal Maglie. Sabathia, in a few years fewer than Vazquez, has accumulated 33 wins. Keeping in mind that WAR adds about 2 wins for every year of average production and thus tends to keep accumulating throughout a player's career, the company Javy is already in at his age is pretty good, and certainly better than his record would suggest.

This brings us to the primary issue. Are the statistical metrics missing the mark on Vazquez? While this could bring up a discussion of the merits of those metrics and the validity of their methodology, I want to keep the technicality to a minimum here. I don't want to use this article to call for us to ignore what we see in the standings and take the metrics as gospel; rather, I want to simply look at the actual wins and losses and see if they add up.

Vazquez' W-L record through 2008 was 127-129. Taking that a step further, we can look at what his team did whenever he started, regardless of whether he was credited with a decision, and see that his teams went 175-178. That's not so great. I'd say it's downright average, if not a little worse. Baseball is a team sport, though, and going 175-178 for otherwise bad teams can be pretty good. To see how good that 175-178 is in Vazquez' case, I considered 3 primary factors relating to the quality of the team outside (for the most part) of Vazquez' influence: the offense, the bullpen, and the defense. In doing so, I tried to find an estimate of how the same teams would have done in those same games with a league average pitcher starting instead of Vazquez.

The first factor I addressed was the offense. In Vazquez' 353 starts through 2008, his teams scored an average of 4.44 runs per game. Assuming average run prevention (basically, ignoring the effects of Vazquez, the bullpen, and the defense), we can take the average R/G for the league Vazquez pitched in each year, park adjust it to his home park, and use that as our runs allowed to get a Pythagorean record to see how good the offenses were.

If that sounds like I'm not keeping the technicality to a minimum, let us step back for a moment. Basically, all it means is that I'm comparing what Vazquez' offenses did in games he pitched to what an average offense would have done. It turns out that based on the strength of the offense alone, an otherwise average team would have been expected to win about 162 of those 353 games. So Vazquez has worked in front of well below average offensive support over his career to date.
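For anyone who wants to see the mechanics, here is a minimal sketch of a Pythagorean record calculation. It collapses everything into one career-level calculation rather than working year by year, and it assumes the classic exponent of 2. The 4.44 runs per game and 353 starts come from the text above; the 4.80 runs allowed figure is a hypothetical stand-in for the park-adjusted league average, which is not listed here.

```python
def pythagorean_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins from runs scored and allowed (Pythagorean formula)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# 4.44 R/G and 353 starts are from the article.
# 4.80 is a HYPOTHETICAL park-adjusted league average, used only for illustration.
offense_only_wins = pythagorean_wins(4.44, 4.80, 353)
print(round(offense_only_wins, 1))
```

With runs allowed held at the league average, any result under half of the 353 games means the offense was below average.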

Moving on to the bullpen, I took Vazquez' innings out of every game he pitched and replaced them with the same number of innings by the average starting pitcher in the league he was pitching in at the time (once again adjusted to Vazquez' park, of course). The innings pitched by the bullpen were left intact. This allowed me to get an estimate of how many runs these teams would have allowed with an average starter based on how good or bad their bullpens were. As it turns out, the bullpens, on the whole, were bad as well. Their runs allowed per 9 innings of 4.96 (this is higher than their ERA since it includes unearned runs) was nearly as bad as the average starter's mark, adjusted to Vazquez' parks, of 5.03. Generally speaking, bullpens should have better run prevention than the average starter, and it's a bad sign if they don't. Divvying the innings appropriately and combining the bullpen's mark with the average starter's mark gives us a new runs allowed figure for our teams and lets us recalculate their record with the effects of the bullpen as well as the offense. This drops the number of expected wins to 158.
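The "divvying the innings" step is just an innings-weighted average. A rough sketch, assuming a hypothetical split of 6 starter innings and 3 bullpen innings per game (the actual innings totals are not listed in this article); the 5.03 and 4.96 marks are the ones quoted above:

```python
def blended_ra9(starter_ra9, starter_innings, bullpen_ra9, bullpen_innings):
    """Innings-weighted runs allowed per 9 for starter plus bullpen."""
    total_innings = starter_innings + bullpen_innings
    weighted_runs = starter_ra9 * starter_innings + bullpen_ra9 * bullpen_innings
    return weighted_runs / total_innings


# 5.03 (average starter) and 4.96 (bullpen) are from the article.
# The 6.0 / 3.0 innings split is HYPOTHETICAL, for illustration only.
team_ra9 = blended_ra9(5.03, 6.0, 4.96, 3.0)
print(round(team_ra9, 2))
```

The blended figure then takes the place of runs allowed in the Pythagorean calculation.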

Finally, I looked at defense. From 2002 on, I used team UZR to measure defensive value. Before 2002, I used Total Zone, as UZR is not currently available for those years. This final adjustment was as simple as converting the fielding runs measurements to a per-game figure and prorating it to the number of games Vazquez started, and then adding or subtracting those runs from the average starter in our previous adjustment. The runs allowed by the bullpen need no adjustment since the runs they gave up were already in front of the defenses we are measuring. Once again, Vazquez' defenses were substandard, adding .02 runs per game to the expected runs allowed. This modest adjustment takes off 1 further win from our previous total.
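As a sketch of that last conversion: fielding metrics like UZR count runs saved as positive, so a below-average defense shows up as a negative number that gets added back onto runs allowed. The function names and the -3.24 season total below are hypothetical, chosen only so the result lands on the .02 runs per game mentioned above.

```python
def defense_runs_per_game(team_fielding_runs, team_games):
    """Runs per game the defense adds to runs allowed (positive = worse defense)."""
    # Fielding runs are runs SAVED, so flip the sign to get runs allowed.
    return -team_fielding_runs / team_games


def adjust_runs_allowed(runs_allowed_per_game, team_fielding_runs, team_games):
    """Shift an average starter's runs allowed by the defense behind him."""
    return runs_allowed_per_game + defense_runs_per_game(team_fielding_runs, team_games)


# HYPOTHETICAL season: -3.24 fielding runs over 162 games is -0.02 per game.
adjusted = adjust_runs_allowed(5.00, -3.24, 162)

# Prorated to Vazquez' 353 starts (start count from the article).
extra_runs_over_starts = defense_runs_per_game(-3.24, 162) * 353
print(round(adjusted, 2), round(extra_runs_over_starts, 1))
```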

That brings us to a final total of 157 wins if we remove Javier Vazquez from the 353 games he started and replace his innings with a league average starting pitcher. Remember that the actual record of these teams in front of Vazquez is 175-178. Considering that these teams generally sucked and were expected to go somewhere around 157-196 in those games without Javy's influence, 175-178 is pretty good. It's about 18 wins above average, meaning that if he were pitching on a mostly average team, it would be reasonable to expect them to be close to 36 games over .500 instead of 3 games below. It's not hard to imagine the implications for Vazquez' W-L record.
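The arithmetic behind the "36 games over .500" figure, written out using only the numbers above:

```python
games = 353
actual_wins = 175
expected_wins_avg_starter = 157  # same teams with a league average starter

wins_above_average = actual_wins - expected_wins_avg_starter

# On an otherwise average team the baseline is a .500 record,
# so add the same wins above average to half the games.
avg_team_wins = games / 2 + wins_above_average
avg_team_losses = games - avg_team_wins
games_over_500 = avg_team_wins - avg_team_losses

print(wins_above_average, avg_team_wins, avg_team_losses, games_over_500)
```

Each win above average moves the record two games relative to .500 (one more win, one fewer loss), which is why 18 wins becomes 36 games.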

The 18 wins above average matches up reasonably well with the statistical estimates of Vazquez' win value. In this case, despite the surface tension between the stats and the record, the metrics are in fact in harmony with common sense perception that they need to translate to real life wins that show up in the W-L columns. They do just that; they just don't start with a baseline value of .500. Vazquez was not pitching for otherwise average teams, so comparing his record to the record of an average team clearly understates his value. Just as common sense dictates, it takes a very good pitcher to bring otherwise substantially below-average teams up to average performance, which is exactly what we're seeing with Vazquez.

Fun with Retrosheet (Magic Loogie edition)

June 14, 1987; Mets, Phillies. It's a beautiful afternoon to spend in the right field stands. Then it happens: a crucial Keith Hernandez error opens the door for a five-run Phillies' ninth that costs the Mets the game. Nice game, Pretty Boy!

We've all heard the story, I'm sure. It goes on, of course, outside the players' entrance, where Hernandez proceeds to let fly one magic loogie complete with right turns, left turns, ricochets, force enough to displace a baseball cap, and a pause, in mid-air, mind you, before inflicting its final damage. The jury is still out on the existence of a second spitter.

We, like Seinfeld, know better than to trust the word of an unsavory character such as Newman. And we, like Seinfeld, are doing a little digging.

Exhibit A: June 14, 1987, no game was played in Shea. The Mets were in the midst of a 10-game road trip.

Exhibit B: The Mets were not playing the Phillies that day. They played the Pirates instead.

Exhibit C: The Mets did not give up 5 runs nor lose in the 9th. In fact, they won 7-3.

Exhibit D: Keith Hernandez did not make an error that day. He was a perfect 12 for 12 in his fielding opportunities. He also had a home run and a double along with 2 runs scored and 2 RBIs. Nice game, Pretty Boy.

So there you have it. Magic loogie or not, that's the way it happened.

Dissecting Genius: examining Dave Duncan

Dave Duncan has long been labeled a genius by many followers of the game. Tony La Russa has frequently credited Duncan as a major contributor to his success. Cardinal fans often talk of his coaching skill as if he spins straw into pitchers. Broadcasters have been known to predict his election into the HOF, an institution that does not even induct coaches. Does the Duncan Effect really exist, though? Do pitchers really get better under the old catcher's tutelage?

Duncan's proponents will cite a few statistics supporting their claims, including his staffs' ERAs (they've led the league 4 times) and his 4 Cy Young winners. However, the answers here are not that simple. What we have just shown is mostly that over his long career, Dave Duncan has coached some very good pitchers. We have no idea how much of that is due to Dave Duncan. We can't use that as evidence much more than we can say Mike Piazza or Javy Lopez were brilliant catchers because their staffs had such success in L.A. and Atlanta. Additionally, we have anecdotal evidence of pitchers like Woody Williams or Jeff Suppan who were perceived to have grown much more successful thanks to pitching under Duncan. This approach also has its problems and does not necessarily prove anything about Duncan.

None of this means, of course, that the claims about Duncan are unfounded. It just means we need to formulate a better approach if we are going to attempt to find evidence for or against the claims about Duncan. The primary issue here is that we have a premise (Dave Duncan is a genius), but no good way to go about seeing if that might be true. Most fans, broadcasters, analysts, writers, etc. have a relatively weak grasp on statistical meaning and, consequently, return essentially meaningless statistical evidence when they feel such evidence is needed. It doesn't invalidate their claims, since they are usually based on some sort of intuitive or anecdotal understanding rather than the statistics, but it does fail to validate them.

In Duncan's case, we need to find a way to isolate as best we can his influence on pitchers. If we just look at everyone he coaches and compare them to the rest of the league, we can't tell what Duncan is changing in those pitchers. Instead, we take an approach similar to what Tom Tango calls WOWY (with or without you), meaning we look at what pitchers did under Dave Duncan and then compare that to what they did without him.

I decided to look at starting pitchers who had pitched three straight years away from Duncan before joining his staff, and then compare their performance in those three years to their performance in their first year with Duncan (for pitchers acquired midseason, their first full year was used). This gives us a good sample to evaluate what we can expect from these pitchers without Duncan's influence and lets us focus on Duncan's perceived specialty: veteran starters. For this study, a starting pitcher was anyone who made at least 10 starts that year. There were 12 pitchers who were starters for all 3 years before joining the Cardinals and then starters in their first year with the Cardinals, plus 8 more who were relievers at some point in the three previous years who started for the Cardinals in their first year with the team.

The simplest approach would be to take each pitcher's ERA over the years leading up to learning from Duncan and compare that to his ERA under Duncan. This approach has several problems, which will be discussed presently, but it is a good place to start. Starting pitchers going to St. Louis during Duncan's tenure certainly have improved their ERAs. From the three years before joining Duncan to their first year with him, the 12 starters dropped their ERA from 4.48 to 4.12 in a bit over 2000 innings. Adding in the other 8 pitchers, their ERA dropped from 4.58 to 4.29 in about 3200 innings. This lends credence to the anecdotal evidence that pitchers do improve significantly under Duncan.

There are four major problems with this simple approach. One of them works against Duncan, two work in his favour, and the final one is mostly neutral. The first is that an aggregate of the past 3 years is not always a very good estimate of what a pitcher should be expected to do. We are looking mostly at veterans here (the average age of the 12 starters when they joined the Cardinals was 29.4, and Mark Mulder at 27 was the only one under 28). A pitcher hitting his 30s, in general, is not supposed to be as good as he was 3 years earlier, but our aggregate counts what he did 3 years ago as just as important to what he should be expected to do as last year. To illustrate this, look at how the ERAs of our sample rise in each successive year prior to joining the Cardinals:
  • 3 years prior: 3.84 ERA
  • 2 years prior: 4.68 ERA
  • 1 year prior: 4.97 ERA

Looking at the trend over the previous three years rather than lumping them all together shows how much our initial method diminished the Duncan Effect. Going just by the previous year, these pitchers dropped from a 4.97 ERA to a 4.12 ERA. Adding in the other 8 pitchers again, the ERA drops from 4.88 to 4.29.
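To see why lumping three years together flatters an aging pitcher's baseline, here is a small sketch with a made-up pitcher whose ERA climbs each season. The earned run and innings totals are hypothetical; only the idea (an innings-weighted aggregate sits well below the most recent year) mirrors the sample above.

```python
def era(earned_runs, innings):
    """Earned run average: earned runs per 9 innings."""
    return 9.0 * earned_runs / innings


def aggregate_era(seasons):
    """ERA over several seasons; seasons is a list of (earned_runs, innings)."""
    total_er = sum(er for er, _ in seasons)
    total_ip = sum(ip for _, ip in seasons)
    return era(total_er, total_ip)


# HYPOTHETICAL pitcher, oldest season first: ERAs of 3.60, 4.50, 5.00.
seasons = [(80, 200.0), (95, 190.0), (100, 180.0)]

three_year = aggregate_era(seasons)
last_year = era(*seasons[-1])
print(round(three_year, 2), round(last_year, 2))
```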

This seems like a huge difference, but we still have to account for the other three problems. One, pitchers going to the Cardinals are always moving to the NL. Sometimes they are coming from the NL, and sometimes from the AL; in the former case, it doesn't matter, but in the latter, a league adjustment becomes necessary because ERAs are lower in the NL. A pitcher going from the AL to the NL should see his ERA drop even if he doesn't pitch any better. Two, the Cardinals have fielded very good defenses, on the whole, during Duncan's tenure. Again, a pitcher moving in front of a good defense should see his ERA drop even if he does not pitch any better. Three, we need to account for park factors. A pitcher coming from Coors should see his ERA drop, while a pitcher coming from Petco should see it rise. Both Busches have been pretty neutral, so this mostly matters when we look at pitchers coming from more extreme parks to Busch.

There is an additional problem that only applies to looking at the sample with the 8 converted relievers: a pitcher's ERA will generally be lower when he is used as a reliever than when he is used as a starter. So converting a pitcher to a starter is likely to boost his ERA a bit. There is also the issue of having fewer innings to project these pitchers' expected ERAs, which will create a minor issue in the next step (namely more regression when we project them). For these reasons, counting these pitchers will underrate the Duncan Effect to some degree, but they do increase our sample to a more comfortable level, and these pitchers are some of Duncan's most famous projects, so we can still look at them, keeping in mind that we are not exactly comparing apples to apples.

Now that we have our main problems outlined, we can refine our approach. The issues will be addressed in the following ways:
  • Use Marcel projections instead of a simple 3-year aggregate or the previous year's ERA alone to determine each pitcher's expected performance level
  • Park adjust each pitcher's ERA for each year we look at
  • Determine a league adjustment to apply for pitchers in the AL to put their stats on par with NL pitchers
  • Calculate FIP as well as ERA to account for the impact of fielders on ERA

The first thing I did was apply a park adjustment to each pitcher's stats for every team he played for in a given year. For example, if someone pitched for Baltimore and Houston in the same season, the Baltimore stats were adjusted to Camden and the Houston stats to Ex-ron. Separate stints with different teams were still left separate at this point so that the league adjustment could be applied only to the proper stint. The park adjustment figures I used were the same as the 1-year pitcher park factors published on Baseball-Reference.
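A park adjustment of this kind can be sketched as a simple division, assuming the park factor is on a scale where 100 is neutral and is meant to be applied to a pitcher's full line for that team. The park factors, ERAs, and innings below are hypothetical placeholders, not the published figures.

```python
def park_adjust(era, park_factor):
    """Neutralize an ERA using a park factor where 100 means neutral."""
    return era / (park_factor / 100.0)


# HYPOTHETICAL two-stint season: (team, ERA, innings, pitcher park factor).
stints = [
    ("Baltimore", 4.80, 100.0, 104),
    ("Houston", 4.20, 80.0, 97),
]

# Keep the stints separate so a league adjustment can later hit only the AL one.
adjusted = [(team, park_adjust(stint_era, pf), ip) for team, stint_era, ip, pf in stints]
for team, adj_era, ip in adjusted:
    print(team, round(adj_era, 2), ip)
```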

Then I determined my league adjustments. This was done in traditional fashion, by looking at all pitchers who switched leagues from one year to the next and comparing how they did. To smooth out some of the noise, I looked at 5-year samples (2 years before and after each season), giving more weight to the season I was measuring in my aggregate. In recent years, this adjustment is about .92 (meaning you multiply a pitcher's ERA by .92 when he goes from the AL to the NL). In the mid-90s, it got as low as .80. Separate adjustments were done for FIP and ERA; the two adjustments are similar, but the FIP adjustments seem to be a bit more stable from year to year and didn't go quite as low at their lowest as the ERA adjustments.
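One way to picture the 5-year smoothing is a weighted average of yearly NL-to-AL ratios centered on the season in question. This is only a sketch: the 1-2-3-2-1 weights and the yearly ratios are hypothetical (the actual weights are not given above), and averaging yearly ratios is a simplification of pooling the league switchers directly.

```python
def league_factor(ratios_by_year, season, weights=(1, 2, 3, 2, 1)):
    """Weighted average of yearly NL/AL ERA ratios for season-2 .. season+2."""
    years = range(season - 2, season + 3)
    pairs = [(ratios_by_year[y], w) for y, w in zip(years, weights) if y in ratios_by_year]
    total_weight = sum(w for _, w in pairs)
    return sum(r * w for r, w in pairs) / total_weight


# HYPOTHETICAL yearly ratios (NL ERA / AL ERA for pitchers who switched leagues).
ratios = {2004: 0.90, 2005: 0.93, 2006: 0.92, 2007: 0.91, 2008: 0.94}

factor = league_factor(ratios, 2006)
nl_equivalent_era = 4.50 * factor  # a 4.50 AL ERA expressed in NL terms
print(round(factor, 2), round(nl_equivalent_era, 2))
```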

The league adjustments were applied to all AL seasons. I did this because I ultimately want to look at what pitchers would do in a neutral environment, so I converted everyone's stats to a neutral NL park (hey, that sounds a lot like Busch Stadium). Once the adjustment was applied, I combined separate stints into full seasons (i.e. our previous Baltimore/Houston example is now simply counted as 1 season for the pitcher in a neutral environment rather than 2 stints in separate environments). These are the figures I plugged into the Marcel projections. I projected both ERA and FIP using this method.
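For readers unfamiliar with Marcel, the core idea can be sketched in a few lines: weight recent seasons more heavily, then regress toward the league average. This is a simplified stand-in, not the exact calculation used here. It assumes 3/2/1 season weights, skips the age adjustment entirely, and uses a made-up regression amount of 100 league-average innings. The three yearly ERAs are the unadjusted group figures quoted earlier; the 200-inning seasons and the 4.50 league ERA are hypothetical.

```python
def marcel_style_era(seasons, league_era, regression_innings=100.0):
    """Simplified Marcel-style ERA projection.

    seasons: list of (era, innings), MOST RECENT FIRST, up to three seasons.
    """
    weights = (3, 2, 1)  # assumed weights: most recent season counts most
    weighted_er = 0.0
    weighted_ip = 0.0
    for (season_era, innings), w in zip(seasons, weights):
        weighted_er += w * season_era * innings / 9.0
        weighted_ip += w * innings
    # Regress toward the mean by mixing in league-average innings (assumed amount).
    weighted_er += league_era * regression_innings / 9.0
    weighted_ip += regression_innings
    return 9.0 * weighted_er / weighted_ip


# ERAs from the article's 1, 2, and 3 years prior; innings are HYPOTHETICAL.
projection = marcel_style_era([(4.97, 200.0), (4.68, 200.0), (3.84, 200.0)], league_era=4.50)
print(round(projection, 2))
```

Because the most recent season carries the most weight, the projection lands closer to last year's ERA than a flat 3-year aggregate would.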

This gives us a much better idea of how each pitcher should be expected to pitch free from any of the above influences. Our group of 12 starters now projects to a neutralized 4.61 ERA and 4.64 FIP entering Duncan's care and ends up with a 4.16 ERA and 4.44 FIP. With the other 8 pitchers, they project to 4.63/4.64 and end up at 4.29/4.53. As we would expect, the resulting ERAs are much lower than the FIPs. The Cardinals' team ERA over Duncan's tenure has been .25 points lower than its FIP due to consistently good defenses.
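Since the ERA-versus-FIP gap carries so much of the argument, here is the usual FIP formula as a sketch. The constant is set each year so that league FIP matches league ERA; the 3.10 used below is an assumed placeholder, and the pitching line is hypothetical.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching from the outcomes fielders don't touch."""
    raw = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return raw / innings + constant


# HYPOTHETICAL season: 20 HR, 50 BB, 5 HBP, 180 K in 200 innings.
example_fip = fip(20, 50, 5, 180, 200.0)

# A pitcher whose ERA beats his FIP is getting help FIP doesn't credit to him,
# such as a good defense.
example_era = 3.20  # HYPOTHETICAL
gap = example_era - example_fip
print(round(example_fip, 3), round(gap, 3))
```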


What about Oakland?

The same thing can be done with Duncan's pitchers in Oakland. This time, the group of pitchers dropped their ERAs by about .2 points, but again, that was in front of mostly good defenses. The FIPs stayed about the same.

What does this all say about the Duncan Effect? To be honest, nothing definitive. We aren't looking at all at young pitchers Duncan might have to develop who came up through his own organization. It doesn't include all pitchers who joined the Cardinals, only ones with a three year track record elsewhere. This means pitchers like Chris Carpenter, who was coming off missing extended time from injuries when he joined the Cardinals, are not counted here because our methods don't do much to isolate Duncan's effect. We don't have nearly as large a sample as we'd like to decide anything for sure. Our use of FIP downplays Duncan's pitch-to-contact philosophy where utilizing those good defenses was more effective than FIP can account for. We can, however, tell a couple of important things. One, a pretty good chunk of the perceived Duncan Effect is due to other factors, probably most notably the defenses his teams have had. Two, those other effects don't cover all of the improvement we see in pitchers Duncan has coached, and they still did noticeably better than expected as a group. This does not prove a Duncan Effect, nor does it assign a real value to it, but it does support the claims and suggest that there is a good chance it does exist.

The Anti-Koufaxes

This Sunday, Jamie Moyer became the 46th pitcher in Major League history to win 250 games. Barring injury, he should pass Hall of Famers Bob Gibson, Carl Hubbell, and Red Faber at some point this year. The most remarkable thing about Moyer's career total, however, is not the names he's approaching, but how he got there. At age 30, Moyer and his mid-80s fastball were nearly out of the game. He had to that point accumulated a mere 34 career wins. The crafty lefty was coveted more for his mind than his arm and was reportedly offered minor league coaching jobs from teams that thought his career as a pitcher was done. Baltimore decided to take a chance on Moyer sticking around a few years more, and since then, he hasn't looked back. At age 30, he won 12 games and also posted career bests in ERA, WHIP, H/9, BB/9, and K:BB ratio. From that point on, he has gone on to win 216 games, meaning over 86% of his career victories have come since turning 30. Furthermore, he's had 86 victories since turning 40, more than twice the total he posted in his 20s and over a third of his career total.

Not surprisingly, Moyer's feats in his later years put him in pretty elite company. He's 7th all time in wins after turning 30 and 3rd in wins after turning 40. Here's the list of pitchers ahead of him in each:

Wins after 30            Wins after 40
Cy Young (316)           Phil Niekro (121)
Phil Niekro (287)        Jack Quinn (104)
Warren Spahn (277)       Jamie Moyer (86)
Gaylord Perry (240)
Randy Johnson (235)
Early Wynn (217)
Jamie Moyer (216)

Moyer will probably pass Wynn for 6th on the wins after turning 30 list this season.

As you can see, he's in pretty good company. He'll be only the second on either list, joining Jack Quinn, not to make the Hall of Fame. Can any of these pitchers match Moyer's obscurity up until 30, though?

Most of them turned up the jets to some degree after their typical primes. Spahn, Perry, and Wynn all collected between 72 and 76% of their victories after age 30. Spahn got a famously late start to his career after giving up 3 years to the War effort along with many of his peers, starting only 16 games at 25 before becoming a full-timer at 26. This certainly cut into his win totals, but he became dominant in a hurry, winning 21 games 3 times in his 4 full seasons in his 20s. Perry wasn't his multiple Cy Young self in his 20s, but he was an All-Star and a 20 game winner at 27 and had a 3.06 ERA, 12% better than the League average over his career to that point, through 1968. Wynn was a bit below average in both W% and ERA throughout his 20s, though he was a regular contributor to the Senators' rotation and even got MVP votes after going 18-12 with a 2.91 ERA as a 23 year old.

Johnson struggled with control and crappy teams throughout his 20s, but he was already showing flashes of the dominant pitcher he would become. He twice led the league in strikeouts in his 20s (he also led the league in walks three times, a feat that, unlike the former, he would never repeat after turning 30) and made 2 All-Star teams. Through his 20s, he was 12 games over .500, had a 108 ERA+, and had struck out more than a batter an inning throwing sliders harder than Moyer threw his fastball.

Jack Quinn was a good but largely insignificant pitcher through his younger years. He racked up the after 40 wins by pitching until he was 50.

Cy Young and his nearly 200 wins by age 30 politely excuse themselves from this discussion.

Phil Niekro, however, matches every bit of Moyer's pre-30 struggle and post-30 success. Even with a semi-breakout in his late-20s (he led the league in ERA in 1967 after starting the year in the bullpen), he had only 31 wins when he turned 30. Moyer's percentages of his career wins after 30 and 40 (86% and 34% respectively) easily surpass most of his rivals here, but fall short of Niekro's marks of 90% and 38%.
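The percentages are easy to check from the win totals already given. Niekro's career total here is simply his 31 wins before 30 plus his 287 after.

```python
def share(part, total):
    """Percentage of a career total."""
    return 100.0 * part / total


moyer_total = 250
moyer_after_30, moyer_after_40 = 216, 86

niekro_after_30, niekro_after_40 = 287, 121
niekro_total = 31 + niekro_after_30  # 31 wins before turning 30, per the article

moyer_pcts = (share(moyer_after_30, moyer_total), share(moyer_after_40, moyer_total))
niekro_pcts = (share(niekro_after_30, niekro_total), share(niekro_after_40, niekro_total))

print([round(p, 1) for p in moyer_pcts], [round(p, 1) for p in niekro_pcts])
```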

Moyer spent his age 29 season pitching for the Toledo Mud Hens after signing a minor league deal with the Tigers. He found a new hope of returning to the Majors that year, telling the Associated Press in an interview during the 1992 season, "I think a lot of guys are waiting for expansion. It definitely creates jobs." (1). In 1993, he signed a minor league deal to compete for the fifth starter job for the Orioles, who had just lost 1992 callup and pitching prospect Richie Lewis to the Marlins in the expansion draft. As noted above, Moyer made the team and took off from there.

Niekro never faced that kind of near-attrition, but he was toiling in Atlanta's bullpen into his late-20s. He began the 1967 season having just turned 28 and with only 1 career start to his name. By mid-June, the knuckleballer had finally won a shot at the rotation and didn't let the opportunity slip. He went 10-7 with 10 complete games and a sub-2 ERA in 20 starts that year. Still, he was set to return to at least part-time bullpen duty in 1968. A local paper noted both the surprise of his '67 success and the plan to return to bullpen duty in '68 in a Feb. 1968 article:

Phil Niekro will prove an even bigger surprise in 1968 than he was in 1967 if he can handle the double duty assignment planned for him by manager Luman Harris of the Atlanta Braves.

Niekro, who led the National League with a 1.87 earned run average last season, is being groomed for duty as both a starter and a reliever in an effort to stabilize the Braves' chaotic pitching situation. (2)


The plan never went into motion for Niekro, as he quickly proved himself too valuable to remove from the rotation. That year, he started 34 games (which led the team) and made only 3 relief appearances. He did not show the results that would turn him into a HOFer until his 30s, but finally, at 29, he had earned a regular starting role, and he would only get better from there.

Moyer and Niekro are two-of-a-kind in Major League history. Both soft-tossers enjoyed fairly obscure careers through their 20s only to turn in some of the best post-30 and post-40 careers the game has ever seen. No one else can match both their pre-30 irrelevance and their post-30 significance. Moyer is an extremely long shot to join Niekro in the HOF, but he's shown few signs of slowing down into his mid-40s (he even signed a multi-year deal with the Phillies this offseason), and his consistency and longevity in his later years have put him among the game's elite.

1 - St. Petersburg Times, July 21, 1992, AP story

2 - Rome News-Tribune, Feb. 27, 1968, story by Fred Down