Math Behind Projecting the Division Winner (THT Article)

Note: this article uses examples from the free statistical software R

In my Hardball Times article about projecting the number of wins we expect from the division winner, I included the following example:

Instead of having five baseball teams, let's say we have five coins. All we are going to do is flip each coin 162 times. Each time a coin lands on heads, it gets a win, and each time it lands on tails, it gets a loss. The coin with the most wins after 162 flips wins the division.

How many wins would you project for the coin that ends up winning the division, whichever coin that might be?

No coin by itself is going to have an expected value of more than 81 wins, but it is extremely likely that at least one out of the five coins will end up with more than 81 wins just by chance. It turns out that if you repeat this experiment a bunch of times, the coin that wins the division will end up with about 88 wins, on average.

Hopefully this makes sense conceptually, but how do I get 88 wins (or, more precisely, 88.3943...)?

One way, of course, is to actually do what I said, and flip a bunch of coins over and over and over and record the results. Let's say I repeat this experiment 10 times, and I get the following results for the "division winners":

94, 85, 89, 87, 89, 90, 82, 86, 85, 86

That is an average of 87.3--pretty good, but obviously not the most precise estimate. We need to repeat the experiment more than ten times to make sure we get something closer to the true mean. Rather than spend hours upon hours flipping coins, we can cheat and get a computer to pretend to do it for us. This is called simulation, and it can be a very powerful statistical tool for estimating probabilities, averages, distributions, etc., that are not computationally obvious (full disclosure: I actually cheated and simulated the 10 seasons rather than record and tally 8,000+ coin flips).

Now, let's simulate 1,000 seasons: this time, the division winner averages 88.5940 wins. Much better, but still a couple tenths off. Bumping the number of seasons up to 10,000, we get 88.4296. And if we keep simulating more and more seasons, the results will cluster more and more closely around 88.3943.
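That seasonal simulation can be sketched in a few lines of R (a minimal version with my own variable names; this is not the original code, just one way to do it):

```r
# simulate many five-team seasons of 162 coin-flip games and
# average the division winner's win total
set.seed(1)        # for reproducibility
seasons <- 10000   # number of simulated seasons
teams <- 5
n <- 162

# each season: five binomial(162, .5) win totals; keep the max
winner.wins <- replicate(seasons, max(rbinom(teams, n, 0.5)))
mean(winner.wins)  # clusters near 88.4
```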

So that's one way to estimate the expected win total for our division winner. How do I know that the results should cluster around 88.3943 specifically, though, other than simulating millions and millions of seasons?

We can get the answer without simulation by starting with a simpler question. What is the probability that none of the teams wins more than, for example, 81 games? The probability that one team wins no more than 81 games is a simple binomial distribution problem: pbinom(81,162,.5) ~ .5313. The probability that all five are at 81 or lower then becomes .5313^5 ~ .04233.

There is about a 4% chance that the division winner will have 81 or fewer wins. We can repeat that calculation for 80 wins, and we see that there is about a .02262 probability of the division winner having 80 or fewer wins. That means the probability of the division winner having exactly 81 wins is .04233 - .02262 = .01971.

Then, we repeat that process for every number from 0 to 162, and we end up with a table of probabilities of the division winner ending up on each possible number of wins. (If you were to do this by hand, you could shortcut a bit by only going from something like 70 to 115 since the probabilities outside that range are all virtually zero anyway.)

Finally, we multiply each possible win total by the probability of the division winner finishing with that number of wins, and we add up the results to get a mean for the distribution. And doing that gives us 88.3943.


#calculate expected mean value of division winner
p <- .5 #probability of each team winning each game
n <- 162 #number of games per season
teams <- 5 #number of teams in the division

games <- 0:n # list of possible win totals (0:162)
p.list <- pbinom(games,n,p)^teams # p of div winner winning X games or fewer
wts <- c(p.list[1],diff(p.list)) # p of div winner winning exactly X games
sum(games*wts) # average wins by division winner

[1] 88.39431

As we can see, it is possible to calculate the mean of this distribution exactly, but it is still pretty cumbersome to do so without a computer. As such, let's discuss one final way to estimate this mean using simpler calculations.

First, we will need a continuous distribution, so we use a normal approximation for the binomial distribution. The mean of the normal distribution will just be 81 (the average number of wins we expect from a team in our example), and the standard deviation will be sqrt(npq) = sqrt(162*.5*.5) ~ 6.36.

All we need to do now is find the point where there is a 50% chance that five numbers randomly sampled from this distribution will all fall below that number. Start by finding the percentile of the distribution that fulfils this condition:

p^5 = .5
p = .5^(1/5) ~ 0.8706

This means we want the point at the 0.8706 quantile of our normal distribution, which is simple to look up using an online tool or simple statistical software:

qnorm(0.8706,81,6.36) ~ 88.1849

That is our estimate for the expected number of wins from the division winner. This is slightly off because we are actually calculating the median and not the mean (and because we used a normal approximation, but that makes less difference), but it is still a pretty good estimate given the amount of calculation we simplified.
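The whole shortcut condenses to a few lines of R (same numbers as above; the variable names are mine):

```r
# normal approximation to the division winner's expected wins
n <- 162; p <- 0.5; teams <- 5
mu <- n * p                      # 81 expected wins per team
sigma <- sqrt(n * p * (1 - p))   # ~6.36

# point where all five teams fall below it with 50% probability
approx.wins <- qnorm(0.5^(1/teams), mu, sigma)
approx.wins                      # ~88.18
```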

More From Jesse Burkett (Hit Batsmen)

From the same news archive binge as the previous article, we get more from Jesse Burkett on the NL's rule changes, and...holy crap. Apparently there was a (thankfully short) period in the NL where hitting a batter only awarded the batter a ball, not a base:

"That rule penalizing the pitcher with only a ball for hitting a batter is a very bad one," said the great hitter. "My word on it, some of those pitchers will be bounding fast ones off the batter's ribs this season."*

Well said, Mr. Burkett. I can see why that change didn't stick.

The article also insinuates that part of the reason Cy Young jumped to the AL (which did not lessen the penalty for hitting batters) was that he thought the new HBP rule was dumb.

*The St. Louis Republic. (St. Louis, Mo.), 29 March 1901. Chronicling America: Historic American Newspapers. Lib. of Congress.

NL Institutes Pitch Clock, 1901

With MLB's newfound interest in speeding up the pace of play, it's easy to forget that MLB rules actually had a pitch clock in place before this year. Granted, it was virtually never enforced (I think I saw an automatic ball called for a clock violation once, and no one knew what was going on when it happened), but the rule was technically there.

I had no idea just how far back that rule went, though, until I saw this quote from Hall of Famer Jesse Burkett while browsing through some old sports pages (scroll/zoom to the highlighted word at the bottom right corner of the page):

I have been reading how the rule limiting the pitcher to twenty seconds on the slab before throwing will handicap "Cup". That is only a National League rule, and "Cup" is in the American, where the rule is not in force.*

"Cup" here is George Cuppy, a longtime teammate of Burkett's who had just signed with the newly formed American League. I don't know if the rule was on the books continuously from 1901-present time, but apparently the idea of a 20-second limit on pitchers dates back at least that far.

*The St. Louis Republic. (St. Louis, Mo.), 27 March 1901. Chronicling America: Historic American Newspapers. Lib. of Congress.

Jeff Manship and the Denny Bautista-line

Jeff Manship signed a minor league deal with Cleveland this past December.  These are the sorts of deals the Jeff Manships of the world get.  Manship has made two Opening Day rosters in his career--in 2011 and again in 2014--and he had to fight it out in Spring Training for both.  In 2011, he made it to April 17, just 3.1 IP over 5 games, before getting sent down.  Last year, he stayed up until July 23, but over a month of that time was spent on the DL.

No, there's nothing remarkable at all about Manship's contract with Cleveland. He's the type of player who is only even a free agent at all because no one wants to hand him an MLB roster spot, and he's out of options.  What is remarkable is that Manship keeps making the Majors anyway, every single year.  Since his first call-up in 2009, he has now spent time in the Majors for six years running.  And in every single one, he's had an ERA above 5.00.

Denny Bautista was, in some ways, a rather un-Manship-like prospect.  Manship was drafted in the 50th round out of high school, went to Notre Dame, and then signed as a 14th round pick three years later.  In 2008, he climbed to #9 on Baseball-America's list of the Twins' top ten prospects, only to drop back out of the list the following year.  John Sickels, who also had Manship as the Twins' #9 prospect in 2008, had him at the back end of their top 20 each of the next two years.

Bautista, meanwhile, was signed as a 17-year-old out of the Dominican Republic.  He was (and still is, presumably) the cousin of Ramon and Pedro Martinez.  He twice (in 2002 and 2004) cracked Baseball-America's top 100 prospect list, peaking at #59.  At 21, an age where Manship was still finishing up his career at Notre Dame and just starting off in the Gulf Coast and Florida State Leagues, Bautista was already in the Majors.

In spite of their different pedigrees, Jeff Manship and Denny Bautista ended up as very similar pitchers:  failed starters, journeyman relievers, shuttling up and down between cities like Minneapolis and Denver and Kansas City and cities like Rochester and Colorado Springs and Omaha.  They are so similar, in fact, that Denny Bautista is the only other pitcher in Major League history to keep succeeding in precisely the same sub-par way that Manship has.

There have been other pitchers who kept putting up ERAs above 5.00 and kept getting shots in the Majors.  A handful of them, including future Cy Young winner R.A. Dickey in his pre-knuckleball days, have even had ERAs above 5.00 in each of their first six seasons.  None of them, other than Manship and Bautista, have kept getting back to the Majors every single year, though.  There has always been a year or two in between somewhere where they languished in the minors without getting the call.

Of course, Dickey aside, there isn't much hope for success for these kinds of pitchers.  Kevin Jarvis somehow managed to stick around another six seasons and pitch past his 37th birthday after putting up 5+ ERAs in each of his first six years, but he was just as ineffective in those final six years as in the first six (140 ERA-/124 FIP- in his first six seasons, 135 ERA-/125 FIP- in his final six).  Everyone else disappeared pretty quickly.

As for Bautista, he did get a seventh year.  It was actually his best, at least by ERA.  He finally broke the 5.00 barrier and posted a 3.27 ERA in 2010, good for right about average for a reliever.  However, in a rather cruel statement about age and the ticking clock on failed prospects, this of all years was finally the year that failed to earn him another shot in the Majors.  The following June, he was released from Seattle's system and wound up pitching in Korea.  He's still around--he pitched in the Mexican league last year--but he hasn't been back in affiliated pro ball since.

It's not like you need any careful analysis to know that the outlook is not good for Manship's career, though.  I mean, he's a guy who has thrown 139.1 innings over the past six years with a 6.46 ERA and just got released by his third team in three years.  It's interesting, though, don't you think?  That he keeps finding his way back, year after year?  That even when you split his career ERA into 20-30 inning chunks (or 3.1 inning chunks, as was the case in 2011), they all still come in over 5.00?  Even the progression is interesting: his 6.65 ERA marked the third straight season that his ERA dropped from the year before (Manship's ERAs from 2010-2014:  8.10, 7.89, 7.04, 6.65).  And he could go right on dropping his ERA again and again for years to come and still not be any good.  That's amazing, in its own way.

If he ends up above 5.00 again this year, he would be the first pitcher ever to pitch in seven different MLB seasons and post an ERA that high in every one.  Here's something else interesting, though:  Steamer actually projects him for a 4.38 ERA this year.  That's...that's less than 5.00!  By a pretty fair amount!

When you think about it, the projection actually makes sense.  Even with ERAs consistently north of replacement level, teams have to be projecting him for something below 5.00, or they wouldn't bother calling him up.  And his fielding-independent numbers are actually...well, they're not good, but they're a lot better than his ERAs.  So there is a pretty good reason to believe he can break the Bautista-barrier if he finds his way back to the Majors this year.

Even so, every year that passes, Manship's career is on thinner and thinner ice.  It has to be--look what happened to Bautista, whose 3.27 ERA in year seven couldn't even save his career.  If he gets another shot, he probably will have the best ERA of his career, but there is a very real chance that it would still be his last year anyway.  Heck, there is a very real chance we've seen the last of Jeff Manship in the Majors already.  That is, of course, unless he starts working on his knuckleball.

(Possibly) the First Baseball Article I Ever Published

I was digging through some stuff the other day and came across an old newspaper from college that had what might be the first baseball article I ever published. It's not really anything in-depth or analytical--just a short opinion piece on the Bagwell contract situation that was in the news at the time. I think my writing style has definitely evolved since then, but it was interesting to see something I wrote so early in my development. Anyway, here's the article:

Bagwell article.PDF


Effects of Playing the Sun Field on OF Putouts per BIP

I recently looked into how playing the sun field affects an outfielder's defensive performance. I was inspired by Craig Wright's discovery that Babe Ruth regularly switched between left and right field throughout his career to avoid playing the sun field, as I wanted to know what kind of effect this knowledge would have on his defensive value.

As far as I can tell, there isn't much of an effect. You can stop reading here if you prefer your reading material interesting, but I'll detail my methodology below so that those interested can know what I mean when I say I didn't find an effect.


First, I compiled as best I could a list of sun fields for all open-air stadiums in the Retrosheet era (1950-2012 for my current database). Sun fields were estimated based on diagrams and images from the Seamheads ballpark database, and I was able to corroborate or correct some parks by looking around the web for written mentions of sun fields or photographs showing shadows during a game.

This is actually trickier than it sounds--you can get a decent idea of where the sun should set from maps or diagrams that include stadium orientation, but the sun's position also depends on the stadium's latitude and changes based on the time of day and time of year (which also means you can get conflicting results from photographs depending on when they were taken--see the shadows pointing to CF vs the shadows pointing to RF in Busch Stadium). Still, I did the best I could to identify a primary sun field, and while I doubt I came up with a perfect list, it should be good enough to detect an effect if there is one.

Once I had a list of sun fields for each stadium, I looked at putouts per ball in play for corner outfielders in each stadium. I then divided these into day and night games, so that I had average number of putouts per ball in play for left and right fielders in day and night games for each stadium. Using these figures, I checked the difference between PO/BIP between day and night games. Parks with roofs (retractable or not) and parks with CF sun fields were ignored.

From there, I checked how much the average PO/BIP went up or down for fielders playing the sun field. If playing the sun field impairs the fielder, then we should see a drop in performance from night to day games. However, it is also possible that playing the outfield is generally easier or harder in day games, so I also checked the change in putout rate for the opposite corner outfield position to use as a control group. Rather than compare the sun field's day game performance to its night game performance, I compared the change from night to day for the sun field to the change for the non-sun-field.

For example, at Busch Stadium in 2012, left fielders recorded putouts on 6.13% of balls in play during night games and 5.94% during day games. Right fielders were at 6.62% for night games and 7.49% for day games. That means left fielders dropped their PO/BIP by .0019 in day games, while right fielders raised theirs by .0087. I have right field as Busch's primary sun field, so the sun field was associated with a gain of .0106 putouts per ball in play over the control group in day games.

Doing this for every season in every stadium included in the study, I got an average* of 0.0002 gain in PO/BIP for the sun field over the non-sun-field, which is practically zero and very slightly in the wrong direction to indicate an effect.

*the average was a weighted mean, with the weight given to each stadium-season being the harmonic mean of day BIP and night BIP. For example, 2012 Busch Stadium had 2872 night BIP and 1549 day BIP, which is a harmonic mean of about 2012.5.
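The Busch Stadium arithmetic and the harmonic-mean weight can be sketched in R (a minimal illustration of the calculation described above; variable names are mine):

```r
# night-to-day putout-rate changes, 2012 Busch Stadium
lf.diff <- 0.0594 - 0.0613       # LF (non-sun-field): -.0019
rf.diff <- 0.0749 - 0.0662       # RF (sun field):     +.0087

# sun field's change relative to the control corner
sun.effect <- rf.diff - lf.diff  # +.0106

# harmonic-mean weight from night and day balls in play
night.bip <- 2872
day.bip <- 1549
wt <- 2 / (1/night.bip + 1/day.bip)  # ~2012.5
```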

Since I was concerned that poor data on which field was the sun field may have masked any potential effect, or that only some parks might have a bad sun field, I checked to see if individual stadiums displayed any effect. If that were the case, it should still show up in the overall data as a diminished but still visible effect, but it was worth checking. Individual stadiums did vary from zero effect, but not any more than they would by random chance. When splitting stadium-seasons into even and odd numbered years, there was no correlation between the observed effect for a stadium in even years versus the same stadium in odd years.

Finally, I checked the same thing for individual fielders, to see if there was any evidence that particular fielders had notable trouble with the sun field that would show up in PO/BIP. The result was the same as the test for individual parks--fielders varied from zero effect but no more than expected by chance, and the even-odd season correlation for fielders was 0.

This does not necessarily mean that the sun does not affect fielders--I assume that when the ball is actually in the sun, it adds a great deal of difficulty. It is likely that this does not happen often enough to significantly alter a fielder's defensive numbers, though. At the very least, it appears that finding an effect would require much more precise data. For example, you could probably find something by using the sun's position at the time of the play and the trajectory of the batted ball to identify specific plays that are likely affected. Even if this data were available, however, it would be impossible to use it to evaluate Ruth specifically, and the overall effect I saw indicates that there is likely no need to adjust his defense valuation down simply because he rarely played the sun field.

Did Adrian Peterson Really Outgain Eric Dickerson?

A couple years ago, I wrote about how rounding errors affect yardage gains in football.  The general rule was that, assuming the rounding error on each play is independent, the total rounding error follows a normal distribution with parameters mean = 0 and SD = sqrt(number of plays/12).

I began thinking about this again for two reasons.  One, Adrian Peterson just came within 9 yards of Eric Dickerson's season rushing record.  With 348 rushes for Peterson and 379 for Dickerson, that comes out to a standard deviation for the combined rounding errors of 7.8 yards, and about a 12% chance that the 9 yard difference is entirely due to rounding errors.

The other reason is that Brian Burke pointed out in the comments of the original article that the rounding errors of plays in the NFL are not independent.  The total yardage gain for each drive has to round off to the correct figure.  From Brian's comment:

"One other way to state this is that if a team has 2 plays in a row, and one goes for 4.5 yards but is scored as 4, and the next goes for 5.5 yds, it can't be scored as 5. It must be scored as a 6 yd gain because the ball is very clearly 10 yds further down field, not 9."

I wanted to try to account for this constraint and see how much difference it would make.

Note: the following is mostly dry and math-related, so if you want to skip it, I estimate the chance of rounding errors covering the 9 yard difference between Dickerson and Peterson at about 14%.
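Under the independence assumption, the 12% figure for Peterson and Dickerson falls straight out of the normal distribution (a sketch in R; variable names are mine, and the 14% figure requires the drive-level correction discussed after the jump):

```r
# combined rounding error for Peterson (348 rushes) and Dickerson (379),
# assuming each play's rounding error is independent and uniform on (-.5, .5)
plays <- 348 + 379
sd.err <- sqrt(plays / 12)          # ~7.8 yards

# chance the total rounding error covers the 9-yard gap
p.round <- 1 - pnorm(9, 0, sd.err)
p.round                             # ~0.12
```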
