
Did Adrian Peterson Really Outgain Eric Dickerson?

A couple years ago, I wrote about how rounding errors affect yardage gains in football.  The general rule was that, assuming the rounding error on each play is independent, the total rounding error follows a normal distribution with parameters mean = 0 and SD = sqrt(number of plays/12).

I began thinking about this again for two reasons.  One, Adrian Peterson just came within 9 yards of Eric Dickerson's season rushing record.  With 348 rushes for Peterson and 379 for Dickerson, that comes out to a standard deviation for the combined rounding errors of 7.8 yards, and about a 12% chance that the 9 yard difference is entirely due to rounding errors.
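The arithmetic behind those two figures can be sketched in a few lines of Python. This is only a rough check, assuming each carry's rounding error is independent and uniform on plus or minus half a yard (which is where the 1/12 comes from) and reading the 12% as a one-sided normal tail; the function names are made up for illustration.

```python
import math

def rounding_sd(n_plays):
    """SD of the summed rounding error, assuming each play's error is
    independent and uniform on [-0.5, 0.5] (variance 1/12 per play)."""
    return math.sqrt(n_plays / 12)

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Peterson's 348 carries plus Dickerson's 379 carries
sd = rounding_sd(348 + 379)
p_gap = upper_tail(9 / sd)

print(round(sd, 1))     # 7.8
print(round(p_gap, 2))  # 0.12
```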

The other reason is that Brian Burke pointed out in the comments of the original article that the rounding errors of plays in the NFL are not independent.  The total yardage gain for each drive has to round off to the correct figure.  From Brian's comment:

"One other way to state this is that if a team has 2 plays in a row, and one goes for 4.5 yards but is scored as 4, and the next goes for 5.5 yds, it can't be scored as 5. It must be scored as a 6 yd gain because the ball is very clearly 10 yds further down field, not 9."

I wanted to try to account for this constraint and see how much difference it would make.

Note: the following is mostly dry and math-related, so if you want to skip it, I estimate the chance of rounding errors covering the 9 yard difference between Dickerson and Peterson at about 14%.



THE EMPTY SET: Reflecting on Cooperstown’s Lost Year

A sea of people stretched across the field and masked the green grass with Cardinal red.  There was Bob Feller mingling across the fence beside the stage.  There was Frank Robinson.  There was Stan Musial.  Somewhere, on our side of the fence, was Tug McGraw.

We were all there for Ozzie.  There were a few scattered Phillie fans there for Harry Kalas, that year’s Frick Award recipient, if you looked carefully for the different insignias on their caps.  Here and there you'd see a maroon Mike Schmidt throwback.  Other than that, it was just thousands of red-clad fans fixated on the wizard of a shortstop standing at the podium before us.

"This is awesome."  It was the first my dad, uncle, brother, and I had seen of Induction Weekend.  "We've got to come back in five years."

Five years is, of course, the waiting period for retired players before they become eligible for the Hall of Fame.  Three of my generation's great players had just retired.  And one was another beloved Cardinal.

**********

The BBWAA announced the results of their Hall of Fame balloting last Wednesday.  No one got in.  Barry Bonds didn't get in.  Roger Clemens didn't get in.  Not Biggio, not Bagwell.  Not Jack Morris.  Not Piazza, Trammell, Raines, Schilling, Martinez, Walker (either one), or Lofton.  Not McGwire or Sosa or Palmeiro.  Not even Shawn Green.

Someone will get in.  In 1996, the last year no one met the 75% threshold, there were six players on the ballot (Niekro, Perez, Sutton, Santo, Rice, and Sutter) who would get in eventually.  That's how it always is; every ballot has several candidates who will get in someday.

Biggio will get in.  Every player who has ever gotten Biggio's level of support early in his candidacy has had no trouble getting elected sooner rather than later.  Bagwell is at that high early level of support where almost everyone gets in eventually.  Piazza even more so.

Jack Morris will probably get in as a Veterans Committee selection someday.  Schilling will probably get in someday.  Eventually, as the electorate gets a bit younger, Tim Raines will probably find the remaining votes he needs to get in, barring a complete disaster with the current and upcoming logjam that might never clear up before he falls off the ballot.

Maybe they won't all get in.  But some of them will, and maybe some of the others as well.  Trammell is the type of guy who could finally get his due when the Hall puts together a VC for his era.  Edgar Martinez could pick up some support as the voters begin to accept that the DH is now part of the game.  The voters, or the Hall, might someday come around on Bonds and Clemens.

Someone is going to get in.  Definitely Biggio.  Very likely Jack Morris.  They're just going to have to wait.  So too will Cooperstown, which swells up with tens of thousands of tourists (and their wallets) every July except this one.


On Miguel Cabrera, Value, and the Triple Crown

“In ’67, the triple crown was never even mentioned once.  We were so involved in the pennant race, I didn’t know I won the triple crown until the next day, when I read it in the paper.”
-Carl Yastrzemski to the Boston Herald, published September 26, 2012

“Is it too early to say that [Cabrera] has a legitimate shot at a Triple Crown this season hitting in front of Fielder? I don't think so.”
-Fox News sports article, published April 13, 2012


The Triple Crown has grown in stature over the years.  That’s not to say it wasn’t a big deal before, but reporters now are asking Carl Yastrzemski about someone else winning it faster than they ever asked him about winning it himself.  In 1942, when Ted Williams won it, no one even had a list of previous winners compiled.  An AP reporter had to research it for his story on Williams’ feat, and he still missed the most recent occurrence (Joe Medwick, whose Triple Crown just five years earlier escaped detection).

Back then, it was a cool thing.  It wasn’t necessarily the historic thing it’s become.  It didn’t yet carry the mythical ethos of the pantheon-dwellers -- Williams, Mantle, Yaz, Frank Robinson, etc -- who could once do what for so long escaped their modern counterparts.  When someone won it, it didn’t carry the weight of a whole generation of fans who grew up hearing about it and never seeing it.  It was just a cool thing.

I can see getting excited about it.  It’s an impressive feat.  It’s something we’ve waited for for a long time.  It's something only a handful of the greats have even done.

And yet, I have a hard time getting excited.  It was a great season, sure.  A wonderful season at the plate.  But the best season I’ve ever seen?  Not close.  Which means I’ve seen a lot of non-Triple-Crown seasons that were better, because this is the first Triple Crown of my lifetime.  You don’t even have to look that hard to find a better season.  There’s another one right in front of our noses.

I’m talking, of course, about Miguel Cabrera’s 2011 season.

I know that seems, at least on the surface, like a bit of a contrarian statement.  How could he have been better when he hit 14 fewer home runs and drove in 36 fewer runs and didn’t, I don’t know, win the first Triple Crown in four and a half decades?  I don’t mean it as a contrarian viewpoint, though.  I just think Cabrera hit better in 2011 than in 2012.

Let me explain myself.  First, we need to establish what we mean by “better”.

I grew up with a fairly traditional baseball upbringing.  I was the son of a catcher who was the son of a catcher, saved only from the tools of ignorance myself by a bad case of sinistrality (a condition my dad only fully forgave me for when my younger sister took up softball and inherited his old gear).  I learned the game from proud field generals who would rather hold their ground to a hard-charging runner than hit a home run, even if they dropped the ball in the process.

That’s not a bad way to learn the game.  It was a great way to learn it.  But part of that upbringing was growing up thinking that Rickey Henderson was Lou Brock-Lite, and that Ted Sizemore was the ideal #2 hitter, and that Tony Gwynn was the best hitter in the game.  Part of that was drafting Ozzie Smith for my first fantasy league in a three-team-deep league.

It’s not that those things are necessarily wrong.  I don’t remember or care what happened in that fantasy league, other than that I remember drafting my favourite player.  I don’t remember or care how many runs the Padres scored with Tony Gwynn anchoring their lineup, or how many games they won.  I remember that watching Tony Gwynn was unlike watching anyone else in baseball, because you felt like you knew you were going to see something happen.  He was going to put the ball in play, and the defense was going to scramble to field it.  When Tony won, it felt like he won because he could almost place the ball at the spot where it landed.  When the defense won, it felt like they got away with one.  It was exciting to someone who learned the game the way I did.

As far as baseball is a game of entertainment, maybe Tony Gwynn was the best hitter in the game.  Arguing for Tony Gwynn over Frank Thomas, or Barry Bonds, or Fred McGriff, or a handful of other guys as a hitter, though, isn’t really an argument of value or production.  It’s an argument of what “best” means to begin with.  He was better at some things, yeah.  Maybe better at the things that are most important to you.  At some point, though, it started to hit me that, whatever abstract ideals I might hold about what a hitter should be, the very concrete objective of all hitters is the same.  They hit as best they can to win games, and they do so by helping to score runs.

That’s something that’s hard to measure when your statistical upbringing comes mostly from Topps and Donruss.  How many runs is Gwynn’s AVG worth?  How many runs are Thomas’ walks and extra base hits worth?  I don’t know.  It doesn’t say on the back of the card.  We all know when we watch a game that getting on base is important, that making outs is bad, and that getting to second or third is better than getting to first.  How much better?  I don’t know.  And so the argument becomes about what best actually means, because the units of measurement are not helpful.


Clutch, WPA/LI, and the Home Run Bias

The stat Clutch, as published on FanGraphs and Baseball-Reference, is designed to quantify how much better or worse a hitter has produced in situations based on how critical those situations are in the immediate context of a game.  Players who perform better in more critical situations (for example, late in a close game) than they normally do will have a positive Clutch rating, and players who perform worse in such situations will have a negative Clutch.  It does this by comparing two values for a hitter:  his WPA and his WPA.LI.  I will assume you are familiar enough with these two stats (not necessarily their inner workings, but at least what they are) as a prerequisite for this piece; if not, you can catch up on B-R's or FanGraphs' explanation pages.

WPA.LI follows two key constraints.  The first is that, for a given game state (i.e. the inning, the score, the number of outs, and the placement of any runners on base), the relative value of a play is determined by how much that play affects the team's chances of winning.  If the bases are empty, a walk is credited the same as a single.  If the bases are loaded with the winning run on third, a walk is credited the same as a home run.  This constraint works exactly like WPA (as one might expect from a WPA-based metric).

The second constraint differentiates WPA.LI from WPA.  One of the properties of WPA is that some situations are inherently weighted more strongly than others.  A key at bat late in a close game can swing a team's chances of winning by several times as much as the same result in a blowout, and it is credited accordingly.  WPA.LI, on the other hand, ensures that the average play in every situation gets the same weight.

So, on the one hand, you have WPA, which weights PAs according to their immediate impact on the game.  One clutch PA might be worth as much as 4 or 5 normal PAs, and one mop-up PA might be worth practically nothing.  On the other hand, you have WPA.LI, which weights every PA equally, just like most other stats do.  Basically, it is linear weights, but with the ability to tailor the value of each event to the specific situation rather than sticking to a blanket value for each event across all situations.  While WPA tells the story of clutch hitting (who got the big hit when the team most needed production), WPA.LI tells the story of situational hitting (who got on base when the team needed baserunners, put the ball in play when the strikeout was most costly, or hit for power when advancing runners quickly was more important than getting another guy on first).

There is a third important constraint which WPA.LI does not adhere to, however.  Ideally, the average value of each event would match its linear weights value.  If a home run is worth 1.4 runs above average across all situations, then you would like the average WPA.LI value of a HR to be 1.4 runs (or rather, the equivalent value on the wins scale).  That is not the case, however.

The following linear weights values represent the average change in run expectancy (RE) and win expectancy (WPA) for each event across all situations, along with the average WPA.LI value of each event.  All three versions have been placed on the runs scale by setting the value of the out at -.27 in order to make them easier to compare directly:



Event   RE      WPA     WPA.LI
1B      0.47    0.47    0.44
2B      0.77    0.75    0.75
3B      1.05    1.06    1.04
HR      1.41    1.42    1.58
BB      0.31    0.30    0.31
K      -0.29   -0.30   -0.29
out    -0.27   -0.27   -0.27

As you can see, WPA.LI does fine at assigning the correct value to most events, but the value of the HR is way off.  This may seem counterintuitive; if WPA.LI just creates custom linear weights for each situation based on the WPA values, why would the average WPA.LI value be different from the average WPA value?  We can look at the mathematical relationship between WPA and WPA.LI to see why this is.

The Pujols Decision: One Fan's Reflections

Stan Musial is the man in St. Louis.  Nearly 50 years after Musial last played for the Cardinals, he remains the undisputed king of Cardinal baseball.  His statue alone stands tall outside the main entrance to Busch Stadium, a few hundred feet south of the plaza where all the lesser (albeit much more attractive) statues of other Cardinal greats sit.  For decades, no one in St. Louis thought they would ever see a player rival Musial.

And then Albert came along.  Just one year and one Bobby Bonilla injury removed from his professional debut as a 13th round draft pick, Pujols was in the starting lineup and lighting up the National League.  He hit for average.  He hit for power.  He got on base.  He eventually learned to play a very good first base.  For the first time, St. Louis fans saw a player and thought, "this could be the guy who tops Musial."

The accolades came.  The MVPs (three of them, same as Musial), the All Star appearances, the Silver Sluggers and Gold Gloves, the home runs and hits and RBIs; all of them flocked to Pujols' Baseball-Reference page like moths to Matt Holliday's ear.

The wins followed.  Led by Pujols' success, the team made the playoffs 7 out of 11 seasons, winning 3 pennants and 2 World Series along the way.  From 2001-2011, only the high-spending Yankees and Red Sox won more games than did Pujols' Cardinals.  Pujols was the best player in the game, a superstar of an order the franchise had not seen in decades.  Fans watched in awe and wondered how high his career would stack by the time it ended.

Pujols was, over his 11 years with St. Louis, remarkably similar to Musial when Musial was at his best.  Compare Pujols’ career in St. Louis to Musial’s best 11 year stretch (1943-54):



Player              PA      H       RBI     HR      BB      R
Musial (1943-54)    7564    2251    1174    281     990     1301
Pujols (2001-11)    7433    2073    1329    445     975     1291




Player              AVG     OBP     SLG     wRC+    brWAR   fWAR
Musial (1943-54)    .346    .434    .591    171     88      98
Pujols (2001-11)    .328    .420    .617    167     84      88


In both traditional counting totals and more sabermetric evaluations, the two come up as near equals.  Musial got on base a bit better (in an environment where hitters got on base more than they do today) while Pujols hit for more power (in an environment where hitters hit for more power than they did in Musial’s day).  The two were comparable fielders, good for their position, but at the weak end of the fielding spectrum.

Musial rates slightly better in both Baseball-Reference’s and FanGraphs’ implementations of WAR, but they are close enough that which one you would pick will largely depend on how you approach the different eras (i.e. how you want to adjust for things like integration, expansion, population growth, international development, improved scouting, the war years, etc).  They’re close enough that it would be reasonable to take the position that no Cardinal fan has ever seen one of their own play at a higher level than Pujols has over his 11 years with the team, not even Musial.  It’s not a slam-dunk position; maybe you still take Musial.  But, for the first time since Musial retired, you’d probably at least have to think about it.

Watching Pujols play ignited Cardinal fans like watching Musial did, and we loved every minute of it.  Naturally, we wanted that to continue.  We wanted another all-time great to stay a career Cardinal.  Then, out of nowhere, the report swept in from the winter meetings that Pujols had signed with the Angels.  No build up, nothing.  No one had even talked about the Angels in the weeks of negotiating that led off the offseason.  Just like that, he was gone.




Win Expectancy and Leverage Index tables, R Code

This post is just a quick dump of some code you can use to create win-expectancy and leverage index tables like what I used for my recent Baseball PreGUESTus article. It is written for the free statistical program R, and it builds upon the excellent work on run-expectancy and run distribution tables done by Sobchak at ChancesIs.com.

In order to run this code, you will need R with the package plyr installed. You will also need the file bo_transitions.csv from ChancesIs (either the CSV file hosted on that site, or one created using a similar query to the one Sobchak published) and the file game_state_frequency.csv, which you can copy from this table. Sobchak's data and the game_state_frequency table are from the years 1993-2010. You can collect the data for other years by altering Sobchak's SQL query and this game_state_frequency query.

*Note: you only need game_state_frequency.csv for calculating LI. You don't need it if all you want is a WE table.


Once you have those files on your computer, you can construct a win-expectancy table with the following R code:

Win Expectancy Table, R code

You will have to change the line
setwd("/Users/Seshoumaru/Desktop/untitled folder/baseball/run-win expectancy")

to the folder path where you saved the necessary CSV files.

The win expectancy values are generated based on Sobchak's simulated run distributions. It is currently set to run 100,000 simulated innings from each state to estimate the distributions. You can raise the number of simulations to increase the precision, but it will take longer to process. On my computer, 100,000 simulations took about 4 minutes to run. 1,000,000 simulations took about an hour. The win expectancies themselves are not simulated, however.

The code limits run scoring to 16 runs for the remainder of the inning you are in, plus 16 runs total for the rest of the game. This is done to greatly reduce processing time. The generated tables cover scores from the home team being down 16 to up 16 (all score differentials are from the perspective of the home team).

The above code assumes equal run distributions for both teams. With a few changes, you can alter the code to include home-field advantage by using separate distributions for the home and away teams. To do this, you will need to alter Sobchak's query to create additional bo_transition files for just the home team and just the away team (called bo_transitions_home.csv and bo_transitions_away.csv). Once you have added those files, you can run the following code:


Win Expectancy Table, HFA version, R code




Regression to the Mean and Beta Distributions

This morning, a discussion of regression to the mean popped up on Phil's and Tango's blogs. This discussion touches upon some of the recent work I've been doing with Beta distributions, so I figured I'd go ahead and lay out the math linking regression to the mean with Bayesian probability with a Beta prior.

Many of the events we measure in baseball are Bernoulli trials, meaning we simply record whether they happen or not for each opportunity. For example, whether or not a team wins a game, or whether or not a batter gets on base are Bernoulli trials. When we observe these events over a period of time, the results follow a binomial distribution.

When we observe these binomial events, each team or player has a certain amount of skill in producing successes. Said skill level will vary from team to team or player to player, and, as a result, we will observe different results from different teams or players. Albert Pujols, for example, has a high degree of skill at getting on base compared to the whole population of MLB hitters, and we would expect to observe him getting on base more often than, say, Emilio Bonifacio.

The variance in talent levels is not the only thing driving the variance in observed results, however. As with any binomial process (excepting those with 0% or 100% probabilities, anyway), there is also random variance as described by the binomial distribution. Even if Albert's on-base skill is roughly 40%, and Bonifacio's is roughly 33%, it is still possible that you will occasionally observe Emilio to have a higher OBP than Albert over a given period of time.
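To put a rough number on "occasionally", here is a small Python sketch that computes the exact chance of the lower-skill hitter reaching base more often when both get the same number of PAs. It assumes every PA is an independent trial at a fixed skill level (.330 and .400, the rough figures above); the function names and the sample sizes are just illustrative choices.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_lower_skill_ahead(n, p_low, p_high):
    """P(lower-skill hitter reaches base strictly more often), n PAs each."""
    low = [binom_pmf(k, n, p_low) for k in range(n + 1)]
    high = [binom_pmf(k, n, p_high) for k in range(n + 1)]
    total = 0.0
    cum_high = 0.0  # running P(high-skill hitter has fewer than k times on base)
    for k in range(n + 1):
        total += low[k] * cum_high
        cum_high += high[k]
    return total

# A few weeks, a couple of months, and roughly a full season of PAs
p50 = prob_lower_skill_ahead(50, 0.330, 0.400)
p200 = prob_lower_skill_ahead(200, 0.330, 0.400)
p600 = prob_lower_skill_ahead(600, 0.330, 0.400)
print(p50, p200, p600)
```

The probability shrinks as the sample grows, but it never reaches zero, which is the point: observed OBP mixes talent with binomial noise.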

In baseball, it is a practical problem that we do not know the true probability linked to each team's or player's skill, only their observed rate of success. Thus, if we want to know the true talent probability, we have to estimate it from the observed.

One way to do this is with regression to the mean. Say that we have a player with a .400 observed OBP over 500 PAs, and we want to estimate his true talent OBP. Regression to the mean says we need to find out how much, on average, our observed sample will reflect the hitter's true talent OBP, and how much it will reflect random binomial variation. Then, that will tell us how many PAs of the league average we need to add to the observed performance to estimate the hitter's true talent.

For example, say we decide that the number of league average PAs we need to add to regress a 500 PA sample of OBP is 250. We would take the observed performance (200 times on base in 500 PAs), and add 82.5 times on base in 250 PAs (i.e. the league average performance, assuming league average is about .330) to that.

$$\frac{200 + 82.5}{500 + 250} = \frac{282.5}{750} = .377$$

Therefore, regression to the mean would estimate the hitter's true OBP talent at .377.

As Phil demonstrated, once you decide that you have to add 250 PAs of league average performance to your sample to regress, you would use that same 250 PA figure to regress any OBP performance, regardless of how many PAs are in the observed sample. Whether you have 10 observed PAs or 1000 observed PAs, the amount of average performance you have to add to regress does not change.
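As a quick sketch of that calculation in Python (the function name is mine, and the 250 PA and .330 figures are just the example values from above), the same fixed amount of league-average performance is added no matter how large the observed sample is:

```python
def regress_to_mean(successes, trials, league_rate, k):
    """Add k trials of league-average performance to the observed sample."""
    return (successes + league_rate * k) / (trials + k)

# The .400 hitter over 500 PAs from the example
print(round(regress_to_mean(200, 500, 0.330, 250), 3))   # 0.377

# The same .400 observed OBP over 10 PAs and over 1000 PAs
print(round(regress_to_mean(4, 10, 0.330, 250), 3))      # 0.333
print(round(regress_to_mean(400, 1000, 0.330, 250), 3))  # 0.386
```

The small sample gets pulled almost all the way back to league average, while the large one keeps most of what was observed, even though the 250 PAs added are identical in each case.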

Now, how would one go about finding that 250 PA figure? One way is to figure out the number of PAs at which the random binomial variance is equal to the variance of true talent in the population.

Start by taking the observed variance in the population. You would look at all hitters over a certain number of PAs (say, 500, for example), and you might observe that the variance in their observed OBPs is about .00132, with the average about .330. The observed variance is equal to the sum of the random binomial variance and the variance of true OBP talent across the population of hitters. We don't know the variance of true talent, but we can calculate the random binomial variance as p(1-p)/n, where p is the probability of getting on base (.330 for our observed population) and n is the observed number of PAs (500 in this case). For this example, that would be about .00044. Therefore, the variance of true talent in the population is approximately:

.00132 - .00044 = .00088

Next, we find the number of PAs where the random binomial variance will equal the variance of true talent:

p*(1-p)/n = true_var

.330*(1-.330)/n = .00088

n = .330*(1-.330)/.00088 ≈ 250
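Those steps can be sketched in Python as follows. The function name is hypothetical, and the inputs are the example figures from above (a .330 mean and an observed variance of .00132 among hitters with 500 PAs). Run on the unrounded intermediate values, the result lands a little above 250 rather than exactly on it, since the figures in the text are rounded.

```python
def regression_constant(p, observed_var, n):
    """Number of PAs at which random binomial variance equals true-talent variance."""
    binomial_var = p * (1 - p) / n          # random variance at n PAs
    true_var = observed_var - binomial_var  # what is left over is talent
    return p * (1 - p) / true_var

k = regression_constant(0.330, 0.00132, 500)
print(k)
```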


We can also approach the problem of estimating true talent from observed performance using Bayesian probability. In order to use Bayes, we need to make an assumption about the distribution of true talent in the population the hitter is being drawn from (i.e. the prior distribution). We will assume that true talent follows a Beta distribution.

Return now to our .400 observed OBP example. Bayes says the posterior distribution (i.e. the distribution of possible true talents for a hitter drawn from the prior distribution after observing his performance) is proportional to the product of the prior distribution and the likelihood function (i.e. the binomial distribution, which gives the likelihood of the observed OBP for each possible true talent level).

The prior Beta distribution is:

$$\frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha,\beta)}$$

where B(α,β) is a constant equal to the Beta function with parameters α and β.

The binomial likelihood for observing s successes in n trials (i.e. the observed on-base performance) is:

$$\frac{n!}{s!\,(n-s)!} \cdot x^{s} (1-x)^{n-s}$$

where x is the true probability of a success.

Next, we multiply the prior distribution by the likelihood distribution:

$$\frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha,\beta)} \cdot \frac{n!}{s!\,(n-s)!} \cdot x^{s} (1-x)^{n-s}$$


Combining the exponents for the x and (1-x) factors:


$$\frac{x^{\alpha+s-1} (1-x)^{\beta+n-s-1}}{B(\alpha,\beta)} \cdot \frac{n!}{s!\,(n-s)!}$$


Separating the constant factors from the variables:

$$\frac{n!}{s!\,(n-s)!\,B(\alpha,\beta)} \cdot x^{\alpha+s-1} (1-x)^{\beta+n-s-1}$$


This product is proportional to the posterior distribution, so the posterior distribution will be the above multiplied by some constant in order to scale it so that the cumulative probability equals one. Since the left portion of the above expression is already a constant, we can simply absorb that into the scaling constant, and the final posterior distribution then becomes:


C * x^(α + s - 1) * (1-x)^(β + n - s - 1)


Notice that the above distribution conforms to a new Beta distribution with parameters α+s and β+n-s, and with a constant C = 1/B(α+s,β+n-s). When the prior distribution is a Beta distribution with parameters α and β and the likelihood function is binomial, then the posterior distribution will also be a Beta distribution, and it will have the parameters α+s and β+n-s.
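If you would rather check that numerically than take the algebra on faith, the following Python sketch normalizes the prior-times-likelihood product by brute force and compares it with the closed-form Beta(α+s, β+n-s) density. The prior parameters 82.5 and 167.5 are the rounded values this post arrives at further down; the grid size and helper names are arbitrary choices of mine.

```python
import math

def log_beta_fn(a, b):
    """log of the Beta function B(a, b), via log-Gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_density = (a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_beta_fn(a, b)
    return math.exp(log_density)

def log_prior_times_likelihood(x, a, b, s, n):
    """log of [Beta(a, b) prior * binomial likelihood], constant factors dropped."""
    return (a + s - 1) * math.log(x) + (b + n - s - 1) * math.log(1 - x)

# Prior and observed sample from the running example
alpha, beta, s, n = 82.5, 167.5, 200, 500

# Normalize the product numerically with a midpoint rule on (0, 1)
steps = 20000
grid = [(i + 0.5) / steps for i in range(steps)]
logs = [log_prior_times_likelihood(x, alpha, beta, s, n) for x in grid]
peak = max(logs)
weights = [math.exp(v - peak) for v in logs]  # rescaled so the peak is 1
area = sum(weights) / steps

# Largest gap between the numeric posterior and the closed-form Beta density
max_gap = 0.0
for i in range(0, steps, 500):
    exact = beta_pdf(grid[i], alpha + s, beta + n - s)
    max_gap = max(max_gap, abs(weights[i] / area - exact))
print(max_gap)
```

The gap should come out negligibly small, which is just the numerical way of saying the scaling constant C really is 1/B(α+s, β+n-s).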

We still need to choose values for the parameters α and β for the prior distribution. Recall from the regression example that we found a mean of .330 and a variance of .00088 for the true talent in the population (i.e. the prior distribution), so we will choose values for α and β that give us those values. For a Beta distribution, the mean is equal to:

α/(α+β)

and the variance is equal to:

$$\frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}$$


A bit of algebra gives us values for α and β of approximately 82.5 and 167.5 respectively. That means the posterior distribution will have as parameters:

α+s = 82.5 + 200 = 282.5
β+n-s = 167.5 + 500 - 200 = 467.5

and a mean of

$$\frac{282.5}{282.5 + 467.5} = \frac{282.5}{750} = .377$$
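The "bit of algebra" amounts to solving the mean and variance formulas for α and β: since the variance can be written as mean*(1-mean)/(α+β+1), we get α+β = mean*(1-mean)/variance - 1, and the mean then splits that total between α and β. A minimal Python sketch of this and of the posterior update (function name mine, inputs from the example):

```python
def beta_params(mean, var):
    """Solve for Beta parameters with the given mean and variance."""
    total = mean * (1 - mean) / var - 1  # this is alpha + beta
    return mean * total, (1 - mean) * total

alpha, beta = beta_params(0.330, 0.00088)
print(round(alpha, 1), round(beta, 1))  # 82.6 167.7

# Update on the observed 200 times on base in 500 PAs
s, n = 200, 500
post_alpha = alpha + s
post_beta = beta + n - s
post_mean = post_alpha / (post_alpha + post_beta)
print(round(post_mean, 3))  # 0.377
```

The unrounded parameters differ slightly from the 82.5 and 167.5 used in the text, but not by enough to move the estimate.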

As you can see, this is identical to the regression estimate. This will always be the case as long as the prior distribution is Beta and the likelihood is binomial. We can see why if we derive the regression constant (the number of PAs of league average we need to add to the observed performance in order to regress) from the prior distribution.

Recall that the regression constant can be found by finding the point where random binomial variance equals prior distribution variance. Therefore:

p(1-p)/k ≈ prior variance

where k is the regression constant and p is the population mean.

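$$\frac{p(1-p)}{k} \approx \frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}, \qquad p \approx \frac{\alpha}{\alpha+\beta}$$

$$\begin{aligned}
\frac{\alpha}{\alpha+\beta} \left(1 - \frac{\alpha}{\alpha+\beta}\right) \Big/ k &\approx \frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)} \\
\frac{\alpha}{\alpha+\beta} - \frac{\alpha^2}{(\alpha+\beta)^2} &\approx k \cdot \frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)} \\
\frac{\alpha(\alpha+\beta) - \alpha^2}{(\alpha+\beta)^2} &\approx k \cdot \frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)} \\
\alpha(\alpha+\beta) - \alpha^2 &\approx k \cdot \frac{\alpha\beta}{\alpha+\beta+1} \\
\alpha^2 + \alpha\beta - \alpha^2 &\approx k \cdot \frac{\alpha\beta}{\alpha+\beta+1} \\
\alpha\beta &\approx k \cdot \frac{\alpha\beta}{\alpha+\beta+1} \\
1 &\approx \frac{k}{\alpha+\beta+1} \\
k &\approx \alpha+\beta+1
\end{aligned}$$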


Since α and β for the prior in our example are 82.5 and 167.5, k would be 82.5 + 167.5 + 1 = 251.

This estimate of k is actually biased, because it assumes a random binomial variance based only on the population mean, whereas the actual random binomial variance for the prior distribution will be the average binomial variance over the entire distribution. In other words, not all of the population will have a .330 OBP skill; some hitters will have a .300 skill, while others will have a .400 skill, and they will all have different binomial variances associated with them. More precisely, the random binomial variation for the prior distribution will be the following definite integral taken from 0 to 1:


$$\int_0^1 \frac{x(1-x)}{k} \cdot B(x;\alpha,\beta) \, dx$$


which, conceptually, is the weighted sum of the binomial variances for each possible value from the prior distribution, where each binomial variance is weighted by the probability density function of the prior.

$$\frac{1}{k \, B(\alpha,\beta)} \int_0^1 x(1-x) \cdot x^{\alpha-1} (1-x)^{\beta-1} \, dx = \frac{1}{k \, B(\alpha,\beta)} \int_0^1 x^{\alpha} (1-x)^{\beta} \, dx$$



The definite integral is in the form of the Beta function B(α+1,β+1), so we can rewrite this as


$$\frac{B(\alpha+1,\beta+1)}{k \, B(\alpha,\beta)}$$


The Beta function is interchangeable with the Gamma Function in the following manner:

B(α,β) = Γ(α)*Γ(β) / Γ(α+β)

Replacing the two Beta functions with their Gamma equivalents:


$$\frac{\Gamma(\alpha+1) \, \Gamma(\beta+1) \, \Gamma(\alpha+\beta)}{k \, \Gamma(\alpha) \, \Gamma(\beta) \, \Gamma(\alpha+\beta+2)}$$


This revision is useful because the Gamma function has a property where Γ(x+1)/Γ(x) = x, so the above reduces to:


$$\frac{\alpha\beta \, \Gamma(\alpha+\beta)}{k \, \Gamma(\alpha+\beta+2)}$$


Furthermore, since Γ(x+1)/Γ(x) = x, it follows that Γ(x+2)/Γ(x+1) = x+1. If we multiply those two equations together, we find that

$$\frac{\Gamma(x+1)}{\Gamma(x)} \cdot \frac{\Gamma(x+2)}{\Gamma(x+1)} = x(x+1)$$

Γ(x+2)/Γ(x) = x(x+1)

Γ(x)/Γ(x+2) = 1/(x(x+1))


Therefore


$$\frac{\alpha\beta \, \Gamma(\alpha+\beta)}{k \, \Gamma(\alpha+\beta+2)} = \frac{\alpha\beta}{k \, (\alpha+\beta)(\alpha+\beta+1)}$$



Now that we have a manageable expression for the random binomial variance of the prior distribution, we return to the requirement that random binomial variance equals the variance of the prior distribution:


$$\frac{\alpha\beta}{k \, (\alpha+\beta)(\alpha+\beta+1)} = \frac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}$$


k * (α+β) * (α+β+1) = (α+β)^2 * (α+β+1)


k = α+β


Using a more precise calculation for the random binomial variance of the prior, we find that k = α+β rather than α+β+1. Note that when we estimate k by assuming a constant binomial variance of p(1-p)/k, we get a value of k exactly 1 higher than when we run the full calculation for the binomial variance. This is useful because the former calculation is much simpler than the latter, so we can calculate k by using the former method and then subtracting 1. Also note that the 250 value we got in the initial regression to the mean example would also be 1 too high if we were using more precise figures; I've just been rounding them off for cleanliness' sake.
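The off-by-one relationship is easy to confirm numerically. This Python sketch evaluates both versions of k for the example prior, using log-Gamma to compute the ratio B(α+1,β+1)/B(α,β) directly instead of relying on the simplification above; the helper names are mine, and 82.5 and 167.5 are the rounded example parameters.

```python
import math

def log_beta_fn(a, b):
    """log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def prior_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

def mean_binomial_variance_per_trial(a, b):
    """Average of x(1-x) over a Beta(a, b) prior, i.e. B(a+1, b+1) / B(a, b)."""
    return math.exp(log_beta_fn(a + 1, b + 1) - log_beta_fn(a, b))

alpha, beta = 82.5, 167.5
p = alpha / (alpha + beta)

# Shortcut: binomial variance taken at the population mean only
k_shortcut = p * (1 - p) / prior_variance(alpha, beta)
# Full version: binomial variance averaged over the whole prior
k_full = mean_binomial_variance_per_trial(alpha, beta) / prior_variance(alpha, beta)

print(round(k_shortcut, 6))  # 251.0
print(round(k_full, 6))      # 250.0
```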

Let's look now at the calculation for regression to the mean:

true talent estimate = (s+pk)/(n+k)

where s is the observed successes, n is the observed trials, p is the population mean, and k is the regression constant.

We know from our prior that p=α/(α+β) and k=α+β, so

$$\frac{s + pk}{n + k} = \frac{s + (\alpha+\beta) \cdot \frac{\alpha}{\alpha+\beta}}{n + \alpha + \beta} = \frac{\alpha + s}{\alpha + \beta + n}$$


And what does Bayes say? Our posterior is a Beta with parameters α+s and β+n-s, which has a mean

$$\frac{\alpha + s}{(\alpha + s) + (\beta + n - s)} = \frac{\alpha + s}{\alpha + \beta + n}$$


So Bayes and regression to the mean produce identical talent estimates under these conditions (a binomial process where true talent follows a Beta distribution).

k is far easier to estimate directly (such as by using the method in the initial regression to the mean example) than α and β, so we would typically calculate α and β from k. To do that, we use the fact that p = α/(α+β), and that k=α+β, so by substitution we can easily find that:

α=kp
β=k(1-p)

where k is the regression constant and p is the population mean.

We can also see that the regression amount will be constant regardless of the number of observed PAs, because when we take our Bayesian talent estimate:

$$\frac{\alpha + s}{\alpha + \beta + n}$$

we see that we are always adding the quantity kp (as substituted for α) to the observed successes (s), and always adding the quantity k (as substituted for (α+β)) to the observed trials (n), no matter what observed values we have for s and n. The amounts we add to the observed successes and trials depend only on the parameters of the prior, which do not change.
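A short Python sketch makes the equivalence concrete. It builds the prior from k and p as described above, then compares the two estimates across very different sample sizes; the function names and the three samples are illustrative, with k = 250 and p = .330 taken from the running example.

```python
def regression_estimate(s, n, p, k):
    """Regression to the mean: add k trials of average performance."""
    return (s + p * k) / (n + k)

def bayes_estimate(s, n, alpha, beta):
    """Mean of the Beta(alpha + s, beta + n - s) posterior."""
    return (alpha + s) / (alpha + beta + n)

p, k = 0.330, 250
alpha, beta = k * p, k * (1 - p)  # prior parameters derived from k and p

gaps = []
for s, n in [(4, 10), (200, 500), (400, 1000)]:
    reg = regression_estimate(s, n, p, k)
    bayes = bayes_estimate(s, n, alpha, beta)
    gaps.append(abs(reg - bayes))
    print(n, round(reg, 4), round(bayes, 4))
```

In every row the two columns agree, because both methods add the same kp successes and k trials to whatever was observed.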

