3-D Baseball, by Kincaid<br /><br />Math Behind Regression with Changing Talent Levels (THT Article)<br />Published 2017-06-04, updated 2017-06-05<br /><br />In my article "<a href="http://www.hardballtimes.com/regression-with-changing-talent-levels-the-effects-of-variance/">Regression with Changing Talent Levels: the Effects of Variance</a>" on the Hardball Times, I talk about how day-to-day changes in players' true talent levels reduce the variance of talent across the population as the sample gets longer. In other words, the spread in talent over a 100-game sample will be smaller than the spread in talent over a one-game sample. In the article, I gave the following formula for how much the spread in talent is reduced, which I will explain further here: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-he9l8t-EIQs/WSzZpdJjbUI/AAAAAAAAAwc/Mr0ueI_HgpAdeqCd1uvoXr8VIVK6EG-sgCPcB/s1600/talent_variance1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-he9l8t-EIQs/WSzZpdJjbUI/AAAAAAAAAwc/Mr0ueI_HgpAdeqCd1uvoXr8VIVK6EG-sgCPcB/s1600/talent_variance1.png" data-original-width="491" data-original-height="226" /></a></div> <span class="fullpost"> <br /><br /> <i>*Note: in the THT article, I used d for the number of days instead of n to avoid confusion with another formula that was referenced from a previous article, which used n for something else. For this article, I'm just going to use n for the number of days.</i> <br /><br /> The value given by the formula is the ratio of talent variance over n days to the talent variance for a single day. 
In other words, the variance in talent drops by a multiplicative factor that is dependent on the length of the sample and the correlation of talent from day to day. <br /><br /> Now, how do we get that formula? <br /><br />If we only have two days in our sample, it is not too difficult to calculate the drop in talent variance. Let t<sub style="font-size: 8pt;">0</sub> be a variable representing player talent levels on Day 1, and t<sub style="font-size: 8pt;">1</sub> be a variable representing player talent levels on Day 2. We want to find the variance of the average talent levels over both days, or (t<sub style="font-size: 8pt;">0</sub>+t<sub style="font-size: 8pt;">1</sub>)/2. <br /><br /> The following formula gives us the variance of the sum of two variables: <br /><br /> </span><br /><div class="separator" style="clear: both; text-align: center;"><span class="fullpost"><a href="https://3.bp.blogspot.com/-24XfH-okxio/WSJlEJjkr2I/AAAAAAAAAtM/MLiaiYVa1gYb__fve-Vjnj7OgpkpnPimgCPcB/s1600/talent_variance2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-24XfH-okxio/WSJlEJjkr2I/AAAAAAAAAtM/MLiaiYVa1gYb__fve-Vjnj7OgpkpnPimgCPcB/s1600/talent_variance2.png" /></a></span></div><span class="fullpost"> <br /><br /> The covariance is directly proportional to the correlation between the two variables and is defined as follows: <br /><br /> </span><br /><div class="separator" style="clear: both; text-align: center;"><span class="fullpost"><a href="https://2.bp.blogspot.com/-YlylGzERYmA/WSJlEODNxbI/AAAAAAAAAtM/md7vRsDyTWYTwE3Ne-RJOlh1l0sg0MCPQCPcB/s1600/talent_variance3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-YlylGzERYmA/WSJlEODNxbI/AAAAAAAAAtM/md7vRsDyTWYTwE3Ne-RJOlh1l0sg0MCPQCPcB/s1600/talent_variance3.png" /></a></span></div><span class="fullpost"></span><span class="fullpost"> <br /><br /> (Note that sd<sub style="font-size: 8pt;">t<sub 
style="font-size: 6pt;">0</sub></sub>sd<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">1</sub></sub> = var<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">0</sub></sub> = var<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">1</sub></sub> because talent has the same standard deviation, and therefore the same variance, on both days.) <br /><br /> Before we continue, there is an important thing to note. Because we are trying to derive a formula for a ratio (variance in talent over n days divided by variance in talent over one day), we don't necessarily need to calculate the numerator and denominator of that ratio exactly. As long as we can calculate values that are proportional to those values by the same factor, the ratio will be preserved. <br /><br /> Technically, we want the variance of the value (t<sub style="font-size: 8pt;">0</sub>+t<sub style="font-size: 8pt;">1</sub>)/2 and not just t<sub style="font-size: 8pt;">0</sub>+t<sub style="font-size: 8pt;">1</sub>, which would be var<sub style="font-size: 8pt;">t</sub>(1+r)/2 instead of 2var<sub style="font-size: 8pt;">t</sub>(1+r). However, those two values are proportional, so it doesn't really matter for now which we calculate as long as we can also calculate a value for the denominator that is proportional by the same factor. <br /><br /> For two days, the above calculations are simple enough. Once you start adding more days, however, it starts to get more complicated. 
Fortunately, the above math can also be expressed with a covariance matrix: <br /><br /> <br /><div align="center"><table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border: none;"><tbody><tr align="center" style="height: 31px;"><td style="border-bottom: 1pt solid black; border-right: 1pt solid black; height: 31px; width: 35px;"></td><td style="border-bottom: 1pt solid black; height: 31px; width: 52px;"><strong>t<sub style="font-size: 8pt;">0</sub></strong></td><td style="border-bottom: 1pt solid black; height: 31px; width: 52px;"><strong>t<sub style="font-size: 8pt;">1</sub></strong></td></tr><tr align="center" style="height: 31px;"><td style="border-right: 1pt solid black; height: 31px; width: 35px;"><strong>t<sub style="font-size: 8pt;">0</sub></strong></td><td style="height: 31px; width: 52px;"><em>var<sub style="font-size: 8pt;">0</sub></em></td><td style="height: 31px; width: 52px;">cov<sub style="font-size: 8pt;">0,1</sub></td></tr><tr align="center" style="height: 31px;"><td style="border-right: 1pt solid black; height: 31px; width: 35px;"><strong>t<sub style="font-size: 8pt;">1</sub></strong></td><td style="height: 31px; width: 52px;">cov<sub style="font-size: 8pt;">0,1</sub></td><td style="height: 31px; width: 52px;"><em>var<sub style="font-size: 8pt;">1</sub></em></td></tr></tbody></table></div><br /><br /> The variance of the sum t<sub style="font-size: 8pt;">0</sub>+t<sub style="font-size: 8pt;">1</sub> is equal to the sum of the terms in the covariance matrix, which you can see just gives us the formula: var<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">0</sub>+t<sub style="font-size: 8pt;">1</sub></sub> = var<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">0</sub></sub> + var<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">1</sub></sub> + 2cov<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">0</sub>,t<sub style="font-size: 8pt;">1</sub></sub>. 
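<br /><br /> As a quick numerical check of that identity, here is a minimal Python sketch. The talent values are made up purely for illustration, and the helper name covariance is my own; it uses the population (divide-by-n) form of the covariance.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Made-up talent levels for five players on Day 1 (t0) and Day 2 (t1).
t0 = [0.300, 0.250, 0.280, 0.310, 0.260]
t1 = [0.295, 0.255, 0.270, 0.320, 0.265]

# Variance of the sum, computed directly from the summed values.
total = [a + b for a, b in zip(t0, t1)]
var_of_sum = covariance(total, total)

# The same quantity, computed as the sum of all four covariance-matrix entries.
cov_matrix = [
    [covariance(t0, t0), covariance(t0, t1)],
    [covariance(t1, t0), covariance(t1, t1)],
]
matrix_sum = sum(sum(row) for row in cov_matrix)

print(var_of_sum, matrix_sum)
```

The two printed numbers should agree up to floating-point rounding. <br /><br />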
The covariance matrix is convenient because it can be expanded for any number of days: <br /><br /> </span><br /><div align="center" style="font-size: 12pt; font-weight: bold;"><span class="fullpost">Covariance matrix between talent n days apart </span></div><span class="fullpost"><br /><br /> <div align="center"><table style="border-collapse: collapse; height: 217px; width: 375px;"><tbody><tr align="center" style="height: 31px;"><td style="border-bottom: 1pt solid black; border-right: 1pt solid black; height: 31px; width: 35px;"></td><td style="border-bottom: 1pt solid black; height: 31px; width: 55px;"><strong>t<sub style="font-size: 8pt;">0</sub></strong></td><td style="border-bottom: 1pt solid black; height: 31px; width: 55px;"><strong>t<sub style="font-size: 8pt;">1</sub></strong></td><td style="border-bottom: 1pt solid black; height: 31px; width: 55px;"><strong>t<sub style="font-size: 8pt;">2</sub></strong></td><td style="border-bottom: 1pt solid black; height: 31px; width: 55px;"><strong>t<sub style="font-size: 8pt;">3</sub></strong></td><td style="border-bottom: 1pt solid black; height: 31px; width: 29px;"><strong>...</strong></td><td style="border-bottom: 1pt solid black; height: 31px; width: 55px;"><strong>t<sub style="font-size: 8pt;">n-1</sub></strong></td></tr><tr align="center" style="height: 31px;"><td style="border-right: 1pt solid black; height: 31px; width: 35px;"><strong>t<sub style="font-size: 8pt;">0</sub></strong></td><td style="height: 31px; width: 55px;"><em>var<sub style="font-size: 8pt;">0</sub></em></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,1</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,2</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,3</sub></td><td style="height: 31px; width: 29px;">...</td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,n-1</sub></td></tr><tr align="center" style="height: 31px;"><td 
style="border-right: 1pt solid black; height: 31px; width: 35px;"><strong>t<sub style="font-size: 8pt;">1</sub></strong></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,1</sub></td><td style="height: 31px; width: 55px;"><em>var<sub style="font-size: 8pt;">1</sub></em></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">1,2</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">1,3</sub></td><td style="height: 31px; width: 29px;">...</td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">1,n-1</sub></td></tr><tr align="center" style="height: 31px;"><td style="border-right: 1pt solid black; height: 31px; width: 35px;"><strong>t<sub style="font-size: 8pt;">2</sub></strong></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,2</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">1,2</sub></td><td style="height: 31px; width: 55px;"><em>var<sub style="font-size: 8pt;">2</sub></em></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">2,3</sub></td><td style="height: 31px; width: 29px;">...</td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">2,n-1</sub></td></tr><tr align="center" style="height: 31px;"><td style="border-right: 1pt solid black; height: 31px; width: 35px;"><strong>t<sub style="font-size: 8pt;">3</sub></strong></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,3</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">1,3</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">2,3</sub></td><td style="height: 31px; width: 55px;"><em>var<sub style="font-size: 8pt;">3</sub></em></td><td style="height: 31px; width: 29px;">...</td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">3,n-1</sub></td></tr><tr align="center" style="height: 31px;"><td style="border-right: 1pt solid 
black; height: 31px; width: 35px;"><strong>⋮</strong></td><td style="height: 31px; width: 55px;">⋮</td><td style="height: 31px; width: 55px;">⋮</td><td style="height: 31px; width: 55px;">⋮</td><td style="height: 31px; width: 55px;">⋮</td><td style="height: 31px; width: 29px;">⋱</td><td style="height: 31px; width: 55px;">⋮</td></tr><tr align="center" style="height: 31px;"><td style="border-right: 1pt solid black; height: 31px; width: 35px;"><strong>t<sub style="font-size: 8pt;">n-1</sub></strong></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">0,n-1</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">1,n-1</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">2,n-1</sub></td><td style="height: 31px; width: 55px;">cov<sub style="font-size: 8pt;">3,n-1</sub></td><td style="height: 31px; width: 29px;">...</td><td style="height: 31px; width: 55px;"><em>var<sub style="font-size: 8pt;">n-1</sub></em></td></tr></tbody></table></div><br /><br /> We can also construct a correlation matrix. Given that we know the correlation of talent from one day to the next, this isn't that difficult. If the correlation between talent levels on Day 1 and Day 2 is r, and the correlation between talent levels on Day 2 and Day 3 is also r, we can chain those two facts together to find that the correlation between talent levels on Day 1 and Day 3 is r<sup style="font-size: 8pt;">2</sup>. 
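<br /><br /> The chaining step holds when each day's talent depends on the past only through the previous day's talent. The sketch below assumes one such model, which is my assumption for illustration and not something specified in the article: tomorrow's talent is r times today's talent plus independent normal noise, scaled so the variance stays at 1 every day. The function names, the player count, and the seed are all hypothetical choices.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_talent(r, n_days, n_players, seed=0):
    """Return talent[day][player] under the assumed model, with unit variance each day."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(1.0 - r * r)  # keeps the variance at 1 on every day
    days = [[rng.gauss(0.0, 1.0) for _ in range(n_players)]]
    for _ in range(n_days - 1):
        previous = days[-1]
        days.append([r * t + rng.gauss(0.0, noise_sd) for t in previous])
    return days


r = 0.9
talent = simulate_talent(r, n_days=3, n_players=20000)
lag1 = pearson(talent[0], talent[1])  # Day 1 vs Day 2
lag2 = pearson(talent[0], talent[2])  # Day 1 vs Day 3
print(f"one day apart:  {lag1:.3f} (r   = {r:.3f})")
print(f"two days apart: {lag2:.3f} (r^2 = {r * r:.3f})")
```

Under this model, the simulated correlations should land close to r and r squared, with some sampling noise.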
<br /><br /> The same logic can be extended for any number of days, so that the correlation between talent levels n days apart is r<sup style="font-size: 8pt;">n</sup>: <br /><br /> <div align="center" style="font-size: 12pt; font-weight: bold;">Correlation matrix between talent n days apart </div><br /><br /> <div align="center"><table style="border-collapse: collapse;"><tbody align="center"><tr style="border-bottom: 1pt solid black;"><td style="border-right: 1pt solid black;" width="36"></td><td width="31"><strong>t<sub style="font-size: 8pt;">0</sub></strong></td><td width="31"><strong>t<sub style="font-size: 8pt;">1</sub></strong></td><td width="31"><strong>t<sub style="font-size: 8pt;">2</sub></strong></td><td width="31"><strong>t<sub style="font-size: 8pt;">3</sub></strong></td><td width="29"><strong>...</strong></td><td width="36"><strong>t<sub style="font-size: 8pt;">n-1</sub></strong></td></tr><tr><td style="border-right: 1pt solid black;" width="36"><strong>t<sub style="font-size: 8pt;">0</sub></strong></td><td width="31"><em>r<sup style="font-size: 8pt;">0</sup></em></td><td width="31">r<sup style="font-size: 8pt;">1</sup></td><td width="31">r<sup style="font-size: 8pt;">2</sup></td><td width="31">r<sup style="font-size: 8pt;">3</sup></td><td width="29">...</td><td width="36">r<sup style="font-size: 8pt;">n-1</sup></td></tr><tr><td style="border-right: 1pt solid black;" width="36"><strong>t<sub style="font-size: 8pt;">1</sub></strong></td><td width="31">r<sup style="font-size: 8pt;">1</sup></td><td width="31"><em>r<sup style="font-size: 8pt;">0</sup></em></td><td width="31">r<sup style="font-size: 8pt;">1</sup></td><td width="31">r<sup style="font-size: 8pt;">2</sup></td><td width="29">...</td><td width="36">r<sup style="font-size: 8pt;">n-2</sup></td></tr><tr><td style="border-right: 1pt solid black;" width="36"><strong>t<sub style="font-size: 8pt;">2</sub></strong></td><td width="31">r<sup style="font-size: 8pt;">2</sup></td><td width="31">r<sup style="font-size: 8pt;">1</sup></td><td 
width="31"><em>r<sup style="font-size: 8pt;">0</sup></em></td><td width="31">r<sup style="font-size: 8pt;">1</sup></td><td width="29">...</td><td width="36">r<sup style="font-size: 8pt;">n-3</sup></td></tr><tr><td style="border-right: 1pt solid black;" width="36"><strong>t<sub style="font-size: 8pt;">3</sub></strong></td><td width="31">r<sup style="font-size: 8pt;">3</sup></td><td width="31">r<sup style="font-size: 8pt;">2</sup></td><td width="31">r<sup style="font-size: 8pt;">1</sup></td><td width="31"><em>r<sup style="font-size: 8pt;">0</sup></em></td><td width="29">...</td><td width="36">r<sup style="font-size: 8pt;">n-4</sup></td></tr><tr><td style="border-right: 1pt solid black;" width="36"><strong>⋮</strong></td><td width="31">⋮</td><td width="31">⋮</td><td width="31">⋮</td><td width="31">⋮</td><td width="29">⋱</td><td width="36">⋮</td></tr><tr><td style="border-right: 1pt solid black;" width="36"><strong>t<sub style="font-size: 8pt;">n-1</sub></strong></td><td width="31">r<sup style="font-size: 8pt;">n-1</sup></td><td width="31">r<sup style="font-size: 8pt;">n-2</sup></td><td width="31">r<sup style="font-size: 8pt;">n-3</sup></td><td width="31">r<sup style="font-size: 8pt;">n-4</sup></td><td width="29">...</td><td width="36"><em>r<sup style="font-size: 8pt;">0</sup></em></td></tr></tbody></table></div><br /><br /> This matrix is more useful than the covariance matrix, because all we need to know to fill in the entire correlation matrix is the value of r. And because correlation is proportional to covariance (cov<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">0</sub>,t<sub style="font-size: 8pt;">1</sub></sub> = r · var<sub style="font-size: 8pt;">t<sub style="font-size: 6pt;">0</sub></sub>), the sum of the correlation matrix is proportional to the sum of the covariance matrix. <br /><br /> Our next step, then, is to calculate the sum of the correlation matrix. 
Notice that the terms on each diagonal going from the top left to bottom right are identical: <br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-wQWenbKYEAQ/WSTy-N8q0zI/AAAAAAAAAvs/YL2w3MFoENEX9Qe2xDYW9PCybt3iMA6eQCPcB/s1600/correlation%2Btable%2Bdiagonals.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="257" src="https://1.bp.blogspot.com/-wQWenbKYEAQ/WSTy-N8q0zI/AAAAAAAAAvs/YL2w3MFoENEX9Qe2xDYW9PCybt3iMA6eQCPcB/s320/correlation%2Btable%2Bdiagonals.png" width="320" /></a></div><br /><br /> We can use this pattern to simplify the sum. Since the matrix is symmetrical, we can ignore the terms below the long diagonal and calculate the sum for just the top half of the matrix, and then double it later: <br /><br /> <div align="center"><table><tbody align="center" style="font-size: 14pt;"><tr><td style="width: 25px;">r<sup style="font-size: 10pt;">0</sup></td><td style="width: 25px;"><div style="color: blue;">r<sup style="font-size: 10pt;">1</sup></div></td><td style="width: 25px;"><div style="color: red;">r<sup style="font-size: 10pt;">2</sup></div></td><td style="width: 25px;"><div style="color: orange;">r<sup style="font-size: 10pt;">3</sup></div></td><td style="width: 22px;">...</td><td style="width: 32px;">r<sup style="font-size: 10pt;">n-1</sup></td><td style="width: 28px;">→</td><td style="text-align: right; width: 61px;"><em>r<sup style="font-size: 10pt;">n-1</sup></em></td></tr><tr><td style="width: 25px;"></td><td style="width: 25px;">r<sup style="font-size: 10pt;">0</sup></td><td style="width: 25px;"><div style="color: blue;">r<sup style="font-size: 10pt;">1</sup></div></td><td style="width: 25px;"><div style="color: red;">r<sup style="font-size: 10pt;">2</sup></div></td><td style="color: orange; width: 22px;">⋱</td><td style="width: 32px;">⋮</td><td style="width: 28px;"></td><td style="text-align: right; width: 61px;">⋮</td></tr><tr><td style="width: 
25px;"></td><td style="width: 25px;"></td><td style="width: 25px;">r<sup style="font-size: 10pt;">0</sup></td><td style="width: 25px;"><div style="color: blue;">r<sup style="font-size: 10pt;">1</sup></div></td><td style="color: red; width: 22px;">⋱</td><td style="width: 32px;"><div style="color: orange;">r<sup style="font-size: 10pt;">3</sup></div></td><td style="width: 28px;"><div style="color: orange;"><em>→</em></div></td><td style="text-align: right; width: 61px;"><div style="color: orange;"><em>(n-3)r<sup style="font-size: 10pt;">3</sup></em></div></td></tr><tr><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 25px;">r<sup style="font-size: 10pt;">0</sup></td><td style="color: blue; width: 22px;">⋱</td><td style="width: 32px;"><div style="color: red;">r<sup style="font-size: 10pt;">2</sup></div></td><td style="width: 28px;"><div style="color: red;"><em>→</em></div></td><td style="width: 61px;"><div style="color: red; text-align: right;"><em>(n-2)r</em><em><sup style="font-size: 10pt;">2</sup></em></div></td></tr><tr><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 22px;">⋱ </td><td style="width: 32px;"><div style="color: blue;">r<sup style="font-size: 10pt;">1</sup></div></td><td style="width: 28px;"><div style="color: blue;">→</div></td><td style="width: 61px;"><div style="color: blue; text-align: right;"><em>(n-1)r<sup style="font-size: 10pt;">1</sup></em></div></td></tr><tr><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 25px;"></td><td style="width: 22px;"></td><td style="width: 32px;">r<sup style="font-size: 10pt;">0</sup></td><td style="width: 28px;">→</td><td style="width: 61px;"><div style="text-align: right;"><em>nr<sup style="font-size: 10pt;">0</sup></em></div></td></tr></tbody></table></div><br /><br /> There is one r<sup style="font-size: 
8pt;">0</sup> term in each column of the matrix, so there are n r<sup style="font-size: 8pt;">0</sup> terms in the sum. Likewise, there are (n-1) r<sup style="font-size: 8pt;">1</sup> terms, (n-2) r<sup style="font-size: 8pt;">2</sup> terms, etc. If we group each diagonal into its own distinct term, we get a sum whose terms follow the pattern (n-i)*r<sup style="font-size: 8pt;">i</sup>: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-cGHvbVTPfkY/WSJlEQU6_nI/AAAAAAAAAtM/O08mCIBOEu0hYS5GGq6JkGNsyAs7m9zIQCPcB/s1600/talent_variance5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-cGHvbVTPfkY/WSJlEQU6_nI/AAAAAAAAAtM/O08mCIBOEu0hYS5GGq6JkGNsyAs7m9zIQCPcB/s1600/talent_variance5.png" /></a></div></span><span class="fullpost"> <br /><br /> Applying the distributive property and separating the terms of the sum, we get the following: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-vs7mxXC5xuE/WSJlEc1ks_I/AAAAAAAAAtM/4NWmBzP3IGgXuMGdvhOM5h-9KzpI_7eJwCPcB/s1600/talent_variance6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-vs7mxXC5xuE/WSJlEc1ks_I/AAAAAAAAAtM/4NWmBzP3IGgXuMGdvhOM5h-9KzpI_7eJwCPcB/s1600/talent_variance6.png" /></a></div></span><span class="fullpost"><br /><br /> The first sum is a simple geometric series, which we can calculate using the formula for geometric series: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-ekP25svdLck/WSJlEhzWAuI/AAAAAAAAAtM/WZ4agbp5WHAtampEbyez4tkoiB44r54BwCPcB/s1600/talent_variance7.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-ekP25svdLck/WSJlEhzWAuI/AAAAAAAAAtM/WZ4agbp5WHAtampEbyez4tkoiB44r54BwCPcB/s1600/talent_variance7.png" 
/></a></div></span><span class="fullpost"><br /><br /> The second sum is similar, but the additional i factor makes it a bit trickier since it is no longer a geometric series. We can, however, transform it into a geometric series using a trick where we convert this from a single sum to a double sum, where we replace the expression inside the sum with another sum. <br /><br /> The idea is that each term of the series is itself a separate sum which has i terms of r<sup style="font-size: 8pt;">i</sup>. This sum can be written as follows: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-RMzwTQfAM9E/WSJlEtySo6I/AAAAAAAAAtM/3ga4EivGY24-aPXeGJgQrLxJkphNghpDACPcB/s1600/talent_variance8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-RMzwTQfAM9E/WSJlEtySo6I/AAAAAAAAAtM/3ga4EivGY24-aPXeGJgQrLxJkphNghpDACPcB/s1600/talent_variance8.png" /></a></div></span><span class="fullpost"><br /><br /> Notice that we switched to using the index h rather than i. This means there is nothing inside the sum that increments on each successive term, and the i acts as a static value. In other words, this is just adding up the value r<sup style="font-size: 8pt;">i</sup> i times, which is of course equal to ir<sup style="font-size: 8pt;">i</sup>. 
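<br /><br /> To make the single-sum to double-sum rewrite concrete, here is a small Python sketch. The function names and the example values of r and n are my own choices for illustration. The inner loop runs h from 0 to i-1 and adds the same value each time, so h acts only as a counter.

```python
def single_sum(r, n):
    """Sum of i * r**i for i = 0 .. n-1."""
    return sum(i * r**i for i in range(n))


def double_sum(r, n):
    """Same total, with each i * r**i written as r**i added up i times (h = 0 .. i-1)."""
    total = 0.0
    for i in range(n):
        for h in range(i):  # h is only a counter; the summand does not depend on it
            total += r**i
    return total


r, n = 0.9, 10
print(single_sum(r, n), double_sum(r, n))
```

The two printed values should match up to floating-point rounding.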
<br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-ehNHVKv5SnU/WSJlEqoPUuI/AAAAAAAAAtM/0z4OdRadwIo5eM9wo9B_7AycxbzUVPD9wCPcB/s1600/talent_variance9.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-ehNHVKv5SnU/WSJlEqoPUuI/AAAAAAAAAtM/0z4OdRadwIo5eM9wo9B_7AycxbzUVPD9wCPcB/s1600/talent_variance9.png" /></a></div></span><span class="fullpost"><br /><br /> In order to visualize how this double sum works, we can write down the terms of the sum in an array with i rows and h columns, where the value corresponding to each pair of (i,h) values is r<sup style="font-size: 8pt;">i</sup>. For example, here is what the array would look like with n=4: <br /><br /> <div align="center"><table style="border-collapse: collapse;"><tbody align="center"><tr style="border-bottom: solid black 1px;"><td style="border-right: solid black 1px;" width="31"></td><td width="31"><em>h=0</em></td><td width="28"><em>h=1</em></td><td width="28"><em>h=2</em></td><td width="28"><em>h=3</em></td></tr><tr><td style="border-right: solid black 1px;" width="31"><em>i=0</em></td><td width="31"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">0</sup></div></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">0</sup></div></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">0</sup></div></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">0</sup></div></td></tr><tr><td style="border-right: solid black 1px;" width="31"><em>i=1</em></td><td width="31">r<sup style="font-size: 8pt;">1</sup></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">1</sup></div></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">1</sup></div></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">1</sup></div></td></tr><tr><td style="border-right: solid 
black 1px;" width="31"><em>i=2</em></td><td width="31">r<sup style="font-size: 8pt;">2</sup></td><td width="28">r<sup style="font-size: 8pt;">2</sup></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">2</sup></div></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">2</sup></div></td></tr><tr><td style="border-right: solid black 1px;" width="31"><em>i=3</em></td><td width="31">r<sup style="font-size: 8pt;">3</sup></td><td width="28">r<sup style="font-size: 8pt;">3</sup></td><td width="28">r<sup style="font-size: 8pt;">3</sup></td><td width="28"><div style="color: #dddddd;">r<sup style="font-size: 8pt;">3</sup></div></td></tr></tbody></table></div><br /><br /> The greyed-out values are included to complete the array, but are not actually part of the sum. If we go through the sum iteratively, we start at i=0, and take the sum of r<sup style="font-size: 8pt;">i</sup> from h=0 to h=-1. Since you can't count up from 0 to -1, there are no values to count in this row, which represents the fact that ir<sup style="font-size: 8pt;">i</sup> = 0 when i=0. <br /><br /> Next, we go to i=1, and fill in the value r<sup style="font-size: 8pt;">1</sup> for h=0 to h=0. In the next row, when i=2, we go from h=0 to h=1, and so on. <br /><br /> We are currently taking the sum of each row and then adding those individual sums together. 
However, we could also start by taking the sum of each column, which would be equivalent to reversing the order of the two sums in our double series: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-NvIj-pi8vvE/WSJlDQSbksI/AAAAAAAAAtM/nTro5E4a-MYxrHiYp7pS724RxCkmm-WCQCPcB/s1600/talent_variance10.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-NvIj-pi8vvE/WSJlDQSbksI/AAAAAAAAAtM/nTro5E4a-MYxrHiYp7pS724RxCkmm-WCQCPcB/s1600/talent_variance10.png" /></a></div></span><span class="fullpost"><br /><br /> Note that the inner sum now goes from i=h+1 to i=n-1, which you can see in the columns of the array of terms above. <br /><br /> This is useful because each column of the array is a geometric series, meaning it will be easy to compute. The sum of each column is just the geometric series from i=0 to i=n-1. Then, to eliminate the greyed-out values from the sum, we subtract the geometric series from i=0 to i=h. 
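<br /><br /> The reordering can be checked numerically. The Python sketch below (function names and example values are my own) adds up the non-grey entries of the array three ways: row by row, column by column, and column by column using the difference of two geometric series. The closed form for the geometric series assumes r is not equal to 1.

```python
def geometric(r, m):
    """Sum of r**i for i = 0 .. m-1, via (1 - r**m) / (1 - r). Assumes r != 1."""
    return (1 - r**m) / (1 - r)


def rows_first(r, n):
    """Outer sum over rows i, inner sum over h = 0 .. i-1."""
    return sum(r**i for i in range(n) for h in range(i))


def columns_first(r, n):
    """Outer sum over columns h, inner sum over i = h+1 .. n-1."""
    return sum(r**i for h in range(n) for i in range(h + 1, n))


def columns_closed_form(r, n):
    """Each column is the series for i = 0 .. n-1 minus the series for i = 0 .. h."""
    return sum(geometric(r, n) - geometric(r, h + 1) for h in range(n))


r, n = 0.9, 4
print(rows_first(r, n), columns_first(r, n), columns_closed_form(r, n))
```

For n=4 this corresponds to the array above: the last column (h=3) is entirely greyed out, so it contributes zero in all three versions.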
<br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-MO0eKgCRJjQ/WSJlDbzKJ7I/AAAAAAAAAtM/RDVmvXFudZomohDNStjrKzPUgRF6oP1lACPcB/s1600/talent_variance11.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-MO0eKgCRJjQ/WSJlDbzKJ7I/AAAAAAAAAtM/RDVmvXFudZomohDNStjrKzPUgRF6oP1lACPcB/s1600/talent_variance11.png" /></a></div></span><span class="fullpost"><br /><br /> This is the value for our inner sum, so we plug that back into the outer sum: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-_-CfTw1GzHw/WSJlDsVOEGI/AAAAAAAAAtM/8D1X9T3LvvE8vl_kWq9w4SS_jUIqGmb7gCPcB/s1600/talent_variance12.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-_-CfTw1GzHw/WSJlDsVOEGI/AAAAAAAAAtM/8D1X9T3LvvE8vl_kWq9w4SS_jUIqGmb7gCPcB/s1600/talent_variance12.png" /></a></div></span><span class="fullpost"><br /><br /> We now have values for both halves of our original sum, so next we combine them to get the full value: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-YeKS_U6AlCU/WSJlDmBv2rI/AAAAAAAAAtM/4k5KGq3ouO4fadtj08H-40UVbQWxF5_2gCPcB/s1600/talent_variance13.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-YeKS_U6AlCU/WSJlDmBv2rI/AAAAAAAAAtM/4k5KGq3ouO4fadtj08H-40UVbQWxF5_2gCPcB/s1600/talent_variance13.png" /></a></div></span><span class="fullpost"><br /><br /> We still have one more step to go to calculate the full sum of the correlation matrix. Recall that when we started, we were working with a symmetrical correlation matrix, and because the matrix was symmetrical about its long diagonal, we set out to find the sum for only the upper half of the matrix. 
In order to get the sum of the full matrix, we have to double this value: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-VoU5zaDRAC0/WSJlDiDnvoI/AAAAAAAAAtM/YK7ITqFCyGQKi9RKEvWGPhB52c5vmjKDwCPcB/s1600/talent_variance14.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-VoU5zaDRAC0/WSJlDiDnvoI/AAAAAAAAAtM/YK7ITqFCyGQKi9RKEvWGPhB52c5vmjKDwCPcB/s1600/talent_variance14.png" /></a></div></span><span class="fullpost"><br /><br /> Finally, note that the long diagonal of the correlation matrix only occurs once in the matrix, so by doubling our initial sum, we are double-counting that diagonal. In order to correct for this, we need to subtract the sum of that diagonal, which is just n*1 (since each element in that diagonal equals 1): <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-ETJsqb3dlx4/WSJlDxbkQMI/AAAAAAAAAtM/zx-OFU3vXe4NbEzoZzhiNowMKV3SXCmmACPcB/s1600/talent_variance15.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-ETJsqb3dlx4/WSJlDxbkQMI/AAAAAAAAAtM/zx-OFU3vXe4NbEzoZzhiNowMKV3SXCmmACPcB/s1600/talent_variance15.png" /></a></div></span><span class="fullpost"><br /><br /> This value is proportional to the sum of the covariance matrix, which is proportional to the variance of talent in the population over n days. <br /><br /> Next, we need to come up with a corresponding value to represent the variance of talent over a single day. To do this, we can rely on the fact that as long as talent never changes, the variance in talent over any number of days is the same as the variance in talent over a single day. Instead of comparing to the variance in talent over a single day, we can instead compare to the variance in talent over n days when talent is constant from day to day. 
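<br /><br /> As a sanity check on the doubling and diagonal correction, the Python sketch below compares a brute-force sum of the correlation matrix with twice the upper-half sum minus n. It also includes a closed form that I obtained by simplifying the sums myself; it should be algebraically equivalent to the expression in the images, but may be arranged differently, and it assumes r is not equal to 1. The function names and example values are hypothetical. The last function divides by the sum of the same matrix with r=1, which is the constant-talent comparison described above.

```python
def correlation_matrix_sum(r, n):
    """Brute-force sum of the n x n matrix whose (i, j) entry is r**|i - j|."""
    return sum(r ** abs(i - j) for i in range(n) for j in range(n))


def upper_half_sum(r, n):
    """Long diagonal plus everything above it: (n - i) * r**i for i = 0 .. n-1."""
    return sum((n - i) * r**i for i in range(n))


def doubled_minus_diagonal(r, n):
    """Double the upper half, then remove the double-counted diagonal of n ones."""
    return 2 * upper_half_sum(r, n) - n


def closed_form_sum(r, n):
    """My own simplification of the same quantity; assumes r != 1."""
    return n * (1 + r) / (1 - r) - 2 * r * (1 - r**n) / (1 - r) ** 2


def variance_ratio(r, n):
    """Changing-talent matrix sum divided by the constant-talent (r = 1) matrix sum."""
    return correlation_matrix_sum(r, n) / correlation_matrix_sum(1.0, n)


r, n = 0.99, 100
print(correlation_matrix_sum(r, n), doubled_minus_diagonal(r, n), closed_form_sum(r, n))
print(variance_ratio(0.5, 2))  # two-day case, should equal (1 + r) / 2
```

For two days, the ratio reduces to (1+r)/2, which matches the two-day calculation earlier in the post.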
<br /><br /> This allows us to construct a similar correlation matrix to represent the constant-talent scenario. Compared to the correlation matrix for changing talent, this is trivially simple: since talent levels are the same throughout the sample, the correlation between talent from one day to the next will always be one. <br /><br /> In other words, the correlation matrix will just be an n x n array of 1s. And the sum of an n x n array of 1s is just n^2. <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-rq9Rm0EhR2k/WSJlD3bGaKI/AAAAAAAAAtM/JI0-uV85rsoVgyQISc0dAcY3s3ZeCIWDgCPcB/s1600/talent_variance16.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-rq9Rm0EhR2k/WSJlD3bGaKI/AAAAAAAAAtM/JI0-uV85rsoVgyQISc0dAcY3s3ZeCIWDgCPcB/s1600/talent_variance16.png" /></a></div><br /><br /> The ratio of these two values will give us the ratio of talent variance after n days of talent changes to the talent variance when talent is constant: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-EgKT8Qld0U4/WSJlDzLk1mI/AAAAAAAAAtM/qro7aaXARH4_amw28n03if6-skk9s77lQCPcB/s1600/talent_variance17.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-EgKT8Qld0U4/WSJlDzLk1mI/AAAAAAAAAtM/qro7aaXARH4_amw28n03if6-skk9s77lQCPcB/s1600/talent_variance17.png" /></a></div><br /><br /> And that is our formula for finding the ratio of variance in true talent over n days to the variance in true talent on a single day, given the value r for the correlation of true talent from one day to the next. 
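<br /><br /> As a rough numerical check on the ratio above, here is a minimal Python sketch. It assumes the correlation between talent on day i and day j is r^|i-j| (my reading of the day-to-day correlation setup, not something restated in this passage), sums the n x n correlation matrix by brute force, and divides by n^2. The function names are made up for illustration, and the closed form is one I worked out under that same assumption, so it may be arranged differently from the formula images in this post.

```python
def correlation_matrix_sum(n, r):
    """Brute-force sum of an n x n matrix whose (i, j) entry is r**|i - j|.

    Assumption: talent correlation decays as r**|i - j| between days i and j.
    """
    return sum(r ** abs(i - j) for i in range(n) for j in range(n))


def variance_ratio(n, r):
    """Changing-talent matrix sum divided by the constant-talent sum, n**2."""
    return correlation_matrix_sum(n, r) / n ** 2


def variance_ratio_closed_form(n, r):
    """Closed form derived for this sketch; only valid for r != 1."""
    return (n * (1 - r ** 2) - 2 * r * (1 - r ** n)) / (n ** 2 * (1 - r) ** 2)


if __name__ == "__main__":
    # Compare the brute-force and closed-form values for a few sample lengths.
    for n in (1, 2, 10, 100):
        print(n, variance_ratio(n, 0.99), variance_ratio_closed_form(n, 0.99))
```

With r = 1 (constant talent) the brute-force ratio is exactly 1, which matches the n x n array of 1s described above. 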
With some simplification, the above formula is equivalent to what was posted in the THT article: <br /><br /> <div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-6hc53zK1qCQ/WSzZpe37zbI/AAAAAAAAAwg/L7q1R-M8qsc89x9f1TiYqtg9MwlKdpuPwCPcB/s1600/talent_variance18.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-6hc53zK1qCQ/WSzZpe37zbI/AAAAAAAAAwg/L7q1R-M8qsc89x9f1TiYqtg9MwlKdpuPwCPcB/s1600/talent_variance18.png" data-original-width="495" data-original-height="273" /></a></div> </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-81715498832181798302016-12-24T17:23:00.002-07:002016-12-24T17:25:29.072-07:00The FightIt happened on May 15, 1912.<br /><br />The once-mighty Detroit Tigers were off to a slow start. It was to be a long season, their first losing one in six years. Far from mollifying the pain of defeat, their past success only served to heighten the tension they felt—the old veterans had nearly forgotten what it was to lose, and the youthful among them had not known to begin with. By contrast, their current situation, while not objectively hopeless, only felt that much more dire.<br /><br />Needless to say, when the Tigers rolled into New York on their steam locomotive from Boston, where they’d just dropped another two out of three, and cozied up to Hilltop Park, they were a cohort on edge.<br /><br /><span class="fullpost">Hilltop Park, as it happened, seemed at first the perfect destination for such a group of men. The Highlanders, not yet the storied franchise they would later become, were one of the few teams in the American League still worse than they were, and their boys were ripe for the beating. Over the next three days, Detroit began to feel their season reforming beneath their cleats. They took two of the first three and were nearly back to .500. 
Once-shattered men began again to believe.<br /><br />And so they took the field for the fourth and final game of the series. Things began inauspiciously, with the teams trading blows for the first two innings and Detroit emerging from the proverbial fracas with a one-run lead. As it were, such acts of violence were not to remain figurative.<br /><br />Detroit’s star centerfielder, Tyrus Raymond Cobb, was so known for his gentle disposition that his teammates, half-mockingly but not without a hint of affection, referred to him as “the Georgia Peach”. However, as Detroit’s standout performer, it was Cobb who found himself the target of the local malcontents who had made it their duty to suffer Highlander seasons firsthand.<br /><br />Loudest among these was one Claude Lueker, a man whose brazenness had been honed in the fiery confines of Tammany Hall, and he spoke in ways of which only a man entrenched in politics could even conceive. Such foul narratives poured from his mouth as would turn an oak tree barren just from the stench of their connotations.<br /><br />For four innings this continued. Cobb tried to escape the abuse by staying in centerfield for both turns at bat, sitting quietly against the outfield scoreboard and only speaking up to help direct the New York outfielders to avoid collisions. However, Cobb was accustomed to reading between innings, and had in fact been looking forward to the New York trip where the country’s leading literary critics resided and published, and had that very day picked up a new analysis of MacBeth from just such a scholar before the game. Only Cobb had left his reading glasses in the dugout, and was unable to study his text from the outfield.<br /><br />And so, after four innings of careful isolation, Cobb finally felt it safe to brave the trek back to the dugout to retrieve his spectacles. He knew at once he had been mistaken. 
The heckler was on him again, this time saying things Cobb was certain could turn even the most ardent of free speech advocates into anti-seditionists.<br /><br />Once in the dugout, Cobb was immediately accosted for his inaction.<br /><br />“Dammit, Cobb!” cried Sam Crawford. “This has gone on long enough! There are children here, for crying out loud!”<br /><br />Ed Willett soon chimed in. “You can escape this nonsense out there in centerfield, but I’ve got to stand on the mound and listen to it! You think Donie Bush would let this kind of thing go? Sometimes I wish he were our future Hall of Famer.”<br /><br />Cobb protested. “Look, I’m sorry you all have to put up with this, but there’s nothing we can do. We’ll be out of New York tomorrow, and we can put the whole thing behind us then.”<br /><br />Wanting nothing more than to go back to the outfield where the fans were much more docile and many were willing to debate the merits of Mark Twain’s lesser novels (which was one of Cobb’s pet subjects), Cobb hoped he could leave it at that. It was at this moment that an insult so offensive crept over the lip of the dugout and into the ears of the Detroit men that there was no longer anything Cobb could do for the hurler. <br /><br />Hughie Jennings walked over and put his arm on Cobb’s shoulder. “Look, son, I know you don’t like this any more than the rest of us. Probably less than the rest of us. But you’ve got to do something to shut that man up.” Jennings' eyes glowed with a warm fierceness Cobb knew from experience he could not allay. With a final pat on Cobb's shoulder, Jennings bored into him with those eyes and tried to reassure him: “We’ll have your back.” Cobb turned reluctantly toward the dugout steps.<br /><br />After a tentative step into the stands, Cobb quickly retreated. Jennings began to protest, but Cobb cut him off. “Look, I know what you’re going to say, but the man is an invalid! 
He’s got no hands!”<br /><br />“I don’t care if he doesn’t have any feet!” Jennings bellowed. “What must be done will be done, if not by you then by someone else!”<br /><br />From the corner of his eye, Cobb saw Bill Burns reaching for his lumber. Burns had long since washed out as an effective pitcher and had never been able to hit a lick, but he remained a towering hulk of a man, and Cobb knew it would not end pleasantly were he commissioned for the task. So, even more reluctantly than before, Cobb slunk back up the dugout steps and into the stands, trailed behind by his fellow Tigers.<br /><br />“Look,” Cobb said as he approached the man, “I wish you wouldn’t create such a ruckus, but also know that I haven’t any ill intent toward you.” With that, Cobb raised his fist half-heartedly, when suddenly the man heaved his entire weight in the direction of Cobb. Like two anteaters on the savanna they tumbled. Cobb’s teammates jumped at the sight, storming into the stands with bats in hand. Mayhem was upon the lower grandstand like flies on a heap of corpses and was not to be driven away.<br /><br />At this point, the Highlanders, who had been surveying the local architecture beyond left field using Hal Chase’s new engineering sextant, heard the commotion and were made aware of the delay in the game. They rushed to the aid of their fellow professionals, leaping unaware into the middle of the fray. For the next forty-five minutes, fans and players were at each other in a most uncivilized manner before the umpires managed to get through to the telegraph office in the press box to wire the police.<br /><br />By the time it was over, more than two dozen fans were injured, and several players received stern warnings for their behavior. 
Ban Johnson, who happened to be in attendance and witnessed the second half of the brawl after returning from the concession stand, suspended the entire Detroit roster, and they had to play three days later against Philadelphia with a replacement nine.<br /><br />And that, to this day, remains without a doubt the greatest fight in baseball history. </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-56737196139044892092015-09-03T10:41:00.000-07:002015-09-03T10:43:33.484-07:00Baseball is Dying (1892 version)At least that seems to be the opinion of <i>Pittsburgh Dispatch</i> sports editor John D. Pringle in his <a href="http://chroniclingamerica.loc.gov/lccn/sn84024546/1892-11-20/ed-1/seq-14/">weekly "A Review of Sports" column</a>: <br><blockquote>If there were ever any doubts concerning the waning interest in baseball, the meeting of the magnates at Chicago during the past week must have dispelled them. The gathering was more like the meeting together of a lot of men to sing a funeral dirge than anything else. The proceedings were doleful despite the efforts of the magnates to wear smiles. Most certainly this annual meeting was far below par in enthusiasm with those of former years. <br>... <br>To be sure, those persons who court notoriety by always wanting rules changed and tinkered were at the meeting. There was no millenium plan this time; it is an exploded bladder now, but there was the new diamond notion and a few other things just as silly and just as characteristic of liquid intellects as the Utopian "plan." Of course all the venders of quack remedies pointed out that "something must be done to revive an interest in baseball." Ah! You see they admit the game's popularity is waning. Happily no changes were decided on. 
</blockquote><br><span class="fullpost">Even more pessimistic was the <a href="http://chroniclingamerica.loc.gov/lccn/sn84024546/1892-11-03/ed-1/seq-4/#words=BistDLL%20Ctty%20renaissance%20progress">Kansas City Times, which apparently wrote</a>: <br><blockquote>BASEBALL has apparently served its day and its days seem near an end. Perhaps there may be a renaissance. But the ball players have come to the end of their string; they can play very little better; there is no more progress to be made. The people have seen it all. They are tired of reviewing it. </blockquote><br>By the way, this is the "new diamond notion" Pringle refers to: <br><br><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-eqeiURJ7YCE/Veh9UR6eldI/AAAAAAAAAaw/b_ECaVDbtts/s1600/five%2Bbase%2Bbaseball%2Bdiamond.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://1.bp.blogspot.com/-eqeiURJ7YCE/Veh9UR6eldI/AAAAAAAAAaw/b_ECaVDbtts/s320/five%2Bbase%2Bbaseball%2Bdiamond.png" /></a></div><br>As you can see, the proposal was to add a fifth base, with the middle bases positioned roughly where the infielders actually play. The basis for the proposal was twofold: One, it would increase the amount of fair territory by widening the angle between the first and third baselines, resulting in more base hits and fewer foul balls. Two, it would shorten the distance between stealable bases to 70 feet (along with the distance the catcher would have to throw the ball), leading to a more active running game. <br><br>By keeping the distance to first and to home the same, proponents hoped to minimize the impact on infield hits and scoring plays. By adding an extra base station and increasing the total distance around the bases, the extra action of more base hits and base stealing would not necessarily lead to a huge increase in scoring. 
<br><br> </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com2tag:blogger.com,1999:blog-2868194292414002063.post-59165619009396134582015-05-29T14:52:00.002-07:002015-05-29T14:59:47.793-07:00Gender in Chess PART 4: MISREPRESENTING THE DATA<i>The following is part of a series of posts about some of the difficulties with conducting and interpreting statistical research.</i><br><br><i>Previous: <br><a href="http://www.3-dbaseball.net/2015/05/the-gender-gap-in-chess-case-study-in.html">INTRO</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-1-measuring-gender.html">PART 1: MEASURING THE GENDER GAP</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-2-elo-ratings.html">PART 2: ELO RATINGS</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-3-cause-and-effect.html">PART 3: CAUSE AND EFFECT, THE BILALIĆ, SMALLBONE, MCLEOD AND GOBET STUDY</a></i><br><br>Finally, I think one of the biggest issues is that Howard may have misrepresented his research in the Chessbase.com article. Since the full paper is behind a paywall, I don't know for sure or to what extent, but there are certainly indications that the article overstates Howard's conclusions. <br><br>One is the following graph, which is one of the few pieces of data Howard shares from his research: <br><br><span class="fullpost"><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-TUf4v2ctYtY/VWjb1rcTR_I/AAAAAAAAAYw/4zgnNUAworo/s1600/howard07.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://1.bp.blogspot.com/-TUf4v2ctYtY/VWjb1rcTR_I/AAAAAAAAAYw/4zgnNUAworo/s320/howard07.png" /></a></div><br> The graph purportedly refutes the participation hypothesis by showing that the rating gap between males and females increases as the female participation rate increases. 
This supports Howard's alternative hypothesis that the most talented females are already playing no matter how low the overall female participation rate is, and that increasing the participation rate only adds less talented players and can never catch females up to males. <br><br>A few things jump out about this graph, though. First, the data on federations between 5-10% and 15-25% is completely missing from the graph, with the three remaining points forming a neat line with a clear slope. I have no idea if this was deliberate, but it is at least strange. <br><br>More importantly, Howard doesn't explain anywhere in his summary how the data is aggregated, how many players are included in each group, what countries are included in each group, how any individual federations rated, or why this particular graph was chosen out of the various studies or various number-of-games controls Howard seems to have run. <br><br>Howard singles out only Vietnam and Georgia as countries with high female participation in the text of the article. Except when I downloaded the April, 2015 rating list, the difference between the average male rating and the average female rating in Vietnam (94 points) was significantly lower than the difference worldwide (153 points). And Georgia (35 points) had one of the smallest gender rating gaps in the world. I don't have data on the number of games played to check what happens when you include that control, but as I wrote in the previous post, I am skeptical that that could possibly cause the rating gap for Georgia or Vietnam to suddenly jump above average. <br><br>What countries with high (25+%) female participation rate among FIDE-rated players had higher than average gender gaps? Ethiopia had a massive gap, with the average male rated 621 points higher than the average female. But there are only 30 Ethiopian players on the list, with just 9 females. 
Most of the other countries with a high percentage of females on the rating list that had above-average rating gaps also had very few players. <br><br>Now, I don't think it is Ethiopia that is throwing off Howard's chart, because I don't think any of the female players from Ethiopia have played enough FIDE-rated games to qualify for Howard's cutoff, but I wonder if Howard's graph is simply weighting all federations equally when he aggregates the data. If I try to recreate something like Howard's chart with the April, 2015 rating data without any control for games played, then I do get a positive slope <i>if</i> I just take the simple average of each federation's rating gap. If I instead weight each federation's rating gap by the number of female players, so that, for example, Georgia with its hundreds of rated players gets more weight in the aggregate than Ethiopia with its 30, then I get a negative slope: <br><br><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-YfVLYBLVIEo/VWjcB7T5ufI/AAAAAAAAAY4/P5S3VgP-9Rk/s1600/gap%2Bby%2Bfemale%2Bparticipation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://4.bp.blogspot.com/-YfVLYBLVIEo/VWjcB7T5ufI/AAAAAAAAAY4/P5S3VgP-9Rk/s320/gap%2Bby%2Bfemale%2Bparticipation.png" /></a></div><br>So it could be that Howard's graph is aggregating the data in a misleading way. I don't know for sure, but his results look a lot more like what I get when I aggregate the data in a misleading way. It is also possible that setting a control for players at 350 rated games played left relatively few players, and that after further splitting up the data into separate federations like this, there are simply not enough data points to get reliable results. 
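<br><br>To make the difference between the two aggregation methods concrete, here is a small Python sketch. The rating gaps for Ethiopia (621) and Georgia (35) and Ethiopia's 9 female players come from the figures quoted above; the female player count for Georgia is a placeholder I made up for illustration, not a number from the rating list, and the function names are hypothetical.

```python
def simple_mean_gap(federations):
    """Average the federations' rating gaps, treating every federation equally."""
    return sum(f["gap"] for f in federations) / len(federations)


def weighted_mean_gap(federations):
    """Average the rating gaps, weighting each federation by its female player count."""
    total_females = sum(f["females"] for f in federations)
    return sum(f["gap"] * f["females"] for f in federations) / total_females


# Two federations from the high-participation group discussed above.
federations = [
    {"name": "Ethiopia", "gap": 621, "females": 9},   # both figures quoted in the post
    {"name": "Georgia", "gap": 35, "females": 300},   # gap from the post; count is hypothetical
]

if __name__ == "__main__":
    print("simple average gap:  ", simple_mean_gap(federations))
    print("weighted average gap:", weighted_mean_gap(federations))
```

Under these assumed counts, the simple average is dominated by the tiny Ethiopian sample, while the weighted average stays close to Georgia's gap. That is the kind of swing that could flip the slope of a chart built from only a handful of aggregated points.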
<br><br>It is definitely misleading for Howard to highlight Georgia as his prime example of a federation that encourages female participation while he is showing that these countries have a larger gender gap, because Georgia definitely has a smaller than average gender gap. The following line in particular sounds suspicious: <br><br><blockquote>"I also tackled the participation rate hypothesis by replicating a variety of studies with players from Georgia, where women are strongly encouraged to play chess and the female FIDE participation rate is high at over 30%. The overall results were much the same as with the entire FIDE list, but sometimes not quite as pronounced."</blockquote><br>This is right after the graph showing that the gender gap goes up as female participation increases, and right after he singled out only Georgia and Vietnam as examples of countries included in that graph. Howard finds that the gender gap is actually lower in Georgia ("sometimes not quite as pronounced"), but he completely downplays this finding and neglects to report any quantitative representation showing how the results were less pronounced. It is no wonder that readers like Nigel Short got completely the wrong impression of Howard's results, as when Short summarized this graph in the following manner: <br><br><blockquote>"Howard debunks this by showing that in countries like Georgia, where female participation is substantially higher than average, the gender gap actually <i>increases</i> – which is, of course, the exact opposite of what one would expect were the participatory hypothesis true."</blockquote><br>I found this <a href="http://www.chess.com/blog/smurfo/men-women-and-short-2-an-academic-response">review of the full paper</a> written by Australian grandmaster David Smerdon. Smerdon's review gives a very different impression of Howard's work than Howard's own Chessbase summary. 
For example, in reference to the Georgia data and Short's interpretation: <br><br><blockquote>"I don’t know what Short is referring to here, because there is nothing in the Howard article that suggests this. Figure 1 of the study shows that the gender gap is, and has always been, lower in Georgia than in the rest of the world for the subsamples tested (top 10 and top 50). Short may be referring to Figure 2, which, to be fair, probably shouldn’t have been included in the final paper. It looks at the gender gap as the number of games increases, but on the previous page of the article, Howard himself acknowledges that accounting for number of games played supports the participation hypothesis at all levels except the very extreme."</blockquote><br>And later, summarizing Howard's research on the gender gap in Georgia: <br><br><blockquote>"...This supports a nurture argument to the gender gap, but again, the sample size is too small for anything definitive to be concluded."</blockquote><br>This sounds like it is describing completely different research from Howard's Chessbase article. While Short definitely did not do himself or the gender discussion any favours with his interpretation, neither does Howard do his research justice with his published summary. 
</span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-50325721022827169172015-05-29T14:49:00.002-07:002015-08-22T22:21:51.512-07:00Gender in Chess PART 3: CAUSE AND EFFECT, THE BILALIĆ, SMALLBONE, MCLEOD AND GOBET STUDY<i>The following is part of a series of posts about some of the difficulties with conducting and interpreting statistical research.</i><br><br><i>Previous: <br><a href="http://www.3-dbaseball.net/2015/05/the-gender-gap-in-chess-case-study-in.html">INTRO</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-1-measuring-gender.html">PART 1: MEASURING THE GENDER GAP</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-2-elo-ratings.html">PART 2: ELO RATINGS</a></i><br><br>Howard criticizes a <a href="http://rspb.royalsocietypublishing.org/content/276/1659/1161.abstract">2009 study published by the British Royal Society</a> that found support for the participation hypothesis--that there are fewer elite female chess players simply because there are fewer female chess players overall. The Bilalić, et al study looked at the top 100 rated male and female players in the German federation and compared the distribution of their ratings to an expected distribution based on the overall participation rates by gender. The observed gender gap was close to what was expected from the overall participation rates. <br><br>Howard issues three main criticisms of the study: <br><br><span class="fullpost"> 1) It is too difficult to determine cause and effect from their data. <br>2) They didn't control for the number of rated games played. <br>3) The study relies on data from only the German Federation and thus could simply be a sample size fluke. 
<br><br>CAUSE/EFFECT <br><br>Howard argues that showing that the gender gap is in line with what we would expect from participation rates is not enough to establish participation rates as the cause of the gender gap. However, Howard himself does no better in establishing a cause/effect relationship between the gender gap and his hypothesis that men are innately more talented at chess. <br><br>Howard supports his claim with data showing that the rating difference between the top male and female players has remained relatively constant over the years, which he assumes means the gender gap has not closed (<a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-1-measuring-gender.html">which is probably incorrect</a>). He then assumes that if there were non-biological causes behind the gender gap, the gap must have diminished over the past several decades as feminism has advanced in many developed countries, and if it hasn't then that means there is likely a biological cause. <br><br>But he doesn't provide any more support for this assumption than Bilalić, et al do for theirs. Several areas in sports that should be unaffected by the physical differences between males and females, such as coaching, general management, and officiating positions, have seen little to no progress in gender disparity over that same time span in spite of any general advances in society. It is not a given that a lack of significant progress means that gender disparity is due to natural talent. <br><br>I think Howard overestimates his evidence of a causal relationship in part because he underestimates the "gatekeeper" effect in chess. In his 2005 paper, he gives this as an important factor in testing his hypothesis: <br><br><blockquote>"Adequately testing the evolutionary psychology view, that the achievement differences at least partly are due to ability differences, requires a domain with very special characteristics. 
First, it should be a complete meritocracy with no influence of gatekeepers, in which talent of either gender can rise readily."</blockquote><br>Howard relies on the assumption that chess is close to a complete meritocracy because most tournaments are open* and results are based on your performance. Howard contrasts this to fields like science, where decision-makers control access to resources and could be susceptible to bias: <br><br><blockquote>"In most domains, gatekeepers control resources needed for high achievement and may run an ‘old boy’s network’ favouring males. In science, for instance, gatekeepers distribute graduate school places, jobs, research grants, and journal and laboratory space."</blockquote><br><i>*Most major tournaments involving the top players are actually not open, but invitational. Most tournaments below the elite level are open, however.</i><br><br>The absence of decision-makers with the ability to deny players access to tournaments does not mean there are no gatekeeper forces at work, however. There are other forces that can have just as strong an effect. WIM Sabrina Chevannes gives <a href="http://www.chess.com/blog/schevannes/women-in-chess-the-sexisminchess-controversy">some examples of social pressures</a> (under the section "My thoughts on sexism in chess") that commonly make women feel unwelcome or uncomfortable at predominantly male tournaments, ranging from belittling remarks to flat-out harassment. <br><br>These problems are <a href="https://twitter.com/SChevannes/status/590225065020633088">driving established female players away from the game</a>, but they can also be important for young players getting into the game. 
Most grandmasters start chess at a young age, and research backs the idea that <a href="http://www.ncbi.nlm.nih.gov/pubmed/17201516">starting age is an important factor in chess mastery</a> (<a href="http://bura.brunel.ac.uk/bitstream/2438/611/1/Gobet_DevPsyc_Final.pdf">full paper</a>), both because starting earlier allows for greater total accumulation of practice, and because chess likely has a "critical period" effect for learning (the same effect that makes it much easier for a young child to learn a language than an adult). <br><br>This means that even subtle effects, such as a parent being more likely to teach the game to male children at a young age, or young males being more attracted to the social environment of a predominantly male local club, can have a significant gatekeeper effect. Things like age of exposure to chess, access to high-level coaching and competition, and social compatibility with existing chess culture are all important factors in developing a player's ability. <br><br>This is probably why we see strong chess countries like Russia or other former Soviet nations consistently dominating chess, even though they probably don't have any biological ability advantage. The more children who are exposed to favourable learning criteria, the more high-level chess players a population will produce. Just like these factors help keep the strongest federations on top, they could conceivably favour male players over female players. <br><br>Chevannes also points out more explicit gatekeeper behaviour, such as limited access to funding and coaching for England's women's Olympiad team ("Effects of sexism in English Chess"). Several countries provide state funding or private grants for chess development, similar to the type of gatekeeper influences Howard describes in science. For example, the USCF has the Samford Chess Fellowship, a private grant currently for $42,000, which has been awarded annually since 1987. 
Thirty of the 32 recipients (in three years the grant was split between two recipients) have been male. <br><br>And, as mentioned earlier, most of the top tournaments are actually invitational, which also fits Howard's criteria for gatekeeper influence. The potential gatekeeper effect of invitational tournaments preserving rating gaps is even something <a href="http://www.chessquotes.com/topic-ratings">players have complained about:</a> when the top tournaments only hand out invitations to the same group of top-rated players, those players just end up trading rating points among themselves, which leaves little opportunity for them to give rating points back to the rest of the field. <br><br>These factors are incredibly difficult to measure and separate out from your data, which is why Howard considers the absence of such factors essential to test his hypothesis. By ignoring these factors, Howard strongly inflates his evidence in support of a biological cause. In fact, this is a common criticism of the entire field of evolutionary psychology which Howard uses to approach this question: its hypotheses about cause and effect are so difficult to properly test, it is debatable whether it actually qualifies as science. <br><br><br>NUMBER OF GAMES CONTROL <br><br>As discussed in the <a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-2-elo-ratings.html">previous post</a>, I don't think controlling for the number of rated games played adequately separates out the effect of practice and development from that of natural talent. More importantly, though, Howard's criticism here is confusing because he only describes the importance of controlling for number of games as something that could avoid a potential bias against female players. Females tend to play far fewer rated games on average, and a player's rating tends to increase the more games they play. 
<br><br>In order for this criticism to be relevant to the Bilalić study, omitting this control would have to bias the results in favour of female players. Howard offers no reasoning as to why this would be the case, and it is not at all obvious how it could be. Howard's own data appears to show a decreased gender gap after controlling for number of games. <br><br><br>NOT ENOUGH DATA POINTS <br><br>While Bilalić, et al did only look at players from the German federation, they compared ratings for the top 100 players of each gender. In Howard's original study, he included players from all federations, but still only compared the ratings of the top 10, 50, and 100 players of each gender, so he was not actually using any more data points than the study he is criticizing. <br><br>Just as importantly, Bilalić, et al actually had a reason for using data from just one federation rather than FIDE data, as outlined in a <a href="http://journal.frontiersin.org/article/10.3389/fpsyg.2014.00569/full">later paper by Bilalić, Nemanja Vaci, and Bartosz Gula</a>. FIDE rating data is limited to only above-average players and omits a lot of data from developing or below-average players. Rating data from individual federations can allow for a more comprehensive view of the population, such as a better estimation of overall participation rates, which was necessary for their study. <br><br>In Howard's summary article, he refutes the Bilalić study by showing data from more federations, but he doesn't actually repeat their study to create a comparison to their work. Instead, he just shows aggregated data with no indication of how many players were included or how the data was aggregated. It is not clear that he actually used more data points to draw his conclusion than Bilalić, et al used, only that he looked at players from multiple federations. <br><br>Howard's most recent study is behind a paywall, so unfortunately all I have to go by is his summary published on Chessbase.com. 
I assume there are more details in the full study, but it is impossible to tell how his data really compares with the data from the Bilalić study from what he published in the summary, which is largely written as a refutation to the Bilalić study. <br><br><i>NEXT: <br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-4-misrepresenting.html">PART 4: MISREPRESENTING THE DATA</a></i></span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-31232034772369829802015-05-29T14:48:00.003-07:002015-05-29T14:57:51.317-07:00Gender in Chess PART 2: ELO RATINGS<i>The following is part of a series of posts about some of the difficulties with conducting and interpreting statistical research.</i><br><br><i>Previous: <br><a href="http://www.3-dbaseball.net/2015/05/the-gender-gap-in-chess-case-study-in.html">Intro</a></i><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-1-measuring-gender.html">PART 1: MEASURING THE GENDER GAP</a><br><br>We saw previously that the lack of significant change in the rating gap between the top male and female players can actually be evidence that the gender gap in chess has diminished over the past few decades. That is not the only potential interpretation problem with Howard's conclusion, though. Even if control groups hadn't indicated that the Elo gap should be increasing absent any closing of the gender gap, it could still be conceivable that the gender gap has in fact diminished. <br><br>This is because Elo ratings are not indicators of absolute playing strength, only of strength relative to the field of rated players. In other words, a 2500 Elo rating among one group of players is not necessarily equivalent to a 2500 rating among another group of players. For example, Hikaru Nakamura's FIDE rating for April, 2015 is 2798. His USCF rating is 2881. 
Both are Elo ratings, but because they are tracked among different pools of players, they don't have to match up even though they are both describing the strength of the exact same player. <br><br>Howard is looking at the same FIDE rating for both male and female players, though, so this shouldn't be a problem, right? Possibly, but we don't know for sure. <br><br><span class="fullpost"> Elo ratings work by taking points away from one player and giving them to the other each time a game is played. If a player has a true playing strength of 2500 but is rated at 2400, then they would be expected to take points from their opponents until their rating matches their playing strength. Likewise, a player who is overrated will give points back to the field until their rating returns to their ability. <br><br>Many top female players play predominantly or exclusively in women's events. And some of the top female players who play in open events, such as Judit and Susan Polgar when they were still active, rarely play women's events at all, and as a result rarely play against other women. Because of this, if males or females are over- or underrated as a group, there might not be enough games between the two groups to transfer the necessary rating points to bring them back in line. It is possible that female players and male players form two sufficiently isolated player pools that their ratings are not necessarily comparable. <br><br>This might sound far-fetched, but it is actually a known problem and has occurred before. In 1987, FIDE commissioned a study comparing the performance of top female players against men to their performance against other women because of this exact issue. The six women who had played a sufficient number of games against both genders over the mid-80s to qualify for the study <a href="http://anusha.com/elo.htm">all held significantly higher performance ratings against male opponents than female opponents</a>--on average more than 100 points higher. 
<br><br>This suggested that, for example, a 2400-rated female player was likely stronger than a 2400-rated male player. To compensate, FIDE added 100 points to all rated female players (except Susan Polgar*) in order to bring their ratings in line with the male ratings. It is possible that the two pools of players have remained isolated enough to drift out of sync again over the last few decades, however. <br><br><i>*The reasoning was that Polgar already played mostly within the male pool of players and didn't need the adjustment. However, the decision to give the full 100 points to her top rivals, who also played a significant number of games against men, and 0 points to Polgar was nonsensical and controversial, and there were accusations that FIDE was deliberately manipulating the ratings to place Maya Chiburdanidze in the #1 spot ahead of Polgar.</i><br><br>Most people who follow chess believe that some form of inflation exists in the ratings. In other words, they believe a 2800 rating is not as strong now, when there are a handful of players hovering around that level, as it was when Garry Kasparov first achieved it back in 1990 and Anatoly Karpov was the only other player over 2700, or in 1972 when Bobby Fischer topped the ratings list by over 100 points at 2785. <br><br>The mechanism of inflation is not well understood, however, and it is not clear that it would necessarily have had the same effect on a fairly isolated pool of female players as on the population as a whole. It could be that after nearly 30 years, male ratings have inflated faster than female ratings, and we have once again reached a point where female players as a whole are underrated. <br><br><br>Howard himself notes another potential interpretation problem with using FIDE ratings to measure the gender gap in chess: <br><br><blockquote>"I found that women typically play many fewer FIDE-rated games than males, only about one third of the number on average. 
Now, the usual learning curve for chess players is a progressive ascent to a peak at around 750 FIDE-rated games. ... Comparing modestly- and highly-practiced individuals can be misleading. Studies should control for differences in number of games played, either by equating males and females on this or by examining differences at the typical rating peak at around 750 games."</blockquote><br>Howard then dismisses this explanation because even after controlling for the number of rated games played, males still had higher ratings. <br><br>The number of FIDE-rated games played itself isn't really what we care about, though. It's just a proxy for "modestly-practiced", "highly-practiced", etc. Players who have played more games should, in general, have more experience and further development. Games played are far from a perfect indicator of a player's level of development or experience, however. <br><br>Most obviously, not all games are FIDE-rated. While top-rated players do for the most part compete exclusively in FIDE-rated events, that is not true for developing players. For example, U.S. prodigy <a href="https://ratings.fide.com/chess_statistics.phtml?event=2040506">Sam Sevian has played 539 FIDE-rated games</a> as of April, 2015. He's played <a href="http://main.uschess.org/datapage/gamestats.php?memid=13493815">922 USCF-rated games</a>. Even ignoring casual and club games, that is hundreds of competitive games that are not in Howard's data (and it is more than the difference between 922 and 539, because not all FIDE-rated games are USCF-rated). <br><br>The amount of study devoted to chess outside of rated games is also a huge factor in development. Someone who is devoted to studying chess full-time will develop much more than someone who competes as a casual hobby, even if you control for the number of rated games played. 
Likewise, someone who competes fairly regularly and reaches 750 games in their 20s is different from someone who competes less frequently and reaches 750 games in their 40s or 50s (or even later). The former is probably much more likely to still be ascending and hitting their peak at that point, while the latter likely peaked or plateaued at a much lower number of games, and would probably have begun declining with age by the time they reached 750 games. <br><br>It's easy to see how two players can be at vastly different stages of development even after the same number of games played. Howard isn't comparing two individual players, though--he is comparing two groups of players (male and female). As long as you look at enough players in each group, shouldn't those other factors start to even out? <br><br>Ideally, they should. If there is a bias that applies to the group as a whole, though, that won't happen. For example, if female players tend to begin playing FIDE-events at an earlier stage in their development, or if they tend to compete less frequently than male competitors, that would introduce a bias that won't even out. <br><br>Many female players compete predominantly in female-only events, which are less frequent than open-gender events. And because these female-only events draw from a much smaller segment of the chess-playing population than the open-gender events, they also tend to be less intimidating for less-experienced players to enter. So there is a good chance this bias does exist. <br><br>In fact, Howard's data supports this. His 2005 paper includes a table summarizing males and females who entered the rating list between 1985 and 1989, and shows that the median age at which females first appeared on the list was about five years younger than the median age for males (though for top 100 females, it was only about 6 months younger than for top 100 males). 
And, in spite of entering the rating list at a younger age, the females on average still played significantly fewer games in their competitive careers. <br><br>Howard hits on an important idea about a player's rating being reflective of both their innate abilities and their level of practice and development. In order to test for the effects of innate abilities alone, as Howard sets out to do, he realizes that he needs to strip out the effects of development. However, this is a much more complicated issue than Howard acknowledges, and simply controlling for the number of rated games played is not adequate to make the assumption that any remaining differences must reflect natural ability. <br><br><i>NEXT: <br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-3-cause-and-effect.html">PART 3: CAUSE AND EFFECT, THE BILALIĆ, SMALLBONE, MCLEOD AND GOBET STUDY</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-4-misrepresenting.html">PART 4: MISREPRESENTING THE DATA</a></i> </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com1tag:blogger.com,1999:blog-2868194292414002063.post-90632658168081128072015-05-29T14:42:00.000-07:002015-05-29T14:58:50.625-07:00Gender in Chess PART 1: MEASURING THE GENDER GAP<i>The following is part of a series of posts about some of the difficulties with conducting and interpreting statistical research.</i><br><br><i>Previous: <br><a href="http://www.3-dbaseball.net/2015/05/the-gender-gap-in-chess-case-study-in.html">INTRO</a></i><br><br> Howard begins by revisiting a <a href="http://www.chrest.info/Fribourg_Cours_Expertise/Articles-www/IV%20Expertise%20et%20societe/Howard%20on%20gender%20diff%20in%20chess.pdf">2005 paper he published on the same topic</a> showing the gap between the average Elo rating of the top 50 male players and the top 50 female players: <br><br><a href="http://4.bp.blogspot.com/-4-MBaSyr6Vc/VWjaWmmYCPI/AAAAAAAAAYc/EzmKkW21veU/s1600/howard05.png" 
imageanchor="1" ><img border="0" src="http://4.bp.blogspot.com/-4-MBaSyr6Vc/VWjaWmmYCPI/AAAAAAAAAYc/EzmKkW21veU/s320/howard05.png" /></a><br>Howard then argues that because the Elo gap has remained relatively constant in spite of societal changes over that time period, the difference between the male and female ratings is not due to societal factors and is at least partially biologically-based. <br><br>This finding is likely surprising to most people in chess. For example, the legendary Garry Kasparov, who early in his career expressed a somewhat Fischer-esque dismissal of female chess talent, grew to greatly respect the Polgar sisters (one of whom has defeated Kasparov himself) and felt they broke new ground for female players. In a recent interview at an exhibition match with Short in St. Louis, Kasparov rejected the claim that the gender gap has not closed. Even Short himself wrote that he had assumed the gap had closed somewhat before reading Howard's article. <br><br><span class="fullpost"> Howard acknowledges this prior expectation in his 2005 paper: <br><br><blockquote>"Anecdotally at least, there has been some convergence in chess at top levels. For example, there are more female grandmasters. Judit Polgar, born in 1976 and the strongest-ever female player, regularly wins tournaments against top male competition and several times has made the top ten players list. She once held the record for youngest-ever grandmaster. But, the extent of gender differences and their trends over time have never been quantified."</blockquote><br>After quantifying the Elo difference, though, Howard simply assumes that the difference remaining flat means there has been no closing of the gender gap. This might seem like a reasonable assumption, but it lacks an important step: he has no control group to help interpret his results. <br><br>*** <br><br>Computers have revolutionized how chess is played and studied at the top level. 
With the help of computer engines that are much stronger than any human grandmaster, known opening lines are constantly being analyzed more and more thoroughly. The more thoroughly these lines are known, the more important it is for players to memorize them, and the deeper they have to look for new ideas that could lead to a winning position. Strong grandmasters spend most of their time studying and developing these lines. <br><br>Former World Champion Vladimir Kramnik (born 1975, reached grandmaster 1991) said at his most recent tournament that <a href="https://www.youtube.com/watch?v=EsPLcY6tzhw&t=1m32s">top players have to work much harder now</a> than when his career was starting. However, only the very top players can support themselves studying chess and competing full time. Most grandmasters, let alone lower titled or untitled players, don't have the time to keep up with all of these advances. <br><br>It is possible that this has led to the top players distancing themselves from the field. If that is the case, then, absent any closing of the gender gap, we would expect the Elo gap between the top 50 males and the top 50 females to have grown over time, just because there are more males in the group at the very top that is pulling away from everyone else. We need some kind of control group to compare to in order to help us interpret Howard's graph before we conclude the gender gap has not closed. <br><br>One way to do this is to compare breakdowns other than the top 50 females vs. the top 50 males. For example, what if we take the top 50 Russian players, and compare them to the top 50 non-Russian players? <br><br>The top 50 Russians in the April 2015 FIDE rating list have an average Elo rating of 2659. The top 50 players from outside Russia are at 2726. So the top 50 Russian players are 67 points below the non-Russians. 
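<br><br>For readers who want to repeat this kind of comparison, here is a minimal sketch in R of the calculation. It is not the code used for this post: the helper name top_k_gap, its arguments, and the toy ratings are all assumptions made up for illustration, and a real run would use a FIDE rating list with k = 50.

```r
# Hypothetical helper: average rating of the top k players outside a
# federation minus the average of the top k players inside it
# (a positive value means the federation trails the rest of the world)
top_k_gap <- function(rating, fed, target, k = 50) {
  inside  <- sort(rating[fed == target], decreasing = TRUE)
  outside <- sort(rating[fed != target], decreasing = TRUE)
  stopifnot(length(inside) >= k, length(outside) >= k)
  mean(head(outside, k)) - mean(head(inside, k))
}

# Toy data, invented purely to exercise the function (not real FIDE ratings)
rating <- c(2700, 2650, 2600, 2550, 2800, 2750, 2700, 2500)
fed    <- c("RUS", "RUS", "RUS", "RUS", "USA", "NOR", "CHN", "USA")

top_k_gap(rating, fed, "RUS", k = 2)  # (2800 + 2750)/2 - (2700 + 2650)/2 = 100
```

Running the same function on two rating lists from different years and comparing the results gives the change in the gap over time.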
<br><br>If we go back to 1991 (the first year the Soviet federations were listed separately--it would be impossible to make comparisons before that because the USSR included many strong players from outside Russia), the top 50 Russians were 54 points behind the top non-Russians. So the gap has grown a bit in the last couple decades, in spite of the fact that Russia remains by far the top federation. <br><br>Of course, you might be able to make a case that Russia is a bit weaker than it was in the early 90s when Kasparov and Karpov were still dominating chess. Except here's the thing: when we compare Russia to the rest of the world, Russia has lost ground. But if we instead compare Russia to each individual federation, they have actually gained ground over most of them. This seems paradoxical, but it makes sense if the top end of the spectrum is stretching itself out. <br><br>Let's take a look at some of these other countries. <br><br>The U.S. is experiencing something of a golden age for chess right now. They currently have two of the top ten players in the world. Hikaru Nakamura, the best American player since Bobby Fischer*, has been as high as #2 in the world in the live rankings this year, and recently became the first American to hit 2800 Elo. Increased funding and efforts in development programs have produced some remarkable young talent, including Sam Sevian, who in 2014 became the sixth-youngest grandmaster ever at 13 years old. <br><br><i>*at least not counting Fabiano Caruana, who has spent most of the last year as the #2 player in the world--Caruana was born in the U.S. but moved to Europe at age 13 and has represented Italy for his professional career</i><br><br>The emergence of serious collegiate chess teams has also attracted strong talent from around the world to the U.S. For example, five of the twelve competitors in the open-gender division of the 2015 U.S. 
National Championship (and at least that many from the Women's division) had originally competed under a different national federation before transferring to the USCF, including world #7 Wesley So. Likely influenced by the emergence of American chess, the aforementioned Caruana recently announced that he is transferring back to the USCF. <br><br>You would be hard pressed to argue that the U.S. federation is weaker now than in 1991, and certainly not much weaker. Yet in 1991, the top 50 American players were 105 points behind the top 50 non-Americans. Now, they're 185 points back. <br><br>What about Norway, the home of current World Champion and clear #1 Magnus Carlsen? Carlsen has sparked a chess craze in Norway, where tournaments now get <a href="https://chess24.com/en/read/news/norwegian-national-tv-networks-vying-for-chess-broadcast-rights">national TV coverage</a>. Norway hosts one of the top chess tournaments in the world (Norway Chess) and last year hosted the Chess Olympiad. The number of Norwegians in FIDE's published rating list grew from 92 in 1991 to 1306 this year. <br><br>The gap between the top 50 Norwegian players and the rest of the world grew from 289 points in 1991 to 337 points in 2015. <br><br>Not all federations saw their gap increase. China, for example, has without a doubt become much stronger in chess since 1991. Chess has had difficulty catching on in China due to the prevalence of xiangqi, China's native chess variant, and go, another popular strategy game. Chess was even outlawed for a period in the 1960s and '70s as part of Chairman Mao's Cultural Revolution. Starting in the 1970s, however, China began pouring an increasing amount of funding and effort into growing its chess program. <br><br>This has ramped up in recent years, and China has finally emerged as a world chess power. 
Their women's team has won gold in four of the nine Chess Olympiads held since 1998 and three of the five World Team Championships since a women's division was created in 2007. The open-gender team won gold in the 2014 Olympiad and the 2015 World Team Championships. Their top 50 went from 329 points back of the world in 1991 to 207 points back in 2015. <br><br>Still, the vast majority of federations saw increases. Here are the Elo gaps for each of the 38 federations that had at least 50 FIDE-rated players in both 1991 and 2015: <br><br><a href="http://1.bp.blogspot.com/-DyE4xZrYkXU/VWjahIAfPFI/AAAAAAAAAYk/hV2Dor5B3DU/s1600/elo%2Bgap.png" imageanchor="1" ><img border="0" src="http://1.bp.blogspot.com/-DyE4xZrYkXU/VWjahIAfPFI/AAAAAAAAAYk/hV2Dor5B3DU/s320/elo%2Bgap.png" /></a><br><br>Only 5 of the 38 federations closed the Elo gap at all, and on average the gap grew by 54 points. <br><br>When we look at the individual federations as control groups, we see evidence that the top really is separating itself further away from the field as time goes on. In spite of that, Howard's graph shows that women actually closed the Elo gap by a small amount. This can be interpreted as evidence that the gender gap is in fact closing, because it is offsetting the effect we are seeing with the national federations. <br><br>It is tempting to see evidence that supports your hypothesis in a vacuum, such as the relatively constant Elo gap between male and female players over the years, and to stop there. It is also tempting to believe a variable you believe to be objective and unbiased, such as Elo ratings, is self-explanatory and needs no control group to interpret. However, this is a dangerous practice. Especially when your results run counter to what subject matter experts would expect, as this finding did, it is important to make sure you have the proper context to interpret your results before jumping to conclusions. 
<br><br><i>NEXT: <br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-2-elo-ratings.html">PART 2: ELO RATINGS</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-3-cause-and-effect.html">PART 3: CAUSE AND EFFECT, THE BILALIĆ, SMALLBONE, MCLEOD AND GOBET STUDY</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-4-misrepresenting.html">PART 4: MISREPRESENTING THE DATA</a></i> </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-75404565071080788502015-05-29T14:41:00.003-07:002015-09-02T07:38:51.788-07:00THE GENDER GAP IN CHESS: A CASE STUDY IN STATISTICAL RESEARCH<i>The following is an introduction to a series of posts about some of the difficulties with conducting and interpreting statistical research, with links to the rest of the series at the end of this post.</i><br><br>Bobby Fischer once said he could beat any woman in the world giving them knight odds* (the <a href="http://www.chessquotes.com/player-fischer">full quote</a>, in true Fischer fashion, is worse). Mikhail Tal famously responded, "Fischer is Fischer, but a knight is a knight!" <br><br><i>*Knight odds means the player giving odds starts the game with one knight already off the board.</i><br><br>Tal was correct, of course. In 2008, a master player named John Meyer, rated 2284 (grandmasters are rated 2500+, with the top GMs well over 2700 or even 2800), played a match against the computer program Rybka with knight odds. By that time, computers had far surpassed humans in chess. Rybka could have easily defeated the world champion in a non-handicapped match. With knight odds, Meyer won the match 4-0. There were women in Fischer's generation much stronger than Meyer who would have had no problem beating Fischer given such a handicap. <br><br>Still, chess remains a largely male-dominated profession. 
Currently, there are just two women in the top 100 rated players in the world, and one (Judit Polgar) is retired and will fall out of the active rankings later this year. Theoretically, chess should be among the most gender-neutral competitive disciplines, but the overwhelming majority of players are male. In fact, the predominance of male players is so strong that the URL for FIDE's top 100 overall list actually ends with "...?list=men", even though there are women on the list. <br><br>The question of why this is and what can (or should) be done about it has long been a point of discussion in the game, but this discussion reached the mainstream media last month due to controversy over an <a href="http://en.chessbase.com/post/vive-la-diffrence-the-full-story">article written by British Grandmaster Nigel Short</a> in the magazine New in Chess. <br><br><span class="fullpost"> If you don't know Short (and you probably don't unless you particularly follow chess or remember his highly publicized World Championship match with Garry Kasparov in 1993), he's...well he's not really the best representative to speak about anything, really. When asked to write an obituary in his newspaper chess column for fellow British Grandmaster Tony Miles, he pretended to write a proper obituary for a few paragraphs before descending into a long-winded rant about why he didn't like Tony Miles, culminating with the line "I obtained a measure of revenge not only by eclipsing Tony in terms of chess performance, but also by sleeping with his girlfriend, which was definitely satisfying but perhaps not entirely gentlemanly." Nigel Short, everyone. 
<br><br>So it's no surprise that Short set off some fuses when asked to write about this topic (by the time he gets to the part about how he has to "manoeuvre the car out of our narrow garage" for his wife, you kind of get the sense that he's just doing this on purpose--which, in a media environment where controversy equals views equals money, he may well be.) <br><br>In the midst of his rambling, though, Short actually does cite an academic paper by Robert Howard (actually, <a href="http://en.chessbase.com/post/explaining-male-predominance-in-chess">a synopsis of the study posted by Howard</a> to the chess website Chessbase.com): <br><br><blockquote>"Nevertheless, my gut feeling was that female chess players are both stronger and more numerous than they were when I first began competing. The latter is certainly true, but an excellent article by the Australian Robert Howard on the chessbase.com website last year demonstrated that, despite the enormous societal changes over 40 years, the gap between the leading males and females has remained fairly constant at nearly 250 Elo points – a yawning chasm in ability. That women seem stronger has more to do with universally higher standards, due to the ubiquity of computers, than any closing of the gender gap." </blockquote><br><br>Unfortunately, Short's citation comes with a clear agenda, as is evident in how he presents a second academic study which reached different conclusions: <br><br><blockquote>"Howard also subtly critiques the most absurd theory to gain prominence in recent years, by Bilalić, Smallbone, McLeod and Gobet (which was submitted to the prestigious Royal Society, no less), that the rating sex difference is almost entirely attributable to participatory numbers (they comprise just 1% of the readership of this magazine). 
With the aid of a couple of bell curves this foursome neatly solve the eternal chess conundrum of why women lag behind their male counterparts, while simultaneously satisfying that irritating modern psychological urge to prove all of us, everywhere, are equal. Only a bunch of academics could come up with such a preposterous conclusion which flies in the face of observation, common sense and an enormous amount of empirical evidence too. Howard debunks this by showing that in countries like Georgia, where female participation is substantially higher than average, the gender gap actually <i>increases</i> – which is, of course, the exact opposite of what one would expect were the participatory hypothesis true."</blockquote><br>The problem is partially that Short probably has no idea what the studies are doing (for example, Short seems unaware that Howard found the gender gap did decrease in Georgia compared to the rest of the world, makes up the term "enormous amount of empirical evidence" without justification, and I don't get the impression he's even read the Bilalić, et al study), but in this case, the blame doesn't lie entirely with Short. Howard's synopsis itself is largely responsible. It appears to misrepresent Howard's own work, as well as point to some potential critical issues with the study. <br><br>That being the case, I'd like to use this as an opportunity to cover some of the potential pitfalls in running this type of statistical analysis. 
<br><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-1-measuring-gender.html">PART 1: MEASURING THE GENDER GAP</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-2-elo-ratings.html">PART 2: ELO RATINGS</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-3-cause-and-effect.html">PART 3: CAUSE AND EFFECT, THE BILALIĆ, SMALLBONE, MCLEOD AND GOBET STUDY</a><br><a href="http://www.3-dbaseball.net/2015/05/gender-in-chess-part-4-misrepresenting.html">PART 4: MISREPRESENTING THE DATA</a> </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-70894288010807181262015-03-16T12:06:00.002-07:002015-03-17T05:19:08.855-07:00Math Behind Projecting the Division Winner (THT Article)<i>Note: this article uses examples from the free <a href="http://www.r-project.org/">statistical software R</a></i><br><br>In my <a href="http://www.hardballtimes.com/projecting-uncertainty-a-roadmap-to-the-projected-standings/">Hardball Times article</a> about projecting the number of wins we expect from the division winner, I included the following example: <br><br><blockquote><i>Instead of having five baseball teams, let's say we have five coins. All we are going to do is flip each coin 162 times. Each time a coin lands on heads, it gets a win, and each time it lands on tails, it gets a loss. The coin with the most wins after 162 flips wins the division. <br><br>How many wins would you project for the coin that ends up winning the division, whichever coin that might be? <br><br>No coin by itself is going to have an expected value of more than 81 wins, but it is extremely likely that at least one out of the five coins will end up with more than 81 wins just by chance. 
It turns out that if you repeat this experiment a bunch of times, the coin that wins the division will end up with about 88 wins, on average.</i></blockquote><br>Hopefully this makes sense conceptually, but how do I get 88 wins (or, more precisely, 88.3943...)? <br><br><span class="fullpost">One way, of course, is to actually do what I said, and flip a bunch of coins over and over and over and record the results. Let's say I repeat this experiment 10 times, and I get the following results for the "division winners": <br><br>94, 85, 89, 87, 89, 90, 82, 86, 85, 86 <br><br>That is an average of 87.3--pretty good, but obviously not the most precise estimate. We need to repeat the experiment more than ten times to make sure we get something closer to the true mean. Rather than spend hours upon hours flipping coins, we can actually cheat and get a computer to pretend to do it for us. This is called simulation, and it can be a very powerful statistical tool for determining probabilities, averages, distributions, etc that are not computationally obvious (full disclosure: I actually cheated and simulated the 10 seasons rather than record and tally 8000+ coin flips). <br><br>Now, let's simulate 1000 seasons: this time, we get 88.5940 wins leading the division, on average. Much better, but still a couple tenths off. Bumping the number of seasons up to 10,000, this time we get 88.4296. And if we keep simulating more and more seasons, we are going to start seeing the results stay clustered more and more closely around 88.3943. <br><br>So that's one way to estimate the expected win total for our division winner. How do I know that the results should cluster around 88.3943 specifically, though, other than simulating millions and millions of seasons? <br><br>We can get the answer without simulation by starting with a simpler question. What is the probability that none of the teams wins more than, for example, 81 games? 
The probability that one team wins no more than 81 games is a simple binomial distribution problem: pbinom(81,162,.5) ~ .5313. The probability that all five are at 81 or lower then becomes .5313^5 ~ .04233. <br><br>There is about a 4% chance that the division winner will have 81 or fewer wins. We can repeat that calculation for 80 wins, and we see that there is about a .02262 probability of the division winner having 80 or fewer wins. That means the probability of the division winner having exactly 81 wins is .04233 - .02262 = .01971. <br><br>Then, we repeat that process for every number from 0 to 162, and we end up with a table of probabilities of the division winner ending up on each possible number of wins. (If you were to do this by hand, you could shortcut a bit by only going from something like 70 to 115 since the probabilities outside that range are all virtually zero anyway.) <br><br>Finally, we multiply each possible win total by the probability of the division winner finishing with that number of wins, and we add up the results to get a mean for the distribution. And doing that gives us 88.3943. <br><br><br><i>R CODE: <br><br>#calculate expected mean value of division winner<br>p <- .5 #probability of each team winning each game<br>n <- 162 #number of games per season<br>teams <- 5 #number of teams in the division <br><br>games <- 0:n # list of possible win totals (0:162)<br>p.list <- pbinom(games,n,p)^teams # p of div winner winning X games or fewer<br>wts <- c(p.list[1],diff(p.list)) # p of div winner winning exactly X games<br>sum(games*wts) # average wins by division winner <br><br>#RESULT<br>[1] 88.39431</i> <br><br><br>As we can see, it is possible to calculate the mean of this distribution exactly, but it is still pretty cumbersome to do so without a computer. As such, let's discuss one final way to estimate this mean using simpler calculations. 
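<br><br>Before getting to that shortcut, a quick aside on the simulation approach described earlier: it can be sketched in a few lines of R. This is a minimal sketch and not the code used for the article; the seed and the names seasons, wins, and winners are assumptions introduced here for illustration.

```r
# Simulate many five-coin "divisions" and average the winner's win total
set.seed(1)        # arbitrary seed, only so the run is repeatable
p <- .5            # probability of each team winning each game
n <- 162           # number of games per season
teams <- 5         # number of teams in the division
seasons <- 10000   # number of simulated seasons (assumed value)

# one row per season, one column per team; each entry is a season win total
wins <- matrix(rbinom(teams * seasons, n, p), nrow = seasons)
winners <- apply(wins, 1, max)   # best record in each simulated division
mean(winners)                    # simulated average wins for the division winner
```

With this many seasons the simulated mean should land within a few tenths of a win of the exact 88.39, in line with the simulated figures quoted above, and it tightens further as seasons grows.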
<br><br>First, we will need a continuous distribution, so we use a normal approximation for the binomial distribution. The mean of the normal distribution will just be 81 (the average number of wins we expect from a team in our example), and the standard deviation will be sqrt(npq) = sqrt(162*.5*.5) ~ 6.36. <br><br>All we need to do now is find the point where there is a 50% chance that five numbers randomly sampled from this distribution will all fall below that number. Start by finding the percentile of the distribution that fulfils this condition: <br><br>p^5 = .5<br>p = .5^(1/5) ~ 0.8706 <br><br>This means we want the point at the 87.06th percentile (the 0.8706 quantile) of our normal distribution, which is easy to look up using an online tool or basic statistical software: <br><br>qnorm(0.8706,81,6.36) ~ 88.1849 <br><br>That is our estimate for the expected number of wins from the division winner. This is slightly off because we are actually calculating the median and not the mean (and because we used a normal approximation, but that makes less difference), but it is still a pretty good estimate given the amount of calculation we simplified. </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com1tag:blogger.com,1999:blog-2868194292414002063.post-80600491883084399802015-02-25T05:00:00.001-07:002015-02-25T05:01:09.594-07:00More From Jesse Burkett (Hit Batsmen)From the same news archive binge as the <a href="http://www.3-dbaseball.net/2015/02/nl-institutes-pitch-clockin-1901.html">previous article</a>, we get <a href="http://chroniclingamerica.loc.gov/lccn/sn84020274/1901-03-29/ed-1/seq-6/#words=penalizing+hitting">more from Jesse Burkett on the NL's rule changes</a>, and...holy crap. 
Apparently there was a (thankfully short) period in the NL where hitting a batter only awarded the batter a ball, not a base: <br><br> <span class="fullpost"><blockquote>"That rule penalizing the pitcher with only a ball for hitting a batter is a very bad one," said the great hitter. "My word on it, some of those pitchers will be bounding fast ones off the batter's ribs this season."*</blockquote><br>Well said, Mr. Burkett. I can see why that change didn't stick. <br><br>The article also insinuates that part of the reason Cy Young jumped to the AL (which did not lessen the penalty for hitting batters) was that he thought the new HBP rule was dumb. <br><br>*<i>The St. Louis Republic</i>. (St. Louis, Mo.), 29 March 1901. Chronicling America: Historic American Newspapers. Lib. of Congress. </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-76202799822657104132015-02-25T04:23:00.000-07:002015-02-25T04:24:53.518-07:00NL Institutes Pitch Clock...in 1901With MLB's newfound interest in speeding up the pace of play, it's easy to forget that MLB rules actually had a pitch clock in place before this year. Granted, it was virtually never enforced (I think I saw an automatic ball called for a clock violation once, and no one knew what was going on when it happened), but the rule was technically there. <br><br>I had no idea just how far back that rule went, though, until I saw this quote from Hall of Famer <a href="http://www.baseball-reference.com/players/b/burkeje01.shtml">Jesse Burkett</a> while <a href="http://chroniclingamerica.loc.gov/lccn/sn84020274/1901-03-27/ed-1/seq-7/#words=seconds">browsing through some old sports pages</a> (scroll/zoom to the highlighted word at the bottom right corner of the page): <br><br><span class="fullpost"><blockquote>I have been reading how the rule limiting the pitcher to twenty seconds on the slab before throwing will handicap "Cup". 
That is only a National League rule, and "Cup" is in the American, where the rule is not in force.*</blockquote><br>"Cup" here is <a href="http://www.baseball-reference.com/players/c/cuppyni01.shtml">George Cuppy</a>, a longtime teammate of Burkett's who had just signed with the newly formed American League. I don't know if the rule was on the books continuously from 1901-present time, but apparently the idea of a 20-second limit on pitchers dates back at least that far. <br><br>*<i>The St. Louis Republic</i>. (St. Louis, Mo.), 27 March 1901. Chronicling America: Historic American Newspapers. Lib. of Congress. </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-53193787286073595562015-02-12T11:21:00.001-07:002016-05-21T01:38:08.808-07:00Jeff Manship and the Denny Bautista-line<a href="http://www.fangraphs.com/statss.aspx?playerid=6865&position=P">Jeff Manship</a> signed a minor league deal with Cleveland this past December. These are the sorts of deals the Jeff Manships of the world get. Manship has made two Opening Day rosters in his career--in 2011 and again in 2014--and he had to fight it out in Spring Training for both. In 2011, he made it to April 17, just 3.1 IP over 5 games, before getting sent down. Last year, he stayed up until July 23, but over a month of that time was spent on the DL.<br /><br />No, there's nothing remarkable at all about Manship's contract with Cleveland. He's the type of player who is only even a free agent at all because no one wants to hand him an MLB roster spot, and he's out of options. What is remarkable is that Manship keeps making the Majors anyway, every single year. Since his first call-up in 2009, he has now spent time in the Majors for six years running. And in every single one, he's had an ERA above 5.00. 
<br /><span class="fullpost"> </span><br /><span class="fullpost"><br /><a href="http://www.fangraphs.com/statss.aspx?playerid=1947&position=P">Denny Bautista</a> was, in some ways, a rather un-Manship-like prospect. Manship was drafted in the 50th round out of high school, went to Notre Dame, and then signed as a 14th round pick three years later. In 2008, he climbed to #9 on Baseball-America's list of the Twins' top ten prospects, only to drop back out of the list the following year. John Sickels, who also had Manship as the Twins' #9 prospect in 2008, had him at the back end of their top 20 each of the next two years.<br /><br />Bautista, meanwhile, was signed as a 17-year-old out of the Dominican Republic. He was (and still is, presumably) the cousin of Ramon and Pedro Martinez. He twice (in 2002 and 2004) cracked Baseball-America's top 100 prospect list, peaking at #59. At 21, an age where Manship was still finishing up his career at Notre Dame and just starting off in the Gulf Coast and Florida State Leagues, Bautista was already in the Majors.<br /><br />In spite of their different pedigrees, Jeff Manship and Denny Bautista ended up as very similar pitchers: failed starters, journeyman relievers, shuttling up and down between cities like Minneapolis and Denver and Kansas City and cities like Rochester and Colorado Springs and Omaha. They are so similar, in fact, that Denny Bautista is the only other pitcher in Major League history to keep succeeding in precisely the same sub-par way that Manship has.<br /><br />There have been other pitchers who kept putting up ERAs above 5.00 and kept getting shots in the Majors. A handful of them, including future Cy Young winner R.A. Dickey in his pre-knuckleball days, have even had ERAs above 5.00 in each of their first six seasons. None of them, other than Manship and Bautista, have kept getting back to the Majors every single year, though. 
There has always been a year or two in between somewhere where they languished in the minors without getting the call.<br /><br />Of course, Dickey aside, there isn't much hope for success for these kinds of pitchers. Kevin Jarvis somehow managed to stick around another six seasons and pitch past his 37th birthday after putting up 5+ ERAs in each of his first six years, but he was just as ineffective in those final six years as in the first six (140 ERA-/124 FIP- in his first six seasons, 135 ERA-/125 FIP- in his final six). Everyone else disappeared pretty quickly.<br /><br />As for Bautista, he did get a seventh year. It was actually his best, at least by ERA. He finally broke the 5.00 barrier and posted a 3.27 ERA in 2010, good for right about average for a reliever. However, in a rather cruel statement about age and the ticking clock on failed prospects, this of all years was finally the year that failed to earn him another shot in the Majors. The following June, he was released from Seattle's system and wound up pitching in Korea. He's still around--he pitched in the Mexican league last year--but he hasn't been back in affiliated pro ball since.<br /><br />It's not like you need any careful analysis to know that the outlook is not good for Manship's career, though. I mean, he's a guy who has thrown 139.1 innings over the past six years with a 6.46 ERA and just got released by his third team in three years. It's interesting, though, don't you think? That he keeps finding his way back, year after year? That even when you split his career ERA into 20-30 inning chunks (or 3.1 inning chunks, as was the case in 2011), they all still come in over 5.00? Even the progression is interesting: his 6.65 ERA was actually the third straight season his ERA <i>dropped</i> from the year before (Manship's ERAs from 2011-2014: 8.10, 7.89, 7.04, 6.65). And he could go right on dropping the ERA again and again for years to come, and still not be any good. 
That's amazing, in its own way.<br /><br />If he ends up above 5.00 again this year, he would be the first pitcher ever to pitch in seven different MLB seasons and post an ERA that high in every one. Here's something else interesting, though: Steamer actually projects him for a 4.38 ERA this year. That's...that's less than 5.00! By a pretty fair amount!<br /><br />When you think about it, the projection actually makes sense. Even with ERAs consistently north of replacement level, teams have to be projecting him for something below 5.00, or they wouldn't bother calling him up. And his fielding-independent numbers are actually...well, they're not good, but they're a lot better than his ERAs. So there is a pretty good reason to believe he can break the Bautista-barrier if he finds his way back to the Majors this year.<br /><br />Even so, every year that passes, Manship's career is on thinner and thinner ice. It has to be--look what happened to Bautista, whose 3.27 ERA in year seven couldn't even save his career. In all likelihood, <i>if</i> he gets another shot, he probably will have the best ERA of his career, but there is a very real chance that it would still be his last year anyway. Heck, there is a very real chance we've seen the last of Jeff Manship in the Majors already. That is, of course, unless he starts working on his knuckleball. <br><br><i>Edit: Manship did make it back to the Majors in 2015, and posted an 0.92 ERA in 39.1 IP with Cleveland. Well done, Jeff!</i> </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-88795343979846806622015-02-10T11:46:00.000-07:002015-02-10T11:58:49.969-07:00(Possibly) the First Baseball Article I Ever PublishedI was digging through some stuff the other day and came across an old newspaper from college that had what might be the first baseball article I ever published. 
It's not really anything in-depth or analytical--just a short opinion piece on the Bagwell contract situation that was in the news at the time. I think my writing style has definitely evolved since then, but it was interesting to see something I wrote so early in my development. Anyway, here's the article: <br /><br /><a href="https://drive.google.com/file/d/0B6kgiU_VMWjUMVVmSzlkQmJjRkk/view?usp=sharing">Bagwell article.PDF</a><br /><br /> <span class="fullpost"><iframe src="https://drive.google.com/file/d/0B6kgiU_VMWjUMVVmSzlkQmJjRkk/preview" width="570" height="500"></iframe></span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-20822633669600113642013-11-01T02:49:00.001-07:002013-11-01T02:51:09.148-07:00Effects of Playing the Sun Field on OF Putouts per BIPI recently looked into how playing the sun field affects an outfielder's defensive performance. I was inspired by <a href="http://espn.go.com/blog/sweetspot/post/_/id/6537/could-berkman-and-holliday-switch-off">Craig Wright's discovery</a> that Babe Ruth regularly switched between left and right field throughout his career to avoid playing the sun field, as I wanted to know what kind of effect this knowledge would have on his defensive value. <br><br>As far as I can tell, there isn't much of an effect. You can stop reading here if you care whether reading material is interesting, but I'll detail my methodology below so that those interested can know what I mean when I say I didn't find an effect. <br><br><span class="fullpost">METHODOLOGY <br><br>First, I compiled as best I could a list of sun fields for all open air stadiums in the Retrosheet era (1950-2012 for my current database). 
Sun fields were estimated based on diagrams and images from the <a href="http://www.seamheads.com/ballparks/index.php">Seamheads ballpark database</a>, <a href="http://ballparks.com/baseball/index.htm">Ballparks.com</a>, and <a href="http://andrewclem.com/Baseball/Stadium_lists.html">AndrewClem.com</a>. I was able to corroborate or correct some parks by looking around the web for written mentions of sun fields or photographs showing shadows during a game. <br><br>This is actually trickier than it sounds--you can get a decent idea of where the sun should set from maps or diagrams that include stadium orientation, but the sun's position also depends on the stadium's latitude and changes based on the time of day and time of year (which also means you can get conflicting results from photographs depending on when they were taken--see the <a href="http://baseballstadiumreviews.com/Big%20Pictures/Major%20League%20Parks/Busch%20Stadium/Busch2BIG.jpg">shadows pointing to CF</a> vs the <a href="http://www.ballparksofbaseball.com/nl/pictures/2013/busch_main.jpg">shadows pointing to RF</a> in Busch Stadium). Still, I did the best I could to identify a primary sun field, and while I doubt I came up with a perfect list, it should be good enough to detect an effect if there is one. <br><br>Once I had a list of sun fields for each stadium, I looked at putouts per ball in play for corner outfielders in each stadium. I then divided these into day and night games, so that I had average number of putouts per ball in play for left and right fielders in day and night games for each stadium. Using these figures, I checked the difference in PO/BIP between day and night games. Parks with roofs (retractable or not) and parks with CF sun fields were ignored. <br><br>From there, I checked how much the average PO/BIP went up or down for fielders playing the sun field. If playing the sun field impairs the fielder, then they should see a drop in performance from night to day games. 
However, it is also possible that playing the outfield is generally easier or harder in day games, so I also checked the change in putout rate for the opposite corner outfield position to use as a control group. Rather than compare the sun field's day game performance to its night game performance, I compared the change from night to day for the sun field to the change for the non-sun-field. <br><br>For example, at Busch Stadium in 2012, left fielders recorded putouts on 6.13% of balls in play during night games, and 5.94% during day games. Right fielders were 6.62% for night games and 7.49% for day games. That means that left fielders dropped their PO/BIP by .0019 in day games, while right fielders raised their PO/BIP by .0087. I have right field as Busch's primary sun field, so the sun field was associated with a gain of .0106 putouts per ball in play over the control group in day games. <br><br>Doing this for every season in every stadium included in the study, I got an average* gain of 0.0002 in PO/BIP for the sun field over the non-sun-field, which is practically zero and very slightly in the wrong direction to indicate an effect. <br><br><i>*the average was a weighted mean, with the weight given to each stadium-season being the harmonic mean of day BIP and night BIP. For example, 2012 Busch Stadium had 2872 night BIP and 1549 day BIP, which is a harmonic mean of 2012.5.</i><br><br>Since I was concerned that poor data on which field was the sun field may have masked any potential effect, or that only some parks might have a bad sun field, I checked to see if individual stadiums displayed any effect. If that were the case, it should still show up in the overall data as a diminished but still visible effect, but it was worth checking. Individual stadiums did vary from zero effect, but not any more than they would by random chance. 
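<br><br>As a quick check on the Busch Stadium example and the weighting described in the footnote, the arithmetic can be sketched in R. The figures are the ones quoted above; the variable names are mine, chosen for illustration, and this is not the code used for the study itself.

```r
# 2012 Busch Stadium putouts per ball in play, as quoted in the example
lf_night <- .0613
lf_day   <- .0594   # left field (the control group here)
rf_night <- .0662
rf_day   <- .0749   # right field (taken as the primary sun field)

sun_change     <- rf_day - rf_night        # night-to-day change, sun field: +.0087
control_change <- lf_day - lf_night        # night-to-day change, control: -.0019
effect <- sun_change - control_change      # sun field gain over control: .0106

# weight for this stadium-season: harmonic mean of night and day BIP
night_bip <- 2872
day_bip   <- 1549
weight <- 2 / (1 / night_bip + 1 / day_bip)  # about 2012.5

# Given one effect and one weight per stadium-season, the overall figure
# would then be weighted.mean(effects, weights), with `effects` and
# `weights` being hypothetical vectors built the same way as above.
```

The harmonic mean keeps a stadium-season with very few day games (or very few night games) from counting as heavily as its total number of balls in play would suggest. <br><br>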
When splitting stadium-seasons into even and odd numbered years, there was no correlation between the observed effect for a stadium in even years versus the same stadium in odd years. <br><br>Finally, I checked the same thing for individual fielders, to see if there was any evidence that particular fielders had notable trouble with the sun field that would show up in PO/BIP. The result was the same as the test for individual parks--fielders varied from zero effect but no more than expected by chance, and the even-odd season correlation for fielders was 0. <br><br>This does not necessarily mean that the sun does not affect fielders--I assume that when the ball is actually in the sun, it adds a great deal of difficulty. It is likely that this does not happen often enough to significantly alter a fielder's defensive numbers, though. At the very least, it appears that finding an effect would require much more precise data. For example, you could probably find something by using the sun's position at the time of the play and the trajectory of the batted ball to identify specific plays that are likely affected. Even if this data were available, however, it would be impossible to use it to evaluate Ruth specifically, and the overall effect I saw indicates that there is likely no need to adjust his defense valuation down simply because he rarely played the sun field. </span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-57665789138989568552013-01-24T16:11:00.001-07:002013-01-24T16:11:37.157-07:00Did Adrian Peterson really Outgain Eric Dickerson?A couple years ago, I wrote about <a href="http://www.3-dbaseball.net/2010/09/rounding-errors-part-ii-yardage-gains.html">how rounding errors affect yardage gains</a> in football. 
The general rule was that, assuming the rounding error on each play is independent, the total rounding error follows a normal distribution with parameters mean = 0 and SD = sqrt(number of plays/12).<br><br>I began thinking about this again for two reasons. One, Adrian Peterson just came within 9 yards of Eric Dickerson's season rushing record. With 348 rushes for Peterson and 379 for Dickerson, that comes out to a standard deviation for the combined rounding errors of 7.8 yards, and about a 12% chance that the 9 yard difference is entirely due to rounding errors.<span class="fullpost"><br><br>The other reason is that Brian Burke pointed out in the comments of the original article that the rounding errors of plays in the NFL are not independent. The total yardage gain for each drive has to round off to the correct figure. From Brian's comment:<br></span><br><blockquote class="tr_bq"><span class="fullpost">"One other way to state this is that if a team has 2 plays in a row, and one goes for 4.5 yards but is scored as 4, and the next goes for 5.5 yds, it can't be scored as 5. 
It must be scored as a 6 yd gain because the ball is very clearly 10 yds further down field, not 9."</span></blockquote><span class="fullpost"><br>I wanted to try to account for this constraint and see how much difference it would make.<br><i><br>Note: the following is mostly dry and math-related, so if you want to skip it, I estimate the chance of rounding errors covering the 9 yard difference between Dickerson and Peterson at about 14%.</i><br></span><br><a href="http://www.3-dbaseball.net/2013/01/did-adrian-peterson-really-outgain-eric.html#more">Read more »</a>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-28380015387169824202013-01-15T03:10:00.001-07:002013-01-15T03:10:31.919-07:00THE EMPTY SET: Reflecting on Cooperstown’s Lost YearA sea of people stretched across the field and masked the green grass with Cardinal red. There was Bob Feller mingling across the fence beside the stage. There was Frank Robinson. There was Stan Musial. Somewhere, on our side of the fence, was Tug McGraw.<br><br>We were all there for Ozzie. There were a few scattered Phillie fans there for Harry Kalas, that year’s Frick Award recipient, if you looked carefully for the different insignias on their caps. Every here and there you'd see a maroon Mike Schmidt throwback. Other than that, it was just thousands of red-clad fans fixated on the wizard of a shortstop standing at the podium before us.<br><br>"This is awesome." It was the first my dad, uncle, brother, and I had seen of Induction Weekend. "We've got to come back in five years."<br><br>Five years is, of course, the waiting period for retired players before they become eligible for the Hall of Fame. Three of my generation's great players had just retired. And one was another beloved Cardinal.<br><span class="fullpost"><br>**********<br><br>The BBWAA announced the results of their Hall of Fame balloting last Wednesday. No one got in. 
Barry Bonds didn't get in. Roger Clemens didn't get in. Not Biggio, not Bagwell. Not Jack Morris. Not Piazza, Trammell, Raines, Schilling, Martinez, Walker (either one), or Lofton. Not McGwire or Sosa or Palmeiro. Not even Shawn Green.<br><br>Someone will get in. In 1996, the last year no one met the 75% threshold, there were six players on the ballot (Niekro, Perez, Sutton, Santo, Rice, and Sutter) who would get in eventually. That's how it always is; every ballot has several candidates who will get in someday.<br><br>Biggio will get in. Every player who has ever gotten Biggio's level of support early in his candidacy has had no trouble getting elected sooner rather than later. Bagwell is at that high early level of support where almost everyone gets in eventually. Piazza even more so.<br><br>Jack Morris will probably get in as a Veterans Committee selection someday. Schilling will probably get in someday. Eventually, as the electorate gets a bit younger, Tim Raines will probably find the remaining votes he needs to get in, barring a complete disaster with the current and upcoming logjam that might never clear up before he falls off the ballot.<br><br>Maybe they won't all get in. But some of them will, and maybe some of the others as well. Trammell is the type of guy who could finally get his due when the Hall puts together a VC for his era. Edgar Martinez could pick up some support as the voters begin to accept that the DH is now part of the game. The voters, or the Hall, might someday come around on Bonds and Clemens.<br><br>Someone is going to get in. Definitely Biggio. Very likely Jack Morris. They're just going to have to wait. 
So too will Cooperstown, which swells up with tens of thousands of tourists (and their wallets) every July except this one.<br></span><br><a href="http://www.3-dbaseball.net/2013/01/the-empty-set-reflecting-on.html#more">Read more »</a>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-63687764180271509752012-12-16T13:57:00.000-07:002013-01-21T14:21:57.644-07:00On Miguel Cabrera, Value, and the Triple Crown<i>“In ’67, the triple crown was never even mentioned once. We were so involved in the pennant race, I didn’t know I won the triple crown until the next day, when I read it in the paper.”</i><br><div style="text-align: right;">-Carl Yastrzemski to the Boston Herald, published September 26, 2012</div><i><br></i><i>“Is it too early to say that [Cabrera] has a legitimate shot at a Triple Crown this season hitting in front of Fielder? I don't think so.”</i><br><div style="text-align: right;">-<a href="http://www.foxnews.com/sports/2012/04/13/rounding-third-tigers-better-than-advertised/#ixzz2EgHJwcfv">Fox News sports article</a>, published April 13, 2012 </div><i><br></i><br>The Triple Crown has grown in stature over the years. That’s not to say it wasn’t a big deal before, but reporters now are asking Carl Yastrzemski about someone else winning it faster than they ever asked him about winning it himself. In 1942, when Ted Williams won it, no one even had a list of previous winners compiled. An AP reporter had to research it for <a href="http://news.google.com/newspapers?id=gPE0AAAAIBAJ&sjid=zWkFAAAAIBAJ&pg=1109,6342468">his story on Williams’ feat</a>, and he still missed the most recent occurrence (Joe Medwick, whose Triple Crown just five years earlier escaped detection).<br><br>Back then, it was a cool thing. It wasn’t necessarily the historic thing it’s become. 
It didn’t yet carry the mythical ethos of the pantheon-dwellers -- Williams, Mantle, Yaz, Frank Robinson, etc -- who could once do what for so long escaped their modern counterparts. When someone won it, it didn’t carry the weight of a whole generation of fans who grew up hearing about it and never seeing it. It was just a cool thing.<br><br><span class="fullpost">I can see getting excited about it. It’s an impressive feat. It’s something we’ve waited for for a long time. It's something only a handful of the greats have even done. <br><br>And yet, I have a hard time getting excited. It was a great season, sure. A wonderful season at the plate. But the best season I’ve ever seen? Not close. Which means I’ve seen a lot of non-Triple-Crown seasons that were better, because this is the first Triple Crown of my lifetime. You don’t even have to look that hard to find a better season. There’s another one right in front of our noses.<br><br>I’m talking, of course, about Miguel Cabrera’s 2011 season.<br><br>I know that seems, at least on the surface, like a bit of a contrarian statement. How could he have been better when he hit 14 fewer home runs and drove in 36 fewer runs and didn’t, I don’t know, win the first Triple Crown in four and a half decades? I don’t mean it as a contrarian viewpoint, though. I just think Cabrera hit better in 2011 than in 2012.<br><br>Let me explain myself. First, we need to establish what we mean by “better”. <br><br>I grew up with a fairly traditional baseball upbringing. I was the son of a catcher who was the son of a catcher, saved only from the tools of ignorance myself by a bad case of sinistrality (a condition my dad only fully forgave me for when my younger sister took up softball and inherited his old gear). I learned the game from proud field generals who would rather hold their ground to a hard-charging runner than hit a home run, even if they dropped the ball in the process.<br><br>That’s not a bad way to learn the game. 
It was a great way to learn it. But part of that upbringing was growing up thinking that Rickey Henderson was Lou Brock-Lite, and that Ted Sizemore was the ideal #2 hitter, and that Tony Gwynn was the best hitter in the game. Part of that was drafting Ozzie Smith for my first fantasy league in a three-team-deep league.<br><br>It’s not that those things are necessarily wrong. I don’t remember or care what happened in that fantasy league, other than that I remember drafting my favourite player. I don’t remember or care how many runs the Padres scored with Tony Gwynn anchoring their lineup, or how many games they won. I remember that watching Tony Gwynn was unlike watching anyone else in baseball, because you felt like you knew you were going to see something happen. He was going to put the ball in play, and the defense was going to scramble to field it. When Tony won, it felt like he won because he could almost place the ball at the spot where it landed. When the defense won, it felt like they got away with one. It was exciting to someone who learned the game the way I did.<br><br>As far as baseball is a game of entertainment, maybe Tony Gwynn was the best hitter in the game. Arguing for Tony Gwynn over Frank Thomas, or Barry Bonds, or Fred McGriff, or a handful of other guys as a hitter, though, isn’t really an argument of value or production. It’s an argument of what “best” means to begin with. He was better at some things, yeah. Maybe better at the things that are most important to you. At some point, though, it started to hit me that, whatever abstract ideals I might hold about what a hitter should be, the very concrete objective of all hitters is the same. They hit as best they can to win games, and they do so by helping to score runs.<br><br>That’s something that’s hard to measure when your statistical upbringing comes mostly from Topps and Donruss. How many runs is Gwynn’s AVG worth? How many runs are Thomas’ walks and extra base hits worth? I don’t know. 
It doesn’t say on the back of the card. We all know when we watch a game that getting on base is important, that making outs is bad, and that getting to second or third is better than getting to first. How much better? I don’t know. And so the argument becomes about what best actually means, because the units of measurement are not helpful.<br></span><br><a href="http://www.3-dbaseball.net/2012/12/on-miguel-cabrera-value-and-triple-crown.html#more">Read more »</a>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com1tag:blogger.com,1999:blog-2868194292414002063.post-88680544371167625012012-06-19T09:39:00.000-07:002012-06-19T09:42:53.032-07:00Clutch, WPA/LI, and the Home Run BiasThe stat Clutch, as published on FanGraphs and Baseball-Reference, is designed to quantify how much better or worse a hitter has produced in situations based on how critical those situations are in the immediate context of a game. Players who perform better in more critical situations (for example, late in a close game) than they normally do will have a positive Clutch rating, and players who perform worse in such situations will have a negative Clutch. It does this by comparing two values for a hitter: his WPA and his WPA.LI. I will assume you are familiar enough with these two stats (not necessarily their inner workings, but at least what they are) as a prerequisite for this piece; if not, you can catch up on <a href="http://www.baseball-reference.com/about/wpa.shtml">B-R's</a> or <a href="http://www.fangraphs.com/library/index.php/misc/wpa-li/">FanGraphs</a>' explanation pages. <br><br> <span class="fullpost">WPA.LI follows two key constraints. The first is that, for a given game state (i.e. the inning, the score, the number of outs, and the placement of any runners on base), the relative value of a play is determined by how much that play affects the team's chances of winning. If the bases are empty, a walk is credited the same as a single. 
If the bases are loaded with the winning run on third, a walk is credited the same as a home run. This constraint works exactly like WPA (as one might expect from a WPA-based metric).<br><br>The second constraint differentiates WPA.LI from WPA. One of the properties of WPA is that some situations are inherently weighted more strongly than others. A key at bat late in a close game can swing a team's chances of winning by several times as much as the same result in a blowout, and it is credited accordingly. WPA.LI, on the other hand, ensures that the average play in every situation gets the same weight.<br><br>So, on the one hand, you have WPA, which weights PAs according to their immediate impact on the game. One clutch PA might be worth as much as 4 or 5 normal PAs, and one mop-up PA might be worth practically nothing. On the other hand, you have WPA.LI, which weights every PA equally, just like most other stats do. Basically, it is linear weights, but with the ability to tailor the value of each event to the specific situation rather than sticking to a blanket value for each event across all situations. While WPA tells the story of clutch hitting (who got the big hit when the team most needed production), WPA.LI tells the story of situational hitting (who got on base when the team needed baserunners, put the ball in play when the strikeout was most costly, or hit for power when advancing runners quickly was more important than getting another guy on first).<br><br>There is a third important constraint which WPA.LI does not adhere to, however. Ideally, the average value of each event would match its linear weights value. If a home run is worth 1.4 runs above average across all situations, then you would like the average WPA.LI value of a HR to be 1.4 runs (or rather, the equivalent value on the wins scale). 
That is not the case, however.<br><br>The following linear weights values represent the average change in run and win expectancy for that event across all situations, along with the average WPA.LI value of each event. All three versions have been placed on the runs scale by setting the value of the out at -.27 in order to make them easier to compare directly:<br><br> <style><!--table {mso-displayed-decimal-separator:"\."; mso-displayed-thousand-separator:"\,";} @page {margin:1.0in .75in 1.0in .75in; mso-header-margin:.5in; mso-footer-margin:.5in;} td {padding-top:1px; padding-right:1px; padding-left:1px; mso-ignore:padding; color:black; font-size:12.0pt; font-weight:400; font-style:normal; text-decoration:none; font-family:Calibri, sans-serif; mso-font-charset:0; mso-number-format:General; text-align:general; vertical-align:bottom; border:none; mso-background-source:auto; mso-pattern:auto; mso-protection:locked visible; white-space:nowrap; mso-rotate:0;} .xl63 {mso-number-format:Fixed;} --></style> <br><table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse; width: 260px;"> <colgroup><col span="4" style="width: 65pt;" width="65"> </colgroup><tbody><tr height="15" style="height: 15.0pt;"> <td height="15" style="height: 15pt; text-align: right; width: 65pt;" width="65"><br></td> <td style="text-align: right; width: 65pt;" width="65">RE</td> <td style="text-align: right; width: 65pt;" width="65">WPA</td> <td style="text-align: right; width: 65pt;" width="65">WPA.LI</td> </tr><tr height="15" style="height: 15.0pt;"> <td height="15" style="height: 15.0pt;">1B</td> <td align="right" class="xl63">0.47</td> <td align="right" class="xl63">0.47</td> <td align="right" class="xl63">0.44</td> </tr><tr height="15" style="height: 15.0pt;"> <td height="15" style="height: 15.0pt;">2B</td> <td align="right" class="xl63">0.77</td> <td align="right" class="xl63">0.75</td> <td align="right" class="xl63">0.75</td> </tr><tr height="15" style="height: 15.0pt;"> 
<td height="15" style="height: 15.0pt;">3B</td> <td align="right" class="xl63">1.05</td> <td align="right" class="xl63">1.06</td> <td align="right" class="xl63">1.04</td> </tr><tr height="15" style="height: 15.0pt;"> <td height="15" style="height: 15.0pt;">HR</td> <td align="right" class="xl63">1.41</td> <td align="right" class="xl63">1.42</td> <td align="right" class="xl63">1.58</td> </tr><tr height="15" style="height: 15.0pt;"> <td height="15" style="height: 15.0pt;">BB</td> <td align="right" class="xl63">0.31</td> <td align="right" class="xl63">0.30</td> <td align="right" class="xl63">0.31</td> </tr><tr height="15" style="height: 15.0pt;"> <td height="15" style="height: 15.0pt;">K</td> <td align="right" class="xl63">-0.29</td> <td align="right" class="xl63">-0.30</td> <td align="right" class="xl63">-0.29</td> </tr><tr height="15" style="height: 15.0pt;"> <td height="15" style="height: 15.0pt;">out</td> <td align="right" class="xl63">-0.27</td> <td align="right" class="xl63">-0.27</td> <td align="right" class="xl63">-0.27</td> </tr></tbody></table><br>As you can see, WPA.LI does fine at assigning the correct value to most events, but the value of the HR is way off. This may seem counterintuitive; if WPA.LI just creates custom linear weights for each situation based on the WPA values, why would the average WPA.LI value be different from the average WPA value? We can look at the mathematical relationship between WPA and WPA.LI to see why this is.<br></span><a href="http://www.3-dbaseball.net/2012/06/clutch-wpali-and-home-run-bias.html#more">Read more »</a>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com4tag:blogger.com,1999:blog-2868194292414002063.post-55196623791823659222012-06-09T03:20:00.001-07:002012-06-09T11:07:15.859-07:00The Pujols Decision: One Fan's ReflectionsStan Musial is the man in St. Louis. Nearly 50 years after Musial last played for the Cardinals, he remains the undisputed king of Cardinal baseball. 
His statue alone stands tall outside the main entrance to Busch Stadium, a few hundred feet south of the plaza where all the lesser (albeit much more attractive) statues of other Cardinal greats sit. For decades, no one in St. Louis thought they would ever see a player rival Musial.<br><br>And then Albert came along. Just one year and one Bobby Bonilla injury removed from his professional debut as a 13th round draft pick, Pujols was in the starting lineup and lighting up the National League. He hit for average. He hit for power. He got on base. He eventually learned to play a very good first base. For the first time, St. Louis fans saw a player and thought, "this could be the guy who tops Musial."<br><br><span class="fullpost">The accolades came. The MVPs (three of them, same as Musial), the All Star appearances, the Silver Sluggers and Gold Gloves, the home runs and hits and RBIs; all of them flocked to Pujols' Baseball-Reference page like moths to Matt Holliday's ear.<br><br>The wins followed. Led by Pujols' success, the team made the playoffs 7 out of 11 seasons, winning 3 pennants and 2 World Series along the way. From 2001-2011, only the high-spending Yankees and Red Sox won more games than did Pujols' Cardinals. Pujols was the best player in the game, a superstar of whose order the franchise had not seen in decades. Fans watched in awe and wondered how high his career would stack by the time it ended.<br><br>Pujols was, over his 11 years with St. Louis, remarkably similar to Musial when Musial was at his best. Compare Pujols’ career in St. 
Louis to Musial’s best 11 year stretch (1943-54):<br><br><br><table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="border-collapse: collapse; border: none; margin-left: 5.4pt; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-table-layout-alt: fixed; mso-yfti-tbllook: 1696;"><tbody><tr style="height: 12.15pt; mso-yfti-firstrow: yes; mso-yfti-irow: 0;"> <td style="border: solid windowtext 1.0pt; height: 12.15pt; mso-border-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 104.35pt;" valign="top" width="104"><br></td> <td style="border: 1pt solid windowtext; height: 12.15pt; padding: 0in 5.4pt; text-align: center; width: 40.55pt;" valign="top" width="41">PA </td> <td style="border: 1pt solid windowtext; height: 12.15pt; padding: 0in 5.4pt; text-align: center; width: 45pt;" valign="top" width="45">H </td> <td style="border: 1pt solid windowtext; height: 12.15pt; padding: 0in 5.4pt; text-align: center; width: 39.75pt;" valign="top" width="40">RBI </td> <td style="border: 1pt solid windowtext; height: 12.15pt; padding: 0in 5.4pt; text-align: center; width: 45.75pt;" valign="top" width="46">HR </td> <td style="border: 1pt solid windowtext; height: 12.15pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">BB </td> <td style="border: 1pt solid windowtext; height: 12.15pt; padding: 0in 5.4pt; text-align: center; width: 49.5pt;" valign="top" width="50">R </td> </tr><tr style="height: 16.15pt; mso-yfti-irow: 1;"> <td style="border-top: none; border: solid windowtext 1.0pt; height: 16.15pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 104.35pt;" valign="top" width="104">Musial (1943-54) </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 16.15pt; padding: 0in 5.4pt; text-align: 
center; width: 40.55pt;" valign="top" width="41">7564 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 16.15pt; padding: 0in 5.4pt; text-align: center; width: 45pt;" valign="top" width="45">2251 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 16.15pt; padding: 0in 5.4pt; text-align: center; width: 39.75pt;" valign="top" width="40">1174 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 16.15pt; padding: 0in 5.4pt; text-align: center; width: 45.75pt;" valign="top" width="46">281 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 16.15pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">990 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 16.15pt; padding: 0in 5.4pt; text-align: center; width: 49.5pt;" valign="top" width="50">1301 </td> </tr><tr style="height: 12.85pt; mso-yfti-irow: 2; mso-yfti-lastrow: yes;"> <td style="border-top: none; border: solid windowtext 1.0pt; height: 12.85pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 104.35pt;" valign="top" width="104">Pujols (2001-11) </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 12.85pt; padding: 0in 5.4pt; text-align: center; width: 40.55pt;" valign="top" 
width="41">7433 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 12.85pt; padding: 0in 5.4pt; text-align: center; width: 45pt;" valign="top" width="45">2073 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 12.85pt; padding: 0in 5.4pt; text-align: center; width: 39.75pt;" valign="top" width="40">1329 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 12.85pt; padding: 0in 5.4pt; text-align: center; width: 45.75pt;" valign="top" width="46">445 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 12.85pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">975 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 12.85pt; padding: 0in 5.4pt; text-align: center; width: 49.5pt;" valign="top" width="50">1291 </td> </tr></tbody></table><br><br><br><table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="border-collapse: collapse; border: none; margin-left: 5.4pt; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-table-layout-alt: fixed; mso-yfti-tbllook: 1696;"><tbody><tr style="height: 12.15pt; mso-yfti-firstrow: yes; mso-yfti-irow: 0;"> <td style="border: solid windowtext 1.0pt; height: 12.15pt; mso-border-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 104.35pt;" valign="top" width="104"><br></td> <td style="border: 1pt solid windowtext; height: 5.55pt; padding: 
0in 5.4pt; text-align: center;" top"="" width="41">AVG</td> <td style="border: 1pt solid windowtext; height: 5.55pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">OBP </td> <td style="border: 1pt solid windowtext; height: 5.55pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">SLG </td> <td style="border: 1pt solid windowtext; height: 5.55pt; padding: 0in 5.4pt; text-align: center; width: 40.95pt;" valign="top" width="41">wRC+ </td> <td style="border: 1pt solid windowtext; height: 5.55pt; padding: 0in 5.4pt; text-align: center; width: 49.05pt;" valign="top" width="49">brWAR </td> <td style="border: 1pt solid windowtext; height: 5.55pt; padding: 0in 5.4pt; text-align: center; width: 49.5pt;" valign="top" width="50">fWAR </td> </tr><tr style="height: 17.4pt; mso-yfti-irow: 1;"> <td style="border-top: none; border: solid windowtext 1.0pt; height: 17.4pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 1.45in;" valign="top" width="104">Musial (1943-54) </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">.346 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">.434 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">.591 </td> <td style="border-color: -moz-use-text-color windowtext windowtext 
-moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.95pt;" valign="top" width="41">171 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 49.05pt;" valign="top" width="49">88 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 49.5pt;" valign="top" width="50">98 </td> </tr><tr style="height: 17.4pt; mso-yfti-irow: 2; mso-yfti-lastrow: yes;"> <td style="border-top: none; border: solid windowtext 1.0pt; height: 17.4pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 1.45in;" valign="top" width="104">Pujols (2001-11) </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">.328 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">.420 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.5pt;" valign="top" width="41">.617 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; 
border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 40.95pt;" valign="top" width="41">167 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 49.05pt;" valign="top" width="49">84 </td> <td style="border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-style: none solid solid none; border-width: medium 1pt 1pt medium; height: 17.4pt; padding: 0in 5.4pt; text-align: center; width: 49.5pt;" valign="top" width="50">88 </td> </tr></tbody></table><br><br>In both traditional counting totals and more sabermetric evaluations, the two come up as near equals. Musial got on base a bit better (in an environment where hitters got on base more than they do today) while Pujols hit for more power (in an environment where hitters hit for more power than they did in Musial’s day). The two were comparable fielders, good for their position, but at the weak end of the fielding spectrum. <br><br>Musial rates slightly better in both Baseball-Reference’s and FanGraphs’ implementations of WAR, but they are close enough that which one you would pick will largely depend on how you approach the different eras (i.e. how you want to adjust for things like integration, expansion, population growth, international development, improved scouting, the war years, etc). They’re close enough that it would be reasonable to take the position that no Cardinal fan has ever seen one of their own play at a higher level than Pujols has over his 11 years with the team, not even Musial. It’s not a slam-dunk position; maybe you still take Musial. But, for the first time since Musial retired, you’d probably at least have to think about it.<br><br>Watching Pujols play ignited Cardinal fans like watching Musial did, and we loved every minute of it. 
Naturally, we wanted that to continue. We wanted another all-time great to stay a career Cardinal. Then, out of nowhere, the report swept in from the winter meetings that Pujols had signed with the Angels. No build up, nothing. No one had even talked about the Angels in the weeks of negotiating that led off the offseason. Just like that, he was gone.<br><br><br><br></span><a href="http://www.3-dbaseball.net/2012/06/pujols-decision-one-fans-reflections.html#more">Read more »</a>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-22292864360124880912012-03-15T02:42:00.003-07:002012-03-15T14:18:26.560-07:00Win Expectancy and Leverage Index tables, R CodeThis post is just a quick dump of some code you can use to create win-expectancy and leverage index tables like what I used for my recent <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.baseballprospectus.com/article.php?articleid=16212">Baseball PreGUESTus article</a>. It is written for the <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.r-project.org/">free statistical program R</a>, and it builds upon the excellent work on run-expectancy and run distribution tables done by <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.chancesis.com/2011/08/14/run-expectancy-and-markov-chains/">Sobchak at ChancesIs.com</a>.<br /><br /><span class="fullpost">In order to run this code, you will need R with the package plyr installed. You will also need the file bo_transitions.csv from ChancesIs (either the CSV file hosted on that site, or one created using a similar query to the one Sobchak published) and the file game_state_frequency.csv, which you can copy from <a style="font-weight: bold; color: rgb(51, 0, 153);" href="https://docs.google.com/spreadsheet/pub?key=0AqkgiU_VMWjUdHh2akN6dVJXZ25CMHZfZVBwNnlCUGc&output=html">this table</a>. 
Sobchak's data and the game_state_frequency table are from the years 1993-2010. You can collect the data for other years by altering Sobchak's SQL query and <a style="font-weight: bold; color: rgb(51, 0, 153);" href="https://docs.google.com/document/pub?id=1JDy0TeMFYX2EldoDcIbs_c72jFlEwelYCxYYnagCuQ8">this game_state_frequency query</a>.<br /><span style="font-style:italic;"><br />*note-you only need game_state_frequency.csv for calculating LI. You don't need it if all you want is a WE table.</span><br /><br />Once you have those files on your computer, you can construct a win-expectancy table with the following R code:<br /><br /><a style="font-weight: bold; color: rgb(51, 0, 153);" href="https://docs.google.com/document/pub?id=1rRbeFbifAhYnfSKgn_bjx-NzZ38UvQ7t1ACJgInBW7Y">Win Expectancy Table, R code</a><br /><br />You will have to change the line<br /><blockquote>setwd("/Users/Seshoumaru/Desktop/untitled folder/baseball/run-win expectancy")</blockquote><br />to the folder path where you saved the necessary CSV files.<br /><br />The win expectancy values are generated based on Sobchak's simulated run distributions. It is currently set to run 100,000 simulated innings from each state to estimate the distributions. You can raise the number of simulations to increase the precision, but it will take longer to process. On my computer, 100,000 simulations took about 4 minutes to run. 1,000,000 simulations took about an hour. The win expectancies themselves are not simulated, however.<br /><br />The code limits run scoring to 16 runs for the remainder of the inning you are in, plus 16 runs total for the rest of the game. This is done to greatly reduce processing time. The generated tables cover scores from the home team being down 16 to up 16 (all score differentials are from the perspective of the home team).<br /><br />The above code assumes equal run distributions for both teams. 
With a few changes, you can alter the code to include home-field advantage by using separate distributions for the home and away teams. To do this, you will need to alter Sobchak's query to create additional bo_transition files for just the home team and just the away team (called bo_transitions_home.csv and bo_transitions_away.csv). Once you have added those files, you can run the following code:<br /><br /><a style="font-weight: bold; color: rgb(51, 0, 153);" href="https://docs.google.com/document/pub?id=1Rn3pjQj-AgOAIf30_p26c2j2tKRbC15FW_OnqCW7ziw"><br />Win Expectancy Table, HFA version, R code</a><br /><br /><br /></span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com4tag:blogger.com,1999:blog-2868194292414002063.post-90165226823449544952011-08-17T08:00:00.009-07:002011-08-24T12:25:30.978-07:00Regression to the Mean and Beta Distributions<span style="font-style: italic;">This morning, a discussion of regression to the mean popped up on <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://sabermetricresearch.blogspot.com/2011/08/tango-method-of-regression-to-mean-kind.html">Phil's</a> and <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.insidethebook.com/ee/index.php/site/comments/regression_toward_the_mean_proof/">Tango's</a> blogs. This discussion touches upon some of the recent work I've been doing with Beta distributions, so I figured I'd go ahead and lay out the math linking regression to the mean with Bayesian probability with a Beta prior.</span>
<br />
<br /><span class="fullpost">Many of the events we measure in baseball are Bernoulli trials, meaning we simply record whether they happen or not for each opportunity. For example, whether or not a team wins a game, or whether or not a batter gets on base are Bernoulli trials. When we observe these events over a period of time, the results follow a binomial distribution.
<br />
<br />When we observe these binomial events, each team or player has a certain amount of skill in producing successes. Said skill level will vary from team to team or player to player, and, as a result, we will observe different results from different teams or players. Albert Pujols, for example, has a high degree of skill at getting on base compared to the whole population of MLB hitters, and we would expect to observe him getting on base more often than, say, Emilio Bonifacio.
<br />
<br />The variance in talent levels is not the only thing driving the variance in observed results, however. As with any binomial process (excepting those with 0% or 100% probabilities, anyway), there is also random variance as described by the binomial distribution. Even if Albert's on-base skill is roughly 40%, and Bonifacio's is roughly 33%, it is still possible that you will occasionally observe Emilio to have a higher OBP than Albert over a given period of time.
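<br />
<br />To put a rough number on that last point, here is a small Python sketch (the function names are made up for illustration, and the 40% and 33% skills are just the round figures from above). It computes the exact binomial probability that the lower-skill hitter ends up with the higher observed OBP when both get the same number of PAs:

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_lower_skill_has_higher_rate(p_hi, p_lo, n):
    """P(lower-skill hitter reaches base strictly more often), both hitters getting n PAs."""
    pmf_hi = [binom_pmf(k, n, p_hi) for k in range(n + 1)]
    pmf_lo = [binom_pmf(k, n, p_lo) for k in range(n + 1)]
    total = 0.0
    cum_hi = 0.0  # running P(higher-skill hitter reaches base fewer than b times)
    for b in range(n + 1):
        total += pmf_lo[b] * cum_hi
        cum_hi += pmf_hi[b]
    return total


# Illustrative skills only: .400 for the better hitter, .330 for the other.
for n_pa in (100, 500):
    print(n_pa, prob_lower_skill_has_higher_rate(0.40, 0.33, n_pa))
```

The probability shrinks as the number of PAs grows, which is the sense in which a larger sample tells you more about true talent.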
<br />
<br />In baseball, it is a practical problem that we do not know the true probability linked to each team's or player's skill, only their observed rate of success. Thus, if we want to know the true talent probability, we have to estimate it from the observed.
<br />
<br />One way to do this is with regression to the mean. Say that we have a player with a .400 observed OBP over 500 PAs, and we want to estimate his true talent OBP. Regression to the mean says we need to find out how much, on average, our observed sample will reflect the hitter's true talent OBP, and how much it will reflect random binomial variation. Then, that will tell us how many PAs of the league average we need to add to the observed performance to estimate the hitter's true talent.
<br />
<br />For example, say we decide that the number of league average PAs we need to add to regress a 500 PA sample of OBP is 250. We would take the observed performance (200 times on base in 500 PAs), and add 82.5 times on base in 250 PAs (i.e. the league average performance, assuming league average is about .330) to that.
<br />
<br />(200 + 82.5) / (500 + 250) = 282.5 / 750 = .377
<br />
<br />Therefore, regression to the mean would estimate the hitter's true OBP talent at .377.
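<br />
<br />The same arithmetic as a minimal Python sketch. The function name and argument names are my own for illustration; the 250 PA constant and .330 league average are the example figures from above:

```python
def regress_to_mean(successes, trials, league_mean, k):
    """Add k trials of league-average performance to the observed sample."""
    return (successes + league_mean * k) / (trials + k)


# The example above: 200 times on base in 500 PAs, regressed with 250 PAs of .330.
print(round(regress_to_mean(200, 500, 0.330, 250), 3))  # 0.377
```

Only the observed sample changes from hitter to hitter; the value of k stays fixed, so a hitter with 4 times on base in 10 PAs gets pulled almost all the way to .330, while a .400 hitter over 1000 PAs keeps most of his observed rate.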
<br />
<br />As Phil demonstrated, once you decide that you have to add 250 PAs of league average performance to your sample to regress, you would use that same 250 PA figure to regress any OBP performance, regardless of how many PAs are in the observed sample. Whether you have 10 observed PAs or 1000 observed PAs, the amount of average performance you have to add to regress does not change.
<br />
<br />Now, how would one go about finding that 250 PA figure? One way is to figure out the number of PAs at which the random binomial variance is equal to the variance of true talent in the population.
<br />
<br />Start by taking the observed variance in the population. You would look at all hitters over a certain number of PAs (say, 500, for example), and you might observe that the variance in their observed OBPs is about .00132, with the average about .330. The observed variance is equal to the sum of the random binomial variance and the variance of true OBP talent across the population of hitters. We don't know the variance of true talent, but we can calculate the random binomial variance as p(1-p)/n, where p is the probability of getting on base (.330 for our observed population) and n is the observed number of PAs (500 in this case). For this example, that would be about .00044. Therefore, the variance of true talent in the population is approximately:
<br />
<br />.00132 - .00044 = .00088
<br />
<br />Next, we find the number of PAs where the random binomial variance will equal the variance of true talent:
<br />
<br />p*(1-p)/n = true_var
<br />
<br />.330*(1-.330)/n = .00088
<br />
<br />n = .330*(1-.330)/.00088</span><span class="fullpost"> ≈</span><span class="fullpost"> 250
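<br />
<br />Here is a short Python sketch of those two steps, assuming you already have the observed variance, the population mean, and the number of PAs behind each observed OBP (the function name is hypothetical). Because it does not round the intermediate variances, it lands slightly above the hand calculation:

```python
def regression_constant(observed_var, mean, n_obs):
    """Estimate the number of PAs at which binomial variance equals true-talent variance."""
    binom_var = mean * (1 - mean) / n_obs  # random binomial variance at n_obs PAs
    true_var = observed_var - binom_var    # what is left over is spread in true talent
    if true_var <= 0:
        raise ValueError("observed variance is no larger than binomial variance")
    return mean * (1 - mean) / true_var


# Example figures from above: variance .00132 and mean .330 among 500 PA hitters.
print(regression_constant(0.00132, 0.330, 500))  # roughly 250
```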
<br />
<br />
<br />We can also approach the problem of estimating true talent from observed performance using Bayesian probability. In order to use Bayes, we need to make an assumption about the distribution of true talent in the population the hitter is being drawn from (i.e. the prior distribution). We will assume that true talent follows a <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://en.wikipedia.org/wiki/Beta_distribution">Beta distribution</a>.
<br />
<br />Return now to our .400 observed OBP example. Bayes says the posterior distribution (i.e. the distribution of possible true talents for a hitter drawn from the prior distribution after observing his performance) is proportional to the product of the prior distribution and the likelihood function (i.e. the binomial distribution, which gives the likelihood of the observed OBP for each possible true probability).
<br />
<br />The prior Beta distribution is:
<br />
<br />x^(α-1) * (1-x)^(β-1) / B(α,β)
<br />
<br />where B(α,β) is a constant equal to the <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://en.wikipedia.org/wiki/Beta_function">Beta function</a> with parameters α and β.
<br />
<br />The binomial likelihood for observing s successes in n trials (i.e. the observed on-base performance) is:
<br />
<br />[ n! / (s!(n-s)!) ] * x^s * (1-x)^(n-s)
<br />
<br />where x is the true probability of a success.
<br />
<br />Next, we multiply the prior distribution by the likelihood distribution:
<br />
<br />[ x^(α-1) * (1-x)^(β-1) / B(α,β) ] * [ n! / (s!(n-s)!) ] * x^s * (1-x)^(n-s)
<br />
<br />
<br />Combine the exponents for the x and (1-x) factors:
<br />
<br />
<br />[ x^(α + s - 1) * (1-x)^(β + n - s - 1) / B(α,β) ] * [ n! / (s!(n-s)!) ]
<br />
<br />
<br />Separating the constant factors from the variables:
<br />
<br />[ n! / (s!(n-s)! * B(α,β)) ] * x^(α + s - 1) * (1-x)^(β + n - s - 1)
<br />
<br />
<br />This product is proportional to the posterior distribution, so the posterior distribution will be the above multiplied by some constant in order to scale it so that the cumulative probability equals one. Since the left portion of the above expression is already a constant, we can simply absorb that into the scaling constant, and the final posterior distribution then becomes:
<br />
<br />
<br />C * x^(α + s - 1) * (1-x)^(β + n - s - 1)
<br />
<br />
<br />Notice that the above distribution conforms to a new Beta distribution with parameters α+s and β+n-s, and with a constant C = 1/B(α+s,β+n-s). When the prior distribution is a Beta distribution with parameters α and β and the likelihood function is binomial, then the posterior distribution will also be a Beta distribution, and it will have the parameters α+s and β+n-s.
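<br />
<br />A quick numerical sanity check of that result, as a Python sketch using only the standard library. The helper names are my own, and the prior parameters 82.5 and 167.5 are illustrative values chosen to match the regression example. If the posterior really is Beta(α+s, β+n-s), then the log of prior times likelihood should differ from the log of that Beta density by the same constant at every x:

```python
from math import lgamma, log


def log_beta_fn(a, b):
    """log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_beta_pdf(x, a, b):
    """log density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    return (a - 1) * log(x) + (b - 1) * log(1 - x) - log_beta_fn(a, b)


def log_prior_times_likelihood(x, alpha, beta, s, n):
    """log of prior density times the x-dependent part of the binomial likelihood.

    The binomial coefficient is a constant in x, so it is left out.
    """
    return log_beta_pdf(x, alpha, beta) + s * log(x) + (n - s) * log(1 - x)


def update_beta(alpha, beta, s, n):
    """Posterior Beta parameters after observing s successes in n trials."""
    return alpha + s, beta + (n - s)


alpha, beta, s, n = 82.5, 167.5, 200, 500  # illustrative values only
a_post, b_post = update_beta(alpha, beta, s, n)
for x in (0.30, 0.377, 0.45):
    gap = log_prior_times_likelihood(x, alpha, beta, s, n) - log_beta_pdf(x, a_post, b_post)
    print(x, gap)  # the gap should be the same at every x
```

Working in logs keeps the tiny products from underflowing, which is the only reason the sketch does not multiply the densities directly.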
<br />
<br />We still need to choose values for the parameters α and β for the prior distribution. Recall from the regression example that we found a mean of .330 and a variance of .00088 for the true talent in the population (i.e. the prior distribution), so we will choose values for α and β that give us those values. For a Beta distribution, the mean is equal to:
<br />
<br />α/(α+β)
<br />
<br />and the variance is equal to:
<br />
<br />αβ / ( (α+β)^2 * (α+β+1) )
<br />
<br />
<br />A bit of algebra gives us values for α and β of approximately 82.5 and 167.5 respectively. That means the posterior distribution will have as parameters:
<br />
<br />α+s = 82.5 + 200 = 282.5
<br />β+n-s = 167.5 + 500 - 200 = 467.5
<br />
<br />and a mean of
<br />
<br />282.5 / (282.5 + 467.5) = 282.5 / 750 = .377
<br />
<br />As you can see, this is identical to the regression estimate. This will always be the case as long as the prior distribution is Beta and the likelihood is binomial. We can see why if we derive the regression constant (the number of PAs of league average we need to add to the observed performance in order to regress) from the prior distribution.
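<br />
<br />The "bit of algebra" for α and β can be sketched in Python as follows. Solving the mean and variance formulas together gives α+β = mean*(1-mean)/variance - 1, and the rest follows from the mean. The function names are hypothetical, and the inputs are the example values from above:

```python
def beta_params_from_moments(mean, var):
    """Solve for (alpha, beta) of a Beta distribution with the given mean and variance."""
    total = mean * (1 - mean) / var - 1  # this is alpha + beta
    return mean * total, (1 - mean) * total


def posterior_mean(alpha, beta, s, n):
    """Mean of the Beta(alpha + s, beta + n - s) posterior."""
    return (alpha + s) / (alpha + beta + n)


alpha, beta = beta_params_from_moments(0.330, 0.00088)
print(alpha, beta)                                   # approximately 82.5 and 167.5
print(round(posterior_mean(alpha, beta, 200, 500), 3))  # 0.377
```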
<br />
<br />Recall that the regression constant can be found by finding the point where random binomial variance equals prior distribution variance. Therefore:
<br />
<br />p(1-p)/k ≈ prior variance
<br />
<br />where k is the regression constant and p is the population mean.
<br />
<br />p(1-p)/k ≈ αβ / ( (α + β)^2(α + β + 1) ) ; p ≈ α/(α+β)
<br />
<br />α/(α+β) * ( 1 - α/(α+β) ) / k <span style="color: rgb(255, 255, 255);">.</span> ≈ αβ / ( (α + β)^2 * (α + β + 1) )
<br />α/(α+β) - α^2/(α+β)^2<span style="color: rgb(255, 255, 255);">.........</span>≈ k * αβ / ( (α + β)^2 * (α + β + 1) )
<br />(α(α+β) - α^2)/(α+β)^2<span style="color: rgb(255, 255, 255);"> ......</span> ≈ k * αβ / ( (α + β)^2 * (α + β + 1) )
<br />(α(α+β) -α^2)<span style="color: rgb(255, 255, 255);">.......................</span> ≈ k * αβ / (α + β + 1)
<br />(α^2 + αβ - α^2) <span style="color: rgb(255, 255, 255);">..................</span>≈ k * αβ / (α + β + 1)
<br />αβ<span style="color: rgb(255, 255, 255);">.......................................... </span>≈ k * αβ / (α + β + 1)
<br />1 <span style="color: rgb(255, 255, 255);">.............................................</span>≈ k / (α + β + 1)
<br />k<span style="color: rgb(255, 255, 255);">....................,,,,,,,,,,,,,,,,........</span> ≈ α + β + 1
<br />
<br />
<br />Since α and β for the prior in our example are 82.5 and 167.5, k would be 82.5 + 167.5 + 1 = 251.
<br />
<br />This estimate of k is actually biased, because it assumes a random binomial variance based only on the population mean, whereas the actual random binomial variance for the prior distribution will be the average binomial variance over the entire distribution. In other words, not all of the population will have a .330 OBP skill; some hitters will have a .300 skill, while others will have a .400 skill, and they will all have different binomial variances associated with them. More precisely, the random binomial variation for the prior distribution will be the following definite integral taken from 0 to 1:
<br />
<br />
<br />⌠ x(1-x)
<br />| ------- * B(x;α,β) dx
<br />⌡<span style="color: rgb(255, 255, 255);">...</span>k
<br />
<br />
<br />which, conceptually, is the weighted sum of the binomial variances for each possible value from the prior distribution, where each binomial variance is weighted by the probability density function of the prior. Pulling the constants out of the integral and writing out the Beta density, this becomes:
<br />
<br /><span style="color: rgb(255, 255, 255);">.......</span>1<span style="color: rgb(255, 255, 255);">........ </span>⌠
<br />------------ | x(1-x) * x^(α-1) * (1-x)^(β-1) dx
<br />k * B(α,β) ⌡
<br />
<br />
<br /><span style="color: rgb(255, 255, 255);">........</span>1<span style="color: rgb(255, 255, 255);">....... </span>⌠
<br />------------ | x^α * (1-x)^β dx
<br />k * B(α,β) ⌡
<br />
<br />
<br />
<br />The definite integral is in the form of the Beta function B(α+1,β+1), so we can rewrite this as
<br />
<br />
<br />B(α+1,β+1)
<br />-------------
<br />k * B(α,β)
<br />
<br />
<br />The Beta function is interchangeable with the <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://en.wikipedia.org/wiki/Gamma_function">Gamma Function</a> in the following manner:
<br />
<br />B(α,β) = Γ(α)*Γ(β) / Γ(α+β)
<br />
<br />Replacing the two Beta functions with their Gamma equivalents gives:
<br />
<br />
<br />Γ(α+1) * Γ(β+1) * Γ(α+β)
<br />-------------------------------
<br />k * Γ(α) * Γ(β) * Γ(α+β+2)
<br />
<br />
<br />This revision is useful because the Gamma function has a property where Γ(x+1)/Γ(x) = x, so the above reduces to:
<br />
<br />
<br />αβ * Γ(α+β)
<br />---------------
<br />k * Γ(α+β+2)
<br />
<br />
<br />Furthermore, since Γ(x+1)/Γ(x) = x, it follows that Γ(x+2)/Γ(x+1) = x+1. If we multiply those two equations together, we find that
<br />
<br />Γ(x+1)<span style="color: rgb(255, 255, 255);">....</span>Γ(x+2)
<br />-------- * -------- = x(x+1)
<br /><span style="color: rgb(255, 255, 255);">..</span>Γ(x)<span style="color: rgb(255, 255, 255);">......</span>Γ(x+1)
<br />
<br />Γ(x+2)/Γ(x) = x(x+1)
<br />
<br />Γ(x)/Γ(x+2) = 1/(x(x+1))
<br />
<br />
<br />Therefore
<br />
<br />
<br /><span style="color: rgb(255, 255, 255);">.</span>αβ * Γ(α+β) <span style="color: rgb(255, 255, 255);">..................</span>αβ
<br />---------------- = ------------------------
<br />k * Γ(α+β+2)<span style="color: rgb(255, 255, 255);">.....</span>k * (α+β) * (α+β+1)
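<br />
<br />As a quick numerical check of the simplification above, here is a small Python sketch that evaluates both sides. The function names are hypothetical, and the values α = 82.5, β = 167.5, k = 250 are just the example figures. It works with log-Gamma (math.lgamma) because math.gamma overflows for arguments as large as α+β here.

```python
import math


def log_beta(a, b):
    """log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def avg_binomial_variance_gamma(a, b, k):
    """B(a+1, b+1) / (k * B(a, b)), evaluated through Gamma functions."""
    return math.exp(log_beta(a + 1, b + 1) - log_beta(a, b)) / k


def avg_binomial_variance_closed(a, b, k):
    """The simplified form: a*b / (k * (a+b) * (a+b+1))."""
    return a * b / (k * (a + b) * (a + b + 1))


print(avg_binomial_variance_gamma(82.5, 167.5, 250))
print(avg_binomial_variance_closed(82.5, 167.5, 250))
```

<br />The two printed values should agree to within floating-point error.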
<br />
<br />
<br />
<br />Now that we have a manageable expression for the random binomial variance of the prior distribution, we return to the requirement that random binomial variance equals the variance of the prior distribution:
<br />
<br />
<br /><span style="color: rgb(255, 255, 255);">..............</span>αβ<span style="color: rgb(255, 255, 255);"> ................................</span>αβ
<br />------------------------ = -----------------------
<br />k * (α+β) * (α+β+1)<span style="color: rgb(255, 255, 255);">......</span>(α+β)^2 * (α+β+1)
<br />
<br />
<br />k * (α+β) * (α+β+1) = (α+β)^2 * (α+β+1)
<br />
<br />
<br />k = α+β
<br />
<br />
<br />Using a more precise calculation for the random binomial variance of the prior, we find that k = α+β rather than α+β+1. Note that when we estimate k by assuming a constant binomial variance of p(1-p)/k, we get a value of k exactly 1 higher than when we run the full calculation for the binomial variance. This is useful because the former calculation is much simpler than the latter, so we can calculate k by using the former method and then subtracting 1. Also note that the 250 value we got in the initial regression to the mean example would also be 1 too high if we were using more precise figures; I've just been rounding them off for cleanliness' sake.
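<br />
<br />The difference between the two estimates of k can also be checked numerically. The sketch below is an illustration under assumptions, not part of the original calculation: it uses a simple midpoint-rule integration (my choice), hypothetical function names, and the rounded prior parameters 82.5 and 167.5.

```python
import math


def beta_variance(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))


def expected_single_trial_variance(a, b, steps=20000):
    """Midpoint-rule estimate of the integral of x(1-x) * BetaPDF(x; a, b) on (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        log_pdf = log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
        total += x * (1 - x) * math.exp(log_pdf)
    return total * h


a, b = 82.5, 167.5
p = a / (a + b)

# Shortcut: binomial variance based only on the population mean.
k_simple = p * (1 - p) / beta_variance(a, b)
# Full version: binomial variance averaged over the whole prior.
k_full = expected_single_trial_variance(a, b) / beta_variance(a, b)

print(round(k_simple, 3), round(k_full, 3))
```

<br />With these inputs the shortcut gives α+β+1 and the averaged version gives α+β, matching the algebra above.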
<br />
<br />Let's look now at the calculation for regression to the mean:
<br />
<br />true talent estimate = (s+pk)/(n+k)
<br />
<br />where s is the observed successes, n is the observed trials, p is the population mean, and k is the regression constant.
<br />
<br />We know from our prior that p=α/(α+β) and k=α+β, so
<br />
<br />(s+pk)/(n+k) =
<br />
<br />s + (α+β)*α/(α+β)
<br />----------------------
<br /><span style="color: rgb(255, 255, 255);">......</span>n + α + β
<br />
<br />
<br /><span style="color: rgb(255, 255, 255);">...</span>α + s
<br />-----------
<br />α + β + n
<br />
<br />
<br />And what does Bayes say? Our posterior is a Beta with parameters α+s and β+n-s, which has a mean
<br />
<br /><span style="color: rgb(255, 255, 255);">.......</span>(α+s)
<br />-----------------
<br />(α+s)+(β+n-s)
<br />
<br />
<br /><span style="color: rgb(255, 255, 255);">...</span>α + s
<br />-----------
<br />α + β + n
<br />
<br />
<br />So Bayes and regression to the mean produce identical talent estimates under these conditions (a binomial process where true talent follows a Beta distribution).
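<br />
<br />A minimal Python sketch of this equivalence, assuming the example prior (α = 82.5, β = 167.5) and using hypothetical function names of my own:

```python
def regression_estimate(s, n, p, k):
    """Regression to the mean: add k trials of population-average performance."""
    return (s + p * k) / (n + k)


def bayes_posterior_mean(s, n, alpha, beta):
    """Mean of the Beta(alpha + s, beta + n - s) posterior."""
    return (alpha + s) / (alpha + beta + n)


alpha, beta = 82.5, 167.5
p = alpha / (alpha + beta)  # population mean
k = alpha + beta            # regression constant

for s, n in [(4, 10), (40, 100), (200, 500)]:
    print(n, regression_estimate(s, n, p, k), bayes_posterior_mean(s, n, alpha, beta))
```

<br />For every (s, n) pair, the two columns should match up to floating-point rounding.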
<br />
<br />k is far easier to estimate directly (such as by using the method in the initial regression to the mean example) than α and β, so we would typically calculate α and β from k. To do that, we use the fact that p = α/(α+β), and that k=α+β, so by substitution we can easily find that:
<br />
<br />α=kp
<br />β=k(1-p)
<br />
<br />where k is the regression constant and p is the population mean.
<br />
<br />We can also see that the regression amount will be constant regardless of the number of observed PAs, because when we take our Bayesian talent estimate:
<br />
<br /><span style="color: rgb(255, 255, 255);">...</span>α + s
<br />-----------
<br />α + β + n
<br />
<br />we see that we are always adding the quantity kp (as substituted for α) to the observed successes (s), and always adding the quantity k (as substituted for (α+β)) to the observed trials (n), no matter what observed values we have for s and n. The amounts we add to the observed successes and trials depend only on the parameters of the prior, which do not change.
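<br />
<br />To illustrate, here is a short Python sketch that builds the prior from k and p and then applies it to samples of different sizes. The function names are hypothetical, and k = 250 and p = .330 are the rounded figures from the example.

```python
def prior_from_regression(k, p):
    """Convert a regression constant k and population mean p into Beta parameters."""
    return k * p, k * (1 - p)


def talent_estimate(s, n, k, p):
    """Posterior mean (alpha + s) / (alpha + beta + n) with alpha = kp, beta = k(1-p)."""
    alpha, beta = prior_from_regression(k, p)
    return (alpha + s) / (alpha + beta + n)


# The same kp successes and k trials are added regardless of sample size.
for s, n in [(4, 10), (40, 100), (200, 500), (2000, 5000)]:
    print(n, round(talent_estimate(s, n, 250, 0.330), 3))
```

<br />Each line adds the same 82.5 successes and 250 trials to the observed totals; only the observed s and n change.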
<br />
<br />
<br /></span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-12298009472240986632011-07-20T17:44:00.003-07:002012-04-02T12:51:22.890-07:00Poisson Processes in SportsIn sports, the problem of relating a team's offensive and defensive production to its W-L record is closely related to the distribution of scoring events in the sport. For example, say you want to know how often a team that scores, on average, 4 times per game and allows 3 scores per game is expected to win. It is not enough to simply know that the team averages 4 scores and 3 scores allowed; you also have to have an idea of how likely the team is to score (or allow) 0 times, 1 time, 2 times, etc. If the nature of the sport provides for a very tight range of scores for each team (i.e. the 4-score team is very unlikely to score 0 or 1 time, or 7 or 8 times), then the team will win more often than if the sport sees a wider distribution of observed scores for each team.<br /><br /><span class="fullpost">Let's say, for example, that the team in this example scores and allows scores in the following distribution:<br /><br /><style>table { }td { padding-top: 1px; padding-right: 1px; padding-left: 1px; color: black; font-size: 12pt; font-weight: 400; font-style: normal; text-decoration: none; font-family: Calibri,sans-serif; vertical-align: bottom; border: medium none; white-space: nowrap; }</style> <table style="border-collapse: collapse; width: 150pt;" border="0" cellpadding="0" cellspacing="0" width="150"> <col style="width: 50pt;" span="3" width="50"> <tbody><tr style="height: 15pt;" height="15"> <td style="height: 15pt; width: 50pt; text-align: right;" height="15" width="50"><br /></td> <td style="width: 50pt; text-align: right;" width="50">score</td> <td style="width: 50pt; text-align: right;" width="50">allow</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">0</td> <td 
align="right">0.06</td> <td align="right">0.14</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">1</td> <td align="right">0.1</td> <td align="right">0.15</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">2</td> <td align="right">0.13</td> <td align="right">0.17</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">3</td> <td align="right">0.15</td> <td align="right">0.17</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">4</td> <td align="right">0.17</td> <td align="right">0.12</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">5</td> <td align="right">0.13</td> <td align="right">0.1</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">6</td> <td align="right">0.1</td> <td align="right">0.07</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">7</td> <td align="right">0.07</td> <td align="right">0.04</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">8</td> <td align="right">0.05</td> <td align="right">0.03</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">9</td> <td align="right">0.03</td> <td align="right">0.01</td> </tr> <tr style="height: 15pt;" height="15"> <td style="height: 15pt;" align="right" height="15">10</td> <td align="right">0.01</td> <td align="right">0</td> </tr> </tbody></table><br /><br />In the above table, the team would score 0 times 6% of the time and allow 0 scores 14% of the time.<br /><br />To find the chances of the team winning, you first figure its chances of outscoring its opponents when it scores once. 
Since this team will allow fewer than one score 14% of the time, it would be expected to win 14% of the time it scores once. The team scores once 10% of the time, so one-score victories should account for .1*.14=1.4% of its games. Continuing for 2-score victories, the team allows less than 2 scores 29% of the time (14%+15%), so 2-score victories account for .13 2-score games * .29 wins per 2-score game = 3.8% of the team's total games.<br /><br />Doing this for each possible number of scores, the team will win a total of 56% of its games. Repeating the same process for losses, it will lose 32% of the time (the other 12% of games will end tied).<br /><br />As long as we know the probability of each possible number of scores and scores allowed, the expected W-L performance can be found in this way. In terms of summation notation, it looks something like this:<br /><br /><span style="font-size:78%;"><span style="color: rgb(255, 255, 255);">..</span>∞<span style="color: rgb(255, 255, 255);">............ </span> </span><span style="font-size:78%;"> </span> <span style="font-size:78%;"> i-1</span><br /><span style="font-size:180%;">∑</span>p<span style="font-size:78%;">s</span>(i) <span style="font-size:180%;">∑</span>p<span style="font-size:78%;">a</span>(j)<br /><span style="font-size:78%;">i=0</span><span style="color: rgb(255, 255, 255);">......,..</span><span style="font-size:78%;">j=0</span><br /><br />where p<span style="font-size:78%;">s</span>(i) is the probability of scoring i number of times and p<span style="font-size:78%;">a</span>(j) is the probability of allowing j number of scores.<br /><br />This is only useful if you have a reasonable model for finding these probabilities, however, which requires you to have some model for the distribution of possible scores around the team average. 
In baseball, no such distribution is obvious, so instead of the above process, we use shortcuts like <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.tangotiger.net/wiki/index.php?title=PythagenPat">PythagenPat</a> to model the results of translating the underlying distribution of possible run-totals to an expected win percentage (by the way, the above example roughly resembles the actual distribution for a baseball team; traditional pythag would give you 4^2/(4^2+3^2) = 16/25 = .640 w%, while the example (ignoring ties) shows .56W/(.56W+.32L) = .636). Steven Miller showed that a <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://arxiv.org/abs/math/0509698">Weibull distribution of runs</a> gives a Pythagorean estimate of W%, and that the Weibull distribution is a reasonable assumption for his sample data (the 2004 American League), but that is just working backwards from the model in place.<br /><br />Some sports, however, do present an obvious choice of model, namely the Poisson distribution. Both soccer and hockey are decent examples of <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://en.wikipedia.org/wiki/Poisson_process">Poisson processes</a> because<br /><br />-play happens over a predetermined length, measured in a continuous fashion (i.e. 
time, as opposed to something like baseball, which is measured in discrete units of outs or innings)<br />-goals can only come one at a time (as opposed to something like basketball, where points can come in groups of 1, 2, 3, or 4)<br />-the number of goals scored over a given period of the game is largely independent of the number of goals scored over a separate period of the game (the fluid nature of possession is a key attribute here; for a sport like American football where a score dictates who has possession for a significant chunk of time, a team's score over one five-minute span will be somewhat dependent on whether it scored in the previous five-minute span, for example)<br />-the expectation for the number of goals over a period of time (once you know who is playing) depends mostly on the length of time<br /><br />Hockey has at least one exception to the requirements of a Poisson process, in that the number of goals scored at the end of the game is not always independent of the number of goals scored earlier in the game due to empty net goals, but I don't know how much of an issue this presents. Soccer is a more straightforward example (as well as a more homogeneous example due to the relative lack of substitution and penalties that are continually affecting the score-rate in hockey). Both, however, generally fit the mould for a Poisson process.<br /><br />Using a Poisson distribution to fill out a table as in the above example (if you have Excel or a similar spreadsheet program, it should have a Poisson distribution function built in), we can then calculate expected W-L performances for a team. The first and second columns use the average number of goals for and against, respectively, as λ (in Excel, Poisson.Dist(x,avg goals for/against,False), where x is 0,1,2,3...). Say we do this for a soccer team that we expect to score an average of 2 goals per game and allow an average of 1 goal per game against its opponent. 
We get the following probabilities:<br /><br />W: .606<br />L: .183<br />D: .212<br /><br />Using the traditional soccer point-format (3 points for a win, 1 for a draw), this team would average about 2.03 points per game against its opponent.<br /><br />We can also use the Poisson distribution to figure out what to expect if the game goes to overtime. Elimination soccer matches typically have a 30 minute OT (two 15-minute periods), so the λ (which, recall, are the average goals for and against, which are 2 and 1 in this example) for the OT will be 1/3 their regulation-match value (note that finding λ for regular-season hockey OTs will be more complicated because the 4v4 format will affect the scoring rate). Reconstructing the table with λ values of 2/3 and 1/3, we get the following results for games that go to OT:<br /><br />OTW: .384<br />OTL: .161<br />OTD: .454<br /><br />If overtime ends in a draw, the game is usually decided on PKs. If we assume that each team is 50/50 to win in PKs (which is not necessarily the case, but shootout odds should be closer to 50/50 than the rest of the match, and the odds in a shootout aren't necessarily based on expected goals for and against for the match), then our team's expected win% once a game goes to OT is .384 + .5*.454 = .611. Remember that the team wins 60.6% of the time in regulation, and the game goes to OT 21.2% of the time, so the team's total expected wins is .606 + .611*.212 = .735.<br /><br />If we want to model a sudden death OT, such as in the Stanley Cup playoffs, the odds of winning in regulation remain unchanged, but we have to use a different formula to determine the chances of winning once the game goes to overtime. The Poisson distribution works for estimating the probability of scoring a certain number of goals in a pre-determined amount of time (such as a 20-minute period or a 60-minute game), but not for estimating the time until the next goal. 
For that, we instead need the <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://en.wikipedia.org/wiki/Exponential_distribution">exponential distribution</a>, which models the amount of time until the next goal.<br /><br />We want to know the probability that our team's time until its next goal is less than its opponent's time to its next goal. Recall the above formula we used to determine the odds of our team's goals scored being higher than its opponent's:<br /><br /><span style="font-size:78%;"> <span style="color: rgb(255, 255, 255);">..</span>∞<span style="color: rgb(255, 255, 255);">............ </span> </span><span style="font-size:78%;"> </span> <span style="font-size:78%;"> i-1</span><br /><span style="font-size:180%;">∑</span>p<span style="font-size:78%;">s</span>(i) <span style="font-size:180%;">∑</span>p<span style="font-size:78%;">a</span>(j)<br /><span style="font-size:78%;">i=0</span><span style="color: rgb(255, 255, 255);">......,..</span><span style="font-size:78%;">j=0</span><br /><br />Here, we use something similar, except that we want to know the chances that our team's value (time to the next goal) is less than that of its opponent:<br /><br /><span style="font-size:78%;"><span style="color: rgb(255, 255, 255);">..</span>∞</span><span style="font-size:78%;"><span style="color: rgb(255, 255, 255);">............ 
</span> </span><span style="font-size:78%;"> </span> <span style="font-size:78%;"> i-1</span><br /><span style="font-size:180%;">∑</span>p<span style="font-size:78%;">a</span>(i) <span style="font-size:180%;">∑</span>p<span style="font-size:78%;">s</span>(j)<br /><span style="font-size:78%;">i=0</span><span style="color: rgb(255, 255, 255);">......,..</span><span style="font-size:78%;">j=0</span><br /><br />where p<span style="font-size:78%;">s</span>(j) is the probability of our team's next goal coming after j amount of elapsed time, and p<span style="font-size:78%;">a</span>(i) is the probability of its opponent's next goal coming after i amount of elapsed time.<br /><br />Additionally, we are now dealing with a continuous variable (time elapsed) rather than a discrete variable (number of goals scored), so we need to integrate instead of sum:<br /><br />⌠<span style="font-size:78%;">∞<span style="color: rgb(255, 255, 255);">.........</span></span>⌠<span style="font-size:78%;">x</span><br />⌡<span style="font-size:78%;">0</span> f(x) ⌡<span style="font-size:78%;">0</span> g(x) dx dx<br /><br />where f(x) models the amount of time until the opponent's next goal, and g(x) models the amount of time until our team's next goal. 
In this formula, f(x) is an exponential probability density function with λ=expected goals allowed (G<span style="font-size:78%;">a</span>), and ∫g(x)dx is an exponential cumulative distribution function with λ=expected goals scored (G<span style="font-size:78%;">s</span>):<br /><br /><br />⌠<span style="font-size:78%;">∞</span> <br />⌡<span style="font-size:78%;">0</span> G<span style="font-size:78%;">a</span>*e^-(G<span style="font-size:78%;">a</span>x)*(1-e^-(G<span style="font-size:78%;">s</span>x)) dx<br /><br />This might look a bit ugly (or maybe not since e^x is such a simple integration), but it simplifies to just:<br /><br /><span style="color: rgb(255, 255, 255);">...</span>G<span style="font-size:78%;">s</span><br />-------<br />G<span style="font-size:78%;">s</span>+G<span style="font-size:78%;">a</span><br /><br />This makes perfect sense if we think about the next goal being a goal randomly selected from the distribution of possible goals in the game: the odds that the randomly selected goal comes from our team equal the percentage of total goals we expect to come from our team, and the odds that the randomly selected goal comes from our opponent equal the percentage of total goals we expect to come from them.<br /><br />Now that we have a model for sudden-death OT, we can estimate a team's chances of winning a game with sudden death OT. For example, say we have a hockey game where we expect our team to score 3 goals and allow 2 goals on average. This team would be expected to win in regulation about 58.5% of the time, lose in regulation about 24.7% of the time, and go to OT 16.8% of the time. Once in OT, it will win 3/(3+2)=60% of the time, so its total expected wins is .585 + .6*.168 = .686.<br /><br />Another interesting use of these distributions is to evaluate different strategies or lineups for a team (given that you can estimate the expected goals scored and allowed for varying lineups/strategies). 
Returning to the soccer team example where we have a team that we expect to score two goals and allow one, let's say that they are capable of making adjustments that make them stronger defensively, but at the cost of a significant portion of their offense. Say that they can play a defensive game and allow just .38 goals per game, but that doing so reduces their expected offensive output to 1.2 goals per game. In regular league play, the new defensive alignment will still average 2.03 points per game, so there is no benefit to this change.<br /><br />In a tournament elimination game, however, their win expectancy rises from .735 to .761, because the increase in regulation draws will still lead to a lot of wins (~61% of OT games) instead of just 1-point outcomes. What's more, if they switch back to the more aggressive game in OT (their 2 goals for, 1 goal against form), they can slightly improve their OT win odds (from .608 to .611) by avoiding more shootouts.<br /><br />Similarly, a sudden death format, where only the ratio of goals scored to goals allowed matters, can also produce different ideal strategies. Doubling both expected goals scored and allowed, for example, would have a significant effect on a team's odds of winning in regulation, but would have no effect on sudden-death because it preserves the ratio of offense to defense, and changes that have no impact on regulation (like going from 2 goals for/1 goal against to 1.2 goals for/.38 goals against in a regular season format soccer match) could have a significant impact on sudden death chances (.667 to .759 once you get to sudden death). 
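<br /><br />The figures in these examples can be reproduced with a short Python sketch. It is only an illustration under the assumptions stated in the text: independent Poisson scoring, a 30-minute OT with one third of the regulation scoring rate, and a 50/50 shootout. The function names and the 40-goal cutoff for the sums are my own choices.

```python
import math


def poisson_pmf(goals, lam):
    """Probability of exactly `goals` goals when lam goals are expected."""
    return math.exp(-lam) * lam ** goals / math.factorial(goals)


def win_draw_loss(lam_for, lam_against, max_goals=40):
    """Win, draw and loss probabilities for independent Poisson scoring."""
    pf = [poisson_pmf(i, lam_for) for i in range(max_goals + 1)]
    pa = [poisson_pmf(j, lam_against) for j in range(max_goals + 1)]
    win = sum(pf[i] * sum(pa[:i]) for i in range(max_goals + 1))
    draw = sum(pf[i] * pa[i] for i in range(max_goals + 1))
    loss = sum(pa[j] * sum(pf[:j]) for j in range(max_goals + 1))
    return win, draw, loss


def sudden_death_win(lam_for, lam_against):
    """Chance of scoring first: Gs / (Gs + Ga)."""
    return lam_for / (lam_for + lam_against)


def elimination_win_prob(lam_for, lam_against, ot_fraction=1 / 3, shootout_win=0.5):
    """Regulation win, plus draws settled by a timed OT and then a shootout."""
    win, draw, _ = win_draw_loss(lam_for, lam_against)
    ot_win, ot_draw, _ = win_draw_loss(lam_for * ot_fraction, lam_against * ot_fraction)
    return win + draw * (ot_win + shootout_win * ot_draw)


print([round(x, 3) for x in win_draw_loss(2, 1)])
print(round(elimination_win_prob(2, 1), 3))
print(round(sudden_death_win(2, 1), 3), round(sudden_death_win(1.2, 0.38), 3))
```

<br />Changing the two expected-goal inputs is enough to compare other lineups or strategies in the same way.<br /><br />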
Of course, any changes in strategy called for by different formats would depend on the team's ability to adapt to a different style of play and on how such changes affect its expected offensive and defensive production, but it is possible for an ideal lineup or strategy in one format to not be ideal in another, and using Poisson distributions to find the connection between offensive and defensive production and expected W-L performance is helpful in evaluating potential changes.<br /><br /><br /></span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com0tag:blogger.com,1999:blog-2868194292414002063.post-65569980276860229242011-05-15T21:27:00.006-07:002011-05-15T21:53:04.560-07:00Luis Gonzalez paintingMy most recent painting, for a charity auction in Phoenix, AZ. It's supposed to be Game 7 of the 2001 World Series, but, coincidentally, it also happens to feature the only two numbers retired by the Diamondbacks. Click the image for a larger size.<br /><br /><span class="fullpost"><br /><a href="http://i16.photobucket.com/albums/b37/adorhauer/DSC03643.jpg" target="_blank"><img src="http://i16.photobucket.com/albums/b37/adorhauer/224321_672123746822_36105071_35265428_6717060_n.jpg" border="1" alt="Photobucket"></a><br /></span>Kincaidhttp://www.blogger.com/profile/07348661324396474896noreply@blogger.com1tag:blogger.com,1999:blog-2868194292414002063.post-87674372281624414082011-04-22T02:45:00.002-07:002011-04-22T12:20:49.834-07:00Bayes, Regression, and the Red Sox (but mostly Bayes)WARNING: Math heavy post.<br /><br /><span class="fullpost">Let's say that we have a hypothetical Major League team that has started the season 2-10. Say we want to see how likely it is that a team that goes 2-10 is still a .500 team. If you've taken any statistics courses, you may be familiar with hypothesis testing where you formulate a null hypothesis that the team is a .500 team, and then calculate the probability that they would go 2-10 by chance. 
You would then see that a .500 team will go 2-10 (or worse) over a 12-game span less than 2% of the time, and you'd probably reject your null hypothesis, depending on how high you set your significance level.<br /><br />This does not, however, tell you anything about the likelihood that the team actually is a .500 team, only the likelihood that they would perform that badly if they were in fact a .500 team. To address the former issue, we can instead use a Bayesian approach.<br /><br />To do that, we have to make an assumption about the population that the team is drawn from (in other words, about the spread of talent across all teams in the league). Most teams in MLB fall somewhere in the .400-.600 range (at least in true talent, if not in observed win percentage), so for convenience's sake, let's just assume MLB teams follow a normal distribution of talent with a mean of .500 and a standard deviation of .050 W%.<br /><br />This is our prior distribution. Once we have that, we can apply <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://en.wikipedia.org/wiki/Bayes%27_theorem">Bayes' Theorem</a>:<br /><br /><br />P(A|B) = P(B|A)*P(A)/P(B)<br /><br />We want to find the probability of A given B [P(A|B)], which is to say that we want to find the probability that a team drawn from our prior population distribution (u=.500, SD=.050) is actually a .500 (or better) team, given that we have observed that team to go 2-10 over 12 games. To do this, we need to find the probabilities P(B|A), P(A), and P(B).<br /><br />P(A) is the simplest. This is the probability that a random team drawn from our prior distribution is at least a .500 team, regardless of what we observed over the 12-game sample. This is just .500, because the population is centered around .500. 
So, P(A)=.5.<br /><span style="font-style: italic;"><br />*NOTE-if you want to use a W% other than .500, all you have to do is find the probability of drawing a W% at least that high from the prior distribution, which is trivial when the prior distribution is normal since that is a common computer function. If you don't have a program that can calculate normal distribution probabilities, you can use<span style="font-weight: bold; color: rgb(51, 0, 153);"> </span><a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.analyzemath.com/statistics/normal_calculator.html">an online tool</a>.</span><br /><br />P(B) is a bit more difficult. This is the probability that we observe a random team drawn from our prior distribution to go 2-10 over 12 games. This would be simple to calculate for any one team if we already know its true W%, but we need to know the average probability for all possible teams we could draw from our prior distribution. This will require a bit of calculus.<br /><br />For any one team with a known true W% (p), the probability of going 2-10 is:<br /><br />( 12! / (2! * 10!) ) * p^2 * (1-p)^10<br /><br />where 12 represents the number of total games, 2 the number of wins, and 10 the number of losses. We need to find the average value of that formula across the entire prior distribution.<br /><br />To do this, we utilize the same principle as a weighted average. In this case, the weight given to each possible value of p is represented by the probability density function of the prior distribution. So, for example, the odds of a .400 team going 2-10 are about 6.4%, and that 6.4% gets weighted by the quantity f(p), where f(p) is the pdf for our prior normal distribution. We repeat this for each possible value of p, add up the weighted terms, and then divide by the sum of the weights. As you have probably guessed, this means taking the definite integral (from p=0 to p=1) of the probabilities weighted by f(p):<br /><br />INTEGRAL(f(p) * ( ( 12! 
/ (2! * 10!) ) * p^2 * (1-p)^10 ) dp)<br /><br />f(p) is just the normal distribution pdf, which is:<br /><br />f(p) = e^((-((p-u)^2))/(2*VAR))/(sqrt(2*Pi*VAR)) ; u=.5, VAR=.05^2=.0025<br /><br />As fun as it sounds to go ahead and do that by hand, we'll just let computers do this part for us too, using an <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.solvemymath.com/online_math_calculator/calculus/definite_integral/index.php">online definite integral calculator</a>. You can copy and paste this into the function field if you want to try it for yourself (the calculator uses x in place of p):<br /><br />((12!/(2!*10!))*exp((-((x-.5)^2))/(2*.0025))*(x^2*(1-x)^10)/(sqrt(2*3.14159*.0025)))<br /><br />or, more generally, if you want to play around with different records or different means and variances for the prior normal distribution:<br /><br />(((W+L)!/(W!*L!))*exp((-((x-u)^2))/(2*VAR))*(x^W*(1-x)^L)/(sqrt(2*Pi*VAR)))<br /><br />Integrating that from 0 to 1, we get a total value of .020. Remember that we still have to divide by the total sum of the weights to find the weighted average, but since that is just F(p), or the cumulative distribution function of the prior distribution, it is equal to one by definition. Therefore, P(B) = .02.<br /><br />Finally, we have to calculate P(B|A). This is the probability of observing a 2-10 record, given that we are drawing a random team from the prior distribution that fulfills condition A, which is that the team has at least a .500 true W%. This is done very similarly to finding P(B) above, except we are only considering values of p>.5.<br /><br />Start by calculating the same definite integral as before, but from .5 to 1 instead of from 0 to 1 (this is done by simply changing a textbox below the formula). This gives a value of .0045. That is the weighted sum of all the probabilities; to turn the weighted sum into an average, we still have to divide by the sum of all the weights, which in this case is .5 (this is the cdf F(p) from .5 to 1). 
Dividing .0045 by .5, we get .009. This is P(B|A).<br /><br />P(A)=.5<br />P(B) = .02<br />P(B|A) = .009<br /><br />P(A|B) = .009*.5/.02 = .22<br /><br />The probability of a team that goes 2-10 being at least a true talent .500 team is about 22% (assuming our prior distribution is fairly accurate). As you can see, this is pretty far from what one might conclude from using the hypothesis test to reject the null hypothesis that the team is a .500 team. This is why it is important not to misconstrue the meaning of the hypothesis test. The hypothesis test only tells you a very specific thing, which is how likely or unlikely the observed result is if you assume the null hypothesis to be true. Rejecting the null hypothesis on this basis does not necessarily mean the null hypothesis is unlikely; that depends on the prior distribution of possible hypotheses that exist. Considering potential prior distributions allows us to make more relevant estimates and conclusions about the likelihood of the null hypothesis.<br /><br />Another advantage of the Bayesian approach is that it gives us a full posterior distribution of possible results. For example, when we observe a 2-10 team, we can not only estimate the odds that it is a true .500 team, but also the odds that it is a true .550 team, or .600 team, or whatever. Also, since we have a full distribution of likelihoods, we can also figure out the expected value.<br /><br />The posterior distribution of possible true talent W% for a team that is observed to go 2-10 is represented by the product we integrated earlier:<br /><br />(((W+L)!/(W!*L!))*exp((-((x-u)^2))/(2*VAR))*(x^W*(1-x)^L)/(sqrt(2*Pi*VAR)))<br /><br />We find the expected value of that function the same way we found the average value of the above probabilities. This time, we want to find the average value of x (or p, as we were calling it before), so we weight each value of x by the above function, and then divide by the sum of the weights. 
For the numerator, this means integrating the above function multiplied by x:<br /><br />x * (((W+L)!/(W!*L!))*exp((-((x-u)^2))/(2*VAR))*(x^W*(1-x)^L)/(sqrt(2*Pi*VAR)))<br /><br />The denominator is just the integral of the function not multiplied by x, which we already did above (it is the same thing as the P(B) above, or .020).<br /><br />Plugging this into the definite integral calculator, we get:<br /><br />0.009442586/0.020355243 = 0.464<br /><br />So a team that goes 2-10 over 12 games will be, on average, about a .464 team (again, assuming our prior distribution is accurate).<br /><br />As a shortcut for the above calculations, one can also use regression to the mean to estimate the expected value of the team's true W%. This is done by finding the point where random variance in the observed record equals the variance of talent in the population. We know the standard deviation of talent in the population, using our assumed prior, is .050 wins per game. The random variance in the observed record is n*p*(1-p), where n is the number of games and p is the team's true W%. Since p mostly stays around .5, this is approximately equal to n*.25 (it's actually about n*.2475, but the difference is minimal). When you have 100 observed games, the random variance will be 100*.25 = 25. The variance of talent, measured in wins over those same 100 games, will be (100*.05)^2 = 25. Therefore, after 100 games, the variance due to random variation will equal the variance of talent in the population.<br /><br />At this point, we would regress exactly 50% to the mean to estimate the expected value of a team's true W%. This is the same as adding 100 games of league-average performance to a team's observed record. Adding those same 100 games also works after any observed number of games. To estimate our 2-10 team's true W% with regression to the mean, we would add 50 wins and 50 losses to the observed record and get an expected W% of 52/112 = .464. 
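<br /><br />For anyone who would rather check these numbers without the online calculator, the following is a minimal Python sketch of the same integrals and of the regression shortcut. It assumes the prior used above (a normal distribution with mean .500 and SD .050) and the 2-10 record. The function names and the Simpson's rule integrator are hypothetical choices made for this sketch and are not what the calculator itself does.<br /><br />

```python
import math

def prior_pdf(p, mean=0.5, sd=0.05):
    """Normal pdf used as the prior on true W% (assumed mean and SD)."""
    var = sd * sd
    return math.exp(-((p - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def likelihood(p, wins, losses):
    """Binomial probability of the observed record for a team with true W% p."""
    return math.comb(wins + losses, wins) * p ** wins * (1 - p) ** losses

def simpson(func, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def weighted(p, wins=2, losses=10):
    """Prior density times likelihood: the product integrated in the text."""
    return likelihood(p, wins, losses) * prior_pdf(p)

# P(B): probability of a 2-10 record for a team drawn from the prior.
p_b = simpson(weighted, 0.0, 1.0)

# P(A|B): share of that probability coming from teams with true W% of .500 or better.
p_a_given_b = simpson(weighted, 0.5, 1.0) / p_b

# Expected true W% given the 2-10 record (posterior mean).
posterior_mean = simpson(lambda p: p * weighted(p), 0.0, 1.0) / p_b

# Regression shortcut: add 50 wins and 50 losses to the observed record.
shortcut = (2 + 50) / (12 + 100)

print(p_b, p_a_given_b, posterior_mean, shortcut)
```

Run as written, this should give a P(B) close to .020, a P(A|B) close to .22, and a posterior mean close to .464, alongside the shortcut value of 52/112.<br /><br />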
How well regression to the mean approximates Bayesian probability depends on the prior distribution you choose, but in this case, it works very well (the two estimates agree to the third decimal place).<br /><br />This is all done assuming a prior that presumes nothing about the team in question other than the fact that it comes from a distribution of teams roughly like the one we observe in MLB. What if, in addition to knowing the 2-10 team comes from MLB, we also know that the team employs several really good players and we believe it to be one of the most talented teams in the league? We can adjust the prior for that information as well. Our prior distribution, after accounting for the amount of talent we believe to be on the team, might have a mean of .590 instead of .500. Let's say that we assume a normal distribution with a mean of .590 and a SD of .030 for our prior. Now we get an expected W%, after observing the team go 2-10, of .572. In this case, the abysmal 12-game observed sample dropped our expectations for the team going forward from .590 to .572. Of course, this all depends on what prior you assume, but as long as you make reasonable assumptions, your estimate should give you a decent idea of the team's true talent.</span><br /><br />Optimistic Projections<br />Posted 2011-02-13, updated 2011-12-13<br /><br /><span style="font-style:italic;">UPDATE: The table in the article has been updated with actual 2011 wOBAs for each player, along with actual 2011 PAs. An aggregate line has been added with the combined projected and actual wOBAs weighted by actual 2011 PAs. 
Also, in light of Mike's response below regarding the handling of intentional walks in their projections, it is possible the RotoChamp wOBA values are artificially inflated, so the difference between the average Roto projection and the actual wOBA might not be meaningful.</span><br /><br />While looking through the <a style="font-weight: bold; color: rgb(51, 0, 153);" href="http://www.fangraphs.com/projections.aspx?pos=all&stats=bat&type=marcel">projections posted at FanGraphs</a>, I noticed that some of the RotoChamp projections seem a bit optimistic. For example, Jim Thome is projected with a .419 wOBA. Thome had a good year last year, putting up a .437 wOBA over 340 PA, but he's also 40 years old, and even with a potential HOF career behind him, his career wOBA is only .407.<br /><br />This looks like a simple case of under-regressing 2010 performance and giving too little weight to earlier seasons. However, other optimistic projections fall outside this pattern. For example, Albert Pujols is projected with a .449 wOBA, higher even than his career .434 mark despite Albert now being on the wrong side of 30 (it appears the wOBA projections may not remove IBB, so Albert's career wOBA with his IBBs counted as nIBBs would be .444; closer to his projection, but still lower). Albert's 2010 performance, however, does not seem to be driving the high projection; he posted only a .420 wOBA last year.<br /><br /><span class="fullpost">Does RotoChamp see something important in projections like these, or is the optimism misplaced? I find 20 different players listed in both the Marcel and RotoChamp projections whose wOBA is projected at least .030 higher by RotoChamp than by Marcel. 
They are:<br /><br /><style> <!--table {mso-displayed-decimal-separator:"\."; mso-displayed-thousand-separator:"\,";} @page {margin:1.0in .75in 1.0in .75in; mso-header-margin:.5in; mso-footer-margin:.5in;} td {padding-top:1px; padding-right:1px; padding-left:1px; mso-ignore:padding; color:black; font-size:12.0pt; font-weight:400; font-style:normal; text-decoration:none; font-family:Calibri, sans-serif; mso-font-charset:0; mso-number-format:General; text-align:general; vertical-align:bottom; border:none; mso-background-source:auto; mso-pattern:auto; mso-protection:locked visible; white-space:nowrap; mso-rotate:0;} .xl63 {mso-number-format:"0\.000";} --> </style> <table style="border-collapse: collapse; width: 575px; height: 485px;" border="0" cellpadding="0" cellspacing="0"> <colgroup><col style="width:65pt" span="8" width="65"> </colgroup><tbody><tr style="height:15.0pt" height="15"> <td style="height:15.0pt;width:65pt" height="15" width="65">Name</td> <td style="width: 65pt; text-align: right;" width="65">Age</td> <td style="width: 65pt; text-align: right;" width="65">Rel</td> <td style="width: 65pt; text-align: right;" width="65">Marcel</td> <td style="width: 65pt; text-align: right;" width="65">RotoChamp</td> <td style="width: 65pt; text-align: right;" width="65">diff</td> <td style="width: 65pt; text-align: right;" width="65">2011 PA<br /></td> <td style="width: 65pt; text-align: right;" width="65">2011 wOBA<br /></td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Jim Thome<span style="mso-spacerun:yes"> </span></td> <td align="right">41</td> <td align="right">0.81</td> <td class="xl63" align="right">0.360</td> <td class="xl63" align="right">0.419</td> <td align="right">0.059</td> <td align="right">324</td> <td class="xl63" align="right">0.362</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Dan Johnson<span style="mso-spacerun:yes"> </span></td> <td align="right">32</td> <td 
align="right">0.4</td> <td class="xl63" align="right">0.327</td> <td class="xl63" align="right">0.381</td> <td align="right">0.054</td> <td align="right">91</td> <td class="xl63" align="right">0.181</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Jose Bautista<span style="mso-spacerun:yes"> </span></td> <td align="right">31</td> <td align="right">0.84</td> <td class="xl63" align="right">0.362</td> <td class="xl63" align="right">0.408</td> <td align="right">0.046</td> <td align="right">655</td> <td class="xl63" align="right">0.441</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Brandon Allen<span style="mso-spacerun:yes"> </span></td> <td align="right">25</td> <td align="right">0.38</td> <td class="xl63" align="right">0.321</td> <td class="xl63" align="right">0.367</td> <td align="right">0.046</td> <td align="right">195</td> <td class="xl63" align="right">0.286</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Ramon Castro<span style="mso-spacerun:yes"> </span></td> <td align="right">35</td> <td align="right">0.6</td> <td class="xl63" align="right">0.319</td> <td class="xl63" align="right">0.361</td> <td align="right">0.042</td> <td align="right">75</td> <td class="xl63" align="right">0.332</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Magglio Ordonez<span style="mso-spacerun:yes"> </span></td> <td align="right">37</td> <td align="right">0.83</td> <td class="xl63" align="right">0.345</td> <td class="xl63" align="right">0.386</td> <td align="right">0.041</td> <td align="right">357</td> <td class="xl63" align="right">0.283</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Ryan Hanigan<span style="mso-spacerun:yes"> </span></td> <td align="right">31</td> <td align="right">0.69</td> <td class="xl63" align="right">0.329</td> <td class="xl63" align="right">0.370</td> <td 
align="right">0.041</td> <td align="right">304</td> <td class="xl63" align="right">0.320</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Chipper Jones<span style="mso-spacerun:yes"> </span></td> <td align="right">39</td> <td align="right">0.83</td> <td class="xl63" align="right">0.353</td> <td class="xl63" align="right">0.393</td> <td align="right">0.04</td> <td align="right">512</td> <td class="xl63" align="right">0.345</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Jorge Posada<span style="mso-spacerun:yes"> </span></td> <td align="right">40</td> <td align="right">0.79</td> <td class="xl63" align="right">0.341</td> <td class="xl63" align="right">0.381</td> <td align="right">0.04</td> <td align="right">387</td> <td class="xl63" align="right">0.309</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Matt Diaz<span style="mso-spacerun:yes"> </span></td> <td align="right">33</td> <td align="right">0.74</td> <td class="xl63" align="right">0.330</td> <td class="xl63" align="right">0.367</td> <td align="right">0.037</td> <td align="right">268</td> <td class="xl63" align="right">0.280</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Andruw Jones<span style="mso-spacerun:yes"> </span></td> <td align="right">34</td> <td align="right">0.75</td> <td class="xl63" align="right">0.323</td> <td class="xl63" align="right">0.360</td> <td align="right">0.037</td> <td align="right">222</td> <td class="xl63" align="right">0.371</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Albert Pujols<span style="mso-spacerun:yes"> </span></td> <td align="right">31</td> <td align="right">0.87</td> <td class="xl63" align="right">0.414</td> <td class="xl63" align="right">0.449</td> <td align="right">0.035</td> <td align="right">651</td> <td class="xl63" align="right">0.385</td> </tr> <tr 
style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">David Ortiz<span style="mso-spacerun:yes"> </span></td> <td align="right">36</td> <td align="right">0.85</td> <td class="xl63" align="right">0.347</td> <td class="xl63" align="right">0.382</td> <td align="right">0.035</td> <td align="right">605</td> <td class="xl63" align="right">0.405</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Jason Giambi<span style="mso-spacerun:yes"> </span></td> <td align="right">40</td> <td align="right">0.78</td> <td class="xl63" align="right">0.326</td> <td class="xl63" align="right">0.359</td> <td align="right">0.033</td> <td align="right">152</td> <td class="xl63" align="right">0.407</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Matt Treanor<span style="mso-spacerun:yes"> </span></td> <td align="right">35</td> <td align="right">0.64</td> <td class="xl63" align="right">0.279</td> <td class="xl63" align="right">0.312</td> <td align="right">0.033</td> <td align="right">242</td> <td class="xl63" align="right">0.291</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Miguel Cabrera<span style="mso-spacerun:yes"> </span></td> <td align="right">28</td> <td align="right">0.87</td> <td class="xl63" align="right">0.390</td> <td class="xl63" align="right">0.422</td> <td align="right">0.032</td> <td align="right">688</td> <td class="xl63" align="right">0.436</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Chase Utley<span style="mso-spacerun:yes"> </span></td> <td align="right">33</td> <td align="right">0.86</td> <td class="xl63" align="right">0.370</td> <td class="xl63" align="right">0.401</td> <td align="right">0.031</td> <td align="right">454</td> <td class="xl63" align="right">0.344</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Aubrey Huff<span style="mso-spacerun:yes"> 
</span></td> <td align="right">35</td> <td align="right">0.87</td> <td class="xl63" align="right">0.345</td> <td class="xl63" align="right">0.376</td> <td align="right">0.031</td> <td align="right">579</td> <td class="xl63" align="right">0.294</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Jed Lowrie<span style="mso-spacerun:yes"> </span></td> <td align="right">27</td> <td align="right">0.65</td> <td class="xl63" align="right">0.336</td> <td class="xl63" align="right">0.367</td> <td align="right">0.031</td> <td align="right">341</td> <td class="xl63" align="right">0.297</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15">Manny Ramirez<span style="mso-spacerun:yes"> </span></td> <td align="right">39</td> <td align="right">0.81</td> <td class="xl63" align="right">0.371</td> <td class="xl63" align="right">0.401</td> <td align="right">0.03</td> <td align="right">17</td> <td class="xl63" align="right">0.052</td> </tr> <tr style="height:15.0pt" height="15"> <td style="height:15.0pt" height="15"><br /></td> <td><br /></td> <td><br /></td> <td class="xl63"><br /></td> <td style="text-align: right;" class="xl63"><br /></td> <td style="text-align: right;"><br /></td> <td style="text-align: right;">TOTAL PA<br /></td> <td class="xl63"><br /></td> </tr> <tr style="height:15.0pt" height="15"> <td colspan="2" style="height:15.0pt;mso-ignore:colspan" height="15">weighted average</td> <td><br /></td> <td class="xl63" align="right">0.354</td> <td class="xl63" align="right">0.392</td> <td><br /></td> <td align="right">7119</td> <td class="xl63" align="right">0.353</td> </tr> </tbody></table><br />This does not include prospects who have not appeared in the Majors, such as Brandon Belt, who RotoChamp likes a lot (.385 projected wOBA). 
Technically, Marcel projects these players with a league-average wOBA, so they could be included, but whether or not RotoChamp's prospect insights add value is a separate issue from what I am looking at here. Other than Brandon Allen, these players all have fairly established Major League track records with plenty of data for a projection system to work with, and RotoChamp is still seeing them very differently from Marcel.<br /><br />The test here is simple. At the end of the year, will Marcel's or RotoChamp's estimates for these 20 players be better? Additionally, will this group significantly outperform its Marcel projections, even if the players still end up closer to Marcel than to RotoChamp? After all, RotoChamp could be finding something that Marcel is underselling in these players and still be over-weighting that insight to end up with overly optimistic projections.<br /><br />Most of the players on the list are in their 30s or 40s (all except Miguel Cabrera, Jed Lowrie, and Brandon Allen). Their average age is 34. Many, like Thome, Jose Bautista, and Andruw Jones, are coming off big 2010 seasons, while some, like Albert and Manny Ramirez, are big names projected for bigger numbers than their 2010s would indicate.</span>