Comments on 3-D Baseball: Win Expectancy and Leverage Index tables, R Code<br /><br />Feed author: Kincaid (http://www.blogger.com/profile/07348661324396474896). Last updated 2017-04-28T20:47:25.262-07:00. 4 comments.<br /><br />Comment, 2015-02-07T21:03:55.632-07:00:<br /><br />Sorry I didn't see this earlier. I am not sure how helpful these explanations will be, but hopefully they will at least start to make sense if you get a chance to play around with the functions and look at their outputs.<br /><br />create.run.dist():<br /><br />run.dist.simulation is a table of the probabilities of scoring a given number of runs in an inning from each base-out state. This table gives probabilities for scoring anywhere from 0-16 runs (there is nothing in the simulation limiting it to that range; I just cut it off at 16 runs to reduce the number of calculations needed). This distribution of run scoring only applies to runs scored through the end of the inning, though. To calculate win probabilities, we need the probability of scoring a given number of runs through the end of the game.<br /><br />What create.run.dist() does is take the distribution of run scoring through the end of the inning and add another inning on top of that. So instead of giving the probabilities for scoring anywhere from 0-16 runs by the end of this inning, it takes those probabilities and turns them into the probabilities of scoring anywhere from 0-32 runs by the end of the next inning (although it looks like I also limited this to 0-30 runs in the win probability calculations to further reduce calculation time). 
And then you can take the run distribution for the next two innings and feed that back into the function to add another inning, and it gives you the run distribution over the next three innings, etc.<br /><br />create.run.dist() only creates run distributions from the start of an inning through the end of the game.<br /><br />create.run.dist.2() does the same thing, but it calculates the run distribution from any base-out state through the end of the game rather than just from the start of the inning through the end of the game. The extra parameter "h" is a number from 1 to 24 that identifies the base-out state. (The reason create.run.dist() also exists is that the distributions from the start of an inning through the end of the game are used to feed into create.run.dist.2() later on.)<br /><br />diag.sum():<br /><br />diag.sum() is a helper function that is used in conjunction with the two create.run.dist() functions. create.run.dist() and create.run.dist.2() don't actually return a single row of probabilities for scoring each number of runs. Rather, they return a table of data, with different rows giving probabilities for different combinations that lead to a given number of runs.<br /><br />For example, say we want to know the probability of scoring exactly 1 run through the end of the game. Because of how create.run.dist() works, it will give us a table where one row gives us the probability of scoring 1 run in this inning and 0 runs for the rest of the game. And then another row will give us the probability of scoring 0 runs this inning and then 1 run for the rest of the game. To get the total probability of scoring 1 run through the end of the game, we have to add those two probabilities together.<br /><br />Fortunately, the different combinations that lead to the same number of runs end up on the same diagonal in the table returned by create.run.dist(). 
This means that to get the total probabilities, we just have to sum up the diagonals of that table, which is what this function does, hence the name "diag.sum()".<br /><br />Author: Kincaid (https://www.blogger.com/profile/07348661324396474896)<br /><br />Comment, 2015-01-21T13:36:18.015-07:00:<br /><br />Thanks, this is great. Could you tell me what the functions create.run.dist, create.run.dist.2, and diag.sum are doing exactly? There isn't much commenting, so I am a little confused.<br /><br />Author: Sam Sharpe (https://www.blogger.com/profile/18225012525613318422)<br /><br />Comment, 2013-01-21T10:53:57.010-07:00:<br /><br />I did not get the code exactly yet. Does this mean that we can use this in our <a href="https://www.rogersbreakawaybase.com/rogers-bases" rel="nofollow">baseball field equipment</a> too? Like a code organizer to file things properly.<br /><br />Author: Anonymous<br /><br />Comment, 2012-03-17T12:28:45.450-07:00:<br /><br />Thanks for the code! When I was reading the article, one thing that came to mind was to use neither LI nor boLI for your hypothetical situation where the closer doesn't need to be saved for tomorrow. Instead, you could simulate a bunch of games and see how likely it would be that there would be a better situation for the closer to come in.<br /><br />Using your code as a starting point, I wrote a function to simulate a game from any starting point you want. Then I kept track of how many times the starting point had the highest LI for that team.<br /><br />I ran the function 10,000 times for the example in the article, and that point was the highest LI for the game in 49.96% of the games. 
The highest LI in the game was less than 2 in 83.61% of the games.<br /><br />This seems to indicate that you should use your closer in this situation, but I am not sure. Maybe a way to make this more thorough would be to calculate how much putting in the closer improves the probability of winning in this situation versus in the other potentially important situations (weighted by the probability of reaching those situations).<br /><br />Author: Andrew (https://www.blogger.com/profile/16603453765851716066)
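
For anyone trying to follow Kincaid's explanation of create.run.dist() and diag.sum() above, here is a rough sketch of the idea. It is written in Python, not the R from the post, and it is not the post's actual code. The names outer_table, diag_sum, and add_inning are hypothetical, and the three-entry run distribution is made up for illustration. It also assumes that runs scored this inning are independent of runs scored in the rest of the game, so each combination's probability is a simple product.

```python
# Illustrative sketch only: NOT the R code from the post. All names and
# numbers here are assumptions chosen to mirror the explanation in the thread.

def outer_table(this_inning, rest_of_game):
    """table[i][j] = P(i runs this inning) * P(j runs in the rest of the game).

    Assumes the two run totals are independent.
    """
    return [[p * q for q in rest_of_game] for p in this_inning]


def diag_sum(table):
    """Sum the anti-diagonals of the table.

    Every cell with i + j == k is one way of scoring k runs in total,
    so adding up each anti-diagonal gives the total probability of k runs.
    """
    n_rows, n_cols = len(table), len(table[0])
    totals = [0.0] * (n_rows + n_cols - 1)
    for i, row in enumerate(table):
        for j, prob in enumerate(row):
            totals[i + j] += prob
    return totals


def add_inning(this_inning, rest_of_game, max_runs=None):
    """Stack one more inning on top of an existing run distribution.

    If max_runs is given, the distribution is cut off at that many runs
    (the dropped tail is simply discarded, not renormalized).
    """
    dist = diag_sum(outer_table(this_inning, rest_of_game))
    if max_runs is not None:
        dist = dist[: max_runs + 1]
    return dist


if __name__ == "__main__":
    # Made-up probabilities of scoring 0, 1, or 2 runs in one inning.
    one_inning = [0.7, 0.2, 0.1]

    # Two innings, then feed the result back in to get three innings.
    two_innings = add_inning(one_inning, one_inning)
    three_innings = add_inning(one_inning, two_innings)

    print([round(p, 4) for p in two_innings])
    # prints [0.49, 0.28, 0.18, 0.04, 0.01]
    print(len(three_innings))  # 7 entries: 0 through 6 runs
```

In the two-inning output, the 0.28 for exactly 1 run is the sum of the two cells described in the thread: 1 run now and 0 later, plus 0 runs now and 1 later. The optional max_runs cutoff plays the same role as the 16-run and 30-run limits mentioned above, though how the original code handles the truncated tail is not stated in the thread.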