So the most disturbing assumption we’ve made so far is in always taking the extra base. This is an easily fixable problem, and it’s about time it was fixed.
If we look back on the pure distributions, the only one that needs fixing is the singles model. Here, we have to subtract out a run every time three singles don’t score, or, in actuality, only when the last three singles don’t score a run. So we need to take the probability that at least three singles were hit, which is (1-(p’)^3-3*p*(p’)^3-6*p^2*(p’)^3), and then multiply that by how often three singles won’t make a run, to find what we need to subtract from the distribution we’ve already established. How often WILL three singles fail to produce a run? Well, I’m going to start by assuming that the probability of advancing from first to third is the same regardless of who’s batting or running, what inning it is, how many outs there are (fewer than three, obviously), etc. I’ll do the same for the probability of scoring from second. Let’s call these P13 and P2H respectively. Okay, the only way to not score on three singles is for the lead runner to both not go from first to third on the second single and then not score from second on the third. This happens with probability (1-P13)*(1-P2H) = 1-P13-P2H+P13*P2H.
So the probability that you DO score on three singles, if we’re curious, is simply that subtracted from one: P13+P2H-P13*P2H. Anyway, what we need to subtract from the distribution we had before is the product of the two pieces:
(1-(p’)^3-3*p*(p’)^3-6*p^2*(p’)^3)*(1-P13-P2H+P13*P2H)
where, once again, p is OBP, p’ is 1-OBP, P13 is the probability of a runner going first to third on a single, and P2H is the probability of scoring from second on a single.
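For anyone who wants to plug numbers in, here is a minimal Python sketch of that correction term. The function names are mine, not part of the model, and the inputs in the example call are made up purely for illustration, not measured rates.

```python
def prob_at_least_three_singles(p: float) -> float:
    """Chance that at least three singles are hit before the third out."""
    q = 1.0 - p  # this is p' in the text
    return 1.0 - q**3 - 3 * p * q**3 - 6 * p**2 * q**3


def prob_three_singles_no_run(p13: float, p2h: float) -> float:
    """Lead runner fails to go first to third, then fails to score from second."""
    return (1.0 - p13) * (1.0 - p2h)


def singles_correction(p: float, p13: float, p2h: float) -> float:
    """Amount to subtract from the always-take-the-extra-base singles model."""
    return prob_at_least_three_singles(p) * prob_three_singles_no_run(p13, p2h)


if __name__ == "__main__":
    # Illustrative inputs only (hypothetical values).
    print(singles_correction(p=0.25, p13=0.3, p2h=0.6))
```

Note that if either P13 or P2H is 1, the correction vanishes and we are back to the original always-take-the-extra-base model, which is a handy sanity check.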
Okay, what about the overall Homogeneous distribution? Largely, it’s going to be the same. The only changes, in fact, will be in the parts that I call the 2S and 3S adjustments. Here, you need to prune some stuff off, similar to what we did above. This also means we need to treat triples separately from doubles, and add another advancement parameter, for scoring from 1st on a double, which I’ll call P1H. Okay, so the variables are as follows:
O = On-Base-Percentage
W = Walk rate (BB/PA)
S = Singles rate (1B/PA)
D = Doubles rate (2B/PA)
T = Triples rate (3B/PA)
H = Home run rate (HR/PA)
P13 = Probability of going 1st to 3rd on a single
P2H = Probability of scoring from 2nd on a single
P1H = Probability of scoring from 1st on a double.
So the new 2S adjustment works out to be:
6*(1-O)^3*(O*H*2+O*T*1+(D+T+H)*D*1+(W+S)*D*P1H+(T+H)*S*1+D*S*P2H+H*W*1)
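Since that expression is easy to mistype, here is a rough Python transcription of the 2S adjustment exactly as written above. The function name and the sample rates are hypothetical, and the example assumes O is just W+S+D+T+H (ignoring hit-by-pitch and the like), which is my simplification for the demo only.

```python
def two_s_adjustment(O, W, S, D, T, H, P1H, P2H):
    """2S adjustment, transcribed term by term from the formula in the text."""
    bracket = (
        O * H * 2            # anyone on base, then a home run
        + O * T * 1          # anyone on base, then a triple
        + (D + T + H) * D * 1
        + (W + S) * D * P1H  # runner on first must score on the double
        + (T + H) * S * 1
        + D * S * P2H        # runner on second must score on the single
        + H * W * 1
    )
    return 6 * (1 - O) ** 3 * bracket


if __name__ == "__main__":
    # Made-up rates for illustration; O is assumed to be their sum here.
    W, S, D, T, H = 0.08, 0.15, 0.05, 0.005, 0.03
    O = W + S + D + T + H
    print(two_s_adjustment(O, W, S, D, T, H, P1H=0.4, P2H=0.6))
```

Setting P1H and P2H to 1 should reproduce the old always-take-the-extra-base version of this term, so comparing the two gives a feel for how much the advancement parameters matter.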
The new 3S adjustment works out to:
(1-(1-O)^3*(1+3*O+6*O^2))*(O*O*H*3+O*O*T*2+O*(S+W)*2*(1+P1H)+O*(D+T+H)*2+O*(T+H)*S*(1+P2H)+(D+T+H)*S*S*1+(W+S)*S*S*(P13+P2H-P13*P2H)+O*H*W*2+O*T*W*1+(D+T+H)*D*W*1+(W+S)*D*W*P1H+(T+H)*S*W*1+D*S*W*P2H+(W+S)*S*W*0+H*W*W*1+(W+S+D+T)*W*W*0)
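The leading factor here has the same shape as the at-least-three-singles probability from the singles model, just with O in place of p. As a quick check on that closed form, here is a small Python sketch that compares it against the direct sum, assuming (as the model does) that every plate appearance is independently a time on base with probability O or an out with probability 1-O. The function names and the cutoff max_k are mine.

```python
from math import comb


def at_least_three_closed_form(O: float) -> float:
    """Leading factor of the 3S adjustment, as written in the text."""
    return 1 - (1 - O) ** 3 * (1 + 3 * O + 6 * O**2)


def at_least_three_series(O: float, max_k: int = 2000) -> float:
    """Sum P(exactly k batters reach before the third out) for k >= 3.

    Each term is C(k+2, 2) * O^k * (1-O)^3; the sum is truncated at max_k.
    """
    return sum(comb(k + 2, 2) * O**k * (1 - O) ** 3 for k in range(3, max_k))


if __name__ == "__main__":
    for obp in (0.30, 0.33, 0.36):
        print(obp, at_least_three_closed_form(obp), at_least_three_series(obp))
```

The two columns should agree to many decimal places for any realistic OBP, since the truncated tail is vanishingly small.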
And if we add these to the unchanged 1S adjustment and walk distribution, we get our new equation:
(6*O^6-18*O^5+15*O^4)/(1-O) + 3*H*(1-O)^3 + 6*(1-O)^3*(O*H*2+O*T*1+(D+T+H)*D*1+(W+S)*D*P1H+(T+H)*S*1+D*S*P2H+H*W*1) + (1-(1-O)^3*(1+3*O+6*O^2))*(O*O*H*3+O*O*T*2+O*(S+W)*2*(1+P1H)+O*(D+T+H)*2+O*(T+H)*S*(1+P2H)+(D+T+H)*S*S*1+(W+S)*S*S*(P13+P2H-P13*P2H)+O*H*W*2+O*T*W*1+(D+T+H)*D*W*1+(W+S)*D*W*P1H+(T+H)*S*W*1+D*S*W*P2H+(W+S)*S*W*0+H*W*W*1+(W+S+D+T)*W*W*0)
Now, if you want to include advancing on outs, that gets more complicated yet... but still do-able. A later post, perhaps, but it will be an enormously long equation.