I pulled it from a paper in grad school, and have been using it ever since because it's just too useful not to. Start with the assumption that you're sampling from a Gaussian distribution with unknown mean and variance. Next, add an unknown amount of contamination, so that you get really wonky values every so often. If you calculate just the regular sample mean and sample variance, you can get bad answers because those wonky values skew both statistics badly. Instead, substitute the median for the mean, and you've got a pretty decent estimate of the center of the distribution. Now, for the standard deviation, take the interquartile distance (IQD, the distance between the 75th percentile and the 25th percentile) and multiply it by 0.7413. You come up with this number by noting that for a Gaussian distribution, sigma = IQD / (norminv(0.75) - norminv(0.25)), and 1 / (norminv(0.75) - norminv(0.25)) = 1 / 1.349 = 0.7413. I've been calling this Q[uartile]-sigma in all my code since reading about it.
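For the curious, here's roughly what that looks like in code. This is a minimal sketch rather than my actual code: the names `qsigma` and `robust_center` are made up for illustration, it uses only Python's standard `statistics` module, and it assumes the "inclusive" quartile convention (other percentile conventions give slightly different quartiles on small samples).

```python
from statistics import NormalDist, median, quantiles

# Scale factor from the text: 1 / (norminv(0.75) - norminv(0.25)), about 0.7413.
_STD_NORMAL = NormalDist()
QSIGMA_SCALE = 1.0 / (_STD_NORMAL.inv_cdf(0.75) - _STD_NORMAL.inv_cdf(0.25))


def robust_center(values):
    """Estimate the center of the distribution with the median."""
    return median(values)


def qsigma(values):
    """Estimate sigma as the scaled interquartile distance.

    Assumes the uncontaminated data are Gaussian and that there are
    at least two values.
    """
    # quantiles(..., n=4) returns the 25th, 50th and 75th percentiles.
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    return QSIGMA_SCALE * (q75 - q25)
```

On clean Gaussian data this should land close to the ordinary sample standard deviation. The difference only shows up once the wonky values arrive.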
Although, now that I look at this, I realize that that's basically just calculating the MAD (median absolute deviation) estimate of sigma in a different way. And look, there on the wiki page, it says the same thing. Well.
The point being, if you have values {1,2,3,4,5,6,7,8,9000}, then the mean is going to be horribly skewed by a contamination of just 11% of the data (one value out of nine). Ditto with the regular standard deviation. However, using Qsigma (or the MAD estimate, since they're apparently equivalent, at least for symmetric data) lets you have up to ~50% crap data before the statistic starts to break down, if you're lucky and have it equally bad in both directions. Otherwise Qsigma only tolerates 25%, since that's when the bad values reach one of the quartiles. The MAD estimate holds out to ~50% either way.
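Here's that example worked through as a quick sanity check. Again a sketch under assumptions: the helper names `qsigma` and `mad_sigma` are hypothetical, the quartiles use the standard library's "inclusive" convention, and the MAD scale factor 1 / norminv(0.75) (about 1.4826) is the usual Gaussian consistency constant rather than anything from the paper.

```python
from statistics import NormalDist, mean, median, quantiles, stdev

Z75 = NormalDist().inv_cdf(0.75)  # norminv(0.75), about 0.6745


def qsigma(values):
    """Scaled interquartile distance, i.e. about 0.7413 * IQD."""
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    # norminv(0.75) - norminv(0.25) equals 2 * norminv(0.75) by symmetry.
    return (q75 - q25) / (2 * Z75)


def mad_sigma(values):
    """Scaled median absolute deviation, i.e. about 1.4826 * MAD."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    return mad / Z75


data = [1, 2, 3, 4, 5, 6, 7, 8, 9000]

print("mean      ", mean(data))                 # 1004
print("stdev     ", f"{stdev(data):.1f}")       # 2998.5
print("median    ", median(data))               # 5
print("Qsigma    ", f"{qsigma(data):.3f}")      # 2.965
print("MAD sigma ", f"{mad_sigma(data):.3f}")   # 2.965
```

The one bad value drags the mean to 1004 and the standard deviation to about 3000, while the median stays at 5 and both robust sigmas come out at about 2.965.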
Anyway, here's a squirrel:
Walnuts!
A biker having a very bad day:
Deer probably isn't too happy about it either.
And some books:
Books!
The continuation of yesterday's dinner is that:
- You cannot buy pints of regular milk apparently.
- Little kid 8oz milk-boxes don't expect you to not use the straw.
- My mac&cheese did just need a bit more milk to smooth out.
- I cook everything en papillote nowadays.