Wednesday, July 12, 2017

Wednesday: No, it's fun to discover a massive problem with a database just when you want to go home.

But I think I was able to fix it with a redo, and changing all the "INSERT"s to "INSERT IGNORE"s.

And I wanted to make a proper version of the garbage scribble at the top of this post.

I really like the way R plots things.  It's reasonably simple:

library(ggplot2)

# Binomial probability of N successes in 177 trials at p = 0.24
b = data.frame(N = seq(0,177))
b$P = choose(177,b$N) * 0.24^b$N * (1 - 0.24)^(177 - b$N)

# Polygons for the shaded tails; the extra first and last rows pin each one to P = 0
L = rbind(c(0,0), subset(b, b$N <= 36),  c(36,0))
U = rbind(c(44,0), subset(b, b$N > 43), c(177,0))

# A trailing "+" is enough to continue the expression onto the next line
ggplot(b,aes(x = N, y = P, color="P_expect = 0.24")) +
  geom_polygon(data=L,aes(x = N, y = P, fill="Pobs")) +
  geom_polygon(data=U,aes(x = N, y = P, fill="Pover")) +
  geom_line() + geom_point()  +
  scale_colour_manual(values=c("black"),name="",
                      guide=guide_legend(override.aes=list(fill=NA))) +
  scale_fill_manual(values=c("red","blue"),name="")

ggsave("/tmp/2017b.png")

The only major issues are that doing the shading requires constructing a polygon object, and that polygon needs explicit endpoints down on the P = 0 baseline (or it closes along the wrong edge and shades the wrong way).  Getting all the colors and the legend set was also not super obvious, and that "override.aes" thing is just nonsense.
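The endpoint thing is easier to see with the plotting stripped away.  Here is a minimal sketch in plain Python, not the ggplot code above; the helper names binom_pmf and closed_polygon are made up for illustration.  It builds the same vertex list that the rbind() call builds for the lower tail: one extra point on the baseline at each end, so the outline closes along P = 0.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def closed_polygon(xs, ys, baseline=0.0):
    """Return (x, y) vertices that drop to the baseline at both ends.

    Hypothetical helper: it mirrors the rbind(c(0,0), ..., c(36,0))
    trick, adding one baseline point before and after the curve.
    """
    if not xs:
        return []
    return [(xs[0], baseline)] + list(zip(xs, ys)) + [(xs[-1], baseline)]


# Lower tail from the post: N = 0..36 out of 177 talks at p = 0.24.
xs = list(range(0, 37))
ys = [binom_pmf(k, 177, 0.24) for k in xs]
lower = closed_polygon(xs, ys)

print(lower[0], lower[-1])
```

Without the two extra points, the closing edge would run straight from the last curve point back to the first one, and the fill would cover the region above the curve instead of the one below it.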

"But it's not snek."  So then it was a challenge to see how to do this in snek, too.
Snek is simpler, but its documentation is far worse.  Examples should start slow, not alphabetically with "animation."  Why would you do that?

#!/usr/bin/env python3
import matplotlib.pyplot as plt
import scipy.special
import numpy as np

P_expect = 0.24
N_talks  = 177
N_obs    = 36
N_expect = 44

# np.arange() excludes its stop value, so add 1 to include N_talks itself
N = np.arange(0,N_talks + 1,1)
P = scipy.special.binom(N_talks,N) * P_expect**N * (1 - P_expect)**(N_talks - N)

# Lower portion: N <= N_obs, matching the R version
Nl = np.arange(0,N_obs + 1,1)
Pl = scipy.special.binom(N_talks,Nl) * P_expect**Nl * (1 - P_expect)**(N_talks - Nl)
Zl = Nl * 0.0

# Upper portion: N >= N_expect
Nu = np.arange(N_expect,N_talks + 1,1)
Pu = scipy.special.binom(N_talks,Nu) * P_expect**Nu * (1 - P_expect)**(N_talks - Nu)
Zu = Nu * 0.0

plt.grid()
label_text = "P_expect = %.2f" % P_expect
plt.fill_between(Nl,y1=Zl,y2=Pl,color="red",label="P_obs")
plt.fill_between(Nu,y1=Zu,y2=Pu,color="blue",label="P_over")
plt.plot(N,P,color="black",label=label_text)
plt.scatter(N,P,s=5,color="black")
plt.legend()
plt.savefig("mpl.png")

Having an explicit fill_between() function saved a lot of time.  The legend() function was also helpful for making it just work.
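The shaded regions are just partial sums of the same binomial terms, so the two tail probabilities can be computed without plotting anything.  This is a rough sketch using only the standard library, assuming the red region means P(N <= 36) and the blue region means P(N >= 44), as in the R version.  The names binom_pmf, p_obs, and p_over are mine, not from either plotting script.

```python
from math import comb

P_EXPECT = 0.24
N_TALKS = 177
N_OBS = 36
N_EXPECT = 44


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


# One term per possible count, 0 through N_TALKS inclusive.
pmf = [binom_pmf(k, N_TALKS, P_EXPECT) for k in range(N_TALKS + 1)]

# Assumed meaning of the shaded areas: lower tail up to and including
# N_OBS, upper tail from N_EXPECT onward.
p_obs = sum(pmf[:N_OBS + 1])
p_over = sum(pmf[N_EXPECT:])

print("P(N <= %d) = %.4f" % (N_OBS, p_obs))
print("P(N >= %d) = %.4f" % (N_EXPECT, p_over))
```

The slices make the inclusive endpoints explicit, which is the same off-by-one that range() and np.arange() invite.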

I'm still behind on my RSS stuff.

  • "Oh no!  This new Spider-Man movie confuses the timeline!"  I complain about stupid stuff, but this is too far.  Comic book timelines have been insane forever.  I mean, look at Squirrel Girl.  Doreen was 14 when she defeated Doom with Tony, then she did stuff for a few years with the GLA, then she was kind of in the Avengers, then she babysat Luke Cage/Jessica Jones' daughter, and now she's in college.  How old is she?  Why isn't she 39 if the comics follow regular time?  Doesn't matter, she's doing college now, enjoy your wonderful stories.  Comic book time is meaningless.
  • Wonder Woman.
  • Whoops.
  • Best Spider-Man.
  • Buffalo.
