Vectorize Simulation

See chapter 24, “Improving performance”, in “Advanced R” by Hadley Wickham.

See also the beginnings of chapters 5 and 6 of “Python Data Science Handbook”, 2nd edition, by Jake VanderPlas.

Note that for improving performance the Wickham book just has

The other key thing is parallel computing.

The simulation code I had (for the simple logit model) used a loop over n.
Vectorize this code and time it to check that it is faster.
Is there a large n where it really is faster?
This code is so simple, it may not make much of a difference.
Wickham emphasizes that vectorized code can be simpler and more elegant.
But sometimes I like to just write a loop since it looks simple to me.
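Here is a minimal sketch of the loop-versus-vectorized comparison, written in Python with NumPy. The class code is not reproduced here, so the function names (sim_logit_loop, sim_logit_vec), the true parameter values, the seeds, and the values of n are all assumptions made for illustration.

```python
import time
import numpy as np

def sim_logit_loop(x, beta0, beta1, rng):
    """Draw y_i ~ Bernoulli(p_i) one observation at a time."""
    n = len(x)
    y = np.empty(n, dtype=int)
    for i in range(n):
        p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x[i])))
        y[i] = int(rng.uniform() < p)
    return y

def sim_logit_vec(x, beta0, beta1, rng):
    """Draw all y_i at once with array operations."""
    p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))
    return (rng.uniform(size=len(x)) < p).astype(int)

beta0, beta1 = -1.0, 2.0  # assumed true parameters, not from the notes
timings = {}
for n in (100, 10_000, 100_000):
    x = np.random.default_rng(0).normal(size=n)

    t0 = time.perf_counter()
    y_loop = sim_logit_loop(x, beta0, beta1, np.random.default_rng(1))
    t1 = time.perf_counter()
    y_vec = sim_logit_vec(x, beta0, beta1, np.random.default_rng(1))
    t2 = time.perf_counter()

    timings[n] = (t1 - t0, t2 - t1)
    print(f"n={n:>7}: loop {t1 - t0:.4f}s, vectorized {t2 - t1:.4f}s")
```

Timings depend on the machine, so run it a few times before drawing conclusions about small n.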

Check the mle for the intercept

I fixed the intercept at the mle and then varied the slope on a grid of values to check that the min of -logL was at the mle.

Flip this, and do a grid of intercept values, fixing the slope at the mle.

Now you can believe that the logit software really computes the mle.
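A sketch of the intercept check in Python with NumPy. The function fit_logit below is a hypothetical stand-in for "the logit software" (a plain Newton-Raphson fit), and the simulated data, seed, and grid width are assumptions. The grid is centered on the fitted intercept, with the slope held at its mle.

```python
import numpy as np

def fit_logit(x, y, n_iter=25):
    """Newton-Raphson for the simple logit model; returns (b0_hat, b1_hat)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                      # score of logL
        hess = (X * (p * (1 - p))[:, None]).T @ X  # minus the Hessian of logL
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (assumed true values, for illustration only)
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * x)))
y = (rng.uniform(size=n) < p_true).astype(float)

b0_hat, b1_hat = fit_logit(x, y)

# Grid of intercept values around the mle, slope fixed at b1_hat
b0_grid = b0_hat + np.linspace(-1.0, 1.0, 201)
eta = b0_grid[:, None] + b1_hat * x               # shape (201, n)
# -logL = sum(log(1 + exp(eta)) - y * eta); logaddexp avoids overflow
nll = np.sum(np.logaddexp(0.0, eta) - y * eta, axis=1)

b0_at_min = b0_grid[np.argmin(nll)]
print("mle intercept:", b0_hat, " grid minimizer:", b0_at_min)
```

The grid minimizer should agree with the fitted intercept up to the grid spacing (0.01 here).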

Sampling Experiment

Draw repeated simulations of the logit data.
For each simulation, compute the mle.

Probably best if you fix the x values, but that is up to you.

What are the sampling properties of the mle?
Is it unbiased?

Can you vectorize this loop?
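One possible way to vectorize the experiment, sketched in Python with NumPy: draw all R simulated samples as one (R, n) array, then run Newton-Raphson on all R fits at the same time. The values of n, R, the true parameters, and the seed are assumptions, and the batched Newton fit is a swapped-in technique rather than the class code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
beta_true = np.array([-1.0, 2.0])   # assumed true (intercept, slope)
n, R = 200, 500                     # sample size, number of simulations

x = rng.normal(size=n)              # x values drawn once, then held fixed
X = np.column_stack([np.ones(n), x])
p_true = sigmoid(X @ beta_true)

# All R samples of Y|x at once: row r is one simulated data set
Y = (rng.uniform(size=(R, n)) < p_true).astype(float)

# Batched Newton-Raphson: row r of B is the current estimate for sample r
B = np.zeros((R, 2))
for _ in range(25):
    P = sigmoid(B @ X.T)                                   # (R, n)
    grad = (Y - P) @ X                                     # (R, 2) scores
    hess = np.einsum("rn,ni,nj->rij", P * (1 - P), X, X)   # (R, 2, 2)
    B += np.linalg.solve(hess, grad[..., None])[..., 0]

score = (Y - sigmoid(B @ X.T)) @ X   # should be near zero at the mle
print("mean of estimates:", B.mean(axis=0))
print("sd of estimates:  ", B.std(axis=0, ddof=1))
```

Comparing the mean of the estimates with beta_true gives a rough check on bias; repeating with a larger n shows how the answer changes with sample size.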

Plot the sampling distribution of P(Y=1|x)

In class we plotted P(Y=1|x) vs x at the mle.

Can you overlay this curve for repeated samples of Y|x to see how much it varies?

Do our estimates of P(Y=1|x) look unbiased at each x value?
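A sketch of how the fitted curves could be collected for this plot, in Python with NumPy. It repeats the simulation and fit for R samples with fixed x, evaluates each fitted P(Y=1|x) on a grid, and summarizes the curves pointwise. The design (equally spaced x), n, R, the true parameters, and the seed are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
beta_true = np.array([-1.0, 2.0])   # assumed true (intercept, slope)
n, R = 200, 500

x = np.linspace(-2.0, 2.0, n)       # fixed x values
X = np.column_stack([np.ones(n), x])
Y = (rng.uniform(size=(R, n)) < sigmoid(X @ beta_true)).astype(float)

# Fit all R samples with batched Newton-Raphson (row r of B is one mle)
B = np.zeros((R, 2))
for _ in range(25):
    P = sigmoid(B @ X.T)
    grad = (Y - P) @ X
    hess = np.einsum("rn,ni,nj->rij", P * (1 - P), X, X)
    B += np.linalg.solve(hess, grad[..., None])[..., 0]

# One fitted curve per sample, evaluated on a common grid of x values
x_grid = np.linspace(-2.0, 2.0, 41)
curves = sigmoid(B[:, [0]] + B[:, [1]] * x_grid)           # (R, 41)
true_curve = sigmoid(beta_true[0] + beta_true[1] * x_grid)

# Pointwise summaries of the sampling distribution at each x
mean_curve = curves.mean(axis=0)
lo, hi = np.quantile(curves, [0.025, 0.975], axis=0)
max_gap = np.max(np.abs(mean_curve - true_curve))
print("largest |mean fitted - true| over the grid:", max_gap)
```

Plotting each row of curves against x_grid in a light color (for example with matplotlib's plt.plot), with true_curve and mean_curve drawn on top, shows the variation; the gap between mean_curve and true_curve at each x is the thing to look at for bias.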