See chapter 24. Improving performance in “Advanced R” by Hadley Wickham.
See also the first few pages of chapter 2, "Introduction to NumPy,"
in "Python Data Science Handbook" by Jake VanderPlas.
Note that for improving performance the book mostly covers vectorization.
The other key thing is parallel computing.
In the simulation code (for the simple logit model) I had a loop over n.
Vectorize this code and time it to check that it is faster.
Is there a large enough n where it really is faster?
This code is so simple that it may not make much of a difference.
Wickham emphasizes that vectorized code can be simpler and more elegant.
But sometimes I like to just write a loop since it looks simple to me.
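The loop-versus-vectorized comparison can be sketched as follows. This is a minimal illustration, not the original simulation code: the true parameter values, sample size, and function names here are all invented for the example.

```python
import numpy as np
import time

rng = np.random.default_rng(0)
n = 100_000
b0, b1 = -1.0, 2.0            # hypothetical "true" intercept and slope
x = rng.normal(size=n)

# Loop version: simulate each y_i one at a time.
def simulate_loop(x, b0, b1, rng):
    y = np.empty(len(x))
    for i in range(len(x)):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x[i])))
        y[i] = rng.random() < p
    return y

# Vectorized version: one array expression for all n draws.
def simulate_vec(x, b0, b1, rng):
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    return (rng.random(len(x)) < p).astype(float)

t0 = time.perf_counter(); simulate_loop(x, b0, b1, rng); t_loop = time.perf_counter() - t0
t0 = time.perf_counter(); simulate_vec(x, b0, b1, rng); t_vec = time.perf_counter() - t0
print(f"loop: {t_loop:.4f}s   vectorized: {t_vec:.4f}s")
```

At this n the vectorized version should win easily; try shrinking n to see where the difference stops mattering.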
I fixed the intercept at the mle and then varied the slope on a grid of values to check that the minimum of -logL was at the mle.
Flip this: do a grid of intercept values, fixing the slope at the mle.
Now you can believe that the logit software computes the mle.
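The flipped check can be sketched like this: fit the mle, then evaluate -logL on a grid of intercepts with the slope pinned at its mle. The data-generating values and grid range below are assumptions made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 5_000
b_true = np.array([-1.0, 2.0])        # hypothetical (intercept, slope)
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(b_true[0] + b_true[1] * x)))
y = (rng.random(n) < p).astype(float)

def neg_logL(beta):
    eta = beta[0] + beta[1] * x
    # logit log-likelihood: sum of y*eta - log(1 + e^eta)
    return -np.sum(y * eta - np.logaddexp(0.0, eta))

mle = minimize(neg_logL, x0=np.zeros(2)).x

# Grid over the intercept, slope fixed at its mle.
grid = np.linspace(mle[0] - 1.0, mle[0] + 1.0, 201)
vals = np.array([neg_logL([b0, mle[1]]) for b0 in grid])
print("grid minimizer:", grid[vals.argmin()], "   mle intercept:", mle[0])
```

The grid minimizer should sit on top of the mle intercept (up to grid spacing), which is the sanity check the exercise asks for.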
Draw repeated simulations of the logit data.
For each simulation, compute the mle.
What are the sampling properties of the mle?
Is it unbiased?
Can you vectorize this loop?
Could you do this with parallel computing?
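The Monte Carlo study above can be sketched as below: repeatedly simulate logit data, refit by maximum likelihood, and summarize the draws of the mle. The sample size, number of simulations, and true parameters are assumptions for illustration; the inner loop over simulations is an obvious candidate for parallelization (e.g. with `multiprocessing` or `joblib`), since the replications are independent.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
b_true = np.array([-1.0, 2.0])        # hypothetical truth
n, n_sim = 2_000, 200

def fit_one(rng):
    # Simulate one logit data set and return its mle.
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-(b_true[0] + b_true[1] * x)))
    y = (rng.random(n) < p).astype(float)
    def neg_logL(beta):
        eta = beta[0] + beta[1] * x
        return -np.sum(y * eta - np.logaddexp(0.0, eta))
    return minimize(neg_logL, x0=np.zeros(2)).x

mles = np.array([fit_one(rng) for _ in range(n_sim)])
print("mean of mles:", mles.mean(axis=0), "   truth:", b_true)
print("sd of mles:  ", mles.std(axis=0))
```

Comparing the mean of the mles to the truth addresses the bias question; the standard deviations describe the sampling distribution.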
We computed -logL on a two-way grid.
Plot -logL versus the (beta0, beta1) = (intercept, slope) values of the two-way grid.
This is the only hard problem!!
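One way to sketch the two-way grid and its plot is below: evaluate -logL at every (beta0, beta1) pair and draw contours. The data generation, grid ranges, and output filename are all invented for this illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")                  # render to a file, no display needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
n = 2_000
b0_true, b1_true = -1.0, 2.0           # hypothetical truth
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(b0_true + b1_true * x)))
y = (rng.random(n) < p).astype(float)

b0_grid = np.linspace(-2.0, 0.0, 80)
b1_grid = np.linspace(1.0, 3.0, 80)

# -logL at every (b0, b1) pair; rows index the slope, columns the intercept.
negL = np.array([[-np.sum(y * (a + b * x) - np.logaddexp(0.0, a + b * x))
                  for a in b0_grid] for b in b1_grid])

B0, B1 = np.meshgrid(b0_grid, b1_grid)
fig, ax = plt.subplots()
ax.contour(B0, B1, negL, levels=30)
ax.set_xlabel("beta0 (intercept)")
ax.set_ylabel("beta1 (slope)")
ax.set_title("-logL over the (beta0, beta1) grid")
fig.savefig("negL_contour.png")
```

The contour minimum should sit near the mle (and, with this much data, near the truth), which makes the surface plot a visual version of the one-dimensional grid checks above.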