Where are we and what should I be doing?


9. August 10, 2024:

Finished the course!!

Note that we covered a lot of stuff in the last class,
so trees and logistic regression are not on the final.

To prepare for the final, finish and review all of the homework and go over the two old finals on the webpage.
You can skip the last question in the section 5 homework on trees if you like (but I really like it!!).

Final: available as a Canvas quiz from 9am Friday August 16 to 11:59pm Sunday August 18, with a 3 hour time limit.

There are review sessions Tuesday August 13 and Thursday August 15 at 8pm.

Of course you can email questions to Percy or Rob.


8. August 3, 2024:

In both sections we got to the beginning of the section on trees.

Quiz 3 will be available from 9am Tuesday August 6 to midnight Thursday August 8.
To prepare for the quiz, do the first three questions of the section 5 homework.
Quiz 3 from last summer is also on the webpage.

Old Finals

Old finals from 2012 and 2007 are posted.
You might want to start working through these finals to prepare for our final.
2007 final: skip Question 9 parts 9, 15, and 17.
For Question 7 parts (c) and (d) you would need to use prop.test in R, as in the Section 2 homework, Q10 (a).
2012 final: skip Question 2. Question 7 parts 3 and 4 would need prop.test in R.
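
For the parts that need prop.test, a minimal sketch of the call is below. The counts (56 successes out of 100) and the null value p = 0.5 are made up for illustration and are not taken from either final.

```r
# Hypothetical data: 56 "successes" in 100 trials (not from the old finals).
x <- 56
n <- 100

# Test H0: p = 0.5 and get a 95% confidence interval for p.
out <- prop.test(x, n, p = 0.5, conf.level = 0.95)

out$estimate   # sample proportion, x / n
out$conf.int   # 95% confidence interval for p
out$p.value    # p-value for H0: p = 0.5
```

Note that prop.test applies a continuity correction by default, so its interval and p-value can differ a little from a hand computation with the normal approximation.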


7. July 27, 2024:

We finished the Section 4 notes on Multiple Regression.

Do all the homework for multiple regression (Section 4).

For Excel, there is a video on running a multiple regression with a dummy variable
in the Section 4 part of the webpage: Multiple regression in Excel with dummy variables.

In R, the link "Simple Multiple Regression in R" does multiple regression with categorical variables.
See also "Prediction and dummies in R".
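
As a rough sketch of what a regression with a dummy variable looks like in R (this is not the code behind the links above): the data are simulated, and the names homes, price, size, and brick are hypothetical.

```r
# Simulated data; the variable names and coefficients are made up.
set.seed(1)
n <- 100
size  <- runif(n, 1.5, 2.5)                                  # numeric x
brick <- factor(sample(c("No", "Yes"), n, replace = TRUE))   # categorical x
price <- 20 + 50 * size + 15 * (brick == "Yes") + rnorm(n, sd = 5)
homes <- data.frame(price, size, brick)

# lm() turns the factor into a 0/1 dummy automatically ("No" is the baseline).
fit <- lm(price ~ size + brick, data = homes)
summary(fit)$coefficients

# Predict the price of a new brick house with size 2.
newhome <- data.frame(size = 2,
                      brick = factor("Yes", levels = levels(homes$brick)))
predict(fit, newhome)
```

The coefficient labeled brickYes is the estimated difference in price between brick and non-brick houses of the same size.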


There is also a link to the R script plot-midcity-with-N-and-B.R which
does the plotting with categorical B=Brick, N=neighborhood that we looked at in the notes.
The plotting is done in the R-package ggplot2, but you could do the same thing in base R.

For additional reading see chapter 18 in Irizarry.


6. July 20, 2024:

Midterm.

5. July 13, 2024:

We finished the Section 3, Simple Linear Regression Notes.


Homework
Finish all the homework on the first three sections.
R users might also look at problem 3 which is on the predictive interval.
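
For R users, here is a minimal sketch of getting a predictive interval from a fitted simple linear regression. The data are simulated and the variable names are hypothetical; this is not the solution to problem 3.

```r
# Simulated data: y = 3 + 2x + noise (made up for illustration).
set.seed(2)
x <- runif(50, 0, 10)
y <- 3 + 2 * x + rnorm(50, sd = 1)
fit <- lm(y ~ x)

# 95% predictive interval for a new y at x = 5.
new <- data.frame(x = 5)
pred_int <- predict(fit, new, interval = "prediction", level = 0.95)

# For comparison: 95% confidence interval for the mean of y at x = 5.
conf_int <- predict(fit, new, interval = "confidence", level = 0.95)

pred_int   # columns: fit, lwr, upr
conf_int
```

The predictive interval is wider than the confidence interval because it also accounts for the noise in a single new observation.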

We have the midterm this week, no class July 19-20!!!
The midterm covers the first three sections of notes.
The midterm exam will be open from 9am Friday July 19 to 11:59pm Sunday July 21, with a 2 hour time limit.
The test is open notes (anything on our class webpage) and book (whatever books you may be using) but you cannot interact with a person or use the internet.
You can use software like Excel or R to do computations and statistics.


Review Sessions on zoom
Wednesday, July 17th 7pm.
The review session will be recorded.


To prepare for the midterm:
Do all the homework suggested above and review all the past homework.

Review Quiz 1 and Quiz 2.

From the old midterms:
(Note that we are not doing things in the same order as some of the old classes so don't look at all of the problems!)
2023 midterm: all problems.
2017 midterm: problems 2,4,5,6,7.
2016 midterm: problems 2,4,5,6,7.


Note:
You do not need to be able to run a regression for the midterm,
but you do need to be able to run regressions to do the homework.
Examples of running simple linear regressions and doing scatter plots are on the webpage in both R and excel.
In excel check out the video "Simple Regression and Scatter plot in Excel" in the excel section of the webpage.
In R, check out the link "simple-data-analysis.R" in the R section of the webpage and "Simple Linear Regression in R"
in the Section 3 notes part of the webpage.
If you have questions about running regressions in Excel, contact Percy.
If you have questions about running regressions in R, contact Percy or Rob.
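
If you just want to see the shape of it in R, a minimal sketch of a scatter plot plus a simple linear regression follows. The data are simulated and the names size and price are hypothetical; the scripts on the webpage work with the real class data.

```r
# Simulated data (made up for illustration).
set.seed(3)
size  <- runif(60, 1, 3)
price <- 30 + 40 * size + rnorm(60, sd = 8)

# Scatter plot of y versus x.
plot(size, price, xlab = "size", ylab = "price")

# Fit the simple linear regression and add the fitted line to the plot.
fit <- lm(price ~ size)
abline(fit, col = "red")

coef(fit)      # estimated intercept and slope
summary(fit)   # standard errors, t-values, R-squared, ...
```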



4. July 6, 2024:

In both sections we are about to start the notes "6. p-values" in the
Section 2 notes "Learning from Data".

Homework

Problems 1.1 - 1.10 from Section 2.


Quiz 2:
Available from 8am Tuesday July 9 to 11:59pm Thursday July 11.

Quiz
To prepare for the quiz, the following homework problems are particularly useful:
Section 1:
1.12 Ford and Tesla
1.15 Working with the Normal Distribution
Section 2:
1.3 The Audit
1.5 USA Mean Return
1.7 CI-for-proportion-of-up-ticks

Additional Reading:
Stine and Foster: Chapters 15 and 16.
OpenIntro Statistics: 5.1, 5.2.
Irizarry: chapter 15.

Old Tests
2023 midterm: Questions 1-3.
2023 quiz 2.
2007 final: 3,4,5, and 7 parts a and b.
2012 final: 3,4,5,6, 7 parts 1 and 2.


3. June 29, 2024:

On Friday we got to slide 13 of the section 2 notes "Learning from Data".
On Saturday (Rob had a power outage!!!) we got to slide 9 of the section 2 notes.

No quiz this week.

Note that in the Excel section of the webpage there is a video on
plotting the normal density in excel.
Don't forget to check out NORMDIST and NORMINV in the Excel functions.

Homework

All problems from Section 1.
Problem 1.18 involves more advanced R, so have a look at it but you don't have to do it.
Problem 1(a) of the section 2 homework.


Additional Optional Reading:
Stine and Foster: Chapters 12 and 14
openintro: 4.1, 5.1
Irizarry: 13.10-13.13, chapter 14.

Optional R
Check out the R scripts at the bottom of the Section 1 notes area on the webpage.
Maybe start with the Normal-distribution-in-R.R script.
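
As a quick orientation before opening the scripts, the sketch below shows the base R functions that play the role of Excel's NORMDIST and NORMINV. The mean and standard deviation used here are made-up values, and this is not the content of Normal-distribution-in-R.R.

```r
# Hypothetical normal distribution: mean 0.1, standard deviation 0.2.
mu <- 0.1
sigma <- 0.2

# Like NORMDIST(0, mu, sigma, TRUE): probability that X is below 0.
p_below <- pnorm(0, mean = mu, sd = sigma)

# Like NORMINV(0.975, mu, sigma): the value with 97.5% probability below it.
q_hi <- qnorm(0.975, mean = mu, sd = sigma)

# Like NORMDIST(x, mu, sigma, FALSE): the density, plotted over mu +/- 3 sd.
x <- seq(mu - 3 * sigma, mu + 3 * sigma, length.out = 200)
plot(x, dnorm(x, mean = mu, sd = sigma), type = "l", ylab = "density")

p_below
q_hi
```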


2. June 22, 2024:

In both classes we ended at the Sharpe ratio, which is the last slide before Continuous distributions.

In the homework file (Section 1 Homework) you should now be able to do the first 14 problems.
Solutions to all problems are provided.

Note that the solutions to the problems are done in R. Checking the details of the R is a great way to get going in R!

Quiz 1 Preparation:

Quiz 1 will be available from 8am, Tuesday June 25 to 11:59 pm Thursday June 27.
Once you start you will have one hour to complete it.

The test is open notes and book.
You cannot talk to a person or do internet searches or use things like ChatGPT.
You will not be tested on software (e.g. R or Excel).
Note that I drop the lowest of the three quizzes.

From the old tests, have a look at:
2023 Quiz 1, all problems.
2015 Quiz 1, all problems.
2007 final, question 3.
2012 final, questions 3 and 4.
2017 midterm, questions 3 and 4.
2016 midterm, questions 2, 3, and 4.
Additional Optional Reading:

Stine and Foster: Chapters 9 and 10.
OpenIntro Statistics: section 3.2.
Irizarry: 14.1- 14.6.


1. June 15, 2024:

In the Friday class we got to slide 45, "100 coins" in the Section 1 notes.
On Saturday we got to slide 48, just a bit further.

Homework
(remember, you don't have to hand the homework in)

Try problems 1-8 of the Section 1 homework.

Old tests
A good question to try on the old tests would be: 07 final, 3 (a)-(e).
You could also have a look at problem 7 of the 2016 midterm.

Excel

See the video Introduction to statistics in Excel for excel stats basics.
Make sure you:
(i) do tools/excel add ins/Analysis ToolPak.
(ii) can download the file midcity.csv from the data webpage to a directory (folder) and then read it into Excel.
(iii) see where statistical functions are in /formulas/More functions.
(iv) see where more stats is in data/data analysis.

Optional R

Read through A first look at R, which is also on the webpage.
Then, see if you can pick up in R where we left off by continuing to work through the R code in
simple-data-analysis.R.

Have a look at the second link on R from the webpage under Notes on R, "R and Data".

Try having a look at Hands-On Programming with R or swirl.

The first 5 chapters of "Introduction to Data Science", by Irizarry are
an excellent introduction to R.

Optional Reading:

OpenIntro Statistics: sections 2.1-2.3.
Stine and Foster: Chapters 7, 8, 9.1
"Introduction to Data Science", by Irizarry discusses probability in chapters 13 and 14!


See the syllabus for references to these books.


0. June 8, 2024:

There are no assignments to do before the first class.

You may want to look at the materials on excel and R on the webpage.