Where are we and what should I be doing?
April 6:
We were on the out-of-sample confusion matrix for the IMDB (binary output) example
in the neural nets notes.
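For reference, here is a minimal sketch of what an out-of-sample confusion matrix for a binary output looks like in sklearn. It does not use the IMDB data or a neural net: the simulated data, the logistic regression stand-in, and the 0.5 threshold are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Simulated binary data standing in for the IMDB labels (assumption).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Logistic regression is only a stand-in for the neural net in the notes.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Threshold the predicted probability of class 1 at 0.5 (assumed cutoff).
phat = model.predict_proba(X_test)[:, 1]
yhat = (phat > 0.5).astype(int)

# Rows are the true class, columns are the predicted class.
cm = confusion_matrix(y_test, yhat)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print("out-of-sample accuracy:", accuracy)
```

The key point is that the matrix is computed on the test rows only, which the model never saw during fitting.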
April 4:
Just did SGD (stochastic gradient descent) in the neural nets notes.
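As a reminder of the idea, here is a small sketch of mini-batch SGD for least squares written in plain numpy. The simulated data, learning rate, batch size, and number of epochs are made-up values for illustration, not the settings used in the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data from y = 2 + 3x + noise (made-up numbers).
n = 1000
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + 0.1 * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # column of ones for the intercept

beta = np.zeros(2)     # starting values
learning_rate = 0.05   # assumed step size
batch_size = 32        # assumed mini-batch size

for epoch in range(50):
    perm = rng.permutation(n)  # reshuffle the data every epoch
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        resid = X[idx] @ beta - y[idx]
        # Gradient of the mean squared error on this mini-batch only.
        grad = 2.0 * X[idx].T @ resid / len(idx)
        beta -= learning_rate * grad

print(beta)  # should be close to (2, 3)
```

Each update uses only a random subset of the rows, so each step is cheap and noisy, which is the "stochastic" part.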
March 28:
Just did boosting.
About to do variable importance measures for tree based methods.
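If you want to play with this ahead of time, the sketch below fits a boosted tree model in sklearn and reads off its impurity-based variable importances. The simulated data and the tuning values (200 trees, depth 2, learning rate 0.1) are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Simulated data: y depends strongly on x0, weakly on x1, and not at all on x2.
n = 500
X = rng.uniform(-1, 1, size=(n, 3))
y = 5.0 * np.sin(3.0 * X[:, 0]) + 1.0 * X[:, 1] + 0.1 * rng.normal(size=n)

gbm = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.1, max_depth=2, random_state=0
)
gbm.fit(X, y)

# Impurity-based importances, normalized to sum to one.
importance = gbm.feature_importances_
for j, imp in enumerate(importance):
    print(f"x{j}: {imp:.3f}")
```

The importance of a variable here is the total reduction in the loss from splits on that variable, summed over all the trees.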
March 16:
About to do the examples at the end of the regularized logit notes.
February 27:
We got up to the picture depicting the constraint contours for Lp in
Section 8 "Understanding the Lasso Solution".
Homework 3 is on the webpage. It has a couple of simple questions on the basic optimization concepts we covered.
Homework 4 is on the webpage.
It covers regularized regression with the LASSO (L1) and, if you like, Ridge (L2).
This is important material and fundamental to our course.
You should spend some time getting to know software for LASSO and Ridge.
As noted on the webpage, Chapter 6 (section 6.2 in particular) is a wonderful place to read
about "shrinkage methods" and the book has a webpage with several useful resources.
Section 6.1 covers the variable selection stuff and AIC and BIC that we did in sections 4 and 5 of our
linear regression notes.
The webpage has lecture notes which you may prefer to mine and a lab where the methods are illustrated in R.
You might also want to have a look at the vignette for the R package glmnet (see > browseVignettes()).
Note that for both R and python I have simple scripts on the webpage that fit regularized regression.
See:
Simple python script to do Ridge and Lasso
Simple python script to do Ridge and Lasso, html
Simple R script to do Ridge and Lasso
Simple R script to do Ridge and Lasso, html version
In python you should go to scikit learn (https://scikit-learn.org/stable/) and check out:
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.linear_model import RidgeCV
from sklearn.linear_model import Lasso
from sklearn.linear_model import LassoCV
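Here is a minimal sketch of how some of these pieces fit together. It is not the script on the webpage: the simulated data, the grid of alphas for Ridge, and the use of make_pipeline to chain the scaler and the model are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Simulated data: 10 features, only the first two matter.
n, p = 200, 10
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)

# Standardize the x's first so the penalty treats all coefficients alike.
ridge = make_pipeline(StandardScaler(), RidgeCV(alphas=np.logspace(-3, 3, 50)))
ridge.fit(X, y)

# LassoCV picks its own grid of alphas and chooses one by 5-fold CV.
lasso = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
lasso.fit(X, y)

ridge_coef = ridge.named_steps["ridgecv"].coef_
lasso_coef = lasso.named_steps["lassocv"].coef_
print("ridge alpha:", ridge.named_steps["ridgecv"].alpha_)
print("lasso alpha:", lasso.named_steps["lassocv"].alpha_)
print("nonzero lasso coefficients:", np.sum(lasso_coef != 0))
```

Ridge shrinks all the coefficients but typically leaves them nonzero, while the LASSO can set some of them exactly to zero.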
Chapter 42 of "Python Data Science Handbook, 2nd Ed" by VanderPlas covers polynomial regression
and Ridge/LASSO.
Chapter 4 of "Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, 3rd Ed" by Géron covers linear regression,
polynomial regression, and regularized linear models.
February 27:
You should watch the recorded lecture on sections 4 and 5 of the linear regression notes on subset selection and AIC/BIC.
https://www.rob-mcculloch.org/2023_uml/webpage/notes22/linreg_s4-5_subset-abIC.mp4
Homework 3 and 4 are on the webpage.
Homework 3 is related to the material we did on optimization and homework 4 is related to the LASSO.
Would be good to get these done by the end of spring break.
February 9:
We finished the Knn/Bias-variance notes and went through the python script which
does cross validation and grid search using the general sklearn design.
Next time we start our quick optimization review.
Homework 2 is on the webpage with a February 17 due date.
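The cross validation and grid search pattern mentioned in this entry looks roughly like the sketch below. This is not the class script: the simulated data, the grid of k values, and the choice of 5 folds are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Simulated one-dimensional regression problem (made-up data).
n = 300
x = rng.uniform(0, 10, size=(n, 1))
y = np.sin(x[:, 0]) + 0.3 * rng.normal(size=n)

# Try each k in the grid and score it by 5-fold cross validation.
param_grid = {"n_neighbors": [1, 2, 5, 10, 20, 50, 100]}
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",  # sklearn maximizes, hence the negative
)
search.fit(x, y)

best_k = search.best_params_["n_neighbors"]
cv_rmse = np.sqrt(-search.best_score_)
print("best k:", best_k, "cv rmse:", cv_rmse)
```

The same three steps (pick an estimator, give a dictionary of parameter values, call fit) work for most sklearn models, which is what makes the design general.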
February 7:
Finished the knn notes and had a look at the script which does Naive Bayes with sklearn.
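For a feel of what Naive Bayes with sklearn looks like, here is a tiny sketch. It is not the class script: the six "documents" are made up, and using CountVectorizer with MultinomialNB (with alpha=1 smoothing) is an assumed setup.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up corpus (not the course's ham/spam data).
texts = [
    "win cash now",
    "free prize win",
    "cash prize free offer",
    "meeting at noon",
    "lunch tomorrow at noon",
    "see you at the meeting",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Turn each document into a vector of word counts.
vec = CountVectorizer()
X = vec.fit_transform(texts)

# alpha=1.0 adds one pseudo-count per word so no probability is exactly zero.
nb = MultinomialNB(alpha=1.0)
nb.fit(X, labels)

new = vec.transform(["free cash", "meeting tomorrow"])
pred = nb.predict(new)
print(pred)
```

The first new message contains only words seen in spam and the second only words seen in ham, so the predictions come out as spam and ham respectively.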
February 2:
We are in the middle of the section where we do y=medv using more than just the one x=lstat.
January 31:
We are about to do section 3, Out of sample predictions, in the KNN notes.
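To preview the idea, here is a sketch comparing in-sample and out-of-sample error for KNN on a train/test split. The simulated data and the three values of k are assumptions for illustration and are not taken from the notes.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)

# Simulated data (made-up): a sine curve plus noise.
n = 2000
x = rng.uniform(0, 10, size=(n, 1))
y = np.sin(x[:, 0]) + 0.3 * rng.normal(size=n)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=1
)


def rmse(actual, predicted):
    """Root mean squared error."""
    return np.sqrt(np.mean((actual - predicted) ** 2))


results = {}
for k in [1, 20, 1000]:
    knn = KNeighborsRegressor(n_neighbors=k).fit(x_train, y_train)
    results[k] = {
        "in": rmse(y_train, knn.predict(x_train)),   # error on data used to fit
        "out": rmse(y_test, knn.predict(x_test)),    # error on held-out data
    }
    print(k, results[k])
```

With k=1 the in-sample error is zero because each training point is its own nearest neighbor, which is why in-sample fit tells you little about how well you predict new data.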
January 26:
We are about to do Naive Bayes on the ham/spam example in the first set of notes
on probability review and Naive Bayes.
Homework 1 is on the webpage and due February 6.
January 24:
About to do Bayes Theorem in the probability review leading up to Naive Bayes.
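As a quick warm-up, Bayes Theorem is just arithmetic once you have the pieces. The probabilities below are made up for illustration and are not from the ham/spam data.

```python
# Made-up probabilities for illustration only.
p_spam = 0.2              # prior: P(spam)
p_word_given_spam = 0.5   # P(message contains "free" | spam)
p_word_given_ham = 0.05   # P(message contains "free" | ham)

p_ham = 1 - p_spam

# Law of total probability: P(word) = P(word|spam)P(spam) + P(word|ham)P(ham).
p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham

# Bayes Theorem: P(spam | word) = P(word | spam) P(spam) / P(word).
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 4))  # 0.7143
```

Seeing the word moves the probability of spam from the prior of 0.2 up to 5/7, about 0.71.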
January 19:
We are just about to do the multiple regression of y=price on x=(mileage,year) in the
Hello world python script.
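In sklearn, a multiple regression of this kind looks roughly like the sketch below. It does not use the used-car data from the script: the simulated mileage, year, and price values and the "true" coefficients are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Simulated cars (made-up data): mileage and price in thousands.
n = 200
mileage = rng.uniform(10, 150, size=n)
year = rng.integers(2000, 2014, size=n).astype(float)
price = 50.0 - 0.2 * mileage + 2.5 * (year - 2000) + rng.normal(scale=5.0, size=n)

# sklearn wants X as a 2-d array with one column per x variable.
X = np.column_stack([mileage, year])
lm = LinearRegression().fit(X, price)

print("intercept:", lm.intercept_)
print("coefficients (mileage, year):", lm.coef_)

# Predict the price of a car with 60 (thousand) miles from the year 2010.
pred = lm.predict(np.array([[60.0, 2010.0]]))
print("predicted price:", pred[0])
```

The fitted coefficients should land near the values used to simulate the data (about -0.2 for mileage and 2.5 for year).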
You should be playing around in R and/or python and look at the various books and websites
to find python/R resources that work for you.
Everyone (whether you are using R or python) should have a look at
https://scikit-learn.org/stable/.
In particular, have a look at:
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.
Note the statement:
Ridge regression addresses some of the problems of Ordinary Least Squares by imposing a penalty on the size of the coefficients with l2 regularization.
What are the problems with OLS????!!!!
It is also useful to look at the official Machine Learning in R webpage:
https://cran.r-project.org/web/views/MachineLearning.html.
January 17:
We finished going through the hello world in R script, discussed python software, and started the hello world python script.
Next time (1-19-23) we will continue with the python script.
The R code I was looking at is do_1-17-23.R.
The python code I was looking at is do_1-17-23.py.
January 13:
We are working through the "Hello World in R" document at
https://www.rob-mcculloch.org/R/R_Hello-World_Regression.html.
Let's pick it up next time at the section titled
Run the Regression of y=price on X=(mileage,year).
If you are planning to use R, make sure you have R and Rstudio installed and
start playing with R.
If you are planning to use python, have a look at my python information page
https://www.rob-mcculloch.org/python/index.html
and get python installed.
You have to decide if you are doing anaconda, miniconda, or a system install (e.g. with pip3).
Hopefully next class we will wrap up our look at R and move on to python by going through
https://www.rob-mcculloch.org/python/Py_Hello-Word_Regression.html.
Raj (student) notes that on the Mac, Homebrew is a great package manager:
Hey Prof,
Homebrew (Brew) is the go-to package manager for macOS. It can be used to install software libraries (e.g. Python or R), applications (e.g. RStudio), and managed services (e.g. postgres).
Command to install brew: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
There seems to be different flavors/variants of conda (Anaconda, Miniconda, etc.). I'm currently using Miniforge, which can be installed via brew install miniforge
- RD
January 10:
Note that while I will be lecturing in class, the lecture will also be available on zoom and recorded.
zoom info:
Topic: 494 Machine Learning
Time: Jan 10, 2023 01:30 PM Arizona
Every week on Tue, Thu, until Apr 27, 2023, 32 occurrence(s)
Jan 10, 2023 01:30 PM
Jan 12, 2023 01:30 PM
Jan 17, 2023 01:30 PM
...
Apr 27, 2023 01:30 PM
Please download and import the following iCalendar (.ics) files to your calendar system.
Weekly: https://asu.zoom.us/meeting/tZcqcu-orzkjGtQMT4ojeVcvaadmCx0QiaCn/ics?icsToken=98tyKuGrqT4sGtWXuRmHRpwqB4_4M_TzmHpEjbdssSuxDCpWVADgN-NGP5FnQOnZ
Join from PC, Mac, Linux, iOS or Android: https://asu.zoom.us/j/83756012900?pwd=bGw1WHNPUG96VEdLZmx1dzNDWjBOdz09
Password: 877013