Information on R

Here are some books I like on R, Python, data science, and machine learning:
Books

R

Basic R links are:

The main R webpage is: R.

RStudio is developed by Posit; see: Posit.
For RStudio itself, you can go directly to RStudio.

Installation

If you just google "how do I install R" you will have no problem.

For R, go to download R.
For RStudio, downloads are at: download RStudio.


swirl is a nice way to learn R.
If you go to http://swirlstats.com/students.html it tells you how
to get started installing R, RStudio, and swirl.
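Once R itself is installed, getting swirl going from the R console usually takes just a few commands. This is a minimal sketch that assumes an internet connection and a working CRAN mirror; the swirl page above has the full instructions.

```r
# Install swirl from CRAN (only needed once per machine).
install.packages("swirl")

# Load the package and start the interactive lessons.
library(swirl)
swirl()
```

After `swirl()` starts, it prompts you in the console and you pick a course from its menu.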

Documentation

The official R documentation page is R documentation,
and the official R intro is Introduction to R.
The writing in the R introduction can get very technical, but if you just read the initial parts of a section it is usually very good and
definitely worth a try.

The official RStudio intro documentation is at Hands-On Programming with R.

The swirl package mentioned above is also good.

For a more detailed textbook introduction, I like the books by Irizarry:

Introduction to Data Science: Data Wrangling and Visualization with R (Chapman & Hall/CRC Data Science Series), 2nd Edition
by Rafael A. Irizarry

Introduction to Data Science: Data Analysis and Prediction Algorithms with R (Chapman & Hall/CRC Data Science Series)
by Rafael A. Irizarry

Note that there are two flavors of R.
There is "base R" which is the basic system which has been around for a long time.
There is also the more recent and quite popular "tidyverse".
For the tidyverse see
R for Data Science (2e).
See also
Quantitative Social Science: An Introduction in tidyverse
by Kosuke Imai and Nora Webb Williams, 2022.

and the simpler version

Data Analysis for Social Science: A Friendly and Practical Introduction
by Elena Llaudet and Kosuke Imai, 2023.
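To make the "two flavors" concrete, here is a small sketch that does the same computation both ways. The choice of the built-in mtcars data and of this particular summary is mine, purely for illustration; the tidyverse half assumes the dplyr package is installed.

```r
# Base R: mean mpg by number of cylinders, for cars with more than 100 hp.
big <- mtcars[mtcars$hp > 100, ]
base_res <- aggregate(mpg ~ cyl, data = big, FUN = mean)

# tidyverse (dplyr): the same computation written as a pipeline of verbs.
library(dplyr)
tidy_res <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg))

print(base_res)
print(tidy_res)
```

The two results hold the same numbers; base R returns a data.frame and dplyr returns a tibble.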

R Markdown

In data science, "dynamic documents" in which code, math, and the output from the code are combined have become very popular.

Rmarkdown (particularly in RStudio) is a widely used way to write them.

This seems like a nice tutorial from R bloggers: Getting Started with R Markdown - Guide and Cheatsheet

Here is the RStudio webpage for Rmarkdown: RStudio on Rmarkdown

The RStudio cheatsheets are very useful. You can get them from the RStudio help, but here they are:
rmarkdown cheatsheet (from the RStudio help)    rmarkdown-reference.pdf

See also Part V of "R for Data Science" by Wickham and Grolemund.

Try going into RStudio, clicking to get a new Rmarkdown file, and then clicking Knit. Note that (if your Rmarkdown is simple) you can "render" the Rmarkdown to html or pdf (or Word). If you play around with this file and consult the cheatsheets you will get the hang of it, but it takes a little while.

Mostly Rmarkdown is pretty easy, but some things can be tricky.
At the top of the fname.Rmd file there is a preamble (a YAML header) that controls how the Rmarkdown is "rendered".
For example, you can render to pdf or html. It is not always easy to figure out how to get these
options to do what you want.
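As a rough illustration of the preamble, the sketch below writes a tiny Rmarkdown file from R and renders it outside of the Knit button. The title and body text are placeholders I made up, and the render step assumes the rmarkdown package and pandoc are available (RStudio ships with pandoc).

```r
# Write a tiny Rmarkdown file whose preamble asks for html output.
# The title and body text are placeholders for illustration only.
rmd_file <- file.path(tempdir(), "fname.Rmd")
writeLines(c(
  "---",
  "title: \"My notes\"",
  "output: html_document",
  "---",
  "",
  "Some ordinary text in the body of the document."
), rmd_file)

# Render using the output format named in the preamble.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(out)  # path to the rendered html file
}
```

Changing the preamble line to `output: pdf_document` asks for pdf instead, which additionally needs a LaTeX installation.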

Rmarkdown has "code chunks" where you put in a chunk of R code.
There are then a lot of options on how the code and the output from the code are displayed.
In particular, you can cache (to disk) the results for a code chunk and specify which
previous chunks a chunk depends on. This way a chunk is only rerun when its own code changes or when the code for a chunk it depends on changes.
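Here is a sketch of what cached chunks with a dependency can look like. It builds the text of a small Rmarkdown file in R and prints it so you can see the chunk headers. The chunk labels "sim" and "fit" and the file name are made up for this example.

```r
# Build the text of an Rmarkdown file with two cached code chunks.
# The chunk labels "sim" and "fit" are hypothetical names for this example.
fence <- strrep("`", 3)  # three backticks open and close a code chunk
rmd_lines <- c(
  "---",
  "title: \"Caching example\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r sim, cache=TRUE}"),
  "set.seed(1)",
  "x <- rnorm(1000)",
  fence,
  "",
  paste0(fence, "{r fit, cache=TRUE, dependson=\"sim\"}"),
  "mean(x)",
  fence
)
rmd_file <- file.path(tempdir(), "cache-example.Rmd")
writeLines(rmd_lines, rmd_file)

# Show the file as it would appear in the editor.
cat(rmd_lines, sep = "\n")
```

With these options, knitting a second time reuses the stored results for both chunks. Editing the code in "sim" invalidates the cache for "fit" as well, because "fit" names "sim" in dependson.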


Here is an example of a section of notes written in Rmarkdown (note the preamble):
   markdown,    pdf

Here is an example where a tutorial was written in Rmarkdown
(note the cache=TRUE option in some of the code chunks and the dependson= option in subsequent chunks):
   markdown,    pdf

R packages for Machine Learning
CRAN Task View: Machine Learning & Statistical Learning

Hello world, stats in R

  Hello world regression in R, Rmarkdown
  Hello world regression in R, rendered html
  Hello world regression in R, rendered pdf
  Hello world regression in R, short version, rendered html

OOS Loop in R

Here is a simple example of a loop in R to estimate the out-of-sample root mean square error
for linear regression and the susedcars.csv data set using just x=(mileage,year) for y=price:
  do-cars-oos.R.
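The general shape of such a loop can be sketched as follows. This is not the contents of do-cars-oos.R: it is a minimal version under my own assumptions, and it uses simulated data as a stand-in for susedcars.csv so that it runs on its own. The number of splits and the 75% training fraction are arbitrary choices.

```r
set.seed(99)

# Simulated stand-in for susedcars.csv (NOT the real data): the column names
# price, mileage, year follow the text above; the numbers are made up.
# With the real file you might instead use: cars <- read.csv("susedcars.csv")
n <- 1000
cars <- data.frame(
  mileage = runif(n, 5000, 150000),
  year    = sample(2000:2013, n, replace = TRUE)
)
cars$price <- 20000 - 0.1 * cars$mileage + 1500 * (cars$year - 2000) +
  rnorm(n, sd = 5000)

nrep   <- 50               # number of random train/test splits
ntrain <- floor(0.75 * n)  # size of each training set
rmse   <- numeric(nrep)    # out-of-sample RMSE for each split

for (i in seq_len(nrep)) {
  ii    <- sample(n, ntrain)  # rows used for training
  train <- cars[ii, ]
  test  <- cars[-ii, ]        # held-out rows, never seen by lm
  fit   <- lm(price ~ mileage + year, data = train)
  pred  <- predict(fit, newdata = test)
  rmse[i] <- sqrt(mean((test$price - pred)^2))
}

cat("average out-of-sample RMSE:", mean(rmse), "\n")
```

Each pass fits the regression on the training rows only and measures the error on the held-out rows. Averaging over many random splits smooths out the luck of any single split.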

What is the oos loop trying to do?
  Out of sample Loss.