---
title: "R_Hello-World_Multiple-Regression"
author: "Rob McCulloch"
date: "January 22, 2018"
output:
  pdf_document: default
  html_document: default
fontsize: 14pt
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Simple Example of Multiple Regression in R
Let's do a "hello world" example in R.
We will use the used cars data (`susedcars.csv`) to relate the price of a used
car to its mileage and year.
We will:

* read in the data from a `.csv` file into an R data.frame
* process the data by selecting a few columns and rescaling two of them
* do some simple summaries
* do some simple plots
* run the multiple regression
* plot $y$ versus $\hat{y}$
* make some predictions

# Read in the data and get the variables we want
```{r rdat, include=TRUE, echo=TRUE}
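cd = read.csv("susedcars.csv")
names(cd)
# keep only the price, mileage, and year columns
cd = cd[,c(1,4,5)]
# rescale to thousands of dollars and thousands of miles
cd$price = cd$price/1000
cd$mileage = cd$mileage/1000
head(cd)
summary(cd)
cor(cd)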
```
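
Selecting columns by position, as in `c(1,4,5)`, quietly picks the wrong variables if the column order in the file ever changes. Here is a minimal sketch of selecting by name instead. It uses a small made-up data frame rather than `susedcars.csv`: the `trim` and `color` columns and all the values are hypothetical stand-ins.

```r
# Toy data frame standing in for the raw file; all values are made up.
raw <- data.frame(
  price   = c(43995, 44995, 25999),
  trim    = c("550", "550", "430"),          # hypothetical extra column
  color   = c("Silver", "Black", "White"),   # hypothetical extra column
  mileage = c(36858, 46883, 108759),
  year    = c(2008, 2012, 2007)
)

# Select by name instead of by position.
keep <- c("price", "mileage", "year")
toy <- raw[, keep]

# Rescale to thousands of dollars and thousands of miles.
toy$price   <- toy$price / 1000
toy$mileage <- toy$mileage / 1000

str(toy)
```

The same `keep` vector could be applied to the real data frame, assuming the file really has columns with those three names.
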
# Plot y versus each x
```{r pl-mileage-price, include=TRUE, echo=TRUE,dependson="rdat"}
plot(cd$mileage,cd$price,xlab="mileage",ylab="price")
```
```{r pl-year-price, include=TRUE, echo=TRUE,dependson="rdat"}
plot(cd$year,cd$price,xlab="year",ylab="price")
```
# Run the Regression of y=price on X=(mileage,year)
Ok, now we can run the regression.
```{r reg-price-mileage-year, include=TRUE, echo=TRUE,dependson="rdat"}
lmmod = lm(price~mileage+year,cd)
summary(lmmod)
cat("the coefficients are:",lmmod$coefficients,"\n")
```
So, with price in thousands of dollars and mileage in thousands of miles, the fitted relationship is
$$
\text{price} = -5365.49 - 0.154 \, \text{mileage} + 2.7 \, \text{year}
$$

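
To make the link between the coefficients and the fitted equation concrete, here is a small sketch that pulls the coefficients out by name with `coef()` and computes one fitted value by hand. It runs on simulated data, not the used cars data: the data frame `sim` and the "true" coefficients used to generate it are arbitrary choices for illustration.

```r
set.seed(1)
n <- 200

# Simulated stand-in data; the generating coefficients are arbitrary.
sim <- data.frame(
  mileage = runif(n, 0, 200),
  year    = sample(1995:2013, n, replace = TRUE)
)
sim$price <- -5000 - 0.15 * sim$mileage + 2.5 * sim$year + rnorm(n, sd = 5)

fit <- lm(price ~ mileage + year, data = sim)
b <- coef(fit)   # named vector: "(Intercept)", "mileage", "year"

# Fitted value for the first observation, computed from the equation.
by_hand <- b["(Intercept)"] + b["mileage"] * sim$mileage[1] + b["year"] * sim$year[1]

# Should agree with what lm() stored, up to floating-point error.
all.equal(unname(by_hand), unname(fitted(fit)[1]))
```

Indexing the coefficients by name is a little safer than relying on their order when the model formula changes.
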
# Get and Plot the Fits
We will pull off the fitted values and plot them versus $y$.
```{r plot-fits, include=TRUE, echo=TRUE,dependson=c("rdat","reg-price-mileage-year")}
yhat = lmmod$fitted.values
plot(cd$price,yhat,xlab="y=price",ylab="yhat")
abline(0,1,col="red") # points fall near this line when the fit is good
```
**Clearly, the fit is really bad!**
If the linear model fit well, the points would cluster around the line $\hat{y} = y$.

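
One way to put a number on how well $\hat{y}$ tracks $y$: for a least squares fit with an intercept, $R^2$ equals the squared correlation between $y$ and $\hat{y}$. The sketch below checks this on simulated data. The curved relationship is an arbitrary choice made only to produce a visibly poor linear fit; it is not a claim about the used cars data.

```r
set.seed(2)
n <- 200

# Simulated data, deliberately nonlinear in x (arbitrary choice).
x <- runif(n, 0, 3)
y <- exp(x) + rnorm(n, sd = 0.5)

fit  <- lm(y ~ x)
yhat <- fitted(fit)

# y versus yhat, with the 45-degree line for reference.
plot(yhat, y, xlab = "yhat", ylab = "y")
abline(0, 1, col = "red")

# Two routes to the same R-squared.
r2_from_cor <- cor(y, yhat)^2
r2_from_lm  <- summary(fit)$r.squared
all.equal(r2_from_cor, r2_from_lm)
```

A high $R^2$ does not rule out systematic curvature around the 45-degree line, so the plot is still worth looking at.
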
# Predictions
```{r lm-predict, include=TRUE, echo=TRUE,dependson=c("rdat","reg-price-mileage-year")}
xpdf = data.frame(mileage=c(40,100),year=c(2010,2004))
ypred = predict(lmmod,xpdf)
cat("at x:\n")
xpdf
cat("predicted price is\n")
ypred
```
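
A point prediction says nothing about its uncertainty. As a sketch, `predict()` on an `lm` fit can also return a prediction interval via `interval="prediction"`. The example below uses simulated data again (the data frame `sim` and its generating coefficients are arbitrary), with the same two hypothetical cars as above.

```r
set.seed(3)
n <- 200

# Simulated stand-in data; the generating coefficients are arbitrary.
sim <- data.frame(
  mileage = runif(n, 0, 200),
  year    = sample(1995:2013, n, replace = TRUE)
)
sim$price <- -5000 - 0.15 * sim$mileage + 2.5 * sim$year + rnorm(n, sd = 5)

fit <- lm(price ~ mileage + year, data = sim)

# New x values must use the same column names as the model formula.
newx <- data.frame(mileage = c(40, 100), year = c(2010, 2004))

# Matrix with columns fit (point prediction), lwr, and upr.
pred <- predict(fit, newdata = newx, interval = "prediction", level = 0.95)
cbind(newx, pred)
```

These intervals rely on the usual linear model assumptions, so they are only as trustworthy as the fit itself.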