4 Linear Regression in R

Author: Nathaniel E. Helwig

4.1 Chapter Outline and Goals

In this chapter, we will cover how to…

  • Fit simple and multiple linear regression models
  • Test the significance of regression coefficients
  • Plot and interpret the regression results
  • Make predictions from fit regression models

R’s lm (linear model) function will be the primary tool used in the chapter.

4.2 Minnesota Beer Data (Reminder)

4.2.1 Overview

The Minnesota beer data has 44 beers measured on 7 variables:

  1. Brewery: Name of the brewery (factor with 8 levels)
  2. Beer: Name of the beer (factor with 44 levels)
  3. Description: Description of the beer (factor with 37 levels)
  4. Style: Style of the beer (factor with 3 levels)
  5. ABV: Alcohol by volume (numeric)
  6. IBU: International bitterness units (integer)
  7. Rating: Beer Advocate rating (integer)

Data obtained by NEH from Beer Advocate and the websites of the eight breweries.

4.2.2 Load and Look at the Data

Use the read.csv function to load the beer data into R

beer <- read.csv("http://users.stat.umn.edu/~helwig/notes/MNbeer.csv")

The head function returns the first six rows of a data frame

head(beer)
##       Brewery         Beer              Description Style ABV IBU Rating
## 1     Bauhaus  Wonderstuff     New Bohemian Pilsner Lager 5.4  48     88
## 2     Bauhaus    Stargazer German Style Schwarzbier Lager 5.0  28     87
## 3     Bauhaus  Wagon Party    West Cost Style Lager Lager 5.4  55     86
## 4     Bauhaus    Sky-Five!        Midwest Coast IPA   IPA 6.7  70     86
## 5 Bent Paddle         Kanu         Session Pale Ale   Ale 4.8  48     85
## 6 Bent Paddle Venture Pils            Pilsner Lager Lager 5.0  38     87

4.3 Simple Linear Regression

4.3.1 Fit the Model

Consider a simple linear regression model of the form \[ y_i = \beta_0 + \beta_1 x_i + \epsilon_i \] where \(y_i\) is the Rating of the i-th beer (response), \(x_i\) is the ABV of the i-th beer (predictor), \(\beta_0\) is the unknown regression intercept, \(\beta_1\) is the unknown regression slope, and \(\epsilon_i \sim \mathrm{N}(0, \sigma^2)\) is a latent Gaussian error term. To fit the model, we can use the lm function

mod <- lm(Rating ~ ABV, data = beer)

The first input is the regression formula (Response ~ Predictor), and the second input is the data frame containing the variables in the regression formula. Note that mod is an object of class lm, which is a list containing information about the fit model.

class(mod)
## [1] "lm"
names(mod)
##  [1] "coefficients"  "residuals"     "effects"       "rank"         
##  [5] "fitted.values" "assign"        "qr"            "df.residual"  
##  [9] "xlevels"       "call"          "terms"         "model"

For example, the $coefficients element contains the estimated regression coefficients

mod$coefficients
## (Intercept)         ABV 
##   74.034788    2.259646

which reveal that the expected Rating increases by about 2.26 points for every 1 unit (i.e., 1%) increase in ABV.
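
As a quick sanity check, these estimates can be used to compute a predicted Rating by hand. For example, for a beer with 5% ABV (compare with the predict output in Section 4.3.5):

b <- mod$coefficients
unname(b[1] + b[2] * 5)   # 74.034788 + 2.259646 * 5
## [1] 85.33302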

4.3.2 Inference Information

To obtain a more detailed summary of the fit model, use the summary function

modsum <- summary(mod)
names(modsum)
##  [1] "call"          "terms"         "residuals"     "coefficients" 
##  [5] "aliased"       "sigma"         "df"            "r.squared"    
##  [9] "adj.r.squared" "fstatistic"    "cov.unscaled"
modsum
## 
## Call:
## lm(formula = Rating ~ ABV, data = beer)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.4965 -2.1563  0.3593  1.6670  7.0179 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  74.0348     3.5933  20.604  < 2e-16 ***
## ABV           2.2596     0.6117   3.694 0.000632 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.28 on 42 degrees of freedom
## Multiple R-squared:  0.2452, Adjusted R-squared:  0.2272 
## F-statistic: 13.65 on 1 and 42 DF,  p-value: 0.000632

Note that summarizing an lm object returns the estimated error standard deviation sigma (\(\hat{\sigma} = 3.28\)), the coefficient of determination r.squared (\(R^2 = 0.2452\)), and a coefficient inference table for testing \(H_0: \beta_j = 0\) versus \(H_1: \beta_j \neq 0\). The observed t statistic for testing the slope parameter is \(t = 3.69\) with 42 degrees of freedom, resulting in a p-value less than 0.001, so we reject \(H_0\) at any standard \(\alpha\) level.
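
As a check, the t statistic and p-value for ABV can be reproduced directly from the stored coefficient matrix (a small sketch using the objects defined above):

ctab <- modsum$coefficients
ctab["ABV", "Estimate"] / ctab["ABV", "Std. Error"]    # t statistic (about 3.69)
2 * pt(-abs(ctab["ABV", "t value"]), mod$df.residual)  # two-sided p-value (about 0.00063)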

Use the confint function to obtain confidence intervals for regression coefficients

confint(mod, "ABV")
##        2.5 %   97.5 %
## ABV 1.025152 3.494139

The 95% confidence interval for \(\beta_1\) reveals that we expect the average Rating to increase by 1.03 to 3.49 points for each additional 1% ABV.
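
Equivalently, the interval can be constructed by hand as \(\hat{\beta}_1 \pm t_{0.975, 42} \cdot \widehat{\mathrm{SE}}(\hat{\beta}_1)\), a quick sketch:

est <- mod$coefficients["ABV"]
se <- modsum$coefficients["ABV", "Std. Error"]
est + c(-1, 1) * qt(0.975, df = mod$df.residual) * se   # matches confint(mod, "ABV")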

4.3.3 Plot the Regression Line

It’s easy to include the least-squares regression line on a scatterplot by adding a geom_smooth() to the ggplot():

# load the ggplot2 package (if not already loaded)
library(ggplot2)

# Without "standard error bars"
ggplot(beer, aes(x = ABV, y = Rating)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE)

# With "standard error bars"
ggplot(beer, aes(x = ABV, y = Rating)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = TRUE)

Alternatively, we can use the abline() function:

plot(beer$ABV, beer$Rating, xlab = "Alcohol By Volume", 
     ylab = "Beer Advocate Rating", main = "Alcohol by Rating")
abline(mod)

4.3.4 Diagnostic and Influence Plots

R makes it really easy to create simple diagnostic and influence plots for a fit regression model:

plot(mod)
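
Note that plot(mod) draws four plots in sequence (residuals vs fitted values, a normal Q-Q plot, a scale-location plot, and residuals vs leverage). To view all four at once, arrange them in a 2-by-2 grid:

par(mfrow = c(2, 2))   # 2 x 2 grid of plots
plot(mod)
par(mfrow = c(1, 1))   # reset to a single plot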

4.3.5 Prediction for New Data

We often want to use a fit regression model to create predictions for new data. In R, this involves first creating the data frame of new predictor scores

newdata <- data.frame(ABV = seq(4.2, 7.5, by = 0.1))

which we input to the predict function along with the fit model

newfit <- predict(mod, newdata)
newfit
##        1        2        3        4        5        6        7        8 
## 83.52530 83.75127 83.97723 84.20319 84.42916 84.65512 84.88109 85.10705 
##        9       10       11       12       13       14       15       16 
## 85.33302 85.55898 85.78495 86.01091 86.23688 86.46284 86.68880 86.91477 
##       17       18       19       20       21       22       23       24 
## 87.14073 87.36670 87.59266 87.81863 88.04459 88.27056 88.49652 88.72249 
##       25       26       27       28       29       30       31       32 
## 88.94845 89.17441 89.40038 89.62634 89.85231 90.07827 90.30424 90.53020 
##       33       34 
## 90.75617 90.98213

By default, the predict function returns a vector of predictions \(\hat{y}_{i(\mbox{new})} = \hat{\beta}_0 + \hat{\beta}_1 x_{i(\mbox{new})}\). To obtain the corresponding standard errors of the predictions, we can use the se.fit input

newfitse <- predict(mod, newdata, se.fit = TRUE)
newfitse
## $fit
##        1        2        3        4        5        6        7        8 
## 83.52530 83.75127 83.97723 84.20319 84.42916 84.65512 84.88109 85.10705 
##        9       10       11       12       13       14       15       16 
## 85.33302 85.55898 85.78495 86.01091 86.23688 86.46284 86.68880 86.91477 
##       17       18       19       20       21       22       23       24 
## 87.14073 87.36670 87.59266 87.81863 88.04459 88.27056 88.49652 88.72249 
##       25       26       27       28       29       30       31       32 
## 88.94845 89.17441 89.40038 89.62634 89.85231 90.07827 90.30424 90.53020 
##       33       34 
## 90.75617 90.98213 
## 
## $se.fit
##         1         2         3         4         5         6         7 
## 1.1064827 1.0521132 0.9985311 0.9458702 0.8942934 0.8439992 0.7952311 
##         8         9        10        11        12        13        14 
## 0.7482877 0.7035342 0.6614152 0.6224659 0.5873171 0.5566893 0.5313647 
##        15        16        17        18        19        20        21 
## 0.5121308 0.4996912 0.4945590 0.4969606 0.5067890 0.5236260 0.5468247 
##        22        23        24        25        26        27        28 
## 0.5756164 0.6092087 0.6468540 0.6878872 0.7317387 0.7779320 0.8260744 
##        29        30        31        32        33        34 
## 0.8758444 0.9269800 0.9792672 1.0325312 1.0866283 1.1414400 
## 
## $df
## [1] 42
## 
## $residual.scale
## [1] 3.279704

The interval input can be used to create confidence and prediction intervals

newfitCI <- predict(mod, newdata, interval = "confidence")
newfitPI <- predict(mod, newdata, interval = "prediction")
head(newfitCI)
##        fit      lwr      upr
## 1 83.52530 81.29233 85.75827
## 2 83.75127 81.62801 85.87452
## 3 83.97723 81.96211 85.99235
## 4 84.20319 82.29435 86.11204
## 5 84.42916 82.62440 86.23392
## 6 84.65512 82.95186 86.35838
head(newfitPI)
##        fit      lwr      upr
## 1 83.52530 76.54007 90.51053
## 2 83.75127 76.80033 90.70220
## 3 83.97723 77.05856 90.89590
## 4 84.20319 77.31472 91.09166
## 5 84.42916 77.56880 91.28951
## 6 84.65512 77.82077 91.48948
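
For reference, the prediction interval widens the confidence interval by adding the error variance: \(\hat{y} \pm t_{0.975, 42} \sqrt{\mathrm{SE}(\hat{y})^2 + \hat{\sigma}^2}\). A quick sketch verifying this with the se.fit output from above:

crit <- qt(0.975, df = newfitse$df)
se.pred <- sqrt(newfitse$se.fit^2 + newfitse$residual.scale^2)
head(cbind(fit = newfitse$fit,
           lwr = newfitse$fit - crit * se.pred,
           upr = newfitse$fit + crit * se.pred))   # matches head(newfitPI)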

The confidence and prediction intervals can be plotted using:

# Confidence intervals
ggplot(beer, aes(x = ABV, y = Rating)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = TRUE)

# Prediction intervals
newfitPI <- data.frame(newfitPI)
newfitPI$ABV <- newdata$ABV
ggplot(beer, aes(x = ABV, y = Rating)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE) +
  geom_ribbon(data = newfitPI, aes(x = ABV, ymin = lwr, ymax = upr), 
              fill = "blue", alpha = 0.2, inherit.aes = FALSE)

Or, using base graphics:

plot(beer$ABV, beer$Rating, xlab = "Alcohol By Volume", 
     ylab = "Beer Advocate Rating", main = "Alcohol by Rating",
     ylim = c(75, 100))
lines(newdata$ABV, newfitCI[,1])
lines(newdata$ABV, newfitCI[,2], lty = 2, col = "blue")
lines(newdata$ABV, newfitCI[,3], lty = 2, col = "blue")
lines(newdata$ABV, newfitPI[,2], lty = 3, col = "red")
lines(newdata$ABV, newfitPI[,3], lty = 3, col = "red")
legend("bottomright", lty = 1:3, legend = c("fit", "95% CI", "95% PI"),
       col = c("black", "blue", "red"), bty = "n")

4.4 Multiple Linear Regression

4.4.1 Overview

A multiple linear regression model has the form \[ y_i = \beta_0 + \sum_{j=1}^p \beta_j x_{ij} + \epsilon_i \] where \(y_i\) is the response for the \(i\)-th observation, \(x_{ij}\) is the j-th predictor for the i-th observation, \(\beta_0\) is the unknown regression intercept, \(\beta_j\) is the unknown regression slope for the j-th predictor, and \(\epsilon_i \sim \mathrm{N}(0, \sigma^2)\) is a latent Gaussian error term. Note that \(\beta_j\) gives the expected change in the response variable for a 1-unit change in the j-th predictor variable conditioned on the other predictors, i.e., holding all other predictors constant.

4.4.2 Additive Effects

We will start by considering a model predicting the Rating from the additive effects of ABV and Brewery

amod <- lm(Rating ~ ABV + Brewery, data = beer)

Note that this model allows each Brewery to have a unique regression intercept (Bauhaus is the baseline), but assumes that the slope between ABV and Rating is the same for each Brewery. We can summarize the model using the same approach as before:

amodsum <- summary(amod)
amodsum
## 
## Call:
## lm(formula = Rating ~ ABV + Brewery, data = beer)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.702 -1.477  0.217  1.207  5.505 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          77.82164    3.27842  23.738  < 2e-16 ***
## ABV                   1.58726    0.53419   2.971  0.00533 ** 
## BreweryBent Paddle    1.01666    1.75996   0.578  0.56719    
## BreweryFulton        -2.27857    1.75923  -1.295  0.20373    
## BreweryIndeed         0.87472    1.64545   0.532  0.59836    
## BrewerySteel Toe      0.05955    1.89701   0.031  0.97513    
## BrewerySummit        -1.16837    1.64440  -0.711  0.48209    
## BrewerySurly          4.25343    1.68773   2.520  0.01644 *  
## BreweryUrban Growler -3.22777    1.76155  -1.832  0.07542 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.622 on 35 degrees of freedom
## Multiple R-squared:  0.5979, Adjusted R-squared:  0.506 
## F-statistic: 6.506 on 8 and 35 DF,  p-value: 3.502e-05

Compared to the simple linear regression model containing only the ABV predictor, we have noticeably reduced the residual standard deviation estimate sigma (\(\hat{\sigma} = 2.622\)) and increased the coefficient of (multiple) determination r.squared (\(R^2 = 0.5979\)).
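
Under the hood, R codes the Brewery factor as treatment-contrast dummy variables, absorbing the baseline level (Bauhaus) into the intercept. As a quick sketch, we can inspect the dummy coding and recover the brewery-specific intercepts:

head(model.matrix(amod))   # Bauhaus rows have all-zero Brewery dummies
b <- amod$coefficients
b["(Intercept)"] + c(Bauhaus = 0, b[-(1:2)])   # e.g., Bent Paddle: 77.8216 + 1.0167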

The anova and Anova functions can be used to test the significance of terms

library(car)
## Loading required package: carData
anova(amod)  # Type I (sequential) SS test
## Analysis of Variance Table
## 
## Response: Rating
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## ABV        1 146.77 146.774 21.3451 5.026e-05 ***
## Brewery    7 211.10  30.157  4.3857  0.001393 ** 
## Residuals 35 240.67   6.876                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(amod)  # Type II SS test
## Anova Table (Type II tests)
## 
## Response: Rating
##            Sum Sq Df F value   Pr(>F)   
## ABV        60.709  1  8.8288 0.005331 **
## Brewery   211.102  7  4.3857 0.001393 **
## Residuals 240.669 35                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that anova tests the effects sequentially (ABV alone, then Brewery given ABV), whereas the Anova function (in the car package) tests the effects conditioned on the other effect (ABV given Brewery, Brewery given ABV). Using the Type II tests from the Anova function, we see that both ABV (\(F_{1,35} = 8.83, p = 0.005\)) and Brewery (\(F_{7,35} = 4.39, p = 0.001\)) significantly add to the prediction of the beer’s Rating.
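
To see the order dependence of the Type I tests, we can refit with the terms reversed; Brewery is then tested alone, and the ABV line (ABV given Brewery) matches the Type II test above:

anova(lm(Rating ~ Brewery + ABV, data = beer))  # Brewery first, then ABV given Brewery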

4.4.3 Interaction Effects

Next we consider a model predicting the Rating from the interaction effects of ABV and Brewery

imod <- lm(Rating ~ ABV * Brewery, data = beer)

Note that the formula Rating ~ ABV * Brewery is shorthand for Rating ~ ABV + Brewery + ABV:Brewery, so this model allows each Brewery to have a unique regression intercept and a unique slope relating ABV and Rating.
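
To confirm the shorthand, fitting the expanded formula produces identical coefficients (a quick check):

imod2 <- lm(Rating ~ ABV + Brewery + ABV:Brewery, data = beer)
all.equal(imod$coefficients, imod2$coefficients)
## [1] TRUE

We can summarize the model using the same approach as before: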

imodsum <- summary(imod)
imodsum
## 
## Call:
## lm(formula = Rating ~ ABV * Brewery, data = beer)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.2905 -1.3040 -0.0471  1.4774  4.9434 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               90.4203    12.4097   7.286 6.22e-08 ***
## ABV                       -0.6525     2.1920  -0.298    0.768    
## BreweryBent Paddle       -17.0655    17.8235  -0.957    0.347    
## BreweryFulton            -18.8852    14.6059  -1.293    0.207    
## BreweryIndeed             -8.1702    14.4123  -0.567    0.575    
## BrewerySteel Toe         -16.6529    16.6037  -1.003    0.324    
## BrewerySummit            -11.2505    15.7834  -0.713    0.482    
## BrewerySurly             -12.0484    14.6792  -0.821    0.419    
## BreweryUrban Growler      -8.6554    19.7273  -0.439    0.664    
## ABV:BreweryBent Paddle     3.2331     3.1819   1.016    0.318    
## ABV:BreweryFulton          2.9580     2.5806   1.146    0.261    
## ABV:BreweryIndeed          1.6240     2.5265   0.643    0.526    
## ABV:BrewerySteel Toe       2.8851     2.7839   1.036    0.309    
## ABV:BrewerySummit          1.7846     2.8071   0.636    0.530    
## ABV:BrewerySurly           2.8236     2.5109   1.125    0.270    
## ABV:BreweryUrban Growler   1.0034     3.4276   0.293    0.772    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.813 on 28 degrees of freedom
## Multiple R-squared:  0.6297, Adjusted R-squared:  0.4313 
## F-statistic: 3.174 on 15 and 28 DF,  p-value: 0.004029

Compared to the additive model, we have slightly increased the residual standard deviation estimate sigma (\(\hat{\sigma} = 2.813\)) and increased the coefficient of (multiple) determination r.squared (\(R^2 = 0.6297\)).
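
In this model, the ABV coefficient is the slope for the baseline brewery (Bauhaus), and each interaction coefficient is the difference between that brewery's slope and the baseline slope. For example (a small sketch using the estimates above):

b <- imod$coefficients
b["ABV"]                           # Bauhaus slope: about -0.65
b["ABV"] + b["ABV:BrewerySurly"]   # Surly slope: -0.6525 + 2.8236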

Use the Anova function to test the significance of the effects

library(car)
Anova(imod)  # Type II SS test
## Anova Table (Type II tests)
## 
## Response: Rating
##              Sum Sq Df F value   Pr(>F)   
## ABV          60.709  1  7.6695 0.009854 **
## Brewery     211.102  7  3.8098 0.005009 **
## ABV:Brewery  19.030  7  0.3434 0.926653   
## Residuals   221.639 28                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The results reveal that the interaction effect is not significant (\(F_{7,28} = 0.34, p = 0.927\)), but the main effects of ABV (\(F_{1,28} = 7.67, p = 0.01\)) and Brewery (\(F_{7,28} = 3.81, p = 0.005\)) are significant at the classic \(\alpha = 0.05\) significance level.
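
One way to see why the interaction is unnecessary is to plot the brewery-specific least-squares lines, which (given the non-significant interaction) should look roughly parallel:

ggplot(beer, aes(x = ABV, y = Rating, color = Brewery)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE)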

4.4.4 Comparing Fit Models

To compare the fit models, we can use the anova function for F-tests

anova(mod, amod, imod)
## Analysis of Variance Table
## 
## Model 1: Rating ~ ABV
## Model 2: Rating ~ ABV + Brewery
## Model 3: Rating ~ ABV * Brewery
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)   
## 1     42 451.77                                
## 2     35 240.67  7    211.10 3.8098 0.005009 **
## 3     28 221.64  7     19.03 0.3434 0.926653   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

or the AIC function to extract Akaike’s information criterion

AIC(mod, amod, imod)
##      df      AIC
## mod   3 233.3420
## amod 10 219.6329
## imod 17 230.0085

In this case, the F-tests and AIC values suggest that the additive model should be preferred. We conclude that each Brewery has a unique baseline Rating, and increasing the ABV by 1% corresponds to an expected 1.59 point increase in the Rating.
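
If desired, the BIC function provides the Bayesian information criterion as an alternative comparison (like AIC, smaller is better, but BIC penalizes model complexity more heavily):

BIC(mod, amod, imod)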

4.5 Exercises

  1. Load the Minnesota Beer Data into R.
  2. Make a scatterplot of the IBU (x-axis) by Rating (y-axis).
  3. Fit a simple linear regression model predicting Rating from IBU.
  4. Is there a significant linear relationship between IBU and Rating?
  5. Plot the linear relationship, along with 95% confidence and prediction intervals.
  6. Fit a multiple linear regression model predicting Rating from the additive effects of IBU and Brewery.
  7. Fit a multiple linear regression model predicting Rating from the additive and interaction effects of IBU and Brewery.
  8. Considering the models you fit in Ex 3, 6, and 7, which do you prefer and why?




Solutions:

# 1: Load the Minnesota Beer Data into R
beer <- read.csv("http://users.stat.umn.edu/~helwig/notes/MNbeer.csv")

# 2: Make a scatterplot of the IBU (x-axis) by Rating (y-axis)
library(ggplot2)   # needed for ggplot
ggplot(beer, aes(x = IBU, y = Rating)) + 
  geom_point()

# 3: Fit a simple linear regression model predicting Rating from IBU
# For every 1 IBU increase, Rating tends to increase by 0.07 points.
my_model <- lm(Rating ~ IBU, data = beer)
coef(my_model)
## (Intercept)         IBU 
## 83.76336277  0.06693905
#4: Is there a significant linear relationship between IBU and Rating?
# Answer: Yes!  Notice that the p-value in the IBU row
# below is 0.0098, well below the usual 0.05 cutoff.  Also, notice
# that the confidence bands below don't include a flat, 0-slope line.
summary(my_model)
## 
## Call:
## lm(formula = Rating ~ IBU, data = beer)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.3822 -2.4669  0.3309  2.0854  9.8856 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 83.76336    1.36811  61.225   <2e-16 ***
## IBU          0.06694    0.02474   2.706   0.0098 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.484 on 42 degrees of freedom
## Multiple R-squared:  0.1485, Adjusted R-squared:  0.1282 
## F-statistic: 7.322 on 1 and 42 DF,  p-value: 0.0098
#5: Plot the linear relationship, along with 95% confidence and prediction intervals
# The confidence intervals reflect the plausible TREND in this relationship.
# The prediction intervals reflect the plausible DEVIATIONS from this trend.
ggplot(beer, aes(x = IBU, y = Rating)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = TRUE)

# prediction bands (this code = yuck)
newdata  <- data.frame(IBU = seq(15, 100, by = 1))
newfit   <- predict(my_model, newdata)
newfitPI <- predict(my_model, newdata, interval = "prediction")
newfitPI <- data.frame(newfitPI)
newfitPI$IBU <- newdata$IBU
ggplot(beer, aes(x = IBU, y = Rating)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE) +
  geom_ribbon(data = newfitPI, aes(x = IBU, ymin = lwr, ymax = upr), 
              fill = "blue", alpha = 0.2, inherit.aes = FALSE) 

#6: Fit a multiple linear regression model predicting Rating from the additive effects of IBU and Brewery.
model_2 <- lm(Rating ~ IBU + Brewery, data = beer)
summary(model_2)
## 
## Call:
## lm(formula = Rating ~ IBU + Brewery, data = beer)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.7732 -1.7126  0.0889  1.2031  5.9521 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          83.98984    1.65525  50.741  < 2e-16 ***
## IBU                   0.05493    0.01969   2.790  0.00848 ** 
## BreweryBent Paddle    1.19330    1.78467   0.669  0.50811    
## BreweryFulton        -2.00670    1.78467  -1.124  0.26849    
## BreweryIndeed         0.46958    1.67917   0.280  0.78139    
## BrewerySteel Toe      1.16761    1.87697   0.622  0.53793    
## BrewerySummit        -1.47248    1.66443  -0.885  0.38237    
## BrewerySurly          5.20257    1.66495   3.125  0.00357 ** 
## BreweryUrban Growler -2.59571    1.78495  -1.454  0.15479    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.654 on 35 degrees of freedom
## Multiple R-squared:  0.5881, Adjusted R-squared:  0.4939 
## F-statistic: 6.246 on 8 and 35 DF,  p-value: 5.108e-05
#7: Fit a multiple linear regression model predicting Rating from the additive and interaction effects of IBU and Brewery.
model_3 <- lm(Rating ~ IBU * Brewery, data = beer)
summary(model_3)
## 
## Call:
## lm(formula = Rating ~ IBU * Brewery, data = beer)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.1925 -1.5054 -0.0439  1.1809  5.4490 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              88.22268    4.88650  18.054   <2e-16 ***
## IBU                      -0.02931    0.09313  -0.315    0.755    
## BreweryBent Paddle       -1.02635    6.56017  -0.156    0.877    
## BreweryFulton            -6.97752    5.59517  -1.247    0.223    
## BreweryIndeed            -2.02179    5.70948  -0.354    0.726    
## BrewerySteel Toe         -4.66505    5.96999  -0.781    0.441    
## BrewerySummit            -6.14258    5.99866  -1.024    0.315    
## BrewerySurly             -1.26059    5.52626  -0.228    0.821    
## BreweryUrban Growler     -4.71814    5.99284  -0.787    0.438    
## IBU:BreweryBent Paddle    0.03848    0.13323   0.289    0.775    
## IBU:BreweryFulton         0.10101    0.10813   0.934    0.358    
## IBU:BreweryIndeed         0.05608    0.10323   0.543    0.591    
## IBU:BrewerySteel Toe      0.11515    0.11102   1.037    0.309    
## IBU:BrewerySummit         0.09249    0.11216   0.825    0.417    
## IBU:BrewerySurly          0.12576    0.10291   1.222    0.232    
## IBU:BreweryUrban Growler  0.03605    0.11883   0.303    0.764    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.814 on 28 degrees of freedom
## Multiple R-squared:  0.6297, Adjusted R-squared:  0.4313 
## F-statistic: 3.174 on 15 and 28 DF,  p-value: 0.00403
ggplot(beer, aes(x = IBU, y = Rating, color = Brewery)) + 
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE)

#8: Considering the models you fit in Ex 3, 6, and 7, which do you prefer and why?
# I prefer the middle (additive) model.  Including Brewery in the model appears
# to improve predictions, but the interaction effects seem to be overkill.