penguins-lm

lm
en
regression
penguins
Published

September 12, 2024

1 Exercise

Consider the dataset penguins. Fit a linear model with body mass as the outcome (dependent variable, DV) and flipper length as the predictor (independent variable, IV).

  1. Report the coefficients and interpret them.
  2. Plot the model and the coefficients.
  3. Report the model fit (R squared).
  4. BONUS: Use predict() to estimate the weight of an animal with average flipper length. Search online for examples if you need help.

2 Solution

2.1 Setup

library(tidyverse)
library(easystats)
library(ggpubr)

# import data:
penguins <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv")

2.2 Let’s go

lm1 <- lm(body_mass_g ~ flipper_length_mm, data = penguins)

Plot the model:

plot(estimate_relation(lm1))

Alternative plotting method:

ggscatter(penguins,
          x = "flipper_length_mm",
          y = "body_mass_g",
          add ="reg.line")

Coefficients (parameters):

parameters(lm1)
Parameter         | Coefficient |     SE |               95% CI | t(340) |      p
---------------------------------------------------------------------------------
(Intercept)       |    -5780.83 | 305.81 | [-6382.36, -5179.30] | -18.90 | < .001
flipper length mm |       49.69 |   1.52 | [   46.70,    52.67] |  32.72 | < .001
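Interpretation: the slope is the expected difference in body mass (in grams) per additional millimetre of flipper length. The intercept is the predicted body mass at a flipper length of 0 mm, which lies far outside the observed data and has no biological meaning, hence the negative value. A minimal sketch of that reading, using the rounded coefficients from the table above; `predict_mass` is a hypothetical helper, not part of any package:

```r
# Coefficients copied from the parameters(lm1) table (rounded)
b0 <- -5780.83  # intercept, in grams
b1 <- 49.69     # slope, in grams per mm of flipper length

# Hypothetical helper: predicted body mass (g) for a flipper length (mm)
predict_mass <- function(flipper_mm) b0 + b1 * flipper_mm

# Predicted body mass for a 200 mm flipper
predict_mass(200)

# Expected difference between two penguins whose flippers differ by 10 mm
predict_mass(210) - predict_mass(200)
```

A 10 mm difference in flipper length thus corresponds to roughly 500 g of body mass. Because the coefficients are rounded, the value for 200 mm can differ from the result of predict() by about a gram.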

Plot the coefficients:

plot(parameters(lm1))

Model fit (proportion of variance explained by the model):

r2(lm1)
# R2 for Linear Regression
       R2: 0.759
  adj. R2: 0.758
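So flipper length accounts for about 76% of the variance in body mass. In a regression with a single predictor, R squared equals the squared Pearson correlation between predictor and outcome. A short sketch, taking the rounded R squared from the output above, recovers the implied correlation (the variable names are illustrative):

```r
r2_value <- 0.759            # R2 as reported by r2(lm1), rounded

# With one predictor, R2 = r^2; the slope is positive, so take the positive root
r_implied <- sqrt(r2_value)
round(r_implied, 2)
```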

To predict the weight of an average animal, first compute the average flipper length:

penguins |> 
  summarise(flipper_length_mm_avg = 
              mean(flipper_length_mm, na.rm = TRUE))
  flipper_length_mm_avg
1                   201

2.3 For average flipper length, what’s the expected weight?

predict(lm1, newdata = data.frame(flipper_length_mm = 200))
   1 
4156 

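About 4.2 kg (4156 g). Note that the call uses 200 mm, a round value slightly below the mean of about 201 mm. At the exact mean, the prediction is about 4202 g, which is the intercept of the centered model in section 2.4.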

2.4 Centering the data

Center the data:

penguins_c <-
  penguins |> 
  mutate(flipper_length_mm_c = center(flipper_length_mm))

Now the mean value is (nearly) zero:

mean(penguins_c$flipper_length_mm_c, na.rm = TRUE)
[1] -1.2e-14
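The result is not exactly zero because of floating-point rounding. As a rough illustration of what centering does, assuming center() subtracts the mean of the non-missing values (which the output above suggests), here is a base-R sketch on a small made-up vector; the values are hypothetical, not taken from the dataset:

```r
# Hypothetical flipper lengths in mm, including one missing value
x <- c(181, 195, NA, 210)

# Centering by hand: subtract the mean of the non-missing values
x_c <- x - mean(x, na.rm = TRUE)

x_c                      # NA stays NA; the rest are deviations from the mean
mean(x_c, na.rm = TRUE)  # zero, up to floating-point error
```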

Run the model again:

lm2 <- lm(body_mass_g ~ flipper_length_mm_c, data = penguins_c)

parameters(lm2)
Parameter           | Coefficient |    SE |             95% CI | t(340) |      p
--------------------------------------------------------------------------------
(Intercept)         |     4201.75 | 21.32 | [4159.82, 4243.69] | 197.08 | < .001
flipper length mm c |       49.69 |  1.52 | [  46.70,   52.67] |  32.72 | < .001
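
Centering leaves the slope unchanged and only shifts the intercept: it now gives the expected body mass at the mean flipper length (about 4202 g) rather than at 0 mm. Since the centered intercept equals the raw intercept plus the slope times the mean flipper length, the two tables together imply that mean. A quick sketch with the rounded coefficients from both tables (variable names are illustrative):

```r
b0_raw      <- -5780.83  # intercept of lm1 (flipper length in mm)
b0_centered <-  4201.75  # intercept of lm2 (centered flipper length)
b1          <-    49.69  # slope, identical in both models

# intercept_centered = intercept_raw + slope * mean(flipper length),
# so the mean flipper length can be recovered from the coefficients
mean_flipper <- (b0_centered - b0_raw) / b1
round(mean_flipper)
```

The recovered value matches the average of about 201 mm computed in the uncentered analysis, up to rounding of the coefficients.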