ames-kaggle1

regression
data
kaggle
string
Published

June 1, 2023

Exercise

Fit a simple linear model for the Ames House Price Kaggle competition.

Hints:

Solution

Load packages

library(tidyverse)
library(easystats)

Import data

d_train_path_online <- "https://raw.githubusercontent.com/sebastiansauer/Lehre/main/data/ames-kaggle/train.csv"
d_test_path_online <- "https://raw.githubusercontent.com/sebastiansauer/Lehre/main/data/ames-kaggle/test.csv"
d_train <- read_csv(d_train_path_online)
Rows: 1460 Columns: 81
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (43): MSZoning, Street, Alley, LotShape, LandContour, Utilities, LotConf...
dbl (38): Id, MSSubClass, LotFrontage, LotArea, OverallQual, OverallCond, Ye...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
d_test <- read_csv(d_test_path_online)
Rows: 1459 Columns: 80
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (43): MSZoning, Street, Alley, LotShape, LandContour, Utilities, LotConf...
dbl (37): Id, MSSubClass, LotFrontage, LotArea, OverallQual, OverallCond, Ye...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Define the model

m1 <- lm(SalePrice ~ OverallQual, data = d_train)
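
To see what a model of this form estimates, here is a minimal sketch on a tiny made-up data set. The data frame `toy_train` and the object `m_toy` are hypothetical and are not part of the Ames data; only the formula `SalePrice ~ OverallQual` is taken from the solution. The same calls should work on `m1`.

```r
# Toy data (made up for illustration, not taken from the Ames data set)
toy_train <- data.frame(
  OverallQual = c(3, 5, 6, 7, 9),
  SalePrice   = c(90000, 130000, 160000, 200000, 320000)
)

# Same model formula as m1, fitted on the toy data
m_toy <- lm(SalePrice ~ OverallQual, data = toy_train)

# Intercept and slope: the slope is the expected change in SalePrice
# for a one-point increase in OverallQual
coef(m_toy)

# Share of the variance in SalePrice that the model explains
summary(m_toy)$r.squared
```

With a single predictor, the fitted model is a straight line, so these two coefficients describe it completely.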

Predict new data

m1_pred <- predict(m1, newdata = d_test)
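
As a rough illustration of what `predict()` does for a model like this, the sketch below recomputes the predictions by hand on made-up data. The objects `toy_train`, `toy_test`, and `m_toy` are hypothetical stand-ins for `d_train`, `d_test`, and `m1`.

```r
# Toy data (made up for illustration, not taken from the Ames data set)
toy_train <- data.frame(
  OverallQual = c(3, 5, 6, 7, 9),
  SalePrice   = c(90000, 130000, 160000, 200000, 320000)
)
toy_test <- data.frame(Id = 101:104, OverallQual = c(5, 7, 5, 8))

m_toy <- lm(SalePrice ~ OverallQual, data = toy_train)
toy_pred <- predict(m_toy, newdata = toy_test)

# predict() evaluates intercept + slope * OverallQual for each new row
b <- coef(m_toy)
manual_pred <- b[["(Intercept)"]] + b[["OverallQual"]] * toy_test$OverallQual
all.equal(unname(toy_pred), manual_pred)

# Rows with the same OverallQual receive the same prediction
toy_pred[1] == toy_pred[3]

# A missing predictor value would produce an NA prediction, so check for that
anyNA(toy_pred)
```

Because the model uses only one predictor, all houses that share an `OverallQual` value get an identical predicted price.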

Submit the data

d_subm <-
  d_test %>% 
  select(Id) %>% 
  mutate(SalePrice = m1_pred)

head(d_subm)
# A tibble: 6 × 2
     Id SalePrice
  <dbl>     <dbl>
1  1461   130973.
2  1462   176409.
3  1463   130973.
4  1464   176409.
5  1465   267280.
6  1466   176409.
write_csv(d_subm, file = "einreichen-kaggle-modell1-yeah.csv")
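
Before uploading, it can help to sanity-check the file. The following is a minimal sketch that uses base R and a made-up stand-in for `d_subm`; the object names `toy_subm`, `tmp`, and `check` are hypothetical, and the checks reflect the assumption that the submission should contain exactly the columns `Id` and `SalePrice`, with one complete row per test house.

```r
# Hypothetical stand-in for d_subm (made-up values)
toy_subm <- data.frame(
  Id        = 1461:1464,
  SalePrice = c(131000, 176000, 131000, 176000)
)

# Write to a temporary file and read it back as a round-trip check
tmp <- tempfile(fileext = ".csv")
write.csv(toy_subm, tmp, row.names = FALSE)
check <- read.csv(tmp)

# The file should have exactly the columns Id and SalePrice, one row per
# test house, unique Ids, and no missing or non-positive prices
stopifnot(
  identical(names(check), c("Id", "SalePrice")),
  nrow(check) == nrow(toy_subm),
  !anyDuplicated(check$Id),
  !anyNA(check$SalePrice),
  all(check$SalePrice > 0)
)
```

For the real submission, the row count would be compared against `nrow(d_test)`, which is 1459 here.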

Categories:

  • regression
  • ames
  • kaggle
  • string