── Attaching packages ────────────────────────────────────── tidymodels 1.1.1 ──
✔ broom 1.0.5 ✔ recipes 1.0.8
✔ dials 1.2.0 ✔ rsample 1.2.0
✔ dplyr 1.1.3 ✔ tibble 3.2.1
✔ ggplot2 3.4.4 ✔ tidyr 1.3.0
✔ infer 1.0.5 ✔ tune 1.1.2
✔ modeldata 1.2.0 ✔ workflows 1.1.3
✔ parsnip 1.1.1 ✔ workflowsets 1.0.1
✔ purrr 1.0.2 ✔ yardstick 1.2.0
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ purrr::discard() masks scales::discard()
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
✖ recipes::step() masks stats::step()
• Learn how to get started at https://www.tidymodels.org/start/
Exercise
Fit a simple predictive model based on a decision tree.
Model formula: am ~ .
(dataset: mtcars)
Report the model's performance (ROC AUC).
Hints:
- Tune the tree's complexity parameter.
- Use \(v=2\)-fold cross-validation (because the sample is so small).
- Follow the usual guidelines.
Solution
Setup
library(tidymodels)
data(mtcars)
library(tictoc) # timing
For classification, tidymodels requires a nominal (factor) outcome variable, not a numeric one:
mtcars <-
  mtcars %>%
  mutate(am = factor(am))
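The order of the factor levels matters later on: yardstick's class-probability metrics, including roc_auc(), treat the first level as the event by default. A minimal base-R sketch for inspecting the levels follows. The labelled variant at the end is an optional assumption (labels taken from the mtcars help page: 0 = automatic, 1 = manual) and is not part of the solution.

```r
data(mtcars)

# Convert the 0/1 transmission indicator to a factor, as in the step above
am_factor <- factor(mtcars$am)

levels(am_factor)   # level order; the first level is "0"
table(am_factor)    # class counts per level

# Optional, illustrative only: readable labels instead of "0"/"1"
am_labelled <- factor(mtcars$am, levels = c(0, 1),
                      labels = c("automatic", "manual"))
table(am_labelled)
```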
Split the data
d_split <- initial_split(mtcars)
d_train <- training(d_split)
d_test <- testing(d_split)
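initial_split() draws rows at random, so the numbers reported further down change from run to run unless a seed is set. Here is a hedged sketch of a reproducible, stratified split. The seed value 42 and the use of strata = am are assumptions added for illustration and are not part of the original solution.

```r
library(rsample)

data(mtcars)
mtcars$am <- factor(mtcars$am)

set.seed(42)  # arbitrary seed, chosen only for illustration
d_split <- initial_split(mtcars, prop = 0.75, strata = am)
d_train <- training(d_split)
d_test  <- testing(d_split)

# Check the resulting sizes and the class balance in the training set
nrow(d_train)
nrow(d_test)
table(d_train$am)
```

Stratifying on am keeps both classes represented in the training and test sets, which is worth considering with only 32 rows.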
Model(s)
mod_tree <-
  decision_tree(mode = "classification",
                cost_complexity = tune())
Recipe(s)
rec1 <-
  recipe(am ~ ., data = d_train)
Resampling
rsmpl <- vfold_cv(d_train, v = 2)
Workflow
wf1 <-
  workflow() %>%
  add_recipe(rec1) %>%
  add_model(mod_tree)
Tuning/Fitting
fit1 <-
  tune_grid(object = wf1,
            resamples = rsmpl)
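When no grid argument is given, tune_grid() generates ten candidate values on its own. If explicit control over the candidates is wanted, a grid can be built with dials. The sketch below is one possible choice, not the author's setup; the name cc_grid and the five levels are assumptions.

```r
library(dials)

# Regular grid over the default range of cost_complexity(),
# which is defined on a log10 scale from 1e-10 to 1e-1
cc_grid <- grid_regular(cost_complexity(), levels = 5)
cc_grid
```

Such a grid could then be passed along as tune_grid(object = wf1, resamples = rsmpl, grid = cc_grid).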
Best candidate
show_best(fit1)
Warning: No value of `metric` was given; metric 'roc_auc' will be used.
# A tibble: 5 × 7
cost_complexity .metric .estimator mean n std_err .config
<dbl> <chr> <chr> <dbl> <int> <dbl> <chr>
1 0.0000000194 roc_auc binary 0.5 2 0 Preprocessor1_Model01
2 0.00000000417 roc_auc binary 0.5 2 0 Preprocessor1_Model02
3 0.000107 roc_auc binary 0.5 2 0 Preprocessor1_Model03
4 0.000522 roc_auc binary 0.5 2 0 Preprocessor1_Model04
5 0.00000542 roc_auc binary 0.5 2 0 Preprocessor1_Model05
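A mean ROC AUC of exactly 0.5 with zero standard error for every candidate suggests that the trees did not split at all during resampling. One plausible explanation (an assumption, not verified against the original run): with the default 3/4 split the training set has 24 rows, so each of the two analysis sets has only 12, which is below rpart's default minsplit of 20. The sketch below uses rpart directly, with the first 12 rows of mtcars standing in for one analysis set.

```r
library(rpart)

data(mtcars)
mtcars$am <- factor(mtcars$am)

# Stand-in for one analysis set of a 2-fold CV on 24 training rows
small <- mtcars[1:12, ]

# Default control: a node needs at least 20 rows before a split is tried
fit_default <- rpart(am ~ ., data = small, method = "class")
nrow(fit_default$frame)  # 1: the tree consists of the root node only

# Lower minsplit so that small nodes may be split
fit_small <- rpart(am ~ ., data = small, method = "class",
                   control = rpart.control(minsplit = 4))
nrow(fit_small$frame)    # number of nodes in the resulting tree
```

In the workflow above, the corresponding knob would be the min_n argument of decision_tree(), which parsnip passes to rpart as minsplit.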
Finalize
wf1_finalized <-
  wf1 %>%
  finalize_workflow(select_best(fit1))
Warning: No value of `metric` was given; metric 'roc_auc' will be used.
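The warning appears because select_best() was called without a metric. Naming the metric explicitly removes the ambiguity and also documents which criterion drives the choice. The following is a compact, self-contained sketch of the steps so far with an explicit metric. The seed, the stratification, and the object names wf, tuned, best and wf_final are assumptions made for illustration.

```r
library(tidymodels)

data(mtcars)
mtcars <- mtcars %>% mutate(am = factor(am))

set.seed(42)  # arbitrary seed for illustration
d_split <- initial_split(mtcars, strata = am)
rsmpl   <- vfold_cv(training(d_split), v = 2, strata = am)

wf <- workflow() %>%
  add_recipe(recipe(am ~ ., data = training(d_split))) %>%
  add_model(decision_tree(mode = "classification",
                          cost_complexity = tune()))

# Compute only ROC AUC during tuning
tuned <- tune_grid(wf, resamples = rsmpl,
                   metrics = metric_set(roc_auc))

# Name the metric explicitly so no default has to be guessed
best     <- select_best(tuned, metric = "roc_auc")
wf_final <- finalize_workflow(wf, best)
```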
Last Fit
final_fit <-
  last_fit(object = wf1_finalized, d_split)
collect_metrics(final_fit)
# A tibble: 2 × 4
.metric .estimator .estimate .config
<chr> <chr> <dbl> <chr>
1 accuracy binary 0.875 Preprocessor1_Model1
2 roc_auc binary 0.75 Preprocessor1_Model1
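For intuition about the reported metric: ROC AUC equals the probability that a randomly chosen case from the event class receives a higher score than a randomly chosen case from the other class, with ties counted as one half. Below is a small base-R sketch of that definition. The function auc_by_pairs() and the toy vectors are hypothetical and only illustrate the idea; they are not yardstick's implementation.

```r
# Pairwise definition of ROC AUC: share of (event, non-event) pairs
# in which the event case has the higher score; ties count as 0.5
auc_by_pairs <- function(is_event, score) {
  pos  <- score[is_event]
  neg  <- score[!is_event]
  wins <- outer(pos, neg, ">")
  ties <- outer(pos, neg, "==")
  mean(wins + 0.5 * ties)
}

is_event <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

# 8 of the 9 pairs are ranked correctly
auc_by_pairs(is_event, c(0.9, 0.8, 0.4, 0.6, 0.2, 0.1))

# A constant score cannot rank anything: every pair is a tie, giving 0.5
auc_by_pairs(is_event, rep(0.5, 6))
```

The second call shows why a model that predicts the same probability for every case ends up at an AUC of 0.5.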