In ?recipes, the data passed to recipe() in the example is the training split. But it is unclear what should be passed when the data is split using vfold_cv(). Is it the entire dataset? The docs seem to suggest the actual data doesn't matter for the recipe, just the column names.
We typically don't use a vfold_cv fold object directly with a recipe() object. Packages like tune know how to handle both recipes and vfold_cv objects.
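As a rough illustration of that division of labour, here is a minimal sketch in which the recipe is defined once and tune applies it to each fold. The mtcars data, the 80/20 split, the normalization step, and the linear model are all assumptions chosen for illustration. Passing the training set to recipe() is also just one reasonable choice here, not something the thread prescribes.

```r
library(tidymodels)  # attaches recipes, rsample, parsnip, workflows, tune

set.seed(123)

# Hold out a test set, then resample only the training portion (illustrative choice).
split <- initial_split(mtcars, prop = 0.8)
train <- training(split)
folds <- vfold_cv(train, v = 5)

# The recipe is defined once; it is not given the folds directly.
rec <- recipe(mpg ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

# A workflow bundles the recipe with a model specification.
wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(linear_reg())

# fit_resamples() preps the recipe on each fold's analysis set
# and evaluates the model on the matching assessment set.
res <- fit_resamples(wf, resamples = folds)
collect_metrics(res)
```

In this setup the recipe is re-estimated inside every fold, so the normalization statistics never see the assessment rows.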
The dataset passed to recipe() does a couple of things. It is used to record the column names and their types. This information is later used to detect whether some variables are missing or have the wrong types.
When you prep() a recipe, it uses that data by default unless the training argument of prep() has been set.
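The two roles of the data described above can be seen in a small sketch. The mtcars data, the normalization step, and the 20-row subset are assumptions made purely for illustration.

```r
library(recipes)

# recipe() records column names, types, and roles from the data it is given.
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors())

summary(rec)  # one row per variable: name, type, role, source

# With no `training` argument, prep() estimates the step
# from the data that was passed to recipe().
prepped_default <- prep(rec)

# With `training` set, prep() estimates from that data instead
# (here an arbitrary subset of rows, chosen only for the example).
subset_rows <- mtcars[1:20, ]
prepped_subset <- prep(rec, training = subset_rows)

# The estimated means and standard deviations differ between the two.
tidy(prepped_default, number = 1)
tidy(prepped_subset, number = 1)
```

Comparing the two tidy() outputs shows that the data given to recipe() only affects the estimates when prep() is left to fall back on it.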
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.
The resampling article at https://www.tidymodels.org/start/resampling/ does not show how to use a recipe, only a formula.