[WIP] [R] Replace vignettes and examples #11123
Thanks for the I know nothing about I can at least tell you how we're addressing these things in LightGBM, maybe it will be helpful. In
output:
  markdown::html_format:
    options:
      toc: true
      number_sections: true
vignette: >
  %\VignetteIndexEntry{Basic Walkthrough}
  %\VignetteEngine{knitr::knitr}
  %\VignetteEncoding{UTF-8}
It's a little awkward that the R docs are not technically part of the readthedocs site, but overall this has worked pretty well for us. Some notes:
Even if you don't pursue this specific mix, I do recommend not checking the
Thanks for the suggestions, but while LightGBM's solution works in the sense of providing rendered artifacts, I don't think it's an ideal solution here: the doc page URL differs from the Sphinx one and is not as easily indexable/searchable as an embedded .md or .ipynb file in the same Sphinx page. I've added a script and instructions to update the Still waiting for comments from @mayer79, if there are any.
Let me update the branch to prevent the CI from hanging on the mgpu tests. It's an AWS instance.
Sure, those are good points. You have to balance whether those benefits you listed are worth the added build complexity, heavier set of dependencies, and increased risk of CRAN rejections. Not my call to make here in
ref #9810
closes #10746
This PR adds a new introductory vignette which replaces most of the previous ones, and modifies the code examples throughout functions aimed at interactive usage to call xgboost() instead of xgb.train().
Motivation
Since XGBoost was first published on CRAN, its adoption and mindshare have risen substantially, to the point that it has become the standard when it comes to boosted decision trees. In this day and age, I don't think the package needs to provide any introduction to the concepts of gradient boosting, cross-validation, evaluation metrics, and so on. People who use R are already going to be familiar with those, and the things it compares against (like the package 'gbm') have become obsolete by now.
Also, the documentation and tutorials for XGBoost have mostly moved to the online docs. Any R-specific documents become outdated rather soon, and are less likely to be seen by a random user. Most of the Python examples and guides should in any event work with the R interface with very minimal modifications like dict->list.
Apart from becoming a standard-use library, the features supported by XGBoost have expanded over time, and lots of the materials that were there before, such as the first vignette, contained tips that are not applicable to the current state of the library, like manually one-hot encoding categorical features.
Hence, I decided to remove the previous vignettes and create a new one from scratch, which contains only examples around the usage of the R interface and its conventions.
Help needed
It would be ideal if this vignette could also get added to the online docs.
Thus, I created the vignette as a Quarto file (.qmd), which has the option to render to both .html (what CRAN hosts) and .md (which can be included in the .rst files).
However, getting it to render to .md required building the vignette with Jupyter instead of knitr, which in turn requires installing Python, jupyter, ipykernel, and the "ir" kernel that runs R in Jupyter, plus registering that kernel in the user-level Jupyter config. Adding the line "jupyter: ir" additionally makes the default Quarto render (e.g. as used by the "knit" button in RStudio) build the .html vignette using Jupyter instead of knitr, which is most definitely not going to work on CRAN servers. I don't know how to solve this.
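For reference, here is a minimal sketch of what a Quarto front matter declaring both output formats and the "ir" kernel could look like. The title and the choice of "gfm" as the markdown flavor are assumptions for illustration; the actual header used in this PR may differ.

```yaml
---
# Hypothetical title, for illustration only
title: "Introduction to the R interface"
format:
  html: default   # the HTML output that CRAN would host
  gfm: default    # GitHub-flavored markdown for the online docs (assumed flavor)
# Selects the Jupyter engine with the R kernel ("ir").
# This is the line that also makes the default render stop using knitr.
jupyter: ir
---
```

With a header along these lines, something like `quarto render <file>.qmd --to gfm` should build only the .md output, while a plain `quarto render` would build every listed format through Jupyter, which is the behavior described above.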
It would also be nice if a CI job could automatically build the .md file for the online docs from the .qmd source of the vignette.
(CCing @mayer79 and @jameslamb )