- Implemented support for logistic regression.
- {cuda.ml} installation process will no longer require the presence of cmake v3.21+. If the required version of cmake is absent, then the installation process will use a temporary copy of cmake downloaded from https://github.com/Kitware/CMake/releases.
- Moving forward, Daniel Falbel (https://github.com/dfalbel) will be the new maintainer of {cuda.ml}.
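
The cmake fallback described in the installation entry above can be illustrated with a minimal shell sketch. It is not the package's actual configure logic; the names `version_ge`, `choose_cmake`, and `REQUIRED_CMAKE` are hypothetical and exist only for this illustration. The sketch only decides whether a system cmake is new enough and does not perform any download.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether the system cmake satisfies a minimum
# version. Assumes versions of the form MAJOR.MINOR[.PATCH].

REQUIRED_CMAKE="3.21"  # minimum version named in the changelog entry

# Returns 0 if version $1 >= version $2, comparing major and minor only.
version_ge() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  want_major=${2%%.*}
  want_rest=${2#*.}
  want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] && return 0
  [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Prints "system" if a suitable cmake is on PATH, otherwise "download".
choose_cmake() {
  if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` looks like "cmake version 3.22.1".
    found=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$found" "$REQUIRED_CMAKE"; then
      echo "system"
      return 0
    fi
  fi
  echo "download"
}

echo "cmake source: $(choose_cmake)"
```

When the result is "download", an installer following this approach would fetch a release from the URL given in the entry into a temporary directory and use that copy for the build only.
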
- Addressed feedback from CRAN. Debugging symbols were previously stripped from a DSO to reduce package size. Now debugging symbols are preserved in accordance with the CRAN policy.
- Added support for OLS, ridge regression, and LASSO regression.
- Fixed issue with CUDA architecture string being empty when building {cuda.ml}.
- {cuda.ml} source code was revised to be compatible with `libcuml++` versions 21.06, 21.08, and 21.10.
- Added support for automatically downloading a pre-built version of `libcuml++` and bundling & linking the downloaded `libcuml++` with the rest of the {cuda.ml} installation when no pre-existing copy of `libcuml++` is found. This is done so that new users can try out {cuda.ml} quickly without having to install Conda or to build `libcuml++` from source manually.
- Re-wrote R interfaces of all supervised ML algorithms using {hardhat} to support data-frame, matrix, formula, and recipe inputs, per suggestion from @topepo in #78 and #77.
- Added {parsnip} bindings for random forest, SVM, and KNN models.
- Improved warning message for missing linkage to the RAPIDS CuML shared library. If the C++/CUDA source code of this package was not linked with a valid version of the RAPIDS CuML shared library when the package was installed, then a warning will be emitted whenever the package is loaded.
- Added support for K-Means initialization options (namely, "kmeans++", "random", and "array") and other configuration parameters for K-Means clustering in `cuML`.
- Added 'cuml_log_level' option to `cuml_dbscan()`.
- Implemented R interface for single-linkage agglomerative clustering.
- Implemented R interfaces for Principal Component Analysis (PCA), Truncated Singular Value Decomposition (TSVD), T-distributed Stochastic Neighbor Embedding (T-SNE), Uniform Manifold Approximation and Projection (UMAP), and Random Projection routines in `cuML` (including inverse transformations from lower-dimensional representation to the original feature space when applicable).
- Added R interface for CuML Forest Inference Library (FIL). Users can load any existing XGBoost or LightGBM model using Treelite and use the model to perform high-throughput batch inference using GPU acceleration provided by FIL.
- Implemented R interface for K-Nearest Neighbor (KNN) classification and regression.
- Fixed a missing `cudaEventRecord()` call in `cuml4r::async_copy()`.
- Added `ellipsis::check_dots_used()` checks for all `...` parameters in R.
- Renamed this package from {cuml4r} to {cuml} per suggestion from @lorenzwalthert (context: #75). The new name is shorter, and more importantly, is consistent with the mlverse naming convention for R packages (e.g., {keras}, {tensorflow}, {torch}, {tabnet}, etc.).
- Implemented R interfaces for single-GPU versions of DBSCAN and K-Means clustering algorithms from `cuML`.
- Implemented R interfaces for `cuML` Random Forest classification and regression routines.
- Implemented R interfaces for `cuML` Support Vector Machine classifier and regressor.
- Support for SVM multi-class classification was implemented using the one-vs-rest strategy (as the SVM classifier from `cuML` currently only supports binary classification).
- Included suggestions on how to build and install `cuML` libraries from source with or without multi-GPU support in https://github.com/yitao-li/cuml-installation-notes. All suggestions are known to be working for RAPIDS cuML version 21.08. Please note the building-from-source option is more for advanced use cases that require customizations of RAPIDS cuML libraries' build parameters, compilers, etc., and is somewhat time-consuming and not as beginner-friendly as installing `cuML` directly from Conda.
- Found and fixed a few typos and inconsistencies.
- Some examples were simplified.
- Added documentation for `predict()` functions per suggestion from @topepo in #80.
- Configuration script was revised to work with RAPIDS cuML libraries installed via Conda or built from source. If RAPIDS cuML libraries could not be located during the configuration process, then a warning message will be emitted.
- Improved on the initial prototype of {cuml} by utilizing modern C++ constructs from `thrust` (https://github.com/NVIDIA/thrust), making the C++ source code of this project more readable and maintainable.
- Formatted all human-written C++ source code with clang-format and all human-written R source code with `styler`. Rcpp-generated C++ and R source files will not be formatted.
- Caching of build artifacts using `ccache` can be enabled by setting the env variable CUML4R_ENABLE_CCACHE (e.g., one can run `R CMD build cuml` followed by `CUML4R_ENABLE_CCACHE=1 R CMD INSTALL cuml_0.1.0.tar.gz` to avoid re-compiling the same artifacts across builds). Note that this feature is intended for {cuml} contributors or advanced users who need to build {cuml} frequently, and is not enabled by default for other users.
- Some larger cpp files were split into more granular ones for faster build speed (if parallel build is enabled) and also greater maintainability.