Merlin: HugeCTR V3.5
What's New in Version 3.5
-
HPS interface encapsulation and exporting as library: We encapsulate the Hierarchical Parameter Server (HPS) interfaces and deliver them as a standalone library. We also provide HPS Python APIs and demonstrate their usage in a notebook. For more information, please refer to Hierarchical Parameter Server and HPS Demo.
-
Hierarchical Parameter Server Triton Backend: The HPS Backend is a framework for looking up embedding vectors in large-scale embedding tables. It is designed to use GPU memory effectively and accelerates lookups by decoupling the embedding tables and embedding cache from the end-to-end inference pipeline of the deep recommendation model. For more information, please refer to Hierarchical Parameter Server.
-
SOK pip release: SOK is now released on PyPI at https://pypi.org/project/merlin-sok/. Users can now install SOK via `pip install merlin-sok`.
-
Joint loss and multi-task training support: We support joint loss in training so that users can train with multiple labels and tasks with different weights. An MMoE sample is added to show the usage.
-
HugeCTR documentation on web page: Users can now visit our web documentation.
-
ONNX converter enhancement: We enable converting `MultiCrossEntropyLoss` and `CrossEntropyLoss` layers to ONNX to support multi-label inference. For more information, please refer to HugeCTR to ONNX Converter.
-
HDFS Python API enhancement:
- Simplified `DataSourceParams` so that users do not need to provide all the paths before they are actually needed. Users now only have to pass `DataSourceParams` once, when creating a solver.
- Paths given later are automatically treated as local paths or HDFS paths, depending on the `DataSourceParams` setting. See the notebook for usage.
-
HPS performance optimization: We use a better method to determine the partition number in the HPS database backends.
-
Bug fixing:
- The HugeCTR input layer can now take a `dense_dim` greater than 1000.
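
To make the joint loss item above more concrete, the following is a minimal sketch of how a weighted joint loss over several tasks can be computed. It is plain Python and does not use the HugeCTR API. The function names, the example tasks, the weights, and the choice of binary cross-entropy as the per-task loss are all assumptions made for illustration. These notes do not state whether HugeCTR normalizes the task weights, so the sketch uses them as given.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy for one task (illustrative helper)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def joint_loss(task_labels, task_probs, task_weights):
    """Weighted sum of the per-task losses (weights used as given)."""
    return sum(
        w * binary_cross_entropy(y, p)
        for y, p, w in zip(task_labels, task_probs, task_weights)
    )


# Two hypothetical tasks evaluated on the same three samples.
labels = [[1, 0, 1], [0, 0, 1]]
probs = [[0.9, 0.2, 0.7], [0.1, 0.3, 0.6]]
weights = [0.7, 0.3]  # assumed per-task weights

loss = joint_loss(labels, probs, weights)
print(f"joint loss: {loss:.4f}")
```

Each task contributes its own loss, and the weights control how strongly each task influences the combined value that training minimizes.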