Merlin: HugeCTR V3.2
What's New in Version 3.2
-
New HugeCTR to ONNX Converter: We’re introducing a new HugeCTR to ONNX converter in the form of a Python package. All graph configuration files are required and model weights must be formatted as inputs. You can specify where you want to save the converted ONNX model. You can also convert sparse embedding models. For more information, refer to HugeCTR to ONNX Converter and HugeCTR2ONNX Demo Notebook.
-
New Hierarchical Storage Mechanicsm on the Parameter Server (POC): We’ve implemented a hierarchical storage mechanism between local SSDs and CPU memory. As a result, embedding tables no longer have to be stored in the local CPU memory. The distributed Redis cluster is being implemented as a CPU cache to store larger embedding tables and interact with the GPU embedding cache directly. The local RocksDB serves as a query engine to back up the complete embedding table on the local SSDs and assist the Redis cluster with looking up missing embedding keys. Please find more information here
-
Parquet Format Support Within the Data Generator: The HugeCTR data generator now supports the parquet format, which can be configured easily using the Python API. For more information, refer to Data Generator API.
-
Python Interface Support for the Data Generator: The data generator has been enabled within the HugeCTR Python interface. The parameters associated with the data generator have been encapsulated into the
DataGeneratorParams
struct, which is required to initialize theDataGenerator
instance. You can use the data generator's Python APIs to easily generate the Norm, Parquet, or Raw dataset formats with the desired distribution of sparse keys. For more information, refer to Data Generator API and Data Generator Samples. -
Improvements to the Formula of the Power Law Simulator within the Data Generator: We've modified the formula of the power law simulator within the data generator so that a positive alpha value is always produced, which will be needed for most use cases. The alpha values for
Long
,Medium
, andShort
within the power law distribution are 0.9, 1.1, and 1.3 respectively. For more information, refer to Data Generator API. -
Support for Arbitrary Input and Output Tensors in the Concat and Slice Layers: The Concat and Slice layers now support any number of input and output tensors. Previously, these layers were limited to a maximum of four tensors.
-
New Continuous Training Notebook: We’ve added a new notebook to demonstrate how to perform continuous training using the model oversubscription (also referred to as Embedding Training Cache) feature. For more information, refer to HugeCTR Continuous Training.
-
New HugeCTR Contributor Guide: We've added a new HugeCTR Contributor Guide that explains how to contribute to HugeCTR, which may involve reporting and fixing a bug, introducing a new feature, or implementing a new or pending feature.
-
Enhancements to Sparse Operation Kits (SOK): SOK now supports TensorFlow 2.5 and 2.6. We also added support for identity hashing, dynamic input, and Horovod within SOK. Lastly, we added a new SOK docs set to help you get started with SOK.
-
Supporting Arbitrary Number of Inputs in Concat Layer and Slice Layer: The Concat and Slice layers now support any number of input and output tensors, respectively. Previously, these layers would be limited to a maximum of 4 tensors.
-
Fix power law in Data Generator (Generalize the power law simulator in Data Generator): We’ve modified the formula of the power law simulator to make for the positive alpha value, which is more general in different use cases. Besides, the alpha values for
Long
,Medium
andShort
of power law distribution are 0.9, 1.1 and 1.3 respectively. For more information, see Data Generator API.