
HugeCTR to ONNX Converter

hugectr2onnx is a Python package that converts HugeCTR models to the ONNX format. Because Open Neural Network Exchange (ONNX) is an open-source format for AI models, the conversion improves the compatibility of HugeCTR with other deep learning frameworks.

To use the ONNX converter, you need the HugeCTR model files and the graph configuration file, both of which can be obtained with the HugeCTR training APIs. Please refer to HugeCTR Python Interface to get familiar with how to train and save your models. To learn the workflow of training a HugeCTR model, converting it to ONNX, and running inference with ONNX Runtime, please refer to HugeCTR to ONNX Demo Notebook.

Overview

HugeCTR Model Format

A model trained with HugeCTR consists of a dense model and one or more sparse models. A single binary dense model file stores the weights of all the dense layers. There is one sparse model folder per embedding table, and each folder contains binary files for the keys and the embedding vectors. All these model files are saved automatically according to the snapshot and snapshot_prefix values passed to the HugeCTR training API hugectr.Model.fit(). A graph configuration JSON file, which can be generated with the HugeCTR training API hugectr.Model.graph_to_json(), is required to load these binary model files correctly. Here is an example of HugeCTR trained models and graph configuration JSON file:

wdl_model
├── wdl0_sparse_2000.model
│   ├── emb_vector
│   └── key
├── wdl1_sparse_2000.model
│   ├── emb_vector
│   └── key
├── wdl_dense_2000.model
└── wdl.json
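
The layout above can be collected programmatically before conversion. The following is a minimal sketch, not part of hugectr2onnx: the helper name discover_model_files and the file name patterns are assumptions inferred from the wdl_model example, and real snapshot directories may hold several iterations.

```python
import re
from pathlib import Path


def _table_index(path):
    """Extract the embedding table index from names like wdl1_sparse_2000.model."""
    match = re.search(r"(\d+)_sparse_", path.name)
    return int(match.group(1)) if match else -1


def discover_model_files(model_dir):
    """Collect the graph config, dense model file and sparse model folders."""
    root = Path(model_dir)
    # Assumed naming, inferred from the example tree: <prefix>.json,
    # <prefix>_dense_<iter>.model and <prefix><i>_sparse_<iter>.model.
    graph_configs = sorted(root.glob("*.json"))
    dense_models = sorted(p for p in root.glob("*_dense_*.model") if p.is_file())
    sparse_models = sorted(
        (p for p in root.glob("*_sparse_*.model") if p.is_dir()),
        key=_table_index,  # numeric order, so table 10 sorts after table 2
    )
    if len(graph_configs) != 1 or len(dense_models) != 1:
        raise ValueError("expected exactly one graph config and one dense model file")
    for folder in sparse_models:
        # Each sparse model folder should hold the key and emb_vector binaries.
        missing = [n for n in ("key", "emb_vector") if not (folder / n).is_file()]
        if missing:
            raise ValueError(f"{folder} is missing {missing}")
    return {
        "graph_config": str(graph_configs[0]),
        "dense_model": str(dense_models[0]),
        "sparse_models": [str(p) for p in sparse_models],
    }
```

Calling discover_model_files("wdl_model") on the example directory would return the path of wdl.json, the path of wdl_dense_2000.model, and the two sparse model folders in table order.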

ONNX Converter Features

The HugeCTR to ONNX converter parses the graph configuration JSON file layer by layer and constructs an equivalent ONNX graph from ONNX operators. At the same time, it reads the HugeCTR model files and loads the weights into the corresponding ONNX nodes for every HugeCTR layer that has weights. Converting the sparse models to ONNX is optional, given that sparse embedding models are usually very large and aligning their weights is relatively easy compared to the dense part.

If you choose to convert both the dense and sparse models to ONNX, the converted ONNX graph takes dense features and sparse keys as inputs and exposes the prediction result as the output. If you only convert the dense model to ONNX, the converted ONNX graph requires dense features and sparse embedding vectors as inputs. The following figure depicts the differences between the converted ONNX models in these two cases (the red dotted frames represent lookup and reduction for sparse embedding vectors).
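
To make "lookup and reduction" concrete, here is a toy sketch in plain Python. It is an illustration only, not the converter's implementation: the dictionary-based table, the sum reduction, and the name lookup_and_reduce are all assumptions. When the sparse models are converted, this step is part of the ONNX graph; when only the dense model is converted, the caller presumably has to produce the embedding vectors by equivalent means and feed them in.

```python
def lookup_and_reduce(embedding_table, slot_keys):
    """Return one summed embedding vector per slot.

    embedding_table maps a key to its embedding vector (a list of floats);
    slot_keys holds, for each slot, the list of keys present in that slot.
    """
    vec_size = len(next(iter(embedding_table.values())))
    reduced = []
    for keys in slot_keys:
        total = [0.0] * vec_size
        for key in keys:
            # Lookup: fetch the vector stored for this key.
            vector = embedding_table[key]
            # Reduction: accumulate element-wise (sum is assumed here).
            total = [a + b for a, b in zip(total, vector)]
        reduced.append(total)
    return reduced


# Toy data: three keys with 2-dimensional embedding vectors, two slots.
table = {10: [1.0, 2.0], 11: [0.5, 0.5], 12: [3.0, -1.0]}
embedding_vectors = lookup_and_reduce(table, [[10, 11], [12]])
print(embedding_vectors)  # [[1.5, 2.5], [3.0, -1.0]]
```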

Fig. 1: The ONNX graph of WDL model. Left: converting both dense and sparse models. Right: converting only the dense model.

Installation

There are several ways to install this package.

Use NGC Container

In the Docker image nvcr.io/nvidia/merlin/merlin-training:0.7, hugectr2onnx is already installed and you can import the package directly:

import hugectr2onnx

Build from source

If you want to build this package from the source code, follow these steps:

$ git clone https://github.com/NVIDIA/HugeCTR.git
$ cd HugeCTR/onnx_converter
$ python3 setup.py install

API Docs

hugectr2onnx.converter.convert(*args, **kwargs)

This function converts HugeCTR models to the ONNX format. It requires the graph configuration JSON file and the model files from HugeCTR training, and saves the converted ONNX model to the specified path.

Parameters

onnx_model_path (string): the path to store the ONNX model.
graph_config (string): the graph configuration JSON file of the HugeCTR model.
dense_model (string): the file of the dense weights for the HugeCTR model.
convert_embedding (boolean): whether to convert the sparse embeddings for the HugeCTR model (optional).
sparse_models (List[str]): the files of the sparse embeddings for the HugeCTR model (optional).
ntp_file (string): the file of the non-trainable parameters for the HugeCTR model (optional).
graph_name (string): the graph name for the ONNX model (optional).

Examples

import hugectr2onnx

# Convert both the dense model and the two sparse models of the WDL example.
hugectr2onnx.converter.convert(onnx_model_path = "wdl.onnx",
                            graph_config = "wdl.json",
                            dense_model = "wdl_dense_2000.model",
                            convert_embedding = True,
                            sparse_models = ["wdl0_sparse_2000.model", "wdl1_sparse_2000.model"])
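
For the dense-only case, the arguments can be assembled as in the sketch below. The helpers build_convert_kwargs and run_conversion are hypothetical and not part of hugectr2onnx; the sketch also assumes that sparse_models may simply be omitted when convert_embedding is False, which the "(optional)" markers in the parameter list suggest but do not spell out.

```python
def build_convert_kwargs(onnx_model_path, graph_config, dense_model, sparse_models=None):
    """Assemble keyword arguments for hugectr2onnx.converter.convert.

    The sparse embeddings are converted only when sparse models are given.
    """
    kwargs = {
        "onnx_model_path": onnx_model_path,
        "graph_config": graph_config,
        "dense_model": dense_model,
        "convert_embedding": bool(sparse_models),
    }
    if sparse_models:
        kwargs["sparse_models"] = list(sparse_models)
    return kwargs


def run_conversion(kwargs):
    """Run the converter; requires hugectr2onnx to be installed."""
    import hugectr2onnx

    hugectr2onnx.converter.convert(**kwargs)


# Dense-only conversion: no sparse models are passed.
dense_only = build_convert_kwargs("wdl_dense_only.onnx", "wdl.json", "wdl_dense_2000.model")
print(dense_only["convert_embedding"])  # False

# With the package installed and the model files in place:
# run_conversion(dense_only)
```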
