NVIDIA TensorFlow Release 20.06

Initial GitHub Release

Key Features and Enhancements

NVIDIA TensorFlow release 20.06 is based on TensorFlow 1.15.2.
Release 20.06 includes the following key features and enhancements:

Integrated the latest NVIDIA Deep Learning SDK to support NVIDIA A100 when built with CUDA 11 and cuDNN 8
Improved NVTX annotations for XLA clusters for use with NVIDIA DLProf
Improved XLA to avoid excessive recompilations
Enhanced Automatic Mixed Precision support for einsum, 3D Convolutions, and list operations
Improved 3D Convolutions to support NDHWC format
Enabled TF32 by default on NVIDIA A100
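
Two of the features listed above, Automatic Mixed Precision and TF32, are commonly controlled through environment variables. The following is a minimal sketch, not taken from these release notes: it assumes the variable names TF_ENABLE_AUTO_MIXED_PRECISION and NVIDIA_TF32_OVERRIDE as described in NVIDIA's general documentation, and whether a given build honors them should be checked against that build's own notes.

```shell
# Illustrative settings only. The variable names below are assumptions drawn
# from NVIDIA's general documentation, not from these release notes.

# Request the Automatic Mixed Precision graph rewrite.
export TF_ENABLE_AUTO_MIXED_PRECISION=1

# Opt out of TF32 in NVIDIA libraries, so FP32 math is used instead of the
# A100 default.
export NVIDIA_TF32_OVERRIDE=0

# Print the values that a training process started from this shell inherits.
echo "AMP=${TF_ENABLE_AUTO_MIXED_PRECISION} TF32_OVERRIDE=${NVIDIA_TF32_OVERRIDE}"
```

The two settings are independent of each other. Leaving NVIDIA_TF32_OVERRIDE unset keeps the default TF32 behavior on A100.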

Known Issues

TF-TRT inference throughput may regress by up to 37% for certain models compared to the 21.06-tf1 release. This will be fixed in a future release.
A cuDNN performance regression can cause slowdowns of up to 15% in certain ResNet models. This will be fixed in a future release.
TensorFlow Wheel release 21.12 has a known corruption issue in its NVTX profiling markers when using the CUPTI library from CUDA Toolkit version 11.5. An updated CUPTI build (version 11.5.57 or later), included in CUDA 11.5 Update 1, will address this issue.

pip install --user nvidia-pyindex
pip install --user "nvidia-tensorflow[horovod]"
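
After the pip commands above, a quick import check can confirm which TensorFlow build is on the path. This is a minimal sketch under stated assumptions: the interpreter is assumed to be available as python3, and report_tf_version is a hypothetical helper name introduced here for illustration.

```shell
# report_tf_version is a hypothetical helper, not part of any NVIDIA tooling.
# It assumes the interpreter used for the install is reachable as python3.
report_tf_version() {
  if python3 -c "import tensorflow" >/dev/null 2>&1; then
    # The import succeeded, so print the version string of the installed build.
    python3 -c "import tensorflow as tf; print('TensorFlow', tf.__version__)"
  else
    # The import failed, or python3 itself is missing.
    echo "tensorflow is not importable with python3"
  fi
}

report_tf_version
```

Because these releases are based on TensorFlow 1.x, a 1.15.x version string would be the expected result of a successful install.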

docker pull nvcr.io/nvidia/tensorflow:21.12-tf1-py3