NVIDIA TensorFlow Release 20.06
Initial GitHub Release
Key Features and Enhancements
NVIDIA TensorFlow release 20.06 is based on TensorFlow 1.15.2.
Release 20.06 includes the following key features and enhancements:
- Integrated the latest NVIDIA Deep Learning SDK to support NVIDIA A100 when built with CUDA 11 and cuDNN 8
- Improved NVTX annotations for XLA clusters for use with NVIDIA DLProf
- Improved XLA to avoid excessive recompilations
- Enhancements for Automatic Mixed Precision with einsum, 3D Convolutions, and list operations
- Improved 3D Convolutions to support NDHWC format
- Default TF32 support on NVIDIA A100
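As a rough illustration of how the precision-related features above might be toggled, the sketch below sets environment variables before TensorFlow is imported. `NVIDIA_TF32_OVERRIDE` and `TF_ENABLE_AUTO_MIXED_PRECISION` are variables described in NVIDIA's general TF32 and mixed precision documentation; that they behave the same way in this particular build is an assumption to verify, and `precision_env` is a hypothetical helper, not part of any library.

```python
import os


def precision_env(disable_tf32=False, enable_amp=False):
    """Return environment variables for the requested precision settings.

    Hypothetical helper; the variable names are assumed to apply to this build.
    """
    env = {}
    if disable_tf32:
        # NVIDIA libraries are documented to skip TF32 math when this is "0".
        env["NVIDIA_TF32_OVERRIDE"] = "0"
    if enable_amp:
        # Requests the Automatic Mixed Precision graph rewrite.
        env["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"
    return env


# Apply the settings before `import tensorflow` so they are visible at startup.
os.environ.update(precision_env(disable_tf32=True, enable_amp=True))
```

Setting the variables in the process environment before the import, rather than afterwards, avoids depending on when the libraries read them.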
Known Issues
- TF-TRT inference throughput may regress for certain models by up to 37% compared to the 21.06-tf1 release. This will be fixed in a future release.
- A cuDNN performance regression can cause slowdowns of up to 15% in certain ResNet models. This will be fixed in a future release.
- TensorFlow Wheel release 21.12 has a known corruption issue in its NVTX profiling markers when using the CUPTI library from CUDA Toolkit version 11.5. An updated CUPTI build, numbered 11.5.57 or higher, in CUDA 11.5 Update 1 will address this issue.
Binary builds
PIP - built for Python 3.8 on Ubuntu 20.04:
pip install --user nvidia-pyindex
pip install --user nvidia-tensorflow[horovod]
Docker - container image from the NVIDIA NGC registry:
docker pull nvcr.io/nvidia/tensorflow:21.12-tf1-py3
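After the pip commands above, one way to confirm that the wheel landed in the active environment is to query the installed package metadata. This is a minimal sketch, assuming the distribution is registered under the name `nvidia-tensorflow` (taken from the pip command) and that Python 3.8 or newer is in use; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name="nvidia-tensorflow"):
    """Return the installed version string of dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("nvidia-tensorflow is not installed in this environment")
else:
    print("nvidia-tensorflow version:", version)
```

Reading the metadata does not import TensorFlow itself, so the check also works on a machine without a GPU or CUDA libraries.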