Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.

Torch-TensorRT brings the power of TensorRT to PyTorch. Accelerate inference by up to 5x compared to eager execution with just one line of code.

Installation

Stable versions of Torch-TensorRT are published on PyPI

pip install torch-tensorrt

Nightly versions of Torch-TensorRT are published on the PyTorch package index

pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu124

Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container, which includes all dependencies at the proper versions along with example notebooks.

For more advanced installation methods, please see here

Quickstart

Option 1: torch.compile

You can use Torch-TensorRT anywhere you use torch.compile:

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # compiled on first run

optimized_model(x) # this will be fast!
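
The tensorrt backend also accepts compile-time settings through torch.compile's options dictionary. The sketch below is illustrative and assumes settings such as enabled_precisions and min_block_size; consult the documentation for the authoritative list of supported options.

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda()

optimized_model = torch.compile(
    model,
    backend="tensorrt",
    options={
        "enabled_precisions": {torch.half},  # assumed option: allow FP16 TensorRT kernels
        "min_block_size": 2,                 # assumed option: smallest subgraph handed to TensorRT
    },
)
optimized_model(x) # compiled on first run; later calls reuse the TensorRT engines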

Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).

Step 1: Optimize + serialize

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs) # PyTorch only supports Python runtime for an ExportedProgram. For C++ deployment, use a TorchScript file
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

Step 2: Deploy

Deployment in PyTorch:
import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here

# You can run this in a new python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module() # this also works
model(*inputs)
Deployment in C++:
#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});

Further resources

Platform Support

Platform               Support
Linux AMD64 / GPU      Supported
Windows / GPU          Supported (Dynamo only)
Linux aarch64 / GPU    Native compilation supported on JetPack-4.4+ (use v1.0.0 for the time being)
Linux aarch64 / DLA    Native compilation supported on JetPack-4.4+ (use v1.0.0 for the time being)
Linux ppc64le / GPU    Not supported

Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.

Dependencies

The following dependencies are used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.

  • Bazel 6.3.2
  • Libtorch 2.5.0.dev (latest nightly) (built with CUDA 12.4)
  • CUDA 12.4
  • TensorRT 10.6.0.26
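
To confirm which versions are present in your environment, the standard version attributes can be printed (a quick sketch; the extra tensorrt import is assumed to be available alongside torch-tensorrt):

import torch
import torch_tensorrt
import tensorrt

print("torch:", torch.__version__)
print("CUDA (torch build):", torch.version.cuda)
print("torch_tensorrt:", torch_tensorrt.__version__)
print("TensorRT:", tensorrt.__version__)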

Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

  • Deprecation notices are communicated in the Release Notes.
  • Deprecated API functions will have a statement in the source documenting when they were deprecated.
  • Deprecated methods and classes will issue deprecation warnings at runtime, if they are used (see the sketch after this list).
  • Torch-TensorRT provides a 6-month migration period after the deprecation.
  • APIs and tools continue to work during the migration period.
  • After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
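
Because deprecation warnings are ordinary Python warnings, they can be promoted to errors in test environments to surface uses of deprecated APIs early. A sketch using the standard warnings module; deprecated_api is a hypothetical placeholder for any deprecated Torch-TensorRT call:

import warnings

with warnings.catch_warnings():
    # Turn deprecation warnings into exceptions for the duration of this block
    warnings.simplefilter("error", DeprecationWarning)
    # deprecated_api(...)  # hypothetical: a call into a deprecated API would raise here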

Contributing

Take a look at CONTRIBUTING.md.

License

The Torch-TensorRT license can be found in the LICENSE file. It is licensed with a BSD-style license.