NVIDIA® TensorRT™ is a software development kit (SDK) for high-performance deep learning inference. It pairs an inference optimizer with a runtime library for NVIDIA GPUs and can significantly improve the inference performance of deep learning models in production environments.
```shell
# Install the Python package using pip
pip install tensorrt
```
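After installing, a quick sanity check is to import the package and print its version. This is a minimal sketch that assumes only the `tensorrt` Python package from the step above; it reports a helpful message rather than failing if the package is missing:

```python
# Check whether the TensorRT Python package is importable and report its version.
try:
    import tensorrt as trt
    version = trt.__version__
except ImportError:
    version = None

if version:
    print("TensorRT version:", version)
else:
    print("tensorrt is not installed; re-run the pip install step")
```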
```shell
# Or build from source
git clone -b main https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
```
```shell
# Build the Docker image
./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.9

# Launch the build container
./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.9 --gpus all
```
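With TensorRT installed, a typical workflow is to parse an ONNX model and build a serialized engine with the Python API. The sketch below assumes a recent TensorRT release (8.x or later); `model.onnx` is a hypothetical model path, and the build step is skipped when TensorRT is not present on the machine:

```python
import importlib.util

def build_engine(onnx_path):
    """Sketch: parse an ONNX model and build a serialized TensorRT engine."""
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # The ONNX parser requires an explicit-batch network
    # (the default in recent TensorRT releases).
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))
    config = builder.create_builder_config()
    # Cap the optimizer's workspace memory pool at 1 GiB.
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)
    return builder.build_serialized_network(network, config)

if importlib.util.find_spec("tensorrt") is not None:
    engine_bytes = build_engine("model.onnx")  # hypothetical model path
else:
    print("tensorrt is not available on this machine; skipping the build")
```

The serialized engine returned by `build_serialized_network` can be written to disk and later deserialized by the TensorRT runtime for inference.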
This repository contains the open-source components of TensorRT, mainly the sources for the TensorRT plugins and the ONNX parser, along with sample applications.
NVIDIA TensorRT is a mature, high-performance deep learning inference platform that covers the full path from model optimization to deployment. Its optimization capabilities, rich feature set, and ecosystem support make it one of the preferred tools for deploying AI applications, and it helps developers achieve strong inference performance and efficiency in both edge and data center settings.