Merged
44 commits
ab54824
Added ONNX exporter class to export model to ONNX format
DimaBir Oct 2, 2023
c5c5414
Merge remote-tracking branch 'origin/dev' into dev
DimaBir Oct 2, 2023
8ed72b9
Fixed import typo
DimaBir Oct 2, 2023
6451458
Fixed docstring
DimaBir Oct 2, 2023
2dc2560
Fixed docstring
DimaBir Oct 2, 2023
9019460
Merge remote-tracking branch 'origin/dev' into dev
DimaBir Oct 2, 2023
758f964
Fixed typo
DimaBir Oct 2, 2023
e186b66
Fixed typo
DimaBir Oct 2, 2023
1a6cf6e
Fixed typo
DimaBir Oct 2, 2023
47d6481
Print ONNX model
DimaBir Oct 2, 2023
74577ea
trying set to train to print BN
DimaBir Oct 2, 2023
d00d16f
Removed Conv + BN fusion in exporting PyTorch to ONNX
DimaBir Oct 2, 2023
b07c628
Removed Conv + BN fusion in exporting PyTorch to ONNX
DimaBir Oct 2, 2023
8e9d013
Add ONNX Inference
DimaBir Oct 2, 2023
f04011d
Updated dockerfile include packages
DimaBir Oct 2, 2023
7e04d3b
Fixed ONNX Inference
DimaBir Oct 2, 2023
85aebba
Fixed ONNX input
DimaBir Oct 2, 2023
7bb680d
Fixed ONNX input
DimaBir Oct 2, 2023
3dc11ca
Fixed ONNX input
DimaBir Oct 2, 2023
6e021cd
Fixed ONNX input
DimaBir Oct 2, 2023
5ac5b47
Fixed ONNX input
DimaBir Oct 2, 2023
69f3c9b
Fixed ONNX input
DimaBir Oct 2, 2023
d0f8936
Fixed ONNX input
DimaBir Oct 2, 2023
d665ae4
Fixed ONNX input
DimaBir Oct 2, 2023
f0049d8
Added abstract benchmark class
DimaBir Oct 2, 2023
647d811
Fixed ONNXBenchmark param
DimaBir Oct 2, 2023
125d825
Fixed ONNXBenchmark param
DimaBir Oct 2, 2023
914da47
Fixed ONNXBenchmark param
DimaBir Oct 2, 2023
f3c162d
Fixed ONNXBenchmark param
DimaBir Oct 2, 2023
ff84523
Applied black formatting
DimaBir Oct 2, 2023
057b899
Added image cat3
DimaBir Oct 2, 2023
dd81da6
Enabling optimization for ONNX exporter, using GPU
DimaBir Oct 2, 2023
284671b
Enabling optimization for ONNX exporter, using GPU
DimaBir Oct 2, 2023
e8427c7
Enabling optimization for ONNX exporter, using GPU
DimaBir Oct 2, 2023
5acb999
Fixed ONNX benchmark error
DimaBir Oct 2, 2023
e9b08e9
Fixed ONNX benchmark error
DimaBir Oct 2, 2023
3bb4f70
Fixed ONNX benchmark error
DimaBir Oct 2, 2023
dc574bd
Added requirement, reformatted code
DimaBir Oct 2, 2023
f7a8779
Fixed typo in default image extension
DimaBir Oct 2, 2023
eda33aa
Update inference images
DimaBir Oct 2, 2023
df0bf5f
Update inference images
DimaBir Oct 2, 2023
e1fefc2
Updated README.md
DimaBir Oct 2, 2023
759e92f
Updated README.md
DimaBir Oct 2, 2023
88c2901
Merge branch 'main' into dev
DimaBir Oct 2, 2023
6 changes: 3 additions & 3 deletions Dockerfile
@@ -6,11 +6,11 @@ RUN apt-get update && apt-get install -y \
python3-pip \
git

# Install Python packages
RUN pip3 install torch torchvision torch-tensorrt pandas Pillow numpy packaging onnx
# Set the working directory
WORKDIR /workspace

# Copy local project files to /workspace in the image
COPY . /workspace

# Install Python packages
RUN pip3 install --no-cache-dir -r /workspace/requirements.txt
24 changes: 13 additions & 11 deletions README.md
@@ -7,9 +7,9 @@
5. [Inference Benchmark Results](#inference-benchmark-results)
- [Example of Results](#example-of-results)
- [Explanation of Results](#explanation-of-results)
6. [Author](#author)
7. [References](#references)
8. [Notes](#notes)
6. [ONNX Exporter](#onnx-exporter)
7. [Author](#author)
8. [References](#references)

## Overview
This project demonstrates how to perform inference with a PyTorch model and optimize it using NVIDIA TensorRT. The script loads a pre-trained ResNet-50 model from torchvision, performs inference on a user-provided image, and prints the top-K predicted classes. Additionally, the script benchmarks the model's performance in the following configurations: CPU, CUDA, TensorRT-FP32, and TensorRT-FP16, providing insights into the speedup gained through optimization.
@@ -30,19 +30,20 @@ docker build -t awesome-tesnorrt .
docker run --gpus all --rm -it awesome-tesnorrt

# 3. Run the Script inside the Container
python src/main.py --image_path /path-to-image/image.jpg --topk 2
python src/main.py
```

### Arguments
- `--image_path`: Specifies the path to the image you want to predict.
- `--image_path`: (Optional) Specifies the path to the image you want to predict.
- `--topk`: (Optional) Specifies the number of top predictions to show. Defaults to 5 if not provided.
- `--onnx`: (Optional) Exports the ResNet-50 model to ONNX and runs the benchmark only for that model.
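
The argument list above could be wired up roughly as follows. This is a minimal sketch, not the project's actual parser: the `build_parser` helper name, the default image path, and treating `--onnx` as a boolean switch are assumptions. Only the flag names and the `--topk` default of 5 come from the README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in the README."""
    parser = argparse.ArgumentParser(description="ResNet-50 inference and benchmarks")
    # Assumption: this default path is illustrative, not taken from the source.
    parser.add_argument(
        "--image_path",
        type=str,
        default="./inference/cat3.jpg",
        help="Path to the image to predict (optional).",
    )
    # The README states that --topk defaults to 5.
    parser.add_argument(
        "--topk",
        type=int,
        default=5,
        help="Number of top predictions to show.",
    )
    # Assumption: --onnx is a switch that takes no value.
    parser.add_argument(
        "--onnx",
        action="store_true",
        help="Export the model to ONNX and benchmark only that model.",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--topk", "3", "--onnx"])
print(args.image_path, args.topk, args.onnx)
```

Making every flag optional is what allows the bare `python src/main.py` invocation shown earlier.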

## Example Command
```sh
python src/main.py --image_path ./inference/cat3.jpg --topk 3 --show_image
python src/main.py --image_path ./inference/cat3.jpg --topk 3 --onnx
```

This command will run predictions on the image at the specified path, show the top 3 predictions, and display the image. If you do not want to display the image, omit the `--show_image` flag. For the default 5 top predictions, omit the `--topk` argument or set it to 5.
This command runs predictions on the image at the specified path and shows the top 3 predictions using both the PyTorch and ONNX Runtime models. For the default top 5 predictions, omit the `--topk` argument or set it to 5.
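
To make the top-k step concrete, here is a small sketch of how raw model scores can be turned into lines such as `My prediction: %33 tabby`. It is an illustration in plain Python rather than the project's code: the `softmax` and `top_k` helpers, the example logits and labels, and the truncation of the percentage to an integer are all assumptions.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: List[float], labels: List[str], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k (label, probability) pairs with the highest probability."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical logits and labels, for illustration only.
labels = ["tabby", "Egyptian cat", "tiger cat", "lynx"]
logits = [2.0, 1.7, 1.0, -0.5]

for label, prob in top_k(logits, labels, k=3):
    # Assumption: the percentage is truncated to an integer for display.
    print(f"My prediction: %{int(prob * 100)} {label}")
```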

## Inference Benchmark Results

@@ -58,6 +59,7 @@ My prediction: %33 tabby
My prediction: %26 Egyptian cat
Running Benchmark for CPU
Average batch time: 942.47 ms
Average ONNX inference time: 15.59 ms
Running Benchmark for CUDA
Average batch time: 41.02 ms
Compiling and Running Inference Benchmark for TensorRT with precision: torch.float32
@@ -70,16 +72,16 @@ Average batch time: 7.25 ms
- The first k lines show the top-k predictions. For example, `My prediction: %33 tabby` is the highest-confidence prediction for the input image, giving the confidence level (`%33`) and the predicted class (`tabby`).
- The following lines report the average batch time for each configuration:
  - `Running Benchmark for CPU` and `Average batch time: 942.47 ms` give the average batch time when running the model on the CPU.
  - `Average ONNX inference time: 15.59 ms` gives the average inference time when running the ONNX model on the CPU.
  - `Running Benchmark for CUDA` and `Average batch time: 41.02 ms` give the average batch time when running the model on CUDA.
  - `Compiling and Running Inference Benchmark for TensorRT with precision: torch.float32` and `Average batch time: 19.20 ms` give the average batch time when running the model with TensorRT using `float32` precision.
  - `Compiling and Running Inference Benchmark for TensorRT with precision: torch.float16` and `Average batch time: 7.25 ms` give the average batch time when running the model with TensorRT using `float16` precision.
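
As a worked example of reading these numbers, the speedup of a configuration over the CPU baseline is the CPU time divided by that configuration's time. The sketch below uses only the timings listed above; the `speedup` helper and the dictionary layout are illustrative. The ratios are only a like-for-like comparison if every configuration was measured with the same batch size, which the example output does not state.

```python
# Average times in milliseconds, copied from the example output above.
timings_ms = {
    "CPU": 942.47,
    "ONNX": 15.59,
    "CUDA": 41.02,
    "TensorRT FP32": 19.20,
    "TensorRT FP16": 7.25,
}


def speedup(baseline_ms: float, other_ms: float) -> float:
    """How many times faster `other_ms` is than `baseline_ms`."""
    return baseline_ms / other_ms


baseline = timings_ms["CPU"]
for name, value in timings_ms.items():
    print(f"{name}: {value:.2f} ms, {speedup(baseline, value):.1f}x vs CPU")
```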

## ONNX Exporter
The ONNX Exporter utility converts the PyTorch model to ONNX format so that inference and benchmarking can run on ONNX Runtime. ONNX is a hardware-agnostic format that is widely supported across platforms and devices.

## Author
[DimaBir](https://github.com/DimaBir)

## References
- [ResNetTensorRT Project](https://github.com/DimaBir/ResNetTensorRT/tree/main)

## Notes
- The project uses a Docker container built on top of the NVIDIA TensorRT image to ensure that all dependencies, including CUDA and TensorRT, are correctly installed and configured.
- Please ensure you have the NVIDIA Container Toolkit installed to run the container with GPU support.
Binary file added inference/cat3.jpg
Binary file added inference/fan.jpg
Binary file removed inference/image-2.jpg
Binary file not shown.
Binary file added inference/vase.jpg
9 changes: 9 additions & 0 deletions requirements.txt
@@ -0,0 +1,9 @@
torch
torchvision
torch-tensorrt
pandas
Pillow
numpy
packaging
onnx
onnxruntime
61 changes: 60 additions & 1 deletion src/benchmark.py
@@ -1,16 +1,35 @@
import time
from typing import Tuple

from abc import ABC, abstractmethod
import numpy as np
import torch
import torch.backends.cudnn as cudnn
import logging
import onnxruntime as ort

# Configure logging
logging.basicConfig(filename="model.log", level=logging.INFO)


class Benchmark:
class Benchmark(ABC):
    """
    Abstract class representing a benchmark.
    """

    def __init__(self, nruns: int = 100, nwarmup: int = 50):
        self.nruns = nruns
        self.nwarmup = nwarmup

    @abstractmethod
    def run(self) -> None:
        """
        Abstract method to run the benchmark.
        """
        pass


class PyTorchBenchmark:
    def __init__(
        self,
        model: torch.nn.Module,
@@ -74,3 +93,43 @@ def run(self) -> None:
        print(f"Input shape: {input_data.size()}")
        print(f"Output features size: {features.size()}")
        logging.info(f"Average batch time: {np.mean(timings) * 1000:.2f} ms")


class ONNXBenchmark(Benchmark):
    """
    A class used to benchmark the performance of an ONNX model.
    """

    def __init__(
        self,
        ort_session: ort.InferenceSession,
        input_shape: tuple,
        nruns: int = 100,
        nwarmup: int = 50,
    ):
        super().__init__(nruns)
        self.ort_session = ort_session
        self.input_shape = input_shape
        self.nwarmup = nwarmup
        self.nruns = nruns

    def run(self) -> None:
        print("Warming up ...")
        # Adjusting the batch size in the input shape to match the expected input size of the model.
        input_shape = (1,) + self.input_shape[1:]
        input_data = np.random.randn(*input_shape).astype(np.float32)

        for _ in range(self.nwarmup):  # Warm-up runs
            _ = self.ort_session.run(None, {"input": input_data})

        print("Starting benchmark ...")
        timings = []

        for _ in range(self.nruns):
            start_time = time.time()
            _ = self.ort_session.run(None, {"input": input_data})
            end_time = time.time()
            timings.append(end_time - start_time)

        avg_time = np.mean(timings) * 1000
        logging.info(f"Average ONNX inference time: {avg_time:.2f} ms")
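
The warm-up-then-measure pattern used by `ONNXBenchmark.run` can be sketched without ONNX Runtime. The example below is an illustration under stated assumptions, not the PR's code: `CallableBenchmark` and `fake_inference` are hypothetical names, the workload is a stand-in for `ort_session.run`, `run` returns the average instead of `None`, and `time.perf_counter` (a monotonic, high-resolution clock) is swapped in for `time.time`.

```python
import time
from abc import ABC, abstractmethod
from statistics import mean
from typing import Callable, List


class Benchmark(ABC):
    """Abstract base: subclasses must implement run()."""

    def __init__(self, nruns: int = 100, nwarmup: int = 50):
        self.nruns = nruns
        self.nwarmup = nwarmup

    @abstractmethod
    def run(self) -> float:
        """Return the average time per call in milliseconds."""


class CallableBenchmark(Benchmark):
    """Hypothetical benchmark that times any zero-argument callable."""

    def __init__(self, fn: Callable[[], object], nruns: int = 100, nwarmup: int = 50):
        # Pass both counters to the base class so they are set in one place.
        super().__init__(nruns, nwarmup)
        self.fn = fn

    def run(self) -> float:
        for _ in range(self.nwarmup):  # warm-up calls are not timed
            self.fn()

        timings: List[float] = []
        for _ in range(self.nruns):
            start = time.perf_counter()
            self.fn()
            timings.append(time.perf_counter() - start)

        return mean(timings) * 1000  # seconds to milliseconds


def fake_inference() -> int:
    """Stand-in workload; a real benchmark would call the model here."""
    return sum(i * i for i in range(10_000))


avg_ms = CallableBenchmark(fake_inference, nruns=20, nwarmup=5).run()
print(f"Average time: {avg_ms:.2f} ms")
```

Returning the average, rather than only logging it, makes the result easy to compare across configurations.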