Upload folder using huggingface_hub
- .gitattributes +1 -0
- README.md +73 -0
- README_zh.md +63 -0
- TensorRT-9.2.0.5.tar.gz +3 -0
- fmha_plugins/9.2_plugin_cuda11/fMHAPlugin.so +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
+fmha_plugins/9.2_plugin_cuda11/fMHAPlugin.so filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,73 @@
---
license: other
license_name: tencent-hunyuan-community
license_link: https://huggingface.co/Tencent-Hunyuan/HunyuanDiT/blob/main/LICENSE.txt
language:
- en
---

# HunyuanDiT TensorRT Acceleration

English | [中文](https://huggingface.co/Tencent-Hunyuan/TensorRT-libs/blob/main/README_zh.md)

We provide a TensorRT version of [HunyuanDiT](https://github.com/Tencent/HunyuanDiT) for inference acceleration (faster than Flash Attention). You can convert the PyTorch model to a TensorRT model using the following steps.

## 1. Download dependencies from Hugging Face

```shell
cd HunyuanDiT
# Use the huggingface-cli tool to download the model.
huggingface-cli download Tencent-Hunyuan/TensorRT-libs --local-dir ./ckpts/t2i/model_trt
```
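
After the download completes, the dependencies tracked in this repository (the `TensorRT-9.2.0.5.tar.gz` tarball and `fmha_plugins/9.2_plugin_cuda11/fMHAPlugin.so`) should be present locally. A quick sanity check:

```shell
# List the downloaded dependencies; the TensorRT tarball and the fMHA plugin should appear.
ls -R ./ckpts/t2i/model_trt
```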

## 2. Install the TensorRT dependencies

```shell
sh trt/install.sh
```
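
If the installation succeeded, the TensorRT Python bindings should be importable. A minimal check, assuming `trt/install.sh` installs the `tensorrt` package into the active Python environment:

```shell
# Print the installed TensorRT version; an ImportError means the install did not take effect.
python -c "import tensorrt; print(tensorrt.__version__)"
```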

## 3. Build the TensorRT engine

### Method 1: Use a prebuilt engine

We provide some prebuilt TensorRT engines.

| Supported GPU | Download Link | Remote Path |
|:----------------:|:---------------------------------------------------------------------------------------------------------------:|:---------------------------------:|
| GeForce RTX 3090 | [HuggingFace](https://huggingface.co/Tencent-Hunyuan/TensorRT-engine/blob/main/engines/RTX3090/model_onnx.plan) | `engines/RTX3090/model_onnx.plan` |
| GeForce RTX 4090 | [HuggingFace](https://huggingface.co/Tencent-Hunyuan/TensorRT-engine/blob/main/engines/RTX4090/model_onnx.plan) | `engines/RTX4090/model_onnx.plan` |
| A100 | [HuggingFace](https://huggingface.co/Tencent-Hunyuan/TensorRT-engine/blob/main/engines/A100/model_onnx.plan) | `engines/A100/model_onnx.plan` |

Use the following command to download the engine and place it in the specified location.

```shell
huggingface-cli download Tencent-Hunyuan/TensorRT-engine <Remote Path> --local-dir ./ckpts/t2i/model_trt/engine
```
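
For example, to fetch the prebuilt engine for a GeForce RTX 4090, substitute the matching remote path from the table above:

```shell
# Download the RTX 4090 engine into the expected engine directory.
huggingface-cli download Tencent-Hunyuan/TensorRT-engine engines/RTX4090/model_onnx.plan --local-dir ./ckpts/t2i/model_trt/engine
```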

### Method 2: Build your own engine

If you are using a different GPU, you can build the engine using the following command.

```shell
# Set the TensorRT build environment variables first. We provide a script to set up the environment.
source trt/activate.sh

# Option 1: Build the TensorRT engine. By default, it will read the `ckpts` folder in the current directory.
sh trt/build_engine.sh

# Option 2: If your model directory is not `ckpts`, you need to specify the model directory.
sh trt/build_engine.sh </path/to/ckpts>
```
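
To check the resulting plan file before running inference, one option is the `trtexec` tool that ships with the TensorRT distribution (a sketch; it assumes `trt/activate.sh` puts `trtexec` on your `PATH` and that the engine was written to the path below):

```shell
# Deserialize and exercise the engine; a non-zero exit code means the plan is unusable on this GPU.
trtexec --loadEngine=./ckpts/t2i/model_trt/engine/model_onnx.plan
```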

## 4. Run inference using the TensorRT model

```shell
# Run inference using the prompt enhancement model + the HunyuanDiT TensorRT model.
python sample_t2i.py --prompt "渔舟唱晚" --infer-mode trt

# Disable prompt enhancement to save GPU memory.
python sample_t2i.py --prompt "渔舟唱晚" --infer-mode trt --no-enhance
```
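
To render several prompts in one go, a plain shell loop over `sample_t2i.py` works (a sketch; the second prompt is a placeholder):

```shell
# Hypothetical batch run: each iteration launches a fresh process, so the model is reloaded every time.
for prompt in "渔舟唱晚" "日出江花红胜火"; do
    python sample_t2i.py --prompt "$prompt" --infer-mode trt --no-enhance
done
```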
README_zh.md
ADDED
@@ -0,0 +1,63 @@
# HunyuanDiT TensorRT Acceleration

[English](https://huggingface.co/Tencent-Hunyuan/TensorRT-libs/blob/main/README.md) | 中文

We provide the code and dependencies for converting the text-to-image model in [HunyuanDiT](https://github.com/Tencent/HunyuanDiT) to TensorRT for inference acceleration (faster than Flash Attention). You can use our TensorRT model by following the steps below.

## 1. Download the TensorRT dependencies from Hugging Face

```shell
cd HunyuanDiT

# Download the dependencies
huggingface-cli download Tencent-Hunyuan/TensorRT-libs --local-dir ./ckpts/t2i/model_trt
```

## 2. Install the TensorRT dependencies

```shell
sh trt/install.sh
```

## 3. Build the TensorRT engine

### Method 1: Use a prebuilt engine

This repository provides several prebuilt TensorRT engines.

| Supported GPU | Download Link | Remote Path |
|:----------------:|:---------------------------------------------------------------------------------------------------------------:|:---------------------------------:|
| GeForce RTX 3090 | [HuggingFace](https://huggingface.co/Tencent-Hunyuan/TensorRT-engine/blob/main/engines/RTX3090/model_onnx.plan) | `engines/RTX3090/model_onnx.plan` |
| GeForce RTX 4090 | [HuggingFace](https://huggingface.co/Tencent-Hunyuan/TensorRT-engine/blob/main/engines/RTX4090/model_onnx.plan) | `engines/RTX4090/model_onnx.plan` |
| A100 | [HuggingFace](https://huggingface.co/Tencent-Hunyuan/TensorRT-engine/blob/main/engines/A100/model_onnx.plan) | `engines/A100/model_onnx.plan` |

Use the following command to download the engine and place it in the specified location.

```shell
huggingface-cli download Tencent-Hunyuan/TensorRT-engine <Remote Path> --local-dir ./ckpts/t2i/model_trt/engine
```

### Method 2: Build your own engine

If your GPU is not listed in the table above, you can build an engine for it with the following command.

```shell
# Set the TensorRT build environment variables first. We provide a script to set them up in one step.
source trt/activate.sh

# Option 1: Build the TensorRT engine. By default, it reads the `ckpts` folder in the current directory.
sh trt/build_engine.sh

# Option 2: If your model directory is not `ckpts`, specify the model directory.
sh trt/build_engine.sh </path/to/ckpts>
```

## 4. Run inference using the TensorRT model

```shell
# Run inference with prompt enhancement + the HunyuanDiT text-to-image TensorRT model.
python sample_t2i.py --prompt "渔舟唱晚" --infer-mode trt

# Disable prompt enhancement (useful when GPU memory is limited).
python sample_t2i.py --prompt "渔舟唱晚" --infer-mode trt --no-enhance
```
TensorRT-9.2.0.5.tar.gz
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d4ae57919c3747836e60ffb6b36a3975c9145e41cbe2027e7f6c9d1071c8e2b8
size 2453863376
fmha_plugins/9.2_plugin_cuda11/fMHAPlugin.so
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:913f1697f0a16aa25d7814a6ee482d82cc2d083439f2efd5047ed4e728217687
size 99438864