Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git (synced 2026-04-24). Commit f96690e73d (parent 809947301c): chore: Regenerate all playbooks.

- [Overview](#overview)
- [Instructions](#instructions)
- [Substep A. BF16 quantized precision](#substep-a-bf16-quantized-precision)
- [Substep B. FP8 quantized precision](#substep-b-fp8-quantized-precision)
- [Substep C. FP4 quantized precision](#substep-c-fp4-quantized-precision)
- [Substep A. FP16 precision (high VRAM requirement)](#substep-a-fp16-precision-high-vram-requirement)
- [Substep B. FP8 quantized precision](#substep-b-fp8-quantized-precision)
- [Substep C. FP4 quantized precision](#substep-c-fp4-quantized-precision)
- [Substep A. BF16 precision](#substep-a-bf16-precision)
- [Substep B. FP8 quantized precision](#substep-b-fp8-quantized-precision)
- [Troubleshooting](#troubleshooting)

---

## Overview

* Basic idea

Multi-modal inference combines different data types, such as **text, images, and audio**, within a single model pipeline to generate or interpret richer outputs.

Instead of processing one input type at a time, multi-modal systems learn shared representations that enable tasks such as **text-to-image generation**, **image captioning**, and **vision-language reasoning**.

On GPUs, this enables **parallel processing across modalities**, delivering faster, higher-fidelity results for tasks that combine language and vision.

## What you'll accomplish

You'll deploy GPU-accelerated multi-modal inference capabilities on NVIDIA Spark using TensorRT to run text-to-image models at multiple precision formats (BF16, FP16, FP8, FP4).

## Prerequisites

- NVIDIA Spark device with Blackwell GPU architecture
- Docker installed and accessible to current user
- NVIDIA Container Runtime configured
- Hugging Face account with access to the Black Forest Labs models [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) and [FLUX.1-dev-onnx](https://huggingface.co/black-forest-labs/FLUX.1-dev-onnx)
- Hugging Face [token](https://huggingface.co/settings/tokens) configured with access to both FLUX.1 model repositories
- At least 48GB VRAM available for FP16 Flux.1 Schnell operations
- Verify GPU access: `nvidia-smi`
- Check Docker GPU integration: `docker run --rm --gpus all nvidia/cuda:12.0-base-ubuntu20.04 nvidia-smi`
- Confirm your Hugging Face token is available: `echo $HF_TOKEN`. If you don't have one yet, create it at https://huggingface.co/settings/tokens and grant it read access to the `black-forest-labs/FLUX.1-dev` and `black-forest-labs/FLUX.1-dev-onnx` repositories (search for these repos when creating the token).

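The verification steps above can be collected into one small script. This is an illustrative sketch, not part of the playbook; the `check` helper is hypothetical and only reports whether each tool is on `PATH`:

```shell
#!/usr/bin/env bash
# Sketch: report which prerequisites are present on this machine.
# `check` is an illustrative helper, not part of the playbook.

check() {
  # $1 = human-readable label, $2 = command name to look for
  if command -v "$2" >/dev/null 2>&1; then
    echo "OK: $1"
  else
    echo "MISSING: $1"
  fi
}

check "NVIDIA driver (nvidia-smi)" nvidia-smi
check "Docker CLI" docker

# The demos also need a Hugging Face token in the environment.
if [ -n "${HF_TOKEN:-}" ]; then
  echo "OK: HF_TOKEN is set"
else
  echo "MISSING: HF_TOKEN is not set"
fi
```

On the Spark device all three lines should print `OK`; any `MISSING` line points back to the corresponding prerequisite above.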
## Ancillary files

All necessary files can be found in the TensorRT repository [here on GitHub](https://github.com/NVIDIA/TensorRT)

- [**requirements.txt**](https://github.com/NVIDIA/TensorRT/blob/main/demo/Diffusion/requirements.txt) - Python dependencies for TensorRT demo environment
- [**demo_txt2img_flux.py**](https://github.com/NVIDIA/TensorRT/blob/main/demo/Diffusion/demo_txt2img_flux.py) - Flux.1 model inference script
- [**demo_txt2img_xl.py**](https://github.com/NVIDIA/TensorRT/blob/main/demo/Diffusion/demo_txt2img_xl.py) - SDXL model inference script
- **TensorRT repository** - Contains diffusion demo code and optimization tools

## Time & risk

## Instructions

```bash
pip3 install -r requirements.txt
```

Test multi-modal inference using the Flux.1 Dev model with different precision formats.

### Substep A. BF16 quantized precision

```bash
python3 demo_txt2img_flux.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --download-onnx-models --bf16
```

### Substep B. FP8 quantized precision

```bash
python3 demo_txt2img_flux.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --quantization-level 4 --fp8 --download-onnx-models
```

### Substep C. FP4 quantized precision

```bash
python3 demo_txt2img_flux.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --fp4 --download-onnx-models
```

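To see why the quantized substeps need less memory, compare the weight footprint at each precision. A rough sketch: the 12-billion parameter count for the Flux.1 transformer is an illustrative assumption, not a figure from the playbook.

```shell
# Sketch: approximate weight memory per precision level.
# PARAMS (~12B weights) is an illustrative assumption, not from the playbook.
PARAMS=12000000000
# Bytes per weight, expressed in tenths to keep the arithmetic integer-only:
# BF16/FP16 = 2.0 bytes, FP8 = 1.0 byte, FP4 = 0.5 bytes.
for entry in "BF16/FP16:20" "FP8:10" "FP4:5"; do
  fmt=${entry%%:*}
  tenths=${entry##*:}
  gib=$(( PARAMS * tenths / 10 / 1024 / 1024 / 1024 ))
  echo "$fmt: roughly ${gib} GiB of weights"
done
```

FP4 cuts the weight footprint to a quarter of BF16, which is why the quantized runs fit alongside the text encoders and activations in far less VRAM.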
Test the faster Flux.1 Schnell variant with different precision formats.

> **Warning**: FP16 Flux.1 Schnell requires >48GB VRAM for native export

### Substep A. FP16 precision (high VRAM requirement)

```bash
python3 demo_txt2img_flux.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --version="flux.1-schnell"
```

### Substep B. FP8 quantized precision

```bash
python3 demo_txt2img_flux.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --version="flux.1-schnell" \
  --quantization-level 4 --fp8 --download-onnx-models
```

### Substep C. FP4 quantized precision

```bash
python3 demo_txt2img_flux.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --version="flux.1-schnell" --fp4 --download-onnx-models
```

Test the SDXL model for comparison with different precision formats.

### Substep A. BF16 precision

```bash
python3 demo_txt2img_xl.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --version xl-1.0 --download-onnx-models
```

### Substep B. FP8 quantized precision

```bash
python3 demo_txt2img_xl.py "a beautiful photograph of Mt. Fuji during cherry blossom" \
  --hf-token=$HF_TOKEN --version xl-1.0 --fp8 --download-onnx-models
```

From the accompanying Docker Compose file (this fragment sits inside the `services:` section; the service name is not shown here):

```yaml
    stack: 67108864
    networks:
      - host
    healthcheck:
      test: ["CMD", "service", "ssh", "status"]
      interval: 30s
      timeout: 10s
      retries: 10

networks:
  host:
    name: host
    external: true
```

From the accompanying setup script's header:

```bash
#!/bin/env bash
#
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
# See the License for the specific language governing permissions and
# limitations under the License.
#

set -e
```
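The `set -e` at the end of that header makes the script abort at the first failing command instead of continuing with a broken state. A minimal illustration, independent of the playbook's script:

```shell
# Sketch: `set -e` aborts a script at the first failing command.
# Run the demonstration in a child bash so the failure is contained,
# and capture the child's exit status.
out=$(bash -c 'set -e; false; echo "not reached"'; echo "exit=$?")
echo "$out"  # the echo after `false` never runs; the child exits with 1
```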