mirror of
https://github.com/NVIDIA/dgx-spark-playbooks.git
synced 2026-04-23 10:33:51 +00:00
chore: Regenerate all playbooks
This commit is contained in:
parent
f39dd60161
commit
6df8e01e0d
@ -35,7 +35,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
|
|||||||
- [Use a NIM on Spark](nvidia/nim-llm/)
|
- [Use a NIM on Spark](nvidia/nim-llm/)
|
||||||
- [Quantize to NVFP4](nvidia/nvfp4-quantization/)
|
- [Quantize to NVFP4](nvidia/nvfp4-quantization/)
|
||||||
- [Ollama](nvidia/ollama/)
|
- [Ollama](nvidia/ollama/)
|
||||||
- [Use Open WebUI](nvidia/open-webui/)
|
- [Use Open WebUI with Ollama](nvidia/open-webui/)
|
||||||
- [Use Open Fold](nvidia/protein-folding/)
|
- [Use Open Fold](nvidia/protein-folding/)
|
||||||
- [Fine tune with Pytorch](nvidia/pytorch-fine-tune/)
|
- [Fine tune with Pytorch](nvidia/pytorch-fine-tune/)
|
||||||
- [RAG application in AI Workbench](nvidia/rag-ai-workbench/)
|
- [RAG application in AI Workbench](nvidia/rag-ai-workbench/)
|
||||||
|
|||||||
@ -16,24 +16,14 @@
|
|||||||
#
|
#
|
||||||
FROM nvcr.io/nvidia/pytorch:25.09-py3
|
FROM nvcr.io/nvidia/pytorch:25.09-py3
|
||||||
|
|
||||||
ARG HF_TOKEN
|
|
||||||
|
|
||||||
RUN cd /workspace/ && \
|
RUN cd /workspace/ && \
|
||||||
git clone https://github.com/comfyanonymous/ComfyUI.git && \
|
git clone https://github.com/comfyanonymous/ComfyUI.git && \
|
||||||
cd ComfyUI && \
|
cd ComfyUI && \
|
||||||
git checkout 4ffea0e864275301329ddb5ecc3fbc7211d7a802 && \
|
git checkout 4ffea0e864275301329ddb5ecc3fbc7211d7a802 && \
|
||||||
sed -i '/torch/d' requirements.txt && \
|
sed -i '/torch/d' requirements.txt && \
|
||||||
pip install -r requirements.txt && \
|
pip install -r requirements.txt && \
|
||||||
pip install torchsde && \
|
pip install torchsde
|
||||||
mkdir -p /workspace/ComfyUI/user/default/workflows/
|
|
||||||
|
|
||||||
COPY . /workspace/sd-scripts
|
WORKDIR /workspace/ComfyUI
|
||||||
|
|
||||||
RUN hf download black-forest-labs/FLUX.1-dev ae.safetensors --local-dir models/vae && \
|
|
||||||
hf download black-forest-labs/FLUX.1-dev flux1-dev.safetensors --local-dir models/checkpoints && \
|
|
||||||
hf download comfyanonymous/flux_text_encoders clip_l.safetensors --local-dir models/text_encoders && \
|
|
||||||
hf download comfyanonymous/flux_text_encoders t5xxl_fp16.safetensors --local-dir models/text_encoders && \
|
|
||||||
hf download RLakshmi24/flux-dreambooth-lora-tj-spark flux_dreambooth.safetensors --local-dir models/loras && \
|
|
||||||
cp /workspace/sd-scripts/workflows/finetuned_flux.json /workspace/ComfyUI/user/default/workflows/
|
|
||||||
|
|
||||||
CMD ["/bin/bash"]
|
CMD ["/bin/bash"]
|
||||||
@ -17,8 +17,6 @@
|
|||||||
|
|
||||||
FROM nvcr.io/nvidia/pytorch:25.09-py3
|
FROM nvcr.io/nvidia/pytorch:25.09-py3
|
||||||
|
|
||||||
ARG HF_TOKEN
|
|
||||||
|
|
||||||
RUN cd /workspace/ && \
|
RUN cd /workspace/ && \
|
||||||
git clone https://github.com/kohya-ss/sd-scripts.git && \
|
git clone https://github.com/kohya-ss/sd-scripts.git && \
|
||||||
cd sd-scripts && \
|
cd sd-scripts && \
|
||||||
@ -27,16 +25,6 @@ RUN cd /workspace/ && \
|
|||||||
apt update && \
|
apt update && \
|
||||||
apt install -y libgl1-mesa-dev
|
apt install -y libgl1-mesa-dev
|
||||||
|
|
||||||
COPY . /workspace/sd-scripts
|
WORKDIR /workspace/sd-scripts
|
||||||
|
|
||||||
RUN hf auth login --token $HF_TOKEN
|
|
||||||
|
|
||||||
RUN cd /workspace/sd-scripts/ && \
|
|
||||||
hf download black-forest-labs/FLUX.1-dev ae.safetensors --local-dir models && \
|
|
||||||
hf download black-forest-labs/FLUX.1-dev flux1-dev.safetensors --local-dir models && \
|
|
||||||
hf download comfyanonymous/flux_text_encoders clip_l.safetensors --local-dir models && \
|
|
||||||
hf download comfyanonymous/flux_text_encoders t5xxl_fp16.safetensors --local-dir models
|
|
||||||
|
|
||||||
RUN cd /workspace/sd-scripts
|
|
||||||
|
|
||||||
CMD ["/bin/bash"]
|
CMD ["/bin/bash"]
|
||||||
|
|||||||
@ -1,108 +1,172 @@
|
|||||||
# FLUX.1 Fine-tuning with LoRA
|
# FLUX.1 Fine-tuning with LoRA
|
||||||
|
|
||||||
This project demonstrates fine-tuning the FLUX.1-dev 11B model using Dreambooth LoRA (Low-Rank Adaptation) for custom image generation. The demo includes training on custom concepts and inference through both command-line scripts and ComfyUI.
|
This project demonstrates fine-tuning the FLUX.1-dev 12B model using Dreambooth LoRA (Low-Rank Adaptation) for custom image generation. The demo includes training on custom concepts and inference through both command-line scripts and ComfyUI.
|
||||||
|
|
||||||
## Results
|
|
||||||
|
|
||||||
Fine-tuning FLUX.1 with custom concepts enables the model to generate images with your specific objects and styles:
|
|
||||||
|
|
||||||
<figure>
|
|
||||||
<img src="flux_assets/before_finetuning.png" alt="Before Fine-tuning" width="400"/>
|
|
||||||
<figcaption>Base FLUX.1 model without custom concept knowledge</figcaption>
|
|
||||||
</figure>
|
|
||||||
|
|
||||||
<br>
|
|
||||||
|
|
||||||
<figure>
|
|
||||||
<img src="flux_assets/after_finetuning.png" alt="After Fine-tuning" width="400"/>
|
|
||||||
<figcaption>FLUX.1 model after LoRA fine-tuning with custom "tjtoy" and "sparkgpu" concepts</figcaption>
|
|
||||||
</figure>
|
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
The project includes:
|
The project includes:
|
||||||
- **FLUX.1-dev Fine-tuning**: LoRA-based fine-tuning using sd-scripts
|
- **FLUX.1-dev Fine-tuning**: LoRA-based fine-tuning
|
||||||
- **Custom Concept Training**: Train on "tjtoy" toy and "sparkgpu" GPU
|
- **Custom Concept Training**: Train on "tjtoy" toy and "sparkgpu" GPU
|
||||||
- **Command-line Inference**: Generate images using trained LoRA weights
|
- **Command-line Inference**: Generate images using trained LoRA weights
|
||||||
- **ComfyUI Integration**: Intuitive workflows for inference with custom models
|
- **ComfyUI Integration**: Intuitive workflows for inference with custom models
|
||||||
- **Docker Support**: Complete containerized environment
|
- **Docker Support**: Complete containerized environment
|
||||||
|
|
||||||
## Training
|
## Contents
|
||||||
|
1. [Model Download](#1-model-download)
|
||||||
|
2. [Base Model Inference](#2-base-model-inference)
|
||||||
|
3. [Dataset Preparation](#3-dataset-preparation)
|
||||||
|
4. [Training](#4-training)
|
||||||
|
5. [Finetuned Model Inference](#5-finetuned-model-inference)
|
||||||
|
|
||||||
### 1. Build Docker Image by providing `HF_TOKEN`
|
## 1. Model Download
|
||||||
|
|
||||||
|
### 1.1 Install Hugging Face Hub
|
||||||
|
|
||||||
|
Let's install the Hugging Face CLI client for authentication and downloading model checkpoints.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Build the Docker image (this will download FLUX models automatically)
|
pip install huggingface_hub
|
||||||
docker build -f Dockerfile.train --build-arg HF_TOKEN=$HF_TOKEN -t flux-training .
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**Note**: The Docker build automatically downloads the required FLUX models:
|
### 1.2 Hugging Face Authentication
|
||||||
- `flux1-dev.safetensors` (~23GB)
|
|
||||||
|
You will have to be granted access to the FLUX.1-dev model since it is gated. Go to the [model card](https://huggingface.co/black-forest-labs/FLUX.1-dev) to accept the terms and gain access to the checkpoints.
|
||||||
|
|
||||||
|
If you do not have a `HF_TOKEN` already, follow the instructions [here](https://huggingface.co/docs/hub/en/security-tokens) to generate one. Authenticate your system by replacing `<YOUR_HF_TOKEN>` in the following command with your generated token.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
hf auth login --token <YOUR_HF_TOKEN>
|
||||||
|
```
|
||||||
|
|
||||||
|
### 1.3 Download the pre-trained checkpoints
|
||||||
|
|
||||||
|
The following snippet downloads the required FLUX models for training and inference.
|
||||||
|
- `flux1-dev.safetensors` (~23.8GB)
|
||||||
- `ae.safetensors` (~335MB)
|
- `ae.safetensors` (~335MB)
|
||||||
- `clip_l.safetensors` (~246MB)
|
- `clip_l.safetensors` (~246MB)
|
||||||
- `t5xxl_fp16.safetensors` (~9.8GB)
|
- `t5xxl_fp16.safetensors` (~9.8GB)
|
||||||
|
|
||||||
### 2. Run Docker Container
|
```bash
|
||||||
|
cd flux-finetuning/assets
|
||||||
|
|
||||||
|
# download the checkpoints (takes about 5-10 minutes total, depending on your internet speed)
|
||||||
|
sh download.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify that your `models/` directory follows this structure after downloading the checkpoints.
|
||||||
|
|
||||||
|
```
|
||||||
|
models/
|
||||||
|
├── checkpoints/
|
||||||
|
│ └── flux1-dev.safetensors
|
||||||
|
├── loras/
|
||||||
|
├── text_encoders/
|
||||||
|
│ ├── clip_l.safetensors
|
||||||
|
│ └── t5xxl_fp16.safetensors
|
||||||
|
└── vae/
|
||||||
|
└── ae.safetensors
|
||||||
|
```
|
||||||
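To confirm the downloads completed, you can check that every file in the tree above is actually present. A small helper sketch (hypothetical, not part of the repo):

```python
from pathlib import Path

# Files expected under models/ after running download.sh (per the tree above).
REQUIRED = [
    "models/checkpoints/flux1-dev.safetensors",
    "models/vae/ae.safetensors",
    "models/text_encoders/clip_l.safetensors",
    "models/text_encoders/t5xxl_fp16.safetensors",
]

def missing_checkpoints(root: str = ".") -> list:
    """Return the expected checkpoint paths that are not present on disk."""
    return [p for p in REQUIRED if not (Path(root) / p).is_file()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("missing:", *missing, sep="\n  ")
    else:
        print("all checkpoints present")
```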
|
|
||||||
|
### 1.4 (Optional) Using fine-tuned checkpoints
|
||||||
|
|
||||||
|
If you already have fine-tuned LoRAs, place them inside `models/loras`. If you do not have one yet, proceed to the [Training](#4-training) section for more details.
|
||||||
|
|
||||||
|
## 2. Base Model Inference
|
||||||
|
|
||||||
|
Let's begin by generating an image using the base FLUX.1 model on two concepts we are interested in: Toy Jensen and DGX Spark.
|
||||||
|
|
||||||
|
### 2.1 Spin up the docker container
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Run with GPU support and mount current directory
|
# Build the inference docker image
|
||||||
docker run -it \
|
docker build -f Dockerfile.inference -t flux-comfyui .
|
||||||
--gpus all \
|
|
||||||
--ipc=host \
|
# Launch the ComfyUI container (ensure you are inside flux-finetuning/assets)
|
||||||
--ulimit memlock=-1 \
|
# You can ignore any import errors for `torchaudio`
|
||||||
--ulimit stack=67108864 \
|
sh launch_comfyui.sh
|
||||||
--net=host \
|
|
||||||
flux-training
|
|
||||||
```
|
```
|
||||||
|
Access ComfyUI at `http://localhost:8188` to generate images with the base model. Do not select any pre-existing template.
|
||||||
|
|
||||||
### 3. Train the Model
|
### 2.2 Load the base workflow
|
||||||
|
|
||||||
```bash
|
Find the workflow section on the left-side panel of ComfyUI (or press `w`). Upon opening it, you should see two pre-loaded workflows. For the base Flux model, load the `base_flux.json` workflow; after loading the JSON, ComfyUI should display the full workflow.
|
||||||
# Inside the container, navigate to sd-scripts and run training
|
|
||||||
cd /workspace/sd-scripts
|
|
||||||
sh train.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
### 4. Run Inference
|
### 2.3 Fill in the prompt for your generation
|
||||||
The `inference.sh` script generates 9 images with different seeds.
|
|
||||||
|
|
||||||
After training, you can generate images using the learned concepts. For example:
|
Provide your prompt in the `CLIP Text Encode (Prompt)` block. For example, we will use `Toy Jensen holding a DGX Spark in a datacenter`. You can expect the generation to take ~3 minutes, since creating high-resolution 1024px images is compute-intensive.
|
||||||
- `"tjtoy toy"` - Your custom toy concept
|
|
||||||
- `"sparkgpu gpu"` - Your custom GPU concept
|
|
||||||
- Combine them: `"tjtoy toy holding sparkgpu gpu"`
|
|
||||||
|
|
||||||
```bash
|
For the provided prompt and random seed, the base Flux model generated the following image. Although the generation has good quality, it fails to understand the custom characters and concepts we would like to generate.
|
||||||
# Generate images using the trained LoRA
|
|
||||||
sh inference.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
### Dataset Structure
|
<figure>
|
||||||
|
<img src="flux_assets/before_workflow.png" alt="Base model workflow" width="1000"/>
|
||||||
|
<figcaption>Base FLUX.1 model workflow without custom concept knowledge</figcaption>
|
||||||
|
</figure>
|
||||||
|
|
||||||
The training data is organized in the `data/` directory:
|
## 3. Dataset Preparation
|
||||||
|
|
||||||
|
Let's prepare our dataset to perform Dreambooth LoRA fine-tuning on the FLUX.1-dev 12B model. If you wish to continue with the provided dataset of Toy Jensen and DGX Spark, feel free to skip to the [Training](#4-training) section. This dataset is a collection of public assets accessible via Google Images.
|
||||||
|
|
||||||
|
### 3.1 Data collection
|
||||||
|
|
||||||
|
You will need to prepare a dataset of all the concepts you would like to generate, with about 5-10 images per concept. For this example, we would like to generate images with two concepts.
|
||||||
|
|
||||||
|
#### TJToy Concept
|
||||||
|
- **Trigger phrase**: `tjtoy toy`
|
||||||
|
- **Training images**: 6 high-quality images of custom toy figures
|
||||||
|
- **Use case**: Generate images featuring the specific toy character in various scenes
|
||||||
|
|
||||||
|
#### SparkGPU Concept
|
||||||
|
- **Trigger phrase**: `sparkgpu gpu`
|
||||||
|
- **Training images**: 7 images of custom GPU hardware
|
||||||
|
- **Use case**: Generate images featuring the specific GPU design in different contexts
|
||||||
|
|
||||||
|
### 3.2 Format the dataset
|
||||||
|
|
||||||
|
Create a folder for each concept with its corresponding name, and place it inside the `flux_data` directory. In our case, we have used `sparkgpu` and `tjtoy` as our concepts, and placed a few images inside each of them. After preparing the dataset, the structure inside `flux_data` should match the following.
|
||||||
|
|
||||||
```
|
```
|
||||||
data/
|
flux_data/
|
||||||
├── data.toml # Training configuration
|
├── data.toml
|
||||||
├── tjtoy/ # Custom toy concept images (6 images)
|
├── concept_1/
|
||||||
│ ├── 1.png
|
│ ├── 1.png
|
||||||
│ ├── 2.jpg
|
│ ├── 2.jpg
|
||||||
│ ├── 3.png
|
└── ...
|
||||||
│ ├── 4.png
|
└── concept_2/
|
||||||
│ ├── 5.png
|
|
||||||
│ └── 6.png
|
|
||||||
└── sparkgpu/ # Custom GPU concept images (7 images)
|
|
||||||
├── 1.jpeg
|
├── 1.jpeg
|
||||||
├── 2.jpg
|
├── 2.jpg
|
||||||
├── 3.jpg
|
└── ...
|
||||||
├── 4.jpg
|
|
||||||
├── 6.png
|
|
||||||
├── 7.png
|
|
||||||
└── 8.png
|
|
||||||
```
|
```
|
||||||
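As a quick sanity check before training, you can count the images in each concept folder to make sure every concept has the recommended 5-10 images. A small helper sketch (hypothetical, not part of the repo):

```python
from pathlib import Path

def check_flux_data(root: str, min_images: int = 5) -> dict:
    """Count training images in each concept folder under flux_data/."""
    exts = {".png", ".jpg", ".jpeg"}
    counts = {}
    for concept in sorted(Path(root).iterdir()):
        # Skip loose files such as data.toml; only concept folders count.
        if concept.is_dir():
            counts[concept.name] = sum(
                1 for f in concept.iterdir() if f.suffix.lower() in exts
            )
    for name, n in counts.items():
        if n < min_images:
            print(f"warning: '{name}' has only {n} images; 5-10 recommended")
    return counts
```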
|
|
||||||
### Training Parameters
|
### 3.3 Update the data config
|
||||||
|
|
||||||
Key training settings in `train.sh`:
|
Now, let's modify the `flux_data/data.toml` file to reflect the concepts chosen. Ensure that you update or create entries for each of your concepts by modifying the `image_dir` and `class_tokens` fields under `[[datasets.subsets]]`. For better fine-tuning performance, it is good practice to append a class token to your concept name (like `toy` or `gpu`).
|
||||||
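An illustrative sketch of what `data.toml` might look like for the two provided concepts. The exact field values here are assumptions; adjust `image_dir`, `class_tokens`, and `num_repeats` for your own dataset, following the sd-scripts dataset config schema.

```toml
[general]
shuffle_caption = false

[[datasets]]
resolution = 1024
batch_size = 1

  # One [[datasets.subsets]] entry per concept folder.
  [[datasets.subsets]]
  image_dir = "flux_data/tjtoy"
  class_tokens = "tjtoy toy"
  num_repeats = 10

  [[datasets.subsets]]
  image_dir = "flux_data/sparkgpu"
  class_tokens = "sparkgpu gpu"
  num_repeats = 10
```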
|
|
||||||
|
## 4. Training
|
||||||
|
|
||||||
|
### 4.1 Build the docker image
|
||||||
|
|
||||||
|
Make sure that the ComfyUI inference container is brought down before proceeding to train.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Build the training docker image
|
||||||
|
docker build -f Dockerfile.train -t flux-train .
|
||||||
|
```
|
||||||
|
|
||||||
|
### 4.2 Set up the training command
|
||||||
|
|
||||||
|
Launch training by executing the following command. The training script is set up with a default configuration that can generate reasonable images for your dataset in about 90 minutes of training. The train command automatically stores checkpoints in the `models/loras/` directory.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sh launch_train.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
If you wish to generate very high-quality images of your custom concepts (like the images shown in this README), you will have to train for much longer (~8 hours). To accomplish this, set the number of epochs in the `launch_train.sh` script to 100.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
--max_train_epochs=100
|
||||||
|
```
|
||||||
|
|
||||||
|
Feel free to play around with the other hyperparameters in the `launch_train.sh` script to find the best settings for your dataset. Some notable parameters to tune include:
|
||||||
- **Network Type**: LoRA with dimension 256
|
- **Network Type**: LoRA with dimension 256
|
||||||
- **Learning Rate**: 1.0 (with Prodigy optimizer)
|
- **Learning Rate**: 1.0 (with Prodigy optimizer)
|
||||||
- **Epochs**: 100 (saves every 25 epochs)
|
- **Epochs**: 100 (saves every 25 epochs)
|
||||||
@ -110,94 +174,49 @@ Key training settings in `train.sh`:
|
|||||||
- **Mixed Precision**: bfloat16
|
- **Mixed Precision**: bfloat16
|
||||||
- **Optimizations**: Torch compile, gradient checkpointing, cached latents
|
- **Optimizations**: Torch compile, gradient checkpointing, cached latents
|
||||||
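The `--network_dim` and `--network_alpha` settings control the low-rank update that LoRA adds to each frozen weight: W' = W + (alpha/dim)·BA. A minimal numpy sketch of this scaling (hypothetical layer shapes for illustration, not the actual FLUX code):

```python
import numpy as np

dim, alpha = 256, 256          # matches network_dim / network_alpha above
d_out, d_in = 64, 48           # hypothetical layer shape for illustration

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))       # frozen base weight
B = np.zeros((d_out, dim))                   # LoRA "up" matrix, zero-initialized
A = rng.standard_normal((dim, d_in)) * 0.01  # LoRA "down" matrix

# Effective weight during/after training: base plus scaled low-rank update.
W_eff = W + (alpha / dim) * (B @ A)

# With B zero-initialized, training starts exactly at the base model.
assert np.allclose(W_eff, W)
```

Because only `A` and `B` are trained, the saved LoRA file is a small fraction of the full model's size.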
|
|
||||||
## ComfyUI
|
## 5. Finetuned Model Inference
|
||||||
|
|
||||||
ComfyUI provides an intuitive visual interface for using your fine-tuned LoRA models. The beauty of LoRA fine-tuning is that you can easily add your custom concepts to any FLUX workflow with just a single node.
|
Now let's generate images using our finetuned LoRAs!
|
||||||
|
|
||||||
### 1. Build Docker Image by providing `HF_TOKEN`
|
### 5.1 Spin up the docker container
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Build the Docker image (this will download FLUX models automatically)
|
# Build the inference docker image, if you haven't already
|
||||||
docker build -f Dockerfile.inference --build-arg HF_TOKEN=$HF_TOKEN -t flux-comfyui .
|
docker build -f Dockerfile.inference -t flux-comfyui .
|
||||||
|
|
||||||
|
# Launch the ComfyUI container (ensure you are inside flux-finetuning/assets)
|
||||||
|
# You can ignore any import errors for `torchaudio`
|
||||||
|
sh launch_comfyui.sh
|
||||||
```
|
```
|
||||||
|
Access ComfyUI at `http://localhost:8188` to generate images with the finetuned model. Do not select any pre-existing template.
|
||||||
|
|
||||||
### 2. Run Docker Container
|
### 5.2 Load the finetuned workflow
|
||||||
|
|
||||||
```bash
|
Find the workflow section on the left-side panel of ComfyUI (or press `w`). Upon opening it, you should see two pre-loaded workflows. For the finetuned Flux model, load the `finetuned_flux.json` workflow; after loading the JSON, ComfyUI should display the full workflow.
|
||||||
# Run with GPU support and mount current directory
|
|
||||||
docker run -it \
|
|
||||||
--gpus all \
|
|
||||||
--ipc=host \
|
|
||||||
--ulimit memlock=-1 \
|
|
||||||
--ulimit stack=67108864 \
|
|
||||||
--net=host \
|
|
||||||
flux-comfyui
|
|
||||||
```
|
|
||||||
|
|
||||||
### 3. Running ComfyUI
|
### 5.3 Fill in the prompt for your generation
|
||||||
|
|
||||||
```bash
|
Provide your prompt in the `CLIP Text Encode (Prompt)` block. Now let's incorporate our custom concepts into the prompt for the finetuned model. For example, we will use `tjtoy toy holding sparkgpu gpu in a datacenter`. You can expect the generation to take ~3 minutes, since creating high-resolution 1024px images is compute-intensive.
|
||||||
# Start ComfyUI server
|
|
||||||
cd /workspace/ComfyUI
|
|
||||||
python main.py
|
|
||||||
```
|
|
||||||
|
|
||||||
Access ComfyUI at `http://localhost:8188`
|
For the provided prompt and random seed, the finetuned Flux model generated the following image. Unlike the base model, the finetuned model can combine multiple custom concepts in a single image.
|
||||||
|
|
||||||
### 4. ComfyUI Workflow Example
|
<figure>
|
||||||
|
<img src="flux_assets/after_workflow.png" alt="After Fine-tuning" width="1000"/>
|
||||||
|
<figcaption>Finetuned FLUX.1 model with custom concept knowledge</figcaption>
|
||||||
|
</figure>
|
||||||
|
|
||||||

|
### 5.4 (Optional) Tuning your generations
|
||||||
*ComfyUI workflow showing how easily LoRA can be integrated into the base FLUX model*
|
|
||||||
|
|
||||||
The workflow demonstrates the simplicity of LoRA integration:
|
ComfyUI exposes several fields to tune and change the look and feel of the generated images. Here are some parameters to look out for in the workflow.
|
||||||
1. **Load Checkpoint**: Base FLUX.1-dev model remains unchanged
|
|
||||||
2. **Load LoRA**: Simply add your trained LoRA file (`flux_dreambooth.safetensors`)
|
|
||||||
3. **Adjust Strength**: Fine-tune the influence of your custom concepts (0.8-1.2 typically works well)
|
|
||||||
4. **Generate**: Use your custom trigger words (`tjtoy toy`, `sparkgpu gpu`) in prompts
|
|
||||||
|
|
||||||
This modular approach means you can:
|
1. **LoRA weights**: Change your trained LoRA file in the `Load LoRA` node, and tune its strength
|
||||||
- **Preserve base model quality**: The original FLUX capabilities remain intact
|
2. **Adjust resolution**: Modify the width and height in the `Empty Latent Image` node for other resolutions
|
||||||
- **Easy experimentation**: Quickly swap different LoRA models or adjust strengths
|
3. **Random seed**: Change the noise seed in the `RandomNoise` node to get alternative images for the same prompt
|
||||||
- **Combine concepts**: Mix multiple LoRA models or use them with other techniques
|
4. **Tune sampling**: Modify the sampler, scheduler, and steps as necessary
|
||||||
- **Minimal storage**: LoRA files are typically 100-200MB vs 23GB+ for full models
|
|
||||||
|
|
||||||
### ComfyUI Model Structure
|
|
||||||
|
|
||||||
Organize models in ComfyUI as follows:
|
|
||||||
|
|
||||||
```
|
|
||||||
ComfyUI/models/
|
|
||||||
├── checkpoints/
|
|
||||||
│ └── flux1-dev.safetensors # Main FLUX model
|
|
||||||
├── vae/
|
|
||||||
│ └── ae.safetensors # FLUX VAE
|
|
||||||
├── clip/
|
|
||||||
│ ├── clip_l.safetensors # CLIP text encoder
|
|
||||||
│ └── t5xxl_fp16.safetensors # T5 text encoder
|
|
||||||
└── loras/
|
|
||||||
└── flux_dreambooth.safetensors # Your trained LoRA
|
|
||||||
```
|
|
||||||
|
|
||||||
## Custom Concepts
|
|
||||||
|
|
||||||
The fine-tuning process teaches FLUX.1 to understand two custom concepts:
|
|
||||||
|
|
||||||
### TJToy Concept
|
|
||||||
- **Trigger phrase**: `tjtoy toy`
|
|
||||||
- **Training images**: 6 high-quality images of custom toy figures
|
|
||||||
- **Use case**: Generate images featuring the specific toy character in various scenes
|
|
||||||
|
|
||||||
### SparkGPU Concept
|
|
||||||
- **Trigger phrase**: `sparkgpu gpu`
|
|
||||||
- **Training images**: 7 images of custom GPU hardware
|
|
||||||
- **Use case**: Generate images featuring the specific GPU design in different contexts
|
|
||||||
|
|
||||||
### Combined Usage
|
|
||||||
You can combine both concepts in prompts:
|
|
||||||
- `"tjtoy toy holding sparkgpu gpu"`
|
|
||||||
- `"tjtoy toy standing next to sparkgpu gpu in a data center"`
|
|
||||||
- `"sparkgpu gpu being examined by tjtoy toy"`
|
|
||||||
|
|
||||||
## Credits
|
## Credits
|
||||||
|
|
||||||
This project uses [sd-scripts](https://github.com/kohya-ss/sd-scripts) repository by `kohya-ss` for FLUX.1 fine-tuning.
|
This project uses the following open-source repositories:
|
||||||
|
- [sd-scripts](https://github.com/kohya-ss/sd-scripts) repository by `kohya-ss` for FLUX.1 fine-tuning.
|
||||||
|
- [ComfyUI](https://github.com/comfyanonymous/ComfyUI.git) repository by `comfyanonymous` for FLUX.1 inference.
|
||||||
|
|||||||
@ -1,4 +1,3 @@
|
|||||||
#!/bin/bash
|
|
||||||
#
|
#
|
||||||
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||||
# SPDX-License-Identifier: Apache-2.0
|
# SPDX-License-Identifier: Apache-2.0
|
||||||
@ -14,23 +13,8 @@
|
|||||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
# See the License for the specific language governing permissions and
|
# See the License for the specific language governing permissions and
|
||||||
# limitations under the License.
|
# limitations under the License.
|
||||||
|
#
|
||||||
for SEED in $(seq 0 7); do
|
hf download black-forest-labs/FLUX.1-dev ae.safetensors --local-dir models/vae
|
||||||
python flux_minimal_inference.py \
|
hf download black-forest-labs/FLUX.1-dev flux1-dev.safetensors --local-dir models/checkpoints
|
||||||
--ckpt_path="models/flux1-dev.safetensors" \
|
hf download comfyanonymous/flux_text_encoders clip_l.safetensors --local-dir models/text_encoders
|
||||||
--model_type="flux" \
|
hf download comfyanonymous/flux_text_encoders t5xxl_fp16.safetensors --local-dir models/text_encoders
|
||||||
--clip_l="models/clip_l.safetensors" \
|
|
||||||
--t5xxl="models/t5xxl_fp16.safetensors" \
|
|
||||||
--ae="models/ae.safetensors" \
|
|
||||||
--output_dir="outputs" \
|
|
||||||
--lora_weights="saved_models/flux_dreambooth.safetensors" \
|
|
||||||
--merge_lora_weights \
|
|
||||||
--prompt="tjtoy toy holding sparkgpu gpu in a datacenter" \
|
|
||||||
--width=1024 \
|
|
||||||
--height=1024 \
|
|
||||||
--steps=50 \
|
|
||||||
--guidance=1.0 \
|
|
||||||
--cfg_scale=1.0 \
|
|
||||||
--seed=$SEED \
|
|
||||||
--dtype="bfloat16"
|
|
||||||
done
|
|
||||||
Binary file not shown.
|
Before Width: | Height: | Size: 1.1 MiB |
BIN
nvidia/flux-finetuning/assets/flux_assets/after_workflow.png
Normal file
BIN
nvidia/flux-finetuning/assets/flux_assets/after_workflow.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 1.8 MiB |
Binary file not shown.
|
Before Width: | Height: | Size: 987 KiB |
BIN
nvidia/flux-finetuning/assets/flux_assets/before_workflow.png
Normal file
BIN
nvidia/flux-finetuning/assets/flux_assets/before_workflow.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 1.9 MiB |
Binary file not shown.
|
Before Width: | Height: | Size: 827 KiB |
29
nvidia/flux-finetuning/assets/launch_comfyui.sh
Normal file
29
nvidia/flux-finetuning/assets/launch_comfyui.sh
Normal file
@ -0,0 +1,29 @@
|
|||||||
|
#
|
||||||
|
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||||
|
# SPDX-License-Identifier: Apache-2.0
|
||||||
|
#
|
||||||
|
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
# you may not use this file except in compliance with the License.
|
||||||
|
# You may obtain a copy of the License at
|
||||||
|
#
|
||||||
|
# http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
#
|
||||||
|
# Unless required by applicable law or agreed to in writing, software
|
||||||
|
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
# See the License for the specific language governing permissions and
|
||||||
|
# limitations under the License.
|
||||||
|
#
|
||||||
|
docker run -it \
|
||||||
|
--gpus all \
|
||||||
|
--ipc=host \
|
||||||
|
--net=host \
|
||||||
|
--ulimit memlock=-1 \
|
||||||
|
--ulimit stack=67108864 \
|
||||||
|
-v $(pwd)/models/vae:/workspace/ComfyUI/models/vae \
|
||||||
|
-v $(pwd)/models/loras:/workspace/ComfyUI/models/loras \
|
||||||
|
-v $(pwd)/models/checkpoints:/workspace/ComfyUI/models/checkpoints \
|
||||||
|
-v $(pwd)/models/text_encoders:/workspace/ComfyUI/models/text_encoders \
|
||||||
|
-v $(pwd)/workflows/:/workspace/ComfyUI/user/default/workflows/ \
|
||||||
|
flux-comfyui \
|
||||||
|
python main.py
|
||||||
66
nvidia/flux-finetuning/assets/launch_train.sh
Normal file
66
nvidia/flux-finetuning/assets/launch_train.sh
Normal file
@ -0,0 +1,66 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
#
|
||||||
|
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||||
|
# SPDX-License-Identifier: Apache-2.0
|
||||||
|
#
|
||||||
|
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
# you may not use this file except in compliance with the License.
|
||||||
|
# You may obtain a copy of the License at
|
||||||
|
#
|
||||||
|
# http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
#
|
||||||
|
# Unless required by applicable law or agreed to in writing, software
|
||||||
|
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
# See the License for the specific language governing permissions and
|
||||||
|
# limitations under the License.
|
||||||
|
#
|
||||||
|
CMD="accelerate launch \
|
||||||
|
--num_processes=1 --num_machines=1 --mixed_precision=bf16 \
|
||||||
|
--main_process_ip=127.0.0.1 --main_process_port=29500 \
|
||||||
|
--num_cpu_threads_per_process=2 \
|
||||||
|
flux_train_network.py \
|
||||||
|
--pretrained_model_name_or_path=models/checkpoints/flux1-dev.safetensors \
|
||||||
|
--clip_l=models/text_encoders/clip_l.safetensors \
|
||||||
|
--t5xxl=models/text_encoders/t5xxl_fp16.safetensors \
|
||||||
|
--ae=models/vae/ae.safetensors \
|
||||||
|
--dataset_config=flux_data/data.toml \
|
||||||
|
--output_dir=models/loras/ \
|
||||||
|
--prior_loss_weight=1.0 \
|
||||||
|
--output_name=flux_dreambooth \
|
||||||
|
--save_model_as=safetensors \
|
||||||
|
--network_module=networks.lora_flux \
|
||||||
|
--network_dim=256 \
|
||||||
|
--network_alpha=256 \
|
||||||
|
--learning_rate=1.0 \
|
||||||
|
--optimizer_type=Prodigy \
|
||||||
|
--lr_scheduler=cosine_with_restarts \
|
||||||
|
--gradient_accumulation_steps 4 \
|
||||||
|
--gradient_checkpointing \
|
||||||
|
--sdpa \
|
||||||
|
--max_train_epochs=1 \
|
||||||
|
--save_every_n_epochs=25 \
|
||||||
|
--mixed_precision=bf16 \
|
||||||
|
--guidance_scale=1.0 \
|
||||||
|
--timestep_sampling=flux_shift \
|
||||||
|
--model_prediction_type=raw \
|
||||||
|
--torch_compile \
|
||||||
|
--persistent_data_loader_workers \
|
||||||
|
--cache_latents \
|
||||||
|
--cache_latents_to_disk \
|
||||||
|
--cache_text_encoder_outputs \
|
||||||
|
--cache_text_encoder_outputs_to_disk"
|
||||||
|
|
||||||
|
docker run -it \
|
||||||
|
--gpus all \
|
||||||
|
--ipc=host \
|
||||||
|
--net=host \
|
||||||
|
--ulimit memlock=-1 \
|
||||||
|
--ulimit stack=67108864 \
|
||||||
|
-v $(pwd)/flux_data:/workspace/sd-scripts/flux_data \
|
||||||
|
-v $(pwd)/models/vae:/workspace/sd-scripts/models/vae \
|
||||||
|
-v $(pwd)/models/loras:/workspace/sd-scripts/models/loras \
|
||||||
|
-v $(pwd)/models/checkpoints:/workspace/sd-scripts/models/checkpoints \
|
||||||
|
-v $(pwd)/models/text_encoders:/workspace/sd-scripts/models/text_encoders \
|
||||||
|
flux-train \
|
||||||
|
bash -c "$CMD"
|
||||||
0
nvidia/flux-finetuning/assets/models/loras/.gitkeep
Normal file
0
nvidia/flux-finetuning/assets/models/loras/.gitkeep
Normal file
0
nvidia/flux-finetuning/assets/models/vae/.gitkeep
Normal file
0
nvidia/flux-finetuning/assets/models/vae/.gitkeep
Normal file
@@ -1,52 +0,0 @@
-#!/bin/bash
-#
-# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-accelerate launch \
-  --num_processes=1 --num_machines=1 --mixed_precision=bf16 \
-  --main_process_ip=127.0.0.1 --main_process_port=29500 \
-  --num_cpu_threads_per_process=2 \
-  flux_train_network.py \
-  --pretrained_model_name_or_path="models/flux1-dev.safetensors" \
-  --clip_l="models/clip_l.safetensors" \
-  --t5xxl="models/t5xxl_fp16.safetensors" \
-  --ae="models/ae.safetensors" \
-  --dataset_config="flux_data/data.toml" \
-  --output_dir="saved_models" \
-  --prior_loss_weight=1.0 \
-  --output_name="flux_dreambooth" \
-  --save_model_as=safetensors \
-  --network_module=networks.lora_flux \
-  --network_dim=256 \
-  --network_alpha=256 \
-  --learning_rate=1.0 \
-  --optimizer_type="Prodigy" \
-  --lr_scheduler="cosine_with_restarts" \
-  --sdpa \
-  --max_train_epochs=100 \
-  --save_every_n_epochs=25 \
-  --mixed_precision="bf16" \
-  --guidance_scale=1.0 \
-  --timestep_sampling="flux_shift" \
-  --model_prediction_type="raw" \
-  --torch_compile \
-  --persistent_data_loader_workers \
-  --cache_latents \
-  --cache_latents_to_disk \
-  --cache_text_encoder_outputs \
-  --cache_text_encoder_outputs_to_disk \
-  --gradient_checkpointing
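The removed script above points `--dataset_config` at `flux_data/data.toml`. As a hedged sketch, a minimal dataset config in the kohya-ss sd-scripts TOML schema might look like the following; `image_dir` and `num_repeats` are illustrative values, not taken from the playbook.

```shell
# Write a minimal dataset config for --dataset_config="flux_data/data.toml".
# Field names follow the kohya-ss sd-scripts TOML schema; the image directory
# and repeat count are assumptions for illustration.
mkdir -p flux_data
cat > flux_data/data.toml <<'EOF'
[general]
caption_extension = ".txt"

[[datasets]]
resolution = 1024
batch_size = 1

  [[datasets.subsets]]
  image_dir = "flux_data/images"
  num_repeats = 10
EOF
```

Each `[[datasets.subsets]]` block names one image folder; captions sit next to the images as `.txt` files per `caption_extension`.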
1	nvidia/flux-finetuning/assets/workflows/base_flux.json	Normal file
File diff suppressed because one or more lines are too long
@@ -1,6 +1,6 @@
-# Use Open WebUI
+# Use Open WebUI with Ollama
 
-> Install Open WebUI and chat with models on your Spark
+> Install Open WebUI and use Ollama to chat with models on your Spark
 
 ## Table of Contents
 
@@ -12,9 +12,9 @@
 
 ## Overview
 
-## Basic Idea
+## Basic idea
 
-Open WebUI is an extensible, self-hosted AI interface that operating entirely offline.
+Open WebUI is an extensible, self-hosted AI interface that operates entirely offline.
 This playbook shows you how to deploy Open WebUI with an integrated Ollama server on your DGX Spark device using
 NVIDIA Sync. The setup creates a secure SSH tunnel that lets you access the web
 interface from your local browser while the models run on Spark's GPU.
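The "secure SSH tunnel" described in this hunk is, in effect, a local port forward that NVIDIA Sync automates. A manual equivalent might look like the sketch below; the user and host names are placeholders, and the port pairing (local 12000 to the container's web port 8080) is an assumption based on the ports mentioned elsewhere in the playbook.

```shell
# Manual equivalent of the tunnel NVIDIA Sync creates: forward local port
# 12000 to the Open WebUI port on the Spark. "user@spark.local" is a
# placeholder, not the playbook's actual host.
TUNNEL_CMD="ssh -N -L 12000:localhost:8080 user@spark.local"
echo "$TUNNEL_CMD"   # shown as a dry run; run the command itself to open the tunnel
```

With the tunnel open, the web interface is reached at `http://localhost:12000` while inference stays on the Spark's GPU.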
@@ -45,7 +45,7 @@ for model management, persistent data storage, and GPU acceleration for model in
 - Large model downloads may take significant time depending on network speed
 
 **Rollback**: Stop and remove Docker containers using provided cleanup commands, remove Custom Port
-from NVIDIA Sync settings
+from NVIDIA Sync settings.
 
 ## Instructions
 
@@ -137,12 +137,12 @@ Common issues and their solutions.
 |---------|-------|-----|
 | Permission denied on docker ps | User not in docker group | Run Step 1 completely, including logging out and logging back in or use sudo|
 | Model download fails | Network connectivity issues | Check internet connection, retry download |
-| GPU not detected in container | Missing --gpus=all flag | Recreate container with correct command |
+| GPU not detected in container | Missing `--gpus=all` flag | Recreate container with correct command |
 | Port 8080 already in use | Another application using port | Change port in docker command or stop conflicting service |
 
 ## Step 8. Cleanup and rollback
 
-Steps to completely remove the Open WebUI installation and free up resources.
+Steps to completely remove the Open WebUI installation and free up resources:
 
 > **Warning**: These commands will permanently delete all Open WebUI data and downloaded models.
 
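Two of the rows in the table above (GPU not detected, port already in use) can be checked from a shell before recreating anything. These are general diagnostics, not commands from the playbook, and the container name used in the comment is an assumption.

```shell
# Is anything already listening on port 8080? (ss ships with iproute2)
ss -ltn 2>/dev/null | grep ':8080' || echo "port 8080 looks free"
# If the container is running, GPU visibility can be checked inside it
# (container name "open-webui" is an assumption):
#   docker exec open-webui nvidia-smi
```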
@@ -224,7 +224,7 @@ Click the gear icon in the top right corner to open the Settings window.
 
 Click on the "Custom" tab to access Custom Ports configuration.
 
-## Step 4. Add Open WebUI Custom Port
+## Step 4. Add Open WebUI custom port
 
 This step creates a new entry in NVIDIA Sync that will manage the Open
 WebUI container and create the necessary SSH tunnel.
@@ -337,7 +337,7 @@ Press Enter to send the message and wait for the model's response.
 
 ## Step 9. Stop the Open WebUI
 
-When you are finished with your session and want to stop the Open WebUI server and reclaim resource, close the Open WebUI from NVIDIA Sync.
+When you are finished with your session and want to stop the Open WebUI server and reclaim resources, close the Open WebUI from NVIDIA Sync.
 
 Click on the NVIDIA Sync icon in your system tray or taskbar to open the main application window.
 
@@ -354,12 +354,12 @@ Common issues and their solutions.
 | Permission denied on docker ps | User not in docker group | Run Step 1 completely, including terminal restart |
 | Browser doesn't open automatically | Auto-open setting disabled | Manually navigate to localhost:12000 |
 | Model download fails | Network connectivity issues | Check internet connection, retry download |
-| GPU not detected in container | Missing --gpus=all flag | Recreate container with correct start script |
+| GPU not detected in container | Missing `--gpus=all` flag | Recreate container with correct start script |
 | Port 12000 already in use | Another application using port | Change port in Custom App settings or stop conflicting service |
 
 ## Step 11. Cleanup and rollback
 
-Steps to completely remove the Open WebUI installation and free up resources.
+Steps to completely remove the Open WebUI installation and free up resources:
 
 > **Warning**: These commands will permanently delete all Open WebUI data and downloaded models.
 
@@ -398,4 +398,4 @@ If Open WebUI reports an update is available, you can update the container image
 docker pull ghcr.io/open-webui/open-webui:ollama
 ```
 
-Then launch Open WebUI again fromNVIDIA Sync.
+After the update, launch Open WebUI again from NVIDIA Sync.
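The update hunk above only pulls the newer image; a full refresh also replaces the running container so the new image is actually used. A hedged sketch of that cycle, printed as a dry run (the container name "open-webui" is an assumption, and the relaunch itself is left to NVIDIA Sync as the playbook describes):

```shell
# Update cycle sketch: pull the new image, then remove the old container so
# NVIDIA Sync (or a manual docker run) recreates it from the updated image.
# Printed as a dry run; drop the echo loop to execute the commands.
for step in \
  "docker pull ghcr.io/open-webui/open-webui:ollama" \
  "docker stop open-webui" \
  "docker rm open-webui"
do
  echo "$step"
done
```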
@@ -74,7 +74,11 @@ On your DGX Spark system, open the **NVIDIA AI Workbench** application and click
 
 ### Troubleshooting installation issues
 
-If you encounter the error message: `An error occurred ... container tool failed to reach ready state. try again: docker is not running` reboot your DGX Spark system to restart the docker service, then reopen NVIDIA AI Workbench.
+If you encounter an error message that says:
+
+    An error occurred ... container tool failed to reach ready state. try again: docker is not running
+
+Reboot your DGX Spark system to restart the docker service, then reopen NVIDIA AI Workbench.
 
 ## Step 2. Verify API key requirements
 