From e4ca6905afe503a3d6474d63e8a12ac24839435f Mon Sep 17 00:00:00 2001 From: GitLab CI Date: Wed, 5 Nov 2025 20:04:14 +0000 Subject: [PATCH] chore: Regenerate all playbooks --- nvidia/comfy-ui/README.md | 238 +++++++++++++-------------- nvidia/cuda-x-data-science/README.md | 1 + 2 files changed, 120 insertions(+), 119 deletions(-) diff --git a/nvidia/comfy-ui/README.md b/nvidia/comfy-ui/README.md index 0e475a4..0faa504 100644 --- a/nvidia/comfy-ui/README.md +++ b/nvidia/comfy-ui/README.md @@ -14,185 +14,185 @@ ## Basic idea - ComfyUI is an open-source web server application for AI image generation using diffusion-based models like SDXL, Flux and others. - It has a browser-based UI that lets you create, edit and run image generation and editing workflows with multiple steps. - Generation and editing steps (e.g. loading a model, adding text or sampling) are configurable in the UI as a node, and you connect nodes with wires to form a workflow. +ComfyUI is an open-source web server application for AI image generation using diffusion-based models like SDXL, Flux and others. +It has a browser-based UI that lets you create, edit and run image generation and editing workflows with multiple steps. +Generation and editing steps (e.g. loading a model, adding text or sampling) are configurable in the UI as a node, and you connect nodes with wires to form a workflow. - ComfyUI uses the host's GPU for inference, so you can install it on your Spark and do all of your image generation and editing directly on device. +ComfyUI uses the host's GPU for inference, so you can install it on your Spark and do all of your image generation and editing directly on device. - Workflows are saved as JSON files, so you can version them for future work, collaboration and reproducibility. +Workflows are saved as JSON files, so you can version them for future work, collaboration and reproducibility. 
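
Because workflows are plain JSON, they can also be queued programmatically over ComfyUI's HTTP API instead of through the browser. The sketch below assumes the server set up later in this guide is running locally on the default port 8188 and that the workflow file was exported in the UI's API format; the helper names (`build_prompt_payload`, `queue_workflow`) are illustrative, not part of ComfyUI:

```python
# Sketch: submit a saved API-format workflow to a running ComfyUI server.
import json
import urllib.request
import uuid


def build_prompt_payload(workflow, client_id=None):
    """Wrap an API-format workflow graph in the body that POST /prompt expects."""
    return {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}


def queue_workflow(path, host="127.0.0.1", port=8188):
    """Load a workflow JSON file and submit it; returns the server's response."""
    with open(path) as f:
        workflow = json.load(f)
    body = json.dumps(build_prompt_payload(workflow)).encode("utf-8")
    req = urllib.request.Request(
        "http://{}:{}/prompt".format(host, port),
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The response to `POST /prompt` includes a `prompt_id` that can be looked up later through the server's `/history` endpoint.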
-# # What you'll accomplish +## What you'll accomplish - You'll install and configure ComfyUI on your NVIDIA DGX Spark device so you can use the unified memory to work with large models. +You'll install and configure ComfyUI on your NVIDIA DGX Spark device so you can use the unified memory to work with large models. -# # What to know before starting +## What to know before starting - - Experience working with Python virtual environments and package management - - Familiarity with command line operations and terminal usage - - Basic understanding of deep learning model deployment and checkpoints - - Knowledge of container workflows and GPU acceleration concepts - - Understanding of network configuration for accessing web services +- Experience working with Python virtual environments and package management +- Familiarity with command line operations and terminal usage +- Basic understanding of deep learning model deployment and checkpoints +- Knowledge of container workflows and GPU acceleration concepts +- Understanding of network configuration for accessing web services -# # Prerequisites +## Prerequisites - **Hardware Requirements:** - - NVIDIA Spark device with Blackwell architecture - - Minimum 8GB GPU memory for Stable Diffusion models - - At least 20GB available storage space +**Hardware Requirements:** +- NVIDIA Spark device with Blackwell architecture +- Minimum 8GB GPU memory for Stable Diffusion models +- At least 20GB available storage space - **Software Requirements:** - - Python 3.8 or higher installed: `python3 --version` - - pip package manager available: `pip3 --version` - - CUDA toolkit compatible with Blackwell: `nvcc --version` - - Git version control: `git --version` - - Network access to download models from Hugging Face - - Web browser access to `:8188` port +**Software Requirements:** +- Python 3.8 or higher installed: `python3 --version` +- pip package manager available: `pip3 --version` +- CUDA toolkit compatible with Blackwell: `nvcc 
--version`
+- Git version control: `git --version`
+- Network access to download models from Hugging Face
+- Web browser access to the device on port 8188

-# # Ancillary files
+## Ancillary files

- All required assets can be found [in the ComfyUI repository on GitHub](https://github.com/comfyanonymous/ComfyUI)
+All required assets can be found [in the ComfyUI repository on GitHub](https://github.com/comfyanonymous/ComfyUI)

- - `requirements.txt` - Python dependencies for ComfyUI installation
- - `main.py` - Primary ComfyUI server application entry point
- - `v1-5-pruned-emaonly-fp16.safetensors` - Stable Diffusion 1.5 checkpoint model
+- `requirements.txt` - Python dependencies for ComfyUI installation
+- `main.py` - Primary ComfyUI server application entry point
+- `v1-5-pruned-emaonly-fp16.safetensors` - Stable Diffusion 1.5 checkpoint model

-# # Time & risk
+## Time & risk

- * **Estimated time:** 30-45 minutes (including model download)
- * **Risk level:** Medium
- * Model downloads are large (~2GB) and may fail due to network issues
- * Port 8188 must be accessible for web interface functionality
- * **Rollback:** Virtual environment can be deleted to remove all installed packages. Downloaded models can be removed manually from the checkpoints directory.
+* **Estimated time:** 30-45 minutes (including model download)
+* **Risk level:** Medium
+  * Model downloads are large (~2GB) and may fail due to network issues
+  * Port 8188 must be accessible for web interface functionality
+* **Rollback:** Virtual environment can be deleted to remove all installed packages. Downloaded models can be removed manually from the checkpoints directory.

 ## Instructions

 ## Step 1. Verify system prerequisites

- Check that your NVIDIA Spark device meets the requirements before proceeding with installation.
+Check that your NVIDIA Spark device meets the requirements before proceeding with installation.
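
The individual version commands in this step can also be wrapped in a single report-only summary. This is an optional sketch, not part of the playbook; `version_ge` and `report` are illustrative helper names, and the Python 3.8 minimum comes from the prerequisites above:

```python
# Sketch: summarize Step 1's prerequisite checks in one pass.
import shutil
import sys


def version_ge(version, minimum):
    """True when dotted version string `version` is at least `minimum`."""
    def to_tuple(v):
        return tuple(int(part) for part in v.split("."))
    return to_tuple(version) >= to_tuple(minimum)


def report():
    """Print one ok/missing line per prerequisite; True when all are present."""
    results = []
    py = "%d.%d" % sys.version_info[:2]
    results.append(("python3 %s (need >= 3.8)" % py, version_ge(py, "3.8")))
    for tool in ("pip3", "nvcc", "git", "nvidia-smi"):
        results.append((tool, shutil.which(tool) is not None))
    for name, ok in results:
        print("%-9s %s" % ("ok:" if ok else "missing:", name))
    return all(ok for _, ok in results)


if __name__ == "__main__":
    report()
```

Comparing versions as integer tuples (rather than strings) keeps "3.10" correctly greater than "3.8".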
- ```bash - python3 --version - pip3 --version - nvcc --version - nvidia-smi - ``` +```bash +python3 --version +pip3 --version +nvcc --version +nvidia-smi +``` - Expected output should show Python 3.8+, pip available, CUDA toolkit and GPU detection. +Expected output should show Python 3.8+, pip available, CUDA toolkit and GPU detection. -# # Step 2. Create Python virtual environment +## Step 2. Create Python virtual environment - You will install ComfyUI on your host system, so you should create an isolated environment to avoid conflicts with system packages. +You will install ComfyUI on your host system, so you should create an isolated environment to avoid conflicts with system packages. - ```bash - python3 -m venv comfyui-env - source comfyui-env/bin/activate - ``` +```bash +python3 -m venv comfyui-env +source comfyui-env/bin/activate +``` - Verify the virtual environment is active by checking the command prompt shows `(comfyui-env)`. +Verify the virtual environment is active by checking the command prompt shows `(comfyui-env)`. -# # Step 3. Install PyTorch with CUDA support +## Step 3. Install PyTorch with CUDA support - Install PyTorch with CUDA 12.9 support. +Install PyTorch with CUDA 12.9 support. - ```bash - pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu129 - ``` +```bash +pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu129 +``` - This installation targets CUDA 12.9 compatibility with Blackwell architecture GPUs. +This installation targets CUDA 12.9 compatibility with Blackwell architecture GPUs. -# # Step 4. Clone ComfyUI repository +## Step 4. Clone ComfyUI repository - Download the ComfyUI source code from the official repository. +Download the ComfyUI source code from the official repository. - ```bash - git clone https://github.com/comfyanonymous/ComfyUI.git - cd ComfyUI/ - ``` +```bash +git clone https://github.com/comfyanonymous/ComfyUI.git +cd ComfyUI/ +``` -# # Step 5. 
Install ComfyUI dependencies +## Step 5. Install ComfyUI dependencies - Install the required Python packages for ComfyUI operation. +Install the required Python packages for ComfyUI operation. - ```bash - pip install -r requirements.txt - ``` +```bash +pip install -r requirements.txt +``` - This installs all necessary dependencies including web interface components and model handling libraries. +This installs all necessary dependencies including web interface components and model handling libraries. -# # Step 6. Download Stable Diffusion checkpoint +## Step 6. Download Stable Diffusion checkpoint - Navigate to the checkpoints directory and download the Stable Diffusion 1.5 model. +Navigate to the checkpoints directory and download the Stable Diffusion 1.5 model. - ```bash - cd models/checkpoints/ - wget https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/resolve/main/v1-5-pruned-emaonly-fp16.safetensors - cd ../../ - ``` +```bash +cd models/checkpoints/ +wget https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/resolve/main/v1-5-pruned-emaonly-fp16.safetensors +cd ../../ +``` - The download will be approximately 2GB and may take several minutes depending on network speed. +The download will be approximately 2GB and may take several minutes depending on network speed. -# # Step 7. Launch ComfyUI server +## Step 7. Launch ComfyUI server - Start the ComfyUI web server with network access enabled. +Start the ComfyUI web server with network access enabled. - ```bash - python main.py --listen 0.0.0.0 - ``` +```bash +python main.py --listen 0.0.0.0 +``` - The server will bind to all network interfaces on port 8188, making it accessible from other devices. +The server will bind to all network interfaces on port 8188, making it accessible from other devices. -# # Step 8. Validate installation +## Step 8. Validate installation - Check that ComfyUI is running correctly and accessible via web browser. 
+Check that ComfyUI is running correctly and accessible via a web browser.

- ```bash
- curl -I http://localhost:8188
- ```
+```bash
+curl -I http://localhost:8188
+```

- Expected output should show HTTP 200 response indicating the web server is operational.
+Expected output should show an HTTP 200 response, indicating the web server is operational.

- Open a web browser and navigate to `http://:8188` where `` is your device's IP address.
+Open a web browser and navigate to `http://<SPARK_IP>:8188`, where `<SPARK_IP>` is your device's IP address.

-# # Step 9. Optional - Cleanup and rollback
+## Step 9. Optional - Cleanup and rollback

- If you need to remove the installation completely, follow these steps:
+If you need to remove the installation completely, follow these steps:

- > [!WARNING]
- > This will delete all installed packages and downloaded models.
+> [!WARNING]
+> This will delete all installed packages and downloaded models.

- ```bash
- deactivate
- rm -rf comfyui-env/
- rm -rf ComfyUI/
- ```
+```bash
+deactivate
+rm -rf comfyui-env/
+rm -rf ComfyUI/
+```

- To rollback during installation, press `Ctrl+C` to stop the server and remove the virtual environment.
+To roll back during installation, press `Ctrl+C` to stop the server and remove the virtual environment.

-# # Step 10. Optional - Next steps
+## Step 10. Optional - Next steps

- Test the installation with a basic image generation workflow:
+Test the installation with a basic image generation workflow:

- 1. Access the web interface at `http://:8188`
- 2. Load the default workflow (should appear automatically)
- 3. Click "Run" to generate your first image
- 4. Monitor GPU usage with `nvidia-smi` in a separate terminal
+1. Access the web interface at `http://<SPARK_IP>:8188`
+2. Load the default workflow (should appear automatically)
+3. Click "Run" to generate your first image
+4. Monitor GPU usage with `nvidia-smi` in a separate terminal

- The image generation should complete within 30-60 seconds depending on your hardware configuration. 
+The image generation should complete within 30-60 seconds depending on your hardware configuration. ## Troubleshooting | Symptom | Cause | Fix | - |---------|-------|-----| - | PyTorch CUDA not available | Incorrect CUDA version or missing drivers | Verify `nvcc --version` matches cu129, reinstall PyTorch | - | Model download fails | Network connectivity or storage space | Check internet connection, verify 20GB+ available space | - | Web interface inaccessible | Firewall blocking port 8188 | Configure firewall to allow port 8188, check IP address | - | Out of GPU memory errors after manually flushing buffer cache | Insufficient VRAM for model | Use smaller models or enable CPU fallback mode | +|---------|-------|-----| +| PyTorch CUDA not available | Incorrect CUDA version or missing drivers | Verify `nvcc --version` matches cu129, reinstall PyTorch | +| Model download fails | Network connectivity or storage space | Check internet connection, verify 20GB+ available space | +| Web interface inaccessible | Firewall blocking port 8188 | Configure firewall to allow port 8188, check IP address | +| Out of GPU memory errors after manually flushing buffer cache | Insufficient VRAM for model | Use smaller models or enable CPU fallback mode | - > [!NOTE] - > DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. - > With many applications still updating to take advantage of UMA, you may encounter memory issues even when within - > the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with: - ```bash - sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches' - ``` +> [!NOTE] +> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. +> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within +> the memory capacity of DGX Spark. 
If that happens, manually flush the buffer cache with: +```bash +sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches' +``` diff --git a/nvidia/cuda-x-data-science/README.md b/nvidia/cuda-x-data-science/README.md index 2d9132f..62338be 100644 --- a/nvidia/cuda-x-data-science/README.md +++ b/nvidia/cuda-x-data-science/README.md @@ -2,6 +2,7 @@ > Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes + ## Table of Contents - [Overview](#overview)