Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git, synced 2026-04-22 01:53:53 +00:00

chore: Regenerate all playbooks

This commit is contained in: parent e39a692dfd, commit 8f5d38151e
@@ -188,7 +188,8 @@ The image generation should complete within 30-60 seconds depending on your hard
| Web interface inaccessible | Firewall blocking port 8188 | Configure firewall to allow port 8188, check IP address |
| Out of GPU memory errors after manually flushing buffer cache | Insufficient VRAM for model | Use smaller models or enable CPU fallback mode |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
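The firewall row above suggests confirming port 8188 is reachable; a small diagnostic in that spirit (our own sketch, not part of the playbook — it assumes a Linux host with `ss` available):

```shell
#!/bin/sh
# Check whether anything on this host is listening on TCP port 8188
# (the ComfyUI web-interface port from the troubleshooting table).
PORT=8188
if ss -ltn 2>/dev/null | awk '{print $4}' | grep -q ":${PORT}\$"; then
  echo "a process is listening on port ${PORT}"
else
  echo "nothing is listening on port ${PORT}; the server may not be running"
fi
```

If nothing is listening, the problem is the server itself rather than the firewall.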
@@ -173,7 +173,8 @@ Unlike the base model, we can see that the fine-tuned model can generate multipl
|---------|--------|-----|
| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your [HuggingFace token](https://huggingface.co/docs/hub/en/security-tokens); and request access to the [gated model](https://huggingface.co/docs/hub/en/models-gated#customize-requested-information) on your web browser |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -187,9 +187,10 @@ Blackwell GPU architecture.
| Port 8080 unavailable | Port already in use | Use `-p 8081:8080` or kill process on 8080 |
| Package conflicts in Docker build | Outdated environment file | Update environment file for Blackwell |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
-With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
-the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
+> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
```
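Before reaching for `drop_caches`, it can help to see how much reclaimable page cache the kernel is actually holding. A quick read of `/proc/meminfo` (our own diagnostic, assuming a Linux host; not part of the playbooks being diffed):

```shell
#!/bin/sh
# Report reclaimable page-cache and buffer memory from /proc/meminfo.
# Large values mean `echo 3 > /proc/sys/vm/drop_caches` has something to
# reclaim; small values mean the pressure comes from real allocations,
# and a smaller model is the better fix.
cached_kb=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
buffers_kb=$(awk '/^Buffers:/ {print $2}' /proc/meminfo)
echo "page cache: $((cached_kb / 1024)) MiB, buffers: $((buffers_kb / 1024)) MiB"
```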
@@ -85,7 +85,8 @@ git --version
## Step 2. Launch PyTorch container with GPU support

Start the NVIDIA PyTorch container with GPU access and mount your workspace directory.
-> **Note:** This NVIDIA PyTorch container supports CUDA 13
+> [!NOTE]
+> This NVIDIA PyTorch container supports CUDA 13

```bash
docker run --gpus all --ipc=host --ulimit memlock=-1 -it --ulimit stack=67108864 --rm -v "$PWD":/workspace nvcr.io/nvidia/pytorch:25.09-py3 bash

@@ -128,7 +129,8 @@ cat examples/train_lora/llama3_lora_sft.yaml

## Step 7. Launch fine-tuning training

-> **Note:** Login to your hugging face hub to download the model if the model is gated.
+> [!NOTE]
+> Log in to the Hugging Face Hub to download the model if the model is gated.

Execute the training process using the pre-configured LoRA setup.

@@ -206,7 +208,8 @@ docker container prune -f
| Model download fails or is slow | Network connectivity or Hugging Face Hub issues | Check internet connection, try using `HF_HUB_OFFLINE=1` for cached models |
| Training loss not decreasing | Learning rate too high/low or insufficient data | Adjust `learning_rate` parameter or check dataset quality |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -115,7 +115,8 @@ ls -la ./models/monai-reasoning-cxr-3b
## You should see model files including config.json and model weights
```

-> **Important Note:** Currently, a custom internal VLLM container is required until the sm121 support is available in the public image. The instructions below use the internal container `******:5005/dl/dgx/vllm:main-py3.31165712-devel`.
+> [!IMPORTANT]
+> Currently, a custom internal VLLM container is required until the sm121 support is available in the public image. The instructions below use the internal container `******:5005/dl/dgx/vllm:main-py3.31165712-devel`.

## Step 3. Verify System Architecture

@@ -294,7 +295,8 @@ for medical image analysis and reasoning tasks.
| Open WebUI shows connection error | Wrong backend URL | Verify `OPENAI_API_BASE_URL` is set correctly |
| Model doesn't show full reasoning | Reasoning tags enabled | Disable "Reasoning Tags" in Chat Controls → Advanced Params |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -140,7 +140,8 @@ docker volume rm "$(basename "$PWD")_postgres_data"
|---------|--------|-----|
| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your [HuggingFace token](https://huggingface.co/docs/hub/en/security-tokens); and request access to the [gated model](https://huggingface.co/docs/hub/en/models-gated#customize-requested-information) on your web browser |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -215,7 +215,8 @@ environment.
| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your [HuggingFace token](https://huggingface.co/docs/hub/en/security-tokens); and request access to the [gated model](https://huggingface.co/docs/hub/en/models-gated#customize-requested-information) on your web browser |
| Model download timeouts | Network issues or rate limiting | Retry command or pre-download models |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -192,7 +192,8 @@ First, export your HF_TOKEN so that gated models can be downloaded.
## Run basic LLM fine-tuning example
export HF_TOKEN=<your_huggingface_token>
```

-> **Note:** Please Replace `<your_huggingface_token>` with your Hugging Face access token to access gated models (e.g., Llama).
+> [!NOTE]
+> Please replace `<your_huggingface_token>` with your Hugging Face access token to access gated models (e.g., Llama).

**Full Fine-tuning example:**

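A missing or placeholder token only fails later, mid-download, so a small guard before launching can save a wasted run. This is our own sketch (the warning text is ours; only the `HF_TOKEN` variable name and placeholder come from the playbook):

```shell
#!/bin/sh
# Warn early if HF_TOKEN is unset or still the literal placeholder,
# instead of failing partway through a gated-model download.
if [ -z "${HF_TOKEN:-}" ] || [ "${HF_TOKEN:-}" = "<your_huggingface_token>" ]; then
  echo "WARNING: HF_TOKEN is not set to a real token; gated models (e.g., Llama) will fail to download" >&2
else
  echo "HF_TOKEN looks set"
fi
```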
@@ -321,7 +322,8 @@ Explore the [NeMo AutoModel GitHub repository](https://github.com/NVIDIA-NeMo/Au
| ARM64 package compatibility issues | Package not available for ARM architecture | Use source installation or build from source with ARM64 flags |
| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your [HuggingFace token](https://huggingface.co/docs/hub/en/security-tokens); and request access to the [gated model](https://huggingface.co/docs/hub/en/models-gated#customize-requested-information) on your web browser |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -185,7 +185,8 @@ Test the integration with your preferred HTTP client or SDK to begin building ap
| API returns 404 or connection refused | Container not fully started or wrong port | Wait for container startup completion, verify port 8000 is accessible |
| runtime not found | NVIDIA Container Toolkit not properly configured | Run `sudo nvidia-ctk runtime configure --runtime=docker` and restart Docker |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -258,7 +258,8 @@ The quantized model is now ready for deployment. Common next steps include:
| Quantization process hangs | Container resource limits | Increase Docker memory limits or use `--ulimit` flags |
| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your [HuggingFace token](https://huggingface.co/docs/hub/en/security-tokens); and request access to the [gated model](https://huggingface.co/docs/hub/en/models-gated#customize-requested-information) on your web browser |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -190,7 +190,8 @@ To remove the custom app:
1. Open NVIDIA Sync Settings → Custom tab
2. Select "Ollama Server" and click "Remove"

-**Warning**: To completely uninstall Ollama from your Spark device:
+> [!WARNING]
+> To completely uninstall Ollama from your Spark device:

```bash
sudo systemctl stop ollama
@@ -165,7 +165,8 @@ docker stop <container_id>
| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your [HuggingFace token](https://huggingface.co/docs/hub/en/security-tokens); and request access to the [gated model](https://huggingface.co/docs/hub/en/models-gated#customize-requested-information) on your web browser |
| Server doesn't respond | Port conflicts or firewall | Check if port 8000 is available and not blocked |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -229,7 +229,8 @@ ssh <IP for Node 2> hostname

## Step 6. Cleanup and Rollback

-> **Warning**: These steps will reset network configuration.
+> [!WARNING]
+> These steps will reset network configuration.

```bash
## Rollback network configuration (if using Option 1)

@@ -314,8 +314,8 @@ Expected output:
Remove Tailscale completely if needed. This will disconnect devices from the
tailnet and remove all network configurations.

-> **Warning**: This will permanently remove the device from your Tailscale
-> network and require re-authentication to rejoin.
+> [!WARNING]
+> This will permanently remove the device from your Tailscale network and require re-authentication to rejoin.

```bash
## Stop Tailscale service

@@ -310,7 +310,8 @@ docker run \
```


-> **Note:** If you hit a host OOM during downloads or first run, free the OS page cache on the host (outside the container) and retry:
+> [!NOTE]
+> If you hit a host OOM during downloads or first run, free the OS page cache on the host (outside the container) and retry:
```bash
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
```
@@ -734,7 +735,8 @@ docker rmi ghcr.io/open-webui/open-webui:main
| "task: non-zero exit (255)" | Container exit with error code 255 | Check container logs with `docker ps -a --filter "name=trtllm-multinode_trtllm"` to get container ID, then `docker logs <container_id>` to see detailed error messages |
| Docker state stuck in "Pending" with "no suitable node (insufficien...)" | Docker daemon not properly configured for GPU access | Verify steps 2-4 were completed successfully and check that `/etc/docker/daemon.json` contains correct GPU configuration |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -154,7 +154,8 @@ docker exec ollama-compose ollama rm llama3.1:8b
| Slow triple extraction | Large model or large context window | Reduce document chunk size or use faster models |
| ArangoDB connection refused | Service not fully started | Wait 30s after start.sh, verify with `docker ps` |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -143,7 +143,8 @@ for advanced usage instructions, including:

## Troubleshooting

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -348,7 +348,8 @@ http://192.168.100.10:8265
| CUDA out of memory with 405B | Insufficient GPU memory | Use 70B model or reduce max_model_len parameter |
| Container startup fails | Missing ARM64 image | Rebuild vLLM image following ARM64 instructions |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -319,7 +319,8 @@ Feel free to play around with additional videos available in the gallery.

## Troubleshooting

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash
@@ -134,7 +134,8 @@ docker network create vss-shared-network

Log in to NVIDIA's container registry using your [NGC API Key](https://org.ngc.nvidia.com/setup/api-keys).

-> **Note:** If you don’t have an NVIDIA account already, you’ll have to create one and register for the [developer program](https://developer.nvidia.com/nvidia-developer-program).
+> [!NOTE]
+> If you don’t have an NVIDIA account already, you’ll have to create one and register for the [developer program](https://developer.nvidia.com/nvidia-developer-program).

```bash
## Log in to NVIDIA Container Registry
@@ -193,7 +194,8 @@ Launch the complete VSS Event Reviewer stack including Alert Bridge, VLM Pipelin
IS_SBSA=1 IS_AARCH64=1 ALERT_REVIEW_MEDIA_BASE_DIR=/tmp/alert-media-dir docker compose up
```

-> **Note:** This step will take several minutes as containers are pulled and services initialize. The VSS backend requires additional startup time. Proceed to the next step in a new terminal in the meantime.
+> [!NOTE]
+> This step will take several minutes as containers are pulled and services initialize. The VSS backend requires additional startup time. Proceed to the next step in a new terminal in the meantime.

**8.5 Navigate to CV Event Detector directory**

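While containers initialize, a small polling helper saves guessing when a service is ready. This is our own sketch, not part of the playbooks being diffed (the helper name and example URL are ours; it assumes `curl` is installed):

```shell
#!/bin/sh
# Poll a URL until it answers or the retry budget runs out.
wait_for_url() {
  url=$1
  tries=${2:-60}   # default: roughly a minute of polling
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for $url" >&2
  return 1
}
```

For example, `wait_for_url http://localhost:7860 120` before opening a UI in the browser.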
@@ -266,7 +268,8 @@ Open these URLs in your browser:
- `http://localhost:7862` - CV UI to launch and monitor CV pipeline
- `http://localhost:7860` - Alert Inspector UI to view clips and review VLM results

-> **Note:** You may now proceed to step 10.
+> [!NOTE]
+> You may now proceed to step 10.

## Step 9. Option B

@@ -322,7 +325,8 @@ cat config.yaml | grep -A 10 "model"
docker compose up
```

-> **Note:** This step will take several minutes as containers are pulled and services initialize. The VSS backend requires additional startup time.
+> [!NOTE]
+> This step will take several minutes as containers are pulled and services initialize. The VSS backend requires additional startup time.

**9.7 Validate Standard VSS deployment**

@@ -411,7 +415,8 @@ With VSS deployed, you can now:
| Services fail to communicate | Incorrect environment variables | Verify `IS_SBSA=1 IS_AARCH64=1` are set correctly |
| Web interfaces not accessible | Services still starting or port conflicts | Wait 2-3 minutes, check `docker ps` for container status |

-> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
+> [!NOTE]
+> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
```bash