From c5e890f83699d3b88b28630190e4a0d27ae20459 Mon Sep 17 00:00:00 2001 From: GitLab CI Date: Sun, 12 Oct 2025 20:53:42 +0000 Subject: [PATCH] chore: Regenerate all playbooks --- nvidia/comfy-ui/README.md | 3 ++- nvidia/connect-to-your-spark/README.md | 15 ++++++-------- nvidia/dgx-dashboard/README.md | 6 ++++-- nvidia/flux-finetuning/README.md | 3 ++- nvidia/jax/README.md | 3 ++- nvidia/llama-factory/README.md | 3 ++- nvidia/monai-reasoning/README.md | 3 ++- nvidia/multi-agent-chatbot/README.md | 6 ++++-- nvidia/multi-modal-inference/README.md | 6 ++++-- nvidia/nemo-fine-tune/README.md | 3 ++- nvidia/nim-llm/README.md | 3 ++- nvidia/nvfp4-quantization/README.md | 3 ++- nvidia/protein-folding/README.md | 4 ++-- nvidia/pytorch-fine-tune/README.md | 3 ++- nvidia/rag-ai-workbench/README.md | 3 ++- nvidia/sglang/README.md | 3 ++- nvidia/trt-llm/README.md | 28 +++++++++++++++++--------- nvidia/txt2kg/README.md | 3 ++- nvidia/vllm/README.md | 6 ++++-- nvidia/vlm-finetuning/README.md | 15 +++++++++----- nvidia/vscode/README.md | 3 ++- nvidia/vss/README.md | 6 ++++-- 22 files changed, 83 insertions(+), 48 deletions(-) diff --git a/nvidia/comfy-ui/README.md b/nvidia/comfy-ui/README.md index 96a46fb..e1ea282 100644 --- a/nvidia/comfy-ui/README.md +++ b/nvidia/comfy-ui/README.md @@ -158,7 +158,8 @@ Open a web browser and navigate to `http://:8188` where `` i If you need to remove the installation completely, follow these steps: -> **Warning:** This will delete all installed packages and downloaded models. +> [!WARNING] +> This will delete all installed packages and downloaded models. ```bash deactivate diff --git a/nvidia/connect-to-your-spark/README.md b/nvidia/connect-to-your-spark/README.md index f6025ec..1affd51 100644 --- a/nvidia/connect-to-your-spark/README.md +++ b/nvidia/connect-to-your-spark/README.md @@ -66,12 +66,9 @@ applications, and manage your DGX Spark remotely from your laptop. 
 ## Time & risk
 
-**Time estimate:** 5-10 minutes
-
-**Risk level:** Low - SSH setup involves credential configuration but no system-level changes
-to the DGX Spark device
-
-**Rollback:** SSH key removal can be done by editing `~/.ssh/authorized_keys` on the DGX Spark.
+- **Time estimate:** 5-10 minutes
+- **Risk level:** Low - SSH setup involves credential configuration but no system-level changes to the DGX Spark device
+- **Rollback:** SSH key removal can be done by editing `~/.ssh/authorized_keys` on the DGX Spark.
 
 ## Connect with NVIDIA Sync
 
@@ -146,9 +143,9 @@ Finally, connect your DGX Spark by filling out the form:
 
 - **Username**: Your DGX Spark user account name
 - **Password**: Your DGX Spark user account password
-**Note:** Your password is used only during this initial setup to configure SSH key-based
-authentication. It is not stored or transmitted after setup completion. NVIDIA Sync will SSH into your device and
-configure its locally provisioned SSH key pair.
+> [!NOTE]
+> Your password is used only during this initial setup to configure SSH key-based authentication. It is not stored or transmitted after setup completion. NVIDIA Sync will SSH into your device and
+> configure its locally provisioned SSH key pair.
 
 Click "Add" and NVIDIA Sync will automatically:
 
diff --git a/nvidia/dgx-dashboard/README.md b/nvidia/dgx-dashboard/README.md
index f2c3b6a..4453e90 100644
--- a/nvidia/dgx-dashboard/README.md
+++ b/nvidia/dgx-dashboard/README.md
@@ -198,7 +198,8 @@ From the Settings page, under the "Updates" tab:
 2. Click "Update Now" to initiate the update process
 3. Wait for the update to complete and your device to reboot
 
-> **Warning**: System updates will upgrade packages, firmware if available, and trigger a reboot. Save your work before proceeding.
+> [!WARNING]
+> System updates will upgrade packages, firmware if available, and trigger a reboot. Save your work before proceeding.
 
 ## Step 7. Cleanup and rollback
 
@@ -207,7 +208,8 @@ To clean up resources and return system to original state:
 1. Stop any running JupyterLab instances via dashboard
 2. Delete the JupyterLab working directory
 
-> **Warning**: If you ran system updates, the only rollback is to restore from a system backup or recovery media.
+> [!WARNING]
+> If you ran system updates, the only rollback is to restore from a system backup or recovery media.
 
 No permanent changes are made to the system during normal dashboard usage.
 
diff --git a/nvidia/flux-finetuning/README.md b/nvidia/flux-finetuning/README.md
index f267b1e..0954ddd 100644
--- a/nvidia/flux-finetuning/README.md
+++ b/nvidia/flux-finetuning/README.md
@@ -111,7 +111,8 @@ After playing around with the base model, you have 2 possible next steps.
 * If you already have fine-tuned LoRAs placed inside `models/loras/`, please skip to `Step 7. Fine-tuned model inference` section.
 * If you wish to train a LoRA for your custom concepts, first make sure that the ComfyUI inference container is brought down before proceeding to train. You can bring it down by interrupting the terminal with `Ctrl+C` keystroke.
 
-> **Note**: To clear out any extra occupied memory from your system, execute the following command outside the container after interrupting the ComfyUI server.
+> [!NOTE]
+> To clear out any extra occupied memory from your system, execute the following command outside the container after interrupting the ComfyUI server.
 ```bash
 sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
 ```
diff --git a/nvidia/jax/README.md b/nvidia/jax/README.md
index 6a08619..8644000 100644
--- a/nvidia/jax/README.md
+++ b/nvidia/jax/README.md
@@ -99,7 +99,8 @@ git clone https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-
 
 ## Step 3. Build the Docker image
 
-> **Warning:** This command will download a base image and build a container locally to support this environment.
+> [!WARNING] +> This command will download a base image and build a container locally to support this environment. ```bash cd jax/assets diff --git a/nvidia/llama-factory/README.md b/nvidia/llama-factory/README.md index 38e1ff0..c77724f 100644 --- a/nvidia/llama-factory/README.md +++ b/nvidia/llama-factory/README.md @@ -183,7 +183,8 @@ llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml ## Step 11. Cleanup and rollback -> **Warning:** This will delete all training progress and checkpoints. +> [!WARNING] +> This will delete all training progress and checkpoints. To remove all generated files and free up storage space: diff --git a/nvidia/monai-reasoning/README.md b/nvidia/monai-reasoning/README.md index 026abb9..22be690 100644 --- a/nvidia/monai-reasoning/README.md +++ b/nvidia/monai-reasoning/README.md @@ -266,7 +266,8 @@ You can now upload a chest X-ray image and ask questions directly in the chat in To stop and remove the containers and network, run the following commands. This will not delete your downloaded model weights. -> **Warning:** This will stop all running containers and remove the network. +> [!WARNING] +> This will stop all running containers and remove the network. ```bash ## Stop containers diff --git a/nvidia/multi-agent-chatbot/README.md b/nvidia/multi-agent-chatbot/README.md index 9b2f701..80c072f 100644 --- a/nvidia/multi-agent-chatbot/README.md +++ b/nvidia/multi-agent-chatbot/README.md @@ -37,7 +37,8 @@ The setup includes: - No other processes running on the DGX Spark GPU - Enough disk space for model downloads -> **Note**: This demo uses ~120 out of the 128GB of DGX Spark's memory by default. +> [!NOTE] +> This demo uses ~120 out of the 128GB of DGX Spark's memory by default. > Please ensure that no other workloads are running on your Spark using `nvidia-smi`, or switch to a smaller supervisor model like gpt-oss-20B. 
@@ -104,7 +105,8 @@ watch 'docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"' Open your browser and go to: http://localhost:3000 -> **Note**: If you are running this on a remote GPU via an SSH connection, in a new terminal window, you need to run the following command to be able to access the UI at localhost:3000 and for the UI to be able to communicate to the backend at localhost:8000. +> [!NOTE] +> If you are running this on a remote GPU via an SSH connection, in a new terminal window, you need to run the following command to be able to access the UI at localhost:3000 and for the UI to be able to communicate to the backend at localhost:8000. >```ssh -L 3000:localhost:3000 -L 8000:localhost:8000 username@IP-address``` diff --git a/nvidia/multi-modal-inference/README.md b/nvidia/multi-modal-inference/README.md index 2ea49de..0e0e0d3 100644 --- a/nvidia/multi-modal-inference/README.md +++ b/nvidia/multi-modal-inference/README.md @@ -128,7 +128,8 @@ python3 demo_txt2img_flux.py "a beautiful photograph of Mt. Fuji during cherry b Test the faster Flux.1 Schnell variant with different precision formats. -> **Warning**: FP16 Flux.1 Schnell requires >48GB VRAM for native export +> [!WARNING] +> FP16 Flux.1 Schnell requires >48GB VRAM for native export **Substep A. FP16 precision (high VRAM requirement)** @@ -190,7 +191,8 @@ python3 -c "import tensorrt as trt; print(f'TensorRT version: {trt.__version__}' Remove downloaded models and exit container environment to free disk space. -> **Warning**: This will delete all cached models and generated images +> [!WARNING] +> This will delete all cached models and generated images ```bash ## Exit container diff --git a/nvidia/nemo-fine-tune/README.md b/nvidia/nemo-fine-tune/README.md index 3eb390e..d1abc48 100644 --- a/nvidia/nemo-fine-tune/README.md +++ b/nvidia/nemo-fine-tune/README.md @@ -280,7 +280,8 @@ print('✅ Setup complete') Remove the installation and restore the original environment if needed. 
These commands safely remove all installed components. -> **Warning:** This will delete all virtual environments and downloaded models. Ensure you have backed up any important training checkpoints. +> [!WARNING] +> This will delete all virtual environments and downloaded models. Ensure you have backed up any important training checkpoints. ```bash ## Remove virtual environment diff --git a/nvidia/nim-llm/README.md b/nvidia/nim-llm/README.md index 45ffa02..a6048de 100644 --- a/nvidia/nim-llm/README.md +++ b/nvidia/nim-llm/README.md @@ -152,7 +152,8 @@ Expected output should be a JSON response containing a completion field with gen Remove the running container and optionally clean up cached model files. -> **Warning:** Removing cached models will require re-downloading on next run. +> [!WARNING] +> Removing cached models will require re-downloading on next run. ```bash docker stop $CONTAINER_NAME diff --git a/nvidia/nvfp4-quantization/README.md b/nvidia/nvfp4-quantization/README.md index 957e856..e7bdef7 100644 --- a/nvidia/nvfp4-quantization/README.md +++ b/nvidia/nvfp4-quantization/README.md @@ -226,7 +226,8 @@ curl -X POST http://localhost:8000/v1/chat/completions \ To clean up the environment and remove generated files: -> **Warning:** This will permanently delete all quantized model files and cached data. +> [!WARNING] +> This will permanently delete all quantized model files and cached data. ```bash ## Remove output directory and all quantized models diff --git a/nvidia/protein-folding/README.md b/nvidia/protein-folding/README.md index b30b459..e104978 100644 --- a/nvidia/protein-folding/README.md +++ b/nvidia/protein-folding/README.md @@ -370,8 +370,8 @@ deactivate rm -rf openfold_env/ ``` -> **Warning:** The following will delete downloaded databases (>3TB). Only run if you need to -> free disk space and are willing to re-download. +> [!WARNING] +> The following will delete downloaded databases (>3TB). 
Only run if you need to free disk space and are willing to re-download. ```bash ## Remove all databases (requires re-download) diff --git a/nvidia/pytorch-fine-tune/README.md b/nvidia/pytorch-fine-tune/README.md index b08700f..f1ee844 100644 --- a/nvidia/pytorch-fine-tune/README.md +++ b/nvidia/pytorch-fine-tune/README.md @@ -119,7 +119,8 @@ python Llama3_3B_full_finetuning.py |---------|--------|-----| | Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your [HuggingFace token](https://huggingface.co/docs/hub/en/security-tokens); and request access to the [gated model](https://huggingface.co/docs/hub/en/models-gated#customize-requested-information) on your web browser | -> **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. +> [!NOTE] +> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. > With many applications still updating to take advantage of UMA, you may encounter memory issues even when within > the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with: ```bash diff --git a/nvidia/rag-ai-workbench/README.md b/nvidia/rag-ai-workbench/README.md index 8a38da3..cac9f65 100644 --- a/nvidia/rag-ai-workbench/README.md +++ b/nvidia/rag-ai-workbench/README.md @@ -151,7 +151,8 @@ Upload a custom dataset, adjust the Router prompt, and submit custom queries to This step explains how to remove the project if needed and what changes were made to your system. -> **Warning:** This will permanently delete the project and all associated data. +> [!WARNING] +> This will permanently delete the project and all associated data. 
To remove the project completely: diff --git a/nvidia/sglang/README.md b/nvidia/sglang/README.md index 895e927..3135a78 100644 --- a/nvidia/sglang/README.md +++ b/nvidia/sglang/README.md @@ -203,7 +203,8 @@ Common issues and their resolutions: Stop and remove containers to clean up resources. This step returns your system to its original state. -> **Warning:** This will stop all SGLang containers and remove temporary data. +> [!WARNING] +> This will stop all SGLang containers and remove temporary data. ```bash ## Stop all SGLang containers diff --git a/nvidia/trt-llm/README.md b/nvidia/trt-llm/README.md index d6a4cb5..14d8263 100644 --- a/nvidia/trt-llm/README.md +++ b/nvidia/trt-llm/README.md @@ -108,7 +108,8 @@ The following models are supported with TensorRT-LLM on Spark. All listed models | **Llama-4-Scout-17B-16E-Instruct** | NVFP4 | ✅ | `nvidia/Llama-4-Scout-17B-16E-Instruct-FP4` | | **Qwen3-235B-A22B (two Sparks only)** | NVFP4 | ✅ | `nvidia/Qwen3-235B-A22B-FP4` | -**Note:** You can use the NVFP4 Quantization documentation to generate your own NVFP4-quantized checkpoints for your favorite models. This enables you to take advantage of the performance and memory benefits of NVFP4 quantization even for models not already published by NVIDIA. +> [!NOTE] +> You can use the NVFP4 Quantization documentation to generate your own NVFP4-quantized checkpoints for your favorite models. This enables you to take advantage of the performance and memory benefits of NVFP4 quantization even for models not already published by NVIDIA. Reminder: not all model architectures are supported for NVFP4 quantization. @@ -396,7 +397,8 @@ curl -s http://localhost:8355/v1/chat/completions \ Remove downloaded models and containers to free up space when testing is complete. -> **Warning:** This will delete all cached models and may require re-downloading for future runs. +> [!WARNING] +> This will delete all cached models and may require re-downloading for future runs. 
```bash ## Remove Hugging Face cache @@ -519,7 +521,8 @@ On your primary node, deploy the TRT-LLM multi-node stack by downloading the [** ```bash docker stack deploy -c $HOME/docker-compose.yml trtllm-multinode ``` -**Note:** Ensure you download both files into the same directory from which you are running the command. +> [!NOTE] +> Ensure you download both files into the same directory from which you are running the command. You can verify the status of your worker nodes using the following ```bash @@ -534,7 +537,8 @@ oe9k5o6w41le trtllm-multinode_trtllm.1 nvcr.io/nvidia/tensorrt-llm/relea phszqzk97p83 trtllm-multinode_trtllm.2 nvcr.io/nvidia/tensorrt-llm/release:1.0.0rc3 spark-1b3b Running Running 2 minutes ago ``` -**Note:** If your "Current state" is not "Running", see troubleshooting section for more information. +> [!NOTE] +> If your "Current state" is not "Running", see troubleshooting section for more information. ### Step 7. Create hosts file @@ -603,7 +607,8 @@ docker exec \ This will start the TensorRT-LLM server on port 8355. You can then make inference requests to `http://localhost:8355` using the OpenAI-compatible API format. -**Note:** You might see a warning such as `UCX WARN network device 'enp1s0f0np0' is not available, please use one or more of`. You can ignore this warning if your inference is successful, as it's related to only one of your two CX-7 ports being used, and the other being left unused. +> [!NOTE] +> You might see a warning such as `UCX WARN network device 'enp1s0f0np0' is not available, please use one or more of`. You can ignore this warning if your inference is successful, as it's related to only one of your two CX-7 ports being used, and the other being left unused. **Expected output:** Server startup logs and ready message. @@ -630,7 +635,8 @@ Stop and remove containers by using the following command on the leader node: docker stack rm trtllm-multinode ``` -> **Warning:** This removes all inference data and performance reports. 
Copy `/opt/*perf-report.json` files before cleanup if needed. +> [!WARNING] +> This removes all inference data and performance reports. Copy `/opt/*perf-report.json` files before cleanup if needed. Remove downloaded models to free disk space: @@ -659,7 +665,8 @@ After setting up TensorRT-LLM inference server in either single-node or multi-no Run the following command on the DGX Spark node where you have the TensorRT-LLM inference server running. For multi-node setup, this would be the primary node. -**Note:** If you used a different port for your OpenAI-compatible API server, adjust the `OPENAI_API_BASE_URL="http://localhost:8355/v1"` to match the IP and port of your TensorRT-LLM inference server. +> [!NOTE] +> If you used a different port for your OpenAI-compatible API server, adjust the `OPENAI_API_BASE_URL="http://localhost:8355/v1"` to match the IP and port of your TensorRT-LLM inference server. ```bash docker run \ @@ -696,10 +703,13 @@ You should see the Open WebUI interface at http://localhost:8080 where you can: You can select your model(s) from the dropdown menu on the top left corner. That's all you need to do to start using Open WebUI with your deployed models. -**Note:** If accessing from a remote machine, replace localhost with your DGX Spark's IP address. +> [!NOTE] +> If accessing from a remote machine, replace localhost with your DGX Spark's IP address. ### Step 3. Cleanup and rollback -**Warning:** This removes all chat data and may require re-uploading for future runs. +> [!WARNING] +> This removes all chat data and may require re-uploading for future runs. 
+ Remove the container by using the following command: ```bash docker stop open-webui diff --git a/nvidia/txt2kg/README.md b/nvidia/txt2kg/README.md index 79ae75f..ee71138 100644 --- a/nvidia/txt2kg/README.md +++ b/nvidia/txt2kg/README.md @@ -89,7 +89,8 @@ docker exec ollama-compose ollama pull Browse available models at [https://ollama.com/search](https://ollama.com/search) -> **Note**: The unified memory architecture enables running larger models like 70B parameters, which produce significantly more accurate knowledge triples. +> [!NOTE] +> The unified memory architecture enables running larger models like 70B parameters, which produce significantly more accurate knowledge triples. ## Step 4. Access the web interface diff --git a/nvidia/vllm/README.md b/nvidia/vllm/README.md index e11812b..f9fd967 100644 --- a/nvidia/vllm/README.md +++ b/nvidia/vllm/README.md @@ -244,7 +244,8 @@ Expected output includes a generated haiku response. ## Step 10. (Optional) Deploy Llama 3.1 405B model -> **Warning:** 405B model has insufficient memory headroom for production use. +> [!WARNING] +> 405B model has insufficient memory headroom for production use. Download the quantized 405B model for testing purposes only. @@ -300,7 +301,8 @@ docker exec node nvidia-smi --query-gpu=memory.used,memory.total --format=csv Remove temporary configurations and containers when testing is complete. -> **Warning:** This will stop all inference services and remove cluster configuration. +> [!WARNING] +> This will stop all inference services and remove cluster configuration. 
 ```bash
 ## Stop containers on both nodes
diff --git a/nvidia/vlm-finetuning/README.md b/nvidia/vlm-finetuning/README.md
index 314a01c..8070176 100644
--- a/nvidia/vlm-finetuning/README.md
+++ b/nvidia/vlm-finetuning/README.md
@@ -104,7 +104,8 @@ sh launch.sh
 ## Enter the mounted directory within the container
 cd /vlm_finetuning
 ```
-**Note**: The same Docker container and launch commands work for both image and video VLM recipes. The container features all necessary dependencies, including FFmpeg, Decord, and optimized libraries for both workflows.
+> [!NOTE]
+> The same Docker container and launch commands work for both image and video VLM recipes. The container features all necessary dependencies, including FFmpeg, Decord, and optimized libraries for both workflows.
 
 ## Step 5. [Option A] For image VLM fine-tuning (Wildfire Detection)
 
@@ -129,7 +130,8 @@ cd ui_image/data
 
 For this fine-tuning playbook, we will use the [Wildfire Prediction Dataset](https://www.kaggle.com/datasets/abdelghaniaaba/wildfire-prediction-dataset) from Kaggle. Visit the Kaggle dataset page [here](https://www.kaggle.com/datasets/abdelghaniaaba/wildfire-prediction-dataset) and click the download button. Select the `cURL` option in the `Download Via` dropdown and copy the curl command.
 
-> **Note**: You will need to be logged into Kaggle and may need to accept the dataset terms before the download link works.
+> [!NOTE]
+> You will need to be logged into Kaggle and may need to accept the dataset terms before the download link works.
 
 Run the following commands in your container:
 
@@ -235,7 +237,8 @@ dataset/
 
 #### 6.2. Model download
 
-> **Note**: These instructions assume you are already inside the Docker container. For container setup, refer to the section above to `Build the Docker container`.
+> [!NOTE]
+> These instructions assume you are already inside the Docker container. For container setup, refer to the `Build the Docker container` section above.
 
 ```bash
 hf download OpenGVLab/InternVL3-8B
@@ -262,7 +265,8 @@ Scroll down, enter your prompt in the chat box and hit `Generate`. Your prompt w
 
 If you are proceeding to train a fine-tuned model, ensure that the streamlit demo UI is brought down before proceeding to train. You can bring it down by interrupting the terminal with `Ctrl+C` keystroke.
 
-> **Note**: To clear out any extra occupied memory from your system, execute the following command outside the container after interrupting the ComfyUI server.
+> [!NOTE]
+> To clear out any extra occupied memory from your system, execute the following command outside the container after interrupting the Streamlit demo UI.
 ```bash
 sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
 ```
@@ -294,7 +298,8 @@ You can monitor and evaluate the training progress and metrics, as they will be
 
 After training, ensure that you shutdown the jupyter kernel in the notebook and kill the jupyter server in the terminal with a `Ctrl+C` keystroke.
 
-> **Note**: To clear out any extra occupied memory from your system, execute the following command outside the container after interrupting the ComfyUI server.
+> [!NOTE]
+> To clear out any extra occupied memory from your system, execute the following command outside the container after shutting down the Jupyter server.
 ```bash
 sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
 ```
diff --git a/nvidia/vscode/README.md b/nvidia/vscode/README.md
index da5895f..e5ac189 100644
--- a/nvidia/vscode/README.md
+++ b/nvidia/vscode/README.md
@@ -152,7 +152,8 @@ Within VS Code:
 
 ## Step 8. Uninstalling VS Code
 
-> **Warning:** Uninstalling VS Code will remove all user settings and extensions.
+> [!WARNING]
+> Uninstalling VS Code will remove all user settings and extensions.
To remove VS Code if needed: ```bash diff --git a/nvidia/vss/README.md b/nvidia/vss/README.md index b7c3dcd..2922582 100644 --- a/nvidia/vss/README.md +++ b/nvidia/vss/README.md @@ -128,7 +128,8 @@ Create a Docker network that will be shared between VSS services and CV pipeline docker network create vss-shared-network ``` -> **Warning:** If the network already exists, you may see an error. Remove it first with `docker network rm vss-shared-network` if needed. +> [!WARNING] +> If the network already exists, you may see an error. Remove it first with `docker network rm vss-shared-network` if needed. ## Step 6. Authenticate with NVIDIA Container Registry @@ -369,7 +370,8 @@ Follow the steps [here](https://docs.nvidia.com/vss/latest/content/ui_app.html) To completely remove the VSS deployment and free up system resources: -> **Warning:** This will destroy all processed video data and analysis results. +> [!WARNING] +> This will destroy all processed video data and analysis results. ```bash ## For Event Reviewer deployment