chore: Regenerate all playbooks

2026-04-23 10:33:51 +00:00 · 2025-11-25 03:08:49 +00:00 · 2025-11-25 03:08:49 +00:00 · 80ab44c98d
commit 80ab44c98d
parent 699df25ee3
3 changed files with 83 additions and 6 deletions
--- a/nvidia/connect-two-sparks/README.md
+++ b/nvidia/connect-two-sparks/README.md
@@ -101,7 +101,12 @@ rocep1s0f1 port 1 ==> enp1s0f1np1 (Up)

 Choose one option to setup the network interfaces. Option 1 and 2 are mutually exclusive.

-**Option 1: Automatic IP Assignment (Recommended)**
+> [!NOTE]
+> Full bandwidth can be achieved with just one QSFP cable.
+> When two QSFP cables are connected, all four interfaces must be assigned IP addresses to obtain full bandwidth.
+> Option 1 below can only be used when one QSFP cable is connected.
+
+**Option 1: Automatic IP Assignment (only when one QSFP cable is connected)**

 Configure network interfaces using netplan on both DGX Spark nodes for automatic
 link-local addressing:
@@ -125,11 +130,78 @@ sudo chmod 600 /etc/netplan/40-cx7.yaml
 sudo netplan apply
 ```

+**Option 2: Manual IP Assignment with a netplan configuration file**
+
+On node 1:
+```bash
+## Create the netplan configuration file
+sudo tee /etc/netplan/40-cx7.yaml > /dev/null <<EOF
+network:
+  version: 2
+  ethernets:
+    enp1s0f0np0:
+      addresses:
+        - 192.168.100.10/24
+      dhcp4: no
+    enp1s0f1np1:
+      addresses:
+        - 192.168.200.12/24
+      dhcp4: no
+    enP2p1s0f0np0:
+      addresses:
+        - 192.168.100.14/24
+      dhcp4: no
+    enP2p1s0f1np1:
+      addresses:
+        - 192.168.200.16/24
+      dhcp4: no
+EOF
+
+## Set appropriate permissions
+sudo chmod 600 /etc/netplan/40-cx7.yaml
+
+## Apply the configuration
+sudo netplan apply
+```
+
+On node 2:
+```bash
+## Create the netplan configuration file
+sudo tee /etc/netplan/40-cx7.yaml > /dev/null <<EOF
+network:
+  version: 2
+  ethernets:
+    enp1s0f0np0:
+      addresses:
+        - 192.168.100.11/24
+      dhcp4: no
+    enp1s0f1np1:
+      addresses:
+        - 192.168.200.13/24
+      dhcp4: no
+    enP2p1s0f0np0:
+      addresses:
+        - 192.168.100.15/24
+      dhcp4: no
+    enP2p1s0f1np1:
+      addresses:
+        - 192.168.200.17/24
+      dhcp4: no
+EOF
+
+## Set appropriate permissions
+sudo chmod 600 /etc/netplan/40-cx7.yaml
+
+## Apply the configuration
+sudo netplan apply
+```
+
+
+**Option 3: Manual IP Assignment from the command line**
+
 > [!NOTE]
 > Using this option, the IPs assigned to the interfaces will change if you reboot the system.

-**Option 2: Manual IP Assignment (Advanced)**
-
 First, identify which network ports are available and up:

 ```bash
@@ -167,7 +239,7 @@ You can verify the IP assignment on both nodes by running the following command
 ip addr show enp1s0f1np1
 ```

-## Step 3. Set up passwordless SSH authentication
+## Step 4. Set up passwordless SSH authentication

 #### Option 1: Automatically configure SSH

@@ -220,7 +292,7 @@ ssh-copy-id -i ~/.ssh/id_rsa.pub <username>@<IP for Node 1>
 ssh-copy-id -i ~/.ssh/id_rsa.pub <username>@<IP for Node 2>
 ```

-## Step 4. Verify Multi-Node Communication
+## Step 5. Verify Multi-Node Communication

 Test basic multi-node functionality:

--- a/nvidia/multi-modal-inference/README.md
+++ b/nvidia/multi-modal-inference/README.md
@@ -76,7 +76,7 @@ the TensorRT development environment with all required dependencies pre-installe
 docker run --gpus all --ipc=host --ulimit memlock=-1 \
 --ulimit stack=67108864 -it --rm --ipc=host \
 -v $HOME/.cache/huggingface:/root/.cache/huggingface \
-nvcr.io/nvidia/pytorch:25.09-py3
+nvcr.io/nvidia/pytorch:25.10-py3
 ```

 ## Step 2. Clone and set up TensorRT repository
@@ -101,6 +101,7 @@ apt install -y libgl1 libglu1-mesa libglib2.0-0t64 libxrender1 libxext6 libx11-6
 pip install nvidia-modelopt[torch,onnx]
 sed -i '/^nvidia-modelopt\[.*\]=.*/d' requirements.txt
 pip3 install -r requirements.txt
+pip install onnxconverter_common
 ```

 ## Step 4. Run Flux.1 Dev model inference
--- a/nvidia/nccl/README.md
+++ b/nvidia/nccl/README.md
@@ -128,6 +128,10 @@ In this example, the IP address for Node 1 is **169.254.35.62**. Repeat the proc

 ## Step 5. Run NCCL communication test

+> [!NOTE]
+> Full bandwidth can be achieved with just one QSFP cable.
+> When two QSFP cables are connected, all four interfaces must be assigned IP addresses to obtain full bandwidth.
+
 Execute the following commands on both nodes to run the NCCL communication test. Replace the IP addresses and interface names with the ones you found in the previous step.

 ```bash