chore: Regenerate all playbooks

This commit is contained in:
GitLab CI 2025-10-13 13:22:50 +00:00
parent faf13cfac4
commit 99425a7384
9 changed files with 11 additions and 198 deletions

View File

@@ -23,6 +23,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
- [Comfy UI](nvidia/comfy-ui/)
- [Set Up Local Network Access](nvidia/connect-to-your-spark/)
- [Connect Two Sparks](nvidia/connect-two-sparks/)
- [CUDA-X Data Science](nvidia/cuda-x-data-science/)
- [DGX Dashboard](nvidia/dgx-dashboard/)
- [FLUX.1 Dreambooth LoRA Fine-tuning](nvidia/flux-finetuning/)
@@ -42,15 +43,14 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
- [RAG application in AI Workbench](nvidia/rag-ai-workbench/)
- [SGLang Inference Server](nvidia/sglang/)
- [Speculative Decoding](nvidia/speculative-decoding/)
- [Connect Two Sparks](nvidia/stack-sparks/)
- [Set up Tailscale on your Spark](nvidia/tailscale/)
- [TRT LLM for Inference](nvidia/trt-llm/)
- [Text to Knowledge Graph](nvidia/txt2kg/)
- [Unsloth on DGX Spark](nvidia/unsloth/)
- [Vibe Coding in VS Code](nvidia/vibe-coding/)
- [vLLM for Inference](nvidia/vllm/)
- [Install and Use vLLM for Inference](nvidia/vllm/)
- [Vision-Language Model Fine-tuning](nvidia/vlm-finetuning/)
- [Install VS Code](nvidia/vscode/)
- [VS Code](nvidia/vscode/)
- [Video Search and Summarization](nvidia/vss/)
## Resources

View File

@@ -1,6 +1,6 @@
# Comfy UI
> Install and use ComfyUI to generate images
> Install and use Comfy UI to generate images
## Table of Contents

View File

@@ -40,9 +40,9 @@ a functional distributed computing environment.
## Ancillary files
All required files for this playbook can be found [here on GitHub](https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets/-/blob/main/nvidia/stack-sparks/)
All required files for this playbook can be found [here on GitHub](https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets/-/blob/main/nvidia/connect-two-sparks/)
- [**discover-sparks.sh**](https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets/-/blob/main/nvidia/stack-sparks/assets/discover-sparks) script for automatic node discovery and SSH key distribution
- [**discover-sparks.sh**](https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets/-/blob/main/nvidia/connect-two-sparks/assets/discover-sparks) script for automatic node discovery and SSH key distribution
## Time & risk
@@ -169,7 +169,7 @@ ip addr show enp1s0f1np1
**Option 1: Automatically configure SSH**
Run the DGX Spark [**discover-sparks.sh**](https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets/-/blob/main/nvidia/stack-sparks/assets/discover-sparks) script from one of the nodes to automatically discover and configure SSH:
Run the DGX Spark [**discover-sparks.sh**](https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets/-/blob/main/nvidia/connect-two-sparks/assets/discover-sparks) script from one of the nodes to automatically discover and configure SSH:
```bash
bash ./discover-sparks
```

View File

@@ -1,6 +1,6 @@
# CUDA-X Data Science
> Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes.
> Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes
## Table of Contents

View File

@@ -1,24 +0,0 @@
#
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
network:
  version: 2
  ethernets:
    enp1s0f0np0:
      link-local: [ ipv4 ]
    enp1s0f1np1:
      link-local: [ ipv4 ]
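The `link-local: [ ipv4 ]` stanza lets each interface self-assign an IPv4 address from the 169.254.0.0/16 link-local range (RFC 3927), so the two Sparks can reach each other with no DHCP server on the link. A minimal sketch of recognizing such an address in bash (the address itself is hypothetical):

```shell
# Hypothetical address of the kind these interfaces self-assign
addr="169.254.17.5"
# All RFC 3927 link-local IPv4 addresses sit in 169.254.0.0/16
if [[ $addr == 169.254.* ]]; then
    echo "link-local"
fi
```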

View File

@@ -1,163 +0,0 @@
#
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#!/usr/bin/env bash
# discover-sparks
# Discover available systems using avahi-browse and generate an MPI hosts file
# Searches all active interfaces automatically
#
# Usage: bash ./discover-sparks
set -euo pipefail
# Check if running as root or with sudo
if [[ $EUID -eq 0 ]] || [[ -n "${SUDO_USER:-}" ]]; then
    echo "Error: This script should not be run as root or with sudo"
    echo "Please run as a regular user"
    exit 1
fi
# Dynamically get interface names from ibdev2netdev output
# Use ibdev2netdev to list InfiniBand devices and their network interfaces.
# The awk command searches for lines containing 'Up)' (i.e., interfaces that are up)
# and prints the 5th field, which is the interface name (e.g., enp1s0f0np0).
# The tr command removes any parentheses from the output.
INTERFACES=($(ibdev2netdev | awk '/Up\)/ {print $5}' | tr -d '()'))
if [ ${#INTERFACES[@]} -eq 0 ]; then
    echo "ERROR: No active interfaces found via ibdev2netdev."
    exit 1
fi
# Create temporary files for processing; clean both up on exit
TEMP_FILE=$(mktemp)
trap 'rm -f "$TEMP_FILE" "$TEMP_FILE.sorted"' EXIT
# Check if avahi-browse is available
if ! command -v avahi-browse &> /dev/null; then
    echo "Error: avahi-browse not found. Please install avahi-utils package."
    exit 1
fi
# Run avahi-browse and filter for SSH services on specified interfaces
# -p: parseable output
# -r: resolve host names and addresses
# -f: don't fail if the daemon is not available
# -t: terminate after dumping all entries available at startup
avahi_output=$(avahi-browse -p -r -f -t _ssh._tcp 2>/dev/null)
# Filter for each active interface
found_services=false
for interface in "${INTERFACES[@]}"; do
    if echo "$avahi_output" | grep "$interface" >> "$TEMP_FILE"; then
        found_services=true
    fi
done
if [ "$found_services" = false ]; then
    echo "Warning: No services found on any specified interface"
    exit 0
fi
# Extract IPv4 addresses from the avahi-browse output
# Format: =;interface;IPv4;hostname\032service;description;local;fqdn;ip_address;port;
grep "^=" "$TEMP_FILE" | grep "IPv4" | while IFS=';' read -r prefix interface protocol hostname_service description local fqdn ip_address port rest; do
    # Clean up any trailing data
    clean_ip=$(echo "$ip_address" | sed 's/;.*$//')
    # Validate IP address format
    if [[ $clean_ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
        echo "$clean_ip" >> "$TEMP_FILE.sorted"
        echo "Found: $clean_ip ($fqdn)"
    else
        echo "Warning: Invalid IP format: $clean_ip"
    fi
done
# Sort and remove duplicates
if [[ -s "$TEMP_FILE.sorted" ]]; then
    sort -u "$TEMP_FILE.sorted" -o "$TEMP_FILE.sorted"
else
    echo "No IPv4 addresses found."
    exit 1
fi
# Generate a shared SSH key if it doesn't exist
SHARED_KEY="$HOME/.ssh/id_ed25519_shared"
if [[ ! -f "$SHARED_KEY" ]]; then
    echo "Generating shared SSH key for all nodes..."
    ssh-keygen -t ed25519 -N "" -f "$SHARED_KEY" -q -C "shared-cluster-key"
fi
echo ""
echo "Setting up shared SSH access across all nodes..."
echo "You may be prompted for your password on each node."
# Ensure local .ssh directory exists with correct permissions
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"
# Add shared public key to local authorized_keys
if ! grep -qF "$(cat "$SHARED_KEY.pub")" "$HOME/.ssh/authorized_keys" 2>/dev/null; then
    cat "$SHARED_KEY.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
    echo " ✓ Added shared public key to local authorized_keys"
fi
# Distribute shared key to all remote nodes
while read -r node_ip; do
    if [[ -n "$node_ip" ]]; then
        echo "Configuring $node_ip..."
        # Copy shared key to remote node and set up authorized_keys
        if scp -o StrictHostKeyChecking=accept-new "$SHARED_KEY" "$SHARED_KEY.pub" "$USER@$node_ip:~/.ssh/" &>/dev/null; then
            ssh -n -o StrictHostKeyChecking=accept-new "$USER@$node_ip" "
                chmod 700 ~/.ssh
                chmod 600 ~/.ssh/id_ed25519_shared
                chmod 644 ~/.ssh/id_ed25519_shared.pub
                # Add shared public key to authorized_keys if not present
                if ! grep -qF \"\$(cat ~/.ssh/id_ed25519_shared.pub)\" ~/.ssh/authorized_keys 2>/dev/null; then
                    cat ~/.ssh/id_ed25519_shared.pub >> ~/.ssh/authorized_keys
                    chmod 600 ~/.ssh/authorized_keys
                fi
                # Create/update SSH config to use shared key by default
                if ! grep -q 'IdentityFile.*id_ed25519_shared' ~/.ssh/config 2>/dev/null; then
                    echo 'Host *' >> ~/.ssh/config
                    echo ' IdentityFile ~/.ssh/id_ed25519_shared' >> ~/.ssh/config
                    chmod 600 ~/.ssh/config
                fi
            " &>/dev/null
            echo " ✓ Successfully configured $node_ip with shared key"
        else
            echo " ✗ Failed to configure $node_ip"
        fi
    fi
done < "$TEMP_FILE.sorted"
# Update local SSH config to use shared key
if ! grep -q 'IdentityFile.*id_ed25519_shared' "$HOME/.ssh/config" 2>/dev/null; then
    touch "$HOME/.ssh/config"
    echo 'Host *' >> "$HOME/.ssh/config"
    echo ' IdentityFile ~/.ssh/id_ed25519_shared' >> "$HOME/.ssh/config"
    chmod 600 "$HOME/.ssh/config"
    echo " ✓ Updated local SSH config to use shared key"
fi
echo ""
echo "Shared SSH setup complete!"
echo "All nodes can now SSH to each other using the shared key (id_ed25519_shared)."
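The two text-processing steps in the script above can be exercised on canned input without any Spark hardware; a small sketch (interface names, hostnames, and addresses below are hypothetical):

```shell
# 1. Interface discovery: the same awk/tr pipeline the script runs,
#    fed a sample of ibdev2netdev output.
sample='mlx5_0 port 1 ==> enp1s0f0np0 (Up)
mlx5_1 port 1 ==> enp1s0f1np1 (Down)'
INTERFACES=($(echo "$sample" | awk '/Up\)/ {print $5}' | tr -d '()'))
echo "${INTERFACES[@]}"   # only the interface that is Up survives

# 2. Service resolution: a resolved avahi-browse -p -r line is
#    ';'-separated, split here with IFS exactly as in the script.
line='=;enp1s0f0np0;IPv4;spark-a\032SSH;_ssh._tcp;local;spark-a.local;169.254.17.5;22;'
IFS=';' read -r prefix interface protocol hostname_service description scope fqdn ip_address port rest <<< "$line"
echo "$interface $ip_address"
```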

View File

@@ -1,6 +1,6 @@
# Text to Knowledge Graph
> Transform unstructured text into interactive knowledge graphs using local GPU-accelerated LLM inference and graph visualization
> Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization
## Table of Contents

View File

@@ -1,4 +1,4 @@
# vLLM for Inference
# Install and Use vLLM for Inference
> Use a container or build vLLM from source for Spark

View File

@@ -1,4 +1,4 @@
# Install VS Code
# VS Code
> Install and use VS Code locally or remotely