mirror of
https://github.com/NVIDIA/dgx-spark-playbooks.git
synced 2026-04-26 03:43:52 +00:00
chore: Regenerate all playbooks
parent
53f06ed06c
commit
f39dd60161
@ -25,7 +25,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
|
|||||||
- [Connect to your Spark](nvidia/connect-to-your-spark/)
|
- [Connect to your Spark](nvidia/connect-to-your-spark/)
|
||||||
- [DGX Dashboard](nvidia/dgx-dashboard/)
|
- [DGX Dashboard](nvidia/dgx-dashboard/)
|
||||||
- [FLUX.1 Dreambooth LoRA Fine-tuning](nvidia/flux-finetuning/)
|
- [FLUX.1 Dreambooth LoRA Fine-tuning](nvidia/flux-finetuning/)
|
||||||
- [Optimized Jax](nvidia/jax/)
|
- [Optimized JAX](nvidia/jax/)
|
||||||
- [Llama Factory](nvidia/llama-factory/)
|
- [Llama Factory](nvidia/llama-factory/)
|
||||||
- [MONAI-Reasoning-CXR-3B Model](nvidia/monai-reasoning/)
|
- [MONAI-Reasoning-CXR-3B Model](nvidia/monai-reasoning/)
|
||||||
- [Build and Deploy a Multi-Agent Chatbot](nvidia/multi-agent-chatbot/)
|
- [Build and Deploy a Multi-Agent Chatbot](nvidia/multi-agent-chatbot/)
|
||||||
|
|||||||
@ -1,6 +1,6 @@
|
|||||||
# Optimized Jax
|
# Optimized JAX
|
||||||
|
|
||||||
> Develop with Optimized Jax
|
> Develop with Optimized JAX
|
||||||
|
|
||||||
## Table of Contents
|
## Table of Contents
|
||||||
|
|
||||||
@ -17,10 +17,10 @@ JAX lets you write **NumPy-style Python code** and run it fast on GPUs without w
|
|||||||
|
|
||||||
- **NumPy on accelerators**: Use `jax.numpy` just like NumPy, but arrays live on the GPU.
|
- **NumPy on accelerators**: Use `jax.numpy` just like NumPy, but arrays live on the GPU.
|
||||||
- **Function transformations**:
|
- **Function transformations**:
|
||||||
- `jit` → Compiles your function into fast GPU code.
|
- `jit` → Compiles your function into fast GPU code
|
||||||
- `grad` → Gives you automatic differentiation.
|
- `grad` → Gives you automatic differentiation
|
||||||
- `vmap` → Vectorizes your function across batches.
|
- `vmap` → Vectorizes your function across batches
|
||||||
- `pmap` → Runs across multiple GPUs in parallel.
|
- `pmap` → Runs across multiple GPUs in parallel
|
||||||
- **XLA backend**: JAX hands your code to XLA (Accelerated Linear Algebra compiler), which fuses operations and generates optimized GPU kernels.
|
- **XLA backend**: JAX hands your code to XLA (Accelerated Linear Algebra compiler), which fuses operations and generates optimized GPU kernels.
|
||||||
|
|
||||||
## What you'll accomplish
|
## What you'll accomplish
|
||||||
@ -40,12 +40,12 @@ GPU acceleration and performance optimization capabilities.
|
|||||||
|
|
||||||
## Prerequisites
|
## Prerequisites
|
||||||
|
|
||||||
[ ] NVIDIA Spark device with Blackwell architecture
|
- NVIDIA Spark device with Blackwell architecture
|
||||||
[ ] ARM64 (AArch64) processor architecture
|
- ARM64 (AArch64) processor architecture
|
||||||
[ ] Docker or container runtime installed
|
- Docker or container runtime installed
|
||||||
[ ] NVIDIA Container Toolkit configured
|
- NVIDIA Container Toolkit configured
|
||||||
[ ] Verify GPU access: `nvidia-smi`
|
- Verify GPU access: `nvidia-smi`
|
||||||
[ ] Port 8080 available for marimo notebook access
|
- Port 8080 available for marimo notebook access
|
||||||
|
|
||||||
## Ancillary files
|
## Ancillary files
|
||||||
|
|
||||||
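The prerequisite list in the hunk above can be checked from a script before building the container. The following is a minimal sketch and not part of the playbook: it assumes `docker` and `nvidia-smi` are the executables to look for on `PATH`, and the helper names `missing_tools` and `is_arm64` are hypothetical.

```python
import platform
import shutil

# Executables named in the prerequisites (assumption: Docker is the container runtime).
REQUIRED_TOOLS = ("docker", "nvidia-smi")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def is_arm64(machine=None):
    """Return True if the machine string denotes ARM64 (AArch64)."""
    machine = platform.machine() if machine is None else machine
    return machine.lower() in ("aarch64", "arm64")


if __name__ == "__main__":
    print("ARM64 host:", is_arm64())
    print("Missing tools:", missing_tools() or "none")
```

This only confirms that the executables exist. It does not show that the NVIDIA Container Toolkit is configured or that a container can see the GPU, so running `nvidia-smi` yourself is still the real check.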
@ -119,7 +119,7 @@ docker run --gpus all --rm -it \
|
|||||||
jax-on-spark
|
jax-on-spark
|
||||||
```
|
```
|
||||||
|
|
||||||
## Step 5. Access marimo interface
|
## Step 5. Access the marimo interface
|
||||||
|
|
||||||
Connect to the marimo notebook server to begin the JAX tutorial.
|
Connect to the marimo notebook server to begin the JAX tutorial.
|
||||||
|
|
||||||
@ -130,7 +130,7 @@ Connect to the marimo notebook server to begin the JAX tutorial.
|
|||||||
|
|
||||||
The interface will load a table-of-contents display and brief introduction to marimo.
|
The interface will load a table-of-contents display and brief introduction to marimo.
|
||||||
|
|
||||||
## Step 6. Complete JAX introduction tutorial
|
## Step 6. Complete the JAX introduction tutorial
|
||||||
|
|
||||||
Work through the introductory material to understand JAX programming model differences from NumPy.
|
Work through the introductory material to understand JAX programming model differences from NumPy.
|
||||||
|
|
||||||
@ -172,7 +172,7 @@ Common issues and their solutions:
|
|||||||
| Symptom | Cause | Fix |
|
| Symptom | Cause | Fix |
|
||||||
|---------|--------|-----|
|
|---------|--------|-----|
|
||||||
| `nvidia-smi` not found | Missing NVIDIA drivers | Install NVIDIA drivers for ARM64 |
|
| `nvidia-smi` not found | Missing NVIDIA drivers | Install NVIDIA drivers for ARM64 |
|
||||||
| Container fails to access GPU | Missing NVIDIA Container Toolkit | Install nvidia-container-toolkit |
|
| Container fails to access GPU | Missing NVIDIA Container Toolkit | Install `nvidia-container-toolkit` |
|
||||||
| JAX only uses CPU | CUDA/JAX version mismatch | Reinstall JAX with CUDA support |
|
| JAX only uses CPU | CUDA/JAX version mismatch | Reinstall JAX with CUDA support |
|
||||||
| Port 8080 unavailable | Port already in use | Use `-p 8081:8080` or kill process on 8080 |
|
| Port 8080 unavailable | Port already in use | Use `-p 8081:8080` or kill process on 8080 |
|
||||||
| Package conflicts in Docker build | Outdated environment file | Update environment file for Blackwell |
|
| Package conflicts in Docker build | Outdated environment file | Update environment file for Blackwell |
|
||||||
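The "Port 8080 unavailable" row in the table above can be anticipated before starting the container. As a rough sketch under stated assumptions (the helper names `port_in_use` and `choose_host_port` are hypothetical, and only the local loopback address is probed), a script could pick the host port like this:

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def choose_host_port(preferred=8080, fallback=8081):
    """Return `preferred` unless it is taken, in which case return `fallback`."""
    return fallback if port_in_use(preferred) else preferred


if __name__ == "__main__":
    host_port = choose_host_port()
    # The container side stays on 8080, matching the `-p 8081:8080` fix in the table.
    print(f"-p {host_port}:8080")
```

The sketch does not check whether the fallback port is itself free. If both ports are busy, `docker run` will still fail and a different host port is needed.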
|
|||||||
@ -13,7 +13,7 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
## Basic Idea
|
## Basic idea
|
||||||
|
|
||||||
NCCL (NVIDIA Collective Communication Library) enables high-performance GPU-to-GPU communication
|
NCCL (NVIDIA Collective Communication Library) enables high-performance GPU-to-GPU communication
|
||||||
across multiple nodes. This walkthrough sets up NCCL for multi-node distributed training on
|
across multiple nodes. This walkthrough sets up NCCL for multi-node distributed training on
|
||||||
|
|||||||
@ -14,7 +14,7 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
## Basic Idea
|
## Basic idea
|
||||||
|
|
||||||
This walkthrough demonstrates how to set up and run an agentic retrieval-augmented generation (RAG)
|
This walkthrough demonstrates how to set up and run an agentic retrieval-augmented generation (RAG)
|
||||||
project using NVIDIA AI Workbench. You'll use AI Workbench to clone and run a pre-built agentic RAG
|
project using NVIDIA AI Workbench. You'll use AI Workbench to clone and run a pre-built agentic RAG
|
||||||
@ -39,16 +39,16 @@ architectures.
|
|||||||
|
|
||||||
## Prerequisites
|
## Prerequisites
|
||||||
|
|
||||||
- [ ] DGX Spark system with NVIDIA AI Workbench installed or ready to install
|
- DGX Spark system with NVIDIA AI Workbench installed or ready to install
|
||||||
- [ ] Free NVIDIA API key: Generate at [NGC API Keys](https://org.ngc.nvidia.com/setup/api-keys)
|
- Free NVIDIA API key: Generate at [NGC API Keys](https://org.ngc.nvidia.com/setup/api-keys)
|
||||||
- [ ] Free Tavily API key: Generate at [Tavily](https://tavily.com/)
|
- Free Tavily API key: Generate at [Tavily](https://tavily.com/)
|
||||||
- [ ] Internet connection for cloning repositories and accessing APIs
|
- Internet connection for cloning repositories and accessing APIs
|
||||||
- [ ] Web browser for accessing the Gradio interface
|
- Web browser for accessing the Gradio interface
|
||||||
|
|
||||||
**Verification commands:**
|
## Verification commands
|
||||||
|
|
||||||
* Verify the NVIDIA AI Workbench application exists on your DGX Spark system
|
- Verify the NVIDIA AI Workbench application exists on your DGX Spark system
|
||||||
* Verify your API keys are valid and up-to-date
|
- Verify your API keys are valid and up-to-date
|
||||||
|
|
||||||
|
|
||||||
## Time & risk
|
## Time & risk
|
||||||
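The verification items in the hunk above are written as prose rather than commands. One hedged way to script part of the API key check, assuming the keys have been exported in your shell under the names `NVIDIA_API_KEY` and `TAVILY_API_KEY` (both names are assumptions, as is the helper `unset_keys`), is:

```python
import os

# Assumed variable names; the project may read its keys from somewhere else.
EXPECTED_KEYS = ("NVIDIA_API_KEY", "TAVILY_API_KEY")


def unset_keys(env=None, names=EXPECTED_KEYS):
    """Return the names whose value is missing or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    print("Unset keys:", unset_keys() or "none")
```

This only detects keys that are absent or empty. It does not contact NVIDIA or Tavily, so an expired or revoked key will still pass.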
@ -74,7 +74,7 @@ On your DGX Spark system, open the **NVIDIA AI Workbench** application and click
|
|||||||
|
|
||||||
### Troubleshooting installation issues
|
### Troubleshooting installation issues
|
||||||
|
|
||||||
If you encounter the error message: `An error occurred ... container tool failed to reach ready state. try again: docker is not running`, reboot your DGX Spark system to restart the docker service, then reopen NVIDIA AI Workbench.
|
If you encounter the error message: `An error occurred ... container tool failed to reach ready state. try again: docker is not running` reboot your DGX Spark system to restart the docker service, then reopen NVIDIA AI Workbench.
|
||||||
|
|
||||||
## Step 2. Verify API key requirements
|
## Step 2. Verify API key requirements
|
||||||
|
|
||||||
@ -94,7 +94,7 @@ This step clones the pre-built agentic RAG project from GitHub into your AI Work
|
|||||||
|
|
||||||
From the AI Workbench landing page, select the **Local** location if not done so already, then click **Clone Project** from the top right corner.
|
From the AI Workbench landing page, select the **Local** location if not done so already, then click **Clone Project** from the top right corner.
|
||||||
|
|
||||||
Paste this Git repository URL in the clone dialog: ``https://github.com/NVIDIA/workbench-example-agentic-rag``.
|
Paste this Git repository URL in the clone dialog: https://github.com/NVIDIA/workbench-example-agentic-rag
|
||||||
|
|
||||||
Click **Clone** to begin the clone and build process.
|
Click **Clone** to begin the clone and build process.
|
||||||
|
|
||||||
|
|||||||
@ -12,7 +12,7 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
## Basic Idea
|
## Basic idea
|
||||||
This walkthrough establishes a local Visual Studio Code development environment directly on DGX Spark devices. By installing VS Code natively on the ARM64-based Spark system, you gain access to a full-featured IDE with extensions, integrated terminal, and Git integration while leveraging the specialized hardware for development and testing.
|
This walkthrough establishes a local Visual Studio Code development environment directly on DGX Spark devices. By installing VS Code natively on the ARM64-based Spark system, you gain access to a full-featured IDE with extensions, integrated terminal, and Git integration while leveraging the specialized hardware for development and testing.
|
||||||
|
|
||||||
## What you'll accomplish
|
## What you'll accomplish
|
||||||
@ -175,23 +175,28 @@ rm -rf ~/.vscode
|
|||||||
|
|
||||||
## Access with NVIDIA Sync
|
## Access with NVIDIA Sync
|
||||||
|
|
||||||
## Step 1. Install and Open NVIDIA Sync
|
## Step 1. Install and configure NVIDIA Sync
|
||||||
|
|
||||||
## Step 2. Add your Spark to NVIDIA Sync
|
Follow the [NVIDIA Sync setup guide](/spark/connect-to-your-spark/sync) to:
|
||||||
|
- Install NVIDIA Sync for your operating system
|
||||||
|
- Configure which development tools you want to use (VS Code, Cursor, Terminal, etc.)
|
||||||
|
- Add your DGX Spark device by providing its hostname/IP and credentials
|
||||||
|
|
||||||
## Step 3. Install VS Code locally
|
NVIDIA Sync will automatically configure SSH key-based authentication for secure, password-free access.
|
||||||
|
|
||||||
## Step 4. Open Sync and launch VS Code
|
## Step 2. Launch VS Code through NVIDIA Sync
|
||||||
|
|
||||||
|
- Click the NVIDIA Sync icon in your system tray/taskbar
|
||||||
|
- Ensure your device is connected (click "Connect" if needed)
|
||||||
|
- Click on "VS Code" to launch it with an automatic SSH connection to your Spark
|
||||||
- Wait for the remote connection to be established (may ask your local machine for a password or to authorize the connection)
|
- Wait for the remote connection to be established (may ask your local machine for a password or to authorize the connection)
|
||||||
- It may prompt you to "trust the authors of the files in this folder" when you first land in the home directory after a successful ssh connection.
|
- It may prompt you to "trust the authors of the files in this folder" when you first land in the home directory after a successful SSH connection
|
||||||
|
|
||||||
|
## Step 3. Validation and follow-ups
|
||||||
|
|
||||||
|
- Verify that you can access your Spark's filesystem with VS Code as a text editor
|
||||||
## Step 5. Validation and Follow-ups
|
- Open the integrated terminal in VS Code and run test commands like `hostnamectl` and `whoami` to ensure you are remotely accessing your Spark
|
||||||
|
- Navigate to a specific file path or directory and start editing/writing files
|
||||||
- Verify that you can access your Spark's filesystem with VSCode as a text editor. Run test commands in the terminal like `hostnamectl` and `whoami` to ensure you are remotely accessing your spark.
|
- Install VS Code extensions for your development workflow (Python, Docker, GitLens, etc.)
|
||||||
- Specify a file path or directory and start editing/writing files
|
- Clone repositories from GitHub or other version control systems
|
||||||
- Install extensions
|
- Configure and locally host an LLM code assistant if desired
|
||||||
- Clone repos
|
|
||||||
- Locally host LLM code assistant
|
|
||||||
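The validation step in the hunk above relies on `hostnamectl` and `whoami` to confirm that the terminal is running on the Spark and not on the local machine. A Python equivalent, given here as a sketch, can be run from the VS Code integrated terminal. The helper names are hypothetical, and `spark-1234` is a placeholder hostname, not a value from the playbook.

```python
import getpass
import socket


def session_identity():
    """Return the user and hostname of the machine running this script."""
    try:
        user = getpass.getuser()
    except Exception:  # no login name could be determined
        user = "unknown"
    return {"user": user, "hostname": socket.gethostname()}


def is_expected_host(expected, actual=None):
    """Compare hostnames case-insensitively, ignoring any domain suffix."""
    actual = socket.gethostname() if actual is None else actual
    return actual.split(".")[0].lower() == expected.split(".")[0].lower()


if __name__ == "__main__":
    print(session_identity())
    # Replace the placeholder with your Spark's hostname.
    print("On expected host:", is_expected_host("spark-1234"))
```

If the printed hostname is that of your laptop or workstation, the terminal is local and the remote connection has not been established.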
|
|||||||