From 61cbcfd071a1c4498ef58cf8ae55dff81491bacc Mon Sep 17 00:00:00 2001 From: GitLab CI Date: Wed, 18 Mar 2026 17:40:15 +0000 Subject: [PATCH] chore: Regenerate all playbooks --- README.md | 4 +- nvidia/nemoclaw/README.md | 219 +++++++++++++++++++------------------ nvidia/openclaw/README.md | 11 +- nvidia/openshell/README.md | 144 ++++++++++++++++++------ 4 files changed, 226 insertions(+), 152 deletions(-) diff --git a/README.md b/README.md index 7d615ca..735abde 100644 --- a/README.md +++ b/README.md @@ -37,14 +37,14 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting - [Connect Multiple Sparks through a Switch](nvidia/multi-sparks-through-switch/) - [NCCL for Two Sparks](nvidia/nccl/) - [Fine-tune with NeMo](nvidia/nemo-fine-tune/) -- [NemoClaw](nvidia/nemoclaw/) +- [NemoClaw with Nemotron-3-Super on DGX Spark](nvidia/nemoclaw/) - [Nemotron-3-Nano with llama.cpp](nvidia/nemotron/) - [NIM on Spark](nvidia/nim-llm/) - [NVFP4 Quantization](nvidia/nvfp4-quantization/) - [Ollama](nvidia/ollama/) - [Open WebUI with Ollama](nvidia/open-webui/) - [OpenClaw 🦞](nvidia/openclaw/) -- [OpenClaw with OpenShell](nvidia/openshell/) +- [Secure Long Running AI Agents with OpenShell on DGX Spark](nvidia/openshell/) - [Portfolio Optimization](nvidia/portfolio-optimization/) - [Fine-tune with Pytorch](nvidia/pytorch-fine-tune/) - [RAG Application in AI Workbench](nvidia/rag-ai-workbench/) diff --git a/nvidia/nemoclaw/README.md b/nvidia/nemoclaw/README.md index 177fd49..142ba8e 100644 --- a/nvidia/nemoclaw/README.md +++ b/nvidia/nemoclaw/README.md @@ -1,60 +1,68 @@ -# NemoClaw +# NemoClaw with Nemotron-3-Super on DGX Spark -> Run OpenClaw in an OpenShell sandbox on DGX Spark with Ollama (Nemotron) +> Run NemoClaw on DGX Spark with Nemotron-3-Super ## Table of Contents - [Overview](#overview) - - [Quick start safety check](#quick-start-safety-check) - - [What you're getting](#what-youre-getting) - - [Key risks with AI agents](#key-risks-with-ai-agents) - 
- [Participant acknowledgement](#participant-acknowledgement) + - [Overview](#overview) + - [Basic idea](#basic-idea) + - [What you'll accomplish](#what-youll-accomplish) + - [Notice and disclaimers](#notice-and-disclaimers) + - [Isolation layers (OpenShell)](#isolation-layers-openshell) + - [What to know before starting](#what-to-know-before-starting) + - [Prerequisites](#prerequisites) + - [Ancillary files](#ancillary-files) + - [Time and risk](#time-and-risk) - [Instructions](#instructions) + - [Restarting the gateway (if needed)](#restarting-the-gateway-if-needed) - [Troubleshooting](#troubleshooting) --- ## Overview -## Basic idea +### Overview + +### Basic idea **NVIDIA OpenShell** is an open-source runtime for running autonomous AI agents in sandboxed environments with kernel-level isolation. **NVIDIA NemoClaw** is an OpenClaw plugin that packages OpenShell with an AI agent: it includes the `nemoclaw onboard` wizard to automate setup so you can get a browser-based chat interface running locally on your DGX Spark using Ollama (e.g. NVIDIA Nemotron 3 Super). By the end of this playbook you will have a working AI agent inside an OpenShell sandbox, accessible via a dashboard URL, with inference routed to a local model on your Sparkβ€”all without exposing your host filesystem or network to the agent. -## What you'll accomplish +### What you'll accomplish - Install and configure Docker for OpenShell (including cgroup fix for DGX Spark) - Install Node.js, Ollama, the OpenShell CLI, and the NemoClaw plugin - Run the NemoClaw onboard wizard to create a sandbox and configure inference - Start the OpenClaw web UI inside the sandbox and chat with Nemotron 3 Super (or another Ollama model) locally -## Notice and disclaimers +### Notice and disclaimers The following sections describe safety, risks, and your responsibilities when running this demo. 
-### Quick start safety check +#### Quick start safety check **Use only a clean environment.** Run this demo on a fresh device or VM with no personal data, confidential information, or sensitive credentials. Keep it isolated like a sandbox. By installing this demo, you accept responsibility for all third-party components, including reviewing their licenses, terms, and security posture. Read and accept before you install or use. -### What you're getting +#### What you're getting This experience is provided "AS IS" for demonstration purposes onlyβ€”no warranties, no guarantees. This is a demo, not a production-ready solution. You will need to implement appropriate security controls for your environment and use case. -### Key risks with AI agents +#### Key risks with AI agents - **Data leakage** β€” Any materials the agent accesses could be exposed, leaked, or stolen. - **Malicious code execution** β€” The agent or its connected tools could expose your system to malicious code or cyber-attacks. - **Unintended actions** β€” The agent might modify or delete files, send messages, or access services without explicit approval. - **Prompt injection and manipulation** β€” External inputs or connected content could hijack the agent's behavior in unexpected ways. -### Participant acknowledgement +#### Participant acknowledgement By participating in this demo, you acknowledge that you are solely responsible for your configuration and for any data, accounts, and tools you connect. To the maximum extent permitted by law, NVIDIA is not responsible for any loss of data, device damage, security incidents, or other harm arising from your configuration or use of NemoClaw demo materials, including OpenClaw or any connected tools or services. 
-## Isolation layers (OpenShell) +### Isolation layers (OpenShell) | Layer | What it protects | When it applies | |------------|----------------------------------------------------|-----------------------------| @@ -63,18 +71,18 @@ By participating in this demo, you acknowledge that you are solely responsible f | Process | Blocks privilege escalation and dangerous syscalls.| Locked at sandbox creation. | | Inference | Reroutes model API calls to controlled backends. | Hot-reloadable at runtime. | -## What to know before starting +### What to know before starting - Basic use of the Linux terminal and SSH - Familiarity with Docker (permissions, `docker run`) - Awareness of the security and risk sections above -## Prerequisites +### Prerequisites **Hardware and access:** - A DGX Spark (GB10) with keyboard and monitor, or SSH access -- An **NVIDIA API key** from [build.nvidia.com](https://build.nvidia.com) (free; the onboard wizard will prompt for it) +- An **NVIDIA API key** from [build.nvidia.com](https://build.nvidia.com) (free; only required if using NVIDIA Cloud inference β€” not needed for local Ollama) - A GitHub account with access to the NVIDIA organization (for installing the OpenShell CLI from GitHub releases) **Software:** @@ -92,17 +100,20 @@ python3 --version Expected: Ubuntu 24.04, NVIDIA GB10 GPU, Docker server version, Python 3.12+. -## Ancillary files +### Ancillary files -All required assets are in the [openshell-openclaw-plugin repository](https://github.com/NVIDIA/openshell-openclaw-plugin). You will clone it during the instructions to install NemoClaw. +All required assets are in the [NemoClaw repository](https://github.com/NVIDIA/NemoClaw). You will clone it during the instructions to install NemoClaw. -## Time and risk +### Time and risk - **Estimated time:** 45–90 minutes (including first-time gateway and sandbox build, and Nemotron 3 Super download of ~87GB). 
- **Risk level:** Medium β€” you are running an AI agent in a sandbox; risks are reduced by isolation but not eliminated. Use a clean environment and do not connect sensitive data or production accounts. - **Rollback:** Remove the sandbox with `openshell sandbox delete `, destroy the gateway with `openshell gateway destroy -g nemoclaw`, and uninstall NemoClaw with `sudo npm uninstall -g nemoclaw` and `rm -rf ~/.nemoclaw` (see Cleanup in Instructions). -- **Last Updated:** 03/13/2026 - - First publication +- **Last Updated:** 03/17/2026 + - Updated wizard step descriptions to match actual onboard behavior + - Simplified Step 8 (gateway already runs during sandbox creation) + - Fixed repository references (NemoClaw) + - Added troubleshooting entries for port conflicts and provider setup ## Instructions @@ -191,58 +202,29 @@ ollama run nemotron-3-super:120b Configure Ollama to listen on all interfaces so the sandbox container can reach it: ```bash -sudo systemctl edit ollama.service -``` +sudo mkdir -p /etc/systemd/system/ollama.service.d +printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' | sudo tee /etc/systemd/system/ollama.service.d/override.conf -Add the following on the **third line** of the file (above "Edits below this comment will be discarded"): - -```ini -[Service] -Environment="OLLAMA_HOST=0.0.0.0" -``` - -Save (Ctrl+X, then Y), then restart: - -```bash sudo systemctl daemon-reload sudo systemctl restart ollama ``` ## Step 4. Install the OpenShell CLI -The OpenShell binary is distributed via GitHub releases. You need the GitHub CLI and access to the NVIDIA organization. +Install OpenShell using the install script: ```bash -sudo apt-get install -y gh -gh auth login -``` - -When using SSH, `gh` will show a one-time code. Visit https://github.com/login/device in a browser, enter the code, and authorize for the NVIDIA org. 
- -Configure git for NVIDIA SAML SSO and download OpenShell: - -```bash -gh auth setup-git - -ARCH=$(uname -m) -case "$ARCH" in - x86_64|amd64) ARCH="x86_64" ;; - aarch64|arm64) ARCH="aarch64" ;; -esac -gh release download --repo NVIDIA/OpenShell \ - --pattern "openshell-${ARCH}-unknown-linux-musl.tar.gz" -tar xzf openshell-${ARCH}-unknown-linux-musl.tar.gz -sudo install -m 755 openshell /usr/local/bin/openshell -rm -f openshell openshell-${ARCH}-unknown-linux-musl.tar.gz +curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh ``` Verify: `openshell --version` ## Step 5. Install NemoClaw -Clone the NemoClaw plugin and install it globally: +Clone the NemoClaw repository and install the CLI globally: ```bash +cd ~ git clone https://github.com/NVIDIA/NemoClaw cd NemoClaw sudo npm install -g . @@ -251,43 +233,64 @@ sudo npm install -g . Verify: `nemoclaw --help` > [!NOTE] -> OpenClaw (the AI agent) is installed **automatically inside the sandbox** during onboarding. You do not install it on the host. +> OpenClaw (the AI agent) is installed **automatically inside the sandbox** during onboarding β€” it is built into the sandbox Docker image. You do not install it on the host. ## Step 6. Run the NemoClaw onboard wizard -Ensure Ollama is running (`curl http://localhost:11434` should return "Ollama is running"). From the directory where you cloned the plugin in Step 5 (e.g. `~/openshell-openclaw-plugin`), or that directory in a new terminal, run: +Ensure Ollama is running (`curl http://localhost:11434` should return "Ollama is running"). From the directory where you cloned the repository in Step 5, run: ```bash -cd ~/openshell-openclaw-plugin +cd ~/NemoClaw nemoclaw onboard ``` The wizard walks you through seven steps: -1. **NVIDIA API key** β€” Paste your key from [build.nvidia.com](https://build.nvidia.com) (starts with `nvapi-`). Only needed once. -2. **Preflight** β€” Checks Docker and OpenShell. 
"No GPU detected" is normal on DGX Spark (GB10 reports unified memory differently).
-3. **Gateway** — Starts the OpenShell gateway (30–60 seconds on first run).
-4. **Sandbox** — Enter a name or press Enter for the default. First build takes 2–5 minutes.
-5. **Inference** — The wizard auto-detects Ollama (e.g. "Ollama detected on localhost:11434 — using it").
-6. **OpenClaw** — Configured on first connect.
+1. **Preflight** — Checks Docker and the OpenShell CLI, and detects the GPU. Seeing "No GPU detected" or an unexpected VRAM count is normal on DGX Spark (GB10 reports unified memory differently).
+2. **Gateway** — Destroys any old `nemoclaw` gateway and starts a new one (30–60 seconds on first run). If port 8080 is already in use by another container, see [Troubleshooting](#troubleshooting).
+3. **Sandbox** — Enter a name or press Enter for the default (`my-assistant`). The wizard builds a Docker image from the NemoClaw Dockerfile (which includes OpenClaw, the NemoClaw plugin, and the `nemoclaw-start` entrypoint script), then creates a sandbox from that image. On creation, `nemoclaw-start` runs inside the sandbox to configure and launch the OpenClaw gateway. The wizard also sets up port forwarding from port 18789 on the host to the sandbox. First build takes 2–5 minutes.
+4. **Inference (NIM)** — Auto-detects local inference engines. If Ollama is running, the wizard selects it automatically and defaults to `nemotron-3-nano`. No API key is needed for local Ollama. If no local engine is found, you will be prompted to choose an inference option (the cloud API requires an NVIDIA API key).
+5. **Inference provider** — Creates the `ollama-local` provider on the gateway and sets the inference route.
+6. **OpenClaw** — Already configured inside the sandbox during step 3.
 7. **Policies** — Press Enter or Y to accept suggested presets (pypi, npm).
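If the wizard's Docker preflight check fails on DGX Spark, the cgroup fix from the Troubleshooting table simply merges `"default-cgroupns-mode": "host"` into `/etc/docker/daemon.json` without discarding existing keys. A minimal sketch of that merge, run against a scratch file here so it is safe to try (on the Spark the real path is `/etc/docker/daemon.json`, the commands need `sudo`, and Docker must be restarted afterwards):

```shell
# Merge default-cgroupns-mode=host into a daemon.json-style file.
# Uses a temp file for illustration; the real target is /etc/docker/daemon.json.
CONF=$(mktemp)
echo '{"runtimes": {"nvidia": {"path": "nvidia-container-runtime"}}}' > "$CONF"

python3 - "$CONF" <<'EOF'
import json, os, sys

path = sys.argv[1]
# Load the existing config if present, otherwise start from an empty object.
d = json.load(open(path)) if os.path.exists(path) else {}
d["default-cgroupns-mode"] = "host"  # run containers in the host cgroup namespace
json.dump(d, open(path, "w"), indent=2)
EOF

cat "$CONF"
```

Applied to the real file, `sudo systemctl restart docker` then picks up the change; any keys already in `daemon.json` (such as the `runtimes` entry above) are preserved.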
When complete you will see something like: ```text + ────────────────────────────────────────────────── Dashboard http://localhost:18789/ Sandbox my-assistant (Landlock + seccomp + netns) Model nemotron-3-nano (ollama-local) + NIM not running + ────────────────────────────────────────────────── + Run: nemoclaw my-assistant connect + Status: nemoclaw my-assistant status + Logs: nemoclaw my-assistant logs --follow + ────────────────────────────────────────────────── ``` ## Step 7. Configure inference for Nemotron 3 Super -The onboard wizard defaults to `nemotron-3-nano`. Switch the inference route to the Super model you downloaded in Step 3: +The onboard wizard defaults to `nemotron-3-nano`. If you downloaded Nemotron 3 Super 120B in Step 3, switch the inference route to the larger model. + +If the wizard did not create the `ollama-local` provider (you will see `provider 'ollama-local' not found` when running the next command), create it manually first: ```bash -openshell inference set --provider ollama-local --model nemotron-3-super:120b +openshell provider create \ + --name ollama-local \ + --type openai \ + --credential "OPENAI_API_KEY=ollama" \ + --config "OPENAI_BASE_URL=http://host.openshell.internal:11434/v1" ``` +Then set the inference route: + +```bash +openshell inference set --provider ollama-local --model nemotron-3-super:120b --no-verify +``` + +The `--no-verify` flag is needed because `host.openshell.internal` only resolves from inside the sandbox, not from the host. + Verify: ```bash @@ -296,51 +299,27 @@ openshell inference get Expected: `provider: ollama-local` and `model: nemotron-3-super:120b`. -## Step 8. Start the OpenClaw web UI +## Step 8. Get the dashboard URL -Connect to the sandbox (use the name you chose in Step 6, e.g. `my-assistant`): +The onboard wizard in Step 6 already launched the OpenClaw gateway inside the sandbox and set up port forwarding on port 18789. 
Verify the port forward is active: + +```bash +openshell forward list +``` + +You should see `my-assistant` with port `18789` and status `running`. If the forward is not active or shows `dead`, restart it: + +```bash +openshell forward start --background 18789 my-assistant +``` + +Now get the dashboard URL (which includes an authentication token). Connect to the sandbox and run `openclaw dashboard`: ```bash openshell sandbox connect my-assistant ``` -You are now inside the sandbox. Run these commands in order. - -Set the API key environment variables (required for the gateway). For local Ollama, use the value `local-ollama` β€” no real API key is required. If you use a different inference provider later, replace with your API key: - -```bash -export NVIDIA_API_KEY=local-ollama -export ANTHROPIC_API_KEY=local-ollama -``` - -Initialize NemoClaw (this may drop you into a new shell when done): - -```bash -nemoclaw-start -``` - -After the "NemoClaw ready" banner, re-export the environment variables: - -```bash -export NVIDIA_API_KEY=local-ollama -export ANTHROPIC_API_KEY=local-ollama -``` - -Create memory files and start the web UI: - -```bash -mkdir -p /sandbox/.openclaw/workspace/memory -echo "# Memory" > /sandbox/.openclaw/workspace/MEMORY.md - -openclaw config set gateway.controlUi.dangerouslyAllowHostHeaderOriginFallback true - -nohup openclaw gateway run \ - --allow-unconfigured --dev \ - --bind loopback --port 18789 \ - > /tmp/gateway.log 2>&1 & -``` - -Wait a few seconds, then get your dashboard URL: +Inside the sandbox: ```bash openclaw dashboard @@ -352,7 +331,25 @@ This prints something like: Dashboard URL: http://127.0.0.1:18789/#token=YOUR_UNIQUE_TOKEN ``` -**Save this URL.** Type `exit` to leave the sandbox (the gateway keeps running). +**Save this URL.** Type `exit` to leave the sandbox (the gateway keeps running inside the sandbox). + +### Restarting the gateway (if needed) + +If the OpenClaw gateway inside the sandbox stopped (e.g. 
after a sandbox restart), connect and re-launch it: + +```bash +openshell sandbox connect my-assistant +``` + +Inside the sandbox: + +```bash +export NVIDIA_API_KEY=local-ollama +export ANTHROPIC_API_KEY=local-ollama +nemoclaw-start +``` + +The `nemoclaw-start` script configures OpenClaw and launches the gateway. After you see the `[gateway]` log lines, type `exit` to leave the sandbox. ## Step 9. Open the chat interface @@ -431,7 +428,7 @@ cd ~ openshell sandbox delete my-assistant 2>/dev/null openshell gateway destroy -g nemoclaw 2>/dev/null sudo npm uninstall -g nemoclaw -rm -rf ~/openshell-openclaw-plugin ~/.nemoclaw +rm -rf ~/NemoClaw ~/.nemoclaw ``` Verify: @@ -474,14 +471,18 @@ Then open the dashboard URL in your local browser. | Symptom | Cause | Fix | |---------|-------|-----| +| Gateway fails with "cannot start gateway: port 8080 is held by container..." | Another OpenShell gateway or container is already using port 8080 | Stop the conflicting container: `openshell gateway destroy -g ` or `docker stop && docker rm `, then retry `nemoclaw onboard` | | Gateway fails with cgroup / "Failed to start ContainerManager" errors | Docker not configured for host cgroup namespace on DGX Spark | Run the cgroup fix: `sudo python3 -c "import json, os; path='/etc/docker/daemon.json'; d=json.load(open(path)) if os.path.exists(path) else {}; d['default-cgroupns-mode']='host'; json.dump(d, open(path,'w'), indent=2)"` then `sudo systemctl restart docker` | | "No GPU detected" during onboard | DGX Spark GB10 reports unified memory differently | Expected on DGX Spark. The wizard still works and will use Ollama for inference. 
| +| "provider 'ollama-local' not found" when running `openshell inference set` | The onboard wizard did not complete the inference provider setup | Create the provider manually: `openshell provider create --name ollama-local --type openai --credential "OPENAI_API_KEY=ollama" --config "OPENAI_BASE_URL=http://host.openshell.internal:11434/v1"` then retry the inference set command | +| Sandbox created with a random name instead of the one you wanted | Name passed as a positional argument instead of using `--name` flag | Use `--name` flag: `openshell sandbox create --name my-assistant`. Delete the random sandbox with `openshell sandbox delete ` | | "unauthorized: gateway token missing" | Dashboard URL used without token or wrong format | Paste the **full URL** including `#token=...` (hash fragment, not `?token=`). Run `openclaw dashboard` inside the sandbox to get the URL again. | | "No API key found for provider anthropic" | API key env vars not set when starting gateway in sandbox | Inside the sandbox, set both before running the gateway: `export NVIDIA_API_KEY=local-ollama` and `export ANTHROPIC_API_KEY=local-ollama` | | Agent gives no response | Model not loaded or Nemotron 3 Super is slow | Nemotron 3 Super can take 30–90 seconds per response. Verify Ollama: `curl http://localhost:11434`. Ensure inference is set: `openshell inference get` | | Port forward dies or dashboard unreachable | Forward not active or wrong port | List forwards: `openshell forward list`. Restart: `openshell forward stop 18789 my-assistant` then `openshell forward start --background 18789 my-assistant` | | Docker permission denied | User not in docker group | `sudo usermod -aG docker $USER`, then log out and back in. 
| | Ollama not reachable from sandbox (503 / timeout) | Ollama bound to localhost only or firewall blocking 11434 | Ensure Ollama listens on all interfaces: add `Environment="OLLAMA_HOST=0.0.0.0"` in `sudo systemctl edit ollama.service`, then `sudo systemctl daemon-reload` and `sudo systemctl restart ollama`. If using UFW: `sudo ufw allow 11434/tcp comment 'Ollama for NemoClaw'` and `sudo ufw reload` | +| OpenClaw UI shows error message `origin not allowed` | OpenClaw gateway inside the sandbox rejects remote access connections | Run `openshell sandbox connect my-assistant` to enter sandbox. Inside the sandbox, run `sed -i 's/"allowedOrigins": \[\]/"allowedOrigins": ["*"]/' /root/.openclaw/gateway.json 2>/dev/null` to allow the origin. Restart OpenClaw gateway inside sandbox by running `export NVIDIA_API_KEY=local-ollama; export ANTHROPIC_API_KEY=local-ollama; nemoclaw-start` | > [!NOTE] > DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. With many applications still updating to take advantage of UMA, you may encounter memory issues even when within the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with: diff --git a/nvidia/openclaw/README.md b/nvidia/openclaw/README.md index ab622a1..cdd7fc8 100644 --- a/nvidia/openclaw/README.md +++ b/nvidia/openclaw/README.md @@ -69,8 +69,6 @@ You cannot eliminate all risk; proceed at your own risk. **Critical security mea ## Instructions -## Important: Read security warnings first - > [!CAUTION] > **Before proceeding, review the security risks in the Overview tab.** OpenClaw is an AI agent that can access your files, execute commands, and connect to external services. Data exposure and malicious code execution are real risks. **Strongly recommended:** Run OpenClaw on an isolated system or VM, use dedicated accounts (not your main accounts), and never expose the dashboard to the public internet without authentication. 
@@ -173,10 +171,10 @@ lms load openai/gpt-oss-120b --context-length 32768 ollama run gpt-oss:120b ``` -Once the interactive prompt appears, set the context window: +Once the interactive prompt appears, set the context window (type the following at the Ollama prompt; do not include any `>>>` prefix): ``` ->>> /set parameter num_ctx 32768 +/set parameter num_ctx 32768 ``` Keep this terminal (or process) running so the model stays loaded. You can now chat with the model or press Ctrl+D to exit the interactive mode while keeping the model server running. @@ -232,6 +230,9 @@ For **gpt-oss-20b** or another model, use the same structure but set `id` and `n **If you use Ollama:** +> [!NOTE] +> `ollama launch openclaw` requires **Ollama v0.15 or later**. If you see an "unknown command" error, upgrade Ollama (`ollama --version`) and retry. + Run: ```bash @@ -261,7 +262,7 @@ You can also ask OpenClaw which model it’s using. In the gateway chat UI you c | Symptom | Cause | Fix | |---------|--------|-----| -| OpenClaw dashboard URL not loading | Gateway not running or wrong host/port | **Restart the OpenClaw gateway:** For Ollama, run `ollama launch openclaw` to restart an already-configured gateway. For LM Studio, restart the OpenClaw gateway via the LM Studio UI or restart the OpenClaw service/container. **Verify:** Check that the gateway process is running with `pgrep -f openclaw` or `ps aux | grep openclaw`. **Find URL/token:** Check the original installer output (scroll up in your terminal) or look in gateway logs (typically `~/.openclaw/logs/`) for the dashboard URL and access token | +| OpenClaw dashboard URL not loading | Gateway not running or wrong host/port | **Restart the OpenClaw gateway:** For Ollama, run `ollama launch openclaw` to restart an already-configured gateway. For LM Studio, restart the OpenClaw gateway via the LM Studio UI or restart the OpenClaw service/container. 
**Verify:** Check that the gateway process is running with `pgrep -f openclaw` or `ps aux \| grep openclaw`. **Find URL/token:** Check the original installer output (scroll up in your terminal) or look in gateway logs (typically `~/.openclaw/logs/`) for the dashboard URL and access token | | "Connection refused" to model (e.g. localhost:1234 or Ollama port) | LM Studio or Ollama not running, or wrong port | Start the model in a separate terminal (`lms load ...` or `ollama run ...`) and ensure the port in `openclaw.json` matches (1234 for LM Studio, 11434 for Ollama) | | OpenClaw says no model available | Model provider not configured or model not loaded | Add the `models` section to `~/.openclaw/openclaw.json` for LM Studio, or run `ollama launch openclaw` for Ollama; ensure the model is loaded/running | | Out-of-memory or very slow inference on DGX Spark | Model too large for available GPU memory or other GPU workloads | Free GPU memory (close other apps), choose a smaller model, or check usage with `nvidia-smi` | diff --git a/nvidia/openshell/README.md b/nvidia/openshell/README.md index eb66dc7..ab06aec 100644 --- a/nvidia/openshell/README.md +++ b/nvidia/openshell/README.md @@ -1,6 +1,6 @@ -# OpenClaw with OpenShell +# Secure Long Running AI Agents with OpenShell on DGX Spark -> Run OpenClaw in an NVIDIA OpenShell sandbox on DGX Spark +> Run OpenClaw with local models in an NVIDIA OpenShell sandbox on DGX Spark ## Table of Contents @@ -146,8 +146,11 @@ Now that we have verified the user's Docker permission, we must configure Docker ``` bash sudo nvidia-ctk runtime configure --runtime=docker sudo systemctl restart docker +``` -## Run a sample workload to verify the setup +Run a sample workload to verify the setup: + +``` bash docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi ``` @@ -180,7 +183,7 @@ openshell gateway start openshell status ``` -`openshell status` should report the gateway as **healthy**. 
The first run may take a minute while Docker pulls the required images. +`openshell status` should report the gateway as **Connected**. The first run may take a few minutes while Docker pulls the required images and the internal k3s cluster bootstraps. > [!NOTE] > Remote gateway deployment requires passwordless SSH access. Ensure your SSH public key is added to `~/.ssh/authorized_keys` on the DGX Spark before using the `--remote` flag. @@ -199,11 +202,11 @@ ollama --version DGX Spark's 128GB memory can run large models: -| GPU memory available | Suggested model | Model size | Notes | -|---------------------|-------------------------|-----------|-------| -| 25–48 GB | gpt-oss:20b | ~12GB | Lower latency, good for interactive use | -| 48–80 GB | Nemotron-3-Nano-30B-A3B | ~20GB | Good balance of quality and speed | -| 128 GB | gpt-oss:120b | ~65GB | Best quality on DGX Spark | +| GPU memory available | Suggested model | Model size | Notes | +|---------------------|---------------------------|-----------|-------| +| 25–48 GB | nemotron-3-nano | ~24GB | Lower latency, good for interactive use | +| 48–80 GB | gpt-oss:120b | ~65GB | Good balance of quality and speed | +| 128 GB | nemotron-3-super:120b | ~86GB | Best quality on DGX Spark | Verify Ollama is running (it auto-starts as a service after installation). If not, start it manually: @@ -211,10 +214,39 @@ Verify Ollama is running (it auto-starts as a service after installation). If no ollama serve & ``` -Next, run a model from Ollama (adjust the model name to match your choice from [the Ollama model library](https://ollama.com/library)). The `ollama run` command will pull the model automatically if it is not already present. Running the model here ensures it is loaded and ready when you use it with OpenClaw, reducing the chance of timeouts later. Example for GPT-OSS 120b: +Configure Ollama to listen on all interfaces so the OpenShell gateway container can reach it. 
Create a systemd override:

```bash
-ollama run gpt-oss:120b
+sudo mkdir -p /etc/systemd/system/ollama.service.d/
+sudo nano /etc/systemd/system/ollama.service.d/override.conf
+```
+
+Add these lines to the file (`nano` creates it if it does not exist):
+
+```ini
+[Service]
+Environment="OLLAMA_HOST=0.0.0.0"
+```
+
+Save and exit, then reload and restart Ollama:
+
+```bash
+sudo systemctl daemon-reload
+sudo systemctl restart ollama
+```
+
+Verify Ollama is listening on all interfaces:
+
+```bash
+ss -tlnp | grep 11434
+```
+
+You should see `*:11434` in the output. If it only shows `127.0.0.1:11434`, confirm the override file contents and that you ran `sudo systemctl daemon-reload` before restarting.
+
+Next, run a model from Ollama (adjust the model name to match your choice from [the Ollama model library](https://ollama.com/library)). The `ollama run` command will pull the model automatically if it is not already present. Running the model here ensures it is loaded and ready when you use it with OpenClaw, reducing the chance of timeouts later. Example for nemotron-3-super:
+
+```bash
+ollama run nemotron-3-super:120b
```

Verify the model is available:

```bash
ollama list
```

## Step 6. Create an inference provider

-We are going to create an OpenShell provider that points to your local Ollama server. This lets OpenShell route inference requests to your Spark-hosted model. To create a provider for the cluster, please replace `{Machine_IP}` with the IP Address of your DGX Spark.
+We are going to create an OpenShell provider that points to your local Ollama server. This lets OpenShell route inference requests to your Spark-hosted model.
+
+First, find the IP address of your DGX Spark:
+
+```bash
+hostname -I | awk '{print $1}'
+```
+
+Then create the provider, replacing `{Machine_IP}` with the IP address from the command above (e.g.
`10.110.106.169`): ```bash openshell provider create \ @@ -235,26 +275,37 @@ openshell provider create \ --config OPENAI_BASE_URL=http://{Machine_IP}:11434/v1 ``` -> [!NOTE] -> `host.docker.internal` resolves to the host machine from inside Docker containers. If your Ollama listens on a different port or host, adjust the URL accordingly. +> [!IMPORTANT] +> Do **not** use `localhost` or `127.0.0.1` here. The OpenShell gateway runs inside a Docker container, so it cannot reach the host via `localhost`. Use the machine's actual IP address. + +Verify the provider was created: + +```bash +openshell provider list +``` ## Step 7. Configure inference routing -Point the `inference.local` endpoint (available inside every sandbox) at your Ollama model: +Point the `inference.local` endpoint (available inside every sandbox) at your Ollama model. Replace the model name with your choice from Step 5: ```bash openshell inference set \ --provider local-ollama \ - --model gpt-oss:120b + --model nemotron-3-super:120b ``` +The output should confirm the route and show a validated endpoint URL, for example: `http://10.110.106.169:11434/v1/chat/completions (openai_chat_completions)`. + +> [!NOTE] +> If you see `failed to verify inference endpoint` or `failed to connect` (for example because the gateway cannot reach the host IP from inside its container), add `--no-verify` to skip endpoint verification: `openshell inference set --provider local-ollama --model nemotron-3-super:120b --no-verify`. Ensure Ollama is running and listening on all interfaces (see Step 5). + Verify the configuration: ```bash openshell inference get ``` -Expected output should show `provider: local-ollama` and `model: gpt-oss:120b` (or whichever model you chose). +Expected output should show `provider: local-ollama` and `model: nemotron-3-super:120b` (or whichever model you chose). ## Step 8. 
Deploy OpenShell Sandbox @@ -274,24 +325,23 @@ openshell sandbox create \ The `--keep` flag keeps the sandbox running after the initial process exits, so you can reconnect later. This is the default behavior. To terminate the sandbox when the initial process exits, use the `--no-keep` flag instead. -> [!NOTE] -> The sandbox name is displayed in the creation output. You can also set it explicitly with `--name `. To find it later, run `openshell sandbox list`. - The CLI will: 1. Resolve `openclaw` against the community catalog 2. Pull and build the container image 3. Apply the bundled sandbox policy 4. Launch OpenClaw inside the sandbox -In order to verify the default policy enabled for your sandbox, please run the following command: - -```bash -openshell sandbox get -``` - ## Step 9. Configure OpenClaw within OpenShell Sandbox -The sandbox container will spin up and you will be guided through the OpenClaw installation process. Work through the prompts as follows. +The sandbox container will spin up and the OpenClaw onboarding wizard will launch automatically in your terminal. + +> [!IMPORTANT] +> The onboarding wizard is **fully interactive**: it requires arrow-key navigation and Enter to select options. It cannot be completed from a non-interactive session (e.g. a script or automation tool). You must run `openshell sandbox create` from a terminal with full TTY support. +> +> If the wizard did not complete during sandbox creation, reconnect to the sandbox to re-run it: +> ```bash +> openshell sandbox connect dgx-demo +> ``` Use the arrow keys and Enter key to interact with the installation. - If you understand and agree, use the arrow key of your keyboard to select 'Yes' and press the Enter key. @@ -301,10 +351,10 @@ Use the arrow keys and Enter key to interact with the installation. - How do you want to provide this API key?: Paste API key for now. - API key: please enter "ollama". - Endpoint compatibility: select **OpenAI-compatible** and press Enter.
-- Model ID: gpt-oss:120b +- Model ID: enter the model name you chose in Step 5 (e.g. `nemotron-3-super:120b`). - This may take 1-2 minutes as the Ollama model is spun up in the background. - Endpoint ID: leave the default value. -- Alias: gpt-oss:120b (this is optional). +- Alias: enter the same model name (this is optional). - Channel: Select **Skip for now**. - Skills: Select **No** for now. - Enable hooks: Select **No** for now and press Enter. @@ -317,6 +367,19 @@ OpenClaw gateway starting in background. Logs: /tmp/gateway.log UI: http://127.0.0.1:18789/?token=9b4c9a9c9f6905131327ce55b6d044bd53e0ec423dd6189e ``` + +Now that we have configured OpenClaw within the OpenShell sandbox, let's store the sandbox name in an environment variable. This will make future commands easier to run. Note that the name of the sandbox was set in the `openshell sandbox create` command in Step 8 using the `--name` flag. + +```bash +export SANDBOX_NAME=dgx-demo +``` + +To verify the default policy enabled for your sandbox, run: + +```bash +openshell sandbox get $SANDBOX_NAME +``` + If you are using the Spark as the primary device, right-click on the URL in the UI section and select Open Link. **Accessing the dashboard from the host or a remote system:** The dashboard URL (e.g. `http://127.0.0.1:18789/?token=...`) is inside the sandbox, so the host does not forward port 18789 by default. To reach it from your host or another machine, use SSH local port forwarding.
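In general, an SSH local forward binds a port on your workstation and relays it through the SSH endpoint to a remote port. The sketch below illustrates only the generic shape of the technique; the endpoint and ports are placeholders, not values from this playbook, and the exact command for your gateway is given next.

```shell
# Generic shape of an SSH local forward (all values here are placeholders):
#   ssh -N -L <local-port>:<target-host>:<target-port> <ssh-endpoint>
# -N keeps the tunnel open without running a remote command.
#
# Once a tunnel is up, confirm the local end answers before opening the
# browser; curl reports "000" while nothing is listening on the port yet.
curl -s -o /dev/null -w '%{http_code}\n' 'http://127.0.0.1:18789/' || true
```

If the check prints `000`, the tunnel is not established yet; re-check the forward arguments and that the gateway is reachable from your machine.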
From a machine that can reach the OpenShell gateway, run (replace gateway URL, sandbox-id, token, and gateway-name with values from your environment): @@ -337,7 +400,7 @@ From this page, you can now **Chat** with your OpenClaw agent within the protect Now that OpenClaw has been configured within the OpenShell protected runtime, you can connect directly into the sandbox environment via: ```bash -openshell sandbox connect dgx-demo +openshell sandbox connect $SANDBOX_NAME ``` Once loaded into the sandbox terminal, you can test connectivity to the Ollama model with this command: @@ -373,14 +436,17 @@ Verify that the OpenClaw agent can reach `inference.local` for model requests an If you exit the sandbox session, reconnect at any time: ```bash -openshell sandbox connect +openshell sandbox connect $SANDBOX_NAME ``` -To transfer files in or out (replace `` with your sandbox name from the creation output or from `openshell sandbox list`): +> [!NOTE] +> `openshell sandbox connect` is interactive-only: it opens a terminal session inside the sandbox. There is no way to pass a command for non-interactive execution. Use `openshell sandbox upload`/`download` for file transfers, or use the SSH proxy for scripted access (see Step 9). + +To transfer files in or out of the sandbox, use the following commands: ```bash -openshell sandbox upload ./local-file /sandbox/destination -openshell sandbox download /sandbox/file ./local-destination +openshell sandbox upload $SANDBOX_NAME ./local-file /sandbox/destination +openshell sandbox download $SANDBOX_NAME /sandbox/file ./local-destination ``` ## Step 13.
Cleanup @@ -388,7 +454,13 @@ openshell sandbox download /sandbox/file ./local-destination Stop and remove the sandbox: ```bash -openshell sandbox delete +openshell sandbox delete $SANDBOX_NAME +``` + +Remove the inference provider you created in Step 6: + +```bash +openshell provider delete local-ollama ``` Stop the gateway (preserves state for later): @@ -407,7 +479,7 @@ openshell gateway destroy To also remove the Ollama model: ```bash -ollama rm gpt-oss:120b +ollama rm nemotron-3-super:120b ``` ## Step 14. Next steps