Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git (synced 2026-04-27 12:23:51 +00:00)

Comparing commits `ca924d9fb3...7fe6178523` (4 commits)
- `7fe6178523`
- `8452a1c5b1`
- `9414a5141f`
- `48fc5eb30e`
````diff
@@ -172,12 +172,15 @@ Verify the NVIDIA runtime works:
 docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
 ```
 
-If you get a permission denied error on `docker`, add your user to the Docker group and log out/in:
+If you get a permission denied error on `docker`, add your user to the Docker group and activate the new group in your current session:
 
 ```bash
 sudo usermod -aG docker $USER
+newgrp docker
 ```
 
+This applies the group change immediately. Alternatively, you can log out and back in instead of running `newgrp docker`.
+
 > [!NOTE]
 > DGX Spark uses cgroup v2. OpenShell's gateway embeds k3s inside Docker and needs host cgroup namespace access. Without `default-cgroupns-mode: host`, the gateway can fail with "Failed to start ContainerManager" errors.
````
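The group change in the hunk above can be sanity-checked before running any `docker` command; a minimal sketch, assuming a POSIX shell with `id` available:

```shell
# List the groups active in the *current* session; membership added by
# usermod only shows up here after 'newgrp docker' or a fresh login.
if id -nG | tr ' ' '\n' | grep -qx docker; then
    echo "docker group is active in this session"
else
    echo "docker group is not active yet - run 'newgrp docker' or log out/in"
fi
```

`id -nG` reports the session's supplementary groups, which is what the Docker socket permission check actually sees.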
````diff
@@ -322,13 +325,21 @@ http://127.0.0.1:18789/#token=<long-token-here>
 
 **If accessing the Web UI from a remote machine**, you need to set up port forwarding.
 
+First, find your Spark's IP address. On the Spark, run:
+
+```bash
+hostname -I | awk '{print $1}'
+```
+
+This prints the primary IP address (e.g. `192.168.1.42`). You can also find it in **Settings > Wi-Fi** or **Settings > Network** on the Spark's desktop, or check your router's connected-devices list.
+
 Start the port forward on the Spark host:
 
 ```bash
 openshell forward start 18789 my-assistant --background
 ```
 
-Then from your remote machine, create an SSH tunnel to the Spark:
+Then from your remote machine, create an SSH tunnel to the Spark (replace `<your-spark-ip>` with the IP address from above):
 
 ```bash
 ssh -L 18789:127.0.0.1:18789 <your-user>@<your-spark-ip>
````
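If the tunnel is something you open often, the same forward can live in `~/.ssh/config` instead of being retyped (a sketch; the `spark` host alias is hypothetical, and `<your-spark-ip>`/`<your-user>` are the placeholders from the command above):

```
Host spark
    HostName <your-spark-ip>
    User <your-user>
    LocalForward 18789 127.0.0.1:18789
```

With this in place, `ssh spark` opens the same tunnel.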
```diff
@@ -27,8 +27,8 @@ services:
       # Ollama configuration
       - OLLAMA_BASE_URL=http://ollama:11434/v1
       - OLLAMA_MODEL=llama3.1:8b
-      # Disable vLLM
-      - VLLM_BASE_URL=http://localhost:8001/v1
+      # vLLM disabled in default Ollama mode
+      # - VLLM_BASE_URL=http://localhost:8001/v1
       - VLLM_MODEL=disabled
       # Vector DB configuration
       - QDRANT_URL=http://qdrant:6333
```
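Going by the selection logic this changeset introduces (vLLM is only picked when `VLLM_MODEL` is set to something other than `disabled`), switching the same block back to vLLM mode would look roughly like this sketch; the model name is a placeholder, not a value from the playbook:

```yaml
      # vLLM mode: uncomment the base URL and set a real model name
      - VLLM_BASE_URL=http://localhost:8001/v1
      - VLLM_MODEL=<your-vllm-model>   # placeholder; must not be 'disabled'
      # Ollama variables can stay set; vLLM takes priority when enabled
```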
```diff
@@ -108,7 +108,7 @@ export class TextProcessor {
 
     // Determine which LLM provider to use based on configuration
     // Priority: vLLM > NVIDIA > Ollama
-    if (process.env.VLLM_BASE_URL) {
+    if (process.env.VLLM_BASE_URL && process.env.VLLM_MODEL && process.env.VLLM_MODEL !== 'disabled') {
      this.selectedLLMProvider = 'vllm';
    } else if (process.env.NVIDIA_API_KEY) {
      this.selectedLLMProvider = 'nvidia';
```
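Pulled out of the class for illustration, the fallback chain this hunk tightens can be sketched as a standalone function (a hypothetical helper, not code from the repo; names mirror the snippet above):

```typescript
type Env = Record<string, string | undefined>;

// Mirrors the priority in the diff: vLLM only when a base URL is set and
// the model is a real value, then NVIDIA (API key present), else Ollama.
function selectLLMProvider(env: Env): "vllm" | "nvidia" | "ollama" {
  if (env.VLLM_BASE_URL && env.VLLM_MODEL && env.VLLM_MODEL !== "disabled") {
    return "vllm";
  }
  if (env.NVIDIA_API_KEY) {
    return "nvidia";
  }
  return "ollama";
}

// With the compose defaults (VLLM_MODEL=disabled), Ollama is selected:
console.log(selectLLMProvider({
  VLLM_BASE_URL: "http://localhost:8001/v1",
  VLLM_MODEL: "disabled",
})); // prints "ollama"
```

This is why leaving `VLLM_BASE_URL` set no longer hijacks provider selection: the guard on `VLLM_MODEL !== 'disabled'` is what actually flips the mode.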
````diff
@@ -171,10 +171,12 @@ Add additional model entries for any other Ollama models you wish to host remotely:
 
 | Symptom | Cause | Fix |
 |---------|-------|-----|
-|Ollama not starting|GPU drivers may not be installed correctly|Run `nvidia-smi` in the terminal. If the command fails check DGX Dashboard for updates to your DGX Spark.|
-|Continue can't connect over the network|Port 11434 may not be open or accessible|Run command `ss -tuln \| grep 11434`. If the output does not reflect ` tcp LISTEN 0 4096 *:11434 *:* `, go back to step 2 and run the ufw command.|
-|Continue can't detect a locally running Ollama model|Configuration not properly set or detected|Check `OLLAMA_HOST` and `OLLAMA_ORIGINS` in `/etc/systemd/system/ollama.service.d/override.conf` file. If `OLLAMA_HOST` and `OLLAMA_ORIGINS` are set correctly, add these lines to your `~/.bashrc` file.|
-|High memory usage|Model size too big|Confirm no other large models or containers are running with `nvidia-smi`. Use smaller models such as `gpt-oss:20b` for lightweight usage.|
+| **WiFi connection drops or becomes unreachable** (especially in headless mode) | Aggressive WiFi power-saving settings in NetworkManager | Edit `/etc/NetworkManager/conf.d/default-wifi-powersave-on.conf`, set `wifi.powersave = 2`, and run `sudo systemctl restart NetworkManager`. |
+| **Random reboots and "00" error code on the display** | Watchdog timer module (`sbsa_gwdt`) not loaded | Add `sbsa_gwdt` to `/etc/modules-load.d/watchdog.conf` and reboot to ensure the hardware watchdog is correctly managed by the kernel. |
+| Ollama not starting | GPU drivers may not be installed correctly | Run `nvidia-smi` in the terminal. If the command fails check DGX Dashboard for updates to your DGX Spark. |
+| Continue can't connect over the network | Port 11434 may not be open or accessible | Run command `ss -tuln \| grep 11434`. If the output does not reflect `tcp LISTEN 0 4096 *:11434 *:*`, go back to step 2 and run the ufw command. |
+| Continue can't detect a locally running Ollama model | Configuration not properly set or detected | Check `OLLAMA_HOST` and `OLLAMA_ORIGINS` in `/etc/systemd/system/ollama.service.d/override.conf` file. If `OLLAMA_HOST` and `OLLAMA_ORIGINS` are set correctly, add these lines to your `~/.bashrc` file. |
+| High memory usage | Model size too big | Confirm no other large models or containers are running with `nvidia-smi`. Use smaller models such as `gpt-oss:20b` for lightweight usage. |
 
 > [!NOTE]
 > DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
````
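The `ss` check from the connectivity row above can be wrapped into a one-shot script; a sketch, assuming `ss` (from `iproute2`) is installed:

```shell
# Report whether anything listens on the Ollama port; otherwise point
# at the usual suspects (the ollama service and the ufw rule).
if ss -tuln | grep -q ':11434 '; then
    echo "port 11434 is listening"
else
    echo "port 11434 is not listening - check 'systemctl status ollama' and your ufw rule"
fi
```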