diff --git a/nvidia/connect-to-your-spark/README.md b/nvidia/connect-to-your-spark/README.md index db91b27..74e3dc2 100644 --- a/nvidia/connect-to-your-spark/README.md +++ b/nvidia/connect-to-your-spark/README.md @@ -42,7 +42,7 @@ up its IP address. ## What you'll accomplish -You will establish secure SSH access to your DGX Spark device using either NVIDIA Sync or manual +You will establish secure SSH access to your DGX Spark device using either NVIDIA Sync or a manual SSH configuration. NVIDIA Sync provides a graphical interface for device management with integrated app launching, while manual SSH gives you direct command-line control with port forwarding capabilities. Both approaches enable you to run terminal commands, access web @@ -57,18 +57,16 @@ applications, and manage your DGX Spark remotely from your laptop. ## Prerequisites -- DGX Spark device is set up and you have created a local user account +- Your DGX Spark [device is set up](https://docs.nvidia.com/dgx/dgx-spark/first-boot.html) and you have created a local user account - Your laptop and DGX Spark are on the same network - You have your DGX Spark username and password -- You have your device's mDNS hostname (printed on quick start guide) or IP address - - +- You have your device's mDNS hostname (printed on the Quick Start Guide) or IP address ## Time & risk - **Time estimate:** 5-10 minutes - **Risk level:** Low - SSH setup involves credential configuration but no system-level changes to the DGX Spark device -- **Rollback:** SSH key removal can be done by editing `~/.ssh/authorized_keys` on the DGX Spark. +- **Rollback:** SSH key removal can be done by editing `~/.ssh/authorized_keys` on your DGX Spark. ## Connect with NVIDIA Sync @@ -101,12 +99,12 @@ Download and install NVIDIA Sync on your computer to get started. 
curl -fsSL https://workbench.download.nvidia.com/stable/linux/gpgkey | sudo tee -a /etc/apt/trusted.gpg.d/ai-workbench-desktop-key.asc echo "deb https://workbench.download.nvidia.com/stable/linux/debian default proprietary" | sudo tee -a /etc/apt/sources.list ``` -* Update package lists +* Update package lists: ``` sudo apt update ``` -* Install NVIDIA Sync +* Install NVIDIA Sync: ``` sudo apt install nvidia-sync @@ -147,7 +145,7 @@ Finally, connect your DGX Spark by filling out the form: > Your password is used only during this initial setup to configure SSH key-based authentication. It is not stored or transmitted after setup completion. NVIDIA Sync will SSH into your device and > configure its locally provisioned SSH key pair. -Click add "Add" and NVIDIA Sync will automatically: +Click the "Add" button and NVIDIA Sync will automatically: 1. Generate an SSH key pair on your laptop 2. Connect to your DGX Spark using your provided username and password @@ -163,15 +161,12 @@ Click add "Add" and NVIDIA Sync will automatically: Once connected, NVIDIA Sync appears as a system tray/taskbar application. Click the NVIDIA Sync icon to open the device management interface. -Clicking on the large "Connect" and "Disconnect" buttons controls the overall SSH connection to your device. - -**Set working directory** (optional): Choose a default directory that Apps will open in +- **SSH connection**: Clicking on the large "Connect" and "Disconnect" buttons controls the overall SSH connection to your device. +- **Set working directory** (optional): Choose a default directory that Apps will open in when launched through NVIDIA Sync. This defaults to your home directory on the remote device. - -**Launch applications**: Click on any configured app to open it with automatic SSH +- **Launch applications**: Click on any configured app to open it with automatic SSH connection to your DGX Spark. 
- -"Custom Ports" are configured on the Settings screen to provide access to custom web apps or APIs running on your device. +- **Customize ports** (optional): "Custom Ports" are configured on the Settings screen to provide access to custom web apps or APIs running on your device. ## Step 5. Validate SSH setup @@ -192,14 +187,14 @@ or ssh ``` -On the DGX Spark, verify you're connected +On the DGX Spark, verify you're connected: ```bash hostname whoami ``` -Exit the SSH session +Exit the SSH session: ```bash exit @@ -210,8 +205,8 @@ exit Test your setup by launching a development tool: - Click the NVIDIA Sync system tray icon. - Select "Terminal" to open a terminal session on your DGX Spark. -- Select "DGX Dashboard" to use Jupyterlab and manage updates. -- Try [a custom port example with Open WebUI](/spark/open-webui/sync) +- Select "DGX Dashboard" to use JupyterLab and manage updates. +- Try [a custom port example with Open WebUI](/spark/open-webui/sync). ## Connect with Manual SSH @@ -233,11 +228,11 @@ Collect the required connection details for your DGX Spark: - **Username**: Your DGX Spark user account name - **Password**: Your DGX Spark account password -- **Hostname**: Your device's mDNS hostname (from quick start guide, e.g., `spark-abcd.local`) -- **IP Address**: Alternative only needed if mDNS doesn't work on your network as described below +- **Hostname**: Your device's mDNS hostname (from the Quick Start Guide, e.g., `spark-abcd.local`) +- **IP Address**: An alternative only needed if mDNS doesn't work on your network as described below In some network configurations, like complex corporate environments, mDNS won't work as expected -and you'll have to use your device's IP address directly to connect. You know you are in this situation when +and you'll have to use your device's IP address directly to connect. 
You'll know you are in this situation when
you try to SSH and the command hangs indefinitely or you get an error like:

```
ssh: Could not resolve hostname spark-abcd.local: Name or service not known
@@ -246,7 +241,7 @@

**Testing mDNS Resolution**

-To test if mDNS is working, use the `ping` utility.
+To test if mDNS is working, use the `ping` utility:

```bash
ping spark-abcd.local
@@ -262,7 +257,7 @@ PING spark-abcd.local (10.9.1.9): 56 data bytes
64 bytes from 10.9.1.9: icmp_seq=2 ttl=64 time=33.301 ms
```

-If mDNS is **not** working and you will have to use your IP directly, you should see something like this:
+If mDNS is **not** working, meaning you will have to use your IP address directly, you will see something like this:

```
$ ping -c 3 spark-abcd.local
@@ -282,6 +277,8 @@ Connect to your DGX Spark for the first time to verify basic connectivity:
ssh <username>@<hostname>.local
```

+or
+
```bash
## Alternative: Connect using IP address
ssh <username>@<ip-address>
@@ -289,7 +286,7 @@ ssh <username>@<ip-address>
Replace placeholders with your actual values:

- `<username>`: Your DGX Spark account name
-- `<hostname>`: Device hostname without .local suffix
+- `<hostname>`: Device hostname without `.local` suffix
- `<ip-address>`: Your device's IP address

On first connection, you'll see a host fingerprint warning. Type `yes` and press Enter,
@@ -313,7 +310,8 @@

To access web applications running on your DGX Spark, use SSH port forwarding. In this example
we'll access the DGX Dashboard web application.

-DGX Dashboard runs on localhost, port 11000.
+> [!NOTE]
+> DGX Dashboard runs on localhost, port 11000.
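If you open this tunnel regularly, the same forward can be kept in your SSH client configuration instead of retyped each time. A sketch of a `~/.ssh/config` entry, using hypothetical example values (`spark-abcd.local` and user `nvidia`) that you would replace with your own:

```
# ~/.ssh/config — example entry; HostName and User are placeholder values
Host dgx-spark
    HostName spark-abcd.local          # or your device's IP address
    User nvidia                        # your DGX Spark username
    LocalForward 11000 localhost:11000
```

With an entry like this, `ssh dgx-spark` opens the session with the forward already in place, and the dashboard is reachable at `http://localhost:11000`.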
Open the tunnel:

diff --git a/nvidia/open-webui/README.md b/nvidia/open-webui/README.md
index 430551a..bd61bd7 100644
--- a/nvidia/open-webui/README.md
+++ b/nvidia/open-webui/README.md
@@ -5,8 +5,8 @@

## Table of Contents

- [Overview](#overview)
-- [Instructions](#instructions)
-- [Setup Open WebUI on Remote Spark with NVIDIA Sync](#setup-open-webui-on-remote-spark-with-nvidia-sync)
+- [Set up Open WebUI on Remote Spark with NVIDIA Sync](#set-up-open-webui-on-remote-spark-with-nvidia-sync)
+- [Set Up Manually](#set-up-manually)
- [Troubleshooting](#troubleshooting)

---
@@ -16,27 +16,22 @@

## Basic idea

Open WebUI is an extensible, self-hosted AI interface that operates entirely offline.
-This playbook shows you how to deploy Open WebUI with an integrated Ollama server on your DGX Spark device using
-NVIDIA Sync. The setup creates a secure SSH tunnel that lets you access the web
-interface from your local browser while the models run on Spark's GPU.
+This playbook shows you how to deploy Open WebUI with an integrated Ollama server on your DGX Spark device, letting you access the web interface from your local browser while the models run on Spark's GPU.

## What you'll accomplish

-You will have a fully functional Open WebUI installation running on your DGX Spark, accessible through
-your local web browser via NVIDIA Sync's managed SSH tunneling. The setup includes integrated Ollama
-for model management, persistent data storage, and GPU acceleration for model inference.
+You will have a fully functional Open WebUI installation running on your DGX Spark. This will be accessible through your local web browser either via **NVIDIA Sync's managed SSH tunneling (recommended)** or via manual setup. The setup includes integrated Ollama for model management, persistent data storage, and GPU acceleration for model inference.
## What to know before starting -- How to use NVIDIA Sync to connect to your DGX Spark device +- How to [Set Up Local Network Access](/spark/connect-to-your-spark) to your DGX Spark device ## Prerequisites -- DGX Spark device is set up and accessible -- NVIDIA Sync installed and connected to your DGX Spark +- DGX Spark [device is set up](https://docs.nvidia.com/dgx/dgx-spark/first-boot.html) and accessible +- [Local Network Access](/spark/connect-to-your-spark) to your DGX Spark - Enough disk space for the Open WebUI container image and model downloads - ## Time & risk * **Duration**: 15-20 minutes for initial setup, plus model download time (varies by model size) @@ -44,121 +39,7 @@ for model management, persistent data storage, and GPU acceleration for model in * Docker permission issues may require user group changes and session restart * Large model downloads may take significant time depending on network speed -## Instructions - -## Step 1. Configure Docker permissions - -To easily manage containers without sudo, you must be in the `docker` group. If you choose to skip this step, you will need to run Docker commands with sudo. - -Open a new terminal and test Docker access. In the terminal, run: - -```bash -docker ps -``` - -If you see a permission denied error (something like permission denied while trying to connect to the Docker daemon socket), add your user to the docker group so that you don't need to run the command with sudo . - -```bash -sudo usermod -aG docker $USER -newgrp docker -``` - -## Step 2. Verify Docker setup and pull container - -Pull the Open WebUI container image with integrated Ollama: - -```bash -docker pull ghcr.io/open-webui/open-webui:ollama -``` - -## Step 3. 
Start the Open WebUI container - -Start the Open WebUI container by running: - -```bash -docker run -d -p 8080:8080 --gpus=all \ - -v open-webui:/app/backend/data \ - -v open-webui-ollama:/root/.ollama \ - --name open-webui ghcr.io/open-webui/open-webui:ollama -``` - -This will start the Open WebUI container and make it accessible at `http://localhost:8080`. You can access the Open WebUI interface from your local web browser. - -Application data will be stored in the `open-webui` volume and model data will be stored in the `open-webui-ollama` volume. - -## Step 4. Create administrator account - -This step sets up the initial administrator account for Open WebUI. This is a local account that you will use to access the Open WebUI interface. - -In the Open WebUI interface, click the "Get Started" button at the bottom of the screen. - -Fill out the administrator account creation form with your preferred credentials. - -Click the registration button to create your account and access the main interface. - -## Step 5. Download and configure a model - -This step downloads a language model through Ollama and configures it for use in -Open WebUI. The download happens on your DGX Spark device and may take several minutes. - -Click on the "Select a model" dropdown in the top left corner of the Open WebUI interface. - -Type `gpt-oss:20b` in the search field. - -Click the "Pull 'gpt-oss:20b' from Ollama.com" button that appears. - -Wait for the model download to complete. You can monitor progress in the interface. - -Once complete, select "gpt-oss:20b" from the model dropdown. - -## Step 6. Test the model - -This step verifies that the complete setup is working properly by testing model -inference through the web interface. - -In the chat text area at the bottom of the Open WebUI interface, enter: **Write me a haiku about GPUs** - -Press Enter to send the message and wait for the model's response. - -## Step 8. 
Cleanup and rollback - -Steps to completely remove the Open WebUI installation and free up resources: - -> [!WARNING] -> These commands will permanently delete all Open WebUI data and downloaded models. - -Stop and remove the Open WebUI container: - -```bash -docker stop open-webui -docker rm open-webui -``` - -Remove the downloaded images: - -```bash -docker rmi ghcr.io/open-webui/open-webui:ollama -``` - -Remove persistent data volumes: - -```bash -docker volume rm open-webui open-webui-ollama -``` - -## Step 9. Next steps - -Try downloading different models from the Ollama library at https://ollama.com/library. - -You can monitor GPU and memory usage through the DGX Dashboard available in NVIDIA Sync as you try different models. - -If Open WebUI reports an update is available, you can update the container image by running: - -```bash -docker pull ghcr.io/open-webui/open-webui:ollama -``` - -## Setup Open WebUI on Remote Spark with NVIDIA Sync +## Set up Open WebUI on Remote Spark with NVIDIA Sync > [!TIP] > If you haven't already installed NVIDIA Sync, [learn how here.](/spark/connect-to-your-spark/sync) @@ -173,7 +54,7 @@ Open the Terminal app from NVIDIA Sync to start an interactive SSH session and t docker ps ``` -If you see a permission denied error (something like permission denied while trying to connect to the Docker daemon socket), add your user to the docker group so that you don't need to run the command with sudo . +If you see a permission denied error (something like permission denied while trying to connect to the Docker daemon socket), add your user to the docker group so that you don't need to run the command with sudo. ```bash sudo usermod -aG docker $USER @@ -206,7 +87,7 @@ Once the container image is downloaded, continue to setup NVIDIA Sync. A Custom port is used to automatically start the Open WebUI container and set up port forwarding. -Click the "Add New" button on the Custom tab. +- Click the "Add New" button on the Custom tab. 
Fill out the form with these values: @@ -258,13 +139,12 @@ echo "Running. Press Ctrl+C to stop ${NAME}." while :; do sleep 86400; done ``` -- Click the "Add" button to save configuration to your DGX Spark. +- Click the "Add" button to save the configuration to your DGX Spark. ## Step 5. Launch Open WebUI -Click on the NVIDIA Sync icon in your system tray or taskbar to open the main application window. - -Under the "Custom" section, click on "Open WebUI". +- Click on the NVIDIA Sync icon in your system tray or taskbar to open the main application window. +- Under the "Custom" section, click on "Open WebUI". Your default web browser should automatically open to the Open WebUI interface at `http://localhost:12000`. @@ -276,43 +156,35 @@ Your default web browser should automatically open to the Open WebUI interface a To start using Open WebUI you must create an initial administrator account. This is a local account that you will use to access the Open WebUI interface. -In the Open WebUI interface, click the "Get Started" button at the bottom of the screen. - -Fill out the administrator account creation form with your preferred credentials. - -Click the registration button to create your account and access the main interface. +- In the Open WebUI interface, click the "Get Started" button at the bottom of the screen. +- Fill out the administrator account creation form with your preferred credentials. +- Click the registration button to create your account and access the main interface. ## Step 7. Download and configure a model Next, download a language model with Ollama and configure it for use in Open WebUI. This download happens on your DGX Spark device and may take several minutes. -Click on the "Select a model" dropdown in the top left corner of the Open WebUI interface. - -Type `gpt-oss:20b` in the search field. - -Click the `Pull "gpt-oss:20b" from Ollama.com` button that appears. - -Wait for the model download to complete. 
You can monitor progress in the interface. - -Once complete, select "gpt-oss:20b" from the model dropdown. +- Click on the "Select a model" dropdown in the top left corner of the Open WebUI interface. +- Type `gpt-oss:20b` in the search field. +- Click the `Pull "gpt-oss:20b" from Ollama.com` button that appears. +- Wait for the model download to complete. You can monitor progress in the interface. +- Once complete, select "gpt-oss:20b" from the model dropdown. ## Step 8. Test the model -In the chat textarea at the bottom of the Open WebUI interface, enter: **Write me a haiku about GPUs** +You can verify that the setup is working properly by testing the model. -Press Enter to send the message and wait for the model's response. +- In the chat text area at the bottom of the Open WebUI interface, enter: **Write me a haiku about GPUs**. +- Press Enter to send the message and wait for the model's response. ## Step 9. Stop the Open WebUI When you are finished with your session and want to stop the Open WebUI server and reclaim resources, close the Open WebUI from NVIDIA Sync. -Click on the NVIDIA Sync icon in your system tray or taskbar to open the main application window. - -Under the "Custom" section, click the `x` icon on the right of the "Open WebUI" entry. - -This will close the tunnel and stop the Open WebUI docker container. - +- Click on the NVIDIA Sync icon in your system tray or taskbar to open the main application window. +- Under the "Custom" section, click the `x` icon on the right of the "Open WebUI" entry. +- This will close the tunnel and stop the Open WebUI docker container. ## Step 10. Next steps @@ -320,7 +192,7 @@ Try downloading different models from the Ollama library at https://ollama.com/l You can monitor GPU and memory usage through the DGX Dashboard available in NVIDIA Sync as you try different models. 
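If the Open WebUI page ever stops loading at `http://localhost:12000`, it helps to know whether the forwarded port itself is reachable before debugging further. A minimal, cross-platform sketch (the `port_open` helper name is ours; 12000 is the Custom App port configured earlier):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # 12000 is the local port NVIDIA Sync forwards for Open WebUI in this guide.
    state = "reachable" if port_open("localhost", 12000) else "not reachable"
    print(f"localhost:12000 is {state}")
```

If the port is not reachable, re-launch the "Open WebUI" entry from NVIDIA Sync to restart the tunnel before suspecting the container itself.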
-If Open WebUI reports an update is available, you can pull the the container image by running this in your terminal: +If Open WebUI reports an update is available, you can pull the container image by running this in your terminal: ```bash docker stop open-webui @@ -332,7 +204,7 @@ After the update, launch Open WebUI again from NVIDIA Sync. ## Step 11. Cleanup and rollback -Steps to completely remove the Open WebUI installation and free up resources: +Steps to completely remove the Open WebUI installation and free up resources. > [!WARNING] > These commands will permanently delete all Open WebUI data and downloaded models. @@ -358,17 +230,116 @@ docker volume rm open-webui open-webui-ollama Remove the Custom App from NVIDIA Sync by opening Settings > Custom tab and deleting the entry. +## Set Up Manually + +## Step 1. Configure Docker permissions + +To easily manage containers without sudo, you must be in the `docker` group. If you choose to skip this step, you will need to run Docker commands with sudo. + +Open a new terminal and test Docker access. In the terminal, run: + +```bash +docker ps +``` + +If you see a permission denied error (something like permission denied while trying to connect to the Docker daemon socket), add your user to the docker group so that you don't need to run the command with sudo. + +```bash +sudo usermod -aG docker $USER +newgrp docker +``` + +## Step 2. Verify Docker setup and pull container + +Pull the Open WebUI container image with integrated Ollama: + +```bash +docker pull ghcr.io/open-webui/open-webui:ollama +``` + +## Step 3. Start the Open WebUI container + +Start the Open WebUI container by running: + +```bash +docker run -d -p 8080:8080 --gpus=all \ + -v open-webui:/app/backend/data \ + -v open-webui-ollama:/root/.ollama \ + --name open-webui ghcr.io/open-webui/open-webui:ollama +``` + +This will start the Open WebUI container and make it accessible at `http://localhost:8080`. 
You can access the Open WebUI interface from your local web browser.
+
+> [!NOTE]
+> Application data will be stored in the `open-webui` volume and model data will be stored in the `open-webui-ollama` volume.
+
+## Step 4. Create administrator account
+
+Set up the initial administrator account for Open WebUI. This is a local account that you will use to access the Open WebUI interface.
+
+- In the Open WebUI interface, click the "Get Started" button at the bottom of the screen.
+- Fill out the administrator account creation form with your preferred credentials.
+- Click the registration button to create your account and access the main interface.
+
+## Step 5. Download and configure a model
+
+You'll then download a language model through Ollama and configure it for use in
+Open WebUI. This download happens on your DGX Spark device and may take several minutes.
+
+- Click on the "Select a model" dropdown in the top left corner of the Open WebUI interface.
+- Type `gpt-oss:20b` in the search field.
+- Click the "Pull 'gpt-oss:20b' from Ollama.com" button that appears.
+- Wait for the model download to complete. You can monitor progress in the interface.
+- Once complete, select "gpt-oss:20b" from the model dropdown.
+
+## Step 6. Test the model
+
+You can verify that the setup is working properly by testing model
+inference through the web interface.
+
+- In the chat text area at the bottom of the Open WebUI interface, enter: **Write me a haiku about GPUs**.
+- Press Enter to send the message and wait for the model's response.
+
+## Step 7. Next steps
+
+Try downloading different models from the Ollama library at https://ollama.com/library.
+
+You can try this [setup with NVIDIA Sync](/spark/open-webui/sync) so that you can monitor GPU and memory usage through the DGX Dashboard as you try different models.
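Because `docker run -d` returns before Open WebUI finishes booting, anything that scripts this manual setup usually needs to wait for the web interface to come up. A sketch of such a readiness check (the `wait_for_http` helper name and the 120-second budget are our choices; the URL is the port mapping from Step 3):

```python
import time
import urllib.error
import urllib.request

def wait_for_http(url: str, timeout_s: float = 120.0) -> bool:
    """Poll `url` until the server answers any HTTP status, or give up."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True  # server answered with a success status
        except urllib.error.HTTPError:
            return True      # server answered with an error status: it's up
        except OSError:
            time.sleep(2)    # not accepting connections yet; retry
    return False

if __name__ == "__main__":
    up = wait_for_http("http://localhost:8080", timeout_s=120)
    print("Open WebUI is up" if up else "timed out waiting for Open WebUI")
```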
+ +If Open WebUI reports an update is available, you can update the container image by running: + +```bash +docker pull ghcr.io/open-webui/open-webui:ollama +``` + +## Step 8. Cleanup and rollback + +Steps to completely remove the Open WebUI installation and free up resources. + +> [!WARNING] +> These commands will permanently delete all Open WebUI data and downloaded models. + +Stop and remove the Open WebUI container: + +```bash +docker stop open-webui +docker rm open-webui +``` + +Remove the downloaded images: + +```bash +docker rmi ghcr.io/open-webui/open-webui:ollama +``` + +Remove persistent data volumes: + +```bash +docker volume rm open-webui open-webui-ollama +``` + ## Troubleshooting -## Common issues with manual setup - -| Symptom | Cause | Fix | -|---------|-------|-----| -| Permission denied on docker ps | User not in docker group | Run Step 1 completely, including logging out and logging back in or use sudo| -| Model download fails | Network connectivity issues | Check internet connection, retry download | -| GPU not detected in container | Missing `--gpus=all flag` | Recreate container with correct command | -| Port 8080 already in use | Another application using port | Change port in docker command or stop conflicting service | - ## Common issues with setting up via NVIDIA Sync | Symptom | Cause | Fix | @@ -379,7 +350,16 @@ Remove the Custom App from NVIDIA Sync by opening Settings > Custom tab and dele | GPU not detected in container | Missing `--gpus=all flag` | Recreate container with correct start script | | Port 12000 already in use | Another application using port | Change port in Custom App settings or stop conflicting service | -> > [!NOTE] +## Common issues with manual setup + +| Symptom | Cause | Fix | +|---------|-------|-----| +| Permission denied on docker ps | User not in docker group | Run Step 1 completely, including logging out and logging back in or use sudo| +| Model download fails | Network connectivity issues | Check internet 
connection, retry download |
+| GPU not detected in container | Missing `--gpus=all` flag | Recreate container with correct command |
+| Port 8080 already in use | Another application using port | Change port in docker command or stop conflicting service |
+
+> [!NOTE]
> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with: