Merge 59bedc4afe into 08c06d5bd9

2026-06-24 07:09:31 +00:00 · 2026-04-05 23:56:32 +01:00
4 changed files with 7 additions and 18 deletions
--- a/nvidia/nemo-fine-tune/README.md
+++ b/nvidia/nemo-fine-tune/README.md
@@ -47,8 +47,8 @@ All necessary files for the playbook can be found [here on GitHub](https://githu
 * **Duration:** 45-90 minutes for complete setup and initial model fine-tuning
 * **Risks:** Model downloads can be large (several GB), ARM64 package compatibility issues may require troubleshooting, distributed training setup complexity increases with multi-node configurations
 * **Rollback:** Virtual environments can be completely removed; no system-level changes are made to the host system beyond package installations.
-* **Last Updated:** 03/04/2026
+* **Last Updated:** 01/15/2026
-  * Recommend running Nemo finetune workflow via Docker
+  * Fix qLoRA fine-tuning workflow
 ## Instructions
--- a/nvidia/nemoclaw/README.md
+++ b/nvidia/nemoclaw/README.md
@@ -172,15 +172,12 @@ Verify the NVIDIA runtime works:
 docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
 ```
-If you get a permission denied error on `docker`, add your user to the Docker group and activate the new group in your current session:
+If you get a permission denied error on `docker`, add your user to the Docker group and log out/in:
 ```bash
 sudo usermod -aG docker $USER
 newgrp docker
 ```
 This applies the group change immediately. Alternatively, you can log out and back in instead of running `newgrp docker`.
 > [!NOTE]
 > DGX Spark uses cgroup v2. OpenShell's gateway embeds k3s inside Docker and needs host cgroup namespace access. Without `default-cgroupns-mode: host`, the gateway can fail with "Failed to start ContainerManager" errors.
@@ -325,21 +322,13 @@ http://127.0.0.1:18789/#token=<long-token-here>
 **If accessing the Web UI from a remote machine**, you need to set up port forwarding.
 First, find your Spark's IP address. On the Spark, run:
 ```bash
 hostname -I | awk '{print $1}'
 ```
 This prints the primary IP address (e.g. `192.168.1.42`). You can also find it in **Settings > Wi-Fi** or **Settings > Network** on the Spark's desktop, or check your router's connected-devices list.
 Start the port forward on the Spark host:
 ```bash
 openshell forward start 18789 my-assistant --background
 ```
-Then from your remote machine, create an SSH tunnel to the Spark (replace `<your-spark-ip>` with the IP address from above):
+Then from your remote machine, create an SSH tunnel to the Spark:
 ```bash
 ssh -L 18789:127.0.0.1:18789 <your-user>@<your-spark-ip>
--- a/nvidia/txt2kg/assets/deploy/compose/docker-compose.yml
+++ b/nvidia/txt2kg/assets/deploy/compose/docker-compose.yml
@@ -27,8 +27,8 @@ services:
      # Ollama configuration
      - OLLAMA_BASE_URL=http://ollama:11434/v1
      - OLLAMA_MODEL=llama3.1:8b
-      # vLLM disabled in default Ollama mode
+      # Disable vLLM
-      # - VLLM_BASE_URL=http://localhost:8001/v1
+      - VLLM_BASE_URL=http://localhost:8001/v1
      - VLLM_MODEL=disabled
      # Vector DB configuration
      - QDRANT_URL=http://qdrant:6333
--- a/nvidia/txt2kg/assets/frontend/lib/text-processor.ts
+++ b/nvidia/txt2kg/assets/frontend/lib/text-processor.ts
@@ -108,7 +108,7 @@ export class TextProcessor {
    // Determine which LLM provider to use based on configuration
    // Priority: vLLM > NVIDIA > Ollama
-    if (process.env.VLLM_BASE_URL && process.env.VLLM_MODEL && process.env.VLLM_MODEL !== 'disabled') {
+    if (process.env.VLLM_BASE_URL) {
      this.selectedLLMProvider = 'vllm';
    } else if (process.env.NVIDIA_API_KEY) {
      this.selectedLLMProvider = 'nvidia';
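
Taken together, the `docker-compose.yml` and `text-processor.ts` hunks change which provider is selected under the default compose environment. The sketch below is a minimal illustration, not the repository's code: `selectProviderBefore`, `selectProviderAfter`, `Env`, and `composeEnv` are hypothetical names, the environment is passed in as a plain object instead of being read from `process.env`, and the final `'ollama'` fallback is an assumption based on the "Priority: vLLM > NVIDIA > Ollama" comment, since the `else` branch is not shown in the hunk.

```typescript
type Provider = 'vllm' | 'nvidia' | 'ollama';
type Env = Record<string, string | undefined>;

// Condition from the removed (-) line: vLLM requires a base URL
// and a model name that is set and not the literal 'disabled'.
function selectProviderBefore(env: Env): Provider {
  if (env.VLLM_BASE_URL && env.VLLM_MODEL && env.VLLM_MODEL !== 'disabled') {
    return 'vllm';
  } else if (env.NVIDIA_API_KEY) {
    return 'nvidia';
  }
  return 'ollama'; // assumed fallback; not visible in the diff
}

// Condition from the added (+) line: any non-empty base URL selects vLLM.
function selectProviderAfter(env: Env): Provider {
  if (env.VLLM_BASE_URL) {
    return 'vllm';
  } else if (env.NVIDIA_API_KEY) {
    return 'nvidia';
  }
  return 'ollama'; // assumed fallback; not visible in the diff
}

// Environment values as they appear on the post-merge side of the compose hunk.
const composeEnv: Env = {
  OLLAMA_BASE_URL: 'http://ollama:11434/v1',
  OLLAMA_MODEL: 'llama3.1:8b',
  VLLM_BASE_URL: 'http://localhost:8001/v1',
  VLLM_MODEL: 'disabled',
};

console.log(selectProviderBefore(composeEnv)); // ollama
console.log(selectProviderAfter(composeEnv)); // vllm
```

Under these assumptions, the post-merge compose file sets `VLLM_BASE_URL` while leaving `VLLM_MODEL=disabled`, and the simplified check no longer consults `VLLM_MODEL`, so vLLM would be selected even in the default Ollama mode.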