From b9c45a61a062f6f1a4d00fa9fab9fb438a683d7c Mon Sep 17 00:00:00 2001
From: GitLab CI
Date: Fri, 10 Oct 2025 17:27:37 +0000
Subject: [PATCH] chore: Regenerate all playbooks

---
 README.md                     |   3 +-
 nvidia/stack-sparks/README.md |   2 +-
 nvidia/txt2kg/README.md       |  24 +++---
 nvidia/vibe-coding/README.md  | 153 ----------------------------------
 4 files changed, 13 insertions(+), 169 deletions(-)
 delete mode 100644 nvidia/vibe-coding/README.md

diff --git a/README.md b/README.md
index 16d037c..ddff0e1 100644
--- a/README.md
+++ b/README.md
@@ -42,12 +42,11 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
 - [RAG application in AI Workbench](nvidia/rag-ai-workbench/)
 - [SGLang Inference Server](nvidia/sglang/)
 - [Speculative Decoding](nvidia/speculative-decoding/)
-- [Connect two Sparks](nvidia/stack-sparks/)
+- [Stack two Sparks](nvidia/stack-sparks/)
 - [Set up Tailscale on your Spark](nvidia/tailscale/)
 - [TRT LLM for Inference](nvidia/trt-llm/)
 - [Text to Knowledge Graph](nvidia/txt2kg/)
 - [Unsloth on DGX Spark](nvidia/unsloth/)
-- [Vibe Coding in VS Code](nvidia/vibe-coding/)
 - [Install and use vLLM](nvidia/vllm/)
 - [Vision-Language Model Fine-tuning](nvidia/vlm-finetuning/)
 - [Install VS Code](nvidia/vscode/)
diff --git a/nvidia/stack-sparks/README.md b/nvidia/stack-sparks/README.md
index 2b2213b..d3b8bcc 100644
--- a/nvidia/stack-sparks/README.md
+++ b/nvidia/stack-sparks/README.md
@@ -1,4 +1,4 @@
-# Connect two Sparks
+# Stack two Sparks
 
 > Connect two Spark devices and setup them up for inference and fine-tuning
 
diff --git a/nvidia/txt2kg/README.md b/nvidia/txt2kg/README.md
index 1473e67..ba3fb9c 100644
--- a/nvidia/txt2kg/README.md
+++ b/nvidia/txt2kg/README.md
@@ -1,6 +1,6 @@
 # Text to Knowledge Graph
 
-> Transform unstructured text using LLM inference into interactive knowledge graphs with GPU-accelerated visualization
+> Transform unstructured text into interactive knowledge graphs using local GPU-accelerated LLM inference and graph visualization
 
 ## Table of Contents
 
@@ -20,16 +20,16 @@ The unified memory architecture enables running larger, more accurate models tha
 This txt2kg playbook transforms unstructured text documents into structured knowledge graphs using:
 - **Knowledge Triple Extraction**: Using Ollama with GPU acceleration for local LLM inference to extract subject-predicate-object relationships
 - **Graph Database Storage**: ArangoDB for storing and querying knowledge triples with relationship traversal
-- **Vector Embeddings**: Local SentenceTransformer models for entity embeddings and semantic search
 - **GPU-Accelerated Visualization**: Three.js WebGPU rendering for interactive 2D/3D graph exploration
 
+> **Future Enhancements**: Vector embeddings and GraphRAG capabilities are planned enhancements.
+
 ## What you'll accomplish
 
 You will have a fully functional system capable of processing documents, generating and editing knowledge graphs, and providing querying, accessible through an interactive web interface. The setup includes:
 
 - **Local LLM Inference**: Ollama for GPU-accelerated LLM inference with no API keys required
 - **Graph Database**: ArangoDB for storing and querying triples with relationship traversal
-- **Vector Search**: Local Pinecone-compatible storage for entity embeddings and KNN search
 - **Interactive Visualization**: GPU-accelerated graph rendering with Three.js WebGPU
 - **Modern Web Interface**: Next.js frontend with document management and query interface
 - **Fully Containerized**: Reproducible deployment with Docker Compose and GPU support
@@ -67,7 +67,7 @@ cd ${MODEL}/assets
 
 ## Step 2. Start the txt2kg services
 
-Use the provided start script to launch all required services. This will set up Ollama, ArangoDB, local Pinecone, and the Next.js frontend:
+Use the provided start script to launch all required services. This will set up Ollama, ArangoDB, and the Next.js frontend:
 
 ```bash
 ./start.sh
@@ -77,7 +77,6 @@ The script will automatically:
 - Check for GPU availability
 - Start Docker Compose services
 - Set up ArangoDB database
-- Initialize local Pinecone vector storage
 - Launch the web interface
 
 ## Step 3. Pull an Ollama model (optional)
@@ -90,7 +89,7 @@ docker exec ollama-compose ollama pull
 
 Browse available models at [https://ollama.com/search](https://ollama.com/search)
 
-> **Note**: The unified memory architecture enables running larger models like 70B parameters, which produce significantly more accurate knowledge triples and deliver superior GraphRAG performance.
+> **Note**: The unified memory architecture enables running larger models like 70B parameters, which produce significantly more accurate knowledge triples.
 
 ## Step 4. Access the web interface
 
@@ -103,7 +102,6 @@ http://localhost:3001
 You can also access individual services:
 - **ArangoDB Web Interface**: http://localhost:8529
 - **Ollama API**: http://localhost:11434
-- **Local Pinecone**: http://localhost:5081
 
 ## Step 5. Upload documents and build knowledge graphs
 
@@ -114,19 +112,19 @@
 #### 5.2. Knowledge Graph Generation
 - The system extracts subject-predicate-object triples using Ollama
 - Triples are stored in ArangoDB for relationship querying
-- Entity embeddings are generated and stored in local Pinecone (optional)
 
 #### 5.3. Interactive Visualization
 - View your knowledge graph in 2D or 3D with GPU-accelerated rendering
 - Explore nodes and relationships interactively
 
-#### 5.4. Graph-based RAG Queries
+#### 5.4. Graph-based Queries
 - Ask questions about your documents using the query interface
 - Graph traversal enhances context with entity relationships from ArangoDB
-- The system uses KNN search to find relevant entities in the vector database (optional)
 - LLM generates responses using the enriched graph context
 
-## Step 7. Cleanup and rollback
+> **Future Enhancement**: GraphRAG capabilities with vector-based KNN search for entity retrieval are planned.
+
+## Step 6. Cleanup and rollback
 
 Stop all services and optionally remove containers:
 
@@ -141,11 +139,11 @@ docker compose down -v
 docker exec ollama-compose ollama rm llama3.1:8b
 ```
 
-## Step 8. Next steps
+## Step 7. Next steps
 
 - Experiment with different Ollama models for varied extraction quality
 - Customize triple extraction prompts for domain-specific knowledge
-- Explore advanced Graph-based RAG features
+- Explore advanced graph querying and visualization features
 
 ## Troubleshooting
diff --git a/nvidia/vibe-coding/README.md b/nvidia/vibe-coding/README.md
deleted file mode 100644
index 67a7627..0000000
--- a/nvidia/vibe-coding/README.md
+++ /dev/null
@@ -1,153 +0,0 @@
-# Vibe Coding in VS Code
-
-> Use DGX Spark as a local or remote Vibe Coding assistant with Ollama and Continue.dev
-
-## Table of Contents
-
-- [Overview](#overview)
-  - [What You'll Accomplish](#what-youll-accomplish)
-  - [Prerequisites](#prerequisites)
-  - [Requirements](#requirements)
-- [Instructions](#instructions)
-- [Troubleshooting](#troubleshooting)
-
----
-
-## Overview
-
-## DGX Spark Vibe Coding
-
-This playbook walks you through setting up DGX Spark as a **Vibe Coding assistant** — locally or as a remote coding companion for VSCode with Continue.dev.
-While NVIDIA NIMs are not yet widely supported, this guide uses **Ollama** with **GPT-OSS 120B** to provide a high-performance local LLM environment.
-
-### What You'll Accomplish
-
-You’ll have a fully configured DGX Spark system capable of:
-- Running local code assistance through Ollama.
-- Serving models remotely for Continue.dev and VSCode integration.
-- Hosting large LLMs like GPT-OSS 120B using unified memory.
-
-### Prerequisites
-
-- DGX Spark (128GB unified memory recommended)
-- Internet access for model downloads
-- Basic familiarity with the terminal
-- Optional: firewall control for remote access configuration
-
-### Requirements
-
-- **Ollama** and an LLM of your choice (e.g., `gpt-oss:120b`)
-- **VSCode**
-- **Continue.dev** VSCode extension
-
-## Instructions
-
-## Step 1. Install Ollama
-
-Install the latest version of Ollama using the following command:
-
-```bash
-curl -fsSL https://ollama.com/install.sh | sh
-```
-
-Start the Ollama service:
-
-```bash
-ollama serve
-```
-
-Once the service is running, pull the desired model:
-
-```bash
-ollama pull gpt-oss:120b
-```
-
-## Step 2. (Optional) Enable Remote Access
-
-To allow remote connections (e.g., from a workstation using VSCode and Continue.dev), modify the Ollama systemd service:
-
-```bash
-sudo systemctl edit ollama
-```
-
-Add the following lines beneath the commented section:
-
-```ini
-[Service]
-Environment="OLLAMA_HOST=0.0.0.0:11434"
-Environment="OLLAMA_ORIGINS=*"
-```
-
-Reload and restart the service:
-
-```bash
-sudo systemctl daemon-reload
-sudo systemctl restart ollama
-```
-
-If using a firewall, open port 11434:
-
-```bash
-sudo ufw allow 11434/tcp
-```
-
-## Step 3. Install VSCode
-
-For DGX Spark (ARM-based), download and install VSCode:
-
-```bash
-wget https://code.visualstudio.com/sha/download?build=stable&os=linux-deb-arm64 -O vscode-arm64.deb
-sudo apt install ./vscode-arm64.deb
-```
-
-If using a remote workstation, install VSCode appropriate for your system architecture.
-
-## Step 4. Install Continue.dev Extension
-
-Open VSCode and install **Continue.dev** from the Marketplace.
-After installation, click the Continue icon on the right-hand bar.
-
-Skip login and open the manual configuration via the **gear (⚙️)** icon.
-This opens `config.yaml`, which controls model settings.
-
-## Step 5. Local Inference Setup
-
-- In the Continue chat window, use `Ctrl/Cmd + L` to focus the chat.
-- Click **Select Model → + Add Chat Model**
-- Choose **Ollama** as the provider.
-- Set **Install Provider** to default.
-- For **Model**, select **Autodetect**.
-- Click **Connect**.
-
-You can now select your downloaded model (e.g., `gpt-oss:120b`) for local inference.
-
-## Step 6. Remote Setup for DGX Spark
-
-To connect Continue.dev to a remote DGX Spark instance, edit `config.yaml` in Continue and add:
-
-```yaml
-models:
-  - model: gpt-oss:120b
-    title: gpt-oss:120b
-    apiBase: http://YOUR_SPARK_IP:11434/
-    provider: ollama
-```
-
-Replace `YOUR_SPARK_IP` with the IP address of your DGX Spark.
-Add additional model entries for any other Ollama models you wish to host remotely.
-
-## Troubleshooting
-
-## Common Issues
-
-**1. Ollama not starting**
-- Verify Docker and GPU drivers are installed correctly.
-- Run `ollama serve` manually to view errors.
-
-**2. VSCode can’t connect**
-- Ensure port 11434 is open and accessible from your workstation.
-- Check `OLLAMA_HOST` and `OLLAMA_ORIGINS` in `/etc/systemd/system/ollama.service.d/override.conf`.
-
-**3. High memory usage**
-- Use smaller models such as `gpt-oss:20b` for lightweight usage.
-- Confirm no other large models or containers are running with `nvidia-smi`.