From 434aae8c54418d7f1a75282e83f6fb7b75cbb7bc Mon Sep 17 00:00:00 2001
From: GitLab CI
Date: Mon, 6 Oct 2025 16:42:09 +0000
Subject: [PATCH] chore: Regenerate all playbooks

---
 README.md               |   1 +
 nvidia/txt2kg/README.md | 158 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 159 insertions(+)
 create mode 100644 nvidia/txt2kg/README.md

diff --git a/README.md b/README.md
index 6c88f30..7e3d4b1 100644
--- a/README.md
+++ b/README.md
@@ -45,6 +45,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
 - [Stack two Sparks](nvidia/stack-sparks/)
 - [Setup Tailscale on your Spark](nvidia/tailscale/)
 - [TRT LLM for Inference](nvidia/trt-llm/)
+- [Text to Knowledge Graph](nvidia/txt2kg/)
 - [Unsloth on DGX Spark](nvidia/unsloth/)
 - [Install and use vLLM](nvidia/vllm/)
 - [Vision-Language Model Fine-tuning](nvidia/vlm-finetuning/)

diff --git a/nvidia/txt2kg/README.md b/nvidia/txt2kg/README.md
new file mode 100644
index 0000000..e1020ca
--- /dev/null
+++ b/nvidia/txt2kg/README.md
@@ -0,0 +1,158 @@
# Text to Knowledge Graph

> Transform unstructured text into interactive knowledge graphs using local LLM inference, with GPU-accelerated visualization

## Table of Contents

- [Overview](#overview)
- [Instructions](#instructions)

---

## Overview

This playbook demonstrates how to build and deploy a complete knowledge graph generation and visualization stack that serves as a reference for knowledge graph extraction.
DGX Spark's unified memory architecture enables running larger, more accurate models that produce higher-quality knowledge graphs and deliver superior downstream GraphRAG performance.

The txt2kg playbook transforms unstructured text documents into structured knowledge graphs using:
- **Knowledge Triple Extraction**: Ollama with GPU acceleration for local LLM inference to extract subject-predicate-object relationships (a minimal sketch follows this list)
- **Graph Database Storage**: ArangoDB for storing and querying knowledge triples with relationship traversal
- **Vector Embeddings**: Local SentenceTransformer models for entity embeddings and semantic search
- **GPU-Accelerated Visualization**: Three.js WebGPU rendering for interactive 2D/3D graph exploration
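Conceptually, each document chunk is sent to the local Ollama endpoint with an extraction prompt. The sketch below illustrates the idea with a single `curl` call against Ollama's standard `/api/generate` API; the prompt wording and the pipe-separated output format are illustrative assumptions, not the playbook's actual prompt.

```bash
# Minimal sketch of triple extraction against the local Ollama API.
# llama3.1:8b is the playbook's default model; the prompt and the
# "subject | predicate | object" output format are assumptions.
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "stream": false,
  "prompt": "Extract knowledge triples from the text below. Return one triple per line as: subject | predicate | object\n\nText: DGX Spark is built by NVIDIA and ships with a unified memory architecture."
}' | jq -r '.response'
```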
## What you'll accomplish

You will have a fully functional system that can process documents, generate and edit knowledge graphs, and answer queries, all through an interactive web interface.

The setup includes:
- **Local LLM Inference**: Ollama for GPU-accelerated LLM inference with no API keys required
- **Graph Database**: ArangoDB for storing and querying triples with relationship traversal
- **Vector Search**: Local Pinecone-compatible storage for entity embeddings and KNN search
- **Interactive Visualization**: GPU-accelerated graph rendering with Three.js WebGPU
- **Modern Web Interface**: Next.js frontend with document management and query interface
- **Fully Containerized**: Reproducible deployment with Docker Compose and GPU support

## Prerequisites

- DGX Spark with the latest NVIDIA drivers
- Docker installed and configured with the NVIDIA Container Toolkit
- Docker Compose

## Time & risk

**Duration**:
- 2-3 minutes for initial setup and container deployment
- 5-10 minutes for the Ollama model download (depending on model size)
- Document processing and knowledge graph generation start immediately afterwards

**Risks**:
- GPU memory requirements depend on the chosen Ollama model size
- Document processing time scales with document size and complexity

**Rollback**: Stop and remove the Docker containers; delete downloaded models if needed.

## Instructions

## Step 1. Clone the repository

In a terminal, clone the txt2kg repository and change into the project directory:

```bash
git clone https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets.git txt2kg
cd txt2kg
```

## Step 2. Start the txt2kg services

Use the provided start script to launch all required services. This sets up Ollama, ArangoDB, local Pinecone, and the Next.js frontend:

```bash
./start.sh
```

The script automatically:
- Checks for GPU availability
- Starts the Docker Compose services
- Sets up the ArangoDB database
- Initializes the local Pinecone vector storage
- Launches the web interface

## Step 3. Pull an Ollama model (optional)

Download a language model for knowledge extraction. The default model is Llama 3.1 8B:

```bash
docker exec ollama-compose ollama pull <model-name>
```

Browse available models at [https://ollama.com/search](https://ollama.com/search).

> **Note**: The unified memory architecture enables running larger models (e.g., 70B parameters), which produce significantly more accurate knowledge triples and deliver superior GraphRAG performance.

## Step 4. Access the web interface

Open your browser and navigate to:

```
http://localhost:3001
```

You can also access the individual services:
- **ArangoDB Web Interface**: http://localhost:8529
- **Ollama API**: http://localhost:11434
- **Local Pinecone**: http://localhost:5081

## Step 5. Upload documents and build knowledge graphs

#### 5.1. Document Upload
- Use the web interface to upload text documents (Markdown, plain text, and CSV are supported)
- Documents are automatically chunked and processed for triple extraction

#### 5.2. Knowledge Graph Generation
- The system extracts subject-predicate-object triples using Ollama
- Triples are stored in ArangoDB for relationship querying
- Entity embeddings are generated and stored in local Pinecone (optional)

#### 5.3. Interactive Visualization
- View your knowledge graph in 2D or 3D with GPU-accelerated rendering
- Explore nodes and relationships interactively

#### 5.4. Graph-based RAG Queries
- Ask questions about your documents using the query interface
- Graph traversal enhances context with entity relationships from ArangoDB (see the traversal sketch after this list)
- The system uses KNN search to find relevant entities in the vector database (optional)
- The LLM generates responses using the enriched graph context
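For intuition, the graph-enhanced retrieval in 5.4 boils down to a traversal that collects a query entity's neighborhood before prompting the LLM. The sketch below runs a 1-to-2-hop AQL traversal through ArangoDB's standard HTTP cursor API; the database name (`txt2kg`), graph name (`knowledge`), start vertex, edge attribute, and credentials are illustrative assumptions, not the playbook's actual schema.

```bash
# Hypothetical AQL traversal over the stored triples; all names below
# (database, graph, start vertex, credentials) are assumptions.
curl -s -u root: http://localhost:8529/_db/txt2kg/_api/cursor -d '{
  "query": "FOR v, e IN 1..2 ANY @start GRAPH \"knowledge\" RETURN {subject: e._from, predicate: e.predicate, object: e._to}",
  "bindVars": { "start": "entities/nvidia" }
}' | jq '.result'
```

The returned edge list is what would be flattened into extra context for the LLM prompt.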
## Step 6. Troubleshooting

Common issues and solutions for the txt2kg setup on DGX Spark.

| Symptom | Cause | Fix |
|---------|-------|-----|
| Ollama performance issues | Suboptimal settings for DGX Spark | Set environment variables: `OLLAMA_FLASH_ATTENTION=1` (enables flash attention for better performance), `OLLAMA_KEEP_ALIVE=30m` (keeps the model loaded for 30 minutes), `OLLAMA_MAX_LOADED_MODELS=1` (avoids VRAM contention), `OLLAMA_KV_CACHE_TYPE=q8_0` (reduces KV-cache VRAM with minimal performance impact); see the example at the end of this playbook |
| VRAM exhausted or memory pressure (e.g., when switching between Ollama models) | Linux buffer cache consuming GPU memory | Flush the buffer cache: `sudo sync; sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'` |
| Slow triple extraction | Large model or large context window | Reduce the document chunk size or use a faster model |
| ArangoDB connection refused | Service not fully started | Wait 30s after `start.sh`, then verify with `docker ps` |

## Step 7. Cleanup and rollback

Stop all services and optionally remove containers, volumes, and models:

```bash
# Stop services
docker compose down

# Remove containers and volumes (optional)
docker compose down -v

# Remove downloaded models (optional)
docker exec ollama-compose ollama rm llama3.1:8b
```

## Step 8. Next steps

- Experiment with different Ollama models to trade off extraction quality against speed
- Customize the triple extraction prompts for domain-specific knowledge
- Explore advanced Graph-based RAG features
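As a follow-up to the tuning row in Step 6, one way to apply the recommended Ollama variables is to export them before relaunching the stack. Whether `start.sh` forwards host environment variables into the Ollama container depends on the project's compose file, so treat this purely as a sketch.

```bash
# Sketch only: apply the Step 6 Ollama tuning before restarting the stack.
# Whether these host variables reach the container depends on the compose file.
export OLLAMA_FLASH_ATTENTION=1      # enable flash attention
export OLLAMA_KEEP_ALIVE=30m         # keep the model loaded for 30 minutes
export OLLAMA_MAX_LOADED_MODELS=1    # avoid VRAM contention
export OLLAMA_KV_CACHE_TYPE=q8_0     # quantize the KV cache to save VRAM
./start.sh
```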