Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git, synced 2026-04-25 03:13:53 +00:00
chore: Regenerate all playbooks
Parent: 56c57950bd
Commit: 809947301c
@@ -12,7 +12,7 @@
 
 ## Overview
 
-## Basic Idea
+## Basic idea
 
 This playbook demonstrates how to build and deploy a comprehensive knowledge graph generation and visualization solution that serves as a reference for knowledge graph extraction.
 The unified memory architecture enables running larger, more accurate models that produce higher-quality knowledge graphs and deliver superior downstream GraphRAG performance.
@@ -43,16 +43,16 @@ The setup includes:
 
 ## Time & risk
 
-**Duration**:
+⏱️ **Duration**:
 - 2-3 minutes for initial setup and container deployment
 - 5-10 minutes for Ollama model download (depending on model size)
 - Immediate document processing and knowledge graph generation
 
-**Risks**:
+⚠️ **Risks**:
 - GPU memory requirements depend on chosen Ollama model size
 - Document processing time scales with document size and complexity
 
-**Rollback**: Stop and remove Docker containers, delete downloaded models if needed
+↩️ **Rollback**: Stop and remove Docker containers, delete downloaded models if needed
 
 ## Instructions
 
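The **Rollback** line in the hunk above could be scripted roughly as follows. This is a sketch under assumptions, not the playbook's own procedure: the container name `ollama-compose` and model tag `llama3.1:8b` are taken from the cleanup command quoted in the next hunk header, while the `run` and `rollback` helpers and the `DRY_RUN` switch are hypothetical names introduced here. It covers only the Ollama container; any other containers started by the playbook would need the same treatment.

```bash
#!/usr/bin/env bash
# Sketch only: run(), rollback() and DRY_RUN are hypothetical helpers.

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Remove the downloaded model, then stop and remove the container.
# Defaults mirror the names quoted in this diff; pass others as arguments.
rollback() {
  local container="${1:-ollama-compose}"
  local model="${2:-llama3.1:8b}"
  # The model must be deleted while the container is still running.
  run docker exec "$container" ollama rm "$model"
  run docker stop "$container"
  run docker rm "$container"
}
```

Calling `DRY_RUN=1 rollback` prints the three commands without executing them, which is a cheap way to review what would be removed before running `rollback` for real.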
@@ -149,7 +149,7 @@ docker exec ollama-compose ollama rm llama3.1:8b
 
 | Symptom | Cause | Fix |
 |---------|--------|-----|
-| Ollama performance issues | Suboptimal settings for DGX Spark | Set environment variables: `OLLAMA_FLASH_ATTENTION=1` (enables flash attention for better performance), `OLLAMA_KEEP_ALIVE=30m` (keeps model loaded for 30 minutes), `OLLAMA_MAX_LOADED_MODELS=1` (avoids VRAM contention), `OLLAMA_KV_CACHE_TYPE=q8_0` (reduces KV cache VRAM with minimal performance impact) |
+| Ollama performance issues | Suboptimal settings for DGX Spark | Set environment variables:<br>`OLLAMA_FLASH_ATTENTION=1` (enables flash attention for better performance)<br>`OLLAMA_KEEP_ALIVE=30m` (keeps model loaded for 30 minutes)<br>`OLLAMA_MAX_LOADED_MODELS=1` (avoids VRAM contention)<br>`OLLAMA_KV_CACHE_TYPE=q8_0` (reduces KV cache VRAM with minimal performance impact) |
 | VRAM exhausted or memory pressure (e.g. when switching between Ollama models) | Linux buffer cache consuming GPU memory | Flush buffer cache: `sudo sync; sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'` |
 | Slow triple extraction | Large model or large context window | Reduce document chunk size or use faster models |
 | ArangoDB connection refused | Service not fully started | Wait 30s after start.sh, verify with `docker ps` |
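Two fixes from the troubleshooting table above lend themselves to a short script: setting the Ollama environment variables, and waiting for ArangoDB instead of sleeping a fixed 30 seconds. The sketch below is illustrative only. The file name `ollama.env` and the helpers `write_ollama_env` and `wait_for_tcp` are hypothetical, and it assumes bash (for `/dev/tcp`) and that ArangoDB is reachable on its default port 8529 on `localhost`, which the diff does not confirm.

```bash
#!/usr/bin/env bash
# Sketch only: file name, helper names, host and port are assumptions.

# Write the four Ollama settings from the table to an env file that
# `docker run --env-file` or a Compose `env_file:` entry could load.
write_ollama_env() {
  local path="${1:-ollama.env}"   # hypothetical file name
  cat > "$path" <<'EOF'
OLLAMA_FLASH_ATTENTION=1
OLLAMA_KEEP_ALIVE=30m
OLLAMA_MAX_LOADED_MODELS=1
OLLAMA_KV_CACHE_TYPE=q8_0
EOF
}

# Poll a TCP port once per second until it accepts a connection.
# Returns 0 when the port is reachable, 1 once the timeout (seconds) is hit.
wait_for_tcp() {
  local host="$1" port="$2" timeout="${3:-30}" waited=0
  # The subshell opens and immediately drops the connection.
  until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A possible use after starting the stack is `wait_for_tcp localhost 8529 30 && docker ps`, which proceeds as soon as the port opens and gives up after about 30 seconds otherwise. How the env file is wired into the Ollama container depends on how the playbook launches it, so treat that part as a starting point.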