
---
name: dgx-spark-live-vlm-webui
description: Real-time Vision Language Model interaction with webcam streaming — on NVIDIA DGX Spark. Use when setting up live-vlm-webui on Spark hardware.
---

# Live VLM WebUI

Real-time Vision Language Model interaction with webcam streaming

Live VLM WebUI is a universal web interface for real-time Vision Language Model (VLM) interaction and benchmarking. It enables you to stream your webcam directly to any VLM backend (Ollama, vLLM, SGLang, or cloud APIs) and receive live AI-powered analysis. This tool is perfect for testing VLM models, benchmarking performance across different hardware configurations, and exploring vision AI capabilities.

The interface provides WebRTC-based video streaming, integrated GPU monitoring, customizable prompts, and support for multiple VLM backends. On DGX Spark it runs against the system's Blackwell GPU, enabling real-time vision inference locally.
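Under the hood, each captured frame reaches a backend such as Ollama as a base64-encoded image attached to an ordinary `/api/generate` request. A minimal Python sketch of that per-frame payload (the model name `gemma3:4b` and the placeholder frame bytes are illustrative, not values the tool mandates):

```python
import base64
import json

def build_vlm_request(model: str, prompt: str, frame_jpeg: bytes) -> dict:
    """Build an Ollama /api/generate payload for one video frame.

    Ollama's generate endpoint accepts base64-encoded images in the
    `images` field alongside a text prompt.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(frame_jpeg).decode("ascii")],
        "stream": False,
    }

# One frame's worth of request (placeholder bytes stand in for a JPEG):
payload = build_vlm_request("gemma3:4b", "Describe the scene.", b"\xff\xd8fake\xff\xd9")
print(json.dumps(payload)[:60])
```

The WebUI automates this loop: it grabs frames over WebRTC, encodes them, and POSTs payloads like the one above to whichever backend you configured.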

Outcome: You'll set up a complete real-time vision AI testing environment on your DGX Spark that allows you to:

- Stream webcam video and get instant VLM analysis through a web browser
- Test and compare different vision language models (Gemma 3, Llama Vision, Qwen VL, etc.)
- Monitor GPU and system performance in real-time while models process video frames
- Customize prompts for various use cases (object detection, scene description, OCR, safety monitoring)
- Access the interface from any device on your network with a web browser
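The prompt-customization use cases listed above can be thought of as a small set of presets; a hypothetical sketch (the keys and wording are illustrative, not the tool's actual defaults):

```python
# Hypothetical prompt presets matching the use cases above
# (names and phrasing are illustrative, not shipped defaults):
PROMPT_PRESETS = {
    "scene_description": "Describe what you see in this frame in one sentence.",
    "object_detection": "List every distinct object visible in the frame.",
    "ocr": "Transcribe any readable text in the frame verbatim.",
    "safety_monitoring": "Report any hazards or unsafe behavior visible in the frame.",
}

print(len(PROMPT_PRESETS))  # → 4
```

Swapping prompts like these while the stream runs is how the same model is repurposed from scene description to OCR to monitoring without restarting anything.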

Full playbook: /Users/jkneen/Documents/GitHub/dgx-spark-playbooks/nvidia/live-vlm-webui/README.md