---
name: dgx-spark-live-vlm-webui
description: Real-time Vision Language Model interaction with webcam streaming — on NVIDIA DGX Spark. Use when setting up live-vlm-webui on Spark hardware.
---
<!-- GENERATED:BEGIN from nvidia/live-vlm-webui/README.md -->
# Live VLM WebUI
> Real-time Vision Language Model interaction with webcam streaming
Live VLM WebUI is a universal web interface for real-time Vision Language Model (VLM) interaction and benchmarking. It lets you stream your webcam directly to any VLM backend (Ollama, vLLM, SGLang, or cloud APIs) and receive live AI-powered analysis. This makes it well suited to testing VLMs, benchmarking performance across different hardware configurations, and exploring vision AI capabilities.
The interface provides WebRTC-based video streaming, integrated GPU monitoring, customizable prompts, and support for multiple VLM backends. It works seamlessly with the powerful Blackwell GPU in your DGX Spark, enabling real-time vision inference at impressive speeds.
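The full setup lives in the playbook below; as a rough illustration of what "streaming frames to any VLM backend" means at the API level, here is a minimal sketch that packages a single webcam frame into an OpenAI-style chat-completions request, the format accepted by OpenAI-compatible endpoints such as Ollama's. The model name `gemma3:4b` and the helper name are illustrative assumptions, not part of Live VLM WebUI itself.

```python
import base64
import json


def build_vlm_request(frame_bytes: bytes, prompt: str,
                      model: str = "gemma3:4b") -> dict:
    """Build an OpenAI-style chat-completions payload for one video frame.

    The frame is embedded inline as a base64 data URI, the image format
    used by OpenAI-compatible chat endpoints. Model name is an assumption.
    """
    b64 = base64.b64encode(frame_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }


# Example with a placeholder frame (a real app would send JPEG-encoded
# webcam frames at a steady rate and POST each payload to the backend).
payload = build_vlm_request(b"\xff\xd8\xff\xe0fake-jpeg",
                            "Describe the scene in one sentence.")
print(json.dumps(payload)[:72], "...")
```

A live client would POST this payload repeatedly (one request per sampled frame) to the backend's chat-completions route and render each response as it arrives.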
**Outcome**: You'll set up a complete real-time vision AI testing environment on your DGX Spark that allows you to:
- Stream webcam video and get instant VLM analysis through a web browser
- Test and compare different vision language models (Gemma 3, Llama Vision, Qwen VL, etc.)
- Monitor GPU and system performance in real-time while models process video frames
- Customize prompts for various use cases (object detection, scene description, OCR, safety monitoring)
- Access the interface from any device on your network with a web browser
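Live VLM WebUI ships its own integrated GPU monitoring; purely to illustrate the kind of query behind the "monitor GPU and system performance" step, here is a hedged sketch that reads utilization and memory via `nvidia-smi`'s CSV output. The function names are invented for this example and do not come from the tool's codebase.

```python
import subprocess


def parse_gpu_csv_line(line: str) -> dict:
    """Parse one 'util, mem_used, mem_total' CSV row into numbers."""
    util, used, total = (float(field.strip()) for field in line.split(","))
    return {"util_pct": util, "mem_used_mib": used, "mem_total_mib": total}


def read_gpu_stats() -> list[dict]:
    """Query per-GPU utilization and memory (requires nvidia-smi on PATH)."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True)
    return [parse_gpu_csv_line(line) for line in out.strip().splitlines()]


# Parsing demo on a sample row (read_gpu_stats() needs real NVIDIA hardware).
print(parse_gpu_csv_line("37, 2048, 131072"))
```

Polling this in a loop while the model processes frames gives a crude view of the same utilization and memory figures the WebUI charts in real time.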
**Full playbook**: `nvidia/live-vlm-webui/README.md`
<!-- GENERATED:END -->