Build a Video Search and Summarization (VSS) Agent

Run the VSS Blueprint on your Spark

Overview

Basic idea

Deploy NVIDIA's Video Search and Summarization (VSS) AI Blueprint to build intelligent video analytics systems that combine vision language models, large language models, and retrieval-augmented generation. The system transforms raw video content into real-time actionable insights with video summarization, Q&A, and real-time alerts. You'll set up either a completely local Event Reviewer deployment or a hybrid deployment using remote model endpoints.

What you'll accomplish

You will deploy NVIDIA's VSS AI Blueprint on NVIDIA Spark hardware with Blackwell architecture, choosing between two deployment scenarios: VSS Event Reviewer (completely local with VLM pipeline) or Standard VSS (hybrid deployment with remote LLM/embedding endpoints). This includes setting up Alert Bridge, VLM Pipeline, Alert Inspector UI, Video Storage Toolkit, and optional DeepStream CV pipeline for automated video analysis and event review.

What to know before starting

Working with NVIDIA Docker containers and container registries
Setting up Docker Compose environments with shared networks
Managing environment variables and authentication tokens
Basic understanding of video processing and analysis workflows

Prerequisites

NVIDIA Spark device with ARM64 architecture and Blackwell GPU
DGX OS (suggested: 7.4.0 or higher)
Driver version 580.95.05 or higher installed: nvidia-smi | grep "Driver Version"
CUDA version 13.0 installed: nvcc --version
Docker installed and running: docker --version && docker compose version
Access to NVIDIA Container Registry with NGC API Key
NVIDIA Container Toolkit
[Optional] NVIDIA API Key for remote model endpoints (hybrid deployment only)
Sufficient storage space for video processing (>10GB recommended in /tmp/)
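
The storage prerequisite can be checked before starting. The following is a minimal sketch; the helper name `free_gib` is hypothetical and not part of the VSS repository, and the 10 GiB threshold simply mirrors the recommendation above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the VSS repository): print the free space,
# in whole GiB, of the filesystem that holds the given path.
free_gib() {
    # df -P gives stable POSIX columns; -k reports sizes in KiB.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# Example: warn when /tmp does not have more than 10 GiB available.
# [ "$(free_gib /tmp)" -gt 10 ] || echo "Low disk space in /tmp" >&2
```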

Ancillary files

VSS Blueprint GitHub Repository - Main codebase and Docker Compose configurations
VSS Official Documentation - Complete system documentation

Time & risk

Duration: 30-45 minutes for initial setup, additional time for video processing validation
Risks:
- Container startup can be resource-intensive and time-consuming with large model downloads
- Network configuration conflicts if shared network already exists
- Remote API endpoints may have rate limits or connectivity issues (hybrid deployment)
Rollback: Stop all containers with deploy/docker/scripts/dev-profile.sh down
Last Updated: 06/17/2026
- Update required OS and Driver versions
- Support for VSS 3.2.0 with Cosmos Reason 2 VLM

Instructions

Step 1. Verify environment requirements

Check that your system meets the hardware and software prerequisites.

## Verify driver version
nvidia-smi | grep "Driver Version"
## Expected output: Driver Version: 580.95.05 or higher

## Verify CUDA version
nvcc --version
## Expected output: release 13.0

## Verify Docker is running
docker --version && docker compose version
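
Reading the driver version by eye is error-prone, so the comparison against the minimum can be scripted. This is a sketch under assumptions: `version_ge` is a hypothetical helper, it relies on GNU `sort -V`, and the minimum version is copied from the prerequisites.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
# Relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Minimum driver version taken from the prerequisites.
MIN_DRIVER="580.95.05"

# Only query the driver when nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" "$MIN_DRIVER"; then
        echo "Driver $driver meets the minimum $MIN_DRIVER"
    else
        echo "Driver $driver is older than $MIN_DRIVER" >&2
    fi
fi
```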

Step 2. Configure Docker

To easily manage containers without sudo, you must be in the docker group. If you choose to skip this step, you will need to run Docker commands with sudo. Open a new terminal and test Docker access. In the terminal, run:

docker ps

If you see a permission denied error (something like permission denied while trying to connect to the Docker daemon socket), add your user to the docker group so that you don't need to run the command with sudo.

sudo usermod -aG docker $USER
newgrp docker

Additionally, configure Docker so that it can use the NVIDIA Container Runtime.

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

## Run a sample workload to verify the setup
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
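
Both Docker changes in this step can also be confirmed without launching a container. The sketch below assumes a hypothetical helper, `list_has_word`. The commented commands use `id -nG` and `docker info`, and the exact JSON layout of `.Runtimes` may differ between Docker versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the space-separated list $1 contains
# the exact word $2.
list_has_word() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# Is the current shell running with the docker group?
# (id -nG prints the group names of the current process.)
# list_has_word "$(id -nG)" docker && echo "docker group: ok"

# Is the NVIDIA runtime registered with the Docker daemon?
# docker info --format '{{json .Runtimes}}' | grep -q '"nvidia"' && echo "nvidia runtime: ok"
```

If the group check fails right after `usermod`, the current shell most likely predates the change; `newgrp docker` or a fresh login picks it up.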

Step 3. Clone the VSS repository

Clone the Video Search and Summarization repository from NVIDIA's public GitHub.

Note

Install Git LFS if it is not already present on the system:

sudo apt-get install -y git-lfs && git lfs install

## Clone the VSS AI Blueprint repository
git clone https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization.git
cd video-search-and-summarization
git checkout tags/v3.2.0
git lfs install
git lfs pull
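
After the pull, it can be worth confirming that no Git LFS files were left as pointers. This sketch assumes the `git lfs ls-files` output format, in which `*` marks a downloaded object and `-` marks a pointer; `count_lfs_pointers` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `git lfs ls-files` output on stdin and print how
# many entries are still pointers (second column is "-" rather than "*").
count_lfs_pointers() {
    awk '$2 == "-" { n++ } END { print n + 0 }'
}

# Usage, from inside the cloned repository:
# git lfs ls-files | count_lfs_pointers     # 0 means every LFS file is present
# git describe --tags --exact-match         # should print the checked-out tag
```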

Step 4. Run the cache cleaner script

Start the system cache cleaner to optimize memory usage during container operations.

Create the cache cleaner script at /usr/local/bin/sys-cache-cleaner.sh with the following command:

sudo tee /usr/local/bin/sys-cache-cleaner.sh << 'EOF'
#!/bin/bash
## Exit immediately if any command fails
set -e

## Disable hugepages
echo "Disabling vm/nr_hugepages"
echo 0 | tee /proc/sys/vm/nr_hugepages

## Notify that the cache cleaner is running
echo "Starting cache cleaner - Running"
echo "Press Ctrl + C to stop"
## Repeatedly sync and drop caches every 3 seconds
while true; do
     sync && echo 3 | tee /proc/sys/vm/drop_caches > /dev/null
     sleep 3
done
EOF

sudo chmod +x /usr/local/bin/sys-cache-cleaner.sh

Running in the background

## In another terminal, start the cache cleaner script.
sudo -b /usr/local/bin/sys-cache-cleaner.sh

Note

The above runs the cache cleaner in the current session only; it does not persist across reboots. To have the cache cleaner run across reboots, create a systemd service instead.

To stop the background cache cleaner:

sudo pkill -f sys-cache-cleaner.sh
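
For the systemd option mentioned in the note, a unit file along the following lines could work. This is a sketch, not a file from the playbook: the unit name `sys-cache-cleaner.service` and the helper `render_unit` are assumptions, and the unit runs as root by default, which the script needs in order to write under /proc/sys.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal systemd unit that runs the script
# given as $1 and restarts it if it exits with an error.
render_unit() {
    cat <<EOF
[Unit]
Description=System cache cleaner (sketch)

[Service]
Type=simple
ExecStart=$1
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Install and start it (the unit name is an assumption):
# render_unit /usr/local/bin/sys-cache-cleaner.sh | sudo tee /etc/systemd/system/sys-cache-cleaner.service
# sudo systemctl daemon-reload
# sudo systemctl enable --now sys-cache-cleaner.service
```

Stop any copy started with `sudo -b` before enabling the service, so that only one cleaner loop is running.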

Step 5. Authenticate with NVIDIA Container Registry

Note

If you don’t have an NVIDIA account already, you’ll have to create one and register for the developer program.

## Log in to NVIDIA Container Registry
docker login nvcr.io
## Username: $oauthtoken
## Password: <PASTE_NGC_API_KEY_HERE>

Step 6. Choose deployment scenario

Choose the deployment options based on your requirements:

Deployment Scenario	VLM (Cosmos-Reason2-8B)	LLM
Standard VSS (Base)	Local	Remote
Standard VSS (Alert Verification)	Local	Remote
Standard VSS deployment (Real-Time Alerts)	Local	Remote
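
Each scenario corresponds to a different set of `dev-profile.sh` flags, as used in Step 7.2. A small helper can make that mapping explicit; the scenario keywords (`base`, `verification`, `real-time`) and the function name `profile_args` are illustrative assumptions, while the flags themselves are the ones shown in Step 7.2.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the dev-profile.sh profile flags for a scenario.
profile_args() {
    case "$1" in
        base)         echo "-p base" ;;
        verification) echo "-p alerts -m verification" ;;
        real-time)    echo "-p alerts -m real-time" ;;
        *)            echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}

# Usage:
# profile_args verification     # prints: -p alerts -m verification
```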

Step 7. Standard VSS

Standard VSS (Hybrid Deployment)

This hybrid deployment uses NIMs hosted on build.nvidia.com. Alternatively, you can configure your own hosted endpoints by following the instructions in the VSS remote LLM deployment guide.

7.1 Get NVIDIA API Key

Log in to https://build.nvidia.com/explore/discover.
Search for Get API Key on the page and click on it.

7.2 Launch Standard VSS deployment

Run the commands for the scenario you chose in Step 6: Base, Alert Verification, or Real-Time Alerts.

## Start Standard VSS (Base)
## Set NGC CLI API key and Hugging Face token (required for VA-MCP)
export NGC_CLI_API_KEY='your_ngc_api_key'
export HF_TOKEN='hf_your_token_here'
export LLM_ENDPOINT_URL=https://your-llm-endpoint.com
deploy/docker/scripts/dev-profile.sh up -p base -H DGX-SPARK --use-remote-llm --llm <REMOTE LLM MODEL NAME>

## Start Standard VSS (Alert Verification)
export NGC_CLI_API_KEY='your_ngc_api_key'
export LLM_ENDPOINT_URL=https://your-llm-endpoint.com
deploy/docker/scripts/dev-profile.sh up -p alerts -m verification -H DGX-SPARK --use-remote-llm --llm <REMOTE LLM MODEL NAME>

## Start Standard VSS (Real-Time Alerts)
export NGC_CLI_API_KEY='your_ngc_api_key'
export LLM_ENDPOINT_URL=https://your-llm-endpoint.com
deploy/docker/scripts/dev-profile.sh up -p alerts -m real-time -H DGX-SPARK --use-remote-llm --llm <REMOTE LLM MODEL NAME>

Note

This step will take several minutes as containers are pulled and services initialize. The VSS backend requires additional startup time.

Set the following environment variables before deployment:
- NGC_CLI_API_KEY (required): NGC API key for pulling images and deployment
- LLM_ENDPOINT_URL (required when using --use-remote-llm): Base URL for the remote LLM
- NVIDIA_API_KEY (optional): For remote LLM/VLM endpoints that require it
- OPENAI_API_KEY (optional): For remote LLM/VLM endpoints that require it
- VLM_CUSTOM_WEIGHTS (optional): Absolute path to a custom weights directory

Pass these additional flags to deploy/docker/scripts/dev-profile.sh for remote LLM mode:
- --use-remote-llm (required): Use a remote LLM; the base URL is read from LLM_ENDPOINT_URL in the environment
- --llm (required): Remote LLM model name (for example: nvidia/nvidia-nemotron-nano-9b-v2). Strongly recommended for alert workflows (verification and real-time): use nvidia/nvidia-nemotron-nano-9b-v2. Omitting --llm may cause the script to use whatever model is returned by the remote endpoint.

Run deploy/docker/scripts/dev-profile.sh --help for a full list of supported arguments.
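
Because a missing variable usually surfaces only after images have started pulling, a preflight check can save time. The following is a minimal sketch; `require_env` is a hypothetical helper, and it uses `printenv` so that only exported variables (the ones a child script can see) count as set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every named variable is exported and
# non-empty; report each missing one on stderr.
require_env() {
    local name
    local missing=0
    for name in "$@"; do
        if [ -z "$(printenv "$name")" ]; then
            echo "missing required environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage for the Base profile with a remote LLM:
# require_env NGC_CLI_API_KEY LLM_ENDPOINT_URL \
#   && deploy/docker/scripts/dev-profile.sh up -p base -H DGX-SPARK \
#        --use-remote-llm --llm nvidia/nvidia-nemotron-nano-9b-v2
```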

7.3 Validate Standard VSS deployment

Access the VSS UI to confirm a successful deployment. See also: Common VSS Endpoints.

## Test Agent UI accessibility
## If running locally on your Spark device, use localhost:
curl -I http://localhost:7777
## Expected: HTTP 200 response

## If your Spark is running in Remote/Accessory mode, replace 'localhost' with the IP address or hostname of your Spark device.
## To find your Spark's IP address, run the following command on the Spark terminal:
hostname -I
## Or to get the hostname:
hostname
## Then test accessibility (replace <SPARK_IP_OR_HOSTNAME> with the actual value):
curl -I http://<SPARK_IP_OR_HOSTNAME>:7777

Open http://localhost:7777 or http://<SPARK_IP_OR_HOSTNAME>:7777 in your browser to access the Agent interface.
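
Since the services need several minutes to come up, polling the UI is often more convenient than retrying `curl` by hand. This is a sketch with assumed defaults (300-second timeout, 5-second interval); `wait_for_http` is a hypothetical helper that treats HTTP 200 as ready, matching the expected response above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll URL $1 until it returns HTTP 200 or until
# $2 seconds have passed, checking every $3 seconds.
wait_for_http() {
    local url=$1
    local timeout_s=${2:-300}
    local interval_s=${3:-5}
    local elapsed=0
    local code
    while [ "$elapsed" -lt "$timeout_s" ]; do
        # -w prints only the status code; connection failures yield 000.
        code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url" 2>/dev/null || true)
        if [ "$code" = "200" ]; then
            return 0
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
    return 1
}

# Usage:
# wait_for_http http://localhost:7777 600 10 && echo "Agent UI is up"
```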

Step 8. Test video processing workflow

Run a basic test to verify that the video analysis pipeline works for your deployment.

For Standard VSS deployment

Follow the steps here to navigate the VSS Agent UI.

Access the VSS Agent interface at http://localhost:7777
Download the sample data from NGC here, then upload videos and test the features
Test Standard VSS deployment (Base) here
Test Standard VSS deployment (Alert Verification) here
Test Standard VSS deployment (Real-Time Alerts) here

Step 9. Cleanup and rollback

To completely remove the VSS deployment and free up system resources, run the following:

Warning

This will destroy all processed video data and analysis results.

## For Standard VSS deployment
deploy/docker/scripts/dev-profile.sh down

Step 10. Next steps

With VSS deployed, you can now:

Standard VSS deployment:

Access full VSS capabilities at port 7777
Test video summarization and Q&A features
Configure knowledge graphs and graph databases
Integrate with existing video processing workflows

Troubleshooting

Symptom	Cause	Fix
Container fails to start with "pull access denied"	Missing or incorrect nvcr.io credentials	Re-run `docker login nvcr.io` with valid credentials
Web interfaces not accessible	Services still starting or port conflicts	Wait 2-3 minutes, check `docker ps` for container status
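
When checking `docker ps` for container status, filtering for containers that are not running narrows things down quickly. The sketch below uses Docker's `--format` templates with a `|` separator; `not_up` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name|status" lines on stdin and print the ones
# whose status does not start with "Up".
not_up() {
    awk -F '|' '$2 !~ /^Up/ { print $1 ": " $2 }'
}

# Usage:
# docker ps -a --format '{{.Names}}|{{.Status}}' | not_up
# docker logs --tail 50 <container_name>    # inspect a container that failed
```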

Note

DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. With many applications still updating to take advantage of UMA, you may encounter memory issues even when within the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:

sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
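
To see whether the flush actually helped, available memory can be read from /proc/meminfo before and after. This is a minimal sketch; `mem_available_kib` is a hypothetical helper, and the optional file argument exists only to make it easy to test.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print MemAvailable in KiB, read from /proc/meminfo
# or from the file given as $1.
mem_available_kib() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Usage:
# before=$(mem_available_kib)
# sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
# after=$(mem_available_kib)
# echo "Available memory changed by $(( (after - before) / 1024 )) MiB"
```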
