Deployment Configuration

This directory contains all deployment-related configuration for the txt2kg project.

Structure

  • compose/: Docker Compose files for local development and testing

    • docker-compose.yml: Minimal Docker Compose configuration (Ollama + ArangoDB + Next.js)
    • docker-compose.complete.yml: Complete stack with optional services (vLLM, Pinecone, Sentence Transformers)
    • docker-compose.optional.yml: Additional optional services (see the overlay example after this list)
    • docker-compose.vllm.yml: Legacy vLLM configuration (use --complete flag instead)
  • app/: Frontend application Docker configuration

    • Dockerfile for Next.js application
  • services/: Containerized services

    • ollama/: Ollama LLM inference service with GPU support
    • sentence-transformers/: Sentence transformer service for embeddings (optional)
    • vllm/: vLLM inference service with FP8 quantization (optional)
    • gpu-viz/: GPU-accelerated graph visualization services (optional, run separately)
    • gnn_model/: Graph Neural Network model service (experimental, not in default compose files)
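
Because Docker Compose merges multiple -f files, the optional services can be layered on top of the minimal stack. A hedged sketch, assuming docker-compose.optional.yml is written as an overlay of the base file:

# Start the minimal stack with the optional services layered on top
docker compose \
  -f deploy/compose/docker-compose.yml \
  -f deploy/compose/docker-compose.optional.yml \
  up -d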

Usage

Recommended: Use the start script

# Minimal setup (Ollama + ArangoDB + Next.js frontend)
./start.sh

# Complete stack (includes vLLM, Pinecone, Sentence Transformers)
./start.sh --complete

# Development mode (run frontend without Docker)
./start.sh --dev-frontend
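
Once the script finishes, you can confirm the containers came up. A minimal check, assuming the default compose file path:

# Show container status for the minimal stack
docker compose -f deploy/compose/docker-compose.yml ps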

Manual Docker Compose commands:

To start the minimal services:

docker compose -f deploy/compose/docker-compose.yml up -d

To start the complete stack:

docker compose -f deploy/compose/docker-compose.complete.yml up -d
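
To stop a stack, point docker compose at the same file you started with (adding -v also removes named volumes, which deletes database data):

# Stop and remove the containers, keeping volumes
docker compose -f deploy/compose/docker-compose.yml down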

Services Included

Minimal Stack (default)

  • Next.js App: Web UI on port 3001
  • ArangoDB: Graph database on port 8529
  • Ollama: Local LLM inference on port 11434
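
A quick reachability check for each minimal service. The Ollama and ArangoDB endpoints below are the standard ones for those tools; ArangoDB may require credentials depending on how authentication is configured, and the app's root path is an assumption:

# Next.js UI (should return the page)
curl -s http://localhost:3001 >/dev/null && echo "app is up"

# ArangoDB REST API version endpoint
curl -s http://localhost:8529/_api/version

# Ollama: list locally available models
curl -s http://localhost:11434/api/tags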

Complete Stack (--complete flag)

All minimal services plus:

  • vLLM: High-throughput LLM inference (OpenAI-compatible API) on port 8001
  • Pinecone (Local): Local vector database for embeddings on port 5081
  • Sentence Transformers: Embedding generation on port 8000
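
vLLM serves an OpenAI-compatible API by default, so a hedged check, assuming the default configuration, is:

# List models served by vLLM (OpenAI-compatible endpoint)
curl -s http://localhost:8001/v1/models

The Sentence Transformers and Pinecone Local endpoints are project-specific and not sketched here.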

Optional Services (run separately)

  • GPU-Viz Services: See services/gpu-viz/README.md for GPU-accelerated visualization
  • GNN Model Service: See services/gnn_model/README.md for experimental GNN-based RAG