dgx-spark-playbooks/nvidia/txt2kg/assets/deploy/README.md
2026-01-14 16:05:35 +00:00

Deployment Configuration

This directory contains all deployment-related configuration for the txt2kg project.

Structure

  • compose/: Docker Compose configuration

    • docker-compose.yml: ArangoDB + Ollama (default)
    • docker-compose.vllm.yml: Neo4j + vLLM (GPU-accelerated)
  • app/: Frontend application Docker configuration

    • Dockerfile for Next.js application
  • services/: Containerized services

    • ollama/: Ollama LLM inference service (default)
    • vllm/: vLLM inference service with GPU support (via --vllm flag)
    • sentence-transformers/: Sentence transformer service for embeddings (via --vector-search flag)
    • gpu-viz/: GPU-accelerated graph visualization services (run separately)
    • gnn_model/: Graph Neural Network model service (experimental)

Usage

Recommended: Use the start script

# Default: ArangoDB + Ollama
./start.sh

# Use Neo4j + vLLM (GPU-accelerated, for DGX Spark/GB300)
./start.sh --vllm

# Enable vector search (Qdrant + Sentence Transformers)
./start.sh --vector-search

# Combine options
./start.sh --vllm --vector-search

# Development mode (run frontend without Docker)
./start.sh --dev-frontend

Manual Docker Compose commands (paths are relative to the directory that contains deploy/):

# Default: ArangoDB + Ollama
docker compose -f deploy/compose/docker-compose.yml up -d

# Neo4j + vLLM
docker compose -f deploy/compose/docker-compose.vllm.yml up -d

# With vector search services (add --profile vector-search)
docker compose -f deploy/compose/docker-compose.yml --profile vector-search up -d
docker compose -f deploy/compose/docker-compose.vllm.yml --profile vector-search up -d
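
The relationship between the start-script flags and the manual commands above can be sketched as a small shell function. This is an illustration only: compose_args is a hypothetical helper, not part of start.sh, and it is not a description of how start.sh is actually implemented. It covers only --vllm and --vector-search, because the Compose equivalent of --dev-frontend is not documented here.

```shell
# Sketch only: compose_args is a hypothetical helper, not part of start.sh.
# It prints the docker compose arguments matching the documented flags.
compose_args() {
  local file="deploy/compose/docker-compose.yml"   # default: ArangoDB + Ollama
  local profile=""
  local arg
  for arg in "$@"; do
    case "$arg" in
      --vllm)
        file="deploy/compose/docker-compose.vllm.yml"   # Neo4j + vLLM
        ;;
      --vector-search)
        profile=" --profile vector-search"              # Qdrant + Sentence Transformers
        ;;
      *)
        echo "unknown option: $arg" >&2
        return 1
        ;;
    esac
  done
  echo "-f ${file}${profile} up -d"
}
```

For example, `docker compose $(compose_args --vllm --vector-search)` expands to the last manual command listed above. The command substitution is left unquoted on purpose so that the printed arguments are split into separate words.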

Services Included

Default Stack (ArangoDB + Ollama)

  • Next.js App: Web UI on port 3001
  • ArangoDB: Graph database on port 8529
  • Ollama: Local LLM inference on port 11434

vLLM Stack (--vllm flag) - Neo4j + vLLM

  • Next.js App: Web UI on port 3001
  • Neo4j: Graph database on ports 7474 (HTTP) and 7687 (Bolt)
  • vLLM: GPU-accelerated LLM inference on port 8001

Vector Search (--vector-search flag, vector-search Compose profile)

  • Qdrant: Vector database on port 6333
  • Sentence Transformers: Embedding generation on port 8000
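
After starting a stack, a quick way to confirm that the ports listed above are reachable is a TCP probe. The sketch below is a convenience example, not part of the project: stack_ports and check_stack are hypothetical names, it assumes the services are published on localhost, and it assumes a netcat (nc) build that supports -z and -w. An open port only shows that something is listening, not that the service is healthy.

```shell
# Sketch only: stack_ports and check_stack are hypothetical helpers.
# Port numbers come from the service lists above.
stack_ports() {
  case "$1" in
    default)       echo "3001 8529 11434" ;;      # Next.js, ArangoDB, Ollama
    vllm)          echo "3001 7474 7687 8001" ;;  # Next.js, Neo4j HTTP/Bolt, vLLM
    vector-search) echo "6333 8000" ;;            # Qdrant, Sentence Transformers
    *)
      echo "unknown stack: $1" >&2
      return 1
      ;;
  esac
}

# Probe every port of the given stack; the host defaults to localhost (assumption).
check_stack() {
  local host="${2:-localhost}"
  local port ports
  ports=$(stack_ports "$1") || return 1
  for port in $ports; do
    if nc -z -w 2 "$host" "$port" 2>/dev/null; then
      echo "port $port: open"
    else
      echo "port $port: closed"
    fi
  done
}
```

Run `check_stack default` after `./start.sh`, or `check_stack vllm` and `check_stack vector-search` after `./start.sh --vllm --vector-search`.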

Optional Services (run separately)

  • GPU-Viz Services: See services/gpu-viz/README.md for GPU-accelerated visualization
  • GNN Model Service: See services/gnn_model/README.md for experimental GNN-based RAG

Architecture

┌─────────────────────────────────────────────────────────────────┐
│  Default Stack (./start.sh)          │  vLLM Stack (--vllm)     │
├──────────────────────────────────────┼──────────────────────────┤
│                                      │                          │
│  ┌─────────────┐                     │  ┌─────────────┐         │
│  │   Next.js   │ port 3001           │  │   Next.js   │ 3001    │
│  └──────┬──────┘                     │  └──────┬──────┘         │
│         │                            │         │                │
│  ┌──────┴──────┐  ┌─────────────┐    │  ┌──────┴──────┐  ┌─────┐│
│  │  ArangoDB   │  │   Ollama    │    │  │   Neo4j     │  │vLLM ││
│  │  port 8529  │  │ port 11434  │    │  │  port 7474  │  │8001 ││
│  └─────────────┘  └─────────────┘    │  └─────────────┘  └─────┘│
│                                      │                          │
└──────────────────────────────────────┴──────────────────────────┘

Optional (--vector-search): Qdrant (6333) + Sentence Transformers (8000)