Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git (synced 2026-04-23 02:23:53 +00:00)
# Deployment Configuration

This directory contains all deployment-related configuration for the txt2kg project.
## Structure

- `compose/`: Docker Compose configuration
  - `docker-compose.yml`: ArangoDB + Ollama (default)
  - `docker-compose.vllm.yml`: Neo4j + vLLM (GPU-accelerated)
- `app/`: Frontend application Docker configuration
  - Dockerfile for the Next.js application
- `services/`: Containerized services
  - `ollama/`: Ollama LLM inference service (default)
  - `vllm/`: vLLM inference service with GPU support (via the `--vllm` flag)
  - `sentence-transformers/`: Sentence transformer service for embeddings (via the `--vector-search` flag)
  - `gpu-viz/`: GPU-accelerated graph visualization services (run separately)
  - `gnn_model/`: Graph Neural Network model service (experimental)
## Usage

### Recommended: Use the start script

```bash
# Default: ArangoDB + Ollama
./start.sh

# Use Neo4j + vLLM (GPU-accelerated, for DGX Spark/GB300)
./start.sh --vllm

# Enable vector search (Qdrant + Sentence Transformers)
./start.sh --vector-search

# Combine options
./start.sh --vllm --vector-search

# Development mode (run the frontend without Docker)
./start.sh --dev-frontend
```
### Manual Docker Compose commands

```bash
# Default: ArangoDB + Ollama
docker compose -f deploy/compose/docker-compose.yml up -d

# Neo4j + vLLM
docker compose -f deploy/compose/docker-compose.vllm.yml up -d

# With vector search services (add --profile vector-search)
docker compose -f deploy/compose/docker-compose.yml --profile vector-search up -d
docker compose -f deploy/compose/docker-compose.vllm.yml --profile vector-search up -d
```
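The flag-to-compose mapping above can be sketched as a small shell function. This is a hypothetical illustration of the dispatch logic, not the actual contents of `start.sh`:

```shell
#!/bin/sh
# Hypothetical sketch: how a wrapper like start.sh could map its flags
# onto the docker compose invocations listed above. It only prints the
# command, so the mapping is easy to inspect before running anything.
compose_cmd() {
  file="deploy/compose/docker-compose.yml"   # default: ArangoDB + Ollama
  profile=""
  for arg in "$@"; do
    case "$arg" in
      --vllm)          file="deploy/compose/docker-compose.vllm.yml" ;;  # Neo4j + vLLM
      --vector-search) profile=" --profile vector-search" ;;             # Qdrant + embeddings
    esac
  done
  echo "docker compose -f ${file}${profile} up -d"
}

compose_cmd --vllm --vector-search
# prints: docker compose -f deploy/compose/docker-compose.vllm.yml --profile vector-search up -d
```

Piping the printed command to `sh` would execute it; printing first keeps the sketch side-effect free.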
## Services Included

### Default Stack (ArangoDB + Ollama)

- Next.js App: Web UI on port 3001
- ArangoDB: Graph database on port 8529
- Ollama: Local LLM inference on port 11434

### vLLM Stack (`--vllm` flag): Neo4j + vLLM

- Next.js App: Web UI on port 3001
- Neo4j: Graph database on ports 7474 (HTTP) and 7687 (Bolt)
- vLLM: GPU-accelerated LLM inference on port 8001

### Vector Search (`--vector-search` profile)

- Qdrant: Vector database on port 6333
- Sentence Transformers: Embedding generation on port 8000
### Optional Services (run separately)

- GPU-Viz Services: see `services/gpu-viz/README.md` for GPU-accelerated visualization
- GNN Model Service: see `services/gnn_model/README.md` for experimental GNN-based RAG
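Once a stack is up, the ports listed above can be smoke-tested. A minimal sketch, assuming the default stack plus vector search on localhost; the service names and the choice of probing the HTTP root (rather than any dedicated health route) are assumptions:

```shell
#!/bin/sh
# Hypothetical helper: emit one curl probe per service/port from the
# tables above. It prints the probes instead of executing them, so it is
# safe to run anywhere; pipe the output to sh to probe a live stack.
emit_probes() {
  while read -r name port; do
    [ -n "$name" ] || continue
    echo "curl -sf http://localhost:${port}/ >/dev/null && echo '${name}: up' || echo '${name}: down'"
  done <<'EOF'
next-app 3001
arangodb 8529
ollama 11434
qdrant 6333
sentence-transformers 8000
EOF
}

emit_probes
```

For the `--vllm` stack, swap in Neo4j (7474) and vLLM (8001) in place of ArangoDB and Ollama.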
## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│ Default Stack (./start.sh)           │ vLLM Stack (--vllm)      │
├──────────────────────────────────────┼──────────────────────────┤
│                                      │                          │
│ ┌─────────────┐                      │ ┌─────────────┐          │
│ │  Next.js    │ port 3001            │ │  Next.js    │ 3001     │
│ └──────┬──────┘                      │ └──────┬──────┘          │
│        │                             │        │                 │
│ ┌──────┴──────┐  ┌─────────────┐     │ ┌──────┴──────┐  ┌─────┐ │
│ │  ArangoDB   │  │   Ollama    │     │ │   Neo4j     │  │vLLM │ │
│ │  port 8529  │  │ port 11434  │     │ │  port 7474  │  │8001 │ │
│ └─────────────┘  └─────────────┘     │ └─────────────┘  └─────┘ │
│                                      │                          │
└──────────────────────────────────────┴──────────────────────────┘
```

Optional (`--vector-search`): Qdrant (6333) + Sentence Transformers (8000)