# Deployment Configuration
This directory contains all deployment-related configuration for the txt2kg project.
## Structure
- **compose/**: Docker Compose files for local development and testing
  - `docker-compose.yml`: Minimal Docker Compose configuration (Ollama + ArangoDB + Next.js)
  - `docker-compose.complete.yml`: Complete stack with optional services (vLLM, Pinecone, Sentence Transformers)
  - `docker-compose.optional.yml`: Additional optional services
  - `docker-compose.vllm.yml`: Legacy vLLM configuration (use the `--complete` flag instead)
- **app/**: Frontend application Docker configuration
  - Dockerfile for the Next.js application
- **services/**: Containerized services
  - **ollama/**: Ollama LLM inference service with GPU support
  - **sentence-transformers/**: Sentence-transformer service for embeddings (optional)
  - **vllm/**: vLLM inference service with FP8 quantization (optional)
  - **gpu-viz/**: GPU-accelerated graph visualization services (optional, run separately)
  - **gnn_model/**: Graph Neural Network model service (experimental, not in the default compose files)
## Usage
**Recommended: Use the start script**
```bash
# Minimal setup (Ollama + ArangoDB + Next.js frontend)
./start.sh
# Complete stack (includes vLLM, Pinecone, Sentence Transformers)
./start.sh --complete
# Development mode (run frontend without Docker)
./start.sh --dev-frontend
```
**Manual Docker Compose commands:**

To start the minimal services:
```bash
docker compose -f deploy/compose/docker-compose.yml up -d
```
To start the complete stack:
```bash
docker compose -f deploy/compose/docker-compose.complete.yml up -d
```
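If you run several commands against the same file, Compose's standard `COMPOSE_FILE` environment variable saves retyping the `-f` flag (a convenience sketch, not something the project requires):

```shell
# Point Compose at the complete-stack file once; later commands pick it up automatically.
export COMPOSE_FILE=deploy/compose/docker-compose.complete.yml

# These now target the complete stack without -f:
#   docker compose up -d
#   docker compose logs -f
#   docker compose down
```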
## Services Included
### Minimal Stack (default)
- **Next.js App**: Web UI on port 3001
- **ArangoDB**: Graph database on port 8529
- **Ollama**: Local LLM inference on port 11434
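Once the minimal stack is up, a quick reachability check against the ports above can be sketched as follows (`check_service` is a hypothetical helper, not part of the repository; it assumes `curl` is installed):

```shell
# Report whether a local service answers HTTP on the given port.
check_service() {
  name="$1"; port="$2"
  if curl -sf -o /dev/null "http://localhost:${port}/" 2>/dev/null; then
    echo "${name}: up (port ${port})"
  else
    echo "${name}: not reachable (port ${port})"
  fi
}

check_service "Next.js"  3001
check_service "ArangoDB" 8529
check_service "Ollama"   11434
```

Each line reports one service, so a partially started stack is easy to spot at a glance.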
### Complete Stack (`--complete` flag)
All minimal services plus:
- **vLLM**: Advanced LLM inference on port 8001
- **Pinecone (Local)**: Vector embeddings on port 5081
- **Sentence Transformers**: Embedding generation on port 8000
### Optional Services (run separately)
- **GPU-Viz Services**: See `services/gpu-viz/README.md` for GPU-accelerated visualization
- **GNN Model Service**: See `services/gnn_model/README.md` for experimental GNN-based RAG