mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git synced 2026-06-20 21:29:31 +00:00

Santosh Bhavani de9c46e97e Replace Pinecone with Qdrant for ARM64 compatibility	2025-10-24 23:16:44 -07:00
- Migrate from Pinecone to Qdrant vector database for native ARM64 support
- Add Qdrant service with automatic collection initialization in docker-compose
- Implement QdrantService with UUID-based point IDs to meet Qdrant requirements
- Update all API routes and frontend components to use Qdrant
- Enhance Storage Connections UI with detailed stats (vectors, status, dimensions)
- Add icons and tooltips to Vector DB section matching Graph DB UX
app	Replace Pinecone with Qdrant for ARM64 compatibility	2025-10-24 23:16:44 -07:00
compose	Replace Pinecone with Qdrant for ARM64 compatibility	2025-10-24 23:16:44 -07:00
services	Add NVIDIA_API_KEY support and update ollama to v0.12.6	2025-10-19 14:52:24 -05:00
README.md	chore: Regenerate all playbooks	2025-10-10 18:45:20 +00:00

Deployment Configuration

This directory contains all deployment-related configuration for the txt2kg project.

Structure

compose/: Docker Compose files for local development and testing
- docker-compose.yml: Minimal Docker Compose configuration (Ollama + ArangoDB + Next.js)
- docker-compose.complete.yml: Complete stack with optional services (vLLM, Pinecone, Sentence Transformers)
- docker-compose.optional.yml: Additional optional services
- docker-compose.vllm.yml: Legacy vLLM configuration (use ./start.sh --complete instead)
app/: Frontend application Docker configuration
- Dockerfile for Next.js application
services/: Containerized services
- ollama/: Ollama LLM inference service with GPU support
- sentence-transformers/: Sentence transformer service for embeddings (optional)
- vllm/: vLLM inference service with FP8 quantization (optional)
- gpu-viz/: GPU-accelerated graph visualization services (optional, run separately)
- gnn_model/: Graph Neural Network model service (experimental, not in default compose files)

Usage

Recommended: Use the start script

# Minimal setup (Ollama + ArangoDB + Next.js frontend)
./start.sh

# Complete stack (includes vLLM, Pinecone, Sentence Transformers)
./start.sh --complete

# Development mode (run frontend without Docker)
./start.sh --dev-frontend

Manual Docker Compose commands (paths are relative to the project root, the parent of this directory):

To start the minimal services:

docker compose -f deploy/compose/docker-compose.yml up -d

To start the complete stack:

docker compose -f deploy/compose/docker-compose.complete.yml up -d
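
Once a stack is up, the same compose file can be passed to the standard `docker compose` subcommands `ps`, `logs`, and `down` to inspect or stop it. The helper below is a minimal sketch, not part of this project: the function name `stack` and the `DRY_RUN` variable are hypothetical, and it assumes it is run from the project root like the commands above.

```shell
# Hypothetical helper (not shipped with the project).
# Usage: stack <minimal|complete> <docker compose arguments...>
# Set DRY_RUN=1 to print the command instead of running it.
stack() {
  case "$1" in
    minimal)  file="deploy/compose/docker-compose.yml" ;;
    complete) file="deploy/compose/docker-compose.complete.yml" ;;
    *) echo "usage: stack <minimal|complete> <args...>" >&2; return 2 ;;
  esac
  shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Show what would be executed without touching Docker.
    echo "docker compose -f $file $*"
  else
    docker compose -f "$file" "$@"
  fi
}
```

For example, `stack minimal ps` lists the running containers, `stack minimal logs -f` follows their logs, and `stack complete down` stops and removes the complete stack.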

Services Included

Minimal Stack (default)

Next.js App: Web UI on port 3001
ArangoDB: Graph database on port 8529
Ollama: Local LLM inference on port 11434

Complete Stack (`--complete` flag)

All minimal services plus:

vLLM: LLM inference with FP8 quantization on port 8001
Pinecone (Local): Vector database for embeddings on port 5081
Sentence Transformers: Embedding generation on port 8000
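
To confirm that the services listed for both stacks are answering, a quick reachability check can help. The sketch below rests on assumptions: the function names `service_port` and `check_service` are hypothetical, the host is assumed to be `localhost`, and each service is assumed to accept HTTP on its listed port. Only the port numbers come from this README.

```shell
# Map a service name to the port listed in this README.
service_port() {
  case "$1" in
    nextjs)                echo 3001 ;;
    arangodb)              echo 8529 ;;
    ollama)                echo 11434 ;;
    vllm)                  echo 8001 ;;
    pinecone)              echo 5081 ;;
    sentence-transformers) echo 8000 ;;
    *) return 1 ;;
  esac
}

# Report whether anything answers HTTP on the service's port.
# Any HTTP response, including an error status, counts as "up".
# Assumption: services are published on localhost.
check_service() {
  port=$(service_port "$1") || { echo "$1: unknown service" >&2; return 2; }
  if curl -s -o /dev/null --max-time 2 "http://localhost:$port/"; then
    echo "$1: up on port $port"
  else
    echo "$1: not reachable on port $port"
    return 1
  fi
}
```

For the minimal stack, `for s in nextjs arangodb ollama; do check_service "$s"; done` checks each service in turn. A service that is still starting (for example, a model server loading weights) may report as not reachable for a while.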

Optional Services (run separately)

GPU-Viz Services: See services/gpu-viz/README.md for GPU-accelerated visualization
GNN Model Service: See services/gnn_model/README.md for experimental GNN-based RAG