chore: Regenerate all playbooks

GitLab CI 2026-01-14 16:05:35 +00:00
parent 7e04f555c4
commit d0dbd18840
70 changed files with 2341 additions and 1253 deletions

View File

@@ -43,7 +43,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
 - [Portfolio Optimization](nvidia/portfolio-optimization/)
 - [Fine-tune with Pytorch](nvidia/pytorch-fine-tune/)
 - [RAG Application in AI Workbench](nvidia/rag-ai-workbench/)
-- [SGLang Inference Server](nvidia/sglang/)
+- [SGLang for Inference](nvidia/sglang/)
 - [Single-cell RNA Sequencing](nvidia/single-cell/)
 - [Speculative Decoding](nvidia/speculative-decoding/)
 - [Set up Tailscale on Your Spark](nvidia/tailscale/)

View File

@@ -67,8 +67,8 @@ model adaptation for specialized domains while leveraging hardware-specific opti
 * **Duration:** 30-60 minutes for initial setup, 1-7 hours for training depending on model size and dataset.
 * **Risks:** Model downloads require significant bandwidth and storage. Training may consume substantial GPU memory and require parameter tuning for hardware constraints.
 * **Rollback:** Remove Docker containers and cloned repositories. Training checkpoints are saved locally and can be deleted to reclaim storage space.
-* **Last Updated:** 12/15/2025
-* Upgrade to latest pytorch container version nvcr.io/nvidia/pytorch:25.11-py3
+* **Last Updated:** 01/08/2025
+* Update to Qwen3 LoRA fine-tuning workflow based on LLaMA Factory updates
 ## Instructions
@@ -105,10 +105,15 @@ cd LLaMA-Factory
 ### Step 4. Install LLaMA Factory with dependencies
-Install the package in editable mode with metrics support for training evaluation.
+Remove the torchaudio dependency (not needed for LLM fine-tuning) to avoid conflicts with the container's optimized PyTorch, then install.
 ```bash
+## Remove torchaudio dependency that conflicts with NVIDIA's PyTorch build
+sed -i 's/"torchaudio[^"]*",\?//' pyproject.toml
+## Install LLaMA Factory with metrics support
 pip install -e ".[metrics]"
+pip install --no-deps torchaudio
 ```
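The `sed` expression added in this hunk can be sanity-checked outside the container. A minimal sketch of the same substitution in Python (the sample `pyproject.toml` dependency line below is invented for illustration):

```python
import re

# Same pattern as the sed command: strip any "torchaudio..." entry,
# including an optional trailing comma.
pattern = r'"torchaudio[^"]*",?'

# Hypothetical dependency line, for illustration only.
line = 'dependencies = ["torch>=2.0", "torchaudio>=2.0.0", "transformers"]'
cleaned = re.sub(pattern, "", line)

print(cleaned)  # the torchaudio entry is gone; other deps are untouched
```

The leftover extra whitespace inside the list is harmless in TOML, which is why the playbook's one-liner gets away with a plain deletion.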
 ## Step 5. Verify Pytorch CUDA support.
@@ -126,7 +131,7 @@ python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda
 Examine the provided LoRA fine-tuning configuration for Llama-3.
 ```bash
-cat examples/train_lora/llama3_lora_sft.yaml
+cat examples/train_lora/qwen3_lora_sft.yaml
 ```
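For orientation, a LLaMA-Factory LoRA SFT config of this kind typically contains fields like the following sketch. The values shown are illustrative defaults, not the exact contents of `qwen3_lora_sft.yaml`, and the model id is an assumption:

```yaml
### model
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507   # illustrative model id

### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all

### dataset
dataset: identity,alpaca_en_demo
template: qwen
cutoff_len: 2048

### output
output_dir: saves/qwen3-4b/lora/sft

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

Reading the real file before launching is still worthwhile: `output_dir` determines where the checkpoints validated in Step 8 will land.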
 ## Step 7. Launch fine-tuning training
@@ -137,20 +142,20 @@ cat examples/train_lora/llama3_lora_sft.yaml
 Execute the training process using the pre-configured LoRA setup.
 ```bash
-huggingface-cli login # if the model is gated
-llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
+hf auth login # if the model is gated
+llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml
 ```
 Example output:
-```bash
+```
 ***** train metrics *****
 epoch = 3.0
-total_flos = 22851591GF
-train_loss = 0.9113
-train_runtime = 0:22:21.99
-train_samples_per_second = 2.437
-train_steps_per_second = 0.306
-Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
+total_flos = 11076559GF
+train_loss = 0.9993
+train_runtime = 0:14:32.12
+train_samples_per_second = 3.749
+train_steps_per_second = 0.471
+Figure saved at: saves/qwen3-4b/lora/sft/training_loss.png
 ```
 ## Step 8. Validate training completion
@@ -158,13 +163,12 @@ Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
 Verify that training completed successfully and checkpoints were saved.
 ```bash
-ls -la saves/llama3-8b/lora/sft/
+ls -la saves/qwen3-4b/lora/sft/
 ```
 Expected output should show:
-- Final checkpoint directory (`checkpoint-21` or similar)
-- Model configuration files (`config.json`, `adapter_config.json`)
+- Final checkpoint directory (`checkpoint-411` or similar)
+- Model configuration files (`adapter_config.json`)
 - Training metrics showing decreasing loss values
 - Training loss plot saved as PNG file
@@ -173,14 +177,14 @@ Expected output should show:
 Test your fine-tuned model with custom prompts:
 ```bash
-llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
+llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml
 ## Type: "Hello, how can you help me today?"
 ## Expect: Response showing fine-tuned behavior
 ```
 ## Step 10. For production deployment, export your model
 ```bash
-llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
+llamafactory-cli export examples/merge_lora/qwen3_lora_sft.yaml
 ```
 ## Step 11. Cleanup and rollback

View File

@@ -1,4 +1,4 @@
-# SGLang Inference Server
+# SGLang for Inference
 > Install and use SGLang on DGX Spark
@@ -68,6 +68,8 @@ The following models are supported with SGLang on Spark. All listed models are a
 | **Phi-4-reasoning-plus** | FP8 | ✅ | `nvidia/Phi-4-reasoning-plus-FP8` |
 | **Phi-4-reasoning-plus** | NVFP4 | ✅ | `nvidia/Phi-4-reasoning-plus-FP4` |
+Note: for NVFP4 models, add the `--quantization modelopt_fp4` flag.
 ### Time & risk
 * **Estimated time:** 30 minutes for initial setup and validation
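As a sketch of where the `--quantization modelopt_fp4` flag noted above goes, an NVFP4 launch might look like the following. The model id is taken from the table; the host/port values are illustrative defaults, not from this playbook:

```bash
# Hypothetical launch sketch: serve an NVFP4 checkpoint with SGLang.
# --quantization modelopt_fp4 is required for NVFP4 models.
python -m sglang.launch_server \
  --model-path nvidia/Phi-4-reasoning-plus-FP4 \
  --quantization modelopt_fp4 \
  --host 0.0.0.0 \
  --port 30000
```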

View File

@@ -54,9 +54,13 @@ The setup includes:
 - Document processing time scales with document size and complexity
 - **Rollback**: Stop and remove Docker containers, delete downloaded models if needed
-- **Last Updated**: 12/02/2025
-- Knowledge graph search with multi-hop graph traversal
-- Improved UI/UX
+- **Last Updated**: 01/08/2025
+- Migrated from Pinecone to Qdrant for ARM64 compatibility
+- Added vLLM support with Neo4j
+- Added Palette UI components with accessibility improvements
+- Added CPU-only mode for development (`./start.sh --cpu`)
+- Optimized ArangoDB with deterministic keys and BM25 search
+- Added GNN preprocessing scripts for knowledge graph training
 ## Instructions

View File

@@ -19,7 +19,7 @@ This playbook serves as a reference solution for knowledge graph extraction and
 </details>
-By default, this playbook leverages **Ollama** for local LLM inference, providing a fully self-contained solution that runs entirely on your own hardware. You can optionally use NVIDIA-hosted models available in the [NVIDIA API Catalog](https://build.nvidia.com) for advanced capabilities.
+By default, this playbook leverages **Ollama** for local LLM inference, providing a fully self-contained solution that runs entirely on your own hardware. You can optionally use **vLLM** for GPU-accelerated inference on DGX Spark/GB300, or NVIDIA-hosted models available in the [NVIDIA API Catalog](https://build.nvidia.com) for advanced capabilities.
 ## Key Features
@@ -33,7 +33,7 @@ By default, this playbook leverages **Ollama** for local LLM inference, providin
 - GPU-accelerated LLM inference with Ollama
 - Fully containerized deployment with Docker Compose
 - Optional NVIDIA API integration for cloud-based models
-- Optional vector search and advanced inference capabilities
+- Optional vector search with Qdrant for semantic similarity
 - Optional graph-based RAG for contextual answers
 ## Software Components
@@ -55,9 +55,13 @@ By default, this playbook leverages **Ollama** for local LLM inference, providin
 ### Optional Components
-* **Vector Database & Embedding** (with `--complete` flag)
+* **vLLM Stack** (with `--vllm` flag)
+  * **vLLM**: GPU-accelerated LLM inference optimized for DGX Spark/GB300
+    * Default model: `nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8`
+  * **Neo4j**: Alternative graph database
+* **Vector Database & Embedding** (with `--vector-search` flag)
   * **SentenceTransformer**: Local embedding generation (model: `all-MiniLM-L6-v2`)
-  * **Pinecone**: Self-hosted vector storage and similarity search
+  * **Qdrant**: Self-hosted vector storage and similarity search
 * **Cloud Models** (configure separately)
   * **NVIDIA API**: Cloud-based models via NVIDIA API Catalog
@@ -76,7 +80,7 @@ The core workflow for knowledge graph building and visualization:
 ### Future Enhancements
 Additional capabilities can be added:
-- **Vector search**: Add semantic similarity search with local Pinecone and SentenceTransformer embeddings
+- **Vector search**: Add semantic similarity search with Qdrant and SentenceTransformer embeddings
 - **S3 storage**: MinIO for scalable document storage
 - **GNN-based GraphRAG**: Graph Neural Networks for enhanced retrieval
@@ -84,7 +88,7 @@ Additional capabilities can be added:
 This playbook includes **GPU-accelerated LLM inference** with Ollama:
-### Ollama Features
+### Ollama Features (Default)
 - **Fully local inference**: No cloud dependencies or API keys required
 - **GPU acceleration**: Automatic CUDA support with NVIDIA GPUs
 - **Multiple model support**: Use any Ollama-compatible model
@@ -92,7 +96,13 @@ This playbook includes **GPU-accelerated LLM inference** with Ollama:
 - **Easy model management**: Pull and switch models with simple commands
 - **Privacy-first**: All data processing happens on your hardware
-### Default Configuration
+### vLLM Alternative (via `--vllm` flag)
+- **High-performance inference**: Optimized for DGX Spark/GB300 unified memory
+- **FP8 quantization**: Efficient memory usage with minimal quality loss
+- **Large context support**: Up to 32K tokens context length
+- **Continuous batching**: High throughput for multiple requests
+### Default Ollama Configuration
 - Model: `llama3.1:8b`
 - GPU memory fraction: 0.9 (90% of available VRAM)
 - Flash attention enabled
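The vLLM service added in this commit exposes an OpenAI-compatible endpoint (per the compose file, `http://localhost:8001/v1`). A minimal client sketch, assuming that URL and the default Nemotron model id from the compose configuration:

```python
import json
from urllib.request import Request

# Endpoint and model id as configured in the vLLM compose service;
# adjust if your deployment differs.
BASE_URL = "http://localhost:8001/v1"
MODEL = "nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8"

def build_chat_request(prompt: str, max_tokens: int = 256) -> Request:
    """Build an OpenAI-style chat-completions request for vLLM."""
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Extract triples from: 'Marie Curie won the Nobel Prize.'")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` only works once the container reports healthy; note the compose healthcheck allows a long model-loading window.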
@@ -152,8 +162,39 @@ docker exec ollama-compose ollama pull llama3.1:8b
 - **ArangoDB**: http://localhost:8529 (no authentication required)
 - **Ollama API**: http://localhost:11434
+### Alternative: Using vLLM (for DGX Spark/GB300)
+For GPU-accelerated inference with vLLM:
+```bash
+./start.sh --vllm
+```
+Then wait for vLLM to load the model:
+```bash
+docker logs vllm-service -f
+```
+Services:
+- **Web UI**: http://localhost:3001
+- **Neo4j Browser**: http://localhost:7474 (user: `neo4j`, password: `password123`)
+- **vLLM API**: http://localhost:8001
+### Adding Vector Search
+Enable semantic similarity search:
+```bash
+./start.sh --vector-search
+```
+This adds:
+- **Qdrant**: http://localhost:6333
+- **Sentence Transformers**: http://localhost:8000
 ## Available Customizations
+- **Switch LLM backend**: Use `--vllm` flag for vLLM or default for Ollama
+- **Add vector search**: Use `--vector-search` flag for Qdrant + embeddings
 - **Switch Ollama models**: Use any model from Ollama's library (Llama, Mistral, Qwen, etc.)
 - **Modify extraction prompts**: Customize how triples are extracted from text
 - **Add domain-specific knowledge sources**: Integrate external ontologies or taxonomies
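The vector-search path stores 384-dimensional `all-MiniLM-L6-v2` embeddings in Qdrant with cosine distance. The similarity it ranks results by can be sketched in a few lines (the vectors below are toy stand-ins, not real embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity, the metric Qdrant uses for these collections."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional stand-ins (real MiniLM embeddings are 384-dim).
query = [1.0, 0.0, 1.0]
doc_close = [0.9, 0.1, 1.1]
doc_far = [-1.0, 0.5, 0.0]

assert cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far)
```

Because cosine similarity ignores vector magnitude, only the direction of the embedding matters, which is why it is a common default for sentence-embedding search.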

View File

@@ -4,32 +4,36 @@ This directory contains all deployment-related configuration for the txt2kg proj
 ## Structure
-- **compose/**: Docker Compose files for local development and testing
-  - `docker-compose.yml`: Minimal Docker Compose configuration (Ollama + ArangoDB + Next.js)
-  - `docker-compose.complete.yml`: Complete stack with optional services (vLLM, Pinecone, Sentence Transformers)
-  - `docker-compose.optional.yml`: Additional optional services
-  - `docker-compose.vllm.yml`: Legacy vLLM configuration (use `--complete` flag instead)
+- **compose/**: Docker Compose configuration
+  - `docker-compose.yml`: ArangoDB + Ollama (default)
+  - `docker-compose.vllm.yml`: Neo4j + vLLM (GPU-accelerated)
 - **app/**: Frontend application Docker configuration
   - Dockerfile for Next.js application
 - **services/**: Containerized services
-  - **ollama/**: Ollama LLM inference service with GPU support
-  - **sentence-transformers/**: Sentence transformer service for embeddings (optional)
-  - **vllm/**: vLLM inference service with FP8 quantization (optional)
-  - **gpu-viz/**: GPU-accelerated graph visualization services (optional, run separately)
-  - **gnn_model/**: Graph Neural Network model service (experimental, not in default compose files)
+  - **ollama/**: Ollama LLM inference service (default)
+  - **vllm/**: vLLM inference service with GPU support (via `--vllm` flag)
+  - **sentence-transformers/**: Sentence transformer service for embeddings (via `--vector-search` flag)
+  - **gpu-viz/**: GPU-accelerated graph visualization services (run separately)
+  - **gnn_model/**: Graph Neural Network model service (experimental)
 ## Usage
 **Recommended: Use the start script**
 ```bash
-# Minimal setup (Ollama + ArangoDB + Next.js frontend)
+# Default: ArangoDB + Ollama
 ./start.sh
-# Complete stack (includes vLLM, Pinecone, Sentence Transformers)
-./start.sh --complete
+# Use Neo4j + vLLM (GPU-accelerated, for DGX Spark/GB300)
+./start.sh --vllm
+# Enable vector search (Qdrant + Sentence Transformers)
+./start.sh --vector-search
+# Combine options
+./start.sh --vllm --vector-search
 # Development mode (run frontend without Docker)
 ./start.sh --dev-frontend
@@ -37,31 +41,55 @@ This directory contains all deployment-related configuration for the txt2kg proj
 **Manual Docker Compose commands:**
-To start the minimal services:
 ```bash
+# Default: ArangoDB + Ollama
 docker compose -f deploy/compose/docker-compose.yml up -d
-```
-To start the complete stack:
-```bash
-docker compose -f deploy/compose/docker-compose.complete.yml up -d
+# Neo4j + vLLM
+docker compose -f deploy/compose/docker-compose.vllm.yml up -d
+# With vector search services (add --profile vector-search)
+docker compose -f deploy/compose/docker-compose.yml --profile vector-search up -d
+docker compose -f deploy/compose/docker-compose.vllm.yml --profile vector-search up -d
 ```
 ## Services Included
-### Minimal Stack (default)
+### Default Stack (ArangoDB + Ollama)
 - **Next.js App**: Web UI on port 3001
 - **ArangoDB**: Graph database on port 8529
 - **Ollama**: Local LLM inference on port 11434
-### Complete Stack (`--complete` flag)
-All minimal services plus:
-- **vLLM**: Advanced LLM inference on port 8001
-- **Pinecone (Local)**: Vector embeddings on port 5081
+### vLLM Stack (`--vllm` flag) - Neo4j + vLLM
+- **Next.js App**: Web UI on port 3001
+- **Neo4j**: Graph database on ports 7474 (HTTP) and 7687 (Bolt)
+- **vLLM**: GPU-accelerated LLM inference on port 8001
+### Vector Search (`--vector-search` profile)
+- **Qdrant**: Vector database on port 6333
 - **Sentence Transformers**: Embedding generation on port 8000
 ### Optional Services (run separately)
 - **GPU-Viz Services**: See `services/gpu-viz/README.md` for GPU-accelerated visualization
 - **GNN Model Service**: See `services/gnn_model/README.md` for experimental GNN-based RAG
+## Architecture
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ Default Stack (./start.sh)           │ vLLM Stack (--vllm)      │
+├──────────────────────────────────────┼──────────────────────────┤
+│                                      │                          │
+│ ┌─────────────┐                      │ ┌─────────────┐          │
+│ │ Next.js     │ port 3001            │ │ Next.js     │ 3001     │
+│ └──────┬──────┘                      │ └──────┬──────┘          │
+│        │                             │        │                 │
+│ ┌──────┴──────┐  ┌─────────────┐     │ ┌──────┴──────┐  ┌─────┐ │
+│ │ ArangoDB    │  │ Ollama      │     │ │ Neo4j       │  │vLLM │ │
+│ │ port 8529   │  │ port 11434  │     │ │ port 7474   │  │8001 │ │
+│ └─────────────┘  └─────────────┘     │ └─────────────┘  └─────┘ │
+│                                      │                          │
+└──────────────────────────────────────┴──────────────────────────┘
+Optional (--vector-search): Qdrant (6333) + Sentence Transformers (8000)
+```

View File

@@ -8,10 +8,6 @@ RUN npm install -g pnpm --force --yes
 # Copy dependency files
 COPY ./frontend/package.json ./frontend/pnpm-lock.yaml* ./
-COPY ./scripts/ /scripts/
-# Update the setup-pinecone.js path
-RUN sed -i 's|"setup-pinecone": "node ../scripts/setup-pinecone.js"|"setup-pinecone": "node /scripts/setup-pinecone.js"|g' package.json
 # Install dependencies with cache mount for faster rebuilds
 RUN --mount=type=cache,target=/root/.local/share/pnpm/store \
@@ -32,7 +28,6 @@ RUN npm install -g pnpm --force --yes
 # Copy node_modules from deps stage
 COPY --from=deps /app/node_modules ./node_modules
 COPY --from=deps /app/package.json ./package.json
-COPY --from=deps /scripts /scripts
 # Copy source code
 COPY ./frontend/ ./

View File

@@ -1,20 +1,4 @@
 #!/bin/sh
-#
-# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
 # Script to initialize Pinecone index at container startup
 echo "Initializing Pinecone index..."

View File

@@ -104,7 +104,7 @@ services:
       - OLLAMA_FLASH_ATTENTION=1
       - OLLAMA_KEEP_ALIVE=30m
       - OLLAMA_CUDA=1
-      - OLLAMA_LLM_LIBRARY=cuda
+      - OLLAMA_LLM_LIBRARY=cuda_v13
      - OLLAMA_NUM_PARALLEL=1
       - OLLAMA_MAX_LOADED_MODELS=1
       - OLLAMA_KV_CACHE_TYPE=q8_0
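The `OLLAMA_KV_CACHE_TYPE=q8_0` setting in these compose files trades a little precision for roughly half the KV-cache memory. A back-of-envelope sketch, assuming Llama-3.1-8B-like geometry (32 layers, 8 KV heads, head size 128) and q8_0's roughly 8.5 bits per element; the numbers are illustrative, not measured:

```python
# Rough KV-cache size per token: 2 tensors (K and V) per layer.
layers, kv_heads, head_dim = 32, 8, 128   # assumed Llama-3.1-8B geometry

def kv_bytes_per_token(bytes_per_elt: float) -> float:
    return 2 * layers * kv_heads * head_dim * bytes_per_elt

f16 = kv_bytes_per_token(2.0)      # fp16: 2 bytes per element
q8 = kv_bytes_per_token(1.0625)    # q8_0: ~34 bytes per 32-element block

print(f"fp16: {f16 / 1024:.0f} KiB/token, q8_0: {q8 / 1024:.0f} KiB/token")
```

At an 8K-token context that is on the order of 1 GiB versus roughly half that for the cache alone, which is why the quantized cache is the default here.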

View File

@@ -1,6 +1,10 @@
-# This is a legacy file - use --with-optional flag instead
-# The vLLM service is now included in docker-compose.optional.yml
-# This file is kept for backwards compatibility
+# txt2kg Docker Compose - Neo4j + vLLM (GPU-accelerated)
+#
+# Optional stack optimized for DGX Spark/GB300 with unified memory support
+#
+# Usage:
+#   ./start.sh --vllm                  # Use this compose file
+#   ./start.sh --vllm --vector-search  # Add Qdrant + Sentence Transformers
 services:
   app:
@@ -10,105 +14,100 @@ services:
     ports:
       - '3001:3000'
     environment:
-      - ARANGODB_URL=http://arangodb:8529
+      # Neo4j configuration
+      - NEO4J_URI=bolt://neo4j:7687
+      - NEO4J_USER=neo4j
+      - NEO4J_PASSWORD=password123
+      - GRAPH_DB_TYPE=neo4j
+      # Disable ArangoDB
+      - ARANGODB_URL=http://localhost:8529
       - ARANGODB_DB=txt2kg
-      - PINECONE_HOST=entity-embeddings
-      - PINECONE_PORT=5081
-      - PINECONE_API_KEY=pclocal
-      - PINECONE_ENVIRONMENT=local
+      # vLLM configuration (GPU-accelerated)
+      - VLLM_BASE_URL=http://vllm:8001/v1
+      - VLLM_MODEL=nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
+      # Disable Ollama
+      - OLLAMA_BASE_URL=http://localhost:11434/v1
+      - OLLAMA_MODEL=disabled
+      # Vector DB configuration
+      - QDRANT_URL=http://qdrant:6333
+      - VECTOR_DB_TYPE=qdrant
+      # Embeddings configuration
       - LANGCHAIN_TRACING_V2=true
       - SENTENCE_TRANSFORMER_URL=http://sentence-transformers:80
       - MODEL_NAME=all-MiniLM-L6-v2
+      - EMBEDDINGS_API_URL=http://sentence-transformers:80
+      # Other settings
       - GRPC_SSL_CIPHER_SUITES=HIGH+ECDSA:HIGH+aRSA
       - NODE_TLS_REJECT_UNAUTHORIZED=0
-      - OLLAMA_BASE_URL=http://ollama:11434/v1
-      - OLLAMA_MODEL=qwen3:1.7b
-      - VLLM_BASE_URL=http://vllm:8001/v1
-      - VLLM_MODEL=meta-llama/Llama-3.2-3B-Instruct
-      - REMOTE_WEBGPU_SERVICE_URL=http://txt2kg-remote-webgpu:8083
+      - NVIDIA_API_KEY=${NVIDIA_API_KEY:-}
+      - NODE_OPTIONS=--max-http-header-size=80000
+      - UV_THREADPOOL_SIZE=128
+      - HTTP_TIMEOUT=1800000
+      - REQUEST_TIMEOUT=1800000
     networks:
-      - pinecone-net
       - default
       - txt2kg-network
+      - qdrant-net
     depends_on:
-      - arangodb
-      - entity-embeddings
-      - sentence-transformers
-      - vllm
-  arangodb:
-    image: arangodb:latest
-    ports:
-      - '8529:8529'
-    environment:
-      - ARANGO_NO_AUTH=1
-    volumes:
-      - arangodb_data:/var/lib/arangodb3
-      - arangodb_apps_data:/var/lib/arangodb3-apps
-  arangodb-init:
-    image: arangodb:latest
-    depends_on:
-      arangodb:
-        condition: service_started
-    restart: on-failure
-    entrypoint: >
-      sh -c "
-      echo 'Waiting for ArangoDB to start...' &&
-      sleep 10 &&
-      echo 'Creating txt2kg database...' &&
-      arangosh --server.endpoint tcp://arangodb:8529 --server.authentication false --javascript.execute-string 'try { db._createDatabase(\"txt2kg\"); console.log(\"Database txt2kg created successfully!\"); } catch(e) { if(e.message.includes(\"duplicate\")) { console.log(\"Database txt2kg already exists\"); } else { throw e; } }'
-      "
-  entity-embeddings:
-    image: ghcr.io/pinecone-io/pinecone-index:latest
-    container_name: entity-embeddings
-    environment:
-      PORT: 5081
-      INDEX_TYPE: serverless
-      VECTOR_TYPE: dense
-      DIMENSION: 384
-      METRIC: cosine
-      INDEX_NAME: entity-embeddings
-    ports:
-      - "5081:5081"
-    platform: linux/amd64
-    networks:
-      - pinecone-net
-    restart: unless-stopped
-  sentence-transformers:
-    build:
-      context: ../../deploy/services/sentence-transformers
-      dockerfile: Dockerfile
-    ports:
-      - '8000:80'
-    environment:
-      - MODEL_NAME=all-MiniLM-L6-v2
-    networks:
-      - default
-    restart: unless-stopped
+      neo4j:
+        condition: service_healthy
+      vllm:
+        condition: service_started
+  # Neo4j - Graph database
+  neo4j:
+    image: neo4j:5-community
+    ports:
+      - '7474:7474'
+      - '7687:7687'
+    environment:
+      - NEO4J_AUTH=neo4j/password123
+      - NEO4J_server_memory_heap_initial__size=512m
+      - NEO4J_server_memory_heap_max__size=2G
+    volumes:
+      - neo4j_data:/data
+      - neo4j_logs:/logs
+    networks:
+      - default
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:7474 || exit 1"]
+      interval: 15s
+      timeout: 10s
+      retries: 10
+      start_period: 60s
+  # vLLM - GPU-accelerated LLM with unified memory support
   vllm:
     build:
-      context: ../../deploy/services/vllm
+      context: ../services/vllm
       dockerfile: Dockerfile
     container_name: vllm-service
     ports:
       - '8001:8001'
+    ipc: host
+    ulimits:
+      memlock: -1
+      stack: 67108864
+    shm_size: '16gb'
     environment:
-      # Model configuration
-      - VLLM_MODEL=meta-llama/Llama-3.2-3B-Instruct
+      - VLLM_MODEL=nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
       - VLLM_TENSOR_PARALLEL_SIZE=1
-      - VLLM_MAX_MODEL_LEN=4096
+      - VLLM_MAX_MODEL_LEN=32768
       - VLLM_GPU_MEMORY_UTILIZATION=0.9
-      # NVfp4 quantization settings
-      - VLLM_QUANTIZATION=fp8
-      - VLLM_KV_CACHE_DTYPE=fp8
-      # Service configuration
+      - VLLM_MAX_NUM_SEQS=32
+      - VLLM_MAX_NUM_BATCHED_TOKENS=32768
+      - VLLM_KV_CACHE_DTYPE=auto
       - VLLM_PORT=8001
       - VLLM_HOST=0.0.0.0
-      # Performance tuning
       - CUDA_VISIBLE_DEVICES=0
       - NCCL_DEBUG=INFO
+      - CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
+      - PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
+      - VLLM_CPU_OFFLOAD_GB=0
     volumes:
       - vllm_models:/app/models
       - /tmp:/tmp
-      # Mount model cache for faster startup
       - ~/.cache/huggingface:/root/.cache/huggingface
     networks:
       - default
@@ -121,21 +120,75 @@ services:
           count: 1
           capabilities: [gpu]
     healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:8001/v1/models"]
-      interval: 30s
-      timeout: 10s
-      retries: 5
-      start_period: 120s # Longer start period for model loading
+      test: ["CMD", "curl", "-f", "http://localhost:8001/health"]
+      interval: 60s
+      timeout: 30s
+      retries: 30
+      start_period: 1800s
+  # Optional: Vector search services
+  sentence-transformers:
+    build:
+      context: ../services/sentence-transformers
+      dockerfile: Dockerfile
+    ports:
+      - '8000:80'
+    environment:
+      - MODEL_NAME=all-MiniLM-L6-v2
+    networks:
+      - default
+    restart: unless-stopped
+    profiles:
+      - vector-search
+  qdrant:
+    image: qdrant/qdrant:latest
+    container_name: qdrant
+    ports:
+      - "6333:6333"
+      - "6334:6334"
+    volumes:
+      - qdrant_data:/qdrant/storage
+    networks:
+      - qdrant-net
+    restart: unless-stopped
+    profiles:
+      - vector-search
+  qdrant-init:
+    image: curlimages/curl:latest
+    depends_on:
+      - qdrant
+    restart: "no"
+    entrypoint: /bin/sh
+    command:
+      - -c
+      - |
+        echo 'Waiting for Qdrant to start...'
+        sleep 5
+        curl -X PUT http://qdrant:6333/collections/entity-embeddings \
+          -H 'Content-Type: application/json' \
+          -d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
+        curl -X PUT http://qdrant:6333/collections/document-embeddings \
+          -H 'Content-Type: application/json' \
+          -d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
+        echo 'Collections created'
+    networks:
+      - qdrant-net
+    profiles:
+      - vector-search
 volumes:
-  arangodb_data:
-  arangodb_apps_data:
+  neo4j_data:
+  neo4j_logs:
   vllm_models:
+  qdrant_data:
 networks:
-  pinecone-net:
-    name: pinecone
   default:
     driver: bridge
   txt2kg-network:
     driver: bridge
+  qdrant-net:
+    name: qdrant-network
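The `qdrant-init` one-shot container above just issues idempotent PUTs against Qdrant's collections API. The same collection spec can be built and inspected in a few lines of Python (endpoint and collection names as in the compose file):

```python
import json

def collection_spec(size: int = 384, distance: str = "Cosine") -> dict:
    """Body for Qdrant's PUT /collections/<name>, matching the init container."""
    return {"vectors": {"size": size, "distance": distance}}

# Same two collections the init container creates; 384 matches all-MiniLM-L6-v2.
for name in ("entity-embeddings", "document-embeddings"):
    body = json.dumps(collection_spec())
    print(f"PUT http://qdrant:6333/collections/{name} {body}")
```

Because a PUT to an existing collection of the same shape is harmless, rerunning the init container (or this sketch against a live instance) is safe, which is why the compose command tolerates failures with `|| true`.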

View File

@@ -1,3 +1,12 @@
+# txt2kg Docker Compose - ArangoDB + Ollama (Default)
+#
+# Default stack tested and working on DGX Spark
+#
+# Usage:
+#   ./start.sh                  # Default: ArangoDB + Ollama
+#   ./start.sh --vector-search  # Add Qdrant + Sentence Transformers
+#
+# For Neo4j + vLLM, use: ./start.sh --vllm
 services:
   app:
@@ -7,21 +16,32 @@ services:
     ports:
       - '3001:3000'
     environment:
+      # ArangoDB configuration
       - ARANGODB_URL=http://arangodb:8529
       - ARANGODB_DB=txt2kg
+      - GRAPH_DB_TYPE=arangodb
+      # Disable Neo4j
+      - NEO4J_URI=bolt://localhost:7687
+      - NEO4J_USER=neo4j
+      - NEO4J_PASSWORD=password123
+      # Ollama configuration
+      - OLLAMA_BASE_URL=http://ollama:11434/v1
+      - OLLAMA_MODEL=llama3.1:8b
+      # Disable vLLM
+      - VLLM_BASE_URL=http://localhost:8001/v1
+      - VLLM_MODEL=disabled
+      # Vector DB configuration
       - QDRANT_URL=http://qdrant:6333
       - VECTOR_DB_TYPE=qdrant
+      # Embeddings configuration
       - LANGCHAIN_TRACING_V2=true
       - SENTENCE_TRANSFORMER_URL=http://sentence-transformers:80
       - MODEL_NAME=all-MiniLM-L6-v2
       - EMBEDDINGS_API_URL=http://sentence-transformers:80
+      # Other settings
       - GRPC_SSL_CIPHER_SUITES=HIGH+ECDSA:HIGH+aRSA
       - NODE_TLS_REJECT_UNAUTHORIZED=0
-      - OLLAMA_BASE_URL=http://ollama:11434/v1
-      - OLLAMA_MODEL=llama3.1:8b
-      - REMOTE_WEBGPU_SERVICE_URL=http://txt2kg-remote-webgpu:8083
       - NVIDIA_API_KEY=${NVIDIA_API_KEY:-}
+      # Node.js timeout configurations for large model processing
       - NODE_OPTIONS=--max-http-header-size=80000
       - UV_THREADPOOL_SIZE=128
       - HTTP_TIMEOUT=1800000
@@ -29,12 +49,14 @@ services:
     networks:
       - default
       - txt2kg-network
-      - pinecone-net
+      - qdrant-net
     depends_on:
-      - arangodb
-      - ollama
-    # Optional: sentence-transformers and entity-embeddings are only needed for vector search
-    # Traditional graph search works without these services
+      arangodb:
+        condition: service_started
+      ollama:
+        condition: service_started
+  # ArangoDB - Graph database
   arangodb:
     image: arangodb:latest
     ports:
@@ -44,6 +66,11 @@ services:
     volumes:
       - arangodb_data:/var/lib/arangodb3
       - arangodb_apps_data:/var/lib/arangodb3-apps
+    networks:
+      - default
+    restart: unless-stopped
+  # ArangoDB initialization - create database
   arangodb-init:
     image: arangodb:latest
     depends_on:
@@ -57,6 +84,10 @@ services:
       echo 'Creating txt2kg database...' &&
       arangosh --server.endpoint tcp://arangodb:8529 --server.authentication false --javascript.execute-string 'try { db._createDatabase(\"txt2kg\"); console.log(\"Database txt2kg created successfully!\"); } catch(e) { if(e.message.includes(\"duplicate\")) { console.log(\"Database txt2kg already exists\"); } else { throw e; } }'
       "
+    networks:
+      - default
+  # Ollama - Local LLM inference
   ollama:
     build:
       context: ../services/ollama
@ -68,13 +99,16 @@ services:
volumes: volumes:
- ollama_data:/root/.ollama - ollama_data:/root/.ollama
environment: environment:
- NVIDIA_VISIBLE_DEVICES=all # Make all GPUs visible to the container - NVIDIA_VISIBLE_DEVICES=all
- NVIDIA_DRIVER_CAPABILITIES=compute,utility # Required capabilities for CUDA - NVIDIA_DRIVER_CAPABILITIES=compute,utility
- OLLAMA_FLASH_ATTENTION=1 # Enable flash attention for better performance - CUDA_VISIBLE_DEVICES=0
- OLLAMA_KEEP_ALIVE=30m # Keep models loaded for 30 minutes - OLLAMA_FLASH_ATTENTION=1
- OLLAMA_NUM_PARALLEL=4 # Process 4 requests in parallel - DGX Spark has unified memory - OLLAMA_KEEP_ALIVE=30m
- OLLAMA_MAX_LOADED_MODELS=1 # Load only one model at a time to avoid VRAM contention - OLLAMA_NUM_PARALLEL=4
- OLLAMA_KV_CACHE_TYPE=q8_0 # Reduce KV cache VRAM usage with minimal performance impact - OLLAMA_MAX_LOADED_MODELS=1
- OLLAMA_KV_CACHE_TYPE=q8_0
- OLLAMA_GPU_LAYERS=-1
- OLLAMA_LLM_LIBRARY=cuda_v13
networks: networks:
- default - default
restart: unless-stopped restart: unless-stopped
@ -92,8 +126,7 @@ services:
retries: 3 retries: 3
start_period: 60s start_period: 60s
# Optional services for vector search (NOT required for traditional graph search) # Optional: Vector search services
# Traditional graph search works with just: app, arangodb, and ollama
sentence-transformers: sentence-transformers:
build: build:
context: ../services/sentence-transformers context: ../services/sentence-transformers
@ -106,7 +139,8 @@ services:
- default - default
restart: unless-stopped restart: unless-stopped
profiles: profiles:
- vector-search # Only start with: docker compose --profile vector-search up - vector-search
qdrant: qdrant:
image: qdrant/qdrant:latest image: qdrant/qdrant:latest
container_name: qdrant container_name: qdrant
@ -116,10 +150,11 @@ services:
volumes: volumes:
- qdrant_data:/qdrant/storage - qdrant_data:/qdrant/storage
networks: networks:
- pinecone-net - qdrant-net
restart: unless-stopped restart: unless-stopped
profiles: profiles:
- vector-search # Only start with: docker compose --profile vector-search up - vector-search
qdrant-init: qdrant-init:
image: curlimages/curl:latest image: curlimages/curl:latest
depends_on: depends_on:
@ -131,32 +166,15 @@ services:
- | - |
echo 'Waiting for Qdrant to start...' echo 'Waiting for Qdrant to start...'
sleep 5 sleep 5
echo 'Checking if entity-embeddings collection exists...' curl -X PUT http://qdrant:6333/collections/entity-embeddings \
RESPONSE=$(curl -s http://qdrant:6333/collections/entity-embeddings) -H 'Content-Type: application/json' \
if echo "$RESPONSE" | grep -q '"status":"ok"'; then -d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
echo 'entity-embeddings collection already exists' curl -X PUT http://qdrant:6333/collections/document-embeddings \
else -H 'Content-Type: application/json' \
echo 'Creating collection entity-embeddings...' -d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
curl -X PUT http://qdrant:6333/collections/entity-embeddings \ echo 'Collections created'
-H 'Content-Type: application/json' \
-d '{"vectors":{"size":384,"distance":"Cosine"}}'
echo ''
echo 'entity-embeddings collection created successfully'
fi
echo 'Checking if document-embeddings collection exists...'
RESPONSE=$(curl -s http://qdrant:6333/collections/document-embeddings)
if echo "$RESPONSE" | grep -q '"status":"ok"'; then
echo 'document-embeddings collection already exists'
else
echo 'Creating collection document-embeddings...'
curl -X PUT http://qdrant:6333/collections/document-embeddings \
-H 'Content-Type: application/json' \
-d '{"vectors":{"size":384,"distance":"Cosine"}}'
echo ''
echo 'document-embeddings collection created successfully'
fi
networks: networks:
- pinecone-net - qdrant-net
profiles: profiles:
- vector-search - vector-search
@ -171,5 +189,5 @@ networks:
driver: bridge driver: bridge
txt2kg-network: txt2kg-network:
driver: bridge driver: bridge
pinecone-net: qdrant-net:
name: pinecone name: qdrant-network
View File
@@ -1,5 +1,5 @@
-# Use NVIDIA Triton Inference Server with vLLM - optimized for latest NVIDIA hardware
-FROM nvcr.io/nvidia/tritonserver:25.08-vllm-python-py3
+# Use official NVIDIA vLLM image - optimized for NVIDIA hardware
+FROM nvcr.io/nvidia/vllm:25.11-py3

 # Install curl for health checks
 RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
View File
@@ -21,17 +21,11 @@
 # Enable unified memory usage for DGX Spark
 export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
-export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
+export PYTORCH_ALLOC_CONF=expandable_segments:True

 # Enable CUDA unified memory and oversubscription
-export CUDA_VISIBLE_DEVICES=0
 export PYTORCH_NO_CUDA_MEMORY_CACHING=0

-# Force vLLM to use CPU offloading for large models
-export VLLM_CPU_OFFLOAD_GB=50
-export VLLM_ALLOW_RUNTIME_LORA_UPDATES_WITH_SGD_LORA=1
-export VLLM_SKIP_WARMUP=0

 # Optimized environment for performance
 export VLLM_LOGGING_LEVEL=INFO
 export PYTHONUNBUFFERED=1
@@ -39,8 +33,12 @@ export PYTHONUNBUFFERED=1
 # Enable CUDA optimizations
 export VLLM_USE_MODELSCOPE=false

-# Enable unified memory in vLLM
-export VLLM_USE_V1=0
+# Enable FP8 MoE optimizations for Nemotron and other MoE models
+export VLLM_USE_FLASHINFER_MOE_FP8=1
+export VLLM_USE_FLASHINFER_MOE_FP4=1
+
+# Enable FlashInfer attention backend for better performance
+export VLLM_ATTENTION_BACKEND=FLASHINFER

 # First, test basic CUDA functionality
 echo "=== Testing CUDA functionality ==="
@@ -64,68 +62,89 @@ if torch.cuda.is_available():
 "

 echo "=== Starting optimized vLLM server ==="

-# Optimized configuration for DGX Spark performance with NVFP4 quantization
-# Available quantized models from NVIDIA
-NVFP4_MODEL="nvidia/Llama-3.3-70B-Instruct-FP4"
-NVFP8_MODEL="nvidia/Llama-3.1-8B-Instruct-FP8"
-STANDARD_MODEL="meta-llama/Llama-3.1-70B-Instruct"

-# Check GPU compute capability for optimal quantization
+# Check GPU compute capability for optimal settings
 COMPUTE_CAPABILITY=$(nvidia-smi -i 0 --query-gpu=compute_cap --format=csv,noheader,nounits 2>/dev/null || echo "unknown")
 echo "Detected GPU compute capability: $COMPUTE_CAPABILITY"

-# Configure quantization based on GPU architecture
-if [[ "$COMPUTE_CAPABILITY" == "12.1" ]] || [[ "$COMPUTE_CAPABILITY" == "10.0" ]]; then
-    # Blackwell/DGX Spark architecture - use standard 70B model with CPU offloading
-    echo "Using standard Llama-3.1-70B model for Blackwell/DGX Spark with CPU offloading"
-    QUANTIZATION_FLAG=""
-    MODEL_TO_USE="$STANDARD_MODEL" # Use standard 70B model
-    GPU_MEMORY_UTIL="0.7" # Lower GPU memory to allow unified memory
-    MAX_MODEL_LEN="4096" # Shorter sequences for memory efficiency
-    MAX_NUM_SEQS="16" # Lower concurrent sequences for 70B
-    MAX_BATCHED_TOKENS="4096"
-    CPU_OFFLOAD_GB="50" # Offload 50GB to CPU/unified memory
-elif [[ "$COMPUTE_CAPABILITY" == "9.0" ]]; then
-    # Hopper architecture - use standard model
-    echo "Using standard 70B model for Hopper architecture"
-    QUANTIZATION_FLAG=""
-    MODEL_TO_USE="$STANDARD_MODEL"
-    GPU_MEMORY_UTIL="0.7"
-    MAX_MODEL_LEN="4096"
-    MAX_NUM_SEQS="16"
-    MAX_BATCHED_TOKENS="4096"
-    CPU_OFFLOAD_GB="40"
+# Use environment variable if set, otherwise default to Qwen (not gated)
+if [ -n "$VLLM_MODEL" ]; then
+    MODEL_TO_USE="$VLLM_MODEL"
+    echo "Using model from environment: $MODEL_TO_USE"
 else
-    # Other architectures - use standard precision
-    echo "Using standard 70B model for GPU architecture: $COMPUTE_CAPABILITY"
-    QUANTIZATION_FLAG=""
-    MODEL_TO_USE="$STANDARD_MODEL"
-    GPU_MEMORY_UTIL="0.7"
-    MAX_MODEL_LEN="2048"
-    MAX_NUM_SEQS="16"
-    MAX_BATCHED_TOKENS="2048"
-    CPU_OFFLOAD_GB="40"
+    # Default to Qwen 2.5 7B - not gated, no HuggingFace token required
+    MODEL_TO_USE="Qwen/Qwen2.5-7B-Instruct"
+    echo "Using default model: $MODEL_TO_USE"
 fi

-echo "Using model: $MODEL_TO_USE"
-echo "Quantization: ${QUANTIZATION_FLAG:-'disabled'}"
+# Configure settings based on model size and GPU architecture
+# Check if using 8B or smaller model
+if [[ "$MODEL_TO_USE" == *"8B"* ]] || [[ "$MODEL_TO_USE" == *"7B"* ]] || [[ "$MODEL_TO_USE" == *"3B"* ]] || [[ "$MODEL_TO_USE" == *"1B"* ]]; then
+    echo "Configuring for smaller model (8B or less)"
+    QUANTIZATION_FLAG=""
+    GPU_MEMORY_UTIL="${VLLM_GPU_MEMORY_UTILIZATION:-0.9}"
+    MAX_MODEL_LEN="${VLLM_MAX_MODEL_LEN:-8192}"
+    MAX_NUM_SEQS="${VLLM_MAX_NUM_SEQS:-64}"
+    MAX_BATCHED_TOKENS="${VLLM_MAX_NUM_BATCHED_TOKENS:-8192}"
+    CPU_OFFLOAD_GB="${VLLM_CPU_OFFLOAD_GB:-0}"
+elif [[ "$COMPUTE_CAPABILITY" == "12.1" ]] || [[ "$COMPUTE_CAPABILITY" == "10.0" ]]; then
+    # Blackwell/DGX Spark architecture with larger model - use CPU offloading
+    echo "Configuring for large model on Blackwell/DGX Spark with CPU offloading"
+    QUANTIZATION_FLAG=""
+    GPU_MEMORY_UTIL="${VLLM_GPU_MEMORY_UTILIZATION:-0.7}"
+    MAX_MODEL_LEN="${VLLM_MAX_MODEL_LEN:-4096}"
+    MAX_NUM_SEQS="${VLLM_MAX_NUM_SEQS:-16}"
+    MAX_BATCHED_TOKENS="${VLLM_MAX_NUM_BATCHED_TOKENS:-4096}"
+    CPU_OFFLOAD_GB="${VLLM_CPU_OFFLOAD_GB:-50}"
+else
+    # Other architectures with larger model
+    echo "Configuring for large model on GPU architecture: $COMPUTE_CAPABILITY"
+    QUANTIZATION_FLAG=""
+    GPU_MEMORY_UTIL="${VLLM_GPU_MEMORY_UTILIZATION:-0.7}"
+    MAX_MODEL_LEN="${VLLM_MAX_MODEL_LEN:-4096}"
+    MAX_NUM_SEQS="${VLLM_MAX_NUM_SEQS:-16}"
+    MAX_BATCHED_TOKENS="${VLLM_MAX_NUM_BATCHED_TOKENS:-4096}"
+    CPU_OFFLOAD_GB="${VLLM_CPU_OFFLOAD_GB:-40}"
+fi

+echo ""
+echo "=== vLLM Configuration ==="
+echo "Model: $MODEL_TO_USE"
 echo "GPU memory utilization: $GPU_MEMORY_UTIL"
+echo "Max model length: $MAX_MODEL_LEN"
+echo "Max num seqs: $MAX_NUM_SEQS"
+echo "Max batched tokens: $MAX_BATCHED_TOKENS"
 echo "CPU Offload: ${CPU_OFFLOAD_GB}GB"
+echo "Quantization: ${QUANTIZATION_FLAG:-'none'}"
+echo ""

-vllm serve "$MODEL_TO_USE" \
+# Build command - only add cpu-offload-gb if > 0
+VLLM_CMD="vllm serve $MODEL_TO_USE \
     --host 0.0.0.0 \
     --port 8001 \
     --tensor-parallel-size 1 \
-    --max-model-len "$MAX_MODEL_LEN" \
-    --max-num-seqs "$MAX_NUM_SEQS" \
-    --max-num-batched-tokens "$MAX_BATCHED_TOKENS" \
-    --gpu-memory-utilization "$GPU_MEMORY_UTIL" \
-    --cpu-offload-gb "$CPU_OFFLOAD_GB" \
+    --max-model-len $MAX_MODEL_LEN \
+    --max-num-seqs $MAX_NUM_SEQS \
+    --gpu-memory-utilization $GPU_MEMORY_UTIL \
     --kv-cache-dtype auto \
     --trust-remote-code \
-    --served-model-name "$MODEL_TO_USE" \
-    --enable-chunked-prefill \
-    --disable-custom-all-reduce \
-    --disable-async-output-proc \
-    $QUANTIZATION_FLAG
+    --served-model-name $MODEL_TO_USE"
+
+# Note: For FP8 models, vLLM auto-detects quantization from model config
+# No need to specify --dtype float8 (not supported in vLLM 0.11.0)
+if [[ "$MODEL_TO_USE" == *"FP8"* ]] || [[ "$MODEL_TO_USE" == *"fp8"* ]]; then
+    echo "Detected FP8 model - vLLM will auto-detect FP8 quantization from model config"
+fi
+
+# Add CPU offload only for larger models
+if [ "$CPU_OFFLOAD_GB" -gt 0 ] 2>/dev/null; then
+    VLLM_CMD="$VLLM_CMD --cpu-offload-gb $CPU_OFFLOAD_GB"
+fi
+
+# Add quantization if specified
+if [ -n "$QUANTIZATION_FLAG" ]; then
+    VLLM_CMD="$VLLM_CMD $QUANTIZATION_FLAG"
+fi
+
+echo "Running: $VLLM_CMD"
+exec $VLLM_CMD
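The size-based branching the new script introduces keys purely on globbing the model name, so it can be exercised in isolation. A minimal sketch, assuming nothing beyond the glob patterns above — `pick_limits` is a hypothetical helper name, not part of the actual script:

```shell
# Reproduce the launch script's size-based branching in isolation.
# Small models (*8B*/*7B*/*3B*/*1B*) get more concurrent sequences;
# everything else falls through to the conservative large-model limits.
pick_limits() {
    case "$1" in
        *8B*|*7B*|*3B*|*1B*) echo 64 ;;  # small-model MAX_NUM_SEQS default
        *)                   echo 16 ;;  # large-model MAX_NUM_SEQS default
    esac
}

pick_limits "Qwen/Qwen2.5-7B-Instruct"
pick_limits "meta-llama/Llama-3.1-70B-Instruct"
```

Note the 49B Nemotron default used elsewhere in this commit matches none of the small-model globs, so it takes the large-model path with CPU offloading.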
View File
@@ -18,7 +18,7 @@ This directory contains the Next.js frontend application for the txt2kg project.
 - **lib/**: Utility functions and shared logic
   - LLM service (Ollama, vLLM, NVIDIA API integration)
   - Graph database services (ArangoDB, Neo4j)
-  - Pinecone vector database integration
+  - Qdrant vector database integration
   - RAG service for knowledge graph querying
 - **public/**: Static assets
 - **types/**: TypeScript type definitions for graph data structures
@@ -76,7 +76,7 @@ Required environment variables are configured in docker-compose files:
 - `OLLAMA_BASE_URL`: Ollama API endpoint
 - `VLLM_BASE_URL`: vLLM API endpoint (optional)
 - `NVIDIA_API_KEY`: NVIDIA API key (optional)
-- `PINECONE_HOST`: Local Pinecone host (optional)
+- `QDRANT_URL`: Qdrant vector database URL (optional)
 - `SENTENCE_TRANSFORMER_URL`: Embeddings service URL (optional)

 ## Features
@@ -86,4 +86,4 @@ Required environment variables are configured in docker-compose files:
 - **RAG Queries**: Query knowledge graphs with retrieval-augmented generation
 - **Multiple LLM Providers**: Support for Ollama, vLLM, and NVIDIA API
 - **GPU-Accelerated Rendering**: Optional PyGraphistry integration for large graphs
-- **Vector Search**: Pinecone integration for semantic search
+- **Vector Search**: Qdrant integration for semantic search
View File
@@ -21,7 +21,7 @@ import { getGraphDbType } from '../settings/route';

 /**
  * Remote backend API that provides endpoints for creating and querying a knowledge graph
- * using the selected graph database, Pinecone, and SentenceTransformer
+ * using the selected graph database, Qdrant, and SentenceTransformer
  */

 /**
View File
@@ -56,24 +56,24 @@ export async function POST(request: NextRequest) {
     console.log(`Generated ${embeddings.length} embeddings`);

     // Initialize QdrantService
-    const pineconeService = QdrantService.getInstance();
+    const qdrantService = QdrantService.getInstance();

     // Check if Qdrant server is running
-    const isPineconeRunning = await pineconeService.isQdrantRunning();
-    if (!isPineconeRunning) {
+    const isQdrantRunning = await qdrantService.isQdrantRunning();
+    if (!isQdrantRunning) {
       return NextResponse.json(
         { error: 'Qdrant server is not available. Please make sure it is running.' },
         { status: 503 }
       );
     }

-    if (!pineconeService.isInitialized()) {
+    if (!qdrantService.isInitialized()) {
       try {
-        await pineconeService.initialize();
+        await qdrantService.initialize();
       } catch (initError) {
-        console.error('Error initializing Pinecone:', initError);
+        console.error('Error initializing Qdrant:', initError);
         return NextResponse.json(
-          { error: `Failed to initialize Pinecone: ${initError instanceof Error ? initError.message : String(initError)}` },
+          { error: `Failed to initialize Qdrant: ${initError instanceof Error ? initError.message : String(initError)}` },
           { status: 500 }
         );
       }
@@ -89,13 +89,13 @@ export async function POST(request: NextRequest) {
       textContent.set(chunkIds[i], chunks[i]);
     }

-    // Store embeddings in PineconeService with retry logic
+    // Store embeddings in Qdrant with retry logic
     try {
-      await pineconeService.storeEmbeddings(entityEmbeddings, textContent);
+      await qdrantService.storeEmbeddings(entityEmbeddings, textContent);
     } catch (storeError) {
-      console.error('Error storing embeddings in Pinecone:', storeError);
+      console.error('Error storing embeddings in Qdrant:', storeError);
       return NextResponse.json(
-        { error: `Failed to store embeddings in Pinecone: ${storeError instanceof Error ? storeError.message : String(storeError)}` },
+        { error: `Failed to store embeddings in Qdrant: ${storeError instanceof Error ? storeError.message : String(storeError)}` },
         { status: 500 }
       );
     }
View File
@@ -132,9 +132,9 @@ export async function POST(req: NextRequest) {
       },
       body: JSON.stringify({
         text,
-        model: vllmModel || 'meta-llama/Llama-3.2-3B-Instruct',
+        model: vllmModel || process.env.VLLM_MODEL || 'nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8',
         temperature: 0.1,
-        maxTokens: 8192
+        maxTokens: 4096 // Reduced to leave room for input tokens in context
       })
     });
View File
@@ -88,13 +88,18 @@ async function ensureConnection(request?: NextRequest): Promise<GraphDBType> {
 /**
  * GET handler for retrieving graph data from the selected graph database
  */
 export async function GET(request: NextRequest) {
+  console.log('[graph-db GET] Request received');
   try {
     // Initialize with connection parameters
+    console.log('[graph-db GET] Ensuring connection...');
     const graphDbType = await ensureConnection(request);
+    console.log(`[graph-db GET] Using database type: ${graphDbType}`);
     const graphDbService = getGraphDbService(graphDbType);

     // Get graph data from the database
+    console.log('[graph-db GET] Fetching graph data...');
     const graphData = await graphDbService.getGraphData();
+    console.log(`[graph-db GET] Got ${graphData.nodes.length} nodes, ${graphData.relationships.length} relationships`);

     // Transform to format expected by the frontend
     const nodes = graphData.nodes.map(node => ({
View File
@@ -30,7 +30,7 @@ export async function GET(request: NextRequest) {
     // Initialize services with the correct graph database type
     const graphDbType = getGraphDbType();
     const graphDbService = getGraphDbService(graphDbType);
-    const pineconeService = QdrantService.getInstance();
+    const qdrantService = QdrantService.getInstance();

     // Initialize graph database if needed
     if (!graphDbService.isInitialized()) {
@@ -60,7 +60,7 @@ export async function GET(request: NextRequest) {
     // Get total triples (relationships)
     const totalTriples = graphData.relationships.length;

-    // Get vector stats from Pinecone if available
+    // Get vector stats from Qdrant if available
     let vectorStats = {
       totalVectors: 0,
       avgQueryTime: 0,
@@ -68,8 +68,8 @@ export async function GET(request: NextRequest) {
     };

     try {
-      await pineconeService.initialize();
-      const stats = await pineconeService.getStats();
+      await qdrantService.initialize();
+      const stats = await qdrantService.getStats();

       vectorStats = {
         totalVectors: stats.totalVectorCount || 0,
@@ -77,7 +77,7 @@ export async function GET(request: NextRequest) {
         avgRelevanceScore: stats.averageRelevanceScore || 0
       };
     } catch (error) {
-      console.warn('Could not fetch Pinecone stats:', error);
+      console.warn('Could not fetch Qdrant stats:', error);
     }

     // Get real query logs instead of mock data
View File
@@ -57,7 +57,7 @@ export async function POST(req: NextRequest) {
   console.log(`[${new Date().toISOString()}] /api/ollama: POST request received`);

   try {
-    const { text, model = 'qwen3:1.7b', temperature = 0.1, maxTokens = 8192 } = await req.json();
+    const { text, model = 'qwen3:1.7b', temperature = 0.1, maxTokens = 4096 } = await req.json();
     console.log(`[${new Date().toISOString()}] /api/ollama: Parsed body - model: ${model}, text length: ${text?.length || 0}, maxTokens: ${maxTokens}`);

     if (!text || typeof text !== 'string') {
View File
@@ -0,0 +1,32 @@
+//
+// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+// SPDX-License-Identifier: Apache-2.0
+//
+
+import { NextResponse } from 'next/server';
+
+/**
+ * Fetch available models from Ollama
+ * GET /api/ollama/tags
+ */
+export async function GET() {
+  const ollamaUrl = process.env.OLLAMA_BASE_URL || 'http://ollama:11434/v1';
+  // Convert /v1 URL to base URL for tags endpoint
+  const baseUrl = ollamaUrl.replace('/v1', '');
+
+  try {
+    const response = await fetch(`${baseUrl}/api/tags`, {
+      signal: AbortSignal.timeout(5000),
+    });
+
+    if (!response.ok) {
+      return NextResponse.json({ models: [] }, { status: 200 });
+    }
+
+    const data = await response.json();
+    return NextResponse.json(data);
+  } catch (error) {
+    // Return empty models array if Ollama is not available
+    return NextResponse.json({ models: [] }, { status: 200 });
+  }
+}
View File
@@ -1,21 +1,5 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
 import { NextRequest, NextResponse } from 'next/server';
-import { QdrantService } from '@/lib/qdrant';
+import { PineconeService } from '@/lib/pinecone';

 /**
  * Clear all data from the Pinecone vector database
@@ -23,7 +7,7 @@ import { QdrantService } from '@/lib/qdrant';
  */
 export async function POST() {
   // Get the Pinecone service instance
-  const pineconeService = QdrantService.getInstance();
+  const pineconeService = PineconeService.getInstance();

   // Clear all vectors from the database
   const deleteSuccess = await pineconeService.deleteAllEntities();
View File
@@ -1,21 +1,5 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
 import { NextResponse } from 'next/server';
-import { QdrantService } from '@/lib/qdrant';
+import { PineconeService } from '@/lib/pinecone';

 /**
  * Create Pinecone index API endpoint
@@ -24,7 +8,7 @@ import { QdrantService } from '@/lib/qdrant';
 export async function POST() {
   try {
     // Get the Pinecone service instance
-    const pineconeService = QdrantService.getInstance();
+    const pineconeService = PineconeService.getInstance();

     // Force re-initialization to create the index
     (pineconeService as any).initialized = false;
View File
@@ -1,21 +1,5 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
 import { NextRequest, NextResponse } from 'next/server';
-import { QdrantService } from '@/lib/qdrant';
+import { PineconeService } from '@/lib/pinecone';

 /**
  * Get Pinecone vector database stats
@@ -23,7 +7,7 @@ import { QdrantService } from '@/lib/qdrant';
 export async function GET() {
   try {
     // Initialize Pinecone service
-    const pineconeService = QdrantService.getInstance();
+    const pineconeService = PineconeService.getInstance();

     // We can now directly call getStats() which handles initialization and error recovery
     const stats = await pineconeService.getStats();
View File
@@ -19,7 +19,7 @@ import RAGService from '@/lib/rag';

 /**
  * API endpoint for RAG-based question answering
- * Uses Pinecone for document retrieval and LangChain for generation
+ * Uses Qdrant for document retrieval and LangChain for generation
  * POST /api/rag-query
  */
 export async function POST(req: NextRequest) {
View File
@@ -51,7 +51,7 @@ export async function POST(req: NextRequest) {
     // Optionally store in vector database
     if (sentenceEmbeddings.length > 0) {
       try {
-        // Map the embeddings to a format suitable for Pinecone
+        // Map the embeddings to a format suitable for Qdrant
         const embeddingsMap = new Map<string, number[]>();
         const textContentMap = new Map<string, string>();
         const metadataMap = new Map<string, any>();
@@ -64,9 +64,9 @@ export async function POST(req: NextRequest) {
           metadataMap.set(key, item.metadata);
         });

-        // Store in Pinecone
-        const pineconeService = QdrantService.getInstance();
-        await pineconeService.storeEmbeddingsWithMetadata(
+        // Store in Qdrant
+        const qdrantService = QdrantService.getInstance();
+        await qdrantService.storeEmbeddingsWithMetadata(
           embeddingsMap,
           textContentMap,
           metadataMap
View File
@ -17,8 +17,26 @@
import { NextRequest, NextResponse } from 'next/server'; import { NextRequest, NextResponse } from 'next/server';
import { GraphDBType } from '@/lib/graph-db-service'; import { GraphDBType } from '@/lib/graph-db-service';
// In-memory storage for settings // In-memory storage for settings - use lazy initialization for env vars
// because they're not available at build time, only at runtime
let serverSettings: Record<string, string> = {}; let serverSettings: Record<string, string> = {};
let settingsInitialized = false;
function ensureSettingsInitialized() {
if (!settingsInitialized) {
// Read environment variables at runtime, not build time
serverSettings = {
graph_db_type: process.env.GRAPH_DB_TYPE || 'arangodb',
neo4j_uri: process.env.NEO4J_URI || '',
neo4j_user: process.env.NEO4J_USER || process.env.NEO4J_USERNAME || '',
neo4j_password: process.env.NEO4J_PASSWORD || '',
arangodb_url: process.env.ARANGODB_URL || '',
arangodb_db: process.env.ARANGODB_DB || '',
};
settingsInitialized = true;
console.log(`[SETTINGS] Initialized at runtime with GRAPH_DB_TYPE: "${serverSettings.graph_db_type}"`);
}
}
/** /**
* API Route to sync client settings with server environment variables * API Route to sync client settings with server environment variables
@ -27,13 +45,16 @@ let serverSettings: Record<string, string> = {};
*/ */
export async function POST(request: NextRequest) { export async function POST(request: NextRequest) {
try { try {
// Ensure settings are initialized from env vars first
ensureSettingsInitialized();
const { settings } = await request.json();
if (!settings || typeof settings !== 'object') {
return NextResponse.json({ error: 'Settings object is required' }, { status: 400 });
}
- // Update server settings
+ // Update server settings (merge with existing)
serverSettings = { ...serverSettings, ...settings };
// Log some important settings for debugging
@ -58,6 +79,9 @@ export async function POST(request: NextRequest) {
*/
export async function GET(request: NextRequest) {
try {
+ // Ensure settings are initialized from env vars first
+ ensureSettingsInitialized();
const url = new URL(request.url);
const key = url.searchParams.get('key');
@ -84,12 +108,32 @@ export async function GET(request: NextRequest) {
* For use in other API routes
*/
export function getSetting(key: string): string | null {
+ ensureSettingsInitialized();
return serverSettings[key] || null;
}
/**
* Get the currently selected graph database type
+ * Priority: serverSettings > environment variable > default 'arangodb'
*/
export function getGraphDbType(): GraphDBType {
- return (serverSettings.graph_db_type as GraphDBType) || 'arangodb';
+ // Ensure settings are initialized from runtime environment variables
+ ensureSettingsInitialized();
+ // Check serverSettings (initialized from env vars or updated by client)
+ if (serverSettings.graph_db_type) {
+ console.log(`[getGraphDbType] Returning: "${serverSettings.graph_db_type}"`);
+ return serverSettings.graph_db_type as GraphDBType;
+ }
+ // Direct fallback to runtime environment variable
+ const envType = process.env.GRAPH_DB_TYPE;
+ if (envType) {
+ console.log(`[getGraphDbType] Returning from env: "${envType}"`);
+ return envType as GraphDBType;
+ }
+ // Default to arangodb for backwards compatibility
+ console.log(`[getGraphDbType] Returning default: "arangodb"`);
+ return 'arangodb';
}
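The resolution order used by the new `getGraphDbType()` (in-memory server settings first, then the `GRAPH_DB_TYPE` environment variable, then a hard default) can be sketched as a pure function. `resolveGraphDbType` and its parameters are illustrative names for this note, not exports of the commit:

```typescript
type GraphDBType = 'arangodb' | 'neo4j';

// Sketch of the three-level fallback: settings > env > default.
function resolveGraphDbType(
  serverSettings: Record<string, string>,
  env: Record<string, string | undefined>,
): GraphDBType {
  // 1. Client-updated (or env-seeded) server settings win.
  if (serverSettings.graph_db_type) {
    return serverSettings.graph_db_type as GraphDBType;
  }
  // 2. Fall back to the runtime environment variable.
  if (env.GRAPH_DB_TYPE) {
    return env.GRAPH_DB_TYPE as GraphDBType;
  }
  // 3. Default for backwards compatibility.
  return 'arangodb';
}
```

Keeping the precedence in one place like this makes the "settings beat environment" contract easy to unit-test, which is harder when the lookup is interleaved with logging and lazy initialization.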


@ -0,0 +1,44 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
import { NextRequest, NextResponse } from 'next/server';
import { QdrantService } from '@/lib/qdrant';
/**
* Clear all data from the Qdrant vector database
* POST /api/vector-db/clear
*/
export async function POST() {
// Get the Qdrant service instance
const qdrantService = QdrantService.getInstance();
// Clear all vectors from the database
const deleteSuccess = await qdrantService.deleteAllEntities();
// Get updated stats after clearing
const stats = await qdrantService.getStats();
// Return response based on operation success
return NextResponse.json({
success: deleteSuccess,
message: deleteSuccess
? 'Successfully cleared all data from Qdrant vector database'
: 'Failed to clear Qdrant database - service may not be available',
totalVectorCount: stats.totalVectorCount || 0,
httpHealthy: stats.httpHealthy || false
});
}


@ -0,0 +1,53 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
import { NextResponse } from 'next/server';
import { QdrantService } from '@/lib/qdrant';
/**
* Create Qdrant collection API endpoint
* POST /api/vector-db/create-collection
*/
export async function POST() {
try {
// Get the Qdrant service instance
const qdrantService = QdrantService.getInstance();
// Force re-initialization to create the collection
(qdrantService as any).initialized = false;
await qdrantService.initialize();
// Check if initialization was successful by getting stats
const stats = await qdrantService.getStats();
return NextResponse.json({
success: true,
message: 'Qdrant collection created successfully',
httpHealthy: stats.httpHealthy || false
});
} catch (error) {
console.error('Error creating Qdrant collection:', error);
return NextResponse.json(
{
success: false,
error: `Failed to create Qdrant collection: ${error instanceof Error ? error.message : String(error)}`
},
{ status: 500 }
);
}
}
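The re-initialization above reaches through the type system with `(qdrantService as any).initialized = false`. An alternative worth noting is to give the singleton an explicit `reset()` method so the route never needs the cast. This is an illustrative sketch (`VectorService` is a stand-in class, not the repo's `QdrantService`):

```typescript
// Singleton whose initialization can be explicitly re-armed,
// instead of flipping a private flag via an `any` cast.
class VectorService {
  private static instance: VectorService | null = null;
  private initialized = false;

  static getInstance(): VectorService {
    return (this.instance ??= new VectorService());
  }

  async initialize(): Promise<void> {
    if (this.initialized) return;
    // ...create the collection here...
    this.initialized = true;
  }

  /** Force the next initialize() call to run again. */
  reset(): void {
    this.initialized = false;
  }
}
```

The cast works, but it silently depends on the private field's name; a `reset()` method keeps that contract visible in the type signature.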


@ -0,0 +1,59 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
import { NextRequest, NextResponse } from 'next/server';
import { QdrantService } from '@/lib/qdrant';
/**
* Get Qdrant vector database stats
*/
export async function GET() {
try {
// Initialize Qdrant service
const qdrantService = QdrantService.getInstance();
// We can now directly call getStats() which handles initialization and error recovery
const stats = await qdrantService.getStats();
return NextResponse.json({
...stats,
timestamp: new Date().toISOString()
});
} catch (error) {
console.error('Error getting Qdrant stats:', error);
// Return a successful response with error information
// This prevents the UI from breaking when Qdrant is unavailable
let errorMessage = error instanceof Error ? error.message : String(error);
// More specific error message for 404 errors
if (errorMessage.includes('404')) {
errorMessage = 'Qdrant server returned 404. The server may not be running or the collection does not exist.';
}
return NextResponse.json(
{
error: `Failed to get Qdrant stats: ${errorMessage}`,
totalVectorCount: 0,
source: 'error',
httpHealthy: false,
timestamp: new Date().toISOString()
},
{ status: 200 } // Use 200 instead of 500 to avoid UI errors
);
}
}
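The stats route's error path deliberately answers HTTP 200 with a zeroed payload so the dashboard keeps rendering when Qdrant is down. That error-shaping step can be isolated as a small pure function; `buildStatsError` is an illustrative helper, not an export of the route:

```typescript
interface StatsErrorBody {
  error: string;
  totalVectorCount: number;
  source: 'error';
  httpHealthy: boolean;
}

// Map any thrown value to the degraded-but-renderable stats payload.
function buildStatsError(err: unknown): StatsErrorBody {
  let message = err instanceof Error ? err.message : String(err);
  // More specific error message for 404 errors, as in the route above.
  if (message.includes('404')) {
    message = 'Qdrant server returned 404. The server may not be running or the collection does not exist.';
  }
  return {
    error: `Failed to get Qdrant stats: ${message}`,
    totalVectorCount: 0,
    source: 'error',
    httpHealthy: false,
  };
}
```

Factoring the shaping out keeps the "always 200, always a count" contract testable independently of Next.js.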


@ -0,0 +1,40 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
import { NextResponse } from 'next/server';
/**
* Fetch available models from vLLM
* GET /api/vllm/models
*/
export async function GET() {
const vllmUrl = process.env.VLLM_BASE_URL || 'http://vllm:8001/v1';
try {
const response = await fetch(`${vllmUrl}/models`, {
signal: AbortSignal.timeout(5000),
});
if (!response.ok) {
return NextResponse.json({ models: [] }, { status: 200 });
}
const data = await response.json();
// vLLM returns OpenAI-compatible format: { data: [{ id: "model-name", ... }] }
if (data.data && Array.isArray(data.data)) {
const models = data.data.map((model: any) => ({
id: model.id,
name: model.id,
}));
return NextResponse.json({ models });
}
return NextResponse.json({ models: [] });
} catch (error) {
// Return empty models array if vLLM is not available
return NextResponse.json({ models: [] }, { status: 200 });
}
}
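The route above relies on vLLM's OpenAI-compatible `/models` payload, `{ data: [{ id: "model-name", ... }] }`. The mapping step can be sketched as a standalone parser; `parseModelList` is an illustrative name, not part of the commit:

```typescript
interface ModelOption {
  id: string;
  name: string;
}

// Extract { id, name } pairs from an OpenAI-style model list,
// tolerating a missing or malformed `data` field.
function parseModelList(payload: unknown): ModelOption[] {
  const data = (payload as { data?: unknown })?.data;
  if (!Array.isArray(data)) return [];
  return data
    .filter((m): m is { id: string } => typeof (m as { id?: unknown })?.id === 'string')
    .map((m) => ({ id: m.id, name: m.id }));
}
```

Returning `[]` for every malformed shape mirrors the route's choice to degrade to an empty model list rather than surface an error to the UI.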


@ -86,7 +86,7 @@ export async function GET(req: NextRequest) {
*/
export async function POST(req: NextRequest) {
try {
- const { text, model = 'meta-llama/Llama-3.2-3B-Instruct', temperature = 0.1, maxTokens = 1024 } = await req.json();
+ const { text, model = process.env.VLLM_MODEL || 'nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8', temperature = 0.1, maxTokens = 1024 } = await req.json();
if (!text || typeof text !== 'string') {
return NextResponse.json({ error: 'Text is required' }, { status: 400 });


@ -397,3 +397,88 @@ body {
/* Light mode: tune specific custom elements */
.light .glass-card:hover { box-shadow: 0 10px 18px -8px rgba(0,0,0,0.12) !important; }
.light .startup-tab-icon { box-shadow: 0 1px 3px rgba(0,0,0,0.06) !important; }
/* Progress bar indeterminate animation - smooth sliding with gradient shine */
@keyframes progress {
0% {
width: 0%;
margin-left: 0%;
}
50% {
width: 40%;
margin-left: 30%;
}
100% {
width: 0%;
margin-left: 100%;
}
}
.animate-progress {
animation: progress 1.8s ease-in-out infinite;
}
/* Progress bar shimmer effect for determinate progress */
@keyframes shimmer {
0% {
transform: translateX(-100%);
}
100% {
transform: translateX(100%);
}
}
.progress-shimmer {
position: relative;
overflow: hidden;
}
.progress-shimmer::after {
content: "";
position: absolute;
inset: 0;
background: linear-gradient(
90deg,
transparent 0%,
rgba(255, 255, 255, 0.15) 50%,
transparent 100%
);
animation: shimmer 2s ease-in-out infinite;
}
/* Enhanced skeleton shimmer with directional sweep */
@keyframes skeleton-shimmer {
0% {
background-position: -200% 0;
}
100% {
background-position: 200% 0;
}
}
.skeleton-shimmer {
background: linear-gradient(
90deg,
hsl(var(--muted)) 25%,
hsl(var(--muted-foreground) / 0.08) 50%,
hsl(var(--muted)) 75%
);
background-size: 200% 100%;
animation: skeleton-shimmer 1.5s ease-in-out infinite;
}
/* Pulse animation for status indicators */
@keyframes status-pulse {
0%, 100% {
opacity: 1;
transform: scale(1);
}
50% {
opacity: 0.6;
transform: scale(0.95);
}
}
.status-pulse {
animation: status-pulse 2s ease-in-out infinite;
}


@ -46,7 +46,6 @@ export default function Home() {
{ value: "edit", label: "Edit Knowledge Graph", Icon: Edit },
{ value: "visualize", label: "Visualize Graph", Icon: Network },
] as const;
- const activeIndex = Math.max(0, steps.findIndex(s => s.value === activeTab));
// Updated to use callback reference
const handleTabChange = React.useCallback((tab: string) => {
@ -84,8 +83,8 @@ export default function Home() {
<main className="container mx-auto px-6 py-12 border-b border-border/10">
- <Tabs defaultValue="upload" className="w-full mb-12" onValueChange={setActiveTab}>
+ <Tabs defaultValue="upload" className="w-full" onValueChange={setActiveTab}>
- <TabsList className="nvidia-build-tabs mb-12" aria-label="Workflow steps">
+ <TabsList className="nvidia-build-tabs mb-10" aria-label="Workflow steps">
{steps.map(({ value, label, Icon }) => (
<TabsTrigger
key={value}
@ -106,22 +105,22 @@
</TabsList>
{/* Step 1: Document Upload */}
- <TabsContent value="upload" className="space-y-8">
+ <TabsContent value="upload" className="nvidia-build-tab-content">
<UploadTab onTabChange={handleTabChange} />
</TabsContent>
{/* Step 2: Configure & Process */}
- <TabsContent value="configure" className="space-y-8">
+ <TabsContent value="configure" className="nvidia-build-tab-content">
<ConfigureTab />
</TabsContent>
{/* Step 3: Edit Knowledge */}
- <TabsContent value="edit" className="space-y-8">
+ <TabsContent value="edit" className="nvidia-build-tab-content">
<EditTab />
</TabsContent>
{/* Step 4: Visualize Knowledge Graph */}
- <TabsContent value="visualize" className="space-y-8">
+ <TabsContent value="visualize" className="nvidia-build-tab-content">
<VisualizeTab />
</TabsContent>
</Tabs>


@ -68,7 +68,7 @@ export default function RagPage() {
}
// Check if vector search is available
- const vectorResponse = await fetch('/api/pinecone-diag/stats');
+ const vectorResponse = await fetch('/api/vector-db/stats');
if (vectorResponse.ok) {
const data = await vectorResponse.json();
setVectorEnabled(data.totalVectorCount > 0);
@ -112,7 +112,7 @@ export default function RagPage() {
});
try {
- // If using pure RAG (Pinecone + LangChain) without graph search
+ // If using pure RAG (Qdrant + LangChain) without graph search
if (params.usePureRag) {
queryMode = 'pure-rag';
try {


@ -14,8 +14,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.
//
- import React, { useState } from "react";
+ import React, { useState, useRef, useEffect } from "react";
- import { ChevronDown, ChevronRight } from "lucide-react";
+ import { ChevronDown } from "lucide-react";
import { cn } from "@/lib/utils";
interface AdvancedOptionsProps {
@ -32,28 +32,57 @@ export function AdvancedOptions({
defaultOpen = false
}: AdvancedOptionsProps) {
const [isOpen, setIsOpen] = useState(defaultOpen);
+ const contentRef = useRef<HTMLDivElement>(null);
+ const [contentHeight, setContentHeight] = useState<number | undefined>(
+ defaultOpen ? undefined : 0
+ );
+ // Update content height when open state changes
+ useEffect(() => {
+ if (isOpen) {
+ const height = contentRef.current?.scrollHeight;
+ setContentHeight(height);
+ // After animation completes, set to auto for dynamic content
+ const timer = setTimeout(() => setContentHeight(undefined), 200);
+ return () => clearTimeout(timer);
+ } else {
+ // First set to current height, then to 0 for smooth collapse
+ setContentHeight(contentRef.current?.scrollHeight);
+ requestAnimationFrame(() => setContentHeight(0));
+ }
+ }, [isOpen]);
return (
<div className={cn("border rounded-md overflow-hidden", className)}>
- <div
+ <button
+ type="button"
- className="flex items-center justify-between p-3 bg-muted/30 cursor-pointer hover:bg-muted/50 transition-colors"
+ className="w-full flex items-center justify-between p-3 bg-muted/30 cursor-pointer hover:bg-muted/50 transition-colors focus-visible:ring-2 focus-visible:ring-nvidia-green focus-visible:ring-inset"
onClick={() => setIsOpen(!isOpen)}
+ aria-expanded={isOpen}
+ aria-controls="advanced-options-content"
>
<h3 className="text-sm font-medium flex items-center">
- {isOpen ? (
- <ChevronDown className="h-4 w-4 mr-2" />
- ) : (
- <ChevronRight className="h-4 w-4 mr-2" />
- )}
+ <ChevronDown
+ className={cn(
+ "h-4 w-4 mr-2 transition-transform duration-200",
+ !isOpen && "-rotate-90"
+ )}
+ />
{title}
</h3>
- </div>
+ </button>
- {isOpen && (
+ <div
+ id="advanced-options-content"
+ ref={contentRef}
+ className="overflow-hidden transition-all duration-200 ease-out"
+ style={{ height: contentHeight !== undefined ? contentHeight : 'auto' }}
+ aria-hidden={!isOpen}
+ >
<div className="p-4 border-t border-border/50">
{children}
</div>
- )}
+ </div>
</div>
);
}


@ -57,24 +57,34 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
setGraphError(null)
try {
- // Get database type from localStorage
+ // Get database type from localStorage, fall back to fetching from server
- const graphDbType = localStorage.getItem("graph_db_type") || "arangodb"
+ let graphDbType = localStorage.getItem("graph_db_type")
+ if (!graphDbType) {
+ // Fetch server's default (from GRAPH_DB_TYPE env var)
+ try {
+ const settingsRes = await fetch('/api/settings')
+ const settingsData = await settingsRes.json()
+ graphDbType = settingsData.settings?.graph_db_type || 'neo4j'
+ } catch {
+ graphDbType = 'neo4j'
+ }
+ }
setDbType(graphDbType === "arangodb" ? "ArangoDB" : "Neo4j")
if (graphDbType === "neo4j") {
- // Neo4j connection logic
+ // Neo4j connection logic - use the unified graph-db endpoint
const dbUrl = localStorage.getItem("NEO4J_URL")
const dbUsername = localStorage.getItem("NEO4J_USERNAME")
const dbPassword = localStorage.getItem("NEO4J_PASSWORD")
- // Add query parameters if credentials exist
+ // Add query parameters with type=neo4j
const queryParams = new URLSearchParams()
+ queryParams.append("type", "neo4j")
if (dbUrl) queryParams.append("url", dbUrl)
if (dbUsername) queryParams.append("username", dbUsername)
if (dbPassword) queryParams.append("password", dbPassword)
- const queryString = queryParams.toString()
- const endpoint = queryString ? `/api/neo4j?${queryString}` : '/api/neo4j'
+ const endpoint = `/api/graph-db?${queryParams.toString()}`
const response = await fetch(endpoint)
@ -98,21 +108,21 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
setConnectionUrl(dbUrl)
}
} else {
- // ArangoDB connection logic
+ // ArangoDB connection logic - use the unified graph-db endpoint with type=arangodb
const arangoUrl = localStorage.getItem("arango_url") || "http://localhost:8529"
const arangoDb = localStorage.getItem("arango_db") || "txt2kg"
const arangoUser = localStorage.getItem("arango_user") || ""
const arangoPassword = localStorage.getItem("arango_password") || ""
- // Add query parameters if credentials exist
+ // Add query parameters with type=arangodb
const queryParams = new URLSearchParams()
+ queryParams.append("type", "arangodb")
if (arangoUrl) queryParams.append("url", arangoUrl)
if (arangoDb) queryParams.append("dbName", arangoDb)
if (arangoUser) queryParams.append("username", arangoUser)
if (arangoPassword) queryParams.append("password", arangoPassword)
- const queryString = queryParams.toString()
- const endpoint = queryString ? `/api/graph-db?${queryString}` : '/api/graph-db'
+ const endpoint = `/api/graph-db?${queryParams.toString()}`
const response = await fetch(endpoint)
@ -144,7 +154,8 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
// Disconnect from graph database
const disconnectGraph = async () => {
try {
- const graphDbType = localStorage.getItem("graph_db_type") || "arangodb"
+ // Use current dbType state which was already determined from server/localStorage
+ const graphDbType = dbType === "Neo4j" ? "neo4j" : "arangodb"
const endpoint = graphDbType === "neo4j" ? '/api/neo4j/disconnect' : '/api/graph-db/disconnect'
const response = await fetch(endpoint, {
@ -171,7 +182,7 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
// Fetch vector DB stats
const fetchVectorStats = async () => {
try {
- const response = await fetch('/api/pinecone-diag/stats');
+ const response = await fetch('/api/vector-db/stats');
const data = await response.json();
if (response.ok) {
@ -273,7 +284,7 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
try {
// Call API to clear the database
- const response = await fetch('/api/pinecone-diag/clear', {
+ const response = await fetch('/api/vector-db/clear', {
method: 'POST',
})
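Both branches of the connection check build the same shape of URL against the unified `/api/graph-db` endpoint: a mandatory `type` parameter plus whichever credentials are actually set. That pattern can be factored as a small helper; `buildGraphDbEndpoint` is an illustrative name, not code from this commit:

```typescript
// Build the unified graph-db endpoint URL, appending only
// credentials that are present (mirroring the component above).
function buildGraphDbEndpoint(
  type: 'neo4j' | 'arangodb',
  credentials: Record<string, string | null>,
): string {
  const params = new URLSearchParams();
  params.append('type', type);
  for (const [key, value] of Object.entries(credentials)) {
    if (value) params.append(key, value);
  }
  return `/api/graph-db?${params.toString()}`;
}
```

`URLSearchParams` percent-encodes the credential values, so URLs such as `bolt://...` survive being passed as a query parameter.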


@ -28,6 +28,16 @@ import {
DialogHeader,
DialogTitle,
} from "@/components/ui/dialog"
+ import {
+ AlertDialog,
+ AlertDialogAction,
+ AlertDialogCancel,
+ AlertDialogContent,
+ AlertDialogDescription,
+ AlertDialogFooter,
+ AlertDialogHeader,
+ AlertDialogTitle,
+ } from "@/components/ui/alert-dialog"
import { Button } from "@/components/ui/button"
import type { Triple } from "@/utils/text-processing"
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
@ -45,6 +55,10 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
const [editableTriples, setEditableTriples] = useState<Triple[]>([])
const [editingTripleIndex, setEditingTripleIndex] = useState<number | null>(null)
+ // Delete confirmation dialog state
+ const [showDeleteDialog, setShowDeleteDialog] = useState(false)
+ const [deleteTarget, setDeleteTarget] = useState<{ type: 'single' | 'multiple', docId?: string, docName?: string } | null>(null)
// Use shift-select hook for document selection
const {
selectedItems: selectedDocuments,
@ -63,11 +77,32 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
const handleDeleteSelected = () => {
if (selectedDocuments.length === 0) return
+ setDeleteTarget({ type: 'multiple' })
+ setShowDeleteDialog(true)
+ }
- if (confirm(`Are you sure you want to delete ${selectedDocuments.length} selected document(s)?`)) {
+ const handleConfirmDelete = () => {
+ if (!deleteTarget) return
+ if (deleteTarget.type === 'multiple') {
deleteDocuments(selectedDocuments)
setSelectedDocuments([])
+ toast({
+ title: "Documents Deleted",
+ description: `Successfully deleted ${selectedDocuments.length} document(s).`,
+ duration: 3000,
+ })
+ } else if (deleteTarget.type === 'single' && deleteTarget.docId) {
+ deleteDocuments([deleteTarget.docId])
+ toast({
+ title: "Document Deleted",
+ description: `"${deleteTarget.docName}" has been deleted.`,
+ duration: 3000,
+ })
}
+ setShowDeleteDialog(false)
+ setDeleteTarget(null)
}
const openTriplesDialog = (documentId: string) => {
@ -249,6 +284,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
openTriplesDialog(doc.id);
}}
className="p-2 text-nvidia-green hover:bg-nvidia-green/10 rounded-lg transition-colors"
+ aria-label={`View and edit ${doc.triples?.length || 0} triples for ${doc.name}`}
title="View and edit triples"
>
<Eye className="h-4 w-4" />
@ -269,6 +305,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
// Create a simple info modal or tooltip showing document details
}}
className="p-2 text-muted-foreground hover:text-nvidia-green hover:bg-nvidia-green/10 rounded-lg transition-colors"
+ aria-label={`View info for ${doc.name}`}
title="View document info"
>
<Info className="h-4 w-4" />
@ -294,6 +331,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
}
}}
className="p-2 text-muted-foreground hover:text-nvidia-green hover:bg-nvidia-green/10 rounded-lg transition-colors"
+ aria-label={`Download ${doc.name}`}
title="Download document"
>
<Download className="h-4 w-4" />
@ -301,11 +339,11 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
<button
onClick={(e) => {
e.stopPropagation()
- if (confirm(`Are you sure you want to delete ${doc.name}?`)) {
- deleteDocuments([doc.id])
- }
+ setDeleteTarget({ type: 'single', docId: doc.id, docName: doc.name })
+ setShowDeleteDialog(true)
}}
className="p-2 text-muted-foreground hover:text-red-500 hover:bg-red-500/10 rounded-lg transition-colors"
+ aria-label={`Delete ${doc.name}`}
title="Delete document"
>
<Trash2 className="h-4 w-4" />
@ -395,6 +433,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
<button
onClick={() => setEditingTripleIndex(null)}
className="p-1.5 text-primary hover:text-primary/80 hover:bg-primary/10 rounded-full transition-colors"
+ aria-label={`Save changes to triple: ${triple.subject} ${triple.predicate} ${triple.object}`}
title="Save"
>
<CheckCircle className="h-4 w-4" />
@ -403,6 +442,7 @@
<button
onClick={() => setEditingTripleIndex(index)}
className="p-1.5 text-muted-foreground hover:text-foreground hover:bg-muted/50 rounded-full transition-colors"
+ aria-label={`Edit triple: ${triple.subject} ${triple.predicate} ${triple.object}`}
title="Edit"
>
<Edit className="h-4 w-4" />
@ -411,6 +451,7 @@
<button
onClick={() => deleteTriple(index)}
className="p-1.5 text-muted-foreground hover:text-destructive hover:bg-destructive/10 rounded-full transition-colors"
+ aria-label={`Delete triple: ${triple.subject} ${triple.predicate} ${triple.object}`}
title="Delete"
>
<Trash2 className="h-4 w-4" />
@ -431,6 +472,40 @@
</div>
</DialogContent>
</Dialog>
+ {/* Delete Confirmation Dialog */}
+ <AlertDialog open={showDeleteDialog} onOpenChange={setShowDeleteDialog}>
+ <AlertDialogContent>
+ <AlertDialogHeader>
+ <AlertDialogTitle className="flex items-center gap-2">
+ <Trash2 className="h-5 w-5 text-destructive" />
+ Delete {deleteTarget?.type === 'multiple' ? 'Documents' : 'Document'}
+ </AlertDialogTitle>
+ <AlertDialogDescription>
+ {deleteTarget?.type === 'multiple' ? (
+ <>
+ Are you sure you want to delete <strong>{selectedDocuments.length}</strong> selected document{selectedDocuments.length !== 1 ? 's' : ''}?
+ This action cannot be undone.
+ </>
+ ) : (
+ <>
+ Are you sure you want to delete <strong>"{deleteTarget?.docName}"</strong>?
+ This action cannot be undone.
+ </>
+ )}
+ </AlertDialogDescription>
+ </AlertDialogHeader>
+ <AlertDialogFooter>
+ <AlertDialogCancel onClick={() => setDeleteTarget(null)}>Cancel</AlertDialogCancel>
+ <AlertDialogAction
+ onClick={handleConfirmDelete}
+ className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
+ >
+ Delete
+ </AlertDialogAction>
+ </AlertDialogFooter>
+ </AlertDialogContent>
+ </AlertDialog>
</div>
)
}
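The dialog state introduced here is effectively a small discriminated union (`{ type: 'multiple' }` vs `{ type: 'single', docId, docName }`). Resolving it to the toast text can be sketched as a pure function, which makes the two branches easy to test without rendering; `describeDeletion` is a hypothetical helper, not part of the commit:

```typescript
// Discriminated union mirroring the component's deleteTarget state.
type DeleteTarget =
  | { type: 'multiple' }
  | { type: 'single'; docId: string; docName: string };

// Produce the toast description for a confirmed deletion.
function describeDeletion(target: DeleteTarget, selectedCount: number): string {
  return target.type === 'multiple'
    ? `Successfully deleted ${selectedCount} document(s).`
    : `"${target.docName}" has been deleted.`;
}
```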


@ -19,6 +19,7 @@
import { Network, Zap } from "lucide-react"
import { useDocuments } from "@/contexts/document-context"
import { Loader2 } from "lucide-react"
+ import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
export function GraphActions() {
const { documents, processDocuments, isProcessing, openGraphVisualization } = useDocuments()
@ -50,34 +51,67 @@
}
}
+ // Helper to get tooltip content for disabled Process button
+ const getProcessTooltip = () => {
+ if (isProcessing) return "Processing in progress..."
+ if (!hasNewDocuments && documents.length === 0) return "Upload documents first to extract knowledge triples"
+ if (!hasNewDocuments) return "All documents have been processed"
+ return "Extract knowledge triples from uploaded documents"
+ }
+ // Helper to get tooltip content for disabled View Graph button
+ const getViewGraphTooltip = () => {
+ if (isProcessing) return "Wait for processing to complete"
+ if (!hasProcessedDocuments && documents.length === 0) return "Upload and process documents first"
+ if (!hasProcessedDocuments) return "Process documents first to generate knowledge triples"
+ return "Visualize the knowledge graph from extracted triples"
+ }
return (
- <div className="flex gap-3 items-center">
- <button
- className={`btn-primary ${!hasNewDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
- disabled={!hasNewDocuments || isProcessing}
- onClick={handleProcessDocuments}
- >
- {isProcessing ? (
- <>
- <Loader2 className="h-4 w-4 animate-spin" />
- Processing...
- </>
- ) : (
- <>
- <Zap className="h-4 w-4" />
- Process Documents
- </>
- )}
- </button>
- <button
- className={`btn-primary ${!hasProcessedDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
- disabled={!hasProcessedDocuments || isProcessing}
- onClick={() => openGraphVisualization()}
- >
- <Network className="h-4 w-4" />
- View Knowledge Graph
- </button>
- </div>
+ <TooltipProvider>
+ <div className="flex gap-3 items-center">
+ <Tooltip>
+ <TooltipTrigger asChild>
+ <button
+ className={`btn-primary ${!hasNewDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
+ disabled={!hasNewDocuments || isProcessing}
+ onClick={handleProcessDocuments}
+ >
+ {isProcessing ? (
+ <>
+ <Loader2 className="h-4 w-4 animate-spin" />
+ Processing...
+ </>
+ ) : (
+ <>
+ <Zap className="h-4 w-4" />
+ Process Documents
+ </>
+ )}
+ </button>
+ </TooltipTrigger>
+ <TooltipContent>
+ <p>{getProcessTooltip()}</p>
+ </TooltipContent>
+ </Tooltip>
+ <Tooltip>
+ <TooltipTrigger asChild>
+ <button
+ className={`btn-primary ${!hasProcessedDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
+ disabled={!hasProcessedDocuments || isProcessing}
+ onClick={() => openGraphVisualization()}
+ >
+ <Network className="h-4 w-4" />
+ View Knowledge Graph
+ </button>
+ </TooltipTrigger>
+ <TooltipContent>
+ <p>{getViewGraphTooltip()}</p>
+ </TooltipContent>
+ </Tooltip>
+ </div>
+ </TooltipProvider>
)
}


@ -17,7 +17,7 @@
"use client" "use client"
import { useState, useEffect } from "react" import { useState, useEffect } from "react"
import { ChevronDown, Cpu } from "lucide-react" import { ChevronDown, Cpu, Server, RefreshCw } from "lucide-react"
import { OllamaIcon } from "@/components/ui/ollama-icon" import { OllamaIcon } from "@/components/ui/ollama-icon"
interface LLMModel { interface LLMModel {
@ -28,15 +28,8 @@ interface LLMModel {
description?: string description?: string
} }
// Default models // NVIDIA API models (always available if API key is set)
const DEFAULT_MODELS: LLMModel[] = [ const NVIDIA_MODELS: LLMModel[] = [
{
id: "ollama-llama3.1:8b",
name: "Llama 3.1 8B",
model: "llama3.1:8b",
provider: "ollama",
description: "Local Ollama model"
},
{ {
id: "nvidia-nemotron-super", id: "nvidia-nemotron-super",
name: "Nemotron Super 49B", name: "Nemotron Super 49B",
@ -54,51 +47,100 @@ const DEFAULT_MODELS: LLMModel[] = [
] ]
 export function LLMSelectorCompact() {
-  const [models, setModels] = useState<LLMModel[]>(DEFAULT_MODELS)
-  const [selectedModel, setSelectedModel] = useState<LLMModel>(DEFAULT_MODELS[0])
+  const [models, setModels] = useState<LLMModel[]>([])
+  const [selectedModel, setSelectedModel] = useState<LLMModel | null>(null)
   const [isOpen, setIsOpen] = useState(false)
+  const [isLoading, setIsLoading] = useState(true)

-  // Load Ollama models from settings
-  useEffect(() => {
-    try {
-      const selectedOllamaModels = localStorage.getItem("selected_ollama_models")
-      if (selectedOllamaModels) {
-        const modelNames: string[] = JSON.parse(selectedOllamaModels)
-        const ollamaModels: LLMModel[] = modelNames.map(name => ({
-          id: `ollama-${name}`,
-          name: name,
-          model: name,
-          provider: "ollama",
-          description: "Local Ollama model"
-        }))
-
-        // Combine with default models, avoiding duplicates
-        const defaultOllamaIds = DEFAULT_MODELS
-          .filter(m => m.provider === "ollama")
-          .map(m => m.model)
-        const uniqueOllamaModels = ollamaModels.filter(
-          m => !defaultOllamaIds.includes(m.model)
-        )
-        const allModels = [...DEFAULT_MODELS, ...uniqueOllamaModels]
-        setModels(allModels)
-      }
-    } catch (error) {
-      console.error("Error loading Ollama models:", error)
-    }
-  }, [])
-
-  // Load selected model from localStorage
-  useEffect(() => {
-    try {
-      const saved = localStorage.getItem("selectedModelForRAG")
-      if (saved) {
-        const savedModel: LLMModel = JSON.parse(saved)
-        setSelectedModel(savedModel)
-      }
-    } catch (error) {
-      console.error("Error loading selected model:", error)
-    }
+  // Fetch available models from running backends
+  const fetchAvailableModels = async () => {
+    setIsLoading(true)
+    const availableModels: LLMModel[] = []
+
+    // Check vLLM first (port 8001)
+    try {
+      const vllmResponse = await fetch('/api/vllm/models', {
+        signal: AbortSignal.timeout(3000)
+      })
+      if (vllmResponse.ok) {
+        const data = await vllmResponse.json()
+        if (data.models && Array.isArray(data.models)) {
+          data.models.forEach((model: any) => {
+            const modelId = model.id || model.name || model
+            availableModels.push({
+              id: `vllm-${modelId}`,
+              name: modelId.split('/').pop() || modelId,
+              model: modelId,
+              provider: "vllm",
+              description: "vLLM (GPU-accelerated)"
+            })
+          })
+        }
+      }
+    } catch (e) {
+      // vLLM not available
+      console.log("vLLM not available")
+    }
+
+    // Check Ollama (port 11434)
+    try {
+      const ollamaResponse = await fetch('/api/ollama/tags', {
+        signal: AbortSignal.timeout(3000)
+      })
+      if (ollamaResponse.ok) {
+        const data = await ollamaResponse.json()
+        if (data.models && Array.isArray(data.models)) {
+          data.models.forEach((model: any) => {
+            const modelName = model.name || model
+            availableModels.push({
+              id: `ollama-${modelName}`,
+              name: modelName,
+              model: modelName,
+              provider: "ollama",
+              description: "Local Ollama model"
+            })
+          })
+        }
+      }
+    } catch (e) {
+      // Ollama not available
+      console.log("Ollama not available")
+    }
+
+    // Always add NVIDIA API models
+    availableModels.push(...NVIDIA_MODELS)
+    setModels(availableModels)
+
+    // Set default selected model
+    if (availableModels.length > 0) {
+      // Try to restore saved selection
+      try {
+        const saved = localStorage.getItem("selectedModelForRAG")
+        if (saved) {
+          const savedModel: LLMModel = JSON.parse(saved)
+          const found = availableModels.find(m => m.id === savedModel.id)
+          if (found) {
+            setSelectedModel(found)
+            setIsLoading(false)
+            return
+          }
+        }
+      } catch (e) {
+        // Ignore
+      }
+      // Default to first available local model (vLLM or Ollama), not NVIDIA API
+      const localModel = availableModels.find(m => m.provider === "vllm" || m.provider === "ollama")
+      setSelectedModel(localModel || availableModels[0])
+    }
+    setIsLoading(false)
+  }
+
+  // Fetch models on mount
+  useEffect(() => {
+    fetchAvailableModels()
   }, [])
   // Save selected model to localStorage and dispatch event
@@ -117,14 +159,55 @@ export function LLMSelectorCompact() {
     if (provider === "ollama") {
       return <OllamaIcon className="h-3 w-3 text-orange-500" />
     }
+    if (provider === "vllm") {
+      return <Server className="h-3 w-3 text-purple-500" />
+    }
     return <Cpu className="h-3 w-3 text-green-500" />
   }

+  const getProviderLabel = (provider: string) => {
+    switch (provider) {
+      case "ollama": return "Ollama"
+      case "vllm": return "vLLM"
+      case "nvidia": return "NVIDIA API"
+      default: return provider
+    }
+  }
+
+  if (isLoading) {
+    return (
+      <div className="flex items-center gap-2 px-3 py-1.5 text-sm border border-border/40 rounded-lg bg-background/50">
+        <RefreshCw className="h-3 w-3 animate-spin text-muted-foreground" />
+        <span className="text-muted-foreground">Loading models...</span>
+      </div>
+    )
+  }
+
+  if (!selectedModel) {
+    return (
+      <div className="flex items-center gap-2 px-3 py-1.5 text-sm border border-border/40 rounded-lg bg-background/50 text-muted-foreground">
+        No models available
+      </div>
+    )
+  }
+
+  // Group models by provider
+  const groupedModels = models.reduce((acc, model) => {
+    if (!acc[model.provider]) {
+      acc[model.provider] = []
+    }
+    acc[model.provider].push(model)
+    return acc
+  }, {} as Record<string, LLMModel[]>)
+
   return (
     <div className="relative">
       <button
         type="button"
         onClick={() => setIsOpen(!isOpen)}
+        aria-haspopup="listbox"
+        aria-expanded={isOpen}
+        aria-label={`Select LLM model. Currently selected: ${selectedModel.name}`}
         className="flex items-center gap-2 px-3 py-1.5 text-sm border border-border/40 rounded-lg bg-background/50 hover:bg-muted/30 transition-colors"
       >
         {getModelIcon(selectedModel.provider)}
@@ -141,37 +224,61 @@ export function LLMSelectorCompact() {
           />

           {/* Dropdown */}
-          <div className="absolute top-full left-0 mt-2 w-64 border border-border/40 rounded-lg bg-popover shadow-lg z-50 overflow-hidden">
-            <div className="p-2 border-b border-border/40 bg-muted/30">
+          <div
+            className="absolute top-full left-0 mt-2 w-72 border border-border/40 rounded-lg bg-popover shadow-lg z-50 overflow-hidden"
+            role="listbox"
+            aria-label="Available LLM models"
+          >
+            <div className="p-2 border-b border-border/40 bg-muted/30 flex items-center justify-between">
               <h4 className="text-xs font-semibold text-foreground">Select LLM for Answer Generation</h4>
+              <button
+                type="button"
+                onClick={(e) => {
+                  e.stopPropagation()
+                  fetchAvailableModels()
+                }}
+                className="p-1 hover:bg-muted/50 rounded"
+                title="Refresh models"
+              >
+                <RefreshCw className="h-3 w-3 text-muted-foreground" />
+              </button>
             </div>
-            <div className="max-h-64 overflow-y-auto">
-              {models.map((model) => (
-                <button
-                  key={model.id}
-                  type="button"
-                  onClick={() => handleSelectModel(model)}
-                  className={`w-full flex items-start gap-2 p-3 hover:bg-muted/50 transition-colors text-left ${
-                    selectedModel.id === model.id ? 'bg-nvidia-green/10' : ''
-                  }`}
-                >
-                  <div className="mt-0.5">
-                    {getModelIcon(model.provider)}
-                  </div>
-                  <div className="flex-1 min-w-0">
-                    <div className="text-sm font-medium text-foreground truncate">
-                      {model.name}
-                    </div>
-                    {model.description && (
-                      <div className="text-xs text-muted-foreground">
-                        {model.description}
-                      </div>
-                    )}
-                  </div>
-                  {selectedModel.id === model.id && (
-                    <div className="w-2 h-2 rounded-full bg-nvidia-green flex-shrink-0 mt-1.5" />
-                  )}
-                </button>
+            <div className="max-h-80 overflow-y-auto">
+              {Object.entries(groupedModels).map(([provider, providerModels]) => (
+                <div key={provider}>
+                  <div className="px-3 py-1.5 text-xs font-semibold text-muted-foreground bg-muted/20 border-b border-border/20">
+                    {getProviderLabel(provider)}
+                  </div>
+                  {providerModels.map((model) => (
+                    <button
+                      key={model.id}
+                      type="button"
+                      role="option"
+                      aria-selected={selectedModel.id === model.id}
+                      onClick={() => handleSelectModel(model)}
+                      className={`w-full flex items-start gap-2 p-3 hover:bg-muted/50 transition-colors text-left ${
+                        selectedModel.id === model.id ? 'bg-nvidia-green/10' : ''
+                      }`}
+                    >
+                      <div className="mt-0.5">
+                        {getModelIcon(model.provider)}
+                      </div>
+                      <div className="flex-1 min-w-0">
+                        <div className="text-sm font-medium text-foreground truncate">
+                          {model.name}
+                        </div>
+                        {model.description && (
+                          <div className="text-xs text-muted-foreground">
+                            {model.description}
+                          </div>
+                        )}
+                      </div>
+                      {selectedModel.id === model.id && (
+                        <div className="w-2 h-2 rounded-full bg-nvidia-green flex-shrink-0 mt-1.5" />
+                      )}
+                    </button>
+                  ))}
+                </div>
               ))}
             </div>
           </div>
@@ -180,4 +287,3 @@ export function LLMSelectorCompact() {
     </div>
   )
 }


@@ -17,12 +17,22 @@
 "use client"

 import { useState, useEffect, useRef } from "react"
-import { createPortal } from "react-dom"
-import { ChevronDown, Sparkles, Cpu, Server } from "lucide-react"
+import { ChevronDown, Cpu, Server, RefreshCw } from "lucide-react"
 import { OllamaIcon } from "@/components/ui/ollama-icon"

-// Base models - NVIDIA NeMo as default (first in list)
-const baseModels = [
+interface Model {
+  id: string
+  name: string
+  icon: React.ReactNode
+  description: string
+  model: string
+  baseURL: string
+  provider: string
+  apiKeyName?: string
+}
+
+// NVIDIA API models (always available)
+const NVIDIA_MODELS: Model[] = [
   {
     id: "nvidia-nemotron",
     name: "NVIDIA Llama 3.3 Nemotron Super 49B",
@@ -31,6 +41,7 @@ const baseModels = [
     model: "nvidia/llama-3.3-nemotron-super-49b-v1.5",
     apiKeyName: "NVIDIA_API_KEY",
     baseURL: "https://integrate.api.nvidia.com/v1",
+    provider: "nvidia",
   },
   {
     id: "nvidia-nemotron-nano",
@@ -40,68 +51,116 @@ const baseModels = [
     model: "nvidia/nvidia-nemotron-nano-9b-v2",
     apiKeyName: "NVIDIA_API_KEY",
     baseURL: "https://integrate.api.nvidia.com/v1",
+    provider: "nvidia",
   },
-  // Preset Ollama model
-  {
-    id: "ollama-llama3.1:8b",
-    name: "Ollama llama3.1:8b",
-    icon: <OllamaIcon className="h-4 w-4 text-orange-500" />,
-    description: "Local Ollama server with llama3.1:8b model",
-    model: "llama3.1:8b",
-    baseURL: "http://localhost:11434/v1",
-    provider: "ollama",
-  },
 ]

-// vLLM models removed per user request
-
-// Helper function to create Ollama model objects
-const createOllamaModel = (modelName: string) => ({
+// Helper to create model objects
+const createOllamaModel = (modelName: string): Model => ({
   id: `ollama-${modelName}`,
   name: `Ollama ${modelName}`,
   icon: <OllamaIcon className="h-4 w-4 text-orange-500" />,
-  description: `Local Ollama server with ${modelName} model`,
+  description: `Local Ollama model`,
   model: modelName,
   baseURL: "http://localhost:11434/v1",
   provider: "ollama",
 })

+const createVllmModel = (modelName: string): Model => ({
+  id: `vllm-${modelName}`,
+  name: modelName.split('/').pop() || modelName,
+  icon: <Server className="h-4 w-4 text-purple-500" />,
+  description: "vLLM (GPU-accelerated)",
+  model: modelName,
+  baseURL: "http://localhost:8001/v1",
+  provider: "vllm",
+})
 export function ModelSelector() {
-  const [models, setModels] = useState(() => [...baseModels])
-  const [selectedModel, setSelectedModel] = useState(() => {
-    // Try to find a default Ollama model first
-    const defaultOllama = models.find(m => m.provider === "ollama")
-    return defaultOllama || models[0]
-  })
+  const [models, setModels] = useState<Model[]>([])
+  const [selectedModel, setSelectedModel] = useState<Model | null>(null)
   const [isOpen, setIsOpen] = useState(false)
+  const [isLoading, setIsLoading] = useState(true)
   const buttonRef = useRef<HTMLButtonElement | null>(null)
   const containerRef = useRef<HTMLDivElement | null>(null)
   const [mounted, setMounted] = useState(false)

-  // Load configured Ollama models
-  const loadOllamaModels = () => {
-    try {
-      const selectedOllamaModels = localStorage.getItem("selected_ollama_models")
-      if (selectedOllamaModels) {
-        const modelNames = JSON.parse(selectedOllamaModels)
-        // Filter out models that are already in baseModels to avoid duplicates
-        const baseModelNames = baseModels.filter(m => m.provider === "ollama").map(m => m.model)
-        const filteredModelNames = modelNames.filter((name: string) => !baseModelNames.includes(name))
-        const ollamaModels = filteredModelNames.map(createOllamaModel)
-        const newModels = [...baseModels, ...ollamaModels]
-        setModels(newModels)
-        return newModels
-      }
-    } catch (error) {
-      console.error("Error loading Ollama models:", error)
-    }
-    // Return base models if no Ollama models configured
-    return [...baseModels]
+  // Fetch available models from running backends
+  const fetchAvailableModels = async () => {
+    setIsLoading(true)
+    const availableModels: Model[] = []
+
+    // Check vLLM first (port 8001)
+    try {
+      const vllmResponse = await fetch('/api/vllm/models', {
+        signal: AbortSignal.timeout(3000)
+      })
+      if (vllmResponse.ok) {
+        const data = await vllmResponse.json()
+        if (data.models && Array.isArray(data.models)) {
+          data.models.forEach((model: any) => {
+            const modelId = model.id || model.name || model
+            availableModels.push(createVllmModel(modelId))
+          })
+        }
+      }
+    } catch (e) {
+      console.log("vLLM not available")
+    }
+
+    // Check Ollama (port 11434)
+    try {
+      const ollamaResponse = await fetch('/api/ollama/tags', {
+        signal: AbortSignal.timeout(3000)
+      })
+      if (ollamaResponse.ok) {
+        const data = await ollamaResponse.json()
+        if (data.models && Array.isArray(data.models)) {
+          data.models.forEach((model: any) => {
+            const modelName = model.name || model
+            availableModels.push(createOllamaModel(modelName))
+          })
+        }
+      }
+    } catch (e) {
+      console.log("Ollama not available")
+    }
+
+    // Always add NVIDIA API models
+    availableModels.push(...NVIDIA_MODELS)
+    setModels(availableModels)
+
+    // Set default selected model
+    if (availableModels.length > 0) {
+      // Try to restore saved selection
+      try {
+        const saved = localStorage.getItem("selectedModel")
+        if (saved) {
+          const savedModel = JSON.parse(saved)
+          const found = availableModels.find(m => m.id === savedModel.id)
+          if (found) {
+            setSelectedModel(found)
+            setIsLoading(false)
+            return
+          }
+        }
+      } catch (e) {
+        // Ignore
+      }
+      // Default to first available local model (vLLM or Ollama)
+      const localModel = availableModels.find(m => m.provider === "vllm" || m.provider === "ollama")
+      setSelectedModel(localModel || availableModels[0])
+    }
+    setIsLoading(false)
   }

   // Dispatch custom event when model changes
-  const updateSelectedModel = (model: any) => {
+  const updateSelectedModel = (model: Model) => {
     setSelectedModel(model)
+    localStorage.setItem("selectedModel", JSON.stringify(model))

     // Dispatch a custom event with the selected model data
     const event = new CustomEvent('modelSelected', {
@@ -110,59 +169,11 @@ export function ModelSelector() {
     window.dispatchEvent(event)
   }
+  // Fetch models on mount
   useEffect(() => {
-    // Save selected model to localStorage
-    localStorage.setItem("selectedModel", JSON.stringify(selectedModel))
-  }, [selectedModel])
-
-  // Initialize models and selected model
-  useEffect(() => {
-    const loadedModels = loadOllamaModels()
-
-    // Try to restore selected model from localStorage
-    const savedModel = localStorage.getItem("selectedModel")
-    if (savedModel) {
-      try {
-        const parsed = JSON.parse(savedModel)
-        // Find matching model in our current models array
-        const matchingModel = loadedModels.find(m => m.id === parsed.id)
-        if (matchingModel) {
-          updateSelectedModel(matchingModel)
-        } else {
-          // If saved model not found, use first available model
-          updateSelectedModel(loadedModels[0])
-        }
-      } catch (e) {
-        console.error("Error parsing saved model", e)
-        updateSelectedModel(loadedModels[0])
-      }
-    } else {
-      // If no model in localStorage, use first available model
-      updateSelectedModel(loadedModels[0])
-    }
+    fetchAvailableModels()
   }, [])

-  // Listen for Ollama model updates
-  useEffect(() => {
-    const handleOllamaUpdate = (event: CustomEvent) => {
-      console.log("Ollama models updated, reloading...")
-      const newModels = loadOllamaModels()
-      // Check if current selected model still exists
-      const currentModelStillExists = newModels.find(m => m.id === selectedModel.id)
-      if (!currentModelStillExists) {
-        // Select first available model if current one is no longer available
-        updateSelectedModel(newModels[0])
-      }
-    }
-
-    window.addEventListener('ollama-models-updated', handleOllamaUpdate as EventListener)
-    return () => {
-      window.removeEventListener('ollama-models-updated', handleOllamaUpdate as EventListener)
-    }
-  }, [selectedModel.id])
-
   // Set mounted state after component mounts (for SSR compatibility)
   useEffect(() => {
     setMounted(true)
@@ -186,6 +197,55 @@ export function ModelSelector() {
     }
   }, [])
// Listen for Ollama model updates
useEffect(() => {
const handleOllamaUpdate = () => {
console.log("Ollama models updated, reloading...")
fetchAvailableModels()
}
window.addEventListener('ollama-models-updated', handleOllamaUpdate)
return () => {
window.removeEventListener('ollama-models-updated', handleOllamaUpdate)
}
}, [])
if (isLoading) {
return (
<div className="flex items-center gap-2 bg-card border border-border rounded-lg px-4 py-2 text-sm">
<RefreshCw className="h-4 w-4 animate-spin text-muted-foreground" />
<span className="text-muted-foreground">Loading models...</span>
</div>
)
}
if (!selectedModel) {
return (
<div className="flex items-center gap-2 bg-card border border-border rounded-lg px-4 py-2 text-sm text-muted-foreground">
No models available
</div>
)
}
// Group models by provider
const groupedModels = models.reduce((acc, model) => {
if (!acc[model.provider]) {
acc[model.provider] = []
}
acc[model.provider].push(model)
return acc
}, {} as Record<string, Model[]>)
const getProviderLabel = (provider: string) => {
switch (provider) {
case "ollama": return "Ollama (Local)"
case "vllm": return "vLLM (GPU-accelerated)"
case "nvidia": return "NVIDIA API (Cloud)"
default: return provider
}
}
   return (
     <div ref={containerRef} className="relative">
       <button
@@ -202,35 +262,57 @@ export function ModelSelector() {
       {isOpen && mounted && (
         <div
-          className="absolute bg-card border border-border rounded-md shadow-md overflow-hidden max-h-80 overflow-y-auto z-50"
+          className="absolute bg-card border border-border rounded-md shadow-md overflow-hidden max-h-96 overflow-y-auto z-50"
           style={{
-            width: "288px",
+            width: "320px",
             bottom: "calc(100% + 4px)",
             left: 0,
           }}
         >
-          <ul className="divide-y divide-border/60">
-            {models.map((model) => (
-              <li key={model.id}>
-                <button
-                  className={`w-full text-left px-3 py-2 hover:bg-muted/30 text-sm flex flex-col gap-1 ${model.id === selectedModel.id ? 'bg-primary/10' : ''}`}
-                  onClick={() => {
-                    updateSelectedModel(model)
-                    setIsOpen(false)
-                  }}
-                >
-                  <span className="flex items-center gap-2">
-                    {model.icon}
-                    <span className={`font-medium ${model.id === selectedModel.id ? 'text-primary' : ''}`}>{model.name}</span>
-                  </span>
-                  <span className="text-xs text-muted-foreground pl-6">{model.description}</span>
-                </button>
-              </li>
+          <div className="px-3 py-2 border-b border-border/60 bg-muted/30 flex items-center justify-between">
+            <span className="text-xs font-semibold text-foreground">Select Model</span>
+            <button
+              type="button"
+              onClick={(e) => {
+                e.stopPropagation()
+                fetchAvailableModels()
+              }}
+              className="p-1 hover:bg-muted/50 rounded"
+              title="Refresh models"
+            >
+              <RefreshCw className="h-3 w-3 text-muted-foreground" />
+            </button>
+          </div>
+          <div>
+            {Object.entries(groupedModels).map(([provider, providerModels]) => (
+              <div key={provider}>
+                <div className="px-3 py-1.5 text-xs font-semibold text-muted-foreground bg-muted/20 border-b border-border/20">
+                  {getProviderLabel(provider)}
+                </div>
+                <ul>
+                  {providerModels.map((model) => (
+                    <li key={model.id}>
+                      <button
+                        className={`w-full text-left px-3 py-2 hover:bg-muted/30 text-sm flex flex-col gap-1 ${model.id === selectedModel.id ? 'bg-primary/10' : ''}`}
+                        onClick={() => {
+                          updateSelectedModel(model)
+                          setIsOpen(false)
+                        }}
+                      >
+                        <span className="flex items-center gap-2">
+                          {model.icon}
+                          <span className={`font-medium ${model.id === selectedModel.id ? 'text-primary' : ''}`}>{model.name}</span>
+                        </span>
+                        <span className="text-xs text-muted-foreground pl-6">{model.description}</span>
+                      </button>
+                    </li>
+                  ))}
+                </ul>
+              </div>
             ))}
-          </ul>
+          </div>
         </div>
       )}
     </div>
   )
 }


@@ -1,19 +1,3 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
 "use client"

 import { useState, useEffect } from "react"
@@ -103,7 +87,7 @@ export function PineconeConnection({ className }: PineconeConnectionProps) {
             <InfoIcon className="h-5 w-5 text-muted-foreground" />
           </TooltipTrigger>
           <TooltipContent>
-            <p>Qdrant stores vector embeddings for semantic search</p>
+            <p>Local Pinecone stores vector embeddings in memory for semantic search</p>
           </TooltipContent>
         </Tooltip>
       </TooltipProvider>
@@ -125,7 +109,7 @@ export function PineconeConnection({ className }: PineconeConnectionProps) {
           <p className="whitespace-normal break-words">Error: {error}</p>
           {error.includes('404') && (
             <p className="mt-1 text-xs">
-              The Qdrant server is running but the collection doesn't exist yet.
+              The Pinecone server is running but the index doesn't exist yet.
               <button
                 onClick={async () => {
                   setConnectionStatus("checking");
@@ -133,26 +117,26 @@ export function PineconeConnection({ className }: PineconeConnectionProps) {
                   try {
                     const response = await fetch('/api/pinecone-diag/create-index', { method: 'POST' });
                     if (response.ok) {
-                      // Wait a bit for the collection to be created
+                      // Wait a bit for the index to be created
                       await new Promise(resolve => setTimeout(resolve, 2000));
                       checkConnection();
                     } else {
                       const data = await response.json();
-                      setError(data.error || 'Failed to create collection');
+                      setError(data.error || 'Failed to create index');
                       setConnectionStatus("disconnected");
                     }
                   } catch (err) {
-                    setError(err instanceof Error ? err.message : 'Error creating collection');
+                    setError(err instanceof Error ? err.message : 'Error creating index');
                     setConnectionStatus("disconnected");
                   }
                 }}
                 className="ml-1 text-blue-600 hover:text-blue-800 underline"
               >
-                Click here to create the collection
+                Click here to create the index
               </button>
               <br />
               <span className="text-xs text-gray-600">Or using Docker Compose: </span>
-              <code className="mx-1 px-1 bg-gray-100 rounded">docker compose restart qdrant</code>
+              <code className="mx-1 px-1 bg-gray-100 rounded">docker-compose restart pinecone</code>
             </p>
           )}
         </div>
@@ -160,25 +144,13 @@ export function PineconeConnection({ className }: PineconeConnectionProps) {
       <div className="text-sm space-y-1 w-full">
         <div className="flex justify-between">
-          <span className="text-muted-foreground">Qdrant</span>
-          <span className="text-xs text-muted-foreground">{(stats as any).url || 'http://qdrant:6333'}</span>
-        </div>
-        <div className="flex justify-between">
-          <span className="text-muted-foreground">Vectors:</span>
-          <span>{stats.nodes} indexed</span>
+          <span className="text-muted-foreground">Vectors:</span>
+          <span>{stats.nodes}</span>
+        </div>
+        <div className="flex justify-between">
+          <span className="text-muted-foreground">Source:</span>
+          <span>{stats.source} local</span>
         </div>
-        {(stats as any).status && (
-          <div className="flex justify-between">
-            <span className="text-muted-foreground">Status:</span>
-            <span className="capitalize">{(stats as any).status}</span>
-          </div>
-        )}
-        {(stats as any).vectorSize && (
-          <div className="flex justify-between">
-            <span className="text-muted-foreground">Dimensions:</span>
-            <span>{(stats as any).vectorSize}d ({(stats as any).distance})</span>
-          </div>
-        )}
       </div>

       <div className="flex space-x-2">


@@ -0,0 +1,207 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
"use client"
import { useState, useEffect } from "react"
import { Button } from '@/components/ui/button'
import { Badge } from '@/components/ui/badge'
import { InfoIcon } from 'lucide-react'
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from '@/components/ui/tooltip'
import { VectorDBStats } from '@/types/graph'
interface QdrantConnectionProps {
className?: string
}
export function QdrantConnection({ className }: QdrantConnectionProps) {
const [connectionStatus, setConnectionStatus] = useState<"connected" | "disconnected" | "checking">("disconnected")
const [error, setError] = useState<string | null>(null)
const [stats, setStats] = useState<VectorDBStats>({ nodes: 0, relationships: 0, source: 'none' })
// Fetch vector DB stats
const fetchStats = async () => {
try {
const response = await fetch('/api/vector-db/stats');
const data = await response.json();
if (response.ok) {
setStats({
nodes: typeof data.totalVectorCount === 'number' ? data.totalVectorCount : 0,
relationships: 0, // Vector DB doesn't store relationships
source: data.source || 'unknown',
httpHealthy: data.httpHealthy
});
// If we have a healthy HTTP connection, we're connected
if (data.httpHealthy) {
setConnectionStatus("connected");
setError(null);
} else {
setConnectionStatus("disconnected");
setError(data.error || 'Connection failed');
}
console.log('Vector DB stats:', data);
} else {
console.error('Failed to fetch vector DB stats:', data);
setConnectionStatus("disconnected");
setError(data.error || 'Failed to connect to vector database');
}
} catch (error) {
console.error('Error fetching vector DB stats:', error);
setConnectionStatus("disconnected");
setError(error instanceof Error ? error.message : 'Error connecting to vector database');
}
};
// Check connection status and stats
const checkConnection = async () => {
setConnectionStatus("checking")
setError(null)
try {
await fetchStats(); // Fetch stats directly - our status is based on having embeddings
} catch (error) {
console.error('Error connecting to Vector DB:', error)
setConnectionStatus("disconnected")
setError(error instanceof Error ? error.message : 'Unknown error connecting to Vector DB')
}
}
// Reset connection state
const disconnect = async () => {
setConnectionStatus("disconnected")
setStats({ nodes: 0, relationships: 0, source: 'none' })
}
// Initial connection check
useEffect(() => {
checkConnection()
}, [])
return (
<div className={`flex flex-col items-start space-y-4 p-4 border rounded-md ${className}`}>
<div className="flex justify-between w-full">
<h2 className="text-lg font-medium">Vector DB</h2>
<TooltipProvider>
<Tooltip>
<TooltipTrigger>
<InfoIcon className="h-5 w-5 text-muted-foreground" />
</TooltipTrigger>
<TooltipContent>
<p>Qdrant stores vector embeddings for semantic search</p>
</TooltipContent>
</Tooltip>
</TooltipProvider>
</div>
<div className="flex items-center space-x-2">
<span className="text-sm">Status:</span>
{connectionStatus === "connected" ? (
<Badge variant="outline" className="bg-green-50 text-green-700 hover:bg-green-50 border-green-200">Connected</Badge>
) : connectionStatus === "checking" ? (
<Badge variant="outline" className="bg-yellow-50 text-yellow-700 hover:bg-yellow-50 border-yellow-200">Checking...</Badge>
) : (
<Badge variant="outline" className="bg-red-50 text-red-700 hover:bg-red-50 border-red-200">Disconnected</Badge>
)}
</div>
{error && (
<div className="text-sm text-red-600 bg-red-50 p-2 rounded w-full overflow-auto max-h-20">
<p className="whitespace-normal break-words">Error: {error}</p>
{error.includes('404') && (
<p className="mt-1 text-xs">
The Qdrant server is running but the collection doesn't exist yet.
<button
onClick={async () => {
setConnectionStatus("checking");
setError(null);
try {
const response = await fetch('/api/vector-db/create-collection', { method: 'POST' });
if (response.ok) {
// Wait a bit for the collection to be created
await new Promise(resolve => setTimeout(resolve, 2000));
checkConnection();
} else {
const data = await response.json();
setError(data.error || 'Failed to create collection');
setConnectionStatus("disconnected");
}
} catch (err) {
setError(err instanceof Error ? err.message : 'Error creating collection');
setConnectionStatus("disconnected");
}
}}
className="ml-1 text-blue-600 hover:text-blue-800 underline"
>
Click here to create the collection
</button>
<br />
<span className="text-xs text-gray-600">Or using Docker Compose: </span>
<code className="mx-1 px-1 bg-gray-100 rounded">docker compose restart qdrant</code>
</p>
)}
</div>
)}
<div className="text-sm space-y-1 w-full">
<div className="flex justify-between">
<span className="text-muted-foreground">Qdrant</span>
<span className="text-xs text-muted-foreground">{(stats as any).url || 'http://qdrant:6333'}</span>
</div>
<div className="flex justify-between">
<span className="text-muted-foreground">Vectors:</span>
<span>{stats.nodes} indexed</span>
</div>
{(stats as any).status && (
<div className="flex justify-between">
<span className="text-muted-foreground">Status:</span>
<span className="capitalize">{(stats as any).status}</span>
</div>
)}
{(stats as any).vectorSize && (
<div className="flex justify-between">
<span className="text-muted-foreground">Dimensions:</span>
<span>{(stats as any).vectorSize}d ({(stats as any).distance})</span>
</div>
)}
</div>
<div className="flex space-x-2">
<Button
variant="outline"
size="sm"
onClick={checkConnection}
disabled={connectionStatus === "checking"}
>
{connectionStatus === "checking" ? "Checking..." : "Check Connection"}
</Button>
{connectionStatus === "connected" && (
<Button
variant="outline"
size="sm"
onClick={disconnect}
>
Disconnect
</Button>
)}
</div>
</div>
)
}
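The status handling in this panel cycles through three states (checking, then connected or disconnected, with a retry path after creating the missing collection). As a rough sketch, the transitions can be modeled as a pure function — the event names here are illustrative only, not part of the component's API:

```typescript
// Connection states used by the panel.
type Status = "checking" | "connected" | "disconnected";

// Hypothetical event names modeling the panel's flow; the real component
// drives these via fetch() calls and setConnectionStatus().
type ConnEvent =
  | { kind: "check" }                 // "Check Connection" clicked, or retry started
  | { kind: "ok" }                    // Qdrant responded successfully
  | { kind: "fail"; error: string };  // request failed or collection missing

function nextStatus(_current: Status, event: ConnEvent): Status {
  switch (event.kind) {
    case "check":
      return "checking";
    case "ok":
      return "connected";
    case "fail":
      return "disconnected";
  }
}
```

Modeling the flow this way makes the retry path explicit: creating the collection re-enters "checking", and only a successful `checkConnection` moves to "connected".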

View File

@@ -156,16 +156,21 @@ export function RagQuery({
           : 'border-border/30 opacity-50 cursor-not-allowed'
       }`}
     >
-      <div className="w-5 h-5 rounded-md bg-nvidia-green/15 flex items-center justify-center mb-1.5">
-        <Zap className="h-2.5 w-2.5 text-nvidia-green" />
+      <div className={`w-5 h-5 rounded-md flex items-center justify-center mb-1.5 ${vectorEnabled ? 'bg-nvidia-green/15' : 'bg-muted/15'}`}>
+        <Zap className={`h-2.5 w-2.5 ${vectorEnabled ? 'text-nvidia-green' : 'text-muted-foreground'}`} />
       </div>
-      <span className="text-sm font-semibold">Pure RAG</span>
+      <span className={`text-sm font-semibold ${!vectorEnabled ? 'text-muted-foreground' : ''}`}>Pure RAG</span>
       <span className="text-[10px] mt-0.5 text-center text-muted-foreground leading-tight">
         Vector DB + LLM
       </span>
       {queryMode === 'pure-rag' && (
         <div className="absolute top-2 right-2 w-1.5 h-1.5 bg-nvidia-green rounded-full"></div>
       )}
+      {!vectorEnabled && (
+        <div className="text-[9px] px-1.5 py-0.5 bg-blue-500/20 text-blue-700 dark:text-blue-400 rounded mt-1 font-medium">
+          NEEDS EMBEDDINGS
+        </div>
+      )}
     </button>
     <button

View File

@@ -76,10 +76,8 @@ export function SettingsModal() {
   const [arangoUser, setArangoUser] = useState("")
   const [arangoPassword, setArangoPassword] = useState("")

-  // Vector DB settings - changed from Milvus to Pinecone
-  const [pineconeApiKey, setPineconeApiKey] = useState("")
-  const [pineconeEnvironment, setPineconeEnvironment] = useState("")
-  const [pineconeIndex, setPineconeIndex] = useState("")
+  // Vector DB settings - Qdrant
+  const [qdrantUrl, setQdrantUrl] = useState("")

   // S3 Storage settings
   const [s3Endpoint, setS3Endpoint] = useState("")
@@ -171,9 +169,20 @@ export function SettingsModal() {
       setIsS3Connected(s3Connected)
     }

-    // Load graph DB type
-    const storedGraphDbType = localStorage.getItem("graph_db_type") || "arangodb"
-    setGraphDbType(storedGraphDbType as GraphDBType)
+    // Load graph DB type - fetch from server if not in localStorage
+    const storedGraphDbType = localStorage.getItem("graph_db_type")
+    if (storedGraphDbType) {
+      setGraphDbType(storedGraphDbType as GraphDBType)
+    } else {
+      // Fetch server's default (from GRAPH_DB_TYPE env var)
+      fetch('/api/settings')
+        .then(res => res.json())
+        .then(data => {
+          const serverDefault = data.settings?.graph_db_type || 'neo4j'
+          setGraphDbType(serverDefault as GraphDBType)
+        })
+        .catch(() => setGraphDbType('neo4j'))
+    }

     // Load Neo4j settings
     setNeo4jUrl(localStorage.getItem("neo4j_url") || "")
@@ -186,9 +195,7 @@ export function SettingsModal() {
     setArangoUser(localStorage.getItem("arango_user") || "")
     setArangoPassword(localStorage.getItem("arango_password") || "")

-    setPineconeApiKey(localStorage.getItem("pinecone_api_key") || "")
-    setPineconeEnvironment(localStorage.getItem("pinecone_environment") || "")
-    setPineconeIndex(localStorage.getItem("pinecone_index") || "")
+    setQdrantUrl(localStorage.getItem("qdrant_url") || "http://localhost:6333")
   }, [isOpen])

   // Save database settings
@@ -249,9 +256,7 @@ export function SettingsModal() {
   const saveVectorDbSettings = async (e: React.FormEvent) => {
     e.preventDefault()

-    localStorage.setItem("pinecone_api_key", pineconeApiKey)
-    localStorage.setItem("pinecone_environment", pineconeEnvironment)
-    localStorage.setItem("pinecone_index", pineconeIndex)
+    localStorage.setItem("qdrant_url", qdrantUrl)

     // Sync settings with server
     try {
@@ -262,9 +267,7 @@ export function SettingsModal() {
         },
         body: JSON.stringify({
           settings: {
-            pinecone_api_key: pineconeApiKey,
-            pinecone_environment: pineconeEnvironment,
-            pinecone_index: pineconeIndex,
+            qdrant_url: qdrantUrl,
           }
         }),
       });
@@ -452,7 +455,11 @@ export function SettingsModal() {
   return (
     <Dialog open={isOpen} onOpenChange={setIsOpen}>
       <DialogTrigger asChild>
-        <button className="flex items-center justify-center gap-2 p-2 hover:bg-primary/10 rounded-full transition-colors" title="Settings">
+        <button
+          className="flex items-center justify-center gap-2 p-2 hover:bg-primary/10 rounded-full transition-colors"
+          aria-label="Open settings"
+          title="Settings"
+        >
           <Settings className="h-5 w-5 text-muted-foreground hover:text-primary transition-colors" />
         </button>
       </DialogTrigger>
@@ -668,44 +675,22 @@ export function SettingsModal() {
             <div className="space-y-2">
               <label className="text-sm font-semibold text-foreground flex items-center gap-2">
                 <SearchIcon className="h-4 w-4 text-nvidia-green" />
-                Pinecone Configuration
+                Qdrant Configuration
               </label>
             </div>
             <div className="bg-background/50 rounded-lg p-3 space-y-3">
               <div className="grid grid-cols-1 gap-3">
                 <div>
-                  <label className="text-xs font-medium text-muted-foreground mb-1 block">API Key</label>
+                  <label className="text-xs font-medium text-muted-foreground mb-1 block">Qdrant URL</label>
                   <input
-                    type="password"
-                    value={pineconeApiKey}
-                    onChange={(e) => setPineconeApiKey(e.target.value)}
-                    placeholder="Enter your Pinecone API key"
+                    type="text"
+                    value={qdrantUrl}
+                    onChange={(e) => setQdrantUrl(e.target.value)}
+                    placeholder="http://localhost:6333"
                     className="w-full bg-background border border-border/60 rounded-md p-2 text-sm text-foreground focus:ring-1 focus:ring-primary/50 focus:border-primary transition-colors"
                   />
                 </div>
-                <div className="grid grid-cols-2 gap-3">
-                  <div>
-                    <label className="text-xs font-medium text-muted-foreground mb-1 block">Environment</label>
-                    <input
-                      type="text"
-                      value={pineconeEnvironment}
-                      onChange={(e) => setPineconeEnvironment(e.target.value)}
-                      placeholder="us-west1-gcp"
-                      className="w-full bg-background border border-border/60 rounded-md p-2 text-sm text-foreground focus:ring-1 focus:ring-primary/50 focus:border-primary transition-colors"
-                    />
-                  </div>
-                  <div>
-                    <label className="text-xs font-medium text-muted-foreground mb-1 block">Index Name</label>
-                    <input
-                      type="text"
-                      value={pineconeIndex}
-                      onChange={(e) => setPineconeIndex(e.target.value)}
-                      placeholder="knowledge-graph"
-                      className="w-full bg-background border border-border/60 rounded-md p-2 text-sm text-foreground focus:ring-1 focus:ring-primary/50 focus:border-primary transition-colors"
-                    />
-                  </div>
-                </div>
               </div>
             </div>
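The settings load path above prefers a value stored in the browser, then the server's `GRAPH_DB_TYPE` default, then a hard-coded `'neo4j'`. That resolution order can be sketched as a pure helper — the function name is an assumption for illustration, not part of the component:

```typescript
type GraphDBType = "neo4j" | "arangodb";

// Resolution order mirrored from the effect above: a valid stored value wins,
// then the server default, then the hard-coded fallback.
function resolveGraphDbType(
  stored: string | null,
  serverDefault?: string
): GraphDBType {
  if (stored === "neo4j" || stored === "arangodb") return stored;
  if (serverDefault === "neo4j" || serverDefault === "arangodb") return serverDefault;
  return "neo4j";
}
```

Validating the stored string (rather than casting it blindly, as the component does with `as GraphDBType`) also guards against stale or corrupted localStorage values.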

View File

@@ -22,11 +22,15 @@ import { useTheme } from "./theme-provider"
 export function ThemeToggle() {
   const { theme, setTheme } = useTheme()

+  const nextTheme = theme === "dark" ? "light" : "dark"
+  const label = `Switch to ${nextTheme} theme (currently ${theme})`
+
   return (
     <button
-      className="btn-icon relative"
-      onClick={() => setTheme(theme === "dark" ? "light" : "dark")}
-      aria-label="Toggle theme"
+      className="btn-icon relative focus-visible:ring-2 focus-visible:ring-nvidia-green focus-visible:ring-offset-2 focus-visible:ring-offset-background rounded-lg"
+      onClick={() => setTheme(nextTheme)}
+      aria-label={label}
+      title={`Switch to ${nextTheme} theme`}
     >
       <Sun
         className={`h-5 w-5 transition-all ${theme === "dark" ? "opacity-0 scale-0 rotate-90 absolute" : "opacity-100 scale-100 rotate-0 relative"}`}

View File

@@ -91,11 +91,16 @@ export function TripleEditor({ triple, index, onSave, onCancel }: TripleEditorPr
         <button
           type="button"
           onClick={onCancel}
+          aria-label="Cancel editing triple"
           className="p-2 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/50 transition-colors"
         >
           <X className="h-4 w-4" />
         </button>
-        <button type="submit" className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10 transition-colors">
+        <button
+          type="submit"
+          aria-label="Save triple"
+          className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10 transition-colors"
+        >
           <Check className="h-4 w-4" />
         </button>
       </div>

View File

@@ -19,8 +19,18 @@
 import { useState, useEffect, useRef } from "react"
 import { useDocuments } from "@/contexts/document-context"
 import type { Triple } from "@/utils/text-processing"
-import { Pencil, Trash2, Plus, Download, ChevronDown, FileJson, FileText, List, Network, Check, X, Database } from "lucide-react"
+import { Pencil, Trash2, Plus, Download, ChevronDown, FileJson, FileText, List, Network, Check, X, Database, AlertCircle } from "lucide-react"
 import { TripleEditor } from "./triple-editor"
+import {
+  AlertDialog,
+  AlertDialogAction,
+  AlertDialogCancel,
+  AlertDialogContent,
+  AlertDialogDescription,
+  AlertDialogFooter,
+  AlertDialogHeader,
+  AlertDialogTitle,
+} from "@/components/ui/alert-dialog"

 // Add this new EntityEditor component before the TripleViewer component
 interface EntityEditorProps {
@@ -59,11 +69,16 @@ function EntityEditor({ entity, onSave, onCancel }: EntityEditorProps) {
         <button
           type="button"
           onClick={onCancel}
+          aria-label="Cancel editing entity"
           className="p-2 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/30"
         >
           <X className="h-4 w-4" />
         </button>
-        <button type="submit" className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10">
+        <button
+          type="submit"
+          aria-label="Save entity changes"
+          className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10"
+        >
           <Check className="h-4 w-4" />
         </button>
       </div>
@@ -88,6 +103,12 @@ export function TripleViewer() {
   const [searchQuery, setSearchQuery] = useState('')
   const dropdownRef = useRef<HTMLDivElement>(null)

+  // Delete confirmation dialog state
+  const [showDeleteTripleDialog, setShowDeleteTripleDialog] = useState(false)
+  const [tripleToDelete, setTripleToDelete] = useState<{ index: number, triple: Triple } | null>(null)
+  const [showDeleteEntityDialog, setShowDeleteEntityDialog] = useState(false)
+  const [entityToDelete, setEntityToDelete] = useState<string | null>(null)
+
   // Handle click outside to close dropdown
   useEffect(() => {
     function handleClickOutside(event: MouseEvent) {
@@ -167,13 +188,20 @@ export function TripleViewer() {
   }

   const handleDeleteTriple = (index: number) => {
-    if (selectedDoc) {
-      if (confirm("Are you sure you want to delete this triple?")) {
-        deleteTriple(selectedDoc.id, index)
-      }
+    if (selectedDoc && selectedDoc.triples) {
+      setTripleToDelete({ index, triple: selectedDoc.triples[index] })
+      setShowDeleteTripleDialog(true)
     }
   }

+  const confirmDeleteTriple = () => {
+    if (selectedDoc && tripleToDelete !== null) {
+      deleteTriple(selectedDoc.id, tripleToDelete.index)
+    }
+    setShowDeleteTripleDialog(false)
+    setTripleToDelete(null)
+  }
+
   const exportTriplesCSV = () => {
     if (!selectedDoc || !selectedDoc.triples) return
@@ -281,16 +309,22 @@ export function TripleViewer() {
   const handleDeleteEntity = (entity: string) => {
     if (!selectedDoc || !selectedDoc.triples) return;
+    setEntityToDelete(entity)
+    setShowDeleteEntityDialog(true)
+  };

-    if (confirm(`Are you sure you want to delete the entity "${entity}"? This will remove all triples containing this entity.`)) {
+  const confirmDeleteEntity = () => {
+    if (selectedDoc && selectedDoc.triples && entityToDelete) {
       // Filter out all triples that contain the entity
       const filteredTriples = selectedDoc.triples.filter(triple =>
-        triple.subject !== entity && triple.object !== entity
+        triple.subject !== entityToDelete && triple.object !== entityToDelete
       );

       // Update the document with the filtered triples
       updateTriples(selectedDoc.id, filteredTriples);
     }
+    setShowDeleteEntityDialog(false)
+    setEntityToDelete(null)
   };

   // Function to store triples in the Neo4j database
@@ -383,8 +417,11 @@ export function TripleViewer() {
         <label className="text-sm font-semibold text-foreground whitespace-nowrap">Select Document</label>
         <div className="relative w-64">
           <button
-            className="w-full flex items-center justify-between bg-card border border-border rounded-lg p-3 text-foreground text-sm hover:bg-muted/30 transition-colors"
+            className="w-full flex items-center justify-between bg-card border border-border rounded-lg p-3 text-foreground text-sm hover:bg-muted/30 transition-colors focus-visible:ring-2 focus-visible:ring-nvidia-green focus-visible:ring-offset-2"
             onClick={() => setIsDropdownOpen(!isDropdownOpen)}
+            aria-haspopup="listbox"
+            aria-expanded={isDropdownOpen}
+            aria-label={`Select document. Currently selected: ${selectedDoc?.name || 'None'}`}
           >
             <span className="truncate">
               {selectedDoc?.name || "Select document"}
@@ -400,13 +437,18 @@ export function TripleViewer() {
               strokeLinecap="round"
               strokeLinejoin="round"
               className={`transition-transform ${isDropdownOpen ? 'rotate-180' : ''}`}
+              aria-hidden="true"
             >
               <polyline points="6 9 12 15 18 9"></polyline>
             </svg>
           </button>

           {isDropdownOpen && (
-            <div className="absolute z-10 mt-1 w-full bg-card border border-border rounded-lg shadow-lg max-h-64 overflow-y-auto">
+            <div
+              className="absolute z-10 mt-1 w-full bg-card border border-border rounded-lg shadow-lg max-h-64 overflow-y-auto"
+              role="listbox"
+              aria-label="Processed documents"
+            >
               <div className="p-2 sticky top-0 bg-card border-b border-border">
                 <input
                   type="text"
@@ -425,6 +467,8 @@ export function TripleViewer() {
                 filteredDocs.map((doc) => (
                   <button
                     key={doc.id}
+                    role="option"
+                    aria-selected={doc.id === selectedDoc?.id}
                     className={`w-full text-left p-2 hover:bg-muted/30 text-sm ${
                       doc.id === selectedDoc?.id ? 'bg-primary/10 text-primary' : ''
                     }`}
@@ -657,6 +701,7 @@ export function TripleViewer() {
                         <button
                           onClick={() => setEditingIndex(index)}
                           className="p-1.5 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/50 transition-colors"
+                          aria-label={`Edit triple: ${normalizeText(triple.subject)} ${normalizeText(triple.predicate)} ${normalizeText(triple.object)}`}
                           title="Edit Triple"
                         >
                           <Pencil className="h-3.5 w-3.5" />
@@ -664,6 +709,7 @@ export function TripleViewer() {
                         <button
                           onClick={() => handleDeleteTriple(index)}
                           className="p-1.5 text-muted-foreground hover:text-destructive rounded-full hover:bg-destructive/10 transition-colors"
+                          aria-label={`Delete triple: ${normalizeText(triple.subject)} ${normalizeText(triple.predicate)} ${normalizeText(triple.object)}`}
                           title="Delete Triple"
                         >
                           <Trash2 className="h-3.5 w-3.5" />
@@ -805,6 +851,7 @@ export function TripleViewer() {
                         <button
                           onClick={() => setEditingEntityIndex(index)}
                           className="p-1.5 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/30"
+                          aria-label={`Edit entity: ${normalizeText(entity)}`}
                           title="Edit Entity"
                         >
                           <Pencil className="h-3.5 w-3.5" />
@@ -812,6 +859,7 @@ export function TripleViewer() {
                         <button
                           onClick={() => handleDeleteEntity(entity)}
                           className="p-1.5 text-muted-foreground hover:text-destructive rounded-full hover:bg-destructive/10"
+                          aria-label={`Delete entity: ${normalizeText(entity)}`}
                           title="Delete Entity"
                         >
                           <Trash2 className="h-3.5 w-3.5" />
@@ -837,6 +885,66 @@ export function TripleViewer() {
         )}
       </>
     )}
+
+      {/* Delete Triple Confirmation Dialog */}
+      <AlertDialog open={showDeleteTripleDialog} onOpenChange={setShowDeleteTripleDialog}>
+        <AlertDialogContent>
+          <AlertDialogHeader>
+            <AlertDialogTitle className="flex items-center gap-2">
+              <Trash2 className="h-5 w-5 text-destructive" />
+              Delete Triple
+            </AlertDialogTitle>
+            <AlertDialogDescription>
+              Are you sure you want to delete this triple?
+              {tripleToDelete && (
+                <div className="mt-3 p-3 bg-muted/50 rounded-lg text-sm font-mono">
+                  <span className="text-foreground">{normalizeText(tripleToDelete.triple.subject)}</span>
+                  <span className="text-muted-foreground mx-2"></span>
+                  <span className="text-primary">{normalizeText(tripleToDelete.triple.predicate)}</span>
+                  <span className="text-muted-foreground mx-2"></span>
+                  <span className="text-foreground">{normalizeText(tripleToDelete.triple.object)}</span>
+                </div>
+              )}
+            </AlertDialogDescription>
+          </AlertDialogHeader>
+          <AlertDialogFooter>
+            <AlertDialogCancel onClick={() => setTripleToDelete(null)}>Cancel</AlertDialogCancel>
+            <AlertDialogAction
+              onClick={confirmDeleteTriple}
+              className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
+            >
+              Delete Triple
+            </AlertDialogAction>
+          </AlertDialogFooter>
+        </AlertDialogContent>
+      </AlertDialog>
+
+      {/* Delete Entity Confirmation Dialog */}
+      <AlertDialog open={showDeleteEntityDialog} onOpenChange={setShowDeleteEntityDialog}>
+        <AlertDialogContent>
+          <AlertDialogHeader>
+            <AlertDialogTitle className="flex items-center gap-2">
+              <AlertCircle className="h-5 w-5 text-destructive" />
+              Delete Entity
+            </AlertDialogTitle>
+            <AlertDialogDescription>
+              Are you sure you want to delete the entity <strong>"{entityToDelete}"</strong>?
+              <div className="mt-3 p-3 bg-amber-50 dark:bg-amber-950/30 border border-amber-200 dark:border-amber-800/50 rounded-lg text-amber-800 dark:text-amber-300 text-sm">
+                <strong>Warning:</strong> This will remove all triples containing this entity from the knowledge graph.
+              </div>
+            </AlertDialogDescription>
+          </AlertDialogHeader>
+          <AlertDialogFooter>
+            <AlertDialogCancel onClick={() => setEntityToDelete(null)}>Cancel</AlertDialogCancel>
+            <AlertDialogAction
+              onClick={confirmDeleteEntity}
+              className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
+            >
+              Delete Entity
+            </AlertDialogAction>
+          </AlertDialogFooter>
+        </AlertDialogContent>
+      </AlertDialog>
   </div>
 )
}
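The entity-deletion flow in this component ultimately reduces to one pure operation: drop every triple that mentions the entity as subject or object. A standalone sketch of that filter, assuming the `Triple` shape from `@/utils/text-processing`:

```typescript
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Pure version of confirmDeleteEntity's filter: keep only triples that
// do not reference the entity on either end.
function removeEntity(triples: Triple[], entity: string): Triple[] {
  return triples.filter(t => t.subject !== entity && t.object !== entity);
}
```

Keeping the filter pure (and the confirmation dialog separate) is what makes the move from `confirm()` to `AlertDialog` a UI-only change.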

View File

@@ -21,10 +21,15 @@ import * as ProgressPrimitive from "@radix-ui/react-progress"

 import { cn } from "@/lib/utils"

+interface ProgressProps extends React.ComponentPropsWithoutRef<typeof ProgressPrimitive.Root> {
+  /** Show shimmer animation overlay for visual polish */
+  shimmer?: boolean
+}
+
 const Progress = React.forwardRef<
   React.ElementRef<typeof ProgressPrimitive.Root>,
-  React.ComponentPropsWithoutRef<typeof ProgressPrimitive.Root>
->(({ className, value, ...props }, ref) => (
+  ProgressProps
+>(({ className, value, shimmer = true, ...props }, ref) => (
   <ProgressPrimitive.Root
     ref={ref}
     className={cn(
@@ -34,7 +39,10 @@ const Progress = React.forwardRef<
     {...props}
   >
     <ProgressPrimitive.Indicator
-      className="h-full w-full flex-1 bg-primary transition-all"
+      className={cn(
+        "h-full w-full flex-1 bg-primary transition-all duration-300 ease-out",
+        shimmer && (value ?? 0) > 0 && (value ?? 0) < 100 && "progress-shimmer"
+      )}
       style={{ transform: `translateX(-${100 - (value || 0)}%)` }}
     />
   </ProgressPrimitive.Root>
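The shimmer class is only applied mid-progress: the indicator gates it on `shimmer && 0 < value < 100`, treating a missing value as 0. A pure sketch of that gate (the helper name is hypothetical; the component inlines this expression):

```typescript
// Mirrors the indicator's class condition: shimmer only while the bar
// is actually in progress, never at 0%, 100%, or an unset value.
function shouldShimmer(
  shimmer: boolean,
  value: number | null | undefined
): boolean {
  const v = value ?? 0;
  return shimmer && v > 0 && v < 100;
}
```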

View File

@@ -16,13 +16,25 @@
 //
 import { cn } from "@/lib/utils"

+interface SkeletonProps extends React.HTMLAttributes<HTMLDivElement> {
+  /** Use directional shimmer instead of pulse animation */
+  shimmer?: boolean
+}
+
 function Skeleton({
   className,
+  shimmer = false,
   ...props
-}: React.HTMLAttributes<HTMLDivElement>) {
+}: SkeletonProps) {
   return (
     <div
-      className={cn("animate-pulse rounded-md bg-muted", className)}
+      className={cn(
+        "rounded-md",
+        shimmer
+          ? "skeleton-shimmer"
+          : "animate-pulse bg-muted",
+        className
+      )}
       {...props}
     />
   )

View File

@@ -27,7 +27,7 @@ const Switch = React.forwardRef<
 >(({ className, ...props }, ref) => (
   <SwitchPrimitives.Root
     className={cn(
-      "peer inline-flex h-6 w-11 shrink-0 cursor-pointer items-center rounded-full border-2 border-transparent transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 focus-visible:ring-offset-background disabled:cursor-not-allowed disabled:opacity-50 data-[state=checked]:bg-primary data-[state=unchecked]:bg-input",
+      "peer inline-flex h-6 w-11 shrink-0 cursor-pointer items-center rounded-full border-2 border-transparent transition-colors duration-200 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 focus-visible:ring-offset-background disabled:cursor-not-allowed disabled:opacity-50 data-[state=checked]:bg-primary data-[state=unchecked]:bg-input active:scale-95",
       className
     )}
     {...props}
@@ -35,7 +35,7 @@ const Switch = React.forwardRef<
   >
     <SwitchPrimitives.Thumb
       className={cn(
-        "pointer-events-none block h-5 w-5 rounded-full bg-background shadow-lg ring-0 transition-transform data-[state=checked]:translate-x-5 data-[state=unchecked]:translate-x-0"
+        "pointer-events-none block h-5 w-5 rounded-full bg-background shadow-lg ring-0 transition-all duration-200 ease-[cubic-bezier(0.34,1.56,0.64,1)] data-[state=checked]:translate-x-5 data-[state=unchecked]:translate-x-0 data-[state=checked]:shadow-primary/25"
       )}
     />
   </SwitchPrimitives.Root>

View File

@@ -60,7 +60,7 @@ const TabsContent = React.forwardRef<
   <TabsPrimitive.Content
     ref={ref}
     className={cn(
-      "mt-2 ring-offset-background focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2",
+      "mt-2 ring-offset-background focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 data-[state=active]:animate-in data-[state=active]:fade-in-0 data-[state=active]:slide-in-from-bottom-1 data-[state=active]:duration-200",
       className
     )}
     {...props}

View File

@@ -48,6 +48,8 @@ const toastVariants = cva(
       default: "border bg-background text-foreground",
       destructive:
         "destructive group border-destructive bg-destructive text-destructive-foreground",
+      success:
+        "success group border-primary/30 bg-primary/10 text-foreground [&>svg]:text-primary",
     },
   },
   defaultVariants: {

View File

@@ -393,6 +393,11 @@ export function DocumentProvider({ children }: { children: React.ReactNode }) {
         requestBody.llmProvider = "ollama";
         requestBody.ollamaModel = model.model || "llama3.1:8b";
         console.log(`🦙 Using Ollama model: ${requestBody.ollamaModel}`);
+      } else if (model.provider === "vllm") {
+        requestBody.llmProvider = "vllm";
+        requestBody.vllmModel = model.model;
+        requestBody.vllmBaseUrl = model.baseURL || "http://localhost:8001/v1";
+        console.log(`🚀 Using vLLM model: ${requestBody.vllmModel}`);
       } else if (model.id === "nvidia-nemotron" || model.id === "nvidia-nemotron-nano") {
         requestBody.llmProvider = "nvidia";
         requestBody.nvidiaModel = model.model; // Pass the actual model name
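The provider branch above can be distilled into a pure helper for illustration. The field names (`llmProvider`, `ollamaModel`, `vllmModel`, `vllmBaseUrl`) and defaults come straight from the diff, but this function itself is a hypothetical sketch, not part of the codebase:

```typescript
interface ModelChoice {
  provider?: string;
  model?: string;
  baseURL?: string;
}

// Maps a selected model to the request-body fields the diff sets per provider.
function providerFields(model: ModelChoice): Record<string, string> {
  if (model.provider === "ollama") {
    return { llmProvider: "ollama", ollamaModel: model.model || "llama3.1:8b" };
  }
  if (model.provider === "vllm") {
    return {
      llmProvider: "vllm",
      vllmModel: model.model ?? "",
      vllmBaseUrl: model.baseURL || "http://localhost:8001/v1",
    };
  }
  // NVIDIA and other providers are handled by further branches in the diff.
  return {};
}
```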

View File

@@ -15,6 +15,7 @@
 // limitations under the License.
 //
 import { Database, aql } from 'arangojs';
+import { createHash } from 'crypto';

 /**
  * ArangoDB service for database operations
@@ -29,6 +30,36 @@ export class ArangoDBService {

   private constructor() {}

+  /**
+   * Generate a deterministic _key from input string using MD5 hash
+   * Uses Node.js built-in crypto module - truncated to 16 chars for compact keys
+   * @param input - String to hash
+   * @returns Hex-encoded hash string (16 chars, safe for ArangoDB _key)
+   */
+  private generateKey(input: string): string {
+    return createHash('md5').update(input).digest('hex').slice(0, 16);
+  }
+
+  /**
+   * Generate a deterministic _key for an entity based on its name
+   * @param name - Entity name
+   * @returns Deterministic _key string
+   */
+  private generateEntityKey(name: string): string {
+    return this.generateKey(name.toLowerCase().trim());
+  }
+
+  /**
+   * Generate a deterministic _key for an edge based on its endpoints and type
+   * @param fromKey - Source entity _key
+   * @param toKey - Target entity _key
+   * @param relationType - Relationship type/predicate
+   * @returns Deterministic _key string
+   */
+  private generateEdgeKey(fromKey: string, toKey: string, relationType: string): string {
+    return this.generateKey(`${fromKey}|${relationType.toLowerCase().trim()}|${toKey}`);
+  }
+
   /**
    * Get the singleton instance of ArangoDBService
    */
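The deterministic `_key` scheme above is what makes re-imports idempotent: the same entity name (case- and whitespace-insensitive) always hashes to the same key, so the `save(..., { overwriteMode: 'update' })` calls later in this diff act as upserts rather than creating duplicates. A standalone sketch mirroring the scheme:

```typescript
import { createHash } from "node:crypto";

// Mirror of the service's key scheme: md5, hex-encoded, truncated to 16 chars.
function generateKey(input: string): string {
  return createHash("md5").update(input).digest("hex").slice(0, 16);
}

// Entity keys normalize case and surrounding whitespace before hashing.
function generateEntityKey(name: string): string {
  return generateKey(name.toLowerCase().trim());
}

// Edge keys combine both endpoint keys and the normalized relation type,
// so the same triple always maps to the same edge document.
function generateEdgeKey(fromKey: string, toKey: string, relationType: string): string {
  return generateKey(`${fromKey}|${relationType.toLowerCase().trim()}|${toKey}`);
}
```

Note the trade-off: truncating to 16 hex characters (64 bits) keeps `_key` values compact at a small, usually negligible, collision risk.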
@ -77,9 +108,19 @@ export class ArangoDBService {
if (!collectionNames.includes(this.collectionName)) { if (!collectionNames.includes(this.collectionName)) {
await this.db.createCollection(this.collectionName); await this.db.createCollection(this.collectionName);
await this.db.collection(this.collectionName).ensureIndex({ await this.db.collection(this.collectionName).ensureIndex({
type: 'persistent', name: 'inverted_index',
type: 'inverted',
fields: ['name'], fields: ['name'],
unique: true analyzer: 'text_en'
});
await this.db.createView(`${this.collectionName}_view`, {
type: 'search-alias',
indexes: [
{
collection: this.collectionName,
index: 'inverted_index'
}
]
}); });
} }
@ -87,19 +128,25 @@ export class ArangoDBService {
if (!collectionNames.includes(this.edgeCollectionName)) { if (!collectionNames.includes(this.edgeCollectionName)) {
await this.db.createEdgeCollection(this.edgeCollectionName); await this.db.createEdgeCollection(this.edgeCollectionName);
await this.db.collection(this.edgeCollectionName).ensureIndex({ await this.db.collection(this.edgeCollectionName).ensureIndex({
type: 'persistent', name: 'inverted_index',
fields: ['type'] type: 'inverted',
fields: ['type'],
analyzer: 'text_en'
});
await this.db.createView(`${this.edgeCollectionName}_view`, {
type: 'search-alias',
indexes: [
{
collection: this.edgeCollectionName,
index: 'inverted_index'
}
]
}); });
} }
// Create documents collection if it doesn't exist // Create documents collection if it doesn't exist
if (!collectionNames.includes(this.documentsCollectionName)) { if (!collectionNames.includes(this.documentsCollectionName)) {
await this.db.createCollection(this.documentsCollectionName); await this.db.createCollection(this.documentsCollectionName);
await this.db.collection(this.documentsCollectionName).ensureIndex({
type: 'persistent',
fields: ['documentName'],
unique: true
});
} }
console.log('ArangoDB initialized successfully'); console.log('ArangoDB initialized successfully');
@@ -158,7 +205,8 @@ export class ArangoDBService {
     try {
       const collection = this.db.collection(this.collectionName);
-      return await collection.save(properties);
+      const doc = { ...properties, _key: this.generateEntityKey(properties.name) };
+      return await collection.save(doc, { overwriteMode: 'update' });
     } catch (error) {
       console.error('Error creating node in ArangoDB:', error);
       throw error;
@@ -186,12 +234,13 @@ export class ArangoDBService {
     try {
       const edgeCollection = this.db.collection(this.edgeCollectionName);
       const edgeData = {
+        _key: this.generateEdgeKey(fromKey, toKey, relationType),
         _from: `${this.collectionName}/${fromKey}`,
         _to: `${this.collectionName}/${toKey}`,
         type: relationType,
         ...properties
       };
-      return await edgeCollection.save(edgeData);
+      return await edgeCollection.save(edgeData, { overwriteMode: 'update' });
     } catch (error) {
       console.error('Error creating relationship in ArangoDB:', error);
       throw error;
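The upserts above rely on `generateEntityKey` and `generateEdgeKey` to turn arbitrary names into deterministic ArangoDB `_key` values, which is what makes `save(..., { overwriteMode: 'update' })` behave as an idempotent upsert. Those helpers are outside this hunk; a minimal sketch of what they might look like (hypothetical implementation, not the project's actual code — ArangoDB restricts the characters allowed in `_key`):

```typescript
import { createHash } from 'crypto';

// Hypothetical sketch: slugify the name to _key-safe characters and
// append a short hash so keys are deterministic and collision-resistant.
function generateEntityKey(name: string): string {
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_') // strip characters not valid in _key
    .replace(/^_+|_+$/g, '')
    .slice(0, 40);
  const hash = createHash('sha1').update(name).digest('hex').slice(0, 8);
  return `${slug}_${hash}`;
}

// The same triple always yields the same edge key, so re-imports
// update the existing edge instead of creating duplicates.
function generateEdgeKey(fromKey: string, toKey: string, type: string): string {
  const hash = createHash('sha1')
    .update(`${fromKey}|${type}|${toKey}`)
    .digest('hex')
    .slice(0, 16);
  return `e_${hash}`;
}
```

Any slug-plus-hash scheme works, as long as it is stable across runs; the hash suffix guards against two distinct names collapsing to the same slug.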
@@ -200,54 +249,69 @@ export class ArangoDBService {
   /**
    * Import triples (subject, predicate, object) into the graph database
+   * Batches inserts every 1000 documents by default
    * @param triples - Array of triples to import
+   * @param batchSize - Number of documents to insert per batch (default: 1000)
    * @returns Promise resolving when import is complete
    */
-  public async importTriples(triples: { subject: string; predicate: string; object: string }[]): Promise<void> {
+  public async importTriples(
+    triples: { subject: string; predicate: string; object: string }[],
+    batchSize: number = 1000
+  ): Promise<void> {
     if (!this.db) {
       throw new Error('ArangoDB connection not initialized. Call initialize() first.');
     }

+    let entityBatch: Array<{ _key: string; name: string }> = [];
+    let edgeBatch: Array<{ _key: string; _from: string; _to: string; type: string }> = [];
+
+    const importEntities = async () => {
+      if (entityBatch.length === 0) return;
+      await this.db!.collection(this.collectionName).saveAll(entityBatch, { overwriteMode: 'ignore' });
+      console.log(`[ArangoDB] Imported ${entityBatch.length} entities`);
+      entityBatch = [];
+    };
+
+    const importEdges = async () => {
+      if (edgeBatch.length === 0) return;
+      await this.db!.collection(this.edgeCollectionName).saveAll(edgeBatch, { overwriteMode: 'ignore' });
+      console.log(`[ArangoDB] Imported ${edgeBatch.length} edges`);
+      edgeBatch = [];
+    };
+
     try {
-      // Process triples in batches to improve performance
       for (const triple of triples) {
-        // Normalize triple values
        const normalizedSubject = triple.subject.trim();
        const normalizedPredicate = triple.predicate.trim();
        const normalizedObject = triple.object.trim();

-        // Skip invalid triples
        if (!normalizedSubject || !normalizedPredicate || !normalizedObject) {
          console.warn('Skipping invalid triple:', triple);
          continue;
        }

-        // Upsert subject and object nodes
-        const subjectNode = await this.upsertEntity(normalizedSubject);
-        const objectNode = await this.upsertEntity(normalizedObject);
+        const subjectKey = this.generateEntityKey(normalizedSubject);
+        const objectKey = this.generateEntityKey(normalizedObject);
+        const edgeKey = this.generateEdgeKey(subjectKey, objectKey, normalizedPredicate);

-        // Check if relationship already exists
-        const existingEdges = await this.executeQuery(
-          `FOR e IN ${this.edgeCollectionName}
-           FILTER e._from == @from AND e._to == @to AND e.type == @type
-           RETURN e`,
-          {
-            from: `${this.collectionName}/${subjectNode._key}`,
-            to: `${this.collectionName}/${objectNode._key}`,
-            type: normalizedPredicate
-          }
-        );
+        entityBatch.push({ _key: subjectKey, name: normalizedSubject });
+        entityBatch.push({ _key: objectKey, name: normalizedObject });

-        // Create relationship if it doesn't exist
-        if (existingEdges.length === 0) {
-          await this.createRelationship(
-            subjectNode._key,
-            objectNode._key,
-            normalizedPredicate
-          );
-        }
+        edgeBatch.push({
+          _key: edgeKey,
+          _from: `${this.collectionName}/${subjectKey}`,
+          _to: `${this.collectionName}/${objectKey}`,
+          type: normalizedPredicate
+        });
+
+        if (entityBatch.length >= batchSize) await importEntities();
+        if (edgeBatch.length >= batchSize) await importEdges();
      }

+      // Flush remaining
+      await importEntities();
+      await importEdges();
+
      console.log(`Successfully imported ${triples.length} triples into ArangoDB`);
    } catch (error) {
      console.error('Error importing triples into ArangoDB:', error);
@@ -255,28 +319,6 @@ export class ArangoDBService {
    }
  }

-  /**
-   * Helper method to upsert (create or update) an entity
-   * @param name - Entity name
-   * @returns Promise resolving to the entity
-   */
-  private async upsertEntity(name: string): Promise<any> {
-    const collection = this.db!.collection(this.collectionName);
-
-    // Look for existing entity
-    const existing = await this.executeQuery(
-      `FOR e IN ${this.collectionName} FILTER e.name == @name RETURN e`,
-      { name }
-    );
-
-    if (existing.length > 0) {
-      return existing[0];
-    }
-
-    // Create new entity
-    return await collection.save({ name });
-  }
-
   /**
    * Check if a document has already been processed and stored in ArangoDB
    * @param documentName - Name of the document to check
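The accumulate-and-flush logic that `importTriples` now uses (fill a buffer, flush once it reaches `batchSize`, flush the remainder at the end) can be expressed as a small standalone helper; a sketch, independent of arangojs:

```typescript
// Sketch of the flush-at-threshold pattern used by importTriples:
// items accumulate in a buffer and are handed to `flush` in batches,
// with one final flush for whatever is left over.
async function processInBatches<T>(
  items: T[],
  batchSize: number,
  flush: (batch: T[]) => Promise<void>
): Promise<number> {
  let buffer: T[] = [];
  let flushes = 0;
  for (const item of items) {
    buffer.push(item);
    if (buffer.length >= batchSize) {
      await flush(buffer);
      flushes++;
      buffer = [];
    }
  }
  if (buffer.length > 0) { // flush remaining
    await flush(buffer);
    flushes++;
  }
  return flushes;
}
```

Compared with the old per-triple `upsertEntity` round trips, this turns N document writes into roughly N / batchSize `saveAll` calls, which is where the import speedup comes from.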
@@ -287,16 +329,9 @@ export class ArangoDBService {
       throw new Error('ArangoDB connection not initialized. Call initialize() first.');
     }

-    try {
-      const existing = await this.executeQuery(
-        `FOR d IN ${this.documentsCollectionName} FILTER d.documentName == @documentName RETURN d`,
-        { documentName }
-      );
-      return existing.length > 0;
-    } catch (error) {
-      console.error('Error checking if document is processed:', error);
-      return false;
-    }
+    const collection = this.db.collection(this.documentsCollectionName);
+    const key = this.generateKey(documentName.trim());
+    return await collection.documentExists(key);
   }
   /**
@@ -312,30 +347,18 @@ export class ArangoDBService {
     try {
       const collection = this.db.collection(this.documentsCollectionName);
-      await collection.save({
+      const doc = {
+        _key: this.generateKey(documentName.trim()),
         documentName,
         tripleCount,
         processedAt: new Date().toISOString()
-      });
+      };
+      await collection.save(doc, { overwriteMode: 'replace' });
       console.log(`Marked document "${documentName}" as processed with ${tripleCount} triples`);
     } catch (error) {
-      // If error is due to unique constraint (document already exists), update it instead
-      if (error && typeof error === 'object' && 'errorNum' in error && error.errorNum === 1210) {
-        console.log(`Document "${documentName}" already exists, updating...`);
-        await this.executeQuery(
-          `FOR d IN ${this.documentsCollectionName}
-           FILTER d.documentName == @documentName
-           UPDATE d WITH { tripleCount: @tripleCount, processedAt: @processedAt } IN ${this.documentsCollectionName}`,
-          {
-            documentName,
-            tripleCount,
-            processedAt: new Date().toISOString()
-          }
-        );
-      } else {
-        console.error('Error marking document as processed:', error);
-        throw error;
-      }
+      console.error('Error marking document as processed:', error);
+      throw error;
     }
   }
@@ -392,12 +415,6 @@ export class ArangoDBService {
         `FOR r IN ${this.edgeCollectionName} RETURN r`
       );

-      // Build id to key mapping for relationships
-      const idToKey = new Map<string, string>();
-      for (const entity of entities) {
-        idToKey.set(entity._id, entity._key);
-      }
-
       // Format nodes in a way compatible with the application
       const nodes = entities.map(entity => ({
         id: entity._key,
@@ -408,7 +425,6 @@ export class ArangoDBService {

       // Format relationships in a way compatible with the application
       const formattedRelationships = relationships.map(rel => {
-        // Extract the entity keys from _from and _to
         const source = rel._from.split('/')[1];
         const target = rel._to.split('/')[1];
@@ -507,16 +523,19 @@ export class ArangoDBService {
     }

   /**
-   * Perform graph traversal to find relevant triples using ArangoDB's native graph capabilities
+   * Perform graph traversal to find relevant triples using ArangoDB's native text search and graph capabilities
+   * Uses inverted indexes with BM25 scoring for efficient keyword matching
    * @param keywords - Array of keywords to search for
    * @param maxDepth - Maximum traversal depth (default: 2)
    * @param maxResults - Maximum number of results to return (default: 100)
+   * @param maxSeeds - Maximum number of seed nodes/edges from text search (default: 50)
    * @returns Promise resolving to array of triples with relevance scores
    */
   public async graphTraversal(
     keywords: string[],
     maxDepth: number = 2,
-    maxResults: number = 100
+    maxResults: number = 100,
+    maxSeeds: number = 50
   ): Promise<Array<{
     subject: string;
     predicate: string;
@@ -540,93 +559,89 @@ export class ArangoDBService {
       return [];
     }

-    // AQL query that:
-    // 1. Finds seed nodes matching keywords
-    // 2. Performs graph traversal from those nodes
-    // 3. Scores results based on keyword matches and depth
     const query = `
-      // Find all entities matching keywords (case-insensitive)
+      // 1. Tokenize keywords using the same analyzer as the index
+      LET keywords_merged = CONCAT_SEPARATOR(" ", @keywords)
+      LET keywords_tokens = TOKENS(keywords_merged, "text_en")
+
+      // 2. Match for entity.name
       LET seedNodes = (
-        FOR entity IN ${this.collectionName}
-          LET lowerName = LOWER(entity.name)
-          LET matches = (
-            FOR keyword IN @keywords
-              FILTER CONTAINS(lowerName, keyword)
-              RETURN 1
-          )
-          FILTER LENGTH(matches) > 0
-          RETURN {
-            node: entity,
-            matchCount: LENGTH(matches)
-          }
+        FOR vertex IN ${this.collectionName}_view
+          SEARCH ANALYZER(vertex.name IN keywords_tokens, "text_en")
+          LET score = BM25(vertex)
+          SORT score DESC
+          LIMIT @maxSeeds
+          RETURN { vertex, score }
+      )
+
+      // 3. Match for relationship.type
+      LET seedEdges = (
+        FOR edge IN ${this.edgeCollectionName}_view
+          SEARCH ANALYZER(edge.type IN keywords_tokens, "text_en")
+          LET score = BM25(edge)
+          SORT score DESC
+          LIMIT @maxSeeds
+          RETURN { edge, score }
       )

-      // Perform graph traversal from seed nodes
-      // Multi-hop: Extract ALL edges in each path, not just the final edge
+      // 4. Normalize scores
+      LET maxNodeScore = MAX(seedNodes[*].score) || 1
+      LET maxEdgeScore = MAX(seedEdges[*].score) || 1
+
+      // 5. Traverse from seedNodes up to maxDepth
       LET traversalResults = (
         FOR seed IN seedNodes
-          FOR v, e, p IN 0..@maxDepth ANY seed.node._id ${this.edgeCollectionName}
-            OPTIONS {uniqueVertices: 'global', bfs: true}
-            FILTER e != null
-
-            // Extract all edges from the path for multi-hop context
-            LET pathEdges = (
-              FOR edgeIdx IN 0..(LENGTH(p.edges) - 1)
-                LET pathEdge = p.edges[edgeIdx]
-                LET subjectEntity = DOCUMENT(pathEdge._from)
-                LET objectEntity = DOCUMENT(pathEdge._to)
-                LET subjectLower = LOWER(subjectEntity.name)
-                LET objectLower = LOWER(objectEntity.name)
-                LET predicateLower = LOWER(pathEdge.type)
-
-                // Calculate score for this edge
-                LET subjectMatches = (
-                  FOR kw IN @keywords
-                    FILTER CONTAINS(subjectLower, kw)
-                    LET isExact = (subjectLower == kw)
-                    RETURN isExact ? 1000 : (LENGTH(kw) * LENGTH(kw))
-                )
-                LET objectMatches = (
-                  FOR kw IN @keywords
-                    FILTER CONTAINS(objectLower, kw)
-                    LET isExact = (objectLower == kw)
-                    RETURN isExact ? 1000 : (LENGTH(kw) * LENGTH(kw))
-                )
-                LET predicateMatches = (
-                  FOR kw IN @keywords
-                    FILTER CONTAINS(predicateLower, kw)
-                    LET isExact = (predicateLower == kw)
-                    RETURN isExact ? 50 : (LENGTH(kw) * LENGTH(kw))
-                )
-                LET totalScore = SUM(subjectMatches) + SUM(objectMatches) + SUM(predicateMatches)
-
-                // Depth penalty (edges earlier in path get slight boost)
-                LET depthPenalty = 1.0 / (1.0 + (edgeIdx * 0.1))
-                LET confidence = MIN([totalScore * depthPenalty / 1000.0, 1.0])
-                FILTER confidence > 0
-
-                RETURN {
-                  subject: subjectEntity.name,
-                  predicate: pathEdge.type,
-                  object: objectEntity.name,
-                  confidence: confidence,
-                  depth: edgeIdx,
-                  _edgeId: pathEdge._id,
-                  pathLength: LENGTH(p.edges)
-                }
-            )
-
-            // Return all edges from this path
-            FOR pathTriple IN pathEdges
-              RETURN pathTriple
+          FOR v, e, p IN 1..@maxDepth ANY seed.vertex ${this.edgeCollectionName}
+            OPTIONS { uniqueVertices: 'path', bfs: true }
+            LET subjectEntity = DOCUMENT(e._from)
+            LET objectEntity = DOCUMENT(e._to)
+            LET depth = LENGTH(p.edges) - 1
+
+            // Depth penalty: closer to seed = higher score
+            LET depthPenalty = 1.0 / (1.0 + depth * 0.2)
+
+            // Normalize seed score and apply depth penalty
+            LET normalizedSeedScore = seed.score / maxNodeScore
+            LET confidence = normalizedSeedScore * depthPenalty
+
+            RETURN {
+              subject: subjectEntity.name,
+              predicate: e.type,
+              object: objectEntity.name,
+              confidence: confidence,
+              depth: depth,
+              _edgeId: e._id,
+              pathLength: LENGTH(p.edges)
+            }
       )

-      // Remove duplicates by edge ID and sort by confidence
+      // 6. Collect triples from seedEdges (direct hits)
+      LET edgeResults = (
+        FOR seed IN seedEdges
+          LET subjectEntity = DOCUMENT(seed.edge._from)
+          LET objectEntity = DOCUMENT(seed.edge._to)
+
+          // Direct edge matches get a boost (depth 0)
+          LET normalizedScore = seed.score / maxEdgeScore
+
+          RETURN {
+            subject: subjectEntity.name,
+            predicate: seed.edge.type,
+            object: objectEntity.name,
+            confidence: normalizedScore * 1.2, // Boost direct edge matches
+            depth: 0,
+            _edgeId: seed.edge._id,
+            pathLength: 1
+          }
+      )
+
+      // 7. Combine traversalResults and edgeResults
+      LET combinedResults = APPEND(traversalResults, edgeResults)
+
+      // 8. Remove duplicates by edge ID and sort by confidence
       LET uniqueResults = (
-        FOR result IN traversalResults
+        FOR result IN combinedResults
           COLLECT edgeId = result._edgeId INTO groups
           LET best = FIRST(
             FOR g IN groups
@@ -636,8 +651,9 @@ export class ArangoDBService {
             RETURN best
       )

-      // Sort by confidence and limit results
+      // 9. Sort by confidence and limit results
       FOR result IN uniqueResults
+        FILTER result != null
         SORT result.confidence DESC, result.depth ASC
         LIMIT @maxResults
         RETURN {
@@ -655,14 +671,15 @@ export class ArangoDBService {
       const results = await this.executeQuery(query, {
         keywords: keywordConditions,
         maxDepth,
-        maxResults
+        maxResults,
+        maxSeeds
       });

-      console.log(`[ArangoDB] Multi-hop graph traversal found ${results.length} triples for keywords: ${keywords.join(', ')}`);
+      console.log(`[ArangoDB] Found ${results.length} triples for keywords: ${keywords.join(', ')}`);

       // Log top 10 results with confidence scores
       if (results.length > 0) {
-        console.log('[ArangoDB] Top 10 triples by confidence (multi-hop):');
+        console.log('[ArangoDB] Top 10 triples by confidence:');
         results.slice(0, 10).forEach((triple: any, idx: number) => {
           const pathInfo = triple.pathLength ? ` path=${triple.pathLength}` : '';
           console.log(`  ${idx + 1}. [conf=${triple.confidence?.toFixed(3)}] ${triple.subject} -> ${triple.predicate} -> ${triple.object} (depth=${triple.depth}${pathInfo})`);
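The scoring in the rewritten query is plain arithmetic: BM25 seed scores are normalized by the maximum, damped by a depth penalty of 1 / (1 + 0.2 * depth), and direct edge hits (depth 0) get a 1.2x boost before deduplication. The same math in standalone TypeScript, useful for checking the curve outside AQL:

```typescript
// Mirror of the confidence arithmetic in the AQL above: seed scores
// are normalized by the max (falling back to 1, like `MAX(...) || 1`),
// then damped by traversal depth.
function traversalConfidence(seedScore: number, maxSeedScore: number, depth: number): number {
  const depthPenalty = 1.0 / (1.0 + depth * 0.2);
  return (seedScore / (maxSeedScore || 1)) * depthPenalty;
}

// Direct edge matches sit at depth 0 and get a 1.2x boost.
function directEdgeConfidence(edgeScore: number, maxEdgeScore: number): number {
  return (edgeScore / (maxEdgeScore || 1)) * 1.2;
}
```

Note the consequence of the boost: a direct edge hit with the top BM25 score lands at 1.2, above any traversal result (which caps at 1.0), so direct hits always sort first.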

View File

@@ -32,16 +32,24 @@ import type { Triple } from '@/types/graph';
  */
 export class BackendService {
   private graphDBService: GraphDBService;
-  private pineconeService: QdrantService;
+  private qdrantService: QdrantService;
   private sentenceTransformerUrl: string = 'http://sentence-transformers:80';
   private modelName: string = 'all-MiniLM-L6-v2';
   private static instance: BackendService;
   private initialized: boolean = false;
-  private activeGraphDbType: GraphDBType = 'arangodb';
+  private activeGraphDbType: GraphDBType | null = null; // Set at runtime, not build time
+
+  private getRuntimeGraphDbType(): GraphDBType {
+    if (this.activeGraphDbType === null) {
+      this.activeGraphDbType = (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+      console.log(`[BackendService] Initialized activeGraphDbType at runtime: ${this.activeGraphDbType}`);
+    }
+    return this.activeGraphDbType;
+  }

   private constructor() {
     this.graphDBService = GraphDBService.getInstance();
-    this.pineconeService = QdrantService.getInstance();
+    this.qdrantService = QdrantService.getInstance();

     // Use environment variables if available
     if (process.env.SENTENCE_TRANSFORMER_URL) {
@@ -64,16 +72,17 @@ export class BackendService {
   /**
    * Initialize the backend services
-   * @param graphDbType - Type of graph database to use (neo4j or arangodb)
+   * @param graphDbType - Type of graph database to use (defaults to GRAPH_DB_TYPE env var)
    */
-  public async initialize(graphDbType: GraphDBType = 'arangodb'): Promise<void> {
-    this.activeGraphDbType = graphDbType;
+  public async initialize(graphDbType?: GraphDBType): Promise<void> {
+    const dbType = graphDbType || (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+    this.activeGraphDbType = dbType;

     // Initialize Graph Database
     if (!this.graphDBService.isInitialized()) {
       try {
         // Get the appropriate service based on type
-        const graphDbService = getGraphDbService(graphDbType);
+        const graphDbService = getGraphDbService(dbType);

         // Try to get settings from server settings API first
         let serverSettings: Record<string, string> = {};
@@ -88,7 +97,7 @@ export class BackendService {
           console.log('Failed to load settings from server API, falling back to environment variables:', error);
         }

-        if (graphDbType === 'neo4j') {
+        if (dbType === 'neo4j') {
           // Get Neo4j credentials from server settings first, then fallback to environment
           const uri = serverSettings.neo4j_url || process.env.NEO4J_URI;
           const username = serverSettings.neo4j_user || process.env.NEO4J_USER || process.env.NEO4J_USERNAME;
@@ -107,9 +116,9 @@ export class BackendService {
           console.log(`Using ArangoDB database: ${dbName}`);
           await this.graphDBService.initialize('arangodb', url, username, password);
         }
-        console.log(`${graphDbType} initialized successfully in backend service`);
+        console.log(`${dbType} initialized successfully in backend service`);
       } catch (error) {
-        console.error(`Failed to initialize ${graphDbType} in backend service:`, error);
+        console.error(`Failed to initialize ${dbType} in backend service:`, error);
         if (process.env.NODE_ENV === 'development') {
           console.log('Development mode: Continuing despite graph database initialization error');
         } else {
@@ -118,9 +127,9 @@ export class BackendService {
       }
     }

-    // Initialize Pinecone
-    if (!this.pineconeService.isInitialized()) {
-      await this.pineconeService.initialize();
+    // Initialize Qdrant
+    if (!this.qdrantService.isInitialized()) {
+      await this.qdrantService.initialize();
     }

     // Check if sentence-transformer service is available
@@ -151,7 +160,7 @@ export class BackendService {
    * Get the active graph database type
    */
   public getGraphDbType(): GraphDBType {
-    return this.activeGraphDbType;
+    return this.getRuntimeGraphDbType();
   }

   /**
@@ -183,7 +192,7 @@ export class BackendService {
   }

   /**
-   * Process and store triples in graph database and embeddings in Pinecone
+   * Process and store triples in graph database and embeddings in Qdrant
    */
   public async processTriples(triples: Triple[]): Promise<void> {
     // Preprocess triples: lowercase and remove duplicates
@@ -232,8 +241,8 @@ export class BackendService {
       }
     }

-    // Store embeddings and text content in Pinecone
-    await this.pineconeService.storeEmbeddings(entityEmbeddings, textContent);
+    // Store embeddings and text content in Qdrant
+    await this.qdrantService.storeEmbeddings(entityEmbeddings, textContent);

     console.log(`Backend processing complete: ${uniqueTriples.length} triples and ${entityList.length} entities stored using ${this.activeGraphDbType}`);
   }
@@ -253,7 +262,7 @@ export class BackendService {
     const filteredKeywords = keywords.filter(kw => !this.isStopWord(kw));

     // If using ArangoDB, use its native graph traversal capabilities
-    if (this.activeGraphDbType === 'arangodb') {
+    if (this.getRuntimeGraphDbType() === 'arangodb') {
       console.log(`Using ArangoDB native graph traversal for keywords: ${filteredKeywords.join(', ')}`);

       try {
@@ -392,8 +401,8 @@ export class BackendService {
     // Generate embedding for query
     const queryEmbedding = (await this.generateEmbeddings([queryText]))[0];

-    // Find nearest neighbors using Pinecone
-    const seedNodes = await this.pineconeService.findSimilarEntities(queryEmbedding, kNeighbors);
+    // Find nearest neighbors using Qdrant
+    const seedNodes = await this.qdrantService.findSimilarEntities(queryEmbedding, kNeighbors);
     console.log(`Found ${seedNodes.length} seed nodes for query: "${queryText}"`);

     // Get graph data from graph database
@@ -649,7 +658,7 @@ Answer:`;
     const embeddings = await this.generateEmbeddings(documents);

     // Store in Qdrant document-embeddings collection
-    await this.pineconeService.storeDocumentChunks(documents, embeddings, metadata);
+    await this.qdrantService.storeDocumentChunks(documents, embeddings, metadata);

     console.log(`✅ Stored ${documents.length} document chunks in document-embeddings collection`);
   }
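Both `BackendService` above and `GraphDBService` below replace a field initializer (`= 'arangodb'`, evaluated when the module is loaded, which under a Next.js build can mean build time) with a lazy getter that reads `GRAPH_DB_TYPE` on first use and caches it. The pattern in isolation, as a minimal sketch:

```typescript
type GraphDBType = 'neo4j' | 'arangodb';

// Sketch of the lazy-read pattern: the env var is consulted on first
// access rather than at module-evaluation (build) time, then cached
// so later env changes do not flip the active type mid-run.
class RuntimeConfig {
  private dbType: GraphDBType | null = null;

  getDbType(): GraphDBType {
    if (this.dbType === null) {
      this.dbType = (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
    }
    return this.dbType;
  }
}
```

The caching step matters: after the first read, the instance keeps answering with the value it saw, which matches the `activeDBType === null` guard in both services.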

View File

@@ -22,18 +22,17 @@
 /**
  * Initialize default database settings if not already set
  * Called before syncing with server to ensure defaults are available
+ * NOTE: Don't set graph_db_type here - let server's GRAPH_DB_TYPE env var control it
  */
 export function initializeDefaultSettings() {
   if (typeof window === 'undefined') {
     return; // Only run on client side
   }

-  // Set default graph DB type to ArangoDB if not set
-  if (!localStorage.getItem('graph_db_type')) {
-    localStorage.setItem('graph_db_type', 'arangodb');
-  }
+  // Don't set graph_db_type default - let it be controlled by server's GRAPH_DB_TYPE env var
+  // The server will use its environment variable if no client setting is provided

-  // Set default ArangoDB settings if not set
+  // Set default connection settings only (not the database type selection)
   if (!localStorage.getItem('arango_url')) {
     localStorage.setItem('arango_url', 'http://localhost:8529');
   }
@@ -41,6 +40,11 @@ export function initializeDefaultSettings() {
   if (!localStorage.getItem('arango_db')) {
     localStorage.setItem('arango_db', 'txt2kg');
   }
+
+  // Set default Neo4j settings
+  if (!localStorage.getItem('neo4j_url')) {
+    localStorage.setItem('neo4j_url', 'bolt://localhost:7687');
+  }
 }
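The repeated `if (!localStorage.getItem(...)) localStorage.setItem(...)` guards above all follow one set-if-absent pattern; extracted as a helper (hypothetical refactor, shown against a minimal localStorage-like interface so it can run outside a browser):

```typescript
// Minimal subset of the Web Storage API used by the defaults code.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Set a default only when the user has not already chosen a value,
// mirroring the falsy check in the guards above (null or empty string).
function setDefault(store: KVStore, key: string, value: string): void {
  if (!store.getItem(key)) {
    store.setItem(key, value);
  }
}
```

With this helper, each default becomes one line, e.g. `setDefault(localStorage, 'arango_url', 'http://localhost:8529')`, and an existing user choice is never overwritten.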
 /**
@@ -124,21 +128,6 @@ export async function syncSettingsWithServer() {
     settings.NVIDIA_API_KEY = nvidiaApiKey;
   }

-  // Pinecone settings
-  const pineconeApiKey = localStorage.getItem('pinecone_api_key');
-  if (pineconeApiKey) {
-    settings.pinecone_api_key = pineconeApiKey;
-  }
-
-  const pineconeEnvironment = localStorage.getItem('pinecone_environment');
-  if (pineconeEnvironment) {
-    settings.pinecone_environment = pineconeEnvironment;
-  }
-
-  const pineconeIndex = localStorage.getItem('pinecone_index');
-  if (pineconeIndex) {
-    settings.pinecone_index = pineconeIndex;
-  }
-
   // Skip the API call if there are no settings to sync
   if (Object.keys(settings).length === 0) {
View File

@ -26,7 +26,7 @@ export type GraphDBType = 'neo4j' | 'arangodb';
export class GraphDBService { export class GraphDBService {
private neo4jService: Neo4jService; private neo4jService: Neo4jService;
private arangoDBService: ArangoDBService; private arangoDBService: ArangoDBService;
private activeDBType: GraphDBType = 'arangodb'; // Default to ArangoDB private activeDBType: GraphDBType | null = null; // Set at runtime, not build time
private static instance: GraphDBService; private static instance: GraphDBService;
private constructor() { private constructor() {
@@ -34,6 +34,17 @@ export class GraphDBService {
     this.arangoDBService = ArangoDBService.getInstance();
   }

+  /**
+   * Get the active DB type, reading from env at runtime if not set
+   */
+  private getActiveDBType(): GraphDBType {
+    if (this.activeDBType === null) {
+      this.activeDBType = (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+      console.log(`[GraphDBService] Initialized activeDBType at runtime: ${this.activeDBType}`);
+    }
+    return this.activeDBType;
+  }
+
   /**
    * Get the singleton instance of GraphDBService
    */
@@ -46,24 +57,25 @@ export class GraphDBService {
   /**
    * Initialize the graph database with the specified type
-   * @param dbType - Type of graph database to use
+   * @param dbType - Type of graph database to use (defaults to GRAPH_DB_TYPE env var)
    * @param uri - Connection URL
    * @param username - Database username
    * @param password - Database password
    */
-  public async initialize(dbType: GraphDBType = 'arangodb', uri?: string, username?: string, password?: string): Promise<void> {
-    this.activeDBType = dbType;
+  public async initialize(dbType?: GraphDBType, uri?: string, username?: string, password?: string): Promise<void> {
+    const graphDbType = dbType || (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+    this.activeDBType = graphDbType;

     try {
-      if (dbType === 'neo4j') {
+      if (graphDbType === 'neo4j') {
         this.neo4jService.initialize(uri, username, password);
         console.log('Neo4j initialized successfully');
-      } else if (dbType === 'arangodb') {
+      } else if (graphDbType === 'arangodb') {
         await this.arangoDBService.initialize(uri, undefined, username, password);
         console.log('ArangoDB initialized successfully');
       }
     } catch (error) {
-      console.error(`Failed to initialize ${dbType}:`, error);
+      console.error(`Failed to initialize ${graphDbType}:`, error);
       throw error;
     }
   }
@@ -79,14 +91,14 @@ export class GraphDBService {
    * Get the active graph database type
    */
   public getDBType(): GraphDBType {
-    return this.activeDBType;
+    return this.getActiveDBType();
   }

   /**
    * Check if the active database is initialized
    */
   public isInitialized(): boolean {
-    if (this.activeDBType === 'neo4j') {
+    if (this.getActiveDBType() === 'neo4j') {
       return this.neo4jService.isInitialized();
     } else {
       return this.arangoDBService.isInitialized();
@@ -97,7 +109,7 @@ export class GraphDBService {
    * Import triples into the active graph database
    */
   public async importTriples(triples: { subject: string; predicate: string; object: string }[]): Promise<void> {
-    if (this.activeDBType === 'neo4j') {
+    if (this.getActiveDBType() === 'neo4j') {
       await this.neo4jService.importTriples(triples);
     } else {
       await this.arangoDBService.importTriples(triples);
@@ -121,7 +133,7 @@ export class GraphDBService {
       [key: string]: any
     }>;
   }> {
-    if (this.activeDBType === 'neo4j') {
+    if (this.getActiveDBType() === 'neo4j') {
       return await this.neo4jService.getGraphData();
     } else {
       return await this.arangoDBService.getGraphData();
@ -142,7 +154,7 @@ export class GraphDBService {
resultCount: number; resultCount: number;
} }
): Promise<void> { ): Promise<void> {
if (this.activeDBType === 'neo4j') { if (this.getActiveDBType() === 'neo4j') {
await this.neo4jService.logQuery(query, queryMode, metrics); await this.neo4jService.logQuery(query, queryMode, metrics);
} else { } else {
await this.arangoDBService.logQuery(query, queryMode, metrics); await this.arangoDBService.logQuery(query, queryMode, metrics);
@ -153,7 +165,7 @@ export class GraphDBService {
* Get query logs from the active graph database * Get query logs from the active graph database
*/ */
public async getQueryLogs(limit: number = 100): Promise<any[]> { public async getQueryLogs(limit: number = 100): Promise<any[]> {
if (this.activeDBType === 'neo4j') { if (this.getActiveDBType() === 'neo4j') {
return await this.neo4jService.getQueryLogs(limit); return await this.neo4jService.getQueryLogs(limit);
} else { } else {
return await this.arangoDBService.getQueryLogs(limit); return await this.arangoDBService.getQueryLogs(limit);
@ -164,7 +176,7 @@ export class GraphDBService {
* Close the connection to the active graph database * Close the connection to the active graph database
*/ */
public async close(): Promise<void> { public async close(): Promise<void> {
if (this.activeDBType === 'neo4j') { if (this.getActiveDBType() === 'neo4j') {
this.neo4jService.close(); this.neo4jService.close();
} else { } else {
this.arangoDBService.close(); this.arangoDBService.close();
@ -175,7 +187,7 @@ export class GraphDBService {
* Get info about the active graph database driver * Get info about the active graph database driver
*/ */
public getDriverInfo(): Record<string, any> { public getDriverInfo(): Record<string, any> {
if (this.activeDBType === 'neo4j') { if (this.getActiveDBType() === 'neo4j') {
return this.neo4jService.getDriverInfo(); return this.neo4jService.getDriverInfo();
} else { } else {
return this.arangoDBService.getDriverInfo(); return this.arangoDBService.getDriverInfo();
@ -197,7 +209,7 @@ export class GraphDBService {
confidence: number; confidence: number;
depth?: number; depth?: number;
}>> { }>> {
if (this.activeDBType === 'arangodb') { if (this.getActiveDBType() === 'arangodb') {
return await this.arangoDBService.graphTraversal(keywords, maxDepth, maxResults); return await this.arangoDBService.graphTraversal(keywords, maxDepth, maxResults);
} else { } else {
// Neo4j doesn't have this method yet, return empty array // Neo4j doesn't have this method yet, return empty array
@ -210,7 +222,7 @@ export class GraphDBService {
* Clear all data from the active graph database * Clear all data from the active graph database
*/ */
public async clearDatabase(): Promise<void> { public async clearDatabase(): Promise<void> {
if (this.activeDBType === 'neo4j') { if (this.getActiveDBType() === 'neo4j') {
// TODO: Implement Neo4j clear database functionality // TODO: Implement Neo4j clear database functionality
throw new Error('Clear database functionality not implemented for Neo4j'); throw new Error('Clear database functionality not implemented for Neo4j');
} else { } else {
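Outside the diff, the swap from the cached `activeDBType` field to `getActiveDBType()` matters because a field is frozen at construction time. A minimal Python sketch of the difference, assuming (as the companion graph-db-util change suggests) that the getter re-reads `GRAPH_DB_TYPE` at call time; the names below are illustrative, not the project's TypeScript API:

```python
import os

class GraphDBServiceSketch:
    def __init__(self):
        # BEFORE: type resolved once, at construction time
        self.active_db_type = os.environ.get("GRAPH_DB_TYPE", "arangodb")

    def get_active_db_type(self) -> str:
        # AFTER: type re-resolved from the environment on every dispatch
        return os.environ.get("GRAPH_DB_TYPE", "arangodb")

os.environ.pop("GRAPH_DB_TYPE", None)
svc = GraphDBServiceSketch()
os.environ["GRAPH_DB_TYPE"] = "neo4j"
print(svc.active_db_type)        # still "arangodb" -- frozen at construction
print(svc.get_active_db_type())  # "neo4j" -- sees the updated environment
```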
@@ -18,20 +18,34 @@ import { GraphDBService, GraphDBType } from './graph-db-service';
 import { Neo4jService } from './neo4j';
 import { ArangoDBService } from './arangodb';

+/**
+ * Get the default graph database type from environment or fallback to arangodb
+ * Note: This is called at runtime, not build time, so process.env should be available
+ */
+function getDefaultGraphDbType(): GraphDBType {
+  const envType = process.env.GRAPH_DB_TYPE;
+  console.log(`[graph-db-util] getDefaultGraphDbType: env=${envType}`);
+  return (envType as GraphDBType) || 'arangodb';
+}
+
 /**
  * Get the appropriate graph database service based on the graph database type.
  * This is useful for API routes that need direct access to a specific graph database.
  *
- * @param graphDbType - The type of graph database to use
+ * @param graphDbType - The type of graph database to use (defaults to GRAPH_DB_TYPE env var)
  */
-export function getGraphDbService(graphDbType: GraphDBType = 'arangodb') {
-  if (graphDbType === 'neo4j') {
+export function getGraphDbService(graphDbType?: GraphDBType) {
+  const dbType = graphDbType || getDefaultGraphDbType();
+  if (dbType === 'neo4j') {
     return Neo4jService.getInstance();
-  } else if (graphDbType === 'arangodb') {
+  } else if (dbType === 'arangodb') {
     return ArangoDBService.getInstance();
   } else {
-    // Default to ArangoDB
-    return ArangoDBService.getInstance();
+    // Default based on environment
+    return getDefaultGraphDbType() === 'neo4j'
+      ? Neo4jService.getInstance()
+      : ArangoDBService.getInstance();
   }
 }
@@ -39,12 +53,13 @@ export function getGraphDbService(graphDbType: GraphDBType = 'arangodb') {
  * Initialize the graph database directly (not using GraphDBService).
  * This is useful for API routes that need direct access to a specific graph database.
  *
- * @param graphDbType - The type of graph database to use
+ * @param graphDbType - The type of graph database to use (defaults to GRAPH_DB_TYPE env var)
  */
-export async function initializeGraphDb(graphDbType: GraphDBType = 'arangodb'): Promise<void> {
-  const service = getGraphDbService(graphDbType);
+export async function initializeGraphDb(graphDbType?: GraphDBType): Promise<void> {
+  const dbType = graphDbType || getDefaultGraphDbType();
+  const service = getGraphDbService(dbType);

-  if (graphDbType === 'neo4j') {
+  if (dbType === 'neo4j') {
     // Get Neo4j credentials from environment
     const uri = process.env.NEO4J_URI;
     const username = process.env.NEO4J_USER || process.env.NEO4J_USERNAME;
@@ -54,7 +69,7 @@ export async function initializeGraphDb(graphDbType: GraphDBType = 'arangodb'):
     if (service instanceof Neo4jService) {
       service.initialize(uri, username, password);
     }
-  } else if (graphDbType === 'arangodb') {
+  } else if (dbType === 'arangodb') {
     // Get ArangoDB credentials from environment
     const url = process.env.ARANGODB_URL;
     const dbName = process.env.ARANGODB_DB;
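The resolution order the new helpers implement — explicit argument first, then the `GRAPH_DB_TYPE` environment variable, then the `'arangodb'` fallback — can be sketched in a few lines (a Python stand-in for the TypeScript; the function name is illustrative):

```python
import os

def resolve_graph_db_type(explicit=None):
    # explicit argument > GRAPH_DB_TYPE env var > 'arangodb' fallback
    return explicit or os.environ.get("GRAPH_DB_TYPE") or "arangodb"

os.environ.pop("GRAPH_DB_TYPE", None)
print(resolve_graph_db_type())         # "arangodb" -- nothing set, use fallback
print(resolve_graph_db_type("neo4j"))  # "neo4j" -- explicit argument wins
os.environ["GRAPH_DB_TYPE"] = "neo4j"
print(resolve_graph_db_type())         # "neo4j" -- env var used when no argument
```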
@@ -1,19 +1,3 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
 /**
  * Pinecone service for vector embeddings
  * Uses direct API calls for Pinecone local server
@@ -16,7 +16,6 @@
 //
 /**
  * Qdrant service for vector embeddings
- * Drop-in replacement for PineconeService
  */
 import { Document } from "@langchain/core/documents";
 import { randomUUID } from "crypto";
@@ -477,7 +476,7 @@ export class QdrantService {
     }
     try {
-      // Qdrant doesn't have a direct "get all" like Pinecone
+      // Use scroll API to get points
       // We'll use scroll API to get points
       const response = await this.makeRequest(`/collections/${this.collectionName}/points/scroll`, 'POST', {
         limit: limit,
@@ -28,7 +28,7 @@ import type { Triple } from '@/types/graph';
  */
 export class RemoteBackendService {
   private graphDBService: GraphDBService;
-  private pineconeService: QdrantService;
+  private qdrantService: QdrantService;
   private embeddingsService: EmbeddingsService;
   private textProcessor: TextProcessor;
   private initialized: boolean = false;
@@ -36,7 +36,7 @@ export class RemoteBackendService {
   private constructor() {
     this.graphDBService = GraphDBService.getInstance();
-    this.pineconeService = QdrantService.getInstance();
+    this.qdrantService = QdrantService.getInstance();
     this.embeddingsService = EmbeddingsService.getInstance();
     this.textProcessor = TextProcessor.getInstance();
   }
@@ -60,18 +60,19 @@ export class RemoteBackendService {
   /**
    * Initialize the remote backend with all required services
-   * @param graphDbType - Type of graph database to use
+   * @param graphDbType - Type of graph database to use (defaults to GRAPH_DB_TYPE env var)
    */
-  public async initialize(graphDbType: GraphDBType = 'arangodb'): Promise<void> {
-    console.log('Initializing remote backend...');
+  public async initialize(graphDbType?: GraphDBType): Promise<void> {
+    const dbType = graphDbType || (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+    console.log(`Initializing remote backend with ${dbType}...`);

     // Initialize Graph Database
-    await this.graphDBService.initialize(graphDbType);
-    console.log(`${graphDbType} service initialized`);
+    await this.graphDBService.initialize(dbType);
+    console.log(`${dbType} service initialized`);

-    // Initialize Pinecone
-    await this.pineconeService.initialize();
-    console.log('Pinecone service initialized');
+    // Initialize Qdrant
+    await this.qdrantService.initialize();
+    console.log('Qdrant service initialized');

     // Initialize Embeddings service
     await this.embeddingsService.initialize();
@@ -179,9 +180,9 @@ export class RemoteBackendService {
       entityMetadata.set(entity, entityData);
     }

-    // Store embeddings and metadata in Pinecone
-    await this.pineconeService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
-    console.log('Stored embeddings with metadata in Pinecone');
+    // Store embeddings and metadata in Qdrant
+    await this.qdrantService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
+    console.log('Stored embeddings with metadata in Qdrant');

     console.log('Backend created successfully from text');
   }
@@ -224,9 +225,9 @@ export class RemoteBackendService {
       });
     }

-    // Store embeddings and metadata in Pinecone
-    await this.pineconeService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
-    console.log('Stored embeddings with metadata in Pinecone');
+    // Store embeddings and metadata in Qdrant
+    await this.qdrantService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
+    console.log('Stored embeddings with metadata in Qdrant');

     console.log('Backend created successfully from triples');
   }
@@ -287,8 +288,8 @@ export class RemoteBackendService {
     // Step 1: Generate embedding for query
     const queryEmbedding = (await this.embeddingsService.encode([query]))[0];

-    // Step 2: Find nearest neighbors using Pinecone
-    const seedNodes = await this.pineconeService.findSimilarEntities(queryEmbedding, kNeighbors);
+    // Step 2: Find nearest neighbors using Qdrant
+    const seedNodes = await this.qdrantService.findSimilarEntities(queryEmbedding, kNeighbors);
     console.log(`Found ${seedNodes.length} seed nodes using KNN`);

     // Step 3: Retrieve graph data from graph database
@@ -552,9 +553,9 @@ export class RemoteBackendService {
     // Step 1: Generate embedding for query
     const queryEmbedding = (await this.embeddingsService.encode([query]))[0];

-    // Step 2: Find nearest neighbors using Pinecone with metadata
+    // Step 2: Find nearest neighbors using Qdrant with metadata
     const { entities: seedNodes, metadata: seedMetadata } =
-      await this.pineconeService.findSimilarEntitiesWithMetadata(queryEmbedding, kNeighbors);
+      await this.qdrantService.findSimilarEntitiesWithMetadata(queryEmbedding, kNeighbors);
     console.log(`Found ${seedNodes.length} seed nodes using KNN with metadata`);

     // Step 3: Retrieve graph data from graph database
@@ -376,7 +376,7 @@ ${formatInstructions}`;
         }
       ],
       temperature: 0.1,
-      max_tokens: 8192,
+      max_tokens: 4096, // Reduced to leave room for input tokens in context
       top_p: 0.95
     })
   });
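The `max_tokens` reduction reflects a budget constraint: `max_tokens` counts completion tokens only, and prompt plus completion must together fit the model's context window. A sketch with illustrative numbers (the 8192-token window and 3000-token prompt are assumptions for the example, not measured values):

```python
def request_fits(context_window, prompt_tokens, max_tokens):
    # prompt + requested completion must fit within the context window
    return prompt_tokens + max_tokens <= context_window

print(request_fits(8192, 3000, 8192))  # False -- the old cap could overflow the window
print(request_fits(8192, 3000, 4096))  # True  -- the reduced cap leaves room for input
```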
@@ -3,13 +3,10 @@
   "version": "0.1.0",
   "private": true,
   "scripts": {
-    "predev": "npm run setup-pinecone",
     "dev": "next dev",
-    "prebuild": "npm run setup-pinecone",
     "build": "next build",
     "start": "next start",
-    "lint": "next lint",
-    "setup-pinecone": "node ../scripts/setup-pinecone.js"
+    "lint": "next lint"
   },
   "dependencies": {
     "3d-force-graph": "^1.77.0",
@@ -162,6 +162,26 @@
   @apply w-5 h-5 rounded-md bg-nvidia-green/15 flex items-center justify-center transition-transform duration-200;
 }

+/* Tab content wrapper for max-width */
+.nvidia-build-tab-content {
+  @apply w-full max-w-7xl mx-auto;
+}
+
+/* Responsive tab layout */
+@media (max-width: 768px) {
+  .nvidia-build-tabs {
+    @apply flex-col w-full p-1.5 gap-1;
+  }
+
+  .nvidia-build-tab {
+    @apply w-full justify-start px-4 py-2.5;
+  }
+
+  .nvidia-build-tab-icon {
+    @apply w-5 h-5;
+  }
+}
+
 /* Dark Mode Optimizations */
 @media (prefers-color-scheme: dark) {
   .nvidia-build-card {
@@ -90,19 +90,19 @@ def parse_args():
     return parser.parse_args()

-def load_triples_from_arangodb(arango_url, arango_db, arango_user, arango_password):
+def load_triples_from_arangodb(arango_url: str, arango_db: str, arango_user: str, arango_password: str) -> list[str]:
     """
     Load triples from ArangoDB for use with the TXT2KG dataset

     Args:
         arango_url: ArangoDB connection URL
         arango_db: ArangoDB database name
         arango_user: ArangoDB username
         arango_password: ArangoDB password

     Returns:
-        Array of triples in the format expected by create_remote_backend_from_triplets
+        List of triples in the format "subject predicate object"
     """
     try:
         # Connect to ArangoDB
         client = ArangoClient(hosts=arango_url)
@@ -113,28 +113,21 @@ def load_triples_from_arangodb(arango_url, arango_db, arango_user, arango_passwo
         else:
             db = client.db(arango_db)

-        # Query to get all triples from ArangoDB as structured objects
-        # Handle case sensitivity and trim whitespace
+        # Query to get all triples from ArangoDB
+        # Handle case sensitivity, trim whitespace, and deduplication
         aql_query = """
         FOR e IN relationships
             LET subject = TRIM(DOCUMENT(e._from).name)
             LET object = TRIM(DOCUMENT(e._to).name)
             LET predicate = TRIM(e.type)
             FILTER subject != "" AND predicate != "" AND object != ""
-            RETURN {
-                subject: subject,
-                predicate: predicate,
-                object: object
-            }
+            COLLECT s = subject, p = predicate, o = object
+            RETURN CONCAT_SEPARATOR(" ", s, p, o)
         """

-        # Execute the query
-        cursor = db.aql.execute(aql_query)
-        triple_dicts = list(cursor)
-
-        # Format triples as strings in the format expected by PyTorch Geometric
-        # The expected format is a list of strings in the form "subject predicate object"
-        triples = format_triples_for_pytorch_geometric(triple_dicts)
+        # Execute the query with streaming for large datasets
+        cursor = db.aql.execute(aql_query, stream=True, batch_size=1000)
+        triples = list(cursor)

         print(f"Loaded {len(triples)} triples from ArangoDB")

         # Print sample triples for debugging
@@ -148,34 +141,6 @@ def load_triples_from_arangodb(arango_url, arango_db, arango_user, arango_passwo
         print(f"Error loading triples from ArangoDB: {error}")
         raise error

-def format_triples_for_pytorch_geometric(triple_dicts):
-    """
-    Format triples from ArangoDB into the format expected by PyTorch Geometric
-
-    Args:
-        triple_dicts: List of dictionaries with subject, predicate, object keys
-
-    Returns:
-        List of strings in the format "subject predicate object"
-    """
-    triples = []
-
-    # Create a set to avoid duplicates
-    unique_triples = set()
-
-    for triple_dict in triple_dicts:
-        # Skip any triple with empty values
-        if not triple_dict['subject'] or not triple_dict['predicate'] or not triple_dict['object']:
-            continue
-
-        # Create a space-separated string in the format that preprocess_triplet expects
-        triple_str = f"{triple_dict['subject']} {triple_dict['predicate']} {triple_dict['object']}"
-
-        # Only add if not already in the set
-        if triple_str not in unique_triples:
-            unique_triples.add(triple_str)
-            triples.append(triple_str)
-
-    return triples
-
 def get_data(args):
     # need a JSON dict of Questions and answers, see below for how its used
@@ -190,48 +155,6 @@ def get_data(args):
     return json_obj, text_contexts

-def validate_triple_format(triples):
-    """
-    Validate and fix triple format if needed to ensure compatibility with preprocess_triplet
-
-    Args:
-        triples: List of triples to validate
-
-    Returns:
-        Fixed list of triples in the format expected by preprocess_triplet
-    """
-    validated_triples = []
-
-    print(f"Validating {len(triples)} triples...")
-
-    for i, triple in enumerate(triples):
-        # If triple is already a proper string with subject, predicate, object
-        if isinstance(triple, str):
-            parts = triple.split()
-            # Ensure there are at least 3 parts (subject, predicate, object)
-            if len(parts) >= 3:
-                # For strings with more than 3 parts, use first as subject, second as predicate,
-                # and join the rest as object
-                subject = parts[0]
-                predicate = parts[1]
-                obj = ' '.join(parts[2:])
-                validated_triple = f"{subject} {predicate} {obj}"
-                validated_triples.append(validated_triple)
-            else:
-                print(f"Warning: Triple at index {i} has fewer than 3 parts: {triple}")
-        # If triple is a dictionary with subject, predicate, object keys
-        elif isinstance(triple, dict) and 'subject' in triple and 'predicate' in triple and 'object' in triple:
-            validated_triple = f"{triple['subject']} {triple['predicate']} {triple['object']}"
-            validated_triples.append(validated_triple)
-        # If triple is a tuple or list of length 3
-        elif (isinstance(triple, tuple) or isinstance(triple, list)) and len(triple) == 3:
-            validated_triple = f"{triple[0]} {triple[1]} {triple[2]}"
-            validated_triples.append(validated_triple)
-        else:
-            print(f"Warning: Skipping triple at index {i} with invalid format: {triple}")
-
-    print(f"Validation complete. {len(validated_triples)} valid triples out of {len(triples)}")
-    return validated_triples
-
 def make_dataset(args):
     """Modified make_dataset function that can use ArangoDB as a data source"""
     # Create output directory if it doesn't exist
@@ -262,8 +185,6 @@ def make_dataset(args):
             args.arango_user,
             args.arango_password
         )
-        # Validate and fix triples format if needed
-        triples = validate_triple_format(triples)

         # Save triples for future use
         torch.save(triples, triples_path)
     else:
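The deleted `format_triples_for_pytorch_geometric` helper is not lost logic: the new AQL query performs the same trimming, empty-value filtering, and deduplication server-side via `TRIM`, `FILTER`, and `COLLECT`, returning ready-made "subject predicate object" strings. A client-side Python equivalent of what the query now does, kept here purely for illustration:

```python
def format_triples(triple_dicts):
    """Mirror of the new AQL (TRIM + FILTER + COLLECT + CONCAT_SEPARATOR):
    trim fields, drop empties, deduplicate, emit space-separated strings."""
    seen = set()
    out = []
    for t in triple_dicts:
        s, p, o = t["subject"].strip(), t["predicate"].strip(), t["object"].strip()
        if not (s and p and o):
            continue  # same effect as the AQL FILTER clause
        triple = f"{s} {p} {o}"
        if triple not in seen:  # same effect as COLLECT deduplication
            seen.add(triple)
            out.append(triple)
    return out

rows = [
    {"subject": " Alice ", "predicate": "knows", "object": "Bob"},
    {"subject": "Alice", "predicate": "knows", "object": "Bob"},  # duplicate after TRIM
    {"subject": "", "predicate": "knows", "object": "Bob"},       # dropped by FILTER
]
print(format_triples(rows))  # ['Alice knows Bob']
```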
@@ -1,19 +1,3 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
 /**
  * Simplified Pinecone setup script for Docker environments
  */
@@ -20,7 +20,8 @@
 # Parse command line arguments
 DEV_FRONTEND=false
-USE_COMPLETE=false
+USE_VLLM=false
+USE_VECTOR_SEARCH=false

 while [[ $# -gt 0 ]]; do
     case $1 in
@@ -28,8 +29,12 @@ while [[ $# -gt 0 ]]; do
             DEV_FRONTEND=true
             shift
             ;;
-        --complete)
-            USE_COMPLETE=true
+        --vllm)
+            USE_VLLM=true
+            shift
+            ;;
+        --vector-search)
+            USE_VECTOR_SEARCH=true
             shift
             ;;
         --help|-h)
@@ -37,14 +42,17 @@ while [[ $# -gt 0 ]]; do
             echo ""
             echo "Options:"
             echo "  --dev-frontend    Run frontend in development mode (without Docker)"
-            echo "  --complete        Use complete stack (vLLM, Pinecone, Sentence Transformers)"
+            echo "  --vllm            Use Neo4j + vLLM (GPU-accelerated, for DGX Spark/GB300)"
+            echo "  --vector-search   Enable vector search services (Qdrant + Sentence Transformers)"
             echo "  --help, -h        Show this help message"
             echo ""
-            echo "Default: Starts minimal stack with Ollama, ArangoDB, and Next.js frontend"
+            echo "Default: Starts ArangoDB + Ollama"
             echo ""
             echo "Examples:"
-            echo "  ./start.sh              # Start minimal demo (recommended)"
-            echo "  ./start.sh --complete   # Start with all optional services"
+            echo "  ./start.sh                          # Default: ArangoDB + Ollama"
+            echo "  ./start.sh --vllm                   # Use Neo4j + vLLM (GPU)"
+            echo "  ./start.sh --vector-search          # Add Qdrant + Sentence Transformers"
+            echo "  ./start.sh --vllm --vector-search   # vLLM + vector search"
             exit 0
             ;;
         *)
@@ -120,21 +128,32 @@ if ! docker info &> /dev/null; then
 fi
 echo "✓ Docker permissions OK"

-# Build the docker-compose command
-if [ "$USE_COMPLETE" = true ]; then
-    CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.complete.yml"
-    echo "Using complete stack (Ollama, vLLM, Pinecone, Sentence Transformers)..."
+# Select compose file and build command
+COMPOSE_DIR="$(pwd)/deploy/compose"
+PROFILES=""
+
+if [ "$USE_VLLM" = true ]; then
+    COMPOSE_FILE="$COMPOSE_DIR/docker-compose.vllm.yml"
+    echo "Using Neo4j + vLLM (GPU-accelerated)..."
+    echo "  ⚡ Optimized for DGX Spark/GB300 with unified memory support"
 else
-    CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.yml"
-    echo "Using minimal configuration (Ollama + ArangoDB only)..."
+    COMPOSE_FILE="$COMPOSE_DIR/docker-compose.yml"
+    echo "Using ArangoDB + Ollama configuration..."
+fi
+
+CMD="$DOCKER_COMPOSE_CMD -f $COMPOSE_FILE"
+
+if [ "$USE_VECTOR_SEARCH" = true ]; then
+    PROFILES="--profile vector-search"
+    echo "Enabling vector search (Qdrant + Sentence Transformers)..."
 fi

 # Execute the command
 echo ""
 echo "Starting services..."
-echo "Running: $CMD up -d"
+echo "Running: $CMD $PROFILES up -d"
 cd $(dirname "$0")
-eval "$CMD up -d"
+eval "$CMD $PROFILES up -d"

 echo ""
 echo "=========================================="
@@ -143,28 +162,44 @@ echo "=========================================="
 echo ""
 echo "Core Services:"
 echo "  • Web UI: http://localhost:3001"
-echo "  • ArangoDB: http://localhost:8529"
-echo "  • Ollama API: http://localhost:11434"
+if [ "$USE_VLLM" = true ]; then
+    echo "  • Neo4j Browser: http://localhost:7474"
+    echo "  • vLLM API: http://localhost:8001 (GPU-accelerated)"
+else
+    echo "  • ArangoDB: http://localhost:8529"
+    echo "  • Ollama API: http://localhost:11434"
+fi
 echo ""
-if [ "$USE_COMPLETE" = true ]; then
-    echo "Additional Services (Complete Stack):"
-    echo "  • Local Pinecone: http://localhost:5081"
+if [ "$USE_VECTOR_SEARCH" = true ]; then
+    echo "Vector Search Services:"
+    echo "  • Qdrant: http://localhost:6333"
     echo "  • Sentence Transformers: http://localhost:8000"
-    echo "  • vLLM API: http://localhost:8001"
     echo ""
 fi
 echo "Next steps:"
-echo "  1. Pull an Ollama model (if not already done):"
-echo "     docker exec ollama-compose ollama pull llama3.1:8b"
-echo ""
-echo "  2. Open http://localhost:3001 in your browser"
+if [ "$USE_VLLM" = true ]; then
+    echo "  1. Wait for vLLM to load the model (check logs with: docker logs vllm-service -f)"
+    echo "     Note: First startup may take several minutes to download the model"
+    echo ""
+    echo "  2. Open http://localhost:3001 in your browser"
+else
+    echo "  1. Pull an Ollama model (if not already done):"
+    echo "     docker exec ollama-compose ollama pull llama3.1:8b"
+    echo ""
+    echo "  2. Open http://localhost:3001 in your browser"
+fi
 echo "  3. Upload documents and start building your knowledge graph!"
 echo ""
 echo "Other options:"
 echo "  • Stop services: ./stop.sh"
 echo "  • Run frontend in dev mode: ./start.sh --dev-frontend"
-echo "  • Use complete stack: ./start.sh --complete"
+if [ "$USE_VLLM" = true ]; then
+    echo "  • Use Ollama: ./start.sh (without --vllm)"
+else
+    echo "  • Use vLLM (GPU): ./start.sh --vllm"
+fi
+echo "  • Add vector search: ./start.sh --vector-search"
 echo "  • View logs: docker compose logs -f"
 echo ""
@@ -18,27 +18,40 @@
 # Stop script for txt2kg project

+# Check which Docker Compose version is available
+DOCKER_COMPOSE_CMD=""
+if docker compose version &> /dev/null; then
+    DOCKER_COMPOSE_CMD="docker compose"
+elif command -v docker-compose &> /dev/null; then
+    DOCKER_COMPOSE_CMD="docker-compose"
+else
+    echo "Error: Neither 'docker compose' nor 'docker-compose' is available"
+    exit 1
+fi
+
 # Parse command line arguments
-USE_COMPLETE=false
+USE_VLLM=false
+USE_VECTOR_SEARCH=false

 while [[ $# -gt 0 ]]; do
     case $1 in
-        --complete)
-            USE_COMPLETE=true
+        --vllm)
+            USE_VLLM=true
+            shift
+            ;;
+        --vector-search)
+            USE_VECTOR_SEARCH=true
             shift
             ;;
         --help|-h)
             echo "Usage: ./stop.sh [OPTIONS]"
             echo ""
             echo "Options:"
-            echo "  --complete   Stop complete stack (vLLM, Pinecone, Sentence Transformers)"
+            echo "  --vllm            Stop vLLM stack (use if you started with --vllm)"
+            echo "  --vector-search   Include vector search services"
             echo "  --help, -h   Show this help message"
             echo ""
-            echo "Default: Stops minimal stack with Ollama, ArangoDB, and Next.js frontend"
-            echo ""
-            echo "Examples:"
-            echo "  ./stop.sh              # Stop minimal demo"
-            echo "  ./stop.sh --complete   # Stop complete stack"
+            echo "Note: Use the same flags you used with ./start.sh"
             exit 0
             ;;
         *)
@@ -49,52 +62,26 @@ while [[ $# -gt 0 ]]; do
     esac
 done

-# Check which Docker Compose version is available
-DOCKER_COMPOSE_CMD=""
-if docker compose version &> /dev/null; then
-    DOCKER_COMPOSE_CMD="docker compose"
-elif command -v docker-compose &> /dev/null; then
-    DOCKER_COMPOSE_CMD="docker-compose"
-else
-    echo "Error: Neither 'docker compose' nor 'docker-compose' is available"
-    echo "Please install Docker Compose: https://docs.docker.com/compose/install/"
-    exit 1
-fi
-
-# Check Docker daemon permissions
-if ! docker info &> /dev/null; then
-    echo ""
-    echo "=========================================="
-    echo "ERROR: Docker Permission Denied"
-    echo "=========================================="
-    echo ""
-    echo "You don't have permission to connect to the Docker daemon."
-    echo ""
-    echo "To fix this, add your user to the docker group:"
-    echo "  sudo usermod -aG docker \$USER"
-    echo "  newgrp docker"
-    echo ""
-    exit 1
-fi
-
-# Build the docker-compose command
-if [ "$USE_COMPLETE" = true ]; then
-    CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.complete.yml"
-    echo "Stopping complete stack..."
-else
-    CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.yml"
-    echo "Stopping minimal configuration..."
-fi
-
-# Execute the command
-echo "Running: $CMD down"
+# Select compose file
+COMPOSE_DIR="$(pwd)/deploy/compose"
+PROFILES=""
+
+if [ "$USE_VLLM" = true ]; then
+    COMPOSE_FILE="$COMPOSE_DIR/docker-compose.vllm.yml"
+else
+    COMPOSE_FILE="$COMPOSE_DIR/docker-compose.yml"
+fi
+
+CMD="$DOCKER_COMPOSE_CMD -f $COMPOSE_FILE"
+
+if [ "$USE_VECTOR_SEARCH" = true ]; then
+    PROFILES="--profile vector-search"
+fi
+
+echo "Stopping txt2kg services..."
 cd $(dirname "$0")
-eval "$CMD down"
+eval "$CMD $PROFILES down"

 echo ""
-echo "=========================================="
-echo "txt2kg has been stopped"
-echo "=========================================="
-echo ""
+echo "All services stopped."
 echo "To start again, run: ./start.sh"
-echo ""
@@ -68,7 +68,8 @@ The following models are supported with vLLM on Spark. All listed models are ava
 | **Phi-4-multimodal-instruct** | NVFP4 | ✅ | `nvidia/Phi-4-multimodal-instruct-FP4` |
 | **Phi-4-reasoning-plus** | FP8 | ✅ | `nvidia/Phi-4-reasoning-plus-FP8` |
 | **Phi-4-reasoning-plus** | NVFP4 | ✅ | `nvidia/Phi-4-reasoning-plus-FP4` |
+| **Nemotron3-Nano** | BF16 | ✅ | `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16` |
+| **Nemotron3-Nano** | FP8 | ✅ | `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8` |

 > [!NOTE]
 > The Phi-4-multimodal-instruct models require `--trust-remote-code` when launching vLLM.
@@ -118,6 +119,12 @@ export LATEST_VLLM_VERSION=<latest_container_version>
 docker pull nvcr.io/nvidia/vllm:${LATEST_VLLM_VERSION}
 ```

+For Nemotron3-Nano model support, please use release version 25.12.post1-py3
+
+```bash
+docker pull nvcr.io/nvidia/vllm:25.12.post1-py3
+```
+
 ## Step 3. Test vLLM in container

 Launch the container and start vLLM server with a test model to verify basic functionality.
Launch the container and start vLLM server with a test model to verify basic functionality. Launch the container and start vLLM server with a test model to verify basic functionality.