Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git
Synced 2026-04-22 18:13:52 +00:00

chore: Regenerate all playbooks

This commit is contained in:
parent 7e04f555c4
commit d0dbd18840
@@ -43,7 +43,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
 - [Portfolio Optimization](nvidia/portfolio-optimization/)
 - [Fine-tune with Pytorch](nvidia/pytorch-fine-tune/)
 - [RAG Application in AI Workbench](nvidia/rag-ai-workbench/)
-- [SGLang Inference Server](nvidia/sglang/)
+- [SGLang for Inference](nvidia/sglang/)
 - [Single-cell RNA Sequencing](nvidia/single-cell/)
 - [Speculative Decoding](nvidia/speculative-decoding/)
 - [Set up Tailscale on Your Spark](nvidia/tailscale/)
@@ -67,8 +67,8 @@ model adaptation for specialized domains while leveraging hardware-specific opti
 * **Duration:** 30-60 minutes for initial setup, 1-7 hours for training depending on model size and dataset.
 * **Risks:** Model downloads require significant bandwidth and storage. Training may consume substantial GPU memory and require parameter tuning for hardware constraints.
 * **Rollback:** Remove Docker containers and cloned repositories. Training checkpoints are saved locally and can be deleted to reclaim storage space.
-* **Last Updated:** 12/15/2025
-* Upgrade to latest pytorch container version nvcr.io/nvidia/pytorch:25.11-py3
+* **Last Updated:** 01/08/2025
+* Update to Qwen3 LoRA fine-tuning workflow based on LLaMA Factory updates
 
 ## Instructions
 
@@ -105,10 +105,15 @@ cd LLaMA-Factory
 
 ### Step 4. Install LLaMA Factory with dependencies
 
-Install the package in editable mode with metrics support for training evaluation.
+Remove the torchaudio dependency (not needed for LLM fine-tuning) to avoid conflicts with the container's optimized PyTorch, then install.
 
 ```bash
+## Remove torchaudio dependency that conflicts with NVIDIA's PyTorch build
+sed -i 's/"torchaudio[^"]*",\?//' pyproject.toml
+
+## Install LLaMA Factory with metrics support
 pip install -e ".[metrics]"
+pip install --no-deps torchaudio
 ```
 
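As an editorial aside, the `sed` expression added in this hunk can be previewed on a throwaway file before touching the real `pyproject.toml`; the dependency list below is a made-up sample, not LLaMA Factory's actual file:

```shell
# Dry-run of the torchaudio removal on a sample file (contents are illustrative)
cat > /tmp/pyproject_sample.toml <<'EOF'
dependencies = ["transformers>=4.41.2", "torchaudio>=2.0.0", "datasets>=2.16.0"]
EOF
sed -i 's/"torchaudio[^"]*",\?//' /tmp/pyproject_sample.toml
cat /tmp/pyproject_sample.toml
```

The torchaudio entry disappears while the other pins are left intact, which is the behavior the playbook relies on.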
 ## Step 5. Verify Pytorch CUDA support.
@@ -126,7 +131,7 @@ python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda
 Examine the provided LoRA fine-tuning configuration for Llama-3.
 
 ```bash
-cat examples/train_lora/llama3_lora_sft.yaml
+cat examples/train_lora/qwen3_lora_sft.yaml
 ```
 
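When reviewing the training YAML from this step, the hyperparameters worth checking first can be pulled out with `grep`; the file below is a stand-in with hypothetical values, not the real `qwen3_lora_sft.yaml`:

```shell
# Extract the key LoRA hyperparameters from a config (sample file, made-up values)
cat > /tmp/sample_lora_sft.yaml <<'EOF'
finetuning_type: lora
lora_rank: 8
lora_target: all
learning_rate: 1.0e-4
num_train_epochs: 3.0
EOF
grep -E 'finetuning_type|lora_|learning_rate|num_train_epochs' /tmp/sample_lora_sft.yaml
```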
 ## Step 7. Launch fine-tuning training
@@ -137,20 +142,20 @@ cat examples/train_lora/llama3_lora_sft.yaml
 Execute the training process using the pre-configured LoRA setup.
 
 ```bash
-huggingface-cli login # if the model is gated
-llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
+hf auth login # if the model is gated
+llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml
 ```
 
 Example output:
-```bash
+```
 ***** train metrics *****
 epoch = 3.0
-total_flos = 22851591GF
-train_loss = 0.9113
-train_runtime = 0:22:21.99
-train_samples_per_second = 2.437
-train_steps_per_second = 0.306
-Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
+total_flos = 11076559GF
+train_loss = 0.9993
+train_runtime = 0:14:32.12
+train_samples_per_second = 3.749
+train_steps_per_second = 0.471
+Figure saved at: saves/qwen3-4b/lora/sft/training_loss.png
 ```
 
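As a sanity check on the reported metrics, the `train_runtime` string can be converted to seconds and multiplied by the throughput; the figures are taken from the example output in this hunk:

```shell
# Convert train_runtime (H:MM:SS.ss) to seconds and estimate total samples processed
python3 - <<'PY'
runtime = "0:14:32.12"       # train_runtime from the example output
samples_per_second = 3.749   # train_samples_per_second from the example output
h, m, s = runtime.split(":")
seconds = int(h) * 3600 + int(m) * 60 + float(s)
print(f"{seconds:.2f} s, ~{seconds * samples_per_second:.0f} samples")
PY
```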
 ## Step 8. Validate training completion
@@ -158,13 +163,12 @@ Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
 Verify that training completed successfully and checkpoints were saved.
 
 ```bash
-ls -la saves/llama3-8b/lora/sft/
+ls -la saves/qwen3-4b/lora/sft/
 ```
 
-
 Expected output should show:
-- Final checkpoint directory (`checkpoint-21` or similar)
-- Model configuration files (`config.json`, `adapter_config.json`)
+- Final checkpoint directory (`checkpoint-411` or similar)
+- Model configuration files (`adapter_config.json`)
 - Training metrics showing decreasing loss values
 - Training loss plot saved as PNG file
 
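The same validation can be scripted; the save path follows the step above, and the script only reports what it finds, so it is safe to run before training has finished:

```shell
# Report whether a LoRA checkpoint directory exists yet (path from the step above)
SAVE_DIR=saves/qwen3-4b/lora/sft
if ls -d "$SAVE_DIR"/checkpoint-* >/dev/null 2>&1; then
  echo "checkpoint found in $SAVE_DIR"
else
  echo "no checkpoint in $SAVE_DIR (training may not have finished)"
fi
```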
@@ -173,14 +177,14 @@ Expected output should show:
 Test your fine-tuned model with custom prompts:
 
 ```bash
-llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
+llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml
 ## Type: "Hello, how can you help me today?"
 ## Expect: Response showing fine-tuned behavior
 ```
 
 ## Step 10. For production deployment, export your model
 ```bash
-llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
+llamafactory-cli export examples/merge_lora/qwen3_lora_sft.yaml
 ```
 
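After the export step, the merged model directory can be spot-checked; the output path is whatever `export_dir` is set to in the merge YAML, so the value below is a placeholder, not a path from the playbook:

```shell
# Spot-check an exported (merged) model directory; EXPORT_DIR is a placeholder,
# substitute the export_dir value from your merge_lora YAML
EXPORT_DIR=${EXPORT_DIR:-saves/qwen3-4b/lora/merged}
if [ -d "$EXPORT_DIR" ]; then
  ls "$EXPORT_DIR"
else
  echo "export directory $EXPORT_DIR not found; run the export step first"
fi
```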
 ## Step 11. Cleanup and rollback
@@ -1,4 +1,4 @@
-# SGLang Inference Server
+# SGLang for Inference
 
 > Install and use SGLang on DGX Spark
 
@@ -68,6 +68,8 @@ The following models are supported with SGLang on Spark. All listed models are a
 | **Phi-4-reasoning-plus** | FP8 | ✅ | `nvidia/Phi-4-reasoning-plus-FP8` |
 | **Phi-4-reasoning-plus** | NVFP4 | ✅ | `nvidia/Phi-4-reasoning-plus-FP4` |
 
+Note: for NVFP4 models, add the `--quantization modelopt_fp4` flag.
+
 ### Time & risk
 
 * **Estimated time:** 30 minutes for initial setup and validation
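Putting the NVFP4 note together with a model from the table, a launch line might look like the following; the exact server invocation depends on the installed SGLang version, so treat this as a sketch and verify the flags locally:

```shell
# Assemble an SGLang launch command for an NVFP4 model (illustrative; check
# the flags against your installed SGLang version before running)
MODEL="nvidia/Phi-4-reasoning-plus-FP4"
CMD="python3 -m sglang.launch_server --model-path $MODEL --quantization modelopt_fp4"
echo "$CMD"
```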
@@ -54,9 +54,13 @@ The setup includes:
 - Document processing time scales with document size and complexity
 
 - **Rollback**: Stop and remove Docker containers, delete downloaded models if needed
-- **Last Updated**: 12/02/2025
-- Knowledge graph search with multi-hop graph traversal
-- Improved UI/UX
+- **Last Updated**: 01/08/2025
+- Migrated from Pinecone to Qdrant for ARM64 compatibility
+- Added vLLM support with Neo4j
+- Added Palette UI components with accessibility improvements
+- Added CPU-only mode for development (`./start.sh --cpu`)
+- Optimized ArangoDB with deterministic keys and BM25 search
+- Added GNN preprocessing scripts for knowledge graph training
 
 ## Instructions
 
@@ -19,7 +19,7 @@ This playbook serves as a reference solution for knowledge graph extraction and
 
 </details>
 
-By default, this playbook leverages **Ollama** for local LLM inference, providing a fully self-contained solution that runs entirely on your own hardware. You can optionally use NVIDIA-hosted models available in the [NVIDIA API Catalog](https://build.nvidia.com) for advanced capabilities.
+By default, this playbook leverages **Ollama** for local LLM inference, providing a fully self-contained solution that runs entirely on your own hardware. You can optionally use **vLLM** for GPU-accelerated inference on DGX Spark/GB300, or NVIDIA-hosted models available in the [NVIDIA API Catalog](https://build.nvidia.com) for advanced capabilities.
 
 ## Key Features
 
@@ -33,7 +33,7 @@ By default, this playbook leverages **Ollama** for local LLM inference, providin
 - GPU-accelerated LLM inference with Ollama
 - Fully containerized deployment with Docker Compose
 - Optional NVIDIA API integration for cloud-based models
-- Optional vector search and advanced inference capabilities
+- Optional vector search with Qdrant for semantic similarity
 - Optional graph-based RAG for contextual answers
 
 ## Software Components
@@ -55,9 +55,13 @@ By default, this playbook leverages **Ollama** for local LLM inference, providin
 
 ### Optional Components
 
-* **Vector Database & Embedding** (with `--complete` flag)
+* **vLLM Stack** (with `--vllm` flag)
+  * **vLLM**: GPU-accelerated LLM inference optimized for DGX Spark/GB300
+    * Default model: `nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8`
+  * **Neo4j**: Alternative graph database
+* **Vector Database & Embedding** (with `--vector-search` flag)
   * **SentenceTransformer**: Local embedding generation (model: `all-MiniLM-L6-v2`)
-  * **Pinecone**: Self-hosted vector storage and similarity search
+  * **Qdrant**: Self-hosted vector storage and similarity search
 * **Cloud Models** (configure separately)
   * **NVIDIA API**: Cloud-based models via NVIDIA API Catalog
 
@@ -76,7 +80,7 @@ The core workflow for knowledge graph building and visualization:
 ### Future Enhancements
 
 Additional capabilities can be added:
-- **Vector search**: Add semantic similarity search with local Pinecone and SentenceTransformer embeddings
+- **Vector search**: Add semantic similarity search with Qdrant and SentenceTransformer embeddings
 - **S3 storage**: MinIO for scalable document storage
 - **GNN-based GraphRAG**: Graph Neural Networks for enhanced retrieval
 
@@ -84,7 +88,7 @@ Additional capabilities can be added:
 
 This playbook includes **GPU-accelerated LLM inference** with Ollama:
 
-### Ollama Features
+### Ollama Features (Default)
 - **Fully local inference**: No cloud dependencies or API keys required
 - **GPU acceleration**: Automatic CUDA support with NVIDIA GPUs
 - **Multiple model support**: Use any Ollama-compatible model
@@ -92,7 +96,13 @@ This playbook includes **GPU-accelerated LLM inference** with Ollama:
 - **Easy model management**: Pull and switch models with simple commands
 - **Privacy-first**: All data processing happens on your hardware
 
-### Default Configuration
+### vLLM Alternative (via `--vllm` flag)
+- **High-performance inference**: Optimized for DGX Spark/GB300 unified memory
+- **FP8 quantization**: Efficient memory usage with minimal quality loss
+- **Large context support**: Up to 32K tokens context length
+- **Continuous batching**: High throughput for multiple requests
+
+### Default Ollama Configuration
 - Model: `llama3.1:8b`
 - GPU memory fraction: 0.9 (90% of available VRAM)
 - Flash attention enabled
@@ -152,8 +162,39 @@ docker exec ollama-compose ollama pull llama3.1:8b
 - **ArangoDB**: http://localhost:8529 (no authentication required)
 - **Ollama API**: http://localhost:11434
 
+### Alternative: Using vLLM (for DGX Spark/GB300)
+
+For GPU-accelerated inference with vLLM:
+
+```bash
+./start.sh --vllm
+```
+
+Then wait for vLLM to load the model:
+```bash
+docker logs vllm-service -f
+```
+
+Services:
+- **Web UI**: http://localhost:3001
+- **Neo4j Browser**: http://localhost:7474 (user: `neo4j`, password: `password123`)
+- **vLLM API**: http://localhost:8001
+
+### Adding Vector Search
+
+Enable semantic similarity search:
+```bash
+./start.sh --vector-search
+```
+
+This adds:
+- **Qdrant**: http://localhost:6333
+- **Sentence Transformers**: http://localhost:8000
+
 ## Available Customizations
 
+- **Switch LLM backend**: Use `--vllm` flag for vLLM or default for Ollama
+- **Add vector search**: Use `--vector-search` flag for Qdrant + embeddings
 - **Switch Ollama models**: Use any model from Ollama's library (Llama, Mistral, Qwen, etc.)
 - **Modify extraction prompts**: Customize how triples are extracted from text
 - **Add domain-specific knowledge sources**: Integrate external ontologies or taxonomies
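A quick way to see which of the endpoints added in this hunk are actually up is to probe them; this loop is a convenience sketch (it assumes `curl` is installed) and just prints a status per URL, so it is harmless when the stack is down:

```shell
# Probe the service endpoints listed above; safe to run even when the stack is down
for url in http://localhost:3001 http://localhost:7474 http://localhost:8001/v1/models; do
  if curl -s --max-time 2 -o /dev/null "$url" 2>/dev/null; then
    echo "up:   $url"
  else
    echo "down: $url"
  fi
done
```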
@@ -163,4 +204,4 @@ docker exec ollama-compose ollama pull llama3.1:8b
 
 [MIT](LICENSE)
 
-This project will download and install additional third-party open source software projects and containers.
+This project will download and install additional third-party open source software projects and containers.
@@ -4,32 +4,36 @@ This directory contains all deployment-related configuration for the txt2kg proj
 
 ## Structure
 
-- **compose/**: Docker Compose files for local development and testing
-  - `docker-compose.yml`: Minimal Docker Compose configuration (Ollama + ArangoDB + Next.js)
-  - `docker-compose.complete.yml`: Complete stack with optional services (vLLM, Pinecone, Sentence Transformers)
-  - `docker-compose.optional.yml`: Additional optional services
-  - `docker-compose.vllm.yml`: Legacy vLLM configuration (use `--complete` flag instead)
+- **compose/**: Docker Compose configuration
+  - `docker-compose.yml`: ArangoDB + Ollama (default)
+  - `docker-compose.vllm.yml`: Neo4j + vLLM (GPU-accelerated)
 
 - **app/**: Frontend application Docker configuration
   - Dockerfile for Next.js application
 
 - **services/**: Containerized services
-  - **ollama/**: Ollama LLM inference service with GPU support
-  - **sentence-transformers/**: Sentence transformer service for embeddings (optional)
-  - **vllm/**: vLLM inference service with FP8 quantization (optional)
-  - **gpu-viz/**: GPU-accelerated graph visualization services (optional, run separately)
-  - **gnn_model/**: Graph Neural Network model service (experimental, not in default compose files)
+  - **ollama/**: Ollama LLM inference service (default)
+  - **vllm/**: vLLM inference service with GPU support (via `--vllm` flag)
+  - **sentence-transformers/**: Sentence transformer service for embeddings (via `--vector-search` flag)
+  - **gpu-viz/**: GPU-accelerated graph visualization services (run separately)
+  - **gnn_model/**: Graph Neural Network model service (experimental)
 
 ## Usage
 
 **Recommended: Use the start script**
 
 ```bash
-# Minimal setup (Ollama + ArangoDB + Next.js frontend)
+# Default: ArangoDB + Ollama
 ./start.sh
 
-# Complete stack (includes vLLM, Pinecone, Sentence Transformers)
-./start.sh --complete
+# Use Neo4j + vLLM (GPU-accelerated, for DGX Spark/GB300)
+./start.sh --vllm
+
+# Enable vector search (Qdrant + Sentence Transformers)
+./start.sh --vector-search
+
+# Combine options
+./start.sh --vllm --vector-search
 
 # Development mode (run frontend without Docker)
 ./start.sh --dev-frontend
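As a hedged sketch of how a start script might map these flags onto compose files (the real `start.sh` lives in the repository and may differ), the dispatch logic could look like:

```shell
# Hypothetical flag dispatch; the real start.sh may implement this differently
set -- --vllm --vector-search            # example invocation: ./start.sh --vllm --vector-search
COMPOSE_FILE=deploy/compose/docker-compose.yml
PROFILE_ARGS=""
for arg in "$@"; do
  case "$arg" in
    --vllm)          COMPOSE_FILE=deploy/compose/docker-compose.vllm.yml ;;
    --vector-search) PROFILE_ARGS="--profile vector-search" ;;
  esac
done
echo "docker compose -f $COMPOSE_FILE $PROFILE_ARGS up -d"
```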
@@ -37,31 +41,55 @@ This directory contains all deployment-related configuration for the txt2kg proj
 
 **Manual Docker Compose commands:**
 
-To start the minimal services:
-
 ```bash
+# Default: ArangoDB + Ollama
 docker compose -f deploy/compose/docker-compose.yml up -d
-```
 
-To start the complete stack:
+# Neo4j + vLLM
+docker compose -f deploy/compose/docker-compose.vllm.yml up -d
 
-```bash
-docker compose -f deploy/compose/docker-compose.complete.yml up -d
+# With vector search services (add --profile vector-search)
+docker compose -f deploy/compose/docker-compose.yml --profile vector-search up -d
+docker compose -f deploy/compose/docker-compose.vllm.yml --profile vector-search up -d
 ```
 
 ## Services Included
 
-### Minimal Stack (default)
+### Default Stack (ArangoDB + Ollama)
 - **Next.js App**: Web UI on port 3001
 - **ArangoDB**: Graph database on port 8529
 - **Ollama**: Local LLM inference on port 11434
 
-### Complete Stack (`--complete` flag)
-All minimal services plus:
-- **vLLM**: Advanced LLM inference on port 8001
-- **Pinecone (Local)**: Vector embeddings on port 5081
+### vLLM Stack (`--vllm` flag) - Neo4j + vLLM
+- **Next.js App**: Web UI on port 3001
+- **Neo4j**: Graph database on ports 7474 (HTTP) and 7687 (Bolt)
+- **vLLM**: GPU-accelerated LLM inference on port 8001
+
+### Vector Search (`--vector-search` profile)
+- **Qdrant**: Vector database on port 6333
+- **Sentence Transformers**: Embedding generation on port 8000
 
 ### Optional Services (run separately)
 - **GPU-Viz Services**: See `services/gpu-viz/README.md` for GPU-accelerated visualization
-- **GNN Model Service**: See `services/gnn_model/README.md` for experimental GNN-based RAG
+- **GNN Model Service**: See `services/gnn_model/README.md` for experimental GNN-based RAG
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│  Default Stack (./start.sh)          │  vLLM Stack (--vllm)     │
+├──────────────────────────────────────┼──────────────────────────┤
+│                                      │                          │
+│  ┌─────────────┐                     │  ┌─────────────┐         │
+│  │   Next.js   │  port 3001          │  │   Next.js   │  3001   │
+│  └──────┬──────┘                     │  └──────┬──────┘         │
+│         │                            │         │                │
+│  ┌──────┴──────┐  ┌─────────────┐    │  ┌──────┴──────┐  ┌─────┐│
+│  │  ArangoDB   │  │   Ollama    │    │  │    Neo4j    │  │vLLM ││
+│  │  port 8529  │  │ port 11434  │    │  │  port 7474  │  │8001 ││
+│  └─────────────┘  └─────────────┘    │  └─────────────┘  └─────┘│
+│                                      │                          │
+└──────────────────────────────────────┴──────────────────────────┘
+
+Optional (--vector-search): Qdrant (6333) + Sentence Transformers (8000)
+```
@@ -8,10 +8,6 @@ RUN npm install -g pnpm --force --yes
 
 # Copy dependency files
 COPY ./frontend/package.json ./frontend/pnpm-lock.yaml* ./
-COPY ./scripts/ /scripts/
-
-# Update the setup-pinecone.js path
-RUN sed -i 's|"setup-pinecone": "node ../scripts/setup-pinecone.js"|"setup-pinecone": "node /scripts/setup-pinecone.js"|g' package.json
 
 # Install dependencies with cache mount for faster rebuilds
 RUN --mount=type=cache,target=/root/.local/share/pnpm/store \
@@ -32,7 +28,6 @@ RUN npm install -g pnpm --force --yes
 # Copy node_modules from deps stage
 COPY --from=deps /app/node_modules ./node_modules
 COPY --from=deps /app/package.json ./package.json
-COPY --from=deps /scripts /scripts
 
 # Copy source code
 COPY ./frontend/ ./
 
@@ -1,20 +1,4 @@
 #!/bin/sh
-#
-# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
 
 # Script to initialize Pinecone index at container startup
 echo "Initializing Pinecone index..."
 
@@ -104,7 +104,7 @@ services:
       - OLLAMA_FLASH_ATTENTION=1
       - OLLAMA_KEEP_ALIVE=30m
       - OLLAMA_CUDA=1
-      - OLLAMA_LLM_LIBRARY=cuda
+      - OLLAMA_LLM_LIBRARY=cuda_v13
      - OLLAMA_NUM_PARALLEL=1
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_KV_CACHE_TYPE=q8_0
 
@@ -1,6 +1,10 @@
-# This is a legacy file - use --with-optional flag instead
-# The vLLM service is now included in docker-compose.optional.yml
-# This file is kept for backwards compatibility
+# txt2kg Docker Compose - Neo4j + vLLM (GPU-accelerated)
+#
+# Optional stack optimized for DGX Spark/GB300 with unified memory support
+#
+# Usage:
+#   ./start.sh --vllm                  # Use this compose file
+#   ./start.sh --vllm --vector-search  # Add Qdrant + Sentence Transformers
 
 services:
   app:
@@ -10,105 +14,100 @@ services:
     ports:
       - '3001:3000'
     environment:
-      - ARANGODB_URL=http://arangodb:8529
+      # Neo4j configuration
+      - NEO4J_URI=bolt://neo4j:7687
+      - NEO4J_USER=neo4j
+      - NEO4J_PASSWORD=password123
+      - GRAPH_DB_TYPE=neo4j
+      # Disable ArangoDB
+      - ARANGODB_URL=http://localhost:8529
       - ARANGODB_DB=txt2kg
-      - PINECONE_HOST=entity-embeddings
-      - PINECONE_PORT=5081
-      - PINECONE_API_KEY=pclocal
-      - PINECONE_ENVIRONMENT=local
+      # vLLM configuration (GPU-accelerated)
+      - VLLM_BASE_URL=http://vllm:8001/v1
+      - VLLM_MODEL=nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
+      # Disable Ollama
+      - OLLAMA_BASE_URL=http://localhost:11434/v1
+      - OLLAMA_MODEL=disabled
+      # Vector DB configuration
+      - QDRANT_URL=http://qdrant:6333
+      - VECTOR_DB_TYPE=qdrant
+      # Embeddings configuration
       - LANGCHAIN_TRACING_V2=true
       - SENTENCE_TRANSFORMER_URL=http://sentence-transformers:80
       - MODEL_NAME=all-MiniLM-L6-v2
       - EMBEDDINGS_API_URL=http://sentence-transformers:80
+      # Other settings
       - GRPC_SSL_CIPHER_SUITES=HIGH+ECDSA:HIGH+aRSA
       - NODE_TLS_REJECT_UNAUTHORIZED=0
-      - OLLAMA_BASE_URL=http://ollama:11434/v1
-      - OLLAMA_MODEL=qwen3:1.7b
-      - VLLM_BASE_URL=http://vllm:8001/v1
-      - VLLM_MODEL=meta-llama/Llama-3.2-3B-Instruct
       - REMOTE_WEBGPU_SERVICE_URL=http://txt2kg-remote-webgpu:8083
       - NVIDIA_API_KEY=${NVIDIA_API_KEY:-}
       - NODE_OPTIONS=--max-http-header-size=80000
       - UV_THREADPOOL_SIZE=128
       - HTTP_TIMEOUT=1800000
       - REQUEST_TIMEOUT=1800000
     networks:
-      - pinecone-net
       - default
-      - txt2kg-network
+      - qdrant-net
     depends_on:
-      - arangodb
-      - entity-embeddings
-      - sentence-transformers
-      - vllm
-  arangodb:
-    image: arangodb:latest
-    ports:
-      - '8529:8529'
-    environment:
-      - ARANGO_NO_AUTH=1
-    volumes:
-      - arangodb_data:/var/lib/arangodb3
-      - arangodb_apps_data:/var/lib/arangodb3-apps
-  arangodb-init:
-    image: arangodb:latest
-    depends_on:
-      arangodb:
+      neo4j:
         condition: service_healthy
-    restart: on-failure
-    entrypoint: >
-      sh -c "
-      echo 'Waiting for ArangoDB to start...' &&
-      sleep 10 &&
-      echo 'Creating txt2kg database...' &&
-      arangosh --server.endpoint tcp://arangodb:8529 --server.authentication false --javascript.execute-string 'try { db._createDatabase(\"txt2kg\"); console.log(\"Database txt2kg created successfully!\"); } catch(e) { if(e.message.includes(\"duplicate\")) { console.log(\"Database txt2kg already exists\"); } else { throw e; } }'
-      "
-  entity-embeddings:
-    image: ghcr.io/pinecone-io/pinecone-index:latest
-    container_name: entity-embeddings
-    environment:
-      PORT: 5081
-      INDEX_TYPE: serverless
-      VECTOR_TYPE: dense
-      DIMENSION: 384
-      METRIC: cosine
-      INDEX_NAME: entity-embeddings
+      vllm:
+        condition: service_started
 
+  # Neo4j - Graph database
+  neo4j:
+    image: neo4j:5-community
     ports:
-      - "5081:5081"
-    platform: linux/amd64
-    networks:
-      - pinecone-net
-    restart: unless-stopped
-  sentence-transformers:
-    build:
-      context: ../../deploy/services/sentence-transformers
-      dockerfile: Dockerfile
-    ports:
-      - '8000:80'
+      - '7474:7474'
+      - '7687:7687'
     environment:
-      - MODEL_NAME=all-MiniLM-L6-v2
+      - NEO4J_AUTH=neo4j/password123
+      - NEO4J_server_memory_heap_initial__size=512m
+      - NEO4J_server_memory_heap_max__size=2G
+    volumes:
+      - neo4j_data:/data
+      - neo4j_logs:/logs
     networks:
       - default
     restart: unless-stopped
+    healthcheck:
+      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider http://localhost:7474 || exit 1"]
+      interval: 15s
+      timeout: 10s
+      retries: 10
+      start_period: 60s
 
+  # vLLM - GPU-accelerated LLM with unified memory support
   vllm:
     build:
-      context: ../../deploy/services/vllm
+      context: ../services/vllm
       dockerfile: Dockerfile
     container_name: vllm-service
     ports:
       - '8001:8001'
     ipc: host
     ulimits:
       memlock: -1
       stack: 67108864
     shm_size: '16gb'
     environment:
       # Model configuration
-      - VLLM_MODEL=meta-llama/Llama-3.2-3B-Instruct
+      - VLLM_MODEL=nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
       - VLLM_TENSOR_PARALLEL_SIZE=1
-      - VLLM_MAX_MODEL_LEN=4096
+      - VLLM_MAX_MODEL_LEN=32768
       - VLLM_GPU_MEMORY_UTILIZATION=0.9
+      # NVfp4 quantization settings
+      - VLLM_QUANTIZATION=fp8
+      - VLLM_KV_CACHE_DTYPE=fp8
       # Service configuration
       - VLLM_MAX_NUM_SEQS=32
       - VLLM_MAX_NUM_BATCHED_TOKENS=32768
-      - VLLM_KV_CACHE_DTYPE=auto
       - VLLM_PORT=8001
       - VLLM_HOST=0.0.0.0
       # Performance tuning
       - CUDA_VISIBLE_DEVICES=0
-      - NCCL_DEBUG=INFO
+      - CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
+      - PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
       - VLLM_CPU_OFFLOAD_GB=0
     volumes:
       - vllm_models:/app/models
-      - /tmp:/tmp
+      # Mount model cache for faster startup
+      - ~/.cache/huggingface:/root/.cache/huggingface
     networks:
       - default
@@ -121,21 +120,75 @@ services:
           count: 1
           capabilities: [gpu]
     healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:8001/v1/models"]
-      interval: 30s
-      timeout: 10s
-      retries: 5
-      start_period: 120s  # Longer start period for model loading
+      test: ["CMD", "curl", "-f", "http://localhost:8001/health"]
+      interval: 60s
+      timeout: 30s
+      retries: 30
+      start_period: 1800s
 
+  # Optional: Vector search services
+  sentence-transformers:
+    build:
+      context: ../services/sentence-transformers
+      dockerfile: Dockerfile
+    ports:
+      - '8000:80'
+    environment:
+      - MODEL_NAME=all-MiniLM-L6-v2
+    networks:
+      - default
+    restart: unless-stopped
+    profiles:
+      - vector-search
+
+  qdrant:
+    image: qdrant/qdrant:latest
+    container_name: qdrant
+    ports:
+      - "6333:6333"
+      - "6334:6334"
+    volumes:
+      - qdrant_data:/qdrant/storage
+    networks:
+      - qdrant-net
+    restart: unless-stopped
+    profiles:
+      - vector-search
+
+  qdrant-init:
+    image: curlimages/curl:latest
+    depends_on:
+      - qdrant
+    restart: "no"
+    entrypoint: /bin/sh
+    command:
+      - -c
+      - |
+        echo 'Waiting for Qdrant to start...'
+        sleep 5
+        curl -X PUT http://qdrant:6333/collections/entity-embeddings \
+          -H 'Content-Type: application/json' \
+          -d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
+        curl -X PUT http://qdrant:6333/collections/document-embeddings \
+          -H 'Content-Type: application/json' \
+          -d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
+        echo 'Collections created'
+    networks:
+      - qdrant-net
+    profiles:
+      - vector-search
 
 volumes:
-  arangodb_data:
-  arangodb_apps_data:
+  neo4j_data:
+  neo4j_logs:
   vllm_models:
+  qdrant_data:
 
 networks:
-  pinecone-net:
-    name: pinecone
   default:
     driver: bridge
-  txt2kg-network:
-    driver: bridge
+  qdrant-net:
+    name: qdrant-network
@@ -1,3 +1,12 @@
+# txt2kg Docker Compose - ArangoDB + Ollama (Default)
+#
+# Default stack tested and working on DGX Spark
+#
+# Usage:
+#   ./start.sh                  # Default: ArangoDB + Ollama
+#   ./start.sh --vector-search  # Add Qdrant + Sentence Transformers
+#
+# For Neo4j + vLLM, use: ./start.sh --vllm
 
 services:
   app:
@@ -7,21 +16,32 @@ services:
     ports:
       - '3001:3000'
     environment:
+      # ArangoDB configuration
       - ARANGODB_URL=http://arangodb:8529
       - ARANGODB_DB=txt2kg
+      - GRAPH_DB_TYPE=arangodb
+      # Disable Neo4j
+      - NEO4J_URI=bolt://localhost:7687
+      - NEO4J_USER=neo4j
+      - NEO4J_PASSWORD=password123
+      # Ollama configuration
+      - OLLAMA_BASE_URL=http://ollama:11434/v1
+      - OLLAMA_MODEL=llama3.1:8b
+      # Disable vLLM
+      - VLLM_BASE_URL=http://localhost:8001/v1
+      - VLLM_MODEL=disabled
+      # Vector DB configuration
+      - QDRANT_URL=http://qdrant:6333
+      - VECTOR_DB_TYPE=qdrant
+      # Embeddings configuration
       - LANGCHAIN_TRACING_V2=true
       - SENTENCE_TRANSFORMER_URL=http://sentence-transformers:80
       - MODEL_NAME=all-MiniLM-L6-v2
       - EMBEDDINGS_API_URL=http://sentence-transformers:80
+      # Other settings
       - GRPC_SSL_CIPHER_SUITES=HIGH+ECDSA:HIGH+aRSA
       - NODE_TLS_REJECT_UNAUTHORIZED=0
-      - OLLAMA_BASE_URL=http://ollama:11434/v1
-      - OLLAMA_MODEL=llama3.1:8b
       - REMOTE_WEBGPU_SERVICE_URL=http://txt2kg-remote-webgpu:8083
       - NVIDIA_API_KEY=${NVIDIA_API_KEY:-}
-      # Node.js timeout configurations for large model processing
       - NODE_OPTIONS=--max-http-header-size=80000
       - UV_THREADPOOL_SIZE=128
       - HTTP_TIMEOUT=1800000
@@ -29,12 +49,14 @@ services:
     networks:
       - default
-      - txt2kg-network
-      - pinecone-net
+      - qdrant-net
     depends_on:
-      - arangodb
-      - ollama
-      # Optional: sentence-transformers and entity-embeddings are only needed for vector search
-      # Traditional graph search works without these services
+      arangodb:
+        condition: service_started
+      ollama:
+        condition: service_started
 
+  # ArangoDB - Graph database
   arangodb:
     image: arangodb:latest
     ports:
@@ -44,6 +66,11 @@ services:
     volumes:
       - arangodb_data:/var/lib/arangodb3
       - arangodb_apps_data:/var/lib/arangodb3-apps
+    networks:
+      - default
+    restart: unless-stopped
 
+  # ArangoDB initialization - create database
   arangodb-init:
     image: arangodb:latest
     depends_on:
@@ -57,6 +84,10 @@ services:
       echo 'Creating txt2kg database...' &&
       arangosh --server.endpoint tcp://arangodb:8529 --server.authentication false --javascript.execute-string 'try { db._createDatabase(\"txt2kg\"); console.log(\"Database txt2kg created successfully!\"); } catch(e) { if(e.message.includes(\"duplicate\")) { console.log(\"Database txt2kg already exists\"); } else { throw e; } }'
       "
+    networks:
+      - default
 
+  # Ollama - Local LLM inference
   ollama:
     build:
       context: ../services/ollama
@@ -68,13 +99,16 @@ services:
     volumes:
       - ollama_data:/root/.ollama
     environment:
-      - NVIDIA_VISIBLE_DEVICES=all  # Make all GPUs visible to the container
-      - NVIDIA_DRIVER_CAPABILITIES=compute,utility  # Required capabilities for CUDA
-      - OLLAMA_FLASH_ATTENTION=1  # Enable flash attention for better performance
-      - OLLAMA_KEEP_ALIVE=30m  # Keep models loaded for 30 minutes
-      - OLLAMA_NUM_PARALLEL=4  # Process 4 requests in parallel - DGX Spark has unified memory
-      - OLLAMA_MAX_LOADED_MODELS=1  # Load only one model at a time to avoid VRAM contention
-      - OLLAMA_KV_CACHE_TYPE=q8_0  # Reduce KV cache VRAM usage with minimal performance impact
+      - NVIDIA_VISIBLE_DEVICES=all
+      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
+      - CUDA_VISIBLE_DEVICES=0
+      - OLLAMA_FLASH_ATTENTION=1
+      - OLLAMA_KEEP_ALIVE=30m
+      - OLLAMA_NUM_PARALLEL=4
+      - OLLAMA_MAX_LOADED_MODELS=1
+      - OLLAMA_KV_CACHE_TYPE=q8_0
+      - OLLAMA_GPU_LAYERS=-1
+      - OLLAMA_LLM_LIBRARY=cuda_v13
     networks:
       - default
     restart: unless-stopped
@ -91,9 +125,8 @@ services:
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 60s
|
||||
|
||||
# Optional services for vector search (NOT required for traditional graph search)
|
||||
# Traditional graph search works with just: app, arangodb, and ollama
|
||||
|
||||
# Optional: Vector search services
|
||||
sentence-transformers:
|
||||
build:
|
||||
context: ../services/sentence-transformers
|
||||
@ -106,7 +139,8 @@ services:
|
||||
- default
|
||||
restart: unless-stopped
|
||||
profiles:
|
||||
- vector-search # Only start with: docker compose --profile vector-search up
|
||||
- vector-search
|
||||
|
||||
qdrant:
|
||||
image: qdrant/qdrant:latest
|
||||
container_name: qdrant
|
||||
@ -116,10 +150,11 @@ services:
|
||||
volumes:
|
||||
- qdrant_data:/qdrant/storage
|
||||
networks:
|
||||
- pinecone-net
|
||||
- qdrant-net
|
||||
restart: unless-stopped
|
||||
profiles:
|
||||
- vector-search # Only start with: docker compose --profile vector-search up
|
||||
- vector-search
|
||||
|
||||
qdrant-init:
|
||||
image: curlimages/curl:latest
|
||||
depends_on:
|
||||
@ -131,32 +166,15 @@ services:
|
||||
- |
|
||||
echo 'Waiting for Qdrant to start...'
|
||||
sleep 5
|
||||
echo 'Checking if entity-embeddings collection exists...'
|
||||
RESPONSE=$(curl -s http://qdrant:6333/collections/entity-embeddings)
|
||||
if echo "$RESPONSE" | grep -q '"status":"ok"'; then
|
||||
echo 'entity-embeddings collection already exists'
|
||||
else
|
||||
echo 'Creating collection entity-embeddings...'
|
||||
curl -X PUT http://qdrant:6333/collections/entity-embeddings \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"vectors":{"size":384,"distance":"Cosine"}}'
|
||||
echo ''
|
||||
echo 'entity-embeddings collection created successfully'
|
||||
fi
|
||||
echo 'Checking if document-embeddings collection exists...'
|
||||
RESPONSE=$(curl -s http://qdrant:6333/collections/document-embeddings)
|
||||
if echo "$RESPONSE" | grep -q '"status":"ok"'; then
|
||||
echo 'document-embeddings collection already exists'
|
||||
else
|
||||
echo 'Creating collection document-embeddings...'
|
||||
curl -X PUT http://qdrant:6333/collections/document-embeddings \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"vectors":{"size":384,"distance":"Cosine"}}'
|
||||
echo ''
|
||||
echo 'document-embeddings collection created successfully'
|
||||
fi
|
||||
curl -X PUT http://qdrant:6333/collections/entity-embeddings \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
|
||||
curl -X PUT http://qdrant:6333/collections/document-embeddings \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"vectors":{"size":384,"distance":"Cosine"}}' || true
|
||||
echo 'Collections created'
|
||||
networks:
|
||||
- pinecone-net
|
||||
- qdrant-net
|
||||
profiles:
|
||||
- vector-search
|
||||
|
||||
@ -171,5 +189,5 @@ networks:
|
||||
driver: bridge
|
||||
txt2kg-network:
|
||||
driver: bridge
|
||||
pinecone-net:
|
||||
name: pinecone
|
||||
qdrant-net:
|
||||
name: qdrant-network
|
||||
|
||||
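The qdrant-init container above boils down to two idempotent PUT requests. The same request can be sketched in Python (hypothetical `collection_request` helper; the path and JSON body mirror the curl calls in the compose file):

```python
import json

def collection_request(name: str, size: int = 384, distance: str = "Cosine"):
    """Build the path and JSON body used to create a Qdrant collection idempotently."""
    path = f"/collections/{name}"
    body = json.dumps({"vectors": {"size": size, "distance": distance}})
    return path, body

# The init container issues this PUT once per collection
path, body = collection_request("entity-embeddings")
```

Because Qdrant's collection-creation PUT fails harmlessly when the collection exists, the new compose version can drop the existence checks and just append `|| true`.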
@@ -1,5 +1,5 @@
# Use NVIDIA Triton Inference Server with vLLM - optimized for latest NVIDIA hardware
FROM nvcr.io/nvidia/tritonserver:25.08-vllm-python-py3
# Use official NVIDIA vLLM image - optimized for NVIDIA hardware
FROM nvcr.io/nvidia/vllm:25.11-py3

# Install curl for health checks
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

@@ -21,17 +21,11 @@

# Enable unified memory usage for DGX Spark
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
export PYTORCH_ALLOC_CONF=expandable_segments:True

# Enable CUDA unified memory and oversubscription
export CUDA_VISIBLE_DEVICES=0
export PYTORCH_NO_CUDA_MEMORY_CACHING=0

# Force vLLM to use CPU offloading for large models
export VLLM_CPU_OFFLOAD_GB=50
export VLLM_ALLOW_RUNTIME_LORA_UPDATES_WITH_SGD_LORA=1
export VLLM_SKIP_WARMUP=0

# Optimized environment for performance
export VLLM_LOGGING_LEVEL=INFO
export PYTHONUNBUFFERED=1
@@ -39,8 +33,12 @@ export PYTHONUNBUFFERED=1
# Enable CUDA optimizations
export VLLM_USE_MODELSCOPE=false

# Enable unified memory in vLLM
export VLLM_USE_V1=0
# Enable FP8 MoE optimizations for Nemotron and other MoE models
export VLLM_USE_FLASHINFER_MOE_FP8=1
export VLLM_USE_FLASHINFER_MOE_FP4=1

# Enable FlashInfer attention backend for better performance
export VLLM_ATTENTION_BACKEND=FLASHINFER

# First, test basic CUDA functionality
echo "=== Testing CUDA functionality ==="
@@ -64,68 +62,89 @@ if torch.cuda.is_available():
"

echo "=== Starting optimized vLLM server ==="
# Optimized configuration for DGX Spark performance with NVFP4 quantization
# Available quantized models from NVIDIA
NVFP4_MODEL="nvidia/Llama-3.3-70B-Instruct-FP4"
NVFP8_MODEL="nvidia/Llama-3.1-8B-Instruct-FP8"
STANDARD_MODEL="meta-llama/Llama-3.1-70B-Instruct"

# Check GPU compute capability for optimal quantization
# Check GPU compute capability for optimal settings
COMPUTE_CAPABILITY=$(nvidia-smi -i 0 --query-gpu=compute_cap --format=csv,noheader,nounits 2>/dev/null || echo "unknown")
echo "Detected GPU compute capability: $COMPUTE_CAPABILITY"

# Configure quantization based on GPU architecture
if [[ "$COMPUTE_CAPABILITY" == "12.1" ]] || [[ "$COMPUTE_CAPABILITY" == "10.0" ]]; then
# Blackwell/DGX Spark architecture - use standard 70B model with CPU offloading
echo "Using standard Llama-3.1-70B model for Blackwell/DGX Spark with CPU offloading"
QUANTIZATION_FLAG=""
MODEL_TO_USE="$STANDARD_MODEL" # Use standard 70B model
GPU_MEMORY_UTIL="0.7" # Lower GPU memory to allow unified memory
MAX_MODEL_LEN="4096" # Shorter sequences for memory efficiency
MAX_NUM_SEQS="16" # Lower concurrent sequences for 70B
MAX_BATCHED_TOKENS="4096"
CPU_OFFLOAD_GB="50" # Offload 50GB to CPU/unified memory
elif [[ "$COMPUTE_CAPABILITY" == "9.0" ]]; then
# Hopper architecture - use standard model
echo "Using standard 70B model for Hopper architecture"
QUANTIZATION_FLAG=""
MODEL_TO_USE="$STANDARD_MODEL"
GPU_MEMORY_UTIL="0.7"
MAX_MODEL_LEN="4096"
MAX_NUM_SEQS="16"
MAX_BATCHED_TOKENS="4096"
CPU_OFFLOAD_GB="40"
# Use environment variable if set, otherwise default to Qwen (not gated)
if [ -n "$VLLM_MODEL" ]; then
MODEL_TO_USE="$VLLM_MODEL"
echo "Using model from environment: $MODEL_TO_USE"
else
# Other architectures - use standard precision
echo "Using standard 70B model for GPU architecture: $COMPUTE_CAPABILITY"
QUANTIZATION_FLAG=""
MODEL_TO_USE="$STANDARD_MODEL"
GPU_MEMORY_UTIL="0.7"
MAX_MODEL_LEN="2048"
MAX_NUM_SEQS="16"
MAX_BATCHED_TOKENS="2048"
CPU_OFFLOAD_GB="40"
# Default to Qwen 2.5 7B - not gated, no HuggingFace token required
MODEL_TO_USE="Qwen/Qwen2.5-7B-Instruct"
echo "Using default model: $MODEL_TO_USE"
fi

echo "Using model: $MODEL_TO_USE"
echo "Quantization: ${QUANTIZATION_FLAG:-'disabled'}"
# Configure settings based on model size and GPU architecture
# Check if using 8B or smaller model
if [[ "$MODEL_TO_USE" == *"8B"* ]] || [[ "$MODEL_TO_USE" == *"7B"* ]] || [[ "$MODEL_TO_USE" == *"3B"* ]] || [[ "$MODEL_TO_USE" == *"1B"* ]]; then
echo "Configuring for smaller model (8B or less)"
QUANTIZATION_FLAG=""
GPU_MEMORY_UTIL="${VLLM_GPU_MEMORY_UTILIZATION:-0.9}"
MAX_MODEL_LEN="${VLLM_MAX_MODEL_LEN:-8192}"
MAX_NUM_SEQS="${VLLM_MAX_NUM_SEQS:-64}"
MAX_BATCHED_TOKENS="${VLLM_MAX_NUM_BATCHED_TOKENS:-8192}"
CPU_OFFLOAD_GB="${VLLM_CPU_OFFLOAD_GB:-0}"
elif [[ "$COMPUTE_CAPABILITY" == "12.1" ]] || [[ "$COMPUTE_CAPABILITY" == "10.0" ]]; then
# Blackwell/DGX Spark architecture with larger model - use CPU offloading
echo "Configuring for large model on Blackwell/DGX Spark with CPU offloading"
QUANTIZATION_FLAG=""
GPU_MEMORY_UTIL="${VLLM_GPU_MEMORY_UTILIZATION:-0.7}"
MAX_MODEL_LEN="${VLLM_MAX_MODEL_LEN:-4096}"
MAX_NUM_SEQS="${VLLM_MAX_NUM_SEQS:-16}"
MAX_BATCHED_TOKENS="${VLLM_MAX_NUM_BATCHED_TOKENS:-4096}"
CPU_OFFLOAD_GB="${VLLM_CPU_OFFLOAD_GB:-50}"
else
# Other architectures with larger model
echo "Configuring for large model on GPU architecture: $COMPUTE_CAPABILITY"
QUANTIZATION_FLAG=""
GPU_MEMORY_UTIL="${VLLM_GPU_MEMORY_UTILIZATION:-0.7}"
MAX_MODEL_LEN="${VLLM_MAX_MODEL_LEN:-4096}"
MAX_NUM_SEQS="${VLLM_MAX_NUM_SEQS:-16}"
MAX_BATCHED_TOKENS="${VLLM_MAX_NUM_BATCHED_TOKENS:-4096}"
CPU_OFFLOAD_GB="${VLLM_CPU_OFFLOAD_GB:-40}"
fi

echo ""
echo "=== vLLM Configuration ==="
echo "Model: $MODEL_TO_USE"
echo "GPU memory utilization: $GPU_MEMORY_UTIL"

echo "Max model length: $MAX_MODEL_LEN"
echo "Max num seqs: $MAX_NUM_SEQS"
echo "Max batched tokens: $MAX_BATCHED_TOKENS"
echo "CPU Offload: ${CPU_OFFLOAD_GB}GB"
echo "Quantization: ${QUANTIZATION_FLAG:-'none'}"
echo ""

vllm serve "$MODEL_TO_USE" \
# Build command - only add cpu-offload-gb if > 0
VLLM_CMD="vllm serve $MODEL_TO_USE \
--host 0.0.0.0 \
--port 8001 \
--tensor-parallel-size 1 \
--max-model-len "$MAX_MODEL_LEN" \
--max-num-seqs "$MAX_NUM_SEQS" \
--max-num-batched-tokens "$MAX_BATCHED_TOKENS" \
--gpu-memory-utilization "$GPU_MEMORY_UTIL" \
--cpu-offload-gb "$CPU_OFFLOAD_GB" \
--max-model-len $MAX_MODEL_LEN \
--max-num-seqs $MAX_NUM_SEQS \
--gpu-memory-utilization $GPU_MEMORY_UTIL \
--kv-cache-dtype auto \
--trust-remote-code \
--served-model-name "$MODEL_TO_USE" \
--enable-chunked-prefill \
--disable-custom-all-reduce \
--disable-async-output-proc \
$QUANTIZATION_FLAG
--served-model-name $MODEL_TO_USE"

# Note: For FP8 models, vLLM auto-detects quantization from model config
# No need to specify --dtype float8 (not supported in vLLM 0.11.0)
if [[ "$MODEL_TO_USE" == *"FP8"* ]] || [[ "$MODEL_TO_USE" == *"fp8"* ]]; then
echo "Detected FP8 model - vLLM will auto-detect FP8 quantization from model config"
fi

# Add CPU offload only for larger models
if [ "$CPU_OFFLOAD_GB" -gt 0 ] 2>/dev/null; then
VLLM_CMD="$VLLM_CMD --cpu-offload-gb $CPU_OFFLOAD_GB"
fi

# Add quantization if specified
if [ -n "$QUANTIZATION_FLAG" ]; then
VLLM_CMD="$VLLM_CMD $QUANTIZATION_FLAG"
fi

echo "Running: $VLLM_CMD"
exec $VLLM_CMD
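The start script's new sizing logic is a string match on the model name plus a compute-capability fallback. A Python sketch of that decision table (hypothetical `vllm_profile` helper; the values are copied from the shell defaults above):

```python
def vllm_profile(model: str, compute_cap: str) -> dict:
    """Mirror the shell heuristic: small models run fully on GPU, large ones offload to CPU."""
    if any(tag in model for tag in ("8B", "7B", "3B", "1B")):
        # 8B-or-smaller model: full GPU memory, no CPU offload
        return {"gpu_mem_util": 0.9, "max_model_len": 8192,
                "max_num_seqs": 64, "cpu_offload_gb": 0}
    if compute_cap in ("12.1", "10.0"):
        # Blackwell / DGX Spark with a large model: lean on unified memory
        return {"gpu_mem_util": 0.7, "max_model_len": 4096,
                "max_num_seqs": 16, "cpu_offload_gb": 50}
    # Other architectures with a large model
    return {"gpu_mem_util": 0.7, "max_model_len": 4096,
            "max_num_seqs": 16, "cpu_offload_gb": 40}
```

Note that, like the shell pattern match, this keys off substrings such as "7B" in the model name, so the ungated default `Qwen/Qwen2.5-7B-Instruct` takes the small-model branch.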
@@ -18,7 +18,7 @@ This directory contains the Next.js frontend application for the txt2kg project.
- **lib/**: Utility functions and shared logic
- LLM service (Ollama, vLLM, NVIDIA API integration)
- Graph database services (ArangoDB, Neo4j)
- Pinecone vector database integration
- Qdrant vector database integration
- RAG service for knowledge graph querying
- **public/**: Static assets
- **types/**: TypeScript type definitions for graph data structures
@@ -76,7 +76,7 @@ Required environment variables are configured in docker-compose files:
- `OLLAMA_BASE_URL`: Ollama API endpoint
- `VLLM_BASE_URL`: vLLM API endpoint (optional)
- `NVIDIA_API_KEY`: NVIDIA API key (optional)
- `PINECONE_HOST`: Local Pinecone host (optional)
- `QDRANT_URL`: Qdrant vector database URL (optional)
- `SENTENCE_TRANSFORMER_URL`: Embeddings service URL (optional)

## Features
@@ -86,4 +86,4 @@ Required environment variables are configured in docker-compose files:
- **RAG Queries**: Query knowledge graphs with retrieval-augmented generation
- **Multiple LLM Providers**: Support for Ollama, vLLM, and NVIDIA API
- **GPU-Accelerated Rendering**: Optional PyGraphistry integration for large graphs
- **Vector Search**: Pinecone integration for semantic search
- **Vector Search**: Qdrant integration for semantic search
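The environment variables listed in the README resolve with simple optional fallbacks; a minimal Python sketch (hypothetical `llm_endpoints` helper; the Ollama default is assumed from the compose files):

```python
import os

def llm_endpoints(env=None) -> dict:
    """Resolve the service endpoints; only the Ollama URL has a built-in default."""
    env = os.environ if env is None else env
    return {
        "ollama": env.get("OLLAMA_BASE_URL", "http://ollama:11434/v1"),
        "vllm": env.get("VLLM_BASE_URL"),    # optional
        "qdrant": env.get("QDRANT_URL"),     # optional
    }
```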
@@ -21,7 +21,7 @@ import { getGraphDbType } from '../settings/route';

/**
 * Remote backend API that provides endpoints for creating and querying a knowledge graph
 * using the selected graph database, Pinecone, and SentenceTransformer
 * using the selected graph database, Qdrant, and SentenceTransformer
 */

/**

@@ -56,24 +56,24 @@ export async function POST(request: NextRequest) {
console.log(`Generated ${embeddings.length} embeddings`);

// Initialize QdrantService
const pineconeService = QdrantService.getInstance();
const qdrantService = QdrantService.getInstance();

// Check if Qdrant server is running
const isPineconeRunning = await pineconeService.isQdrantRunning();
if (!isPineconeRunning) {
const isQdrantRunning = await qdrantService.isQdrantRunning();
if (!isQdrantRunning) {
return NextResponse.json(
{ error: 'Qdrant server is not available. Please make sure it is running.' },
{ status: 503 }
);
}

if (!pineconeService.isInitialized()) {
if (!qdrantService.isInitialized()) {
try {
await pineconeService.initialize();
await qdrantService.initialize();
} catch (initError) {
console.error('Error initializing Pinecone:', initError);
console.error('Error initializing Qdrant:', initError);
return NextResponse.json(
{ error: `Failed to initialize Pinecone: ${initError instanceof Error ? initError.message : String(initError)}` },
{ error: `Failed to initialize Qdrant: ${initError instanceof Error ? initError.message : String(initError)}` },
{ status: 500 }
);
}
@@ -89,13 +89,13 @@ export async function POST(request: NextRequest) {
textContent.set(chunkIds[i], chunks[i]);
}

// Store embeddings in PineconeService with retry logic
// Store embeddings in Qdrant with retry logic
try {
await pineconeService.storeEmbeddings(entityEmbeddings, textContent);
await qdrantService.storeEmbeddings(entityEmbeddings, textContent);
} catch (storeError) {
console.error('Error storing embeddings in Pinecone:', storeError);
console.error('Error storing embeddings in Qdrant:', storeError);
return NextResponse.json(
{ error: `Failed to store embeddings in Pinecone: ${storeError instanceof Error ? storeError.message : String(storeError)}` },
{ error: `Failed to store embeddings in Qdrant: ${storeError instanceof Error ? storeError.message : String(storeError)}` },
{ status: 500 }
);
}

@@ -132,9 +132,9 @@ export async function POST(req: NextRequest) {
},
body: JSON.stringify({
text,
model: vllmModel || 'meta-llama/Llama-3.2-3B-Instruct',
model: vllmModel || process.env.VLLM_MODEL || 'nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8',
temperature: 0.1,
maxTokens: 8192
maxTokens: 4096 // Reduced to leave room for input tokens in context
})
});

@ -88,13 +88,18 @@ async function ensureConnection(request?: NextRequest): Promise<GraphDBType> {
|
||||
* GET handler for retrieving graph data from the selected graph database
|
||||
*/
|
||||
export async function GET(request: NextRequest) {
|
||||
console.log('[graph-db GET] Request received');
|
||||
try {
|
||||
// Initialize with connection parameters
|
||||
console.log('[graph-db GET] Ensuring connection...');
|
||||
const graphDbType = await ensureConnection(request);
|
||||
console.log(`[graph-db GET] Using database type: ${graphDbType}`);
|
||||
const graphDbService = getGraphDbService(graphDbType);
|
||||
|
||||
// Get graph data from the database
|
||||
console.log('[graph-db GET] Fetching graph data...');
|
||||
const graphData = await graphDbService.getGraphData();
|
||||
console.log(`[graph-db GET] Got ${graphData.nodes.length} nodes, ${graphData.relationships.length} relationships`);
|
||||
|
||||
// Transform to format expected by the frontend
|
||||
const nodes = graphData.nodes.map(node => ({
|
||||
|
||||
@ -30,7 +30,7 @@ export async function GET(request: NextRequest) {
|
||||
// Initialize services with the correct graph database type
|
||||
const graphDbType = getGraphDbType();
|
||||
const graphDbService = getGraphDbService(graphDbType);
|
||||
const pineconeService = QdrantService.getInstance();
|
||||
const qdrantService = QdrantService.getInstance();
|
||||
|
||||
// Initialize graph database if needed
|
||||
if (!graphDbService.isInitialized()) {
|
||||
@ -60,7 +60,7 @@ export async function GET(request: NextRequest) {
|
||||
// Get total triples (relationships)
|
||||
const totalTriples = graphData.relationships.length;
|
||||
|
||||
// Get vector stats from Pinecone if available
|
||||
// Get vector stats from Qdrant if available
|
||||
let vectorStats = {
|
||||
totalVectors: 0,
|
||||
avgQueryTime: 0,
|
||||
@ -68,8 +68,8 @@ export async function GET(request: NextRequest) {
|
||||
};
|
||||
|
||||
try {
|
||||
await pineconeService.initialize();
|
||||
const stats = await pineconeService.getStats();
|
||||
await qdrantService.initialize();
|
||||
const stats = await qdrantService.getStats();
|
||||
|
||||
vectorStats = {
|
||||
totalVectors: stats.totalVectorCount || 0,
|
||||
@ -77,7 +77,7 @@ export async function GET(request: NextRequest) {
|
||||
avgRelevanceScore: stats.averageRelevanceScore || 0
|
||||
};
|
||||
} catch (error) {
|
||||
console.warn('Could not fetch Pinecone stats:', error);
|
||||
console.warn('Could not fetch Qdrant stats:', error);
|
||||
}
|
||||
|
||||
// Get real query logs instead of mock data
|
||||
|
||||
@ -57,7 +57,7 @@ export async function POST(req: NextRequest) {
|
||||
console.log(`[${new Date().toISOString()}] /api/ollama: POST request received`);
|
||||
|
||||
try {
|
||||
const { text, model = 'qwen3:1.7b', temperature = 0.1, maxTokens = 8192 } = await req.json();
|
||||
const { text, model = 'qwen3:1.7b', temperature = 0.1, maxTokens = 4096 } = await req.json();
|
||||
console.log(`[${new Date().toISOString()}] /api/ollama: Parsed body - model: ${model}, text length: ${text?.length || 0}, maxTokens: ${maxTokens}`);
|
||||
|
||||
if (!text || typeof text !== 'string') {
|
||||
|
||||
32
nvidia/txt2kg/assets/frontend/app/api/ollama/tags/route.ts
Normal file
32
nvidia/txt2kg/assets/frontend/app/api/ollama/tags/route.ts
Normal file
@ -0,0 +1,32 @@
|
||||
//
|
||||
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
import { NextResponse } from 'next/server';
|
||||
|
||||
/**
|
||||
* Fetch available models from Ollama
|
||||
* GET /api/ollama/tags
|
||||
*/
|
||||
export async function GET() {
|
||||
const ollamaUrl = process.env.OLLAMA_BASE_URL || 'http://ollama:11434/v1';
|
||||
// Convert /v1 URL to base URL for tags endpoint
|
||||
const baseUrl = ollamaUrl.replace('/v1', '');
|
||||
|
||||
try {
|
||||
const response = await fetch(`${baseUrl}/api/tags`, {
|
||||
signal: AbortSignal.timeout(5000),
|
||||
});
|
||||
|
||||
if (!response.ok) {
|
||||
return NextResponse.json({ models: [] }, { status: 200 });
|
||||
}
|
||||
|
||||
const data = await response.json();
|
||||
return NextResponse.json(data);
|
||||
} catch (error) {
|
||||
// Return empty models array if Ollama is not available
|
||||
return NextResponse.json({ models: [] }, { status: 200 });
|
||||
}
|
||||
}
|
||||
|
||||
@ -1,21 +1,5 @@
|
||||
//
|
||||
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
//
|
||||
import { NextRequest, NextResponse } from 'next/server';
|
||||
import { QdrantService } from '@/lib/qdrant';
|
||||
import { PineconeService } from '@/lib/pinecone';
|
||||
|
||||
/**
|
||||
* Clear all data from the Pinecone vector database
|
||||
@ -23,7 +7,7 @@ import { QdrantService } from '@/lib/qdrant';
|
||||
*/
|
||||
export async function POST() {
|
||||
// Get the Pinecone service instance
|
||||
const pineconeService = QdrantService.getInstance();
|
||||
const pineconeService = PineconeService.getInstance();
|
||||
|
||||
// Clear all vectors from the database
|
||||
const deleteSuccess = await pineconeService.deleteAllEntities();
|
||||
|
||||
@ -1,21 +1,5 @@
|
||||
//
|
||||
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
//
|
||||
import { NextResponse } from 'next/server';
|
||||
import { QdrantService } from '@/lib/qdrant';
|
||||
import { PineconeService } from '@/lib/pinecone';
|
||||
|
||||
/**
|
||||
* Create Pinecone index API endpoint
|
||||
@ -24,7 +8,7 @@ import { QdrantService } from '@/lib/qdrant';
|
||||
export async function POST() {
|
||||
try {
|
||||
// Get the Pinecone service instance
|
||||
const pineconeService = QdrantService.getInstance();
|
||||
const pineconeService = PineconeService.getInstance();
|
||||
|
||||
// Force re-initialization to create the index
|
||||
(pineconeService as any).initialized = false;
|
||||
|
||||
@ -1,21 +1,5 @@
|
||||
//
|
||||
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
//
|
||||
import { NextRequest, NextResponse } from 'next/server';
|
||||
import { QdrantService } from '@/lib/qdrant';
|
||||
import { PineconeService } from '@/lib/pinecone';
|
||||
|
||||
/**
|
||||
* Get Pinecone vector database stats
|
||||
@ -23,7 +7,7 @@ import { QdrantService } from '@/lib/qdrant';
|
||||
export async function GET() {
|
||||
try {
|
||||
// Initialize Pinecone service
|
||||
const pineconeService = QdrantService.getInstance();
|
||||
const pineconeService = PineconeService.getInstance();
|
||||
|
||||
// We can now directly call getStats() which handles initialization and error recovery
|
||||
const stats = await pineconeService.getStats();
|
||||
|
||||
@ -19,7 +19,7 @@ import RAGService from '@/lib/rag';
|
||||
|
||||
/**
|
||||
* API endpoint for RAG-based question answering
|
||||
* Uses Pinecone for document retrieval and LangChain for generation
|
||||
* Uses Qdrant for document retrieval and LangChain for generation
|
||||
* POST /api/rag-query
|
||||
*/
|
||||
export async function POST(req: NextRequest) {
|
||||
|
||||
@ -51,7 +51,7 @@ export async function POST(req: NextRequest) {
|
||||
// Optionally store in vector database
|
||||
if (sentenceEmbeddings.length > 0) {
|
||||
try {
|
||||
// Map the embeddings to a format suitable for Pinecone
|
||||
// Map the embeddings to a format suitable for Qdrant
|
||||
const embeddingsMap = new Map<string, number[]>();
|
||||
const textContentMap = new Map<string, string>();
|
||||
const metadataMap = new Map<string, any>();
|
||||
@ -64,9 +64,9 @@ export async function POST(req: NextRequest) {
|
||||
metadataMap.set(key, item.metadata);
|
||||
});
|
||||
|
||||
// Store in Pinecone
|
||||
const pineconeService = QdrantService.getInstance();
|
||||
await pineconeService.storeEmbeddingsWithMetadata(
|
||||
// Store in Qdrant
|
||||
const qdrantService = QdrantService.getInstance();
|
||||
await qdrantService.storeEmbeddingsWithMetadata(
|
||||
embeddingsMap,
|
||||
textContentMap,
|
||||
metadataMap
|
||||
|
||||
@ -17,8 +17,26 @@
|
||||
import { NextRequest, NextResponse } from 'next/server';
|
||||
import { GraphDBType } from '@/lib/graph-db-service';
|
||||
|
||||
// In-memory storage for settings
|
||||
// In-memory storage for settings - use lazy initialization for env vars
|
||||
// because they're not available at build time, only at runtime
|
||||
let serverSettings: Record<string, string> = {};
|
||||
let settingsInitialized = false;
|
||||
|
||||
function ensureSettingsInitialized() {
|
||||
if (!settingsInitialized) {
|
||||
// Read environment variables at runtime, not build time
|
||||
serverSettings = {
|
||||
graph_db_type: process.env.GRAPH_DB_TYPE || 'arangodb',
|
||||
neo4j_uri: process.env.NEO4J_URI || '',
|
||||
neo4j_user: process.env.NEO4J_USER || process.env.NEO4J_USERNAME || '',
|
||||
neo4j_password: process.env.NEO4J_PASSWORD || '',
|
||||
arangodb_url: process.env.ARANGODB_URL || '',
|
||||
arangodb_db: process.env.ARANGODB_DB || '',
|
||||
};
|
||||
settingsInitialized = true;
|
||||
console.log(`[SETTINGS] Initialized at runtime with GRAPH_DB_TYPE: "${serverSettings.graph_db_type}"`);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* API Route to sync client settings with server environment variables
|
||||
@ -27,13 +45,16 @@ let serverSettings: Record<string, string> = {};
|
||||
*/
|
||||
export async function POST(request: NextRequest) {
|
||||
try {
|
||||
// Ensure settings are initialized from env vars first
|
||||
ensureSettingsInitialized();
|
||||
|
||||
const { settings } = await request.json();
|
||||
|
||||
if (!settings || typeof settings !== 'object') {
|
||||
return NextResponse.json({ error: 'Settings object is required' }, { status: 400 });
|
||||
}
|
||||
|
||||
// Update server settings
|
||||
// Update server settings (merge with existing)
|
||||
serverSettings = { ...serverSettings, ...settings };
|
||||
|
||||
// Log some important settings for debugging
|
||||
@ -58,6 +79,9 @@ export async function POST(request: NextRequest) {
|
||||
*/
|
||||
export async function GET(request: NextRequest) {
|
||||
try {
|
||||
// Ensure settings are initialized from env vars first
|
||||
ensureSettingsInitialized();
|
||||
|
||||
const url = new URL(request.url);
|
||||
const key = url.searchParams.get('key');
|
||||
|
||||
@ -84,12 +108,32 @@ export async function GET(request: NextRequest) {
|
||||
* For use in other API routes
|
||||
*/
|
||||
export function getSetting(key: string): string | null {
|
||||
ensureSettingsInitialized();
|
||||
return serverSettings[key] || null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get the currently selected graph database type
|
||||
* Priority: serverSettings > environment variable > default 'arangodb'
|
||||
*/
|
||||
export function getGraphDbType(): GraphDBType {
|
||||
return (serverSettings.graph_db_type as GraphDBType) || 'arangodb';
|
||||
// Ensure settings are initialized from runtime environment variables
|
||||
ensureSettingsInitialized();
|
||||
|
||||
// Check serverSettings (initialized from env vars or updated by client)
|
||||
if (serverSettings.graph_db_type) {
|
||||
console.log(`[getGraphDbType] Returning: "${serverSettings.graph_db_type}"`);
|
||||
return serverSettings.graph_db_type as GraphDBType;
|
||||
}
|
||||
|
||||
// Direct fallback to runtime environment variable
|
||||
const envType = process.env.GRAPH_DB_TYPE;
|
||||
if (envType) {
|
||||
console.log(`[getGraphDbType] Returning from env: "${envType}"`);
|
||||
return envType as GraphDBType;
|
||||
}
|
||||
|
||||
// Default to arangodb for backwards compatibility
|
||||
console.log(`[getGraphDbType] Returning default: "arangodb"`);
|
||||
return 'arangodb';
|
||||
}
|
||||
@ -0,0 +1,44 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
import { NextRequest, NextResponse } from 'next/server';
import { QdrantService } from '@/lib/qdrant';

/**
 * Clear all data from the Qdrant vector database
 * POST /api/vector-db/clear
 */
export async function POST() {
  // Get the Qdrant service instance
  const qdrantService = QdrantService.getInstance();

  // Clear all vectors from the database
  const deleteSuccess = await qdrantService.deleteAllEntities();

  // Get updated stats after clearing
  const stats = await qdrantService.getStats();

  // Return response based on operation success
  return NextResponse.json({
    success: deleteSuccess,
    message: deleteSuccess
      ? 'Successfully cleared all data from Qdrant vector database'
      : 'Failed to clear Qdrant database - service may not be available',
    totalVectorCount: stats.totalVectorCount || 0,
    httpHealthy: stats.httpHealthy || false
  });
}
@ -0,0 +1,53 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
import { NextResponse } from 'next/server';
import { QdrantService } from '@/lib/qdrant';

/**
 * Create Qdrant collection API endpoint
 * POST /api/vector-db/create-collection
 */
export async function POST() {
  try {
    // Get the Qdrant service instance
    const qdrantService = QdrantService.getInstance();

    // Force re-initialization to create the collection
    (qdrantService as any).initialized = false;
    await qdrantService.initialize();

    // Check if initialization was successful by getting stats
    const stats = await qdrantService.getStats();

    return NextResponse.json({
      success: true,
      message: 'Qdrant collection created successfully',
      httpHealthy: stats.httpHealthy || false
    });
  } catch (error) {
    console.error('Error creating Qdrant collection:', error);

    return NextResponse.json(
      {
        success: false,
        error: `Failed to create Qdrant collection: ${error instanceof Error ? error.message : String(error)}`
      },
      { status: 500 }
    );
  }
}
@ -0,0 +1,59 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
import { NextRequest, NextResponse } from 'next/server';
import { QdrantService } from '@/lib/qdrant';

/**
 * Get Qdrant vector database stats
 */
export async function GET() {
  try {
    // Initialize Qdrant service
    const qdrantService = QdrantService.getInstance();

    // We can now directly call getStats() which handles initialization and error recovery
    const stats = await qdrantService.getStats();

    return NextResponse.json({
      ...stats,
      timestamp: new Date().toISOString()
    });
  } catch (error) {
    console.error('Error getting Qdrant stats:', error);

    // Return a successful response with error information
    // This prevents the UI from breaking when Qdrant is unavailable
    let errorMessage = error instanceof Error ? error.message : String(error);

    // More specific error message for 404 errors
    if (errorMessage.includes('404')) {
      errorMessage = 'Qdrant server returned 404. The server may not be running or the collection does not exist.';
    }

    return NextResponse.json(
      {
        error: `Failed to get Qdrant stats: ${errorMessage}`,
        totalVectorCount: 0,
        source: 'error',
        httpHealthy: false,
        timestamp: new Date().toISOString()
      },
      { status: 200 } // Use 200 instead of 500 to avoid UI errors
    );
  }
}
40  nvidia/txt2kg/assets/frontend/app/api/vllm/models/route.ts  Normal file
@ -0,0 +1,40 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
import { NextResponse } from 'next/server';

/**
 * Fetch available models from vLLM
 * GET /api/vllm/models
 */
export async function GET() {
  const vllmUrl = process.env.VLLM_BASE_URL || 'http://vllm:8001/v1';

  try {
    const response = await fetch(`${vllmUrl}/models`, {
      signal: AbortSignal.timeout(5000),
    });

    if (!response.ok) {
      return NextResponse.json({ models: [] }, { status: 200 });
    }

    const data = await response.json();

    // vLLM returns OpenAI-compatible format: { data: [{ id: "model-name", ... }] }
    if (data.data && Array.isArray(data.data)) {
      const models = data.data.map((model: any) => ({
        id: model.id,
        name: model.id,
      }));
      return NextResponse.json({ models });
    }

    return NextResponse.json({ models: [] });
  } catch (error) {
    // Return empty models array if vLLM is not available
    return NextResponse.json({ models: [] }, { status: 200 });
  }
}
@ -86,7 +86,7 @@ export async function GET(req: NextRequest) {
*/
export async function POST(req: NextRequest) {
try {
const { text, model = 'meta-llama/Llama-3.2-3B-Instruct', temperature = 0.1, maxTokens = 1024 } = await req.json();
const { text, model = process.env.VLLM_MODEL || 'nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8', temperature = 0.1, maxTokens = 1024 } = await req.json();

if (!text || typeof text !== 'string') {
return NextResponse.json({ error: 'Text is required' }, { status: 400 });
@ -397,3 +397,88 @@ body {
/* Light mode: tune specific custom elements */
.light .glass-card:hover { box-shadow: 0 10px 18px -8px rgba(0,0,0,0.12) !important; }
.light .startup-tab-icon { box-shadow: 0 1px 3px rgba(0,0,0,0.06) !important; }

/* Progress bar indeterminate animation - smooth sliding with gradient shine */
@keyframes progress {
  0% {
    width: 0%;
    margin-left: 0%;
  }
  50% {
    width: 40%;
    margin-left: 30%;
  }
  100% {
    width: 0%;
    margin-left: 100%;
  }
}

.animate-progress {
  animation: progress 1.8s ease-in-out infinite;
}

/* Progress bar shimmer effect for determinate progress */
@keyframes shimmer {
  0% {
    transform: translateX(-100%);
  }
  100% {
    transform: translateX(100%);
  }
}

.progress-shimmer {
  position: relative;
  overflow: hidden;
}

.progress-shimmer::after {
  content: "";
  position: absolute;
  inset: 0;
  background: linear-gradient(
    90deg,
    transparent 0%,
    rgba(255, 255, 255, 0.15) 50%,
    transparent 100%
  );
  animation: shimmer 2s ease-in-out infinite;
}

/* Enhanced skeleton shimmer with directional sweep */
@keyframes skeleton-shimmer {
  0% {
    background-position: -200% 0;
  }
  100% {
    background-position: 200% 0;
  }
}

.skeleton-shimmer {
  background: linear-gradient(
    90deg,
    hsl(var(--muted)) 25%,
    hsl(var(--muted-foreground) / 0.08) 50%,
    hsl(var(--muted)) 75%
  );
  background-size: 200% 100%;
  animation: skeleton-shimmer 1.5s ease-in-out infinite;
}

/* Pulse animation for status indicators */
@keyframes status-pulse {
  0%, 100% {
    opacity: 1;
    transform: scale(1);
  }
  50% {
    opacity: 0.6;
    transform: scale(0.95);
  }
}

.status-pulse {
  animation: status-pulse 2s ease-in-out infinite;
}
@ -46,7 +46,6 @@ export default function Home() {
{ value: "edit", label: "Edit Knowledge Graph", Icon: Edit },
{ value: "visualize", label: "Visualize Graph", Icon: Network },
] as const;
const activeIndex = Math.max(0, steps.findIndex(s => s.value === activeTab));

// Updated to use callback reference
const handleTabChange = React.useCallback((tab: string) => {
@ -84,8 +83,8 @@ export default function Home() {

<main className="container mx-auto px-6 py-12 border-b border-border/10">

<Tabs defaultValue="upload" className="w-full mb-12" onValueChange={setActiveTab}>
<TabsList className="nvidia-build-tabs mb-12" aria-label="Workflow steps">
<Tabs defaultValue="upload" className="w-full" onValueChange={setActiveTab}>
<TabsList className="nvidia-build-tabs mb-10" aria-label="Workflow steps">
{steps.map(({ value, label, Icon }) => (
<TabsTrigger
key={value}
@ -106,22 +105,22 @@ export default function Home() {
</TabsList>

{/* Step 1: Document Upload */}
<TabsContent value="upload" className="space-y-8">
<TabsContent value="upload" className="nvidia-build-tab-content">
<UploadTab onTabChange={handleTabChange} />
</TabsContent>

{/* Step 2: Configure & Process */}
<TabsContent value="configure" className="space-y-8">
<TabsContent value="configure" className="nvidia-build-tab-content">
<ConfigureTab />
</TabsContent>

{/* Step 3: Edit Knowledge */}
<TabsContent value="edit" className="space-y-8">
<TabsContent value="edit" className="nvidia-build-tab-content">
<EditTab />
</TabsContent>

{/* Step 4: Visualize Knowledge Graph */}
<TabsContent value="visualize" className="space-y-8">
<TabsContent value="visualize" className="nvidia-build-tab-content">
<VisualizeTab />
</TabsContent>
</Tabs>
@ -68,7 +68,7 @@ export default function RagPage() {
}

// Check if vector search is available
const vectorResponse = await fetch('/api/pinecone-diag/stats');
const vectorResponse = await fetch('/api/vector-db/stats');
if (vectorResponse.ok) {
const data = await vectorResponse.json();
setVectorEnabled(data.totalVectorCount > 0);
@ -112,7 +112,7 @@ export default function RagPage() {
});

try {
// If using pure RAG (Pinecone + LangChain) without graph search
// If using pure RAG (Qdrant + LangChain) without graph search
if (params.usePureRag) {
queryMode = 'pure-rag';
try {
@ -14,8 +14,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.
//
import React, { useState } from "react";
import { ChevronDown, ChevronRight } from "lucide-react";
import React, { useState, useRef, useEffect } from "react";
import { ChevronDown } from "lucide-react";
import { cn } from "@/lib/utils";

interface AdvancedOptionsProps {
@ -32,28 +32,57 @@ export function AdvancedOptions({
defaultOpen = false
}: AdvancedOptionsProps) {
const [isOpen, setIsOpen] = useState(defaultOpen);
const contentRef = useRef<HTMLDivElement>(null);
const [contentHeight, setContentHeight] = useState<number | undefined>(
defaultOpen ? undefined : 0
);

// Update content height when open state changes
useEffect(() => {
if (isOpen) {
const height = contentRef.current?.scrollHeight;
setContentHeight(height);
// After animation completes, set to auto for dynamic content
const timer = setTimeout(() => setContentHeight(undefined), 200);
return () => clearTimeout(timer);
} else {
// First set to current height, then to 0 for smooth collapse
setContentHeight(contentRef.current?.scrollHeight);
requestAnimationFrame(() => setContentHeight(0));
}
}, [isOpen]);

return (
<div className={cn("border rounded-md overflow-hidden", className)}>
<div
className="flex items-center justify-between p-3 bg-muted/30 cursor-pointer hover:bg-muted/50 transition-colors"
<button
type="button"
className="w-full flex items-center justify-between p-3 bg-muted/30 cursor-pointer hover:bg-muted/50 transition-colors focus-visible:ring-2 focus-visible:ring-nvidia-green focus-visible:ring-inset"
onClick={() => setIsOpen(!isOpen)}
aria-expanded={isOpen}
aria-controls="advanced-options-content"
>
<h3 className="text-sm font-medium flex items-center">
{isOpen ? (
<ChevronDown className="h-4 w-4 mr-2" />
) : (
<ChevronRight className="h-4 w-4 mr-2" />
)}
<ChevronDown
className={cn(
"h-4 w-4 mr-2 transition-transform duration-200",
!isOpen && "-rotate-90"
)}
/>
{title}
</h3>
</div>
</button>

{isOpen && (
<div
id="advanced-options-content"
ref={contentRef}
className="overflow-hidden transition-all duration-200 ease-out"
style={{ height: contentHeight !== undefined ? contentHeight : 'auto' }}
aria-hidden={!isOpen}
>
<div className="p-4 border-t border-border/50">
{children}
</div>
)}
</div>
</div>
);
}
@ -57,24 +57,34 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
setGraphError(null)

try {
// Get database type from localStorage
const graphDbType = localStorage.getItem("graph_db_type") || "arangodb"
// Get database type from localStorage, fall back to fetching from server
let graphDbType = localStorage.getItem("graph_db_type")
if (!graphDbType) {
// Fetch server's default (from GRAPH_DB_TYPE env var)
try {
const settingsRes = await fetch('/api/settings')
const settingsData = await settingsRes.json()
graphDbType = settingsData.settings?.graph_db_type || 'neo4j'
} catch {
graphDbType = 'neo4j'
}
}
setDbType(graphDbType === "arangodb" ? "ArangoDB" : "Neo4j")

if (graphDbType === "neo4j") {
// Neo4j connection logic
// Neo4j connection logic - use the unified graph-db endpoint
const dbUrl = localStorage.getItem("NEO4J_URL")
const dbUsername = localStorage.getItem("NEO4J_USERNAME")
const dbPassword = localStorage.getItem("NEO4J_PASSWORD")

// Add query parameters if credentials exist
// Add query parameters with type=neo4j
const queryParams = new URLSearchParams()
queryParams.append("type", "neo4j")
if (dbUrl) queryParams.append("url", dbUrl)
if (dbUsername) queryParams.append("username", dbUsername)
if (dbPassword) queryParams.append("password", dbPassword)

const queryString = queryParams.toString()
const endpoint = queryString ? `/api/neo4j?${queryString}` : '/api/neo4j'
const endpoint = `/api/graph-db?${queryParams.toString()}`

const response = await fetch(endpoint)

@ -98,21 +108,21 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
setConnectionUrl(dbUrl)
}
} else {
// ArangoDB connection logic
// ArangoDB connection logic - use the unified graph-db endpoint with type=arangodb
const arangoUrl = localStorage.getItem("arango_url") || "http://localhost:8529"
const arangoDb = localStorage.getItem("arango_db") || "txt2kg"
const arangoUser = localStorage.getItem("arango_user") || ""
const arangoPassword = localStorage.getItem("arango_password") || ""

// Add query parameters if credentials exist
// Add query parameters with type=arangodb
const queryParams = new URLSearchParams()
queryParams.append("type", "arangodb")
if (arangoUrl) queryParams.append("url", arangoUrl)
if (arangoDb) queryParams.append("dbName", arangoDb)
if (arangoUser) queryParams.append("username", arangoUser)
if (arangoPassword) queryParams.append("password", arangoPassword)

const queryString = queryParams.toString()
const endpoint = queryString ? `/api/graph-db?${queryString}` : '/api/graph-db'
const endpoint = `/api/graph-db?${queryParams.toString()}`

const response = await fetch(endpoint)

@ -144,7 +154,8 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
// Disconnect from graph database
const disconnectGraph = async () => {
try {
const graphDbType = localStorage.getItem("graph_db_type") || "arangodb"
// Use current dbType state which was already determined from server/localStorage
const graphDbType = dbType === "Neo4j" ? "neo4j" : "arangodb"
const endpoint = graphDbType === "neo4j" ? '/api/neo4j/disconnect' : '/api/graph-db/disconnect'

const response = await fetch(endpoint, {
@ -171,7 +182,7 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {
// Fetch vector DB stats
const fetchVectorStats = async () => {
try {
const response = await fetch('/api/pinecone-diag/stats');
const response = await fetch('/api/vector-db/stats');
const data = await response.json();

if (response.ok) {
@ -273,7 +284,7 @@ export function DatabaseConnection({ className }: DatabaseConnectionProps) {

try {
// Call API to clear the database
const response = await fetch('/api/pinecone-diag/clear', {
const response = await fetch('/api/vector-db/clear', {
method: 'POST',
})

@ -28,6 +28,16 @@ import {
DialogHeader,
DialogTitle,
} from "@/components/ui/dialog"
import {
AlertDialog,
AlertDialogAction,
AlertDialogCancel,
AlertDialogContent,
AlertDialogDescription,
AlertDialogFooter,
AlertDialogHeader,
AlertDialogTitle,
} from "@/components/ui/alert-dialog"
import { Button } from "@/components/ui/button"
import type { Triple } from "@/utils/text-processing"
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
@ -44,6 +54,10 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
const [currentDocumentId, setCurrentDocumentId] = useState<string | null>(null)
const [editableTriples, setEditableTriples] = useState<Triple[]>([])
const [editingTripleIndex, setEditingTripleIndex] = useState<number | null>(null)

// Delete confirmation dialog state
const [showDeleteDialog, setShowDeleteDialog] = useState(false)
const [deleteTarget, setDeleteTarget] = useState<{ type: 'single' | 'multiple', docId?: string, docName?: string } | null>(null)

// Use shift-select hook for document selection
const {
@ -63,11 +77,32 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {

const handleDeleteSelected = () => {
if (selectedDocuments.length === 0) return

if (confirm(`Are you sure you want to delete ${selectedDocuments.length} selected document(s)?`)) {
setDeleteTarget({ type: 'multiple' })
setShowDeleteDialog(true)
}

const handleConfirmDelete = () => {
if (!deleteTarget) return

if (deleteTarget.type === 'multiple') {
deleteDocuments(selectedDocuments)
setSelectedDocuments([])
toast({
title: "Documents Deleted",
description: `Successfully deleted ${selectedDocuments.length} document(s).`,
duration: 3000,
})
} else if (deleteTarget.type === 'single' && deleteTarget.docId) {
deleteDocuments([deleteTarget.docId])
toast({
title: "Document Deleted",
description: `"${deleteTarget.docName}" has been deleted.`,
duration: 3000,
})
}

setShowDeleteDialog(false)
setDeleteTarget(null)
}

const openTriplesDialog = (documentId: string) => {
@ -249,6 +284,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
openTriplesDialog(doc.id);
}}
className="p-2 text-nvidia-green hover:bg-nvidia-green/10 rounded-lg transition-colors"
aria-label={`View and edit ${doc.triples?.length || 0} triples for ${doc.name}`}
title="View and edit triples"
>
<Eye className="h-4 w-4" />
@ -269,6 +305,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
// Create a simple info modal or tooltip showing document details
}}
className="p-2 text-muted-foreground hover:text-nvidia-green hover:bg-nvidia-green/10 rounded-lg transition-colors"
aria-label={`View info for ${doc.name}`}
title="View document info"
>
<Info className="h-4 w-4" />
@ -294,6 +331,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
}
}}
className="p-2 text-muted-foreground hover:text-nvidia-green hover:bg-nvidia-green/10 rounded-lg transition-colors"
aria-label={`Download ${doc.name}`}
title="Download document"
>
<Download className="h-4 w-4" />
@ -301,11 +339,11 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
<button
onClick={(e) => {
e.stopPropagation()
if (confirm(`Are you sure you want to delete ${doc.name}?`)) {
deleteDocuments([doc.id])
}
setDeleteTarget({ type: 'single', docId: doc.id, docName: doc.name })
setShowDeleteDialog(true)
}}
className="p-2 text-muted-foreground hover:text-red-500 hover:bg-red-500/10 rounded-lg transition-colors"
aria-label={`Delete ${doc.name}`}
title="Delete document"
>
<Trash2 className="h-4 w-4" />
@ -395,6 +433,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
<button
onClick={() => setEditingTripleIndex(null)}
className="p-1.5 text-primary hover:text-primary/80 hover:bg-primary/10 rounded-full transition-colors"
aria-label={`Save changes to triple: ${triple.subject} ${triple.predicate} ${triple.object}`}
title="Save"
>
<CheckCircle className="h-4 w-4" />
@ -403,6 +442,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
<button
onClick={() => setEditingTripleIndex(index)}
className="p-1.5 text-muted-foreground hover:text-foreground hover:bg-muted/50 rounded-full transition-colors"
aria-label={`Edit triple: ${triple.subject} ${triple.predicate} ${triple.object}`}
title="Edit"
>
<Edit className="h-4 w-4" />
@ -411,6 +451,7 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
<button
onClick={() => deleteTriple(index)}
className="p-1.5 text-muted-foreground hover:text-destructive hover:bg-destructive/10 rounded-full transition-colors"
aria-label={`Delete triple: ${triple.subject} ${triple.predicate} ${triple.object}`}
title="Delete"
>
<Trash2 className="h-4 w-4" />
@ -431,6 +472,40 @@ export function DocumentsTable({ onTabChange }: DocumentsTableProps) {
</div>
</DialogContent>
</Dialog>

{/* Delete Confirmation Dialog */}
<AlertDialog open={showDeleteDialog} onOpenChange={setShowDeleteDialog}>
<AlertDialogContent>
<AlertDialogHeader>
<AlertDialogTitle className="flex items-center gap-2">
<Trash2 className="h-5 w-5 text-destructive" />
Delete {deleteTarget?.type === 'multiple' ? 'Documents' : 'Document'}
</AlertDialogTitle>
<AlertDialogDescription>
{deleteTarget?.type === 'multiple' ? (
<>
Are you sure you want to delete <strong>{selectedDocuments.length}</strong> selected document{selectedDocuments.length !== 1 ? 's' : ''}?
This action cannot be undone.
</>
) : (
<>
Are you sure you want to delete <strong>"{deleteTarget?.docName}"</strong>?
This action cannot be undone.
</>
)}
</AlertDialogDescription>
</AlertDialogHeader>
<AlertDialogFooter>
<AlertDialogCancel onClick={() => setDeleteTarget(null)}>Cancel</AlertDialogCancel>
<AlertDialogAction
onClick={handleConfirmDelete}
className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
>
Delete
</AlertDialogAction>
</AlertDialogFooter>
</AlertDialogContent>
</AlertDialog>
</div>
)
}
@ -19,6 +19,7 @@
import { Network, Zap } from "lucide-react"
import { useDocuments } from "@/contexts/document-context"
import { Loader2 } from "lucide-react"
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"

export function GraphActions() {
const { documents, processDocuments, isProcessing, openGraphVisualization } = useDocuments()
@ -50,34 +51,67 @@ export function GraphActions() {
}
}

// Helper to get tooltip content for disabled Process button
const getProcessTooltip = () => {
if (isProcessing) return "Processing in progress..."
if (!hasNewDocuments && documents.length === 0) return "Upload documents first to extract knowledge triples"
if (!hasNewDocuments) return "All documents have been processed"
return "Extract knowledge triples from uploaded documents"
}

// Helper to get tooltip content for disabled View Graph button
const getViewGraphTooltip = () => {
if (isProcessing) return "Wait for processing to complete"
if (!hasProcessedDocuments && documents.length === 0) return "Upload and process documents first"
if (!hasProcessedDocuments) return "Process documents first to generate knowledge triples"
return "Visualize the knowledge graph from extracted triples"
}

return (
<div className="flex gap-3 items-center">
<button
className={`btn-primary ${!hasNewDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
disabled={!hasNewDocuments || isProcessing}
onClick={handleProcessDocuments}
>
{isProcessing ? (
<>
<Loader2 className="h-4 w-4 animate-spin" />
Processing...
</>
) : (
<>
<Zap className="h-4 w-4" />
Process Documents
</>
)}
</button>
<button
className={`btn-primary ${!hasProcessedDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
disabled={!hasProcessedDocuments || isProcessing}
onClick={() => openGraphVisualization()}
>
<Network className="h-4 w-4" />
View Knowledge Graph
</button>
</div>
<TooltipProvider>
<div className="flex gap-3 items-center">
<Tooltip>
<TooltipTrigger asChild>
<button
className={`btn-primary ${!hasNewDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
disabled={!hasNewDocuments || isProcessing}
onClick={handleProcessDocuments}
>
{isProcessing ? (
<>
<Loader2 className="h-4 w-4 animate-spin" />
Processing...
</>
) : (
<>
<Zap className="h-4 w-4" />
Process Documents
</>
)}
</button>
</TooltipTrigger>
<TooltipContent>
<p>{getProcessTooltip()}</p>
</TooltipContent>
</Tooltip>

<Tooltip>
<TooltipTrigger asChild>
<button
className={`btn-primary ${!hasProcessedDocuments || isProcessing ? "opacity-60 cursor-not-allowed" : ""}`}
disabled={!hasProcessedDocuments || isProcessing}
onClick={() => openGraphVisualization()}
>
<Network className="h-4 w-4" />
View Knowledge Graph
</button>
</TooltipTrigger>
<TooltipContent>
<p>{getViewGraphTooltip()}</p>
</TooltipContent>
</Tooltip>
</div>
</TooltipProvider>
)
}

@ -17,7 +17,7 @@
|
||||
"use client"
|
||||
|
||||
import { useState, useEffect } from "react"
|
||||
import { ChevronDown, Cpu } from "lucide-react"
|
||||
import { ChevronDown, Cpu, Server, RefreshCw } from "lucide-react"
|
||||
import { OllamaIcon } from "@/components/ui/ollama-icon"
|
||||
|
||||
interface LLMModel {
|
||||
@ -28,15 +28,8 @@ interface LLMModel {
|
||||
description?: string
|
||||
}
|
||||
|
||||
// Default models
|
||||
const DEFAULT_MODELS: LLMModel[] = [
|
||||
{
|
||||
id: "ollama-llama3.1:8b",
|
||||
name: "Llama 3.1 8B",
|
||||
model: "llama3.1:8b",
|
||||
provider: "ollama",
|
||||
description: "Local Ollama model"
|
||||
},
|
||||
// NVIDIA API models (always available if API key is set)
|
||||
const NVIDIA_MODELS: LLMModel[] = [
|
||||
{
|
||||
id: "nvidia-nemotron-super",
|
||||
name: "Nemotron Super 49B",
|
||||
@ -54,51 +47,100 @@ const DEFAULT_MODELS: LLMModel[] = [
|
||||
]
|
||||
|
||||
export function LLMSelectorCompact() {
|
||||
const [models, setModels] = useState<LLMModel[]>(DEFAULT_MODELS)
|
||||
const [selectedModel, setSelectedModel] = useState<LLMModel>(DEFAULT_MODELS[0])
|
||||
const [models, setModels] = useState<LLMModel[]>([])
|
||||
const [selectedModel, setSelectedModel] = useState<LLMModel | null>(null)
|
||||
const [isOpen, setIsOpen] = useState(false)
|
||||
const [isLoading, setIsLoading] = useState(true)
|
||||
|
||||
// Load Ollama models from settings
|
||||
useEffect(() => {
|
||||
try {
|
||||
const selectedOllamaModels = localStorage.getItem("selected_ollama_models")
|
||||
if (selectedOllamaModels) {
|
||||
const modelNames: string[] = JSON.parse(selectedOllamaModels)
|
||||
const ollamaModels: LLMModel[] = modelNames.map(name => ({
|
||||
id: `ollama-${name}`,
|
||||
name: name,
|
||||
model: name,
|
||||
provider: "ollama",
|
||||
description: "Local Ollama model"
|
||||
}))
|
||||
|
||||
// Combine with default models, avoiding duplicates
|
||||
const defaultOllamaIds = DEFAULT_MODELS
|
||||
.filter(m => m.provider === "ollama")
|
||||
.map(m => m.model)
|
||||
const uniqueOllamaModels = ollamaModels.filter(
|
||||
m => !defaultOllamaIds.includes(m.model)
|
||||
)
|
||||
|
||||
const allModels = [...DEFAULT_MODELS, ...uniqueOllamaModels]
|
||||
setModels(allModels)
|
||||
}
|
||||
} catch (error) {
|
||||
console.error("Error loading Ollama models:", error)
|
||||
}
|
||||
}, [])
|
||||
  // Fetch available models from running backends
  const fetchAvailableModels = async () => {
    setIsLoading(true)
    const availableModels: LLMModel[] = []

    // Load selected model from localStorage
    useEffect(() => {
    // Check vLLM first (port 8001)
    try {
      const saved = localStorage.getItem("selectedModelForRAG")
      if (saved) {
        const savedModel: LLMModel = JSON.parse(saved)
        setSelectedModel(savedModel)
      const vllmResponse = await fetch('/api/vllm/models', {
        signal: AbortSignal.timeout(3000)
      })
      if (vllmResponse.ok) {
        const data = await vllmResponse.json()
        if (data.models && Array.isArray(data.models)) {
          data.models.forEach((model: any) => {
            const modelId = model.id || model.name || model
            availableModels.push({
              id: `vllm-${modelId}`,
              name: modelId.split('/').pop() || modelId,
              model: modelId,
              provider: "vllm",
              description: "vLLM (GPU-accelerated)"
            })
          })
        }
      }
    } catch (error) {
      console.error("Error loading selected model:", error)
    } catch (e) {
      // vLLM not available
      console.log("vLLM not available")
    }

    // Check Ollama (port 11434)
    try {
      const ollamaResponse = await fetch('/api/ollama/tags', {
        signal: AbortSignal.timeout(3000)
      })
      if (ollamaResponse.ok) {
        const data = await ollamaResponse.json()
        if (data.models && Array.isArray(data.models)) {
          data.models.forEach((model: any) => {
            const modelName = model.name || model
            availableModels.push({
              id: `ollama-${modelName}`,
              name: modelName,
              model: modelName,
              provider: "ollama",
              description: "Local Ollama model"
            })
          })
        }
      }
    } catch (e) {
      // Ollama not available
      console.log("Ollama not available")
    }

    // Always add NVIDIA API models
    availableModels.push(...NVIDIA_MODELS)

    setModels(availableModels)

    // Set default selected model
    if (availableModels.length > 0) {
      // Try to restore saved selection
      try {
        const saved = localStorage.getItem("selectedModelForRAG")
        if (saved) {
          const savedModel: LLMModel = JSON.parse(saved)
          const found = availableModels.find(m => m.id === savedModel.id)
          if (found) {
            setSelectedModel(found)
            setIsLoading(false)
            return
          }
        }
      } catch (e) {
        // Ignore
      }

      // Default to first available local model (vLLM or Ollama), not NVIDIA API
      const localModel = availableModels.find(m => m.provider === "vllm" || m.provider === "ollama")
      setSelectedModel(localModel || availableModels[0])
    }

    setIsLoading(false)
  }

  // Fetch models on mount
  useEffect(() => {
    fetchAvailableModels()
  }, [])

  // Save selected model to localStorage and dispatch event
@@ -117,14 +159,55 @@ export function LLMSelectorCompact() {
    if (provider === "ollama") {
      return <OllamaIcon className="h-3 w-3 text-orange-500" />
    }
    if (provider === "vllm") {
      return <Server className="h-3 w-3 text-purple-500" />
    }
    return <Cpu className="h-3 w-3 text-green-500" />
  }

  const getProviderLabel = (provider: string) => {
    switch (provider) {
      case "ollama": return "Ollama"
      case "vllm": return "vLLM"
      case "nvidia": return "NVIDIA API"
      default: return provider
    }
  }

  if (isLoading) {
    return (
      <div className="flex items-center gap-2 px-3 py-1.5 text-sm border border-border/40 rounded-lg bg-background/50">
        <RefreshCw className="h-3 w-3 animate-spin text-muted-foreground" />
        <span className="text-muted-foreground">Loading models...</span>
      </div>
    )
  }

  if (!selectedModel) {
    return (
      <div className="flex items-center gap-2 px-3 py-1.5 text-sm border border-border/40 rounded-lg bg-background/50 text-muted-foreground">
        No models available
      </div>
    )
  }

  // Group models by provider
  const groupedModels = models.reduce((acc, model) => {
    if (!acc[model.provider]) {
      acc[model.provider] = []
    }
    acc[model.provider].push(model)
    return acc
  }, {} as Record<string, LLMModel[]>)

  return (
    <div className="relative">
      <button
        type="button"
        onClick={() => setIsOpen(!isOpen)}
        aria-haspopup="listbox"
        aria-expanded={isOpen}
        aria-label={`Select LLM model. Currently selected: ${selectedModel.name}`}
        className="flex items-center gap-2 px-3 py-1.5 text-sm border border-border/40 rounded-lg bg-background/50 hover:bg-muted/30 transition-colors"
      >
        {getModelIcon(selectedModel.provider)}
@@ -141,37 +224,61 @@ export function LLMSelectorCompact() {
      />

      {/* Dropdown */}
      <div className="absolute top-full left-0 mt-2 w-64 border border-border/40 rounded-lg bg-popover shadow-lg z-50 overflow-hidden">
        <div className="p-2 border-b border-border/40 bg-muted/30">
      <div
        className="absolute top-full left-0 mt-2 w-72 border border-border/40 rounded-lg bg-popover shadow-lg z-50 overflow-hidden"
        role="listbox"
        aria-label="Available LLM models"
      >
        <div className="p-2 border-b border-border/40 bg-muted/30 flex items-center justify-between">
          <h4 className="text-xs font-semibold text-foreground">Select LLM for Answer Generation</h4>
          <button
            type="button"
            onClick={(e) => {
              e.stopPropagation()
              fetchAvailableModels()
            }}
            className="p-1 hover:bg-muted/50 rounded"
            title="Refresh models"
          >
            <RefreshCw className="h-3 w-3 text-muted-foreground" />
          </button>
        </div>
        <div className="max-h-64 overflow-y-auto">
          {models.map((model) => (
            <button
              key={model.id}
              type="button"
              onClick={() => handleSelectModel(model)}
              className={`w-full flex items-start gap-2 p-3 hover:bg-muted/50 transition-colors text-left ${
                selectedModel.id === model.id ? 'bg-nvidia-green/10' : ''
              }`}
            >
              <div className="mt-0.5">
                {getModelIcon(model.provider)}
        <div className="max-h-80 overflow-y-auto">
          {Object.entries(groupedModels).map(([provider, providerModels]) => (
            <div key={provider}>
              <div className="px-3 py-1.5 text-xs font-semibold text-muted-foreground bg-muted/20 border-b border-border/20">
                {getProviderLabel(provider)}
              </div>
              <div className="flex-1 min-w-0">
                <div className="text-sm font-medium text-foreground truncate">
                  {model.name}
                </div>
                {model.description && (
                  <div className="text-xs text-muted-foreground">
                    {model.description}
              {providerModels.map((model) => (
                <button
                  key={model.id}
                  type="button"
                  role="option"
                  aria-selected={selectedModel.id === model.id}
                  onClick={() => handleSelectModel(model)}
                  className={`w-full flex items-start gap-2 p-3 hover:bg-muted/50 transition-colors text-left ${
                    selectedModel.id === model.id ? 'bg-nvidia-green/10' : ''
                  }`}
                >
                  <div className="mt-0.5">
                    {getModelIcon(model.provider)}
                  </div>
                )}
              </div>
              {selectedModel.id === model.id && (
                <div className="w-2 h-2 rounded-full bg-nvidia-green flex-shrink-0 mt-1.5" />
              )}
            </button>
                  <div className="flex-1 min-w-0">
                    <div className="text-sm font-medium text-foreground truncate">
                      {model.name}
                    </div>
                    {model.description && (
                      <div className="text-xs text-muted-foreground">
                        {model.description}
                      </div>
                    )}
                  </div>
                  {selectedModel.id === model.id && (
                    <div className="w-2 h-2 rounded-full bg-nvidia-green flex-shrink-0 mt-1.5" />
                  )}
                </button>
              ))}
            </div>
          ))}
        </div>
      </div>
@@ -180,4 +287,3 @@ export function LLMSelectorCompact() {
    </div>
  )
}

@@ -17,12 +17,22 @@
"use client"

import { useState, useEffect, useRef } from "react"
import { createPortal } from "react-dom"
import { ChevronDown, Sparkles, Cpu, Server } from "lucide-react"
import { ChevronDown, Cpu, Server, RefreshCw } from "lucide-react"
import { OllamaIcon } from "@/components/ui/ollama-icon"

// Base models - NVIDIA NeMo as default (first in list)
const baseModels = [
interface Model {
  id: string
  name: string
  icon: React.ReactNode
  description: string
  model: string
  baseURL: string
  provider: string
  apiKeyName?: string
}

// NVIDIA API models (always available)
const NVIDIA_MODELS: Model[] = [
  {
    id: "nvidia-nemotron",
    name: "NVIDIA Llama 3.3 Nemotron Super 49B",
@@ -31,6 +41,7 @@ const baseModels = [
    model: "nvidia/llama-3.3-nemotron-super-49b-v1.5",
    apiKeyName: "NVIDIA_API_KEY",
    baseURL: "https://integrate.api.nvidia.com/v1",
    provider: "nvidia",
  },
  {
    id: "nvidia-nemotron-nano",
@@ -40,68 +51,116 @@ const baseModels = [
    model: "nvidia/nvidia-nemotron-nano-9b-v2",
    apiKeyName: "NVIDIA_API_KEY",
    baseURL: "https://integrate.api.nvidia.com/v1",
  },
  // Preset Ollama model
  {
    id: "ollama-llama3.1:8b",
    name: "Ollama llama3.1:8b",
    icon: <OllamaIcon className="h-4 w-4 text-orange-500" />,
    description: "Local Ollama server with llama3.1:8b model",
    model: "llama3.1:8b",
    baseURL: "http://localhost:11434/v1",
    provider: "ollama",
    provider: "nvidia",
  },
]

// vLLM models removed per user request

// Helper function to create Ollama model objects
const createOllamaModel = (modelName: string) => ({
// Helper to create model objects
const createOllamaModel = (modelName: string): Model => ({
  id: `ollama-${modelName}`,
  name: `Ollama ${modelName}`,
  icon: <OllamaIcon className="h-4 w-4 text-orange-500" />,
  description: `Local Ollama server with ${modelName} model`,
  description: `Local Ollama model`,
  model: modelName,
  baseURL: "http://localhost:11434/v1",
  provider: "ollama",
})

const createVllmModel = (modelName: string): Model => ({
  id: `vllm-${modelName}`,
  name: modelName.split('/').pop() || modelName,
  icon: <Server className="h-4 w-4 text-purple-500" />,
  description: "vLLM (GPU-accelerated)",
  model: modelName,
  baseURL: "http://localhost:8001/v1",
  provider: "vllm",
})

export function ModelSelector() {
  const [models, setModels] = useState(() => [...baseModels])
  const [selectedModel, setSelectedModel] = useState(() => {
    // Try to find a default Ollama model first
    const defaultOllama = models.find(m => m.provider === "ollama")
    return defaultOllama || models[0]
  })
  const [models, setModels] = useState<Model[]>([])
  const [selectedModel, setSelectedModel] = useState<Model | null>(null)
  const [isOpen, setIsOpen] = useState(false)
  const [isLoading, setIsLoading] = useState(true)
  const buttonRef = useRef<HTMLButtonElement | null>(null)
  const containerRef = useRef<HTMLDivElement | null>(null)
  const [mounted, setMounted] = useState(false)

  // Load configured Ollama models
  const loadOllamaModels = () => {
  // Fetch available models from running backends
  const fetchAvailableModels = async () => {
    setIsLoading(true)
    const availableModels: Model[] = []

    // Check vLLM first (port 8001)
    try {
      const selectedOllamaModels = localStorage.getItem("selected_ollama_models")
      if (selectedOllamaModels) {
        const modelNames = JSON.parse(selectedOllamaModels)
        // Filter out models that are already in baseModels to avoid duplicates
        const baseModelNames = baseModels.filter(m => m.provider === "ollama").map(m => m.model)
        const filteredModelNames = modelNames.filter((name: string) => !baseModelNames.includes(name))
        const ollamaModels = filteredModelNames.map(createOllamaModel)
        const newModels = [...baseModels, ...ollamaModels]
        setModels(newModels)
        return newModels
      const vllmResponse = await fetch('/api/vllm/models', {
        signal: AbortSignal.timeout(3000)
      })
      if (vllmResponse.ok) {
        const data = await vllmResponse.json()
        if (data.models && Array.isArray(data.models)) {
          data.models.forEach((model: any) => {
            const modelId = model.id || model.name || model
            availableModels.push(createVllmModel(modelId))
          })
        }
      }
    } catch (error) {
      console.error("Error loading Ollama models:", error)
    } catch (e) {
      console.log("vLLM not available")
    }
    // Return base models if no Ollama models configured
    return [...baseModels]

    // Check Ollama (port 11434)
    try {
      const ollamaResponse = await fetch('/api/ollama/tags', {
        signal: AbortSignal.timeout(3000)
      })
      if (ollamaResponse.ok) {
        const data = await ollamaResponse.json()
        if (data.models && Array.isArray(data.models)) {
          data.models.forEach((model: any) => {
            const modelName = model.name || model
            availableModels.push(createOllamaModel(modelName))
          })
        }
      }
    } catch (e) {
      console.log("Ollama not available")
    }

    // Always add NVIDIA API models
    availableModels.push(...NVIDIA_MODELS)

    setModels(availableModels)

    // Set default selected model
    if (availableModels.length > 0) {
      // Try to restore saved selection
      try {
        const saved = localStorage.getItem("selectedModel")
        if (saved) {
          const savedModel = JSON.parse(saved)
          const found = availableModels.find(m => m.id === savedModel.id)
          if (found) {
            setSelectedModel(found)
            setIsLoading(false)
            return
          }
        }
      } catch (e) {
        // Ignore
      }

      // Default to first available local model (vLLM or Ollama)
      const localModel = availableModels.find(m => m.provider === "vllm" || m.provider === "ollama")
      setSelectedModel(localModel || availableModels[0])
    }

    setIsLoading(false)
  }

  // Dispatch custom event when model changes
  const updateSelectedModel = (model: any) => {
  const updateSelectedModel = (model: Model) => {
    setSelectedModel(model)
    localStorage.setItem("selectedModel", JSON.stringify(model))

    // Dispatch a custom event with the selected model data
    const event = new CustomEvent('modelSelected', {
@@ -110,59 +169,11 @@ export function ModelSelector() {
    window.dispatchEvent(event)
  }

  // Fetch models on mount
  useEffect(() => {
    // Save selected model to localStorage
    localStorage.setItem("selectedModel", JSON.stringify(selectedModel))
  }, [selectedModel])

  // Initialize models and selected model
  useEffect(() => {
    const loadedModels = loadOllamaModels()

    // Try to restore selected model from localStorage
    const savedModel = localStorage.getItem("selectedModel")
    if (savedModel) {
      try {
        const parsed = JSON.parse(savedModel)
        // Find matching model in our current models array
        const matchingModel = loadedModels.find(m => m.id === parsed.id)
        if (matchingModel) {
          updateSelectedModel(matchingModel)
        } else {
          // If saved model not found, use first available model
          updateSelectedModel(loadedModels[0])
        }
      } catch (e) {
        console.error("Error parsing saved model", e)
        updateSelectedModel(loadedModels[0])
      }
    } else {
      // If no model in localStorage, use first available model
      updateSelectedModel(loadedModels[0])
    }
    fetchAvailableModels()
  }, [])

  // Listen for Ollama model updates
  useEffect(() => {
    const handleOllamaUpdate = (event: CustomEvent) => {
      console.log("Ollama models updated, reloading...")
      const newModels = loadOllamaModels()

      // Check if current selected model still exists
      const currentModelStillExists = newModels.find(m => m.id === selectedModel.id)
      if (!currentModelStillExists) {
        // Select first available model if current one is no longer available
        updateSelectedModel(newModels[0])
      }
    }

    window.addEventListener('ollama-models-updated', handleOllamaUpdate as EventListener)

    return () => {
      window.removeEventListener('ollama-models-updated', handleOllamaUpdate as EventListener)
    }
  }, [selectedModel.id])

  // Set mounted state after component mounts (for SSR compatibility)
  useEffect(() => {
    setMounted(true)
@@ -186,6 +197,55 @@ export function ModelSelector() {
    }
  }, [])

  // Listen for Ollama model updates
  useEffect(() => {
    const handleOllamaUpdate = () => {
      console.log("Ollama models updated, reloading...")
      fetchAvailableModels()
    }

    window.addEventListener('ollama-models-updated', handleOllamaUpdate)

    return () => {
      window.removeEventListener('ollama-models-updated', handleOllamaUpdate)
    }
  }, [])

  if (isLoading) {
    return (
      <div className="flex items-center gap-2 bg-card border border-border rounded-lg px-4 py-2 text-sm">
        <RefreshCw className="h-4 w-4 animate-spin text-muted-foreground" />
        <span className="text-muted-foreground">Loading models...</span>
      </div>
    )
  }

  if (!selectedModel) {
    return (
      <div className="flex items-center gap-2 bg-card border border-border rounded-lg px-4 py-2 text-sm text-muted-foreground">
        No models available
      </div>
    )
  }

  // Group models by provider
  const groupedModels = models.reduce((acc, model) => {
    if (!acc[model.provider]) {
      acc[model.provider] = []
    }
    acc[model.provider].push(model)
    return acc
  }, {} as Record<string, Model[]>)

  const getProviderLabel = (provider: string) => {
    switch (provider) {
      case "ollama": return "Ollama (Local)"
      case "vllm": return "vLLM (GPU-accelerated)"
      case "nvidia": return "NVIDIA API (Cloud)"
      default: return provider
    }
  }

  return (
    <div ref={containerRef} className="relative">
      <button
@@ -202,35 +262,57 @@ export function ModelSelector() {

      {isOpen && mounted && (
        <div
          className="absolute bg-card border border-border rounded-md shadow-md overflow-hidden max-h-80 overflow-y-auto z-50"
          className="absolute bg-card border border-border rounded-md shadow-md overflow-hidden max-h-96 overflow-y-auto z-50"
          style={{
            width: "288px",
            width: "320px",
            bottom: "calc(100% + 4px)",
            left: 0,
          }}
        >
          <ul className="divide-y divide-border/60">
            {models.map((model) => (
              <li key={model.id}>
                <button
                  className={`w-full text-left px-3 py-2 hover:bg-muted/30 text-sm flex flex-col gap-1 ${model.id === selectedModel.id ? 'bg-primary/10' : ''}`}
                  onClick={() => {
                    updateSelectedModel(model)
                    setIsOpen(false)
                  }}
                >
                  <span className="flex items-center gap-2">
                    {model.icon}
                    <span className={`font-medium ${model.id === selectedModel.id ? 'text-primary' : ''}`}>{model.name}</span>
                  </span>
                  <span className="text-xs text-muted-foreground pl-6">{model.description}</span>
                </button>
              </li>
          <div className="px-3 py-2 border-b border-border/60 bg-muted/30 flex items-center justify-between">
            <span className="text-xs font-semibold text-foreground">Select Model</span>
            <button
              type="button"
              onClick={(e) => {
                e.stopPropagation()
                fetchAvailableModels()
              }}
              className="p-1 hover:bg-muted/50 rounded"
              title="Refresh models"
            >
              <RefreshCw className="h-3 w-3 text-muted-foreground" />
            </button>
          </div>
          <div>
            {Object.entries(groupedModels).map(([provider, providerModels]) => (
              <div key={provider}>
                <div className="px-3 py-1.5 text-xs font-semibold text-muted-foreground bg-muted/20 border-b border-border/20">
                  {getProviderLabel(provider)}
                </div>
                <ul>
                  {providerModels.map((model) => (
                    <li key={model.id}>
                      <button
                        className={`w-full text-left px-3 py-2 hover:bg-muted/30 text-sm flex flex-col gap-1 ${model.id === selectedModel.id ? 'bg-primary/10' : ''}`}
                        onClick={() => {
                          updateSelectedModel(model)
                          setIsOpen(false)
                        }}
                      >
                        <span className="flex items-center gap-2">
                          {model.icon}
                          <span className={`font-medium ${model.id === selectedModel.id ? 'text-primary' : ''}`}>{model.name}</span>
                        </span>
                        <span className="text-xs text-muted-foreground pl-6">{model.description}</span>
                      </button>
                    </li>
                  ))}
                </ul>
              </div>
            ))}
          </ul>
          </div>
        </div>
      )}
    </div>
  )
}

@@ -1,19 +1,3 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
"use client"

import { useState, useEffect } from "react"
@@ -103,7 +87,7 @@ export function PineconeConnection({ className }: PineconeConnectionProps) {
            <InfoIcon className="h-5 w-5 text-muted-foreground" />
          </TooltipTrigger>
          <TooltipContent>
            <p>Qdrant stores vector embeddings for semantic search</p>
            <p>Local Pinecone stores vector embeddings in memory for semantic search</p>
          </TooltipContent>
        </Tooltip>
      </TooltipProvider>
@@ -125,34 +109,34 @@ export function PineconeConnection({ className }: PineconeConnectionProps) {
          <p className="whitespace-normal break-words">Error: {error}</p>
          {error.includes('404') && (
            <p className="mt-1 text-xs">
              The Qdrant server is running but the collection doesn't exist yet.
              <button
              The Pinecone server is running but the index doesn't exist yet.
              <button
                onClick={async () => {
                  setConnectionStatus("checking");
                  setError(null);
                  try {
                    const response = await fetch('/api/pinecone-diag/create-index', { method: 'POST' });
                    if (response.ok) {
                      // Wait a bit for the collection to be created
                      // Wait a bit for the index to be created
                      await new Promise(resolve => setTimeout(resolve, 2000));
                      checkConnection();
                    } else {
                      const data = await response.json();
                      setError(data.error || 'Failed to create collection');
                      setError(data.error || 'Failed to create index');
                      setConnectionStatus("disconnected");
                    }
                  } catch (err) {
                    setError(err instanceof Error ? err.message : 'Error creating collection');
                    setError(err instanceof Error ? err.message : 'Error creating index');
                    setConnectionStatus("disconnected");
                  }
                }}
                className="ml-1 text-blue-600 hover:text-blue-800 underline"
              >
                Click here to create the collection
                Click here to create the index
              </button>
              <br />
              <span className="text-xs text-gray-600">Or using Docker Compose: </span>
              <code className="mx-1 px-1 bg-gray-100 rounded">docker compose restart qdrant</code>
              <code className="mx-1 px-1 bg-gray-100 rounded">docker-compose restart pinecone</code>
            </p>
          )}
        </div>
@@ -160,25 +144,13 @@ export function PineconeConnection({ className }: PineconeConnectionProps) {

      <div className="text-sm space-y-1 w-full">
        <div className="flex justify-between">
          <span className="text-muted-foreground">Qdrant</span>
          <span className="text-xs text-muted-foreground">{(stats as any).url || 'http://qdrant:6333'}</span>
          <span className="text-muted-foreground">Vectors:</span>
          <span>{stats.nodes}</span>
        </div>
        <div className="flex justify-between">
          <span className="text-muted-foreground">Vectors:</span>
          <span>{stats.nodes} indexed</span>
          <span className="text-muted-foreground">Source:</span>
          <span>{stats.source} local</span>
        </div>
        {(stats as any).status && (
          <div className="flex justify-between">
            <span className="text-muted-foreground">Status:</span>
            <span className="capitalize">{(stats as any).status}</span>
          </div>
        )}
        {(stats as any).vectorSize && (
          <div className="flex justify-between">
            <span className="text-muted-foreground">Dimensions:</span>
            <span>{(stats as any).vectorSize}d ({(stats as any).distance})</span>
          </div>
        )}
      </div>

      <div className="flex space-x-2">

207  nvidia/txt2kg/assets/frontend/components/qdrant-connection.tsx  Normal file
@@ -0,0 +1,207 @@
//
// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
"use client"

import { useState, useEffect } from "react"
import { Button } from '@/components/ui/button'
import { Badge } from '@/components/ui/badge'
import { InfoIcon } from 'lucide-react'
import { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger } from '@/components/ui/tooltip'
import { VectorDBStats } from '@/types/graph'

interface QdrantConnectionProps {
  className?: string
}

export function QdrantConnection({ className }: QdrantConnectionProps) {
  const [connectionStatus, setConnectionStatus] = useState<"connected" | "disconnected" | "checking">("disconnected")
  const [error, setError] = useState<string | null>(null)
  const [stats, setStats] = useState<VectorDBStats>({ nodes: 0, relationships: 0, source: 'none' })

  // Fetch vector DB stats
  const fetchStats = async () => {
    try {
      const response = await fetch('/api/vector-db/stats');
      const data = await response.json();

      if (response.ok) {
        setStats({
          nodes: typeof data.totalVectorCount === 'number' ? data.totalVectorCount : 0,
          relationships: 0, // Vector DB doesn't store relationships
          source: data.source || 'unknown',
          httpHealthy: data.httpHealthy
        });

        // If we have a healthy HTTP connection, we're connected
        if (data.httpHealthy) {
          setConnectionStatus("connected");
          setError(null);
        } else {
          setConnectionStatus("disconnected");
          setError(data.error || 'Connection failed');
        }

        console.log('Vector DB stats:', data);
      } else {
        console.error('Failed to fetch vector DB stats:', data);
        setConnectionStatus("disconnected");
        setError(data.error || 'Failed to connect to vector database');
      }
    } catch (error) {
      console.error('Error fetching vector DB stats:', error);
      setConnectionStatus("disconnected");
      setError(error instanceof Error ? error.message : 'Error connecting to vector database');
    }
  };

  // Check connection status and stats
  const checkConnection = async () => {
    setConnectionStatus("checking")
    setError(null)

    try {
      await fetchStats(); // Fetch stats directly - our status is based on having embeddings
    } catch (error) {
      console.error('Error connecting to Vector DB:', error)
      setConnectionStatus("disconnected")
      setError(error instanceof Error ? error.message : 'Unknown error connecting to Vector DB')
    }
  }

  // Reset connection state
  const disconnect = async () => {
    setConnectionStatus("disconnected")
    setStats({ nodes: 0, relationships: 0, source: 'none' })
  }

  // Initial connection check
  useEffect(() => {
    checkConnection()
  }, [])

  return (
    <div className={`flex flex-col items-start space-y-4 p-4 border rounded-md ${className}`}>
      <div className="flex justify-between w-full">
        <h2 className="text-lg font-medium">Vector DB</h2>
        <TooltipProvider>
          <Tooltip>
            <TooltipTrigger>
              <InfoIcon className="h-5 w-5 text-muted-foreground" />
            </TooltipTrigger>
            <TooltipContent>
              <p>Qdrant stores vector embeddings for semantic search</p>
            </TooltipContent>
          </Tooltip>
        </TooltipProvider>
      </div>

      <div className="flex items-center space-x-2">
        <span className="text-sm">Status:</span>
        {connectionStatus === "connected" ? (
          <Badge variant="outline" className="bg-green-50 text-green-700 hover:bg-green-50 border-green-200">Connected</Badge>
        ) : connectionStatus === "checking" ? (
          <Badge variant="outline" className="bg-yellow-50 text-yellow-700 hover:bg-yellow-50 border-yellow-200">Checking...</Badge>
        ) : (
          <Badge variant="outline" className="bg-red-50 text-red-700 hover:bg-red-50 border-red-200">Disconnected</Badge>
        )}
      </div>

      {error && (
        <div className="text-sm text-red-600 bg-red-50 p-2 rounded w-full overflow-auto max-h-20">
          <p className="whitespace-normal break-words">Error: {error}</p>
          {error.includes('404') && (
            <p className="mt-1 text-xs">
              The Qdrant server is running but the collection doesn't exist yet.
              <button
                onClick={async () => {
                  setConnectionStatus("checking");
                  setError(null);
                  try {
                    const response = await fetch('/api/vector-db/create-collection', { method: 'POST' });
                    if (response.ok) {
                      // Wait a bit for the collection to be created
                      await new Promise(resolve => setTimeout(resolve, 2000));
                      checkConnection();
                    } else {
                      const data = await response.json();
                      setError(data.error || 'Failed to create collection');
                      setConnectionStatus("disconnected");
                    }
                  } catch (err) {
                    setError(err instanceof Error ? err.message : 'Error creating collection');
                    setConnectionStatus("disconnected");
                  }
                }}
                className="ml-1 text-blue-600 hover:text-blue-800 underline"
              >
                Click here to create the collection
              </button>
              <br />
              <span className="text-xs text-gray-600">Or using Docker Compose: </span>
              <code className="mx-1 px-1 bg-gray-100 rounded">docker compose restart qdrant</code>
            </p>
          )}
        </div>
      )}

      <div className="text-sm space-y-1 w-full">
        <div className="flex justify-between">
          <span className="text-muted-foreground">Qdrant</span>
          <span className="text-xs text-muted-foreground">{(stats as any).url || 'http://qdrant:6333'}</span>
        </div>
        <div className="flex justify-between">
          <span className="text-muted-foreground">Vectors:</span>
          <span>{stats.nodes} indexed</span>
        </div>
        {(stats as any).status && (
          <div className="flex justify-between">
            <span className="text-muted-foreground">Status:</span>
            <span className="capitalize">{(stats as any).status}</span>
          </div>
        )}
        {(stats as any).vectorSize && (
          <div className="flex justify-between">
            <span className="text-muted-foreground">Dimensions:</span>
            <span>{(stats as any).vectorSize}d ({(stats as any).distance})</span>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
<div className="flex space-x-2">
|
||||
<Button
|
||||
variant="outline"
|
||||
size="sm"
|
||||
onClick={checkConnection}
|
||||
disabled={connectionStatus === "checking"}
|
||||
>
|
||||
{connectionStatus === "checking" ? "Checking..." : "Check Connection"}
|
||||
</Button>
|
||||
|
||||
{connectionStatus === "connected" && (
|
||||
<Button
|
||||
variant="outline"
|
||||
size="sm"
|
||||
onClick={disconnect}
|
||||
>
|
||||
Disconnect
|
||||
</Button>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
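The component above cycles through three status strings. As an aside, that handling can be factored into a pure transition helper, which keeps the Badge-selection JSX trivial to reason about. A minimal sketch — `ConnectionStatus` mirrors the component's strings, but `nextStatus` and the event shapes are hypothetical, not part of the component:

```typescript
// Hypothetical pure helper mirroring the component's status transitions.
type ConnectionStatus = "connected" | "checking" | "disconnected";

type ConnectionEvent =
  | { kind: "check" }       // user clicked "Check Connection"
  | { kind: "success" }     // health probe succeeded
  | { kind: "failure" }     // health probe failed or threw
  | { kind: "disconnect" }; // user clicked "Disconnect"

function nextStatus(current: ConnectionStatus, event: ConnectionEvent): ConnectionStatus {
  switch (event.kind) {
    case "check":
      return "checking";
    case "success":
      return "connected";
    case "failure":
    case "disconnect":
      return "disconnected";
  }
}
```

Keeping the transitions pure means the ternary that picks a Badge color only ever sees one of three known strings, regardless of how many async paths set the state.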

@@ -156,16 +156,21 @@ export function RagQuery({
            : 'border-border/30 opacity-50 cursor-not-allowed'
        }`}
      >
        <div className="w-5 h-5 rounded-md bg-nvidia-green/15 flex items-center justify-center mb-1.5">
          <Zap className="h-2.5 w-2.5 text-nvidia-green" />
        <div className={`w-5 h-5 rounded-md flex items-center justify-center mb-1.5 ${vectorEnabled ? 'bg-nvidia-green/15' : 'bg-muted/15'}`}>
          <Zap className={`h-2.5 w-2.5 ${vectorEnabled ? 'text-nvidia-green' : 'text-muted-foreground'}`} />
        </div>
        <span className="text-sm font-semibold">Pure RAG</span>
        <span className={`text-sm font-semibold ${!vectorEnabled ? 'text-muted-foreground' : ''}`}>Pure RAG</span>
        <span className="text-[10px] mt-0.5 text-center text-muted-foreground leading-tight">
          Vector DB + LLM
        </span>
        {queryMode === 'pure-rag' && (
          <div className="absolute top-2 right-2 w-1.5 h-1.5 bg-nvidia-green rounded-full"></div>
        )}
        {!vectorEnabled && (
          <div className="text-[9px] px-1.5 py-0.5 bg-blue-500/20 text-blue-700 dark:text-blue-400 rounded mt-1 font-medium">
            NEEDS EMBEDDINGS
          </div>
        )}
      </button>

      <button

@@ -76,10 +76,8 @@ export function SettingsModal() {
  const [arangoUser, setArangoUser] = useState("")
  const [arangoPassword, setArangoPassword] = useState("")

  // Vector DB settings - changed from Milvus to Pinecone
  const [pineconeApiKey, setPineconeApiKey] = useState("")
  const [pineconeEnvironment, setPineconeEnvironment] = useState("")
  const [pineconeIndex, setPineconeIndex] = useState("")
  // Vector DB settings - Qdrant
  const [qdrantUrl, setQdrantUrl] = useState("")

  // S3 Storage settings
  const [s3Endpoint, setS3Endpoint] = useState("")

@@ -171,9 +169,20 @@ export function SettingsModal() {
    setIsS3Connected(s3Connected)
  }

    // Load graph DB type
    const storedGraphDbType = localStorage.getItem("graph_db_type") || "arangodb"
    setGraphDbType(storedGraphDbType as GraphDBType)
    // Load graph DB type - fetch from server if not in localStorage
    const storedGraphDbType = localStorage.getItem("graph_db_type")
    if (storedGraphDbType) {
      setGraphDbType(storedGraphDbType as GraphDBType)
    } else {
      // Fetch server's default (from GRAPH_DB_TYPE env var)
      fetch('/api/settings')
        .then(res => res.json())
        .then(data => {
          const serverDefault = data.settings?.graph_db_type || 'neo4j'
          setGraphDbType(serverDefault as GraphDBType)
        })
        .catch(() => setGraphDbType('neo4j'))
    }

    // Load Neo4j settings
    setNeo4jUrl(localStorage.getItem("neo4j_url") || "")

@@ -186,9 +195,7 @@ export function SettingsModal() {
    setArangoUser(localStorage.getItem("arango_user") || "")
    setArangoPassword(localStorage.getItem("arango_password") || "")

    setPineconeApiKey(localStorage.getItem("pinecone_api_key") || "")
    setPineconeEnvironment(localStorage.getItem("pinecone_environment") || "")
    setPineconeIndex(localStorage.getItem("pinecone_index") || "")
    setQdrantUrl(localStorage.getItem("qdrant_url") || "http://localhost:6333")
  }, [isOpen])
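The load sequence above resolves each setting with a fixed precedence: the value stored in localStorage wins, then the server's default from `/api/settings`, then a hard-coded fallback. That precedence can be expressed as one small helper; `resolveSetting` is illustrative only and not part of SettingsModal:

```typescript
// Illustrative precedence helper: first defined, non-empty value wins.
function resolveSetting(
  stored: string | null,             // e.g. localStorage.getItem("graph_db_type")
  serverDefault: string | undefined, // e.g. data.settings?.graph_db_type
  fallback: string                   // e.g. "neo4j"
): string {
  if (stored && stored.length > 0) return stored;
  if (serverDefault && serverDefault.length > 0) return serverDefault;
  return fallback;
}
```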

  // Save database settings

@@ -249,9 +256,7 @@ export function SettingsModal() {
  const saveVectorDbSettings = async (e: React.FormEvent) => {
    e.preventDefault()

    localStorage.setItem("pinecone_api_key", pineconeApiKey)
    localStorage.setItem("pinecone_environment", pineconeEnvironment)
    localStorage.setItem("pinecone_index", pineconeIndex)
    localStorage.setItem("qdrant_url", qdrantUrl)

    // Sync settings with server
    try {

@@ -262,9 +267,7 @@ export function SettingsModal() {
        },
        body: JSON.stringify({
          settings: {
            pinecone_api_key: pineconeApiKey,
            pinecone_environment: pineconeEnvironment,
            pinecone_index: pineconeIndex,
            qdrant_url: qdrantUrl,
          }
        }),
      });

@@ -452,7 +455,11 @@ export function SettingsModal() {
  return (
    <Dialog open={isOpen} onOpenChange={setIsOpen}>
      <DialogTrigger asChild>
        <button className="flex items-center justify-center gap-2 p-2 hover:bg-primary/10 rounded-full transition-colors" title="Settings">
        <button
          className="flex items-center justify-center gap-2 p-2 hover:bg-primary/10 rounded-full transition-colors"
          aria-label="Open settings"
          title="Settings"
        >
          <Settings className="h-5 w-5 text-muted-foreground hover:text-primary transition-colors" />
        </button>
      </DialogTrigger>

@@ -668,44 +675,22 @@ export function SettingsModal() {
              <div className="space-y-2">
                <label className="text-sm font-semibold text-foreground flex items-center gap-2">
                  <SearchIcon className="h-4 w-4 text-nvidia-green" />
                  Pinecone Configuration
                  Qdrant Configuration
                </label>
              </div>

              <div className="bg-background/50 rounded-lg p-3 space-y-3">
                <div className="grid grid-cols-1 gap-3">
                  <div>
                    <label className="text-xs font-medium text-muted-foreground mb-1 block">API Key</label>
                    <label className="text-xs font-medium text-muted-foreground mb-1 block">Qdrant URL</label>
                    <input
                      type="password"
                      value={pineconeApiKey}
                      onChange={(e) => setPineconeApiKey(e.target.value)}
                      placeholder="Enter your Pinecone API key"
                      type="text"
                      value={qdrantUrl}
                      onChange={(e) => setQdrantUrl(e.target.value)}
                      placeholder="http://localhost:6333"
                      className="w-full bg-background border border-border/60 rounded-md p-2 text-sm text-foreground focus:ring-1 focus:ring-primary/50 focus:border-primary transition-colors"
                    />
                  </div>
                  <div className="grid grid-cols-2 gap-3">
                    <div>
                      <label className="text-xs font-medium text-muted-foreground mb-1 block">Environment</label>
                      <input
                        type="text"
                        value={pineconeEnvironment}
                        onChange={(e) => setPineconeEnvironment(e.target.value)}
                        placeholder="us-west1-gcp"
                        className="w-full bg-background border border-border/60 rounded-md p-2 text-sm text-foreground focus:ring-1 focus:ring-primary/50 focus:border-primary transition-colors"
                      />
                    </div>
                    <div>
                      <label className="text-xs font-medium text-muted-foreground mb-1 block">Index Name</label>
                      <input
                        type="text"
                        value={pineconeIndex}
                        onChange={(e) => setPineconeIndex(e.target.value)}
                        placeholder="knowledge-graph"
                        className="w-full bg-background border border-border/60 rounded-md p-2 text-sm text-foreground focus:ring-1 focus:ring-primary/50 focus:border-primary transition-colors"
                      />
                    </div>
                  </div>
                </div>
              </div>


@@ -21,12 +21,16 @@ import { useTheme } from "./theme-provider"

export function ThemeToggle() {
  const { theme, setTheme } = useTheme()

  const nextTheme = theme === "dark" ? "light" : "dark"
  const label = `Switch to ${nextTheme} theme (currently ${theme})`

  return (
    <button
      className="btn-icon relative"
      onClick={() => setTheme(theme === "dark" ? "light" : "dark")}
      aria-label="Toggle theme"
      className="btn-icon relative focus-visible:ring-2 focus-visible:ring-nvidia-green focus-visible:ring-offset-2 focus-visible:ring-offset-background rounded-lg"
      onClick={() => setTheme(nextTheme)}
      aria-label={label}
      title={`Switch to ${nextTheme} theme`}
    >
      <Sun
        className={`h-5 w-5 transition-all ${theme === "dark" ? "opacity-0 scale-0 rotate-90 absolute" : "opacity-100 scale-100 rotate-0 relative"}`}

@@ -91,11 +91,16 @@ export function TripleEditor({ triple, index, onSave, onCancel }: TripleEditorPr
        <button
          type="button"
          onClick={onCancel}
          aria-label="Cancel editing triple"
          className="p-2 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/50 transition-colors"
        >
          <X className="h-4 w-4" />
        </button>
        <button type="submit" className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10 transition-colors">
        <button
          type="submit"
          aria-label="Save triple"
          className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10 transition-colors"
        >
          <Check className="h-4 w-4" />
        </button>
      </div>

@@ -19,8 +19,18 @@
import { useState, useEffect, useRef } from "react"
import { useDocuments } from "@/contexts/document-context"
import type { Triple } from "@/utils/text-processing"
import { Pencil, Trash2, Plus, Download, ChevronDown, FileJson, FileText, List, Network, Check, X, Database } from "lucide-react"
import { Pencil, Trash2, Plus, Download, ChevronDown, FileJson, FileText, List, Network, Check, X, Database, AlertCircle } from "lucide-react"
import { TripleEditor } from "./triple-editor"
import {
  AlertDialog,
  AlertDialogAction,
  AlertDialogCancel,
  AlertDialogContent,
  AlertDialogDescription,
  AlertDialogFooter,
  AlertDialogHeader,
  AlertDialogTitle,
} from "@/components/ui/alert-dialog"

// Add this new EntityEditor component before the TripleViewer component
interface EntityEditorProps {

@@ -59,11 +69,16 @@ function EntityEditor({ entity, onSave, onCancel }: EntityEditorProps) {
      <button
        type="button"
        onClick={onCancel}
        aria-label="Cancel editing entity"
        className="p-2 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/30"
      >
        <X className="h-4 w-4" />
      </button>
      <button type="submit" className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10">
      <button
        type="submit"
        aria-label="Save entity changes"
        className="p-2 text-primary hover:text-primary/80 rounded-full hover:bg-primary/10"
      >
        <Check className="h-4 w-4" />
      </button>
    </div>

@@ -87,6 +102,12 @@ export function TripleViewer() {
  const [isDropdownOpen, setIsDropdownOpen] = useState(false)
  const [searchQuery, setSearchQuery] = useState('')
  const dropdownRef = useRef<HTMLDivElement>(null)

  // Delete confirmation dialog state
  const [showDeleteTripleDialog, setShowDeleteTripleDialog] = useState(false)
  const [tripleToDelete, setTripleToDelete] = useState<{ index: number, triple: Triple } | null>(null)
  const [showDeleteEntityDialog, setShowDeleteEntityDialog] = useState(false)
  const [entityToDelete, setEntityToDelete] = useState<string | null>(null)

  // Handle click outside to close dropdown
  useEffect(() => {

@@ -167,12 +188,19 @@ export function TripleViewer() {
  }

  const handleDeleteTriple = (index: number) => {
    if (selectedDoc) {
      if (confirm("Are you sure you want to delete this triple?")) {
        deleteTriple(selectedDoc.id, index)
      }
    if (selectedDoc && selectedDoc.triples) {
      setTripleToDelete({ index, triple: selectedDoc.triples[index] })
      setShowDeleteTripleDialog(true)
    }
  }

  const confirmDeleteTriple = () => {
    if (selectedDoc && tripleToDelete !== null) {
      deleteTriple(selectedDoc.id, tripleToDelete.index)
    }
    setShowDeleteTripleDialog(false)
    setTripleToDelete(null)
  }

  const exportTriplesCSV = () => {
    if (!selectedDoc || !selectedDoc.triples) return

@@ -281,16 +309,22 @@ export function TripleViewer() {

  const handleDeleteEntity = (entity: string) => {
    if (!selectedDoc || !selectedDoc.triples) return;

    if (confirm(`Are you sure you want to delete the entity "${entity}"? This will remove all triples containing this entity.`)) {
    setEntityToDelete(entity)
    setShowDeleteEntityDialog(true)
  };

  const confirmDeleteEntity = () => {
    if (selectedDoc && selectedDoc.triples && entityToDelete) {
      // Filter out all triples that contain the entity
      const filteredTriples = selectedDoc.triples.filter(triple =>
        triple.subject !== entity && triple.object !== entity
        triple.subject !== entityToDelete && triple.object !== entityToDelete
      );

      // Update the document with the filtered triples
      updateTriples(selectedDoc.id, filteredTriples);
    }
    setShowDeleteEntityDialog(false)
    setEntityToDelete(null)
  };
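The `confirmDeleteEntity` handler above removes every triple whose subject or object matches the entity. Extracting that filter into a pure function makes the behavior easy to verify on its own; a sketch (the `Triple` shape follows the import above, `removeEntity` itself is hypothetical):

```typescript
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Drop every triple that references the entity on either end.
// Triples where the entity appears only as the predicate are kept.
function removeEntity(triples: Triple[], entity: string): Triple[] {
  return triples.filter(
    (t) => t.subject !== entity && t.object !== entity
  );
}
```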

  // Function to store triples in the Neo4j database

@@ -383,8 +417,11 @@ export function TripleViewer() {
          <label className="text-sm font-semibold text-foreground whitespace-nowrap">Select Document</label>
          <div className="relative w-64">
            <button
              className="w-full flex items-center justify-between bg-card border border-border rounded-lg p-3 text-foreground text-sm hover:bg-muted/30 transition-colors"
              className="w-full flex items-center justify-between bg-card border border-border rounded-lg p-3 text-foreground text-sm hover:bg-muted/30 transition-colors focus-visible:ring-2 focus-visible:ring-nvidia-green focus-visible:ring-offset-2"
              onClick={() => setIsDropdownOpen(!isDropdownOpen)}
              aria-haspopup="listbox"
              aria-expanded={isDropdownOpen}
              aria-label={`Select document. Currently selected: ${selectedDoc?.name || 'None'}`}
            >
              <span className="truncate">
                {selectedDoc?.name || "Select document"}

@@ -400,13 +437,18 @@ export function TripleViewer() {
                strokeLinecap="round"
                strokeLinejoin="round"
                className={`transition-transform ${isDropdownOpen ? 'rotate-180' : ''}`}
                aria-hidden="true"
              >
                <polyline points="6 9 12 15 18 9"></polyline>
              </svg>
            </button>

            {isDropdownOpen && (
              <div className="absolute z-10 mt-1 w-full bg-card border border-border rounded-lg shadow-lg max-h-64 overflow-y-auto">
              <div
                className="absolute z-10 mt-1 w-full bg-card border border-border rounded-lg shadow-lg max-h-64 overflow-y-auto"
                role="listbox"
                aria-label="Processed documents"
              >
                <div className="p-2 sticky top-0 bg-card border-b border-border">
                  <input
                    type="text"

@@ -425,6 +467,8 @@ export function TripleViewer() {
                    filteredDocs.map((doc) => (
                      <button
                        key={doc.id}
                        role="option"
                        aria-selected={doc.id === selectedDoc?.id}
                        className={`w-full text-left p-2 hover:bg-muted/30 text-sm ${
                          doc.id === selectedDoc?.id ? 'bg-primary/10 text-primary' : ''
                        }`}

@@ -657,6 +701,7 @@ export function TripleViewer() {
                        <button
                          onClick={() => setEditingIndex(index)}
                          className="p-1.5 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/50 transition-colors"
                          aria-label={`Edit triple: ${normalizeText(triple.subject)} ${normalizeText(triple.predicate)} ${normalizeText(triple.object)}`}
                          title="Edit Triple"
                        >
                          <Pencil className="h-3.5 w-3.5" />

@@ -664,6 +709,7 @@ export function TripleViewer() {
                        <button
                          onClick={() => handleDeleteTriple(index)}
                          className="p-1.5 text-muted-foreground hover:text-destructive rounded-full hover:bg-destructive/10 transition-colors"
                          aria-label={`Delete triple: ${normalizeText(triple.subject)} ${normalizeText(triple.predicate)} ${normalizeText(triple.object)}`}
                          title="Delete Triple"
                        >
                          <Trash2 className="h-3.5 w-3.5" />

@@ -805,6 +851,7 @@ export function TripleViewer() {
                      <button
                        onClick={() => setEditingEntityIndex(index)}
                        className="p-1.5 text-muted-foreground hover:text-foreground rounded-full hover:bg-muted/30"
                        aria-label={`Edit entity: ${normalizeText(entity)}`}
                        title="Edit Entity"
                      >
                        <Pencil className="h-3.5 w-3.5" />

@@ -812,6 +859,7 @@ export function TripleViewer() {
                      <button
                        onClick={() => handleDeleteEntity(entity)}
                        className="p-1.5 text-muted-foreground hover:text-destructive rounded-full hover:bg-destructive/10"
                        aria-label={`Delete entity: ${normalizeText(entity)}`}
                        title="Delete Entity"
                      >
                        <Trash2 className="h-3.5 w-3.5" />

@@ -837,6 +885,66 @@ export function TripleViewer() {
        )}
        </>
      )}

      {/* Delete Triple Confirmation Dialog */}
      <AlertDialog open={showDeleteTripleDialog} onOpenChange={setShowDeleteTripleDialog}>
        <AlertDialogContent>
          <AlertDialogHeader>
            <AlertDialogTitle className="flex items-center gap-2">
              <Trash2 className="h-5 w-5 text-destructive" />
              Delete Triple
            </AlertDialogTitle>
            <AlertDialogDescription>
              Are you sure you want to delete this triple?
              {tripleToDelete && (
                <div className="mt-3 p-3 bg-muted/50 rounded-lg text-sm font-mono">
                  <span className="text-foreground">{normalizeText(tripleToDelete.triple.subject)}</span>
                  <span className="text-muted-foreground mx-2">→</span>
                  <span className="text-primary">{normalizeText(tripleToDelete.triple.predicate)}</span>
                  <span className="text-muted-foreground mx-2">→</span>
                  <span className="text-foreground">{normalizeText(tripleToDelete.triple.object)}</span>
                </div>
              )}
            </AlertDialogDescription>
          </AlertDialogHeader>
          <AlertDialogFooter>
            <AlertDialogCancel onClick={() => setTripleToDelete(null)}>Cancel</AlertDialogCancel>
            <AlertDialogAction
              onClick={confirmDeleteTriple}
              className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
            >
              Delete Triple
            </AlertDialogAction>
          </AlertDialogFooter>
        </AlertDialogContent>
      </AlertDialog>

      {/* Delete Entity Confirmation Dialog */}
      <AlertDialog open={showDeleteEntityDialog} onOpenChange={setShowDeleteEntityDialog}>
        <AlertDialogContent>
          <AlertDialogHeader>
            <AlertDialogTitle className="flex items-center gap-2">
              <AlertCircle className="h-5 w-5 text-destructive" />
              Delete Entity
            </AlertDialogTitle>
            <AlertDialogDescription>
              Are you sure you want to delete the entity <strong>"{entityToDelete}"</strong>?
              <div className="mt-3 p-3 bg-amber-50 dark:bg-amber-950/30 border border-amber-200 dark:border-amber-800/50 rounded-lg text-amber-800 dark:text-amber-300 text-sm">
                <strong>Warning:</strong> This will remove all triples containing this entity from the knowledge graph.
              </div>
            </AlertDialogDescription>
          </AlertDialogHeader>
          <AlertDialogFooter>
            <AlertDialogCancel onClick={() => setEntityToDelete(null)}>Cancel</AlertDialogCancel>
            <AlertDialogAction
              onClick={confirmDeleteEntity}
              className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
            >
              Delete Entity
            </AlertDialogAction>
          </AlertDialogFooter>
        </AlertDialogContent>
      </AlertDialog>
    </div>
  )
}
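The body of `exportTriplesCSV` is truncated in this diff. For reference, serializing triples to CSV can be sketched as below; the column order and RFC 4180-style quoting are assumptions, not necessarily the component's actual output:

```typescript
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Quote a field per RFC 4180: wrap in quotes, double any embedded quotes.
function csvField(value: string): string {
  return `"${value.replace(/"/g, '""')}"`;
}

// Assumed layout: a header row, then one row per triple.
function triplesToCSV(triples: Triple[]): string {
  const header = "subject,predicate,object";
  const rows = triples.map((t) =>
    [t.subject, t.predicate, t.object].map(csvField).join(",")
  );
  return [header, ...rows].join("\n");
}
```

Quoting every field unconditionally keeps the writer simple and safe for values containing commas, quotes, or newlines, at the cost of slightly larger files.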

@@ -21,10 +21,15 @@ import * as ProgressPrimitive from "@radix-ui/react-progress"

import { cn } from "@/lib/utils"

interface ProgressProps extends React.ComponentPropsWithoutRef<typeof ProgressPrimitive.Root> {
  /** Show shimmer animation overlay for visual polish */
  shimmer?: boolean
}

const Progress = React.forwardRef<
  React.ElementRef<typeof ProgressPrimitive.Root>,
  React.ComponentPropsWithoutRef<typeof ProgressPrimitive.Root>
>(({ className, value, ...props }, ref) => (
  ProgressProps
>(({ className, value, shimmer = true, ...props }, ref) => (
  <ProgressPrimitive.Root
    ref={ref}
    className={cn(

@@ -34,7 +39,10 @@ const Progress = React.forwardRef<
    {...props}
  >
    <ProgressPrimitive.Indicator
      className="h-full w-full flex-1 bg-primary transition-all"
      className={cn(
        "h-full w-full flex-1 bg-primary transition-all duration-300 ease-out",
        shimmer && (value ?? 0) > 0 && (value ?? 0) < 100 && "progress-shimmer"
      )}
      style={{ transform: `translateX(-${100 - (value || 0)}%)` }}
    />
  </ProgressPrimitive.Root>

@@ -16,13 +16,25 @@
//
import { cn } from "@/lib/utils"

interface SkeletonProps extends React.HTMLAttributes<HTMLDivElement> {
  /** Use directional shimmer instead of pulse animation */
  shimmer?: boolean
}

function Skeleton({
  className,
  shimmer = false,
  ...props
}: React.HTMLAttributes<HTMLDivElement>) {
}: SkeletonProps) {
  return (
    <div
      className={cn("animate-pulse rounded-md bg-muted", className)}
      className={cn(
        "rounded-md",
        shimmer
          ? "skeleton-shimmer"
          : "animate-pulse bg-muted",
        className
      )}
      {...props}
    />
  )

@@ -27,7 +27,7 @@ const Switch = React.forwardRef<
>(({ className, ...props }, ref) => (
  <SwitchPrimitives.Root
    className={cn(
      "peer inline-flex h-6 w-11 shrink-0 cursor-pointer items-center rounded-full border-2 border-transparent transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 focus-visible:ring-offset-background disabled:cursor-not-allowed disabled:opacity-50 data-[state=checked]:bg-primary data-[state=unchecked]:bg-input",
      "peer inline-flex h-6 w-11 shrink-0 cursor-pointer items-center rounded-full border-2 border-transparent transition-colors duration-200 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 focus-visible:ring-offset-background disabled:cursor-not-allowed disabled:opacity-50 data-[state=checked]:bg-primary data-[state=unchecked]:bg-input active:scale-95",
      className
    )}
    {...props}
@@ -35,7 +35,7 @@ const Switch = React.forwardRef<
  >
    <SwitchPrimitives.Thumb
      className={cn(
        "pointer-events-none block h-5 w-5 rounded-full bg-background shadow-lg ring-0 transition-transform data-[state=checked]:translate-x-5 data-[state=unchecked]:translate-x-0"
        "pointer-events-none block h-5 w-5 rounded-full bg-background shadow-lg ring-0 transition-all duration-200 ease-[cubic-bezier(0.34,1.56,0.64,1)] data-[state=checked]:translate-x-5 data-[state=unchecked]:translate-x-0 data-[state=checked]:shadow-primary/25"
      )}
    />
  </SwitchPrimitives.Root>

@@ -60,7 +60,7 @@ const TabsContent = React.forwardRef<
  <TabsPrimitive.Content
    ref={ref}
    className={cn(
      "mt-2 ring-offset-background focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2",
      "mt-2 ring-offset-background focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 data-[state=active]:animate-in data-[state=active]:fade-in-0 data-[state=active]:slide-in-from-bottom-1 data-[state=active]:duration-200",
      className
    )}
    {...props}

@@ -48,6 +48,8 @@ const toastVariants = cva(
        default: "border bg-background text-foreground",
        destructive:
          "destructive group border-destructive bg-destructive text-destructive-foreground",
        success:
          "success group border-primary/30 bg-primary/10 text-foreground [&>svg]:text-primary",
      },
    },
    defaultVariants: {

@@ -393,6 +393,11 @@ export function DocumentProvider({ children }: { children: React.ReactNode }) {
        requestBody.llmProvider = "ollama";
        requestBody.ollamaModel = model.model || "llama3.1:8b";
        console.log(`🦙 Using Ollama model: ${requestBody.ollamaModel}`);
      } else if (model.provider === "vllm") {
        requestBody.llmProvider = "vllm";
        requestBody.vllmModel = model.model;
        requestBody.vllmBaseUrl = model.baseURL || "http://localhost:8001/v1";
        console.log(`🚀 Using vLLM model: ${requestBody.vllmModel}`);
      } else if (model.id === "nvidia-nemotron" || model.id === "nvidia-nemotron-nano") {
        requestBody.llmProvider = "nvidia";
        requestBody.nvidiaModel = model.model; // Pass the actual model name

@@ -15,6 +15,7 @@
// limitations under the License.
//
import { Database, aql } from 'arangojs';
import { createHash } from 'crypto';

/**
 * ArangoDB service for database operations
@@ -29,6 +30,36 @@ export class ArangoDBService {

  private constructor() {}

  /**
   * Generate a deterministic _key from input string using MD5 hash
   * Uses Node.js built-in crypto module - truncated to 16 chars for compact keys
   * @param input - String to hash
   * @returns Hex-encoded hash string (16 chars, safe for ArangoDB _key)
   */
  private generateKey(input: string): string {
    return createHash('md5').update(input).digest('hex').slice(0, 16);
  }

  /**
   * Generate a deterministic _key for an entity based on its name
   * @param name - Entity name
   * @returns Deterministic _key string
   */
  private generateEntityKey(name: string): string {
    return this.generateKey(name.toLowerCase().trim());
  }

  /**
   * Generate a deterministic _key for an edge based on its endpoints and type
   * @param fromKey - Source entity _key
   * @param toKey - Target entity _key
   * @param relationType - Relationship type/predicate
   * @returns Deterministic _key string
   */
  private generateEdgeKey(fromKey: string, toKey: string, relationType: string): string {
    return this.generateKey(`${fromKey}|${relationType.toLowerCase().trim()}|${toKey}`);
  }
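The three helpers above give every entity and edge a stable, idempotent `_key`: the same normalized name or the same (from, type, to) combination always hashes to the same key, which is what lets the save calls later in this diff use `overwriteMode` to upsert instead of duplicating. The same scheme as a standalone sketch (Node's built-in `crypto` module, outside the class):

```typescript
import { createHash } from "crypto";

// Deterministic 16-char key: MD5 of the input, hex-encoded, truncated.
function generateKey(input: string): string {
  return createHash("md5").update(input).digest("hex").slice(0, 16);
}

// Entities are keyed by normalized name, so "Alice " and "alice" collide on purpose.
function generateEntityKey(name: string): string {
  return generateKey(name.toLowerCase().trim());
}

// Edges are keyed by (from, type, to), so re-importing a triple reuses the same edge.
function generateEdgeKey(fromKey: string, toKey: string, relationType: string): string {
  return generateKey(`${fromKey}|${relationType.toLowerCase().trim()}|${toKey}`);
}
```

Because the keys are deterministic, `generateEntityKey("Alice") === generateEntityKey(" alice ")`, which is what deduplicates entities extracted from different documents.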
|
||||
|
||||
/**
|
||||
* Get the singleton instance of ArangoDBService
|
||||
*/
|
||||
@ -77,9 +108,19 @@ export class ArangoDBService {
|
||||
if (!collectionNames.includes(this.collectionName)) {
|
||||
await this.db.createCollection(this.collectionName);
|
||||
await this.db.collection(this.collectionName).ensureIndex({
|
||||
type: 'persistent',
|
||||
name: 'inverted_index',
|
||||
type: 'inverted',
|
||||
fields: ['name'],
|
||||
unique: true
|
||||
analyzer: 'text_en'
|
||||
});
|
||||
await this.db.createView(`${this.collectionName}_view`, {
|
||||
type: 'search-alias',
|
||||
indexes: [
|
||||
{
|
||||
collection: this.collectionName,
|
||||
index: 'inverted_index'
|
||||
}
|
||||
]
|
||||
});
|
||||
}
|
||||
|
||||
@ -87,19 +128,25 @@ export class ArangoDBService {
|
||||
if (!collectionNames.includes(this.edgeCollectionName)) {
|
||||
await this.db.createEdgeCollection(this.edgeCollectionName);
|
||||
      await this.db.collection(this.edgeCollectionName).ensureIndex({
-       type: 'persistent',
-       fields: ['type']
+       name: 'inverted_index',
+       type: 'inverted',
+       fields: ['type'],
+       analyzer: 'text_en'
      });
+     await this.db.createView(`${this.edgeCollectionName}_view`, {
+       type: 'search-alias',
+       indexes: [
+         {
+           collection: this.edgeCollectionName,
+           index: 'inverted_index'
+         }
+       ]
+     });
    }

    // Create documents collection if it doesn't exist
    if (!collectionNames.includes(this.documentsCollectionName)) {
      await this.db.createCollection(this.documentsCollectionName);
      await this.db.collection(this.documentsCollectionName).ensureIndex({
        type: 'persistent',
        fields: ['documentName'],
        unique: true
      });
    }

    console.log('ArangoDB initialized successfully');
@@ -158,7 +205,8 @@ export class ArangoDBService {

    try {
      const collection = this.db.collection(this.collectionName);
-     return await collection.save(properties);
+     const doc = { ...properties, _key: this.generateEntityKey(properties.name) }
+     return await collection.save(doc, { overwriteMode: 'update' });
    } catch (error) {
      console.error('Error creating node in ArangoDB:', error);
      throw error;
@@ -186,12 +234,13 @@ export class ArangoDBService {
    try {
      const edgeCollection = this.db.collection(this.edgeCollectionName);
      const edgeData = {
+       _key: this.generateEdgeKey(fromKey, toKey, relationType),
        _from: `${this.collectionName}/${fromKey}`,
        _to: `${this.collectionName}/${toKey}`,
        type: relationType,
        ...properties
      };
-     return await edgeCollection.save(edgeData);
+     return await edgeCollection.save(edgeData, { overwriteMode: 'update' });
    } catch (error) {
      console.error('Error creating relationship in ArangoDB:', error);
      throw error;
@@ -200,54 +249,69 @@ export class ArangoDBService {

  /**
   * Import triples (subject, predicate, object) into the graph database
+  * Batches inserts every 1000 documents by default
   * @param triples - Array of triples to import
+  * @param batchSize - Number of documents to insert per batch (default: 1000)
   * @returns Promise resolving when import is complete
   */
- public async importTriples(triples: { subject: string; predicate: string; object: string }[]): Promise<void> {
+ public async importTriples(
+   triples: { subject: string; predicate: string; object: string }[],
+   batchSize: number = 1000
+ ): Promise<void> {
    if (!this.db) {
      throw new Error('ArangoDB connection not initialized. Call initialize() first.');
    }

+   let entityBatch: Array<{ _key: string; name: string }> = [];
+   let edgeBatch: Array<{ _key: string; _from: string; _to: string; type: string }> = [];
+
+   const importEntities = async () => {
+     if (entityBatch.length === 0) return;
+     await this.db!.collection(this.collectionName).saveAll(entityBatch, { overwriteMode: 'ignore' });
+     console.log(`[ArangoDB] Imported ${entityBatch.length} entities`);
+     entityBatch = [];
+   };
+
+   const importEdges = async () => {
+     if (edgeBatch.length === 0) return;
+     await this.db!.collection(this.edgeCollectionName).saveAll(edgeBatch, { overwriteMode: 'ignore' });
+     console.log(`[ArangoDB] Imported ${edgeBatch.length} edges`);
+     edgeBatch = [];
+   };

    try {
      // Process triples in batches to improve performance
      for (const triple of triples) {
        // Normalize triple values
        const normalizedSubject = triple.subject.trim();
        const normalizedPredicate = triple.predicate.trim();
        const normalizedObject = triple.object.trim();

        // Skip invalid triples

        if (!normalizedSubject || !normalizedPredicate || !normalizedObject) {
          console.warn('Skipping invalid triple:', triple);
          continue;
        }

-       // Upsert subject and object nodes
-       const subjectNode = await this.upsertEntity(normalizedSubject);
-       const objectNode = await this.upsertEntity(normalizedObject);
-
-       // Check if relationship already exists
-       const existingEdges = await this.executeQuery(
-         `FOR e IN ${this.edgeCollectionName}
-          FILTER e._from == @from AND e._to == @to AND e.type == @type
-          RETURN e`,
-         {
-           from: `${this.collectionName}/${subjectNode._key}`,
-           to: `${this.collectionName}/${objectNode._key}`,
-           type: normalizedPredicate
-         }
-       );
-
-       // Create relationship if it doesn't exist
-       if (existingEdges.length === 0) {
-         await this.createRelationship(
-           subjectNode._key,
-           objectNode._key,
-           normalizedPredicate
-         );
-       }
+       const subjectKey = this.generateEntityKey(normalizedSubject);
+       const objectKey = this.generateEntityKey(normalizedObject);
+       const edgeKey = this.generateEdgeKey(subjectKey, objectKey, normalizedPredicate);
+
+       entityBatch.push({ _key: subjectKey, name: normalizedSubject });
+       entityBatch.push({ _key: objectKey, name: normalizedObject });
+
+       edgeBatch.push({
+         _key: edgeKey,
+         _from: `${this.collectionName}/${subjectKey}`,
+         _to: `${this.collectionName}/${objectKey}`,
+         type: normalizedPredicate
+       });
+
+       if (entityBatch.length >= batchSize) await importEntities();
+       if (edgeBatch.length >= batchSize) await importEdges();
      }

+     // Flush remaining
+     await importEntities();
+     await importEdges();

      console.log(`Successfully imported ${triples.length} triples into ArangoDB`);
    } catch (error) {
      console.error('Error importing triples into ArangoDB:', error);
@@ -255,28 +319,6 @@ export class ArangoDBService {
    }
  }
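The batch-and-flush pattern in `importTriples` above can be reduced to a self-contained sketch. The `Triple` type and `batchTriples` helper below are illustrative names, not part of this diff: items accumulate into fixed-size batches, a full batch is flushed immediately, and the remainder is flushed at the end.

```typescript
// Illustrative sketch of the batching used by importTriples above.
// "Triple" and "batchTriples" are hypothetical names, not from the codebase.
type Triple = { subject: string; predicate: string; object: string };

function batchTriples(triples: Triple[], batchSize: number = 1000): Triple[][] {
  const batches: Triple[][] = [];
  let current: Triple[] = [];
  for (const t of triples) {
    current.push(t);
    if (current.length >= batchSize) {
      batches.push(current); // flush a full batch
      current = [];
    }
  }
  if (current.length > 0) batches.push(current); // flush the remainder
  return batches;
}
```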

- /**
-  * Helper method to upsert (create or update) an entity
-  * @param name - Entity name
-  * @returns Promise resolving to the entity
-  */
- private async upsertEntity(name: string): Promise<any> {
-   const collection = this.db!.collection(this.collectionName);
-
-   // Look for existing entity
-   const existing = await this.executeQuery(
-     `FOR e IN ${this.collectionName} FILTER e.name == @name RETURN e`,
-     { name }
-   );
-
-   if (existing.length > 0) {
-     return existing[0];
-   }
-
-   // Create new entity
-   return await collection.save({ name });
- }

  /**
   * Check if a document has already been processed and stored in ArangoDB
   * @param documentName - Name of the document to check
@@ -287,16 +329,9 @@ export class ArangoDBService {
      throw new Error('ArangoDB connection not initialized. Call initialize() first.');
    }

-   try {
-     const existing = await this.executeQuery(
-       `FOR d IN ${this.documentsCollectionName} FILTER d.documentName == @documentName RETURN d`,
-       { documentName }
-     );
-     return existing.length > 0;
-   } catch (error) {
-     console.error('Error checking if document is processed:', error);
-     return false;
-   }
+   const collection = this.db.collection(this.documentsCollectionName);
+   const key = this.generateKey(documentName.trim());
+   return await collection.documentExists(key);
  }

  /**
@@ -312,30 +347,18 @@ export class ArangoDBService {

    try {
      const collection = this.db.collection(this.documentsCollectionName);
-     await collection.save({
+     const doc = {
        _key: this.generateKey(documentName.trim()),
        documentName,
        tripleCount,
        processedAt: new Date().toISOString()
-     });
+     };
+
+     await collection.save(doc, { overwriteMode: 'replace' });
      console.log(`Marked document "${documentName}" as processed with ${tripleCount} triples`);
    } catch (error) {
-     // If error is due to unique constraint (document already exists), update it instead
-     if (error && typeof error === 'object' && 'errorNum' in error && error.errorNum === 1210) {
-       console.log(`Document "${documentName}" already exists, updating...`);
-       await this.executeQuery(
-         `FOR d IN ${this.documentsCollectionName}
-          FILTER d.documentName == @documentName
-          UPDATE d WITH { tripleCount: @tripleCount, processedAt: @processedAt } IN ${this.documentsCollectionName}`,
-         {
-           documentName,
-           tripleCount,
-           processedAt: new Date().toISOString()
-         }
-       );
-     } else {
-       console.error('Error marking document as processed:', error);
-       throw error;
-     }
+     console.error('Error marking document as processed:', error);
+     throw error;
    }
  }
@@ -363,19 +386,19 @@ export class ArangoDBService {
   * Get graph data in a format compatible with the existing application
   * @returns Promise resolving to nodes and relationships
   */
- public async getGraphData(): Promise<{
-   nodes: Array<{
-     id: string;
-     labels: string[];
-     [key: string]: any
-   }>;
-   relationships: Array<{
-     id: string;
-     source: string;
-     target: string;
-     type: string;
-     [key: string]: any
-   }>;
+ public async getGraphData(): Promise<{
+   nodes: Array<{
+     id: string;
+     labels: string[];
+     [key: string]: any
+   }>;
+   relationships: Array<{
+     id: string;
+     source: string;
+     target: string;
+     type: string;
+     [key: string]: any
+   }>;
  }> {
    if (!this.db) {
      throw new Error('ArangoDB connection not initialized. Call initialize() first.');
@@ -386,18 +409,12 @@ export class ArangoDBService {
    const entities = await this.executeQuery(
      `FOR e IN ${this.collectionName} RETURN e`
    );

    // Get all relationships (edges)
    const relationships = await this.executeQuery(
      `FOR r IN ${this.edgeCollectionName} RETURN r`
    );

    // Build id to key mapping for relationships
    const idToKey = new Map<string, string>();
    for (const entity of entities) {
      idToKey.set(entity._id, entity._key);
    }

    // Format nodes in a way compatible with the application
    const nodes = entities.map(entity => ({
      id: entity._key,
@@ -405,13 +422,12 @@ export class ArangoDBService {
      name: entity.name,
      ...entity
    }));

    // Format relationships in a way compatible with the application
    const formattedRelationships = relationships.map(rel => {
      // Extract the entity keys from _from and _to
      const source = rel._from.split('/')[1];
      const target = rel._to.split('/')[1];

      return {
        id: rel._key,
        source,
@@ -420,7 +436,7 @@ export class ArangoDBService {
        ...rel
      };
    });

    return {
      nodes,
      relationships: formattedRelationships
@@ -435,7 +451,7 @@ export class ArangoDBService {
   * Log query information and metrics
   */
  public async logQuery(
-   query: string,
+   query: string,
    queryMode: 'traditional' | 'vector-search' | 'pure-rag',
    metrics: {
      executionTimeMs: number;
@@ -453,11 +469,11 @@ export class ArangoDBService {
      // Create a queryLogs collection if it doesn't exist
      const collections = await this.db.listCollections();
      const collectionNames = collections.map(c => c.name);

      if (!collectionNames.includes('queryLogs')) {
        await this.db.createCollection('queryLogs');
      }

      // Store query log
      const queryLog = {
        query,
@@ -465,7 +481,7 @@ export class ArangoDBService {
        metrics,
        timestamp: new Date().toISOString()
      };

      await this.db.collection('queryLogs').save(queryLog);
    } catch (error) {
      console.error('Error logging query to ArangoDB:', error);
@@ -488,17 +504,17 @@ export class ArangoDBService {
      // Check if queryLogs collection exists
      const collections = await this.db.listCollections();
      const collectionNames = collections.map(c => c.name);

      if (!collectionNames.includes('queryLogs')) {
        return [];
      }

      // Get logs sorted by timestamp
      const logs = await this.executeQuery(
        `FOR l IN queryLogs SORT l.timestamp DESC LIMIT @limit RETURN l`,
        { limit }
      );

      return logs;
    } catch (error) {
      console.error('Error getting query logs from ArangoDB:', error);
@@ -507,16 +523,19 @@ export class ArangoDBService {
  }

  /**
-  * Perform graph traversal to find relevant triples using ArangoDB's native graph capabilities
+  * Perform graph traversal to find relevant triples using ArangoDB's native text search and graph capabilities
+  * Uses inverted indexes with BM25 scoring for efficient keyword matching
   * @param keywords - Array of keywords to search for
   * @param maxDepth - Maximum traversal depth (default: 2)
   * @param maxResults - Maximum number of results to return (default: 100)
+  * @param maxSeeds - Maximum number of seed nodes/edges from text search (default: 50)
   * @returns Promise resolving to array of triples with relevance scores
   */
  public async graphTraversal(
    keywords: string[],
    maxDepth: number = 2,
-   maxResults: number = 100
+   maxResults: number = 100,
+   maxSeeds: number = 50
  ): Promise<Array<{
    subject: string;
    predicate: string;
@@ -540,93 +559,89 @@ export class ArangoDBService {
      return [];
    }

    // AQL query that:
    // 1. Finds seed nodes matching keywords
    // 2. Performs graph traversal from those nodes
    // 3. Scores results based on keyword matches and depth
    const query = `
-     // Find all entities matching keywords (case-insensitive)
+     // 1. Tokenize keywords using the same analyzer as the index
+     LET keywords_merged = CONCAT_SEPARATOR(" ", @keywords)
+     LET keywords_tokens = TOKENS(keywords_merged, "text_en")
+
+     // 2. Match for entity.name
      LET seedNodes = (
-       FOR entity IN ${this.collectionName}
-         LET lowerName = LOWER(entity.name)
-         LET matches = (
-           FOR keyword IN @keywords
-             FILTER CONTAINS(lowerName, keyword)
-             RETURN 1
-         )
-         FILTER LENGTH(matches) > 0
+       FOR vertex IN ${this.collectionName}_view
+         SEARCH ANALYZER(vertex.name IN keywords_tokens, "text_en")
+         LET score = BM25(vertex)
+         SORT score DESC
+         LIMIT @maxSeeds
+         RETURN { vertex, score }
      )

+     // 3. Match for relationship.type
+     LET seedEdges = (
+       FOR edge IN ${this.edgeCollectionName}_view
+         SEARCH ANALYZER(edge.type IN keywords_tokens, "text_en")
+         LET score = BM25(edge)
+         SORT score DESC
+         LIMIT @maxSeeds
+         RETURN { edge, score }
+     )
+
+     // 4. Normalize scores
+     LET maxNodeScore = MAX(seedNodes[*].score) || 1
+     LET maxEdgeScore = MAX(seedEdges[*].score) || 1
+
+     // 5. Traverse from seedNodes up to maxDepth
+     LET traversalResults = (
+       FOR seed IN seedNodes
+         FOR v, e, p IN 1..@maxDepth ANY seed.vertex ${this.edgeCollectionName}
+           OPTIONS { uniqueVertices: 'path', bfs: true }
+
+           LET subjectEntity = DOCUMENT(e._from)
+           LET objectEntity = DOCUMENT(e._to)
+           LET depth = LENGTH(p.edges) - 1
+
+           // Depth penalty: closer to seed = higher score
+           LET depthPenalty = 1.0 / (1.0 + depth * 0.2)
+
+           // Normalize seed score and apply depth penalty
+           LET normalizedSeedScore = seed.score / maxNodeScore
+           LET confidence = normalizedSeedScore * depthPenalty
+
+           RETURN {
+             subject: subjectEntity.name,
+             predicate: e.type,
+             object: objectEntity.name,
+             confidence: confidence,
+             depth: depth,
+             _edgeId: e._id,
+             pathLength: LENGTH(p.edges)
+           }
+     )
+
+     // 6. Collect triples from seedEdges (direct hits)
+     LET edgeResults = (
+       FOR seed IN seedEdges
+         LET subjectEntity = DOCUMENT(seed.edge._from)
+         LET objectEntity = DOCUMENT(seed.edge._to)
+
+         // Direct edge matches get a boost (depth 0)
+         LET normalizedScore = seed.score / maxEdgeScore
+
          RETURN {
-           node: entity,
-           matchCount: LENGTH(matches)
+           subject: subjectEntity.name,
+           predicate: seed.edge.type,
+           object: objectEntity.name,
+           confidence: normalizedScore * 1.2, // Boost direct edge matches
+           depth: 0,
+           _edgeId: seed.edge._id,
+           pathLength: 1
          }
      )

-     // Perform graph traversal from seed nodes
-     // Multi-hop: Extract ALL edges in each path, not just the final edge
-     LET traversalResults = (
-       FOR seed IN seedNodes
-         FOR v, e, p IN 0..@maxDepth ANY seed.node._id ${this.edgeCollectionName}
-           OPTIONS {uniqueVertices: 'global', bfs: true}
-           FILTER e != null
+     // 7. Combine traversalResults and edgeResults
+     LET combinedResults = APPEND(traversalResults, edgeResults)

-           // Extract all edges from the path for multi-hop context
-           LET pathEdges = (
-             FOR edgeIdx IN 0..(LENGTH(p.edges) - 1)
-               LET pathEdge = p.edges[edgeIdx]
-               LET subjectEntity = DOCUMENT(pathEdge._from)
-               LET objectEntity = DOCUMENT(pathEdge._to)
-               LET subjectLower = LOWER(subjectEntity.name)
-               LET objectLower = LOWER(objectEntity.name)
-               LET predicateLower = LOWER(pathEdge.type)
-
-               // Calculate score for this edge
-               LET subjectMatches = (
-                 FOR kw IN @keywords
-                   FILTER CONTAINS(subjectLower, kw)
-                   LET isExact = (subjectLower == kw)
-                   RETURN isExact ? 1000 : (LENGTH(kw) * LENGTH(kw))
-               )
-               LET objectMatches = (
-                 FOR kw IN @keywords
-                   FILTER CONTAINS(objectLower, kw)
-                   LET isExact = (objectLower == kw)
-                   RETURN isExact ? 1000 : (LENGTH(kw) * LENGTH(kw))
-               )
-               LET predicateMatches = (
-                 FOR kw IN @keywords
-                   FILTER CONTAINS(predicateLower, kw)
-                   LET isExact = (predicateLower == kw)
-                   RETURN isExact ? 50 : (LENGTH(kw) * LENGTH(kw))
-               )
-
-               LET totalScore = SUM(subjectMatches) + SUM(objectMatches) + SUM(predicateMatches)
-
-               // Depth penalty (edges earlier in path get slight boost)
-               LET depthPenalty = 1.0 / (1.0 + (edgeIdx * 0.1))
-
-               LET confidence = MIN([totalScore * depthPenalty / 1000.0, 1.0])
-
-               FILTER confidence > 0
-
-               RETURN {
-                 subject: subjectEntity.name,
-                 predicate: pathEdge.type,
-                 object: objectEntity.name,
-                 confidence: confidence,
-                 depth: edgeIdx,
-                 _edgeId: pathEdge._id,
-                 pathLength: LENGTH(p.edges)
-               }
-           )
-
-           // Return all edges from this path
-           FOR pathTriple IN pathEdges
-             RETURN pathTriple
-     )

-     // Remove duplicates by edge ID and sort by confidence
+     // 8. Remove duplicates by edge ID and sort by confidence
      LET uniqueResults = (
-       FOR result IN traversalResults
+       FOR result IN combinedResults
        COLLECT edgeId = result._edgeId INTO groups
        LET best = FIRST(
          FOR g IN groups
@@ -636,8 +651,9 @@ export class ArangoDBService {
          RETURN best
      )

-     // Sort by confidence and limit results
+     // 9. Sort by confidence and limit results
      FOR result IN uniqueResults
+       FILTER result != null
        SORT result.confidence DESC, result.depth ASC
        LIMIT @maxResults
        RETURN {
@@ -655,14 +671,15 @@ export class ArangoDBService {
    const results = await this.executeQuery(query, {
      keywords: keywordConditions,
      maxDepth,
-     maxResults
+     maxResults,
+     maxSeeds
    });

-   console.log(`[ArangoDB] Multi-hop graph traversal found ${results.length} triples for keywords: ${keywords.join(', ')}`);
+   console.log(`[ArangoDB] Found ${results.length} triples for keywords: ${keywords.join(', ')}`);

    // Log top 10 results with confidence scores
    if (results.length > 0) {
-     console.log('[ArangoDB] Top 10 triples by confidence (multi-hop):');
+     console.log('[ArangoDB] Top 10 triples by confidence:');
      results.slice(0, 10).forEach((triple: any, idx: number) => {
        const pathInfo = triple.pathLength ? ` path=${triple.pathLength}` : '';
        console.log(`  ${idx + 1}. [conf=${triple.confidence?.toFixed(3)}] ${triple.subject} -> ${triple.predicate} -> ${triple.object} (depth=${triple.depth}${pathInfo})`);
@@ -705,22 +722,22 @@ export class ArangoDBService {
    try {
      // Truncate the entities collection (nodes)
      await this.db.collection(this.collectionName).truncate();

      // Truncate the relationships collection (edges)
      await this.db.collection(this.edgeCollectionName).truncate();

      // Also clear query logs if they exist
      const collections = await this.db.listCollections();
      const collectionNames = collections.map(c => c.name);

      if (collectionNames.includes('queryLogs')) {
        await this.db.collection('queryLogs').truncate();
      }

      console.log('ArangoDB database cleared successfully');
    } catch (error) {
      console.error('Error clearing ArangoDB database:', error);
      throw error;
    }
  }
}
}
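The confidence formula embedded in the AQL above can be sketched on its own: a seed's BM25 score is normalized by the best seed score, then discounted by traversal depth via the `1 / (1 + depth * 0.2)` penalty. The function name below is hypothetical, not from the diff.

```typescript
// Illustrative sketch of the traversal confidence used in the AQL query above.
// "traversalConfidence" is a hypothetical name, not part of the codebase.
function traversalConfidence(seedScore: number, maxSeedScore: number, depth: number): number {
  const depthPenalty = 1.0 / (1.0 + depth * 0.2); // closer to the seed scores higher
  return (seedScore / (maxSeedScore || 1)) * depthPenalty; // mirrors MAX(...) || 1
}
```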

@@ -32,16 +32,24 @@ import type { Triple } from '@/types/graph';
 */
export class BackendService {
  private graphDBService: GraphDBService;
- private pineconeService: QdrantService;
+ private qdrantService: QdrantService;
  private sentenceTransformerUrl: string = 'http://sentence-transformers:80';
  private modelName: string = 'all-MiniLM-L6-v2';
  private static instance: BackendService;
  private initialized: boolean = false;
- private activeGraphDbType: GraphDBType = 'arangodb';
+ private activeGraphDbType: GraphDBType | null = null; // Set at runtime, not build time
+
+ private getRuntimeGraphDbType(): GraphDBType {
+   if (this.activeGraphDbType === null) {
+     this.activeGraphDbType = (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+     console.log(`[BackendService] Initialized activeGraphDbType at runtime: ${this.activeGraphDbType}`);
+   }
+   return this.activeGraphDbType;
+ }

  private constructor() {
    this.graphDBService = GraphDBService.getInstance();
-   this.pineconeService = QdrantService.getInstance();
+   this.qdrantService = QdrantService.getInstance();

    // Use environment variables if available
    if (process.env.SENTENCE_TRANSFORMER_URL) {
@@ -64,16 +72,17 @@ export class BackendService {

  /**
   * Initialize the backend services
-  * @param graphDbType - Type of graph database to use (neo4j or arangodb)
+  * @param graphDbType - Type of graph database to use (defaults to GRAPH_DB_TYPE env var)
   */
- public async initialize(graphDbType: GraphDBType = 'arangodb'): Promise<void> {
-   this.activeGraphDbType = graphDbType;
+ public async initialize(graphDbType?: GraphDBType): Promise<void> {
+   const dbType = graphDbType || (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+   this.activeGraphDbType = dbType;

    // Initialize Graph Database
    if (!this.graphDBService.isInitialized()) {
      try {
        // Get the appropriate service based on type
-       const graphDbService = getGraphDbService(graphDbType);
+       const graphDbService = getGraphDbService(dbType);

        // Try to get settings from server settings API first
        let serverSettings: Record<string, string> = {};
@@ -88,7 +97,7 @@ export class BackendService {
        console.log('Failed to load settings from server API, falling back to environment variables:', error);
      }

-     if (graphDbType === 'neo4j') {
+     if (dbType === 'neo4j') {
        // Get Neo4j credentials from server settings first, then fallback to environment
        const uri = serverSettings.neo4j_url || process.env.NEO4J_URI;
        const username = serverSettings.neo4j_user || process.env.NEO4J_USER || process.env.NEO4J_USERNAME;
@@ -107,9 +116,9 @@ export class BackendService {
        console.log(`Using ArangoDB database: ${dbName}`);
        await this.graphDBService.initialize('arangodb', url, username, password);
      }
-     console.log(`${graphDbType} initialized successfully in backend service`);
+     console.log(`${dbType} initialized successfully in backend service`);
    } catch (error) {
-     console.error(`Failed to initialize ${graphDbType} in backend service:`, error);
+     console.error(`Failed to initialize ${dbType} in backend service:`, error);
      if (process.env.NODE_ENV === 'development') {
        console.log('Development mode: Continuing despite graph database initialization error');
      } else {
@@ -118,9 +127,9 @@ export class BackendService {
      }
    }

-   // Initialize Pinecone
-   if (!this.pineconeService.isInitialized()) {
-     await this.pineconeService.initialize();
+   // Initialize Qdrant
+   if (!this.qdrantService.isInitialized()) {
+     await this.qdrantService.initialize();
    }

    // Check if sentence-transformer service is available
@@ -151,7 +160,7 @@ export class BackendService {
   * Get the active graph database type
   */
  public getGraphDbType(): GraphDBType {
-   return this.activeGraphDbType;
+   return this.getRuntimeGraphDbType();
  }

  /**
@@ -183,7 +192,7 @@ export class BackendService {
  }

  /**
-  * Process and store triples in graph database and embeddings in Pinecone
+  * Process and store triples in graph database and embeddings in Qdrant
   */
  public async processTriples(triples: Triple[]): Promise<void> {
    // Preprocess triples: lowercase and remove duplicates
@@ -232,8 +241,8 @@ export class BackendService {
      }
    }

-   // Store embeddings and text content in Pinecone
-   await this.pineconeService.storeEmbeddings(entityEmbeddings, textContent);
+   // Store embeddings and text content in Qdrant
+   await this.qdrantService.storeEmbeddings(entityEmbeddings, textContent);

    console.log(`Backend processing complete: ${uniqueTriples.length} triples and ${entityList.length} entities stored using ${this.activeGraphDbType}`);
  }
@@ -253,7 +262,7 @@ export class BackendService {
    const filteredKeywords = keywords.filter(kw => !this.isStopWord(kw));

    // If using ArangoDB, use its native graph traversal capabilities
-   if (this.activeGraphDbType === 'arangodb') {
+   if (this.getRuntimeGraphDbType() === 'arangodb') {
      console.log(`Using ArangoDB native graph traversal for keywords: ${filteredKeywords.join(', ')}`);

      try {
@@ -392,8 +401,8 @@ export class BackendService {
    // Generate embedding for query
    const queryEmbedding = (await this.generateEmbeddings([queryText]))[0];

-   // Find nearest neighbors using Pinecone
-   const seedNodes = await this.pineconeService.findSimilarEntities(queryEmbedding, kNeighbors);
+   // Find nearest neighbors using Qdrant
+   const seedNodes = await this.qdrantService.findSimilarEntities(queryEmbedding, kNeighbors);
    console.log(`Found ${seedNodes.length} seed nodes for query: "${queryText}"`);

    // Get graph data from graph database
@@ -649,7 +658,7 @@ Answer:`;
    const embeddings = await this.generateEmbeddings(documents);

    // Store in Qdrant document-embeddings collection
-   await this.pineconeService.storeDocumentChunks(documents, embeddings, metadata);
+   await this.qdrantService.storeDocumentChunks(documents, embeddings, metadata);

    console.log(`✅ Stored ${documents.length} document chunks in document-embeddings collection`);
  }
@@ -22,18 +22,17 @@
/**
 * Initialize default database settings if not already set
- * Called before syncing with server to ensure defaults are available
+ * NOTE: Don't set graph_db_type here - let server's GRAPH_DB_TYPE env var control it
 */
export function initializeDefaultSettings() {
  if (typeof window === 'undefined') {
    return; // Only run on client side
  }

- // Set default graph DB type to ArangoDB if not set
- if (!localStorage.getItem('graph_db_type')) {
-   localStorage.setItem('graph_db_type', 'arangodb');
- }
-
- // Set default ArangoDB settings if not set
+ // Don't set graph_db_type default - let it be controlled by server's GRAPH_DB_TYPE env var
+ // The server will use its environment variable if no client setting is provided
+
+ // Set default connection settings only (not the database type selection)
  if (!localStorage.getItem('arango_url')) {
    localStorage.setItem('arango_url', 'http://localhost:8529');
  }
@@ -41,6 +40,11 @@ export function initializeDefaultSettings() {
  if (!localStorage.getItem('arango_db')) {
    localStorage.setItem('arango_db', 'txt2kg');
  }
+
+ // Set default Neo4j settings
+ if (!localStorage.getItem('neo4j_url')) {
+   localStorage.setItem('neo4j_url', 'bolt://localhost:7687');
+ }
}

/**
@@ -124,21 +128,6 @@ export async function syncSettingsWithServer() {
    settings.NVIDIA_API_KEY = nvidiaApiKey;
  }

- // Pinecone settings
- const pineconeApiKey = localStorage.getItem('pinecone_api_key');
- if (pineconeApiKey) {
-   settings.pinecone_api_key = pineconeApiKey;
- }
-
- const pineconeEnvironment = localStorage.getItem('pinecone_environment');
- if (pineconeEnvironment) {
-   settings.pinecone_environment = pineconeEnvironment;
- }
-
- const pineconeIndex = localStorage.getItem('pinecone_index');
- if (pineconeIndex) {
-   settings.pinecone_index = pineconeIndex;
- }

  // Skip the API call if there are no settings to sync
  if (Object.keys(settings).length === 0) {
@@ -26,7 +26,7 @@ export type GraphDBType = 'neo4j' | 'arangodb';
export class GraphDBService {
  private neo4jService: Neo4jService;
  private arangoDBService: ArangoDBService;
- private activeDBType: GraphDBType = 'arangodb'; // Default to ArangoDB
+ private activeDBType: GraphDBType | null = null; // Set at runtime, not build time
  private static instance: GraphDBService;

  private constructor() {
@@ -34,6 +34,17 @@ export class GraphDBService {
    this.arangoDBService = ArangoDBService.getInstance();
  }

+ /**
+  * Get the active DB type, reading from env at runtime if not set
+  */
+ private getActiveDBType(): GraphDBType {
+   if (this.activeDBType === null) {
+     this.activeDBType = (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+     console.log(`[GraphDBService] Initialized activeDBType at runtime: ${this.activeDBType}`);
+   }
+   return this.activeDBType;
+ }
+
  /**
   * Get the singleton instance of GraphDBService
   */
@@ -46,24 +57,25 @@ export class GraphDBService {

  /**
   * Initialize the graph database with the specified type
-  * @param dbType - Type of graph database to use
+  * @param dbType - Type of graph database to use (defaults to GRAPH_DB_TYPE env var)
   * @param uri - Connection URL
   * @param username - Database username
   * @param password - Database password
   */
- public async initialize(dbType: GraphDBType = 'arangodb', uri?: string, username?: string, password?: string): Promise<void> {
-   this.activeDBType = dbType;
+ public async initialize(dbType?: GraphDBType, uri?: string, username?: string, password?: string): Promise<void> {
+   const graphDbType = dbType || (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+   this.activeDBType = graphDbType;

    try {
-     if (dbType === 'neo4j') {
+     if (graphDbType === 'neo4j') {
        this.neo4jService.initialize(uri, username, password);
        console.log('Neo4j initialized successfully');
-     } else if (dbType === 'arangodb') {
+     } else if (graphDbType === 'arangodb') {
        await this.arangoDBService.initialize(uri, undefined, username, password);
        console.log('ArangoDB initialized successfully');
      }
    } catch (error) {
-     console.error(`Failed to initialize ${dbType}:`, error);
+     console.error(`Failed to initialize ${graphDbType}:`, error);
      throw error;
    }
  }
@@ -79,14 +91,14 @@ export class GraphDBService {
   * Get the active graph database type
   */
  public getDBType(): GraphDBType {
-   return this.activeDBType;
+   return this.getActiveDBType();
  }

  /**
   * Check if the active database is initialized
   */
  public isInitialized(): boolean {
-   if (this.activeDBType === 'neo4j') {
+   if (this.getActiveDBType() === 'neo4j') {
      return this.neo4jService.isInitialized();
    } else {
      return this.arangoDBService.isInitialized();
@@ -97,7 +109,7 @@ export class GraphDBService {
   * Import triples into the active graph database
   */
  public async importTriples(triples: { subject: string; predicate: string; object: string }[]): Promise<void> {
-   if (this.activeDBType === 'neo4j') {
+   if (this.getActiveDBType() === 'neo4j') {
      await this.neo4jService.importTriples(triples);
    } else {
      await this.arangoDBService.importTriples(triples);
@@ -121,7 +133,7 @@ export class GraphDBService {
    [key: string]: any
  }>;
  }> {
-   if (this.activeDBType === 'neo4j') {
+   if (this.getActiveDBType() === 'neo4j') {
      return await this.neo4jService.getGraphData();
    } else {
      return await this.arangoDBService.getGraphData();
@@ -142,7 +154,7 @@ export class GraphDBService {
    resultCount: number;
  }
  ): Promise<void> {
-   if (this.activeDBType === 'neo4j') {
+   if (this.getActiveDBType() === 'neo4j') {
      await this.neo4jService.logQuery(query, queryMode, metrics);
    } else {
      await this.arangoDBService.logQuery(query, queryMode, metrics);
@@ -153,7 +165,7 @@ export class GraphDBService {
   * Get query logs from the active graph database
   */
  public async getQueryLogs(limit: number = 100): Promise<any[]> {
-   if (this.activeDBType === 'neo4j') {
+   if (this.getActiveDBType() === 'neo4j') {
      return await this.neo4jService.getQueryLogs(limit);
    } else {
      return await this.arangoDBService.getQueryLogs(limit);
@@ -164,7 +176,7 @@ export class GraphDBService {
   * Close the connection to the active graph database
   */
  public async close(): Promise<void> {
-   if (this.activeDBType === 'neo4j') {
+   if (this.getActiveDBType() === 'neo4j') {
|
||||
this.neo4jService.close();
|
||||
} else {
|
||||
this.arangoDBService.close();
|
||||
@ -175,7 +187,7 @@ export class GraphDBService {
|
||||
* Get info about the active graph database driver
|
||||
*/
|
||||
public getDriverInfo(): Record<string, any> {
|
||||
if (this.activeDBType === 'neo4j') {
|
||||
if (this.getActiveDBType() === 'neo4j') {
|
||||
return this.neo4jService.getDriverInfo();
|
||||
} else {
|
||||
return this.arangoDBService.getDriverInfo();
|
||||
@ -197,7 +209,7 @@ export class GraphDBService {
|
||||
confidence: number;
|
||||
depth?: number;
|
||||
}>> {
|
||||
if (this.activeDBType === 'arangodb') {
|
||||
if (this.getActiveDBType() === 'arangodb') {
|
||||
return await this.arangoDBService.graphTraversal(keywords, maxDepth, maxResults);
|
||||
} else {
|
||||
// Neo4j doesn't have this method yet, return empty array
|
||||
@ -210,7 +222,7 @@ export class GraphDBService {
|
||||
* Clear all data from the active graph database
|
||||
*/
|
||||
public async clearDatabase(): Promise<void> {
|
||||
if (this.activeDBType === 'neo4j') {
|
||||
if (this.getActiveDBType() === 'neo4j') {
|
||||
// TODO: Implement Neo4j clear database functionality
|
||||
throw new Error('Clear database functionality not implemented for Neo4j');
|
||||
} else {
|
||||
|
||||
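The change above repeatedly applies one resolution rule: an explicit `dbType` argument wins, then the `GRAPH_DB_TYPE` environment variable, then `'arangodb'`. As a standalone sketch (illustrative Python, not part of the commit):

```python
import os

def resolve_graph_db_type(db_type=None):
    """Fallback chain: explicit argument, then GRAPH_DB_TYPE env var, then 'arangodb'."""
    return db_type or os.environ.get("GRAPH_DB_TYPE") or "arangodb"
```

Reading the environment at call time rather than import time is the same "runtime, not build time" concern the commit's comments mention.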
@@ -18,20 +18,34 @@ import { GraphDBService, GraphDBType } from './graph-db-service';
import { Neo4jService } from './neo4j';
import { ArangoDBService } from './arangodb';

+/**
+ * Get the default graph database type from environment or fallback to arangodb
+ * Note: This is called at runtime, not build time, so process.env should be available
+ */
+function getDefaultGraphDbType(): GraphDBType {
+  const envType = process.env.GRAPH_DB_TYPE;
+  console.log(`[graph-db-util] getDefaultGraphDbType: env=${envType}`);
+  return (envType as GraphDBType) || 'arangodb';
+}
+
/**
 * Get the appropriate graph database service based on the graph database type.
 * This is useful for API routes that need direct access to a specific graph database.
 *
- * @param graphDbType - The type of graph database to use
+ * @param graphDbType - The type of graph database to use (defaults to GRAPH_DB_TYPE env var)
 */
-export function getGraphDbService(graphDbType: GraphDBType = 'arangodb') {
-  if (graphDbType === 'neo4j') {
+export function getGraphDbService(graphDbType?: GraphDBType) {
+  const dbType = graphDbType || getDefaultGraphDbType();
+
+  if (dbType === 'neo4j') {
    return Neo4jService.getInstance();
-  } else if (graphDbType === 'arangodb') {
+  } else if (dbType === 'arangodb') {
    return ArangoDBService.getInstance();
  } else {
-    // Default to ArangoDB
-    return ArangoDBService.getInstance();
+    // Default based on environment
+    return getDefaultGraphDbType() === 'neo4j'
+      ? Neo4jService.getInstance()
+      : ArangoDBService.getInstance();
  }
}

@@ -39,12 +53,13 @@ export function getGraphDbService(graphDbType: GraphDBType = 'arangodb') {
 * Initialize the graph database directly (not using GraphDBService).
 * This is useful for API routes that need direct access to a specific graph database.
 *
- * @param graphDbType - The type of graph database to use
+ * @param graphDbType - The type of graph database to use (defaults to GRAPH_DB_TYPE env var)
 */
-export async function initializeGraphDb(graphDbType: GraphDBType = 'arangodb'): Promise<void> {
-  const service = getGraphDbService(graphDbType);
+export async function initializeGraphDb(graphDbType?: GraphDBType): Promise<void> {
+  const dbType = graphDbType || getDefaultGraphDbType();
+  const service = getGraphDbService(dbType);

-  if (graphDbType === 'neo4j') {
+  if (dbType === 'neo4j') {
    // Get Neo4j credentials from environment
    const uri = process.env.NEO4J_URI;
    const username = process.env.NEO4J_USER || process.env.NEO4J_USERNAME;
@@ -54,7 +69,7 @@ export async function initializeGraphDb(graphDbType: GraphDBType = 'arangodb'):
    if (service instanceof Neo4jService) {
      service.initialize(uri, username, password);
    }
-  } else if (graphDbType === 'arangodb') {
+  } else if (dbType === 'arangodb') {
    // Get ArangoDB credentials from environment
    const url = process.env.ARANGODB_URL;
    const dbName = process.env.ARANGODB_DB;

@@ -1,19 +1,3 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
/**
 * Pinecone service for vector embeddings
 * Uses direct API calls for Pinecone local server

@@ -16,7 +16,6 @@
//
/**
 * Qdrant service for vector embeddings
- * Drop-in replacement for PineconeService
 */
import { Document } from "@langchain/core/documents";
import { randomUUID } from "crypto";
@@ -477,7 +476,7 @@ export class QdrantService {
  }

  try {
    // Qdrant doesn't have a direct "get all" like Pinecone
-    // Use scroll API to get points
+    // We'll use scroll API to get points
    const response = await this.makeRequest(`/collections/${this.collectionName}/points/scroll`, 'POST', {
      limit: limit,

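Because Qdrant has no single "get all" call, the service pages through the collection with the scroll API. A minimal sketch of that loop, with a hypothetical `post` callable standing in for the service's request helper:

```python
def scroll_all_points(post, collection, batch_size=100):
    """Page through a Qdrant collection: follow next_page_offset until it is None."""
    points, offset = [], None
    while True:
        body = {"limit": batch_size, "with_payload": True}
        if offset is not None:
            body["offset"] = offset
        # Qdrant's scroll response: {"result": {"points": [...], "next_page_offset": ...}}
        result = post(f"/collections/{collection}/points/scroll", body)["result"]
        points.extend(result["points"])
        offset = result.get("next_page_offset")
        if offset is None:
            return points
```

The endpoint path matches the one in the diff; the response shape is the documented scroll format, assumed here rather than taken from the commit.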
@@ -28,7 +28,7 @@ import type { Triple } from '@/types/graph';
 */
export class RemoteBackendService {
  private graphDBService: GraphDBService;
-  private pineconeService: QdrantService;
+  private qdrantService: QdrantService;
  private embeddingsService: EmbeddingsService;
  private textProcessor: TextProcessor;
  private initialized: boolean = false;
@@ -36,7 +36,7 @@ export class RemoteBackendService {

  private constructor() {
    this.graphDBService = GraphDBService.getInstance();
-    this.pineconeService = QdrantService.getInstance();
+    this.qdrantService = QdrantService.getInstance();
    this.embeddingsService = EmbeddingsService.getInstance();
    this.textProcessor = TextProcessor.getInstance();
  }
@@ -60,18 +60,19 @@ export class RemoteBackendService {

  /**
   * Initialize the remote backend with all required services
-   * @param graphDbType - Type of graph database to use
+   * @param graphDbType - Type of graph database to use (defaults to GRAPH_DB_TYPE env var)
   */
-  public async initialize(graphDbType: GraphDBType = 'arangodb'): Promise<void> {
-    console.log('Initializing remote backend...');
+  public async initialize(graphDbType?: GraphDBType): Promise<void> {
+    const dbType = graphDbType || (process.env.GRAPH_DB_TYPE as GraphDBType) || 'arangodb';
+    console.log(`Initializing remote backend with ${dbType}...`);

    // Initialize Graph Database
-    await this.graphDBService.initialize(graphDbType);
-    console.log(`${graphDbType} service initialized`);
+    await this.graphDBService.initialize(dbType);
+    console.log(`${dbType} service initialized`);

-    // Initialize Pinecone
-    await this.pineconeService.initialize();
-    console.log('Pinecone service initialized');
+    // Initialize Qdrant
+    await this.qdrantService.initialize();
+    console.log('Qdrant service initialized');

    // Initialize Embeddings service
    await this.embeddingsService.initialize();
@@ -179,9 +180,9 @@ export class RemoteBackendService {
      entityMetadata.set(entity, entityData);
    }

-    // Store embeddings and metadata in Pinecone
-    await this.pineconeService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
-    console.log('Stored embeddings with metadata in Pinecone');
+    // Store embeddings and metadata in Qdrant
+    await this.qdrantService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
+    console.log('Stored embeddings with metadata in Qdrant');

    console.log('Backend created successfully from text');
  }
@@ -224,9 +225,9 @@ export class RemoteBackendService {
      });
    }

-    // Store embeddings and metadata in Pinecone
-    await this.pineconeService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
-    console.log('Stored embeddings with metadata in Pinecone');
+    // Store embeddings and metadata in Qdrant
+    await this.qdrantService.storeEmbeddingsWithMetadata(entityEmbeddings, textContent, entityMetadata);
+    console.log('Stored embeddings with metadata in Qdrant');

    console.log('Backend created successfully from triples');
  }
@@ -287,8 +288,8 @@ export class RemoteBackendService {
    // Step 1: Generate embedding for query
    const queryEmbedding = (await this.embeddingsService.encode([query]))[0];

-    // Step 2: Find nearest neighbors using Pinecone
-    const seedNodes = await this.pineconeService.findSimilarEntities(queryEmbedding, kNeighbors);
+    // Step 2: Find nearest neighbors using Qdrant
+    const seedNodes = await this.qdrantService.findSimilarEntities(queryEmbedding, kNeighbors);
    console.log(`Found ${seedNodes.length} seed nodes using KNN`);

    // Step 3: Retrieve graph data from graph database
@@ -552,9 +553,9 @@ export class RemoteBackendService {
    // Step 1: Generate embedding for query
    const queryEmbedding = (await this.embeddingsService.encode([query]))[0];

-    // Step 2: Find nearest neighbors using Pinecone with metadata
+    // Step 2: Find nearest neighbors using Qdrant with metadata
    const { entities: seedNodes, metadata: seedMetadata } =
-      await this.pineconeService.findSimilarEntitiesWithMetadata(queryEmbedding, kNeighbors);
+      await this.qdrantService.findSimilarEntitiesWithMetadata(queryEmbedding, kNeighbors);
    console.log(`Found ${seedNodes.length} seed nodes using KNN with metadata`);

    // Step 3: Retrieve graph data from graph database

@@ -376,7 +376,7 @@ ${formatInstructions}`;
        }
      ],
      temperature: 0.1,
-      max_tokens: 8192,
+      max_tokens: 4096, // Reduced to leave room for input tokens in context
      top_p: 0.95
    })
  });

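The `max_tokens` reduction trades completion length for prompt headroom, since prompt and completion tokens share one context window. The arithmetic, sketched with an assumed window size (illustrative only):

```python
def completion_budget(context_window, prompt_tokens, requested_max_tokens):
    """Largest completion that still fits: window minus prompt, capped at the request."""
    return max(0, min(requested_max_tokens, context_window - prompt_tokens))

# With an assumed 8192-token window, a 5000-token prompt leaves only 3192 tokens
# of headroom, so requesting 8192 completion tokens would overflow the window.
```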
@@ -3,13 +3,10 @@
  "version": "0.1.0",
  "private": true,
  "scripts": {
-    "predev": "npm run setup-pinecone",
    "dev": "next dev",
-    "prebuild": "npm run setup-pinecone",
    "build": "next build",
    "start": "next start",
-    "lint": "next lint",
-    "setup-pinecone": "node ../scripts/setup-pinecone.js"
+    "lint": "next lint"
  },
  "dependencies": {
    "3d-force-graph": "^1.77.0",

@@ -162,6 +162,26 @@
  @apply w-5 h-5 rounded-md bg-nvidia-green/15 flex items-center justify-center transition-transform duration-200;
}

+/* Tab content wrapper for max-width */
+.nvidia-build-tab-content {
+  @apply w-full max-w-7xl mx-auto;
+}
+
+/* Responsive tab layout */
+@media (max-width: 768px) {
+  .nvidia-build-tabs {
+    @apply flex-col w-full p-1.5 gap-1;
+  }
+
+  .nvidia-build-tab {
+    @apply w-full justify-start px-4 py-2.5;
+  }
+
+  .nvidia-build-tab-icon {
+    @apply w-5 h-5;
+  }
+}
+
/* Dark Mode Optimizations */
@media (prefers-color-scheme: dark) {
  .nvidia-build-card {

@@ -90,92 +90,57 @@ def parse_args():

    return parser.parse_args()

-def load_triples_from_arangodb(arango_url, arango_db, arango_user, arango_password):
-    """
-    Load triples from ArangoDB for use with the TXT2KG dataset
-
-    Args:
-        arango_url: ArangoDB connection URL
-        arango_db: ArangoDB database name
-        arango_user: ArangoDB username
-        arango_password: ArangoDB password
-
-    Returns:
-        Array of triples in the format expected by create_remote_backend_from_triplets
+def load_triples_from_arangodb(arango_url: str, arango_db: str, arango_user: str, arango_password: str) -> list[str]:
+    """
+    Load triples from ArangoDB for use with the TXT2KG dataset
+
+    Args:
+        arango_url: ArangoDB connection URL
+        arango_db: ArangoDB database name
+        arango_user: ArangoDB username
+        arango_password: ArangoDB password
+
+    Returns:
+        List of triples in the format "subject predicate object"
    """
    try:
        # Connect to ArangoDB
        client = ArangoClient(hosts=arango_url)

        # Get database (no auth in our docker setup)
        if arango_user and arango_password:
            db = client.db(arango_db, username=arango_user, password=arango_password)
        else:
            db = client.db(arango_db)

-        # Query to get all triples from ArangoDB as structured objects
-        # Handle case sensitivity and trim whitespace
+        # Query to get all triples from ArangoDB
+        # Handle case sensitivity, trim whitespace, and deduplication
        aql_query = """
        FOR e IN relationships
-            LET subject = TRIM(DOCUMENT(e._from).name)
-            LET object = TRIM(DOCUMENT(e._to).name)
-            LET predicate = TRIM(e.type)
-            FILTER subject != "" AND predicate != "" AND object != ""
-            RETURN {
-                subject: subject,
-                predicate: predicate,
-                object: object
-            }
+            LET subject = TRIM(DOCUMENT(e._from).name)
+            LET object = TRIM(DOCUMENT(e._to).name)
+            LET predicate = TRIM(e.type)
+            FILTER subject != "" AND predicate != "" AND object != ""
+            COLLECT s = subject, p = predicate, o = object
+            RETURN CONCAT_SEPARATOR(" ", s, p, o)
        """

-        # Execute the query
-        cursor = db.aql.execute(aql_query)
-        triple_dicts = list(cursor)
-
-        # Format triples as strings in the format expected by PyTorch Geometric
-        # The expected format is a list of strings in the form "subject predicate object"
-        triples = format_triples_for_pytorch_geometric(triple_dicts)
+        # Execute the query with streaming for large datasets
+        cursor = db.aql.execute(aql_query, stream=True, batch_size=1000)
+        triples = list(cursor)

        print(f"Loaded {len(triples)} triples from ArangoDB")
+        # Print sample triples for debugging
+        if len(triples) > 0:
+            print("Sample triples:")
+            for i in range(min(3, len(triples))):
+                print(f"  {triples[i]}")

        return triples
    except Exception as error:
        print(f"Error loading triples from ArangoDB: {error}")
        raise error

-def format_triples_for_pytorch_geometric(triple_dicts):
-    """
-    Format triples from ArangoDB into the format expected by PyTorch Geometric
-
-    Args:
-        triple_dicts: List of dictionaries with subject, predicate, object keys
-
-    Returns:
-        List of strings in the format "subject predicate object"
-    """
-    triples = []
-    # Create a set to avoid duplicates
-    unique_triples = set()
-
-    for triple_dict in triple_dicts:
-        # Skip any triple with empty values
-        if not triple_dict['subject'] or not triple_dict['predicate'] or not triple_dict['object']:
-            continue
-
-        # Create a space-separated string in the format that preprocess_triplet expects
-        triple_str = f"{triple_dict['subject']} {triple_dict['predicate']} {triple_dict['object']}"
-
-        # Only add if not already in the set
-        if triple_str not in unique_triples:
-            unique_triples.add(triple_str)
-            triples.append(triple_str)
-
-    return triples
-
def get_data(args):
    # need a JSON dict of Questions and answers, see below for how its used
@@ -190,48 +155,6 @@ def get_data(args):

    return json_obj, text_contexts

-def validate_triple_format(triples):
-    """
-    Validate and fix triple format if needed to ensure compatibility with preprocess_triplet
-
-    Args:
-        triples: List of triples to validate
-
-    Returns:
-        Fixed list of triples in the format expected by preprocess_triplet
-    """
-    validated_triples = []
-
-    print(f"Validating {len(triples)} triples...")
-    for i, triple in enumerate(triples):
-        # If triple is already a proper string with subject, predicate, object
-        if isinstance(triple, str):
-            parts = triple.split()
-            # Ensure there are at least 3 parts (subject, predicate, object)
-            if len(parts) >= 3:
-                # For strings with more than 3 parts, use first as subject, second as predicate,
-                # and join the rest as object
-                subject = parts[0]
-                predicate = parts[1]
-                obj = ' '.join(parts[2:])
-                validated_triple = f"{subject} {predicate} {obj}"
-                validated_triples.append(validated_triple)
-            else:
-                print(f"Warning: Triple at index {i} has fewer than 3 parts: {triple}")
-        # If triple is a dictionary with subject, predicate, object keys
-        elif isinstance(triple, dict) and 'subject' in triple and 'predicate' in triple and 'object' in triple:
-            validated_triple = f"{triple['subject']} {triple['predicate']} {triple['object']}"
-            validated_triples.append(validated_triple)
-        # If triple is a tuple or list of length 3
-        elif (isinstance(triple, tuple) or isinstance(triple, list)) and len(triple) == 3:
-            validated_triple = f"{triple[0]} {triple[1]} {triple[2]}"
-            validated_triples.append(validated_triple)
-        else:
-            print(f"Warning: Skipping triple at index {i} with invalid format: {triple}")
-
-    print(f"Validation complete. {len(validated_triples)} valid triples out of {len(triples)}")
-    return validated_triples
-
def make_dataset(args):
    """Modified make_dataset function that can use ArangoDB as a data source"""
    # Create output directory if it doesn't exist
@@ -257,13 +180,11 @@ def make_dataset(args):
        # Load triples from ArangoDB instead of generating with TXT2KG
        print("Loading triples from ArangoDB...")
        triples = load_triples_from_arangodb(
-            args.arango_url,
-            args.arango_db,
-            args.arango_user,
+            args.arango_url,
+            args.arango_db,
+            args.arango_user,
            args.arango_password
        )
-        # Validate and fix triples format if needed
-        triples = validate_triple_format(triples)
        # Save triples for future use
        torch.save(triples, triples_path)
    else:

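The new AQL query pushes trimming, empty-field filtering, and deduplication into the database (`FILTER`, `COLLECT`, `CONCAT_SEPARATOR`), replacing the removed Python helpers. The equivalent client-side logic, shown for reference:

```python
def dedupe_triples(rows):
    """Trim parts, drop triples with empty fields, dedupe, emit 'subject predicate object'."""
    seen, out = set(), []
    for subject, predicate, obj in rows:
        subject, predicate, obj = subject.strip(), predicate.strip(), obj.strip()
        if not (subject and predicate and obj):
            continue  # FILTER subject != "" AND predicate != "" AND object != ""
        triple = f"{subject} {predicate} {obj}"  # CONCAT_SEPARATOR(" ", s, p, o)
        if triple not in seen:  # COLLECT s, p, o
            seen.add(triple)
            out.append(triple)
    return out
```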
@@ -1,19 +1,3 @@
-//
-// SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-// SPDX-License-Identifier: Apache-2.0
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
/**
 * Simplified Pinecone setup script for Docker environments
 */

@@ -20,7 +20,8 @@

# Parse command line arguments
DEV_FRONTEND=false
-USE_COMPLETE=false
+USE_VLLM=false
+USE_VECTOR_SEARCH=false

while [[ $# -gt 0 ]]; do
  case $1 in
@@ -28,8 +29,12 @@ while [[ $# -gt 0 ]]; do
      DEV_FRONTEND=true
      shift
      ;;
-    --complete)
-      USE_COMPLETE=true
+    --vllm)
+      USE_VLLM=true
      shift
      ;;
+    --vector-search)
+      USE_VECTOR_SEARCH=true
+      shift
+      ;;
    --help|-h)
@@ -37,14 +42,17 @@ while [[ $# -gt 0 ]]; do
      echo ""
      echo "Options:"
      echo "  --dev-frontend    Run frontend in development mode (without Docker)"
-      echo "  --complete        Use complete stack (vLLM, Pinecone, Sentence Transformers)"
+      echo "  --vllm            Use Neo4j + vLLM (GPU-accelerated, for DGX Spark/GB300)"
+      echo "  --vector-search   Enable vector search services (Qdrant + Sentence Transformers)"
      echo "  --help, -h        Show this help message"
      echo ""
-      echo "Default: Starts minimal stack with Ollama, ArangoDB, and Next.js frontend"
+      echo "Default: Starts ArangoDB + Ollama"
      echo ""
      echo "Examples:"
-      echo "  ./start.sh             # Start minimal demo (recommended)"
-      echo "  ./start.sh --complete  # Start with all optional services"
+      echo "  ./start.sh                          # Default: ArangoDB + Ollama"
+      echo "  ./start.sh --vllm                   # Use Neo4j + vLLM (GPU)"
+      echo "  ./start.sh --vector-search          # Add Qdrant + Sentence Transformers"
+      echo "  ./start.sh --vllm --vector-search   # vLLM + vector search"
      exit 0
      ;;
    *)
@@ -120,21 +128,32 @@ if ! docker info &> /dev/null; then
fi
echo "✓ Docker permissions OK"

-# Build the docker-compose command
-if [ "$USE_COMPLETE" = true ]; then
-  CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.complete.yml"
-  echo "Using complete stack (Ollama, vLLM, Pinecone, Sentence Transformers)..."
+# Select compose file and build command
+COMPOSE_DIR="$(pwd)/deploy/compose"
+PROFILES=""
+
+if [ "$USE_VLLM" = true ]; then
+  COMPOSE_FILE="$COMPOSE_DIR/docker-compose.vllm.yml"
+  echo "Using Neo4j + vLLM (GPU-accelerated)..."
+  echo "  ⚡ Optimized for DGX Spark/GB300 with unified memory support"
else
-  CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.yml"
-  echo "Using minimal configuration (Ollama + ArangoDB only)..."
+  COMPOSE_FILE="$COMPOSE_DIR/docker-compose.yml"
+  echo "Using ArangoDB + Ollama configuration..."
fi

+CMD="$DOCKER_COMPOSE_CMD -f $COMPOSE_FILE"
+
+if [ "$USE_VECTOR_SEARCH" = true ]; then
+  PROFILES="--profile vector-search"
+  echo "Enabling vector search (Qdrant + Sentence Transformers)..."
+fi
+
# Execute the command
echo ""
echo "Starting services..."
-echo "Running: $CMD up -d"
+echo "Running: $CMD $PROFILES up -d"
cd $(dirname "$0")
-eval "$CMD up -d"
+eval "$CMD $PROFILES up -d"

echo ""
echo "=========================================="
@@ -143,28 +162,44 @@ echo "=========================================="
echo ""
echo "Core Services:"
echo "  • Web UI:        http://localhost:3001"
-echo "  • ArangoDB:      http://localhost:8529"
-echo "  • Ollama API:    http://localhost:11434"
+if [ "$USE_VLLM" = true ]; then
+  echo "  • Neo4j Browser: http://localhost:7474"
+  echo "  • vLLM API:      http://localhost:8001 (GPU-accelerated)"
+else
+  echo "  • ArangoDB:      http://localhost:8529"
+  echo "  • Ollama API:    http://localhost:11434"
+fi
echo ""

-if [ "$USE_COMPLETE" = true ]; then
-  echo "Additional Services (Complete Stack):"
-  echo "  • Local Pinecone:        http://localhost:5081"
+if [ "$USE_VECTOR_SEARCH" = true ]; then
+  echo "Vector Search Services:"
+  echo "  • Qdrant:                http://localhost:6333"
  echo "  • Sentence Transformers: http://localhost:8000"
-  echo "  • vLLM API:              http://localhost:8001"
  echo ""
fi

echo "Next steps:"
-echo "  1. Pull an Ollama model (if not already done):"
-echo "     docker exec ollama-compose ollama pull llama3.1:8b"
-echo ""
-echo "  2. Open http://localhost:3001 in your browser"
+if [ "$USE_VLLM" = true ]; then
+  echo "  1. Wait for vLLM to load the model (check logs with: docker logs vllm-service -f)"
+  echo "     Note: First startup may take several minutes to download the model"
+  echo ""
+  echo "  2. Open http://localhost:3001 in your browser"
+else
+  echo "  1. Pull an Ollama model (if not already done):"
+  echo "     docker exec ollama-compose ollama pull llama3.1:8b"
+  echo ""
+  echo "  2. Open http://localhost:3001 in your browser"
+fi
echo "  3. Upload documents and start building your knowledge graph!"
echo ""
echo "Other options:"
echo "  • Stop services:            ./stop.sh"
echo "  • Run frontend in dev mode: ./start.sh --dev-frontend"
-echo "  • Use complete stack:       ./start.sh --complete"
+if [ "$USE_VLLM" = true ]; then
+  echo "  • Use Ollama:             ./start.sh (without --vllm)"
+else
+  echo "  • Use vLLM (GPU):         ./start.sh --vllm"
+fi
+echo "  • Add vector search:        ./start.sh --vector-search"
echo "  • View logs:                docker compose logs -f"
echo ""
echo ""

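start.sh now derives a single compose file from `--vllm` and appends an optional `--profile vector-search`. The same selection logic, sketched in Python for clarity (file and profile names taken from the script; the assembled command is illustrative):

```python
def compose_up_command(use_vllm=False, use_vector_search=False):
    """Mirror start.sh: pick the compose file, then add the optional profile."""
    compose_file = "docker-compose.vllm.yml" if use_vllm else "docker-compose.yml"
    parts = ["docker", "compose", "-f", f"deploy/compose/{compose_file}"]
    if use_vector_search:
        parts += ["--profile", "vector-search"]
    return parts + ["up", "-d"]
```

Keeping the profile separate from the compose-file choice is what lets `--vllm` and `--vector-search` combine freely.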
@@ -18,27 +18,40 @@

# Stop script for txt2kg project

+# Check which Docker Compose version is available
+DOCKER_COMPOSE_CMD=""
+if docker compose version &> /dev/null; then
+  DOCKER_COMPOSE_CMD="docker compose"
+elif command -v docker-compose &> /dev/null; then
+  DOCKER_COMPOSE_CMD="docker-compose"
+else
+  echo "Error: Neither 'docker compose' nor 'docker-compose' is available"
+  exit 1
+fi
+
# Parse command line arguments
-USE_COMPLETE=false
+USE_VLLM=false
+USE_VECTOR_SEARCH=false

while [[ $# -gt 0 ]]; do
  case $1 in
-    --complete)
-      USE_COMPLETE=true
+    --vllm)
+      USE_VLLM=true
      shift
      ;;
+    --vector-search)
+      USE_VECTOR_SEARCH=true
+      shift
+      ;;
    --help|-h)
      echo "Usage: ./stop.sh [OPTIONS]"
      echo ""
      echo "Options:"
-      echo "  --complete       Stop complete stack (vLLM, Pinecone, Sentence Transformers)"
+      echo "  --vllm           Stop vLLM stack (use if you started with --vllm)"
+      echo "  --vector-search  Include vector search services"
      echo "  --help, -h       Show this help message"
      echo ""
      echo "Default: Stops minimal stack with Ollama, ArangoDB, and Next.js frontend"
-      echo ""
-      echo "Examples:"
-      echo "  ./stop.sh             # Stop minimal demo"
-      echo "  ./stop.sh --complete  # Stop complete stack"
+      echo "Note: Use the same flags you used with ./start.sh"
      exit 0
      ;;
    *)
@@ -49,52 +62,26 @@ while [[ $# -gt 0 ]]; do
  esac
done

-# Check which Docker Compose version is available
-DOCKER_COMPOSE_CMD=""
-if docker compose version &> /dev/null; then
-  DOCKER_COMPOSE_CMD="docker compose"
-elif command -v docker-compose &> /dev/null; then
-  DOCKER_COMPOSE_CMD="docker-compose"
+# Select compose file
+COMPOSE_DIR="$(pwd)/deploy/compose"
+PROFILES=""
+
+if [ "$USE_VLLM" = true ]; then
+  COMPOSE_FILE="$COMPOSE_DIR/docker-compose.vllm.yml"
else
-  echo "Error: Neither 'docker compose' nor 'docker-compose' is available"
-  echo "Please install Docker Compose: https://docs.docker.com/compose/install/"
-  exit 1
+  COMPOSE_FILE="$COMPOSE_DIR/docker-compose.yml"
fi

-# Check Docker daemon permissions
-if ! docker info &> /dev/null; then
-  echo ""
-  echo "=========================================="
-  echo "ERROR: Docker Permission Denied"
-  echo "=========================================="
-  echo ""
-  echo "You don't have permission to connect to the Docker daemon."
-  echo ""
-  echo "To fix this, add your user to the docker group:"
-  echo "  sudo usermod -aG docker \$USER"
-  echo "  newgrp docker"
-  echo ""
-  exit 1
+CMD="$DOCKER_COMPOSE_CMD -f $COMPOSE_FILE"
+
+if [ "$USE_VECTOR_SEARCH" = true ]; then
+  PROFILES="--profile vector-search"
fi

-# Build the docker-compose command
-if [ "$USE_COMPLETE" = true ]; then
-  CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.complete.yml"
-  echo "Stopping complete stack..."
-else
-  CMD="$DOCKER_COMPOSE_CMD -f $(pwd)/deploy/compose/docker-compose.yml"
-  echo "Stopping minimal configuration..."
-fi
-
# Execute the command
-echo "Running: $CMD down"
+echo "Stopping txt2kg services..."
cd $(dirname "$0")
-eval "$CMD down"
+eval "$CMD $PROFILES down"

echo ""
echo "=========================================="
echo "txt2kg has been stopped"
echo "=========================================="
echo ""
echo "All services stopped."
echo "To start again, run: ./start.sh"
echo ""

@@ -68,7 +68,8 @@ The following models are supported with vLLM on Spark. All listed models are ava
| **Phi-4-multimodal-instruct** | NVFP4 | ✅ | `nvidia/Phi-4-multimodal-instruct-FP4` |
| **Phi-4-reasoning-plus** | FP8 | ✅ | `nvidia/Phi-4-reasoning-plus-FP8` |
| **Phi-4-reasoning-plus** | NVFP4 | ✅ | `nvidia/Phi-4-reasoning-plus-FP4` |
+| **Nemotron3-Nano** | BF16 | ✅ | `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16` |
+| **Nemotron3-Nano** | FP8 | ✅ | `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8` |

> [!NOTE]
> The Phi-4-multimodal-instruct models require `--trust-remote-code` when launching vLLM.
@@ -118,6 +119,12 @@ export LATEST_VLLM_VERSION=<latest_container_version>
docker pull nvcr.io/nvidia/vllm:${LATEST_VLLM_VERSION}
```

+For Nemotron3-Nano model support, use release version 25.12.post1-py3:
+
+```bash
+docker pull nvcr.io/nvidia/vllm:25.12.post1-py3
+```
+
## Step 3. Test vLLM in container

Launch the container and start vLLM server with a test model to verify basic functionality.
