- Update default topK from 20 to 40 in query parameters
- Support richer context with multi-hop graph traversal
- Maintain flexible topK range (1-50) via slider
- Respect user selection of query mode (don't force mode changes)
Higher topK allows more edges from multi-hop paths to be included
in the context provided to the LLM for answer generation.
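
A minimal sketch of how the slider value could be normalized before it reaches the query. The bounds (1-50) and the default (40) come from this commit; the constant and function names are hypothetical and not taken from the codebase.

```typescript
// Bounds and default as described in the commit message; names are illustrative.
const TOP_K_MIN = 1;
const TOP_K_MAX = 50;
const TOP_K_DEFAULT = 40;

// Normalize a raw slider or query-string value into a valid topK.
function resolveTopK(raw: unknown): number {
  const parsed = typeof raw === "string" ? Number.parseInt(raw, 10) : raw;
  if (typeof parsed !== "number" || !Number.isFinite(parsed)) {
    return TOP_K_DEFAULT; // missing or malformed input falls back to the default
  }
  // Clamp into the slider range and drop any fractional part.
  return Math.min(TOP_K_MAX, Math.max(TOP_K_MIN, Math.round(parsed)));
}
```

Clamping on the server side as well as in the slider keeps a hand-edited request from asking for more edges than the UI allows.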
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Display hop number (0, 1, 2...) with network icon for each triple
- Show multi-hop path badge for paths with length > 1
- Add "Multi-hop enabled" badge in Retrieved Knowledge header
- Implement collapsible thinking steps with proper chevron rotation
- Parse <think> tags from NVIDIA reasoning content
- Reduce console logging (log a sample only, not the full dataset)
- Show path length with amber lightning icon
This provides visual feedback about multi-hop reasoning paths
and makes the LLM's chain-of-thought process transparent.
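
The `<think>` parsing mentioned above could look roughly like the sketch below. It assumes the model output is a single string with zero or more `<think>...</think>` blocks; the `ParsedAnswer` type and `splitThinking` name are hypothetical.

```typescript
interface ParsedAnswer {
  reasoning: string | null; // text found inside <think>...</think>, if any
  answer: string;           // everything outside the tags
}

// Separate chain-of-thought from the final answer so the UI can collapse it.
function splitThinking(raw: string): ParsedAnswer {
  const pattern = /<think>([\s\S]*?)<\/think>/g;
  const steps: string[] = [];
  // Collect each non-empty think block and strip it from the visible answer.
  const answer = raw.replace(pattern, (_match: string, inner: string) => {
    const trimmed = inner.trim();
    if (trimmed.length > 0) steps.push(trimmed);
    return "";
  });
  return {
    reasoning: steps.length > 0 ? steps.join("\n\n") : null,
    answer: answer.trim(),
  };
}
```

An unclosed `<think>` tag is left in the answer untouched by this sketch; a streaming UI would likely need extra handling for that case.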
- Extract ALL edges from graph traversal paths, not just endpoints
- Add depth field (edge position in path: 0, 1, 2...)
- Add pathLength field (total edges in path)
- Use numeric index iteration for AQL compatibility
- Apply depth penalty to edge scoring (earlier edges weighted higher)
- Enable visualization of knowledge chains in graph queries
- Increase topK default to 40 for richer multi-hop context
This allows Traditional Graph to show how information is connected
across multiple hops in the knowledge graph, similar to GraphRAG.
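
A sketch of the edge extraction and depth penalty described above, under stated assumptions: the `GraphPath` shape, the multiplicative penalty, and the `DEPTH_PENALTY` value of 0.8 are all illustrative. The commit only says that every edge is kept, that `depth` and `pathLength` are recorded, and that earlier edges are weighted higher.

```typescript
interface Edge { subject: string; predicate: string; object: string; }
interface GraphPath { edges: Edge[]; score: number; } // hypothetical traversal result
interface ScoredTriple extends Edge {
  depth: number;      // position of the edge in its path: 0, 1, 2...
  pathLength: number; // total number of edges in the path
  score: number;
}

const DEPTH_PENALTY = 0.8; // assumed factor; the real value is not in the commit

// Flatten every edge of every path, penalize deeper edges, keep the topK best.
function flattenPaths(paths: GraphPath[], topK: number): ScoredTriple[] {
  const triples: ScoredTriple[] = [];
  for (const path of paths) {
    const pathLength = path.edges.length;
    path.edges.forEach((edge, depth) => {
      const score = path.score * Math.pow(DEPTH_PENALTY, depth);
      triples.push({ ...edge, depth, pathLength, score });
    });
  }
  return triples.sort((a, b) => b.score - a.score).slice(0, topK);
}
```

With this scheme the UI can show a hop number per triple and a multi-hop badge whenever `pathLength > 1`.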
- Integrate NVIDIA API as alternative to Ollama for graph queries
- Implement thinking tokens API with /think system message
- Add min_thinking_tokens (1024) and max_thinking_tokens (2048)
- Format reasoning_content with <think> tags for UI parsing
- Support dynamic model/provider selection per query
- Maintain Ollama fallback for backward compatibility
This enables Traditional Graph to use NVIDIA's reasoning models
(e.g., nvidia-nemotron-nano-9b-v2) with visible chain-of-thought.
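
A sketch of how the request and response handling might fit together. Only the `/think` system message, the `min_thinking_tokens` / `max_thinking_tokens` values, and the `reasoning_content` field name come from this commit; where those fields sit in the real request body, and the helper names below, are assumptions.

```typescript
interface ChatMessage { role: "system" | "user" | "assistant"; content: string; }

// Build a chat request body that asks the model to emit its reasoning.
// Assumption: the thinking-token limits are top-level fields of the body.
function buildThinkingRequest(model: string, prompt: string) {
  const messages: ChatMessage[] = [
    { role: "system", content: "/think" }, // enables reasoning output
    { role: "user", content: prompt },
  ];
  return {
    model,
    messages,
    min_thinking_tokens: 1024,
    max_thinking_tokens: 2048,
  };
}

interface NvidiaMessage { content?: string | null; reasoning_content?: string | null; }

// Wrap reasoning_content in <think> tags so the UI can parse and collapse it.
function formatWithThinkTags(message: NvidiaMessage): string {
  const reasoning = message.reasoning_content?.trim() ?? "";
  const content = message.content?.trim() ?? "";
  return reasoning ? `<think>${reasoning}</think>\n\n${content}` : content;
}
```

Responses without `reasoning_content` (for example from the Ollama fallback) pass through unchanged, so the same UI parser works for both providers.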
- Implement multi-stage Dockerfile (deps → builder → runner)
- Add BuildKit cache mounts for pnpm store and Next.js build cache
- Enable Next.js standalone output for smaller production images
- Create non-root user (nextjs:nodejs) with proper permissions
- Enhance .dockerignore to exclude more build artifacts
- Reduce build time from 225+ seconds to ~35 seconds
- Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG
- Add collapsible reasoning section for <think> tags in answers
- Add markdown rendering support (bold/italic) in answers
- Fix Pure RAG to properly display answers using llmAnswer state
- Hide empty results message for Pure RAG mode
- Update metrics sidebar to show query times by mode instead of overall average
- Add queryTimesByMode field to metrics API and frontend interfaces
- Disable GraphRAG button with "COMING SOON" badge (requires GNN model)
- Fix Qdrant vector store document mapping with contentPayloadKey
- Update console logs to reflect Qdrant instead of Pinecone
- Add @qdrant/js-client-rest dependency to package.json
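
One way the `queryTimesByMode` field could be computed, as a sketch. The commit does not say how times are aggregated, so the per-mode average below is an assumption, and the mode identifiers and `QueryLog` shape are hypothetical.

```typescript
type QueryMode = "pure-rag" | "traditional-graph" | "graphrag"; // illustrative identifiers
interface QueryLog { mode: QueryMode; durationMs: number; }

// Average query duration per mode; modes with no queries are simply absent.
function queryTimesByMode(logs: QueryLog[]): Partial<Record<QueryMode, number>> {
  const totals = new Map<QueryMode, { sum: number; count: number }>();
  for (const { mode, durationMs } of logs) {
    const entry = totals.get(mode) ?? { sum: 0, count: 0 };
    entry.sum += durationMs;
    entry.count += 1;
    totals.set(mode, entry);
  }
  const averages: Partial<Record<QueryMode, number>> = {};
  totals.forEach(({ sum, count }, mode) => {
    averages[mode] = sum / count;
  });
  return averages;
}
```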
- Migrate from Pinecone to Qdrant vector database for native ARM64 support
- Add Qdrant service with automatic collection initialization in docker-compose
- Implement QdrantService with UUID-based point IDs to meet Qdrant requirements
- Update all API routes and frontend components to use Qdrant
- Enhance Storage Connections UI with detailed stats (vectors, status, dimensions)
- Add icons and tooltips to Vector DB section matching Graph DB UX
- Implement parallel chunk processing with configurable concurrency
- Add direct NVIDIA API integration bypassing LangChain for better control
- Optimize for DGX Spark unified memory with batch processing
- Use concurrency of 4 for Ollama, 2 for other providers
- Add proper error handling and user stop capability
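
A sketch of bounded-concurrency chunk processing with a user stop check. The concurrency values (4 for Ollama, 2 otherwise) come from the commit; the function names, the `shouldStop` callback, and the worker signature are hypothetical.

```typescript
// Concurrency per provider, as stated in the commit message.
function concurrencyFor(provider: string): number {
  return provider === "ollama" ? 4 : 2;
}

// Run `worker` over all chunks with at most `concurrency` calls in flight.
// Chunks skipped because of a stop request stay `undefined` in the result.
async function processChunks<T, R>(
  chunks: T[],
  worker: (chunk: T, index: number) => Promise<R>,
  concurrency: number,
  shouldStop: () => boolean = () => false,
): Promise<(R | undefined)[]> {
  const results: (R | undefined)[] = new Array(chunks.length).fill(undefined);
  let next = 0; // index of the next unclaimed chunk, shared by all lanes

  async function runLane(): Promise<void> {
    while (next < chunks.length && !shouldStop()) {
      const index = next++;
      results[index] = await worker(chunks[index] as T, index);
    }
  }

  const laneCount = Math.max(1, Math.min(concurrency, chunks.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

In this sketch a rejected worker rejects the whole call; a real pipeline might instead catch per-chunk errors and record them so that one failed chunk does not discard the rest.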
- Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Improve prompt engineering for triple extraction
- Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5
- Adjust temperature to 0.6 for better response quality
- Increase timeout to 120s for larger model
- Add top_p, frequency_penalty, and presence_penalty parameters
- Remove deprecated response_format configuration
- Update default NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Update model display name and description
- Replace deprecated 70B model with newer 49B Super model
- Replace button text with icons for compact display
- Add tooltips to Refresh, Disconnect, and Clear buttons
- Improve button spacing and alignment
- Import LogOut icon for disconnect action
- Switch traditional graph search to use LLM-enhanced endpoint
- Display LLM-generated answer prominently above triples
- Add llmAnswer state to store and display generated answers
- Update results section to show 'Supporting Triples' when answer exists
- Pass selected LLM model and provider to API
- Improve debug logging for query modes and results
- Integrate LLMSelectorCompact into RAG query component
- Make query mode cards more compact to accommodate LLM selector
- Update styling for better space utilization
- Add LLM selection section with descriptive label
- Create LLMSelectorCompact component for model selection
- Support Ollama and NVIDIA models
- Load available models from localStorage
- Persist selected model and dispatch selection events
- Compact design suitable for inline placement
- Update metrics endpoint to use getGraphDbService utility
- Support both ArangoDB and Neo4j database types
- Initialize graph database based on selected type
- Retrieve graph stats from the active database
- Add queryWithLLM method to BackendService
- Retrieve top K triples from the graph and use the LLM to generate answers
- Support configurable LLM model and provider selection
- Use research-backed prompt structure for KG-enhanced RAG
- Include fallback handling for LLM errors
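
A rough sketch of the retrieve-then-generate flow with a fallback. The prompt wording is a placeholder and not the research-backed structure the commit refers to; the `Generate` callback stands in for the actual Ollama or NVIDIA call, and all names are hypothetical.

```typescript
interface Triple { subject: string; predicate: string; object: string; }
type Generate = (prompt: string) => Promise<string>; // stand-in for the LLM call

// Placeholder prompt: numbered facts followed by the question.
function buildPrompt(question: string, triples: Triple[]): string {
  const facts = triples
    .map((t, i) => `${i + 1}. ${t.subject} ${t.predicate} ${t.object}`)
    .join("\n");
  return `Answer the question using only the facts below.\n\nFacts:\n${facts}\n\nQuestion: ${question}`;
}

// Keep the top K triples, ask the LLM, and fall back to the raw facts on error.
async function answerFromTriples(
  question: string,
  triples: Triple[],
  generate: Generate,
  topK: number,
): Promise<{ answer: string; triples: Triple[]; usedFallback: boolean }> {
  const selected = triples.slice(0, topK);
  try {
    const answer = await generate(buildPrompt(question, selected));
    return { answer, triples: selected, usedFallback: false };
  } catch {
    // Assumed fallback behavior: return the triples without a generated answer.
    const answer = `LLM unavailable; showing ${selected.length} retrieved triples.`;
    return { answer, triples: selected, usedFallback: true };
  }
}
```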
- Create new /api/graph-query-llm endpoint for graph search + LLM generation
- Retrieve triples using graph search and generate answers using the LLM
- Support both traditional and vector-based graph search
- Make traditional graph search comparable to RAG for benchmarking
- Add optional Pinecone and sentence-transformers services for vector search
- Configure NVIDIA GPU support with proper environment variables
- Add new environment variables for embeddings and Pinecone
- Add docker compose profiles to optionally enable vector-search
- Improve CUDA configuration for Ollama service
- Add pinecone-net network for service communication