- Add queryMode field to QueryLogSummary interface
- Update getQueryLogs to group by both query AND queryMode
- Use composite key (query|||queryMode) for proper separation
- Enable separate tracking of Pure RAG vs Graph Search queries
Previously, queries with the same text but different modes were
merged together, causing metrics to only show one aggregate value.
Now each mode's performance is tracked independently.
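
A minimal sketch of the composite-key grouping described above. Only `query`, `queryMode`, and the `|||` separator come from this change; the `QueryLogEntry` shape, the `executionTimeMs` field, and the `summarizeQueryLogs` name are hypothetical stand-ins for whatever `getQueryLogs` actually uses.

```typescript
// Hypothetical log entry shape; only `query` and `queryMode` are named in the commit.
interface QueryLogEntry {
  query: string;
  queryMode: string;
  executionTimeMs: number; // assumed metric field
}

// Simplified summary; the real QueryLogSummary interface may carry more fields.
interface QueryLogSummary {
  query: string;
  queryMode: string;
  count: number;
  avgExecutionTimeMs: number;
}

const KEY_SEPARATOR = "|||";

function summarizeQueryLogs(entries: QueryLogEntry[]): QueryLogSummary[] {
  const groups = new Map<
    string,
    { query: string; queryMode: string; count: number; totalMs: number }
  >();

  for (const entry of entries) {
    // Same text with a different mode yields a different key, so the rows stay separate.
    const key = `${entry.query}${KEY_SEPARATOR}${entry.queryMode}`;
    const group =
      groups.get(key) ??
      { query: entry.query, queryMode: entry.queryMode, count: 0, totalMs: 0 };
    group.count += 1;
    group.totalMs += entry.executionTimeMs;
    groups.set(key, group);
  }

  return Array.from(groups.values()).map((g) => ({
    query: g.query,
    queryMode: g.queryMode,
    count: g.count,
    avgExecutionTimeMs: g.totalMs / g.count,
  }));
}
```

Keeping `query` and `queryMode` on the grouped value avoids splitting the key back apart, which would break if a query string itself contained `|||`.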
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add triple structure logging in API route for debugging
- Update graph-db-service imports for multi-hop fields
- Improve embeddings generator UI responsiveness
- Enable data pipeline verification for depth/pathLength fields
These changes help diagnose issues with multi-hop data flow
and ensure proper propagation of metadata through the stack.
- Update default topK from 20 to 40 in query parameters
- Support richer context with multi-hop graph traversal
- Maintain flexible topK range (1-50) via slider
- Respect user selection of query mode (don't force mode changes)
Higher topK allows more edges from multi-hop paths to be included
in the context provided to the LLM for answer generation.
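
As a rough illustration of the default and slider range above, the value could be resolved as follows. The constant and function names are hypothetical; only the default of 40 and the 1-50 range come from this change.

```typescript
const DEFAULT_TOP_K = 40; // new default from this change
const MIN_TOP_K = 1;      // slider lower bound
const MAX_TOP_K = 50;     // slider upper bound

// Hypothetical helper: fall back to the default when no usable value is given,
// otherwise round and clamp into the slider range.
function resolveTopK(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_TOP_K;
  }
  return Math.min(MAX_TOP_K, Math.max(MIN_TOP_K, Math.round(requested)));
}
```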
- Display hop number (0, 1, 2...) with network icon for each triple
- Show multi-hop path badge for paths with length > 1
- Add "Multi-hop enabled" badge in Retrieved Knowledge header
- Implement collapsible thinking steps with proper chevron rotation
- Parse <think> tags from NVIDIA reasoning content
- Reduce console logging to a sample of the data instead of the full dataset
- Show path length with amber lightning icon
This provides visual feedback about multi-hop reasoning paths
and makes the LLM's chain-of-thought process transparent.
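
One way the `<think>` parsing could work is sketched below. The `ParsedAnswer` shape and `parseThinkTags` name are assumptions for illustration, not the component's actual code.

```typescript
// Hypothetical result shape: reasoning steps for the collapsible section,
// plus the answer text with the tags removed.
interface ParsedAnswer {
  thinking: string[];
  answer: string;
}

function parseThinkTags(raw: string): ParsedAnswer {
  const thinking: string[] = [];
  // Non-greedy match so multiple <think> blocks are captured separately.
  const answer = raw
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      const step = inner.trim();
      if (step.length > 0) {
        thinking.push(step);
      }
      return "";
    })
    .trim();
  return { thinking, answer };
}
```

The UI can then render `thinking` inside the collapsible section and `answer` as the visible response.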
- Extract ALL edges from graph traversal paths, not just endpoints
- Add depth field (edge position in path: 0, 1, 2...)
- Add pathLength field (total edges in path)
- Use numeric index iteration for AQL compatibility
- Apply depth penalty to edge scoring (earlier edges weighted higher)
- Enable visualization of knowledge chains in graph queries
- Increase topK default to 40 for richer multi-hop context
This allows Traditional Graph to show how information is connected
across multiple hops in the knowledge graph, similar to GraphRAG.
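
The sketch below shows how every edge of a path might be tagged with `depth` and `pathLength` and then penalized by depth. The actual traversal runs in AQL; the `Edge` shape, the `DEPTH_PENALTY` value, and the scoring formula here are assumptions, since the change does not state the exact penalty.

```typescript
// Hypothetical triple-style edge.
interface Edge {
  subject: string;
  predicate: string;
  object: string;
}

interface ScoredEdge extends Edge {
  depth: number;      // edge position in the path: 0, 1, 2...
  pathLength: number; // total edges in the path
  score: number;
}

// Assumed penalty strength; the real value and formula are not given in the commit.
const DEPTH_PENALTY = 0.1;

function flattenPath(edges: Edge[], baseScore: number): ScoredEdge[] {
  // Emit every edge in the path, not just the endpoints.
  return edges.map((edge, depth) => ({
    ...edge,
    depth,
    pathLength: edges.length,
    // Earlier edges keep more of the base score than later ones.
    score: baseScore / (1 + DEPTH_PENALTY * depth),
  }));
}
```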
- Integrate NVIDIA API as alternative to Ollama for graph queries
- Implement thinking tokens API with /think system message
- Add min_thinking_tokens (1024) and max_thinking_tokens (2048)
- Format reasoning_content with <think> tags for UI parsing
- Support dynamic model/provider selection per query
- Maintain Ollama fallback for backward compatibility
This enables Traditional Graph to use NVIDIA's reasoning models
(e.g., nvidia-nemotron-nano-9b-v2) with visible chain-of-thought.
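
A hedged sketch of the request and response handling described above. The `/think` system message, the two token limits, and the `reasoning_content` field come from this change; placing the limits at the top level of the request body, and the helper names, are assumptions that should be checked against the NVIDIA API documentation.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Assumed request body layout; verify field placement against the provider docs.
interface ThinkingRequest {
  model: string;
  messages: ChatMessage[];
  min_thinking_tokens: number;
  max_thinking_tokens: number;
}

function buildThinkingRequest(model: string, question: string): ThinkingRequest {
  return {
    model,
    messages: [
      { role: "system", content: "/think" }, // enables reasoning output
      { role: "user", content: question },
    ],
    min_thinking_tokens: 1024,
    max_thinking_tokens: 2048,
  };
}

// Hypothetical response message shape.
interface ReasoningMessage {
  content?: string | null;
  reasoning_content?: string | null;
}

// Wrap reasoning in <think> tags so the UI can split it from the answer.
function formatWithThinkTags(message: ReasoningMessage): string {
  const reasoning = (message.reasoning_content ?? "").trim();
  const content = (message.content ?? "").trim();
  if (reasoning.length === 0) {
    return content;
  }
  return `<think>${reasoning}</think>\n\n${content}`;
}
```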
- Implement multi-stage Dockerfile (deps → builder → runner)
- Add BuildKit cache mounts for pnpm store and Next.js build cache
- Enable Next.js standalone output for smaller production images
- Create non-root user (nextjs:nodejs) with proper permissions
- Enhance .dockerignore to exclude more build artifacts
- Reduce build time from 225+ seconds to ~35 seconds
- Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG
- Add collapsible reasoning section for <think> tags in answers
- Add markdown rendering support (bold/italic) in answers
- Fix Pure RAG to properly display answers using llmAnswer state
- Hide empty results message for Pure RAG mode
- Update metrics sidebar to show query times by mode instead of overall average
- Add queryTimesByMode field to metrics API and frontend interfaces
- Disable GraphRAG button with "COMING SOON" badge (requires GNN model)
- Fix Qdrant vector store document mapping with contentPayloadKey
- Update console logs to reflect Qdrant instead of Pinecone
- Add @qdrant/js-client-rest dependency to package.json
- Migrate from Pinecone to Qdrant vector database for native ARM64 support
- Add Qdrant service with automatic collection initialization in docker-compose
- Implement QdrantService with UUID-based point IDs to meet Qdrant requirements
- Update all API routes and frontend components to use Qdrant
- Enhance Storage Connections UI with detailed stats (vectors, status, dimensions)
- Add icons and tooltips to Vector DB section matching Graph DB UX
- Implement parallel chunk processing with configurable concurrency
- Add direct NVIDIA API integration bypassing LangChain for better control
- Optimize for DGX Spark unified memory with batch processing
- Use concurrency of 4 for Ollama, 2 for other providers
- Add proper error handling and user stop capability
- Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Improve prompt engineering for triple extraction
- Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5
- Adjust temperature to 0.6 for better response quality
- Increase timeout to 120s for larger model
- Add top_p, frequency_penalty, and presence_penalty parameters
- Remove deprecated response_format configuration
- Update default NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Update model display name and description
- Replace deprecated 70B model with newer 49B Super model
- Replace button text with icons for compact display
- Add tooltips to Refresh, Disconnect, and Clear buttons
- Improve button spacing and alignment
- Import LogOut icon for disconnect action
- Switch traditional graph search to use LLM-enhanced endpoint
- Display LLM-generated answer prominently above triples
- Add llmAnswer state to store and display generated answers
- Update results section to show 'Supporting Triples' when answer exists
- Pass selected LLM model and provider to API
- Improve debug logging for query modes and results
- Integrate LLMSelectorCompact into RAG query component
- Make query mode cards more compact to accommodate LLM selector
- Update styling for better space utilization
- Add LLM selection section with descriptive label
- Create LLMSelectorCompact component for model selection
- Support Ollama and NVIDIA models
- Load available models from localStorage
- Persist selected model and dispatch selection events
- Compact design suitable for inline placement
- Update metrics endpoint to use getGraphDbService utility
- Support both ArangoDB and Neo4j database types
- Initialize graph database based on selected type
- Retrieve graph stats from the active database
- Add queryWithLLM method to BackendService
- Retrieve top K triples from the graph and use an LLM to generate answers
- Support configurable LLM model and provider selection
- Use research-backed prompt structure for KG-enhanced RAG
- Include fallback handling for LLM errors
- Create new /api/graph-query-llm endpoint for graph search + LLM generation
- Retrieve triples using graph search and generate answers using an LLM
- Support both traditional and vector-based graph search
- Make traditional graph search comparable to RAG for benchmarking
- Add optional Pinecone and sentence-transformers services for vector search
- Configure NVIDIA GPU support with proper environment variables
- Add new environment variables for embeddings and Pinecone
- Add docker compose profiles to optionally enable vector-search
- Improve CUDA configuration for Ollama service
- Add pinecone-net network for service communication
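
For the parallel chunk processing items above (configurable concurrency, 4 for Ollama and 2 for other providers), a bounded worker pool is one plausible shape. This is a sketch under that assumption; `concurrencyFor` and `mapWithConcurrency` are hypothetical names, and the `"ollama"` provider string is assumed.

```typescript
// Concurrency limits from the commit: 4 for Ollama, 2 for other providers.
function concurrencyFor(provider: string): number {
  return provider === "ollama" ? 4 : 2;
}

// Run `worker` over all items with at most `limit` calls in flight,
// preserving input order in the results.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;

  async function runWorker(): Promise<void> {
    // Each runner pulls the next unclaimed index until none remain.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => runWorker()));
  return results;
}
```

A call such as `mapWithConcurrency(chunks, concurrencyFor(provider), extractTriples)` would then process chunks in parallel, where `extractTriples` stands in for the per-chunk extraction call. A user-initiated stop could be layered on by checking a flag inside the `while` loop.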