Commit Graph

202 Commits

Author SHA1 Message Date
Santosh Bhavani
56d15db148 Rename Traditional Graph to Graph Search and display times in seconds
- Rename "Traditional Graph" to "Graph Search" throughout UI
  - Performance Metrics card label
  - Answer section badge
  - Console logging
- Display query times in seconds instead of milliseconds
  - Pure RAG: 11.09s (was 11090.00ms)
  - Graph Search: 11.09s (was 11090.00ms)
  - GraphRAG: 11.09s (was 11090.00ms)

The new name "Graph Search" better describes the functionality,
and seconds provide a more intuitive basis for performance comparison.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 14:10:56 -07:00
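The millisecond-to-second display change above can be sketched as follows (a minimal illustration; `formatSeconds` is a hypothetical helper name, not necessarily the identifier used in the codebase):

```typescript
// Format a duration recorded in milliseconds as seconds with two decimals,
// e.g. 11090.00ms -> "11.09s", matching the values in the commit message.
function formatSeconds(ms: number): string {
  return `${(ms / 1000).toFixed(2)}s`;
}
```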
Santosh Bhavani
3c39506b06 Fix query mode grouping in performance metrics
- Add queryMode field to QueryLogSummary interface
- Update getQueryLogs to group by both query AND queryMode
- Use composite key (query|||queryMode) for proper separation
- Enables separate tracking of Pure RAG vs Graph Search queries

Previously, queries with the same text but different modes were
merged together, causing metrics to only show one aggregate value.
Now each mode's performance is tracked independently.

2025-10-25 14:10:37 -07:00
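The composite-key grouping described above can be sketched roughly as below (the interface and function names are illustrative assumptions, not the repository's actual identifiers):

```typescript
interface QueryLog {
  query: string;
  queryMode: string; // e.g. Pure RAG vs. Graph Search
  durationMs: number;
}

// Group logs by query text AND query mode, so identical questions asked in
// different modes are tracked as separate metric series.
function groupByQueryAndMode(logs: QueryLog[]): Map<string, QueryLog[]> {
  const groups = new Map<string, QueryLog[]>();
  for (const log of logs) {
    // "|||" is a separator unlikely to appear in either field.
    const key = `${log.query}|||${log.queryMode}`;
    const bucket = groups.get(key);
    if (bucket) bucket.push(log);
    else groups.set(key, [log]);
  }
  return groups;
}
```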
Santosh Bhavani
325895ffba Add debug logging and minor improvements
- Add triple structure logging in API route for debugging
- Update graph-db-service imports for multi-hop fields
- Improve embeddings generator UI responsiveness
- Enable data pipeline verification for depth/pathLength fields

These changes help diagnose issues with multi-hop data flow
and ensure proper propagation of metadata through the stack.

2025-10-25 13:49:14 -07:00
Santosh Bhavani
cbe92b50e7 Increase topK default to 40 for multi-hop context
- Update default topK from 20 to 40 in query parameters
- Support richer context with multi-hop graph traversal
- Maintain flexible topK range (1-50) via slider
- Respect user selection of query mode (don't force mode changes)

Higher topK allows more edges from multi-hop paths to be included
in the context provided to the LLM for answer generation.

2025-10-25 13:49:07 -07:00
Santosh Bhavani
3975e92579 Add multi-hop indicators and collapsible thinking steps UI
- Display hop number (0, 1, 2...) with network icon for each triple
- Show multi-hop path badge for paths with length > 1
- Add "Multi-hop enabled" badge in Retrieved Knowledge header
- Implement collapsible thinking steps with proper chevron rotation
- Parse <think> tags from NVIDIA reasoning content
- Reduce console logging (sample only, not full dataset)
- Show path length with amber lightning icon

This provides visual feedback about multi-hop reasoning paths
and makes the LLM's chain-of-thought process transparent.

2025-10-25 13:49:00 -07:00
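The `<think>` tag parsing mentioned above can be sketched as a small splitter (function name and return shape are assumptions for illustration):

```typescript
// Split a model response into hidden reasoning (inside <think>...</think>)
// and the visible answer text that should be rendered prominently.
function parseThinkTags(raw: string): { thinking: string | null; answer: string } {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) return { thinking: null, answer: raw.trim() };
  return {
    thinking: match[1].trim(),
    answer: raw.replace(match[0], "").trim(),
  };
}
```

The UI can then render `thinking` inside the collapsible section and `answer` as the main response.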
Santosh Bhavani
7742a9f0de Implement multi-hop graph traversal with depth tracking
- Extract ALL edges from graph traversal paths, not just endpoints
- Add depth field (edge position in path: 0, 1, 2...)
- Add pathLength field (total edges in path)
- Use numeric index iteration for AQL compatibility
- Apply depth penalty to edge scoring (earlier edges weighted higher)
- Enable visualization of knowledge chains in graph queries
- Increase topK default to 40 for richer multi-hop context

This allows Traditional Graph to show how information is connected
across multiple hops in the knowledge graph, similar to GraphRAG.

2025-10-25 13:48:52 -07:00
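The depth penalty described above can be sketched as an exponential decay over the edge's position in its path (the interface and the 0.85 decay constant are illustrative assumptions, not the project's actual values):

```typescript
interface TraversedEdge {
  subject: string;
  predicate: string;
  object: string;
  depth: number;      // position of the edge in its path: 0, 1, 2...
  pathLength: number; // total edges in the path
  baseScore: number;  // relevance score before the penalty
}

// Weight edges so that those earlier in a traversal path rank higher,
// as the commit describes for multi-hop scoring.
function depthPenalizedScore(edge: TraversedEdge, decay = 0.85): number {
  return edge.baseScore * Math.pow(decay, edge.depth);
}
```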
Santosh Bhavani
69cd444ea7 Add NVIDIA API support with thinking tokens for Traditional Graph
- Integrate NVIDIA API as alternative to Ollama for graph queries
- Implement thinking tokens API with /think system message
- Add min_thinking_tokens (1024) and max_thinking_tokens (2048)
- Format reasoning_content with <think> tags for UI parsing
- Support dynamic model/provider selection per query
- Maintain Ollama fallback for backward compatibility

This enables Traditional Graph to use NVIDIA's reasoning models
(e.g., nvidia-nemotron-nano-9b-v2) with visible chain-of-thought.

2025-10-25 13:48:44 -07:00
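A request payload for the thinking-tokens setup above might look like this sketch. The `/think` system message and the `min_thinking_tokens`/`max_thinking_tokens` fields are taken directly from the commit message; treat the exact field names and placement as assumptions rather than the live NVIDIA API contract:

```typescript
// Build a chat-completions payload for an NVIDIA reasoning model (sketch).
function buildReasoningRequest(model: string, question: string) {
  return {
    model, // e.g. "nvidia-nemotron-nano-9b-v2"
    messages: [
      { role: "system", content: "/think" }, // enables thinking mode per the commit
      { role: "user", content: question },
    ],
    min_thinking_tokens: 1024,
    max_thinking_tokens: 2048,
  };
}
```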
Santosh Bhavani
0d5b85cdc5 Optimize Docker build with multi-stage caching
- Implement multi-stage Dockerfile (deps → builder → runner)
- Add BuildKit cache mounts for pnpm store and Next.js build cache
- Enable Next.js standalone output for smaller production images
- Create non-root user (nextjs:nodejs) with proper permissions
- Enhance .dockerignore to exclude more build artifacts
- Build time reduced from 225+ seconds to ~35 seconds

2025-10-25 13:48:36 -07:00
Santosh Bhavani
8974ee9913 Improve Pure RAG UI and add query mode tracking
- Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG
- Add collapsible reasoning section for <think> tags in answers
- Add markdown rendering support (bold/italic) in answers
- Fix Pure RAG to properly display answers using llmAnswer state
- Hide empty results message for Pure RAG mode
- Update metrics sidebar to show query times by mode instead of overall average
- Add queryTimesByMode field to metrics API and frontend interfaces
- Disable GraphRAG button with "COMING SOON" badge (requires GNN model)
- Fix Qdrant vector store document mapping with contentPayloadKey
- Update console logs to reflect Qdrant instead of Pinecone
- Add @qdrant/js-client-rest dependency to package.json

2025-10-25 10:33:48 -07:00
Santosh Bhavani
de9c46e97e Replace Pinecone with Qdrant for ARM64 compatibility
- Migrate from Pinecone to Qdrant vector database for native ARM64 support
- Add Qdrant service with automatic collection initialization in docker-compose
- Implement QdrantService with UUID-based point IDs to meet Qdrant requirements
- Update all API routes and frontend components to use Qdrant
- Enhance Storage Connections UI with detailed stats (vectors, status, dimensions)
- Add icons and tooltips to Vector DB section matching Graph DB UX
2025-10-24 23:16:44 -07:00
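The UUID-based point IDs mentioned above stem from Qdrant accepting only unsigned integers or UUIDs as point IDs. One way to satisfy this for arbitrary string document keys is to hash them into a deterministic UUID-shaped string; this is a sketch, and the actual service may derive IDs differently:

```typescript
import { createHash } from "node:crypto";

// Hash an arbitrary document key into a deterministic 8-4-4-4-12 hex ID
// so the same document always maps to the same Qdrant point.
function toQdrantPointId(docKey: string): string {
  const hex = createHash("sha256").update(docKey).digest("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}
```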
Santosh Bhavani
cfebbc7b04 Add stop.sh script 2025-10-24 22:03:47 -07:00
Santosh Bhavani
eec479197b Add Docker permission validation 2025-10-24 22:02:23 -07:00
Santosh Bhavani
07d4107da4 Merge remote-tracking branch 'upstream/main' 2025-10-24 19:51:27 -07:00
Santosh Bhavani
6e90701a9b Add document tracking to prevent duplicates 2025-10-24 19:45:41 -07:00
Santosh Bhavani
97e4be5772 Add configurable NVIDIA model support 2025-10-24 19:45:36 -07:00
Santosh Bhavani
215ce25c05 Update NVIDIA models to Nemotron Super/Nano 2025-10-24 19:45:31 -07:00
GitLab CI
6a34e25169 chore: Regenerate all playbooks 2025-10-22 19:44:23 +00:00
GitLab CI
ab0cb00e0b chore: Regenerate all playbooks 2025-10-22 18:54:29 +00:00
GitLab CI
d301ca4f84 chore: Regenerate all playbooks 2025-10-22 16:17:25 +00:00
GitLab CI
15beb4e9fc chore: Regenerate all playbooks 2025-10-21 13:09:58 +00:00
GitLab CI
c66572a74b chore: Regenerate all playbooks 2025-10-21 03:53:26 +00:00
GitLab CI
8ca84d63e9 chore: Regenerate all playbooks 2025-10-21 03:50:02 +00:00
GitLab CI
3c3578c620 chore: Regenerate all playbooks 2025-10-21 03:40:46 +00:00
GitLab CI
11f2a77ea7 chore: Regenerate all playbooks 2025-10-21 00:57:26 +00:00
Santosh Bhavani
23b5cbca4c feat(processor): add parallel processing and NVIDIA API support
- Implement parallel chunk processing with configurable concurrency
- Add direct NVIDIA API integration bypassing LangChain for better control
- Optimize for DGX Spark unified memory with batch processing
- Use concurrency of 4 for Ollama, 2 for other providers
- Add proper error handling and user stop capability
- Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Improve prompt engineering for triple extraction
2025-10-19 20:58:59 -07:00
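The concurrency-limited chunk processing above can be sketched as a simple worker-lane pattern (function name is illustrative; the commit uses a concurrency of 4 for Ollama and 2 for other providers):

```typescript
// Run an async worker over all items with at most `concurrency` in flight,
// preserving result order by index.
async function mapWithConcurrency<T, R>(
  items: T[],
  concurrency: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  const lane = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++; // claim the next index before awaiting
      results[i] = await worker(items[i]);
    }
  };
  const lanes = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: lanes }, lane));
  return results;
}
```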
Santosh Bhavani
12c4777eae feat(langchain): upgrade to Llama 3.3 Nemotron Super 49B
- Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5
- Adjust temperature to 0.6 for better response quality
- Increase timeout to 120s for larger model
- Add top_p, frequency_penalty, and presence_penalty parameters
- Remove deprecated response_format configuration
2025-10-19 20:57:03 -07:00
Santosh Bhavani
5be2ad78bf feat(ui): upgrade to NVIDIA Llama 3.3 Nemotron Super 49B
- Update default NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Update model display name and description
- Replace deprecated 70B model with newer 49B Super model
2025-10-19 20:57:00 -07:00
Santosh Bhavani
529debb633 perf(docker): increase Ollama parallel processing for DGX
- Increase OLLAMA_NUM_PARALLEL from 1 to 4 requests
- Leverage DGX Spark's unified memory architecture
- Improve throughput for concurrent inference requests
2025-10-19 20:56:58 -07:00
Santosh Bhavani
ffb0688a63 refactor(ui): improve database connection button UI
- Replace button text with icons for compact display
- Add tooltips to Refresh, Disconnect, and Clear buttons
- Improve button spacing and alignment
- Import LogOut icon for disconnect action
2025-10-19 19:57:17 -07:00
Santosh Bhavani
37ee4b63f1 feat(ui): display LLM-generated answers in RAG page
- Switch traditional graph search to use LLM-enhanced endpoint
- Display LLM-generated answer prominently above triples
- Add llmAnswer state to store and display generated answers
- Update results section to show 'Supporting Triples' when answer exists
- Pass selected LLM model and provider to API
- Improve debug logging for query modes and results
2025-10-19 19:57:15 -07:00
Santosh Bhavani
1bb48b9818 feat(component): add LLM selector to RAG query interface
- Integrate LLMSelectorCompact into RAG query component
- Make query mode cards more compact to accommodate LLM selector
- Update styling for better space utilization
- Add LLM selection section with descriptive label
2025-10-19 19:57:15 -07:00
Santosh Bhavani
db1e7760f6 feat(component): add compact LLM selector component
- Create LLMSelectorCompact component for model selection
- Support Ollama and NVIDIA models
- Load available models from localStorage
- Persist selected model and dispatch selection events
- Compact design suitable for inline placement
2025-10-19 19:57:14 -07:00
Santosh Bhavani
156bfb2e8d feat(api): update metrics route for multi-database support
- Update metrics endpoint to use getGraphDbService utility
- Support both ArangoDB and Neo4j database types
- Initialize graph database based on selected type
- Retrieve graph stats from the active database
2025-10-19 19:57:13 -07:00
Santosh Bhavani
a082a8a737 feat(backend): implement LLM-enhanced query method
- Add queryWithLLM method to BackendService
- Retrieves top K triples from graph and uses LLM to generate answers
- Supports configurable LLM model and provider selection
- Uses research-backed prompt structure for KG-enhanced RAG
- Includes fallback handling for LLM errors
2025-10-19 19:57:12 -07:00
Santosh Bhavani
d842dc996a feat(api): add LLM-enhanced graph query endpoint
- Create new /api/graph-query-llm endpoint for graph search + LLM generation
- Retrieves triples using graph search and generates answers using LLM
- Supports both traditional and vector-based graph search
- Makes traditional graph search comparable to RAG for benchmarking
2025-10-19 19:57:11 -07:00
Santosh Bhavani
8c1d2ae9f3 feat(docker): add vector search services and GPU configuration
- Add optional Pinecone and sentence-transformers services for vector search
- Configure NVIDIA GPU support with proper environment variables
- Add new environment variables for embeddings and Pinecone
- Add docker compose profiles to optionally enable vector-search
- Improve CUDA configuration for Ollama service
- Add pinecone-net network for service communication
2025-10-19 19:56:55 -07:00
Santosh Bhavani
9dc734eee5 Add NVIDIA_API_KEY support and update ollama to v0.12.6 2025-10-19 14:52:24 -05:00
GitLab CI
752eada0cb chore: Regenerate all playbooks 2025-10-18 21:48:15 +00:00
GitLab CI
505cacdbd6 chore: Regenerate all playbooks 2025-10-18 21:28:42 +00:00
GitLab CI
a6f94052b1 chore: Regenerate all playbooks 2025-10-17 17:29:40 +00:00
GitLab CI
3ed5b3b073 chore: Regenerate all playbooks 2025-10-17 00:58:35 +00:00
GitLab CI
0d9108cf14 chore: Regenerate all playbooks 2025-10-16 21:25:27 +00:00
GitLab CI
7457f31016 chore: Regenerate all playbooks 2025-10-16 21:14:27 +00:00
GitLab CI
058b5b70b2 chore: Regenerate all playbooks 2025-10-16 20:25:06 +00:00
GitLab CI
c8ab690414 chore: Regenerate all playbooks 2025-10-16 19:02:54 +00:00
GitLab CI
b4a071c721 chore: Regenerate all playbooks 2025-10-16 18:49:21 +00:00
GitLab CI
5cd142bc41 chore: Regenerate all playbooks 2025-10-16 18:35:50 +00:00
GitLab CI
2ff64d7265 chore: Regenerate all playbooks 2025-10-16 17:29:56 +00:00
GitLab CI
6dd7697210 chore: Regenerate all playbooks 2025-10-16 14:13:04 +00:00
GitLab CI
99c6530528 chore: Regenerate all playbooks 2025-10-16 13:05:16 +00:00