Commit Graph

196 Commits

Author SHA1 Message Date
Santosh Bhavani
69cd444ea7 Add NVIDIA API support with thinking tokens for Traditional Graph
- Integrate NVIDIA API as alternative to Ollama for graph queries
- Implement thinking tokens API with /think system message
- Add min_thinking_tokens (1024) and max_thinking_tokens (2048)
- Format reasoning_content with <think> tags for UI parsing
- Support dynamic model/provider selection per query
- Maintain Ollama fallback for backward compatibility

This enables Traditional Graph to use NVIDIA's reasoning models
(e.g., nvidia-nemotron-nano-9b-v2) with visible chain-of-thought.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 13:48:44 -07:00
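A hypothetical sketch of the thinking-token flow this commit describes: the `/think` system message and the `min_thinking_tokens`/`max_thinking_tokens` values come from the commit itself, while the response field names (`reasoning_content`, `content`, OpenAI-style) and the request shape are assumptions.

```typescript
// Hypothetical request body; only the model name, /think message, and the
// two thinking-token limits are taken from the commit above.
const requestBody = {
  model: "nvidia-nemotron-nano-9b-v2",
  messages: [
    { role: "system", content: "/think" }, // enables thinking mode (per commit)
    { role: "user", content: "What is GraphRAG?" },
  ],
  min_thinking_tokens: 1024,
  max_thinking_tokens: 2048,
};

interface ChatMessage {
  content: string;
  reasoning_content?: string; // assumed field name for chain-of-thought
}

// Wrap the reasoning in <think> tags so the existing UI parser can render
// a collapsible reasoning section, as the commit describes.
function formatWithThinking(msg: ChatMessage): string {
  if (!msg.reasoning_content) return msg.content;
  return `<think>${msg.reasoning_content}</think>\n${msg.content}`;
}
```

With this shape, a response lacking `reasoning_content` (e.g. from the Ollama fallback) passes through unchanged.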
Santosh Bhavani
0d5b85cdc5 Optimize Docker build with multi-stage caching
- Implement multi-stage Dockerfile (deps → builder → runner)
- Add BuildKit cache mounts for pnpm store and Next.js build cache
- Enable Next.js standalone output for smaller production images
- Create non-root user (nextjs:nodejs) with proper permissions
- Enhance .dockerignore to exclude more build artifacts
- Build time reduced from 225+ seconds to ~35 seconds

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 13:48:36 -07:00
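A hypothetical sketch of the deps → builder → runner structure this commit describes, with BuildKit cache mounts for the pnpm store and the Next.js build cache; base image tags, cache paths, and the standalone output layout are assumptions.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
# Cache the pnpm store across builds (path is an assumption)
RUN --mount=type=cache,target=/root/.local/share/pnpm/store \
    corepack enable && pnpm install --frozen-lockfile

FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# Cache the Next.js incremental build cache
RUN --mount=type=cache,target=/app/.next/cache \
    corepack enable && pnpm build

FROM node:20-alpine AS runner
WORKDIR /app
# Non-root user, as the commit describes
RUN addgroup -S nodejs && adduser -S nextjs -G nodejs
# Requires output: "standalone" in next.config
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
USER nextjs
CMD ["node", "server.js"]
```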
Santosh Bhavani
8974ee9913 Improve Pure RAG UI and add query mode tracking
- Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG
- Add collapsible reasoning section for <think> tags in answers
- Add markdown rendering support (bold/italic) in answers
- Fix Pure RAG to properly display answers using llmAnswer state
- Hide empty results message for Pure RAG mode
- Update metrics sidebar to show query times by mode instead of overall average
- Add queryTimesByMode field to metrics API and frontend interfaces
- Disable GraphRAG button with "COMING SOON" badge (requires GNN model)
- Fix Qdrant vector store document mapping with contentPayloadKey
- Update console logs to reflect Qdrant instead of Pinecone
- Add @qdrant/js-client-rest dependency to package.json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 10:33:48 -07:00
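A hypothetical sketch of the per-mode aggregation behind the `queryTimesByMode` field mentioned above; the record shape and mode labels are assumptions based on the three modes the commit names.

```typescript
type QueryMode = "pure-rag" | "traditional-graph" | "graphrag";

interface QueryRecord {
  mode: QueryMode;
  durationMs: number;
}

// Average query duration per mode, replacing the single overall average.
function queryTimesByMode(records: QueryRecord[]): Record<string, number> {
  const sums: Record<string, { total: number; count: number }> = {};
  for (const r of records) {
    const s = (sums[r.mode] ??= { total: 0, count: 0 });
    s.total += r.durationMs;
    s.count += 1;
  }
  return Object.fromEntries(
    Object.entries(sums).map(([mode, s]) => [mode, s.total / s.count])
  );
}
```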
Santosh Bhavani
de9c46e97e Replace Pinecone with Qdrant for ARM64 compatibility
- Migrate from Pinecone to Qdrant vector database for native ARM64 support
- Add Qdrant service with automatic collection initialization in docker-compose
- Implement QdrantService with UUID-based point IDs to meet Qdrant requirements
- Update all API routes and frontend components to use Qdrant
- Enhance Storage Connections UI with detailed stats (vectors, status, dimensions)
- Add icons and tooltips to Vector DB section matching Graph DB UX
2025-10-24 23:16:44 -07:00
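Qdrant accepts only unsigned integers or UUIDs as point IDs, which is why the commit above mentions UUID-based point IDs. A hypothetical sketch of deriving a deterministic UUID-shaped ID from an arbitrary document chunk key (not a spec-compliant UUIDv5, just a hash-based stand-in):

```typescript
import { createHash } from "node:crypto";

// Deterministic: the same chunk key always maps to the same point ID,
// which also helps with the duplicate-document tracking added later.
function pointIdFor(docKey: string): string {
  const h = createHash("sha256").update(docKey).digest("hex");
  return [
    h.slice(0, 8),
    h.slice(8, 12),
    h.slice(12, 16),
    h.slice(16, 20),
    h.slice(20, 32),
  ].join("-");
}
```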
Santosh Bhavani
cfebbc7b04 Add stop.sh script 2025-10-24 22:03:47 -07:00
Santosh Bhavani
eec479197b Add Docker permission validation 2025-10-24 22:02:23 -07:00
Santosh Bhavani
07d4107da4 Merge remote-tracking branch 'upstream/main' 2025-10-24 19:51:27 -07:00
Santosh Bhavani
6e90701a9b Add document tracking to prevent duplicates 2025-10-24 19:45:41 -07:00
Santosh Bhavani
97e4be5772 Add configurable NVIDIA model support 2025-10-24 19:45:36 -07:00
Santosh Bhavani
215ce25c05 Update NVIDIA models to Nemotron Super/Nano 2025-10-24 19:45:31 -07:00
GitLab CI
6a34e25169 chore: Regenerate all playbooks 2025-10-22 19:44:23 +00:00
GitLab CI
ab0cb00e0b chore: Regenerate all playbooks 2025-10-22 18:54:29 +00:00
GitLab CI
d301ca4f84 chore: Regenerate all playbooks 2025-10-22 16:17:25 +00:00
GitLab CI
15beb4e9fc chore: Regenerate all playbooks 2025-10-21 13:09:58 +00:00
GitLab CI
c66572a74b chore: Regenerate all playbooks 2025-10-21 03:53:26 +00:00
GitLab CI
8ca84d63e9 chore: Regenerate all playbooks 2025-10-21 03:50:02 +00:00
GitLab CI
3c3578c620 chore: Regenerate all playbooks 2025-10-21 03:40:46 +00:00
GitLab CI
11f2a77ea7 chore: Regenerate all playbooks 2025-10-21 00:57:26 +00:00
Santosh Bhavani
23b5cbca4c feat(processor): add parallel processing and NVIDIA API support
- Implement parallel chunk processing with configurable concurrency
- Add direct NVIDIA API integration bypassing LangChain for better control
- Optimize for DGX Spark unified memory with batch processing
- Use concurrency of 4 for Ollama, 2 for other providers
- Add proper error handling and user stop capability
- Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Improve prompt engineering for triple extraction
2025-10-19 20:58:59 -07:00
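A hypothetical sketch of the concurrency-limited chunk processing this commit describes (limit 4 for Ollama, 2 for other providers); the helper name and signature are assumptions.

```typescript
// Run fn over items with at most `limit` in flight at once, preserving
// input order in the results.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index; claiming is
  // synchronous (no await between check and increment), so it is race-free
  // on the single-threaded event loop.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```

A user "stop" capability could then be layered on by having `fn` check an abort flag before issuing each request.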
Santosh Bhavani
12c4777eae feat(langchain): upgrade to Llama 3.3 Nemotron Super 49B
- Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5
- Adjust temperature to 0.6 for better response quality
- Increase timeout to 120s for larger model
- Add top_p, frequency_penalty, and presence_penalty parameters
- Remove deprecated response_format configuration
2025-10-19 20:57:03 -07:00
Santosh Bhavani
5be2ad78bf feat(ui): upgrade to NVIDIA Llama 3.3 Nemotron Super 49B
- Update default NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Update model display name and description
- Replace deprecated 70B model with newer 49B Super model
2025-10-19 20:57:00 -07:00
Santosh Bhavani
529debb633 perf(docker): increase Ollama parallel processing for DGX
- Increase OLLAMA_NUM_PARALLEL from 1 to 4 requests
- Leverage DGX Spark's unified memory architecture
- Improve throughput for concurrent inference requests
2025-10-19 20:56:58 -07:00
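A hypothetical docker-compose fragment for the change above; the service name and image tag are assumptions, only the `OLLAMA_NUM_PARALLEL` value comes from the commit.

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    environment:
      - OLLAMA_NUM_PARALLEL=4  # was 1; allow 4 concurrent inference requests
```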
Santosh Bhavani
ffb0688a63 refactor(ui): improve database connection button UI
- Replace button text with icons for compact display
- Add tooltips to Refresh, Disconnect, and Clear buttons
- Improve button spacing and alignment
- Import LogOut icon for disconnect action
2025-10-19 19:57:17 -07:00
Santosh Bhavani
37ee4b63f1 feat(ui): display LLM-generated answers in RAG page
- Switch traditional graph search to use LLM-enhanced endpoint
- Display LLM-generated answer prominently above triples
- Add llmAnswer state to store and display generated answers
- Update results section to show 'Supporting Triples' when answer exists
- Pass selected LLM model and provider to API
- Improve debug logging for query modes and results
2025-10-19 19:57:15 -07:00
Santosh Bhavani
1bb48b9818 feat(component): add LLM selector to RAG query interface
- Integrate LLMSelectorCompact into RAG query component
- Make query mode cards more compact to accommodate LLM selector
- Update styling for better space utilization
- Add LLM selection section with descriptive label
2025-10-19 19:57:15 -07:00
Santosh Bhavani
db1e7760f6 feat(component): add compact LLM selector component
- Create LLMSelectorCompact component for model selection
- Support Ollama and NVIDIA models
- Load available models from localStorage
- Persist selected model and dispatch selection events
- Compact design suitable for inline placement
2025-10-19 19:57:14 -07:00
Santosh Bhavani
156bfb2e8d feat(api): update metrics route for multi-database support
- Update metrics endpoint to use getGraphDbService utility
- Support both ArangoDB and Neo4j database types
- Initialize graph database based on selected type
- Retrieve graph stats from the active database
2025-10-19 19:57:13 -07:00
Santosh Bhavani
a082a8a737 feat(backend): implement LLM-enhanced query method
- Add queryWithLLM method to BackendService
- Retrieves top K triples from graph and uses LLM to generate answers
- Supports configurable LLM model and provider selection
- Uses research-backed prompt structure for KG-enhanced RAG
- Includes fallback handling for LLM errors
2025-10-19 19:57:12 -07:00
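A hypothetical sketch of the prompt assembly inside a method like `queryWithLLM`: the retrieved top-K triples become a grounding context block for the LLM. The exact prompt wording used by the project is an assumption.

```typescript
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Build a KG-grounded prompt: instruct the model to answer only from the
// supplied triples, then list them, then pose the question.
function buildKgPrompt(question: string, triples: Triple[]): string {
  const context = triples
    .map((t) => `(${t.subject}, ${t.predicate}, ${t.object})`)
    .join("\n");
  return [
    "Answer the question using only the knowledge-graph triples below.",
    "Triples:",
    context,
    `Question: ${question}`,
  ].join("\n");
}
```

On an LLM error, a fallback (as the commit mentions) could simply return the raw triples instead of a generated answer.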
Santosh Bhavani
d842dc996a feat(api): add LLM-enhanced graph query endpoint
- Create new /api/graph-query-llm endpoint for graph search + LLM generation
- Retrieves triples using graph search and generates answers using LLM
- Supports both traditional and vector-based graph search
- Makes traditional graph search comparable to RAG for benchmarking
2025-10-19 19:57:11 -07:00
Santosh Bhavani
8c1d2ae9f3 feat(docker): add vector search services and GPU configuration
- Add optional Pinecone and sentence-transformers services for vector search
- Configure NVIDIA GPU support with proper environment variables
- Add new environment variables for embeddings and Pinecone
- Add docker compose profiles to optionally enable vector-search
- Improve CUDA configuration for Ollama service
- Add pinecone-net network for service communication
2025-10-19 19:56:55 -07:00
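A hypothetical fragment of the optional vector-search profile and GPU reservation this commit describes; service and image names are assumptions.

```yaml
services:
  embeddings:
    image: sentence-transformers-server:latest  # placeholder image name
    profiles: ["vector-search"]  # started only with --profile vector-search
    networks: [pinecone-net]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```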
Santosh Bhavani
9dc734eee5 Add NVIDIA_API_KEY support and update ollama to v0.12.6 2025-10-19 14:52:24 -05:00
GitLab CI
752eada0cb chore: Regenerate all playbooks 2025-10-18 21:48:15 +00:00
GitLab CI
505cacdbd6 chore: Regenerate all playbooks 2025-10-18 21:28:42 +00:00
GitLab CI
a6f94052b1 chore: Regenerate all playbooks 2025-10-17 17:29:40 +00:00
GitLab CI
3ed5b3b073 chore: Regenerate all playbooks 2025-10-17 00:58:35 +00:00
GitLab CI
0d9108cf14 chore: Regenerate all playbooks 2025-10-16 21:25:27 +00:00
GitLab CI
7457f31016 chore: Regenerate all playbooks 2025-10-16 21:14:27 +00:00
GitLab CI
058b5b70b2 chore: Regenerate all playbooks 2025-10-16 20:25:06 +00:00
GitLab CI
c8ab690414 chore: Regenerate all playbooks 2025-10-16 19:02:54 +00:00
GitLab CI
b4a071c721 chore: Regenerate all playbooks 2025-10-16 18:49:21 +00:00
GitLab CI
5cd142bc41 chore: Regenerate all playbooks 2025-10-16 18:35:50 +00:00
GitLab CI
2ff64d7265 chore: Regenerate all playbooks 2025-10-16 17:29:56 +00:00
GitLab CI
6dd7697210 chore: Regenerate all playbooks 2025-10-16 14:13:04 +00:00
GitLab CI
99c6530528 chore: Regenerate all playbooks 2025-10-16 13:05:16 +00:00
GitLab CI
2371189ab9 chore: Regenerate all playbooks 2025-10-15 13:32:10 +00:00
GitLab CI
8a12782b17 chore: Regenerate all playbooks 2025-10-14 14:11:25 +00:00
KJ
fc2e7847da Remove third-party license reference from README
Removed reference to third-party licensing information.
2025-10-14 09:05:25 -04:00
KJ
daeb58d0da Delete LICENSE-3rd-party 2025-10-14 09:05:07 -04:00
KJ
b50f58cabd Update LICENSE-3rd-party with new copyright details
Added copyright information for multiple third-party libraries including Marimo, JAX, Unsloth, and OpenCV.
2025-10-13 22:29:59 -04:00
KJ
ecd88a2ac3 Update documentation links in README.md 2025-10-13 22:20:34 -04:00