Commit Graph

202 Commits

Author SHA1 Message Date
Santosh Bhavani
56d15db148 Rename Traditional Graph to Graph Search and display times in seconds
- Rename "Traditional Graph" to "Graph Search" throughout UI
  - Performance Metrics card label
  - Answer section badge
  - Console logging
- Display query times in seconds instead of milliseconds
  - Pure RAG: 11.09s (was 11090.00ms)
  - Graph Search: 11.09s (was 11090.00ms)
  - GraphRAG: 11.09s (was 11090.00ms)

The new name "Graph Search" better describes the functionality,
and seconds provide a more intuitive basis for performance comparison.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 14:10:56 -07:00
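The millisecond-to-second display change above can be sketched as follows (a minimal illustration; `formatSeconds` is a hypothetical helper name, not necessarily the identifier used in the codebase):

```typescript
// Format a duration recorded in milliseconds as seconds with two decimals,
// e.g. 11090.00ms -> "11.09s", matching the values in the commit message.
function formatSeconds(ms: number): string {
  return `${(ms / 1000).toFixed(2)}s`;
}
```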
Santosh Bhavani
3c39506b06 Fix query mode grouping in performance metrics
- Add queryMode field to QueryLogSummary interface
- Update getQueryLogs to group by both query AND queryMode
- Use composite key (query|||queryMode) for proper separation
- Enables separate tracking of Pure RAG vs Graph Search queries

Previously, queries with the same text but different modes were
merged together, causing metrics to only show one aggregate value.
Now each mode's performance is tracked independently.

2025-10-25 14:10:37 -07:00
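The composite-key grouping described above can be sketched roughly as below (the interface and function names are illustrative assumptions, not the repository's actual identifiers):

```typescript
interface QueryLog {
  query: string;
  queryMode: string; // e.g. Pure RAG vs. Graph Search
  durationMs: number;
}

// Group logs by query text AND query mode, so identical questions asked in
// different modes are tracked as separate metric series.
function groupByQueryAndMode(logs: QueryLog[]): Map<string, QueryLog[]> {
  const groups = new Map<string, QueryLog[]>();
  for (const log of logs) {
    // "|||" is a separator unlikely to appear in either field.
    const key = `${log.query}|||${log.queryMode}`;
    const bucket = groups.get(key);
    if (bucket) bucket.push(log);
    else groups.set(key, [log]);
  }
  return groups;
}
```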
Santosh Bhavani
325895ffba Add debug logging and minor improvements
- Add triple structure logging in API route for debugging
- Update graph-db-service imports for multi-hop fields
- Improve embeddings generator UI responsiveness
- Enable data pipeline verification for depth/pathLength fields

These changes help diagnose issues with multi-hop data flow
and ensure proper propagation of metadata through the stack.

2025-10-25 13:49:14 -07:00
Santosh Bhavani
cbe92b50e7 Increase topK default to 40 for multi-hop context
- Update default topK from 20 to 40 in query parameters
- Support richer context with multi-hop graph traversal
- Maintain flexible topK range (1-50) via slider
- Respect user selection of query mode (don't force mode changes)

Higher topK allows more edges from multi-hop paths to be included
in the context provided to the LLM for answer generation.

2025-10-25 13:49:07 -07:00
Santosh Bhavani
3975e92579 Add multi-hop indicators and collapsible thinking steps UI
- Display hop number (0, 1, 2...) with network icon for each triple
- Show multi-hop path badge for paths with length > 1
- Add "Multi-hop enabled" badge in Retrieved Knowledge header
- Implement collapsible thinking steps with proper chevron rotation
- Parse <think> tags from NVIDIA reasoning content
- Reduce console logging (sample only, not full dataset)
- Show path length with amber lightning icon

This provides visual feedback about multi-hop reasoning paths
and makes the LLM's chain-of-thought process transparent.

2025-10-25 13:49:00 -07:00
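The `<think>` tag parsing mentioned above can be sketched as a small splitter (function name and return shape are assumptions for illustration):

```typescript
// Split a model response into hidden reasoning (inside <think>...</think>)
// and the visible answer text that should be rendered prominently.
function parseThinkTags(raw: string): { thinking: string | null; answer: string } {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) return { thinking: null, answer: raw.trim() };
  return {
    thinking: match[1].trim(),
    answer: raw.replace(match[0], "").trim(),
  };
}
```

The UI can then render `thinking` inside the collapsible section and `answer` as the main response.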
Santosh Bhavani
7742a9f0de Implement multi-hop graph traversal with depth tracking
- Extract ALL edges from graph traversal paths, not just endpoints
- Add depth field (edge position in path: 0, 1, 2...)
- Add pathLength field (total edges in path)
- Use numeric index iteration for AQL compatibility
- Apply depth penalty to edge scoring (earlier edges weighted higher)
- Enable visualization of knowledge chains in graph queries
- Increase topK default to 40 for richer multi-hop context

This allows Traditional Graph to show how information is connected
across multiple hops in the knowledge graph, similar to GraphRAG.

2025-10-25 13:48:52 -07:00
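The depth penalty described above can be sketched as an exponential decay over the edge's position in its path (the interface and the 0.85 decay constant are illustrative assumptions, not the project's actual values):

```typescript
interface TraversedEdge {
  subject: string;
  predicate: string;
  object: string;
  depth: number;      // position of the edge in its path: 0, 1, 2...
  pathLength: number; // total edges in the path
  baseScore: number;  // relevance score before the penalty
}

// Weight edges so that those earlier in a traversal path rank higher,
// as the commit describes for multi-hop scoring.
function depthPenalizedScore(edge: TraversedEdge, decay = 0.85): number {
  return edge.baseScore * Math.pow(decay, edge.depth);
}
```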
Santosh Bhavani
69cd444ea7 Add NVIDIA API support with thinking tokens for Traditional Graph
- Integrate NVIDIA API as alternative to Ollama for graph queries
- Implement thinking tokens API with /think system message
- Add min_thinking_tokens (1024) and max_thinking_tokens (2048)
- Format reasoning_content with <think> tags for UI parsing
- Support dynamic model/provider selection per query
- Maintain Ollama fallback for backward compatibility

This enables Traditional Graph to use NVIDIA's reasoning models
(e.g., nvidia-nemotron-nano-9b-v2) with visible chain-of-thought.

2025-10-25 13:48:44 -07:00
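A request payload for the thinking-tokens setup above might look like this sketch. The `/think` system message and the `min_thinking_tokens`/`max_thinking_tokens` fields are taken directly from the commit message; treat the exact field names and placement as assumptions rather than the live NVIDIA API contract:

```typescript
// Build a chat-completions payload for an NVIDIA reasoning model (sketch).
function buildReasoningRequest(model: string, question: string) {
  return {
    model, // e.g. "nvidia-nemotron-nano-9b-v2"
    messages: [
      { role: "system", content: "/think" }, // enables thinking mode per the commit
      { role: "user", content: question },
    ],
    min_thinking_tokens: 1024,
    max_thinking_tokens: 2048,
  };
}
```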
Santosh Bhavani
0d5b85cdc5 Optimize Docker build with multi-stage caching
- Implement multi-stage Dockerfile (deps → builder → runner)
- Add BuildKit cache mounts for pnpm store and Next.js build cache
- Enable Next.js standalone output for smaller production images
- Create non-root user (nextjs:nodejs) with proper permissions
- Enhance .dockerignore to exclude more build artifacts
- Build time reduced from 225+ seconds to ~35 seconds

2025-10-25 13:48:36 -07:00
Santosh Bhavani
8974ee9913 Improve Pure RAG UI and add query mode tracking
- Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG
- Add collapsible reasoning section for <think> tags in answers
- Add markdown rendering support (bold/italic) in answers
- Fix Pure RAG to properly display answers using llmAnswer state
- Hide empty results message for Pure RAG mode
- Update metrics sidebar to show query times by mode instead of overall average
- Add queryTimesByMode field to metrics API and frontend interfaces
- Disable GraphRAG button with "COMING SOON" badge (requires GNN model)
- Fix Qdrant vector store document mapping with contentPayloadKey
- Update console logs to reflect Qdrant instead of Pinecone
- Add @qdrant/js-client-rest dependency to package.json

2025-10-25 10:33:48 -07:00
Santosh Bhavani
de9c46e97e Replace Pinecone with Qdrant for ARM64 compatibility
- Migrate from Pinecone to Qdrant vector database for native ARM64 support
- Add Qdrant service with automatic collection initialization in docker-compose
- Implement QdrantService with UUID-based point IDs to meet Qdrant requirements
- Update all API routes and frontend components to use Qdrant
- Enhance Storage Connections UI with detailed stats (vectors, status, dimensions)
- Add icons and tooltips to Vector DB section matching Graph DB UX
2025-10-24 23:16:44 -07:00
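The UUID-based point IDs mentioned above stem from Qdrant accepting only unsigned integers or UUIDs as point IDs. One way to satisfy this for arbitrary string document keys is to hash them into a deterministic UUID-shaped string; this is a sketch, and the actual service may derive IDs differently:

```typescript
import { createHash } from "node:crypto";

// Hash an arbitrary document key into a deterministic 8-4-4-4-12 hex ID
// so the same document always maps to the same Qdrant point.
function toQdrantPointId(docKey: string): string {
  const hex = createHash("sha256").update(docKey).digest("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}
```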
Santosh Bhavani
cfebbc7b04 Add stop.sh script 2025-10-24 22:03:47 -07:00
Santosh Bhavani
eec479197b Add Docker permission validation 2025-10-24 22:02:23 -07:00
Santosh Bhavani
07d4107da4 Merge remote-tracking branch 'upstream/main' 2025-10-24 19:51:27 -07:00
Santosh Bhavani
6e90701a9b Add document tracking to prevent duplicates 2025-10-24 19:45:41 -07:00
Santosh Bhavani
97e4be5772 Add configurable NVIDIA model support 2025-10-24 19:45:36 -07:00
Santosh Bhavani
215ce25c05 Update NVIDIA models to Nemotron Super/Nano 2025-10-24 19:45:31 -07:00
GitLab CI
6a34e25169 chore: Regenerate all playbooks 2025-10-22 19:44:23 +00:00
GitLab CI
ab0cb00e0b chore: Regenerate all playbooks 2025-10-22 18:54:29 +00:00
GitLab CI
d301ca4f84 chore: Regenerate all playbooks 2025-10-22 16:17:25 +00:00
GitLab CI
15beb4e9fc chore: Regenerate all playbooks 2025-10-21 13:09:58 +00:00
GitLab CI
c66572a74b chore: Regenerate all playbooks 2025-10-21 03:53:26 +00:00
GitLab CI
8ca84d63e9 chore: Regenerate all playbooks 2025-10-21 03:50:02 +00:00
GitLab CI
3c3578c620 chore: Regenerate all playbooks 2025-10-21 03:40:46 +00:00
GitLab CI
11f2a77ea7 chore: Regenerate all playbooks 2025-10-21 00:57:26 +00:00
Santosh Bhavani
23b5cbca4c feat(processor): add parallel processing and NVIDIA API support
- Implement parallel chunk processing with configurable concurrency
- Add direct NVIDIA API integration bypassing LangChain for better control
- Optimize for DGX Spark unified memory with batch processing
- Use concurrency of 4 for Ollama, 2 for other providers
- Add proper error handling and user stop capability
- Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Improve prompt engineering for triple extraction
2025-10-19 20:58:59 -07:00
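The concurrency-limited chunk processing above can be sketched as a simple worker-lane pattern (function name is illustrative; the commit uses a concurrency of 4 for Ollama and 2 for other providers):

```typescript
// Run an async worker over all items with at most `concurrency` in flight,
// preserving result order by index.
async function mapWithConcurrency<T, R>(
  items: T[],
  concurrency: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  const lane = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++; // claim the next index before awaiting
      results[i] = await worker(items[i]);
    }
  };
  const lanes = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: lanes }, lane));
  return results;
}
```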
Santosh Bhavani
12c4777eae feat(langchain): upgrade to Llama 3.3 Nemotron Super 49B
- Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5
- Adjust temperature to 0.6 for better response quality
- Increase timeout to 120s for larger model
- Add top_p, frequency_penalty, and presence_penalty parameters
- Remove deprecated response_format configuration
2025-10-19 20:57:03 -07:00
Santosh Bhavani
5be2ad78bf feat(ui): upgrade to NVIDIA Llama 3.3 Nemotron Super 49B
- Update default NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Update model display name and description
- Replace deprecated 70B model with newer 49B Super model
2025-10-19 20:57:00 -07:00
Santosh Bhavani
529debb633 perf(docker): increase Ollama parallel processing for DGX
- Increase OLLAMA_NUM_PARALLEL from 1 to 4 requests
- Leverage DGX Spark's unified memory architecture
- Improve throughput for concurrent inference requests
2025-10-19 20:56:58 -07:00
Santosh Bhavani
ffb0688a63 refactor(ui): improve database connection button UI
- Replace button text with icons for compact display
- Add tooltips to Refresh, Disconnect, and Clear buttons
- Improve button spacing and alignment
- Import LogOut icon for disconnect action
2025-10-19 19:57:17 -07:00
Santosh Bhavani
37ee4b63f1 feat(ui): display LLM-generated answers in RAG page
- Switch traditional graph search to use LLM-enhanced endpoint
- Display LLM-generated answer prominently above triples
- Add llmAnswer state to store and display generated answers
- Update results section to show 'Supporting Triples' when answer exists
- Pass selected LLM model and provider to API
- Improve debug logging for query modes and results
2025-10-19 19:57:15 -07:00
Santosh Bhavani
1bb48b9818 feat(component): add LLM selector to RAG query interface
- Integrate LLMSelectorCompact into RAG query component
- Make query mode cards more compact to accommodate LLM selector
- Update styling for better space utilization
- Add LLM selection section with descriptive label
2025-10-19 19:57:15 -07:00
Santosh Bhavani
db1e7760f6 feat(component): add compact LLM selector component
- Create LLMSelectorCompact component for model selection
- Support Ollama and NVIDIA models
- Load available models from localStorage
- Persist selected model and dispatch selection events
- Compact design suitable for inline placement
2025-10-19 19:57:14 -07:00
Santosh Bhavani
156bfb2e8d feat(api): update metrics route for multi-database support
- Update metrics endpoint to use getGraphDbService utility
- Support both ArangoDB and Neo4j database types
- Initialize graph database based on selected type
- Retrieve graph stats from the active database
2025-10-19 19:57:13 -07:00
Santosh Bhavani
a082a8a737 feat(backend): implement LLM-enhanced query method
- Add queryWithLLM method to BackendService
- Retrieves top K triples from graph and uses LLM to generate answers
- Supports configurable LLM model and provider selection
- Uses research-backed prompt structure for KG-enhanced RAG
- Includes fallback handling for LLM errors
2025-10-19 19:57:12 -07:00
Santosh Bhavani
d842dc996a feat(api): add LLM-enhanced graph query endpoint
- Create new /api/graph-query-llm endpoint for graph search + LLM generation
- Retrieves triples using graph search and generates answers using LLM
- Supports both traditional and vector-based graph search
- Makes traditional graph search comparable to RAG for benchmarking
2025-10-19 19:57:11 -07:00
Santosh Bhavani
8c1d2ae9f3 feat(docker): add vector search services and GPU configuration
- Add optional Pinecone and sentence-transformers services for vector search
- Configure NVIDIA GPU support with proper environment variables
- Add new environment variables for embeddings and Pinecone
- Add docker compose profiles to optionally enable vector-search
- Improve CUDA configuration for Ollama service
- Add pinecone-net network for service communication
2025-10-19 19:56:55 -07:00
Santosh Bhavani
9dc734eee5 Add NVIDIA_API_KEY support and update ollama to v0.12.6 2025-10-19 14:52:24 -05:00
GitLab CI
752eada0cb chore: Regenerate all playbooks 2025-10-18 21:48:15 +00:00
GitLab CI
505cacdbd6 chore: Regenerate all playbooks 2025-10-18 21:28:42 +00:00
GitLab CI
a6f94052b1 chore: Regenerate all playbooks 2025-10-17 17:29:40 +00:00
GitLab CI
3ed5b3b073 chore: Regenerate all playbooks 2025-10-17 00:58:35 +00:00
GitLab CI
0d9108cf14 chore: Regenerate all playbooks 2025-10-16 21:25:27 +00:00
GitLab CI
7457f31016 chore: Regenerate all playbooks 2025-10-16 21:14:27 +00:00
GitLab CI
058b5b70b2 chore: Regenerate all playbooks 2025-10-16 20:25:06 +00:00
GitLab CI
c8ab690414 chore: Regenerate all playbooks 2025-10-16 19:02:54 +00:00
GitLab CI
b4a071c721 chore: Regenerate all playbooks 2025-10-16 18:49:21 +00:00
GitLab CI
5cd142bc41 chore: Regenerate all playbooks 2025-10-16 18:35:50 +00:00
GitLab CI
2ff64d7265 chore: Regenerate all playbooks 2025-10-16 17:29:56 +00:00
GitLab CI
6dd7697210 chore: Regenerate all playbooks 2025-10-16 14:13:04 +00:00
GitLab CI
99c6530528 chore: Regenerate all playbooks 2025-10-16 13:05:16 +00:00