dgx-spark-playbooks

mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git synced 2026-04-27 04:13:52 +00:00

Author	SHA1	Message	Date
Santosh Bhavani	3c39506b06	Fix query mode grouping in performance metrics - Add queryMode field to QueryLogSummary interface - Update getQueryLogs to group by both query AND queryMode - Use composite key (query|||queryMode) for proper separation - Enables separate tracking of Pure RAG vs Graph Search queries Previously, queries with the same text but different modes were merged together, causing metrics to only show one aggregate value. Now each mode's performance is tracked independently. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 14:10:37 -07:00
Santosh Bhavani	325895ffba	Add debug logging and minor improvements - Add triple structure logging in API route for debugging - Update graph-db-service imports for multi-hop fields - Improve embeddings generator UI responsiveness - Enable data pipeline verification for depth/pathLength fields These changes help diagnose issues with multi-hop data flow and ensure proper propagation of metadata through the stack. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 13:49:14 -07:00
Santosh Bhavani	7742a9f0de	Implement multi-hop graph traversal with depth tracking - Extract ALL edges from graph traversal paths, not just endpoints - Add depth field (edge position in path: 0, 1, 2...) - Add pathLength field (total edges in path) - Use numeric index iteration for AQL compatibility - Apply depth penalty to edge scoring (earlier edges weighted higher) - Enable visualization of knowledge chains in graph queries - Increase topK default to 40 for richer multi-hop context This allows Traditional Graph to show how information is connected across multiple hops in the knowledge graph, similar to GraphRAG. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 13:48:52 -07:00
Santosh Bhavani	69cd444ea7	Add NVIDIA API support with thinking tokens for Traditional Graph - Integrate NVIDIA API as alternative to Ollama for graph queries - Implement thinking tokens API with /think system message - Add min_thinking_tokens (1024) and max_thinking_tokens (2048) - Format reasoning_content with <think> tags for UI parsing - Support dynamic model/provider selection per query - Maintain Ollama fallback for backward compatibility This enables Traditional Graph to use NVIDIA's reasoning models (e.g., nvidia-nemotron-nano-9b-v2) with visible chain-of-thought. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 13:48:44 -07:00
Santosh Bhavani	8974ee9913	Improve Pure RAG UI and add query mode tracking - Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG - Add collapsible reasoning section for <think> tags in answers - Add markdown rendering support (bold/italic) in answers - Fix Pure RAG to properly display answers using llmAnswer state - Hide empty results message for Pure RAG mode - Update metrics sidebar to show query times by mode instead of overall average - Add queryTimesByMode field to metrics API and frontend interfaces - Disable GraphRAG button with "COMING SOON" badge (requires GNN model) - Fix Qdrant vector store document mapping with contentPayloadKey - Update console logs to reflect Qdrant instead of Pinecone - Add @qdrant/js-client-rest dependency to package.json 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 10:33:48 -07:00
Santosh Bhavani	de9c46e97e	Replace Pinecone with Qdrant for ARM64 compatibility - Migrate from Pinecone to Qdrant vector database for native ARM64 support - Add Qdrant service with automatic collection initialization in docker-compose - Implement QdrantService with UUID-based point IDs to meet Qdrant requirements - Update all API routes and frontend components to use Qdrant - Enhance Storage Connections UI with detailed stats (vectors, status, dimensions) - Add icons and tooltips to Vector DB section matching Graph DB UX	2025-10-24 23:16:44 -07:00
Santosh Bhavani	6e90701a9b	Add document tracking to prevent duplicates	2025-10-24 19:45:41 -07:00
Santosh Bhavani	97e4be5772	Add configurable NVIDIA model support	2025-10-24 19:45:36 -07:00
Santosh Bhavani	23b5cbca4c	feat(processor): add parallel processing and NVIDIA API support - Implement parallel chunk processing with configurable concurrency - Add direct NVIDIA API integration bypassing LangChain for better control - Optimize for DGX Spark unified memory with batch processing - Use concurrency of 4 for Ollama, 2 for other providers - Add proper error handling and user stop capability - Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5 - Improve prompt engineering for triple extraction	2025-10-19 20:58:59 -07:00
Santosh Bhavani	12c4777eae	feat(langchain): upgrade to Llama 3.3 Nemotron Super 49B - Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5 - Adjust temperature to 0.6 for better response quality - Increase timeout to 120s for larger model - Add top_p, frequency_penalty, and presence_penalty parameters - Remove deprecated response_format configuration	2025-10-19 20:57:03 -07:00
Santosh Bhavani	a082a8a737	feat(backend): implement LLM-enhanced query method - Add queryWithLLM method to BackendService - Retrieves top K triples from graph and uses LLM to generate answers - Supports configurable LLM model and provider selection - Uses research-backed prompt structure for KG-enhanced RAG - Includes fallback handling for LLM errors	2025-10-19 19:57:12 -07:00
GitLab CI	27fe116e71	chore: Regenerate all playbooks	2025-10-06 17:05:41 +00:00
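
Commit 7742a9f0de describes extracting every edge of a traversal path and tagging each with depth and pathLength, with earlier edges weighted higher. A minimal sketch of that idea follows. The `Edge` and `ScoredEdge` shapes, the `extractEdges` name, and the `1 / (1 + depth)` penalty are assumptions made for illustration; the log does not show the repository's actual types or scoring formula.

```typescript
// Hypothetical shapes: the repository's real types are not shown in this log.
interface Edge {
  subject: string;
  predicate: string;
  object: string;
}

interface ScoredEdge extends Edge {
  depth: number;      // edge position in the path: 0, 1, 2...
  pathLength: number; // total number of edges in the path
  score: number;      // base score after the depth penalty
}

// Emit ALL edges of one traversal path, not just its endpoints.
// Assumed penalty: divide the path's base score by (1 + depth),
// so edges closer to the start of the path score higher.
function extractEdges(path: Edge[], baseScore: number): ScoredEdge[] {
  const result: ScoredEdge[] = [];
  // Numeric index iteration, mirroring the approach named in the commit message.
  for (let i = 0; i < path.length; i++) {
    result.push({
      ...path[i],
      depth: i,
      pathLength: path.length,
      score: baseScore / (1 + i),
    });
  }
  return result;
}
```

Calling `extractEdges` on a two-edge path yields two records with depth 0 and 1, both with pathLength 2, and the first scoring higher than the second.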

12 Commits
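
Commit 3c39506b06 groups query logs by both query text and query mode, using a composite key joined with `|||`. The sketch below illustrates that grouping under stated assumptions. Only the `QueryLogSummary` name and its `queryMode` field come from the commit message; `QueryLogEntry`, `summarizeQueryLogs`, and the remaining field names are hypothetical.

```typescript
// Hypothetical input shape; not taken from the repository.
interface QueryLogEntry {
  query: string;
  queryMode: string;
  executionTimeMs: number;
}

// QueryLogSummary and queryMode are named in the commit; other fields are assumed.
interface QueryLogSummary {
  query: string;
  queryMode: string;
  count: number;
  avgTimeMs: number;
}

const KEY_SEPARATOR = "|||";

function summarizeQueryLogs(entries: QueryLogEntry[]): QueryLogSummary[] {
  const groups = new Map<
    string,
    { query: string; queryMode: string; count: number; totalMs: number }
  >();
  for (const entry of entries) {
    // Same text in a different mode gets a different key, so the two never merge.
    const key = `${entry.query}${KEY_SEPARATOR}${entry.queryMode}`;
    const group = groups.get(key);
    if (group) {
      group.count += 1;
      group.totalMs += entry.executionTimeMs;
    } else {
      groups.set(key, {
        query: entry.query,
        queryMode: entry.queryMode,
        count: 1,
        totalMs: entry.executionTimeMs,
      });
    }
  }
  return Array.from(groups.values()).map((g) => ({
    query: g.query,
    queryMode: g.queryMode,
    count: g.count,
    avgTimeMs: g.totalMs / g.count,
  }));
}
```

Storing `query` and `queryMode` on each group avoids splitting the key back apart, which would be fragile if a query string itself contained `|||`.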