dgx-spark-playbooks

mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git synced 2026-06-19 04:49:35 +00:00

Author	SHA1	Message	Date
Santosh Bhavani	69cd444ea7	Add NVIDIA API support with thinking tokens for Traditional Graph - Integrate NVIDIA API as alternative to Ollama for graph queries - Implement thinking tokens API with /think system message - Add min_thinking_tokens (1024) and max_thinking_tokens (2048) - Format reasoning_content with <think> tags for UI parsing - Support dynamic model/provider selection per query - Maintain Ollama fallback for backward compatibility This enables Traditional Graph to use NVIDIA's reasoning models (e.g., nvidia-nemotron-nano-9b-v2) with visible chain-of-thought. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 13:48:44 -07:00
Santosh Bhavani	0d5b85cdc5	Optimize Docker build with multi-stage caching - Implement multi-stage Dockerfile (deps → builder → runner) - Add BuildKit cache mounts for pnpm store and Next.js build cache - Enable Next.js standalone output for smaller production images - Create non-root user (nextjs:nodejs) with proper permissions - Enhance .dockerignore to exclude more build artifacts - Build time reduced from 225+ seconds to ~35 seconds 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 13:48:36 -07:00
Santosh Bhavani	8974ee9913	Improve Pure RAG UI and add query mode tracking - Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG - Add collapsible reasoning section for <think> tags in answers - Add markdown rendering support (bold/italic) in answers - Fix Pure RAG to properly display answers using llmAnswer state - Hide empty results message for Pure RAG mode - Update metrics sidebar to show query times by mode instead of overall average - Add queryTimesByMode field to metrics API and frontend interfaces - Disable GraphRAG button with "COMING SOON" badge (requires GNN model) - Fix Qdrant vector store document mapping with contentPayloadKey - Update console logs to reflect Qdrant instead of Pinecone - Add @qdrant/js-client-rest dependency to package.json 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 10:33:48 -07:00
Santosh Bhavani	de9c46e97e	Replace Pinecone with Qdrant for ARM64 compatibility - Migrate from Pinecone to Qdrant vector database for native ARM64 support - Add Qdrant service with automatic collection initialization in docker-compose - Implement QdrantService with UUID-based point IDs to meet Qdrant requirements - Update all API routes and frontend components to use Qdrant - Enhance Storage Connections UI with detailed stats (vectors, status, dimensions) - Add icons and tooltips to Vector DB section matching Graph DB UX	2025-10-24 23:16:44 -07:00
Santosh Bhavani	cfebbc7b04	Add stop.sh script	2025-10-24 22:03:47 -07:00
Santosh Bhavani	eec479197b	Add Docker permission validation	2025-10-24 22:02:23 -07:00
Santosh Bhavani	07d4107da4	Merge remote-tracking branch 'upstream/main'	2025-10-24 19:51:27 -07:00
Santosh Bhavani	6e90701a9b	Add document tracking to prevent duplicates	2025-10-24 19:45:41 -07:00
Santosh Bhavani	97e4be5772	Add configurable NVIDIA model support	2025-10-24 19:45:36 -07:00
Santosh Bhavani	215ce25c05	Update NVIDIA models to Nemotron Super/Nano	2025-10-24 19:45:31 -07:00
GitLab CI	6a34e25169	chore: Regenerate all playbooks	2025-10-22 19:44:23 +00:00
GitLab CI	ab0cb00e0b	chore: Regenerate all playbooks	2025-10-22 18:54:29 +00:00
GitLab CI	d301ca4f84	chore: Regenerate all playbooks	2025-10-22 16:17:25 +00:00
GitLab CI	15beb4e9fc	chore: Regenerate all playbooks	2025-10-21 13:09:58 +00:00
GitLab CI	c66572a74b	chore: Regenerate all playbooks	2025-10-21 03:53:26 +00:00
GitLab CI	8ca84d63e9	chore: Regenerate all playbooks	2025-10-21 03:50:02 +00:00
GitLab CI	3c3578c620	chore: Regenerate all playbooks	2025-10-21 03:40:46 +00:00
GitLab CI	11f2a77ea7	chore: Regenerate all playbooks	2025-10-21 00:57:26 +00:00
Santosh Bhavani	23b5cbca4c	feat(processor): add parallel processing and NVIDIA API support - Implement parallel chunk processing with configurable concurrency - Add direct NVIDIA API integration bypassing LangChain for better control - Optimize for DGX Spark unified memory with batch processing - Use concurrency of 4 for Ollama, 2 for other providers - Add proper error handling and user stop capability - Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5 - Improve prompt engineering for triple extraction	2025-10-19 20:58:59 -07:00
Santosh Bhavani	12c4777eae	feat(langchain): upgrade to Llama 3.3 Nemotron Super 49B - Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5 - Adjust temperature to 0.6 for better response quality - Increase timeout to 120s for larger model - Add top_p, frequency_penalty, and presence_penalty parameters - Remove deprecated response_format configuration	2025-10-19 20:57:03 -07:00
Santosh Bhavani	5be2ad78bf	feat(ui): upgrade to NVIDIA Llama 3.3 Nemotron Super 49B - Update default NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5 - Update model display name and description - Replace deprecated 70B model with newer 49B Super model	2025-10-19 20:57:00 -07:00
Santosh Bhavani	529debb633	perf(docker): increase Ollama parallel processing for DGX - Increase OLLAMA_NUM_PARALLEL from 1 to 4 requests - Leverage DGX Spark's unified memory architecture - Improve throughput for concurrent inference requests	2025-10-19 20:56:58 -07:00
Santosh Bhavani	ffb0688a63	refactor(ui): improve database connection button UI - Replace button text with icons for compact display - Add tooltips to Refresh, Disconnect, and Clear buttons - Improve button spacing and alignment - Import LogOut icon for disconnect action	2025-10-19 19:57:17 -07:00
Santosh Bhavani	37ee4b63f1	feat(ui): display LLM-generated answers in RAG page - Switch traditional graph search to use LLM-enhanced endpoint - Display LLM-generated answer prominently above triples - Add llmAnswer state to store and display generated answers - Update results section to show 'Supporting Triples' when answer exists - Pass selected LLM model and provider to API - Improve debug logging for query modes and results	2025-10-19 19:57:15 -07:00
Santosh Bhavani	1bb48b9818	feat(component): add LLM selector to RAG query interface - Integrate LLMSelectorCompact into RAG query component - Make query mode cards more compact to accommodate LLM selector - Update styling for better space utilization - Add LLM selection section with descriptive label	2025-10-19 19:57:15 -07:00
Santosh Bhavani	db1e7760f6	feat(component): add compact LLM selector component - Create LLMSelectorCompact component for model selection - Support Ollama and NVIDIA models - Load available models from localStorage - Persist selected model and dispatch selection events - Compact design suitable for inline placement	2025-10-19 19:57:14 -07:00
Santosh Bhavani	156bfb2e8d	feat(api): update metrics route for multi-database support - Update metrics endpoint to use getGraphDbService utility - Support both ArangoDB and Neo4j database types - Initialize graph database based on selected type - Retrieve graph stats from the active database	2025-10-19 19:57:13 -07:00
Santosh Bhavani	a082a8a737	feat(backend): implement LLM-enhanced query method - Add queryWithLLM method to BackendService - Retrieves top K triples from graph and uses LLM to generate answers - Supports configurable LLM model and provider selection - Uses research-backed prompt structure for KG-enhanced RAG - Includes fallback handling for LLM errors	2025-10-19 19:57:12 -07:00
Santosh Bhavani	d842dc996a	feat(api): add LLM-enhanced graph query endpoint - Create new /api/graph-query-llm endpoint for graph search + LLM generation - Retrieves triples using graph search and generates answers using LLM - Supports both traditional and vector-based graph search - Makes traditional graph search comparable to RAG for benchmarking	2025-10-19 19:57:11 -07:00
Santosh Bhavani	8c1d2ae9f3	feat(docker): add vector search services and GPU configuration - Add optional Pinecone and sentence-transformers services for vector search - Configure NVIDIA GPU support with proper environment variables - Add new environment variables for embeddings and Pinecone - Add docker compose profiles to optionally enable vector-search - Improve CUDA configuration for Ollama service - Add pinecone-net network for service communication	2025-10-19 19:56:55 -07:00
Santosh Bhavani	9dc734eee5	Add NVIDIA_API_KEY support and update ollama to v0.12.6	2025-10-19 14:52:24 -05:00
GitLab CI	752eada0cb	chore: Regenerate all playbooks	2025-10-18 21:48:15 +00:00
GitLab CI	505cacdbd6	chore: Regenerate all playbooks	2025-10-18 21:28:42 +00:00
GitLab CI	a6f94052b1	chore: Regenerate all playbooks	2025-10-17 17:29:40 +00:00
GitLab CI	3ed5b3b073	chore: Regenerate all playbooks	2025-10-17 00:58:35 +00:00
GitLab CI	0d9108cf14	chore: Regenerate all playbooks	2025-10-16 21:25:27 +00:00
GitLab CI	7457f31016	chore: Regenerate all playbooks	2025-10-16 21:14:27 +00:00
GitLab CI	058b5b70b2	chore: Regenerate all playbooks	2025-10-16 20:25:06 +00:00
GitLab CI	c8ab690414	chore: Regenerate all playbooks	2025-10-16 19:02:54 +00:00
GitLab CI	b4a071c721	chore: Regenerate all playbooks	2025-10-16 18:49:21 +00:00
GitLab CI	5cd142bc41	chore: Regenerate all playbooks	2025-10-16 18:35:50 +00:00
GitLab CI	2ff64d7265	chore: Regenerate all playbooks	2025-10-16 17:29:56 +00:00
GitLab CI	6dd7697210	chore: Regenerate all playbooks	2025-10-16 14:13:04 +00:00
GitLab CI	99c6530528	chore: Regenerate all playbooks	2025-10-16 13:05:16 +00:00
GitLab CI	2371189ab9	chore: Regenerate all playbooks	2025-10-15 13:32:10 +00:00
GitLab CI	8a12782b17	chore: Regenerate all playbooks	2025-10-14 14:11:25 +00:00
GitLab CI	159d5e2b24	chore: Regenerate all playbooks	2025-10-14 01:21:39 +00:00
GitLab CI	34239a8313	chore: Regenerate all playbooks	2025-10-14 00:40:26 +00:00
GitLab CI	e17deb3167	chore: Regenerate all playbooks	2025-10-13 22:21:04 +00:00
GitLab CI	e9a3f2a759	chore: Regenerate all playbooks	2025-10-13 17:36:51 +00:00

191 Commits