dgx-spark-playbooks

Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git, synced 2026-06-21 05:39:31 +00:00

Author	SHA1	Message	Date
Santosh Bhavani	8974ee9913	Improve Pure RAG UI and add query mode tracking - Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG - Add collapsible reasoning section for <think> tags in answers - Add markdown rendering support (bold/italic) in answers - Fix Pure RAG to properly display answers using llmAnswer state - Hide empty results message for Pure RAG mode - Update metrics sidebar to show query times by mode instead of overall average - Add queryTimesByMode field to metrics API and frontend interfaces - Disable GraphRAG button with "COMING SOON" badge (requires GNN model) - Fix Qdrant vector store document mapping with contentPayloadKey - Update console logs to reflect Qdrant instead of Pinecone - Add @qdrant/js-client-rest dependency to package.json 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 10:33:48 -07:00
Santosh Bhavani	de9c46e97e	Replace Pinecone with Qdrant for ARM64 compatibility - Migrate from Pinecone to Qdrant vector database for native ARM64 support - Add Qdrant service with automatic collection initialization in docker-compose - Implement QdrantService with UUID-based point IDs to meet Qdrant requirements - Update all API routes and frontend components to use Qdrant - Enhance Storage Connections UI with detailed stats (vectors, status, dimensions) - Add icons and tooltips to Vector DB section matching Graph DB UX	2025-10-24 23:16:44 -07:00
Santosh Bhavani	6e90701a9b	Add document tracking to prevent duplicates	2025-10-24 19:45:41 -07:00
Santosh Bhavani	97e4be5772	Add configurable NVIDIA model support	2025-10-24 19:45:36 -07:00
Santosh Bhavani	23b5cbca4c	feat(processor): add parallel processing and NVIDIA API support - Implement parallel chunk processing with configurable concurrency - Add direct NVIDIA API integration bypassing LangChain for better control - Optimize for DGX Spark unified memory with batch processing - Use concurrency of 4 for Ollama, 2 for other providers - Add proper error handling and user stop capability - Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5 - Improve prompt engineering for triple extraction	2025-10-19 20:58:59 -07:00
Santosh Bhavani	12c4777eae	feat(langchain): upgrade to Llama 3.3 Nemotron Super 49B - Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5 - Adjust temperature to 0.6 for better response quality - Increase timeout to 120s for larger model - Add top_p, frequency_penalty, and presence_penalty parameters - Remove deprecated response_format configuration	2025-10-19 20:57:03 -07:00
Santosh Bhavani	a082a8a737	feat(backend): implement LLM-enhanced query method - Add queryWithLLM method to BackendService - Retrieves top K triples from graph and uses LLM to generate answers - Supports configurable LLM model and provider selection - Uses research-backed prompt structure for KG-enhanced RAG - Includes fallback handling for LLM errors	2025-10-19 19:57:12 -07:00
GitLab CI	27fe116e71	chore: Regenerate all playbooks	2025-10-06 17:05:41 +00:00

8 Commits