- Add query mode badge to answer section showing Pure RAG/Traditional Graph/GraphRAG
- Add collapsible reasoning section for <think> tags in answers
- Add markdown rendering support (bold/italic) in answers
- Fix Pure RAG to properly display answers using llmAnswer state
- Hide empty results message for Pure RAG mode
- Update metrics sidebar to show query times by mode instead of overall average
- Add queryTimesByMode field to metrics API and frontend interfaces
- Disable the GraphRAG button and show a "COMING SOON" badge (GraphRAG requires a GNN model)
- Fix Qdrant vector store document mapping with contentPayloadKey
- Update console logs to reflect Qdrant instead of Pinecone
- Add @qdrant/js-client-rest dependency to package.json
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
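
For reference, the collapsible reasoning section depends on separating `<think>` content from the visible answer. The following is a minimal sketch of that split, not the code from this commit; `splitReasoning` and `ParsedAnswer` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper: the names and return shape are assumptions,
// not the project's actual implementation.
interface ParsedAnswer {
  reasoning: string | null; // text found inside <think>...</think>, if any
  answer: string; // remaining text, shown as the visible answer
}

function splitReasoning(raw: string): ParsedAnswer {
  const parts: string[] = [];
  // Collect every <think> block and remove it from the visible answer.
  const answer = raw
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      parts.push(inner.trim());
      return "";
    })
    .trim();
  return {
    reasoning: parts.length > 0 ? parts.join("\n\n") : null,
    answer,
  };
}
```

A UI could render `reasoning` inside a collapsed element and pass `answer` to the markdown renderer; when `reasoning` is `null`, the collapsible section can be omitted.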
- Migrate from Pinecone to Qdrant vector database for native ARM64 support
- Add Qdrant service with automatic collection initialization in docker-compose
- Implement QdrantService with UUID-based point IDs to meet Qdrant requirements
- Update all API routes and frontend components to use Qdrant
- Enhance Storage Connections UI with detailed stats (vectors, status, dimensions)
- Add icons and tooltips to Vector DB section matching Graph DB UX
- Implement parallel chunk processing with configurable concurrency
- Add direct NVIDIA API integration bypassing LangChain for better control
- Optimize for DGX Spark unified memory with batch processing
- Use concurrency of 4 for Ollama, 2 for other providers
- Add error handling and let users stop processing while it is running
- Update NVIDIA model to Llama 3.3 Nemotron Super 49B v1.5
- Improve prompt engineering for triple extraction
- Update LangChain service to use Llama 3.3 Nemotron Super 49B v1.5
- Adjust temperature to 0.6 for better response quality
- Increase timeout to 120s for larger model
- Add top_p, frequency_penalty, and presence_penalty parameters
- Remove deprecated response_format configuration
- Add queryWithLLM method to BackendService
  - Retrieves top-K triples from graph and uses LLM to generate answers
  - Supports configurable LLM model and provider selection
  - Uses research-backed prompt structure for KG-enhanced RAG
  - Includes fallback handling for LLM errors
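
The parallel chunk processing described above (configurable concurrency, 4 for Ollama and 2 for other providers, with a user stop option) can be sketched roughly as follows. This is an illustration under assumptions, not the committed code: `concurrencyFor`, `mapWithConcurrency`, the `shouldStop` callback, and the `"ollama"` provider string are all hypothetical. Only the 4 and 2 limits come from the notes.

```typescript
// Hypothetical sketch; only the 4 / 2 limits are taken from the commit notes.
function concurrencyFor(provider: string): number {
  // Assumes the provider is identified by the string "ollama".
  return provider === "ollama" ? 4 : 2;
}

async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
  shouldStop: () => boolean = () => false,
): Promise<(R | undefined)[]> {
  // Slots stay undefined for items skipped after a stop request.
  const results = new Array<R | undefined>(items.length).fill(undefined);
  let next = 0;

  // Each runner pulls the next unclaimed index until the work runs out
  // or a stop is requested; in-flight items are allowed to finish.
  async function run(): Promise<void> {
    while (next < items.length && !shouldStop()) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from(
    { length: Math.min(limit, items.length) },
    () => run(),
  );
  await Promise.all(runners);
  return results;
}
```

In this sketch an error thrown by `worker` rejects the whole call; a real pipeline might instead catch errors per chunk and record them alongside the results.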