mirror of
https://github.com/NVIDIA/dgx-spark-playbooks.git
synced 2026-04-22 18:13:52 +00:00
# Backend

FastAPI Python application serving as the API backend for the chatbot demo.

## Overview

The backend handles:
- Multi-model LLM integration (local models)
- Document ingestion and vector storage for RAG
- WebSocket connections for real-time chat streaming
- Image processing and analysis
- Chat history management
- Model Context Protocol (MCP) integration

## Key Features

- **Multi-model support**: Integrates various LLM providers and local models
- **RAG pipeline**: Document processing, embedding generation, and retrieval
- **Streaming responses**: Real-time token streaming via WebSocket
- **Image analysis**: Multi-modal capabilities for image understanding
- **Vector database**: Efficient similarity search for document retrieval
- **Session management**: Chat history and context persistence
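
The retrieval step behind the RAG pipeline and similarity search can be illustrated with a minimal sketch in plain Python. This is not the backend's actual implementation: the real service relies on an embedding model and a vector database, and the names `cosine_similarity` and `top_k`, along with the toy three-dimensional vectors, are assumptions made for illustration only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Rank stored chunks by similarity to the query and keep the best k."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: chunk id -> embedding.
# Real embeddings typically have hundreds of dimensions.
index = {
    "doc1#0": [1.0, 0.0, 0.0],
    "doc1#1": [0.0, 1.0, 0.0],
    "doc2#0": [0.7, 0.7, 0.0],
}

results = top_k([1.0, 0.1, 0.0], index, k=2)
print([chunk_id for chunk_id, _ in results])  # ['doc1#0', 'doc2#0']
```

A vector database performs the same ranking, usually with approximate nearest-neighbor indexes so that it does not have to score every stored chunk.
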
## Architecture

FastAPI application with async support, integrated with vector databases for RAG functionality and WebSocket endpoints for real-time communication.
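
The async streaming pattern can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the backend's code: `fake_llm_tokens` is a hypothetical stand-in for a model, and the `send` callback stands in for a WebSocket send call (in a FastAPI endpoint this would typically be `await websocket.send_text(token)`).

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def fake_llm_tokens(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a model: yields the prompt back word by word."""
    for word in prompt.split():
        await asyncio.sleep(0)  # yield control, as a real model call would while waiting
        yield word


async def stream_reply(prompt: str, send: Callable[[str], Awaitable[None]]) -> int:
    """Forward each token to the client as soon as it is produced; return the token count."""
    count = 0
    async for token in fake_llm_tokens(prompt):
        await send(token)
        count += 1
    return count


async def demo() -> list[str]:
    received: list[str] = []

    async def send(token: str) -> None:
        # Stand-in for a WebSocket send; here the tokens are simply collected.
        received.append(token)

    await stream_reply("tokens arrive one at a time", send)
    return received


print(asyncio.run(demo()))  # ['tokens', 'arrive', 'one', 'at', 'a', 'time']
```

Because every step is awaited, the event loop stays free to serve other connections while one reply is being generated.
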
## Docker Troubleshooting

### Container Issues
- **Port conflicts**: Ensure port 8000 is not in use
- **Memory issues**: Backend requires significant RAM for model loading
- **Startup failures**: Check if required environment variables are set
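
The port and environment checks above can be scripted. The following is a minimal sketch using only the Python standard library. The variable names in `REQUIRED` are hypothetical placeholders, since the variables the backend actually needs are not listed here, and should be replaced with the ones your deployment uses.

```python
import os
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        return sock.connect_ex((host, port)) == 0


def missing_env_vars(required: list[str], env=None) -> list[str]:
    """Return the names from `required` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Hypothetical variable names, used for illustration only.
REQUIRED = ["MODEL_PATH", "VECTOR_DB_URL"]

if port_in_use(8000):
    print("Port 8000 is already in use")

missing = missing_env_vars(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Run the port check on the host before starting the container. Run the environment check where the variables are expected to be set, for example inside the container.
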
### Model Loading Problems
```bash
# Check model download status
docker logs backend | grep -i "model"

# Verify model files exist
docker exec -it backend ls -la /app/models/

# Check available disk space
docker exec -it backend df -h
```
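
The disk-space check can also be done programmatically before a large model download. This is a minimal sketch, assuming it runs on the filesystem where the models are stored. The `has_room_for` helper and its 20% headroom factor are illustrative choices, not values taken from the backend.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Return the free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def has_room_for(model_size_gib: float, path: str = "/", headroom: float = 1.2) -> bool:
    """Return True if free space covers the model size plus a safety margin."""
    return free_gib(path) >= model_size_gib * headroom


print(f"Free space: {free_gib('/'):.1f} GiB")
```
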
### Common Commands
```bash
# View backend logs
docker logs -f backend

# Restart backend container
docker restart backend

# Rebuild backend
docker-compose up --build -d backend

# Access container shell
docker exec -it backend /bin/bash

# Check API health
curl http://localhost:8000/health
```
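
After a restart or rebuild, the backend may need some time before the health endpoint responds. The sketch below polls the same URL as the `curl` command, using only the standard library. The retry count and delay are arbitrary assumptions, and the function treats only an HTTP 200 response as healthy.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, attempts: int = 5, delay: float = 1.0) -> bool:
    """Poll `url` until it answers with HTTP 200, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # backend not reachable yet (connection refused, timeout, HTTP error)
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_for_health("http://localhost:8000/health", attempts=30, delay=2.0)` returns `True` as soon as the backend answers, or `False` if it does not become healthy within the allotted attempts.
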
### Performance Issues
- **Slow responses**: Check GPU availability and model size
- **Memory errors**: Increase Docker memory limit or use smaller models
- **Connection timeouts**: Verify WebSocket connections and firewall settings