# Backend

A FastAPI (Python) application that serves as the API backend for the chatbot demo.

## Overview

The backend handles:
- Multi-model LLM integration (local models)
- Document ingestion and vector storage for RAG
- WebSocket connections for real-time chat streaming
- Image processing and analysis
- Chat history management
- Model Context Protocol (MCP) integration

## Key Features

- **Multi-model support**: Integrates various LLM providers and local models
- **RAG pipeline**: Document processing, embedding generation, and retrieval
- **Streaming responses**: Real-time token streaming via WebSocket
- **Image analysis**: Multi-modal capabilities for image understanding
- **Vector database**: Efficient similarity search for document retrieval
- **Session management**: Chat history and context persistence
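
To make the retrieval step of the RAG pipeline concrete, here is a minimal sketch of similarity search over stored embeddings. It is not the backend's actual implementation: the function names (`cosine_similarity`, `top_k`), the in-memory `store` dictionary, and the toy 3-dimensional vectors are all assumptions for illustration. A real vector database would use an index rather than a linear scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], store: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k (chunk_id, score) pairs most similar to the query, best first."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical); real ones would come from an embedding model.
store = {
    "intro": [1.0, 0.0, 0.0],
    "setup": [0.0, 1.0, 0.0],
    "faq": [0.7, 0.7, 0.0],
}
results = top_k([1.0, 0.1, 0.0], store, k=2)
print([chunk_id for chunk_id, _ in results])  # ['intro', 'faq']
```

The retrieved chunks would then be added to the prompt as context before the model generates a reply.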

## Architecture

The backend is an asynchronous FastAPI application. It uses a vector database for RAG retrieval and exposes WebSocket endpoints for real-time communication.
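
The streaming pattern can be sketched as a loop that forwards each token to the client as soon as it is produced. The sketch below is written against a duck-typed connection object so that it runs with the standard library alone. `generate_tokens`, `FakeConnection`, `stream_reply`, and the `[DONE]` end marker are hypothetical names and conventions, not taken from this codebase. In a FastAPI endpoint, the `WebSocket` object provides an awaitable `send_text` method that could take the place of the fake connection.

```python
import asyncio
from typing import AsyncIterator


async def generate_tokens(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a model: yields the prompt back word by word."""
    for word in prompt.split():
        await asyncio.sleep(0)  # hand control back to the event loop between tokens
        yield word


class FakeConnection:
    """Records sent messages; stands in for a WebSocket connection in this sketch."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    async def send_text(self, data: str) -> None:
        self.sent.append(data)


async def stream_reply(connection, prompt: str) -> int:
    """Send each token as soon as it is produced, then an end marker; return the token count."""
    count = 0
    async for token in generate_tokens(prompt):
        await connection.send_text(token)
        count += 1
    await connection.send_text("[DONE]")  # end-of-stream marker (assumed convention)
    return count


conn = FakeConnection()
sent_count = asyncio.run(stream_reply(conn, "hello from the backend"))
print(sent_count, conn.sent)
```

Because the loop awaits between tokens, a single worker can serve other connections while one reply is still being generated.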

## Docker Troubleshooting

### Container Issues
- **Port conflicts**: Ensure port 8000 is not in use
- **Memory issues**: Backend requires significant RAM for model loading
- **Startup failures**: Check if required environment variables are set
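
The port and environment checks above can be scripted before starting the container. This is a minimal sketch using only the standard library. The variable names `MODEL_PATH` and `VECTOR_DB_URL` are placeholders, since the variables this backend actually requires are not listed here.

```python
import os
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def missing_env(names: list[str]) -> list[str]:
    """Return the names that are unset or empty in the current environment."""
    return [name for name in names if not os.environ.get(name)]


if __name__ == "__main__":
    # Placeholder names (assumption); replace with the variables the backend reads.
    required = ["MODEL_PATH", "VECTOR_DB_URL"]
    print("port 8000 in use:", port_in_use(8000))
    print("missing variables:", missing_env(required))
```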

### Model Loading Problems
```bash
# Check model download status (2>&1 includes the container's stderr in the search)
docker logs backend 2>&1 | grep -i "model"

# Verify model files exist
docker exec -it backend ls -la /app/models/

# Check available disk space
docker exec -it backend df -h
```

### Common Commands
```bash
# View backend logs
docker logs -f backend

# Restart backend container
docker restart backend

# Rebuild backend
docker-compose up --build -d backend

# Access container shell
docker exec -it backend /bin/bash

# Check API health
curl http://localhost:8000/health
```
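
After a restart or rebuild, the backend may take a while to answer while models load. The health check above can be wrapped in a polling loop. The following is a sketch using only the standard library. The function name `wait_for_health` and its default timeout and interval are assumptions, and it treats any HTTP 200 from the `/health` endpoint as healthy without inspecting the response body.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll url until it answers with HTTP 200 or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable or non-2xx status; the container may still be starting
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_health("http://localhost:8000/health")` returns `True` once the endpoint responds and `False` if it stays unreachable for the full timeout.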

### Performance Issues
- **Slow responses**: Check GPU availability and model size
- **Memory errors**: Increase Docker memory limit or use smaller models
- **Connection timeouts**: Verify WebSocket connections and firewall settings