# GNN Model Service This service provides a REST API for serving predictions from a Graph Neural Network (GNN) model trained to enhance RAG (Retrieval Augmented Generation) performance. It allows comparing GNN-based knowledge graph retrieval with traditional RAG approaches. ## Overview The service exposes a simple API to: - Load a pre-trained GNN model that combines graph structures with language models - Process queries by incorporating graph-structured knowledge - Return predictions that leverage both text and graph relationships ## Getting Started ### Prerequisites - Docker and Docker Compose - The trained model file (created using `train_export.py`) ### Running the Service The service is included in the main docker-compose configuration. Simply run: ```bash docker-compose up -d ``` This will start the GNN model service along with other services in the system. ## Training the Model Before using the service, you need to train the GNN model: ```bash # Create the models directory if it doesn't exist mkdir -p models # Run the training script python deploy/services/gnn_model/train_export.py --output_dir models ``` This will create the `tech-qa-model.pt` file in the models directory, which the service will load. ## API Endpoints ### Health Check ``` GET /health ``` Returns the health status of the service. ### Prediction ``` POST /predict ``` Request body: ```json { "question": "Your question here", "context": "Retrieved context information" } ``` Response: ```json { "question": "Your question here", "answer": "The generated answer" } ``` ## Using the Client Example A simple client script is provided to test the service: ```bash python deploy/services/gnn_model/client_example.py --question "What is the capital of France?" --context "France is a country in Western Europe. Its capital is Paris, which is known for the Eiffel Tower." ``` This script also includes a placeholder for comparing the GNN-based approach with a traditional RAG approach. ## Architecture The GNN model service uses: - A Graph Attention Network (GAT) to process graph structured data - A Language Model (LLM) to generate answers - A combined architecture (GRetriever) that leverages both components ## Limitations - The current implementation requires graph construction to be handled separately - The `create_graph_from_text` function in the service is a placeholder that needs implementation based on your specific graph construction approach