2025-10-06 17:05:41 +00:00
# GNN Model Service
This service provides a REST API for serving predictions from a Graph Neural Network (GNN) model trained to improve Retrieval-Augmented Generation (RAG) performance. It lets you compare GNN-based knowledge graph retrieval with traditional RAG approaches.
## Overview
The service exposes a simple API to:
- Load a pre-trained GNN model that combines graph structures with language models
- Process queries by incorporating graph-structured knowledge
- Return predictions that leverage both text and graph relationships
## Getting Started
### Prerequisites
- Docker and Docker Compose
- The trained model file (created using `train_export.py`)
### Running the Service
The service is included in the main docker-compose configuration. Simply run:
```bash
docker-compose up -d
```
This will start the GNN model service along with other services in the system.
## Training the Model
Before using the service, you need to train the GNN model:
```bash
# Create the models directory if it doesn't exist
mkdir -p models
# Run the training script
python deploy/services/gnn_model/train_export.py --output_dir models
```
This will create the `tech-qa-model.pt` file in the models directory, which the service will load.
## API Endpoints
### Health Check
```
GET /health
```
Returns the health status of the service.
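As a rough illustration, the health check can be polled from Python with only the standard library. This is a sketch under assumptions: the base URL passed in (for example `http://localhost:8000`) is hypothetical, so take the real host and port from your docker-compose configuration. Because the response body is not documented here, the helper treats any HTTP 200 as healthy.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Join the service base URL with the /health path."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as unhealthy.
        return False
```

Calling `is_healthy("http://localhost:8000")` in a loop after `docker-compose up -d` is one way to wait until the model has finished loading.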
### Prediction
```
POST /predict
```
Request body:
```json
{
  "question": "Your question here",
  "context": "Retrieved context information"
}
```
Response:
```json
{
  "question": "Your question here",
  "answer": "The generated answer"
}
```
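The request and response shapes above can be exercised with a small standard-library client. This is a minimal sketch, not the bundled `client_example.py`: the helper names `build_predict_request` and `predict` are hypothetical, the base URL is whatever address your deployment exposes, and sending a `Content-Type: application/json` header is an assumption about what the service expects.

```python
import json
import urllib.request


def build_predict_request(base_url: str, question: str, context: str) -> urllib.request.Request:
    """Build a POST /predict request carrying the documented JSON body."""
    payload = json.dumps({"question": question, "context": context}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, question: str, context: str, timeout: float = 60.0) -> str:
    """Send the request and return the "answer" field of the JSON response."""
    req = build_predict_request(base_url, question, context)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["answer"]
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without a running service.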
## Using the Client Example
A simple client script is provided to test the service:
```bash
python deploy/services/gnn_model/client_example.py \
  --question "What is the capital of France?" \
  --context "France is a country in Western Europe. Its capital is Paris, which is known for the Eiffel Tower."
```
This script also includes a placeholder for comparing the GNN-based approach with a traditional RAG approach.
## Architecture
The GNN model service uses:
- A Graph Attention Network (GAT) to process graph-structured data
- A Language Model (LLM) to generate answers
- A combined architecture (GRetriever) that leverages both components
## Limitations
- The current implementation requires graph construction to be handled separately
- The `create_graph_from_text` function in the service is a placeholder that needs implementation based on your specific graph construction approach
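
To make the second limitation concrete, here is one deliberately naive way to turn text into a graph. It is only an illustrative sketch and not the service's implementation: the function name `build_cooccurrence_graph` is hypothetical, capitalized words stand in for real entity extraction, and the return format (a node list plus a `[sources, targets]` edge index) is an assumption, since the interface expected by `create_graph_from_text` is not documented here.

```python
import re
from itertools import combinations


def build_cooccurrence_graph(text: str) -> tuple[list[str], list[list[int]]]:
    """Build a toy graph from text.

    Capitalized words become nodes; two nodes are linked when they appear
    in the same sentence. Returns (nodes, [sources, targets]).
    """
    nodes: list[str] = []
    index: dict[str, int] = {}
    sources: list[int] = []
    targets: list[int] = []
    seen_edges: set[tuple[int, int]] = set()

    for sentence in re.split(r"[.!?]+", text):
        # Crude entity guess: tokens that start with an uppercase letter.
        entities: list[int] = []
        for token in re.findall(r"\b[A-Z][a-zA-Z]+\b", sentence):
            if token not in index:
                index[token] = len(nodes)
                nodes.append(token)
            if index[token] not in entities:
                entities.append(index[token])

        for a, b in combinations(entities, 2):
            key = (min(a, b), max(a, b))
            if key in seen_edges:
                continue
            seen_edges.add(key)
            # Store both directions so the graph is effectively undirected.
            sources += [a, b]
            targets += [b, a]

    return nodes, [sources, targets]
```

A real deployment would more likely derive nodes and edges from extracted knowledge graph triples and attach embeddings as node features; the sketch only shows the overall shape of the step that has to be filled in.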