# GNN Model Service

This service provides a REST API for serving predictions from a Graph Neural Network (GNN) model trained to improve Retrieval-Augmented Generation (RAG) performance. It lets you compare GNN-based knowledge graph retrieval with traditional RAG approaches.

## Overview

The service exposes a simple API to:
- Load a pre-trained GNN model that combines graph structures with language models
- Process queries by incorporating graph-structured knowledge
- Return predictions that leverage both text and graph relationships

## Getting Started

### Prerequisites

- Docker and Docker Compose
- The trained model file (created using `train_export.py`; see "Training the Model")

### Running the Service

The service is included in the main docker-compose configuration. Simply run:

```bash
docker-compose up -d
```

This will start the GNN model service along with other services in the system.

## Training the Model

Before using the service, you need to train the GNN model:

```bash
# Create the models directory if it doesn't exist
mkdir -p models

# Run the training script
python deploy/services/gnn_model/train_export.py --output_dir models
```

This will create the `tech-qa-model.pt` file in the models directory, which the service will load.
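
Because the service depends on this file, a quick pre-flight check can save a failed startup. The following is a minimal sketch, assuming the default `models` directory used in the command above; `model_file_ready` is a hypothetical helper name, not part of the repository.

```python
from pathlib import Path


def model_file_ready(models_dir: str = "models",
                     filename: str = "tech-qa-model.pt") -> bool:
    """Return True if the exported model file exists and is not empty.

    Hypothetical helper: the directory and file name come from the
    training step above; adjust them if you used a different --output_dir.
    """
    path = Path(models_dir) / filename
    return path.is_file() and path.stat().st_size > 0
```

Calling `model_file_ready()` from the directory where you ran the training script should return `True` once training has finished.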

## API Endpoints

### Health Check

```
GET /health
```

Returns the health status of the service.
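
A minimal sketch of calling this endpoint from Python using only the standard library. The base URL is an assumption (this README does not state the host or port), and the shape of the response body is not documented here, so the sketch only inspects the HTTP status.

```python
import urllib.error
import urllib.request

# Assumption: the service address is not given in this README; adjust as needed.
BASE_URL = "http://localhost:8000"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Return True if GET /health answers with an HTTP 2xx status."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

For example, `check_health()` can be polled after `docker-compose up -d` to wait until the service is reachable.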

### Prediction

```
POST /predict
```

Request body:
```json
{
  "question": "Your question here",
  "context": "Retrieved context information"
}
```

Response:
```json
{
  "question": "Your question here",
  "answer": "The generated answer"
}
```
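
The request and response shapes above can be exercised with a short standard-library client. This is a sketch rather than the bundled `client_example.py`: the base URL, the timeout, and the helper names are assumptions, and only the `question`, `context`, and `answer` fields come from this README.

```python
import json
import urllib.request

# Assumption: the service address is not given in this README; adjust as needed.
BASE_URL = "http://localhost:8000"


def build_predict_request(question: str, context: str,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /predict request with the JSON body documented above."""
    body = json.dumps({"question": question, "context": context}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_predict_response(raw: bytes) -> str:
    """Extract the "answer" field from the documented response body."""
    payload = json.loads(raw.decode("utf-8"))
    return payload["answer"]


def predict(question: str, context: str, base_url: str = BASE_URL,
            timeout: float = 60.0) -> str:
    """Send a question and its retrieved context, return the generated answer."""
    req = build_predict_request(question, context, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_predict_response(resp.read())
```

Splitting request building from response parsing keeps both halves easy to check without a running service; `predict(...)` is the only function that touches the network.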

## Using the Client Example

A simple client script is provided to test the service:

```bash
python deploy/services/gnn_model/client_example.py --question "What is the capital of France?" --context "France is a country in Western Europe. Its capital is Paris, which is known for the Eiffel Tower."
```

This script also includes a placeholder for comparing the GNN-based approach with a traditional RAG approach.

## Architecture

The GNN model service uses:
- A Graph Attention Network (GAT) to process graph-structured data
- A Large Language Model (LLM) to generate answers
- A combined architecture (GRetriever) that leverages both components

## Limitations

- The current implementation requires graph construction to be handled separately
- The `create_graph_from_text` function in the service is a placeholder that needs implementation based on your specific graph construction approach
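
To illustrate what one graph construction approach might look like, here is a deliberately naive sketch that links capitalized words co-occurring in the same sentence. Everything in it is an assumption: the function name `cooccurrence_graph_from_text` is hypothetical, the signature and return type that the service's `create_graph_from_text` expects are not documented in this README, and a real pipeline would more likely use extracted knowledge graph triples. The `[sources, targets]` edge layout mirrors the COO `edge_index` convention used by libraries such as PyTorch Geometric.

```python
import re
from itertools import combinations


def cooccurrence_graph_from_text(text: str):
    """Build a toy undirected graph from raw text (illustrative only).

    Nodes are capitalized words in first-seen order; two nodes are
    connected when they appear in the same sentence.
    Returns (nodes, edge_index) with edge_index = [sources, targets].
    """
    nodes = []   # node id -> token
    index = {}   # token -> node id
    edges = set()
    # Naive sentence split on whitespace following ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        ids = []
        for token in re.findall(r"\b[A-Z][A-Za-z]+\b", sentence):
            if token not in index:
                index[token] = len(nodes)
                nodes.append(token)
            if index[token] not in ids:
                ids.append(index[token])
        # Store both directions so the graph is undirected.
        for a, b in combinations(ids, 2):
            edges.add((a, b))
            edges.add((b, a))
    ordered = sorted(edges)
    edge_index = [[s for s, _ in ordered], [t for _, t in ordered]]
    return nodes, edge_index
```

Note that the capitalization heuristic also picks up sentence-initial words such as "Its", which is one reason a proper entity and relation extraction step is needed before the result is useful to the model.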