mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git synced 2026-04-25 11:23:52 +00:00

History

GitLab CI 27fe116e71 chore: Regenerate all playbooks		2025-10-06 17:05:41 +00:00
..
app.py	chore: Regenerate all playbooks	2025-10-06 17:05:41 +00:00
client_example.py	chore: Regenerate all playbooks	2025-10-06 17:05:41 +00:00
Dockerfile	chore: Regenerate all playbooks	2025-10-06 17:05:41 +00:00
README.md	chore: Regenerate all playbooks	2025-10-06 17:05:41 +00:00
train_export.py	chore: Regenerate all playbooks	2025-10-06 17:05:41 +00:00

README.md

GNN Model Service

This service provides a REST API for serving predictions from a Graph Neural Network (GNN) model trained to enhance RAG (Retrieval Augmented Generation) performance. It allows comparing GNN-based knowledge graph retrieval with traditional RAG approaches.

Overview

The service exposes a simple API to:

Load a pre-trained GNN model that combines graph structures with language models
Process queries by incorporating graph-structured knowledge
Return predictions that leverage both text and graph relationships

Getting Started

Prerequisites

Docker and Docker Compose
The trained model file (created using train_export.py)

Running the Service

The service is included in the main docker-compose configuration. Simply run:

docker-compose up -d

This will start the GNN model service along with other services in the system.

Training the Model

Before using the service, you need to train the GNN model:

# Create the models directory if it doesn't exist
mkdir -p models

# Run the training script
python deploy/services/gnn_model/train_export.py --output_dir models

This will create the tech-qa-model.pt file in the models directory, which the service will load.

API Endpoints

Health Check

GET /health

Returns the health status of the service.

Prediction

POST /predict

Request body:

{
  "question": "Your question here",
  "context": "Retrieved context information"
}

Response:

{
  "question": "Your question here",
  "answer": "The generated answer"
}

Using the Client Example

A simple client script is provided to test the service:

python deploy/services/gnn_model/client_example.py --question "What is the capital of France?" --context "France is a country in Western Europe. Its capital is Paris, which is known for the Eiffel Tower."

This script also includes a placeholder for comparing the GNN-based approach with a traditional RAG approach.

Architecture

The GNN model service uses:

A Graph Attention Network (GAT) to process graph structured data
A Language Model (LLM) to generate answers
A combined architecture (GRetriever) that leverages both components

Limitations

The current implementation requires graph construction to be handled separately
The create_graph_from_text function in the service is a placeholder that needs implementation based on your specific graph construction approach