mirror of
https://github.com/NVIDIA/dgx-spark-playbooks.git
synced 2026-06-23 14:49:31 +00:00
# Local Healthcare Agent — Setup Guide

From-scratch setup for a DGX Station. Produces a working multi-agent clinical analysis system with OpenFold3 protein structure prediction.

**Hardware**: DGX Station (284 GB VRAM, aarch64)

---

## Prerequisites

- Docker + NVIDIA Container Toolkit (`docker info --format '{{.ServerVersion}}'` ≥ 23.0.1)
- **Node.js v22+** (the DGX Station ships with v18; upgrade with `curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash - && sudo apt-get install -y nodejs`)
- OpenShell CLI ≥ 0.0.33
- **At least 200 GB free** on `/` (86 GB Ollama model + Docker images + working space; verify with `df -h /`)
- A single GPU with **≥150 GB free VRAM** (target the GB300, not the RTX PRO 6000, on dual-GPU stations)
- Network access to `r4.smarthealthit.org` (FHIR test server) and `nvcr.io` (NGC registry)
- NVIDIA NGC API key ([get one here](https://ngc.nvidia.com/setup/api-key)) **and** a `docker login nvcr.io` session (`make ngc-login` in Step 1 performs the login once the key is in `.env`)

> [!TIP]
> `make prereq` checks all of the above (Docker, Node.js v22, OpenShell, disk, GPU, port 11434, NGC docker login) in one shot.

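
To see which individual requirement fails, the version and disk checks listed above can also be run by hand. The following is a minimal sketch, not the implementation behind `make prereq`: `version_ge` and `check_min_version` are hypothetical helper names, and the comparison assumes GNU coreutils (`sort -V`, `df --output`), as found on Ubuntu.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for checking some prerequisites by hand.
# Assumes GNU coreutils (sort -V, df --output).

# version_ge A B: succeeds when version A >= version B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min_version LABEL FOUND REQUIRED: print a one-line verdict.
check_min_version() {
  if [ -n "$2" ] && version_ge "$2" "$3"; then
    echo "OK   $1 $2 (>= $3)"
  else
    echo "FAIL $1 ${2:-not found} (need >= $3)"
  fi
}

# Collect the values; each is left empty when the tool is missing.
docker_ver=$(docker info --format '{{.ServerVersion}}' 2>/dev/null)
node_ver=$(node --version 2>/dev/null | sed 's/^v//')
free_gb=$(df --output=avail -BG / 2>/dev/null | tail -n 1 | tr -dc '0-9')

check_min_version "Docker"  "$docker_ver" "23.0.1"
check_min_version "Node.js" "$node_ver"   "22.0.0"
check_min_version "Free disk on / (GB)" "$free_gb" "200"
```

The sketch covers only Docker, Node.js, and disk space; the GPU, OpenShell, port, and NGC login checks are left to `make prereq`.
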
## 1. Clone the repo

```bash
git clone https://github.com/jaival-nvidia/local-healthcare-agent.git
cd local-healthcare-agent
cp .env.example .env
# Edit .env: set NGC_API_KEY=nvapi-...
make ngc-login   # docker login nvcr.io with NGC_API_KEY (required to pull OpenFold3)
```

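
Running `make ngc-login` before the key is filled in is an easy mistake, so a quick check of `.env` can help. This is a sketch under assumptions: `ngc_key_set` is a hypothetical helper, and the pattern assumes the key is the `nvapi-` prefix followed by letters, digits, `_`, or `-`, which may not match every key format.

```bash
#!/usr/bin/env bash
# ngc_key_set FILE: succeeds when FILE defines NGC_API_KEY with a value that
# starts with "nvapi-" and is not the literal "nvapi-..." placeholder.
# The allowed character set is an assumption, not a documented key format.
ngc_key_set() {
  grep -Eq '^NGC_API_KEY="?nvapi-[A-Za-z0-9_-]+' "$1" 2>/dev/null
}

if ngc_key_set .env; then
  echo "NGC_API_KEY looks set; run: make ngc-login"
else
  echo "Edit .env and set NGC_API_KEY before running make ngc-login"
fi
```
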
## 2. Install OpenShell

```bash
pip install openshell --upgrade --pre \
  --index-url https://urm.nvidia.com/artifactory/api/pypi/nv-shared-pypi/simple
openshell --version   # >= 0.0.33
```

> [!NOTE]
> Ollama runs as a Docker container in this playbook (Step 3) — you do **not** need a host Ollama install. If host Ollama is already running on port 11434 (e.g., from the NemoClaw playbook), stop it first (`sudo systemctl stop ollama`) or override `OLLAMA_PORT` in `.env`. Both `docker-compose.yml` and `setup_sandbox.sh` honor the override.

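
To find out whether something is already listening on the Ollama port before Step 3, a probe like the following can be used. It is a sketch, not part of the playbook: `port_in_use` is a hypothetical helper, it relies on bash's `/dev/tcp` pseudo-device, and it only tests the loopback address. It also assumes `OLLAMA_PORT` is exported in your shell if you overrode it; otherwise it falls back to 11434.

```bash
#!/usr/bin/env bash
# port_in_use PORT: succeeds when something accepts TCP connections on
# 127.0.0.1:PORT. Uses bash's /dev/tcp pseudo-device, so it needs bash.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

port="${OLLAMA_PORT:-11434}"
if port_in_use "$port"; then
  echo "Port $port is busy: stop host Ollama or set OLLAMA_PORT in .env"
else
  echo "Port $port is free"
fi
```
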
## 3. Start infrastructure

```bash
make up
# Starts Ollama and OpenFold3 via Docker Compose.
# Auto-pulls nemotron-3-super:120b-a12b (~86 GB) if not cached.
# Wait for both services to report healthy:
make status
```

This starts:
- **Ollama** (port `${OLLAMA_PORT:-11434}`) — LLM inference with nemotron-3-super:120b-a12b
- **OpenFold3 NIM** (port `${OPENFOLD_PORT:-8000}`) — protein structure prediction (~3 min startup)

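
`make status` is the supported way to watch both services come up. For scripting, the same wait can be approximated by polling each service over HTTP. The sketch below makes several assumptions: `wait_for_http` is a hypothetical helper, `/api/tags` is Ollama's model-listing endpoint, and `/v1/health/ready` is the readiness path commonly exposed by NIM containers, which has not been confirmed for this OpenFold3 image.

```bash
#!/usr/bin/env bash
# wait_for_http URL [TIMEOUT_S] [INTERVAL_S]: poll URL until curl succeeds
# (curl -f treats HTTP status >= 400 as failure) or the timeout expires.
wait_for_http() {
  local url=$1 timeout=${2:-300} interval=${3:-5}
  local deadline=$(( $(date +%s) + timeout ))
  until curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; do
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep "$interval"
  done
}

# Usage after `make up` (endpoint paths are assumptions, see above):
#   wait_for_http "http://localhost:${OLLAMA_PORT:-11434}/api/tags" 600 \
#     && echo "Ollama is answering"
#   wait_for_http "http://localhost:${OPENFOLD_PORT:-8000}/v1/health/ready" 600 \
#     && echo "OpenFold3 is ready"
```

The generous timeouts in the usage lines allow for the ~86 GB model pull and the ~3 minute OpenFold3 startup; adjust them to your network speed.
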
> [!TIP]
> On dual-GPU stations, set `LLM_GPU` and `OPENFOLD_GPU` in `.env` to the **GB300** index (find with `nvidia-smi --query-gpu=index,name --format=csv,noheader`). Nemotron 3 Super (~94 GB resident) does not fit safely on the RTX PRO 6000 (98 GB), and OpenFold3 crashes on multi-GPU containers.

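
Picking the index out of the `nvidia-smi` output can be scripted. The following is a sketch: `gpu_index_by_name` is a hypothetical helper that parses the `index, name` CSV lines produced by the query in the tip, and it simply takes the first GPU whose name contains the given text.

```bash
#!/usr/bin/env bash
# gpu_index_by_name PATTERN: read "index, name" CSV lines on stdin and print
# the index of the first GPU whose name contains PATTERN (case-insensitive).
gpu_index_by_name() {
  awk -F', ' -v pat="$1" 'index(tolower($2), tolower(pat)) { print $1; exit }'
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  idx=$(nvidia-smi --query-gpu=index,name --format=csv,noheader \
        | gpu_index_by_name GB300)
  if [ -n "$idx" ]; then
    # Print the two lines to copy into .env.
    printf 'LLM_GPU=%s\nOPENFOLD_GPU=%s\n' "$idx" "$idx"
  else
    echo "No GPU with GB300 in its name was found"
  fi
fi
```

The script prints the values rather than editing `.env`, so you can review them before pasting.
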
## 4. Start OpenShell gateway

```bash
# Ubuntu 24.04 needs the cgroup fix:
OPENSHELL_K3S_ARGS='--kubelet-arg=cgroup-driver=systemd' openshell gateway start

# k3s takes 10-15s to accept connections; poll until ready (up to ~60s):
for i in $(seq 1 30); do
  openshell status 2>/dev/null | grep -q "Connected" && break
  sleep 2
done
openshell status   # Should show: Status: Connected
```

## 5. (Optional) Configure inference provider manually

`make setup` (Step 6) creates the inference provider for you. To do it by hand:

```bash
BRIDGE_IP=$(ip -4 addr show docker0 | grep -oP 'inet \K[\d.]+')

openshell provider create \
  --name ollama-local \
  --type openai \
  --credential OPENAI_API_KEY=ollama \
  --config OPENAI_BASE_URL=http://${BRIDGE_IP}:${OLLAMA_PORT:-11434}/v1

openshell inference set \
  --provider ollama-local \
  --model nemotron-3-super:120b-a12b
```

> [!NOTE]
> Current OpenShell releases do not accept the `--base-url` shorthand for `provider create` — use `--config OPENAI_BASE_URL=...` as shown above.

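
Before or after creating the provider, it can be worth confirming that the base URL actually serves the model. The sketch below is an illustration only: `ollama_base_url` and `model_listed` are hypothetical helpers, and the check assumes Ollama's OpenAI-compatible `/v1/models` endpoint returns each model under an `"id"` key.

```bash
#!/usr/bin/env bash
# ollama_base_url IP [PORT]: build the OpenAI-compatible base URL.
ollama_base_url() {
  printf 'http://%s:%s/v1\n' "$1" "${2:-11434}"
}

# model_listed MODEL: read a /v1/models JSON response on stdin and succeed
# when MODEL appears as an "id" value. Plain grep, so no jq dependency.
model_listed() {
  grep -Eq "\"id\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Same bridge lookup as in the manual steps; empty when docker0 is absent.
bridge_ip=$(ip -4 addr show docker0 2>/dev/null | grep -oP 'inet \K[\d.]+')
if [ -n "$bridge_ip" ]; then
  url=$(ollama_base_url "$bridge_ip" "${OLLAMA_PORT:-11434}")
  if curl -fsS --max-time 5 "$url/models" 2>/dev/null \
      | model_listed "nemotron-3-super:120b-a12b"; then
    echo "Model is listed at $url"
  else
    echo "Model not listed at $url (has 'make up' finished?)"
  fi
fi
```
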
## 6. Create sandbox and deploy everything

```bash
make setup
# Creates sandbox, installs Python packages, deploys skills/agents/config,
# starts OpenClaw gateway, runs smoke test.
# Takes ~5-15 min (PyPI downloads are slow through the sandbox proxy).
```

Or, for local access without an SSH tunnel:
```bash
make setup-local
```

## 7. Verify

```bash
make test        # L1-3: infrastructure + OpenShell + config (~1 min)
make test-full   # L1-5: includes agent functional + E2E (~20 min)
```

## 8. Access the demo

**Remote (SSH tunnel):**
```bash
# Run on your workstation, not on the DGX Station.
ssh -f -N -L 18789:localhost:18789 <user>@<dgx-ip>
open http://localhost:18789/   # macOS; use xdg-open on Linux
```

**Local:**
```bash
# Run on the DGX Station desktop (Linux).
xdg-open http://localhost:18789/
```

Canvas (charts + molecular viewers): `http://localhost:18789/__openclaw__/canvas/`

---

## Day-to-day commands

| Command | What it does |
|---------|--------------|
| `make up` | Start Ollama + OpenFold3 |
| `make down` | Stop all Docker services |
| `make status` | Health dashboard |
| `make restart` | Restart OpenClaw gateway in sandbox |
| `make test` | Quick validation (L1-3, ~1 min) |
| `make test-full` | Full validation (L1-5, ~20 min) |
| `make logs` | Tail all service logs |
| `make clean` | Remove test results + volumes |

---

## Demo Prompts

**Clinical cohort analysis:**
```
Find all diabetic patients, get their latest A1c and medications. Identify gap patients with A1c above 9% not on insulin or GLP-1. Show the A1c distribution as a histogram.
```

**Patient case summary:**
```
Look up the first patient. Compile a case summary: demographics, conditions, recent labs (flag abnormal), and medications.
```

**Molecular visualization:**
```
Show me the 3D protein structure of atorvastatin bound to its target
```

**End-to-end investigation:**
```
Find all diabetic patients with poorly controlled A1c, identify what medications they are on, show me the distribution, and visualize the molecular target of the most common therapy.
```

---

## Known Issues

**PyPI slow through sandbox proxy**: Package installs can take 10-15 minutes because the privacy router throttles downloads. The setup script retries automatically.

**Canvas proxy strips `<script src=...>`**: The `build_viewer.py` script inlines JS libraries to work around this.

**Model misspellings**: nemotron-3-super:120b-a12b (120B total, 12B active MoE) occasionally misspells drug names in its responses. The code it executes still runs correctly.

**Ollama model unloading**: The model unloads after the `keep_alive` timeout. The first request after unloading takes ~30s while the model reloads. For demos, send a warmup message first.

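
One way to send that warmup is a direct request to Ollama's `/api/generate` endpoint, where an empty prompt loads the model without generating text and `keep_alive` sets how long it stays resident afterwards. This is a sketch: `warmup_payload` and `warmup_ollama` are hypothetical helpers, the `30m` value is an arbitrary example, and later requests from the agent may carry their own `keep_alive` setting.

```bash
#!/usr/bin/env bash
# warmup_payload MODEL [KEEP_ALIVE]: JSON body for Ollama's /api/generate.
# An empty prompt asks Ollama to load the model without generating text.
warmup_payload() {
  printf '{"model":"%s","prompt":"","stream":false,"keep_alive":"%s"}\n' \
    "$1" "${2:-30m}"
}

# warmup_ollama MODEL [KEEP_ALIVE]: load MODEL and keep it resident.
warmup_ollama() {
  curl -fsS --max-time 120 \
    -H 'Content-Type: application/json' \
    -d "$(warmup_payload "$@")" \
    "http://localhost:${OLLAMA_PORT:-11434}/api/generate" >/dev/null
}

# Usage shortly before a demo:
#   warmup_ollama nemotron-3-super:120b-a12b 30m
```
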

**OpenFold3 startup**: Takes ~3 minutes to load the model. The healthcheck waits for it automatically.