dgx-spark-playbooks/nvidia/station-healthcare-agent/assets/skills/fhir-basics/SKILL.md
2026-05-26 18:25:53 +00:00


---
name: fhir-basics
description: Teaches agents how FHIR R4 APIs work, what resources are available, how to query them with search parameters, and how to correctly parse all response formats including component Observations.
metadata:
  openclaw:
    requires:
      bins: ["python3"]
---
# FHIR Data Retrieval
**Important**: In this sandbox, run Python scripts with `python` (not `python3`). Use `subprocess.run(["curl", "-sf", url], capture_output=True, text=True)` for all FHIR HTTP calls — the `requests` library does NOT work through the sandbox proxy. Always parse the output with `json.loads()`.
FHIR (Fast Healthcare Interoperability Resources) R4 is the standard API format mandated by the 21st Century Cures Act for US healthcare interoperability. ~70% of US hospitals expose FHIR R4 endpoints (ONC 2024). Search queries are REST GET requests that return JSON Bundles; a read by ID (e.g., `GET /Patient/{id}`) returns the single resource instead of a Bundle.
## Default FHIR Endpoint
Unless the user specifies a different FHIR server, always use: `https://r4.smarthealthit.org`
This is the SMART on FHIR public test server with synthetic (Synthea) patient data. No authentication required.
**Query format for this server**: Use bare codes without system URIs. Example: `code=44054006` NOT `code=http://snomed.info/sct|44054006`. The test server does not support system-qualified code searches and will return empty results.
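The examples later in this document call a `fhir_get` helper that is never defined here. A minimal sketch of what it could look like, following the curl-via-`subprocess` rule above, is shown below. The signature, the `build_url` helper, and the choice to return an empty dict on failure are assumptions made for illustration, not part of the original skill.

```python
import json
import subprocess
from urllib.parse import urlencode

BASE_URL = "https://r4.smarthealthit.org"  # default endpoint named in this section


def build_url(path, params=None, base_url=BASE_URL):
    """Return an absolute URL. Absolute inputs (e.g. Bundle 'next' links) pass through."""
    if path.startswith(("http://", "https://")):
        url = path
    else:
        url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    if params:
        sep = "&" if "?" in url else "?"
        url = f"{url}{sep}{urlencode(params, doseq=True)}"
    return url


def fhir_get(path, params=None, base_url=BASE_URL, timeout=30):
    """GET a FHIR URL with curl and return the parsed JSON as a dict.

    Assumption: an empty dict is returned when curl fails or the body is empty,
    so callers can keep using .get('entry', []) without extra checks.
    """
    url = build_url(path, params, base_url)
    r = subprocess.run(
        ["curl", "-sf", "--max-time", str(timeout), url],
        capture_output=True, text=True,
    )
    if r.returncode != 0 or not r.stdout.strip():
        return {}
    return json.loads(r.stdout)
```

With a helper of this shape, `fhir_get("Patient", {"_count": "5"})` would fetch the first page of patients as a dict, and passing a full `next` URL would follow pagination without rebuilding the query.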
## Authentication
- **Public test servers** (e.g., `https://r4.smarthealthit.org`): No authentication required. Synthetic data, real FHIR format.
- **Production hospital endpoints**: Use SMART on FHIR OAuth2 flows. Requires a `client_id`, redirect URI, and scope negotiation. Access tokens are passed as `Authorization: Bearer {token}` headers.
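As a rough illustration of the two cases above, the sketch below builds the curl argument list with or without a bearer token. The function name `curl_command` is hypothetical, and obtaining the token through the SMART on FHIR OAuth2 flow is out of scope here; the token is assumed to be already available.

```python
def curl_command(url, token=None, timeout=30):
    """Build the curl argument list for a FHIR GET.

    token: an OAuth2 access token obtained elsewhere (assumption); None for
    public test servers that need no authentication.
    """
    cmd = ["curl", "-sf", "--max-time", str(timeout)]
    if token:
        # Production endpoints expect the token as a Bearer header
        cmd += ["-H", f"Authorization: Bearer {token}"]
    cmd.append(url)
    return cmd
```

The returned list can be passed directly to `subprocess.run(..., capture_output=True, text=True)`.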
## Resource Endpoints and Search Parameters
### Patient
```
GET /Patient -- all patients (paginated)
GET /Patient?name=Smith -- search by family or given name
GET /Patient?name=John&name=Smith -- search by given AND family name
GET /Patient?birthdate=1970-01-01 -- exact birthdate
GET /Patient?birthdate=ge1960-01-01&birthdate=le1970-12-31 -- date range
GET /Patient?gender=male -- filter by gender
GET /Patient/{id} -- get a specific patient by ID
GET /Patient?_count=50 -- control page size
```
### Condition (Diagnoses)
```
GET /Condition?patient={id} -- all conditions for a patient
GET /Condition?patient={id}&clinical-status=active -- only active conditions
GET /Condition?code={snomed_code} -- all patients with a condition (cohort query)
GET /Condition?code=44054006&_count=200 -- paginated cohort (bare code for test server)
```
For the default test server (`r4.smarthealthit.org`), always use bare codes (e.g., `code=44054006`). Production servers may require the full system URI (`code=http://snomed.info/sct|44054006`).
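One way to keep the bare-code versus system-qualified distinction in a single place is a small helper like the sketch below. The names `code_param` and `condition_cohort_query` are hypothetical; only the SNOMED system URI and the example code come from this document.

```python
from urllib.parse import urlencode

SNOMED_SYSTEM = "http://snomed.info/sct"


def code_param(code, system=None):
    """Return a bare code, or 'system|code' when a system URI is given."""
    return f"{system}|{code}" if system else code


def condition_cohort_query(code, system=None, count=200):
    """Build a relative cohort query for Condition (URL-encoded)."""
    params = {"code": code_param(code, system), "_count": str(count)}
    return "Condition?" + urlencode(params)
```

For the default test server, call `condition_cohort_query("44054006")`; for a production server that requires the system URI, pass `system=SNOMED_SYSTEM`.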
### Observation (Labs and Vitals)
```
GET /Observation?patient={id} -- all observations
GET /Observation?patient={id}&code={loinc} -- specific lab by LOINC
GET /Observation?patient={id}&code={loinc}&_sort=-date&_count=1 -- most recent only
GET /Observation?patient={id}&category=vital-signs -- vitals only
GET /Observation?patient={id}&category=laboratory -- labs only
GET /Observation?patient={id}&date=ge2023-01-01 -- after a date
```
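The search parameters above combine freely. As a small sketch, the hypothetical helper below assembles the "most recent result" query for one patient and one LOINC code, with an optional lower date bound.

```python
from urllib.parse import urlencode


def latest_observation_query(patient_id, loinc, since=None):
    """Build a relative query for the newest Observation of one LOINC code.

    since: optional ISO date string (YYYY-MM-DD) used as a 'ge' lower bound.
    """
    params = {"patient": patient_id, "code": loinc, "_sort": "-date", "_count": "1"}
    if since:
        params["date"] = f"ge{since}"
    return "Observation?" + urlencode(params)
```

For example, `latest_observation_query(pid, "4548-4")` targets the latest HbA1c for the patient whose ID is in `pid`.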
### MedicationRequest (Prescriptions)
```
GET /MedicationRequest?patient={id} -- all prescriptions
GET /MedicationRequest?patient={id}&status=active -- current prescriptions only
GET /MedicationRequest?patient={id}&_count=100 -- increase page size
```
### Encounter (Visits)
```
GET /Encounter?patient={id} -- all encounters
GET /Encounter?patient={id}&_sort=-date&_count=5 -- 5 most recent visits
GET /Encounter?patient={id}&type=office -- office visits only
```
### DiagnosticReport (Lab Reports, Imaging)
```
GET /DiagnosticReport?patient={id} -- all reports
GET /DiagnosticReport?patient={id}&category=LAB -- lab reports
GET /DiagnosticReport?patient={id}&category=imaging -- imaging reports
```
## Key LOINC Codes
| LOINC | Lab/Vital | Notes |
|-------|-----------|-------|
| 4548-4 | Hemoglobin A1c (HbA1c) | Primary diabetes monitoring |
| 2345-7 | Glucose | Fasting or random |
| 2160-0 | Creatinine | Kidney function |
| 33914-3 | eGFR (creatinine-based, MDRD) | Kidney function staging |
| 2093-3 | Total Cholesterol | Lipid panel |
| 2571-8 | Triglycerides | Lipid panel |
| 2085-9 | HDL Cholesterol | Lipid panel |
| 18262-6 | LDL Cholesterol | Lipid panel |
| 85354-9 | Blood Pressure panel | Component observation (see below) |
| 8480-6 | Systolic Blood Pressure | Component of BP panel, or standalone |
| 8462-4 | Diastolic Blood Pressure | Component of BP panel, or standalone |
| 42637-9 | BNP (B-type Natriuretic Peptide) | Heart failure marker |
| 33762-6 | NT-proBNP | Heart failure marker (alternative to BNP) |
| 6690-2 | WBC Count | Infection/inflammation |
| 718-7 | Hemoglobin | Anemia screening |
| 2823-3 | Potassium | Electrolyte; critical for ACEi/ARB/MRA monitoring |
| 2951-2 | Sodium | Electrolyte |
| 1742-6 | ALT | Liver function |
| 14959-1 | Microalbumin/Creatinine Ratio (urine) | Diabetic nephropathy screening |
## Parsing FHIR JSON Responses
### Bundle Structure
Every search returns a Bundle:
```json
{
  "resourceType": "Bundle",
  "type": "searchset",
  "total": 42,
  "entry": [ { "resource": { ... } }, ... ],
  "link": [
    { "relation": "self", "url": "..." },
    { "relation": "next", "url": "..." }
  ]
}
```
Always check `bundle.get('entry', [])` before iterating -- an empty result returns a Bundle with no `entry` key.
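The Bundle handling described above can be wrapped in two small helpers. This is a sketch with hypothetical names (`iter_resources`, `next_link`); it only assumes the Bundle shape shown in this section.

```python
def iter_resources(bundle, resource_type=None):
    """Yield each resource in a Bundle, tolerating a missing 'entry' key."""
    for entry in bundle.get("entry", []):
        resource = entry.get("resource")
        if not resource:
            continue
        if resource_type and resource.get("resourceType") != resource_type:
            continue
        yield resource


def next_link(bundle):
    """Return the URL of the 'next' page, or None on the last page."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None
```

Filtering by `resource_type` is useful when a server mixes other resources (for example an `OperationOutcome` warning) into a search result.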
### Patient Resource
```python
patient = entry['resource']
patient_id = patient['id']
given = patient['name'][0].get('given', [''])[0]
family = patient['name'][0].get('family', '')
full_name = f"{given} {family}"
birth_date = patient.get('birthDate', 'Unknown')
gender = patient.get('gender', 'Unknown')
# Address (optional)
address = patient.get('address', [{}])[0]
city = address.get('city', '')
state = address.get('state', '')
```
### Condition Resource
```python
condition = entry['resource']
# Code -- check ALL coding entries, not just [0]
snomed_code = icd_code = None  # stay None if the system is absent
codings = condition.get('code', {}).get('coding', [])
for coding in codings:
    system = coding.get('system', '').lower()
    code = coding.get('code', '')
    display = coding.get('display', '')
    if 'snomed' in system:
        snomed_code = code
    elif 'icd' in system:
        icd_code = code
# Clinical status
status_codings = condition.get('clinicalStatus', {}).get('coding', [])
clinical_status = status_codings[0]['code'] if status_codings else 'unknown'
# Values: "active", "recurrence", "relapse", "inactive", "remission", "resolved"
# Onset
onset = condition.get('onsetDateTime', condition.get('onsetPeriod', {}).get('start', 'Unknown'))
# Verification status (confirmed, unconfirmed, provisional, differential, refuted)
verification = condition.get('verificationStatus', {}).get('coding', [{}])[0].get('code', 'unknown')
```
### Observation Resource -- Simple (single value)
Most labs return a single value in `valueQuantity`:
```python
obs = entry['resource']
lab_name = obs['code']['coding'][0]['display']
loinc_code = obs['code']['coding'][0]['code']
date = obs.get('effectiveDateTime', 'Unknown')
# Value -- multiple possible formats
if 'valueQuantity' in obs:
    value = obs['valueQuantity']['value']
    unit = obs['valueQuantity'].get('unit', '')
elif 'valueString' in obs:
    value = obs['valueString']  # qualitative result like "negative"
    unit = ''
elif 'valueCodeableConcept' in obs:
    value = obs['valueCodeableConcept'].get('text', 'See coding')
    unit = ''
else:
    value = None  # check component (see below)
    unit = ''
# Reference range (from the lab, more accurate than general tables)
low = high = None  # stay None when the lab supplies no range
ref_ranges = obs.get('referenceRange', [])
if ref_ranges:
    low = ref_ranges[0].get('low', {}).get('value')
    high = ref_ranges[0].get('high', {}).get('value')
```
### Observation Resource -- Component (Blood Pressure)
Blood pressure in FHIR is typically a **component Observation** with LOINC `85354-9` (BP panel). Systolic and diastolic are nested inside `component[]`, NOT in `valueQuantity`:
```python
obs = entry['resource']
panel_code = obs['code']['coding'][0]['code']
if panel_code == '85354-9' or 'component' in obs:
    systolic = None
    diastolic = None
    for comp in obs.get('component', []):
        comp_code = comp['code']['coding'][0]['code']
        if comp_code == '8480-6':  # systolic
            systolic = comp['valueQuantity']['value']
        elif comp_code == '8462-4':  # diastolic
            diastolic = comp['valueQuantity']['value']
```
**Critical**: When querying for systolic BP (LOINC 8480-6), some FHIR servers return the panel Observation (85354-9) where systolic is inside `component`. Others return a standalone Observation with `valueQuantity`. Your code must handle both:
```python
def get_bp(patient_id, base_url):
    """Get most recent blood pressure, handling both panel and standalone formats.

    Assumes a fhir_get(path, params) helper that runs curl and returns the
    parsed Bundle as a dict (the same helper the Pagination example uses).
    base_url is kept for callers but is not used by this function.
    """
    # Try panel first
    r = fhir_get("Observation",
                 params={"patient": patient_id, "code": "85354-9",
                         "_sort": "-date", "_count": "1"})
    if r.get('entry'):
        obs = r['entry'][0]['resource']
        systolic = diastolic = None
        for comp in obs.get('component', []):
            c = comp['code']['coding'][0]['code']
            if c == '8480-6':
                systolic = comp['valueQuantity']['value']
            elif c == '8462-4':
                diastolic = comp['valueQuantity']['value']
        if systolic is not None:
            return systolic, diastolic, obs.get('effectiveDateTime', 'Unknown')
    # Fallback: standalone systolic (diastolic is not fetched in this path)
    r = fhir_get("Observation",
                 params={"patient": patient_id, "code": "8480-6",
                         "_sort": "-date", "_count": "1"})
    if r.get('entry'):
        obs = r['entry'][0]['resource']
        systolic = obs.get('valueQuantity', {}).get('value')
        return systolic, None, obs.get('effectiveDateTime', 'Unknown')
    return None, None, None
```
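When the Observation has already been fetched (for example from a code-only cohort query), the same two formats can be handled without any HTTP calls. The sketch below uses a hypothetical `parse_bp` function and checks every coding entry rather than only the first, which is an extra precaution not required by the text above.

```python
SYSTOLIC = "8480-6"
DIASTOLIC = "8462-4"


def _codes(concept):
    """Collect all codes from a CodeableConcept's coding list."""
    return {c.get("code") for c in concept.get("coding", [])}


def parse_bp(obs):
    """Return (systolic, diastolic) from a BP panel or a standalone Observation."""
    systolic = diastolic = None
    # Panel format: values live inside component[]
    for comp in obs.get("component", []):
        value = comp.get("valueQuantity", {}).get("value")
        comp_codes = _codes(comp.get("code", {}))
        if SYSTOLIC in comp_codes:
            systolic = value
        elif DIASTOLIC in comp_codes:
            diastolic = value
    # Standalone format: value lives in the top-level valueQuantity
    if systolic is None and diastolic is None:
        value = obs.get("valueQuantity", {}).get("value")
        own_codes = _codes(obs.get("code", {}))
        if SYSTOLIC in own_codes:
            systolic = value
        elif DIASTOLIC in own_codes:
            diastolic = value
    return systolic, diastolic
```

A missing reading comes back as `None`, so callers can report "Not recorded" instead of inventing a value.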
### MedicationRequest Resource
```python
med = entry['resource']
# Medication name -- check multiple locations
med_name = (
med.get('medicationCodeableConcept', {}).get('text')
or med.get('medicationCodeableConcept', {}).get('coding', [{}])[0].get('display')
or 'Unknown medication'
)
# Status
status = med.get('status', 'unknown') # active, on-hold, cancelled, completed, stopped
# Dosage
dosage_instructions = med.get('dosageInstruction', [{}])
dosage_text = dosage_instructions[0].get('text', 'No dosage recorded') if dosage_instructions else 'No dosage recorded'
# Authored date
authored = med.get('authoredOn', 'Unknown')
```
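In FHIR R4 the medication on a MedicationRequest may also be given as a `medicationReference` instead of a `medicationCodeableConcept`. The sketch below extends the lookup above to cover that case; the function name `medication_name` is hypothetical, and resolving the referenced Medication resource is left out.

```python
def medication_name(med):
    """Best-effort medication name from a MedicationRequest resource."""
    concept = med.get("medicationCodeableConcept", {})
    if concept.get("text"):
        return concept["text"]
    # Check every coding entry for a display string, not just the first
    for coding in concept.get("coding", []):
        if coding.get("display"):
            return coding["display"]
    # Fall back to a reference to a separate Medication resource
    ref = med.get("medicationReference", {})
    return ref.get("display") or ref.get("reference") or "Unknown medication"
```

If only a bare reference such as `Medication/123` is returned, a follow-up read of that resource would be needed to get a human-readable name.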
## Pagination
FHIR responses default to 20 results per page (server-dependent). Always handle pagination for cohort queries:
```python
def get_all_pages(url, params=None):
    """Fetch all pages of a FHIR Bundle search."""
    all_entries = []
    if params:
        r = fhir_get(url, params)
    else:
        r = fhir_get(url)
    all_entries.extend(r.get('entry', []))
    # Follow 'next' links
    while True:
        next_url = None
        for link in r.get('link', []):
            if link.get('relation') == 'next':
                next_url = link['url']
                break
        if not next_url:
            break
        r = fhir_get(next_url)
        all_entries.extend(r.get('entry', []))
    return all_entries
```
For large cohorts, set `_count=200` to reduce the number of pages. The SMART test server caps at ~1000 results regardless.
## Batched and Multi-Patient Queries (CRITICAL for Performance)
**Never loop over patients making individual FHIR calls.** Each HTTP call through the sandbox proxy adds 1-3 seconds of latency. For 24 patients x 4 LOINC codes, that is 96 sequential calls, or roughly 2-5 minutes of latency alone. Instead, fetch all observations for a LOINC code in one request and filter client-side in Python.
### Pattern 1: Fetch all Observations for a LOINC code (preferred)
Query Observation by code alone (no patient filter) to get results for ALL patients in one call:
```
GET /Observation?code={loinc}&_count=500&_sort=-date
```
Then filter in Python by patient reference:
```python
def get_all_obs_for_code(loinc_code, count=500):
    """Fetch the most recent observation per patient for a LOINC code, without looping over patients."""
    entries = get_all_pages("Observation", {"code": loinc_code, "_count": str(count), "_sort": "-date"})
    # Build dict: patient_id -> most recent observation (entries arrive sorted newest first)
    by_patient = {}
    for e in entries:
        obs = e['resource']
        ref = obs.get('subject', {}).get('reference', '')
        pid = ref.split('/')[-1] if '/' in ref else ref
        if pid not in by_patient:
            by_patient[pid] = obs  # keep only the most recent per patient
    return by_patient
```
### Pattern 2: Multi-patient query parameter
Some FHIR servers accept comma-separated patient references:
```
GET /Observation?patient=Patient/X,Patient/Y,Patient/Z&code={loinc}&_count=500
```
The SMART test server supports this. Use it when you have a specific list of patient IDs and want to avoid fetching observations for patients not in your cohort.
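A sketch of how such a query could be assembled is shown below. The function name `multi_patient_queries` and the chunk size of 50 IDs per request are assumptions (chunking simply keeps URLs from growing without bound); `/` and `,` are left unescaped so the URL matches the form shown above.

```python
from urllib.parse import urlencode


def multi_patient_queries(patient_ids, loinc, chunk_size=50, count=500):
    """Build one relative Observation query per chunk of patient IDs."""
    queries = []
    for i in range(0, len(patient_ids), chunk_size):
        chunk = patient_ids[i:i + chunk_size]
        refs = ",".join(f"Patient/{pid}" for pid in chunk)
        params = {"patient": refs, "code": loinc, "_count": str(count)}
        # safe="/," keeps the comma-separated references readable in the URL
        queries.append("Observation?" + urlencode(params, safe="/,"))
    return queries
```

Each returned string can then be fetched and paged like any other search, and the results merged client-side.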
### Pattern 3: FHIR Batch Bundle
For heterogeneous queries (different resource types per patient), POST a Bundle of type `batch` to the server root:
```python
import json
import subprocess

# Default endpoint from this document; change it if the user names another server
BASE_URL = "https://r4.smarthealthit.org"


def fhir_batch(requests_list):
    """Execute multiple FHIR queries in a single HTTP call using a batch Bundle.

    requests_list: list of {"method": "GET", "url": "Observation?patient=X&code=Y"} dicts.
    Returns the response Bundle's entries, or [] if the call fails.
    """
    bundle = {
        "resourceType": "Bundle",
        "type": "batch",
        "entry": [{"request": req} for req in requests_list]
    }
    r = subprocess.run(
        ["curl", "-sf", "--max-time", "60",
         "-X", "POST", "-H", "Content-Type: application/fhir+json",
         "-d", json.dumps(bundle), BASE_URL],
        capture_output=True, text=True, timeout=65
    )
    if r.returncode != 0 or not r.stdout.strip():
        return []
    result = json.loads(r.stdout)
    return result.get('entry', [])
```
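The entries returned by a batch call come back in the same order as the requests, each with a `response.status` string and, for a search, a nested searchset Bundle in `resource`. The sketch below flattens that structure; `unpack_batch_response` is a hypothetical name and the output shape is an illustrative choice.

```python
def unpack_batch_response(entries):
    """Turn batch-response entries into [{'status': str, 'resources': list}, ...]."""
    results = []
    for entry in entries:
        status = entry.get("response", {}).get("status", "")
        resource = entry.get("resource", {})
        ok = status.startswith("2")
        if ok and resource.get("resourceType") == "Bundle":
            # A search inside the batch returns its own searchset Bundle
            resources = [e["resource"] for e in resource.get("entry", []) if "resource" in e]
        elif ok and resource:
            # A read by ID returns the resource directly
            resources = [resource]
        else:
            resources = []
        results.append({"status": status, "resources": resources})
    return results
```

Because order is preserved, `results[i]` corresponds to `requests_list[i]`, so a failed sub-request shows up as an empty `resources` list rather than shifting the others.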
### When to use which pattern
| Scenario | Pattern | Why |
|----------|---------|-----|
| Labs for a cohort (same LOINC, many patients) | Pattern 1: code-only query | One call gets everything; filter in Python |
| Labs for a specific patient list | Pattern 2: comma-separated patients | Scoped to your cohort; one call |
| Mixed data (labs + meds + conditions per patient) | Pattern 3: batch Bundle | Multiple queries, single HTTP round-trip |
| Single patient lookup | Individual GET | Fine for 1-2 patients |
## Bulk FHIR (Production-Scale)
For production population health workflows (not used in this demo, but important context): FHIR Bulk Data Access (SMART/HL7) allows exporting entire patient populations as NDJSON files via an async `$export` operation. This is how real quality measure engines work at scale -- they don't query patient-by-patient. The demo uses individual queries for clarity and because the public test server doesn't support Bulk FHIR.
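For context only, the NDJSON files produced by a bulk export hold one JSON resource per line, so reading them is straightforward. The sketch below parses NDJSON text that has already been downloaded; the `$export` kick-off and polling steps are not shown, and the function name `parse_ndjson` is hypothetical.

```python
import json


def parse_ndjson(text):
    """Parse NDJSON text into a list of resources, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

For very large exports, iterating over the file line by line instead of loading the whole text would keep memory use flat.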
## Error Handling
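- Always confirm the call succeeded before parsing JSON. With `curl -sf` via `subprocess.run`, check that `returncode == 0` and that `stdout` is non-empty (`-f` makes curl exit non-zero on HTTP 4xx/5xx)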
- Check if `entry` exists in the response before iterating: `bundle.get('entry', [])`
- Some fields may be missing -- use `.get()` with sensible defaults
- Never fabricate data. If a field is absent, report "Not recorded"
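- Handle `OperationOutcome` responses (FHIR error format). Note that `curl -f` discards the response body on HTTP 4xx/5xx, so an `OperationOutcome` only appears in `stdout` if the server returns it with a 2xx status or the request is made without `-f`:
```python
data = json.loads(r.stdout)  # r is the subprocess.run(...) result
if data.get('resourceType') == 'OperationOutcome':
    issues = data.get('issue', [])
    error_msg = issues[0].get('diagnostics', 'Unknown FHIR error') if issues else 'Unknown error'
```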
- Rate limiting: The public SMART test server has no rate limits, but production endpoints may. Add a small delay (0.1-0.5s) between calls in tight loops
- Timeout: Use `--max-time 30` with curl; FHIR servers can be slow under load
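
The checks in this list can be combined into one function that takes the curl exit code and output and returns either the parsed data or an error message. This is a sketch; the name `parse_fhir_output` and the `(data, error)` return convention are assumptions.

```python
import json


def parse_fhir_output(returncode, stdout):
    """Return (data, None) on success or (None, error_message) on failure."""
    if returncode != 0:
        return None, f"curl failed with exit code {returncode}"
    if not stdout.strip():
        return None, "Empty response"
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError as exc:
        return None, f"Invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "Unexpected JSON payload (expected an object)"
    # FHIR servers report errors as an OperationOutcome resource
    if data.get("resourceType") == "OperationOutcome":
        issues = data.get("issue", [])
        msg = issues[0].get("diagnostics", "Unknown FHIR error") if issues else "Unknown error"
        return None, msg
    return data, None
```

A caller would pass `r.returncode` and `r.stdout` from `subprocess.run`, and report the error string to the user rather than continuing with partial data.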