mirror of
https://github.com/NVIDIA/dgx-spark-playbooks.git
synced 2026-06-24 15:19:30 +00:00
407 lines
16 KiB
Markdown
407 lines
16 KiB
Markdown
|
|
---
|
||
|
|
name: fhir-basics
|
||
|
|
description: Teaches agents how FHIR R4 APIs work, what resources are available, how to query them with search parameters, and how to correctly parse all response formats including component Observations.
|
||
|
|
metadata:
|
||
|
|
openclaw:
|
||
|
|
requires:
|
||
|
|
bins: ["python3"]
|
||
|
|
---
|
||
|
|
|
||
|
|
# FHIR Data Retrieval
|
||
|
|
|
||
|
|
**Important**: In this sandbox, run Python scripts with `python` (not `python3`). Use `subprocess.run(["curl", "-sf", url], capture_output=True, text=True)` for all FHIR HTTP calls — the `requests` library does NOT work through the sandbox proxy. Always parse the output with `json.loads()`.
|
||
|
|
|
||
|
|
FHIR (Fast Healthcare Interoperability Resources) R4 is the standard API format mandated by the 21st Century Cures Act for US healthcare interoperability. ~70% of US hospitals expose FHIR R4 endpoints (ONC 2024). All queries use REST GET requests returning JSON Bundles.
|
||
|
|
|
||
|
|
## Default FHIR Endpoint
|
||
|
|
|
||
|
|
Unless the user specifies a different FHIR server, always use: `https://r4.smarthealthit.org`
|
||
|
|
|
||
|
|
This is the SMART on FHIR public test server with synthetic (Synthea) patient data. No authentication required.
|
||
|
|
|
||
|
|
**Query format for this server**: Use bare codes without system URIs. Example: `code=44054006` NOT `code=http://snomed.info/sct|44054006`. The test server does not support system-qualified code searches and will return empty results.
|
||
|
|
|
||
|
|
## Authentication
|
||
|
|
|
||
|
|
- **Public test servers** (e.g., `https://r4.smarthealthit.org`): No authentication required. Synthetic data, real FHIR format.
|
||
|
|
- **Production hospital endpoints**: Use SMART on FHIR OAuth2 flows. Requires a `client_id`, redirect URI, and scope negotiation. Access tokens are passed as `Authorization: Bearer {token}` headers.
|
||
|
|
|
||
|
|
## Resource Endpoints and Search Parameters
|
||
|
|
|
||
|
|
### Patient
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /Patient -- all patients (paginated)
|
||
|
|
GET /Patient?name=Smith -- search by family or given name
|
||
|
|
GET /Patient?name=John&name=Smith -- search by given AND family name
|
||
|
|
GET /Patient?birthdate=1970-01-01 -- exact birthdate
|
||
|
|
GET /Patient?birthdate=ge1960-01-01&birthdate=le1970-12-31 -- date range
|
||
|
|
GET /Patient?gender=male -- filter by gender
|
||
|
|
GET /Patient/{id} -- get a specific patient by ID
|
||
|
|
GET /Patient?_count=50 -- control page size
|
||
|
|
```
|
||
|
|
|
||
|
|
### Condition (Diagnoses)
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /Condition?patient={id} -- all conditions for a patient
|
||
|
|
GET /Condition?patient={id}&clinical-status=active -- only active conditions
|
||
|
|
GET /Condition?code={snomed_code} -- all patients with a condition (cohort query)
|
||
|
|
GET /Condition?code=44054006&_count=200 -- paginated cohort (bare code for test server)
|
||
|
|
```
|
||
|
|
|
||
|
|
For the default test server (`r4.smarthealthit.org`), always use bare codes (e.g., `code=44054006`). Production servers may require the full system URI (`code=http://snomed.info/sct|44054006`).
|
||
|
|
|
||
|
|
### Observation (Labs and Vitals)
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /Observation?patient={id} -- all observations
|
||
|
|
GET /Observation?patient={id}&code={loinc} -- specific lab by LOINC
|
||
|
|
GET /Observation?patient={id}&code={loinc}&_sort=-date&_count=1 -- most recent only
|
||
|
|
GET /Observation?patient={id}&category=vital-signs -- vitals only
|
||
|
|
GET /Observation?patient={id}&category=laboratory -- labs only
|
||
|
|
GET /Observation?patient={id}&date=ge2023-01-01 -- after a date
|
||
|
|
```
|
||
|
|
|
||
|
|
### MedicationRequest (Prescriptions)
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /MedicationRequest?patient={id} -- all prescriptions
|
||
|
|
GET /MedicationRequest?patient={id}&status=active -- current prescriptions only
|
||
|
|
GET /MedicationRequest?patient={id}&_count=100 -- increase page size
|
||
|
|
```
|
||
|
|
|
||
|
|
### Encounter (Visits)
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /Encounter?patient={id} -- all encounters
|
||
|
|
GET /Encounter?patient={id}&_sort=-date&_count=5 -- 5 most recent visits
|
||
|
|
GET /Encounter?patient={id}&type=office -- office visits only
|
||
|
|
```
|
||
|
|
|
||
|
|
### DiagnosticReport (Lab Reports, Imaging)
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /DiagnosticReport?patient={id} -- all reports
|
||
|
|
GET /DiagnosticReport?patient={id}&category=LAB -- lab reports
|
||
|
|
GET /DiagnosticReport?patient={id}&category=imaging -- imaging reports
|
||
|
|
```
|
||
|
|
|
||
|
|
## Key LOINC Codes
|
||
|
|
|
||
|
|
| LOINC | Lab/Vital | Notes |
|
||
|
|
|-------|-----------|-------|
|
||
|
|
| 4548-4 | Hemoglobin A1c (HbA1c) | Primary diabetes monitoring |
|
||
|
|
| 2345-7 | Glucose | Fasting or random |
|
||
|
|
| 2160-0 | Creatinine | Kidney function |
|
||
|
|
| 33914-3 | eGFR (CKD-EPI) | Kidney function staging |
|
||
|
|
| 2093-3 | Total Cholesterol | Lipid panel |
|
||
|
|
| 2571-8 | Triglycerides | Lipid panel |
|
||
|
|
| 2085-9 | HDL Cholesterol | Lipid panel |
|
||
|
|
| 18262-6 | LDL Cholesterol | Lipid panel |
|
||
|
|
| 85354-9 | Blood Pressure panel | Component observation (see below) |
|
||
|
|
| 8480-6 | Systolic Blood Pressure | Component of BP panel, or standalone |
|
||
|
|
| 8462-4 | Diastolic Blood Pressure | Component of BP panel, or standalone |
|
||
|
|
| 42637-9 | BNP (B-type Natriuretic Peptide) | Heart failure marker |
|
||
|
|
| 33762-6 | NT-proBNP | Heart failure marker (alternative to BNP) |
|
||
|
|
| 6690-2 | WBC Count | Infection/inflammation |
|
||
|
|
| 718-7 | Hemoglobin | Anemia screening |
|
||
|
|
| 2823-3 | Potassium | Electrolyte; critical for ACEi/ARB/MRA monitoring |
|
||
|
|
| 2951-2 | Sodium | Electrolyte |
|
||
|
|
| 1742-6 | ALT | Liver function |
|
||
|
|
| 14959-1 | Microalbumin/Creatinine Ratio (urine) | Diabetic nephropathy screening |
|
||
|
|
|
||
|
|
## Parsing FHIR JSON Responses
|
||
|
|
|
||
|
|
### Bundle Structure
|
||
|
|
|
||
|
|
Every search returns a Bundle:
|
||
|
|
```json
|
||
|
|
{
|
||
|
|
"resourceType": "Bundle",
|
||
|
|
"type": "searchset",
|
||
|
|
"total": 42,
|
||
|
|
"entry": [ { "resource": { ... } }, ... ],
|
||
|
|
"link": [
|
||
|
|
{ "relation": "self", "url": "..." },
|
||
|
|
{ "relation": "next", "url": "..." }
|
||
|
|
]
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
Always check `bundle.get('entry', [])` before iterating -- an empty result returns a Bundle with no `entry` key.
|
||
|
|
|
||
|
|
### Patient Resource
|
||
|
|
|
||
|
|
```python
|
||
|
|
patient = entry['resource']
|
||
|
|
patient_id = patient['id']
|
||
|
|
given = patient['name'][0].get('given', [''])[0]
|
||
|
|
family = patient['name'][0].get('family', '')
|
||
|
|
full_name = f"{given} {family}"
|
||
|
|
birth_date = patient.get('birthDate', 'Unknown')
|
||
|
|
gender = patient.get('gender', 'Unknown')
|
||
|
|
|
||
|
|
# Address (optional)
|
||
|
|
address = patient.get('address', [{}])[0]
|
||
|
|
city = address.get('city', '')
|
||
|
|
state = address.get('state', '')
|
||
|
|
```
|
||
|
|
|
||
|
|
### Condition Resource
|
||
|
|
|
||
|
|
```python
|
||
|
|
condition = entry['resource']
|
||
|
|
|
||
|
|
# Code -- check ALL coding entries, not just [0]
|
||
|
|
codings = condition.get('code', {}).get('coding', [])
|
||
|
|
for coding in codings:
|
||
|
|
system = coding.get('system', '')
|
||
|
|
code = coding.get('code', '')
|
||
|
|
display = coding.get('display', '')
|
||
|
|
if 'snomed' in system:
|
||
|
|
snomed_code = code
|
||
|
|
elif 'icd' in system.lower():
|
||
|
|
icd_code = code
|
||
|
|
|
||
|
|
# Clinical status
|
||
|
|
status_codings = condition.get('clinicalStatus', {}).get('coding', [])
|
||
|
|
clinical_status = status_codings[0]['code'] if status_codings else 'unknown'
|
||
|
|
# Values: "active", "recurrence", "relapse", "inactive", "remission", "resolved"
|
||
|
|
|
||
|
|
# Onset
|
||
|
|
onset = condition.get('onsetDateTime', condition.get('onsetPeriod', {}).get('start', 'Unknown'))
|
||
|
|
|
||
|
|
# Verification status (confirmed, unconfirmed, provisional, differential, refuted)
|
||
|
|
verification = condition.get('verificationStatus', {}).get('coding', [{}])[0].get('code', 'unknown')
|
||
|
|
```
|
||
|
|
|
||
|
|
### Observation Resource -- Simple (single value)
|
||
|
|
|
||
|
|
Most labs return a single value in `valueQuantity`:
|
||
|
|
```python
|
||
|
|
obs = entry['resource']
|
||
|
|
lab_name = obs['code']['coding'][0]['display']
|
||
|
|
loinc_code = obs['code']['coding'][0]['code']
|
||
|
|
date = obs.get('effectiveDateTime', 'Unknown')
|
||
|
|
|
||
|
|
# Value -- multiple possible formats
|
||
|
|
if 'valueQuantity' in obs:
|
||
|
|
value = obs['valueQuantity']['value']
|
||
|
|
unit = obs['valueQuantity'].get('unit', '')
|
||
|
|
elif 'valueString' in obs:
|
||
|
|
value = obs['valueString'] # qualitative result like "negative"
|
||
|
|
unit = ''
|
||
|
|
elif 'valueCodeableConcept' in obs:
|
||
|
|
value = obs['valueCodeableConcept'].get('text', 'See coding')
|
||
|
|
unit = ''
|
||
|
|
else:
|
||
|
|
value = None # check component (see below)
|
||
|
|
|
||
|
|
# Reference range (from the lab, more accurate than general tables)
|
||
|
|
ref_ranges = obs.get('referenceRange', [])
|
||
|
|
if ref_ranges:
|
||
|
|
low = ref_ranges[0].get('low', {}).get('value')
|
||
|
|
high = ref_ranges[0].get('high', {}).get('value')
|
||
|
|
```
|
||
|
|
|
||
|
|
### Observation Resource -- Component (Blood Pressure)
|
||
|
|
|
||
|
|
Blood pressure in FHIR is typically a **component Observation** with LOINC `85354-9` (BP panel). Systolic and diastolic are nested inside `component[]`, NOT in `valueQuantity`:
|
||
|
|
|
||
|
|
```python
|
||
|
|
obs = entry['resource']
|
||
|
|
panel_code = obs['code']['coding'][0]['code']
|
||
|
|
|
||
|
|
if panel_code == '85354-9' or 'component' in obs:
|
||
|
|
systolic = None
|
||
|
|
diastolic = None
|
||
|
|
for comp in obs.get('component', []):
|
||
|
|
comp_code = comp['code']['coding'][0]['code']
|
||
|
|
if comp_code == '8480-6': # systolic
|
||
|
|
systolic = comp['valueQuantity']['value']
|
||
|
|
elif comp_code == '8462-4': # diastolic
|
||
|
|
diastolic = comp['valueQuantity']['value']
|
||
|
|
```
|
||
|
|
|
||
|
|
**Critical**: When querying for systolic BP (LOINC 8480-6), some FHIR servers return the panel Observation (85354-9) where systolic is inside `component`. Others return a standalone Observation with `valueQuantity`. Your code must handle both:
|
||
|
|
|
||
|
|
```python
|
||
|
|
def get_bp(patient_id, base_url):
|
||
|
|
"""Get most recent blood pressure, handling both panel and standalone formats."""
|
||
|
|
# Try panel first
|
||
|
|
r = fhir_get("Observation",
|
||
|
|
params={"patient": patient_id, "code": "85354-9",
|
||
|
|
"_sort": "-date", "_count": "1"}).json()
|
||
|
|
if r.get('entry'):
|
||
|
|
obs = r['entry'][0]['resource']
|
||
|
|
systolic = diastolic = None
|
||
|
|
for comp in obs.get('component', []):
|
||
|
|
c = comp['code']['coding'][0]['code']
|
||
|
|
if c == '8480-6':
|
||
|
|
systolic = comp['valueQuantity']['value']
|
||
|
|
elif c == '8462-4':
|
||
|
|
diastolic = comp['valueQuantity']['value']
|
||
|
|
if systolic is not None:
|
||
|
|
return systolic, diastolic, obs.get('effectiveDateTime', 'Unknown')
|
||
|
|
|
||
|
|
# Fallback: standalone systolic
|
||
|
|
r = fhir_get("Observation",
|
||
|
|
params={"patient": patient_id, "code": "8480-6",
|
||
|
|
"_sort": "-date", "_count": "1"}).json()
|
||
|
|
if r.get('entry'):
|
||
|
|
obs = r['entry'][0]['resource']
|
||
|
|
systolic = obs.get('valueQuantity', {}).get('value')
|
||
|
|
return systolic, None, obs.get('effectiveDateTime', 'Unknown')
|
||
|
|
|
||
|
|
return None, None, None
|
||
|
|
```
|
||
|
|
|
||
|
|
### MedicationRequest Resource
|
||
|
|
|
||
|
|
```python
|
||
|
|
med = entry['resource']
|
||
|
|
|
||
|
|
# Medication name -- check multiple locations
|
||
|
|
med_name = (
|
||
|
|
med.get('medicationCodeableConcept', {}).get('text')
|
||
|
|
or med.get('medicationCodeableConcept', {}).get('coding', [{}])[0].get('display')
|
||
|
|
or 'Unknown medication'
|
||
|
|
)
|
||
|
|
|
||
|
|
# Status
|
||
|
|
status = med.get('status', 'unknown') # active, on-hold, cancelled, completed, stopped
|
||
|
|
|
||
|
|
# Dosage
|
||
|
|
dosage_instructions = med.get('dosageInstruction', [{}])
|
||
|
|
dosage_text = dosage_instructions[0].get('text', 'No dosage recorded') if dosage_instructions else 'No dosage recorded'
|
||
|
|
|
||
|
|
# Authored date
|
||
|
|
authored = med.get('authoredOn', 'Unknown')
|
||
|
|
```
|
||
|
|
|
||
|
|
## Pagination
|
||
|
|
|
||
|
|
FHIR responses default to 20 results per page (server-dependent). Always handle pagination for cohort queries:
|
||
|
|
|
||
|
|
```python
|
||
|
|
def get_all_pages(url, params=None):
|
||
|
|
"""Fetch all pages of a FHIR Bundle search."""
|
||
|
|
all_entries = []
|
||
|
|
if params:
|
||
|
|
r = fhir_get(url, params)
|
||
|
|
else:
|
||
|
|
r = fhir_get(url)
|
||
|
|
all_entries.extend(r.get('entry', []))
|
||
|
|
|
||
|
|
# Follow 'next' links
|
||
|
|
while True:
|
||
|
|
next_url = None
|
||
|
|
for link in r.get('link', []):
|
||
|
|
if link.get('relation') == 'next':
|
||
|
|
next_url = link['url']
|
||
|
|
break
|
||
|
|
if not next_url:
|
||
|
|
break
|
||
|
|
r = fhir_get(next_url)
|
||
|
|
all_entries.extend(r.get('entry', []))
|
||
|
|
|
||
|
|
return all_entries
|
||
|
|
```
|
||
|
|
|
||
|
|
For large cohorts, set `_count=200` to reduce the number of pages. The SMART test server caps at ~1000 results regardless.
|
||
|
|
|
||
|
|
## Batched and Multi-Patient Queries (CRITICAL for Performance)
|
||
|
|
|
||
|
|
**Never loop over patients making individual FHIR calls.** Each HTTP call through the sandbox proxy adds 1-3 seconds of latency. For 24 patients x 4 LOINC codes, that is 96 sequential calls = 5+ minutes. Instead, fetch all observations for a LOINC code in one request and filter client-side in Python.
|
||
|
|
|
||
|
|
### Pattern 1: Fetch all Observations for a LOINC code (preferred)
|
||
|
|
|
||
|
|
Query Observation by code alone (no patient filter) to get results for ALL patients in one call:
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /Observation?code={loinc}&_count=500&_sort=-date
|
||
|
|
```
|
||
|
|
|
||
|
|
Then filter in Python by patient reference:
|
||
|
|
|
||
|
|
```python
|
||
|
|
def get_all_obs_for_code(loinc_code, count=500):
|
||
|
|
"""Fetch ALL observations for a LOINC code across all patients in one call."""
|
||
|
|
entries = get_all_pages("Observation", {"code": loinc_code, "_count": str(count), "_sort": "-date"})
|
||
|
|
# Build dict: patient_id -> list of observations (already sorted newest first)
|
||
|
|
by_patient = {}
|
||
|
|
for e in entries:
|
||
|
|
obs = e['resource']
|
||
|
|
ref = obs.get('subject', {}).get('reference', '')
|
||
|
|
pid = ref.split('/')[-1] if '/' in ref else ref
|
||
|
|
if pid not in by_patient:
|
||
|
|
by_patient[pid] = obs # keep only the most recent per patient
|
||
|
|
return by_patient
|
||
|
|
```
|
||
|
|
|
||
|
|
### Pattern 2: Multi-patient query parameter
|
||
|
|
|
||
|
|
Some FHIR servers accept comma-separated patient references:
|
||
|
|
|
||
|
|
```
|
||
|
|
GET /Observation?patient=Patient/X,Patient/Y,Patient/Z&code={loinc}&_count=500
|
||
|
|
```
|
||
|
|
|
||
|
|
The SMART test server supports this. Use it when you have a specific list of patient IDs and want to avoid fetching observations for patients not in your cohort.
|
||
|
|
|
||
|
|
### Pattern 3: FHIR Batch Bundle
|
||
|
|
|
||
|
|
For heterogeneous queries (different resource types per patient), POST a Bundle of type `batch` to the server root:
|
||
|
|
|
||
|
|
```python
|
||
|
|
def fhir_batch(requests_list):
|
||
|
|
"""Execute multiple FHIR queries in a single HTTP call using a batch Bundle.
|
||
|
|
requests_list: list of {"method": "GET", "url": "Observation?patient=X&code=Y"} dicts.
|
||
|
|
"""
|
||
|
|
bundle = {
|
||
|
|
"resourceType": "Bundle",
|
||
|
|
"type": "batch",
|
||
|
|
"entry": [{"request": req} for req in requests_list]
|
||
|
|
}
|
||
|
|
import subprocess, json
|
||
|
|
r = subprocess.run(
|
||
|
|
["curl", "-sf", "--max-time", "60",
|
||
|
|
"-X", "POST", "-H", "Content-Type: application/fhir+json",
|
||
|
|
"-d", json.dumps(bundle), f"{BASE_URL}"],
|
||
|
|
capture_output=True, text=True, timeout=65
|
||
|
|
)
|
||
|
|
if r.returncode != 0 or not r.stdout.strip():
|
||
|
|
return []
|
||
|
|
result = json.loads(r.stdout)
|
||
|
|
return result.get('entry', [])
|
||
|
|
```
|
||
|
|
|
||
|
|
### When to use which pattern
|
||
|
|
|
||
|
|
| Scenario | Pattern | Why |
|
||
|
|
|----------|---------|-----|
|
||
|
|
| Labs for a cohort (same LOINC, many patients) | Pattern 1: code-only query | One call gets everything; filter in Python |
|
||
|
|
| Labs for a specific patient list | Pattern 2: comma-separated patients | Scoped to your cohort; one call |
|
||
|
|
| Mixed data (labs + meds + conditions per patient) | Pattern 3: batch Bundle | Multiple queries, single HTTP round-trip |
|
||
|
|
| Single patient lookup | Individual GET | Fine for 1-2 patients |
|
||
|
|
|
||
|
|
## Bulk FHIR (Production-Scale)
|
||
|
|
|
||
|
|
For production population health workflows (not used in this demo, but important context): FHIR Bulk Data Access (SMART/HL7) allows exporting entire patient populations as NDJSON files via an async `$export` operation. This is how real quality measure engines work at scale -- they don't query patient-by-patient. The demo uses individual queries for clarity and because the public test server doesn't support Bulk FHIR.
|
||
|
|
|
||
|
|
## Error Handling
|
||
|
|
|
||
|
|
- Always check `response.status_code` before parsing JSON
|
||
|
|
- Check if `entry` exists in the response before iterating: `bundle.get('entry', [])`
|
||
|
|
- Some fields may be missing -- use `.get()` with sensible defaults
|
||
|
|
- Never fabricate data. If a field is absent, report "Not recorded"
|
||
|
|
- Handle `OperationOutcome` responses (FHIR error format):
|
||
|
|
```python
|
||
|
|
if response.json().get('resourceType') == 'OperationOutcome':
|
||
|
|
issues = response.json().get('issue', [])
|
||
|
|
error_msg = issues[0].get('diagnostics', 'Unknown FHIR error') if issues else 'Unknown error'
|
||
|
|
```
|
||
|
|
- Rate limiting: The public SMART test server has no rate limits, but production endpoints may. Add a small delay (0.1-0.5s) between calls in tight loops
|
||
|
|
- Timeout: Use `--max-time 30` with curl; FHIR servers can be slow under load
|