dgx-spark-playbooks/nvidia/station-healthcare-agent/assets/skills/fhir-basics/SKILL.md
2026-05-26 18:25:53 +00:00

name: fhir-basics
description: Teaches agents how FHIR R4 APIs work, what resources are available, how to query them with search parameters, and how to correctly parse all response formats including component Observations.
metadata: {"openclaw": {"requires": {"bins": ["python3"]}}}

FHIR Data Retrieval

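Important: In this sandbox, run Python scripts with python (not python3). Use subprocess.run(["curl", "-sf", url], capture_output=True, text=True) for all FHIR HTTP calls, because the requests library does NOT work through the sandbox proxy. Always parse the output with json.loads(). The code in this skill calls a helper named fhir_get(path_or_url, params=None); it is assumed to wrap this curl call and return the parsed JSON as a dict.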

FHIR (Fast Healthcare Interoperability Resources) R4 is the standard API format mandated by the 21st Century Cures Act for US healthcare interoperability. ~70% of US hospitals expose FHIR R4 endpoints (ONC 2024). Searches are REST GET requests that return JSON Bundles; a read by ID (for example GET /Patient/{id}) returns the single resource instead.

Default FHIR Endpoint

Unless the user specifies a different FHIR server, always use: https://r4.smarthealthit.org

This is the SMART on FHIR public test server with synthetic (Synthea) patient data. No authentication required.

Query format for this server: Use bare codes without system URIs. Example: code=44054006 NOT code=http://snomed.info/sct|44054006. The test server does not support system-qualified code searches and will return empty results.
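
The helper name fhir_get appears throughout this skill without a definition. The following is a minimal sketch of what it could look like, assuming the curl-through-subprocess rule and the default endpoint described above. The names build_url and BASE_URL, the 30-second timeout, and the choice to return an empty dict on failure are assumptions made for illustration, not part of the original skill.

```python
import json
import subprocess
from urllib.parse import urlencode

# Default endpoint named in this skill; override per call if the user gives another server.
BASE_URL = "https://r4.smarthealthit.org"


def build_url(path, params=None, base_url=BASE_URL):
    """Join a resource path (or an absolute 'next' link) with optional search params."""
    if path.startswith(("http://", "https://")):
        url = path  # pagination links are already absolute
    else:
        url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    if params:
        separator = "&" if "?" in url else "?"
        # doseq=True lets a list value become a repeated parameter
        url = f"{url}{separator}{urlencode(params, doseq=True)}"
    return url


def fhir_get(path, params=None, base_url=BASE_URL, timeout=30):
    """GET a FHIR URL with curl and return the parsed JSON dict ({} on any failure)."""
    url = build_url(path, params, base_url)
    result = subprocess.run(
        ["curl", "-sf", "--max-time", str(timeout), url],
        capture_output=True, text=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return {}
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError:
        return {}
```

With this sketch, fhir_get("Condition", {"code": "44054006", "_count": "200"}) would issue the bare-code cohort query against the default server.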

Authentication

  • Public test servers (e.g., https://r4.smarthealthit.org): No authentication required. Synthetic data, real FHIR format.
  • Production hospital endpoints: Use SMART on FHIR OAuth2 flows. Requires a client_id, redirect URI, and scope negotiation. Access tokens are passed as Authorization: Bearer {token} headers.
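
As a rough illustration of the two cases above, the sketch below builds the curl argument list with or without a bearer token. The function name curl_command and the Accept header are assumptions for this example; obtaining the token through the SMART on FHIR OAuth2 flow is out of scope here.

```python
def curl_command(url, token=None, timeout=30):
    """Build a curl argv for a FHIR GET, adding a bearer token only when one is supplied."""
    cmd = [
        "curl", "-sf", "--max-time", str(timeout),
        "-H", "Accept: application/fhir+json",
    ]
    if token:
        # Production endpoints: access token goes in the Authorization header
        cmd += ["-H", f"Authorization: Bearer {token}"]
    cmd.append(url)
    return cmd
```

The returned list can be passed directly to subprocess.run(..., capture_output=True, text=True).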

Resource Endpoints and Search Parameters

Patient

GET /Patient                              -- all patients (paginated)
GET /Patient?name=Smith                   -- search by family or given name
GET /Patient?name=John&name=Smith         -- both terms must match (AND); use given= and family= to target one name part
GET /Patient?birthdate=1970-01-01         -- exact birthdate
GET /Patient?birthdate=ge1960-01-01&birthdate=le1970-12-31  -- date range
GET /Patient?gender=male                  -- filter by gender
GET /Patient/{id}                         -- get a specific patient by ID
GET /Patient?_count=50                    -- control page size
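
Repeated parameters such as the birthdate range cannot be expressed with a plain dict of strings, because a dict holds each key once. One way around this, sketched below with a hypothetical helper named search_query, is to pass a list as the value and let urllib.parse.urlencode expand it.

```python
from urllib.parse import urlencode


def search_query(resource, params=None):
    """Build a relative FHIR search URL; a list value becomes a repeated (AND) parameter."""
    if not params:
        return f"/{resource}"
    return f"/{resource}?{urlencode(params, doseq=True)}"


# Date range plus gender filter, mirroring the Patient examples above
print(search_query("Patient", {
    "birthdate": ["ge1960-01-01", "le1970-12-31"],
    "gender": "male",
}))
```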

Condition (Diagnoses)

GET /Condition?patient={id}                           -- all conditions for a patient
GET /Condition?patient={id}&clinical-status=active     -- only active conditions
GET /Condition?code={snomed_code}                      -- all patients with a condition (cohort query)
GET /Condition?code=44054006&_count=200                -- paginated cohort (bare code for test server)

For the default test server (r4.smarthealthit.org), always use bare codes (e.g., code=44054006). Production servers may require the full system URI (code=http://snomed.info/sct|44054006).
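
The bare-versus-qualified rule can be captured in a small helper. This is only a sketch: the function name code_param is hypothetical, and treating every non-default server as needing the system URI is an assumption, since the text only says production servers may require it.

```python
from urllib.parse import urlparse

SNOMED_SYSTEM = "http://snomed.info/sct"
DEFAULT_TEST_HOST = "r4.smarthealthit.org"


def code_param(code, system, base_url):
    """Return a bare code for the default test server, system|code for other servers."""
    host = urlparse(base_url).hostname or ""
    if host == DEFAULT_TEST_HOST:
        return code
    return f"{system}|{code}"
```

If a non-default server returns no results for the qualified form, retrying with the bare code is a reasonable fallback.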

Observation (Labs and Vitals)

GET /Observation?patient={id}                            -- all observations
GET /Observation?patient={id}&code={loinc}               -- specific lab by LOINC
GET /Observation?patient={id}&code={loinc}&_sort=-date&_count=1  -- most recent only
GET /Observation?patient={id}&category=vital-signs       -- vitals only
GET /Observation?patient={id}&category=laboratory        -- labs only
GET /Observation?patient={id}&date=ge2023-01-01          -- after a date

MedicationRequest (Prescriptions)

GET /MedicationRequest?patient={id}                  -- all prescriptions
GET /MedicationRequest?patient={id}&status=active    -- current prescriptions only
GET /MedicationRequest?patient={id}&_count=100       -- increase page size

Encounter (Visits)

GET /Encounter?patient={id}                          -- all encounters
GET /Encounter?patient={id}&_sort=-date&_count=5     -- 5 most recent visits
GET /Encounter?patient={id}&type=office              -- office visits only

DiagnosticReport (Lab Reports, Imaging)

GET /DiagnosticReport?patient={id}                   -- all reports
GET /DiagnosticReport?patient={id}&category=LAB      -- lab reports
GET /DiagnosticReport?patient={id}&category=imaging  -- imaging reports

Key LOINC Codes

| LOINC | Lab/Vital | Notes |
| --- | --- | --- |
| 4548-4 | Hemoglobin A1c (HbA1c) | Primary diabetes monitoring |
| 2345-7 | Glucose | Fasting or random |
| 2160-0 | Creatinine | Kidney function |
| 33914-3 | eGFR (MDRD formula) | Kidney function staging |
| 2093-3 | Total Cholesterol | Lipid panel |
| 2571-8 | Triglycerides | Lipid panel |
| 2085-9 | HDL Cholesterol | Lipid panel |
| 18262-6 | LDL Cholesterol | Lipid panel |
| 85354-9 | Blood Pressure panel | Component observation (see below) |
| 8480-6 | Systolic Blood Pressure | Component of BP panel, or standalone |
| 8462-4 | Diastolic Blood Pressure | Component of BP panel, or standalone |
| 42637-9 | BNP (B-type Natriuretic Peptide) | Heart failure marker |
| 33762-6 | NT-proBNP | Heart failure marker (alternative to BNP) |
| 6690-2 | WBC Count | Infection/inflammation |
| 718-7 | Hemoglobin | Anemia screening |
| 2823-3 | Potassium | Electrolyte; critical for ACEi/ARB/MRA monitoring |
| 2951-2 | Sodium | Electrolyte |
| 1742-6 | ALT | Liver function |
| 14959-1 | Microalbumin/Creatinine Ratio (urine) | Diabetic nephropathy screening |

Parsing FHIR JSON Responses

Bundle Structure

Every search returns a Bundle:

{
  "resourceType": "Bundle",
  "type": "searchset",
  "total": 42,
  "entry": [ { "resource": { ... } }, ... ],
  "link": [
    { "relation": "self", "url": "..." },
    { "relation": "next", "url": "..." }
  ]
}

Always check bundle.get('entry', []) before iterating -- an empty result returns a Bundle with no entry key.
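
A small generator keeps the empty-Bundle check in one place. The sketch below assumes the Bundle shape shown above; the helper names iter_resources and next_link are illustrative and not part of the original skill.

```python
def iter_resources(bundle, resource_type=None):
    """Yield each resource in a searchset Bundle, tolerating a missing 'entry' key."""
    for entry in (bundle or {}).get("entry", []):
        resource = entry.get("resource")
        if not resource:
            continue
        if resource_type and resource.get("resourceType") != resource_type:
            continue
        yield resource


def next_link(bundle):
    """Return the URL of the 'next' page, or None when this is the last page."""
    for link in (bundle or {}).get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None
```

Filtering by resource_type is useful because a server may include other resources, such as an OperationOutcome, alongside the matches.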

Patient Resource

patient = entry['resource']
patient_id = patient['id']
name = (patient.get('name') or [{}])[0]        # name is optional in FHIR
given = (name.get('given') or [''])[0]         # first given name only
family = name.get('family', '')
full_name = f"{given} {family}".strip() or 'Not recorded'
birth_date = patient.get('birthDate', 'Unknown')
gender = patient.get('gender', 'Unknown')

# Address (optional)
address = (patient.get('address') or [{}])[0]
city = address.get('city', '')
state = address.get('state', '')

Condition Resource

condition = entry['resource']

# Code -- check ALL coding entries, not just [0]
snomed_code = icd_code = None  # stay None when that coding system is absent
codings = condition.get('code', {}).get('coding', [])
for coding in codings:
    system = coding.get('system', '').lower()
    code = coding.get('code', '')
    display = coding.get('display', '')
    if 'snomed' in system:
        snomed_code = code
    elif 'icd' in system:
        icd_code = code

# Clinical status
status_codings = condition.get('clinicalStatus', {}).get('coding', [])
clinical_status = status_codings[0]['code'] if status_codings else 'unknown'
# Values: "active", "recurrence", "relapse", "inactive", "remission", "resolved"

# Onset
onset = condition.get('onsetDateTime', condition.get('onsetPeriod', {}).get('start', 'Unknown'))

# Verification status (confirmed, unconfirmed, provisional, differential, refuted)
verification = (condition.get('verificationStatus', {}).get('coding') or [{}])[0].get('code', 'unknown')

Observation Resource -- Simple (single value)

Most labs return a single value in valueQuantity:

obs = entry['resource']
coding = (obs.get('code', {}).get('coding') or [{}])[0]
lab_name = coding.get('display') or obs.get('code', {}).get('text', 'Not recorded')  # display is optional
loinc_code = coding.get('code', '')
date = obs.get('effectiveDateTime', 'Unknown')

# Value -- multiple possible formats
if 'valueQuantity' in obs:
    value = obs['valueQuantity'].get('value')
    unit = obs['valueQuantity'].get('unit', '')
elif 'valueString' in obs:
    value = obs['valueString']    # qualitative result like "negative"
    unit = ''
elif 'valueCodeableConcept' in obs:
    value = obs['valueCodeableConcept'].get('text', 'See coding')
    unit = ''
else:
    value = None  # check component (see below)
    unit = ''

# Reference range (from the lab, more accurate than general tables)
low = high = None  # stay None when the lab supplied no range
ref_ranges = obs.get('referenceRange', [])
if ref_ranges:
    low = ref_ranges[0].get('low', {}).get('value')
    high = ref_ranges[0].get('high', {}).get('value')

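
Once the value and the lab-supplied range are extracted, comparing them is straightforward. The sketch below is one possible way to do it; the function name flag_against_range and the returned labels are assumptions, and the sample numbers in the usage lines are made up for illustration rather than clinical reference values.

```python
def flag_against_range(value, ref_ranges):
    """Compare a numeric result with the first referenceRange entry.

    Returns 'low', 'high', 'within range', or None when no comparison is possible.
    """
    if not isinstance(value, (int, float)) or not ref_ranges:
        return None  # qualitative result or no range supplied by the lab
    low = ref_ranges[0].get("low", {}).get("value")
    high = ref_ranges[0].get("high", {}).get("value")
    if low is None and high is None:
        return None
    if low is not None and value < low:
        return "low"
    if high is not None and value > high:
        return "high"
    return "within range"


# Illustrative values only
sample_range = [{"low": {"value": 4.0}, "high": {"value": 5.6}}]
print(flag_against_range(7.2, sample_range))
```

When the function returns None, report the range as "Not recorded" instead of substituting a general reference table.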
Observation Resource -- Component (Blood Pressure)

Blood pressure in FHIR is typically a component Observation with LOINC 85354-9 (BP panel). Systolic and diastolic are nested inside component[], NOT in valueQuantity:

obs = entry['resource']
panel_code = obs['code']['coding'][0]['code']

if panel_code == '85354-9' or 'component' in obs:
    systolic = None
    diastolic = None
    for comp in obs.get('component', []):
        comp_code = comp['code']['coding'][0]['code']
        comp_value = comp.get('valueQuantity', {}).get('value')  # None if the component has no numeric value
        if comp_code == '8480-6':  # systolic
            systolic = comp_value
        elif comp_code == '8462-4':  # diastolic
            diastolic = comp_value

Critical: When querying for systolic BP (LOINC 8480-6), some FHIR servers return the panel Observation (85354-9) where systolic is inside component. Others return a standalone Observation with valueQuantity. Your code must handle both:

def get_bp(patient_id, base_url=None):
    """Get most recent blood pressure, handling both panel and standalone formats.

    base_url is unused here; fhir_get is assumed to resolve the server root
    and to return the parsed JSON Bundle as a dict.
    """
    # Try panel first
    bundle = fhir_get("Observation",
                      {"patient": patient_id, "code": "85354-9",
                       "_sort": "-date", "_count": "1"})
    if bundle.get('entry'):
        obs = bundle['entry'][0]['resource']
        systolic = diastolic = None
        for comp in obs.get('component', []):
            c = comp['code']['coding'][0]['code']
            value = comp.get('valueQuantity', {}).get('value')
            if c == '8480-6':
                systolic = value
            elif c == '8462-4':
                diastolic = value
        if systolic is not None:
            return systolic, diastolic, obs.get('effectiveDateTime', 'Unknown')

    # Fallback: standalone systolic (no diastolic available in this format)
    bundle = fhir_get("Observation",
                      {"patient": patient_id, "code": "8480-6",
                       "_sort": "-date", "_count": "1"})
    if bundle.get('entry'):
        obs = bundle['entry'][0]['resource']
        systolic = obs.get('valueQuantity', {}).get('value')
        return systolic, None, obs.get('effectiveDateTime', 'Unknown')

    return None, None, None
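
The same two formats can also be handled on an Observation that has already been fetched, without another HTTP call. The sketch below is one way to do that; the name parse_bp is hypothetical, it checks every coding instead of only the first, and the sample readings are invented for illustration.

```python
SYSTOLIC, DIASTOLIC = "8480-6", "8462-4"


def _codes(codeable_concept):
    """Collect every code listed in a CodeableConcept's coding array."""
    return {c.get("code") for c in codeable_concept.get("coding", [])}


def parse_bp(obs):
    """Return (systolic, diastolic) from a panel or a standalone Observation."""
    systolic = diastolic = None
    # Panel format: values are nested inside component[]
    for comp in obs.get("component", []):
        codes = _codes(comp.get("code", {}))
        value = comp.get("valueQuantity", {}).get("value")
        if SYSTOLIC in codes:
            systolic = value
        elif DIASTOLIC in codes:
            diastolic = value
    # Standalone format: the value sits directly on the Observation
    if systolic is None and diastolic is None:
        codes = _codes(obs.get("code", {}))
        value = obs.get("valueQuantity", {}).get("value")
        if SYSTOLIC in codes:
            systolic = value
        elif DIASTOLIC in codes:
            diastolic = value
    return systolic, diastolic


# Invented sample panel for illustration
panel = {
    "code": {"coding": [{"code": "85354-9"}]},
    "component": [
        {"code": {"coding": [{"code": "8480-6"}]}, "valueQuantity": {"value": 128}},
        {"code": {"coding": [{"code": "8462-4"}]}, "valueQuantity": {"value": 82}},
    ],
}
print(parse_bp(panel))
```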

MedicationRequest Resource

med = entry['resource']

# Medication name -- check multiple locations
med_name = (
    med.get('medicationCodeableConcept', {}).get('text')
    or med.get('medicationCodeableConcept', {}).get('coding', [{}])[0].get('display')
    or 'Unknown medication'
)

# Status
status = med.get('status', 'unknown')  # active, on-hold, cancelled, completed, stopped

# Dosage
dosage_instructions = med.get('dosageInstruction', [{}])
dosage_text = dosage_instructions[0].get('text', 'No dosage recorded') if dosage_instructions else 'No dosage recorded'

# Authored date
authored = med.get('authoredOn', 'Unknown')

Pagination

FHIR responses default to 20 results per page (server-dependent). Always handle pagination for cohort queries:

def get_all_pages(url, params=None):
    """Fetch all pages of a FHIR Bundle search."""
    all_entries = []
    if params:
        r = fhir_get(url, params)
    else:
        r = fhir_get(url)
    all_entries.extend(r.get('entry', []))

    # Follow 'next' links
    while True:
        next_url = None
        for link in r.get('link', []):
            if link.get('relation') == 'next':
                next_url = link['url']
                break
        if not next_url:
            break
        r = fhir_get(next_url)
        all_entries.extend(r.get('entry', []))

    return all_entries

For large cohorts, set _count=200 to reduce the number of pages. The SMART test server caps at ~1000 results regardless.

Batched and Multi-Patient Queries (CRITICAL for Performance)

Never loop over patients making individual FHIR calls. Each HTTP call through the sandbox proxy adds 1-3 seconds of latency. For 24 patients × 4 LOINC codes, that is 96 sequential calls, or roughly 1.5 to 5 minutes of waiting. Instead, fetch all observations for a LOINC code in one request and filter client-side in Python.
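
The cost of the per-patient loop can be checked with a few lines of arithmetic. The sketch below uses the 1-3 second per-call latency quoted above; the function name is hypothetical.

```python
def sequential_latency_seconds(n_patients, n_codes, per_call_range=(1.0, 3.0)):
    """Return (call_count, best_case_seconds, worst_case_seconds) for a per-patient loop."""
    calls = n_patients * n_codes
    return calls, calls * per_call_range[0], calls * per_call_range[1]


calls, best, worst = sequential_latency_seconds(24, 4)
print(f"{calls} calls: {best / 60:.1f} to {worst / 60:.1f} minutes")
```

A code-only query replaces those 96 calls with one call per LOINC code (plus any extra pages), so four codes cost on the order of 4 to 12 seconds under the same latency assumption.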

Pattern 1: Fetch all Observations for a LOINC code (preferred)

Query Observation by code alone (no patient filter) to get results for ALL patients in one call:

GET /Observation?code={loinc}&_count=500&_sort=-date

Then filter in Python by patient reference:

def get_all_obs_for_code(loinc_code, count=500):
    """Fetch observations for a LOINC code across all patients; keep the newest per patient."""
    entries = get_all_pages("Observation", {"code": loinc_code, "_count": str(count), "_sort": "-date"})
    # Build dict: patient_id -> most recent observation.
    # Entries arrive newest first because of _sort=-date, so the first one seen per patient wins.
    by_patient = {}
    for e in entries:
        obs = e['resource']
        ref = obs.get('subject', {}).get('reference', '')
        pid = ref.split('/')[-1] if '/' in ref else ref
        if pid not in by_patient:
            by_patient[pid] = obs  # keep only the most recent per patient
    return by_patient

Pattern 2: Multi-patient query parameter

Some FHIR servers accept comma-separated patient references:

GET /Observation?patient=Patient/X,Patient/Y,Patient/Z&code={loinc}&_count=500

The SMART test server supports this. Use it when you have a specific list of patient IDs and want to avoid fetching observations for patients not in your cohort.
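
A sketch of how Pattern 2 might be wired up follows. The helper names multi_patient_params and group_by_patient are illustrative, and adding _sort=-date (so each patient's list arrives newest first) is an assumption carried over from Pattern 1, not part of the query shown above.

```python
def multi_patient_params(patient_ids, loinc_code, count=500):
    """Build search params with a comma-separated list of patient references."""
    return {
        "patient": ",".join(f"Patient/{pid}" for pid in patient_ids),
        "code": loinc_code,
        "_count": str(count),
        "_sort": "-date",  # assumption: newest first, as in Pattern 1
    }


def group_by_patient(entries):
    """Group Bundle entries into {patient_id: [observations]} in the order received."""
    grouped = {}
    for entry in entries:
        obs = entry.get("resource", {})
        ref = obs.get("subject", {}).get("reference", "")
        pid = ref.split("/")[-1]
        if pid:
            grouped.setdefault(pid, []).append(obs)
    return grouped
```

Patients with no matching observations are simply absent from the grouped dict, so report them as "Not recorded" instead of assuming a value.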

Pattern 3: FHIR Batch Bundle

For heterogeneous queries (different resource types per patient), POST a Bundle of type batch to the server root:

import json
import subprocess

BASE_URL = "https://r4.smarthealthit.org"  # default endpoint from this skill; change for other servers

def fhir_batch(requests_list):
    """Execute multiple FHIR queries in a single HTTP call using a batch Bundle.
    requests_list: list of {"method": "GET", "url": "Observation?patient=X&code=Y"} dicts.
    Returns the entries of the batch-response Bundle ([] on failure).
    """
    bundle = {
        "resourceType": "Bundle",
        "type": "batch",
        "entry": [{"request": req} for req in requests_list]
    }
    # POST the batch to the server root; -f makes curl exit non-zero on HTTP errors
    r = subprocess.run(
        ["curl", "-sf", "--max-time", "60",
         "-X", "POST", "-H", "Content-Type: application/fhir+json",
         "-d", json.dumps(bundle), BASE_URL],
        capture_output=True, text=True, timeout=65
    )
    if r.returncode != 0 or not r.stdout.strip():
        return []
    result = json.loads(r.stdout)
    return result.get('entry', [])
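
Each entry returned by fhir_batch wraps the result of one request: a response element with a status string, and for a search, a nested searchset Bundle in resource. The sketch below pairs the responses with the original requests by position, which relies on the FHIR rule that a batch-response lists entries in request order. The function name unpack_batch_response is hypothetical.

```python
def unpack_batch_response(requests_list, response_entries):
    """Map each request URL to the list of resources its search returned."""
    results = {}
    for req, entry in zip(requests_list, response_entries):
        status = entry.get("response", {}).get("status", "")
        inner = entry.get("resource", {})
        if status.startswith("2") and inner.get("resourceType") == "Bundle":
            # Successful search: pull resources out of the nested searchset
            results[req["url"]] = [
                e["resource"] for e in inner.get("entry", []) if "resource" in e
            ]
        else:
            # Failed or non-search entry: no data to report for this request
            results[req["url"]] = []
    return results
```

An empty list for a URL means either no matches or a failed sub-request; inspect the entry's response status when the difference matters.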

When to use which pattern

| Scenario | Pattern | Why |
| --- | --- | --- |
| Labs for a cohort (same LOINC, many patients) | Pattern 1: code-only query | One call gets everything; filter in Python |
| Labs for a specific patient list | Pattern 2: comma-separated patients | Scoped to your cohort; one call |
| Mixed data (labs + meds + conditions per patient) | Pattern 3: batch Bundle | Multiple queries, single HTTP round-trip |
| Single patient lookup | Individual GET | Fine for 1-2 patients |

Bulk FHIR (Production-Scale)

For production population health workflows (not used in this demo, but important context): FHIR Bulk Data Access (SMART/HL7) allows exporting entire patient populations as NDJSON files via an async $export operation. This is how real quality measure engines work at scale -- they don't query patient-by-patient. The demo uses individual queries for clarity and because the public test server doesn't support Bulk FHIR.

Error Handling

  • Always check the curl return code (r.returncode) and that r.stdout is non-empty before parsing JSON; with subprocess and curl there is no response.status_code, and -f makes curl exit non-zero on HTTP 4xx/5xx
  • Check if entry exists in the response before iterating: bundle.get('entry', [])
  • Some fields may be missing -- use .get() with sensible defaults
  • Never fabricate data. If a field is absent, report "Not recorded"
  • Handle OperationOutcome responses (FHIR error format). curl -f discards the body on HTTP errors, so drop -f when you need the diagnostics:
    body = json.loads(r.stdout)
    if body.get('resourceType') == 'OperationOutcome':
        issues = body.get('issue', [])
        error_msg = issues[0].get('diagnostics', 'Unknown FHIR error') if issues else 'Unknown error'
    
  • Rate limiting: The public SMART test server has no rate limits, but production endpoints may. Add a small delay (0.1-0.5s) between calls in tight loops
  • Timeout: Use --max-time 30 with curl; FHIR servers can be slow under load
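
The checks in this list can be folded into one function that turns a finished curl call into either data or an error message. This is a sketch under the sandbox's curl-based setup; the name interpret_curl_result and the (data, error) return shape are assumptions for illustration.

```python
import json


def interpret_curl_result(returncode, stdout):
    """Return (data, None) on success or (None, error_message) on any failure."""
    if returncode != 0:
        return None, f"curl failed with exit code {returncode}"
    if not stdout.strip():
        return None, "empty response body"
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    # FHIR servers report errors as an OperationOutcome resource
    if isinstance(data, dict) and data.get("resourceType") == "OperationOutcome":
        issues = data.get("issue", [])
        message = issues[0].get("diagnostics", "Unknown FHIR error") if issues else "Unknown error"
        return None, message
    return data, None
```

Typical use is data, error = interpret_curl_result(r.returncode, r.stdout) right after subprocess.run, reporting the error text to the user instead of guessing at missing data.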