---
name: fhir-basics
description: Teaches agents how FHIR R4 APIs work, what resources are available, how to query them with search parameters, and how to correctly parse all response formats including component Observations.
metadata:
  openclaw:
    requires:
      bins: ["python3"]
---

# FHIR Data Retrieval

**Important**: In this sandbox, run Python scripts with `python` (not `python3`). Use `subprocess.run(["curl", "-sf", url], capture_output=True, text=True)` for all FHIR HTTP calls; the `requests` library does NOT work through the sandbox proxy. Always parse the output with `json.loads()`.

FHIR (Fast Healthcare Interoperability Resources) R4 is the standard API format mandated by the 21st Century Cures Act for US healthcare interoperability. ~70% of US hospitals expose FHIR R4 endpoints (ONC 2024). All queries use REST GET requests returning JSON Bundles.

## Default FHIR Endpoint

Unless the user specifies a different FHIR server, always use:

`https://r4.smarthealthit.org`

This is the SMART on FHIR public test server with synthetic (Synthea) patient data. No authentication required.

**Query format for this server**: Use bare codes without system URIs. Example: `code=44054006` NOT `code=http://snomed.info/sct|44054006`. The test server does not support system-qualified code searches and will return empty results.

## Authentication

- **Public test servers** (e.g., `https://r4.smarthealthit.org`): No authentication required. Synthetic data, real FHIR format.
- **Production hospital endpoints**: Use SMART on FHIR OAuth2 flows. Requires a `client_id`, redirect URI, and scope negotiation. Access tokens are passed as `Authorization: Bearer {token}` headers.

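Later examples in this document call a `fhir_get` helper that is never defined. The following is a minimal sketch of what such a helper might look like under the curl-via-`subprocess` convention stated above. The names `build_url` and `BASE_URL`, the dict-returning behavior, and the "empty dict on failure" choice are assumptions made for illustration, not part of any library.

```python
import json
import subprocess
from urllib.parse import urlencode

# Default endpoint named in this document; swap it if the user gives another server.
BASE_URL = "https://r4.smarthealthit.org"


def build_url(path, params=None, base_url=BASE_URL):
    """Build an absolute URL. Absolute URLs (e.g. Bundle 'next' links) pass through."""
    url = path if path.startswith("http") else f"{base_url}/{path.lstrip('/')}"
    if params:
        sep = "&" if "?" in url else "?"
        # doseq=True lets a list value repeat a parameter (e.g. two birthdate bounds)
        url += sep + urlencode(params, doseq=True)
    return url


def fhir_get(path, params=None, timeout=30):
    """Hypothetical helper: GET via curl and return the parsed JSON body as a dict.

    Returns {} when curl fails, times out, or the body is not valid JSON.
    """
    url = build_url(path, params)
    try:
        proc = subprocess.run(
            ["curl", "-sf", "--max-time", str(timeout), url],
            capture_output=True, text=True, timeout=timeout + 5,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return {}
    if proc.returncode != 0 or not proc.stdout.strip():
        return {}
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError:
        return {}
```

With this shape, `fhir_get("Patient", {"name": "Smith"})` would return a Bundle dict that can be read with `bundle.get('entry', [])`. Returning `{}` on failure keeps call sites simple, at the cost of hiding the reason for the failure; a stricter variant could raise instead.
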
## Resource Endpoints and Search Parameters

### Patient

```
GET /Patient                                                 -- all patients (paginated)
GET /Patient?name=Smith                                      -- search by family or given name
GET /Patient?name=John&name=Smith                            -- search by given AND family name
GET /Patient?birthdate=1970-01-01                            -- exact birthdate
GET /Patient?birthdate=ge1960-01-01&birthdate=le1970-12-31   -- date range
GET /Patient?gender=male                                     -- filter by gender
GET /Patient/{id}                                            -- get a specific patient by ID
GET /Patient?_count=50                                       -- control page size
```

### Condition (Diagnoses)

```
GET /Condition?patient={id}                          -- all conditions for a patient
GET /Condition?patient={id}&clinical-status=active   -- only active conditions
GET /Condition?code={snomed_code}                    -- all patients with a condition (cohort query)
GET /Condition?code=44054006&_count=200              -- paginated cohort (bare code for test server)
```

For the default test server (`r4.smarthealthit.org`), always use bare codes (e.g., `code=44054006`). Production servers may require the full system URI (`code=http://snomed.info/sct|44054006`).

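The bare-code versus system-qualified distinction above can be sketched as a small query builder. The function names `token` and `condition_cohort_query` are hypothetical and exist only for this illustration; the sketch assumes the caller knows which form the target server expects.

```python
from urllib.parse import urlencode

SNOMED = "http://snomed.info/sct"


def token(code, system=None):
    """Return a FHIR token value: bare code, or 'system|code' when a system is given."""
    return f"{system}|{code}" if system else code


def condition_cohort_query(code, system=None, count=200):
    """Build a relative Condition search URL for a cohort query (hypothetical helper)."""
    params = {"code": token(code, system), "_count": str(count)}
    # urlencode percent-encodes ':', '/' and '|' in the system-qualified form
    return "Condition?" + urlencode(params)


# Bare code, as this document recommends for the default test server
bare = condition_cohort_query("44054006")

# System-qualified code, as a production server may require
qualified = condition_cohort_query("44054006", system=SNOMED)
```

Keeping the system as an optional argument means the same call site can serve both kinds of server by changing one parameter.
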
### Observation (Labs and Vitals)

```
GET /Observation?patient={id}                                       -- all observations
GET /Observation?patient={id}&code={loinc}                          -- specific lab by LOINC
GET /Observation?patient={id}&code={loinc}&_sort=-date&_count=1     -- most recent only
GET /Observation?patient={id}&category=vital-signs                  -- vitals only
GET /Observation?patient={id}&category=laboratory                   -- labs only
GET /Observation?patient={id}&date=ge2023-01-01                     -- after a date
```

### MedicationRequest (Prescriptions)

```
GET /MedicationRequest?patient={id}                 -- all prescriptions
GET /MedicationRequest?patient={id}&status=active   -- current prescriptions only
GET /MedicationRequest?patient={id}&_count=100      -- increase page size
```

### Encounter (Visits)

```
GET /Encounter?patient={id}                         -- all encounters
GET /Encounter?patient={id}&_sort=-date&_count=5    -- 5 most recent visits
GET /Encounter?patient={id}&type=office             -- office visits only
```

### DiagnosticReport (Lab Reports, Imaging)

```
GET /DiagnosticReport?patient={id}                    -- all reports
GET /DiagnosticReport?patient={id}&category=LAB       -- lab reports
GET /DiagnosticReport?patient={id}&category=imaging   -- imaging reports
```

## Key LOINC Codes

| LOINC | Lab/Vital | Notes |
|-------|-----------|-------|
| 4548-4 | Hemoglobin A1c (HbA1c) | Primary diabetes monitoring |
| 2345-7 | Glucose | Fasting or random |
| 2160-0 | Creatinine | Kidney function |
| 33914-3 | eGFR (creatinine-based, MDRD) | Kidney function staging |
| 2093-3 | Total Cholesterol | Lipid panel |
| 2571-8 | Triglycerides | Lipid panel |
| 2085-9 | HDL Cholesterol | Lipid panel |
| 18262-6 | LDL Cholesterol | Lipid panel |
| 85354-9 | Blood Pressure panel | Component observation (see below) |
| 8480-6 | Systolic Blood Pressure | Component of BP panel, or standalone |
| 8462-4 | Diastolic Blood Pressure | Component of BP panel, or standalone |
| 42637-9 | BNP (B-type Natriuretic Peptide) | Heart failure marker |
| 33762-6 | NT-proBNP | Heart failure marker (alternative to BNP) |
| 6690-2 | WBC Count | Infection/inflammation |
| 718-7 | Hemoglobin | Anemia screening |
| 2823-3 | Potassium | Electrolyte; critical for ACEi/ARB/MRA monitoring |
| 2951-2 | Sodium | Electrolyte |
| 1742-6 | ALT | Liver function |
| 14959-1 | Microalbumin/Creatinine Ratio (urine) | Diabetic nephropathy screening |

## Parsing FHIR JSON Responses

### Bundle Structure

Every search returns a Bundle:

```json
{
  "resourceType": "Bundle",
  "type": "searchset",
  "total": 42,
  "entry": [
    { "resource": { ... } },
    ...
  ],
  "link": [
    { "relation": "self", "url": "..." },
    { "relation": "next", "url": "..." }
  ]
}
```

Always read entries with `bundle.get('entry', [])` before iterating -- an empty result returns a Bundle with no `entry` key.

### Patient Resource

```python
patient = entry['resource']
patient_id = patient['id']
given = patient['name'][0].get('given', [''])[0]
family = patient['name'][0].get('family', '')
full_name = f"{given} {family}"
birth_date = patient.get('birthDate', 'Unknown')
gender = patient.get('gender', 'Unknown')

# Address (optional)
address = patient.get('address', [{}])[0]
city = address.get('city', '')
state = address.get('state', '')
```

### Condition Resource

```python
condition = entry['resource']

# Code -- check ALL coding entries, not just [0]
snomed_code = icd_code = None  # stay None if that code system is absent
codings = condition.get('code', {}).get('coding', [])
for coding in codings:
    system = coding.get('system', '')
    code = coding.get('code', '')
    display = coding.get('display', '')
    if 'snomed' in system.lower():
        snomed_code = code
    elif 'icd' in system.lower():
        icd_code = code

# Clinical status
status_codings = condition.get('clinicalStatus', {}).get('coding', [])
clinical_status = status_codings[0]['code'] if status_codings else 'unknown'
# Values: "active", "recurrence", "relapse", "inactive", "remission", "resolved"

# Onset
onset = condition.get('onsetDateTime', condition.get('onsetPeriod', {}).get('start', 'Unknown'))

# Verification status (confirmed, unconfirmed, provisional, differential, refuted)
verification = condition.get('verificationStatus',
    {}).get('coding', [{}])[0].get('code', 'unknown')
```

### Observation Resource -- Simple (single value)

Most labs return a single value in `valueQuantity`:

```python
obs = entry['resource']
lab_name = obs['code']['coding'][0]['display']
loinc_code = obs['code']['coding'][0]['code']
date = obs.get('effectiveDateTime', 'Unknown')

# Value -- multiple possible formats
if 'valueQuantity' in obs:
    value = obs['valueQuantity']['value']
    unit = obs['valueQuantity'].get('unit', '')
elif 'valueString' in obs:
    value = obs['valueString']  # qualitative result like "negative"
    unit = ''
elif 'valueCodeableConcept' in obs:
    value = obs['valueCodeableConcept'].get('text', 'See coding')
    unit = ''
else:
    value = None  # check component (see below)

# Reference range (from the lab, more accurate than general tables)
ref_ranges = obs.get('referenceRange', [])
if ref_ranges:
    low = ref_ranges[0].get('low', {}).get('value')
    high = ref_ranges[0].get('high', {}).get('value')
```

### Observation Resource -- Component (Blood Pressure)

Blood pressure in FHIR is typically a **component Observation** with LOINC `85354-9` (BP panel). Systolic and diastolic are nested inside `component[]`, NOT in `valueQuantity`:

```python
obs = entry['resource']
panel_code = obs['code']['coding'][0]['code']

if panel_code == '85354-9' or 'component' in obs:
    systolic = None
    diastolic = None
    for comp in obs.get('component', []):
        comp_code = comp['code']['coding'][0]['code']
        if comp_code == '8480-6':    # systolic
            systolic = comp['valueQuantity']['value']
        elif comp_code == '8462-4':  # diastolic
            diastolic = comp['valueQuantity']['value']
```

**Critical**: When querying for systolic BP (LOINC 8480-6), some FHIR servers return the panel Observation (85354-9) where systolic is inside `component`. Others return a standalone Observation with `valueQuantity`.
Your code must handle both:

```python
def get_bp(patient_id, base_url):
    """Get most recent blood pressure, handling both panel and standalone formats.

    Assumes fhir_get() returns the parsed JSON Bundle as a dict and already
    targets the configured server, so base_url is not used here.
    """
    # Try panel first
    r = fhir_get("Observation", params={"patient": patient_id, "code": "85354-9",
                                        "_sort": "-date", "_count": "1"})
    if r.get('entry'):
        obs = r['entry'][0]['resource']
        systolic = diastolic = None
        for comp in obs.get('component', []):
            c = comp['code']['coding'][0]['code']
            if c == '8480-6':
                systolic = comp.get('valueQuantity', {}).get('value')
            elif c == '8462-4':
                diastolic = comp.get('valueQuantity', {}).get('value')
        if systolic is not None:
            return systolic, diastolic, obs.get('effectiveDateTime', 'Unknown')

    # Fallback: standalone systolic
    r = fhir_get("Observation", params={"patient": patient_id, "code": "8480-6",
                                        "_sort": "-date", "_count": "1"})
    if r.get('entry'):
        obs = r['entry'][0]['resource']
        systolic = obs.get('valueQuantity', {}).get('value')
        return systolic, None, obs.get('effectiveDateTime', 'Unknown')

    return None, None, None
```

### MedicationRequest Resource

```python
med = entry['resource']

# Medication name -- check multiple locations
med_name = (
    med.get('medicationCodeableConcept', {}).get('text')
    or med.get('medicationCodeableConcept', {}).get('coding', [{}])[0].get('display')
    or 'Unknown medication'
)

# Status
status = med.get('status', 'unknown')  # active, on-hold, cancelled, completed, stopped

# Dosage
dosage_instructions = med.get('dosageInstruction', [{}])
dosage_text = dosage_instructions[0].get('text', 'No dosage recorded') if dosage_instructions else 'No dosage recorded'

# Authored date
authored = med.get('authoredOn', 'Unknown')
```

## Pagination

FHIR responses default to 20 results per page (server-dependent).
Always handle pagination for cohort queries:

```python
def get_all_pages(url, params=None):
    """Fetch all pages of a FHIR Bundle search.

    Assumes fhir_get() returns the parsed Bundle as a dict and accepts
    absolute URLs, since 'next' links are absolute.
    """
    all_entries = []
    if params:
        r = fhir_get(url, params)
    else:
        r = fhir_get(url)
    all_entries.extend(r.get('entry', []))

    # Follow 'next' links
    while True:
        next_url = None
        for link in r.get('link', []):
            if link.get('relation') == 'next':
                next_url = link['url']
                break
        if not next_url:
            break
        r = fhir_get(next_url)
        all_entries.extend(r.get('entry', []))
    return all_entries
```

For large cohorts, set `_count=200` to reduce the number of pages. The SMART test server caps at ~1000 results regardless.

## Batched and Multi-Patient Queries (CRITICAL for Performance)

**Never loop over patients making individual FHIR calls.** Each HTTP call through the sandbox proxy adds 1-3 seconds of latency. For 24 patients x 4 LOINC codes, that is 96 sequential calls, or roughly 2-5 minutes. Instead, fetch all observations for a LOINC code in one request and filter client-side in Python.

### Pattern 1: Fetch all Observations for a LOINC code (preferred)

Query Observation by code alone (no patient filter) to get results for ALL patients in one call:

```
GET /Observation?code={loinc}&_count=500&_sort=-date
```

Then filter in Python by patient reference:

```python
def get_all_obs_for_code(loinc_code, count=500):
    """Fetch observations for a LOINC code across all patients in one search.

    Returns a dict mapping patient_id to that patient's most recent observation.
    """
    entries = get_all_pages("Observation", {"code": loinc_code, "_count": str(count), "_sort": "-date"})
    # Entries arrive sorted newest first, so the first one seen per patient is the latest
    by_patient = {}
    for e in entries:
        obs = e['resource']
        ref = obs.get('subject', {}).get('reference', '')
        pid = ref.split('/')[-1] if '/' in ref else ref
        if pid not in by_patient:
            by_patient[pid] = obs  # keep only the most recent per patient
    return by_patient
```

### Pattern 2: Multi-patient query parameter

Some FHIR servers accept comma-separated patient references:

```
GET /Observation?patient=Patient/X,Patient/Y,Patient/Z&code={loinc}&_count=500
```

The SMART test server supports this. Use it when you have a specific list of patient IDs and want to avoid fetching observations for patients not in your cohort.

### Pattern 3: FHIR Batch Bundle

For heterogeneous queries (different resource types per patient), POST a Bundle of type `batch` to the server root:

```python
import json
import subprocess

BASE_URL = "https://r4.smarthealthit.org"  # default endpoint; change if the user specifies another


def fhir_batch(requests_list):
    """Execute multiple FHIR queries in a single HTTP call using a batch Bundle.

    requests_list: list of {"method": "GET", "url": "Observation?patient=X&code=Y"} dicts.
    Returns the response Bundle's entries, or [] if the call fails.
    """
    bundle = {
        "resourceType": "Bundle",
        "type": "batch",
        "entry": [{"request": req} for req in requests_list]
    }
    r = subprocess.run(
        ["curl", "-sf", "--max-time", "60", "-X", "POST",
         "-H", "Content-Type: application/fhir+json",
         "-d", json.dumps(bundle), BASE_URL],
        capture_output=True, text=True, timeout=65
    )
    if r.returncode != 0 or not r.stdout.strip():
        return []
    result = json.loads(r.stdout)
    return result.get('entry', [])
```

### When to use which pattern

| Scenario | Pattern | Why |
|----------|---------|-----|
| Labs for a cohort (same LOINC, many patients) | Pattern 1: code-only query | One call gets everything; filter in Python |
| Labs for a specific patient list | Pattern 2: comma-separated patients | Scoped to your cohort; one call |
| Mixed data (labs + meds + conditions per patient) | Pattern 3: batch Bundle | Multiple queries, single HTTP round-trip |
| Single patient lookup | Individual GET | Fine for 1-2 patients |

## Bulk FHIR (Production-Scale)

For production population health workflows (not used in this demo, but important context): FHIR Bulk Data Access (SMART/HL7) allows exporting entire patient populations as NDJSON files via an async `$export` operation. This is how real quality measure engines work at scale -- they don't query patient-by-patient.
The demo uses individual queries for clarity and because the public test server doesn't support Bulk FHIR.

## Error Handling

- Always check curl's exit status (`returncode`) and that stdout is non-empty before parsing JSON. With `-f`, curl exits non-zero on HTTP errors (status 400 and above) and usually prints no body
- Check if `entry` exists in the response before iterating: `bundle.get('entry', [])`
- Some fields may be missing -- use `.get()` with sensible defaults
- Never fabricate data. If a field is absent, report "Not recorded"
- Handle `OperationOutcome` responses (FHIR error format). Here `body` is the dict returned by `json.loads()`:
  ```python
  if body.get('resourceType') == 'OperationOutcome':
      issues = body.get('issue', [])
      error_msg = issues[0].get('diagnostics', 'Unknown FHIR error') if issues else 'Unknown error'
  ```
- Rate limiting: The public SMART test server has no rate limits, but production endpoints may. Add a small delay (0.1-0.5s) between calls in tight loops
- Timeout: Use `--max-time 30` with curl; FHIR servers can be slow under load
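
The checks in the list above can be combined into one parsing step applied to curl's stdout. This is a minimal sketch under stated assumptions: the `FhirError` class and the `entries_or_raise` function are hypothetical names introduced here, and raising an exception (rather than returning an empty list) is one possible design, not the document's prescribed one.

```python
import json


class FhirError(Exception):
    """Hypothetical exception for failed or malformed FHIR responses."""


def entries_or_raise(raw_text):
    """Parse curl stdout and return the Bundle's entries, or raise FhirError."""
    if not raw_text or not raw_text.strip():
        raise FhirError("Empty response (curl failed or the server returned nothing)")
    try:
        body = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise FhirError(f"Response was not valid JSON: {exc}") from exc
    if not isinstance(body, dict):
        raise FhirError("Unexpected JSON shape: expected an object")
    # FHIR error format: surface the first issue's diagnostics if present
    if body.get("resourceType") == "OperationOutcome":
        issues = body.get("issue", [])
        msg = issues[0].get("diagnostics", "Unknown FHIR error") if issues else "Unknown error"
        raise FhirError(msg)
    # An empty search result is a Bundle with no 'entry' key
    return body.get("entry", [])
```

A caller would pass `proc.stdout` from `subprocess.run(...)` after checking `proc.returncode`, and report "Not recorded" for absent fields rather than substituting values.
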