mirror of
https://github.com/NVIDIA/dgx-spark-playbooks.git
synced 2026-06-24 15:19:30 +00:00
4.8 KiB
4.8 KiB
| name | description | metadata | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| analysis-methods | Teaches the analyst agent how to write correct, robust Python analysis code for FHIR clinical data using pandas, matplotlib, and scipy. |
|
Analysis Code Guidelines
FHIR Helpers Library
Always import the helpers library at the top of every analysis script:
import sys
sys.path.insert(0, '/sandbox/clinical-intelligence/skills/analysis-methods/scripts')
from fhir_helpers import *
Available functions
| Function | Use for | HTTP calls |
|---|---|---|
get_patients_with_condition(snomed_code) |
Find patients with a condition → list of IDs | 1-2 |
get_latest_labs_batch(loinc_code, patient_ids) |
Labs for a cohort → dict: pid → (value, unit, date) | 1-2 |
get_all_medications_batch(patient_ids) |
Meds for a cohort → dict: pid → [med names] | 1-2 |
build_cohort_df(patient_ids, loinc, lab_name, drug_check_fn) |
Full DataFrame with labs + meds | 2-3 |
get_latest_lab(patient_id, loinc_code) |
Lab for ONE patient → (value, unit, date) | 1 |
get_medications(patient_id) |
Meds for ONE patient → [names] | 1 |
get_latest_bp(patient_id) |
BP for ONE patient → (sys, dia, date) | 1-2 |
check_drug_class(med_list, drug_names) |
Check if any med matches drug list → bool | 0 |
fhir_get(path, params) |
Raw FHIR GET → parsed JSON | 1 |
get_all_pages(path, params) |
Paginated FHIR GET → all entries | 1+ |
save_chart_to_canvas(fig, filename) |
Save matplotlib figure to canvas directory | 0 |
Performance rules
- Cohort queries (2+ patients): Use
get_latest_labs_batch()andget_all_medications_batch(). These make 1-2 HTTP calls total regardless of patient count. - Single patient: Use
get_latest_lab(),get_medications(),get_latest_bp(). - NEVER loop over patients calling
get_latest_lab()per patient. Each HTTP call through the sandbox proxy adds 1-3s. For 48 patients = 48 calls = 2+ minutes. The batch function does it in one call.
Execution Rules
- Run scripts with
python(NOTpython3) - Write a SINGLE Python script for the entire task
- Write the script to
/tmp/<name>.py, then execute it - All HTTP inside the sandbox must use
subprocess.run(["curl", ...])— therequestslibrary does NOT work
Mandatory Workflow
STEP 1 - WRITE SCRIPT (import fhir_helpers, write analysis)
STEP 2 - VALIDATE: python /sandbox/clinical-intelligence/scripts/validate_and_run.py --validate-only /tmp/<name>.py
STEP 3 - EXECUTE: python /tmp/<name>.py
STEP 4 - INTERPRET: explain results using clinical-knowledge skill
Code Structure
- Imports (always start with fhir_helpers import)
- Data collection (use batch functions)
- DataFrame construction
- Analysis (filters, aggregations)
- Visualization -- use
save_chart_to_canvas(fig, filename)(NOT plt.savefig) - Summary (print findings)
- Disclaimer
Care Gap Analysis Pattern
# Example: diabetes care gap
patients = get_patients_with_condition("44054006") # SNOMED for diabetes
df = build_cohort_df(patients, "4548-4", "HbA1c",
lambda meds: check_drug_class(meds, ["metformin", "insulin", "glipizide"]))
gap = df[(df['HbA1c'] > 9) & (~df['on_target_med'])]
denom = len(df[df['HbA1c'].notna()])
pct = f"{len(gap)/denom*100:.1f}%" if denom > 0 else "N/A (no HbA1c data)"
print(f"Care gap: {len(gap)}/{denom} ({pct})")
Visualization
Always use dark theme. Use save_chart_to_canvas() instead of plt.savefig() directly.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
plt.style.use('dark_background')
fig, ax = plt.subplots(figsize=(10, 6))
fig.patch.set_facecolor('#1a1a1a')
ax.set_facecolor('#1a1a1a')
# Histogram with NVIDIA green
ax.hist(values, bins=15, color='#76B900', edgecolor='#1a1a1a', alpha=0.85)
ax.axvline(x=threshold, color='#ff4444', linestyle='--', linewidth=2, label=f'Threshold ({threshold})')
ax.set_title("Title", fontsize=14, fontweight='bold', color='white')
ax.legend()
ax.grid(axis='y', alpha=0.2, color='#444444')
ax.text(0.98, 0.95, f"N = {len(values)}", transform=ax.transAxes, fontsize=11, color='#888888', ha='right', va='top')
# MANDATORY: use save_chart_to_canvas (NOT plt.savefig)
save_chart_to_canvas(fig, "chart.png")
plt.close()
Guardrails
- Never compute statistics on fewer than 5 data points
- Always report sample size: "45.0% (27 out of 60)"
- Flag data quality issues if >30% missing
- Do not fabricate data — report what exists, flag what's missing
- All charts must include N annotation
Output Format
End every script with:
print(f"\nDisclaimer: This analysis is for research and operational purposes.")
print("Clinical decisions should be made by qualified clinicians.")