Technical Runbook
Installation, configuration, troubleshooting, and operational details for the Clinical Intelligence Playbook.
Important
The supported install path is Docker Ollama via make up (see instructions.md Step 3 of the parent playbook, or SETUP-GUIDE.md). The manual steps below are for advanced users who want to wire components by hand or substitute a host Ollama install. If host Ollama is already running on port 11434, stop it (sudo systemctl stop ollama) or override OLLAMA_PORT in .env before starting Docker Ollama.
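Before starting Docker Ollama, it can help to confirm whether something is already listening on the Ollama port. The following is a minimal sketch, not part of the playbook's tooling; the function name `port_in_use` is hypothetical, and the port should match whatever OLLAMA_PORT is set to in .env.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    port = 11434  # default Ollama port; adjust if OLLAMA_PORT is overridden
    state = "in use" if port_in_use(port) else "free"
    print(f"Port {port} is {state}")
```

If the port is reported as in use, stop the host service or pick a different OLLAMA_PORT before running make up.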
Quick Start (OpenShell Sandbox on GB300)
This sets up an isolated OpenShell sandbox with OpenClaw and a local Ollama inference backend. Everything the agent touches — filesystem, network, processes — is confined to the sandbox.
Requirements: DGX Station GB300, Python 3.10+, uv, Docker + NVIDIA Container Toolkit, OpenShell CLI ≥ 0.0.33, Node.js 22+ LTS.
1. Install OpenShell
uv pip install openshell --upgrade --pre \
--index-url https://urm.nvidia.com/artifactory/api/pypi/nv-shared-pypi/simple
2. Start Ollama (Docker, recommended)
The bundled docker-compose.yml runs Ollama in a container, binds it to 0.0.0.0:${OLLAMA_PORT:-11434}, and pins it to a single GPU (LLM_GPU in .env):
make up # docker compose up -d ollama openfold3 + model-pull
make status # confirm Ollama and OpenFold3 are healthy
Tip
Host Ollama alternative. If you prefer running Ollama on the host (e.g., to share with another playbook), run sudo systemctl stop ollama first or set OLLAMA_PORT in .env to a free port. Then run OLLAMA_HOST=0.0.0.0 ollama serve and ollama pull nemotron-3-super:120b-a12b. The host's systemd unit binds to 127.0.0.1 by default; override it with Environment="OLLAMA_HOST=0.0.0.0" in /etc/systemd/system/ollama.service.d/override.conf so the sandbox can reach Ollama via the Docker bridge.
3. Pull Model (Docker path runs this automatically)
make up invokes the model-pull service, which pulls ${OLLAMA_MODEL:-nemotron-3-super:120b-a12b} (~86 GB) on first run. Subsequent runs skip the pull if the model is already cached.
If you opted into host Ollama, run ollama pull nemotron-3-super:120b-a12b manually.
4. Configure OpenShell Inference Provider
make setup runs this for you (and detects the Docker bridge IP automatically). To do it manually, replace <HOST_IP> with your Docker bridge IP (ip -4 addr show docker0 | grep -oP 'inet \K[\d.]+'):
openshell provider create \
--name ollama-local \
--type openai \
--credential OPENAI_API_KEY=ollama \
--config OPENAI_BASE_URL=http://<HOST_IP>:${OLLAMA_PORT:-11434}/v1
openshell inference set \
--provider ollama-local \
--model nemotron-3-super:120b-a12b
Note
Current OpenShell releases do not accept the --base-url shorthand for provider create. Use --config OPENAI_BASE_URL=... as shown above. The setup_sandbox.sh script uses this form.
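The bridge IP lookup and base URL construction from this step can be sketched in Python. This is an illustration under assumptions: the function names are hypothetical, and the sample text only imitates what `ip -4 addr show docker0` typically prints (172.17.0.1 is used purely as an example address).

```python
import re


def parse_bridge_ip(ip_output: str) -> str:
    """Extract the first IPv4 address from `ip -4 addr show docker0` output."""
    match = re.search(r"inet (\d+\.\d+\.\d+\.\d+)", ip_output)
    if match is None:
        raise ValueError("no IPv4 address found in docker0 output")
    return match.group(1)


def provider_base_url(bridge_ip: str, port: int = 11434) -> str:
    """Build the value passed as OPENAI_BASE_URL to `openshell provider create`."""
    return f"http://{bridge_ip}:{port}/v1"


# Illustrative sample only; real output depends on the host.
sample = """\
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
"""

print(provider_base_url(parse_bridge_ip(sample)))  # http://172.17.0.1:11434/v1
```

The URL must use the bridge address rather than 127.0.0.1, because the loopback address inside the sandbox does not refer to the host.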
5. Generate the Sandbox Policy
The repo's sandbox-policy.yaml is a template: it contains a __DOCKER_BRIDGE_IP__ placeholder that must be substituted with the host's docker0 IP before the policy is valid. The helper scripts/gen_sandbox_policy.sh does this automatically and writes sandbox-policy-local.yaml:
bash scripts/gen_sandbox_policy.sh # writes sandbox-policy-local.yaml
make setup invokes this generator. If you are running steps by hand, always pass the generated file (sandbox-policy-local.yaml) to openshell sandbox create, not the template.
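Conceptually, the generator performs a placeholder substitution. The sketch below shows that idea in Python; it is not the contents of scripts/gen_sandbox_policy.sh, and the function names and the template fragment are hypothetical.

```python
PLACEHOLDER = "__DOCKER_BRIDGE_IP__"


def render_policy(template_text: str, bridge_ip: str) -> str:
    """Replace the bridge-IP placeholder in a policy template."""
    if PLACEHOLDER not in template_text:
        raise ValueError(f"template has no {PLACEHOLDER} placeholder")
    return template_text.replace(PLACEHOLDER, bridge_ip)


def is_rendered(policy_text: str) -> bool:
    """A policy is safe to pass to `openshell sandbox create` only once
    no placeholder remains."""
    return PLACEHOLDER not in policy_text


# Made-up fragment for illustration; the real template has more fields.
template = "host: __DOCKER_BRIDGE_IP__\nport: 11434\n"
rendered = render_policy(template, "172.17.0.1")
print(is_rendered(template), is_rendered(rendered))  # False True
```

A check like `is_rendered` is a cheap guard against accidentally passing the template instead of sandbox-policy-local.yaml.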
Verify the policy is working after sandbox creation:
# These should succeed:
curl -sk 'https://r4.smarthealthit.org/Patient?_count=1' | head -3 # FHIR
curl -sk https://inference.local/v1/models | head -3 # LLM
# These should fail (blocked):
curl -sk --max-time 5 https://google.com # denied
curl -sk --max-time 5 https://api.openai.com # denied
ping 8.8.8.8 -c 1 # Operation not permitted
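The same allow/deny expectations can be expressed as a small table and compared against actual probe results. This is a sketch under assumptions: the names are hypothetical, and how a blocked request surfaces (timeout, refused connection, proxy error) depends on the sandbox, so the default probe treats any failure as "blocked".

```python
import ssl
import urllib.error
import urllib.request
from typing import Callable

# URL -> should it be reachable from inside the sandbox?
EXPECTED = {
    "https://r4.smarthealthit.org/Patient?_count=1": True,
    "https://inference.local/v1/models": True,
    "https://google.com": False,
    "https://api.openai.com": False,
}


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """True if the server answers at all (any HTTP status), else False."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # mirrors curl -k
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx):
            return True
    except urllib.error.HTTPError:
        return True  # an HTTP error status still means the host was reached
    except Exception:
        return False  # refused, timed out, DNS failure, blocked tunnel


def policy_mismatches(probe: Callable[[str], bool] = http_probe) -> list[str]:
    """Return one message per URL whose reachability differs from EXPECTED."""
    return [
        f"{url}: expected {'reachable' if want else 'blocked'}"
        for url, want in EXPECTED.items()
        if probe(url) != want
    ]
```

Running `policy_mismatches()` inside the sandbox should return an empty list when the policy behaves as described; the probe is injectable so the comparison can be exercised without network access.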
6. Create the Sandbox
# Stop any stale forwards from a prior sandbox using the same port — these
# block re-creation with a "port already forwarded" error.
for s in $(openshell forward list 2>/dev/null | awk '/:18789 /{print $NF}' | sort -u); do
openshell forward stop 18789 "$s" 2>/dev/null || true
done
openshell sandbox create \
--policy sandbox-policy-local.yaml \
--provider ollama-local \
--forward 18789 \
--keep
--keep prevents the sandbox from being torn down on exit so you can re-enter it later.
7. Inside the Sandbox: Set Up OpenClaw
Note: These manual steps (7-9) are automated by
make setup. Use them only if you need to customize individual steps.
Everything from here runs inside the sandbox shell that OpenShell drops you into.
git clone https://github.com/jaival-nvidia/clinical-intelligence.git
cd clinical-intelligence
uv pip install pandas matplotlib
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
export NVM_DIR="$HOME/.nvm" && [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"
nvm install --lts
npm install -g openclaw@latest
8. Inside the Sandbox: Configure Inference Provider
OpenShell maps the host Ollama to inference.local inside the sandbox. Save this as configure_inference_local.py and run it to register a custom OpenClaw provider:
#!/usr/bin/env python3
"""configure_inference_local.py — patches OpenClaw config for OpenShell sandbox."""
import json
from pathlib import Path
config_path = Path.home() / ".openclaw" / "openclaw.json"
config_path.parent.mkdir(parents=True, exist_ok=True)
config = json.loads(config_path.read_text()) if config_path.exists() else {}
config.setdefault("providers", {})
config["providers"]["local-ollama"] = {
"type": "openai-compatible",
"baseUrl": "https://inference.local/v1",
"apiKey": "ollama",
"models": ["nemotron-3-super:120b-a12b"]
}
defaults = config.setdefault("agents", {}).setdefault("defaults", {})
defaults["model"] = "local-ollama/nemotron-3-super:120b-a12b"
defaults["timeoutSeconds"] = 600
sub = defaults.setdefault("subagents", {})
sub.update({"maxSpawnDepth": 2, "maxConcurrent": 4, "runTimeoutSeconds": 600})
config_path.write_text(json.dumps(config, indent=2))
print(f"Wrote {config_path}")
print("Provider: local-ollama -> https://inference.local/v1")
print("Model: nemotron-3-super:120b-a12b")
python3 configure_inference_local.py
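After running the script, the written file can be sanity-checked. The following is a minimal sketch that only inspects the keys set by configure_inference_local.py above; the function name `config_problems` is hypothetical and this is not an OpenClaw validation API.

```python
import json
from pathlib import Path

MODEL = "nemotron-3-super:120b-a12b"


def config_problems(config: dict, model: str = MODEL) -> list[str]:
    """Return human-readable problems; an empty list means the keys look right."""
    problems = []
    provider = config.get("providers", {}).get("local-ollama")
    if provider is None:
        problems.append("providers.local-ollama is missing")
    else:
        if provider.get("baseUrl") != "https://inference.local/v1":
            problems.append("local-ollama baseUrl is not https://inference.local/v1")
        if model not in provider.get("models", []):
            problems.append(f"{model} is not listed under local-ollama models")
    defaults = config.get("agents", {}).get("defaults", {})
    if defaults.get("model") != f"local-ollama/{model}":
        problems.append("agents.defaults.model does not point at local-ollama")
    return problems


if __name__ == "__main__":
    path = Path.home() / ".openclaw" / "openclaw.json"
    config = json.loads(path.read_text()) if path.exists() else {}
    for problem in config_problems(config):
        print("PROBLEM:", problem)
```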
9. Inside the Sandbox: Install Skills and Agents
# Deploy all 7 skills to ~/.openclaw/workspace/skills/ (matching setup_sandbox.sh)
mkdir -p ~/.openclaw/workspace/skills
cp -r skills/fhir-basics skills/clinical-knowledge skills/analysis-methods \
skills/case-summary skills/cohort-compare skills/molecular-viz \
skills/clinical-delegation ~/.openclaw/workspace/skills/
# Register sub-agents
for agent in patient-data labs-vitals medications analyst molecular; do
mkdir -p ~/.openclaw/workspaces/$agent
cp agents/${agent}-agent.md ~/.openclaw/workspaces/$agent/AGENTS.md
openclaw agents add $agent \
--workspace ~/.openclaw/workspaces/$agent \
--model local-ollama/nemotron-3-super:120b-a12b \
--non-interactive
done
# TOOLS.md for sub-agents
for agent in patient-data labs-vitals medications analyst molecular; do
cat > ~/.openclaw/workspaces/$agent/TOOLS.md << 'TOOLSEOF'
# Tools
Use the `exec` tool to run Python scripts against the FHIR server.
Call exec with: cat > /tmp/script.py << 'PYEOF'
import subprocess, json
r = subprocess.run(["curl", "-sf", "--max-time", "30", "https://r4.smarthealthit.org/Patient?_count=1"],
capture_output=True, text=True, timeout=35)
data = json.loads(r.stdout)
# your code here
PYEOF
python3 /tmp/script.py
Always use `exec` to run code. Do NOT use `write` for scripts.
Use subprocess.run(["curl", ...]) + json.loads() for HTTP requests. Do NOT use the requests library.
Print all results to stdout.
TOOLSEOF
done
# Auth profiles (provider name must match: local-ollama)
AUTH='{"version":1,"profiles":{"ollama":{"type":"api_key","provider":"local-ollama","key":"ollama"}}}'
for agent in main patient-data labs-vitals medications analyst molecular; do
mkdir -p ~/.openclaw/agents/$agent/agent
echo "$AUTH" > ~/.openclaw/agents/$agent/agent/auth-profiles.json
done
# Workspace stubs
for agent in patient-data labs-vitals medications analyst molecular; do
echo "# Identity" > ~/.openclaw/workspaces/$agent/IDENTITY.md
echo "# User" > ~/.openclaw/workspaces/$agent/USER.md
done
echo -e "# Identity\nClinical Intelligence Coordinator" > ~/.openclaw/workspace/IDENTITY.md
echo "# User" > ~/.openclaw/workspace/USER.md
# Allow coordinator to spawn sub-agents + disable unused skills
python3 -c "
import json
with open('$HOME/.openclaw/openclaw.json') as f: d = json.load(f)
for a in d.get('agents', {}).get('list', []):
    if a['id'] == 'main': a['subagents'] = {'allowAgents': ['*']}
d.setdefault('tools', {})['sessions'] = {'visibility': 'all'}
d.setdefault('skills', {})['entries'] = {
'weather': {'enabled': False}, 'tmux': {'enabled': False},
'healthcheck': {'enabled': False}, 'gh-issues': {'enabled': False},
'skill-creator': {'enabled': False}, 'github': {'enabled': False}
}
with open('$HOME/.openclaw/openclaw.json', 'w') as f: json.dump(d, f, indent=2)
"
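Because a provider-name mismatch in the auth profiles is a common source of errors, a quick check over the files written above can be useful. This is a sketch under the directory layout used in this step; the function name is hypothetical.

```python
import json
from pathlib import Path

AGENTS = ["main", "patient-data", "labs-vitals", "medications", "analyst", "molecular"]


def agents_missing_auth(openclaw_dir: Path, provider: str = "local-ollama") -> list[str]:
    """Return agents whose auth-profiles.json is absent, unreadable,
    or has no profile for the given provider."""
    missing = []
    for agent in AGENTS:
        path = openclaw_dir / "agents" / agent / "agent" / "auth-profiles.json"
        try:
            profiles = json.loads(path.read_text()).get("profiles", {})
        except (OSError, json.JSONDecodeError):
            missing.append(agent)
            continue
        if not any(p.get("provider") == provider for p in profiles.values()):
            missing.append(agent)
    return missing


if __name__ == "__main__":
    print(agents_missing_auth(Path.home() / ".openclaw"))
```

An empty list means every agent, including main, has a profile whose provider is local-ollama.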
10. Verify Inside the Sandbox
# Verify inference routing
curl -sk https://inference.local/v1/models | head -3
# Verify FHIR access
curl -sk 'https://r4.smarthealthit.org/Patient?_count=1' | head -3
# Verify deny (should time out or fail)
curl -sk --max-time 5 https://google.com
# Run a test
openclaw agent --local --session-id smoke --thinking off \
--message "Say hello in one sentence" --timeout 120
Compatible Models
Any Ollama model with native tool calling works. Tested:
- nemotron-3-super:120b-a12b — recommended, MoE, fast inference
- qwen3:235b — larger, slower, reliable
- qwen3-coder:480b — best for code generation
- qwen3:32b — for constrained environments
- qwen2.5:72b — proven stable, good tool calling
To switch: run ollama pull <model>, then openshell inference set --provider ollama-local --model <model> (the same command used in step 4), then inside the sandbox run openclaw config set agents.defaults.model local-ollama/<model>.
OpenShell Sandbox Security
OpenShell sandboxes use an implicit-deny model. All network egress, filesystem writes, and resource usage are blocked by default. The sandbox-policy.yaml explicitly whitelists what the sandbox can reach.
In practice:
- The agent cannot reach the internet except for the FHIR endpoint in the policy.
- The agent cannot exfiltrate data — no outbound HTTP, no DNS to arbitrary hosts.
- LLM inference routes through inference.local, an OpenShell-managed bridge to the host Ollama. This never leaves the machine.
- Filesystem writes are confined to /tmp, /home/sandbox, and ~/.openclaw.
- Resource caps prevent runaway processes from consuming the host.
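The implicit-deny idea can be illustrated with a toy allowlist check. This is a conceptual sketch only and does not reflect how OpenShell implements enforcement; the `Endpoint` type and `is_allowed` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


def is_allowed(host: str, port: int, allowlist: set[Endpoint]) -> bool:
    """Implicit deny: anything not explicitly listed is rejected."""
    return Endpoint(host.lower(), port) in allowlist


# Only the public FHIR test server is listed here, for illustration.
ALLOWLIST = {Endpoint("r4.smarthealthit.org", 443)}

print(is_allowed("r4.smarthealthit.org", 443, ALLOWLIST))  # True
print(is_allowed("google.com", 443, ALLOWLIST))            # False
```

The key property is that there is no "deny list" to maintain: a destination is blocked simply by being absent from the policy.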
Modifying the Policy
To allow additional FHIR endpoints (e.g., a hospital server), add entries under network_policies in sandbox-policy.yaml:
hospital_fhir:
  name: hospital_fhir
  endpoints:
    - host: fhir.your-hospital.org
      port: 443
      protocol: https
      description: Hospital FHIR R4 endpoint
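Then regenerate the local policy and recreate the sandbox with the generated file, not the template:
bash scripts/gen_sandbox_policy.sh # rewrites sandbox-policy-local.yaml
openshell sandbox delete <sandbox-name>
openshell sandbox create --policy sandbox-policy-local.yaml --provider ollama-local --forward 18789 --keep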
Audit
OpenShell logs all network connections attempted by the sandbox, including blocked ones:
openshell sandbox logs <sandbox-name> --filter network
Running Workflows
All openclaw commands below run inside the OpenShell sandbox.
openshell sandbox connect <sandbox-name>
CMS122 Diabetes Quality Gap
openclaw agent --local --session-id test-cms122 \
--thinking off \
--message "Find all diabetic patients, get their latest A1c and medications. Identify gap patients with A1c above 9% not on insulin or GLP-1. Show the A1c distribution as a histogram." \
--timeout 600
Validate: python3 scripts/validate_and_run.py --validate-only /tmp/cms122.py
Patient Case Summary
openclaw agent --local --session-id test-case \
--thinking off \
--message "Look up the first patient. Compile a case summary: demographics, conditions, recent labs (flag abnormal), and medications." \
--timeout 600
Ad-Hoc Cross-Reference
openclaw agent --local --session-id test-adhoc \
--thinking off \
--message "Which patients have both diabetes and hypertension? For the overlap, get their latest HbA1c and blood pressure." \
--timeout 600
Conversational Follow-Up Test
Three turns in the same session (tests context persistence):
openclaw agent --local --session-id follow-up-test --thinking off \
--message "Find all diabetic patients. Print the count and list their IDs." \
--timeout 600
openclaw agent --local --session-id follow-up-test --thinking off \
--message "From those diabetic patients, which ones also have hypertension? Intersect the two groups." \
--timeout 600
openclaw agent --local --session-id follow-up-test --thinking off \
--message "For the patients with both diabetes and hypertension, get their latest HbA1c and eGFR. Flag anyone with eGFR below 60 as CKD risk." \
--timeout 600
Bulk Validation
for f in /tmp/cms122.py /tmp/case.py /tmp/adhoc.py /tmp/step1.py /tmp/step2.py /tmp/step3.py; do
echo "--- $f ---"
python3 scripts/validate_and_run.py --validate-only "$f"
done
Connecting a Hospital FHIR Endpoint
- Get OAuth2 credentials from your integration team (client_id, token_endpoint, scopes patient/*.read).
- Add the hospital host to sandbox-policy.yaml under network_policies and recreate the sandbox.
- Specify the FHIR URL directly in your prompt instead of https://r4.smarthealthit.org.
- Add an auth helper to skills/fhir-basics/SKILL.md for token-based requests.
- Test: python3 scripts/test-fhir.py --url https://your-hospital.org/fhir --token YOUR_TOKEN
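An auth helper for token-based requests might look like the sketch below. It follows the curl-via-subprocess convention from the sub-agent TOOLS.md; the function names are hypothetical, and obtaining the token from the OAuth2 token endpoint is left out because it depends on the hospital's setup.

```python
import json
import subprocess


def fhir_curl_args(url: str, token: str, max_time: int = 30) -> list[str]:
    """Build a curl command line that sends a bearer token with the request."""
    return [
        "curl", "-sf", "--max-time", str(max_time),
        "-H", f"Authorization: Bearer {token}",
        "-H", "Accept: application/fhir+json",
        url,
    ]


def fhir_get(url: str, token: str) -> dict:
    """Run the request and parse the JSON body; raises if curl fails."""
    result = subprocess.run(
        fhir_curl_args(url, token),
        capture_output=True, text=True, timeout=35, check=True,
    )
    return json.loads(result.stdout)
```

For example, `fhir_get("https://your-hospital.org/fhir/Patient?_count=1", token)` would return the parsed Bundle once the host has been added to the policy. Note that a token passed on the command line is visible in the process list, so a production helper may prefer a curl config file.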
Troubleshooting
Sandbox Provisioning Failures
| Symptom | Fix |
|---|---|
| sandbox create hangs | Check host network; openshell sandbox create --verbose for pull progress |
| failed to pull image | GHCR rate limit: docker login ghcr.io with a PAT, or wait and retry |
| policy validation error | Check YAML syntax in sandbox-policy.yaml; recreate the sandbox |
| provider not found | Run openshell provider list and confirm ollama-local exists |
inference.local Not Resolving
| Check | Fix |
|---|---|
| Ollama listening on 0.0.0.0? | ss -tlnp | grep 11434 on host. Docker Ollama (default): make down && make up. Host Ollama: add Environment="OLLAMA_HOST=0.0.0.0" to /etc/systemd/system/ollama.service.d/override.conf and sudo systemctl restart ollama. |
| Provider base URL correct? | openshell provider show ollama-local — must use host IP, not 127.0.0.1 |
| DNS working inside sandbox? | nslookup inference.local inside sandbox; recreate sandbox if broken |
| Firewall blocking bridge? | curl -sk https://inference.local/v1/models inside sandbox; check host firewall |
LLM Connection Failures
| Check | Fix |
|---|---|
| Ollama running? | curl -sk https://inference.local/v1/models — start Ollama on host if down |
| Model pulled? | ollama list on host — ollama pull nemotron-3-super:120b-a12b if missing |
| Model loaded? | ollama ps on host — warmup: curl -sk https://inference.local/v1/chat/completions -d '{"model":"nemotron-3-super:120b-a12b","messages":[{"role":"user","content":"ping"}]}' |
OpenClaw Auth Errors
"No API key found for provider" — Create auth profiles as shown in step 9. Provider name must match local-ollama.
"Profile ollama timed out" — Model unloaded. Warm it up from inside the sandbox: curl -sk https://inference.local/v1/chat/completions -d '{"model":"nemotron-3-super:120b-a12b","messages":[{"role":"user","content":"ping"}]}' > /dev/null
FHIR Empty Results
| Check | Fix |
|---|---|
| Reachable from sandbox? | curl -s https://r4.smarthealthit.org/metadata | head -20 inside sandbox |
| Blocked by policy? | openshell sandbox logs <sandbox-name> --filter network — look for denied connections |
| SNOMED code format? | Use bare codes: code=44054006. The test server returns empty results with system-qualified URIs. |
| Page size? | Ensure _count=200 in cohort queries |
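The last two rows can be combined into a small query builder. This is a sketch under assumptions: the function name is hypothetical, and querying the Condition resource by SNOMED code is assumed here as the way cohorts are looked up.

```python
from urllib.parse import urlencode


def condition_query(base_url: str, snomed_code: str, count: int = 200) -> str:
    """Build a Condition search URL with a bare SNOMED code and explicit page size."""
    if "|" in snomed_code:
        # system|code is the system-qualified token form the test server
        # is reported to return empty results for.
        raise ValueError("use a bare code such as 44054006, not system|code")
    params = urlencode({"code": snomed_code, "_count": count})
    return f"{base_url.rstrip('/')}/Condition?{params}"


print(condition_query("https://r4.smarthealthit.org", "44054006"))
# https://r4.smarthealthit.org/Condition?code=44054006&_count=200
```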
BP Values Always Null
Blood pressure is stored as a panel Observation (LOINC 85354-9) whose systolic and diastolic readings sit in its component array. The fhir-basics skill includes get_bp() that handles both panel and standalone formats. If the LLM queries 8480-6 (systolic) directly without checking the panel, regenerate.
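To make the two formats concrete, here is a minimal sketch of reading blood pressure from either a panel Observation or a standalone one. It is not the skill's get_bp() implementation; `extract_bp` and its helpers are hypothetical names, and only the standard LOINC codes for systolic (8480-6) and diastolic (8462-4) are assumed.

```python
SYSTOLIC, DIASTOLIC = "8480-6", "8462-4"
BY_CODE = {SYSTOLIC: "systolic", DIASTOLIC: "diastolic"}


def _codes(concept: dict) -> set:
    """Collect all coding.code values from a CodeableConcept."""
    return {c.get("code") for c in concept.get("coding", [])}


def _value(node: dict):
    """Return valueQuantity.value, or None when it is absent."""
    return node.get("valueQuantity", {}).get("value")


def extract_bp(observation: dict) -> dict:
    """Return {'systolic': ..., 'diastolic': ...} with None for missing readings."""
    bp = {"systolic": None, "diastolic": None}
    # Panel format (85354-9): readings live in observation["component"].
    # Standalone format: the observation itself carries a single reading.
    for node in [observation, *observation.get("component", [])]:
        for code in BY_CODE.keys() & _codes(node.get("code", {})):
            bp[BY_CODE[code]] = _value(node)
    return bp


panel = {
    "code": {"coding": [{"code": "85354-9"}]},
    "component": [
        {"code": {"coding": [{"code": "8480-6"}]}, "valueQuantity": {"value": 120}},
        {"code": {"coding": [{"code": "8462-4"}]}, "valueQuantity": {"value": 80}},
    ],
}
print(extract_bp(panel))  # {'systolic': 120, 'diastolic': 80}
```

A query that looks only at the top-level code and value of a panel Observation finds nothing, which is why the values come back null.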
Slow Performance
| Bottleneck | Fix |
|---|---|
| LLM generation > 120s | Check GPU utilization with nvidia-smi on host; model may be on CPU |
| FHIR queries > 2s/patient | Test server can be slow; use offline mode for demos |
| Model keeps unloading | Set keep_alive to 120m in warmup curl |
| First sandbox command slow | One-time init cost; subsequent commands are fast |
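The keep_alive suggestion in the table can be applied by adding the field to the warmup request body. The sketch below builds that request; the function names are hypothetical, and whether keep_alive is honored on the OpenAI-compatible route is an assumption taken from the table's advice rather than something verified here.

```python
import json

MODEL = "nemotron-3-super:120b-a12b"
WARMUP_URL = "https://inference.local/v1/chat/completions"


def warmup_payload(model: str = MODEL, keep_alive: str = "120m") -> str:
    """JSON body for a one-message warmup request that asks the server
    to keep the model loaded for the given duration."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "keep_alive": keep_alive,
    })


def warmup_curl_args(url: str = WARMUP_URL) -> list[str]:
    """curl command line equivalent to the warmup calls in this runbook,
    with an explicit JSON content type."""
    return ["curl", "-sk", url,
            "-H", "Content-Type: application/json",
            "-d", warmup_payload()]


print(" ".join(warmup_curl_args()[:3]))
```

Passing the result of `warmup_curl_args()` to `subprocess.run` from inside the sandbox sends the request; the first call may take a while because it loads the model.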
Managing Sandboxes
openshell sandbox list # list running sandboxes
openshell sandbox connect <sandbox-name> # re-enter existing sandbox
openshell sandbox logs <sandbox-name> # view sandbox logs
openshell sandbox delete <sandbox-name> # tear down (after policy changes, recreate)
Performance Notes
Performance depends on the model, hardware, and FHIR server responsiveness. LLM generation time typically dominates over FHIR query time. Larger cohorts take longer due to per-patient REST queries; production deployments should use Bulk FHIR for populations over 200.