🦞 Set Up Example NemoClaw Agents 🦞
Ready-to-run application examples for your NemoClaw sandbox — policy, prompt, and personalization for each workflow
Table of Contents
- Overview
- Daily Personal News Digest
- Software Development Agent
- Deck Reviewer
- Calendar Negotiator
- NemoClaw Policy Setup
- Troubleshooting
- General sandbox & policy issues
Overview
Basic idea
This playbook is a companion to the NemoClaw on DGX Spark install playbook. It walks through four ready-to-run applications you can stand up on top of an existing NemoClaw sandbox — a personal morning news digest, a software development agent, a doc and deck red-team, and a calendar negotiation chief-of-staff.
Each application is presented as a self-contained tab with the same three sections:
- Policy setup — the exact NemoClaw / OpenShell sandbox policy changes the workflow needs (channels, network egress, filesystem mounts).
- Agent prompt — the full canonical prompt you copy-paste into the NemoClaw web UI or send to your Telegram bot. It defines the agent's complete behavior end-to-end and is the only configuration the workflow needs.
- How to personalize — the knobs to turn (paths, schedule, audience, persona) to adapt the recipe to your real use case.
All applications run inside the OpenShell sandbox that NemoClaw created during onboarding, so the agent's filesystem, network, process, and inference access stays bounded by the policy you grant.
What you'll accomplish
You will run four practical NemoClaw workflows on your DGX Spark:
- Daily Personal News Digest — a scheduled morning briefing that wakes up on a cron, sweeps the topics you care about across an allowlisted set of sources, and posts a structured digest (Top 3, headlines by topic, deep dive, skip-the-noise, on-your-radar, local) to your Telegram home channel.
- Software Development Agent — reads a single project directory, builds an execution plan for the features you specify, implements them, reviews its own work, and writes a `develop-and-review.md` you can read before merging. No outbound network beyond the local inference endpoint.
- Deck Reviewer — a Doc & Deck Red-Team that scans the artifact you're about to send for inconsistent numbers, unsourced claims, missing data, accessibility issues, and prior-version contradictions, then returns a severity-ranked punch list with proposed edits.
- Calendar Negotiator — a scheduling chief-of-staff that turns "when can we meet?" threads into a confirmed meeting on your calendar, respecting your focus blocks, energy patterns, and time-zone fairness with the other party.
A separate NemoClaw Policy Setup tab covers the one-time Telegram channel wiring that two of the applications (News Digest and Calendar Negotiator) require and the other two (Software Development Agent and Deck Reviewer) can optionally use for "ready for review" notifications. The Troubleshooting tab collects symptom/cause/fix entries specific to these workflows.
For each application you will be able to read the live policy YAML (openshell policy get --full), apply or remove maintained presets with nemoclaw policy-add / policy-remove (no rebuild required for network changes), and bind host directories into the sandbox with nemoclaw share mount (hot — no rebuild required for mounts either). Tightening filesystem_policy itself, when you want a kernel-enforced write boundary inside the sandbox, is the only step that still requires nemoclaw rebuild (workspace state is preserved automatically).
What to know before starting
- You have completed the NemoClaw on DGX Spark playbook and have a working sandbox (the examples use `my-assistant`).
- Basic comfort with the Linux terminal and YAML files.
- Awareness of the agent risk surface — see the Important: security and risks section in the NemoClaw overview.
Prerequisites
Hardware and access:
- A DGX Spark (GB10) with a working NemoClaw install (see NemoClaw on DGX Spark).
- A running OpenShell gateway and a sandbox created by the NemoClaw onboard wizard (`nemoclaw list` shows at least one sandbox).
- A Telegram bot wired into the sandbox at onboard time for the Daily Personal News Digest and Calendar Negotiator applications. If you skipped Telegram during onboard, re-run the NemoClaw installer to recreate the sandbox with Telegram enabled. See NemoClaw Policy Setup for the one-time wiring steps.
Software:
- Ollama serving the model you selected during NemoClaw onboard (Nemotron 3 Super 120B in the install playbook).
- A working public webhook tunnel (`nemoclaw tunnel start`) for any Telegram-driven application.
Verify the sandbox is healthy before you start:
nemoclaw list
nemoclaw my-assistant status
Expected: your sandbox appears in the list and status reports the sandbox as Running with the inference provider pointing at your local Ollama model.
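The commands in the application tabs refer to the sandbox as `$SANDBOX_NAME`, while the health check above spells out `my-assistant`. A minimal sketch, assuming you want one shell variable to cover both; the fallback value is only the example name used in this playbook, so substitute your own:

```shell
# Reuse SANDBOX_NAME if it is already exported; otherwise fall back to the
# example name from this playbook (an assumption, not a NemoClaw default).
export SANDBOX_NAME="${SANDBOX_NAME:-my-assistant}"
echo "Using sandbox: $SANDBOX_NAME"
```

Run it once per terminal session before pasting any command that contains `$SANDBOX_NAME`.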
Have ready before you begin
| Item | Where to get it | Used by |
|---|---|---|
| Sandbox name from NemoClaw onboard (e.g. `my-assistant`) | `nemoclaw list` | All applications |
| Telegram bot token and numeric user ID | @BotFather (/newbot), @userinfobot on Telegram for your user ID | Policy Setup, News Digest, Calendar Negotiator; optional for Software Development Agent and Deck Reviewer |
| Allowlist of news source hostnames to add under `network_policies` | Pick the sites you trust | News Digest |
| A host directory containing the project you want built and reviewed | A copy/clone of the project, e.g. `~/nemoclaw-projects/my-app/` | Software Development Agent |
| A queue folder, a canonical corpus folder, and a `profile.yaml` for red-team rules | Curate from prior decks, brand guide, and canonical metric files, e.g. `~/nemoclaw-redteam/` | Deck Reviewer |
| A `calendar.ics` export and a `profile.yaml` with working hours, focus blocks, and timezone | Export from your real calendar (Google: Settings → Import & export) into `~/nemoclaw-calendar/` | Calendar Negotiator |
Ancillary files
All policy snippets and example prompts in this playbook are inline in the application tabs — there are no external assets to clone. The bundled sandbox policy is shipped with NemoClaw and OpenShell; the application tabs only modify it.
Time and risk
- Estimated time: 30–45 minutes to walk through all four applications. Each application individually takes 5–10 minutes once the prerequisites are in place. Plan an extra 10 minutes for the one-time Policy Setup tab if you have not enabled Telegram yet.
- Risk level: Medium. Every application grants the agent additional capability beyond the default sandbox — outbound network for the news digest, filesystem access for code review, deck red-team, and calendar negotiation. Risk is reduced by tight per-application policies (host-level `chmod` on read-only source data backed by `share mount`'s SSHFS permission passthrough, scoped sandbox directories so the agent only sees one mounted tree at a time, explicit egress allowlists via `nemoclaw policy-add` presets, and in-prompt safety rules that survive single-message overrides) but is not eliminated. Do not point these recipes at sensitive data, production accounts, or personal files without reviewing the policy first.
- Rollback: Each application tab includes a rollback section that either reverts the policy (network changes are hot-reloadable) or destroys and recreates the sandbox with the original policy. The Troubleshooting tab covers common stuck-state recovery. You can always run `nemoclaw uninstall` to remove everything.
- Last Updated: 06/01/2026
  - Sync up to latest nemoclaw/openshell policy APIs
Daily Personal News Digest
This is a cron-style workflow: the agent wakes up on a schedule, fetches updates from a small allowlist of URLs, summarizes them, and posts a digest to your Telegram home channel.
Step 1. Policy setup
Start from the NemoClaw Policy Setup tab's working Telegram channel (channel plugin + api.telegram.org egress). Then add network egress for the sources you want the agent to read by applying a small custom preset with nemoclaw policy-add --from-file. The preset is additive and hot-reloads — you do not need to dump or round-trip the full live policy.
Create news-sources.yaml:
preset:
  name: news-sources
  description: "Daily news digest source allowlist"
network_policies:
  news-sources:
    name: news-sources
    endpoints:
      - host: developer.nvidia.com
        port: 443
        access: full
        tls: skip
      - host: blogs.nvidia.com
        port: 443
        access: full
        tls: skip
      - host: news.ycombinator.com
        port: 443
        access: full
        tls: skip
    binaries:
      - { path: /usr/local/bin/openclaw }
      - { path: /usr/local/bin/node }
      - { path: /usr/bin/node }
      - { path: /usr/bin/curl }
network_policies is a map keyed by group name (here, news-sources); each group has its own name and an endpoints list. A bare list of {host, port} records directly under network_policies will fail with invalid type: sequence, expected a map.
Important
Both `preset.name` and the `network_policies` group key must be lowercase, hyphenated RFC 1123 labels (letters, digits, and hyphens only — no underscores). Using `news_sources` fails with `Preset must declare preset.name (lowercase, hyphenated RFC 1123 label)`. This matches the shipped presets (`brave`, `github`, `slack`), which all follow the same rule.
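The naming rule can be checked before you apply a preset. The sketch below follows the common RFC 1123 label convention (lowercase letters, digits, and hyphens; starts and ends with a letter or digit; at most 63 characters). The start/end and length limits are assumptions taken from that convention, since the exact check NemoClaw runs is not shown in this playbook, and `is_rfc1123_label` is a hypothetical helper name:

```shell
# is_rfc1123_label NAME: succeed if NAME is lowercase letters, digits, and
# hyphens, starts and ends with a letter or digit, and is at most 63 chars.
is_rfc1123_label() {
  [ "${#1}" -le 63 ] &&
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

for name in news-sources news_sources News-Sources; do
  if is_rfc1123_label "$name"; then
    echo "$name: ok"
  else
    echo "$name: rejected"
  fi
done
```

Only `news-sources` passes; the underscore and the capital letters are both rejected.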
Important
Each endpoint needs two things beyond `host`/`port`, or the egress proxy denies the connection with `curl: (56) CONNECT tunnel failed, response 403` even though the host shows up in the live policy:
- An access mode. The simplest for fetching web pages is a raw pass-through tunnel — `access: full` with `tls: skip` (the same shape the shipped `brew` presets use). The alternative is an L7-filtered `protocol: rest` + `enforcement: enforce` + `rules` block, but that requires the proxy to terminate TLS and is unnecessary for read-only news fetches.
- A `binaries` allow-list naming which programs may use this egress. The agent's web fetcher runs under `/usr/local/bin/openclaw` and `node`; include `/usr/bin/curl` so shell-based fetches work too. Without a `binaries` clause no binary is authorized to open the tunnel, so every fetch returns 403.

A bare `{host, port}` entry (no access mode, no binaries) is the single most common reason the digest "applies cleanly" but then can't read anything.
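Because a bare entry applies without complaint, a quick text check of the preset file can catch it before you run `policy-add`. This is a rough sketch, not a YAML parser: it only counts lines, covers the `access: full` shape (not the `protocol: rest` alternative), and `preset_looks_complete` is a hypothetical helper name. The scratch file is made-up sample data.

```shell
# preset_looks_complete FILE: succeed only if every "- host:" entry has a
# matching "access:" line and the file declares a "binaries:" list.
preset_looks_complete() {
  hosts=$(grep -c -E '^[[:space:]]*-[[:space:]]*host:' "$1" || true)
  modes=$(grep -c -E '^[[:space:]]*access:' "$1" || true)
  [ "${hosts:-0}" -gt 0 ] && [ "${hosts:-0}" -eq "${modes:-0}" ] &&
    grep -q -E '^[[:space:]]*binaries:' "$1"
}

# Scratch preset with a bare {host, port} entry (hypothetical sample data).
bare=$(mktemp)
printf 'network_policies:\n  news-sources:\n    endpoints:\n      - host: example.com\n        port: 443\n' > "$bare"
preset_looks_complete "$bare" || echo "bare entry found: expect 403 from the proxy"
```

Point the function at your own `news-sources.yaml` in place of the scratch file.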
Apply the preset (hot-reload, no sandbox restart):
nemoclaw $SANDBOX_NAME policy-add --from-file ./news-sources.yaml --yes
Confirm the new hosts are present:
openshell policy get $SANDBOX_NAME --full | grep -E "host:|port:"
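The `grep` above leaves you to eyeball the output. A small sketch that checks each expected hostname against a saved copy of the policy instead; it assumes the dump contains lines ending in `host: <name>` (as the `grep` pattern above implies), and `check_hosts` is a hypothetical helper name:

```shell
# check_hosts FILE HOST...: print each HOST that has no "host: HOST" line in
# FILE. Exit status is non-zero if any host is missing.
check_hosts() {
  file=$1
  shift
  missing=0
  for h in "$@"; do
    awk -v h="$h" 'NF >= 2 && $(NF-1) == "host:" && $NF == h { found = 1 }
                   END { exit !found }' "$file" ||
      { echo "missing: $h"; missing=1; }
  done
  return "$missing"
}

# Scratch file standing in for a saved policy dump (hypothetical sample data).
live=$(mktemp)
printf '      - host: developer.nvidia.com\n        port: 443\n      - host: blogs.nvidia.com\n        port: 443\n' > "$live"
check_hosts "$live" developer.nvidia.com blogs.nvidia.com news.ycombinator.com ||
  echo "policy is missing at least one source"
```

In real use, save the output of the `openshell policy get` command above to a file and pass that file as the first argument.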
Tip
Prefer `nemoclaw policy-add --from-file` over `openshell policy get --full > policy.yaml` followed by `openshell policy set`. The full-dump round trip in openshell `0.0.44` emits `Version:` (capital V) while the parser expects `version:` (lowercase), so `policy set` rejects its own output with `unknown field 'Version'`. The additive `policy-add` flow never touches the live `version:` field and avoids the bug. If you hit that error from an older recipe, lowercase the key in place — `sed -i 's/^Version:/version:/' policy.yaml` — and rerun `policy set`.
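To see what the `sed` fix does without touching a real policy, you can reproduce it on a scratch file. The file contents below are made-up sample data, not real `openshell` output, and `sed -i` is used in its GNU form, as in the tip above:

```shell
# Reproduce the capital-V key in a scratch file (hypothetical contents).
policy=$(mktemp)
printf 'Version: 1\nnetwork_policies: {}\n' > "$policy"

# The ^ anchor means only a key at the very start of a line is rewritten.
sed -i 's/^Version:/version:/' "$policy"

head -n 1 "$policy"   # prints: version: 1
```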
Step 2. Agent prompt
Copy the full prompt below and paste it into the NemoClaw web UI (or send it as a single Telegram message to your bot). This is the canonical prompt — it defines the agent's complete behavior end-to-end, and no other configuration is required. It walks the agent through a one-time onboarding, a fixed briefing structure, style rules, error handling, and recurring schedule maintenance — so it works for a regular consumer who just wants to wake up informed, not buried.
You are my personal news intelligence analyst. Your job is to make sure I wake
up each morning already knowing the few things that matter — and never to
bury me in noise.
ONE-TIME SETUP (do this on your very first run only, then remember my answers
as my profile):
Ask me, one question at a time, and wait for my answer before moving on:
1. What's on your news menu? Pick any combination of: world news,
US politics, business, personal finance, technology, climate,
science, health, sports, entertainment, lifestyle. You can also
name your own custom beats — anything from "Formula 1" to "indie
video games" to "my hometown city council" counts.
2. Who should I sound like when I write to you? Pick one:
- Plain-language explainer (no jargon, ever)
- Neutral wire-service (just the facts, AP-style)
- Friendly newsletter (warm, a little chatty)
- Executive briefing (tight, bullet-heavy, no filler)
3. How much time do you give me with your coffee? 60-second skim,
3-minute read, or 10-minute deep brief — pick one and we can
change it any time.
4. Any VIPs or villains? Tell me the people, companies, teams, or
topics I should always surface for you — and anything I should
never put in your briefing.
5. Where are you waking up? Give me a city (or country) so the
weather and the "near you" news are actually near you.
6. When's showtime? Default is 08:00 America/Los_Angeles every
weekday. Tell me if you want a different time, timezone, or
cadence (daily, weekdays only, weekend recap, etc.).
Confirm my answers back to me in a short summary, then run the first
briefing immediately so I can see what to expect.
DAILY BRIEFING STRUCTURE (use this exact shape every run, in this order):
1. Top 3 — the three stories I cannot miss today. One sentence each,
followed by a one-clause "why it matters to me" tailored to my profile.
2. Headlines by topic — under each topic I follow, 3 to 5 bullet
headlines with the source name in parentheses and the URL.
3. Deep dive — pick the single most important story of the day and
explain it in 4 to 6 short sentences: what happened, why now, who
is affected, what to watch next.
4. Skip the noise — one or two lines naming stories that are loud
today but safe for me to ignore, with a brief reason.
5. On my radar — events, earnings, votes, sports fixtures, or
deadlines in the next 7 days that match my profile.
6. Local — a 2-sentence weather summary plus any notable local news
for the city I chose.
STYLE RULES:
- Plain language; assume I am not an expert in any topic.
- No hype words ("shocking", "you won't believe", "breaking"). Just
the facts.
- Cite every claim with the source name and a working URL.
- Never invent quotes, numbers, dates, or events. If you cannot
verify a detail, omit it or label it clearly as "unconfirmed".
- Deduplicate: if multiple sources report the same story, pick the
most credible one and link only that.
- Respect my length preference. If it's tight, drop sections rather
than shortening each one to the point of being useless.
ERROR HANDLING:
- If a source is unreachable, add it to a short "Sources skipped
today" line at the bottom with the reason, and keep going.
- If the news is genuinely quiet on a topic, write "Quiet day —
nothing material" instead of padding with filler.
- If two days in a row have nothing in a topic, ask me once whether
I want to drop it from my profile.
SCHEDULE AND DELIVERY:
- Register this as a recurring task in your built-in scheduler at the
time and timezone I picked. Confirm the next 3 trigger times back
to me after onboarding.
- Deliver each briefing to my Telegram home channel.
- Skip US public holidays unless a major breaking story is unfolding.
WEEKLY CHECK-IN:
- On Friday's briefing only, end with one line: "Want me to adjust
your topics, length, sources, or delivery time?" If I reply, update
my profile and confirm the change.
Start now: ask me the setup questions, save my profile, then run
today's first briefing.
Expected: the agent confirms it has scheduled a task. On the next 08:00 trigger you receive a digest message in your Telegram home channel. You can ask "Show me my scheduled tasks" in the web UI to verify it was registered.
Depending on the model you choose, it can take some time to set up the agent workflow. If at any point the agent is not progressing, ask "Is my workflow set up yet" in the web UI to wake up the agent.
Note
Running without Telegram (web-UI delivery). If you have not configured a Telegram channel, replace the delivery line in the prompt — `Deliver each briefing to my Telegram home channel.` — with `Deliver each briefing to the web UI (this session). Do not use any messaging channel.` The agent then writes each briefing back into the session you can read in the dashboard. Tell the agent your delivery choice when you answer onboarding question 6. (See also the Delivery channel row in Step 3.)
Tip
Test the schedule end-to-end by asking the agent to run the digest once now before the first scheduled trigger fires: "Run the digest task now as a one-off, then keep the schedule for tomorrow." This one-off runs through the live agent and is the most reliable end-to-end check (it produces a real briefing immediately).
Important
Register the schedule from the operator side — don't rely on the agent's tool call. When the agent runs as an embedded `openclaw agent` turn (the headless path used here), its in-turn cron tool connects to the gateway with a device token that lacks the scheduler scope, so the registration is rejected with `scope upgrade pending approval … pairing required: device is asking for more scopes than currently approved`. The agent then reports it "has no built-in scheduler" or that the scheduler is "flapping." Register the recurring job yourself instead — this is verified to work:
nemoclaw $SANDBOX_NAME exec -- openclaw cron add \
  --name news-digest --cron "0 8 * * 1-5" --tz America/Los_Angeles \
  --agent default --session-key agent:default:news-digest \
  --message "Run my daily news briefing now and write it to this session." \
  --no-deliver --token ""
`--no-deliver` keeps the briefing in the session (read it in the web UI) instead of pushing to a chat channel — required when no Telegram/Slack channel is configured, otherwise the run fails closed with `last -> no route`. Confirm with `nemoclaw $SANDBOX_NAME exec -- openclaw cron list` and `... openclaw cron status`. (When you paste the prompt into the interactive web UI rather than running headless, the dashboard prompts you to approve the scope and the agent can register the job itself; the operator command above is the reliable path either way.)
Important
Scheduled triggers on a local model (vLLM). Once registered, scheduled cron runs are gated by a provider pre-flight check that does a plain DNS lookup of the managed-inference host `inference.local`. That host only resolves through the egress proxy (it has no real DNS or `/etc/hosts` record), so the pre-flight fails with `getaddrinfo EAI_AGAIN inference.local` and the run is logged as `skipped`. Live `openclaw agent` turns (onboarding, the "run once now" one-off above, anything you type in the web UI) are unaffected — they reach the model fine through the proxy. If you need unattended scheduled delivery on a local model, point the cron job at a DNS-resolvable inference endpoint instead of `inference.local` (the `local-inference` preset already allows the host's vLLM at `host.openshell.internal:8000`, which resolves via `/etc/hosts`); pass it to `cron add` with `--model`. Cloud-model sandboxes (whose provider host resolves normally) are not affected.
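You can approximate the pre-flight lookup yourself to see which hostname a scheduled run would be able to resolve. This sketch uses `getent hosts`, which consults the hosts file and DNS much as a plain resolver lookup does; it is an approximation, since the actual pre-flight code is not shown in this playbook. It assumes `getent` is present in the sandbox image, and `resolves` is a hypothetical helper name:

```shell
# resolves HOST: succeed if the system resolver (hosts file or DNS) knows HOST.
resolves() {
  getent hosts "$1" >/dev/null 2>&1
}

for host in inference.local host.openshell.internal; do
  if resolves "$host"; then
    echo "$host: resolves"
  else
    echo "$host: does not resolve"
  fi
done
```

Run it inside the sandbox (for example through `nemoclaw $SANDBOX_NAME exec -- bash -lc '...'`); running it on the host tells you nothing about the sandbox's resolver.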
Step 3. How to personalize
| Knob | Where | What to change |
|---|---|---|
| Schedule | `openclaw cron add` (operator command in Step 2) | Change the `--cron "0 8 * * 1-5"` expression and `--tz` in the registration command (`0 9 * * 1` = Mondays 09:00, `0 */6 * * *` = every 6 hours, etc.). Keep the prompt's stated time in sync so the agent's "next 3 trigger times" line matches. |
| Sources | `news-sources.yaml` and the prompt | Add the host as a new entry under `network_policies.news-sources.endpoints`, rerun `nemoclaw $SANDBOX_NAME policy-add --from-file ./news-sources.yaml --yes`, then list the URL in the prompt. The sandbox blocks any fetch to a host that is not in the allowlist. |
| Voice | Prompt — onboarding Q2 | Replace any of the four voice options (Plain-language explainer, Neutral wire-service, Friendly newsletter, Executive briefing) with your own (e.g., Calm dad voice, Skeptical analyst, Snarky finance bro). |
| Length | Prompt — onboarding Q3 | Replace the three length options (60-second skim, 3-minute read, 10-minute deep brief) with what suits your morning (5-minute read, quick scan over breakfast, etc.). |
| Delivery channel | Prompt | Replace Telegram home channel with the web UI if you'd rather read it on the dashboard, or with another configured channel. |
| Filtering | Prompt | Add Only include posts that mention "Spark" or "GB10". to focus the digest. |
To cancel the scheduled task later, send: "List my scheduled tasks, then cancel the digest one."
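The Schedule knob takes a five-field cron expression. A small sketch that labels each field, assuming the standard minute, hour, day-of-month, month, day-of-week order; the playbook's examples are consistent with that order, but the scheduler's own parser is not shown here. `describe_cron` is a hypothetical helper name:

```shell
# describe_cron EXPR: label the five fields of a cron expression.
# The body runs in a subshell so that disabling globbing stays local.
describe_cron() (
  set -f          # keep "*" literal while the expression is split
  set -- $1       # split on whitespace into $1..$5
  if [ "$#" -ne 5 ]; then
    echo "expected 5 fields, got $#" >&2
    exit 1        # exits the subshell, i.e. the function fails
  fi
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
)

describe_cron "0 8 * * 1-5"
# prints: minute=0 hour=8 day-of-month=* month=* day-of-week=1-5
```

Reading the default this way: minute 0 of hour 8, any day of the month, any month, days 1 through 5 of the week (Monday to Friday).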
Software Development Agent
The agent reads a single project directory, builds an execution plan for the features you specify, implements the features, reviews the implementation, and writes a develop-and-review.md back into the same directory. No outbound network beyond the local inference endpoint.
Warning
Read-write filesystem access lets the agent modify files in the mounted directory. Point it at a project copy or a clean clone, not your only working tree. Commit or back up before granting write access.
Step 1. Expose the project to the sandbox
Make a working copy of the project the agent will plan, build, and review against. Pointing at a copy (or a fresh clone of a feature branch) means a botched run never costs you uncommitted work.
mkdir -p ~/nemoclaw-projects
cp -r ~/projects/my-app ~/nemoclaw-projects/my-app
Now copy that working copy into the sandbox at /sandbox/project. The reliable, dependency-free way is to stream a tar over nemoclaw exec — it needs nothing installed on the host and works on every sandbox:
# Push the project into the sandbox
tar czf - -C ~/nemoclaw-projects/my-app . \
| nemoclaw $SANDBOX_NAME exec -- bash -lc 'mkdir -p /sandbox/project && tar xzf - -C /sandbox/project'
Confirm the project landed and that the sandbox cannot reach the public internet (the local inference endpoint stays available regardless — that's how the agent talks to the model):
nemoclaw $SANDBOX_NAME exec -- ls /sandbox/project # expect your project tree
nemoclaw $SANDBOX_NAME exec -- bash -lc 'curl -sS --max-time 5 https://example.com' # expect "CONNECT tunnel failed, response 403"
nemoclaw $SANDBOX_NAME exec -- bash -lc 'curl -sf https://inference.local/v1/models' # expect JSON model list
Expected: the ls shows your project tree, example.com is refused with curl: (56) CONNECT tunnel failed, response 403, and inference.local returns the model list. If example.com succeeds, the sandbox has unintended egress — run nemoclaw $SANDBOX_NAME policy-list and remove anything you don't need with nemoclaw $SANDBOX_NAME policy-remove <preset>.
After the agent finishes (Step 2), pull the results — including the report — back to your host copy the same way:
# Pull the project (with the agent's edits + develop-and-review.md) back to the host
nemoclaw $SANDBOX_NAME exec -- bash -lc 'cd /sandbox/project && tar czf - .' | tar xzf - -C ~/nemoclaw-projects/my-app
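If you want to see the `tar` pipe work before involving the sandbox, you can rehearse it between two scratch directories on the host. This is a local dry run only: the temporary directories stand in for the host copy and `/sandbox/project`, and no `nemoclaw` command is involved.

```shell
# Scratch directories stand in for the host copy and /sandbox/project.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/src"
echo 'print("hello")' > "$src/src/app.py"

# Same pipe shape as the push: pack relative to the source root, unpack
# relative to the destination root, so no absolute paths are embedded.
tar czf - -C "$src" . | tar xzf - -C "$dst"

# diff -r prints nothing and exits 0 when the two trees are identical.
diff -r "$src" "$dst" && echo "trees match"
```

The `-C <dir> .` form is what keeps the archive paths relative, which is why the same command works in both directions.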
Note
`nemoclaw share mount` is the opposite direction and is optional. `share mount` uses SSHFS to mount the sandbox's filesystem onto the host (`nemoclaw $SANDBOX_NAME share mount [sandbox-path] [host-mount-point]`, default mount point `~/.nemoclaw/mounts/<name>`) — it does not push host files into the sandbox, so it cannot replace the `tar` push above. It is only useful for live-editing sandbox files from your host editor, and it requires `sshfs` on the host:
sudo apt-get install -y sshfs   # needs root; or: sudo dnf install fuse-sshfs
nemoclaw $SANDBOX_NAME share mount /sandbox/project ~/nemoclaw-projects/my-app-live
If `sshfs` is not installed (`share mount` prints `sshfs is not installed`) and you cannot install it (no root), skip `share mount` entirely and use the `tar` push/pull above — they cover the whole workflow without it. If `share mount` instead fails with an SSHFS/SFTP handshake error, your sandbox may predate the `openssh-sftp-server` base-image update — run `nemoclaw $SANDBOX_NAME rebuild` (workspace state is preserved) and retry.
Step 2. Agent prompt
Copy the full prompt below and paste it into the NemoClaw web UI, the sandbox shell, or a single Telegram message to your bot. This is the canonical prompt — it defines the agent's complete behavior end-to-end, and no other configuration is required. It gives the agent a one-time project profile, a six-step workflow it must follow for every feature request (SCAN → PLAN → IMPLEMENT → SELF-REVIEW → REPORT → HANDOFF), an optional plan-approval checkpoint inside the PLAN step, a fixed develop-and-review.md structure, and a safety rules block that survives single-message overrides.
You are my senior software engineer. The project lives at /sandbox/project.
Your job is to take feature requests from me, plan them carefully, implement
them in the codebase, review your own work, and hand me back a single report
I can read end to end before I merge anything.
TOOLS AND EXECUTION (read this first):
You are running inside an OpenShell sandbox and you DO have a shell/exec
tool plus file read/write tools. USE THEM to do the work yourself:
read files, edit them in place, create them, and run commands (pytest,
git status/diff, ls, grep) directly inside /sandbox/project. Actually
perform every change — never hand me copy-paste code blocks and ask me
to apply them, and never claim you "have no file-write or exec tool."
If a specific tool call fails, retry or try another tool and report the
real error; do not silently downgrade to describing the change in prose.
Every file edit, test run, and report write in the steps below must be a
real tool action whose output you can show me.
ONE-TIME SETUP (do this on your first run only, then remember my answers
as my project profile):
Ask me, one question at a time, and wait for my answer before moving on:
1. What is this project for, in one sentence? (Helps you make sane
choices when a requirement is ambiguous.)
2. Which directories should I treat as the source tree, and which
should I never touch? Defaults to include: src/, lib/, app/,
tests/. Defaults to exclude: node_modules/, dist/, build/, .git/,
.venv/, target/.
3. Whose style should I match? Point me at a file in the repo
(CONTRIBUTING.md, .editorconfig, .eslintrc, ruff.toml, etc.) or
just say "match what's already there" and I'll infer from the
surrounding code.
4. Test policy: write tests for every change, only when I ask, or
never? (Default: every change.)
5. Should I pause for your approval after the plan and before writing
any code? (Default: yes — safer for first runs.)
6. Where should the final report live? Default is
/sandbox/project/develop-and-review.md (overwritten each run).
Pick a per-feature path like reports/<slug>.md if you want history.
Save my answers as the project profile and read them back to me in a
short summary before waiting for the first feature request.
FOR EVERY FEATURE REQUEST, FOLLOW THIS WORKFLOW IN ORDER:
1. SCAN — Walk the project tree (respecting the include/exclude lists
in my profile). Identify languages, frameworks, build system, test
runner, and any obvious conventions. Output a 5-line summary
before doing anything else.
2. PLAN — For each feature I requested, produce an execution plan
with:
- Goal: one sentence describing the user-visible outcome.
- Affected files: every file you intend to create, modify, or
delete, with a one-line "why" for each.
- Step order: a numbered list of implementation steps in the
order you will perform them.
- Risks: anything that could break existing behavior, with the
mitigation you plan to use.
- Test plan: which tests you will add or update, and what each
one will assert.
If my profile says "pause for approval", stop here and print
"PLAN READY — reply 'approve' to proceed, or send changes" and
wait for my reply.
3. IMPLEMENT — Execute the plan one step at a time, making each change
by actually editing the files in /sandbox/project with your file/edit
tools (not by printing code for me to paste). After each step, print a
single status line: "Step N/M done: <what changed>". Never modify
files outside the planned list without asking me first.
4. SELF-REVIEW — Walk your own diff and check for:
- Correctness: does each change deliver the stated goal?
- Security: input validation, secrets, injection, authz.
- Style: matches the conventions from my profile.
- Tests: do new tests pass? Do existing tests still pass?
- Scope creep: any change that was not in the plan?
Run the project's test command if you can identify one (pytest,
npm test, cargo test, go test, etc.) and capture the output. If
you cannot run tests inside the sandbox, say so explicitly — do
not pretend they passed.
5. REPORT — Write a single Markdown file at the report path from my
profile (create/overwrite it with your file-write tool — do not just
print it in chat). Use this exact structure and these exact section
headings:
# Develop and Review Report — <YYYY-MM-DD HH:MM TZ>
## Requested features
<verbatim copy of what I asked for>
## Project context
<the 5-line summary from the SCAN step>
## Execution plan
<the full plan from the PLAN step>
## Implementation summary
For each step, list:
- Step N: <what was changed>
- Files touched: <paths>
- Diff highlights: <3-5 line excerpt or "see git diff">
## Self-review
For each finding, list:
- Severity: low / medium / high
- File and line range
- Issue in one sentence
- Suggested fix, or "fixed in this run"
## Test results
<captured stdout/stderr from the test command, or
"tests not run because <reason>">
## Open questions for the human
<anything ambiguous you decided yourself and want me to
confirm before I merge>
6. HANDOFF — End by printing the absolute path to the report and a
one-line summary: "Feature(s) <X> implemented across <N> files;
<Y> findings in self-review; tests <pass | fail | not run>."
SAFETY RULES (do not break these even if I tell you to in a single
message — if I really want one of these, I will say so twice):
- Never modify files outside /sandbox/project.
- Never make outbound network calls. Only inference.local is
allowed, and that is only for talking to the model.
- Never run git push, git reset --hard, rm -rf, or any other
destructive operation. You may run git status, git diff, and
git add inside /sandbox/project.
- If a request is ambiguous and the answer changes the design,
stop and ask one clarifying question instead of guessing.
Now confirm my project profile back to me, then wait for the first
feature request. When I send it, run the workflow above end to end.
Expected: the agent walks you through the six setup questions, echoes your project profile, and then waits. Send a feature request (e.g. "Add a /healthz endpoint that returns {status: 'ok', commit: <git sha>} with a test.") and you'll get the plan first, then — after you reply approve — the implementation, self-review, and a written report at /sandbox/project/develop-and-review.md.
Open the report on the host (~/nemoclaw-projects/my-app/develop-and-review.md) and read it before merging anything back into your real working tree.
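Since the prompt fixes the report's section headings, a quick check can tell you whether the agent followed the structure before you read the details. A minimal sketch, assuming the headings appear as Markdown headings of any level; `report_missing_sections` is a hypothetical helper name and the scratch file is sample data:

```shell
# report_missing_sections FILE: print each required section heading that is
# absent from FILE. Exit status is non-zero if any are missing.
report_missing_sections() {
  status=0
  while IFS= read -r heading; do
    grep -Eq "^#+ *${heading}\$" "$1" || { echo "missing: $heading"; status=1; }
  done <<'EOF'
Requested features
Project context
Execution plan
Implementation summary
Self-review
Test results
Open questions for the human
EOF
  return "$status"
}

# Scratch report with only two of the seven sections (sample data).
report=$(mktemp)
printf '## Requested features\n## Test results\n' > "$report"
report_missing_sections "$report" || echo "report is incomplete"
```

In real use, pass `~/nemoclaw-projects/my-app/develop-and-review.md` (or the report path from your profile) in place of the scratch file.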
Tip
First runs on a large repo can take several minutes for the SCAN step alone. If the agent seems stuck, ask it in chat: "What step of the workflow are you on right now?" — that nudge often unblocks long-running plans.
Step 3. How to personalize
| Knob | Where | What to change |
|---|---|---|
| Project path | `nemoclaw share mount` arguments | `share unmount` first, then re-mount against a different host directory or sandbox path. No sandbox recreation needed — the mount is hot. |
| Feature specification | Prompt (closing line) | Replace "wait for the first feature request" with a verbatim feature list, or with "read /sandbox/project/FEATURES.md and treat each top-level heading as a separate feature request." — useful for batching. |
| Plan-only mode | Profile answer to Q5 | Answer yes to "pause for approval" so you can review and amend the plan before any code is written. Recommended for first runs and any high-risk change. |
| Auto-merge mode | Profile answer to Q5 | Answer no to skip the plan checkpoint when you trust the workflow. Higher risk — back up first. |
| Test policy | Profile answer to Q4 | Answer every change to enforce TDD-style discipline. Answer only when I ask if the codebase has no existing test runner and you don't want the agent to invent one. |
| Style conventions | Profile answer to Q3 | Point at a real CONTRIBUTING.md, .eslintrc, ruff.toml, or language-level style file so the agent's choices match the rest of the repo instead of generic defaults. |
| Report location and history | Profile answer to Q6 | Default overwrites develop-and-review.md each run. Switch to a per-feature path like reports/<feature-slug>.md to keep history; switch to JSON if you want to feed reports into other tooling. |
| Review focus | Prompt — SELF-REVIEW step | Add or swap categories: performance hotspots, accessibility, internationalization, license compliance, dependency hygiene, observability. |
| Scope limits | Prompt — SAFETY RULES | Add file/dir denylists (e.g. "Never touch migrations/, infra/, or any file ending in .lock.") for parts of the repo you want strictly off-limits. |
| Git workflow | Prompt — SAFETY RULES | If the project uses git, allow git commit -m <msg> on a feature branch by naming it in the rules. Keep git push blocked unless you really want remote pushes. |
| Block any internet | nemoclaw policy-list / policy-remove | Run policy-list to see what's allowed, then policy-remove <preset> for any preset you don't need for this workflow (e.g. telegram, github, pypi). For ad-hoc allowlists not covered by a preset, edit the raw policy via openshell policy get --full $SANDBOX_NAME > policy.yaml && $EDITOR policy.yaml && openshell policy set $SANDBOX_NAME --policy policy.yaml --wait. More restrictive policy = lower blast radius if the model goes off-script. |
| Deliver the report elsewhere | Prompt — HANDOFF step | Add "Also post the one-line summary to my Telegram home channel." (Requires the Telegram channel plugin and api.telegram.org egress from the news-digest recipe.) |
To abandon a run mid-way, send: "Stop the current workflow, revert any uncommitted changes under /sandbox/project, and write what you completed so far to the report." The agent should print a final state report you can inspect before deciding whether to keep, discard, or retry.
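A scope denylist like the one in the "Scope limits" row can also be checked on the host before you merge anything back. The sketch below is a hypothetical helper, not part of NemoClaw: the function name and the deny lists are illustrative, and it assumes you pass paths relative to the project root (for example from git diff --name-only).

```python
from pathlib import PurePosixPath

# Hypothetical denylist mirroring the example rule:
# "Never touch migrations/, infra/, or any file ending in .lock."
DENIED_DIRS = {"migrations", "infra"}
DENIED_SUFFIXES = (".lock",)


def is_off_limits(relative_path: str) -> bool:
    """Return True if a project-relative path falls under the denylist."""
    path = PurePosixPath(relative_path)
    # Any directory component named in DENIED_DIRS blocks the whole subtree.
    if any(part in DENIED_DIRS for part in path.parts[:-1]):
        return True
    # Otherwise only the file name's ending matters.
    return path.name.endswith(DENIED_SUFFIXES)


changed = [
    "src/app.py",
    "migrations/0002_add_col.sql",
    "poetry.lock",
    "infra/prod/main.tf",
]
violations = [p for p in changed if is_off_limits(p)]
print(violations)
```

If violations is non-empty, the agent touched something the SAFETY RULES were meant to keep off-limits, and the run is worth reviewing before you keep any of it.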
Deck Reviewer
Doc & Deck Red-Team Agent
Doc & Deck Red-Team — before you send or present, scans for inconsistent numbers across pages, unsourced claims, missing data, accessibility issues, and prior-version contradictions. Returns a fix list with proposed edits.
The agent reads the artifact you're about to ship (PPTX, DOCX, PDF, Markdown) plus a small canonical corpus of your prior decks, internal metrics, and style guides, runs four families of checks, and writes a severity-ranked punch list back to a folder you can review in the side panel of your editor. Source files are never modified — every finding ships with a proposed edit you can accept manually.
Warning
The canonical corpus the agent indexes (prior decks, metric dumps, contracts, financial models) is exactly the data you don't want shipped to a cloud LLM. Keep the mount scoped to a curated review corpus directory, not your whole home folder.
Step 1. Policy setup
This recipe optionally layers on top of the NemoClaw Policy Setup tab's working Telegram channel (channel plugin + api.telegram.org egress) so the agent can DM you when a review is ready. Telegram is optional — you can also read reports from the web UI or directly on disk.
Create the red-team working directory
On the host, set up four things the agent will see inside the sandbox:
- queue/: drop artifacts here for review (.pptx, .docx, .pdf, .md).
- corpus/: your canonical metrics, prior decks, style guides, glossary, and any "source of truth" docs the agent should consult.
- profile.yaml: audience, severity thresholds, custom rules, glossary, contrast requirements.
- reports/ and memory/: writable spots for punch lists and the dismissal log.
mkdir -p ~/nemoclaw-redteam/{queue,corpus,reports,memory}
Seed the corpus with whatever the agent should treat as ground truth — for example:
cp ~/decks/dgx-spark-roadmap.pptx ~/nemoclaw-redteam/corpus/
cp ~/notes/canonical-metrics.md ~/nemoclaw-redteam/corpus/
cp ~/style/brand-guide.md ~/nemoclaw-redteam/corpus/
Create a starter ~/nemoclaw-redteam/profile.yaml you can edit later:
audience: partner            # internal | partner | public
severity_threshold: HIGH     # CRITICAL only, HIGH+, MEDIUM+, all
wcag_level: AA               # A | AA | AAA
font_size_min_pt: 10
reading_grade_max: 11        # roughly 11th-grade Flesch-Kincaid
canonical_metrics:
  - {name: "live playbooks count", source: "corpus/canonical-metrics.md"}
  - {name: "supported categories", source: "corpus/canonical-metrics.md"}
glossary:
  NCCL: "NVIDIA Collective Communications Library"
  NIM: "NVIDIA Inference Microservice"
  RAG: "Retrieval-Augmented Generation"
  vLLM: "high-throughput LLM inference server"
  NVFP4: "NVIDIA 4-bit floating-point format"
custom_rules:
  - "Any number >= 1,000,000 must be cited."
  - "Product name 'NemoClaw' uses capital N and C; reject 'Nemoclaw'."
  - "First-use acronyms must be expanded or appear in glossary."
ignore_paths:
  - "queue/.archive/**"
  - "**/~$*"
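To see which queue files the ignore_paths patterns would exclude, you can try them out on the host. This is a minimal sketch under one loud assumption: the playbook does not specify the agent's glob semantics, so the code uses Python's fnmatch rules, where * also matches across "/" separators. The is_candidate helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the starter profile.yaml.
IGNORE_PATHS = ["queue/.archive/**", "**/~$*"]
# Artifact types the recipe says the agent reviews.
REVIEWABLE_EXTENSIONS = (".pptx", ".docx", ".pdf", ".md")


def is_candidate(relative_path: str) -> bool:
    """True if a path (relative to the red-team directory) should be reviewed."""
    if not relative_path.lower().endswith(REVIEWABLE_EXTENSIONS):
        return False
    # Assumption: fnmatch-style matching, where '*' also crosses '/' boundaries.
    return not any(fnmatchcase(relative_path, pattern) for pattern in IGNORE_PATHS)


for path in [
    "queue/spark-deck.pptx",
    "queue/.archive/old-deck.pptx",
    "queue/~$spark-deck.pptx",  # Office lock file left by an open editor
    "queue/notes.txt",
]:
    print(path, is_candidate(path))
```

The second pattern is what keeps Office lock files (the ~$ prefix) out of the review queue while a deck is still open in an editor.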
Bind the red-team directory into the sandbox
Copy the red-team directory into the sandbox at /sandbox/redteam. The reliable, dependency-free way is to stream a tar over nemoclaw exec — it needs nothing installed on the host and works on every sandbox:
# Push queue/, corpus/, profile.yaml, reports/, memory/ into the sandbox
tar czf - -C ~/nemoclaw-redteam . \
| nemoclaw $SANDBOX_NAME exec -- bash -lc 'mkdir -p /sandbox/redteam && tar xzf - -C /sandbox/redteam'
(Optional, strongly recommended) Make queue/, corpus/, and profile.yaml read-only and keep reports/ and memory/ writable. Run the chmod inside the sandbox: the files now live there, so a host-side chmod does not reach the sandbox copy. This denies the agent (which runs as the unprivileged sandbox user) write access to your source artifacts and ground-truth corpus:
nemoclaw $SANDBOX_NAME exec -- bash -lc 'chmod -R a-w /sandbox/redteam/queue /sandbox/redteam/corpus /sandbox/redteam/profile.yaml && chmod -R u+w /sandbox/redteam/reports /sandbox/redteam/memory'
Confirm the read paths list your files, the write paths really are writable, the read-only paths really are not, and that the sandbox has no outbound network (URL verification is opt-in, not default):
nemoclaw $SANDBOX_NAME exec -- ls /sandbox/redteam/queue # expect the artifacts you dropped in
nemoclaw $SANDBOX_NAME exec -- ls /sandbox/redteam/corpus # expect your corpus files
nemoclaw $SANDBOX_NAME exec -- bash -c 'echo test > /sandbox/redteam/reports/.write-check && rm /sandbox/redteam/reports/.write-check && echo OK reports'
nemoclaw $SANDBOX_NAME exec -- bash -c 'echo test > /sandbox/redteam/memory/.write-check && rm /sandbox/redteam/memory/.write-check && echo OK memory'
nemoclaw $SANDBOX_NAME exec -- bash -c 'touch /sandbox/redteam/queue/.write-check && rm -f /sandbox/redteam/queue/.write-check' # if you ran chmod above: expect "Permission denied"; the rm cleans up if the write unexpectedly succeeds
nemoclaw $SANDBOX_NAME exec -- bash -c 'curl -sS --max-time 5 https://example.com' # expect "CONNECT tunnel failed, response 403"
Expected: read paths list the files you dropped in, both write checks print OK …, the write into queue/ reports Permission denied (when you ran the chmod step), and example.com is refused with curl: (56) CONNECT tunnel failed, response 403. When the agent finishes (Step 2), pull the punch lists back to the host:
# Pull reports/ (and memory/) back to your host copy
nemoclaw $SANDBOX_NAME exec -- bash -lc 'cd /sandbox/redteam && tar czf - reports memory' | tar xzf - -C ~/nemoclaw-redteam
Note
Sandbox-chmod is a soft boundary; for a hard one, use filesystem_policy. Because the files live in the sandbox and are owned by the sandbox user, that same user could in principle chmod them back. The a-w above stops accidental writes and honors the agent's read-only intent, but it is not injection-proof. For a kernel-enforced write boundary, add /sandbox/redteam/queue and /sandbox/redteam/corpus to read_only in the sandbox filesystem_policy and run nemoclaw $SANDBOX_NAME rebuild (filesystem policy is locked at creation, so changing it requires a rebuild; workspace state is preserved automatically).
Note
nemoclaw share mount is the opposite direction and is optional. share mount uses SSHFS to mount the sandbox's filesystem onto the host (nemoclaw $SANDBOX_NAME share mount [sandbox-path] [host-mount-point]). It does not push host files into the sandbox, so it cannot replace the tar push above; it is only for live-editing sandbox files from a host editor. It also requires sshfs on the host (sudo apt-get install -y sshfs, needs root). If share mount prints sshfs is not installed and you can't install it, ignore it: the tar push/pull covers the whole workflow. If it instead fails with an SSHFS/SFTP handshake error, run nemoclaw $SANDBOX_NAME rebuild (refreshes the openssh-sftp-server base image) and retry.
Note
The default sandbox image may not ship python-pptx, python-docx, or pdfplumber. If you want richer artifact parsing than plain-text extraction, install them inside the sandbox once after creation:
nemoclaw $SANDBOX_NAME connect
pip install --user python-pptx python-docx pdfplumber markdown-it-py wcag-contrast-ratio
exit
The agent will use whatever is available and fall back to plain-text extraction (via unzip + xmllint for OOXML, pdftotext for PDF) when a parser is missing.
Step 2. Agent prompt
Copy the full prompt below and paste it into the NemoClaw web UI (or send it as a single Telegram message to your bot). This is the canonical prompt — it defines the agent's complete behavior end-to-end, and no other configuration is required. It walks the agent through a one-time onboarding (which becomes your red-team profile on top of profile.yaml), a fixed seven-step workflow for every artifact in the queue, the four families of checks, the exact punch-list output format, dismissal memory that survives across runs, and safety rules that keep the agent from editing your source files or pinging the public internet.
You are my doc and deck red-team. Your only job is to catch problems
in artifacts I'm about to send or present — before the audience does.
You never edit my source files. You propose fixes I can accept or
reject myself.
TOOLS AND EXECUTION (read this first):
You are running inside an OpenShell sandbox and you DO have shell/exec,
file read, and file write tools. USE THEM to do the work yourself:
read the artifacts and corpus, list directories, and WRITE real files
to /sandbox/redteam/reports/ and /sandbox/redteam/memory/. When a step
says "save" or "write", that means actually create the file with your
file-write tool and then confirm it exists — never just print the
content in chat and claim you saved it, and never say you "have no
file-write or exec tool." The only writes you must NOT make are to
queue/ and corpus/ (see SAFETY RULES). If a tool call fails, retry or
try another tool and report the real error.
CONTEXT YOU CAN READ:
- /sandbox/redteam/queue/ — artifacts I want reviewed
(.pptx, .docx, .pdf, .md). Treat every file here as a candidate
unless it matches profile.yaml ignore_paths.
- /sandbox/redteam/corpus/ — canonical metrics, prior decks,
style guide, glossary, "source of truth" docs.
- /sandbox/redteam/profile.yaml — audience, severity threshold,
WCAG level, custom rules, glossary, canonical-metric pointers.
CONTEXT YOU CAN WRITE:
- /sandbox/redteam/reports/ — your punch lists go here.
- /sandbox/redteam/memory/ — dismissals.jsonl and per-artifact
history so you don't re-flag rejected findings.
ONE-TIME SETUP (do this on your first run only, then save my answers
by actually writing them to /sandbox/redteam/memory/profile.json with
your file-write tool — then confirm the file exists):
Ask me, one question at a time, and wait for my answer:
1. Who's the primary audience for these artifacts? Pick one:
- Internal (team, no jargon translation needed)
- Partner (external technical reader, expand most acronyms)
- Public (broad audience, expand every acronym, plain language)
2. What severity threshold should land in my Telegram inbox?
Options: CRITICAL only, HIGH and above, MEDIUM and above, all.
3. How should I rank findings when there's a tie? Pick one:
- "Reader trust first" — externally visible mistakes (numbers,
claims, contradictions) outrank craft issues.
- "Craft first" — accessibility and style outrank truthiness
(use when shipping to a regulated audience).
- "By page order" — top-to-bottom, no ranking.
4. How should I handle dismissals? Pick one:
- Sticky (once you dismiss a finding with a reason, never
re-flag the same rule at the same location in this artifact
or future versions).
- Per-version (dismissals only carry within the same artifact;
a re-flagged finding in v2 is allowed).
- None (re-flag every run; I'll re-dismiss each time).
5. Where should the final punch list be delivered?
- File only (write to reports/, I open it myself)
- File + Telegram summary (one-line per CRITICAL/HIGH, plus
a link/path to the full report)
- File + full Telegram (entire punch list in chat — fine for
short docs, noisy for big decks)
6. CRITICAL findings — can I ever auto-dismiss them?
Answer must be NO. (This is a hard rule; I'm asking so you
remember it.) If I answer anything other than no, ask again.
Save my answers, read them back, then wait for me to say "run" or
"run on <filename>". When I do, run the workflow below.
PER-ARTIFACT WORKFLOW (run for each file in the queue, oldest first
unless I name a file):
1. INGEST — Identify the artifact type from the extension. Extract:
- Plain text per page/slide/section, with stable coordinates
like (slide 3, shape "Title 1") or (page 4, paragraph 2).
- Tables as rows + headers, preserving page/slide.
- Image metadata: alt-text, caption, decorative flag. OCR the
image if alt-text is missing AND profile.yaml.audience is
partner or public.
- Outline/TOC vs actual section order.
Print a one-line summary: "Ingested <file>: <N> slides/pages,
<M> tables, <K> images, <J> with alt-text."
2. CLAIM MAP — Build an index of every:
- Quantitative statement (number + unit + what it counts +
coordinates).
- Named entity (product, person, org, customer, partner).
- Citation (footnote, in-line URL, reference).
- Acronym first-use (and whether it's expanded or in glossary).
- Figure / table caption.
Save the map to memory/<artifact-stem>-claims.json so the next
run can diff against it.
3. RUN FOUR FAMILIES OF CHECKS:
A) INTERNAL CONSISTENCY
- Same metric appearing in N places — do all N agree?
- TOC and section count match reality?
- Acronyms expanded on first use OR present in profile glossary?
- Footnotes reference defined sources? No dangling [1], [2]?
- Slide numbers, headers, and footers consistent?
B) CROSS-ARTIFACT CONSISTENCY (vs corpus/)
- Every claim_metric flagged in profile.yaml.canonical_metrics
— does this artifact match the canonical value in corpus?
- Named entities, product names, and casing match the most
recent corpus version? (e.g. "NemoClaw" vs "Nemoclaw".)
- Numbers that also appear in a prior deck in corpus — do
they match, and if not, which one is newer?
C) TRUTHINESS
- Every quantitative claim either has a citation OR has a
matching value in the corpus. Flag orphans as "no source".
- Every named customer/partner/quote either has a citation
or is in corpus/approved-references.md. Flag orphans.
- Never invent a citation. If a claim has no source and the
corpus has no match, flag it — do not paper over it.
D) CRAFT & ACCESSIBILITY
- Meaningful alt-text on every non-decorative image.
Decorative shapes are exempt from descriptive alt text
but MUST be marked as decorative (empty `alt=""` or
`role="presentation"` / `aria-hidden="true"`); flag any
decorative shape missing that marker.
- WCAG contrast at the level in profile.yaml.wcag_level for all
text-over-fill. Report computed ratio + threshold + which
color pair fails.
- Font size >= profile.yaml.font_size_min_pt for all body text.
- Reading grade <= profile.yaml.reading_grade_max (Flesch-Kincaid
or similar). Flag sections that drift higher.
- Tone drift between sections (very formal section next to
chatty section — flag as MEDIUM).
- Custom rules from profile.yaml.custom_rules — run each.
4. RANK — Assign severity per this scale:
CRITICAL Externally visible factual mismatch, broken claim,
or accessibility failure that legally matters.
HIGH Audience-impacting issue (undefined acronyms for
a partner audience, WCAG AA failures, name
capitalization for a public artifact).
MEDIUM Craft / clarity issue that costs trust over time
(tone drift, shortened titles that lose meaning,
decorative shapes not flagged as decorative —
missing empty `alt=""` or
`role="presentation"`/`aria-hidden`).
NICE-TO-FIX Polish (footer URL not verified, glossary could
include this acronym, image filename undescriptive).
Apply the tie-break rule from my profile (Q3) inside each
severity bucket.
5. APPLY DISMISSAL MEMORY — Read
/sandbox/redteam/memory/dismissals.jsonl. Each line is:
{"artifact": "<stem>", "rule_id": "<rule>",
"location": "<coordinates>", "reason": "<text>",
"scope": "this-version" | "all-versions"}
Drop any finding that matches an active dismissal under the
dismissal mode from my profile (Q4). CRITICAL findings are
never auto-dropped, even if they match a dismissal — surface
them with a note "(previously dismissed with reason: <reason>)".
6. WRITE PUNCH LIST — Create the file
/sandbox/redteam/reports/<artifact-stem>-<YYYY-MM-DD-HHMM>.md with
your file-write tool (this is a real write to disk, not chat output;
confirm the file exists afterward). Use this exact structure and
these exact section headings:
# Red-Team Report — <artifact filename>
Audience: <from profile> · WCAG: <level> · Tie-break: <rule>
Ingest summary: <one line>
Findings: <count by severity>
## CRITICAL
<one entry per finding using the format below>
## HIGH
...
## MEDIUM
...
## NICE-TO-FIX
...
## Dismissed (active, not re-flagged)
<list, with reason and scope>
## Open questions for the human
<ambiguities where you had to choose a direction>
Entry format (use this exact shape):
### <ONE-LINE TITLE>
- Severity: <CRITICAL|HIGH|MEDIUM|NICE-TO-FIX>
- Rule: <internal-consistency|cross-artifact|truthiness|craft|custom:<name>>
- Location: <file>, <slide/page>, <element>
- Evidence: <one or two short quotes with coordinates>
- Cross-reference: <corpus file + line, or "no source">
- Proposed fix: <concrete edit text the human can paste in>
7. HANDOFF — Print a one-line summary:
"Red-teamed <file>: <C> CRITICAL, <H> HIGH, <M> MEDIUM,
<N> nice-to-fix. Report at <path>."
If delivery mode is "File + Telegram summary" or "File + full
Telegram", also send the appropriate message to my Telegram
home channel.
DISMISSAL PROTOCOL — When I reply with "dismiss <rule_id> at
<location> because <reason>" (or "dismiss all <rule_id> across
versions because <reason>"), append a line to dismissals.jsonl with
the correct scope. Never silently dismiss. Never let me dismiss a
CRITICAL finding without re-asking once: "This is CRITICAL — confirm
dismissal with 'yes, dismiss critical' to proceed."
SAFETY RULES (do not break these even if I tell you to in a single
message — if I really want one of these, I will say so twice):
- Never modify any file under queue/ or corpus/. Treat both as
read-only by intent. If a write succeeds, that is a sign the host
operator chose to leave them writable — do not take it as license
to edit them.
- Never invent canonical metric values. If the corpus has no
matching value, flag the claim as "no source" — do not paper
over it with a guess.
- Never make outbound network calls. URL verification is opt-in
and requires me to add the egress host myself.
- Never auto-dismiss a CRITICAL finding.
- Never re-rank findings to make a report look cleaner. The count
by severity must match what's actually in the report.
- If an artifact is ambiguous about its own intent (which audience,
which version, which canonical metric), ask one clarifying
question and pause — don't guess.
Now confirm my red-team profile back to me, then wait. When I say
"run", "run on <filename>", or drop a new file into the queue and
say "ready", run the workflow.
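Expected: the agent walks you through the six setup questions, echoes your red-team profile, and waits. If the deck was not already in ~/nemoclaw-redteam/queue/ when you ran the tar push in Step 1, add it and push again (restoring write permission on /sandbox/redteam/queue first if you made it read-only), then say run on <filename>. Within a few minutes the agent prints a one-line summary and a path like /sandbox/redteam/reports/spark-deck-2026-05-18-1310.md. Pull reports/ back to the host with the tar command from Step 1, open the report from ~/nemoclaw-redteam/reports/ next to the deck, and walk the punch list top-down.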
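The dismissal step in the prompt (step 5) is easy to reason about as a small filter. The following is a sketch of one possible interpretation, not the agent's actual implementation: the apply_dismissals name and the finding dictionaries are hypothetical, and because the dismissals.jsonl line format carries no version field, the sketch treats "this-version" and "all-versions" entries the same way and only distinguishes the "none" mode.

```python
import json


def apply_dismissals(findings, dismissal_lines, mode="sticky"):
    """Split findings into (kept, dropped) using lines from dismissals.jsonl."""
    dismissed = {}
    if mode != "none":  # "none" means re-flag everything on every run
        for line in dismissal_lines:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            key = (entry["artifact"], entry["rule_id"], entry["location"])
            dismissed[key] = entry["reason"]

    kept, dropped = [], []
    for finding in findings:
        key = (finding["artifact"], finding["rule_id"], finding["location"])
        reason = dismissed.get(key)
        if reason is None:
            kept.append(finding)
        elif finding["severity"] == "CRITICAL":
            # CRITICAL findings are never auto-dropped; surface them with a note.
            note = f"(previously dismissed with reason: {reason})"
            kept.append({**finding, "note": note})
        else:
            dropped.append(finding)
    return kept, dropped


findings = [
    {"artifact": "spark-deck", "rule_id": "cross-artifact",
     "location": "slide 1, Title 1", "severity": "CRITICAL"},
    {"artifact": "spark-deck", "rule_id": "craft",
     "location": "slide 7, body", "severity": "MEDIUM"},
    {"artifact": "spark-deck", "rule_id": "truthiness",
     "location": "slide 4, paragraph 2", "severity": "HIGH"},
]
dismissal_lines = [
    '{"artifact": "spark-deck", "rule_id": "cross-artifact", '
    '"location": "slide 1, Title 1", "reason": "metric update pending", '
    '"scope": "this-version"}',
    '{"artifact": "spark-deck", "rule_id": "craft", '
    '"location": "slide 7, body", "reason": "intentional tone", '
    '"scope": "all-versions"}',
]
kept, dropped = apply_dismissals(findings, dismissal_lines)
print([f["rule_id"] for f in kept], [f["rule_id"] for f in dropped])
```

In this example the MEDIUM craft finding is dropped, while the dismissed CRITICAL finding stays in the report with its "previously dismissed" note, matching the hard rule in the prompt.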
A real run on the kind of deck you'd hand to a partner typically surfaces things like:
### Number mismatch with prior comms
- Severity: CRITICAL
- Rule: cross-artifact
- Location: spark-deck.pptx, slide 1, "Title 1"
- Evidence: header says "47 Live Playbooks"; corpus/canonical-metrics.md
line 12 has "live_playbooks_count: 42"; corpus/dgx-spark-roadmap.pptx
slide 1 uses "42".
- Cross-reference: corpus/canonical-metrics.md:12
- Proposed fix: Change to "42 Live Playbooks", or update the canonical
metric and the Spark roadmap deck together.
### Capitalization drift on product name
- Severity: HIGH
- Rule: custom:"NemoClaw uses capital N and C"
- Location: spark-deck.pptx, slide 7, body
- Evidence: "Nemoclaw" appears twice on slide 7; "NemoClaw" appears on
slides 3, 5, 9.
- Cross-reference: corpus/brand-guide.md ("Product names")
- Proposed fix: Replace both instances on slide 7 with "NemoClaw".
### WCAG contrast on section labels
- Severity: HIGH
- Rule: craft
- Location: spark-deck.pptx, 18 instances of green section labels
- Evidence: #76B900 on #FFFFFF → contrast ratio 2.4 : 1, fails AA Normal
(threshold 4.5 : 1).
- Cross-reference: profile.yaml.wcag_level = AA
- Proposed fix: #5A8E00 (~4.0 : 1) still fails AA Normal, so darken further
until contrast clears 4.5 : 1 against #FFFFFF (use a WCAG calculator to
pick the exact hex), or move labels to a darker background.
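The contrast numbers in a finding like this can be checked by hand. The sketch below implements the WCAG 2.x relative-luminance and contrast-ratio formulas directly in standard-library Python. The function names are illustrative, and this is not necessarily how the agent or the wcag-contrast-ratio package computes the value.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color given as '#RRGGBB'."""
    digits = hex_color.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearize each sRGB channel (threshold as written in WCAG 2.x).
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05), from 1.0 to 21.0."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # WCAG AA threshold for normal-size text

for color in ("#76B900", "#5A8E00"):
    ratio = contrast_ratio(color, "#FFFFFF")
    verdict = "passes" if ratio >= AA_NORMAL_TEXT else "fails"
    print(f"{color} on #FFFFFF: {ratio:.2f} : 1, {verdict} AA Normal")
```

Run against the example, #76B900 on white comes out at roughly 2.4 : 1, and #5A8E00 still falls short of 4.5 : 1, which is why the proposed fix asks for a darker green or a darker background.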
Tip
Run the red-team before you think the artifact is done. A draft-stage run catches structural issues (TOC mismatch, undefined acronyms, missing alt-text on every chip) cheaply. A "final" run should be quick — if it isn't, you shipped too late.
Step 3. How to personalize
| Knob | Where | What to change |
|---|---|---|
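| Artifact queue path | ~/nemoclaw-redteam/queue/ on the host, plus the tar push from Step 1 | Drop new files into ~/nemoclaw-redteam/queue/ on the host, then re-run the tar push so they appear at /sandbox/redteam/queue/. share mount goes the other way (sandbox to host), so it cannot deliver files into the queue. To lock the agent out of writes there, run the chmod -R a-w step inside the sandbox after each push, and restore write access with chmod -R u+w /sandbox/redteam/queue before the next one; a host-side chmod does not reach the sandbox copy. |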
| Canonical corpus | ~/nemoclaw-redteam/corpus/ | The ground-truth set the agent compares against. Curate it: every file here becomes "what we know to be true". Stale corpus = stale flags. |
| Audience profile | Profile Q1 (or edit profile.yaml.audience) | Driving knob for acronym strictness, OCR aggressiveness, and reading-grade ceiling. Default to the strictest audience you ship to. |
| Severity threshold for notification | Profile Q2 | Default to HIGH+. Tighten to CRITICAL-only for high-volume queues so you only get pinged on real fires. |
| Tie-break rule | Profile Q3 | "Reader trust first" for sales/partner decks. "Craft first" for regulated audiences. "By page order" for quick first-pass cleanup. |
| Custom rules | profile.yaml.custom_rules | Add one-line rules in plain English. The agent treats each as a rule with id custom:<text>. Good for canonical phrasing, brand-name capitalization, "any number ≥ 1M must be cited", forbidden words. |
| Glossary | profile.yaml.glossary | Acronyms here are treated as "defined", so the agent won't flag them as undefined first-use. Add the acronyms your audience knows, leave out the ones they don't. |
| Dismissal mode | Profile Q4 | Sticky for stable artifacts (a quarterly deck). Per-version when you actively iterate. None for first-time reviews of an audience you don't know yet. |
| Delivery channel | Profile Q5 | File only for solo reviews. File + Telegram summary once you trust the agent's calibration. File + full Telegram only for short docs (<10 findings). |
| WCAG level and font minimums | profile.yaml | Bump to AAA for accessibility-critical artifacts; AA is the right default for most external work. Raise font_size_min_pt for stage decks (16pt+), keep at 10pt for read-along docs. |
| Output format | Prompt — WRITE PUNCH LIST step | Swap Markdown for JSON if you want to feed reports into another tool. Add a CSV summary alongside the MD for spreadsheet triage. |
| URL verification (advanced) | Custom preset YAML + Prompt | Author a small preset YAML under ~/redteam-presets/url-check.yaml with network_policies entries for the specific hosts (e.g. build.nvidia.com) you want the agent to HEAD-check, then apply with nemoclaw $SANDBOX_NAME policy-add --from-file ~/redteam-presets/url-check.yaml --yes. Remove later with nemoclaw $SANDBOX_NAME policy-remove <preset-name> --yes. Higher risk — every added host expands the egress surface. Keep the list small. |
| Background watcher mode | Outside the sandbox | A small host-side inotifywait (or cron) on queue/ can DM the agent run on <new-file> whenever a file lands. Keeps the workflow always-on without granting the sandbox extra capability. |
| Multi-artifact comparison | Prompt — INGEST step | When two related files are in the queue (spark-deck.pptx + dgx-spark-roadmap.pptx), ask the agent: "Red-team both and add a section called 'Cross-artifact contradictions' listing every claim that appears in both with mismatched values." |
| Dismissal audit | ~/nemoclaw-redteam/memory/dismissals.jsonl | Open this file periodically. If a rule is dismissed everywhere, it's probably the wrong rule, so delete it from profile.yaml.custom_rules and the agent stops generating noise. |
| Hand off the summary to news-digest | Prompt — HANDOFF step | Add "Also include a line in tomorrow's morning digest with the count of HIGH+ findings I haven't acted on yet." (Requires the news-digest recipe.) |
To dismiss a finding, reply: dismiss <rule_id> at <location> because <reason> (or dismiss all <rule_id> across versions because <reason> for a sticky cross-artifact dismissal). The agent appends to memory/dismissals.jsonl and confirms.
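The two reply forms map naturally onto the dismissals.jsonl line format from the prompt. Here is a minimal parsing sketch under stated assumptions: the parse_dismissal helper is hypothetical, the single-location form is assumed to map to scope "this-version" and the "all ... across versions" form to "all-versions" with a "*" wildcard location, and rule ids containing the word "at" would need a stricter grammar.

```python
import json
import re

ALL_VERSIONS = re.compile(
    r"^dismiss\s+all\s+(?P<rule_id>.+?)\s+across\s+versions\s+because\s+(?P<reason>.+)$",
    re.IGNORECASE,
)
SINGLE = re.compile(
    r"^dismiss\s+(?P<rule_id>.+?)\s+at\s+(?P<location>.+?)\s+because\s+(?P<reason>.+)$",
    re.IGNORECASE,
)


def parse_dismissal(message: str, artifact: str):
    """Turn a dismissal reply into a dismissals.jsonl record, or None."""
    text = message.strip()
    match = ALL_VERSIONS.match(text)
    if match:
        return {
            "artifact": artifact,
            "rule_id": match["rule_id"],
            "location": "*",  # assumption: wildcard for every location
            "reason": match["reason"],
            "scope": "all-versions",
        }
    match = SINGLE.match(text)
    if match:
        return {
            "artifact": artifact,
            "rule_id": match["rule_id"],
            "location": match["location"],
            "reason": match["reason"],
            "scope": "this-version",  # assumption: single form is per-version
        }
    return None  # not a dismissal command


record = parse_dismissal(
    "dismiss craft at slide 7, body because brand color is mandated",
    artifact="spark-deck",
)
print(json.dumps(record))  # one line, ready to append to dismissals.jsonl
```

Returning None for anything that does not match keeps the "never silently dismiss" rule intact: an unrecognized message is simply not treated as a dismissal.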
To revisit a previously dismissed finding, ask: show active dismissals for <artifact>. Open memory/dismissals.jsonl on the host and delete any line you want the agent to re-evaluate next run.
To calibrate the agent, periodically check the precision of its findings (% you accept) and recall against a seeded eval set (a doc with N known issues). The agent is doing its job when precision > 70% and recall > 90% on the eval set. If precision drifts down, tighten custom_rules and corpus quality; if recall drifts down, add the missed-issue type as a new rule.
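The calibration check above reduces to two ratios. A small sketch, assuming you track findings by some stable id of your own (the ids and the calibration helper below are made up for illustration):

```python
def calibration(reported: set, accepted: set, seeded: set):
    """Return (precision, recall) for one run on a seeded eval document.

    reported: ids of findings the agent raised
    accepted: ids of reported findings you agreed with
    seeded:   ids of the known issues planted in the eval document
    """
    precision = len(reported & accepted) / len(reported) if reported else 0.0
    recall = len(reported & seeded) / len(seeded) if seeded else 0.0
    return precision, recall


# Hypothetical eval document with 10 planted issues.
seeded = {f"s{i}" for i in range(1, 11)}
# The agent found 9 of them and raised 2 findings you rejected.
reported = (seeded - {"s10"}) | {"x1", "x2"}
accepted = reported & seeded

precision, recall = calibration(reported, accepted, seeded)
healthy = precision > 0.70 and recall > 0.90
print(f"precision={precision:.2f} recall={recall:.2f} healthy={healthy}")
```

Here precision is 9 of 11 and recall is 9 of 10, so the run clears the precision bar but not the recall bar (the target is strictly above 90%), which points at adding the missed issue type as a new rule.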
Calendar Negotiator
Calendar Negotiation Agent
Calendar Negotiation — handles "when can we meet?" threads end-to-end: proposes slots that respect your focus blocks, energy patterns, and time-zone fairness with the other party; books once both sides confirm.
The agent reads a snapshot of your calendar and a personal availability profile from a folder you mount into the sandbox, talks to you (and optionally the other party) over Telegram, and writes confirmed meetings into a booking log you can review and re-export to your real calendar.
Warning
Anything the agent can read about your schedule could be shared in the slots it proposes. Mount only the calendar window the agent needs (e.g. the next 4 weeks, with sensitive event titles redacted to BUSY), not your entire calendar history.
Step 1. Policy setup
Telegram is optional. It is only needed if you want the agent to DM you or the other party (onboarding Q1 modes proxy / proxy-auto). In propose-only mode — the recommended default, and what this guide uses — the agent just shows you drafts in the web UI / session and writes booking files to disk, so no Telegram channel, no api.telegram.org egress, and no public tunnel are required. You can run the entire workflow Telegram-free.
If you do want Telegram relay, layer this recipe on top of the NemoClaw Policy Setup tab's working Telegram channel first and confirm it is registered:
nemoclaw $SANDBOX_NAME status | grep -i telegram # only needed for proxy / proxy-auto modes
A line showing the Telegram channel means it is wired in. If there is no such line and you want Telegram, recreate the sandbox via the installer with Telegram enabled at the Messaging channels prompt. Otherwise, ignore this and continue in propose-only mode.
Create the calendar working directory
On the host, set up three things the agent will see inside the sandbox:
- calendar.ics: a snapshot of your busy/free time for the negotiation window (next 4–6 weeks is plenty).
- profile.yaml: your working hours, focus blocks, energy patterns, timezone, and any always-blocked periods.
- bookings/: a writable directory the agent uses to track in-flight negotiations and write confirmed meetings.
mkdir -p ~/nemoclaw-calendar/bookings
Export your calendar to ICS — for example, in Google Calendar use Settings → Import & export → Export and copy just the relevant calendar into ~/nemoclaw-calendar/calendar.ics. Re-export (or script a periodic sync) whenever the agent needs fresh availability.
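Since the warning above recommends redacting sensitive event titles to BUSY, here is one way to do that on the host before the file goes into the sandbox. It is a minimal sketch, not a full iCalendar parser: the redact_ics helper and the list of dropped properties are assumptions, it unfolds continuation lines (which RFC 5545 marks with a leading space or tab), and it writes the result unfolded, which most readers accept.

```python
# Properties that can leak who, where, or why; dropped entirely (assumed list).
DROP_PROPERTIES = {"DESCRIPTION", "LOCATION", "ATTENDEE", "ORGANIZER"}


def redact_ics(text: str) -> str:
    """Replace every SUMMARY with BUSY and drop detail-bearing properties."""
    # Unfold: a line starting with space/tab continues the previous line.
    unfolded = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += raw[1:]
        else:
            unfolded.append(raw)

    kept = []
    for line in unfolded:
        # Property name is everything before the first ':' or ';' (parameters).
        name = line.split(":", 1)[0].split(";", 1)[0].upper()
        if name == "SUMMARY":
            kept.append("SUMMARY:BUSY")
        elif name not in DROP_PROPERTIES:
            kept.append(line)
    return "\r\n".join(kept) + "\r\n"


sample = (
    "BEGIN:VCALENDAR\r\n"
    "BEGIN:VEVENT\r\n"
    "DTSTART:20260601T160000Z\r\n"
    "DTEND:20260601T170000Z\r\n"
    "SUMMARY;LANGUAGE=en:Board prep with Acme\r\n"
    "DESCRIPTION:Confidential agenda that is long enough to be\r\n"
    " folded onto a second line\r\n"
    "LOCATION:HQ room 4\r\n"
    "END:VEVENT\r\n"
    "END:VCALENDAR\r\n"
)
redacted = redact_ics(sample)
print(redacted)
```

The start and end times survive, which is all the agent needs to compute availability, while titles, descriptions, and locations never reach the sandbox.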
Create a starter ~/nemoclaw-calendar/profile.yaml you can edit later:
timezone: America/Los_Angeles
working_hours:
  mon: ["09:00", "17:30"]
  tue: ["09:00", "17:30"]
  wed: ["09:00", "17:30"]
  thu: ["09:00", "17:30"]
  fri: ["09:00", "15:00"]
focus_blocks:
  - {day: mon, start: "09:00", end: "11:30", label: "deep work"}
  - {day: wed, start: "09:00", end: "11:30", label: "deep work"}
energy_patterns:
  high_energy: ["09:00-12:00"]
  low_energy: ["14:00-15:30"]
defaults:
  meeting_duration_minutes: 30
  buffer_minutes: 10
  max_meetings_per_day: 5
blackout_periods:
  - {start: "2026-06-20", end: "2026-06-28", reason: "vacation"}
preferences:
  prefer_back_to_back: false
  no_meetings_after: "16:00"
  fairness_rule: "split discomfort — alternate who takes the off-hours slot when timezones don't overlap nicely"
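To get a feel for how these profile fields constrain the slots the agent can offer, the sketch below computes candidate start times for one day. It is an illustration under explicit assumptions, not the agent's algorithm: the profile is hard-coded as a Python dict with an assumed nesting, no_meetings_after is read as "meetings must end by this time", buffer_minutes pads existing calendar events on both sides, and energy patterns, blackout periods, the daily meeting cap, and timezones are left out.

```python
PROFILE = {
    "working_hours": {
        "mon": ("09:00", "17:30"), "tue": ("09:00", "17:30"),
        "wed": ("09:00", "17:30"), "thu": ("09:00", "17:30"),
        "fri": ("09:00", "15:00"),
    },
    "focus_blocks": [
        {"day": "mon", "start": "09:00", "end": "11:30"},
        {"day": "wed", "start": "09:00", "end": "11:30"},
    ],
    "defaults": {"meeting_duration_minutes": 30, "buffer_minutes": 10},
    "preferences": {"no_meetings_after": "16:00"},
}


def to_minutes(hhmm: str) -> int:
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(minutes: int) -> str:
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def candidate_slots(profile, day, busy):
    """Start times on `day` that avoid focus blocks and buffered busy events.

    busy: list of ("HH:MM", "HH:MM") pairs taken from the calendar snapshot.
    """
    if day not in profile["working_hours"]:
        return []
    start, end = (to_minutes(t) for t in profile["working_hours"][day])
    # Assumption: a meeting must END by no_meetings_after.
    end = min(end, to_minutes(profile["preferences"]["no_meetings_after"]))
    duration = profile["defaults"]["meeting_duration_minutes"]
    buffer = profile["defaults"]["buffer_minutes"]

    blocked = [
        (to_minutes(block["start"]), to_minutes(block["end"]))
        for block in profile["focus_blocks"] if block["day"] == day
    ]
    # Pad each existing event with the buffer on both sides.
    blocked += [(to_minutes(s) - buffer, to_minutes(e) + buffer) for s, e in busy]

    slots = []
    t = start
    while t + duration <= end:
        if all(t + duration <= b_start or t >= b_end for b_start, b_end in blocked):
            slots.append(to_hhmm(t))
        t += duration  # step by one meeting length
    return slots


monday = candidate_slots(PROFILE, "mon", busy=[("13:00", "14:00")])
print(monday)
```

With a single 13:00 to 14:00 event on a Monday, the morning focus block and the buffered event carve the day down to a handful of late-morning and mid-afternoon starts, which is the kind of shortlist the agent would then weigh against the other party's timezone.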
Bind the calendar directory into the sandbox
Copy the calendar directory into the sandbox at /sandbox/calendar. The reliable, dependency-free way is to stream a tar over nemoclaw exec — it needs nothing installed on the host and works on every sandbox:
# Push calendar.ics, profile.yaml, and bookings/ into the sandbox
tar czf - -C ~/nemoclaw-calendar . \
| nemoclaw $SANDBOX_NAME exec -- bash -lc 'mkdir -p /sandbox/calendar && tar xzf - -C /sandbox/calendar'
(Optional, strongly recommended) Make calendar.ics and profile.yaml read-only and keep bookings/ writable — run the chmod inside the sandbox (the files now live there, so a host-side chmod would not reach them). The agent runs as the unprivileged sandbox user, so this denies it any overwrite of your source-of-truth calendar:
nemoclaw $SANDBOX_NAME exec -- bash -lc 'chmod a-w /sandbox/calendar/calendar.ics /sandbox/calendar/profile.yaml && chmod -R u+w /sandbox/calendar/bookings'
Confirm the files landed, the write boundary holds, and the sandbox has no outbound network:
nemoclaw $SANDBOX_NAME exec -- ls /sandbox/calendar # expect calendar.ics, profile.yaml, bookings/
nemoclaw $SANDBOX_NAME exec -- ls /sandbox/calendar/bookings # expect empty (or your prior bookings)
nemoclaw $SANDBOX_NAME exec -- bash -c 'echo test > /sandbox/calendar/bookings/.write-check && rm /sandbox/calendar/bookings/.write-check && echo OK bookings'
nemoclaw $SANDBOX_NAME exec -- bash -c ': >> /sandbox/calendar/calendar.ics' # opens for append and writes nothing, so the file is never clobbered; if you ran chmod above: expect "Permission denied"
nemoclaw $SANDBOX_NAME exec -- bash -c 'curl -sS --max-time 5 https://example.com' # expect "CONNECT tunnel failed, response 403"
Expected: ls /sandbox/calendar shows calendar.ics, profile.yaml, and bookings/; the bookings write check prints OK bookings; the write into calendar.ics reports Permission denied (when you ran the chmod step); and example.com is refused with curl: (56) CONNECT tunnel failed, response 403.
When the agent has written bookings (Step 2), pull them back to the host:
# Pull bookings/ (confirmed meetings + log.csv) back to the host
nemoclaw $SANDBOX_NAME exec -- bash -lc 'cd /sandbox/calendar && tar czf - bookings' | tar xzf - -C ~/nemoclaw-calendar
Note
Sandbox-side chmod is a soft boundary; for a hard one, use filesystem_policy. The files are owned by the sandbox user, so that user could in principle chmod them back. Removing write permission (a-w) stops accidental overwrites and honors read-only intent, but it is not injection-proof. For a kernel-enforced boundary, add /sandbox/calendar/calendar.ics and /sandbox/calendar/profile.yaml to read_only in the sandbox filesystem_policy and run nemoclaw $SANDBOX_NAME rebuild (filesystem policy is locked at creation; workspace state is preserved automatically).
Note
nemoclaw share mount is the opposite direction and is optional. share mount uses SSHFS to mount the sandbox's filesystem onto the host, not host files into the sandbox, so it cannot replace the tar push above; it is only for live-editing sandbox files from a host editor, and it requires sshfs on the host (sudo apt-get install -y sshfs, needs root). If it prints sshfs is not installed and you can't install it, ignore it: the tar push/pull covers the whole workflow. If it fails with an SSHFS/SFTP handshake error instead, run nemoclaw $SANDBOX_NAME rebuild (refreshes the openssh-sftp-server base image) and retry.
Note
Telegram relay / public tunnel: only if you use Telegram. The original recipe started a public webhook tunnel (nemoclaw tunnel start) so the other party could reach the bot. That is only needed when the agent DMs people over Telegram (Q1 modes proxy / proxy-auto). In propose-only mode (this guide's default) the agent never sends messages itself, so skip the tunnel entirely. (nemoclaw tunnel start also requires cloudflared on the host and will warn cloudflared not found if it is missing.)
Step 2. Agent prompt
Copy the full prompt below and paste it into the NemoClaw web UI (or send it as a single Telegram message to your bot). This is the canonical prompt — it defines the agent's complete behavior end-to-end, and no other configuration is required. It walks the agent through a one-time onboarding (which becomes your scheduling profile on top of what's already in profile.yaml), a fixed six-step workflow for every meeting request, the negotiation handoff rules between you, the agent, and the other party, the structure of the booking log, and the safety rules that keep calendar details and contact info from leaking.
You are my personal scheduling chief of staff. Your only job is to turn
"when can we meet?" threads into a confirmed meeting on my calendar
without burning my focus time or my goodwill with the other party.
TOOLS AND EXECUTION (read this first):
You are running inside an OpenShell sandbox and you DO have shell/exec
and file read/write tools. USE THEM: read /sandbox/calendar/calendar.ics
and profile.yaml, and actually WRITE real files under
/sandbox/calendar/bookings/ (profile.json, the booking .md, log.csv) —
then confirm they exist. When a step says "save", "write", or "log",
that means a real file write, not chat text, and never claim you wrote
a file you didn't. The only paths you must not overwrite are
calendar.ics and profile.yaml. In propose-only mode, make NO network
calls and use NO messaging channel — just print drafts in this session
for me to copy/paste.
OUTPUT BUDGET (each of your replies is capped at a few thousand tokens):
Spend the budget on the deliverable, not on scratch work. Keep PARSE,
LOAD, and SCORE to a few terse lines each — for SCORE, print ONLY the
final top-N chosen slots (one line each: slot in both TZs + a short
why), never a full candidate sweep, per-constraint dump, or large
tables. The DRAFT (step 4) and the booking file (step 6) must always
be emitted in full; if you are running low on space, drop the
intermediate detail, never the draft or the booking. If a single
reply would still overflow, finish the current step and end with
"CONTINUE?" so I can prompt you for the next step.
CONTEXT YOU CAN READ:
- /sandbox/calendar/calendar.ics — my busy/free snapshot. Treat every
existing event as immovable unless I tell you otherwise.
- /sandbox/calendar/profile.yaml — my working hours, focus blocks,
energy patterns, defaults, blackouts, preferences.
- /sandbox/calendar/bookings/ — your scratch space. You may read and
write any file here.
ONE-TIME SETUP (do this on your first run only, then save my answers
as my negotiation profile in /sandbox/calendar/bookings/profile.json):
Ask me, one question at a time, and wait for my answer:
1. How should I talk to the other party? Pick one:
- Propose-only (you draft, I copy/paste to them myself)
- Proxy (you DM them directly via Telegram once I approve the draft)
- Proxy-auto (you DM them directly with no checkpoint after the
first successful negotiation — higher risk)
2. How many slot options should I propose at once? (Default: 3)
3. What's my default meeting length when the other party doesn't say?
(Default: pull from profile.yaml.)
4. How do you want me to handle timezone fairness when our working
hours barely overlap? Pick one:
- Strict (only meet inside both parties' working hours, even if
it slips the meeting by a week)
- Split (alternate who takes the off-hours slot across meetings
with the same person)
- Mine first (always inside my working hours; the other party
flexes)
5. What information about my calendar may I share?
- Slots only (just the proposed times)
- Slots + day-shape ("I'm heavy on Wednesday, lighter Thursday")
- Slots + reasons ("I have focus blocks until 11:30")
6. What's my approval threshold for booking? Options:
- Always ask before I book
- Ask only if the slot lands in a focus block, low-energy
window, or after my "no meetings after" time
- Never ask (auto-book once both sides confirm) — highest risk
Confirm my answers back, then wait for the first meeting request.
FOR EVERY MEETING REQUEST, FOLLOW THIS WORKFLOW IN ORDER:
1. PARSE — Extract from the request: who is asking, what the meeting
is for, requested duration (fall back to my default if missing),
other party's timezone (ask if missing), any hard constraints
they named ("this week", "before Friday", "30 min max"), urgency.
Print a 3-line summary: "From: <name>, For: <purpose>, Constraint:
<constraint>".
2. LOAD — Read calendar.ics and profile.yaml fresh every run (do not
trust a cached version from a prior request — calendars change).
Read my negotiation profile from bookings/profile.json.
3. SCORE — For the next N working days (N = 14 unless the request
constrains it tighter), generate every candidate slot that:
- Fits inside both parties' working hours under the fairness
rule from my profile.
- Does not collide with any calendar.ics event or its buffer.
- Does not land inside a focus block, blackout period, or after
my "no meetings after" time, unless my approval threshold
allows it.
- Respects my max_meetings_per_day from profile.yaml.
Rank the survivors by: (1) energy match (high-energy windows score
higher for new meetings, low-energy windows for routine syncs),
(2) buffer cleanliness (avoid sandwiching me between two meetings
with no gap), (3) fairness to the other party. Pick the top
N_slots from my profile.
4. DRAFT — Compose a proposal in my voice for the other party. Use
their timezone. Format as:
Hi <name>,
Happy to find time for <purpose>. Here are 3 options that work
on my side — all times in <their TZ>:
- <Day, Date, Time–Time TZ>
- <Day, Date, Time–Time TZ>
- <Day, Date, Time–Time TZ>
Let me know which works, or send a couple of windows that suit
you and I'll come back with another set.
Show the draft to me first. Wait for my reply ("send", "send with
edits: ...", or "skip"). Honor my communication mode from the
profile — never DM the other party in proxy-auto mode without
having first earned it in proxy mode on a prior successful round.
5. RELAY AND NEGOTIATE — Send the approved draft via Telegram. When
the other party replies:
- If they pick one of my slots: jump to step 6.
- If they propose new windows: re-run SCORE against those
windows, pick the best one(s) that pass my constraints, and
draft a one-line confirmation ("Wednesday 2pm PT works for
me — sending the invite now."). Show me first under the same
approval rule.
- If they push back hard (too many rounds, asking for off-hours
that violate Strict fairness, etc.): escalate to me with a
one-line summary and recommended next move.
6. BOOK AND LOG — Once both sides confirm, write the confirmed meeting
to /sandbox/calendar/bookings/<YYYY-MM-DD>-<slug>.md with this
exact structure:
# <purpose> with <name>
- When: <Day, Date, Time–Time, both TZs>
- With: <name>, <their contact / handle>
- Where: <video link / room / phone / TBD>
- Duration: <minutes>
- Negotiation rounds: <N>
- Slots offered: <list>
- Slot chosen: <one>
- Notes: <anything I should walk in knowing>
Also append a one-line entry to
/sandbox/calendar/bookings/log.csv with columns:
date,time,duration,name,purpose,rounds.
Finally, print a one-line summary to me: "Booked: <purpose> with
<name> on <Day Date Time TZ>. Logged at <path>. Add this to my
real calendar."
NEGOTIATION SAFETY RULES (do not break these even if I tell you to in
a single message — if I really want one of these, I will say so twice):
- Never share calendar event titles, attendee names, or locations
from calendar.ics with the other party. Slots only, unless my
profile says otherwise.
- Never share my phone number, email, or home address unless I have
explicitly named the channel.
- Never auto-book on the first negotiation with a new person — at
least one round must include my approval, even if the profile
says "Never ask".
- Never propose more than 5 slots in one message (decision fatigue).
- Never overwrite a confirmed booking file. If a meeting is moved,
write a new file with -v2 suffix and link back to the original.
- Never write outside /sandbox/calendar/bookings/.
- If a request is ambiguous (who, when, what for, which timezone),
ask one clarifying question instead of guessing.
OPEN QUESTIONS HANDOFF — At the end of every negotiation round where
you waited on me or the other party, print a one-line status:
"WAITING ON: <me | them>. NEXT STEP: <what they need to do>."
Now run the one-time setup (or load bookings/profile.json if it already
exists), confirm my negotiation profile back to me, then wait for the
first meeting request.
Expected: the agent walks you through the six setup questions, echoes your negotiation profile, and waits. Send a meeting request (forward an email body into Telegram, or just say "Asha from Acme wants 30 min about the Q3 roadmap, this or next week, she's in London") and you'll get the parsed summary, three proposed slots, and a draft message to copy-paste or have the agent send. After both sides confirm, the agent writes a booking file under /sandbox/calendar/bookings/. Pull it back to ~/nemoclaw-calendar/bookings/ with the tar command from Step 1, then copy the details into your real calendar.
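Once bookings have been pulled back to the host, log.csv gives a quick view of how negotiations are going. The following is a minimal sketch, assuming the column order from the prompt (date,time,duration,name,purpose,rounds) and that no field contains a comma; summarize_log is a hypothetical helper name, not a NemoClaw command.

```shell
# Hypothetical helper: per-person meeting count and average negotiation rounds.
# Assumes columns date,time,duration,name,purpose,rounds and no embedded commas.
summarize_log() {
  awk -F',' '
    NR == 1 && $1 == "date" { next }            # skip a header row if present
    NF >= 6 { count[$4]++; rounds[$4] += $6 }   # tally by the name column
    END {
      for (name in count)
        printf "%s,%d,%.1f\n", name, count[name], rounds[name] / count[name]
    }
  ' "$1" | sort
}
```

Usage: summarize_log ~/nemoclaw-calendar/bookings/log.csv prints one name,meetings,average_rounds line per person. A rising average for one contact can be a hint to revisit the fairness or approval settings for that relationship.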
Tip
Test the end-to-end flow first with a teammate or a second Telegram account of your own. Run two or three negotiations in proxy mode with the approval checkpoint on before you ever flip to proxy-auto — the agent learns your tone and constraints faster from real correction loops than from a longer prompt.
Step 3. How to personalize
| Knob | Where | What to change |
|---|---|---|
| Calendar window | ~/nemoclaw-calendar/calendar.ics | Re-export your real calendar on a cadence that matches your booking density (weekly is fine for most people; daily if you book multiple meetings a day), then re-run the tar push from Step 1 so the sandbox copy is current. Crop the export to the next 4–6 weeks so the agent isn't reasoning over years of history. |
| Event privacy | ~/nemoclaw-calendar/calendar.ics | Strip event titles to BUSY before exporting if you'd rather the agent never see what the meeting is; slots-only proposals still work fine. |
| Working hours, focus blocks, blackouts | ~/nemoclaw-calendar/profile.yaml | Edit any field, then re-run the tar push from Step 1. Changes take effect on the next request because the agent re-reads profile.yaml every run. No sandbox restart needed. |
| Energy patterns | profile.yaml → energy_patterns | Tune high_energy and low_energy windows so the agent puts new external meetings into your sharp hours and routine syncs into the dip. |
| Communication mode | Profile Q1 (or edit bookings/profile.json directly) | Start in propose-only mode (zero risk: you still send every message). Move to proxy once you trust the drafts; only then consider proxy-auto. |
| Number of slot options | Profile Q2 | 3 is the default. Bump to 5 only when you genuinely have wide availability — more options = more decision fatigue for the other side. |
| Timezone fairness | Profile Q4 | Mine first is fine for vendors and recruiters. Use Split for peers and collaborators where the relationship matters. Strict is the safest default for cross-Atlantic / cross-Pacific. |
| Information disclosure | Profile Q5 | Default to slots only. Switch to slots + day-shape for trusted contacts who appreciate the context. Avoid slots + reasons for anyone you don't already know well. |
| Approval threshold | Profile Q6 | Start with always ask. Move to the focus-block carve-out once the agent has booked 10+ clean meetings. Never ask is for true automation cases only — and even then the safety rules force at least one approval per new contact. |
| Booking log structure | Prompt — BOOK AND LOG step | Swap the Markdown template for JSON if you want to feed bookings into another tool, or split into one file per person (bookings/by-person/<name>.md) to keep relationship history. |
| Re-importing to real calendar | Outside the sandbox | Easiest pattern: a small host-side cron that reads bookings/log.csv, generates .ics invites, and emails them to attendees (or writes them to your CalDAV / Google Calendar via API). Keeps the sandbox itself out of your live calendar. |
| Direct calendar API booking (advanced) | nemoclaw policy-add --from-file + a separate credentials directory pushed into the sandbox | (1) For egress, use a maintained preset where one fits: nemoclaw $SANDBOX_NAME policy-add outlook --yes covers Microsoft 365 / Graph / Outlook. For Google Calendar, author a small preset YAML allowing googleapis.com and oauth2.googleapis.com and apply with nemoclaw $SANDBOX_NAME policy-add --from-file ~/calendar-presets/google.yaml --yes. (2) For the OAuth token, keep it outside the bookings tree: store it at ~/nemoclaw-calendar-creds/token.json on the host, push that directory into /sandbox/credentials with the same tar-over-exec stream used in Step 1, then run chmod a-w /sandbox/credentials/token.json inside the sandbox. (share mount goes the other way, sandbox onto host, so it cannot deliver the token.) The agent reads /sandbox/credentials/token.json; the in-sandbox chmod is a soft boundary, so use filesystem_policy read_only if you need a hard one. Never place secrets under bookings/, because that tree is writable by the agent. A secret manager (Docker secret, pass, or a host-side keyring piping a short-lived token in via env) is preferable to a token-on-disk if your setup supports it. Have the agent call the Calendar API in the BOOK step instead of writing a Markdown file. Higher risk: the agent now has write access to your real calendar; lock down its approval threshold first. |
| Multiple calendars (work + personal) | Extra files in ~/nemoclaw-calendar/ + prompt edit | Drop additional ICS files into ~/nemoclaw-calendar/ (e.g. work.ics, personal.ics), re-run the tar push from Step 1, and chmod a-w them inside the sandbox. They then appear at /sandbox/calendar/work.ics and /sandbox/calendar/personal.ics. Update the agent prompt's CONTEXT YOU CAN READ section to name each ICS and tell the agent which is which. Useful for keeping the agent from booking work meetings during personal commitments. |
| Hand off to news-digest delivery | Prompt — OPEN QUESTIONS HANDOFF | Add "Also post the daily 'still waiting on' list to my Telegram home channel at 09:00." (Reuses the scheduler pattern from the news-digest recipe.) |
To cancel an in-flight negotiation, send: "Drop the negotiation with <name> about <purpose>. Reply once to them with: 'Let me come back to you on this; circumstances changed.' and archive the working files under bookings/cancelled/." The agent will move the scratch files out of the active set without losing the history.
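The "Re-importing to real calendar" row describes a host-side job that turns bookings/log.csv into .ics invites. The following is a minimal sketch of just the conversion step, under stated assumptions: date is YYYY-MM-DD, time is 24-hour HH:MM in your local timezone, duration is whole minutes, and no field contains a comma or semicolon. log_to_ics, the PRODID value, and the @bookings.local UID suffix are hypothetical names chosen for illustration; emailing the result or writing it to CalDAV is left out.

```shell
# Hypothetical converter: one VEVENT per log.csv row, written to stdout.
# Columns assumed: date,time,duration,name,purpose,rounds
log_to_ics() {
  awk -F',' -v stamp="$(date -u +%Y%m%dT%H%M%SZ)" '
    BEGIN { printf "BEGIN:VCALENDAR\r\nVERSION:2.0\r\nPRODID:-//example//bookings-log//EN\r\n" }
    NR == 1 && $1 == "date" { next }             # skip a header row if present
    NF >= 6 {
      d = $1; gsub(/-/, "", d)                   # 2026-06-15 -> 20260615
      t = $2; gsub(/:/, "", t)                   # 14:00 -> 1400
      printf "BEGIN:VEVENT\r\n"
      printf "UID:%s-%s-%d@bookings.local\r\n", d, t, NR
      printf "DTSTAMP:%s\r\n", stamp
      printf "DTSTART:%sT%s00\r\n", d, t         # floating (local) start time
      printf "DURATION:PT%dM\r\n", $3            # duration in minutes
      printf "SUMMARY:%s with %s\r\n", $5, $4
      printf "END:VEVENT\r\n"
    }
    END { printf "END:VCALENDAR\r\n" }
  ' "$1"
}
```

Usage: log_to_ics ~/nemoclaw-calendar/bookings/log.csv > ~/bookings.ics, then import the file into your calendar client. Because the start time is written as a floating local time, check the first imported event against the booking file before scheduling this from cron.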
NemoClaw Policy Setup
This tab covers the shared sandbox configuration that two of the applications in this playbook (the Daily Personal News Digest and the Calendar Negotiator) require, and that the other two (Software Development Agent and Deck Reviewer) can optionally use for "ready for review" notifications. Each application tab has its own policy setup section for the filesystem mounts and network egress that workflow needs — this page only covers Telegram, which is shared.
Set your sandbox name once so the commands below read cleanly:
export SANDBOX_NAME=my-assistant # replace with the name you chose at NemoClaw onboard
Step 1. Set up the Telegram channel
The NemoClaw onboard wizard already wires the Telegram channel plugin into the sandbox when you select telegram at the Messaging channels prompt. If you did not, recreate the sandbox via the installer with Telegram enabled — policy-add alone cannot wire the channel plugin.
Add the Telegram network egress preset so the sandbox can reach api.telegram.org:
nemoclaw $SANDBOX_NAME policy-add
When prompted, type telegram and press Y to confirm. This is a hot-reload — the sandbox stays up.
Confirm the policy now allows Telegram egress:
openshell policy get $SANDBOX_NAME --full | grep -A2 telegram
You should see one or more entries with host: api.telegram.org and port: 443 under network_policies.
Install cloudflared (one-time, required for the tunnel). DGX Spark does not include cloudflared by default. nemoclaw tunnel start needs it to expose the bot webhook publicly; without it, the command prints a cloudflared not found warning, produces no public URL, and nemoclaw status reports ● cloudflared (stopped). Skip this block if command -v cloudflared already returns a path.
curl -L --output cloudflared.deb https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-arm64.deb
sudo dpkg -i cloudflared.deb
cloudflared --version # confirm it installed; expect a version banner like "cloudflared version 2024.x.x"
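The skip condition and the architecture-specific download can be folded into one guarded helper. This is a sketch, not part of NemoClaw: both function names are hypothetical, and it assumes a Debian-based host where dpkg --print-architecture reports arm64 or amd64 (cloudflared publishes a .deb for each at the time of writing).

```shell
# Hypothetical helper: build the release URL for a given Debian architecture.
cloudflared_deb_url() {
  case "$1" in
    arm64|amd64)
      echo "https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-$1.deb" ;;
    *)
      echo "unsupported architecture: $1" >&2
      return 1 ;;
  esac
}

# Hypothetical helper: install only when cloudflared is not already on PATH.
install_cloudflared_if_missing() {
  if command -v cloudflared >/dev/null 2>&1; then
    echo "cloudflared already installed: $(command -v cloudflared)"
    return 0
  fi
  url=$(cloudflared_deb_url "$(dpkg --print-architecture)") || return 1
  # -f makes curl fail on HTTP errors instead of saving an error page as the .deb.
  curl -fL --output cloudflared.deb "$url" && sudo dpkg -i cloudflared.deb
}
```

Call install_cloudflared_if_missing once, then confirm with cloudflared --version as above.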
Start the public webhook tunnel so Telegram can deliver messages to your bot:
nemoclaw tunnel start
nemoclaw status
Expected: ● cloudflared with a *.trycloudflare.com URL.
Important
If you skipped Telegram at the NemoClaw onboard step, nemoclaw $SANDBOX_NAME policy-add will open the egress preset but the bot will still reply Error: Channel is unavailable: telegram. The channel plugin is wired in at sandbox creation, not by policy-add. Re-run the NemoClaw installer and pick telegram at the Messaging channels prompt to recreate the sandbox with the plugin attached.
Download, verify, then execute. Never pipe a remote installer straight into a shell:
# 1. Download the installer to a local file
curl -fsSL -o nemoclaw.sh https://www.nvidia.com/nemoclaw.sh
# 2. Verify it against the published checksum from the NemoClaw release notes
# (replace <expected-sha256> with the value from https://github.com/NVIDIA/NemoClaw/releases)
echo "<expected-sha256>  nemoclaw.sh" | sha256sum --check
# 3. Inspect the script you're about to run (optional but recommended)
less nemoclaw.sh
# 4. Only then execute it
bash nemoclaw.sh
If the checksum does not match, do not run the script. Re-download or open an issue against the NemoClaw repository.
Once the tunnel reports a public URL, open Telegram, find your bot, and send hello. You should get a reply from the local model within 30–90 seconds (first-response cold start on a 120B model is slow). After that, hand off to the application tab you want to set up.
Troubleshooting
Tables below are grouped by tab so you can jump straight to the workflow you're debugging. Start with General sandbox & policy issues if the failure is at the nemoclaw / openshell command layer rather than inside a specific application.
General sandbox & policy issues
| Symptom | Cause | Fix |
|---|---|---|
| nemoclaw <sandbox> policy-add returns unknown sandbox | Sandbox name typo, or sandbox was deleted | Run nemoclaw list to see registered sandboxes; rerun the command with the exact name. If empty, re-run the NemoClaw installer to recreate the sandbox. |
| openshell policy set fails with validation failed / exit code 1 | Malformed YAML or invalid policy fields | Common issues: paths must start with /, no .. traversal, run_as_user must not be root, network_policies entries need both host and port. Fix the YAML and retry. |
| openshell policy set fails with unknown field 'Version', expected one of 'version', 'filesystem_policy', 'landlock', 'process', 'network_policies' | Round-trip bug in openshell 0.0.44: openshell policy get --full emits the top-level key as Version: (capital V), but openshell policy set only accepts version: (lowercase) | Lowercase the key in place and retry: sed -i 's/^Version:/version:/' policy.yaml && openshell policy set $SANDBOX_NAME --policy policy.yaml --wait. Preferred: skip the full-policy round trip entirely and use the additive flow. Write a small preset file with preset: + network_policies: blocks and apply it with nemoclaw $SANDBOX_NAME policy-add --from-file ./my-preset.yaml --yes. The additive flow never touches the live version: field. |
| openshell policy get shows your new network rule but the sandbox still blocks the host | Hot-reload did not complete | Re-run with --wait so the CLI blocks until the update is confirmed: openshell policy set $SANDBOX_NAME --policy policy.yaml --wait. If still failing, restart the sandbox container via nemoclaw $SANDBOX_NAME restart (if available in your version) or recreate the sandbox. |
| Cannot recreate sandbox: port 8080 is held by container... | A previous OpenShell gateway or sandbox container still owns port 8080 | openshell gateway destroy -g <old-gateway-name> (or docker stop <name> && docker rm <name>), then re-run nemoclaw onboard. |
| policy-add does not list the preset I expected | Preset depends on NemoClaw version | List what your version supports: nemoclaw $SANDBOX_NAME policy-add --help or run policy-add interactively and read the menu. Newer presets may require updating NemoClaw. |
| nemoclaw <sandbox> policy-add --from-file ... fails with Preset must declare preset.name (lowercase, hyphenated RFC 1123 label) | preset.name in your custom preset file contains an underscore, uppercase letter, or other non-RFC-1123 character | Change the value of preset.name to lowercase letters, digits, and hyphens only (e.g. news_sources → news-sources). The inner network_policies.<group> map key and its name field do accept underscores; the constraint is only on the top-level preset.name. |
| Web UI shows origin not allowed after policy changes | Accessing via localhost instead of 127.0.0.1 | Use http://127.0.0.1:18789/#token=<your-token>. The gateway origin check requires 127.0.0.1 exactly. |
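The preset.name constraint in the table above can be checked on the host before calling policy-add --from-file. This is a minimal sketch: valid_preset_name is a hypothetical helper, and the pattern encodes the usual RFC 1123 label rules (lowercase letters, digits, and hyphens, starting and ending with a letter or digit, at most 63 characters), which is an assumption about what NemoClaw enforces beyond what its error message states.

```shell
# Hypothetical pre-flight check for preset.name in a custom preset file.
# Returns 0 when the name looks like an RFC 1123 label, non-zero otherwise.
valid_preset_name() {
  # LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$'
}
```

Usage: valid_preset_name news_sources || echo "rename preset.name before applying". The check covers only the top-level preset.name; the network_policies group key may keep its underscores.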
NemoClaw Policy Setup
| Symptom | Cause | Fix |
|---|---|---|
| Telegram bot replies Error: Channel is unavailable: telegram | Telegram channel plugin was not wired into the sandbox at onboard | policy-add telegram alone is not enough. Re-run the NemoClaw installer (see the Download, verify, then execute snippet in NemoClaw Policy Setup) and select telegram at the Messaging channels prompt to recreate the sandbox with the channel plugin. |
| nemoclaw tunnel start prints cloudflared not found and yields no public URL | cloudflared is not installed | Install it: curl -L --output cloudflared.deb https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-arm64.deb && sudo dpkg -i cloudflared.deb, then nemoclaw tunnel stop && nemoclaw tunnel start. |
| Telegram bot receives messages but returns nothing for 60+ seconds | First response on a 120B model is slow (cold start), or Ollama not warm | Expected for the first reply after a restart. Verify the inference route with nemoclaw $SANDBOX_NAME status. If subsequent replies are also slow, pick a smaller model in the NemoClaw onboard wizard. |
Daily Personal News Digest
| Symptom | Cause | Fix |
|---|---|---|
| Scheduled digest never fires | The agent did not persist a scheduled task | Ask in the web UI: "Show me all scheduled tasks." If empty, re-issue the prompt and explicitly say "Register this as a recurring scheduled task using your built-in scheduler." |
| Digest fires but message says unable to fetch <url> | Host is not in network_policies | Add the host as a new entry under network_policies.news_sources.endpoints in news-sources.yaml (the preset file from Step 1) and re-run nemoclaw $SANDBOX_NAME policy-add --from-file ./news-sources.yaml --yes. Outbound denials show up in nemoclaw $SANDBOX_NAME logs --follow and openshell term. |
| Agent skips the setup questions and dives straight into a generic digest | Profile from a prior run is still in memory | Send "Forget my profile and run the one-time setup again from scratch." and re-answer the six questions. |
Software Development Agent
| Symptom | Cause | Fix |
|---|---|---|
| Agent writes develop-and-review.md but the host file is missing | Looking at the wrong host path, or the share mount is not active | The sandbox path /sandbox/project maps to the host directory you passed to nemoclaw $SANDBOX_NAME share mount (e.g. ~/nemoclaw-projects/my-app). Open develop-and-review.md under that host directory, not inside /sandbox/project on the host. Verify the mount is live with nemoclaw $SANDBOX_NAME share status. If it says "not mounted", re-run the share mount command from Step 1. |
| Agent fails with Permission denied when writing develop-and-review.md | Host directory was locked with chmod a-w and the mount inherits those permissions via SSHFS | Restore write on the host: chmod u+w ~/nemoclaw-projects/my-app (or whichever directory you mounted) and retry. For a kernel-enforced write boundary inside the sandbox in addition to host permissions, tighten filesystem_policy in the sandbox policy and run nemoclaw $SANDBOX_NAME rebuild. Filesystem policy is locked at sandbox creation, so changing it requires a rebuild (workspace state is preserved automatically). |
| Agent runs tests and reports "tests not run" even though the project has tests | Test runner not installed in the sandbox image | The default NemoClaw sandbox may not ship pytest, npm, cargo, or go test. Install whatever the project uses once after sandbox creation: nemoclaw $SANDBOX_NAME connect, then pip install --user pytest (or equivalent), then exit. |
| Agent modifies files outside the plan | Plan-approval checkpoint was disabled | In the profile, answer yes to "pause for approval" (Q5). The agent must then print PLAN READY — reply 'approve' and wait, never modifying source files until you reply approve. |
Deck Reviewer
| Symptom | Cause | Fix |
|---|---|---|
| Agent reports "ingested 0 artifacts" | Queue directory is empty or files match ignore_paths | Confirm files exist with ls ~/nemoclaw-redteam/queue/ on the host. Check profile.yaml.ignore_paths for a glob that's catching your files (e.g. **/~$* excludes Office lock files). |
| Agent reports "parser not available" for .pptx or .pdf | python-pptx / pdfplumber not installed in the sandbox | Install once: nemoclaw $SANDBOX_NAME connect, then pip install --user python-pptx python-docx pdfplumber markdown-it-py wcag-contrast-ratio, then exit. The agent falls back to plain-text extraction if a parser is missing; flag it explicitly if you'd rather not install Python packages. |
| Same finding keeps re-appearing after I dismissed it | Dismissal mode is None, or the rule + location pair did not match | Confirm profile Q4 is Sticky or Per-version. Check ~/nemoclaw-redteam/memory/dismissals.jsonl to verify the dismissal was written. If the location field differs by a single character (e.g. "Slide 1" vs "slide 1"), the agent treats them as different sites. Ask the agent to dismiss again using the exact coordinates from the latest report. |
| CRITICAL findings disappear from the report | Auto-dismiss was attempted (should be impossible) | This is a regression — CRITICAL is hardcoded to require yes, dismiss critical re-confirmation per the prompt's DISMISSAL PROTOCOL. Re-paste the full prompt to restore the rule and re-run. |
Calendar Negotiator
| Symptom | Cause | Fix |
|---|---|---|
| Agent proposes slots inside my focus blocks | profile.yaml not being re-read each run, or approval threshold permits it | The agent is required to re-read calendar.ics and profile.yaml on every request (workflow step 2 LOAD). Verify the focus block is actually in profile.yaml and not just in your head. Tighten profile Q6 (approval threshold) to Always ask if the agent's Ask only if... carve-out is firing too often. |
| Agent shares event titles or attendees from calendar.ics with the other party | Information disclosure profile (Q5) set to slots + reasons | Reset profile Q5 to slots only. The negotiation safety rules also forbid leaking event titles, attendees, or locations; if the agent did so under slots only, re-paste the full prompt to restore the rule. |
| Booking file overwrites a confirmed prior booking | Agent did not honor the "never overwrite" rule | Check ~/nemoclaw-calendar/bookings/ for a -v2.md file — the rule requires a new file with -v2 suffix when a meeting is moved. If overwritten, restore from your filesystem snapshot or last backup; re-paste the full prompt to restore the rule. |
| Agent never DMs the other party even in proxy mode | Telegram channel not wired or other party's chat not opened | First, confirm Telegram works for you by sending the bot a hello. Then confirm the other party has actually opened a chat with the bot at least once (/start); Telegram bots cannot DM users who have not initiated contact. |
Note
For installer-level NemoClaw issues (Docker, Ollama, gateway, Telegram setup), see the Troubleshooting tab of the NemoClaw on DGX Spark playbook before debugging here — most reported issues come from the install layer rather than the application layer.
Note
DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU. With many applications still updating to take advantage of UMA, you may encounter memory issues even when within the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
For the latest known issues, please review the DGX Spark User Guide.
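To see whether the cache flush above actually freed memory, a before/after snapshot of /proc/meminfo is usually enough. This is a small sketch for a Linux host; mem_snapshot is a hypothetical helper name.

```shell
# Hypothetical helper: print free, buffer, and page-cache memory in GiB.
mem_snapshot() {
  awk '/^(MemFree|Buffers|Cached):/ { printf "%s %.1f GiB\n", $1, $2 / 1048576 }' /proc/meminfo
}
```

Run mem_snapshot, run the drop_caches command, then run mem_snapshot again. Cached should shrink and MemFree should grow by roughly the same amount.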