tftsr-devops_investigation/docs/TICKET-pii-bypass-chat-attachments.md
Shaun Arman 093495a653
2026-06-05 14:12:43 -05:00


TICKET: PII Detection Bypass in AI Chat

Branch: fix/pii-detection-bypass


Description

Two PII detection bypasses were identified and fixed in the AI triage chat interface.

Bypass 1 — File Attachments (Critical)

When a user attached a file to a chat message, its content was read via readTextFile(), sliced to 8 KB, and embedded directly into the AI message string — bypassing the PII pipeline entirely. The message was forwarded to the configured AI provider in plaintext with no redaction marker in the audit log.

Root cause: handleAttach stored raw file content in React state. handleSend concatenated it into aiMessage with no PII check. The backend chat_message command applied no validation.

Bypass 2 — Typed Chat Messages (High)

Plain typed chat messages were sent to the AI provider without any PII scan. A user typing "How secure is my password: abc123!!" would have the password forwarded to the AI and persisted in the audit log in plaintext.

A related defect was fixed alongside the two bypasses: the detect_pii command was serialising pii::PiiDetectionResult (spans, original_text) while the TypeScript interface expected db::models::PiiDetectionResult (detections, total_pii_found). Frontend code reading result.detections therefore received undefined, so the LogUpload PII review workflow was silently broken.
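The mismatch is easy to miss because a TypeScript type assertion performs no runtime check. The sketch below reproduces the failure mode with a hypothetical payload in the old shape. The PiiSpan fields and the isPiiDetectionResult guard are assumptions for illustration, not code from the repository.

```typescript
// Shape the frontend expects. Field names come from the ticket;
// the PiiSpan fields are assumed for illustration.
interface PiiSpan {
  start: number;
  end: number;
  pii_type: string;
}

interface PiiDetectionResult {
  detections: PiiSpan[];
  total_pii_found: number;
}

// Hypothetical payload in the old shape (spans, original_text).
const legacyPayload: unknown = JSON.parse(
  '{"spans":[{"start":0,"end":5,"pii_type":"Email"}],"original_text":"a@b.c"}'
);

// The assertion compiles, but nothing is validated at runtime.
const miscast = legacyPayload as PiiDetectionResult;
const silentlyMissing = miscast.detections; // undefined at runtime

// A small runtime guard (hypothetical helper) surfaces the mismatch.
function isPiiDetectionResult(value: unknown): value is PiiDetectionResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return Array.isArray(v.detections) && typeof v.total_pii_found === "number";
}
```

A guard like this at the command boundary would turn a silent undefined into an explicit error, though the fix recorded in this ticket was to correct the Rust return type.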


Design Decision: Auto-Redact, Not Block

The initial implementation explored a blocking (warn-then-proceed) approach. The product decision was instead to auto-redact PII in place and send the message:

  • File attachments: PII is detected on full file content and replaced with type tokens ([Password], [Email], etc.) before the content is embedded in the AI message. The redacted form is stored in the DB and audit log.
  • Typed messages: Same auto-redact applied to the user's typed text before the message is sent to the AI provider.
  • The user's chat bubble is updated after the response to show the redacted form — users can see exactly what reached the AI.
  • The audit log records was_pii_redacted: bool and pii_types_redacted: [...] alongside the redacted message.
  • No user blocking or acknowledgment flow. PII is handled transparently.
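The auto-redact behaviour described above can be sketched as a span replacement that also yields the two audit fields. This is a minimal illustration, not the backend's apply_redactions: the RedactionSpan shape, the redactSpans name, and the assumption that spans do not overlap are all hypothetical.

```typescript
// Hypothetical span shape; the ticket only names the type tokens.
interface RedactionSpan {
  start: number; // inclusive index into the text
  end: number; // exclusive index into the text
  piiType: string; // e.g. "Password", "Email"
}

interface RedactionOutcome {
  redacted: string;
  wasPiiRedacted: boolean;
  piiTypesRedacted: string[];
}

// Assumes spans do not overlap.
function redactSpans(text: string, spans: RedactionSpan[]): RedactionOutcome {
  // Replace from the end of the string so earlier indices stay valid.
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let redacted = text;
  for (const span of ordered) {
    redacted =
      redacted.slice(0, span.start) + `[${span.piiType}]` + redacted.slice(span.end);
  }
  // De-duplicate and sort the types so the audit field is stable.
  const piiTypesRedacted = Array.from(new Set(spans.map((s) => s.piiType))).sort();
  return { redacted, wasPiiRedacted: spans.length > 0, piiTypesRedacted };
}
```

With no spans the text passes through unchanged and wasPiiRedacted is false, which matches the clean-message case in the audit log.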

Acceptance Criteria

  • Attaching a text file containing PII sends successfully; content is auto-redacted before the AI sees it
  • Attaching a clean text file proceeds normally with no modification
  • PII detection runs on the full file content before truncating to the 8 KB embed limit, so PII straddling the truncation boundary cannot escape detection
  • Typed messages containing PII are auto-redacted before being sent to the AI provider
  • The chat bubble is updated post-send to show the redacted form of the user's message
  • The audit log records was_pii_redacted, pii_types_redacted, and the full redacted user_message
  • detectPiiCmd returns detections: PiiSpan[] and total_pii_found: number matching the TypeScript contract
  • chatMessageCmd passes logFileIds as undefined (not null) when no files are attached
  • scan_text_for_pii rejects inputs over 32 KB to prevent DoS
  • response.user_message ?? message is used as the bubble fallback, so the bubble never shows a concatenated "undefined..." string
  • All Rust and frontend tests pass; zero clippy warnings; cargo fmt --check clean; tsc clean

Work Implemented

src-tauri/src/ai/mod.rs

  • Added user_message: Option<String> to ChatResponse — set by chat_message, absent from direct provider calls

src-tauri/src/ai/anthropic.rs, gemini.rs, mistral.rs, ollama.rs, openai.rs

  • Added user_message: None to all ChatResponse { ... } constructors

src-tauri/src/commands/ai.rs

  • chat_message now accepts log_file_ids: Option<Vec<String>>
  • Step 1: auto-redacts the typed message text with PiiDetector + apply_redactions
  • Step 2: loads each attachment from DB, detects PII on full file content, applies redactions, then truncates to 8 KB at a valid UTF-8 char boundary
  • Tracks was_pii_redacted and redacted_pii_types across both steps
  • Audit log includes was_pii_redacted: bool and pii_types_redacted: [...]
  • Returns user_message: Some(stored_user_message) in ChatResponse
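The ordering in Step 2 matters: redaction has to run on the full content, and truncation must not split a multi-byte character. The backend is Rust, so the TypeScript below is only an illustration of the idea. truncateUtf8 and prepareAttachment are hypothetical names, and the redact callback stands in for the real detector.

```typescript
// Truncate to at most maxBytes of UTF-8 without splitting a character.
function truncateUtf8(text: string, maxBytes: number): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = maxBytes;
  // Continuation bytes look like 10xxxxxx. If the first excluded byte is
  // one, the cut is mid-character, so step back to that character's start.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;
  return new TextDecoder().decode(bytes.subarray(0, end));
}

// Redact first, then truncate, so a value straddling the limit is
// replaced as a whole before anything is cut.
function prepareAttachment(
  content: string,
  redact: (text: string) => string,
  maxBytes: number
): string {
  return truncateUtf8(redact(content), maxBytes);
}
```

Reversing the order would let the leading fragment of a secret survive the cut, because the detector would no longer see the complete value.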

src-tauri/src/commands/analysis.rs

  • Fixed detect_pii return type from pii::PiiDetectionResult to db::models::PiiDetectionResult
  • Added scan_text_for_pii(text: String) with 32 KB input cap
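A size check before scanning can be sketched as follows. The real command is Rust; this TypeScript version only illustrates the guard. Treating the cap as 32 × 1024 UTF-8 bytes and rejecting only inputs strictly above it are assumptions, since the ticket says just "over 32 KB", and checkScanInput is a hypothetical name.

```typescript
// Assumed interpretation of the 32 KB cap: 32 * 1024 UTF-8 bytes.
const MAX_SCAN_BYTES = 32 * 1024;

type ScanInputCheck = { ok: true } | { ok: false; error: string };

function checkScanInput(text: string): ScanInputCheck {
  // Measure encoded bytes rather than string length, since non-ASCII
  // characters take more than one byte in UTF-8.
  const size = new TextEncoder().encode(text).length;
  if (size > MAX_SCAN_BYTES) {
    return { ok: false, error: `input is ${size} bytes; limit is ${MAX_SCAN_BYTES}` };
  }
  return { ok: true };
}
```

Rejecting oversized input up front bounds the work the detector can be asked to do in a single call.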

src-tauri/src/lib.rs

  • Registered scan_text_for_pii

src/lib/tauriCommands.ts

  • ChatResponse interface: added user_message?: string
  • chatMessageCmd signature: added logFileIds: string[]; passes undefined when empty
  • Added scanTextForPiiCmd wrapper
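The undefined-versus-null distinction shows up when arguments are serialised: JSON.stringify drops keys whose value is undefined but writes null explicitly. The sketch below shows only that behaviour. ChatMessageArgs and buildChatMessageArgs are hypothetical; the actual wrapper's argument shape is not given in the ticket.

```typescript
// Hypothetical argument shape for illustration.
interface ChatMessageArgs {
  message: string;
  logFileIds?: string[];
}

function buildChatMessageArgs(message: string, logFileIds: string[]): ChatMessageArgs {
  // An empty list becomes undefined, so the key is omitted when serialised.
  return { message, logFileIds: logFileIds.length > 0 ? logFileIds : undefined };
}

const withoutFiles = JSON.stringify(buildChatMessageArgs("hi", []));
const withFiles = JSON.stringify(buildChatMessageArgs("hi", ["f1"]));
// For contrast: an explicit null is kept in the serialised payload.
const withNull = JSON.stringify({ message: "hi", logFileIds: null });
```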

src/stores/sessionStore.ts

  • Added updateMessageContent(id, content) action

src/pages/Triage/index.tsx

  • PendingFile type: { name: string; logFileId: string } — no raw content stored
  • handleAttach: only uploads the file and stores logFileId; no readTextFile
  • handleSend: passes logFileIds to backend; after response updates the bubble with (response.user_message ?? message) + suffix
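The bubble fallback can be sketched as below. ChatResponseLike is a cut-down stand-in for the real ChatResponse interface (its content field is assumed), and both function names are hypothetical. The naive variant is included only to show where the "undefined..." text would come from.

```typescript
// Cut-down stand-in for ChatResponse; `content` is an assumed field.
interface ChatResponseLike {
  content: string;
  user_message?: string;
}

// Prefer the redacted text returned by the backend and fall back to the
// locally typed message when the field is absent.
function bubbleText(response: ChatResponseLike, message: string, suffix: string): string {
  return (response.user_message ?? message) + suffix;
}

// Without the fallback, a missing field is stringified as "undefined".
function naiveBubbleText(response: ChatResponseLike, suffix: string): string {
  return response.user_message + suffix;
}
```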

Testing Needed

  1. Attach a file containing "password: secret123" → message sends; chat bubble shows [Password] in the embedded content; no plaintext credential in bubble or DB
  2. Attach a clean text file → content appears unmodified in the chat context
  3. Attach a file where PII appears near the 8000-byte mark → PII is redacted on the full content before truncation, so no partial value survives at the cut
  4. Type "My password is abc123!!" → message sends; bubble shows "My [Password] is [Password]"
  5. On LogUpload page, upload a file with a known IP/email → PII spans appear in the review UI
  6. Check audit log after a PII-containing message: was_pii_redacted: true, pii_types_redacted populated
  7. Check audit log after a clean message: was_pii_redacted: false, pii_types_redacted: []
  8. cargo test → 228/228 pass; npm run test:run → 103/103 pass; cargo fmt --check clean; npx tsc --noEmit clean