Remove frontend detectPiiCmd pre-scan loop — backend is sole redaction
authority; bubble update via response.user_message covers user feedback.
Detect PII on full file content before truncating. Previous order
(truncate to 8000 bytes then scan) could miss PII straddling the
boundary. Now: read full content, scan, redact, then truncate to
EMBED_LIMIT (8000 bytes) at a valid UTF-8 char boundary.
logFileIds IPC: pass undefined (not null) for empty array so Tauri
serialises it correctly to Rust Option::None.
Add MAX_TEXT_SCAN_BYTES (32 KB) guard in scan_text_for_pii to prevent
unbounded regex evaluation on oversized payloads.
Fix clippy uninlined_format_args in ai.rs.
Addresses three findings from the third automated review:
[BLOCKER] No frontend PII pre-check on attachments.
Added detectPiiCmd call for each logFileId before chatMessageCmd.
PII is not blocked (per explicit product decision: auto-redact and
send) but the user now sees a non-blocking amber notice listing
each file and the PII types that will be auto-redacted. Backend
remains the authoritative redaction layer.
[WARNING 2] Chat bubble showed original PII-laden message even though
only the redacted form was sent to AI.
Added updateMessageContent to sessionStore. After chatMessageCmd
returns, if response.user_message is set the user bubble is updated
to reflect what was actually stored in the DB, so the UI is
consistent with the audit log.
CI fix: cargo fmt changes to analysis.rs were not staged in the prior
commit. Committed here — fmt check now passes cleanly.
File attachments were embedded into AI messages without any PII
scanning, allowing credentials, tokens, and other sensitive data
to be forwarded to AI providers in plaintext.
Typed chat messages had the same gap: a user could type a password
or API key directly and it would be sent unscanned.
Changes:
- chat_message (Rust): defence-in-depth scan of all attachment body
content (between --- Attached: markers); hard rejects if PII found
- detect_pii (Rust): fix return type from pii::PiiDetectionResult
(spans/original_text) to db::models::PiiDetectionResult
(detections/total_pii_found) to match the TypeScript contract; the
LogUpload PII review workflow was receiving undefined for detections
- scan_text_for_pii (Rust): new command — scans arbitrary text for PII
without creating DB records; used for typed message warnings
- Triage/index.tsx: PendingFile now carries logFileId; handleSend gates
each text attachment through detectPiiCmd (hard block on PII found);
typed message text scanned via scanTextForPiiCmd with a one-time
warning — second send of same message proceeds as acknowledgment
- compress_text now returns Result<Vec<u8>, String>; callers propagate
the error instead of silently storing empty BLOB on gzip failure
- upload_image_attachment_by_content and upload_paste_image now validate
decoded byte length against MAX_IMAGE_FILE_BYTES before DB storage,
closing the size-bypass gap that existed for base64 content uploads
- Image View modal in AttachmentsTab now surfaces the error string when
get_image_attachment_data fails, replacing the opaque "could not be
loaded" message with actionable diagnostic text
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Store compressed log content and raw image bytes in SQLite so attachments
are self-contained regardless of source file availability on disk.
DB (migrations 020-022):
- log_files.content_compressed BLOB — gzip-compressed extracted text
- image_attachments.image_data BLOB — raw image bytes
- Views v_log_files_with_issue and v_image_attachments_with_issue for
cross-incident queries with joined issue title
Rust backend:
- compress_text / decompress_text helpers (flate2 rust_backend / miniz_oxide)
with 100 MB decompression-bomb guard
- upload_log_file*, upload_log_file_by_content store content_compressed
- upload_image_attachment*, upload_paste_image store image_data
- New commands: get_log_file_content, list_all_log_files (analysis.rs)
- New commands: get_image_attachment_data, list_all_image_attachments (image.rs)
- All commands fall back to file_path for pre-migration records
Frontend:
- LogFileSummary, ImageAttachmentSummary types in tauriCommands.ts
- attachmentStore (Zustand) — loadAttachments, searchAttachments
- History page: Issues tab (existing) + Attachments tab (new)
with log/image tables, search bar, View modals, lazy thumbnails
Tests: 227 Rust (+16 new), 103 frontend (+9 new), tsc clean, clippy clean
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Only a single hardcoded entry (word/document.xml) is ever accessed from
the ZIP archive; no arbitrary path extraction occurs, so path traversal
attacks cannot apply. Add a comment to make this invariant explicit for
future maintainers.
- Add extension allowlist (SAFE_TEXT_EXTENSIONS + SAFE_BINARY_EXTENSIONS)
rejecting unsupported file types at both upload_log_file and
upload_log_file_by_content entry points
- Add extract_text_content() with PDF text extraction via lopdf and
DOCX extraction via zip+quick-xml
- Binary files (PDF/DOCX) get extracted text written to .extracted.txt
for downstream PII detection
- Expand frontend file input accept list and add collapsible
supported-formats disclosure element
- Add 11 unit tests covering allowlist logic and extraction paths
- Add use_datastore_upload field to ProviderConfig for enabling datastore uploads
- Add upload_file_to_datastore and upload_file_to_datastore_any commands
- Add upload_log_file_by_content and upload_image_attachment_by_content commands for drag-and-drop without file paths
- Add multipart/form-data support for file uploads to GenAI datastore
- Add support for image/bmp MIME type in image validation
- Add x-generic-api-key header support for GenAI API authentication
This addresses:
- Paste fails to attach screenshot (clipboard)
- File upload fails with 500 error when using GenAI API
- GenAI datastore upload endpoint support for non-text files
Rust's `regex` crate does not support lookaround assertions. The hostname
pattern `(?=.{1,253}\b)` caused a panic on every `PiiDetector::new()` call,
failing all four PII detector tests in CI (rust-fmt-check, rust-clippy,
rust-tests). Removed the lookahead; the remaining pattern correctly matches
valid FQDNs without the RFC 1035 length pre-check.
Also reformatted analysis.rs:253 to satisfy `rustfmt` (line break after `=`).
All 127 Rust tests pass and `cargo fmt --check` and `cargo clippy -- -D
warnings` are clean.
Remove high-risk defaults and tighten data handling across auth, storage, IPC, provider calls, and capabilities so sensitive data is better protected by default. Also update README/wiki security guidance and add targeted tests for the new hardening behaviors.
Made-with: Cursor