fix(ci): fix secret scrubbing regex that was deleting legitimate code lines
All checks were successful
Test / rust-fmt-check (pull_request) Successful in 1m51s
Test / frontend-tests (pull_request) Successful in 1m51s
Test / frontend-typecheck (pull_request) Successful in 1m55s
Test / rust-clippy (pull_request) Successful in 3m11s
Test / rust-tests (pull_request) Successful in 4m27s
PR Review Automation / review (pull_request) Successful in 4m47s

The previous regex matched any line containing "password", "token", etc.
near certain punctuation characters. This silently removed function
signatures, variable declarations, and test assertions from the context
sent to the LLM — causing it to hallucinate 3 BLOCKERs per review:
- "function signature missing" (the `password: &str` param was scrubbed)
- "filter body empty" (the filter condition containing "password" was scrubbed)
- "password passed unencrypted" (the decrypt_token call line was scrubbed)

Fix: match actual credential VALUES only:
- Well-known token formats (AKIA..., ghp_..., xox...)
- keyword = "long_quoted_literal" (25+ chars, clearly a value not a name)
- Standalone base64 blob lines (60+ chars, PEM-style)

Never scrub a line just because it contains a credential-related word.
This commit is contained in:
Shaun Arman 2026-05-31 14:33:44 -05:00
parent 1de59db9f0
commit 6373f0b09c

View File

@ -58,9 +58,13 @@ jobs:
# Build context: full file content for each changed file.
# Files <= 500 lines: include complete content.
# Files > 500 lines: include the per-file diff with generous context (±50 lines).
# Secret scrubbing applied to both paths.
SECRET_PATTERN='^([[:space:]]*[+\-]?[[:space:]]*).*[pP]assword[[:space:]]*[=:"'"'"']|[tT]oken[[:space:]]*[=:"'"'"']|[aA][pP][iI][_][kK]ey[[:space:]]*[=:"'"'"']|AKIA[A-Z0-9]{16}|gh[opsu]_[A-Za-z0-9_]{36,}|Authorization:[[:space:]]'
B64_PATTERN='^[[:space:]]*[+\-]?[[:space:]]*[A-Za-z0-9+/]{40,}={0,2}([^A-Za-z0-9+/=]|$)'
#
# Secret scrubbing: match actual credential VALUES only — known API key formats,
# or keyword="long_quoted_literal" (25+ chars). Never scrub on keyword alone,
# which would silently delete function signatures, variable declarations, and tests.
SECRET_PATTERN='AKIA[A-Z0-9]{16}|gh[opsu]_[A-Za-z0-9_]{36,}|xox[baprs]-[0-9]{10,13}-[0-9]{10,13}-[a-zA-Z0-9]{24}|(password|token|api_key|secret)[[:space:]]*=[[:space:]]*["'"'"'][A-Za-z0-9+/_\-!@#]{25,}["'"'"']'
# Only strip lines that are ENTIRELY a long base64 blob (e.g. PEM cert bodies)
B64_PATTERN='^[[:space:]]*[A-Za-z0-9+/]{60,}={0,2}[[:space:]]*$'
> /tmp/pr_context.txt
while IFS= read -r file; do