tftsr-devops_investigation

Shaun Arman cf5bc83b75		2026-05-31 14:41:47 -05:00
Checks (pull_request): some checks failed. Test / rust-fmt-check, Test / frontend-typecheck, and Test / frontend-tests succeeded in 1m42s each; Test / rust-clippy succeeded in 3m16s; Test / rust-tests succeeded in 4m54s; PR Review Automation / review failed after 4m33s.

fix(ci): add post-generation evidence verification to pr-review

qwen3-coder-next fabricates plausible-looking code in its Evidence blocks instead of quoting from the actual files provided. This adds a Python verification step that greps each fenced code block against the real changed files and tags any finding whose evidence cannot be found as UNVERIFIED.

This is a safeguard, not a fix: the model is fundamentally unreliable for grounded code review. The longer-term fix is to replace qwen3-coder with a model that stays grounded to context (Claude Haiku, devstral, or deepseek-coder-v2 via the LiteLLM proxy / vLLM at 172.0.1.42).
auto-tag.yml	fix(ci): pass release_tag as job output; fix equal-version case; drop git-describe [skip ci]	2026-05-23 22:48:14 +00:00
build-images.yml	fix(ci): replace docker:24-cli with alpine + docker-cli in build-images	2026-04-12 20:16:32 -05:00
pr-review.yml	fix(ci): add post-generation evidence verification to pr-review	2026-05-31 14:41:47 -05:00
test.yml	fix(ci): switch PR review from Ollama to liteLLM (qwen2.5-72b)	2026-04-19 18:41:54 -05:00
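
The evidence check described in the latest pr-review.yml commit could look roughly like the sketch below. This is not the script from the workflow, which is not shown here. The names `FENCE_RE`, `normalize`, `evidence_found`, and `tag_unverified` are hypothetical, as are the `[UNVERIFIED]` prefix format, the whitespace-insensitive matching, and the choice to treat a finding with no fenced block as unverified.

```python
import re

# Three backticks, built programmatically so this example can itself
# sit inside a fenced block without confusing simple parsers.
FENCE = "`" * 3

# Captures the body of each fenced block; the optional language tag
# on the opening fence line is skipped.
FENCE_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def normalize(text):
    """Collapse all whitespace runs so indentation differences are ignored."""
    return " ".join(text.split())


def evidence_found(block, changed_files):
    """Return True if the block appears in any changed file (hypothetical helper)."""
    needle = normalize(block)
    if not needle:
        return False
    return any(needle in normalize(body) for body in changed_files.values())


def tag_unverified(finding, changed_files):
    """Prefix a finding with [UNVERIFIED] unless every evidence block is found.

    Assumption: a finding with no fenced evidence block is also tagged.
    """
    blocks = FENCE_RE.findall(finding)
    if blocks and all(evidence_found(b, changed_files) for b in blocks):
        return finding
    return "[UNVERIFIED] " + finding


if __name__ == "__main__":
    # Illustrative data only; the path and contents are made up.
    files = {"src/main.rs": 'fn main() {\n    println!("hi");\n}\n'}
    quoted = "Finding A\n" + FENCE + 'rust\nprintln!("hi");\n' + FENCE + "\n"
    invented = "Finding B\n" + FENCE + "rust\nlet x = unsafe_call();\n" + FENCE + "\n"
    print(tag_unverified(quoted, files))
    print(tag_unverified(invented, files))
```

Substring matching of this kind only confirms that the quoted text exists somewhere in the changed files. It does not confirm that the finding's reasoning about that text is correct, which fits the commit's description of the step as a safeguard rather than a fix.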