tftsr-devops_investigation/.gitea/workflows
Shaun Arman cf5bc83b75
fix(ci): add post-generation evidence verification to pr-review
qwen3-coder-next fabricates plausible-looking code in its Evidence
blocks instead of quoting from the actual files provided. This adds a
Python verification step that greps each fenced code block against the
real changed files and tags any finding whose evidence cannot be found
as UNVERIFIED.
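
The verification step described above might look roughly like the following. This is a minimal sketch, not the code from pr-review.yml: the function names (`is_grounded`, `tag_unverified`), the tag wording, and the choice to match whitespace-normalized lines are all assumptions made for illustration. It also tags each evidence block rather than a parsed finding, since the review's exact layout is not shown here.

```python
import re

# Build the fence marker programmatically so this example can itself sit in a fenced block.
FENCE = "`" * 3
FENCE_RE = re.compile(re.escape(FENCE) + r"[^\n]*\n(.*?)" + re.escape(FENCE), re.DOTALL)


def _normalize(line: str) -> str:
    """Collapse runs of whitespace so indentation differences do not cause misses."""
    return " ".join(line.split())


def is_grounded(block: str, file_texts: list[str]) -> bool:
    """Return True if every non-empty line of the block occurs in a single changed file."""
    lines = [_normalize(l) for l in block.splitlines() if l.strip()]
    if not lines:
        return False  # an empty evidence block proves nothing
    for text in file_texts:
        haystack = {_normalize(l) for l in text.splitlines()}
        if all(l in haystack for l in lines):
            return True
    return False


def tag_unverified(review: str, file_texts: list[str]) -> str:
    """Append an UNVERIFIED marker (hypothetical wording) after each unmatched block."""
    def _tag(match: re.Match) -> str:
        if is_grounded(match.group(1), file_texts):
            return match.group(0)
        return match.group(0) + "\n[UNVERIFIED] evidence not found in changed files"

    return FENCE_RE.sub(_tag, review)
```

Matching whole lines rather than raw substrings is a deliberate trade-off in this sketch: it tolerates re-indentation by the model but still rejects code that was paraphrased or invented.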

This is a safeguard, not a fix: the model is fundamentally unreliable
for grounded code review. The longer-term fix is to replace
qwen3-coder-next with a model that stays grounded in the provided
context (Claude Haiku, devstral, or deepseek-coder-v2 via the LiteLLM
proxy / vLLM at 172.0.1.42).
2026-05-31 14:41:47 -05:00
auto-tag.yml fix(ci): pass release_tag as job output; fix equal-version case; drop git-describe [skip ci] 2026-05-23 22:48:14 +00:00
build-images.yml fix(ci): replace docker:24-cli with alpine + docker-cli in build-images 2026-04-12 20:16:32 -05:00
pr-review.yml fix(ci): add post-generation evidence verification to pr-review 2026-05-31 14:41:47 -05:00
test.yml fix(ci): switch PR review from Ollama to liteLLM (qwen2.5-72b) 2026-04-19 18:41:54 -05:00