|
Some checks failed
Test / rust-fmt-check (pull_request) Successful in 1m42s
Test / frontend-typecheck (pull_request) Successful in 1m42s
Test / frontend-tests (pull_request) Successful in 1m42s
Test / rust-clippy (pull_request) Successful in 3m16s
PR Review Automation / review (pull_request) Failing after 4m33s
Test / rust-tests (pull_request) Successful in 4m54s
qwen3-coder-next fabricates plausible-looking code in its Evidence blocks instead of quoting from the actual files provided. This adds a Python verification step that searches the real changed files for each fenced code block and tags any finding whose evidence cannot be found as UNVERIFIED. This is a safeguard, not a fix: the model is fundamentally unreliable for grounded code review. The longer-term fix is to replace qwen3-coder with a model that stays grounded in its context (Claude Haiku, devstral, or deepseek-coder-v2 via the LiteLLM proxy / vLLM at 172.0.1.42). |
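
The verification step described above could look roughly like the following. This is a minimal sketch, not the script in this PR: the function names (`normalize`, `evidence_found`, `tag_findings`), the `UNVERIFIED: ` prefix format, and the choice to treat a finding with no fenced block at all as unverified are all assumptions made for illustration. It compares whitespace-normalized lines so that indentation differences in the model's quote do not cause false negatives.

```python
import re

# Build the fence marker programmatically so this snippet can itself
# live inside a fenced block.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def normalize(text):
    """Return non-blank lines with runs of whitespace collapsed."""
    return [" ".join(line.split()) for line in text.splitlines() if line.strip()]


def evidence_found(block, file_texts):
    """True if the block's lines appear contiguously in any changed file."""
    needle = normalize(block)
    if not needle:
        return False
    n = len(needle)
    for text in file_texts:
        hay = normalize(text)
        for i in range(len(hay) - n + 1):
            if hay[i:i + n] == needle:
                return True
    return False


def tag_findings(findings, file_texts):
    """Prefix findings whose evidence blocks cannot be located (hypothetical format)."""
    tagged = []
    for finding in findings:
        blocks = FENCE_RE.findall(finding)
        # Assumption: a finding with no evidence block counts as unverified.
        ok = bool(blocks) and all(evidence_found(b, file_texts) for b in blocks)
        tagged.append(finding if ok else "UNVERIFIED: " + finding)
    return tagged
```

Here `findings` is a list of review-finding strings and `file_texts` is a list of the changed files' contents. Exact line matching is deliberately strict: a quote that the model has paraphrased or partially invented fails the check, which is the behavior the safeguard is after.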