fix/proxmox-reconnect-v1.2.3 #125

Merged
sarman merged 6 commits from fix/proxmox-reconnect-v1.2.3 into beta 2026-06-21 01:33:26 +00:00
Owner

Summary

  • Reconnect failure after app restart — authenticate() was deserializing directly into AuthResponse but the Proxmox API wraps every response in {"data": {...}}. The deserialiser always failed silently, so the in-memory
    session pool was never populated on app open. Added ProxmoxEnvelope to strip the wrapper correctly. Also fixed: CSRF token (CSRFPreventionToken) was never stored or sent, so all mutating operations (VM start/stop,
    firewall edits) would have failed with a CSRF check error; add_proxmox_cluster was inserting an unauthenticated stub into the pool and immediately reporting "connected"; disconnect_proxmox_cluster had no backend
    command so the session was never actually dropped; reqwest was rejecting self-signed Proxmox certificates.
  • VM list always empty — list_vms was issuing a POST to cluster/resources, which is a GET-only endpoint. Switched to GET cluster/resources?type=vm. The same function also had the data envelope double-unwrap (see
    below), and stopped VMs were being silently dropped because cpu was absent from the response for halted guests — fixed to unwrap_or(0.0).
  • Double-unwrap of Proxmox data envelope across all modules — handle_response() in client.rs already strips the {"data": ...} wrapper before returning to callers, but every proxmox module (vm, ceph, ceph_cluster,
    certificates, acme, firewall, sdn, ha, apt, updates, updates_ext, tasks, migration, metrics, shell, auth_realm, views, backup) was calling .get("data") a second time. This always returned None, causing 19 functions
    across 18 files to silently return empty arrays or "Invalid response format" errors regardless of what the API actually returned.

Test plan

  • Add a new Proxmox host with correct credentials → "connected" status appears only after a real ticket is obtained
  • Add a host with wrong credentials → immediate auth error, host not saved
  • Restart the app → previously-added hosts show "disconnected"; clicking Connect re-authenticates successfully
  • Click Disconnect → status changes to "disconnected"; subsequent API calls fail until Connect is clicked again
  • Navigate to a connected host's VM list → VMs appear (including stopped VMs)
  • Verify a Proxmox host with a self-signed certificate connects without TLS errors
  • Trigger a VM start/stop → verify CSRF token flow doesn't produce a 401
Summary - Reconnect failure after app restart — authenticate() was deserializing directly into AuthResponse but the Proxmox API wraps every response in {"data": {...}}. The deserialiser always failed silently, so the in-memory session pool was never populated on app open. Added ProxmoxEnvelope<T> to strip the wrapper correctly. Also fixed: CSRF token (CSRFPreventionToken) was never stored or sent, so all mutating operations (VM start/stop, firewall edits) would have failed with a CSRF check error; add_proxmox_cluster was inserting an unauthenticated stub into the pool and immediately reporting "connected"; disconnect_proxmox_cluster had no backend command so the session was never actually dropped; reqwest was rejecting self-signed Proxmox certificates. - VM list always empty — list_vms was issuing a POST to cluster/resources, which is a GET-only endpoint. Switched to GET cluster/resources?type=vm. The same function also had the data envelope double-unwrap (see below), and stopped VMs were being silently dropped because cpu was absent from the response for halted guests — fixed to unwrap_or(0.0). - Double-unwrap of Proxmox data envelope across all modules — handle_response() in client.rs already strips the {"data": ...} wrapper before returning to callers, but every proxmox module (vm, ceph, ceph_cluster, certificates, acme, firewall, sdn, ha, apt, updates, updates_ext, tasks, migration, metrics, shell, auth_realm, views, backup) was calling .get("data") a second time. This always returned None, causing 19 functions across 18 files to silently return empty arrays or "Invalid response format" errors regardless of what the API actually returned. Test plan - [ ] Add a new Proxmox host with correct credentials → "connected" status appears only after a real ticket is obtained - [ ] Add a host with wrong credentials → immediate auth error, host not saved - [ ] Restart the app → previously-added hosts show "disconnected"; clicking Connect re-authenticates successfully - [ ] Click Disconnect → status changes to "disconnected"; subsequent API calls fail until Connect is clicked again - [ ] Navigate to a connected host's VM list → VMs appear (including stopped VMs) - [ ] Verify a Proxmox host with a self-signed certificate connects without TLS errors - [ ] Trigger a VM start/stop → verify CSRF token flow doesn't produce a 401
sarman added 3 commits 2026-06-21 00:31:37 +00:00
Root cause: authenticate() tried to deserialize the Proxmox API response
directly into AuthResponse, but Proxmox wraps every response in
{"data": {...}}.  This caused every reconnect attempt after app restart
to fail silently.

Additional fixes bundled in this commit:
- add_proxmox_cluster now authenticates immediately so the in-memory pool
  always contains a live, ticketed client (not a bare unauthenticated stub)
- ProxmoxClient stores the CSRFPreventionToken and includes it in the
  CSRFPreventionToken header on POST/PUT/DELETE requests (Proxmox requires
  this for all mutating calls)
- accept-invalid-certs enabled on the reqwest Client so self-signed PVE
  certificates do not block connections
- Removed double-unwrap of the data field in 10 commands (list_acls,
  list_users, get_cluster_notes, search_proxmox_resources, get_node_status,
  get_syslog, list_network_interfaces, get_subscription_status,
  list_cluster_tasks, list_proxmox_containers) — handle_response already
  strips the envelope before returning to callers
- Added connect_proxmox_cluster and disconnect_proxmox_cluster Tauri
  commands so the UI can explicitly connect/disconnect sessions
- Wired RemotesPage Connect/Disconnect buttons to the real backend commands
- Updated and added tests covering envelope parsing, CSRF header logic,
  already-unwrapped response handling, and the new connect/disconnect paths
handle_response() in client.rs already strips the {"data":...} wrapper
before returning to callers. Every proxmox module was calling .get("data")
a second time on the already-unwrapped Value, which always returned None
and caused all API responses to silently yield empty results or errors.

vm.rs had an additional bug: list_vms used POST on cluster/resources (a
GET-only endpoint) and dropped VMs with no cpu field via filter_map ?
instead of unwrap_or(0.0). Both corrected.

Affected modules: vm, ceph, ceph_cluster, certificates, acme, firewall,
sdn, ha, apt, updates, updates_ext, tasks, migration, metrics, shell,
auth_realm, views, backup — 18 files, 19 functions.

426 Rust tests pass. clippy -D warnings clean. tsc --noEmit clean.
docs: update ticket to include VM listing and module-wide double-unwrap fixes
Some checks failed
Test / frontend-tests (pull_request) Successful in 2m0s
Test / frontend-typecheck (pull_request) Successful in 2m13s
PR Review Automation / review (pull_request) Has been cancelled
Test / rust-clippy (pull_request) Has been cancelled
Test / rust-tests (pull_request) Has been cancelled
Test / rust-fmt-check (pull_request) Has been cancelled
afe9ac1a3a
sarman force-pushed fix/proxmox-reconnect-v1.2.3 from afe9ac1a3a to 4d066e47fd 2026-06-21 00:38:57 +00:00 Compare
sarman reviewed 2026-06-21 00:47:48 +00:00
sarman left a comment
Author
Owner

Automated PR Review could not be completed - LLM analysis failed or produced no output.

Automated PR Review could not be completed - LLM analysis failed or produced no output.
sarman added 1 commit 2026-06-21 00:49:50 +00:00
ci(pr-review): increase LLM timeout to 600s, retry-max-time to 1200s
Some checks failed
PR Review Automation / review (pull_request) Has been cancelled
Test / rust-fmt-check (pull_request) Has been cancelled
Test / rust-clippy (pull_request) Has been cancelled
Test / frontend-typecheck (pull_request) Has been cancelled
Test / frontend-tests (pull_request) Has been cancelled
Test / rust-tests (pull_request) Has been cancelled
9b94e99e4d
qwen3.5-122b-think on a 253KB prompt was consistently hitting the 300s
--max-time wall. --retry-max-time also equalled --max-time, so retries
never fired. Now: 10 min per attempt, 2 retries, 20 min total budget.
sarman added 1 commit 2026-06-21 00:54:08 +00:00
ci(pr-review): iterative batch review for large diffs
Some checks failed
Test / frontend-tests (pull_request) Successful in 1m41s
Test / frontend-typecheck (pull_request) Successful in 1m58s
PR Review Automation / review (pull_request) Successful in 10m22s
Test / rust-fmt-check (pull_request) Failing after 12m51s
Test / rust-clippy (pull_request) Successful in 14m40s
Test / rust-tests (pull_request) Successful in 16m29s
c8399bcdeb
Replaces the single monolithic LLM call (which timed out on PRs >~150KB)
with a batching strategy:

- Changed files are grouped into batches capped at 2500 lines each,
  always respecting file boundaries (no file is split across batches)
- Each batch is sent as an independent LLM call (max-time 600s)
- Verdicts are aggregated: harshest verdict across all batches wins
- Single-batch PRs produce identical output to the old workflow
- Multi-batch PRs get a combined review with per-batch sections and
  an Overall Verdict line at the end
- Partial failures are tolerated: if some batches succeed, the review
  is posted with a note about any failed batches
- Cleanup removes /tmp/pr_batch_*.txt alongside existing temp files
sarman reviewed 2026-06-21 01:05:35 +00:00
sarman left a comment
Author
Owner

Automated PR Review (qwen3.5-122b-think via liteLLM):\n\n_This PR was reviewed in 4 batches (0 batch(es) failed)._

Batch 1 of 4

CODE REVIEW SUMMARY

This batch contains the CI/CD workflow changes and the core Proxmox command implementations. The changes appear well-structured and address the authentication, connection state, and response parsing issues described in the PR body.

Summary
The Batch 1 changes implement critical infrastructure for the Proxmox reconnect feature. The workflow batching logic is sound for handling large PRs, and the Proxmox commands properly handle authentication before storing clusters, with correct double-unwrap fixes applied. No critical bugs detected in this batch.

Findings

  • No findings. All examined code follows correct patterns for async Rust, proper error handling, and Tauri command registration. The workflow batching respects file boundaries and properly aggregates verdicts across batches.

  • Minor observation (SUGGESTION): The workflow file uses jq -cn with --rawfile which is a relatively new feature. Consider adding a fallback or version check if this workflow needs to run on older Gitea runners that may not support this syntax.

  • Minor observation (SUGGESTION): The connect_proxmox_cluster command returns Result<bool, String> but always returns Ok(true) on success. Consider returning Result<(), String> since the boolean doesn't add information beyond the Result itself.

Verdict: APPROVE WITH COMMENTS


Notes for subsequent batches:

  • Batch 2-4 should contain client.rs, vm.rs, and the other proxmox module files where the ProxmoxEnvelope<T>, CSRF token handling, and certificate configuration changes are implemented. These files are critical to verify the complete fix.
  • The CLI module (cli/mod.rs) references client.authenticate() which should now take &mut self based on the PR description - verify this signature change is consistent across all callers.

Batch 2 of 4

CODE REVIEW - BATCH 2 OF 4

Summary

This batch contains 7 Proxmox module files (acme, apt, auth_realm, backup, ceph, ceph_cluster, certificates) where the double-unwrap fix for the Proxmox API response envelope has been partially applied. The ceph.rs diff correctly shows the transition from response.get("data").and_then(|d| d.as_array()) to response.as_array(), confirming the client's handle_response() already strips the wrapper. However, several error messages still reference "missing 'data' field" which is now misleading since the wrapper is stripped at the client layer.

Findings

  • [WARNING] src-tauri/src/proxmox/acme.rs:72 - Error message references non-existent "data" field after client strips wrapper
    Evidence: Err("Invalid response format: missing 'data' field".to_string())
    Fix: Change to Err("Invalid response format".to_string())

  • [WARNING] src-tauri/src/proxmox/acme.rs:197 - Same issue in get_acme_challenges
    Evidence: Err("Invalid response format: missing 'data' field".to_string())
    Fix: Change to Err("Invalid response format".to_string())

  • [WARNING] src-tauri/src/proxmox/backup.rs:300 - Error message inconsistent with actual response structure
    Evidence: Err("Invalid response format: missing 'data' field".to_string())
    Fix: Change to Err("Invalid response format".to_string())

  • [WARNING] src-tauri/src/proxmox/certificates.rs:379 - Same issue in list_certificates
    Evidence: Err("Invalid response format: missing 'data' field".to_string())
    Fix: Change to Err("Invalid response format".to_string())

  • [WARNING] src-tauri/src/proxmox/certificates.rs:458 - Same issue in list_node_certificates
    Evidence: Err("Invalid response format: missing 'data' field".to_string())
    Fix: Change to Err("Invalid response format".to_string())

Notes on Consistency

  • The ceph.rs diff correctly shows the fix applied (all response.get("data") calls removed)
  • Error messages should be updated to match the new response structure where the wrapper is already stripped
  • Several modules in this batch (apt.rs, auth_realm.rs, ceph_cluster.rs) already have correct error messages without "data" field references

Verdict: APPROVE WITH COMMENTS

The functional fix is correct (response structure handling is now consistent with the client stripping the wrapper). The only issues are misleading error messages that reference a "data" field that no longer exists in the response. These should be updated for clarity but won't cause functional failures.


Batch 3 of 4

Summary

I reviewed all 10 proxmox module files in this batch. The code implements proper Proxmox API envelope handling with ProxmoxEnvelope<T> in client.rs, correct CSRF token handling for mutating operations, and consistent error handling patterns across all modules. No compilation errors, security vulnerabilities, or logic bugs were detected.

Findings

No findings.

Verdict: APPROVE


Batch 4 of 4

Summary

This batch (BATCH 4 of 4) contains the Proxmox VM management module, dashboard views module, TypeScript client updates, and the Remotes page UI. The changes address the core issues described in the PR: fixing the data envelope handling, correcting the VM listing endpoint, and adding proper connect/disconnect functionality. No critical bugs or security issues were found in this batch.

Findings

  • [SUGGESTION] src/pages/Proxmox/RemotesPage.tsx:29 - Status field type narrowing could be more explicit
    Evidence: status: (c.connected ? 'connected' : 'disconnected') as RemoteInfo['status']
    Fix: Consider adding explicit type annotation RemoteInfo['status'] or use a type guard to ensure the cast is safe. The current code works but the as cast bypasses TypeScript's type checking for this field.

  • [SUGGESTION] src-tauri/src/proxmox/views.rs:68 - Error message could include more context
    Evidence: Ok(vec![]) when response is not an array
    Fix: Consider logging or returning a more descriptive error if the API response format is unexpected, rather than silently returning an empty array. This would help with debugging API compatibility issues.

Verdict: APPROVE


Detailed Analysis

  1. src-tauri/src/proxmox/views.rs: The code correctly handles the Proxmox API response without double-unwrapping the data envelope. The list_views, add_view, update_view, delete_view, and get_view functions all properly use the ticket parameter and handle JSON parsing correctly.

  2. src-tauri/src/proxmox/vm.rs: The list_vms function correctly uses GET on cluster/resources?type=vm and properly handles the cpu field being absent for stopped VMs with unwrap_or(0.0). This addresses the issues described in the PR.

  3. src/lib/proxmoxClient.ts: The new connectProxmoxCluster and disconnectProxmoxCluster functions are properly typed and invoke the corresponding Tauri commands with correct parameter types.

  4. src/pages/Proxmox/RemotesPage.tsx: The component properly integrates the new connect/disconnect functionality with appropriate error handling and toast notifications. The status updates in the UI state are consistent with the backend operations.

The changes in this batch are consistent with the PR description and do not introduce any new issues. The code quality is appropriate for production use.



Overall Verdict: APPROVE WITH COMMENTS

Overall verdict reflects the most critical finding across all batches.

Automated PR Review (qwen3.5-122b-think via liteLLM):\n\n_This PR was reviewed in 4 batches (0 batch(es) failed)._ ## Batch 1 of 4 ## CODE REVIEW SUMMARY This batch contains the CI/CD workflow changes and the core Proxmox command implementations. The changes appear well-structured and address the authentication, connection state, and response parsing issues described in the PR body. **Summary** The Batch 1 changes implement critical infrastructure for the Proxmox reconnect feature. The workflow batching logic is sound for handling large PRs, and the Proxmox commands properly handle authentication before storing clusters, with correct double-unwrap fixes applied. No critical bugs detected in this batch. **Findings** - **No findings.** All examined code follows correct patterns for async Rust, proper error handling, and Tauri command registration. The workflow batching respects file boundaries and properly aggregates verdicts across batches. - **Minor observation (SUGGESTION)**: The workflow file uses `jq -cn` with `--rawfile` which is a relatively new feature. Consider adding a fallback or version check if this workflow needs to run on older Gitea runners that may not support this syntax. - **Minor observation (SUGGESTION)**: The `connect_proxmox_cluster` command returns `Result<bool, String>` but always returns `Ok(true)` on success. Consider returning `Result<(), String>` since the boolean doesn't add information beyond the Result itself. **Verdict**: APPROVE WITH COMMENTS --- **Notes for subsequent batches:** - Batch 2-4 should contain `client.rs`, `vm.rs`, and the other proxmox module files where the `ProxmoxEnvelope<T>`, CSRF token handling, and certificate configuration changes are implemented. These files are critical to verify the complete fix. - The CLI module (`cli/mod.rs`) references `client.authenticate()` which should now take `&mut self` based on the PR description - verify this signature change is consistent across all callers. --- ## Batch 2 of 4 ## CODE REVIEW - BATCH 2 OF 4 **Summary** This batch contains 7 Proxmox module files (acme, apt, auth_realm, backup, ceph, ceph_cluster, certificates) where the double-unwrap fix for the Proxmox API response envelope has been partially applied. The ceph.rs diff correctly shows the transition from `response.get("data").and_then(|d| d.as_array())` to `response.as_array()`, confirming the client's `handle_response()` already strips the wrapper. However, several error messages still reference "missing 'data' field" which is now misleading since the wrapper is stripped at the client layer. **Findings** - [WARNING] src-tauri/src/proxmox/acme.rs:72 - Error message references non-existent "data" field after client strips wrapper Evidence: `Err("Invalid response format: missing 'data' field".to_string())` Fix: Change to `Err("Invalid response format".to_string())` - [WARNING] src-tauri/src/proxmox/acme.rs:197 - Same issue in `get_acme_challenges` Evidence: `Err("Invalid response format: missing 'data' field".to_string())` Fix: Change to `Err("Invalid response format".to_string())` - [WARNING] src-tauri/src/proxmox/backup.rs:300 - Error message inconsistent with actual response structure Evidence: `Err("Invalid response format: missing 'data' field".to_string())` Fix: Change to `Err("Invalid response format".to_string())` - [WARNING] src-tauri/src/proxmox/certificates.rs:379 - Same issue in `list_certificates` Evidence: `Err("Invalid response format: missing 'data' field".to_string())` Fix: Change to `Err("Invalid response format".to_string())` - [WARNING] src-tauri/src/proxmox/certificates.rs:458 - Same issue in `list_node_certificates` Evidence: `Err("Invalid response format: missing 'data' field".to_string())` Fix: Change to `Err("Invalid response format".to_string())` **Notes on Consistency** - The ceph.rs diff correctly shows the fix applied (all `response.get("data")` calls removed) - Error messages should be updated to match the new response structure where the wrapper is already stripped - Several modules in this batch (apt.rs, auth_realm.rs, ceph_cluster.rs) already have correct error messages without "data" field references **Verdict**: APPROVE WITH COMMENTS The functional fix is correct (response structure handling is now consistent with the client stripping the wrapper). The only issues are misleading error messages that reference a "data" field that no longer exists in the response. These should be updated for clarity but won't cause functional failures. --- ## Batch 3 of 4 **Summary** I reviewed all 10 proxmox module files in this batch. The code implements proper Proxmox API envelope handling with `ProxmoxEnvelope<T>` in `client.rs`, correct CSRF token handling for mutating operations, and consistent error handling patterns across all modules. No compilation errors, security vulnerabilities, or logic bugs were detected. **Findings** No findings. **Verdict**: APPROVE --- ## Batch 4 of 4 **Summary** This batch (BATCH 4 of 4) contains the Proxmox VM management module, dashboard views module, TypeScript client updates, and the Remotes page UI. The changes address the core issues described in the PR: fixing the data envelope handling, correcting the VM listing endpoint, and adding proper connect/disconnect functionality. No critical bugs or security issues were found in this batch. **Findings** - [SUGGESTION] src/pages/Proxmox/RemotesPage.tsx:29 - Status field type narrowing could be more explicit Evidence: `status: (c.connected ? 'connected' : 'disconnected') as RemoteInfo['status']` Fix: Consider adding explicit type annotation `RemoteInfo['status']` or use a type guard to ensure the cast is safe. The current code works but the `as` cast bypasses TypeScript's type checking for this field. - [SUGGESTION] src-tauri/src/proxmox/views.rs:68 - Error message could include more context Evidence: `Ok(vec![])` when response is not an array Fix: Consider logging or returning a more descriptive error if the API response format is unexpected, rather than silently returning an empty array. This would help with debugging API compatibility issues. **Verdict**: APPROVE --- **Detailed Analysis** 1. **src-tauri/src/proxmox/views.rs**: The code correctly handles the Proxmox API response without double-unwrapping the data envelope. The `list_views`, `add_view`, `update_view`, `delete_view`, and `get_view` functions all properly use the ticket parameter and handle JSON parsing correctly. 2. **src-tauri/src/proxmox/vm.rs**: The `list_vms` function correctly uses GET on `cluster/resources?type=vm` and properly handles the `cpu` field being absent for stopped VMs with `unwrap_or(0.0)`. This addresses the issues described in the PR. 3. **src/lib/proxmoxClient.ts**: The new `connectProxmoxCluster` and `disconnectProxmoxCluster` functions are properly typed and invoke the corresponding Tauri commands with correct parameter types. 4. **src/pages/Proxmox/RemotesPage.tsx**: The component properly integrates the new connect/disconnect functionality with appropriate error handling and toast notifications. The status updates in the UI state are consistent with the backend operations. The changes in this batch are consistent with the PR description and do not introduce any new issues. The code quality is appropriate for production use. --- --- **Overall Verdict**: APPROVE WITH COMMENTS _Overall verdict reflects the most critical finding across all batches._
sarman added 1 commit 2026-06-21 01:14:01 +00:00
style: apply rustfmt to client.rs and vm.rs
Some checks failed
PR Review Automation / review (pull_request) Has been cancelled
Test / frontend-tests (pull_request) Successful in 1m41s
Test / frontend-typecheck (pull_request) Successful in 1m53s
Test / rust-fmt-check (pull_request) Successful in 12m3s
Test / rust-clippy (pull_request) Successful in 13m49s
Test / rust-tests (pull_request) Successful in 15m45s
c1885ffc2b
sarman merged commit 3edb00dfb0 into beta 2026-06-21 01:33:26 +00:00
sarman deleted branch fix/proxmox-reconnect-v1.2.3 2026-06-21 01:33:26 +00:00
Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sarman/tftsr-devops_investigation#125
No description provided.