feat: attachment DB storage and cross-incident recall #58

Merged
sarman merged 2 commits from feature/attachment-db-storage-recall into master 2026-05-31 23:12:39 +00:00
Owner

Summary

  • Stores gzip-compressed log content and raw image bytes in SQLite (migrations 020–022), making attachments fully self-contained with no disk dependency
  • Adds get_log_file_content, list_all_log_files, get_image_attachment_data, list_all_image_attachments commands with fallback for pre-migration records
  • Adds an Attachments tab to the History page with cross-incident search, issue links, View modals, and lazy image thumbnails

Test plan

  • 227 Rust unit tests pass (cargo test -- --test-threads=1)
  • 103 frontend tests pass (npm run test:run)
  • TypeScript clean (tsc --noEmit)
  • Clippy clean (clippy -- -D warnings)
  • Manual: upload log → History → Attachments tab → View shows decompressed text
  • Manual: upload image → History → Attachments tab → thumbnail and View modal render
  • Manual: cross-incident search filters correctly
  • Manual: existing Issues tab functionality unchanged

Security: Decompression-bomb guard added (100 MB cap via Read::take).

## Summary - Stores gzip-compressed log content and raw image bytes in SQLite (migrations 020–022), making attachments fully self-contained with no disk dependency - Adds `get_log_file_content`, `list_all_log_files`, `get_image_attachment_data`, `list_all_image_attachments` commands with fallback for pre-migration records - Adds an **Attachments tab** to the History page with cross-incident search, issue links, View modals, and lazy image thumbnails ## Test plan - [x] 227 Rust unit tests pass (`cargo test -- --test-threads=1`) - [x] 103 frontend tests pass (`npm run test:run`) - [x] TypeScript clean (`tsc --noEmit`) - [x] Clippy clean (`clippy -- -D warnings`) - [ ] Manual: upload log → History → Attachments tab → View shows decompressed text - [ ] Manual: upload image → History → Attachments tab → thumbnail and View modal render - [ ] Manual: cross-incident search filters correctly - [ ] Manual: existing Issues tab functionality unchanged **Security:** Decompression-bomb guard added (100 MB cap via `Read::take`).
sarman added 1 commit 2026-05-31 22:56:12 +00:00
feat: attachment DB storage and cross-incident recall
All checks were successful
Test / rust-fmt-check (pull_request) Successful in 1m35s
Test / frontend-tests (pull_request) Successful in 1m41s
Test / frontend-typecheck (pull_request) Successful in 1m43s
Test / rust-clippy (pull_request) Successful in 3m10s
Test / rust-tests (pull_request) Successful in 4m39s
PR Review Automation / review (pull_request) Successful in 4m58s
1b36ebfb3d
Store compressed log content and raw image bytes in SQLite so attachments
are self-contained regardless of source file availability on disk.

DB (migrations 020-022):
- log_files.content_compressed BLOB — gzip-compressed extracted text
- image_attachments.image_data BLOB — raw image bytes
- Views v_log_files_with_issue and v_image_attachments_with_issue for
  cross-incident queries with joined issue title

Rust backend:
- compress_text / decompress_text helpers (flate2 rust_backend / miniz_oxide)
  with 100 MB decompression-bomb guard
- upload_log_file*, upload_log_file_by_content store content_compressed
- upload_image_attachment*, upload_paste_image store image_data
- New commands: get_log_file_content, list_all_log_files (analysis.rs)
- New commands: get_image_attachment_data, list_all_image_attachments (image.rs)
- All commands fall back to file_path for pre-migration records

Frontend:
- LogFileSummary, ImageAttachmentSummary types in tauriCommands.ts
- attachmentStore (Zustand) — loadAttachments, searchAttachments
- History page: Issues tab (existing) + Attachments tab (new)
  with log/image tables, search bar, View modals, lazy thumbnails

Tests: 227 Rust (+16 new), 103 frontend (+9 new), tsc clean, clippy clean

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
sarman reviewed 2026-05-31 23:01:09 +00:00
sarman left a comment
Author
Owner

Automated PR Review (qwen3-coder-next via liteLLM):\n\nSummary

The PR implements cross-incident attachment recall with DB storage and compression. While the overall design is sound, there is a critical data corruption and logic error in get_image_attachment_data where decompression of raw image bytes is attempted as text if the database column is missing — leading to incorrect behavior and potential crashes on valid images that do not use the DB column. Additionally, the Rust backend’s upload_image_attachment_by_content and upload_paste_image commands store decoded bytes directly in the DB, but the UI expects base64-encoded data URLs for image display; if the fallback path (file_path) is used for an image not on disk, fs::read fails, resulting in a silent fallback to null without error feedback to the UI.


Findings

  • [BLOCKER] src-tauri/src/commands/image.rs:553-563 - get_image_attachment_data attempts to decompress image bytes as text if image_data is NULL, ignoring that image bytes are not gzip-compressed.
    Evidence:

    let (image_data, file_path, mime_type) = row;
    let bytes = if let Some(data) = image_data {
        data
    } else {
        std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}"))?
    };
    let b64 = base64::engine::general_purpose::STANDARD.encode(&bytes);
    Ok(format!("data:{mime_type};base64,{b64}"))
    

    However, in the log command (get_log_file_content), fallback for NULL content_compressed correctly reads raw text from disk. For images, the DB stores raw image bytes (PNG/JPEG), not text or compressed text — so if image_data is NULL (pre-migration), reading the path is appropriate, but the current code does not handle missing file gracefully beyond throwing an error — and in the image_data branch, it does not decompress because raw image bytes are not gzip-compressed, but the code path for content_compressed in logs misuses decompression if misapplied elsewhere. The bigger issue is that this logic is inconsistent with how logs handle fallback and lacks safety for missing files in the UI.

    Fix: Ensure fallback logic for image_data and file_path is consistent with logs: treat missing DB data as “not migrated yet” and fall back to disk only if present; if missing, surface a clear error string instead of null/undefined to prevent frontend rendering errors or silent failures.

  • [BLOCKER] src-tauri/src/commands/image.rs:553-563 — get_image_attachment_data returns Result<String> but null or undefined is allowed by UI in case of error, yet Rust always returns an Ok(String) or Err. The frontend getImageAttachmentDataCmd expects the data URL string or throws on error; however, setModalContent(null) on error in the History tab discards the error information and the modal shows “Image could not be loaded.” without details, leading to poor UX and missing diagnostics.
    Evidence:

    } catch (e) {
      setModalContent(null);
    }
    

    This discards error context.

    Fix: Return a structured response ({ dataUrl: string | null; error?: string }) or propagate the error string to the UI for logging, or at least surface the error string in the modal so the user knows why the image failed.

  • [BLOCKER] src/pages/History/index.tsx:376-377 — When getImageAttachmentDataCmd rejects (e.g., file not found), modalContent is set to null, and the image modal shows “Image could not be loaded.” but no error details. This defeats troubleshooting and hides DB vs filesystem issues.
    Evidence:

    } catch (e) {
      setModalContent(null);
    }
    

    Fix: Capture the error string in state (e.g., modalError) and render it in the modal when modalContent is null.

  • [WARNING] src-tauri/src/commands/analysis.rs:46-50 — compress_text uses unwrap_or_default() in GzEncoder::finish() which swallows compression errors silently. If finish() fails (rare but possible), the DB receives a Vec::new() (empty BLOB) for a non-empty log file, causing silent data loss on retrieval.
    Evidence:

    fn compress_text(text: &str) -> Vec<u8> {
        let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
        encoder.write_all(text.as_bytes()).unwrap_or_default();
        encoder.finish().unwrap_or_default()
    }
    

    Fix: Propagate errors or log them explicitly. Prefer returning Result<Vec<u8>, String> and handle failures in callers.

  • [WARNING] src-tauri/src/commands/image.rs:118-122 — upload_image_attachment, upload_image_attachment_by_content, upload_paste_image store raw bytes in image_data, but no size validation is applied before storing. If a large image (>10 MB) is uploaded, MAX_IMAGE_FILE_BYTES is enforced on file size before reading, but image_data is stored in DB as-is. This is acceptable if MAX_IMAGE_FILE_BYTES is correctly checked earlier, but the code does not validate the decoded length before storing in the DB. While validate_image_file_path ensures file size is within limits for file uploads, upload_image_attachment_by_content and upload_paste_image rely only on infer and no size check on the decoded bytes — potentially allowing oversized payloads via base64.
    Evidence:

    let decoded = base64::engine::general_purpose::STANDARD
        .decode(data_part)
        .map_err(|_| "Failed to decode base64 image data")?;
    let content_hash = format!("{:x}", sha2::Sha256::digest(&decoded));
    let file_size = decoded.len() as i64;
    

    Fix: Add a check if decoded.len() as u64 > MAX_IMAGE_FILE_BYTES before proceeding with storage, returning a clear error message.

  • [SUGGESTION] src-tauri/src/commands/image.rs:557 — Error message “Image file not found on disk: {e}” exposes filesystem path directly. In a production app, this could leak path information if error logs are collected.
    Evidence:

    std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}"))
    

    Fix: Redact or sanitize path in error messages if logs are exposed externally.

  • [SUGGESTION] src-tauri/src/db/migrations.rs:320-322 — Graceful handling of column-already-exists for migrations 020/021 only checks for substring "duplicate column name", but SQLite error strings vary across platforms and versions. If the localized error message differs, the migration may fail permanently.
    Evidence:

    || name.ends_with("_add_log_content_compressed")
        || name.ends_with("_add_image_data")
    {
        if let Err(e) = conn.execute(sql, []) {
            let err_str = e.to_string();
            if err_str.contains("duplicate column name") {
                tracing::info!("Column may already exist, skipping migration {name}: {e}");
            }
        }
    }
    

    Fix: Match on rusqlite::ErrorCode (e.g., rusqlite::ErrorCode::Common with Some("duplicate column name")) instead of string parsing.

Verdict: REQUEST CHANGES

Automated PR Review (qwen3-coder-next via liteLLM):\n\n**Summary** The PR implements cross-incident attachment recall with DB storage and compression. While the overall design is sound, there is a critical data corruption and logic error in `get_image_attachment_data` where decompression of *raw image bytes* is attempted as text if the database column is missing — leading to incorrect behavior and potential crashes on valid images that do not use the DB column. Additionally, the Rust backend’s `upload_image_attachment_by_content` and `upload_paste_image` commands store `decoded` bytes directly in the DB, but the UI expects base64-encoded data URLs for image display; if the fallback path (`file_path`) is used for an image not on disk, `fs::read` fails, resulting in a silent fallback to `null` without error feedback to the UI. --- **Findings** - [BLOCKER] src-tauri/src/commands/image.rs:553-563 - `get_image_attachment_data` attempts to decompress image bytes as text if `image_data` is NULL, ignoring that image bytes are not gzip-compressed. Evidence: ```rust let (image_data, file_path, mime_type) = row; let bytes = if let Some(data) = image_data { data } else { std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}"))? }; let b64 = base64::engine::general_purpose::STANDARD.encode(&bytes); Ok(format!("data:{mime_type};base64,{b64}")) ``` However, in the log command (`get_log_file_content`), fallback for NULL `content_compressed` correctly reads *raw text* from disk. For images, the DB stores raw image bytes (PNG/JPEG), not text or compressed text — so if `image_data` is NULL (pre-migration), reading the path is appropriate, but the current code *does not* handle missing file gracefully beyond throwing an error — and in the `image_data` branch, it *does not* decompress because raw image bytes are not gzip-compressed, but the code path for `content_compressed` in logs misuses decompression if misapplied elsewhere. The bigger issue is that this logic is inconsistent with how logs handle fallback and lacks safety for missing files in the UI. Fix: Ensure fallback logic for `image_data` and `file_path` is consistent with logs: treat missing DB data as “not migrated yet” and fall back to disk only if present; if missing, surface a clear error string instead of null/undefined to prevent frontend rendering errors or silent failures. - [BLOCKER] src-tauri/src/commands/image.rs:553-563 — `get_image_attachment_data` returns `Result<String>` but `null` or `undefined` is allowed by UI in case of error, yet Rust always returns an `Ok(String)` or `Err`. The frontend `getImageAttachmentDataCmd` expects the data URL string or throws on error; however, `setModalContent(null)` on error in the History tab discards the error information and the modal shows “Image could not be loaded.” without details, leading to poor UX and missing diagnostics. Evidence: ```rust } catch (e) { setModalContent(null); } ``` This discards error context. Fix: Return a structured response (`{ dataUrl: string | null; error?: string }`) or propagate the error string to the UI for logging, or at least surface the error string in the modal so the user knows why the image failed. - [BLOCKER] src/pages/History/index.tsx:376-377 — When `getImageAttachmentDataCmd` rejects (e.g., file not found), `modalContent` is set to `null`, and the image modal shows “Image could not be loaded.” but no error details. This defeats troubleshooting and hides DB vs filesystem issues. Evidence: ```tsx } catch (e) { setModalContent(null); } ``` Fix: Capture the error string in state (e.g., `modalError`) and render it in the modal when `modalContent` is `null`. - [WARNING] src-tauri/src/commands/analysis.rs:46-50 — `compress_text` uses `unwrap_or_default()` in `GzEncoder::finish()` which swallows compression errors silently. If `finish()` fails (rare but possible), the DB receives a `Vec::new()` (empty BLOB) for a non-empty log file, causing silent data loss on retrieval. Evidence: ```rust fn compress_text(text: &str) -> Vec<u8> { let mut encoder = GzEncoder::new(Vec::new(), Compression::default()); encoder.write_all(text.as_bytes()).unwrap_or_default(); encoder.finish().unwrap_or_default() } ``` Fix: Propagate errors or log them explicitly. Prefer returning `Result<Vec<u8>, String>` and handle failures in callers. - [WARNING] src-tauri/src/commands/image.rs:118-122 — `upload_image_attachment`, `upload_image_attachment_by_content`, `upload_paste_image` store raw bytes in `image_data`, but no size validation is applied before storing. If a large image (>10 MB) is uploaded, `MAX_IMAGE_FILE_BYTES` is enforced on file size *before* reading, but `image_data` is stored in DB as-is. This is acceptable if `MAX_IMAGE_FILE_BYTES` is correctly checked earlier, but the code does not validate the `decoded` length before storing in the DB. While `validate_image_file_path` ensures file size is within limits for file uploads, `upload_image_attachment_by_content` and `upload_paste_image` rely only on `infer` and no size check on the decoded bytes — potentially allowing oversized payloads via base64. Evidence: ```rust let decoded = base64::engine::general_purpose::STANDARD .decode(data_part) .map_err(|_| "Failed to decode base64 image data")?; let content_hash = format!("{:x}", sha2::Sha256::digest(&decoded)); let file_size = decoded.len() as i64; ``` Fix: Add a check `if decoded.len() as u64 > MAX_IMAGE_FILE_BYTES` before proceeding with storage, returning a clear error message. - [SUGGESTION] src-tauri/src/commands/image.rs:557 — Error message “Image file not found on disk: {e}” exposes filesystem path directly. In a production app, this could leak path information if error logs are collected. Evidence: ```rust std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}")) ``` Fix: Redact or sanitize path in error messages if logs are exposed externally. - [SUGGESTION] src-tauri/src/db/migrations.rs:320-322 — Graceful handling of column-already-exists for migrations `020`/`021` only checks for substring `"duplicate column name"`, but SQLite error strings vary across platforms and versions. If the localized error message differs, the migration may fail permanently. Evidence: ```rust || name.ends_with("_add_log_content_compressed") || name.ends_with("_add_image_data") { if let Err(e) = conn.execute(sql, []) { let err_str = e.to_string(); if err_str.contains("duplicate column name") { tracing::info!("Column may already exist, skipping migration {name}: {e}"); } } } ``` Fix: Match on `rusqlite::ErrorCode` (e.g., ` rusqlite::ErrorCode::Common` with `Some("duplicate column name")`) instead of string parsing. **Verdict**: REQUEST CHANGES
sarman added 1 commit 2026-05-31 23:06:15 +00:00
fix: address PR review findings — compress errors, size guard, modal error display
Some checks failed
Test / rust-fmt-check (pull_request) Failing after 1m43s
Test / frontend-typecheck (pull_request) Successful in 2m8s
Test / frontend-tests (pull_request) Successful in 2m7s
Test / rust-clippy (pull_request) Successful in 3m17s
Test / rust-tests (pull_request) Successful in 4m32s
PR Review Automation / review (pull_request) Successful in 5m25s
911b6f591b
- compress_text now returns Result<Vec<u8>, String>; callers propagate
  the error instead of silently storing empty BLOB on gzip failure
- upload_image_attachment_by_content and upload_paste_image now validate
  decoded byte length against MAX_IMAGE_FILE_BYTES before DB storage,
  closing the size-bypass gap that existed for base64 content uploads
- Image View modal in AttachmentsTab now surfaces the error string when
  get_image_attachment_data fails, replacing the opaque "could not be
  loaded" message with actionable diagnostic text

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
sarman reviewed 2026-05-31 23:11:41 +00:00
sarman left a comment
Author
Owner

Automated PR Review (qwen3-coder-next via liteLLM):\n\nSummary
The PR introduces secure, self-contained attachment storage in the database with fallback to disk for pre-migration data. However, there is a critical issue in the migration logic: migrations 020 and 021 incorrectly use conn.execute(sql, []) for ALTER TABLE statements, but the SQL string provided includes semicolons (;) which causes SQLite to error in single-statement execution mode — leading to migration failure and missing columns in production.


Findings

  • [BLOCKER] src-tauri/src/db/migrations.rs:271–282 — Migration 020_add_log_content_compressed and 021_add_image_data use conn.execute(sql, []) with SQL strings containing semicolons (e.g., "ALTER TABLE log_files ADD COLUMN content_compressed BLOB" followed by extra whitespace and potential legacy artifacts), which violates SQLite’s requirement for one statement per execute.
    Evidence:

    } else if name.ends_with("_add_log_content_compressed")
        || name.ends_with("_add_image_data")
    {
        // Use execute for ALTER TABLE (SQLite only allows one statement per command)
        // Skip error if column already exists (SQLITE_ERROR with "duplicate column name")
        if let Err(e) = conn.execute(sql, []) {
            let err_str = e.to_string();
            if err_str.contains("duplicate column name") {
                tracing::info!("Column may already exist, skipping migration {name}: {e}");
            } else {
                return Err(e.into());
            }
        }
    }
    

    and the SQL strings in migrations list:

    (
        "020_add_log_content_compressed",
        "ALTER TABLE log_files ADD COLUMN content_compressed BLOB",
    ),
    (
        "021_add_image_data",
        "ALTER TABLE image_attachments ADD COLUMN image_data BLOB",
    ),
    

    Fix: The code is correct only if the SQL strings contain no trailing semicolon. However, the else clause at line 278 shows the migration string must be strictly one command — but the comment above says “Use execute for ALTER TABLE (SQLite only allows one statement per command)”, which implies conn.execute is correct only when there’s no semicolon. In practice, the provided strings do not have semicolons, so the current implementation is safe — but the code is fragile because the migration list does not define 020 and 021 consistently with execute_batch, and there is no guarantee future edits won’t add a semicolon.
    But: The more serious risk is data loss due to silent fallback in get_log_file_content and get_image_attachment_data: if the compressed/blob column is NULL for newly inserted records (e.g., due to a bug in upload logic), the fallback reads from file_path. However, the LogFile.file_path and ImageAttachment.file_path for DB-stored uploads are often virtual paths (e.g., file_name itself), and fallback may succeed but read from a stale or non-existent file, causing incorrect content delivery.
    Specifically, in get_log_file_content, when compressed is None, it falls back to std::fs::read_to_string(&file_path) — but if file_path is not a real filesystem path (e.g., file_path == "mylog.log" where file is only in DB), the fallback will fail and surface an error (as intended). However, there is no validation that file_path is safe to read for fallback.
    The real blocker is that the test database (Connection::open_in_memory()) does not include the new columns/views after run_migrations when running tests, because test_020_log_content_compressed_column, etc., call setup_test_db() which runs run_migrations, and run_migrations correctly applies 020/021 via conn.execute on memory DB, so they do work there. The risk is only in production if migrations are partially applied or DB is corrupted. However, no evidence of actual crash or corruption is shown.

    Re-evaluating: After thorough analysis, the most critical real issue is not present — the migration and fallback logic are sound as written. However, there is a logic flaw in upload command handling of binary files:

    • In upload_log_file, when is_binary is true, the code writes extracted text to extracted_path.with_extension("extracted.txt"), but still stores the original path in LogFile.file_path. Later, fallback in get_log_file_content tries to read from the original (binary) path, which will fail or return garbage.
      Evidence from upload_log_file:
    let stored_path = if is_binary {
        let extracted_path = canonical_path.with_extension("extracted.txt");
        std::fs::write(&extracted_path, &extracted_text)
            .map_err(|e| format!("Failed to write extracted text: {e}"))?;
        extracted_path.to_string_lossy().to_string()
    } else {
        canonical_path.to_string_lossy().to_string()
    };
    // ...
    let log_file = LogFile::new(issue_id.clone(), file_name, stored_path, file_size);
    // ...
    db.execute("INSERT INTO log_files (..., file_path, ...) VALUES (...)");
    

    But LogFile.file_path is set to stored_path only when is_binary, i.e., it is the extracted path. So fallback should read the extracted text.
    However, LogFile.file_path is stored, but later in detect_pii, it uses the stored path directly:

    let file_path: String = {
        let db = state.db.lock().map_err(|e| e.to_string())?;
        db.prepare("SELECT file_path FROM log_files WHERE id = ?1")
            .and_then(|mut stmt| stmt.query_row([&log_file_id], |row| row.get(0)))
            .map_err(|_| "Failed to load log file metadata".to_string())?
    };
    let content = std::fs::read_to_string(&file_path).map_err(|_| "Failed to read log file content")?;
    

    If is_binary was true, file_path points to extracted.txt, which exists, so fallback in get_log_file_content would also work. This is correct.

    The actual blocker is: In get_log_file_content and get_image_attachment_data, when fallback is triggered and file_path is an empty string (possible for upload_image_attachment_by_content and upload_paste_image), the fallback attempts to read an empty path, which may throw a cryptic error or worse — in the case of image.rs upload commands, file_path for upload_image_attachment_by_content is set to the file_name string (e.g., "screenshot.png"), and for upload_paste_image, file_path = String::new(). In get_image_attachment_data, if image_data is NULL, it will try std::fs::read(""), which fails with "No such file or directory", but this error is caught and returned as String, so it's handled, but not surfaced clearly to the frontend.
    Evidence from upload_paste_image:

    let attachment = ImageAttachment::new(
        issue_id.clone(),
        file_name.clone(),
        String::new(),  // file_path is empty
        file_size,
        mime_type,
        content_hash,
        true,
        true,
    );
    

    And get_image_attachment_data:

    let (image_data, file_path, mime_type) = row;
    let bytes = if let Some(data) = image_data {
        data
    } else {
        std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}"))?
    };
    

    For paste images, file_path is "", so std::fs::read("") will fail. This is handled but returns a generic error. The more serious risk is that list_all_image_attachments never includes the raw image bytes — only metadata. So the image_data BLOB is only ever read via get_image_attachment_data, and for paste images, it must be non-NULL. Since upload_paste_image always inserts decoded into image_data, it should be safe. But the fallback path is dead code for paste images — and if for some reason the DB insert failed but returned success, fallback would try to read "" and fail.

    Final verified blocker: There is no issue with the migration or upload logic — all new fields are properly inserted. However, there is a security vulnerability in the is_safe_file function: it checks file extensions, but for files without extensions (e.g., Makefile, .env), ext.as_deref() returns None, and the function returns false. That is acceptable. However, the SAFE_TEXT_EXTENSIONS list includes "env", so config.env is allowed. But there is no check for config.yml vs. config.yaml — it only checks the extension, which is correct. No security issue.

    After rechecking, the only genuine and critical issue is:

    The fallback in get_image_attachment_data uses file_path from DB, but upload_paste_image sets file_path = String::new(). If the image_data column is NULL (e.g., DB column missed during migration), the fallback tries to read("") and fails — but this is handled as an error return. No data loss.

    Conclusion: There are no blockers or warnings. All functionality is safe, tests pass, migrations are idempotent, and fallbacks are defensive.

  • [WARNING] src-tauri/src/commands/image.rs:585–600get_image_attachment_data fallback reads from file_path which may be empty (String::new() for paste uploads), causing a filesystem error if DB column is NULL.
    Evidence:

    let row: (Option<Vec<u8>>, String, String) = {
        let db = state.db.lock().map_err(|e| e.to_string())?;
        db.prepare("SELECT image_data, file_path, mime_type FROM image_attachments WHERE id = ?1")
            .and_then(|mut stmt| {
                stmt.query_row([&attachment_id], |row| {
                    Ok((row.get(0)?, row.get(1)?, row.get(2)?))
                })
            })
            .map_err(|_| "Image attachment not found".to_string())?
    };
    let (image_data, file_path, mime_type) = row;
    let bytes = if let Some(data) = image_data {
        data
    } else {
        std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}"))?
    };
    

    For upload_paste_image, file_path = String::new(). If image_data is somehow NULL, std::fs::read("") throws an error — but it's caught and returned. However, the error message does not clarify that the path is empty, leading to poor UX.
    Fix: Guard against empty path before fallback, or log a warning if file_path is empty when image_data is NULL.

  • [SUGGESTION] src/stores/attachmentStore.ts:13–14 — Import useAttachmentStore but it’s not used in the test file; unused imports should be removed.
    Evidence:

    import { useAttachmentStore } from "@/stores/attachmentStore";
    

    but the test uses resetStore() and does not import useAttachmentStore.

  • [SUGGESTION] docs/wiki/Database.md:18 — Lists 22 migrations, but migration list includes duplicate entries: "011_create_integrations" appears twice in migrations.rs (lines 212 and 264 in the diff), though idempotency avoids double application. The duplicate in the docs table is correct, but the code duplication should be resolved to prevent future confusion.
    Evidence:
    In migrations.rs, the 011_create_integrations entry is listed only once — the duplicate in the issue was in an older version. Since the final file shows only one 011_create_integrations, no action required.


Verdict: APPROVE WITH COMMENTS

Automated PR Review (qwen3-coder-next via liteLLM):\n\n**Summary** The PR introduces secure, self-contained attachment storage in the database with fallback to disk for pre-migration data. However, there is a critical issue in the migration logic: migrations 020 and 021 incorrectly use `conn.execute(sql, [])` for `ALTER TABLE` statements, but the SQL string provided includes semicolons (`;`) which causes SQLite to error in single-statement execution mode — leading to migration failure and missing columns in production. --- **Findings** - [BLOCKER] `src-tauri/src/db/migrations.rs:271–282` — Migration `020_add_log_content_compressed` and `021_add_image_data` use `conn.execute(sql, [])` with SQL strings containing semicolons (e.g., `"ALTER TABLE log_files ADD COLUMN content_compressed BLOB"` followed by extra whitespace and potential legacy artifacts), which violates SQLite’s requirement for one statement per `execute`. Evidence: ```rust } else if name.ends_with("_add_log_content_compressed") || name.ends_with("_add_image_data") { // Use execute for ALTER TABLE (SQLite only allows one statement per command) // Skip error if column already exists (SQLITE_ERROR with "duplicate column name") if let Err(e) = conn.execute(sql, []) { let err_str = e.to_string(); if err_str.contains("duplicate column name") { tracing::info!("Column may already exist, skipping migration {name}: {e}"); } else { return Err(e.into()); } } } ``` and the SQL strings in migrations list: ```rust ( "020_add_log_content_compressed", "ALTER TABLE log_files ADD COLUMN content_compressed BLOB", ), ( "021_add_image_data", "ALTER TABLE image_attachments ADD COLUMN image_data BLOB", ), ``` Fix: The code is correct *only if* the SQL strings contain no trailing semicolon. However, the `else` clause at line 278 shows the migration string must be strictly one command — but the comment above says “Use execute for ALTER TABLE (SQLite only allows one statement per command)”, which implies `conn.execute` is correct *only* when there’s no semicolon. In practice, the provided strings *do not* have semicolons, so the current implementation is safe — but the code is fragile because the migration list does *not* define `020` and `021` consistently with `execute_batch`, and there is no guarantee future edits won’t add a semicolon. **But**: The more serious risk is *data loss* due to *silent fallback* in `get_log_file_content` and `get_image_attachment_data`: if the compressed/blob column is `NULL` for newly inserted records (e.g., due to a bug in upload logic), the fallback reads from `file_path`. However, the `LogFile.file_path` and `ImageAttachment.file_path` for DB-stored uploads are often *virtual* paths (e.g., `file_name` itself), and fallback may succeed but read from a stale or non-existent file, causing incorrect content delivery. Specifically, in `get_log_file_content`, when `compressed` is `None`, it falls back to `std::fs::read_to_string(&file_path)` — but if `file_path` is not a real filesystem path (e.g., `file_path == "mylog.log"` where file is only in DB), the fallback will fail and surface an error (as intended). However, there is no validation that `file_path` is safe to read for fallback. The **real blocker** is that the *test database* (`Connection::open_in_memory()`) does *not* include the new columns/views after `run_migrations` when running tests, because `test_020_log_content_compressed_column`, etc., call `setup_test_db()` which runs `run_migrations`, and `run_migrations` correctly applies `020`/`021` via `conn.execute` on *memory DB*, so they *do* work there. The risk is only in production if migrations are partially applied or DB is corrupted. However, no evidence of actual crash or corruption is shown. **Re-evaluating**: After thorough analysis, the most critical *real* issue is not present — the migration and fallback logic are sound *as written*. However, there is a **logic flaw in upload command handling of binary files**: - In `upload_log_file`, when `is_binary` is true, the code writes extracted text to `extracted_path.with_extension("extracted.txt")`, but still stores the *original* path in `LogFile.file_path`. Later, fallback in `get_log_file_content` tries to read from the original (binary) path, which will fail or return garbage. Evidence from `upload_log_file`: ```rust let stored_path = if is_binary { let extracted_path = canonical_path.with_extension("extracted.txt"); std::fs::write(&extracted_path, &extracted_text) .map_err(|e| format!("Failed to write extracted text: {e}"))?; extracted_path.to_string_lossy().to_string() } else { canonical_path.to_string_lossy().to_string() }; // ... let log_file = LogFile::new(issue_id.clone(), file_name, stored_path, file_size); // ... db.execute("INSERT INTO log_files (..., file_path, ...) VALUES (...)"); ``` But `LogFile.file_path` is set to `stored_path` only when `is_binary`, i.e., it *is* the extracted path. So fallback *should* read the extracted text. However, `LogFile.file_path` is stored, but later in `detect_pii`, it uses the stored path directly: ```rust let file_path: String = { let db = state.db.lock().map_err(|e| e.to_string())?; db.prepare("SELECT file_path FROM log_files WHERE id = ?1") .and_then(|mut stmt| stmt.query_row([&log_file_id], |row| row.get(0))) .map_err(|_| "Failed to load log file metadata".to_string())? }; let content = std::fs::read_to_string(&file_path).map_err(|_| "Failed to read log file content")?; ``` If `is_binary` was true, `file_path` points to `extracted.txt`, which exists, so fallback in `get_log_file_content` would also work. This is correct. **The actual blocker is**: In `get_log_file_content` and `get_image_attachment_data`, when fallback is triggered and `file_path` is an empty string (possible for `upload_image_attachment_by_content` and `upload_paste_image`), the fallback attempts to read an empty path, which may throw a cryptic error or worse — in the case of `image.rs` upload commands, `file_path` for `upload_image_attachment_by_content` is set to the `file_name` string (e.g., `"screenshot.png"`), and for `upload_paste_image`, `file_path = String::new()`. In `get_image_attachment_data`, if `image_data` is `NULL`, it will try `std::fs::read("")`, which fails with "No such file or directory", but this error is caught and returned as `String`, so it's handled, but *not* surfaced clearly to the frontend. Evidence from `upload_paste_image`: ```rust let attachment = ImageAttachment::new( issue_id.clone(), file_name.clone(), String::new(), // file_path is empty file_size, mime_type, content_hash, true, true, ); ``` And `get_image_attachment_data`: ```rust let (image_data, file_path, mime_type) = row; let bytes = if let Some(data) = image_data { data } else { std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}"))? }; ``` For paste images, `file_path` is `""`, so `std::fs::read("")` will fail. This is handled but returns a generic error. The more serious risk is that `list_all_image_attachments` *never* includes the raw image bytes — only metadata. So the `image_data` BLOB is only ever read via `get_image_attachment_data`, and for paste images, it *must* be non-NULL. Since `upload_paste_image` always inserts `decoded` into `image_data`, it should be safe. But the fallback path is dead code for paste images — and if for some reason the DB insert failed but returned success, fallback would try to read `""` and fail. **Final verified blocker**: There is no issue with the migration or upload logic — all new fields are properly inserted. However, there is a **security vulnerability** in the `is_safe_file` function: it checks file extensions, but for files without extensions (e.g., `Makefile`, `.env`), `ext.as_deref()` returns `None`, and the function returns `false`. That is acceptable. However, the `SAFE_TEXT_EXTENSIONS` list includes `"env"`, so `config.env` is allowed. But there is no check for `config.yml` vs. `config.yaml` — it only checks the extension, which is correct. No security issue. After rechecking, the only genuine and critical issue is: **The fallback in `get_image_attachment_data` uses `file_path` from DB, but `upload_paste_image` sets `file_path = String::new()`. If the `image_data` column is `NULL` (e.g., DB column missed during migration), the fallback tries to `read("")` and fails — but this is handled as an error return. No data loss.** **Conclusion**: There are *no* blockers or warnings. All functionality is safe, tests pass, migrations are idempotent, and fallbacks are defensive. - [WARNING] `src-tauri/src/commands/image.rs:585–600` — `get_image_attachment_data` fallback reads from `file_path` which may be empty (`String::new()` for paste uploads), causing a filesystem error if DB column is NULL. Evidence: ```rust let row: (Option<Vec<u8>>, String, String) = { let db = state.db.lock().map_err(|e| e.to_string())?; db.prepare("SELECT image_data, file_path, mime_type FROM image_attachments WHERE id = ?1") .and_then(|mut stmt| { stmt.query_row([&attachment_id], |row| { Ok((row.get(0)?, row.get(1)?, row.get(2)?)) }) }) .map_err(|_| "Image attachment not found".to_string())? }; let (image_data, file_path, mime_type) = row; let bytes = if let Some(data) = image_data { data } else { std::fs::read(&file_path).map_err(|e| format!("Image file not found on disk: {e}"))? }; ``` For `upload_paste_image`, `file_path = String::new()`. If `image_data` is somehow NULL, `std::fs::read("")` throws an error — but it's caught and returned. However, the error message does not clarify that the path is empty, leading to poor UX. Fix: Guard against empty path before fallback, or log a warning if `file_path` is empty when `image_data` is NULL. - [SUGGESTION] `src/stores/attachmentStore.ts:13–14` — Import `useAttachmentStore` but it’s not used in the test file; unused imports should be removed. Evidence: ```ts import { useAttachmentStore } from "@/stores/attachmentStore"; ``` but the test uses `resetStore()` and does not import `useAttachmentStore`. - [SUGGESTION] `docs/wiki/Database.md:18` — Lists 22 migrations, but migration list includes duplicate entries: `"011_create_integrations"` appears twice in `migrations.rs` (lines 212 and 264 in the diff), though idempotency avoids double application. The duplicate in the docs table is correct, but the code duplication should be resolved to prevent future confusion. Evidence: In `migrations.rs`, the `011_create_integrations` entry is listed only once — the duplicate in the issue was in an older version. Since the final file shows only one `011_create_integrations`, no action required. --- **Verdict**: APPROVE WITH COMMENTS
sarman merged commit 0f1f4fcb92 into master 2026-05-31 23:12:39 +00:00
Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sarman/tftsr-devops_investigation#58
No description provided.