tftsr-devops_investigation/CLAUDE.md
Shaun Arman 0a9686fb04
Some checks failed
Test / frontend-tests (pull_request) Successful in 1m42s
Test / frontend-typecheck (pull_request) Successful in 1m51s
PR Review Automation / review (pull_request) Has been cancelled
Test / rust-fmt-check (pull_request) Has been cancelled
Test / rust-clippy (pull_request) Has been cancelled
Test / rust-tests (pull_request) Has been cancelled
docs: update to v1.1.0 release with Kubernetes Management UI
- Add v1.1.0 release notes with Kubernetes Management UI features
- Update CI/CD references from GitHub Actions to Gitea Actions
- Update domain prompt count from 17 to 15 in documentation
- Update project status to include Phase 13 (Kubernetes Management)
- Remove completed features from Future Enhancements sections
- Update AGENTS.md and Architecture.md for consistency
2026-06-10 20:19:50 -05:00

10 KiB
Raw Permalink Blame History

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.


Commands

Development

# Start full dev environment (Vite + Tauri hot reload)
cargo tauri dev

# Frontend only (Vite at localhost:1420)
npm run dev

# Frontend production build
npm run build

Rust toolchain must be in PATH: source ~/.cargo/env

Testing

# Rust unit tests
cargo test --manifest-path src-tauri/Cargo.toml

# Run a single Rust test module
cargo test --manifest-path src-tauri/Cargo.toml pii::detector

# Run a single Rust test by name
cargo test --manifest-path src-tauri/Cargo.toml test_detect_ipv4

# Frontend tests (single run)
npm run test:run

# Frontend tests (watch mode)
npm run test

# Frontend coverage report
npm run test:coverage

# TypeScript type check
npx tsc --noEmit

Linting

# Rust format check
cargo fmt --manifest-path src-tauri/Cargo.toml --check

# Rust lints
cargo clippy --manifest-path src-tauri/Cargo.toml -- -D warnings

# Rust quick type check (no linking)
cargo check --manifest-path src-tauri/Cargo.toml

# Frontend linting
npx eslint . --max-warnings 0

System Prerequisites (Linux/Fedora)

sudo dnf install -y glib2-devel gtk3-devel webkit2gtk4.1-devel \
  libsoup3-devel openssl-devel librsvg2-devel

Production Build

cargo tauri build  # Outputs to src-tauri/target/release/bundle/

CI/CD

  • Test pipeline: .gitea/workflows/test.yml — runs on every push/PR targeting main
  • Release pipeline: .gitea/workflows/auto-tag.yml — runs on every push to master, auto-tags, produces multi-platform bundles (Linux amd64+arm64, Windows, macOS arm64+Intel), uploads to Gitea Releases at https://gogs.tftsr.com/sarman/tftsr-devops_investigation/releases
  • Docker builder images: .gitea/workflows/build-images.yml — rebuilds 172.0.0.29:3000/tftsr/tftsr-* images when .docker/** changes on master

Architecture

Backend (Rust / Tauri)

Entry point: src-tauri/src/lib.rsrun() initialises tracing, opens the DB, registers Tauri plugins, and calls generate_handler![] with all IPC commands.

Shared state (src-tauri/src/state.rs):

pub struct AppState {
    pub db: Arc<Mutex<rusqlite::Connection>>,
    pub settings: Arc<Mutex<AppSettings>>,
    pub app_data_dir: PathBuf,  // ~/.local/share/tftsr on Linux
}

All command handlers receive State<'_, AppState> as a Tauri-injected parameter. Lock the Mutex inside a { } block and release it before any .await — holding a MutexGuard across an await point causes a compile error because MutexGuard is not Send.

Module layout:

Path Responsibility
commands/db.rs Issue CRUD, 5-whys entries, timeline events
commands/ai.rs analyze_logs, chat_message, list_providers
commands/analysis.rs Log file upload, PII detection, redaction application
commands/docs.rs RCA and post-mortem generation, document export
commands/system.rs Ollama management, hardware probe, app settings, audit log
commands/integrations.rs Confluence / ServiceNow / ADO — all v0.2 stubs
ai/provider.rs Provider trait + create_provider() factory
pii/detector.rs Multi-pattern PII scanner with overlap resolution
db/migrations.rs Versioned schema (10 migrations tracked in _migrations table)
db/models.rs All DB types — see IssueDetail note below
docs/rca.rs + docs/postmortem.rs Markdown template builders
audit/log.rs write_audit_event() — called before every external send

AI provider factory: ai/provider.rs::create_provider(config) dispatches on config.name to the matching struct. Adding a provider means implementing the Provider trait and adding a match arm.

Database encryption: cfg!(debug_assertions) → plain SQLite; release → SQLCipher AES-256. Key from TRCAA_DB_KEY (or legacy TRCAA_DB_KEY) env var (defaults to a dev placeholder). DB path from TRCAA_DATA_DIR (or legacy TRCAA_DATA_DIR) or platform data dir.

Credential encryption: API keys stored in AppSettings are encrypted using AES-256-GCM via the aes-gcm crate. The encryption key is derived from TRCAA_ENCRYPTION_KEY (or legacy TRCAA_ENCRYPTION_KEY) env var. Credentials are encrypted on save and decrypted on load. See commands/system.rs::save_settings() for implementation.

Frontend (React / TypeScript)

IPC layer: All Tauri invoke() calls are in src/lib/tauriCommands.ts. Every command has a typed wrapper function (e.g., createIssueCmd, chatMessageCmd). This is the single source of truth for the frontend's API surface.

Stores (Zustand):

  • sessionStore.ts — ephemeral triage session: current issue, chat messages, PII spans, why-level (05), loading state. Not persisted.
  • settingsStore.ts — AI providers, theme, Ollama URL. Persisted to localStorage as "tftsr-settings".
  • historyStore.ts — read-only cache of past issues for the History page.

Page flow:

NewIssue → createIssueCmd → startSession(detail.issue) → navigate /issue/:id/triage
LogUpload → uploadLogFileCmd → detectPiiCmd → applyRedactionsCmd
Triage   → chatMessageCmd loop, parse AI response for "why 2..5", detect root cause
Resolution → getIssueCmd, mark 5-whys steps done
RCA      → generateRcaCmd → DocEditor → exportDocumentCmd

Domain system prompts: src/lib/domainPrompts.ts contains expert-level system prompts for Linux, Windows, Network, Kubernetes, Databases, Virtualization, Hardware, and Observability. Each prompt is injected as the first message in every triage conversation.

Key Type: IssueDetail

get_issue() returns a nested struct, not a flat Issue. Use detail.issue.title, not detail.title:

pub struct IssueDetail {
    pub issue: Issue,                          // Base issue fields
    pub log_files: Vec<LogFile>,
    pub resolution_steps: Vec<ResolutionStep>, // 5-whys entries
    pub conversations: Vec<AiConversation>,
}

On the TypeScript side, tauriCommands.ts mirrors this shape exactly.

PII Detection

PiiDetector::detect(&str) returns Vec<PiiSpan> with non-overlapping spans (longest match wins on overlap). Spans carry start/end byte offsets and a replacement string ([IPv4], [EMAIL], etc.). The redactor applies spans by iterating in reverse order to preserve offsets.

Before any text is sent to an AI provider, apply_redactions must be called and the resulting SHA-256 hash recorded via audit::log::write_audit_event.

Shell Command Execution (v1.0.0)

Status: Production-ready agentic shell execution with three-tier safety classification.

Features:

  • kubectl commands with bundled binary (v1.30.0)
  • Proxmox tools (pvecm, pvesh, qm)
  • General shell diagnostics
  • Real-time approval modal for Tier 2 commands
  • Multiple kubeconfig support with AES-256 encrypted storage
  • Pipe/chain command analysis with tier escalation
  • Command execution history and audit logging

Three-Tier Safety System:

  • Tier 1 (Auto-execute): kubectl get|describe|logs, cat|grep|ls
  • Tier 2 (User approval): kubectl apply|delete|scale, ssh, systemctl restart
  • Tier 3 (Always deny): rm -rf, shutdown, mkfs

Key Files:

  • src-tauri/src/shell/classifier.rs: Command safety classification (19 tests, 100% coverage)
  • src-tauri/src/shell/executor.rs: Execution flow with approval gates
  • src-tauri/src/shell/kubectl.rs: kubectl binary management
  • src-tauri/src/shell/kubeconfig.rs: Kubeconfig parsing and encryption
  • src-tauri/src/commands/shell.rs: 7 Tauri commands for kubeconfig and execution management
  • src-tauri/src/ai/tools.rs: execute_shell_command tool registration
  • src/components/ShellApprovalModal.tsx: Real-time approval UI
  • src/pages/Settings/ShellExecution.tsx: Settings and history view
  • src/pages/Settings/KubeconfigManager.tsx: Multi-cluster management UI
  • scripts/download-kubectl.sh: Binary download for all platforms

Database Tables (Migrations 024-027):

  • shell_commands: Pre-defined command templates with tier definitions
  • kubeconfig_files: Encrypted kubeconfig storage
  • command_executions: Full audit trail (command, tier, status, exit code, stdout, stderr, timing)
  • approval_decisions: Session-based approval preferences

Documentation: docs/wiki/Shell-Execution.md

Gitea Actions CI

All pipelines run on Gitea Actions at https://gogs.tftsr.com/sarman/tftsr-devops_investigation/actions.

  • TFT_GITEA_TOKEN is the only credential needed — no external secrets required
  • Builder images are hosted on 172.0.0.29:3000/tftsr/ (private registry)
  • Branch protection on master requires rust-test and frontend-test checks to pass, plus PR review, before merging
  • kubectl binaries downloaded during build via scripts/download-kubectl.sh for all platforms

Wiki Maintenance

The project wiki lives at https://gogs.tftsr.com/sarman/tftsr-devops_investigation/wiki.

Source of truth: docs/wiki/*.md in this repo. The auto-tag workflow (in .gitea/workflows/auto-tag.yml) automatically pushes any changes to the Gitea wiki on every push to master.

When making code changes, update the corresponding wiki file in docs/wiki/ before committing:

Changed area Wiki file to update
New/changed Tauri commands (commands/*.rs, tauriCommands.ts) docs/wiki/IPC-Commands.md
DB schema or migrations (db/migrations.rs, db/models.rs) docs/wiki/Database.md
New/changed AI provider (ai/*.rs) docs/wiki/AI-Providers.md
PII patterns or detection logic (pii/) docs/wiki/PII-Detection.md
CI/CD pipeline changes (.github/workflows/*.yml) docs/wiki/CICD-Pipeline.md
Rust architecture or module layout (lib.rs, state.rs) docs/wiki/Architecture.md
Security-relevant changes (capabilities, audit, Stronghold) docs/wiki/Security-Model.md
Dev setup, prerequisites, build commands docs/wiki/Development-Setup.md
Integration stubs or v0.2 progress (integrations/) docs/wiki/Integrations.md
Recurring bugs and fixes docs/wiki/Troubleshooting.md
Shell execution, kubectl, kubeconfig management (shell/) docs/wiki/Shell-Execution.md

To manually push wiki changes without waiting for CI:

cd /tmp/apollo-wiki   # local clone of the wiki git repo
# edit *.md files, then:
git add -A && git commit -m "docs: ..." && git push