sarman/tftsr-devops_investigation

Fork 0

Shaun Arman 093495a653

Test / rust-fmt-check (pull_request) Failing after 0s

Details

Test / rust-clippy (pull_request) Failing after 1s

Details

Test / rust-tests (pull_request) Failing after 0s

Details

Test / frontend-typecheck (pull_request) Failing after 16s

Details

Test / frontend-tests (pull_request) Failing after 18s

Details

PR Review Automation / review (pull_request) Failing after 4m13s

Details

feat: full copy from apollo_nxt-trcaa with complete sanitization

Complete backport of all features from apollo_nxt-trcaa repository:
- Three-tier shell execution safety system (Tier 1: auto, Tier 2: approve, Tier 3: deny)
- Ollama function calling with tool use support
- AI provider tool calling auto-detection
- kubectl binary bundling and management
- kubeconfig upload and context management
- Shell approval modal with real-time UI
- MCP protocol HTTP transport with custom headers
- Enhanced security audit logging
- Comprehensive test coverage (275+ tests)
- Updated CI/CD workflows for Gitea Actions
- Complete documentation (ADRs, wiki, release notes)

Sanitization applied to all files:
- Removed all MSI, Motorola, VNXT, Vesta references
- Replaced internal infrastructure references with TFTSR equivalents
- Updated all URLs and API endpoints
- Sanitized commit history references in documentation

Technical changes:
- New modules: shell/classifier, shell/executor, shell/kubectl, shell/kubeconfig
- Enhanced AI providers: ollama.rs, openai.rs with function calling
- New Tauri commands: shell execution, kubeconfig management, tool calling detection
- Database migrations: shell_execution_audit table
- Frontend: ShellApprovalModal, ShellExecution, KubeconfigManager pages
- CI/CD: kubectl bundling, multi-platform builds, Gitea Actions integration

Version: 1.0.8

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-06-05 14:12:43 -05:00

10 KiB

Raw Blame History

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Commands

Development

# Start full dev environment (Vite + Tauri hot reload)
cargo tauri dev

# Frontend only (Vite at localhost:1420)
npm run dev

# Frontend production build
npm run build

Rust toolchain must be in PATH: source ~/.cargo/env

Testing

# Rust unit tests
cargo test --manifest-path src-tauri/Cargo.toml

# Run a single Rust test module
cargo test --manifest-path src-tauri/Cargo.toml pii::detector

# Run a single Rust test by name
cargo test --manifest-path src-tauri/Cargo.toml test_detect_ipv4

# Frontend tests (single run)
npm run test:run

# Frontend tests (watch mode)
npm run test

# Frontend coverage report
npm run test:coverage

# TypeScript type check
npx tsc --noEmit

Linting

# Rust format check
cargo fmt --manifest-path src-tauri/Cargo.toml --check

# Rust lints
cargo clippy --manifest-path src-tauri/Cargo.toml -- -D warnings

# Rust quick type check (no linking)
cargo check --manifest-path src-tauri/Cargo.toml

# Frontend linting
npx eslint . --max-warnings 0

System Prerequisites (Linux/Fedora)

sudo dnf install -y glib2-devel gtk3-devel webkit2gtk4.1-devel \
  libsoup3-devel openssl-devel librsvg2-devel

Production Build

cargo tauri build  # Outputs to src-tauri/target/release/bundle/

CI/CD

Test pipeline: .github/workflows/test.yml — runs on every push/PR targeting main
Release pipeline: .github/workflows/release.yml — runs on every push to main, auto-tags, produces multi-platform bundles (Linux amd64+arm64, Windows, macOS arm64+Intel), uploads to GitHub Releases at https://github.com/tftsr/apollo_nxt-trcaa/releases
Docker builder images: .github/workflows/build-images.yml — rebuilds ghcr.io/tftsr/trcaa-* images when .docker/** changes on main

Architecture

Backend (Rust / Tauri)

Entry point: src-tauri/src/lib.rs → run() initialises tracing, opens the DB, registers Tauri plugins, and calls generate_handler![] with all IPC commands.

Shared state (src-tauri/src/state.rs):

pub struct AppState {
    pub db: Arc<Mutex<rusqlite::Connection>>,
    pub settings: Arc<Mutex<AppSettings>>,
    pub app_data_dir: PathBuf,  // ~/.local/share/trcaa on Linux
}

All command handlers receive State<'_, AppState> as a Tauri-injected parameter. Lock the Mutex inside a { } block and release it before any .await — holding a MutexGuard across an await point causes a compile error because MutexGuard is not Send.

Module layout:

Path	Responsibility
`commands/db.rs`	Issue CRUD, 5-whys entries, timeline events
`commands/ai.rs`	`analyze_logs`, `chat_message`, `list_providers`
`commands/analysis.rs`	Log file upload, PII detection, redaction application
`commands/docs.rs`	RCA and post-mortem generation, document export
`commands/system.rs`	Ollama management, hardware probe, app settings, audit log
`commands/integrations.rs`	Confluence / ServiceNow / ADO — all v0.2 stubs
`ai/provider.rs`	`Provider` trait + `create_provider()` factory
`pii/detector.rs`	Multi-pattern PII scanner with overlap resolution
`db/migrations.rs`	Versioned schema (10 migrations tracked in `_migrations` table)
`db/models.rs`	All DB types — see IssueDetail note below
`docs/rca.rs` + `docs/postmortem.rs`	Markdown template builders
`audit/log.rs`	`write_audit_event()` — called before every external send

AI provider factory: ai/provider.rs::create_provider(config) dispatches on config.name to the matching struct. Adding a provider means implementing the Provider trait and adding a match arm.

Database encryption: cfg!(debug_assertions) → plain SQLite; release → SQLCipher AES-256. Key from TRCAA_DB_KEY (or legacy TRCAA_DB_KEY) env var (defaults to a dev placeholder). DB path from TRCAA_DATA_DIR (or legacy TRCAA_DATA_DIR) or platform data dir.

Credential encryption: API keys stored in AppSettings are encrypted using AES-256-GCM via the aes-gcm crate. The encryption key is derived from TRCAA_ENCRYPTION_KEY (or legacy TRCAA_ENCRYPTION_KEY) env var. Credentials are encrypted on save and decrypted on load. See commands/system.rs::save_settings() for implementation.

Frontend (React / TypeScript)

IPC layer: All Tauri invoke() calls are in src/lib/tauriCommands.ts. Every command has a typed wrapper function (e.g., createIssueCmd, chatMessageCmd). This is the single source of truth for the frontend's API surface.

Stores (Zustand):

sessionStore.ts — ephemeral triage session: current issue, chat messages, PII spans, why-level (0–5), loading state. Not persisted.
settingsStore.ts — AI providers, theme, Ollama URL. Persisted to localStorage as "trcaa-settings".
historyStore.ts — read-only cache of past issues for the History page.

Page flow:

NewIssue → createIssueCmd → startSession(detail.issue) → navigate /issue/:id/triage
LogUpload → uploadLogFileCmd → detectPiiCmd → applyRedactionsCmd
Triage   → chatMessageCmd loop, parse AI response for "why 2..5", detect root cause
Resolution → getIssueCmd, mark 5-whys steps done
RCA      → generateRcaCmd → DocEditor → exportDocumentCmd

Domain system prompts: src/lib/domainPrompts.ts contains expert-level system prompts for Linux, Windows, Network, Kubernetes, Databases, Virtualization, Hardware, and Observability. Each prompt is injected as the first message in every triage conversation.

Key Type: `IssueDetail`

get_issue() returns a nested struct, not a flat Issue. Use detail.issue.title, not detail.title:

pub struct IssueDetail {
    pub issue: Issue,                          // Base issue fields
    pub log_files: Vec<LogFile>,
    pub resolution_steps: Vec<ResolutionStep>, // 5-whys entries
    pub conversations: Vec<AiConversation>,
}

On the TypeScript side, tauriCommands.ts mirrors this shape exactly.

PII Detection

PiiDetector::detect(&str) returns Vec<PiiSpan> with non-overlapping spans (longest match wins on overlap). Spans carry start/end byte offsets and a replacement string ([IPv4], [EMAIL], etc.). The redactor applies spans by iterating in reverse order to preserve offsets.

Before any text is sent to an AI provider, apply_redactions must be called and the resulting SHA-256 hash recorded via audit::log::write_audit_event.

Shell Command Execution (v1.0.0)

Status: Production-ready agentic shell execution with three-tier safety classification.

Features:

kubectl commands with bundled binary (v1.30.0)
Proxmox tools (pvecm, pvesh, qm)
General shell diagnostics
Real-time approval modal for Tier 2 commands
Multiple kubeconfig support with AES-256 encrypted storage
Pipe/chain command analysis with tier escalation
Command execution history and audit logging

Three-Tier Safety System:

Tier 1 (Auto-execute): kubectl get|describe|logs, cat|grep|ls
Tier 2 (User approval): kubectl apply|delete|scale, ssh, systemctl restart
Tier 3 (Always deny): rm -rf, shutdown, mkfs

Key Files:

src-tauri/src/shell/classifier.rs: Command safety classification (19 tests, 100% coverage)
src-tauri/src/shell/executor.rs: Execution flow with approval gates
src-tauri/src/shell/kubectl.rs: kubectl binary management
src-tauri/src/shell/kubeconfig.rs: Kubeconfig parsing and encryption
src-tauri/src/commands/shell.rs: 7 Tauri commands for kubeconfig and execution management
src-tauri/src/ai/tools.rs: execute_shell_command tool registration
src/components/ShellApprovalModal.tsx: Real-time approval UI
src/pages/Settings/ShellExecution.tsx: Settings and history view
src/pages/Settings/KubeconfigManager.tsx: Multi-cluster management UI
scripts/download-kubectl.sh: Binary download for all platforms

Database Tables (Migrations 024-027):

shell_commands: Pre-defined command templates with tier definitions
kubeconfig_files: Encrypted kubeconfig storage
command_executions: Full audit trail (command, tier, status, exit code, stdout, stderr, timing)
approval_decisions: Session-based approval preferences

Documentation: docs/wiki/Shell-Execution.md

GitHub Actions CI

All pipelines run on GitHub Actions at https://github.com/tftsr/apollo_nxt-trcaa/actions.

GITHUB_TOKEN is the only credential needed — no external secrets required
Builder images are hosted on ghcr.io/tftsr/ (GitHub Container Registry)
Branch protection on main requires rust-test and frontend-test checks to pass, plus Copilot code review, before merging
kubectl binaries downloaded during build via scripts/download-kubectl.sh for all platforms

Wiki Maintenance

The project wiki lives at https://github.com/tftsr/apollo_nxt-trcaa/wiki.

Source of truth: docs/wiki/*.md in this repo. The wiki-sync job (in .github/workflows/release.yml) automatically pushes any changes to the GitHub wiki on every push to main.

When making code changes, update the corresponding wiki file in docs/wiki/ before committing:

Changed area	Wiki file to update
New/changed Tauri commands (`commands/*.rs`, `tauriCommands.ts`)	`docs/wiki/IPC-Commands.md`
DB schema or migrations (`db/migrations.rs`, `db/models.rs`)	`docs/wiki/Database.md`
New/changed AI provider (`ai/*.rs`)	`docs/wiki/AI-Providers.md`
PII patterns or detection logic (`pii/`)	`docs/wiki/PII-Detection.md`
CI/CD pipeline changes (`.github/workflows/*.yml`)	`docs/wiki/CICD-Pipeline.md`
Rust architecture or module layout (`lib.rs`, `state.rs`)	`docs/wiki/Architecture.md`
Security-relevant changes (capabilities, audit, Stronghold)	`docs/wiki/Security-Model.md`
Dev setup, prerequisites, build commands	`docs/wiki/Development-Setup.md`
Integration stubs or v0.2 progress (`integrations/`)	`docs/wiki/Integrations.md`
Recurring bugs and fixes	`docs/wiki/Troubleshooting.md`
Shell execution, kubectl, kubeconfig management (`shell/`)	`docs/wiki/Shell-Execution.md`

To manually push wiki changes without waiting for CI:

cd /tmp/apollo-wiki   # local clone of the wiki git repo
# edit *.md files, then:
git add -A && git commit -m "docs: ..." && git push

10 KiB Raw Blame History Unescape Escape

CLAUDE.md

Commands

Development

Testing

Linting

System Prerequisites (Linux/Fedora)

Production Build

CI/CD

Architecture

Backend (Rust / Tauri)

Frontend (React / TypeScript)

Key Type: IssueDetail

PII Detection

Shell Command Execution (v1.0.0)

GitHub Actions CI

Wiki Maintenance

10 KiB

Raw Blame History

Key Type: `IssueDetail`