tftsr-devops_investigation/tickets/ci-runner-speed-optimization.md
Shaun Arman 093495a653
2026-06-05 14:12:43 -05:00

CI Runner Speed Optimization via Pre-baked Images + Caching

Description

Every CI run (both test.yml and auto-tag.yml) was installing system packages from scratch on each job invocation: apt-get update, Tauri system libs, Node.js via nodesource, and, in the arm64 job, a full rustup install. This was the primary cause of slow builds.

The repository already contains pre-baked builder Docker images (.docker/Dockerfile.*) and a build-images.yml workflow to push them to the local Gitea registry at gitea.tftsr.com:3000. These images were never referenced by the actual CI jobs — a critical gap. This work closes that gap and adds actions/cache@v3 for Cargo and npm.

Acceptance Criteria

  • Dockerfile.linux-amd64 includes rustfmt and clippy components
  • Dockerfile.linux-arm64 includes rustfmt and clippy components
  • test.yml Rust jobs use gitea.tftsr.com:3000/sarman/trcaa-linux-amd64:rust1.88-node22
  • test.yml Rust jobs have no inline apt-get or rustup component add steps
  • test.yml Rust jobs include actions/cache@v3 for ~/.cargo/registry
  • test.yml frontend jobs include actions/cache@v3 for ~/.npm
  • auto-tag.yml build-linux-amd64 uses pre-baked trcaa-linux-amd64 image
  • auto-tag.yml build-windows-amd64 uses pre-baked trcaa-windows-cross image
  • auto-tag.yml build-linux-arm64 uses pre-baked trcaa-linux-arm64 image
  • All three build jobs have no Install dependencies step
  • All three build jobs include actions/cache@v3 for Cargo and npm
  • docs/wiki/CICD-Pipeline.md documents pre-baked images, cache keys, and server prerequisites
  • build-images.yml triggered manually before merging to ensure images exist in registry

Work Implemented

.docker/Dockerfile.linux-amd64

Added RUN rustup component add rustfmt clippy after the existing target add line. The rust-fmt-check and rust-clippy CI jobs now rely on these being pre-installed in the image rather than installing them at job runtime.

.docker/Dockerfile.linux-arm64

Appended && /root/.cargo/bin/rustup component add rustfmt clippy to the existing rustup installation RUN command (chained with && to keep it in one layer).

.gitea/workflows/test.yml

  • rust-fmt-check, rust-clippy, rust-tests: switched container image from rust:1.88-slim → gitea.tftsr.com:3000/sarman/trcaa-linux-amd64:rust1.88-node22. Removed apt-get install git from Checkout steps (git is pre-installed in the image). Removed apt-get install libwebkit2gtk-... steps. Removed rustup component add rustfmt and rustup component add clippy steps. Added actions/cache@v3 step for ~/.cargo/registry/index, ~/.cargo/registry/cache, ~/.cargo/git/db keyed on Cargo.lock hash.
  • frontend-typecheck, frontend-tests: kept node:22-alpine image (no change needed). Added actions/cache@v3 step for ~/.npm keyed on package-lock.json hash.
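
As a rough illustration of the shape these test.yml changes take, the fragment below sketches one Rust job and one frontend job. The image names, cache paths, and lock files come from this ticket. The runs-on label, step names, cache key prefixes, restore-keys, and the cargo fmt command are assumptions made for the example and may not match the actual workflow.

```yaml
# Sketch only: runner label, step names, key prefixes, and run commands are assumptions.
jobs:
  rust-fmt-check:
    runs-on: ubuntu-latest            # assumed runner label
    container:
      image: gitea.tftsr.com:3000/sarman/trcaa-linux-amd64:rust1.88-node22
    steps:
      # The existing Checkout step goes here, unchanged apart from dropping
      # apt-get install git. It must run before the cache step so that
      # hashFiles() can see Cargo.lock.
      - name: Cache cargo registry
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/registry/index
            ~/.cargo/registry/cache
            ~/.cargo/git/db
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-
      - name: Check formatting
        run: cargo fmt --check        # assumed; the real job may pass a manifest path

  frontend-typecheck:
    runs-on: ubuntu-latest            # assumed runner label
    container:
      image: node:22-alpine
    steps:
      # Checkout step omitted, as above.
      - name: Cache npm
        uses: actions/cache@v3
        with:
          path: ~/.npm
          key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-npm-
```

Keying on the lock file hash means the cache is reused until dependencies change. The restore-keys fallback, if used, lets a job start from the most recent older cache rather than an empty one.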

.gitea/workflows/auto-tag.yml

  • build-linux-amd64: image rust:1.88-slim → trcaa-linux-amd64:rust1.88-node22. Removed Checkout apt-get install git, removed entire Install dependencies step. Removed rustup target add x86_64-unknown-linux-gnu from Build step. Added cargo + npm cache.
  • build-windows-amd64: image rust:1.88-slim → trcaa-windows-cross:rust1.88-node22. Removed Checkout apt-get install git, removed entire Install dependencies step. Removed rustup target add x86_64-pc-windows-gnu from Build step. Added cargo (with -windows- suffix key to avoid collision) + npm cache.
  • build-linux-arm64: image ubuntu:22.04 → trcaa-linux-arm64:rust1.88-node22. Removed Checkout apt-get install git, removed entire Install dependencies step (~40 lines). Removed . "$HOME/.cargo/env" (PATH already set via ENV in Dockerfile). Removed rustup target add aarch64-unknown-linux-gnu from Build step. Added cargo (with -arm64- suffix key) + npm cache.
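
The per-platform key suffixes can be sketched as follows. All three build jobs run in Linux containers, so a key built only from the runner OS and the Cargo.lock hash would presumably be identical across jobs and let one target's cache overwrite another's. In this fragment the -windows- and -arm64- suffixes come from this ticket. The full registry prefix on the image names, the runs-on label, the step names, and the rest of the key layout are assumptions.

```yaml
# Sketch only: registry prefix, runner label, step names, and key layout are assumptions.
jobs:
  build-windows-amd64:
    runs-on: ubuntu-latest            # assumed runner label
    container:
      image: gitea.tftsr.com:3000/sarman/trcaa-windows-cross:rust1.88-node22
    steps:
      - name: Cache cargo registry (Windows cross-compile)
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/registry/index
            ~/.cargo/registry/cache
            ~/.cargo/git/db
          # The -windows- suffix keeps this cache separate from the amd64 Linux job.
          key: ${{ runner.os }}-cargo-windows-${{ hashFiles('**/Cargo.lock') }}

  build-linux-arm64:
    runs-on: ubuntu-latest            # assumed runner label
    container:
      image: gitea.tftsr.com:3000/sarman/trcaa-linux-arm64:rust1.88-node22
    steps:
      - name: Cache cargo registry (arm64)
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/registry/index
            ~/.cargo/registry/cache
            ~/.cargo/git/db
          # The -arm64- suffix plays the same role for the arm64 build.
          key: ${{ runner.os }}-cargo-arm64-${{ hashFiles('**/Cargo.lock') }}
```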

docs/wiki/CICD-Pipeline.md

Added two new sections before the Test Pipeline section:

  • Pre-baked Builder Images: table of all three images and their contents, rebuild triggers, how-to-rebuild instructions, and the insecure-registries Docker daemon prerequisite for gitea.tftsr.com.
  • Cargo and npm Caching: documents the actions/cache@v3 key patterns in use, including the per-platform cache key suffixes for cross-compile jobs.

Also updated the Test Pipeline section to reference the correct pre-baked image name, and the Release Pipeline job table to show which image each build job uses.

Testing Needed

  1. Pre-build images (prerequisite): Trigger build-images.yml via workflow_dispatch on Gitea Actions UI. Confirm all 3 images are pushed and visible in the registry.

  2. Server prerequisite: Confirm /etc/docker/daemon.json on gitea.tftsr.com contains {"insecure-registries":["gitea.tftsr.com:3000"]} and Docker was restarted after.

  3. PR test suite: Open a PR with these changes. Verify:

    • All 5 test jobs pass (rust-fmt-check, rust-clippy, rust-tests, frontend-typecheck, frontend-tests)
    • Job logs show no apt-get or rustup component add output
    • Cache hit messages appear on second run
  4. Release build: Merge to master. Verify auto-tag.yml runs and:

    • All 3 Linux/Windows build jobs start without Install dependencies step
    • Artifacts are produced and uploaded to the Gitea release
    • Total release time is significantly reduced (~7 min vs ~25 min before)
  5. Expected time savings after caching warms up:

    Job                        Before     After
    rust-fmt-check             ~2 min     ~20 sec
    rust-clippy                ~4 min     ~45 sec
    rust-tests                 ~5 min     ~1.5 min
    frontend-typecheck         ~2 min     ~30 sec
    frontend-tests             ~3 min     ~40 sec
    build-linux-amd64          ~10 min    ~3 min
    build-windows-amd64        ~12 min    ~4 min
    build-linux-arm64          ~15 min    ~4 min
    PR test total (parallel)   ~5 min     ~1.5 min
    Release total              ~25 min    ~7 min