tftsr-devops_investigation/tickets/ci-runner-speed-optimization.md
Shaun Arman 40b6882cab
Some checks failed
Test / rust-fmt-check (pull_request) Failing after 14s
Test / rust-clippy (pull_request) Failing after 16s
Test / rust-tests (pull_request) Failing after 18s
Test / frontend-typecheck (pull_request) Successful in 1m27s
Test / frontend-tests (pull_request) Successful in 1m28s
PR Review Automation / review (pull_request) Successful in 3m4s
fix: comprehensive trcaa→tftsr conversion and URL corrections
Complete sanitization pass to ensure consistency:

**1. Repository/Project Name Changes:**
- trcaa-devops_investigation → tftsr-devops_investigation (everywhere)
- gogs.trcaa.com → gogs.tftsr.com (all URLs)
- ollama-ui.trcaa.com → ollama-ui.tftsr.com

**2. Internal CI URLs (must use 172.0.0.29):**
- gitea.tftsr.com:3000 → 172.0.0.29:3000 in:
  - AGENTS.md
  - README.md
  - docs/architecture/README.md
  - docs/wiki/*.md
- CI runners cannot reach external DNS

**3. Code Simplifications:**
- MSIGenAI/TFTSRGenAI → GenAI (src-tauri/src/ai/openai.rs)
- Cleaner comments without org-specific references

**4. Build System Updates:**
- Makefile: GH_TOKEN → GOGS_TOKEN, GH_REPO → GOGS_REPO
- Commented out GitHub release upload commands
- Fixed lib name: tftsr_lib → trcaa_lib (src/main.rs)

**5. Documentation Cleanup:**
- CLAUDE.md: Fixed wiki URL, Woodpecker→Gitea Actions
- Removed PLAN.md, SECURITY_AUDIT.md (not needed in git)
- Removed hackathon docs (HACKATHON-*.md)
- Removed v1.0.5/7/8 summary docs (superseded)

**6. Preserved:**
- TRCAA (all caps) = application name (correct!)
- trcaa package name in Cargo.toml (correct!)
- trcaa_lib library name (correct!)

**Test Results:** 308 Rust + 92 frontend tests passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-06-05 15:38:29 -05:00

108 lines
6.0 KiB
Markdown

# CI Runner Speed Optimization via Pre-baked Images + Caching
## Description
Every CI run (both `test.yml` and `auto-tag.yml`) was installing system packages from scratch
on each job invocation: `apt-get update`, Tauri system libs, Node.js via nodesource, and in
the arm64 job — a full `rustup` install. This was the primary cause of slow builds.
The repository already contains pre-baked builder Docker images (`.docker/Dockerfile.*`) and a
`build-images.yml` workflow to push them to the local Gitea registry at `gitea.tftsr.com:3000`.
These images were never referenced by the actual CI jobs — a critical gap. This work closes
that gap and adds `actions/cache@v3` for Cargo and npm.
## Acceptance Criteria
- [ ] `Dockerfile.linux-amd64` includes `rustfmt` and `clippy` components
- [ ] `Dockerfile.linux-arm64` includes `rustfmt` and `clippy` components
- [ ] `test.yml` Rust jobs use `gitea.tftsr.com:3000/sarman/tftsr-linux-amd64:rust1.88-node22`
- [ ] `test.yml` Rust jobs have no inline `apt-get` or `rustup component add` steps
- [ ] `test.yml` Rust jobs include `actions/cache@v3` for `~/.cargo/registry`
- [ ] `test.yml` frontend jobs include `actions/cache@v3` for `~/.npm`
- [ ] `auto-tag.yml` `build-linux-amd64` uses pre-baked `tftsr-linux-amd64` image
- [ ] `auto-tag.yml` `build-windows-amd64` uses pre-baked `tftsr-windows-cross` image
- [ ] `auto-tag.yml` `build-linux-arm64` uses pre-baked `tftsr-linux-arm64` image
- [ ] All three build jobs have no `Install dependencies` step
- [ ] All three build jobs include `actions/cache@v3` for Cargo and npm
- [ ] `docs/wiki/CICD-Pipeline.md` documents pre-baked images, cache keys, and server prerequisites
- [ ] `build-images.yml` triggered manually before merging to ensure images exist in registry
## Work Implemented
### `.docker/Dockerfile.linux-amd64`
Added `RUN rustup component add rustfmt clippy` after the existing target add line.
The `rust-fmt-check` and `rust-clippy` CI jobs now rely on these being pre-installed
in the image rather than installing them at job runtime.
### `.docker/Dockerfile.linux-arm64`
Added `&& /root/.cargo/bin/rustup component add rustfmt clippy` appended to the
existing `rustup` installation RUN command (chained with `&&` to keep it one layer).
### `.gitea/workflows/test.yml`
- **rust-fmt-check**, **rust-clippy**, **rust-tests**: switched container image from
`rust:1.88-slim``gitea.tftsr.com:3000/sarman/tftsr-linux-amd64:rust1.88-node22`.
Removed `apt-get install git` from Checkout steps (git is pre-installed in image).
Removed `apt-get install libwebkit2gtk-...` steps.
Removed `rustup component add rustfmt` and `rustup component add clippy` steps.
Added `actions/cache@v3` step for `~/.cargo/registry/index`, `~/.cargo/registry/cache`,
`~/.cargo/git/db` keyed on `Cargo.lock` hash.
- **frontend-typecheck**, **frontend-tests**: kept `node:22-alpine` image (no change needed).
Added `actions/cache@v3` step for `~/.npm` keyed on `package-lock.json` hash.
### `.gitea/workflows/auto-tag.yml`
- **build-linux-amd64**: image `rust:1.88-slim``tftsr-linux-amd64:rust1.88-node22`.
Removed Checkout apt-get install git, removed entire Install dependencies step.
Removed `rustup target add x86_64-unknown-linux-gnu` from Build step. Added cargo + npm cache.
- **build-windows-amd64**: image `rust:1.88-slim``tftsr-windows-cross:rust1.88-node22`.
Removed Checkout apt-get install git, removed entire Install dependencies step.
Removed `rustup target add x86_64-pc-windows-gnu` from Build step.
Added cargo (with `-windows-` suffix key to avoid collision) + npm cache.
- **build-linux-arm64**: image `ubuntu:22.04``tftsr-linux-arm64:rust1.88-node22`.
Removed Checkout apt-get install git, removed entire Install dependencies step (~40 lines).
Removed `. "$HOME/.cargo/env"` (PATH already set via `ENV` in Dockerfile).
Removed `rustup target add aarch64-unknown-linux-gnu` from Build step.
Added cargo (with `-arm64-` suffix key) + npm cache.
### `docs/wiki/CICD-Pipeline.md`
Added two new sections before the Test Pipeline section:
- **Pre-baked Builder Images**: table of all three images and their contents, rebuild
triggers, how-to-rebuild instructions, and the insecure-registries Docker daemon
prerequisite for gitea.tftsr.com.
- **Cargo and npm Caching**: documents the `actions/cache@v3` key patterns in use,
including the per-platform cache key suffixes for cross-compile jobs.
Updated the Test Pipeline section to reference the correct pre-baked image name.
Updated the Release Pipeline job table to show which image each build job uses.
## Testing Needed
1. **Pre-build images** (prerequisite): Trigger `build-images.yml` via `workflow_dispatch`
on Gitea Actions UI. Confirm all 3 images are pushed and visible in the registry.
2. **Server prerequisite**: Confirm `/etc/docker/daemon.json` on `gitea.tftsr.com` contains
`{"insecure-registries":["gitea.tftsr.com:3000"]}` and Docker was restarted after.
3. **PR test suite**: Open a PR with these changes. Verify:
- All 5 test jobs pass (`rust-fmt-check`, `rust-clippy`, `rust-tests`,
`frontend-typecheck`, `frontend-tests`)
- Job logs show no `apt-get` or `rustup component add` output
- Cache hit messages appear on second run
4. **Release build**: Merge to master. Verify `auto-tag.yml` runs and:
- All 3 Linux/Windows build jobs start without Install dependencies step
- Artifacts are produced and uploaded to the Gitea release
- Total release time is significantly reduced (~7 min vs ~25 min before)
5. **Expected time savings after caching warms up**:
| Job | Before | After |
|-----|--------|-------|
| rust-fmt-check | ~2 min | ~20 sec |
| rust-clippy | ~4 min | ~45 sec |
| rust-tests | ~5 min | ~1.5 min |
| frontend-typecheck | ~2 min | ~30 sec |
| frontend-tests | ~3 min | ~40 sec |
| build-linux-amd64 | ~10 min | ~3 min |
| build-windows-amd64 | ~12 min | ~4 min |
| build-linux-arm64 | ~15 min | ~4 min |
| PR test total (parallel) | ~5 min | ~1.5 min |
| Release total | ~25 min | ~7 min |