98 lines
5.6 KiB
Markdown
98 lines
5.6 KiB
Markdown
|
|
# KohakuHub Deployment Summary
|
||
|
|
|
||
|
|
## Description
|
||
|
|
|
||
|
|
Deployed KohakuHub (a self-hosted HuggingFace-compatible model hub) on the existing Docker infrastructure at `172.0.0.29`, with NGINX reverse proxy at `172.0.0.30` and CoreDNS at `172.0.0.29`.
|
||
|
|
|
||
|
|
**Stack deployed:**
|
||
|
|
|
||
|
|
| Service | Image | Port(s) | Purpose |
|
||
|
|
|----------------------|---------------------------|--------------------------|------------------------|
|
||
|
|
| hub-ui | nginx:alpine | 28080:80 | Vue.js frontend SPA |
|
||
|
|
| hub-api | built from source | 127.0.0.1:48888:48888 | Python/FastAPI backend |
|
||
|
|
| minio | quay.io/minio/minio | 29001:9000, 29000:29000 | S3-compatible storage |
|
||
|
|
| lakefs | built from ./docker/lakefs| 127.0.0.1:28000:28000 | Git-style versioning |
|
||
|
|
| kohakuhub-postgres | postgres:15 | 127.0.0.1:25432:5432 | Metadata database |
|
||
|
|
|
||
|
|
**FQDNs:**
|
||
|
|
- `ai-hub.tftsr.com` → NGINX → `172.0.0.29:28080` (web UI + API proxy)
|
||
|
|
- `ai-hub-files.tftsr.com` → NGINX → `172.0.0.29:29001` (MinIO S3 file access)
|
||
|
|
|
||
|
|
All data stored locally under `/docker_mounts/kohakuhub/`.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Acceptance Criteria
|
||
|
|
|
||
|
|
- [x] All 5 containers start and remain healthy
|
||
|
|
- [x] `https://ai-hub.tftsr.com` serves the KohakuHub Vue.js SPA
|
||
|
|
- [x] `https://ai-hub-files.tftsr.com` proxies to MinIO S3 API
|
||
|
|
- [x] DNS resolves both FQDNs to `172.0.0.30` (NGINX proxy)
|
||
|
|
- [x] hub-api connects to postgres and runs migrations on startup
|
||
|
|
- [x] MinIO `hub-storage` bucket is created automatically
|
||
|
|
- [x] LakeFS initializes with S3 blockstore backend
|
||
|
|
- [x] No port conflicts with existing stack services
|
||
|
|
- [x] Postgres container uses unique name (`kohakuhub-postgres`) to avoid conflict with `gogs_postgres_db`
|
||
|
|
- [x] API and LakeFS ports bound to `127.0.0.1` only (not externally exposed)
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Work Implemented
|
||
|
|
|
||
|
|
### Phase 1 — Docker Host (`172.0.0.29`)
|
||
|
|
|
||
|
|
1. **Downloaded KohakuHub source** via GitHub archive tarball (git not installed on host) to `/docker_mounts/kohakuhub/`
|
||
|
|
2. **Generated secrets** using `openssl rand -hex 32/16` for SESSION_SECRET, ADMIN_TOKEN, DB_KEY, LAKEFS_KEY, DB_PASS
|
||
|
|
3. **Created `/docker_mounts/kohakuhub/.env`** with `UID=1000` / `GID=1000` for LakeFS container user mapping
|
||
|
|
4. **Created `/docker_mounts/kohakuhub/docker-compose.yml`** with production configuration:
|
||
|
|
- Absolute volume paths under `/docker_mounts/kohakuhub/`
|
||
|
|
- Secrets substituted in-place
|
||
|
|
- Postgres container renamed to `kohakuhub-postgres`
|
||
|
|
- API/LakeFS/Postgres bound to `127.0.0.1` only
|
||
|
|
5. **Built Vue.js frontends** using `docker run node:22-alpine` (Node.js not installed on host):
|
||
|
|
- `src/kohaku-hub-ui/dist/` — main SPA
|
||
|
|
- `src/kohaku-hub-admin/dist/` — admin portal
|
||
|
|
6. **Started stack** with `docker compose up -d --build`; hub-api recovered from initial postgres race condition on its own
|
||
|
|
7. **Verified** all containers `Up`, API docs at `:48888/docs` returning HTTP 200, `hub-storage` bucket auto-created
|
||
|
|
|
||
|
|
### Phase 2 — NGINX Proxy (`172.0.0.30`)
|
||
|
|
|
||
|
|
1. **Created `/etc/nginx/conf.d/ai-hub.conf`** — proxies `ai-hub.tftsr.com` → `172.0.0.29:28080` with `client_max_body_size 100G`, 3600s timeouts, LetsEncrypt SSL
|
||
|
|
2. **Created `/etc/nginx/conf.d/ai-hub-files.conf`** — proxies `ai-hub-files.tftsr.com` → `172.0.0.29:29001` with same settings
|
||
|
|
3. **Resolved write issue**: initial writes via `sudo tee` with piped password produced empty files (heredoc stdin conflict); corrected by writing to `/tmp` then `sudo cp`
|
||
|
|
4. **Validated and reloaded** NGINX: `nginx -t` passes (pre-existing `ssl_stapling` warnings are environment-wide, unrelated to these configs)
|
||
|
|
|
||
|
|
### Phase 3 — CoreDNS (`172.0.0.29`)
|
||
|
|
|
||
|
|
1. **Updated `/docker_mounts/coredns/tftsr.com.db`**:
|
||
|
|
- SOA serial: `1718910701` → `2026040501`
|
||
|
|
- Appended: `ai-hub.tftsr.com. 3600 IN A 172.0.0.30`
|
||
|
|
- Appended: `ai-hub-files.tftsr.com. 3600 IN A 172.0.0.30`
|
||
|
|
2. **Reloaded CoreDNS** via `docker kill --signal=SIGHUP coredns`
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Testing Needed
|
||
|
|
|
||
|
|
### Functional
|
||
|
|
|
||
|
|
- [ ] **Browser test**: Navigate to `https://ai-hub.tftsr.com` and verify login page, registration, and model browsing work
|
||
|
|
- [ ] **Admin portal**: Navigate to `https://ai-hub.tftsr.com/admin` and verify admin dashboard is accessible with the ADMIN_TOKEN
|
||
|
|
- [ ] **Model upload**: Upload a test model file and verify it lands in MinIO under `hub-storage`
|
||
|
|
- [ ] **Git clone**: Clone a model repo via `git clone https://ai-hub.tftsr.com/<user>/<repo>.git` and verify Git LFS works
|
||
|
|
- [ ] **File download**: Verify `https://ai-hub-files.tftsr.com` properly serves file download redirects
|
||
|
|
|
||
|
|
### Infrastructure
|
||
|
|
|
||
|
|
- [ ] **Restart persistence**: `docker compose down && docker compose up -d` on 172.0.0.29 — verify all services restart cleanly and data persists
|
||
|
|
- [ ] **LakeFS credentials**: Check `/docker_mounts/kohakuhub/hub-meta/hub-api/credentials.env` for generated LakeFS credentials
|
||
|
|
- [ ] **Admin token recovery**: Run `docker logs hub-api | grep -i admin` to retrieve the admin token if needed
|
||
|
|
- [ ] **MinIO console**: Verify MinIO console accessible at `http://172.0.0.29:29000` (internal only)
|
||
|
|
- [ ] **Postgres connectivity**: `docker exec kohakuhub-postgres psql -U kohakuhub -c "\dt"` to confirm schema migration ran
|
||
|
|
|
||
|
|
### Notes
|
||
|
|
|
||
|
|
- Frontend builds used temporary `node:22-alpine` Docker containers since Node.js is not installed on `172.0.0.29`. If redeployment is needed, re-run the build steps or install Node.js on the host.
|
||
|
|
- The `ssl_stapling` warnings in NGINX are pre-existing across all vhosts on `172.0.0.30` and do not affect functionality.
|
||
|
|
- MinIO credentials (`minioadmin`/`minioadmin`) are defaults. Consider rotating via the MinIO console for production hardening.
|