# NOTES — compose-image digest pinning (deferred prevention item #4)

**Date:** 2026-05-20
**Author:** v2-deploy-coordination session (kua-deploy-verify branch)
**Status:** design sketch, **not implemented**; needs platform-wide coordination.

## Why this matters

Today every app's `docker-compose.yml` references its app image by **tag**,
typically `:latest`. `docker compose up -d` (without `--force-recreate`)
is then a legal no-op when the tag string is unchanged, *even if the
digest behind that tag changed*. That is structurally why kua-deploy's
`deploy` step reported a false success on muralla on 2026-05-19: the build
pushed `muralla-muralla:latest` to a new digest, but the running container
was still bound to the previous digest, and `up -d` saw "no change."

The other prevention items in this branch (post-deploy SHA verify,
runtime-status endpoint, release-app post-verify) **detect** the failure
after the fact. **Digest pinning would prevent it at the compose level**:
the reference itself changes every build, so compose has no choice
but to recreate.

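The stale-container condition described above can be checked by comparing the image ID a container was created from with the image ID its tag currently resolves to. The following is a minimal sketch, assuming the Docker CLI is on `PATH`. The names `docker_cli`, `container_is_stale`, and the injectable `run` parameter are hypothetical and are not taken from kua-deploy; the branch's actual verifier may work differently.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes docker CLI arguments and returns trimmed stdout.
Runner = Callable[[Sequence[str]], str]


def docker_cli(args: Sequence[str]) -> str:
    """Run a docker command and return its trimmed stdout."""
    result = subprocess.run(
        ["docker", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def container_is_stale(
    container: str, image_ref: str, run: Runner = docker_cli
) -> bool:
    """Return True when the container's image differs from what image_ref resolves to."""
    # Image ID the container was created from.
    running = run(["inspect", "--format", "{{.Image}}", container])
    # Image ID the tag (or other reference) points at right now.
    current = run(["image", "inspect", "--format", "{{.Id}}", image_ref])
    return running != current
```

Passing a fake `run` callable keeps the comparison testable without a Docker daemon; in a real deploy the default runner would shell out to `docker`.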
## The design

1. **Build step writes the just-built digest to a small artifact**, e.g.
   `/root/apps/<app>/.deploy/<svc>.sha`, with
   contents = `sha256:655fe0c64391...`. Captured immediately after
   `docker compose build` from `docker compose images --quiet <svc>`.

2. **Deploy step rewrites compose with the pinned digest** before `up`,
   either:
   - **In-place rewrite** of `docker-compose.yml`: replace
     `image: muralla-muralla:latest` with `image: muralla-muralla@<sha>`.
     Risk: dirties Bruno's working tree → trips the existing dirty-tree
     gate. Mitigation: write a temporary `docker-compose.deploy.yml`
     instead and pass it to compose via `-f`.
   - **Override file**: keep `docker-compose.yml` untouched; generate a
     sibling `docker-compose.override.deploy.yml` per build that contains
     only `services: <svc>: image: muralla-muralla@<sha>`. Apply via
     `docker compose -f docker-compose.yml -f docker-compose.override.deploy.yml up -d`.
     Cleaner; doesn't dirty the primary file.

3. **Recreate is then guaranteed**: compose sees the `image:` field
   changed → must recreate. The post-deploy verifier from this branch
   becomes a belt-and-suspenders check, not the load-bearing safeguard.

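The override-file variant of steps 1 and 2 could be sketched roughly as follows. This is an illustration under assumptions, not the kua-deploy implementation: the helper names (`read_digest`, `render_override`, `compose_up_args`) are hypothetical, the digest is taken as already captured by the build step, and the sketch assumes that an `image@digest` reference resolves on the deploy host, which these notes do not verify.

```python
import re
from pathlib import Path

# A full sha256 digest: the prefix plus 64 lowercase hex characters.
DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def read_digest(artifact: Path) -> str:
    """Read and validate the digest the build step wrote to the artifact file."""
    digest = artifact.read_text().strip()
    if not DIGEST_RE.fullmatch(digest):
        raise ValueError(f"not a full sha256 digest: {digest!r}")
    return digest


def render_override(service: str, image: str, digest: str) -> str:
    """Return override YAML that pins a single service to image@digest."""
    if not DIGEST_RE.fullmatch(digest):
        raise ValueError(f"not a full sha256 digest: {digest!r}")
    return f"services:\n  {service}:\n    image: {image}@{digest}\n"


def compose_up_args(base: str, override: str) -> list[str]:
    """Build the compose invocation that layers the override on the base file."""
    return ["docker", "compose", "-f", base, "-f", override, "up", "-d"]
```

Because only the app's own service appears in the rendered override, other services in the base file keep whatever `image:` value they already have.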
## Why this is deferred

- Touches **every app's deploy path**: kua-cashier, muralla,
  muralla-socials, atlas, playgram, coder-core, kua-mail, etc. Each
  needs its compose conventions checked (some may already pin digests;
  some may use private registries with digest-bound auth).
- Interacts with the **dual-SoT** (dual source of truth) between
  `bin/webhook-repos.json` and
  `services/kua-deploy/deploy-registry.json` (the coordinator broadcast
  called this out; cadencia is mid-untangling it).
- The post-deploy verify alone closes the false-success class for now;
  digest pinning is the *durable* fix layered on top.
- Stateful services (postgres, redis) explicitly do **not** want digest
  pinning at the compose level; they should drift across kua-deploy
  releases. The override-file pattern naturally limits pinning to the
  app's own services.

## Recommended owner / sequencing

- Land this branch (`chore/kua-deploy-sha-verify-and-release-app-post`)
  first, so that the verify step catches false successes cleanly.
- Sequence after the coordinator's `chore/deploy-mode-default-direct-and-docs`
  branch (cadencia) so the webhook→direct migration is in place.
- Then a separate focused branch implements the override-file pattern
  in kua-deploy plus a per-app smoke test (one app at a time; kua-cashier
  is a good first candidate because it is not yet serving production
  traffic).
- Add an integration test that simulates the compose no-op (build a new
  image, run deploy without bumping the tag) and confirms that the
  override-file variant recreates while the plain-tag variant would not.

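The shape of the integration test in the last bullet can be illustrated with a toy in-memory model. This is only a sketch of the assertion structure: `FakeCompose` is a hypothetical stand-in that encodes the behavior described in these notes (recreate only when the reference string changes) and is not Docker Compose's actual decision logic. A real test would build an image and drive the deploy path end to end.

```python
from dataclasses import dataclass


@dataclass
class FakeCompose:
    """Toy stand-in for compose, modeling the no-op described in these notes."""

    running_ref: str     # image reference string the container was created with
    running_digest: str  # digest the container is actually running

    def up(self, image_ref: str, built_digest: str) -> bool:
        """Recreate only if the reference string changed; return whether it did."""
        if image_ref == self.running_ref:
            return False  # same string: treated as "no change"
        self.running_ref = image_ref
        self.running_digest = built_digest
        return True


def test_override_recreates_and_plain_tag_does_not() -> None:
    old = "sha256:" + "a" * 64
    new = "sha256:" + "b" * 64

    # Plain-tag variant: a new build exists, but the tag string is unchanged.
    plain = FakeCompose("muralla-muralla:latest", old)
    assert plain.up("muralla-muralla:latest", new) is False
    assert plain.running_digest == old  # still on the previous digest

    # Override-file variant: the reference itself carries the new digest.
    pinned = FakeCompose("muralla-muralla:latest", old)
    assert pinned.up(f"muralla-muralla@{new}", new) is True
    assert pinned.running_digest == new
```

The two halves mirror the two deploy variants, so the same assertions could later be pointed at a real compose project instead of the fake.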
## Open questions

- Should we keep `:latest` as the *human-readable* tag and *additionally*
  pin by digest in the override? (Yes: operators still want
  `muralla-muralla:latest` as a stable rollback anchor.)
- Where does the override file live across the deploy lifecycle?
  Tempting: in `/root/apps/<app>/.deploy/` (git-ignored), rotated per
  build, kept for N builds for fast rollback.
- Does the existing `release-app` workflow pass an explicit deploy SHA
  that the build step could record alongside the image digest? (Yes:
  the `deployCommit` variable in kua-deploy `server.js` already carries
  this.)

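If the override files do end up in a per-app `.deploy/` directory, the "kept for N builds" rotation could look something like the sketch below. The `override-*.yml` naming scheme, the `prune_overrides` helper, and the use of file modification time as build order are all assumptions made for illustration; none of them is decided in these notes.

```python
from pathlib import Path


def prune_overrides(
    deploy_dir: Path, keep: int, pattern: str = "override-*.yml"
) -> list[Path]:
    """Delete all but the `keep` newest override files; return the deleted paths."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Newest first, using modification time as a proxy for build order.
    files = sorted(
        deploy_dir.glob(pattern), key=lambda p: p.stat().st_mtime, reverse=True
    )
    stale = files[keep:]
    for path in stale:
        path.unlink()
    return stale
```

Files that do not match the pattern, such as a `<svc>.sha` artifact in the same directory, are left alone.
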
— end notes —