Pre start init containers#13862
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
ff47ea5 to
8e44596
Compare
|
/review |
|
❌ PR Review Failed — The review agent encountered an error and could not complete the review. View logs. |
8e44596 to
4634251
Compare
ndeloof
left a comment
There was a problem hiding this comment.
LGTM. I wonder we could get this managed by reconciliation, so orchestrating init container is part of the reconciliation plan, this can come later in a follow-up PR
4634251 to
2a3f178
Compare
There was a problem hiding this comment.
Pull request overview
Adds support for the Compose spec pre_start lifecycle hook by running hooks as ephemeral containers before a service’s containers start, gating service startup on hook success.
Changes:
- Introduces a
pre_startexecution path that creates/runs ephemeral hook containers attached to the service networks and sharing volumes viaVolumesFrom. - Integrates
pre_startinto service start convergence logic (runs once per service when no replica is already running). - Adds unit tests plus E2E coverage and fixtures for success/failure, ordering, idempotency, spec-change reruns, force-recreate, and scaling semantics.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/compose/pre_start.go | Implements pre_start hook container creation, network attachment, log streaming, and exit-code gating. |
| pkg/compose/pre_start_test.go | Unit tests validating sequencing, failure behavior, image fallback, and container configuration. |
| pkg/compose/convergence.go | Hooks pre_start into service startup flow and refactors start logic into startServiceContainer. |
| pkg/compose/containers.go | Adds isNotRunning predicate for container filtering during startup. |
| pkg/e2e/hooks_test.go | Adds E2E scenarios covering pre_start behavior across success/failure and various up flows. |
| pkg/e2e/fixtures/pre_start/Dockerfile | Fixture image used to validate build-image inheritance for hooks. |
| pkg/e2e/fixtures/pre_start/compose-success.yaml | E2E fixture for a successful pre_start writing init state consumed by the service. |
| pkg/e2e/fixtures/pre_start/compose-error.yaml | E2E fixture for a failing pre_start that must gate service startup. |
| pkg/e2e/fixtures/pre_start/compose-build.yaml | E2E fixture validating hook image fallback to the service’s built image. |
| pkg/e2e/fixtures/pre_start/idempotent/compose.yaml | Fixture for idempotent re-up and scale-up behavior (hook should not rerun). |
| pkg/e2e/fixtures/pre_start/mid-failure/compose.yaml | Fixture validating sequential hook execution with mid-sequence failure. |
| pkg/e2e/fixtures/pre_start/sequential/compose.yaml | Fixture validating strict sequential ordering of multiple hooks. |
| pkg/e2e/fixtures/pre_start/scaled/compose.yaml | Fixture validating “run once per service” semantics for a scaled service. |
| pkg/e2e/fixtures/pre_start/spec-change/compose.v1.yaml | Fixture for spec-change rerun behavior (v1). |
| pkg/e2e/fixtures/pre_start/spec-change/compose.v2.yaml | Fixture for spec-change rerun behavior (v2). |
| probe := c.RunDockerCmd(t, "run", "--rm", "-v", projectName+"_data:/mnt", "alpine", "wc", "-l", "/mnt/tokens.txt") | ||
| assert.Assert(t, strings.Contains(strings.TrimSpace(probe.Stdout()), "1"), "expected 1 token line after first up, got: %s", probe.Stdout()) |
2a3f178 to
5d892ac
Compare
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
This PR adds well-structured pre_start lifecycle hook support with good test coverage (8 unit tests + 10 E2E tests). One confirmed resource leak on older API versions and one low-severity observation about log goroutine cleanup are noted below.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
Two medium-severity issues found in pkg/compose/pre_start.go, both in the new hook execution path. The overall design is solid — cleanup on error paths is well-handled, the deterministic replica selection for VolumesFrom is correct, and the test coverage is thorough. The two items below are edge-case robustness concerns in the wait/cleanup logic.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🔴 CRITICAL
This PR adds well-structured pre_start lifecycle hook support with solid test coverage (8 unit tests + 10 E2E tests). One high-severity race condition in waitPreStart can cause intermittent spurious failures for healthy hooks, two medium findings relate to goroutine blocking and missing Compose labels on hook containers, and one low finding is a silent discard of a cleanup error. All findings are in new code introduced by this PR.
3187afa to
519c329
Compare
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
The pre_start lifecycle hook implementation is well-structured and carefully handles the tricky aspects of ephemeral container management:
- Correct lifecycle ordering:
ContainerWaitis subscribed beforeContainerStartto avoid missing exit events for fast-exiting hooks, and the log stream is opened before start to race-safely avoid AutoRemove-vs-404. - Orphan cleanup on failure: If
ContainerStartfails, the never-started hook container is explicitly force-removed (sinceAutoRemoveonly fires after a successful start), with a warn-level log if removal itself fails. waitPreStartrace handling: The two-channel ContainerWait race (nil-on-Error coinciding with Result) is handled correctly. TheerrCh = nilpattern after a clean nil receive correctly prevents a busy-loop on a closed channel; since Docker's Error channel delivers at most one event, no real error can arrive after the nil. The non-blocking post-Result check still catches concurrent real errors when errCh has not yet been nil'd.- Log stream cancellation: A dedicated
logCtxprevents a daemon that holds the log stream open from deadlocking<-logsDoneafter the hook exits. - Skip condition:
len(serviceContainers) == len(toStart)correctly gates the hook to initial start (no replicas running), and is skipped on scale-up as specified.
No bugs introduced by this PR were confirmed. The implementation is solid.
runPreStart executes a service's pre_start hooks sequentially as ephemeral containers that share the first non-running replica's volumes via VolumesFrom and attach to the same networks. A non-zero hook exit gates service start. per_replica: false is the only currently supported mode; per_replica: true is rejected up front. The donor replica is the lowest-numbered one so the choice is deterministic. ContainerWait uses WaitConditionNextExit, and the wait loop deterministically handles the daemon's clean-close (nil on Error + exit code on Result) and transport-error races to avoid spurious hook failures. The log stream is opened before ContainerStart to avoid racing AutoRemove on fast-exiting hooks, and runs under a derived context so a daemon that keeps the connection open cannot deadlock the call. Hook containers carry project/service/version labels; the two cleanup paths force-remove the never-started container explicitly and warn when removal fails. pre_start runs once per service when no replica is already running (initial up, force-recreate or spec change), and is skipped on scale-up so additional replicas don't re-trigger the hooks. Coverage: 11 unit tests (including scheduler-race stress) with goroutine-leak verification via goleak, plus 10 E2E tests (success path, hook failure gating start, build-image inheritance, idempotent re-up, spec change, force-recreate, mid-sequence failure, ordering, scale-up, scaled service). Signed-off-by: Guillaume Lours <glours@users.noreply.github.com>
519c329 to
354e9ea
Compare
<!--Delete sections as needed --> ## Description As per docker/compose#13862 and compose-spec/compose-spec#647 ## Related issues or tickets <!-- Related issues, pull requests, or Jira tickets --> ## Reviews <!-- Notes for reviewers here --> <!-- List applicable reviews (optionally @tag reviewers) --> - [ ] Technical review - [ ] Editorial review - [ ] Product review --------- Co-authored-by: Guillaume Lours <705411+glours@users.noreply.github.com>
What I did
Add support for the
pre_startservice lifecycle hook from the compose spec.Hooks run as ephemeral containers before a service's containers start, sharing the first replica's volumes via
VolumesFromand attached to the same networks. A non-zero hook exit gates service start.Semantics:
up,--force-recreate, spec change).scale up.per_replica: trueis rejected up front; onlyper_replica: falseis supported for now.build:); overridable per hook.Covered by 8 unit tests and 10 E2E tests (success, failure gating, build inheritance, idempotent re-up, spec change, force-recreate, mid-sequence failure, ordering, scale-up, scaled service).
Note: the first commit adds a temporary
replaceto acompose-gofork exposingPreStartonServiceConfig. To be reverted once a propercompose-gorelease lands.Related issue
compose-spec/compose-spec#11 & compose-spec/compose-spec#291
(not mandatory) A picture of a cute animal, if possible in relation to what you did
