fix(tests): retry rmdir to clear Windows EBUSY flake in updater-integration#7728
Merged
Merged
Conversation
…ration
The Windows backend-test job has been intermittently red on `crash-loop
guard: bootCount=3 forces immediate rollback` (and other cases in
`updater-integration.ts`) with:
Error: EBUSY: resource busy or locked, rmdir
'C:\Users\RUNNER~1\AppData\Local\Temp\updater-it-...'
Each `it()` builds a temp git repo via `execSync('git ...')` and cleans
up in a `try…finally` with `fs.rm(dir, {recursive: true, force: true})`.
On Windows, git child processes can briefly hold file handles after
exit (NTFS lazy-release / antivirus scan / pack-file handles), so the
first rmdir attempt hits EBUSY. `fs.rm`'s default `maxRetries` is 0, so
there is no recovery and the test errors out.
Hoist the cleanup to a single `cleanupTmp()` helper that passes
`maxRetries: 10, retryDelay: 100` (a built-in `fs.rm` capability since
Node 14.14). On Linux/macOS this is a no-op — there's nothing to retry.
On Windows it absorbs the transient lock.
No production code touched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one. |
Review Summary by QodoFix Windows EBUSY flake in updater-integration tests with retry logic
WalkthroughsDescription• Introduce cleanupTmp() helper with retry logic for Windows EBUSY flake • Configure maxRetries: 10, retryDelay: 100 to handle transient file locks • Replace all direct fs.rm() calls with new helper in test cleanup blocks • Resolves intermittent test failures in updater-integration on Windows Diagramflowchart LR
A["Windows git process<br/>holds file handles"] -->|transient lock| B["fs.rm fails<br/>with EBUSY"]
C["New cleanupTmp()<br/>helper"] -->|maxRetries: 10| D["Retry rmdir<br/>on Windows"]
D -->|success| E["Test cleanup<br/>completes"]
B -.->|before fix| F["Test fails"]
E -.->|after fix| G["Test passes"]
File Changes1. src/tests/backend/specs/updater-integration.ts
|
Windows CI failure on this branch surfaced a *second* flake in the same
file (`crash-loop guard: bootCount=3 forces immediate rollback`):
TypeError: Cannot read properties of undefined (reading 'execution')
at updater-integration.ts:230
`checkPendingVerification` kicks off `performRollback` as fire-and-forget
(`void performRollback(s, deps).catch(...)`), and the test waits a flat
250 ms before asserting `states.at(-1)!.execution.status === 'rolled-back'`.
On Linux 250 ms is plenty. On Windows, git checkout + spawned-process
bookkeeping regularly push past that — so `saveState` hasn't fired yet
and `states` is empty.
This race was previously masked: the test's `finally` ran `fs.rm`,
which threw EBUSY against handles still held by the in-flight rollback,
and JS's "finally-throws-override-try-throws" semantics meant mocha
reported the EBUSY rather than the underlying TypeError. The retry-rm
patch on this branch unmasked it.
Replace the flat sleep with condition-based polling (25 ms tick, 10 s
ceiling) for a terminal state (`rolled-back` | `rollback-failed`). The
existing `assert.equal(... 'rolled-back')` still runs, giving a clean
diff if rollback landed on the failure side instead. Linux runtime
drops 329 ms → 104 ms because the poll exits as soon as the state
lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
EBUSY: resource busy or locked, rmdiragainst the temp-repo dirs created insrc/tests/backend/specs/updater-integration.ts. Failing case most often: "crash-loop guard: bootCount=3 forces immediate rollback" — but everyit()in the file is exposed.execSync('git …')and cleans up intry…finallywithfs.rm(dir, {recursive: true, force: true}). On Windows, git child processes can briefly hold file handles after exit (NTFS lazy-release / antivirus scan / pack files), so the first rmdir hitsEBUSY.fs.rm's defaultmaxRetriesis 0 — no recovery, test errors.cleanupTmp()helper that passesmaxRetries: 10, retryDelay: 100. Built-infs.rmcapability since Node 14.14. On Linux/macOS it is a no-op (nothing to retry); on Windows it absorbs the transient lock.Test plan
cd src && npx mocha --import=tsx --timeout 120000 tests/backend/specs/updater-integration.ts→ 5/5 pass locally on Linux.pnpm ts-checkreports no new errors for this file (pre-existing plugin_packages errors unrelated).Notes
The brief also flagged a separate "shutdown hook leak" line that prints right after the rmdir error. That message is a consequence of the assertion failure propagating before the test's own server shuts down cleanly; once the rmdir succeeds, the leak warning goes away. Separate investigation deferred.
🤖 Generated with Claude Code