Skip to content

fix: wait for shell deadlock#348

Merged
ethanpailes merged 1 commit intomasterfrom
shell-spawn-deadlock-fix
Apr 24, 2026
Merged

fix: wait for shell deadlock#348
ethanpailes merged 1 commit intomasterfrom
shell-spawn-deadlock-fix

Conversation

@ethanpailes
Copy link
Copy Markdown
Contributor

@ethanpailes ethanpailes commented Apr 24, 2026

Issue Link

This is a re-write of #342 to strip out some agent artifacts.

AI Policy Ack

Ack

Description

This patch fixes a deadlock that could occur if the shell blocks forever on startup. To fix, we only grab the global shells lock when looking for an existing session and then grab it again when we modify or examine the global shells table, but critically we are not holding the lock while spawning the shell for the new session.

This prevents a shell that never finishes starting up from locking up the whole daemon by retaining ownership of the global shells table lock.

I also added a timeout to the shell startup process, though this isn't really the core of the deadlock issue. It would have been nice to have and would have prevented perminant daemon lockup.

See #342 for context.

This patch fixes a deadlock that could occur if the shell blocks
forever on startup. To fix, we only grab the global shells lock
when looking for an existing session and then grab it again when
we modify or examine the global shells table, but critically we
are not holding the lock while spawning the shell for the new session.

This prevents a shell that never finishes starting up from
locking up the whole daemon by retaining ownership of the global
shells table lock.

See #342 for context.
@ethanpailes ethanpailes force-pushed the shell-spawn-deadlock-fix branch from efbc529 to bdb3eee Compare April 24, 2026 20:29
Copy link
Copy Markdown

@maxhbooth maxhbooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


// Even the most sluggish dotfile setup ought to be done within
// 90 seconds.
const SENTINEL_POLL_TIMEOUT: time::Duration = time::Duration::from_secs(90);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it could be worth mentioning this change in the commit, my first thought when I saw this after reading the title was that the solution was to add a timeout.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

@ethanpailes ethanpailes merged commit 5d33308 into master Apr 24, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants