fix(captions): prevent Metal memory exhaustion when generating subtitles by ManthanNimodiya · Pull Request #1949 · CapSoftware/Cap

ManthanNimodiya · 2026-06-28T08:20:06Z

On Apple Silicon, each WhisperState allocates ~700 MB of Metal (unified) memory. Without serialisation, rapid re-clicks on the subtitle button spawned N concurrent transcription sessions, exhausting RAM (44 GB observed for ~60 retries).

Changes:

Added TRANSCRIPTION_LOCK: Mutex<()> to ensure at most one transcription runs at a time
Acquired the lock in transcribe_audio before entering the engine match
Released the cached WhisperContext immediately after Whisper finishes so Metal buffers (~500 MB) are freed rather than held until the editor closes
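
As a rough illustration of the two changes listed above, here is a minimal sketch using only the standard library. It is not the code from `captions.rs`: the statics reuse the names `TRANSCRIPTION_LOCK` and `WHISPER_CONTEXT` from this PR, but the `Vec<u8>` "model", the `ACTIVE`/`PEAK` counters, and `run_concurrent_clicks` are hypothetical stand-ins, and `std::sync::Mutex` with threads is assumed in place of the app's async runtime.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

// Names borrowed from the PR; the contents are placeholders for this sketch.
static TRANSCRIPTION_LOCK: Mutex<()> = Mutex::new(());
static WHISPER_CONTEXT: Mutex<Option<Vec<u8>>> = Mutex::new(None);

// Demo-only instrumentation: how many sessions run at once, and the peak.
static ACTIVE: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);

fn transcribe_audio(job: usize) -> String {
    // Hold the guard for the whole run so at most one session exists at a time.
    let _guard = TRANSCRIPTION_LOCK.lock().unwrap_or_else(|p| p.into_inner());

    let now = ACTIVE.fetch_add(1, Ordering::SeqCst) + 1;
    PEAK.fetch_max(now, Ordering::SeqCst);

    // Load the (fake) model into the cache if it is not already there.
    {
        let mut ctx = WHISPER_CONTEXT.lock().unwrap_or_else(|p| p.into_inner());
        if ctx.is_none() {
            *ctx = Some(vec![0u8; 1024]);
        }
    }
    thread::sleep(Duration::from_millis(5)); // stand-in for inference

    // Release the cached context right after the run so its buffers are freed.
    *WHISPER_CONTEXT.lock().unwrap_or_else(|p| p.into_inner()) = None;

    ACTIVE.fetch_sub(1, Ordering::SeqCst);
    format!("captions for job {job}")
}

/// Simulates rapid re-clicks and returns the peak number of concurrent sessions.
fn run_concurrent_clicks(clicks: usize) -> usize {
    let handles: Vec<_> = (0..clicks)
        .map(|i| thread::spawn(move || transcribe_audio(i)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    PEAK.load(Ordering::SeqCst)
}

fn main() {
    let peak = run_concurrent_clicks(8);
    println!("peak concurrent sessions: {peak}");
}
```

Because every session takes the same lock before touching the model, the peak never exceeds one regardless of how many clicks arrive.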

Greptile Summary

This PR changes subtitle transcription to reduce ML memory pressure. The main changes are:

A global mutex serializes Whisper and Parakeet transcription.
Whisper transcription releases the cached context after each run.
The comments document the Apple Silicon Metal memory spike that motivated the change.

Confidence Score: 4/5

The cancellation path for transcription needs a fix before merging.

Dropping the async command can release the global slot while the blocking ML worker continues.
A retry can then start a second worker and recreate overlapping memory-heavy sessions.
The unconditional Whisper cache eviction is a smaller cross-platform performance regression.

apps/desktop/src-tauri/src/captions.rs

Important Files Changed

Filename	Overview
apps/desktop/src-tauri/src/captions.rs	Adds transcription serialization and Whisper cache eviction, but the lock can be released before the blocking ML worker exits if the command is cancelled.

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/desktop/src-tauri/src/captions.rs:1119
**Cancelled Commands Release The Slot**

When the async command is dropped while the `spawn_blocking` transcription is still running, this guard is dropped but the blocking worker keeps using the ML resources. A retry can then acquire `TRANSCRIPTION_LOCK` and start another Whisper or Parakeet worker, so cancel-and-retry or window-close-and-retry can still create overlapping transcription sessions and hit the same memory exhaustion this lock is meant to prevent.

### Issue 2 of 2
apps/desktop/src-tauri/src/captions.rs:1158-1159
**Whisper Cache Always Evicted**

This clears the cached Whisper context after every Whisper run on all platforms, although the memory issue described here is specific to Apple Silicon Metal buffers. On Windows, Linux, and Intel macOS, repeated subtitle generation now reloads the Whisper model from disk each time instead of reusing the warmed `WHISPER_CONTEXT`, causing avoidable latency and memory churn with no Metal-memory benefit.

Reviews (1): Last reviewed commit: "fix(captions): serialise transcription a..."

Greptile also left 2 inline comments on this PR.

Context used: CLAUDE.md, AGENTS.md

tembo · 2026-06-28T08:22:34Z

+            // Release the cached context immediately after use so Metal buffers
+            // (~500 MB on Apple Silicon) are freed rather than held until the
+            // editor closes.  The next call will reload the model as needed.
+            {
+                let mut ctx = WHISPER_CONTEXT.lock().await;
+                *ctx = None;
+            }


Releasing the cached context on every platform means we’ll reload the model for every Whisper run, which is a potential performance regression everywhere except Apple Silicon. If the memory pressure issue is specific to Apple Silicon, consider gating this to macOS on aarch64 (`target_os = "macos"`, `target_arch = "aarch64"`).

Suggested change

-            // Release the cached context immediately after use so Metal buffers
-            // (~500 MB on Apple Silicon) are freed rather than held until the
-            // editor closes.  The next call will reload the model as needed.
-            {
-                let mut ctx = WHISPER_CONTEXT.lock().await;
-                *ctx = None;
-            }
+            // Release the cached context immediately after use so Metal buffers
+            // (~500 MB on Apple Silicon) are freed rather than held until the
+            // editor closes.  The next call will reload the model as needed.
+            #[cfg(all(target_os = "macos", target_arch = "aarch64"))]
+            {
+                let mut ctx = WHISPER_CONTEXT.lock().await;
+                *ctx = None;
+            }
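
To make the effect of platform gating concrete, here is a small self-contained sketch. It assumes a hypothetical `run_whisper` helper and a `String` standing in for the cached model, and it uses the `cfg!` macro (rather than the `#[cfg]` attribute in the suggestion) so the gate is visible as an ordinary boolean.

```rust
/// True only when compiled for macOS on aarch64 (Apple Silicon).
const EVICT_AFTER_RUN: bool = cfg!(all(target_os = "macos", target_arch = "aarch64"));

/// Hypothetical stand-in for one Whisper run. Returns whether the cached
/// model was reused (true) or had to be loaded first (false).
fn run_whisper(cache: &mut Option<String>) -> bool {
    let reused = cache.is_some();
    if !reused {
        // Stand-in for loading the model from disk.
        *cache = Some(String::from("loaded model"));
    }

    // ... inference would run here ...

    // Only pay the reload cost on the platform where holding the
    // context is the problem.
    if EVICT_AFTER_RUN {
        *cache = None;
    }
    reused
}

fn main() {
    let mut cache = None;
    let first = run_whisper(&mut cache);
    let second = run_whisper(&mut cache);
    println!("evict: {EVICT_AFTER_RUN}, first reused: {first}, second reused: {second}");
}
```

On other targets the second run reuses the warm cache; on Apple Silicon it reloads, trading latency for lower resident memory.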

tembo · 2026-06-28T08:22:42Z

+    // WhisperState / Parakeet session exists at a time.  Without this, rapid
+    // re-clicks spawn N concurrent sessions each consuming ~700 MB of Metal
+    // (unified) memory on Apple Silicon, which produced the observed 44 GB spike.
+    let _transcription_guard = TRANSCRIPTION_LOCK.lock().await;


Minor thought: if rapid re-clicks also trigger redundant extract_audio_from_video work, you may want to acquire TRANSCRIPTION_LOCK earlier (before extraction) so only one click does the full pipeline at a time. Current placement still prevents concurrent model sessions, but multiple extractions can run in parallel.
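
A hedged sketch of the placement trade-off described above, using threads and sleeps as stand-ins for `extract_audio_from_video` and the model session. The function `peak_extractions` and its parameters are hypothetical and exist only to measure how many simulated extractions overlap.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Runs `clicks` pipelines concurrently and returns the peak number of
/// simultaneous (simulated) audio extractions.
fn peak_extractions(clicks: usize, lock_before_extraction: bool) -> usize {
    let lock = Arc::new(Mutex::new(()));
    let active = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..clicks)
        .map(|_| {
            let (lock, active, peak) = (lock.clone(), active.clone(), peak.clone());
            thread::spawn(move || {
                // Early placement: take the lock before any work starts.
                let early = if lock_before_extraction {
                    Some(lock.lock().unwrap())
                } else {
                    None
                };

                // Simulated audio extraction.
                let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5));
                active.fetch_sub(1, Ordering::SeqCst);

                // The model session is serialised in both variants.
                let _session = match early {
                    Some(guard) => guard,
                    None => lock.lock().unwrap(),
                };
                thread::sleep(Duration::from_millis(1)); // simulated inference
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}

fn main() {
    println!("lock before extraction: peak {}", peak_extractions(6, true));
    println!("lock after extraction:  peak {}", peak_extractions(6, false));
}
```

With the lock taken first, extractions cannot overlap; with the lock taken afterwards, the peak depends on scheduling and can exceed one.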

greptile-apps · 2026-06-28T08:23:17Z

+    // WhisperState / Parakeet session exists at a time.  Without this, rapid
+    // re-clicks spawn N concurrent sessions each consuming ~700 MB of Metal
+    // (unified) memory on Apple Silicon, which produced the observed 44 GB spike.
+    let _transcription_guard = TRANSCRIPTION_LOCK.lock().await;


Cancelled Commands Release The Slot

When the async command is dropped while the spawn_blocking transcription is still running, this guard is dropped but the blocking worker keeps using the ML resources. A retry can then acquire TRANSCRIPTION_LOCK and start another Whisper or Parakeet worker, so cancel-and-retry or window-close-and-retry can still create overlapping transcription sessions and hit the same memory exhaustion this lock is meant to prevent.
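
One way to address this, and what the later commit title "move serialisation lock into spawn_blocking" suggests, is to acquire the guard inside the blocking worker so that it lives exactly as long as the ML work. The sketch below assumes plain threads and channels in place of `spawn_blocking`; `start_worker` and `retry_blocked_while_worker_runs` are hypothetical names for illustration.

```rust
use std::sync::mpsc;
use std::sync::Mutex;
use std::thread;

static TRANSCRIPTION_LOCK: Mutex<()> = Mutex::new(());

/// Starts a worker that owns the guard for its whole lifetime.
/// Returns a "started" receiver, a "finish" sender, and the join handle.
fn start_worker() -> (mpsc::Receiver<()>, mpsc::Sender<()>, thread::JoinHandle<()>) {
    let (started_tx, started_rx) = mpsc::channel();
    let (finish_tx, finish_rx) = mpsc::channel::<()>();
    let handle = thread::spawn(move || {
        // The guard is acquired inside the worker, not by the caller.
        let _guard = TRANSCRIPTION_LOCK.lock().unwrap_or_else(|p| p.into_inner());
        started_tx.send(()).unwrap();
        // Stand-in for inference: run until told to finish.
        let _ = finish_rx.recv();
    });
    (started_rx, finish_tx, handle)
}

/// Returns (lock unavailable while the worker runs, lock free after it exits).
fn retry_blocked_while_worker_runs() -> (bool, bool) {
    let (started, finish, handle) = start_worker();
    started.recv().unwrap();

    // The caller holds nothing here, as if its future had been dropped.
    // A retry still cannot take the slot because the worker owns the guard.
    let blocked_during = TRANSCRIPTION_LOCK.try_lock().is_err();

    finish.send(()).unwrap();
    handle.join().unwrap();
    let free_after = TRANSCRIPTION_LOCK.try_lock().is_ok();
    (blocked_during, free_after)
}

fn main() {
    let (blocked, free) = retry_blocked_while_worker_runs();
    println!("blocked while running: {blocked}, free afterwards: {free}");
}
```

The design point is ownership: whichever task holds the guard defines the lifetime of the slot, so the guard belongs with the code that actually uses the memory.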

greptile-apps · 2026-06-28T08:23:18Z

+                let mut ctx = WHISPER_CONTEXT.lock().await;
+                *ctx = None;


Whisper Cache Always Evicted

This clears the cached Whisper context after every Whisper run on all platforms, although the memory issue described here is specific to Apple Silicon Metal buffers. On Windows, Linux, and Intel macOS, repeated subtitle generation now reloads the Whisper model from disk each time instead of reusing the warmed WHISPER_CONTEXT, causing avoidable latency and memory churn with no Metal-memory benefit.

tembo · 2026-06-28T12:16:15Z

-                .await
-                .map_err(|e| format!("Parakeet task panicked: {e}"))?
+            tokio::task::spawn_blocking(move || {
+                let _guard = TRANSCRIPTION_LOCK.lock().unwrap();


std::sync::Mutex::lock() returns Err(PoisonError) if a prior transcription panicked while holding the lock; unwrap() would then panic on every later call and break subtitles until the app restarts. Might be worth recovering here (same applies to the Whisper lock below).

Suggested change

-                let _guard = TRANSCRIPTION_LOCK.lock().unwrap();
+                let _guard = TRANSCRIPTION_LOCK
+                    .lock()
+                    .unwrap_or_else(|poisoned| poisoned.into_inner());
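
The recovery pattern can be checked in isolation. This is a minimal sketch assuming a `Mutex<()>` that protects no data, as in this PR, so taking the inner guard after a panic is safe; `poison_lock` and `lock_recovering` are hypothetical helpers.

```rust
use std::sync::Mutex;
use std::thread;

static TRANSCRIPTION_LOCK: Mutex<()> = Mutex::new(());

/// Poisons the lock by panicking in a thread that holds the guard.
fn poison_lock() {
    let result = thread::spawn(|| {
        let _guard = TRANSCRIPTION_LOCK.lock().unwrap_or_else(|p| p.into_inner());
        panic!("simulated transcription panic");
    })
    .join();
    assert!(result.is_err());
}

/// Acquires the lock even if it is poisoned. Returns whether it was poisoned.
fn lock_recovering() -> bool {
    let was_poisoned = TRANSCRIPTION_LOCK.is_poisoned();
    // `PoisonError::into_inner` hands back the guard; the mutex guards no
    // state here, so there is nothing that could be left inconsistent.
    let _guard = TRANSCRIPTION_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    was_poisoned
}

fn main() {
    poison_lock();
    println!("recovered from poisoned lock: {}", lock_recovering());
}
```

Note that the mutex stays marked as poisoned afterwards, so every later acquisition needs the same `unwrap_or_else` handling.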

fix(captions): serialise transcription and release Whisper context to prevent Metal memory exhaustion · 7f4ffa9

fix(captions): move serialisation lock into spawn_blocking and gate context eviction to aarch64 · 3923ac7

fix(captions): recover from poisoned mutex instead of panicking · e64725b

ManthanNimodiya wants to merge 3 commits into CapSoftware:main from ManthanNimodiya:fix/subtitle-memory-leak
