janitor cleanup crashes when snapshot references a gateway not present in local config
Version: sqlmesh 0.230.1 (via tcloud 2.11.0)
Problem
Running janitor fails with SQLMeshError: Gateway '<name>' not found in the available engine adapters when the state store contains snapshots created under a gateway that isn't configured in the current environment.
The crash originates in SnapshotEvaluator.cleanup() (evaluator.py:540):
self.get_adapter(s.model_gateway)
get_adapter() raises unconditionally if the gateway key isn't found in self.adapters.
This can happen when a developer creates models using a non-default gateway in a dev environment, but that gateway config is never merged into the shared project config. The snapshots persist in the state store and block janitor for everyone.
Impact
The entire janitor batch fails — no expired snapshots in that batch get cleaned up, including ones with valid/available gateways.
Expected behavior
Janitor should skip snapshots whose gateway isn't locally configured (with a warning) and continue cleaning up the rest. These snapshots can't be cleaned up without the adapter anyway.
Suggested fix
In SnapshotEvaluator.cleanup(), filter target_snapshots against self.adapters.keys() before passing them to concurrent_apply_to_snapshots. Log a warning for any skipped snapshots so operators know they exist.
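A minimal sketch of the filtering step described above, not the actual sqlmesh code. `SnapshotStub` and `split_by_available_gateway` are hypothetical names used only for illustration, and the sketch assumes that a snapshot with no `model_gateway` falls back to the default adapter and is therefore always cleanable.

```python
import logging
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping, Optional, Tuple

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class SnapshotStub:
    """Hypothetical stand-in for a snapshot; only the fields used here."""

    name: str
    model_gateway: Optional[str]


def split_by_available_gateway(
    snapshots: Iterable[SnapshotStub],
    adapters: Mapping[str, Any],
) -> Tuple[List[SnapshotStub], List[SnapshotStub]]:
    """Partition snapshots into (cleanable, skipped) by gateway availability."""
    cleanable: List[SnapshotStub] = []
    skipped: List[SnapshotStub] = []
    for snapshot in snapshots:
        gateway = snapshot.model_gateway
        # Assumption: no gateway means the default adapter is used.
        if gateway is None or gateway in adapters:
            cleanable.append(snapshot)
        else:
            skipped.append(snapshot)

    if skipped:
        # Surface the skipped snapshots so operators know they still exist.
        logger.warning(
            "Skipping cleanup of %d snapshot(s) with unconfigured gateways: %s",
            len(skipped),
            ", ".join(f"{s.name} (gateway '{s.model_gateway}')" for s in skipped),
        )
    return cleanable, skipped
```

In cleanup(), only the first list would then be passed to concurrent_apply_to_snapshots, so one missing gateway no longer fails the whole batch.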
Reproduction
- Developer A creates models using a gateway not in the shared/main config, via a dev environment
- The branch is never merged; the gateway config never reaches main
- Developer B (or CI) runs janitor; the state store still has snapshots referencing that gateway
- Janitor crashes on the missing gateway