This is a real interaction between three things, not a Celery bug or an SQLModel bug in isolation:
The mechanism is general: any non-atomic initialization that can be interrupted by a Python-level signal handler can leave a class permanently broken. It hits SQLModel/pydantic especially hard because pydantic builds validators lazily and caches them on the class.

What actually works

1. Don't raise from a signal handler in worker code. The Celery docs themselves note that SoftTimeLimit "raises an exception in the task"; what they don't stress is that it does so from a signal context. The safer contract is:

```python
# In your task, check periodically and exit cleanly:
from celery.exceptions import SoftTimeLimitExceeded

@app.task(bind=True, soft_time_limit=30, time_limit=60)  # bind=True provides `self`
def mytask(self, items):
    for item in items:
        # Cooperative check, not signal-driven. should_stop() is a placeholder
        # for your own deadline check; Celery does not provide it.
        if self.should_stop():
            raise SoftTimeLimitExceeded()
        do_work(item)
```

Cooperative cancellation between model-validation calls is interruption-safe. A signal-raised cancellation inside one is not.

2. If you must use soft limits, isolate them to subprocess workers.

```python
# celeryconfig.py
worker_pool = 'prefork'  # default; tasks run in child processes of the pool
task_soft_time_limit = 30
task_time_limit = 60
```

With

3. Pre-warm the validator cache before enabling signal handlers.

```python
# Run once at worker start, before any soft-time-limit-bound tasks:
for model_cls in (Hero, OtherModel):  # add your remaining model classes here
    model_cls.model_validate({"name": "_"})  # exercise the validators once
    model_cls.model_rebuild()                # and make sure the schema is built
```

Once the validator graph is fully built and cached,

4. Hard-limit only, or SIGKILL on timeout. If the task is operating on external resources (DB rows, files) where abrupt termination is acceptable, skip soft limits entirely and let Celery kill the worker process on hard timeout. No signal-raised exceptions, no half-initialized objects, and the worker recycles cleanly.

Why the
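
To make the mechanism described in this reply concrete, here is a minimal, deterministic sketch in plain Python. It is not SQLModel or pydantic code: `FragileModel`, `AtomicModel`, `_validators`, and `SimulatedSoftTimeLimit` are hypothetical names, and the "signal" is simulated by raising an exception at a fixed point in the middle of a lazy, class-level build. It only illustrates why a half-finished cache that is already published on the class stays broken for the lifetime of the process, and why building into a local and publishing at the end avoids that.

```python
class SimulatedSoftTimeLimit(Exception):
    """Stands in for an exception raised from a signal handler (assumption)."""


class FragileModel:
    _validators = None  # class-level cache, built lazily on first use

    @classmethod
    def _build(cls, interrupt=False):
        cls._validators = {}  # published on the class before it is complete
        cls._validators["name"] = str
        if interrupt:
            raise SimulatedSoftTimeLimit()  # the "signal" lands mid-build
        cls._validators["age"] = int

    @classmethod
    def validate(cls, data, interrupt=False):
        if cls._validators is None:
            cls._build(interrupt)
        return {key: cast(data[key]) for key, cast in cls._validators.items()}


class AtomicModel(FragileModel):
    _validators = None  # separate cache from the parent class

    @classmethod
    def _build(cls, interrupt=False):
        validators = {}  # build into a local first
        validators["name"] = str
        if interrupt:
            raise SimulatedSoftTimeLimit()
        validators["age"] = int
        cls._validators = validators  # publish only once complete


def run(model_cls):
    record = {"name": "x", "age": "41"}
    try:
        model_cls.validate(record, interrupt=True)  # first call is interrupted
    except SimulatedSoftTimeLimit:
        pass  # the task is cancelled, but the worker process keeps running
    return model_cls.validate(record)  # a later task reuses the same class


fragile_result = run(FragileModel)
atomic_result = run(AtomicModel)
print(fragile_result)
print(atomic_result)
```

In this sketch the fragile class silently drops the `age` field on every later call, because the half-built cache is treated as finished. The atomic variant rebuilds from scratch after the interruption. Pre-warming (option 3) has the same effect as never letting the interruption land inside the build at all.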
First Check
Commit to Help
Example Code
Description
Bug detection
This section is only for SEO, i.e. to help people in the same situation find this issue.
We detected this issue in our codebase when we started seeing this kind of PostgreSQL error:
Explanation
Workaround
Note that one way to fix this error is to monkey patch `partial_init`:

However, it seems `partial_init` being interrupted by an exception is not the only code path in SQLModel that can lead to corrupted models.

Operating System
Linux
Operating System Details
Reproduced in many environments (GKE cluster, local NixOS machine, macOS laptop, etc.), with different Python versions, on different SQLModel versions.
SQLModel Version
0.0.38
Python Version
3.13
Additional Context
Trace of a local execution of the reproducer: