lkl: per-thread irqs_enabled fixes IRQ-leak hangs under real drivers#638
lkl: per-thread irqs_enabled fixes IRQ-leak hangs under real drivers#638josephnef wants to merge 3 commits into
Conversation
tavip
left a comment
There was a problem hiding this comment.
Sleeping with interrupts disabled is not permitted in Linux:
https://share.google/aimode/5O1kJJ2idJt2FDdyE
Could you share the problem you are facing in your project, there are ways to avoids this issue with different design pattern
|
@tavip — thanks for the quick read.
Yes, happy to. Some background, then the specific failure mode, then a proposal for the test. The project (R&D, not production)A proof-of-concept that runs the mainline Linux
End-to-end RX validated on three RTL chips (RTL8812AU The actual failureYou're right that what the patch-1 KUnit test demonstrates — libusb's event thread is a true POSIX
The host thread's Patch 2 ( The testGiven the above — I'll replace the KUnit suite in patch 1 with one that actually exercises the host-thread path. Shape:
That mirrors what libusb-backend-style hosts actually do, and avoids the I'll respin with that test + the |
irqs_enabled in arch/lkl/kernel/irq.c is a single `static bool`
global, with no save/restore in __switch_to. That is fine for the
existing test suite, which doesn't exercise the failure paths it
exposes, but it has two correctness consequences for code outside
the in-tree tests:
- Any kernel path that does spin_lock_irqsave and schedules before
the matching restore (an unsupported pattern in Linux; lockdep
would catch it on a normal kernel — LKL doesn't) leaks the
DISABLED value to whichever thread runs next, and the next
thread's restore overwrites it. The save/restore semantics of
arch_local_irq_save / arch_local_irq_restore are not per-thread
in LKL today.
- More importantly, a host pthread (one created via
lkl_ops->thread_create, e.g. a libusb event thread or a
GLib/Qt timer callback that backs a virtio/USB-style host
shim) invoking lkl_trigger_irq from outside any kernel context
reads whatever value the last kernel task to run left in the
global. The next commit relies on per-thread irqs_enabled to
distinguish such host callers cleanly; this commit prepares the
field for that distinction.
This patch is a pure refactor: move irqs_enabled into
struct thread_info; access via current_thread_info() from
arch_local_save_flags, arch_local_irq_restore, and the
lkl_trigger_irq pending-check. No explicit save/restore is added
to __switch_to — the existing
_current_thread_info = task_thread_info(next);
line is the entire mechanism. Each thread's irqs_enabled travels
with its thread_info, the same way a real CPU's register file
follows the thread.
Behaviour for the existing test suite is identical (the suite
doesn't exercise the cross-thread leak). The follow-up commit
("lkl: deliver IRQs from host-thread callers regardless of
irqs_enabled") uses the per-thread field to fix the host-caller
case observed in real-world backends.
- arch/lkl/include/asm/thread_info.h: add `unsigned long
irqs_enabled` field; INIT_THREAD_INFO sets it to 1
(ARCH_IRQ_ENABLED) so the init task starts enabled.
- arch/lkl/kernel/irq.c: drop the `static bool irqs_enabled`
global. arch_local_save_flags, arch_local_irq_restore, and the
lkl_trigger_irq check all go through current_thread_info().
- arch/lkl/kernel/threads.c: init_ti sets ti->irqs_enabled =
ARCH_IRQ_ENABLED for freshly-allocated kernel threads.
Signed-off-by: Joseph <joseph@josephnef.dev>
lkl_trigger_irq is documented as callable "from arbitrary host threads" (see the comment block above the function). True host pthreads — libusb completion threads, glibc SIGEV_THREAD timer callbacks, anything created by a backend library that ends up posting an IRQ into the LKL kernel — acquire the LKL CPU via lkl_cpu_get but never go through __switch_to. _current_thread_info therefore still points at whichever kernel task last ran (often the idle task), and that task's irqs_enabled field may be ARCH_IRQ_DISABLED at the moment the host caller reads it. Honoring that stale flag for host-thread callers is a silent IRQ-pending hang: the IRQ gets marked pending, but nothing on the kernel side notices until the matching irqrestore in the original kernel context — which often never comes, because the kernel has already moved on. Drivers that post IRQs from host-thread backends (e.g. an out-of-tree USB host-controller shim that forwards URBs to libusb and signals completion from libusb's event thread; any virtio/host-shim backend with a thread-based notification scheme has the same exposure) hang the kernel within tens of operations. Detect host-thread callers by comparing thread_self() to the thread_info owner's tid via lkl_ops->thread_equal. When they differ, the caller is not the kernel thread that owns this thread_info; the stale irqs_enabled field has no claim on us, and we deliver the IRQ. Kernel-thread callers — including the recursive "lkl_trigger_irq -> IRQ -> softirq -> lkl_trigger_irq" path called out in the original comment — continue to honor their own per-thread irqs_enabled (set by the previous commit). Signed-off-by: Joseph <joseph@josephnef.dev>
lkl_trigger_irq is documented as callable "from arbitrary host
threads," and host-thread-driven backends (the in-tree
virtio-net-tap and similar) rely on that contract. The previous
commit ("lkl: deliver IRQs from host-thread callers regardless of
irqs_enabled") makes the contract hold even when the kernel task
currently in current_thread_info() has irqs_enabled disabled.
Add a small KUnit suite (CONFIG_LKL_IRQ_KUNIT_TEST) that locks
that contract in:
- Allocate an IRQ via lkl_get_free_irq + request_irq with a
handler that increments a counter and completes a completion.
- Spawn a true host pthread via lkl_ops->thread_create.
- The host pthread calls lkl_trigger_irq on the registered IRQ
from outside any kernel context.
- wait_for_completion_timeout from the test kthread releases the
LKL CPU so the host pthread can acquire it via
lkl_cpu_try_run_irq inside lkl_trigger_irq, and gives the
handler 500 ms to fire.
- Assert the handler ran exactly once.
Wiring:
- arch/lkl/kernel/irq_test.c: the new KUnit suite (.name = "lkl_irq").
- arch/lkl/kernel/Makefile: build it when CONFIG_LKL_IRQ_KUNIT_TEST=y.
- arch/lkl/Kconfig: new boolean depends on KUNIT.
- tools/lkl/Makefile.autoconf: kunit_test_enable also sets
LKL_IRQ_KUNIT_TEST, so the existing kunit=yes CI lane picks it
up alongside LKL_PCI_KUNIT_TEST.
- tools/lkl/tests/boot.c: lkl_test_kunit_irq parses the boot log
for "ok N lkl_irq", mirroring lkl_test_kunit_pci.
Signed-off-by: Joseph <joseph@josephnef.dev>
45352f4 to
79779c9
Compare
Summary
arch/lkl/kernel/irq.ckeepsirqs_enabledas a singlestatic boolglobal. That is fine for the in-tree test suite, which doesn't
exercise the failure paths it exposes, but it has two correctness
consequences for code that runs on LKL outside those tests:
No per-thread isolation of IRQ-enable state across context
switches. Any kernel path that does
spin_lock_irqsaveandschedules before the matching restore leaks the DISABLED value to
whichever thread runs next. (Sleeping with IRQs disabled is not a
supported Linux pattern, and isn't the load-bearing motivation for
this series — patch 1 is just the refactor that enables patch 2.)
Host-thread callers of
lkl_trigger_irqhonour stale state.lkl_trigger_irqis documented as callable "from arbitrary hostthreads," but reads
irqs_enabledeven when the caller is a hostpthread that doesn't own the current
thread_info. Host pthreads(libusb completion callbacks, glibc
SIGEV_THREADtimer callbacks,anything created by a backend library that ends up posting an IRQ
into the kernel from outside any kernel context) acquire the LKL
CPU via
lkl_cpu_getbut never go through__switch_to, so_current_thread_infostill points at whichever kernel task lastran, often the idle task. Honouring its stale
irqs_enabledsilently pends every host-injected IRQ. This is the load-bearing
bug, and patch 2 fixes it.
Context where this was hit
A research proof-of-concept that runs the mainline Linux
rtw88USBWi-Fi driver entirely in userspace via LKL, on a real USB Wi-Fi
adapter. The host program links
liblkl.a, registers a virtual USBhost controller into the in-process kernel (a ~470-line shim in the
drivers/usb/host/-style that translatesstruct urbto a flat viewstruct and back), and forwards URBs to libusb running on the host
process. libusb's event thread is a true POSIX pthread; URB
completions are posted back into the LKL kernel from that thread via
lkl_trigger_irq, which is exactly the path bug 2 affects.Behaviour with this series applied:
rtw88driver hangs the in-process LKLkernel within ~50 USB control transfers, reliably.
jiffiesstopsadvancing;
msleep-based kthreads freeze.gdbconfirms theglobal
irqs_enabledis stuck at 0.finishes firmware download, brings
wlan0up, and the host canopen an
AF_PACKETsocket onwlan0and capture 802.11 frameswith radiotap headers. Validated on three RTL chips (RTL8812AU
0bda:8812, RTL8821AU2357:0120, RTL8814AU0bda:8813). TX(radiotap injection via
AF_PACKET sendtoonwlan0in monitormode) also works.
What the series does
Three commits, in order:
lkl: make irqs_enabled per-thread via current_thread_info()— A pure refactor. Move
irqs_enabledfrom astatic boolglobalin
irq.cinto a field onstruct thread_info; access viacurrent_thread_info()fromarch_local_save_flags/arch_local_irq_restore/lkl_trigger_irq. No explicitsave/restore code — the existing
_current_thread_info = task_thread_info(next);line in__switch_tois the entire mechanism. IRQ-enable state moves withits thread, the same way a real CPU's register file does. No
behavioural change for the existing test suite.
lkl: deliver IRQs from host-thread callers regardless of irqs_enabled— The semantic fix. Inlkl_trigger_irq, detecthost-thread callers via
lkl_ops->thread_equal(ti->tid, lkl_ops->thread_self())anddeliver unconditionally; kernel-thread callers (including the
recursive
lkl_trigger_irq -> IRQ -> softirq -> lkl_trigger_irqpath called out in the original comment) continue to honour their
own per-thread
irqs_enabled. Depends on patch 1 to give theowner-vs-self
tidcomparison something precise to compare.lkl: add KUnit test for host-pthread caller of lkl_trigger_irq— Locks in the contract. New
CONFIG_LKL_IRQ_KUNIT_TEST+arch/lkl/kernel/irq_test.c. The test spawns a true host pthreadvia
lkl_ops->thread_create, has it calllkl_trigger_irqon akernel-registered IRQ from outside any kernel context, and asserts
the handler runs synchronously. Wired into the existing
kunit=yesCI lane intools/lkl/Makefile.autoconfalongsideLKL_PCI_KUNIT_TEST, and intotools/lkl/tests/boot.caskunit_irq(mirrorskunit_pci/kunit_mmu).Test plan
kunitlane: the newkunit_irqtest passes via thestandard
ok N lkl_irqline in the boot log (visible in thegithub-actions Test Results comment on this PR).
linux/windows-2022/clang-build/mmu_kasanlanes: unchanged behaviour. The
current_thread_info()->irqs_enabledfield is initialised inINIT_THREAD_INFOandinit_ti; pre-existing tests aren'taffected.
checkpatch: clean.described above: rtw88-family driver validated locally on the
three RTL chips. All come up with this series applied; all hang
on the un-patched kernel.
Note on the previous version of this PR
An earlier revision of this series included a KUnit test that
provoked the cross-thread leak by doing
spin_lock_irqsave+schedule_timeoutin one kthread while a sibling observed the IRQstate — a pattern that is invalid in Linux generally. That framing
was misleading: the actual real-world bug is the host-pthread caller
path (bug 2 above), and the test now exercises that path directly,
via
lkl_ops->thread_createand a registered IRQ handler. Theschedule-while-disabled framing is gone from both the test and the
commit messages.