htlcswitch: add link state machine fuzz harness by MPins · Pull Request #10865 · lightningnetwork/lnd

MPins · 2026-05-30T18:09:53Z

This PR adds a coverage-guided fuzz harness that exercises the link state machine by randomly interleaving HTLC additions, commits, revocations, settlements, failures, link restarts, and transitions into and out of quiescence mode from both Alice and Bob.

Making each iteration cheap enough to fuzz

A coverage-guided fuzzer typically needs to execute hundreds of iterations per second to be able to find interesting states. The dominant per-iteration costs in the normal link path are secp256k1 arithmetic and disk I/O, neither of which are the focus of this harness. To keep the state machine the bottleneck rather than crypto and the filesystem, the harness introduces three substitutions:

Mocked signing and verification
Mocked commitment key derivation
In-memory database*

*channeldb is backed by bbolt files created under t.TempDir(). To avoid the disk I/O bottleneck, the harness redirects TMPDIR to /dev/shm (tmpfs) so the bbolt files live in RAM, because tmpfs is Linux-specific, the harness fails fast on non-Linux hosts.

This work was initially developed and reviewed in my fork MPins#1 before being submitted upstream.

go test ./htlcswitch/ -run TestChannelLinkFSMScenarios -v
go test ./htlcswitch -run=^$ -fuzz=FuzzChannelLinkFSM -fuzztime=1m

Expose the `invoiceRegistry` field in `singleLinkTestHarness` so tests can register and look up invoices directly. Add `generateSingleHopHtlc`, a test helper that builds a single-hop `UpdateAddHTLC` with a random preimage, intended for use in unit and fuzz tests.

Add a no-op MailBox implementation and a no-op ticker for use in the channelLink FSM fuzz harness.

Replace createChannelLinkWithPeer (which required a Switch and spawned the htlcManager goroutine) with newFuzzLink, a minimal link factory that: - accepts dependencies directly (registry, preimage cache, circuit map, bestHeight) instead of a mockServer, so no Switch or background goroutines are created at all - sets link.upstream directly to a buffered channel controlled by the caller, bypassing the mailbox entirely - attaches a mockMailBox so mailBox.ResetPackets() in resumeLink succeeds

Add a failReason string field to channelLink that is populated by failf alongside the existing failed flag. This gives fuzz and unit tests direct access to the human-readable failure reason without requiring a dedicated OnChannelFailure callback or log scraping.

Introduce a one-shot nextOnionFailMode flag on mockIteratorDecoder and matching payloadFail / extractFail fields on mockHopIterator so that fuzz and unit tests can deterministically exercise the three error branches of channelLink.processRemoteAdds: - onionFailDecode → DecodeHopIterator returns a non-CodeNone failcode (CodeTemporaryChannelFailure). - onionFailPayload → HopPayload returns hop.ErrInvalidPayload. - onionFailExtract → ExtractErrorEncrypter returns a non-CodeNone failcode (CodeInvalidOnionVersion). The flag is consumed and cleared on each DecodeHopIterator call so it affects exactly one HTLC. Default behaviour is unchanged when no mode is armed, so existing callers see no difference. newMockHopIterator now returns *mockHopIterator (instead of hop.Iterator) so the decoder can set the per-iterator failure flags after construction; the concrete type still satisfies hop.Iterator and the only external caller in test_utils.go is unaffected.

Introduce `fuzz_link_test.go` with a model-based fuzzer that drives the Alice-Bob channel link through arbitrary sequences of protocol events and checks key invariants after each step.

Introduce fuzzSigner and fuzzSigVerifier in the fuzz harness, along with the SigVerifier hook in LightningChannel (WithSigVerifier, verifySig) and a matching SigPool extension (VerifyFunc field) so the harness can bypass secp256k1 verification end-to-end. Also refactors createTestChannel to accept functional options (testChannelOpt) so the signer and channel options can be injected from tests.

Introduce CommitKeyDeriverFunc and WithCommitKeyDeriver to allow LightningChannel to bypass the secp256k1-based DeriveCommitmentKeys on every commit round. All internal call sites are migrated to lc.deriveCommitmentKeys. The fuzz harness injects fuzzCommitKeyDeriver, a trivial identity deriver that avoids scalar-multiplication overhead.

createTestChannel started alicePool and bobPool but never stopped them. During fuzzing this caused goroutines to leak per. Register t.Cleanup handlers to call Stop() on both pools so all workers are torn down when the test ends.

newMockRegistry started an InvoiceRegistry but never stopped it. InvoiceRegistry internally starts two background goroutines — invoiceEventLoop and the InvoiceExpiryWatcher mainLoop — that run for the lifetime of the registry. Without a matching Stop() call both goroutines leaked for every test that called newMockRegistry, accumulating thousands of goroutines during fuzzing. Register a t.Cleanup to call registry.Stop() so both loops are torn down when the test ends.

gemini-code-assist · 2026-05-30T18:10:05Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a high-performance, coverage-guided fuzz harness for the Lightning Network channel link state machine. By replacing expensive cryptographic operations and disk I/O with in-memory mocks, the harness achieves the high iteration rates required for effective fuzzing. Additionally, the PR includes documentation for the link state machine and refactors core wallet components to support these testing hooks.

Highlights

Fuzz Harness Implementation: Introduced a coverage-guided fuzz harness in htlcswitch/fuzz_link_test.go to exercise the link state machine by simulating various HTLC operations and link events.
Performance Optimizations: Optimized the fuzz harness for high-throughput execution by mocking secp256k1 signing/verification and commitment key derivation, and by redirecting database I/O to memory (tmpfs).
Protocol State Machine Documentation: Added htlcswitch/htlc_commitment_state_machine.md with a Mermaid diagram documenting the channel link state machine transitions.
Refactoring for Testability: Updated lnwallet and htlcswitch to support custom signature verification and commitment key derivation via functional options, enabling the use of faster, non-cryptographic mocks in testing.

New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a coverage-guided fuzz test and deterministic scenario tests for the two-party commitment protocol, along with state machine documentation and testing hooks to bypass expensive cryptographic operations. The review feedback is highly constructive, pointing out opportunities to prevent test failures on non-Linux platforms, avoid integer wrapping issues when calculating fees from fuzz inputs, and ensure strict compliance with the repository's style guide regarding function comments.

gemini-code-assist · 2026-05-30T18:12:15Z

+	if runtime.GOOS != "linux" {
+		t.Fatalf("Error: fuzzing on non-Linux OS: %s", runtime.GOOS)
+	}


Using t.Fatalf when the operating system is not Linux will cause the entire test suite to fail on non-Linux platforms (such as macOS or Windows). This breaks cross-platform development and testing for developers who run the unit tests locally. Consider using t.Skipf instead to gracefully skip these tests on unsupported platforms.

Suggested change

if runtime.GOOS != "linux" {

t.Fatalf("Error: fuzzing on non-Linux OS: %s", runtime.GOOS)

}

if runtime.GOOS != "linux" {

t.Skipf("Skipping fuzz/scenario test on non-Linux OS: %s", runtime.GOOS)

}

gemini-code-assist · 2026-05-30T18:12:15Z

+		newFee := ((len(f.aliceLink.channel.ActiveHtlcs()))+
+			int(f.htlcRef))*100 + 1000


If f.htlcRef is extremely large, casting it to int can result in a negative value due to signed integer wrapping, leading to a negative or extremely large newFee. This can cause the fee update to fail immediately or terminate the link, reducing fuzzing efficiency. Bounding f.htlcRef using a modulo operator keeps the fee rate within a reasonable, positive range.

Suggested change

newFee := ((len(f.aliceLink.channel.ActiveHtlcs()))+

int(f.htlcRef))*100 + 1000

newFee := (len(f.aliceLink.channel.ActiveHtlcs()) +

int(f.htlcRef%100000))*100 + 1000

gemini-code-assist · 2026-05-30T18:12:16Z

+		newFee := ((len(f.bobLink.channel.ActiveHtlcs()))+
+			int(f.htlcRef))*100 + 1000


If f.htlcRef is extremely large, casting it to int can result in a negative value due to signed integer wrapping, leading to a negative or extremely large newFee. Bounding f.htlcRef using a modulo operator keeps the fee rate within a reasonable, positive range.

Suggested change

newFee := ((len(f.bobLink.channel.ActiveHtlcs()))+

int(f.htlcRef))*100 + 1000

newFee := (len(f.bobLink.channel.ActiveHtlcs()) +

int(f.htlcRef%100000))*100 + 1000

gemini-code-assist · 2026-05-30T18:12:16Z

+	bobSettlesPending   map[uint64]struct{}
+}
+
+func newFuzzFSM(t *testing.T, channelSize, aliceShareGen,


According to the repository style guide, every function must be commented with its purpose and assumptions, and the comment must begin with the function name as a complete sentence. Please add a compliant comment to newFuzzFSM.

// newFuzzFSM initializes and returns a new fuzz finite state machine (FSM) // instance with the specified channel size and configuration parameters. func newFuzzFSM(t *testing.T, channelSize, aliceShareGen,

References

Every function must be commented with its purpose and assumptions. Function comments must begin with the function name and should be complete sentences. ^(link)

gemini-code-assist · 2026-05-30T18:12:16Z

+	}
+}
+
+func (f *fuzzFSM) assertInvariants() {


According to the repository style guide, every function must be commented with its purpose and assumptions, and the comment must begin with the function name as a complete sentence. Please add a compliant comment to assertInvariants.

Suggested change

func (f *fuzzFSM) assertInvariants() {

// assertInvariants verifies that the channel heights and balances are consistent

// and mirrored between Alice and Bob, ensuring no silent fund misallocation.

func (f *fuzzFSM) assertInvariants() {

References

Every function must be commented with its purpose and assumptions. Function comments must begin with the function name and should be complete sentences. ^(link)

gemini-code-assist · 2026-05-30T18:12:16Z

 	return htlc, invoice
 }

+// generateSingleHopHtlc generate a single hop htlc to send to the receiver.


According to the repository style guide, function comments should be complete sentences. Please correct the comment to use 'generates' instead of 'generate' to make it a complete sentence.

Suggested change

// generateSingleHopHtlc generate a single hop htlc to send to the receiver.

// generateSingleHopHtlc generates a single hop htlc to send to the receiver.

References

Function comments should be complete sentences. ^(link)

github-actions · 2026-05-30T18:12:56Z

🔴 PR Severity: CRITICAL

Automated classification | 9 files | 2347 lines changed (487 non-test/non-generated)

🔴 Critical (6 files)

htlcswitch/link.go — HTLC forwarding / payment routing state machine
htlcswitch/mock.go — htlcswitch package support code
htlcswitch/test_utils.go — htlcswitch package test utilities (not excluded: not *_test.go)
lnwallet/channel.go — Wallet operations, commitment transactions
lnwallet/mock.go — lnwallet package support code
lnwallet/sigpool.go — Wallet signing pool

🟢 Low (1 file)

htlcswitch/htlc_commitment_state_machine.md — Markdown documentation (*.md → LOW)

⚪ Excluded from counting (2 test files)

htlcswitch/fuzz_link_test.go — *_test.go, 1820 lines
htlcswitch/link_test.go — *_test.go, 40 lines

Analysis

This PR touches two distinct CRITICAL package families:

htlcswitch/ — the HTLC forwarding and payment routing state machine. Changes to link.go directly affect how HTLCs are processed and forwarded.
lnwallet/ — wallet operations, channel commitment transactions, and the signing pool. Changes to channel.go and sigpool.go affect commitment transaction construction and signing.

Severity bump check (non-test/non-generated files):

Files: 7 (threshold >20 — not triggered)
Lines: 487 (threshold >500 — not triggered)
Multiple distinct critical packages: ✅ htlcswitch + lnwallet — triggered, but base is already CRITICAL

The bulk of the additions (1,820 lines) is a new fuzz test (fuzz_link_test.go), excluded from the severity bump calculation. The non-test production changes are relatively modest, but because they span two top-tier critical subsystems, expert review of both is warranted.

_{To override, add a severity-override-{critical,high,medium,low} label.}

MPins added 11 commits March 29, 2026 13:39

doc: create a diagram representing the link machine state

a4a3279

htlcswitch: add mockMailBox and noopTicker test helpers

4918769

Add a no-op MailBox implementation and a no-op ticker for use in the channelLink FSM fuzz harness.

htlcswitch: add FSM fuzz harness for channelLink commit protocol

94e37ff

Introduce `fuzz_link_test.go` with a model-based fuzzer that drives the Alice-Bob channel link through arbitrary sequences of protocol events and checks key invariants after each step.

htlcswitch: stop SigPools in createTestChannel cleanup

b98a267

createTestChannel started alicePool and bobPool but never stopped them. During fuzzing this caused goroutines to leak per. Register t.Cleanup handlers to call Stop() on both pools so all workers are torn down when the test ends.

github-actions Bot added the severity-critical Requires expert review - security/consensus critical label May 30, 2026

gemini-code-assist Bot reviewed May 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

htlcswitch: add link state machine fuzz harness#10865

htlcswitch: add link state machine fuzz harness#10865
MPins wants to merge 11 commits into
lightningnetwork:masterfrom
MPins:link_fsm_fuzz

MPins commented May 30, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot commented May 30, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

gemini-code-assist Bot May 30, 2026

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		newFee := ((len(f.aliceLink.channel.ActiveHtlcs()))+
		int(f.htlcRef))*100 + 1000

		newFee := ((len(f.bobLink.channel.ActiveHtlcs()))+
		int(f.htlcRef))*100 + 1000

	// generateSingleHopHtlc generate a single hop htlc to send to the receiver.
	// generateSingleHopHtlc generates a single hop htlc to send to the receiver.

Conversation

MPins commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Making each iteration cheap enough to fuzz

Uh oh!

gemini-code-assist Bot commented May 30, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 30, 2026

🔴 PR Severity: CRITICAL

Analysis

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

MPins commented May 30, 2026 •

edited

Loading