[fix][client] Prevent duplicate ServiceUrlProvider initialization by oneby-wang · Pull Request #25899 · apache/pulsar

oneby-wang · 2026-05-30T09:49:46Z

Motivation

#25892 fixed a flaky SameAuthParamsLookupAutoClusterFailoverTest by removing an extra manual failover.initialize(client) call from the test.

The root cause of that flakiness was duplicate initialization. PulsarClientBuilder.build() already initializes the configured ServiceUrlProvider through PulsarClientImpl, so calling initialize(client) again starts duplicate background checks for the same provider instance.

This is especially problematic for SameAuthParamsLookupAutoClusterFailover, because each initialize call creates a new broker-service-url-check EventLoopGroup. Multiple checker threads can then mutate the same failover state and produce subtle race conditions that are difficult to diagnose. AutoClusterFailover and ControlledClusterFailover have the same lifecycle risk: duplicate initialization can register duplicate scheduled tasks, and ControlledClusterFailover can also recreate its HTTP client without closing the previous one.
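The leak described above can be reproduced in miniature. The sketch below is not Pulsar code: `UnguardedProvider`, the thread name, and the `allCheckers` bookkeeping list are hypothetical stand-ins, and a JDK `ScheduledExecutorService` is used in place of the `EventLoopGroup` mentioned above. It only illustrates how an unguarded `initialize()` can leave an earlier checker running after `close()`.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a failover provider whose initialize() is unguarded. */
class UnguardedProvider implements AutoCloseable {
    // Demo-only bookkeeping so the leak can be observed from outside.
    final List<ScheduledExecutorService> allCheckers = new CopyOnWriteArrayList<>();
    // Only the most recent checker is remembered, as with a plain field assignment.
    private ScheduledExecutorService currentChecker;

    void initialize() {
        // Every call creates a new executor and overwrites the previous reference.
        currentChecker = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "url-check-sketch");
            t.setDaemon(true); // keep the demo from blocking JVM exit
            return t;
        });
        allCheckers.add(currentChecker);
        currentChecker.scheduleWithFixedDelay(
                () -> { /* a real provider would probe the broker URL here */ },
                0, 100, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        // Shuts down only the executor that is still referenced.
        if (currentChecker != null) {
            currentChecker.shutdownNow();
        }
    }

    /** Number of checkers that were started but never shut down. */
    long leakedCheckers() {
        return allCheckers.stream().filter(e -> !e.isShutdown()).count();
    }
}
```

Calling `initialize()` twice and then `close()` leaves one executor that was never shut down, which is the kind of duplicate background check this PR is meant to rule out.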

Modifications

Make AutoClusterFailover, ControlledClusterFailover, and SameAuthParamsLookupAutoClusterFailover fail fast when initialize(PulsarClient) is called more than once.
Remove the duplicate manual initialize call from SameAuthParamsLookupAutoClusterFailoverTest.testAutoClusterFailover because the provider is already initialized by PulsarClientBuilder.build().
Add dedicated tests that build a PulsarClient with each provider and then verify a second initialize(client) call throws IllegalStateException.
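A minimal sketch of the fail-fast guard, assuming an `AtomicBoolean` flag is used. `GuardedProvider`, its `checkersStarted` counter, and the exception message are hypothetical; the actual providers in this PR may implement the check differently.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical provider that rejects a second initialize() call. */
class GuardedProvider {
    private final AtomicBoolean initialized = new AtomicBoolean(false);
    // Demo-only counter standing in for "background check tasks registered".
    int checkersStarted = 0;

    void initialize(Object client) {
        // compareAndSet succeeds for exactly one caller, even under contention.
        if (!initialized.compareAndSet(false, true)) {
            throw new IllegalStateException(
                    "ServiceUrlProvider has already been initialized");
        }
        checkersStarted++; // a real provider would schedule its checks here
    }
}
```

With this guard, the first `initialize(client)` call proceeds and any later call throws `IllegalStateException` before background work is registered, which is the behavior the new tests are described as verifying.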

Verifying this change

Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

The threading model is affected only by preventing duplicate background failover check tasks from being registered for the same ServiceUrlProvider instance.

void-ptr974

I think this needs a lifecycle fix. If the same ServiceUrlProvider instance is reused to build a second client, the second build now fails with IllegalStateException, but the constructor failure path calls shutdown(), which unconditionally closes conf.getServiceUrlProvider(). That can close the provider still used by the first live client.

Example:

ServiceUrlProvider provider = AutoClusterFailover.builder()
        .primary(primary)
        .secondary(List.of(secondary))
        .failoverDelay(1, TimeUnit.SECONDS)
        .switchBackDelay(1, TimeUnit.SECONDS)
        .build();

PulsarClient client1 = PulsarClient.builder()
        .serviceUrlProvider(provider)
        .build();

PulsarClient.builder()
        .serviceUrlProvider(provider)
        .build(); // fails, then closes provider used by client1

Could we only close the provider on constructor failure if this PulsarClientImpl successfully initialized it?
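One way the suggestion could look, as a sketch under stated assumptions: `OneShotProvider`, `SketchClient`, and the `ownsProvider` flag are hypothetical names and do not reflect the actual `PulsarClientImpl` code. The idea is that the client records whether its own `initialize` call succeeded and closes the provider only in that case.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical provider that rejects a second initialize() call. */
class OneShotProvider implements AutoCloseable {
    private final AtomicBoolean initialized = new AtomicBoolean();
    volatile boolean closed;

    void initialize(Object client) {
        if (!initialized.compareAndSet(false, true)) {
            throw new IllegalStateException("already initialized");
        }
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical client that closes the provider only if it initialized it. */
class SketchClient implements AutoCloseable {
    private final OneShotProvider provider;
    private boolean ownsProvider; // true only after our initialize() succeeded

    SketchClient(OneShotProvider provider) {
        this.provider = provider;
        try {
            provider.initialize(this);
            ownsProvider = true;
        } catch (RuntimeException e) {
            close(); // constructor failure path: must not touch a provider we do not own
            throw e;
        }
    }

    @Override
    public void close() {
        if (ownsProvider) {
            provider.close();
        }
    }
}
```

In this sketch a failed second construction leaves the provider open for the first client, and the provider is closed only when the client that initialized it is closed.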

oneby-wang · 2026-06-01T10:47:47Z

> Could we only close the provider on constructor failure if this PulsarClientImpl successfully initialized it?

@void-ptr974 Nice catch. I agree with this approach. @lhotari WDYT?

lhotari

LGTM

lhotari · 2026-06-02T18:57:58Z

> > Could we only close the provider on constructor failure if this PulsarClientImpl successfully initialized it?
>
> @void-ptr974 Nice catch. I agree with this approach. @lhotari WDYT?

@oneby-wang It would be an improvement to handle this case. In addition, the javadoc of org.apache.pulsar.client.api.ServiceUrlProvider should be improved. It should explain that each instance's lifecycle is tied to one PulsarClient instance. Similar javadoc should be added to the implementations where this is now enforced.

oneby-wang · 2026-06-03T02:16:21Z

/pulsarbot rerun-failure-checks

oneby-wang · 2026-06-03T02:18:54Z

> It would be an improvement to handle this case.

@lhotari I see, let's get this merged first. I'll create another PR to address the improvement.

lhotari · 2026-06-03T07:23:54Z

> > It would be an improvement to handle this case.
>
> @lhotari I see, let's get this merged first. I'll create another PR to address the improvement.

@oneby-wang I think that the javadoc improvement belongs in this PR. This PR makes the implicit contract of the ServiceUrlProvider interface explicit, and it's useful to document that in javadoc.

oneby-wang · 2026-06-03T10:49:22Z

> This PR makes the implicit contract of the ServiceUrlProvider interface explicit and it's useful to document that in javadoc.

@lhotari Addressed.

lhotari

LGTM, good work @oneby-wang

[fix][client] Prevent duplicate ServiceUrlProvider initialization

4a66058

void-ptr974 suggested changes May 31, 2026

lhotari approved these changes Jun 2, 2026

Address pr comment

b259b2c

lhotari approved these changes Jun 3, 2026

lhotari added this to the 5.0.0-M1 milestone Jun 3, 2026

lhotari added release/4.2.2 release/4.0.11 labels Jun 3, 2026

lhotari merged commit 882946c into apache:master Jun 3, 2026
43 checks passed

lhotari pushed a commit that referenced this pull request Jun 3, 2026

[fix][client] Prevent duplicate ServiceUrlProvider initialization (#25899) (cherry picked from commit 882946c)

c3c3fe5

lhotari added the cherry-picked/branch-4.2 label Jun 3, 2026

lhotari pushed a commit that referenced this pull request Jun 3, 2026

[fix][client] Prevent duplicate ServiceUrlProvider initialization (#25899) (cherry picked from commit 882946c)

fb83c7d

lhotari added the cherry-picked/branch-4.0 label Jun 3, 2026

This was referenced Jun 5, 2026

[fix][client] Avoid closing reused ServiceUrlProvider in another PulsarClientImpl instance oneby-wang/pulsar#45

Closed

[fix][client] Avoid closing reused ServiceUrlProvider in another PulsarClientImpl instance oneby-wang/pulsar#46

Open

[fix][client] Prevent duplicate ServiceUrlProvider initialization #25899
lhotari merged 2 commits into apache:master from oneby-wang:service-url-provider-init-idempotency
