Skip to content

Add a ConfigProvider callback for runtime instrumentation option changes#8076

Open
jackshirazi wants to merge 16 commits intoopen-telemetry:mainfrom
jackshirazi:config-provider-callback
Open

Add a ConfigProvider callback for runtime instrumentation option changes#8076
jackshirazi wants to merge 16 commits intoopen-telemetry:mainfrom
jackshirazi:config-provider-callback

Conversation

@jackshirazi
Copy link
Copy Markdown
Contributor

First step in adding support for runtime changes to options, focusing only on API additions.

The assumption is that if an instrumentation is able to handle changes at runtime, it will register a callback against the ConfigProvider instance that the otel SDK is using.

Further steps (to be discussed):

  • Add spec change for runtime config updates and listener semantics
  • Expose getConfigProvider() in the SDK API
  • Provide a "change mechanism" for example, mutable provider or equivalent update source
  • Add implementations

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 12, 2026

Codecov Report

❌ Patch coverage is 74.72527% with 46 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.73%. Comparing base (c6a6358) to head (8f15038).
⚠️ Report is 16 commits behind head on main.

Files with missing lines Patch % Lines
.../opentelemetry/sdk/internal/SdkConfigProvider.java 74.57% 29 Missing and 16 partials ⚠️
...lemetry/sdk/internal/ExtendedOpenTelemetrySdk.java 75.00% 1 Missing ⚠️

❌ Your patch check has failed because the patch coverage (74.72%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #8076      +/-   ##
============================================
+ Coverage     90.31%   90.73%   +0.41%     
- Complexity     7725     7996     +271     
============================================
  Files           850      896      +46     
  Lines         23259    24189     +930     
  Branches       2364     2430      +66     
============================================
+ Hits          21007    21948     +941     
+ Misses         1528     1486      -42     
- Partials        724      755      +31     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jackshirazi jackshirazi marked this pull request as ready for review February 12, 2026 15:43
@jackshirazi jackshirazi requested a review from a team as a code owner February 12, 2026 15:43
Copy link
Copy Markdown
Member

@jack-berg jack-berg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These contracts look like a good start to me. Missing implementation corresponding spec though so need to talk about how to proceed with that.

@jackshirazi
Copy link
Copy Markdown
Contributor Author

These contracts look like a good start to me. Missing implementation corresponding spec though so need to talk about how to proceed with that.

open-telemetry/opentelemetry-specification#4899
open-telemetry/opentelemetry-specification#4900

@jackshirazi
Copy link
Copy Markdown
Contributor Author

@jack-berg I need some help here. I implemented the ConfigProvider.addConfigChangeListener() callback, and then I started looking at the mutation update to the declarative config, and that's a little more confusing to me. Because declarative config is read-only, I believe I have to replace the entire config tree on a mutation?

Also there is the issue of replacing a single value vs replacing a node subtree. It looks like I need two new APIs for that

  • ConfigProvider.updateConfig(String path, DeclarativeConfigProperties newSubtree)
  • ConfigProvider.setConfigProperty(String path, String key, Object value)

I've implemented both in the PR for now. Examples for calling them would be

((ExtendedOpenTelemetry) GlobalOpenTelemetry.get())
    .getConfigProvider()
    .setConfigProperty(".instrumentation/development.java.methods", "include", "X");

and

ConfigProvider configProvider =
    ((ExtendedOpenTelemetry) GlobalOpenTelemetry.get()).getConfigProvider();
DeclarativeConfigProperties newMethodsConfig =
    DeclarativeConfigProperties.fromMap(
        Map.of("include", "X"),
        configProvider.getInstrumentationConfig().getComponentLoader());
configProvider.updateConfig(".instrumentation/development.java.methods", newMethodsConfig);

There is a difference in the effect here. If there was another key at the "include" level, eg

methods:
  include: "X"
  avoid: "Y"

then the setConfigProperty would leave the "avoid" value as "Y", but the updateConfig would not


private SdkConfigProvider(DeclarativeConfigProperties openTelemetryConfigModel) {
this.instrumentationConfig = openTelemetryConfigModel.get("instrumentation/development");
this.openTelemetryConfigModel = new AtomicReference<>(requireNonNull(openTelemetryConfigModel));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does openTelemetryConfigModel need to be an AtomicReference?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config could get mutated from any thread, and indeed from multiple threads concurrently

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a volatile be sufficient?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it were a volatile, in the updateConfig() two concurrent updates could both concurrently set openTelemetryConfigModel to the new root at the same time. One of the updates would be lost. With the atomic reference in the while(true) loop, the changes are retained from all concurrent updates. So I'd say with the current implementation to avoid losing updates, no. Maybe a different implementation would allow that.

private final AtomicReference<DeclarativeConfigProperties> openTelemetryConfigModel;
private final ConcurrentMap<String, CopyOnWriteArrayList<ListenerRegistration>> listenersByPath =
new ConcurrentHashMap<>();
private final AtomicBoolean disposed = new AtomicBoolean(false);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private final AtomicBoolean disposed = new AtomicBoolean(false);
private final AtomicBoolean isShutdown = new AtomicBoolean(false);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adapted

Map<String, Object> subtreeMap = DeclarativeConfigProperties.toMap(newSubtree);
while (true) {
DeclarativeConfigProperties current = requireNonNull(openTelemetryConfigModel.get());
Map<String, Object> rootMap = DeclarativeConfigProperties.toMap(current);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#nit: I believe setSubtreeAtPath relies on the map being mutable, and is relying on coincidence that DeclarativeConfigProperties.toMap returns a mutable hashmap.

Instead, setSubtreeAtPath could return a new hashmap instance. We can make it all efficient later if needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, adapted

@SuppressWarnings("unchecked")
private static void setSubtreeAtPath(
Map<String, Object> rootMap, String normalizedPath, Map<String, Object> subtreeMap) {
String relativePath = normalizedPath.substring(1);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding an JsonPath abstraction to gather up some of this logic about how to walk paths in a single place.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented as a TODO, if you want me to refactor in this PR, where to?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A TODO is fine - this is just a prototype

Object child = parent.get(segments[i]);
if (child instanceof Map) {
parent = (Map<String, Object>) child;
} else {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is interesting. If the child is not a map, it could either be null or another type (like a scalar or a array). In the null case, I think the behavior here to makes sense. But if the type isn't a map, the we have a conflict between the current schema and the structure the updater is trying to apply. I think we ought to be conservative in this case and throw an exception. Can advertise the exception in the update method signatures, and let the caller figure out what to do if they produce a conflict.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified accordingly

return current;
}

private static boolean hasSameContents(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof this is an expensive equality check. Maybe a TODO to optimize later. An optimized version could use the introspection APIs like getPropertyKeys to a allocation free recursive comparison.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

* @param componentLoader the component loader to use
* @return a {@link DeclarativeConfigProperties} backed by the map
*/
static DeclarativeConfigProperties fromMap(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to avoid this type of thing in the API. I accept that we need these utilities in the SDK to make UX acceptable. An SDK version of this could be based on YamlDeclarativeConfigProperties, which is has a create method to instantiate from Map<String, Object>.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay nice! I didn't see that. I've had to move it into the sdk from incubator (otherwise accessing it becomes silly indirection), and followed your direction here

@jack-berg
Copy link
Copy Markdown
Member

Also there is the issue of replacing a single value vs replacing a node subtree. It looks like I need two new APIs for that

I agree you'll need APIs for that. Should probably be at the SDK level though (SdkConfigProvider) rather than API (ConfigProvider). API is accessible to instrumentation and we want instrumentation to be consuming config, not mucking around and editing it, right?

@jackshirazi
Copy link
Copy Markdown
Contributor Author

Also there is the issue of replacing a single value vs replacing a node subtree. It looks like I need two new APIs for that

I agree you'll need APIs for that. Should probably be at the SDK level though (SdkConfigProvider) rather than API (ConfigProvider). API is accessible to instrumentation and we want instrumentation to be consuming config, not mucking around and editing it, right?

Couple of options here

  • Put the mutation methods on ExtendedOpenTelemetry, needed because of the ObfuscatedConfigProvider
  • Create a separate MutableConfigProvider interface that extends ConfigProvider, callers would cast: ((MutableConfigProvider) configProvider).setConfigProperty(...).

preference? or alternative?

@jackshirazi jackshirazi force-pushed the config-provider-callback branch from 3a86394 to f45b2bb Compare April 30, 2026 13:06
# Conflicts:
#	api/incubator/src/test/java/io/opentelemetry/api/incubator/config/InstrumentationConfigUtilTest.java
#	sdk-extensions/declarative-config/src/main/java/io/opentelemetry/sdk/autoconfigure/declarativeconfig/DeclarativeConfiguration.java
#	sdk-extensions/declarative-config/src/main/java/io/opentelemetry/sdk/autoconfigure/declarativeconfig/YamlDeclarativeConfigProperties.java
#	sdk-extensions/declarative-config/src/test/java/io/opentelemetry/sdk/autoconfigure/declarativeconfig/YamlDeclarativeConfigPropertiesTest.java
#	sdk-extensions/incubator/src/main/java/io/opentelemetry/sdk/extension/incubator/fileconfig/YamlDeclarativeConfigProperties.java
#	sdk/all/src/main/java/io/opentelemetry/sdk/internal/YamlDeclarativeConfigProperties.java
@jack-berg
Copy link
Copy Markdown
Member

jack-berg commented Apr 30, 2026

preference? or alternative?

I've sketched out a few ideas in: jackshirazi#1

Easier to talk in code these days.

Most notably:

  • The update API is SDK only
  • The update APIs are consolidated into a single setConfig(String path, Object value) method
  • Verification that when an update occurs, the update target path is either unset or that the type of the existing value matches the new value
  • The CAS loop notification path is simplified by just using a lock - in the process, this fixes a race condition where a listener could be notified of update after a later update. I.e. given update 1 at T1, update 2 at T2, listener receives update 2 then update 1, and is permanently left with out of date state

@jackshirazi
Copy link
Copy Markdown
Contributor Author

jackshirazi commented May 6, 2026

I've applied your diff as is, with one addition, an extra isShutdown check in the addConfigChangeListener because there is a race condition between the first check and acquiring the lock when another thread could acquire the lock in shutdown but the addConfigChangeListener would continue. After that the first isShutdown check isn't strictly needed, but is a fast exit without acquiring a lock, but can be removed.

My concern with this implementation is how do I call the setConfig from an extension? GlobalOpenTelemetry.get() is returned as an obfuscated wrapper, so can't be cast to ExtendedOpenTelemetrySdk

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants