fix(ruby): keep OTel auto-instrumentation working on Rails 7.0 #643

Open
ccschmitz-launchdarkly wants to merge 18 commits into main from feat/ruby-old-rails-compat

@ccschmitz-launchdarkly ccschmitz-launchdarkly commented Jun 22, 2026

Contributor

Summary

The observability plugin depended on opentelemetry-instrumentation-all, whose Rails-family members raised their minimum to Rails 7.1. On Rails 7.0 every Rails-family instrumentation fails its runtime compatible? check, logs a flurry of "... failed to install" warnings, and never attaches — so the app loses all Rails auto-instrumentation (no HTTP server spans, DB spans, etc.) and gets only manual instrumentation. The meta-gem couples every instrumentation to a single version, so the fix is to stop using it.

Self-contained against main: this PR includes the boot-instrumentation foundation (previously split out as #582) plus the old-Rails compatibility fix. Supersedes #582.

Changes

Boot auto-instrumentation (formerly #582)

  • Install the OTel Rails-family instrumentations during boot via the Railtie so they attach even when the LaunchDarkly client is created lazily after Rails has booted; emit one clear warning when boot-time install was missed.

Old-Rails compatibility

  • Drop opentelemetry-instrumentation-all for individual gems. The Rails family is capped below each member's Rails-7.1-enforcing release (rails <0.42, action_pack <0.18, action_view/active_record <0.13, active_support <0.12, active_job <0.12, action_mailer <0.8, active_storage <0.5); everything else tracks the latest. Capped releases still work on Rails 7.1+, so modern apps keep the latest non-Rails instrumentations. use_all still activates any extra instrumentation gem a consumer adds.
  • InstrumentationLogFilter replaces the per-instrumentation "failed to install" flurry with one actionable summary (wraps the whole SDK.configure call, since the SDK installs instrumentations after the configure block returns).
  • Repro app e2e/ruby/rails/demo-rails70 (Rails 7.0) with a pure-Ruby in-process OTLP protobuf sink asserting traces (a server span), a log, and a captured exception are exported over the wire — runs under bundle exec rake, no Docker/Node.
  • CI: e2e-rails-legacy job in ruby-plugin.yml as a regression guard (the other e2e apps run Rails 7.2, above the floor).
  • README compatibility section + CHANGELOG.
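
As an illustration of the Rails-family caps listed above, here is a minimal sketch that checks a candidate version against each cap using RubyGems' own requirement matching. The constant `RAILS_FAMILY_CAPS` and the helper `allowed_by_cap?` are hypothetical names for this example; the gem names and bounds are taken from the list above, and the actual gemspec may express them differently.

```ruby
require 'rubygems'

# Caps copied from the PR description; the hash itself is illustrative only.
RAILS_FAMILY_CAPS = {
  'opentelemetry-instrumentation-rails' => '< 0.42',
  'opentelemetry-instrumentation-action_pack' => '< 0.18',
  'opentelemetry-instrumentation-action_view' => '< 0.13',
  'opentelemetry-instrumentation-active_record' => '< 0.13',
  'opentelemetry-instrumentation-active_support' => '< 0.12',
  'opentelemetry-instrumentation-active_job' => '< 0.12',
  'opentelemetry-instrumentation-action_mailer' => '< 0.8',
  'opentelemetry-instrumentation-active_storage' => '< 0.5'
}.freeze

# True when the given version of a Rails-family gem falls under its cap.
def allowed_by_cap?(gem_name, version)
  cap = RAILS_FAMILY_CAPS.fetch(gem_name)
  Gem::Requirement.new(cap).satisfied_by?(Gem::Version.new(version))
end

puts allowed_by_cap?('opentelemetry-instrumentation-rails', '0.41.0')
puts allowed_by_cap?('opentelemetry-instrumentation-rails', '0.42.0')
```

In a gemspec, each pair would typically become one `spec.add_dependency name, cap` line, which is how the resolver ends up on instrumentation-rails 0.41.x instead of 0.42.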

Context

CardFlight (the original report) has since upgraded Rails, so this no longer blocks them — but it remains a valuable general regression guard so a future instrumentation-gem bump can't silently re-break Rails 7.0.

Testing

Red→green with the CI job's command (bundle exec rake in demo-rails70):

  • pre-fix (instrumentation-all → instrumentation-rails 0.42): 5 runs, 9 assertions, 4 failures
  • fixed: 5 runs, 16 assertions, 0 failures, 0 "failed to install"

demo (7.2) and api-only (7.2) pass and still resolve the latest non-Rails instrumentations (rack 0.31, redis 0.29, pg 0.36, …). Gem unit tests + rubocop green. Rebased onto main (0.2.2).

Notes

🤖 Generated with Claude Code


Note

Medium Risk
Changes default OpenTelemetry dependency resolution and boot/configure ordering for all Ruby/Rails consumers; behavior is heavily tested but mis-pinned gems could affect instrumentation on edge framework versions.

Overview
Fixes Rails 7.0 apps losing OpenTelemetry auto-instrumentation when opentelemetry-instrumentation-all pulled in Rails-family gems that require Rails 7.1+.

The launchdarkly-observability gem drops the instrumentation-all meta-gem for a curated set of individual opentelemetry-instrumentation-* dependencies, with the Rails family capped below the 7.1-enforcing releases while other instrumentations stay on current versions. InstrumentationLogFilter collapses per-gem “failed to install” noise into one actionable warning.

Boot-time install (Railtie + install_rails_instrumentation) attaches Rails OTel hooks during boot so lazy LaunchDarkly client init still works; plugin registration later adds exporters without replacing the tracer provider. A hyphenated launchdarkly-observability.rb entry point enables Bundler.require to load the Railtie early.

Adds e2e/ruby/rails/demo-rails70 (Rails 7.0) with an in-process OTLP protobuf sink and integration tests (server span, logs, exceptions), plus CI job e2e-rails-legacy. Existing demo / api-only apps get matching lockfile changes and lazy-init instrumentation tests.

Reviewed by Cursor Bugbot for commit 6980ff9. Bugbot is set up for automated code reviews on this repo. Configure here.

@abelonogov-ld abelonogov-ld left a comment

Contributor

Cool! Test app features

ccschmitz-launchdarkly and others added 12 commits June 23, 2026 14:29
The Railtie's `attach_otel_log_bridge` delegated to a module method
(`LaunchDarklyObservability.otel_logger_provider_available?`) defined in the
`class << self` block of launchdarkly_observability.rb. That block runs *after*
the `require_relative '.../rails'` near the top of the file. When the gem is
required lazily after Rails has booted, the Railtie's `config.after_initialize`
hook fires synchronously during the require — before the module method exists —
so the bridge attach failed with:

  Could not attach log bridge to Rails.logger: undefined method
  `otel_logger_provider_available?' for module LaunchDarklyObservability

Inline the availability check in the Railtie so it no longer depends on load
order, and move the `rails.rb` require to the bottom of the main file (after the
module body is fully defined) so future Railtie code can't reintroduce the same
class of bug. Adds a regression test that removes the module method to simulate
the load-order state.
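
To make the load-order point concrete, the following is a rough sketch of an availability check that depends only on constants, not on a module method that may not be defined yet when a hook fires. `LoadOrderSafeCheck` and `constant_available?` are hypothetical names for this example and are not the gem's code.

```ruby
# Hypothetical helper: look a constant up by name without calling any method
# that might not exist yet at load time.
module LoadOrderSafeCheck
  # Always returns true or false (never nil), even when an outer namespace
  # such as OpenTelemetry is not loaded at all.
  def self.constant_available?(path)
    path.split('::').reduce(Object) do |scope, name|
      return false unless scope.const_defined?(name, false)

      scope.const_get(name, false)
    end
    true
  end
end

puts LoadOrderSafeCheck.constant_available?('File::Stat')
puts LoadOrderSafeCheck.constant_available?('Definitely::Missing')
```

Because the check is self-contained, a hook that calls it behaves the same whether it runs during boot or synchronously during a late `require`.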

Co-Authored-By: Claude <noreply@anthropic.com>

…rsions

The default instrumentation config passed `enable_recognize_route: true` to the
Rails instrumentation and `db_statement: :include` to ActiveRecord. Neither
option exists on those instrumentations (verified against the pinned gem
versions: rails 0.39.1, active_record 0.11.1, rack 0.29.0), so they were no-ops
that emitted a warning on every boot:

  Instrumentation ... ignored the following unknown configuration options [...]

Remove them. Route-based span naming (http.route) is handled automatically by
the ActionPack instrumentation, and SQL capture comes from the database adapter
instrumentations (Mysql2, PG, ...) which already obfuscate statements by default.

Co-Authored-By: Claude <noreply@anthropic.com>

The Rails e2e demo app only verified that a controller responds; nothing checked
that the OTel Rails-family instrumentations actually attached. That gap is why a
recent customer report ("Instrumentation: ... failed to install") went unnoticed
— the app initializes the plugin during Rails boot (the happy path) and never
asserted the result.

Add an integration test that asserts Rack, ActionPack, ActiveRecord,
ActiveSupport, and Rails report `installed? == true` after boot (the exact
inverse of the "failed to install" warning), and that an HTTP request produces a
server span. The plugin only configures OpenTelemetry when the LD client
registers it, which needs a non-empty SDK key, so test_helper sets a dummy key
before boot (invalid, so the client never connects).
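
The `installed?` assertion described above can be sketched as a small helper. This is a hedged illustration, not the test in the PR: `uninstalled_names` and `FakeInstrumentation` are hypothetical, and the stub only stands in for objects that respond to `name` and `installed?`, as OpenTelemetry instrumentation instances do.

```ruby
# Stand-in for an instrumentation instance (hypothetical, for this sketch only).
FakeInstrumentation = Struct.new(:name, :installed) do
  def installed?
    installed
  end
end

# Names of the instrumentations that did not attach; an empty list means
# nothing would have logged "failed to install".
def uninstalled_names(instrumentations)
  instrumentations.reject(&:installed?).map(&:name)
end

family = [
  FakeInstrumentation.new('OpenTelemetry::Instrumentation::Rack', true),
  FakeInstrumentation.new('OpenTelemetry::Instrumentation::ActionPack', false)
]
puts uninstalled_names(family).inspect
```

In an integration test, the list would presumably hold real instances such as `OpenTelemetry::Instrumentation::Rack::Instrumentation.instance`, and the assertion would be that the returned array is empty.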

Co-Authored-By: Claude <noreply@anthropic.com>

The OTel Rails-family instrumentations (ActionPack, ActiveRecord, ...) patch via
ActiveSupport.on_load hooks that fire while Rails boots. The plugin configured
OpenTelemetry from Plugin#register (at LDClient.new), so when an app creates the
client lazily — e.g. from a model on first request, after Rails has booted —
those hooks had already fired and every Rails instrumentation logged
"Instrumentation: ... failed to install".

Fix it in two parts:

1. Ship a hyphenated entry point (lib/launchdarkly-observability.rb) matching the
   gem name so Bundler.require auto-loads the gem — and its Railtie — during
   boot. Previously the gem was only loadable as 'launchdarkly_observability'
   (underscore), so Bundler couldn't auto-require it and users loaded it manually
   in an initializer, too late for the Railtie.

2. Add a Railtie initializer that installs auto-instrumentation during boot,
   decoupled from the LD client. project_id for the resource is resolved from
   LAUNCHDARKLY_SDK_KEY, which is present in the environment at boot in the common
   case even when the client object is built lazily. At register time the plugin
   then only attaches exporters to the existing provider instead of reconfiguring
   it (which would drop the boot-time instrumentation). It runs
   `after: :load_config_initializers` and no-ops if a provider is already
   configured, so boot-time-init apps keep their own configuration untouched.

When the client is registered after boot and boot-time install did not run (e.g.
no SDK key in the environment at boot), the plugin now logs a single actionable
warning instead of the upstream flood of "failed to install" lines.
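
The boot-time decision described in part 2 can be sketched as follows. All names here (`BootInstallSketch`, `boot_action`, `BootInstallRailtie`) are hypothetical, the `provider_configured: false` argument is a placeholder, and the Railtie body only logs; it is a sketch of the ordering and the no-op rules, not the gem's initializer.

```ruby
# Hypothetical names throughout; not the gem's actual code.
module BootInstallSketch
  # Boot-time install needs the SDK key from the environment, because no
  # client object exists yet.
  def self.sdk_key_from(env)
    key = env['LAUNCHDARKLY_SDK_KEY'].to_s.strip
    key.empty? ? nil : key
  end

  # Decide what the boot hook should do.
  def self.boot_action(env:, provider_configured:)
    return :skip_already_configured if provider_configured
    return :skip_no_sdk_key unless sdk_key_from(env)

    :install
  end
end

# Only defined when Rails is loaded, so the file is harmless elsewhere.
if defined?(::Rails::Railtie)
  class BootInstallRailtie < ::Rails::Railtie
    initializer 'boot_install_sketch.install', after: :load_config_initializers do
      # Placeholder: a real check would ask whether a provider is already set up.
      action = BootInstallSketch.boot_action(env: ENV, provider_configured: false)
      # A real implementation would configure OpenTelemetry here when action == :install.
      Rails.logger&.debug("boot install action: #{action}")
    end
  end
end
```

Running the hook after `load_config_initializers` means an app that configures the plugin in its own initializer is seen as already configured and left alone.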

Co-Authored-By: Claude <noreply@anthropic.com>

Add a lazy client-initialization mode to the Rails demo app (LD_LAZY_INIT=1):
the initializer skips creating the client at boot, and a LazyLdClient model
creates it on first use — mirroring apps that build the client after Rails has
booted. A new integration test boots a separate Rails process in this mode and
asserts the Rails-family instrumentations are still installed, which only holds
because the Railtie installs them during boot. Verified that removing the
boot-time install makes the test fail with "failed to install".
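
A lazily created client of the kind described here might look like the sketch below. `LazyClientSketch` and `lazy_init_enabled?` are hypothetical, and the factory block stands in for whatever builds the real client; the demo app's `LazyLdClient` model may be structured differently.

```ruby
# Hypothetical lazy holder: the client is built on first use, not at boot.
class LazyClientSketch
  def initialize(&factory)
    @factory = factory
    @mutex = Mutex.new
    @client = nil
  end

  # Thread-safe memoization so concurrent first requests build one client.
  def client
    @mutex.synchronize { @client ||= @factory.call }
  end
end

# Mirrors the LD_LAZY_INIT=1 switch from the commit message.
def lazy_init_enabled?(env)
  env['LD_LAZY_INIT'] == '1'
end

builds = 0
holder = LazyClientSketch.new { builds += 1; Object.new }
holder.client
holder.client
puts builds
```

In the demo, the factory would presumably construct the LaunchDarkly client with the observability plugin; nothing runs until the first call to `client`, which is what makes boot-time instrumentation install necessary.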

Co-Authored-By: Claude <noreply@anthropic.com>

…lure (red)

Reproduces the CardFlight failure: on Rails 7.0 with the current gem,
opentelemetry-instrumentation-all resolves to the latest (0.94 ->
instrumentation-rails 0.42), whose Rails-family members raised their floor to
Rails 7.1. They fail their runtime compatible? check, log a flurry of
'failed to install' warnings, and never attach, so no auto-instrumented HTTP
server span is produced.

- e2e/ruby/rails/demo-rails70: copy of demo pinned to rails ~> 7.0 (Ruby 3.3.4
  kept so bundler still resolves the latest instrumentation, faithfully matching
  a current-Ruby Rails 7.0 customer).
- test/support/otlp_sink.rb: in-process OTLP/protobuf sink (pure Ruby, decodes
  via the exporter gems' proto classes) so the e2e test asserts telemetry over
  the real export path under bundle exec rake (no Docker/Node).
- test/integration/otlp_export_e2e_test.rb: asserts traces (server span), a log
  record, and a captured exception all reach the sink. RED here.
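
For readers unfamiliar with the idea of an in-process sink, here is a much-reduced sketch using only the standard library. It records the request path and raw body of each POST and does not decode protobuf, unlike the sink described above; `TinySinkSketch` is a hypothetical name, and the repro's real sink may be built quite differently.

```ruby
require 'socket'
require 'net/http'

# Hypothetical minimal sink: accepts HTTP POSTs on localhost and keeps the
# path and raw body so a test can assert that something was exported.
class TinySinkSketch
  attr_reader :port, :requests

  def initialize
    @server = TCPServer.new('127.0.0.1', 0) # port 0: the OS picks a free port
    @port = @server.addr[1]
    @requests = []
    @mutex = Mutex.new
    @thread = Thread.new { serve }
  end

  def stop
    @server.close
    @thread.join
  end

  private

  def serve
    loop { handle(@server.accept) }
  rescue IOError, Errno::EBADF
    # The listening socket was closed by #stop; end the thread quietly.
  end

  def handle(client)
    path = client.gets.to_s.split[1]
    length = 0
    # Read headers up to the blank line, remembering Content-Length.
    while (line = client.gets) && line != "\r\n"
      name, value = line.split(':', 2)
      length = value.to_i if name.casecmp?('content-length')
    end
    body = length.positive? ? client.read(length) : ''
    @mutex.synchronize { @requests << { path: path, body: body } }
    client.write("HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")
  ensure
    client.close
  end
end

sink = TinySinkSketch.new
uri = URI("http://127.0.0.1:#{sink.port}/v1/traces")
Net::HTTP.post(uri, 'payload', 'Content-Type' => 'application/x-protobuf')
puts sink.requests.first[:path]
sink.stop
```

Pointing the exporter endpoint at `http://127.0.0.1:<port>` is what lets a test assert on the real export path without Docker or an external collector.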

Boot logs show 8 Rails-family 'failed to install' warnings; the new test and the
existing observability_instrumentation_test both fail (missing server span).

The observability plugin depended on opentelemetry-instrumentation-all, whose
Rails-family members raised their floor to Rails 7.1. On Rails 7.0 every
Rails-family instrumentation failed its compatible? check, logged a flurry of
'failed to install' warnings, and never attached — so apps lost all Rails
auto-instrumentation (no HTTP server spans, DB spans, etc.) and got only manual
instrumentation. The meta gem couples versions, so the only fix is to stop using
it.

- Replace opentelemetry-instrumentation-all with individual instrumentation gems
  (lib/launchdarkly_observability/instrumentations.rb). The Rails family is
  capped below each member's Rails-7.1-enforcing release (rails <0.42,
  action_pack <0.18, active_record/action_view <0.13, active_support <0.12,
  active_job <0.12, action_mailer <0.8, active_storage <0.5); everything else
  tracks the latest. These capped releases still work on Rails 7.1+, so modern
  apps are unaffected, and use_all still activates any extra instrumentation gem
  a consumer adds.
- Replace the per-instrumentation 'failed to install' log flurry with a single
  actionable summary via a logger filter (InstrumentationLogFilter), naming the
  instrumentations that could not attach and how to resolve it.

Verified on the demo-rails70 repro: 0 'failed to install', all 13
instrumentations install, and traces (server span) + a log + a captured
exception are exported to the OTLP sink. 5 runs, 16 assertions, 0 failures.

…lures correctly

- Move the log-suppression filter into its own file
  (instrumentation_log_filter.rb) with class-level capture_failures/
  failure_warning helpers (keeps OpenTelemetryConfig under the class-length
  limit and is unit-testable).
- Wrap the whole OpenTelemetry::SDK.configure call rather than just use_all:
  the SDK installs instrumentations AFTER the configure block returns, so the
  earlier placement around use_all never saw the install logging. Now the
  per-instrumentation 'failed to install' / 'successfully installed' chatter is
  actually suppressed and failures are collected for the single summary.
- rails.rb: Railtie#otel_logger_provider_available? now returns an explicit
  boolean instead of nil when the logs constant is absent. Fixes a pre-existing
  order-dependent failure in rails_railtie_test (verified failing identically on
  the unmodified gem at the same seed).
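
The capture-and-summarize behavior described in these bullets can be sketched with a thin logger wrapper. This is an assumption-laden illustration: `InstallLogFilterSketch` and `with_install_filter` are hypothetical, the message patterns are modeled on the "Instrumentation: ... failed to install" lines quoted earlier, and the gem's InstrumentationLogFilter may work differently.

```ruby
require 'logger'
require 'stringio'

# Hypothetical wrapper: swallows per-instrumentation install chatter and
# remembers which instrumentations failed.
class InstallLogFilterSketch
  FAILED_PATTERN = /Instrumentation: (\S+) failed to install/
  INSTALLED_PATTERN = /successfully installed/

  attr_reader :failures

  def initialize(logger)
    @logger = logger
    @failures = []
  end

  # Mirror the common Logger level methods; pass unrelated messages through.
  %i[debug info warn error].each do |level|
    define_method(level) do |message = nil, &block|
      text = (message || block&.call).to_s
      if (match = FAILED_PATTERN.match(text))
        @failures << match[1]
      elsif !INSTALLED_PATTERN.match?(text)
        @logger.public_send(level, text)
      end
    end
  end

  def summary
    return nil if @failures.empty?

    "#{@failures.size} instrumentation(s) failed to install: #{@failures.join(', ')}"
  end
end

# Wrap a whole block (standing in for the full configure call) and emit a
# single summary warning afterwards.
def with_install_filter(logger)
  filter = InstallLogFilterSketch.new(logger)
  yield filter
  logger.warn(filter.summary) if filter.summary
  filter.failures
end
```

The wrapper has to surround the entire configure call, since, as noted above, the install logging happens after the configure block returns.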

Updated e2e Gemfile.locks for the new individual-gem resolution. demo (7.2) and
api-only (7.2) suites pass; demo-rails70 (7.0) passes with 0 'failed to install'.

Adds e2e-rails-legacy to ruby-plugin.yml, running e2e/ruby/rails/demo-rails70
(Rails 7.0) under bundle exec rake. The existing e2e jobs run Rails 7.2 — above
the floor the OTel Rails-family instrumentations enforce — so they cannot catch
an old-Rails compatibility break. This job fails if Rails-family
auto-instrumentation stops attaching on Rails 7.0.

Demonstrated red-then-green with the job's exact command:
- pre-fix gemspec (instrumentation-all -> instrumentation-rails 0.42.0):
  5 runs, 9 assertions, 4 failures (ActionPack not installed, no server span)
- fixed gemspec (instrumentation-rails 0.41.0): 5 runs, 16 assertions, 0 failures

- README: replace the opentelemetry-instrumentation-all dependency note with a
  'Ruby & Rails compatibility' section (Ruby >= 3.0, Rails >= 7.0 matrix, the
  individual-gem rationale, the single-warning behavior, and the per-gem pin
  escape hatch). Update the auto-instrumentation list to the curated set.
- CHANGELOG: add an Unreleased Bug Fixes entry for the Rails 7.0 fix.
- demo-rails70/README: explain what the repro proves and how to run it.

…ncies)

rubocop's Gemspec/OrderedDependencies (run by the build CI job) requires
alphabetical order within each section. No functional change.

@ccschmitz-launchdarkly ccschmitz-launchdarkly force-pushed the feat/ruby-old-rails-compat branch from d039e6b to aaf9f8d June 23, 2026 19:32
@ccschmitz-launchdarkly ccschmitz-launchdarkly changed the base branch from feat/ruby-rails-boot-instrumentation to main June 23, 2026 19:32
@github-actions

Contributor

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|-------|---------|----------|-----------|--------|
| 1327 | 1151 | 87% | 0% | 🟢 |

New Files

No new covered files...

Modified Files

No covered modified files...

updated for commit: aaf9f8d by action🐍

  root: <%= Rails.root.join("tmp/storage") %>

local:
  service: Disk

High severity vulnerability may affect your project—review required:
Line 6 lists a dependency (activestorage) with a known High severity vulnerability.

ℹ️ Why this matters

Affected versions of activestorage are vulnerable to Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Active Storage DiskService builds filesystem paths from blob keys without containing them to the storage root; a key containing traversal sequences can read, write, or delete arbitrary files if that key reaches disk storage (for example, when custom keys are derived from untrusted input).

References: GHSA, CVE

To resolve this comment:
Check if you are using ActiveStorage::Service::DiskService in your Ruby application.

💬 Ignore this finding

To ignore this, reply with:

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

You can view more details on this finding in the Semgrep AppSec Platform here.

@@ -0,0 +1,33 @@
test:
  service: Disk

High severity vulnerability may affect your project—review required:
Line 2 lists a dependency (activestorage) with a known High severity vulnerability.

ℹ️ Why this matters

Affected versions of activestorage are vulnerable to Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Active Storage DiskService builds filesystem paths from blob keys without containing them to the storage root; a key containing traversal sequences can read, write, or delete arbitrary files if that key reaches disk storage (for example, when custom keys are derived from untrusted input).

References: GHSA, CVE

To resolve this comment:
Check if you are using ActiveStorage::Service::DiskService in your Ruby application.

💬 Ignore this finding

To ignore this, reply with:

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

You can view more details on this finding in the Semgrep AppSec Platform here.

@semgrep-code-launchdarkly

Semgrep found 1 ssc-c8b7a1f2-4d36-4f0a-9e2b-1a5c8d7e6f30 finding:

Risk: Affected versions of vite and vite-plus are vulnerable to Exposure of Sensitive Information to an Unauthorized Actor / Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Vite's server.fs.deny blocklist—which protects sensitive files such as .env and certificate files from being served—can be bypassed on Windows using alternate path representations (NTFS Alternate Data Stream syntax like /.env::$DATA?raw, or 8.3 short filenames), allowing an attacker to read otherwise-denied files when the dev server is exposed to the network.

Manual Review Advice: A vulnerability from this advisory is reachable if you expose the Vite dev server or vite-plus to the network by configuring a non-loopback address with the --host CLI flag on Windows.

Fix: Upgrade this library to at least version 6.4.3 at observability-sdk/yarn.lock:47256.

Reference(s): GHSA-fx2h-pf6j-xcff

@ccschmitz-launchdarkly ccschmitz-launchdarkly marked this pull request as ready for review June 25, 2026 18:25
@ccschmitz-launchdarkly ccschmitz-launchdarkly requested a review from a team as a code owner June 25, 2026 18:25
When auto-instrumentation installs during Rails boot (the LD_LAZY_INIT path), the tracer provider's resource is built before the plugin's service_name/service_version options exist, so it carries the inferred service name. configure_traces then only attached an exporter and left that resource in place, while logs and metrics built a fresh resource from the options -- so trace spans reported a different service identity than logs/metrics. Update the existing provider's resource in place (it reads @resource live per span) instead of reconfiguring the SDK, which would drop the boot-time instrumentation.

Flagged by Cursor Bugbot on #643.
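
To illustrate why updating the resource in place is enough, here is a toy model. `SketchProvider` and `refresh_resource!` are hypothetical and do not use the OpenTelemetry SDK; the sketch only assumes, as the commit message states, that the provider reads `@resource` each time a span is created.

```ruby
# Toy provider (hypothetical): reads @resource live for every span and keeps
# a list standing in for the boot-time instrumentation hooks.
class SketchProvider
  attr_reader :hooks

  def initialize(resource)
    @resource = resource
    @hooks = %i[rack action_pack]
  end

  def start_span(name)
    { name: name, resource: @resource }
  end
end

# Swap the resource on the existing provider instead of building a new one,
# so the hooks installed at boot survive.
def refresh_resource!(provider, overrides)
  current = provider.instance_variable_get(:@resource)
  provider.instance_variable_set(:@resource, current.merge(overrides).freeze)
end

provider = SketchProvider.new({ 'service.name' => 'inferred-name' }.freeze)
refresh_resource!(provider, 'service.name' => 'rails-demo-app', 'service.version' => '1.0.0')
puts provider.start_span('GET /')[:resource]['service.name']
```

Spans created after the refresh carry the same service identity as logs and metrics, while the provider object and everything attached to it at boot stay in place.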
FAILED_PATTERN capture and the single summary warning only fire when an instrumentation is below its framework floor (the pre-fix red state), so the green e2e suite never exercises them. Add direct unit coverage.

Requiring opentelemetry/instrumentation/rails pulls all seven Rails-family instrumentations (incl. action_mailer and active_storage); spell them out so the comment matches the gemspec caps and explains where those caps come from.

demo and demo-rails70 both reported service_name 'rails-demo-app', so their telemetry collided. Name the Rails 7.0 repro 'rails7-demo-app' so its spans are distinguishable from the Rails 7.2 demo.

The lockfile still pinned the path gem at 0.2.1 while the gem is 0.2.2 (and demo-rails70 was already updated). Align it.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 9da2c1a. Configure here.

@sample_evaluations = state.values_map.first(5).to_h

# Make an HTTP request (auto-instrumented by OpenTelemetry)
@http_url = 'http://www.example.com/?test=1'
with_launchdarkly_span('pages-home-fetch', attributes: { 'custom.source' => 'demo' }) do
response = Net::HTTP.get_response(URI.parse(@http_url))
@@ -0,0 +1,11 @@
development:
  adapter: redis
  url: redis://localhost:6379/1

production:
  adapter: redis
  url: <%= ENV.fetch("REDIS_URL") { "redis://localhost:6379/1" } %>
@mutex = Mutex.new
@server = WEBrick::HTTPServer.new(
Port: port,
BindAddress: '127.0.0.1',
# boot. This keeps the E2E test fully self-contained: no external collector,
# no network egress, no LaunchDarkly backend.
OTLP_SINK_PORT = (ENV['OTLP_SINK_PORT'] || '4327').to_i
ENV['OTEL_EXPORTER_OTLP_ENDPOINT'] ||= "http://127.0.0.1:#{OTLP_SINK_PORT}"

config = LaunchDarklyObservability::OpenTelemetryConfig.new(
project_id: 'my-project',
otlp_endpoint: 'http://localhost:4318',
In the lazy-init path the Railtie installs auto-instrumentation at boot with defaults, before the plugin exists. Options that only take effect at install time -- custom instrumentations config and enable_traces: false -- can't be applied retroactively: an instrumentation patches its library at boot (via ActiveSupport.on_load hooks) and can't be reconfigured or detached afterward. Emit a single actionable warning instead of silently dropping them, matching the existing warn_if_rails_instrumentation_missed pattern. (service_name/service_version remain the exception -- the trace resource is read live per span and is refreshed in configure_traces.)

Flagged by Cursor Bugbot on #643.
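
The rule for which options trigger the warning can be sketched as below. Both helper names are hypothetical and the warning text is invented for the example; only the two option names (`instrumentations`, `enable_traces: false`) and the advice to use a config/initializer come from the discussion in this PR.

```ruby
# Hypothetical helper: which install-time options were passed too late to
# take effect, given that boot-time install already ran.
def ignored_install_time_options(options, boot_installed:)
  return [] unless boot_installed

  ignored = []
  ignored << :instrumentations unless options[:instrumentations].to_h.empty?
  ignored << :enable_traces if options[:enable_traces] == false
  ignored
end

# One warning for all ignored options, or nil when there is nothing to say.
def ignored_options_warning(ignored)
  return nil if ignored.empty?

  "Options #{ignored.join(', ')} were ignored because auto-instrumentation " \
    'was already installed during boot; create the client from a config/initializer to apply them.'
end

ignored = ignored_install_time_options({ enable_traces: false }, boot_installed: true)
puts ignored_options_warning(ignored)
```

Options such as `service_name` are deliberately absent from the check, since they can still be applied after boot.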
@ccschmitz-launchdarkly

Contributor Author

Re: Cursor Bugbot — "Boot trace path skips options" (opentelemetry_config.rb)

Addressed the silent-drop in 6980ff9: the lazy-init path now emits a single warning when instrumentations config or enable_traces: false are passed but can't take effect, pointing users to the config/initializer path.

The remaining behavior is by design. In the lazy-init path the Railtie installs auto-instrumentation during boot (via ActiveSupport.on_load hooks), before the LaunchDarkly client — and therefore the plugin's options — exist. An OTel instrumentation patches its target library at that point and can't be reconfigured or detached afterward, so install-time options (instrumentations, enable_traces: false) genuinely can't be applied retroactively without giving up boot-time attachment (the whole point of this path). service_name/service_version are the one exception, since the tracer provider reads its resource live per span — that's why configure_traces refreshes it. To honor install-time options, create the client from a config/initializer so #configure runs during boot with them.

4 participants