fix(ruby): keep OTel auto-instrumentation working on Rails 7.0#643
fix(ruby): keep OTel auto-instrumentation working on Rails 7.0#643ccschmitz-launchdarkly wants to merge 18 commits into
Conversation
abelonogov-ld
left a comment
There was a problem hiding this comment.
Cool! Test app features
The Railtie's `attach_otel_log_bridge` delegated to a module method (`LaunchDarklyObservability.otel_logger_provider_available?`) defined in the `class << self` block of launchdarkly_observability.rb. That block runs *after* the `require_relative '.../rails'` near the top of the file. When the gem is required lazily after Rails has booted, the Railtie's `config.after_initialize` hook fires synchronously during the require — before the module method exists — so the bridge attach failed with: Could not attach log bridge to Rails.logger: undefined method `otel_logger_provider_available?' for module LaunchDarklyObservability Inline the availability check in the Railtie so it no longer depends on load order, and move the `rails.rb` require to the bottom of the main file (after the module body is fully defined) so future Railtie code can't reintroduce the same class of bug. Adds a regression test that removes the module method to simulate the load-order state. Co-Authored-By: Claude <noreply@anthropic.com>
…rsions The default instrumentation config passed `enable_recognize_route: true` to the Rails instrumentation and `db_statement: :include` to ActiveRecord. Neither option exists on those instrumentations (verified against the pinned gem versions: rails 0.39.1, active_record 0.11.1, rack 0.29.0), so they were no-ops that emitted a warning on every boot: Instrumentation ... ignored the following unknown configuration options [...] Remove them. Route-based span naming (http.route) is handled automatically by the ActionPack instrumentation, and SQL capture comes from the database adapter instrumentations (Mysql2, PG, ...) which already obfuscate statements by default. Co-Authored-By: Claude <noreply@anthropic.com>
The Rails e2e demo app only verified that a controller responds; nothing checked
that the OTel Rails-family instrumentations actually attached. That gap is why a
recent customer report ("Instrumentation: ... failed to install") went unnoticed
— the app initializes the plugin during Rails boot (the happy path) and never
asserted the result.
Add an integration test that asserts Rack, ActionPack, ActiveRecord,
ActiveSupport, and Rails report `installed? == true` after boot (the exact
inverse of the "failed to install" warning), and that an HTTP request produces a
server span. The plugin only configures OpenTelemetry when the LD client
registers it, which needs a non-empty SDK key, so test_helper sets a dummy key
before boot (invalid, so the client never connects).
Co-Authored-By: Claude <noreply@anthropic.com>
The OTel Rails-family instrumentations (ActionPack, ActiveRecord, ...) patch via ActiveSupport.on_load hooks that fire while Rails boots. The plugin configured OpenTelemetry from Plugin#register (at LDClient.new), so when an app creates the client lazily — e.g. from a model on first request, after Rails has booted — those hooks had already fired and every Rails instrumentation logged "Instrumentation: ... failed to install". Fix it in two parts: 1. Ship a hyphenated entry point (lib/launchdarkly-observability.rb) matching the gem name so Bundler.require auto-loads the gem — and its Railtie — during boot. Previously the gem was only loadable as 'launchdarkly_observability' (underscore), so Bundler couldn't auto-require it and users loaded it manually in an initializer, too late for the Railtie. 2. Add a Railtie initializer that installs auto-instrumentation during boot, decoupled from the LD client. project_id for the resource is resolved from LAUNCHDARKLY_SDK_KEY, which is present in the environment at boot in the common case even when the client object is built lazily. At register time the plugin then only attaches exporters to the existing provider instead of reconfiguring it (which would drop the boot-time instrumentation). It runs `after: :load_config_initializers` and no-ops if a provider is already configured, so boot-time-init apps keep their own configuration untouched. When the client is registered after boot and boot-time install did not run (e.g. no SDK key in the environment at boot), the plugin now logs a single actionable warning instead of the upstream flood of "failed to install" lines. Co-Authored-By: Claude <noreply@anthropic.com>
Add a lazy client-initialization mode to the Rails demo app (LD_LAZY_INIT=1): the initializer skips creating the client at boot, and a LazyLdClient model creates it on first use — mirroring apps that build the client after Rails has booted. A new integration test boots a separate Rails process in this mode and asserts the Rails-family instrumentations are still installed, which only holds because the Railtie installs them during boot. Verified that removing the boot-time install makes the test fail with "failed to install". Co-Authored-By: Claude <noreply@anthropic.com>
…lure (red) Reproduces the CardFlight failure: on Rails 7.0 with the current gem, opentelemetry-instrumentation-all resolves to the latest (0.94 -> instrumentation-rails 0.42), whose Rails-family members raised their floor to Rails 7.1. They fail their runtime compatible? check, log a flurry of 'failed to install' warnings, and never attach, so no autoinstrumented HTTP server span is produced. - e2e/ruby/rails/demo-rails70: copy of demo pinned to rails ~> 7.0 (Ruby 3.3.4 kept so bundler still resolves the latest instrumentation, faithfully matching a current-Ruby Rails 7.0 customer). - test/support/otlp_sink.rb: in-process OTLP/protobuf sink (pure Ruby, decodes via the exporter gems' proto classes) so the e2e test asserts telemetry over the real export path under bundle exec rake (no Docker/Node). - test/integration/otlp_export_e2e_test.rb: asserts traces (server span), a log record, and a captured exception all reach the sink. RED here. Boot logs show 8 Rails-family 'failed to install' warnings; the new test and the existing observability_instrumentation_test both fail (missing server span).
The observability plugin depended on opentelemetry-instrumentation-all, whose Rails-family members raised their floor to Rails 7.1. On Rails 7.0 every Rails-family instrumentation failed its compatible? check, logged a flurry of 'failed to install' warnings, and never attached — so apps lost all Rails auto-instrumentation (no HTTP server spans, DB spans, etc.) and got only manual instrumentation. The meta gem couples versions, so the only fix is to stop using it. - Replace opentelemetry-instrumentation-all with individual instrumentation gems (lib/launchdarkly_observability/instrumentations.rb). The Rails family is capped below each member's Rails-7.1-enforcing release (rails <0.42, action_pack <0.18, active_record/action_view <0.13, active_support <0.12, active_job <0.12, action_mailer <0.8, active_storage <0.5); everything else tracks the latest. These capped releases still work on Rails 7.1+, so modern apps are unaffected, and use_all still activates any extra instrumentation gem a consumer adds. - Replace the per-instrumentation 'failed to install' log flurry with a single actionable summary via a logger filter (InstrumentationLogFilter), naming the instrumentations that could not attach and how to resolve it. Verified on the demo-rails70 repro: 0 'failed to install', all 13 instrumentations install, and traces (server span) + a log + a captured exception are exported to the OTLP sink. 5 runs, 16 assertions, 0 failures.
…lures correctly - Move the log-suppression filter into its own file (instrumentation_log_filter.rb) with class-level capture_failures/ failure_warning helpers (keeps OpenTelemetryConfig under the class-length limit and is unit-testable). - Wrap the whole OpenTelemetry::SDK.configure call rather than just use_all: the SDK installs instrumentations AFTER the configure block returns, so the earlier placement around use_all never saw the install logging. Now the per-instrumentation 'failed to install' / 'successfully installed' chatter is actually suppressed and failures are collected for the single summary. - rails.rb: Railtie#otel_logger_provider_available? now returns an explicit boolean instead of nil when the logs constant is absent. Fixes a pre-existing order-dependent failure in rails_railtie_test (verified failing identically on the unmodified gem at the same seed). Updated e2e Gemfile.locks for the new individual-gem resolution. demo (7.2) and api-only (7.2) suites pass; demo-rails70 (7.0) passes with 0 'failed to install'.
Adds e2e-rails-legacy to ruby-plugin.yml, running e2e/ruby/rails/demo-rails70 (Rails 7.0) under bundle exec rake. The existing e2e jobs run Rails 7.2 — above the floor the OTel Rails-family instrumentations enforce — so they cannot catch an old-Rails compatibility break. This job fails if Rails-family auto-instrumentation stops attaching on Rails 7.0. Demonstrated red-then-green with the job's exact command: - pre-fix gemspec (instrumentation-all -> instrumentation-rails 0.42.0): 5 runs, 9 assertions, 4 failures (ActionPack not installed, no server span) - fixed gemspec (instrumentation-rails 0.41.0): 5 runs, 16 assertions, 0 failures
- README: replace the opentelemetry-instrumentation-all dependency note with a 'Ruby & Rails compatibility' section (Ruby >= 3.0, Rails >= 7.0 matrix, the individual-gem rationale, the single-warning behavior, and the per-gem pin escape hatch). Update the auto-instrumentation list to the curated set. - CHANGELOG: add an Unreleased Bug Fixes entry for the Rails 7.0 fix. - demo-rails70/README: explain what the repro proves and how to run it.
…ncies) rubocop's Gemspec/OrderedDependencies (run by the build CI job) requires alphabetical order within each section. No functional change.
d039e6b to
aaf9f8d
Compare
☂️ Python Coverage
Overall Coverage
New FilesNo new covered files... Modified FilesNo covered modified files...
|
| root: <%= Rails.root.join("tmp/storage") %> | ||
|
|
||
| local: | ||
| service: Disk |
There was a problem hiding this comment.
High severity vulnerability may affect your project—review required:
Line 6 lists a dependency (activestorage) with a known High severity vulnerability.
ℹ️ Why this matters
Affected versions of activestorage are vulnerable to Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Active Storage DiskService builds filesystem paths from blob keys without containing them to the storage root; a key containing traversal sequences can read, write, or delete arbitrary files if that key reaches disk storage (for example when custom keys are derived from untrusted input)
To resolve this comment:
Check if you are using ActiveStorage::Service::DiskService in your Ruby application.
- If you're affected, upgrade this dependency to at least version 7.2.3.1 at e2e/ruby/rails/demo-rails70/Gemfile.lock.
- If you're not affected, comment
/fp we don't use this [condition]
💬 Ignore this finding
To ignore this, reply with:
/fp <comment>for false positive/ar <comment>for acceptable risk/other <comment>for all other reasons
You can view more details on this finding in the Semgrep AppSec Platform here.
| @@ -0,0 +1,33 @@ | |||
| test: | |||
| service: Disk | |||
There was a problem hiding this comment.
High severity vulnerability may affect your project—review required:
Line 2 lists a dependency (activestorage) with a known High severity vulnerability.
ℹ️ Why this matters
Affected versions of activestorage are vulnerable to Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Active Storage DiskService builds filesystem paths from blob keys without containing them to the storage root; a key containing traversal sequences can read, write, or delete arbitrary files if that key reaches disk storage (for example when custom keys are derived from untrusted input)
To resolve this comment:
Check if you are using ActiveStorage::Service::DiskService in your Ruby application.
- If you're affected, upgrade this dependency to at least version 7.2.3.1 at e2e/ruby/rails/demo-rails70/Gemfile.lock.
- If you're not affected, comment
/fp we don't use this [condition]
💬 Ignore this finding
To ignore this, reply with:
/fp <comment>for false positive/ar <comment>for acceptable risk/other <comment>for all other reasons
You can view more details on this finding in the Semgrep AppSec Platform here.
|
Semgrep found 1 Risk: Affected versions of vite and vite-plus are vulnerable to Exposure of Sensitive Information to an Unauthorized Actor / Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Vite's server.fs.deny blocklist—which protects sensitive files such as .env and certificate files from being served—can be bypassed on Windows using alternate path representations (NTFS Alternate Data Stream syntax like Manual Review Advice: A vulnerability from this advisory is reachable if you expose the Vite dev server or vite-plus to the network by configuring a non-loopback address using the --host CLI flag on Windows Fix: Upgrade this library to at least version 6.4.3 at observability-sdk/yarn.lock:47256. Reference(s): GHSA-fx2h-pf6j-xcff |
When auto-instrumentation installs during Rails boot (the LD_LAZY_INIT path), the tracer provider's resource is built before the plugin's service_name/service_version options exist, so it carries the inferred service name. configure_traces then only attached an exporter and left that resource in place, while logs and metrics built a fresh resource from the options -- so trace spans reported a different service identity than logs/metrics. Update the existing provider's resource in place (it reads @resource live per span) instead of reconfiguring the SDK, which would drop the boot-time instrumentation. Flagged by Cursor Bugbot on #643.
FAILED_PATTERN capture and the single summary warning only fire when an instrumentation is below its framework floor (the pre-fix red state), so the green e2e suite never exercises them. Add direct unit coverage.
Requiring opentelemetry/instrumentation/rails pulls all seven Rails-family instrumentations (incl. action_mailer and active_storage); spell them out so the comment matches the gemspec caps and explains where those caps come from.
demo and demo-rails70 both reported service_name 'rails-demo-app', so their telemetry collided. Name the Rails 7.0 repro 'rails7-demo-app' so its spans are distinguishable from the Rails 7.2 demo.
The lockfile still pinned the path gem at 0.2.1 while the gem is 0.2.2 (and demo-rails70 was already updated). Align it.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 9da2c1a. Configure here.
| @sample_evaluations = state.values_map.first(5).to_h | ||
|
|
||
| # Make an HTTP request (auto-instrumented by OpenTelemetry) | ||
| @http_url = 'http://www.example.com/?test=1' |
| # Make an HTTP request (auto-instrumented by OpenTelemetry) | ||
| @http_url = 'http://www.example.com/?test=1' | ||
| with_launchdarkly_span('pages-home-fetch', attributes: { 'custom.source' => 'demo' }) do | ||
| response = Net::HTTP.get_response(URI.parse(@http_url)) |
| @@ -0,0 +1,11 @@ | |||
| development: | |||
| adapter: redis | |||
| url: redis://localhost:6379/1 | |||
|
|
||
| production: | ||
| adapter: redis | ||
| url: <%= ENV.fetch("REDIS_URL") { "redis://localhost:6379/1" } %> |
| @mutex = Mutex.new | ||
| @server = WEBrick::HTTPServer.new( | ||
| Port: port, | ||
| BindAddress: '127.0.0.1', |
| # boot. This keeps the E2E test fully self-contained: no external collector, | ||
| # no network egress, no LaunchDarkly backend. | ||
| OTLP_SINK_PORT = (ENV['OTLP_SINK_PORT'] || '4327').to_i | ||
| ENV['OTEL_EXPORTER_OTLP_ENDPOINT'] ||= "http://127.0.0.1:#{OTLP_SINK_PORT}" |
|
|
||
| config = LaunchDarklyObservability::OpenTelemetryConfig.new( | ||
| project_id: 'my-project', | ||
| otlp_endpoint: 'http://localhost:4318', |
In the lazy-init path the Railtie installs auto-instrumentation at boot with defaults, before the plugin exists. Options that only take effect at install time -- custom instrumentations config and enable_traces: false -- can't be applied retroactively: an instrumentation patches its library at boot (via ActiveSupport.on_load hooks) and can't be reconfigured or detached afterward. Emit a single actionable warning instead of silently dropping them, matching the existing warn_if_rails_instrumentation_missed pattern. (service_name/service_version remain the exception -- the trace resource is read live per span and is refreshed in configure_traces.) Flagged by Cursor Bugbot on #643.
|
Re: Cursor Bugbot — "Boot trace path skips options" (opentelemetry_config.rb) Addressed the silent-drop in 6980ff9: the lazy-init path now emits a single warning when The remaining behavior is by design. In the lazy-init path the Railtie installs auto-instrumentation during boot (via |

Summary
The observability plugin depended on
opentelemetry-instrumentation-all, whose Rails-family members raised their minimum to Rails 7.1. On Rails 7.0 every Rails-family instrumentation fails its runtimecompatible?check, logs a flurry of"... failed to install"warnings, and never attaches — so the app loses all Rails auto-instrumentation (no HTTP server spans, DB spans, etc.) and gets only manual instrumentation. The meta-gem couples every instrumentation to a single version, so the fix is to stop using it.Self-contained against
main: this PR includes the boot-instrumentation foundation (previously split out as #582) plus the old-Rails compatibility fix. Supersedes #582.Changes
Boot auto-instrumentation (formerly #582)
Old-Rails compatibility
opentelemetry-instrumentation-allfor individual gems. The Rails family is capped below each member's Rails-7.1-enforcing release (rails <0.42,action_pack <0.18,action_view/active_record<0.13,active_support <0.12,active_job <0.12,action_mailer <0.8,active_storage <0.5); everything else tracks the latest. Capped releases still work on Rails 7.1+, so modern apps keep the latest non-Rails instrumentations.use_allstill activates any extra instrumentation gem a consumer adds.InstrumentationLogFilterreplaces the per-instrumentation "failed to install" flurry with one actionable summary (wraps the wholeSDK.configurecall, since the SDK installs instrumentations after the configure block returns).e2e/ruby/rails/demo-rails70(Rails 7.0) with a pure-Ruby in-process OTLP protobuf sink asserting traces (a server span), a log, and a captured exception are exported over the wire — runs underbundle exec rake, no Docker/Node.e2e-rails-legacyjob inruby-plugin.ymlas a regression guard (the other e2e apps run Rails 7.2, above the floor).Context
CardFlight (the original report) has since upgraded Rails, so this no longer blocks them — but it remains a valuable general regression guard so a future instrumentation-gem bump can't silently re-break Rails 7.0.
Testing
Red→green with the CI job's command (
bundle exec rakeindemo-rails70):5 runs, 9 assertions, 4 failures5 runs, 16 assertions, 0 failures, 0 "failed to install"demo(7.2) andapi-only(7.2) pass and still resolve the latest non-Rails instrumentations (rack 0.31, redis 0.29, pg 0.36, …). Gem unit tests + rubocop green. Rebased ontomain(0.2.2).Notes
rails_railtie_teststrict-boolean de-flake here also appears in fix(ruby): silence observability plugin warnings (log bridge, nil context id, instrumentation options) #581 — coordinate so only one lands.🤖 Generated with Claude Code
Note
Medium Risk
Changes default OpenTelemetry dependency resolution and boot/configure ordering for all Ruby/Rails consumers; behavior is heavily tested but mis-pinned gems could affect instrumentation on edge framework versions.
Overview
Fixes Rails 7.0 apps losing OpenTelemetry auto-instrumentation when
opentelemetry-instrumentation-allpulled in Rails-family gems that require Rails 7.1+.The launchdarkly-observability gem drops the
instrumentation-allmeta-gem for a curated set of individualopentelemetry-instrumentation-*dependencies, with the Rails family capped below the 7.1-enforcing releases while other instrumentations stay on current versions.InstrumentationLogFiltercollapses per-gem “failed to install” noise into one actionable warning.Boot-time install (Railtie +
install_rails_instrumentation) attaches Rails OTel hooks during boot so lazy LaunchDarkly client init still works; plugin registration later adds exporters without replacing the tracer provider. A hyphenatedlaunchdarkly-observability.rbentry point enablesBundler.requireto load the Railtie early.Adds
e2e/ruby/rails/demo-rails70(Rails 7.0) with an in-process OTLP protobuf sink and integration tests (server span, logs, exceptions), plus CI jobe2e-rails-legacy. Existing demo / api-only apps get matching lockfile changes and lazy-init instrumentation tests.Reviewed by Cursor Bugbot for commit 6980ff9. Bugbot is set up for automated code reviews on this repo. Configure here.