Telemetry: instrument auth, deployment, and credential lifecycle #984

@EhabY

Description

Follow-up from the local telemetry audit of #953 / #903.

Scope

Main already has broad command-level telemetry (command.invoked), auth recovery/token-refresh traces, activation auth state, and HTTP rollups. Keep this issue focused on gaps where those signals do not explain the outcome.

Add a small number of domain traces/events for:

  • manual login outcome: one auth.login trace with source, method, result, bounded failure/cancel reason, and duration
  • logout outcome: one auth.logout trace with result, bounded failure/cancel reason, and duration
  • credential persistence/removal: one trace around store and one around clear, with keyring enabled/disabled, credential mode/category, result, bounded failure category, and duration
  • deployment lifecycle/recovery: compact events for meaningful state changes such as suspended, recovered, cross-window deployment detected, and auth-config recovery failed

Do not add separate started|succeeded|failed|cancelled events when one trace can represent result and duration.
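The single-trace approach can be sketched as a wrapper that emits exactly one record per operation, carrying both result and duration. This is a minimal sketch, not the extension's API: `traceOutcome`, `TraceRecord`, the `emit` sink, the `classify` callback, and the injected `now` clock are all hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch: none of these names come from the extension's codebase.
type TraceResult = "success" | "failure" | "cancelled";

interface TraceRecord {
  name: string;
  result: TraceResult;
  reason?: string; // bounded category, never a raw error message
  durationMs: number;
  properties: Record<string, string>;
}

// What the caller-supplied classifier returns for a thrown error.
interface FailureOutcome {
  result: "failure" | "cancelled";
  reason: string;
}

// Runs `operation` and emits exactly one record, whether it resolves or throws.
// The original error is rethrown so callers keep their existing handling.
async function traceOutcome<T>(
  name: string,
  properties: Record<string, string>,
  emit: (record: TraceRecord) => void,
  operation: () => Promise<T>,
  classify: (error: unknown) => FailureOutcome = () => ({
    result: "failure",
    reason: "unknown",
  }),
  now: () => number = () => Date.now(),
): Promise<T> {
  const start = now();
  try {
    const value = await operation();
    emit({ name, result: "success", durationMs: now() - start, properties });
    return value;
  } catch (error) {
    const outcome = classify(error);
    emit({
      name,
      result: outcome.result,
      reason: outcome.reason,
      durationMs: now() - start,
      properties,
    });
    throw error;
  }
}
```

Under these assumptions, a manual login would call `traceOutcome("auth.login", { source, method }, emit, doLogin, classifyLoginError)`, and the same wrapper would serve `auth.logout` and the credential store/clear traces with different property sets.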

Avoid duplicating

  • command.invoked already covers command duration/result for coder.login, coder.logout, and coder.switchDeployment.
  • auth.login_prompted, auth.unauthorized_intercepted, auth.token_refreshed, and activation.deployment_init already cover prompted re-auth, 401 recovery, token refresh, and activation-time auth validation.
  • Deployment URL is already part of telemetry context today; do not add more raw URLs or deployment identifiers unless we first settle the privacy/redaction story.

Privacy

Do not record tokens, authorization codes, auth URLs, callback URLs, raw browser URLs, or free-form user-controlled strings as dimensions. Use bounded enums for failure/cancel categories.
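One way to enforce bounded enums is to classify errors into a fixed category list and to drop any dimension whose value is not on an allowlist. The sketch below is illustrative only: the category names, the `category` field on errors, and the helpers `toFailureCategory` and `sanitizeDimensions` are assumptions, since the issue does not enumerate the categories.

```typescript
// Hypothetical categories; the issue does not list the real ones.
const FAILURE_CATEGORIES = [
  "network",
  "unauthorized",
  "timeout",
  "keyring_unavailable",
  "unknown",
] as const;
type FailureCategory = (typeof FAILURE_CATEGORIES)[number];

function isFailureCategory(value: unknown): value is FailureCategory {
  return (
    typeof value === "string" &&
    (FAILURE_CATEGORIES as readonly string[]).indexOf(value) !== -1
  );
}

// Maps an arbitrary error to a bounded category. The error message is never
// forwarded, because it may contain URLs, tokens, or user-controlled text.
// Assumes errors may carry a `category` field (hypothetical convention).
function toFailureCategory(error: unknown): FailureCategory {
  if (typeof error === "object" && error !== null && "category" in error) {
    const category = (error as { category: unknown }).category;
    if (isFailureCategory(category)) {
      return category;
    }
  }
  return "unknown";
}

// Keeps only dimensions whose key is allowlisted and whose value is one of
// the permitted values for that key; everything else is silently dropped.
function sanitizeDimensions(
  input: Record<string, string>,
  allowed: Record<string, readonly string[]>,
): Record<string, string> {
  const output: Record<string, string> = {};
  for (const key of Object.keys(input)) {
    if (!Object.prototype.hasOwnProperty.call(allowed, key)) {
      continue;
    }
    const value = input[key];
    const permitted = allowed[key];
    if (value !== undefined && permitted !== undefined && permitted.indexOf(value) !== -1) {
      output[key] = value;
    }
  }
  return output;
}
```

Dropping unknown values instead of passing them through keeps cardinality fixed even when a new call site forgets to classify its error first.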

Acceptance criteria

  • Manual login cancellation/failure/success is distinguishable without relying only on command.invoked.
  • Logout and credential store/clear success/failure/cancellation are visible with bounded properties.
  • Deployment suspension/recovery/cross-window recovery paths are visible without high-cardinality values.
  • Existing auth recovery/token-refresh/activation telemetry is reused rather than duplicated.
  • Tests cover representative success, failure, and cancellation paths.
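For the deployment lifecycle paths, a compact event keyed on a closed set of transitions keeps cardinality low and is straightforward to assert on in tests. The sketch below is an assumption-laden illustration: the event name `deployment.state_changed`, the transition identifiers, and `createDeploymentReporter` are hypothetical, and suppressing consecutive repeats is a design choice of this sketch, not a requirement stated in the issue.

```typescript
// Hypothetical sketch: names are illustrative, not the extension's API.
type DeploymentTransition =
  | "suspended"
  | "recovered"
  | "cross_window_detected"
  | "auth_config_recovery_failed";

interface TelemetryEvent {
  name: string;
  properties: Record<string, string>;
}

// Returns a reporter that emits one event per state change. A transition
// identical to the previous one is skipped so a flapping check does not
// flood the sink. No URL or deployment identifier is attached.
function createDeploymentReporter(emit: (event: TelemetryEvent) => void) {
  let last: DeploymentTransition | undefined;
  return (transition: DeploymentTransition): boolean => {
    if (transition === last) {
      return false;
    }
    last = transition;
    emit({ name: "deployment.state_changed", properties: { transition } });
    return true;
  };
}
```

A test can pass an array-backed `emit` and assert on the recorded transitions, which covers the suspension, recovery, and cross-window paths without touching a real telemetry sink.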

Generated by Coder Agent from the telemetry audit of #953. Updated after reviewing existing telemetry on main.

Labels: enhancement, telemetry (Telemetry and observability instrumentation)