VPR-141 feat(healthchecks): add /health endpoints and UI dashboard#159

Open
rlorenzo wants to merge 4 commits into main from
VPR-141-health-check

@rlorenzo
Contributor

  • /health (anonymous liveness for Jenkins) + /health/detail (tagged "ready", IP-gated to SVM /20 and infra /24, not CAS-gated so it stays reachable when auth is degraded).
  • HealthChecks.UI dashboard at /healthchecks with UC Davis branding, duration humanizer, and a campus-status banner that appears when any campus-* check is non-healthy.
  • Adaptive polling decorator on campus checks (LDAP/CAS/SMTP/VMACs): healthy results cached 1 hour, failures re-probe every 5 min (one UI poll cycle). Cuts external traffic from 12/hour to 1/hour per instance while healthy.
  • Real LDAPS bind, MailKit SMTP connect, AWS SSM probe, disk checks for app/photos/CMS/logs, EF DbContext checks for all contexts.
  • Adopts DotNetDiag.HealthChecks.UI 10.0.7 (fork of abandoned Xabaril packages; upstream does not build on .NET 10). Pinned exactly.
  • Jenkins Deploy stages now poll /health post-deploy.
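
For readers who want the adaptive polling idea in concrete form, here is a minimal sketch in JavaScript rather than the PR's C#. It only illustrates the status-dependent TTL described above (healthy cached 1 hour, anything else re-probed after 5 minutes). The names `ttlMsFor`, `AdaptivePollingCheck`, `probe`, and the injectable `now` clock are assumptions for illustration and are not taken from the PR's source.

```javascript
// Hypothetical sketch of a status-dependent cache around a health probe.
// TTL values come from the PR description; all names are illustrative.
const HEALTHY_TTL_MS = 60 * 60 * 1000; // healthy results cached 1 hour
const FAILURE_TTL_MS = 5 * 60 * 1000;  // failures re-probed after 5 minutes

// Pick the cache lifetime from the last observed status.
function ttlMsFor(status) {
  return status === "Healthy" ? HEALTHY_TTL_MS : FAILURE_TTL_MS;
}

class AdaptivePollingCheck {
  // probe: async function returning a status string such as "Healthy".
  // now: clock function, injectable so the TTL logic can be exercised in tests.
  constructor(probe, now = () => Date.now()) {
    this.probe = probe;
    this.now = now;
    this.cached = null; // { status, at } once the first probe completes
  }

  async check() {
    const t = this.now();
    if (this.cached !== null && t - this.cached.at < ttlMsFor(this.cached.status)) {
      return this.cached.status; // still fresh: skip the external call
    }
    const status = await this.probe();
    this.cached = { status, at: t };
    return status;
  }
}
```

With a 5-minute UI poll, a healthy check wrapped this way reaches the external service once per hour instead of 12 times. The sketch does not deduplicate concurrent callers, so two simultaneous calls on a stale entry would both probe.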

Copilot AI review requested due to automatic review settings April 22, 2026 23:41
@codecov-commenter

codecov-commenter commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 0% with 455 lines in your changes missing coverage. Please review.
✅ Project coverage is 42.87%. Comparing base (72fb86b) to head (22d4a1a).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | 0.00% | 190 Missing ⚠️ |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | 0.00% | 86 Missing ⚠️ |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | 0.00% | 47 Missing ⚠️ |
| ...Classes/HealthChecks/AdaptivePollingHealthCheck.cs | 0.00% | 42 Missing ⚠️ |
| web/Classes/HealthChecks/LdapHealthCheck.cs | 0.00% | 33 Missing ⚠️ |
| ...eb/Classes/HealthChecks/HttpEndpointHealthCheck.cs | 0.00% | 28 Missing ⚠️ |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | 0.00% | 23 Missing ⚠️ |
| web/Program.cs | 0.00% | 6 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #159      +/-   ##
==========================================
- Coverage   43.27%   42.87%   -0.41%     
==========================================
  Files         862      869       +7     
  Lines       50319    50839     +520     
  Branches     4696     4735      +39     
==========================================
+ Hits        21777    21795      +18     
- Misses      28019    28521     +502     
  Partials      523      523              
| Flag | Coverage Δ |
| --- | --- |
| backend | 42.92% <0.00%> (-0.43%) ⬇️ |
| frontend | 41.69% <ø> (ø) |

Comment thread web/Classes/HealthChecks/HealthCheckExtensions.cs Fixed
Comment thread web/Classes/HealthChecks/DiskSpaceHealthCheck.cs Fixed

Copilot AI left a comment

Pull request overview

Adds operational health checking to the web app: anonymous liveness, IP-gated readiness details, and an internal HealthChecks.UI dashboard with UC Davis branding and reduced external probe traffic via adaptive polling.

Changes:

  • Introduces /health and /health/detail endpoints plus the /healthchecks UI, including IP allowlisting and CSP bypass for the UI bundle.
  • Adds multiple new health checks (DB contexts, disk space, LDAP, SMTP, CAS/VMACs HTTP probes, AWS SSM) and an adaptive polling decorator to reduce probe frequency when healthy.
  • Updates Jenkins deploy stages to poll /health after deploy; adds UI branding assets and a small injected JS enhancer.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| web/wwwroot/js/healthchecks-ui-extras.js | Injected UI script to humanize duration cells and show a campus-status banner. |
| web/wwwroot/healthchecks-ui-logo.png | New logo asset for HealthChecks.UI branding. |
| web/wwwroot/css/healthchecks-ui-branding.css | UC Davis palette + UI CSS tweaks (contrast, layout, banner styling). |
| web/appsettings.json | Expands InternalAllowlist to CIDR ranges for health detail/UI access. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF health checks package reference. |
| web/Program.cs | Hooks health check DI + pipeline wiring; conditionally applies CSP outside UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | MailKit-based SMTP reachability/TLS probe. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Real LDAPS bind probe for directory health. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | Generic HTTP endpoint reachability probe. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralized health checks registration + endpoint/UI mapping + UI HTML injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Drive space (and optional writability) checks for app/photos/CMS/log paths. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | AWS SSM reachability probe using a lightweight DescribeParameters call. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Caches health results with status-dependent TTLs to reduce probe load. |
| JenkinsFile | Adds post-deploy /health polling for test and prod. |

Comment thread web/wwwroot/js/healthchecks-ui-extras.js Outdated
Comment thread web/Classes/HealthChecks/HealthCheckExtensions.cs Outdated
Comment thread web/Classes/HealthChecks/HealthCheckExtensions.cs Outdated
Comment thread web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs Outdated
- Dispose StreamReader with leaveOpen so the response buffer stays
  usable after the HTML-injection middleware reads it.
- Accept trailing slash on the UI path via StartsWithSegments so the
  extras script injects for "/healthchecks/" (Xabaril serves both
  forms); still gated on text/html content type.
- Collapse empty UnauthorizedAccessException catch into the
  IOException best-effort handler in DiskSpaceHealthCheck cleanup.
- Fix formatDuration "1m60s" rollover: round to whole seconds
  first, then split, so 59.6s promotes to the next minute.
- Use DateTime.Now (DateTimeKind.Local per project convention) for
  cache timestamps and the injected-script cache-buster, with a
  scoped S6561 pragma where we use it for elapsed-time math.
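
To make the "1m60s" rollover concrete, here is a small sketch of the round-then-split ordering described in the commit message, alongside a split-then-round variant that reproduces the symptom. Both functions are illustrative reconstructions: the real `formatDuration` in healthchecks-ui-extras.js is not shown here, and its input type and output format may differ. The sketch assumes a duration in seconds.

```javascript
// Hypothetical reconstruction of the bug: split into minutes/seconds first,
// then round the seconds part. 119.6s becomes 1m + round(59.6s) = "1m60s".
function formatDurationSplitFirst(seconds) {
  const m = Math.floor(seconds / 60);
  const s = Math.round(seconds % 60);
  return m > 0 ? `${m}m${s}s` : `${s}s`;
}

// Round to whole seconds first, then split, so a rounded-up 60 is
// carried into the minutes field instead of being printed as "60s".
function formatDuration(seconds) {
  const whole = Math.round(seconds);
  const m = Math.floor(whole / 60);
  const s = whole % 60;
  return m > 0 ? `${m}m${s}s` : `${s}s`;
}
```

With this ordering, `formatDuration(59.6)` yields `"1m0s"` and `formatDuration(119.6)` yields `"2m0s"`, whereas the split-first variant yields `"60s"` and `"1m60s"` for the same inputs.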

Copilot AI left a comment

Pull request overview

Adds first-class health check endpoints and an operator-facing HealthChecks.UI dashboard to VIPER, including custom probes (DB, disk, LDAP/CAS/SMTP/SSM) and Jenkins post-deploy verification.

Changes:

  • Introduces /health liveness, /health/detail readiness JSON, and /healthchecks UI with IP allowlisting.
  • Adds multiple custom IHealthCheck implementations plus an adaptive polling decorator to reduce external traffic.
  • Updates Jenkins deploy stages to poll /health after deployment.
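
The post-deploy polling step amounts to a bounded retry loop. The sketch below shows that loop in JavaScript purely as an illustration. The Jenkinsfile itself is not reproduced in this conversation, and the attempt count, delay, and the names `pollUntilHealthy` and `checkOnce` are assumptions.

```javascript
// Hypothetical bounded retry loop for a post-deploy health gate.
// checkOnce: async function resolving to true when /health reports healthy.
// sleep is injectable so the loop can be tested without real delays.
async function pollUntilHealthy(
  checkOnce,
  { attempts = 12, delayMs = 5000, sleep = (ms) => new Promise((r) => setTimeout(r, ms)) } = {}
) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      if (await checkOnce()) return attempt; // healthy: report how many tries it took
    } catch {
      // A connection error right after deploy is treated as "not ready yet".
    }
    if (attempt < attempts) await sleep(delayMs);
  }
  throw new Error(`not healthy after ${attempts} attempts`);
}
```

A caller could pass something like `() => fetch(url).then((r) => r.ok)` as `checkOnce`. Failing the stage when the loop throws keeps a broken deploy from being reported as successful.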

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| web/wwwroot/js/healthchecks-ui-extras.js | UI-side DOM tweaks (duration humanizer + campus-status banner). |
| web/wwwroot/healthchecks-ui-logo.png | Adds UC Davis-branded logo asset for the dashboard. |
| web/wwwroot/css/healthchecks-ui-branding.css | Custom palette/branding + minor UI layout/accessibility tweaks. |
| web/appsettings.json | Expands InternalAllowlist to CIDR ranges for readiness/UI access. |
| web/Viper.csproj | Adds HealthChecks.UI + EF Core health check package references. |
| web/Program.cs | Hooks health check DI + pipeline wiring; skips CSP on UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | Adds SMTP relay probe via MailKit connect/noop/disconnect. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Adds LDAPS bind probe matching existing LDAP service settings. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | Adds generic HTTP reachability probe for CAS/VMACs. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralizes health check registration, endpoint mapping, UI config, and response-body script injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Adds disk free-space (and optional writability) probe for key volumes. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | Adds lightweight SSM reachability probe. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Adds status-based caching to reduce expensive probe frequency. |
| JenkinsFile | Adds post-deploy /health polling for test and prod stages. |

Comment thread web/Classes/HealthChecks/HealthCheckExtensions.cs Outdated
Comment thread web/Classes/HealthChecks/HealthCheckExtensions.cs Outdated
Comment thread web/wwwroot/css/healthchecks-ui-branding.css Outdated
@codecov-commenter

Bundle Report

Bundle size has no change ✅

Root-relative URLs in the dashboard wiring broke in TEST/PROD where
the app is hosted under /2 - the collector 404'd /health/detail, the
injected script 404'd /js/healthchecks-ui-extras.js, and the logo
404'd /healthchecks-ui-logo.png.

- Script src injected into the dashboard HTML now prefixes
  ctx.Request.PathBase so it resolves to /2/js/... in TEST/PROD
  and /js/... in dev.
- Health-detail endpoint URL is built from EmailSettings:BaseUrl
  (already configured per-env with the /2 path base); dev falls
  back to a relative URL which Xabaril resolves against the
  Kestrel listening address.
- Logo inlined as a base64 data URI in the custom stylesheet so
  path base is no longer part of its URL; dropped the now-unused
  PNG from wwwroot.
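
The path-base fix can be illustrated with a tiny helper. This is a JavaScript sketch of the idea only, not the C# middleware from the PR. `withPathBase` is a hypothetical name, and the sketch assumes the path base arrives either empty (dev) or as something like `/2` (TEST/PROD), as described above.

```javascript
// Hypothetical helper: prefix a root-relative URL with the app's path base
// so it resolves under a sub-path deployment.
function withPathBase(pathBase, rootRelativePath) {
  const base = pathBase.replace(/\/+$/, ""); // tolerate a trailing slash on the base
  const path = rootRelativePath.startsWith("/") ? rootRelativePath : `/${rootRelativePath}`;
  return base + path;
}

// Build the script tag to inject into the dashboard HTML.
function extrasScriptTag(pathBase) {
  return `<script src="${withPathBase(pathBase, "/js/healthchecks-ui-extras.js")}"></script>`;
}
```

Under these assumptions, `withPathBase("/2", "/js/healthchecks-ui-extras.js")` gives `/2/js/healthchecks-ui-extras.js`, and an empty path base leaves the URL as `/js/healthchecks-ui-extras.js`.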

Copilot AI left a comment

Pull request overview

Adds first-class health checking to VIPER, including liveness/readiness endpoints for deploy automation and an IP-gated HealthChecks.UI dashboard tailored to campus ops needs.

Changes:

  • Introduces /health (anonymous liveness) and /health/detail (IP-gated readiness with tagged checks) plus HealthChecks.UI at /healthchecks.
  • Adds health check implementations (LDAP, SMTP, HTTP endpoint probes, disk space, AWS SSM) and an adaptive polling decorator to reduce external probe traffic.
  • Updates Jenkins deploy stages to poll /2/health post-deploy; adds UC Davis branding + UI tweaks (duration humanizer + campus-status banner).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| web/wwwroot/js/healthchecks-ui-extras.js | Injected UI tweaks (duration humanizer, campus-status banner) via MutationObserver. |
| web/wwwroot/css/healthchecks-ui-branding.css | UC Davis palette + UI readability adjustments + campus-status banner styling. |
| web/appsettings.json | Replaces a single internal allowlisted IP with CIDR ranges for staff + infra. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF Core health check package. |
| web/Program.cs | Hooks health check DI + pipeline wiring; skips CSP on HealthChecks.UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | MailKit-based SMTP reachability probe. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Real LDAPS bind probe using existing LDAP service credentials. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | HTTP(S) reachability probe (treats non-5xx as healthy). |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralizes health check registration, endpoint mapping, UI wiring, and IP gating. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Disk free-space (and optional writability) probe for key volumes/paths. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | AWS SSM reachability probe via DescribeParameters. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Caches healthy vs unhealthy results for different durations to reduce probe load. |
| JenkinsFile | Adds post-deploy polling of /2/health for TEST and PROD. |
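
The "treats non-5xx as healthy" rule for the HTTP endpoint probe can be sketched as follows. This is an illustrative JavaScript version, not the PR's `HttpEndpointHealthCheck`. The names, the 5-second timeout, and the decision to report network errors as unhealthy are assumptions. The presumed rationale is that a redirect or a 401 from a login service still shows the server is answering.

```javascript
// Hypothetical status classification: any response below 500 counts as reachable.
function classifyStatus(statusCode) {
  return statusCode < 500 ? "Healthy" : "Unhealthy";
}

// Probe an endpoint with a timeout. Uses the global fetch and
// AbortSignal.timeout available in current Node.js and browsers.
async function probeEndpoint(url, timeoutMs = 5000) {
  try {
    const response = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return classifyStatus(response.status);
  } catch {
    // DNS failure, refused connection, TLS error, or timeout.
    return "Unhealthy";
  }
}
```

Under this rule a 401 or 404 maps to `"Healthy"`, while a 500 or 503 maps to `"Unhealthy"`.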

Comment thread web/Program.cs Outdated
builder.Services.AddScoped<Viper.EmailTemplates.Services.IEmailTemplateRenderer, Viper.EmailTemplates.Services.EmailTemplateRenderer>();

// All health-check DI wiring lives in HealthCheckExtensions; see that file
// (and PLAN-hangfire.md PR 0) for design rationale.
Comment on lines +13 to +14
/// See PLAN-hangfire.md PR 0 for the design rationale (liveness vs IP-gated
/// detail, CSP branching, Xabaril UI fork choice, etc.).
Comment on lines +40 to +41
/// run on /health/detail; /health is bare liveness. Hangfire checks layer
/// onto the "ready" tag in PR 3.
// reachability with the same SDK.
builder.AddCheck(
"aws-ssm",
new AwsSsmHealthCheck(),
Contributor Author

Keeping healthyWhenMissing: false here. AWS SSM is a hard dependency in local dev - the app fetches database passwords from SSM Parameter Store at startup via .AddSystemsManager in Program.cs, so developers must have AWS credentials configured. If SSM is unreachable in dev, DB connections fail, so Unhealthy is the correct signal. The photos/CMS healthyWhenMissing=true pattern is specifically for network drives that aren't mounted on dev machines; SSM is a different case.

Removed references to PLAN-hangfire.md and future PR numbers
from comments in HealthCheckExtensions.cs and Program.cs. The
plan file is an untracked working note and won't exist on main;
PR-number references rot once the branch is merged.

3 participants