feat(observability): extend audit log, PostHog, and storage metering coverage#5269
feat(observability): extend audit log, PostHog, and storage metering coverage#5269waleedlatif1 wants to merge 29 commits into
Conversation
…coverage Instruments previously-uncaptured resources and actions across the three observability layers (audit log, PostHog product analytics, usage metering): - Exfiltration audit: file downloads (workspace/public-share/v1), table, workflow, and workspace exports. - Credential access audited at the token-issuance boundary (success-only). - Full revenue trail: invoice paid/failed, overage billed, disputes, credit fulfillment, subscription lifecycle, plan/seat changes. - Session/login lifecycle audit via Better Auth hooks (login, blocked sign-in, logout, session revoke, account delete). - v1/admin programmatic surface and copilot tool handlers instrumented. - Dead audit/PostHog constants wired (lock/unlock, table, custom tool, etc.). - Storage metering extended to KB documents and copilot files. - PostHog person identify + workspace/organization group hygiene. Audit package hardened to null the actor FK for system actors (admin-api) and adds an awaitable recordAuditNow for pre-delete hooks. All instrumentation is fire-and-forget and never blocks or breaks the primary operation.
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryMedium Risk Overview Audit package accepts Storage metering for knowledge-base uploads checks quota before insert and increments after commit; hard deletes decrement bytes in the same transaction via new PostHog event map grows (revenue, org/seat, credentials, schedules, etc.); Reviewed by Cursor Bugbot for commit 2afd114. Configure here. |
Greptile SummaryThis PR extends three observability layers — audit log, PostHog analytics, and storage metering — across previously uninstrumented surfaces: file downloads, billing webhooks, credential access, copilot tool operations, KB document storage, and the full v1/admin API surface. The audit package is hardened to null actor FKs for system actors and unresolvable users, and a new
Confidence Score: 5/5Safe to merge. All instrumentation is fire-and-forget and isolated from primary code paths; the critical billing and payment-blocking logic is fully decoupled from audit/analytics calls. The PR makes no changes to functional business logic — it only wraps existing paths with audit and analytics calls. Every billing webhook's instrumentation block that could interfere with user-blocking is wrapped in its own try/catch. The audit package's FK-safety hardening has been verified across the catch and not-found branches. Storage metering additions follow the same atomic-decrement pattern already in use for workspace files. The one data-quality note (org ID as PostHog distinct ID on system reconciles) is a PostHog analytics concern, not a correctness or security issue. apps/sim/lib/billing/organizations/seats.ts — the captureServerEvent fallback to organizationId when actorId is absent is worth reviewing for PostHog data quality. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph Billing Webhooks
IPS[invoice.payment_succeeded] --> |resolveBillingActorId| RA1[recordAudit INVOICE_PAYMENT_SUCCEEDED]
IPF[invoice.payment_failed] --> |try-catch guard| RA2[recordAudit INVOICE_PAYMENT_FAILED]
IPF --> |unguarded| UB[Block Users]
CD[charge.dispute] --> |getOrganizationOwnerId| RA3[recordAudit CHARGE_DISPUTE]
SC[subscription events] --> |resolveSubscriptionActorId| RA4[recordAudit SUBSCRIPTION]
ENT[enterprise provisioning] --> |stripeCustomerId lookup| RA5[recordAudit ENTERPRISE_PROVISIONED]
OVG[overage threshold] --> |tx returns result| RA6[recordAudit OVERAGE_BILLED]
end
subgraph Storage Metering
KBU[KB doc upload] --> QC[checkStorageQuota inside tx]
QC --> TX1[tx commits]
TX1 --> ISU[incrementStorageUsage outside tx]
KBD[KB hard delete] --> PRE[pre-fetch subscriptions]
PRE --> TX2[tx: delete rows + decrementStorageUsageInTx]
TX2 --> DSF[deleteDocumentStorageFiles]
end
subgraph Audit Package
RA[recordAudit] --> |fire-and-forget| INS[insertAuditLog]
RAN[recordAuditNow awaitable] --> INS
INS --> UL{user lookup}
UL --> |found| OK[persist with real actor]
UL --> |not found or throws| NULL[null actorId + System label]
end
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
subgraph Billing Webhooks
IPS[invoice.payment_succeeded] --> |resolveBillingActorId| RA1[recordAudit INVOICE_PAYMENT_SUCCEEDED]
IPF[invoice.payment_failed] --> |try-catch guard| RA2[recordAudit INVOICE_PAYMENT_FAILED]
IPF --> |unguarded| UB[Block Users]
CD[charge.dispute] --> |getOrganizationOwnerId| RA3[recordAudit CHARGE_DISPUTE]
SC[subscription events] --> |resolveSubscriptionActorId| RA4[recordAudit SUBSCRIPTION]
ENT[enterprise provisioning] --> |stripeCustomerId lookup| RA5[recordAudit ENTERPRISE_PROVISIONED]
OVG[overage threshold] --> |tx returns result| RA6[recordAudit OVERAGE_BILLED]
end
subgraph Storage Metering
KBU[KB doc upload] --> QC[checkStorageQuota inside tx]
QC --> TX1[tx commits]
TX1 --> ISU[incrementStorageUsage outside tx]
KBD[KB hard delete] --> PRE[pre-fetch subscriptions]
PRE --> TX2[tx: delete rows + decrementStorageUsageInTx]
TX2 --> DSF[deleteDocumentStorageFiles]
end
subgraph Audit Package
RA[recordAudit] --> |fire-and-forget| INS[insertAuditLog]
RAN[recordAuditNow awaitable] --> INS
INS --> UL{user lookup}
UL --> |found| OK[persist with real actor]
UL --> |not found or throws| NULL[null actorId + System label]
end
Reviews (27): Last reviewed commit: "chore(observability): trim verbose inlin..." | Re-trigger Greptile |
Address Cursor Bugbot review: - GET OAuth token: emit CREDENTIAL_ACCESSED/credential_used after refreshTokenIfNeeded succeeds (matches POST), not before. - File export: audit on each actual success exit via a shared helper — including the non-markdown serve redirect (previously unaudited) and after the zip is generated (previously before asset fetch).
Address Greptile P1: setting actorId to the file owner made every anonymous external download read as a self-download, undermining the exfiltration trail. recordAudit now accepts a null actor; the public content/inline routes record actorId=null with the owner in metadata.sharedByUserId and rely on ip/user-agent for the forensic trail. The misleading owner-attributed file_downloaded PostHog event is dropped on these anonymous paths.
|
@cursor review |
- Admin workflow/workspace ZIP exports now audit only after the archive is built (via a local helper), so a zip/build failure no longer logs an export. - Remove redundant 'success-only placement' inline comments and tighten the remaining design-rationale notes (export helper now TSDoc).
|
@greptile |
|
@cursor review |
Address Cursor review: - handleDisputeClosed now records CHARGE_DISPUTE_CLOSED for every closed dispute (won/lost/warning_closed), unblocking only on favorable outcomes. dispute.status in the metadata distinguishes the outcome, so lost chargebacks are no longer missing from the trail. - Threshold overage now emits OVERAGE_BILLED + overage_billed even when credits fully cover the overage (settledVia: 'credits' vs 'stripe'), so credit-settled overages are audited instead of silently returning null.
Address Greptile P1: deleting the metadata row before the decrement meant a decrement failure left the quota permanently inflated with no record to retry from. Decrement first; only remove the metadata row once it succeeds.
|
@greptile |
|
@cursor review |
…tion - Copilot file delete now releases storage via a single transaction (releaseDeletedFileStorage): the soft-delete is the idempotency claim and the decrement shares the transaction, so neither a partial failure (inflated counter) nor a retry (double-decrement) can desync the quota. Resolves the conflicting Greptile/Cursor ordering findings. - Policy-blocked sign-ups now record USER_SIGNUP_BLOCKED instead of USER_SIGNIN_BLOCKED, so account-lifecycle events aren't mislabeled.
|
@cursor review |
|
@cursor review |
Address Cursor review: addImportColumns (async createColumns import path) now threads the importing userId into auditTableColumnsAdded instead of falling back to table.createdBy, so column additions are attributed to the actual member who ran the import rather than the table creator.
|
@cursor review |
Per design decision: copilot files are working/conversational artifacts, so gating their materialization on storage quota would fail an agent operation mid-flow when a user is over limit, and metering them would inflate usage enough to block KB/workspace uploads indirectly. Remove copilot quota gates + copilot ingest metering entirely (revert copilot-file-manager, metadata, and the upload route's copilot branch to baseline) and drop the now-unused releaseDeletedFileStorage. KB document metering + quota enforcement (the deliberate, persistent storage path) is unchanged.
|
@cursor review |
Address Cursor review: gate the group-sync effect on workspace metadata being loaded (activeWorkspace present) so the workspace and organization groups always update atomically. Acting during the load window paired the new workspace group with the previous workspace's org group; until metadata loads, events stay consistently attributed to the previous workspace.
|
@cursor review |
Address Cursor review: async exports only emitted TABLE_EXPORTED when the background job reached 'ready', so an authorized export whose job later failed or was abandoned left no audit trail — inconsistent with the sync export route, which audits before streaming. Move the audit + analytics to the async route's authorization point (after the job is claimed/dispatched) and remove it from the runner. Drop the now-unused userId from TableExportPayload.
|
@cursor review |
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit f5f8a00. Configure here.
…king Address Greptile P1: the payment_failed audit hoisted an unguarded isSubscriptionOrgScoped DB read (for entity_type) directly before the attempt-count user-blocking block. A transient failure of that read would throw out of the handler and skip blocking. Wrap the whole audit/analytics block in try/catch (best-effort), and let the blocking compute its own isSubscriptionOrgScoped as it did originally — instrumentation can no longer abort payment processing.
|
@cursor review |
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 2afd114. Configure here.
Summary
Extends the three observability layers — audit log, PostHog product analytics, and usage metering — to cover resources and actions that were previously uninstrumented. Found via a codebase-wide coverage audit; every gap below was a real blind spot.
Audit log
CREDENTIAL_ACCESSED), success-only.v1/adminprogrammatic surface (member/role/export ops) and copilot tool handlers (skills, custom tools, tables) — both previously bypassed audit entirely.AuditActionconstants (lock/unlock, table update/delete, etc.).PostHog
Metering
Hardening
admin-api) with a readable label instead of silently FK-failing; adds an awaitablerecordAuditNowfor pre-delete hooks.Type of Change
Testing
check:api-validationpass.Checklist