Skip to content

feat(observability): extend audit log, PostHog, and storage metering coverage#5269

Open
waleedlatif1 wants to merge 29 commits into
stagingfrom
worktree-audit-posthog-coverage
Open

feat(observability): extend audit log, PostHog, and storage metering coverage#5269
waleedlatif1 wants to merge 29 commits into
stagingfrom
worktree-audit-posthog-coverage

Conversation

@waleedlatif1

Copy link
Copy Markdown
Collaborator

Summary

Extends the three observability layers — audit log, PostHog product analytics, and usage metering — to cover resources and actions that were previously uninstrumented. Found via a codebase-wide coverage audit; every gap below was a real blind spot.

Audit log

  • Exfiltration trail (none existed): file downloads (workspace, public-share, v1 API), and table / workflow / workspace exports.
  • Credential access audited at the token-issuance boundary (CREDENTIAL_ACCESSED), success-only.
  • Full revenue/financial trail: invoice paid/failed, overage billed, charge disputes, credit fulfillment, subscription create/cancel/transfer, enterprise provisioning, plan/seat changes.
  • Session/account lifecycle via Better Auth hooks: login, blocked sign-in, logout, session revoke, account delete.
  • The entire v1/admin programmatic surface (member/role/export ops) and copilot tool handlers (skills, custom tools, tables) — both previously bypassed audit entirely.
  • Wired several defined-but-dead AuditAction constants (lock/unlock, table update/delete, etc.).

PostHog

  • Revenue events (payments, overage, credits, disputes, plan conversion, enterprise), org/seat lifecycle, KB search, copilot/skill/tool events.
  • Client-side workspace + organization group association so events roll up by account.

Metering

  • Storage now counted for KB documents and copilot files (was workspace-files only). Connector-synced KB docs remain correctly unmetered on both ingest and delete.

Hardening

  • Audit package nulls the actor FK for system actors (admin-api) with a readable label instead of silently FK-failing; adds an awaitable recordAuditNow for pre-delete hooks.
  • Every billing-webhook actor lookup is guarded so instrumentation can never fail payment processing.
  • All instrumentation is fire-and-forget and never blocks or breaks the primary operation. Ephemeral execution-file storage is intentionally not metered (no decrement path → would leak).

Type of Change

  • New feature (observability coverage)

Testing

  • Full typecheck, biome, and check:api-validation pass.
  • 2,617 touched-area tests + 22 audit-package tests pass.
  • Validated by a multi-agent adversarial review across every changed domain (FK safety, fire-and-forget, no double-emit, storage symmetry, behavior-preserving refactors).

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

…coverage

Instruments previously-uncaptured resources and actions across the three
observability layers (audit log, PostHog product analytics, usage metering):

- Exfiltration audit: file downloads (workspace/public-share/v1), table,
  workflow, and workspace exports.
- Credential access audited at the token-issuance boundary (success-only).
- Full revenue trail: invoice paid/failed, overage billed, disputes, credit
  fulfillment, subscription lifecycle, plan/seat changes.
- Session/login lifecycle audit via Better Auth hooks (login, blocked sign-in,
  logout, session revoke, account delete).
- v1/admin programmatic surface and copilot tool handlers instrumented.
- Dead audit/PostHog constants wired (lock/unlock, table, custom tool, etc.).
- Storage metering extended to KB documents and copilot files.
- PostHog person identify + workspace/organization group hygiene.

Audit package hardened to null the actor FK for system actors (admin-api) and
adds an awaitable recordAuditNow for pre-delete hooks. All instrumentation is
fire-and-forget and never blocks or breaks the primary operation.
@vercel

vercel Bot commented Jun 29, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jun 30, 2026 2:41am

Request Review

@cursor

cursor Bot commented Jun 29, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Large additive surface across credentials, billing webhooks, and storage accounting; changes are mostly fire-and-forget, but KB storage quota/decrement in transactions and billing instrumentation still warrant careful review for double-counting or missed rollbacks.

Overview
Adds audit log and PostHog instrumentation across previously unlogged paths: OAuth/credential token issuance, file downloads and exports (workspace, public share, v1 API, admin bulk export), table exports, workflow public API and lock toggles, org/billing lifecycle (plan switch, seat reconcile, subscription webhooks, threshold overage, disputes, credits), admin v1 APIs, copilot chat, and copilot table/skill/custom-tool handlers. Realtime debounced workflow variable writes now record WORKFLOW_VARIABLES_UPDATED with the last editor as actor.

Audit package accepts actorId: null for anonymous events and nulls the user FK for missing users or admin-api, with display labels instead of FK failures.

Storage metering for knowledge-base uploads checks quota before insert and increments after commit; hard deletes decrement bytes in the same transaction via new decrementStorageUsageInTx (connector-synced docs still excluded). Table rename/delete/archive and import column adds gain actor-aware audit hooks.

PostHog event map grows (revenue, org/seat, credentials, schedules, etc.); WorkspaceScopeSync sets organization and workspace groups together once workspace metadata loads. Seat reconciliation and invite acceptance pass actorId into centralized seat audit/analytics.

Reviewed by Cursor Bugbot for commit 2afd114. Configure here.

Comment thread apps/sim/app/api/auth/oauth/token/route.ts
Comment thread apps/sim/app/api/files/export/[id]/route.ts Outdated
Comment thread apps/sim/app/api/files/export/[id]/route.ts Outdated
@greptile-apps

greptile-apps Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR extends three observability layers — audit log, PostHog analytics, and storage metering — across previously uninstrumented surfaces: file downloads, billing webhooks, credential access, copilot tool operations, KB document storage, and the full v1/admin API surface. The audit package is hardened to null actor FKs for system actors and unresolvable users, and a new recordAuditNow awaitable is added for pre-delete hooks.

  • Audit coverage: Exfiltration trail (file downloads, exports), revenue trail (invoices, overage, disputes, subscription lifecycle, enterprise provisioning), session lifecycle via Better Auth hooks, and the copilot/v1-admin surfaces that previously bypassed audit entirely.
  • Storage metering: KB documents and copilot files are now counted; connector-synced documents are correctly excluded; hard-deletes use decrementStorageUsageInTx so the counter decrement is atomic with the DB row removal.
  • PostHog: Revenue and org-lifecycle events added server-side; client-side WorkspaceScopeSync extended to also set the organization group so events roll up correctly by account.

Confidence Score: 5/5

Safe to merge. All instrumentation is fire-and-forget and isolated from primary code paths; the critical billing and payment-blocking logic is fully decoupled from audit/analytics calls.

The PR makes no changes to functional business logic — it only wraps existing paths with audit and analytics calls. Every billing webhook's instrumentation block that could interfere with user-blocking is wrapped in its own try/catch. The audit package's FK-safety hardening has been verified across the catch and not-found branches. Storage metering additions follow the same atomic-decrement pattern already in use for workspace files. The one data-quality note (org ID as PostHog distinct ID on system reconciles) is a PostHog analytics concern, not a correctness or security issue.

apps/sim/lib/billing/organizations/seats.ts — the captureServerEvent fallback to organizationId when actorId is absent is worth reviewing for PostHog data quality.

Important Files Changed

Filename Overview
packages/audit/src/log.ts Hardens actor-FK resolution: nulls actorId on both the not-found and lookup-throws paths, preventing FK violations from system/deleted actors. Adds awaitable recordAuditNow for pre-delete hooks.
packages/audit/src/types.ts Adds new AuditAction constants for billing events, file downloads, credential access, subscription lifecycle, and org/table operations that were previously untracked.
apps/sim/lib/knowledge/documents/service.ts Adds quota check + storage increment for KB document uploads and a transactional decrement for hard-deletes, excluding connector-synced docs. Quota check is inside the transaction while increment is outside (known TOCTOU race noted in previous review).
apps/sim/lib/billing/storage/tracking.ts Adds decrementStorageUsageInTx to enable atomic counter decrements inside existing transactions, making KB hard-delete storage accounting failure-safe.
apps/sim/lib/billing/webhooks/invoices.ts Adds audit and PostHog for payment success/failure events. The payment_failed instrumentation block is correctly wrapped in try/catch, fully decoupled from user-blocking logic.
apps/sim/lib/billing/webhooks/subscription.ts Adds audit rows for subscription create/cancel events. resolveSubscriptionActorId is called unconditionally for all subscriptions, making a harmless extra DB query for personal plans.
apps/sim/lib/billing/webhooks/disputes.ts Refactors handleDisputeClosed to audit all outcomes (not just wins), adds dispute-opened audit for user and org paths. Actor lookup failures fall back safely to org ID, which the audit layer nulls to System.
apps/sim/lib/billing/webhooks/enterprise.ts Adds audit and PostHog for enterprise subscription provisioning. Actor is resolved via Stripe customer ID lookup with try-catch fallback to referenceId.
apps/sim/lib/billing/threshold-billing.ts Refactors the threshold-billing transaction to return a structured result instead of void, enabling post-transaction audit and PostHog emission without blocking the critical billing path.
apps/sim/lib/billing/organizations/seats.ts Moves seat-change audit and PostHog into reconcileOrganizationSeats itself; falls back to organizationId as PostHog distinct ID for system-initiated reconciles, which creates phantom person profiles.
apps/sim/app/api/auth/oauth/token/route.ts Adds CREDENTIAL_ACCESSED audit and PostHog for all three token-issuance paths (user OAuth, service account, GET refresh). Audit correctly fires after token resolution in all paths.
apps/sim/app/workspace/[workspaceId]/providers/workspace-scope-sync.tsx Extends workspace PostHog group sync to include org group association; waits for workspace metadata before setting groups so workspace and org groups switch atomically.
apps/sim/lib/table/service.ts Adds optional actingUserId param to renameTable and deleteTable; both emit TABLE_UPDATED/TABLE_DELETED audit rows when an actor is provided.
apps/sim/lib/invitations/core.ts Moves seat-change audit into reconcileOrganizationSeats via actorId param; adds ORG_MEMBER_ADDED audit+PostHog on successful invite acceptance.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Billing Webhooks
        IPS[invoice.payment_succeeded] --> |resolveBillingActorId| RA1[recordAudit INVOICE_PAYMENT_SUCCEEDED]
        IPF[invoice.payment_failed] --> |try-catch guard| RA2[recordAudit INVOICE_PAYMENT_FAILED]
        IPF --> |unguarded| UB[Block Users]
        CD[charge.dispute] --> |getOrganizationOwnerId| RA3[recordAudit CHARGE_DISPUTE]
        SC[subscription events] --> |resolveSubscriptionActorId| RA4[recordAudit SUBSCRIPTION]
        ENT[enterprise provisioning] --> |stripeCustomerId lookup| RA5[recordAudit ENTERPRISE_PROVISIONED]
        OVG[overage threshold] --> |tx returns result| RA6[recordAudit OVERAGE_BILLED]
    end

    subgraph Storage Metering
        KBU[KB doc upload] --> QC[checkStorageQuota inside tx]
        QC --> TX1[tx commits]
        TX1 --> ISU[incrementStorageUsage outside tx]
        KBD[KB hard delete] --> PRE[pre-fetch subscriptions]
        PRE --> TX2[tx: delete rows + decrementStorageUsageInTx]
        TX2 --> DSF[deleteDocumentStorageFiles]
    end

    subgraph Audit Package
        RA[recordAudit] --> |fire-and-forget| INS[insertAuditLog]
        RAN[recordAuditNow awaitable] --> INS
        INS --> UL{user lookup}
        UL --> |found| OK[persist with real actor]
        UL --> |not found or throws| NULL[null actorId + System label]
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    subgraph Billing Webhooks
        IPS[invoice.payment_succeeded] --> |resolveBillingActorId| RA1[recordAudit INVOICE_PAYMENT_SUCCEEDED]
        IPF[invoice.payment_failed] --> |try-catch guard| RA2[recordAudit INVOICE_PAYMENT_FAILED]
        IPF --> |unguarded| UB[Block Users]
        CD[charge.dispute] --> |getOrganizationOwnerId| RA3[recordAudit CHARGE_DISPUTE]
        SC[subscription events] --> |resolveSubscriptionActorId| RA4[recordAudit SUBSCRIPTION]
        ENT[enterprise provisioning] --> |stripeCustomerId lookup| RA5[recordAudit ENTERPRISE_PROVISIONED]
        OVG[overage threshold] --> |tx returns result| RA6[recordAudit OVERAGE_BILLED]
    end

    subgraph Storage Metering
        KBU[KB doc upload] --> QC[checkStorageQuota inside tx]
        QC --> TX1[tx commits]
        TX1 --> ISU[incrementStorageUsage outside tx]
        KBD[KB hard delete] --> PRE[pre-fetch subscriptions]
        PRE --> TX2[tx: delete rows + decrementStorageUsageInTx]
        TX2 --> DSF[deleteDocumentStorageFiles]
    end

    subgraph Audit Package
        RA[recordAudit] --> |fire-and-forget| INS[insertAuditLog]
        RAN[recordAuditNow awaitable] --> INS
        INS --> UL{user lookup}
        UL --> |found| OK[persist with real actor]
        UL --> |not found or throws| NULL[null actorId + System label]
    end
Loading

Reviews (27): Last reviewed commit: "chore(observability): trim verbose inlin..." | Re-trigger Greptile

Comment thread apps/sim/app/api/files/public/[token]/content/route.ts
Address Cursor Bugbot review:
- GET OAuth token: emit CREDENTIAL_ACCESSED/credential_used after
  refreshTokenIfNeeded succeeds (matches POST), not before.
- File export: audit on each actual success exit via a shared helper —
  including the non-markdown serve redirect (previously unaudited) and
  after the zip is generated (previously before asset fetch).
Address Greptile P1: setting actorId to the file owner made every anonymous
external download read as a self-download, undermining the exfiltration trail.
recordAudit now accepts a null actor; the public content/inline routes record
actorId=null with the owner in metadata.sharedByUserId and rely on ip/user-agent
for the forensic trail. The misleading owner-attributed file_downloaded PostHog
event is dropped on these anonymous paths.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/v1/admin/workflows/export/route.ts Outdated
Comment thread apps/sim/lib/knowledge/documents/service.ts
- Admin workflow/workspace ZIP exports now audit only after the archive is
  built (via a local helper), so a zip/build failure no longer logs an export.
- Remove redundant 'success-only placement' inline comments and tighten the
  remaining design-rationale notes (export helper now TSDoc).
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/lib/billing/webhooks/disputes.ts
Comment thread apps/sim/lib/billing/threshold-billing.ts
Comment thread apps/sim/lib/uploads/contexts/copilot/copilot-file-manager.ts Outdated
Address Cursor review:
- handleDisputeClosed now records CHARGE_DISPUTE_CLOSED for every closed
  dispute (won/lost/warning_closed), unblocking only on favorable outcomes.
  dispute.status in the metadata distinguishes the outcome, so lost
  chargebacks are no longer missing from the trail.
- Threshold overage now emits OVERAGE_BILLED + overage_billed even when credits
  fully cover the overage (settledVia: 'credits' vs 'stripe'), so credit-settled
  overages are audited instead of silently returning null.
Address Greptile P1: deleting the metadata row before the decrement meant a
decrement failure left the quota permanently inflated with no record to retry
from. Decrement first; only remove the metadata row once it succeeds.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/table/[tableId]/export/route.ts Outdated
Comment thread apps/sim/lib/auth/auth.ts Outdated
Comment thread apps/sim/lib/uploads/contexts/copilot/copilot-file-manager.ts Outdated
…tion

- Copilot file delete now releases storage via a single transaction
  (releaseDeletedFileStorage): the soft-delete is the idempotency claim and the
  decrement shares the transaction, so neither a partial failure (inflated
  counter) nor a retry (double-decrement) can desync the quota. Resolves the
  conflicting Greptile/Cursor ordering findings.
- Policy-blocked sign-ups now record USER_SIGNUP_BLOCKED instead of
  USER_SIGNIN_BLOCKED, so account-lifecycle events aren't mislabeled.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/lib/table/import-data.ts
Address Cursor review: addImportColumns (async createColumns import path) now
threads the importing userId into auditTableColumnsAdded instead of falling back
to table.createdBy, so column additions are attributed to the actual member who
ran the import rather than the table creator.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/files/download/route.ts
Per design decision: copilot files are working/conversational artifacts, so
gating their materialization on storage quota would fail an agent operation
mid-flow when a user is over limit, and metering them would inflate usage enough
to block KB/workspace uploads indirectly. Remove copilot quota gates + copilot
ingest metering entirely (revert copilot-file-manager, metadata, and the upload
route's copilot branch to baseline) and drop the now-unused releaseDeletedFileStorage.
KB document metering + quota enforcement (the deliberate, persistent storage path)
is unchanged.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

Address Cursor review: gate the group-sync effect on workspace metadata being
loaded (activeWorkspace present) so the workspace and organization groups always
update atomically. Acting during the load window paired the new workspace group
with the previous workspace's org group; until metadata loads, events stay
consistently attributed to the previous workspace.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/table/[tableId]/export-async/route.ts
Address Cursor review: async exports only emitted TABLE_EXPORTED when the
background job reached 'ready', so an authorized export whose job later failed or
was abandoned left no audit trail — inconsistent with the sync export route,
which audits before streaming. Move the audit + analytics to the async route's
authorization point (after the job is claimed/dispatched) and remove it from the
runner. Drop the now-unused userId from TableExportPayload.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit f5f8a00. Configure here.

Comment thread apps/sim/lib/billing/webhooks/invoices.ts Outdated
…king

Address Greptile P1: the payment_failed audit hoisted an unguarded
isSubscriptionOrgScoped DB read (for entity_type) directly before the
attempt-count user-blocking block. A transient failure of that read would throw
out of the handler and skip blocking. Wrap the whole audit/analytics block in
try/catch (best-effort), and let the blocking compute its own isSubscriptionOrgScoped
as it did originally — instrumentation can no longer abort payment processing.
@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1

Copy link
Copy Markdown
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 2afd114. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant