Skip to content

fix(billing): drop transaction wrapper in recordUsage to relieve pool contention#4494

Merged
TheodoreSpeaks merged 2 commits intostagingfrom
fix/usage-row-contention
May 7, 2026
Merged

fix(billing): drop transaction wrapper in recordUsage to relieve pool contention#4494
TheodoreSpeaks merged 2 commits intostagingfrom
fix/usage-row-contention

Conversation

@TheodoreSpeaks
Copy link
Copy Markdown
Collaborator

Summary

  • Split recordUsage() in apps/sim/lib/billing/core/usage-log.ts into two independent statements (INSERT into usage_log, UPDATE on user_stats) instead of one wrapped transaction
  • Under high concurrency for a single user (~90 workflow finishes in seconds), the transaction wrapper pinned pgbouncer connections through the entire user_stats row-lock wait, exhausting the pool and failing unrelated requests with query_wait_timeout and connection-terminated errors
  • Without the transaction, each statement only holds its connection while executing, so contention stays inside Postgres's row-lock queue and does not bleed into the pool
  • usage_log is the source of truth (immutable ledger); if the user_stats UPDATE fails after a successful INSERT the counter drifts and must be reconciled from usage_log out-of-band. The "row not found" case now logs a warning instead of throwing (no transaction to roll back)

Type of Change

  • Bug fix

Testing

Tested manually. bun run lint clean, bun run check:api-validation:strict passes, tsc --noEmit clean for the touched file. No tests directly cover recordUsage. Followup: add a periodic reconciler that SUMs usage_log per period and corrects user_stats.currentPeriodCost to recover from any UPDATE failures.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel Bot commented May 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 7, 2026 5:13pm

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented May 7, 2026

PR Summary

Medium Risk
Changes billing write semantics by removing the atomic transaction, making user_stats counter updates best-effort and allowing drift if the update fails or the row is missing.

Overview
recordUsage() no longer wraps the usage_log INSERT and user_stats counter increment in a single DB transaction; it now performs the INSERT first (still erroring to the caller) and then runs the user_stats UPDATE separately.

The user_stats update is now best-effort: missing rows and update failures are logged as warnings and swallowed to avoid pgbouncer pool contention under high concurrent updates for the same user, at the cost of potential counter drift.

Reviewed by Cursor Bugbot for commit 938b0a9. Bugbot is set up for automated code reviews on this repo. Configure here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 7, 2026

Greptile Summary

This PR removes the db.transaction() wrapper from recordUsage() in usage-log.ts, splitting the usage_log INSERT and user_stats UPDATE into two independent statements to prevent pgbouncer connection pool exhaustion under high concurrency for a single user.

  • Transaction removed: The two writes now execute independently; a failed UPDATE is caught, logged, and swallowed — usage_log serves as the source of truth for cost reconciliation.
  • additionalStats-only paths have no recovery: When no validEntries are present (INSERT is skipped) but additionalStats is provided, a failed UPDATE silently drops those counter increments permanently — unlike cost entries, there is no usage_log record to reconcile from.
  • Callers can no longer detect UPDATE failure: recordUsage() now returns void on both full success and a swallowed UPDATE error; any upstream retry logic that gates on a thrown exception will stop working silently.

Confidence Score: 3/5

The fix addresses a real production problem, but introduces a silent-failure mode where certain counter updates can be permanently lost and callers can no longer detect when the write was only partially successful.

The additionalStats counter increments are written exclusively to user_stats with no corresponding usage_log entry. When only additionalStats is present and the UPDATE is swallowed, those increments are gone with no out-of-band reconciliation possible. Separately, recordUsage() now returns normally even when the UPDATE fails for cost-bearing calls, so any caller that previously relied on an exception to trigger retry logic will silently stop retrying.

apps/sim/lib/billing/core/usage-log.ts — specifically the catch block and the code path where validEntries is empty but additionalStats is present.

Important Files Changed

Filename Overview
apps/sim/lib/billing/core/usage-log.ts Transaction removed to fix pgbouncer pool exhaustion; introduces a silent-failure mode where UPDATE errors are swallowed and additionalStats-only increments (no usage_log entry) can be permanently lost with no reconciliation path.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant recordUsage
    participant DB_UsageLog as DB: usage_log
    participant DB_UserStats as DB: user_stats

    Caller->>recordUsage: recordUsage(params)

    alt "validEntries.length > 0"
        recordUsage->>DB_UsageLog: INSERT (committed immediately)
        DB_UsageLog-->>recordUsage: success
    end

    recordUsage->>DB_UserStats: UPDATE user_stats

    alt UPDATE succeeds (row found)
        DB_UserStats-->>recordUsage: "result.length === 1"
        recordUsage-->>Caller: void (success)
    else UPDATE succeeds but row not found
        DB_UserStats-->>recordUsage: "result.length === 0"
        Note over recordUsage: logger.warn, counter drifts
        recordUsage-->>Caller: void (no error thrown)
    else UPDATE throws
        DB_UserStats-->>recordUsage: throws Error
        Note over recordUsage: catch, logger.error, additionalStats lost
        recordUsage-->>Caller: void (no error thrown)
    end
Loading

Reviews (1): Last reviewed commit: "fix(billing): drop transaction wrapper i..." | Re-trigger Greptile

Comment thread apps/sim/lib/billing/core/usage-log.ts
Comment thread apps/sim/lib/billing/core/usage-log.ts
@TheodoreSpeaks TheodoreSpeaks merged commit ec5793f into staging May 7, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant