Add Supavisor pooler metrics to supabase integration #23749

Merged
iliakur merged 2 commits into DataDog:master from gushecht:gushecht/supabase-add-supavisor-metrics on May 22, 2026

Conversation

@gushecht
Contributor

What does this PR do?

Adds Supavisor pooler metrics to the Supabase integration's allowlist. Supabase Cloud's customer Metrics API (/customer/v1/privileged/metrics) exposes ~24 supavisor_* metrics from Supavisor's PromEx Tenant plugin, but none were mapped in metrics.py, so the scraper was silently dropping them.

This PR adds:

  • Pool/connection gauges: supavisor.connections.active, supavisor.proxy.connections.active, supavisor.pool.connections.{checked_out,idle}, supavisor.tenants.active
  • Throughput counters: supavisor.client.queries_count, supavisor.client.joins.{ok,fail}, supavisor.{client,db}.network.{recv,send}
  • Lifecycle counters: supavisor.{client_handler,db_handler}.{started_count,stopped_count}, supavisor.db_handler.db_connection_count, supavisor.db_handler.prepared_statements_evicted_count
  • Latency histograms: supavisor.pool.checkout.duration.{local,remote}, supavisor.client.query.duration, supavisor.client.connection.duration, supavisor.client.connection.lifetime_ms, supavisor.client_handler.state.duration
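
The brace shorthand in the list above stands for several metric names per entry. A minimal sketch of how the shorthand expands, using a hypothetical helper `expand_braces` that is not part of the integration:

```python
import itertools
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand flat {a,b} groups in a metric-name pattern (no nesting)."""
    # Split into literal pieces and {a,b} groups, keeping the groups.
    parts = re.split(r"(\{[^{}]*\})", pattern)
    # Each group contributes its alternatives; each literal contributes itself.
    choices = [
        part[1:-1].split(",") if part.startswith("{") else [part]
        for part in parts
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


names = expand_braces("supavisor.{client,db}.network.{recv,send}")
for name in names:
    print(name)
```

For example, the network entry covers four metrics: send and receive counters for both the client side and the database side.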

Fixture (tests/fixtures/privileged_metrics.txt), assertion list (tests/common.py), and metric documentation (metadata.csv) are updated to match.
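
To illustrate why unmapped metrics were dropped, here is a minimal sketch of allowlist-style filtering. The names `SUPAVISOR_METRICS` and `filter_scraped` are hypothetical, the real layout of metrics.py may differ, and only the `supavisor_connections_active` pairing is taken from this PR's text:

```python
# Hypothetical excerpt of an allowlist: raw Prometheus name -> Datadog name.
# Only this one pairing is confirmed by the PR description.
SUPAVISOR_METRICS = {
    "supavisor_connections_active": "supavisor.connections.active",
}


def filter_scraped(samples: dict[str, float], allowlist: dict[str, str]) -> dict[str, float]:
    """Keep only allowlisted samples, renamed; everything else is dropped silently."""
    return {
        allowlist[name]: value
        for name, value in samples.items()
        if name in allowlist
    }


# "example_unmapped_metric" is a made-up name standing in for any unmapped series.
scraped = {"supavisor_connections_active": 42.0, "example_unmapped_metric": 1.0}
submitted = filter_scraped(scraped, SUPAVISOR_METRICS)
print(submitted)
```

Under this model, a metric exposed by the endpoint but absent from the mapping never reaches Datadog, which matches the behavior described above.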

Motivation

The most operationally useful Supavisor metric — supavisor_connections_active — backs the "Shared Pooler (Supavisor) Client Connections" graph in Supabase Studio's Observability section. It's the right metric to alert on pooler client-slot exhaustion, which is a common failure mode in serverless / Vercel Fluid Compute setups (e.g. supabase/discussions#40671) where pool sockets can leak across function terminations. Without these mappings, customers running the integration have no Datadog signal for this until the database starts rejecting connections.

Additional Notes

  • Same gap exists in the SaaS supabase_cloud tile integration (no supabase.cloud.supavisor.* metrics ingested). Filing a separate ask with the saas-integrations team so that the SaaS scraper mirrors these mappings.
  • Metric type/label data sourced from the canonical PromEx plugin in supabase/supavisor.

Review checklist (to be filled by the PR author)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ label attached
  • If the PR doesn't need to be tested during QA, please add a qa/skip-qa label

The Supabase customer Metrics API
(`/customer/v1/privileged/metrics`) exposes ~24 Supavisor-prefixed
Prometheus metrics from Supavisor's PromEx Tenant plugin -- pool
client/server connection counts, pool checkout latency histograms,
client query/connection latency, network throughput, and
client/db handler lifecycle counters. None of these were in the
integration's allowlist, so the scraper was dropping them.

The gauge `supavisor.connections.active` corresponds to the
"Shared Pooler (Supavisor) Client Connections" graph in the
Supabase Studio Observability section and is the metric needed
to alert on pooler client-slot exhaustion -- a common failure
mode in serverless / Vercel Fluid Compute setups where pools
leak across function terminations.

Reference: https://github.com/supabase/supavisor/blob/main/lib/supavisor/monitoring/tenant.ex

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 778ee7054b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@@ -0,0 +1 @@
Add support for Supavisor pooler metrics exposed by the Supabase customer metrics endpoint. Covers pool client/server connection counts, pool checkout durations, query/connection latency histograms, and network throughput counters. Enables alerting on pooler client-connection saturation, which previously had no Datadog coverage despite being a common failure mode in serverless / Fluid Compute setups.

P1 Badge Rename changelog fragment to use numeric PR prefix

This fragment filename (+supavisor.added) will break changelog validation in PR CI, because ddev/src/ddev/utils/scripts/check_pr.py:get_core_repo_changelog_errors parses the part before the first . as an integer PR number (int(entry_pr_num)), which raises ValueError for +supavisor. In practice, any PR containing this file will fail the changelog check before review/release automation can proceed; the fragment should be renamed to <PR_NUMBER>.<type>.
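
A simplified sketch of the parsing step described in this comment, assuming the check takes the text before the first `.` and converts it with `int()`. The function names are hypothetical and this is not the actual code from check_pr.py:

```python
def fragment_pr_number(filename: str) -> int:
    """Return the PR number encoded in a changelog fragment filename."""
    # Take the part before the first "." and require it to be an integer.
    entry_pr_num = filename.split(".", 1)[0]
    return int(entry_pr_num)  # raises ValueError for a non-numeric prefix


def is_valid_fragment(filename: str) -> bool:
    """Report whether the filename's prefix parses as a PR number."""
    try:
        fragment_pr_number(filename)
    except ValueError:
        return False
    return True


print(is_valid_fragment("+supavisor.added"))
print(is_valid_fragment("23749.added"))
```

Under that assumption, `+supavisor.added` fails because `+supavisor` is not an integer, while `23749.added` parses cleanly.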

Useful? React with 👍 / 👎.

Contributor Author

[Claude on behalf of Gus] Already addressed: the next commit (156c5d3) renames +supavisor.added to 23749.added. Codex was reviewing the first commit only. The Check PR changelog workflow has since passed against the renamed fragment.

@codecov

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.04%. Comparing base (36bd36a) to head (156c5d3).
⚠️ Report is 28 commits behind head on master.

@iliakur iliakur added this pull request to the merge queue May 22, 2026
Merged via the queue into DataDog:master with commit 74e3371 May 22, 2026
69 of 76 checks passed

3 participants