Skip to content

build: forbid C++ sources outside src/export/#68

Merged
JoshDreamland merged 1 commit into
mainfrom
port-hook-path-to-c
May 13, 2026
Merged

build: forbid C++ sources outside src/export/#68
JoshDreamland merged 1 commit into
mainfrom
port-hook-path-to-c

Conversation

@JoshDreamland
Copy link
Copy Markdown
Contributor

@JoshDreamland JoshDreamland commented Apr 13, 2026

TL;DR

We can't separate the plugin build (C) entirely from the bgworker (C++); these build as one library, which forks postmaster.

Instead, modify the CMake build to

  1. Assert there are no C++ sources outside of the bgworker
  2. Use clang (now installed in workflow) to disallow C++ ctors below main
    • This prevents anything stupid from happening in the postmaster process

Also fix errors Clang tags but not GCC...

Moar text plz LLM Daddy

Builds on @serprex's #88 (C99 port of the plugin layer, now merged) by enforcing the C/C++ split in the build system itself. Single commit, two enforcements:

1. File-extension lock

Replaces the catch-all file(GLOB_RECURSE SOURCES src/*.cc src/*.c) with two separate globs — plugin = src/*.c, exporter = src/export/*.cc — plus a configure-time check that fails the build if any .cc / .cxx / .cpp file appears under src/ outside src/export/. Same single shared library is produced — postmaster still loads one .so, the bgworker still inherits one address space — but a stray src/hooks/helper.cc (or any future "just a tiny C++ helper") will now fail configure with a named, actionable error before any compilation runs.

2. No load-time globals in the exporter

Even inside src/export/, where C++ is allowed, forbid namespace-scope objects whose constructors run at .so load time. The hazard: static std::thread g_worker = std::thread(work); or static std::map<...> g_cache; — a global whose constructor runs inside postmaster before _PG_init even fires, potentially allocating, starting threads, opening connections, taking locks, or otherwise touching state before PG is ready. -Wglobal-constructors flags these; -Werror=global-constructors promotes to build failure.

Clang-only — GCC has no equivalent flag (requested upstream for years, never landed). Bracketed with if(CMAKE_CXX_COMPILER_ID MATCHES "Clang") so the GCC matrix stays green. A small parallel CI job (Build (Clang, PG 18)) runs the Clang build so regressions are caught at PR time.

Nothing in the tree trips the warning today — g_exporter's constructor is constexpr-initialized via std::unique_ptr()'s constexpr default ctor; the rest of src/export/ namespace scope is constexpr PODs and using aliases.

What this does not do

  • -Wexit-time-destructors (the sibling warning that fires on non-trivial destructors running at exit) is intentionally NOT enabled. It would trip on g_exporter's ~unique_ptr and two cached Arrow shared_ptr<DataType> statics in arrow_batch.cc. The fix is a handful of [[clang::no_destroy]] annotations — easy, but the user's stated primary concern is load-time globals (the "std::thread above main" case). Exit-time destructors are secondary (process is shutting down anyway). Easy to layer on later if a regression motivates it.

Stacking

Targets main directly (now that #88 has merged). #94 stacks on top of this for the exception barrier work.

Test plan

  • CI green across PG 16/17/18 on the GCC matrix.
  • New Build (Clang, PG 18) CI job green — confirms -Werror=global-constructors compiles clean against current code.
  • Manual smoke: temporarily mv src/hooks/hooks.c src/hooks/hooks.cc, run cmake -B build — should fail configure with the named error before any compile.
  • Manual smoke: temporarily add static std::vector<int> g_test; at the top of stats_exporter.cc, build with Clang — should fail with -Werror=global-constructors.

Followup

The parent_query_id / nested-query-stack work that used to live on this branch is preserved on parent-query-id-preserved for a separate PR after the safety stack (this + #94) lands.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 13, 2026 19:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR ports the PostgreSQL hook/queue path from C++-style code to pure C to avoid RAII-driven heap allocation failure modes, and extends exported telemetry to include parent_query_id for nested query attribution.

Changes:

  • Converted hook/queue/normalization codepaths to C (remove extern "C" blocks in .c, replace nullptr/static_cast/std::array usage).
  • Added per-query stack tracking to compute parent_query_id and improve nested query CPU attribution.
  • Updated ClickHouse schema + added migration for the new parent_query_id column; updated build to compile .c sources.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/queue/shmem.h Adds C-friendly forward declaration for shared state struct.
src/queue/shmem.c Ports implementation to C constructs (NULL, C casts, void params).
src/queue/local_batch.c Replaces std::array with fixed C array; C-style callback signatures.
src/queue/event.h Makes enums/structs more C-friendly; adds parent_query_id field.
src/pg_stat_ch.c Removes C++ linkage wrappers and ports nullptrNULL.
src/hooks/query_normalize_state.h Ports state types/APIs to C typedefs and value-passing.
src/hooks/query_normalize_state.c Replaces absl hashing with Postgres hash_bytes; C cleanups.
src/hooks/query_normalize.c Ports C++ casts/nullptr checks to C equivalents.
src/hooks/hooks.c Introduces fixed-size query frame stack; computes parent_query_id; C ports.
src/export/stats_exporter.cc Exports new parent_query_id column to ClickHouse/OTel exporters.
src/config/guc.c Replaces std::array enum entries with a C array; ports NULL usage.
migrations/001_add_parent_query_id.sql Adds ClickHouse migration for parent_query_id.
docker/init/00-schema.sql Updates canonical ClickHouse schema to include parent_query_id.
CMakeLists.txt Ensures .c sources are compiled and linked into the shared library.
Comments suppressed due to low confidence (5)

src/hooks/hooks.c:57

  • The comment says the fixed-size stack allows "64 levels" of nesting, but PSCH_MAX_QUERY_NESTING_DEPTH is currently 8. Please reconcile the comment and the constant (either increase the constant or update the documented limit).
    src/queue/shmem.c:149
  • The elog() format string uses %lu for values returned by pg_atomic_read_u64(). On platforms where uint64 is not unsigned long (e.g., 32-bit builds), this is undefined behavior and can print incorrect values. Prefer PostgreSQL's format macros (e.g., UINT64_FORMAT) or cast to an explicitly matching type before formatting.
    src/hooks/hooks.c:807
  • CALL_PROCESS_UTILITY() can raise ERROR (longjmp). Since a frame is pushed before this call, an ERROR path will skip QueryStackPop(), leaving query_stack_depth permanently incremented and corrupting parent_query_id/CPU attribution for subsequent statements. Wrap the call in PG_TRY/PG_FINALLY (or similar) and ensure the stack is unwound in the FINALLY path, even on ERROR/ereport.
    src/hooks/hooks.c:602
  • QueryStackPush() happens before chaining to prev/standard ExecutorStart. If that downstream call ereports(ERROR), this function will longjmp without unwinding the stack, leaving query_stack_depth corrupted for the rest of the session. Consider pushing only after a successful ExecutorStart call, or wrapping the ExecutorStart call in PG_TRY/PG_CATCH and unwinding the stack in the error path before rethrowing.
    src/hooks/hooks.c:667
  • QueryStackPop() (and its internal getrusage()) runs before checking !psch_enabled, so disabling pg_stat_ch still pays the per-query getrusage cost. If pg_stat_ch is disabled, consider unwinding only the depth counter (no timestamp/rusage capture) or gating QueryStackPush/Pop's expensive work on psch_enabled while still keeping the stack balanced.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/export/stats_exporter.cc Outdated
col_username->Append(std::string(ev.username, ev.username_len));
col_pid->Append(ev.pid);
col_query_id->Append(static_cast<int64_t>(ev.queryid));
col_parent_query_id->Append(static_cast<int64_t>(ev.parent_query_id));
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parent_query_id is exported as a UInt64 column, but it's being cast to int64_t before appending. This will corrupt values when parent_query_id exceeds INT64_MAX. Append as uint64_t (or pass ev.parent_query_id directly) to preserve the full range.

Suggested change
col_parent_query_id->Append(static_cast<int64_t>(ev.parent_query_id));
col_parent_query_id->Append(ev.parent_query_id);

Copilot uses AI. Check for mistakes.
Comment thread migrations/001_add_parent_query_id.sql Outdated
Comment on lines +3 to +4
-- Introduced in pg_stat_ch 0.4.x. Each event now carries the query_id of its
-- calling query (e.g. the plpgsql function that issued an SPI statement).
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The migration header says this is introduced in pg_stat_ch 0.4.x, but the extension control file currently advertises default_version = 0.1. Please align the referenced version (or describe it without a specific version) to avoid confusing upgrade guidance.

Suggested change
-- Introduced in pg_stat_ch 0.4.x. Each event now carries the query_id of its
-- calling query (e.g. the plpgsql function that issued an SPI statement).
-- Introduces parent_query_id support. Each event now carries the query_id of
-- its calling query (e.g. the plpgsql function that issued an SPI statement).

Copilot uses AI. Check for mistakes.
Comment thread src/queue/event.h Outdated
Comment on lines +66 to +68
// Event structure stored in shared memory queue (~2KB fixed size)
// Forward-declare typedef so C code can use PschEvent without the struct tag
typedef struct PschEvent PschEvent;
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says the shared-memory event is "~2KB", but with PSCH_MAX_ERR_MSG_LEN (2048) + PSCH_MAX_QUERY_LEN (2048) plus headers the struct is ~4.5KB. Updating this helps keep sizing/backpressure guidance accurate.

Copilot uses AI. Check for mistakes.
Comment thread src/hooks/hooks.c Outdated
Comment on lines +206 to +207
size_t src_len;
size_t len;
Copy link
Copy Markdown
Member

@serprex serprex Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need to do this (even tho postgres code often does..)
https://www.postgresql.org/docs/current/source-conventions.html C99 since PG12

@serprex
Copy link
Copy Markdown
Member

serprex commented Apr 13, 2026

has some stack code mixed up in here from #61

@serprex
Copy link
Copy Markdown
Member

serprex commented May 7, 2026

#88 isolates porting to C

@JoshDreamland JoshDreamland force-pushed the port-hook-path-to-c branch from cc1991f to ebeaeb9 Compare May 12, 2026 20:21
@JoshDreamland JoshDreamland changed the title Port hook path to C build: forbid C++ sources outside src/export/ May 12, 2026
@JoshDreamland JoshDreamland changed the base branch from main to c May 12, 2026 20:21
Base automatically changed from c to main May 12, 2026 20:59
Copilot AI review requested due to automatic review settings May 12, 2026 21:00
@JoshDreamland JoshDreamland force-pushed the port-hook-path-to-c branch from fde97e5 to eb7cc35 Compare May 12, 2026 21:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread CMakeLists.txt Outdated
"or port it to C. Offending file(s):\n ${PG_STAT_CH_STRAY_CXX}")
endif()

file(GLOB_RECURSE PG_STAT_CH_EXPORTER_SOURCES src/export/*.cc)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would need a new word to capture how supremely OK I am with this

Comment thread .github/workflows/ci.yml
run: |
sudo apt-get install -y \
postgresql-18 postgresql-server-dev-18 \
cmake ninja-build clang libssl-dev pkg-config curl zip unzip tar
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix this if the CI ever like actually fails

@JoshDreamland JoshDreamland force-pushed the port-hook-path-to-c branch from eb7cc35 to 5566554 Compare May 12, 2026 21:44
Copilot AI review requested due to automatic review settings May 13, 2026 01:12
@JoshDreamland JoshDreamland force-pushed the port-hook-path-to-c branch from 5566554 to 0cb3ee0 Compare May 13, 2026 01:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment thread CMakeLists.txt Outdated
Comment thread CMakeLists.txt Outdated
Comment on lines +49 to +53
file(GLOB_RECURSE PG_STAT_CH_PLUGIN_SOURCES src/*.c)

file(GLOB_RECURSE PG_STAT_CH_STRAY_CXX
src/*.cc src/*.cxx src/*.cpp
)
@JoshDreamland JoshDreamland force-pushed the port-hook-path-to-c branch from 0cb3ee0 to 3a6176c Compare May 13, 2026 01:24
Copilot AI review requested due to automatic review settings May 13, 2026 01:26
@JoshDreamland JoshDreamland force-pushed the port-hook-path-to-c branch from 3a6176c to 369b69d Compare May 13, 2026 01:26
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

The reason #88 ported the plugin layer to C is to keep the PostgreSQL ↔
extension boundary in pure C, so C++ exception unwinding can never cross
PG's longjmp-driven error frames.  Nothing in the build system enforced
that split: a stray `src/hooks/helper.cc` would happily get globbed into
the shared library, reintroducing RAII, libstdc++, and the whole class
of UB #88 just eliminated.

Two enforcements:

1. File-extension lock.  Replace the catch-all
   `file(GLOB_RECURSE SOURCES src/*.cc src/*.c)` with two separate globs
   (plugin = `src/*.c`, exporter = `src/export/*.cc`) plus a
   configure-time check that fails with a named, actionable error if any
   `.cc` / `.cxx` / `.cpp` appears under `src/` outside `src/export/`.

2. No load-time / exit-time work in namespace-scope globals (Clang).
   Apply -Wglobal-constructors -Werror=global-constructors to the
   exporter sources.  Catches the "static std::thread g_worker =
   std::thread(work);" / "static std::map<...> g_cache;" hazard where a
   global allocates / starts threads / opens connections inside the
   postmaster before _PG_init is even reached.  Clang's diagnostic
   covers both halves at namespace scope: load-time constructors AND
   exit-time destructors.

The second lock is Clang-only — GCC has no equivalent flag.  Bracket
the option with a compiler-ID check so the GCC matrix stays green, and
add a small parallel CI job (`Build (Clang, PG 18)`) so regressions are
caught at PR time on the Clang path.

One existing finding to clear: `g_exporter` in stats_exporter.cc has a
non-trivial ~unique_ptr at namespace scope, which Clang flags as an
exit-time destructor.  The destructor is actually a no-op in every code
path — PschExporterShutdown runs in on_proc_exit and resets the
unique_ptr before the C++ atexit chain ever fires.  Annotate with
`[[clang::no_destroy]]` (gated behind __has_cpp_attribute for GCC
compatibility) so the suppression is explicit and the warning compiles
clean.

-Wexit-time-destructors (the sibling warning) is intentionally NOT
enabled.  At namespace scope, -Wglobal-constructors already covers what
matters.  -Wexit-time-destructors additionally fires on function-local
statics — the two cached Arrow DataType shared_ptrs in arrow_batch.cc
would trip it today.  Easy to layer on later via a few more
[[clang::no_destroy]] annotations if a regression motivates it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JoshDreamland JoshDreamland force-pushed the port-hook-path-to-c branch from 369b69d to 68d74e5 Compare May 13, 2026 02:05
Comment thread .github/workflows/ci.yml
name: Build (Clang, PG 18)
runs-on: depot-ubuntu-22.04-16
steps:
- uses: actions/checkout@v4
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v6 is latest, never let claude imagine what latest version of anything is

@JoshDreamland JoshDreamland merged commit 9dcdaae into main May 13, 2026
11 checks passed
@JoshDreamland JoshDreamland deleted the port-hook-path-to-c branch May 13, 2026 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants