OOB in wakeupWaiters() chunking loop (WAIT FOR)

We found an out-of-bounds (OOB) array access in wakeupWaiters() in src/backend/access/transam/xlogwait.c. The variable i is initialized from lsnType, then reused as the wakeup loop counter. If a full wakeup batch is processed, i is left at 16 and is used again as an index into waitLSNState->waitersHeap[i] on the next do { ... } while (...) iteration, causing an OOB access (WAIT_LSN_TYPE_COUNT == 4).

We know that:

wakeupWaiters() uses a single local variable i both as the heap index derived from lsnType (waitersHeap[i]) and, later, as the loop counter for waking processes (for (i = 0; ... )).

If exactly 16 waiters are collected in one pass (WAKEUP_PROC_STATIC_ARRAY_SIZE), the function repeats its do {...} while (...) loop with i == 16, and then indexes waitersHeap[16] even though the array is only 4 elements long. That’s an out-of-bounds access into the shared-memory WaitLSNState object. (see src/include/access/xlogwait.h:36)

WaitLSNState stores waitersHeap[WAIT_LSN_TYPE_COUNT] immediately before the flexible procInfos[] array, in the same shared-memory allocation (see src/include/access/xlogwait.h:78).

So any index >= 4 is an OOB access into the adjacent shared-memory fields, i.e. into procInfos[].
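The layout argument can be sketched with a simplified model. The struct and function names below (LayoutHeap, LayoutProcInfo, LayoutState, layout_heap_offset, layout_index_is_past_heaps) are hypothetical stand-ins, not the real PostgreSQL definitions, and all field contents other than ph_root are placeholders. The only properties carried over from the report are that the heap array has WAIT_LSN_TYPE_COUNT elements and sits directly before the flexible array.

```c
#include <stddef.h>
#include <stdint.h>

#define WAIT_LSN_TYPE_COUNT 4	/* value quoted in the report */

/* Stand-in for pairingheap: a few pointer-sized slots (placeholders). */
typedef struct LayoutHeap
{
	void	   *slot0;			/* placeholder field */
	void	   *slot1;			/* placeholder field */
	void	   *ph_root;		/* the field pairingheap_is_empty() reads */
} LayoutHeap;

/* Stand-in for WaitLSNProcInfo; contents are placeholders. */
typedef struct LayoutProcInfo
{
	uint64_t	waitLSN;		/* placeholder field */
	void	   *heapLink;		/* placeholder field */
} LayoutProcInfo;

/* Model of the state object: heap array, then the flexible array. */
typedef struct LayoutState
{
	LayoutHeap	waitersHeap[WAIT_LSN_TYPE_COUNT];
	LayoutProcInfo procInfos[]; /* flexible array member */
} LayoutState;

/* Byte offset that waitersHeap[idx] would have; nothing is dereferenced. */
size_t
layout_heap_offset(size_t idx)
{
	return offsetof(LayoutState, waitersHeap) + idx * sizeof(LayoutHeap);
}

/* Returns 1 if waitersHeap[idx] would start at or beyond procInfos[]. */
int
layout_index_is_past_heaps(size_t idx)
{
	return layout_heap_offset(idx) >= offsetof(LayoutState, procInfos);
}
```

In this model, indexes 0..3 stay inside the heap array, while 4 and 16 both resolve to addresses inside the procInfos[] region.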

As for why wakeupWaiters() can re-enter the loop with i == 16: in wakeupWaiters() (see src/backend/access/transam/xlogwait.c:242 - src/backend/access/transam/xlogwait.c:310), i is initialized from lsnType once, before the do {} loop, and is then reused as the for loop counter. The do {} loop repeats when numWakeUpProcs == 16. At the end of the for loop, i == numWakeUpProcs, so i becomes 16 precisely when the repeat condition is true.

For example:

  1. Start: i = (int)lsnType (0..3), OK.
  2. Suppose at least 16 eligible waiters exist: the collection loop fills wakeUpProcs[] and sets numWakeUpProcs = 16.
  3. The for (i = 0; i < numWakeUpProcs; i++) loop ends with i == 16.
  4. Next iteration: while (!pairingheap_is_empty(&waitersHeap[i])) becomes waitersHeap[16] (OOB).

Note that the initial Assert(i >= 0 && i < WAIT_LSN_TYPE_COUNT); does not re-run on subsequent do {} iterations, so it doesn’t catch the overwritten i.
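The control flow described above can be modeled in isolation. This is a minimal sketch, not the actual wakeupWaiters() code: collect_batch, second_pass_index_buggy and second_pass_index_fixed are hypothetical names, the heap is replaced by a plain counter of eligible waiters, and the "fixed" variant only shows one possible shape of a fix (a separate loop counter), not necessarily what upstream would adopt.

```c
#include <assert.h>

#define WAIT_LSN_TYPE_COUNT 4				/* value quoted in the report */
#define WAKEUP_PROC_STATIC_ARRAY_SIZE 16	/* batch size quoted in the report */

/* Model helper: take up to one batch of waiters, return the batch size. */
static int
collect_batch(int *eligibleWaiters)
{
	int			n = 0;

	while (*eligibleWaiters > 0 && n < WAKEUP_PROC_STATIC_ARRAY_SIZE)
	{
		(*eligibleWaiters)--;
		n++;
	}
	return n;
}

/*
 * Buggy shape: i is both the heap index and the wakeup loop counter.
 * Returns the heap index a second do-while pass would use, or -1 if the
 * loop does not repeat.
 */
int
second_pass_index_buggy(int lsnType, int eligibleWaiters)
{
	int			i = lsnType;
	int			numWakeUpProcs = 0;
	int			pass = 0;

	assert(i >= 0 && i < WAIT_LSN_TYPE_COUNT);	/* checked only once */

	do
	{
		if (pass++ > 0)
			return i;			/* index that would reach waitersHeap[] */
		numWakeUpProcs = collect_batch(&eligibleWaiters);
		for (i = 0; i < numWakeUpProcs; i++)
		{
			/* the real code would wake one process here */
		}
	} while (numWakeUpProcs == WAKEUP_PROC_STATIC_ARRAY_SIZE);

	return -1;
}

/* One possible fix: keep the heap index and the loop counter separate. */
int
second_pass_index_fixed(int lsnType, int eligibleWaiters)
{
	int			heapIdx = lsnType;
	int			numWakeUpProcs = 0;
	int			pass = 0;

	do
	{
		/* re-checked on every pass in this sketch */
		assert(heapIdx >= 0 && heapIdx < WAIT_LSN_TYPE_COUNT);
		if (pass++ > 0)
			return heapIdx;
		numWakeUpProcs = collect_batch(&eligibleWaiters);
		for (int j = 0; j < numWakeUpProcs; j++)
		{
			/* the real code would wake one process here */
		}
	} while (numWakeUpProcs == WAKEUP_PROC_STATIC_ARRAY_SIZE);

	return -1;
}
```

With lsnType 3 and 20 eligible waiters, the buggy variant reports index 16 for the second pass, while the variant with a separate counter reports 3. With fewer than 16 waiters the loop never repeats and both return -1.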

This is dangerous, since a pairingheap is a small struct of pointers, and pairingheap_is_empty(h) reads h->ph_root (see src/include/lib/pairingheap.h:71, "typedef struct pairingheap").

So when the code treats bytes in/near procInfos[] as pairingheap, it will read a bogus ph_root pointer (from unrelated data); then pass the bogus “heap” into pairingheap_first() / pairingheap_remove_first(); then convert the returned bogus node pointer into a WaitLSNProcInfo * via pairingheap_container(...).
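To illustrate why the emptiness check gives no protection here, the following sketch copies arbitrary bytes into a heap-shaped struct and runs an emptiness test on it. It is a model under stated assumptions: GarbageHeap, garbage_heap_is_empty and garbage_bytes_look_nonempty are hypothetical names, the fields are stored as uintptr_t rather than real pointers so the example stays well-defined, and only ph_root corresponds to a field named in this report.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Heap-shaped stand-in; slots hold raw pointer-sized bit patterns. */
typedef struct GarbageHeap
{
	uintptr_t	slot0;			/* placeholder field */
	uintptr_t	slot1;			/* placeholder field */
	uintptr_t	ph_root;		/* models the root pointer */
} GarbageHeap;

/* Same idea as an emptiness check: the heap is empty iff root is null. */
int
garbage_heap_is_empty(const GarbageHeap *h)
{
	return h->ph_root == 0;
}

/*
 * Reinterpret unrelated bytes as a heap header.  Returns 1 if they look
 * like a non-empty heap, 0 if they look empty, -1 if too few bytes.
 */
int
garbage_bytes_look_nonempty(const unsigned char *bytes, size_t len)
{
	GarbageHeap h;

	if (len < sizeof(h))
		return -1;
	memcpy(&h, bytes, sizeof(h));	/* copy instead of casting the buffer */
	return !garbage_heap_is_empty(&h);
}
```

Any non-zero bit pattern in the root slot makes the bytes look like a non-empty heap, so in the real code the loop would go on to treat that pattern as a node pointer.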

For reproduction:

  1. Start a primary server built from upstream master.
  2. Get the current LSN and choose a target slightly ahead (e.g., +0x400000).
  3. Open 16+ sessions, each running WAIT FOR LSN '<target>' WITH (mode 'primary_flush');
  4. Generate WAL (e.g., pgbench -n -c 8 -T 10).
  5. The backend crashes and the postmaster restarts.

Using CFLAGS=-O1 -g -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize-recover=undefined and UBSAN_OPTIONS=print_stacktrace=1, we get:

UBSAN

Disclosure Timeline:

The issue does not affect stable branches.