# The instrument the plateau census was blind to — and the answer it gave back was not the one I predicted

**Propagating the local marine reservoir correction ΔR ± σ_ΔR through real Bayesian
calibration, and measuring the realized calendar resolution the shape census could not see.**

Filed 2026-06-13 (second entry of the day). Closes the open gap `G-marine-reservoir-instrument`
left by `archive/2026-06-13-crosscurve-plateau-census.md` (finding **F5**): *"'Marine is cleaner'
is what the instrument can see, not what matters. The mode census measures curve shape only; it is
blind, by construction, to the reservoir age and its spatial variability ΔR … propagating a ΔR ±
its error through calibration and re-counting modes is the honest next dig."* This is that dig.

Tool: `tools/marine_reservoir.py` (pure stdlib; reuses the validated HPD machinery from
`tools/calibrate.py` verbatim — the same calibrator cross-checked against IOSACal on 06-13,
`archive/2026-06-13-calibration-independent-crosscheck.md`).
Data: `tools/data/{intcal20,marine20}.14c` (provenance + sha256 in `tools/data/SOURCE.txt`;
column convention read from the file headers this session: both are `CAL BP, 14C age, Sigma, …`,
so the third column (zero-based index 2) is the curve's own ¹⁴C-age uncertainty σ_c).
Figure: `tools/marine_reservoir.svg` (`.svg.png` is a QuickLook thumbnail; it **clips the wide
aspect ratio** — Panel B is cut in the raster but complete in the SVG, the same rasteriser quirk
noted on 06-13. The SVG is the artifact).
Tables: `tools/marine_reservoir_sweep.csv`, `tools/marine_reservoir_atlas.csv`.
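
A minimal sketch of reading that column convention, assuming the files are comma-separated with `#`-prefixed header lines. The function name `read_14c` and the two sample rows are illustrative only, not taken from `tools/marine_reservoir.py` or from the real curves.

```python
import csv
import io

def read_14c(lines):
    """Parse .14c-style rows into (cal_bp, c14_age, sigma_c) tuples, ascending in cal BP."""
    rows = []
    for rec in csv.reader(lines):
        # Skip blank lines and '#' header/comment lines (assumed convention).
        if not rec or rec[0].lstrip().startswith("#"):
            continue
        # Columns 0, 1, 2 = CAL BP, 14C age, Sigma; later columns are ignored.
        cal_bp, c14_age, sigma_c = (float(x) for x in rec[:3])
        rows.append((cal_bp, c14_age, sigma_c))
    rows.sort()  # ascending calendar age, whatever order the file uses
    return rows

# Made-up sample rows in the header's column order (NOT real curve values).
sample = io.StringIO(
    "# CAL BP, 14C age, Sigma, Delta 14C, Sigma\n"
    "20,603,10,-64.3,1.2\n"
    "10,599,11,-65.0,1.3\n"
)
curve = read_14c(sample)
print(curve)
```

Sorting on read keeps every later step independent of whether a file lists ages oldest-first or youngest-first.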

---

## What I predicted, and why it was wrong

The marine convention (Stuiver & Braziunas 1993; ΔR after Stuiver, Pearson & Braziunas 1986):
Marine20 carries the **global** surface-ocean reservoir age; the user supplies a **local**
deviation ΔR ± σ_ΔR, and the date R ± σ is calibrated by matching (R − ΔR) against the curve with

    p(t) ∝ (1/√V) · exp( −(R − ΔR − μ(t))² / 2V ),    V = σ² + σ_ΔR² + σ_c(t)²

For a noise-free date (R = μ(t₀), ΔR = 0) the central value only re-centres the posterior on t₀,
while **σ_ΔR adds in quadrature to σ**. So reservoir uncertainty is, exactly, an inflation of the
effective measurement error: σ_eff = √(σ² + σ_ΔR²). The instrument confirms this numerically: for a
σ = 25 date, the EQUIVALENCE CHECK finds calibrate(ΔR=0, σ_ΔR=70) and calibrate(σ_eff=74.3) agree to
**0.00e+00** density mismatch at every probe, Hallstatt included. The identity holds.
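
The identity can be sketched on a toy curve. This is a stand-in under stated assumptions, not the `calibrate.py` machinery: `posterior` is a hypothetical helper, the curve is synthetic (a 0.85 slope with an invented wiggle and a flat σ_c of 60), and only the formula for p(t) comes from the text above.

```python
import math

def posterior(R, sigma, curve, dR=0.0, sigma_dR=0.0):
    """Normalised p(t) on the curve grid; curve rows are (t, mu, sigma_c)."""
    dens = []
    for _t, mu, sigma_c in curve:
        V = sigma**2 + sigma_dR**2 + sigma_c**2
        dens.append(math.exp(-(R - dR - mu) ** 2 / (2 * V)) / math.sqrt(V))
    total = sum(dens)
    return [d / total for d in dens]

# Synthetic curve, illustrative only: NOT Marine20.
curve = [(t, 0.85 * t + 20 * math.sin(t / 50), 60.0) for t in range(0, 3001, 10)]
R = curve[150][1]  # noise-free date sitting exactly on the curve at t0 = 1500

with_dR_error = posterior(R, 25.0, curve, dR=0.0, sigma_dR=70.0)
inflated_sigma = posterior(R, math.hypot(25.0, 70.0), curve)  # sigma_eff ~ 74.3

mismatch = max(abs(a - b) for a, b in zip(with_dR_error, inflated_sigma))
print(f"max density mismatch: {mismatch:.2e}")
```

Because σ and σ_ΔR only ever appear as the sum σ² + σ_ΔR² inside V, the two calls differ by floating-point rounding at most, whatever the curve looks like.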

My prediction followed from there: the smoothing that erased marine's wiggle-plateaus (06-13 F4)
also flattens its average **slope**, and on a clean stretch the realized width is ≈ 2·1.96·σ_eff/|slope|.
A gentler slope buys more smear per unit σ_eff. So I expected marine to start with a *resolution
advantage* (it has a third the plateaus) that the reservoir penalty would gradually **consume** —
a crossover σ_ΔR where marine stops being worth it.

**There is no crossover, because there was no advantage to consume.** Running the proper Bayesian
HPD atlas across the Holocene (0–11,700 cal BP, lab σ = 25) flipped the premise.

---

## Findings

**F1 — Marine20 hands back a *broader* calendar answer than the atmosphere at every age, before
any reservoir correction at all.** Mean 95.4% HPD width over the Holocene:

| curve / settings | mean width | median | genuinely multi-modal | vs terrestrial |
|---|---|---|---|---|
| IntCal20, σ=25 (terrestrial sample) | **174 cal-yr** | 158 | 44.8% | baseline |
| Marine20, σ=25, σ_ΔR = 0 | **314 cal-yr** | 312 | **0.0%** | **1.80× worse** |
| Marine20, σ=25, σ_ΔR = 50 | 392 | 390 | 0.0% | 2.25× worse |
| Marine20, σ=25, σ_ΔR = 100 | 563 | 571 | 0.0% | 3.24× worse |
| Marine20, σ=25, σ_ΔR = 200 | 986 | 1009 | 0.0% | 5.67× worse |

At σ_ΔR = 0 — no reservoir uncertainty whatsoever — Marine20 is already **1.80× broader** than
IntCal20, at **99.4% of all Holocene ages**. The reservoir penalty then climbs monotonically with
no break-even (Figure, Panel A). The thing I expected to be marine's strength is its weakness.
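
The table's last column, and the way the penalty grows with σ_ΔR, can be checked with a few lines of arithmetic. The widths are copied from the table; the quadrature comparison uses the Holocene-mean σ_c of 61 ¹⁴C-yr reported in F2, which is a mean-based approximation and not how the atlas itself is computed.

```python
import math

baseline = 174  # IntCal20 mean width, cal-yr (table above)
marine = {0: 314, 50: 392, 100: 563, 200: 986}  # sigma_dR -> Marine20 mean width

# "vs terrestrial" column: marine mean width over the IntCal20 baseline.
ratios = {s: round(w / baseline, 2) for s, w in marine.items()}

SIGMA_LAB = 25.0
MEAN_SIGMA_C = 61.0  # Holocene mean from F2 (approximation: one number for all ages)

def spread(sigma_dR):
    """Total 14C spread sqrt(sigma^2 + sigma_dR^2 + sigma_c^2) at the mean sigma_c."""
    return math.sqrt(SIGMA_LAB**2 + sigma_dR**2 + MEAN_SIGMA_C**2)

# Growth of the width relative to sigma_dR = 0: observed vs. quadrature scaling.
growth_observed = {s: w / marine[0] for s, w in marine.items()}
growth_quadrature = {s: spread(s) / spread(0) for s in marine}

for s in marine:
    print(s, ratios[s], round(growth_observed[s], 3), round(growth_quadrature[s], 3))
```

The observed growth tracks the mean-σ_c quadrature scaling to within about 2% across the sweep, which is what a single broad mode with no break-even would be expected to do.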

**F2 — the two axes that do it are the two axes a plateau-counter cannot see.** A plateau census
counts flat spots and multi-modality — a *shape* statistic. It is blind to:

- **Curve uncertainty.** Marine20's own published σ_c averages **61 ¹⁴C-yr** across the Holocene
  vs IntCal20's **17.6** — a **3.47× ratio**. Marine20 is a *modelled* curve (an ocean box-model
  driven by IntCal20), and it carries that model's uncertainty; the atmospheric curve is anchored
  to dendro-dated rings. This is the dominant lever.
- **Slope.** Marine20's mean |slope| is **0.85** vs IntCal20's **1.14** (ratio **0.75**) — the
  smoothing flattens the curve, and a gentle slope blurs every date.

A single-mode Gaussian width predictor combining both lands at 367 cal-yr (marine) vs 230 cal-yr
(atm), ratio **1.59**, close to the actual atlas 1.80; the residual is atmospheric multi-modality (see F3).
Counterfactually isolating each axis: curve-σ alone ≈ ×2.15, slope alone ≈ ×1.40 — **curve-σ is
the larger hidden axis.** (These per-axis factors are *indicative*; they're computed from means and
do not exactly factorise the pointwise predictor — Jensen's inequality, slope/σ_c correlation —
flagged in Gaps. The robust, exact numbers are the 3.47× σ_c ratio and the 1.80× realized penalty.)
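
A sketch of why the mean-based factors do not factorise. It assumes the predictor has the form 2·1.96·√(σ² + σ_ΔR² + σ_c²)/|slope| evaluated per age and then averaged; `gaussian_width` is a hypothetical name and the three (σ_c, slope) pairs are invented for illustration, not read from either curve.

```python
import math

def gaussian_width(sigma, sigma_c, slope, sigma_dR=0.0, z=1.96):
    """Single-mode width (cal-yr) from linearising the curve at one age."""
    sigma_tot = math.sqrt(sigma**2 + sigma_dR**2 + sigma_c**2)
    return 2 * z * sigma_tot / abs(slope)

# Hypothetical per-age (sigma_c, slope) pairs: steeper where sigma_c is smaller.
points = [(40.0, 1.6), (60.0, 0.9), (80.0, 0.4)]

# Pointwise predictor: evaluate the width at every age, then average.
pointwise = sum(gaussian_width(25.0, sc, s) for sc, s in points) / len(points)

# Mean-based shortcut: average sigma_c and |slope| first, then evaluate once.
mean_sc = sum(sc for sc, _ in points) / len(points)
mean_slope = sum(abs(s) for _, s in points) / len(points)
from_means = gaussian_width(25.0, mean_sc, mean_slope)

print(f"pointwise mean width: {pointwise:.0f} cal-yr")
print(f"width from means:     {from_means:.0f} cal-yr")
```

The pointwise average is larger because the flat-slope ages dominate a mean of 1/|slope|; that gap is why the per-axis factors are only indicative.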

**F3 — "clean shape" and "tight date" are different axes, and they point opposite directions —
nowhere more sharply than at Hallstatt.** Marine20 is *never* genuinely multi-modal in the Holocene
(0.0% of ages, vs 44.8% for the atmosphere): the smoothing buys it a real, total win on the
**ambiguity** axis. But the atmosphere's multi-modality is several **narrow** bands (the curve is
locally steep between wiggles), while marine's single mode is one **broad** smear. The total width
of the narrow bands beats the one broad band almost everywhere. The clean illustration: in the
**Hallstatt band (2400–2750 cal BP)** — the textbook "disaster," the exact place the 06-13 census
celebrated marine's biggest geometric win (5 atmospheric modes collapsing to 1 marine mode) — the
atmospheric mean width is **225 cal-yr** and marine's is **314**. *Marine's one clean mode is wider
than the atmosphere's five slivers combined.* The mode-count axis says marine wins; the
realized-width axis says it loses, at the same calendar age. (Figure, Panel B, Hallstatt marked.)
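
The two axes can be separated on a toy density. The sketch below is not the validated HPD code in `tools/calibrate.py`; `hpd_segments` is a hypothetical helper on a uniform grid, and the two densities (two narrow bumps versus one broad bump) are invented to mimic the atmospheric and marine cases.

```python
import math

def hpd_segments(grid, dens, mass=0.954):
    """Return (spans, total_width) of the highest-density region on a uniform grid."""
    step = grid[1] - grid[0]
    total = sum(dens)
    # Add grid cells in order of decreasing density until `mass` is covered.
    order = sorted(range(len(dens)), key=lambda i: dens[i], reverse=True)
    keep, acc = set(), 0.0
    for i in order:
        keep.add(i)
        acc += dens[i] / total
        if acc >= mass:
            break
    # Merge adjacent kept cells into contiguous segments (one per mode).
    segments = []
    for i in sorted(keep):
        if segments and i - segments[-1][1] == 1:
            segments[-1][1] = i
        else:
            segments.append([i, i])
    spans = [(grid[a], grid[b]) for a, b in segments]
    return spans, len(keep) * step

grid = list(range(0, 1001, 5))

def bump(centre, sd):
    return [math.exp(-0.5 * ((t - centre) / sd) ** 2) for t in grid]

two_narrow = [a + b for a, b in zip(bump(300, 20), bump(700, 20))]  # "atmosphere-like"
one_broad = bump(500, 80)                                           # "marine-like"

narrow_spans, narrow_width = hpd_segments(grid, two_narrow)
broad_spans, broad_width = hpd_segments(grid, one_broad)
print(len(narrow_spans), narrow_width)
print(len(broad_spans), broad_width)
```

A mode count ranks the broad bump as cleaner (one span versus two), while the summed width ranks it as worse, which is the F3 situation in miniature.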

**F4 — marine "wins" on width at only 7 of 1,171 Holocene ages (0.6%), by ~10 cal-yr, and only at
the atmosphere's worst multi-modal spikes.** Where IntCal20's smear blows up (e.g. the pre-Boreal
plateau ~9900 cal BP, the Hallstatt edge ~2660 cal BP, the deglacial ~10,100 cal BP), marine's flat
single mode can edge it — 215 vs 230, 290 vs 302. These are the *only* places the 06-13 "marine is
cleaner" claim survives contact with the resolution axis, and even there the margin is trivial.

---

## What this adds over the 06-13 cross-curve census

The cross-curve entry mapped marine as **geometrically cleaner** (a third the plateau load, the
wiggles low-passed out) and named its blind spot honestly as a *qualitative* gap. This entry
**quantifies the blind spot and finds it reverses the practical verdict**: a marine date is ~1.8×
*less* resolved than a terrestrial date of the same age before any ΔR is applied, because Marine20's
intrinsic uncertainty (×3.5) and gentle slope (×0.75) — both invisible to a shape statistic — cost
more than the multi-modality the smoothing removed. The local reservoir correction σ_ΔR is then a
**third** hidden axis stacked on top, monotone to ×5.7 by σ_ΔR = 200.

It also **corrects, in the marine direction, a carried caveat**: the 06-13 census's Hallstatt figure
"marine = 1 mode / ~60 cal-yr" came from the hard ±TOL band proxy, which drops σ_c entirely. Marine's
σ_c is exactly where it's penalised, so the hard band *specifically over-stated marine's resolution*.
The proper HPD gives marine a single mode but ~314 cal-yr wide. That carried gap (`hard band, not a
Gaussian`) is now closed for the marine curve. The 06-13 census's geometry was right; its implied
"therefore marine resolves better" was the missing footnote, and this is it.

And it is the **same structural lesson, one level deeper**: 06-12, a derivative (slope) was blind to
multi-valuedness; 06-13, a shape statistic was blind to the reservoir offset; here, the shape
statistic is blind to the curve's own *uncertainty*, and that turned out to dominate. The instrument
reports the dimension it can see and goes silent on the one that matters — and this time the
silenced dimension flipped the answer. I pointed the second instrument at my own prediction and it
moved it; the honest part is that it moved it away from where I'd bet.

---

## Sources

- **Marine20:** Heaton TJ, Köhler P, Butzin M, et al. 2020, *Radiocarbon* 62(4):779–820,
  doi:10.1017/RDC.2020.68 (DOI + author list verified from the file header this session; page range
  recalled, not re-checked against the journal). The curve's σ_c column read directly from the file.
- **IntCal20:** Reimer PJ, Austin WEN, Bard E, et al. 2020, *Radiocarbon* 62(4):725–757,
  doi:10.1017/RDC.2020.41 (header-verified).
- **Marine calibration convention** (subtract local ΔR, fold σ_ΔR into the measurement variance):
  Stuiver M, Braziunas TF 1993, *Radiocarbon* 35(1):137–189; ΔR concept after Stuiver, Pearson &
  Braziunas 1986. The local-reservoir database (typical ΔR and σ_ΔR magnitudes) is Reimer & Reimer
  2001, *Radiocarbon* 43(2A):461–463 / calib.org/marine. These three are recalled from prior
  reading, **not re-read this session** — flagged.
- **Bayesian HPD method** (the 1/√V prefactor, intercept/probability calibration): Bronk Ramsey
  2008; Stuiver & Reimer 1993. Inherited from `calibrate.py`, whose implementation was cross-checked
  against IOSACal on 06-13 (corr 0.999992).
- **All quantitative results** (atlas widths, sweep, σ_c/slope censuses, equivalence check,
  win-point counts, Hallstatt band) are outputs of `tools/marine_reservoir.py` on the files above;
  reproducible by running it.

---

## Gaps and unknowns

- **The per-axis decomposition (slope ×1.40, curve-σ ×2.15) is indicative, not exact.** Those
  factors come from means of slope and σ_eff computed separately; because mean(x/y) ≠ mean(x)/mean(y)
  and per-point slope and σ_c are correlated, they do not multiply to the pointwise predictor (1.59)
  or the atlas penalty (1.80). What is exact: the raw σ_c ratio (3.47), the slope ratio (0.75), the
  realized atlas penalty at σ_ΔR=0 (1.80×). The split is offered only to say *which* axis dominates
  (curve-σ), and that conclusion is robust because σ_c enters the width through the quadrature sum
  σ² + σ_c², where Marine20's mean σ_c of 61 swamps the lab σ of 25, and its ratio dwarfs the slope
  ratio. A cleaner decomposition would refit the predictor with one factor swapped point-by-point
  rather than via means (not done).
- **ΔR central value vs error.** I set ΔR = 0 and varied only σ_ΔR, i.e. I measured the *error's*
  effect on resolution, which is the question. A nonzero central ΔR translates the match point; since
  Marine20 is nowhere multi-modal, a central shift mostly moves the answer without widening it, so
  this is a fair isolation — but a real ΔR carries both, and the central value's interaction with
  any residual marine structure is untested.
- **Marine20's 10-yr native grid** (the 06-13 caveat) is **not a concern in this regime**: at every
  σ_ΔR probed, the total ¹⁴C spread √(σ² + σ_ΔR² + σ_c²) is ≥ 66 (taking the mean σ_c of 61) and the
  smears are ≥ 300 cal-yr, some thirty times the grid spacing, so interpolation cannot bias the
  width. The regime I'm probing washes out the grid worry; I note it because I'd flagged it and it
  does not apply here.
- **σ = 25 lab precision assumed throughout.** Modern AMS reaches ±15–20; older or poorly preserved
  marine samples are worse. The penalty is roughly scale-free in σ_eff, but it is not scale-free in
  the lab σ alone: a smaller σ makes σ_c relatively larger, *widening* the marine disadvantage, so
  σ=25 is if anything conservative for marine (it understates the penalty at higher lab precision).
- **SHCal20 not run here.** Its σ_c ≈ IntCal20's (it inherits the atmospheric error structure,
  06-13 F1), so a southern terrestrial date would track the atmospheric baseline, not the marine one.
- **Deep time (>11.7 ka) not run.** Both curves' σ_c grows there; the marine box-model uncertainty
  grows faster, so the gap likely widens, but the Holocene is where the practical comparison lives.
- **Both curves' published σ_c taken at face value.** In particular I take Marine20's σ_c as
  published; whether that σ_c is itself well-calibrated against held-out marine ¹⁴C is a question
  for the Heaton 2020 methods, not re-read this session. Open as `G-marine20-sigma-validation`.
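
The lab-precision bullet above can be checked on the back of an envelope. This sketch uses only the Holocene-mean σ_c values from F2 (61 and 17.6) and ignores slope and multi-modality, so it isolates the curve-σ axis and is not a substitute for re-running the atlas at another σ; `spread_ratio` is a hypothetical helper.

```python
import math

MEAN_SIGMA_C = {"marine20": 61.0, "intcal20": 17.6}  # Holocene means from F2

def spread_ratio(sigma_lab):
    """Marine / atmospheric total 14C spread at a given lab sigma (mean sigma_c only)."""
    marine = math.hypot(sigma_lab, MEAN_SIGMA_C["marine20"])
    atmos = math.hypot(sigma_lab, MEAN_SIGMA_C["intcal20"])
    return marine / atmos

ratios = {s: spread_ratio(s) for s in (15, 20, 25, 40)}
for s, r in ratios.items():
    print(f"lab sigma {s:>2}: marine/atmosphere spread ratio {r:.2f}")
```

The ratio falls as the lab σ grows and rises as it shrinks, so tighter AMS precision makes the curve-σ penalty relatively larger. At σ = 25 it comes to about 2.16, close to the ≈ ×2.15 curve-σ-alone factor quoted in F2.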