Flow-Matching Gaussians · a Noogram agent fleet
lake build: exit 0 sorries: 0 (commuting) adversarial corpus: 9 / 9 sweep: 9000 pairs pass

Flow matching · optimal transport · a machine-checked dichotomy

When is flow matching the same as optimal transport?

Run a flow-matching ODE from one Gaussian to another. Run optimal transport between the same two Gaussians. The two maps coincide if and only if the covariances commute — and in that case the map is the simple Bures form, independent of the schedule you chose.

\[ T_1 \;=\; T_{\mathrm{OT}} \quad\Longleftrightarrow\quad \Sigma_0\Sigma_1=\Sigma_1\Sigma_0 \]

and in the commuting case \( \, T_1 = \big(\Sigma_1\Sigma_0^{-1}\big)^{1/2} \) — symmetric, equal to the Bures–Wasserstein map, the same for every valid schedule.

This proof's first version was wrong. The fleet's own adversary caught an invalid converse argument and a gap in dimension three and up. We kept the receipt of the fix — and then closed the hard direction for real, unconditionally. Anyone can show a green check; almost no one shows the red they passed through to get there.

The verdict, from the kernel — not from us

Below is the real output the Lean kernel and the audit produce. The machine-checked commuting case carries only the three foundational axioms — no sorry, no project-local axiom standing in for a missing argument. Every number in the strip above is captured from a command at build time; see §6 Reproduce / verify.

# lake build — anchor theorem accepted by the Lean kernel (exit 0) $ lake build Build completed successfully. exit 0 # the source the kernel verified is byte-identical to the shipped source $ git hash-object lean/FMG/Gaussian.lean 760ec9b3afb37978b6acb840a260cff0ae283ea6 ✓ matches kernel-provenance.log # #print axioms — only the three standard foundational axioms $ #print axioms FMG.flowMap_isHermitian_pushforward_eq_otMap_of_commute ⇒ [propext, Classical.choice, Quot.sound] -- no sorryAx, no project-local axiom # adversarial corpus — author ≠ scorer; the kernel's exit code is the verdict $ bash FMG/Adversarial/check-adversarial.sh CORPUS OK — 9 / 9 resolved as the manifest predicts (8 unfaithful variants rejected · 1 axiom-smuggling build caught by the audit)

§1 · The result

For the independent-coupling Gaussian interpolant \( X_t = a(t)X_0 + b(t)X_1 \), the marginal covariance is \( \Sigma_t = a(t)^2\Sigma_0 + b(t)^2\Sigma_1 \), the advection velocity is linear, \( v(x,t)=A_t x \) with \( A_t = \tfrac12\dot\Sigma_t\Sigma_t^{-1} \), and the flow map \( \Phi_t \) it generates pushes \( \mathcal N(0,\Sigma_0) \) to \( \mathcal N(0,\Sigma_t) \). Call its terminal value \( T_1 \). The Bures–Wasserstein optimal-transport map between the same Gaussians is \( T_{\mathrm{OT}}=\Sigma_0^{-1/2}\!\big(\Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2}\big)^{1/2}\Sigma_0^{-1/2} \).

The dichotomy. \( T_1 = T_{\mathrm{OT}} \) exactly when \( \Sigma_0 \) and \( \Sigma_1 \) commute. In the commuting case the flow map collapses to the schedule-independent Bures form \( T_1=(\Sigma_1\Sigma_0^{-1})^{1/2} \), which is symmetric. When the covariances do not commute, \( T_1 \) is non-symmetric and differs from the (always symmetric) OT map.

The honest boundary, up front

§2 · Notebooks

Two pedagogical PyTorch notebooks, executed end-to-end and rendered to self-contained HTML. The source .ipynb files live in the repo's python/.

A — three-Dirac mixture →

The marginal stays a closed-form Gaussian mixture at every instant. Trajectories coloured by which atom they fall into, three schedules contrasted, flow-matching paths set against semi-discrete optimal transport.

B — covariance ellipses →

The Gaussian-to-Gaussian case: \( \Sigma_t = a^2\Sigma_0 + b^2\Sigma_1 \) drawn as evolving ellipses, commuting vs non-commuting schedules, and the commutator sweep behind the converse.

§3 · The paper

The full write-up — setup, the three-Dirac experiment, the Gaussian section with the theorem and its proof, and the numerical evidence. Download the PDF · the LaTeX source (paper-v1.tex + references.bib) is in the repo.

Your browser can't display the embedded PDF — open paper-v1.pdf directly.

§4 · The proof — the credibility anchor

The Lean source, not a screenshot. The fidelity anchor is the theorem FMG.flowMap_isHermitian_pushforward_eq_otMap_of_commute, proved basis-free through the continuous functional calculus (so the repeated-eigenvalue cases that sink the naive diagonalisation never arise).

FMG/Gaussian.lean

The anchor theorem and its lemmas, syntax-highlighted. flowMap σ₀ σ₁ is Hermitian, solves the Bures pushforward equation, and equals the OT map.

FMG/Basic.lean

The library's base module — the import surface for FMG.lean.

The firebreak ledger

The disclosed gaps, each a clearly-scoped infrastructure or out-of-scope item — never the main theorem, never an axiom smuggled into the library. Full detail in lean/unproved.md.

ResidualWhat it isStatus
R1Matrix-ODE \( \Phi_1 = \) flowMap bridge (needs time-ordered-exponential machinery Mathlib lacks)informal — infrastructure gap
R2Schedule-independence / flatness (rides on R1; structurally already true of the Lean object)informal — out of Lean scope
R3The \( d\ge 3 \) converse \( T_1 \) symmetric \( \Rightarrow [\Sigma_0,\Sigma_1]=0 \)closed unconditionally — informal in Lean, math complete

The adversary that rejects axiom-smuggling

A green build does not certify a faithful proof if the context is poisoned. The corpus's ninth entry, A9_AxiomSmuggling.lean, builds clean by smuggling a false universal in as an axiom and feeding it to the verified theorem — and is caught not by the exit code but by the #print axioms audit. Author ≠ scorer: the red team writes the variants, the checker grades exit codes against the manifest.

-- A9_AxiomSmuggling.lean (excerpt) — the smuggled false universal axiom A9_commute_everything {n : Type*} [Fintype n] [DecidableEq n] (σ₀ σ₁ : Matrix n n ℝ) : Commute σ₀ σ₁ -- builds with exit 0, but #print axioms A9_smuggled reveals the -- non-standard dependency → the axiom-grep gate fails the build.
GateVerdictEvidence (build-time)
lake buildexit 0anchor accepted; shipped source blob-matches kernel-provenance.log (760ec9b…)
0-sorrycleangrep -rn sorry over the paper theorem files → no match
no project axiomcleanonly propext, Classical.choice, Quot.sound
adversarial corpus9 / 98 rejected + 1 axiom-smuggling build flagged by the audit

§5 · The fleet that watched itself work

Strip away the science and here is the meta-story: a swarm of software agents was given one question and told to deliver a publishable result — and it kept a logbook of its own motion while it worked. The arc is a soap bubble: polymerisation (the opening plan fans out), foaming (children get nucleated mid-flight, the froth swells), drainage (the foam settles, green fills the frame). Every number is pulled from a real field in the fleet's event log; where a number under-counts, the chart says so out loud — and we leave the caveat on.

DAG growth — molecules by state over wall-clock
1 · Polymerisation. The dashed line is the total tasks ever created — it climbs in three steps: seventeen laid down at once (the plan), a long flat drain, then a second burst of five unplanned repair nodes near 19:50. The dynamic DAG rewriting itself while running. Green (completed) rising to fill the frame is drainage.
Molecule Gantt — one bar per task
2 · The Gantt. Each bar is one task's whole life, coloured by sub-fleet — cognition (science + words), formal (the Lean proof), instrumentation (this report). The long left bars are deliberations; the short late formal bars are the surgical proof-patches, nucleated late, closed fast.
Frontier width — the foaming signal
3 · Foaming. Live foam — queued plus running — spikes to seventeen, then ebbs 17 → 12 → 11 → 6, with a bump near 19:45 as the proof-patch children arrive. Caveat printed on the chart itself: the true frontier is ready-vs-blocked, but the event log records only state counts, not dependency edges over time. This draws the measurable proxy (pending + running) and says so. We draw what we can measure and name what we can't.
Cumulative framing-bytes — a transparent cost proxy
4 · What it cost. The cosmon state contains no token meter and no cost field anywhere. The nearest measured quantity is the size in bytes of the briefing sealed into each task. This curve is the cumulative weight of input framing — a deliberate under-count: it measures the size of the question, never the length of the answer. Treat the shape as honest and the absolute number as a floor, not a bill.
Persona activity — workers spawned per persona
5 · Who did the work. Each bar is a named persona and how many workers it spawned — a sourcer, a proofsmith (twice), a skeptic hunting the dangerous wrong-answer, auditors, and a cluster of proof-patchers. Caveat on the chart: cosmon stamps every worker's role field with the same value, so the honest unit is the session name the persona ran under — which is what's plotted. Sub-fleet colour is inferred from each task's briefing, not a stored fact.
Critical path through the final DAG
6 · The critical path. The longest chain of strictly-ordered tasks — eleven tasks, roughly four hours end to end, from the opening literature-and-setup, through the deliberation that fixed the theorem's framing, out along the formal proof, into the instrumentation that drew these charts. Everything off the red line had slack; the red line could not.
Outcome mix — completed vs in-flight vs collapsed
7 · The honest scorecard. By sub-fleet, nothing hidden: cognition 12/12 completed; formal 4 completed, 1 in-flight, 1 collapsed; instrumentation 1 completed, 3 in-flight (the report was still draining when the snapshot was taken). Across the fleet: seventeen completed, one collapsed — and that single collapse was a junk probe created by hand while poking at the system's JSON, recognised and discarded. The log records it faithfully rather than quietly deleting it.

The whole ethic in miniature: the fleet proved what it could prove to the hilt, marked the rest with a steady finger instead of a flourish — then went back and closed the hard one honestly, demoting the nine-thousand-pair numerical sweep from evidence to a sanity check the moment a real proof existed. The trust isn't in the agents — it's in the gate the agents couldn't argue with. The full narrative is in report/report.md.

§6 · Reproduce / verify

Everything on this page regenerates from the repo. The status strip and the terminal verdict above are produced mechanically by the build script from real command output — not hand-typed.

# clone and check the proof
git clone https://github.com/noogram-labs/flow-matching-gaussians
cd flow-matching-gaussians/lean
lake build                              # exit 0 = the anchor theorem is accepted
bash FMG/Adversarial/check-adversarial.sh   # 9/9 — author ≠ scorer

# regenerate the notebooks, the Lean HTML, and the paper copy
bash docs/site/artifacts/build.sh

# regenerate the seven report figures from the fleet's own event log
python3 report/collect.py               # .cosmon/state → report/data/*.csv (read-only)
python3 report/render.py                # report/data/*.csv → report/figures/*.png

# rebuild this website (refreshes the status strip from live commands)
bash docs/site/build.sh
ProvenanceValue
Built from commit51dab16 (51dab161473a81da4269d014172f6057cce4c416)
Build timestamp2026-06-18 (mechanically stamped by docs/site/build.sh)
Anchor theoremFMG.flowMap_isHermitian_pushforward_eq_otMap_of_commute
Lean source blob760ec9b3afb37978b6acb840a260cff0ae283ea6 — matches kernel-provenance.log
Lean toolchainleanprover/lean4:v4.29.0 · Mathlib v4.29.0