The measurable bi-infinite sentinel parser for Krieger's symbolic code (C3, sub-problem B) #
This file builds the single largest infrastructure gap that the column-code residual of
sub-problem B (C3) of Krieger's finite generator theorem (issue #15) was reduced to in
Oseledets.Krieger.Recovery: the measurable bi-infinite sentinel parser
D : (ℤ → Fin k) → κ
that locates the marker/sentinel structure of a ℤ-indexed code stream around coordinate 0,
extracts the local block, and decodes it — as a measurable function of the stream. The module note
at the bottom of Recovery.lean flagged this — point (3) — as the part with no Mathlib analogue
and the "genuine multi-week residual". This file discharges it, unconditionally and sorry-free.
The verdict: the measurable bi-infinite parser is constructible on current Mathlib #
The construction needs no new symbolic-dynamics infrastructure — only three standard pieces of Mathlib measurability, assembled:
measurable_pi_apply— each coordinate projectionw ↦ w nof the product spaceℤ → Fin kis measurable;measurable_find(Nat.findof a totalized forward search is measurable) — gives a measurable sentinel-position function in each direction (the backward search is a forward search in the reflected stream); andmeasurable_to_countable'— the codomainκisCountablewithMeasurableSingletonClass, so the parser is measurable as soon as each preimageD⁻¹{j}is measurable.
The preimage D⁻¹{j} decomposes as a countable union over block shapes (b, L, l) (start
b : ℤ, length L : ℕ, content l : List (Fin k)) of cylinder sets
{blockStart = b} ∩ {blockLen = L} ∩ {window b L = l} — each measurable. The window-content
cylinder {w | window w b L = l} is a finite intersection of coordinate-singleton constraints
(measurableSet_window_eq). This is the whole of the "measurable bi-infinite parse".
So the residual flagged as multi-week is, in fact, a few hundred lines of routine measurability.
What genuinely remains for an unconditional ColumnCodeData is the dynamical content — the
construction of one fixed code symbol c : α → Fin k whose per-column blocks spell the sentinel
encodings of the Q-names, and the a.e. recovery that the parser then reads off (points (1), (2),
(4) of the Recovery note). That genuinely-dynamical core is isolated, sorry-free, in the sharpened
SentinelColumnCode bundle below — strictly sharper than ColumnCodeData because the parser D is
no longer a hypothesis: it is constructed and proved measurable here, and the bundle only has to
supply the code symbol and the a.e. correctness of the parse.
Main definitions #
window w b L— the length-Lblock of the streamwstarting at integer coordinateb.fwdSentinel s b w/bwdSentinel s b w— the forward / backward distance to the nearest sentinelsfrom anchorb(totalized:0if the search direction has no sentinel).blockStart s w/blockLen s w/blockContent s w— the start, length andList (Fin k)content of the sentinel block containing coordinate0.sentinelParse s dec w— the parser: decodeblockContent s wviadec : List (Fin k) → κ.
Main results (all sorry-free) #
measurableSet_window_eq—{w | window w b L = l}is measurable.measurable_fwdSentinel,measurable_bwdSentinel— the sentinel-distance functions are measurable; hence the block-boundary level setsmeasurableSet_blockStart_eq,measurableSet_blockLen_eq.measurableSet_blockContent_decode_eq— each preimage{w | dec (blockContent s w) = j}is measurable (the countable-block-shape decomposition).measurable_sentinelParse— the headline: the bi-infinite sentinel parser is measurable.SentinelColumnCode.toColumnCodeData— a sharpened residual bundle (code symbol + a.e. parse correctness) yields a fullOseledets.Krieger.ColumnCodeData, hence (viaColumnCodeData.codes) the cross-layer mod-0 code.
References #
- Tomasz Downarowicz, Entropy in Dynamical Systems, Cambridge (2011), §4.2 (Lemma 4.2.5).
- Paul C. Shields, The Ergodic Theory of Discrete Sample Paths, GSM 13, AMS (1996), §I.9 (marker codes and the decoder).
- Wolfgang Krieger, On entropy and generators of measure-preserving transformations, Trans. Amer. Math. Soc. 149 (1970), 453–464.
Coordinate cylinders of the bi-infinite stream #
The product space ℤ → Fin k carries the product σ-algebra; each coordinate projection
w ↦ w c is measurable (measurable_pi_apply), so the singleton-value cylinder {w | w c = s} is
measurable. This is the only atom every measurability proof in this file is built from.
The single-coordinate cylinder is measurable. {w | w c = s} is the preimage of the
singleton {s} under the measurable coordinate projection w ↦ w c (measurable_pi_apply). Every
measurability statement below — the window cylinders, the sentinel-position level sets, the parser
preimages — is assembled from this atom by countable Boolean operations.
The window extractor and the window cylinder #
window w b L reads the length-L block of the stream starting at integer coordinate b. The
parser will read off the content of the sentinel block containing coordinate 0 through this map.
Its key measurability fact, measurableSet_window_eq, is that fixing the start b, length L, and
target content l carves out a measurable cylinder — a finite intersection of coordinate
constraints.
The length-L window of a bi-infinite stream starting at coordinate b. The block
[w b, w (b+1), …, w (b+L-1)] : List (Fin k). A decoder reads off the sentinel block containing
coordinate 0 as window w (blockStart …) (blockLen …).
Equations
- Oseledets.Krieger.window w b L = List.ofFn fun (i : Fin L) => w (b + ↑↑i)
Instances For
The window cylinder is measurable. Fixing the start b : ℤ, length L : ℕ, and target
block l : List (Fin k), the set {w | window w b L = l} of streams whose length-L block at b
equals l is measurable: if |l| = L, it is the finite intersection
⋂ i : Fin L, {w | w (b + i) = l i} of coordinate-singleton cylinders (measurableSet_coord_eq,
via List.ofFn_inj); if |l| ≠ L it is empty (the window always has length L). This is the
content-extraction half of the parser; combined with measurable block boundaries it gives the parser
preimage decomposition.
The measurable sentinel-position search #
The block boundaries are located by searching for the nearest sentinel. The forward search from an
anchor b looks for the least n : ℕ with w (b + n) = s; the backward search looks for the least
n : ℕ with w (b - n) = s — a forward search in the reflected stream, equally measurable. Both
are totalized (return 0 if that direction has no sentinel) so they are total functions of the
stream; on the a.e. set where the orbit is tiled by complete columns (the two-sided recurrence
backbone twoSided_recurrence) the search always succeeds and the totalization is irrelevant.
The totalized forward-search predicate: a sentinel s sits at coordinate b + n, or the
forward ray from b contains no sentinel at all. The disjunction guarantees a witness for every
stream (exists_fwd), so Nat.find is total and measurable_find applies.
Instances For
The totalized backward-search predicate (forward search in the reflected stream): a sentinel
sits at coordinate b - n, or the backward ray from b contains no sentinel.
Instances For
The forward sentinel distance. The least n : ℕ such that w (b + n) = s, or 0 if the
forward ray from b has no sentinel. Measurable in w (measurable_fwdSentinel).
Equations
- Oseledets.Krieger.fwdSentinel s b w = Nat.find ⋯
Instances For
The backward sentinel distance. The least n : ℕ such that w (b - n) = s, or 0 if the
backward ray from b has no sentinel. Measurable in w (measurable_bwdSentinel).
Equations
- Oseledets.Krieger.bwdSentinel s b w = Nat.find ⋯
Instances For
The forward sentinel distance is measurable. This is the crux that the Recovery note
flagged as having "no Mathlib analogue": a measurable locator of the marker structure of a
ℤ-stream. It is Nat.find of a totalized forward search — measurable_find applies because the
totalization makes the search succeed for every stream (exists_fwd) and each predicate level set
is measurable (measurableSet_fwdPred).
The backward sentinel distance is measurable (the reflected-stream form of
measurable_fwdSentinel).
The block around coordinate 0: start, length, content #
The sentinel block containing coordinate 0 is delimited by the nearest sentinels on either side.
We take the block start to be one past the nearest sentinel strictly before coordinate 0
(found by the backward search anchored at -1), and the block end to be the nearest sentinel at
coordinate ≥ 0 (the forward search anchored at 0). The content is the window from start to
end inclusive — exactly one sentinel-terminated block in the well-tiled (a.e.) case. Each of
blockStart, blockLen is a measurable integer/nat function of the stream, built from the
measurable sentinel distances.
The start of the sentinel block containing coordinate 0. One past the nearest sentinel
strictly before 0: anchoring the backward search at -1 finds the distance d to that sentinel
(w (-1 - d) = s), and the block starts at -1 - d + 1 = -d. (In the no-sentinel-below case the
totalized search returns d = 0, giving start 0; this only happens off the a.e. tiled set.)
Equations
- Oseledets.Krieger.blockStart s w = -↑(Oseledets.Krieger.bwdSentinel s (-1) w)
Instances For
The end coordinate of the sentinel block containing coordinate 0. The nearest sentinel at
coordinate ≥ 0: the forward search anchored at 0 gives the distance, and the end is at that
coordinate (inclusive, the terminating sentinel of the block).
Equations
- Oseledets.Krieger.blockEnd s w = ↑(Oseledets.Krieger.fwdSentinel s 0 w)
Instances For
The length of the sentinel block containing coordinate 0 (inclusive of the terminating
sentinel): blockEnd − blockStart + 1, clamped to ℕ.
Equations
- Oseledets.Krieger.blockLen s w = (Oseledets.Krieger.blockEnd s w - Oseledets.Krieger.blockStart s w + 1).toNat
Instances For
The content of the sentinel block containing coordinate 0 as a List (Fin k): the window
of length blockLen starting at blockStart. A decoder applies dec : List (Fin k) → κ to this to
read off the Q-label.
Equations
Instances For
Each block-start level set is measurable. {w | blockStart s w = b} is the level set of
the measurable backward sentinel distance: blockStart s w = b ⇔ bwdSentinel s (-1) w = (-b).toNat
when b ≤ 0 (and is empty when b > 0, since a distance is non-negative). This is all the parser
preimage decomposition uses; we avoid an honest Measurable (blockStart s) (whose ℤ-arithmetic
plumbing is irrelevant) in favour of the level sets directly.
Each block-length level set is measurable. {w | blockLen s w = L} is a measurable set: it
decomposes over the (countably many) pairs of forward/backward sentinel distances (f, d) that
produce length L, and each {fwdSentinel = f} ∩ {bwdSentinel = d} is measurable. Again we expose
only the level sets, which is what the parser preimage decomposition consumes.
The parser and its measurability — the headline of this file #
The parser applies a decode map dec : List (Fin k) → κ to the block content. Its measurability is
proved by the countable-codomain route (measurable_to_countable'): each preimage
{w | dec (blockContent s w) = j} decomposes over the countably many block shapes (b, L, l) as a
union of cylinders {blockStart = b} ∩ {blockLen = L} ∩ {window b L = l}. This is the measurable
bi-infinite sentinel parser that Recovery isolated as the residual with no Mathlib analogue.
The bi-infinite sentinel parser. Given a block-decode map dec : List (Fin k) → κ, the
parser sentinelParse s dec w reads off the sentinel block of the bi-infinite stream w containing
coordinate 0 (blockContent s w) and decodes it. This is the decoder D of
Oseledets.Krieger.ColumnCodeData, now an honest total function rather than a hypothesis.
Equations
- Oseledets.Krieger.sentinelParse s dec w = dec (Oseledets.Krieger.blockContent s w)
Instances For
Each parser preimage is measurable (the countable block-shape decomposition). For a fixed
target label j : κ, the set {w | dec (blockContent s w) = j} is the countable union, over all
block starts b : ℤ, lengths L : ℕ, and contents l : List (Fin k) with dec l = j, of the
cylinder {blockStart = b} ∩ {blockLen = L} ∩ {window b L = l}. Each factor is measurable
(measurable_blockStart, measurable_blockLen, measurableSet_window_eq), List (Fin k) is
countable, and ℤ, ℕ are countable, so the union is measurable. This is the entire content of the
"measurable bi-infinite parse".
The bi-infinite sentinel parser is measurable. This is the headline of the file and the
discharge of point (3) of the Recovery module note (the residual with "no Mathlib analogue"). The
codomain κ is Countable with MeasurableSingletonClass, so measurable_to_countable' reduces
measurability to the measurability of each preimage {w | sentinelParse s dec w = j} — supplied by
measurableSet_blockContent_decode_eq. No new symbolic-dynamics infrastructure is needed.
The sharpened residual: a code symbol + a.e. parse correctness ⇒ ColumnCodeData #
With the parser constructed and proved measurable, the ColumnCodeData residual sharpens: the
decoder is no longer a hypothesis. The remaining genuinely dynamical input is a measurable code
symbol c : α → Fin k (built from the tower columns + sentinel encoding) and the a.e. correctness
of the sentinel parse — that off a μ-null set the parser applied to the two-sided itinerary of a
point recovers the Q-name, i.e. recovers which Q-cell it lies in. We package this as
SentinelColumnCode and show it yields a full ColumnCodeData. This is the honest reduction of
sub-problem B to its dynamical core, with the measurable-parse infrastructure fully discharged.
The sharpened column-code residual. Bundles exactly the genuinely dynamical inputs of sub-problem B that remain after the measurable bi-infinite parser is constructed:
code(+code_measurable): the measurable code symbolc : α → Fin k, built from the refining Rokhlin towers so thatc (eⁱ x)spells thei-th symbol of the sentinel block of the column throughx(points (1), (2) of theRecoverynote);decode: the per-block decode mapList (Fin k) → κ(strip the sentinel, look up theQ-name — the finite unique-decodability ofOseledets.Krieger.sentinelEncodeList_injective); andrecovers: the a.e. correctness of the sentinel parse — for eachQ-cellj,Q jagreesμ-a.e. with{x | sentinelParse s decode (itin e code x) = j}(point (4), them → ∞/ Borel–Cantelli limit over the two-sided recurrence tiling).
This is strictly sharper than Oseledets.Krieger.ColumnCodeData: the decoder is no longer supplied
or assumed measurable — it is the constructed, proved-measurable parser sentinelParse s decode
of this file.
- sentinel : Fin k
The reserved sentinel letter of the
Fin kalphabet. - code : α → Fin k
The measurable code symbol
c : α → Fin kof the column coding. - code_measurable : Measurable self.code
The code symbol is measurable.
The per-block decode map (finite unique decodability of a sentinel block).
- recovers (j : κ) : Q j =ᵐ[μ] {x : α | sentinelParse self.sentinel self.decode (itin e self.code x) = j}
A.e. correctness of the sentinel parse: each
Q-cell agrees mod 0 with the parser event.
Instances For
The sharpened residual yields a full ColumnCodeData. Given a SentinelColumnCode — a
measurable code symbol, a per-block decode map, and the a.e. correctness of the constructed
sentinel parse — the decoder field is filled by sentinelParse sentinel decode, which is measurable
by measurable_sentinelParse (the headline of this file), and the recovery field is the supplied
a.e. correctness. So the only inputs left are dynamical: the code symbol and the a.e. parse
correctness. The measurable bi-infinite parser, flagged as the residual with no Mathlib analogue, is
discharged — it is sentinelParse with its measurable_sentinelParse certificate.
Equations
- data.toColumnCodeData = { code := data.code, code_measurable := ⋯, decoder := Oseledets.Krieger.sentinelParse data.sentinel data.decode, decoder_measurable := ⋯, recovers := ⋯ }
Instances For
A SentinelColumnCode yields the cross-layer countable mod-0 code of Q by its code partition
— the deliverable that slots into Oseledets.Krieger.KriegerCodingData.code_codes. Immediate from
SentinelColumnCode.toColumnCodeData and Oseledets.Krieger.ColumnCodeData.codes.