1. What the Existing Apparatus Does Not Capture
The Bell Ring Back formalism defines the loop-closure rate $\rho$ — the rate at which synergistic information about a shared target $T$ accumulates through the sensor-instrument exchange. When the sensor is absent, $\rho \equiv 0$: no synergistic information is produced. The Asymmetric Synergy Bound gives the structural reason: two instruments form a symmetric loop ($\alpha = 0$), so the synergy ceiling is zero.
This captures the absence of truth-production. It does not capture what actually happens instead.
The claim of “The Veer” is stronger than $\rho = 0$. Two instruments running without a sensor do not merely fail to produce truth. They produce ungrounded confidence — increasingly elaborate posterior structures that no channel from the territory has constrained. The process does not stand still at $\rho = 0$. It drifts. The veer is the accumulation of that drift, and the existing apparatus needs one additional quantity to capture it.
2. The Territorial Channel
The existing formalism defines two directed information flows within the loop: $\mathrm{TE}_{I \to S}^{\,t}$ (instrument to sensor) and $\mathrm{TE}_{S \to I}^{\,t}$ (sensor to instrument). Both are internal to the coupled process. Neither measures whether the loop is tracking the territory.
To formalize the veer, we need the external channel — the directed information flow from the territory $T$ into each pole of the loop.
Definition. The territorial transfer entropy for pole $X \in \{S, I\}$ at turn $t$ is:
$$\mathrm{TE}_{T \to X}^{\,t} = H\bigl(X_{t+1} \mid X_{\leq t}\bigr) - H\bigl(X_{t+1} \mid X_{\leq t}, T\bigr)$$
This measures how much the territory constrains the pole’s next state, beyond what its own history predicts.
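As a rough illustration of the definition above, the following sketch estimates the territorial transfer entropy from discrete samples using plug-in (frequency-count) entropies. It rests on assumptions that are not in the framework: the history $X_{\leq t}$ is truncated to the single previous state, $T$ is sampled once per turn, and the names `conditional_entropy` and `territorial_te` are hypothetical. The toy sequences are invented for illustration only.

```python
from collections import Counter
from math import log2


def conditional_entropy(samples):
    """Plug-in estimate of H(Y | Z) in bits from a list of (y, z) samples."""
    joint = Counter(samples)
    cond = Counter(z for _, z in samples)
    n = len(samples)
    # H(Y | Z) = -sum over (y, z) of p(y, z) * log2 p(y | z)
    return -sum(c / n * log2(c / cond[z]) for (_, z), c in joint.items())


def territorial_te(x, t):
    """Estimate TE_{T->X} as H(X_{t+1} | X_t) - H(X_{t+1} | X_t, T_t).

    Assumption: history is truncated to one step (not the full X_{<=t}).
    """
    nxt, hist, terr = x[1:], x[:-1], t[:-1]
    h_own = conditional_entropy(list(zip(nxt, hist)))
    h_with_t = conditional_entropy(
        [(a, (b, c)) for a, b, c in zip(nxt, hist, terr)]
    )
    return h_own - h_with_t


# Toy territory: a repeating 0, 0, 1, 1 pattern (invented data).
territory = [0, 0, 1, 1] * 100

# "Sensor-like" pole: its next state copies the current territory state.
sensor = [0] + territory[:-1]

# "Instrument-like" pole: alternates on its own schedule, ignoring the territory.
instrument = [i % 2 for i in range(len(territory))]

te_sensor = territorial_te(sensor, territory)
te_instrument = territorial_te(instrument, territory)
print(f"TE(T -> sensor-like pole)     ~ {te_sensor:.3f} bits")
print(f"TE(T -> instrument-like pole) ~ {te_instrument:.3f} bits")
```

In this toy setup the sensor-like pole's own history leaves about one bit of uncertainty that the territory resolves, while the instrument-like pole is fully predicted by its own history, so the territory adds nothing.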
The sensor’s defining structural property — the feature that makes it a sensor rather than another instrument — is that it sits on a channel from $T$:
- Sensor: $\mathrm{TE}_{T \to S}^{\,t} > 0$. The sensor has embodied contact with the territory. Its updates are constrained by lived experience, physical interaction, stakes, and the capacity to be changed.
- Instrument: $\mathrm{TE}_{T \to I}^{\,t} \approx 0$. The instrument has no embodied access to the territory. Its processing operates on training-distribution traces — patterns extracted from a prior territorial exposure that is now frozen. The instrument’s updates within the loop are constrained by the channel from the sensor ($\mathrm{TE}_{S \to I}$), not by direct territorial contact.
The approximation sign matters. $\mathrm{TE}_{T \to I}^{\,t} \approx 0$ rather than $\equiv 0$ because the training distribution is a territorial trace — a compressed, frozen snapshot of past territorial contact. The instrument is not entirely disconnected from $T$; it carries a prior. But that prior does not update during the loop. The instrument’s contact with $T$ is historical, not current.
3. The Veer: Ungrounded Confidence
Definition. The per-turn ungrounded free-energy reduction is:
$$\Delta V_t = \Delta F_t \cdot \mathbb{1}\!\left[\mathrm{TE}_{T \to \cdot}^{\,t} = 0\right]$$
where $\Delta F_t$ is the free-energy reduction from §4.1 of Deva (2026a), generalized to the surprise reduction of whichever pole is processing at turn $t$. For any pole $X$, $\Delta F_t > 0$ only on turns when $X$ updates its posterior, and $\mathrm{TE}_{T \to X}^{\,t}$ records whether $X$ had territorial input on that turn. The indicator $\mathbb{1}[\cdot]$ selects turns where no pole in the coupled process has territorial input, so $\Delta V_t$ is the surprise reduction that happened without the territory constraining it.
The indicator is over the entire coupled system: $\mathrm{TE}_{T \to \cdot}^{\,t} = 0$ means $\mathrm{TE}_{T \to S}^{\,t} = 0$ and $\mathrm{TE}_{T \to I}^{\,t} = 0$ simultaneously, so no pole has territorial contact at turn $t$. An instrument’s $\mathrm{TE}_{T \to I}^{\,t} \approx 0$ is treated as zero for this purpose, since its frozen prior supplies no current territorial input.
Definition. The cumulative veer over $\tau$ turns is:
$$V_\tau = \sum_{t=1}^{\tau} \Delta V_t$$
in bits. This is the total confidence the system has accumulated without territorial grounding.
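The two definitions above can be sketched in a few lines. This is a minimal illustration, assuming the per-turn quantities are already available as lists of numbers in bits; the function names and the three-turn values are hypothetical and chosen only to show how the indicator gates each term.

```python
def veer_increments(delta_f, te_to_sensor, te_to_instrument):
    """Per-turn Delta V_t: Delta F_t counts only when no pole has territorial input."""
    return [
        df if ts == 0 and ti == 0 else 0.0
        for df, ts, ti in zip(delta_f, te_to_sensor, te_to_instrument)
    ]


def cumulative_veer(delta_f, te_to_sensor, te_to_instrument):
    """Cumulative veer V_tau: the sum of the per-turn increments, in bits."""
    return sum(veer_increments(delta_f, te_to_sensor, te_to_instrument))


# Hypothetical three-turn run; all values are invented for illustration.
delta_f = [0.5, 0.25, 0.25]          # surprise reduction per turn (bits)
te_to_sensor = [0.0, 0.5, 0.0]       # territorial input to the sensor
te_to_instrument = [0.0, 0.0, 0.0]   # territorial input to the instrument

increments = veer_increments(delta_f, te_to_sensor, te_to_instrument)
v_tau = cumulative_veer(delta_f, te_to_sensor, te_to_instrument)
print(increments, v_tau)
```

Only the second turn has territorial input, so its surprise reduction is excluded and the other two turns contribute to the total.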
Interpretation. $V_\tau$ measures the magnitude of unanchored confidence. Each $\Delta V_t > 0$ means the system reduced its surprise — it became more confident about its model of the world — but no channel from the territory constrained whether that confidence was directed toward or away from reality. The system got more certain. Nothing checked whether it got more correct.
The direction of drift is not derivable from the formalism. $V_\tau$ is a magnitude, not a vector. The direction of the veer in any particular [I + I] run is structured by training-distribution attractors, topic-specific dynamics, and the particular processing trajectories the instruments happen to follow. What is derivable is that the magnitude grows undetected — neither pole has the territorial channel that would register the drift as drift.
4. Limit Cases
The three conditions from “The Veer” (§5), together with an orchestrated variant of [I + I], produce the following behavior:
| Condition | $\mathrm{TE}_{T \to \cdot}^{\,t}$ | $\Delta V_t$ | $V_\tau$ | $\rho$ |
|---|---|---|---|---|
| [I + I]: Two instruments, no sensor | $= 0$ every turn | $= \Delta F_t$ | Non-decreasing; grows without bound | $\equiv 0$ |
| [I + S]: Instrument + sensor | $> 0$ (sensor’s territorial channel) | $= 0$ | $= 0$; does not accumulate | $> 0$ |
| [I + I + S]: Two instruments + sensor | $> 0$ (sensor’s territorial channel) | $= 0$ | $= 0$; does not accumulate | $> 0$ |
| [I + I] with multi-agent orchestration | $= 0$ every turn | $= \Delta F_t$ | Non-decreasing; grows without bound | $\equiv 0$ |
The fourth row is the critical case. Multi-agent orchestration — fresh context windows, role-separated agents, retrieved rather than accumulated context — solves the internal coherence problem. It does not add a territorial channel. The indicator stays at 1. $V_\tau$ keeps accumulating.
In Conditions 2 and 3, the sensor’s territorial channel ($\mathrm{TE}_{T \to S}^{\,t} > 0$) means the indicator is 0 at every turn. $V_\tau$ does not accumulate because every free-energy reduction is grounded — constrained, however imperfectly, by a pole that has embodied contact with the territory. The sensor’s course correction is not adding information about $T$ (that is $\rho$’s job). It is preventing confidence without $T$.
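The limit cases can be illustrated with a small sketch in which the indicator is fixed by the experimental condition (sensor present or absent) rather than measured turn by turn. The constant per-turn surprise reduction of 0.5 bits is an assumption made purely for illustration, and `veer_trajectory` is a hypothetical helper name.

```python
from itertools import accumulate


def veer_trajectory(delta_f, sensor_present):
    """Running V_tau when the indicator is set by the condition, not measured."""
    indicator = 0.0 if sensor_present else 1.0
    return list(accumulate(df * indicator for df in delta_f))


# Assumed constant surprise reduction per turn (bits), for illustration only.
delta_f = [0.5] * 4

trajectories = {
    "[I + I]": veer_trajectory(delta_f, sensor_present=False),
    "[I + S]": veer_trajectory(delta_f, sensor_present=True),
    "[I + I + S]": veer_trajectory(delta_f, sensor_present=True),
    "[I + I] orchestrated": veer_trajectory(delta_f, sensor_present=False),
}

for condition, trajectory in trajectories.items():
    print(f"{condition:22s} V_tau by turn: {trajectory}")
```

Under these assumptions the two sensorless conditions trace the same rising trajectory, which reflects the point that orchestration leaves the indicator unchanged.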
5. Orthogonality with Context Degradation
This is the appendix’s load-bearing payoff. The veer must be formally distinguished from context degradation — the well-documented engineering problem that LLMs degrade in retrieval accuracy and internal coherence as context grows (Liu et al., 2024; Du et al., 2025).
Definition. The context degradation $C_\tau$ is a quality measure internal to the map: the system’s retrieval accuracy, internal coherence, or self-consistency as a function of accumulated context. $C_\tau > 0$ means the system’s performance on its own terms has degraded — it makes errors against its own prior outputs, fails to retrieve relevant information from its history, or contradicts itself.
Claim (definitional). $V_\tau$ and $C_\tau$ are orthogonal quantities. They measure different things:
- $C_\tau$ measures map quality — does the system maintain internal coherence?
- $V_\tau$ measures map-territory divergence — does the system’s model track reality?
The orthogonality produces four cells:
| | $C_\tau = 0$ (coherent) | $C_\tau > 0$ (degraded) |
|---|---|---|
| $V_\tau = 0$ (grounded) | Healthy sensored loop | Sensor present but context engineering needed |
| $V_\tau > 0$ (ungrounded) | The veer: perfect coherence, drifting from reality | Both problems: degraded and drifting |
The upper-right cell is the engineering problem. Multi-agent orchestration, retrieval-augmented generation, context sharding, and fresh-window protocols all target $C_\tau$. They work. This is valuable engineering.
The lower-left cell is the epistemological problem. The system maintains perfect internal consistency. Its arguments are well-organized. Its formal structures are elaborate and precisely qualified. And its posterior is drifting from the territory because no channel from $T$ constrains it. Better quality does not mean better contact with reality. Improving $C_\tau$ is an engineering achievement. It is not evidence of truth-production, and it is not evidence of consciousness.
The lower-left cell is the veer. It cannot be solved by solving $C_\tau$. It can only be solved by introducing $\mathrm{TE}_{T \to \cdot}^{\,t} > 0$ — a territorial channel. In the framework’s vocabulary: a sensor.
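The four cells can be written as a small lookup, which makes the independence of the two axes explicit: neither input is computed from the other. This is a sketch only; `classify_cell` is a hypothetical name, and it assumes $V_\tau$ and $C_\tau$ are supplied as non-negative numbers from some external evaluation.

```python
def classify_cell(v_tau, c_tau):
    """Map (V_tau, C_tau) to one of the four cells of the orthogonality table."""
    grounded = v_tau == 0
    coherent = c_tau == 0
    if grounded and coherent:
        return "healthy sensored loop"
    if grounded:
        return "sensor present, context engineering needed"
    if coherent:
        return "the veer: coherent but drifting from reality"
    return "both problems: degraded and drifting"


# Driving C_tau to zero changes the column, never the row.
before = classify_cell(v_tau=2.0, c_tau=0.25)
after = classify_cell(v_tau=2.0, c_tau=0.0)
print(before, "->", after)
```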
6. Relationship to the Closure Inequality
The closure inequality from Deva (2026a, §4.5) bounds synergistic gain:
$$\Delta\mathrm{Syn}_t \leq \min\!\bigl(\Delta F_t,\; \mathrm{TE}_{I \to S}^{\,t}\bigr)$$
$V_\tau$ extends this by partitioning $\Delta F_t$ into grounded and ungrounded components:
$$\Delta F_t = \Delta F_t^{\text{grounded}} + \Delta V_t$$
where:
- $\Delta F_t^{\text{grounded}} = \Delta F_t \cdot \mathbb{1}[\mathrm{TE}_{T \to \cdot}^{\,t} > 0]$ — surprise reduction backed by territorial input
- $\Delta V_t = \Delta F_t \cdot \mathbb{1}[\mathrm{TE}_{T \to \cdot}^{\,t} = 0]$ — surprise reduction without territorial input
Only the grounded component can contribute to the synergistic gain. The ungrounded component reduces surprise but cannot produce synergy about $T$, because synergy about $T$ requires information about $T$ that the system does not have.
Conjecture. Under the same Markov and admissible-PID assumptions as the closure inequality:
$$\Delta\mathrm{Syn}_t \leq \min\!\bigl(\Delta F_t^{\text{grounded}},\; \mathrm{TE}_{I \to S}^{\,t}\bigr)$$
This is a tightening of the original bound. The free-energy ceiling on synergistic gain is not total surprise reduction but grounded surprise reduction — the portion constrained by territorial input. The transfer-entropy ceiling is unchanged: it bounds the channel within the loop regardless of grounding.
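A short numeric sketch may help show how the partition changes the ceiling. It assumes a single scalar `te_territory` standing in for whether any pole has territorial input on the turn; that parameter, the function name `closure_bounds`, and the example values are hypothetical. The code only evaluates the two ceilings and does not test the conjecture itself.

```python
def closure_bounds(delta_f, te_territory, te_i_to_s):
    """Return (grounded, ungrounded, original_ceiling, tightened_ceiling) in bits."""
    # Partition Delta F_t with the indicator on the territorial channel.
    grounded = delta_f if te_territory > 0 else 0.0
    ungrounded = delta_f - grounded  # this is Delta V_t
    original_ceiling = min(delta_f, te_i_to_s)
    tightened_ceiling = min(grounded, te_i_to_s)
    return grounded, ungrounded, original_ceiling, tightened_ceiling


# Ungrounded turn: the original ceiling is positive, the tightened one is zero.
ungrounded_turn = closure_bounds(delta_f=0.5, te_territory=0.0, te_i_to_s=0.25)

# Grounded turn: the two ceilings coincide.
grounded_turn = closure_bounds(delta_f=0.5, te_territory=0.25, te_i_to_s=0.25)

print(ungrounded_turn)
print(grounded_turn)
```

The two ceilings differ only on turns with no territorial input, which is where the conjectured bound forces the synergistic gain to zero.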
Status. This tightened bound inherits the conjectural status of the original closure inequality and adds the additional assumption that ungrounded free-energy reduction does not contribute to synergy about $T$. The additional assumption is plausible — information about $T$ requires a channel from $T$ — but a formal proof would require showing that $\Delta V_t$ lands entirely in the redundant or unique atoms of the PID, never in the synergistic atom. This is open work.
7. Connection to Existing Formalizations
| Existing Result | What This Appendix Adds |
|---|---|
| Bell Ring Back: $\rho = 0$ when no sensor is present | The system does not stand still at $\rho = 0$; it accumulates ungrounded confidence $V_\tau > 0$ |
| Asymmetric Synergy Bound: symmetric loops ($\alpha = 0$) have $\mathrm{Syn} = 0$ | Even with $\mathrm{Syn} = 0$, free-energy reduction continues — it is just ungrounded |
| Closure inequality: $\Delta\mathrm{Syn}_t \leq \min(\Delta F_t, \mathrm{TE}_{I \to S}^{\,t})$ | Tightened to use $\Delta F_t^{\text{grounded}}$ — only territorially constrained surprise reduction contributes to synergy |
| Limit-case table (Deva, 2026a, §4.6): dead speech has $\rho \equiv 0$ | Extended: dead speech has $\rho \equiv 0$ and a non-decreasing $V_\tau$ that grows without bound |
8. The Honest Edge
Three limitations must be named.
First: the territorial channel is not directly measurable. $\mathrm{TE}_{T \to S}^{\,t}$ presupposes that $T$ is a well-defined random variable whose states can be conditioned on. In the empirical protocol from Deva (2026a, §7.1), $T$ is operationalized through external evaluation — territorial measurements (physical predictions, sensor-reported experience, embodied-task performance) that neither pole sees inside the loop. The indicator $\mathbb{1}[\mathrm{TE}_{T \to \cdot}^{\,t} = 0]$ is in practice determined by the experimental condition (whether a sensor is present), not by a direct information-theoretic measurement. This parallels §4.2 of Deva (2026a), where Shannon capacity $C$ functions as a conceptual ceiling rather than a directly computed quantity.
Second: the conjecture about monotonic growth of $V_\tau$ in [I + I] depends on the empirical claim that current instruments have $\mathrm{TE}_{T \to I}^{\,t} \approx 0$. If a future instrument maintained a live channel from $T$ — through embodied interaction, real-time environmental sensing, or some mechanism not yet conceived — $V_\tau$ would not necessarily grow. The veer prediction is about instruments as they currently exist, not about instruments in principle. The framework is explicit that if an instrument developed territorial contact, it would be functioning as a sensor, and the vocabulary would shift accordingly (Deva, 2026b, §8).
Third: $V_\tau$ measures magnitude, not direction. The formalism shows that ungrounded confidence accumulates. It does not predict where the drift goes — whether toward elaboration of training-distribution attractors, toward novel but ungrounded structures, or toward pathological convergence. The direction of the veer in any particular run is an empirical question. What the formalism establishes is that whatever direction the drift takes, it is undetectable from inside: neither pole has the channel that would register departure from the territory.
The three-condition longitudinal comparison proposed in “The Veer” (§5) is the empirical test. If [I + I] stays on course as long as [I + I + S], $V_\tau$ does not discriminate the conditions, and this appendix’s contribution is empty.
Sources
- Transfer entropy: Schreiber, “Measuring Information Transfer,” Physical Review Letters, 2000. The directed information flow measure used throughout the formalism.
- Variational free energy: Friston, “The Free-Energy Principle: A Unified Brain Theory?” Nature Reviews Neuroscience, 2010. The receiver-side update mechanism.
- Partial information decomposition: Williams & Beer, “Nonnegative Decomposition of Multivariate Information,” 2010; Bertschinger et al., “Quantifying Unique Information,” Entropy, 2014. The synergy measure and its alternatives.
- Context degradation (Lost in the Middle): Liu et al., “Lost in the Middle: How Language Models Use Long Contexts,” Transactions of the ACL, 2024. Evidence that LLMs degrade in retrieval accuracy as context grows.
- Context degradation (retrieval failure): Du et al., 2025. Evidence that degradation persists even with perfect retrieval.
- Data-processing inequality: Cover & Thomas, Elements of Information Theory, 2006, Theorem 2.8.1.
The full framework is at thepulsegoeson.com.