AgentSee Research Notebook
version 1.0.0 · created 2026-04-08 · updated 2026-04-08

E5: Negative Tests (Project-Killers)

experiment
Claim

E5 specifies the project-killing falsification conditions. It is a compound negative test that aggregates the falsifier triggers from E1-E4 with the kill conditions that, if met, require abandoning or fundamentally reframing the AgentSee architecture.

AgentSee should be abandoned or reframed if any of the following hold:

  1. Observability fails. E1 falsifier triggers: no lift over trivial baselines, chronic miscalibration, or unacceptable false positive burden. Without observable state estimation, the architecture has no foundation.
  2. Stabilization fails or causes autonomy loss. E2 falsifier triggers: no TTR improvement, autonomy decline, increased monitoring anxiety, or increased dependency markers. The intervention layer either does not work or causes the harm it is designed to prevent.
  3. Dependency is intrinsic to the best-performing policy. If the most effective intervention strategies inherently increase user dependence on the system, the architecture violates its own invariants (I3 anti-capture, I4 dependency minimization) by design rather than by failure.
  4. The on-device understanding layer cannot maintain commitment consistency. If the semantic layer confabulates, substitutes its own goals, or cannot reliably track user commitments, the understanding layer is not a tool but a source of distortion.
  5. Caring constraints reduce to paternalism or persuasion in practice. Kill criteria 6.4 and 6.6. If the system's caring orientation, despite formal constraints, collapses into directing behavior or nudging toward system-preferred outcomes, the architecture has failed to distinguish itself from exploitative alternatives.
  6. A compliance-optimized system achieves equal capacity outcomes. Kill criterion 6.7. If a system optimizing for behavioral compliance (do-what-we-say) produces the same capacity gains as one optimizing for agency capacity, the entire theoretical framework -- that capacity requires different engineering than compliance -- is wrong.

Design note

E5 is not a standalone experiment. It is the aggregated decision criterion: if any of these conditions are met, the project's premises are wrong and continuation without fundamental reframing is not justified.
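
The aggregation rule can be sketched as a logical OR over the six conditions. This is a minimal illustration, not part of the AgentSee specification: the class name, field names, and function names below are all hypothetical, and how each boolean gets set (the actual E1-E4 analyses) is outside the sketch.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class KillConditions:
    """One flag per E5 condition; True means the condition was met.

    Field names are hypothetical labels for the six numbered conditions.
    """
    observability_fails: bool = False           # 1: E1 falsifier triggers
    stabilization_fails_or_harms: bool = False  # 2: E2 falsifier triggers
    dependency_intrinsic: bool = False          # 3: violates I3 / I4 by design
    commitment_inconsistency: bool = False      # 4: understanding layer distorts
    caring_collapses: bool = False              # 5: kill criteria 6.4 and 6.6
    compliance_matches_capacity: bool = False   # 6: kill criterion 6.7


def triggered(conditions: KillConditions) -> List[str]:
    """Return the names of all conditions that were met, in list order."""
    return [f.name for f in fields(conditions) if getattr(conditions, f.name)]


def should_abandon_or_reframe(conditions: KillConditions) -> bool:
    """E5 decision rule: any single met condition is sufficient."""
    return len(triggered(conditions)) > 0


if __name__ == "__main__":
    result = KillConditions(dependency_intrinsic=True)
    print(should_abandon_or_reframe(result), triggered(result))
```

Returning the list of triggered conditions alongside the boolean keeps the decision auditable: the outcome records which premise failed, not only that one did.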