Upholding Epistemic Agency: A Brouwerian Assertibility Constraint for Responsible AI

Generative AI systems increasingly produce authoritative-sounding outputs in high-stakes domains like law, medicine, and policy, but their underlying reasoning is often opaque and probabilistic. This paper proposes a formal, Brouwer-inspired "assertibility constraint" to make AI systems publicly accountable, requiring them to provide contestable certificates of entitlement for their claims or else abstain with an "Undetermined" status. The work addresses a critical gap in responsible AI by shifting focus from confidence scores to verifiable justifications, aiming to preserve human epistemic agency in an age of automated speech.

Key Takeaways

  • The paper introduces an "assertibility constraint" for AI: in high-stakes domains, systems may only assert or deny a claim if they can provide a publicly inspectable and contestable certificate of entitlement.
  • This yields a mandatory three-status interface: Asserted, Denied, or Undetermined. The "Undetermined" status is not a tunable reject option but is required when no forcing witness is available.
  • The framework uses internal witnesses (like sound bounds or separation margins) and an output contract with reason-coded abstentions to operationalize the constraint at the decision layer.
  • A core design lemma proves that any total, certificate-sound binary interface already decides the deployed predicate on its declared scope, making "Undetermined" a fundamental status.
  • The goal is to make AI outputs answerable to challengeable warrants rather than confidence scores alone, preserving democratic epistemic agency.

A Framework for Accountable AI Assertions

The central proposal is a formal constraint inspired by intuitionistic logic and the philosophy of L.E.J. Brouwer. It mandates that for any high-stakes claim, a generative AI system must either provide a publicly inspectable certificate justifying an assertion or denial, or else return a status of "Undetermined". This certificate acts as a "boundary object" connecting the system's internal state of entitlement to a public record that can be challenged. The framework produces a time-indexed entitlement profile that remains stable under numerical refinement but is revisable as public evidence changes.
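To make these pieces concrete, here is a minimal Python sketch of how a certificate and a time-indexed entitlement profile might be represented. All names here (Certificate, EntitlementProfile, witness_kind, and so on) are illustrative assumptions for this article, not the paper's own notation.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional


class Status(Enum):
    ASSERTED = "asserted"
    DENIED = "denied"
    UNDETERMINED = "undetermined"


@dataclass(frozen=True)
class Certificate:
    """Public, contestable record of entitlement for one claim.

    This is the 'boundary object': it links an internal witness
    (e.g., a sound bound or a separation margin) to an inspectable
    artifact that others can challenge.
    """
    claim_id: str
    status: Status          # ASSERTED or DENIED only; never UNDETERMINED
    witness_kind: str       # e.g., "sound_bound", "separation_margin"
    witness_payload: dict   # machine-checkable evidence
    issued_at: datetime     # entitlement is time-indexed


@dataclass
class EntitlementProfile:
    """Time-indexed history of certificates for a claim.

    Stable under mere numerical refinement, but revisable whenever
    the public evidence base changes.
    """
    claim_id: str
    history: list = field(default_factory=list)  # list[Certificate]

    def current(self) -> Optional[Certificate]:
        return self.history[-1] if self.history else None
```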

Operationally, the constraint is implemented through decision-layer gating. Instead of relying on a simple confidence threshold or an argmax output from a softmax layer, the system must check for the presence of an internal "witness." This witness could be a sound mathematical bound, a verified separation margin in a feature space, or another form of verifiable proof. The system's output is governed by a contract that includes reason codes for any abstention, explaining why a determination could not be made.
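A minimal sketch of such gating for one concrete witness type: a certified separation margin for a linear scorer. If the score |w·x + b| exceeds the sound bound ||w||·ε on how far an ε-bounded input perturbation can move it (by Cauchy-Schwarz), the margin itself is the witness; otherwise abstention is forced. The function name, reason codes, and parameters are our illustrative assumptions, not the paper's interface.

```python
from enum import Enum
from typing import Optional, Tuple

import numpy as np


class Status(Enum):  # as in the previous sketch
    ASSERTED = "asserted"
    DENIED = "denied"
    UNDETERMINED = "undetermined"


# Illustrative reason code; the output contract requires a
# machine-readable explanation for every abstention.
NO_WITNESS = "no_forcing_witness_available"


def gate_linear(x: np.ndarray, w: np.ndarray, b: float,
                eps: float) -> Tuple[Status, Optional[dict], Optional[str]]:
    """Decision-layer gating with a separation-margin witness.

    Returns (status, certificate_payload, abstention_reason).
    UNDETERMINED is forced whenever no witness exists; it is not a
    tunable reject threshold.
    """
    score = float(w @ x + b)
    # Sound bound: an L2 perturbation of x by at most eps can move
    # the score by at most ||w|| * eps (Cauchy-Schwarz).
    bound = float(np.linalg.norm(w)) * eps
    if abs(score) > bound:
        status = Status.ASSERTED if score > 0 else Status.DENIED
        cert = {"witness_kind": "separation_margin",
                "score": score, "robustness_bound": bound}
        return status, cert, None
    return Status.UNDETERMINED, None, NO_WITNESS


# Example: a 2-feature claim with a certified radius of 0.05.
status, cert, reason = gate_linear(
    x=np.array([0.9, -0.2]), w=np.array([2.0, 1.0]), b=0.1, eps=0.05)
# score = 1.7, bound = sqrt(5) * 0.05 ~ 0.11 -> Status.ASSERTED
```

Note the contrast with threshold tuning: here eps is part of the public claim being certified (robust at radius ε), not a knob for trading off the abstention rate.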

The author proves a critical technical point: if an interface is total (it covers its entire declared scope) and certificate-sound, then it already decides the truth of the predicate it represents. Consequently, the "Undetermined" status is not a parameter to be adjusted for convenience but a mandatory outcome whenever the system lacks a forcing witness. This formalizes the requirement for abstention, moving beyond the common practice of tunable confidence thresholds for rejection, which can be gamed or misinterpreted.
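Schematically, and in our notation rather than necessarily the paper's, the lemma can be stated as follows.

```latex
\textbf{Design lemma (schematic).} Let $P$ be the deployed predicate on a
declared scope $S$, and let $I : S \to \{\mathsf{A},\mathsf{D}\}$ be a binary
interface that is \emph{total} on $S$ and \emph{certificate-sound}, i.e.
\[
  I(x) = \mathsf{A} \implies P(x)
  \quad\text{and}\quad
  I(x) = \mathsf{D} \implies \neg P(x).
\]
Then $I$ decides $P$ on $S$: for every $x \in S$, one of $P(x)$ or
$\neg P(x)$ is certified. Contrapositively, if no sound total decision
procedure for $P$ exists on $S$, the interface must take values in
$\{\mathsf{A},\mathsf{D},\mathsf{U}\}$, with $\mathsf{U}$ returned exactly
when no forcing witness is available.
```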

Industry Context & Analysis

This research directly challenges the prevailing paradigm of AI deployment, where systems like OpenAI's GPT-4 or Google's Gemini are often prompted to provide definitive answers even on complex, uncertain topics. These models typically generate text based on statistical likelihood rather than verifiable justification; critics have dubbed such systems "stochastic parrots." The paper's "assertibility constraint" offers a rigorous alternative to the industry's current patchwork of mitigations, such as Constitutional AI (used by Anthropic) or output classifiers designed to detect refusals, which lack a formal theory of public justification.

The framework aligns with broader trends in AI safety and alignment but puts them on a formal footing. For instance, the push for scalable oversight and debate seeks to make model reasoning checkable, and uncertainty quantification is a major research frontier. However, most uncertainty methods, such as Bayesian neural networks or ensemble techniques, produce numerical confidence intervals (e.g., a "95% credible region") that are not publicly contestable warrants. The proposed certificates are a different epistemic artifact, closer in spirit to formal verification or proof-carrying code, which have seen adoption in critical systems programming but remain rare in large-scale ML.

The mandatory "Undetermined" status has significant implications for real-world benchmarks and evaluations. Current leaderboards, such as the Hugging Face Open LLM Leaderboard, prioritize accuracy on tasks like MMLU (Massive Multitask Language Understanding) or HumanEval (code generation). A system adhering to this constraint might score lower on such metrics if it abstains on uncertain items, but its outputs would carry higher integrity. This points to a misalignment between academic benchmarks and deployment needs: a wrong but confident answer is far more dangerous than a correct abstention.
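A toy illustration of this tension (our construction, not the paper's): under plain accuracy, where abstentions score as errors, a cautious system looks worse than a confident guesser; under a deployment-style rule that penalizes confident errors, the ordering flips.

```python
def plain_accuracy(preds, labels):
    """Standard leaderboard metric: abstentions ('U') count as wrong."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def penalized_score(preds, labels, wrong_cost=2.0):
    """Toy deployment metric: +1 if correct, 0 if abstaining,
    -wrong_cost for a confident error."""
    total = 0.0
    for p, y in zip(preds, labels):
        if p == "U":
            continue  # abstention: no reward, no penalty
        total += 1.0 if p == y else -wrong_cost
    return total / len(labels)


labels = ["A", "D", "A", "D", "A"]
confident = ["A", "A", "A", "A", "A"]  # always asserts: 3 right, 2 wrong
cautious = ["A", "U", "A", "U", "U"]   # abstains when unsure: 2 right, 0 wrong

print(plain_accuracy(confident, labels))   # 0.6
print(plain_accuracy(cautious, labels))    # 0.4  (looks worse)
print(penalized_score(confident, labels))  # (3 - 4) / 5 = -0.2
print(penalized_score(cautious, labels))   # 2 / 5 = 0.4  (looks better)
```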

What This Means Going Forward

The primary beneficiaries of this approach are domains where justification is paramount: legal adjudication, medical diagnosis, scientific publishing, and content moderation. For example, an AI tool recommending a denial of parole or flagging a scientific paper for retraction would need to present a challengeable certificate, making the process more transparent and fair. This could force a technological shift, favoring AI architectures that inherently produce auditable reasoning traces—such as neuro-symbolic systems or chain-of-thought prompted models—over pure end-to-end neural networks that act as "black boxes."

Regulators and standard-setting bodies may find this framework a useful template for future AI governance. The EU's AI Act, with its risk-based classification and requirements for high-risk systems, could incorporate such an assertibility principle for applications in critical infrastructure. Similarly, efforts like the NIST AI Risk Management Framework could adopt "contestable certificates" as a concrete guideline for achieving transparency and accountability.

Looking ahead, key developments to watch include whether leading AI labs begin prototyping systems with this three-status interface and how the research community integrates the concept into new benchmarks. Will we see a "Certifiable MMLU" benchmark that rewards justified assertions? Furthermore, the technical challenge of generating useful certificates for complex, non-formal domains like open-ended dialogue remains immense. The next phase of work will likely involve creating scalable witness-generation methods, potentially using the model's own internal representations, to make this philosophically rigorous framework practically deployable across the high-stakes applications where it is most urgently needed.
