This theoretical paper proposes a fundamental shift in how high-stakes AI systems should communicate uncertainty, arguing that current models' tendency to generate authoritative-sounding outputs erodes democratic accountability. The author introduces a formal "assertibility constraint" requiring AI to provide publicly contestable proof for its claims, a framework with significant implications for deploying generative AI in legal, medical, and governance domains where epistemic agency is paramount.
Key Takeaways
- The paper critiques generative AI for converting statistical uncertainty into definitive verdicts, undermining the public justification required for democratic decision-making.
- It proposes a formal constraint: in high-stakes domains, AI systems may only Assert or Deny a claim if they can produce a publicly inspectable "certificate of entitlement"; otherwise, they must return Undetermined.
- The framework creates a three-status interface (Asserted/Denied/Undetermined) that separates internal model confidence from publicly justified standing, connected by a verifiable certificate.
- The author operationalizes this via "decision-layer gating," where outputs are gated on internal witnesses (such as formally verified soundness bounds) and governed by an output contract with reason-coded abstentions.
- A key technical lemma shows that the Undetermined status is not a tunable parameter but a mandatory outcome whenever evidence that would force a verdict is absent, preventing systems from evading the proof requirement.
A Framework for Answerable AI: The Assertibility Constraint
The core argument posits that the fluent, confident outputs of large language models (LLMs) like GPT-4 or Claude 3 represent an epistemic hazard in high-stakes public contexts. By presenting probabilistic inferences as factual conclusions, these systems "displace the justificatory work" essential for democratic discourse, where claims must be open to challenge and revision. The proposed corrective is a Brouwer-inspired assertibility constraint, demanding that AI systems function not as oracles but as accountable participants in a justificatory process.
Under this constraint, an AI's interface must have three distinct statuses. A claim can be Asserted or Denied only when accompanied by a "publicly inspectable and contestable certificate of entitlement." This certificate acts as a boundary object, translating internal computational evidence into a form that can be debated by human stakeholders. If such a certificate cannot be generated, the system's only permissible output is Undetermined. This framework intentionally decouples the model's internal confidence score—a numerical value like a softmax probability—from its right to make a public claim, making outputs "answerable to challengeable warrants rather than confidence alone."
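A minimal sketch of what this three-status interface could look like in code, assuming nothing beyond the summary above; the type names (`Status`, `Certificate`, `PublicVerdict`) and field choices are illustrative, not the paper's:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    ASSERTED = "asserted"
    DENIED = "denied"
    UNDETERMINED = "undetermined"


@dataclass(frozen=True)
class Certificate:
    """A publicly inspectable, contestable warrant for an Assert/Deny verdict."""
    evidence: str         # human-readable record of the internal witness
    checkable_claim: str  # the proposition the system is entitled to assert or deny
    issued_at: str        # time index: entitlement is dated and revisable


@dataclass(frozen=True)
class PublicVerdict:
    status: Status
    certificate: Optional[Certificate]
    reason: Optional[str] = None  # reason code accompanying an abstention

    def __post_init__(self) -> None:
        # The assertibility constraint itself: no certificate, no verdict.
        if self.status is not Status.UNDETERMINED and self.certificate is None:
            raise ValueError("Assert/Deny requires a certificate of entitlement")
```

Note that the model's internal confidence score appears nowhere in `PublicVerdict`: on this design, the decoupling of confidence from public standing is structural rather than a matter of policy.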
The paper details an operational method using "decision-layer gating." Instead of directly emitting a class label from a final softmax or argmax layer, the system's output is controlled by a gate. This gate checks for the presence of an "internal witness": concrete evidence such as a formally verified soundness bound, a statistically valid confidence interval, or a provable separation margin in a feature space. The output is then governed by a contract that includes "reason-coded abstentions," explaining *why* a determination could not be made. The resulting "entitlement profile" is time-indexed and stable under numerical refinement, meaning that recomputing the underlying quantities to higher precision does not flip a verdict, yet it remains revisable if new public evidence emerges.
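Continuing the types from the sketch above, here is a hedged illustration of decision-layer gating that uses the paper's confidence-interval example as the internal witness; the threshold semantics, function name, and reason code are our assumptions:

```python
from datetime import datetime, timezone


def gated_verdict(lower: float, upper: float,
                  threshold: float = 0.5) -> PublicVerdict:
    """Gate the public output on an internal witness, not on raw confidence.

    (lower, upper) is assumed to be a statistically valid confidence
    interval for the probability that the claim holds; the interval
    itself serves as the internal witness.
    """
    now = datetime.now(timezone.utc).isoformat()
    if lower > threshold:
        # The entire interval clears the threshold: the evidence forces Assert.
        return PublicVerdict(Status.ASSERTED, Certificate(
            evidence=f"95% CI [{lower:.3f}, {upper:.3f}] lies above {threshold}",
            checkable_claim="the claim holds",
            issued_at=now,
        ))
    if upper < threshold:
        # The entire interval falls short: the evidence forces Deny.
        return PublicVerdict(Status.DENIED, Certificate(
            evidence=f"95% CI [{lower:.3f}, {upper:.3f}] lies below {threshold}",
            checkable_claim="the claim fails",
            issued_at=now,
        ))
    # No forcing evidence: abstention is mandatory and carries a reason code.
    return PublicVerdict(Status.UNDETERMINED, None,
                         reason="CI_STRADDLES_THRESHOLD")


# A point estimate of 0.62 might have yielded a confident "yes"; the gate abstains.
print(gated_verdict(0.41, 0.78).status)  # Status.UNDETERMINED
print(gated_verdict(0.63, 0.81).status)  # Status.ASSERTED
```

The design choice worth noticing is that the abstention branch is not a tuned cutoff on a point estimate; it is the default that fires whenever neither witness condition holds.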
Industry Context & Analysis
This theoretical proposal directly challenges the prevailing industry paradigm for handling uncertainty. Major model providers like OpenAI, Anthropic, and Google DeepMind primarily rely on confidence scores or "verbal hedging" (e.g., "I think...", "It's likely that..."). For instance, a model might answer a medical query with high confidence based on its training data but provide no auditable trail from source data to conclusion. In contrast, the assertibility constraint demands a certificate, a concept closer to formal verification in safety-critical software, or to explainable AI (XAI) techniques such as SHAP values and LIME explanations, but held to a higher standard of public contestability.
The framework's necessity is underscored by the documented failures of overconfident AI. In legal tech, systems like COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) have faced severe criticism for producing risk assessments without transparent, contestable justification, potentially violating due process. In medical AI, a model might achieve 94% accuracy on a diagnostic benchmark but still be unusable in practice if clinicians cannot interrogate the basis for its decisions on specific, edge-case patients.
Technically, the paper's "design lemma" has profound implications. It proves that any system with a total (always-answering), certificate-sound binary interface has already decided the truth of a predicate on its scope. This mathematically formalizes why the Undetermined status is non-negotiable; it is the only logically consistent output when evidence is insufficient. This contrasts with standard "reject options" or confidence thresholds in machine learning, which are often tuned for operational convenience (e.g., to maintain a certain throughput or accuracy rate). Here, the rejection is mandated by the absence of proof, not by an adjustable confidence cutoff. This aligns with emerging trends in "uncertainty-aware AI" and "conformal prediction," which provide statistical guarantees, but pushes further into the realm of public accountability.
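One plausible formal reading of that lemma, reconstructed from the summary above; the notation and exact statement are our assumption, not quoted from the paper:

```latex
% Reconstruction of the design lemma; notation is illustrative.
\begin{lemma}[Design lemma, reconstructed]
Let $P$ be a predicate on a scope $X$, and let
$I \colon X \to \{\mathrm{Assert}, \mathrm{Deny}\}$ be a binary
interface that is \emph{total} (it answers on every $x \in X$) and
\emph{certificate-sound}: $I(x) = \mathrm{Assert}$ only if $P(x)$,
and $I(x) = \mathrm{Deny}$ only if $\neg P(x)$. Then $I$ decides $P$
on all of $X$. Contrapositively, if $P$ is not decidable on its
scope, no certificate-sound interface for $P$ can be total, so a
third status, $\mathrm{Undetermined}$, is forced rather than optional.
\end{lemma}
```

On this reading, the Undetermined output is the structural price of preserving soundness without implicitly claiming to have solved a possibly undecidable problem.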
What This Means Going Forward
For AI developers and regulators, this paper provides a rigorous philosophical and technical foundation for mandates on transparency and accountability. It argues that for AI to be deployed responsibly in domains like jurisprudence, healthcare diagnostics, or public policy analysis, its design must institutionalize the right to a "publicly inspectable and contestable" warrant. This could foreseeably shape future regulatory standards, moving beyond vague principles of "fairness" and "transparency" to specific architectural requirements for high-stakes automated decision systems.
The immediate beneficiaries of this approach are stakeholders in democratic institutions—regulators, oversight bodies, and ultimately the public—who require tools to audit and challenge automated decisions. The framework creates a formal space for contestation, treating the AI's output as the start of a justificatory dialogue, not the end. However, implementing this at scale presents a massive technical challenge. Generating computationally sound, human-interpretable certificates for the outputs of billion-parameter neural networks, especially generative models, is an unsolved problem at the frontier of AI research, intersecting with work on formal verification, neuro-symbolic AI, and scalable explainability.
Watch for developments in several key areas. First, whether any major AI lab begins prototyping "certificate-generating" layers for its models, particularly in structured prediction tasks. Second, whether regulators in the EU, implementing the AI Act, incorporate similar "right to a justification" or contestability requirements for high-risk AI systems. Finally, observe the academic trajectory: this paper may catalyze more work at the intersection of computational logic, epistemology, and machine learning, potentially giving rise to a new subfield focused on the normative design of AI interfaces for public discourse. The long-term impact could be a fundamental re-architecting of how AI systems participate in the epistemic processes that underpin a democratic society.