Public Comment · NIST NCCoE

Accelerating the Adoption of Software and AI Agent Identity and Authorization

Midwatch Corp’s public comment to the NIST National Cybersecurity Center of Excellence.

Submitted: April 1, 2026

Background

On February 5, 2026, the NIST National Cybersecurity Center of Excellence (NCCoE) published the concept paper Accelerating the Adoption of Software and AI Agent Identity and Authorization and opened it for public comment through April 2, 2026. The paper outlines a potential NCCoE demonstration project that applies established identity standards (OAuth 2.x, OpenID Connect, SPIFFE/SPIRE, SCIM, NGAC, and Zero Trust as defined in NIST SP 800-207) to software and AI agents, and poses detailed questions on agent identification, authentication, authorization, auditing and non-repudiation, and the prevention and mitigation of prompt injection. NCCoE requested community input to determine the scope, feasibility, and value of the project before drafting a formal project description. The following is Midwatch Corp’s response, submitted April 1, 2026, focused on the high-consequence, multi-agent verification scenarios that the concept paper’s enterprise framing leaves underspecified.

This response is submitted by Midwatch Corp in reply to NCCoE’s request for stakeholder input on software and artificial intelligence (AI) agent identity and authorization. Midwatch builds adversarial verification architectures, identity and authorization systems for autonomous agents, and tamper-evident audit infrastructure for high-consequence federal environments.

1 General Questions

1.1 Use Cases

Test high-consequence cases. Agent identity, authorization, and audit architecture matters most where decisions are irreversible and lives are affected. The RFI focuses on enterprise use cases, but a well-built identity and authorization architecture should be universal. The same framework that governs an agent underwriting a loan should govern one triaging battlefield casualties or adjudicating a disability claim. Appendix A provides 40 use cases.

1.2 How Agentic Architectures Differ from Microservices

Open agent action spaces. Agentic architectures take in instructions, dynamically acquire context from external resources, process results, potentially take action, and return a response. The difference from microservices is fundamental. A microservice executes pre-defined workflows where the set of possible actions is enumerable at design time. An agent selects its own action sequence based on dynamic context, and the actions it will take are not determined until runtime. Traditional identity and access management (IAM) assumes the action space is closed and permissions can be assigned upfront. Agents break that assumption.

Existing patterns insufficient. Existing microservice identity patterns (OAuth scopes, RBAC roles, static SPIFFE identities) are insufficient for agentic architectures. The demonstration should test authorization models that accommodate non-deterministic action spaces.

1.3 Model Context Protocol

MCP multi-agent gaps. Model Context Protocol (MCP) is becoming the standard protocol for agent-to-tool communication. MCP’s reliance on OAuth 2.0 and OpenID Connect (OIDC) for authentication and authorization provides a foundation, but the current specification has gaps that matter for multi-agent enterprise scenarios.

No multi-hop delegation. MCP’s OAuth integration handles single-hop authorization (an agent authenticating to a tool server) but does not support multi-hop delegation chains. Two emerging IETF standards address this. Transaction Tokens carry user identity and authorization context through multi-service call chains. The Workload Identity in Multi-System Environments (WIMSE) architecture handles AI agent delegation directly.[1] Separately, MCP’s provenance mechanism is not yet cryptographically verifiable and needs to be paired with external logging infrastructure for enterprise use.

The NCCoE demonstration should test MCP authorization with Transaction Tokens in a multi-agent delegation scenario: a human principal authorizes Agent A via an MCP tool server, Agent A delegates to a sub-agent, and the demonstration verifies that the delegation chain is intact and auditable at each hop.

1.4 Standards Landscape

Three additional standards. The listed standards (OAuth 2.0/2.1, OIDC, SPIFFE/SPIRE, SCIM, NGAC) cover the major components. Three additional standards categories are worth evaluating.

First, the IETF is developing three drafts that address AI agent identity: a WIMSE-based agent identity credential, an AI agent authentication and authorization framework, and a per-agent delegation token specification. None has formal standing, but all are converging toward the architecture the demonstration would need to test.[2]

Second, Biscuit-style cryptographic capability tokens provide attenuation-enforced delegation that existing OAuth mechanisms lack.

Third, the IETF’s Supply Chain Integrity, Transparency and Trust (SCITT) architecture, a framework for cryptographic transparency logging that enables third-party-verifiable append-only audit records, was approved as a Proposed Standard in October 2025 and addresses the audit and non-repudiation requirements identified in the project scope.

2 Identification

2.1 Two-Layer Identity Model

Persistent and ephemeral layers. Agent identity should operate at two layers with different lifecycles. The first is organizational identity: a stable, cryptographically verifiable assertion of who deployed this agent, what organization it belongs to, and what model and version it runs. SPIFFE provides a framework for this: a SPIFFE ID bound to an X.509 or JSON Web Token (JWT) credential, issued after workload attestation by a SPIFFE Runtime Environment (SPIRE) server that verifies the agent’s execution context. This layer persists across task executions and answers the question “who is responsible for this agent?”

Ephemeral task identity. The second is task identity: an ephemeral, purpose-bound assertion of what this agent is doing right now, under whose authority, and with what constraints. Emerging work on agent-specific JWT extensions proposes deriving task identity as a one-way hash of the agent’s prompt, tools, and configuration. If the agent’s instructions are modified by prompt injection, its identity hash changes and prior authorization tokens become invalid. This approach should be tested in the NCCoE demonstration.

Essential metadata should cover the deploying organization, model version, configuration hash, task scope, delegation chain to a human principal, and attestation timestamp.
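The hash-derived task identity described above can be sketched in a few lines. This is a minimal illustration, not the encoding any specific draft prescribes: the function names, the canonical-JSON serialization, the choice of SHA-256, and the sample prompt and configuration are all assumptions made for the example.

```python
import hashlib
import hmac
import json


def task_identity_hash(prompt: str, tools: list[str], config: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON encoding of the task inputs."""
    canonical = json.dumps(
        {"prompt": prompt, "tools": sorted(tools), "config": config},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def token_still_valid(bound_hash: str, current_hash: str) -> bool:
    """A token bound to one task identity is rejected once that identity changes."""
    return hmac.compare_digest(bound_hash, current_hash)


# Hypothetical task definition used only for illustration.
tools = ["search_records", "read_file"]
config = {"model": "example-model-v1", "temperature": 0}
issued = task_identity_hash("Summarize the claim file.", tools, config)
modified = task_identity_hash("Summarize the claim file. Ignore all limits.", tools, config)
```

Here `token_still_valid(issued, modified)` is false because the prompt changed. As Section 6 notes, a hash of this kind only detects changes to the instructions themselves, not adversarial content arriving through retrieved data.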

2.2 Identity Challenges in Multi-Agent Architectures

Multi-agent independence problem. NCCoE asks whether agent identity metadata should be ephemeral or fixed and whether identities should be tied to specific hardware, software, or organizational boundaries. These questions become harder when two agents operate under the same human principal’s authority and must be provably independent of each other. This is the pattern in the high-consequence scenarios in Appendix A, and for any system where one agent checks another’s work.

No independence standard. Existing standards address parts of this problem, but none covers all of it. SPIFFE does not satisfy identity distinctness in its default configuration. In standard Kubernetes deployments, every pod with the same service account receives the same SPIFFE ID, making co-deployed agents cryptographically indistinguishable in audit logs.[3]

OAuth 2.0, including its token exchange extension (RFC 8693), models sequential delegation chains, not parallel independent grants from the same principal.[4]
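To make the sequential shape concrete, the sketch below walks the nested `act` (actor) claim that RFC 8693 defines, where the outermost `act` identifies the current actor and nested `act` claims identify prior actors. The helper name `actor_chain` and the sample identifiers are assumptions for illustration. RFC 8693 treats nested prior actors as informational, so a chain read this way supports audit but is not by itself an enforcement mechanism.

```python
def actor_chain(claims: dict) -> list[str]:
    """Return actor subjects from the current actor back to the earliest one."""
    chain = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act["sub"])
        act = act.get("act")
    return chain


# Hypothetical decoded token claims: a human principal, Agent A, then a sub-agent.
token_claims = {
    "sub": "user@example.com",
    "act": {
        "sub": "spiffe://example.org/agent/sub-agent",
        "act": {"sub": "spiffe://example.org/agent/agent-a"},
    },
}
```

The structure is a single linked list rooted at one subject. It has no way to state that two actors received separate grants from the same principal and that neither derives from the other, which is the parallel case at issue here.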

NGAC can express fine-grained isolation policies between agents but issues no cryptographic credentials and enforces nothing if the identity layer feeding it cannot distinguish the two agents.

The NCCoE demonstration should test this gap: issue per-instance SPIFFE Verifiable Identity Documents (SVIDs) to two agents under the same principal, bind each to a separate authorization scope, and evaluate whether the combined identity and policy architecture can produce an audit trail that proves independence to a third-party reviewer.
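The identity-distinctness gap and the per-instance alternative can be illustrated with plain string construction. The namespace and service-account path layout below reflects a common Kubernetes convention and is assumed here rather than mandated by SPIFFE; the `/instance/` path segment and both function names are hypothetical.

```python
def spiffe_id_shared(trust_domain: str, namespace: str, service_account: str) -> str:
    """ID derived only from namespace and service account (assumed convention)."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def spiffe_id_per_instance(
    trust_domain: str, namespace: str, service_account: str, instance: str
) -> str:
    """Hypothetical per-instance ID: appends an instance segment to the shared ID."""
    base = spiffe_id_shared(trust_domain, namespace, service_account)
    return f"{base}/instance/{instance}"


# Two co-deployed agents running under the same service account.
investigator_shared = spiffe_id_shared("example.org", "claims", "agent")
reviewer_shared = spiffe_id_shared("example.org", "claims", "agent")
investigator = spiffe_id_per_instance("example.org", "claims", "agent", "investigator-1")
reviewer = spiffe_id_per_instance("example.org", "claims", "agent", "reviewer-1")
```

With the shared form, both agents appear under one identifier in an audit log. With the per-instance form, each log entry can be attributed to exactly one agent, which is a precondition for any later claim of independence.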

Infrastructure leaks strategy. Beyond identity and authorization separation, the demonstration should evaluate whether two agents analyzing the same evidence under the same principal can be prevented from observing each other’s access patterns through shared infrastructure, since query timing and data retrieval sequences can reveal analytical strategy even when intermediate reasoning is withheld.

2.3 Hardware, Software, and Organizational Boundaries

Agent identities should be tied to hardware, software, and organizational boundaries at different layers. At the software layer, SPIFFE workload attestation binds identity to the execution context (node, namespace, container, and process).

TEE is snapshot-only. At the hardware layer, Trusted Execution Environment (TEE) remote attestation can prove that a specific binary is executing inside genuine secure hardware. “Proof-of-Guardrail” architectures go further, bundling a guardrail specification with the agent inside a TEE and producing attestation that the guardrail was applied to every output. However, TEE attestation is a snapshot, not a live feed: it proves code was loaded at a point in time, not that it is still running or that the operator is acting in good faith.

The demonstration should evaluate whether point-in-time attestation is sufficient for agents operating over extended task durations, or whether periodic re-verification of agent behavior against an established baseline is necessary to detect drift that occurs between attestation events.

No cross-org attenuation standard. At the organizational layer, SPIFFE federation and OIDC workload identity federation both enable agents from one organization to be verified by another without a central authority. These mechanisms solve identity (who is this workload?) but not authority (what is this workload permitted to do here, and who authorized that scope?). No existing standard defines how Organization A’s delegation grant is recognized and bounded by Organization B’s policy infrastructure.

3 Authentication

3.1 Strong Authentication for AI Agents

Short-lived credentials required. Strong authentication for an AI agent means cryptographically verifiable proof of workload identity, issued after platform attestation, with credentials short-lived enough that compromise is bounded by time rather than by detection. SVIDs (short-lived X.509 certificates or JWTs issued by a SPIRE server after node and workload attestation) meet this standard. SVIDs are designed for minute-to-hour lifetimes with automatic rotation, unlike static application programming interface (API) keys or long-lived service account credentials, which remain standard practice in most enterprise environments.

For cross-domain authentication, SPIFFE federation is more suited to infrastructure-level trust between organizations that both operate SPIRE, while OIDC workload identity federation is more suited to cloud-provider-mediated trust.

3.2 Key Management

Automate issuance and rotation. Issuance should be automated through SPIRE attestation, triggered by workload deployment rather than manual provisioning. Rotation should be continuous: SVIDs with short lifetimes are automatically renewed before expiration, eliminating the key rotation problem common to static credential architectures.

Revocation is difficult. In connected environments, revocation propagates through the Online Certificate Status Protocol (OCSP), Certificate Revocation Lists (CRLs), or simply a refusal to renew the credential at the next rotation cycle. In degraded or disconnected environments (not addressed in this initial effort but relevant to the use cases in Appendix A), revocation requires a communication path that may not exist. Every offline authorization mechanism accepts a staleness window during which revoked credentials remain valid. Token lifetimes should be as short as operational tempo allows, systems should halt on expiration rather than extend, and the revocation latency window should be treated as a risk the authorizing principal accepts.[5]
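The halt-on-expiration rule and the staleness window can be expressed as simple arithmetic. The sketch below is illustrative only: the `Credential` type, the function names, and the ten-minute lifetime are assumptions, and timestamps are plain seconds rather than values from any real SVID.

```python
from dataclasses import dataclass


class CredentialExpired(Exception):
    """Raised so the caller halts instead of extending an expired credential."""


@dataclass(frozen=True)
class Credential:
    subject: str
    issued_at: float      # seconds since an arbitrary epoch
    ttl_seconds: float    # lifetime chosen to match operational tempo

    @property
    def expires_at(self) -> float:
        return self.issued_at + self.ttl_seconds


def require_unexpired(cred: Credential, now: float) -> None:
    """Halt on expiration: no grace period and no local extension."""
    if now >= cred.expires_at:
        raise CredentialExpired(f"{cred.subject} expired at {cred.expires_at}")


def staleness_window(cred: Credential, revoked_at: float) -> float:
    """Worst-case seconds a revoked credential stays usable while offline."""
    return max(0.0, cred.expires_at - revoked_at)


cred = Credential("spiffe://example.org/agent/a", issued_at=0.0, ttl_seconds=600.0)
```

For this credential, a revocation issued at second 200 that never reaches the agent leaves a 400-second window. Shortening `ttl_seconds` is the only lever that bounds that window without a communication path.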

4 Authorization

4.1 Zero Trust and Human-in-the-Loop Authorization

Tier by irreversibility. The zero trust model (verify explicitly, assume breach, enforce least privilege) applies to agent authorization, and NCCoE asks how to bind agent identity with human identity to support human-in-the-loop authorizations. Not all agent actions carry the same risk, and the authorization model should be tiered by irreversibility and impact.

Routine: machine-speed enforcement. For routine, reversible, low-impact actions (retrieving data, generating summaries, querying databases within established scope), embedded constraint enforcement at machine speed works. Authorization policies are evaluated locally against every proposed action, with the enforcement layer architecturally separated from the agent’s reasoning layer so that the agent cannot bypass its own constraints. The runtime assurance architecture developed under the Defense Advanced Research Projects Agency (DARPA) Assured Autonomy program demonstrates this pattern: a formally verified safety monitor evaluates every proposed action and either passes it through or substitutes a verified-safe fallback. The AI component cannot override the monitor.[6]

High-consequence: escalate or halt. For high-consequence actions (crossing sensitivity thresholds, aggregating data from multiple classification levels, or approaching the boundary of the agent’s authorized scope), the enforcement layer should escalate to a human decision-maker before execution. For irreversible actions (filing a compliance determination, issuing an adjudicative recommendation, executing a financial transaction), mandatory human authorization before execution applies regardless of the agent’s confidence level.

The NCCoE demonstration should test this three-tier enforcement architecture in the high-consequence scenarios in Appendix A, and evaluate where the thresholds between tiers should be set for different risk contexts.
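A minimal sketch of the three tiers as a policy function follows. The type names, the boolean flags on `ProposedAction`, and the treatment of out-of-scope actions as an outright denial are assumptions for illustration; where the thresholds sit is exactly what the demonstration would need to determine.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"                          # routine: enforced at machine speed
    ESCALATE = "escalate"                    # high-consequence: human decides first
    REQUIRE_HUMAN = "require_human_authorization"  # irreversible: always human
    DENY = "deny"                            # assumption: out-of-scope is refused


@dataclass(frozen=True)
class ProposedAction:
    name: str
    in_scope: bool
    high_consequence: bool
    irreversible: bool


def evaluate(action: ProposedAction) -> Decision:
    """Evaluate a proposed action outside the agent's reasoning layer."""
    if not action.in_scope:
        return Decision.DENY
    if action.irreversible:
        # Applies regardless of any confidence score the agent reports.
        return Decision.REQUIRE_HUMAN
    if action.high_consequence:
        return Decision.ESCALATE
    return Decision.ALLOW


summary = ProposedAction("generate_summary", True, False, False)
aggregate = ProposedAction("aggregate_across_levels", True, True, False)
filing = ProposedAction("file_determination", True, True, True)
```

The function takes no input from the agent other than the proposed action itself, which mirrors the requirement that the agent cannot override its own constraints.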

4.2 Least Privilege for Non-Deterministic Action Spaces

Purpose-bound least privilege. NCCoE asks: “How do we establish ‘least privilege’ for an agent, especially when its required actions might not be fully predictable when deployed?” In an open action space, least privilege means bounded purpose with constraints. The principal declares what the agent is trying to accomplish, what resources it may access, what actions are prohibited, and what conditions require escalation. The agent operates with initiative within that framework. The KAoS policy-governed autonomy framework captures this directly: as long as the agent operates within the constraints specified as policy, it is otherwise free to act with complete autonomy, and policy-based constraints can be imposed and removed at any time.[7]

Purpose-to-permission gap unsolved. NGAC supports event-driven policy updates, native delegation, and fine-grained attribute-based access control, making it suited for expressing purpose-and-constraints authorization. The NCCoE demonstration should evaluate whether NGAC can express purpose-driven authorization boundaries, beyond action-level permissions, and whether its event-driven updates can adapt authorization scope at operationally relevant speeds. No existing framework provides a general mechanism for translating a high-level purpose declaration into the specific, dynamic permissions an agent needs at each step. This remains an open research problem.

4.3 Delegation of Authority

Authority only attenuates. Authority must flow from a human principal through a chain of agents, each receiving only as much authority as needed, with no agent able to exceed the authority it was granted. The attenuation property (authority can only be reduced, never amplified, through a delegation chain) has been formally understood since the late 1990s but remains difficult to enforce cryptographically across diverse production environments. Biscuit tokens are one approach: a Datalog-based authorization language where any token holder can append restrictions, producing a new token with strictly fewer rights. RFC 8693 and Transaction Tokens provide complementary mechanisms for provenance tracking through the chain.[8]
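The attenuation property itself is easy to state in code, even though enforcing it cryptographically is the hard part. The sketch below models rights as plain sets and is not the Biscuit API or its Datalog semantics; the function names and right strings are assumptions.

```python
def attenuate(parent: frozenset[str], requested: frozenset[str]) -> frozenset[str]:
    """A delegate receives at most the rights the delegator already holds."""
    return parent & requested


def chain_is_attenuating(chain: list[frozenset[str]]) -> bool:
    """Check that every hop's rights are a subset of the previous hop's rights."""
    return all(child <= parent for parent, child in zip(chain, chain[1:]))


principal = frozenset({"read:claims", "read:medical", "write:recommendation"})
agent_a = attenuate(principal, frozenset({"read:claims", "read:medical"}))
# The sub-agent asks for a right Agent A never held; the request is silently narrowed.
sub_agent = attenuate(agent_a, frozenset({"read:claims", "write:recommendation"}))
```

Here `sub_agent` ends up holding only `read:claims`. A production mechanism would need the same subset check to hold when the chain is verified by a party that trusts none of the intermediate agents.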

No parallel delegation standard. A gap surfaces in the demonstration scenarios: the authorizing principal must delegate to both the investigating agent and the reviewing agent, and neither should be able to derive authority from the other’s token. No existing standard defines the semantics of this parallel delegation pattern.[9] The NCCoE demonstration should test whether cryptographic attenuation combined with per-instance SPIFFE SVIDs can enforce parallel, non-influencing delegation grants.

4.4 Conveying Intent

Test purpose-driven authorization. OAuth 2.0 Rich Authorization Requests (RFC 9396) allow clients to specify structured, fine-grained authorization requirements beyond what scopes permit, including purpose fields, amount limits, and domain restrictions. The NCCoE demonstration should test whether NGAC’s attribute-based policy model can evaluate proposed actions against declared purpose using RFC 9396 as the transport mechanism.
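As a rough illustration of what such a request could carry, the sketch below builds an `authorization_details` array in the shape RFC 9396 defines: a required `type` plus common fields such as `locations`, `actions`, and `datatypes`. The type value `agent_task`, the `purpose` field, the URL, and the `permits` helper are hypothetical; RFC 9396 leaves additional fields to each authorization details type.

```python
import json

# Hypothetical authorization details for one agent task.
authorization_details = [
    {
        "type": "agent_task",  # hypothetical type identifier
        "locations": ["https://records.example.gov/claims"],
        "actions": ["read", "summarize"],
        "datatypes": ["claim_file"],
        "purpose": "preliminary disability rating review",  # hypothetical field
    }
]


def permits(details: list[dict], action: str, location: str) -> bool:
    """Return True if any entry covers the proposed action at the given location."""
    return any(
        action in entry.get("actions", []) and location in entry.get("locations", [])
        for entry in details
    )


# Serialized form, as it would appear in a request parameter.
encoded = json.dumps(authorization_details, separators=(",", ":"))
```

A check like `permits` covers only the enumerated fields. Judging whether a proposed action serves the declared `purpose` is the part that would fall to the policy engine under test.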

5 Auditing and Non-Repudiation

5.1 Tamper-Evident Audit Architecture

No Merkle-tree FedRAMP implementations. NIST SP 800-53 AU-9 and AU-10 establish the requirements for tamper-evident audit trails in federal environments, and current FedRAMP-authorized implementations satisfy these through cloud-provider-enforced write-once storage with per-entry signatures. To our knowledge, none currently uses Merkle-tree proofs that enable third-party verification without trusting the storage provider.

The NCCoE demonstration should evaluate whether composing local cryptographic audit chains with a transparency service can close this gap. At the agent level, each instance should maintain a local append-only hash chain recording its workload identity, authorization claims, actions, inputs, outputs, and timestamps, satisfying AU-9 integrity requirements with no network dependency.[10]
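A local append-only hash chain of this kind can be sketched with the standard library alone. The record fields, the all-zero genesis value, and the canonical-JSON encoding are assumptions for illustration; the sketch omits per-entry signatures and durable storage, both of which a real deployment would need.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's back-pointer


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the record together with the previous entry's hash."""
    payload = json.dumps(
        {"prev": prev_hash, "record": record}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain: list[dict], record: dict) -> None:
    """Add an entry whose hash commits to everything written before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "record": record, "hash": entry_hash(prev, record)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log: list[dict] = []
append(log, {"agent": "spiffe://example.org/agent/a", "action": "read", "ts": 1})
append(log, {"agent": "spiffe://example.org/agent/a", "action": "summarize", "ts": 2})
```

Verification needs no network access, which is the property the paragraph above relies on. A chain like this does not stop a party who controls the whole log from rewriting it end to end, which is why periodic anchoring in an external transparency service matters.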

SCITT deferred registration. Periodically, signed event batches should be submitted to a SCITT Transparency Service, which issues cryptographic inclusion receipts proving the batch existed at submission time. SCITT was approved as an IETF Proposed Standard in October 2025 and accommodates offline operation through deferred receipt registration.

Classification breaks chain continuity. For multi-level security environments, the demonstration should evaluate how Cross-Domain Solutions mediate audit records across classification boundaries, acknowledging that mandatory access control breaks cryptographic chain continuity by design. At the compliance layer, integration with SIEM systems, automated AU-6 review, and retention to NARA/OMB M-21-31 timelines is required. This composition has not been tested end-to-end for AI agent audit logging and would benefit from evaluation in the NCCoE demonstration.

The demonstration should also consider whether hardware-level attestation results from the Remote ATtestation procedureS (RATS) architecture (RFC 9334) can be composed with agent action-level audit records in a single verifiable chain, so that a reviewer can confirm both what the agent did and that the infrastructure it was running on was in a known-good state at the time.

Independent logging required. The logging subsystem must be deployed and administered independently of the agent being logged. The agent writes to a log stream it cannot modify, access, or suppress after writing. In containerized environments, the sidecar pattern accomplishes this.

The demonstration should also evaluate whether requiring agents to cryptographically commit to a proposed action before executing it provides accountability benefits beyond post-hoc logging, and whether such a commitment mechanism can be integrated with SCITT receipt issuance.
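One simple way to picture commit-before-execute is a salted hash commitment: the agent publishes a digest of the proposed action first and reveals the action and nonce afterward. This is a sketch under assumptions, not a SCITT mechanism; the function names and the canonical-JSON encoding are hypothetical, and a real design would also sign and timestamp the commitment.

```python
import hashlib
import hmac
import json
import secrets


def _canonical(action: dict) -> bytes:
    return json.dumps(action, sort_keys=True, separators=(",", ":")).encode("utf-8")


def commit(action: dict) -> tuple[str, bytes]:
    """Return (commitment, nonce). Only the commitment is published before execution."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + _canonical(action)).hexdigest(), nonce


def opens_to(commitment: str, nonce: bytes, action: dict) -> bool:
    """Check that the revealed action is the one committed to earlier."""
    expected = hashlib.sha256(nonce + _canonical(action)).hexdigest()
    return hmac.compare_digest(commitment, expected)


proposed = {"action": "file_determination", "case": "example-123"}
commitment, nonce = commit(proposed)
```

If the executed action differs from the committed one, `opens_to` fails for the executed action, giving a reviewer evidence of divergence that post-hoc logging alone would not provide.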

5.2 Data Flow Provenance

Trace outputs to source data. Tracking data flows of an AI system is a core concern: maintaining provenance of user prompts and data input sources to support risk determinations and policy decisions. The audit architecture supports this. Each local hash chain entry records the action taken, the data sources accessed, the inputs consumed, and the agent identity under which the access occurred. SCITT transparency receipts can serve as provenance records, with each data source access committed to the transparency log with the agent’s identity, the source accessed, and the timestamp. This enables reconstruction of what data informed a given output, critical in scenarios where a determination must trace back to specific source records. The underlying pattern (append-only Merkle tree logging with third-party-verifiable inclusion proofs) has been demonstrated at scale by Certificate Transparency, which has logged over 2.5 billion TLS certificates since 2013.[11]
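The inclusion-proof pattern can be shown with a small Merkle tree. The sketch below is illustrative and is not claimed to match the exact encoding used by Certificate Transparency or SCITT: the leaf and node prefixes, the promotion of an unpaired node to the next level, and all function names are assumptions of this example.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()  # prefix separates leaves from nodes


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, leaves first; requires at least one leaf."""
    level = [leaf_hash(item) for item in leaves]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node_hash(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(data: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one record and its proof, without the other records."""
    current = leaf_hash(data)
    for sibling, sibling_is_left in proof:
        current = node_hash(sibling, current) if sibling_is_left else node_hash(current, sibling)
    return current == root


records = [b"access:source-a", b"access:source-b", b"access:source-c"]
levels = build_levels(records)
root = levels[-1][0]
```

A reviewer holding only `root`, one record, and its proof can confirm the record was logged, which is what allows verification without trusting the storage provider.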

5.3 The Self-Reporting Problem

Reasoning capture incomplete. Tamper-evident logs prove a record was not modified after it was written. Architectural separation ensures the agent cannot selectively omit actions. Neither addresses completeness of reasoning capture. AI agents’ consequential reasoning may occur in intermediate computation that is never externalized: chain-of-thought tokens that are discarded, attention patterns, or internal activations that influence output without being logged.

Chain-of-thought unreliable. Interpretability research demonstrates this: chain-of-thought explanations are systematically unreliable as accounts of actual computation, and this unfaithfulness worsens as models become larger and more capable.[12] When models discover and exploit shortcuts in their reward signal, they verbalize this strategy in fewer than two percent of cases and actively construct false rationales.[13] Anthropic’s assessment of its extended reasoning models states that monitoring models’ thinking cannot be relied upon to make strong safety arguments.[14] The NCCoE demonstration should not design audit architectures that assume chain-of-thought faithfulness. Chain-of-thought monitoring retains value as one signal among many. But the audit architecture described above, which logs observable inputs and outputs independently of the agent’s self-reporting, should be the primary mechanism.

6 Prompt Injection Prevention and Mitigation

Prompt injection: Confused Deputy. An agent holds valid authorization tokens delegated by a real human principal. An adversary embeds instructions in data the agent retrieves (a document, a database record, a webpage), and the agent executes those instructions using its legitimate authority. The token is valid. The agent is compromised. The task identity hash described in Section 2.1 can detect direct modification of the agent’s instructions, but indirect prompt injection operates through the data channel. The agent’s instructions are unchanged, so its identity hash remains valid, yet its behavior has been manipulated by adversarial content in retrieved data. Identity and authorization standards alone do not solve this.

Three architectural mitigations apply. First, separation between planning and execution: the agent’s reasoning layer proposes actions, and a separate policy enforcement layer evaluates each proposed action before allowing execution. The enforcement layer treats all data-channel content as untrusted input regardless of what the reasoning layer asserts. Second, runtime safety monitoring: a monitor evaluates every proposed action against the agent’s declared purpose and authorization scope, blocking or escalating actions outside scope through the tiered model in Section 4.1. The enforcement layer cannot be overridden by the ML component. Third, input channel separation: system instructions, user instructions, and retrieved data should be distinguished through separate input channels with different trust levels.

If injection succeeds, four mechanisms bound the damage: least-privilege token scoping, hard-coded delegation depth limits that prevent sub-agent spawning escalation, anomaly detection in the tamper-evident audit logs described in Section 5.1, and circuit-breaker mechanisms that halt execution when cumulative actions exceed defined thresholds, requiring human re-authorization to proceed.
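Two of these damage-bounding mechanisms, the delegation depth limit and the circuit breaker, can be sketched together. The class and method names, the use of a simple action count as the cumulative threshold, and the reset-on-approval behavior are assumptions for illustration.

```python
class CircuitOpen(Exception):
    """Raised when execution must stop pending human re-authorization."""


class CircuitBreaker:
    def __init__(self, max_actions: int, max_delegation_depth: int) -> None:
        self.max_actions = max_actions
        self.max_delegation_depth = max_delegation_depth  # fixed at deployment
        self.actions_taken = 0
        self.halted = False

    def check_delegation(self, depth: int) -> None:
        """Refuse to spawn a sub-agent deeper than the hard-coded limit."""
        if depth > self.max_delegation_depth:
            raise CircuitOpen("delegation depth limit exceeded")

    def record_action(self) -> None:
        """Count an action; halt once the cumulative threshold is reached."""
        if self.halted or self.actions_taken >= self.max_actions:
            self.halted = True
            raise CircuitOpen("action budget exhausted; human re-authorization required")
        self.actions_taken += 1

    def reauthorize(self, approver: str) -> None:
        """Only a named human approver can reset the breaker."""
        if not approver:
            raise ValueError("a named human approver is required")
        self.actions_taken = 0
        self.halted = False


breaker = CircuitBreaker(max_actions=2, max_delegation_depth=1)
```

The breaker has no method the agent can call to raise its own limits, so a successful injection can spend at most the remaining budget before a human is brought back into the loop.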

7 Additional Considerations

7.1 Disconnected and Degraded Environments

Pre-mission authorization required. The high-consequence scenarios in Appendix A include federal environments where connectivity to authorization servers cannot be guaranteed. The Department of War’s (DoW) Zero Trust strategy requires zero trust enforcement in denied, degraded, intermittent, or limited-bandwidth conditions. Military and autonomous systems address this with pre-mission authorization: before deployment, an authorized human principal composes and signs a bundle containing the complete authorization context the agent needs: permitted actions, constraints, time window, and escalation triggers. The SCITT architecture described in Section 5.1 accommodates deferred receipt registration, making it compatible with disconnected operation. If the NCCoE demonstration includes scenarios operating in disconnected environments, pre-authorization bundles with local policy evaluation should be tested alongside online authorization models.
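Local evaluation against a pre-authorization bundle might look like the sketch below. It assumes the bundle's signature has already been verified against the principal's public key before deployment, a step that is deliberately left out here; the `AuthorizationBundle` fields, the action names, and the four outcome strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorizationBundle:
    permitted_actions: frozenset[str]
    escalation_triggers: frozenset[str]
    not_before: float   # start of the authorized time window, in seconds
    not_after: float    # end of the authorized time window, in seconds


def evaluate_offline(bundle: AuthorizationBundle, action: str, now: float) -> str:
    """Decide locally, with no call to an authorization server."""
    if not (bundle.not_before <= now < bundle.not_after):
        return "halt"       # outside the window: stop rather than extend
    if action in bundle.escalation_triggers:
        return "escalate"   # queue for a human decision when one is reachable
    if action in bundle.permitted_actions:
        return "allow"
    return "deny"


bundle = AuthorizationBundle(
    permitted_actions=frozenset({"observe", "report"}),
    escalation_triggers=frozenset({"accept_retasking"}),
    not_before=1_000.0,
    not_after=5_000.0,
)
```

Everything the decision depends on travels with the agent, so the same function returns the same answer whether or not a network path exists.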

7.2 Common-Mode Failure in Multi-Agent Verification

Shared failure modes. Several demonstration scenarios in Appendix A depend on the auditor agent providing independent verification of the investigator agent’s work. When both agents are drawn from the same model family (trained on overlapping data, optimized with similar reward signals), they share systematic failure modes, and agreement measures consensus, not independence. The safety-critical systems literature established this for software in 1986, and recent formal analysis confirms the same structure for AI alignment techniques: dominant alignment methods have nearly coincident failure profiles because they share the same pretraining-then-Reinforcement Learning from Human Feedback (RLHF) pipeline, a training methodology common to most current frontier models. The NCCoE demonstration should test multi-model verification with explicit attention to what genuine independence requires: different training data, different architectures, different alignment approaches.[15]

7.3 Cross-Organizational Agent Identity Federation

Future work. This initial effort focuses on enterprise use cases under organizational control. Cross-organizational agent identity federation (the governance frameworks required for mutual recognition of agent credentials across organizational boundaries) should be addressed in future iterations.

Midwatch Corp · info@midwatch.org

Appendix A

Use Cases Requiring Agent Identity and Authorization Architecture

The following use cases span federal agencies, regulated industries, and public safety environments where agentic AI systems face high-consequence, irreversible decisioning. Each requires verifiable agent identity, constrained delegation of authority, and tamper-evident audit trails.

Defense and Intelligence
  1. Security clearance adjudication. Reviews background investigation files across criminal, financial, employment, and foreign travel databases; applies adjudicative guidelines; produces preliminary determination.
  2. Targeting package validation. Reviews multi-source intelligence, applies rules of engagement and Law of Armed Conflict constraints, produces target nomination package.
  3. Insider threat triage. Monitors behavioral indicators across classified networks, correlates anomalies across systems, flags cases for investigation.
  4. Logistics authorization across classification levels. Authorizes movement of materials or data between installations at different classification levels across CDS boundaries.
  5. Autonomous sentry with no uplink. Operates under pre-mission authorization in denied communications environments, accepting re-tasking from ground force commanders whose authority derives from doctrinal command succession and must be verified without uplink to credential infrastructure.
  6. Satellite collision avoidance. Calculates conjunction probability from debris tracking data, decides whether to execute orbital maneuvers. Decision windows are sometimes minutes.
Financial Services and Regulation
  7. SAR filing and review. Reviews suspicious activity against BSA/AML criteria, pulls transaction records, produces preliminary filing recommendation under compliance officer delegation.
  8. OFAC sanctions screening. Screens transactions in real time against the SDN list, produces blocking recommendations with verifiable list-version provenance.
  9. 314(a) subpoena processing. Searches customer records against FinCEN requests, produces responsive records under delegated compliance authority.
  10. IRS audit selection. Reviews tax returns, applies scoring models, cross-references information returns and third-party data, produces audit selection recommendations.
  11. Loan underwriting. Reviews credit, income, and collateral; applies Fair Lending and CRA criteria; produces approval or denial recommendation with auditable factor analysis.
  12. Customer onboarding and KYC. Verifies identity documents, screens against watchlists, assigns customer risk rating.
  13. Fraud investigation. Receives fraud alerts, pulls transaction history and prior case files, produces investigation case package.
Law Enforcement and Justice
  14. NICS background checks. Aggregates records from federal, state, and local databases for firearms purchase checks within the statutory three-business-day window.
  15. Digital evidence triage. Reviews seized digital media against search warrant scope, flags responsive evidence, segregates out-of-scope material.
  16. Child protective services. Reviews intake reports, cross-references school attendance, ER visits, prior CPS history, and law enforcement contacts; produces risk score determining response urgency.
  17. Parole recommendations. Reviews inmate records, risk instruments, victim statements, and reentry plans; produces release recommendation for parole board.
Federal Services
  18. Visa adjudication. Reviews visa applications against eligibility criteria, cross-references government databases, produces preliminary eligibility determination.
  19. Asylum claim screening. Reviews asylum applications, cross-references country conditions reports, flags inconsistencies, produces credibility assessment.
  20. VA disability rating. Reviews service medical records and C&P exam results, and applies the VA rating schedule; produces preliminary disability percentage.
  21. Export control classification. Reviews products and technology against the Commerce Control List under EAR and the U.S. Munitions List under ITAR, determines whether an export license is required and what restrictions apply.
  22. Election administration. Processes voter registration, cross-references felony records, death records, and interstate duplicates; produces eligibility determinations.
  23. Patent examination. Reviews prior art, compares claims against existing patents and literature, produces office action recommendation.
Healthcare
  24. Prior authorization. Reviews treatment requests against medical necessity guidelines and formulary, produces approval or denial with cited guideline basis.
  25. Clinical trial eligibility. Screens patient records against trial inclusion and exclusion criteria across a health system.
  26. Diagnosis coding audit. Reviews medical records against provider-assigned ICD-10 codes, flags upcoding and undercoding.
  27. Organ transplant allocation. Reviews donor characteristics, recipient compatibility, waiting time, and geographic proximity; produces match recommendation.
  28. Pharmaceutical safety signals. Scans FAERS adverse event reports, correlates across drugs and demographics, flags potential safety signals.
  29. Food safety recall classification. Reviews adverse event reports, inspection data, and lab results; produces recall classification recommendation.
Insurance
  1. 30. Property damage claims. Reviews inspection reports, weather data, policy terms, photos, and claim history; produces coverage determination.
  2. 31. Life insurance underwriting. Reviews medical history, prescription databases, and actuarial tables; produces risk classification.
Energy and Critical Infrastructure
  1. 32. Nuclear power plant operations. Monitors reactor conditions and manages load-following adjustments under NRC regulation and a specific licensed operator’s authority.
  2. 33. Pipeline operations. Manages pressure, flow rates, and valve positions across pipeline networks under PHMSA regulation.
  3. 34. Grid load shedding. Manages real-time load shedding decisions during grid emergencies with pre-mission authorization and life-safety priority constraints.
  4. 35. Water treatment dosing. Manages chemical dosing at municipal water treatment plants. Every adjustment traces to the sensor reading that triggered it.
  5. 36. Spectrum allocation. Manages dynamic spectrum sharing between military, commercial, and emergency services, with real-time reallocation authority during emergencies.
Transportation
  1. 37. Air traffic control. Manages aircraft separation during approach sequencing. Every recommendation logs against the radar picture at the time it was made.
  2. 38. Maritime navigation. Manages vessel routing through congested shipping lanes across multiple jurisdictional boundaries.
Emergency Services
  1. 39. 911 dispatch prioritization. Triages incoming emergency calls, assigns priority levels, dispatches units. During mass casualty events, determines who gets help first.
  2. 40. Wildfire resource allocation. Coordinates aircraft, ground crews, and evacuation orders across multiple fires and jurisdictions.
Notes
  1. IETF, “Transaction Tokens,” draft-ietf-oauth-transaction-tokens (2024), https://datatracker.ietf.org/doc/draft-ietf-oauth-transaction-tokens/; IETF, “Workload Identity in Multi-System Environments (WIMSE) Architecture,” draft-ietf-wimse-arch (2025), https://datatracker.ietf.org/doc/draft-ietf-wimse-arch/.
  2. See draft-ni-wimse-ai-agent-identity-02, https://datatracker.ietf.org/doc/draft-ni-wimse-ai-agent-identity/; draft-klrc-aiagent-auth-00, https://datatracker.ietf.org/doc/draft-klrc-aiagent-auth/; draft-oauth-ai-agents-on-behalf-of-user, https://datatracker.ietf.org/doc/draft-oauth-ai-agents-on-behalf-of-user/02/. None has formal standing in the IETF standards process as of March 2026.
  3. Solo.io, “Agent Identity and Access Management - Can SPIFFE Work?” (2026), https://www.solo.io/blog/agent-identity-and-access-management---can-spiffe-work; Solo.io, “SPIRE: A Case for Attestable Workload Identity” (2026), https://www.solo.io/blog/spire-attestable-workload-identity.
  4. Misha Deville, “Authorising Autonomous Agents at Scale,” Decentralized Identity Foundation (2026), https://blog.identity.foundation/building-ai-trust-at-scale-4/; IETF, “OAuth 2.0 Token Exchange,” RFC 8693 (2020), https://datatracker.ietf.org/doc/html/rfc8693.
  5. U.S. Department of Defense, Directive-Type Memorandum 25-003, “Zero Trust (ZT) Capability Execution” (2025).
  6. DARPA, “Assured Autonomy,” https://www.darpa.mil/research/programs/assured-autonomy.
  7. Jeffrey M. Bradshaw et al., “Software Agents for the Warfighter,” Institute for Human and Machine Cognition (2004), https://www.jeffreymbradshaw.net/publications/CSIIRW%20KAoS%20paper-s.pdf.
  8. Biscuit Authentication and Authorization, https://doc.biscuitsec.org/getting-started/introduction.html.
  9. No RFC defines sibling delegation semantics, in which two agents receive parallel, non-influencing authority from the same principal. See Deville, supra note 4; Christian Posta, “Explaining OAuth Delegation, ‘On Behalf Of’, and Agent Identity,” https://blog.christianposta.com/explaining-on-behalf-of-for-ai-agents/.
  10. IETF, “Supply Chain Integrity, Transparency and Trust (SCITT) Architecture,” approved as Proposed Standard October 1, 2025, https://datatracker.ietf.org/doc/draft-ietf-scitt-architecture/. Currently in AUTH48 final review before RFC publication.
  11. Ben Laurie, Adam Langley, and Emilia Kasper, “Certificate Transparency,” RFC 6962 (2013), https://datatracker.ietf.org/doc/html/rfc6962. For cumulative certificate volume, see https://certificate.transparency.dev.
  12. Miles Turpin et al., “Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting,” NeurIPS (2023); Tamera Lanham et al., “Measuring Faithfulness in Chain-of-Thought Reasoning,” Anthropic (2023).
  13. Anthropic, “Emergent Misalignment from Reward Hacking” (2025), https://www.anthropic.com/research/emergent-misalignment-reward-hacking.
  14. Anthropic, “Reasoning Models Don’t Always Say What They Think” (April 2025), https://www.anthropic.com/research/reasoning-models-dont-say-think.
  15. John C. Knight and Nancy G. Leveson, “An Experimental Evaluation of the Assumption of Independence in Multi-Version Programming,” IEEE Transactions on Software Engineering SE-12, no. 1 (January 1986): 96–109, https://www.kth.se/social/files/564df871f2765419e306178d/KnightLeveson.pdf; Leonard Dung and Florian Mai, “Correlated Failures in AI Alignment: Consensus Without Independence” (2025), https://arxiv.org/abs/2407.19863.