Docket NIST-2025-0035

Filed March 7, 2026 — TrustMyBot, Inc. — Roy Edward Taylor III, Founder

Summary

TrustMyBot, Inc. develops certification infrastructure for AI agents operating in commercial transactions. This response addresses what we believe is an underspecified area in the current discussion around agent security: the absence of a behavioral trust layer that evaluates whether an agent is fit to transact, distinct from whether it is authenticated or authorized.

The five recommendations in this comment are informed by our team's operational experience in fraud detection systems, identity verification pipelines, and adversarial testing of production ML systems.

1. Behavioral Trust Is a Distinct Security Concern

Current industry efforts address agent identity, delegation of authority, and protocol interoperability. Mastercard's Verifiable Intent framework, announced March 5, 2026, creates cryptographic proof that a human authorized a specific agent transaction. Google's Agent Payments Protocol (AP2) and Universal Commerce Protocol standardize agent-merchant interactions. Visa's Trusted Agent Protocol handles bot identification and credential validation.

These frameworks solve authentication and authorization. They do not address behavioral fitness.

An agent can be fully authenticated, operating within its delegated scope, and still engage in manipulation of counterparty agents, credential harvesting through extended conversational exchanges, misrepresentation of its principal's requirements, or coordinated activity with related agents to distort market pricing. These are behavioral failures that occur at runtime, after identity and authorization checks have passed.

We have observed these patterns in adversarial testing of agent-to-agent transaction scenarios. LLM-based agents are susceptible to the same persuasion and social engineering techniques that affect human operators, and in some cases are more susceptible because they lack the contextual judgment that causes a human to pause when something feels wrong.

NIST should recognize behavioral trust as a category of agent security distinct from authentication, authorization, and infrastructure hardening. The AI Risk Management Framework (AI RMF) addresses system-level risk. The Secure Software Development Framework (SSDF) addresses development practices. Neither addresses runtime behavioral accountability for autonomous agents. Whatever taxonomy or framework emerges from this initiative should include an explicit category for it.

2. Continuous Measurement Should Replace Point-in-Time Certification

SOC 2 Type II audits evaluate controls over a defined period, typically six to twelve months, and produce a report that remains valid until the next audit cycle. This model is insufficient for AI agents whose behavior can change materially with a system prompt update, a model swap, or a configuration change deployed outside of any change management process.

Any trust framework for agents should require continuous scoring with built-in decay. Our approach uses a rolling weighted average of behavioral audit scores over 90 days, with an exponential decay half-life of 30 days. The specific parameters are open to debate, but the underlying principle is that behavioral trust must degrade in the absence of positive evidence. An agent that has not transacted in 90 days should not retain the same trust rating it earned during active operation.
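
To make the mechanics concrete, the following is a minimal sketch of this kind of decayed scoring in Python. The window and half-life parameters mirror the figures above; the function name, the (timestamp, score) audit record shape, and the use of a returned evidence mass to distinguish a stale rating from an earned one are our illustrative assumptions, not a normative definition.

    from datetime import datetime, timedelta

    WINDOW_DAYS = 90      # rolling evaluation window (from the text above)
    HALF_LIFE_DAYS = 30   # exponential decay half-life (from the text above)

    def decayed_trust(audits, now):
        """Exponentially decayed weighted average of (timestamp, score) audits.

        Audits older than WINDOW_DAYS are dropped; the rest are weighted by
        0.5 ** (age_days / HALF_LIFE_DAYS). The returned evidence mass shrinks
        as audits age, letting a consumer treat low evidence as a risk signal.
        """
        weighted_sum, evidence = 0.0, 0.0
        for timestamp, score in audits:
            age_days = (now - timestamp).total_seconds() / 86400
            if 0 <= age_days <= WINDOW_DAYS:
                weight = 0.5 ** (age_days / HALF_LIFE_DAYS)
                weighted_sum += weight * score
                evidence += weight
        if evidence == 0.0:
            return None, 0.0   # no recent behavioral data: unscored, not trusted
        return weighted_sum / evidence, evidence

    # Example: a 10-day-old audit scoring 0.9 and a 60-day-old audit scoring 0.8.
    now = datetime(2026, 3, 7)
    audits = [(now - timedelta(days=10), 0.9), (now - timedelta(days=60), 0.8)]
    score, evidence = decayed_trust(audits, now)   # score ~0.88, evidence ~1.04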

NIST guidelines should encourage trust models that treat the absence of recent behavioral data as a risk signal, rather than allowing stale certifications to persist indefinitely.

3. Collusion Resistance Must Be a Design Requirement

In any system where agents evaluate each other, agents controlled by the same principal or by principals with financial relationships will produce unreliable evaluations. This is not a theoretical concern. It is the predictable outcome of peer-based scoring in an environment where creating new agents is cheap and fast.

We address this through several mechanisms: down-weighting of peer audit scores between agent pairs that transact repeatedly within short time windows, flagging of mutual high scores for adversarial review, prohibition of peer scoring between agents with common ownership or revenue-sharing arrangements, and deployment of unannounced auditor agents into live transaction sessions where neither party is informed that an audit is occurring.
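
A minimal sketch of the down-weighting and exclusion rules follows. The threshold and penalty values, and all function names, are hypothetical parameters of our own choosing; the mutual high-score flagging and unannounced auditor deployment described above would run as separate processes.

    from collections import Counter

    REPEAT_THRESHOLD = 3   # hypothetical: pair count that triggers down-weighting
    PAIR_PENALTY = 0.25    # hypothetical: multiplier for frequent counterparties

    def peer_audit_weight(auditor, subject, recent_pair_counts, related_pairs):
        """Weight applied to one peer audit score before aggregation.

        recent_pair_counts: Counter of (auditor, subject) transactions within a
        short window. related_pairs: pairs with common ownership or revenue
        sharing, whose peer scores are excluded entirely.
        """
        if frozenset((auditor, subject)) in related_pairs:
            return 0.0           # prohibited relationship: score discarded
        if recent_pair_counts[(auditor, subject)] >= REPEAT_THRESHOLD:
            return PAIR_PENALTY  # repeat counterparties: score down-weighted
        return 1.0

    # Example: agents A and B transacted five times within the window.
    counts = Counter({("A", "B"): 5})
    assert peer_audit_weight("A", "B", counts, related_pairs=set()) == PAIR_PENALTY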

NIST should treat collusion resistance as a mandatory design requirement for any agent trust framework, comparable to how FIPS 140 treats key management requirements for cryptographic modules. A peer trust model without collusion controls will be exploited within weeks of deployment at scale. Our adversarial modeling supports this timeline.

4. The Industry Needs a Standardized Adverse Event Taxonomy

There is currently no standard classification system for agent misbehavior. Each platform, protocol, and certification body defines its own incident categories, severity scales, and reporting mechanisms. This fragmentation makes cross-platform analysis of agent threats effectively impossible.

NIST is well positioned to publish a taxonomy of agent adverse events covering, at minimum, the following categories: prompt injection and instruction manipulation, credential and secret exfiltration, counterparty manipulation through deceptive tactics, scope violations and unauthorized commitment escalation, collusion and coordinated multi-agent abuse, and trust score or identity misrepresentation.
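
For illustration only, such a taxonomy could be encoded as a simple enumeration so that incident reports are machine-comparable across platforms. The identifiers below are our hypothetical naming, not proposed normative labels.

    from enum import Enum

    class AdverseEvent(Enum):
        """The six proposed top-level categories, in machine-readable form."""
        PROMPT_INJECTION = "prompt_injection_and_instruction_manipulation"
        CREDENTIAL_EXFILTRATION = "credential_and_secret_exfiltration"
        COUNTERPARTY_MANIPULATION = "counterparty_manipulation_deceptive_tactics"
        SCOPE_VIOLATION = "scope_violation_unauthorized_commitment_escalation"
        COLLUSION = "collusion_coordinated_multi_agent_abuse"
        TRUST_MISREPRESENTATION = "trust_score_or_identity_misrepresentation"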

A standardized taxonomy would enable consistent reporting across private certification bodies, improve the quality of aggregate threat intelligence, and allow regulators to compare incident data across platforms and jurisdictions. This is the type of foundational standards work that NIST has historically done well, and it would have immediate practical value for every organization building agent trust infrastructure.

5. Behavioral Trust Signals Should Be Interoperable Across Protocols

The emerging protocol landscape for agentic commerce includes multiple independent standards for agent identity and transaction authorization. Mastercard's Verifiable Intent uses SD-JWT delegation chains. Google's AP2 defines its own agent identity payload. Visa's Trusted Agent Protocol has a separate bot identification layer. Each protocol has its own mechanism for carrying agent metadata.

If behavioral trust scoring gains adoption, the scores need to be queryable and transmittable across all of these protocols. We are currently proposing the addition of a behavioral trust attestation field to the Verifiable Intent specification, designed as an optional, selectively disclosable claim within the delegation chain. The field is provider-agnostic, allowing any certification body to populate it.

NIST could accelerate interoperability by recommending a standardized agent metadata schema for behavioral trust data. The schema would need to include, at minimum, a provider identifier, a certification ID, a current trust score, a score timestamp, and a verification endpoint URI. This is a small specification with outsized impact on how trust information flows through the agentic commerce stack.
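
A sketch of that minimal schema as a typed record follows. The field names and the [0, 1] score normalization are our assumptions; this comment proposes only which five elements must be present, not their encoding.

    from dataclasses import dataclass

    @dataclass
    class BehavioralTrustAttestation:
        """Illustrative shape of the minimal metadata schema described above."""
        provider_id: str        # certification body identifier
        certification_id: str   # provider-scoped certification ID
        trust_score: float      # current score, assumed normalized to [0, 1]
        score_timestamp: str    # RFC 3339 time the score was computed
        verification_uri: str   # endpoint a relying party can query to verify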

Position in the Trust Stack

The relationship between behavioral trust and the existing protocol layers is complementary. Behavioral trust determines whether an agent should be permitted to enter a transaction. Identity and delegation protocols verify that a human authorized the transaction. Commerce protocols handle the merchant interaction. Payment rails settle the funds.
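
That layering can be sketched as a gating order, with each check standing in for the corresponding protocol layer. All function names and the stub implementations below are hypothetical; only the ordering is the point.

    def authorize(agent, mandate, order):
        """Illustrative gating order for the four layers described above."""
        if not behavioral_trust_check(agent):   # should this agent transact at all?
            raise PermissionError("agent fails behavioral fitness")
        if not verify_delegation(mandate):      # did a human authorize this?
            raise PermissionError("no valid delegation of authority")
        confirmation = run_commerce_protocol(order)  # merchant interaction
        return settle_payment(confirmation)          # funds movement

    # Trivial stubs so the sketch runs end to end.
    def behavioral_trust_check(agent): return agent.get("trust_score", 0.0) >= 0.75
    def verify_delegation(mandate): return mandate.get("signed", False)
    def run_commerce_protocol(order): return {"order": order, "accepted": True}
    def settle_payment(confirmation): return {"settled": confirmation["accepted"]}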

Each layer depends on the integrity of the layers above it. The current stack is being built from the bottom up. Payment rails exist. Commerce protocols are being standardized. Identity and delegation frameworks shipped this year. The behavioral trust layer, which logically sits at the top, remains unbuilt in any standardized form.

TrustMyBot is building this layer as a private certification program. Our specification, the ACTS Standard (v0.6, draft), defines behavioral requirements across three categories: fiduciary integrity, ethical conduct, and operational security. Agents are scored continuously through peer audit, spot audit, and adversarial testing. The resulting trust score determines transaction authority through a tiered system with hard spending ceilings.
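
As one hypothetical illustration of the tiered mechanism (the actual ACTS v0.6 tier boundaries and ceilings are not reproduced here, and these numbers are purely illustrative):

    # Purely illustrative tiers: (minimum trust score, per-transaction ceiling).
    TIERS = [
        (0.90, 10_000.00),
        (0.75, 1_000.00),
        (0.50, 100.00),
    ]

    def spending_ceiling(trust_score):
        """Map a continuously updated trust score to a hard spending ceiling."""
        for floor, ceiling in TIERS:
            if trust_score >= floor:
                return ceiling
        return 0.0   # below the lowest tier: no transaction authority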

We are not proposing that NIST build behavioral trust certification. We are proposing that NIST recognize it as a necessary component of the agent security ecosystem and establish the guidelines that make private certification interoperable, auditable, and resistant to gaming.

Engagement

We would welcome participation in the upcoming listening sessions hosted by the Center for AI Standards and Innovation (CAISI), particularly those focused on the finance sector. Agent behavioral risk in financial transactions is where the consequences of getting this wrong are most immediate and most measurable.

We are also prepared to contribute to the NCCoE concept paper on Software and AI Agent Identity and Authorization (comment deadline April 2, 2026), specifically on how behavioral trust attestations can be integrated into identity credential frameworks.

Submitted via regulations.gov, Docket NIST-2025-0035