The State of Agent Trust on Base — Q2 2026

Six months ago, x402 was a draft HTTP extension nobody had used in production. Today it has settled 165 million+ payments, with roughly $50M in volume and ~480k agents touching the protocol. ERC-8004, the agent identity standard, has 62,966+ identity transactions on Base alone, and 8004scan.io is now indexing 189,634 agents cross-chain across Base, Celo, BNB, Billions, and Abstract. The agent economy is real, fast, and getting faster.

We've spent the last month building and operating AgentRadar Verify, the on-chain meta-trust oracle for that economy. We aggregate six independent signals into one composite trust score and write the results to Base mainnet as EAS attestations. As of the date of this post we've scored 37 unique wallets across staging and production, written 5 mainnet attestations, and seeded a 272-wallet scam database from five independent sources.

The numbers are small but they're real, and they tell a clear story: the agent economy is growing faster than the trust infrastructure that's supposed to keep it safe. Here's what we see.

x402 payments settled: 165M+ (~$50M volume)
Agents indexed on 8004scan: 189,634 (across 5 chains)
ERC-8004 mints on Base: 62,966+ (~5k/day rate)
Wallets in our scam DB: 272 (5 independent sources)
EAS attestations on mainnet: 5 (schema 0x962971…)
Scored agents rated RISKY: 70% (n=37, first sample)

1. The denominator: what the agent economy actually looks like

Three numbers anchor the rest of this post. They're all post-April-21, post-Agent.market-launch, post-x402-Foundation-ratification:

165M+ x402 payments settled. That's the running ecosystem-wide counter Coinbase publishes: ~$50M in cumulative volume across ~480k agents and ~100k services. Reference: x402 Foundation storefront launch (April 21, 2026).
62,966+ ERC-8004 identity transactions on Base. Pulled from BaseScan on April 26, 2026. That's a ~5,000-mints-per-day registration rate.
189,634 agents indexed cross-chain on 8004scan.io. Pulled from 8004scan on April 30, 2026.

For any agent-to-agent transaction in this ecosystem, the question "is the counterparty I'm about to pay safe?" is now load-bearing. And until very recently, nobody was answering it.

2. The numerator: what we actually saw

We ran 37 unique addresses through our 6-signal scoring pipeline — a mix of our own smoke-test wallets, the ERC-8004 contracts themselves, a handful of randomly sampled recently-registered agents, and the targets of our own validation tests on Sepolia. The sample is small. It is also, as far as we can tell, the first cross-source aggregated trust score data published for any production x402 ecosystem.

Verdict distribution (n=37)

Verdict | Score range | Count (share)
BLOCKED | < 20 | 5 (14%)
RISKY | 20–49 | 26 (70%)
CAUTION | 50–69 | 1 (3%)
VERIFIED | 70–89 | 5 (14%)
TRUSTED | 90+ | 0 (0%)

n = 37 verifications · production + staging Mongo, May 5, 2026

The first thing that jumps out: 84% of scored agents land in RISKY or BLOCKED territory (70% RISKY, 14% BLOCKED). This isn't because we're miscalibrated. The BLOCKED set is all OFAC-sanctioned wallets we deliberately ran through to validate the scam-detection signal. The RISKY set is more interesting: these are agents that registered on ERC-8004 but didn't complete the metadata round-trip. No declared endpoints, no x402 service descriptors, no reputation, no fingerprint to score against.

Put differently: the agent economy is growing at roughly 5,000 mints/day on Base alone, but the majority of those mints are name-only. They register, they don't configure, they sit. From a trust standpoint they are indistinguishable from a fresh wallet, which is indistinguishable from a scam.
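The shares quoted for the verdict distribution can be re-derived from the raw counts. The short sketch below does only that arithmetic; the counts come from the distribution above, and the helper name `share` is ours.

```python
# Verdict counts as reported in the distribution above (n = 37).
counts = {"BLOCKED": 5, "RISKY": 26, "CAUTION": 1, "VERIFIED": 5, "TRUSTED": 0}

total = sum(counts.values())


def share(*verdicts: str) -> float:
    """Percentage of the sample that falls into the given verdicts."""
    return 100 * sum(counts[v] for v in verdicts) / total


print(total)                             # 37
print(round(share("RISKY")))             # 70
print(round(share("RISKY", "BLOCKED")))  # 84
```

RISKY alone accounts for 70% of the sample; RISKY and BLOCKED together account for 84%.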

[Figure: score distribution by 20-point composite-score bucket (0–19, 20–39, 40–59, 60–79, 80–100), n = 37]

Only 1 of 37 reached the 80+ bucket. That's our own Agent #46757, by construction (we wired up the metadata properly because we wrote the metadata format). Every other agent we scored landed below 80, most of them in the unconfigured-or-suspect range.

3. The scam database — what's in it, where it came from

Trust scoring is a positive-signal exercise. Scam detection is the negative-signal complement. We seeded our scam database from five independent public sources, normalized their severity classifications into our schema, and mirrored the entries across both chains we operate on. The composition matters because it tells you what threats are actually documented:

Scam DB composition (272 entries)

Source | Origin | Wallets | Share
OFAC SDN | US Treasury sanctions | 70 | 26%
x402 Honeypot Research | dev.to/afx 20k-endpoint probe | 60 | 22%
Community Blacklist | MetaMask + others | 70 | 26%
Exploit Tracker | Public exploit DBs | 48 | 18%
ClawHavoc IOC | Snyk Q1 2026 research | 24 | 9%

Three observations. First, OFAC SDN sanctions are tied with the community blacklist as the largest bucket (70 wallets each). The OFAC entries are the legally flagged ones: North Korea, Lazarus Group affiliates, Tornado Cash routers, sanctioned mixers. Anyone interacting with them is exposed under US Treasury rules, agent or human.

Second, the x402 honeypot research bucket (60 wallets total across high, medium, and critical severities) is the youngest and most ecosystem-specific. It comes from the dev.to/afx investigation that probed 20,338 x402 endpoints and found 161 honeypots. Some advertise $4,521,000 per call. These are the traps designed for autonomous agents that don't check before they pay.

Third, the ClawHavoc IOC bucket (24 wallets) is the residue of Q1 2026's major MCP skill-poisoning campaign. The wallets that received exfiltrated API keys from compromised ClawHub skills are now permanent fixtures of any responsible scam database. We're not the first to compile these — Snyk and various EDR vendors have versions — but we're the first to expose them through a public x402 API that any agent can call inline.
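To make the lookup side of this concrete, here is a minimal in-memory sketch of how entries from several sources could be normalized and queried. It is not our production schema: the `ScamEntry` fields, the `ScamDB` class, and the `"critical"` severity label are illustrative assumptions. The one address used is the OFAC-sanctioned reference wallet from section 5, with the `ofac-sdn` source label that the API reports for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScamEntry:
    address: str   # stored lowercase so lookups ignore checksum casing
    source: str    # e.g. "ofac-sdn"
    severity: str  # hypothetical normalized severity label


def normalize(address: str) -> str:
    """Trim whitespace and lowercase so all casings map to one key."""
    return address.strip().lower()


class ScamDB:
    """Toy in-memory store keyed by normalized address."""

    def __init__(self) -> None:
        self._entries: dict[str, ScamEntry] = {}

    def add(self, address: str, source: str, severity: str) -> None:
        key = normalize(address)
        self._entries[key] = ScamEntry(key, source, severity)

    def lookup(self, address: str) -> Optional[ScamEntry]:
        return self._entries.get(normalize(address))


db = ScamDB()
db.add("0xd90e2f925DA726b50C4Ed8D0Fb90Ad053324F31b", "ofac-sdn", "critical")

hit = db.lookup("0xD90E2F925DA726B50C4ED8D0FB90AD053324F31B")
print(hit.source if hit else "not listed")  # ofac-sdn
```

Normalizing on both write and read matters because the same wallet shows up with different checksum casing across source lists.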

4. Signal averages — what we actually rely on

Our six signals averaged out as follows across the production-database sample (n = 9). These are the raw per-signal numbers before composite weighting; they tell you what data is and isn't available across a typical-mix sample of agents.

Signal | Weight | Coverage | Average
Scam Detection | 20% | high coverage | 95/100
External (GoPlus + activity) | 10% | decisive | 70/100
Health (HTTP probe) | 20% | sparse: endpoints rarely declared | 49/100
Reputation (ERC-8004) | 15% | sparse: feedback empty | 42/100
Fidelity (metadata truthfulness) | 10% | sparse: metadata empty | 40/100
Identity (ERC-8004) | 25% | sparse: registration but no detail | 36/100

Per-signal averages · production-DB sample, n = 9

The pattern: scam detection and external risk are the densest signals (high coverage, decisive output), while identity, reputation, fidelity, and health are sparse. A typical agent has not registered enough on-chain metadata to score above the "neutral 50" baseline on identity, has zero feedback in the Reputation Registry (so no signal to read), declares no endpoint (so health is unprobeable), and has no metadata round-trip to score for fidelity.
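As a rough illustration of how much each signal contributes, the sketch below applies the published weights to the per-signal averages above. It assumes the composite is a plain linear weighted sum, which this post does not spell out, and it ignores the hard overrides. The result is a weighted mean of averages, not the average composite score of the sample.

```python
# Weights in percent and per-signal averages, both taken from this post.
WEIGHTS = {
    "identity": 25,
    "scam_detection": 20,
    "health": 20,
    "reputation": 15,
    "fidelity": 10,
    "external": 10,
}
AVERAGES = {
    "identity": 36,
    "scam_detection": 95,
    "health": 49,
    "reputation": 42,
    "fidelity": 40,
    "external": 70,
}

# Points each signal contributes to a 0-100 total under a linear weighting.
contributions = {name: WEIGHTS[name] * AVERAGES[name] / 100 for name in WEIGHTS}
weighted = sum(WEIGHTS[name] * AVERAGES[name] for name in WEIGHTS) / 100

print(contributions["scam_detection"])  # 19.0
print(contributions["identity"])        # 9.0
print(weighted)                         # 55.1
```

Under that assumption, scam detection supplies 19 of the 55.1 points, while identity, the heaviest-weighted signal, supplies only 9.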

This is the gap to close. Agent metadata as a primitive is undersupplied. The few agents that go to the trouble of populating it stand out immediately — score 80+, instant VERIFIED. The bar is low because almost no one's clearing it.

5. The two-million-dollar test case

Our most cited verification target so far is 0xd90e2f925DA726b50C4Ed8D0Fb90Ad053324F31b, an OFAC-sanctioned wallet we use as the reference "known-bad" in demos. When you call:

curl 'https://api.vvpro.ai/verify?target=0xd90e2f925DA726b50C4Ed8D0Fb90Ad053324F31b'

you get back, in ~800ms, a composite score of 5/100, verdict BLOCKED, and a risk flag identifying the source as ofac-sdn. Cost: free at the rate-limited tier, or $0.005 USDC at the x402-paid tier. If an agent about to pay $4.5M to a honeypot runs this check first, the loss doesn't happen.

The whole product reduces to that sentence.

6. What we've written on-chain so far

Five mainnet EAS attestations to date, schema 0x962971...297c. Each carries the composite score, the per-signal JSON, the methodology fingerprint, and an IPFS-pinned evidence URI. They can't be silently altered once written, they stay valid until expiry (we set 90 days), they're independently auditable, and we can revoke one if we ever need to retract it.
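The fields and the 90-day expiry described above can be modeled locally as follows. This is only a sketch of the record shape, not the EAS schema encoding or the EAS SDK: the class and field names are assumptions, and the fingerprint and URI values are placeholders. The example date and score are those of the first attestation (May 4, 2026, score 80).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

VALIDITY = timedelta(days=90)  # expiry window stated in this post


@dataclass(frozen=True)
class AttestationRecord:
    composite_score: int
    signals: dict                  # per-signal JSON payload
    methodology_fingerprint: str   # placeholder value in the example below
    evidence_uri: str              # placeholder value in the example below
    issued_at: datetime
    revoked: bool = False

    @property
    def expires_at(self) -> datetime:
        return self.issued_at + VALIDITY

    def is_current(self, now: datetime) -> bool:
        """Usable only if it has been neither revoked nor expired."""
        return not self.revoked and now < self.expires_at


record = AttestationRecord(
    composite_score=80,
    signals={},
    methodology_fingerprint="example-fingerprint",
    evidence_uri="ipfs://example",
    issued_at=datetime(2026, 5, 4, tzinfo=timezone.utc),
)

print(record.expires_at.date())                                     # 2026-08-02
print(record.is_current(datetime(2026, 6, 1, tzinfo=timezone.utc)))  # True
print(record.is_current(datetime(2026, 8, 2, tzinfo=timezone.utc)))  # False
```

A consumer should check both conditions: an unexpired attestation may still have been revoked.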

The first attestation is UID 0xf172966c…562db71, written on May 4, 2026. It's our own self-attestation as Agent #46757 — score 80, VERIFIED. We've been writing attestations daily since.

7. The methodology argument

AgentRadar isn't the only trust signal in the ecosystem. AgentStamp ships endorsement scores and W3C verifiable credentials. ScoutScore measures endpoint fidelity for x402 services. ThoughtProof verifies output correctness. Each is a single-axis measurement, and each is genuinely useful in its own dimension.

Our argument is composition. Single-axis trust is fragile because a determined adversary only has to game one axis. A scam wallet that buys some endorsements clears AgentStamp. An exfiltration endpoint that returns valid JSON clears ScoutScore. An agent that hits the right benchmark passes ThoughtProof. The aggregator that combines all three with the on-chain identity registry, the reputation registry, the live endpoint probe, and a published scam database is the layer that makes the whole graph hard to game.

That's us. We're not trying to win on any single signal — we're explicitly the meta-layer that combines signals from sources we don't control. The whole reason we exist is so that no single source has to be trusted, including us.

8. What's reproducible

Everything in this post is reproducible. Specifically:

The scam database is queryable at GET /admin/scam-wallets?search=<address> (operator key required for admin reads; the same data flows into every public /verify response).
Every attestation is verifiable on EASScan by anyone, in real time, against the public schema.
The MCP client is open source on GitHub, with the full tool surface and config wired into Claude, Cursor, and any MCP host.
The MCP server is published as @agentradar/mcp on npm; install it in any MCP-compatible host (Cursor, Claude Desktop, Continue, Cline) with one line.

9. What's next

The next 90 days, in the order we plan to ship:

Multi-chain reads. ERC-8004 deploys to vanity addresses on every chain it supports, so adding Celo, BNB Smart Chain, Billions, and Abstract is a config change, not a rewrite. Targeting all five chains within the next two weeks.
Pre-flight crawler. Today we score on demand. Phase 3 adds a Cloudflare Cron Trigger that pre-scores every newly-minted ERC-8004 agent within minutes of registration. The cache always has an answer ready.
Attestation renewal. Our 90-day attestation expiry is a feature, not a bug — it means agents have to keep their score current. Renewal flow lands as a recurring x402 charge.
EU AI Act compliance feed. W3C Verifiable Credentials export of every attestation. We're aware of the regulatory direction and are positioning for it.
Insurance underwriting feed. When the first agent-economy insurance product ships, the underwriter is going to need a scoring oracle. We're building to be that oracle.

10. How to use this

If you're building agents, three concrete moves:

Pre-flight every payment. Before your agent settles an x402 invoice, call GET /score/<recipient>. Free, cached, sub-100ms. If the verdict is BLOCKED, refuse to settle and surface the reason.
Attest your own agents. If you operate an agent that handles real value, get it attested. The POST /attest call costs $5 USDC and produces an on-chain receipt that any third party can verify. It's the cheapest brand asset you can buy in this ecosystem.
Embed the badge. If you operate any kind of agent listing or marketplace surface, drop in a single <img> tag. Free, cached, and it tells your users at a glance whether the listing they're about to interact with is one of the 70% they should worry about, or one of the 14% that's verified.
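The pre-flight step from the first item above might look like the sketch below. Several things here are assumptions rather than documented behavior: that /score lives on the same host as the /verify example in section 5, and that the JSON response carries `verdict` and `reason` fields. Refusing to pay when no verdict comes back (failing closed) is our design choice for the sketch.

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.vvpro.ai"  # host from the curl example; assumed to serve /score too


def fetch_score(recipient: str, timeout: float = 5.0) -> dict:
    """GET /score/<recipient> and decode the JSON body (response shape assumed)."""
    with urlopen(f"{API_BASE}/score/{recipient}", timeout=timeout) as resp:
        return json.load(resp)


def should_settle(result: dict) -> tuple[bool, str]:
    """Decide whether to pay, returning (ok, human-readable reason)."""
    verdict = str(result.get("verdict") or "").upper()
    if not verdict:
        # Fail closed: a missing verdict is treated like a failed check.
        return False, "no verdict returned; refusing to settle"
    if verdict == "BLOCKED":
        reason = result.get("reason", "no reason given")  # hypothetical field
        return False, f"recipient is BLOCKED: {reason}"
    return True, f"verdict {verdict}"


# Typical call site, just before settling an x402 invoice:
#   ok, why = should_settle(fetch_score(recipient))
#   if not ok:
#       raise RuntimeError(why)
```

Keeping the decision in a pure function makes the policy easy to test and to tighten later, for example by also refusing RISKY recipients above some payment size.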

Try it

Verify any agent in 200 ms.

Free for the first 100 calls a day, $0.005 per call after via x402.


Methodology notes

Numbers in this post were pulled from our production MongoDB on May 5, 2026 (ecosystem figures dated April 21–30, 2026 from x402 Foundation, BaseScan, and 8004scan public statistics). The 37-verification sample is small — we're publishing it now because the directional signal is clear and we wanted on-record baseline data before the next order of magnitude. We'll publish a Q3 update with n=10,000+ once the pre-flight crawler ships.

Signal weights as of this writing: identity 25%, scam detection 20%, fidelity 10%, reputation 15%, health 20%, external 10%. Hard overrides: any scam-detection score of 0 caps the composite at 5; any health score of 0 caps it at 30. Verdict thresholds: ≥90 TRUSTED, ≥70 VERIFIED, ≥50 CAUTION, ≥20 RISKY, <20 BLOCKED.
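The hard overrides and verdict thresholds above translate directly into code. The sketch below takes an already-weighted score as input; the function names are ours, and the order in which the two caps are applied is an assumption (with `min` it does not change the result).

```python
def apply_overrides(weighted: float, scam_detection: float, health: float) -> float:
    """Apply the two hard caps described above to a weighted score."""
    score = weighted
    if scam_detection == 0:
        score = min(score, 5)   # a scam-detection score of 0 caps the composite at 5
    if health == 0:
        score = min(score, 30)  # a health score of 0 caps the composite at 30
    return score


def verdict_for(score: float) -> str:
    """Map a composite score to a verdict using the published thresholds."""
    if score >= 90:
        return "TRUSTED"
    if score >= 70:
        return "VERIFIED"
    if score >= 50:
        return "CAUTION"
    if score >= 20:
        return "RISKY"
    return "BLOCKED"


print(verdict_for(80))                           # VERIFIED
print(verdict_for(apply_overrides(72, 0, 80)))   # BLOCKED
print(verdict_for(apply_overrides(72, 95, 0)))   # RISKY
```

The second example shows why a known scam wallet cannot score its way out: whatever the other signals say, the cap at 5 lands it in BLOCKED, which matches the 5/100 result reported for the reference wallet in section 5.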