The whole point of a virtual AI key is that it's safer than the raw vendor key. It's scoped, attributed, and revocable. But it still has one classical problem: it's a bearer credential. If it leaks, anyone holding the string can replay it.

Visionality v2.0 changed that. Spend tokens now carry an envelope v2 binding key — a second factor minted server-side, shown to the operator once, and stored on the server only as AES-256-GCM ciphertext bound to the token id, the binding algorithm, and the org id (the AAD). A leaked token without that key gets rejected with a structured 401 — and every rejection lands on request_logs.binding_status for the auditor.

This is the same posture as a credit card with CVC, except the CVC is rotated per-token and never crosses the wire after the issue moment.

The wire format

Three headers, all printable ASCII, all idempotent within a one-minute window:

Authorization: Bearer spt_<id>.<material>
X-Acc-Binding:  <base64url(hmac_sha256(binding_key, input))>
X-Acc-Caller:   <opaque>            # reserved for v3 caller-context

The binding proof is built like this:

binding_input = "<token_id>|<minute_bucket>|<body_sha256>"
binding_proof = base64url( hmac_sha256(binding_key, binding_input) )

Three things to notice. First, the proof binds what is being asked (the body sha256), so tampering with the body invalidates the proof. Second, the minute bucket aligns retries and gives ±60 seconds of clock skew tolerance. Third, the gateway tracks (token_id, minute_bucket, body_sha256) in a replay cache, so the same proof can't be reused inside the window.
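Here is a minimal client-side sketch of that construction in Python. The wire format above doesn't pin down every encoding detail, so treat these as assumptions: the minute bucket is Unix time divided by 60, the body hash is lowercase hex, and the base64url output is unpadded. The function name binding_proof is ours, not part of any SDK.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def binding_proof(binding_key: bytes, token_id: str, body: bytes,
                  now: Optional[float] = None) -> str:
    """Build a value for the X-Acc-Binding header (illustrative sketch)."""
    ts = time.time() if now is None else now
    minute_bucket = int(ts // 60)                    # assumption: Unix minutes
    body_sha256 = hashlib.sha256(body).hexdigest()   # assumption: lowercase hex
    binding_input = f"{token_id}|{minute_bucket}|{body_sha256}"
    mac = hmac.new(binding_key, binding_input.encode("utf-8"),
                   hashlib.sha256).digest()
    # assumption: base64url without "=" padding
    return base64.urlsafe_b64encode(mac).rstrip(b"=").decode("ascii")
```

Two requests with the same token, body, and minute produce the same proof. Change any one of the three and the proof changes.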

A leaked token without the key fails at the first check: no proof, no validation, structured 401 logged.

Why AAD?

The ciphertext on disk is bound to token_id + binding_alg + org_id as Additional Authenticated Data. That means:

A row copied to another token id won't decrypt.
A row "upgraded" to a stronger alg without re-issuing won't decrypt.
A row migrated across orgs won't decrypt.

The data encryption key (DEK) rotates per-org and never crosses the wire after the issue moment.
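To see why those three copies fail, here is a small sketch using the AESGCM class from the third-party Python cryptography package. The AAD encoding (fields joined with "|"), the nonce-prefixed blob layout, and the function names are all assumptions for illustration; the stored format is not described in this post.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def aad_for(token_id: str, binding_alg: str, org_id: str) -> bytes:
    # assumption: simple "|" join; a real encoding should be unambiguous
    return "|".join((token_id, binding_alg, org_id)).encode("utf-8")


def seal(dek: bytes, binding_key: bytes,
         token_id: str, binding_alg: str, org_id: str) -> bytes:
    nonce = os.urandom(12)  # 96-bit nonce, must be fresh for every encryption
    ct = AESGCM(dek).encrypt(nonce, binding_key,
                             aad_for(token_id, binding_alg, org_id))
    return nonce + ct       # assumption: nonce stored in front of the ciphertext


def unseal(dek: bytes, blob: bytes,
           token_id: str, binding_alg: str, org_id: str) -> bytes:
    nonce, ct = blob[:12], blob[12:]
    # Raises InvalidTag if any AAD field differs from the one used in seal().
    return AESGCM(dek).decrypt(nonce, ct,
                               aad_for(token_id, binding_alg, org_id))
```

Calling unseal with a different token id, algorithm, or org id raises InvalidTag even though the DEK and ciphertext are untouched, which is the property the three cases above rely on.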

What the auditor reads

We added two columns to request_logs for this:

binding_status — one of ok | skipped | no_proof | bad_proof | replay | alg_mismatch | expired_bucket. The auditor's "did the second factor enforce?" question is a single SELECT.
agent_sub — autonomous-agent identity from the envelope claim, so spend can be rolled up by agent and by agent_owner (mirrored from agent_identity.owner at write time, no JOIN at read time).
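As a rough illustration of the auditor's view, the sketch below uses an in-memory SQLite table as a stand-in for request_logs. Only binding_status, agent_sub, and agent_owner come from the real schema; the cost_usd column and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE request_logs ("
    " binding_status TEXT, agent_sub TEXT, agent_owner TEXT,"
    " cost_usd REAL)"  # cost_usd is a hypothetical column
)
conn.executemany(
    "INSERT INTO request_logs VALUES (?, ?, ?, ?)",
    [
        ("ok", "agent:build-bot", "alice", 1.25),
        ("ok", "agent:build-bot", "alice", 0.75),
        ("bad_proof", "agent:build-bot", "alice", 0.0),
        ("ok", "agent:triage", "bob", 2.0),
        ("replay", "agent:triage", "bob", 0.0),
    ],
)

# "Did the second factor enforce?" as a single SELECT.
status_counts = dict(conn.execute(
    "SELECT binding_status, COUNT(*) FROM request_logs GROUP BY binding_status"
))

# Spend rolled up by owner and agent with no JOIN, since agent_owner
# is already stored on the row.
spend_by_owner = conn.execute(
    "SELECT agent_owner, agent_sub, SUM(cost_usd) FROM request_logs"
    " WHERE binding_status = 'ok'"
    " GROUP BY agent_owner, agent_sub"
    " ORDER BY agent_owner, agent_sub"
).fetchall()
```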

The /compliance dashboard surfaces both as posture cards: green when no rejection events, amber under 1% rejection rate, red at or above 1%. Six SOC 2 controls now contribute live evidence — including this one (CC6.7 secure data transmission).
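The colour thresholds reduce to a few lines. This is a sketch of the rule as stated, not the dashboard's code; the function name and the choice to pass in raw counts are assumptions, and which binding_status values count as a rejection is left to the caller.

```python
def posture(rejections: int, total: int) -> str:
    """Map rejection counts to a posture-card colour (illustrative)."""
    if rejections == 0:
        return "green"
    # Integer comparison avoids float rounding exactly at the 1% boundary:
    # rejections / total < 0.01  is the same as  rejections * 100 < total
    if rejections * 100 < total:
        return "amber"
    return "red"
```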

What this replaces

Before Token Authority v2, a leaked spend token was a plain bearer credential. We caught it at the budget (the cap is still signed into the envelope, so a leaked-but-exhausted token is a 402, not a four-figure surprise). But the binding-key second factor changes the security ceiling. A leaked token with budget left is now a 401, not a transfer of resources.

Same posture as TLS, FIDO2, and good API design generally. We brought it to spend tokens.

What's not in v2

We intentionally left two things for v3 and later:

Hardware-rooted binding (KMS / TPM / Secure Enclave) — TA-17.20 onward. The DEK provider abstraction is there; only the implementation changes.
Caller-context binding (agent_id, project_id, GitHub OIDC sub, source IP) — TA-17.10 onward. The X-Acc-Caller header is reserved.

Everything in scope today is shipped, tested at every layer (envelope verify, column re-read, CLI input, dashboard URL guard — four boundaries enforce the same constraint), and applied to Neon prod. The full envelope v1 spec is in docs/SPEC-token-envelope-v1.md if you want the wire details.

Why now

The autonomous-agent identity story turns this from a security-team feature into a finance-team feature. With binding-v2 + agent_sub + agent_owner all wired through to the chargeback CSV, an FP&A analyst can split AI spend by autonomous-agent identity AND its registered owner, with the OWASP CSV-injection guard on every cell. That's not "AI cost management." That's accounting.
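For readers who haven't met the CSV-injection guard before, here is a minimal sketch of the usual OWASP-style mitigation: any cell that starts with a formula trigger character gets a leading single quote so a spreadsheet treats it as text. The function names and column headers are hypothetical, and the real export may differ.

```python
import csv
import io

# Leading characters that spreadsheets may interpret as a formula.
_FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def guard_cell(value) -> str:
    text = str(value)
    if text.startswith(_FORMULA_PREFIXES):
        return "'" + text  # neutralise by forcing the cell to plain text
    return text


def chargeback_csv(rows) -> str:
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
    writer.writerow(["agent_sub", "agent_owner", "spend_usd"])  # hypothetical headers
    for row in rows:
        writer.writerow([guard_cell(cell) for cell in row])
    return buf.getvalue()
```

Note that this simple guard also prefixes negative numbers, since they start with "-". That is acceptable for non-negative spend figures but worth handling separately if credits or refunds appear in the export.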

If you're rolling out Claude Desktop, VS Code MCP, or any agent that calls AI on someone's behalf, you want the binding key. The leaked-token replay window closes the moment you turn it on.

Token Authority v2: A Second Factor for AI API Keys