Case study

AestheticIQ.ai: per-client AI rebill where a stolen dashboard can't drain a single account

How a multi-tenant AI-aesthetics SaaS resells AI to its end clients with per-tag chargeback, hard quotas, and a second-factor binding key that means a compromised UI or API can't replay tokens. A preview case study — technical flow live; quotes and logos awaiting customer sign-off.

By Chris Therriault · 9 min read
Preview — pending customer sign-off

AestheticIQ.ai sells AI-driven skin analysis, treatment-planning, and after-visit education to medical spas, dermatology practices, and clinical aesthetics groups. Each AestheticIQ customer — Dr. Patel's practice in Austin, Glow Wellness in Charleston, a thirty-location regional chain — is itself a separate billing entity with its own rate plan and its own ToS.

When the founders walked through the math on their AI cost, two things were true. One, they needed per-end-client cost attribution accurate to the cent, because they were rebilling. Two, they had a security problem that kept the CTO up at night: if their dashboard or one of their internal APIs got compromised, an attacker could siphon AI spend against any client at platform cost — a five-figure-per-day fraud risk before anyone noticed.

AestheticIQ.ai turned to Visionality to solve both at once.

The wedge

Standard cloud cost-control products solve attribution. Standard secrets-management products solve key safety. Neither of them was built for the case where the cost-control system itself is also the resell system. The moment you put a cost-attribution dashboard in front of an operator who can also issue tokens, you've concentrated risk that wasn't concentrated before.

AestheticIQ needed:

  1. Per-client tagging that flows from their backend through every AI call.
  2. Hard quotas that drop the call at the gateway, not after the fact.
  3. A token model where stealing the token does not let you use the token.

The first two were Visionality's bread and butter. The third is the Token Authority v2 envelope — a second-factor binding key that lives only inside AestheticIQ's own backend.

The user flow

Setup (one onboarding week)

AestheticIQ's platform team did the following:

  • Created a Visionality project_id per end-client: client_drpatel, client_glow, client_lighthouseaesthetics, and so on.
  • Set allocation-rule multipliers per client to encode the per-client markup their commercial team had already negotiated. Dr. Patel's practice is on the early-adopter plan at 1.15×; Glow is on the standard plan at 1.40×; the regional chain is on a volume plan at 1.08×.
  • Provisioned the binding DEK on their own production cluster — a 32-byte secret rotated quarterly, scoped per env, never copied to a workstation.
  • Wired their AI-call-issuing backend to mint spend-tokens via the Visionality Token Authority with project_id set to the end-client's id and the binding key returned to AestheticIQ's backend memory once at issue time.

The end-client identification was the operationally interesting bit: AestheticIQ's session middleware already knew which paying customer a request belonged to. They added one line to wire that into the token-issue call. From that moment, every downstream AI call carried the client tag without a developer needing to remember it.
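The one-line wiring described above might look roughly like the following sketch. Everything in it is hypothetical: the Session type, the PRACTICE_TO_PROJECT mapping, and the mint callable stand in for AestheticIQ's session middleware and the Token Authority client, whose real interfaces are not shown in this case study.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Session:
    """Hypothetical session object produced by the existing middleware."""
    user_id: str
    practice_id: str


# Hypothetical lookup from the paying practice to its Visionality project_id.
PRACTICE_TO_PROJECT: Dict[str, str] = {
    "drpatel": "client_drpatel",
    "glow": "client_glow",
    "lighthouse": "client_lighthouseaesthetics",
}


def project_id_for(session: Session) -> str:
    """Resolve the end-client tag; fail closed if the practice is unknown."""
    try:
        return PRACTICE_TO_PROJECT[session.practice_id]
    except KeyError:
        raise PermissionError(f"no project_id for practice {session.practice_id!r}")


def mint_for_request(session: Session, mint: Callable[..., dict]) -> dict:
    """The 'one line': pass the session's project_id into the token-issue call."""
    return mint(project_id=project_id_for(session))
```

Failing closed on an unknown practice means an untagged call cannot be issued by accident, which matches the goal of developers not having to remember the tag.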

The first call

A patient in Dr. Patel's practice opens the AestheticIQ web app and asks for an after-visit summary. AestheticIQ's backend:

  1. Authenticates the patient and resolves the practice → client_drpatel.
  2. Calls Visionality's Token Authority to mint a short-TTL spend-token bound to project_id = client_drpatel. The Authority returns the token id, the token material, and a one-time binding key.
  3. AestheticIQ's backend stores the binding key in memory only, scoped to this single request.
  4. Sends the user's question to Visionality's gateway with three headers:

     Authorization: Bearer spt_<id>.<material>
     X-Acc-Binding:  <base64url(hmac_sha256(binding_key, "<token_id>|<minute>|<body_sha256>"))>
     X-Acc-Caller:   client_drpatel
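As a minimal sketch, the three headers could be built on the backend like this. The header names and the "<token_id>|<minute>|<body_sha256>" message layout come from the flow above; the rest is assumed for illustration: that <minute> is the Unix time divided by 60, that the body hash is hex-encoded, and that the base64url output is unpadded. The function names are hypothetical.

```python
import base64
import hashlib
import hmac
import time
from typing import Dict, Optional


def binding_proof(binding_key: bytes, token_id: str, body: bytes,
                  now: Optional[float] = None) -> str:
    """HMAC-SHA256 over "<token_id>|<minute>|<body_sha256>", base64url-encoded."""
    # Assumption: the minute bucket is whole minutes since the Unix epoch.
    minute = int((time.time() if now is None else now) // 60)
    # Assumption: the body digest is hex-encoded inside the signed message.
    body_sha256 = hashlib.sha256(body).hexdigest()
    message = f"{token_id}|{minute}|{body_sha256}".encode("utf-8")
    digest = hmac.new(binding_key, message, hashlib.sha256).digest()
    # Assumption: base64url without '=' padding.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_headers(token_id: str, material: str, binding_key: bytes,
                  body: bytes, caller: str,
                  now: Optional[float] = None) -> Dict[str, str]:
    """Assemble the three headers sent to the gateway for one request."""
    return {
        "Authorization": f"Bearer spt_{token_id}.{material}",
        "X-Acc-Binding": binding_proof(binding_key, token_id, body, now),
        "X-Acc-Caller": caller,
    }
```

Because the proof is keyed on the binding key and covers the body hash, a captured Authorization header on its own is not enough to produce a valid X-Acc-Binding for a different body or a later minute.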

The gateway:

  • Validates the binding proof. The proof covers the SHA-256 of the request body, so tampering with the body invalidates the proof.
  • Tracks (token_id, minute_bucket, body_sha256) in a replay cache. The same proof can't be reused inside the window.
  • Routes the request to Anthropic.
  • Writes a request_logs row with project_id = client_drpatel, cost_usd = 0.0021, binding_status = ok.
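The validation and replay-cache steps above can be sketched as follows. This is an illustration under stated assumptions, not Visionality's gateway code: it assumes the gateway can obtain the key needed to verify the proof for a given token, that only the current minute bucket is accepted, and that the message encoding is hex body hash plus unpadded base64url. The status strings "invalid" and "replay" are hypothetical; only "ok" appears in the flow above.

```python
import base64
import hashlib
import hmac
from typing import Set, Tuple


def expected_proof(binding_key: bytes, token_id: str, minute: int, body: bytes) -> str:
    """Recompute the proof over "<token_id>|<minute>|<body_sha256>"."""
    body_sha256 = hashlib.sha256(body).hexdigest()
    message = f"{token_id}|{minute}|{body_sha256}".encode("utf-8")
    digest = hmac.new(binding_key, message, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


class BindingVerifier:
    """In-memory replay cache keyed on (token_id, minute_bucket, body_sha256)."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, int, str]] = set()

    def check(self, binding_key: bytes, token_id: str, body: bytes,
              proof: str, now: float) -> str:
        minute = int(now // 60)
        want = expected_proof(binding_key, token_id, minute, body)
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(want.encode("utf-8"), proof.encode("utf-8")):
            return "invalid"
        # Drop entries from earlier minute buckets; they can no longer verify.
        self._seen = {e for e in self._seen if e[1] >= minute}
        entry = (token_id, minute, hashlib.sha256(body).hexdigest())
        if entry in self._seen:
            return "replay"
        self._seen.add(entry)
        return "ok"
```

A production gateway would likely tolerate some clock skew and keep the cache in shared storage; this sketch accepts only the current minute and keeps state in process.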

The response comes back to the patient's screen. The cost lands on Dr. Patel's invoice line. AestheticIQ never sees the raw Anthropic key — the gateway holds it.

The "what if you stole everything" thought experiment

Here is what AestheticIQ's CTO can now tell their auditors. Suppose an attacker fully compromises:

  • The AestheticIQ web app (XSS, supply chain — pick your poison).
  • The AestheticIQ admin dashboard.
  • One of the AestheticIQ internal APIs.

The attacker exfiltrates every spend-token sitting in transit, every session cookie, every operator credential, every secret on the bastion. What can they do with the AI budget?

Nothing useful.

A spend-token without its binding key is a bearer string that can't authenticate. The gateway demands X-Acc-Binding — a per-request HMAC keyed on a secret that lives only inside AestheticIQ's backend memory. Without that key, the attacker hits a structured 401 on every request, and every rejection lands on request_logs.binding_status for the SOC 2 auditor to inspect. The auditor sees the attack as data, not as an incident report after the fact.

Customer quote — awaiting approval
The binding-key model is the only architecture I've ever seen where I can sit in front of the board and say with a straight face that a complete compromise of our app layer doesn't equal a financial bleed. That's the difference between cyber insurance at our scale being affordable and being a non-starter.
Co-founder & CTO, attribution pending sign-off

The first quota-block

Within the first month, the operational story showed up too. The Lighthouse Aesthetics regional chain had a custom on-call assistant feature that spiked usage on weekends when their providers were doing aftercare follow-ups. By Sunday evening of one weekend, they had blown through 95% of the contractual quota for the month with a week still to go.

AestheticIQ had set a hard cap on project_id = client_lighthouseaesthetics via Visionality's allocation rules. At 100% burn, the gateway started returning 402 spend_token_blocked to the AestheticIQ backend, which surfaced cleanly in the chain's admin dashboard as a budget-exceeded banner and a one-click upgrade link.
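A hard cap of this kind reduces to a check before the call is routed. The sketch below assumes a simple per-project budget record; ProjectBudget, admit, and record_spend are hypothetical names, and only the 402 status and the spend_token_blocked code are taken from the account above.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass
class ProjectBudget:
    """Hypothetical monthly budget state for one project_id."""
    cap_usd: Decimal
    spent_usd: Decimal = Decimal("0")

    @property
    def burn(self) -> Decimal:
        """Fraction of the monthly cap already consumed."""
        return self.spent_usd / self.cap_usd


def admit(budget: ProjectBudget) -> Tuple[int, str]:
    """Decide at the gateway, before any upstream call is made."""
    if budget.burn >= 1:
        return 402, "spend_token_blocked"
    return 200, "ok"


def record_spend(budget: ProjectBudget, cost_usd: Decimal) -> None:
    """Add the cost of a completed call to the running total."""
    budget.spent_usd += cost_usd
```

Checking before routing is what makes the quota "hard": a blocked request never reaches the model provider, so it never generates cost to dispute later.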

No outage. No surprise bill. No 4 AM call with the chain's COO. Customers who hit a cap either upgraded (most did) or waited for the calendar to roll. The block was a feature, not an incident.

Month-end

At month-end, AestheticIQ's billing team pulled the chargeback CSV grouped by project_id. They got, for each end-client, the raw cost, the per-client multiplier, the marked-up cost, and the per-day breakdown. The CSV went into a templated Stripe invoice. Customers paid without disputing because the line items matched what they expected from the in-app usage card they'd been watching all month.
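The per-client arithmetic behind that CSV is simple enough to sketch. The multipliers below are the ones quoted in the setup section; the raw cost rows in the usage example are made-up illustration values, and the function name and row layout are assumptions rather than the actual export format.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, Iterable, List, Tuple

# Per-client markup from the setup section.
MULTIPLIERS: Dict[str, Decimal] = {
    "client_drpatel": Decimal("1.15"),
    "client_glow": Decimal("1.40"),
    "client_lighthouseaesthetics": Decimal("1.08"),
}
CENT = Decimal("0.01")


def chargeback(rows: Iterable[Tuple[str, str]]) -> List[dict]:
    """Group raw cost by project_id, apply the multiplier, round to the cent."""
    raw: Dict[str, Decimal] = defaultdict(Decimal)
    for project_id, cost_usd in rows:
        # Decimal avoids binary floating-point drift when summing small costs.
        raw[project_id] += Decimal(cost_usd)
    lines = []
    for project_id in sorted(raw):
        multiplier = MULTIPLIERS[project_id]
        lines.append({
            "project_id": project_id,
            "raw_usd": raw[project_id].quantize(CENT, rounding=ROUND_HALF_UP),
            "multiplier": multiplier,
            "billed_usd": (raw[project_id] * multiplier).quantize(
                CENT, rounding=ROUND_HALF_UP),
        })
    return lines
```

For example, raw costs of 100.00 and 50.00 for client_drpatel sum to 150.00 and bill at 172.50 under the 1.15 multiplier. Rounding once, after summing, keeps the invoice line accurate to the cent.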

For the per-client SOC 2 evidence asks (the regional chain has a HIPAA program; the boutique providers don't), AestheticIQ scopes an evidence pack to each client by filtering the evidence collectors to the relevant org_id slice. The chain runs vis-verify offline. The bundle either checks out or it doesn't. There is no middle ground.

What we DID NOT need to build for AestheticIQ

This is the part the founders were most surprised about during scoping. These are the things they assumed would take a quarter each, and that Visionality already had on the shelf:

  • Per-client quota enforcement — already in the gateway via allocation_rules.
  • Per-client markup — already in the gateway via cost_multiplier.
  • A token model that survives a full app compromise — Token Authority v2, shipped in v2.0.
  • A way to prove to the auditor that the rejection log is append-only — the SOC 2 CC6.6 + CC7.3 collectors, signed pack, offline verifier.
  • A way to pull the per-client cost as JSON for their own admin dashboard: /api/compliance/binding-status and /api/mcp/anomalies, now Bearer-token-authenticated for headless calls.

What the marketing team is allowed to put on the website

We are publishing this as a preview case study while we wait for AestheticIQ's written sign-off on direct quotes and on the use of their logo in the case-studies grid. The technical flow above is accurate to the integration we shipped together. The pull-quotes are working drafts, and the placeholder where the logo should go is a placeholder.

If you're an AestheticIQ-shaped company — multi-tenant SaaS where AI is part of what you sell, where customer trust is part of the price, and where a stolen token is an existential question — and the flow above describes what you've been trying to architect, book a call and we'll walk through the gateway integration in under thirty minutes.

AestheticIQ.ai (logo pending)
