What is a confused-deputy attack against an AI agent?

When a managed AI agent holds an OAuth token to a SaaS API, the token does not record which user the agent is currently acting for. The agent can be tricked, via prompt injection, a poisoned document, or simple instruction-following, into acting beyond that user's authority. The SaaS API has no way to tell the difference. Proxilion makes the principal cryptographically explicit on every call.

Is Proxilion a SaaS or a paid product?

No. Proxilion is MIT-licensed, self-hosted, with no telemetry, no phone-home, and no paid tier. The repository is the product.

How does Proxilion differ from an LLM gateway or prompt firewall?

LLM gateways and prompt firewalls inspect text. Proxilion enforces cryptographic capability chains in the OAuth and HTTP path. It refuses to issue authority the user does not have, by construction, not by heuristic.

Which agents and SaaS APIs are supported?

On the agent side, Proxilion supports Anthropic's managed Claude and OpenAI's hosted assistants. On the SaaS side, Google Drive, Gmail, and Google Calendar are supported at launch. The adapter model is open.

What is a PCA and the Trust Plane?

A PCA, or Proof of Causal Authority, is a signed cryptographic statement that a specific principal is authorized to perform a specific set of operations. The first PCA in a chain is rooted at the human user, as authenticated by the identity provider. Every downstream action mints a successor PCA, which can hold the same permissions as its predecessor or fewer, never more. The Trust Plane is the service that signs and validates these PCAs, enforcing the PIC protocol's three invariants: Provenance, Identity, and Continuity.
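
The narrowing rule can be illustrated with a short sketch. It is written in Python for brevity and is not Proxilion's Rust code or the PIC reference implementation: the `Pca` class, its field names, and both functions are hypothetical, and signatures are omitted entirely so that only the three invariants remain visible.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pca:
    """Hypothetical, unsigned stand-in for a Proof of Causal Authority."""
    origin: str        # the human principal the chain is rooted at
    ops: frozenset     # operations this link is allowed to perform

def check_successor(prev: Pca, nxt: Pca) -> bool:
    """Return True only if `nxt` is a valid successor of `prev`."""
    same_origin = nxt.origin == prev.origin   # Identity: the origin never mutates
    narrowed = nxt.ops <= prev.ops            # Continuity: subset of, or equal to, prev
    return same_origin and narrowed

def check_chain(chain: list) -> bool:
    """Provenance: every link must trace back, hop by hop, to chain[0]."""
    return bool(chain) and all(
        check_successor(a, b) for a, b in zip(chain, chain[1:])
    )

# Example principals and operation names are made up for illustration.
pca_0 = Pca("alice@example.com", frozenset({"drive.read", "drive.write"}))
pca_1 = Pca("alice@example.com", frozenset({"drive.read"}))
escalated = Pca("alice@example.com", frozenset({"drive.read", "gmail.send"}))
```

Here `check_chain([pca_0, pca_1])` holds, while appending `escalated` fails because `gmail.send` was never in the predecessor's set.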

Your AI agent has too much authority. Proxilion fixes that.

Name: Proxilion
Author: Clay Good

Managed AI agents hold OAuth tokens to your most sensitive systems, but those tokens never say which user the agent is acting for. So an agent can be tricked into doing the wrong thing, on the wrong data, for the wrong person.

Proxilion is a self-hosted reverse proxy that sits in the OAuth path and refuses any action the user did not authorize. Prevention by construction, not detection after the fact.


Free forever · MIT licensed · Built in Rust · No phone-home

$ git clone https://github.com/clay-good/proxilion

$ cd proxilion && docker compose up -d --wait

$ bash scripts/smoke-pic.sh # 60 seconds to first PCA_0

Standing on the shoulders of giants

The cryptographic primitive underneath is PIC (Provenance, Identity, Continuity).

PIC is an authority protocol built on three invariants: Provenance (every action traces back to an immutable origin), Identity (that origin cannot mutate across hops), and Continuity (authority can only shrink, never grow). Proxilion uses PIC's open-source reference implementation to sign and verify every step. Credit to Nicola Gallo for designing and publishing it.

Everything else is original Proxilion work: the OAuth interception, the SaaS adapters, the read filtering, the write gating, the action stream, the killswitch, the policy engine, and the Rust proxy itself. PIC is the primitive. Proxilion is the deployable system built on it.


Three things no other tool gives you

Not a prompt firewall. Not an LLM gateway. A cryptographic enforcement plane in the OAuth path.

P₀

Every action traces to a real human

Proxilion authenticates the user at your identity provider, then signs a statement of who they are and what they can do. The agent cannot invent authority. It can only spend what the human already has.

⊑

Permissions only shrink, never grow

An agent granted read access to one document cannot upgrade itself to write access across the tenant. Each step can do equal or less than the one before it, and Proxilion refuses to sign anything more.

∎

Audit logs an auditor can trust

Every action is logged with a signature you can verify offline, years later. Hand the logs to a SOC 2, ISO 27001, or HIPAA auditor and they confirm authenticity without trusting you, the agent vendor, or Proxilion.

What Proxilion does for your org

A deployable enforcement layer your security team can install this afternoon.

OAuth interception, in the path

Proxilion intercepts the OAuth flow, swaps in its own bearer token, and stays in path for every request. The agent vendor sees a normal upstream. You see, sign, and gate every call.

Read-filtering for prompt injection

Documents from Drive, Gmail, and other upstreams are scanned before the agent sees them. Known injection patterns are stripped or quarantined. The agent cannot read the poison.
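
A minimal sketch of this kind of read filter, in Python for brevity. The two patterns and the `filter_document` function are hypothetical illustrations, not Proxilion's actual rule set, and a real filter would need far broader coverage.

```python
import re

# Hypothetical patterns, for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(prior|previous) instructions[^.\n]*", re.IGNORECASE),
]
HIDDEN_CHARS = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")  # zero-width characters

def filter_document(text: str):
    """Return (cleaned_text, findings) with matched regions stripped."""
    findings = []
    # Remove hidden characters first so they cannot split a phrase and hide it.
    if HIDDEN_CHARS.search(text):
        findings.append("hidden-unicode")
        text = HIDDEN_CHARS.sub("", text)
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            findings.append("instruction-override")
            text = pattern.sub("[removed]", text)
    return text, findings
```

Stripping zero-width characters before pattern matching matters: otherwise a phrase split by an invisible character would slip past the text patterns.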

Write-gating with a human in the loop

External emails, mass deletes, and shares outside the org are blocked until a real human approves through Slack or a ticket. Configurable per sender, domain, and action class.
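
The recipient check behind such a gate might look roughly like the following Python sketch. The allow-list contents and both function names are assumptions made for illustration; the approval step through Slack or a ticket is out of scope here.

```python
from email.utils import getaddresses

ALLOWED_DOMAINS = {"example.com"}   # hypothetical org allow-list

def external_recipients(to_header: str, allowed=ALLOWED_DOMAINS):
    """Return recipient addresses whose domain is not on the allow-list."""
    external = []
    for _name, addr in getaddresses([to_header]):
        # An unparseable address yields an empty domain, which is never
        # on the allow-list, so the check fails closed.
        domain = addr.rpartition("@")[2].lower()
        if domain not in allowed:
            external.append(addr)
    return external

def gate_send(to_header: str) -> str:
    """'forward' internal mail; 'hold' anything with an external recipient."""
    return "hold" if external_recipients(to_header) else "forward"
```

A message addressed only to `example.com` is forwarded, while one that also includes an outside address is held for approval.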

Action stream and killswitch

Every action streams to a live dashboard and your SIEM as it happens. One click revokes every capability tied to an agent or user. The next request returns 403 within one request cycle.
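
Conceptually, a killswitch is a revocation list consulted on every request. The Python sketch below assumes an in-memory set and a hypothetical `Killswitch` class; how Proxilion actually stores and propagates revocations is not described here.

```python
class Killswitch:
    """Hypothetical in-memory revocation list, consulted on every request."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, principal: str) -> None:
        """Revoke a user or an agent by identifier."""
        self._revoked.add(principal)

    def status_for(self, user: str, agent: str) -> int:
        """Return the HTTP status the proxy would answer with."""
        if user in self._revoked or agent in self._revoked:
            return 403
        return 200
```

Because the check runs per request, a revocation takes effect on the very next call, regardless of which bearer the caller presents.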

Policy engine in YAML

Write rules like "read engineering docs, never finance" in plain YAML, with hot-reload. No DSL to learn, no rules-engine vendor to negotiate with.
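
To show how a rule such as "read engineering docs, never finance" could be evaluated, here is a Python sketch over rules as they might look after a YAML file is parsed. The rule shape, the operation names, and the `drive:/...` resource paths are all hypothetical and do not reflect Proxilion's actual policy schema.

```python
from fnmatch import fnmatchcase

# Hypothetical rule shape, as it might look after parsing a YAML file.
# The first matching rule wins; anything unmatched is denied.
RULES = [
    {"effect": "deny",  "op": "*",          "resource": "drive:/finance/*"},
    {"effect": "allow", "op": "drive.read", "resource": "drive:/engineering/*"},
]

def decide(op: str, resource: str, rules=RULES) -> str:
    """Return 'allow' or 'deny' for one operation on one resource."""
    for rule in rules:
        if fnmatchcase(op, rule["op"]) and fnmatchcase(resource, rule["resource"]):
            return rule["effect"]
    return "deny"  # default-deny when no rule matches
```

Placing the deny rule first and defaulting to deny keeps the outcome predictable: a read under `drive:/engineering/` is allowed, anything under `drive:/finance/` is refused, and so is any write.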

SaaS adapters you can extend

Drive, Gmail, and Calendar ship at launch, each understanding its upstream so policy can reason about real files, recipients, and events. Add your own in a few hundred lines of Rust.

Sits between the agent and your SaaS APIs

Proxilion intercepts the OAuth flow before it reaches any upstream, then stays in path, minting and verifying a fresh PCA per action.

"Wait, where's the API key?" You never give Proxilion one. Proxilion is the redirect target in the OAuth handshake, so the SaaS provider hands the access token to Proxilion, not the agent. Proxilion encrypts it at rest (AES-256-GCM) and gives the agent its own opaque pxl_live_… bearer instead. On every call the agent presents that bearer; Proxilion checks policy, attaches the real upstream token, and forwards. The agent never holds your credential, and you never paste a key into a config file.

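The bearer swap can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Proxilion's implementation: `TokenVault` and its methods are hypothetical, the store is a plain in-memory dictionary with no encryption at rest, and the policy check that precedes forwarding is left out.

```python
import secrets

class TokenVault:
    """Hypothetical in-memory vault; a real one would encrypt tokens at rest."""

    def __init__(self):
        self._upstream = {}   # opaque bearer -> real upstream access token

    def issue(self, upstream_token: str) -> str:
        """Store the real token and hand back an opaque bearer for the agent."""
        bearer = "pxl_live_" + secrets.token_urlsafe(32)
        self._upstream[bearer] = upstream_token
        return bearer

    def upstream_headers(self, bearer: str) -> dict:
        """Swap the agent's bearer for the real token, or raise if unknown."""
        token = self._upstream.get(bearer)
        if token is None:
            raise PermissionError("unknown or revoked bearer")
        return {"Authorization": f"Bearer {token}"}
```

The agent only ever sees the random `pxl_live_` value, which carries no information about the upstream credential it maps to.
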
[Diagram legend: capability flow · response path · Proxilion enforcement]

The invariant, in plain English. The agent shows up with a token. Proxilion checks who it belongs to and what that person is allowed to do. If the action is outside their authority, the request is refused and never forwarded upstream. The attempt is signed, logged, and shipped to your SIEM, so your team sees exactly what was tried, by which agent, on whose behalf.

The risks, and what Proxilion does about each one

Every row is a real failure mode in production agent deployments today. Every detection and block is implemented, not aspirational.

Risk	Detect: how Proxilion sees it	Block: how Proxilion stops it
Confused deputy. Agent acts beyond the user's authority because the OAuth token has no principal binding.	Every request carries a signed, Proxilion-issued token naming the exact user the agent acts for. If the identity is missing or tampered with, the request fails.	The action is checked against what that user is allowed to do, and refused before it reaches any upstream. A deterministic check on signed claims, not a guess from a model.
Privilege escalation across hops. Agent chains tool calls and broadens scope mid-chain. `read` becomes `write`.	Each successor PCA's ops are diffed against its predecessor; any non-subset op is a continuity violation.	Proxilion refuses to sign any capability broader than the step before it. Each hop can do equal or less, enforced cryptographically, so the agent cannot escalate even if its prompt tells it to.
Prompt injection via fetched documents. Poisoned Drive doc instructs the agent to exfiltrate other files or email outside the org.	Response bodies from Drive, Gmail, and other upstreams are scanned for known injection patterns (delimiter confusion, "ignore prior instructions", base64-encoded directives, hidden Unicode).	Matched regions are stripped or quarantined before the response is returned to the agent. Configurable per-route in `policy.yaml`; default-deny for untrusted-source documents.
Unscoped OAuth tokens. Agent holds `drive.readonly` for the whole tenant when it only needs one doc.	Proxilion records which scopes the user consented to, then logs the specific file, email, or event each action touches.	Even when the token covers the whole tenant, each action is narrowed to the one resource it should touch. Reaching any other file or mailbox in the same scope is refused.
Data exfiltration via external recipients. Agent is talked into emailing sensitive content to attacker-controlled domains.	`messages.send` calls are parsed; recipient domains are extracted and matched against your allow-list.	Emails to external recipients are held until a real human approves them through Slack or a ticket. Configurable per sender and domain, so internal traffic stays frictionless and only risky sends get gated.
Bulk or mass-mutation abuse. Compromised agent deletes thousands of files or sends mass email.	Proxilion counts actions per session and their rate. When an agent suddenly tries to delete a thousand files in a minute, the counter trips before damage is done.	A one-click killswitch revokes every capability tied to that user or agent, network-wide, within one request cycle. The next call returns 403, no matter how many tokens it holds.
Replay and token reuse. Captured bearer token is reused beyond the intended action.	Every action gets a one-time identifier and a short expiry. A replayed token shows the same identifier coming through twice.	The replayed request is rejected at the edge before it reaches any upstream. No duplicate side-effect, no data leaked, and the attempt is logged with the source IP.
Untraceable agent actions for compliance. "Which user was the agent acting for when it touched this PHI?" No one can answer.	Every upstream call is logged with the full PCA chain, COSE-signed, append-only.	Not a block, an evidence guarantee. Every action is signed and verifiable years later, offline, without trusting any vendor. SOC 2, HIPAA, and ISO 27001 evidence drops out of the proxy automatically. When an auditor asks which user the agent acted for, you can prove it.
Vendor and supply-chain trust assumption. You are trusting the agent vendor's claims about what their model did.	The signing key is customer-held. Proxilion cannot forge PCAs on your behalf, and neither can any agent vendor.	Trust is rooted at your own identity provider and a key only you hold. You no longer take any agent vendor at their word about what their model did. The proof is on your infrastructure.
Insider misuse via agent. Legitimate user uses the agent as a laundering layer to do things their direct credentials cannot.	PCA_0 is bound to the user's IdP ops; the agent inherits nothing the user did not already have.	The agent inherits exactly the human's permissions, nothing more. If the user cannot read HR records directly, asking the agent to do it fails the same way. The agent is not a permissions loophole.
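
The replay row describes a one-time identifier combined with a short expiry. A minimal Python sketch of that check follows; the `ReplayGuard` class, its 30-second default, and the in-memory store are assumptions made for illustration, not Proxilion's implementation.

```python
import time

class ReplayGuard:
    """Hypothetical one-time-identifier check with a short expiry window."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self._seen = {}   # action id -> time after which the id can be forgotten

    def accept(self, action_id: str, issued_at: float, now: float = None) -> bool:
        now = time.time() if now is None else now
        # Forget identifiers whose expiry has passed; an expired action is
        # rejected by the age check below, so it needs no entry.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if now - issued_at > self.ttl:
            return False          # expired
        if action_id in self._seen:
            return False          # same identifier seen twice: replay
        self._seen[action_id] = issued_at + self.ttl
        return True
```

The first presentation of an identifier inside its window is accepted; a second presentation, or any presentation after expiry, is rejected before anything is forwarded.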

The Skill Overreach problem

An agent skilled for the whole org is, in effect, a super-user. Proxilion forces it back into the user's box.

Agent platforms now ship skills. You train one agent for the org, attach it to Drive, Gmail, and a few internal APIs, and hand it to every employee. That agent now holds the union of every permission its users have. The scope says `drive.readonly` for the tenant. The runtime has no idea whether the human on the other end is an intern or the CEO.

That is Skill Overreach. A skill is authority at the agent level. A user is authority at the human level. The gap between them is where confused-deputy attacks and insider misuse live.

Proxilion binds every call to a PCA chain rooted at the specific human acting at that moment. The intern's request to summarize Q3 financials fails the way it would if they opened Drive directly. The CEO's succeeds. The skill stays the same; the authority is the user's. Prevention by construction, even when the skill is overpowered.

Stop hoping your agent behaves. Prove it cannot misbehave.

Self-host in an afternoon. Free, forever. No sales call, no license keys, no data leaving your network.

Made by Clay Good. Get Proxilion on GitHub.