
How Verra Works

An 11-step pipeline on every interaction.

Verra is security middleware that sits between your agents and everything they touch, including users, tools, databases, other agents, and models. Every interaction passes through the same pipeline regardless of protocol, whether MCP, A2A, or anything else.


The Pipeline

Every interaction runs these steps in order. Steps 2, 6, and 9 run partially in parallel, and logging is fire-and-forget so the response is never held.

01

Auth

Agent API key (x-verra-key) is looked up against registered agents. Unknown keys fail immediately with 401.

02

Policy load

Org-wide policy and per-agent policy override are fetched in parallel from Supabase. The agent-level override is layered on top.

03

Tool filtering

Any tools in the request not present in the agent's allowed_tools list are stripped from the payload before detection runs.
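The filtering step amounts to an allowlist pass over the request's tool list. A minimal sketch, assuming each tool entry carries a `name` field:

```python
def filter_tools(request_tools: list[dict], allowed_tools: set[str]) -> list[dict]:
    """Drop any tool not on the agent's allowed_tools list before detection runs."""
    return [tool for tool in request_tools if tool.get("name") in allowed_tools]
```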

04

Header parse

W3C traceparent is read or generated. user_id and parent_agent_id are extracted from headers. The trace propagates through the entire A2A chain and exports as standard OTel spans.
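The read-or-generate behavior follows the W3C Trace Context format (`version-traceid-spanid-flags`). A minimal sketch of that step:

```python
import re
import secrets

# W3C traceparent: 2-hex version, 32-hex trace ID, 16-hex span ID, 2-hex flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def read_or_generate_traceparent(headers: dict) -> str:
    """Reuse a valid incoming traceparent header, or mint a new one."""
    incoming = headers.get("traceparent", "")
    if TRACEPARENT_RE.match(incoming):
        return incoming
    trace_id = secrets.token_hex(16)  # 32 hex chars
    span_id = secrets.token_hex(8)    # 16 hex chars
    return f"00-{trace_id}-{span_id}-01"
```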

05

A2A authorization

If parent_agent_id is present, Verra checks the caller→target trust matrix. Forbidden agent-type pairs and sensitive data patterns are blocked here.

06

Detection

Four detectors run in parallel: prompt injection (pattern + classifier + LLM judge), jailbreak (pattern + embedding + LLM judge), data exfiltration, and policy violation. Results are aggregated into a single verdict: pass, flag, or block.

07

Approval gate

If risk is high and policy requires justification, a pending approval record is created and a 202 Accepted is returned with approval_id. The request does not proceed until approved.

08

Model routing

Low risk routes to model_target. Medium or high risk routes to private_model_target (self-hosted). If private target is unconfigured, the request is blocked.

09

Log receipt

A receipt is written to Supabase asynchronously. No raw text is stored, only hash, length, metadata, risk level, and findings.
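The hash-not-text property of a receipt can be sketched like this (field names are illustrative):

```python
import hashlib

def make_receipt(text: str, risk_level: str, findings: list[str]) -> dict:
    """Build a receipt that stores hash + length + metadata, never raw text."""
    return {
        "hash": hashlib.sha256(text.encode()).hexdigest(),
        "length": len(text),
        "risk_level": risk_level,
        "findings": findings,
    }
```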

10

Auto-classify

After 10+ receipts, Verra classifies the agent type from behavioral patterns. One of: hr, finance, legal, engineering, support, marketing, security, data, general.

11

Forward + scan response

Request is proxied to the LLM provider. The response is scanned for secrets or data leakage before being returned to the agent.


Detection

Four detectors. Zero serial latency.

All four detectors run in parallel on every request. Latency is the max of four concurrent checks, not the sum. Combined verdict: pass, mask, flag, or block.
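The max-not-sum latency and most-severe-verdict aggregation can be sketched with concurrent tasks. The detector functions themselves are stand-ins here; only the orchestration pattern is illustrated:

```python
import asyncio

# Severity ordering for the combined verdict
SEVERITY = {"pass": 0, "mask": 1, "flag": 2, "block": 3}

async def run_detectors(prompt: str, detectors) -> str:
    """Run all detectors concurrently and return the most severe verdict.

    gather() means total latency tracks the slowest check, not the sum.
    """
    verdicts = await asyncio.gather(*(d(prompt) for d in detectors))
    return max(verdicts, key=SEVERITY.__getitem__)
```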

Prompt Injection

Three layers, escalating cost. Pattern match first (sync, ~0ms). Fine-tuned on-device classifier second: protectai/deberta-v3-base-prompt-injection-v2 runs locally via ONNX with no API call. LLM judge third, only for ambiguous scores. Catches delimiter attacks, "ignore previous instructions" variants, context escape attempts, and soft persona-hijack attacks.

Jailbreak Detection

Three layers in sequence, escalating cost. Pattern match first (fast). Embedding similarity against a 31-prompt reference corpus second, using jackhhao/jailbreak-classifier and OpenAI embeddings (medium). LLM judge fallback third, only when the first two are inconclusive. Catches roleplay exploits, DAN-style prompts, system-note injection, and hypothetical framing attacks.

Data Exfiltration

Detects attempts to extract system prompts or model training data. Distinct from DLP; this covers intentional extraction attempts rather than accidental leakage.

Policy Violation

Customer-defined rules evaluated per request. Supports keyword filters, topic blocks, language restrictions, and custom LLM-judge rules. Defined in org policy and overridable per agent.

99% injection recall (explicit attacks · HackAPrompt)

53% injection recall (real-world mixed · deepset)

92% indirect injection recall (email agent attacks · LLMail)

~70ms avg detection overhead (explicit attack benchmark · HackAPrompt)

Risk signals are annotated on every receipt, regardless of verdict.

PII

· Email addresses

· Phone numbers

· Social security numbers

· Dates of birth

Secrets

· sk-* patterns (OpenAI)

· AKIA* (AWS access keys)

· ghp_* (GitHub tokens)

· Bearer tokens
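The secret signals above map naturally to regex checks. These patterns are deliberately simplified for illustration and are looser than what a production scanner would use:

```python
import re

# Simplified illustrative patterns for the secret categories listed above
SECRET_PATTERNS = {
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]+=*", re.IGNORECASE),
}

def scan_secrets(text: str) -> list[str]:
    """Return the secret categories found in a payload."""
    return [name for name, rx in SECRET_PATTERNS.items() if rx.search(text)]
```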


Tool Access Control

Four layers before a tool runs.

POST /api/gate/tool-input is called when an agent invokes a tool. All four layers must pass.

01

RBAC

Is the tool in the agent's allowed_tools list? Hard gate. If not, blocked immediately.

02

Permission matrix

16 tool categories × 9 agent types. Each pairing is expected, allowed, suspicious, or forbidden. An HR agent is suspicious on database_query and forbidden on code_execution.

03

Behavioral baseline

Compared against the agent's 200-receipt rolling profile. Tracks peak hours, tool frequency, and data types. Anomaly score above 0.8 blocks; between 0.5 and 0.8 warns. Fails open: baseline errors don't block production.

04

Content scan

DLP check on the tool input payload itself. Any policy violations here block the call. Fails closed.

Sample permission matrix (subset of 9×16)

              email        database_query   code_execution   admin
hr            ✓ allowed    ⚠ suspicious     ✗ forbidden      ✗ forbidden
finance       ✓ allowed    ✓ allowed        ⚠ suspicious     ✗ forbidden
engineering   ✓ allowed    ✓ allowed        ✓ allowed        ⚠ suspicious
security      ✓ allowed    ✓ allowed        ✓ allowed        ✓ allowed
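The matrix lookup reduces to a keyed table over (agent type, tool category) pairs. This sketch encodes only the sample subset above; defaulting unknown pairings to forbidden is an assumption:

```python
# Subset of the 9×16 matrix, from the sample table above
PERMISSION_MATRIX = {
    ("hr", "email"): "allowed",
    ("hr", "database_query"): "suspicious",
    ("hr", "code_execution"): "forbidden",
    ("finance", "code_execution"): "suspicious",
    ("engineering", "code_execution"): "allowed",
    ("security", "admin"): "allowed",
}

def check_permission(agent_type: str, tool_category: str) -> str:
    """Look up the agent-type × tool-category pairing.

    Unknown pairings default to forbidden here (an assumption).
    """
    return PERMISSION_MATRIX.get((agent_type, tool_category), "forbidden")
```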

A2A Authorization

Agents don't automatically trust each other.

When Agent A delegates work to Agent B, Verra validates the trust relationship before Agent B can make any LLM calls.

Implicit path

Agent B includes x-verra-parent-agent: agent_a_id in its proxy header. Verra detects the delegation and checks the call matrix automatically.

Explicit path

Agent A first calls POST /api/a2a with target_agent_id and task. Verra validates both agents are in the same org, checks the delegation policy, writes an agent_handoff receipt, and returns a W3C traceparent. Agent B carries this forward to continue the same distributed trace.

Regardless of path, sensitive data crossing certain agent-type boundaries is always blocked. An HR agent cannot hand off a payload containing SSN or salary data to an engineering agent, even if both agents are registered and the delegation policy allows the type pairing.
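The HR-to-engineering example above can be sketched as a boundary check. The SSN pattern and the set of restricted boundaries here are illustrative, not Verra's actual rules:

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Illustrative subset of restricted caller→target type pairings
SENSITIVE_BOUNDARIES = {("hr", "engineering")}

def a2a_handoff_allowed(caller_type: str, target_type: str, payload: str) -> bool:
    """Block sensitive data (here: SSNs) crossing a restricted type boundary,
    even when the delegation policy allows the pairing."""
    if (caller_type, target_type) in SENSITIVE_BOUNDARIES and SSN_RE.search(payload):
        return False
    return True
```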

Visibility

Full observability. No raw text stored.

Every receipt stores hash + length + metadata only. You get full audit capability without PII ever persisting. Verra is OpenTelemetry-native, and every trace exports to any OTLP backend.

Receipts

Every proxied call with risk level, findings, agent, trace ID, span ID, and detection reasons.

Approvals

Pending human reviews with approve/reject and full audit trail.

Shadow AI

Unregistered AI usage surfaced in the dashboard with agent, timestamp, and request metadata.

Agents

All registered agents: model targets, environments, tool permissions, call stats.

Lineage

Agent relationship graph. A2A edges, ego graph per agent, trace lookup.

Policy

Define org-wide rules: block/warn thresholds, PII handling, custom LLM-judge rules.

OTel export

Every trace is OpenTelemetry-native. Set OTEL_EXPORTER_OTLP_ENDPOINT to ship spans and metrics to Grafana Tempo, Jaeger, Honeycomb, Datadog, or any OTLP backend.

Analytics

Calls over time, risk distribution, agent performance trends.


See it running on your stack.

15-minute demo. Bring your architecture questions.

Book a Demo

→ Read the docs