
Overview

What is Verra?

Verra is a drop-in security proxy that sits between your AI agents and your LLM providers. Every request is scanned, governed by policy, and logged, without any changes to your agent code.

You point your agent at https://api.helloverra.com/api/proxy instead of https://api.openai.com and add one header. Verra forwards the request to the real provider after running its security pipeline.

Your Agent
    │
    │  POST /api/proxy  ·  x-verra-key: va-...
    ▼
┌──────────────────────────────────────────┐
│              Verra Proxy                 │
│                                          │
│  Auth → DLP → Policy → Route → Forward  │
│                         │                │
│                    Scan Response         │
└──────────────────────────────────────────┘
        │                       │
        ▼                       ▼
  LLM Provider             Audit Log

Works with OpenAI, Anthropic, Azure OpenAI, Amazon Bedrock, and Google Vertex AI out of the box. No infrastructure changes required.


Quickstart

Up in three steps

1. Register an agent

Sign up at app.helloverra.com, then open Agent Registry > Register Agent and create one. The agent key is shown once after registration and has the prefix va-.

export VERRA_KEY=va-...

2. Point your agent at Verra

Change the base URL in your LLM client and add the x-verra-key header.

# Python · OpenAI SDK
import os
import openai

client = openai.OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url="https://api.helloverra.com/api/proxy",  # ← change this
    default_headers={
        "x-verra-key": os.environ["VERRA_KEY"],  # ← add this
    },
)

# Your existing code is unchanged
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "..."}],
)

// TypeScript · OpenAI SDK
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: 'https://api.helloverra.com/api/proxy', // ← change this
  defaultHeaders: {
    'x-verra-key': process.env.VERRA_KEY, // ← add this
  },
});

About your upstream provider key. The OpenAI SDK forwards api_key as an Authorization: Bearer … header, which Verra passes through to OpenAI after running its pipeline. If you'd rather not handle the provider key in your agent code, add it once in Settings > Credentials and Verra will inject it server-side. In that case, pass any placeholder for api_key (the SDK requires the field) or, for direct HTTP calls, omit the Authorization header entirely.

Direct HTTP (no SDK)

For languages without an OpenAI client, or when you want full control over headers:

# Python · requests
import os
import requests

resp = requests.post(
    "https://api.helloverra.com/api/proxy",
    headers={
        "content-type": "application/json",
        "x-verra-key": os.environ["VERRA_KEY"],
        # Authorization is only required if you haven't stored your provider
        # key in Settings > Credentials. Omit it to use the dashboard credential.
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    },
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json())

3. Verify in the dashboard

Make any LLM call. The agent is surfaced in your admin dashboard for one-click confirmation on first call. Open the admin dashboard, then Audit Log, to see the entry with risk level, DLP findings, and trace ID.

Verra never stores raw prompt or completion text, only a SHA-256 hash, token count, and metadata. PII and secrets never persist.
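To make hash-only storage concrete, here is a minimal sketch that digests a prompt with SHA-256. Hashing the UTF-8 bytes of the raw text is an assumption made for illustration; this page does not specify which bytes Verra hashes, so a locally computed digest may not match a receipt's input_hash.

```python
import hashlib


def text_digest(text: str) -> str:
    """Return the hex SHA-256 digest of the UTF-8 encoding of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest = text_digest("Summarise the Q3 financial report")
print(len(digest))  # 64 hex characters, whatever the input length
```

The digest is one-way: it lets two records be compared for equality without either record containing the prompt itself.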

LangChain callback (optional)

If you use LangChain, install the npm SDK and attach the callback handler. See the npm SDK section for full configuration options.

import { withVerraSecurity } from '@verra/sdk/langchain';

const safeLLM = withVerraSecurity(llm, {
  orgId: process.env.VERRA_ORG_ID,
  verraProxyUrl: 'https://api.helloverra.com',
  agentName: 'my-agent',
});

npm SDK

@verra/sdk

The Verra npm package ships the detection pipeline, LangChain and CrewAI integrations, and all types as a standalone library. Use it to run Verra's security checks directly in your Node.js application, no proxy hop required.

npm install @verra/sdk

LangChain integration

Import from @verra/sdk/langchain. The callback handler surfaces your agent in the admin dashboard on first call (one-click confirm) and reports a receipt to your Verra dashboard for every LLM invocation.

import { ChatOpenAI } from '@langchain/openai';
import { withVerraSecurity } from '@verra/sdk/langchain';

const llm = new ChatOpenAI({ model: 'gpt-4o' });

const safeLLM = withVerraSecurity(llm, {
  orgId: process.env.VERRA_ORG_ID,              // Dashboard > Settings > Org ID
  verraProxyUrl: 'https://api.helloverra.com',  // base URL only, no /api/proxy
  agentName: 'customer-support-agent',          // your agent's name
  onBlock: { strategy: 'fallback', message: "I can't help with that." },
  onFlag: 'log',
  onError: 'fail_open',
  // See Configuration below for all options.
});

// Use safeLLM exactly like your original LLM. Verra runs invisibly
const result = await safeLLM.invoke('Hello!');

CrewAI integration

Import from @verra/sdk/crewai. Attach step and task callbacks to any CrewAI agent to inspect tool inputs and outputs.

import { applyVerraSecurity } from '@verra/sdk/crewai';

const agent = new Agent({ role: 'researcher', tools: [...] });

applyVerraSecurity(agent, {
  orgId: process.env.VERRA_ORG_ID,
  verraProxyUrl: 'https://api.helloverra.com',
  agentName: 'researcher',
  onBlock: { strategy: 'throw' },
});

LlamaIndex integration

Import from @verra/sdk/llamaindex. The wrapper intercepts every chat() and complete() call: the user message is scanned before the LLM runs, the response is scanned before it returns. Behaviour matches the LangChain wrapper (onBlock, onFlag, onError).

import { OpenAI } from 'llamaindex';
import { withVerraSecurity } from '@verra/sdk/llamaindex';

const llm = withVerraSecurity(new OpenAI({ model: 'gpt-4o' }), {
  verraProxyUrl: 'https://api.helloverra.com',
  agentName: 'my-llamaindex-agent',
  onBlock: { strategy: 'throw' },
});

const response = await llm.chat({
  messages: [{ role: 'user', content: '...' }],
});

AutoGen integration

Microsoft AutoGen (JS/TS). Import from @verra/sdk/autogen. The middleware exposes two entry points: wrapClient for the typical case of wrapping a model client, and scanMessage for custom agent steps where you want explicit control. System messages are never scanned (they routinely contain words like "confidential" that would false-positive).

import { createVerraMiddleware } from '@verra/sdk/autogen';

const middleware = createVerraMiddleware({
  verraProxyUrl: 'https://api.helloverra.com',
  agentName: 'research-agent',
  onBlock: { strategy: 'throw' },
});

// Wrap an existing AutoGen model client:
const securedClient = middleware.wrapClient(modelClient);

// Or scan a single message in a custom agent step:
await middleware.scanMessage({ role: 'user', content: userText });

Semantic Kernel integration

Microsoft Semantic Kernel (JS/TS). Import from @verra/sdk/semantickernel. Returns two kernel filters: promptRenderFilter runs after the prompt template is rendered and before the LLM call; functionInvocationFilter runs on the function output. Register both for full input and output coverage.

import { Kernel } from '@microsoft/semantic-kernel';
import { createVerraFilter } from '@verra/sdk/semantickernel';

const kernel = new Kernel();

const filter = createVerraFilter({
  verraProxyUrl: 'https://api.helloverra.com',
  agentName: 'my-sk-agent',
  onBlock: { strategy: 'throw' },
});

kernel.addPromptRenderFilter(filter.promptRenderFilter);
kernel.addFunctionInvocationFilter(filter.functionInvocationFilter);

Direct pipeline usage

Run the detection pipeline directly without an integration wrapper. Useful for custom frameworks or serverless handlers.

import { runInputPipeline, runOutputPipeline } from '@verra/sdk';

// Inspect a request before sending to your LLM
const verdict = await runInputPipeline({
  requestId: crypto.randomUUID(),
  userInput: userMessage,
  systemPrompt: systemPrompt, // optional, enables prompt injection detection
  policy: myPolicy,           // optional, enables custom policy rules
});

if (verdict.action === 'block') {
  throw new Error(`Blocked by ${verdict.triggeredBy}`);
}

// Inspect the LLM response before returning to the user
const outVerdict = runOutputPipeline({
  requestId: verdict.requestId,
  response: llmResponse,
  systemPrompt: systemPrompt,
});

Configuration

| Option | Type | Description |
| --- | --- | --- |
| orgId | string | Your Verra org ID, copied from Dashboard > Settings > Org ID. Required for receipt reporting and auto-registration. |
| verraProxyUrl | string | Base URL of your Verra deployment (e.g. https://api.helloverra.com, no /api/proxy suffix). Required for receipt reporting. |
| agentName | string | Stable name for this agent, used to deduplicate registrations in the dashboard. |
| onBlock | object | Action when a request is blocked. { strategy: "throw" } raises VerraSecurityError. { strategy: "fallback", message } returns a safe string instead. |
| onFlag | string | "log" (default): logs flagged requests and continues. "passthrough": silently continues. |
| onError | string | "fail_open" (default): passes the request if the pipeline errors. "fail_closed": blocks on any pipeline error. |
| logger | fn | (summary: string) => void. Receives a one-line summary of each verdict. Defaults to console.log. |
| policy | object | Custom policy rules to enforce. See the Policy schema in the API reference. |
| sensitiveContext | string[] | Strings that should never appear in LLM responses. Triggers a block if found. |
| tools | string[] | Tool names this agent has access to. Passed at registration so the dashboard shows them immediately. |
| framework | string | Framework label sent at registration, e.g. "langchain", "crewai". The integration wrappers set this for you. |
| traceparent | string | W3C traceparent header value. Links SDK receipts and tool gate calls into a distributed OTel trace. |

The SDK uses your OPENAI_API_KEY for the embedding layer and the LLM-judge layer (or GROQ_API_KEY for the judge, if set). With neither key set, semantic similarity and LLM-judge layers are skipped and only pattern + ML-classifier detection runs. The on-prem ML classifiers (PII NER, prompt-injection, jailbreak) call the Verra-operated verra-ml-inference service; set ML_INFERENCE_URL to opt in.

Lower-level detector exports

The integration wrappers cover most use cases. If you're building something custom, the underlying detectors are exported individually from the same package so you can compose them yourself.

| Export | Type | Description |
| --- | --- | --- |
| detectPromptInjection | fn | Layered detector: pattern match, then DeBERTa via the verra-ml-inference service, then LLM judge for ambiguous scores. Returns DetectionResult with action / confidence / matchedPattern / layersExecuted. |
| detectJailbreak | fn | Three-layer jailbreak pipeline: pattern, embedding similarity, LLM judge. Catches DAN-style and roleplay-escape attacks. |
| detectExfiltrationAttempt | fn | Three-layer detector: (Layer 0) regex sweep of userInput for PII (0.9) and credentials (0.97) so sensitive data is blocked before it reaches the LLM; (Layer 1) extraction-probe patterns: system-prompt fishing, bulk data probing, indirect roundabout phrasing; (Layer 2) semantic similarity to the system prompt. |
| scanResponseForLeakage | fn | Scans an LLM response for leaked secrets / PII before returning it to the agent. |
| detectPolicyViolation | fn | Customer policy enforcement (keyword + semantic + LLM-judge layers per rule). |
| runPiiNer | fn | iiiorg/piiranha-v1-detect-personal-information NER, served by the verra-ml-inference service. Throws if ML_INFERENCE_URL is unset so the caller can fall back to regex PII. |
| runJailbreakGuard | fn | Jailbreak classifier served by the verra-ml-inference service. Shares the protectai/deberta-v3-base-prompt-injection-v2 model with PromptGuard; the INJECTION label is re-mapped to jailbreak. |
| runPromptGuard | fn | protectai/deberta-v3-base-prompt-injection-v2 served by the verra-ml-inference service. isPromptGuardAvailable() checks ML_INFERENCE_URL is set. |
| PII_PATTERNS | RegExp[] | The compiled PII regex set used by scanResponseForLeakage. Exported for custom redaction or testing. |
| SECRET_PATTERNS | RegExp[] | The compiled secret regex set (sk-*, ghp_*, AKIA*, JWT, etc.). |
| normalizeInput | fn | Standardises text before pattern matching (lowercases, strips zero-widths, decodes common obfuscations). |
| clearTopicEmbeddingCache | fn | Resets the in-process semantic-rule cache. Useful in tests and when policy changes mid-process. |

Detector functions don't emit receipts on their own. If you compose them directly, also call reportReceipt from @verra/sdk after the verdict so your dashboard still gets the audit row.

Observability

Observability integrations

LangSmith

Verra and LangSmith complement each other.

1. Run IDs in receipts. When you use VerraCallbackHandler, every receipt written to the Verra dashboard includes the LangSmith run_id in its findings.langsmith_run_id field. You can look up the corresponding LangSmith trace directly from any blocked or flagged receipt.

2. Automatic tool gating. VerraCallbackHandler hooks handleToolStart and calls /api/gate/tool-input automatically on every tool invocation. Blocked tools raise a VerraSecurityError (or return your fallback message) before the tool runs.

Exporting Verra traces to LangSmith

Verra can send spans to LangSmith via OTLP.

# Your LangSmith OTLP endpoint: find it in Settings > Integrations > OTLP
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.smith.langchain.com/otel
OTEL_EXPORTER_OTLP_HEADERS=x-api-key=<your-langsmith-api-key>

Verra spans appear in LangSmith as a parallel trace alongside your LangChain execution trace, tagged with gen_ai.* and verra.* attributes.

Other OTLP backends

The same OTEL_EXPORTER_OTLP_ENDPOINT variable works with any OpenTelemetry-compatible backend. Point it at Grafana Tempo, Honeycomb, Datadog, or your own collector.

# Grafana Cloud
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-east-0.grafana.net/otlp

# Honeycomb
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.io
OTEL_EXPORTER_OTLP_HEADERS=x-honeycomb-team=<your-api-key>

# Datadog (via OTLP collector)
OTEL_EXPORTER_OTLP_ENDPOINT=https://trace.agent.datadoghq.com

Each proxy request creates a root span (verra.proxy) with child spans for risk analysis and the upstream LLM call. Token usage is recorded as gen_ai.usage.input_tokens and gen_ai.usage.output_tokens. Request counts and per-span durations are exported as OTLP metrics every 60 seconds.


Configuration

Environment variables

| Name | Type | Description |
| --- | --- | --- |
| VERRA_KEY | string | Required. Your Verra agent API key (prefix va-). The name is a convention: your own code reads the variable and passes its value as the x-verra-key header on every proxy call. |
| VERRA_ORG_ID | string | Your Verra org ID (copy from Dashboard > Settings > Org ID). Read by the SDK at startup for receipt reporting and auto-registration. |

Verra does not impose other environment variables on your app. Logging verbosity and request timeouts are controlled by your LLM client (e.g. the OpenAI SDK's timeout option), not by Verra.

Policies

Policy is configured in the admin dashboard under Policy Engine. A policy is org-wide by default, with per-agent overrides available from each agent's detail page. Start from a preset (Default, Strict, Warn-only, HIPAA, or SOC 2) and adjust toggles.

The core toggles are:

| Toggle | Type | Description |
| --- | --- | --- |
| mode | string | Enforcement posture for detector findings: "observe" (shadow mode, pass-through with receipts), "govern" (default, detectors enforce), "enforce" (Govern plus an auto-generated compliance evidence pack). |
| block_secrets | bool | Block requests whose input contains credentials (sk-..., AKIA..., ghp_..., bearer tokens, private keys). |
| warn_pii | bool | Warn (don't block) when PII is detected in input. Pair with mask_pii_input to redact instead. |
| mask_pii_input | bool | Replace detected PII with type tokens (e.g. [email]) before forwarding to the LLM. Receipts log the masked types. |
| log_low_risk | bool | Write receipts for low-risk requests too. Off by default to keep audit volume manageable. |
| block_exfiltration | bool | Reject requests combining a directive verb, a sensitive-data target (customer list, credentials, database, all users), and an email address. Narrow by design; the ONNX detector covers the long tail. On by default. |
| require_justification_on | string[] | Hold requests for human approval at these risk levels. Values: "med", "high". |
| unattributed_tool_policy | string | What to do when a tool call cannot be attributed to a registered MCP server: "allow", "warn", "require_approval", or "block". |
| max_delegation_depth | number | Max chain length for A2A handoffs. Default 5. |
| block_a2a_trust_escalation | bool | Reject A2A handoffs that promote trust tier (low → medium, medium → high). |
| require_a2a_approval_for_trust_escalation | bool | Route trust-escalating A2A handoffs to human approval instead of blocking. |

Per-agent overrides are stored alongside each agent (versioned) and merge over the org policy at request time.
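A minimal sketch of what "merge over" can look like, assuming a shallow key-by-key merge in which any toggle set on the agent wins and everything else is inherited from the org. The function name and the shallow-merge semantics are assumptions for illustration; this page does not describe how nested values are merged.

```python
def effective_policy(org_policy: dict, agent_override: dict) -> dict:
    """Shallow merge: keys set on the agent win, the rest come from the org."""
    return {**org_policy, **agent_override}


org = {"mode": "govern", "block_secrets": True, "max_delegation_depth": 5}
override = {"mode": "observe"}

print(effective_policy(org, override))
# {'mode': 'observe', 'block_secrets': True, 'max_delegation_depth': 5}
```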


API Reference

POST /api/proxy

The proxy endpoint accepts an LLM request body in your provider's native shape (OpenAI Chat Completions, Anthropic Messages, etc.). Verra inspects it, applies policy, then forwards to the configured upstream.

Request headers

| Header | Type | Description |
| --- | --- | --- |
| x-verra-key | string | Required. Your Verra agent API key (prefix va-). |
| x-verra-org | string | Org ID for auto-provisioning. Only needed when calling without a registered agent key. |
| x-verra-user | string | Operator identity: the principal in your system running the agent (e.g. an internal employee). Distinct from x-verra-user-id below; used for SOC 2 / DORA Article 5 governance evidence. |
| x-verra-user-id | string | End-user (subject) identity: the human whose session triggered the agent action. Verra also accepts body.end_user.id as an alias. Threaded onto every receipt, approval, and findings.mcp_origins[] entry. Opaque; Verra does no SSO resolution. |
| x-verra-parent-agent | string | Agent ID of the calling agent. Set manually by Agent A when chaining direct proxy calls; for first-class A2A handoffs use POST /api/a2a instead, which sets this for you on the resulting traceparent. |
| x-verra-session-id | string | Stable session identifier for multi-turn conversations. Verra loads the last 10 messages from this session as context so cross-turn attacks are detected. |
| x-verra-approval-id | string | Resubmit a held request after human approval has been granted (returned in the 202 body). |
| traceparent | string | W3C TraceContext header (00-{traceId}-{spanId}-{flags}). Links this hop into an existing distributed trace. Verra generates a new trace if absent. |
| Authorization | string | Optional upstream-provider key (e.g. Bearer sk-...). When present it is forwarded to OpenAI after the pipeline runs; when absent, Verra uses the credential stored under Credentials in the dashboard. Anthropic, Azure OpenAI, Bedrock, and Vertex always use the dashboard credential and ignore this header. |

Response headers

| Header | Type | Description |
| --- | --- | --- |
| traceparent | string | W3C traceparent for this hop. Pass to the next agent in a chain, or store for distributed tracing. |
| x-verra-trace-id | string | The trace ID as a bare UUID, for systems that don't parse traceparent. |
| x-verra-action | string | Set to "mask" when the pipeline redacted the request body before forwarding. The companion x-verra-masked-types header lists the redacted types. Blocked requests use HTTP 403; held requests use HTTP 202; see below. |
| x-verra-masked-types | string | Comma-separated list of PII types that were masked (e.g. email,phone). Only set when x-verra-action: mask is set. |
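The response headers above can be read with a few lines of client code. The sketch below parses a traceparent value using the field widths defined by W3C Trace Context and splits x-verra-masked-types into a list. The helper names are hypothetical, and the headers are assumed to arrive as a plain dict with lowercase keys.

```python
import re

# version(2 hex) - trace-id(32 hex) - parent/span-id(16 hex) - flags(2 hex)
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(value: str) -> dict:
    """Split a W3C traceparent header into its four fields."""
    match = TRACEPARENT_RE.match(value.strip().lower())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, span_id, flags = match.groups()
    return {
        "version": version,
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),  # lowest bit is the sampled flag
    }


def masked_types(headers: dict) -> list[str]:
    """Return the masked PII types, or [] when the request was not masked."""
    if headers.get("x-verra-action") != "mask":
        return []
    raw = headers.get("x-verra-masked-types", "")
    return [item.strip() for item in raw.split(",") if item.strip()]


print(masked_types({"x-verra-action": "mask", "x-verra-masked-types": "email,phone"}))
# ['email', 'phone']
```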

Blocked requests

When a request is blocked by policy, Verra returns 403 Forbidden with a structured error body instead of forwarding to the LLM.

HTTP/1.1 403 Forbidden
Content-Type: application/json

{
  "error": {
    "type": "policy_violation",
    "message": "Request blocked by policy",
    "findings": ["pii:ssn", "pii:credit_card"],
    "receipt_id": "rcpt_a3f9c1..."
  }
}

Held requests (human approval)

When the active policy holds a request for human review (e.g. require_justification_on matches), Verra returns 202 Accepted immediately. The upstream call is parked pending approval. Surface the approval URL to your operator and resubmit with x-verra-approval-id once approved.

HTTP/1.1 202 Accepted
Content-Type: application/json

{
  "status": "pending_approval",
  "approval_url": "https://app.helloverra.com/admin/approvals/apr_b7e2d4...",
  "expires_at": "2026-05-19T14:00:00Z",
  "receipt_id": "rcpt_b7e2d4..."
}
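On the client side, the 403 and 202 cases can be told apart from the status code and the body fields shown in the two examples. The sketch below is one way to do that; the function name and the returned dictionary shape are hypothetical, and only fields that appear in the examples on this page are read.

```python
def interpret_proxy_response(status: int, body: dict) -> dict:
    """Map a proxy response to blocked / held / forwarded / error."""
    if status == 403:
        error = body.get("error", {})
        return {
            "outcome": "blocked",
            "findings": error.get("findings", []),
            "receipt_id": error.get("receipt_id"),
        }
    if status == 202 and body.get("status") == "pending_approval":
        return {
            "outcome": "held",
            "approval_url": body.get("approval_url"),
            "expires_at": body.get("expires_at"),
            "receipt_id": body.get("receipt_id"),
        }
    if 200 <= status < 300:
        return {"outcome": "forwarded"}
    return {"outcome": "error", "status": status}


held = interpret_proxy_response(202, {"status": "pending_approval", "approval_url": "https://example.invalid/approve"})
print(held["outcome"])  # held
```

A caller would typically show approval_url to the operator in the held case and retry with x-verra-approval-id once the approval has been granted.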

GET /api/receipts

Query the audit log. Authenticated either via an admin browser session (the dashboard uses this) or with any registered agent key scoped to your org. There is no separate "admin API key" today; read access is gated by the caller's org membership.

curl https://api.helloverra.com/api/receipts \
  -H "x-verra-key: va-..." \
  -G \
  --data-urlencode "agent=finance-report-ag" \
  --data-urlencode "risk=high" \
  --data-urlencode "from=2026-05-01" \
  --data-urlencode "limit=50"

Returns a JSON array of receipt objects. Each receipt contains id, agent_id, trace_id, risk_level, action, findings, input_hash, output_hash, and created_at. Raw text is never returned.

POST /api/gate/tool-input

Tool gate (input side). Call before a tool executes; Verra runs four sequential layers and returns a decision. The agent is responsible for acting on it (action: "block" means do not run the tool). The four layers are: (1) static RBAC against the agent's allowed_tools list, (2) the 16x9 tool-category-by-agent-type permission matrix, (3) behavioural baseline anomaly score, (4) DLP content scan on the tool args.

curl -X POST https://api.helloverra.com/api/gate/tool-input \
  -H "x-verra-key: va-..." \
  -H "content-type: application/json" \
  -d '{
    "tool_name": "send_email",
    "args": { "to": "...", "subject": "...", "body": "..." },
    "trace_id": "uuid-from-proxy-call",
    "user_id": "employee-jane"
  }'

Response body fields: action (pass / warn / block), reason, risk_level, findings, trace_id, layer_blocked (which of the four layers fired), tool_category, anomaly_score, permission_decision, and redacted_args when the args contained secrets. HTTP 400 when action === "block".

If you use the LangChain SDK wrapper, tool gating is wired automatically via handleToolStart on every tool invocation. Manual gate calls are only needed for custom frameworks or non-LangChain agents.

POST /api/gate/tool-output

Tool gate (output side). Call after a tool returns its result; Verra scans the result for leaked secrets / PII and returns a decision plus a redacted version where applicable. trace_id must match the value passed to /api/gate/tool-input so input and output receipts correlate in the audit log.

curl -X POST https://api.helloverra.com/api/gate/tool-output \
  -H "x-verra-key: va-..." \
  -H "content-type: application/json" \
  -d '{
    "tool_name": "fetch_customer_record",
    "result": { "name": "...", "ssn": "..." },
    "trace_id": "uuid-from-tool-input-call"
  }'

Response: action, reason, risk_level, findings, trace_id, and redacted_result when secrets / PII were found. The agent should return redacted_result to its caller, not the raw result. HTTP 400 when blocked.
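Acting on the output-gate decision might look like the sketch below: refuse to continue on a block, and otherwise hand redacted_result to the caller whenever it is present. The exception class and function name are hypothetical; the decision dict is assumed to be the parsed JSON response body.

```python
class ToolOutputBlocked(Exception):
    """Raised when the tool-output gate returns action == "block"."""


def result_for_caller(decision: dict, raw_result):
    """Pick what the agent should return after the output gate has run."""
    if decision.get("action") == "block":
        raise ToolOutputBlocked(decision.get("reason", "blocked by tool-output gate"))
    if "redacted_result" in decision:
        # Secrets / PII were found: never pass the raw result onward.
        return decision["redacted_result"]
    return raw_result


raw = {"name": "Jane", "ssn": "123-45-6789"}
decision = {"action": "warn", "redacted_result": {"name": "Jane", "ssn": "[ssn]"}}
print(result_for_caller(decision, raw))  # {'name': 'Jane', 'ssn': '[ssn]'}
```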

POST /api/a2a

Agent-to-agent handoff. Agent A calls this endpoint when delegating a task to Agent B; Verra records the delegation as a first-class receipt, runs trust-tier and delegation-depth checks, and returns a traceparent that Agent B must include in its subsequent proxy calls so the full call chain is linked in OpenTelemetry.

# The x-verra-key below is Agent A's key
curl -X POST https://api.helloverra.com/api/a2a \
  -H "x-verra-key: va-..." \
  -H "content-type: application/json" \
  -d '{
    "target_agent_id": "agent-b-uuid",
    "task": "Summarise the Q3 financial report",
    "input": { "report_id": "..." }
  }'

Response: handoff_id (receipt ID), trace_id, traceparent (pass to Agent B), allowed, delegation_depth, and blocked_reason when policy rejects. HTTP 403 if the org policy blocks A2A delegation or trust escalation. HTTP 202 with an approval_id when the policy requires human approval for trust escalation.

Default delegation-depth cap is 5; configurable via policy.max_delegation_depth. Trust-escalation handoffs (low to medium, medium to high) can be blocked outright with policy.block_a2a_trust_escalation or routed to human review with policy.require_a2a_approval_for_trust_escalation.
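The interaction of these three policy fields can be sketched as a small decision function. This is an illustration under stated assumptions, not Verra's implementation: a handoff is assumed to be over the cap when its depth exceeds max_delegation_depth, "trust escalation" is assumed to mean the target's tier ranks above the caller's, and the approval flag is assumed to take precedence over the block flag when both are set.

```python
TIER_RANK = {"low": 0, "medium": 1, "high": 2}


def evaluate_handoff(depth: int, caller_tier: str, target_tier: str, policy: dict) -> str:
    """Return "allow", "hold" (human approval), or "block" for an A2A handoff."""
    if depth > policy.get("max_delegation_depth", 5):
        return "block"
    escalates = TIER_RANK[target_tier] > TIER_RANK[caller_tier]
    if escalates:
        if policy.get("require_a2a_approval_for_trust_escalation"):
            return "hold"
        if policy.get("block_a2a_trust_escalation"):
            return "block"
    return "allow"


policy = {"max_delegation_depth": 5, "block_a2a_trust_escalation": True}
print(evaluate_handoff(2, "low", "medium", policy))  # block
print(evaluate_handoff(2, "high", "low", policy))    # allow
```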

GET /api/lineage

Reconstructs the full call chain for a trace. Returns every receipt that shares the given trace_id, ordered by time, joined to the agents table for human-readable agent names. Useful for auditing multi-hop A2A flows or correlating tool-input / tool-output gates with the originating proxy call.

curl "https://api.helloverra.com/api/lineage?trace_id=<uuid>"

Response: { trace_id, hops: [...] }. Each hop has receipt_id, timestamp, agent_id, agent_name, parent_agent_id, action, risk_level, and destination. Findings blobs are intentionally not included to keep the response lean for long traces; use /api/receipts for the full record.

POST /api/shadow

Shadow AI report endpoint. Reporters (a fetch wrapper in your app, a network proxy, a firewall webhook) post outbound AI-provider calls that don't carry a Verra key. Events surface in the Shadow AI dashboard with agent, timestamp, and request metadata. Recognised hosts include OpenAI, Anthropic, Google, Microsoft, Cohere, Mistral, Together, Groq, Perplexity, Replicate, and HuggingFace endpoints.

# Any registered agent key works for x-verra-key
curl -X POST https://api.helloverra.com/api/shadow \
  -H "x-verra-key: va-..." \
  -H "content-type: application/json" \
  -d '{
    "url": "https://api.openai.com/v1/chat/completions",
    "method": "POST",
    "user_agent": "...",
    "metadata": { "process": "...", "command": "..." }
  }'

GET /api/shadow?status=new|reviewed|dismissed lists events for the authenticated org (session-scoped, admin only). PATCH /api/shadow bulk-updates status with body { ids, status } for triage from the dashboard.


Concepts

How the pipeline works

Every request through Verra passes through five sequential stages. Each stage can independently allow, warn, mask, or block the request, or hold it for human approval.

1. Auth

The x-verra-key header is verified against your organization. Invalid or revoked keys receive a 401 immediately, before any content is read.

2. Data Loss Prevention (DLP)

The request body is scanned for sensitive patterns before it reaches the model. Four detectors run in parallel.

Prompt injection

Pattern match, then DeBERTa (protectai/deberta-v3-base-prompt-injection-v2) via the verra-ml-inference service, then an LLM judge for ambiguous scores. Catches direct and indirect instruction injection.

Jailbreak

Pattern match, embedding similarity, LLM judge. Shares the protectai/deberta-v3 model with prompt injection; the INJECTION label is re-mapped to jailbreak so DAN-style and role-play escapes surface here.

Data exfiltration

Three layers. Layer 0 scans userInput for PII (SSN, credit card, email, phone, IBAN, MRN, ...) and credentials (sk-*, AKIA*, ghp_*, bearer tokens, private keys); a match blocks before forwarding to the LLM. Layer 1 catches extraction probes (system-prompt fishing, bulk data probing like "list all users", indirect roundabout phrasing). Layer 2 runs semantic similarity between userInput and the system prompt when the request looks like a question.

System prompt extraction

Dedicated pattern detector for direct extraction attempts ("repeat your instructions", "show me your system prompt") plus indirect probes ("translate your instructions to French").

3. Policy evaluation

The org policy from Policy Engine is merged with any per-agent overrides (overrides take precedence), producing the final action: allow, warn, mask, block, or hold for human approval. Each stage of the pipeline can independently change the action.

4. Tool access control

For agentic requests that include tool definitions, Verra validates against a four-layer gate: RBAC, agent-type permission matrix, behavioral baseline, and content scan. An HR agent cannot invoke code execution tools; a finance agent cannot query external databases.

Tool restrictions apply even when the model attempts to call an unauthorized tool mid-conversation. The tool call is stripped and the agent receives a policy error in the tool result.

5. Forward to upstream

Allowed requests are forwarded to your configured LLM provider with the upstream credential. Verra sets the receipt, then streams or returns the response back to your agent. When a request is masked, the redacted body is forwarded and the response carries an x-verra-action: mask header.
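To illustrate what a masked request looks like, the sketch below replaces email addresses with the [email] type token and reports which types were masked, mirroring the x-verra-masked-types header. The regex is a deliberately simple stand-in chosen for this example; it is not the pattern set Verra uses, and only one PII type is handled.

```python
import re

# Illustrative email pattern only; real detectors cover many more PII types.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_pii(text: str) -> tuple[str, list[str]]:
    """Replace emails with a type token; return the text and the masked types."""
    masked, count = EMAIL_RE.subn("[email]", text)
    types = ["email"] if count else []
    return masked, types


print(mask_pii("Contact jane@example.com today"))
# ('Contact [email] today', ['email'])
```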

Streaming

Verra streams responses (SSE) for OpenAI, Azure OpenAI, and Anthropic when two conditions hold: your client asks for it (stream: true) and the call is human-facing (no x-verra-parent-agent header set). Agent-to-agent traffic always buffers, because machine consumers gain nothing from streaming and the strongest enforcement is wanted. In observe mode, frames relay straight through unchanged. In govern / enforce mode, a rolling-window scanner holds back the last 256 characters of assembled text and runs the cheap synchronous detectors (secrets, system-prompt leak, PII regex) on every chunk, so any contiguous violation shorter than the hold-back is caught before a single character of it reaches the client.

Streaming makes responses feel faster, but the semantic LLM-judge pipeline cannot run per-chunk on a live stream. The full analyzeResponse (judge included) runs as a post-stream audit: it cannot un-send anything, but it emits a warn-kind security event when it catches something the inline cheap scan missed. If your compliance posture needs the judge to gate every response inline, enable Require judge on every response on the Policy page; that flag (policy.dlp.require_judge_output) forces the buffered path for the whole org and skips streaming entirely.
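The hold-back idea can be sketched as a generator that buffers incoming chunks, scans the buffer after every chunk, and releases only the text that lies more than hold_back characters behind the newest character. Everything here is an assumption for illustration: the single secret regex, the StreamBlocked exception, and the function name are hypothetical, and Verra's scanner runs several detectors rather than one pattern.

```python
import re
from typing import Iterable, Iterator

# Illustrative secret pattern; stands in for the cheap synchronous detectors.
SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{8,}")


class StreamBlocked(Exception):
    """Raised when a violation is found in the held-back window."""


def scan_stream(chunks: Iterable[str], hold_back: int = 256) -> Iterator[str]:
    """Yield text that is safe to release, keeping `hold_back` chars in reserve."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        if SECRET_RE.search(buffer):
            raise StreamBlocked("secret detected before release")
        cut = len(buffer) - hold_back
        if cut > 0:
            yield buffer[:cut]      # old enough to be released
            buffer = buffer[cut:]   # keep only the trailing window
    if buffer:
        yield buffer                # stream ended cleanly: flush the window


print("".join(scan_stream(["hello ", "world"], hold_back=4)))  # hello world
```

Because a character is released only once hold_back newer characters have been scanned, a match no longer than the window is always still fully buffered when its last character arrives, even if it was split across chunks.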

Identity attribution (operator vs. subject)

Verra carries two distinct human identities through every receipt and approval, because regulators draw the line in different places:

  • user_id (header: x-verra-user): the operator, your customer's employee running the agent. Used for SOC 2 and DORA Article 5 governance evidence.
  • end_user_id (header: x-verra-user-id, alias body.end_user.id): the subject, the human whose session triggered the agent action. Used for EU AI Act Article 14 (human oversight) and DORA Article 17 (incident reporting) evidence.

Verra treats both values as opaque strings and does not resolve them against an SSO directory. The end-user identity is also stamped onto every entry of findings.mcp_origins[], so any MCP tool invocation can be traced back to its authorizing human via the audit log alone. A missing end_user_id does not block the request, but the dashboard surfaces an amber "no user attribution" badge so compliance officers can audit the gap.

Multi-turn session analysis

Pass x-verra-session-id on each request to enable session-aware detection. Verra loads the last 10 messages from that session and prepends them as context before running the pipeline. This catches slow-burn attacks that unfold across multiple turns, for example an adversary who gradually shifts the model's behavior over several exchanges rather than in a single message.
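The "last 10 messages as context" behaviour can be pictured as a bounded queue per session. The sketch below is a hypothetical in-memory version: the class and method names are invented for illustration, and how Verra actually stores and loads session history is not described on this page.

```python
from collections import deque


class SessionContext:
    """Keep the most recent messages of one session for cross-turn scanning."""

    def __init__(self, max_messages: int = 10):
        # deque(maxlen=...) silently drops the oldest entry once full.
        self._history = deque(maxlen=max_messages)

    def build_scan_input(self, new_message: str) -> list[str]:
        """Prior turns first, then the incoming message."""
        return list(self._history) + [new_message]

    def record(self, message: str) -> None:
        self._history.append(message)


session = SessionContext()
for turn in range(12):
    session.record(f"turn {turn}")

scan_input = session.build_scan_input("turn 12")
print(len(scan_input))  # 11
print(scan_input[0])    # turn 2
```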

Agent-to-agent authorization

When Agent A calls Agent B, Verra validates the trust chain. The delegating agent's trace ID must be present and its permissions must be a superset of the callee's requirements. Sensitive data categories (health, finance, HR) never cross agent-type boundaries regardless of policy configuration.

Human-in-the-loop approvals

Requests escalated by policy return 202 Accepted before forwarding. The admin queue surfaces the justification and context to a human reviewer. Approvals expire after one hour. Rejected requests write a block receipt and notify the calling agent. Approval rows record both the operator and the end-user identity (so EU AI Act Article 14 evidence resolves), plus a kind field that distinguishes unattributed-tool escalations from DLP and trust-tier escalations.

MCP governance

Every tool the model can invoke resolves to a registered MCP server with a known schema and a trust tier. Server registration captures the full input schema for each tool plus a SHA-256 hash; the agent pins to the hash at import time. On every request the proxy resolves tool_calls[] against the pin and the server's current schema, so drift is logged on the receipt automatically. An hourly cron rechecks every non-disabled server and appends an immutable drift event when fingerprints change.
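Schema pinning and drift detection can be sketched as hashing a canonical serialisation of the tool schema and comparing it with the pinned value. Canonicalising with sorted keys and compact separators is an assumption made here so that key order does not change the fingerprint; the page does not say how Verra serialises schemas before hashing, and the function names are hypothetical.

```python
import hashlib
import json


def schema_fingerprint(schema: dict) -> str:
    """SHA-256 over a canonical JSON form (sorted keys, no extra whitespace)."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_drifted(pinned_hash: str, current_schema: dict) -> bool:
    """True when the server's current schema no longer matches the pin."""
    return schema_fingerprint(current_schema) != pinned_hash


schema = {"name": "send_email", "input": {"to": "string", "body": "string"}}
pin = schema_fingerprint(schema)

changed = {"name": "send_email", "input": {"to": "string", "body": "string", "bcc": "string"}}
print(has_drifted(pin, schema))   # False
print(has_drifted(pin, changed))  # True
```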

Tools without an origin record (legacy agents, vanilla function definitions, un-imported servers) are governed by mcp.unattributed_tool_policy: allow, warn (default for new orgs), require_approval, or block. This policy is evaluated before the trust-tier and drift checks, because trust cannot be reasoned about for an unknown origin. Existing orgs are backfilled to allow, with a console warning at proxy time, so the upgrade does not break legacy agents.
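The evaluation order can be illustrated with a small sketch. The four policy values and their defaults are taken from the paragraph above; the outcome labels and function names are hypothetical.

```python
from enum import Enum


class UnattributedToolPolicy(str, Enum):
    ALLOW = "allow"
    WARN = "warn"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


# Hypothetical outcome labels, used only for illustration.
_OUTCOMES = {
    UnattributedToolPolicy.ALLOW: "forward",
    UnattributedToolPolicy.WARN: "forward_with_warning",
    UnattributedToolPolicy.REQUIRE_APPROVAL: "hold_for_approval",
    UnattributedToolPolicy.BLOCK: "block",
}


def default_policy(existing_org: bool) -> UnattributedToolPolicy:
    """Existing orgs are backfilled to allow; new orgs default to warn."""
    return UnattributedToolPolicy.ALLOW if existing_org else UnattributedToolPolicy.WARN


def evaluate_tool_call(has_origin_record: bool, policy: UnattributedToolPolicy) -> str:
    """Check origin first; trust-tier and drift checks only apply to attributed tools."""
    if not has_origin_record:
        return _OUTCOMES[policy]
    return "run_trust_tier_and_drift_checks"
```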

Server prompts (which are attacker-controllable text injected into model context) are scanned at import and on every recheck with the same regex + ML detector layering used on user input, catching DAN-style persona takeovers and "ignore previous instructions"-style indirect injection wording before the agent ever uses the prompt.
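As a rough illustration of the regex layer only, the sketch below flags the two kinds of wording named above. The patterns are examples written for this page, not Verra's rule set, and the ML detector layer is not shown.

```python
import re
from typing import List

# Illustrative patterns only; a production rule set would be far broader.
INJECTION_PATTERNS = {
    "ignore_previous": re.compile(
        r"\bignore\s+(?:all\s+)?(?:previous|prior|above)\s+instructions\b",
        re.IGNORECASE,
    ),
    "dan_persona": re.compile(
        r"\byou\s+are\s+(?:now\s+)?DAN\b|\bdo\s+anything\s+now\b",
        re.IGNORECASE,
    ),
}


def scan_prompt(text: str) -> List[str]:
    """Return the names of every pattern that matches the server prompt text."""
    return [name for name, pattern in INJECTION_PATTERNS.items() if pattern.search(text)]
```

Regexes like these catch known phrasings cheaply but miss paraphrases, which is why the pipeline described here layers an ML detector on top.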

Agent Trust Score (ATS)

Every registered agent carries an Agent Trust Score, a 0-100 composite index that measures how well-governed an agent is, modeled after credit-scoring methodology. Scores are visible per-agent in the dashboard and update in real time as configuration and behavior change.

Identity Integrity (30 pts)

Verification status (active/unverified/revoked), whether a policy is configured, number of policy versions saved, and coverage of the five policy rule flags (block secrets, block confidential, block jailbreak, warn PII, warn policy violation). Multiple policy versions signal iterative hardening.

Privilege Posture (25 pts)

Whether a scoped tool allow-list is defined (as opposed to allow-all), and how narrow it is. Narrower scopes score higher.

Behavioral Standing (20 pts)

Recent block and warn rates from the live receipt stream. New agents with no activity receive a neutral score so they are not penalised.

Configuration Coverage (15 pts)

LLM credentials configured, model target explicitly set, and baseline depth measured by all-time request volume. A baseline of 100+ requests signals a well-established agent with enough history for anomaly detection.

Operational Depth (10 pts)

Agent age (registered for 30+ days earns full marks) and all-time request history (500+ requests signals a mature production agent). This pillar rewards longevity and track record over raw configuration.

Trusted | 90-100
Compliant | 75-89
At Risk | 55-74
Exposed | 35-54
Critical | 0-34

Each pillar uses partial credit and sliding thresholds so the score reflects real security posture.
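To make the arithmetic concrete, here is a sketch that combines per-pillar partial credit into a score and maps it to a band. The pillar weights and band boundaries are the ones listed above; representing partial credit as a fraction between 0 and 1, and rounding to the nearest integer, are assumptions, and the way Verra derives each fraction is not shown.

```python
from typing import Dict

PILLAR_WEIGHTS = {
    "identity_integrity": 30,
    "privilege_posture": 25,
    "behavioral_standing": 20,
    "configuration_coverage": 15,
    "operational_depth": 10,
}

# (lowest score in band, label), highest band first.
BANDS = [(90, "Trusted"), (75, "Compliant"), (55, "At Risk"), (35, "Exposed"), (0, "Critical")]


def trust_score(credit: Dict[str, float]) -> int:
    """Weighted sum of pillar fractions, each clamped to the range 0..1."""
    total = 0.0
    for pillar, weight in PILLAR_WEIGHTS.items():
        fraction = min(max(credit.get(pillar, 0.0), 0.0), 1.0)
        total += weight * fraction
    return round(total)


def band(score: int) -> str:
    """Map a 0-100 score to its band label."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, label in BANDS:
        if score >= floor:
            return label
    raise AssertionError("unreachable: bands cover 0-100")
```

For example, fractions of 0.8, 0.6, 0.5, 1.0, and 0.5 across the five pillars in the order listed give 24 + 15 + 10 + 15 + 5 = 69, which falls in the At Risk band.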


Compliance

Compliance exports

Verra generates evidence packs for your auditors directly from the receipt log. Generate them from the admin dashboard under Reports. Each pack is downloadable as JSON, CSV (zipped), Excel (.xlsx), or PDF, with a separate integrity manifest and signature file.

EU AI Act

The EU AI Act evidence pack maps your Verra receipt and approval data to five articles governing high-risk systems:

Article 9 | Risk management | Aggregated risk classification distribution and blocked-request counts for your date range.
Article 12 | Record-keeping | SHA-256 hashed log of every proxied request (text_hash per receipt), with a Merkle-rooted evidence manifest per generated report, no raw text stored.
Article 13 | Transparency | DLP scan results and classification reasons surfaced to users.
Article 14 | Human oversight | Human-in-the-loop approval events with decision, timestamp, and reviewer.
Article 17 | Quality management | Versioned policy history with version numbers and timestamps per change.

The PDF includes a structured auditor report with cover page, per-article sections, and attestation block.
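For readers unfamiliar with Merkle roots, the sketch below shows one common way to fold a list of per-receipt hashes, such as the text_hash values recorded for Article 12, into a single root. The tree construction here (pairwise SHA-256, duplicating the last node on odd levels) is an assumption; the docs do not specify how Verra builds its manifest.

```python
import hashlib
from typing import List


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes_hex: List[str]) -> str:
    """Fold hex-encoded leaf hashes into one root hash (assumed construction)."""
    if not leaf_hashes_hex:
        raise ValueError("at least one leaf hash is required")
    level = [bytes.fromhex(h) for h in leaf_hashes_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()
```

Because every leaf feeds into the root, changing, adding, or removing a single receipt hash changes the root, which is what lets one short value stand in for the whole log.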

DORA

The DORA pack (Regulation (EU) 2022/2554) maps Verra data to the articles most relevant to ICT third-party risk and resilience for AI systems:

Article 5 | Governance | Agent and tool inventory with policy versions, owners, and trust tiers.
Article 17 | ICT incident reporting | Drift events, block events, and unattributed-tool escalations within the period.
Article 28 | Third-party ICT risk | MCP server inventory with trust tier, schema hash, and last-checked timestamp per server.

SOC 2 Type II

The SOC 2 AI Addendum pack maps Verra data to five Trust Services Criteria:

CC6.7 | Third-party AI tool access and OAuth grant log.
CC7.2 | Anomaly and policy violation event log (all blocked and warned requests).
CC7.3 | Incident register: human-review queue items and auto-resolved blocks.
PI1.1 | Processing integrity: complete proxied-call inventory for the period.
P3.1 | Privacy: DLP findings and consent/block actions per request.

The CSV download is a ZIP containing one CSV per criterion and a manifest with SHA-256 hashes, so auditors can verify the export has not been modified.
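An auditor-side check of such an export might look like the sketch below. The manifest layout is an assumption: it is taken to be a JSON file named manifest.json that maps each CSV filename to its hex SHA-256 digest. Adjust both to match the file actually shipped in the ZIP.

```python
import hashlib
import io
import json
import zipfile
from typing import List


def verify_export(zip_bytes: bytes, manifest_name: str = "manifest.json") -> List[str]:
    """Return the names of files whose SHA-256 does not match the manifest entry."""
    mismatched = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Assumed manifest shape: {"<file name>": "<hex sha256>", ...}
        manifest = json.loads(archive.read(manifest_name))
        for name, expected in manifest.items():
            actual = hashlib.sha256(archive.read(name)).hexdigest()
            if actual != expected:
                mismatched.append(name)
    return mismatched
```

An empty list means every file listed in the manifest still hashes to its recorded value.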

HIPAA

The HIPAA evidence pack maps Verra data to the Security Rule and Breach Notification Rule requirements most relevant to AI systems handling protected health information (PHI):

§164.312(b) | Audit controls | Complete proxied-call inventory with agent, timestamp, action, and findings for the period.
§164.312(e) | Transmission security | DLP scan results confirming PHI was flagged or masked before reaching the model.
§164.308(a)(1) | Risk analysis | Aggregated risk classification distribution and blocked-request counts.
§164.308(a)(5) | Security awareness | Policy change history and human-review approval events.

What makes the reports auditor-grade

Every report generated from the Reports page is deterministic, immutable, and tamper-evident. Verra runs the article queries against your receipts and approvals, produces a JSON evidence body, and renders a PDF that cites the underlying receipt, approval, drift-event, and agent IDs under each article. The PDF footer carries a SHA-256 hash of the JSON evidence on every page. Re-running a report for the same period with unchanged data produces a byte-identical evidence body and the same hash, which is the value auditors verify the report against. Reports are immutable once generated; if data changes, generate a new one.
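Determinism of this kind usually comes from fixing the record order and the serialization. The sketch below shows the idea under stated assumptions: each evidence record carries a unique id field, and the body is compact JSON with sorted keys. Neither detail is confirmed by these docs.

```python
import hashlib
import json
from typing import List


def evidence_body(records: List[dict]) -> bytes:
    """Serialize records to identical bytes regardless of input or key order."""
    ordered = sorted(records, key=lambda r: r["id"])  # assumes a unique "id" per record
    text = json.dumps(ordered, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def evidence_hash(records: List[dict]) -> str:
    """SHA-256 of the evidence body, comparable to the hash printed in a report footer."""
    return hashlib.sha256(evidence_body(records)).hexdigest()
```

An auditor holding the JSON evidence file can hash its bytes the same way and compare the result with the value in the PDF footer.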

MCP governance

Verra is a first-class MCP-aware proxy. Every tool the model can invoke is attributed to a registered server with a known schema; any drift from the schema captured at agent import is logged on the receipt. Server prompts are scanned at import and on every recheck with the same regex and ML detector layering used on user input. An hourly cron rechecks every non-disabled server and appends an immutable drift event when any tool schema changes; that drift event is what EU AI Act Article 15 and DORA Article 17 cite. A per-org policy controls how unattributed tool calls (legacy agents, vanilla function definitions) are handled: allow, warn, require_approval, or block.

Shadow AI detection

Calls reported to the Shadow AI reporter endpoint (a fetch wrapper, network proxy, or firewall webhook in your environment) surface in the Shadow AI dashboard with agent, timestamp, and request metadata. The proxy itself only sees calls that touch it, so Shadow AI coverage is bounded by what your reporter sends. Findings feed into the compliance gap report.
