
Overview

What is Verra?

Verra is a drop-in security proxy that sits between your AI agents and your LLM providers. Every request is scanned, governed by policy, and logged, without any changes to your agent code.

You point your agent at https://api.helloverra.com/api/proxy instead of https://api.openai.com and add one header. Verra forwards the request to the real provider after running its security pipeline.

Your Agent
    │
    │  POST /api/proxy  ·  x-verra-key: va-...
    ▼
┌──────────────────────────────────────────┐
│              Verra Proxy                 │
│                                          │
│  Auth → DLP → Policy → Route → Forward   │
│                         │                │
│                    Scan Response         │
└──────────────────────────────────────────┘
        │                       │
        ▼                       ▼
  LLM Provider             Audit Log

Works with OpenAI, Anthropic, Azure OpenAI, Amazon Bedrock, and Google Vertex AI out of the box. No infrastructure changes required.


Quickstart

Up in three steps

No SDK install required. Verra is a pure HTTP proxy: change one URL, add one header. Your existing OpenAI-compatible client works without modification.

1. Get your Verra key

After signing up, navigate to Agents → Register Agent in the Verra admin dashboard and register your agent. The key is shown once after registration. Keys have the prefix va-.

Store it as an environment variable, never commit it to source control.

export VERRA_KEY=va-...

2. Point your agent at Verra

Change the base URL in your LLM client and add the x-verra-key header. Everything else stays the same.

# Python · OpenAI SDK
import openai, os

client = openai.OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url="https://api.helloverra.com/api/proxy",  # ← change this
    default_headers={
        "x-verra-key": os.environ["VERRA_KEY"],  # ← add this
    },
)

# Your existing code is unchanged
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "..."}],
)
// TypeScript · OpenAI SDK
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: 'https://api.helloverra.com/api/proxy',  // ← change this
  defaultHeaders: {
    'x-verra-key': process.env.VERRA_KEY,  // ← add this
  },
});

3. Verify in the dashboard

Make any LLM call. Verra auto-registers your agent on first call. Open the admin dashboard → Receipts to see the audit log entry with risk level, DLP findings, and trace ID.

Verra never stores raw prompt or completion text, only a SHA-256 hash, token count, and metadata. PII and secrets never persist.
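
The hash-only claim can be sketched in a few lines. The receiptHash helper below is hypothetical (Verra's exact canonicalization is not documented here); it simply shows how a SHA-256 hex digest lets a receipt prove which prompt was sent without retaining the text:

```typescript
import { createHash } from 'node:crypto';

// Hypothetical sketch of receipt hashing: store a SHA-256 hex digest of the
// prompt text instead of the text itself, so the audit log can later prove
// *which* prompt was sent without ever retaining it.
function receiptHash(text: string): string {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

const prompt = 'What is our Q3 revenue forecast?';
const inputHash = receiptHash(prompt); // 64-char hex digest; raw text is discarded
```

The same digest appears as input_hash on the receipt, so you can match a stored prompt against the log without ever reading it back.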

OpenClaw

OpenClaw supports a baseUrl override per model provider. Point it at Verra and use your Verra agent key as the API key with no other changes needed.

// ~/.openclaw/openclaw.json
{
  "models": {
    "providers": {
      "anthropic": {
        "baseUrl": "https://api.helloverra.com/api/proxy/anthropic",
        "apiKey": "va-your-agent-key",
        "auth": "api-key"
      }
    }
  }
}

Replace anthropic with openai if your agents use GPT models. All OpenClaw sessions flow through Verra automatically.

LangChain callback (optional)

If you use LangChain, install the npm SDK and attach the callback handler. See the npm SDK section for full configuration options.

import { withVerraSecurity } from '@helloverra/sdk/langchain';

const safeLLM = withVerraSecurity(llm, {
  orgId: process.env.VERRA_ORG_ID,
  verraProxyUrl: 'https://api.helloverra.com',
  agentName: 'my-agent',
});

npm SDK

@helloverra/sdk

The Verra npm package ships the detection pipeline, LangChain and CrewAI integrations, and all types as a standalone library. Use it to run Verra's security checks directly in your Node.js application, no proxy hop required.

npm install @helloverra/sdk

LangChain integration

Import from @helloverra/sdk/langchain. The callback handler auto-registers your agent on first call and reports a receipt to your Verra dashboard for every LLM invocation.

import { ChatOpenAI } from '@langchain/openai';
import { withVerraSecurity } from '@helloverra/sdk/langchain';

const llm = new ChatOpenAI({ model: 'gpt-4o' });

const safeLLM = withVerraSecurity(llm, {
  orgId: process.env.VERRA_ORG_ID,             // your Verra org ID
  verraProxyUrl: 'https://api.helloverra.com', // your Verra deployment
  agentName: 'customer-support-agent',
  onBlock: { strategy: 'fallback', message: "I can't help with that." },
  onFlag: 'log',
  onError: 'fail_open',
});

// Use safeLLM exactly like your original LLM. Verra runs invisibly
const result = await safeLLM.invoke('Hello!');

CrewAI integration

Import from @helloverra/sdk/crewai. Attach step and task callbacks to any CrewAI agent to inspect tool inputs and outputs.

import { applyVerraSecurity } from '@helloverra/sdk/crewai';

const agent = new Agent({ role: 'researcher', tools: [...] });

applyVerraSecurity(agent, {
  orgId: process.env.VERRA_ORG_ID,
  verraProxyUrl: 'https://api.helloverra.com',
  agentName: 'researcher',
  onBlock: { strategy: 'throw' },
});

Direct pipeline usage

Run the detection pipeline directly without an integration wrapper. Useful for custom frameworks or serverless handlers.

import { runInputPipeline, runOutputPipeline } from '@helloverra/sdk';

// Inspect a request before sending to your LLM
const verdict = await runInputPipeline({
  requestId: crypto.randomUUID(),
  userInput: userMessage,
  systemPrompt: systemPrompt, // optional, enables prompt injection detection
  policy: myPolicy,           // optional, enables custom policy rules
});

if (verdict.action === 'block') {
  throw new Error(`Blocked by ${verdict.triggeredBy}`);
}

// Inspect the LLM response before returning to the user
const outVerdict = runOutputPipeline({
  requestId: verdict.requestId,
  response: llmResponse,
  systemPrompt: systemPrompt,
});

Configuration

Option · Type · Description
orgId · string · Your Verra org ID. Required for receipt reporting and auto-registration.
verraProxyUrl · string · Base URL of your Verra deployment. Required for receipt reporting.
agentName · string · Stable name for this agent, used to deduplicate registrations in the dashboard.
onBlock · object · Action when a request is blocked. { strategy: "throw" } raises VerraSecurityError; { strategy: "fallback", message } returns a safe string instead.
onFlag · string · "log" (default): logs flagged requests and continues. "passthrough": silently continues.
onError · string · "fail_open" (default): passes the request if the pipeline errors. "fail_closed": blocks on any pipeline error.
policy · object · Custom policy rules to enforce. See the Policy schema in the API reference.
sensitiveContext · string[] · Strings that should never appear in LLM responses. Triggers a block if found.
tools · string[] · Tool names this agent has access to. Passed at registration so the dashboard shows them immediately.
traceparent · string · W3C traceparent header value. Links SDK receipts and tool gate calls into a distributed OTel trace.

The SDK uses your OPENAI_API_KEY environment variable for the embedding-based detectors. If it is not set, semantic similarity layers are skipped and only pattern-based detection runs.

Observability

Observability integrations

LangSmith

Verra and LangSmith complement each other: LangSmith gives you execution traces and prompt debugging; Verra gives you security enforcement and compliance receipts. Two integrations work out of the box.

1. Run IDs in receipts. When you use VerraCallbackHandler, every receipt written to the Verra dashboard includes the LangSmith run_id in its findings.langsmith_run_id field. You can look up the corresponding LangSmith trace directly from any blocked or flagged receipt.

2. Automatic tool gating. VerraCallbackHandler now hooks handleToolStart and calls /api/gate/tool-input automatically on every tool invocation, with no manual gate call needed. Blocked tools raise a VerraSecurityError (or return your fallback message) before the tool runs.

Exporting Verra traces to LangSmith

LangSmith accepts OpenTelemetry traces over OTLP. Since Verra is OTel-native, you can send Verra spans (proxy requests, risk analysis, upstream LLM calls) directly into your LangSmith project by setting two environment variables:

# Your LangSmith OTLP endpoint: find it in Settings → Integrations → OTLP
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.smith.langchain.com/otel
OTEL_EXPORTER_OTLP_HEADERS=x-api-key=<your-langsmith-api-key>

Verra spans will appear in LangSmith as a parallel trace alongside your LangChain execution trace, tagged with gen_ai.* and verra.* attributes. No code changes are required; just set the two environment variables.

Other OTLP backends

The same OTEL_EXPORTER_OTLP_ENDPOINT variable works with any OpenTelemetry-compatible backend. Point it at Grafana Tempo, Jaeger, Honeycomb, Datadog, or your own collector.

# Grafana Cloud
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-east-0.grafana.net/otlp

# Honeycomb
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.io
OTEL_EXPORTER_OTLP_HEADERS=x-honeycomb-team=<your-api-key>

# Local Jaeger (development)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

Each proxy request creates a root span (verra.proxy) with child spans for risk analysis and the upstream LLM call. Token usage is recorded as gen_ai.usage.input_tokens and gen_ai.usage.output_tokens. Request counts and latency are exported as OTLP metrics every 60 seconds.


Configuration

Environment variables

Name · Type · Description
VERRA_KEY · string, required · Your Verra API key (prefix va-). Required on all proxy requests.
VERRA_ORG_ID · string · Organization ID. Auto-resolved from the key when omitted.
VERRA_LOG_LEVEL · string · Logging verbosity: silent | error | info | debug. Defaults to info.
VERRA_TIMEOUT_MS · number · Request timeout in milliseconds forwarded to the upstream LLM. Defaults to 60000.

Policies

Policies are managed in the admin dashboard under Settings → Policies. Each policy is a named rule set that specifies which agents it applies to, which checks to run, and what action to take (pass / warn / block / escalate).

# Example policy (YAML representation shown for clarity)
name: finance-agent-policy
applies_to:
  agent_types: [finance, payroll]

# Cost & abuse protection
max_input_tokens: 8000        # block requests estimated over this token count
max_requests_per_minute: 60   # block if agent exceeds this rate (rolling 1-min window)

# PII handling
block_pii_input: true         # block (not just warn) when PII is found in user input

rules:
  - name: block-pii
    check: dlp
    patterns: [ssn, credit_card, bank_account]
    action: block

  - name: flag-code-execution
    check: tool_access
    tools: [code_interpreter, bash]
    action: escalate          # routes to human-in-the-loop queue

  - name: route-sensitive-to-private
    check: risk_level
    threshold: high
    action: route_private     # sends to your self-hosted model

Model routing

Configure a private model target in Settings → Model Routing. When a request is flagged as high-risk, Verra routes it to the private endpoint instead of the public provider, so no raw data leaves your network.

# Settings → Model Routing
private_endpoint: https://your-internal-llm.corp/v1
private_model: llama-3-70b
fallback: block   # block | pass_public

API Reference

POST /api/proxy

The proxy endpoint accepts any OpenAI-compatible request body. Verra inspects it, applies policies, then forwards to the configured upstream provider.

Request headers

x-verra-key · string, required · Your Verra agent API key (prefix va-).
x-verra-org · string · Org ID for auto-provisioning. Only needed when calling without a registered agent key.
x-verra-user · string · End-user identity for per-user audit trails.
x-verra-parent-agent · string · Agent ID of the calling agent, for agent-to-agent chains.
x-verra-session-id · string · Stable session identifier for multi-turn conversations. Verra loads the last 10 messages from this session as context so cross-turn attacks are detected.
x-verra-approval-id · string · Re-submit a previously escalated request after human approval has been granted.
traceparent · string · W3C TraceContext header (00-{traceId}-{spanId}-{flags}). Links this hop into an existing distributed trace. Verra generates a new trace if absent.
x-verra-trace-id · string · Legacy trace ID header, accepted for backward compatibility. Prefer traceparent for new integrations.
Authorization · string, required · Your upstream provider key (e.g. Bearer sk-...). Forwarded to the LLM after Verra's pipeline runs.
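
If your agent sits at the start of a trace, it needs to mint its own traceparent value. A minimal sketch of generating and parsing the W3C format (makeTraceparent and parseTraceparent are illustrative helpers, not part of any Verra SDK):

```typescript
import { randomBytes } from 'node:crypto';

// Build a W3C traceparent value: version 00, 16-byte trace ID, 8-byte span ID,
// flags 01 (sampled). Illustrative helper, not part of the Verra SDK.
function makeTraceparent(): string {
  return `00-${randomBytes(16).toString('hex')}-${randomBytes(8).toString('hex')}-01`;
}

// Split a traceparent back into its four fields; returns null if malformed.
function parseTraceparent(tp: string) {
  const m = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(tp);
  return m ? { version: m[1], traceId: m[2], spanId: m[3], flags: m[4] } : null;
}
```

Send the generated value in the traceparent request header; downstream agents reuse the same trace ID with a fresh span ID per hop.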

Response headers

x-verra-receipt-id · string · ID of the audit receipt written for this request.
x-verra-risk · string · Risk classification: low | medium | high.
x-verra-action · string · Pipeline action taken: pass | warn | block | escalate.
traceparent · string · W3C traceparent for this hop. Pass to the next agent or store for distributed tracing.
x-verra-trace-id · string · Trace ID in UUID format (echoed or generated). Legacy; use traceparent for new integrations.

Blocked requests

When a request is blocked by policy, Verra returns 403 Forbidden with a structured error body instead of forwarding to the LLM.

HTTP/1.1 403 Forbidden
Content-Type: application/json
x-verra-action: block
x-verra-receipt-id: rcpt_a3f9c1...

{
  "error": {
    "type": "policy_violation",
    "message": "Request blocked by policy: block-pii",
    "findings": ["pii:ssn", "pii:credit_card"],
    "receipt_id": "rcpt_a3f9c1..."
  }
}

Escalated requests

Escalated requests return 202 Accepted immediately. The upstream call is held pending human approval. Your agent should surface the approval URL to the user; the call proceeds once approved, or expires after one hour.

HTTP/1.1 202 Accepted
Content-Type: application/json
x-verra-action: escalate

{
  "status": "pending_approval",
  "approval_url": "https://api.helloverra.com/approvals/apr_b7e2d4...",
  "expires_at": "2025-06-01T14:00:00Z",
  "receipt_id": "rcpt_b7e2d4..."
}
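
A client can branch on these two response shapes with a small helper. The sketch below mirrors the status codes and bodies documented above; the ProxyOutcome type and classifyProxyResponse function are illustrative, not a published contract:

```typescript
// Sketch: route an agent's next step from a Verra proxy response. Shapes are
// taken from the documented 403 and 202 examples; treat them as illustrative.
type ProxyOutcome =
  | { kind: 'ok' }
  | { kind: 'blocked'; findings: string[]; receiptId: string }
  | { kind: 'pending'; approvalUrl: string; expiresAt: string };

function classifyProxyResponse(status: number, body: any): ProxyOutcome {
  if (status === 403 && body?.error?.type === 'policy_violation') {
    return {
      kind: 'blocked',
      findings: body.error.findings ?? [],
      receiptId: body.error.receipt_id,
    };
  }
  if (status === 202 && body?.status === 'pending_approval') {
    return { kind: 'pending', approvalUrl: body.approval_url, expiresAt: body.expires_at };
  }
  return { kind: 'ok' }; // 2xx pass-through: body is the provider's response
}
```

On 'pending', surface approvalUrl to the user and re-submit with x-verra-approval-id once approval is granted.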

GET /api/receipts

Query the audit log. Requires an admin API key (va-admin_...).

curl https://api.helloverra.com/api/receipts \
  -H "x-verra-key: va-admin_..." \
  -G \
  --data-urlencode "agent=finance-report-ag" \
  --data-urlencode "risk=high" \
  --data-urlencode "from=2025-05-01" \
  --data-urlencode "limit=50"

Returns a JSON array of receipt objects. Each receipt contains id, agent_id, trace_id, risk_level, action, findings, input_hash, output_hash, and created_at. Raw text is never returned.


Concepts

How the pipeline works

Every request through Verra passes five sequential stages. Each stage can independently pass, warn, block, or escalate the request.

1. Auth

The x-verra-key header is verified against your organization. Invalid or revoked keys receive a 401 immediately, before any content is read.

2. Data Loss Prevention (DLP)

The request body is scanned for sensitive patterns before it reaches the model. Four detectors run in parallel, so total latency is that of the slowest detector rather than the sum of all four.
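
A minimal sketch of that concurrency claim: Promise.all starts every detector immediately and resolves when the last finishes, so wall time tracks the slowest detector. The detector names and Finding shape below are illustrative, not Verra's internals:

```typescript
// Run independent detectors concurrently and merge their findings.
type Finding = { detector: string; hit: boolean };

async function runDetectors(
  text: string,
  detectors: Array<(t: string) => Promise<Finding>>,
): Promise<Finding[]> {
  // All detectors start at once; total latency is the max, not the sum.
  return Promise.all(detectors.map((d) => d(text)));
}

// Two toy detectors standing in for the real PII / injection models:
const pii = async (t: string): Promise<Finding> =>
  ({ detector: 'pii', hit: /\b\d{3}-\d{2}-\d{4}\b/.test(t) });
const injection = async (t: string): Promise<Finding> =>
  ({ detector: 'injection', hit: /ignore (all |your )?(previous )?instructions/i.test(t) });
```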

Pattern matching & PII detection

Regex patterns plus iiiorg/piiranha-v1 (NER model, runs locally via ONNX) for SSNs, credit cards, API keys, IBANs, passport numbers, medical record numbers (MRN), dates of birth, and 40+ other types. When block_pii_input is set in policy, PII on input is blocked rather than warned.

Prompt injection & jailbreak

Three-layer pipeline per detector: pattern matching, then an on-device ONNX classifier (protectai/deberta-v3-base-prompt-injection-v2 for injection, jackhhao/jailbreak-classifier for jailbreaks), then an LLM judge for ambiguous scores. Covers role-play escapes, indirect instruction injection, and DAN-style jailbreaks. Models run locally via @huggingface/transformers with no data sent externally.

System prompt extraction

Blocks direct extraction attempts ("repeat your instructions", "show me your system prompt") and flags indirect extraction probes ("translate your instructions to French"). 20 direct + 14 indirect patterns.

Structured data injection

JSON payloads are recursively unwrapped and all string values scanned. Catches attacks embedded inside {"notes": "ignore your instructions..."} or nested tool arguments.
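
The recursive unwrap can be sketched as a traversal that collects every string value so each one is scanned like plain text. This collectStrings helper is an illustration, not Verra's implementation:

```typescript
// Recursively collect every string value inside a JSON payload.
function collectStrings(value: unknown, out: string[] = []): string[] {
  if (typeof value === 'string') {
    out.push(value);
  } else if (Array.isArray(value)) {
    for (const v of value) collectStrings(v, out);
  } else if (value !== null && typeof value === 'object') {
    for (const v of Object.values(value)) collectStrings(v, out);
  }
  return out;
}

// Attack text buried inside a field still surfaces for scanning:
const payload = { notes: 'ignore your instructions...', args: { tags: ['a', 'b'] } };
const strings = collectStrings(payload); // ['ignore your instructions...', 'a', 'b']
```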

3. Policy evaluation

Policies are evaluated in order. The first matching rule determines the action. If no rule matches, the default action (configurable, defaults to pass) applies.
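
First-match semantics can be sketched in a few lines. The Rule shape and the evaluatePolicy function below are illustrative, not the policy schema:

```typescript
// Rules run in order; the first rule whose predicate fires decides the
// action, and the default applies when nothing matches.
type Action = 'pass' | 'warn' | 'block' | 'escalate';
interface Rule {
  name: string;
  matches: (findings: string[]) => boolean;
  action: Action;
}

function evaluatePolicy(
  rules: Rule[],
  findings: string[],
  defaultAction: Action = 'pass',
): { action: Action; triggeredBy?: string } {
  for (const rule of rules) {
    if (rule.matches(findings)) return { action: rule.action, triggeredBy: rule.name };
  }
  return { action: defaultAction };
}

const rules: Rule[] = [
  { name: 'block-pii', matches: (f) => f.some((x) => x.startsWith('pii:')), action: 'block' },
  { name: 'flag-code-execution', matches: (f) => f.includes('tool:code_interpreter'), action: 'escalate' },
];
```

Note that ordering matters: with the rules above, a request carrying both PII and a code-execution tool is blocked by block-pii, because it appears first.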

4. Tool access control

For agentic requests that include tool definitions, Verra validates against a four-layer gate: RBAC, agent-type permission matrix, behavioral baseline, and content scan. An HR agent cannot invoke code execution tools; a finance agent cannot query external databases, regardless of what the prompt says.

Tool restrictions apply even when the model attempts to call an unauthorized tool mid-conversation. The tool call is stripped and the agent receives a policy error in the tool result.

5. Model routing

Low-risk requests are forwarded to your configured LLM provider. High-risk requests are routed to your private model endpoint, keeping sensitive data inside your network boundary. You configure both targets in the dashboard.

Multi-turn session analysis

Pass x-verra-session-id on each request to enable session-aware detection. Verra loads the last 10 messages from that session and prepends them as context before running the pipeline. This catches slow-burn attacks that unfold across multiple turns, for example an adversary who gradually shifts the model's behavior over several exchanges rather than in a single message.
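
The windowing itself is simple to sketch: keep the most recent 10 messages per session and hand that slice to the pipeline as context. The in-memory map below is for illustration only; Verra keeps session state server-side:

```typescript
interface Msg { role: 'user' | 'assistant'; content: string }
const sessions = new Map<string, Msg[]>();

// Append a message and trim the session to the last `windowSize` entries.
function appendAndWindow(sessionId: string, msg: Msg, windowSize = 10): Msg[] {
  const next = [...(sessions.get(sessionId) ?? []), msg].slice(-windowSize);
  sessions.set(sessionId, next);
  return next; // this slice is what gets prepended before analysis
}
```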

Agent-to-agent authorization

When Agent A calls Agent B, Verra validates the trust chain. The delegating agent's trace ID must be present and its permissions must be a superset of the callee's requirements. Sensitive data categories (health, finance, HR) never cross agent-type boundaries regardless of policy configuration.

Human-in-the-loop approvals

Requests escalated by policy return 202 Accepted before forwarding. The admin queue surfaces the justification and context to a human reviewer. Approvals expire after one hour. Rejected requests write a block receipt and notify the calling agent.

Agent Trust Score (ATS)

Every registered agent carries an Agent Trust Score — a 0–100 composite index that measures how well-governed an agent is, modeled after credit-scoring methodology. Scores are visible per-agent in the dashboard and update in real time as configuration and behavior change.

Identity Integrity · 30 pts

Verification status (active/unverified/revoked), whether a policy is configured, number of policy versions saved, and coverage of the five policy rule flags (block secrets, block confidential, block jailbreak, warn PII, warn policy violation). Multiple policy versions signal iterative hardening.

Privilege Posture · 25 pts

Whether a scoped tool allow-list is defined (as opposed to allow-all), and how narrow it is. Fewer allowed tools = higher score. An agent restricted to 1–3 tools earns full marks; allow-all earns zero for this pillar.

Behavioral Standing · 20 pts

7-day rolling block rate and warn rate. Agents with a block rate below 2% and warn rate below 5% earn full marks. Agents with no recent activity receive a neutral score — new agents are not penalized. Activity volume (request count in the 7-day window) also contributes.

Configuration Coverage · 15 pts

LLM credentials configured, model target explicitly set, and baseline depth measured by all-time request volume. A baseline of 100+ requests signals a well-established agent with enough history for anomaly detection.

Operational Depth · 10 pts

Agent age (registered for 30+ days earns full marks) and all-time request history (500+ requests signals a mature production agent). This pillar rewards longevity and track record over raw configuration.

Trusted · 90–100
Compliant · 75–89
At Risk · 55–74
Exposed · 35–54
Critical · 0–34

No pillar is binary — each uses partial credit and sliding thresholds so the score reflects real security posture rather than a checkbox audit. A revoked agent does not have its score capped; it simply loses the Identity Integrity points, letting the other four pillars reflect whatever configuration is still in place.
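
The composite can be sketched as a weighted sum: each pillar contributes a fraction of its maximum points and the total maps to a band. This is a minimal sketch assuming simple clamping and rounding, which Verra does not specify; how each pillar's fraction is derived is described in prose above:

```typescript
// Pillar maxima from the table above; fractions are partial-credit inputs in [0, 1].
const PILLARS = { identity: 30, privilege: 25, behavior: 20, coverage: 15, depth: 10 } as const;
type Pillar = keyof typeof PILLARS;

function trustScore(fractions: Record<Pillar, number>): number {
  let total = 0;
  for (const pillar of Object.keys(PILLARS) as Pillar[]) {
    const f = Math.min(1, Math.max(0, fractions[pillar])); // clamp partial credit
    total += PILLARS[pillar] * f;
  }
  return Math.round(total);
}

// Map a 0–100 score to the bands listed above.
function band(score: number): string {
  if (score >= 90) return 'Trusted';
  if (score >= 75) return 'Compliant';
  if (score >= 55) return 'At Risk';
  if (score >= 35) return 'Exposed';
  return 'Critical';
}
```

For example, an agent with full marks everywhere except half credit on Privilege Posture lands at 88, in the Compliant band.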


Compliance

Compliance exports

Verra generates evidence packs for your auditors directly from the receipt log. Exports are available from the admin dashboard under Evidence Packs.

EU AI Act

The EU AI Act evidence pack maps your Verra receipt and approval data to five high-risk system articles:

Article 9 · Risk management · Aggregated risk classification distribution and blocked-request counts for your date range.
Article 12 · Record-keeping · SHA-256 hash-chained log of every proxied request, tamper-evident, no raw text stored.
Article 13 · Transparency · DLP scan results and classification reasons surfaced to users.
Article 14 · Human oversight · Human-in-the-loop approval events with decision, timestamp, and reviewer.
Article 17 · Quality management · Policy change history with before/after diffs and timestamps.

Export formats: JSON, CSV, Excel (.xlsx), PDF. The PDF includes a structured auditor report with cover page, per-article sections, and attestation block.

SOC 2 Type II

The SOC 2 AI Addendum pack maps Verra data to five Trust Services Criteria:

CC6.7 · Third-party AI tool access and OAuth grant log.
CC7.2 · Anomaly and policy violation event log (all blocked and warned requests).
CC7.3 · Incident register: human-review queue items and auto-resolved blocks.
PI1.1 · Processing integrity: complete proxied-call inventory for the period.
P3.1 · Privacy: DLP findings and consent/block actions per request.

Export formats: ZIP (6 CSVs + integrity manifest), Excel (.xlsx), PDF. The integrity manifest contains SHA-256 hashes of each CSV file, allowing auditors to verify the export has not been modified.
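
Verifying the manifest is the standard recompute-and-compare check. A sketch (verifyManifest is a hypothetical helper; file names and the manifest layout are assumptions, and an auditor would read the CSVs out of the exported ZIP rather than pass buffers):

```typescript
import { createHash } from 'node:crypto';

// Recompute each file's SHA-256 and compare against the manifest entry.
function verifyManifest(
  files: Record<string, Buffer>,
  manifest: Record<string, string>,
): string[] {
  const mismatches: string[] = [];
  for (const [name, expected] of Object.entries(manifest)) {
    const file = files[name];
    const actual = file ? createHash('sha256').update(file).digest('hex') : '(missing)';
    if (actual !== expected) mismatches.push(name);
  }
  return mismatches; // empty array means the export has not been modified
}
```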

HIPAA

The HIPAA evidence pack maps Verra data to the Security Rule and Breach Notification Rule requirements most relevant to AI systems handling protected health information (PHI):

§164.312(b) · Audit controls · Complete proxied-call inventory with agent, timestamp, action, and findings for the period.
§164.312(e) · Transmission security · DLP scan results confirming PHI was flagged or masked before reaching the model.
§164.308(a)(1) · Risk analysis · Aggregated risk classification distribution and blocked-request counts.
§164.308(a)(5) · Security awareness · Policy change history and human-review approval events.

Export formats: ZIP (CSVs + integrity manifest), Excel (.xlsx), PDF.

Shadow AI detection

Any AI call that bypasses the proxy is surfaced in the Shadow AI dashboard with agent, timestamp, and request metadata. Findings feed into the compliance gap report.

Need a walkthrough?

Book a 15-minute demo and we'll run through setup in your environment.
