NextPDF Connect HITL risk tiers

At a glance

Every tool declares one of four risk levels. A tool at the highest level, approval_required, does not run on the first call. Instead, the ConfirmationGate returns a single-use challenge token. The agent must relay that token to a human, who authorizes the re-invocation.

Install

composer require nextpdf/server

Conceptual overview

The risk model has exactly four ordered levels:

Level	Value	Meaning	Effect
safe	0	Read-only, no side effects	Auto-execute
caution	1	Creates or modifies in-memory state	Auto-execute, audit-logged
review	2	Produces output that could be misused	Auto-execute, audit-logged
approval_required	3	Destructive, legal, or privacy-critical	Human confirmation required

A tool’s risk comes from exactly two places: the tool’s own declaration and an optional operator configuration override. There is no third source. The model carries a version number. The MCP initialize response exposes that number so a client can detect an incompatible change. Audit logging applies at caution and above.
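
The four levels and the two risk sources can be sketched as follows. This is a minimal illustration, not the server's code: the names RiskLevel, effective_risk, needs_confirmation, and is_audited are hypothetical, and the sketch assumes that an override, when present, simply replaces the declared level (downgrade validation is assumed to happen separately, at configuration load).

```python
from enum import IntEnum
from typing import Optional


class RiskLevel(IntEnum):
    """The four ordered levels from the table above."""
    SAFE = 0
    CAUTION = 1
    REVIEW = 2
    APPROVAL_REQUIRED = 3


def effective_risk(declared: RiskLevel,
                   override: Optional[RiskLevel] = None) -> RiskLevel:
    """Resolve risk from the only two sources: declaration and override.

    Assumption: a present override replaces the declared level.
    """
    return declared if override is None else override


def needs_confirmation(level: RiskLevel) -> bool:
    """Only the highest level is held for human confirmation."""
    return level == RiskLevel.APPROVAL_REQUIRED


def is_audited(level: RiskLevel) -> bool:
    """Audit logging applies at caution and above."""
    return level >= RiskLevel.CAUTION
```

Using an ordered integer enum keeps "caution and above" a single comparison and matches the numeric values used in configuration overrides.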

Holding an automated action until a human authorizes it places the control at the point where the automation introduces risk. IEC 31010:2019 identifies this position, at or near the point of introduction, as the place to control risk introduced through human action.

API surface

The ConfirmationGate

When you invoke an approval_required tool without a valid token, the gate issues a challenge. The check returns one of two shapes.

{ "allowed": true }

{ "allowed": false, "challenge": "<human-readable text>", "token": "confirm_<nonce>" }

The challenge text names the operation and its description. It also warns when a target file would be overwritten. It tells the caller to re-invoke the same tool with a _confirmation_token parameter set to the issued token. The token expires in 300 seconds.
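
A caller that receives one of the two shapes can branch on the allowed field. The helper below is a sketch under the assumption that the result arrives as a JSON string in exactly one of those shapes. How the shapes surface on a given transport may differ, and parse_gate_result is a hypothetical name.

```python
import json
from typing import Optional, Tuple


def parse_gate_result(raw: str) -> Tuple[bool, Optional[str], Optional[str]]:
    """Return (allowed, challenge_text, token) from a gate result.

    When the call is allowed, there is no challenge and no token.
    """
    result = json.loads(raw)
    if result.get("allowed") is True:
        return True, None, None
    # Denied: the challenge is for the human, the token is for the retry.
    return False, result.get("challenge"), result.get("token")
```

The challenge text is meant for the human, while the token is what the caller sends back as _confirmation_token within the 300-second window.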

Token binding is deliberate: the token binds the tool name, a random nonce, and the TTL, but not the arguments. On retry, MCP clients may re-serialize arguments with different key ordering or normalization, so hashing the arguments would break legitimate confirmations. The token is single-use. Consuming it on the re-invocation allows the call exactly once.
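
The binding and single-use behavior described above can be illustrated with a small in-memory gate. This is a sketch under stated assumptions, not the ConfirmationGate implementation: the class name, the injectable clock, the nonce length, and the challenge wording are all hypothetical, and garbage collection of expired tokens is omitted.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 300  # expiry stated in the text above


class ConfirmationGateSketch:
    """Illustrative gate: tokens bind tool name + nonce + TTL, not arguments."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # token -> (bound tool name, expiry timestamp)
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def check(self, tool: str, description: str,
              token: Optional[str] = None) -> dict:
        """Allow the call if a valid token is presented, else issue a challenge."""
        if token is not None and self._consume(tool, token):
            return {"allowed": True}
        issued = "confirm_" + secrets.token_hex(16)  # random nonce
        self._tokens[issued] = (tool, self._clock() + TOKEN_TTL_SECONDS)
        challenge = (
            f"Confirm '{tool}': {description}. Re-invoke the same tool "
            "with _confirmation_token set to the issued token."
        )
        return {"allowed": False, "challenge": challenge, "token": issued}

    def _consume(self, tool: str, token: str) -> bool:
        # pop() removes the token on any presentation, so it is single-use.
        entry = self._tokens.pop(token, None)
        if entry is None:
            return False
        bound_tool, expires_at = entry
        return bound_tool == tool and self._clock() < expires_at
```

Because the arguments are not part of the binding, a client that re-serializes them with a different key order on retry still passes the check. An invalid or expired token results in a fresh challenge rather than a permanent block.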

Surfacing per transport

The gate is enforced on every transport that drives tools:

MCP: the challenge returns in-band as a successful JSON-RPC response with the challenge text as its content. The caller re-invokes tools/call with arguments._confirmation_token.
REST and gRPC: the same gate runs in the shared tool executor before an approval_required operation. The challenge appears in the operation response. The caller repeats the operation with the token.

Built-in downgrade protection

A configuration override may raise a tool’s risk level, but it may never lower a tool that is approval_required by design. The configuration loader enforces a fixed critical set and throws at load time if an override attempts a downgrade. The server refuses to boot rather than run with a weakened gate.
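
The load-time check can be pictured as a validation pass over the override map. The sketch below is an assumption-laden illustration: validate_overrides, RiskDowngradeError, and the contents of CRITICAL_TOOLS are hypothetical (only output_pdf is named in this document as approval_required by design), and the real loader may differ.

```python
from typing import Dict, FrozenSet

APPROVAL_REQUIRED = 3

# Hypothetical critical set; the real fixed set is defined by the server.
CRITICAL_TOOLS: FrozenSet[str] = frozenset({"output_pdf"})


class RiskDowngradeError(ValueError):
    """Raised when an override would weaken an approval_required tool."""


def validate_overrides(overrides: Dict[str, int]) -> Dict[str, int]:
    """Reject invalid levels and downgrades of critical tools at load time."""
    for tool, level in overrides.items():
        if level not in (0, 1, 2, 3):
            raise ValueError(f"{tool}: unknown risk level {level!r}")
        if tool in CRITICAL_TOOLS and level < APPROVAL_REQUIRED:
            # Failing here is what keeps the server from booting.
            raise RiskDowngradeError(
                f"{tool} is approval_required by design; "
                f"override to {level} is a downgrade"
            )
    return dict(overrides)
```

Raising during load, rather than silently ignoring the bad entry, mirrors the stated behavior: the server refuses to start instead of running with a weakened gate.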

Code sample — Quick start

Trigger a challenge by writing a file with output_pdf:

./vendor/bin/nextpdf-mcp <<'EOF'
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"c","version":"1.0.0"}}}
{"jsonrpc":"2.0","method":"notifications/initialized"}
{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"output_pdf","arguments":{"document_id":"<id>","file_path":"/var/lib/nextpdf/tmp/out.pdf"}}}
EOF

The response is the challenge, not the file. Re-invoke with the issued token:

{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"output_pdf","arguments":{"document_id":"<id>","file_path":"/var/lib/nextpdf/tmp/out.pdf","_confirmation_token":"confirm_<nonce>"}}}
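
A client can derive the retry from the original request instead of rebuilding it by hand. The helper below is a minimal sketch; with_confirmation_token is a hypothetical name, and the request shape is taken from the tools/call messages shown above.

```python
import copy


def with_confirmation_token(request: dict, token: str, new_id: int) -> dict:
    """Return a copy of a tools/call request carrying the issued token.

    The original request is left untouched; only the id and the
    _confirmation_token argument differ in the copy.
    """
    retry = copy.deepcopy(request)
    retry["id"] = new_id
    retry["params"]["arguments"]["_confirmation_token"] = token
    return retry
```

Since the token is not bound to the arguments, the retry does not need byte-identical serialization. It does need to target the same tool and arrive before the token expires.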

Code sample — Production

Raise a tool that normally sits at caution to approval_required for a hardened deployment:

nextpdf_mcp:
  risk_level_overrides:
    add_image: 3      # require human confirmation for image insertion

A downgrade is rejected at load time, and the server does not start. For example, setting output_pdf below 3 is a downgrade.

Edge cases and gotchas

output_pdf in base64 mode is not gated. Writing to disk is approval_required; returning the PDF as base64 (no file_path) is treated as lower risk and runs without confirmation.
The token is not a credential. It does not authenticate the caller and does not replace an API key on networked transports. It only releases one specific gated call once, within 300 seconds.
A new challenge each time. Failing to relay a token, or letting it expire, does not block the tool permanently. The next call issues a fresh challenge. Tokens are stored in a single-use token store with periodic garbage collection.
Audit happens regardless of outcome. A challenge issuance, a successful execution, and a failed execution at caution-and-above are all audit-logged with the tool name and risk level.

Performance

The gate adds a token store lookup and, on challenge, random-token generation. That cost is negligible next to the gated operation and applies only to approval_required tools.

Security notes

The gate is a containment control, not an authentication control. It ensures a human authorizes destructive, legal, or privacy-critical actions even when an autonomous agent drives the tool. For these operations, the server does not claim to operate without human oversight, and configuration cannot weaken the gate. Combine it with the API key model on networked transports and with enabled_tools least-privilege scoping. See /connect/security-and-operations/.

Conformance

Claim	Source	`reference_id`
Control risk at the point of (human) introduction	IEC 31010:2019

The MCP initialize response carries the risk-model version so clients can detect an incompatible change. The wire format is documented on /transports/mcp/.

Commercial context

Premium tools declare their own risk level with the same four-level model. Destructive Premium operations, such as redaction, use the identical gate. The gate is part of the server, not the Premium package.