Engine threat model

At a glance

This page defines the threat model for the NextPDF core engine. It lists the attack classes the engine treats as in scope when it processes attacker-influenced input: Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), Scalable Vector Graphics (SVG), fonts, images, and existing Portable Document Format (PDF) files. It states the default posture for each external-resource capability and points to the in-code guard that mitigates each class.

Boundary. A threat model records the threats considered and the mitigations applied to them. It does not claim that vulnerabilities are absent. A class not listed here is not proven absent; it may sit outside the model’s current scope. Before unimplemented features ship, they go through a formal threat review. Treat this page as a record of deliberate design, not as a security proof.

Install

composer require nextpdf/core:^3

The guards described here are part of the core package. You do not need an extra dependency to enable them. They are on by default.

Conceptual overview

The model follows the Open Worldwide Application Security Project (OWASP) threat-modeling process (owasp_threat_modeling#x1.x11.p6): decompose the system into the points where untrusted input crosses a trust boundary, enumerate the threats at each boundary, and record the mitigation.

The engine’s primary trust boundary is document ingestion: any place where content authored elsewhere — a remote stylesheet, an @font-face source, an <image href>, an embedded Extensible Markup Language (XML) invoice, or a PDF to inspect — could make the engine fetch, parse, or decompress data. The governing principle is deny-by-default: every external-resource capability is off until you explicitly enable it through a policy object. This applies the least-functionality baseline from National Institute of Standards and Technology (NIST) SP 800-53 Rev. 5 CM-7 (nist_sp_800_53r5#x4.x182.p14) to a rendering engine: the constructor default is the strictest position. Opening a capability is your explicit decision.
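As a rough illustration of the deny-by-default principle, the sketch below shows a policy object whose constructor default allows nothing and whose opt-in method returns a modified copy. The class name `SketchResourcePolicy`, its methods, and the capability strings are hypothetical and exist only for this example; they are not the NextPDF policy contract.

```php
<?php

declare(strict_types=1);

// Hypothetical illustration only; not the NextPDF policy API.
final class SketchResourcePolicy
{
    /** @param array<string, list<string>> $allowedSchemes capability => schemes */
    public function __construct(private array $allowedSchemes = [])
    {
        // With no arguments, no capability is open: the strictest position.
    }

    /** Returns a copy with one capability opened for the given schemes. */
    public function withAllowed(string $capability, array $schemes): self
    {
        $copy = clone $this;
        $copy->allowedSchemes[$capability] = array_map('strtolower', array_values($schemes));

        return $copy;
    }

    /** True only if the capability was explicitly opened for this scheme. */
    public function allows(string $capability, string $scheme): bool
    {
        return in_array(
            strtolower($scheme),
            $this->allowedSchemes[$capability] ?? [],
            true
        );
    }
}
```

Because the opt-in returns a copy, the strict default instance can be shared without risk that one caller widens it for another.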

API surface

The threat model is not an application programming interface (API). The policy objects that express it are documented on the module pages. The trust-relevant entry points are the external-resource policy contract (ExternalResourcePolicyInterface, with DefaultExternalResourcePolicy as the deny-all default) and the Uniform Resource Locator (URL) and XML guards (UrlValidator, XmlGuard). This page references their behavior; it does not re-document their signatures.

Code sample — Quick start

The secure posture is the default. You do not need code to get it:

<?php

declare(strict_types=1);

require_once __DIR__ . '/vendor/autoload.php';

use NextPDF\Html\DefaultExternalResourcePolicy;

// Out of the box: @font-face blocked, @import blocked, background-image
// blocked, SVG external refs blocked. A document that tries to fetch a
// remote resource gets a system-font fallback or an ignored rule — not an
// outbound request.
$policy = new DefaultExternalResourcePolicy();

Code sample — Production

Opening a capability is deliberate and narrow. If you must allow a webfont hosted on a content delivery network (CDN) and served over Hypertext Transfer Protocol Secure (HTTPS) in production, opt in explicitly and scope it:

<?php

declare(strict_types=1);

require_once __DIR__ . '/vendor/autoload.php';

use NextPDF\Html\DefaultExternalResourcePolicy;

// Explicit, scoped opt-in. The HTTPS scheme is required; size and glyph
// caps still apply; the URL still passes the SSRF guard before any fetch.
$policy = (new DefaultExternalResourcePolicy())
    ->withFontFaceAllowed(['https']);

Edge cases & gotchas

Unimplemented is not the same as safe-by-accident. Capabilities such as CSS background-image url() are not implemented, so they have no current attack surface. They are still documented as requiring a formal threat gate before any future implementation. Absence of code is the mitigation today, not a permanent guarantee.
Domain Name System (DNS) rebinding is a moving target. UrlValidator resolves the hostname and returns the resolved Internet Protocol (IP) address so the caller can pin the connection (CURLOPT_RESOLVE), which closes the time-of-check to time-of-use (TOCTOU) window between validation and fetch. This is a best-effort defense, not an absolute one. If outbound traffic passes through a permissive egress proxy, a request can still reach internal hosts that the library cannot see.
Permission bits are not access control. A document that “blocks copying” relies on reader cooperation, not enforcement. This is covered in the security model. It is called out here because it is a common threat-model misconception.
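The connection pinning mentioned in the DNS rebinding item can be sketched as follows. libcurl's CURLOPT_RESOLVE option takes entries of the form `host:port:address`, so the fetch connects to the address that was validated and does not resolve the hostname a second time. The helper name `pinnedResolveEntry()` is hypothetical, the default-port table covers only http and https, and the sketch has only been reasoned through for IPv4 addresses.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: builds the "host:port:address" entry that
// CURLOPT_RESOLVE expects, from a URL and an already-validated IP.
function pinnedResolveEntry(string $url, string $validatedIp): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException('URL must include a scheme and a host.');
    }

    // Fall back to the scheme's default port when the URL names none.
    $defaultPorts = ['http' => 80, 'https' => 443];
    $port = $parts['port'] ?? $defaultPorts[strtolower($parts['scheme'])] ?? null;
    if ($port === null) {
        throw new InvalidArgumentException('No port given and no default known for this scheme.');
    }

    // Only an IP literal can be pinned; a hostname here would defeat the purpose.
    if (filter_var($validatedIp, FILTER_VALIDATE_IP) === false) {
        throw new InvalidArgumentException('Pinned address must be an IP literal.');
    }

    return sprintf('%s:%d:%s', $parts['host'], $port, $validatedIp);
}

// Usage with the curl extension (shown as a comment, not executed here):
// $ch = curl_init($url);
// curl_setopt($ch, CURLOPT_RESOLVE, [pinnedResolveEntry($url, $validatedIp)]);
```

The pin only helps if the IP passed in is the same one the validator approved; resolving the hostname again before building the entry would reopen the window.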

Performance

The guards fail fast and bound work: the XML guard rejects a document type declaration (DOCTYPE) before parsing and caps input size; the image path enforces a megapixel and byte ceiling before decompression; the URL guard rejects by scheme and host before any socket opens. The secure default costs a rejected request, not a slow one.
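The fail-fast ordering for XML can be sketched as a pre-parse check that runs the cheapest test first and never hands rejected input to a parser. This is a minimal illustration, not the XmlGuard implementation: the function name `assertXmlIsSafeToParse()` and the 1 MiB cap are assumptions, and the substring test for DOCTYPE assumes UTF-8 input and deliberately rejects the token anywhere in the document, including inside comments.

```php
<?php

declare(strict_types=1);

// Hypothetical pre-parse check; the 1 MiB cap is an illustrative value.
function assertXmlIsSafeToParse(string $xml, int $maxBytes = 1_048_576): void
{
    // Cheapest check first: a byte-length comparison.
    if (strlen($xml) > $maxBytes) {
        throw new LengthException('XML input exceeds the size cap.');
    }

    // Reject any DOCTYPE before a parser sees the input. The match is
    // case-insensitive and position-independent, so it errs on the strict side.
    if (stripos($xml, '<!DOCTYPE') !== false) {
        throw new DomainException('DOCTYPE declarations are not accepted.');
    }

    // XML 1.0 forbids control characters other than tab, LF, and CR.
    if (preg_match('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $xml) === 1) {
        throw new DomainException('Input contains control characters forbidden in XML 1.0.');
    }
}
```

A caller would run this check and only then pass the string to a parser, with network access disabled (for example `LIBXML_NONET`) as a second layer.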

Security notes

The considered attack classes and their in-code mitigations, with Common Weakness Enumeration (CWE) and OWASP references where they apply:

Threat class (CWE / OWASP)	Vector in a PDF engine	In-code guard
Server-side request forgery (SSRF) (OWASP Top 10 2025; `owasp_top10_2025#x3.x1.p26`)	`@font-face`/`@import`/`url()` pointing at `169.254.169.254` or an internal host; time-stamp authority (TSA), Online Certificate Status Protocol (OCSP), and certificate revocation list (CRL) fetchers	`UrlValidator::validateExternalUrl()` blocks private, reserved, loopback, and link-local ranges and cloud-metadata endpoints; rejects dangerous schemes; resolves DNS; and returns the IP for connection pinning
XML external entity (XXE) (`cwe_top25_2025#x28.x2.p42`)	External entities or DOCTYPE in an embedded XML invoice or Extensible Metadata Platform (XMP) packet	`XmlGuard::loadXml()` enforces `LIBXML_NONET`, rejects any DOCTYPE declaration outright, rejects XML 1.0 forbidden control chars, and applies an input-size cap
Decompression bomb	1×1 image masking a 100 MP payload; oversized Web Open Font Format 2 (WOFF2)	Image path enforces a megapixel ceiling and a byte cap before decompression; font path caps file size and glyph count
Path traversal	`file:///etc/passwd` via a font or SVG `src`	External resources deny-all by default; local file paths resolved via `realpath()` against a directory allowlist when explicitly enabled
Content injection	Crafted string that breaks out of a PDF operator; `data:`/`javascript:` href	PDF string escaping on emission; scheme allowlist and href sanitization on annotations
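To make the SSRF row concrete, the following sketch shows the general shape of a scheme-and-range check using PHP's built-in `filter_var()` flags. It is not the `UrlValidator` implementation: the function names `isPublicIp()` and `isFetchAllowed()` are hypothetical, the resolved IP is passed in so the example needs no DNS lookup, and, depending on the PHP version, these flags may not cover every non-public range, so a production guard would likely need an explicit blocklist on top.

```php
<?php

declare(strict_types=1);

// Hypothetical check: true only for IPs outside the private and reserved
// ranges that filter_var() knows about (loopback and link-local included).
function isPublicIp(string $ip): bool
{
    return filter_var(
        $ip,
        FILTER_VALIDATE_IP,
        FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE
    ) !== false;
}

// Hypothetical gate: the scheme must be on the allowlist and the address
// the hostname resolved to must be public.
function isFetchAllowed(string $url, string $resolvedIp, array $allowedSchemes = ['https']): bool
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), $allowedSchemes, true)) {
        return false;
    }

    return isPublicIp($resolvedIp);
}
```

Checking the resolved address, not the hostname string, is what catches a public-looking name that points at `169.254.169.254` or another internal address.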

The defaults add up to a deny-all external-resource posture: @font-face, @import, background-image, and SVG external references stay off until you opt in per scheme, as described in the security property coverage matrix maintained alongside the code.

This page documents considered threats. It is not a penetration-test report and does not claim that the listed mitigations are complete or that no other weakness class applies.

Conformance

This is not a conformance profile. The threat model is informed by the OWASP threat-modeling process and the CWE Top 25 weakness taxonomy (cwe_top25_2025#x28.x2.p42); it does not claim conformance to any security certification scheme. Independent assessment belongs in an audit, not in this document.