Custom layout engines and layout-time text interception
At a glance
NextPDF does not expose a pluggable layout-engine interface. Use the public layout-extension contract, TextPreprocessorInterface, to hook text at layout time. Content lifecycle events let you observe what layout produces.
Install
composer require nextpdf/core:^3
Conceptual overview
The layout pipeline is internal. It covers glyph layout, font subsetting, ToUnicode CMap output, and the structure tree. NextPDF does not let you replace it. Stable byte output and tagged-PDF conformance depend on one controlled build.
NextPDF does expose the point before layout: TextPreprocessorInterface. An implementation receives raw text and returns a segmented result before that text enters glyph layout, font subsetting, the ToUnicode CMap, or the structure tree. Use this supported path to change text content without touching the layout engine.
The source PHPDoc sets a hard rule: an implementation must not change how layout works. It must not add layout-affecting characters such as line feed, carriage return, or tab, and it must preserve logical reading order. The preprocessor states a content swap; it does not make layout choices. Honor this rule, or stable output and accessibility break.
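The rule above can be checked in your own tests. The sketch below is a minimal, hypothetical helper (`addsLayoutCharacters` is not part of NextPDF) that compares how many line feeds, carriage returns, and tabs appear before and after preprocessing. It assumes that a simple count comparison is enough for your case; it will not notice a control character that was removed in one place and added in another.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of NextPDF: reports whether $after contains
 * more line feeds, carriage returns, or tabs than $before did.
 */
function addsLayoutCharacters(string $before, string $after): bool
{
    foreach (["\n", "\r", "\t"] as $char) {
        if (\substr_count($after, $char) > \substr_count($before, $char)) {
            return true;
        }
    }

    return false;
}

// Masking a token leaves the control-character counts unchanged.
var_dump(addsLayoutCharacters("id: SECRET", "id: ******"));  // bool(false)

// Replacing a token with a line break would violate the rule.
var_dump(addsLayoutCharacters("id: SECRET", "id:\nhidden")); // bool(true)
```

A check like this fits naturally in a unit test that feeds sample input through `process()` and inspects the returned text.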
To observe the result of layout, not change it, use the content lifecycle events in Action triggers and event listeners. ContentRenderedEvent fires after content is drawn to a page. FontLoadedEvent fires once per font family and style.
API surface
NextPDF\Contracts\TextPreprocessorInterface (stable, since 1.9.0):
| Method | Returns | Purpose |
|---|---|---|
| process(string $text) | TextPreprocessResult | Transform raw text before the render pipeline, and return a segmented result with redaction metadata. |
The returned NextPDF\Contracts\TextPreprocessResult is a frozen value object. Its constructor signature and public properties are stable and do not change in a minor or patch release. New methods may be added.
Code sample — Quick start
The small preprocessor below masks a fixed token. It adds no layout-affecting characters and keeps reading order.
<?php
declare(strict_types=1);

use NextPDF\Contracts\TextPreprocessorInterface;
use NextPDF\Contracts\TextPreprocessResult;
use NextPDF\Contracts\TextSegment;

final class TokenMaskingPreprocessor implements TextPreprocessorInterface
{
    public function process(string $text): TextPreprocessResult
    {
        // Swap the fixed token for bullets; no line feed, carriage return, or tab is added.
        $masked = \str_replace('SECRET-TOKEN', '••••••••••••', $text);

        // One segment, flagged as redacted only when the text actually changed.
        return new TextPreprocessResult([
            new TextSegment($masked, redacted: $masked !== $text),
        ]);
    }
}
Code sample — Production
A production preprocessor keeps matching rules in one place. It fails closed on a bad pattern and never logs the original text.
<?php
declare(strict_types=1);

use NextPDF\Contracts\TextPreprocessorInterface;
use NextPDF\Contracts\TextPreprocessResult;
use NextPDF\Contracts\TextSegment;
use Psr\Log\LoggerInterface;

final class PatternRedactionPreprocessor implements TextPreprocessorInterface
{
    /**
     * @param non-empty-string $pattern A valid PCRE pattern for sensitive spans
     */
    public function __construct(
        private readonly string $pattern,
        private readonly LoggerInterface $logger,
    ) {}

    public function process(string $text): TextPreprocessResult
    {
        // preg_replace() returns null when the pattern fails at runtime.
        $result = \preg_replace($this->pattern, '[REDACTED]', $text);

        if ($result === null) {
            // Fail closed: never emit unredacted text on a pattern error.
            // The log message deliberately omits $text.
            $this->logger->error('Redaction pattern failed; substituting empty text');

            return new TextPreprocessResult([new TextSegment('', redacted: true)]);
        }

        return new TextPreprocessResult([
            new TextSegment($result, redacted: $result !== $text),
        ]);
    }
}
Edge cases & gotchas
- No layout replacement. You cannot replace box layout, line breaking, or pagination through this contract. Plugging in a third-party layout engine is out of scope by design.
- Rule enforcement. If you add \n, \r, or \t in process(), you corrupt layout and break stable output. The engine trusts this rule; it does not re-check your output for layout-affecting characters.
- Reading order. If you reorder segments, you break tagged-PDF reading order and PDF/UA conformance.
- One responsibility. The preprocessor states a content swap. Use lifecycle events to watch, and do not push side effects through process().
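To make the reading-order point concrete, here is a minimal sketch of splitting text into redacted and unredacted parts without reordering them. `splitInReadingOrder` is a hypothetical helper, not part of NextPDF, and it assumes the pattern wraps the whole match in exactly one capturing group. Mapping each pair to one TextSegment, in the same order, is likewise an assumption based on the constructor calls in the samples above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of NextPDF: splits $text on $pattern and
 * returns [text, redacted] pairs in their original left-to-right order.
 *
 * @param non-empty-string $pattern PCRE pattern with one capturing group around the whole match
 * @return list<array{0: string, 1: bool}>
 */
function splitInReadingOrder(string $pattern, string $text, string $mask): array
{
    $parts = \preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // Fail closed: one empty, redacted pair instead of the original text.
        return [['', true]];
    }

    $pairs = [];
    foreach ($parts as $index => $part) {
        // With a single capturing group, odd indexes hold the matched spans.
        $isMatch = $index % 2 === 1;
        if ($part === '' && !$isMatch) {
            continue; // skip empty gaps at the edges or between adjacent matches
        }
        $pairs[] = $isMatch ? [$mask, true] : [$part, false];
    }

    return $pairs;
}

$pairs = splitInReadingOrder('/(\d{4})/', 'PIN 1234 and 5678', '[REDACTED]');

// Joining the parts shows that the sentence order is unchanged.
echo \implode('', \array_column($pairs, 0)), "\n"; // PIN [REDACTED] and [REDACTED]
```

Because the pairs are built in a single forward pass, no sorting or regrouping step exists that could disturb the order.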
Performance
process() runs once per text run on the layout hot path. Keep memory use low. Build and validate patterns once in the constructor, not on each call. Content lifecycle events cost nothing when no listener is bound.
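One way to keep pattern work off the hot path is to validate the pattern when the preprocessor is constructed. The sketch below uses a hypothetical `ValidatedPattern` value object (not part of NextPDF) and assumes PHP 8.1 or later for readonly properties. It relies on preg_match() returning false when a pattern cannot be compiled.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical value object, not part of NextPDF: checks a PCRE pattern once
 * at construction so that process() never meets an uncompilable pattern.
 */
final class ValidatedPattern
{
    public readonly string $pattern;

    public function __construct(string $pattern)
    {
        // preg_match() returns false when the pattern cannot be compiled;
        // @ suppresses the accompanying warning so we can throw instead.
        if ($pattern === '' || @\preg_match($pattern, '') === false) {
            throw new \InvalidArgumentException('Invalid redaction pattern');
        }

        $this->pattern = $pattern;
    }
}

// A well-formed pattern is accepted once and can be reused on every call.
$cardNumbers = new ValidatedPattern('/\b\d{16}\b/');

// A malformed pattern is rejected before any text is processed.
try {
    new ValidatedPattern('/unclosed(');
} catch (\InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Invalid redaction pattern
}
```

PHP's PCRE extension caches compiled patterns internally, so reusing the same pattern string across calls is cheap. Runtime failures such as hitting the backtrack limit can still occur, so the null check in process() remains necessary.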
Security notes
Use TextPreprocessorInterface to remove sensitive content before it reaches the content stream, font subsets, or metadata. Because it runs before subsetting and the ToUnicode CMap, redacted glyphs never enter the file. Treat a preprocessor failure as fail-closed, and emit empty or masked text rather than the original.
Conformance
This page makes no normative signing or archival claims. The reading-order rule aligns the contract with tagged-PDF needs. The accessibility reference covers tag-level conformance.
Commercial context
NextPDF Pro provides production text-preprocessing strategies, including personally identifiable information (PII) redaction tuned for common document types. In Core, you implement TextPreprocessorInterface yourself, or you use a verified paid-edition build through the same public contract.
See also
Related contracts and modules
- Typography contracts reference — where TextPreprocessorInterface and TextPreprocessResult are catalogued.
- Streaming contracts reference — the experimental CursorInterface and StreamingWriterInterface contracts, which have a shipped engine implementation.
- Action triggers and event listeners — the lifecycle events used to observe layout output.
- SPI stability rules — the frozen value-object promise behind TextPreprocessResult.
- Extension authoring overview — the full public service provider interface (SPI) surface.
The glossary defines text preprocessor and extension point; see the published glossary for each canonical definition.