
Content: textual + structured content model

The Content module builds text-showing operators, text-state operators, text shadows, document-level JavaScript, and marked-content property dictionaries. It sits between layout and the content stream.

composer require nextpdf/core:^3

Content provides the primitives that turn resolved text into Portable Document Format (PDF) operators. TextRenderer is the central component. It builds the text-showing operator for a string and the text-state operators that precede it. Under International Organization for Standardization (ISO) 32000-2 §9, the Tj operator paints the glyphs of a string with the current font and text-related graphics parameters. TextRenderer chooses either a single show operator or a positioned TJ array based on the active TypographyMode. It applies kerning adjustments when the mode uses TJ arrays.
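The difference between a single Tj and a kerned TJ array can be sketched in plain PHP. This is an illustration only, not the library's implementation: sketchEscape(), sketchShowOperator(), and the kerning table are hypothetical names, and the code assumes a single-byte font encoding.

```php
<?php
declare(strict_types=1);

/**
 * Minimal escape for the sketch only (hypothetical helper, not PdfStringEscaper).
 */
function sketchEscape(string $s): string
{
    return strtr($s, ['\\' => '\\\\', '(' => '\\(', ')' => '\\)']);
}

/**
 * Builds either "(text) Tj" or a kerned "[(A) 80 (V)] TJ" array.
 *
 * @param array<string, int> $kernPairs two-byte pair => adjustment in 1/1000
 *        text-space units, font convention (negative = tighter). Assumed data.
 */
function sketchShowOperator(string $text, array $kernPairs, bool $useTjArray): string
{
    if ($text === '') {
        return ''; // nothing to paint, so no operator at all
    }
    if (!$useTjArray) {
        return '(' . sketchEscape($text) . ') Tj';
    }

    $parts = [];
    $run = '';
    $length = strlen($text); // byte-wise: assumes one byte per glyph
    for ($i = 0; $i < $length; $i++) {
        $run .= $text[$i];
        $pair = substr($text, $i, 2);
        if (strlen($pair) === 2 && isset($kernPairs[$pair])) {
            $parts[] = '(' . sketchEscape($run) . ')';
            // TJ subtracts the number from the pen position, so the sign flips.
            $parts[] = (string) -$kernPairs[$pair];
            $run = '';
        }
    }
    if ($run !== '') {
        $parts[] = '(' . sketchEscape($run) . ')';
    }

    return '[' . implode(' ', $parts) . '] TJ';
}

$plain  = sketchShowOperator('AVA', [], false);
$kerned = sketchShowOperator('AVA', ['AV' => -80, 'VA' => -80], true);
```

With the assumed kerning table, the second call splits the string at each kerned pair and writes the adjustment between the segments.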

Text state is modeled as a complete set. setTextRenderingMode() takes a TextRenderingMode enum. Its eight cases map one-to-one to the ISO 32000-2 text rendering modes: fill, stroke, fill-then-stroke, invisible, and the four clip variants (Table 104). The renderer also controls stroke width, character spacing, word spacing, horizontal stretching, text rise, right-to-left direction, and an optional Hyphenator. Calling buildTextStateOperators() emits the accumulated state as a single operator block.
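As a rough picture of what a text-state block looks like, the sketch below maps an enum to the Tr operand and emits the standard PDF operators for stroke width (w), character spacing (Tc), word spacing (Tw), horizontal scaling (Tz), and rise (Ts). SketchRenderingMode, sketchNum(), and sketchTextState() are hypothetical stand-ins. The order and selection of operators that buildTextStateOperators() actually emits is not stated here, so treat this as an assumption.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for the library's TextRenderingMode enum. */
enum SketchRenderingMode: int
{
    case Fill = 0;
    case Stroke = 1;
    case FillStroke = 2;
    case Invisible = 3;
    case FillClip = 4;
    case StrokeClip = 5;
    case FillStrokeClip = 6;
    case Clip = 7;
}

/** Formats a number without exponent notation or trailing zeros. */
function sketchNum(float $n): string
{
    $s = rtrim(rtrim(number_format($n, 4, '.', ''), '0'), '.');

    return ($s === '' || $s === '-0') ? '0' : $s;
}

/** Emits one operator per line; the order is an assumption of this sketch. */
function sketchTextState(
    SketchRenderingMode $mode,
    float $strokeWidth,
    float $charSpacing,
    float $wordSpacing,
    float $stretchPercent,
    float $rise,
): string {
    return implode("\n", [
        $mode->value . ' Tr',               // text rendering mode
        sketchNum($strokeWidth) . ' w',     // line width used when stroking glyphs
        sketchNum($charSpacing) . ' Tc',    // character spacing
        sketchNum($wordSpacing) . ' Tw',    // word spacing
        sketchNum($stretchPercent) . ' Tz', // horizontal scaling, in percent
        sketchNum($rise) . ' Ts',           // text rise
    ]) . "\n";
}

$block = sketchTextState(SketchRenderingMode::FillStroke, 0.3, 0.0, 0.5, 100.0, 0.0);
```

Backing the enum with the integer operand keeps the mapping to Tr trivial: the case value is the number written to the stream.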

TextShadow is a value object: color, X and Y offsets in user units, and opacity. The renderer uses it to emit a second draw pass at the offset. The default offsets are a subtle 0.5/−0.5 with 0.5 opacity, similar to a soft shadow in Cascading Style Sheets (CSS).
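One way to picture the two draw passes is the sketch below. It is an assumption about the general shape, not the renderer's actual output. The function names, the GS1 resource name (an ExtGState that would carry the fill opacity), and the use of Td for the offset are all hypothetical.

```php
<?php
declare(strict_types=1);

/** Formats a coordinate without locale effects or trailing zeros. */
function sketchCoord(float $n): string
{
    return rtrim(rtrim(sprintf('%.4F', $n), '0'), '.');
}

/** Emits one BT ... ET text object at the given position. */
function sketchTextObject(string $showOp, string $fontKey, float $size, float $x, float $y): string
{
    return implode("\n", [
        'BT',
        "/{$fontKey} " . sketchCoord($size) . ' Tf',
        sketchCoord($x) . ' ' . sketchCoord($y) . ' Td',
        $showOp,
        'ET',
    ]);
}

/**
 * Shadow pass first (offset, black, reduced opacity), then the visible text.
 * q/Q sit outside the text objects because they are not allowed inside BT ... ET.
 */
function sketchShadowedText(
    string $showOp,
    float $x,
    float $y,
    float $offsetX = 0.5,
    float $offsetY = -0.5,
    string $fontKey = 'F1',
    float $size = 12.0,
    string $shadowGs = 'GS1', // hypothetical ExtGState name holding /ca
): string {
    return implode("\n", [
        'q',               // save graphics state
        "/{$shadowGs} gs", // apply the shadow opacity
        '0 0 0 rg',        // shadow fill color (black in this sketch)
        sketchTextObject($showOp, $fontKey, $size, $x + $offsetX, $y + $offsetY),
        'Q',               // restore color and opacity
        sketchTextObject($showOp, $fontKey, $size, $x, $y),
    ]) . "\n";
}

$ops = sketchShadowedText('(Hi) Tj', 72.0, 700.0);
```

Painting the shadow first means the visible text is drawn on top of it, so the offset copy only shows where it extends past the glyphs.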

JavaScriptManager owns document-level scripting. includeJs() registers a document script. addJsObject() registers a named script object. writeJavaScript() / writeOpenAction() serialize scripts into the catalog and the OpenAction. The manager validates and PDF-string-encodes every script body before emission.
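For orientation, a document-level script in PDF is a JavaScript action dictionary referenced from a name tree under the catalog. The sketch below builds both pieces as strings. The function names and the validation rule (non-empty, no NUL bytes) are assumptions for illustration; the manager's real validation rules are not described here.

```php
<?php
declare(strict_types=1);

/** Escapes a value for use inside a PDF literal string (sketch only). */
function sketchPdfString(string $s): string
{
    return strtr($s, [
        '\\' => '\\\\',
        '('  => '\\(',
        ')'  => '\\)',
        "\r" => '\\r',
        "\n" => '\\n',
    ]);
}

/** Builds a JavaScript action dictionary; the validation rule is assumed. */
function sketchJsAction(string $script): string
{
    if ($script === '' || str_contains($script, "\0")) {
        throw new InvalidArgumentException('Script body must be non-empty and NUL-free.');
    }

    return '<< /S /JavaScript /JS (' . sketchPdfString($script) . ') >>';
}

/**
 * Builds the leaf of a JavaScript name tree.
 *
 * @param array<string, string> $refsByName script name => indirect reference, e.g. "12 0 R"
 */
function sketchJsNameTree(array $refsByName): string
{
    ksort($refsByName, SORT_STRING); // name-tree keys must be sorted
    $pairs = [];
    foreach ($refsByName as $name => $ref) {
        $pairs[] = '(' . sketchPdfString((string) $name) . ') ' . $ref;
    }

    return '<< /Names [' . implode(' ', $pairs) . '] >>';
}

$action = sketchJsAction('app.alert("Hi");');
$tree   = sketchJsNameTree(['init' => '12 0 R', 'calc' => '11 0 R']);
```

Rejecting a bad body when it is registered, rather than when the file is written, keeps the failure close to the code that supplied the script.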

PropertiesRegistry is the marked-content property store. register() returns a stable tag index for a property dictionary. registerOcg() / registerOcgs() bind optional-content groups (OCGs) by object number. writeProperties() serializes the registry into the page resource dictionary. The ContentStream module reads this data when it opens a marked sequence with a property list.
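The registry behavior described above can be sketched as a small map from a serialized dictionary to a stable index. SketchPropertiesRegistry and the MC tag prefix are hypothetical; the real class may key and name its entries differently.

```php
<?php
declare(strict_types=1);

/** Minimal stand-in for a marked-content property store (sketch only). */
final class SketchPropertiesRegistry
{
    /** @var array<string, int> serialized dictionary => tag index */
    private array $indexByDict = [];

    /** Returns the same index whenever the same dictionary is registered again. */
    public function register(string $dict): int
    {
        return $this->indexByDict[$dict] ??= count($this->indexByDict);
    }

    /** Serializes the /Properties entry for a page resource dictionary. */
    public function writeProperties(): string
    {
        $entries = [];
        foreach ($this->indexByDict as $dict => $index) {
            $entries[] = "/MC{$index} {$dict}";
        }

        return '/Properties << ' . implode(' ', $entries) . ' >>';
    }
}

$registry = new SketchPropertiesRegistry();
$first  = $registry->register('<< /Lang (en-US) >>');
$second = $registry->register('<< /ActualText (x) >>');
$again  = $registry->register('<< /Lang (en-US) >>');

// A content stream would then open a marked sequence by name:
$open = "/Span /MC{$first} BDC";
```

Reusing one index per identical dictionary keeps the resource dictionary small when many spans share the same properties.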

Two image loaders live in this module because they handle PDF-native, pass-through formats. JBig2Loader and JpxLoader parse JBIG2 and JPEG 2000 segment structures and return ImageData without rasterizing pixels. The encoded bytes pass to the viewer unchanged. When a JBIG2 source carries a separate globals segment, JBig2Loader embeds it through a /JBIG2Globals stream reference on the image XObject; the in-stream (inline) form continues to round-trip as before. This is structural wiring only: the globals bytes pass through without being decoded or rasterized.
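To show what structural parsing without decoding can look like, the sketch below walks the top-level boxes of a JP2 file: each box starts with a 4-byte big-endian length and a 4-byte type. sketchParseBoxes() is a hypothetical function, not JpxLoader::parseBoxes(), whose return shape is not documented here.

```php
<?php
declare(strict_types=1);

/**
 * Lists top-level JP2 boxes without touching the codestream.
 *
 * @return list<array{type: string, offset: int, length: int}>
 */
function sketchParseBoxes(string $bytes): array
{
    $boxes = [];
    $offset = 0;
    $total = strlen($bytes);

    while ($offset < $total) {
        if ($total - $offset < 8) {
            throw new UnexpectedValueException("Truncated box header at offset {$offset}.");
        }
        $header = unpack('Nlength/a4type', $bytes, $offset);
        $length = $header['length'];
        $headerSize = 8;

        if ($length === 1) {
            // LBox = 1: the real length follows as a 64-bit big-endian XLBox.
            if ($total - $offset < 16) {
                throw new UnexpectedValueException("Truncated extended header at offset {$offset}.");
            }
            $length = unpack('J', $bytes, $offset + 8)[1];
            $headerSize = 16;
        } elseif ($length === 0) {
            // LBox = 0: the box runs to the end of the data.
            $length = $total - $offset;
        }

        if ($length < $headerSize || $length > $total - $offset) {
            throw new UnexpectedValueException("Invalid box length at offset {$offset}.");
        }

        $boxes[] = ['type' => $header['type'], 'offset' => $offset, 'length' => $length];
        $offset += $length; // skip the payload: nothing is decoded
    }

    return $boxes;
}
```

The loop only reads headers and skips payloads, which is why a corrupt length shows up as a parse error instead of a damaged image.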

| Class | Key methods | Role |
| --- | --- | --- |
| TextRenderer | buildTextShowOperator(), buildTextStateOperators(), setTextRenderingMode(), setTextStrokeWidth(), setTextShadow(), setFontSpacing(), setWordSpacing(), setFontStretching(), setTextRise(), setRTL(), setHyphenation() | Text show + text-state operator builder |
| TextRenderingMode (enum) | Fill, Stroke, FillStroke, Invisible, FillClip, StrokeClip, FillStrokeClip, Clip | ISO 32000-2 text rendering modes |
| TextShadow | __construct(Color, offsetX, offsetY, opacity) | Offset draw-pass value object |
| JavaScriptManager | includeJs(), addJsObject(), hasJavaScript(), writeJavaScript(), writeOpenAction() | Document-level JavaScript catalog wiring |
| PropertiesRegistry | register(), getTagIndex(), registerOcg(), registerOcgs(), getAll(), writeProperties() | Marked-content + OCG property store |
| JBig2Loader | load(), loadFromString(), parseSegments() | JBIG2 pass-through decoder |
| JpxLoader | load(), loadFromString(), parseBoxes() | JPEG 2000 pass-through decoder |

Run composer docs:generate-api-php -- --module=Content for the full PHPDoc table.

Source: examples/28-text-rendering.php.

<?php
declare(strict_types=1);

require_once __DIR__ . '/../vendor/autoload.php';

use NextPDF\Content\TextRenderer;
use NextPDF\Content\TextRenderingMode;

$renderer = new TextRenderer();

// Fill and stroke the glyphs with a thin outline and slightly wider word gaps.
$renderer
    ->setTextRenderingMode(TextRenderingMode::FillStroke)
    ->setTextStrokeWidth(0.3)
    ->setWordSpacing(0.5);

// Emit the accumulated text state as one operator block.
$stateOps = $renderer->buildTextStateOperators();

This example adds a soft shadow and keeps left-to-right direction, then builds the show operator with a caller-supplied escape function. The callback wraps PdfStringEscaper, the canonical escape boundary from architecture decision record ADR-015.

<?php
declare(strict_types=1);

require_once __DIR__ . '/../vendor/autoload.php';

use NextPDF\Content\TextRenderer;
use NextPDF\Content\TextShadow;
use NextPDF\Graphics\Color;
use NextPDF\Support\PdfStringEscaper;

$renderer = new TextRenderer();
$renderer
    ->setTextShadow(new TextShadow(Color::rgb(0, 0, 0), 0.4, -0.4, 0.45))
    ->setRTL(false);

// $fontMetrics is not defined in this snippet; it is assumed to hold the
// metrics of the font registered as F1, resolved earlier in the pipeline.
$showOp = $renderer->buildTextShowOperator(
    text: 'Quarterly report',
    fontKey: 'F1',
    metrics: $fontMetrics,
    escapeSegment: static fn (string $s): string => PdfStringEscaper::escapeLiteral($s),
);

// State operators must precede the show operator in the content stream.
$pageContent = $renderer->buildTextStateOperators() . $showOp;

  • buildTextShowOperator() returns an empty string for empty input. Do not emit an empty Tj; guard upstream if your layout can produce blank runs.
  • The escape callback is mandatory and owns string safety. Pass the canonical PdfStringEscaper::escapeLiteral() from ADR-015. A partial escaper produces a syntactically invalid literal string.
  • Even when your layout uses a top-left origin, TextShadow::offsetY treats negative values as down. A positive Y pushes the shadow up, which is rarely intended.
  • JavaScriptManager validates script input. Invalid script bodies are rejected at registration, not silently dropped at write time.
  • JBig2Loader and JpxLoader never rasterize. They validate and pass the encoded bytes through. A corrupt segment surfaces as a parse error, not a blank image.
  • PropertiesRegistry::register() is idempotent per dictionary; identical property dictionaries reuse one tag index.
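The escape-callback hazard above is easy to reproduce. The sketch below contrasts a partial escaper with a fuller one. Both functions are hypothetical illustrations and are not the ADR-015 PdfStringEscaper, which may handle more cases.

```php
<?php
declare(strict_types=1);

/** Partial escaper: handles parentheses but forgets the backslash itself. */
function sketchEscapePartial(string $s): string
{
    return strtr($s, ['(' => '\\(', ')' => '\\)']);
}

/** Fuller escaper: backslash, parentheses, and raw line endings. */
function sketchEscapeLiteral(string $s): string
{
    return strtr($s, [
        '\\' => '\\\\',
        '('  => '\\(',
        ')'  => '\\)',
        "\r" => '\\r',
        "\n" => '\\n',
    ]);
}

$input = 'C:\\'; // three bytes: C, colon, backslash

// The trailing backslash now escapes the closing parenthesis,
// so the literal string never terminates.
$broken = '(' . sketchEscapePartial($input) . ') Tj';

// The backslash is doubled, so the closing parenthesis ends the string.
$safe = '(' . sketchEscapeLiteral($input) . ') Tj';
```

In the broken form, a PDF parser reads the final `\)` as an escaped parenthesis and keeps consuming the rest of the stream as string data.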

Operator construction is O(n) in string length, plus an O(n) kerning pass when the typography mode uses TJ arrays. There is no layout or shaping cost here; that work stays in the Typography and Layout modules. JavaScript and property serialization are O(entries). The pass-through image loaders use O(bytes) parsing with zero decode cost. This is their main advantage for scanned-document workloads. The performance_budget for the reference workload is 1500 ms of wall-clock time and 64 MB of peak memory.

JavaScriptManager accepts script bodies that may come from untrusted templates. It validates and PDF-string-encodes every body, but document JavaScript remains an active-content surface. Disable it for untrusted output, or strip it with the sanitization path described in /modules/core/security/. JBig2Loader and JpxLoader parse untrusted segment structures: bound input size and parse time, and run extraction in a constrained worker when the source is user-supplied. The text escape boundary is the caller-supplied callback. Always pass the canonical escaper so control bytes cannot break out of a literal string.
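Bounding input size and parse time can be done by the caller before and during parsing. The sketch below is one possible shape, with hypothetical helper names and caller-chosen limits. It does not reflect any guard built into the loaders.

```php
<?php
declare(strict_types=1);

/** Rejects oversized input before any parsing starts. */
function sketchAssertSize(string $bytes, int $maxBytes): void
{
    $size = strlen($bytes);
    if ($size > $maxBytes) {
        throw new LengthException("Input is {$size} bytes; limit is {$maxBytes}.");
    }
}

/**
 * Returns a check to call inside parse loops; it throws once the budget is spent.
 */
function sketchDeadline(float $budgetSeconds): Closure
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    $budgetNs = (int) round($budgetSeconds * 1_000_000_000);

    return static function () use ($start, $budgetNs): void {
        if (hrtime(true) - $start >= $budgetNs) {
            throw new RuntimeException('Parse time budget exceeded.');
        }
    };
}

// Usage: check the size once, then call $tick() once per parsed segment.
$tick = sketchDeadline(2.0);
sketchAssertSize('example bytes', 16 * 1024 * 1024);
$tick();
```

A cooperative deadline like this only fires where the parse loop calls it, so a constrained worker process with its own hard limits remains the stronger control for user-supplied files.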

The module emits text-showing and text-state operators that are consistent with the ISO 32000-2 §9 text model. This includes the Tj operator semantics and the Table 104 rendering modes mirrored by the TextRenderingMode enum. These are implementation facts: src/Content/TextRenderer.php and the TextRenderingMode enum produce the operator shapes, and tests/Unit/Content/TextRenderer*, JavaScriptManagerIsoTest, and PropertiesRegistryTest exercise them. They do not assert end-to-end PDF 2.0 conformance. The string-escape contract follows ADR-015 and ISO 32000-2 §7.3.4.2. The JBIG2 and JPEG 2000 pass-through paths preserve encoded streams unchanged. A separate JBIG2 globals segment is embedded as a /JBIG2Globals stream reference on the image XObject; this is verified as structural wiring, not as a decode-fidelity claim. Document-level conformance is validated by the oracle and golden suites in /modules/core/conformance/.