The pipeline model
Spec: ISO 32000-2, §7.5 · Evidence: Code-backed
At a glance
A NextPDF document is not produced in one opaque step. It moves through a small number of explicit stages: a facade that records intent, a content layer that turns intent into a model, and a writer that serializes that model into a conforming PDF. This page explains that shape and why it is the right shape for the engine.
Why this matters
The PDF file format is itself a layered structure (a header, a body of objects, a cross-reference table, and a trailer), and a writer has to assemble all of it consistently. If the engine that builds it is a single tangled procedure, every change risks every output. The only way to gain confidence is then to render whole documents and inspect them by eye, which is slow, late, and unconvincing.
An explicit pipeline turns that around. Each stage has one job and a typed boundary, so you can reason about a change and test it at the stage it touches, not only at the end of the file. The architecture is a testability and extensibility decision before it is anything else.
The short version
- The public entry point is a Document facade. It is a fluent, use-once, worker-safe builder that records what you want, not how it is serialized.
- The facade delegates to roughly two dozen focused concern traits (text output, drawing, pages, security, navigation, and so on) — one responsibility each, not one giant class.
- Content arrives by one of two paths: direct drawing (graphics primitives) or the HTML/CSS engine. Both produce the same internal document model.
- A dedicated PDF writer serializes that model, choosing a PDF 1.4 / 1.7 / 2.0 strategy. Producing valid file structure lives here and nowhere else.
- Long-lived state (font and image registries) is process-scoped and shared; per-request state (the document) is created fresh and never reused. The boundary is explicit, which is what makes worker runtimes safe.
How NextPDF approaches it
The cleanest way to see the model is to follow a document from call to bytes.
- Document facade: fluent, use-once builder; records intent via concern traits.
- Content production: direct drawing or the HTML/CSS engine; both build one document model.
- Document model: accumulated pages, content, and resources held as typed state.
- PDF writer: serializes the model; selects a PDF 1.4 / 1.7 / 2.0 strategy.
- Conforming PDF: header, object body, cross-reference table, trailer.
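
The stages above can be sketched in a few lines of PHP. This is a minimal illustration under stated assumptions, not NextPDF's code: SketchFacade, SketchModel, and SketchWriter are hypothetical names, and the output format is a placeholder, not PDF syntax. What it shows is the typed boundary: the facade only fills the model, and the writer only reads it.

```php
<?php
declare(strict_types=1);

// Hypothetical names throughout; none of these are NextPDF classes.

/** Typed document model: the boundary between content production and the writer. */
final class SketchModel
{
    /** @var list<list<string>> one list of content marks per page */
    public array $pages = [];
}

/** Facade stage: records intent and never produces output bytes. */
final class SketchFacade
{
    private SketchModel $model;

    public function __construct()
    {
        $this->model = new SketchModel();
    }

    public function addPage(): static
    {
        $this->model->pages[] = [];
        return $this;
    }

    public function text(string $text): static
    {
        if ($this->model->pages === []) {
            throw new LogicException('Add a page before adding content.');
        }
        // Append a mark to the most recently added page.
        $this->model->pages[array_key_last($this->model->pages)][] = 'text:' . $text;
        return $this;
    }

    public function model(): SketchModel
    {
        return $this->model;
    }
}

/** Writer stage: the only code that knows the output layout. */
final class SketchWriter
{
    public function serialize(SketchModel $model): string
    {
        $out = "SKETCH-FORMAT\n"; // placeholder header, not real PDF syntax
        foreach ($model->pages as $i => $marks) {
            $out .= 'page ' . ($i + 1) . ': ' . implode(', ', $marks) . "\n";
        }
        return $out;
    }
}

$facade = (new SketchFacade())->addPage()->text('Summary');
$bytes  = (new SketchWriter())->serialize($facade->model());
```

Because the writer accepts only a SketchModel, a test can hand it a model built by hand and never touch the facade, which is the kind of stage-level testing the pipeline is meant to allow.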
Three design choices make this more than a diagram.
The facade is composed, not monolithic. Document does not implement
every feature itself; it delegates each area to a dedicated concern trait —
text output, drawing, pages, security, typography, navigation, transactions,
and so on. A new document method belongs in the trait that owns its area,
not on the facade itself. The class you call stays small, and the
responsibilities stay separated.
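
A minimal sketch of that composition, assuming two invented concerns. The trait and method names below are illustrative and are not the traits in src/Core/Concerns/. The point is only the shape: each trait owns its state and methods, and the facade class body contains nothing but use statements.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch; trait and method names are illustrative, not NextPDF's.

/** Pages concern: owns page state and page methods. */
trait SketchHasPages
{
    private int $pageCount = 0;

    public function addPage(): static
    {
        $this->pageCount++;
        return $this;
    }

    public function pageCount(): int
    {
        return $this->pageCount;
    }
}

/** Metadata concern: owns document-level metadata. */
trait SketchHasMetadata
{
    private string $title = '';

    public function setTitle(string $title): static
    {
        $this->title = $title;
        return $this;
    }

    public function title(): string
    {
        return $this->title;
    }
}

/** The facade composes concerns and declares no feature methods of its own. */
final class SketchDocument
{
    use SketchHasPages;
    use SketchHasMetadata;
}

$doc = (new SketchDocument())
    ->setTitle('Quarterly Report')
    ->addPage()
    ->addPage();
```

Adding a feature in this layout means editing or adding one trait; the facade class changes only by one use line, if at all.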
The writer owns file structure exclusively. Content production decides what marks and objects exist; the writer decides how they become a valid PDF file, including which version strategy applies. That separation is enforced as an architectural rule: layout and content code do not emit final file structure, and the writer does not make layout decisions. The benefit is that “is the output a valid PDF?” has exactly one place to be tested.
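
One way to picture the version strategy is an interface with one implementation per PDF version. The sketch below is an assumption about shape only: the interface, class, and function names are hypothetical and do not describe the classes in src/Writer/ or their selection rules. The two facts it relies on come from the PDF format itself: the file header names the version, and cross-reference streams were introduced in PDF 1.5, so a 1.4 file cannot use them.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch; these are not the strategy classes in src/Writer/.

interface SketchVersionStrategy
{
    /** First line of the file, for example "%PDF-1.7". */
    public function header(): string;

    /** Cross-reference streams were introduced in PDF 1.5. */
    public function allowsXrefStreams(): bool;
}

final class SketchPdf14 implements SketchVersionStrategy
{
    public function header(): string { return '%PDF-1.4'; }
    public function allowsXrefStreams(): bool { return false; }
}

final class SketchPdf17 implements SketchVersionStrategy
{
    public function header(): string { return '%PDF-1.7'; }
    public function allowsXrefStreams(): bool { return true; }
}

final class SketchPdf20 implements SketchVersionStrategy
{
    public function header(): string { return '%PDF-2.0'; }
    public function allowsXrefStreams(): bool { return true; }
}

/** Illustrative selection rule: pick a strategy by version string. */
function sketchStrategyFor(string $version): SketchVersionStrategy
{
    return match ($version) {
        '1.4' => new SketchPdf14(),
        '1.7' => new SketchPdf17(),
        '2.0' => new SketchPdf20(),
        default => throw new InvalidArgumentException("Unsupported PDF version: {$version}"),
    };
}

$strategy = sketchStrategyFor('1.4');
```

Layout and content code never see the strategy object in this arrangement; only the writer asks it questions, which keeps version-specific decisions in one place.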
The lifetime boundary is part of the model, not an afterthought. Font and image registries live for the life of the process and are shared across requests; the document, its rendering context, and the writer are created per request and disposed. In a worker runtime that distinction is the difference between safe reuse and cross-request corruption. For that reason it is stated in the architecture, not left to discipline.
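
The boundary can be sketched as two classes with different lifetimes. All names here are hypothetical (SketchFontRegistry, SketchRequestDocument, and the $handle closure are not NextPDF APIs), and the "font data" is a stand-in string. The sketch assumes a worker that boots once and then handles many requests.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch; class names are illustrative, not NextPDF's registries.

/** Process-scoped: built once at worker boot, shared by every request. */
final class SketchFontRegistry
{
    /** @var array<string, string> font name => loaded font data */
    private array $fonts = [];

    /** Number of times a font actually had to be loaded. */
    public int $loads = 0;

    public function get(string $name): string
    {
        if (!isset($this->fonts[$name])) {
            $this->loads++; // expensive work happens once per process
            $this->fonts[$name] = "font-data:{$name}";
        }
        return $this->fonts[$name];
    }
}

/** Per-request: created fresh, finished once, never reused. */
final class SketchRequestDocument
{
    private bool $finished = false;

    /** @var list<string> */
    private array $lines = [];

    public function __construct(private SketchFontRegistry $fonts) {}

    public function write(string $font, string $text): static
    {
        $this->guard();
        $this->lines[] = $this->fonts->get($font) . '|' . $text;
        return $this;
    }

    public function finish(): string
    {
        $this->guard();
        $this->finished = true;
        return implode("\n", $this->lines);
    }

    private function guard(): void
    {
        if ($this->finished) {
            throw new LogicException('Document already finished; create a new one.');
        }
    }
}

$fonts = new SketchFontRegistry(); // worker boot: lives for the process

// Request handler: a new document per call, the shared registry passed in.
$handle = static fn (string $text): string =>
    (new SketchRequestDocument($fonts))->write('helvetica', $text)->finish();

$first  = $handle('request one');
$second = $handle('request two');
```

The second request reuses the cached font but starts from an empty document, so nothing written during the first request can leak into it.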
What the evidence says
This page is Evidence: Code-backed. The stages map to real structure in the core repository:
- The facade and its delegation are src/Core/Document.php plus the concern traits in src/Core/Concerns/ (text output, output, drawing, pages, security, typography, navigation, transactions, and more, each a single responsibility).
- The two content paths are the HTML/CSS engine (src/Html/) and direct drawing (src/Graphics/), both feeding the internal model.
- Serialization and PDF version strategy live in src/Writer/ (PdfWriter.php, with explicit PDF 1.4 / 1.7 / 2.0 strategy classes).
- The process-lifetime vs per-request boundary is the worker-safe design recorded in the architecture overview and exercised by the shipped worker-factory example, which shares a FontRegistry and ImageRegistry across requests while creating each Document fresh.
The destination is fixed by the format. The writer’s output must be a header, an object body, a cross-reference table, and a trailer per ISO 32000-2, §7.5. Concentrating that obligation in one stage is what lets the rest of the engine stay focused on content instead of on assembling file structure.
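
To make the four parts concrete, here is a minimal sketch that assembles a classic (non-stream) cross-reference layout by hand. It is not how NextPDF's writer is implemented; sketchAssemblePdf is a hypothetical function, and the three objects are the smallest set that describes one blank page. The detail worth noticing is that every cross-reference entry is a byte offset, so the body has to be fully laid out before the table can be written.

```php
<?php
declare(strict_types=1);

// Minimal sketch of the classic file layout; the function name is hypothetical.

/**
 * @param list<string> $objects bodies for objects 1..n in order; object 1 is the catalog
 */
function sketchAssemblePdf(array $objects, string $version = '1.7'): string
{
    // 1. Header: names the PDF version.
    $out = "%PDF-{$version}\n";

    // 2. Body: indirect objects, recording the byte offset of each one.
    $offsets = [];
    foreach ($objects as $i => $body) {
        $offsets[] = strlen($out);
        $out .= ($i + 1) . " 0 obj\n{$body}\nendobj\n";
    }

    // 3. Cross-reference table: one 20-byte entry per object, plus free entry 0.
    $xrefAt = strlen($out);
    $size   = count($objects) + 1;
    $out   .= "xref\n0 {$size}\n";
    $out   .= "0000000000 65535 f \n";
    foreach ($offsets as $offset) {
        $out .= sprintf("%010d 00000 n \n", $offset);
    }

    // 4. Trailer: object count, root object, and where the xref table starts.
    $out .= "trailer\n<< /Size {$size} /Root 1 0 R >>\n";
    $out .= "startxref\n{$xrefAt}\n%%EOF\n";

    return $out;
}

$pdf = sketchAssemblePdf([
    '<< /Type /Catalog /Pages 2 0 R >>',
    '<< /Type /Pages /Kids [3 0 R] /Count 1 >>',
    '<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << >> >>',
]);
```

A single wrong offset makes the file unreadable for strict parsers, which is a practical argument for keeping this bookkeeping in one stage with its own tests.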
Practical example
The facade’s job is to make intent read like intent. The content path and the writer stay invisible at the call site:
```php
<?php
declare(strict_types=1);

require_once __DIR__ . '/vendor/autoload.php';

use NextPDF\Core\Document;

$doc = Document::createStandalone();             // facade
$doc->setTitle('Quarterly Report');              // metadata concern
$doc->addPage();                                 // pages concern
$doc->setFont('helvetica', 'B', 16);             // typography concern
$doc->cell(0, 12, 'Summary', newLine: true);     // text-output concern
$doc->writeHtml('<p>Generated in-process.</p>'); // HTML content path
$doc->save(__DIR__ . '/report.pdf');             // writer stage
```

Each call lands in a different concern. Two different content paths feed the same model. Exactly one stage, save(), turns the model into file bytes. Nothing at the call site needs to know how the cross-reference table is built.
Common misconception
The frequent misreading is that “pipeline” implies a streaming push API you wire stage by stage, like a Unix pipe. It does not. The pipeline here is an architectural decomposition: stages with single responsibilities and typed boundaries. You still program against a fluent facade. The stages are how the engine is built and tested, not a transport you assemble by hand.
A related mistake is assuming the facade is the engine. It is the entry point. The real work is distributed across concern traits, two content paths, and a writer. That distribution is precisely why one feature change does not put every output at risk.
Limits and boundaries
This page describes the shape of the pipeline, not the internal API of any
single stage. The exact concern-trait inventory, writer strategy selection
rules, and content-model fields are defined by the code and the reference,
not by this explanation. The precise trait count is an implementation
detail that can change without changing the model. This page does not cover
the HTML engine’s internal stages (a separate topic) or the streaming and
memory behavior of the writer (also separate). The structural claims are
accurate as of this page’s review date; the authoritative source is the core
repository’s src/Core/, src/Html/, src/Graphics/, and src/Writer/.
The pipeline model is identical across editions; editions add capabilities within stages, not new stages:
| Edition | Availability |
|---|---|
| Core | Core implements the full facade → content → writer pipeline. |
| Pro | Pro adds capabilities within existing stages, not new stages. |
| Enterprise | Enterprise adds capabilities within existing stages, not new stages. |
Related docs
- Memory and streaming: how the writer stage keeps memory bounded.
- The HTML pipeline: the internal stages of the HTML content path.
- Strict types, everywhere: the typed boundaries that make each stage independently testable.
Glossary
- Facade: the public Document entry point, a fluent, use-once builder that records intent and delegates to concern traits.
- Concern trait: a focused PHP trait the facade composes, each owning a single feature area (text output, drawing, pages, security, and so on).
- Content path: one of the two ways content enters the model, either direct drawing or the HTML/CSS engine.
- Document model: the engine’s internal, typed accumulation of pages, content, and resources before serialization.
- Writer stage: the component that serializes the model into a valid PDF, selecting a PDF 1.4 / 1.7 / 2.0 strategy.
- Worker-safe: designed so process-lifetime state is shared safely while per-request state is created fresh and never reused.