Writer: PDF 2.0 serializer + xref

The Writer module serializes a document into Portable Document Format (PDF) bytes. It selects a version strategy, writes the object graph, and emits the cross-reference structure and trailer.

composer require nextpdf/core:^3

Use PdfWriter as the entry point. Pass a DocumentData value object to write(). The method returns the complete PDF as a byte string. The writer assembles the object graph, assigns object numbers, records byte offsets, and writes the cross-reference structure last.
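
The ordering described above (number the objects, record each byte offset, write the cross-reference structure last) can be illustrated with a minimal sketch. It is not the NextPDF implementation: sketchSerialize() is a hypothetical helper, it assumes objects are numbered contiguously from 1, and it emits a classic cross-reference table because that form is the easiest to read.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of NextPDF. Serializes numbered object bodies,
 * then appends a classic cross-reference table and trailer.
 *
 * @param array<int, string> $objects object number => body, numbered 1..n without gaps
 */
function sketchSerialize(array $objects, int $rootObject): string
{
    ksort($objects);
    $out = "%PDF-1.7\n";
    $offsets = [];

    foreach ($objects as $number => $body) {
        // Record where this object starts before appending it.
        $offsets[$number] = strlen($out);
        $out .= "{$number} 0 obj\n{$body}\nendobj\n";
    }

    // The table is written last because every offset must be known first.
    $xrefOffset = strlen($out);
    $size = count($objects) + 1; // +1 for the reserved free entry 0
    $out .= "xref\n0 {$size}\n";
    $out .= "0000000000 65535 f \n"; // each table entry is exactly 20 bytes
    foreach ($offsets as $offset) {
        $out .= sprintf("%010d %05d n \n", $offset, 0);
    }

    $out .= "trailer\n<< /Size {$size} /Root {$rootObject} 0 R >>\n";
    $out .= "startxref\n{$xrefOffset}\n%%EOF\n";

    return $out;
}

$pdf = sketchSerialize([
    1 => '<< /Type /Catalog /Pages 2 0 R >>',
    2 => '<< /Type /Pages /Kids [] /Count 0 >>',
], rootObject: 1);
```

The number after startxref is the byte position of the xref keyword. Any change to the bytes written before that point shifts every recorded offset, which is why offset arithmetic is sensitive to small edits.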

For each call, the writer uses one serialization strategy. The PdfSerializationStrategy interface defines four methods: writeHeader(), getCatalogVersion(), writeXrefAndTrailer(), and usesXrefStream(). Three strategies implement it. Pdf20StreamStrategy writes the %PDF-2.0 header, sets the catalog version to /2.0, and emits a cross-reference stream. Pdf17TableStrategy writes %PDF-1.7 and a classic cross-reference table. Pdf14TableStrategy writes %PDF-1.4 and a cross-reference table. PdfWriter picks the strategy with a match on DocumentData::$outputProfile. The default is Pdf20StreamStrategy.
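
As a rough illustration of the selection step, the sketch below matches on a profile enum and returns one strategy object. Every type and function name here (SketchProfile, SketchStrategy, sketchSelectStrategy, and the three classes) is a hypothetical stand-in, and the interface carries only two of the four documented methods to keep the example short. It assumes PHP 8.1 or later for enums.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration; the real NextPDF types may differ.
enum SketchProfile
{
    case Pdf20;
    case Pdf17;
    case Pdf14;
}

interface SketchStrategy
{
    public function writeHeader(): string;
    public function usesXrefStream(): bool;
}

final class SketchPdf20Stream implements SketchStrategy
{
    public function writeHeader(): string { return "%PDF-2.0\n"; }
    public function usesXrefStream(): bool { return true; }
}

final class SketchPdf17Table implements SketchStrategy
{
    public function writeHeader(): string { return "%PDF-1.7\n"; }
    public function usesXrefStream(): bool { return false; }
}

final class SketchPdf14Table implements SketchStrategy
{
    public function writeHeader(): string { return "%PDF-1.4\n"; }
    public function usesXrefStream(): bool { return false; }
}

function sketchSelectStrategy(?SketchProfile $profile): SketchStrategy
{
    // A missing profile falls back to PDF 2.0, mirroring the documented default.
    return match ($profile ?? SketchProfile::Pdf20) {
        SketchProfile::Pdf20 => new SketchPdf20Stream(),
        SketchProfile::Pdf17 => new SketchPdf17Table(),
        SketchProfile::Pdf14 => new SketchPdf14Table(),
    };
}
```

Because match is exhaustive over the enum cases, adding a fourth case without a matching arm would raise an UnhandledMatchError at runtime when that case is selected.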

The PdfOutputProfile enum carries the three target versions: Pdf20, Pdf17, and Pdf14. It exposes headerVersion(), catalogVersion(), allowsObjectStreams(), and usesXrefStream(). An archival conformance mode overrides the chosen profile before strategy selection. Pdf14FeatureGuard rejects PDF 2.0 features when the profile is Pdf14.

A cross-reference stream maps each object number to its byte offset, or to its position inside an object stream when the object is compressed, as defined by ISO 32000-2 §7.5.8. Incremental updates append new objects to the end of the file, as defined by ISO 32000-2 §7.5.6. The writer escapes every literal string through the canonical PdfStringEscaper::escapeLiteral() path, which follows the normative escape table in ISO 32000-2 §7.3.4.2 (ADR-015).
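
To make the escaping rule concrete, here is a minimal sketch of a literal-string escaper that follows the escape sequences listed in the ISO 32000-2 literal-string table. sketchEscapeLiteral() is a hypothetical function, not PdfStringEscaper::escapeLiteral(). It makes two conservative choices that the real path may not share: it escapes every parenthesis even when balanced, and it writes remaining control bytes as three-digit octal.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch, not the NextPDF escaper. Wraps raw bytes in a PDF
 * literal string, escaping delimiters and control bytes.
 */
function sketchEscapeLiteral(string $bytes): string
{
    // Named escape sequences from the literal-string escape table.
    $map = [
        "\\" => "\\\\",
        "(" => "\\(",
        ")" => "\\)",
        "\n" => "\\n",
        "\r" => "\\r",
        "\t" => "\\t",
        "\x08" => "\\b",
        "\x0C" => "\\f",
    ];

    $out = '';
    $length = strlen($bytes);
    for ($i = 0; $i < $length; $i++) {
        $byte = $bytes[$i];
        if (isset($map[$byte])) {
            $out .= $map[$byte];
        } elseif (ord($byte) < 0x20 || ord($byte) === 0x7F) {
            // Any other control byte becomes a three-digit octal escape.
            $out .= sprintf('\\%03o', ord($byte));
        } else {
            $out .= $byte;
        }
    }

    return '(' . $out . ')';
}

$token = sketchEscapeLiteral('a(b)\\c');
```

With every delimiter escaped, an attacker-controlled value cannot close the string token early, which is the property the security notes on this page rely on.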

The writer supports deterministic output. setDeterministicMode() pins object identifiers and dictionary key order. setReproducibleClock() pins the document timestamp. With both pins set, a fixed input produces byte-identical output. The writeChunked() method returns a generator that yields the PDF in fixed-size chunks. Streaming\StreamingPdfWriter writes one page at a time to a caller-supplied stream for documents that exceed the memory budget.
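
The two ideas in this paragraph can be sketched independently of the library. The functions below are hypothetical illustrations, not NextPDF code: sketchDictionary() sorts keys so that insertion order cannot change the bytes, sketchPdfDate() formats a pinned instant instead of reading the system clock, and sketchChunks() slices a finished byte string the way a chunked generator might.

```php
<?php
declare(strict_types=1);

/** Hypothetical: serializes a dictionary with sorted keys, so insertion order never affects output. */
function sketchDictionary(array $entries): string
{
    ksort($entries, SORT_STRING);
    $parts = [];
    foreach ($entries as $key => $value) {
        $parts[] = "/{$key} {$value}";
    }

    return '<< ' . implode(' ', $parts) . ' >>';
}

/** Hypothetical: formats a pinned instant as a PDF date string in UTC. */
function sketchPdfDate(DateTimeImmutable $instant): string
{
    return '(D:' . $instant->setTimezone(new DateTimeZone('UTC'))->format('YmdHis') . 'Z)';
}

/** Hypothetical: yields a byte string in slices of at most $chunkSize bytes. */
function sketchChunks(string $bytes, int $chunkSize): Generator
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunkSize must be positive');
    }
    for ($offset = 0, $length = strlen($bytes); $offset < $length; $offset += $chunkSize) {
        yield substr($bytes, $offset, $chunkSize);
    }
}

$pinned = new DateTimeImmutable('2026-01-01T00:00:00Z');
$first = sketchDictionary(['Producer' => '(sketch)', 'CreationDate' => sketchPdfDate($pinned)]);
$second = sketchDictionary(['CreationDate' => sketchPdfDate($pinned), 'Producer' => '(sketch)']);
$chunks = iterator_to_array(sketchChunks('abcdefg', 3), false);
```

Here $first and $second are identical even though the entries were supplied in a different order, and joining $chunks reproduces the input. Dropping either the key sort or the pinned instant would let the output vary between runs.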

Linearizer rewrites a finished PDF into a linearized layout. It places the first page early, so a viewer can show it before the full download completes. shadowValidate() checks the rewrite without changing the input.

Caution. PdfWriter.php and Linearizer.php are critical to byte offsets and the object graph (manifest danger zones). Do not change object numbering or xref offset arithmetic without the Writer golden suite.

| Class | Key methods | Role |
| --- | --- | --- |
| PdfWriter | write(DocumentData): string, writeChunked(DocumentData, int): Generator, setDeterministicMode(), setReproducibleClock(), setOutputColorProfile(), getLastXrefOffset(), getFileId() | Primary serializer |
| PdfSerializationStrategy (interface) | writeHeader(), getCatalogVersion(), writeXrefAndTrailer(), usesXrefStream() | Version strategy contract |
| Pdf20StreamStrategy | writeHeader() → %PDF-2.0, getCatalogVersion() → /2.0, usesXrefStream() → true | PDF 2.0 xref-stream strategy |
| Pdf17TableStrategy | writeHeader() → %PDF-1.7, xref table | PDF 1.7 xref-table strategy |
| Pdf14TableStrategy | writeHeader() → %PDF-1.4, xref table | PDF 1.4 xref-table strategy |
| PdfOutputProfile (enum) | Pdf20, Pdf17, Pdf14; headerVersion(), catalogVersion(), allowsObjectStreams() | Target-version selector |
| PdfXrefWriter | generateFileId(), finalizeTrailerAndXref() | File ID + trailer/xref finalization |
| Linearizer | linearize(string): string, shadowValidate(string): array | Fast-web-view rewrite |
| Streaming\StreamingPdfWriter | open(), newPage(), close() | Single-pass streaming writer |

Run composer docs:generate-api-php -- --module=Writer to generate the full PHPDoc table.

Source: examples/02-pdf-factory.php.

<?php
declare(strict_types=1);

require_once __DIR__ . '/../vendor/autoload.php';

use NextPDF\Writer\PdfWriter;

// $documentData is a DocumentData value object built earlier (construction not shown here).
$writer = new PdfWriter();
$pdfBytes = $writer->write($documentData);

file_put_contents('out.pdf', $pdfBytes);

The default profile is PDF 2.0. Output starts with %PDF-2.0, and its cross-reference data is a stream rather than a table. The file closes with the startxref pointer and the %%EOF marker.

The next example pins deterministic mode and a fixed clock for byte-identical output, then streams the result in fixed-size chunks.

<?php
declare(strict_types=1);

require_once __DIR__ . '/../vendor/autoload.php';

use NextPDF\Writer\PdfWriter;
use NextPDF\Writer\ReproducibleClock;

// $documentData is a DocumentData value object built earlier (construction not shown here).
$pinned = new DateTimeImmutable('2026-01-01T00:00:00Z');

$writer = new PdfWriter();
$writer->setDeterministicMode($pinned, 'nextpdf-fixed-file-id');
$writer->setReproducibleClock(new ReproducibleClock($pinned));

// Consume the generator fully; a partial read leaves a truncated PDF.
$out = fopen('php://output', 'wb');
foreach ($writer->writeChunked($documentData, chunkSize: 65536) as $chunk) {
    fwrite($out, $chunk);
}
fclose($out);

  • Only one strategy runs per write() call. The writer resets the strategy from the profile on every call. A prior call does not leak its version.
  • An archival conformance mode overrides the requested profile. A PDF/A-3 build forces PDF 1.7. A PDF/A-4 build forces PDF 2.0.
  • Byte-identical output requires both pins. Set the deterministic mode and a reproducible clock. One pin alone is not enough.
  • writeChunked() returns a generator. Consume it completely. A partial read produces a truncated, invalid PDF.
  • Linearizer rewrites cross-reference offsets. In a pipeline that cannot tolerate a failed rewrite, run shadowValidate() first.
  • Pdf14TableStrategy is final readonly. The PDF 1.4 path rejects PDF 2.0 features through Pdf14FeatureGuard; it does not degrade them.
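
The archival override in the list above can be read as a small resolution step that runs before strategy selection. The sketch below is an assumption about shape only: SketchOutputProfile, SketchConformance, and sketchResolveProfile() are hypothetical names, and only the two mappings stated on this page (PDF/A-3 to PDF 1.7, PDF/A-4 to PDF 2.0) are encoded.

```php
<?php
declare(strict_types=1);

// Hypothetical types for illustration; not the NextPDF enums.
enum SketchOutputProfile: string
{
    case Pdf20 = '2.0';
    case Pdf17 = '1.7';
    case Pdf14 = '1.4';
}

enum SketchConformance
{
    case None;
    case PdfA3;
    case PdfA4;
}

/** Resolves the effective profile: an archival mode wins over the requested profile. */
function sketchResolveProfile(
    SketchOutputProfile $requested,
    SketchConformance $mode,
): SketchOutputProfile {
    return match ($mode) {
        SketchConformance::PdfA3 => SketchOutputProfile::Pdf17,
        SketchConformance::PdfA4 => SketchOutputProfile::Pdf20,
        SketchConformance::None => $requested,
    };
}

$effective = sketchResolveProfile(SketchOutputProfile::Pdf14, SketchConformance::PdfA3);
```

In this sketch a caller who requests PDF 1.4 together with PDF/A-3 ends up with the PDF 1.7 profile, so the requested value is only honored when no archival mode is set.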

Serialization is linear in the object count and total byte size. The cross-reference stream adds one pass over the object table. writeChunked() holds the assembled document but yields it in bounded slices, so peak memory is the document size plus one chunk. Streaming\StreamingPdfWriter does not hold the whole document; use it for inputs larger than the memory budget. The reference workload budget is 1500 ms wall-clock time and 64 MB peak memory. Linearization adds a second full pass and a measure pass. Budget for it explicitly.

The writer serializes a trusted in-memory object graph. Its inputs are the main threat boundary. Every literal string passes through the canonical PdfStringEscaper::escapeLiteral() (ADR-015), so embedded control bytes cannot break out of a string token. Encryption is wired through PdfEncryptionWriter and the /Encrypt trailer entry. Public-key encryption is rejected with an explicit exception rather than silently downgraded. The deterministic and reproducible-clock modes remove timestamp and ordering side channels from the output. See /modules/core/security/ for the document threat model and the encryption trust boundary.

The Writer produces PDF 2.0 file structures: the %PDF-2.0 header, a /2.0 catalog version, a cross-reference stream, and literal-string escaping per the ISO 32000-2 §7.3.4.2 escape table. These are implementation facts. Evidence lives in src/Writer/Pdf20StreamStrategy.php, src/Writer/PdfSerializationStrategy.php, and the strategy selection in src/Writer/PdfWriter.php. The behavior is exercised by tests/Unit/Writer/ (192 tests, including the Pdf20StreamStrategy, PdfXrefWriter, and Linearizer* suites) and the tests/Golden/PdfWriter/PdfWriterGoldenBaselineSmokeTest baseline.

This is not a claim of full PDF 2.0 conformance. Full ISO 32000-2 conformance is a property of a complete document validated by an external oracle, not of the serializer alone. End-to-end conformance is asserted only where an oracle confirms it: tests/Integration/Accessibility/VeraPdfUa2GoldenTest validates a generated fixture against veraPDF for PDF/UA-2, and tests/Standards/Profile/PdfRConformanceTest covers the PDF/R profile. The veraPDF golden test skips when the veraPDF binary is absent from the runner, so it is an opt-in oracle gate, not an unconditional one. Set VERAPDF_BINARY to run it. Archival-profile selection (PDF/A-3 → PDF 1.7, PDF/A-4 → PDF 2.0) is decided by ADR-011 and the conformance mode, and validated by the conformance suites in /modules/core/conformance/. Outside those oracle-backed profiles, state that the Writer “produces PDF 2.0 structures; conformance is validated by veraPDF for the PDF/UA-2 profile” rather than asserting unqualified conformance.