Chaos: deterministic resilience-scenario harness

The Chaos module is a compact harness for resilience testing. You register fault-injection scenarios that implement a one-method interface, run them, and collect a structured pass/fail report. It stays deliberately small at five classes and belongs in resilience suites and chaos-day exercises, not in the document-production path.

Stability: experimental. This is a test-and-resilience tooling surface, not a core Portable Document Format (PDF) application programming interface (API). The service provider interface (SPI) is small and has a stable shape, but the module’s scope and bundled scenarios evolve. Do not build production control flow on it.

```bash
composer require nextpdf/core:^3
```

A resilience test asks whether the engine degrades correctly when a dependency fails. The Chaos module gives that test a structure. ChaosScenarioInterface is the scenario contract: name() identifies the scenario, and simulate() returns a ChaosOutcome. Each scenario encapsulates one fault, such as a network partition or a burst of retrieval-backend 5xx responses, and reports what happened.

ChaosScenarioRunner orchestrates the run. You register() scenarios, call run() to execute them sequentially in registration order, and then read the aggregate with outcomes(), allPassed(), passCount(), and failCount(). The runner never throws on a scenario failure: a failure is data captured in a ChaosOutcome, not an exception. It throws only when its own infrastructure is broken, such as an invalid scenario registration or an inability to write the report file (ChaosReportWriteException). A scenario that cannot reach the resource it is testing surfaces a RetrievalUnavailableException. The module has been available since 3.2.0 (@since 3.2.0).
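
The "failure is data, not an exception" behavior described above can be illustrated with a small stand-in. This is a minimal sketch, not the NextPDF implementation: SketchOutcome, SketchScenario, SketchRunner, and FixedScenario are hypothetical names, and their constructor arguments are assumptions made only so the example runs without the library installed.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins that mirror the documented shape. They are NOT the
// NextPDF classes; names and constructor arguments are assumptions.
final class SketchOutcome
{
    public function __construct(
        public readonly string $name,
        public readonly bool $passed,
    ) {}
}

interface SketchScenario
{
    public function name(): string;
    public function simulate(): SketchOutcome;
}

final class SketchRunner
{
    /** @var list<SketchScenario> */
    private array $scenarios = [];
    /** @var list<SketchOutcome> */
    private array $outcomes = [];

    public function register(SketchScenario $scenario): void
    {
        $this->scenarios[] = $scenario;
    }

    public function run(): void
    {
        $this->outcomes = [];
        // Sequential, in registration order. A failed outcome is recorded, not thrown.
        foreach ($this->scenarios as $scenario) {
            $this->outcomes[] = $scenario->simulate();
        }
    }

    public function failCount(): int
    {
        $failed = array_filter($this->outcomes, static fn (SketchOutcome $o): bool => !$o->passed);
        return count($failed);
    }

    public function allPassed(): bool
    {
        return $this->failCount() === 0;
    }
}

// A fixed-verdict scenario, used only to exercise the sketch.
final class FixedScenario implements SketchScenario
{
    public function __construct(
        private readonly string $name,
        private readonly bool $passes,
    ) {}

    public function name(): string
    {
        return $this->name;
    }

    public function simulate(): SketchOutcome
    {
        return new SketchOutcome($this->name, $this->passes);
    }
}

$runner = new SketchRunner();
$runner->register(new FixedScenario('dependency-timeout', true));
$runner->register(new FixedScenario('backend-5xx-burst', false));
$runner->run(); // completes even though one scenario failed
```

In this sketch, run() returns normally and the failed scenario is visible only through failCount() and allPassed(), which is the reading pattern the real runner expects.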

ChaosOutcome stores the per-scenario result, namely pass/fail status and duration, and exposes toArray() for the structured report. Because an outcome records wall-clock duration, the report is reproducible in structure but not bit for bit: the same scenarios produce the same fields, while the duration values differ from run to run.
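
One way to check structural reproducibility is to drop the timing fields before comparing two reports. The following is a minimal sketch under stated assumptions: stripVolatile() is a hypothetical helper, and the row layout with name, passed, and durationSeconds keys is illustrative, since the actual toArray() layout is not specified here.

```php
<?php
declare(strict_types=1);

/**
 * Recursively drop volatile keys so two reports can be compared by structure.
 *
 * @param array<array-key, mixed> $report
 * @param list<string> $volatileKeys
 * @return array<array-key, mixed>
 */
function stripVolatile(array $report, array $volatileKeys): array
{
    $clean = [];
    foreach ($report as $key => $value) {
        if (is_string($key) && in_array($key, $volatileKeys, true)) {
            continue; // timing differs on every run
        }
        $clean[$key] = is_array($value) ? stripVolatile($value, $volatileKeys) : $value;
    }
    return $clean;
}

// Hypothetical report rows; the real toArray() layout may differ.
$firstRun = [
    ['name' => 'dependency-timeout', 'passed' => true, 'durationSeconds' => 1.204],
];
$secondRun = [
    ['name' => 'dependency-timeout', 'passed' => true, 'durationSeconds' => 1.187],
];

// Compare only the fields that are expected to be stable across runs.
$sameStructure = stripVolatile($firstRun, ['durationSeconds'])
    === stripVolatile($secondRun, ['durationSeconds']);
```

A comparison of the raw arrays would report a difference on every run, so a regression check built on the report should compare the stripped form.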

| Type | Key members | Role |
| --- | --- | --- |
| ChaosScenarioInterface | name(): string, simulate(): ChaosOutcome | The scenario contract (@since 3.2.0) |
| ChaosScenarioRunner | register(), run(), outcomes(), allPassed(), passCount(), failCount() | Sequential scenario orchestrator (@since 3.2.0) |
| ChaosOutcome | durationSeconds(), toArray() | Per-scenario pass/fail result (@since 3.2.0) |
| RetrievalUnavailableException | | A tested resource was unreachable |
| ChaosReportWriteException | | The report file could not be written |

Run composer docs:generate-api-php -- --module=Chaos to generate the full PHPDoc table.

Register a scenario, then run the suite.

```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../vendor/autoload.php';

use NextPDF\Chaos\ChaosOutcome;
use NextPDF\Chaos\ChaosScenarioInterface;
use NextPDF\Chaos\ChaosScenarioRunner;

final class TimeoutScenario implements ChaosScenarioInterface
{
    public function name(): string
    {
        return 'dependency-timeout';
    }

    public function simulate(): ChaosOutcome
    {
        // Inject the fault, observe the engine's degradation, return the verdict.
        return new ChaosOutcome(/* name, passed, durationSeconds, details */);
    }
}

$runner = new ChaosScenarioRunner();
$runner->register(new TimeoutScenario());
$runner->run();

echo $runner->allPassed() ? "Resilient.\n" : "{$runner->failCount()} scenario(s) failed.\n";
```
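
The body of simulate() has to time the fault injection and reduce it to a verdict. A minimal sketch of that step follows. measureProbe() is a hypothetical helper, not part of the module, and treating any exception from the probe as a failed verdict is an assumption of this sketch. A real scenario may prefer to let infrastructure exceptions such as RetrievalUnavailableException propagate.

```php
<?php
declare(strict_types=1);

/**
 * Run a probe, time it, and turn the result into a verdict.
 * The probe returns true when the engine degraded as expected.
 *
 * @param callable(): bool $probe
 * @return array{passed: bool, durationSeconds: float}
 */
function measureProbe(callable $probe): array
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    try {
        $passed = $probe();
    } catch (\Throwable) {
        // Assumption of this sketch: an unexpected exception is a failed verdict.
        $passed = false;
    }
    $elapsedNs = hrtime(true) - $start;

    return ['passed' => $passed, 'durationSeconds' => $elapsedNs / 1e9];
}

// A probe that succeeds, and one that simulates a dependency timeout.
$ok = measureProbe(static fn (): bool => true);
$failed = measureProbe(static function (): bool {
    throw new \RuntimeException('simulated dependency timeout');
});
```

The two values could then be handed to whatever ChaosOutcome constructor the installed version exposes. hrtime() is used instead of microtime() because it is monotonic and unaffected by system clock adjustments.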

Drive the harness from a resilience job, and return a non-zero exit code for any failure without letting a scenario failure escape as an exception.

```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../vendor/autoload.php';

use NextPDF\Chaos\ChaosScenarioRunner;
use NextPDF\Chaos\Exception\ChaosReportWriteException;
use Psr\Log\LoggerInterface;

final readonly class ChaosJob
{
    /** @param list<\NextPDF\Chaos\ChaosScenarioInterface> $scenarios */
    public function __construct(
        private array $scenarios,
        private LoggerInterface $logger,
    ) {}

    public function run(string $reportPath): int
    {
        $runner = new ChaosScenarioRunner();
        foreach ($this->scenarios as $scenario) {
            $runner->register($scenario);
        }

        $runner->run(); // never throws on scenario failure

        try {
            $runner->writeReport($reportPath);
        } catch (ChaosReportWriteException $e) {
            $this->logger->error('Chaos report could not be written.', ['error' => $e->getMessage()]);

            return 2;
        }

        return $runner->allPassed() ? 0 : 1;
    }
}
```

  • run() never throws because a scenario failed. A failure lives in ChaosOutcome. If you wrap run() in a try/catch expecting failures there, you will not see them. Read failCount() / allPassed() instead.
  • The runner throws only on infrastructure faults: a bad registration, or a ChaosReportWriteException when the report path is unwritable. Handle those faults separately from scenario results.
  • Scenarios run sequentially in registration order. There is no parallelism. Ordering can matter when scenarios share external state.
  • This module is for resilience testing. Do not import the runner into the document-production path as a control mechanism.

The runner adds negligible overhead. Scenario behavior determines the cost. Because scenarios inject faults and may wait on timeouts, a chaos run can be slow by design. The performance_budget here is the engine’s reference figure, not a bound on scenario duration. The reproducibility profile is structural: the report records wall-clock durations, so those fields differ across runs.

Scenarios inject faults and may exercise failure paths in dependencies. Run the harness only in a test or staging environment, with credentials and endpoints scoped to that environment. Never run it against production systems. The report can contain diagnostic detail about failure modes. Treat it as internal, and apply the project’s log-scrubbing obligation before you share it. See the engine threat model in /modules/core/security/.
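
Scrubbing a report before sharing it could look like the following. This is a minimal sketch, assuming the report is available as a nested array. scrubReport() and the key names endpoint, token, and details are hypothetical and stand in for whatever the project's log-scrubbing policy actually lists.

```php
<?php
declare(strict_types=1);

/**
 * Replace values under sensitive keys before a report leaves the team.
 *
 * @param array<array-key, mixed> $report
 * @param list<string> $sensitiveKeys lowercase key names to redact
 * @return array<array-key, mixed>
 */
function scrubReport(array $report, array $sensitiveKeys): array
{
    foreach ($report as $key => $value) {
        if (is_string($key) && in_array(strtolower($key), $sensitiveKeys, true)) {
            $report[$key] = '[redacted]';
        } elseif (is_array($value)) {
            $report[$key] = scrubReport($value, $sensitiveKeys);
        }
    }
    return $report;
}

// Hypothetical report fragment; key names are illustrative only.
$report = [
    'name' => 'backend-5xx-burst',
    'passed' => false,
    'details' => [
        'endpoint' => 'https://staging.example.test/retrieve',
        'token' => 'abc123',
    ],
];

// The original array is left untouched; only the returned copy is redacted.
$shareable = scrubReport($report, ['endpoint', 'token']);
```

A key-based redaction like this catches only fields that are named in advance, so free-text diagnostic messages still need a manual review.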

This module makes no PDF-specification normative claim. It is resilience tooling. It implements no standardized protocol whose clauses must be cited. The oracle and golden suites described in /modules/core/conformance/ validate engine conformance.