Examples

These examples all ask the same question: if you turn a dated record into a world model, does it give a better read of consequences than an LLM reading the same visible context?

The clean pattern is simple. Build from the record. Hide the future. Ask for a forecast. Then, where history provides an answer key, check the forecast against what actually followed.
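
As a rough illustration of that pattern, here is a minimal Python sketch of a cutoff backtest. Everything in it is an assumption made for illustration: the `Event` record, the `backtest` helper, the absolute-error score, and the two toy forecasters are hypothetical stand-ins, not the world model, the GPT baseline, or the scoring used in these examples. The values are placeholders, not real data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    """One dated record entry (hypothetical shape)."""
    day: date
    text: str
    value: float  # placeholder numeric outcome, used only for scoring


Forecaster = Callable[[Sequence[Event]], float]


def split_at_cutoff(events: Sequence[Event], cutoff: date) -> Tuple[List[Event], List[Event]]:
    """Return (visible, held_out): entries on or before the cutoff, and entries after it."""
    ordered = sorted(events, key=lambda e: e.day)
    visible = [e for e in ordered if e.day <= cutoff]
    held_out = [e for e in ordered if e.day > cutoff]
    return visible, held_out


def backtest(events: Sequence[Event], cutoff: date,
             forecasters: Dict[str, Forecaster]) -> Dict[str, float]:
    """Score each forecaster by absolute error against the first held-out entry."""
    visible, held_out = split_at_cutoff(events, cutoff)
    if not held_out:
        raise ValueError("no entries after the cutoff, so there is no answer key")
    actual = held_out[0].value
    # Each forecaster sees only the visible past, never the held-out entries.
    return {name: abs(predict(visible) - actual) for name, predict in forecasters.items()}


# Placeholder record: four dated entries with made-up values.
RECORD = [
    Event(date(2000, 1, 1), "entry a", 1.0),
    Event(date(2000, 2, 1), "entry b", 2.0),
    Event(date(2000, 3, 1), "entry c", 3.0),
    Event(date(2000, 4, 1), "entry d", 4.0),
]

# Two toy forecasters standing in for any pair of systems being compared.
FORECASTERS: Dict[str, Forecaster] = {
    "repeat_last": lambda past: past[-1].value,
    "extend_trend": lambda past: past[-1].value + (past[-1].value - past[-2].value),
}

errors = backtest(RECORD, date(2000, 3, 15), FORECASTERS)
```

With this placeholder record, the lower error marks the closer forecast. The key design point is that the split happens before any forecaster runs, so neither system can see the held-out entries.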

PDF to event stream

Bismarck

A dense historical PDF becomes dated events, semantic pressures, and a fork explorer. The scored part hides later history and asks whether the world model or GPT gives the closer forecast from the same past.

  • Best public proof of the core idea
  • Scored against later historical entries
  • Custom forks are shown as forecasts, not proof

Company archive

Enron

Internal email, market, and news data become replayable company state. Choose a cutoff, write an action as an Enron actor, and compare the world-model forecast with a GPT baseline.

  • Real company event record
  • Held-out rows and branch cases
  • Good for enterprise-risk intuition

Public record

Public History

U.S. macro records are converted into dated state: inflation, labor, rates, GDP, Treasury yields, and public releases. Pick a cutoff and test a memo, warning, watch, or hold recommendation.

  • Current and analog macro windows
  • Open-ended branch text
  • Best for explaining the method in public data
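
One way to picture "dated state" at a cutoff is an as-of lookup: for each series, take the most recent value released on or before the cutoff date. The sketch below is an assumption about how such a lookup could work, not a description of how this example is built. The `RELEASES` table, the series names, and every value in it are hypothetical placeholders, not real statistics.

```python
from bisect import bisect_right
from datetime import date
from typing import Dict, List, Optional, Tuple

# Hypothetical release log: series name -> (release_date, value) pairs sorted by date.
# All values are placeholders, not real macro data.
RELEASES: Dict[str, List[Tuple[date, float]]] = {
    "inflation": [
        (date(2000, 1, 15), 1.0),
        (date(2000, 2, 15), 2.0),
        (date(2000, 3, 15), 3.0),
    ],
    "policy_rate": [
        (date(2000, 1, 31), 10.0),
        (date(2000, 3, 31), 20.0),
    ],
}


def state_as_of(releases: Dict[str, List[Tuple[date, float]]],
                cutoff: date) -> Dict[str, Optional[float]]:
    """Return the latest value per series released on or before the cutoff."""
    state: Dict[str, Optional[float]] = {}
    for name, rows in releases.items():
        release_dates = [released for released, _ in rows]
        # bisect_right counts the releases dated on or before the cutoff.
        count = bisect_right(release_dates, cutoff)
        state[name] = rows[count - 1][1] if count > 0 else None
    return state


snapshot = state_as_of(RELEASES, date(2000, 2, 28))
```

Here `snapshot` holds only what had been published by the cutoff, and a series with no release yet maps to `None` rather than borrowing a later value.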

Live company record

Py Insights

A private company work record is refreshed into workflows, deployment gates, and next checks. The public version is sanitized, but it shows the enterprise use case most directly.

  • Private data stays private
  • Workflow and org-design output
  • Connects the model to agent deployment

Autonomous workflow

Dispatch

The Autonomous Press is a different kind of example: a repeatable agent workflow with sourcing, editorial choices, archive output, and a feedback path.

  • Workflow proof, not a world-model benchmark
  • Shows governed generation in the wild
  • Useful as an agent-readiness companion

What We Think This Shows

The world model is strongest when the problem is local and temporal: a company, a record, a cutoff, an action, and a later outcome. GPT is excellent at reading and explaining context. The world model adds the thing a deployment needs: a consequence forecast with a yardstick.
