Examples
These examples all ask the same question: if you turn a dated record into a world model, does it give a better read of consequences than an LLM reading the same visible context?
The clean pattern is simple. Build from the record. Hide the future. Ask for a forecast. Then, where history supplies an answer key, check the forecast against what actually followed.
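As a rough illustration of that pattern, here is a minimal Python sketch. It is not the implementation behind these examples: the `Event` type, the function names, and the choice of a Brier score as the scoring rule are all assumptions made for illustration, and the events and probabilities in the usage section are placeholders, not real results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """One dated entry in the record (hypothetical shape)."""
    when: date
    text: str


def split_at_cutoff(events, cutoff):
    """Return (visible, hidden): events on or before the cutoff, and those after it."""
    ordered = sorted(events, key=lambda e: e.when)
    visible = [e for e in ordered if e.when <= cutoff]
    hidden = [e for e in ordered if e.when > cutoff]
    return visible, hidden


def brier_score(prob, happened):
    """Squared error of a probability forecast against a yes/no outcome (lower is better)."""
    return (prob - (1.0 if happened else 0.0)) ** 2


def closer_forecast(forecasts, happened):
    """Name of the forecaster whose probability scored best (first one wins a tie)."""
    return min(forecasts, key=lambda name: brier_score(forecasts[name], happened))


if __name__ == "__main__":
    # Placeholder record: three dated entries with dummy text.
    record = [
        Event(date(2001, 1, 1), "entry A"),
        Event(date(2001, 6, 1), "entry B"),
        Event(date(2001, 12, 1), "entry C"),
    ]
    visible, hidden = split_at_cutoff(record, cutoff=date(2001, 6, 1))
    print(len(visible), "visible;", len(hidden), "held out")

    # Placeholder probabilities that some later outcome occurs.
    forecasts = {"world_model": 0.7, "llm_baseline": 0.4}
    print(closer_forecast(forecasts, happened=True))
```

Both forecasters see only `visible`; the `hidden` entries are used solely to decide whether the outcome happened. Any proper scoring rule could stand in for the Brier score here.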
PDF to event stream
Bismarck
A dense historical PDF becomes dated events, semantic pressures, and a fork explorer. The scored part hides later history and asks whether the world model or GPT gives the closer forecast from the same past.
- Best public proof of the core idea
- Scored against later historical entries
- Custom forks are shown as forecasts, not proof
Company archive
Enron
Internal email, market, and news data become replayable company state. Choose a cutoff, write an action as an Enron actor, and compare the world-model forecast with a GPT baseline.
- Real company event record
- Held-out rows and branch cases
- Good for enterprise-risk intuition
Public record
Public History
U.S. macro records are converted into dated state: inflation, labor, rates, GDP, Treasury yields, and public releases. Pick a cutoff and test a memo, warning, watch, or hold recommendation.
- Current and analog macro windows
- Open-ended branch text
- Best for explaining the method with public data
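One way to picture "dated state at a cutoff" is an as-of lookup: for each series, take the latest value released on or before the cutoff and ignore everything after it. The sketch below assumes a simple in-memory layout (a dict of sorted `(release_date, value)` lists); the `state_as_of` name, the series names, and the numbers are hypothetical placeholders, not real statistics or the project's actual data model.

```python
from bisect import bisect_right
from datetime import date


def state_as_of(releases, cutoff):
    """Latest value of each series released on or before the cutoff.

    releases maps a series name to a list of (release_date, value) pairs
    sorted by release_date. Series with no release by the cutoff are omitted.
    """
    state = {}
    for name, rows in releases.items():
        dates = [d for d, _ in rows]
        i = bisect_right(dates, cutoff)  # number of rows released by the cutoff
        if i:
            state[name] = rows[i - 1][1]
    return state


if __name__ == "__main__":
    # Placeholder values only.
    releases = {
        "inflation": [(date(2020, 1, 15), 1.0), (date(2020, 2, 15), 2.0)],
        "policy_rate": [(date(2020, 3, 1), 3.0)],
    }
    print(state_as_of(releases, cutoff=date(2020, 2, 20)))
```

Keying on release date rather than the period a figure describes keeps a later publication or revision from leaking into the state seen at the cutoff.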
Live company record
Py Insights
A private company work record is refreshed into workflows, deployment gates, and next checks. The public version is sanitized, but it shows the enterprise use case most directly.
- Private data stays private
- Workflow and org-design output
- Connects the model to agent deployment
Autonomous workflow
Dispatch
The Autonomous Press is a different kind of example: a repeatable agent workflow with sourcing, editorial choices, archive output, and a feedback path.
- Workflow proof, not a world-model benchmark
- Shows governed generation in the wild
- Useful as an agent-readiness companion
What We Think This Shows
The world model is strongest when the problem is local and temporal: a company, a record, a cutoff, an action, and a later outcome. GPT is excellent at reading and explaining context. The world model adds the thing a deployment needs: a consequence forecast that can be checked against what later happened.