
Announcement

Introducing CDE: The Causal Dynamics Engine for Science and Engineering

Research teams today face a coordination problem: more instruments, competing hypotheses, and partially documented pipelines than any team can faithfully manage from raw signal to defensible claim. Standard AI tools help with subtasks — summarizing literature, suggesting plots, accelerating well-defined steps — but they still leave humans carrying the full burden of experimental design, method selection, and evidentiary discipline.

CDE conducts research workflows end to end: profiling your data, routing across discovery modes with explicit rationale, running the negative controls that serious science demands, and emitting typed scientific claims tied to a hashed evidence ledger. Every claim is replayable — by a regulator, a reviewer, or your future self — years after the original analysis.

The difference is architectural. Most AI products for science treat each analysis as a stateless transaction: data in, output out, context gone. CDE treats each analysis as a governed run within a broader research program. Memory, provenance, and evidentiary discipline persist across sessions, team members, and time. That persistence turns isolated analyses into compounding organizational knowledge.

[Figure: CDE — Causal Dynamics Engine for Science and Engineering overview]

From tools to engines: why architecture matters

CDE maintains intent across time. It profiles incoming datasets with the rigor of a careful PI: selecting and combining internal methods — equation extraction, learned representations, and hybrid analysis — to match the causal discovery objective for the phenomenon at hand. It carries memory of what failed, what was fragile under resampling, and what deserves another pass under a stricter Truth Dial tier.

Discovery modes as honest routing

CDE provides four discovery modes, each addressing a distinct class of scientific question. Symbolic mode searches for compact governing relationships — equations, conservation laws, rate expressions — in forms humans can inspect and regulators can interrogate. Neural mode represents high-dimensional fields where flexibility must come first. Neuro-Symbolic mode uses neural representations as a scaffold, then distills interpretable structure when the data support it. Causal mode targets mechanism-level claims about how coupled systems evolve, treating interventions and path fidelity as first-class outputs.

The engine selects among these modes based on data diagnostics, domain policy, and the scientific question. Forcing a single method onto every problem either leaves accuracy on the table or produces structure the data cannot support. CDE reports which mode it chose and why, so you can disagree before results are produced.
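To make the routing idea concrete, here is a minimal sketch of diagnostics-driven mode selection with an explicit rationale. The names (`Mode`, `Routing`, `route`) and the threshold values are illustrative assumptions, not CDE's actual API — the point is that the decision is inspectable before results are produced.

```python
# Hypothetical sketch: route a dataset to a discovery mode based on simple
# diagnostics, and record why. Names and thresholds are illustrative only.
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    SYMBOLIC = "symbolic"
    NEURAL = "neural"
    NEURO_SYMBOLIC = "neuro-symbolic"
    CAUSAL = "causal"

@dataclass
class Routing:
    mode: Mode
    rationale: str  # the "why", reported alongside the choice

def route(diagnostics: dict) -> Routing:
    """Pick a discovery mode from data diagnostics and say why."""
    if diagnostics.get("interventions_available"):
        return Routing(Mode.CAUSAL, "interventional data supports mechanism-level claims")
    if diagnostics["n_features"] > 1000:
        return Routing(Mode.NEURAL, "high-dimensional field; flexibility comes first")
    if diagnostics.get("noise_level", 0.0) < 0.05:
        return Routing(Mode.SYMBOLIC, "low noise; compact equations are recoverable")
    return Routing(Mode.NEURO_SYMBOLIC, "fit a flexible model, then distill structure")

decision = route({"n_features": 12, "noise_level": 0.3})
print(decision.mode.value, "-", decision.rationale)
```

Because the rationale is data, not prose, a reviewer can challenge the routing decision itself rather than reverse-engineering it from the outputs.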

A discovery engine absorbs the coordination overhead of turning messy observational data into auditable, reproducible claims.

What an organization accumulates over years depends on this choice. A discovery engine produces structured memory: ledger entries, promotion decisions, replay recipes, and typed outputs that compose across teams. When discovery is treated as a governed run rather than a folder of figures, institutional knowledge stops evaporating into email threads and orphaned notebooks.

Built for research discovery at scale

CDE includes engines, validation universes, a neural module registry, and reproducibility machinery for governed operation at scale. All four discovery modes — Symbolic, Neural, Neuro-Symbolic, and Causal — run inside this machinery, so the choice of mode never changes the governance guarantees.

All modes share a common scientific contract: outputs are typed, controls are mandatory, and promotion is a gate — not a default. Explore mode encourages breadth with honest uncertainty. Validate mode tightens evidentiary requirements. Publish mode commits to external-grade replay. The Truth Dial makes those phases explicit so exploratory sketches never masquerade as theorems.
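A sketch of what "promotion is a gate" might look like in code. The tier names come from this post; the requirement table and the `can_promote` check are assumptions chosen to illustrate the idea that a run advances only when it meets the stricter tier's evidentiary bar.

```python
# Illustrative Truth Dial gate: each tier tightens requirements, and
# promotion fails closed. Requirement values are assumed, not documented.
TIER_REQUIREMENTS = {
    "explore":  {"negative_controls": 0, "replayable": False},
    "validate": {"negative_controls": 2, "replayable": True},
    "publish":  {"negative_controls": 4, "replayable": True},
}

def can_promote(run: dict, target_tier: str) -> bool:
    """A run advances only if it meets the target tier's requirements."""
    req = TIER_REQUIREMENTS[target_tier]
    controls_ok = run["negative_controls_passed"] >= req["negative_controls"]
    replay_ok = run["replay_recipe"] is not None or not req["replayable"]
    return controls_ok and replay_ok

run = {"negative_controls_passed": 2, "replay_recipe": "recipe.json"}
print(can_promote(run, "validate"))  # meets the validate bar
print(can_promote(run, "publish"))   # refused: not enough controls
```

The useful property is asymmetry: an exploratory sketch can always stay at explore, but nothing reaches publish without clearing every intermediate requirement.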

[Figure: CDE governed discovery pipeline]

Data profiling as the first act of discipline

Before any discovery mode engages, CDE profiles the incoming data: distribution shapes, missing-value patterns, temporal structure, potential confounders, and dimensionality characteristics. A dataset with heavy-tailed distributions, sparse excitation, or suspicious regularity needs different treatment than a clean experimental matrix. Profiling results inform mode selection, control design, and the initial scope of the Truth Dial.

Profiling also establishes a data fingerprint that travels with every downstream claim. When someone asks months later whether a result was computed on raw or imputed values, whether outliers were trimmed, or whether a sensor column was included, the fingerprint answers. That record is automatic and immutable — part of the evidence ledger, not a voluntary annotation.
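One plausible shape for such a fingerprint — assuming a content hash over the raw bytes plus every preprocessing decision — is sketched below. The field names and the use of SHA-256 are illustrative assumptions, not CDE's documented scheme.

```python
# Minimal data-fingerprint sketch: hash the data together with the
# preprocessing decisions, so "raw or imputed? outliers trimmed?" stays
# answerable months later. Field names are illustrative.
import hashlib
import json

def fingerprint(raw_bytes: bytes, preprocessing: dict) -> str:
    """Return a stable hex digest over data content + preprocessing metadata."""
    h = hashlib.sha256()
    h.update(raw_bytes)
    # sort_keys makes the digest independent of dict insertion order
    h.update(json.dumps(preprocessing, sort_keys=True).encode())
    return h.hexdigest()

fp = fingerprint(
    b"sensor_a,sensor_b\n1.0,2.0\n",
    {"imputation": None, "outliers_trimmed": False, "columns_dropped": []},
)
print(fp[:16])
```

Any change to either the data or the preprocessing record yields a different digest, so a claim carrying this fingerprint cannot silently drift away from the inputs it was computed on.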

Typed claims as the primary output

In many organizations the artifact of analysis is a plot, a slide, or an opaque model file. CDE inverts that default. The primary artifact is a typed claim — law-like, causal, conservation-oriented, or otherwise — scoped, scored, and linked to evidence. Claims can be compared, contradicted, and reproduced. They also enable downstream automation, because machines and humans can reason about claim types consistently rather than re-deriving intent from prose.
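A hypothetical shape for such a claim is shown below. The fields mirror the properties this post names — type, scope, score, and an evidence link — but the schema itself is an assumption, not CDE's actual data model.

```python
# Illustrative typed-claim record. Frozen so a claim can be compared or
# contradicted, but never silently edited after the fact.
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    kind: str           # e.g. "law-like", "causal", "conservation"
    statement: str      # human-readable form of the relationship
    scope: str          # conditions under which the claim was tested
    score: float        # evidentiary score from validation runs
    evidence_hash: str  # link into the hashed evidence ledger

c = Claim(
    kind="conservation",
    statement="total momentum is constant in the absence of external force",
    scope="trajectories 1-40, sampling >= 100 Hz",
    score=0.93,
    evidence_hash="3f2a...",
)
print(c.kind, c.score)
```

Because the claim type is a field rather than a sentence buried in prose, downstream tools can filter, diff, and route claims without re-deriving intent.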

Enterprise teams adopt CDE when they need discovery that earns trust by showing its work — and the work is machine-verifiable at scale.

Composability across teams and time

Typed claims compose. A conservation law discovered by one team can be imported as a constraint by another. A causal pathway from one study can serve as a prior in the next. A symbolic rate expression can be embedded in a simulator, compared against a neural surrogate, and challenged when new instrument data arrives. None of this works when the artifact is an unlabeled model checkpoint or a plot in a slide deck.
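The constraint-import pattern can be sketched as follows, assuming a claim exposes a machine-checkable predicate. The function name, tolerance, and candidate trajectories are all hypothetical — the point is that one team's claim mechanically filters another team's candidates.

```python
# Sketch of claim composability: a conservation law discovered by one team,
# applied as a hard constraint on another team's candidate models.
# All names and values here are illustrative.
def momentum_conserved(trajectory, tol=1e-6):
    """Imported constraint: total momentum stays within tol of its start."""
    start = trajectory[0]
    return all(abs(p - start) <= tol for p in trajectory)

candidates = {
    "model_a": [1.00, 1.00, 1.00],  # respects the imported claim
    "model_b": [1.00, 1.02, 1.10],  # drifts; violates the constraint
}
survivors = [name for name, traj in candidates.items()
             if momentum_conserved(traj, tol=1e-3)]
print(survivors)
```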

This composability extends across time. A claim produced today, with its evidence and lineage, remains interpretable and actionable a year later even if the original team has moved on. That durability is the difference between an organization that compounds knowledge and one that starts over because no one can reconstruct what was tried, what failed, and what survived scrutiny.

What adoption looks like in practice

Teams start by defining what "promotion" means in their organization: which negative controls are non-negotiable, which claim types flow into decision systems, and which stakeholders sign off before the publish tier. CDE encodes those decisions as policy so the engine cannot shortcut them. Once policy is in place, onboarding is about aligning data packaging, access boundaries, and review rituals with the platform's contracts.
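One way such a policy might be encoded is as plain data the engine enforces, as in the sketch below. The keys, control names, and sign-off roles are assumptions for illustration, not a documented CDE policy format.

```python
# Illustrative promotion policy encoded as data. The engine checks it
# mechanically, so no run can shortcut the organization's requirements.
PROMOTION_POLICY = {
    "required_negative_controls": ["label_shuffle", "time_reversal"],
    "claim_types_allowed_downstream": ["conservation", "causal"],
    "publish_signoff": ["domain_lead", "stats_reviewer"],
}

def policy_allows_publish(run: dict) -> bool:
    """Every required control passed, every required sign-off present."""
    controls_ok = (set(PROMOTION_POLICY["required_negative_controls"])
                   <= set(run["controls_passed"]))
    signoff_ok = (set(PROMOTION_POLICY["publish_signoff"])
                  <= set(run["signoffs"]))
    return controls_ok and signoff_ok

run = {"controls_passed": ["label_shuffle", "time_reversal"],
       "signoffs": ["domain_lead"]}
print(policy_allows_publish(run))  # refused: stats_reviewer has not signed off
```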

Integration follows a deliberate pattern. Teams connect data sources, define domain-specific validation criteria, and establish review workflows that map to their existing structure. The machinery makes those processes reproducible, auditable, and composable with work from other teams on the same platform.

Compounding scientific memory

Every governed run adds structured memory: what was tried, what broke under stress tests, and what advanced with defensible evidence. Over months, that memory becomes an asset as valuable as any single model — reusable, searchable, and independent of who was on call when a result first appeared.

Consider what happens when a key researcher leaves. In a conventional setup, their notebooks, environment configurations, and decision rationale leave with them. With governed discovery, the evidence ledger, typed claims, and promotion history remain. The next researcher picks up where work left off by reading structured records of what was attempted, what the controls showed, and what the Truth Dial tier was at each stage.

CDE is designed for organizations that intend to be doing rigorous science five years from now, with different people and a larger portfolio of instruments. The value is the accumulation of governed, reproducible, composable scientific knowledge that outlasts any individual project or team member.