Benchmark

CDE's Autonomous Discovery: Validated on Real Physics

March 28, 2026

Can an autonomous system discover genuine scientific laws from raw data with minimal human intervention? Not approximate them, but recover the governing dynamics, verify them against known ground truth, and do so with negative controls and reproducibility support. The March 2026 validation results show CDE producing governing equations, causal graphs, and regulatory network topologies across three validated systems with known ground truth, in fully autonomous runs.

The validation protocol was deliberately rigorous. Each experiment used a system with established ground truth: physics confirmed by decades of laboratory work. CDE was given raw observational data with no hints about underlying mechanisms. The engine profiled the data, selected discovery modes, ran automated negative controls, and emitted typed scientific claims. Those claims were then compared against the known solutions. The results: R² = 0.9944 for the oscillator equation, path fidelity of 0.982 and 0.989 for the two causal graphs, and a pass on every negative control battery.

Three systems were chosen to span a range of scientific complexity: a mechanical oscillator governed by a second-order ODE, a four-compartment pharmacokinetic model with directed causal structure, and a six-variable gene regulatory network with cyclic repression topology. Each tests a different discovery mode and claim type.

Experiment 1: Damped Harmonic Oscillator

The damped harmonic oscillator — a mass on a spring with friction — is governed by one of the most fundamental equations in physics. It makes an ideal first test: can CDE rediscover what every physics student learns, starting from nothing but time-series data?

CDE's Symbolic mode was given raw displacement and velocity observations. No equation templates. No hints about spring constants or damping coefficients. In 16.3 seconds, the engine returned a governing equation with R² = 0.9944. For this system the governing equation is m·x'' + c·x' + k·x = 0 (mass m, damping coefficient c, spring constant k), and the engine arrived at it from data alone.
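To make the oscillator test concrete, here is a minimal sketch of the simplest possible version of the task: generating displacement and velocity data for a damped oscillator and recovering the equation's coefficients by least squares. This is not CDE's Symbolic mode. It assumes a fixed two-term template (x'' = a·x + b·x'), which is exactly the kind of hint the engine was not given, and the parameter values and function names are illustrative assumptions, not taken from the validation report.

```python
import numpy as np


def simulate_oscillator(m=1.0, c=0.4, k=4.0, t_end=20.0, n=4001):
    """Analytic underdamped response for x(0) = 1, v(0) = 0 (hypothetical parameters)."""
    gamma = c / (2.0 * m)                    # decay rate
    omega_d = np.sqrt(k / m - gamma**2)      # damped angular frequency
    t = np.linspace(0.0, t_end, n)
    decay = np.exp(-gamma * t)
    x = decay * (np.cos(omega_d * t) + (gamma / omega_d) * np.sin(omega_d * t))
    v = -decay * (k / (m * omega_d)) * np.sin(omega_d * t)
    return t, x, v


def fit_linear_oscillator(t, x, v):
    """Least-squares fit of x'' = a*x + b*v; returns (a, b, r_squared)."""
    accel = np.gradient(v, t)                # acceleration estimated from the samples
    sl = slice(1, -1)                        # drop the less accurate edge estimates
    features = np.column_stack([x[sl], v[sl]])
    target = accel[sl]
    coeffs, *_ = np.linalg.lstsq(features, target, rcond=None)
    residual = target - features @ coeffs
    r_squared = 1.0 - residual @ residual / np.sum((target - target.mean()) ** 2)
    return coeffs[0], coeffs[1], r_squared


t, x, v = simulate_oscillator()
a, b, r2 = fit_linear_oscillator(t, x, v)
print(f"x'' = {a:.3f} x + {b:.3f} x'   (R^2 = {r2:.4f})")
```

With the parameters above, the fitted coefficients should land close to -k/m = -4 and -c/m = -0.4. A symbolic search has the harder job of finding the form of the equation as well as its coefficients.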

Two negative controls confirmed the result. Time shuffle randomized temporal ordering, destroying dynamical signal while preserving marginal statistics. Phase randomization preserved the power spectrum while destroying phase relationships. Both passed cleanly, confirming the discovery depended on actual dynamical structure rather than statistical artifacts. These controls ran automatically as part of the governed pipeline.
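The two surrogate constructions described above are standard and can be sketched in a few lines. The following is a generic NumPy illustration, not CDE's internal implementation; the function names and the test signal are assumptions made for the example.

```python
import numpy as np


def time_shuffle(x, rng):
    """Permute samples in time: same values (marginals), no temporal order."""
    return rng.permutation(x)


def phase_randomize(x, rng):
    """Rotate every Fourier component by a random phase, keeping its amplitude."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    phases[0] = 0.0                  # leave the mean (DC term) untouched
    if n % 2 == 0:
        phases[-1] = 0.0             # the Nyquist term must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)


def lag1_autocorr(x):
    """Correlation between consecutive samples."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]


rng = np.random.default_rng(0)
t = np.linspace(0.0, 20.0, 512)
signal = np.exp(-0.2 * t) * np.cos(2.0 * t)      # decaying oscillation

shuffled = time_shuffle(signal, rng)
surrogate = phase_randomize(signal, rng)

print("lag-1 autocorrelation, original:", lag1_autocorr(signal))
print("lag-1 autocorrelation, shuffled:", lag1_autocorr(shuffled))
```

The two controls test different things. Time shuffling removes all temporal structure, so a claim that survives it cannot depend on dynamics. Phase randomization keeps the power spectrum, and with it the linear autocorrelation, so a claim that survives it is explained by the spectrum alone rather than by the specific phase relationships in the data.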

R² = 0.9944. Sixteen seconds. Zero human input. The governing equation of a damped harmonic oscillator, discovered from raw observational data with automated negative controls.

Experiment 2: Four-Compartment Pharmacokinetics (ADME)

The second system: a four-compartment pharmacokinetic model describing Absorption, Distribution, Metabolism, and Excretion (ADME). The causal structure is well-established, making it a rigorous test for CDE's Causal mode.

CDE was given multivariate time-series data with no labels, structural hints, or prior pharmacokinetic knowledge. The engine recovered the drug absorption causal graph with three high-confidence directed edges (p > 0.90, read here as edge confidence rather than a significance-test p-value). Path fidelity reached 0.982. CDE correctly identified Tissue as the convergent node, meaning a node with more than one incoming edge. That is a non-trivial structural inference, because it cannot be read off a simple chain of pairwise associations.

All four negative controls passed: time shuffle, phase randomization, label permutation, and noise robustness. The noise robustness test was particularly demanding: the discovered causal structure remained stable at a 93.2% noise level. That robustness points to genuine mechanism rather than fragile pattern matching.

Why the ADME result matters beyond pharmacokinetics

Pharmacokinetic models are the foundation of drug dosing, toxicity prediction, and formulation design. Discovering the ADME causal graph autonomously can accelerate characterization of drug absorption dynamics for novel compounds. The fact that causal structure was discovered without domain-specific priors suggests generalization to other multi-compartment systems: environmental transport, chemical reactor networks, and metabolic pathways.

Experiment 3: Six-Variable Gene Regulatory Network (Repressilator)

The most demanding validation: a six-variable gene regulatory network modeled on the repressilator — a synthetic biological circuit of three genes repressing each other in a cycle. The repressilator's cyclic topology is well-characterized, making it an exacting test for causal discovery in complex biological systems.
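For readers who want data of this kind to experiment with, the sketch below simulates the standard dimensionless repressilator model (Elowitz and Leibler) with a hand-written RK4 integrator. It assumes the six variables are one mRNA and one protein level per gene. The parameter values, the protein variable names, and the integrator settings are illustrative assumptions, not the configuration used in the validation.

```python
import numpy as np

NAMES = ["mRNA_A", "mRNA_B", "mRNA_C", "prot_A", "prot_B", "prot_C"]

# Ground-truth directed edges (source -> target): each protein represses the
# next gene's mRNA, and each mRNA is translated into its own protein.
TRUE_EDGES = {
    "prot_A": "mRNA_B", "prot_B": "mRNA_C", "prot_C": "mRNA_A",
    "mRNA_A": "prot_A", "mRNA_B": "prot_B", "mRNA_C": "prot_C",
}


def repressilator_rhs(state, alpha=216.0, alpha0=0.216, beta=5.0, n=2.0):
    """Time derivatives for [mRNA_A, mRNA_B, mRNA_C, prot_A, prot_B, prot_C]."""
    m, p = state[:3], state[3:]
    repressor = np.roll(p, 1)        # gene A sees prot_C, B sees prot_A, C sees prot_B
    dm = -m + alpha / (1.0 + repressor**n) + alpha0
    dp = -beta * (p - m)
    return np.concatenate([dm, dp])


def simulate_repressilator(state0, dt=0.01, steps=5000):
    """Fixed-step fourth-order Runge-Kutta integration."""
    traj = np.empty((steps + 1, 6))
    traj[0] = state0
    for i in range(steps):
        s = traj[i]
        k1 = repressilator_rhs(s)
        k2 = repressilator_rhs(s + 0.5 * dt * k1)
        k3 = repressilator_rhs(s + 0.5 * dt * k2)
        k4 = repressilator_rhs(s + dt * k3)
        traj[i + 1] = s + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj


# Asymmetric initial condition so the three genes do not start in lockstep.
trajectory = simulate_repressilator(np.array([1.0, 2.0, 3.0, 0.0, 0.0, 0.0]))
print(dict(zip(NAMES, trajectory[-1].round(3))))
```

A causal discovery method is given only the columns of `trajectory`. `TRUE_EDGES` is the directed cycle it would need to recover, which is a stronger requirement than detecting that the six variables are correlated.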

CDE's Causal mode discovered the cyclic repression topology from multivariate gene expression data. Path fidelity: 0.989. CDE correctly identified mRNA_B as the primary regulatory hub. All four negative controls passed: time shuffle, phase randomization, label permutation, and noise robustness. Neuro-Symbolic decomposition revealed 92.6% conservative dynamics, confirming the discovered network preserves energy-like constraints characteristic of well-regulated biological circuits.

Gene regulatory networks are among the most challenging systems in modern biology — noisy, high-dimensional, and governed by complex nonlinear interactions. Discovering the correct cyclic topology (not just pairwise correlations, but directed causal structure) from observational data, with automated negative controls confirming every edge, is a qualitative advance for autonomous discovery in biology.

Negative controls: automated and non-optional

Every CDE discovery mode runs automated negative controls as a non-optional part of the process. Time shuffle destroys temporal dependencies while preserving marginal distributions. Phase randomization preserves power spectra while destroying phase relationships. Label permutation tests whether the structure depends on the actual variable assignments or survives random relabeling. Noise robustness progressively degrades the signal to find the threshold where the claim collapses.
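How a control turns into a pass or fail decision can be sketched with a toy claim. The example below scores a directed, lagged dependence between two series, compares that score against time-shuffled surrogates to get an empirical p-value, and then sweeps added noise to find where the score drops below a threshold. The scoring function, the threshold of 0.5, and the noise levels are assumptions chosen for illustration; the article does not specify how CDE scores or thresholds its controls.

```python
import numpy as np


def lagged_corr(x, y, lag=1):
    """Absolute correlation between x[t] and y[t + lag]."""
    return abs(np.corrcoef(x[:-lag], y[lag:])[0, 1])


def surrogate_p_value(x, y, rng, n_surrogates=200):
    """Share of time-shuffled surrogates scoring at least as high as the real data."""
    observed = lagged_corr(x, y)
    hits = sum(
        lagged_corr(rng.permutation(x), y) >= observed
        for _ in range(n_surrogates)
    )
    return (1 + hits) / (1 + n_surrogates)


def noise_breakpoint(x, y, rng, levels, threshold=0.5):
    """First noise level (as a fraction of each signal's std) where the score falls below threshold."""
    for level in levels:
        noisy_x = x + rng.normal(0.0, level * x.std(), size=x.shape)
        noisy_y = y + rng.normal(0.0, level * y.std(), size=y.shape)
        if lagged_corr(noisy_x, noisy_y) < threshold:
            return level
    return None                      # the claim survived every level tested


rng = np.random.default_rng(1)
driver = rng.normal(size=500)
# y follows x with a one-step delay plus a little measurement noise.
response = 0.9 * np.roll(driver, 1) + 0.1 * rng.normal(size=500)

p_value = surrogate_p_value(driver, response, rng)
breakpoint_level = noise_breakpoint(driver, response, rng, [0.1, 0.3, 0.5, 2.0, 5.0])
print("surrogate p-value:", p_value)
print("noise level where the claim collapses:", breakpoint_level)
```

The p-value uses the (1 + hits) / (1 + n) convention so that it is never reported as exactly zero from a finite number of surrogates.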

These controls ran automatically on every experiment. No human scheduled, designed, or interpreted them. The engine ran them because governance policy required them, and results were recorded in the evidence ledger alongside the primary discovery. When every claim carries its own stress-test results, the conversation shifts from "do you believe this?" to "here is the evidence — evaluate it yourself."

The four discovery modes

CDE provides four discovery modes. Symbolic mode searches for closed-form governing equations. Neural mode captures high-dimensional dynamics where structural commitment would be premature. Neuro-Symbolic mode bridges the two — neural expressiveness as scaffold, interpretable structure distilled when data support it. Causal mode targets directed mechanistic claims: causal graphs, intervention predictions, and path-level summaries, with identifiability and negative controls as first-class requirements.

The primary discoveries used Symbolic mode on the oscillator and Causal mode on the pharmacokinetic and gene regulatory systems, with a Neuro-Symbolic decomposition applied to the gene regulatory result as an additional analysis. Mode selection was performed by the engine, not a human analyst. The routing is part of the governed pipeline: the engine reports which mode it chose and why.

Reproducibility through a single API call

Every experiment is reproducible through a single API call: POST /v1/discover. The request specifies data, discovery mode, and Truth Dial tier. The engine handles data profiling, mode routing, control scheduling, claim typing, and evidence packaging. No manual pipeline to reconstruct, no notebook to debug, no environment to resurrect.

The Truth Dial governs rigor: explore for breadth with honest uncertainty, validate for stress testing with mandatory controls, publish for external-grade reproducibility with deterministic replay. Validation experiments were conducted at validate tier, with every control mandatory and every claim stress-tested before emission.
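As a rough picture of what such a call might look like from client code, the sketch below builds (but does not send) a POST request to /v1/discover using only the Python standard library. The endpoint path and the three tier names come from the text above. Everything else is a placeholder assumption: the host, the JSON field names (`data`, `mode`, `truth_dial`), the mode spelling, and the bearer-token header are hypothetical, since the request schema is not given here.

```python
import json
import urllib.request

# Tier names as given in the text; the request schema below is hypothetical.
ALLOWED_TIERS = {"explore", "validate", "publish"}


def build_discover_request(base_url, data, mode, tier, api_key):
    """Assemble a POST /v1/discover request without sending it."""
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"unknown Truth Dial tier: {tier!r}")
    payload = {
        "data": data,            # hypothetical field name
        "mode": mode,            # hypothetical field name and value spelling
        "truth_dial": tier,      # hypothetical field name
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/discover",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",   # assumed auth scheme
        },
        method="POST",
    )


request = build_discover_request(
    base_url="https://api.example.invalid",          # placeholder host
    data={"t": [0.0, 0.1, 0.2], "x": [1.0, 0.98, 0.92], "v": [0.0, -0.39, -0.75]},
    mode="symbolic",
    tier="validate",
    api_key="YOUR_API_KEY",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), once a real host and key are set.
```

Rejecting an unknown tier on the client side is a small design choice: it turns a typo into an immediate error instead of a request that runs at an unintended rigor level.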

What validation on known ground truth proves

Validation against known ground truth answers the most fundamental question about any discovery system: does it find real science? CDE's results across three systems — a mechanical oscillator, a pharmacokinetic model, and a gene regulatory network — demonstrate that it does, across different domains, discovery modes, and levels of complexity.

CDE does not merely fit data. The governing equation for the oscillator is the physical law, not a regression. The ADME causal graph is the directed mechanism, not a correlation matrix. The repressilator topology is the regulatory architecture, not a clustering. These are qualitatively different outputs from predictive models, requiring qualitatively different validation. All three passed their negative control batteries, confirming the structures are genuine.

AI-native research and engineering

CDE's API-native design means every discovery capability is accessible through structured endpoints that autonomous systems can invoke directly. The validation demonstrates that those systems can conduct rigorous research — from data profiling through discovery to controlled validation — without human intermediation. When discovery is an API call, the rate-limiting factor is the quality of the scientific questions being asked.

The validation report by Vareon Research, March 2026, will be available at vareon.com/research. It provides complete experimental protocols, result tables, and negative control outcomes for all three systems.

From validated foundations to open discovery

Validation on known ground truth is a foundation, not an end goal. If CDE can autonomously discover the governing equation of a damped oscillator, the causal graph of drug absorption, and the regulatory topology of a gene network, then novel discoveries on systems without known ground truth deserve serious scientific consideration. They are typed claims from a governed engine that has demonstrated its ability to find truth where truth is knowable, with every step documented in an auditable evidence ledger.

Trust is built by showing the engine works where the answer is known, then deploying where the answer is unknown with the same governance, controls, and evidence standards. The methodology is consistent because it is encoded in the engine — not in the habits of whichever researcher happens to be running the experiment.