What’s new in SFI v2

SFI 2.0 is a ground-up, JAX-based rewrite of the Stochastic Force Inference research code that accompanied Frishman & Ronceray (2020) and the follow-up papers (underdamped inference, PASTIS model selection). This page is the one place in these docs that talks about lineage and version history; everywhere else, the two estimator families are simply called the linear estimators and the parametric estimators.

Lineage: where the two estimator families come from

The linear estimators (infer_force_linear(), infer_diffusion_linear(), compute_diffusion_constant()) are the direct descendants of the published SFI methodology. With their default ("auto") settings they apply that lineage’s best refinements (moment-convention auto-selection, trapeze Gram construction, Vestergaard noise-aware diffusion, PASTIS sparsity), rewritten on JAX and typically orders of magnitude faster than the original research code. They remain best-effort estimates: exact in the fine-sampling, low-noise limit, biased outside it.
The parametric estimators — infer_force(), infer_diffusion() — are new in 2.0: a likelihood-based estimator built from one or more RK4 flow steps per observation interval (n_substeps), with native profiling of the diffusion and measurement-noise levels \((\mathbf{D}, \Lambda)\) and an errors-in-variables instrument for consistency under measurement noise. They are the robust, flexible path for real experimental data — at an iterative compute cost. See Choosing an estimator for the regime table.
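The phrase "one or more RK4 flow steps per observation interval" can be illustrated with a minimal plain-Python sketch. It is not SFI's implementation: rk4_step, flow, and the toy drift below are hypothetical names, and only the deterministic part of the prediction is shown (no diffusion, measurement noise, or likelihood). It shows how splitting one observation interval dt into n_substeps classical RK4 steps tightens the predicted end point when sampling is coarse.

```python
import math


def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step of size h for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def flow(f, x, dt, n_substeps=1):
    """Propagate x across one observation interval dt using n_substeps RK4 steps."""
    h = dt / n_substeps
    for _ in range(n_substeps):
        x = rk4_step(f, x, h)
    return x


def drift(x):
    """Toy drift (an assumption for illustration): linear relaxation towards 0."""
    return -x


dt = 2.0                  # deliberately coarse observation interval
exact = math.exp(-dt)     # exact deterministic flow of dx/dt = -x from x = 1

err_1 = abs(flow(drift, 1.0, dt, n_substeps=1) - exact)
err_4 = abs(flow(drift, 1.0, dt, n_substeps=4) - exact)
print(f"error with 1 substep:  {err_1:.2e}")
print(f"error with 4 substeps: {err_4:.2e}")
```

In this toy case, raising the number of substeps trades extra drift evaluations for a more accurate flow over the same interval, which is the kind of compute-versus-accuracy trade-off the n_substeps setting suggests.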

Highlights of 2.0

Two first-class inference paths per engine (overdamped and underdamped), sharing one API: closed-form linear projection and the parametric likelihood fit, both with PASTIS sparsification wired in.
State-dependent diffusion — infer_diffusion() / infer_diffusion_linear() fit \(\mathbf{D}(\mathbf{x})\) (and \(\mathbf{D}(\mathbf{x},\mathbf{v})\) in the underdamped case).
Compositional model building (SFI.statefunc, SFI.bases): symbolic bases, pair interactions, and neural-network force fields from the same expression algebra.
Diagnostics suite (SFI.diagnostics.assess()): residual whitening tests, autocorrelation/normality batteries, and predicted-vs-realised error consistency. Every flagged issue carries a one-line action hint pointing at the likely cure.
Experimental SPDE toolbox for spatial field data (linear estimators only) — see SPDE (spatial fields) — experimental.
Direct ingestion of tracking tables — TrajectoryCollection.from_dataframe (named, auto-detected columns) and named-column selection in load().
Held-out validation (side feature for data-abundant scenarios): coll.split_time(0.8) followed by inf.holdout_score(test); held-out diagnostics via assess(inf, data=test).
Multi-experiment inference — pool many datasets into a single fit and let the model share or split parameters across experiments (one drift per condition, or a force field \(F(x, T)\) learned jointly across temperatures). Experiments are combined through the trajectory layer’s extras, so no special data plumbing is needed.
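The closed-form linear projection listed in the highlights can be sketched for a one-dimensional overdamped toy model. This is a textbook least-squares projection in the Itô convention, not the SFI code path: simulate_ou, fit_linear_force, the basis {1, x}, and all parameter values are assumptions chosen for illustration, and none of the refinements named earlier (trapeze Gram construction, noise-aware diffusion, PASTIS) are included.

```python
import random


def simulate_ou(n_steps, dt, k=1.0, D=1.0, seed=0):
    """Euler-Maruyama trajectory of dx = -k x dt + sqrt(2 D) dW (toy data)."""
    rng = random.Random(seed)
    sigma = (2.0 * D * dt) ** 0.5
    x = [0.0]
    for _ in range(n_steps):
        x.append(x[-1] - k * x[-1] * dt + sigma * rng.gauss(0.0, 1.0))
    return x


def fit_linear_force(x, dt):
    """Project the increments dx/dt onto the basis {1, x}.

    Ito convention: basis functions are evaluated at the start of each step.
    Solves G c = m with G_ab = <b_a b_b> and m_a = <b_a dx/dt>.
    """
    n = len(x) - 1
    starts = x[:-1]
    steps = [x[i + 1] - x[i] for i in range(n)]

    # Gram matrix of the basis {1, x}
    g00 = 1.0
    g01 = sum(starts) / n
    g11 = sum(xi * xi for xi in starts) / n

    # Moments of the basis against the finite-difference velocity
    m0 = sum(steps) / (n * dt)
    m1 = sum(xi * dxi for xi, dxi in zip(starts, steps)) / (n * dt)

    # Closed-form 2x2 solve
    det = g00 * g11 - g01 * g01
    c0 = (g11 * m0 - g01 * m1) / det
    c1 = (g00 * m1 - g01 * m0) / det
    return c0, c1


dt = 0.01
traj = simulate_ou(n_steps=100_000, dt=dt, k=1.0, D=1.0, seed=0)
c0, c1 = fit_linear_force(traj, dt)
print(f"inferred force: F(x) = {c0:.3f} + ({c1:.3f}) * x   (true: F(x) = -x)")
```

With fine sampling and no measurement noise, the recovered slope should sit close to the true value of -k. Coarser sampling or added localisation noise would bias a plain projection like this one, which is the regime the parametric estimators are described as targeting.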

Version history

SFI has gone through three generations. Each later generation is a ground-up rewrite of the one before: the through-line is the method, not the code.

v1.0 (2020) — the original research code. The NumPy/SciPy implementation accompanying Frishman & Ronceray (2020): a flat set of scripts (no installable package) for overdamped systems. It projects force, velocity, and diffusion fields onto a chosen basis, estimates reconstruction error, and computes entropy production and probability currents, with a Brownian-dynamics simulator for bootstrapped validation and 2D/3D plotting helpers. Underdamped Langevin Inference (ULI) lived in a separate companion package (Brückner et al. (2020)). Documentation was a theory-manual PDF.

v1.5 (July 2025) — the JAX package. A ground-up JAX rewrite, packaged and pip-installable, that unified the two methodology papers into one library: overdamped (OLI) and underdamped (ULI) inference side by side, sharing a closed-form linear estimator (Itô/Stratonovich moment conventions, error reports, exact-model comparison, bootstrapped resimulation). Its headline addition was PASTIS sparse model selection (Gerardos & Ronceray (2025)) — a principled criterion, with a beam search, for deciding which basis terms the data actually support. Shipped with runnable OLI/ULI example scripts; still no rendered documentation.

v2.0 (this release) — the parametric, documented rewrite. A second ground-up rewrite: as with v1.5 before it, none of the previous generation’s code or API carries over. Alongside faster, refined linear estimators, 2.0 introduces the parametric estimators for robust inference under measurement noise and coarse sampling, compositional model building, a diagnostics suite, an experimental SPDE toolbox, direct ingestion of tracking tables, and — for the first time — full rendered documentation (this site). See Highlights of 2.0 above for the detailed feature list.