SFI.inference package¶

SFI.inference — Stochastic Force Inference engines.

Public classes¶

OverdampedLangevinInference: Inference for overdamped Langevin dynamics (dx/dt = F(x) + sqrt(2D) dW).
UnderdampedLangevinInference: Inference for underdamped Langevin dynamics (velocities unobserved).
InferenceResultSF: Callable fitted state function carrying parameter covariance and metadata.
SparseScorer: Scores candidate supports via the normal equations (M, G).
SparsityResult: Immutable container returned by every sparsity strategy. Provides information-criterion selection (AIC, BIC, PASTIS, SIC).
BeamSearchStrategy / GreedyStepwiseStrategy / HillClimbStrategy / STLSQStrategy / LassoStrategy: Pluggable search strategies for sparse model selection.
overlap_metrics / predictive_nmse: Benchmark helpers for comparing to ground truth.

class SFI.inference.BeamSearchStrategy(*, beam_width=20, aic_patience=2, report_time=False)[source]¶

Bases: SparsityStrategy

Bidirectional beam search over the support lattice.

Parameters:

beam_width (int, default 20) – Maximum number of candidate models retained per cardinality.
aic_patience (int, default 2) – Stop early when AIC has strictly declined for this many consecutive closed cardinality levels.
report_time (bool, default False) – If True, log elapsed time and number of explored supports.

name: str = 'beam'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, init_supports=None, **_kwargs)[source]¶

Execute the beam search.

Parameters:

scorer (SparseScorer)
max_k (int) – Maximum model size to consider.
init_supports (list of tuples of int, optional) – Seed supports to inject into the initial frontier. Each entry is a tuple of basis-function indices. Useful for seeding the search with a known good model (e.g. the true support) so that the Pareto front is guaranteed to include it.

Return type:

SparsityResult

class SFI.inference.GreedyStepwiseStrategy(*, direction='forward', report_time=False)[source]¶

Bases: SparsityStrategy

Forward / backward / bidirectional stepwise selection.

Parameters:

direction ("forward" | "backward" | "both", default "forward") –
Which direction(s) to search.
- "forward": start empty, add one feature at a time.
- "backward": start full, drop one feature at a time.
- "both": run both directions and merge the Pareto fronts.
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'greedy'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.HillClimbStrategy(*, ic='PASTIS', p_param=0.001, tau=None, gamma=0.5, patience=200, seed=None, report_time=False)[source]¶

Bases: SparsityStrategy

Stochastic hill-climbing model selection.

For each cardinality k ∈ {0, …, max_k}, a chain starts from a random support of size k and accepts random add/remove moves that improve the chosen information criterion, stopping after patience consecutive failures. The best-per-k results form the Pareto front returned as a SparsityResult.

This is the search algorithm described in Gerardos & Ronceray (2025), §”Model selection”, which recommends parallel searches from null, full, and random starting points.

Parameters:

ic (str, default "PASTIS") – Information criterion used as the acceptance objective. One of "AIC", "BIC", "EBIC", "PASTIS", "SIC".
p_param (float, default 1e-3) – PASTIS significance level \(p_0\).
tau (float or None) – Total trajectory time (required for BIC / EBIC).
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)).
patience (int, default 200) – Stop a chain after this many consecutive rejected moves.
seed (int or None) – Random seed for reproducibility.
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'hillclimb'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.InferenceResultSF(sf, *, param_cov=None, meta=None)[source]¶

Bases: SF

A fitted, callable state function that is an SF and carries parameter covariance + metadata for downstream uncertainty handling.

Notes

param_cov is the covariance of the flattened parameter vector defined by the underlying PSF template order (see PSF.flatten_params).
Covariance estimation is handled upstream (in the inferer).
Call predict_var() / predict_ci() for pointwise uncertainty.

Parameters:

sf (SF)
param_cov (Array | None)
meta (Dict[str, Any])

d_v(*, same_particle=False, mode='auto')¶

Build an expression for the velocity Jacobian ∂F/∂v.

Same rules as .d_x(). Requires needs_v=True on the underlying expression.

Parameters:

same_particle (bool)
mode (str)

d_x(*, same_particle=False, mode='auto')¶

Build an expression for the spatial Jacobian dF/dx.

Axis effects¶

Adds one derivative-dim immediately before the rank block.
If particles_input=True:
- when same_particle=True: if pdepth=1, compute df_i/dx_i (no extra P axis); the particle dimension behaves like a broadcasted index. Otherwise, raises an error.
- when same_particle=False (default): compute the full cross-particle Jacobian df_i/dx_j; an extra particle axis appears (from JAX). We never create P axes ourselves; we only permute to canonical order.

param same_particle:: See axis effects above.
type same_particle:: bool
param mode:: Backend differentiation mode; ‘auto’ selects a sane default.
type mode:: {‘auto’, …}
returns:: A new expression representing the Jacobian.
rtype:: StateExpr

Notes

This method triggers no evaluation; it returns a new graph.

Parameters:

same_particle (bool)
mode (str)

dense(n_out, *, weight='W', bias='b')¶

Apply a learnable affine map on the feature axis.

y[..., j] = sum_i x[..., i] * W[i, j] + b[j]

Spatial (rank) axes are untouched: the same W, b are shared across every spatial component. The result is always a PSF (since the dense layer introduces learnable parameters).

Parameters:

n_out (int) – Number of output features.
weight (str) – Name for the weight parameter (default "W"). Use distinct names ("W1", "W2", …) when stacking multiple layers.
bias (str | None) – Name for the bias parameter (default "b"; None to omit). Use distinct names ("b1", "b2", …) when stacking layers.

Returns:

A parametric state function wrapping the dense layer.

Return type:

PSF

Examples

Build the hidden layers of an MLP force field:

>>> from SFI.bases import X
>>> import jax.numpy as jnp
>>> mlp = (
...     X(dim=2).vectorize(2)
...     .dense(32, weight="W1", bias="b1")
...     .elementwisemap(jnp.tanh)
...     .dense(1, weight="W2", bias="b2")
... )

property dim¶

dot(other, axes=None)¶

Spatial tensordot via einsum.

Semantics:

axes=None: contract last axis of self with first axis of other.
axes=int:
- if self.rank == other.rank: contract all axes (Frobenius/trace for rank-2).
- else: contract axes trailing axes of self with axes leading axes of other.
axes=(a_axes, b_axes): NumPy-style explicit lists.

Arrays are accepted and coerced to spatial constants.

drop_features: bool = True¶

classmethod einsum(spec, *operands)¶

General contraction on spatial axes (like jnp.einsum).

Important

Use only lowercase letters.
spec refers only to spatial axes (not the feature axis).
Features take a Cartesian product across operands (no implicit feature reduction or alignment). If you need feature concatenation, use &/stack. For per-feature ops, use element-wise maps or binary ops where features must match.

Arrays in operands are accepted and coerced to spatial-constant expressions with a single feature. Only spatial letters in spec are interpreted. If no StateExpr is present, a TypeError is raised because dim cannot be inferred.

Examples

Vector inner product (per-feature), two rank-1 inputs: >>> # a, b: i × F >>> c = StateExpr.einsum(“i,i->”, a, b) # result: × F

Matrix–vector product (per-feature), rank-2 with rank-1: >>> # M: ij × F1, v: j × F2 → i × (F1×F2) >>> y = StateExpr.einsum(“ij,j->i”, M, v)

Outer product (per-feature Cartesian product): >>> # u: i × F1, v: j × F2 → ij × (F1×F2) >>> O = StateExpr.einsum(“i,j->ij”, u, v)

Parameters:

spec (str) – An einsum string over spatial indices, e.g. “ij,j->i”.
operands (mix[StateExpr, array-like]) – Any mix of StateExpr and arrays.

elementwisemap(func, *, label_fn=None)¶

Apply func element-wise to every feature (spatial axes untouched).

func must be a pure JAX function from scalar→scalar (rank-0 arrays OK). If the expression carries feature labels (e.g., a Basis or an SF bound from a Basis), label_fn (if provided) is applied to each feature label.

Example

>>> B = ...   # Basis with 4 features
>>> C = B.elementwisemap(jnp.tanh, label_fn=lambda s: f"tanh({s})")

Parameters:

func (Callable[[Array], Array])
label_fn (Callable[[str], str] | None)

estimate_bytes_per_sample(*, dtype=None, particle_size=None, sample=None, mode='forward')¶

Small convenience wrapper returning only the transient bytes/sample.

Parameters:

particle_size (int | None)
sample (SampleMeta | None)
mode (str)

Return type:

int

features_to_rank(rank)¶

Unfold features into spatial axes → given rank.

The output layout changes from the current:

batch · (dim,)^self.rank · n_features

to:

batch · (dim,)^rank · (n_features / dim^(rank − self.rank),)

where the new innermost spatial axes are carved out of the feature axis. This is a pure reshape and is the exact inverse of rank_to_features() when restoring the original rank.

Parameters:

rank (int) – Target tensor rank (must be greater than the current rank).

Returns:

Expression at the requested rank with fewer features.

Return type:

StateExpr (same subclass)

Raises:

ValueError – If n_features is not divisible by dim^Δrank.
TypeError – If rank ≤ self.rank (use rank_to_features to go down).

Examples

Turn a dense layer’s output back into a vector field:

>>> scalar_expr.features_to_rank(1)  # rank-1, F/dim features

Build a 2→H→H→2 MLP force field:

>>> mlp = (
...     X(dim=2)
...     .rank_to_features()                     # rank-0, 2 features
...     .dense(32, weight="W1", bias="b1")
...     .elementwisemap(jnp.tanh)
...     .dense(2, weight="W2", bias="b2")       # rank-0, 2 features
...     .features_to_rank(1)                     # rank-1, 1 feature
... )

flatten_params()[source]¶

Return θ̂ as a 1D vector using the PSF template order.

Return type:: Array

property labels¶: Basis labels propagated from the parent PSF.

classmethod load(path, template)[source]¶

Reload a model saved by save().

Parameters:

path (str or Path) – Base path (without extension).
template (InferenceResultSF) – Skeleton with the same tree structure (same PSF + dummy params).

Return type:

InferenceResultSF

materialize_params(vec)[source]¶

Inverse of flatten_params: make a param dict from a vector.

Parameters:: vec (Array)
Return type:: dict[str, Array]

memory_hint(*, dtype=None, particle_size=None, sample=None, mode='forward')¶

Conservative per-sample memory footprint for the WHOLE expression tree. Delegates to the root node, which sums children + own output along the way.

Parameters:

particle_size (int | None)
sample (SampleMeta | None)
mode (str)

meta: Dict[str, Any]¶

property n_features¶

property needs_v¶

param_cov: Array | None¶

params: dict[str, Array]¶

property particle_extras: tuple[str, ...]¶

Pure metadata, forwarded from the root node.

Names of extras declared as per-particle somewhere in the underlying node tree (typically by interaction leaves). The dispatcher reads this to know which keys to gather from (P, …) into (E, K, …) per edge before calling locals.

property particles_input¶

property pdepth¶

predict_ci(x, *, alpha=0.95, extras=None, mask=None)[source]¶

Pointwise confidence intervals via the delta method.

Parameters:

x (array, shape (N, dim))
alpha (float) – Confidence level (default 0.95 for 95 % CI).
extras (forwarded to the underlying state function.)
mask (forwarded to the underlying state function.)

Returns:

dict with keys
- mean — model prediction F̂(x).
- std — pointwise standard deviation √Var[F(x)].
- **lower*, **upper** — symmetric CI bounds* – F̂ ± z_{α/2} · std.

Return type:

Dict[str, Array]

predict_cov(x, *, extras=None, mask=None)[source]¶

Full pointwise covariance matrix via the delta method.

\[\Sigma_F(x) = J_\theta(x)\,\Sigma_\theta\,J_\theta(x)^\top\]

Parameters:

x (array, shape (N, dim))
extras (forwarded to the underlying state function.)
mask (forwarded to the underlying state function.)

Returns:

cov – For rank-1 models: shape (N, d, d). For rank-2 models: shape (N, d, d, d, d) (rarely needed).

Return type:

jnp.ndarray

predict_var(x, *, extras=None, mask=None)[source]¶

Pointwise predictive variance via the delta method.

\[\operatorname{Var}\!\bigl[F_i(x)\bigr] \approx \bigl(J_\theta(x)\,\Sigma_\theta\,J_\theta(x)^\top\bigr)_{ii}\]

For linear models (basis expansion) this is exact, not an approximation.

Parameters:

x (array, shape (N, dim)) – Query points.
extras (dict, optional) – Extra arguments forwarded to the underlying state function (e.g. {"box": box} for periodic boundary conditions).
mask (array, optional) – Boolean mask forwarded to evaluation.

Returns:

var – Per-component variance. Shape matches the model output rank: (N, d) for a rank-1 (force) model, (N, d, d) for rank-2 (diffusion tensor).

Return type:

jnp.ndarray

property rank¶

rank_to_features()¶

Fold all spatial (rank) axes into the feature axis → rank-0.

The output layout changes from:

batch · (dim,)^rank · n_features

to:

batch · (n_features × dim^rank,)

with rank = 0. This is a pure reshape (no copy, no learnable parameters) and is the exact inverse of features_to_rank(original_rank).

Returns:: Scalar expression whose feature count is self.n_features × self.dim ** self.rank.
Return type:: StateExpr (same subclass)
Raises:: TypeError – If the expression is already rank‑0 (no-op would be confusing).

Examples

Prepare a rank-1 position vector for dense layers:

>>> X(dim=2).rank_to_features()   # rank-0, 2 features

The round-trip is the identity:

>>> expr.rank_to_features().features_to_rank(expr.rank)  # same as expr

property required_extras: tuple[str, ...]¶: Presence-only extras required by the expression, forwarded from the root node. No shape/broadcast semantics here.

root: BaseNode¶

save(path)[source]¶

Save this fitted model to <path>.eqx + <path>.meta.json.

The saved files can be reloaded with load(), provided the user supplies a template built from the same PSF/Basis.

See SFI.inference.serialization.save_model().

Return type:: Path

property sdims¶

specialize(*, dataset)¶

Specialize a bound function at condition dataset.

Rewrites the graph (folding dataset_index-reading leaves) and projects the bound parameter values onto the shrunken template: a per-condition spec whose shape loses a leading axis is sliced at dataset; shared specs are kept verbatim.

Parameters:: dataset (int)
Return type:: SF

sqrtm()¶

classmethod stack(exprs)¶

Concatenate along the feature axis.

Static contracts must match (rank/dim, compatible pdepth).

Parameters:: exprs (Sequence[StateExpr])

summary()[source]¶

Formatted coefficient table (if labels and coefficients are available).

Return type:: str

tensordot(other, axes=1)¶: Alias of .dot with NumPy-compatible axes.

tensorize(dim=None, mode='symmetric')¶

Lift a scalar expression to rank-2 (matrix).

Parameters:

dim (int, optional) – Spatial dimension. Inferred when possible.
mode (str) – 'symmetric' (default) uses symmetric_matrix_basis() (d(d+1)/2 features per scalar feature, spans all symmetric matrices). 'identity' uses identity_matrix_basis() (1 feature per scalar feature, isotropic).

Returns:

Matrix expression.

Return type:

StateExpr

vectorize(dim=None, axes=None)¶

Lift a scalar expression to rank-1 (vector).

Equivalent to self * unit_vector_basis(dim, axes=axes), i.e. a Cartesian product of the feature axis with unit vectors.

Parameters:

dim (int, optional) – Spatial dimension. Inferred from the expression’s contract when possible.
axes (sequence of int, optional) – Subset of spatial axes to include (default: all dim axes).

Returns:

Vector expression with n_features = self.n_features × len(axes).

Return type:

StateExpr

class SFI.inference.LassoStrategy(*, alpha=None, n_alphas=50, max_iter=1000, tol=1e-07, report_time=False)[source]¶

Bases: SparsityStrategy

Coordinate-descent LASSO on the normal equations.

Parameters:

alpha (float or None) – Fixed regularisation strength. If None (default), an automatic log-spaced path from \(\alpha_{\max}\) (where the solution is entirely zero) down to \(10^{-4}\,\alpha_{\max}\) is constructed.
n_alphas (int, default 50) – Number of \(\alpha\) values in the automatic path.
max_iter (int, default 1000) – Maximum coordinate-descent iterations per \(\alpha\).
tol (float, default 1e-7) – Convergence tolerance (max absolute change in any coefficient).
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'lasso'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.OverdampedLangevinInference(data, *, max_memory_gb=1.0, **kwargs)[source]¶

Bases: BaseLangevinInference

Stochastic Force Inference concrete class for overdamped systems

This class provides tools for inferring force (drift) and diffusion tensors from stochastic trajectory data based on overdamped Langevin dynamics. It supports both linear and nonlinear basis function methods.

Notes

The dynamics are described by the 1st order autonomous stochastic differential equation (SDE):

dx/dt = F(x) + sqrt(2D(x)) dxi(t)

where:

F(x) is the Ito drift (force) term.
D(x) is the diffusion tensor, evaluated in the Ito convention.
dxi(t) is Gaussian white noise.

[Dynamical equations] Overdamped Langevin SDE

\[\frac{\mathrm{d}x}{\mathrm{d}t} = F(x) + \sqrt{2\,D(x)}\;\mathrm{d}\xi(t)\]

\(F(x)\) is the Itô drift, \(D(x)\) the diffusion tensor (Itô convention), and \(\mathrm{d}\xi\) is Gaussian white noise.

Here x is a 2D array of shape Nparticles x dimension. All particles are assumed to have identical properties.

This class provides tools to approximate F(x) and D(x) from a time series x(t) formatted as TrajectoryCollection.

Note that the Ito and Strato variants of the force inference routines do NOT refer to the convention in which the SDE is expressed (which is always Ito), but to the way stochastic integrals are performed to compute parameters.

Notes

Force Inference: - Linear combination of basis functions (infer_force_linear). - Parametric families (infer_force with a Basis or PSF).
Diffusion Inference: - Constant diffusion estimation (compute_diffusion_constant). - State-dependent diffusion with basis functions (infer_diffusion_linear and infer_diffusion).
Sparsification: - Force sparsification for linear inference sparsify_force, implementing PASTIS and other information criteria.
Error Estimation: - Normalized mean-squared error (MSE) prediction for both force and diffusion.
Comparison Tools for method benchmarking: - Evaluate inferred fields against known exact models (compare_to_exact).
Simulation: - Generate trajectories using inferred fields (simulate_bootstrapped_trajectory).

Notes

Initialize with TrajectoryCollection containing the trajectory.
Use the infer_* methods to infer force and diffusion fields.
Optionally compute error estimates and/or compare with exact data for validation.

Notes

The code uses jnp.einsum for array manipulation, with a consistent index naming scheme for clarity: - t : Time index, = 0..Ntimesteps-1 - a, b, c… : Basis function indices, = 0..Nfunctions - 1. - m, n, o… : Spatial indices, = 0..dim-1. - i, j… : Particle indices. We also use these indices as shortcuts for array shapes. For instance basis_linear : im -> iam reads basis_linear has input a jnp.array of shape (Nparticles,dim) and outputs a jnp.array of shape (Nparticles,Nfunctions,dim).

Notes

Progress messages use Python logging. Enable with SFI.enable_logging() or logging.getLogger('SFI').setLevel(logging.INFO).

Example

Fully documented examples in the “examples” folder: Lorenz model, ActiveBrownianParticles, Ornstein-Uhlenbeck…

coeff_block(block, *, field='force')¶

Return the coefficient (and covariance) slice for a basis sub-block.

Compound bases (e.g. a multi-kernel or time-Fourier library) pack several conceptual blocks into one flat coefficient vector. This returns the slice for one block without hand-computed offsets.

Parameters:

block – (start, stop) indices, a slice, an int, or a Basis (located by matching its labels as a contiguous run of the fitted basis labels).
field ({"force", "diffusion"})

Returns:

(coeffs, cov) – coeffs is the 1-D slice; cov is the matching covariance block (or None if no covariance is available).

Return type:

tuple

compare_params_to_exact(theta_true, *, psf=None)¶

Compare inferred parametric coefficients to known ground truth.

For a model fitted with a parametric family, returns a per-parameter dict of absolute and relative error. theta_true may be a flat array (compared elementwise to force_coefficients_full) or a {name: value} dict (unflattened from the fitted coefficients via psf.unflatten_params, falling back to self.force_psf).

Returns:: {name: {"true", "inferred", "abs_error", "rel_error"}}; also stored as self.parameter_comparison.
Return type:: dict

compare_to_exact(*, model_exact=None, data_exact=None, force_exact=None, diffusion_exact=None, maxpoints=1000)¶

Compare inferred vs exact using dt-weighted time means via the integrate() API.

This function evaluates the inferred force/diffusion against “exact” references on a (possibly exact/synthetic) dataset. It updates:

self.MSE_force / self.NMSE_force self.MSE_diffusion / self.NMSE_diffusion

Inputs: exact references¶

You can provide exact references in two ways:

Preferred: model_exact A model object (from SFI.langevin submodule) exposing:
- model_exact.force_sf : exact force/drift (SF/StateExpr-like)
- model_exact.diffusion_sfexact diffusion (SF/StateExpr-like)
  OR a constant (float or (d,d) matrix) via model_exact.D
Explicit: force_exact, diffusion_exact - force_exact: SF/StateExpr-like callable returning (N, d) - diffusion_exact:
- callable returning (N, d, d), OR
- float meaning σ·I, OR
- (d,d) matrix constant diffusion.
These are used if model_exact is not provided. If model_exact is provided, its members take precedence unless they are missing, in which case the explicit arguments can be used as fallback.

Velocity provisioning (underdamped)¶

If an evaluated expression advertises needs_v=True, this routine supplies:

v := dX/dt (secant velocity from the data stream)

i.e. it uses velocity(“dX”, “dt”) as the v=… keyword argument. This works for both exact and inferred expressions and keeps underdamped comparisons possible even when the dataset only stores positions.

Metrics¶

Force:: e = Fe - Fh MSE_force = < e^T A_inv e > NMSE_force = MSE_force / < Fh^T A_inv Fh >
Diffusion:: E = De - Dh MSE_diffusion = < tr(A_inv E A_inv E) >

[Error analysis] Normalized MSE metrics (force & diffusion)

\[\text{NMSE}_F = \frac{\langle (F_{\text{exact}} - \hat F)^\top A^{-1} (F_{\text{exact}} - \hat F) \rangle} {\langle \hat F^\top A^{-1} \hat F \rangle}\]

\[\text{NMSE}_D = \frac{\langle \operatorname{tr}(A^{-1} E\, A^{-1} E) \rangle} {\langle \operatorname{tr}(A^{-1} \hat D\, A^{-1} \hat D) \rangle}\]

where \(E = D_{\text{exact}} - \hat D\).: NMSE_diffusion = MSE_diffusion / < tr(A_inv Dh A_inv Dh) >

where A_inv is self.A_inv (typically (2 D̄)^{-1} from the inferred constant diffusion normalization).

Subsampling¶

Uses a simple subsampling heuristic so that the total number of evaluated points is ~<= maxpoints, accounting for the maximum number of particles.

Requirements¶

self.A_inv must exist (run compute_diffusion_constant() or otherwise set A_inv).
The dataset must provide streams X, dt, and if any evaluated expr needs v: dX as well.

Parameters:: maxpoints (int)
Return type:: None

comparison_scatter(*, model_exact=None, field='force', data=None, ax=None, maxpoints=2000, **plot_kw)¶

Scatter inferred-vs-exact force (or diffusion) along the trajectory.

Evaluates both fields on the data with force_comparison_arrays() / diffusion_comparison_arrays() and renders them with SFI.utils.plotting.comparison_scatter() (identity line + Pearson r + MSE). Replaces hand-rolled exact-vs-inferred scatters in demos.

Parameters:

model_exact – Object exposing force_sf / diffusion_sf / D.
field ({"force", "diffusion"}) – Which field to compare.
data – Collection to evaluate on (default: training data).
ax – Target axes (default: current axes).
maxpoints (int) – Approximate number of points to evaluate.
**plot_kw – Forwarded to SFI.utils.plotting.comparison_scatter().

Return type:

matplotlib.axes.Axes

compute_diffusion_constant(method='auto')[source]¶

Estimate a constant (spatially uniform) diffusion matrix.

Parameters:: method ({"auto", "noisy", "WeakNoise", "MSD"}) – Estimator to use. "noisy" is the noise-robust Vestergaard–Blainey–Flyvbjerg estimator. "auto" selects "noisy" when the measurement-noise trace Tr(Λ) > 0 (localization noise detected), and "WeakNoise" otherwise.
Return type:: None

Notes

Sets diffusion_average, diffusion_inferred, A, A_inv, sqrtA, sqrtA_inv, Lambda, Lambda_trace, and metadata["diffusion_constant_method"].

compute_diffusion_error()¶

Estimate sampling error for diffusion inference.

Mirrors compute_force_error() for the diffusion field. Uses the diffusion Gram matrix (normal matrix) and its inverse.

For linear diffusion inference the moments covariance is proportional to the Gram matrix, giving Cov(θ_D) = cov_factor * G_D⁻¹.

Note

This error estimate is approximate. The diffusion inference is more complex than the force inference: diffusion coefficients are inferred from force residuals, a positive-definiteness constraint applies, and the simple covariance formula Cov(θ_D) = cov_factor * G_D⁻¹ may not capture all sources of uncertainty. Treat the result as a rough guide rather than a rigorous confidence interval.

Notes

self.diffusion_coefficients_covariancejnp.ndarray: Covariance matrix of the diffusion coefficients.
self.diffusion_coefficients_stderrjnp.ndarray: Standard error for each diffusion coefficient.
self.diffusion_informationfloat: Estimated information content of the inferred diffusion field.
self.diffusion_predicted_MSEfloat: Predicted normalized mean squared error.

compute_force_error()¶

Estimate sampling error for force inference.

[Error analysis] Force coefficient covariance & predicted error

\[\operatorname{Cov}(C) = G^{-1}, \qquad \mathbb{E}\!\left[\langle \delta F^\top A^{-1} \delta F \rangle\right] = \operatorname{Tr}\!\left(G\,\operatorname{Cov}(C)\right), \qquad I_F = \tfrac{1}{2}\,C^\top M, \qquad \text{NMSE}_{F,\text{pred}} = \frac{\operatorname{Tr}(G\cdot\operatorname{Cov}(C))}{C^\top M} = \frac{\operatorname{Tr}(G\cdot\operatorname{Cov}(C))}{2 I_F}\]

Assumes the sampling error dominates; measurement noise and discretization biases are not addressed.

This method evaluates the covariance of the inferred force coefficients, the standard error, and computes the predicted normalized mean squared error (MSE) of the inferred force field. This analysis assumes that the sampling error dominates, and measurement noise or discretization biases are not explicitly addressed. It is common to OLI and ULI (by construction of the normal matrix G).

Notes

self.force_coefficients_covariance (jnp.ndarray): Covariance matrix of the force coefficients. self.force_coefficients_stderr (jnp.ndarray): Standard error for each force coefficient. self.force_information (float): Estimated information content of the inferred force field. self.force_predicted_MSE (float): Predicted normalized mean squared error of the inferred force field.

diagnose(*, level='standard', **kwargs)¶

Run the consistency-check suite from SFI.diagnostics.

Convenience wrapper for SFI.diagnostics.assess(). See its docstring for the available level presets.

Parameters:: level (str)

diffusion_comparison_arrays(*, model_exact=None, diffusion_exact=None, data=None, maxpoints=2000)¶

Return (D_exact, D_inferred) evaluated along the trajectory.

Like force_comparison_arrays() but for the diffusion field; a constant exact/inferred diffusion is broadcast to (n_points, d, d).

Parameters:: maxpoints (int)

force_comparison_arrays(*, model_exact=None, force_exact=None, data=None, maxpoints=2000)¶

Return (F_exact, F_inferred) evaluated along the trajectory.

Evaluates the exact and inferred force on the (subsampled, masked) trajectory points, supplying finite-difference velocities for underdamped fields. Feeds comparison_scatter(); also handy for custom diagnostics.

Parameters:

model_exact – Object exposing force_sf (e.g. an OverdampedProcess).
force_exact – Explicit callable exact force (overrides model_exact).
data – Collection to evaluate on (default: the training data).
maxpoints (int) – Approximate number of points to evaluate.

Returns:

(F_exact, F_inferred)

Return type:

tuple of ndarray, shape (n_points, d)

get_diffusion_timeop(method)[source]¶

Return the requested overdamped diffusion estimator as a TimeOperand. Output is (N, d, d). Required streams are declared on the wrapped TimeOp.

Parameters:: method (str)
Return type:: TimeOperand

holdout_score(data, *, require_error=False)¶

Held-out NMSE of the fitted force on an independent collection.

A side feature for data-abundant scenarios: SFI estimates its own accuracy from the training data (force_predicted_MSE) and validates fits through the diagnostics suite, neither of which costs any data. Reach for an explicit train/test split (TrajectoryCollection.split_time) only when data is plentiful, or to confirm a suspected bias floor: a ratio near 1 means the fit is sampling-limited, a ratio ≫ 1 means a bias floor (often measurement noise — see the noise-and-sampling guide).

The score is the residual-based normalised mean-square error of force_inferred on data, with the diffusion noise floor subtracted (a bias detector, not a precision instrument: its resolution is set by the χ² fluctuations of the residuals).

Parameters:

data (TrajectoryCollection) – Independent test data (e.g. the second half of coll.split_time(0.8)).
require_error (bool) – If True, run compute_force_error() first when the predicted error is missing, so ratio is always defined.

Returns:

{"holdout_NMSE", "predicted_NMSE", "ratio", "excess_z", "n_obs"}. Also stored as self.force_holdout_NMSE.

Return type:

dict

Notes

Bases that read time-dependent extras are not supported on the held-out path (the residual builders pass extras unsliced).

infer_diffusion(basis=None, *, theta_D0=None, integrator='rk4', n_substeps=1, maxiter=100)[source]¶

Infer state-dependent diffusion D(x) from parametric residuals.

Requires a prior parametric infer_force() call. Holds the fitted force fixed and minimises the windowed conditional NLL over the diffusion parameters (the log-det term makes the diffusion level identifiable), reusing the same flow residuals and integrate engine as the force solve.

[Inference] State-dependent diffusion inference (overdamped)

With the force \(\hat F\) held fixed, the state-dependent diffusion \(D(x;\theta_D)\) is optimised by minimising the windowed conditional negative log-likelihood; \(\Lambda\) from the force inference is held fixed. A rank-2 basis gives \(D(x) = \sum_j (\theta_D)_j\, d_j(x)\); a PSF is evaluated directly.

Parameters:

basis (Basis (rank 2), PSF, or None) – Diffusion model. A rank-2 Basis gives the linear parameterisation; a PSF is used directly (D(x) = PSF(x; θ_D)). When None (default), symmetric_matrix_basis(d) — all constant symmetric d×d diffusion matrices.
theta_D0 (dict, array, or None) – Initial diffusion parameters (default zeros).
integrator ({"rk4", "euler"}) – Flow predictor (default "rk4", matching infer_force()).
n_substeps (int) – Integrator micro-steps per Δt (default 1).
maxiter (int) – L-BFGS maximum iterations (default 100).

Return type:

None

Notes

Sets diffusion_inferred, diffusion_coefficients (and diffusion_basis when basis is a Basis), plus metadata.

See also

Parametric windowed estimators — algorithm and parameters: Full algorithm description.

infer_diffusion_linear(basis=None, *, M_mode='auto', G_mode='rectangle')[source]¶

Fit the diffusion field as a linear combination of basis functions.

This method computes the coefficients of the diffusion tensor field (self.diffusion_coefficients) using the provided basis functions. The diffusion tensor is represented as:

diffusion_inferred(x, mask) = sum_a basis_linear(x, mask)[:,a] * diffusion_coefficients[a]

Parameters:

basis (Basis with rank = 2 or None) – the fitting functions. When None (default), symmetric_matrix_basis(d) is used, spanning all constant symmetric d×d diffusion matrices. Requires a prior compute_diffusion_constant() call to determine d.
M_mode (str) – The method used for local diffusion tensor estimation and moments computation. See _diffusion_estimator documentation for additional information.
G_mode (str) – The method used to compute the normalization matrix G. Not investigated extensively yet for diffusion inference.

Return type:

None

Notes

self.diffusion_coefficients: The inferred coefficients for the diffusion basis functions. self.diffusion_inferred: Callable representing the inferred diffusion tensor field. self.diffusion_G: The normalization matrix used in the inference process.

Note

This inferred tensor field is not guaranteed to be nonnegative.

infer_force(F, theta0=None, *, D=None, Lambda=None, integrator='rk4', n_substeps=1, inner='auto', eiv='auto', max_outer=5, inner_maxiter=80, extra_radius=1)[source]¶

Infer the force field with the minimal parametric estimator.

Built on SFI.inference.parametric_core: a single RK4 flow step per observation interval defines the residual, the residual covariance gives a windowed-precision NLL, and the parameters are found by direct Gauss–Newton (linear-in-θ Basis, with the skip-trick errors-in-variables instrument) or frozen-precision L-BFGS (nonlinear-in-θ PSF). (D, Λ) are profiled natively: moment-estimator init, then one windowed conditional-NLL refinement at the fitted θ.

[Inference] Parametric windowed force inference (overdamped)

The observed positions follow \(y_t = x_t + \eta_t\) where \(\eta \sim \mathcal{N}(0, \Lambda)\). The deterministic flow \(\Phi(x;\theta) = z(\Delta t) - x\) (one RK4 step by default) defines the residual \(r_t = y_{t+1} - y_t - \Phi(y_t;\theta)\). Residuals follow a banded Gaussian whose local precision weights the Gauss–Newton normal equations; under measurement noise the left factor is replaced by the η-clean skip instrument \(\psi_{\rm inst} = \partial\Phi/\partial\theta\) evaluated at the lagged clean point (eiv=True), giving a consistent estimating equation.

Parameters:

F (PSF or Basis) – Parametric drift model. A Basis is converted to a PSF internally (coefficients initialised to zero) and PASTIS sparsification is enabled; it runs the fast direct-GN path. A PSF (possibly nonlinear in θ) runs the L-BFGS path.
theta0 (dict, array, or None) – Initial drift parameters (default: zeros).
D (array (d, d), optional) – Fixed diffusion matrix. If both D and Lambda are given, noise profiling is skipped entirely (fast path).
Lambda (array (d, d), optional) – Fixed measurement-noise covariance.
integrator ({"rk4", "euler"}) – Flow predictor (default "rk4", a single 4th-order step).
n_substeps (int) – Integrator micro-steps per observation interval (default 1 — the single-step minimal estimator).
inner ({"auto", "gn", "lbfgs"}) – Inner solver. "auto" → direct Gauss–Newton for a linear Basis, L-BFGS for a PSF.
eiv ({"auto", True, False, float}) – Measurement-noise errors-in-variables instrument. "auto" (default) → True for all models (interacting models use the same N-body flow for the instrument as for the residual); True forces the η-clean skip instrument (consistent under noise); False is the plain MLE; a float in [0, 1] blends. Active on the GN path only.
max_outer (int) – Outer Gauss–Newton / IRLS iterations (default 5).
inner_maxiter (int) – Inner L-BFGS iterations per outer step on the PSF path (default 80; raise for large nonlinear families, e.g. NNs).
extra_radius (int) – Precision-window padding beyond the covariance bandwidth (default 1). Raise to 2–3 in the noise-dominated regime β = Tr(Λ)/Tr(2DΔt) ≫ 1, where the windowed precision decays slowly and the default window under-resolves it.

Return type:

None

Notes

Sets standard force_* attributes: force_inferred, force_psf, force_G, force_G_pinv, force_coefficients_full, force_coefficients, force_support, force_moments.

Also sets diffusion_average, A, A_inv, Lambda from the profiled (D, Λ).

When F is a Basis, additionally sets force_basis, force_G_full, and force_scorer so that sparsify_force() can be called afterwards.

See also

Parametric windowed estimators — concepts: Mathematical foundations.
Parametric windowed estimators — algorithm and parameters: Detailed algorithm description.

infer_force_linear(basis, *, preset='auto', M_mode=None, G_mode=None)[source]¶

Infer the force field as a linear combination of basis functions (linear regression).

[Inference] Linear force regression (overdamped)

\[\hat F(x) = \sum_a C_a\, b_a(x) \qquad\text{where}\qquad G\,C = M\]

\(G_{ab} = \langle b_a(x_t)\, b_b(x_t) \rangle\) is the Gram matrix, \(M_a = \langle v_t \cdot A^{-1} \cdot b_a \rangle\) are the force moments, and \(A = 2\bar D\).

This method computes the force field coefficients (self.force_coefficients) using the provided Basis object. The force field is represented as:

inferred_force(x) = sum_a basis_linear(x)[:,a] * force_coefficients[a]

These coefficients are computed by solving a linear system:

G . force_coefficients = force_moments

and the different options account for the manner to compute G and force_moments. In its simplest form:

G_ab = < b_a(xt) b_b(xt) >                [G_mode = 'rectangle']
force_moments[a] = < dX[t]/dt b_a(xt) >   [mode = 'Ito' ]

but this is rarely the best choice of parameters.

Parameters:

basis (Basis) – Basis The fitting functions, encoded as a single callable Basis object (see SFI.statefunc submodule for doc and SFI.bases for examples).
preset (str) – Single-keyword convention bundle (mirrors the underdamped engine), default 'auto'. 'auto' picks 'robust' when measurement noise is detected (Tr(Λ) > 0) and 'clean' otherwise. Presets: 'robust' (Stratonovich moments + 'shift' Gram, noise-robust); 'clean' (Itô + 'trapeze', clean / fine-sampled data); 'KM' (Kramers–Moyal: Itô + 'rectangle'); 'legacy-sfi-v1.0' (Stratonovich + 'rectangle', the published SFI v1.0 convention).
M_mode (str, optional) – Override the preset’s moment convention: 'Ito', 'Ito-shift', or 'Strato'. None (default) uses the preset.
G_mode (str, optional) – Override the preset’s Gram normalization: 'rectangle' (2020 PRX), 'trapeze' (2024 Amiri et al. PRR), or 'shift' (<b_a(xt) b_b(x_t+dt)>, decorrelates measurement noise). None (default) uses the preset.

Notes

Updates the following attributes:

self.force_scorer: SparseScorer for model selection.
self.force_coefficients: The inferred coefficients for the basis functions.
self.force_inferred: Callable function representing the inferred force field.
self.force_G: The normalization matrix used in the inference process.

static load_results(path)¶

Reload a dict previously saved by save_results().

See SFI.inference.serialization.load_results() for details.

Return type:: dict

predict_time_profile(basis_block, t, *, field='force', x=None)¶

Evaluate a (time-dependent) basis block’s coefficient profile at t.

Contracts the fitted coefficients of basis_block with the basis’s own evaluation at times t (via the reserved time extra), returning the time profile — e.g. -k(t) for the x block of a time-Fourier trap. Avoids re-deriving the design matrix by hand.

Parameters:: field (str)

print_report()¶

Print a summary report of the inference results.

Provides insights into the inferred diffusion and force fields, along with error metrics such as sampling error, trajectory length, discretization bias, and measurement noise.

report_dict()¶

Return a structured summary of inference results as a dictionary.

This is the machine-readable counterpart of print_report(). All values are plain Python scalars or numpy arrays (no JAX arrays).

Returns:: Keys include "diffusion_average", "Lambda", "force_information", "force_predicted_MSE", "NMSE_force", "NMSE_diffusion", and others when available. Missing quantities are omitted.
Return type:: dict

save_results(path)¶

Save report_dict() to <path>.npz + <path>.json.

See SFI.inference.serialization.save_results() for details.

Return type:: Path

simulate_bootstrapped_trajectory(key, oversampling=1, simulate=True, dataset=0)[source]¶

Simulate an overdamped Langevin trajectory with the inferred force and diffusion fields.

This function generates a trajectory using the inferred force field and diffusion tensor inferred from the input data, matching the original time series and initial conditions.

Parameters:

key – JAX random key for generating noise in the simulation.
oversampling (int, optional) – Factor for oversampling (i.e. number of intermediate simulated points between two recorded points). Defaults to 1.
simulate – if True, performs the simulation with the first data point as initial position; if False, returns an uninitialized object which can be simulated with flexible initial position and parameters.
dataset (int, optional) – Which experiment of a pooled fit to reproduce. The inferred model is collapsed to this condition via specialize() (folding per_dataset_scalar / dataset_indicator at dataset); the resulting process is a standalone single-condition model that does not read dataset_index. Defaults to 0.

Returns:

Simulated Langevin process object.

Return type:

OverdampedProcess

sparsify_force(*, criterion='PASTIS', p=0.05, method='beam', max_k=None, **strategy_kwargs)¶

Sparsify the inferred force by selecting a subset of basis functions.

Builds a Pareto front of sparse models using the chosen method, then selects the model that maximises the given information criterion.

Parameters:

criterion ("PASTIS" | "AIC" | "BIC" | "EBIC" | "SIC", default "PASTIS") – Information criterion for model selection.
p (float, default 0.05) – Prior-scale parameter \(p_0\) for the PASTIS penalty.
method (str, default "beam") –
Search strategy. One of:
- "beam" — bidirectional beam search (PASTIS original). Extra kwargs: beam_width (int, default 3), aic_patience (int, default 2).
- "greedy" — forward stepwise selection. Extra kwargs: direction ("forward" | "backward" | "both", default "forward").
- "stlsq" — Sequential Thresholded Least Squares (SINDy-style). Extra kwargs: threshold (float or None), mode ("relative" | "absolute"), n_thresholds (int).
- "lasso" — \(\ell_1\)-penalised coordinate descent. Extra kwargs: alpha (float or None), n_alphas (int).
- "hillclimb" — stochastic hill-climbing (Gerardos & Ronceray, 2025). Extra kwargs: ic, patience (int), seed (int or None).
max_k (int or None) – Maximum model size. Defaults to the full basis size.
**strategy_kwargs – Passed to the strategy constructor.

Returns:

The full Pareto-front result, also stored as self.force_sparsity_result.

Return type:

SparsityResult

summary(field='force')¶

Return a formatted coefficient table for the inferred model.

Parameters:: field ("force" or "diffusion") – Which inferred field to summarize.
Returns:: Multi-line table ready for print().
Return type:: str

class SFI.inference.STLSQStrategy(*, threshold=None, mode='relative', max_iter=50, n_thresholds=30, report_time=False)[source]¶

Bases: SparsityStrategy

Sequential Thresholded Least Squares (SINDy-style).

Parameters:

threshold (float or None) – Single threshold value. If None, an automatic log-spaced sweep from a small fraction of the maximum coefficient up to the maximum is performed.
mode ("relative" | "absolute", default "relative") – Whether threshold is interpreted as a fraction of \(\max|C|\) (relative) or a fixed value (absolute).
max_iter (int, default 50) – Maximum STLSQ iterations per threshold.
n_thresholds (int, default 30) – Number of thresholds in the automatic sweep (used only when threshold is None).
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'stlsq'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.SparseScorer(*, M, G, norm_X2=0.0, n=1, pinv_tol=1e-08, use_residuals=False)[source]¶

Bases: object

Score candidate supports by solving the restricted normal equations.

Parameters:

M ((p,) Array) – Pre-computed moment vector (cross-moments between data and basis functions).
G ((p, p) Array) – Normal-equations matrix. May be non-symmetric (e.g. when using Itô-shift moment estimators). Symmetry is detected automatically and a Cholesky fast-path is used when possible.
norm_X2 (float, default 0.0) – Sum of squared observations. Only used when use_residuals=True.
n (int, default 1) – Sample count prefactor used in the residual-based information gain \(\tfrac{1}{2} n\,\log(\lVert X\rVert^2 / \mathrm{RSS})\). Only used when use_residuals=True.
pinv_tol (float, default 1e-8) – Tolerance for the diagonal preconditioning floor.
use_residuals (bool, default False) – If True, the information gain is computed via the residual sum-of-squares expression instead of the explicit quadratic form.

info_and_coeffs(B)[source]¶

Solve \(G_{BB}\,C_B = M_B\) and return the information gain.

Parameters:

B ((k,) int Array) – Indices of the active basis functions (the support).

Returns:

info (scalar Array) – \(\tfrac{1}{2}\,C_B^\top M_B\) (or the RSS variant when use_residuals=True).
C_B ((k,) Array) – Maximum-likelihood coefficients for the restricted support.

Return type:

Tuple[Array, Array]

vmap_info(batch)[source]¶

Score a batch of supports of the same cardinality.

The batch is padded to the next power-of-2 length so that JAX’s compilation cache stays bounded (≤ ~12 unique shapes per support size k instead of one per distinct batch count).

Parameters:

batch ((n_supports, k) int Array) – Each row is a sorted support of length k.

Returns:

infos ((n_supports,) Array)
coeffs ((n_supports, k) Array)

Return type:

Tuple[Array, Array]

class SFI.inference.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)[source]¶

Bases: object

Frozen container for the output of a sparsity search.

Variables:

p (int) – Total number of candidate basis functions.
total_info (float) – Information gain of the full (dense) model.
method (str) – Name of the strategy that produced this result (e.g. "beam", "greedy", "stlsq", "lasso").
best_info_by_k (list[float]) – best_info_by_k[k] is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are -inf.
best_support_by_k (list[list[int]]) – The support achieving best_info_by_k[k].
best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.
second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all -inf if the strategy does not track runner-ups.
second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.

Parameters:

p (int)
total_info (float)
method (str)
best_info_by_k (list)
best_support_by_k (list)
best_coeffs_by_k (list)
second_info_by_k (list)
second_support_by_k (list)

all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)[source]¶

Compute all information criteria and optionally compare to ground truth.

Parameters:

p_param (float) – PASTIS significance level.
tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.
gamma (float, default 0.5) – EBIC tuning parameter.
true_support (optional) – Ground-truth support and coefficients for overlap metrics.
true_coeffs (optional) – Ground-truth support and coefficients for overlap metrics.
Phi_test (optional Array) – Held-out design matrix for predictive NMSE.
verbose (bool) – If True, log a summary table at INFO level.

Returns:

Keyed by IC name, each value is a dict with k, support, score, coeffs, and optionally overlap and predictive-NMSE entries.

Return type:

dict

best_coeffs_by_k: list¶

best_info_by_k: list¶

best_support_by_k: list¶

method: str¶

p: int¶

second_info_by_k: list¶

second_support_by_k: list¶

select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)[source]¶

Return the support that maximises a given information criterion.

[Model selection] Information criteria for sparse model selection

\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]

where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, and \(\gamma \in [0,1]\) controls EBIC stringency.

References

AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.
BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).
EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.
PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”
SIC — Unpublished (Ronceray).

Parameters:

name ("AIC" | "BIC" | "EBIC" | "PASTIS" | "SIC") – Information criterion to maximise.
p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.
tau (float or None) – Total trajectory time. Required for BIC and EBIC.
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is "EBIC".

Returns:

k_star (int) – Selected model size.
support (list[int]) – Basis-function indices of the chosen model.
score (float) – Value of the information criterion at k_star.
coeffs (Array or None) – Coefficient vector for the selected support.

Return type:

Tuple[int, List[int], float, Array | None]

total_info: float¶

class SFI.inference.SparsityStrategy[source]¶

Bases: ABC

Abstract base class for sparsity search strategies.

Subclasses must implement run().

name: str = 'base'¶: Short identifier used in SparsityResult.method.

abstractmethod run(scorer, *, max_k, **kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.UnderdampedLangevinInference(data, *, max_memory_gb=1.0, **kwargs)[source]¶

Bases: BaseLangevinInference

Stochastic Force Inference concrete class for underdamped systems (velocities unobserved).

Core equations (Ito convention):: dx/dt = v dv/dt = F(x, v) + sqrt(2 D(x, v)) dW_t

Only x(t) is observed; v(t) is reconstructed from finite differences (“secant velocities”).

[Dynamical equations] Underdamped Langevin SDE

\[\frac{\mathrm{d}x}{\mathrm{d}t} = v, \qquad \frac{\mathrm{d}v}{\mathrm{d}t} = F(x,v) + \sqrt{2\,D(x,v)}\;\mathrm{d}W_t\]

Only positions \(x(t)\) are observed; velocities are reconstructed from finite differences.

coeff_block(block, *, field='force')¶

Return the coefficient (and covariance) slice for a basis sub-block.

Compound bases (e.g. a multi-kernel or time-Fourier library) pack several conceptual blocks into one flat coefficient vector. This returns the slice for one block without hand-computed offsets.

Parameters:

block – (start, stop) indices, a slice, an int, or a Basis (located by matching its labels as a contiguous run of the fitted basis labels).
field ({"force", "diffusion"})

Returns:

(coeffs, cov) – coeffs is the 1-D slice; cov is the matching covariance block (or None if no covariance is available).

Return type:

tuple

compare_params_to_exact(theta_true, *, psf=None)¶

Compare inferred parametric coefficients to known ground truth.

Returns:: {name: {"true", "inferred", "abs_error", "rel_error"}}; also stored as self.parameter_comparison.
Return type:: dict

compare_to_exact(*, model_exact=None, data_exact=None, force_exact=None, diffusion_exact=None, maxpoints=1000)¶

Compare inferred vs exact using dt-weighted time means via the integrate() API.

This function evaluates the inferred force/diffusion against “exact” references on a (possibly exact/synthetic) dataset. It updates:

self.MSE_force / self.NMSE_force self.MSE_diffusion / self.NMSE_diffusion

Inputs: exact references¶

You can provide exact references in two ways:

Preferred: model_exact A model object (from SFI.langevin submodule) exposing:
- model_exact.force_sf : exact force/drift (SF/StateExpr-like)
- model_exact.diffusion_sfexact diffusion (SF/StateExpr-like)
  OR a constant (float or (d,d) matrix) via model_exact.D
Explicit: force_exact, diffusion_exact - force_exact: SF/StateExpr-like callable returning (N, d) - diffusion_exact:
- callable returning (N, d, d), OR
- float meaning σ·I, OR
- (d,d) matrix constant diffusion.
These are used if model_exact is not provided. If model_exact is provided, its members take precedence unless they are missing, in which case the explicit arguments can be used as fallback.

Velocity provisioning (underdamped)¶

If an evaluated expression advertises needs_v=True, this routine supplies:

v := dX/dt (secant velocity from the data stream)

Metrics¶

Force:: e = Fe - Fh MSE_force = < e^T A_inv e > NMSE_force = MSE_force / < Fh^T A_inv Fh >
Diffusion:: E = De - Dh MSE_diffusion = < tr(A_inv E A_inv E) >

[Error analysis] Normalized MSE metrics (force & diffusion)

\[\text{NMSE}_F = \frac{\langle (F_{\text{exact}} - \hat F)^\top A^{-1} (F_{\text{exact}} - \hat F) \rangle} {\langle \hat F^\top A^{-1} \hat F \rangle}\]

\[\text{NMSE}_D = \frac{\langle \operatorname{tr}(A^{-1} E\, A^{-1} E) \rangle} {\langle \operatorname{tr}(A^{-1} \hat D\, A^{-1} \hat D) \rangle}\]

where \(E = D_{\text{exact}} - \hat D\).: NMSE_diffusion = MSE_diffusion / < tr(A_inv Dh A_inv Dh) >

where A_inv is self.A_inv (typically (2 D̄)^{-1} from the inferred constant diffusion normalization).

Subsampling¶

Uses a simple subsampling heuristic so that the total number of evaluated points is ~<= maxpoints, accounting for the maximum number of particles.

Requirements¶

self.A_inv must exist (run compute_diffusion_constant() or otherwise set A_inv).
The dataset must provide streams X, dt, and if any evaluated expr needs v: dX as well.

Parameters:: maxpoints (int)
Return type:: None

comparison_scatter(*, model_exact=None, field='force', data=None, ax=None, maxpoints=2000, **plot_kw)¶

Scatter inferred-vs-exact force (or diffusion) along the trajectory.

Parameters:

model_exact – Object exposing force_sf / diffusion_sf / D.
field ({"force", "diffusion"}) – Which field to compare.
data – Collection to evaluate on (default: training data).
ax – Target axes (default: current axes).
maxpoints (int) – Approximate number of points to evaluate.
**plot_kw – Forwarded to SFI.utils.plotting.comparison_scatter().

Return type:

matplotlib.axes.Axes

compute_diffusion_constant(method='auto')[source]¶

Underdamped constant diffusion inference (ULI estimators).

Estimate measurement noise Λ with method “Lambda”.

If method == ‘auto’: choose ‘noisy’ if tr(Λ)>0 else ‘WeakNoise’.

Average chosen instantaneous estimator to obtain D̄.

Set A = 2 D̄ and derive A_inv, sqrtA, sqrtA_inv.

Parameters:: method (str)
Return type:: None

compute_diffusion_error()¶

Estimate sampling error for diffusion inference.

Mirrors compute_force_error() for the diffusion field. Uses the diffusion Gram matrix (normal matrix) and its inverse.

For linear diffusion inference the moments covariance is proportional to the Gram matrix, giving Cov(θ_D) = cov_factor * G_D⁻¹.

Note

Notes

self.diffusion_coefficients_covariancejnp.ndarray: Covariance matrix of the diffusion coefficients.
self.diffusion_coefficients_stderrjnp.ndarray: Standard error for each diffusion coefficient.
self.diffusion_informationfloat: Estimated information content of the inferred diffusion field.
self.diffusion_predicted_MSEfloat: Predicted normalized mean squared error.

compute_force_error()¶

Estimate sampling error for force inference.

[Error analysis] Force coefficient covariance & predicted error

Assumes the sampling error dominates; measurement noise and discretization biases are not addressed.

Notes

diagnose(*, level='standard', **kwargs)¶

Run the consistency-check suite from SFI.diagnostics.

Convenience wrapper for SFI.diagnostics.assess(). See its docstring for the available level presets.

Parameters:: level (str)

diffusion_comparison_arrays(*, model_exact=None, diffusion_exact=None, data=None, maxpoints=2000)¶

Return (D_exact, D_inferred) evaluated along the trajectory.

Like force_comparison_arrays() but for the diffusion field; a constant exact/inferred diffusion is broadcast to (n_points, d, d).

Parameters:: maxpoints (int)

force_comparison_arrays(*, model_exact=None, force_exact=None, data=None, maxpoints=2000)¶

Return (F_exact, F_inferred) evaluated along the trajectory.

Parameters:

model_exact – Object exposing force_sf (e.g. an OverdampedProcess).
force_exact – Explicit callable exact force (overrides model_exact).
data – Collection to evaluate on (default: the training data).
maxpoints (int) – Approximate number of points to evaluate.

Returns:

(F_exact, F_inferred)

Return type:

tuple of ndarray, shape (n_points, d)

get_diffusion_timeop(method)[source]¶

Parameters:: method (str)
Return type:: TimeOperand

holdout_score(data, *, require_error=False)¶

Held-out NMSE of the fitted force on an independent collection.

Parameters:

data (TrajectoryCollection) – Independent test data (e.g. the second half of coll.split_time(0.8)).
require_error (bool) – If True, run compute_force_error() first when the predicted error is missing, so ratio is always defined.

Returns:

{"holdout_NMSE", "predicted_NMSE", "ratio", "excess_z", "n_obs"}. Also stored as self.force_holdout_NMSE.

Return type:

dict

Notes

Bases that read time-dependent extras are not supported on the held-out path (the residual builders pass extras unsliced).

infer_diffusion(basis=None, *, theta_D0=None, integrator='rk4', n_substeps=1, maxiter=100)[source]¶

Infer state- and velocity-dependent diffusion D(x, v).

Requires a prior parametric infer_force() call. Holds the fitted force fixed and minimises the pentadiagonal windowed conditional NLL over the diffusion parameters, with D(x, v; θ_D) evaluated at the shooting velocity.

[Inference] State-dependent diffusion inference (underdamped)

With the force \(\hat F(x,v)\) held fixed, the state-dependent diffusion \(D(x,v;\theta_D)\) is optimised by minimising the windowed conditional NLL on pentadiagonal (bandwidth-2) covariance windows; \(\Lambda\) from the force inference is held fixed.

Parameters:

basis (Basis (rank 2), PSF, or None) – Diffusion model. A rank-2 Basis gives the linear parameterisation D = Σ_j θ_j d_j; a PSF (optionally needs_v=True) is used directly. When None (default), symmetric_matrix_basis(d) — all constant symmetric d×d diffusion matrices.
theta_D0 (dict, array, or None) – Initial diffusion parameters (default zeros).
integrator ({"rk4", "euler"}) – Flow predictor (default "rk4", matching infer_force()).
n_substeps (int) – Integrator micro-steps per Δt (default 1).
maxiter (int) – L-BFGS maximum iterations (default 100).

Return type:

None

Notes

Sets diffusion_inferred, diffusion_coefficients (and diffusion_basis when basis is a Basis), plus metadata.

See also

Parametric windowed estimators — algorithm and parameters: Full algorithm description.

infer_diffusion_linear(basis, *, M_mode='noisy', G_mode='rectangle')[source]¶

Fit D(x, v) as a linear combination of basis functions.

Parameters:

basis (Basis)
M_mode (str)
G_mode (str)

Return type:

None

infer_force(F, theta0=None, *, D=None, Lambda=None, integrator='rk4', n_substeps=1, inner='auto', eiv='auto', max_outer=5, inner_maxiter=80)[source]¶

Infer force F(x,v) with the minimal parametric estimator.

Built on SFI.inference.parametric_core: the unobserved velocity is resolved by shooting through a single RK4 phase-space step, the 3-point residual covariance is pentadiagonal (bandwidth 2), and the parameters are found by direct Gauss–Newton (linear-in-θ Basis, with the skip-trick errors-in-variables instrument) or frozen-precision L-BFGS (nonlinear-in-θ PSF). (D, Λ) are profiled natively: moment-estimator init, then one windowed conditional-NLL refinement at the fitted θ.

[Inference] Parametric windowed force inference (underdamped)

The phase-space dynamics \(\dot{x}=v,\;\mathrm{d}v = F(x,v)\mathrm{d}t + \sqrt{2D}\,\mathrm{d}W\) are factored into deterministic flow + noise. Three-point shooting residuals follow a pentadiagonal Gaussian whose local precision weights the Gauss–Newton normal equations; the process noise enters at \(\Delta t^3\). Under measurement noise the left factor is the η-clean skip instrument built from the lagged clean position pair (eiv=True) — consistent where the naive MLE is velocity-EIV-biased.

Parameters:

F (PSF (needs_v=True) or Basis) – Parametric force model. A Basis is converted to a PSF internally (coefficients initialised to zero) and PASTIS sparsification is enabled; it runs the fast direct-GN path.
theta0 (dict, array, or None) – Initial force parameters (default: zeros).
D (array (d, d), optional) – Fixed diffusion matrix. If both D and Lambda are given, noise profiling is skipped entirely (fast path).
Lambda (array (d, d), optional) – Fixed measurement-noise covariance.
integrator ({"rk4", "euler"}) – Flow predictor (default "rk4", a single 4th-order step). A single "euler" step cannot carry the force into the position update, so the skip-trick is auto-disabled (with a warning) for integrator="euler", n_substeps=1.
n_substeps (int) – Integrator micro-steps per observation interval (default 1 — the single-step minimal estimator).
inner ({"auto", "gn", "lbfgs"}) – Inner solver. "auto" → direct Gauss–Newton for a linear Basis, L-BFGS for a PSF.
eiv ({"auto", True, False, float}) – Measurement-noise errors-in-variables instrument. "auto" (default) → True for all models (interacting models use the same N-body flow for the instrument as for the residual); True forces the η-clean skip instrument; False is the plain MLE; a float in [0, 1] blends. GN path only.
max_outer (int) – Outer Gauss–Newton / IRLS iterations (default 5).
inner_maxiter (int) – Inner L-BFGS iterations per outer step on the PSF path (default 80; raise for large nonlinear families, e.g. NNs).

Return type:

None

Notes

Sets standard force_* attributes: force_psf, force_G, force_G_pinv, force_coefficients_full, force_coefficients, force_support, force_moments, force_inferred, diffusion_average, A, A_inv, Lambda.

When F is a Basis, additionally sets force_basis, force_G_full, and force_scorer so that sparsify_force() can be called afterwards.

See also

Parametric windowed estimators — concepts: Mathematical foundations.
Parametric windowed estimators — algorithm and parameters: Detailed algorithm description.

infer_force_linear(basis, *, preset='auto', M_mode=None, G_mode=None, diffusion_method=None)[source]¶

Infer F(x, v) as a linear combination of basis functions.

The estimator is configured by a preset that consolidates the (M_mode, G_mode, diffusion_method) surface into two broadly-validated choices, with an auto switch keyed on measurement noise. Power users can still pass any of the three modes explicitly to override the preset.

Parameters:

preset ({'auto', 'robust', 'clean', 'legacy-clean-v1.0', 'legacy-noisy-v1.0'}, default 'auto') –
- 'robust' — symmetric / trapeze / noisy. The broadly applicable default: flat across measurement noise within its feasible band and never catastrophic.
- 'clean' — symmetric / trapeze / WeakNoise. A sharper refinement for verified noise-free data (helps mainly in the coarse-dt, low-noise corner); fragile under measurement noise.
- 'auto' — pick 'robust' when measurement noise is detected and 'clean' otherwise, on the sign of the localization-noise estimate Lambda_trace (>0 -> noise present). Lambda_trace is reused from a prior compute_diffusion_constant() call or estimated inline. Validated on the vdp_uli_10x benchmark: the zero-crossing tracks the noisy``<->``WeakNoise transition with no harmful misroutes.
- 'legacy-clean-v1.0' — early / rectangle / MSD. Reproduces the published SFI v1.0 clean-data convention (explicit only; never selected by 'auto').
- 'legacy-noisy-v1.0' — symmetric / rectangle / noisy. Reproduces the SFI v1.0 noisy-data convention (explicit only).
basis (Basis) – Rank-1 basis producing per-particle vectors. Must accept v=… keyword in evaluation.
M_mode (str, optional) – Override the preset’s kinematics / moments convention: ‘symmetric’ | ‘early’ | ‘anticipated’. None uses the preset.
G_mode (str, optional) – Override the preset’s Gram construction mode: ‘rectangle’ | ‘trapeze’ | ‘shift’ | ‘doubleshift’ (each returned transposed). None uses the preset.
diffusion_method (str, optional) – Override the preset’s instantaneous diffusion estimator used in the ULI correction term: ‘noisy’ | ‘WeakNoise’ | ‘MSD’. None uses the preset (and re-enables the auto measurement-noise switch).

Return type:

None

static load_results(path)¶

Reload a dict previously saved by save_results().

See SFI.inference.serialization.load_results() for details.

Return type:: dict

predict_time_profile(basis_block, t, *, field='force', x=None)¶

Evaluate a (time-dependent) basis block’s coefficient profile at t.

Parameters:: field (str)

print_report()¶

Print a summary report of the inference results.

Provides insights into the inferred diffusion and force fields, along with error metrics such as sampling error, trajectory length, discretization bias, and measurement noise.

report_dict()¶

Return a structured summary of inference results as a dictionary.

This is the machine-readable counterpart of print_report(). All values are plain Python scalars or numpy arrays (no JAX arrays).

Returns:: Keys include "diffusion_average", "Lambda", "force_information", "force_predicted_MSE", "NMSE_force", "NMSE_diffusion", and others when available. Missing quantities are omitted.
Return type:: dict

save_results(path)¶

Save report_dict() to <path>.npz + <path>.json.

See SFI.inference.serialization.save_results() for details.

Return type:: Path

simulate_bootstrapped_trajectory(key, oversampling=1, simulate=True, dataset=0)[source]¶

Simulate an underdamped Langevin trajectory using the inferred force and diffusion.

Parameters:

key – JAX random key.
oversampling (int) – number of integration substeps between saved frames.
simulate (bool) – if True, runs a simulation initialized from the first row of data; if False, returns an uninitialized process object.
dataset (int) – which experiment of a pooled fit to reproduce. The inferred model is collapsed to this condition via specialize() (per-dataset parameters folded at dataset); the resulting process is a standalone single-condition model that does not read dataset_index. Defaults to 0.

Returns:

(TrajectoryCollection, UnderdampedProcess) Else: UnderdampedProcess

Return type:

If simulate=True

sparsify_force(*, criterion='PASTIS', p=0.05, method='beam', max_k=None, **strategy_kwargs)¶

Sparsify the inferred force by selecting a subset of basis functions.

Builds a Pareto front of sparse models using the chosen method, then selects the model that maximises the given information criterion.

Parameters:

criterion ("PASTIS" | "AIC" | "BIC" | "EBIC" | "SIC", default "PASTIS") – Information criterion for model selection.
p (float, default 0.05) – Prior-scale parameter \(p_0\) for the PASTIS penalty.
method (str, default "beam") –
Search strategy. One of:
- "beam" — bidirectional beam search (PASTIS original). Extra kwargs: beam_width (int, default 3), aic_patience (int, default 2).
- "greedy" — forward stepwise selection. Extra kwargs: direction ("forward" | "backward" | "both", default "forward").
- "stlsq" — Sequential Thresholded Least Squares (SINDy-style). Extra kwargs: threshold (float or None), mode ("relative" | "absolute"), n_thresholds (int).
- "lasso" — \(\ell_1\)-penalised coordinate descent. Extra kwargs: alpha (float or None), n_alphas (int).
- "hillclimb" — stochastic hill-climbing (Gerardos & Ronceray, 2025). Extra kwargs: ic, patience (int), seed (int or None).
max_k (int or None) – Maximum model size. Defaults to the full basis size.
**strategy_kwargs – Passed to the strategy constructor.

Returns:

The full Pareto-front result, also stored as self.force_sparsity_result.

Return type:

SparsityResult

summary(field='force')¶

Return a formatted coefficient table for the inferred model.

Parameters:: field ("force" or "diffusion") – Which inferred field to summarize.
Returns:: Multi-line table ready for print().
Return type:: str

SFI.inference.load_model(path, template)[source]¶

Reload an InferenceResultSF from files written by save_model().

Parameters:

path (str or Path) – Base path (without extension).
template (InferenceResultSF) – A skeleton instance providing the PSF / Basis structure. Typically built from the same PSF / Basis used at training time: template = InferenceResultSF(SF(psf, dummy_params)).

Return type:

InferenceResultSF

SFI.inference.load_results(path)[source]¶

Reload a dict saved by save_results().

Parameters:: path (str or Path) – Base path (without extension).
Returns:: The merged dictionary of arrays and scalars.
Return type:: dict

SFI.inference.overlap_metrics(true_support, pred_support)[source]¶

Compare predicted support to the ground truth.

Parameters:

true_support (list[int]) – Indices of the true and predicted active basis functions.
pred_support (list[int]) – Indices of the true and predicted active basis functions.

Returns:

Keys: TP, FP, FN, prec, rec, exact.

Return type:

dict

SFI.inference.predictive_nmse(Phi_test, true_support, true_coeffs, inferred_support, inferred_coeffs)[source]¶

Normalised mean-squared error on a held-out design matrix.

Parameters:

Phi_test ((n_test, p) Array) – Design matrix evaluated on test data.
true_support (list[int]) – Ground-truth active indices.
true_coeffs (array-like) – Ground-truth coefficient vector (length len(true_support)).
inferred_support (list[int]) – Inferred active indices.
inferred_coeffs (array-like) – Inferred coefficient vector (length len(inferred_support)).

Returns:

\(\|\hat y - y\|^2 / \|y\|^2\).

Return type:

float

SFI.inference.save_model(result_sf, path)[source]¶

Serialize an InferenceResultSF (params array + metadata).

Creates <path>.params.npz (the fitted parameter arrays) and <path>.meta.json (param_cov + non-array metadata).

The PSF/Basis structure is not saved — it must be supplied again as a template when calling load_model().

Parameters:

result_sf (InferenceResultSF) – The fitted model to save.
path (str or Path) – Base path (without extension).

Returns:

The base path used.

Return type:

Path

SFI.inference.save_results(inferer, path)[source]¶

Save inferer.report_dict() to <path>.npz + <path>.json.

Parameters:

inferer (BaseLangevinInference) – Any inference object that exposes report_dict().
path (str or Path) – Base path (without extension). Two files are created: <path>.npz for arrays, <path>.json for scalars/metadata.

Returns:

The base path used.

Return type:

Path

Subpackages¶

SFI.inference.sparse package

SFI.inference package¶

Public classes¶

Axis effects¶

Inputs: exact references¶

Velocity provisioning (underdamped)¶

Metrics¶

Subsampling¶

Requirements¶

Inputs: exact references¶

Velocity provisioning (underdamped)¶

Metrics¶

Subsampling¶

Requirements¶

Subpackages¶

Submodules¶