SFI.inference.sparsity module¶

SFI.inference.sparsity — Sparse model selection¶

Public façade that re-exports the key symbols from SFI.inference.sparse. User code should import from here or from the SFI.inference namespace:

from SFI.inference import SparseScorer, SparsityResult
from SFI.inference.sparse import BeamSearchStrategy, LassoStrategy

The heavy lifting lives in SFI.inference.sparse; this module keeps the import path short and backward-compatible.

class SFI.inference.sparsity.BeamSearchStrategy(*, beam_width=20, aic_patience=2, report_time=False)[source]¶

Bases: SparsityStrategy

Bidirectional beam search over the support lattice.

Parameters:

beam_width (int, default 20) – Maximum number of candidate models retained per cardinality.
aic_patience (int, default 2) – Stop early when AIC has strictly declined for this many consecutive closed cardinality levels.
report_time (bool, default False) – If True, log elapsed time and number of explored supports.

name: str = 'beam'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, init_supports=None, **_kwargs)[source]¶

Execute the beam search.

Parameters:

scorer (SparseScorer)
max_k (int) – Maximum model size to consider.
init_supports (list of tuples of int, optional) – Seed supports to inject into the initial frontier. Each entry is a tuple of basis-function indices. Useful for seeding the search with a known good model (e.g. the true support) so that the Pareto front is guaranteed to include it.

Return type:

SparsityResult

class SFI.inference.sparsity.GreedyStepwiseStrategy(*, direction='forward', report_time=False)[source]¶

Bases: SparsityStrategy

Forward / backward / bidirectional stepwise selection.

Parameters:

direction ("forward" | "backward" | "both", default "forward") –
Which direction(s) to search.
- "forward": start empty, add one feature at a time.
- "backward": start full, drop one feature at a time.
- "both": run both directions and merge the Pareto fronts.
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'greedy'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparsity.LassoStrategy(*, alpha=None, n_alphas=50, max_iter=1000, tol=1e-07, report_time=False)[source]¶

Bases: SparsityStrategy

Coordinate-descent LASSO on the normal equations.

Parameters:

alpha (float or None) – Fixed regularisation strength. If None (default), an automatic log-spaced path from \(\alpha_{\max}\) (where the solution is entirely zero) down to \(10^{-4}\,\alpha_{\max}\) is constructed.
n_alphas (int, default 50) – Number of \(\alpha\) values in the automatic path.
max_iter (int, default 1000) – Maximum coordinate-descent iterations per \(\alpha\).
tol (float, default 1e-7) – Convergence tolerance (max absolute change in any coefficient).
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'lasso'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparsity.STLSQStrategy(*, threshold=None, mode='relative', max_iter=50, n_thresholds=30, report_time=False)[source]¶

Bases: SparsityStrategy

Sequential Thresholded Least Squares (SINDy-style).

Parameters:

threshold (float or None) – Single threshold value. If None, an automatic log-spaced sweep from a small fraction of the maximum coefficient up to the maximum is performed.
mode ("relative" | "absolute", default "relative") – Whether threshold is interpreted as a fraction of \(\max|C|\) (relative) or a fixed value (absolute).
max_iter (int, default 50) – Maximum STLSQ iterations per threshold.
n_thresholds (int, default 30) – Number of thresholds in the automatic sweep (used only when threshold is None).
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'stlsq'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparsity.SparseScorer(*, M, G, norm_X2=0.0, n=1, pinv_tol=1e-08, use_residuals=False)[source]¶

Bases: object

Score candidate supports by solving the restricted normal equations.

Parameters:

M ((p,) Array) – Pre-computed moment vector (cross-moments between data and basis functions).
G ((p, p) Array) – Normal-equations matrix. May be non-symmetric (e.g. when using Itô-shift moment estimators). Symmetry is detected automatically and a Cholesky fast-path is used when possible.
norm_X2 (float, default 0.0) – Sum of squared observations. Only used when use_residuals=True.
n (int, default 1) – Sample count prefactor used in the residual-based information gain \(\tfrac{1}{2} n\,\log(\lVert X\rVert^2 / \mathrm{RSS})\). Only used when use_residuals=True.
pinv_tol (float, default 1e-8) – Tolerance for the diagonal preconditioning floor.
use_residuals (bool, default False) – If True, the information gain is computed via the residual sum-of-squares expression instead of the explicit quadratic form.

info_and_coeffs(B)[source]¶

Solve \(G_{BB}\,C_B = M_B\) and return the information gain.

Parameters:

B ((k,) int Array) – Indices of the active basis functions (the support).

Returns:

info (scalar Array) – \(\tfrac{1}{2}\,C_B^\top M_B\) (or the RSS variant when use_residuals=True).
C_B ((k,) Array) – Maximum-likelihood coefficients for the restricted support.

Return type:

Tuple[Array, Array]

vmap_info(batch)[source]¶

Score a batch of supports of the same cardinality.

The batch is padded to the next power-of-2 length so that JAX’s compilation cache stays bounded (≤ ~12 unique shapes per support size k instead of one per distinct batch count).

Parameters:

batch ((n_supports, k) int Array) – Each row is a sorted support of length k.

Returns:

infos ((n_supports,) Array)
coeffs ((n_supports, k) Array)

Return type:

Tuple[Array, Array]

class SFI.inference.sparsity.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)[source]¶

Bases: object

Frozen container for the output of a sparsity search.

Variables:

p (int) – Total number of candidate basis functions.
total_info (float) – Information gain of the full (dense) model.
method (str) – Name of the strategy that produced this result (e.g. "beam", "greedy", "stlsq", "lasso").
best_info_by_k (list[float]) – best_info_by_k[k] is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are -inf.
best_support_by_k (list[list[int]]) – The support achieving best_info_by_k[k].
best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.
second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all -inf if the strategy does not track runner-ups.
second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.

Parameters:

p (int)
total_info (float)
method (str)
best_info_by_k (list)
best_support_by_k (list)
best_coeffs_by_k (list)
second_info_by_k (list)
second_support_by_k (list)

all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)[source]¶

Compute all information criteria and optionally compare to ground truth.

Parameters:

p_param (float) – PASTIS significance level.
tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.
gamma (float, default 0.5) – EBIC tuning parameter.
true_support (optional) – Ground-truth support and coefficients for overlap metrics.
true_coeffs (optional) – Ground-truth support and coefficients for overlap metrics.
Phi_test (optional Array) – Held-out design matrix for predictive NMSE.
verbose (bool) – If True, log a summary table at INFO level.

Returns:

Keyed by IC name, each value is a dict with k, support, score, coeffs, and optionally overlap and predictive-NMSE entries.

Return type:

dict

best_coeffs_by_k: list¶

best_info_by_k: list¶

best_support_by_k: list¶

method: str¶

p: int¶

second_info_by_k: list¶

second_support_by_k: list¶

select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)[source]¶

Return the support that maximises a given information criterion.

[Model selection] Information criteria for sparse model selection

\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]

where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, and \(\gamma \in [0,1]\) controls EBIC stringency.

References

AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.
BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).
EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.
PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”
SIC — Unpublished (Ronceray).

Parameters:

name ("AIC" | "BIC" | "EBIC" | "PASTIS" | "SIC") – Information criterion to maximise.
p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.
tau (float or None) – Total trajectory time. Required for BIC and EBIC.
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is "EBIC".

Returns:

k_star (int) – Selected model size.
support (list[int]) – Basis-function indices of the chosen model.
score (float) – Value of the information criterion at k_star.
coeffs (Array or None) – Coefficient vector for the selected support.

Return type:

Tuple[int, List[int], float, Array | None]

total_info: float¶

class SFI.inference.sparsity.SparsityStrategy[source]¶

Bases: ABC

Abstract base class for sparsity search strategies.

Subclasses must implement run().

name: str = 'base'¶: Short identifier used in SparsityResult.method.

abstractmethod run(scorer, *, max_k, **kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

SFI.inference.sparsity.overlap_metrics(true_support, pred_support)[source]¶

Compare predicted support to the ground truth.

Parameters:

true_support (list[int]) – Indices of the true and predicted active basis functions.
pred_support (list[int]) – Indices of the true and predicted active basis functions.

Returns:

Keys: TP, FP, FN, prec, rec, exact.

Return type:

dict

SFI.inference.sparsity.predictive_nmse(Phi_test, true_support, true_coeffs, inferred_support, inferred_coeffs)[source]¶

Normalised mean-squared error on a held-out design matrix.

Parameters:

Phi_test ((n_test, p) Array) – Design matrix evaluated on test data.
true_support (list[int]) – Ground-truth active indices.
true_coeffs (array-like) – Ground-truth coefficient vector (length len(true_support)).
inferred_support (list[int]) – Inferred active indices.
inferred_coeffs (array-like) – Inferred coefficient vector (length len(inferred_support)).

Returns:

\(\|\hat y - y\|^2 / \|y\|^2\).

Return type:

float