SFI.inference.sparse package

SFI.inference.sparse — Pluggable sparse model selection.

This sub-package separates the scoring of candidate supports from the search strategy used to navigate the combinatorial lattice.

Public API

SparseScorer

Owns the (M, G) normal-equations data. Scores any support B by solving the restricted linear system and computing the information gain.

SparsityResult

Container returned by every strategy. Holds the Pareto front (best info / support / coefficients per cardinality k) and provides information-criterion selection (AIC, BIC, EBIC, PASTIS, SIC).

SparsityStrategy

Abstract base class for search algorithms. Subclasses must implement run(scorer, *, max_k, **kwargs) -> SparsityResult.

Concrete strategies

BeamSearchStrategy

Bidirectional beam search (the original PASTIS algorithm).

GreedyStepwiseStrategy

Forward, backward, or bidirectional stepwise selection.

HillClimbStrategy

Stochastic hill-climbing with random add/remove moves (Gerardos & Ronceray, 2025).

STLSQStrategy

Sequential Thresholded Least Squares (à la SINDy).

LassoStrategy

Coordinate-descent \(\ell_1\)-penalised regression on the normal equations.

Benchmark helpers

overlap_metrics

Compare predicted vs true support (TP / FP / FN / precision / recall).

predictive_nmse

Normalised mean-squared error on a held-out design matrix.

class SFI.inference.sparse.BeamSearchStrategy(*, beam_width=20, aic_patience=2, report_time=False)[source]

Bases: SparsityStrategy

Bidirectional beam search over the support lattice.

Parameters:
  • beam_width (int, default 20) – Maximum number of candidate models retained per cardinality.

  • aic_patience (int, default 2) – Stop early when AIC has strictly declined for this many consecutive closed cardinality levels.

  • report_time (bool, default False) – If True, log elapsed time and number of explored supports.

name: str = 'beam'

Short identifier used in SparsityResult.method.

run(scorer, *, max_k, init_supports=None, **_kwargs)[source]

Execute the beam search.

Parameters:
  • scorer (SparseScorer)

  • max_k (int) – Maximum model size to consider.

  • init_supports (list of tuples of int, optional) – Seed supports to inject into the initial frontier. Each entry is a tuple of basis-function indices. Useful for seeding the search with a known good model (e.g. the true support) so that the Pareto front is guaranteed to include it.

Return type:

SparsityResult

class SFI.inference.sparse.GreedyStepwiseStrategy(*, direction='forward', report_time=False)[source]

Bases: SparsityStrategy

Forward / backward / bidirectional stepwise selection.

Parameters:
  • direction ("forward" | "backward" | "both", default "forward") –

    Which direction(s) to search.

    • "forward": start empty, add one feature at a time.

    • "backward": start full, drop one feature at a time.

    • "both": run both directions and merge the Pareto fronts.

  • report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'greedy'

Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]

Execute the search and return a SparsityResult.

Parameters:
  • scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.

  • max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.HillClimbStrategy(*, ic='PASTIS', p_param=0.001, tau=None, gamma=0.5, patience=200, seed=None, report_time=False)[source]

Bases: SparsityStrategy

Stochastic hill-climbing model selection.

For each cardinality k ∈ {0, …, max_k}, a chain starts from a random support of size k and accepts random add/remove moves that improve the chosen information criterion, stopping after patience consecutive failures. The best-per-k results form the Pareto front returned as a SparsityResult.

This is the search algorithm described in Gerardos & Ronceray (2025), §”Model selection”, which recommends parallel searches from null, full, and random starting points.

Parameters:
  • ic (str, default "PASTIS") – Information criterion used as the acceptance objective. One of "AIC", "BIC", "EBIC", "PASTIS", "SIC".

  • p_param (float, default 1e-3) – PASTIS significance level \(p_0\).

  • tau (float or None) – Total trajectory time (required for BIC / EBIC).

  • gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)).

  • patience (int, default 200) – Stop a chain after this many consecutive rejected moves.

  • seed (int or None) – Random seed for reproducibility.

  • report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'hillclimb'

Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]

Execute the search and return a SparsityResult.

Parameters:
  • scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.

  • max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.LassoStrategy(*, alpha=None, n_alphas=50, max_iter=1000, tol=1e-07, report_time=False)[source]

Bases: SparsityStrategy

Coordinate-descent LASSO on the normal equations.

Parameters:
  • alpha (float or None) – Fixed regularisation strength. If None (default), an automatic log-spaced path from \(\alpha_{\max}\) (where the solution is entirely zero) down to \(10^{-4}\,\alpha_{\max}\) is constructed.

  • n_alphas (int, default 50) – Number of \(\alpha\) values in the automatic path.

  • max_iter (int, default 1000) – Maximum coordinate-descent iterations per \(\alpha\).

  • tol (float, default 1e-7) – Convergence tolerance (max absolute change in any coefficient).

  • report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'lasso'

Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]

Execute the search and return a SparsityResult.

Parameters:
  • scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.

  • max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.STLSQStrategy(*, threshold=None, mode='relative', max_iter=50, n_thresholds=30, report_time=False)[source]

Bases: SparsityStrategy

Sequential Thresholded Least Squares (SINDy-style).

Parameters:
  • threshold (float or None) – Single threshold value. If None, an automatic log-spaced sweep from a small fraction of the maximum coefficient up to the maximum is performed.

  • mode ("relative" | "absolute", default "relative") – Whether threshold is interpreted as a fraction of \(\max|C|\) (relative) or a fixed value (absolute).

  • max_iter (int, default 50) – Maximum STLSQ iterations per threshold.

  • n_thresholds (int, default 30) – Number of thresholds in the automatic sweep (used only when threshold is None).

  • report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'stlsq'

Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]

Execute the search and return a SparsityResult.

Parameters:
  • scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.

  • max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.SparseScorer(*, M, G, norm_X2=0.0, n=1, pinv_tol=1e-08, use_residuals=False)[source]

Bases: object

Score candidate supports by solving the restricted normal equations.

Parameters:
  • M ((p,) Array) – Pre-computed moment vector (cross-moments between data and basis functions).

  • G ((p, p) Array) – Normal-equations matrix. May be non-symmetric (e.g. when using Itô-shift moment estimators). Symmetry is detected automatically and a Cholesky fast-path is used when possible.

  • norm_X2 (float, default 0.0) – Sum of squared observations. Only used when use_residuals=True.

  • n (int, default 1) – Sample count prefactor used in the residual-based information gain \(\tfrac{1}{2} n\,\log(\lVert X\rVert^2 / \mathrm{RSS})\). Only used when use_residuals=True.

  • pinv_tol (float, default 1e-8) – Tolerance for the diagonal preconditioning floor.

  • use_residuals (bool, default False) – If True, the information gain is computed via the residual sum-of-squares expression instead of the explicit quadratic form.

info_and_coeffs(B)[source]

Solve \(G_{BB}\,C_B = M_B\) and return the information gain.

Parameters:

B ((k,) int Array) – Indices of the active basis functions (the support).

Returns:

  • info (scalar Array) – \(\tfrac{1}{2}\,C_B^\top M_B\) (or the RSS variant when use_residuals=True).

  • C_B ((k,) Array) – Maximum-likelihood coefficients for the restricted support.

Return type:

Tuple[Array, Array]

vmap_info(batch)[source]

Score a batch of supports of the same cardinality.

The batch is padded to the next power-of-2 length so that JAX’s compilation cache stays bounded (≤ ~12 unique shapes per support size k instead of one per distinct batch count).

Parameters:

batch ((n_supports, k) int Array) – Each row is a sorted support of length k.

Returns:

  • infos ((n_supports,) Array)

  • coeffs ((n_supports, k) Array)

Return type:

Tuple[Array, Array]

class SFI.inference.sparse.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)[source]

Bases: object

Frozen container for the output of a sparsity search.

Variables:
  • p (int) – Total number of candidate basis functions.

  • total_info (float) – Information gain of the full (dense) model.

  • method (str) – Name of the strategy that produced this result (e.g. "beam", "greedy", "stlsq", "lasso").

  • best_info_by_k (list[float]) – best_info_by_k[k] is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are -inf.

  • best_support_by_k (list[list[int]]) – The support achieving best_info_by_k[k].

  • best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.

  • second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all -inf if the strategy does not track runner-ups.

  • second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.

Parameters:
  • p (int)

  • total_info (float)

  • method (str)

  • best_info_by_k (list)

  • best_support_by_k (list)

  • best_coeffs_by_k (list)

  • second_info_by_k (list)

  • second_support_by_k (list)

all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)[source]

Compute all information criteria and optionally compare to ground truth.

Parameters:
  • p_param (float) – PASTIS significance level.

  • tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.

  • gamma (float, default 0.5) – EBIC tuning parameter.

  • true_support (optional) – Ground-truth support and coefficients for overlap metrics.

  • true_coeffs (optional) – Ground-truth support and coefficients for overlap metrics.

  • Phi_test (optional Array) – Held-out design matrix for predictive NMSE.

  • verbose (bool) – If True, log a summary table at INFO level.

Returns:

Keyed by IC name, each value is a dict with k, support, score, coeffs, and optionally overlap and predictive-NMSE entries.

Return type:

dict

best_coeffs_by_k: list
best_info_by_k: list
best_support_by_k: list
method: str
p: int
second_info_by_k: list
second_support_by_k: list
select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)[source]

Return the support that maximises a given information criterion.

[Model selection] Information criteria for sparse model selection

\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]

where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, and \(\gamma \in [0,1]\) controls EBIC stringency.

References

  • AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.

  • BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).

  • EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.

  • PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”

  • SIC — Unpublished (Ronceray).

Parameters:
  • name ("AIC" | "BIC" | "EBIC" | "PASTIS" | "SIC") – Information criterion to maximise.

  • p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.

  • tau (float or None) – Total trajectory time. Required for BIC and EBIC.

  • gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is "EBIC".

Returns:

  • k_star (int) – Selected model size.

  • support (list[int]) – Basis-function indices of the chosen model.

  • score (float) – Value of the information criterion at k_star.

  • coeffs (Array or None) – Coefficient vector for the selected support.

Return type:

Tuple[int, List[int], float, Array | None]

total_info: float
class SFI.inference.sparse.SparsityStrategy[source]

Bases: ABC

Abstract base class for sparsity search strategies.

Subclasses must implement run().

name: str = 'base'

Short identifier used in SparsityResult.method.

abstractmethod run(scorer, *, max_k, **kwargs)[source]

Execute the search and return a SparsityResult.

Parameters:
  • scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.

  • max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

SFI.inference.sparse.overlap_metrics(true_support, pred_support)[source]

Compare predicted support to the ground truth.

Parameters:
  • true_support (list[int]) – Indices of the true and predicted active basis functions.

  • pred_support (list[int]) – Indices of the true and predicted active basis functions.

Returns:

Keys: TP, FP, FN, prec, rec, exact.

Return type:

dict

SFI.inference.sparse.predictive_nmse(Phi_test, true_support, true_coeffs, inferred_support, inferred_coeffs)[source]

Normalised mean-squared error on a held-out design matrix.

Parameters:
  • Phi_test ((n_test, p) Array) – Design matrix evaluated on test data.

  • true_support (list[int]) – Ground-truth active indices.

  • true_coeffs (array-like) – Ground-truth coefficient vector (length len(true_support)).

  • inferred_support (list[int]) – Inferred active indices.

  • inferred_coeffs (array-like) – Inferred coefficient vector (length len(inferred_support)).

Returns:

\(\|\hat y - y\|^2 / \|y\|^2\).

Return type:

float

Submodules