SFI.inference.sparse package¶

SFI.inference.sparse — Pluggable sparse model selection.¶

This sub-package separates the scoring of candidate supports from the search strategy used to navigate the combinatorial lattice.

Public API¶

SparseScorer: Owns the (M, G) normal-equations data. Scores any support B by solving the restricted linear system and computing the information gain.
SparsityResult: Container returned by every strategy. Holds the Pareto front (best info / support / coefficients per cardinality k) and provides information-criterion selection (AIC, BIC, EBIC, PASTIS, SIC).
SparsityStrategy: Abstract base class for search algorithms. Subclasses must implement run(scorer, *, max_k, **kwargs) -> SparsityResult.

Concrete strategies¶

BeamSearchStrategy: Bidirectional beam search (the original PASTIS algorithm).
GreedyStepwiseStrategy: Forward, backward, or bidirectional stepwise selection.
HillClimbStrategy: Stochastic hill-climbing with random add/remove moves (Gerardos & Ronceray, 2025).
STLSQStrategy: Sequential Thresholded Least Squares (à la SINDy).
LassoStrategy: Coordinate-descent \(\ell_1\)-penalised regression on the normal equations.

Benchmark helpers¶

overlap_metrics: Compare predicted vs true support (TP / FP / FN / precision / recall).
predictive_nmse: Normalised mean-squared error on a held-out design matrix.

class SFI.inference.sparse.BeamSearchStrategy(*, beam_width=20, aic_patience=2, report_time=False)[source]¶

Bases: SparsityStrategy

Bidirectional beam search over the support lattice.

Parameters:

beam_width (int, default 20) – Maximum number of candidate models retained per cardinality.
aic_patience (int, default 2) – Stop early when AIC has strictly declined for this many consecutive closed cardinality levels.
report_time (bool, default False) – If True, log elapsed time and number of explored supports.

name: str = 'beam'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, init_supports=None, **_kwargs)[source]¶

Execute the beam search.

Parameters:

scorer (SparseScorer)
max_k (int) – Maximum model size to consider.
init_supports (list of tuples of int, optional) – Seed supports to inject into the initial frontier. Each entry is a tuple of basis-function indices. Useful for seeding the search with a known good model (e.g. the true support) so that the Pareto front is guaranteed to include it.

Return type:

SparsityResult

class SFI.inference.sparse.GreedyStepwiseStrategy(*, direction='forward', report_time=False)[source]¶

Bases: SparsityStrategy

Forward / backward / bidirectional stepwise selection.

Parameters:

direction ("forward" | "backward" | "both", default "forward") –
Which direction(s) to search.
- "forward": start empty, add one feature at a time.
- "backward": start full, drop one feature at a time.
- "both": run both directions and merge the Pareto fronts.
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'greedy'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.HillClimbStrategy(*, ic='PASTIS', p_param=0.001, tau=None, gamma=0.5, patience=200, seed=None, report_time=False)[source]¶

Bases: SparsityStrategy

Stochastic hill-climbing model selection.

For each cardinality k ∈ {0, …, max_k}, a chain starts from a random support of size k and accepts random add/remove moves that improve the chosen information criterion, stopping after patience consecutive failures. The best-per-k results form the Pareto front returned as a SparsityResult.

This is the search algorithm described in Gerardos & Ronceray (2025), §”Model selection”, which recommends parallel searches from null, full, and random starting points.

Parameters:

ic (str, default "PASTIS") – Information criterion used as the acceptance objective. One of "AIC", "BIC", "EBIC", "PASTIS", "SIC".
p_param (float, default 1e-3) – PASTIS significance level \(p_0\).
tau (float or None) – Total trajectory time (required for BIC / EBIC).
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)).
patience (int, default 200) – Stop a chain after this many consecutive rejected moves.
seed (int or None) – Random seed for reproducibility.
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'hillclimb'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.LassoStrategy(*, alpha=None, n_alphas=50, max_iter=1000, tol=1e-07, report_time=False)[source]¶

Bases: SparsityStrategy

Coordinate-descent LASSO on the normal equations.

Parameters:

alpha (float or None) – Fixed regularisation strength. If None (default), an automatic log-spaced path from \(\alpha_{\max}\) (where the solution is entirely zero) down to \(10^{-4}\,\alpha_{\max}\) is constructed.
n_alphas (int, default 50) – Number of \(\alpha\) values in the automatic path.
max_iter (int, default 1000) – Maximum coordinate-descent iterations per \(\alpha\).
tol (float, default 1e-7) – Convergence tolerance (max absolute change in any coefficient).
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'lasso'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.STLSQStrategy(*, threshold=None, mode='relative', max_iter=50, n_thresholds=30, report_time=False)[source]¶

Bases: SparsityStrategy

Sequential Thresholded Least Squares (SINDy-style).

Parameters:

threshold (float or None) – Single threshold value. If None, an automatic log-spaced sweep from a small fraction of the maximum coefficient up to the maximum is performed.
mode ("relative" | "absolute", default "relative") – Whether threshold is interpreted as a fraction of \(\max|C|\) (relative) or a fixed value (absolute).
max_iter (int, default 50) – Maximum STLSQ iterations per threshold.
n_thresholds (int, default 30) – Number of thresholds in the automatic sweep (used only when threshold is None).
report_time (bool, default False) – Log elapsed wall-clock time when done.

name: str = 'stlsq'¶: Short identifier used in SparsityResult.method.

run(scorer, *, max_k, **_kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

class SFI.inference.sparse.SparseScorer(*, M, G, norm_X2=0.0, n=1, pinv_tol=1e-08, use_residuals=False)[source]¶

Bases: object

Score candidate supports by solving the restricted normal equations.

Parameters:

M ((p,) Array) – Pre-computed moment vector (cross-moments between data and basis functions).
G ((p, p) Array) – Normal-equations matrix. May be non-symmetric (e.g. when using Itô-shift moment estimators). Symmetry is detected automatically and a Cholesky fast-path is used when possible.
norm_X2 (float, default 0.0) – Sum of squared observations. Only used when use_residuals=True.
n (int, default 1) – Sample count prefactor used in the residual-based information gain \(\tfrac{1}{2} n\,\log(\lVert X\rVert^2 / \mathrm{RSS})\). Only used when use_residuals=True.
pinv_tol (float, default 1e-8) – Tolerance for the diagonal preconditioning floor.
use_residuals (bool, default False) – If True, the information gain is computed via the residual sum-of-squares expression instead of the explicit quadratic form.

info_and_coeffs(B)[source]¶

Solve \(G_{BB}\,C_B = M_B\) and return the information gain.

Parameters:

B ((k,) int Array) – Indices of the active basis functions (the support).

Returns:

info (scalar Array) – \(\tfrac{1}{2}\,C_B^\top M_B\) (or the RSS variant when use_residuals=True).
C_B ((k,) Array) – Maximum-likelihood coefficients for the restricted support.

Return type:

Tuple[Array, Array]

vmap_info(batch)[source]¶

Score a batch of supports of the same cardinality.

The batch is padded to the next power-of-2 length so that JAX’s compilation cache stays bounded (≤ ~12 unique shapes per support size k instead of one per distinct batch count).

Parameters:

batch ((n_supports, k) int Array) – Each row is a sorted support of length k.

Returns:

infos ((n_supports,) Array)
coeffs ((n_supports, k) Array)

Return type:

Tuple[Array, Array]

class SFI.inference.sparse.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)[source]¶

Bases: object

Frozen container for the output of a sparsity search.

Variables:

p (int) – Total number of candidate basis functions.
total_info (float) – Information gain of the full (dense) model.
method (str) – Name of the strategy that produced this result (e.g. "beam", "greedy", "stlsq", "lasso").
best_info_by_k (list[float]) – best_info_by_k[k] is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are -inf.
best_support_by_k (list[list[int]]) – The support achieving best_info_by_k[k].
best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.
second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all -inf if the strategy does not track runner-ups.
second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.

Parameters:

p (int)
total_info (float)
method (str)
best_info_by_k (list)
best_support_by_k (list)
best_coeffs_by_k (list)
second_info_by_k (list)
second_support_by_k (list)

all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)[source]¶

Compute all information criteria and optionally compare to ground truth.

Parameters:

p_param (float) – PASTIS significance level.
tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.
gamma (float, default 0.5) – EBIC tuning parameter.
true_support (optional) – Ground-truth support and coefficients for overlap metrics.
true_coeffs (optional) – Ground-truth support and coefficients for overlap metrics.
Phi_test (optional Array) – Held-out design matrix for predictive NMSE.
verbose (bool) – If True, log a summary table at INFO level.

Returns:

Keyed by IC name, each value is a dict with k, support, score, coeffs, and optionally overlap and predictive-NMSE entries.

Return type:

dict

best_coeffs_by_k: list¶

best_info_by_k: list¶

best_support_by_k: list¶

method: str¶

p: int¶

second_info_by_k: list¶

second_support_by_k: list¶

select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)[source]¶

Return the support that maximises a given information criterion.

[Model selection] Information criteria for sparse model selection

\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]

where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, and \(\gamma \in [0,1]\) controls EBIC stringency.

References

AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.
BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).
EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.
PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”
SIC — Unpublished (Ronceray).

Parameters:

name ("AIC" | "BIC" | "EBIC" | "PASTIS" | "SIC") – Information criterion to maximise.
p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.
tau (float or None) – Total trajectory time. Required for BIC and EBIC.
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is "EBIC".

Returns:

k_star (int) – Selected model size.
support (list[int]) – Basis-function indices of the chosen model.
score (float) – Value of the information criterion at k_star.
coeffs (Array or None) – Coefficient vector for the selected support.

Return type:

Tuple[int, List[int], float, Array | None]

total_info: float¶

class SFI.inference.sparse.SparsityStrategy[source]¶

Bases: ABC

Abstract base class for sparsity search strategies.

Subclasses must implement run().

name: str = 'base'¶: Short identifier used in SparsityResult.method.

abstractmethod run(scorer, *, max_k, **kwargs)[source]¶

Execute the search and return a SparsityResult.

Parameters:

scorer (SparseScorer) – Provides info_and_coeffs / vmap_info for evaluating candidate supports.
max_k (int) – Maximum model size to explore.

Return type:

SparsityResult

SFI.inference.sparse.overlap_metrics(true_support, pred_support)[source]¶

Compare predicted support to the ground truth.

Parameters:

true_support (list[int]) – Indices of the true and predicted active basis functions.
pred_support (list[int]) – Indices of the true and predicted active basis functions.

Returns:

Keys: TP, FP, FN, prec, rec, exact.

Return type:

dict

SFI.inference.sparse.predictive_nmse(Phi_test, true_support, true_coeffs, inferred_support, inferred_coeffs)[source]¶

Normalised mean-squared error on a held-out design matrix.

Parameters:

Phi_test ((n_test, p) Array) – Design matrix evaluated on test data.
true_support (list[int]) – Ground-truth active indices.
true_coeffs (array-like) – Ground-truth coefficient vector (length len(true_support)).
inferred_support (list[int]) – Inferred active indices.
inferred_coeffs (array-like) – Inferred coefficient vector (length len(inferred_support)).

Returns:

\(\|\hat y - y\|^2 / \|y\|^2\).

Return type:

float

SFI.inference.sparse package¶

SFI.inference.sparse — Pluggable sparse model selection.¶

Public API¶

Concrete strategies¶

Benchmark helpers¶

Submodules¶