SFI.inference.sparse package¶
SFI.inference.sparse — Pluggable sparse model selection.¶
This sub-package separates the scoring of candidate supports from the search strategy used to navigate the combinatorial lattice.
Public API¶
- SparseScorer
Owns the
(M, G)normal-equations data. Scores any support B by solving the restricted linear system and computing the information gain.- SparsityResult
Container returned by every strategy. Holds the Pareto front (best info / support / coefficients per cardinality k) and provides information-criterion selection (AIC, BIC, EBIC, PASTIS, SIC).
- SparsityStrategy
Abstract base class for search algorithms. Subclasses must implement
run(scorer, *, max_k, **kwargs) -> SparsityResult.
Concrete strategies¶
- BeamSearchStrategy
Bidirectional beam search (the original PASTIS algorithm).
- GreedyStepwiseStrategy
Forward, backward, or bidirectional stepwise selection.
- HillClimbStrategy
Stochastic hill-climbing with random add/remove moves (Gerardos & Ronceray, 2025).
- STLSQStrategy
Sequential Thresholded Least Squares (à la SINDy).
- LassoStrategy
Coordinate-descent \(\ell_1\)-penalised regression on the normal equations.
Benchmark helpers¶
- overlap_metrics
Compare predicted vs true support (TP / FP / FN / precision / recall).
- predictive_nmse
Normalised mean-squared error on a held-out design matrix.
- class SFI.inference.sparse.BeamSearchStrategy(*, beam_width=20, aic_patience=2, report_time=False)[source]¶
Bases:
SparsityStrategyBidirectional beam search over the support lattice.
- Parameters:
beam_width (int, default 20) – Maximum number of candidate models retained per cardinality.
aic_patience (int, default 2) – Stop early when AIC has strictly declined for this many consecutive closed cardinality levels.
report_time (bool, default False) – If True, log elapsed time and number of explored supports.
- name: str = 'beam'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, init_supports=None, **_kwargs)[source]¶
Execute the beam search.
- Parameters:
scorer (SparseScorer)
max_k (int) – Maximum model size to consider.
init_supports (list of tuples of int, optional) – Seed supports to inject into the initial frontier. Each entry is a tuple of basis-function indices. Useful for seeding the search with a known good model (e.g. the true support) so that the Pareto front is guaranteed to include it.
- Return type:
- class SFI.inference.sparse.GreedyStepwiseStrategy(*, direction='forward', report_time=False)[source]¶
Bases:
SparsityStrategyForward / backward / bidirectional stepwise selection.
- Parameters:
direction (
"forward"|"backward"|"both", default"forward") –Which direction(s) to search.
"forward": start empty, add one feature at a time."backward": start full, drop one feature at a time."both": run both directions and merge the Pareto fronts.
report_time (bool, default False) – Log elapsed wall-clock time when done.
- name: str = 'greedy'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, **_kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- class SFI.inference.sparse.HillClimbStrategy(*, ic='PASTIS', p_param=0.001, tau=None, gamma=0.5, patience=200, seed=None, report_time=False)[source]¶
Bases:
SparsityStrategyStochastic hill-climbing model selection.
For each cardinality k ∈ {0, …, max_k}, a chain starts from a random support of size k and accepts random add/remove moves that improve the chosen information criterion, stopping after patience consecutive failures. The best-per-k results form the Pareto front returned as a
SparsityResult.This is the search algorithm described in Gerardos & Ronceray (2025), §”Model selection”, which recommends parallel searches from null, full, and random starting points.
- Parameters:
ic (str, default
"PASTIS") – Information criterion used as the acceptance objective. One of"AIC","BIC","EBIC","PASTIS","SIC".p_param (float, default 1e-3) – PASTIS significance level \(p_0\).
tau (float or None) – Total trajectory time (required for BIC / EBIC).
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)).
patience (int, default 200) – Stop a chain after this many consecutive rejected moves.
seed (int or None) – Random seed for reproducibility.
report_time (bool, default False) – Log elapsed wall-clock time when done.
- name: str = 'hillclimb'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, **_kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- class SFI.inference.sparse.LassoStrategy(*, alpha=None, n_alphas=50, max_iter=1000, tol=1e-07, report_time=False)[source]¶
Bases:
SparsityStrategyCoordinate-descent LASSO on the normal equations.
- Parameters:
alpha (float or None) – Fixed regularisation strength. If None (default), an automatic log-spaced path from \(\alpha_{\max}\) (where the solution is entirely zero) down to \(10^{-4}\,\alpha_{\max}\) is constructed.
n_alphas (int, default 50) – Number of \(\alpha\) values in the automatic path.
max_iter (int, default 1000) – Maximum coordinate-descent iterations per \(\alpha\).
tol (float, default 1e-7) – Convergence tolerance (max absolute change in any coefficient).
report_time (bool, default False) – Log elapsed wall-clock time when done.
- name: str = 'lasso'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, **_kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- class SFI.inference.sparse.STLSQStrategy(*, threshold=None, mode='relative', max_iter=50, n_thresholds=30, report_time=False)[source]¶
Bases:
SparsityStrategySequential Thresholded Least Squares (SINDy-style).
- Parameters:
threshold (float or None) – Single threshold value. If None, an automatic log-spaced sweep from a small fraction of the maximum coefficient up to the maximum is performed.
mode (
"relative"|"absolute", default"relative") – Whetherthresholdis interpreted as a fraction of \(\max|C|\) (relative) or a fixed value (absolute).max_iter (int, default 50) – Maximum STLSQ iterations per threshold.
n_thresholds (int, default 30) – Number of thresholds in the automatic sweep (used only when
threshold is None).report_time (bool, default False) – Log elapsed wall-clock time when done.
- name: str = 'stlsq'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, **_kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- class SFI.inference.sparse.SparseScorer(*, M, G, norm_X2=0.0, n=1, pinv_tol=1e-08, use_residuals=False)[source]¶
Bases:
objectScore candidate supports by solving the restricted normal equations.
- Parameters:
M ((p,) Array) – Pre-computed moment vector (cross-moments between data and basis functions).
G ((p, p) Array) – Normal-equations matrix. May be non-symmetric (e.g. when using Itô-shift moment estimators). Symmetry is detected automatically and a Cholesky fast-path is used when possible.
norm_X2 (float, default 0.0) – Sum of squared observations. Only used when
use_residuals=True.n (int, default 1) – Sample count prefactor used in the residual-based information gain \(\tfrac{1}{2} n\,\log(\lVert X\rVert^2 / \mathrm{RSS})\). Only used when
use_residuals=True.pinv_tol (float, default 1e-8) – Tolerance for the diagonal preconditioning floor.
use_residuals (bool, default False) – If True, the information gain is computed via the residual sum-of-squares expression instead of the explicit quadratic form.
- info_and_coeffs(B)[source]¶
Solve \(G_{BB}\,C_B = M_B\) and return the information gain.
- Parameters:
B ((k,) int Array) – Indices of the active basis functions (the support).
- Returns:
info (scalar Array) – \(\tfrac{1}{2}\,C_B^\top M_B\) (or the RSS variant when
use_residuals=True).C_B ((k,) Array) – Maximum-likelihood coefficients for the restricted support.
- Return type:
Tuple[Array, Array]
- vmap_info(batch)[source]¶
Score a batch of supports of the same cardinality.
The batch is padded to the next power-of-2 length so that JAX’s compilation cache stays bounded (≤ ~12 unique shapes per support size k instead of one per distinct batch count).
- Parameters:
batch ((n_supports, k) int Array) – Each row is a sorted support of length k.
- Returns:
infos ((n_supports,) Array)
coeffs ((n_supports, k) Array)
- Return type:
Tuple[Array, Array]
- class SFI.inference.sparse.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)[source]¶
Bases:
objectFrozen container for the output of a sparsity search.
- Variables:
p (int) – Total number of candidate basis functions.
total_info (float) – Information gain of the full (dense) model.
method (str) – Name of the strategy that produced this result (e.g.
"beam","greedy","stlsq","lasso").best_info_by_k (list[float]) –
best_info_by_k[k]is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are-inf.best_support_by_k (list[list[int]]) – The support achieving
best_info_by_k[k].best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.
second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all
-infif the strategy does not track runner-ups.second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.
- Parameters:
p (int)
total_info (float)
method (str)
best_info_by_k (list)
best_support_by_k (list)
best_coeffs_by_k (list)
second_info_by_k (list)
second_support_by_k (list)
- all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)[source]¶
Compute all information criteria and optionally compare to ground truth.
- Parameters:
p_param (float) – PASTIS significance level.
tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.
gamma (float, default 0.5) – EBIC tuning parameter.
true_support (optional) – Ground-truth support and coefficients for overlap metrics.
true_coeffs (optional) – Ground-truth support and coefficients for overlap metrics.
Phi_test (optional Array) – Held-out design matrix for predictive NMSE.
verbose (bool) – If True, log a summary table at INFO level.
- Returns:
Keyed by IC name, each value is a dict with
k,support,score,coeffs, and optionally overlap and predictive-NMSE entries.- Return type:
dict
- best_coeffs_by_k: list¶
- best_info_by_k: list¶
- best_support_by_k: list¶
- method: str¶
- p: int¶
- second_info_by_k: list¶
- second_support_by_k: list¶
- select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)[source]¶
Return the support that maximises a given information criterion.
[Model selection] Information criteria for sparse model selection
\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, and \(\gamma \in [0,1]\) controls EBIC stringency.
References
AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.
BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).
EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.
PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”
SIC — Unpublished (Ronceray).
- Parameters:
name (
"AIC"|"BIC"|"EBIC"|"PASTIS"|"SIC") – Information criterion to maximise.p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.
tau (float or None) – Total trajectory time. Required for BIC and EBIC.
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is
"EBIC".
- Returns:
k_star (int) – Selected model size.
support (list[int]) – Basis-function indices of the chosen model.
score (float) – Value of the information criterion at
k_star.coeffs (Array or None) – Coefficient vector for the selected support.
- Return type:
Tuple[int, List[int], float, Array | None]
- total_info: float¶
- class SFI.inference.sparse.SparsityStrategy[source]¶
Bases:
ABCAbstract base class for sparsity search strategies.
Subclasses must implement
run().- name: str = 'base'¶
Short identifier used in
SparsityResult.method.
- abstractmethod run(scorer, *, max_k, **kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- SFI.inference.sparse.overlap_metrics(true_support, pred_support)[source]¶
Compare predicted support to the ground truth.
- Parameters:
true_support (list[int]) – Indices of the true and predicted active basis functions.
pred_support (list[int]) – Indices of the true and predicted active basis functions.
- Returns:
Keys:
TP,FP,FN,prec,rec,exact.- Return type:
dict
- SFI.inference.sparse.predictive_nmse(Phi_test, true_support, true_coeffs, inferred_support, inferred_coeffs)[source]¶
Normalised mean-squared error on a held-out design matrix.
- Parameters:
Phi_test ((n_test, p) Array) – Design matrix evaluated on test data.
true_support (list[int]) – Ground-truth active indices.
true_coeffs (array-like) – Ground-truth coefficient vector (length
len(true_support)).inferred_support (list[int]) – Inferred active indices.
inferred_coeffs (array-like) – Inferred coefficient vector (length
len(inferred_support)).
- Returns:
\(\|\hat y - y\|^2 / \|y\|^2\).
- Return type:
float
Submodules¶
- SFI.inference.sparse.base module
- SFI.inference.sparse.beam module
- SFI.inference.sparse.greedy module
- SFI.inference.sparse.hillclimb module
- SFI.inference.sparse.lasso module
- SFI.inference.sparse.metrics module
- SFI.inference.sparse.result module
- SFI.inference.sparse.scorer module
- SFI.inference.sparse.stlsq module