SFI.inference.sparsity module¶
SFI.inference.sparsity — Sparse model selection¶
Public façade that re-exports the key symbols from
SFI.inference.sparse. User code should import from here or from
the SFI.inference namespace:
from SFI.inference import SparseScorer, SparsityResult
from SFI.inference.sparse import BeamSearchStrategy, LassoStrategy
The heavy lifting lives in SFI.inference.sparse; this module keeps
the import path short and backward-compatible.
- class SFI.inference.sparsity.BeamSearchStrategy(*, beam_width=20, aic_patience=2, report_time=False)[source]¶
Bases:
SparsityStrategyBidirectional beam search over the support lattice.
- Parameters:
beam_width (int, default 20) – Maximum number of candidate models retained per cardinality.
aic_patience (int, default 2) – Stop early when AIC has strictly declined for this many consecutive closed cardinality levels.
report_time (bool, default False) – If True, log elapsed time and number of explored supports.
- name: str = 'beam'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, init_supports=None, **_kwargs)[source]¶
Execute the beam search.
- Parameters:
scorer (SparseScorer)
max_k (int) – Maximum model size to consider.
init_supports (list of tuples of int, optional) – Seed supports to inject into the initial frontier. Each entry is a tuple of basis-function indices. Useful for seeding the search with a known good model (e.g. the true support) so that the Pareto front is guaranteed to include it.
- Return type:
- class SFI.inference.sparsity.GreedyStepwiseStrategy(*, direction='forward', report_time=False)[source]¶
Bases:
SparsityStrategyForward / backward / bidirectional stepwise selection.
- Parameters:
direction (
"forward"|"backward"|"both", default"forward") –Which direction(s) to search.
"forward": start empty, add one feature at a time."backward": start full, drop one feature at a time."both": run both directions and merge the Pareto fronts.
report_time (bool, default False) – Log elapsed wall-clock time when done.
- name: str = 'greedy'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, **_kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- class SFI.inference.sparsity.LassoStrategy(*, alpha=None, n_alphas=50, max_iter=1000, tol=1e-07, report_time=False)[source]¶
Bases:
SparsityStrategyCoordinate-descent LASSO on the normal equations.
- Parameters:
alpha (float or None) – Fixed regularisation strength. If None (default), an automatic log-spaced path from \(\alpha_{\max}\) (where the solution is entirely zero) down to \(10^{-4}\,\alpha_{\max}\) is constructed.
n_alphas (int, default 50) – Number of \(\alpha\) values in the automatic path.
max_iter (int, default 1000) – Maximum coordinate-descent iterations per \(\alpha\).
tol (float, default 1e-7) – Convergence tolerance (max absolute change in any coefficient).
report_time (bool, default False) – Log elapsed wall-clock time when done.
- name: str = 'lasso'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, **_kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- class SFI.inference.sparsity.STLSQStrategy(*, threshold=None, mode='relative', max_iter=50, n_thresholds=30, report_time=False)[source]¶
Bases:
SparsityStrategySequential Thresholded Least Squares (SINDy-style).
- Parameters:
threshold (float or None) – Single threshold value. If None, an automatic log-spaced sweep from a small fraction of the maximum coefficient up to the maximum is performed.
mode (
"relative"|"absolute", default"relative") – Whetherthresholdis interpreted as a fraction of \(\max|C|\) (relative) or a fixed value (absolute).max_iter (int, default 50) – Maximum STLSQ iterations per threshold.
n_thresholds (int, default 30) – Number of thresholds in the automatic sweep (used only when
threshold is None).report_time (bool, default False) – Log elapsed wall-clock time when done.
- name: str = 'stlsq'¶
Short identifier used in
SparsityResult.method.
- run(scorer, *, max_k, **_kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- class SFI.inference.sparsity.SparseScorer(*, M, G, norm_X2=0.0, n=1, pinv_tol=1e-08, use_residuals=False)[source]¶
Bases:
objectScore candidate supports by solving the restricted normal equations.
- Parameters:
M ((p,) Array) – Pre-computed moment vector (cross-moments between data and basis functions).
G ((p, p) Array) – Normal-equations matrix. May be non-symmetric (e.g. when using Itô-shift moment estimators). Symmetry is detected automatically and a Cholesky fast-path is used when possible.
norm_X2 (float, default 0.0) – Sum of squared observations. Only used when
use_residuals=True.n (int, default 1) – Sample count prefactor used in the residual-based information gain \(\tfrac{1}{2} n\,\log(\lVert X\rVert^2 / \mathrm{RSS})\). Only used when
use_residuals=True.pinv_tol (float, default 1e-8) – Tolerance for the diagonal preconditioning floor.
use_residuals (bool, default False) – If True, the information gain is computed via the residual sum-of-squares expression instead of the explicit quadratic form.
- info_and_coeffs(B)[source]¶
Solve \(G_{BB}\,C_B = M_B\) and return the information gain.
- Parameters:
B ((k,) int Array) – Indices of the active basis functions (the support).
- Returns:
info (scalar Array) – \(\tfrac{1}{2}\,C_B^\top M_B\) (or the RSS variant when
use_residuals=True).C_B ((k,) Array) – Maximum-likelihood coefficients for the restricted support.
- Return type:
Tuple[Array, Array]
- vmap_info(batch)[source]¶
Score a batch of supports of the same cardinality.
The batch is padded to the next power-of-2 length so that JAX’s compilation cache stays bounded (≤ ~12 unique shapes per support size k instead of one per distinct batch count).
- Parameters:
batch ((n_supports, k) int Array) – Each row is a sorted support of length k.
- Returns:
infos ((n_supports,) Array)
coeffs ((n_supports, k) Array)
- Return type:
Tuple[Array, Array]
- class SFI.inference.sparsity.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)[source]¶
Bases:
objectFrozen container for the output of a sparsity search.
- Variables:
p (int) – Total number of candidate basis functions.
total_info (float) – Information gain of the full (dense) model.
method (str) – Name of the strategy that produced this result (e.g.
"beam","greedy","stlsq","lasso").best_info_by_k (list[float]) –
best_info_by_k[k]is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are-inf.best_support_by_k (list[list[int]]) – The support achieving
best_info_by_k[k].best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.
second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all
-infif the strategy does not track runner-ups.second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.
- Parameters:
p (int)
total_info (float)
method (str)
best_info_by_k (list)
best_support_by_k (list)
best_coeffs_by_k (list)
second_info_by_k (list)
second_support_by_k (list)
- all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)[source]¶
Compute all information criteria and optionally compare to ground truth.
- Parameters:
p_param (float) – PASTIS significance level.
tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.
gamma (float, default 0.5) – EBIC tuning parameter.
true_support (optional) – Ground-truth support and coefficients for overlap metrics.
true_coeffs (optional) – Ground-truth support and coefficients for overlap metrics.
Phi_test (optional Array) – Held-out design matrix for predictive NMSE.
verbose (bool) – If True, log a summary table at INFO level.
- Returns:
Keyed by IC name, each value is a dict with
k,support,score,coeffs, and optionally overlap and predictive-NMSE entries.- Return type:
dict
- best_coeffs_by_k: list¶
- best_info_by_k: list¶
- best_support_by_k: list¶
- method: str¶
- p: int¶
- second_info_by_k: list¶
- second_support_by_k: list¶
- select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)[source]¶
Return the support that maximises a given information criterion.
[Model selection] Information criteria for sparse model selection
\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, and \(\gamma \in [0,1]\) controls EBIC stringency.
References
AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.
BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).
EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.
PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”
SIC — Unpublished (Ronceray).
- Parameters:
name (
"AIC"|"BIC"|"EBIC"|"PASTIS"|"SIC") – Information criterion to maximise.p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.
tau (float or None) – Total trajectory time. Required for BIC and EBIC.
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is
"EBIC".
- Returns:
k_star (int) – Selected model size.
support (list[int]) – Basis-function indices of the chosen model.
score (float) – Value of the information criterion at
k_star.coeffs (Array or None) – Coefficient vector for the selected support.
- Return type:
Tuple[int, List[int], float, Array | None]
- total_info: float¶
- class SFI.inference.sparsity.SparsityStrategy[source]¶
Bases:
ABCAbstract base class for sparsity search strategies.
Subclasses must implement
run().- name: str = 'base'¶
Short identifier used in
SparsityResult.method.
- abstractmethod run(scorer, *, max_k, **kwargs)[source]¶
Execute the search and return a
SparsityResult.- Parameters:
scorer (SparseScorer) – Provides
info_and_coeffs/vmap_infofor evaluating candidate supports.max_k (int) – Maximum model size to explore.
- Return type:
- SFI.inference.sparsity.overlap_metrics(true_support, pred_support)[source]¶
Compare predicted support to the ground truth.
- Parameters:
true_support (list[int]) – Indices of the true and predicted active basis functions.
pred_support (list[int]) – Indices of the true and predicted active basis functions.
- Returns:
Keys:
TP,FP,FN,prec,rec,exact.- Return type:
dict
- SFI.inference.sparsity.predictive_nmse(Phi_test, true_support, true_coeffs, inferred_support, inferred_coeffs)[source]¶
Normalised mean-squared error on a held-out design matrix.
- Parameters:
Phi_test ((n_test, p) Array) – Design matrix evaluated on test data.
true_support (list[int]) – Ground-truth active indices.
true_coeffs (array-like) – Ground-truth coefficient vector (length
len(true_support)).inferred_support (list[int]) – Inferred active indices.
inferred_coeffs (array-like) – Inferred coefficient vector (length
len(inferred_support)).
- Returns:
\(\|\hat y - y\|^2 / \|y\|^2\).
- Return type:
float