SFI.inference.sparse.result module
SFI.inference.sparse.result: Sparsity result container
SparsityResult is the return type of every search strategy.
It stores the Pareto front (best info / support / coefficients per
cardinality k) and provides information-criterion selection.
Supported information criteria
AIC: Akaike (1974), penalty k.
BIC: Schwarz (1978), penalty (k/2) ln τ, using the continuous-time formulation of Gerardos & Ronceray (2025).
EBIC: Chen & Chen (2008), the BIC penalty plus 2 γ ln C(n₀, k).
PASTIS: Gerardos & Ronceray (2025), penalty k ln(n₀/p₀).
SIC: Secret Information Criterion (unpublished, Ronceray), penalty k ln(I_total).
- class SFI.inference.sparse.result.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)
Bases: object
Frozen container for the output of a sparsity search.
- Variables:
p (int) – Total number of candidate basis functions.
total_info (float) – Information gain of the full (dense) model.
method (str) – Name of the strategy that produced this result (e.g. "beam", "greedy", "stlsq", "lasso").
best_info_by_k (list[float]) – best_info_by_k[k] is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are -inf.
best_support_by_k (list[list[int]]) – The support achieving best_info_by_k[k].
best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.
second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all -inf if the strategy does not track runners-up.
second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.
- Parameters:
p (int)
total_info (float)
method (str)
best_info_by_k (list)
best_support_by_k (list)
best_coeffs_by_k (list)
second_info_by_k (list)
second_support_by_k (list)
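The best and second-best fields described above can be illustrated with a small bookkeeping sketch. This is not the library's implementation: `build_front` and its `explored` input (a list of `(support, info)` pairs) are hypothetical names, and the sketch assumes the per-k lists have length p + 1 so that index k runs from 0 to p.

```python
import math


def build_front(p, explored):
    """Track the best and second-best information gain per cardinality k.

    `explored` is a hypothetical iterable of (support, info) pairs.
    Lists are assumed to have p + 1 entries (k = 0 .. p).
    """
    best_info = [-math.inf] * (p + 1)
    best_support = [[] for _ in range(p + 1)]
    second_info = [-math.inf] * (p + 1)
    second_support = [[] for _ in range(p + 1)]

    for support, info in explored:
        k = len(support)
        indices = sorted(support)
        if info > best_info[k]:
            # The previous best becomes the runner-up for this k.
            second_info[k], second_support[k] = best_info[k], best_support[k]
            best_info[k], best_support[k] = info, indices
        elif info > second_info[k]:
            second_info[k], second_support[k] = info, indices

    return best_info, best_support, second_info, second_support


# Four explored supports among p = 3 candidate basis functions.
explored = [([0], 1.0), ([1], 3.0), ([0, 1], 3.5), ([2], 2.0)]
best_info, best_support, second_info, second_support = build_front(3, explored)
```

Cardinalities that were never explored (here k = 0 and k = 3) keep the value -inf, matching the convention stated for best_info_by_k.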
- all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)
Compute all information criteria and optionally compare to ground truth.
- Parameters:
p_param (float) – PASTIS significance level.
tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.
gamma (float, default 0.5) – EBIC tuning parameter.
true_support (optional) – Ground-truth support, used for overlap metrics.
true_coeffs (optional) – Ground-truth coefficients, used together with true_support for comparison against the selected model.
Phi_test (optional Array) – Held-out design matrix for predictive NMSE.
verbose (bool) – If True, log a summary table at INFO level.
- Returns:
Keyed by IC name; each value is a dict with k, support, score, coeffs, and optionally overlap and predictive-NMSE entries.
- Return type:
dict
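The exact overlap metrics and the NMSE convention used by all_ic are not spelled out here. As a rough illustration only, one plausible way to compare a selected model with ground truth is sketched below. The function names, the returned keys, the use of full-length coefficient vectors, and the row-major layout of the held-out design matrix are all assumptions, not the library's API.

```python
def support_overlap(selected, truth):
    """Count agreements and disagreements between two supports (hypothetical helper)."""
    sel, tru = set(selected), set(truth)
    return {
        "true_pos": len(sel & tru),    # terms correctly selected
        "false_pos": len(sel - tru),   # selected but absent from the truth
        "false_neg": len(tru - sel),   # present in the truth but missed
        "exact": sel == tru,
    }


def predictive_nmse(phi_test, coeffs, true_coeffs):
    """Normalised mean squared error of predictions on held-out rows.

    Assumes `phi_test` is a list of rows and both coefficient vectors
    have one entry per candidate basis function.
    """
    def predict(c):
        return [sum(x * cj for x, cj in zip(row, c)) for row in phi_test]

    pred, ref = predict(coeffs), predict(true_coeffs)
    err = sum((a - b) ** 2 for a, b in zip(pred, ref))
    norm = sum(b ** 2 for b in ref)
    return err / norm


overlap = support_overlap([0, 2, 3], [0, 2])
phi = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
nmse = predictive_nmse(phi, [1.0, 0.0, 1.5], [1.0, 0.0, 2.0])
```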
- best_coeffs_by_k: list
- best_info_by_k: list
- best_support_by_k: list
- method: str
- p: int
- second_info_by_k: list
- second_support_by_k: list
- select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)
Return the support that maximises a given information criterion.
[Model selection] Information criteria for sparse model selection
\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]
where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, \(\gamma \in [0,1]\) controls EBIC stringency, and \(\mathcal{I}_{\text{total}}\) is the information gain of the full (dense) model (total_info).
References
AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.
BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).
EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.
PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”
SIC — Unpublished (Ronceray).
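The formulas above can be turned into a short, self-contained sketch that scores each cardinality and picks the maximiser. It is an illustration under stated assumptions, not the body of select_by_ic: `ic_score` and `select_k` are hypothetical names, \(n_0\) is taken to be the number of candidates p (the length of the per-k list minus one), and ties are resolved in favour of the smallest k.

```python
import math


def ic_score(name, k, info_k, *, n0, total_info, tau=None, gamma=0.5, p0=1e-3):
    """Information gain minus the penalty of the named criterion."""
    if name == "AIC":
        penalty = k
    elif name in ("BIC", "EBIC"):
        if tau is None:
            raise ValueError("tau is required for BIC and EBIC")
        penalty = 0.5 * k * math.log(tau)
        if name == "EBIC":
            # Extra term counting the number of supports of size k.
            penalty += 2.0 * gamma * math.log(math.comb(n0, k))
    elif name == "PASTIS":
        penalty = k * math.log(n0 / p0)
    elif name == "SIC":
        penalty = k * math.log(total_info)
    else:
        raise ValueError(f"unknown criterion: {name!r}")
    return info_k - penalty


def select_k(name, info_by_k, **kwargs):
    """Return (k_star, score) maximising the criterion over k."""
    n0 = len(info_by_k) - 1  # assumption: n0 equals the number of candidates
    scores = [ic_score(name, k, info, n0=n0, **kwargs)
              for k, info in enumerate(info_by_k)]
    k_star = max(range(len(scores)), key=scores.__getitem__)
    return k_star, scores[k_star]


# Made-up information gains for k = 0 .. 4 (illustrative numbers only).
info_by_k = [0.0, 40.0, 75.0, 77.5, 78.2]
total = info_by_k[-1]

aic = select_k("AIC", info_by_k, total_info=total)
bic = select_k("BIC", info_by_k, total_info=total, tau=100.0)
ebic = select_k("EBIC", info_by_k, total_info=total, tau=100.0, gamma=0.5)
pastis = select_k("PASTIS", info_by_k, total_info=total, p0=1e-3)
sic = select_k("SIC", info_by_k, total_info=total)
```

With these made-up gains, the heavier PASTIS and SIC penalties settle on a smaller model than AIC, BIC, and EBIC do, which is the qualitative behaviour the penalties are meant to produce.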
- Parameters:
name ("AIC" | "BIC" | "EBIC" | "PASTIS" | "SIC") – Information criterion to maximise.
p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.
tau (float or None) – Total trajectory time. Required for BIC and EBIC.
gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is "EBIC".
- Returns:
k_star (int) – Selected model size.
support (list[int]) – Basis-function indices of the chosen model.
score (float) – Value of the information criterion at k_star.
coeffs (Array or None) – Coefficient vector for the selected support.
- Return type:
Tuple[int, List[int], float, Array | None]
- total_info: float