SFI.inference.sparse.result module

SFI.inference.sparse.result — Sparsity result container

SparsityResult is the return type of every search strategy. It stores the Pareto front (the best information gain, support, and coefficients found for each cardinality k) and provides information-criterion selection.

Supported information criteria

  • AIC — Akaike (1974), penalty k.

  • BIC — Schwarz (1978), penalty (k/2) ln τ. Uses the continuous-time formulation of Gerardos & Ronceray (2025).

  • EBIC — Chen & Chen (2008), BIC penalty plus 2 γ ln C(n₀, k).

  • PASTIS — Gerardos & Ronceray (2025), penalty k ln(n₀/p₀).

  • SIC — Secret Information Criterion (unpublished, Ronceray), penalty k ln(I_total).

class SFI.inference.sparse.result.SparsityResult(p, total_info, method, best_info_by_k=<factory>, best_support_by_k=<factory>, best_coeffs_by_k=<factory>, second_info_by_k=<factory>, second_support_by_k=<factory>)[source]

Bases: object

Frozen container for the output of a sparsity search.

Variables:
  • p (int) – Total number of candidate basis functions.

  • total_info (float) – Information gain of the full (dense) model.

  • method (str) – Name of the strategy that produced this result (e.g. "beam", "greedy", "stlsq", "lasso").

  • best_info_by_k (list[float]) – best_info_by_k[k] is the highest information gain found among all explored supports of cardinality k. Unexplored cardinalities are -inf.

  • best_support_by_k (list[list[int]]) – The support achieving best_info_by_k[k].

  • best_coeffs_by_k (list[Array | None]) – The corresponding coefficient vector.

  • second_info_by_k (list[float]) – Second-best information gain per k (for robustness diagnostics). May be all -inf if the strategy does not track runners-up.

  • second_support_by_k (list[list[int]]) – Support achieving the second-best info per k.

Parameters:
  • p (int)

  • total_info (float)

  • method (str)

  • best_info_by_k (list)

  • best_support_by_k (list)

  • best_coeffs_by_k (list)

  • second_info_by_k (list)

  • second_support_by_k (list)
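
The field layout above can be illustrated with a small stand-in. This is a sketch only: ToySparsityResult is a hypothetical class written for this example, not the SFI class. The use of a frozen dataclass, the list length of p + 1 (one slot per cardinality from 0 to p), and all numeric values are assumptions.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToySparsityResult:
    """Hypothetical stand-in mirroring the documented fields; not the SFI class."""
    p: int
    total_info: float
    method: str
    best_info_by_k: list = field(default_factory=list)
    best_support_by_k: list = field(default_factory=list)
    best_coeffs_by_k: list = field(default_factory=list)
    second_info_by_k: list = field(default_factory=list)
    second_support_by_k: list = field(default_factory=list)


# Made-up values: index k holds the best result found at cardinality k,
# and k = 2 was never explored, so its information gain is -inf.
toy = ToySparsityResult(
    p=4,
    total_info=12.0,
    method="greedy",
    best_info_by_k=[0.0, 7.5, -math.inf, 11.0, 12.0],
    best_support_by_k=[[], [2], [], [0, 2, 3], [0, 1, 2, 3]],
    best_coeffs_by_k=[None, None, None, None, None],
)

# Cardinalities that the (toy) search actually visited.
explored = [k for k, info in enumerate(toy.best_info_by_k) if math.isfinite(info)]
```

Filtering on finite values, as in the last line, is one way to skip unexplored cardinalities before comparing models.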

all_ic(*, p_param=0.001, tau=None, gamma=0.5, true_support=None, true_coeffs=None, Phi_test=None, verbose=True)[source]

Compute all information criteria and optionally compare to ground truth.

Parameters:
  • p_param (float, default 1e-3) – PASTIS significance level \(p_0\).

  • tau (float or None) – Total trajectory time. If provided, BIC and EBIC are included; otherwise they are skipped.

  • gamma (float, default 0.5) – EBIC tuning parameter.

  • true_support (optional) – Ground-truth support, used for the overlap metrics.

  • true_coeffs (optional) – Ground-truth coefficients, used alongside true_support for the overlap metrics.

  • Phi_test (optional Array) – Held-out design matrix for predictive NMSE.

  • verbose (bool, default True) – If True, log a summary table at INFO level.

Returns:

A dict keyed by IC name. Each value is a dict with k, support, score, coeffs, and optionally overlap and predictive-NMSE entries.

Return type:

dict
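
As a rough illustration of how such a return value might be consumed, the sketch below groups criteria by the model size they select. The dict is a made-up stand-in, not real output. The string keys "k", "support", "score", and "coeffs" are assumed from the description above, and group_by_selected_k is a hypothetical helper.

```python
from collections import defaultdict

# Made-up stand-in for the dict returned by all_ic(); values are invented.
mock_result = {
    "AIC": {"k": 3, "support": [0, 2, 3], "score": 8.0, "coeffs": None},
    "PASTIS": {"k": 1, "support": [2], "score": 1.2, "coeffs": None},
    "SIC": {"k": 1, "support": [2], "score": 5.0, "coeffs": None},
}


def group_by_selected_k(results):
    """Map each selected model size to the criteria that chose it."""
    groups = defaultdict(list)
    for name, entry in results.items():
        groups[entry["k"]].append(name)
    return dict(groups)


agreement = group_by_selected_k(mock_result)
```

When several criteria land on the same k, that agreement can serve as an informal robustness check. BIC and EBIC are absent from the mock because no tau was supplied in this scenario.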

best_coeffs_by_k: list
best_info_by_k: list
best_support_by_k: list
method: str
p: int
second_info_by_k: list
second_support_by_k: list
select_by_ic(name, *, p_param=0.001, tau=None, gamma=0.5)[source]

Return the support that maximises a given information criterion.

[Model selection] Information criteria for sparse model selection

\[\begin{split}\text{AIC}(k) &= \mathcal{I}(k) - k \\ \text{BIC}(k) &= \mathcal{I}(k) - \tfrac{1}{2}\,k\,\ln\tau \\ \text{EBIC}(k) &= \text{BIC}(k) - 2\gamma\,\ln\binom{n_0}{k} \\ \text{PASTIS}(k) &= \mathcal{I}(k) - k\,\ln(n_0 / p_0) \\ \text{SIC}(k) &= \mathcal{I}(k) - k\,\ln(\mathcal{I}_{\text{total}})\end{split}\]

where \(\mathcal{I}(k)\) is the log-likelihood gain with k basis terms out of \(n_0\) candidates, \(\tau\) is the total trajectory time, \(p_0\) is the PASTIS significance level, and \(\gamma \in [0,1]\) controls EBIC stringency.
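
The formulas above can be transcribed directly into Python. The following is a minimal sketch, not the library's implementation: ic_score and its argument names are hypothetical, and raising ValueError when tau is missing is an assumption about how a missing trajectory time might be handled.

```python
import math


def ic_score(name, info_k, k, *, n0, total_info, tau=None, p0=1e-3, gamma=0.5):
    """Return I(k) minus the penalty of the named criterion (illustrative only)."""
    if name == "AIC":
        return info_k - k
    if name in ("BIC", "EBIC"):
        if tau is None:
            # Assumption: BIC-type criteria cannot be evaluated without tau.
            raise ValueError(f"{name} requires the total trajectory time tau")
        bic = info_k - 0.5 * k * math.log(tau)
        if name == "BIC":
            return bic
        # EBIC adds a combinatorial term counting the supports of size k.
        return bic - 2.0 * gamma * math.log(math.comb(n0, k))
    if name == "PASTIS":
        return info_k - k * math.log(n0 / p0)
    if name == "SIC":
        return info_k - k * math.log(total_info)
    raise ValueError(f"unknown criterion: {name!r}")


# Example: a 3-term model out of 5 candidates with information gain 10.
aic = ic_score("AIC", 10.0, 3, n0=5, total_info=20.0)
```

Every criterion shares the same first term, so the criteria differ only in how steeply the penalty grows with k.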

References

  • AIC — Akaike, H. (1974). “A new look at the statistical model identification.” IEEE Trans. Automat. Control, 19(6), 716–723.

  • BIC — Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Statist., 6(2), 461–464. The continuous-time formulation \(\tfrac{k}{2}\ln\tau\) follows from the Laplace approximation of the SDE marginal likelihood (Gerardos & Ronceray, 2025).

  • EBIC — Chen, J. & Chen, Z. (2008). “Extended Bayesian information criteria for model selection with large model spaces.” Biometrika, 95(3), 759–771.

  • PASTIS — Gerardos, A. & Ronceray, P. (2025). “Principled model selection for stochastic dynamics.”

  • SIC — Unpublished (Ronceray).

Parameters:
  • name ("AIC" | "BIC" | "EBIC" | "PASTIS" | "SIC") – Information criterion to maximise.

  • p_param (float, default 1e-3) – Significance level \(p_0\) for the PASTIS penalty.

  • tau (float or None) – Total trajectory time. Required for BIC and EBIC.

  • gamma (float, default 0.5) – EBIC tuning parameter (\(\gamma \in [0,1]\)). Only used when name is "EBIC".

Returns:

  • k_star (int) – Selected model size.

  • support (list[int]) – Basis-function indices of the chosen model.

  • score (float) – Value of the information criterion at k_star.

  • coeffs (Array or None) – Coefficient vector for the selected support.

Return type:

Tuple[int, List[int], float, Array | None]
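
The selection step amounts to an argmax of the penalised score over the explored cardinalities. The sketch below shows that idea on plain lists. It is an assumption-laden illustration rather than the actual method body: select_by_penalty is a hypothetical helper, the input values are made up, and keeping the smallest k on ties is an arbitrary choice.

```python
import math


def select_by_penalty(best_info_by_k, best_support_by_k, best_coeffs_by_k, penalty):
    """Pick the k maximising info(k) - penalty(k); returns a 4-tuple."""
    best = None
    for k, info in enumerate(best_info_by_k):
        if not math.isfinite(info):
            continue  # unexplored cardinality, stored as -inf
        score = info - penalty(k)
        # Strict comparison keeps the smallest k when scores tie (assumption).
        if best is None or score > best[2]:
            best = (k, best_support_by_k[k], score, best_coeffs_by_k[k])
    if best is None:
        raise ValueError("no explored cardinality to select from")
    return best


# Made-up Pareto front with k = 2 unexplored; AIC uses penalty(k) = k.
infos = [0.0, 7.5, -math.inf, 11.0, 11.4]
supports = [[], [2], [], [0, 2, 3], [0, 1, 2, 3]]
coeffs = [None, None, None, None, None]
k_star, support, score, chosen_coeffs = select_by_penalty(
    infos, supports, coeffs, penalty=lambda k: k
)
```

With these numbers the AIC scores are 0.0, 6.5, 8.0, and 7.4 for k = 0, 1, 3, and 4, so the three-term model is selected.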

total_info: float