SFI.statefunc.memhint module

class SFI.statefunc.memhint.MemHint(per_sample_bytes=0, persistent_bytes=0)

Bases: object

Conservative memory footprint per SINGLE sample.

- per_sample_bytes scales with the number of samples.
- persistent_bytes is model state / CSR / weights and does NOT scale with samples.

Parameters:

per_sample_bytes (int)
persistent_bytes (int)

per_sample_bytes: int = 0

persistent_bytes: int = 0

scaled(k)

Scale only the transient part (per_sample_bytes) by k, leaving persistent_bytes unchanged. Useful if you ever convert a per-sample hint to total bytes.

Parameters:

k (int)

Return type:

MemHint
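
As a rough illustration of the split between transient and persistent memory, the sketch below re-creates the behaviour described above. MemHintSketch and its total_bytes helper are hypothetical stand-ins written for this example, not the library's source; the only assumption taken from the text is that scaled(k) multiplies per_sample_bytes and leaves persistent_bytes alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemHintSketch:
    """Hypothetical stand-in for MemHint (illustration only)."""

    per_sample_bytes: int = 0
    persistent_bytes: int = 0

    def scaled(self, k: int) -> "MemHintSketch":
        # Only the transient, per-sample part grows with the sample count;
        # persistent state (weights, CSR structures) is counted once.
        return MemHintSketch(self.per_sample_bytes * k, self.persistent_bytes)

    def total_bytes(self) -> int:
        # Hypothetical helper: collapse a hint into a single byte count.
        return self.per_sample_bytes + self.persistent_bytes


hint = MemHintSketch(per_sample_bytes=4096, persistent_bytes=1_000_000)
batch = hint.scaled(32)  # estimate for a batch of 32 samples
```

Here batch.per_sample_bytes is 32 times the single-sample figure, while the persistent part is carried over unchanged.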

class SFI.statefunc.memhint.SampleMeta(P=None, K=None, has_v=None, has_mask=None)

Bases: object

Optional single-sample context to refine estimates.

P: number of particles in ONE sample (for nodes with particle axes, pdepth>0).
K: arity for interaction gathers when fixed/known (pairs=2, etc.).
has_v: whether a velocity block will be present.
has_mask: whether a boolean mask participates in the call.

Parameters:

P (int | None)
K (int | None)
has_v (bool | None)
has_mask (bool | None)

K: int | None = None

P: int | None = None

static from_arrays(x=None, v=None, mask=None)

Try to recover P from provided single-sample arrays without allocating anything. Heuristics:

If x has at least 2 dims and looks like (…, dim) with a leading particle axis, we guess P from the axis before dim. If ambiguous, we leave P=None.

v and mask only toggle flags; they don’t change P.

Return type:

SampleMeta
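
The heuristic can be sketched as follows. This is a minimal illustration, not the library's implementation: guess_particle_count and sample_flags are hypothetical names, and the sketch assumes the only ambiguous case is an input with fewer than two dimensions (the real method may apply stricter checks). It only reads the shape attribute, so nothing is allocated.

```python
from types import SimpleNamespace
from typing import Any, Optional


def guess_particle_count(x: Any) -> Optional[int]:
    """Hypothetical helper: read P from a single-sample array's shape."""
    shape = getattr(x, "shape", None)
    # With fewer than 2 dims there is no axis before `dim` to read.
    if shape is None or len(shape) < 2:
        return None
    # Assumed layout (..., P, dim): the particle axis sits just before dim.
    return int(shape[-2])


def sample_flags(x=None, v=None, mask=None) -> dict:
    # v and mask only toggle flags; they never influence P.
    return {
        "P": guess_particle_count(x),
        "has_v": v is not None,
        "has_mask": mask is not None,
    }


# Any object exposing `.shape` works, so no array data is needed here.
x = SimpleNamespace(shape=(128, 3))
meta = sample_flags(x=x, mask=SimpleNamespace(shape=(128,)))
```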

has_mask: bool | None = None

has_v: bool | None = None

SFI.statefunc.memhint.broadcast_extra_bytes_for_children(*, children, dtype, particle_size)

When children have different particle depths, lower-depth outputs are broadcast to match the maximum depth. Materializing that broadcast costs memory. We conservatively account for an additional slab equal to (P^Δ - 1) times the child’s OUTPUT bytes, where Δ is the gap between the maximum particle depth and that child’s own depth.

Parameters:

children (Iterable)
particle_size (int | None)

Return type:

int
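
A worked example of the (P^Δ - 1) rule may help. The function below is a hypothetical sketch, not the library's code; it assumes Δ is the difference between the largest particle depth among the children and the child's own depth, and it takes each child's output bytes and depth as plain integers.

```python
def broadcast_extra_bytes(child_out_bytes: int, child_pdepth: int,
                          max_pdepth: int, P: int) -> int:
    """Extra bytes to materialize one child's output at the max depth."""
    delta = max_pdepth - child_pdepth  # assumed meaning of the Δ in the text
    if delta <= 0:
        return 0  # the child is already at the maximum depth
    # The broadcast result holds P**delta copies; one of them already exists.
    return (P ** delta - 1) * child_out_bytes


# (output bytes, particle depth) for three hypothetical float32 children
children = [(400, 1), (4, 0), (40_000, 2)]
P = 100
max_depth = max(depth for _, depth in children)
extra = sum(broadcast_extra_bytes(b, d, max_depth, P) for b, d in children)
```

With P = 100, the depth-1 child adds 99 * 400 bytes, the depth-0 child adds 9999 * 4 bytes, and the depth-2 child adds nothing. With P = 1 the overhead is zero for every child.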

SFI.statefunc.memhint.default_leaf_hint(node, *, dtype, particle_size, mode)

Default for leaf-like nodes: count only the output buffer per sample. Composite nodes will also include their children.

Parameters:

node (_HasContract)
particle_size (int | None)
mode (str)

Return type:

MemHint

SFI.statefunc.memhint.default_op_hint(node, *, children, dtype, particle_size, mode)

Default for composite ops: the sum of the children’s hints, plus this node’s own output buffer, plus broadcast overhead. We assume all child outputs coexist in memory while the op constructs its result.

Parameters:

node (_HasContract)
children (Iterable)
particle_size (int | None)
mode (str)

Return type:

MemHint
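
The composition rule can be sketched with plain integers. Hint and op_hint_sketch are hypothetical names used only for this example; in particular, summing the children's persistent bytes is an assumption, since the text above only describes the transient part.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Hint:
    """Hypothetical stand-in for MemHint (illustration only)."""

    per_sample_bytes: int = 0
    persistent_bytes: int = 0


def op_hint_sketch(child_hints: Iterable[Hint], own_output_bytes: int,
                   broadcast_bytes: int = 0) -> Hint:
    child_hints = list(child_hints)
    # Child outputs are assumed to be alive at the same time, so they add up.
    transient = sum(h.per_sample_bytes for h in child_hints)
    # Assumption: persistent state of the children is simply accumulated.
    persistent = sum(h.persistent_bytes for h in child_hints)
    return Hint(transient + own_output_bytes + broadcast_bytes, persistent)


# Two leaf-like children: each leaf counts only its own output buffer.
leaves = [Hint(per_sample_bytes=400), Hint(per_sample_bytes=4, persistent_bytes=2048)]
op = op_hint_sketch(leaves, own_output_bytes=40_000, broadcast_bytes=396)
```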

SFI.statefunc.memhint.inflate_for_grad(hint, *, factor=2.0)

Blanket inflation for gradient-mode nodes to account for tangents/tapes. Keep it simple and conservative; adjust locally if you get better bounds.

Parameters:

hint (MemHint)
factor (float)

Return type:

MemHint
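
A minimal sketch of such a blanket inflation is shown below. It is not the library's implementation: Hint and inflate_sketch are hypothetical, and two choices are assumptions made for the example, namely that only the transient per-sample part is inflated and that fractional results are rounded up to stay conservative.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Hint:
    """Hypothetical stand-in for MemHint (illustration only)."""

    per_sample_bytes: int = 0
    persistent_bytes: int = 0


def inflate_sketch(hint: Hint, *, factor: float = 2.0) -> Hint:
    # Assumption: tangents and tapes scale with the per-sample buffers,
    # so only the transient part is multiplied; round up to stay safe.
    inflated = math.ceil(hint.per_sample_bytes * factor)
    return replace(hint, per_sample_bytes=inflated)


forward = Hint(per_sample_bytes=1001, persistent_bytes=500)
grad_default = inflate_sketch(forward)            # factor=2.0
grad_tighter = inflate_sketch(forward, factor=1.5)
```

A node with a better bound could pass a smaller factor, as in the second call.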

SFI.statefunc.memhint.itemsize_of(dtype)

Return the dtype's itemsize in bytes as an int; defaults to the itemsize of float32 (4 bytes) when dtype is None.

Return type:

int

SFI.statefunc.memhint.output_bytes_per_sample(node, *, dtype, particle_size)

Translate element count to bytes using dtype itemsize.

Parameters:

node (_HasContract)
particle_size (int | None)

Return type:

int

SFI.statefunc.memhint.output_elems_per_sample(node, *, particle_size)

Count output elements of a node for ONE sample from its static contract.

Output shape suffix is (*particle_axes, *rank_axes, n_features) with particle_axes = (P,) * pdepth and rank_axes = (dim,) * rank.

If particle_size is None, treat P = 1 (safe lower bound for batch picking).

Parameters:

node (_HasContract)
particle_size (int | None)

Return type:

int
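
The element count follows directly from the shape suffix: P^pdepth * dim^rank * n_features. The sketch below illustrates this and the conversion to bytes. Contract is a hypothetical stand-in for whatever the node's static contract exposes (the attribute names pdepth, rank, dim and n_features are taken from the description above, but their actual location on the node is an assumption), and itemsize is passed explicitly rather than derived from a dtype.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contract:
    """Hypothetical static contract of a node (illustration only)."""

    pdepth: int      # number of particle axes
    rank: int        # number of tensor-rank axes
    dim: int         # spatial dimension
    n_features: int  # trailing feature axis


def elems_per_sample(c: Contract, particle_size: Optional[int] = None) -> int:
    # Unknown particle count falls back to P = 1, a lower bound.
    P = 1 if particle_size is None else particle_size
    # Shape suffix (*(P,)*pdepth, *(dim,)*rank, n_features) multiplied out.
    return (P ** c.pdepth) * (c.dim ** c.rank) * c.n_features


def bytes_per_sample(c: Contract, particle_size: Optional[int] = None,
                     itemsize: int = 4) -> int:
    # itemsize=4 corresponds to float32.
    return elems_per_sample(c, particle_size) * itemsize


# A per-particle vector field with 2 features in 3 dimensions.
c = Contract(pdepth=1, rank=1, dim=3, n_features=2)
n_elems = elems_per_sample(c, particle_size=100)
n_bytes = bytes_per_sample(c, particle_size=100)
```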

SFI.statefunc.memhint.resolve_P(particle_size, sample)

Pick P from explicit particle_size, else from SampleMeta or array, else None.

Parameters:

particle_size (int | None)

Return type:

int | None
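
The fallback order can be sketched as below. resolve_p_sketch is a hypothetical re-creation for illustration; it assumes a SampleMeta-like object is recognised by a P attribute and an array-like object by a shape attribute with layout (..., P, dim). The real function may distinguish the two differently.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_p_sketch(particle_size: Optional[int],
                     sample: Any = None) -> Optional[int]:
    # 1. An explicit particle_size always wins.
    if particle_size is not None:
        return int(particle_size)
    if sample is None:
        return None
    # 2. SampleMeta-like object carrying a known P (assumed attribute name).
    p = getattr(sample, "P", None)
    if p is not None:
        return int(p)
    # 3. Array-like object: read the axis before dim, if there is one.
    shape = getattr(sample, "shape", None)
    if shape is not None and len(shape) >= 2:
        return int(shape[-2])
    # 4. Nothing usable was found.
    return None


meta_like = SimpleNamespace(P=128)
array_like = SimpleNamespace(shape=(50, 3))
resolved = [
    resolve_p_sketch(64, meta_like),
    resolve_p_sketch(None, meta_like),
    resolve_p_sketch(None, array_like),
    resolve_p_sketch(None, None),
]
```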