SFI.statefunc.memhint module
- class SFI.statefunc.memhint.MemHint(per_sample_bytes=0, persistent_bytes=0)
Bases: object
Conservative memory footprint per SINGLE sample.
- per_sample_bytes scales with the number of samples.
- persistent_bytes is model state / CSR / weights; it does NOT scale with samples.
- Parameters:
per_sample_bytes (int)
persistent_bytes (int)
- per_sample_bytes: int = 0
- persistent_bytes: int = 0
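As an illustration of how the two fields combine, the sketch below picks the largest batch that fits a memory budget. `MemHintSketch` and `max_batch_size` are hypothetical stand-ins written for this example, not part of SFI; the only thing taken from the documentation is that one term scales with the sample count and the other does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemHintSketch:
    """Stand-in mirroring the two documented fields; not the SFI class."""
    per_sample_bytes: int = 0
    persistent_bytes: int = 0


def max_batch_size(hint, budget_bytes):
    """Largest sample count whose estimated footprint fits in budget_bytes.

    Estimated footprint = persistent_bytes + n_samples * per_sample_bytes.
    """
    free = budget_bytes - hint.persistent_bytes
    if free <= 0:
        return 0  # the persistent state alone exhausts the budget
    if hint.per_sample_bytes <= 0:
        raise ValueError("per_sample_bytes must be positive to bound the batch")
    return free // hint.per_sample_bytes


hint = MemHintSketch(per_sample_bytes=4096, persistent_bytes=1_000_000)
batch = max_batch_size(hint, budget_bytes=9_192_000)
print(batch)
```

Because the hint is meant to be conservative, a batch size chosen this way should err on the small side.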
- class SFI.statefunc.memhint.SampleMeta(P=None, K=None, has_v=None, has_mask=None)
Bases: object
Optional single-sample context to refine estimates.
- P: number of particles in ONE sample (for nodes with particle axes, pdepth>0).
- K: arity for interaction gathers when fixed/known (pairs=2, etc.).
- has_v: whether a velocity block will be present.
- has_mask: whether a boolean mask participates in the call.
- Parameters:
P (int | None)
K (int | None)
has_v (bool | None)
has_mask (bool | None)
- K: int | None = None
- P: int | None = None
- static from_arrays(x=None, v=None, mask=None)
Try to recover P from provided single-sample arrays without allocating anything. Heuristics:
If x has at least 2 dims and looks like (…, dim) with a leading particle axis, we guess P from the axis before dim. If ambiguous, we leave P=None.
v and mask only toggle flags; they don’t change P.
- Return type:
- has_mask: bool | None = None
- has_v: bool | None = None
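The from_arrays heuristic can be approximated as follows. This is a minimal sketch, not the SFI implementation: `guess_sample_meta` is a hypothetical name, the result is a plain dict rather than a SampleMeta, the only case treated as ambiguous is an `x` with fewer than two dimensions, and the flags are assumed to be simple "was this array provided" booleans.

```python
import numpy as np


def guess_sample_meta(x=None, v=None, mask=None):
    """Hypothetical stand-in for the documented heuristic; reads shapes only."""
    P = None
    shape = getattr(x, "shape", None)
    if shape is not None and len(shape) >= 2:
        # Assumed layout (..., P, dim): the axis just before dim is the particle axis.
        P = int(shape[-2])
    # v and mask only toggle flags; they never change P.
    return {"P": P, "has_v": v is not None, "has_mask": mask is not None}


x = np.zeros((5, 3))  # one sample: 5 particles in 3 dimensions
meta = guess_sample_meta(x, mask=np.ones(5, dtype=bool))
print(meta)
```

Only the `shape` attribute is inspected, so the guess itself allocates nothing.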
- SFI.statefunc.memhint.broadcast_extra_bytes_for_children(*, children, dtype, particle_size)
When children have different particle depths, lower-depth outputs are broadcast to match the max. Materializing that broadcast costs memory. We conservatively account for an additional slab equal to (P^Δ - 1) times the child’s OUTPUT bytes.
- Parameters:
children (Iterable)
particle_size (int | None)
- Return type:
int
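The (P^Δ - 1) rule can be worked through on plain numbers. In the sketch below each child is reduced to a hypothetical `(pdepth, output_bytes)` pair, Δ is the gap between a child's particle depth and the maximum depth among its siblings, and a missing `particle_size` is treated as P = 1. The function name and the P = 1 fallback are assumptions made for this example.

```python
def broadcast_extra_bytes_sketch(child_outputs, particle_size):
    """Extra bytes needed to materialize lower-depth outputs at the max depth.

    child_outputs: iterable of (pdepth, output_bytes) pairs, one per child.
    """
    children = list(child_outputs)
    P = 1 if particle_size is None else particle_size
    max_depth = max((depth for depth, _ in children), default=0)
    extra = 0
    for depth, out_bytes in children:
        delta = max_depth - depth            # particle axes this child is missing
        extra += (P ** delta - 1) * out_bytes
    return extra


# Three children at particle depths 0, 1 and 2, with P = 10 particles.
extra = broadcast_extra_bytes_sketch([(0, 8), (1, 80), (2, 800)], particle_size=10)
print(extra)  # (10**2 - 1) * 8 + (10**1 - 1) * 80 + 0 = 1512
```

A child already at the maximum depth has Δ = 0 and contributes nothing, which is why children of equal depth incur no broadcast overhead.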
- SFI.statefunc.memhint.default_leaf_hint(node, *, dtype, particle_size, mode)
Default for leaf-like nodes: count only the output buffer per sample. Composite nodes will also include their children.
- Parameters:
node (_HasContract)
particle_size (int | None)
mode (str)
- Return type:
- SFI.statefunc.memhint.default_op_hint(node, *, children, dtype, particle_size, mode)
Default for composite ops: sum(children.hint) + my output buffer + broadcast overhead. We assume child outputs coexist while the op constructs its result.
- Parameters:
node (_HasContract)
children (Iterable)
particle_size (int | None)
mode (str)
- Return type:
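The leaf and composite defaults compose recursively, which a toy tree makes concrete. `NodeSketch` and `per_sample_hint` are hypothetical names for this illustration; the byte counts are supplied by hand rather than derived from a contract, and only the per-sample term is modelled, since how persistent bytes are combined is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class NodeSketch:
    """Toy node: own output size, children, and a precomputed broadcast cost."""
    output_bytes: int
    children: list = field(default_factory=list)
    broadcast_bytes: int = 0


def per_sample_hint(node):
    """Leaf: own output only. Op: all child hints + own output + broadcast."""
    if not node.children:
        return node.output_bytes
    child_total = sum(per_sample_hint(child) for child in node.children)
    # Child outputs are assumed to stay alive while the op builds its result.
    return child_total + node.output_bytes + node.broadcast_bytes


a = NodeSketch(output_bytes=400)
b = NodeSketch(output_bytes=1200)
op = NodeSketch(output_bytes=1200, children=[a, b], broadcast_bytes=800)
top = NodeSketch(output_bytes=1200, children=[op, NodeSketch(output_bytes=1200)])

print(per_sample_hint(op))   # 400 + 1200 + 1200 + 800 = 3600
print(per_sample_hint(top))  # 3600 + 1200 + 1200 = 6000
```

Summing every child hint rather than taking a peak is what makes the estimate conservative: it never assumes a buffer is freed early.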
- SFI.statefunc.memhint.inflate_for_grad(hint, *, factor=2.0)
Blanket inflation for gradient-mode nodes to account for tangents/tapes. Keep it simple and conservative; adjust locally if you get better bounds.
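A rough sketch of what such an inflation might look like is given below. Which fields the real function scales is not stated in this reference, so the example makes a loud assumption: only the per-sample term is multiplied, and the result is rounded up so the hint stays an integer and stays conservative. `MemHintSketch` and `inflate_for_grad_sketch` are hypothetical stand-ins.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MemHintSketch:
    """Stand-in mirroring the two documented fields; not the SFI class."""
    per_sample_bytes: int = 0
    persistent_bytes: int = 0


def inflate_for_grad_sketch(hint, *, factor=2.0):
    # ASSUMPTION: only the per-sample term grows with tangents/tapes.
    # Rounding up keeps the estimate an int and errs on the large side.
    inflated = math.ceil(hint.per_sample_bytes * factor)
    return replace(hint, per_sample_bytes=inflated)


base = MemHintSketch(per_sample_bytes=2400, persistent_bytes=1_000_000)
grad = inflate_for_grad_sketch(base)
print(grad)
```

A node with a tighter bound on its gradient memory could pass a smaller `factor` instead of relying on the blanket default.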
- SFI.statefunc.memhint.itemsize_of(dtype)
Return dtype itemsize as an int; defaults to float32 when dtype is None.
- Return type:
int
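The documented behaviour can be reproduced with NumPy's dtype machinery. This is a sketch under the assumption that the dtype is something `numpy.dtype` can interpret; `itemsize_sketch` is a hypothetical name, and the real function may accept other dtype objects.

```python
import numpy as np


def itemsize_sketch(dtype=None):
    """Bytes per element as a plain int; falls back to float32 for None."""
    if dtype is None:
        dtype = np.float32
    return int(np.dtype(dtype).itemsize)


print(itemsize_sketch())            # float32 fallback
print(itemsize_sketch(np.float64))
print(itemsize_sketch("int16"))
```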
- SFI.statefunc.memhint.output_bytes_per_sample(node, *, dtype, particle_size)
Translate element count to bytes using dtype itemsize.
- Parameters:
node (_HasContract)
particle_size (int | None)
- Return type:
int
- SFI.statefunc.memhint.output_elems_per_sample(node, *, particle_size)
Count output elements of a node for ONE sample from its static contract.
Output shape suffix is (*particle_axes, *rank_axes, n_features), with particle_axes = (P,) * pdepth and rank_axes = (dim,) * rank.
If particle_size is None, treat P = 1 (safe lower bound for batch picking).
- Parameters:
node (_HasContract)
particle_size (int | None)
- Return type:
int
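Multiplying out the shape suffix gives an element count of P^pdepth * dim^rank * n_features, and the byte count follows by multiplying with the dtype itemsize. The sketch below works one example through. `ContractSketch` and its attribute names are hypothetical stand-ins for the node's static contract, and the itemsize is passed in directly (4 bytes, matching float32).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractSketch:
    """Hypothetical static contract: just the fields the shape suffix needs."""
    pdepth: int       # number of particle axes
    rank: int         # number of rank axes
    dim: int          # spatial dimension
    n_features: int   # trailing feature axis


def output_elems(contract, particle_size=None):
    P = 1 if particle_size is None else particle_size  # None -> lower bound
    return (P ** contract.pdepth) * (contract.dim ** contract.rank) * contract.n_features


def output_bytes(contract, particle_size=None, itemsize=4):
    return output_elems(contract, particle_size) * itemsize


c = ContractSketch(pdepth=1, rank=1, dim=3, n_features=2)
elems = output_elems(c, particle_size=100)   # shape (100, 3, 2)
nbytes = output_bytes(c, particle_size=100)  # float32 elements
lower = output_elems(c)                      # P treated as 1

print(elems, nbytes, lower)
```

With P unknown the count collapses to dim^rank * n_features, which is why the documentation calls it a lower bound: any real particle count can only increase it.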