SFI.statefunc.memhint module

class SFI.statefunc.memhint.MemHint(per_sample_bytes=0, persistent_bytes=0)[source]

Bases: object

Conservative memory footprint per SINGLE sample.

  • per_sample_bytes scales with the number of samples.

  • persistent_bytes is model state / CSR / weights; it does NOT scale with the number of samples.

Parameters:
  • per_sample_bytes (int)

  • persistent_bytes (int)

per_sample_bytes: int = 0
persistent_bytes: int = 0
scaled(k)[source]

Scale only the transient part (per_sample_bytes) by k; persistent_bytes is left unchanged. Useful if you ever convert to total bytes.

Parameters:

k (int)

Return type:

MemHint
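The split between the two fields can be illustrated with a small stand-in dataclass. This is a sketch, not the SFI implementation: MemHintSketch and total_bytes are hypothetical names, and the sketch assumes that scaled(k) multiplies per_sample_bytes by k and returns persistent_bytes untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemHintSketch:
    """Stand-in for MemHint; field names follow the documented signature."""
    per_sample_bytes: int = 0
    persistent_bytes: int = 0

    def scaled(self, k: int) -> "MemHintSketch":
        # Assumption: only the per-sample (transient) part is multiplied by k.
        return MemHintSketch(self.per_sample_bytes * k, self.persistent_bytes)


def total_bytes(hint: MemHintSketch, n_samples: int) -> int:
    # Hypothetical helper: total footprint for a batch of n_samples.
    scaled = hint.scaled(n_samples)
    return scaled.per_sample_bytes + scaled.persistent_bytes


hint = MemHintSketch(per_sample_bytes=4096, persistent_bytes=1_000_000)
print(total_bytes(hint, 8))  # 8 * 4096 + 1_000_000 = 1032768
```

Keeping the persistent part out of the scaling is what lets a batch picker solve for the largest number of samples that fits a memory budget.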

class SFI.statefunc.memhint.SampleMeta(P=None, K=None, has_v=None, has_mask=None)[source]

Bases: object

Optional single-sample context to refine estimates.

  • P: number of particles in ONE sample (for nodes with particle axes, pdepth>0).

  • K: arity for interaction gathers when fixed/known (pairs=2, etc.).

  • has_v: whether a velocity block will be present.

  • has_mask: whether a boolean mask participates in the call.

Parameters:
  • P (int | None)

  • K (int | None)

  • has_v (bool | None)

  • has_mask (bool | None)

K: int | None = None
P: int | None = None
static from_arrays(x=None, v=None, mask=None)[source]

Try to recover P from provided single-sample arrays without allocating anything. Heuristics:

  • If x has at least 2 dims and looks like (…, dim) with a leading particle axis, we guess P from the axis before dim. If ambiguous, we leave P=None.

  • v and mask only toggle flags; they don’t change P.

Return type:

SampleMeta
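The heuristics above can be sketched by inspecting shapes only. The function guess_meta is a hypothetical stand-in that returns a plain tuple rather than a SampleMeta, and it assumes a single-sample x is shaped (P, dim). The source does not spell out what counts as ambiguous; this sketch only treats inputs with fewer than two dims as unknown.

```python
from types import SimpleNamespace
from typing import Any, Optional, Tuple


def guess_meta(x: Any = None, v: Any = None, mask: Any = None
               ) -> Tuple[Optional[int], bool, bool]:
    """Return (P, has_v, has_mask) from shapes only; nothing is allocated."""
    P = None
    shape = getattr(x, "shape", None)
    # Assumption: the particle axis sits directly before the trailing dim axis.
    if shape is not None and len(shape) >= 2:
        P = int(shape[-2])
    # v and mask only toggle flags; they never change P.
    return P, v is not None, mask is not None


x = SimpleNamespace(shape=(128, 3))  # any array-like exposing .shape works
print(guess_meta(x=x, v=x))          # (128, True, False)
```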

has_mask: bool | None = None
has_v: bool | None = None
SFI.statefunc.memhint.broadcast_extra_bytes_for_children(*, children, dtype, particle_size)[source]

When children have different particle depths, lower-depth outputs are broadcast to match the maximum depth. Materializing that broadcast costs memory. We conservatively account for an additional slab equal to (P^Δ - 1) times the child’s OUTPUT bytes, where Δ is the gap between the maximum depth and that child’s depth.

Parameters:
  • children (Iterable)

  • particle_size (int | None)

Return type:

int
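The (P^Δ - 1) accounting can be worked through with a simplified sketch. Here each child is reduced to a hypothetical (pdepth, output_bytes) pair instead of a node, dtype handling is left out, and an unknown particle_size is assumed to count as P = 1.

```python
from typing import Iterable, Optional, Tuple


def broadcast_extra_bytes(children: Iterable[Tuple[int, int]],
                          particle_size: Optional[int]) -> int:
    """children: hypothetical (pdepth, output_bytes) pairs, one per child."""
    children = list(children)
    if not children:
        return 0
    # Assumption: an unknown particle count is treated as P = 1.
    P = 1 if particle_size is None else particle_size
    max_depth = max(depth for depth, _ in children)
    extra = 0
    for depth, out_bytes in children:
        delta = max_depth - depth  # how many particle axes must be added
        extra += (P ** delta - 1) * out_bytes
    return extra


# A depth-0 child (12 bytes) next to a depth-2 child (1200 bytes), P = 10:
# only the depth-0 child is broadcast, costing (10**2 - 1) * 12 bytes.
print(broadcast_extra_bytes([(0, 12), (2, 1200)], particle_size=10))  # 1188
```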

SFI.statefunc.memhint.default_leaf_hint(node, *, dtype, particle_size, mode)[source]

Default for leaf-like nodes: count only the output buffer per sample. Composite nodes will also include their children.

Parameters:
  • node (_HasContract)

  • particle_size (int | None)

  • mode (str)

Return type:

MemHint

SFI.statefunc.memhint.default_op_hint(node, *, children, dtype, particle_size, mode)[source]

Default for composite ops: sum(children.hint) + my output buffer + broadcast overhead. We assume child outputs coexist while the op constructs its result.

Parameters:
  • node (_HasContract)

  • children (Iterable)

  • particle_size (int | None)

  • mode (str)

Return type:

MemHint
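The composite rule can be sketched as plain arithmetic on child hints. Hint and op_hint are hypothetical stand-ins, the output and broadcast sizes are passed in directly rather than derived from a node, and summing the children's persistent bytes is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Hint:
    """Stand-in for MemHint."""
    per_sample_bytes: int = 0
    persistent_bytes: int = 0


def op_hint(child_hints: Iterable[Hint], own_output_bytes: int,
            broadcast_bytes: int) -> Hint:
    child_hints = list(child_hints)
    # Child outputs are assumed to coexist while the op builds its result.
    per_sample = sum(h.per_sample_bytes for h in child_hints)
    per_sample += own_output_bytes + broadcast_bytes
    # Assumption: persistent bytes of the children are simply added up.
    persistent = sum(h.persistent_bytes for h in child_hints)
    return Hint(per_sample, persistent)


result = op_hint([Hint(1200, 0), Hint(12, 4096)],
                 own_output_bytes=1200, broadcast_bytes=1188)
print(result)  # per_sample_bytes=3600, persistent_bytes=4096
```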

SFI.statefunc.memhint.inflate_for_grad(hint, *, factor=2.0)[source]

Blanket inflation for gradient-mode nodes to account for tangents/tapes. Keep it simple and conservative; adjust locally if you get better bounds.

Parameters:
  • hint (MemHint)

  • factor (float)

Return type:

MemHint
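A blanket inflation of this kind might look like the sketch below. The source does not say which fields are inflated or how the result is rounded; the sketch assumes only per_sample_bytes is multiplied by factor, rounded up to a whole byte, and that persistent_bytes is kept as is. Hint and inflate are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Hint:
    """Stand-in for MemHint."""
    per_sample_bytes: int = 0
    persistent_bytes: int = 0


def inflate(hint: Hint, *, factor: float = 2.0) -> Hint:
    # Assumption: tangents/tapes grow with the per-sample buffers only.
    inflated = math.ceil(hint.per_sample_bytes * factor)
    return Hint(inflated, hint.persistent_bytes)


print(inflate(Hint(100, 7)))                # per_sample_bytes=200
print(inflate(Hint(1001, 500), factor=1.5)) # per_sample_bytes=1502
```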

SFI.statefunc.memhint.itemsize_of(dtype)[source]

Return the itemsize of dtype, in bytes, as an int; falls back to the itemsize of float32 (4 bytes) when dtype is None.

Return type:

int

SFI.statefunc.memhint.output_bytes_per_sample(node, *, dtype, particle_size)[source]

Translate element count to bytes using dtype itemsize.

Parameters:
  • node (_HasContract)

  • particle_size (int | None)

Return type:

int

SFI.statefunc.memhint.output_elems_per_sample(node, *, particle_size)[source]

Count output elements of a node for ONE sample from its static contract.

Output shape suffix is (*particle_axes, *rank_axes, n_features) with particle_axes = (P,) * pdepth and rank_axes = (dim,) * rank.

If particle_size is None, treat P = 1 (safe lower bound for batch picking).

Parameters:
  • node (_HasContract)

  • particle_size (int | None)

Return type:

int
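The element count implied by that shape suffix is P^pdepth * dim^rank * n_features, and the byte count is that number times the dtype itemsize. The sketch below assumes a hypothetical Contract record carrying pdepth, rank, dim and n_features; how the real contract exposes these values is not stated in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contract:
    """Hypothetical static contract; field names are assumptions."""
    pdepth: int      # number of particle axes
    rank: int        # number of tensor-rank axes of size dim
    dim: int         # spatial dimension
    n_features: int  # trailing feature axis


def output_elems(c: Contract, particle_size: Optional[int]) -> int:
    P = 1 if particle_size is None else particle_size  # P = 1 when unknown
    return (P ** c.pdepth) * (c.dim ** c.rank) * c.n_features


def output_bytes(c: Contract, particle_size: Optional[int],
                 itemsize: int = 4) -> int:
    # itemsize defaults to 4 bytes, the size of a float32 element.
    return output_elems(c, particle_size) * itemsize


vec = Contract(pdepth=1, rank=1, dim=3, n_features=2)
print(output_elems(vec, 100))   # 100 * 3 * 2 = 600
print(output_bytes(vec, 100))   # 600 * 4 = 2400
print(output_elems(vec, None))  # 1 * 3 * 2 = 6
```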

SFI.statefunc.memhint.resolve_P(particle_size, sample)[source]

Pick P from the explicit particle_size if given, else from sample (a SampleMeta or an array), else return None.

Parameters:

particle_size (int | None)

Return type:

int | None
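The precedence can be sketched as a short chain of checks. resolve_p is a hypothetical stand-in; it assumes a SampleMeta-like object exposes P as an attribute and that an array-like sample is shaped (P, dim), which may differ from the real heuristics.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_p(particle_size: Optional[int], sample: Any = None) -> Optional[int]:
    # 1. An explicit particle_size always wins.
    if particle_size is not None:
        return int(particle_size)
    if sample is None:
        return None
    # 2. A SampleMeta-like object carrying a known P.
    p = getattr(sample, "P", None)
    if p is not None:
        return int(p)
    # 3. Assumption: an array-like sample shaped (P, dim).
    shape = getattr(sample, "shape", None)
    if shape is not None and len(shape) >= 2:
        return int(shape[-2])
    # 4. Nothing usable: leave P unknown.
    return None


meta = SimpleNamespace(P=7)
arr = SimpleNamespace(shape=(64, 3))
print(resolve_p(5, meta), resolve_p(None, meta), resolve_p(None, arr))  # 5 7 64
```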