Models

Models provide high-level abstractions for composing layers, running training loops, and tracking training history.

All models documented here are part of KeyDNN’s public presentation API.

keydnn.Sequential

Bases: Model

Sequential container model.

Sequential composes multiple Module instances and applies them sequentially during forward():

out = layer_n(...layer_2(layer_1(x)))

This container is useful for building simple pipelines such as MLPs and CNN feature stacks without writing a custom forward.
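The composition rule above can be illustrated without any KeyDNN imports. The following is a minimal sketch, assuming plain Python callables as stand-ins for layers; `apply_sequentially`, `double`, and `add_one` are hypothetical names, not part of KeyDNN.

```python
from typing import Callable, Sequence

Layer = Callable[[float], float]


def apply_sequentially(layers: Sequence[Layer], x: float) -> float:
    """Feed x through each layer in order: layer_n(...layer_2(layer_1(x)))."""
    out = x
    for layer in layers:  # list order is execution order
        out = layer(out)
    return out


# Hypothetical stand-ins for real layers.
def double(v: float) -> float:
    return v * 2.0


def add_one(v: float) -> float:
    return v + 1.0


result = apply_sequentially([double, add_one], 3.0)  # add_one(double(3.0))
swapped = apply_sequentially([add_one, double], 3.0)  # double(add_one(3.0))
```

Swapping the two layers changes the result, which is why a container of this kind needs a deterministic layer order.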

Key Features

Deterministic layer ordering via an internal _layers list
Automatic submodule registration into _modules for recursion and save/load
Supports indexing, iteration, and dynamic extension via add
Provides a lightweight summary() for quick inspection

is_built `property`

is_built: bool

Return whether the model has been built (lazy layers materialized).

A model is considered built once build() has successfully executed a forward pass on a representative input and all lazy layers have created their parameters.

Returns:

Type	Description
`bool`	True if built, else False.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This sets self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This sets self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().
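The recursive mode switch described for train() and eval() might look roughly like the sketch below. `ToyModule` and `ToyDropout` are hypothetical stand-ins written for illustration, and the "dropout" here deterministically zeroes odd-indexed elements so the behavior is easy to check; it is not how KeyDNN's Dropout works.

```python
from typing import Dict, List


class ToyModule:
    """Hypothetical module with a training flag and named children."""

    def __init__(self) -> None:
        self.training = True
        self._modules: Dict[str, "ToyModule"] = {}

    def train(self) -> "ToyModule":
        self.training = True
        for child in self._modules.values():  # recurse into children
            child.train()
        return self

    def eval(self) -> "ToyModule":
        self.training = False
        for child in self._modules.values():
            child.eval()
        return self


class ToyDropout(ToyModule):
    """Reads self.training in forward() to decide its behavior."""

    def forward(self, xs: List[float]) -> List[float]:
        if not self.training:
            return list(xs)  # identity in eval mode
        # Illustrative only: zero odd-indexed elements instead of sampling.
        return [x if i % 2 == 0 else 0.0 for i, x in enumerate(xs)]


root = ToyModule()
drop = ToyDropout()
root._modules["drop"] = drop

root.eval()
eval_out = drop.forward([1.0, 2.0, 3.0])

root.train()
train_out = drop.forward([1.0, 2.0, 3.0])
```

Calling eval() or train() on the root is enough; the child picks up the mode through the recursion.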

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, the existing entry is overwritten; this is intentional.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).
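To show how registration and prefixed recursion fit together, here is a minimal sketch. `ToyModule` is a hypothetical class, parameters are plain Python objects, and the "." separator is an assumption inferred from checkpoint keys such as "layer1.weight".

```python
from typing import Dict, Iterator, Optional, Tuple


class ToyModule:
    """Hypothetical module keeping parameter and child registries."""

    def __init__(self) -> None:
        self._parameters: Dict[str, object] = {}
        self._modules: Dict[str, "ToyModule"] = {}

    def register_parameter(self, name: str, param: Optional[object]) -> None:
        if param is None:
            return  # registration is skipped
        self._parameters[name] = param  # overwrites an existing name
        setattr(self, name, param)  # so self.<name> works

    def register_module(self, name: str, module: Optional["ToyModule"]) -> None:
        if module is None:
            return
        self._modules[name] = module
        setattr(self, name, module)

    def named_parameters(self, prefix: str = "") -> Iterator[Tuple[str, object]]:
        for name, param in self._parameters.items():
            yield prefix + name, param
        for child_name, child in self._modules.items():
            # The prefix grows by one path segment per nesting level.
            yield from child.named_parameters(prefix + child_name + ".")


dense = ToyModule()
dense.register_parameter("weight", [0.1, 0.2])
dense.register_parameter("bias", None)  # skipped, nothing is registered

model = ToyModule()
model.register_module("layer1", dense)

names = [name for name, _ in model.named_parameters()]
```

In this sketch `names` contains only the fully qualified name of the one parameter that was actually registered.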

build

build(
    x: Union[Tensor, ShapeLike],
    *,
    device: Optional[Any] = None,
    dtype: Any = np.float32,
) -> None

Build (materialize) the model using a representative input.

This method performs a single forward pass to force initialization of all lazy modules (e.g., Dense inferring in_features and allocating parameters). After build(), calls to model.parameters() will return a complete and stable parameter set suitable for optimizer construction.

The build input may be provided either as:

- a real Tensor, or
- a shape-like object (tuple/list), in which case a dummy Tensor is internally constructed.

Parameters:

Name	Type	Description	Default
`x`	`Tensor or ShapeLike`	Representative model input. If a `Tensor`, it is used directly. If a shape (e.g., `(1, 784)`), a zero-filled Tensor is created internally to trigger the forward pass.	required
`device`	`Optional[Any]`	Device to use when constructing a dummy Tensor from a shape. Ignored if `x` is already a Tensor.	`None`
`dtype`	`Any`	Data type to use for a dummy Tensor created from a shape. Ignored if `x` is already a Tensor.	`np.float32`

Raises:

Type	Description
`TypeError`	If `x` is neither a Tensor nor a valid shape-like object.
`RuntimeError`	If the forward pass fails during model building.

Notes

Calling build() multiple times is safe; subsequent calls are no-ops.
This method must be called before inference, training, or optimizer creation when the model contains lazy layers.
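The reason build() must precede optimizer creation can be seen in a small sketch of lazy initialization. `LazyDense` and `ToyModel` below are hypothetical classes using nested lists instead of tensors; they only mimic the behavior described above and are not KeyDNN's implementation.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = List[List[float]]


class LazyDense:
    """Hypothetical lazy layer: in_features is inferred on the first forward."""

    def __init__(self, out_features: int) -> None:
        self.out_features = out_features
        self.weight: Optional[Matrix] = None  # allocated lazily

    def forward(self, x: Matrix) -> Matrix:
        if self.weight is None:
            in_features = len(x[0])
            self.weight = [[0.0] * self.out_features for _ in range(in_features)]
        # Plain matrix product x @ weight.
        return [
            [sum(row[i] * self.weight[i][j] for i in range(len(row)))
             for j in range(self.out_features)]
            for row in x
        ]


class ToyModel:
    def __init__(self, layers: Sequence[LazyDense]) -> None:
        self.layers = list(layers)
        self.is_built = False

    def parameters(self) -> List[Matrix]:
        return [layer.weight for layer in self.layers if layer.weight is not None]

    def build(self, shape: Tuple[int, int]) -> None:
        if self.is_built:
            return  # subsequent calls are no-ops
        x: Matrix = [[0.0] * shape[1] for _ in range(shape[0])]  # zero-filled dummy
        for layer in self.layers:
            x = layer.forward(x)
        self.is_built = True


model = ToyModel([LazyDense(4), LazyDense(2)])
params_before = len(model.parameters())  # nothing allocated yet
model.build((1, 3))
params_after = len(model.parameters())
```

An optimizer constructed before build() would see an empty parameter list in this sketch, which is the failure mode the note above warns about.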

save_json

save_json(path: str | Path) -> None

Save model architecture and weights into a single JSON file.

Parameters:

Name	Type	Description	Default
`path`	`str \| Path`	Output JSON file path, e.g. "checkpoint.json".	required

Format

{
  "format": "keydnn.json.ckpt.v1",
  "arch": {...},
  "state": {
    "layer1.weight": {"b64": "...", "dtype": "<f4", "shape": [...], "order": "C"},
    ...
  }
}

Notes

Avoids pickle and HDF5 dependencies.
The JSON file can become large; base64 encoding adds roughly 33% size overhead.
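A single "state" entry in the format above can be reproduced with the standard library alone. This is a sketch under stated assumptions: `encode_f32` and `decode_f32` are hypothetical helpers, the weights are a flat list of values packed as little-endian float32 ("<f4") in C order, and the "arch" value is an empty placeholder because its schema is not described here.

```python
import base64
import json
import struct
from typing import Any, Dict, List, Sequence


def encode_f32(values: Sequence[float], shape: Sequence[int]) -> Dict[str, Any]:
    """Pack values as little-endian float32 and base64-encode the bytes."""
    raw = struct.pack("<%df" % len(values), *values)
    return {
        "b64": base64.b64encode(raw).decode("ascii"),
        "dtype": "<f4",
        "shape": list(shape),
        "order": "C",
    }


def decode_f32(entry: Dict[str, Any]) -> List[float]:
    """Inverse of encode_f32; returns the flat list of values."""
    raw = base64.b64decode(entry["b64"])
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack("<%df" % count, raw))


# Values chosen to be exactly representable in float32.
weights = [0.5, -1.25, 2.0, 0.0, 3.5, -0.75]
entry = encode_f32(weights, (2, 3))

payload = {
    "format": "keydnn.json.ckpt.v1",
    "arch": {},  # placeholder; the real schema is not shown in this page
    "state": {"layer1.weight": entry},
}

text = json.dumps(payload)
restored = decode_f32(json.loads(text)["state"]["layer1.weight"])

# base64 emits 4 characters per 3 input bytes, hence the ~33% overhead.
overhead = len(entry["b64"]) / (4 * len(weights))
```

Here 24 bytes of raw float32 data become 32 base64 characters, which matches the overhead mentioned in the notes.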

load_json `classmethod`

load_json(path: str | Path) -> 'Model'

Load a model from a single JSON checkpoint created by save_json().

Parameters:

Name	Type	Description	Default
`path`	`str \| Path`	Checkpoint JSON path.	required

Returns:

Type	Description
`Model`	Reconstructed model with weights loaded.

Raises:

Type	Description
`ValueError`	If the checkpoint format is unsupported.
`TypeError`	If the reconstructed object is not an instance of `cls`.

train_on_batch

train_on_batch(
    x_batch: Tensor,
    y_batch: Tensor,
    *,
    loss: Any,
    optimizer: Any,
    metrics: Optional[Sequence[Any]] = None,
    metric_names: Optional[Sequence[str]] = None,
    zero_grad: bool = True,
    backward: bool = True,
    step: bool = True,
    optimizer_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, float]

Run a single training step on one mini-batch.

Supports string shortcuts for:

- loss: "mse", "sse", "bce", "cce"
- optimizer: "sgd", "adam"
- metrics: "acc"

See Model.fit() for full semantics.

fit

fit(
    x: Union[Tensor, Iterable[Tuple[Tensor, Tensor]]],
    y: Optional[Tensor] = None,
    *,
    loss: LossLike,
    optimizer: OptimizerLike,
    metrics: Optional[Sequence[MetricLike]] = None,
    metric_names: Optional[Sequence[str]] = None,
    batch_size: int = 32,
    epochs: int = 1,
    shuffle: bool = True,
    verbose: int = 1,
    validation_data: Optional[Tuple[Any, Any]] = None,
    callbacks: Optional[Sequence["Callback"]] = None,
    optimizer_kwargs: Optional[Mapping[str, Any]] = None,
) -> History

Train the model for a fixed number of epochs.

This method provides a Keras-like training loop built on top of train_on_batch(). It aggregates batch logs into per-epoch metrics and records them in a History object.

Supported input forms

1) (x, y) dataset:
   - x and y are array-like and support len() and indexing
   - batching/shuffling is handled internally via _iter_minibatches_xy
2) Iterable-of-batches:
   - y is None
   - x is an iterable yielding (x_batch, y_batch) tuples
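For the first input form, batching over an indexable (x, y) pair could be sketched as follows. The implementation of `_iter_minibatches_xy` is not shown in this page, so `iter_minibatches` below is a hypothetical stand-in that works on plain Python sequences; its output also has the shape expected by the second input form.

```python
import random
from typing import Iterator, List, Optional, Sequence, Tuple


def iter_minibatches(
    x: Sequence,
    y: Sequence,
    batch_size: int,
    shuffle: bool = True,
    seed: Optional[int] = None,
) -> Iterator[Tuple[List, List]]:
    """Yield (x_batch, y_batch) pairs; the last batch may be smaller."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    indices = list(range(len(x)))
    if shuffle:
        random.Random(seed).shuffle(indices)  # one permutation per call
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        yield [x[i] for i in chunk], [y[i] for i in chunk]


xs = list(range(10))
ys = [v * v for v in xs]

ordered = list(iter_minibatches(xs, ys, batch_size=4, shuffle=False))
shuffled = list(iter_minibatches(xs, ys, batch_size=4, shuffle=True, seed=0))
```

Shuffling permutes indices rather than the data itself, so each x value stays paired with its y value.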

Parameters:

Name	Type	Description	Default
`x`	`Union[Tensor, Iterable[Tuple[Tensor, Tensor]]]`	Dataset inputs, or an iterable yielding `(x_batch, y_batch)` tuples.	required
`y`	`Optional[Tensor]`	Dataset targets. Must be provided for dataset inputs; must be `None` for iterable-of-batches inputs.	`None`
`loss`	`LossLike`	Callable producing a scalar loss: `loss(y_pred, y_true)`.	required
`optimizer`	`OptimizerLike`	Optimizer-like object, expected to expose `zero_grad()` and `step()`.	required
`metrics`	`Optional[Sequence[MetricLike]]`	Metric callables to compute per batch and aggregate per epoch.	`None`
`metric_names`	`Optional[Sequence[str]]`	Optional names matching `metrics`. If omitted, names are inferred.	`None`
`batch_size`	`int`	Mini-batch size for dataset inputs.	`32`
`epochs`	`int`	Number of epochs to train for.	`1`
`shuffle`	`bool`	Whether to shuffle dataset inputs each epoch.	`True`
`verbose`	`int`	If non-zero, prints a simple epoch summary.	`1`

Returns:

Type	Description
`History`	A `History` instance (from `._history`) containing per-epoch metrics.

Raises:

Type	Description
`ValueError`	If `epochs < 1` or `batch_size < 1`.
`TypeError`	If `y is None` but `x` is not an iterable of `(x_batch, y_batch)`.

Notes

Per-epoch metric values are computed as weighted means over batches, using _batch_size_of() to determine the weight for each batch.
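The weighted mean matters when the last batch is smaller than the others. A minimal sketch, assuming each batch log is paired with its batch size; `aggregate_epoch` is a hypothetical helper, not the code used inside fit().

```python
from typing import Dict, List, Tuple


def aggregate_epoch(batch_logs: List[Tuple[int, Dict[str, float]]]) -> Dict[str, float]:
    """Combine per-batch metrics into per-epoch values, weighted by batch size."""
    totals: Dict[str, float] = {}
    n_total = 0
    for n, logs in batch_logs:
        n_total += n
        for name, value in logs.items():
            totals[name] = totals.get(name, 0.0) + value * n
    if n_total == 0:
        raise ValueError("no samples were seen in this epoch")
    return {name: total / n_total for name, total in totals.items()}


# Two full batches of 32 and a final partial batch of 16.
logs = [(32, {"loss": 0.5}), (32, {"loss": 0.25}), (16, {"loss": 1.0})]
epoch_logs = aggregate_epoch(logs)

# For comparison: the unweighted mean over-counts the small final batch.
unweighted = sum(entry["loss"] for _, entry in logs) / len(logs)
```

With these numbers the weighted loss is (16 + 8 + 16) / 80 = 0.5, while the unweighted mean of the three batch losses is higher.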

to_json_payload

to_json_payload() -> Dict[str, Any]

Serialize model architecture and weights into a JSON-serializable payload.

This is the in-memory counterpart to save_json(path) and is useful for:

- callback-based checkpointing without filesystem writes
- snapshotting best weights for early stopping restore
- programmatic checkpoint transport (e.g., RPC, DB, etc.)

Returns:

Type	Description
`Dict[str, Any]`	JSON-serializable checkpoint payload with keys "format" (str), "arch" (dict), and "state" (dict).

Notes

The payload is compatible with the on-disk format produced by save_json().

from_json_payload_

from_json_payload_(payload: Dict[str, Any]) -> None

Load weights in-place from a JSON checkpoint payload.

This method restores only the parameter state into the current model instance. It is intended for in-memory restores (e.g., EarlyStopping).

Parameters:

Name	Type	Description	Default
`payload`	`Dict[str, Any]`	A checkpoint payload produced by `to_json_payload()` or loaded from disk via `json.loads(...)`.	required

Raises:

Type	Description
`ValueError`	If the checkpoint format is unsupported.

Notes

This method does not reconstruct the module graph.
It assumes the current model architecture matches the payload.
Shape mismatches are detected by load_state_payload_().

get_config

get_config() -> dict[str, Any]

Return the (de)serialization config for this container.

Returns:

Type	Description
`dict[str, Any]`	Configuration dictionary used by the JSON serializer.

Notes

Sequential stores its children separately via the module tree, so no additional config is required here.

add

add(layer: Module, name: Optional[str] = None) -> None

Append a module to the container and register it as a submodule.

Parameters:

Name	Type	Description	Default
`layer`	`Module`	The module to append.	required
`name`	`Optional[str]`	Explicit registration name. If omitted, a numeric name ("0", "1", ...) is assigned based on insertion order.	`None`

Raises:

Type	Description
`TypeError`	If `layer` is not an instance of `Module`.
`ValueError`	If the provided `name` conflicts with an existing submodule.

Notes

_layers preserves the execution order used by forward().
_modules enables parameter discovery, recursion, and serialization.
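The interplay between the ordered `_layers` list and the named `_modules` registry can be sketched as below. `ToySequential` is a hypothetical class; it uses `callable()` as a stand-in for the `isinstance(layer, Module)` check, and it derives the automatic name from the current layer count, which is an assumption about how insertion-order names are produced.

```python
from typing import Callable, Dict, List, Optional


class ToySequential:
    """Hypothetical container mirroring the documented add() contract."""

    def __init__(self) -> None:
        self._layers: List[Callable] = []        # execution order
        self._modules: Dict[str, Callable] = {}  # named registry

    def add(self, layer: Callable, name: Optional[str] = None) -> None:
        if not callable(layer):  # stand-in for isinstance(layer, Module)
            raise TypeError("layer must be a module")
        key = name if name is not None else str(len(self._layers))
        if key in self._modules:
            # Assumption: auto-generated names are checked for clashes too.
            raise ValueError("name %r is already registered" % key)
        self._layers.append(layer)
        self._modules[key] = layer


seq = ToySequential()
seq.add(abs)
seq.add(float, name="head")
seq.add(round)

registered = list(seq._modules)
```

The third layer is named "2" rather than "1" because the automatic name follows the insertion position, not the count of unnamed layers.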

forward

forward(x: ITensor, *, _skip_norm: bool = False) -> ITensor

Apply all layers sequentially.

Parameters:

Name	Type	Description	Default
`x`	`ITensor`	Input tensor to the first layer.	required
`_skip_norm`	`bool`	Whether to skip normalization layers. Useful during model build / compilation.	`False`

Returns:

Type	Description
`ITensor`	Output of the final layer.

Notes

This method uses the ordered _layers list (not _modules) to ensure deterministic execution order.

layers

layers() -> Tuple[Module, ...]

Return all layers as an immutable tuple.

Returns:

Type	Description
`Tuple[Module, ...]`	Tuple of contained modules in execution order.

summary

summary() -> str

Generate a lightweight textual summary of the container.

Returns:

Type	Description
`str`	Human-readable representation listing layer indices and types.

Notes

No shape inference or parameter counting is performed.
This is intended for quick debugging/inspection only.

predict

predict(x, *, requires_grad: bool = False)

Perform an inference-style forward pass for the Sequential model.

Parameters:

Name	Type	Description	Default
`x`	`ITensor`	Input tensor.	required
`requires_grad`	`bool`	Placeholder for future gradient-control semantics. Currently unused.	`False`

Returns:

Type	Description
`ITensor`	Output of the sequential computation.

Notes

If the model exposes eval(), this method switches to eval mode before running the forward pass.

from_config `classmethod`

from_config(cfg: dict[str, Any]) -> Sequential

Construct a Sequential container from configuration.

Parameters:

Name	Type	Description	Default
`cfg`	`dict[str, Any]`	Configuration dictionary produced by `get_config()`.	required

Returns:

Type	Description
`Sequential`	A newly constructed `Sequential` instance.

Notes

Child modules are attached later by the deserializer into self._modules. After attachment, _post_load() rebuilds the ordered _layers view.

to

to(device: Device | str) -> Sequential

Move this model (recursively) to device by moving all registered parameters.

This is a model-level convenience wrapper around Module.to(). It migrates parameters/buffers of all child layers registered in the module tree, so users can transfer the entire model without rebuilding a new Sequential.

Parameters:

Name	Type	Description	Default
`device`	`Device \| str`	Target device for this model's parameters.	required

Returns:

Type	Description
`Sequential`	This model (`self`) after migration.

Notes

The authoritative traversal uses the _modules registry (built by add()), ensuring all layers participate in device transfer.
This method may rebind parameter objects (out-of-place transfer) depending on parameter implementation; use to_() for best-effort identity preservation.

to_

to_(device: Device | str) -> Sequential

Move this model (recursively) to device in-place.

This is a model-level convenience wrapper around Module.to_(). It attempts to migrate parameters in-place when supported (preserving parameter identity), and falls back to out-of-place transfer + rebinding when necessary.

Parameters:

Name	Type	Description	Default
`device`	`Device \| str`	Target device for this model's parameters.	required

Returns:

Type	Description
`Sequential`	This model (`self`) after in-place migration.

Notes

In-place behavior is best-effort: parameters without to_() support will be replaced via to() and rebound on the module.
This method relies on the _modules registry to recurse into child layers.

keydnn.History `dataclass`

Container for per-epoch training metrics.

History is a lightweight, Keras-inspired object returned by high-level training routines (e.g., Model.fit). It stores aggregated metric values for each completed epoch and provides convenience accessors for inspection.

Attributes:

Name	Type	Description
`history`	`Dict[str, List[float]]`	Mapping from metric name to a list of per-epoch values. Each list is ordered by epoch index.
`epoch`	`List[int]`	List of epoch indices (0-based) corresponding to entries in `history`.

Notes

All metric values are stored as Python float for portability.
This object is intentionally passive: it performs no aggregation logic beyond appending values supplied by the training loop.

append_epoch

append_epoch(
    epoch_idx: int, logs: Mapping[str, Number]
) -> None

Append metrics for a completed epoch.

Parameters:

Name	Type	Description	Default
`epoch_idx`	`int`	Zero-based index of the completed epoch.	required
`logs`	`Mapping[str, Number]`	Mapping from metric name to aggregated epoch value (e.g., mean loss, accuracy).	required

Notes

Metric values are coerced to float before storage.
The caller (typically Model.fit) is responsible for ensuring that logs contains already-aggregated values.

last

last() -> Dict[str, float]

Return metrics from the most recent epoch.

Returns:

Type	Description
`Dict[str, float]`	Mapping from metric name to its latest recorded value. Metrics with no recorded values are omitted.

Notes

This is a convenience accessor commonly used after training to retrieve final loss/metric values without manual indexing.
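The behavior documented for append_epoch() and last() is small enough to sketch as a dataclass. `ToyHistory` is a hypothetical re-creation based only on the descriptions above, not KeyDNN's source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Mapping, Union

Number = Union[int, float]


@dataclass
class ToyHistory:
    """Passive container: it only appends what the training loop supplies."""

    history: Dict[str, List[float]] = field(default_factory=dict)
    epoch: List[int] = field(default_factory=list)

    def append_epoch(self, epoch_idx: int, logs: Mapping[str, Number]) -> None:
        self.epoch.append(int(epoch_idx))
        for name, value in logs.items():
            # Values are coerced to float before storage.
            self.history.setdefault(name, []).append(float(value))

    def last(self) -> Dict[str, float]:
        # Metrics with no recorded values are omitted.
        return {name: values[-1] for name, values in self.history.items() if values}


h = ToyHistory()
h.append_epoch(0, {"loss": 1, "acc": 0.5})
h.append_epoch(1, {"loss": 0.25, "acc": 0.75})

final = h.last()
```

After two epochs, `final` holds the second epoch's values, and the integer loss from epoch 0 has been stored as a float.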

Notes

Sequential is intended for linear stacks of layers.
Parameters are registered automatically when layers are added.
Training utilities such as callbacks and optimizers integrate with the model API.
For custom architectures, users may subclass lower-level building blocks (documented in the Guides section).
