Models

Models provide high-level abstractions for composing layers, running training loops, and tracking training history.

All models documented here are part of KeyDNN’s public presentation API.


keydnn.Sequential

Bases: Model

Sequential container model.

Sequential composes multiple Module instances and applies them sequentially during forward():

out = layer_n(...layer_2(layer_1(x)))

This container is useful for building simple pipelines such as MLPs and CNN feature stacks without writing a custom forward.
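As a rough illustration of the composition rule above, the following sketch chains plain Python callables in order. It does not use KeyDNN itself: `apply_sequentially` and the toy layers are hypothetical stand-ins for Module instances.

```python
from functools import reduce
from typing import Callable, Sequence


def apply_sequentially(layers: Sequence[Callable[[float], float]], x: float) -> float:
    """Feed x through each layer in order: layer_n(...layer_2(layer_1(x)))."""
    return reduce(lambda value, layer: layer(value), layers, x)


# Toy "layers" standing in for Module instances (hypothetical, not KeyDNN classes).
def double(v: float) -> float:
    return v * 2.0


def add_one(v: float) -> float:
    return v + 1.0


# Order matters: (3 * 2) + 1 differs from (3 + 1) * 2.
result_a = apply_sequentially([double, add_one], 3.0)
result_b = apply_sequentially([add_one, double], 3.0)
```

The two results differ, which is why a container of this kind needs a deterministic layer order.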

Key Features
  • Deterministic layer ordering via an internal _layers list
  • Automatic submodule registration into _modules for recursion and save/load
  • Supports indexing, iteration, and dynamic extension via add
  • Provides a lightweight summary() for quick inspection

is_built property

is_built: bool

Return whether the model has been built (lazy layers materialized).

A model is considered built once build() has successfully executed a forward pass on a representative input and all lazy layers have created their parameters.

Returns:

Type Description
bool

True if built, else False.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This sets self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This sets self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().
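The recursive mode switch described for train() and eval() can be sketched as follows. This is a minimal stand-alone illustration, not KeyDNN's implementation: `ToyModule`, `ToyDropout`, and `_set_mode` are hypothetical names, and the "dropout" here is deterministic so the effect is easy to see.

```python
from typing import Dict, List


class ToyModule:
    """Hypothetical stand-in for a module with a recursive training flag."""

    def __init__(self) -> None:
        self.training = True
        self._modules: Dict[str, "ToyModule"] = {}

    def _set_mode(self, training: bool) -> "ToyModule":
        # Set the flag on this module, then recurse into every child.
        self.training = training
        for child in self._modules.values():
            child._set_mode(training)
        return self

    def train(self) -> "ToyModule":
        return self._set_mode(True)

    def eval(self) -> "ToyModule":
        return self._set_mode(False)


class ToyDropout(ToyModule):
    """Zeroes odd-indexed elements in training mode; identity in eval mode."""

    def forward(self, values: List[float]) -> List[float]:
        if not self.training:
            return list(values)
        return [v if i % 2 == 0 else 0.0 for i, v in enumerate(values)]


root = ToyModule()
drop = ToyDropout()
root._modules["drop"] = drop

root.eval()   # propagates training=False to the child
eval_out = drop.forward([1.0, 2.0, 3.0])
root.train()  # propagates training=True to the child
train_out = drop.forward([1.0, 2.0, 3.0])
```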

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required
Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required
Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).
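The sketch below shows one way registration and prefixed recursion could fit together. It is an assumption-laden illustration rather than KeyDNN's code: `ToyModule` is hypothetical, and the dot separator is inferred from state keys such as "layer1.weight" in the JSON checkpoint format.

```python
from typing import Any, Dict, Iterator, Optional, Tuple


class ToyModule:
    """Hypothetical module with parameter and submodule registries."""

    def __init__(self) -> None:
        self._parameters: Dict[str, Any] = {}
        self._modules: Dict[str, "ToyModule"] = {}

    def register_parameter(self, name: str, param: Optional[Any]) -> None:
        if param is None:
            return  # registration is skipped for None
        self._parameters[name] = param  # an existing name is overwritten
        setattr(self, name, param)      # so self.<name> works

    def register_module(self, name: str, module: Optional["ToyModule"]) -> None:
        if module is None:
            return
        self._modules[name] = module
        setattr(self, name, module)

    def named_parameters(self, prefix: str = "") -> Iterator[Tuple[str, Any]]:
        # Own parameters first, then recurse with an extended prefix.
        for name, param in self._parameters.items():
            yield prefix + name, param
        for child_name, child in self._modules.items():
            yield from child.named_parameters(prefix=f"{prefix}{child_name}.")


dense = ToyModule()
dense.register_parameter("weight", [[0.1, 0.2]])
dense.register_parameter("bias", None)  # skipped, nothing is registered

model = ToyModule()
model.register_module("layer1", dense)
names = [name for name, _ in model.named_parameters()]
```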

build

build(
    x: Union[Tensor, ShapeLike],
    *,
    device: Optional[Any] = None,
    dtype: Any = np.float32,
) -> None

Build (materialize) the model using a representative input.

This method performs a single forward pass to force initialization of all lazy modules (e.g., Dense inferring in_features and allocating parameters). After build(), calls to model.parameters() will return a complete and stable parameter set suitable for optimizer construction.

The build input may be provided either as:
  • a real Tensor, or
  • a shape-like object (tuple/list), in which case a dummy Tensor is internally constructed.

Parameters:

Name Type Description Default
x Tensor or ShapeLike

Representative model input.
  • If a Tensor, it is used directly.
  • If a shape (e.g., (1, 784)), a zero-filled Tensor is created internally to trigger the forward pass.

required
device Optional[Any]

Device to use when constructing a dummy Tensor from a shape. Ignored if x is already a Tensor.

None
dtype Any

Data type to use for a dummy Tensor created from a shape. Ignored if x is already a Tensor.

np.float32

Raises:

Type Description
TypeError

If x is neither a Tensor nor a valid shape-like object.

RuntimeError

If the forward pass fails during model building.

Notes
  • Calling build() multiple times is safe; subsequent calls are no-ops.
  • This method must be called before inference, training, or optimizer creation when the model contains lazy layers.
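To make the idea of lazy materialization concrete, here is a small NumPy-only sketch. It assumes nothing about KeyDNN internals: `LazyDense` and this free-standing `build` function are hypothetical and only mimic the documented behavior (a zero-filled dummy input triggers one forward pass, which allocates parameters).

```python
import numpy as np


class LazyDense:
    """Hypothetical lazy layer: infers in_features on the first forward pass."""

    def __init__(self, out_features: int) -> None:
        self.out_features = out_features
        self.weight = None  # allocated lazily
        self.bias = None

    def forward(self, x: np.ndarray) -> np.ndarray:
        if self.weight is None:
            in_features = x.shape[-1]
            self.weight = np.zeros((in_features, self.out_features), dtype=x.dtype)
            self.bias = np.zeros((self.out_features,), dtype=x.dtype)
        return x @ self.weight + self.bias


def build(layers, shape, dtype=np.float32) -> None:
    """Run one forward pass on a zero-filled dummy input to materialize parameters."""
    x = np.zeros(shape, dtype=dtype)
    for layer in layers:
        x = layer.forward(x)


layers = [LazyDense(16), LazyDense(4)]
params_before = [layer.weight for layer in layers]  # nothing allocated yet
build(layers, (1, 784))
weight_shapes = [layer.weight.shape for layer in layers]
```

Before the build call no weights exist, so an optimizer constructed at that point would see an incomplete parameter set.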

save_json

save_json(path: str | Path) -> None

Save model architecture and weights into a single JSON file.

Parameters:

Name Type Description Default
path str | Path

Output JSON file path, e.g. "checkpoint.json".

required
Format

{
  "format": "keydnn.json.ckpt.v1",
  "arch": {...},
  "state": {
    "layer1.weight": {"b64": "...", "dtype": "<f4", "shape": [...], "order": "C"},
    ...
  }
}

Notes
  • Avoids pickle and HDF5 dependencies.
  • JSON file can get large; base64 adds ~33% size overhead.
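The following sketch shows how a single state entry of this shape could be produced and read back with NumPy and the standard library. The helper names `encode_array` and `decode_array` are hypothetical, and the real serializer may differ in detail. It also makes the base64 overhead visible.

```python
import base64
import json

import numpy as np


def encode_array(arr: np.ndarray) -> dict:
    """Encode an array as a JSON-friendly dict with a base64 payload."""
    arr = np.ascontiguousarray(arr)  # guarantee C-order bytes
    return {
        "b64": base64.b64encode(arr.tobytes()).decode("ascii"),
        "dtype": arr.dtype.str,
        "shape": list(arr.shape),
        "order": "C",
    }


def decode_array(entry: dict) -> np.ndarray:
    """Inverse of encode_array; returns a writable copy."""
    raw = base64.b64decode(entry["b64"])
    flat = np.frombuffer(raw, dtype=np.dtype(entry["dtype"]))
    return flat.reshape(entry["shape"]).copy()


weights = np.arange(6, dtype="<f4").reshape(2, 3)
entry = json.loads(json.dumps(encode_array(weights)))  # round trip through JSON text
restored = decode_array(entry)

raw_bytes = weights.nbytes      # 6 float32 values
b64_chars = len(entry["b64"])   # 4 output characters per 3 input bytes
```

Here 24 raw bytes become 32 base64 characters, which matches the roughly one-third overhead noted above.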

load_json classmethod

load_json(path: str | Path) -> 'Model'

Load a model from a single JSON checkpoint created by save_json().

Parameters:

Name Type Description Default
path str | Path

Checkpoint JSON path.

required

Returns:

Type Description
Model

Reconstructed model with weights loaded.

Raises:

Type Description
ValueError

If the checkpoint format is unsupported.

TypeError

If the reconstructed object is not an instance of cls.

train_on_batch

train_on_batch(
    x_batch: Tensor,
    y_batch: Tensor,
    *,
    loss: Any,
    optimizer: Any,
    metrics: Optional[Sequence[Any]] = None,
    metric_names: Optional[Sequence[str]] = None,
    zero_grad: bool = True,
    backward: bool = True,
    step: bool = True,
    optimizer_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, float]

Run a single training step on one mini-batch.

Supports string shortcuts for:
  • loss: "mse", "sse", "bce", "cce"
  • optimizer: "sgd", "adam"
  • metrics: "acc"

See Model.fit() for full semantics.

fit

fit(
    x: Union[Tensor, Iterable[Tuple[Tensor, Tensor]]],
    y: Optional[Tensor] = None,
    *,
    loss: LossLike,
    optimizer: OptimizerLike,
    metrics: Optional[Sequence[MetricLike]] = None,
    metric_names: Optional[Sequence[str]] = None,
    batch_size: int = 32,
    epochs: int = 1,
    shuffle: bool = True,
    verbose: int = 1,
    validation_data: Optional[Tuple[Any, Any]] = None,
    callbacks: Optional[Sequence["Callback"]] = None,
    optimizer_kwargs: Optional[Mapping[str, Any]] = None,
) -> History

Train the model for a fixed number of epochs.

This method provides a Keras-like training loop built on top of train_on_batch(). It aggregates batch logs into per-epoch metrics and records them in a History object.

Supported input forms

1) (x, y) dataset:
  • x and y are array-like and support len() and indexing
  • batching/shuffling is handled internally via _iter_minibatches_xy
2) Iterable-of-batches:
  • y is None
  • x is an iterable yielding (x_batch, y_batch) tuples
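The two input forms are closely related: batching an (x, y) dataset yields exactly the kind of iterable that the second form accepts. The generator below is a hypothetical sketch of that batching step using NumPy arrays. It is not the actual _iter_minibatches_xy helper, whose signature is not documented here.

```python
import numpy as np


def iter_minibatches(x, y, batch_size, shuffle=True, rng=None):
    """Yield (x_batch, y_batch) pairs covering the dataset once."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    indices = np.arange(len(x))
    if shuffle:
        if rng is None:
            rng = np.random.default_rng()
        rng.shuffle(indices)  # in-place permutation of sample indices
    for start in range(0, len(x), batch_size):
        batch_idx = indices[start:start + batch_size]
        yield x[batch_idx], y[batch_idx]


x = np.arange(10, dtype=np.float32).reshape(10, 1)
y = np.arange(10, dtype=np.float32)

# Without shuffling, the last batch is simply smaller.
batch_sizes = [len(xb) for xb, _ in iter_minibatches(x, y, batch_size=4, shuffle=False)]

# With shuffling, every sample still appears once and x/y stay aligned.
batches = list(iter_minibatches(x, y, batch_size=4, rng=np.random.default_rng(0)))
seen_targets = sorted(float(v) for _, yb in batches for v in yb)
```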

Parameters:

Name Type Description Default
x Union[Tensor, Iterable[Tuple[Tensor, Tensor]]]

Dataset inputs, or an iterable yielding (x_batch, y_batch) tuples.

required
y Optional[Tensor]

Dataset targets. Must be provided for dataset inputs; must be None for iterable-of-batches inputs.

None
loss LossLike

Callable producing a scalar loss: loss(y_pred, y_true).

required
optimizer OptimizerLike

Optimizer-like object, expected to expose zero_grad() and step().

required
metrics Optional[Sequence[MetricLike]]

Metric callables to compute per batch and aggregate per epoch.

None
metric_names Optional[Sequence[str]]

Optional names matching metrics. If omitted, names are inferred.

None
batch_size int

Mini-batch size for dataset inputs. Default is 32.

32
epochs int

Number of epochs to train for. Default is 1.

1
shuffle bool

Whether to shuffle dataset inputs each epoch. Default is True.

True
verbose int

If non-zero, prints a simple epoch summary. Default is 1.

1

Returns:

Type Description
History

A History instance (from ._history) containing per-epoch metrics.

Raises:

Type Description
ValueError

If epochs < 1 or batch_size < 1.

TypeError

If y is None but x is not an iterable of (x_batch, y_batch).

Notes

Per-epoch metric values are computed as weighted means over batches, using _batch_size_of() to determine the weight for each batch.
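A short worked example shows why the weighting matters when the final batch is smaller than the rest. The function name `weighted_epoch_mean` is hypothetical, and the numbers are made up for illustration.

```python
from typing import Sequence


def weighted_epoch_mean(batch_values: Sequence[float], batch_sizes: Sequence[int]) -> float:
    """Mean of per-batch values, weighted by the number of samples in each batch."""
    total = sum(batch_sizes)
    if total == 0:
        raise ValueError("no samples were seen")
    return sum(v * n for v, n in zip(batch_values, batch_sizes)) / total


# 10 samples split into batches of 4, 4, and 2 (illustrative values).
losses = [0.5, 0.3, 0.9]
sizes = [4, 4, 2]

epoch_loss = weighted_epoch_mean(losses, sizes)  # (2.0 + 1.2 + 1.8) / 10
naive_mean = sum(losses) / len(losses)           # over-weights the small last batch
```

The weighted value equals the mean loss over all 10 samples, whereas the unweighted mean lets the 2-sample batch count as much as a 4-sample one.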

to_json_payload

to_json_payload() -> Dict[str, Any]

Serialize model architecture and weights into a JSON-serializable payload.

This is the in-memory counterpart to save_json(path) and is useful for:
  • callback-based checkpointing without filesystem writes
  • snapshotting best weights for early stopping restore
  • programmatic checkpoint transport (e.g., RPC, DB, etc.)

Returns:

Type Description
Dict[str, Any]

JSON-serializable checkpoint payload with keys:
  • "format": str
  • "arch": dict
  • "state": dict

Notes

The payload is compatible with the on-disk format produced by save_json().

from_json_payload_

from_json_payload_(payload: Dict[str, Any]) -> None

Load weights in-place from a JSON checkpoint payload.

This method restores only the parameter state into the current model instance. It is intended for in-memory restores (e.g., EarlyStopping).

Parameters:

Name Type Description Default
payload Dict[str, Any]

A checkpoint payload produced by to_json_payload() or loaded from disk via json.loads(...).

required

Raises:

Type Description
ValueError

If the checkpoint format is unsupported.

Notes
  • This method does not reconstruct the module graph.
  • It assumes the current model architecture matches the payload.
  • Shape mismatches are detected by load_state_payload_().
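The snapshot-and-restore pattern that these two methods enable can be sketched as below. `ToyModel` is a hypothetical stand-in whose state holds plain lists instead of base64 entries, and the validation losses are invented. Only the method names and the format string come from this page.

```python
import copy
from typing import Any, Dict, List

CKPT_FORMAT = "keydnn.json.ckpt.v1"  # format tag shown in the save_json() section


class ToyModel:
    """Hypothetical model exposing the two payload methods (simplified state)."""

    def __init__(self, weights: Dict[str, List[float]]) -> None:
        self.weights = weights

    def to_json_payload(self) -> Dict[str, Any]:
        # Deep copy so later training steps cannot mutate the snapshot.
        return {"format": CKPT_FORMAT, "arch": {}, "state": copy.deepcopy(self.weights)}

    def from_json_payload_(self, payload: Dict[str, Any]) -> None:
        if payload.get("format") != CKPT_FORMAT:
            raise ValueError("unsupported checkpoint format")
        # Only the parameter state is restored; the architecture is left untouched.
        self.weights = copy.deepcopy(payload["state"])


model = ToyModel({"layer1.weight": [0.0]})
best_loss, best_payload = float("inf"), None

for epoch, val_loss in enumerate([0.9, 0.4, 0.6]):       # invented validation losses
    model.weights["layer1.weight"] = [float(epoch)]       # pretend training updated weights
    if val_loss < best_loss:
        best_loss, best_payload = val_loss, model.to_json_payload()

model.from_json_payload_(best_payload)  # roll back to the best epoch (epoch 1)
```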

get_config

get_config() -> dict[str, Any]

Return the (de)serialization config for this container.

Returns:

Type Description
dict[str, Any]

Configuration dictionary used by the JSON serializer.

Notes

Sequential stores its children separately via the module tree, so no additional config is required here.

add

add(layer: Module, name: Optional[str] = None) -> None

Append a module to the container and register it as a submodule.

Parameters:

Name Type Description Default
layer Module

The module to append.

required
name Optional[str]

Explicit registration name. If omitted, a numeric name ("0", "1", ...) is assigned based on insertion order.

None

Raises:

Type Description
TypeError

If layer is not an instance of Module.

ValueError

If the provided name conflicts with an existing submodule.

Notes
  • _layers preserves the execution order used by forward().
  • _modules enables parameter discovery, recursion, and serialization.
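The interplay between the ordered _layers list and the named _modules registry can be sketched as follows. `ToyLayer` and `ToySequential` are hypothetical. In particular, deriving the default name from the current layer count is an assumption about how "based on insertion order" is realized.

```python
from typing import Dict, List, Optional


class ToyLayer:
    """Hypothetical stand-in for Module."""


class ToySequential:
    def __init__(self) -> None:
        self._layers: List[ToyLayer] = []         # execution order
        self._modules: Dict[str, ToyLayer] = {}   # name -> submodule registry

    def add(self, layer: ToyLayer, name: Optional[str] = None) -> None:
        if not isinstance(layer, ToyLayer):
            raise TypeError("layer must be a ToyLayer instance")
        if name is None:
            name = str(len(self._layers))  # assumed default: position in the stack
        if name in self._modules:
            raise ValueError(f"a submodule named {name!r} already exists")
        self._layers.append(layer)
        self._modules[name] = layer


seq = ToySequential()
seq.add(ToyLayer())               # registered as "0"
seq.add(ToyLayer(), name="head")  # explicit name
seq.add(ToyLayer())               # registered as "2"

try:
    seq.add(ToyLayer(), name="head")
    conflict_raised = False
except ValueError:
    conflict_raised = True

registered_names = list(seq._modules)
```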

forward

forward(x: ITensor, *, _skip_norm: bool = False) -> ITensor

Apply all layers sequentially.

Parameters:

Name Type Description Default
x ITensor

Input tensor to the first layer.

required
_skip_norm bool

Whether to skip normalization layers. Useful during model build / compilation.

False

Returns:

Type Description
ITensor

Output of the final layer.

Notes

This method uses the ordered _layers list (not _modules) to ensure deterministic execution order.

layers

layers() -> Tuple[Module, ...]

Return all layers as an immutable tuple.

Returns:

Type Description
Tuple[Module, ...]

Tuple of contained modules in execution order.

summary

summary() -> str

Generate a lightweight textual summary of the container.

Returns:

Type Description
str

Human-readable representation listing layer indices and types.

Notes
  • No shape inference or parameter counting is performed.
  • This is intended for quick debugging/inspection only.

predict

predict(x, *, requires_grad: bool = False)

Perform an inference-style forward pass for the Sequential model.

Parameters:

Name Type Description Default
x ITensor

Input tensor.

required
requires_grad bool

Placeholder for future gradient-control semantics. Currently unused.

False

Returns:

Type Description
ITensor

Output of the sequential computation.

Notes

If the model exposes eval(), this method switches to eval mode before running the forward pass.

from_config classmethod

from_config(cfg: dict[str, Any]) -> Sequential

Construct a Sequential container from configuration.

Parameters:

Name Type Description Default
cfg dict[str, Any]

Configuration dictionary produced by get_config().

required

Returns:

Type Description
Sequential

A newly constructed Sequential instance.

Notes

Child modules are attached later by the deserializer into self._modules. After attachment, _post_load() rebuilds the ordered _layers view.

to

to(device: Device | str) -> Sequential

Move this model (recursively) to device by moving all registered parameters.

This is a model-level convenience wrapper around Module.to(). It migrates parameters/buffers of all child layers registered in the module tree, so users can transfer the entire model without rebuilding a new Sequential.

Parameters:

Name Type Description Default
device Device | str

Target device for this model's parameters.

required

Returns:

Type Description
Sequential

This model (self) after migration.

Notes
  • The authoritative traversal uses the _modules registry (built by add()), ensuring all layers participate in device transfer.
  • This method may rebind parameter objects (out-of-place transfer) depending on parameter implementation; use to_() for best-effort identity preservation.

to_

to_(device: Device | str) -> Sequential

Move this model (recursively) to device in-place.

This is a model-level convenience wrapper around Module.to_(). It attempts to migrate parameters in-place when supported (preserving parameter identity), and falls back to out-of-place transfer + rebinding when necessary.

Parameters:

Name Type Description Default
device Device | str

Target device for this model's parameters.

required

Returns:

Type Description
Sequential

This model (self) after in-place migration.

Notes
  • In-place behavior is best-effort: parameters without to_() support will be replaced via to() and rebound on the module.
  • This method relies on the _modules registry to recurse into child layers.

keydnn.History dataclass

Container for per-epoch training metrics.

History is a lightweight, Keras-inspired object returned by high-level training routines (e.g., Model.fit). It stores aggregated metric values for each completed epoch and provides convenience accessors for inspection.

Attributes:

Name Type Description
history Dict[str, List[float]]

Mapping from metric name to a list of per-epoch values. Each list is ordered by epoch index.

epoch List[int]

List of epoch indices (0-based) corresponding to entries in history.

Notes
  • All metric values are stored as Python float for portability.
  • This object is intentionally passive: it performs no aggregation logic beyond appending values supplied by the training loop.

append_epoch

append_epoch(
    epoch_idx: int, logs: Mapping[str, Number]
) -> None

Append metrics for a completed epoch.

Parameters:

Name Type Description Default
epoch_idx int

Zero-based index of the completed epoch.

required
logs Mapping[str, Number]

Mapping from metric name to aggregated epoch value (e.g., mean loss, accuracy).

required
Notes
  • Metric values are coerced to float before storage.
  • The caller (typically Model.fit) is responsible for ensuring that logs contains already-aggregated values.

last

last() -> Dict[str, float]

Return metrics from the most recent epoch.

Returns:

Type Description
Dict[str, float]

Mapping from metric name to its latest recorded value. Metrics with no recorded values are omitted.

Notes

This is a convenience accessor commonly used after training to retrieve final loss/metric values without manual indexing.
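The documented behavior of History is simple enough to mirror in a few lines. The dataclass below is a hypothetical re-creation for illustration (named `ToyHistory` to avoid confusion with keydnn.History), based only on the attribute and method descriptions on this page.

```python
from dataclasses import dataclass, field
from numbers import Number
from typing import Dict, List, Mapping


@dataclass
class ToyHistory:
    """Passive container: appends whatever the training loop supplies."""

    history: Dict[str, List[float]] = field(default_factory=dict)
    epoch: List[int] = field(default_factory=list)

    def append_epoch(self, epoch_idx: int, logs: Mapping[str, Number]) -> None:
        self.epoch.append(int(epoch_idx))
        for name, value in logs.items():
            # Values are coerced to float before storage.
            self.history.setdefault(name, []).append(float(value))

    def last(self) -> Dict[str, float]:
        # Metrics with no recorded values are omitted.
        return {name: values[-1] for name, values in self.history.items() if values}


hist = ToyHistory()
hist.append_epoch(0, {"loss": 1, "acc": 0.5})      # the int 1 is stored as 1.0
hist.append_epoch(1, {"loss": 0.25, "acc": 0.75})
final = hist.last()
```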


Notes

  • Sequential is intended for linear stacks of layers.
  • Parameters are registered automatically when layers are added.
  • Training utilities such as callbacks and optimizers integrate with the model API.
  • For custom architectures, users may subclass lower-level building blocks (documented in the Guides section).