Models
Models provide high-level abstractions for composing layers, running training loops, and tracking training history.
All models documented here are part of KeyDNN’s public presentation API.
keydnn.Sequential
Bases: Model
Sequential container model.
Sequential composes multiple Module instances and applies them
sequentially during forward():
out = layer_n(...layer_2(layer_1(x)))
This container is useful for building simple pipelines such as
MLPs and CNN feature stacks without writing a custom forward.
Key Features
- Deterministic layer ordering via an internal _layers list
- Automatic submodule registration into _modules for recursion and save/load
- Supports indexing, iteration, and dynamic extension via add
- Provides a lightweight summary() for quick inspection
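The composition rule out = layer_n(...layer_2(layer_1(x))) can be illustrated without KeyDNN itself. The sketch below is a minimal stand-in, assuming plain Python callables in place of Module instances; `apply_sequentially` is a hypothetical helper and not part of the library.

```python
from functools import reduce
from typing import Callable, Sequence

Layer = Callable[[float], float]


def apply_sequentially(layers: Sequence[Layer], x: float) -> float:
    """Feed x through each layer in order: layer_n(...layer_2(layer_1(x)))."""
    return reduce(lambda acc, layer: layer(acc), layers, x)


# Three toy "layers" standing in for Module instances (illustrative only).
double = lambda v: v * 2.0
add_one = lambda v: v + 1.0
square = lambda v: v * v

# Order matters: this evaluates square(add_one(double(3.0))).
out = apply_sequentially([double, add_one, square], 3.0)
```

Reversing the list changes the result, which is why a container of this kind needs a deterministic layer order.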
is_built
property
is_built: bool
Return whether the model has been built (lazy layers materialized).
A model is considered built once build() has successfully executed
a forward pass on a representative input and all lazy layers have
created their parameters.
Returns:
| Type | Description |
|---|---|
| bool | True if built, else False. |
parameters
parameters() -> Iterable[IParameter]
Return an iterable over this module's parameters (recursive).
Returns:
| Type | Description |
|---|---|
| Iterable[IParameter] | Iterable of parameters registered on this module and all submodules. |
train
train() -> Self
Set this module to training mode and recursively set all child modules to training mode.
Notes
- This toggles self.training = True.
- Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
- This method is intended to mirror PyTorch's Module.train().
eval
eval() -> Self
Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.
Notes
- This toggles self.training = False.
- In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
- This method is intended to mirror PyTorch's Module.eval().
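The recursive mode switch described for train() and eval() can be sketched as follows. `ToyModule` and `ToyDropout` are hypothetical stand-ins rather than KeyDNN classes, and the inverted-dropout scaling in training mode is an assumption of this sketch.

```python
import random
from typing import Dict, List


class ToyModule:
    """Hypothetical stand-in for a module with a recursive training flag."""

    def __init__(self) -> None:
        self.training = True
        self._modules: Dict[str, "ToyModule"] = {}

    def train(self) -> "ToyModule":
        self.training = True
        for child in self._modules.values():
            child.train()
        return self

    def eval(self) -> "ToyModule":
        self.training = False
        for child in self._modules.values():
            child.eval()
        return self


class ToyDropout(ToyModule):
    """Drops elements with probability p in training mode; identity in eval mode."""

    def __init__(self, p: float = 0.5, seed: int = 0) -> None:
        super().__init__()
        self.p = p
        self._rng = random.Random(seed)

    def forward(self, xs: List[float]) -> List[float]:
        if not self.training:
            return list(xs)  # eval mode: dropout disabled
        scale = 1.0 / (1.0 - self.p)  # inverted-dropout scaling (assumed)
        return [0.0 if self._rng.random() < self.p else x * scale for x in xs]


parent = ToyModule()
parent._modules["drop"] = ToyDropout(p=0.5)

parent.eval()  # the flag propagates to the child
eval_out = parent._modules["drop"].forward([1.0, 2.0, 3.0])

parent.train()
train_out = parent._modules["drop"].forward([1.0, 2.0, 3.0])
```

Both methods return the module itself, so calls can be chained.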
register_parameter
register_parameter(
name: str, param: Optional[IParameter]
) -> None
Register a parameter with this module.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name under which the parameter will be stored (e.g., "weight", "bias"). | required |
| param | Optional[IParameter] | Parameter instance to register. If None, registration is skipped. | required |
Notes
- If param is None, nothing is registered.
- If the name already exists, it is overwritten intentionally.
- This also sets the attribute on the module so self.<name> works.
register_module
register_module(
name: str, module: Optional["Module"]
) -> None
Register a child module with this module.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name under which the module will be stored. | required |
| module | Optional[Module] | Child module to register. If None, registration is skipped. | required |
Notes
- If module is None, nothing is registered.
- This also sets the attribute on the module so self.<name> works.
named_parameters
named_parameters(
prefix: str = "",
) -> Iterator[tuple[str, IParameter]]
Return an iterator over (name, parameter) pairs (recursive).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prefix | str | Prefix to prepend to parameter names (used for recursion). | '' |
Returns:
| Type | Description |
|---|---|
| Iterator[tuple[str, IParameter]] | Iterator yielding (fully_qualified_name, parameter). |
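A minimal sketch of how a recursive prefix can produce fully qualified names. `ToyModule` is a hypothetical stand-in with floats in place of parameters, and the dotted separator is assumed from the "layer1.weight" key shown in the checkpoint format.

```python
from typing import Dict, Iterator, Tuple


class ToyModule:
    """Hypothetical module holding named parameters and named children."""

    def __init__(self) -> None:
        self._parameters: Dict[str, float] = {}
        self._modules: Dict[str, "ToyModule"] = {}

    def named_parameters(self, prefix: str = "") -> Iterator[Tuple[str, float]]:
        # Own parameters first, qualified by the current prefix.
        for name, param in self._parameters.items():
            yield (f"{prefix}{name}", param)
        # Then recurse, extending the prefix with the child's registered name.
        for child_name, child in self._modules.items():
            yield from child.named_parameters(prefix=f"{prefix}{child_name}.")


dense = ToyModule()
dense._parameters["weight"] = 0.5
dense._parameters["bias"] = 0.1

model = ToyModule()
model._modules["0"] = dense

names = [name for name, _ in model.named_parameters()]
```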
build
build(
x: Union[Tensor, ShapeLike],
*,
device: Optional[Any] = None,
dtype: Any = np.float32,
) -> None
Build (materialize) the model using a representative input.
This method performs a single forward pass to force initialization of all
lazy modules (e.g., Dense inferring in_features and allocating
parameters). After build(), calls to model.parameters() will return a
complete and stable parameter set suitable for optimizer construction.
The build input may be provided either as:
- a real Tensor, or
- a shape-like object (tuple/list), in which case a dummy Tensor is
internally constructed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor or ShapeLike | Representative model input. - If a | required |
| device | optional | Device to use when constructing a dummy Tensor from a shape. Ignored if | None |
| dtype | Any | Data type to use for a dummy Tensor created from a shape. Ignored if | float32 |
Raises:
| Type | Description |
|---|---|
| TypeError | If |
| RuntimeError | If the forward pass fails during model building. |
Notes
- Calling build() multiple times is safe; subsequent calls are no-ops.
- This method must be called before inference, training, or optimizer creation when the model contains lazy layers.
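The lazy-materialization idea behind build() can be sketched in pure Python. `LazyDense` and `ToyModel` are hypothetical stand-ins using nested lists in place of tensors; the zero-filled dummy input and zero-initialized weights are assumptions of this sketch, not a description of KeyDNN internals.

```python
from typing import List, Optional, Sequence


class LazyDense:
    """Hypothetical lazy layer: in_features is inferred on the first forward."""

    def __init__(self, out_features: int) -> None:
        self.out_features = out_features
        self.weight: Optional[List[List[float]]] = None  # allocated lazily

    def forward(self, x: Sequence[Sequence[float]]) -> List[List[float]]:
        if self.weight is None:
            in_features = len(x[0])
            self.weight = [[0.0] * self.out_features for _ in range(in_features)]
        return [
            [sum(row[i] * self.weight[i][j] for i in range(len(row)))
             for j in range(self.out_features)]
            for row in x
        ]


class ToyModel:
    def __init__(self, layers: List[LazyDense]) -> None:
        self.layers = layers
        self.is_built = False

    def parameters(self) -> List[List[List[float]]]:
        return [l.weight for l in self.layers if l.weight is not None]

    def build(self, shape: Sequence[int]) -> None:
        if self.is_built:
            return  # repeated calls are no-ops
        batch, features = shape
        x = [[0.0] * features for _ in range(batch)]  # dummy input from a shape
        for layer in self.layers:
            x = layer.forward(x)
        self.is_built = True


model = ToyModel([LazyDense(4), LazyDense(2)])
params_before = len(model.parameters())  # nothing allocated yet
model.build((1, 3))
params_after = len(model.parameters())   # both weight matrices now exist
```

In this sketch an optimizer constructed before build() would see an empty parameter list, which is the failure mode the build step is meant to prevent.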
save_json
save_json(path: str | Path) -> None
Save model architecture and weights into a single JSON file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | Output JSON file path, e.g. "checkpoint.json". | required |
Format
{
  "format": "keydnn.json.ckpt.v1",
  "arch": {...},
  "state": {
    "layer1.weight": {"b64": "...", "dtype": "<f4", "shape": [...], "order": "C"},
    ...
  }
}
Notes
- Avoids pickle and HDF5 dependencies.
- JSON file can get large; base64 adds ~33% size overhead.
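To make the state entry layout and the base64 overhead concrete, here is a standard-library sketch that packs values as little-endian float32 ("<f4") and base64-encodes them. `encode_f32` and `decode_f32` are hypothetical helpers written for this illustration; they are not KeyDNN functions, and the library's actual encoder may differ.

```python
import base64
import json
import struct
from typing import Dict, List, Sequence


def encode_f32(values: Sequence[float], shape: Sequence[int]) -> Dict[str, object]:
    """Pack floats as little-endian float32 bytes and base64-encode them."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return {
        "b64": base64.b64encode(raw).decode("ascii"),
        "dtype": "<f4",
        "shape": list(shape),
        "order": "C",
    }


def decode_f32(entry: Dict[str, object]) -> List[float]:
    """Inverse of encode_f32: returns the flat list of values in C order."""
    raw = base64.b64decode(entry["b64"])
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


values = [0.5, 1.0, -2.0, 0.25, 3.0, 4.0]    # exactly representable in float32
entry = encode_f32(values, shape=(2, 3))
text = json.dumps({"layer1.weight": entry})  # plain JSON text, no pickle involved

raw_bytes = 4 * len(values)                  # 24 bytes of float32 data
b64_chars = len(entry["b64"])                # 32 characters after base64
restored = decode_f32(json.loads(text)["layer1.weight"])
```

Base64 emits 4 characters for every 3 bytes, so 24 bytes become 32 characters, which is where the roughly 33% overhead comes from.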
load_json
classmethod
load_json(path: str | Path) -> 'Model'
Load a model from a single JSON checkpoint created by save_json().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | Checkpoint JSON path. | required |
Returns:
| Type | Description |
|---|---|
| Model | Reconstructed model with weights loaded. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the checkpoint format is unsupported. |
| TypeError | If the reconstructed object is not an instance of |
train_on_batch
train_on_batch(
x_batch: Tensor,
y_batch: Tensor,
*,
loss: Any,
optimizer: Any,
metrics: Optional[Sequence[Any]] = None,
metric_names: Optional[Sequence[str]] = None,
zero_grad: bool = True,
backward: bool = True,
step: bool = True,
optimizer_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, float]
Run a single training step on one mini-batch.
Supports string shortcuts for:
- loss: "mse", "sse", "bce", "cce"
- optimizer: "sgd", "adam"
- metrics: "acc"
See Model.fit() for full semantics.
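One way a string shortcut could be resolved to a callable is a small lookup table. The sketch below is an assumption about the general pattern, not KeyDNN's implementation: `resolve_loss` and `_LOSS_SHORTCUTS` are hypothetical names, only "mse" and "sse" are covered, and the reduction used by the library's own losses may differ.

```python
from typing import Callable, Dict, Sequence, Union

LossFn = Callable[[Sequence[float], Sequence[float]], float]


def sse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Sum of squared errors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error: SSE divided by the number of elements."""
    return sse(pred, target) / len(pred)


# Hypothetical registry; only two of the documented shortcuts are sketched.
_LOSS_SHORTCUTS: Dict[str, LossFn] = {"mse": mse, "sse": sse}


def resolve_loss(loss: Union[str, LossFn]) -> LossFn:
    """Return a callable for either a shortcut name or an existing callable."""
    if callable(loss):
        return loss
    if loss not in _LOSS_SHORTCUTS:
        raise ValueError(f"unknown loss shortcut: {loss!r}")
    return _LOSS_SHORTCUTS[loss]


pred, target = [1.0, 2.0], [1.0, 4.0]
mse_value = resolve_loss("mse")(pred, target)
sse_value = resolve_loss("sse")(pred, target)
```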
fit
fit(
x: Union[Tensor, Iterable[Tuple[Tensor, Tensor]]],
y: Optional[Tensor] = None,
*,
loss: LossLike,
optimizer: OptimizerLike,
metrics: Optional[Sequence[MetricLike]] = None,
metric_names: Optional[Sequence[str]] = None,
batch_size: int = 32,
epochs: int = 1,
shuffle: bool = True,
verbose: int = 1,
validation_data: Optional[Tuple[Any, Any]] = None,
callbacks: Optional[Sequence["Callback"]] = None,
optimizer_kwargs: Optional[Mapping[str, Any]] = None,
) -> History
Train the model for a fixed number of epochs.
This method provides a Keras-like training loop built on top of
train_on_batch(). It aggregates batch logs into per-epoch metrics and
records them in a History object.
Supported input forms
1) (x, y) dataset:
- x and y are array-like and support len() and indexing
- batching/shuffling is handled internally via _iter_minibatches_xy
2) Iterable-of-batches:
- y is None
- x is an iterable yielding (x_batch, y_batch) tuples
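The two input forms can be sketched as follows. `iter_minibatches` is a hypothetical helper written for this illustration; it is not the library's _iter_minibatches_xy, whose exact behavior is not documented here.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def iter_minibatches(
    x: Sequence, y: Sequence, batch_size: int, shuffle: bool, seed: int = 0
) -> Iterator[Tuple[List, List]]:
    """Yield (x_batch, y_batch) pairs; the last batch may be smaller."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    order = list(range(len(x)))
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [x[i] for i in idx], [y[i] for i in idx]


xs = [[float(i)] for i in range(5)]
ys = [2.0 * i for i in range(5)]

# Form 1: an (x, y) dataset that is batched internally.
batches = list(iter_minibatches(xs, ys, batch_size=2, shuffle=False))

# Form 2: an iterable that already yields (x_batch, y_batch) tuples; y stays None.
prebatched = [(xs[:3], ys[:3]), (xs[3:], ys[3:])]
```

With five samples and a batch size of 2, the final batch holds a single sample, which is why per-epoch aggregation has to account for unequal batch sizes.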
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Union[Tensor, Iterable[Tuple[Tensor, Tensor]]] | Dataset inputs, or an iterable yielding | required |
| y | Optional[Tensor] | Dataset targets. Must be provided for dataset inputs; must be | None |
| loss | LossLike | Callable producing a scalar loss: | required |
| optimizer | OptimizerLike | Optimizer-like object, expected to expose | required |
| metrics | Optional[Sequence[MetricLike]] | Metric callables to compute per batch and aggregate per epoch. | None |
| metric_names | Optional[Sequence[str]] | Optional names matching | None |
| batch_size | int | Mini-batch size for dataset inputs. Default is 32. | 32 |
| epochs | int | Number of epochs to train for. Default is 1. | 1 |
| shuffle | bool | Whether to shuffle dataset inputs each epoch. Default is True. | True |
| verbose | int | If non-zero, prints a simple epoch summary. Default is 1. | 1 |
Returns:
| Type | Description |
|---|---|
| History | A |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
| TypeError | If |
Notes
Per-epoch metric values are computed as weighted means over batches,
using _batch_size_of() to determine the weight for each batch.
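A short worked example of the weighted mean, assuming each batch reports the same metric names. `aggregate_epoch` is a hypothetical helper for illustration; the batch sizes and loss values are made up.

```python
from typing import Dict, List, Tuple


def aggregate_epoch(batch_logs: List[Tuple[int, Dict[str, float]]]) -> Dict[str, float]:
    """Combine per-batch logs into epoch values, weighting each batch by its size."""
    total = sum(size for size, _ in batch_logs)
    sums: Dict[str, float] = {}
    for size, logs in batch_logs:
        for name, value in logs.items():
            sums[name] = sums.get(name, 0.0) + size * value
    return {name: s / total for name, s in sums.items()}


# Two full batches of 32 and a final partial batch of 16 (illustrative numbers).
logs = [
    (32, {"loss": 0.5}),
    (32, {"loss": 0.25}),
    (16, {"loss": 1.0}),
]
epoch = aggregate_epoch(logs)  # (32*0.5 + 32*0.25 + 16*1.0) / 80
unweighted = sum(l["loss"] for _, l in logs) / len(logs)
```

The unweighted mean would overstate the influence of the small final batch; weighting by batch size gives the same value as averaging over all samples.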
to_json_payload
to_json_payload() -> Dict[str, Any]
Serialize model architecture and weights into a JSON-serializable payload.
This is the in-memory counterpart to save_json(path) and is useful for:
- callback-based checkpointing without filesystem writes
- snapshotting best weights for early stopping restore
- programmatic checkpoint transport (e.g., RPC, DB, etc.)
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | JSON-serializable checkpoint payload with keys "format" (str), "arch" (dict), and "state" (dict). |
Notes
The payload is compatible with the on-disk format produced by save_json().
from_json_payload_
from_json_payload_(payload: Dict[str, Any]) -> None
Load weights in-place from a JSON checkpoint payload.
This method restores only the parameter state into the current model instance. It is intended for in-memory restores (e.g., EarlyStopping).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| payload | Dict[str, Any] | A checkpoint payload produced by | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the checkpoint format is unsupported. |
Notes
- This method does not reconstruct the module graph.
- It assumes the current model architecture matches the payload.
- Shape mismatches are detected by load_state_payload_().
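The snapshot-and-restore pattern that to_json_payload() and from_json_payload_() enable can be sketched with a stand-in model. `ToyModel` is hypothetical and stores plain lists instead of base64-encoded arrays; the validation losses are made-up numbers.

```python
import copy
from typing import Any, Dict, List

FORMAT = "keydnn.json.ckpt.v1"  # format tag quoted from the save_json() section


class ToyModel:
    """Hypothetical stand-in that keeps its weights in a plain dict."""

    def __init__(self) -> None:
        self.state: Dict[str, List[float]] = {"layer1.weight": [0.0, 0.0]}

    def to_json_payload(self) -> Dict[str, Any]:
        # Deep copy so later training steps cannot mutate the snapshot.
        return {"format": FORMAT, "arch": {}, "state": copy.deepcopy(self.state)}

    def from_json_payload_(self, payload: Dict[str, Any]) -> None:
        if payload.get("format") != FORMAT:
            raise ValueError("unsupported checkpoint format")
        for name, values in payload["state"].items():
            self.state[name][:] = values  # in-place: the list object is reused


model = ToyModel()
val_losses = [0.9, 0.4, 0.6, 0.7]  # made-up validation loss per epoch
best_loss, best_payload = float("inf"), None

for epoch, val_loss in enumerate(val_losses):
    model.state["layer1.weight"][:] = [float(epoch), float(epoch)]  # fake update
    if val_loss < best_loss:
        best_loss, best_payload = val_loss, model.to_json_payload()

weights_obj = model.state["layer1.weight"]
model.from_json_payload_(best_payload)  # roll back to the best epoch (epoch 1)
```

Because the restore is in place, objects that already hold a reference to the weights keep seeing the same containers after the rollback.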
get_config
get_config() -> dict[str, Any]
Return the (de)serialization config for this container.
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Configuration dictionary used by the JSON serializer. |
Notes
Sequential stores its children separately via the module tree, so
no additional config is required here.
add
add(layer: Module, name: Optional[str] = None) -> None
Append a module to the container and register it as a submodule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| layer | Module | The module to append. | required |
| name | Optional[str] | Explicit registration name. If omitted, a numeric name ("0", "1", ...) is assigned based on insertion order. | None |
Raises:
| Type | Description |
|---|---|
| TypeError | If |
| ValueError | If the provided |
Notes
- _layers preserves the execution order used by forward().
- _modules enables parameter discovery, recursion, and serialization.
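The split between an ordered list and a name registry can be sketched as below. `ToySequential` is a hypothetical stand-in that accepts plain callables and omits the validation that add() performs, so it raises neither TypeError nor ValueError.

```python
from typing import Callable, Dict, List, Optional


class ToySequential:
    """Hypothetical container mirroring the _layers / _modules split."""

    def __init__(self) -> None:
        self._layers: List[Callable] = []        # execution order
        self._modules: Dict[str, Callable] = {}  # name -> module registry

    def add(self, layer: Callable, name: Optional[str] = None) -> None:
        # Default name is the insertion index as a string: "0", "1", ...
        key = name if name is not None else str(len(self._layers))
        self._layers.append(layer)
        self._modules[key] = layer

    def forward(self, x: float) -> float:
        for layer in self._layers:  # iterate the ordered list, not the registry
            x = layer(x)
        return x


seq = ToySequential()
seq.add(lambda v: v + 1.0)
seq.add(lambda v: max(v, 0.0), name="relu")
seq.add(lambda v: v * 3.0)

registered = list(seq._modules)
result = seq.forward(1.0)
```

The third layer is named "2" because numeric names follow insertion order, independent of any explicit names given earlier.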
forward
forward(x: ITensor, *, _skip_norm: bool = False) -> ITensor
Apply all layers sequentially.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | ITensor | Input tensor to the first layer. | required |
| _skip_norm | bool | Whether to skip normalization layers. Useful during model build / compilation. | False |
Returns:
| Type | Description |
|---|---|
| ITensor | Output of the final layer. |
Notes
This method uses the ordered _layers list (not _modules) to ensure
deterministic execution order.
layers
layers() -> Tuple[Module, ...]
Return all layers as an immutable tuple.
Returns:
| Type | Description |
|---|---|
| Tuple[Module, ...] | Tuple of contained modules in execution order. |
summary
summary() -> str
Generate a lightweight textual summary of the container.
Returns:
| Type | Description |
|---|---|
| str | Human-readable representation listing layer indices and types. |
Notes
- No shape inference or parameter counting is performed.
- This is intended for quick debugging/inspection only.
predict
predict(x, *, requires_grad: bool = False)
Perform an inference-style forward pass for the Sequential model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | ITensor | Input tensor. | required |
| requires_grad | bool | Placeholder for future gradient-control semantics. Currently unused. | False |
Returns:
| Type | Description |
|---|---|
| ITensor | Output of the sequential computation. |
Notes
If the model exposes eval(), this method switches to eval mode before
running the forward pass.
from_config
classmethod
from_config(cfg: dict[str, Any]) -> Sequential
Construct a Sequential container from configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cfg | dict[str, Any] | Configuration dictionary produced by | required |
Returns:
| Type | Description |
|---|---|
| Sequential | A newly constructed |
Notes
Child modules are attached later by the deserializer into self._modules.
After attachment, _post_load() rebuilds the ordered _layers view.
to
to(device: Device | str) -> Sequential
Move this model (recursively) to device by moving all registered parameters.
This is a model-level convenience wrapper around Module.to(). It migrates
parameters/buffers of all child layers registered in the module tree, so
users can transfer the entire model without rebuilding a new Sequential.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| device | Device | Target device for this model's parameters. | required |
Returns:
| Type | Description |
|---|---|
| Sequential | This model ( |
Notes
- The authoritative traversal uses the _modules registry (built by add()), ensuring all layers participate in device transfer.
- This method may rebind parameter objects (out-of-place transfer) depending on parameter implementation; use to_() for best-effort identity preservation.
to_
to_(device: Device | str) -> Sequential
Move this model (recursively) to device in-place.
This is a model-level convenience wrapper around Module.to_(). It attempts
to migrate parameters in-place when supported (preserving parameter identity),
and falls back to out-of-place transfer + rebinding when necessary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| device | Device | Target device for this model's parameters. | required |
Returns:
| Type | Description |
|---|---|
| Sequential | This model ( |
Notes
- In-place behavior is best-effort: parameters without to_() support will be replaced via to() and rebound on the module.
- This method relies on the _modules registry to recurse into child layers.
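The best-effort in-place behavior can be sketched with two hypothetical parameter types, one that supports to_() and one that does not. `CopyParam`, `InPlaceParam`, and `move_in_place` are illustrative names only, and "cuda:0" is just a placeholder device string.

```python
from typing import Dict, List


class CopyParam:
    """Hypothetical parameter that only supports out-of-place transfer."""

    def __init__(self, data: List[float], device: str = "cpu") -> None:
        self.data = data
        self.device = device

    def to(self, device: str) -> "CopyParam":
        return type(self)(list(self.data), device)  # new object on the target


class InPlaceParam(CopyParam):
    """Hypothetical parameter that can also migrate in place."""

    def to_(self, device: str) -> "InPlaceParam":
        self.device = device
        return self


def move_in_place(params: Dict[str, CopyParam], device: str) -> None:
    """Use to_() when available; otherwise fall back to to() and rebind."""
    for name, param in list(params.items()):
        if hasattr(param, "to_"):
            param.to_(device)                # identity preserved
        else:
            params[name] = param.to(device)  # replaced and rebound


params: Dict[str, CopyParam] = {
    "weight": InPlaceParam([1.0, 2.0]),
    "bias": CopyParam([0.0]),
}
weight_before, bias_before = params["weight"], params["bias"]
move_in_place(params, "cuda:0")  # "cuda:0" is only an illustrative device string
```

After the move, `params["weight"]` is still the original object while `params["bias"]` is a new one, so code that holds a reference to the old bias object (an optimizer, for example) would still point at the copy left on the original device.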
keydnn.History
dataclass
Container for per-epoch training metrics.
History is a lightweight, Keras-inspired object returned by high-level
training routines (e.g., Model.fit). It stores aggregated metric values
for each completed epoch and provides convenience accessors for inspection.
Attributes:
| Name | Type | Description |
|---|---|---|
| history | Dict[str, List[float]] | Mapping from metric name to a list of per-epoch values. Each list is ordered by epoch index. |
| epoch | List[int] | List of epoch indices (0-based) corresponding to entries in |
Notes
- All metric values are stored as Python float for portability.
- This object is intentionally passive: it performs no aggregation logic beyond appending values supplied by the training loop.
append_epoch
append_epoch(
epoch_idx: int, logs: Mapping[str, Number]
) -> None
Append metrics for a completed epoch.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| epoch_idx | int | Zero-based index of the completed epoch. | required |
| logs | Mapping[str, Number] | Mapping from metric name to aggregated epoch value (e.g., mean loss, accuracy). | required |
Notes
- Metric values are coerced to float before storage.
- The caller (typically Model.fit) is responsible for ensuring that logs contains already-aggregated values.
last
last() -> Dict[str, float]
Return metrics from the most recent epoch.
Returns:
| Type | Description |
|---|---|
| Dict[str, float] | Mapping from metric name to its latest recorded value. Metrics with no recorded values are omitted. |
Notes
This is a convenience accessor commonly used after training to retrieve final loss/metric values without manual indexing.
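The documented behavior of History is small enough to sketch in full. `ToyHistory` below is a hypothetical re-creation from the descriptions above (float coercion, passive appends, last() omitting empty metrics); it is not the library's source.

```python
from dataclasses import dataclass, field
from numbers import Number
from typing import Dict, List, Mapping


@dataclass
class ToyHistory:
    """Hypothetical passive container for per-epoch metrics."""

    history: Dict[str, List[float]] = field(default_factory=dict)
    epoch: List[int] = field(default_factory=list)

    def append_epoch(self, epoch_idx: int, logs: Mapping[str, Number]) -> None:
        self.epoch.append(int(epoch_idx))
        for name, value in logs.items():
            # Values are coerced to float; no aggregation happens here.
            self.history.setdefault(name, []).append(float(value))

    def last(self) -> Dict[str, float]:
        # Metrics with no recorded values are omitted.
        return {name: vals[-1] for name, vals in self.history.items() if vals}


h = ToyHistory()
h.append_epoch(0, {"loss": 1, "acc": 0.5})
h.append_epoch(1, {"loss": 0.5, "acc": 0.75})
final = h.last()
```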
Notes
- Sequential is intended for linear stacks of layers.
- Parameters are registered automatically when layers are added.
- Training utilities such as callbacks and optimizers integrate with the model API.
- For custom architectures, users may subclass lower-level building blocks (documented in the Guides section).