Layers

This section documents the neural network layers provided by KeyDNN’s public API.
All layers are part of the presentation layer and are safe to depend on.

Unless otherwise noted, layers:

- operate on Tensor inputs
- support automatic differentiation
- respect the device (CPU / CUDA) of their parameters
- follow PyTorch-style shape conventions where applicable

Core Layers

keydnn.Dense

Bases: _BaseLinear

Keras-style Dense layer with lazy input-dimension inference.

Users specify only out_features at construction time. The corresponding in_features dimension is inferred from the first input tensor passed to forward (x.shape[1]).

Device behavior

If device is None at construction time, the layer adopts x.device on the first forward pass.
If device is provided, forward enforces that inputs already reside on that device (no implicit transfers).
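The lazy-build and device rules above can be illustrated with a small pure-Python stand-in. This is a sketch under stated assumptions, not KeyDNN's implementation: `LazyDenseSketch`, the `x_device` argument (standing in for `x.device`), the nested-list input, and the seeded uniform initializer are all hypothetical.

```python
import random


class LazyDenseSketch:
    """Toy stand-in for a lazily built dense layer (hypothetical, not KeyDNN code)."""

    def __init__(self, out_features, device=None):
        self.out_features = out_features
        self.in_features = None  # unknown until the first forward pass
        self.device = device     # None means "adopt the device of the first input"
        self.weight = None

    @property
    def is_built(self):
        # Mirrors the documented rule: built once `weight` exists.
        return self.weight is not None

    def forward(self, x, x_device="cpu"):
        # x is a list of rows; x_device stands in for x.device.
        if not x or not isinstance(x[0], list):
            raise ValueError("expected a 2D input of shape (batch, in_features)")
        if self.device is None:
            self.device = x_device  # adopt the input device on first use
        elif self.device != x_device:
            raise RuntimeError("input device does not match the layer device")
        if not self.is_built:
            # Infer in_features from the second dimension of the first input.
            self.in_features = len(x[0])
            rng = random.Random(0)
            self.weight = [
                [rng.uniform(-0.1, 0.1) for _ in range(self.in_features)]
                for _ in range(self.out_features)
            ]
        return [
            [sum(w * v for w, v in zip(w_row, x_row)) for w_row in self.weight]
            for x_row in x
        ]


layer = LazyDenseSketch(out_features=3)
y = layer.forward([[1.0, 2.0], [3.0, 4.0]])  # builds a 3 x 2 weight on first call
```

After the first call the sketch reports `is_built` as true and rejects inputs from a different device instead of moving them, which is the "no implicit transfers" behavior described above.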

is_built `property`

is_built: bool

Return whether parameters have been materialized.

Returns:

Type	Description
`bool`	True if `weight` exists, False otherwise.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This sets self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This sets self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().
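The recursive mode switch performed by train() and eval() can be sketched as follows. The classes are hypothetical stand-ins, the initial value `training = True` is an assumption, and the deterministic "drop every other element" mask only replaces real random dropout so the example stays reproducible.

```python
class ModuleSketch:
    """Hypothetical minimal module with a child registry and a training flag."""

    def __init__(self):
        self.training = True  # assumed default, not confirmed by this page
        self._modules = {}

    def register_module(self, name, module):
        if module is None:
            return
        self._modules[name] = module
        setattr(self, name, module)

    def train(self):
        self.training = True
        for child in self._modules.values():
            child.train()  # recurse into children
        return self

    def eval(self):
        self.training = False
        for child in self._modules.values():
            child.eval()  # recurse into children
        return self


class DropoutSketch(ModuleSketch):
    def forward(self, values):
        if not self.training:
            return list(values)  # identity in eval mode
        # Deterministic stand-in for a random mask: zero odd positions, scale the rest.
        return [0.0 if i % 2 else v * 2.0 for i, v in enumerate(values)]


root = ModuleSketch()
root.register_module("drop", DropoutSketch())
root.eval()
eval_out = root.drop.forward([1.0, 2.0, 3.0])
root.train()
train_out = root.drop.forward([1.0, 2.0, 3.0])
```

Both methods return the module itself, so calls can be chained in the same way as with PyTorch's Module.train() and Module.eval().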

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.
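The three rules in the notes above (skip None, overwrite on a repeated name, mirror the entry as an attribute) fit in a few lines. A minimal sketch, assuming a plain dict registry named `_parameters` as mentioned elsewhere on this page; `RegistrySketch` and the list-valued "parameters" are hypothetical.

```python
class RegistrySketch:
    """Hypothetical module that only implements parameter registration."""

    def __init__(self):
        self._parameters = {}

    def register_parameter(self, name, param):
        if param is None:
            return  # nothing is registered
        self._parameters[name] = param  # an existing name is overwritten
        setattr(self, name, param)      # makes self.<name> work


m = RegistrySketch()
m.register_parameter("weight", [1.0, 2.0])
m.register_parameter("weight", [3.0, 4.0])  # replaces the first registration
m.register_parameter("bias", None)          # skipped entirely
```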

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).
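A sketch of how the prefix argument can produce fully qualified names during recursion. The "." separator between module and parameter names follows PyTorch's convention and is an assumption here; `NamedSketch` and the string-valued "parameters" are hypothetical.

```python
class NamedSketch:
    """Hypothetical module with parameter and child registries."""

    def __init__(self):
        self._parameters = {}
        self._modules = {}

    def named_parameters(self, prefix=""):
        # Own parameters first, then each child's parameters with an extended prefix.
        for name, param in self._parameters.items():
            yield prefix + name, param
        for child_name, child in self._modules.items():
            yield from child.named_parameters(prefix + child_name + ".")


head = NamedSketch()
head._parameters.update({"weight": "W_head", "bias": "b_head"})
model = NamedSketch()
model._parameters["weight"] = "W_model"
model._modules["head"] = head

names = [name for name, _ in model.named_parameters()]
```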

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration dict.

This stores only constructor-level hyperparameters. Parameter values are expected to be restored separately via the checkpoint/state mechanism.

Returns:

Type	Description
`Dict[str, Any]`	Configuration containing `in_features` (if known), `out_features`, `bias`, `device`, `dtype`, and `initializer`.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.
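The difference between true in-place migration and the out-of-place fallback, and why it matters for anything holding parameter references, can be sketched like this. All class names are hypothetical and devices are plain strings; this is not KeyDNN's implementation.

```python
class InPlaceParam:
    """Hypothetical parameter that supports in-place migration."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return InPlaceParam(device)

    def to_(self, device):
        self.device = device
        return self


class CopyOnlyParam:
    """Hypothetical parameter that can only be copied to another device."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return CopyOnlyParam(device)


class MigratingModule:
    def __init__(self):
        self._parameters = {}
        self._modules = {}

    def register_parameter(self, name, param):
        if param is not None:
            self._parameters[name] = param
            setattr(self, name, param)

    def to_(self, device):
        for name, param in list(self._parameters.items()):
            if hasattr(param, "to_"):
                param.to_(device)        # identity preserved
            else:
                moved = param.to(device)  # fallback: new object, rebound on the module
                self._parameters[name] = moved
                setattr(self, name, moved)
        for child in self._modules.values():
            child.to_(device)
        return self


module = MigratingModule()
weight, bias = InPlaceParam(), CopyOnlyParam()
module.register_parameter("weight", weight)
module.register_parameter("bias", bias)
module.to_("cuda:0")
```

In this sketch the old `bias` reference still reports "cpu" after the call, which is the situation an optimizer would be in if it captured the parameter before migration.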

forward

forward(x: Tensor) -> Tensor

Apply the Dense transform to a 2D input tensor.

On first call, infers in_features from x.shape[1] and materializes parameters.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (batch, in_features).	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (batch, out_features).

Raises:

Type	Description
`ValueError`	If input is not 2D.
`RuntimeError`	If `device` was specified and does not match `x.device`.

from_config `classmethod`

from_config(cfg: Dict[str, Any]) -> 'Dense'

Reconstruct a Dense layer from configuration.

If in_features is present, eagerly materializes parameters so a subsequent weight-load can attach values deterministically.

Parameters:

Name	Type	Description	Default
`cfg`	`Dict[str, Any]`	Configuration dictionary produced by `get_config()`.	required

Returns:

Type	Description
`Dense`	Reconstructed Dense module.
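A sketch of the get_config() / from_config() round trip, including the eager materialization when in_features is present. Only a subset of the documented keys is modeled (`out_features`, `bias`, `in_features`); `DenseConfigSketch` and the zero-filled weight are hypothetical simplifications.

```python
import json


class DenseConfigSketch:
    """Hypothetical lazy dense layer that only models configuration handling."""

    def __init__(self, out_features, bias=True, in_features=None):
        self.out_features = out_features
        self.bias = bias
        self.in_features = None
        self.weight = None
        if in_features is not None:
            self._materialize(in_features)  # eager build when the size is known

    def _materialize(self, in_features):
        self.in_features = in_features
        self.weight = [[0.0] * in_features for _ in range(self.out_features)]

    def get_config(self):
        cfg = {"out_features": self.out_features, "bias": self.bias}
        if self.in_features is not None:
            cfg["in_features"] = self.in_features  # included only if known
        return cfg

    @classmethod
    def from_config(cls, cfg):
        return cls(
            out_features=cfg["out_features"],
            bias=cfg["bias"],
            in_features=cfg.get("in_features"),
        )


built = DenseConfigSketch(out_features=4, in_features=3)
payload = json.dumps(built.get_config())  # hyperparameters only, no weights
restored = DenseConfigSketch.from_config(json.loads(payload))
unbuilt = DenseConfigSketch.from_config(DenseConfigSketch(out_features=4).get_config())
```

The restored layer already has a weight of the right shape, so a later weight load has something to write into; the layer restored from an unbuilt configuration stays lazy.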

keydnn.Linear

Bases: _BaseLinear

Fully-connected (affine) layer with eager parameter allocation.

Linear allocates weight and (optionally) bias during initialization, making it immediately usable for both training and inference.

Notes

This class preserves the historical Linear(in_features, out_features, ...) API while delegating all core functionality to _BaseLinear.

is_built `property`

is_built: bool

Return whether parameters have been materialized.

Returns:

Type	Description
`bool`	True if `weight` exists, False otherwise.

forward

forward(x: Tensor) -> Tensor

Apply the affine transform to a 2D input tensor.

This method assumes the layer has been materialized (i.e., weight exists). Lazy subclasses should call _materialize(...) before delegating here.

Computation: y = x @ W^T (+ b)

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (batch, in_features).	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (batch, out_features).

Raises:

Type	Description
`RuntimeError`	If the layer is not built, or devices mismatch.
`ValueError`	If input rank/shape is incompatible with the layer.
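The computation y = x @ W^T (+ b) can be written out element by element. A pure-Python sketch on nested lists, where the weight has shape (out_features, in_features) as implied by the formula and the documented input and output shapes; `linear_forward` is a hypothetical helper, not part of KeyDNN.

```python
def linear_forward(x, weight, bias=None):
    """Compute y = x @ W^T (+ b) for x of shape (batch, in_features).

    weight has shape (out_features, in_features); bias has shape (out_features,).
    """
    in_features = len(weight[0])
    out = []
    for row in x:
        if len(row) != in_features:
            raise ValueError("input feature count does not match the weight")
        # One dot product per output feature (per row of W).
        y = [sum(w * v for w, v in zip(w_row, row)) for w_row in weight]
        if bias is not None:
            y = [a + b for a, b in zip(y, bias)]
        out.append(y)
    return out


W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # (out_features=3, in_features=2)
b = [0.5, 0.5, 0.5]
result = linear_forward([[1.0, 2.0]], W, b)  # [[1.5, 2.5, 3.5]]
```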

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This sets self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This sets self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration dict.

This stores only constructor-level hyperparameters. Parameter values are expected to be restored separately via the checkpoint/state mechanism.

Returns:

Type	Description
`Dict[str, Any]`	Configuration containing `in_features` (if known), `out_features`, `bias`, `device`, `dtype`, and `initializer`.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

from_config `classmethod`

from_config(cfg: Dict[str, Any]) -> 'Linear'

Construct a Linear layer from a configuration dict.

Parameters:

Name	Type	Description	Default
`cfg`	`Dict[str, Any]`	Configuration dictionary produced by `get_config()`.	required

Returns:

Type	Description
`Linear`	A newly constructed `Linear` instance with matching hyperparameters.

Convolution Layers

Note
KeyDNN provides both Conv2D / Conv2DTranspose and Conv2d / Conv2dTranspose. These are equivalent and exist for naming compatibility.

keydnn.Conv2D `module-attribute`

Conv2D = Conv2d

keydnn.Conv2DTranspose `module-attribute`

Conv2DTranspose = Conv2dTranspose

keydnn.Conv2d

Bases: Module

Two-dimensional convolution layer (NCHW).

This module applies a 2D convolution over an input tensor using learnable weights and an optional bias term. It supports configurable kernel size, stride, and padding, and integrates fully with KeyDNN's autograd system.

Parameters:

Name	Type	Description	Default
`in_channels`	`int`	Number of channels in the input tensor.	required
`out_channels`	`int`	Number of channels produced by the convolution.	required
`kernel_size`	`int or tuple[int, int]`	Size of the convolution kernel. If an integer is provided, the same value is used for both height and width.	required
`stride`	`int or tuple[int, int]`	Stride of the convolution. Defaults to 1.	`1`
`padding`	`int or tuple[int, int]`	Zero-padding applied to the input. Defaults to 0.	`0`
`bias`	`bool`	Whether to include a learnable bias term. Defaults to True.	`True`
`device`	`Device`	Device on which parameters will be allocated. Defaults to CPU.	`None`
`dtype`	`Any`	Data type used to initialize parameters. Defaults to float32 if not provided.	`None`
`initializer`	`str`	Name of the weight initializer applied to the convolution kernel. Defaults to `"kaiming"`. The bias parameter, if present, is initialized using the `"zeros"` initializer.	`'kaiming'`

Attributes:

Name	Type	Description
`weight`	`Parameter`	Convolution kernel weights of shape (out_channels, in_channels, kernel_height, kernel_width).
`bias`	`Optional[Parameter]`	Optional bias parameter of shape (out_channels,).
`stride`	`tuple[int, int]`	Convolution stride as a 2D pair.
`padding`	`tuple[int, int]`	Convolution padding as a 2D pair.

Notes

Weight initialization is performed via the Parameter initializer registry, not inside this module.
This module does not perform any numerical computation directly; it delegates forward and backward logic to Conv2dFn.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This sets self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This sets self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply the convolution operation to an input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (N, C_in, H, W).	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (N, C_out, H_out, W_out).

Notes

If any of the inputs or parameters require gradients, an autograd Context is attached to the output tensor.
The backward function delegates gradient computation to Conv2dFn.
No validation of input shape is performed here; mismatches are expected to be caught by lower-level kernels.
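The output shape (N, C_out, H_out, W_out) is documented without a formula for H_out and W_out. A sketch, assuming the standard convolution arithmetic without dilation (this page does not confirm which formula KeyDNN's kernels use); `conv2d_output_shape` is a hypothetical helper.

```python
def _pair(value):
    # An int applies to both height and width, as described for kernel_size.
    return (value, value) if isinstance(value, int) else tuple(value)


def conv2d_output_shape(input_shape, out_channels, kernel_size, stride=1, padding=0):
    """Return (N, C_out, H_out, W_out) for an NCHW input, assuming no dilation."""
    n, _, h, w = input_shape
    (kh, kw), (sh, sw), (ph, pw) = _pair(kernel_size), _pair(stride), _pair(padding)
    h_out = (h + 2 * ph - kh) // sh + 1
    w_out = (w + 2 * pw - kw) // sw + 1
    return (n, out_channels, h_out, w_out)


same = conv2d_output_shape((8, 3, 32, 32), 16, kernel_size=3, padding=1)
strided = conv2d_output_shape((8, 3, 32, 32), 16, kernel_size=5, stride=2)
```

Because forward performs no shape validation, a helper like this can be useful for checking layer stacks before running them.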

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for reconstructing this layer.

Notes

This configuration captures constructor-level hyperparameters only. Trainable parameters (weights and bias) are serialized separately by the checkpoint/state_dict mechanism.

from_config `classmethod`

from_config(cfg: Dict[str, Any]) -> 'Conv2d'

Construct a Conv2d layer from a configuration dict.

Notes

This reconstructs the module structure (hyperparameters). Weights are expected to be loaded afterward from the checkpoint state.

keydnn.Conv2dTranspose

Bases: Module

Two-dimensional transposed convolution layer (NCHW).

Parameters:

Name	Type	Description	Default
`in_channels`	`int`	Number of channels in the input tensor.	required
`out_channels`	`int`	Number of channels produced by the transposed convolution.	required
`kernel_size`	`int or tuple[int, int]`	Size of the convolution kernel.	required
`stride`	`int or tuple[int, int]`	Stride of the transposed convolution. Defaults to 1.	`1`
`padding`	`int or tuple[int, int]`	Padding used by the transposed convolution. Defaults to 0.	`0`
`output_padding`	`int or tuple[int, int]`	Additional size added to one side of each output dimension. Defaults to 0. Must satisfy output_padding[d] < stride[d] in each spatial dimension d.	`0`
`bias`	`bool`	Whether to include a learnable bias term. Defaults to True.	`True`
`device`	`Device`	Device on which parameters will be allocated.	`None`
`dtype`	`Any`	Data type used to initialize parameters. Kept for backward compatibility.	`None`

Attributes:

Name	Type	Description
`weight`	`Parameter`	Kernel weights of shape (in_channels, out_channels, K_h, K_w).
`bias`	`Optional[Parameter]`	Optional bias parameter of shape (out_channels,).
`stride`	`tuple[int, int]`	Stride as a 2D pair.
`padding`	`tuple[int, int]`	Padding as a 2D pair.
`output_padding`	`tuple[int, int]`	Output padding as a 2D pair.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This sets self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This sets self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply the transposed convolution operation to an input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (N, C_in, H_in, W_in).	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (N, C_out, H_out, W_out).

Notes

If any of the inputs or parameters require gradients, an autograd Context is attached to the output tensor.
The backward function delegates gradient computation to Conv2dTransposeFn.
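For the transposed convolution, the output size is commonly computed with the arithmetic below. This is a sketch assuming the usual convention without dilation, which this page does not confirm for KeyDNN; `conv2d_transpose_output_shape` is a hypothetical helper. The check on output_padding follows the constraint stated in the parameter table.

```python
def _pair(value):
    return (value, value) if isinstance(value, int) else tuple(value)


def conv2d_transpose_output_shape(
    input_shape, out_channels, kernel_size, stride=1, padding=0, output_padding=0
):
    """Return (N, C_out, H_out, W_out) for an NCHW input, assuming no dilation."""
    n, _, h_in, w_in = input_shape
    (kh, kw), (sh, sw) = _pair(kernel_size), _pair(stride)
    (ph, pw), (oph, opw) = _pair(padding), _pair(output_padding)
    if oph >= sh or opw >= sw:
        raise ValueError("output_padding must be smaller than stride")
    h_out = (h_in - 1) * sh - 2 * ph + kh + oph
    w_out = (w_in - 1) * sw - 2 * pw + kw + opw
    return (n, out_channels, h_out, w_out)


doubled = conv2d_transpose_output_shape((1, 8, 16, 16), 4, kernel_size=4, stride=2, padding=1)
padded = conv2d_transpose_output_shape(
    (1, 8, 16, 16), 4, kernel_size=3, stride=2, padding=1, output_padding=1
)
```

Both calls map a 16 x 16 input to 32 x 32; the second shows output_padding resolving the size ambiguity that a stride greater than 1 introduces.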

Normalization Layers

Note
KeyDNN provides both BatchNorm1D / BatchNorm2D and BatchNorm1d / BatchNorm2d. These are equivalent and exist for naming compatibility.

keydnn.BatchNorm1d

Bases: BatchNorm1d

Presentation-layer BatchNorm1d with ergonomic defaults.

This class subclasses the infrastructure BatchNorm1d implementation and only adjusts constructor ergonomics:

device becomes optional and defaults to CPU.
device may be provided as Device or a string like "cuda:0".

All numerical behavior, buffer updates, and autograd logic are inherited unchanged from the infrastructure implementation.

forward

forward(x: Tensor) -> Tensor

Apply BatchNorm1d to an input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (N, C). Must be a CPU tensor and on the same device as the module.	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (N, C). Requires gradients if the input requires gradients and/or (when affine=True) gamma/beta require gradients.

Raises:

Type	Description
`RuntimeError`	If the input tensor is not on CPU.
`ValueError`	If device mismatches, input rank is not 2D, or channel count does not match `num_features`.
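The training-mode normalization for an (N, C) input can be sketched in pure Python. This shows textbook batch normalization with biased batch variance; the `eps` value, the omission of running statistics, and the helper name `batchnorm1d_train` are assumptions and are not taken from KeyDNN.

```python
import math


def batchnorm1d_train(x, gamma=None, beta=None, eps=1e-5):
    """Normalize each feature column of x (shape (N, C)) using batch statistics."""
    n, c = len(x), len(x[0])
    means = [sum(row[j] for row in x) / n for j in range(c)]
    variances = [sum((row[j] - means[j]) ** 2 for row in x) / n for j in range(c)]
    out = []
    for row in x:
        normalized = []
        for j in range(c):
            x_hat = (row[j] - means[j]) / math.sqrt(variances[j] + eps)
            scale = gamma[j] if gamma is not None else 1.0  # affine scale
            shift = beta[j] if beta is not None else 0.0    # affine shift
            normalized.append(scale * x_hat + shift)
        out.append(normalized)
    return out


batch = [[1.0, 10.0], [3.0, 30.0]]
normalized = batchnorm1d_train(batch)
shifted = batchnorm1d_train(batch, gamma=[2.0, 2.0], beta=[1.0, 1.0])
```

Each output column has a mean of approximately zero regardless of the input scale, so the two columns above normalize to nearly the same values.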

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This sets self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This sets self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for this module.

Returns:

Type	Description
`Dict[str, Any]`	Configuration dictionary sufficient to reconstruct the module via `from_config`.

from_config `classmethod`

from_config(config: Dict[str, Any]) -> 'BatchNorm1d'

Construct a BatchNorm1d instance from a configuration dictionary.

Parameters:

Name	Type	Description	Default
`config`	`Dict[str, Any]`	Configuration as produced by `get_config()`.	required

Returns:

Type	Description
`BatchNorm1d`	Reconstructed module instance.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

keydnn.BatchNorm2d

Bases: BatchNorm2d

Presentation-layer BatchNorm2d with ergonomic defaults.

This class subclasses the infrastructure BatchNorm2d implementation and only adjusts constructor ergonomics:

device becomes optional and defaults to CPU.
device may be provided as Device or a string like "cuda:0".

All numerical behavior, buffer updates, and autograd logic are inherited unchanged from the infrastructure implementation.

forward

forward(x: Tensor) -> Tensor

Apply BatchNorm2d to an input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (N, C, H, W). Must be a CPU tensor and on the same device as the module.	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (N, C, H, W). Requires gradients if the input requires gradients and/or (when affine=True) gamma/beta require gradients.

Raises:

Type	Description
`RuntimeError`	If the input tensor is not on CPU.
`ValueError`	If the input device does not match the module device, the input is not 4D, or the channel count does not match `num_features`.
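As a rough illustration of what a training-mode forward pass computes, the NumPy sketch below normalizes each channel over the (N, H, W) axes and applies the affine parameters. It is not the KeyDNN implementation: the function name, the `eps` value, and the use of the biased batch variance are assumptions, and running-statistics updates are omitted.

```python
import numpy as np


def batchnorm2d_train_sketch(x, gamma, beta, eps=1e-5):
    """Per-channel batch normalization for an (N, C, H, W) array (assumed semantics)."""
    if x.ndim != 4:
        raise ValueError(f"expected a 4D input, got {x.ndim}D")
    if x.shape[1] != gamma.shape[0]:
        raise ValueError("channel count does not match num_features")
    # Statistics are computed per channel, across batch and spatial axes.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)  # biased variance (assumption)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma/beta are per-channel scale and shift, broadcast over N, H, W.
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3, 4, 4))
y = batchnorm2d_train_sketch(x, gamma=np.ones(3), beta=np.zeros(3))
```

With `gamma = 1` and `beta = 0`, each output channel has approximately zero mean and unit variance, and the output shape equals the input shape.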

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This toggles self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This toggles self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().
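The recursive mode switch described for train() and eval() can be sketched as follows. `SketchModule` is a hypothetical stand-in, and the assumption that a fresh module starts in training mode is borrowed from PyTorch rather than stated by this reference.

```python
class SketchModule:
    """Hypothetical module with a training flag and a child registry."""

    def __init__(self):
        self.training = True  # assumed default, as in PyTorch
        self._modules = {}

    def train(self):
        self.training = True
        for child in self._modules.values():
            child.train()  # propagate to all descendants
        return self

    def eval(self):
        self.training = False
        for child in self._modules.values():
            child.eval()
        return self


parent = SketchModule()
child = SketchModule()
grandchild = SketchModule()
parent._modules["child"] = child
child._modules["grandchild"] = grandchild

returned = parent.eval()
mode_after_eval = (parent.training, child.training, grandchild.training)
parent.train()
mode_after_train = (parent.training, child.training, grandchild.training)
```

Because both methods return the module itself, calls can be chained, for example `model.eval().to(device)` in PyTorch-style code.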

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.
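The three notes above (skip on None, overwrite on a repeated name, attribute binding) can be illustrated with a small sketch. `SketchModule` is hypothetical, and plain lists stand in for parameter objects.

```python
class SketchModule:
    """Hypothetical module showing the registration rules from the notes."""

    def __init__(self):
        self._parameters = {}

    def register_parameter(self, name, param):
        if param is None:
            return  # nothing is registered
        self._parameters[name] = param  # an existing name is overwritten
        setattr(self, name, param)  # makes self.<name> work


m = SketchModule()
m.register_parameter("bias", None)  # skipped
m.register_parameter("weight", [1.0, 2.0])
m.register_parameter("weight", [3.0])  # replaces the previous entry
```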

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).
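A minimal sketch of how a recursive traversal with a `prefix` argument can build fully qualified names. `SketchModule` is hypothetical, and the dot separator is an assumption modeled on PyTorch; this reference does not state which separator KeyDNN uses.

```python
class SketchModule:
    """Hypothetical module with parameter and child registries."""

    def __init__(self):
        self._parameters = {}
        self._modules = {}

    def named_parameters(self, prefix=""):
        # Own parameters first, qualified by the current prefix.
        for name, param in self._parameters.items():
            yield (f"{prefix}.{name}" if prefix else name), param
        # Then recurse, extending the prefix with the child's name.
        for name, child in self._modules.items():
            child_prefix = f"{prefix}.{name}" if prefix else name
            yield from child.named_parameters(child_prefix)


root = SketchModule()
root._parameters["weight"] = "w0"
block = SketchModule()
block._parameters["weight"] = "w1"
block._parameters["bias"] = "b1"
root._modules["block"] = block

names = [name for name, _ in root.named_parameters()]
```

Here `names` lists the root's own parameter followed by the child's parameters under the `block.` prefix.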

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for this module.

Returns:

Type	Description
`Dict[str, Any]`	Configuration dictionary sufficient to reconstruct the module via `from_config`.

from_config `classmethod`

from_config(config: Dict[str, Any]) -> 'BatchNorm2d'

Construct a BatchNorm2d instance from a configuration dictionary.

Parameters:

Name	Type	Description	Default
`config`	`Dict[str, Any]`	Configuration as produced by `get_config()`.	required

Returns:

Type	Description
`BatchNorm2d`	Reconstructed module instance.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

keydnn.BatchNorm1D `module-attribute`

BatchNorm1D = BatchNorm1d

keydnn.BatchNorm2D `module-attribute`

BatchNorm2D = BatchNorm2d

keydnn.LayerNorm

Bases: LayerNorm

Presentation-layer LayerNorm with ergonomic defaults.

This class subclasses the infrastructure LayerNorm and only adjusts constructor ergonomics:

device becomes optional and defaults to CPU.
device may be provided as Device or a string like "cuda:0".

All numerical behavior, parameter management, and autograd logic are inherited unchanged from the infrastructure implementation.

forward

forward(x: Tensor) -> Tensor

Apply LayerNorm to an input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (..., *normalized_shape). Must be a CPU tensor and on the same device as the module.	required

Returns:

Type	Description
`Tensor`	Output tensor of the same shape as `x`.

Raises:

Type	Description
`RuntimeError`	If the input tensor is not on CPU.
`ValueError`	If the input device does not match the module device, the input rank is too small for `normalized_shape`, or the trailing dimensions do not match `normalized_shape`.
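The shape contract above can be illustrated with a NumPy sketch that normalizes over the trailing `normalized_shape` dimensions. This is not the KeyDNN implementation: the function name and `eps` value are assumptions, `normalized_shape` is assumed to be non-empty, and the affine parameters are left out for brevity.

```python
import numpy as np


def layernorm_sketch(x, normalized_shape, eps=1e-5):
    """Normalize over the trailing dims of x (assumed semantics, no affine)."""
    k = len(normalized_shape)
    if x.ndim < k:
        raise ValueError("input rank is too small for normalized_shape")
    if tuple(x.shape[-k:]) != tuple(normalized_shape):
        raise ValueError("trailing dims do not match normalized_shape")
    axes = tuple(range(x.ndim - k, x.ndim))  # the last k axes
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(4, 5, 16))
y = layernorm_sketch(x, normalized_shape=(16,))
```

Unlike batch normalization, the statistics here are computed independently for every leading index, so the result does not depend on the batch size.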

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This toggles self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This toggles self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for this module.

from_config `classmethod`

from_config(config: Dict[str, Any]) -> 'LayerNorm'

Construct a LayerNorm instance from a configuration dictionary.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

Regularization Layers

keydnn.Dropout

Bases: Module

Dropout regularization layer (inverted dropout).

This layer randomly zeroes elements of the input tensor with probability p during training and rescales the remaining elements by 1 / (1 - p) so that the expected activation magnitude remains unchanged.

Behavior

Training mode: y = x * mask / (1 - p), where mask ~ Bernoulli(1 - p) (equivalently y = x * ((rand < keep_prob) / keep_prob))
Evaluation mode: y = x (identity)
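The two modes above can be written out as a short NumPy sketch. The function name and the explicit `rng` argument are illustrative assumptions; the formula itself is the one stated in the Behavior list.

```python
import numpy as np


def inverted_dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout on a NumPy array (illustrative, not the KeyDNN code)."""
    if not training:
        return x  # evaluation mode: identity
    keep_prob = 1.0 - p
    if keep_prob <= 0.0:
        raise ValueError("p must satisfy 0.0 <= p < 1.0")
    rng = np.random.default_rng() if rng is None else rng
    # mask ~ Bernoulli(keep_prob): True where the element is kept.
    mask = rng.random(x.shape) < keep_prob
    # Rescale the survivors so the expected value matches the input.
    return x * mask / keep_prob


x = np.ones((1000, 100))
y_train = inverted_dropout(x, p=0.5, rng=np.random.default_rng(0))
y_eval = inverted_dropout(x, p=0.5, training=False)
```

With `p = 0.5`, every element of `y_train` is either 0 or 2, and the mean stays close to 1, which is why no extra scaling is needed at evaluation time.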

Parameters:

Name	Type	Description	Default
`p`	`float`	Probability of dropping (zeroing) an element. Must satisfy 0.0 <= p < 1.0.	`0.5`

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This toggles self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This toggles self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply dropout to the input tensor.

During training, elements of the input tensor are randomly masked according to the dropout probability and scaled using inverted dropout. During evaluation, the input tensor is returned unchanged.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor (CPU or CUDA).	required

Returns:

Type	Description
`Tensor`	Output tensor after applying dropout (or identity if not in training mode).

Raises:

Type	Description
`ValueError`	If `p` implies a non-positive keep probability (numerical guard).

get_config

get_config() -> Dict[str, Any]

Return a serializable configuration for this module.

Returns:

Type	Description
`Dict[str, Any]`	Configuration dictionary containing the dropout probability.

from_config `classmethod`

from_config(config: Dict[str, Any]) -> 'Dropout'

Construct a Dropout module from a configuration dictionary.

Parameters:

Name	Type	Description	Default
`config`	`Dict[str, Any]`	Configuration dictionary produced by `get_config`.	required

Returns:

Type	Description
`Dropout`	A new Dropout instance initialized from the configuration.
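The get_config / from_config pair is meant to round-trip through a serializable dictionary. The sketch below shows that pattern with a hypothetical `DropoutSketch` class; the `"p"` key is an assumption, since this reference does not list the keys of the real configuration dictionary.

```python
import json


class DropoutSketch:
    """Hypothetical stand-in showing the config round-trip pattern."""

    def __init__(self, p=0.5):
        if not 0.0 <= p < 1.0:
            raise ValueError("p must satisfy 0.0 <= p < 1.0")
        self.p = float(p)

    def get_config(self):
        # Only plain Python types, so the dict survives json.dumps.
        return {"p": self.p}

    @classmethod
    def from_config(cls, config):
        return cls(p=config["p"])


original = DropoutSketch(p=0.25)
payload = json.dumps(original.get_config())  # e.g. written to disk
restored = DropoutSketch.from_config(json.loads(payload))
```

The restored object is a new instance with the same dropout probability; no parameters or random state are carried over.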

Pooling Layers

Note
KeyDNN provides both *Pool2D and *Pool2d names.
The *Pool2D names are module-level aliases of the corresponding *Pool2d classes and exist only for naming compatibility.

keydnn.MaxPool2D `module-attribute`

MaxPool2D = MaxPool2d

keydnn.MaxPool2d

Bases: Pool2dConfigMixin, Module

2D max pooling module (NCHW).

This module applies max pooling over the spatial dimensions (H, W) independently per channel. The output retains the batch and channel dimensions while reducing spatial resolution according to pooling hyperparameters.

Shape semantics

Input: x.shape == (N, C, H, W)

Output: y.shape == (N, C, H_out, W_out)

Notes

Backpropagation routes gradients to the input positions that produced the maxima during the forward pass (argmax-based routing).
The underlying CPU reference implementation pads with -inf so padded values never become maxima (important for correctness at borders).
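To make the -inf padding note concrete, here is a NumPy sketch of a forward pass. It is not the KeyDNN kernel: the function name is hypothetical, and the output-size formula `(H + 2*p_h - k_h) // s_h + 1` is the usual PyTorch floor-mode convention, assumed here because this reference does not spell it out.

```python
import numpy as np


def maxpool2d_sketch(x, kernel_size, stride, padding):
    """Naive NCHW max pooling with -inf padding (illustrative only)."""
    k_h, k_w = kernel_size
    s_h, s_w = stride
    p_h, p_w = padding
    n, c, h, w = x.shape
    h_out = (h + 2 * p_h - k_h) // s_h + 1  # assumed floor-mode formula
    w_out = (w + 2 * p_w - k_w) // s_w + 1
    # Pad only the spatial axes; -inf can never win a max.
    xp = np.pad(
        x,
        ((0, 0), (0, 0), (p_h, p_h), (p_w, p_w)),
        mode="constant",
        constant_values=-np.inf,
    )
    y = np.empty((n, c, h_out, w_out), dtype=x.dtype)
    for i in range(h_out):
        for j in range(w_out):
            window = xp[:, :, i * s_h : i * s_h + k_h, j * s_w : j * s_w + k_w]
            y[:, :, i, j] = window.max(axis=(2, 3))
    return y


# An all-negative input: zero padding would wrongly produce 0 at the borders.
x = -np.ones((1, 1, 4, 4))
y = maxpool2d_sketch(x, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
```

Every output value is -1 here. Had the padding been zero, the border windows would have returned 0, a value that does not occur anywhere in the input.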

kernel_size `property`

kernel_size: Tuple[int, int]

Return the pooling window size.

Returns:

Type	Description
`tuple[int, int]`	Kernel size as (k_h, k_w).

stride `property`

stride: Tuple[int, int]

Return the pooling stride.

Returns:

Type	Description
`tuple[int, int]`	Stride as (s_h, s_w).

padding `property`

padding: Tuple[int, int]

Return the pooling padding.

Returns:

Type	Description
`tuple[int, int]`	Padding as (p_h, p_w).

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This toggles self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This toggles self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return JSON-serializable configuration for this pooling layer.

from_config `classmethod`

from_config(cfg: Dict[str, Any]) -> T

Reconstruct the pooling layer from a JSON configuration dict.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply max pooling to the input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (N, C, H, W).	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (N, C, H_out, W_out).

Notes

A Context is attached only if x.requires_grad is True.
The module delegates computation to MaxPool2dFn.

keydnn.AvgPool2D `module-attribute`

AvgPool2D = AvgPool2d

keydnn.AvgPool2d

Bases: Pool2dConfigMixin, Module

2D average pooling module (NCHW).

This module applies average pooling over the spatial dimensions (H, W) independently per channel.

Shape semantics

Input: x.shape == (N, C, H, W)

Output: y.shape == (N, C, H_out, W_out)

Notes

The underlying reference implementation uses zero-padding.
The average is computed over the full kernel area (k_h * k_w), which means padded zeros contribute to the average when padding > 0.
The backward pass distributes gradients uniformly over each pooling window.
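The effect of dividing by the full kernel area can be seen in a small NumPy sketch. The function name is hypothetical, and the output-size formula is the usual PyTorch floor-mode convention, which is an assumption here.

```python
import numpy as np


def avgpool2d_sketch(x, kernel_size, stride, padding):
    """Naive NCHW average pooling; padded zeros count toward the mean."""
    k_h, k_w = kernel_size
    s_h, s_w = stride
    p_h, p_w = padding
    n, c, h, w = x.shape
    h_out = (h + 2 * p_h - k_h) // s_h + 1  # assumed floor-mode formula
    w_out = (w + 2 * p_w - k_w) // s_w + 1
    xp = np.pad(x, ((0, 0), (0, 0), (p_h, p_h), (p_w, p_w)), mode="constant")
    y = np.empty((n, c, h_out, w_out), dtype=x.dtype)
    for i in range(h_out):
        for j in range(w_out):
            window = xp[:, :, i * s_h : i * s_h + k_h, j * s_w : j * s_w + k_w]
            # Always divide by k_h * k_w, even when the window overlaps padding.
            y[:, :, i, j] = window.sum(axis=(2, 3)) / (k_h * k_w)
    return y


x = np.ones((1, 1, 2, 2))
y = avgpool2d_sketch(x, kernel_size=(2, 2), stride=(2, 2), padding=(1, 1))
```

Each 2x2 window in this example covers one real element and three padded zeros, so every output is 0.25 rather than 1. Border outputs are therefore damped whenever padding is greater than zero.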

kernel_size `property`

kernel_size: Tuple[int, int]

Return the pooling window size.

Returns:

Type	Description
`tuple[int, int]`	Kernel size as (k_h, k_w).

stride `property`

stride: Tuple[int, int]

Return the pooling stride.

Returns:

Type	Description
`tuple[int, int]`	Stride as (s_h, s_w).

padding `property`

padding: Tuple[int, int]

Return the pooling padding.

Returns:

Type	Description
`tuple[int, int]`	Padding as (p_h, p_w).

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This toggles self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This toggles self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return JSON-serializable configuration for this pooling layer.

from_config `classmethod`

from_config(cfg: Dict[str, Any]) -> T

Reconstruct the pooling layer from a JSON configuration dict.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply average pooling to the input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (N, C, H, W).	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (N, C, H_out, W_out).

Notes

A Context is attached only if x.requires_grad is True.
The module delegates computation to AvgPool2dFn.

keydnn.GlobalAvgPool2D `module-attribute`

GlobalAvgPool2D = GlobalAvgPool2d

keydnn.GlobalAvgPool2d

Bases: StatelessConfigMixin, Module

Global average pooling module (NCHW).

Global average pooling reduces each channel to a single value by averaging over the spatial dimensions:

(N, C, H, W) -> (N, C, 1, 1)

This is commonly used near the end of CNN architectures to eliminate fully-connected layers and support variable spatial input sizes.

Notes

This module has no kernel/stride/padding hyperparameters.
The backward pass distributes gradients uniformly across all H*W input positions per channel.
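A NumPy sketch of the forward reduction and the uniform gradient distribution described above. The function names are hypothetical and the code only illustrates the arithmetic, not KeyDNN's autograd machinery.

```python
import numpy as np


def global_avgpool2d_forward(x):
    """(N, C, H, W) -> (N, C, 1, 1) by averaging over H and W."""
    return x.mean(axis=(2, 3), keepdims=True)


def global_avgpool2d_backward(grad_out, input_shape):
    """Spread each (N, C, 1, 1) gradient evenly over its H*W positions."""
    _, _, h, w = input_shape
    return np.broadcast_to(grad_out / (h * w), input_shape).copy()


x = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5)
y = global_avgpool2d_forward(x)
grad_x = global_avgpool2d_backward(np.ones_like(y), x.shape)
```

Because the output shape does not depend on H or W, the same downstream layers can be used with inputs of different spatial sizes.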

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type	Description
`Iterable[IParameter]`	Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes

This toggles self.training = True.
Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes

This toggles self.training = False.
In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the parameter will be stored (e.g., "weight", "bias").	required
`param`	`Optional[IParameter]`	Parameter instance to register. If None, registration is skipped.	required

Notes

If param is None, nothing is registered.
If the name already exists, it is overwritten intentionally.
This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name under which the module will be stored.	required
`module`	`Optional[Module]`	Child module to register. If None, registration is skipped.	required

Notes

If module is None, nothing is registered.
This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name	Type	Description	Default
`prefix`	`str`	Prefix to prepend to parameter names (used for recursion).	`''`

Returns:

Type	Description
`Iterator[tuple[str, IParameter]]`	Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration dictionary.

For stateless modules, this method returns an empty dictionary, indicating that no parameters are required to reconstruct the object.

Returns:

Type	Description
`Dict[str, Any]`	An empty configuration dictionary.

from_config `classmethod`

from_config(cfg: Dict[str, Any]) -> Self

Reconstruct the module from a configuration dictionary.

Since stateless modules do not require any configuration parameters, the provided configuration is ignored and a default instance of the class is returned.

Parameters:

Name	Type	Description	Default
`cfg`	`Dict[str, Any]`	Configuration dictionary (unused).	required

Returns:

Type	Description
`StatelessConfigMixin`	A newly constructed instance of the module.
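
A configuration round trip for a stateless module can be sketched as below. ToyStatelessConfigMixin and ToyPool are hypothetical names that mimic the documented behavior; they are not KeyDNN classes.

```python
import json


class ToyStatelessConfigMixin:
    """Hypothetical mixin mirroring the documented stateless behavior."""

    def get_config(self):
        return {}  # nothing is needed to rebuild the object

    @classmethod
    def from_config(cls, cfg):
        return cls()  # cfg is ignored; a default instance is returned


class ToyPool(ToyStatelessConfigMixin):
    pass


original = ToyPool()
payload = json.dumps(original.get_config())         # serializes to "{}"
restored = ToyPool.from_config(json.loads(payload))
```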

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes

Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
Assumes each Parameter/Tensor implements .to(Device) and returns an object of the same (or a compatible) type.
Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior

For each registered parameter:
- If the parameter implements to_(), it is migrated in-place (object identity is preserved).
- Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
All child modules are recursively migrated using the same rules.

Parameters:

Name	Type	Description	Default
`device`	`Device`	Target device to which all parameters should be moved.	required

Returns:

Type	Description
`Module`	This module (`self`), after in-place migration.

Notes

This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.
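
The identity rules above matter mostly for code that keeps references to parameters, such as an optimizer. The sketch below contrasts the two migration paths. Every class is a hypothetical stand-in, and plain strings replace the real Device type.

```python
class InPlaceParam:
    """Hypothetical parameter that supports in-place migration."""

    def __init__(self, device):
        self.device = device

    def to(self, device):
        return InPlaceParam(device)

    def to_(self, device):
        self.device = device
        return self


class CopyOnlyParam:
    """Hypothetical parameter that only supports out-of-place migration."""

    def __init__(self, device):
        self.device = device

    def to(self, device):
        return CopyOnlyParam(device)


class ToyModule:
    def __init__(self):
        self._parameters = {}
        self._modules = {}

    def register_parameter(self, name, param):
        if param is not None:
            self._parameters[name] = param
            setattr(self, name, param)

    def to_(self, device):
        for name, param in list(self._parameters.items()):
            if hasattr(param, "to_"):
                param.to_(device)  # identity preserved
            else:
                # Fallback: move out-of-place and rebind on the module.
                self.register_parameter(name, param.to(device))
        for child in self._modules.values():
            child.to_(device)
        return self


layer = ToyModule()
w, b = InPlaceParam("cpu"), CopyOnlyParam("cpu")
layer.register_parameter("weight", w)
layer.register_parameter("bias", b)

optimizer_refs = [w, b]  # what an optimizer might hold
layer.to_("cuda:0")      # "cuda:0" is just a label here; no GPU is touched
```

After the call, optimizer_refs[0] is still the module's weight, while optimizer_refs[1] is a stale object left on the old device. This is why an optimizer created before migration stays valid only when every parameter supports to_().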

forward

forward(x: Tensor) -> Tensor

Apply global average pooling to the input tensor.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape (N, C, H, W).	required

Returns:

Type	Description
`Tensor`	Output tensor of shape (N, C, 1, 1).

Notes

A Context is attached only if x.requires_grad is True.
The module delegates computation to GlobalAvgPool2dFn.
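
The shape contract can be checked with a pure-Python reference computation. This is a sketch over nested lists, not the GlobalAvgPool2dFn implementation; global_avg_pool2d is a hypothetical helper and does not touch autograd.

```python
def global_avg_pool2d(x):
    """Average each (H, W) plane of an N x C x H x W nested list.

    Returns an N x C x 1 x 1 nested list.
    """
    out = []
    for sample in x:
        pooled = []
        for channel in sample:
            values = [v for row in channel for v in row]
            pooled.append([[sum(values) / len(values)]])
        out.append(pooled)
    return out


# N=1, C=2, H=2, W=2
x = [[[[1.0, 2.0], [3.0, 4.0]],
      [[0.0, 0.0], [0.0, 8.0]]]]
y = global_avg_pool2d(x)
```

Here y is [[[[2.5]], [[2.0]]]]: one mean per channel, with the spatial dimensions kept as size 1.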

Notes on Shapes and Devices

Convolution and pooling layers expect NCHW layout by default.
Parameters are created on the same device as the layer unless explicitly moved.
Inputs should be contiguous for best CUDA performance.
Shape mismatches are reported at runtime with descriptive errors.

For more details, see:

Guides → Tensors & Devices
Guides → Training Loop
