Layers

This section documents the neural network layers provided by KeyDNN’s public API.
All layers are part of the presentation layer and are safe to depend on.

Unless otherwise noted, layers:

  • operate on Tensor inputs
  • support automatic differentiation
  • respect the device (CPU / CUDA) of their parameters
  • follow PyTorch-style shape conventions where applicable

Core Layers

keydnn.Dense

Bases: _BaseLinear

Keras-style Dense layer with lazy input-dimension inference.

Users specify only out_features at construction time. The corresponding in_features dimension is inferred from the first input tensor passed to forward (x.shape[1]).

Device behavior
  • If device is None at construction time, the layer adopts x.device on the first forward pass.
  • If device is provided, forward enforces that inputs already reside on that device (no implicit transfers).
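The lazy-build and device rules above can be illustrated with a small stand-alone sketch. `LazyDense` below is a hypothetical pure-Python stand-in, not KeyDNN's implementation: it uses nested lists instead of Tensor, plain strings instead of Device, and zero-initialized weights, all assumed only to keep the example short.

```python
from typing import List, Optional


class LazyDense:
    """Toy stand-in for a lazily built dense layer (hypothetical, not KeyDNN code)."""

    def __init__(self, out_features: int, device: Optional[str] = None) -> None:
        self.out_features = out_features
        self.device = device  # None means "adopt the device of the first input"
        self.in_features: Optional[int] = None
        self.weight: Optional[List[List[float]]] = None

    @property
    def is_built(self) -> bool:
        # Mirrors the documented rule: built once the weight exists.
        return self.weight is not None

    def forward(self, x: List[List[float]], x_device: str = "cpu") -> List[List[float]]:
        if self.device is None:
            self.device = x_device  # adopt the input's device on the first call
        elif self.device != x_device:
            # No implicit transfer: a mismatch is reported to the caller.
            raise RuntimeError(f"expected input on {self.device}, got {x_device}")
        if not self.is_built:
            self.in_features = len(x[0])  # analogue of x.shape[1]
            # Zero initialization is an assumption made for brevity.
            self.weight = [[0.0] * self.in_features for _ in range(self.out_features)]
        # y = x @ W^T, written out with plain lists
        return [[sum(a * w for a, w in zip(row, w_row)) for w_row in self.weight] for row in x]


layer = LazyDense(out_features=3)
built_before = layer.is_built          # no parameters yet
y = layer.forward([[1.0, 2.0], [3.0, 4.0]])
```

After the first call, `layer.in_features` is 2 and the layer reports `is_built` as true; a later input on a different device raises instead of being moved.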

is_built property

is_built: bool

Return whether parameters have been materialized.

Returns:

Type Description
bool

True if weight exists, False otherwise.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This sets self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This sets self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().
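The recursive mode switch described for train() and eval() can be sketched as follows. `ToyModule` is a hypothetical minimal class written for this illustration; only the `training` flag, the `_modules` registry name, and the return-self behavior are taken from the text above.

```python
from typing import Dict


class ToyModule:
    """Hypothetical minimal module tree used only to illustrate mode switching."""

    def __init__(self) -> None:
        self.training = True
        self._modules: Dict[str, "ToyModule"] = {}

    def train(self) -> "ToyModule":
        self.training = True
        for child in self._modules.values():
            child.train()  # recurse into every registered child
        return self

    def eval(self) -> "ToyModule":
        self.training = False
        for child in self._modules.values():
            child.eval()
        return self


root = ToyModule()
root._modules["block"] = ToyModule()
root._modules["block"]._modules["drop"] = ToyModule()

# eval() returns the module itself, so calls can be chained.
same_object = root.eval() is root
```

Returning the module from both methods allows call chaining in the PyTorch style, for example `model.eval().forward(x)`.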

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.
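The registration rules for parameters and child modules can be sketched with a small registry class. `RegistryModule` is hypothetical and uses plain Python objects as parameters; it only mirrors the three documented behaviors (None is skipped, an existing name is overwritten, and the attribute is set on the module).

```python
from typing import Any, Dict, Optional


class RegistryModule:
    """Hypothetical module holding the two registries named in the notes."""

    def __init__(self) -> None:
        self._parameters: Dict[str, Any] = {}
        self._modules: Dict[str, "RegistryModule"] = {}

    def register_parameter(self, name: str, param: Optional[Any]) -> None:
        if param is None:
            return  # nothing is registered
        self._parameters[name] = param  # an existing entry is overwritten
        setattr(self, name, param)      # makes self.<name> work

    def register_module(self, name: str, module: Optional["RegistryModule"]) -> None:
        if module is None:
            return
        self._modules[name] = module
        setattr(self, name, module)


m = RegistryModule()
m.register_parameter("weight", [1.0, 2.0])
m.register_parameter("bias", None)      # skipped
m.register_parameter("weight", [9.0])   # replaces the first registration

child = RegistryModule()
m.register_module("head", child)
```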

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).
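A recursive traversal that builds fully qualified names from a prefix might look like the sketch below. `Node` is a hypothetical class, and the "." separator between module and parameter names is an assumption borrowed from PyTorch; this page does not state which separator KeyDNN uses.

```python
from typing import Any, Dict, Iterator, Tuple


class Node:
    """Hypothetical module with parameter and child-module registries."""

    def __init__(self) -> None:
        self._parameters: Dict[str, Any] = {}
        self._modules: Dict[str, "Node"] = {}

    def named_parameters(self, prefix: str = "") -> Iterator[Tuple[str, Any]]:
        # Own parameters first, then each child with an extended prefix.
        for name, param in self._parameters.items():
            yield prefix + name, param
        for child_name, child in self._modules.items():
            # The "." separator is assumed, not confirmed by the source.
            yield from child.named_parameters(prefix + child_name + ".")


model = Node()
model._parameters["weight"] = 1.0
head = Node()
head._parameters["weight"] = 2.0
head._parameters["bias"] = 3.0
model._modules["head"] = head

names = [name for name, _ in model.named_parameters()]
```

Under these assumptions `names` is `["weight", "head.weight", "head.bias"]`.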

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration dict.

This stores only constructor-level hyperparameters. Parameter values are expected to be restored separately via the checkpoint/state mechanism.

Returns:

Type Description
Dict[str, Any]

Configuration containing in_features (if known), out_features, bias, device, dtype, and initializer.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(Device) and returns an object of the same or a compatible type.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.
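The difference between in-place and fallback migration, and its effect on references held elsewhere, can be sketched like this. `InPlaceParam`, `CopyOnlyParam`, and `migrate_` are hypothetical names, and devices are plain strings; the sketch only follows the rules listed under Behavior.

```python
from typing import Any, Dict


class InPlaceParam:
    """Hypothetical parameter that supports true in-place migration."""

    def __init__(self, device: str = "cpu") -> None:
        self.device = device

    def to(self, device: str) -> "InPlaceParam":
        return InPlaceParam(device)  # out-of-place: new object

    def to_(self, device: str) -> "InPlaceParam":
        self.device = device         # in-place: same object
        return self


class CopyOnlyParam:
    """Hypothetical parameter that only offers out-of-place to()."""

    def __init__(self, device: str = "cpu") -> None:
        self.device = device

    def to(self, device: str) -> "CopyOnlyParam":
        return CopyOnlyParam(device)


def migrate_(params: Dict[str, Any], device: str) -> Dict[str, Any]:
    for name, param in list(params.items()):
        if hasattr(param, "to_"):
            param.to_(device)                 # identity preserved
        else:
            params[name] = param.to(device)   # fallback: rebind to a new object
    return params


params = {"weight": InPlaceParam(), "bias": CopyOnlyParam()}
optimizer_refs = dict(params)  # references an optimizer might be holding

migrate_(params, "cuda:0")
weight_kept = optimizer_refs["weight"] is params["weight"]
bias_kept = optimizer_refs["bias"] is params["bias"]
```

Here `weight_kept` is true and `bias_kept` is false: the reference to the copy-only parameter still points at the old object on the old device, which is why an optimizer would need to be rebuilt in that case.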

forward

forward(x: Tensor) -> Tensor

Apply the Dense transform to a 2D input tensor.

On first call, infers in_features from x.shape[1] and materializes parameters.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (batch, in_features).

required

Returns:

Type Description
Tensor

Output tensor of shape (batch, out_features).

Raises:

Type Description
ValueError

If input is not 2D.

RuntimeError

If device was specified and does not match x.device.

from_config classmethod

from_config(cfg: Dict[str, Any]) -> 'Dense'

Reconstruct a Dense layer from configuration.

If in_features is present, eagerly materializes parameters so a subsequent weight-load can attach values deterministically.

Parameters:

Name Type Description Default
cfg Dict[str, Any]

Configuration dictionary produced by get_config().

required

Returns:

Type Description
Dense

Reconstructed Dense module.
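A configuration round trip through JSON, including the eager materialization when in_features is present, can be sketched as follows. `ConfigurableDense` is a hypothetical toy class; it stores only three of the documented config keys and uses zero-filled nested lists as weights.

```python
import json
from typing import Any, Dict, Optional


class ConfigurableDense:
    """Hypothetical layer used to illustrate get_config / from_config."""

    def __init__(self, out_features: int, in_features: Optional[int] = None,
                 bias: bool = True) -> None:
        self.out_features = out_features
        self.in_features = in_features
        self.bias = bias
        self.weight = None
        if in_features is not None:
            self._materialize(in_features)  # eager build when the shape is known

    def _materialize(self, in_features: int) -> None:
        self.in_features = in_features
        self.weight = [[0.0] * in_features for _ in range(self.out_features)]

    def get_config(self) -> Dict[str, Any]:
        # Hyperparameters only; weight values are not part of the config.
        return {"in_features": self.in_features,
                "out_features": self.out_features,
                "bias": self.bias}

    @classmethod
    def from_config(cls, cfg: Dict[str, Any]) -> "ConfigurableDense":
        return cls(out_features=cfg["out_features"],
                   in_features=cfg.get("in_features"),
                   bias=cfg.get("bias", True))


built = ConfigurableDense(out_features=4, in_features=8)
payload = json.dumps(built.get_config())           # config is JSON-serializable
restored = ConfigurableDense.from_config(json.loads(payload))

unbuilt = ConfigurableDense(out_features=4)
restored_lazy = ConfigurableDense.from_config(unbuilt.get_config())
```

The restored layer has allocated weights of the right shape, ready for a separate weight-loading step, while a config without a known in_features yields a layer that is still lazy.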

keydnn.Linear

Bases: _BaseLinear

Fully-connected (affine) layer with eager parameter allocation.

Linear allocates weight and (optionally) bias during initialization, making it immediately usable for both training and inference.

Notes

This class preserves the historical Linear(in_features, out_features, ...) API while delegating all core functionality to _BaseLinear.

is_built property

is_built: bool

Return whether parameters have been materialized.

Returns:

Type Description
bool

True if weight exists, False otherwise.

forward

forward(x: Tensor) -> Tensor

Apply the affine transform to a 2D input tensor.

This method assumes the layer has been materialized (i.e., weight exists). Lazy subclasses should call _materialize(...) before delegating here.

Computation: y = x @ W^T (+ b)

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (batch, in_features).

required

Returns:

Type Description
Tensor

Output tensor of shape (batch, out_features).

Raises:

Type Description
RuntimeError

If the layer is not built, or devices mismatch.

ValueError

If input rank/shape is incompatible with the layer.
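The computation y = x @ W^T (+ b) can be written out with plain Python lists to make the shapes explicit. The `affine` helper below is a hypothetical illustration, not KeyDNN code; W has shape (out_features, in_features), matching the transposed product in the formula.

```python
from typing import List, Optional

Matrix = List[List[float]]


def affine(x: Matrix, weight: Matrix, bias: Optional[List[float]] = None) -> Matrix:
    """Compute y = x @ W^T (+ b) for x of shape (batch, in_features)."""
    in_features = len(weight[0])
    out: Matrix = []
    for row in x:
        if len(row) != in_features:
            raise ValueError("input feature count does not match the weight")
        # One dot product per output feature (one row of W).
        y_row = [sum(a * w for a, w in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            y_row = [v + b for v, b in zip(y_row, bias)]
        out.append(y_row)
    return out


x = [[1.0, 2.0, 3.0]]                      # (batch=1, in_features=3)
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]     # (out_features=2, in_features=3)
b = [0.5, -1.0]
y = affine(x, W, b)                        # [[1.5, 4.0]]
```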

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This sets self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This sets self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration dict.

This stores only constructor-level hyperparameters. Parameter values are expected to be restored separately via the checkpoint/state mechanism.

Returns:

Type Description
Dict[str, Any]

Configuration containing in_features (if known), out_features, bias, device, dtype, and initializer.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(Device) and returns an object of the same or a compatible type.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

from_config classmethod

from_config(cfg: Dict[str, Any]) -> 'Linear'

Construct a Linear layer from a configuration dict.

Parameters:

Name Type Description Default
cfg Dict[str, Any]

Configuration dictionary produced by get_config().

required

Returns:

Type Description
Linear

A newly constructed Linear instance with matching hyperparameters.


Convolution Layers

Note
KeyDNN provides both Conv2D / Conv2DTranspose and
Conv2d / Conv2dTranspose.
These are equivalent and exist for naming compatibility.

keydnn.Conv2D module-attribute

Conv2D = Conv2d

keydnn.Conv2DTranspose module-attribute

Conv2DTranspose = Conv2dTranspose
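Because the aliases are plain module attributes, both spellings should refer to the same class object. The sketch below shows what that means in Python, using a hypothetical placeholder class rather than the real layer.

```python
class Conv2d:
    """Hypothetical placeholder class standing in for a real layer."""


# A module-level alias binds a second name to the same class object.
Conv2D = Conv2d

layer = Conv2D()
is_same_class = Conv2D is Conv2d
class_name = type(layer).__name__   # the class keeps its original name
```

In this sketch `isinstance` checks work with either name, and the instance still reports its class name as `Conv2d`.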

keydnn.Conv2d

Bases: Module

Two-dimensional convolution layer (NCHW).

This module applies a 2D convolution over an input tensor using learnable weights and an optional bias term. It supports configurable kernel size, stride, and padding, and integrates fully with KeyDNN's autograd system.

Parameters:

Name Type Description Default
in_channels int

Number of channels in the input tensor.

required
out_channels int

Number of channels produced by the convolution.

required
kernel_size int or tuple[int, int]

Size of the convolution kernel. If an integer is provided, the same value is used for both height and width.

required
stride int or tuple[int, int]

Stride of the convolution. Defaults to 1.

1
padding int or tuple[int, int]

Zero-padding applied to the input. Defaults to 0.

0
bias bool

Whether to include a learnable bias term. Defaults to True.

True
device Device

Device on which parameters will be allocated. If None, parameters are allocated on CPU.

None
dtype Any

Data type used to initialize parameters. Defaults to float32 if not provided.

None
initializer str

Name of the weight initializer applied to the convolution kernel. Defaults to "kaiming". The bias parameter, if present, is initialized using the "zeros" initializer.

'kaiming'

Attributes:

Name Type Description
weight Parameter

Convolution kernel weights of shape (out_channels, in_channels, kernel_height, kernel_width).

bias Optional[Parameter]

Optional bias parameter of shape (out_channels,).

stride tuple[int, int]

Convolution stride as a 2D pair.

padding tuple[int, int]

Convolution padding as a 2D pair.

Notes
  • Weight initialization is performed via the Parameter initializer registry, not inside this module.
  • This module does not perform any numerical computation directly; it delegates forward and backward logic to Conv2dFn.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This sets self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This sets self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(Device) and returns an object of the same or a compatible type.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply the convolution operation to an input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (N, C_in, H, W).

required

Returns:

Type Description
Tensor

Output tensor of shape (N, C_out, H_out, W_out).

Notes
  • If any of the inputs or parameters require gradients, an autograd Context is attached to the output tensor.
  • The backward function delegates gradient computation to Conv2dFn.
  • No validation of input shape is performed here; mismatches are expected to be caught by lower-level kernels.
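Since forward does not validate shapes itself, it can help to compute the expected output shape beforehand. The helper below is a hypothetical sketch that assumes the standard convolution size formula without dilation, H_out = floor((H + 2 * padding - kernel) / stride) + 1, as used by PyTorch; this page does not spell out the formula for KeyDNN's kernels.

```python
from typing import Tuple, Union

IntOrPair = Union[int, Tuple[int, int]]


def _pair(value: IntOrPair) -> Tuple[int, int]:
    # An int applies to both height and width.
    return (value, value) if isinstance(value, int) else (int(value[0]), int(value[1]))


def conv2d_output_shape(input_shape: Tuple[int, int, int, int], out_channels: int,
                        kernel_size: IntOrPair, stride: IntOrPair = 1,
                        padding: IntOrPair = 0) -> Tuple[int, int, int, int]:
    """Expected (N, C_out, H_out, W_out) for an NCHW input, assuming no dilation."""
    n, _c_in, h, w = input_shape
    k_h, k_w = _pair(kernel_size)
    s_h, s_w = _pair(stride)
    p_h, p_w = _pair(padding)
    h_out = (h + 2 * p_h - k_h) // s_h + 1
    w_out = (w + 2 * p_w - k_w) // s_w + 1
    return (n, out_channels, h_out, w_out)


same = conv2d_output_shape((8, 3, 32, 32), 16, kernel_size=3, stride=1, padding=1)
halved = conv2d_output_shape((8, 3, 32, 32), 16, kernel_size=3, stride=2, padding=1)
rect = conv2d_output_shape((8, 3, 32, 32), 16, kernel_size=(5, 3))
```

With these inputs, `same` is (8, 16, 32, 32), `halved` is (8, 16, 16, 16), and `rect` is (8, 16, 28, 30).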

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for reconstructing this layer.

Notes

This configuration captures constructor-level hyperparameters only. Trainable parameters (weights and bias) are serialized separately by the checkpoint/state_dict mechanism.

from_config classmethod

from_config(cfg: Dict[str, Any]) -> 'Conv2d'

Construct a Conv2d layer from a configuration dict.

Notes

This reconstructs the module structure (hyperparameters). Weights are expected to be loaded afterward from the checkpoint state.

keydnn.Conv2dTranspose

Bases: Module

Two-dimensional transposed convolution layer (NCHW).

Parameters:

Name Type Description Default
in_channels int

Number of channels in the input tensor.

required
out_channels int

Number of channels produced by the transposed convolution.

required
kernel_size int or tuple[int, int]

Size of the convolution kernel.

required
stride int or tuple[int, int]

Stride of the transposed convolution. Defaults to 1.

1
padding int or tuple[int, int]

Padding used by the transposed convolution. Defaults to 0.

0
output_padding int or tuple[int, int]

Additional size added to one side of each output dimension. Defaults to 0. Must satisfy output_padding[d] < stride[d] in each spatial dimension d.

0
bias bool

Whether to include a learnable bias term. Defaults to True.

True
device Device

Device on which parameters will be allocated.

None
dtype Any

Data type used to initialize parameters. Kept for backward compatibility.

None

Attributes:

Name Type Description
weight Parameter

Kernel weights of shape (in_channels, out_channels, K_h, K_w).

bias Optional[Parameter]

Optional bias parameter of shape (out_channels,).

stride tuple[int, int]

Stride as a 2D pair.

padding tuple[int, int]

Padding as a 2D pair.

output_padding tuple[int, int]

Output padding as a 2D pair.

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This sets self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This sets self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(Device) and returns an object of the same or a compatible type.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply the transposed convolution operation to an input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (N, C_in, H_in, W_in).

required

Returns:

Type Description
Tensor

Output tensor of shape (N, C_out, H_out, W_out).

Notes
  • If any of the inputs or parameters require gradients, an autograd Context is attached to the output tensor.
  • The backward function delegates gradient computation to Conv2dTransposeFn.
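The relationship between H_in and H_out for a transposed convolution can be sketched per spatial dimension. The helper below is hypothetical and assumes the standard formula without dilation, H_out = (H_in - 1) * stride - 2 * padding + kernel + output_padding, as documented for PyTorch's ConvTranspose2d; the output_padding < stride check follows the constraint stated in the parameter table.

```python
def conv2d_transpose_output_size(size_in: int, kernel: int, stride: int = 1,
                                 padding: int = 0, output_padding: int = 0) -> int:
    """Output length along one spatial dimension, assuming no dilation."""
    if output_padding >= stride:
        raise ValueError("output_padding must be smaller than stride")
    return (size_in - 1) * stride - 2 * padding + kernel + output_padding


# Upsample a 16x16 feature map by 2 with two common kernel choices.
h_out = conv2d_transpose_output_size(16, kernel=3, stride=2, padding=1, output_padding=1)
w_out = conv2d_transpose_output_size(16, kernel=4, stride=2, padding=1)
```

Both settings map a length of 16 to 32, which shows why output_padding is needed with an odd kernel to reach exactly twice the input size.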

Normalization Layers

Note
KeyDNN provides both BatchNorm1D / BatchNorm2D and
BatchNorm1d / BatchNorm2d.
These are equivalent and exist for naming compatibility.

keydnn.BatchNorm1d

Bases: BatchNorm1d

Presentation-layer BatchNorm1d with ergonomic defaults.

This class subclasses the infrastructure BatchNorm1d implementation and only adjusts constructor ergonomics:

  • device becomes optional and defaults to CPU.
  • device may be provided as Device or a string like "cuda:0".

All numerical behavior, buffer updates, and autograd logic are inherited unchanged from the infrastructure implementation.

forward

forward(x: Tensor) -> Tensor

Apply BatchNorm1d to an input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (N, C). Must be a CPU tensor and on the same device as the module.

required

Returns:

Type Description
Tensor

Output tensor of shape (N, C). Requires gradients if the input requires gradients and/or (when affine=True) gamma/beta require gradients.

Raises:

Type Description
RuntimeError

If the input tensor is not on CPU.

ValueError

If device mismatches, input rank is not 2D, or channel count does not match num_features.
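For orientation, training-mode batch normalization over an (N, C) input can be sketched in pure Python. This is a generic illustration of the technique, not KeyDNN's implementation: the function name, the eps value of 1e-5, the biased variance, and the omission of gamma/beta and running statistics are all assumptions made for brevity.

```python
import math
from typing import List


def batchnorm1d_train(x: List[List[float]], num_features: int,
                      eps: float = 1e-5) -> List[List[float]]:
    """Normalize each feature column of an (N, C) input using batch statistics."""
    if not x or any(len(row) != num_features for row in x):
        raise ValueError("expected a non-empty (N, C) input with C == num_features")
    n = len(x)
    # Per-feature mean and biased variance over the batch dimension.
    means = [sum(row[c] for row in x) / n for c in range(num_features)]
    variances = [sum((row[c] - means[c]) ** 2 for row in x) / n
                 for c in range(num_features)]
    return [[(row[c] - means[c]) / math.sqrt(variances[c] + eps)
             for c in range(num_features)] for row in x]


x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
y = batchnorm1d_train(x, num_features=2)

# Each output column has approximately zero mean and unit variance.
col_means = [sum(row[c] for row in y) / len(y) for c in range(2)]
col_vars = [sum(row[c] ** 2 for row in y) / len(y) for c in range(2)]
```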

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This sets self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This sets self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for this module.

Returns:

Type Description
Dict[str, Any]

Configuration dictionary sufficient to reconstruct the module via from_config.

from_config classmethod

from_config(config: Dict[str, Any]) -> 'BatchNorm1d'

Construct a BatchNorm1d instance from a configuration dictionary.

Parameters:

Name Type Description Default
config Dict[str, Any]

Configuration as produced by get_config().

required

Returns:

Type Description
BatchNorm1d

Reconstructed module instance.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(Device) and returns an object of the same or a compatible type.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.
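The difference between to() and to_() matters most for code that keeps its own references to parameters, such as an optimizer. The following is a minimal sketch of that identity behavior using toy stand-ins; ToyParam and ToyModule are hypothetical classes written for illustration, not KeyDNN types, and this toy to() always rebinds, whereas the real Module.to() is only documented as possibly doing so.

```python
class ToyParam:
    """Hypothetical stand-in for a parameter; not a KeyDNN class."""

    def __init__(self, data, device="cpu"):
        self.data = list(data)
        self.device = device

    def to(self, device):
        # Out-of-place: returns a new object tagged with the target device.
        return ToyParam(self.data, device)

    def to_(self, device):
        # In-place: same object, updated device tag.
        self.device = device
        return self


class ToyModule:
    """Hypothetical module that keeps parameters in a registry dict."""

    def __init__(self):
        self._parameters = {}

    def register_parameter(self, name, param):
        if param is None:
            return
        self._parameters[name] = param
        setattr(self, name, param)

    def to(self, device):
        # Rebind every registered parameter to its moved copy.
        for name, p in list(self._parameters.items()):
            self.register_parameter(name, p.to(device))
        return self

    def to_(self, device):
        for name, p in list(self._parameters.items()):
            if hasattr(p, "to_"):
                p.to_(device)  # object identity is preserved
            else:
                self.register_parameter(name, p.to(device))  # fallback: rebind
        return self


m = ToyModule()
m.register_parameter("weight", ToyParam([1.0, 2.0]))
held_by_optimizer = m.weight  # an external reference, as an optimizer would keep

m.to_("cuda:0")
same_after_to_ = m.weight is held_by_optimizer  # True: still the same object

m.to("cpu")
same_after_to = m.weight is held_by_optimizer  # False: the module holds a new object
```

In this sketch the external reference goes stale after to(): it still points at the old object, which is the situation the optimizer note above warns about.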

keydnn.BatchNorm2d

Bases: BatchNorm2d

Presentation-layer BatchNorm2d with ergonomic defaults.

This class subclasses the infrastructure BatchNorm2d implementation and only adjusts constructor ergonomics:

  • device becomes optional and defaults to CPU.
  • device may be provided as Device or a string like "cuda:0".

All numerical behavior, buffer updates, and autograd logic are inherited unchanged from the infrastructure implementation.

forward

forward(x: Tensor) -> Tensor

Apply BatchNorm2d to an input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (N, C, H, W). Must be a CPU tensor and on the same device as the module.

required

Returns:

Type Description
Tensor

Output tensor of shape (N, C, H, W). It requires gradients if the input requires gradients or, when affine=True, if gamma or beta requires gradients.

Raises:

Type Description
RuntimeError

If the input tensor is not on CPU.

ValueError

If the input device does not match the module's device, the input is not 4D, or the channel count does not match num_features.
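The shape contract and the per-channel normalization can be illustrated with a small NumPy reference. This is a sketch under stated assumptions, not the KeyDNN implementation: it assumes the standard batch normalization formulation in training mode (batch statistics, biased variance), an eps of 1e-5, and it leaves out running-statistic updates. The function name batchnorm2d_reference is hypothetical.

```python
import numpy as np


def batchnorm2d_reference(x, gamma, beta, eps=1e-5):
    """Training-mode batch norm over (N, H, W) for each channel (NumPy sketch).

    eps=1e-5 is an assumed default, not taken from the KeyDNN docs.
    """
    if x.ndim != 4:
        raise ValueError(f"expected 4D input (N, C, H, W), got {x.ndim}D")
    if x.shape[1] != gamma.shape[0]:
        raise ValueError("channel count does not match num_features")
    # Statistics are reduced over every axis except the channel axis.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)  # biased variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma and beta hold one value per channel and broadcast over N, H, W.
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 5, 5))
y = batchnorm2d_reference(x, gamma=np.ones(4), beta=np.zeros(4))
```

With gamma set to ones and beta to zeros, each output channel has approximately zero mean and unit variance, and the output shape equals the input shape.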

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This toggles self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This toggles self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().
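The recursive mode switch described for train() and eval() can be sketched with a toy module tree. ToyModule is a hypothetical class for illustration only; the initial value of training (True here) is an assumption rather than documented KeyDNN behavior.

```python
class ToyModule:
    """Hypothetical module with a child registry; not a KeyDNN class."""

    def __init__(self):
        self.training = True  # assumed initial mode
        self._modules = {}

    def register_module(self, name, module):
        if module is None:
            return
        self._modules[name] = module
        setattr(self, name, module)

    def train(self):
        self.training = True
        for child in self._modules.values():
            child.train()  # recurse into children
        return self  # returning self allows call chaining

    def eval(self):
        self.training = False
        for child in self._modules.values():
            child.eval()
        return self


root = ToyModule()
root.register_module("block", ToyModule())
root.block.register_module("drop", ToyModule())

returned = root.eval()
modes_after_eval = (root.training, root.block.training, root.block.drop.training)

root.train()
modes_after_train = (root.training, root.block.training, root.block.drop.training)
```

Calling eval() on the root reaches the nested drop module without touching it directly, which is why a single call on the top-level model is enough before inference.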

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).
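How the two registries and the prefix argument might combine to produce fully qualified names is sketched below. RegistryModule is a hypothetical class, and joining nested names with "." is an assumption borrowed from the PyTorch convention; the KeyDNN docs above do not state the separator.

```python
class RegistryModule:
    """Hypothetical module with parameter and submodule registries."""

    def __init__(self):
        self._parameters = {}
        self._modules = {}

    def register_parameter(self, name, param):
        if param is None:
            return  # registration is skipped
        self._parameters[name] = param  # an existing name is overwritten
        setattr(self, name, param)  # makes self.<name> work

    def register_module(self, name, module):
        if module is None:
            return
        self._modules[name] = module
        setattr(self, name, module)

    def named_parameters(self, prefix=""):
        # Own parameters first, then each child with an extended prefix.
        for name, p in self._parameters.items():
            yield (f"{prefix}{name}", p)
        for name, child in self._modules.items():
            yield from child.named_parameters(prefix=f"{prefix}{name}.")


inner = RegistryModule()
inner.register_parameter("weight", [1.0])
inner.register_parameter("bias", None)  # skipped, nothing is registered

outer = RegistryModule()
outer.register_parameter("scale", [2.0])
outer.register_module("block", inner)

names = [name for name, _ in outer.named_parameters()]
# names == ["scale", "block.weight"]
```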

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for this module.

Returns:

Type Description
Dict[str, Any]

Configuration dictionary sufficient to reconstruct the module via from_config.

from_config classmethod

from_config(config: Dict[str, Any]) -> 'BatchNorm2d'

Construct a BatchNorm2d instance from a configuration dictionary.

Parameters:

Name Type Description Default
config Dict[str, Any]

Configuration as produced by get_config().

required

Returns:

Type Description
BatchNorm2d

Reconstructed module instance.
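The get_config / from_config pair forms a round trip through a JSON-serializable dictionary. A minimal sketch of that pattern follows; ToyNorm and its config keys are hypothetical and only illustrate the contract, they are not the keys KeyDNN's BatchNorm2d actually emits.

```python
import json


class ToyNorm:
    """Hypothetical layer; the config keys are illustrative assumptions."""

    def __init__(self, num_features, affine=True):
        self.num_features = num_features
        self.affine = affine

    def get_config(self):
        # Only plain JSON types (int, bool, str, list, dict) go in here.
        return {"num_features": self.num_features, "affine": self.affine}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


original = ToyNorm(num_features=16, affine=False)
payload = json.dumps(original.get_config())  # the config survives JSON encoding
restored = ToyNorm.from_config(json.loads(payload))
```

In this sketch the dictionary carries constructor arguments only, so restored is a new object with the same hyperparameters as original.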

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(device) and returns an object of the same kind.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

keydnn.BatchNorm1D module-attribute

BatchNorm1D = BatchNorm1d

keydnn.BatchNorm2D module-attribute

BatchNorm2D = BatchNorm2d

keydnn.LayerNorm

Bases: LayerNorm

Presentation-layer LayerNorm with ergonomic defaults.

This class subclasses the infrastructure LayerNorm and only adjusts constructor ergonomics:

  • device becomes optional and defaults to CPU.
  • device may be provided as Device or a string like "cuda:0".

All numerical behavior, parameter management, and autograd logic are inherited unchanged from the infrastructure implementation.

forward

forward(x: Tensor) -> Tensor

Apply LayerNorm to an input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (..., *normalized_shape). Must be a CPU tensor and on the same device as the module.

required

Returns:

Type Description
Tensor

Output tensor of the same shape as x.

Raises:

Type Description
RuntimeError

If the input tensor is not on CPU.

ValueError

If the input device does not match the module's device, the input has fewer dimensions than normalized_shape, or its trailing dimensions do not match normalized_shape.
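The (..., *normalized_shape) contract can be made concrete with a NumPy reference that normalizes over the trailing dimensions only. This is a sketch under assumptions: it uses the standard layer normalization formulation with an assumed eps of 1e-5, omits any learnable scale and shift, and expects a non-empty normalized_shape. The function name layernorm_reference is hypothetical.

```python
import numpy as np


def layernorm_reference(x, normalized_shape, eps=1e-5):
    """Normalize over the trailing dims given by normalized_shape (NumPy sketch).

    eps=1e-5 is an assumed default; affine parameters are left out.
    """
    k = len(normalized_shape)
    if x.ndim < k:
        raise ValueError("input rank is smaller than len(normalized_shape)")
    if tuple(x.shape[-k:]) != tuple(normalized_shape):
        raise ValueError("trailing dims do not match normalized_shape")
    # Reduce over the last k axes; leading axes are treated as independent samples.
    axes = tuple(range(x.ndim - k, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4))
y = layernorm_reference(x, normalized_shape=(4,))
```

Unlike batch normalization, the statistics here are computed per sample, so the result does not depend on the batch size.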

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This toggles self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This toggles self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration for this module.

from_config classmethod

from_config(config: Dict[str, Any]) -> 'LayerNorm'

Construct a LayerNorm instance from a configuration dictionary.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(device) and returns an object of the same kind.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

Regularization Layers

keydnn.Dropout

Bases: Module

Dropout regularization layer (inverted dropout).

This layer randomly zeroes elements of the input tensor with probability p during training and rescales the remaining elements by 1 / (1 - p) so that the expected activation magnitude remains unchanged.

Behavior
  • Training mode: y = x * mask / (1 - p), where mask ~ Bernoulli(1 - p) (equivalently, y = x * ((rand < keep_prob) / keep_prob) with keep_prob = 1 - p)
  • Evaluation mode: y = x (identity)
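The two modes above can be written out as a short NumPy sketch. The function name inverted_dropout and the explicit rng argument are assumptions made for illustration; KeyDNN's own random number handling is not described here.

```python
import numpy as np


def inverted_dropout(x, p, training, rng):
    """Inverted dropout on a NumPy array (illustrative sketch)."""
    if not training:
        return x  # evaluation mode is the identity
    keep_prob = 1.0 - p
    if keep_prob <= 0.0:
        raise ValueError("p must be < 1.0 so the keep probability is positive")
    # mask is True with probability keep_prob; survivors are scaled by 1 / keep_prob.
    mask = rng.random(x.shape) < keep_prob
    return x * (mask / keep_prob)


rng = np.random.default_rng(0)
x = np.ones((1000, 100))
y_train = inverted_dropout(x, p=0.5, training=True, rng=rng)
y_eval = inverted_dropout(x, p=0.5, training=False, rng=rng)
```

With p = 0.5, every element of y_train is either 0.0 or 2.0, and the mean stays close to 1.0, which is the point of the 1 / (1 - p) rescaling: no extra scaling is needed at evaluation time.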

Parameters:

Name Type Description Default
p float

Probability of dropping (zeroing) an element. Must satisfy 0.0 <= p < 1.0. Default is 0.5.

0.5

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This toggles self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This toggles self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(device) and returns an object of the same kind.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply dropout to the input tensor.

During training, elements of the input tensor are randomly masked according to the dropout probability and scaled using inverted dropout. During evaluation, the input tensor is returned unchanged.

Parameters:

Name Type Description Default
x Tensor

Input tensor (CPU or CUDA).

required

Returns:

Type Description
Tensor

Output tensor after applying dropout (or identity if not in training mode).

Raises:

Type Description
ValueError

If p yields a non-positive keep probability, i.e. 1 - p <= 0 (numerical guard).

get_config

get_config() -> Dict[str, Any]

Return a serializable configuration for this module.

Returns:

Type Description
Dict[str, Any]

Configuration dictionary containing the dropout probability.

from_config classmethod

from_config(config: Dict[str, Any]) -> 'Dropout'

Construct a Dropout module from a configuration dictionary.

Parameters:

Name Type Description Default
config Dict[str, Any]

Configuration dictionary produced by get_config.

required

Returns:

Type Description
Dropout

A new Dropout instance initialized from the configuration.


Pooling Layers

Note
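KeyDNN provides both *Pool2D and *Pool2d names.
The *Pool2D names are module-level aliases of the corresponding *Pool2d classes (for example, MaxPool2D = MaxPool2d), so the two spellings are interchangeable and exist only for naming compatibility.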

keydnn.MaxPool2D module-attribute

MaxPool2D = MaxPool2d

keydnn.MaxPool2d

Bases: Pool2dConfigMixin, Module

2D max pooling module (NCHW).

This module applies max pooling over the spatial dimensions (H, W) independently per channel. The output retains the batch and channel dimensions while reducing spatial resolution according to pooling hyperparameters.

Shape semantics

Input: x.shape == (N, C, H, W)

Output: y.shape == (N, C, H_out, W_out)

Notes
  • Backpropagation routes gradients to the input positions that produced the maxima during the forward pass (argmax-based routing).
  • The underlying CPU reference implementation pads with -inf so padded values never become maxima (important for correctness at borders).
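Why -inf padding matters is easiest to see with an all-negative input, where zero padding would wrongly win at the borders. The following naive NumPy sketch is not KeyDNN's implementation; the function name is hypothetical, and the output-size formula (floor division, no dilation) is the common convention and is assumed here, since the docs above only name H_out and W_out.

```python
import numpy as np


def maxpool2d_reference(x, kernel_size, stride, padding):
    """Naive NCHW max pooling with -inf padding (NumPy sketch)."""
    k_h, k_w = kernel_size
    s_h, s_w = stride
    p_h, p_w = padding
    n, c, h, w = x.shape
    # Assumed output-size convention: floor division, no dilation.
    h_out = (h + 2 * p_h - k_h) // s_h + 1
    w_out = (w + 2 * p_w - k_w) // s_w + 1
    x_pad = np.pad(
        x,
        ((0, 0), (0, 0), (p_h, p_h), (p_w, p_w)),
        mode="constant",
        constant_values=-np.inf,  # padded cells can never be the maximum
    )
    y = np.empty((n, c, h_out, w_out), dtype=x.dtype)
    for i in range(h_out):
        for j in range(w_out):
            window = x_pad[:, :, i * s_h : i * s_h + k_h, j * s_w : j * s_w + k_w]
            y[:, :, i, j] = window.max(axis=(2, 3))
    return y


x = -np.arange(1.0, 17.0).reshape(1, 1, 4, 4)  # every value is negative
y = maxpool2d_reference(x, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
# y[0, 0] == [[-1., -2.], [-5., -6.]]
```

Three of the four windows overlap the padded border. With zero padding they would all report 0.0, a value that does not occur in the input.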

kernel_size property

kernel_size: Tuple[int, int]

Return the pooling window size.

Returns:

Type Description
tuple[int, int]

Kernel size as (k_h, k_w).

stride property

stride: Tuple[int, int]

Return the pooling stride.

Returns:

Type Description
tuple[int, int]

Stride as (s_h, s_w).

padding property

padding: Tuple[int, int]

Return the pooling padding.

Returns:

Type Description
tuple[int, int]

Padding as (p_h, p_w).

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This toggles self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This toggles self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return JSON-serializable configuration for this pooling layer.

from_config classmethod

from_config(cfg: Dict[str, Any]) -> T

Reconstruct the pooling layer from a JSON configuration dict.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(device) and returns an object of the same kind.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply max pooling to the input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (N, C, H, W).

required

Returns:

Type Description
Tensor

Output tensor of shape (N, C, H_out, W_out).

Notes
  • A Context is attached only if x.requires_grad is True.
  • The module delegates computation to MaxPool2dFn.

keydnn.AvgPool2D module-attribute

AvgPool2D = AvgPool2d

keydnn.AvgPool2d

Bases: Pool2dConfigMixin, Module

2D average pooling module (NCHW).

This module applies average pooling over the spatial dimensions (H, W) independently per channel.

Shape semantics

Input: x.shape == (N, C, H, W)

Output: y.shape == (N, C, H_out, W_out)

Notes
  • The underlying reference implementation uses zero-padding.
  • The average is computed over the full kernel area (k_h * k_w), which means padded zeros contribute to the average when padding > 0.
  • The backward pass distributes gradients uniformly over each pooling window.
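The effect of dividing by the full kernel area shows up at the borders when padding > 0. A naive NumPy sketch of that behavior follows; the function name is hypothetical, and the output-size formula (floor division, no dilation) is an assumed convention rather than something stated above.

```python
import numpy as np


def avgpool2d_reference(x, kernel_size, stride, padding):
    """Naive NCHW average pooling with zero padding (NumPy sketch)."""
    k_h, k_w = kernel_size
    s_h, s_w = stride
    p_h, p_w = padding
    n, c, h, w = x.shape
    # Assumed output-size convention: floor division, no dilation.
    h_out = (h + 2 * p_h - k_h) // s_h + 1
    w_out = (w + 2 * p_w - k_w) // s_w + 1
    x_pad = np.pad(x, ((0, 0), (0, 0), (p_h, p_h), (p_w, p_w)))  # pads with zeros
    y = np.empty((n, c, h_out, w_out), dtype=x.dtype)
    for i in range(h_out):
        for j in range(w_out):
            window = x_pad[:, :, i * s_h : i * s_h + k_h, j * s_w : j * s_w + k_w]
            # Divide by the full kernel area, so padded zeros count toward the average.
            y[:, :, i, j] = window.sum(axis=(2, 3)) / (k_h * k_w)
    return y


x = np.ones((1, 1, 2, 2))
y = avgpool2d_reference(x, kernel_size=(2, 2), stride=(1, 1), padding=(1, 1))
# y[0, 0] == [[0.25, 0.5, 0.25],
#             [0.5,  1.0, 0.5 ],
#             [0.25, 0.5, 0.25]]
```

Although every real input value is 1.0, only the centre output equals 1.0. Corner windows contain one real value and three padded zeros, so they average to 0.25.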

kernel_size property

kernel_size: Tuple[int, int]

Return the pooling window size.

Returns:

Type Description
tuple[int, int]

Kernel size as (k_h, k_w).

stride property

stride: Tuple[int, int]

Return the pooling stride.

Returns:

Type Description
tuple[int, int]

Stride as (s_h, s_w).

padding property

padding: Tuple[int, int]

Return the pooling padding.

Returns:

Type Description
tuple[int, int]

Padding as (p_h, p_w).

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This toggles self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This toggles self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).

get_config

get_config() -> Dict[str, Any]

Return JSON-serializable configuration for this pooling layer.

from_config classmethod

from_config(cfg: Dict[str, Any]) -> T

Reconstruct the pooling layer from a JSON configuration dict.

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(device) and returns an object of the same kind.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.

forward

forward(x: Tensor) -> Tensor

Apply average pooling to the input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (N, C, H, W).

required

Returns:

Type Description
Tensor

Output tensor of shape (N, C, H_out, W_out).

Notes
  • A Context is attached only if x.requires_grad is True.
  • The module delegates computation to AvgPool2dFn.

keydnn.GlobalAvgPool2D module-attribute

GlobalAvgPool2D = GlobalAvgPool2d

keydnn.GlobalAvgPool2d

Bases: StatelessConfigMixin, Module

Global average pooling module (NCHW).

Global average pooling reduces each channel to a single value by averaging over the spatial dimensions:

(N, C, H, W) -> (N, C, 1, 1)

It is commonly used near the end of CNN architectures to replace or shrink fully connected layers and to support variable spatial input sizes.

Notes
  • This module has no kernel/stride/padding hyperparameters.
  • The backward pass distributes gradients uniformly across all H*W input positions per channel.
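Both the forward reduction and the uniform gradient distribution fit in a few lines of NumPy. The function names below are hypothetical and the code is a sketch of the math, not of KeyDNN's autograd machinery.

```python
import numpy as np


def global_avg_pool2d(x):
    """(N, C, H, W) -> (N, C, 1, 1) by averaging over H and W."""
    return x.mean(axis=(2, 3), keepdims=True)


def global_avg_pool2d_backward(grad_out, input_shape):
    """Spread each (N, C, 1, 1) gradient evenly over the H*W input positions."""
    n, c, h, w = input_shape
    return np.broadcast_to(grad_out / (h * w), input_shape).copy()


x = np.arange(24.0).reshape(1, 2, 3, 4)
y = global_avg_pool2d(x)  # channel means: 5.5 and 17.5
grad_x = global_avg_pool2d_backward(np.ones_like(y), x.shape)  # every entry is 1/12
```

Because the reduction is over whatever H and W the input has, the same function works for any spatial size, which is what allows variable input resolutions.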

parameters

parameters() -> Iterable[IParameter]

Return an iterable over this module's parameters (recursive).

Returns:

Type Description
Iterable[IParameter]

Iterable of parameters registered on this module and all submodules.

train

train() -> Self

Set this module to training mode and recursively set all child modules to training mode.

Notes
  • This toggles self.training = True.
  • Modules that behave differently in training (e.g., Dropout, BatchNorm) should read self.training during forward/predict to decide behavior.
  • This method is intended to mirror PyTorch's Module.train().

eval

eval() -> Self

Set this module to evaluation (inference) mode and recursively set all child modules to evaluation mode.

Notes
  • This toggles self.training = False.
  • In eval mode, modules such as Dropout should be disabled, and BatchNorm should use running statistics (if implemented).
  • This method is intended to mirror PyTorch's Module.eval().

register_parameter

register_parameter(
    name: str, param: Optional[IParameter]
) -> None

Register a parameter with this module.

Parameters:

Name Type Description Default
name str

Name under which the parameter will be stored (e.g., "weight", "bias").

required
param Optional[IParameter]

Parameter instance to register. If None, registration is skipped.

required

Notes
  • If param is None, nothing is registered.
  • If the name already exists, it is overwritten intentionally.
  • This also sets the attribute on the module so self.<name> works.

register_module

register_module(
    name: str, module: Optional["Module"]
) -> None

Register a child module with this module.

Parameters:

Name Type Description Default
name str

Name under which the module will be stored.

required
module Optional[Module]

Child module to register. If None, registration is skipped.

required

Notes
  • If module is None, nothing is registered.
  • This also sets the attribute on the module so self.<name> works.

named_parameters

named_parameters(
    prefix: str = "",
) -> Iterator[tuple[str, IParameter]]

Return an iterator over (name, parameter) pairs (recursive).

Parameters:

Name Type Description Default
prefix str

Prefix to prepend to parameter names (used for recursion).

''

Returns:

Type Description
Iterator[tuple[str, IParameter]]

Iterator yielding (fully_qualified_name, parameter).
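A sketch of how the prefix argument can build fully qualified names during recursion. `NamedSketch` is hypothetical, and the dot separator between module and parameter names is an assumption borrowed from PyTorch's convention; this page does not state which separator KeyDNN uses.

```python
from typing import Iterator, Tuple


class NamedSketch:
    """Hypothetical module with parameter and child-module registries."""

    def __init__(self):
        self._parameters = {}
        self._modules = {}

    def named_parameters(self, prefix: str = "") -> Iterator[Tuple[str, object]]:
        # Own parameters first, each qualified with the current prefix.
        for name, param in self._parameters.items():
            yield prefix + name, param
        # Then recurse, extending the prefix with the child's name.
        for child_name, child in self._modules.items():
            yield from child.named_parameters(prefix + child_name + ".")


inner = NamedSketch()
inner._parameters["weight"] = "w"

outer = NamedSketch()
outer._parameters["bias"] = "b"
outer._modules["fc"] = inner

names = [name for name, _ in outer.named_parameters()]
```

Here `names` contains `bias` followed by `fc.weight`: the child's parameter picks up the child's registered name as its prefix.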

get_config

get_config() -> Dict[str, Any]

Return a JSON-serializable configuration dictionary.

For stateless modules, this method returns an empty dictionary, indicating that no configuration values are required to reconstruct the object.

Returns:

Type Description
Dict[str, Any]

An empty configuration dictionary.

from_config classmethod

from_config(cfg: Dict[str, Any]) -> Self

Reconstruct the module from a configuration dictionary.

Since stateless modules do not require any configuration parameters, the provided configuration is ignored and a default instance of the class is returned.

Parameters:

Name Type Description Default
cfg Dict[str, Any]

Configuration dictionary (unused).

required

Returns:

Type Description
Self

A newly constructed instance of the module.
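A minimal sketch of the stateless configuration round trip described above. `StatelessSketch` and `PoolSketch` are hypothetical classes that mimic the documented behavior; they are not KeyDNN's classes.

```python
import json


class StatelessSketch:
    """Hypothetical mixin-style base for modules with no configuration."""

    def get_config(self):
        return {}  # nothing is needed to rebuild the object

    @classmethod
    def from_config(cls, cfg):
        return cls()  # cfg is accepted but ignored


class PoolSketch(StatelessSketch):
    pass


# Round trip through JSON to confirm the config is serializable.
cfg = json.loads(json.dumps(PoolSketch().get_config()))
restored = PoolSketch.from_config(cfg)
```

Because `from_config` is a classmethod that calls `cls()`, a subclass gets an instance of its own type back rather than the base type.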

to

to(device: Device) -> 'Module'

Move this module (recursively) to device by moving all registered Parameters.

Notes
  • Uses _parameters / _modules registries as the source of truth. This avoids touching properties / methods / non-parameter attributes.
  • Assumes each Parameter/Tensor implements .to(device) and returns an object of the same type.
  • Rebinds attributes so self.weight, etc. now point to the moved objects.

to_

to_(device: Device) -> 'Module'

Move this module and all of its parameters to device in-place.

This method performs a recursive, in-place device migration of all parameters registered on this module and its submodules. Unlike Module.to(), which may rebind parameters to newly created objects, to_() attempts to preserve the identity of each parameter whenever possible.

Behavior
  • For each registered parameter:
    • If the parameter implements to_(), it is migrated in-place (object identity is preserved).
    • Otherwise, the parameter is migrated out-of-place via to(device) and rebound on the module as a fallback.
  • All child modules are recursively migrated using the same rules.

Parameters:

Name Type Description Default
device Device

Target device to which all parameters should be moved.

required

Returns:

Type Description
Module

This module (self), after in-place migration.

Notes
  • This method relies exclusively on the _parameters and _modules registries and does not inspect arbitrary attributes.
  • In-place migration is best-effort and depends on parameter support for to_(). Parameters that do not implement to_() will be replaced by newly created objects.
  • Autograd context is not preserved across device transfers; parameters should be treated as graph breaks after migration.
  • Optimizers that hold references to parameters remain valid only if all parameters support true in-place migration.
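The difference between in-place migration and the out-of-place fallback can be sketched as below. All names here (`InPlaceParam`, `CopyOnlyParam`, `migrate_in_place`) are hypothetical, devices are plain strings for illustration, and a list stands in for an optimizer's parameter references.

```python
class InPlaceParam:
    """Hypothetical parameter that supports to_()."""

    def __init__(self, device):
        self.device = device

    def to(self, device):
        return InPlaceParam(device)

    def to_(self, device):
        self.device = device  # same object, new device
        return self


class CopyOnlyParam:
    """Hypothetical parameter that only supports out-of-place to()."""

    def __init__(self, device):
        self.device = device

    def to(self, device):
        return CopyOnlyParam(device)


def migrate_in_place(registry, device):
    # Prefer to_() to keep object identity; otherwise rebind a moved copy.
    for name, param in list(registry.items()):
        if hasattr(param, "to_"):
            param.to_(device)
        else:
            registry[name] = param.to(device)
    return registry


weight = InPlaceParam("cpu")
bias = CopyOnlyParam("cpu")
registry = {"weight": weight, "bias": bias}
optimizer_refs = [weight, bias]  # stand-in for an optimizer's references

migrate_in_place(registry, "cuda:0")
```

After migration, `registry["weight"]` is still the original object, while `registry["bias"]` is a new one. The stale `bias` reference in `optimizer_refs` still reports the old device, which is the situation the last note warns about.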

forward

forward(x: Tensor) -> Tensor

Apply global average pooling to the input tensor.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (N, C, H, W).

required

Returns:

Type Description
Tensor

Output tensor of shape (N, C, 1, 1).

Notes
  • A Context is attached only if x.requires_grad is True.
  • The module delegates computation to GlobalAvgPool2dFn.
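The documented shape contract, (N, C, H, W) in and (N, C, 1, 1) out, can be illustrated with nested lists. This is a plain-Python sketch of the arithmetic only; it does not use KeyDNN's Tensor or GlobalAvgPool2dFn, and the function name `global_avg_pool2d` is hypothetical.

```python
def global_avg_pool2d(x):
    """Average each H x W feature map down to a single 1 x 1 value."""
    return [
        [
            [[sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))]]
            for channel in sample
        ]
        for sample in x
    ]


# Shape (N=1, C=2, H=2, W=2).
x = [[[[1.0, 2.0], [3.0, 4.0]],
      [[0.0, 0.0], [0.0, 8.0]]]]

y = global_avg_pool2d(x)  # shape (1, 2, 1, 1)
```

Channel 0 averages to 10.0 / 4 and channel 1 to 8.0 / 4, each kept as a 1 x 1 map so the result stays four-dimensional.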

Notes on Shapes and Devices

  • Convolution and pooling layers expect NCHW layout by default.
  • Parameters are created on the same device as the layer unless explicitly moved.
  • Inputs should be contiguous for best CUDA performance.
  • Shape mismatches are reported at runtime with descriptive errors.
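To make the NCHW convention concrete, the sketch below computes where an element sits in a contiguous, row-major NCHW buffer. The helper `nchw_offset` is hypothetical and describes the general layout convention, not KeyDNN's internal storage.

```python
def nchw_offset(n, c, h, w, shape):
    """Flat index of element (n, c, h, w) in a contiguous NCHW buffer."""
    _, C, H, W = shape
    return ((n * C + c) * H + h) * W + w


shape = (2, 3, 4, 5)  # N=2, C=3, H=4, W=5

first = nchw_offset(0, 0, 0, 0, shape)
next_w = nchw_offset(0, 0, 0, 1, shape)  # width is the fastest-moving axis
next_c = nchw_offset(0, 1, 0, 0, shape)  # one channel spans H * W elements
last = nchw_offset(1, 2, 3, 4, shape)
```

In this layout, neighboring positions along W are adjacent in memory and each channel occupies one block of H * W elements.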

For more details, see:

  • Guides → Tensors & Devices
  • Guides → Training Loop