
Utilities

Utility functions provide common helpers for reproducibility, preprocessing, and random number control.

All utilities documented here are part of KeyDNN’s public presentation API.


Determinism

set_deterministic

Configure KeyDNN's determinism policy.

This function centralizes reproducibility-related settings that are not strictly based on random number generation, such as thread-level execution determinism and backend configuration choices.

When enabled, this function can restrict CPU parallelism by setting thread-control environment variables used by common numerical libraries. This helps reduce nondeterminism caused by varying execution order in parallel reductions.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| enabled | bool | If True, configure the runtime to prefer deterministic behavior. | required |
| cpu_threads | int or None | Desired number of CPU threads for numerical libraries. When enabled=True, the default is 1 for maximum reproducibility. If None, thread-related environment variables are left unchanged. | 1 |

Returns:

| Type | Description |
| --- | --- |
| DeterminismState | The applied determinism configuration. |

Raises:

| Type | Description |
| --- | --- |
| TypeError | If enabled is not a boolean. |
| ValueError | If cpu_threads is neither None nor a positive integer. |

Notes
  • Thread-related settings are applied via environment variables and may need to be set before importing NumPy or BLAS libraries for full effect.
  • This function does not guarantee bitwise-identical results across different hardware or library implementations.
  • CUDA backend determinism (cuDNN / cuBLAS) is intentionally not configured here and will be handled separately when native backend APIs are exposed.
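The thread-limiting mechanism described above can be sketched with the standard library alone. This is a minimal illustration, not KeyDNN's implementation: the helper name apply_thread_limits and the exact set of environment variables are assumptions. The variable names themselves are the ones read by OpenMP, MKL, and OpenBLAS.

```python
import os

# Environment variables read by common numerical backends.
# Which of these KeyDNN actually sets is an assumption of this sketch.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def apply_thread_limits(enabled, cpu_threads=1):
    """Hypothetical stand-in that mirrors the documented validation rules."""
    if not isinstance(enabled, bool):
        raise TypeError("enabled must be a bool")
    if cpu_threads is not None:
        # bool is a subclass of int, so it is rejected explicitly here.
        if isinstance(cpu_threads, bool) or not isinstance(cpu_threads, int) or cpu_threads <= 0:
            raise ValueError("cpu_threads must be a positive int or None")

    applied = {}
    if enabled and cpu_threads is not None:
        for name in THREAD_ENV_VARS:
            os.environ[name] = str(cpu_threads)
            applied[name] = str(cpu_threads)
    return applied


# Run this before importing NumPy so the BLAS backend sees the limits at load time.
limits = apply_thread_limits(True, cpu_threads=1)
```

Because many BLAS builds read these variables only once, at import time, the call is placed ahead of any NumPy import in the sketch.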

Randomness

seed

Seed RNG sources used by KeyDNN's CPU / NumPy execution path.

This function seeds all random number generators that affect CPU-side execution, including:

  • Python's standard-library random module
  • NumPy's global random number generator (np.random)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| seed | int | Global seed value used to initialize all supported RNG sources. | required |

Raises:

| Type | Description |
| --- | --- |
| TypeError | If seed is not an integer. |

Notes
  • Calling this function again with the same seed resets the RNG sources to the same state, so the sequence of random numbers generated afterwards is identical each time.
  • This function should typically be called once at the start of a script, test, or experiment to ensure reproducibility.
  • This function does not control Python hash randomization (PYTHONHASHSEED), which must be set before process startup if required by the user.
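The reseeding behavior described in the notes above can be illustrated with a small stand-in. The helper seed_all is hypothetical and is not KeyDNN's seed(). It seeds the same two RNG sources listed in this section, and NumPy is treated as optional so the sketch runs either way.

```python
import random

try:
    import numpy as np  # optional here so the sketch also runs without NumPy
except ImportError:
    np = None


def seed_all(value):
    """Hypothetical stand-in: seed Python's random module and NumPy's global RNG."""
    if not isinstance(value, int):
        raise TypeError("seed must be an int")
    random.seed(value)
    if np is not None:
        np.random.seed(value)  # legacy global NumPy RNG (np.random)


def draw():
    """Draw a few numbers from each seeded source."""
    py_values = [random.random() for _ in range(3)]
    np_values = np.random.rand(3).tolist() if np is not None else []
    return py_values, np_values


seed_all(1234)
first = draw()

seed_all(1234)   # same seed: the sequence restarts from the same state
second = draw()

seed_all(4321)   # different seed: a different sequence
third = draw()
```

Here first and second hold identical values, while third differs, which is the property a reproducible experiment relies on.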

get_seed

Return the last seed set via seed().

Returns:

| Type | Description |
| --- | --- |
| int or None | The most recent seed value passed to seed(), or None if no seed has been set during the current process lifetime. |
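One plausible way to provide this behavior is a module-level record that is updated on every seeding call. The sketch below assumes that design. The names remember_and_seed and last_seed are hypothetical and only mirror the documented contract (None before any seed, otherwise the most recent value).

```python
import random

_last_seed = None  # hypothetical per-process record of the most recent seed


def remember_and_seed(value):
    """Seed the stdlib RNG and remember the value that was used."""
    global _last_seed
    if not isinstance(value, int):
        raise TypeError("seed must be an int")
    random.seed(value)
    _last_seed = value


def last_seed():
    """Return the most recent seed, or None if nothing has been seeded yet."""
    return _last_seed


before = last_seed()      # nothing seeded yet in this process
remember_and_seed(7)
remember_and_seed(42)     # a later call overwrites the earlier record
after = last_seed()
```

Recording the seed this way makes it easy to log the value alongside experiment results.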


Preprocessing

numpy_to_tensor

Convert a NumPy array into a KeyDNN Tensor.

This function creates a new Tensor instance with the same shape as the given NumPy array, copies the data into the tensor's internal storage, and optionally enables gradient tracking.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| arr | ndarray | Input NumPy array. The data is converted to float32 internally before being copied into the tensor. | required |
| device | Device | Target device on which to allocate the tensor. If None, the framework default device is used. | None |
| requires_grad | bool | Whether the resulting tensor should participate in autograd and accumulate gradients during backpropagation. | False |

Returns:

| Type | Description |
| --- | --- |
| Tensor | A newly allocated tensor containing a copy of the input data. |

Notes
  • The returned tensor does not share memory with the input NumPy array; the data is copied explicitly.
  • This function uses only public Tensor APIs and is safe to use in user-facing code, tests, and examples.
  • Gradient buffers are allocated lazily according to the tensor's requires_grad setting and the autograd engine behavior.
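The cast-and-copy semantics described above can be demonstrated with NumPy alone. The helper to_float32_copy is a hypothetical stand-in for the conversion step only. It does not construct a KeyDNN Tensor, and how KeyDNN performs the copy internally is not shown here.

```python
import numpy as np


def to_float32_copy(arr):
    """Hypothetical stand-in: cast to float32 and always return a fresh copy."""
    return np.array(arr, dtype=np.float32, copy=True)


src = np.arange(6, dtype=np.float64).reshape(2, 3)
buf = to_float32_copy(src)

# Mutating the source afterwards must not affect the converted buffer.
src[0, 0] = 99.0

shares = np.shares_memory(src, buf)  # False: the two arrays are independent
```

The converted buffer keeps the original shape, has dtype float32, and still holds the value from before the mutation.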

one_hot

Convert integer class labels to a one-hot encoded NumPy array.

This function takes an array of integer labels, flattens it to 1D, and returns a 2D one-hot encoded matrix with float32 dtype.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| labels | ndarray | Array of integer class labels. The input is flattened internally, so both shape (N,) and (N, 1) are supported. | required |
| num_classes | int | Total number of classes. Each label value must satisfy 0 <= label < num_classes. | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | One-hot encoded array of shape (N, num_classes) with dtype float32, where N is the number of labels. |

Notes
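  • No bounds checking is performed beyond NumPy indexing semantics: a label greater than or equal to num_classes raises an IndexError, while a negative label may be interpreted as an index from the end rather than rejected.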
  • This function is intended for dataset preprocessing and loss computation (e.g., classification targets).
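A minimal NumPy sketch of the encoding described above follows. The function one_hot_sketch is a hypothetical reimplementation written for illustration. It assumes the flatten-then-index approach implied by the parameter descriptions and is not taken from KeyDNN's source.

```python
import numpy as np


def one_hot_sketch(labels, num_classes):
    """Hypothetical reimplementation: flatten labels, then set one entry per row."""
    flat = np.asarray(labels).reshape(-1)            # accepts (N,) and (N, 1)
    out = np.zeros((flat.shape[0], num_classes), dtype=np.float32)
    out[np.arange(flat.shape[0]), flat] = 1.0        # row i gets a 1 at column flat[i]
    return out


# Labels given with shape (N, 1); the result has shape (N, num_classes).
encoded = one_hot_sketch(np.array([[0], [2], [1]]), num_classes=3)
```

With this approach an out-of-range label such as 3 with num_classes=3 fails inside the NumPy index assignment, which matches the IndexError behavior described in the notes.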

Notes

  • For fully reproducible experiments, call set_deterministic() and seed() before creating models, tensors, or datasets.
  • numpy_to_tensor() is useful for bridging external NumPy-based pipelines with KeyDNN’s tensor system.
  • one_hot() is commonly used with classification losses such as categorical cross-entropy.