Utilities

Utility functions provide common helpers for reproducibility, preprocessing, and random number control.

All utilities documented here are part of KeyDNN’s public presentation API.

Determinism

set_deterministic

Configure KeyDNN's determinism policy.

This function centralizes reproducibility-related settings that are not strictly based on random number generation, such as thread-level execution determinism and backend configuration choices.

When enabled, this function can restrict CPU parallelism by setting thread-control environment variables used by common numerical libraries. This helps reduce nondeterminism caused by varying execution order in parallel reductions.

Parameters:

Name	Type	Description	Default
`enabled`	`bool`	If True, configure the runtime to prefer deterministic behavior.	required
`cpu_threads`	`int or None`	Desired number of CPU threads for numerical libraries. When `enabled=True`, the default is `1` for maximum reproducibility. If `None`, thread-related environment variables are left unchanged.	`1`

Returns:

Type	Description
`DeterminismState`	The applied determinism configuration.

Raises:

Type	Description
`TypeError`	If `enabled` is not a boolean.
`ValueError`	If `cpu_threads` is not a positive integer or `None`.

Notes

Thread-related settings are applied via environment variables and may need to be set before importing NumPy or BLAS libraries for full effect.
This function does not guarantee bitwise-identical results across different hardware or library implementations.
CUDA backend determinism (cuDNN / cuBLAS) is intentionally not configured here and will be handled separately when native backend APIs are exposed.
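
The thread-limiting behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not KeyDNN's implementation: the names `set_deterministic_sketch`, `DeterminismStateSketch`, and `THREAD_ENV_VARS`, the fields of the returned state, and the exact set of environment variables are all hypothetical. Only the documented contract (argument validation, thread variables left unchanged when `cpu_threads` is `None`) is taken from this page.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Thread-control variables read by OpenMP, OpenBLAS, MKL, and NumExpr.
# Which of these KeyDNN actually sets is an assumption of this sketch.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


@dataclass(frozen=True)
class DeterminismStateSketch:
    """Hypothetical stand-in for DeterminismState; field names are assumed."""
    enabled: bool
    cpu_threads: Optional[int]


def set_deterministic_sketch(enabled, cpu_threads=1):
    # Documented contract: TypeError for a non-boolean `enabled`.
    if not isinstance(enabled, bool):
        raise TypeError("enabled must be a bool")
    # Documented contract: ValueError unless positive int or None.
    # bool is a subclass of int, so it is rejected explicitly here.
    if cpu_threads is not None:
        if isinstance(cpu_threads, bool) or not isinstance(cpu_threads, int):
            raise ValueError("cpu_threads must be a positive int or None")
        if cpu_threads < 1:
            raise ValueError("cpu_threads must be a positive int or None")
    # With cpu_threads=None the environment is left unchanged.
    if enabled and cpu_threads is not None:
        for name in THREAD_ENV_VARS:
            os.environ[name] = str(cpu_threads)
    return DeterminismStateSketch(enabled=enabled, cpu_threads=cpu_threads)


state = set_deterministic_sketch(True, cpu_threads=1)
```

Because numerical libraries typically read these variables when they are first loaded, a call like this is most effective at the very top of a script, before NumPy is imported.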

Randomness

seed

Seed RNG sources used by KeyDNN's CPU / NumPy execution path.

This function seeds all random number generators that affect CPU-side execution, including:

Python's standard-library random module
NumPy's global random number generator (np.random)

Parameters:

Name	Type	Description	Default
`seed`	`int`	Global seed value used to initialize all supported RNG sources.	required

Raises:

Type	Description
`TypeError`	If `seed` is not an integer.

Notes

Calling this function again with the same seed resets every supported RNG source to the same state, so subsequent random draws repeat the same sequence.
This function should typically be called once at the start of a script, test, or experiment to ensure reproducibility.
This function does not control Python hash randomization (PYTHONHASHSEED), which must be set before process startup if required by the user.
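
The effect of seeding both RNG sources listed above can be demonstrated with the standard library and NumPy alone. The helper `seed_sketch` is a hypothetical stand-in written for this example; it mirrors the documented behavior but is not KeyDNN's own code.

```python
import random

import numpy as np


def seed_sketch(seed):
    # Documented contract: TypeError for non-integer seeds.
    if not isinstance(seed, int):
        raise TypeError("seed must be an int")
    random.seed(seed)     # Python's standard-library RNG
    np.random.seed(seed)  # NumPy's global RNG; needs 0 <= seed < 2**32


# Seeding twice with the same value replays the same sequence.
seed_sketch(1234)
first = (random.random(), np.random.rand(3).tolist())

seed_sketch(1234)
second = (random.random(), np.random.rand(3).tolist())

same_sequence = first == second
```

Note that generators created independently, for example with `np.random.default_rng()`, are not affected by seeding NumPy's global state.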

get_seed

Return the last seed set via `seed`.

Returns:

Type	Description
`int or None`	The most recent seed value passed to `seed`, or `None` if no seed has been set during the current process lifetime.
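
One simple way to provide this behavior is to record the seed in process-level state each time it is set. The sketch below assumes that design; the names `_state`, `seed_sketch`, and `get_seed_sketch` are hypothetical, and KeyDNN may store the value differently.

```python
import random
from typing import Optional

# Hypothetical process-level record of the most recent seed.
_state = {"seed": None}


def seed_sketch(seed: int) -> None:
    if not isinstance(seed, int):
        raise TypeError("seed must be an int")
    random.seed(seed)
    _state["seed"] = seed  # remember the value for later queries


def get_seed_sketch() -> Optional[int]:
    # None until seed_sketch() has been called in this process.
    return _state["seed"]


before = get_seed_sketch()
seed_sketch(7)
seed_sketch(42)
after = get_seed_sketch()  # reflects the most recent call only
```

Logging the returned value alongside experiment results makes it easier to rerun an experiment with the same seed later.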

Preprocessing

numpy_to_tensor

Convert a NumPy array into a KeyDNN Tensor.

This function creates a new Tensor instance with the same shape as the given NumPy array, copies the data into the tensor's internal storage, and optionally enables gradient tracking.

Parameters:

Name	Type	Description	Default
`arr`	`ndarray`	Input NumPy array. The data is converted to `float32` internally before being copied into the tensor.	required
`device`	`Device or None`	Target device on which to allocate the tensor. If `None`, the framework default device is used.	`None`
`requires_grad`	`bool`	Whether the resulting tensor should participate in autograd and accumulate gradients during backpropagation.	`False`

Returns:

Type	Description
`Tensor`	A newly allocated tensor containing a copy of the input data.

Notes

The returned tensor does not share memory with the input NumPy array; the data is copied explicitly.
This function uses only public Tensor APIs and is safe to use in user-facing code, tests, and examples.
Gradient buffers are allocated lazily according to the tensor's requires_grad setting and the autograd engine behavior.
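
The conversion and copy semantics described above can be illustrated with NumPy alone. This sketch does not construct a KeyDNN Tensor; it only shows what "converted to `float32` and copied" means for the source array, using a plain `np.array(..., copy=True)` call as a stand-in for the tensor's internal storage.

```python
import numpy as np

# Source data with a non-float32 dtype.
arr = np.arange(6, dtype=np.int64).reshape(2, 3)

# Stand-in for the tensor's storage: a float32 copy with the same shape.
copied = np.array(arr, dtype=np.float32, copy=True)

# Mutating the source afterwards does not affect the copy.
arr[0, 0] = 99

shares_memory = np.shares_memory(arr, copied)
first_value = float(copied[0, 0])
```

Because the data is copied, later in-place changes to the NumPy array have to be converted again to be reflected in a tensor.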

one_hot

Convert integer class labels to a one-hot encoded NumPy array.

This function takes an array of integer labels, flattens it to 1D, and returns a 2D one-hot encoded matrix with float32 dtype.

Parameters:

Name	Type	Description	Default
`labels`	`ndarray`	Array of integer class labels. The input is flattened internally, so both shape `(N,)` and `(N, 1)` are supported.	required
`num_classes`	`int`	Total number of classes. Each label value must satisfy `0 <= label < num_classes`.	required

Returns:

Type	Description
`ndarray`	One-hot encoded array of shape `(N, num_classes)` with dtype `float32`, where `N` is the number of labels.

Notes

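No bounds checking is performed beyond NumPy indexing semantics; a label greater than or equal to num_classes will raise an IndexError. Negative labels may not raise, because NumPy integer indexing counts negative indices from the end of the axis.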
This function is intended for dataset preprocessing and loss computation (e.g., classification targets).
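
The encoding described above can be reproduced in a few lines of NumPy. The function `one_hot_sketch` is a hypothetical re-implementation written for illustration; it follows the documented behavior (flattened input, `float32` output, no explicit bounds check) but is not necessarily how KeyDNN implements `one_hot`.

```python
import numpy as np


def one_hot_sketch(labels, num_classes):
    # Flatten so that shapes (N,) and (N, 1) are treated the same way.
    flat = np.asarray(labels).reshape(-1)
    out = np.zeros((flat.shape[0], num_classes), dtype=np.float32)
    # Set one entry per row; out-of-range labels fail in NumPy indexing.
    out[np.arange(flat.shape[0]), flat] = 1.0
    return out


encoded = one_hot_sketch(np.array([[0], [2], [1]]), num_classes=3)
# encoded is:
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Each row contains a single 1.0 in the column given by its label, which is the target layout expected by categorical cross-entropy style losses.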

Notes

For fully reproducible experiments, call set_deterministic() and seed() before creating models, tensors, or datasets.
numpy_to_tensor() is useful for bridging external NumPy-based pipelines with KeyDNN’s tensor system.
one_hot() is commonly used with classification losses such as categorical cross-entropy.