Utilities
Utility functions provide common helpers for reproducibility, preprocessing, and random number control.
All utilities documented here are part of KeyDNN’s public presentation API.
Determinism
set_deterministic
Configure KeyDNN's determinism policy.
This function centralizes reproducibility-related settings that are not strictly based on random number generation, such as thread-level execution determinism and backend configuration choices.
When enabled, this function can restrict CPU parallelism by setting thread-control environment variables used by common numerical libraries. This helps reduce nondeterminism caused by varying execution order in parallel reductions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| enabled | bool | If True, configure the runtime to prefer deterministic behavior. | required |
| cpu_threads | int or None | Desired number of CPU threads for numerical libraries. When | 1 |
Returns:
| Type | Description |
|---|---|
| DeterminismState | The applied determinism configuration. |
Raises:
| Type | Description |
|---|---|
| TypeError | If |
| ValueError | If |
Notes
- Thread-related settings are applied via environment variables and may need to be set before importing NumPy or BLAS libraries for full effect.
- This function does not guarantee bitwise-identical results across different hardware or library implementations.
- CUDA backend determinism (cuDNN / cuBLAS) is intentionally not configured here and will be handled separately when native backend APIs are exposed.
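The thread-limiting mechanism described above can be sketched with the standard library alone. This is a minimal illustration, not KeyDNN's implementation: the helper name limit_cpu_threads, the exact set of environment variables, and the validation rules are assumptions made for this example. The variables shown are the ones commonly honored by OpenMP runtimes, Intel MKL, and OpenBLAS.

```python
import os

# Environment variables read by common threading backends
# (OpenMP runtimes, Intel MKL, OpenBLAS). Which of these KeyDNN
# actually sets is an assumption of this sketch.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_cpu_threads(cpu_threads=1):
    """Hypothetical helper: cap numerical-library threads via env vars."""
    if cpu_threads is None:
        # Assumption: None leaves the environment untouched.
        return {}
    if isinstance(cpu_threads, bool) or not isinstance(cpu_threads, int):
        # Assumption: non-integer values are rejected.
        raise TypeError("cpu_threads must be an int or None")
    if cpu_threads < 1:
        # Assumption: thread counts below 1 are rejected.
        raise ValueError("cpu_threads must be >= 1")

    applied = {name: str(cpu_threads) for name in THREAD_ENV_VARS}
    os.environ.update(applied)
    return applied


# Apply the limit first; NumPy and other BLAS-backed libraries should be
# imported only after this point so they see the variables at load time.
applied = limit_cpu_threads(1)
```

Because these libraries typically read the variables once when they are loaded, the ordering in the last lines matters more than the helper itself.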
Randomness
seed
Seed RNG sources used by KeyDNN's CPU / NumPy execution path.
This function seeds all random number generators that affect CPU-side execution, including:
- Python's standard-library random module
- NumPy's global random number generator (np.random)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| seed | int | Global seed value used to initialize all supported RNG sources. | required |
Raises:
| Type | Description |
|---|---|
| TypeError | If |
Notes
- Calling this function again with the same seed resets the generators to the same state, so the random sequence that follows is identical each time.
- This function should typically be called once at the start of a script, test, or experiment to ensure reproducibility.
- This function does not control Python hash randomization (PYTHONHASHSEED), which must be set before process startup if required by the user.
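The seeding behavior described above can be approximated with a few lines of Python. This is a sketch under stated assumptions, not KeyDNN's source: the names seed_all and last_seed are hypothetical stand-ins, and the type check is a guess at what the TypeError entry covers. Only random.seed and np.random.seed, the two RNG sources named in this section, are used.

```python
import random

import numpy as np

_last_seed = None  # hypothetical module-level record of the latest seed


def seed_all(value):
    """Hypothetical stand-in: seed Python's random and NumPy's global RNG."""
    global _last_seed
    if isinstance(value, bool) or not isinstance(value, int):
        # Assumption: non-integer seeds are rejected.
        raise TypeError("seed must be an int")
    random.seed(value)
    np.random.seed(value)
    _last_seed = value


def last_seed():
    """Hypothetical stand-in: return the most recent seed, or None."""
    return _last_seed


# Re-seeding with the same value replays the same random sequence.
seed_all(123)
first = (random.random(), np.random.rand(3).tolist())
seed_all(123)
second = (random.random(), np.random.rand(3).tolist())
```

Here first and second compare equal, which is the property the first note relies on. Code that draws from an independent generator (for example one created with np.random.default_rng) is not affected by seeding the global state.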
get_seed
Return the last seed set via seed().
Returns:
| Type | Description |
|---|---|
| int or None | The most recent seed value passed to seed() |
Preprocessing
numpy_to_tensor
Convert a NumPy array into a KeyDNN Tensor.
This function creates a new Tensor instance with the same shape as
the given NumPy array, copies the data into the tensor's internal
storage, and optionally enables gradient tracking.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| arr | ndarray | Input NumPy array. The data is converted to | required |
| device | Device | Target device on which to allocate the tensor. If | None |
| requires_grad | bool | Whether the resulting tensor should participate in autograd and accumulate gradients during backpropagation. | False |
Returns:
| Type | Description |
|---|---|
| Tensor | A newly allocated tensor containing a copy of the input data. |
Notes
- The returned tensor does not share memory with the input NumPy array; the data is copied explicitly.
- This function uses only public Tensor APIs and is safe to use in user-facing code, tests, and examples.
- Gradient buffers are allocated lazily according to the tensor's requires_grad setting and the autograd engine behavior.
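The copy semantics in the first note can be illustrated with NumPy alone. The snippet below does not use KeyDNN's Tensor; the storage array is a hypothetical stand-in for the tensor's internal buffer, and it only shows what "does not share memory" means in practice.

```python
import numpy as np

source = np.arange(6, dtype=np.float32).reshape(2, 3)

# Stand-in for the tensor's internal storage: an explicit copy,
# with the same shape as the input array.
storage = source.copy()

# The copy owns its own buffer, so the two arrays do not overlap in memory.
shares = np.shares_memory(source, storage)

# Mutating the original after the conversion leaves the copy unchanged.
source[0, 0] = 99.0
unchanged = float(storage[0, 0])
```

The practical consequence is that a NumPy buffer can be reused or modified after conversion without affecting the tensor's contents.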
one_hot
Convert integer class labels to a one-hot encoded NumPy array.
This function takes a 1D array of integer labels and returns a 2D
one-hot encoded matrix with float32 dtype.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| labels | ndarray | Array of integer class labels. The input is flattened internally, so both shape | required |
| num_classes | int | Total number of classes. Each label value must satisfy | required |
Returns:
| Type | Description |
|---|---|
| ndarray | One-hot encoded array of shape |
Notes
- No bounds checking is performed beyond NumPy indexing semantics; invalid label values will raise an IndexError.
- This function is intended for dataset preprocessing and loss computation (e.g., classification targets).
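A one-hot encoder with the properties described above (flattened input, float32 output, bounds handled by NumPy indexing) can be sketched as follows. The function name one_hot_sketch and the output shape (N, num_classes) are assumptions of this example rather than a statement about KeyDNN's implementation.

```python
import numpy as np


def one_hot_sketch(labels, num_classes):
    """Hypothetical reference version of a one-hot encoder."""
    # Flatten so that any input shape is treated as a 1D list of labels.
    flat = np.asarray(labels).reshape(-1)
    # Assumed output layout: one row per label, one column per class.
    out = np.zeros((flat.shape[0], num_classes), dtype=np.float32)
    # Fancy indexing sets one entry per row; NumPy raises IndexError
    # when a label is >= num_classes.
    out[np.arange(flat.shape[0]), flat] = 1.0
    return out


encoded = one_hot_sketch(np.array([0, 2, 1]), num_classes=3)
# encoded is:
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Because the sketch relies on plain NumPy indexing, a negative label such as -1 wraps around to the last class instead of raising, so inputs that may contain negative values are worth validating beforehand.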
Notes
- For fully reproducible experiments, call set_deterministic() and seed() before creating models, tensors, or datasets.
- numpy_to_tensor() is useful for bridging external NumPy-based pipelines with KeyDNN’s tensor system.
- one_hot() is commonly used with classification losses such as categorical cross-entropy.