Contains the Partitioner class which implements a dictionary to quickly retrieve an integer index array for validation indices of a given cross-validation partition (fold). This is an implementation of Algorithm 1 in the paper by O.-C. G. Engstrøm and M. H. Jensen: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/full/10.1002/cem.70008
The implementation is written using NumPy.
Author: Ole-Christian Galbo Engstrøm
E-mail: ocge@foss.dk
Classes
Partitioner(folds) | Implements Algorithm 1 by O.-C. G. Engstrøm and M. H. Jensen.
- class cvmatrix.partitioner.Partitioner(folds: Iterable[Hashable])
Bases: object
Implements Algorithm 1 by O.-C. G. Engstrøm and M. H. Jensen: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/full/10.1002/cem.70008
This class is used to partition data into validation sets based on cross-validation folds. It is detached from the CVMatrix so that it does not need to be pickled when using the CVMatrix in a multiprocessing context. See the parallel implementation in the ikpls package by O.-C. G. Engstrøm et al.: https://github.com/sm00thix/ikpls. CVMatrix and Partitioner are used together in the cross_validate method of the ikpls.fast_cross_validation.numpy_ikpls.PLS class.
- Parameters:
folds (Iterable of Hashable with N elements) – An iterable defining cross-validation splits. Each unique value in folds corresponds to a different fold. The indices of the samples in each fold will be stored in a dictionary for quick access.
- folds_dict
A dictionary where keys are fold identifiers (from the folds parameter) and values are NumPy arrays containing the indices of the samples in that fold. This allows for efficient retrieval of validation indices for each fold.
- Type:
dict[Hashable, npt.NDArray[np.int_]]
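As an illustration of the mapping described above, the following is a minimal sketch of how such a dictionary could be built in a single pass over folds. The helper name build_folds_dict is hypothetical and not part of cvmatrix; the package's actual construction may differ.

```python
from collections.abc import Hashable, Iterable

import numpy as np
import numpy.typing as npt


def build_folds_dict(
    folds: Iterable[Hashable],
) -> dict[Hashable, npt.NDArray[np.int_]]:
    """Hypothetical helper (not part of cvmatrix): group sample indices by fold."""
    index_lists: dict[Hashable, list[int]] = {}
    # Single pass over folds: append each sample index to its fold's list.
    for i, fold in enumerate(folds):
        index_lists.setdefault(fold, []).append(i)
    # Convert each list to an integer NumPy array so it can index data directly.
    return {
        fold: np.asarray(indices, dtype=np.int_)
        for fold, indices in index_lists.items()
    }


# Six samples assigned to three folds labelled "a", "b", and "c".
folds = ["a", "b", "a", "c", "b", "a"]
folds_dict = build_folds_dict(folds)
print(folds_dict["a"])  # [0 2 5]
```

Any hashable value can serve as a fold identifier, so integer or string labels work equally well in this sketch.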
- get_validation_indices(fold: Hashable) → ndarray[tuple[Any, ...], dtype[int64]]
Returns an integer array of indices of the validation partition samples for fold.
- Parameters:
fold (Hashable) – The fold for which to return the validation partition indices.
- Returns:
Integer array of indices of the validation partition samples for the given fold.
- Return type:
Array of shape (N_val,)
- Raises:
ValueError – If fold was not one of the values in the folds parameter of the constructor.
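The lookup behaviour documented for get_validation_indices can be sketched as a dictionary access that turns a missing key into a ValueError. The function lookup_validation_indices and the example data below are hypothetical stand-ins written for illustration; they are not the cvmatrix implementation.

```python
from collections.abc import Hashable

import numpy as np
import numpy.typing as npt

# Hypothetical precomputed mapping in the form described for folds_dict.
folds_dict: dict[Hashable, npt.NDArray[np.int_]] = {
    0: np.array([0, 3], dtype=np.int_),
    1: np.array([1, 4], dtype=np.int_),
    2: np.array([2, 5], dtype=np.int_),
}


def lookup_validation_indices(fold: Hashable) -> npt.NDArray[np.int_]:
    """Hypothetical stand-in for get_validation_indices."""
    try:
        return folds_dict[fold]
    except KeyError as exc:
        # Report an unknown fold as a ValueError, as the documentation states.
        raise ValueError(f"Fold {fold!r} is not one of the known folds.") from exc


X = np.arange(12).reshape(6, 2)  # six samples, two features
val_idx = lookup_validation_indices(1)
X_val = X[val_idx]  # selects rows 1 and 4 of X
```

Because the index arrays are computed once up front, each per-fold lookup in this sketch is a constant-time dictionary access followed by NumPy integer indexing.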