Datasets

The PyColibri datasets module offers a variety of datasets widely used in machine learning and computer vision research, including both built-in datasets and custom datasets. This module simplifies the process of loading and transforming datasets, providing a consistent interface for researchers and developers to work with different data types.

Built-in Datasets

The built-in datasets supported by this module include popular datasets like MNIST, CIFAR-10, CIFAR-100, and Fashion MNIST. The complete list is aviailable in the colibri.data.utils.BUILTIN_DATASETS dictionary.

colibri.data.utils.load_builtin_dataset

Load a built-in dataset.

colibri.data.utils.update_builtin_path

Update the built-in path by creating a new directory with the given name inside the specified path.

Custom Dataset Class

A flexible dataset class is available to handle both built-in and custom datasets. It includes methods for loading datasets and applying necessary transformations.

colibri.data.datasets.CustomDataset

Custom dataset.

Custom Datasets

Custom datasets are supported by the Handlers, Currently, the module supports the following datasets: Cave

colibri.data.sota_datasets.CaveDataset

Class to handle the CAVE dataset.

Default Transformations

The module provides a default transformation class to handle the standard preprocessing tasks required for the input and output data. This includes conversion to tensors, among other operations, ensuring that the data is in the right format for model training.

colibri.data.datasets.DefaultTransform

Default transformation class.