Datasets

DD-Ranking provides a set of datasets commonly used by existing dataset distillation methods. Users can load any of them for evaluation through the following interface:

dd_ranking.utils.get_dataset(dataset: str, data_path: str, im_size: tuple, use_zca: bool, custom_val_trans: Optional[Callable], device: str)

Parameters

  • dataset(str): Name of the dataset.
  • data_path(str): Path to the dataset.
  • im_size(tuple): Image size.
  • use_zca(bool): Whether to use ZCA whitening. When set to True, the dataset will not be normalized using the mean and standard deviation of the training set.
  • custom_val_trans(Optional[Callable]): Custom transformation on the validation set.
  • device(str): Device for performing ZCA whitening.
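
A minimal sketch of a call, using only the parameters documented above. The path `./data`, the helper name `load_dataset`, and the choice of `"cpu"` are illustrative assumptions; the structure of the return value is not described in this section, so the sketch returns it without unpacking.

```python
# Keyword arguments mirroring the documented get_dataset signature.
dataset_kwargs = {
    "dataset": "CIFAR10",
    "data_path": "./data",       # hypothetical path; point this at your own copy
    "im_size": (32, 32),         # CIFAR10 default image size
    "use_zca": False,            # False: ZCA whitening is not requested
    "custom_val_trans": None,    # Optional[Callable]; None passes no custom transform
    "device": "cpu",             # documented as the device used for ZCA whitening
}


def load_dataset(kwargs=dataset_kwargs):
    """Hypothetical helper: call get_dataset with the arguments above.

    Requires DD-Ranking to be installed; the import is kept inside the
    function so the argument dictionary can be inspected without it.
    """
    from dd_ranking.utils import get_dataset
    return get_dataset(**kwargs)
```

Calling `load_dataset()` forwards the dictionary as keyword arguments, so switching datasets only requires changing `dataset` and `im_size`.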

The following datasets are currently supported, each listed with its default settings. We will keep updating this section as more datasets are added.

  • CIFAR10
    • channels: 3
    • im_size: (32, 32)
    • num_classes: 10
    • mean: [0.4914, 0.4822, 0.4465]
    • std: [0.2023, 0.1994, 0.2010]
  • CIFAR100
    • channels: 3
    • im_size: (32, 32)
    • num_classes: 100
    • mean: [0.4914, 0.4822, 0.4465]
    • std: [0.2023, 0.1994, 0.2010]
  • TinyImageNet
    • channels: 3
    • im_size: (64, 64)
    • num_classes: 200
    • mean: [0.485, 0.456, 0.406]
    • std: [0.229, 0.224, 0.225]
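
The `mean` and `std` values above are per-channel statistics. As a rough illustration of how such values are typically applied when `use_zca` is False, the sketch below normalizes a single RGB pixel. It assumes the conventional formula `(x - mean) / std` on pixel values scaled to [0, 1]; the dictionary `DATASET_DEFAULTS` and the function `normalize_pixel` are hypothetical names, not part of DD-Ranking.

```python
# Default settings copied from the list above (hypothetical container).
DATASET_DEFAULTS = {
    "CIFAR10": {
        "channels": 3, "im_size": (32, 32), "num_classes": 10,
        "mean": [0.4914, 0.4822, 0.4465], "std": [0.2023, 0.1994, 0.2010],
    },
    "CIFAR100": {
        "channels": 3, "im_size": (32, 32), "num_classes": 100,
        "mean": [0.4914, 0.4822, 0.4465], "std": [0.2023, 0.1994, 0.2010],
    },
    "TinyImageNet": {
        "channels": 3, "im_size": (64, 64), "num_classes": 200,
        "mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225],
    },
}


def normalize_pixel(pixel, mean, std):
    """Apply (x - mean) / std to each channel of one pixel in [0, 1]."""
    return [(x - m) / s for x, m, s in zip(pixel, mean, std)]


cfg = DATASET_DEFAULTS["CIFAR10"]
# A pixel equal to the channel means maps to zero in every channel.
centered = normalize_pixel(cfg["mean"], cfg["mean"], cfg["std"])
```

A pixel one standard deviation above the mean in each channel maps to approximately 1.0 per channel, which is a quick way to check that the statistics were paired with the right dataset.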