Folder Dataset

Custom Folder Dataset. This script creates a custom dataset from a folder.

class anomalib.data.folder.Folder(normal_dir: str | Path, root: str | Path | None = None, abnormal_dir: str | Path | None = None, normal_test_dir: str | Path | None = None, mask_dir: str | Path | None = None, normal_split_ratio: float = 0.2, extensions: tuple[str] | None = None, image_size: int | tuple[int, int] | None = None, center_crop: int | tuple[int, int] | None = None, normalization: str | InputNormalizationMethod = InputNormalizationMethod.IMAGENET, train_batch_size: int = 32, eval_batch_size: int = 32, num_workers: int = 8, task: TaskType = TaskType.SEGMENTATION, transform_config_train: str | A.Compose | None = None, transform_config_eval: str | A.Compose | None = None, test_split_mode: TestSplitMode = TestSplitMode.FROM_DIR, test_split_ratio: float = 0.2, val_split_mode: ValSplitMode = ValSplitMode.FROM_TEST, val_split_ratio: float = 0.5, seed: int | None = None)[source]

Bases: AnomalibDataModule

Folder DataModule.

Parameters:
  • normal_dir (str | Path) – Name of the directory containing normal images.

  • root (str | Path | None) – Path to the root folder containing normal and abnormal dirs. Defaults to None.

  • abnormal_dir (str | Path | None, optional) – Name of the directory containing abnormal images. Defaults to None.

  • normal_test_dir (str | Path | None, optional) – Path to the directory containing normal images for the test dataset. Defaults to None.

  • mask_dir (str | Path | None, optional) – Path to the directory containing the mask annotations. Defaults to None.

  • normal_split_ratio (float, optional) – Fraction of the normal training images that is moved to the test set when the test set does not contain any normal images. Defaults to 0.2.

  • extensions (tuple[str, ...] | None, optional) – Image file extensions to read from the directory. Defaults to None.

  • image_size (int | tuple[int, int] | None, optional) – Size of the input image. Defaults to None.

  • center_crop (int | tuple[int, int] | None, optional) – When provided, the images will be center-cropped to the provided dimensions. Defaults to None.

  • normalization (str | InputNormalizationMethod, optional) – Normalization method applied to the input images. Defaults to InputNormalizationMethod.IMAGENET.

  • train_batch_size (int, optional) – Training batch size. Defaults to 32.

  • eval_batch_size (int, optional) – Evaluation batch size. Defaults to 32.

  • num_workers (int, optional) – Number of workers. Defaults to 8.

  • task (TaskType, optional) – Task type. Could be classification, detection or segmentation. Defaults to segmentation.

  • transform_config_train (str | A.Compose | None, optional) – Config for pre-processing during training. Defaults to None.

  • transform_config_eval (str | A.Compose | None, optional) – Config for pre-processing during evaluation. Defaults to None.

  • test_split_mode (TestSplitMode, optional) – Setting that determines how the testing subset is obtained. Defaults to TestSplitMode.FROM_DIR.

  • test_split_ratio (float, optional) – Fraction of images from the train set that will be reserved for testing. Defaults to 0.2.

  • val_split_mode (ValSplitMode, optional) – Setting that determines how the validation subset is obtained. Defaults to ValSplitMode.FROM_TEST.

  • val_split_ratio (float, optional) – Fraction of train or test images that will be reserved for validation. Defaults to 0.5.

  • seed (int | None, optional) – Seed used during random subset splitting. Defaults to None.
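
As a minimal sketch of how the parameters above fit together, the snippet below creates a throwaway directory layout and assembles keyword arguments that match the documented signature. The directory names (good, crack, mask), the helper names make_example_layout and build_datamodule, and the chosen image_size are assumptions made for illustration only. build_datamodule is not called here; it assumes an installed anomalib release that exposes anomalib.data.folder.Folder with the signature shown in this section.

```python
import tempfile
from pathlib import Path


def make_example_layout(root: Path) -> dict:
    """Create empty example directories and return keyword arguments for Folder.

    The directory names are hypothetical; real datasets must contain images.
    """
    for name in ("good", "crack", "mask"):
        (root / name).mkdir(parents=True, exist_ok=True)
    return {
        "root": root,
        "normal_dir": "good",        # directory with normal images
        "abnormal_dir": "crack",     # directory with abnormal images
        "mask_dir": root / "mask",   # mask annotations for segmentation
        "image_size": 256,
        "train_batch_size": 32,
        "eval_batch_size": 32,
        "seed": 0,
    }


def build_datamodule(kwargs: dict):
    """Instantiate the datamodule. Requires anomalib to be installed."""
    # Import path taken from the class name documented in this section.
    from anomalib.data.folder import Folder

    return Folder(**kwargs)


example_root = Path(tempfile.mkdtemp())
folder_kwargs = make_example_layout(example_root)
print(sorted(folder_kwargs))
```

Keeping the arguments in a plain dictionary makes it easy to check the layout before the datamodule is constructed. The task argument is left at its documented default (segmentation), which is why a mask directory is supplied.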

prepare_data_per_node

If True, each LOCAL_RANK=0 will call prepare data. Otherwise only NODE_RANK=0, LOCAL_RANK=0 will prepare data.

allow_zero_length_dataloader_with_multiple_devices

If True, dataloader with zero length within local rank is allowed. Default value is False.

class anomalib.data.folder.FolderDataset(task: TaskType, transform: A.Compose, normal_dir: str | Path, root: str | Path | None = None, abnormal_dir: str | Path | None = None, normal_test_dir: str | Path | None = None, mask_dir: str | Path | None = None, split: str | Split | None = None, extensions: tuple[str, ...] | None = None)[source]

Bases: AnomalibDataset

Folder dataset.

Parameters:
  • task (TaskType) – Task type (classification, detection or segmentation).

  • transform (A.Compose) – Albumentations Compose object describing the transforms that are applied to the inputs.

  • split (str | Split | None) – Fixed subset split that follows from the folder structure on the file system. Choose from [Split.FULL, Split.TRAIN, Split.TEST].

  • normal_dir (str | Path) – Path to the directory containing normal images.

  • root (str | Path | None) – Root folder of the dataset.

  • abnormal_dir (str | Path | None, optional) – Path to the directory containing abnormal images.

  • normal_test_dir (str | Path | None, optional) – Path to the directory containing normal images for the test dataset. Defaults to None.

  • mask_dir (str | Path | None, optional) – Path to the directory containing the mask annotations. Defaults to None.

  • extensions (tuple[str, ...] | None, optional) – Image file extensions to read from the directory.

Raises:

ValueError – When task is set to classification and mask_dir is provided. A mask_dir requires task to be set to segmentation.
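
The rule described under Raises can be illustrated with a small stand-alone check. This is a sketch of the documented condition only, not the library's own code: the function name check_task_and_mask is hypothetical, and plain strings stand in for the TaskType enum members.

```python
from pathlib import Path
from typing import Optional, Union


def check_task_and_mask(task: str, mask_dir: Optional[Union[str, Path]]) -> None:
    """Raise ValueError for the combination the documentation rules out.

    `task` is a plain string here as a stand-in for TaskType.
    """
    if task == "classification" and mask_dir is not None:
        raise ValueError(
            "mask_dir was provided but task is 'classification'; "
            "set task to 'segmentation' when masks are available."
        )


# A segmentation task with masks passes the check silently.
check_task_and_mask("segmentation", "masks")

# A classification task with masks is rejected.
try:
    check_task_and_mask("classification", "masks")
    rejected = False
except ValueError:
    rejected = True
```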

anomalib.data.folder.make_folder_dataset(normal_dir: str | Path, root: str | Path | None = None, abnormal_dir: str | Path | None = None, normal_test_dir: str | Path | None = None, mask_dir: str | Path | None = None, split: str | Split | None = None, extensions: tuple[str, ...] | None = None) → DataFrame[source]

Make Folder Dataset.

Parameters:
  • normal_dir (str | Path) – Path to the directory containing normal images.

  • root (str | Path | None) – Path to the root directory of the dataset.

  • abnormal_dir (str | Path | None, optional) – Path to the directory containing abnormal images.

  • normal_test_dir (str | Path | None, optional) – Path to the directory containing normal images for the test dataset. Normal test images will be a split of normal_dir if None. Defaults to None.

  • mask_dir (str | Path | None, optional) – Path to the directory containing the mask annotations. Defaults to None.

  • split (str | Split | None, optional) – Dataset split (i.e., Split.FULL, Split.TRAIN or Split.TEST). Defaults to None.

  • extensions (tuple[str, ...] | None, optional) – Image file extensions to read from the directory.

Returns:

An output DataFrame containing samples for the requested split (i.e., train or test).

Return type:

DataFrame
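
To make the idea of a samples table concrete, the sketch below collects image paths from a folder layout and assigns labels and splits, including the fallback where normal test images are split off from normal_dir when no normal_test_dir is given. It is an illustration under stated assumptions and not the implementation of make_folder_dataset: the function names, the column names (image_path, label, split), the default extensions, and the shuffling strategy are all hypothetical. The rows are plain dictionaries; passing them to pandas.DataFrame would produce a table.

```python
import random
import tempfile
from pathlib import Path
from typing import Dict, List, Optional, Tuple


def list_images(directory: Path, extensions: Tuple[str, ...]) -> List[Path]:
    """Return files in `directory` whose suffix is in `extensions`, sorted by name."""
    return sorted(p for p in directory.iterdir() if p.suffix.lower() in extensions)


def collect_samples(
    normal_dir: Path,
    abnormal_dir: Optional[Path] = None,
    normal_test_dir: Optional[Path] = None,
    extensions: Tuple[str, ...] = (".png", ".jpg"),
    normal_split_ratio: float = 0.2,
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Build one row per image with hypothetical image_path/label/split columns."""
    normal_train = list_images(normal_dir, extensions)
    if normal_test_dir is not None:
        normal_test = list_images(normal_test_dir, extensions)
    else:
        # No dedicated directory: move a fraction of the normal images to test.
        random.Random(seed).shuffle(normal_train)
        n_test = int(len(normal_train) * normal_split_ratio)
        normal_test, normal_train = normal_train[:n_test], normal_train[n_test:]

    rows = [{"image_path": str(p), "label": "normal", "split": "train"} for p in normal_train]
    rows += [{"image_path": str(p), "label": "normal", "split": "test"} for p in normal_test]
    if abnormal_dir is not None:
        # Abnormal images are only used for testing in this sketch.
        abnormal = list_images(abnormal_dir, extensions)
        rows += [{"image_path": str(p), "label": "abnormal", "split": "test"} for p in abnormal]
    return rows


# Demo with empty placeholder files instead of real images.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "good").mkdir()
(demo_root / "crack").mkdir()
for i in range(10):
    (demo_root / "good" / f"{i:03d}.png").touch()
for i in range(3):
    (demo_root / "crack" / f"{i:03d}.png").touch()

samples = collect_samples(demo_root / "good", abnormal_dir=demo_root / "crack")
train_rows = [r for r in samples if r["split"] == "train"]
test_rows = [r for r in samples if r["split"] == "test"]
```

Filtering the rows on the split column then yields the subset for a requested split, which mirrors the role of the split argument described above.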