Padim

This is the implementation of the PaDiM paper.

Model Type: Segmentation

Description

PaDiM is a patch-based algorithm that relies on a pre-trained CNN feature extractor. The image is divided into patches, and an embedding is extracted for each patch from several layers of the feature extractor. The activation vectors from these layers are concatenated, so each embedding vector carries information from different semantic levels and resolutions, encoding both fine-grained and global context. Because the concatenated vectors may carry redundant information, their dimensionality is reduced by keeping a random subset of dimensions. For each patch position, a multivariate Gaussian distribution is then estimated from the embeddings at that position across all training images, so every patch position has its own distribution. Together, these distributions are stored as a matrix of Gaussian parameters.
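The per-patch Gaussian fit described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the function name `fit_patch_gaussians`, the `(N, D, P)` array layout and the regularisation constant `eps` are hypothetical choices made for the example. A small multiple of the identity is added to each covariance so that its inverse exists.

```python
import numpy as np


def fit_patch_gaussians(embeddings: np.ndarray, eps: float = 0.01):
    """Fit one multivariate Gaussian per patch position.

    embeddings: array of shape (N, D, P) holding N training images,
    D embedding dimensions and P = H * W patch positions.
    Returns (mean, inv_cov) with shapes (D, P) and (P, D, D).
    """
    n, d, _ = embeddings.shape
    mean = embeddings.mean(axis=0)                 # (D, P)
    centered = embeddings - mean                   # (N, D, P)
    # Sample covariance for every patch position at once.
    cov = np.einsum("ndp,nep->pde", centered, centered) / (n - 1)
    cov += eps * np.eye(d)                         # regularise so the inverse exists
    inv_cov = np.linalg.inv(cov)                   # batched inverse, (P, D, D)
    return mean, inv_cov


rng = np.random.default_rng(0)
train = rng.normal(size=(16, 5, 12))   # 16 images, 5 dims, 12 patch positions
mean, inv_cov = fit_patch_gaussians(train)
print(mean.shape, inv_cov.shape)
```

Storing the inverse covariance rather than the covariance itself means the inversion is paid once at training time instead of once per test image.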

During inference, the Mahalanobis distance is used to score each patch position of the test image, using the inverse of the covariance matrix estimated for that position during training. The matrix of Mahalanobis distances forms the anomaly map, with higher scores indicating anomalous regions.
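Written out, the score for the patch at position (i, j) is the Mahalanobis distance between its test embedding and the Gaussian learned for that position. The symbols follow the usual notation for this distance and are not identifiers from the code: x_ij is the test embedding, and μ_ij and Σ_ij are the mean and covariance estimated for that position during training.

```latex
M(x_{ij}) = \sqrt{(x_{ij} - \mu_{ij})^{T} \, \Sigma_{ij}^{-1} \, (x_{ij} - \mu_{ij})}
```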

Architecture

PaDiM Architecture

Usage

$ python tools/train.py --model padim

PyTorch model for the PaDiM model implementation.

class anomalib.models.padim.torch_model.PadimModel(input_size: tuple[int, int], layers: list[str], backbone: str = 'resnet18', pre_trained: bool = True, n_features: Optional[int] = None)

Bases: Module

Padim Module.

Parameters:
  • input_size (tuple[int, int]) – Input size for the model.

  • layers (list[str]) – Layers used for feature extraction

  • backbone (str, optional) – Pre-trained model backbone. Defaults to “resnet18”.

  • pre_trained (bool, optional) – Whether to use a pre-trained backbone. Defaults to True.

  • n_features (int, optional) – Number of features to retain in the dimension reduction step. Default values from the paper are available for: resnet18 (100), wide_resnet50_2 (550).

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input_tensor: Tensor) → Tensor

Forward-pass image-batch (N, C, H, W) into model to extract features.

Parameters:

input_tensor (Tensor) – Image-batch (N, C, H, W)

Returns:

Features from single/multiple layers.

Example

>>> input_tensor = torch.randn(32, 3, 224, 224)
>>> features = self.extract_features(input_tensor)
>>> features.keys()
dict_keys(['layer1', 'layer2', 'layer3'])
>>> [v.shape for v in features.values()]
[torch.Size([32, 64, 56, 56]),
torch.Size([32, 128, 28, 28]),
torch.Size([32, 256, 14, 14])]

generate_embedding(features: dict[str, torch.Tensor]) → Tensor

Generate embedding from hierarchical feature map.

Parameters:

features (dict[str, Tensor]) – Hierarchical feature map from a CNN (ResNet18 or WideResnet)

Returns:

Embedding vector
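The idea behind the embedding step can be sketched in NumPy: deeper feature maps are upsampled to the resolution of the shallowest one, concatenated along the channel axis, and reduced to a random subset of channels. This is an illustration only. The helper names, the nearest-neighbour upsampling and the seeded index selection are assumptions made for the example, not the library's code.

```python
import numpy as np


def upsample_nearest(fmap: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling of an (N, C, H, W) map by an integer factor."""
    return fmap.repeat(factor, axis=2).repeat(factor, axis=3)


def generate_embedding_sketch(features: dict, n_features: int, seed: int = 0) -> np.ndarray:
    """Concatenate hierarchical feature maps and keep a random subset of channels."""
    layers = list(features)                 # assumes the largest (shallowest) map comes first
    embedding = features[layers[0]]
    for name in layers[1:]:
        fmap = features[name]
        factor = embedding.shape[2] // fmap.shape[2]
        embedding = np.concatenate([embedding, upsample_nearest(fmap, factor)], axis=1)
    # Random dimensionality reduction; a fixed seed keeps the selected
    # channels identical between training and inference.
    rng = np.random.default_rng(seed)
    idx = rng.choice(embedding.shape[1], size=n_features, replace=False)
    return embedding[:, idx]


features = {
    "layer1": np.zeros((2, 64, 56, 56), dtype=np.float32),
    "layer2": np.zeros((2, 128, 28, 28), dtype=np.float32),
    "layer3": np.zeros((2, 256, 14, 14), dtype=np.float32),
}
embedding = generate_embedding_sketch(features, n_features=100)
print(embedding.shape)  # (2, 100, 56, 56)
```

With the ResNet18 shapes shown in the forward example, the concatenated map has 64 + 128 + 256 = 448 channels before the reduction to 100.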

PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization.

Paper https://arxiv.org/abs/2011.08785

class anomalib.models.padim.lightning_model.Padim(layers: list[str], input_size: tuple[int, int], backbone: str, pre_trained: bool = True, n_features: Optional[int] = None)

Bases: AnomalyModule

PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization.

Parameters:
  • layers (list[str]) – Layers to extract features from the backbone CNN

  • input_size (tuple[int, int]) – Size of the model input.

  • backbone (str) – Backbone CNN network

  • pre_trained (bool, optional) – Whether to use a pre-trained backbone. Defaults to True.

  • n_features (int, optional) – Number of features to retain in the dimension reduction step. Default values from the paper are available for: resnet18 (100), wide_resnet50_2 (550).

static configure_optimizers() → None

PaDiM doesn’t require optimization, therefore returns no optimizers.

on_validation_start() → None

Fit a Gaussian to the embedding collected from the training set.

training_step(batch: dict[str, str | torch.Tensor], *args, **kwargs) → None

Training step of PaDiM. For each batch, hierarchical features are extracted from the CNN.

Parameters:
  • batch (dict[str, str | Tensor]) – Batch containing image filename, image, label and mask

  • _batch_idx – Index of the batch.

Returns:

Hierarchical feature map

validation_step(batch: dict[str, str | torch.Tensor], *args, **kwargs) → Union[Tensor, Dict[str, Any]]

Validation step of PaDiM.

Similar to the training step, hierarchical features are extracted from the CNN for each batch.

Parameters:

batch (dict[str, str | Tensor]) – Input batch

Returns:

Dictionary containing images, features, true labels and masks. These are required in validation_epoch_end for feature concatenation.

class anomalib.models.padim.lightning_model.PadimLightning(hparams: omegaconf.dictconfig.DictConfig | omegaconf.listconfig.ListConfig)

Bases: Padim

PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization.

Parameters:

hparams (DictConfig | ListConfig) – Model params

Anomaly Map Generator for the PaDiM model implementation.

class anomalib.models.padim.anomaly_map.AnomalyMapGenerator(image_size: omegaconf.listconfig.ListConfig | tuple, sigma: int = 4)

Bases: Module

Generate Anomaly Heatmap.

Parameters:
  • image_size (ListConfig, tuple) – Size of the input image. The anomaly map is upsampled to this dimension.

  • sigma (int, optional) – Standard deviation for Gaussian Kernel. Defaults to 4.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

compute_anomaly_map(embedding: Tensor, mean: Tensor, inv_covariance: Tensor) → Tensor

Compute anomaly score.

Scores are calculated from the embedding vector and the mean and inverse covariance of the multivariate Gaussian distribution.

Parameters:
  • embedding (Tensor) – Embedding vector extracted from the test set.

  • mean (Tensor) – Mean of the multivariate Gaussian distribution.

  • inv_covariance (Tensor) – Inverse covariance matrix of the multivariate Gaussian distribution.

Returns:

Output anomaly score.

static compute_distance(embedding: Tensor, stats: list[torch.Tensor]) → Tensor

Compute the anomaly score of the patch at position (i, j) of a test image.

Ref: Equation (2), Section III-C of the paper.

Parameters:
  • embedding (Tensor) – Embedding Vector

  • stats (list[Tensor]) – Mean and Covariance Matrix of the multivariate Gaussian distribution

Returns:

Anomaly score of a test image, computed via the Mahalanobis distance.
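A vectorised version of this distance computation might look like the NumPy sketch below. The function name `mahalanobis_map` and the array layouts are assumptions chosen for the example. The sketch takes the inverse covariance directly, as compute_anomaly_map does, and is not a copy of the library code.

```python
import numpy as np


def mahalanobis_map(embedding: np.ndarray, mean: np.ndarray, inv_cov: np.ndarray) -> np.ndarray:
    """Mahalanobis distance per patch position.

    embedding: (B, D, H, W) test embeddings
    mean:      (D, H * W) per-position means
    inv_cov:   (H * W, D, D) per-position inverse covariances
    Returns an array of shape (B, 1, H, W).
    """
    b, d, h, w = embedding.shape
    delta = embedding.reshape(b, d, h * w) - mean          # (B, D, P)
    # Quadratic form delta^T * inv_cov * delta for every image and position.
    sq = np.einsum("bdp,pde,bep->bp", delta, inv_cov, delta)
    # Clip tiny negative values caused by rounding before taking the root.
    return np.sqrt(np.clip(sq, 0.0, None)).reshape(b, 1, h, w)


# With zero mean and identity inverse covariance the score is the Euclidean norm.
emb = np.full((1, 4, 2, 3), 0.5)
mean = np.zeros((4, 6))
inv_cov = np.tile(np.eye(4), (6, 1, 1))
scores = mahalanobis_map(emb, mean, inv_cov)
print(scores.shape, float(scores[0, 0, 0, 0]))  # (1, 1, 2, 3) 1.0
```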

forward(**kwargs) → Tensor

Returns anomaly_map.

Expects embedding, mean and covariance keywords to be passed explicitly.

Example

>>> anomaly_map_generator = AnomalyMapGenerator(image_size=input_size)
>>> output = anomaly_map_generator(embedding=embedding, mean=mean, covariance=covariance)

Raises:

ValueError – If the embedding, mean or covariance keys are not found.

Returns:

anomaly map

Return type:

torch.Tensor

smooth_anomaly_map(anomaly_map: Tensor) → Tensor

Apply Gaussian smoothing to the anomaly map.

Parameters:

anomaly_map (Tensor) – Anomaly score for the test image(s).

Returns:

Filtered anomaly scores
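For intuition, Gaussian smoothing of a single 2-D anomaly map can be sketched with a separable filter in NumPy. The helper names, the truncation of the kernel at about four standard deviations and the reflect padding are assumptions made for the example. The library's own kernel size and padding may differ.

```python
import numpy as np


def gaussian_kernel1d(sigma: float) -> np.ndarray:
    """Normalised 1-D Gaussian kernel truncated at about 4 standard deviations."""
    radius = int(4.0 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_map(anomaly_map: np.ndarray, sigma: float = 4.0) -> np.ndarray:
    """Blur a 2-D (H, W) anomaly map with a separable Gaussian filter."""
    kernel = gaussian_kernel1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(anomaly_map, radius, mode="reflect")
    # Filter along rows, then along columns; "valid" trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


spike = np.zeros((32, 32))
spike[16, 16] = 1.0
smoothed = smooth_map(spike, sigma=1.0)
print(smoothed.shape, smoothed.max() < 1.0)  # (32, 32) True
```

Smoothing spreads an isolated high score over its neighbourhood, which suppresses single-patch noise in the final map.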

up_sample(distance: Tensor) → Tensor

Upsample the anomaly score to match the input image size.

Parameters:

distance (Tensor) – Anomaly score computed via the Mahalanobis distance.

Returns:

Resized distance matrix matching the input image size
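The resizing step can be illustrated with a small bilinear interpolation routine in NumPy. This is a sketch under assumptions: the function names are hypothetical, the input is a single 2-D map, and the choice of bilinear interpolation with half-pixel centres is made for the example rather than taken from the library.

```python
import numpy as np


def _source_coords(out_size: int, in_size: int):
    """Source indices and interpolation weights for one axis (half-pixel centres)."""
    pos = (np.arange(out_size) + 0.5) * in_size / out_size - 0.5
    pos = np.clip(pos, 0.0, in_size - 1)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, in_size - 1)
    return lo, hi, pos - lo


def upsample_bilinear(distance: np.ndarray, size: tuple) -> np.ndarray:
    """Resize a 2-D (h, w) score map to size = (H, W) by bilinear interpolation."""
    r0, r1, wr = _source_coords(size[0], distance.shape[0])
    c0, c1, wc = _source_coords(size[1], distance.shape[1])
    # Interpolate along columns for the upper and lower source rows,
    # then blend the two results along rows.
    top = distance[r0][:, c0] * (1 - wc) + distance[r0][:, c1] * wc
    bottom = distance[r1][:, c0] * (1 - wc) + distance[r1][:, c1] * wc
    return top * (1 - wr)[:, None] + bottom * wr[:, None]


coarse = np.array([[0.0, 1.0], [2.0, 3.0]])
full = upsample_bilinear(coarse, (4, 4))
print(full.shape)  # (4, 4)
```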