Metrics

There are two ways of configuring metrics in the config file:

  1. a list of metric names, or

  2. a mapping of metric names to class path and init args.

Each subsection under the metrics section of the config file may use either style, but within a single subsection the style must be consistent.

Example of a metrics configuration section in the config file:
metrics:
   # imagewise metrics using the list of metric names style
   image:
      - F1Score
      - AUROC
   # pixelwise metrics using the mapping style
   pixel:
      F1Score:
         class_path: torchmetrics.F1Score
         init_args:
            compute_on_cpu: true
      AUROC:
         class_path: anomalib.utils.metrics.AUROC
         init_args:
            compute_on_cpu: true

List of metric names

A list of strings, each matching the name of a class in anomalib.utils.metrics or torchmetrics (searched in that order of priority). Each metric is instantiated with default arguments.

Mapping of metric names to class path and init args

A mapping of metric names (str) to a dictionary with two keys: “class_path” and “init_args”.

“class_path” is a string with the full path to a metric (from root package down to the class name, e.g.: “anomalib.utils.metrics.AUROC”).

“init_args” is a dictionary of arguments to be passed to the class constructor.
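
For illustration, the mapping style resolves to a metric instance roughly as follows. This is a minimal sketch of the mechanism, not anomalib's actual config loader, and the get_metric helper name is hypothetical:

import importlib
from typing import Any

def get_metric(class_path: str, init_args: dict[str, Any] | None = None):
    # Hypothetical helper: split "anomalib.utils.metrics.AUROC" into module
    # path and class name, import the module, and instantiate the class
    # with the given init_args.
    module_name, class_name = class_path.rsplit(".", 1)
    metric_cls = getattr(importlib.import_module(module_name), class_name)
    return metric_cls(**(init_args or {}))

# Equivalent to the AUROC entry in the pixel subsection above:
auroc = get_metric("anomalib.utils.metrics.AUROC", {"compute_on_cpu": True})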

Custom anomaly evaluation metrics.

class anomalib.utils.metrics.AUPR(num_classes: Optional[int] = None, pos_label: Optional[int] = None, task: Optional[Literal['binary', 'multiclass', 'multilabel']] = None, thresholds: Optional[Union[int, List[float], Tensor]] = None, num_labels: Optional[int] = None, ignore_index: Optional[int] = None, validate_args: bool = True, **kwargs: Any)

Bases: PrecisionRecallCurve

Area under the PR curve.

compute() → Tensor

First compute PR curve, then compute area under the curve.

Returns:

Value of the AUPR metric

generate_figure() → tuple[Figure, str]

Generate a figure containing the PR curve as well as the random baseline and the AUC.

Returns:

Tuple containing both the figure and the figure title to be used for logging

Return type:

tuple[Figure, str]

update(preds: Tensor, target: Tensor) → None

Update state with new values.

The new values need to be flattened, as PrecisionRecallCurve expects flat tensors for binary classification.

Parameters:
  • preds (Tensor) – predictions of the model

  • target (Tensor) – ground truth targets

preds: List[Tensor]
target: List[Tensor]
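
A minimal usage sketch with made-up scores and labels, assuming the default constructor arguments shown in the signature above:

import torch
from anomalib.utils.metrics import AUPR

aupr = AUPR()
preds = torch.tensor([0.1, 0.9, 0.8, 0.3])  # image-level anomaly scores
target = torch.tensor([0, 1, 1, 0])         # ground truth labels
aupr.update(preds, target)
print(aupr.compute())  # scalar tensor: area under the PR curve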
class anomalib.utils.metrics.AUPRO(compute_on_step: bool = True, dist_sync_on_step: bool = False, process_group: Any | None = None, dist_sync_fn: Callable | None = None, fpr_limit: float = 0.3)

Bases: Metric

Area under per region overlap (AUPRO) Metric.

compute() → Tensor

First compute the PRO curve, then compute and scale the area under the curve.

Returns:

Value of the AUPRO metric

Return type:

Tensor

generate_figure() → tuple[Figure, str]

Generate a figure containing the PRO curve and the AUPRO.

Returns:

Tuple containing both the figure and the figure title to be used for logging

Return type:

tuple[Figure, str]

static interp1d(old_x: Tensor, old_y: Tensor, new_x: Tensor) → Tensor

Linearly interpolate a 1-D signal at new sampling points.

Parameters:
  • old_x (Tensor) – original 1-D x values (same size as y)

  • old_y (Tensor) – original 1-D y values (same size as x)

  • new_x (Tensor) – x-values where y should be interpolated at

Returns:

y-values at corresponding new_x values.

Return type:

Tensor
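
For instance, interpolating halfway between known points (values chosen for illustration):

import torch
from anomalib.utils.metrics import AUPRO

old_x = torch.tensor([0.0, 1.0, 2.0])
old_y = torch.tensor([0.0, 10.0, 20.0])
new_x = torch.tensor([0.5, 1.5])
print(AUPRO.interp1d(old_x, old_y, new_x))  # expected: tensor([ 5., 15.])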

update(preds: Tensor, target: Tensor) → None

Update state with new values.

Parameters:
  • preds (Tensor) – predictions of the model

  • target (Tensor) – ground truth targets

full_state_update: bool = False
higher_is_better: bool | None = None
is_differentiable: bool = False
preds: list[Tensor]
target: list[Tensor]
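
A minimal usage sketch; the (N, H, W) shapes and the random maps and masks are assumptions made for illustration:

import torch
from anomalib.utils.metrics import AUPRO

aupro = AUPRO(fpr_limit=0.3)
anomaly_maps = torch.rand(4, 32, 32)         # per-pixel anomaly scores
masks = (torch.rand(4, 32, 32) > 0.8).int()  # binary ground truth masks
aupro.update(anomaly_maps, masks)
print(aupro.compute())  # area under the PRO curve, up to fpr_limit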
class anomalib.utils.metrics.AUROC(num_classes: Optional[int] = None, pos_label: Optional[int] = None, task: Optional[Literal['binary', 'multiclass', 'multilabel']] = None, thresholds: Optional[Union[int, List[float], Tensor]] = None, num_labels: Optional[int] = None, ignore_index: Optional[int] = None, validate_args: bool = True, **kwargs: Any)

Bases: ROC

Area under the ROC curve.

compute() → Tensor

First compute ROC curve, then compute area under the curve.

Returns:

Value of the AUROC metric

Return type:

Tensor

generate_figure() → tuple[Figure, str]

Generate a figure containing the ROC curve, the baseline and the AUROC.

Returns:

Tuple containing both the figure and the figure title to be used for logging

Return type:

tuple[Figure, str]

update(preds: Tensor, target: Tensor) → None

Update state with new values.

The new values need to be flattened, as ROC expects flat tensors for binary classification.

Parameters:
  • preds (Tensor) – predictions of the model

  • target (Tensor) – ground truth targets

preds: List[Tensor]
target: List[Tensor]
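
A minimal usage sketch with made-up scores and labels; generate_figure can then be used to log the curve:

import torch
from anomalib.utils.metrics import AUROC

auroc = AUROC()
auroc.update(torch.tensor([0.2, 0.7, 0.9, 0.4]), torch.tensor([0, 1, 1, 0]))
print(auroc.compute())                   # scalar tensor: area under the ROC curve
figure, title = auroc.generate_figure()  # matplotlib Figure and a title for logging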
class anomalib.utils.metrics.AnomalyScoreDistribution(**kwargs)

Bases: Metric

Mean and standard deviation of the anomaly scores of normal training data.

compute() → tuple[Tensor, Tensor, Tensor, Tensor]

Compute stats.

update(*args, anomaly_scores: Tensor | None = None, anomaly_maps: Tensor | None = None, **kwargs) → None

Update the internal state with new anomaly scores and/or anomaly maps.
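
A minimal usage sketch; random scores stand in for the anomaly scores of normal training images, and the interpretation of the four returned tensors as image- and pixel-level statistics is an assumption based on the class description:

import torch
from anomalib.utils.metrics import AnomalyScoreDistribution

dist = AnomalyScoreDistribution()
dist.update(anomaly_scores=torch.rand(16))  # image-level scores of normal data
stats = dist.compute()  # four tensors (image/pixel mean and std)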

class anomalib.utils.metrics.AnomalyScoreThreshold(num_classes: Optional[int] = None, pos_label: Optional[int] = None, task: Optional[Literal['binary', 'multiclass', 'multilabel']] = None, thresholds: Optional[Union[int, List[float], Tensor]] = None, num_labels: Optional[int] = None, ignore_index: Optional[int] = None, validate_args: bool = True, **kwargs: Any)

Bases: PrecisionRecallCurve

Anomaly Score Threshold.

This class computes/stores the threshold that determines the anomalous label given anomaly scores. If the threshold method is manual, the class only stores the manual threshold values.

If the threshold method is adaptive, the class initially computes the adaptive threshold to find the optimal f1_score and stores the computed adaptive threshold value.

compute() → Tensor

Compute the threshold that yields the optimal F1 score.

Compute the F1 scores while varying the threshold. Store the optimal threshold as attribute and return the maximum value of the F1 score.

Returns:

Value of the F1 score at the optimal threshold.

preds: List[Tensor]
target: List[Tensor]
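
A minimal usage sketch of the adaptive threshold computation, with made-up scores and labels:

import torch
from anomalib.utils.metrics import AnomalyScoreThreshold

threshold = AnomalyScoreThreshold()
threshold.update(torch.tensor([0.1, 0.4, 0.35, 0.8]), torch.tensor([0, 0, 1, 1]))
print(threshold.compute())  # sweeps thresholds to maximize F1; the optimal
                            # threshold itself is stored on the metric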
class anomalib.utils.metrics.MinMax(**kwargs)

Bases: Metric

Track the min and max values of the observations in each batch.

compute() → tuple[Tensor, Tensor]

Return min and max values.

update(predictions: Tensor, *args, **kwargs) → None

Update the min and max values.

full_state_update: bool = True
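
A minimal usage sketch; the observation values are made up:

import torch
from anomalib.utils.metrics import MinMax

minmax = MinMax()
minmax.update(torch.tensor([0.2, 0.9, 0.4]))
minmax.update(torch.tensor([0.1, 0.7]))
print(minmax.compute())  # (tensor(0.1000), tensor(0.9000)) across all batches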
class anomalib.utils.metrics.OptimalF1(num_classes: int, **kwargs)

Bases: Metric

Optimal F1 Metric.

Compute the optimal F1 score at the adaptive threshold, based on the F1 metric of the true labels and the predicted anomaly scores.

compute() → Tensor

Compute the value of the optimal F1 score.

Compute the F1 scores while varying the threshold. Store the optimal threshold as attribute and return the maximum value of the F1 score.

Returns:

Value of the F1 score at the optimal threshold.

reset() → None

Reset the metric.

update(preds: Tensor, target: Tensor, *args, **kwargs) → None

Update the precision-recall curve metric.

full_state_update: bool = False
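
A minimal usage sketch with made-up scores and labels (num_classes=1 for the binary anomaly setting):

import torch
from anomalib.utils.metrics import OptimalF1

optimal_f1 = OptimalF1(num_classes=1)
optimal_f1.update(torch.tensor([0.1, 0.4, 0.35, 0.8]), torch.tensor([0, 0, 1, 1]))
print(optimal_f1.compute())  # maximum F1 score over all candidate thresholds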
class anomalib.utils.metrics.PRO(threshold: float = 0.5, **kwargs)

Bases: Metric

Per-Region Overlap (PRO) Score.

compute() → Tensor

Compute the macro average of the PRO score across all regions in all batches.

update(predictions: Tensor, targets: Tensor) → None

Compute the PRO score for the current batch.

preds: list[Tensor]
target: list[Tensor]
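
A minimal usage sketch; the (N, H, W) shapes and the random maps and masks are assumptions made for illustration:

import torch
from anomalib.utils.metrics import PRO

pro = PRO(threshold=0.5)
anomaly_maps = torch.rand(2, 32, 32)         # per-pixel anomaly scores
masks = (torch.rand(2, 32, 32) > 0.8).int()  # binary ground truth masks
pro.update(anomaly_maps, masks)
print(pro.compute())  # macro average of the per-region overlap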