otx.api.usecases.evaluation#

Evaluation metrics.

Functions

intersection_box(box1, box2)

Calculate the intersection box of two bounding boxes.

intersection_over_union(box1, box2[, ...])

Calculate the Intersection over Union (IoU) of two bounding boxes.

precision_per_class(matrix)

Compute the precision per class based on the confusion matrix.

recall_per_class(matrix)

Compute the recall per class based on the confusion matrix.

get_intersections_and_cardinalities(...)

Returns all intersections and cardinalities between reference masks and prediction masks.

Classes

Accuracy(resultset[, average])

This class is responsible for providing Accuracy measures, mainly for classification problems.

MetricAverageMethod(value)

This defines the metrics averaging method.

DiceAverage(resultset[, average])

Computes the average Dice coefficient overall and for individual labels.

FMeasure(resultset[, ...])

Computes the f-measure (also known as F1-score) for a resultset.

MetricsHelper()

Contains metrics computation functions.

class otx.api.usecases.evaluation.Accuracy(resultset: ResultSetEntity, average: MetricAverageMethod = MetricAverageMethod.MICRO)[source]#

Bases: IPerformanceProvider

This class is responsible for providing Accuracy measures, mainly for classification problems.

The calculation supports both multi-label and binary-label predictions.

Accuracy is the proportion of correctly predicted labels to the total number of (predicted and actual) labels for that instance. The overall accuracy is the average across all instances.

Parameters:
  • resultset (ResultSetEntity) – ResultSet that score will be computed for

  • average (MetricAverageMethod, optional) – The averaging method, either MICRO or MACRO. MICRO: compute the average over all predictions in all label groups. MACRO: compute the accuracy per label group and return the average of the per-label-group accuracy scores.

get_performance() → Performance[source]#

Returns the performance with accuracy and confusion metrics.

property accuracy: ScoreMetric#

Returns the accuracy as ScoreMetric.
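
A minimal usage sketch, assuming an already populated ResultSetEntity (building one from ground-truth and prediction datasets is outside the scope of this module) and assuming the ResultSetEntity import path and the ScoreMetric.value attribute from the wider OTX API:

```python
from otx.api.entities.resultset import ResultSetEntity  # import path assumed, not documented on this page
from otx.api.usecases.evaluation import Accuracy, MetricAverageMethod


def report_accuracy(resultset: ResultSetEntity) -> float:
    """Return the MACRO-averaged accuracy of an already populated resultset."""
    metric = Accuracy(resultset, average=MetricAverageMethod.MACRO)
    performance = metric.get_performance()  # full Performance object (accuracy + confusion metrics)
    print(performance)                      # inspect or log the performance as needed
    return metric.accuracy.value            # overall accuracy as a plain float
```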

class otx.api.usecases.evaluation.DiceAverage(resultset: ResultSetEntity, average: MetricAverageMethod = MetricAverageMethod.MACRO)[source]#

Bases: IPerformanceProvider

Computes the average Dice coefficient overall and for individual labels.

See https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient for background information.

To compute the Dice coefficient, the shapes in the dataset items of the prediction and ground truth datasets are first converted to masks.

Dice is computed from the intersection and union aggregated over the whole dataset, rather than from per-image intersections and unions that are then averaged.

Parameters:
  • resultset (ResultSetEntity) – ResultSet that score will be computed for

  • average (MetricAverageMethod) – One of MICRO (every pixel has the same weight, regardless of label) or MACRO (compute the score per label and return the average of the per-label scores).

classmethod compute_dice_using_intersection_and_cardinality(all_intersection: Dict[LabelEntity | None, int], all_cardinality: Dict[LabelEntity | None, int], average: MetricAverageMethod) → Tuple[ScoreMetric, Dict[LabelEntity, ScoreMetric]][source]#

Computes dice score using intersection and cardinality dictionaries.

Both dictionaries must contain the same set of keys. Dice score is computed by: 2 * intersection / cardinality

Parameters:
  • average – Averaging method to use

  • all_intersection – collection of intersections per label

  • all_cardinality – collection of cardinality per label

Returns:

A tuple containing the overall Dice score and the per-label Dice scores

Raises:
  • KeyError – if the keys in intersection and cardinality do not match

  • KeyError – if the key None is not present in either all_intersection or all_cardinality

  • ValueError – if the intersection for a certain key is larger than its corresponding cardinality
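
A minimal arithmetic sketch of the 2 * intersection / cardinality rule (illustrative counts only; the real keys are LabelEntity objects plus the None key for the overall entry, for which strings stand in here):

```python
# Illustrative pixel counts; None is assumed to hold the overall (all-labels) entry,
# and the string key stands in for a LabelEntity.
intersection = {None: 130, "car": 80}   # pixels where reference and prediction agree
cardinality = {None: 330, "car": 190}   # reference pixels plus prediction pixels

# Dice per key, following the 2 * intersection / cardinality rule above.
dice = {key: 2 * intersection[key] / cardinality[key] for key in cardinality}
# dice[None] ~ 0.79 (overall), dice["car"] ~ 0.84
```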

get_performance() → Performance[source]#

Returns the performance of the resultset.

property dice_per_label: Dict[LabelEntity, ScoreMetric]#

Returns a dictionary mapping the label to its corresponding dice score (as ScoreMetric).

property overall_dice: ScoreMetric#

Returns the dice average as ScoreMetric.

class otx.api.usecases.evaluation.FMeasure(resultset: ResultSetEntity, vary_confidence_threshold: bool = False, vary_nms_threshold: bool = False, cross_class_nms: bool = False)[source]#

Bases: IPerformanceProvider

Computes the f-measure (also known as F1-score) for a resultset.

The f-measure is typically used in detection (localization) tasks to obtain a single number that balances precision and recall.

To determine whether a predicted box matches a ground truth box, an overlap measure based on a minimum intersection-over-union (IoU) is used; by default an IoU threshold of 0.5 is applied.

In addition, spurious results are eliminated by applying non-maximum suppression (NMS), so that two predicted boxes with IoU > threshold are reduced to one. This threshold can be determined automatically by setting vary_nms_threshold to True.

Parameters:
  • resultset (ResultSetEntity) – ResultSet entity used for calculating the F-Measure

  • vary_confidence_threshold (bool) – if True, the maximal F-measure is determined by optimizing over different confidence threshold values. Defaults to False.

  • vary_nms_threshold (bool) – if True the maximal F-measure is determined by optimizing for different NMS threshold values. Defaults to False.

  • cross_class_nms (bool) – Whether non-max suppression should be applied cross-class. If True this will eliminate boxes with sufficient overlap even if they are from different classes. Defaults to False.

Raises:

ValueError – if prediction dataset and ground truth dataset are empty

get_performance() → MultiScorePerformance[source]#

Returns the performance which consists of the F-Measure score and the dashboard metrics.

Returns:

MultiScorePerformance object containing the F-Measure scores and the dashboard metrics.

Return type:

MultiScorePerformance

property best_confidence_threshold: ScoreMetric | None#

Returns the best confidence threshold as a ScoreMetric, if it exists.

property best_nms_threshold: ScoreMetric | None#

Returns the best NMS threshold as a ScoreMetric, if it exists.

property f_measure: ScoreMetric#

Returns the f-measure as ScoreMetric.

property f_measure_per_confidence: CurveMetric | None#

Returns the curve of f-measure per confidence threshold as a CurveMetric, if it exists.

property f_measure_per_label: Dict[LabelEntity, ScoreMetric]#

Returns the f-measure per label as dictionary (Label -> ScoreMetric).

property f_measure_per_nms: CurveMetric | None#

Returns the curve of f-measure per NMS threshold as a CurveMetric, if it exists.
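
A minimal usage sketch for this class, assuming a populated ResultSetEntity from a detection task and the ResultSetEntity import path (not documented on this page):

```python
from otx.api.entities.resultset import ResultSetEntity  # import path assumed
from otx.api.usecases.evaluation import FMeasure


def report_f_measure(resultset: ResultSetEntity) -> None:
    """Print the F-measure and, when confidence-threshold optimization ran, the best threshold."""
    metric = FMeasure(resultset, vary_confidence_threshold=True)
    print("f-measure:", metric.f_measure.value)
    if metric.best_confidence_threshold is not None:
        print("best confidence threshold:", metric.best_confidence_threshold.value)
```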

class otx.api.usecases.evaluation.MetricAverageMethod(value)[source]#

Bases: Enum

This defines the metrics averaging method.

class otx.api.usecases.evaluation.MetricsHelper[source]#

Bases: object

Contains metrics computation functions.

TODO: subject for refactoring.

static compute_accuracy(resultset: ResultSetEntity, average: MetricAverageMethod = MetricAverageMethod.MICRO) → Accuracy[source]#

Compute the Accuracy on a resultset, averaged over the different label groups.

Parameters:
  • resultset – The resultset used to compute the accuracy

  • average – The averaging method, either MICRO or MACRO

Returns:

Accuracy object

static compute_anomaly_detection_scores(resultset: ResultSetEntity) → AnomalyDetectionScores[source]#

Compute the anomaly localization performance metrics on an anomaly detection resultset.

Parameters:

resultset – The resultset used to compute the metrics

Returns:

AnomalyDetectionScores object

static compute_anomaly_segmentation_scores(resultset: ResultSetEntity) → AnomalySegmentationScores[source]#

Compute the anomaly localization performance metrics on an anomaly segmentation resultset.

Parameters:

resultset – The resultset used to compute the metrics

Returns:

AnomalySegmentationScores object

static compute_dice_averaged_over_pixels(resultset: ResultSetEntity, average: MetricAverageMethod = MetricAverageMethod.MACRO) → DiceAverage[source]#

Compute the Dice average on a resultset, averaged over the pixels.

Parameters:
  • resultset – The resultset used to compute the Dice average

  • average – The averaging method, either MICRO or MACRO

Returns:

DiceAverage object

static compute_f_measure(resultset: ResultSetEntity, vary_confidence_threshold: bool = False, vary_nms_threshold: bool = False, cross_class_nms: bool = False) → FMeasure[source]#

Compute the F-Measure on a resultset given some parameters.

Parameters:
  • resultset – The resultset used to compute f-measure

  • vary_confidence_threshold – Flag specifying whether f-measure shall be computed for different confidence threshold values

  • vary_nms_threshold – Flag specifying whether f-measure shall be computed for different NMS threshold values

  • cross_class_nms – Whether non-max suppression should be applied cross-class

Returns:

FMeasure object
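
A minimal sketch of the helper methods above, under the same assumptions as the earlier snippets (populated resultsets, assumed ResultSetEntity import path):

```python
from otx.api.entities.resultset import ResultSetEntity  # import path assumed
from otx.api.usecases.evaluation import MetricAverageMethod, MetricsHelper


def summarize(segmentation_resultset: ResultSetEntity, detection_resultset: ResultSetEntity) -> None:
    """Compute a MACRO-averaged Dice score and an F-measure through MetricsHelper."""
    dice = MetricsHelper.compute_dice_averaged_over_pixels(
        segmentation_resultset, average=MetricAverageMethod.MACRO
    )
    print("overall dice:", dice.overall_dice.value)

    f_measure = MetricsHelper.compute_f_measure(detection_resultset, vary_confidence_threshold=True)
    print("f-measure:", f_measure.f_measure.value)
```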

otx.api.usecases.evaluation.get_intersections_and_cardinalities(references: List[ndarray], predictions: List[ndarray], labels: List[LabelEntity]) → Tuple[Dict[LabelEntity | None, int], Dict[LabelEntity | None, int]][source]#

Returns all intersections and cardinalities between reference masks and prediction masks.

Intersections and cardinalities are each returned in a dictionary that maps each label to its corresponding number of intersection/cardinality pixels.

Parameters:
  • references (List[np.ndarray]) – reference masks, one mask per image

  • predictions (List[np.ndarray]) – prediction masks, one mask per image

  • labels (List[LabelEntity]) – labels in input masks

Returns:

(all_intersections, all_cardinalities)

Return type:

Tuple[NumberPerLabel, NumberPerLabel]

otx.api.usecases.evaluation.intersection_box(box1: Rectangle, box2: Rectangle) → List[float] | None[source]#

Calculate the intersection box of two bounding boxes.

Parameters:
  • box1 – a Rectangle that represents the first bounding box

  • box2 – a Rectangle that represents the second bounding box

Returns:

the coordinates of the intersection box as a list of floats, if the inputs have a valid intersection; otherwise None

otx.api.usecases.evaluation.intersection_over_union(box1: Rectangle, box2: Rectangle, intersection: List[float] | None = None) → float[source]#

Calculate the Intersection over Union (IoU) of two bounding boxes.

Parameters:
  • box1 – a Rectangle representing a bounding box

  • box2 – a Rectangle representing a second bounding box

  • intersection – precomputed intersection of the two boxes (see the intersection_box function), if it exists.

Returns:

intersection-over-union of box1 and box2
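
A minimal sketch of the two functions together. The Rectangle import path and its normalized x1/y1/x2/y2 constructor are assumptions based on the wider OTX API, not stated on this page:

```python
from otx.api.entities.shapes.rectangle import Rectangle  # import path and constructor assumed
from otx.api.usecases.evaluation import intersection_box, intersection_over_union

# Two overlapping boxes in normalized [0, 1] coordinates (assumed convention).
box1 = Rectangle(x1=0.1, y1=0.1, x2=0.5, y2=0.5)
box2 = Rectangle(x1=0.3, y1=0.3, x2=0.7, y2=0.7)

inter = intersection_box(box1, box2)              # intersection coordinates, or None if disjoint
iou = intersection_over_union(box1, box2, inter)  # reuses the precomputed intersection
print(iou)  # ~0.143 for these boxes (0.04 intersection area / 0.28 union area)
```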

otx.api.usecases.evaluation.precision_per_class(matrix: ndarray) → ndarray[source]#

Compute the precision per class based on the confusion matrix.

Parameters:

matrix – the computed confusion matrix

Returns:

the precision (per class), defined as TP/(TP+FP)

otx.api.usecases.evaluation.recall_per_class(matrix: ndarray) → ndarray[source]#

Compute the recall per class based on the confusion matrix.

Parameters:

matrix – the computed confusion matrix

Returns:

the recall (per class), defined as TP/(TP+FN)
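
A worked sketch of the two per-class formulas using NumPy directly (not the library's internal implementation). The orientation of the confusion matrix (rows = ground truth, columns = predictions) is an assumption, since this page does not state it:

```python
import numpy as np

# Illustrative 2-class confusion matrix; rows = ground truth, columns = predictions (assumed).
matrix = np.array([[5, 1],
                   [2, 4]])

tp = np.diag(matrix)                  # true positives per class
precision = tp / matrix.sum(axis=0)   # TP / (TP + FP) per class -> [0.714..., 0.8]
recall = tp / matrix.sum(axis=1)      # TP / (TP + FN) per class -> [0.833..., 0.666...]
print(precision, recall)
```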