otx.algo.detection.losses
Custom OTX Losses for Object Detection.
Classes
| ATSSCriterion | ATSSCriterion is a loss criterion used in the Adaptive Training Sample Selection (ATSS) algorithm. |
| DetrCriterion | This class computes the loss for DETR. |
| RTMDetCriterion | RTMDetCriterion is a criterion module for RTM-based object detection. |
| SSDCriterion | SSDCriterion is a loss criterion for Single Shot MultiBox Detector (SSD). |
| YOLOv9Criterion | YOLOv9 criterion module. |
| YOLOXCriterion | YOLOX criterion module. |
| DFINECriterion | D-Fine criterion with FGL and DDF losses. |
- class otx.algo.detection.losses.ATSSCriterion(num_classes: int, bbox_coder: Module, loss_cls: Module, loss_bbox: Module, loss_centerness: Module | None = None, use_qfl: bool = False, qfl_cfg: dict | None = None, reg_decoded_bbox: bool = True, bg_loss_weight: float = -1.0)
Bases:
Module
ATSSCriterion is a loss criterion used in the Adaptive Training Sample Selection (ATSS) algorithm.
- Parameters:
num_classes (int) – The number of object classes.
bbox_coder (nn.Module) – The module used for encoding and decoding bounding box coordinates.
loss_cls (nn.Module) – The module used for calculating the classification loss.
loss_bbox (nn.Module) – The module used for calculating the bounding box regression loss.
loss_centerness (nn.Module | None, optional) – The module used for calculating the centerness loss. Defaults to None.
use_qfl (bool, optional) – Whether to use the Quality Focal Loss (QFL). Defaults to False.
qfl_cfg (dict | None, optional) – Configuration for the Quality Focal Loss. Defaults to None.
reg_decoded_bbox (bool, optional) – Whether to use the decoded bounding box coordinates for regression loss calculation. Defaults to True.
bg_loss_weight (float, optional) – The weight for the background loss. Defaults to -1.0.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- centerness_target(anchors: Tensor, gts: Tensor) -> Tensor
Calculate the centerness between anchors and gts.
Only calculate pos centerness targets, otherwise there may be nan.
- Parameters:
anchors (Tensor) – Anchors with shape (N, 4), “xyxy” format.
gts (Tensor) – Ground truth bboxes with shape (N, 4), “xyxy” format.
- Returns:
Centerness between anchors and gts.
- Return type:
Tensor
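As an illustration of what a centerness target measures, here is a minimal pure-Python sketch for a single anchor/box pair. It assumes the standard FCOS/ATSS formulation (distances from the anchor center to the four box sides); the helper name `centerness_sketch` is hypothetical, and the actual OTX method operates on batched tensors and may differ in detail.

```python
import math


def centerness_sketch(anchor, gt):
    """Centerness of one positive anchor w.r.t. its matched box (both xyxy).

    Hypothetical helper following the usual FCOS/ATSS definition.
    """
    # Anchor center.
    cx = (anchor[0] + anchor[2]) / 2
    cy = (anchor[1] + anchor[3]) / 2
    # Distances from the center to the left/right/top/bottom box sides.
    left, right = cx - gt[0], gt[2] - cx
    top, bottom = cy - gt[1], gt[3] - cy
    lr = min(left, right) / max(left, right)
    tb = min(top, bottom) / max(top, bottom)
    return math.sqrt(lr * tb)


print(centerness_sketch((0, 0, 10, 10), (0, 0, 10, 10)))  # centered anchor
print(centerness_sketch((0, 0, 4, 4), (0, 0, 10, 10)))    # off-center anchor
```

For an anchor whose center falls outside the box, some of these distances become negative and the value under the square root can go negative as well, which is consistent with the note that only positive samples should be evaluated.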
- forward(anchors: Tensor, cls_score: Tensor, bbox_pred: Tensor, centerness: Tensor, labels: Tensor, label_weights: Tensor, bbox_targets: Tensor, valid_label_mask: Tensor, avg_factor: float) -> dict[str, Tensor]
Compute loss of a single scale level.
- Parameters:
anchors (Tensor) – Box reference for scale levels with shape (N, num_total_anchors, 4).
cls_score (Tensor) – Box scores for scale levels have shape (N, num_anchors * num_classes, H, W).
bbox_pred (Tensor) – Box energies / deltas for scale levels with shape (N, num_anchors * 4, H, W).
centerness (Tensor) – Centerness scores for each scale level.
labels (Tensor) – Labels of anchors with shape (N, num_total_anchors).
label_weights (Tensor) – Label weights of anchors with shape (N, num_total_anchors).
bbox_targets (Tensor) – BBox regression targets of anchors with shape (N, num_total_anchors, 4).
valid_label_mask (Tensor) – Label mask for consideration of ignored label with shape (N, num_total_anchors, 1).
avg_factor (float) – Average factor that is used to average the loss. When using sampling method, avg_factor is usually the sum of positive and negative priors. When using PseudoSampler, avg_factor is usually equal to the number of positive priors.
- Returns:
A dictionary of loss components.
- Return type:
dict[str, Tensor]
- class otx.algo.detection.losses.DFINECriterion(weight_dict: dict[str, int | float], alpha: float = 0.2, gamma: float = 2.0, num_classes: int = 80, reg_max: int = 32)
Bases:
Module
D-Fine criterion with FGL and DDF losses.
TODO(Eugene): Consider merge with RTDETRCriterion in the next PR.
The process happens in two steps:
1) we compute the Hungarian assignment between ground-truth boxes and the outputs of the model
2) we supervise each pair of matched ground truth / prediction (supervise class and box)
- Parameters:
weight_dict (dict[str, int | float]) – A dictionary containing the weights for different loss components.
alpha (float, optional) – The alpha parameter for the loss calculation. Defaults to 0.2.
gamma (float, optional) – The gamma parameter for the loss calculation. Defaults to 2.0.
num_classes (int, optional) – The number of classes. Defaults to 80.
reg_max (int, optional) – The maximum number of bin targets. Defaults to 32.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- static fgl_loss(preds: Tensor, targets: Tensor, weight_right: Tensor, weight_left: Tensor, iou_weight: Tensor | None = None, reduction: str = 'sum', avg_factor: float | None = None) -> Tensor
Fine-Grained Localization (FGL) Loss.
- Parameters:
preds (Tensor) – predicted distances
targets (Tensor) – target distances
weight_right (Tensor) – weight for right distance
weight_left (Tensor) – weight for left distance
iou_weight (Tensor, optional) – IoU weight. Defaults to None.
reduction (str, optional) – reduction method. Defaults to “sum”.
avg_factor (float, optional) – average factor. Defaults to None.
- Returns:
FGL loss
- Return type:
Tensor
- forward(outputs: dict[str, Tensor], targets: list[dict[str, Tensor]]) -> dict[str, Tensor]
This performs the loss computation.
- Parameters:
- Returns:
dict of losses
- Return type:
dict[str, Tensor]
- static get_cdn_matched_indices(dn_meta: dict[str, list[Tensor]], targets: list[dict[str, Tensor]]) -> list[tuple[Tensor, Tensor]]
get_cdn_matched_indices.
- loss_boxes(outputs: dict[str, Tensor], targets: list[dict[str, Tensor]], indices: list[tuple[int, int]], num_boxes: int) -> dict[str, Tensor]
Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss.
Target dicts must contain the key “boxes”, holding a tensor of shape [nb_target_boxes, 4]. The target boxes are expected in (center_x, center_y, w, h) format, normalized by the image size.
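To make the box format and the two loss terms concrete, the following pure-Python sketch computes an L1 distance and a GIoU loss for one matched pair of normalized (center_x, center_y, w, h) boxes. The helper names are hypothetical and the code works on plain tuples; the real method works on batched tensors and applies its own reduction and normalization by num_boxes.

```python
def cxcywh_to_xyxy(box):
    """Convert (center_x, center_y, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def area(b):
    """Area of an xyxy box; degenerate (empty) boxes have area 0."""
    return max(b[2] - b[0], 0.0) * max(b[3] - b[1], 0.0)


def l1_and_giou_loss(pred, target):
    """Return (L1 distance, 1 - GIoU) for one pair of cxcywh boxes."""
    l1 = sum(abs(p - t) for p, t in zip(pred, target))
    a, b = cxcywh_to_xyxy(pred), cxcywh_to_xyxy(target)
    inter = area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    # Smallest box enclosing both inputs.
    hull = area((min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3])))
    giou = inter / union - (hull - union) / hull
    return l1, 1.0 - giou


print(l1_and_giou_loss((0.5, 0.5, 0.2, 0.2), (0.9, 0.5, 0.2, 0.2)))
```

Unlike plain IoU, the GIoU term still varies for disjoint boxes, so it gives a useful signal even when a prediction does not overlap its target.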
- loss_labels_vfl(outputs: dict[str, Tensor], targets: list[dict[str, Tensor]], indices: list[tuple[int, int]], num_boxes: int) -> dict[str, Tensor]
Varifocal Loss (VFL) for label prediction.
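For orientation, here is a scalar sketch of the Varifocal Loss as defined in the VarifocalNet paper, using the alpha and gamma defaults documented for this class. It is an assumption that the OTX method follows the same formula; its tensorized implementation may weight or reduce the terms differently. The function name and arguments are hypothetical.

```python
import math


def varifocal_loss_sketch(logit, q, alpha=0.2, gamma=2.0):
    """VFL for one class score.

    q is the IoU-aware target: the IoU with the matched box for the
    ground-truth class of a positive sample, and 0 otherwise.
    """
    p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
    if q > 0:
        # Positive: binary cross-entropy against q, weighted by q itself.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negative: focal-style down-weighting of easy negatives.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


print(varifocal_loss_sketch(0.0, 1.0))  # positive with IoU 1
print(varifocal_loss_sketch(0.0, 0.0))  # negative
```

The asymmetry is the point of the loss: only negatives are down-weighted, while high-IoU positives keep a large weight.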
- loss_local(outputs: dict[str, Tensor], targets: list[dict[str, Tensor]], indices: list[tuple[int, int]], num_boxes: int, temperature: int = 5) -> dict[str, Tensor]
Compute Fine-Grained Localization (FGL) Loss and Decoupled Distillation Focal (DDF) Loss.
- Parameters:
- Returns:
FGL and DDF losses.
- Return type:
dict[str, Tensor]
- class otx.algo.detection.losses.DetrCriterion(weight_dict: dict[str, int | float], alpha: float = 0.2, gamma: float = 2.0, num_classes: int = 80)
Bases:
Module
This class computes the loss for DETR.
- The process happens in two steps:
we compute hungarian assignment between ground truth boxes and the outputs of the model
we supervise each pair of matched ground-truth / prediction (supervise class and box)
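The first step can be illustrated with a tiny brute-force matcher. This is only a sketch of the assignment problem on a hypothetical cost matrix: production code typically uses a dedicated Hungarian solver rather than enumerating permutations, and the cost in a DETR-style matcher usually mixes classification and box terms. The function name is an assumption made for illustration.

```python
from itertools import permutations


def match_bruteforce(cost):
    """Minimum-cost one-to-one matching.

    cost[i][j] is the cost of assigning prediction i to ground-truth box j.
    Assumes there are at least as many predictions as ground-truth boxes.
    Returns (prediction_index, target_index) pairs sorted by prediction.
    """
    num_preds, num_targets = len(cost), len(cost[0])
    best = min(
        permutations(range(num_preds), num_targets),
        key=lambda preds: sum(cost[p][t] for t, p in enumerate(preds)),
    )
    return sorted((p, t) for t, p in enumerate(best))


# Three predictions, two ground-truth boxes (illustrative costs).
cost = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
pairs = match_bruteforce(cost)
print(pairs)
```

In the second step, only the matched pairs contribute to the box losses; the unmatched prediction (index 2 here) is treated as background by the classification loss.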
- Parameters:
weight_dict (dict[str, int | float]) – A dictionary containing the weights for different loss components.
alpha (float, optional) – The alpha parameter for the loss calculation. Defaults to 0.2.
gamma (float, optional) – The gamma parameter for the loss calculation. Defaults to 2.0.
num_classes (int, optional) – The number of classes. Defaults to 80.
Create the criterion.
- forward(outputs: dict[str, Tensor], targets: list[dict[str, Tensor]]) -> dict[str, Tensor]
This performs the loss computation.
- static get_cdn_matched_indices(dn_meta: dict[str, list[Tensor]], targets: list[dict[str, Tensor]]) -> list[tuple[Tensor, Tensor]]
get_cdn_matched_indices.
- loss_boxes(outputs: dict[str, Tensor], targets: list[dict[str, Tensor]], indices: list[tuple[int, int]], num_boxes: int) -> dict[str, Tensor]
Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss.
Target dicts must contain the key “boxes”, holding a tensor of shape [nb_target_boxes, 4]. The target boxes are expected in (center_x, center_y, w, h) format, normalized by the image size.
- class otx.algo.detection.losses.RTMDetCriterion(num_classes: int, loss_cls: Module, loss_bbox: Module)
Bases:
Module
RTMDetCriterion is a criterion module for RTM-based object detection.
- Parameters:
num_classes (int) – Number of object classes.
loss_cls (nn.Module) – Classification loss module.
loss_bbox (nn.Module) – Bounding box regression loss module.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(cls_score: Tensor, bbox_pred: Tensor, labels: Tensor, label_weights: Tensor, bbox_targets: Tensor, assign_metrics: Tensor, stride: list[int], **kwargs) -> dict[str, Tensor]
Compute loss of a single scale level.
- Parameters:
cls_score (Tensor) – Box scores for scale levels have shape (N, num_anchors * num_classes, H, W).
bbox_pred (Tensor) – Decoded bboxes for scale levels with shape (N, num_anchors * 4, H, W).
labels (Tensor) – Labels of anchors with shape (N, num_total_anchors).
label_weights (Tensor) – Label weights of anchors with shape (N, num_total_anchors).
bbox_targets (Tensor) – BBox regression targets of anchors with shape (N, num_total_anchors, 4).
assign_metrics (Tensor) – Assign metrics with shape (N, num_total_anchors).
- Returns:
A dictionary of loss components.
- Return type:
dict[str, Tensor]
- class otx.algo.detection.losses.SSDCriterion(num_classes: int, bbox_coder: Module | None = None, neg_pos_ratio: int = 3, reg_decoded_bbox: bool = False, smoothl1_beta: float = 1.0)
Bases:
Module
SSDCriterion is a loss criterion for Single Shot MultiBox Detector (SSD).
- Parameters:
num_classes (int) – Number of classes including the background class.
bbox_coder (nn.Module) – Bounding box coder module. Defaults to None.
neg_pos_ratio (int, optional) – Ratio of negative to positive samples. Defaults to 3.
reg_decoded_bbox (bool) – If true, the regression loss would be applied directly on decoded bounding boxes, converting both the predicted boxes and regression targets to absolute coordinates format. Defaults to False. It should be True when using IoULoss, GIoULoss, or DIoULoss in the bbox head.
smoothl1_beta (float, optional) – Beta parameter for the smooth L1 loss. Defaults to 1.0.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
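The neg_pos_ratio parameter is typically used for hard negative mining: only the highest-loss negatives are kept, up to neg_pos_ratio times the number of positives. The sketch below shows that selection on plain Python lists. It assumes the usual SSD scheme, and the function name and the is_positive flags are hypothetical stand-ins for the label tensors the real criterion receives.

```python
def select_for_cls_loss(losses, is_positive, neg_pos_ratio=3):
    """Pick anchors that contribute to the classification loss.

    losses: per-anchor classification loss values.
    is_positive: per-anchor flags, True for anchors matched to an object.
    Returns (positive_indices, kept_negative_indices).
    """
    positives = [i for i, pos in enumerate(is_positive) if pos]
    # Negatives ordered from hardest (largest loss) to easiest.
    negatives = sorted(
        (i for i, pos in enumerate(is_positive) if not pos),
        key=lambda i: losses[i],
        reverse=True,
    )
    keep_neg = negatives[: neg_pos_ratio * len(positives)]
    return positives, keep_neg


losses = [0.3, 2.0, 0.1, 1.5, 0.9, 0.05]
is_positive = [True, False, False, False, False, False]
print(select_for_cls_loss(losses, is_positive))
```

Capping the negatives this way keeps the overwhelming number of background anchors from dominating the classification loss.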
- forward(cls_score: Tensor, bbox_pred: Tensor, anchor: Tensor, labels: Tensor, label_weights: Tensor, bbox_targets: Tensor, bbox_weights: Tensor, avg_factor: int) -> dict[str, Tensor]
Compute losses of images.
- Parameters:
cls_score (Tensor) – Box scores for images have shape (N, num_total_anchors, num_classes).
bbox_pred (Tensor) – Box energies / deltas for image levels with shape (N, num_total_anchors, 4).
anchor (Tensor) – Box reference for scale levels with shape (N, num_total_anchors, 4).
labels (Tensor) – Labels of anchors with shape (N, num_total_anchors).
label_weights (Tensor) – Label weights of anchors with shape (N, num_total_anchors).
bbox_targets (Tensor) – BBox regression targets of anchors with shape (N, num_total_anchors, 4).
bbox_weights (Tensor) – BBox regression loss weights of anchors with shape (N, num_total_anchors, 4).
avg_factor (int) – Average factor that is used to average the loss. When using sampling method, avg_factor is usually the sum of positive and negative priors. When using PseudoSampler, avg_factor is usually equal to the number of positive priors.
- Returns:
A dictionary of loss components. The dict has the following components:
loss_cls (list[Tensor]): A list containing each feature map classification loss.
loss_bbox (list[Tensor]): A list containing each feature map regression loss.
- Return type:
dict[str, Tensor]
- class otx.algo.detection.losses.YOLOXCriterion(num_classes: int, loss_cls: Module | None = None, loss_bbox: Module | None = None, loss_obj: Module | None = None, loss_l1: Module | None = None, use_l1: bool = False)
Bases:
Module
YOLOX criterion module.
This module calculates the loss for YOLOX object detection model.
- Parameters:
num_classes (int) – The number of classes.
loss_cls (nn.Module | None) – The classification loss module. Defaults to None.
loss_bbox (nn.Module | None) – The bounding box regression loss module. Defaults to None.
loss_obj (nn.Module | None) – The objectness loss module. Defaults to None.
loss_l1 (nn.Module | None) – The L1 loss module. Defaults to None.
use_l1 (bool) – Whether to use the L1 loss. Defaults to False.
- Returns:
A dictionary containing the calculated losses.
- Return type:
dict[str, Tensor]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(flatten_objectness: Tensor, flatten_cls_preds: Tensor, flatten_bbox_preds: Tensor, flatten_bboxes: Tensor, obj_targets: Tensor, cls_targets: Tensor, bbox_targets: Tensor, l1_targets: Tensor, num_total_samples: Tensor, num_pos: Tensor, pos_masks: Tensor) -> dict[str, Tensor]
Forward pass of the YOLOX criterion module.
- Parameters:
flatten_objectness (Tensor) – Flattened objectness predictions.
flatten_cls_preds (Tensor) – Flattened class predictions.
flatten_bbox_preds (Tensor) – Flattened bounding box predictions.
flatten_bboxes (Tensor) – Flattened ground truth bounding boxes.
obj_targets (Tensor) – Objectness targets.
cls_targets (Tensor) – Class targets.
bbox_targets (Tensor) – Bounding box targets.
l1_targets (Tensor) – L1 targets.
num_total_samples (Tensor) – Total number of samples.
num_pos (Tensor) – Number of positive samples.
pos_masks (Tensor) – Positive masks.
- Returns:
A dictionary containing the calculated losses.
- Return type:
dict[str, Tensor]
- class otx.algo.detection.losses.YOLOv9Criterion(num_classes: int, vec2box: Vec2Box, loss_cls: Module | None = None, loss_dfl: Module | None = None, loss_iou: Module | None = None, reg_max: int = 16, cls_rate: float = 0.5, dfl_rate: float = 1.5, iou_rate: float = 7.5, aux_rate: float = 0.25)
Bases:
Module
YOLOv9 criterion module.
This module calculates the loss for YOLOv9 object detection model.
- Parameters:
num_classes (int) – The number of classes.
vec2box (Vec2Box) – The Vec2Box object.
loss_cls (nn.Module | None) – The classification loss module. Defaults to None.
loss_dfl (nn.Module | None) – The DFLoss loss module. Defaults to None.
loss_iou (nn.Module | None) – The IoULoss loss module. Defaults to None.
reg_max (int, optional) – Maximum number of anchor regions. Defaults to 16.
cls_rate (float, optional) – The classification loss rate. Defaults to 0.5.
dfl_rate (float, optional) – The DFLoss loss rate. Defaults to 1.5.
iou_rate (float, optional) – The IoU loss rate. Defaults to 7.5.
aux_rate (float, optional) – The auxiliary loss rate. Defaults to 0.25.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
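One plausible reading of the rate parameters is that they act as multiplicative weights on the individual loss terms, with aux_rate scaling the contribution of an auxiliary head. The sketch below illustrates that reading with the default values from the signature. The function, the dict keys, and the way the auxiliary term is combined are assumptions made for illustration, not the confirmed OTX behavior.

```python
def combine_yolov9_losses(main, aux=None, cls_rate=0.5, dfl_rate=1.5,
                          iou_rate=7.5, aux_rate=0.25):
    """Weighted sum of loss terms (hypothetical combination scheme).

    main / aux: dicts with scalar "cls", "dfl" and "iou" loss values.
    """
    def weighted(losses):
        return (cls_rate * losses["cls"]
                + dfl_rate * losses["dfl"]
                + iou_rate * losses["iou"])

    total = weighted(main)
    if aux is not None:
        # Assumed: the auxiliary branch is scaled down by aux_rate.
        total += aux_rate * weighted(aux)
    return total


unit = {"cls": 1.0, "dfl": 1.0, "iou": 1.0}
print(combine_yolov9_losses(unit))
print(combine_yolov9_losses(unit, aux=unit))
```

With these defaults the IoU term dominates the total, so box quality drives most of the gradient under this assumed scheme.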