datumaro.plugins.anchor_generator

Classes

BboxOverlaps2D([scale, dtype])

2D Overlaps (e.g. IoUs, GIoUs) Calculator.

DataAwareAnchorGenerator(img_size, strides, ...)

Data-aware anchor generator for optimizing appropriate anchor scales and ratios.

class datumaro.plugins.anchor_generator.BboxOverlaps2D(scale=1.0, dtype=None)

Bases: object

2D Overlaps (e.g. IoUs, GIoUs) Calculator.
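As an illustration of what an overlap calculator computes, the following is a minimal pure-Python sketch of pairwise IoU between axis-aligned boxes. It is not the implementation of BboxOverlaps2D: the function names (box_area, iou, pairwise_iou) and the (x1, y1, x2, y2) box layout are assumptions made for this example, and the scale and dtype arguments of the class are not modelled.

```python
from typing import List, Sequence


def box_area(box: Sequence[float]) -> float:
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(x2 - x1, 0.0) * max(y2 - y1, 0.0)


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = box_area((ix1, iy1, ix2, iy2))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def pairwise_iou(
    boxes_a: Sequence[Sequence[float]], boxes_b: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Return a len(boxes_a) x len(boxes_b) matrix of IoU values."""
    return [[iou(a, b) for b in boxes_b] for a in boxes_a]
```

Each row of the resulting matrix corresponds to one box in the first set, so taking the maximum of a row gives that box's best overlap with the second set.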

class datumaro.plugins.anchor_generator.DataAwareAnchorGenerator(img_size: Tuple[int, int], strides: List[int], scales: List[List[float]], ratios: List[List[float]], pos_thr: float, neg_thr: float, device: str | None = 'cpu')

Bases: object

Data-aware anchor generator that optimizes anchor scales and ratios for a dataset. In general, an anchor generator takes img_size and strides, and its assigner takes positive and negative thresholds to solve the matching problem in object detection tasks.

Parameters:
  • img_size (Tuple[int, int]) – Image size as (height, width).

  • strides (List[int]) – Strides of the feature maps from the feature pyramid network. This implicitly indicates the receptive field size and the base size of the anchor generator.

  • scales (List[List[float]]) – Initial scales for data-aware optimization.

  • ratios (List[List[float]]) – Initial ratios for data-aware optimization.

  • pos_thr (float) – Positive threshold for matching in the following assigner.

  • neg_thr (float) – Negative threshold for matching in the following assigner.

  • device (str) – Device for computing gradients. See torch.device.
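To make the role of pos_thr and neg_thr concrete, here is a minimal sketch of a threshold-based assignment rule. It assumes the common max-IoU convention (an anchor is positive when its best IoU with any target reaches pos_thr, negative when it falls below neg_thr, and ignored otherwise); the assigner actually used with this class may differ, and assign_labels is a hypothetical helper, not part of Datumaro.

```python
from typing import List


def assign_labels(max_ious: List[float], pos_thr: float, neg_thr: float) -> List[int]:
    """Label each anchor from its best IoU with any target.

    Returns 1 (positive), 0 (negative), or -1 (ignored) per anchor.
    Assumption: standard max-IoU thresholds, not Datumaro's exact rule.
    """
    labels = []
    for value in max_ious:
        if value >= pos_thr:
            labels.append(1)   # good match: treated as an object
        elif value < neg_thr:
            labels.append(0)   # poor match: treated as background
        else:
            labels.append(-1)  # in between: left out of the matching
    return labels
```

With pos_thr=0.5 and neg_thr=0.4, for instance, an anchor whose best IoU is 0.45 falls into the ignored band.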

get_shifts(stride: int) → Tensor

Bounding box proposals from the anchor generator are composed of shifts and base anchors, where shifts are generated in a mesh-grid manner and base anchors are combinations of ratios and scales. This function creates the mesh-grid shifts in the original image space.

Parameters:

stride (int) – Stride of the feature map from the feature pyramid network.

Returns:

Shift point coordinates.

Return type:

Tensor
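The mesh-grid construction described for get_shifts can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: grid_shifts is a hypothetical helper, the feature map is assumed to have floor(size / stride) cells per axis, and each shift is assumed to use an (x, y, x, y) layout so it can be added directly to (x1, y1, x2, y2) boxes. The real method returns a Tensor rather than a list.

```python
from typing import List, Tuple


def grid_shifts(img_size: Tuple[int, int], stride: int) -> List[Tuple[int, int, int, int]]:
    """Enumerate one (x, y, x, y) shift per feature-map cell, row by row."""
    height, width = img_size
    # Assumption: the feature map has floor(size / stride) cells per axis.
    feat_h, feat_w = height // stride, width // stride
    shifts = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Map the cell index back to original image coordinates.
            x, y = col * stride, row * stride
            shifts.append((x, y, x, y))
    return shifts
```

For a 4 x 6 image (height x width) and a stride of 2, this yields 2 x 3 = 6 shift points.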

get_anchors(base_size: int, shifts: Tensor, scales: Tensor, ratios: Tensor) → Tensor

This function creates base anchors, which combine ratios and scales, and applies the given shifts to them.

Parameters:
  • base_size (int) – Base size of the anchors, given by the stride of the feature map from the feature pyramid network.

  • shifts (Tensor) – Shift point coordinates in the original image space.

  • scales (Tensor) – Scales for creating base anchors.

  • ratios (Tensor) – Ratios for creating base anchors.

Returns:

Set of anchor bounding box coordinates.

Return type:

Tensor
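The combination of ratios and scales into base anchors, and their placement at shift points, can be sketched as below. The convention used here (ratio = height / width, with area preserved for a given scale, and zero-centred base boxes) is the one common in detection toolkits and is an assumption; base_anchors and shift_anchors are hypothetical helpers, and the real method works on Tensors.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def base_anchors(base_size: int, scales: Sequence[float], ratios: Sequence[float]) -> List[Box]:
    """One zero-centred (x1, y1, x2, y2) box per (ratio, scale) pair."""
    boxes = []
    for ratio in ratios:
        # Assumption: ratio = height / width, with area preserved per scale.
        h_ratio = math.sqrt(ratio)
        w_ratio = 1.0 / h_ratio
        for scale in scales:
            w = base_size * scale * w_ratio
            h = base_size * scale * h_ratio
            boxes.append((-w / 2, -h / 2, w / 2, h / 2))
    return boxes


def shift_anchors(base: Sequence[Box], shifts: Sequence[Box]) -> List[Box]:
    """Translate every base anchor to every shift point."""
    return [
        tuple(b + s for b, s in zip(box, shift))
        for shift in shifts
        for box in base
    ]
```

With 3 ratios and 2 scales there are 6 base anchors per shift point, so the total number of anchors on one feature level is 6 times the number of shifts.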

initialize(targets, scales, ratios)

get_loss(targets: Tensor, scales: Tensor, ratios: Tensor)

This function computes the cost and the coverage rate of the anchors created from the given scales and ratios with respect to the targets.

Parameters:
  • targets (Tensor) – Set of target bounding box coordinates.

  • scales (Tensor) – Scales for creating base anchors.

  • ratios (Tensor) – Ratios for creating base anchors.

Returns:

Cost and coverage rate.

Return type:

Tuple[float, float]

optimize(dataset: Dataset, subset: str | None = None, batch_size: int | None = 1024, learning_rate: float | None = 0.1, num_iters: int | None = 100)

This function optimizes the anchor scales and ratios for the given dataset.

Parameters:
  • dataset (Dataset) – Desired dataset to optimize anchor scales and ratios.

  • batch_size (int) – Minibatch size.

  • learning_rate (float) – Learning rate.

  • num_iters (int) – Number of iterations.

Returns:

Optimized scales and optimized ratios.

Return type:

Tuple[List[List[float]], List[List[float]]]