otx.algo.classification.heads#

Head modules for OTX custom models.

Classes

LinearClsHead(num_classes, in_channels[, ...])

Linear classifier head.

MultiLabelLinearClsHead(num_classes, in_channels)

Custom Linear classification head for multilabel task.

MultiLabelNonLinearClsHead(num_classes, ...)

Non-linear classification head for multilabel task.

HierarchicalLinearClsHead(...[, thr, init_cfg])

Custom classification linear head for hierarchical classification task.

HierarchicalNonLinearClsHead(...)

Custom classification non-linear head for hierarchical classification task.

HierarchicalCBAMClsHead(...[, thr, ...])

Custom classification CBAM head for hierarchical classification task.

VisionTransformerClsHead(num_classes, ...[, ...])

Vision Transformer classifier head.

SemiSLLinearClsHead(num_classes, in_channels)

LinearClsHead for OTXSemiSLClsHead.

SemiSLVisionTransformerClsHead(num_classes, ...)

VisionTransformerClsHead for OTXSemiSLClsHead.

class otx.algo.classification.heads.HierarchicalCBAMClsHead(num_multiclass_heads: int, num_multilabel_classes: int, head_idx_to_logits_range: dict[str, tuple[int, int]], num_single_label_classes: int, empty_multiclass_head_indices: list[int], in_channels: int, num_classes: int, thr: float = 0.5, init_cfg: dict | None = None, step_size: int | tuple[int, int] = 7, **kwargs)[source]#

Bases: HierarchicalClsHead

Custom classification CBAM head for hierarchical classification task.

Parameters:
  • num_multiclass_heads (int) – Number of multi-class heads.

  • num_multilabel_classes (int) – Number of multi-label classes.

  • head_idx_to_logits_range (dict[str, tuple[int, int]]) – The logit range of each head.

  • num_single_label_classes (int) – The number of single-label classes.

  • empty_multiclass_head_indices (list[int]) – Indices of heads that contain no labels due to label removal.

  • in_channels (int) – Number of channels in the input feature map.

  • num_classes (int) – Number of total classes.

  • thr (float, optional) – Predictions with scores under the threshold are considered negative. Defaults to 0.5.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

  • step_size (int | tuple[int, int], optional) – Step size value for HierarchicalCBAMClsHead. Defaults to 7.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.

pre_logits(feats: tuple[Tensor] | Tensor) Tensor[source]#

The process before the final classification head.
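A minimal construction sketch; the label schema, channel count, and input layout below are illustrative assumptions, not values prescribed by OTX:

    import torch
    from otx.algo.classification.heads import HierarchicalCBAMClsHead

    # Hypothetical hierarchy: two multi-class heads (3 and 2 labels) plus 3 multi-label classes.
    head = HierarchicalCBAMClsHead(
        num_multiclass_heads=2,
        num_multilabel_classes=3,
        head_idx_to_logits_range={"0": (0, 3), "1": (3, 5)},
        num_single_label_classes=5,
        empty_multiclass_head_indices=[],
        in_channels=960,
        num_classes=8,
        step_size=7,  # assumes a 7x7 spatial backbone output
    )

    # Assumed input: a (N, in_channels, 7, 7) feature map from the backbone.
    logits = head(torch.randn(2, 960, 7, 7))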

class otx.algo.classification.heads.HierarchicalLinearClsHead(num_multiclass_heads: int, num_multilabel_classes: int, head_idx_to_logits_range: dict[str, tuple[int, int]], num_single_label_classes: int, empty_multiclass_head_indices: list[int], in_channels: int, num_classes: int, thr: float = 0.5, init_cfg: dict | None = None, **kwargs)[source]#

Bases: HierarchicalClsHead

Custom classification linear head for hierarchical classification task.

Parameters:
  • num_multiclass_heads (int) – Number of multi-class heads.

  • num_multilabel_classes (int) – Number of multi-label classes.

  • head_idx_to_logits_range (dict[str, tuple[int, int]]) – The logit range of each head.

  • num_single_label_classes (int) – The number of single-label classes.

  • empty_multiclass_head_indices (list[int]) – Indices of heads that contain no labels due to label removal.

  • in_channels (int) – Number of channels in the input feature map.

  • num_classes (int) – Number of total classes.

  • thr (float, optional) – Predictions with scores under the threshold are considered negative. Defaults to 0.5.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.
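
A minimal sketch reusing the same illustrative label schema as above; the pooled feature shape is an assumption:

    import torch
    from otx.algo.classification.heads import HierarchicalLinearClsHead

    head = HierarchicalLinearClsHead(
        num_multiclass_heads=2,
        num_multilabel_classes=3,
        head_idx_to_logits_range={"0": (0, 3), "1": (3, 5)},
        num_single_label_classes=5,
        empty_multiclass_head_indices=[],
        in_channels=1280,
        num_classes=8,
    )

    # Assumed input: globally pooled backbone features of shape (N, in_channels).
    logits = head(torch.randn(2, 1280))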

class otx.algo.classification.heads.HierarchicalNonLinearClsHead(num_multiclass_heads: int, num_multilabel_classes: int, head_idx_to_logits_range: dict[str, tuple[int, int]], num_single_label_classes: int, empty_multiclass_head_indices: list[int], in_channels: int, num_classes: int, thr: float = 0.5, hid_channels: int = 1280, activation: ~typing.Callable[[], ~torch.nn.modules.module.Module] = <class 'torch.nn.modules.activation.ReLU'>, dropout: bool = False, init_cfg: dict | None = None, **kwargs)[source]#

Bases: HierarchicalClsHead

Custom classification non-linear head for hierarchical classification task.

Parameters:
  • num_multiclass_heads (int) – Number of multi-class heads.

  • num_multilabel_classes (int) – Number of multi-label classes.

  • head_idx_to_logits_range (dict[str, tuple[int, int]]) – The logit range of each head.

  • num_single_label_classes (int) – The number of single-label classes.

  • empty_multiclass_head_indices (list[int]) – Indices of heads that contain no labels due to label removal.

  • in_channels (int) – Number of channels in the input feature map.

  • num_classes (int) – Number of total classes.

  • thr (float, optional) – Predictions with scores under the threshold are considered negative. Defaults to 0.5.

  • hid_channels (int) – Number of channels in the hidden feature map at the classifier. Defaults to 1280.

  • activation (Callable[[], nn.Module]) – Activation layer module at the classifier. Defaults to nn.ReLU.

  • dropout (bool) – Whether to enable dropout at the classifier. Defaults to False.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.
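
A minimal sketch of the non-linear variant, highlighting the classifier-specific arguments (hid_channels, activation, dropout); the other values are illustrative:

    from torch import nn
    from otx.algo.classification.heads import HierarchicalNonLinearClsHead

    head = HierarchicalNonLinearClsHead(
        num_multiclass_heads=2,
        num_multilabel_classes=3,
        head_idx_to_logits_range={"0": (0, 3), "1": (3, 5)},
        num_single_label_classes=5,
        empty_multiclass_head_indices=[],
        in_channels=1280,
        num_classes=8,
        hid_channels=1280,   # hidden projection width
        activation=nn.ReLU,  # any nn.Module factory
        dropout=True,        # enable dropout in the classifier
    )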

class otx.algo.classification.heads.LinearClsHead(num_classes: int, in_channels: int, init_cfg: dict = {'layer': 'Linear', 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]#

Bases: BaseModule

Linear classifier head.

Parameters:
  • num_classes (int) – Number of categories excluding the background category.

  • in_channels (int) – Number of channels in the input feature map.

  • cal_acc (bool) – Whether to calculate accuracy during training. If you use batch augmentations like Mixup and CutMix during training, it is pointless to calculate accuracy. Defaults to False.

  • init_cfg (dict, optional) – the config to control the initialization. Defaults to dict(type='Normal', layer='Linear', std=0.01).

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.

predict(feats: tuple[Tensor], **kwargs) Tensor[source]#

Inference without augmentation.

Parameters:

feats (tuple[Tensor]) – The features extracted from the backbone. Multiple stage inputs are acceptable but only the last stage will be used to classify. The shape of every item should be (num_samples, num_classes).

Returns:

A tensor of softmax results.

Return type:

torch.Tensor
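
A minimal usage sketch; the feature shape is an assumption based on the documented in_channels argument:

    import torch
    from otx.algo.classification.heads import LinearClsHead

    head = LinearClsHead(num_classes=10, in_channels=512)

    # Assumed input: a tuple of per-stage features; only the last stage is classified.
    feats = (torch.randn(4, 512),)
    logits = head(feats)          # raw logits
    scores = head.predict(feats)  # softmax scores for inference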

class otx.algo.classification.heads.MultiLabelLinearClsHead(num_classes: int, in_channels: int, normalized: bool = False, init_cfg: dict | None = None, **kwargs)[source]#

Bases: MultiLabelClsHead

Custom Linear classification head for multilabel task.

Parameters:
  • num_classes (int) – Number of categories.

  • in_channels (int) – Number of channels in the input feature map.

  • normalized (bool) – Normalize input features and weights.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.

class otx.algo.classification.heads.MultiLabelNonLinearClsHead(num_classes: int, in_channels: int, hid_channels: int = 1280, activation: ~typing.Callable[[...], ~torch.nn.modules.module.Module] | ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, dropout: bool = False, normalized: bool = False, init_cfg: dict | None = None, **kwargs)[source]#

Bases: MultiLabelClsHead

Non-linear classification head for multilabel task.

Parameters:
  • num_classes (int) – Number of categories.

  • in_channels (int) – Number of channels in the input feature map.

  • hid_channels (int) – Number of channels in the hidden feature map.

  • activation (Callable[..., nn.Module] | nn.Module) – Activation layer module. Defaults to nn.ReLU.

  • dropout (bool) – Whether to use dropout. Defaults to False.

  • normalized (bool) – Normalize input features and weights in the last linear layer.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.
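
A minimal sketch covering the multilabel-specific options; the class and channel counts are illustrative:

    import torch
    from torch import nn
    from otx.algo.classification.heads import MultiLabelNonLinearClsHead

    head = MultiLabelNonLinearClsHead(
        num_classes=20,
        in_channels=512,
        hid_channels=1280,
        activation=nn.ReLU,
        dropout=True,
        normalized=True,  # normalize features and weights in the final linear layer
    )

    # Assumed input: pooled backbone features of shape (N, in_channels).
    logits = head((torch.randn(4, 512),))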

class otx.algo.classification.heads.SemiSLLinearClsHead(num_classes: int, in_channels: int, use_dynamic_threshold: bool = True, min_threshold: float = 0.5)[source]#

Bases: OTXSemiSLClsHead, LinearClsHead

LinearClsHead for OTXSemiSLClsHead.

Initializes the OTXSemiSLClsHead class.

Parameters:
  • num_classes (int) – The number of classes.

  • in_channels (int) – Number of channels in the input feature map.

  • use_dynamic_threshold (bool, optional) – Whether to use a dynamic threshold for pseudo-label selection. Defaults to True.

  • min_threshold (float, optional) – The minimum threshold for pseudo-label selection. Defaults to 0.5.
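
A minimal construction sketch; the class and channel counts are illustrative:

    from otx.algo.classification.heads import SemiSLLinearClsHead

    # Pseudo-labels with confidence below the (possibly dynamic) threshold are ignored.
    head = SemiSLLinearClsHead(
        num_classes=10,
        in_channels=512,
        use_dynamic_threshold=True,  # adapt the confidence threshold during training
        min_threshold=0.5,           # lower bound for pseudo-label confidence
    )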

class otx.algo.classification.heads.SemiSLVisionTransformerClsHead(num_classes: int, in_channels: int, use_dynamic_threshold: bool = True, min_threshold: float = 0.5, hidden_dim: int | None = None, init_cfg: dict = {'layer': 'Linear', 'type': 'Constant', 'val': 0}, **kwargs)[source]#

Bases: OTXSemiSLClsHead, VisionTransformerClsHead

VisionTransformerClsHead for OTXSemiSLClsHead.

Initializes the OTXSemiSLClsHead class.

Parameters:
  • num_classes (int) – The number of classes.

  • in_channels (int) – Number of channels in the input feature map.

  • use_dynamic_threshold (bool, optional) – Whether to use a dynamic threshold for pseudo-label selection. Defaults to True.

  • min_threshold (float, optional) – The minimum threshold for pseudo-label selection. Defaults to 0.5.

class otx.algo.classification.heads.VisionTransformerClsHead(num_classes: int, in_channels: int, hidden_dim: int | None = None, init_cfg: dict = {'layer': 'Linear', 'type': 'Constant', 'val': 0}, **kwargs)[source]#

Bases: BaseModule

Vision Transformer classifier head.

Parameters:
  • num_classes (int) – Number of categories excluding the background category.

  • in_channels (int) – Number of channels in the input feature map.

  • hidden_dim (int, optional) – Number of dimensions for the hidden layer. Defaults to None, which means no extra hidden layer.

  • init_cfg (dict) – The extra initialization configs. Defaults to dict(type='Constant', layer='Linear', val=0).

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[list[Tensor]]) Tensor[source]#

The forward process.

init_weights() None[source]#

Initialize the weights of the hidden layer if it exists.

pre_logits(feats: tuple[list[Tensor]]) Tensor[source]#

The process before the final classification head.

The input feats is a tuple of lists of tensors, where each tensor is the feature of a backbone stage. In VisionTransformerClsHead, we obtain the feature of the last stage and forward it through the hidden layer if it exists.

predict(feats: tuple[Tensor]) Tensor[source]#

Inference without augmentation.

Parameters:

feats (tuple[Tensor]) – The features extracted from the backbone. Multiple stage inputs are acceptable but only the last stage will be used to classify. The shape of every item should be (num_samples, num_classes).

Returns:

A tensor of softmax results.

Return type:

torch.Tensor
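
A minimal sketch; the token layout below is an assumption about how backbone features are packed, not a guaranteed contract:

    import torch
    from otx.algo.classification.heads import VisionTransformerClsHead

    head = VisionTransformerClsHead(num_classes=10, in_channels=768, hidden_dim=768)
    head.init_weights()

    # Assumed layout: a tuple of per-stage lists whose last element is the
    # class token of shape (N, in_channels).
    feats = ([torch.randn(4, 197, 768), torch.randn(4, 768)],)
    scores = head.predict(feats)  # softmax scores over num_classes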