otx.algo.classification.heads#

Head modules for OTX custom models.

Classes

LinearClsHead(num_classes, in_channels[, ...])

Linear classifier head.

MultiLabelLinearClsHead(num_classes, in_channels)

Custom Linear classification head for multilabel task.

MultiLabelNonLinearClsHead(num_classes, ...)

Non-linear classification head for multilabel task.

HierarchicalLinearClsHead(...[, thr, init_cfg])

Custom classification linear head for hierarchical classification task.

HierarchicalNonLinearClsHead(...)

Custom classification non-linear head for hierarchical classification task.

HierarchicalCBAMClsHead(...[, thr, ...])

Custom classification CBAM head for hierarchical classification task.

VisionTransformerClsHead(num_classes, ...[, ...])

Vision Transformer classifier head.

SemiSLLinearClsHead(num_classes, in_channels)

LinearClsHead for OTXSemiSLClsHead.

SemiSLVisionTransformerClsHead(num_classes, ...)

VisionTransformerClsHead for OTXSemiSLClsHead.

class otx.algo.classification.heads.HierarchicalCBAMClsHead(num_multiclass_heads: int, num_multilabel_classes: int, head_idx_to_logits_range: dict[str, tuple[int, int]], num_single_label_classes: int, empty_multiclass_head_indices: list[int], in_channels: int, num_classes: int, thr: float = 0.5, init_cfg: dict | None = None, step_size: int | tuple[int, int] = 7, **kwargs)[source]#

Bases: HierarchicalClsHead

Custom classification CBAM head for hierarchical classification task.

Parameters:
  • num_multiclass_heads (int) – Number of multi-class heads.

  • num_multilabel_classes (int) – Number of multi-label classes.

  • head_idx_to_logits_range (dict[str, tuple[int, int]]) – The logit range of each head.

  • num_single_label_classes (int) – The number of single-label classes.

  • empty_multiclass_head_indices (list[int]) – Indices of heads that contain no labels due to label removal.

  • in_channels (int) – Number of channels in the input feature map.

  • num_classes (int) – Number of total classes.

  • thr (float, optional) – Predictions with scores under the threshold are considered negative. Defaults to 0.5.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

  • step_size (int | tuple[int, int], optional) – Step size value for HierarchicalCBAMClsHead. Defaults to 7.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.

pre_logits(feats: tuple[Tensor] | Tensor) Tensor[source]#

The process before the final classification head.
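A minimal construction sketch; the label schema, channel count, and input layout below are illustrative assumptions, not values prescribed by OTX:

    import torch
    from otx.algo.classification.heads import HierarchicalCBAMClsHead

    # Hypothetical hierarchy: two multi-class heads (3 and 2 labels) plus 3 multi-label classes.
    head = HierarchicalCBAMClsHead(
        num_multiclass_heads=2,
        num_multilabel_classes=3,
        head_idx_to_logits_range={"0": (0, 3), "1": (3, 5)},
        num_single_label_classes=5,
        empty_multiclass_head_indices=[],
        in_channels=960,
        num_classes=8,
        step_size=7,  # assumes a 7x7 spatial backbone output
    )

    # Assumed input: a (N, in_channels, 7, 7) feature map from the backbone.
    logits = head(torch.randn(2, 960, 7, 7))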

class otx.algo.classification.heads.HierarchicalLinearClsHead(num_multiclass_heads: int, num_multilabel_classes: int, head_idx_to_logits_range: dict[str, tuple[int, int]], num_single_label_classes: int, empty_multiclass_head_indices: list[int], in_channels: int, num_classes: int, thr: float = 0.5, init_cfg: dict | None = None, **kwargs)[source]#

Bases: HierarchicalClsHead

Custom classification linear head for hierarchical classification task.

Parameters:
  • num_multiclass_heads (int) – Number of multi-class heads.

  • num_multilabel_classes (int) – Number of multi-label classes.

  • head_idx_to_logits_range (dict[str, tuple[int, int]]) – The logit range of each head.

  • num_single_label_classes (int) – The number of single-label classes.

  • empty_multiclass_head_indices (list[int]) – Indices of heads that contain no labels due to label removal.

  • in_channels (int) – Number of channels in the input feature map.

  • num_classes (int) – Number of total classes.

  • thr (float, optional) – Predictions with scores under the threshold are considered negative. Defaults to 0.5.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.
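
A minimal sketch reusing the same illustrative label schema as above; the pooled feature shape is an assumption:

    import torch
    from otx.algo.classification.heads import HierarchicalLinearClsHead

    head = HierarchicalLinearClsHead(
        num_multiclass_heads=2,
        num_multilabel_classes=3,
        head_idx_to_logits_range={"0": (0, 3), "1": (3, 5)},
        num_single_label_classes=5,
        empty_multiclass_head_indices=[],
        in_channels=1280,
        num_classes=8,
    )

    # Assumed input: globally pooled backbone features of shape (N, in_channels).
    logits = head(torch.randn(2, 1280))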

class otx.algo.classification.heads.HierarchicalNonLinearClsHead(num_multiclass_heads: int, num_multilabel_classes: int, head_idx_to_logits_range: dict[str, tuple[int, int]], num_single_label_classes: int, empty_multiclass_head_indices: list[int], in_channels: int, num_classes: int, thr: float = 0.5, hid_channels: int = 1280, activation: ~typing.Callable[[], ~torch.nn.modules.module.Module] = <class 'torch.nn.modules.activation.ReLU'>, dropout: bool = False, init_cfg: dict | None = None, **kwargs)[source]#

Bases: HierarchicalClsHead

Custom classification non-linear head for hierarchical classification task.

Parameters:
  • num_multiclass_heads (int) – Number of multi-class heads.

  • num_multilabel_classes (int) – Number of multi-label classes.

  • head_idx_to_logits_range (dict[str, tuple[int, int]]) – The logit range of each head.

  • num_single_label_classes (int) – The number of single-label classes.

  • empty_multiclass_head_indices (list[int]) – Indices of heads that contain no labels due to label removal.

  • in_channels (int) – Number of channels in the input feature map.

  • num_classes (int) – Number of total classes.

  • thr (float, optional) – Predictions with scores under the threshold are considered negative. Defaults to 0.5.

  • hid_channels (int) – Number of channels in the hidden feature map at the classifier. Defaults to 1280.

  • activation (Callable[[], nn.Module]) – Activation layer module at the classifier. Defaults to nn.ReLU.

  • dropout (bool) – Whether to enable dropout at the classifier. Defaults to False.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.
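
A minimal sketch of the non-linear variant, highlighting the classifier-specific arguments (hid_channels, activation, dropout); the other values are illustrative:

    from torch import nn
    from otx.algo.classification.heads import HierarchicalNonLinearClsHead

    head = HierarchicalNonLinearClsHead(
        num_multiclass_heads=2,
        num_multilabel_classes=3,
        head_idx_to_logits_range={"0": (0, 3), "1": (3, 5)},
        num_single_label_classes=5,
        empty_multiclass_head_indices=[],
        in_channels=1280,
        num_classes=8,
        hid_channels=1280,   # hidden projection width
        activation=nn.ReLU,  # any nn.Module factory
        dropout=True,        # enable dropout in the classifier
    )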

class otx.algo.classification.heads.LinearClsHead(num_classes: int, in_channels: int, init_cfg: dict = {'layer': 'Linear', 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]#

Bases: BaseModule

Linear classifier head.

Parameters:
  • num_classes (int) – Number of categories excluding the background category.

  • in_channels (int) – Number of channels in the input feature map.

  • cal_acc (bool) – Whether to calculate accuracy during training. If you use batch augmentations like Mixup and CutMix during training, it is pointless to calculate accuracy. Defaults to False.

  • init_cfg (dict, optional) – the config to control the initialization. Defaults to dict(type='Normal', layer='Linear', std=0.01).

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.

predict(feats: tuple[Tensor], **kwargs) Tensor[source]#

Inference without augmentation.

Parameters:

feats (tuple[Tensor]) – The features extracted from the backbone. Multiple stage inputs are acceptable but only the last stage will be used to classify. The shape of every item should be (num_samples, num_classes).

Returns:

A tensor of softmax results.

Return type:

torch.Tensor
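
A minimal usage sketch; the feature shape is an assumption based on the documented in_channels argument:

    import torch
    from otx.algo.classification.heads import LinearClsHead

    head = LinearClsHead(num_classes=10, in_channels=512)

    # Assumed input: a tuple of per-stage features; only the last stage is classified.
    feats = (torch.randn(4, 512),)
    logits = head(feats)          # raw logits
    scores = head.predict(feats)  # softmax scores for inference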

class otx.algo.classification.heads.MultiLabelLinearClsHead(num_classes: int, in_channels: int, normalized: bool = False, init_cfg: dict | None = None, **kwargs)[source]#

Bases: MultiLabelClsHead

Custom Linear classification head for multilabel task.

Parameters:
  • num_classes (int) – Number of categories.

  • in_channels (int) – Number of channels in the input feature map.

  • normalized (bool) – Normalize input features and weights.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.

class otx.algo.classification.heads.MultiLabelNonLinearClsHead(num_classes: int, in_channels: int, hid_channels: int = 1280, activation: ~typing.Callable[[...], ~torch.nn.modules.module.Module] | ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, dropout: bool = False, normalized: bool = False, init_cfg: dict | None = None, **kwargs)[source]#

Bases: MultiLabelClsHead

Non-linear classification head for multilabel task.

Parameters:
  • num_classes (int) – Number of categories.

  • in_channels (int) – Number of channels in the input feature map.

  • hid_channels (int) – Number of channels in the hidden feature map.

  • activation (Callable[..., nn.Module] | nn.Module) – Activation layer module. Defaults to nn.ReLU.

  • dropout (bool) – Whether to use dropout. Defaults to False.

  • normalized (bool) – Normalize input features and weights in the last linear layer.

  • init_cfg (dict | None, optional) – Initialization configuration key-values. Defaults to None.

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[Tensor] | Tensor) Tensor[source]#

The forward process.
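
A minimal sketch covering the multilabel-specific options; the class and channel counts are illustrative:

    import torch
    from torch import nn
    from otx.algo.classification.heads import MultiLabelNonLinearClsHead

    head = MultiLabelNonLinearClsHead(
        num_classes=20,
        in_channels=512,
        hid_channels=1280,
        activation=nn.ReLU,
        dropout=True,
        normalized=True,  # normalize features and weights in the final linear layer
    )

    # Assumed input: pooled backbone features of shape (N, in_channels).
    logits = head((torch.randn(4, 512),))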

class otx.algo.classification.heads.SemiSLLinearClsHead(num_classes: int, in_channels: int, use_dynamic_threshold: bool = True, min_threshold: float = 0.5)[source]#

Bases: OTXSemiSLClsHead, LinearClsHead

LinearClsHead for OTXSemiSLClsHead.

Initializes the OTXSemiSLClsHead class.

Parameters:
  • num_classes (int) – The number of classes.

  • in_channels (int) – Number of channels in the input feature map.

  • use_dynamic_threshold (bool, optional) – Whether to use a dynamic threshold for pseudo-label selection. Defaults to True.

  • min_threshold (float, optional) – The minimum threshold for pseudo-label selection. Defaults to 0.5.
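
A minimal construction sketch; the class and channel counts are illustrative:

    from otx.algo.classification.heads import SemiSLLinearClsHead

    # Pseudo-labels with confidence below the (possibly dynamic) threshold are ignored.
    head = SemiSLLinearClsHead(
        num_classes=10,
        in_channels=512,
        use_dynamic_threshold=True,  # adapt the confidence threshold during training
        min_threshold=0.5,           # lower bound for pseudo-label confidence
    )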

class otx.algo.classification.heads.SemiSLVisionTransformerClsHead(num_classes: int, in_channels: int, use_dynamic_threshold: bool = True, min_threshold: float = 0.5, hidden_dim: int | None = None, init_cfg: dict = {'layer': 'Linear', 'type': 'Constant', 'val': 0}, **kwargs)[source]#

Bases: OTXSemiSLClsHead, VisionTransformerClsHead

VisionTransformerClsHead for OTXSemiSLClsHead.

Initializes the OTXSemiSLClsHead class.

Parameters:
  • num_classes (int) – The number of classes.

  • in_channels (int) – Number of channels in the input feature map.

  • use_dynamic_threshold (bool, optional) – Whether to use a dynamic threshold for pseudo-label selection. Defaults to True.

  • min_threshold (float, optional) – The minimum threshold for pseudo-label selection. Defaults to 0.5.

class otx.algo.classification.heads.VisionTransformerClsHead(num_classes: int, in_channels: int, hidden_dim: int | None = None, init_cfg: dict = {'layer': 'Linear', 'type': 'Constant', 'val': 0}, **kwargs)[source]#

Bases: BaseModule

Vision Transformer classifier head.

Parameters:
  • num_classes (int) – Number of categories excluding the background category.

  • in_channels (int) – Number of channels in the input feature map.

  • hidden_dim (int, optional) – Number of dimensions for the hidden layer. Defaults to None, which means no extra hidden layer.

  • init_cfg (dict) – The extra initialization configs. Defaults to dict(type='Constant', layer='Linear', val=0).

Initialize BaseModule, inherited from torch.nn.Module.

forward(feats: tuple[list[Tensor]]) Tensor[source]#

The forward process.

init_weights() None[source]#

Initialize the weights of the hidden layer if it exists.

pre_logits(feats: tuple[list[Tensor]]) Tensor[source]#

The process before the final classification head.

The input feats is a tuple of lists of tensors, where each tensor is the feature of a backbone stage. In VisionTransformerClsHead, we obtain the feature of the last stage and forward it through the hidden layer if it exists.

predict(feats: tuple[Tensor]) Tensor[source]#

Inference without augmentation.

Parameters:

feats (tuple[Tensor]) – The features extracted from the backbone. Multiple stage inputs are acceptable but only the last stage will be used to classify. The shape of every item should be (num_samples, num_classes).

Returns:

A tensor of softmax results.

Return type:

torch.Tensor
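
A minimal sketch; the token layout below is an assumption about how backbone features are packed, not a guaranteed contract:

    import torch
    from otx.algo.classification.heads import VisionTransformerClsHead

    head = VisionTransformerClsHead(num_classes=10, in_channels=768, hidden_dim=768)
    head.init_weights()

    # Assumed layout: a tuple of per-stage lists whose last element is the
    # class token of shape (N, in_channels).
    feats = ([torch.randn(4, 197, 768), torch.randn(4, 768)],)
    scores = head.predict(feats)  # softmax scores over num_classes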