otx.algorithms.segmentation.adapters.mmseg
OTX Adapters - mmseg.
Classes
OTXSegDataset | Wrapper dataset that allows using an OTX dataset to train models. |
LiteHRNet | Lite-HRNet backbone. |
MMOVBackbone | MMOVBackbone. |
MMOVDecodeHead | MMOVDecodeHead. |
DetConLoss | Modified from deepmind/detcon. |
SelfSLMLP | The SelfSLMLP neck: fc/conv-bn-relu-fc/conv. |
ConstantScalarScheduler | The learning rate remains constant over time. |
PolyScalarScheduler | The learning rate changes over time according to a polynomial schedule. |
StepScalarScheduler | Step learning rate scheduler. |
DetConB | DetCon Implementation. |
CrossEntropyLossWithIgnore | CrossEntropyLossWithIgnore with Ignore Mode Support for Class Incremental Learning. |
SupConDetConB | Apply DetConB as a contrastive part of Supervised Contrastive Learning (https://arxiv.org/abs/2004.11362). |
MeanTeacherSegmentor | Mean teacher segmentor for semi-supervised learning. |
- class otx.algorithms.segmentation.adapters.mmseg.ConstantScalarScheduler(scale: float = 30.0)
Bases:
BaseScalarScheduler
The learning rate remains constant over time.
The learning rate equals the scale.
- Parameters:
scale (float) – The learning rate scale.
- class otx.algorithms.segmentation.adapters.mmseg.CrossEntropyLossWithIgnore(*args, **kwargs)
Bases:
CrossEntropyLoss
CrossEntropyLossWithIgnore with Ignore Mode Support for Class Incremental Learning.
When new classes are added through continual training cycles, images from previous cycles may become partially annotated if they are not revisited. To prevent the model from predicting these new classes for such images, CrossEntropyLossWithIgnore can be used to ignore the unseen classes.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(cls_score: Tensor | None, label: Tensor | None, weight: Tensor | None = None, avg_factor: int | None = None, reduction_override: str | None = 'mean', ignore_index: int = 255, valid_label_mask: Tensor | None = None, **kwargs)
Forward.
- Parameters:
cls_score (torch.Tensor, optional) – The prediction with shape (N, 1).
label (torch.Tensor, optional) – The learning label of the prediction.
weight (torch.Tensor, optional) – Sample-wise loss weight. Default: None.
class_weight (list[float], optional) – The weight for each class. Default: None.
avg_factor (int, optional) – Average factor that is used to average the loss. Default: None.
reduction_override (str, optional) – The method used to reduce the loss. Options are ‘none’, ‘mean’ and ‘sum’. Default: ‘mean’.
ignore_index (int) – Specifies a target value that is ignored and does not contribute to the input gradients. When avg_non_ignore is True and the reduction is 'mean', the loss is averaged over non-ignored targets. Default: 255.
valid_label_mask (torch.Tensor, optional) – The valid labels with shape (N, num_classes). If a value in valid_label_mask is 0, the mask label of the class corresponding to its index will be ignored, like ignore_index.
**kwargs (Any) – Additional keyword arguments.
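The interplay of ignore_index and valid_label_mask can be illustrated with a minimal pure-Python sketch. It is not the OTX implementation: the function name is hypothetical, the mask is a single per-class list rather than an (N, num_classes) tensor, and dropping masked classes from the softmax denominator is an assumption about how "ignored" is realized.

```python
import math


def cross_entropy_with_ignore(scores, labels, valid_label_mask=None, ignore_index=255):
    """Mean cross-entropy over pixels, skipping ignored labels and masked classes.

    scores: list of per-pixel logit lists, one logit per class.
    labels: list of per-pixel integer labels.
    valid_label_mask: list of 0/1 flags per class (shared by all pixels in this sketch).
    """
    num_classes = len(scores[0])
    mask = valid_label_mask or [1] * num_classes
    losses = []
    for logits, label in zip(scores, labels):
        # Skip pixels marked with ignore_index or labeled with a masked-out class.
        if label == ignore_index or mask[label] == 0:
            continue
        # Softmax denominator over valid classes only (an assumption of this sketch).
        valid = [z for z, m in zip(logits, mask) if m]
        peak = max(valid)
        log_denominator = peak + math.log(sum(math.exp(z - peak) for z in valid))
        losses.append(log_denominator - logits[label])
    return sum(losses) / len(losses) if losses else 0.0
```

With the mask [1, 1, 0], a large logit for the third class no longer penalizes a pixel labeled with the first class, which is the behavior the class description asks for when new classes are unannotated in older images.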
- property loss_name
Loss Name.
This function must be implemented and will return the name of this loss function. This name will be used to combine different loss items by simple sum operation. In addition, if you want this loss item to be included into the backward graph, loss_ must be the prefix of the name.
- Returns:
The name of this loss item.
- Return type:
- class otx.algorithms.segmentation.adapters.mmseg.DetConB(backbone: Dict[str, Any], neck: Dict[str, Any] | None = None, head: Dict[str, Any] | None = None, pretrained: str | None = None, base_momentum: float = 0.996, num_classes: int = 256, num_samples: int = 16, downsample: int = 32, input_transform: str = 'resize_concat', in_index: List[int] | int = [0], align_corners: bool = False, **kwargs)
Bases:
Module
DetCon Implementation.
- Implementation of ‘Efficient Visual Pretraining with Contrastive Detection’
- Parameters:
backbone (dict) – Config dict for module of backbone ConvNet.
neck (dict, optional) – Config dict for module of deep features to compact feature vectors. Default: None.
head (dict, optional) – Config dict for module of loss functions. Default: None.
pretrained (str, optional) – Path to pre-trained weights. Default: None.
base_momentum (float) – The base momentum coefficient for the target network. Default: 0.996.
num_classes (int) – The number of classes to be considered as pseudo classes. Default: 256.
num_samples (int) – The number of samples to be sampled. Default: 16.
downsample (int) – The ratio of the mask size to the feature size. Default: 32.
input_transform (str) – Input transform of features from backbone. Default: “resize_concat”.
in_index (list) – Feature index to be used for DetCon if the backbone outputs multi-scale features wrapped by list or tuple. Default: [0].
align_corners (bool) – Whether to apply align_corners during resize. Default: False.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- extract_feat(img: Tensor)
Extract features from images.
- Parameters:
img (Tensor) – Input image.
- Returns:
Features from the online_backbone.
- Return type:
Tensor
- forward(img, img_metas, return_loss=True, **kwargs)
Calls either forward_train() or forward_test() depending on whether return_loss is True.
Note this setting will change the expected inputs. When return_loss=True, img and img_meta are single-nested (i.e. Tensor and List[dict]), and when return_loss=False, img and img_meta should be double nested (i.e. List[Tensor], List[List[dict]]), with the outer list indicating test time augmentations.
- forward_train(img: Tensor, img_metas: List[Dict], gt_semantic_seg: Tensor, return_embedding: bool = False)
Forward function for training.
- Parameters:
- Returns:
A dictionary of loss components.
- Return type:
- init_weights(pretrained: str | None = None)
Initialize the weights of model.
- Parameters:
pretrained (str, optional) – Path to pre-trained weights. Default: None.
- sample_masked_feats(feats: Tensor | List | Tuple, masks: Tensor, projector: Module)
Sample features from masks.
- Parameters:
- Returns:
(proj, sampled_mask_ids), features from the projector and ids used to sample masks.
- Return type:
tuple[Tensor, Tensor]
- static state_dict_hook(module, state_dict, *args, **kwargs)
Save only online backbone as output state_dict.
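A hook of this kind amounts to filtering the state dict by key. The sketch below shows the idea on a plain dictionary; the function name, the "online_backbone." prefix, and the choice to strip that prefix from the kept keys are assumptions made for illustration, not the hook's actual behavior.

```python
from collections import OrderedDict


def keep_online_backbone(state_dict, prefix="online_backbone."):
    """Return a new state dict holding only the online backbone entries.

    The prefix is stripped so the result could be loaded as a plain backbone
    (an assumption of this sketch).
    """
    output = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            output[key[len(prefix):]] = value
    return output
```

Entries belonging to the target network or projection heads are dropped, so a checkpoint saved this way carries only the weights needed for downstream fine-tuning.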
- train_step(data_batch: Dict[str, Any], optimizer: Optimizer | Dict, **kwargs)
The iteration step during training.
This method defines an iteration step during training, except for the back propagation and optimizer updating, which are done in an optimizer hook. Note that in some complicated cases or models, the whole process including back propagation and optimizer updating is also defined in this method, such as GAN.
- Parameters:
data_batch (dict) – The output of dataloader.
optimizer (torch.optim.Optimizer | dict) – The optimizer of the runner, passed to train_step(). This argument is unused and reserved.
**kwargs (Any) – Additional keyword arguments.
- Returns:
It should contain at least 3 keys: loss, log_vars, and num_samples.
loss is a tensor for back propagation, which can be a weighted sum of multiple losses.
log_vars contains all the variables to be sent to the logger.
num_samples indicates the batch size (when the model is DDP, it means the batch size on each GPU), which is used for averaging the logs.
- Return type:
- class otx.algorithms.segmentation.adapters.mmseg.DetConLoss(temperature: float = 0.1, use_replicator_loss: bool = True, ignore_index: int = 255)
Bases:
Module
Modified from deepmind/detcon.
Compute the NCE scores from pairs of predictions and targets. This implements the batched form of the loss described in Section 3.1, Equation 3 in https://arxiv.org/pdf/2103.10957.pdf.
- Parameters:
temperature (float) – The temperature to use for the NCE loss.
use_replicator_loss (bool) – use cross-replica samples.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
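To make the role of temperature and mask indices concrete, here is a heavily simplified pure-Python sketch of a mask-aware NCE loss for a single image and a single prediction/target pair. It is not the DetConLoss implementation: it omits cross-replica gathering, the symmetric second-view term, and the local_negatives option, and the function name is hypothetical.

```python
import math


def contrastive_nce(preds, targets, pind, tind, temperature=0.1):
    """Simplified NCE loss for one image and one view pair.

    preds / targets: lists of equal-length feature vectors (assumed L2-normalized).
    pind / tind: mask id attached to each prediction / target vector.
    A target counts as a positive for a prediction when their mask ids match.
    """
    losses = []
    for pred, pid in zip(preds, pind):
        # Temperature-scaled similarity between this prediction and every target.
        logits = [sum(a * b for a, b in zip(pred, tgt)) / temperature for tgt in targets]
        positives = [z for z, tid in zip(logits, tind) if tid == pid]
        if not positives:
            continue  # no matching mask id among the targets
        peak = max(logits)
        log_all = peak + math.log(sum(math.exp(z - peak) for z in logits))
        # Average the negative log-likelihood over the positives.
        losses.append(sum(log_all - z for z in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0
```

A lower temperature sharpens the softmax over targets, so predictions that align with a target from the wrong mask are penalized more strongly.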
- forward(pred1, pred2, target1, target2, pind1, pind2, tind1, tind2, local_negatives=True)
Forward loss.
- Parameters:
pred1 (Tensor) – (b, num_samples, d) the prediction from first view.
pred2 (Tensor) – (b, num_samples, d) the prediction from second view.
target1 (Tensor) – (b, num_samples, d) the projection from first view.
target2 (Tensor) – (b, num_samples, d) the projection from second view.
pind1 (Tensor) – (b, num_samples) mask indices for first view’s prediction.
pind2 (Tensor) – (b, num_samples) mask indices for second view’s prediction.
tind1 (Tensor) – (b, num_samples) mask indices for first view’s projection.
tind2 (Tensor) – (b, num_samples) mask indices for second view’s projection.
local_negatives (bool) – whether to include local negatives.
- Returns:
A single scalar loss for the XT-NCE objective.
- Return type:
- class otx.algorithms.segmentation.adapters.mmseg.LiteHRNet(extra, in_channels=3, conv_cfg=None, norm_cfg=None, norm_eval=False, with_cp=False, zero_init_residual=False, dropout=None, init_cfg=None)
Bases:
BaseModule
Lite-HRNet backbone.
High-Resolution Representations for Labeling Pixels and Regions
- Parameters:
extra (dict) – detailed configuration for each stage of HRNet.
in_channels (int) – Number of input image channels. Default: 3.
conv_cfg (dict) – dictionary to construct and config conv layer.
norm_cfg (dict) – dictionary to construct and config norm layer.
norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only. Default: False
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.
Initialize BaseModule, inherited from torch.nn.Module
- class otx.algorithms.segmentation.adapters.mmseg.MMOVBackbone(*args, **kwargs)
Bases:
MMOVModel
MMOVBackbone.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- class otx.algorithms.segmentation.adapters.mmseg.MMOVDecodeHead(model_path_or_model: str | Model | None = None, weight_path: str | None = None, inputs: Dict[str, str | List[str]] | None = None, outputs: Dict[str, str | List[str]] | None = None, init_weight: bool = False, verify_shape: bool = True, *args, **kwargs)
Bases:
BaseDecodeHead
MMOVDecodeHead.
Initialize BaseModule, inherited from torch.nn.Module
- class otx.algorithms.segmentation.adapters.mmseg.MeanTeacherSegmentor(orig_type, num_iters_per_epoch=None, unsup_weight=0.1, proto_weight=0.7, drop_unrel_pixels_percent=20, semisl_start_epoch=1, proto_head=None, **kwargs)
Bases:
BaseSegmentor
Mean teacher segmentor for semi-supervised learning.
It builds two models from the same segmentor type and updates one as an exponential moving average (EMA) of the other; the pair is used to compute a consistency loss.
- Parameters:
orig_type (BaseSegmentor) – original type of segmentor to build student and teacher models
num_iters_per_epoch (int) – number of iterations per training epoch.
unsup_weight (float) – loss weight for unsupervised part. Default: 0.1
proto_weight (float) – loss weight for pixel prototype cross entropy loss. Default: 0.7
drop_unrel_pixels_percent (int) – starting percentage of pixels with high entropy to drop from the teacher's pseudo labels. Default: 20
semisl_start_epoch (int) – epoch to start learning with unlabeled images. Default: 1
proto_head (dict) – configuration to construct the prototype network. Default: None
Initialize BaseModule, inherited from torch.nn.Module
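The EMA step behind a mean teacher setup can be sketched on plain dictionaries of scalar parameters. The function name and the momentum value are illustrative assumptions, and the sketch follows the usual mean teacher convention in which the teacher tracks the student; it does not reproduce how this class schedules or applies the update.

```python
def ema_update(teacher, student, momentum=0.99):
    """Move each teacher parameter toward the matching student parameter in place."""
    for name, student_value in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * student_value
    return teacher
```

Because the teacher changes slowly, its predictions on unlabeled images are more stable than the student's, which is what makes them usable as pseudo labels.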
- decode_proto_network(sup_input, gt_semantic_seg, unsup_input=None, pl_from_teacher=None, reweight_unsup=1.0)
Forward prototype network, compute proto loss.
If there is no unsupervised part, only supervised loss will be computed.
- Parameters:
sup_input (torch.Tensor) – student output from labeled images
gt_semantic_seg (torch.Tensor) – ground truth semantic segmentation label maps
unsup_input (torch.Tensor) – student output from unlabeled images. Default: None
pl_from_teacher (torch.Tensor) – teacher generated pseudo labels. Default: None
reweight_unsup (float) – reweighting coefficient for unsupervised part after filtering high entropy pixels. Default: 1.0
- generate_pseudo_labels(ul_w_img, ul_img_metas)
Generate pseudo labels from teacher model, apply filter loss method.
- Parameters:
ul_w_img (torch.Tensor) – weakly augmented unlabeled images
ul_img_metas (dict) – unlabeled images meta data
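One way to picture the filtering of unreliable pixels is to rank pixels by the entropy of the teacher's class probabilities and mark the most uncertain ones as ignored. The sketch below does this in pure Python; the function name, the flat list-of-pixels layout, and the use of 255 as the ignore value are assumptions, and the real method may compute and schedule the threshold differently.

```python
import math


def filter_pseudo_labels(probs, drop_percent=20, ignore_index=255):
    """Turn per-pixel class probabilities into pseudo labels, dropping uncertain pixels.

    probs: list of per-pixel probability lists (each sums to 1).
    The drop_percent of pixels with the highest entropy receive ignore_index.
    """
    labels = [max(range(len(p)), key=p.__getitem__) for p in probs]
    entropies = [-sum(q * math.log(q) for q in p if q > 0) for p in probs]
    num_drop = int(len(probs) * drop_percent / 100)
    # Indices of pixels ordered from most to least uncertain.
    order = sorted(range(len(probs)), key=entropies.__getitem__, reverse=True)
    for index in order[:num_drop]:
        labels[index] = ignore_index
    return labels
```

Pixels marked this way contribute nothing to the unsupervised loss, which is why the reweight_unsup coefficient described above exists to compensate for the reduced pixel count.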
- class otx.algorithms.segmentation.adapters.mmseg.OTXSegDataset(**kwargs)
Bases:
_OTXSegDataset
Wrapper dataset that allows using an OTX dataset to train models.
- class otx.algorithms.segmentation.adapters.mmseg.PolyScalarScheduler(start_scale: float, end_scale: float, num_iters: int, power: float = 1.2, by_epoch: bool = False)
Bases:
BaseScalarScheduler
The learning rate changes over time according to a polynomial schedule.
- Parameters:
start_scale (float) – The initial learning rate scale.
end_scale (float) – The final learning rate scale.
num_iters (int) – The number of iterations to reach the final learning rate.
power (float) – The power of the polynomial schedule.
by_epoch (bool) – Whether to use epoch as the unit of iteration.
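A polynomial schedule with these parameters is commonly computed as shown in the sketch below. The function name and the exact formula are assumptions about what "polynomial schedule" means here; the by_epoch option is left out.

```python
def poly_scalar(step, start_scale, end_scale, num_iters, power=1.2):
    """Polynomial decay from start_scale to end_scale over num_iters steps."""
    if step >= num_iters:
        return end_scale
    factor = (1.0 - step / num_iters) ** power
    return end_scale + (start_scale - end_scale) * factor
```

With power=1.0 the schedule is linear; larger powers make the value fall faster at the start and flatten out toward end_scale.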
- class otx.algorithms.segmentation.adapters.mmseg.SelfSLMLP(in_channels: int, hid_channels: int, out_channels: int, norm_cfg: Dict[str, Any] = {'type': 'BN1d'}, use_conv: bool = False, with_avg_pool: bool = True)
Bases:
Module
The SelfSLMLP neck: fc/conv-bn-relu-fc/conv.
- Parameters:
in_channels (int) – The number of feature output channels from backbone.
hid_channels (int) – The number of channels for a hidden layer.
out_channels (int) – The number of output channels of SelfSLMLP.
norm_cfg (dict) – Normalize configuration. Default: dict(type=”BN1d”).
use_conv (bool) – Whether using conv instead of fc. Default: False.
with_avg_pool (bool) – Whether using average pooling before passing MLP. Default: True.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- class otx.algorithms.segmentation.adapters.mmseg.StepScalarScheduler(scales: List[float], num_iters: List[int], by_epoch: bool = False)
Bases:
BaseScalarScheduler
Step learning rate scheduler.
Example
>>> scheduler = StepScalarScheduler(scales=[1.0, 0.1, 0.01], num_iters=[100, 200])
This means that the learning rate will be 1.0 for the first 100 iterations, 0.1 for the next 200 iterations, and 0.01 for the rest of the iterations.
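The lookup described in the example can be sketched as a standalone function. It treats each entry of num_iters as the length of a segment, following the wording of the example; the real class may instead interpret the entries as absolute iteration thresholds, so that reading is an assumption, as is the function name.

```python
def step_scalar(step, scales, num_iters):
    """Return the scale active at `step`, treating num_iters as segment lengths."""
    assert len(scales) == len(num_iters) + 1
    boundary = 0
    for scale, length in zip(scales, num_iters):
        boundary += length
        if step < boundary:
            return scale
    # Past the last boundary, the final scale applies for the rest of training.
    return scales[-1]
```

Under this reading, scales must always have exactly one more entry than num_iters, since the last scale has no end point.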
- class otx.algorithms.segmentation.adapters.mmseg.SupConDetConB(backbone: Dict[str, Any], decode_head: Dict[str, Any] | None = None, neck: Dict[str, Any] | None = None, head: Dict[str, Any] | None = None, pretrained: str | None = None, base_momentum: float = 0.996, num_classes: int = 256, num_samples: int = 16, downsample: int = 32, input_transform: str = 'resize_concat', in_index: List[int] | int = [0], align_corners: bool = False, train_cfg: Dict[str, Any] | None = None, test_cfg: Dict[str, Any] | None = None, **kwargs)
Bases:
OTXEncoderDecoder
Apply DetConB as a contrastive part of Supervised Contrastive Learning (https://arxiv.org/abs/2004.11362).
SupCon with DetConB uses ground truth masks instead of pseudo masks to organize features among the same classes.
- Parameters:
Initialize BaseModule, inherited from torch.nn.Module