datumaro.plugins.sam_transforms.interpreters.sam_decoder_for_amg#

Classes

`AMGMasks`(*[, id, attributes, group, object_id])	Intermediate annotation class for SAM decoder outputs.
`AMGPoints`(*[, id, attributes, group, object_id])	Intermediate annotation class for SAM decoder inputs.
`SAMDecoderForAMGInterpreter`()	Interpreter for the automatic mask generation using SAM decoder.

class datumaro.plugins.sam_transforms.interpreters.sam_decoder_for_amg.AMGPoints(*, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, points: ndarray)[source]#

Bases: Annotation

Intermediate annotation class for SAM decoder inputs.

points#

Array of points (x, y) for the SAM prompt.

Type:: numpy.ndarray

Method generated by attrs for class AMGPoints.

points: ndarray#
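
As an illustration of what a `points` array for automatic mask generation might contain, the sketch below builds a uniform 2D grid of (x, y) prompt coordinates. The helper name `build_point_grid`, its parameters, and the choice to centre each point in its grid cell are assumptions made for this example; they are not taken from the Datumaro source, and plain Python lists stand in for `numpy.ndarray`.

```python
from typing import List, Tuple


def build_point_grid(n_per_side: int, width: int, height: int) -> List[Tuple[float, float]]:
    """Return n_per_side**2 (x, y) points evenly spaced over a width x height image.

    Hypothetical helper: not part of Datumaro.
    """
    if n_per_side < 1:
        raise ValueError("n_per_side must be >= 1")
    # Centre each point inside its grid cell so that no point lies on the border.
    offset = 1.0 / (2 * n_per_side)
    step = 1.0 / n_per_side
    fractions = [offset + i * step for i in range(n_per_side)]
    # Row-major order: x varies fastest, y slowest.
    return [(fx * width, fy * height) for fy in fractions for fx in fractions]


grid = build_point_grid(2, width=1024, height=1024)
print(grid)
```

Each resulting (x, y) pair would be one point prompt for the decoder.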

class datumaro.plugins.sam_transforms.interpreters.sam_decoder_for_amg.AMGMasks(*, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, masks: ndarray, iou_preds: ndarray)[source]#

Bases: Annotation

Intermediate annotation class for SAM decoder outputs.

masks#

Array of masks corresponding to the points.

Type:: numpy.ndarray

iou_preds#

Array of Intersection over Union (IoU) prediction scores corresponding to the points.

Type:: numpy.ndarray

Method generated by attrs for class AMGMasks.

masks: ndarray#

iou_preds: ndarray#

classmethod cat(masks: List[AMGMasks]) → AMGMasks[source]#

Concatenate a list of AMGMasks into a single AMGMasks object.

Parameters:: masks – List of AMGMasks to concatenate.
Returns:: A new AMGMasks containing the concatenated masks and IoU prediction scores.
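
The concatenation can be pictured with a small stand-in class. This is only a sketch: `MaskBatch` is a hypothetical substitute for `AMGMasks`, nested lists replace `numpy.ndarray`, and joining along the first (per-point) axis is an assumption about how the real method behaves.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MaskBatch:
    """Hypothetical stand-in for AMGMasks; nested lists replace ndarrays."""

    masks: List[List[List[float]]]  # one 2D mask per prompt point
    iou_preds: List[float]          # one IoU prediction score per prompt point

    @classmethod
    def cat(cls, batches: List["MaskBatch"]) -> "MaskBatch":
        # Join every batch along the per-point axis, preserving order,
        # so masks[i] and iou_preds[i] still describe the same prompt.
        masks = [m for b in batches for m in b.masks]
        ious = [s for b in batches for s in b.iou_preds]
        return cls(masks=masks, iou_preds=ious)


a = MaskBatch(masks=[[[0.9, -0.2]]], iou_preds=[0.95])
b = MaskBatch(masks=[[[0.1, 0.4]], [[-1.0, 2.0]]], iou_preds=[0.60, 0.88])
merged = MaskBatch.cat([a, b])
print(len(merged.masks), merged.iou_preds)
```

With real arrays, the same idea would correspond to `numpy.concatenate` along axis 0 for both fields.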

postprocess(mask_threshold: float, pred_iou_thresh: float, stability_score_offset: float, stability_score_thresh: float, box_nms_thresh: float, min_mask_region_area: int) → List[Mask][source]#

Postprocesses the masks with the given parameters.

Parameters:

pred_iou_thresh (float) – A filtering threshold in [0,1], using the model’s predicted mask quality.
stability_score_thresh (float) – A filtering threshold in [0,1], using the stability of the mask under changes to the cutoff used to binarize the model’s mask predictions.
stability_score_offset (float) – The amount to shift the cutoff when calculating the stability score.
box_nms_thresh (float) – The box IoU cutoff used by non-maximal suppression to filter duplicate masks.
min_mask_region_area (int) – If >0, postprocessing will be applied to remove binary masks that contain fewer than min_mask_region_area pixels set to 1.

Returns:

List of Mask objects representing the postprocessed masks.
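
To make the two filtering thresholds concrete, here is a minimal sketch of how a stability score can be computed and combined with the predicted IoU. It follows the parameter descriptions above: the score is the IoU between the mask binarized at a stricter cutoff and at a looser cutoff. The function names, the use of nested lists for mask scores, and the exact boundary handling (`<` versus `<=`) are assumptions for illustration, not the Datumaro implementation.

```python
from typing import List

Logits = List[List[float]]  # one mask as a 2D grid of raw scores


def stability_score(logits: Logits, mask_threshold: float, offset: float) -> float:
    """IoU between the mask binarized at a stricter and at a looser cutoff."""
    values = [v for row in logits for v in row]
    strict = sum(v > mask_threshold + offset for v in values)
    loose = sum(v > mask_threshold - offset for v in values)
    # The strict mask is a subset of the loose one, so IoU = strict / loose.
    return strict / loose if loose else 0.0


def keep_mask(
    logits: Logits,
    iou_pred: float,
    mask_threshold: float,
    pred_iou_thresh: float,
    stability_score_offset: float,
    stability_score_thresh: float,
) -> bool:
    """Apply the predicted-IoU filter first, then the stability filter."""
    if iou_pred < pred_iou_thresh:
        return False
    score = stability_score(logits, mask_threshold, stability_score_offset)
    return score >= stability_score_thresh


stable = [[5.0, 5.0], [-5.0, -5.0]]    # scores far from the cutoff
unstable = [[0.5, 5.0], [-0.5, -5.0]]  # two scores close to the cutoff

print(stability_score(stable, 0.0, 1.0))
print(stability_score(unstable, 0.0, 1.0))
print(keep_mask(stable, 0.9, 0.0, 0.88, 1.0, 0.95))
print(keep_mask(unstable, 0.9, 0.0, 0.88, 1.0, 0.95))
```

A mask whose scores sit far from the cutoff barely changes when the cutoff shifts, so its stability score is close to 1; a mask with many borderline pixels scores lower and is filtered out.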

class datumaro.plugins.sam_transforms.interpreters.sam_decoder_for_amg.SAMDecoderForAMGInterpreter[source]#

Bases: IModelInterpreter

Interpreter for the automatic mask generation using SAM decoder.

h_model = 1024#

w_model = 1024#

onnx_mask_input = array([[[[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]]]], dtype=float32)#

onnx_has_mask_input = array([0.], dtype=float32)#

preprocess(inp: DatasetItem) → Tuple[ndarray | Dict[str, ndarray], PrepInfo][source]#

Preprocesses a dataset item input.

Parameters:: inp – DatasetItem input
Returns:: It returns a tuple of preprocessed input and preprocessing information. The preprocessing information will be used in the postprocessing step. One use case for this would be an affine transformation of the output bounding box obtained by object detection models. Input images for those models are usually resized to fit the model input dimensions. As a result, the postprocessing step should refine the model output bounding box to match the original input image size.
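
The bounding-box use case mentioned above can be sketched as follows. The example assumes an aspect-preserving resize into the 1024 x 1024 model input and treats the resize scale as the only preprocessing information needed; the actual contents of `PrepInfo` and the resize policy used by this interpreter are not documented here, so `resize_scale` and `to_original` are hypothetical helpers.

```python
from typing import Tuple

H_MODEL, W_MODEL = 1024, 1024  # mirrors the h_model / w_model class attributes

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def resize_scale(height: int, width: int) -> float:
    """Scale that fits the image inside the model input, keeping aspect ratio."""
    return min(H_MODEL / height, W_MODEL / width)


def to_original(box: Box, scale: float) -> Box:
    """Map a box from model-input coordinates back to original image coordinates."""
    x1, y1, x2, y2 = box
    return (x1 / scale, y1 / scale, x2 / scale, y2 / scale)


# A 2048 x 512 image must be shrunk by half to fit the model input width.
scale = resize_scale(height=512, width=2048)
box = to_original((100.0, 50.0, 300.0, 150.0), scale)
print(scale, box)
```

Storing the scale at preprocessing time is what lets the postprocessing step undo the resize without access to the original image.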

postprocess(pred: Dict[str, ndarray] | List[Dict[str, ndarray]], info: PrepInfo) → List[Annotation][source]#

Postprocesses the outputs of the SAM decoder to generate masks automatically from prompts whose points are uniformly distributed on a 2D grid.

Parameters:

pred – List of dictionaries containing model predictions. Each dictionary should have the 'masks' and 'iou_preds' keys. 'masks' corresponds to the predicted masks, and 'iou_preds' corresponds to the scalar IoU prediction scores.
info – None

Returns:

List of AMGMasks produced by the SAM decoder.

get_categories()[source]#

It should be implemented if the model generates new categories.