Instance Segmentation#

Description#

An instance segmentation model detects and segments objects in an image. It extends object detection by assigning each detected object its own mask. The model outputs a list of segmented objects, each containing a mask, a bounding box, a confidence score, and a class label.

OpenVINO Model Specifications#

Inputs#

A single input image of shape (H, W, 3) where H and W are the height and width of the image, respectively.

Outputs#

The instance segmentation model outputs a list of segmented objects (i.e., list[SegmentedObject]) wrapped in InstanceSegmentationResult.segmentedObjects. Each object has the following attributes:

  • mask (numpy.ndarray) - A binary mask of the object.

  • score (float) - Confidence score of the object.

  • id (int) - Class label of the object.

  • str_label (str) - String label of the object.

  • xmin (int) - X-coordinate of the top-left corner of the bounding box.

  • ymin (int) - Y-coordinate of the top-left corner of the bounding box.

  • xmax (int) - X-coordinate of the bottom-right corner of the bounding box.

  • ymax (int) - Y-coordinate of the bottom-right corner of the bounding box.
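As a rough illustration of how these attributes relate to each other, the sketch below builds a stand-in object with the same field names and derives the box size and mask area from it. The FakeSegmentedObject dataclass and the summarize helper are hypothetical names used only for this example and are not part of model_api; the full-image mask layout is likewise an assumption made for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class FakeSegmentedObject:
    """Hypothetical stand-in mirroring the attribute names listed above."""
    mask: np.ndarray
    score: float
    id: int
    str_label: str
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def summarize(obj):
    """Return the box width, box height and number of foreground mask pixels."""
    return {
        "label": obj.str_label,
        "width": obj.xmax - obj.xmin,
        "height": obj.ymax - obj.ymin,
        "mask_area": int(np.count_nonzero(obj.mask)),
    }


# Assumed for illustration: an 8x8 full-image binary mask with a 3x2 object.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:4, 1:4] = 1  # rows 2-3, columns 1-3
example = FakeSegmentedObject(mask, 0.9, 1, "cat", xmin=1, ymin=2, xmax=4, ymax=4)
summary = summarize(example)
```

With a real model, the same attribute access would apply to each element of predictions.segmentedObjects.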

Example#

import cv2
from model_api.models import MaskRCNNModel

# Load the model
model = MaskRCNNModel.create_model("model.xml")

# Read the input image as an (H, W, 3) array; "image.jpg" is a placeholder path
image = cv2.imread("image.jpg")

# Forward pass
predictions = model(image)

# Iterate over the segmented objects
for pred_obj in predictions.segmentedObjects:
    pred_mask = pred_obj.mask
    pred_score = pred_obj.score
    label_id = pred_obj.id
    bbox = [pred_obj.xmin, pred_obj.ymin, pred_obj.xmax, pred_obj.ymax]
class model_api.models.instance_segmentation.MaskRCNNModel(inference_adapter, configuration={}, preload=False)#

Bases: ImageModel

Image model constructor

It extends the Model constructor.

Parameters:
  • inference_adapter (InferenceAdapter) – allows working with the specified executor

  • configuration (dict, optional) – values for parameters accepted by the specific wrapper (confidence_threshold, labels, etc.), which are set as data attributes

  • preload (bool, optional) – a flag indicating whether the model is loaded to the device during initialization. If preload=False, the model must be loaded via the load method before inference

Raises:

WrapperError – if the wrapper failed to define appropriate inputs for images

classmethod parameters()#

Defines the description and type of configurable data parameters for the wrapper.

See types.py to find available types of the data parameter. For each parameter the type, default value and description must be provided.

An example of a possible data parameter:

'confidence_threshold': NumericalValue(
    default_value=0.5, description="Threshold value for detection box confidence"
)

The method must be implemented in each specific inherited wrapper.

Return type:

dict

Returns:

  • the dictionary with defined wrapper data parameters
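The pattern described above can be sketched as follows. This is a simplified illustration, not the model_api implementation: NumericalValue here is a hypothetical stand-in class, and BaseWrapper and DetectionWrapper are made-up names. It only shows how a parameters() classmethod can declare defaults that a configuration dict then overrides and exposes as data attributes.

```python
class NumericalValue:
    """Hypothetical stand-in for the NumericalValue type mentioned above."""

    def __init__(self, default_value, description):
        self.default_value = default_value
        self.description = description


class BaseWrapper:
    """Made-up base class: applies declared defaults, then user overrides."""

    @classmethod
    def parameters(cls):
        return {}

    def __init__(self, configuration=None):
        declared = self.parameters()
        # Start from the declared default values.
        for name, param in declared.items():
            setattr(self, name, param.default_value)
        # Apply overrides, rejecting names that were never declared.
        for name, value in (configuration or {}).items():
            if name not in declared:
                raise ValueError(f"Unknown parameter: {name}")
            setattr(self, name, value)


class DetectionWrapper(BaseWrapper):
    """Made-up subclass that extends the inherited parameter dictionary."""

    @classmethod
    def parameters(cls):
        params = super().parameters()
        params.update({
            "confidence_threshold": NumericalValue(
                default_value=0.5,
                description="Threshold value for detection box confidence",
            ),
        })
        return params


default_wrapper = DetectionWrapper()
custom_wrapper = DetectionWrapper({"confidence_threshold": 0.7})
```

Calling super().parameters() first lets each subclass add to, rather than replace, the parameters declared by its parents.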

postprocess(outputs, meta)#

Interface for postprocess method.

Parameters:
  • outputs (dict) – model raw output in the following format:

    {
        'output_layer_name_1': raw_result_1,
        'output_layer_name_2': raw_result_2,
        …
    }

  • meta (dict) – the input metadata obtained from preprocess method

Return type:

InstanceSegmentationResult

Returns:

  • postprocessed data in the format defined by wrapper
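To make the roles of outputs and meta more concrete, here is a minimal sketch of one possible postprocessing step: it filters raw detections by score and maps box coordinates from the resized image back to the original one. The output names "boxes", "scores" and "labels", the postprocess_sketch function, and the plain stretch resize (no letterboxing) are all assumptions for illustration; the actual wrapper returns an InstanceSegmentationResult and also handles masks.

```python
import numpy as np


def postprocess_sketch(outputs, meta, confidence_threshold=0.5):
    """Filter raw detections by score and rescale boxes to the original image."""
    boxes = np.asarray(outputs["boxes"], dtype=np.float32)    # (N, 4): xmin, ymin, xmax, ymax
    scores = np.asarray(outputs["scores"], dtype=np.float32)  # (N,)
    labels = np.asarray(outputs["labels"])                    # (N,)

    orig_h, orig_w = meta["original_shape"][:2]
    res_h, res_w = meta["resized_shape"][:2]
    # Per-coordinate scale factors in (x, y, x, y) order.
    scale = np.array([orig_w / res_w, orig_h / res_h] * 2, dtype=np.float32)

    keep = scores > confidence_threshold
    scaled = np.rint(boxes[keep] * scale).astype(int)
    return [
        {"xmin": int(b[0]), "ymin": int(b[1]), "xmax": int(b[2]), "ymax": int(b[3]),
         "score": float(s), "id": int(l)}
        for b, s, l in zip(scaled, scores[keep], labels[keep])
    ]


# Hypothetical raw outputs for a 100x100 model input and a 200x400 original image.
raw_outputs = {
    "boxes": [[10, 20, 50, 60], [0, 0, 5, 5]],
    "scores": [0.9, 0.2],
    "labels": [1, 2],
}
meta = {"original_shape": (200, 400, 3), "resized_shape": (100, 100, 3)}
objects = postprocess_sketch(raw_outputs, meta)
```

Only the first detection passes the threshold here, and its box is stretched by 4 horizontally and 2 vertically.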

preprocess(inputs)#

Data preprocess method

It performs basic preprocessing of a single image:
  • Resizes the image to fit the model input size via the defined resize type

  • Normalizes the image: subtracts means, divides by scales, switches channels BGR-RGB

  • Changes the image layout according to the model input layout

It also stores the sizes of the original and resized images as original_shape and resized_shape in the metadata dictionary.
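The steps above can be sketched with plain NumPy as follows. This is an illustrative approximation, not the library code: preprocess_sketch, the nearest-neighbour resize, the NCHW target layout, the input name "image", and the choice to swap channels before normalizing (so means and scales are given in RGB order) are all assumptions.

```python
import numpy as np


def preprocess_sketch(image, input_name="image", target_hw=(4, 4),
                      mean=(0.0, 0.0, 0.0), scale=(255.0, 255.0, 255.0)):
    """Resize, swap BGR to RGB, normalize, and convert HWC to NCHW."""
    orig_h, orig_w = image.shape[:2]
    new_h, new_w = target_hw

    # Nearest-neighbour resize via index arithmetic
    # (a stand-in for the configured resize type).
    rows = np.arange(new_h) * orig_h // new_h
    cols = np.arange(new_w) * orig_w // new_w
    resized = image[rows[:, None], cols[None, :]]

    # Keep both shapes so postprocessing can map results back.
    meta = {"original_shape": image.shape, "resized_shape": resized.shape}

    rgb = resized[..., ::-1].astype(np.float32)  # BGR -> RGB
    normalized = (rgb - np.asarray(mean, np.float32)) / np.asarray(scale, np.float32)
    nchw = normalized.transpose(2, 0, 1)[None]   # HWC -> 1xCxHxW
    return {input_name: nchw}, meta


# A synthetic 8x6 BGR image that is pure blue.
bgr_image = np.zeros((8, 6, 3), dtype=np.uint8)
bgr_image[..., 0] = 255
inputs, meta = preprocess_sketch(bgr_image)
```

After the channel swap, the blue values end up in the last channel of the returned tensor, and meta carries the two shapes described above.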

Note

It supports only models with single image input. If the model has more image inputs or has additional supported inputs, the preprocess should be overloaded in a specific wrapper.

Parameters:

inputs (ndarray) – a single image as a 3D array in HWC layout

Returns:

  • the preprocessed image in the following format:

    {
        'input_layer_name': preprocessed_image
    }

  • the input metadata, which might be used in postprocess method