Instance Segmentation#
Description#
Instance segmentation models detect and segment objects in an image. Instance segmentation extends object detection by predicting a separate mask for each detected object. The model outputs a list of segmented objects, each containing a mask, bounding box, confidence score, and class label.
OpenVINO Model Specifications#
Inputs#
A single input image of shape (H, W, 3) where H and W are the height and width of the image, respectively.
Outputs#
The instance segmentation model outputs a list of segmented objects (i.e. list[SegmentedObject]) wrapped in InstanceSegmentationResult.segmentedObjects, each containing the following attributes:
- mask (numpy.ndarray) - A binary mask of the object.
- score (float) - Confidence score of the object.
- id (int) - Class label of the object.
- str_label (str) - String label of the object.
- xmin (int) - X-coordinate of the top-left corner of the bounding box.
- ymin (int) - Y-coordinate of the top-left corner of the bounding box.
- xmax (int) - X-coordinate of the bottom-right corner of the bounding box.
- ymax (int) - Y-coordinate of the bottom-right corner of the bounding box.
Example#
import cv2
from model_api.models import MaskRCNNModel

# Load the model
model = MaskRCNNModel.create_model("model.xml")

# Read the input image (BGR, HWC layout)
image = cv2.imread("image.jpg")

# Forward pass
predictions = model(image)

# Iterate over the segmented objects
for pred_obj in predictions.segmentedObjects:
    pred_mask = pred_obj.mask
    pred_score = pred_obj.score
    label_id = pred_obj.id
    bbox = [pred_obj.xmin, pred_obj.ymin, pred_obj.xmax, pred_obj.ymax]
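As a minimal sketch of how the per-object attributes can be consumed, the helper below blends a binary instance mask onto the input image with NumPy. The overlay_mask function and its color/alpha defaults are our own illustration, not part of Model API:

```python
import numpy as np

def overlay_mask(image, mask, color=(0, 255, 0), alpha=0.5):
    """Blend a binary instance mask onto an HWC uint8 image.

    `mask` plays the role of SegmentedObject.mask from the result above;
    `color` and `alpha` are illustrative choices, not Model API defaults.
    """
    sel = mask.astype(bool)
    colored = np.zeros_like(image)
    colored[sel] = color                      # paint the mask region in `color`
    out = image.copy()                        # leave the input image untouched
    out[sel] = ((1 - alpha) * image[sel] + alpha * colored[sel]).astype(image.dtype)
    return out
```

Calling overlay_mask(image, pred_obj.mask) for each object in predictions.segmentedObjects yields a simple visualization of the result.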
- class model_api.models.instance_segmentation.MaskRCNNModel(inference_adapter, configuration={}, preload=False)#
Bases:
ImageModel
Image model constructor
It extends the Model constructor.
- Parameters:
inference_adapter (InferenceAdapter) – allows working with the specified executor
configuration (dict, optional) – a dictionary of values for parameters accepted by the specific wrapper (confidence_threshold, labels, etc.), which are set as data attributes
preload (bool, optional) – whether the model is loaded onto the device during initialization. If preload=False, the model must be loaded via the load method before inference
- Raises:
WrapperError – if the wrapper failed to define appropriate inputs for images
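For instance, wrapper parameters such as confidence_threshold can be supplied through the configuration dictionary. The value below is illustrative, not a default:

```python
# Illustrative configuration; keys are set as data attributes on the wrapper.
configuration = {
    "confidence_threshold": 0.6,  # drop detections scored below 0.6
}

# The model would then be created with, e.g.:
# model = MaskRCNNModel.create_model("model.xml", configuration=configuration)
```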
- classmethod parameters()#
Defines the description and type of configurable data parameters for the wrapper.
See types.py for the available types of data parameters. For each parameter, the type, default value, and description must be provided.
- An example of a possible data parameter:
- 'confidence_threshold': NumericalValue(
default_value=0.5, description="Threshold value for detection box confidence"
)
The method must be implemented in each specific inherited wrapper.
- Return type:
dict
- Returns:
the dictionary with defined wrapper data parameters
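The plain dictionary below only approximates the shape of what parameters() returns; the real implementation wraps each entry in a typed value object from types.py (e.g. NumericalValue), as in the example above:

```python
# Illustrative approximation of the parameters() return value; the real
# wrapper returns typed value objects (NumericalValue, etc.), not dicts.
params_sketch = {
    "confidence_threshold": {
        "default_value": 0.5,
        "description": "Threshold value for detection box confidence",
    },
}
```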
- postprocess(outputs, meta)#
Interface for postprocess method.
- Parameters:
outputs (dict) –
model raw output in the following format:
{'output_layer_name_1': raw_result_1, 'output_layer_name_2': raw_result_2, ...}
meta (dict) – the input metadata obtained from preprocess method
- Return type:
InstanceSegmentationResult
- Returns:
postprocessed data in the format defined by wrapper
- preprocess(inputs)#
Data preprocess method
- It performs basic preprocessing of a single image:
Resizes the image to fit the model input size via the defined resize type
Normalizes the image: subtracts the mean, divides by the scale, swaps channels BGR -> RGB
Changes the image layout according to the model input layout
It also stores the sizes of the original and resized images as original_shape and resized_shape in the metadata dictionary.
Note
It supports only models with a single image input. If the model has multiple image inputs or additional supported inputs, preprocess should be overridden in the specific wrapper.
- Parameters:
inputs (ndarray) – a single image as 3D array in HWC layout
- Returns:
- the preprocessed image in the following format:
{'input_layer_name': preprocessed_image}
- the input metadata, which might be used in the postprocess method
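The normalization and layout steps described above can be sketched in plain NumPy. Resizing is omitted for brevity, and input_layer_name, mean, and scale are placeholders here; the real wrapper derives them from the model:

```python
import numpy as np

def preprocess_sketch(image, mean=0.0, scale=1.0):
    """Sketch of the preprocessing steps described above (no resize):
    BGR -> RGB swap, mean/scale normalization, HWC -> NCHW layout."""
    meta = {"original_shape": image.shape, "resized_shape": image.shape}
    x = image[..., ::-1].astype(np.float32)          # swap channels BGR -> RGB
    x = (x - mean) / scale                           # subtract mean, divide by scale
    x = np.transpose(x, (2, 0, 1))[np.newaxis, ...]  # HWC -> NCHW with batch dim
    return {"input_layer_name": x}, meta
```

The returned pair mirrors the contract above: a dictionary keyed by the input layer name, plus the metadata consumed by postprocess.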