datumaro.components.filter#

Classes

`DatasetItemEncoder`()
`UserFunctionAnnotationsFilter`(extractor, ...)	Filter annotations using a user-provided Python function.
`UserFunctionDatasetFilter`(extractor, filter_func)	Filter dataset items using a user-provided Python function.
`XPathAnnotationsFilter`(extractor, xpath[, ...])
`XPathDatasetFilter`(extractor, xpath)

class datumaro.components.filter.XPathDatasetFilter(extractor: IDataset, xpath: str)[source]#

Bases: ItemTransform

transform_item(item: DatasetItem) → DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.XPathAnnotationsFilter(extractor: IDataset, xpath: str, remove_empty: bool = False)[source]#

Bases: ItemTransform

transform_item(item: DatasetItem) → DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.UserFunctionDatasetFilter(extractor: IDataset, filter_func: Callable[[DatasetItem], bool])[source]#

Bases: ItemTransform

Filter dataset items using a user-provided Python function.

Parameters:

extractor – Datumaro Dataset to filter.
filter_func – A Python callable that takes a DatasetItem as its input and returns a boolean. If the return value is True, that DatasetItem will be retained. Otherwise, it is removed.

Example

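This example keeps only items whose image height and width are both at most 1024 pixels, so items with larger images are removed:

    from datumaro.components.dataset_base import DatasetItem
    from datumaro.components.filter import UserFunctionDatasetFilter
    from datumaro.components.media import Image

    def filter_func(item: DatasetItem) -> bool:
        # Image.size is (H, W); returning True retains the item
        h, w = item.media_as(Image).size
        return h <= 1024 and w <= 1024

    # dataset is an existing Datumaro dataset
    filtered = UserFunctionDatasetFilter(
        extractor=dataset, filter_func=filter_func)

    # No items with an image height or width greater than 1024
    filtered_items = [item for item in filtered]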

transform_item(item: DatasetItem) → DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.UserFunctionAnnotationsFilter(extractor: IDataset, filter_func: Callable[[DatasetItem, Annotation], bool], remove_empty: bool = False)[source]#

Bases: ItemTransform

Filter annotations using a user-provided Python function.

Parameters:

extractor – Datumaro Dataset to filter.
filter_func – A Python callable that takes DatasetItem and Annotation as its inputs and returns a boolean. If the return value is True, the Annotation will be retained. Otherwise, it is removed.
remove_empty – If True, a DatasetItem left without any annotations after filtering is removed from the dataset. Otherwise, the DatasetItem is kept even if all of its annotations were removed.

Example

This example removes bounding boxes that cover 50% or more of the image area and keeps all other annotations:

    from datumaro.components.annotation import Annotation, Bbox
    from datumaro.components.dataset_base import DatasetItem
    from datumaro.components.filter import UserFunctionAnnotationsFilter
    from datumaro.components.media import Image

    def filter_func(item: DatasetItem, ann: Annotation) -> bool:
        # Annotations that are not Bboxes are retained unchanged
        if not isinstance(ann, Bbox):
            return True

        h, w = item.media_as(Image).size
        image_size = h * w
        bbox_size = ann.h * ann.w

        # Retain Bboxes smaller than 50% of the image size
        return bbox_size < 0.5 * image_size

    # dataset is an existing Datumaro dataset
    filtered = UserFunctionAnnotationsFilter(
        extractor=dataset, filter_func=filter_func)

    # No bounding boxes covering 50% or more of their image remain
    filtered_items = [item for item in filtered]
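The interaction between filter_func and remove_empty can be sketched without Datumaro installed. The snippet below is only an illustration of the documented semantics: FakeItem and filter_annotations are hypothetical stand-ins, not Datumaro classes or functions, and annotations are plain strings.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class FakeItem:
    """Hypothetical stand-in for DatasetItem (not the Datumaro class)."""

    id: str
    annotations: List[str] = field(default_factory=list)


def filter_annotations(
    item: FakeItem,
    filter_func: Callable[[FakeItem, str], bool],
    remove_empty: bool = False,
) -> Optional[FakeItem]:
    # Keep only the annotations for which the predicate returns True.
    kept = [ann for ann in item.annotations if filter_func(item, ann)]
    # With remove_empty=True, an item left without annotations is dropped.
    if remove_empty and not kept:
        return None
    # Return a modified copy instead of mutating the input item.
    return replace(item, annotations=kept)


def keep_cats(_item: FakeItem, ann: str) -> bool:
    return ann == "cat"


def keep_nothing(_item: FakeItem, _ann: str) -> bool:
    return False


item = FakeItem(id="a", annotations=["cat", "dog"])
print(filter_annotations(item, keep_cats))
print(filter_annotations(item, keep_nothing))
print(filter_annotations(item, keep_nothing, remove_empty=True))
```

Returning None to signal removal mirrors the `DatasetItem | None` return type of transform_item.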

transform_item(item: DatasetItem) → DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.Annotation(*, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: object

A base annotation class.

Derived classes must define the ‘_type’ class variable with a value from the AnnotationType enum.

Method generated by attrs for class Annotation.

id: int#

attributes: Dict[str, Any]#

group: int#

object_id: int#

property type: AnnotationType#

as_dict() → Dict[str, Any][source]#: Returns a dictionary { field_name: value }

wrap(**kwargs)[source]#: Returns a modified copy of the object

class datumaro.components.filter.AnnotationType(value)[source]#

Bases: IntEnum

An enumeration.

unknown = 0#

label = 1#

mask = 2#

points = 3#

polygon = 4#

polyline = 5#

bbox = 6#

caption = 7#

cuboid_3d = 8#

super_resolution_annotation = 9#

depth_annotation = 10#

ellipse = 11#

hash_key = 12#

feature_vector = 13#

tabular = 14#

rotated_bbox = 15#

cuboid_2d = 16#
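Because AnnotationType derives from IntEnum, its members behave like integers and can be looked up by value or by name. The sketch below mirrors three of the members listed above in a hypothetical AnnotationTypeSketch class so that it runs without Datumaro; it is not the library's own enum.

```python
from enum import IntEnum


class AnnotationTypeSketch(IntEnum):
    """Hypothetical mirror of a few AnnotationType members listed above."""

    label = 1
    mask = 2
    bbox = 6


# Lookup by value and lookup by name return the same member.
by_value = AnnotationTypeSketch(6)
by_name = AnnotationTypeSketch["bbox"]

# IntEnum members compare equal to plain integers and sort like them.
is_int_like = AnnotationTypeSketch.mask == 2
ordered = sorted(AnnotationTypeSketch, reverse=True)

print(by_value.name, int(by_value), by_value is by_name, is_int_like)
```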

class datumaro.components.filter.Bbox(x, y, w, h, *args, **kwargs)[source]#

Bases: Shape

Bbox annotation class. This class represents a bounding box defined by its top-left corner (x, y) and its width and height (w, h).

_type#

The type of annotation, set to AnnotationType.bbox.

Type: AnnotationType

__init__()[source]#: Initializes the Bbox with its coordinates and dimensions.

x()#: Property to get the x-coordinate of the bounding box.

y()#: Property to get the y-coordinate of the bounding box.

w()#: Property to get the width of the bounding box.

h()#: Property to get the height of the bounding box.

get_area()[source]#: Calculates the area of the bounding box.

get_bbox()[source]#: Returns the bounding box coordinates and dimensions.

as_polygon()[source]#: Returns the bounding box as a list of points forming a polygon.

iou()[source]#: Calculates the Intersection over Union (IoU) with another shape.

wrap()[source]#: Creates a new Bbox instance with updated attributes.

Initialize the Bbox with its top-left corner (x, y) and its width and height (w, h).

Parameters:

x (float) – The x-coordinate of the top-left corner.
y (float) – The y-coordinate of the top-left corner.
w (float) – The width of the bounding box.
h (float) – The height of the bounding box.

property x#

Get the x-coordinate of the top-left corner of the bounding box.

Returns: The x-coordinate of the bounding box.
Return type: float

property y#

Get the y-coordinate of the top-left corner of the bounding box.

Returns: The y-coordinate of the bounding box.
Return type: float

property w#

Get the width of the bounding box.

Returns: The width of the bounding box.
Return type: float

property h#

Get the height of the bounding box.

Returns: The height of the bounding box.
Return type: float

get_area()[source]#

Calculate the area of the bounding box.

Returns: The area of the bounding box.
Return type: float

get_bbox()[source]#

Get the bounding box coordinates and dimensions.

Returns: The bounding box as [x, y, w, h].
Return type: List[float]

as_polygon() → List[float][source]#

Convert the bounding box into a polygon representation.

Returns: The bounding box as a polygon.
Return type: List[float]

iou(other: Shape) → float | ~typing.Literal[-1][source]#

Calculate the Intersection over Union (IoU) with another shape.

Parameters: other (Shape) – The other shape to compare with.
Returns: The IoU value or -1 if not applicable.
Return type: Union[float, Literal[-1]]
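The geometry behind get_area(), as_polygon() and iou() can be sketched on plain [x, y, w, h] lists. This is a minimal illustration, not Datumaro's implementation: the helper names are hypothetical, the corner order of the polygon is an assumption, and treating "not applicable" as "the boxes do not overlap" is also an assumption.

```python
from typing import List, Sequence, Union


def bbox_area(box: Sequence[float]) -> float:
    # Area of an axis-aligned box is width times height.
    _x, _y, w, h = box
    return w * h


def bbox_as_polygon(box: Sequence[float]) -> List[float]:
    # Flat [x0, y0, x1, y1, ...] list of the four corners,
    # starting at the top-left corner (corner order is an assumption).
    x, y, w, h = box
    return [x, y, x + w, y, x + w, y + h, x, y + h]


def bbox_iou(a: Sequence[float], b: Sequence[float]) -> Union[float, int]:
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping rectangle, clamped at zero.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    if intersection == 0:
        # Assumption: -1 stands for "no overlap" in this sketch.
        return -1
    union = aw * ah + bw * bh - intersection
    return intersection / union


box_a = [0, 0, 10, 10]
box_b = [5, 5, 10, 10]
print(bbox_area(box_a), bbox_as_polygon(box_a), bbox_iou(box_a, box_b))
```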

wrap(**kwargs)[source]#

Create a new Bbox instance with updated attributes.

Parameters:

item (Bbox) – The original Bbox instance.
kwargs – Additional attributes to update.

Returns:

A new Bbox instance with updated attributes.

Return type:

Bbox

class datumaro.components.filter.Caption(caption, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: Annotation

Represents arbitrary text annotations.

Method generated by attrs for class Caption.

caption: str#

class datumaro.components.filter.DatasetItemEncoder[source]#

Bases: object

classmethod encode(item: DatasetItem, categories: CategoriesInfo | None = None) → ET.ElementBase[source]#

classmethod encode_image(image: Image) → ElementBase[source]#

classmethod encode_annotation_base(annotation: Annotation) → ElementBase[source]#

classmethod encode_label_object(obj: Label, categories: CategoriesInfo | None) → ET.ElementBase[source]#

classmethod encode_mask_object(obj: Mask, categories: CategoriesInfo | None) → ET.ElementBase[source]#

classmethod encode_bbox_object(obj: Bbox, categories: CategoriesInfo | None) → ET.ElementBase[source]#

classmethod encode_points_object(obj: Points, categories: CategoriesInfo | None) → ET.ElementBase[source]#

classmethod encode_polygon_object(obj: Polygon, categories: CategoriesInfo | None) → ET.ElementBase[source]#

classmethod encode_polyline_object(obj: PolyLine, categories: CategoriesInfo | None) → ET.ElementBase[source]#

classmethod encode_caption_object(obj: Caption) → ElementBase[source]#

classmethod encode_ellipse_object(obj: Ellipse, categories: CategoriesInfo | None) → ET.ElementBase[source]#

classmethod encode_annotation(o: Annotation, categories: CategoriesInfo | None = None) → ET.ElementBase[source]#

static to_string(encoded_item: ElementBase) → str[source]#

class datumaro.components.filter.Ellipse(x1: float, y1: float, x2: float, y2: float, *args, **kwargs)[source]#

Bases: Shape

Ellipse represents an ellipse that is encapsulated by a rectangle.

x1 and y1 represent the top-left coordinate of the encapsulating rectangle
x2 and y2 represent the bottom-right coordinate of the encapsulating rectangle

Parameters:

x1 (float) – left x coordinate of encapsulating rectangle
y1 (float) – top y coordinate of encapsulating rectangle
x2 (float) – right x coordinate of encapsulating rectangle
y2 (float) – bottom y coordinate of encapsulating rectangle

Method generated by attrs for class Shape.

property x1#

property y1#

property x2#

property y2#

property w#

property h#

property c_x#

property c_y#

get_area()[source]#: Calculate the area of the shape.

get_bbox()[source]#

Calculate and return the bounding box of the shape.

Returns: The bounding box as [x, y, w, h].
Return type: Tuple[float, float, float, float]

get_points(num_points: int = 720) → List[Tuple[float, float]][source]#

Return points as a list of tuples, e.g. [(x0, y0), (x1, y1), …].

Parameters: num_points (int) – The number of boundary points of the ellipse. Defaults to 720 per the signature above, which corresponds to about one point for every 0.5 degree of interior angle.

as_polygon(num_points: int = 720) → List[float][source]#

Return a polygon as a flat list of coordinates, e.g. [x0, y0, x1, y1, …].

Parameters: num_points (int) – The number of boundary points of the ellipse. Defaults to 720 per the signature above, which corresponds to about one point for every 0.5 degree of interior angle.
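The relationship between the encapsulating rectangle, the boundary points and the flat polygon list can be sketched with the standard parametric form of an ellipse. The functions below are hypothetical helpers, not Datumaro code; the even angular spacing and the choice of starting angle are assumptions.

```python
import math
from typing import List, Tuple


def ellipse_points(
    x1: float, y1: float, x2: float, y2: float, num_points: int = 720
) -> List[Tuple[float, float]]:
    # Center and semi-axes derived from the encapsulating rectangle.
    c_x, c_y = (x1 + x2) / 2, (y1 + y2) / 2
    r_x, r_y = (x2 - x1) / 2, (y2 - y1) / 2
    points = []
    for i in range(num_points):
        # Evenly spaced angles over one full turn (an assumption).
        theta = 2 * math.pi * i / num_points
        points.append((c_x + r_x * math.cos(theta), c_y + r_y * math.sin(theta)))
    return points


def flatten(points: List[Tuple[float, float]]) -> List[float]:
    # [(x0, y0), (x1, y1), ...] -> [x0, y0, x1, y1, ...]
    return [coord for point in points for coord in point]


pts = ellipse_points(0, 0, 4, 2)
flat = flatten(pts)
print(len(pts), len(flat), pts[0])
```

The tuple form corresponds to the get_points() description and the flat form to the as_polygon() description.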

iou(other: Shape) → float | ~typing.Literal[-1][source]#

wrap(**kwargs) → Ellipse[source]#: Returns a modified copy of the object

class datumaro.components.filter.HashKey(hash_key: ndarray, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: Annotation

Method generated by attrs for class HashKey.

hash_key: ndarray#

class datumaro.components.filter.Image(size: Tuple[int, int] | None = None, ext: str | None = None, *args, **kwargs)[source]#

Bases: MediaElement[ndarray]

classmethod from_file(path: str, *args, **kwargs)[source]#

classmethod from_numpy(data: ndarray | Callable[[], ndarray], *args, **kwargs)[source]#

classmethod from_bytes(data: bytes | Callable[[], bytes], *args, **kwargs)[source]#

property has_size: bool#: Indicates that size info is cached and won’t require image loading

property size: Tuple[int, int] | None#: Returns (H, W)

property ext: str | None#: Media file extension (with the leading dot)

set_crypter(crypter: Crypter)[source]#

class datumaro.components.filter.ItemTransform(extractor: IDataset)[source]#

Bases: Transform

transform_item(item: DatasetItem) → DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.Label(label, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: Annotation

Method generated by attrs for class Label.

label: int#

class datumaro.components.filter.Mask(image: ndarray | Callable[[], ndarray], *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Annotation

Represents a 2d single-instance binary segmentation mask.

Method generated by attrs for class Mask.

label: int | None#

z_order: int#

property image: ndarray#

as_class_mask(label_id: int | None = None, ignore_index: int = 0, dtype: dtype | None = None) → ndarray[source]#

Produces a class index mask based on the binary mask.

Parameters:

label_id – Scalar value to represent the class index of the mask. If not specified, self.label will be used. Defaults to None.
ignore_index – Scalar value to fill in the zeros in the binary mask. Defaults to 0.
dtype – Data type for the resulting mask. If not specified, it will be inferred from the provided label_id to hold its value. For example, if label_id=255, the inferred dtype will be np.uint8. Defaults to None.

Returns:

Class index mask generated from the binary mask.

Return type:

IndexMaskImage

as_instance_mask(instance_id: int, ignore_index: int = 0, dtype: dtype | None = None) → ndarray[source]#

Produces an instance index mask based on the binary mask.

Parameters:

instance_id – Scalar value to represent the instance id.
ignore_index – Scalar value to fill in the zeros in the binary mask. Defaults to 0.
dtype – Data type for the resulting mask. If not specified, it will be inferred from the provided instance_id to hold its value. For example, if instance_id=255, the inferred dtype will be np.uint8. Defaults to None.

Returns:

Instance index mask generated from the binary mask.

Return type:

IndexMaskImage
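Both as_class_mask() and as_instance_mask() map a binary mask to an index mask: nonzero pixels take the given index and zero pixels take ignore_index. The sketch below shows that mapping on nested Python lists standing in for ndarrays. The index_mask helper is hypothetical, and dtype inference is left out.

```python
from typing import List


def index_mask(
    binary_mask: List[List[int]], index: int, ignore_index: int = 0
) -> List[List[int]]:
    # Nonzero pixels become `index`; zero pixels become `ignore_index`.
    return [
        [index if pixel else ignore_index for pixel in row]
        for row in binary_mask
    ]


binary = [
    [0, 1, 1],
    [0, 0, 1],
]
class_mask = index_mask(binary, index=3, ignore_index=255)
instance_mask = index_mask(binary, index=7)
print(class_mask)
print(instance_mask)
```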

get_area() → int[source]#

get_bbox() → Tuple[int, int, int, int][source]#

Computes the bounding box of the mask.

Returns: [x, y, w, h]

paint(colormap: Dict[int, Tuple[int, int, int]]) → ndarray[source]#: Applies a colormap to the mask and produces the resulting image.

class datumaro.components.filter.Points(points, visibility: List[IntEnum] | None = None, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Shape

Represents an ordered set of points.

_type#

The type of annotation, set to AnnotationType.points.

Type: AnnotationType

visibility#

A list indicating the visibility status of each point.

Type: List[IntEnum]

Nested Class:

Visibility (IntEnum): Enum representing the visibility state of points. It has three states:

absent: Point is absent (0).
hidden: Point is hidden (1).
visible: Point is visible (2).

__attrs_post_init__()[source]#: Validates that the number of points is even.

get_area()[source]#: Returns the area covered by the points, always zero.

get_bbox()[source]#: Returns the bounding box containing all visible or hidden points.

Method generated by attrs for class Points.

class Visibility(value)[source]#

Bases: IntEnum

Enum representing the visibility state of points.

absent#

Point is absent (0).

Type: int

hidden#

Point is hidden (1).

Type: int

visible#

Point is visible (2).

Type: int

absent = 0#

hidden = 1#

visible = 2#

visibility: List[IntEnum]#

get_area()[source]#

Returns the area covered by the points.

Returns: Always returns 0.
Return type: int

get_bbox()[source]#

Returns the bounding box containing all visible or hidden points.

Returns: The bounding box as [x0, y0, width, height].
Return type: List[float]
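A bounding box over "all visible or hidden points" amounts to skipping points marked absent and taking the extent of the rest. The sketch below works on a flat [x0, y0, x1, y1, ...] list; VisibilitySketch and points_bbox are hypothetical names, and returning zeros when every point is absent is an assumption.

```python
from enum import IntEnum
from typing import List, Sequence


class VisibilitySketch(IntEnum):
    """Hypothetical mirror of Points.Visibility."""

    absent = 0
    hidden = 1
    visible = 2


def points_bbox(points: Sequence[float], visibility: Sequence[int]) -> List[float]:
    # Pair up x/y coordinates with their visibility and drop absent points.
    xs = [x for x, v in zip(points[0::2], visibility) if v != VisibilitySketch.absent]
    ys = [y for y, v in zip(points[1::2], visibility) if v != VisibilitySketch.absent]
    # Assumption: fall back to zeros when no point is visible or hidden.
    x0, x1 = min(xs, default=0), max(xs, default=0)
    y0, y1 = min(ys, default=0), max(ys, default=0)
    return [x0, y0, x1 - x0, y1 - y0]


coords = [1, 2, 5, 8, 100, 100]
states = [VisibilitySketch.visible, VisibilitySketch.hidden, VisibilitySketch.absent]
print(points_bbox(coords, states))
```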

class datumaro.components.filter.PolyLine(points, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Shape

PolyLine annotation class. This class represents a polyline shape, which is a series of connected line segments.

_type#

The type of annotation, set to AnnotationType.polyline.

Type: AnnotationType

as_polygon()[source]#: Returns the points of the polyline as a polygon.

get_area()[source]#: Returns the area of the polyline, which is always 0.

Method generated by attrs for class PolyLine.

as_polygon()[source]#: Convert the shape into a polygon representation.

get_area()[source]#: Calculate the area of the shape.

class datumaro.components.filter.Polygon(points, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Shape

Polygon annotation class. This class represents a polygon shape defined by a series of points.

_type#

The type of annotation, set to AnnotationType.polygon.

Type: AnnotationType

__attrs_post_init__()[source]#: Validates the points to ensure they form a valid polygon.

get_area()[source]#: Calculates the area of the polygon using the shoelace formula.

as_polygon()[source]#: Returns the points of the polygon.

__eq__()[source]#: Compares this polygon with another for equality.

_get_shoelace_area()[source]#: Helper method to calculate the area of the polygon using the shoelace formula.

Method generated by attrs for class Polygon.

get_area()[source]#

Calculate the area of the polygon using the shoelace formula.

Returns: The area of the polygon.
Return type: float
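The shoelace formula sums the cross products of consecutive vertices and halves the absolute value of the result. The sketch below applies it to a flat [x0, y0, x1, y1, ...] list. The shoelace_area function is a hypothetical helper that illustrates the formula and is not the library's _get_shoelace_area().

```python
from typing import Sequence


def shoelace_area(points: Sequence[float]) -> float:
    xs, ys = points[0::2], points[1::2]
    n = len(xs)
    # Sum x_i * y_{i+1} - x_{i+1} * y_i over all edges, wrapping at the end.
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    # The sign depends on vertex orientation, so take the absolute value.
    return abs(twice_area) / 2


rectangle = [0, 0, 4, 0, 4, 3, 0, 3]
triangle = [0, 0, 4, 0, 0, 3]
print(shoelace_area(rectangle), shoelace_area(triangle))
```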

as_polygon() → List[float][source]#

Return the points of the polygon.

Returns: The points of the polygon.
Return type: List[float]