datumaro.components.filter#

Classes

DatasetItemEncoder()

UserFunctionAnnotationsFilter(extractor, ...)

Filter annotations using a user-provided Python function.

UserFunctionDatasetFilter(extractor, filter_func)

Filter dataset items using a user-provided Python function.

XPathAnnotationsFilter(extractor, xpath[, ...])

XPathDatasetFilter(extractor, xpath)

class datumaro.components.filter.XPathDatasetFilter(extractor: IDataset, xpath: str)[source]#

Bases: ItemTransform

transform_item(item: DatasetItem) DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.XPathAnnotationsFilter(extractor: IDataset, xpath: str, remove_empty: bool = False)[source]#

Bases: ItemTransform

transform_item(item: DatasetItem) DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.UserFunctionDatasetFilter(extractor: IDataset, filter_func: Callable[[DatasetItem], bool])[source]#

Bases: ItemTransform

Filter dataset items using a user-provided Python function.

Parameters:
  • extractor – Datumaro Dataset to filter.

  • filter_func – A Python callable that takes a DatasetItem as its input and returns a boolean. If the return value is True, that DatasetItem will be retained. Otherwise, it is removed.

Example

This is an example of filtering dataset items with images larger than 1024 pixels:

from datumaro.components.media import Image

def filter_func(item: DatasetItem) -> bool:

h, w = item.media_as(Image).size return h > 1024 or w > 1024

filtered = UserFunctionDatasetFilter(

extractor=dataset, filter_func=filter_func)

# No items with an image height or width greater than 1024 filtered_items = [item for item in filtered]

transform_item(item: DatasetItem) DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.UserFunctionAnnotationsFilter(extractor: IDataset, filter_func: Callable[[DatasetItem, Annotation], bool], remove_empty: bool = False)[source]#

Bases: ItemTransform

Filter annotations using a user-provided Python function.

Parameters:
  • extractor – Datumaro Dataset to filter.

  • filter_func – A Python callable that takes DatasetItem and Annotation as its inputs and returns a boolean. If the return value is True, the Annotation will be retained. Otherwise, it is removed.

  • remove_empty – If True, DatasetItem without any annotations is removed after filtering its annotations. Otherwise, do not filter DatasetItem.

Example

This is an example of removing bounding boxes sized greater than 50% of the image size:

from datumaro.components.media import Image from datumaro.components.annotation import Annotation, Bbox

def filter_func(item: DatasetItem, ann: Annotation) -> bool:

# If the annotation is not a Bbox, do not filter if not isinstance(ann, Bbox):

return False

h, w = item.media_as(Image).size image_size = h * w bbox_size = ann.h * ann.w

# Accept Bboxes smaller than 50% of the image size return bbox_size < 0.5 * image_size

filtered = UserFunctionAnnotationsFilter(

extractor=dataset, filter_func=filter_func)

# No bounding boxes with a size greater than 50% of their image filtered_items = [item for item in filtered]

transform_item(item: DatasetItem) DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.Annotation(*, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: object

A base annotation class.

Derived classes must define the ‘_type’ class variable with a value from the AnnotationType enum.

Method generated by attrs for class Annotation.

id: int#
attributes: Dict[str, Any]#
group: int#
object_id: int#
property type: AnnotationType#
as_dict() Dict[str, Any][source]#

Returns a dictionary { field_name: value }

wrap(**kwargs)[source]#

Returns a modified copy of the object

class datumaro.components.filter.AnnotationType(value)[source]#

Bases: IntEnum

An enumeration.

unknown = 0#
label = 1#
mask = 2#
points = 3#
polygon = 4#
polyline = 5#
bbox = 6#
caption = 7#
cuboid_3d = 8#
super_resolution_annotation = 9#
depth_annotation = 10#
ellipse = 11#
hash_key = 12#
feature_vector = 13#
tabular = 14#
rotated_bbox = 15#
cuboid_2d = 16#
class datumaro.components.filter.Bbox(x, y, w, h, *args, **kwargs)[source]#

Bases: Shape

Bbox annotation class. This class represents a bounding box defined by its top-left corner (x, y) and its width and height (w, h).

_type#

The type of annotation, set to AnnotationType.bbox.

Type:

AnnotationType

__init__()[source]#

Initializes the Bbox with its coordinates and dimensions.

x()#

Property to get the x-coordinate of the bounding box.

y()#

Property to get the y-coordinate of the bounding box.

w()#

Property to get the width of the bounding box.

h()#

Property to get the height of the bounding box.

get_area()[source]#

Calculates the area of the bounding box.

get_bbox()[source]#

Returns the bounding box coordinates and dimensions.

as_polygon()[source]#

Returns the bounding box as a list of points forming a polygon.

iou()[source]#

Calculates the Intersection over Union (IoU) with another shape.

wrap()[source]#

Creates a new Bbox instance with updated attributes.

Initialize the Bbox with its top-left corner (x, y) and its width and height (w, h).

Parameters:
  • x (float) – The x-coordinate of the top-left corner.

  • y (float) – The y-coordinate of the top-left corner.

  • w (float) – The width of the bounding box.

  • h (float) – The height of the bounding box.

property x#

Get the x-coordinate of the top-left corner of the bounding box.

Returns:

The x-coordinate of the bounding box.

Return type:

float

property y#

Get the y-coordinate of the top-left corner of the bounding box.

Returns:

The y-coordinate of the bounding box.

Return type:

float

property w#

Get the width of the bounding box.

Returns:

The width of the bounding box.

Return type:

float

property h#

Get the height of the bounding box.

Returns:

The height of the bounding box.

Return type:

float

get_area()[source]#

Calculate the area of the bounding box.

Returns:

The area of the bounding box.

Return type:

float

get_bbox()[source]#

Get the bounding box coordinates and dimensions.

Returns:

The bounding box as [x, y, w, h].

Return type:

List[float]

as_polygon() List[float][source]#

Convert the bounding box into a polygon representation.

Returns:

The bounding box as a polygon.

Return type:

List[float]

iou(other: Shape) float | ~typing.Literal[-1][source]#

Calculate the Intersection over Union (IoU) with another shape.

Parameters:

other (Shape) – The other shape to compare with.

Returns:

The IoU value or -1 if not applicable.

Return type:

Union[float, Literal[-1]]

wrap(**kwargs)[source]#

Create a new Bbox instance with updated attributes.

Parameters:
  • item (Bbox) – The original Bbox instance.

  • kwargs – Additional attributes to update.

Returns:

A new Bbox instance with updated attributes.

Return type:

Bbox

class datumaro.components.filter.Caption(caption, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: Annotation

Represents arbitrary text annotations.

Method generated by attrs for class Caption.

caption: str#
class datumaro.components.filter.DatasetItemEncoder[source]#

Bases: object

classmethod encode(item: DatasetItem, categories: CategoriesInfo | None = None) ET.ElementBase[source]#
classmethod encode_image(image: Image) ElementBase[source]#
classmethod encode_annotation_base(annotation: Annotation) ElementBase[source]#
classmethod encode_label_object(obj: Label, categories: CategoriesInfo | None) ET.ElementBase[source]#
classmethod encode_mask_object(obj: Mask, categories: CategoriesInfo | None) ET.ElementBase[source]#
classmethod encode_bbox_object(obj: Bbox, categories: CategoriesInfo | None) ET.ElementBase[source]#
classmethod encode_points_object(obj: Points, categories: CategoriesInfo | None) ET.ElementBase[source]#
classmethod encode_polygon_object(obj: Polygon, categories: CategoriesInfo | None) ET.ElementBase[source]#
classmethod encode_polyline_object(obj: PolyLine, categories: CategoriesInfo | None) ET.ElementBase[source]#
classmethod encode_caption_object(obj: Caption) ElementBase[source]#
classmethod encode_ellipse_object(obj: Ellipse, categories: CategoriesInfo | None) ET.ElementBase[source]#
classmethod encode_annotation(o: Annotation, categories: CategoriesInfo | None = None) ET.ElementBase[source]#
static to_string(encoded_item: ElementBase) str[source]#
class datumaro.components.filter.Ellipse(x1: float, y1: float, x2: float, y2: float, *args, **kwargs)[source]#

Bases: Shape

Ellipse represents an ellipse that is encapsulated by a rectangle.

  • x1 and y1 represent the top-left coordinate of the encapsulating rectangle

  • x2 and y2 representing the bottom-right coordinate of the encapsulating rectangle

Parameters:
  • x1 (float) – left x coordinate of encapsulating rectangle

  • y1 (float) – top y coordinate of encapsulating rectangle

  • x2 (float) – right x coordinate of encapsulating rectangle

  • y2 (float) – bottom y coordinate of encapsulating rectangle

Method generated by attrs for class Shape.

property x1#
property y1#
property x2#
property y2#
property w#
property h#
property c_x#
property c_y#
get_area()[source]#

Calculate the area of the shape.

get_bbox()[source]#

Calculate and return the bounding box of the shape.

Returns:

The bounding box as [x, y, w, h].

Return type:

Tuple[float, float, float, float]

get_points(num_points: int = 720) List[Tuple[float, float]][source]#

Return points as a list of tuples, e.g. [(x0, y0), (x1, y1), …].

Parameters:

num_points (int) – The number of boundary points of the ellipse. By default, one point is created for every 1 degree of interior angle (num_points=360).

as_polygon(num_points: int = 720) List[float][source]#

Return a polygon as a list of tuples, e.g. [x0, y0, x1, y1, …].

Parameters:

num_points (int) – The number of boundary points of the ellipse. By default, one point is created for every 1 degree of interior angle (num_points=360).

iou(other: Shape) float | ~typing.Literal[-1][source]#
wrap(**kwargs) Ellipse[source]#

Returns a modified copy of the object

class datumaro.components.filter.HashKey(hash_key: ndarray, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: Annotation

Method generated by attrs for class HashKey.

hash_key: ndarray#
class datumaro.components.filter.Image(size: Tuple[int, int] | None = None, ext: str | None = None, *args, **kwargs)[source]#

Bases: MediaElement[ndarray]

classmethod from_file(path: str, *args, **kwargs)[source]#
classmethod from_numpy(data: ndarray | Callable[[], ndarray], *args, **kwargs)[source]#
classmethod from_bytes(data: bytes | Callable[[], bytes], *args, **kwargs)[source]#
property has_size: bool#

Indicates that size info is cached and won’t require image loading

property size: Tuple[int, int] | None#

Returns (H, W)

property ext: str | None#

Media file extension (with the leading dot)

set_crypter(crypter: Crypter)[source]#
class datumaro.components.filter.ItemTransform(extractor: IDataset)[source]#

Bases: Transform

transform_item(item: DatasetItem) DatasetItem | None[source]#

Returns a modified copy of the input item.

Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.

class datumaro.components.filter.Label(label, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1)[source]#

Bases: Annotation

Method generated by attrs for class Label.

label: int#
class datumaro.components.filter.Mask(image: ndarray | Callable[[], ndarray], *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Annotation

Represents a 2d single-instance binary segmentation mask.

Method generated by attrs for class Mask.

label: int | None#
z_order: int#
property image: ndarray#
as_class_mask(label_id: int | None = None, ignore_index: int = 0, dtype: dtype | None = None) ndarray[source]#

Produces a class index mask based on the binary mask.

Parameters:
  • label_id – Scalar value to represent the class index of the mask. If not specified, self.label will be used. Defaults to None.

  • ignore_index – Scalar value to fill in the zeros in the binary mask. Defaults to 0.

  • dtype – Data type for the resulting mask. If not specified, it will be inferred from the provided label_id to hold its value. For example, if label_id=255, the inferred dtype will be np.uint8. Defaults to None.

Returns:

Class index mask generated from the binary mask.

Return type:

IndexMaskImage

as_instance_mask(instance_id: int, ignore_index: int = 0, dtype: dtype | None = None) ndarray[source]#

Produces an instance index mask based on the binary mask.

Parameters:
  • instance_id – Scalar value to represent the instance id.

  • ignore_index – Scalar value to fill in the zeros in the binary mask. Defaults to 0.

  • dtype – Data type for the resulting mask. If not specified, it will be inferred from the provided label_id to hold its value. For example, if label_id=255, the inferred dtype will be np.uint8. Defaults to None.

Returns:

Instance index mask generated from the binary mask.

Return type:

IndexMaskImage

get_area() int[source]#
get_bbox() Tuple[int, int, int, int][source]#

Computes the bounding box of the mask.

Returns: [x, y, w, h]

paint(colormap: Dict[int, Tuple[int, int, int]]) ndarray[source]#

Applies a colormap to the mask and produces the resulting image.

class datumaro.components.filter.Points(points, visibility: List[IntEnum] | None = None, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Shape

Represents an ordered set of points.

_type#

The type of annotation, set to AnnotationType.points.

Type:

AnnotationType

visibility#

A list indicating the visibility status of each point.

Type:

List[IntEnum]

Nested Class:
Visibility (IntEnum): Enum representing the visibility state of points. It has three states:
  • absent: Point is absent (0).

  • hidden: Point is hidden (1).

  • visible: Point is visible (2).

__attrs_post_init__()[source]#

Validates that the number of points is even.

get_area()[source]#

Returns the area covered by the points, always zero.

get_bbox()[source]#

Returns the bounding box containing all visible or hidden points.

Method generated by attrs for class Points.

class Visibility(value)[source]#

Bases: IntEnum

Enum representing the visibility state of points.

absent#

Point is absent (0).

Type:

int

hidden#

Point is hidden (1).

Type:

int

visible#

Point is visible (2).

Type:

int

absent = 0#
hidden = 1#
visible = 2#
visibility: List[IntEnum]#
get_area()[source]#

Returns the area covered by the points.

Returns:

Always returns 0.

Return type:

int

get_bbox()[source]#

Returns the bounding box containing all visible or hidden points.

Returns:

The bounding box as [x0, y0, width, height].

Return type:

List[float]

class datumaro.components.filter.PolyLine(points, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Shape

PolyLine annotation class. This class represents a polyline shape, which is a series of connected line segments.

_type#

The type of annotation, set to AnnotationType.polyline.

Type:

AnnotationType

as_polygon()[source]#

Returns the points of the polyline as a polygon.

get_area()[source]#

Returns the area of the polyline, which is always 0.

Method generated by attrs for class PolyLine.

as_polygon()[source]#

Convert the shape into a polygon representation.

get_area()[source]#

Calculate the area of the shape.

class datumaro.components.filter.Polygon(points, *, id: int = 0, attributes: Dict[str, Any] = _Nothing.NOTHING, group: int = 0, object_id: int = -1, label=None, z_order: int = 0)[source]#

Bases: Shape

Polygon annotation class. This class represents a polygon shape defined by a series of points.

_type#

The type of annotation, set to AnnotationType.polygon.

Type:

AnnotationType

__attrs_post_init__()[source]#

Validates the points to ensure they form a valid polygon.

get_area()[source]#

Calculates the area of the polygon using the shoelace formula.

as_polygon()[source]#

Returns the points of the polygon.

__eq__()[source]#

Compares this polygon with another for equality.

_get_shoelace_area()[source]#

Helper method to calculate the area of the polygon using the shoelace formula.

Method generated by attrs for class Polygon.

get_area()[source]#

Calculate the area of the polygon using the shoelace formula.

Returns:

The area of the polygon.

Return type:

float

as_polygon() List[float][source]#

Return the points of the polygon.

Returns:

The points of the polygon.

Return type:

List[float]