datumaro.components.merge.intersect_merge
- class datumaro.components.merge.intersect_merge.IntersectMerge(conf=_Nothing.NOTHING)[source]#
Bases:
Merger
Merge several datasets with the “intersect” policy:
If two or more dataset items share the same (id, subset) pair, they are treated as an intersection between the datasets. This method merges the annotations of the corresponding DatasetItem objects into one DatasetItem to handle this intersection. The rule for merging annotations is provided by AnnotationMerger according to the annotation type. For example, DatasetItem(id="item_1", subset="train", annotations=[Bbox(0, 0, 1, 1)]) from Dataset-A and DatasetItem(id="item_1", subset="train", annotations=[Bbox(.5, .5, 1, 1)]) from Dataset-B can be merged into DatasetItem(id="item_1", subset="train", annotations=[Bbox(0, 0, 1, 1)]).
Label categories are merged according to the union of their label names (same as UnionMerge). For example, if Dataset-A has {"car", "cat", "dog"} and Dataset-B has {"car", "bus", "truck"} labels, the merged dataset will have {"bus", "car", "cat", "dog", "truck"} labels.
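The two rules described above (items intersect when their (id, subset) pairs match, and label names are unioned) can be illustrated with a small standalone sketch. It deliberately does not use Datumaro: the tuples, `group_by_key`, and `union_labels` are hypothetical stand-ins, and the real annotation merging is delegated to the per-type mergers rather than done this way.

```python
from collections import defaultdict

# Hypothetical stand-ins for dataset items: (id, subset, annotations).
dataset_a = [("item_1", "train", ["bbox_a"]), ("item_2", "train", ["bbox_b"])]
dataset_b = [("item_1", "train", ["bbox_c"]), ("item_3", "val", ["bbox_d"])]


def group_by_key(*datasets):
    """Group annotations of items that share the same (id, subset) pair."""
    groups = defaultdict(list)
    for source_index, dataset in enumerate(datasets):
        for item_id, subset, annotations in dataset:
            groups[(item_id, subset)].append((source_index, annotations))
    return dict(groups)


def union_labels(*label_sets):
    """Union of label names; sorted here only to make the result stable."""
    merged = set()
    for labels in label_sets:
        merged |= set(labels)
    return sorted(merged)


groups = group_by_key(dataset_a, dataset_b)
# Keys that appear in more than one source form the "intersection".
shared = [key for key, entries in groups.items() if len(entries) > 1]
labels = union_labels({"car", "cat", "dog"}, {"car", "bus", "truck"})
```

Here `shared` contains only the `("item_1", "train")` key, and `labels` holds the five names from the label example above.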
This merge has configuration parameters (conf) that control the annotation merge behavior. For example:
```python
from datumaro.components.merge.intersect_merge import IntersectMerge

merge = IntersectMerge(
    conf=IntersectMerge.Conf(
        pairwise_dist=0.25,
        groups=[],
        output_conf_thresh=0.0,
        quorum=0,
    )
)
```
For more details on the parameters, please refer to IntersectMerge.Conf.
Method generated by attrs for class IntersectMerge.
- class Conf(*, pairwise_dist=0.5, sigma=_Nothing.NOTHING, output_conf_thresh=0, quorum=0, ignored_attributes=_Nothing.NOTHING, groups=_Nothing.NOTHING, close_distance=0.75)[source]#
Bases:
object
- Parameters:
pairwise_dist – IoU match threshold for segments
sigma – Parameter for Object Keypoint Similarity metric (https://cocodataset.org/#keypoints-eval)
output_conf_thresh – Confidence threshold for output annotations
quorum – Minimum number of votes required for a label or attribute voting result to be counted
ignored_attributes – Attributes to be ignored in the merged DatasetItem
groups – A comma-separated list of labels in annotation groups to check. ‘?’ postfix can be added to a label to make it optional in the group (repeatable)
close_distance – Distance threshold between annotations used to decide whether they are close. Annotations judged to be close are reported to the error tracker.
Method generated by attrs for class IntersectMerge.Conf.
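As a rough illustration of the groups parameter described above, the sketch below splits a comma-separated group specification into required and optional labels, treating a trailing '?' as the "optional" marker. The helper names `parse_group` and `missing_required` are hypothetical, and how the library itself parses and checks groups is not shown in this page.

```python
def parse_group(spec):
    """Split a spec such as 'person,hand?,head' into (required, optional)."""
    required, optional = [], []
    for raw in spec.split(","):
        name = raw.strip()
        if not name:
            continue  # tolerate stray commas
        if name.endswith("?"):
            optional.append(name[:-1])  # drop the '?' postfix
        else:
            required.append(name)
    return required, optional


def missing_required(spec, found_labels):
    """Return required labels of the group that are absent from found_labels."""
    required, _ = parse_group(spec)
    found = set(found_labels)
    return [name for name in required if name not in found]


required, optional = parse_group("person,hand?,head")
missing = missing_required("person,hand?,head", ["person", "hand"])
```

In this sketch an optional label never appears in `missing`, which is one plausible reading of "optional in the group".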
- merge(sources: Sequence[IDataset]) DatasetItemStorage [source]#
- merge_items(items: Dict[int, DatasetItem]) DatasetItem [source]#
- class datumaro.components.merge.intersect_merge.AnnotationMerger(*, context: IMatcherContext | IMergerContext | None = None)[source]#
Bases:
AnnotationMatcher
Method generated by attrs for class AnnotationMerger.
- class datumaro.components.merge.intersect_merge.AnnotationType(value)[source]#
Bases:
IntEnum
An enumeration.
- unknown = 0#
- label = 1#
- mask = 2#
- points = 3#
- polygon = 4#
- polyline = 5#
- bbox = 6#
- cuboid_3d = 8#
- super_resolution_annotation = 9#
- depth_annotation = 10#
- ellipse = 11#
- hash_key = 12#
- feature_vector = 13#
- tabular = 14#
- rotated_bbox = 15#
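Because AnnotationType is an IntEnum, its members behave like integers and can be looked up by value or by name. The sketch below uses a local copy of a few members from the listing above purely for illustration; it is not the library class.

```python
from enum import IntEnum


class AnnotationType(IntEnum):
    # Local copy of a few members from the listing, for illustration only.
    unknown = 0
    label = 1
    mask = 2
    bbox = 6
    rotated_bbox = 15


by_value = AnnotationType(6)      # lookup by integer value
by_name = AnnotationType["mask"]  # lookup by member name
# IntEnum members compare and sort like plain integers.
sorted_names = [t.name for t in sorted([AnnotationType.bbox, AnnotationType.label])]
```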
- exception datumaro.components.merge.intersect_merge.AnnotationsTooCloseError(item_id, a, b, distance)[source]#
Bases:
DatasetQualityError
Method generated by attrs for class AnnotationsTooCloseError.
- item_id#
- a#
- b#
- distance#
- class datumaro.components.merge.intersect_merge.BboxMerger(*, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0)[source]#
Bases:
_ShapeMerger, BboxMatcher
Method generated by attrs for class BboxMerger.
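The pairwise_dist parameter is described in IntersectMerge.Conf as an IoU match threshold. A minimal sketch of that idea for axis-aligned boxes follows; it assumes boxes are given as (x, y, w, h) tuples and that a pair matches when IoU is at least the threshold. Both `iou_xywh` and `is_match` are hypothetical helpers, not the library's matcher.

```python
def iou_xywh(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Overlap extent along each axis, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_match(a, b, pairwise_dist=0.9):
    """Assumed rule: two boxes match when their IoU reaches the threshold."""
    return iou_xywh(a, b) >= pairwise_dist


box_a = (0, 0, 2, 2)
box_b = (1, 0, 2, 2)
overlap = iou_xywh(box_a, box_b)  # intersection 2, union 6
```

With the default threshold of 0.9 these two boxes would not match in this sketch, while a lower threshold such as 0.25 would accept them.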
- class datumaro.components.merge.intersect_merge.CaptionsMerger(*, context: IMatcherContext | IMergerContext | None = None)[source]#
Bases:
AnnotationMerger, CaptionsMatcher
Method generated by attrs for class CaptionsMerger.
- exception datumaro.components.merge.intersect_merge.ConflictingCategoriesError(msg=None, *, sources=None)[source]#
Bases:
DatasetMergeError
- sources#
- class datumaro.components.merge.intersect_merge.Cuboid3dMerger(*, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0)[source]#
Bases:
_ShapeMerger, Cuboid3dMatcher
Method generated by attrs for class Cuboid3dMerger.
- class datumaro.components.merge.intersect_merge.DatasetItem(id: str, *, subset: str | None = None, media: str | MediaElement | None = None, annotations: List[Annotation] | None = None, attributes: Dict[str, Any] | None = None)[source]#
Bases:
object
- media: MediaElement | None#
- annotations: Annotations#
- class datumaro.components.merge.intersect_merge.DatasetItemStorage[source]#
Bases:
object
- put(item: DatasetItem) bool [source]#
- get(id: str | DatasetItem, subset: str | None = None, dummy: Any | None = None) DatasetItem | None [source]#
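The put/get signatures above suggest a store keyed by item id and subset. The toy class below sketches that shape under two assumptions that the page does not confirm: that put returns True when the (id, subset) pair was not stored before, and that get falls back to a default when nothing is found. `TinyItemStorage` is a hypothetical name.

```python
class TinyItemStorage:
    """Toy store keyed by subset, then by item id (illustration only)."""

    def __init__(self):
        self._data = {}  # subset -> {item_id -> item}

    def put(self, item_id, subset, item):
        bucket = self._data.setdefault(subset, {})
        is_new = item_id not in bucket
        bucket[item_id] = item  # overwrite any earlier item with the same key
        return is_new

    def get(self, item_id, subset=None, default=None):
        return self._data.get(subset, {}).get(item_id, default)


storage = TinyItemStorage()
first_put = storage.put("item_1", "train", {"annotations": ["a"]})
second_put = storage.put("item_1", "train", {"annotations": ["b"]})
found = storage.get("item_1", "train")
absent = storage.get("item_1", "val")
```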
- class datumaro.components.merge.intersect_merge.DatasetItemStorageDatasetView(parent: DatasetItemStorage, infos: Dict[str, Any], categories: Dict[AnnotationType, Categories], media_type: Type[MediaElement] | None, ann_types: Set[AnnotationType] | None)[source]#
Bases:
IDataset
- class Subset(parent: DatasetItemStorageDatasetView, name: str)[source]#
Bases:
IDataset
- class datumaro.components.merge.intersect_merge.EllipseMerger(*, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0)[source]#
Bases:
_ShapeMerger, ShapeMatcher
Method generated by attrs for class EllipseMerger.
- exception datumaro.components.merge.intersect_merge.FailedAttrVotingError(item_id, attr, votes, ann, *, sources=_Nothing.NOTHING)[source]#
Bases:
DatasetMergeError
Method generated by attrs for class FailedAttrVotingError.
- item_id#
- attr#
- votes#
- ann#
- class datumaro.components.merge.intersect_merge.FeatureVectorMerger(*, context: IMatcherContext | IMergerContext | None = None)[source]#
Bases:
AnnotationMerger, FeatureVectorMatcher
Method generated by attrs for class FeatureVectorMerger.
- class datumaro.components.merge.intersect_merge.HashKeyMerger(*, context: IMatcherContext | IMergerContext | None = None)[source]#
Bases:
AnnotationMerger, HashKeyMatcher
Method generated by attrs for class HashKeyMerger.
- class datumaro.components.merge.intersect_merge.IDataset[source]#
Bases:
object
- subsets() Dict[str, IDataset] [source]#
Enumerates subsets in the dataset. Each subset can be a dataset itself.
- categories() Dict[AnnotationType, Categories] [source]#
Returns metainfo about dataset labels.
- get(id: str, subset: str | None = None) DatasetItem | None [source]#
Provides random access to dataset items.
- media_type() Type[MediaElement] [source]#
Returns media type of the dataset items.
All the items are supposed to have the same media type. Supposed to be constant and known immediately after the object construction (i.e. doesn’t require dataset iteration).
- ann_types() List[AnnotationType] [source]#
Returns available task type from dataset annotation types.
- class datumaro.components.merge.intersect_merge.ImageAnnotationMerger(*, context: IMatcherContext | IMergerContext | None = None)[source]#
Bases:
AnnotationMerger, ImageAnnotationMatcher
Method generated by attrs for class ImageAnnotationMerger.
- class datumaro.components.merge.intersect_merge.LabelCategories(items: List[str] = _Nothing.NOTHING, label_groups: List[LabelGroup] = _Nothing.NOTHING, *, attributes: Set[str] = _Nothing.NOTHING)[source]#
Bases:
Categories
Method generated by attrs for class LabelCategories.
- class Category(name, parent: str = '', attributes: Set[str] = _Nothing.NOTHING)[source]#
Bases:
object
Method generated by attrs for class LabelCategories.Category.
- class LabelGroup(name, labels: List[str] = [], group_type: GroupType = GroupType.EXCLUSIVE)[source]#
Bases:
object
Method generated by attrs for class LabelCategories.LabelGroup.
- label_groups: List[LabelGroup]#
- classmethod from_iterable(iterable: Iterable[str | Tuple[str] | Tuple[str, str] | Tuple[str, str, List[str]]]) LabelCategories [source]#
Creates a LabelCategories from iterable.
- Parameters:
iterable –
This iterable object can be:
a list of str - will be interpreted as list of Category names
a list of positional arguments - will generate Categories with these arguments
Returns: a LabelCategories object
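The two accepted input shapes (plain names, or tuples of positional arguments) can be sketched without the library as follows. The `Category` dataclass mirrors the name, parent, and attributes fields of LabelCategories.Category shown above, and `categories_from_iterable` is a hypothetical helper rather than the library's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    # Fields mirror LabelCategories.Category: name, parent, attributes.
    name: str
    parent: str = ""
    attributes: set = field(default_factory=set)


def categories_from_iterable(iterable):
    """Accept either plain names or (name[, parent[, attributes]]) tuples."""
    result = []
    for entry in iterable:
        if isinstance(entry, str):
            entry = (entry,)  # a bare string is just the category name
        name, *rest = entry
        parent = rest[0] if len(rest) > 0 else ""
        attributes = set(rest[1]) if len(rest) > 1 else set()
        result.append(Category(name, parent, attributes))
    return result


cats = categories_from_iterable(["car", ("wheel", "car"), ("door", "car", ["open"])])
names = [c.name for c in cats]
```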
- class datumaro.components.merge.intersect_merge.LabelMerger(*, context: IMatcherContext | IMergerContext | None = None, quorum=0)[source]#
Bases:
AnnotationMerger, LabelMatcher
Method generated by attrs for class LabelMerger.
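The quorum parameter is described in IntersectMerge.Conf as a minimum count for voting results. One way to picture label voting with a quorum is sketched below; it assumes each source votes at most once per label and that a label is kept when its vote count is at least the quorum. `vote_labels` is a hypothetical helper and may differ from what LabelMerger actually does.

```python
from collections import Counter


def vote_labels(label_votes, quorum=0):
    """Keep labels voted for by at least `quorum` sources."""
    # set(labels) ensures one vote per source, even if a label repeats.
    counts = Counter(label for labels in label_votes for label in set(labels))
    return sorted(label for label, n in counts.items() if n >= quorum)


votes = [["cat", "dog"], ["cat"], ["cat", "bird"]]
kept_strict = vote_labels(votes, quorum=2)
kept_all = vote_labels(votes, quorum=0)
```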
- class datumaro.components.merge.intersect_merge.LineMerger(*, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0)[source]#
Bases:
_ShapeMerger, LineMatcher
Method generated by attrs for class LineMerger.
- class datumaro.components.merge.intersect_merge.MaskCategories(colormap: Dict[int, Tuple[int, int, int]] = _Nothing.NOTHING, inverse_colormap: Dict[Tuple[int, int, int], int] | None = None, *, attributes: Set[str] = _Nothing.NOTHING)[source]#
Bases:
Categories
Describes a color map for segmentation masks.
Method generated by attrs for class MaskCategories.
- classmethod generate(size: int = 255, include_background: bool = True) MaskCategories [source]#
Generates MaskCategories with the specified size.
- If include_background is True, the result will include the item
“0: (0, 0, 0)”, which is typically used as a background color.
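A colormap of this kind can be generated in several ways. The sketch below uses the bit-interleaving scheme popularized by PASCAL VOC, which is an assumption here: the page only states that index 0 maps to (0, 0, 0) when include_background is true. `generate_colormap` is a hypothetical function, and shifting the indices when the background is excluded is likewise an illustrative choice.

```python
def generate_colormap(size=255, include_background=True):
    """Build {index: (r, g, b)} using VOC-style bit interleaving."""

    def color_for(index):
        r = g = b = 0
        # Spread successive bit triples of the index over the high bits of r, g, b.
        for shift in range(7, -1, -1):
            r |= (index & 1) << shift
            g |= ((index >> 1) & 1) << shift
            b |= ((index >> 2) & 1) << shift
            index >>= 3
        return (r, g, b)

    # Assumed behavior: without a background, skip the black entry entirely.
    offset = 0 if include_background else 1
    return {i: color_for(i + offset) for i in range(size)}


with_bg = generate_colormap(4)
without_bg = generate_colormap(3, include_background=False)
```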
- class datumaro.components.merge.intersect_merge.MaskMerger(*, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0)[source]#
Bases:
_ShapeMerger, MaskMatcher
Method generated by attrs for class MaskMerger.
- class datumaro.components.merge.intersect_merge.Merger(**options)[source]#
Bases:
IMergerContext, CliPlugin
Merge multiple datasets into one dataset
- static merge_categories(sources: Sequence[Dict[AnnotationType, Categories]]) Dict [source]#
- exception datumaro.components.merge.intersect_merge.NoMatchingAnnError(item_id, ann, *, sources=_Nothing.NOTHING)[source]#
Bases:
DatasetMergeError
Method generated by attrs for class NoMatchingAnnError.
- item_id#
- ann#
- exception datumaro.components.merge.intersect_merge.NoMatchingItemError(item_id, *, sources=_Nothing.NOTHING)[source]#
Bases:
DatasetMergeError
Method generated by attrs for class NoMatchingItemError.
- item_id#
- class datumaro.components.merge.intersect_merge.OrderedDict[source]#
Bases:
dict
Dictionary that remembers insertion order
- clear() None. Remove all items from od. #
- popitem(last=True)#
Remove and return a (key, value) pair from the dictionary.
Pairs are returned in LIFO order if last is true or FIFO order if false.
- move_to_end(key, last=True)#
Move an existing element to the end (or beginning if last is false).
Raise KeyError if the element does not exist.
- update([E, ]**F) None. Update D from dict/iterable E and F. #
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]
- keys() a set-like object providing a view on D's keys #
- items() a set-like object providing a view on D's items #
- values() an object providing a view on D's values #
- pop(key[, default]) v, remove specified key and return the corresponding value. #
If the key is not found, return the default if given; otherwise, raise a KeyError.
- setdefault(key, default=None)#
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- copy() a shallow copy of od #
- fromkeys(value=None)#
Create a new ordered dictionary with keys from iterable and values set to value.
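The OrderedDict listed here is the standard-library class re-exported by the module. A short example of the ordering methods described above, using only `collections.OrderedDict`:

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

od.move_to_end("a")              # order is now b, c, a
od.move_to_end("c", last=False)  # order is now c, b, a

first = od.popitem(last=False)   # FIFO: removes the first pair
last = od.popitem()              # LIFO (default): removes the last pair
remaining = list(od.items())

# fromkeys builds a new ordered dictionary with a shared default value.
filled = OrderedDict.fromkeys(["x", "y"], 0)
```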
- class datumaro.components.merge.intersect_merge.PointsCategories(items: Dict[int, Category] = _Nothing.NOTHING, *, attributes: Set[str] = _Nothing.NOTHING)[source]#
Bases:
Categories
Describes (key-)point metainfo such as point names and joints.
Method generated by attrs for class PointsCategories.
- class Category(labels: List[str] = _Nothing.NOTHING, joints: Set[Tuple[int, int]] = _Nothing.NOTHING)[source]#
Bases:
object
Method generated by attrs for class PointsCategories.Category.
- classmethod from_iterable(iterable: Tuple[int, List[str]] | Tuple[int, List[str], Set[Tuple[int, int]]]) PointsCategories [source]#
Create PointsCategories from an iterable.
- Parameters:
iterable –
An Iterable with the following elements:
a label id
a list of positional arguments for Categories
- Returns:
PointsCategories object
- Return type:
PointsCategories
- class datumaro.components.merge.intersect_merge.PointsMerger(*, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0, sigma: list | None = None, instance_map)[source]#
Bases:
_ShapeMerger, PointsMatcher
Method generated by attrs for class PointsMerger.
- class datumaro.components.merge.intersect_merge.PolygonMerger(*, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0)[source]#
Bases:
_ShapeMerger, PolygonMatcher
Method generated by attrs for class PolygonMerger.
- class datumaro.components.merge.intersect_merge.RotatedBboxMerger(sigma: list | None = None, *, context: ~datumaro.components.abstracts.merger.IMatcherContext | ~datumaro.components.abstracts.merger.IMergerContext | None = None, pairwise_dist=0.9, cluster_dist=-1.0, match_segments=<function match_segments_pair>, quorum=0)[source]#
Bases:
_ShapeMerger, RotatedBboxMatcher
Method generated by attrs for class RotatedBboxMerger.
- class datumaro.components.merge.intersect_merge.TabularMerger(*, context: IMatcherContext | IMergerContext | None = None)[source]#
Bases:
AnnotationMerger, TabularMatcher
Method generated by attrs for class TabularMerger.
- exception datumaro.components.merge.intersect_merge.WrongGroupError(item_id, found, expected, group)[source]#
Bases:
DatasetQualityError
Method generated by attrs for class WrongGroupError.
- item_id#
- found#
- expected#
- group#
- datumaro.components.merge.intersect_merge.attrib(default=_Nothing.NOTHING, validator=None, repr=True, cmp=None, hash=None, init=True, metadata=None, type=None, converter=None, factory=None, kw_only=False, eq=None, order=None, on_setattr=None, alias=None)[source]#
Create a new field / attribute on a class.
Identical to attrs.field, except it’s not keyword-only.
Consider using attrs.field in new code (attr.ib will never go away, though).
Warning
Does nothing unless the class is also decorated with attr.s (or similar)!
New in version 15.2.0: convert
New in version 16.3.0: metadata
Changed in version 17.1.0: validator can be a list now.
Changed in version 17.1.0: hash is None and therefore mirrors eq by default.
New in version 17.3.0: type
Deprecated since version 17.4.0: convert
New in version 17.4.0: converter as a replacement for the deprecated convert to achieve consistency with other noun-based arguments.
New in version 18.1.0: factory=f is syntactic sugar for default=attr.Factory(f).
New in version 18.2.0: kw_only
Changed in version 19.2.0: convert keyword argument removed.
Changed in version 19.2.0: repr also accepts a custom callable.
Deprecated since version 19.2.0: cmp Removal on or after 2021-06-01.
New in version 19.2.0: eq and order
New in version 20.1.0: on_setattr
Changed in version 20.3.0: kw_only backported to Python 2
Changed in version 21.1.0: eq, order, and cmp also accept a custom callable
Changed in version 21.1.0: cmp undeprecated
New in version 22.2.0: alias
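The options documented above (default, factory, converter, kw_only) can be seen together in a small attrs class. This is a sketch that assumes the attrs package is installed; `MergeConf` is a hypothetical class whose field names are borrowed from IntersectMerge.Conf for flavor, and it is not the library's definition.

```python
import attr


@attr.s
class MergeConf:  # hypothetical class, for illustration only
    pairwise_dist = attr.ib(default=0.5)
    # factory=list is sugar for default=attr.Factory(list): a fresh list per instance.
    groups = attr.ib(factory=list, kw_only=True)
    # converter runs on the passed value before it is stored.
    quorum = attr.ib(default=0, converter=int, kw_only=True)


a = MergeConf()
b = MergeConf(0.25, groups=["person"], quorum="2")
```

Using a factory rather than a literal `[]` default avoids sharing one mutable list between instances.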
- datumaro.components.merge.intersect_merge.attrs(maybe_cls=None, these=None, repr_ns=None, repr=None, cmp=None, hash=None, init=None, slots=False, frozen=False, weakref_slot=True, str=False, auto_attribs=False, kw_only=False, cache_hash=False, auto_exc=False, eq=None, order=None, auto_detect=False, collect_by_mro=False, getstate_setstate=None, on_setattr=None, field_transformer=None, match_args=True, unsafe_hash=None)[source]#
A class decorator that adds dunder methods according to the specified attributes using attr.ib or the these argument.
Consider using attrs.define / attrs.frozen in new code (attr.s will never go away, though).
- Parameters:
repr_ns (str) – When using nested classes, there was no way in Python 2 to automatically detect that. This argument allows to set a custom name for a more meaningful repr output. This argument is pointless in Python 3 and is therefore deprecated.
Caution
Refer to attrs.define for the rest of the parameters, but note that they can have different defaults.
Notably, leaving on_setattr as None will not add any hooks.
New in version 16.0.0: slots
New in version 16.1.0: frozen
New in version 16.3.0: str
New in version 16.3.0: Support for __attrs_post_init__.
Changed in version 17.1.0: hash supports None as value which is also the default now.
New in version 17.3.0: auto_attribs
Changed in version 18.1.0: If these is passed, no attributes are deleted from the class body.
Changed in version 18.1.0: If these is ordered, the order is retained.
New in version 18.2.0: weakref_slot
Deprecated since version 18.2.0: __lt__, __le__, __gt__, and __ge__ now raise a DeprecationWarning if the classes compared are subclasses of each other. __eq__ and __ne__ never tried to compare subclasses to each other.
Changed in version 19.2.0: __lt__, __le__, __gt__, and __ge__ now do not consider subclasses comparable anymore.
New in version 18.2.0: kw_only
New in version 18.2.0: cache_hash
New in version 19.1.0: auto_exc
Deprecated since version 19.2.0: cmp Removal on or after 2021-06-01.
New in version 19.2.0: eq and order
New in version 20.1.0: auto_detect
New in version 20.1.0: collect_by_mro
New in version 20.1.0: getstate_setstate
New in version 20.1.0: on_setattr
New in version 20.3.0: field_transformer
Changed in version 21.1.0: init=False injects __attrs_init__
Changed in version 21.1.0: Support for __attrs_pre_init__
Changed in version 21.1.0: cmp undeprecated
New in version 21.3.0: match_args
New in version 22.2.0: unsafe_hash as an alias for hash (for PEP 681 compliance).
Deprecated since version 24.1.0: repr_ns
Changed in version 24.1.0: Instances are not compared as tuples of attributes anymore, but using a big and condition. This is faster and has more correct behavior for uncomparable values like math.nan.
New in version 24.1.0: If a class has an inherited classmethod called __attrs_init_subclass__, it is executed after the class is created.
Deprecated since version 24.1.0: hash is deprecated in favor of unsafe_hash.