Supported Data Formats
- ADE20k (v2017) (import-only)
- ADE20k (v2020) (import-only)
- Align CelebA (classification, landmarks) (import-only)
- BraTS (segmentation) (import-only)
- BraTS Numpy (detection, segmentation) (import-only)
- CamVid (segmentation)
- CelebA (classification, detection, landmarks) (import-only)
- CIFAR-10/100 (classification)
- Cityscapes (segmentation)
- Common Semantic Segmentation (segmentation) (import-only)
- Common Super Resolution
- CVAT (for images, for video (import-only))
- DOTA (detection_rotated)
- ICDAR13/15 (word recognition, text localization, text segmentation)
- ImageNet (classification, detection); the detection format is the same as in PASCAL VOC
- Kaggle (classification, detection, segmentation) (import-only)
- KITTI (segmentation, detection)
- KITTI 3D (raw, tracklets, velodyne points)
- Kinetics 400/600/700
- LabelMe (labels, boxes, masks)
- LFW (classification, person re-identification, landmarks)
- Mapillary Vistas (segmentation) (import-only)
- Market-1501 (person re-identification)
- MARS (import-only)
- MMDet-COCO (detection, segmentation)
- MNIST (classification)
- MNIST in CSV (classification)
- MOT sequences
- MPII Human Pose (detection, pose estimation) (import-only)
- MPII Human Pose JSON (detection, pose estimation) (import-only)
- MS COCO (image info, instances, person keypoints, captions, labels, panoptic, stuff); labels are our extension, like instances with only category_id
- Roboflow (import-only)
- NYU Depth Dataset V2 (depth estimation) (import-only)
- OpenImages (classification, detection, segmentation)
- PASCAL VOC (classification, detection, segmentation, action classification, person layout)
- Segment Anything (a.k.a. SA-1B) (detection, segmentation)
- Supervisely (pointcloud)
- SYNTHIA (segmentation) (import-only)
- Tabular (classification, regression) (import/export only)
- TF Detection API (bboxes, masks)
- VGGFace2 (landmarks, bboxes)
- VoTT CSV (detection) (import-only)
- VoTT JSON (detection) (import-only)
- WIDERFace (bboxes)
- YOLO (bboxes)
- YOLO-Ultralytics (bboxes)
Supported Annotation Types
- Labels
- Bounding Boxes
- Polygons
- Polylines
- (Segmentation) Masks
- (Key-) Points
- Captions
- 3D cuboids
- Super Resolution Annotation
- Depth Annotation
- Ellipses
- Hash Keys
Datumaro does not separate datasets by tasks like classification, detection, segmentation, etc. Instead, datasets can have any annotations. When a dataset is exported in a specific format, only relevant annotations are exported.
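The idea of exporting only the relevant annotations can be illustrated with a small, library-independent sketch. It does not use the Datumaro API: the `Item` class, the `RELEVANT_TYPES` table, the format names, and `select_for_export` are all hypothetical and exist only to show the filtering step.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from an export format to the annotation types it can store.
RELEVANT_TYPES = {
    "classification_format": {"label"},
    "detection_format": {"bbox"},
    "segmentation_format": {"mask", "polygon"},
}


@dataclass
class Item:
    """A dataset item that may carry annotations of any type."""
    id: str
    annotations: list = field(default_factory=list)  # (type, payload) pairs


def select_for_export(items, fmt):
    """Return copies of the items, keeping only annotations relevant to fmt."""
    keep = RELEVANT_TYPES[fmt]
    return [
        Item(it.id, [ann for ann in it.annotations if ann[0] in keep])
        for it in items
    ]


# One item holding a label, a bounding box, and a mask at the same time.
items = [
    Item("img1", [("label", "car"), ("bbox", (10, 20, 30, 40)), ("mask", "img1.png")])
]
exported = select_for_export(items, "detection_format")
print(exported[0].annotations)  # [('bbox', (10, 20, 30, 40))]
```

The source items are left unchanged, so the same dataset can be exported to several formats in turn.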
Dataset Meta Info File
It is possible to use classes that are not original to the format.
To do this, use dataset_meta.json.
{
"label_map": {"0": "background", "1": "car", "2": "person"},
"segmentation_colors": [[0, 0, 0], [255, 0, 0], [0, 0, 255]],
"background_label": "0"
}
- label_map is a dictionary where the class ID is the key and the class name is the value.
- segmentation_colors is a list of channel-wise values for each class. This is only necessary for the segmentation task.
- background_label is the ID of the background label in the dataset.
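A file with these three keys can be generated with the Python standard library. The following is a minimal sketch: `write_dataset_meta` is a hypothetical helper and not part of Datumaro, and placing the file in the dataset directory is an assumption made here. The consistency checks are also an illustrative choice.

```python
import json
from pathlib import Path


def write_dataset_meta(dataset_dir, labels, colors=None, background_label=None):
    """Write dataset_meta.json (hypothetical helper, not part of Datumaro)."""
    # label_map: class ID (as a string key) -> class name
    meta = {"label_map": {str(i): name for i, name in enumerate(labels)}}

    if colors is not None:
        # One channel-wise color per class; only needed for segmentation.
        if len(colors) != len(labels):
            raise ValueError("expected one color per label")
        meta["segmentation_colors"] = [list(c) for c in colors]

    if background_label is not None:
        # The background label must refer to an existing class ID.
        if str(background_label) not in meta["label_map"]:
            raise ValueError("background_label must be a key of label_map")
        meta["background_label"] = str(background_label)

    # Assumption: the file is placed directly in the dataset directory.
    path = Path(dataset_dir) / "dataset_meta.json"
    path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return path
```

Calling `write_dataset_meta(".", ["background", "car", "person"], [(0, 0, 0), (255, 0, 0), (0, 0, 255)], 0)` produces a file with the same content as the example above.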