Supported Data Formats

ADE20k (v2017) (import-only)
ADE20k (v2020) (import-only)
Align CelebA (classification, landmarks) (import-only)
BraTS (segmentation) (import-only)
BraTS Numpy (detection, segmentation) (import-only)
CamVid (segmentation)
- Format specification
- Dataset example
CelebA (classification, detection, landmarks) (import-only)
CIFAR-10/100 (classification)
Cityscapes (segmentation)
Common Semantic Segmentation (segmentation) (import-only)
Common Super Resolution
CVAT (for images, for video (import-only))
DOTA (detection_rotated)
ICDAR13/15 (word recognition, text localization, text segmentation)
- Format specification
- Dataset example
ImageNet (classification, detection)
- Dataset example
- Dataset example (txt for classification)
- Detection format is the same as in PASCAL VOC
- Format documentation
Kaggle (classification, detection, segmentation) (import-only)
- Dataset examples
- Format documentation
KITTI (segmentation, detection)
KITTI 3D (raw, tracklets, velodyne points)
Kinetics 400/600/700
LabelMe (labels, boxes, masks)
LFW (classification, person re-identification, landmarks)
Mapillary Vistas (segmentation) (import-only)
Market-1501 (person re-identification)
- Format specification
- Dataset example
MARS (import-only)
MMDet-COCO (detection, segmentation)
MNIST (classification)
MNIST in CSV (classification)
MOT sequences
MOTS (png)
MPII Human Pose (detection, pose estimation) (import-only)
MPII Human Pose JSON (detection, pose estimation) (import-only)
MS COCO (image info, instances, person keypoints, captions, labels, panoptic, stuff)
- Format specification
- Dataset example
- labels are our extension - like instances with only category_id
- Format documentation
Roboflow (import-only)
NYU Depth Dataset V2 (depth estimation) (import-only)
OpenImages (classification, detection, segmentation)
PASCAL VOC (classification, detection, segmentation, action classification, person layout)
Segment Anything (a.k.a. SA-1B) (detection, segmentation)
Supervisely (pointcloud)
SYNTHIA (segmentation) (import-only)
Tabular (classification, regression) (import/export only)
- Dataset example
- Format documentation
TF Detection API (bboxes, masks)
- Format specifications: [bboxes], [masks]
- Dataset example
VGGFace2 (landmarks, bboxes)
VoTT CSV (detection) (import-only)
VoTT JSON (detection) (import-only)
WIDERFace (bboxes)
YOLO (bboxes)
YOLO-Ultralytics (bboxes)

Supported Annotation Types

Labels
Bounding Boxes
Polygons
Polylines
(Segmentation) Masks
(Key-) Points
Captions
3D cuboids
Super Resolution Annotation
Depth Annotation
Ellipses
Hash Keys

Datumaro does not separate datasets by task (classification, detection, segmentation, and so on). Instead, a dataset can contain annotations of any type. When a dataset is exported to a specific format, only the annotations relevant to that format are exported.
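The export behavior described above can be pictured with a toy model. This is a minimal sketch in plain Python, not Datumaro's implementation: the `Annotation` and `Item` classes, the `RELEVANT_TYPES` table, and the format names `classification_like` and `detection_like` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    type: str      # e.g. "label", "bbox", "caption" (hypothetical type names)
    value: object  # payload of the annotation


@dataclass
class Item:
    id: str
    annotations: list = field(default_factory=list)


# Hypothetical table: which annotation types each made-up format can store.
RELEVANT_TYPES = {
    "classification_like": {"label"},
    "detection_like": {"bbox"},
}


def annotations_for_export(item, format_name):
    """Return only the annotations that the target format can represent."""
    wanted = RELEVANT_TYPES[format_name]
    return [a for a in item.annotations if a.type in wanted]


# One item carries annotations for several tasks at once.
item = Item(
    "img_001",
    [
        Annotation("label", "car"),
        Annotation("bbox", (10, 20, 50, 40)),
        Annotation("caption", "a red car"),
    ],
)

# Exporting to the detection-like format keeps only the bounding box.
exported = annotations_for_export(item, "detection_like")
print([a.type for a in exported])
```

The item itself is left unchanged by the filter, which mirrors the idea that the dataset keeps every annotation and only the export step narrows them.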

Dataset Meta Info File

A dataset can use classes that are not part of the format's original class set. To do so, define them in a dataset_meta.json file, for example:

{
  "label_map": {"0": "background", "1": "car", "2": "person"},
  "segmentation_colors": [[0, 0, 0], [255, 0, 0], [0, 0, 255]],
  "background_label": "0"
}

label_map is a dictionary that maps each class ID (the key) to its class name (the value).
segmentation_colors is a list of per-class colors, each given as a list of channel values. It is needed only for segmentation tasks.
background_label is the ID of the background class in the dataset.
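As a rough illustration of how these three fields fit together, the sketch below writes the example above to a temporary dataset_meta.json and reads it back using only the Python standard library. It does not use Datumaro. The helper name `load_dataset_meta` is hypothetical, and the checks it performs (one color per class, the background ID must appear in label_map) are assumptions made for this example, not rules stated by the format.

```python
import json
import tempfile
from pathlib import Path


def load_dataset_meta(path):
    """Read a dataset_meta.json file and return (label_map, colors, background_id)."""
    meta = json.loads(Path(path).read_text(encoding="utf-8"))

    # JSON object keys are always strings, so convert class IDs to int.
    label_map = {int(k): v for k, v in meta["label_map"].items()}

    # segmentation_colors is only needed for segmentation, so treat it as optional.
    colors = [tuple(c) for c in meta.get("segmentation_colors", [])]

    background = meta.get("background_label")
    background_id = int(background) if background is not None else None

    # Assumed sanity checks (not mandated by the format description).
    if colors and len(colors) != len(label_map):
        raise ValueError("expected one color per class")
    if background_id is not None and background_id not in label_map:
        raise ValueError("background_label must be a key of label_map")

    return label_map, colors, background_id


meta = {
    "label_map": {"0": "background", "1": "car", "2": "person"},
    "segmentation_colors": [[0, 0, 0], [255, 0, 0], [0, 0, 255]],
    "background_label": "0",
}

with tempfile.TemporaryDirectory() as tmp:
    meta_path = Path(tmp) / "dataset_meta.json"
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    label_map, colors, background_id = load_dataset_meta(meta_path)

print(label_map[background_id])
```

Note that class IDs are strings in the file because JSON does not allow numeric object keys, which is why the example quotes "0" for background_label as well.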