otx.core.data.adapter.base_dataset_adapter

Base Class for Dataset Adapter.

Classes

BaseDatasetAdapter(task_type[, ...])

Base dataset adapter for all of downstream tasks to use Datumaro.

class otx.core.data.adapter.base_dataset_adapter.BaseDatasetAdapter(task_type: TaskType, train_data_roots: str | None = None, train_ann_files: str | None = None, val_data_roots: str | None = None, val_ann_files: str | None = None, test_data_roots: str | None = None, test_ann_files: str | None = None, unlabeled_data_roots: str | None = None, unlabeled_file_list: str | None = None, cache_config: Dict[str, Any] | None = None, encryption_key: str | None = None, **kwargs)

Bases: object

Base dataset adapter for all of downstream tasks to use Datumaro.

BaseDatasetAdapter mainly detects and imports the dataset using functions implemented in Datumaro. It also prepares the common variables and functions (EmptyLabelSchema, LabelSchema, ...) that all tasks consume.

Parameters:
  • task_type (TaskType) – Type of the task

  • train_data_roots (Optional[str]) – Path for training data

  • train_ann_files (Optional[str]) – Path for training annotation file

  • val_data_roots (Optional[str]) – Path for validation data

  • val_ann_files (Optional[str]) – Path for validation annotation file

  • test_data_roots (Optional[str]) – Path for test data

  • test_ann_files (Optional[str]) – Path for test annotation file

  • unlabeled_data_roots (Optional[str]) – Path for unlabeled data

  • unlabeled_file_list (Optional[str]) – Path of unlabeled file list

  • encryption_key (Optional[str]) – Encryption key to load an encrypted dataset (only required for DatumaroBinary format)

Since all adapters can be used for both training and validation, train_data_roots, val_data_roots, and test_data_roots all default to None.

For example, test_data_roots is not used during the training/validation phase, and train_data_roots and val_data_roots are not used during the test phase.
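The "default to None, pass only what the phase needs" convention can be sketched as follows. This is a minimal illustration, not OTX code: `collect_data_roots` is a hypothetical helper and the paths are placeholders. It only shows how None-valued roots can be skipped so that each phase sees just its own subsets.

```python
from typing import Dict, Optional


def collect_data_roots(
    train_data_roots: Optional[str] = None,
    val_data_roots: Optional[str] = None,
    test_data_roots: Optional[str] = None,
    unlabeled_data_roots: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: keep only the subsets whose data root was given."""
    candidates = {
        "train": train_data_roots,
        "val": val_data_roots,
        "test": test_data_roots,
        "unlabeled": unlabeled_data_roots,
    }
    # Roots left at their default (None) are treated as "not used in this phase".
    return {name: root for name, root in candidates.items() if root is not None}


# Training/validation phase: test_data_roots is simply omitted.
train_phase = collect_data_roots(
    train_data_roots="data/train", val_data_roots="data/val"
)

# Test phase: only test_data_roots is passed.
test_phase = collect_data_roots(test_data_roots="data/test")
```

With this pattern, a caller never has to supply dummy paths for subsets that the current phase does not touch.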

static datum_media_2_otx_media(datumaro_media: MediaElement) → IMediaEntity

Convert Datumaro media to OTX media.

get_label_schema() → LabelSchemaEntity

Get Label Schema.

abstract get_otx_dataset() → DatasetEntity

Get DatasetEntity.
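Because get_otx_dataset() is abstract, each task-specific adapter is expected to provide its own implementation. The sketch below illustrates that pattern with Python's abc module. `SketchAdapter` and `SketchClassificationAdapter` are hypothetical stand-ins, and plain dicts replace DatasetEntity so the example runs without OTX installed.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class SketchAdapter(ABC):
    """Hypothetical stand-in for BaseDatasetAdapter (not the OTX class)."""

    @abstractmethod
    def get_otx_dataset(self) -> List[Dict[str, str]]:
        """Subclasses build and return the task-specific dataset."""


class SketchClassificationAdapter(SketchAdapter):
    """Hypothetical task-specific adapter that pairs media with labels."""

    def __init__(self, items: List[Tuple[str, str]]) -> None:
        self.items = items

    def get_otx_dataset(self) -> List[Dict[str, str]]:
        # A real adapter would build a DatasetEntity; dicts are used here
        # purely for illustration.
        return [{"media": media, "label": label} for media, label in self.items]


adapter = SketchClassificationAdapter([("a.jpg", "cat"), ("b.jpg", "dog")])
dataset = adapter.get_otx_dataset()
```

Instantiating the base class directly raises TypeError, which is how the abstract marker enforces that every subclass supplies get_otx_dataset().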

remove_unused_label_entities(used_labels: List)

Remove unused labels from the label entities.

The label entities are used to build the Label Schema, and an unused label in the Label Schema hurts model performance. This method therefore removes the unused labels from the label entities.

Parameters:

used_labels (List) – List of indices of the used labels
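The index-based filtering described above can be sketched as follows. This is an illustration under assumptions, not the OTX implementation: `keep_used_labels` is a hypothetical free function, and plain strings stand in for label entities.

```python
from typing import List


def keep_used_labels(label_entities: List[str], used_labels: List[int]) -> List[str]:
    """Hypothetical sketch: keep only the labels whose index is in used_labels."""
    used = set(used_labels)
    # Preserve the original ordering of the label entities.
    return [label for index, label in enumerate(label_entities) if index in used]


labels = ["cat", "dog", "bird"]
cleaned = keep_used_labels(labels, used_labels=[0, 2])
```

Here the label at index 1 never appears in the annotations, so it is dropped before a label schema would be built from the remaining entries.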