otx.core.data.adapter.base_dataset_adapter

Base Class for Dataset Adapter.

Classes

BaseDatasetAdapter(task_type[, ...])

Base dataset adapter for all downstream tasks that use Datumaro.

class otx.core.data.adapter.base_dataset_adapter.BaseDatasetAdapter(task_type: TaskType, train_data_roots: str | None = None, train_ann_files: str | None = None, val_data_roots: str | None = None, val_ann_files: str | None = None, test_data_roots: str | None = None, test_ann_files: str | None = None, unlabeled_data_roots: str | None = None, unlabeled_file_list: str | None = None, cache_config: Dict[str, Any] | None = None, encryption_key: str | None = None, **kwargs)

Bases: object

Base dataset adapter for all downstream tasks that use Datumaro.

BaseDatasetAdapter mainly detects and imports the dataset using the functions implemented in Datumaro. It also prepares the common variables and functions (EmptyLabelSchema, LabelSchema, ..) that all tasks consume.

Parameters:

task_type (TaskType) – Type of the task
train_data_roots (Optional[str]) – Path for training data
train_ann_files (Optional[str]) – Path for training annotation file
val_data_roots (Optional[str]) – Path for validation data
val_ann_files (Optional[str]) – Path for validation annotation file
test_data_roots (Optional[str]) – Path for test data
test_ann_files (Optional[str]) – Path for test annotation file
unlabeled_data_roots (Optional[str]) – Path for unlabeled data
unlabeled_file_list (Optional[str]) – Path of unlabeled file list
encryption_key (Optional[str]) – Encryption key to load an encrypted dataset (only required for DatumaroBinary format)

Because the same adapter can be used for training, validation, or testing, train_data_roots, val_data_roots, and test_data_roots all default to None.

For example, in the training/validation phase test_data_roots is not used, and in the test phase train_data_roots and val_data_roots are not used.
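The phase rule above can be sketched as a small helper that keeps only the data roots a given phase uses and leaves the others at None. This is an illustrative sketch, not part of OTX: the function name data_roots_for_phase, the phase strings "train" and "test", and the example paths are all assumptions made for this example.

```python
from typing import Dict, Optional


def data_roots_for_phase(
    phase: str,
    train_data_roots: Optional[str] = None,
    val_data_roots: Optional[str] = None,
    test_data_roots: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: keep only the roots that the given phase uses."""
    if phase == "train":
        # Training/validation phase: the test root is not used.
        return {
            "train_data_roots": train_data_roots,
            "val_data_roots": val_data_roots,
            "test_data_roots": None,
        }
    if phase == "test":
        # Test phase: the train and validation roots are not used.
        return {
            "train_data_roots": None,
            "val_data_roots": None,
            "test_data_roots": test_data_roots,
        }
    raise ValueError(f"unknown phase: {phase!r}")


# Placeholder paths, chosen only for illustration.
train_kwargs = data_roots_for_phase(
    "train", train_data_roots="data/train", val_data_roots="data/val"
)
test_kwargs = data_roots_for_phase("test", test_data_roots="data/test")
```

The resulting dictionary could then be unpacked into the constructor of a concrete adapter subclass alongside task_type, since the keys match the keyword arguments listed above.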

static datum_media_2_otx_media(datumaro_media: MediaElement) → IMediaEntity: Convert Datumaro media to OTX media.

get_label_schema() → LabelSchemaEntity: Get Label Schema.

abstract get_otx_dataset() → DatasetEntity: Get DatasetEntity.

remove_unused_label_entities(used_labels: List)

Remove unused labels from the label entities.

The label entities are used to build the Label Schema, and an unused label in the Label Schema hurts model performance. This method therefore removes the unused labels from the label entities.

Parameters: used_labels (List) – list of indices of the used labels
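The idea of filtering label entities by index can be sketched as follows. This is a minimal illustration under stated assumptions, not the OTX implementation: SketchLabel is a stand-in for a label entity, keep_used_labels is a hypothetical free function, and the sketch returns a new list, whereas the real method may update the adapter's state in place.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchLabel:
    """Stand-in for a label entity; not an OTX class."""

    name: str


def keep_used_labels(
    label_entities: List[SketchLabel], used_labels: List[int]
) -> List[SketchLabel]:
    """Return only the entities whose index appears in used_labels."""
    used = set(used_labels)  # set lookup keeps the filter linear in the label count
    return [label for index, label in enumerate(label_entities) if index in used]


labels = [SketchLabel("cat"), SketchLabel("dog"), SketchLabel("bird")]
kept = keep_used_labels(labels, used_labels=[0, 2])
print([label.name for label in kept])  # ['cat', 'bird']
```

The original order of the entities is preserved, so the label at index 1 ("dog"), which no annotation uses in this example, simply drops out.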