Auto-configuration

Auto-configuration for a deep learning framework means automatically finding the most appropriate training settings for the dataset and the task at hand. It saves time, simplifies interaction with the OpenVINO™ Training Extensions CLI, and gives a better baseline for the given dataset.

To this end, we developed a simple auto-configuration feature that eases training and validation with our framework. To start training and obtain a good baseline with the best trade-off between accuracy and speed, we only need to pass a dataset in the right format, without specifying anything else:

$ otx train --train-data-roots <path_to_data_root>

Note

OpenVINO™ Training Extensions also supports otx build mode with the auto-configuration feature. We can build an OpenVINO™ Training Extensions workspace with the following CLI command:

$ otx build --train-data-roots <path_to_data_root>

Moreover, the dataset does not need train/val splits at all. The Datumaro manager integrated into OpenVINO™ Training Extensions handles this on its own: it recognizes the task by analyzing the dataset and, if there is no validation split, performs a random auto-split and saves it to the workspace. The saved split can then be used with otx optimize or otx train.
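To make the idea of a random auto-split concrete, here is a minimal sketch in plain Python. It is not Datumaro's actual implementation: the function name random_split, the 20% validation ratio, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from typing import List, Sequence, Tuple


def random_split(
    items: Sequence[str], val_ratio: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle items deterministically and return (train, val) lists.

    Hypothetical helper: the ratio and seed are illustrative defaults,
    not values taken from Datumaro or OpenVINO Training Extensions.
    """
    if not 0.0 < val_ratio < 1.0:
        raise ValueError("val_ratio must be strictly between 0 and 1")
    shuffled = list(items)
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    if len(shuffled) < 2:
        # Nothing to split: keep everything for training.
        return shuffled, []
    # Keep at least one item on each side of the split.
    n_val = min(len(shuffled) - 1, max(1, round(len(shuffled) * val_ratio)))
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    images = [f"img_{i:03d}.jpg" for i in range(10)]
    train_items, val_items = random_split(images, val_ratio=0.2, seed=42)
    print(len(train_items), len(val_items))
```

Fixing the seed is what makes it possible to save a split once and reuse it later, which is the behavior described above for the workspace.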

Note

Currently, the Datumaro auto-split feature supports three formats: Imagenet (multi-class classification), COCO (detection) and Cityscapes (semantic segmentation).

After dataset preparation, training starts with the middle-sized template to achieve competitive accuracy while preserving fast inference.

Supported dataset formats for each task:

classification: Imagenet, COCO (multi-label), custom hierarchical
object detection: COCO, Pascal-VOC, YOLO
semantic segmentation: Common Semantic Segmentation, Pascal-VOC, Cityscapes, ADE20k
action classification: CVAT
action detection: CVAT
anomaly classification: MVTec
anomaly detection: MVTec
anomaly segmentation: MVTec
instance segmentation: COCO, Pascal-VOC
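The list above can be read as a lookup table from task to dataset format. The sketch below encodes it as a Python dictionary and inverts it to show which formats are shared by more than one task. The names SUPPORTED_FORMATS, tasks_for_format and is_ambiguous are hypothetical helpers for illustration and are not part of the OpenVINO™ Training Extensions API.

```python
from typing import Dict, List

# Task -> supported dataset formats, copied from the list above.
SUPPORTED_FORMATS: Dict[str, List[str]] = {
    "classification": ["Imagenet", "COCO", "custom hierarchical"],
    "object detection": ["COCO", "Pascal-VOC", "YOLO"],
    "semantic segmentation": [
        "Common Semantic Segmentation",
        "Pascal-VOC",
        "Cityscapes",
        "ADE20k",
    ],
    "action classification": ["CVAT"],
    "action detection": ["CVAT"],
    "anomaly classification": ["MVTec"],
    "anomaly detection": ["MVTec"],
    "anomaly segmentation": ["MVTec"],
    "instance segmentation": ["COCO", "Pascal-VOC"],
}


def tasks_for_format(dataset_format: str) -> List[str]:
    """Return every task that lists the given dataset format."""
    return [
        task
        for task, formats in SUPPORTED_FORMATS.items()
        if dataset_format in formats
    ]


def is_ambiguous(dataset_format: str) -> bool:
    """A format is ambiguous when more than one task accepts it."""
    return len(tasks_for_format(dataset_format)) > 1


if __name__ == "__main__":
    for fmt in ("COCO", "YOLO", "MVTec"):
        print(fmt, tasks_for_format(fmt), is_ambiguous(fmt))
```

Formats that map to several tasks, such as COCO, are the ones for which the format alone does not determine the task type.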

If a dataset format is shared by several tasks, for example the COCO format, we should specify the task type explicitly by running otx build first with an additional CLI option. Otherwise, OpenVINO™ Training Extensions chooses the task type automatically, and it may not be the one you intend:

$ otx build --train-data-roots <path_to_data_root> \
            --task {CLASSIFICATION, DETECTION, SEGMENTATION, ACTION_CLASSIFICATION, ACTION_DETECTION, ANOMALY_CLASSIFICATION, ANOMALY_DETECTION, ANOMALY_SEGMENTATION, INSTANCE_SEGMENTATION}

This creates a task-specific workspace folder with a configured template and an automatic dataset split, if supported.

Move to this folder and simply run the command without any options to start training:

$ otx train
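When the workflow above is driven from a script, the otx build invocation can be assembled and validated before it is run. The following is a minimal sketch that only builds the argument list and does not execute anything. The helper name build_command and the path data/coco are hypothetical; the --train-data-roots and --task options and the task names are the ones shown above.

```python
import shlex
from typing import List, Optional

# Task types accepted by --task, as listed in the command above.
TASK_TYPES = (
    "CLASSIFICATION",
    "DETECTION",
    "SEGMENTATION",
    "ACTION_CLASSIFICATION",
    "ACTION_DETECTION",
    "ANOMALY_CLASSIFICATION",
    "ANOMALY_DETECTION",
    "ANOMALY_SEGMENTATION",
    "INSTANCE_SEGMENTATION",
)


def build_command(data_root: str, task: Optional[str] = None) -> List[str]:
    """Assemble an `otx build` argument list (hypothetical helper).

    When `task` is omitted, the task type is left to auto-configuration.
    """
    cmd = ["otx", "build", "--train-data-roots", data_root]
    if task is not None:
        if task not in TASK_TYPES:
            raise ValueError(f"unknown task type: {task!r}")
        cmd += ["--task", task]
    return cmd


if __name__ == "__main__":
    # "data/coco" is a placeholder path used only for illustration.
    print(shlex.join(build_command("data/coco", "DETECTION")))
```

Keeping the command as a list rather than a single string avoids quoting problems with paths that contain spaces if the list is later handed to subprocess.run.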