Auto-configuration
Auto-configuration for a deep learning framework means automatically finding the most appropriate training settings based on the dataset and the task at hand. It saves time, simplifies interaction with the OpenVINO™ Training Extensions CLI, and gives a better baseline for the given dataset.
To this end, we developed a simple auto-configuration feature that eases training and validation with our framework. To start training and obtain a good baseline with the best trade-off between accuracy and speed, we only need to pass a dataset in a supported format, without specifying anything else:
$ otx train --train-data-roots <path_to_data_root>
Note
OpenVINO™ Training Extensions also supports otx build mode with the auto-configuration feature. We can build an OpenVINO™ Training Extensions workspace with the following CLI command:
$ otx build --train-data-roots <path_to_data_root>
Moreover, the dataset does not need to have train/val splits at all. The Datumaro manager integrated into OpenVINO™ Training Extensions handles this on its own.
It recognizes the task by analyzing the dataset and, if there is no validation split, performs a random auto-split and saves it to the workspace. The saved split can then be used with otx optimize or otx train.
Note
Currently, the Datumaro auto-split feature supports three formats: Imagenet (multi-class classification), COCO (detection) and Cityscapes (semantic segmentation).
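To make the idea of a random auto-split concrete, here is a minimal Python sketch of a seeded train/val split over a list of sample identifiers. It is only an illustration and not Datumaro's implementation: the function name random_split, the 20% validation ratio, and the fixed seed are all assumptions made for this example.

```python
import random


def random_split(items, val_ratio=0.2, seed=0):
    """Return (train, val) lists obtained from a seeded random shuffle.

    Hypothetical helper: val_ratio and seed are illustrative defaults,
    not values taken from Datumaro or OpenVINO Training Extensions.
    """
    shuffled = list(items)                  # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)   # deterministic for a given seed
    n_val = int(len(shuffled) * val_ratio)  # number of validation samples
    return shuffled[n_val:], shuffled[:n_val]


# Ten placeholder sample identifiers standing in for dataset items.
image_ids = [f"img_{i:03d}" for i in range(10)]
train_ids, val_ids = random_split(image_ids)
print(len(train_ids), len(val_ids))  # prints "8 2"
```

Fixing the seed makes the split reproducible, which mirrors the reason a split is saved to the workspace: later commands should see the same partition.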
After dataset preparation, training starts with the medium-sized template to achieve competitive accuracy while preserving fast inference.
Supported dataset formats for each task:
classification: Imagenet, COCO (multi-label), custom hierarchical
object detection: COCO, Pascal-VOC, YOLO
semantic segmentation: Common Semantic Segmentation, Pascal-VOC, Cityscapes, ADE20k
action classification: CVAT
action detection: CVAT
anomaly classification: MVTec
anomaly detection: MVTec
anomaly segmentation: MVTec
instance segmentation: COCO, Pascal-VOC
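The list above can also be read as a lookup table from task to dataset formats. The Python sketch below encodes it as a dictionary and inverts the lookup to show which tasks share a given format. The dictionary name, the helper candidate_tasks, and the pairing of each list entry with an upper-case task identifier are assumptions made for illustration; this is not part of the OpenVINO™ Training Extensions API.

```python
# Task -> dataset formats, transcribed from the list above.
# The upper-case keys are assumed to match the --task option values.
SUPPORTED_FORMATS = {
    "CLASSIFICATION": ["Imagenet", "COCO", "custom hierarchical"],  # COCO: multi-label
    "DETECTION": ["COCO", "Pascal-VOC", "YOLO"],
    "SEGMENTATION": ["Common Semantic Segmentation", "Pascal-VOC", "Cityscapes", "ADE20k"],
    "ACTION_CLASSIFICATION": ["CVAT"],
    "ACTION_DETECTION": ["CVAT"],
    "ANOMALY_CLASSIFICATION": ["MVTec"],
    "ANOMALY_DETECTION": ["MVTec"],
    "ANOMALY_SEGMENTATION": ["MVTec"],
    "INSTANCE_SEGMENTATION": ["COCO", "Pascal-VOC"],
}


def candidate_tasks(dataset_format):
    """Return every task whose supported formats include dataset_format."""
    return [
        task
        for task, formats in SUPPORTED_FORMATS.items()
        if dataset_format in formats
    ]


print(candidate_tasks("COCO"))  # ['CLASSIFICATION', 'DETECTION', 'INSTANCE_SEGMENTATION']
print(candidate_tasks("YOLO"))  # ['DETECTION']
```

A format that maps to more than one task, such as COCO, cannot identify the task on its own.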
If the dataset format is shared by several tasks, for example COCO format, we should state the task type explicitly by running otx build first with an additional CLI option. Otherwise, OpenVINO™ Training Extensions chooses a task type automatically, and it may not be the one you intend:
$ otx build --train-data-roots <path_to_data_root> \
            --task {CLASSIFICATION, DETECTION, SEGMENTATION, ACTION_CLASSIFICATION, ACTION_DETECTION, ANOMALY_CLASSIFICATION, ANOMALY_DETECTION, ANOMALY_SEGMENTATION, INSTANCE_SEGMENTATION}
This creates a task-specific workspace folder with a configured template and, if supported, an automatic dataset split.
Move to this folder and run the following command without any options to start training:
$ otx train
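When the build step is driven from a script rather than typed by hand, the command can be assembled and checked before it is run. The sketch below is a hypothetical helper, not part of OpenVINO™ Training Extensions: otx_build_argv and the path data/coco_subset are invented for illustration, and only the otx build, --train-data-roots, and --task spellings shown in this section are used.

```python
import shlex

# Task identifiers accepted by the --task option, as listed in this section.
TASKS = (
    "CLASSIFICATION", "DETECTION", "SEGMENTATION",
    "ACTION_CLASSIFICATION", "ACTION_DETECTION",
    "ANOMALY_CLASSIFICATION", "ANOMALY_DETECTION", "ANOMALY_SEGMENTATION",
    "INSTANCE_SEGMENTATION",
)


def otx_build_argv(data_root, task=None):
    """Assemble the argument list for `otx build` (hypothetical helper).

    With task=None the task type is left to auto-configuration.
    """
    argv = ["otx", "build", "--train-data-roots", str(data_root)]
    if task is not None:
        if task not in TASKS:
            raise ValueError(f"unknown task type: {task!r}")
        argv += ["--task", task]
    return argv


# "data/coco_subset" is a placeholder path used only for this example.
argv = otx_build_argv("data/coco_subset", task="DETECTION")
print(shlex.join(argv))
# prints: otx build --train-data-roots data/coco_subset --task DETECTION
```

The resulting list can be passed to subprocess.run(argv, check=True) in an environment where the otx CLI is installed. Validating the task name first catches typos before the CLI is invoked.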