Use Self-Supervised Learning#
This tutorial introduces how to train a model using self-supervised learning and how to fine-tune the model with the pre-trained weights. OpenVINO™ Training Extensions provides self-supervised learning methods for multi-class classification and semantic segmentation.
The process has been tested on the following configuration:
Ubuntu 20.04
NVIDIA GeForce RTX 3090
Intel(R) Core(TM) i9-10980XE
CUDA Toolkit 11.7
Note
This example demonstrates how to work with self-supervised learning for classification. There are some differences between classification and semantic segmentation, so notes specific to self-supervised learning for semantic segmentation are added where relevant.
Setup virtual environment#
1. You can follow the installation process from the quick start guide to create a universal virtual environment for OpenVINO™ Training Extensions.
2. Activate your virtual environment:
source .otx/bin/activate
# or use this line if you created the environment using tox
. venv/otx/bin/activate
Pre-training#
1. In this self-supervised learning tutorial, images from the flowers dataset and the MobileNet-V3-large-1x model are used.
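If the dataset is not downloaded yet, one way to fetch it is the TensorFlow-hosted archive used in the classification tutorial (the URL and the data/ destination folder are assumptions here, not requirements of this tutorial):
(otx) ...$ mkdir -p data && cd data
# URL assumed from the TensorFlow flowers example
(otx) ...$ wget http://download.tensorflow.org/example_images/flower_photos.tgz
(otx) ...$ tar -xzvf flower_photos.tgz && cd ..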
2. Prepare an OpenVINO™ Training Extensions workspace for supervised learning by running the following command:
(otx) ...$ otx build --train-data-roots data/flower_photos --model MobileNet-V3-large-1x
[*] Workspace Path: otx-workspace-CLASSIFICATION
[*] Load Model Template ID: Custom_Image_Classification_MobileNet-V3-large-1x
[*] Load Model Name: MobileNet-V3-large-1x
[*] - Updated: otx-workspace-CLASSIFICATION/model.py
[*] - Updated: otx-workspace-CLASSIFICATION/data_pipeline.py
[*] - Updated: otx-workspace-CLASSIFICATION/deployment.py
[*] - Updated: otx-workspace-CLASSIFICATION/hpo_config.yaml
[*] - Updated: otx-workspace-CLASSIFICATION/model_hierarchical.py
[*] - Updated: otx-workspace-CLASSIFICATION/model_multilabel.py
[*] - Updated: otx-workspace-CLASSIFICATION/compression_config.json
[*] Update data configuration file to: otx-workspace-CLASSIFICATION/data.yaml
3. Prepare an OpenVINO™ Training Extensions workspace for self-supervised learning by running the following command:
(otx) ...$ otx build --train-data-roots data/flower_photos --model MobileNet-V3-large-1x --train-type Selfsupervised --workspace otx-workspace-CLASSIFICATION-Selfsupervised
[*] Workspace Path: otx-workspace-CLASSIFICATION-Selfsupervised
[*] Load Model Template ID: Custom_Image_Classification_MobileNet-V3-large-1x
[*] Load Model Name: MobileNet-V3-large-1x
[*] - Updated: otx-workspace-CLASSIFICATION-Selfsupervised/selfsl/model.py
[*] - Updated: otx-workspace-CLASSIFICATION-Selfsupervised/selfsl/data_pipeline.py
[*] - Updated: otx-workspace-CLASSIFICATION-Selfsupervised/deployment.py
[*] - Updated: otx-workspace-CLASSIFICATION-Selfsupervised/hpo_config.yaml
[*] - Updated: otx-workspace-CLASSIFICATION-Selfsupervised/model_hierarchical.py
[*] - Updated: otx-workspace-CLASSIFICATION-Selfsupervised/model_multilabel.py
[*] Update data configuration file to: otx-workspace-CLASSIFICATION-Selfsupervised/data.yaml
Note
One important thing must be considered when setting up the workspace for self-supervised learning:
It is also possible to pass a directory containing only images to --train-data-roots; in that case --train-type Selfsupervised is not needed, because OpenVINO™ Training Extensions recognizes this training type automatically. However, if you pass a full ImageNet-format dataset (with class sub-folders inside), this option is mandatory, since otherwise the data cannot be distinguished from a supervised training dataset. A sketch of the first case is shown below.
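For example, a minimal sketch of the image-folder case, where data/unlabeled_images is a hypothetical path standing for any directory that holds images without class sub-folders:
# data/unlabeled_images is a hypothetical folder with images only, no class sub-folders
(otx) ...$ otx build --train-data-roots data/unlabeled_images --model MobileNet-V3-large-1x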
After the workspace creation, the workspace structure is as follows:
otx-workspace-CLASSIFICATION
├── compression_config.json
├── configuration.yaml
├── data_pipeline.py
├── data.yaml
├── deployment.py
├── hpo_config.yaml
├── model_hierarchical.py
├── model_multilabel.py
├── model.py
├── splitted_dataset
│ ├── train
│ └── val
└── template.yaml
otx-workspace-CLASSIFICATION-Selfsupervised
├── configuration.yaml
├── data.yaml
├── deployment.py
├── hpo_config.yaml
├── model_hierarchical.py
├── model_multilabel.py
├── selfsl
│ ├── data_pipeline.py
│ └── model.py
└── template.yaml
Note
For semantic segmentation, --train-data-roots must be set to a directory containing only images, as shown below. For the VOC2012 dataset used in the semantic segmentation tutorial, for example, the path data/VOCdevkit/VOC2012/JPEGImages must be set instead of data/VOCdevkit/VOC2012. Please refer to Explanation of Self-Supervised Learning for Semantic Segmentation. The --train-type option is not needed.
(otx) ...$ otx build --train-data-roots data/VOCdevkit/VOC2012/JPEGImages \
--model Lite-HRNet-18-mod2
4. To start training, we need to call the otx train command in the self-supervised learning workspace:
(otx) ...$ cd otx-workspace-CLASSIFICATION-Selfsupervised
(otx) ...$ otx train --data ../otx-workspace-CLASSIFICATION/data.yaml
...
2023-02-23 19:41:36,879 | INFO : Iter [4970/5000] lr: 8.768e-05, eta: 0:00:29, time: 1.128, data_time: 0.963, memory: 7522, current_iters: 4969, loss: 0.2788
2023-02-23 19:41:46,371 | INFO : Iter [4980/5000] lr: 6.458e-05, eta: 0:00:19, time: 0.949, data_time: 0.782, memory: 7522, current_iters: 4979, loss: 0.2666
2023-02-23 19:41:55,806 | INFO : Iter [4990/5000] lr: 5.037e-05, eta: 0:00:09, time: 0.943, data_time: 0.777, memory: 7522, current_iters: 4989, loss: 0.2793
2023-02-23 19:42:05,105 | INFO : Saving checkpoint at 5000 iterations
2023-02-23 19:42:05,107 | INFO : ----------------- BYOL.state_dict_hook() called
2023-02-23 19:42:05,314 | WARNING : training progress 100%
2023-02-23 19:42:05,315 | INFO : Iter [5000/5000] lr: 4.504e-05, eta: 0:00:00, time: 0.951, data_time: 0.764, memory: 7522, current_iters: 4999, loss: 0.2787
2023-02-23 19:42:05,319 | INFO : run task done.
2023-02-23 19:42:05,323 | INFO : called save_model
2023-02-23 19:42:05,498 | INFO : Final model performance: Performance(score: -1, dashboard: (6 metric groups))
2023-02-23 19:42:05,499 | INFO : train done.
[*] Save Model to: models
Note
To use the same split train dataset, set --data ../otx-workspace-CLASSIFICATION/data.yaml instead of using the data.yaml in the self-supervised learning workspace.
The training produces two artifacts, weights.pth and label_schema.json, and we can use the weights to fine-tune the model on the target dataset.
The final model performance is reported as -1; this is expected, because self-supervised pre-training does not compute an accuracy metric.
Let’s see how to fine-tune the model using pre-trained weights below.
Fine-tuning#
After pre-training finishes, start fine-tuning in the supervised learning workspace by running the command below with the --load-weights argument:
(otx) ...$ cd ../otx-workspace-CLASSIFICATION
(otx) ...$ otx train --load-weights ../otx-workspace-CLASSIFICATION-Selfsupervised/models/weights.pth
...
2023-02-23 20:56:24,307 | INFO : run task done.
2023-02-23 20:56:28,883 | INFO : called evaluate()
2023-02-23 20:56:28,895 | INFO : Accuracy after evaluation: 0.9604904632152589
2023-02-23 20:56:28,896 | INFO : Evaluation completed
Performance(score: 0.9604904632152589, dashboard: (3 metric groups))
For comparison, we can also check the performance obtained without pre-trained weights:
(otx) ...$ otx train
...
2023-02-23 18:24:34,453 | INFO : run task done.
2023-02-23 18:24:39,043 | INFO : called evaluate()
2023-02-23 18:24:39,056 | INFO : Accuracy after evaluation: 0.9550408719346049
2023-02-23 18:24:39,056 | INFO : Evaluation completed
Performance(score: 0.9550408719346049, dashboard: (3 metric groups))
With self-supervised learning, we can obtain well-adapted weights and train the model more accurately. This example shows only a small improvement (0.955 → 0.960), but when only a few samples are available, or the data is too difficult to train a model on from scratch, self-supervised pre-training can improve model performance much more significantly. You can find examples of such improvements in the self-supervised learning for classification documentation.
Note
Once we obtain the new model after fine-tuning, we can proceed with optimization and exporting as described in the classification tutorial.
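For reference, a minimal sketch of those follow-up steps from inside the supervised learning workspace, assuming the fine-tuned weights are stored at models/weights.pth and the default output locations are acceptable; the exact options are listed in the classification tutorial:
# paths and defaults assumed; see the classification tutorial for all options
(otx) ...$ otx export --load-weights models/weights.pth
(otx) ...$ otx optimize --load-weights models/weights.pth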