# Train Your OpenVINO™ Model Using the YOLOv8 Trainer for Any Dataset Format

[![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge&logo=jupyter&logoColor=white)](https://github.com/openvinotoolkit/datumaro/blob/develop/notebooks/08_e2e_example_yolo_ultralytics_trainer.ipynb)

## Prerequisite
### Download the Six-sided Dice dataset
Download the [Six-sided Dice dataset from Kaggle](https://www.kaggle.com/datasets/nellbyler/d6-dice?resource=download) and extract it into your workspace directory. You will then have a `d6-dice` directory containing annotations and images in YOLO format, laid out as follows.

```bash
d6-dice
├── Annotations
│   ├── classes.txt
│   ├── IMG_20191208_111228.txt
│   ├── IMG_20191208_111246.txt
│   └── ...
└── Images
    ├── IMG_20191208_111228.jpg
    ├── IMG_20191208_111246.jpg
    └── ...
```

However, for import compatibility, an `obj.names` file must also exist at `d6-dice/obj.names`. This file lists the label names of the dataset, one per line (here, `dice1` through `dice6`), so you can create it with the following short snippet. See [YOLO Loose format](https://openvinotoolkit.github.io/datumaro/latest/docs/explanation/formats/yolo) for more details.

In [1]:
# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT

import os

root_dir = "d6-dice"

names = """
dice1
dice2
dice3
dice4
dice5
dice6
"""

# Write one label name per line, dropping the leading blank line of the string.
fpath = os.path.join(root_dir, "obj.names")
with open(fpath, "w") as fp:
    fp.write(names.lstrip())
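
As a quick sanity check, the file can be read back to confirm that it lists the six expected labels. The sketch below is not part of the original notebook: `read_label_names` is a hypothetical helper, and it assumes one label per line with blank lines ignored.

```python
import os


def read_label_names(fpath):
    """Return the non-empty lines of a label file, stripped of whitespace."""
    with open(fpath) as fp:
        return [line.strip() for line in fp if line.strip()]


if __name__ == "__main__":
    fpath = os.path.join("d6-dice", "obj.names")
    # Only read the file if the previous cell has already created it.
    if os.path.exists(fpath):
        print(read_label_names(fpath))
```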

## Import dataset

First, we import the dataset using the Datumaro Python API. The Six-sided Dice dataset has no subset split, so Datumaro places all items in a single "default" subset.

In [2]:
from datumaro import Dataset

dataset = Dataset.import_from("./d6-dice", format="yolo")
dataset

Dataset
	size=250
	source_path=./d6-dice
	media_type=
	annotated_items_count=250
	annotations_count=1795
subsets
	default: # of items=250, # of annotated items=250, # of annotations=1795, annotation types=['bbox']
infos
	categories
	label: ['dice1', 'dice2', 'dice3', 'dice4', 'dice5', 'dice6']
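
To see how the 1795 annotations are distributed over the six labels, the items can be tallied by label index. This is a minimal sketch, not part of the original notebook: `count_annotations_per_label` is a hypothetical helper that only assumes each item exposes an `annotations` list whose elements carry an integer `label`. The commented usage lines assume Datumaro's `AnnotationType` and `dataset.categories()` accessors.

```python
def count_annotations_per_label(items, label_names):
    """Count annotations per label name.

    `items` is any iterable of objects with an `annotations` list; each
    annotation is expected to carry an integer `label` index (or None).
    """
    counts = {name: 0 for name in label_names}
    for item in items:
        for ann in item.annotations:
            label = getattr(ann, "label", None)
            if label is None:
                continue  # skip annotations without a class label
            counts[label_names[label]] += 1
    return counts


# Hypothetical usage with the dataset imported above (requires Datumaro):
# from datumaro.components.annotation import AnnotationType
# label_cat = dataset.categories()[AnnotationType.label]
# label_names = [cat.name for cat in label_cat.items]
# print(count_annotations_per_label(dataset, label_names))
```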

## Split subsets and export dataset

The imported dataset has no subset split. However, the Ultralytics YOLO trainer requires "train" and "val" subsets ("test" is optional), so we create "train", "val", and "test" splits from the imported dataset.

In [3]:
splited_dataset = dataset.transform(
    "random_split", splits=[("train", 0.5), ("val", 0.2), ("test", 0.3)]
)
splited_dataset

Dataset
	size=250
	source_path=./d6-dice
	media_type=
	annotated_items_count=250
	annotations_count=1795
subsets
	test: # of items=75, # of annotated items=75, # of annotations=517, annotation types=['bbox']
	train: # of items=125, # of annotated items=125, # of annotations=951, annotation types=['bbox']
	val: # of items=50, # of annotated items=50, # of annotations=327, annotation types=['bbox']
infos
	categories
	label: ['dice1', 'dice2', 'dice3', 'dice4', 'dice5', 'dice6']
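
The subset sizes above (125, 50, and 75 items) follow directly from the 0.5 / 0.2 / 0.3 ratios applied to 250 items. The sketch below illustrates the general idea of a ratio-based random split in plain Python. It is an illustration only, not Datumaro's implementation; the function name, the `seed` parameter, and the rounding strategy are assumptions.

```python
import random


def random_split(items, splits, seed=None):
    """Shuffle `items` and cut them into named subsets by ratio.

    `splits` is a list of (name, ratio) pairs whose ratios sum to 1.
    The last subset receives whatever remains, so no item is lost to rounding.
    """
    total = sum(ratio for _, ratio in splits)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("split ratios must sum to 1")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    result, start, cumulative = {}, 0, 0.0
    for i, (name, ratio) in enumerate(splits):
        cumulative += ratio
        is_last = i == len(splits) - 1
        end = len(shuffled) if is_last else round(cumulative * len(shuffled))
        result[name] = shuffled[start:end]
        start = end
    return result


parts = random_split(range(250), [("train", 0.5), ("val", 0.2), ("test", 0.3)], seed=0)
print({name: len(subset) for name, subset in parts.items()})
# {'train': 125, 'val': 50, 'test': 75}
```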

Now, we export the split subsets in the "yolo_ultralytics" format for the Ultralytics YOLO trainer. Setting `save_media=True` is recommended: with this option enabled, Datumaro automatically copies the source images into the directory structure that the target dataset format expects.

In [4]:
splited_dataset.export("d6-dice-ultralytics", "yolo_ultralytics", save_media=True)
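
After the export, it can be useful to check what was actually written to disk. The sketch below makes no assumption about the exported layout; the hypothetical helper `count_files_per_dir` simply walks the export directory and reports the number of files in each folder, which can then be compared against the split sizes shown earlier.

```python
import os


def count_files_per_dir(root):
    """Map each directory under `root` (as a relative path) to its file count."""
    summary = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)  # the root itself becomes "."
        summary[rel] = len(filenames)
    return summary


# Only report if the export above has already been run.
if os.path.isdir("d6-dice-ultralytics"):
    for rel, n_files in sorted(count_files_per_dir("d6-dice-ultralytics").items()):
        print(f"{rel}: {n_files} file(s)")
```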

## Train model with Ultralytics YOLOv8 trainer

First, we install the Ultralytics YOLOv8 trainer, which we use to train the model and export it to [OpenVINO™ Intermediate Representation (IR)](https://docs.openvino.ai/latest/home.html). To export to OpenVINO™ IR, install it with the `export` extra (`ultralytics[export]`).

In [None]:
%pip install ultralytics[export]

In [2]:
import os.path as osp

# To give the Ultralytics YOLO trainer an arbitrary dataset path,
# you must provide its absolute path.
data_fpath = osp.abspath(osp.join("d6-dice-ultralytics", "data.yaml"))
model_fpath = osp.abspath(osp.join("d6-dice-project", "train", "weights", "best.pt"))

### Train yolov8n model
We will train a `yolov8n` model on the Six-sided Dice dataset for 100 epochs.

In [7]:
!yolo detect train model=yolov8n.pt data={data_fpath} epochs=100 imgsz=640 project=d6-dice-project

Ultralytics YOLOv8.0.53 🚀 Python-3.9.13 torch-1.13.1+cu117 CUDA:0 (NVIDIA GeForce RTX 3090, 24268MiB)
yolo/engine/trainer: task=detect, mode=train, model=yolov8n.pt, data=/home/vinnamki/datumaro/notebooks/d6-dice-ultralytics/data.yaml, epochs=100, patience=50, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=d6-dice-project, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=True, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format

### Evaluate on the test set

The trained model is now saved at `model_fpath`. We can evaluate it on the test subset as follows.

In [3]:
!yolo detect val model={model_fpath} data={data_fpath} split=test

Ultralytics YOLOv8.0.53 🚀 Python-3.9.13 torch-1.13.1+cu117 CUDA:0 (NVIDIA GeForce RTX 3090, 24268MiB)
Model summary (fused): 168 layers, 3006818 parameters, 0 gradients, 8.1 GFLOPs
val: Scanning /home/vinnamki/datumaro/notebooks/d6-dice-ultralytics/labels/test.
 Class   Images   Instances   Box(P   R       mAP50   m
 all     75       517         0.953   0.932   0.975   0.632
 dice1   75       83          0.977   0.952   0.987   0.662
 dice2   75       101         0.951   0.931   0.976   0.649
 dice3   75       84          0.962   0.903   0.96    0.596
 dice4   75       82          0.93    0.97    0.98    0.615
 dice5   75       88          0.938   0.92    0.969   0.629
 dice6   75       79          0.96    0.914   0.976   0.642
Speed: 1.5ms preprocess, 1.0ms inference, 0.0ms loss, 26.3ms postprocess per image
Results saved to /home/vinnamki/ultralytics/runs/detect/val4
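
In this output, the `all` row appears to be the unweighted (macro) average of the six per-class rows, and its instance count is their sum. The short check below reproduces that arithmetic from the numbers printed above. It is a reading of this particular table, not a statement about how Ultralytics computes its metrics internally.

```python
# Per-class rows copied from the validation output above:
# (instances, precision, recall, mAP50, last column)
per_class = {
    "dice1": (83, 0.977, 0.952, 0.987, 0.662),
    "dice2": (101, 0.951, 0.931, 0.976, 0.649),
    "dice3": (84, 0.962, 0.903, 0.96, 0.596),
    "dice4": (82, 0.93, 0.97, 0.98, 0.615),
    "dice5": (88, 0.938, 0.92, 0.969, 0.629),
    "dice6": (79, 0.96, 0.914, 0.976, 0.642),
}
reported_all = (517, 0.953, 0.932, 0.975, 0.632)

rows = list(per_class.values())
total_instances = sum(row[0] for row in rows)
# Unweighted mean of each metric column, rounded to three decimals like the log.
macro = tuple(round(sum(row[i] for row in rows) / len(rows), 3) for i in range(1, 5))

print(total_instances, macro)
print((total_instances, *macro) == reported_all)
```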


### Export the trained model to OpenVINO™ IR

So far, we have trained our `YOLOv8` model by converting the dataset format with Datumaro and passing the result to the Ultralytics YOLOv8 trainer CLI. The final step is to export the trained model to [OpenVINO™ IR](https://docs.openvino.ai/latest/home.html) to accelerate model inference on Intel® devices.

In [4]:
!yolo detect export model={model_fpath} format=openvino

Ultralytics YOLOv8.0.53 🚀 Python-3.9.13 torch-1.13.1+cu117 CPU
Model summary (fused): 168 layers, 3006818 parameters, 0 gradients, 8.1 GFLOPs

PyTorch: starting from /home/vinnamki/datumaro/notebooks/d6-dice-project/train/weights/best.pt with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 10, 8400) (5.9 MB)

ONNX: starting export with onnx 1.13.1...
ONNX: export success ✅ 0.4s, saved as /home/vinnamki/datumaro/notebooks/d6-dice-project/train/weights/best.onnx (11.7 MB)

OpenVINO: starting export with openvino 2022.3.0-9052-9752fafe8eb-releases/2022/3...
OpenVINO: export success ✅ 0.7s, saved as /home/vinnamki/datumaro/notebooks/d6-dice-project/train/weights/best_openvino_model/ (11.8 MB)

Export complete (1.4s)
Results saved to /home/vinnamki/datumaro/notebooks/d6-dice-project/train/weights
Predict: yolo predict task=detect model=/home/vinnamki/datumaro/notebooks/d6-dice-project/train/weights/best_openv
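
The export log reports an output shape of `(1, 10, 8400)`. A plausible reading, assumed here rather than stated in the log, is that each of the 8400 candidates carries 4 box values (center x, center y, width, height) followed by 6 class scores, one per dice label. The sketch below decodes such an output for a single image in plain Python. `decode_candidates` and its `conf_threshold` default are hypothetical, and non-maximum suppression is deliberately left out.

```python
def decode_candidates(pred, conf_threshold=0.25):
    """Turn one image's raw prediction into (x1, y1, x2, y2, score, class_id) tuples.

    `pred` has 4 + num_classes rows and one column per candidate, e.g. the
    (10, 8400) slice `output[0]`. Assumed row layout: cx, cy, w, h, then scores.
    """
    detections = []
    for cx, cy, w, h, *scores in zip(*pred):
        # Pick the best-scoring class for this candidate.
        class_id = max(range(len(scores)), key=scores.__getitem__)
        score = scores[class_id]
        if score < conf_threshold:
            continue
        # Convert center/size to corner coordinates.
        detections.append(
            (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score, class_id)
        )
    return detections


# Two toy candidates with six class scores each; only the first passes the threshold.
toy = [
    [100.0, 300.0],  # cx
    [120.0, 300.0],  # cy
    [40.0, 10.0],    # w
    [20.0, 10.0],    # h
    [0.05, 0.01],    # score for class 0
    [0.90, 0.02],    # score for class 1
    [0.01, 0.03],    # score for class 2
    [0.02, 0.01],    # score for class 3
    [0.01, 0.02],    # score for class 4
    [0.01, 0.01],    # score for class 5
]
print(decode_candidates(toy))
```

Overlapping boxes for the same object would still need to be merged, for example with non-maximum suppression, before the result is usable as final detections.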