Models Optimization

OpenVINO™ Training Extensions provides an optimization algorithm: the Post-Training Quantization (PTQ) tool.

Post-Training Quantization Tool

PTQ optimizes model inference by applying post-training methods that require neither retraining nor fine-tuning. For more details on how PTQ works and on model optimization methods in general, please refer to the documentation.
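To give a feel for what "post-training" means here, the following is a minimal, illustrative sketch of affine quantization of already-trained float values to 8-bit integers. It is not the algorithm that PTQ actually runs; the 8-bit target, the function names `quantize_int8` and `dequantize`, and the simple min/max range estimate are all assumptions made for illustration only.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float, int]:
    """Map floats to the int8 range [-128, 127] with a scale and zero point.

    Illustrative only: real PTQ tools estimate ranges from calibration data
    and use more elaborate schemes.
    """
    # The representable range must include 0.0 so that zero maps exactly.
    lo = min(values + [0.0])
    hi = max(values + [0.0])
    # Fall back to 1.0 when all values are zero to avoid dividing by zero.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate float values from their int8 representation."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, 0.0, 0.5, 2.0]  # stand-in for trained weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
```

No training step appears anywhere in the sketch: the mapping is derived purely from the values of an existing model, which is why this family of methods needs no retraining. The difference between `weights` and `restored` is the quantization error that a real tool tries to keep small.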

To run post-training quantization, the model must first be converted to the OpenVINO™ Intermediate Representation (IR). For fast and accurate quantization, we use the DefaultQuantization algorithm for each task. Please refer to Tune quantization Parameters for further information about configuring the optimization.
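Because PTQ expects an IR model rather than a training checkpoint, it can be useful to check the path before starting optimization. The sketch below assumes the checkpoint you pass is the IR `.xml` file and relies on the fact that an OpenVINO™ IR model consists of an `.xml` topology file and a `.bin` weights file with the same base name. The helper `is_openvino_ir` is hypothetical and not part of OpenVINO™ Training Extensions.

```python
from pathlib import Path


def is_openvino_ir(checkpoint: str) -> bool:
    """Return True if `checkpoint` looks like an OpenVINO IR model.

    Hypothetical helper: checks for an existing .xml file with a sibling
    .bin file of the same base name. It does not validate file contents.
    """
    xml_path = Path(checkpoint)
    return (
        xml_path.suffix == ".xml"
        and xml_path.is_file()
        and xml_path.with_suffix(".bin").is_file()
    )
```

A call such as `is_openvino_ir("<IR-checkpoint-path>")` returning `False` would suggest that the model still needs to be converted to IR before it is optimized.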

Please refer to our dedicated tutorials on how to optimize your model using PTQ.

API

from otx.engine import Engine
...
engine.optimize(checkpoint="<IR-checkpoint-path>")

CLI

(otx) ...$ otx optimize ... --checkpoint <IR-checkpoint-path>