Inference Adapter#

class model_api.adapters.inference_adapter.InferenceAdapter#

Bases: ABC

An abstract Model Adapter with the following interface:

Reading the model from disk or another location

Loading the model to the device

Accessing the information about inputs/outputs

Reshaping the model

Synchronous model inference

Asynchronous model inference

An abstract Model Adapter constructor. Reads the model from disk or another location.

abstract await_all()#

In case of asynchronous execution, waits for the completion of all busy infer requests.

abstract await_any()#

In case of asynchronous execution, waits until any busy infer request completes and becomes available for data submission.

abstract embed_preprocessing(layout, resize_mode, interpolation_mode, target_shape, pad_value, dtype=<class 'int'>, brg2rgb=False, mean=None, scale=None, input_idx=0)#

Embeds preprocessing into the model if the adapter being used supports it. In some cases, this method only adds extra Python preprocessing steps instead of actually embedding them into the model representation.

Parameters:

layout (str) – Layout, for instance NCHW.
resize_mode (str) – Resize type to use for preprocessing.
interpolation_mode (str) – Resize interpolation mode.
target_shape (tuple[int, ...]) – Target resize shape.
pad_value (int) – Value to pad with if resize implies padding.
dtype (type, optional) – Input data type for the preprocessing module. Defaults to int.
bgr2rgb (bool, optional) – Whether to swap the R and B channels for image input. Defaults to False. Note that the signature above spells this keyword brg2rgb.
mean (list[Any] | None, optional) – Mean values to perform input normalization. Defaults to None.
scale (list[Any] | None, optional) – Scale values to perform input normalization. Defaults to None.
input_idx (int, optional) – Index of the model input to apply preprocessing to. Defaults to 0.
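
The description above mentions that some adapters fall back to extra Python preprocessing steps. A minimal sketch of what such a fallback might look like is shown below. The helper name python_preprocess, the nested-list image representation, and the normalization order (channel swap, then subtract mean, then divide by scale) are assumptions made for illustration and are not taken from the library.

```python
from typing import Optional, Sequence

Pixel = list[float]        # channel values of one pixel, e.g. [B, G, R]
Image = list[list[Pixel]]  # H x W x C nested lists (HWC layout)


def python_preprocess(
    image: Image,
    bgr2rgb: bool = False,
    mean: Optional[Sequence[float]] = None,
    scale: Optional[Sequence[float]] = None,
) -> Image:
    """Hypothetical Python-side fallback: channel swap, then (x - mean) / scale."""
    result: Image = []
    for row in image:
        new_row = []
        for pixel in row:
            channels = list(pixel)
            if bgr2rgb:
                # Reverse the channel order: BGR -> RGB.
                channels = channels[::-1]
            if mean is not None:
                # Assumed convention: mean is subtracted per channel.
                channels = [c - m for c, m in zip(channels, mean)]
            if scale is not None:
                # Assumed convention: scale divides per channel.
                channels = [c / s for c, s in zip(channels, scale)]
            new_row.append(channels)
        result.append(new_row)
    return result


image = [[[10.0, 20.0, 30.0]]]  # a single BGR pixel
out = python_preprocess(
    image, bgr2rgb=True, mean=[10.0, 10.0, 10.0], scale=[2.0, 2.0, 2.0]
)
print(out)  # [[[10.0, 5.0, 0.0]]]
```

An adapter that can embed these steps into the model representation would not need such a function at inference time.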

abstract get_input_layers()#

Gets the names of the model inputs and creates a Metadata structure for each one, containing the input shape, layout, precision in OpenVINO format, and optional meta information.

Returns:

the dict containing Metadata for all inputs

abstract get_model()#

Gets the model.

abstract get_output_layers()#

Gets the names of the model outputs and creates a Metadata structure for each one, containing the output shape, layout, precision in OpenVINO format, and optional meta information.

Returns:

the dict containing Metadata for all outputs

abstract get_raw_result(infer_result)#

Gets raw results from the internal inference framework representation as a dict.

Parameters:

infer_result (-) – framework-specific result of inference from the model

Returns:

the raw model output as a dict in the following format: {

'output_layer_name_1': raw_result_1, 'output_layer_name_2': raw_result_2, …

}

Return type:

dict

abstract get_rt_info(path)#

Returns an attribute stored in model info.

Parameters: path (list[str]) – a sequence of tag names leading to the attribute.
Returns: the value stored under the corresponding tag sequence.
Return type: Any
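
To illustrate what "a sequence of tag names leading to the attribute" means, the sketch below walks a plain nested dict by such a path. The helper lookup_by_path and the example keys are hypothetical. A real adapter reads from the model's own info storage, which this sketch does not model.

```python
from typing import Any


def lookup_by_path(info: dict[str, Any], path: list[str]) -> Any:
    """Walk a nested dict by a sequence of tag names (hypothetical helper)."""
    node: Any = info
    for tag in path:
        if not isinstance(node, dict) or tag not in node:
            raise KeyError(f"Cannot resolve {path!r}: missing tag {tag!r}")
        node = node[tag]
    return node


# Illustrative content only; the keys are not guaranteed to match any real model.
model_info = {"model_info": {"model_type": "Classification", "labels": "cat dog"}}
print(lookup_by_path(model_info, ["model_info", "model_type"]))  # Classification
```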

abstract infer_async(dict_data, callback_data)#

Performs asynchronous model inference and sets the callback for inference completion. Implementations should also define get_raw_result(), which handles the inference result returned by the model.

Parameters:

dict_data (-) – the data submitted to the model for inference, in the following format: {

'input_layer_name_1': data_1, 'input_layer_name_2': data_2, …

}

callback_data (-) – the data for the callback, which is picked up after model inference has finished
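
The asynchronous methods (set_callback, is_ready, await_any, infer_async, await_all) are typically used together in a submission loop. The sketch below shows one possible shape of that loop with a thread-based stand-in. ToyAsyncAdapter does not subclass the real InferenceAdapter, and the callback signature (result, callback_data) is an assumption made for this example.

```python
import threading
from concurrent.futures import FIRST_COMPLETED, Future, ThreadPoolExecutor, wait
from typing import Any, Callable


class ToyAsyncAdapter:
    """Stand-in adapter; it does NOT subclass the real InferenceAdapter."""

    def __init__(self, num_requests: int = 2) -> None:
        self._num_requests = num_requests
        self._pool = ThreadPoolExecutor(max_workers=num_requests)
        self._busy: set[Future] = set()
        self._lock = threading.Lock()
        self._callback: Callable[[Any, Any], None] = lambda result, data: None

    def set_callback(self, callback_fn: Callable[[Any, Any], None]) -> None:
        self._callback = callback_fn

    def is_ready(self) -> bool:
        # Ready when at least one "infer request" slot is free.
        with self._lock:
            return len(self._busy) < self._num_requests

    def infer_async(self, dict_data: dict, callback_data: Any) -> None:
        def job() -> None:
            # Fake "inference": double every input value.
            result = {k: [v * 2 for v in vals] for k, vals in dict_data.items()}
            self._callback(result, callback_data)

        future = self._pool.submit(job)
        with self._lock:
            self._busy.add(future)
        future.add_done_callback(self._release)

    def _release(self, future: Future) -> None:
        with self._lock:
            self._busy.discard(future)

    def await_any(self) -> None:
        with self._lock:
            pending = list(self._busy)
        if pending:
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            with self._lock:
                self._busy -= done

    def await_all(self) -> None:
        with self._lock:
            pending = list(self._busy)
        done, _ = wait(pending)
        with self._lock:
            self._busy -= done


results: dict[int, dict] = {}
adapter = ToyAsyncAdapter(num_requests=2)
adapter.set_callback(lambda result, frame_id: results.__setitem__(frame_id, result))

for frame_id in range(4):
    # Block until a slot is free, then submit the next input.
    while not adapter.is_ready():
        adapter.await_any()
    adapter.infer_async({"input_layer_name_1": [frame_id, frame_id + 1]}, frame_id)

adapter.await_all()
print(sorted(results))  # [0, 1, 2, 3]
```

Passing an identifier such as frame_id as callback_data lets the callback match each result to the input that produced it, since completion order is not guaranteed.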

abstract infer_sync(dict_data)#

Performs synchronous model inference. This is a blocking call.

Parameters:

dict_data (-) – the data submitted to the model for inference, in the following format: {

'input_layer_name_1': data_1, 'input_layer_name_2': data_2, …

}

Returns:

the raw model output as a dict in the following format: {

'output_layer_name_1': raw_result_1, 'output_layer_name_2': raw_result_2, …

}

Return type:

dict
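
The input and output dictionaries can be pictured with a toy stand-in, sketched below. ToySyncAdapter is hypothetical and does not subclass the real InferenceAdapter. Its "inference" is an element-wise sum chosen only to keep the example self-contained.

```python
from typing import Any


class ToySyncAdapter:
    """Stand-in adapter; it does NOT subclass the real InferenceAdapter."""

    input_names = ("input_layer_name_1", "input_layer_name_2")

    def infer_sync(self, dict_data: dict[str, Any]) -> dict[str, Any]:
        missing = [name for name in self.input_names if name not in dict_data]
        if missing:
            raise KeyError(f"Missing inputs: {missing}")
        # Fake "inference": element-wise sum of the two inputs.
        a = dict_data["input_layer_name_1"]
        b = dict_data["input_layer_name_2"]
        # Blocking call: the result is complete when the method returns.
        return {"output_layer_name_1": [x + y for x, y in zip(a, b)]}


adapter = ToySyncAdapter()
raw_result = adapter.infer_sync(
    {"input_layer_name_1": [1, 2, 3], "input_layer_name_2": [10, 20, 30]}
)
print(raw_result)  # {'output_layer_name_1': [11, 22, 33]}
```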

abstract is_ready()#

In case of asynchronous execution, checks whether input data can be submitted to the model for inference or whether all infer requests are busy.

Return type:

bool

Returns:

the boolean flag indicating whether the input data can be submitted to the model for inference

abstract load_model()#

Loads the model on the device.

abstract reshape_model(new_shape)#

Reshapes the model inputs to fit the new input shape.

Parameters:

new_shape (-) – the dictionary with input names as keys and lists describing the new shapes as values, in the following format: {

'input_layer_name_1': [1, 128, 128, 3], 'input_layer_name_2': [1, 128, 128, 3], …

}
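
The new_shape format can be illustrated with a small helper that applies such a dictionary to a table of current shapes. The function apply_new_shape is hypothetical. In particular, keeping the shape of inputs that are not listed in new_shape is an assumption of this sketch, not documented adapter behavior.

```python
def apply_new_shape(
    current_shapes: dict[str, list[int]],
    new_shape: dict[str, list[int]],
) -> dict[str, list[int]]:
    """Return a copy of current_shapes updated with new_shape (hypothetical helper)."""
    unknown = sorted(set(new_shape) - set(current_shapes))
    if unknown:
        raise ValueError(f"Unknown input names: {unknown}")
    # Copy first so the caller's dictionary is left untouched.
    updated = {name: list(shape) for name, shape in current_shapes.items()}
    updated.update({name: list(shape) for name, shape in new_shape.items()})
    return updated


shapes = {
    "input_layer_name_1": [1, 224, 224, 3],
    "input_layer_name_2": [1, 224, 224, 3],
}
reshaped = apply_new_shape(shapes, {"input_layer_name_1": [1, 128, 128, 3]})
print(reshaped["input_layer_name_1"])  # [1, 128, 128, 3]
print(reshaped["input_layer_name_2"])  # [1, 224, 224, 3]
```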

abstract save_model(path, weights_path, version)#

Serializes model to the filesystem.

Parameters:

path (str) – Path to write the resulting model.
weights_path (str | None) – Optional path to save weights if they are stored separately.
version (str | None) – Optional model version.

abstract set_callback(callback_fn)#

Sets callback that grabs results of async inference.

Parameters: callback_fn (Callable) – Callback function.

abstract update_model_info(model_info)#

Updates model with the provided model info. Model info dict can also contain nested dicts.

Parameters: model_info (dict[str, Any]) – model info dict to write to the model.
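
Because the model info dict may contain nested dicts, one way to picture it is as a set of tag paths of the kind accepted by get_rt_info(). The sketch below flattens a nested dict into (path, value) pairs. The helper iter_info_paths and the example keys are hypothetical, and how a given adapter actually stores the values is not shown here.

```python
from typing import Any, Iterator


def iter_info_paths(
    model_info: dict[str, Any], prefix: tuple[str, ...] = ()
) -> Iterator[tuple[list[str], Any]]:
    """Yield (path, value) pairs for every leaf of a nested dict (hypothetical helper)."""
    for key, value in model_info.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend into nested dicts, extending the tag path.
            yield from iter_info_paths(value, path)
        else:
            yield list(path), value


# Illustrative content only.
model_info = {
    "model_info": {"model_type": "Classification", "thresholds": {"confidence": 0.5}}
}
for path, value in iter_info_paths(model_info):
    print(path, value)
# ['model_info', 'model_type'] Classification
# ['model_info', 'thresholds', 'confidence'] 0.5
```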

precisions = ('FP32', 'I32', 'FP16', 'I16', 'I8', 'U8')#

class model_api.adapters.inference_adapter.Metadata(names=<factory>, shape=<factory>, layout='', precision='', type='', meta=<factory>)#

Bases: object

layout: str = ''#

meta: dict#

names: set[str]#

precision: str = ''#

shape: list[int]#

type: str = ''#
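
The constructor signature above (factory defaults for names, shape, and meta) suggests a dataclass-like structure. The sketch below mirrors the documented fields in a local class, MetadataSketch, so that it runs without the library installed. The example input name, shape, and the choice of "FP32" as the precision string are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataSketch:
    """Local mirror of the documented Metadata fields (not the library class)."""

    names: set[str] = field(default_factory=set)
    shape: list[int] = field(default_factory=list)
    layout: str = ""
    precision: str = ""
    type: str = ""
    meta: dict = field(default_factory=dict)


# A dict keyed by input name, similar in spirit to what get_input_layers() returns.
input_layers = {
    "image": MetadataSketch(
        names={"image"},
        shape=[1, 3, 224, 224],
        layout="NCHW",
        precision="FP32",  # assumed to be one of the values listed in precisions
    ),
}

for name, metadata in input_layers.items():
    print(name, metadata.shape, metadata.layout, metadata.precision)
# image [1, 3, 224, 224] NCHW FP32
```

Using factories for the mutable fields means that each instance gets its own set, list, and dict instead of sharing one default object.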
