datumaro.plugins.inference_server_plugin.ovms#

Classes

OVMSLauncher(model_name, model_interpreter_path)

Inference launcher for OVMS (OpenVINO™ Model Server) (openvinotoolkit/model_server)

class datumaro.plugins.inference_server_plugin.ovms.OVMSLauncher(model_name: str, model_interpreter_path: str, model_version: int = 0, host: str = 'localhost', port: int = 9000, timeout: float = 10.0, tls_config: TLSConfig | None = None, protocol_type: ProtocolType = ProtocolType.grpc)[source]#

Bases: LauncherForDedicatedInferenceServer[Union[GrpcClient, HttpClient]]

Inference launcher for OVMS (OpenVINO™ Model Server) (openvinotoolkit/model_server)

Parameters:
  • model_name – Name of the model. It should match the model name loaded in the server instance.

  • model_interpreter_path – Path to the Python source code which implements a model interpreter. The model interpreter implements pre-processing of the model input and post-processing of the model output.

  • model_version – Version of the model loaded in the server instance

  • host – Host address of the server instance

  • port – Port number of the server instance

  • timeout – Timeout limit during communication between the client and the server instance

  • tls_config – Configuration required if the server instance is in the secure mode

  • protocol_type – Communication protocol type with the server instance

infer(inputs: ndarray | Dict[str, ndarray]) List[Dict[str, ndarray] | List[Dict[str, ndarray]]][source]#
exception datumaro.plugins.inference_server_plugin.ovms.DatumaroError[source]#

Bases: Exception

class datumaro.plugins.inference_server_plugin.ovms.GrpcClient(channel, prediction_service_stub, model_service_stub)[source]#

Bases: ServingClient

predict(inputs, model_name, model_version=0, timeout=10.0)[source]#

Send PredictRequest to the server and return response.

Parameters:
  • inputs – dictionary with (input_name, input data) pairs

  • model_name – name of the model in the model server.

  • model_version – version of the model (default = 0).

  • timeout – time in seconds to wait for the response (default = 10).

Returns:

For models with a single output, an ndarray with the prediction result. For models with multiple outputs, a dictionary of (output_name, result) pairs.

Raises:
  • TypeError – if provided argument is of wrong type.

  • ValueError – if provided argument has unsupported value.

  • ConnectionError – if there is an issue with server connection.

  • TimeoutError – if request handling duration exceeded timeout.

  • ModelNotFound – if model with specified name and version does not exist in the model server.

  • InvalidInputError – if provided inputs could not be handled by the model.

  • BadResponseError – if the server response is malformed and cannot be parsed.
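As the Returns section above notes, predict gives back a bare ndarray for single-output models and a dictionary for multi-output models. A small helper can normalize both cases so downstream code always iterates over (output_name, array) pairs. This is a hypothetical sketch, not part of the client API; the default output name "output" is an assumption.

```python
import numpy as np

def normalize_prediction(result, default_name="output"):
    """Wrap a single-output ndarray into a dict so callers can always
    iterate over (output_name, array) pairs, regardless of model shape."""
    if isinstance(result, np.ndarray):
        return {default_name: result}
    if isinstance(result, dict):
        return result
    raise TypeError(f"unexpected prediction type: {type(result)!r}")

# Single-output model: predict() returned a bare array.
single = normalize_prediction(np.zeros((1, 10)))
# Multi-output model: predict() returned a dict already.
multi = normalize_prediction({"boxes": np.zeros((1, 4)), "scores": np.zeros((1,))})
```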

get_model_metadata(model_name, model_version=0, timeout=10.0)[source]#

Send ModelMetadataRequest to the server and return response.

Parameters:
  • model_name – name of the model in the model server.

  • model_version – version of the model (default = 0).

  • timeout – time in seconds to wait for the response (default = 10).

Returns:

Dictionary with the model metadata response.

Raises:
  • TypeError – if provided argument is of wrong type.

  • ValueError – if provided argument has unsupported value.

  • ConnectionError – if there is an issue with server connection.

  • TimeoutError – if request handling duration exceeded timeout.

  • ModelNotFound – if model with specified name and version does not exist in the model server.

  • BadResponseError – if the server response is malformed and cannot be parsed.

get_model_status(model_name, model_version=0, timeout=10.0)[source]#

Send ModelStatusRequest to the server and return response.

Parameters:
  • model_name – name of the model in the model server.

  • model_version – version of the model (default = 0).

  • timeout – time in seconds to wait for the response (default = 10).

Returns:

Dictionary with the model status response.

Raises:
  • TypeError – if provided argument is of wrong type.

  • ValueError – if provided argument has unsupported value.

  • ConnectionError – if there is an issue with server connection.

  • TimeoutError – if request handling duration exceeded timeout.

  • ModelNotFound – if model with specified name and version does not exist in the model server.

  • BadResponseError – if the server response is malformed and cannot be parsed.

class datumaro.plugins.inference_server_plugin.ovms.HttpClient(url, session, client_key=None, server_cert=None)[source]#

Bases: ServingClient

predict(inputs, model_name, model_version=0, timeout=10.0)[source]#

Send PredictRequest to the server and return response.

Parameters:
  • inputs – dictionary with (input_name, input data) pairs

  • model_name – name of the model in the model server.

  • model_version – version of the model (default = 0).

  • timeout – time in seconds to wait for the response (default = 10).

Returns:

For models with a single output, an ndarray with the prediction result. For models with multiple outputs, a dictionary of (output_name, result) pairs.

Raises:
  • TypeError – if provided argument is of wrong type.

  • ValueError – if provided argument has unsupported value.

  • ConnectionError – if there is an issue with server connection.

  • TimeoutError – if request handling duration exceeded timeout.

  • ModelNotFound – if model with specified name and version does not exist in the model server.

  • InvalidInputError – if provided inputs could not be handled by the model.

  • BadResponseError – if the server response is malformed and cannot be parsed.

get_model_metadata(model_name, model_version=0, timeout=10.0)[source]#

Send ModelMetadataRequest to the server and return response.

Parameters:
  • model_name – name of the model in the model server.

  • model_version – version of the model (default = 0).

  • timeout – time in seconds to wait for the response (default = 10).

Returns:

Dictionary with the model metadata response.

Raises:
  • TypeError – if provided argument is of wrong type.

  • ValueError – if provided argument has unsupported value.

  • ConnectionError – if there is an issue with server connection.

  • TimeoutError – if request handling duration exceeded timeout.

  • ModelNotFound – if model with specified name and version does not exist in the model server.

  • BadResponseError – if the server response is malformed and cannot be parsed.

get_model_status(model_name, model_version=0, timeout=10.0)[source]#

Send ModelStatusRequest to the server and return response.

Parameters:
  • model_name – name of the model in the model server.

  • model_version – version of the model (default = 0).

  • timeout – time in seconds to wait for the response (default = 10).

Returns:

Dictionary with the model status response.

Raises:
  • TypeError – if provided argument is of wrong type.

  • ValueError – if provided argument has unsupported value.

  • ConnectionError – if there is an issue with server connection.

  • TimeoutError – if request handling duration exceeded timeout.

  • ModelNotFound – if model with specified name and version does not exist in the model server.

  • BadResponseError – if the server response is malformed and cannot be parsed.

class datumaro.plugins.inference_server_plugin.ovms.LauncherForDedicatedInferenceServer(model_name: str, model_interpreter_path: str, model_version: int = 0, host: str = 'localhost', port: int = 9000, timeout: float = 10.0, tls_config: TLSConfig | None = None, protocol_type: ProtocolType = ProtocolType.grpc)[source]#

Bases: Generic[TClient], LauncherWithModelInterpreter

Inference launcher for dedicated inference server

Parameters:
  • model_name – Name of the model. It should match the model name loaded in the server instance.

  • model_interpreter_path – Path to the Python source code which implements a model interpreter. The model interpreter implements pre-processing of the model input and post-processing of the model output.

  • model_version – Version of the model loaded in the server instance

  • host – Host address of the server instance

  • port – Port number of the server instance

  • timeout – Timeout limit during communication between the client and the server instance

  • tls_config – Configuration required if the server instance is in the secure mode

  • protocol_type – Communication protocol type with the server instance

type_check(item)[source]#

Check the media type of a dataset item.

If the check returns False, the item is excluded from the input batch.

class datumaro.plugins.inference_server_plugin.ovms.ProtocolType(value)[source]#

Bases: IntEnum

Protocol type for communication with dedicated inference server

grpc = 0#
http = 1#
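Because ProtocolType is an IntEnum, members compare equal to their integer values and can be recovered from either the value or the name. The sketch below mirrors the enum in plain Python for illustration; the real class lives in this module.

```python
from enum import IntEnum

class ProtocolType(IntEnum):
    """Mirror of the plugin's protocol enum, for illustration only."""
    grpc = 0
    http = 1

# An IntEnum member compares equal to its integer value...
assert ProtocolType.grpc == 0
# ...and can be looked up by value or by name.
assert ProtocolType(1) is ProtocolType.http
assert ProtocolType["grpc"] is ProtocolType.grpc
```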
datumaro.plugins.inference_server_plugin.ovms.make_grpc_client(url, tls_config=None)[source]#

Create GrpcClient object.

Parameters:
  • url – Model Server URL as a string in format <address>:<port>

  • tls_config (optional) –

    dictionary with TLS configuration. The accepted format is:

    {
        "client_key_path": <Path to client key file>,
        "client_cert_path": <Path to client certificate file>,
        "server_cert_path": <Path to server certificate file>
    }
    

    With the following types accepted:

      • client_key_path – string

      • client_cert_path – string

      • server_cert_path – string

Returns:

GrpcClient object

Raises:

ValueError, TypeError – if provided config is invalid.

Examples

Create minimal GrpcClient:

>>> client = make_grpc_client("localhost:9000")

Create GrpcClient with TLS:

>>> tls_config = {
...     "client_key_path": "/opt/tls/client.key",
...     "client_cert_path": "/opt/tls/client.crt",
...     "server_cert_path": "/opt/tls/server.crt"
... }
>>> client = make_grpc_client("localhost:9000", tls_config)
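Since an invalid tls_config raises ValueError or TypeError, the expected shape can also be checked up front. The validator below is a hypothetical sketch, not the library's own check; it only enforces the key names and string types documented in this section.

```python
_TLS_KEYS = ("client_key_path", "client_cert_path", "server_cert_path")

def validate_tls_config(tls_config):
    """Raise TypeError/ValueError for configs that do not match the
    {key: string-path} shape documented above."""
    if not isinstance(tls_config, dict):
        raise TypeError("tls_config must be a dict")
    unknown = set(tls_config) - set(_TLS_KEYS)
    if unknown:
        raise ValueError(f"unsupported tls_config keys: {sorted(unknown)}")
    for key, value in tls_config.items():
        if not isinstance(value, str):
            raise TypeError(f"{key} must be a string path")

validate_tls_config({
    "client_key_path": "/opt/tls/client.key",
    "client_cert_path": "/opt/tls/client.crt",
    "server_cert_path": "/opt/tls/server.crt",
})
```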
datumaro.plugins.inference_server_plugin.ovms.make_http_client(url, tls_config=None)[source]#

Create HttpClient object.

Parameters:
  • url – Model Server URL as a string in format <address>:<port>

  • tls_config (optional) –

    dictionary with TLS configuration. The accepted format is:

    {
        "client_key_path": <Path to client key file>,
        "client_cert_path": <Path to client certificate file>,
        "server_cert_path": <Path to server certificate file>
    }
    

    With the following types accepted:

      • client_key_path – string

      • client_cert_path – string

      • server_cert_path – string

Returns:

HttpClient object

Raises:

ValueError, TypeError – if provided config is invalid.

Examples

Create minimal HttpClient:

>>> client = make_http_client("localhost:9000")

Create HttpClient with TLS:

>>> tls_config = {
...     "client_key_path": "/opt/tls/client.key",
...     "client_cert_path": "/opt/tls/client.crt",
...     "server_cert_path": "/opt/tls/server.crt"
... }
>>> client = make_http_client("localhost:9000", tls_config)