datumaro.plugins.inference_server_plugin.ovms#
Classes

OVMSLauncher – Inference launcher for OVMS (OpenVINO™ Model Server) (openvinotoolkit/model_server)
- class datumaro.plugins.inference_server_plugin.ovms.OVMSLauncher(model_name: str, model_interpreter_path: str, model_version: int = 0, host: str = 'localhost', port: int = 9000, timeout: float = 10.0, tls_config: TLSConfig | None = None, protocol_type: ProtocolType = ProtocolType.grpc)[source]#
Bases: LauncherForDedicatedInferenceServer[Union[GrpcClient, HttpClient]]

Inference launcher for OVMS (OpenVINO™ Model Server) (openvinotoolkit/model_server)
- Parameters:
model_name – Name of the model. It should match the model name loaded in the server instance.
model_interpreter_path – Path to the Python source code that implements a model interpreter. The model interpreter implements pre-processing of the model input and post-processing of the model output.
model_version – Version of the model loaded in the server instance
host – Host address of the server instance
port – Port number of the server instance
timeout – Time limit, in seconds, for communication between the client and the server instance
tls_config – Configuration required if the server instance runs in secure mode
protocol_type – Communication protocol type used with the server instance
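A minimal usage sketch, assuming an OVMS instance already serves the model on localhost:9000; the model name, interpreter path, and dataset path/format below are hypothetical:

>>> from datumaro import Dataset
>>> from datumaro.plugins.inference_server_plugin.ovms import OVMSLauncher
>>> launcher = OVMSLauncher(
...     model_name="my_model",                    # hypothetical: must match the name loaded in the server
...     model_interpreter_path="interpreter.py",  # hypothetical: path to your interpreter source
...     host="localhost",
...     port=9000,
... )
>>> dataset = Dataset.import_from("./dataset", "coco")  # hypothetical dataset path and format
>>> inferred = dataset.run_model(launcher)  # attaches model predictions to dataset items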
- class datumaro.plugins.inference_server_plugin.ovms.GrpcClient(channel, prediction_service_stub, model_service_stub)[source]#
Bases: ServingClient
- predict(inputs, model_name, model_version=0, timeout=10.0)[source]#
Send a PredictRequest to the server and return the response.
- Parameters:
inputs – dictionary with (input_name, input data) pairs
model_name – name of the model in the model server.
model_version – version of the model (default = 0).
timeout – time in seconds to wait for the response (default = 10).
- Returns:
For models with single output - ndarray with prediction result. For models with multiple outputs - dictionary of (output_name, result) pairs.
- Raises:
TypeError – if a provided argument is of the wrong type.
ValueError – if a provided argument has an unsupported value.
ConnectionError – if there is an issue with the server connection.
TimeoutError – if the request handling duration exceeded the timeout.
ModelNotFound – if a model with the specified name and version does not exist in the model server.
InvalidInputError – if the provided inputs could not be handled by the model.
BadResponseError – if the server response is malformed and cannot be parsed.
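A hedged sketch of a predict() call on a client created with make_grpc_client (documented below in this module); the input tensor name and shape are hypothetical and must match the served model's metadata:

>>> import numpy as np
>>> from datumaro.plugins.inference_server_plugin.ovms import make_grpc_client
>>> client = make_grpc_client("localhost:9000")
>>> outputs = client.predict(
...     inputs={"input": np.zeros((1, 3, 224, 224), dtype=np.float32)},  # hypothetical input name/shape
...     model_name="my_model",  # hypothetical model name
... )

For a single-output model, outputs is an ndarray; for a multi-output model, it is a dictionary keyed by output name.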
- get_model_metadata(model_name, model_version=0, timeout=10.0)[source]#
Send a ModelMetadataRequest to the server and return the response.
- Parameters:
model_name – name of the model in the model server.
model_version – version of the model (default = 0).
timeout – time in seconds to wait for the response (default = 10).
- Returns:
Dictionary with the model metadata response.
- Raises:
TypeError – if a provided argument is of the wrong type.
ValueError – if a provided argument has an unsupported value.
ConnectionError – if there is an issue with the server connection.
TimeoutError – if the request handling duration exceeded the timeout.
ModelNotFound – if a model with the specified name and version does not exist in the model server.
BadResponseError – if the server response is malformed and cannot be parsed.
- get_model_status(model_name, model_version=0, timeout=10.0)[source]#
Send a ModelStatusRequest to the server and return the response.
- Parameters:
model_name – name of the model in the model server.
model_version – version of the model (default = 0).
timeout – time in seconds to wait for the response (default = 10).
- Returns:
Dictionary with the model status response.
- Raises:
TypeError – if a provided argument is of the wrong type.
ValueError – if a provided argument has an unsupported value.
ConnectionError – if there is an issue with the server connection.
TimeoutError – if the request handling duration exceeded the timeout.
ModelNotFound – if a model with the specified name and version does not exist in the model server.
BadResponseError – if the server response is malformed and cannot be parsed.
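The metadata and status calls pair naturally: metadata reports the served model's input/output names and shapes (useful for building the inputs dictionary passed to predict()), while status reports availability. A short sketch on an existing client, with a hypothetical model name:

>>> metadata = client.get_model_metadata(model_name="my_model")
>>> status = client.get_model_status(model_name="my_model")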
- class datumaro.plugins.inference_server_plugin.ovms.HttpClient(url, session, client_key=None, server_cert=None)[source]#
Bases: ServingClient
- predict(inputs, model_name, model_version=0, timeout=10.0)[source]#
Send a PredictRequest to the server and return the response.
- Parameters:
inputs – dictionary with (input_name, input data) pairs
model_name – name of the model in the model server.
model_version – version of the model (default = 0).
timeout – time in seconds to wait for the response (default = 10).
- Returns:
For models with single output - ndarray with prediction result. For models with multiple outputs - dictionary of (output_name, result) pairs.
- Raises:
TypeError – if a provided argument is of the wrong type.
ValueError – if a provided argument has an unsupported value.
ConnectionError – if there is an issue with the server connection.
TimeoutError – if the request handling duration exceeded the timeout.
ModelNotFound – if a model with the specified name and version does not exist in the model server.
InvalidInputError – if the provided inputs could not be handled by the model.
BadResponseError – if the server response is malformed and cannot be parsed.
- get_model_metadata(model_name, model_version=0, timeout=10.0)[source]#
Send a ModelMetadataRequest to the server and return the response.
- Parameters:
model_name – name of the model in the model server.
model_version – version of the model (default = 0).
timeout – time in seconds to wait for the response (default = 10).
- Returns:
Dictionary with the model metadata response.
- Raises:
TypeError – if a provided argument is of the wrong type.
ValueError – if a provided argument has an unsupported value.
ConnectionError – if there is an issue with the server connection.
TimeoutError – if the request handling duration exceeded the timeout.
ModelNotFound – if a model with the specified name and version does not exist in the model server.
BadResponseError – if the server response is malformed and cannot be parsed.
- get_model_status(model_name, model_version=0, timeout=10.0)[source]#
Send a ModelStatusRequest to the server and return the response.
- Parameters:
model_name – name of the model in the model server.
model_version – version of the model (default = 0).
timeout – time in seconds to wait for the response (default = 10).
- Returns:
Dictionary with the model status response.
- Raises:
TypeError – if a provided argument is of the wrong type.
ValueError – if a provided argument has an unsupported value.
ConnectionError – if there is an issue with the server connection.
TimeoutError – if the request handling duration exceeded the timeout.
ModelNotFound – if a model with the specified name and version does not exist in the model server.
BadResponseError – if the server response is malformed and cannot be parsed.
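HttpClient mirrors the GrpcClient interface, so switching to REST only changes client construction. A hedged sketch; port 8000 is an assumption, as OVMS commonly exposes its REST endpoint on a different port than gRPC:

>>> from datumaro.plugins.inference_server_plugin.ovms import make_http_client
>>> client = make_http_client("localhost:8000")
>>> status = client.get_model_status(model_name="my_model")  # hypothetical model name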
- class datumaro.plugins.inference_server_plugin.ovms.LauncherForDedicatedInferenceServer(model_name: str, model_interpreter_path: str, model_version: int = 0, host: str = 'localhost', port: int = 9000, timeout: float = 10.0, tls_config: TLSConfig | None = None, protocol_type: ProtocolType = ProtocolType.grpc)[source]#
Bases: Generic[TClient], LauncherWithModelInterpreter

Inference launcher for a dedicated inference server
- Parameters:
model_name – Name of the model. It should match the model name loaded in the server instance.
model_interpreter_path – Path to the Python source code that implements a model interpreter. The model interpreter implements pre-processing of the model input and post-processing of the model output.
model_version – Version of the model loaded in the server instance
host – Host address of the server instance
port – Port number of the server instance
timeout – Time limit, in seconds, for communication between the client and the server instance
tls_config – Configuration required if the server instance runs in secure mode
protocol_type – Communication protocol type used with the server instance
- class datumaro.plugins.inference_server_plugin.ovms.ProtocolType(value)[source]#
Bases: IntEnum

Protocol type for communication with a dedicated inference server
- grpc = 0#
- http = 1#
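A minimal sketch of selecting the protocol when constructing a launcher; the model name and interpreter path are hypothetical, and port 8000 assumes the server's REST endpoint:

>>> from datumaro.plugins.inference_server_plugin.ovms import OVMSLauncher, ProtocolType
>>> launcher = OVMSLauncher(
...     model_name="my_model",                    # hypothetical
...     model_interpreter_path="interpreter.py",  # hypothetical
...     port=8000,
...     protocol_type=ProtocolType.http,
... )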
- datumaro.plugins.inference_server_plugin.ovms.make_grpc_client(url, tls_config=None)[source]#
Create GrpcClient object.
- Parameters:
url – Model Server URL as a string in the format <address>:<port>
tls_config (optional) – dictionary with the TLS configuration. The accepted format is:

{
    "client_key_path": <Path to client key file>,
    "client_cert_path": <Path to client certificate file>,
    "server_cert_path": <Path to server certificate file>
}

All three path values must be strings.
- Returns:
GrpcClient object
- Raises:
ValueError, TypeError – if the provided configuration is invalid.
Examples
Create a minimal GrpcClient:

>>> client = make_grpc_client("localhost:9000")

Create a GrpcClient with TLS:

>>> tls_config = {
...     "client_key_path": "/opt/tls/client.key",
...     "client_cert_path": "/opt/tls/client.crt",
...     "server_cert_path": "/opt/tls/server.crt"
... }
>>> client = make_grpc_client("localhost:9000", tls_config)
- datumaro.plugins.inference_server_plugin.ovms.make_http_client(url, tls_config=None)[source]#
Create HttpClient object.
- Parameters:
url – Model Server URL as a string in the format <address>:<port>
tls_config (optional) – dictionary with the TLS configuration. The accepted format is:

{
    "client_key_path": <Path to client key file>,
    "client_cert_path": <Path to client certificate file>,
    "server_cert_path": <Path to server certificate file>
}

All three path values must be strings.
- Returns:
HttpClient object
- Raises:
ValueError, TypeError – if the provided configuration is invalid.
Examples
Create a minimal HttpClient:

>>> client = make_http_client("localhost:9000")

Create an HttpClient with TLS:

>>> tls_config = {
...     "client_key_path": "/opt/tls/client.key",
...     "client_cert_path": "/opt/tls/client.crt",
...     "server_cert_path": "/opt/tls/server.crt"
... }
>>> client = make_http_client("localhost:9000", tls_config)