datumaro.plugins.inference_server_plugin.triton#

Classes

TritonLauncher(model_name, ...[, ...])

Inference launcher for Triton Inference Server (triton-inference-server)

class datumaro.plugins.inference_server_plugin.triton.TritonLauncher(model_name: str, model_interpreter_path: str, model_version: int = 0, host: str = 'localhost', port: int = 9000, timeout: float = 10.0, tls_config: TLSConfig | None = None, protocol_type: ProtocolType = ProtocolType.grpc)[source]#

Bases: LauncherForDedicatedInferenceServer[Union[tritonclient.grpc.InferenceServerClient, tritonclient.http.InferenceServerClient]]

Inference launcher for Triton Inference Server (triton-inference-server)

Parameters:
  • model_name – Name of the model. It should match the name of the model loaded in the server instance.

  • model_interpreter_path – Path to the Python source code that implements a model interpreter. The model interpreter implements pre-processing of the model input and post-processing of the model output (a sketch of such a file follows this list).

  • model_version – Version of the model loaded in the server instance

  • host – Host address of the server instance

  • port – Port number of the server instance

  • timeout – Timeout limit for communication between the client and the server instance

  • tls_config – TLS configuration, required if the server instance runs in secure mode

  • protocol_type – Communication protocol type with the server instance
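
The interpreter file passed via model_interpreter_path defines how dataset items become model inputs and how raw outputs become annotations. Below is a minimal sketch, assuming the IModelInterpreter interface from datumaro.components.abstracts.model_interpreter (preprocess / postprocess / get_categories); the class name, the "logits" output tensor name, and the label names are illustrative, and exact signatures may differ between Datumaro versions.

    # model_interpreter.py -- illustrative model interpreter sketch
    from typing import Any, List, Tuple

    import numpy as np

    from datumaro.components.abstracts.model_interpreter import IModelInterpreter
    from datumaro.components.annotation import (
        Annotation,
        AnnotationType,
        Label,
        LabelCategories,
    )
    from datumaro.components.media import Image


    class SampleInterpreter(IModelInterpreter):
        def preprocess(self, inp) -> Tuple[np.ndarray, Any]:
            # Pre-processing: convert the item's image into the tensor the
            # served model expects. The second return value is carried over
            # to postprocess() (not needed in this sketch).
            img = inp.media_as(Image).data
            return img.astype(np.float32), None

        def postprocess(self, pred, info) -> List[Annotation]:
            # Post-processing: map raw output tensors to Datumaro
            # annotations. "logits" is an assumed output tensor name.
            logits = pred["logits"]
            return [Label(label=int(np.argmax(logits)))]

        def get_categories(self):
            # Label names here are placeholders.
            label_cat = LabelCategories.from_iterable(["cat", "dog"])
            return {AnnotationType.label: label_cat}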

infer(inputs: ndarray | Dict[str, ndarray]) List[Dict[str, ndarray] | List[Dict[str, ndarray]]][source]#
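
A minimal usage sketch, assuming a Triton server instance is reachable at localhost:9000 over gRPC (the default protocol) with a model named "my-model" already loaded; the model name, interpreter path, and input shape are placeholders.

    import numpy as np

    from datumaro.plugins.inference_server_plugin.triton import TritonLauncher

    # Connect to the server instance; "my-model" and the interpreter
    # path are placeholders for your deployment.
    launcher = TritonLauncher(
        model_name="my-model",
        model_interpreter_path="model_interpreter.py",
        host="localhost",
        port=9000,
    )

    # infer() takes a single array, or a dict keyed by input tensor
    # name for multi-input models (see the signature above).
    img = np.zeros((224, 224, 3), dtype=np.uint8)
    outputs = launcher.infer(img)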
exception datumaro.plugins.inference_server_plugin.triton.DatumaroError[source]#

Bases: Exception

class datumaro.plugins.inference_server_plugin.triton.LauncherForDedicatedInferenceServer(model_name: str, model_interpreter_path: str, model_version: int = 0, host: str = 'localhost', port: int = 9000, timeout: float = 10.0, tls_config: TLSConfig | None = None, protocol_type: ProtocolType = ProtocolType.grpc)[source]#

Bases: Generic[TClient], LauncherWithModelInterpreter

Inference launcher for a dedicated inference server

Parameters:
  • model_name – Name of the model. It should match the name of the model loaded in the server instance.

  • model_interpreter_path – Path to the Python source code that implements a model interpreter. The model interpreter implements pre-processing of the model input and post-processing of the model output.

  • model_version – Version of the model loaded in the server instance

  • host – Host address of the server instance

  • port – Port number of the server instance

  • timeout – Timeout limit for communication between the client and the server instance

  • tls_config – TLS configuration, required if the server instance runs in secure mode (see the sketch after this list)

  • protocol_type – Communication protocol type with the server instance
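
When the server instance runs in secure mode, a TLSConfig carries the certificate and key paths. A hedged sketch, assuming TLSConfig is importable from datumaro.plugins.inference_server_plugin.base and takes client key, client certificate, and server certificate paths; all file paths below are placeholders.

    from datumaro.plugins.inference_server_plugin.base import TLSConfig
    from datumaro.plugins.inference_server_plugin.triton import TritonLauncher

    # Assumed TLSConfig fields: client private key, client certificate
    # chain, and server certificate (paths are placeholders).
    tls_config = TLSConfig(
        client_key_path="client.key",
        client_cert_path="client.crt",
        server_cert_path="server.crt",
    )

    launcher = TritonLauncher(
        model_name="my-model",
        model_interpreter_path="model_interpreter.py",
        tls_config=tls_config,
    )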

type_check(item)[source]#

Check the media type of a dataset item.

If it returns False, the item is excluded from the input batch.

class datumaro.plugins.inference_server_plugin.triton.ProtocolType(value)[source]#

Bases: IntEnum

Protocol type for communication with a dedicated inference server

grpc = 0#
http = 1#
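
For example, to communicate with the server instance over HTTP rather than the default gRPC (a sketch; port 8000 is Triton's conventional HTTP port and may differ in your deployment):

    from datumaro.plugins.inference_server_plugin.triton import (
        ProtocolType,
        TritonLauncher,
    )

    # Same launcher as above, but over HTTP. Adjust the port to match
    # your server instance (Triton conventionally uses 8000 for HTTP).
    launcher = TritonLauncher(
        model_name="my-model",
        model_interpreter_path="model_interpreter.py",
        port=8000,
        protocol_type=ProtocolType.http,
    )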