datumaro.components.transformer
Classes
ItemTransform |
ModelTransform | A transformation class for applying a model's inference to dataset items.
TabularTransform | A transformation class for processing dataset items in batches with optional parallelism.
Transform | A base class for dataset transformations that change dataset items or their annotations.
- class datumaro.components.transformer.Transform(extractor: IDataset)
Bases: DatasetBase, CliPlugin
A base class for dataset transformations that change dataset items or their annotations.
- class datumaro.components.transformer.ItemTransform(extractor: IDataset)
Bases: Transform
- transform_item(item: DatasetItem) → DatasetItem | None
Returns a modified copy of the input item.
Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.
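The copy-instead-of-mutate rule can be illustrated without Datumaro installed. The sketch below is only an illustration: `Item` is a hypothetical stand-in dataclass (not Datumaro's `DatasetItem`), its `wrap()` method is assumed to behave like a "copy with changes" helper, and the free function `transform_item` stands in for an overridden `transform_item()` method.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a dataset item (not Datumaro's DatasetItem)."""

    id: str
    annotations: List[str] = field(default_factory=list)

    def wrap(self, **changes) -> "Item":
        # Return a new Item with the given fields replaced; self is left untouched.
        return replace(self, **changes)


def transform_item(item: Item) -> Item:
    # Build a new annotation list instead of editing item.annotations in place.
    upper = [label.upper() for label in item.annotations]
    return item.wrap(annotations=upper)


original = Item(id="img_001", annotations=["cat", "dog"])
transformed = transform_item(original)
```

Because the transform returns a fresh object, any other consumer still holding `original` keeps seeing the unmodified annotations.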
- class datumaro.components.transformer.TabularTransform(extractor: IDataset, batch_size: int = 1, num_workers: int = 0)
Bases: Transform
A transformation class for processing dataset items in batches with optional parallelism.
This class takes a dataset extractor, a batch size, and a number of worker threads, and processes the dataset items in batches. With num_workers set to 0, items are processed sequentially (single-process mode); with a larger value, they are processed in parallel (multi-process mode), which makes the class efficient for batch transformations.
- Parameters:
extractor – The dataset extractor to obtain items from.
batch_size – The batch size for processing items. Default is 1.
num_workers – The number of worker threads to use for parallel processing. Set to 0 for single-process mode. Default is 0.
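How `batch_size` and `num_workers` interact can be sketched with the standard library alone. This is not Datumaro's implementation: `batched` and `run_batches` are hypothetical helper names, and a thread pool is used purely for illustration, without any claim about the worker mechanism Datumaro uses internally.

```python
from multiprocessing.pool import ThreadPool
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    # Group items into lists of at most batch_size elements, preserving order.
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_batches(
    items: Iterable[T],
    fn: Callable[[List[T]], List[R]],
    batch_size: int = 1,
    num_workers: int = 0,
) -> List[R]:
    batches = batched(items, batch_size)
    if num_workers == 0:
        # Sequential mode: process one batch after another in the caller.
        results = [fn(batch) for batch in batches]
    else:
        # Parallel mode: hand the batches to a pool of workers.
        # pool.map returns results in the same order as the input batches.
        with ThreadPool(processes=num_workers) as pool:
            results = pool.map(fn, batches)
    # Flatten the per-batch results back into a single list.
    return [result for batch in results for result in batch]


def double(batch: List[int]) -> List[int]:
    return [x * 2 for x in batch]


sequential = run_batches(range(5), double, batch_size=2, num_workers=0)
parallel = run_batches(range(5), double, batch_size=2, num_workers=2)
```

Both calls produce the same ordered result; only the way the batches are scheduled differs.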
- transform_item(item: DatasetItem) → DatasetItem | None
Returns a modified copy of the input item.
Avoid changing and returning the input item, because it can lead to unexpected problems. Use wrap_item() or item.wrap() to simplify copying.
- class datumaro.components.transformer.ModelTransform(extractor: IDataset, launcher: Launcher, batch_size: int = 1, append_annotation: bool = False, num_workers: int = 0)
Bases: Transform
A transformation class for applying a model’s inference to dataset items.
This class takes a dataset, a launcher, and other optional parameters, and transforms each dataset item using the model outputs produced by the launcher. It can process items using multiple processes if specified, making it suitable for parallelized inference tasks.
- Parameters:
extractor – The dataset extractor to obtain items from.
launcher – The launcher responsible for model inference.
batch_size – The batch size for processing items. Default is 1.
append_annotation – Whether to append inference annotations to existing annotations. Default is False.
num_workers – The number of worker threads to use for parallel inference. Set to 0 for single-process mode. Default is 0.
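The effect of `append_annotation` can be pictured with a small standalone sketch. It rests on an assumption not stated above: that with `append_annotation=False` the inference results replace the item's existing annotations. `merge_annotations` is a hypothetical helper, not part of Datumaro, and plain strings stand in for annotation objects.

```python
from typing import List, Sequence


def merge_annotations(
    existing: Sequence[str],
    predicted: Sequence[str],
    append_annotation: bool = False,
) -> List[str]:
    # Assumption: False means predictions replace the existing annotations;
    # True means existing annotations are kept and predictions are added after them.
    if append_annotation:
        return list(existing) + list(predicted)
    return list(predicted)


existing = ["label:cat"]
predicted = ["bbox:dog"]

replaced = merge_annotations(existing, predicted)
appended = merge_annotations(existing, predicted, append_annotation=True)
```

In both modes the helper returns a new list, so the original annotations are never modified in place.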