otx.algorithms.visual_prompting.adapters.pytorch_lightning.datasets.pipelines.sam_transforms#

SAM transform pipeline for the visual prompting task.

Classes

ResizeLongestSide(target_length)

Resizes images so that the longest side equals target_length, and provides methods for resizing coordinates and boxes accordingly.

class otx.algorithms.visual_prompting.adapters.pytorch_lightning.datasets.pipelines.sam_transforms.ResizeLongestSide(target_length: int)[source]#

Bases: object

Resizes images so that the longest side equals target_length, and provides methods for resizing coordinates and boxes accordingly.

Provides methods for transforming both numpy array and batched torch tensors.

Parameters:

target_length (int) – The length of the longest side of the image.

__call__(item: Dict[str, List | Tensor]) → Dict[str, List | Tensor][source]#

Applies the transformation to a single sample.

Parameters:

item (Dict[str, Union[List, Tensor]]) – Dictionary of batch data.

Returns:

Dictionary of batch data.

Return type:

Dict[str, Union[List, Tensor]]

classmethod apply_boxes(boxes: ndarray | Tensor, original_size: List[int] | Tuple[int, int] | Tensor, target_length: int) → ndarray | Tensor[source]#

Expects a numpy array / torch tensor of shape Bx4. Requires the original image size in (H, W) format.

Parameters:
  • boxes (Union[np.ndarray, Tensor]) – Boxes array/tensor.

  • original_size (Union[List[int], Tuple[int, int], Tensor]) – Original size of image.

  • target_length (int) – The length of the longest side of the image.

Returns:

Resized boxes.

Return type:

Union[np.ndarray, Tensor]
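The box scaling can be sketched in plain numpy. This is a minimal re-implementation of the documented behaviour, assuming SAM's standard longest-side formula and XYXY box format (each box treated as two (x, y) corner points); it is a sketch, not the library code itself.

```python
import numpy as np

def _preprocess_shape(oldh: int, oldw: int, long_side_length: int) -> tuple:
    # Scale both sides so the longest equals long_side_length (assumed formula).
    scale = long_side_length / max(oldh, oldw)
    return int(oldh * scale + 0.5), int(oldw * scale + 0.5)

def apply_boxes(boxes: np.ndarray, original_size, target_length: int) -> np.ndarray:
    """Rescale Bx4 XYXY boxes by treating each box as two (x, y) corners."""
    old_h, old_w = original_size
    new_h, new_w = _preprocess_shape(old_h, old_w, target_length)
    corners = boxes.astype(float).reshape(-1, 2, 2)  # (B, 2 corners, xy)
    corners[..., 0] *= new_w / old_w  # x follows the width scale
    corners[..., 1] *= new_h / old_h  # y follows the height scale
    return corners.reshape(-1, 4)

boxes = np.array([[100, 100, 200, 200]])
print(apply_boxes(boxes, (480, 640), 1024))  # [[160. 160. 320. 320.]]
```

With an original size of (480, 640) and target_length=1024, both axes scale by 1.6, so the box coordinates grow proportionally.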

classmethod apply_coords(coords: ndarray | Tensor, original_size: List[int] | Tuple[int, int] | Tensor, target_length: int) → ndarray | Tensor[source]#

Expects a numpy array / torch tensor of length 2 in the final dimension.

Requires the original image size in (H, W) format.

Parameters:
  • coords (Union[np.ndarray, Tensor]) – Coordinates array/tensor.

  • original_size (Union[List[int], Tuple[int, int], Tensor]) – Original size of image.

  • target_length (int) – The length of the longest side of the image.

Returns:

Resized coordinates.

Return type:

Union[np.ndarray, Tensor]
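The coordinate scaling can be sketched as follows. This is a hedged numpy re-implementation of the documented behaviour, assuming SAM's standard longest-side formula ((x, y) pairs in the last dimension, x scaled by the width ratio and y by the height ratio); it is not the library's actual implementation.

```python
import numpy as np

def get_preprocess_shape(oldh: int, oldw: int, long_side_length: int) -> tuple:
    """Compute the resized (H, W) so the longest side equals long_side_length."""
    scale = long_side_length / max(oldh, oldw)
    return int(oldh * scale + 0.5), int(oldw * scale + 0.5)

def apply_coords(coords: np.ndarray, original_size, target_length: int) -> np.ndarray:
    """Rescale (x, y) coordinates from the original image to the resized image."""
    old_h, old_w = original_size
    new_h, new_w = get_preprocess_shape(old_h, old_w, target_length)
    coords = coords.astype(float)
    coords[..., 0] = coords[..., 0] * (new_w / old_w)  # x follows the width scale
    coords[..., 1] = coords[..., 1] * (new_h / old_h)  # y follows the height scale
    return coords

pts = np.array([[320.0, 240.0]])
print(apply_coords(pts, (480, 640), 1024))  # [[512. 384.]]
```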

classmethod apply_image(image: ndarray, target_length: int) → ndarray[source]#

Expects a numpy array with shape HxWxC in uint8 format.

Parameters:
  • image (np.ndarray) – Image array.

  • target_length (int) – The length of the longest side of the image.

Returns:

Resized image.

Return type:

np.ndarray
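A minimal numpy sketch of the image resize follows. Nearest-neighbour indexing stands in for the library's actual interpolation (which may be bilinear or otherwise); only the output shape, computed with the assumed longest-side formula, is meant to match.

```python
import numpy as np

def apply_image(image: np.ndarray, target_length: int) -> np.ndarray:
    """Resize an HxWxC uint8 image so its longest side equals target_length.

    Nearest-neighbour resize used here is an assumption; the library may
    interpolate differently, but the output shape is the same.
    """
    old_h, old_w = image.shape[:2]
    scale = target_length / max(old_h, old_w)
    new_h, new_w = int(old_h * scale + 0.5), int(old_w * scale + 0.5)
    # Map each output pixel back to its nearest source pixel.
    rows = (np.arange(new_h) / scale).astype(int).clip(0, old_h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, old_w - 1)
    return image[rows[:, None], cols]

img = np.zeros((480, 640, 3), dtype=np.uint8)
print(apply_image(img, 1024).shape)  # (768, 1024, 3)
```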

static get_preprocess_shape(oldh: int, oldw: int, long_side_length: int) → Tuple[int, int][source]#

Compute the output size given input size and target long side length.

Parameters:
  • oldh (int) – Original height.

  • oldw (int) – Original width.

  • long_side_length (int) – Target long side length.

Returns:

Output size.

Return type:

Tuple[int, int]
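The shape computation can be sketched directly. This assumes the scale-then-round-half-up formula used by SAM-style longest-side resizing; treat it as an illustrative sketch rather than the library source.

```python
def get_preprocess_shape(oldh: int, oldw: int, long_side_length: int) -> tuple:
    """Scale both sides by long_side_length / max(oldh, oldw), rounding half up."""
    scale = long_side_length / max(oldh, oldw)
    return int(oldh * scale + 0.5), int(oldw * scale + 0.5)

print(get_preprocess_shape(480, 640, 1024))   # (768, 1024)
print(get_preprocess_shape(1200, 800, 1024))  # (1024, 683)
```

Note that the longest side maps exactly to long_side_length, while the shorter side is rounded to the nearest integer, so the aspect ratio is preserved only approximately.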