datumaro.util.image

Module Attributes

ImageMeta

filename -> height, width

Functions

copyto_image(src, dst, src_crypter, dst_crypter)

decode_image(image_bytes[, dtype])

decode_image_context(image_backend, ...)

Change Datumaro image color channel while decoding.

encode_image(image, ext[, dtype])

find_images(dirpath[, exts, recursive, ...])

is_image(path)

load_image(path[, dtype, crypter])

Reads an image in the HWC Grayscale/BGR(A) [0; 255] format (default dtype is uint8).

load_image_meta_file(image_meta_path)

Loads image metadata from a text file in which each line has the form <image name> <height> <width>.

save_image(dst, image[, ext, create_dir, ...])

save_image_meta_file(image_meta, image_meta_path)

Saves image_meta to the path specified by image_meta_path in the format defined in load_image_meta_file's documentation.

Classes

ImageBackend(value)

An enumeration.

ImageColorChannel(value)

Image color channel

lazy_image(path, loader, ...)

Lazily loaded image; the cache argument selects the caching behavior.

class datumaro.util.image.ImageBackend(value)

Bases: Enum

An enumeration.

cv2 = 1
PIL = 2

class datumaro.util.image.ImageColorChannel(value)

Bases: Enum

Image color channel

  • UNCHANGED: Keep the original image's channels (default)

  • COLOR_BGR: Use 3 channels in BGR order (an alpha channel is dropped and a grayscale image is converted to 3 channels)

  • COLOR_RGB: Use 3 channels in RGB order (an alpha channel is dropped and a grayscale image is converted to 3 channels)

UNCHANGED = 0
COLOR_BGR = 1
COLOR_RGB = 2

decode_by_cv2(image_bytes: bytes) -> ndarray

Decodes image_bytes with OpenCV and returns an np.ndarray in this color channel.

decode_by_pil(image_bytes: bytes) -> PILImage

Decodes image_bytes with PIL and returns a PIL Image in this color channel.
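The channel rules listed for ImageColorChannel can be illustrated on a single pixel. The following is a minimal pure-Python sketch, not Datumaro's implementation: the Channel enum and convert_pixel function are hypothetical stand-ins, and the input pixel is assumed to be stored in OpenCV order (gray, BGR, or BGRA).

```python
from enum import Enum
from typing import Sequence, Tuple


class Channel(Enum):
    """Hypothetical mirror of ImageColorChannel, for illustration only."""

    UNCHANGED = 0
    COLOR_BGR = 1
    COLOR_RGB = 2


def convert_pixel(pixel: Sequence[int], channel: Channel) -> Tuple[int, ...]:
    """Convert one pixel given as gray (1 value), BGR (3) or BGRA (4)."""
    if channel is Channel.UNCHANGED:
        return tuple(pixel)  # keep whatever channels the source has
    if len(pixel) == 1:
        bgr = (pixel[0],) * 3  # replicate the gray value into 3 channels
    else:
        bgr = tuple(pixel[:3])  # drop the alpha channel if present
    # RGB is the same three values in reverse order
    return bgr if channel is Channel.COLOR_BGR else bgr[::-1]


print(convert_pixel((10, 20, 30, 255), Channel.COLOR_RGB))
print(convert_pixel((7,), Channel.COLOR_BGR))
```

Both 3-channel modes produce the same values and differ only in order, which is why code that mixes OpenCV (BGR) and PIL (RGB) conventions needs an explicit choice.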

datumaro.util.image.decode_image_context(image_backend: ImageBackend, image_color_channel: ImageColorChannel)

Change Datumaro image color channel while decoding.

For model training, it is recommended to use this context manager to load images in the BGR 3-channel format. For example,

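import datumaro as dm
from datumaro.util.image import ImageBackend, ImageColorChannel, decode_image_context

with decode_image_context(image_backend=ImageBackend.cv2, image_color_channel=ImageColorChannel.COLOR_BGR):
    item: dm.DatasetItem  # placeholder: an item taken from a loaded dataset
    img_data = item.media_as(dm.Image).data
    assert img_data.shape[-1] == 3  # It should be a 3-channel image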
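
A context manager of this kind sets a decoding option on entry and restores the previous value on exit. The sketch below shows that pattern with the standard library only. It assumes the setting is stored in a ContextVar. The names channel_context and current_channel are hypothetical, and Datumaro's internal storage may differ.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator

# Assumed storage for the active setting; the default mirrors UNCHANGED.
_channel: ContextVar[str] = ContextVar("channel", default="UNCHANGED")


@contextmanager
def channel_context(channel: str) -> Iterator[None]:
    """Temporarily switch the active color channel setting."""
    token = _channel.set(channel)
    try:
        yield
    finally:
        _channel.reset(token)  # restore the previous value, even on error


def current_channel() -> str:
    """Return the setting a decoder would consult."""
    return _channel.get()


before = current_channel()
with channel_context("COLOR_BGR"):
    inside = current_channel()
after = current_channel()
print(before, inside, after)
```

The setting applies only to decoding that happens inside the with block, so image data should be accessed before the block exits.
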
datumaro.util.image.load_image(path: str, dtype: numpy.typing.DTypeLike = numpy.uint8, crypter: Crypter = <datumaro.components.crypter.NullCrypter object>)

Reads an image as an array in HWC (height, width, channels) layout, with grayscale or BGR(A) channel order and values in the range [0; 255]. The default dtype is uint8.

datumaro.util.image.copyto_image(src: str | IOBase, dst: str | IOBase, src_crypter: Crypter, dst_crypter: Crypter) -> None

datumaro.util.image.save_image(dst: str | IOBase, image: ndarray, ext: str | None = None, create_dir: bool = False, dtype: numpy.typing.DTypeLike = numpy.uint8, crypter: Crypter = <datumaro.components.crypter.NullCrypter object>, **kwargs) -> None

datumaro.util.image.encode_image(image: ndarray, ext: str, dtype: numpy.typing.DTypeLike = numpy.uint8, **kwargs) -> bytes

datumaro.util.image.decode_image(image_bytes: bytes, dtype: numpy.typing.DTypeLike = numpy.uint8) -> ndarray

datumaro.util.image.find_images(dirpath: str, exts: str | Iterable[str] | None = None, recursive: bool = False, max_depth: int | None = None, min_depth: int | None = None) -> Iterator[str]

datumaro.util.image.is_image(path: str) -> bool
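
The find_images and is_image signatures suggest an extension-based search with optional depth limits. The sketch below shows one way such a search could work with os.walk. It is not Datumaro's implementation. The helper names, the extension set, and the depth convention (depth 0 is dirpath itself) are assumptions made for illustration.

```python
import os
import tempfile
from typing import Iterable, Iterator, Optional

# Hypothetical extension set; the list Datumaro recognizes may differ.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def looks_like_image(path: str, exts: Iterable[str] = IMAGE_EXTS) -> bool:
    """Check the file extension only; the content is not inspected."""
    return os.path.splitext(path)[1].lower() in set(exts)


def walk_images(dirpath: str, recursive: bool = False,
                max_depth: Optional[int] = None,
                min_depth: int = 0) -> Iterator[str]:
    """Yield image paths under dirpath, honoring the depth limits."""
    base = os.path.abspath(dirpath)
    limit = max_depth if recursive else 0
    for root, dirs, files in os.walk(base):
        dirs.sort()  # deterministic traversal order
        rel = os.path.relpath(root, base)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if limit is not None and depth >= limit:
            dirs[:] = []  # do not descend below the depth limit
        if depth < min_depth:
            continue
        for name in sorted(files):
            if looks_like_image(name):
                yield os.path.join(root, name)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel_path in ("a.png", "notes.txt", os.path.join("sub", "b.JPG")):
        open(os.path.join(tmp, rel_path), "w").close()
    top = [os.path.relpath(p, tmp) for p in walk_images(tmp)]
    deep = [os.path.relpath(p, tmp) for p in walk_images(tmp, recursive=True)]
    print(top, deep)
```

Returning an iterator, as the find_images signature does, lets callers stop early on large directory trees without listing every file first.
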
class datumaro.util.image.lazy_image(path: str, loader: Callable[[str], ndarray] | None = None, cache: bool | ImageCache = True, crypter: Crypter = <datumaro.components.crypter.NullCrypter object>)

Bases: object

The cache argument accepts:

  • False: do not cache

  • True: use the global cache

  • ImageCache instance: use this object as the cache

datumaro.util.image.ImageMeta

Maps a filename to its (height, width).

alias of Dict[str, Tuple[int, int]]

datumaro.util.image.load_image_meta_file(image_meta_path: str) -> Dict[str, Tuple[int, int]]

Loads image metadata from a file with the following format:

<image name 1> <height 1> <width 1>

<image name 2> <height 2> <width 2>

Shell-like comments and quoted fields are allowed.

This can be useful to support datasets in which image dimensions are required to interpret annotations.
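
The format described above can be parsed with the standard library alone. The following is a minimal sketch, not Datumaro's implementation: parse_image_meta is a hypothetical helper that relies on shlex to handle shell-like comments and quoted fields.

```python
import shlex
from typing import Dict, Tuple


def parse_image_meta(text: str) -> Dict[str, Tuple[int, int]]:
    """Parse '<name> <height> <width>' lines into {name: (height, width)}."""
    meta: Dict[str, Tuple[int, int]] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        # shlex honors quoted fields and strips '#' comments
        fields = shlex.split(line, comments=True)
        if not fields:
            continue  # blank or comment-only line
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 fields, got {len(fields)}"
            )
        name, height, width = fields
        meta[name] = (int(height), int(width))
    return meta


sample = '''
# name height width
img1.jpg 480 640
"my image.png" 100 200  # quoted name with a space
'''
print(parse_image_meta(sample))
```

Note that the height comes before the width, matching the ImageMeta alias.
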

datumaro.util.image.save_image_meta_file(image_meta: Dict[str, Tuple[int, int]], image_meta_path: str) -> None

Saves image_meta to the path specified by image_meta_path in the format defined in load_image_meta_file’s documentation.
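
Writing the same format is the reverse operation. The sketch below is an illustration, not Datumaro's implementation: format_image_meta is a hypothetical helper, and the quoting style (shlex.quote, which uses single quotes) is an assumption, since the source does not specify how names are quoted on output.

```python
import shlex
from typing import Dict, Tuple


def format_image_meta(image_meta: Dict[str, Tuple[int, int]]) -> str:
    """Render {name: (height, width)} as '<name> <height> <width>' lines."""
    lines = []
    for name, (height, width) in image_meta.items():
        # shlex.quote adds quotes only when needed, e.g. for spaces
        lines.append(f"{shlex.quote(name)} {height} {width}\n")
    return "".join(lines)


text = format_image_meta({"img1.jpg": (480, 640), "my image.png": (100, 200)})
print(text, end="")
```

Quoting names on output keeps the file readable by any parser that accepts quoted fields, even when a filename contains spaces.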