datumaro.plugins.data_formats.tabular

Functions

string_to_dict(input_string)

Classes

TabularDataBase(path, *[, target, dtype, ctx])

Read and compose a tabular dataset.

TabularDataExporter(extractor, save_dir, *)

Export a tabular dataset.

TabularDataImporter()

Import a tabular dataset.

class datumaro.plugins.data_formats.tabular.TabularDataBase(path: str, *, target: str | List[str] | None = None, dtype: Dict[str, Type[TableDtype]] | None = None, ctx: ImportContext | None = None)

Bases: DatasetBase

Read and compose a tabular dataset. The file name of each ‘.csv’ file is regarded as a subset.

Parameters:
  • path (str) – Path to a tabular dataset (a csv file or a folder containing csv files).

  • target (optional, str or list(str)) – Target column or list of target columns. If this is not specified (None), the last column is regarded as the target column. For a dataset with no targets, pass an empty list.

  • dtype (optional, dict(str,str)) – Dictionary of column name -> type str (‘str’, ‘int’, or ‘float’). This can be used when automatic type inference fails.
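The subset and target conventions described above can be illustrated with a small standard-library sketch. It does not call Datumaro and is not its implementation: `resolve_targets` and `describe_tabular` are hypothetical helper names, and the sketch assumes that the subset name is the csv file name without its extension.

```python
import csv
from pathlib import Path
from typing import Dict, List, Optional, Union


def resolve_targets(
    columns: List[str], target: Optional[Union[str, List[str]]] = None
) -> List[str]:
    """Apply the documented rule: None -> last column, [] -> no targets."""
    if target is None:
        # Default: the last column is regarded as the target column.
        return columns[-1:]
    if isinstance(target, str):
        # A single column name is treated as a one-element list.
        return [target]
    return list(target)


def describe_tabular(
    path: str, target: Optional[Union[str, List[str]]] = None
) -> Dict[str, List[str]]:
    """Map each subset (assumed: csv file stem) to its target columns."""
    root = Path(path)
    # Accept either a single csv file or a folder containing csv files.
    files = [root] if root.is_file() else sorted(root.glob("*.csv"))
    result: Dict[str, List[str]] = {}
    for csv_file in files:
        with open(csv_file, newline="") as f:
            header = next(csv.reader(f))  # first row holds the column names
        result[csv_file.stem] = resolve_targets(header, target)
    return result
```

For a folder holding `train.csv` and `val.csv` whose header row is `age,height,label`, `describe_tabular(folder)` maps both subsets to `['label']`, while passing `target=[]` maps both to an empty list.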

NAME = 'tabular'
categories()

Returns metainfo about dataset labels.

datumaro.plugins.data_formats.tabular.string_to_dict(input_string)

class datumaro.plugins.data_formats.tabular.TabularDataImporter

Bases: Importer

Import a tabular dataset. Each ‘.csv’ file is regarded as a subset.

NAME = 'tabular'
classmethod build_cmdline_parser(**kwargs)
classmethod find_sources(path)
classmethod get_file_extensions() -> List[str]

class datumaro.plugins.data_formats.tabular.TabularDataExporter(extractor: IDataset, save_dir: str, *, save_media: bool | None = None, image_ext: str | None = None, default_image_ext: str | None = None, save_dataset_meta: bool = False, save_hashkey_meta: bool = False, stream: bool = False, ctx: ExportContext | None = None)

Bases: Exporter

Export a tabular dataset. Each subset is saved to its own ‘.csv’ file, regardless of the ‘save_media’ value.

NAME = 'tabular'
EXPORT_EXT = '.csv'
DEFAULT_IMAGE_EXT = '.jpg'
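
The one-file-per-subset layout that the exporter description mentions can be sketched with the standard library alone. This is an illustration of the output layout, not Datumaro's exporter: `export_subsets` is a hypothetical helper, rows are assumed to arrive as `(subset, dict)` pairs, and every row of a subset is assumed to share the same columns.

```python
import csv
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def export_subsets(
    rows: Iterable[Tuple[str, Row]], save_dir: str, ext: str = ".csv"
) -> List[Path]:
    """Write one '<subset><ext>' file per subset and return the paths."""
    # Group rows by subset name, preserving row order within each subset.
    grouped: Dict[str, List[Row]] = defaultdict(list)
    for subset, row in rows:
        grouped[subset].append(row)

    out_dir = Path(save_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    for subset, items in sorted(grouped.items()):
        out_path = out_dir / f"{subset}{ext}"
        # Assumption: all rows in a subset share the first row's columns.
        fieldnames = list(items[0].keys())
        with open(out_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(items)
        written.append(out_path)
    return written
```

Calling `export_subsets` with rows tagged `train` and `val` produces `train.csv` and `val.csv` under `save_dir`, each starting with a header row.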