datumaro.plugins.data_formats.tabular

Functions

string_to_dict(input_string)

Classes

`TabularDataBase`(path, *[, target, dtype, ctx])	Read and compose a tabular dataset.
`TabularDataExporter`(extractor, save_dir, *)	Export a tabular dataset.
`TabularDataImporter`()	Import a tabular dataset.

class datumaro.plugins.data_formats.tabular.TabularDataBase(path: str, *, target: str | List[str] | None = None, dtype: Dict[str, Type[TableDtype]] | None = None, ctx: ImportContext | None = None, **kwargs)

Bases: DatasetBase

Read and compose a tabular dataset. The file name of each ‘.csv’ file is regarded as a subset name.

Parameters:

path (str) – Path to a tabular dataset (a csv file or a folder containing csv files).
target (optional, str or list(str)) – Target column or list of target columns. If this is not specified (None), the last column is regarded as the target column. For a dataset with no targets, pass an empty list.
dtype (optional, dict(str,str)) – Dictionary mapping column name -> type string (‘str’, ‘int’, or ‘float’). This can be used when automatic type inference fails.
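
The rules for target described above can be illustrated with a small standalone sketch. The helper resolve_target below is hypothetical and is not part of Datumaro; it only mirrors the documented behavior (None selects the last column, a string selects one column, an empty list means no targets) and assumes the column names are already known.

```python
from typing import List, Optional, Sequence, Union


def resolve_target(
    columns: Sequence[str],
    target: Optional[Union[str, List[str]]] = None,
) -> List[str]:
    """Hypothetical helper mirroring the documented `target` rules."""
    if target is None:
        # Not specified: the last column is regarded as the target column.
        return [columns[-1]] if columns else []
    if isinstance(target, str):
        # A single column name is normalized to a one-element list.
        return [target]
    # A list is used as given; an empty list means "no targets".
    return list(target)


columns = ["sepal_length", "sepal_width", "species"]
print(resolve_target(columns))                 # ['species']
print(resolve_target(columns, "sepal_width"))  # ['sepal_width']
print(resolve_target(columns, []))             # []
```

How Datumaro validates unknown column names is not stated here, so the sketch performs no such check.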

NAME = 'tabular'

categories(): Returns meta-information about the dataset labels.

datumaro.plugins.data_formats.tabular.string_to_dict(input_string)

class datumaro.plugins.data_formats.tabular.TabularDataImporter

Bases: Importer

Import a tabular dataset. Each ‘.csv’ file is regarded as a subset.

NAME = 'tabular'

classmethod build_cmdline_parser(**kwargs)

classmethod find_sources(path)

classmethod get_file_extensions() → List[str]
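
As a rough illustration of the "each ‘.csv’ file is a subset" convention, the following sketch maps file names to subset names using only the standard library. The helper subsets_from_path is hypothetical and is not Datumaro's find_sources; it assumes a non-recursive search, which this page does not confirm.

```python
import tempfile
from pathlib import Path
from typing import Dict


def subsets_from_path(path: str) -> Dict[str, Path]:
    """Hypothetical helper: map subset name -> csv file."""
    root = Path(path)
    if root.is_file():
        # A single csv file yields a single subset.
        files = [root] if root.suffix.lower() == ".csv" else []
    else:
        # Assumption: only csv files directly inside the folder are used.
        files = sorted(root.glob("*.csv"))
    # The file name without its extension serves as the subset name.
    return {f.stem: f for f in files}


with tempfile.TemporaryDirectory() as tmp:
    for name in ("train", "val"):
        (Path(tmp) / f"{name}.csv").write_text("x,y\n1,0\n")
    (Path(tmp) / "notes.txt").write_text("ignored")
    subset_names = sorted(subsets_from_path(tmp))

print(subset_names)  # ['train', 'val']
```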

class datumaro.plugins.data_formats.tabular.TabularDataExporter(extractor: IDataset, save_dir: str, *, save_media: bool | None = None, image_ext: str | None = None, default_image_ext: str | None = None, save_dataset_meta: bool = False, save_hashkey_meta: bool = False, stream: bool = False, ctx: ExportContext | None = None)

Bases: Exporter

Export a tabular dataset. Each subset is saved to its own ‘.csv’ file, regardless of the ‘save_media’ value.

NAME = 'tabular'

EXPORT_EXT = '.csv'

DEFAULT_IMAGE_EXT = '.jpg'
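
The importer and exporter are normally reached through the format name 'tabular' rather than instantiated directly. The sketch below shows one possible round trip through Datumaro's Dataset API. The function name roundtrip_tabular and its parameters are illustrative, and it is an assumption that extra keyword arguments such as target are forwarded to TabularDataBase.

```python
def roundtrip_tabular(src_path, save_dir, target=None):
    """Illustrative helper: import a tabular dataset and export it again."""
    # Imported lazily so the sketch can be read without Datumaro installed.
    from datumaro.components.dataset import Dataset

    # Assumption: keyword arguments are passed through to TabularDataBase.
    kwargs = {} if target is None else {"target": target}
    dataset = Dataset.import_from(src_path, "tabular", **kwargs)

    # Per the exporter description, each subset goes to its own csv file.
    dataset.export(save_dir, "tabular")
    return dataset
```

A call such as roundtrip_tabular("data/", "out/", target="species") would then be expected to write one csv file per subset into out/; the paths and column name here are placeholders.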