datumaro.components.algorithms.hash_key_inference.prune
Functions
match_num_item_for_cluster(ratio, dataset_len, cluster_num_item_list)
Classes
Centroid | Select items through clustering with centers targeting the desired number.
ClusteredRandom | Select items through clustering and choose randomly within each cluster.
Entropy | Select items through clustering and choose them based on label entropy in each cluster.
NDRSelect | Select items based on NDR among each subset.
Prune | Prunes a dataset to make a representative and manageable subset.
PruneBase
QueryClust | Select items through clustering with inits that imply each label.
RandomSelect | Select items randomly from the dataset.
- datumaro.components.algorithms.hash_key_inference.prune.match_num_item_for_cluster(ratio, dataset_len, cluster_num_item_list)
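The source documents only the signature of this function. As a rough illustration of what a helper with these arguments could do, the sketch below splits a pruning budget of about `ratio * dataset_len` items across clusters in proportion to their sizes. The name `allocate_per_cluster`, the rounding rule, and the largest-remainder tie-breaking are all assumptions made for this example, not Datumaro's implementation.

```python
from typing import List


def allocate_per_cluster(ratio: float, dataset_len: int, cluster_sizes: List[int]) -> List[int]:
    """Hypothetical helper: share a pruning budget across clusters by size.

    Assumes 0 < ratio <= 1 and sum(cluster_sizes) == dataset_len.
    """
    budget = max(1, round(ratio * dataset_len))
    total = sum(cluster_sizes)
    # Ideal (fractional) share of the budget for each cluster.
    shares = [budget * size / total for size in cluster_sizes]
    counts = [int(share) for share in shares]
    # Hand out the leftover items to the largest fractional remainders.
    remaining = budget - sum(counts)
    order = sorted(range(len(shares)), key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order:
        if remaining == 0:
            break
        if counts[i] < cluster_sizes[i]:
            counts[i] += 1
            remaining -= 1
    return counts


counts = allocate_per_cluster(0.5, 10, [6, 3, 1])
```

The per-cluster counts always sum to the budget and never exceed the size of the cluster they belong to.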
- class datumaro.components.algorithms.hash_key_inference.prune.PruneBase
Bases: ABC
- abstract base(ratio: float, num_centers: int | None, labels: List[int] | None, database_keys: ndarray | None, item_list: List[DatasetItem], source: Dataset | None) → Tuple[List[DatasetItem], Dict | None]
Executes the pruning method implemented by each subclass.
- Parameters:
ratio – Fraction of the dataset to keep after pruning.
num_centers – Number of centers for clustering.
labels – The label of one annotation for each dataset item.
database_keys – Batch of hash keys as a NumPy array.
item_list – List of the dataset's items.
source – The whole dataset.
- Returns:
A tuple of the selected items and the distances between each item and the clusters (a dict, or None).
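The contract described above can be sketched without Datumaro installed. In the example below, `PruneStrategy` is a stand-in for `PruneBase`, items are plain Python objects rather than `DatasetItem`, and `KeepFirst` is a hypothetical strategy invented only to show the return shape; none of these names come from the library.

```python
import math
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional, Tuple


class PruneStrategy(ABC):
    """Stand-in for PruneBase: one abstract method with the documented arguments."""

    @abstractmethod
    def base(
        self,
        ratio: float,
        num_centers: Optional[int],
        labels: Optional[List[int]],
        database_keys: Any,
        item_list: List[Any],
        source: Any,
    ) -> Tuple[List[Any], Optional[Dict]]:
        """Return (selected items, optional distance information)."""


class KeepFirst(PruneStrategy):
    """Hypothetical strategy: keep the leading fraction of the item list."""

    def base(self, ratio, num_centers, labels, database_keys, item_list, source):
        keep = math.ceil(ratio * len(item_list))
        # No clustering is involved, so there is no distance information.
        return list(item_list[:keep]), None


selected, distances = KeepFirst().base(0.5, None, None, None, ["a", "b", "c", "d"], None)
```

Because `base` is abstract, the stand-in base class cannot be instantiated directly; each strategy must provide its own implementation.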
- class datumaro.components.algorithms.hash_key_inference.prune.RandomSelect
Bases: PruneBase
Select items randomly from the dataset.
- base(ratio, num_centers, labels, database_keys, item_list, source)
Executes the pruning method.
- Parameters:
ratio – Fraction of the dataset to keep after pruning.
num_centers – Number of centers for clustering.
labels – The label of one annotation for each dataset item.
database_keys – Batch of hash keys as a NumPy array.
item_list – List of the dataset's items.
source – The whole dataset.
- Returns:
A tuple of the selected items and the distances between each item and the clusters (a dict, or None).
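Random selection by ratio can be illustrated in a few lines. This is a minimal sketch that assumes the kept count is `ratio * len(items)` rounded to the nearest integer; the function name `random_select` and the `seed` parameter are illustrative and not part of the documented API.

```python
import random
from typing import Any, List, Optional


def random_select(items: List[Any], ratio: float, seed: Optional[int] = None) -> List[Any]:
    """Keep roughly `ratio` of the items, sampled uniformly without replacement."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    keep = round(ratio * len(items))
    return rng.sample(items, keep)


picked = random_select(list(range(10)), 0.3, seed=1)
```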
- class datumaro.components.algorithms.hash_key_inference.prune.Centroid
Bases: PruneBase
Select items through clustering with centers targeting the desired number.
- base(ratio, num_centers, labels, database_keys, item_list, source)
Executes the pruning method.
- Parameters:
ratio – Fraction of the dataset to keep after pruning.
num_centers – Number of centers for clustering.
labels – The label of one annotation for each dataset item.
database_keys – Batch of hash keys as a NumPy array.
item_list – List of the dataset's items.
source – The whole dataset.
- Returns:
A tuple of the selected items and the distances between each item and the clusters (a dict, or None).
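One way to read "centers targeting the desired number" is to cluster into as many groups as items to keep, then take the item nearest each center. The sketch below does this with a small pure-Python k-means over Euclidean distance. The choice of k-means, the distance metric, and the names `kmeans` and `centroid_select` are assumptions for illustration; the library's actual clustering may differ.

```python
import random
from typing import Any, List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Sequence[float]], k: int, iters: int = 20, seed: int = 0) -> List[List[float]]:
    """Plain Lloyd's algorithm; initial centers are k distinct random points."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups: List[list] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        for c, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def centroid_select(items: List[Any], keys: List[Sequence[float]], ratio: float, seed: int = 0) -> List[Any]:
    """Pick one distinct item per center, with the number of centers set by ratio."""
    k = max(1, round(ratio * len(items)))
    chosen: List[int] = []
    for center in kmeans(keys, k, seed=seed):
        candidates = [i for i in range(len(items)) if i not in chosen]
        chosen.append(min(candidates, key=lambda i: sq_dist(keys[i], center)))
    return [items[i] for i in chosen]


keys = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
picked = centroid_select(["a", "b", "c", "d"], keys, ratio=0.5)
```

With two well-separated groups and a ratio of 0.5, the selection contains one item from each group.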
- class datumaro.components.algorithms.hash_key_inference.prune.ClusteredRandom
Bases: PruneBase
Select items through clustering and choose randomly within each cluster.
- base(ratio, num_centers, labels, database_keys, item_list, source)
Executes the pruning method.
- Parameters:
ratio – Fraction of the dataset to keep after pruning.
num_centers – Number of centers for clustering.
labels – The label of one annotation for each dataset item.
database_keys – Batch of hash keys as a NumPy array.
item_list – List of the dataset's items.
source – The whole dataset.
- Returns:
A tuple of the selected items and the distances between each item and the clusters (a dict, or None).
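Choosing randomly within each cluster can be sketched once cluster assignments exist. The example below takes the assignments as given (the clustering step itself is out of scope here) and samples about `ratio` of each cluster. The function name `clustered_random_select`, the per-cluster rounding, and the minimum of one item per cluster are assumptions for illustration.

```python
import random
from collections import defaultdict
from typing import Any, Dict, List, Optional


def clustered_random_select(
    items: List[Any], cluster_ids: List[int], ratio: float, seed: Optional[int] = None
) -> List[Any]:
    """Sample roughly `ratio` of the members of every cluster, at least one each."""
    rng = random.Random(seed)
    clusters: Dict[int, List[Any]] = defaultdict(list)
    for item, cid in zip(items, cluster_ids):
        clusters[cid].append(item)
    selected: List[Any] = []
    for members in clusters.values():
        quota = min(len(members), max(1, round(ratio * len(members))))
        selected.extend(rng.sample(members, quota))
    return selected


items = ["a0", "a1", "a2", "a3", "b0", "b1", "b2", "b3"]
picked = clustered_random_select(items, [0, 0, 0, 0, 1, 1, 1, 1], ratio=0.5, seed=7)
```

Sampling per cluster keeps every cluster represented in the result, which plain random selection over the whole dataset does not guarantee.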
- class datumaro.components.algorithms.hash_key_inference.prune.QueryClust
Bases: PruneBase
Select items through clustering with inits that imply each label.
- base(ratio, num_centers, labels, database_keys, item_list, source)
Executes the pruning method.
- Parameters:
ratio – Fraction of the dataset to keep after pruning.
num_centers – Number of centers for clustering.
labels – The label of one annotation for each dataset item.
database_keys – Batch of hash keys as a NumPy array.
item_list – List of the dataset's items.
source – The whole dataset.
- Returns:
A tuple of the selected items and the distances between each item and the clusters (a dict, or None).
- class datumaro.components.algorithms.hash_key_inference.prune.Entropy
Bases: PruneBase
Select items through clustering and choose them based on label entropy in each cluster.
- base(ratio, num_centers, labels, database_keys, item_list, source)
Executes the pruning method.
- Parameters:
ratio – Fraction of the dataset to keep after pruning.
num_centers – Number of centers for clustering.
labels – The label of one annotation for each dataset item.
database_keys – Batch of hash keys as a NumPy array.
item_list – List of the dataset's items.
source – The whole dataset.
- Returns:
A tuple of the selected items and the distances between each item and the clusters (a dict, or None).
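The quantity this strategy relies on, label entropy within a cluster, can be computed as the Shannon entropy of the cluster's label distribution. The sketch below shows only that computation; how the entropy values are then turned into a selection is not stated in the source, so it is left out. The function names are illustrative.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Hashable, List


def label_entropy(labels: List[Hashable]) -> float:
    """Shannon entropy (in bits) of the label distribution."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_per_cluster(cluster_ids: List[int], labels: List[Hashable]) -> Dict[int, float]:
    """Group labels by cluster id and compute the entropy of each group."""
    by_cluster: Dict[int, list] = defaultdict(list)
    for cid, label in zip(cluster_ids, labels):
        by_cluster[cid].append(label)
    return {cid: label_entropy(group) for cid, group in by_cluster.items()}


entropies = entropy_per_cluster([0, 0, 1, 1], ["cat", "cat", "cat", "dog"])
```

A cluster whose items all share one label has zero entropy, while a cluster split evenly between two labels has an entropy of one bit.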
- class datumaro.components.algorithms.hash_key_inference.prune.NDRSelect
Bases: PruneBase
Select items based on NDR among each subset.
- base(ratio, num_centers, labels, database_keys, item_list, source)
Executes the pruning method.
- Parameters:
ratio – Fraction of the dataset to keep after pruning.
num_centers – Number of centers for clustering.
labels – The label of one annotation for each dataset item.
database_keys – Batch of hash keys as a NumPy array.
item_list – List of the dataset's items.
source – The whole dataset.
- Returns:
A tuple of the selected items and the distances between each item and the clusters (a dict, or None).
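The source does not expand the acronym NDR. Assuming it refers to near-duplicate removal, the general idea can be sketched as a greedy filter over hash keys: an item is kept only if its key differs from every already-kept key by more than a threshold number of bits. The integer key representation, the Hamming threshold, and the function names are all assumptions for this example and do not reflect Datumaro's implementation.

```python
from typing import List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hash keys."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(keys: List[int], threshold: int) -> List[int]:
    """Return indices of items whose key is more than `threshold` bits from every kept key."""
    kept: List[int] = []
    for i, key in enumerate(keys):
        if all(hamming(key, keys[j]) > threshold for j in kept):
            kept.append(i)
    return kept


kept = drop_near_duplicates([0b0000, 0b0001, 0b1111, 0b1110], threshold=1)
```

To apply this "among each subset", one would group the items by subset first and run the filter on each group separately.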