Sampling

Sampling methods.

class anomalib.models.components.sampling.KCenterGreedy(embedding: Tensor, sampling_ratio: float)

Bases: object

Implements the k-center-greedy method.

Parameters:
  • embedding (Tensor) – Embedding tensor extracted from a CNN, with one row per feature vector.

  • sampling_ratio (float) – Fraction of the embedding rows to keep in the coreset.
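
As a rough illustration of how sampling_ratio relates to coreset size: the example shapes on this page (219520 rows reduced to 219) are consistent with truncating N * sampling_ratio to an integer at a ratio of 0.001. Both the formula and the ratio value are assumptions inferred from those shapes, not something this page states.

```python
# Assumption: coreset size = number of embedding rows * ratio, truncated to int.
n_embeddings = 219520    # rows in the example embedding
sampling_ratio = 0.001   # hypothetical ratio, chosen to match the example output

coreset_size = int(n_embeddings * sampling_ratio)
print(coreset_size)  # 219
```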

Example

>>> embedding.shape
torch.Size([219520, 1536])
>>> sampler = KCenterGreedy(embedding=embedding, sampling_ratio=0.001)
>>> sampled_idxs = sampler.select_coreset_idxs()
>>> coreset = embedding[sampled_idxs]
>>> coreset.shape
torch.Size([219, 1536])

get_new_idx() → int

Get the index of the next sample to select.

The chosen sample is the one with the largest minimum distance to the current cluster centers.

Returns:

Sample index

Return type:

int
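
A minimal sketch of this selection step, assuming the sampler keeps one minimum distance per sample and picks the sample farthest from every current center. The helper name farthest_index is hypothetical, and plain Python lists stand in for tensors.

```python
def farthest_index(min_distances: list[float]) -> int:
    """Return the index of the sample with the largest minimum distance."""
    return max(range(len(min_distances)), key=min_distances.__getitem__)


# Sample 2 is farthest from its nearest center, so it would be picked next.
print(farthest_index([0.4, 0.0, 1.7, 0.9]))  # 2
```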

reset_distances() → None

Reset minimum distances.

sample_coreset(selected_idxs: Optional[list[int]] = None) → Tensor

Select coreset from the embedding.

Parameters:

selected_idxs (Optional[list[int]]) – Indices of samples already selected. Defaults to None, meaning no samples are pre-selected.

Returns:

Output coreset

Return type:

Tensor

Example

>>> embedding.shape
torch.Size([219520, 1536])
>>> sampler = KCenterGreedy(...)
>>> coreset = sampler.sample_coreset()
>>> coreset.shape
torch.Size([219, 1536])

select_coreset_idxs(selected_idxs: Optional[list[int]] = None) → list[int]

Greedily form a coreset that minimizes the maximum distance from any sample to its nearest cluster center.

Parameters:

selected_idxs (Optional[list[int]]) – Indices of samples already selected. Defaults to None, meaning no samples are pre-selected.

Returns:

Indices of the samples selected as cluster centers

Return type:

list[int]
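
The greedy procedure can be sketched as follows. This is a framework-free illustration, not the library's implementation: the function name greedy_k_center, the fixed starting index, and the use of Euclidean distance on plain tuples are all assumptions made for the example.

```python
import math


def greedy_k_center(points, coreset_size, first_idx=0):
    """Greedy k-center: repeatedly add the point farthest from all chosen centers."""
    min_dist = [math.inf] * len(points)
    selected = []
    idx = first_idx  # assumption: fixed start; a real sampler may choose it differently
    for _ in range(coreset_size):
        selected.append(idx)
        # Refresh each point's distance to its nearest selected center.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[idx]))
        # The next center is the point that is currently worst covered.
        idx = max(range(len(points)), key=min_dist.__getitem__)
    return selected


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0), (10.0, 0.0)]
print(greedy_k_center(points, coreset_size=3))  # [0, 4, 2]
```

Starting from point 0, the sketch first jumps to the far end (point 4) and then to the middle (point 2), so the three centers spread across the data instead of clustering near the start.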

update_distances(cluster_centers: list[int]) → None

Update the minimum distances given the cluster centers.

Parameters:

cluster_centers (list[int]) – Indices of the cluster centers
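
One way to picture the update, assuming Euclidean distance: each sample keeps the smaller of its current minimum distance and its distance to every newly added center. The function update_min_distances below is a hypothetical stand-in that returns a new list rather than mutating sampler state.

```python
import math


def update_min_distances(points, min_distances, cluster_centers):
    """Return per-point minimum distances after adding the given centers.

    points: list of coordinate tuples.
    min_distances: list of floats, or None if no center has been seen yet.
    cluster_centers: indices into points.
    """
    if min_distances is None:
        updated = [math.inf] * len(points)
    else:
        updated = list(min_distances)
    for c in cluster_centers:
        for i, p in enumerate(points):
            updated[i] = min(updated[i], math.dist(p, points[c]))
    return updated


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d = update_min_distances(points, None, [0])
print(d)  # [0.0, 5.0, 10.0]
d = update_min_distances(points, d, [2])
print(d)  # [0.0, 5.0, 0.0]
```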