otx.hpo

HPO package.

Functions

run_hpo_loop(hpo_algo, train_func[, ...])

Run the HPO loop.

Classes

TrialStatus(value)

Enum class for trial status.

HyperBand([minimum_resource, ...])

It implements the Asynchronous HyperBand scheduler with iterations only.

class otx.hpo.HyperBand(minimum_resource: int | float | None = None, reduction_factor: int = 3, asynchronous_sha: bool = True, asynchronous_bracket: bool = False, **kwargs)

Bases: HpoBase

It implements the Asynchronous HyperBand scheduler with iterations only.

Please refer to the papers below for the details of the algorithm.

[1] “Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization”, JMLR 2018

https://arxiv.org/abs/1603.06560 https://homes.cs.washington.edu/~jamieson/hyperband.html

[2] “A System for Massively Parallel Hyperparameter Tuning”, MLSys 2020

https://arxiv.org/abs/1810.05934

Parameters:
  • minimum_resource (Optional[Union[float, int]], optional) – Minimum resource to use for training a trial. Defaults to None.

  • reduction_factor (int, optional) – Decides how many trials are promoted to the next rung. Only the top 1 / reduction_factor of the trials in a rung can be promoted. Defaults to 3.

  • asynchronous_sha (bool, optional) – Whether to operate SHA asynchronously. Defaults to True.

  • asynchronous_bracket (bool, optional) – Whether the SHAs (brackets) run in parallel. Defaults to False.
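
The promotion rule controlled by reduction_factor can be illustrated with a minimal, self-contained sketch. It follows the successive halving scheme described in the referenced papers and is not the OTX implementation: the helper names `rung_resources` and `promotable_trials`, the `maximum_resource` argument, and the sample scores are all assumptions made for illustration, and a higher score is assumed to be better.

```python
def rung_resources(minimum_resource, maximum_resource, reduction_factor=3):
    """Return the resource budget of each rung, from lowest to highest.

    Hypothetical helper: each rung gets reduction_factor times the budget
    of the previous one, and the last rung is capped at maximum_resource.
    """
    rungs = []
    resource = minimum_resource
    while resource < maximum_resource:
        rungs.append(resource)
        resource *= reduction_factor
    rungs.append(maximum_resource)
    return rungs


def promotable_trials(rung_scores, reduction_factor=3):
    """Return the ids of the top 1 / reduction_factor trials of a rung."""
    num_promotable = len(rung_scores) // reduction_factor
    ranked = sorted(rung_scores, key=rung_scores.get, reverse=True)
    return ranked[:num_promotable]


# Made-up scores of six trials that all reached the same rung.
scores = {"t0": 0.61, "t1": 0.74, "t2": 0.58, "t3": 0.80, "t4": 0.66, "t5": 0.70}

print(rung_resources(1, 27))      # [1, 3, 9, 27]
print(promotable_trials(scores))  # ['t3', 't1']
```

With reduction_factor set to 3, only two of the six trials move on, and each promoted trial receives three times the resource of the previous rung.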

auto_config() → List[Dict[str, Any]]

Configure ASHA automatically to match the available resource.

If the available resource is less than a full ASHA run requires, the ASHA scale is decreased. If it is more than a full ASHA run requires, the ASHA scale is increased.

Returns:

ASHA configuration. It’s used to make brackets.

Return type:

List[Dict[str, Any]]

get_best_config() → Dict[str, Any] | None

Get best configuration in ASHA.

Returns:

Best configuration in ASHA. If there is no trial to select, return None.

Return type:

Optional[Dict[str, Any]]

get_next_sample() → AshaTrial | None

Get next trial to train.

Returns:

Next trial to train. If there is no trial to train, then return None.

Return type:

Optional[AshaTrial]

get_progress() → int | float

Get current progress of ASHA.

is_done() → bool

Check whether ASHA is done.

Returns:

Whether ASHA is done.

Return type:

bool

print_result()

Print the ASHA result.

report_score(score: float | int, resource: float | int, trial_id: str, done: bool = False) → Literal[TrialStatus.STOP, TrialStatus.RUNNING]

Report a score to ASHA.

Parameters:
  • score (Union[float, int]) – Score to report.

  • resource (Union[float, int]) – Resource used to get score.

  • trial_id (str) – Trial id.

  • done (bool, optional) – Whether training trial is done. Defaults to False.

Returns:

Status that tells the trial whether to continue training (TrialStatus.RUNNING) or to stop (TrialStatus.STOP).

Return type:

Literal[TrialStatus.STOP, TrialStatus.RUNNING]
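
The sketch below shows one way a training loop might react to the status returned by report_score. It does not import OTX. `TrialStatus` is a stand-in enum whose numeric values are hypothetical, `FixedBudgetScheduler` is a toy scheduler that only mirrors the report_score signature documented above, and the score formula is a placeholder for a real validation metric.

```python
from enum import IntEnum


class TrialStatus(IntEnum):
    """Stand-in enum; the member values here are hypothetical."""

    RUNNING = 0
    STOP = 1


class FixedBudgetScheduler:
    """Toy scheduler: stops a trial once it has used a fixed budget."""

    def __init__(self, budget):
        self.budget = budget
        self.history = []  # (trial_id, resource, score) tuples

    def report_score(self, score, resource, trial_id, done=False):
        self.history.append((trial_id, resource, score))
        if done or resource >= self.budget:
            return TrialStatus.STOP
        return TrialStatus.RUNNING


def train(scheduler, trial_id, max_epochs=10):
    """Train for up to max_epochs, reporting a score after every epoch."""
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        score = 1.0 - 1.0 / (epoch + 1)  # placeholder for a validation metric
        status = scheduler.report_score(score, epoch, trial_id)
        if status == TrialStatus.STOP:
            break  # the scheduler asked this trial to stop
    return epoch


scheduler = FixedBudgetScheduler(budget=3)
epochs_run = train(scheduler, trial_id="0")
print(epochs_run)  # 3
```

Here the resource is the epoch count, so the trial stops as soon as the reported resource reaches the budget of 3.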

save_results()

Save the ASHA result.

class otx.hpo.TrialStatus(value)

Bases: IntEnum

Enum class for trial status.

otx.hpo.run_hpo_loop(hpo_algo: HpoBase, train_func: Callable, resource_type: Literal['gpu', 'cpu', 'xpu'] = 'gpu', num_parallel_trial: int | None = None, num_devices_per_trial: int | None = None, available_devices: str | None = None)

Run the HPO loop.

Parameters:
  • hpo_algo (HpoBase) – HPO algorithm to run.

  • train_func (Callable) – Function to train a model.

  • resource_type (Literal['gpu', 'cpu', 'xpu'], optional) – Which type of resource to use. It can be changed depending on the environment. Defaults to “gpu”.

  • num_parallel_trial (Optional[int], optional) – How many trials to run in parallel. It’s used for CPUResourceManager. Defaults to None.

  • num_devices_per_trial (Optional[int], optional) – Number of devices used for a single trial. It’s used for GPUResourceManager and XPUResourceManager. Defaults to None.

  • available_devices (Optional[str], optional) – Number of devices available. It’s used for GPUResourceManager and XPUResourceManager. Defaults to None.
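
To show how the documented scheduler methods (is_done, get_next_sample, report_score, get_best_config) might fit together, here is a minimal sequential sketch of an HPO loop. It is not how run_hpo_loop is implemented, which also handles parallel trials and device allocation. `ToyHpo`, `ToyTrial`, the `id` and `configuration` attributes, the `train_func(configuration, report)` signature, and the string statuses standing in for TrialStatus are all assumptions made for illustration.

```python
class ToyTrial:
    """Hypothetical trial object holding an id and a configuration."""

    def __init__(self, trial_id, configuration):
        self.id = trial_id
        self.configuration = configuration


class ToyHpo:
    """Toy scheduler that runs each candidate once for a fixed resource."""

    def __init__(self, candidates, resource=3):
        self._pending = [ToyTrial(str(i), c) for i, c in enumerate(candidates)]
        self._trials = {trial.id: trial for trial in self._pending}
        self._resource = resource
        self._best = {}  # trial id -> best score reported so far
        self._running = 0

    def is_done(self):
        return not self._pending and self._running == 0

    def get_next_sample(self):
        if not self._pending:
            return None
        self._running += 1
        return self._pending.pop(0)

    def report_score(self, score, resource, trial_id, done=False):
        previous = self._best.get(trial_id, float("-inf"))
        self._best[trial_id] = max(score, previous)
        if done or resource >= self._resource:
            self._running -= 1
            return "STOP"
        return "RUNNING"

    def get_best_config(self):
        if not self._best:
            return None
        best_id = max(self._best, key=self._best.get)
        return self._trials[best_id].configuration


def toy_train(configuration, report):
    """Pretend training: the score is highest when lr is close to 0.01."""
    for epoch in range(1, 100):
        score = epoch / (1.0 + abs(configuration["lr"] - 0.01) * 100)
        if report(score, epoch) == "STOP":
            break


def sequential_hpo_loop(hpo_algo, train_func):
    """Run trials one at a time until the scheduler reports it is done."""
    while not hpo_algo.is_done():
        trial = hpo_algo.get_next_sample()
        if trial is None:
            break
        train_func(
            trial.configuration,
            lambda score, resource, tid=trial.id: hpo_algo.report_score(score, resource, tid),
        )
    return hpo_algo.get_best_config()


best = sequential_hpo_loop(ToyHpo([{"lr": 0.1}, {"lr": 0.01}, {"lr": 0.001}]), toy_train)
print(best)  # {'lr': 0.01}
```

The training function receives only its configuration and a reporting callback, so it needs no knowledge of the scheduler that drives it.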