nncf.torch#

Base subpackage for NNCF PyTorch functionality.

Functions#

create_compressed_model(model, config[, ...])

The main function used to produce a model ready for compression fine-tuning from an original PyTorch model and a configuration object.

load_state(model, state_dict_to_load[, is_resume, ...])

Used to load a checkpoint containing a compressed model into an NNCFNetwork object, but can be used for any PyTorch module as well.

register_default_init_args(nncf_config, train_loader)

register_module(*quantizable_field_names[, ...])

register_operator([name])

nncf_model_input(tensor)

nncf_model_output(tensor)

disable_tracing(method)

Patch a method so that it will be executed within the no_nncf_trace context.

no_nncf_trace()

forward_nncf_trace()

force_build_cpu_extensions()

force_build_cuda_extensions()

nncf.torch.create_compressed_model(model, config, compression_state=None, dummy_forward_fn=None, wrap_inputs_fn=None, wrap_outputs_fn=None, dump_graphs=True)[source]#

The main function used to produce a model ready for compression fine-tuning from an original PyTorch model and a configuration object.

Parameters:
  • model (torch.nn.Module) – The original model. Should have its parameters already loaded from a checkpoint or another source.

  • config (nncf.NNCFConfig) – A configuration object used to determine the exact compression modifications to be applied to the model.

  • compression_state (Optional[Dict[str, Any]]) – Representation of the entire compression state, used to unambiguously restore the compressed model. Includes builder and controller states.

  • dummy_forward_fn (Callable[[torch.nn.Module], Any]) – If supplied, will be used instead of a forward function call to build the internal graph representation via tracing. Specifying this is useful when the original training pipeline has special formats of data loader output or has additional forward arguments other than input tensors. Otherwise, the forward call of the model during graph tracing will be made with mock tensors according to the shape specified in the config object. The dummy_forward_fn code MUST contain calls to the nncf.nncf_model_input function, made with each compressed model input tensor in the underlying model’s args/kwargs tuple, and these calls should be exactly the same as in the wrap_inputs_fn function code (see below); if dummy_forward_fn is specified, then wrap_inputs_fn must also be specified.

  • wrap_inputs_fn (Callable[[Tuple, Dict], Tuple[Tuple, Dict]]) – If supplied, will be applied to the module’s input arguments during a regular, non-dummy forward call before passing the inputs to the underlying compressed model. This is required if the model’s input tensors that are important for compression are not supplied as arguments to the model’s forward call directly, but are instead located in a container (such as a list) that the model receives as an argument. wrap_inputs_fn should take two arguments: the tuple of positional arguments to the underlying model’s forward call, and a dict of keyword arguments to the same. The function should wrap each tensor in a call to the nncf.nncf_model_input function, which is a no-operation that marks the tensors as inputs to be traced by NNCF in the internal graph representation. The output is a tuple of (args, kwargs), where args and kwargs are the same as supplied on input, but with each input tensor wrapped by such a call. Must be specified if dummy_forward_fn is specified. (A sketch using these markers appears under nncf_model_input/nncf_model_output below.)

  • wrap_outputs_fn (Callable[[Any], Any]) – Same as wrap_inputs_fn, but applied to model outputs.

  • dump_graphs – Whether to dump the internal graph representation of the original and compressed models in the .dot format into the log directory.

Returns:

A controller for the compression algorithm (or algorithms, in which case the controller is an instance of CompositeCompressionController) and the model ready for compression parameter training wrapped as an object of NNCFNetwork.

Return type:

Tuple[nncf.api.compression.CompressionAlgorithmController, nncf.torch.nncf_network.NNCFNetwork]
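
For orientation, a minimal usage sketch. The model choice, the config values, and train_loader below are illustrative assumptions, not fixtures from this documentation:

    import torchvision
    from nncf import NNCFConfig
    from nncf.torch import create_compressed_model, register_default_init_args

    model = torchvision.models.resnet18(pretrained=True)  # any torch.nn.Module with weights loaded

    # Input shape for graph tracing with mock tensors, plus one compression algorithm.
    nncf_config = NNCFConfig.from_dict({
        "input_info": {"sample_size": [1, 3, 224, 224]},
        "compression": {"algorithm": "quantization"},
    })

    # train_loader: an existing torch.utils.data.DataLoader (assumed to be defined).
    nncf_config = register_default_init_args(nncf_config, train_loader)

    compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)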

nncf.torch.load_state(model, state_dict_to_load, is_resume=False, keys_to_ignore=None)[source]#

Used to load a checkpoint containing a compressed model into an NNCFNetwork object, but can be used for any PyTorch module as well. Matches the state_dict_to_load parameters to the model’s state_dict parameters while discarding irrelevant prefixes added during wrapping in NNCFNetwork or DataParallel/DistributedDataParallel objects, and loads the matched parameters from state_dict_to_load into the model’s state dict.

Parameters:
  • model (torch.nn.Module) – The target module for the state_dict_to_load to be loaded to.

  • state_dict_to_load (dict) – A state dict containing the parameters to be loaded into the model.

  • is_resume (bool) – Determines the behavior when the function cannot do a successful parameter match when loading. If True, the function raises an exception if it cannot match the state_dict_to_load parameters to the model’s parameters (i.e. if some parameters required by the model are missing in state_dict_to_load, if state_dict_to_load has parameters that could not be matched to model parameters, or if parameter shapes do not match). If False, no exception is raised. Usually is_resume is False when loading an uncompressed model’s weights into a model with compression algorithms already applied, and True when loading a compressed model’s weights into a model with compression algorithms applied in order to evaluate the model.

  • keys_to_ignore (List[str]) – A list of parameter names to be skipped during the matching process.

Returns:

The number of state_dict_to_load entries successfully matched and loaded into the model.

Return type:

int
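
A sketch of a typical call; the checkpoint path, its key layout, and compressed_model are hypothetical:

    import torch
    from nncf.torch import load_state

    checkpoint = torch.load("checkpoint.pth")  # hypothetical path
    state_dict = checkpoint["state_dict"]      # hypothetical checkpoint layout

    # is_resume=True: raise if the compressed model's parameters cannot all be matched.
    num_loaded = load_state(compressed_model, state_dict, is_resume=True)
    print(f"Matched and loaded {num_loaded} entries")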

nncf.torch.register_default_init_args(nncf_config, train_loader, criterion=None, criterion_fn=None, train_steps_fn=None, validate_fn=None, val_loader=None, autoq_eval_fn=None, model_eval_fn=None, distributed_callbacks=None, execution_parameters=None, legr_train_optimizer=None, device=None)[source]#
Parameters:
  • nncf_config (nncf.NNCFConfig) –

  • train_loader (torch.utils.data.DataLoader) –

  • criterion (torch.nn.modules.loss._Loss) –

  • criterion_fn (Callable[[Any, Any, torch.nn.modules.loss._Loss], torch.Tensor]) –

  • train_steps_fn (Callable[[torch.utils.data.DataLoader, torch.nn.Module, torch.optim.Optimizer, nncf.api.compression.CompressionAlgorithmController, Optional[int]], type(None)]) –

  • validate_fn (Callable[[torch.nn.Module, torch.utils.data.DataLoader], Tuple[float, float]]) –

  • val_loader (torch.utils.data.DataLoader) –

  • autoq_eval_fn (Callable[[torch.nn.Module, torch.utils.data.DataLoader], float]) –

  • model_eval_fn (Callable[[torch.nn.Module, torch.utils.data.DataLoader], float]) –

  • distributed_callbacks (Tuple[Callable, Callable]) –

  • execution_parameters (nncf.torch.structures.ExecutionParameters) –

  • legr_train_optimizer (torch.optim.Optimizer) –

  • device (str) –

Return type:

nncf.NNCFConfig
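
No summary is given above, but the signature indicates that this attaches the data loaders, loss, and callbacks that compression algorithms need for initialization (e.g. quantizer range initialization) to the config and returns it. A minimal sketch, assuming nncf_config and train_loader already exist:

    import torch
    from nncf.torch import register_default_init_args

    nncf_config = register_default_init_args(
        nncf_config,    # existing nncf.NNCFConfig (assumed)
        train_loader,   # existing torch.utils.data.DataLoader (assumed)
        criterion=torch.nn.CrossEntropyLoss(),
        device="cuda",
    )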

nncf.torch.register_module(*quantizable_field_names, ignored_algorithms=None, target_weight_dim_for_compression=0)[source]#
Parameters:
  • quantizable_field_names (str) –

  • ignored_algorithms (list) –

  • target_weight_dim_for_compression (int) –
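
The signature suggests decorator-style registration of a custom module class so that its weight attributes become visible to compression algorithms; a sketch under that assumption, with a hypothetical module:

    import torch
    from nncf.torch import register_module

    @register_module()  # assumption: register_module() acts as a class decorator
    class MyCustomLinear(torch.nn.Module):
        def __init__(self, in_features, out_features):
            super().__init__()
            self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))

        def forward(self, x):
            return torch.nn.functional.linear(x, self.weight)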

nncf.torch.register_operator(name=None)[source]#
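
With name=None as the only argument, this reads as a decorator factory; a hedged sketch registering a free function so that NNCF can trace it as a single named operator:

    import torch
    from nncf.torch import register_operator

    @register_operator(name="swish")  # assumption: decorator-factory usage
    def swish(x):
        return x * torch.sigmoid(x)
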
nncf.torch.nncf_model_input(tensor)[source]#
Parameters:

tensor (torch.Tensor) –

nncf.torch.nncf_model_output(tensor)[source]#
Parameters:

tensor (torch.Tensor) –
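
These two no-op markers are the building blocks for the dummy_forward_fn and wrap_inputs_fn arguments of create_compressed_model described above. A sketch for a hypothetical model whose forward takes its input tensor inside a dict:

    import torch
    from nncf.torch import nncf_model_input, nncf_model_output

    def wrap_inputs_fn(args, kwargs):
        # The input tensor lives inside a container, so mark it explicitly.
        payload = args[0]
        payload["image"] = nncf_model_input(payload["image"])
        return args, kwargs

    def dummy_forward_fn(model):
        # Must mirror the nncf_model_input calls made in wrap_inputs_fn.
        inp = nncf_model_input(torch.randn(1, 3, 224, 224))
        return nncf_model_output(model({"image": inp}))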

nncf.torch.disable_tracing(method)[source]#

Patch a method so that it will be executed within the no_nncf_trace context.

Parameters:

method – A method to patch.

nncf.torch.no_nncf_trace()[source]#
nncf.torch.forward_nncf_trace()[source]#
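
A sketch of suppressing tracing around auxiliary computation; the helper function and the patched method are hypothetical:

    from nncf.torch import disable_tracing, no_nncf_trace

    # Context-manager form: statements inside are not recorded in the NNCF graph.
    with no_nncf_trace():
        stats = compute_debug_stats(output)  # hypothetical helper

    # Method-patching form, e.g. for a postprocessing step:
    disable_tracing(MyModel.postprocess)     # hypothetical method
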
nncf.torch.force_build_cpu_extensions()[source]#
nncf.torch.force_build_cuda_extensions()[source]#
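
Assuming these trigger compilation immediately when called, they can be run in an environment-setup step so that the first compressed-model run does not pay the just-in-time build cost:

    import torch
    from nncf.torch import force_build_cpu_extensions, force_build_cuda_extensions

    force_build_cpu_extensions()
    if torch.cuda.is_available():
        force_build_cuda_extensions()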