llmcompressor.pipelines.layer_sequential.helpers
EarlyStopException
dataclass
Bases: Exception
Dataclass for storing model activations.
Note: the attribute names `args` and `kwargs` are reserved for `dataclass.GenericAlias`.
Source code in src/llmcompressor/pipelines/layer_sequential/helpers.py
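As a rough illustration of why an exception doubles as a data container here: raising from a hook on the first layer both halts the forward pass and carries the layer's inputs back to the caller. The sketch below is a hypothetical minimal analogue, not the library's implementation; the underscore-prefixed field names follow the reserved-name note above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

@dataclass
class EarlyStopException(Exception):
    # `args` and `kwargs` are reserved attribute names, so the captured
    # activations are stored under underscore-prefixed fields instead
    _args: Tuple[Any, ...]
    _kwargs: Dict[str, Any]

def first_layer_pre_hook(*layer_args, **layer_kwargs):
    # hypothetical pre-forward hook: capture the layer's inputs and
    # abort the forward pass by raising
    raise EarlyStopException(layer_args, layer_kwargs)

try:
    first_layer_pre_hook("hidden_states", attention_mask=None)
except EarlyStopException as exc:
    captured = exc
```

Because the exception is a dataclass, the caller can unpack `captured._args` and `captured._kwargs` directly after catching it.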
capture_first_layer_intermediates(model, first_layer, dataloader, mask_padding=True)
Captures the intermediate activations directly before the first model layer. This is meant to capture any model preprocessing that occurs before the model layers are executed.
Note that if any modules are compressed prior to the execution of the first layer, the compression error induced by compressing those modules will not be propagated to subsequent activations, as it would be for modules which are compressed within a layer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | model containing layers | required |
first_layer | Module | the first layer of the model | required |
dataloader | DataLoader | dataloader of calibration inputs | required |
mask_padding | bool | zero out padding tokens if True. This affects modifiers such as GPTQ and SparseGPT | True |
Source code in src/llmcompressor/pipelines/layer_sequential/helpers.py
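The capture loop can be pictured as: run each calibration batch through the model's preprocessing, let a hook on the first layer raise with the inputs it received, and collect those inputs instead of finishing the forward pass. The following is a simplified sketch with stand-ins for the model and dataloader (the `run_model` preprocessing step is invented for illustration), not the real torch-based implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

@dataclass
class EarlyStopException(Exception):
    _args: Tuple[Any, ...]
    _kwargs: Dict[str, Any]

def run_model(batch):
    # stand-in for model.forward: preprocessing runs first (here, a
    # pretend embedding lookup), then the hooked first layer raises
    # with the inputs it would have received
    hidden_states = [token * 2 for token in batch]
    raise EarlyStopException((hidden_states,), {})

def capture_first_layer_intermediates(dataloader):
    # run each calibration batch up to the first layer, collecting the
    # captured inputs rather than completing the forward pass
    intermediates = []
    for batch in dataloader:
        try:
            run_model(batch)
        except EarlyStopException as exception:
            intermediates.append(exception._args)
    return intermediates

intermediates = capture_first_layer_intermediates([[1, 2], [3]])
```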
match_modules(model, target_names)
Find all submodules which match the `target_names` and sort them by name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | model to search for submodules in | required |
target_names | List[str] | patterns of submodule names to match | required |
Returns:
Type | Description |
---|---|
List[Module] | list of submodules |
Source code in src/llmcompressor/pipelines/layer_sequential/helpers.py
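Conceptually this walks the model's named submodules, keeps those whose names match one of the target patterns, and returns them in name order. A pure-Python sketch over a plain name-to-module dict is shown below; whether the real helper treats `target_names` as regular expressions is an assumption made here for illustration.

```python
import re
from typing import Dict, List

def match_modules(named_modules: Dict[str, object], target_names: List[str]) -> List[object]:
    # hypothetical analogue of the helper: keep submodules whose names
    # match any target pattern, then return them sorted by name
    matched = {
        name: module
        for name, module in named_modules.items()
        if any(re.match(pattern, name) for pattern in target_names)
    }
    return [matched[name] for name in sorted(matched)]

modules = {"model.layers.1": "L1", "model.layers.0": "L0", "model.norm": "N"}
layers = match_modules(modules, [r"model\.layers\.\d+"])
```

Sorting by name keeps the returned layers in their architectural order, which matters when they are executed sequentially.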
maybe_inject_pos_embeddings(output, next_layer, inputs)
As of https://github.com/huggingface/transformers/pull/34858, positional embeddings must be passed into each decoder call as kwargs
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Dict[str, Any] | output of the previous layer | required |
next_layer | Module | next layer to call | required |
inputs | Dict[str, Any] | inputs to next layer | required |
Source code in src/llmcompressor/pipelines/layer_sequential/helpers.py
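The idea, per the linked transformers PR, is that rotary position embeddings are computed once and then passed into every decoder layer as a kwarg, so a layer-by-layer pipeline must forward them manually. A hedged sketch of that logic using `inspect.signature` follows; the exact conditions the real helper checks are an assumption here.

```python
import inspect
from typing import Any, Callable, Dict

def maybe_inject_pos_embeddings(
    output: Dict[str, Any],
    next_layer: Callable,
    inputs: Dict[str, Any],
) -> Dict[str, Any]:
    # sketch: forward `position_embeddings` from the previous layer's
    # output into the next layer's kwargs, but only when the next layer
    # accepts that keyword and the caller has not already supplied it
    signature = inspect.signature(next_layer)
    if (
        "position_embeddings" in signature.parameters
        and "position_embeddings" in output
        and "position_embeddings" not in inputs
    ):
        inputs = {**inputs, "position_embeddings": output["position_embeddings"]}
    return inputs

def decoder_layer(hidden_states, position_embeddings=None):
    # stand-in for a transformers decoder layer forward
    return hidden_states

kwargs = maybe_inject_pos_embeddings(
    {"position_embeddings": ("cos", "sin")}, decoder_layer, {"hidden_states": "h"}
)
```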
to_next_layer_kwargs(args, next_layer)
Convert a tuple of positional argument values to a dictionary of keyword arguments which match the next layer's function signature.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
args | Tuple[Any, ...] | list of argument values | required |
next_layer | Module | the next layer whose function signature must be matched | required |
Returns:
Type | Description |
---|---|
Dict[str, Any] | dictionary mapping function signature keywords to argument values |
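The conversion amounts to pairing each positional value with the corresponding parameter name from the next layer's signature. A minimal sketch is below, assuming a plain callable; the real helper operates on an `nn.Module`, whose `forward` signature would be the one inspected.

```python
import inspect
from typing import Any, Callable, Dict, Tuple

def to_next_layer_kwargs(args: Tuple[Any, ...], next_layer: Callable) -> Dict[str, Any]:
    # pair positional values with the layer's parameter names in order;
    # trailing parameters without a supplied value are simply omitted
    param_names = list(inspect.signature(next_layer).parameters)
    return dict(zip(param_names, args))

def layer(hidden_states, attention_mask, position_ids=None):
    # stand-in for a decoder layer forward
    return hidden_states

kwargs = to_next_layer_kwargs(("h", "mask"), layer)
```

Returning kwargs rather than positional args lets the pipeline call layers whose signatures differ in parameter order or arity.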