llmcompressor.pytorch.utils.sparsification_info.helpers
get_leaf_operations(model, operations_to_skip=None, operations_to_unwrap=None)
Get the leaf operations in the model (those that do not have operations as children)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | the model to get the leaf operations from | required |
operations_to_skip | Optional[List[Module]] | a list of leaf operations that will be omitted when getting the leaf operations. If None is passed, the Identity operation is skipped by default | None |
operations_to_unwrap | Optional[List[Module]] | a list of operations that will be unwrapped when getting the leaf operations. Unwrapping means that the module(s) wrapped by the operation are added directly to the list of leaf operations | None |
Returns:
Type | Description |
---|---|
List[Module] | a list of the leaf operations |
Source code in src/llmcompressor/pytorch/utils/sparsification_info/helpers.py
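Example (illustrative, not part of the generated reference): a minimal sketch of collecting the leaf modules of a small model, assuming the module path shown above imports cleanly in your environment.

```python
import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import (
    get_leaf_operations,
)

model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Sequential(torch.nn.Linear(32, 8), torch.nn.Identity()),
)

# Identity is skipped by default (operations_to_skip=None)
leaves = get_leaf_operations(model)
print([type(op).__name__ for op in leaves])  # e.g. ['Linear', 'ReLU', 'Linear']
```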
get_precision_information(operation)
Get the information about the precision of the operation.
1) If the operation is quantized, returns the quantization scheme of the operation.
2) If the operation is not quantized, returns the number of bits of the operation's weights.
3) If the operation is not quantized and does not have weights, returns None.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
operation | Module | the operation to get the quantization scheme from | required |
Returns:
Type | Description |
---|---|
Union[None, int, QuantizationScheme] | the quantization scheme of the operation, the number of bits of the operation's weights, or None if the operation is not quantized and does not have weights |
Source code in src/llmcompressor/pytorch/utils/sparsification_info/helpers.py
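Example (illustrative): a hedged sketch of inspecting the precision information of each leaf operation; the exact return values depend on whether the modules carry a quantization scheme and on their weight dtypes.

```python
import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import (
    get_leaf_operations,
    get_precision_information,
)

model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU())

for op in get_leaf_operations(model):
    info = get_precision_information(op)
    if info is None:
        # not quantized and has no weights (e.g. ReLU)
        print(f"{type(op).__name__}: no weights, not quantized")
    elif isinstance(info, int):
        # not quantized: number of bits of the weights (e.g. 32 for float32)
        print(f"{type(op).__name__}: {info}-bit weights")
    else:
        # quantized: the attached quantization scheme
        print(f"{type(op).__name__}: quantization scheme {info}")
```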
is_quantized(operation)
Check whether the operation is quantized (contains a quantization scheme)
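Example (illustrative): a small sketch assuming a plain floating-point module, which carries no quantization scheme and is therefore expected to report False.

```python
import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import is_quantized

layer = torch.nn.Linear(16, 8)
print(is_quantized(layer))  # expected: False for an unquantized Linear
```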