llmcompressor.pytorch.utils.sparsification_info.helpers
get_leaf_operations(model, operations_to_skip=None, operations_to_unwrap=None)
Get the leaf operations in the model (those that do not have operations as children)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | the model to get the leaf operations from | required |
operations_to_skip | Optional[List[Module]] | a list of leaf operations that will be omitted when getting the leaf operations. If None is passed, the Identity operation is skipped by default | None |
operations_to_unwrap | Optional[List[Module]] | a list of operations that will be unwrapped when getting the leaf operations. Unwrapping means that the module(s) wrapped by the operation are added directly to the list of leaf operations | None |
Returns:
Type | Description |
---|---|
List[Module] | a list of the leaf operations |
Source code in src/llmcompressor/pytorch/utils/sparsification_info/helpers.py
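Example (illustrative, not part of the generated reference): a minimal sketch of collecting the leaf modules of a small model, assuming the module path shown above imports cleanly in your environment.

```python
import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import (
    get_leaf_operations,
)

model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Sequential(torch.nn.Linear(32, 8), torch.nn.Identity()),
)

# Identity is skipped by default (operations_to_skip=None)
leaves = get_leaf_operations(model)
print([type(op).__name__ for op in leaves])  # e.g. ['Linear', 'ReLU', 'Linear']
```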
get_precision_information(operation)
Get the information about the precision of the operation.
1) If the operation is quantized, returns the quantization scheme of the operation.
2) If the operation is not quantized, returns the number of bits of the operation's weights.
3) If the operation is not quantized and does not have weights, returns None.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
operation | Module | the operation to get the quantization scheme from | required |
Returns:
Type | Description |
---|---|
Union[None, int, QuantizationScheme] | the quantization scheme of the operation, the number of bits of the operation's weights, or None if the operation is not quantized and does not have weights |
Source code in src/llmcompressor/pytorch/utils/sparsification_info/helpers.py
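Example (illustrative): a hedged sketch of inspecting the precision information of each leaf operation; the exact return values depend on whether the modules carry a quantization scheme and on their weight dtypes.

```python
import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import (
    get_leaf_operations,
    get_precision_information,
)

model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU())

for op in get_leaf_operations(model):
    info = get_precision_information(op)
    if info is None:
        # not quantized and has no weights (e.g. ReLU)
        print(f"{type(op).__name__}: no weights, not quantized")
    elif isinstance(info, int):
        # not quantized: number of bits of the weights (e.g. 32 for float32)
        print(f"{type(op).__name__}: {info}-bit weights")
    else:
        # quantized: the attached quantization scheme
        print(f"{type(op).__name__}: quantization scheme {info}")
```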
is_quantized(operation)
Check whether the operation is quantized (contains a quantization scheme)
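Example (illustrative): a small sketch assuming a plain floating-point module, which carries no quantization scheme and is therefore expected to report False.

```python
import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import is_quantized

layer = torch.nn.Linear(16, 8)
print(is_quantized(layer))  # expected: False for an unquantized Linear
```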