llmcompressor.pytorch.utils
Generic utilities and helper functions for PyTorch.
ModuleSparsificationInfo
Helper class that provides information about a torch Module's parameters and the amount of sparsification applied, covering both pruning and quantization.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | torch Module to analyze | required |
state_dict | Optional[Dict[str, Tensor]] | optional state_dict to analyze in place of the torch model. This is used when analyzing an FSDP model, where the full weights may not be accessible | None |
Source code in src/llmcompressor/pytorch/utils/sparsification.py
params_quantized (property)
Returns:
Type | Description |
---|---|
int | number of parameters across quantized layers |
params_quantized_percent (property)
Returns:
Type | Description |
---|---|
float | percentage of parameters that have been quantized |
params_sparse (property)
Returns:
Type | Description |
---|---|
int | total number of sparse (zero-valued) trainable parameters in the model |
params_sparse_percent (property)
Returns:
Type | Description |
---|---|
float | percentage of parameters in the entire model that have been sparsified |
params_total (property)
Returns:
Type | Description |
---|---|
int | total number of trainable parameters in the model |
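A minimal usage sketch for the properties above; the import path from llmcompressor.pytorch.utils is an assumption here, and the toy model is only for illustration:

```python
import torch
from llmcompressor.pytorch.utils import ModuleSparsificationInfo  # assumed export path

# toy model; in practice this is the model being compressed
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)

# analyze the module directly; pass state_dict=... instead for FSDP models
info = ModuleSparsificationInfo(model)

print(info.params_total)           # total trainable parameters
print(info.params_sparse)          # parameters that are exactly zero
print(info.params_sparse_percent)  # sparsity as a percentage of all parameters
print(info.params_quantized)       # parameters in quantized layers
```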
get_linear_layers(module)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | the module to grab all linear layers for | required |
Returns:
Type | Description |
---|---|
Dict[str, Module] | a mapping of layer name to module for all linear layers in the module |
Source code in src/llmcompressor/pytorch/utils/helpers.py
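For example, a short sketch of collecting the Linear layers from a small model (assuming the helper is importable from llmcompressor.pytorch.utils):

```python
import torch
from llmcompressor.pytorch.utils import get_linear_layers  # assumed export path

model = torch.nn.Sequential(
    torch.nn.Linear(32, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 8),
)

# mapping of layer name -> torch.nn.Linear module
linears = get_linear_layers(model)
for name, layer in linears.items():
    print(name, tuple(layer.weight.shape))
```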
get_quantized_layers(module)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | the module to get the quantized layers from | required |
Returns:
Type | Description |
---|---|
List[Tuple[str, Module]] | a list containing the names and modules of the quantized layers (Embedding, Linear, Conv2d, Conv3d) |
Source code in src/llmcompressor/pytorch/utils/helpers.py
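A hedged sketch of the call pattern; the toy model below is unquantized, so the returned list would stay empty until quantization has actually been applied:

```python
import torch
from llmcompressor.pytorch.utils import get_quantized_layers  # assumed export path

model = torch.nn.Sequential(
    torch.nn.Embedding(100, 16),
    torch.nn.Linear(16, 4),
)

# (name, module) pairs for layers detected as quantized; empty for this
# unquantized toy model, populated after quantization has been applied
for name, layer in get_quantized_layers(model):
    print(name, type(layer).__name__)
```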
set_deterministic_seeds(seed=0)
Manually seeds the numpy, random, and torch packages. Also sets torch.backends.cudnn.deterministic to True.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
seed | int | the manual seed to use. Default is 0 | 0 |
Source code in src/llmcompressor/pytorch/utils/helpers.py
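A small sketch showing the intended effect: re-seeding with the same value reproduces the same random draws (import path assumed as above):

```python
import torch
from llmcompressor.pytorch.utils import set_deterministic_seeds  # assumed export path

set_deterministic_seeds(seed=42)
first = torch.rand(3)

set_deterministic_seeds(seed=42)
second = torch.rand(3)

# same seed -> identical draws from torch (numpy and random are seeded too)
assert torch.equal(first, second)
```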
tensor_sparsity(tens, dim=None)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tens | Tensor | the tensor to calculate the sparsity for | required |
dim | Union[None, int, List[int], Tuple[int, ...]] | the dimension(s) to split the calculation over; e.g., can split over batch, channels, or combinations of dimensions | None |
Returns:
Type | Description |
---|---|
Tensor | the sparsity of the input tens, i.e. the fraction of elements that are zero |
Source code in src/llmcompressor/pytorch/utils/helpers.py
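A minimal sketch of both call forms; the per-dim call is assumed to return one sparsity value per index along the given dimension:

```python
import torch
from llmcompressor.pytorch.utils import tensor_sparsity  # assumed export path

weights = torch.tensor([[0.0, 1.0, 0.0, 2.0],
                        [0.0, 0.0, 3.0, 4.0]])

# overall sparsity: 4 of 8 elements are zero -> 0.5
print(tensor_sparsity(weights))

# sparsity split over dim 0, i.e. one value per row
print(tensor_sparsity(weights, dim=0))
```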
tensors_module_forward(tensors, module, check_feat_lab_inp=True)
Default function for calling into a model with data for a forward pass; returns the model's output. Note: if an iterable is given, the features to pass into the model are taken from index 0 and the remaining indices are treated as labels.
Supported use cases: a single tensor, or an iterable whose first tensor is taken as the features to pass into the model
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tensors | Union[Tensor, Iterable[Tensor], Mapping[Any, Tensor]] | the data to be passed into the model, if an iterable the features to be passed into the model are considered to be at index 0 and other indices are for labels | required |
module | Module | the module to pass the data into | required |
check_feat_lab_inp | bool | True to check whether the incoming tensors look like a features/labels pair, i.e. a tuple or list with 2 items (the typical output of a data loader); if so, only the first element is passed into the model as the features. False to skip this check | True |
Returns:
Type | Description |
---|---|
Any | the result of calling into the model for a forward pass |
Source code in src/llmcompressor/pytorch/utils/helpers.py
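A short sketch of both supported input shapes (a single tensor and a features/labels pair), with the import path assumed:

```python
import torch
from llmcompressor.pytorch.utils import tensors_module_forward  # assumed export path

model = torch.nn.Linear(16, 4)

# single tensor: passed straight into the model
out = tensors_module_forward(torch.rand(8, 16), model)

# (features, labels) pair from a data loader: only the features (index 0)
# are fed to the model when check_feat_lab_inp=True
batch = (torch.rand(8, 16), torch.randint(0, 4, (8,)))
out = tensors_module_forward(batch, model, check_feat_lab_inp=True)
print(out.shape)  # (8, 4)
```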
tensors_to_device(tensors, device)
Default function for moving a tensor or collection of tensors onto the given device. Returns references to the tensors after they have been placed on that device.
Supported use cases:
- single tensor
- dictionary of single tensors
- dictionary of iterables of tensors
- dictionary of dictionaries of tensors
- iterable of single tensors
- iterable of iterables of tensors
- iterable of dictionaries of tensors
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tensors | Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] | the tensors or collection of tensors to put onto a device | required |
device | str | the string representing the device to put the tensors on, e.g. 'cpu', 'cuda', 'cuda:1' | required |
Returns:
Type | Description |
---|---|
Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] | the tensors or collection of tensors after being placed on the device |
Source code in src/llmcompressor/pytorch/utils/helpers.py
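A sketch of moving a nested batch, assuming the import path above; the device string falls back to 'cpu' when CUDA is unavailable:

```python
import torch
from llmcompressor.pytorch.utils import tensors_to_device  # assumed export path

batch = {
    "features": torch.rand(4, 16),
    "extras": [torch.rand(4), torch.rand(4)],  # dictionary of iterable of tensors
}

device = "cuda" if torch.cuda.is_available() else "cpu"

# every tensor in the nested collection is moved to the target device
moved = tensors_to_device(batch, device)
print(moved["features"].device)
```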
tensors_to_precision(tensors, full_precision)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tensors | Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] | the tensors to change the precision of | required |
full_precision | bool | True for full precision (float32), False for half precision (float16) | required |
Returns:
Type | Description |
---|---|
Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] | the tensors converted to the desired precision |
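A small sketch of casting a collection both ways (import path assumed):

```python
import torch
from llmcompressor.pytorch.utils import tensors_to_precision  # assumed export path

tensors = [torch.rand(2, 2), torch.rand(3)]

# full_precision=False casts the float tensors to float16
half = tensors_to_precision(tensors, full_precision=False)
print(half[0].dtype)  # torch.float16

# full_precision=True casts them back to float32
full = tensors_to_precision(half, full_precision=True)
print(full[0].dtype)  # torch.float32
```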