llmcompressor.pipelines.sequential.helpers
Subgraph
dataclass
Dataclass specifying an executable subgraph of a model graph
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | Graph | subgraph of model graph | required |
input_names | Set[str] | argument names of the compiled forward function | required |
consumed_names | Set[str] | argument names which are not used by any subsequent subgraphs and can therefore be deleted from the intermediates cache | required |
Source code in src/llmcompressor/pipelines/sequential/helpers.py
forward(*args, **kwargs)
Execute the operations within the subgraph
Parameters:
Name | Type | Description | Default |
---|---|---|---|
\*args | | argument inputs to subgraph forward function | required |
\**kwargs | | keyword inputs to subgraph forward function | required |
Returns:
Type | Description |
---|---|
Dict[str, Any] | dictionary mapping output node names to their computed values |
find_target_nodes(graph, targets)
Find all nodes whose execution is equivalent to executing the target modules. These nodes are guaranteed to be treated as leaf nodes by SequentialTracer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | GraphModule | graph containing target nodes | required |
targets | Set[Module] | modules whose nodes are being searched for | required |
Returns:
Type | Description |
---|---|
Set[Node] | set of all nodes which call the target modules |
get_sequential_ancestors(model, targets)
Find modules which are call graph ancestors of the given sequential targets
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | model containing sequential targets | required |
targets | Set[Module] | sequential targets to find ancestors of | required |
Returns:
Type | Description |
---|---|
Set[Module] | call graph ancestors of sequential targets |
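As a rough illustration of what "ancestors of the sequential targets" means, the sketch below works on dotted module names rather than real modules. It assumes that ancestry can be approximated by containment in the module tree; the function name `ancestors_of` and the example names are hypothetical and not part of llmcompressor.

```python
def ancestors_of(module_names, target_names):
    """Return every known name that is a strict dotted-prefix ancestor of a target.

    Assumption: module-tree containment stands in for call graph ancestry.
    """
    result = set()
    for target in target_names:
        parts = target.split(".")
        # Every proper prefix of the dotted path is a candidate ancestor
        for i in range(1, len(parts)):
            prefix = ".".join(parts[:i])
            if prefix in module_names:
                result.add(prefix)
    return result


# Hypothetical module names for a small decoder-style model
module_names = {
    "model",
    "model.embed",
    "model.layers",
    "model.layers.0",
    "model.layers.1",
    "lm_head",
}
found = ancestors_of(module_names, {"model.layers.0", "model.layers.1"})
print(sorted(found))
```

Modules such as `model.embed` and `lm_head` are not ancestors of any target, which is the kind of module the tracer described below avoids tracing into.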
get_targets_from_modifiers(modifiers, model)
Infer the sequential targets and the ignore list from the list of modifiers
Parameters:
Name | Type | Description | Default |
---|---|---|---|
modifiers | List[Modifier] | list of modifiers being applied during calibration | required |
model | PreTrainedModel | model being calibrated | required |
Returns:
Type | Description |
---|---|
Tuple[List[str], List[str]] | list of sequential targets and list of modules to ignore for tracing |
get_tracer(model, sequential_targets, ignore)
Get a tracer specialized for the given model. The resulting tracer will not trace inside sequential targets, nor inside any modules which are not call graph ancestors of sequential targets.
Tracing within sequential targets is unnecessary, and tracing within offloaded modules may result in meta tensors being added to the model graph.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | model being traced | required |
sequential_targets | Set[Module] | modules which are sequential targets | required |
ignore | Set[Module] | modules to ignore during tracing; in the future, this will specify functions and methods to skip during tracing | required |
graph_is_well_formed(graph)
A graph is well formed if and only if nodeA in nodeB.users <=> nodeB in nodeA.all_input_nodes
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | Graph | graph being checked | required |
Returns:
Type | Description |
---|---|
bool | True if the graph is well formed, False otherwise |
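The well-formedness condition can be illustrated on a toy graph. This is a minimal sketch, assuming stand-in nodes that expose only `users` and `all_input_nodes`; `ToyNode` and `toy_graph_is_well_formed` are hypothetical names, and the check is not the library's implementation.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing and comparison
class ToyNode:
    name: str
    all_input_nodes: list = field(default_factory=list)
    users: set = field(default_factory=set)


def toy_graph_is_well_formed(nodes):
    """Check that the 'user' and 'input' relations mirror each other."""
    for node in nodes:
        # Every user of this node must list this node as an input
        for user in node.users:
            if node not in user.all_input_nodes:
                return False
        # Every input of this node must list this node as a user
        for input_node in node.all_input_nodes:
            if node not in input_node.users:
                return False
    return True


a, b = ToyNode("a"), ToyNode("b")
b.all_input_nodes.append(a)
a.users.add(b)
well_formed = toy_graph_is_well_formed([a, b])

# c reads from a, but a does not record c as a user
c = ToyNode("c")
c.all_input_nodes.append(a)
broken = toy_graph_is_well_formed([a, b, c])
print(well_formed, broken)
```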
match_modules(model, target_names)
Find modules whose names match the patterns given by target_names
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | model containing submodules to find | required |
target_names | List[str] | target patterns to find | required |
Returns:
Type | Description |
---|---|
Set[Module] | all submodules matching target_names |
partition_graph(model, partitions)
Convert each partition into a Subgraph. Each Subgraph returns a dictionary mapping of output node names to their computed values. Note that the consumed_names attribute of each Subgraph remains empty, to be later populated by trace_consumed_names
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | model which owns the produced Subgraphs | required |
partitions | List[List[Node]] | list of partitions, where each partition is a list of nodes belonging to that partition | required |
Returns:
Type | Description |
---|---|
List[Subgraph] | list of subgraphs in order of execution |
populate_concrete_args(model, sample_input)
Create concrete args. Unlike the equivalent function provided by transformers.utils.fx, this function also creates default values for variadic arguments, which some models need.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | model being traced | required |
sample_input | Dict | values used to symbolically trace the model. All arguments to the model.forward function which are not in the sample_input are considered concrete args | required |
Returns:
Type | Description |
---|---|
Dict | dictionary mapping concrete argument names to their default values |
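The idea can be sketched with the standard `inspect` module. This is an illustration only: `sketch_concrete_args` and `toy_forward` are hypothetical names, and the choice of an empty tuple and an empty dict as defaults for variadic arguments is an assumption, not something the documentation above specifies.

```python
import inspect


def sketch_concrete_args(forward, sample_input):
    """Map every forward argument missing from sample_input to a default value."""
    concrete_args = {}
    for param in inspect.signature(forward).parameters.values():
        if param.name in sample_input:
            continue  # supplied by the sample input, so traced symbolically
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            concrete_args[param.name] = ()  # assumed default for *args
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            concrete_args[param.name] = {}  # assumed default for **kwargs
        else:
            # Parameters without a default yield inspect.Parameter.empty
            concrete_args[param.name] = param.default
    return concrete_args


def toy_forward(input_ids, attention_mask=None, use_cache=True, *args, **kwargs):
    return input_ids


concrete = sketch_concrete_args(toy_forward, {"input_ids": [1, 2, 3]})
print(concrete)
```

Here `input_ids` is excluded because it appears in the sample input, while the variadic `args` and `kwargs` still receive explicit values.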
topological_partition(graph, targets)
Partition the graph into partitions such that each target belongs to exactly one partition and executing each partition depends only on intermediate values produced by executing the partitions before it.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
graph | GraphModule | graph being partitioned | required |
targets | Set[Module] | target modules which will be assigned to disjoint partitions | required |
Returns:
Type | Description |
---|---|
List[List[Node]] | list of partitions, where each partition is a list of nodes belonging to that partition |
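One simple way to obtain such a partitioning is to order the nodes topologically and start a new partition after each target. The sketch below does this with Kahn's algorithm on a toy graph given as a dictionary of node names; it is a simplification under that assumption, and `toy_topological_partition` and the node names are hypothetical rather than the library's algorithm.

```python
from collections import deque


def toy_topological_partition(inputs_of, targets):
    """inputs_of maps each node name to the names of the nodes it reads from."""
    remaining = {node: len(deps) for node, deps in inputs_of.items()}
    users_of = {node: [] for node in inputs_of}
    for node, deps in inputs_of.items():
        for dep in deps:
            users_of[dep].append(node)

    # Kahn's algorithm: repeatedly emit nodes whose inputs are all emitted
    queue = deque(node for node, count in remaining.items() if count == 0)
    partitions = [[]]
    while queue:
        node = queue.popleft()
        partitions[-1].append(node)
        if node in targets:
            partitions.append([])  # a target closes the current partition
        for user in users_of[node]:
            remaining[user] -= 1
            if remaining[user] == 0:
                queue.append(user)
    return partitions


inputs_of = {
    "input_ids": [],
    "embed": ["input_ids"],
    "layer0": ["embed"],
    "layer1": ["layer0"],
    "norm": ["layer1"],
}
parts = toy_topological_partition(inputs_of, {"layer0", "layer1"})
print(parts)
```

Because the partitions are contiguous slices of one topological order, every node's inputs lie in the same partition or an earlier one.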
trace_consumed_names(subgraphs)
Populate the consumed_names attribute of each Subgraph according to when inputs are last used, in order to vacate the intermediates cache and save memory
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subgraphs | List[Subgraph] | list of subgraphs with empty consumed_names | required |
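The "last use" rule can be sketched on stand-in objects that carry only `input_names` and `consumed_names`. `ToySubgraph` and `toy_trace_consumed_names` are hypothetical names used for illustration; this is not the library's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToySubgraph:
    input_names: set
    consumed_names: set = field(default_factory=set)


def toy_trace_consumed_names(subgraphs):
    """Assign each input name to the last subgraph that reads it."""
    all_input_names = set().union(*(sg.input_names for sg in subgraphs))
    for name in all_input_names:
        # Walk backwards so the first match is the final user of the name
        for subgraph in reversed(subgraphs):
            if name in subgraph.input_names:
                subgraph.consumed_names.add(name)
                break


subgraphs = [
    ToySubgraph({"input_ids"}),
    ToySubgraph({"hidden_0", "attention_mask"}),
    ToySubgraph({"hidden_1", "attention_mask"}),
]
toy_trace_consumed_names(subgraphs)
for subgraph in subgraphs:
    print(sorted(subgraph.consumed_names))
```

In this example `attention_mask` is read by two subgraphs, so only the later one is allowed to delete it from the cache.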
trace_subgraphs(model, sample_input, sequential_targets, ignore)
Trace a model to produce subgraphs, where each sequential target belongs to exactly one subgraph and where executing each subgraph in order is equivalent to executing the original model
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | PreTrainedModel | model being traced | required |
sample_input | Dict[str, Any] | inputs whose values will change during execution but whose __len__, __bool__, and __contains__ values are assumed constant across batches | required |
sequential_targets | List[str] | list of patterns matching sequential targets | required |
ignore | List[str] | modules to ignore during tracing; in the future, this will specify functions and methods to skip during tracing | required |
Returns:
Type | Description |
---|---|
List[Subgraph] | a list of Subgraphs in order of execution |
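To show how subgraphs executed in order can reproduce the original computation, the sketch below runs toy subgraphs against an intermediates cache and deletes consumed names as it goes. The dictionary layout, the `run_sequentially` helper, and the arithmetic "layers" are assumptions made for illustration; real Subgraph objects are produced by tracing a model.

```python
def run_sequentially(subgraphs, inputs):
    """Execute toy subgraphs in order, freeing cached values after last use."""
    intermediates = dict(inputs)
    outputs = {}
    for subgraph in subgraphs:
        kwargs = {name: intermediates[name] for name in subgraph["input_names"]}
        outputs = subgraph["forward"](**kwargs)  # dict of output name -> value
        intermediates.update(outputs)
        for name in subgraph["consumed_names"]:
            del intermediates[name]  # no later subgraph needs this value
    return outputs, intermediates


# Two toy stages that together compute y = (x + 1) * 2
toy_subgraphs = [
    {
        "input_names": {"x"},
        "consumed_names": {"x"},
        "forward": lambda x: {"h": x + 1},
    },
    {
        "input_names": {"h"},
        "consumed_names": {"h"},
        "forward": lambda h: {"y": h * 2},
    },
]
outputs, intermediates = run_sequentially(toy_subgraphs, {"x": 3})
print(outputs, intermediates)
```

The final output matches evaluating `(x + 1) * 2` directly, and the cache ends up holding only the value that nothing has consumed.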