Per-Layer POL vs Per-Model POL
Per-Model POL
This approach sets a single precision level for the entire model using the CsConfig.precision_opt_level parameter. This method provides a simple and efficient way to control model precision but lacks the granularity to address specific numerical challenges in individual layers.
Per-Layer POL
This approach annotates individual operations (layers) with their own precision levels, covering both the forward and backward passes. The finer granularity makes it possible to tune performance and address numerical issues in specific parts of the model. Annotations are applied with the cstorch.pol decorator on functions within the model code. The scope of an annotation is the body of the decorated function, so either a single operation or a block of operations can be targeted. Per-layer POL overrides the per-model POL setting: where both apply, the operation-specific level takes precedence.
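The precedence rule can be pictured as a simple lookup: an operation uses its own annotation if one exists and falls back to the model-wide level otherwise. The sketch below is plain Python and does not call any Cerebras API; `effective_pol`, the layer names, and the level values are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


def effective_pol(model_level: int, layer_level: Optional[int] = None) -> int:
    """Return the level an operation would run at under the precedence rule.

    A per-layer annotation (layer_level) wins over the per-model setting.
    """
    return model_level if layer_level is None else layer_level


# Hypothetical model: one global level and a single annotated layer.
model_level = 1
layer_annotations: Dict[str, int] = {"attention_scores": 0}

resolved = {
    name: effective_pol(model_level, layer_annotations.get(name))
    for name in ("attention_scores", "ffn_up", "ffn_down")
}
print(resolved)
```

Only the annotated layer departs from the model-wide level; every other layer inherits it unchanged.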
CSTorch API
To annotate a layer or set of layers, use the cstorch.pol decorator.
For example:
cstorch.pol applies the specified precision level to both the forward-pass operations and the corresponding gradient computation operations during backpropagation. The decorator also accepts parameters that set the precision level independently for the forward and backward passes.
For example:
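A sketch of what independent forward and backward levels could look like. The source does not name the decorator's parameters, so `level` and `bwd_level` here are hypothetical, and `pol` is a plain-Python stand-in rather than the real `cstorch.pol`. Consult the API reference for the actual signature.

```python
from typing import Callable, Optional


def pol(level: int, bwd_level: Optional[int] = None) -> Callable:
    """Stand-in decorator that attaches forward/backward levels to a function.

    If bwd_level is omitted, the backward pass inherits the forward level,
    mirroring the behavior described above.
    """
    def decorate(fn: Callable) -> Callable:
        fn.pol_levels = {
            "forward": level,
            "backward": level if bwd_level is None else bwd_level,
        }
        return fn
    return decorate


@pol(level=1)  # one level for both passes
def projection(x):
    return x


@pol(level=1, bwd_level=0)  # separate level for the gradient computation
def logits(x):
    return x


print(projection.pol_levels)
print(logits.pol_levels)
```

Letting the backward level fall back to the forward level keeps the common case short while still allowing the two passes to diverge when gradients need different treatment.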
The current implementation of per-layer POL supports annotating MatMul operations only. Future updates may extend support to additional operations.