FP8
Below are the functions and classes related to the underlying FP8 implementation.
FP8RecipeKwargs
class accelerate.utils.FP8RecipeKwargs
( opt_level: typing.Literal['O1', 'O2'] = None, use_autocast_during_eval: bool = None, margin: int = None, interval: int = None, fp8_format: typing.Literal['E4M3', 'HYBRID'] = None, amax_history_len: int = None, amax_compute_algo: typing.Literal['max', 'most_recent'] = None, override_linear_precision: typing.Tuple[bool, bool, bool] = None, backend: typing.Literal['MSAMP', 'TE'] = None )
Deprecated. Please use one of the proper FP8 recipe kwargs classes such as TERecipeKwargs or MSAMPRecipeKwargs instead.
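The replacement handlers plug into the Accelerator through kwargs_handlers. Below is a minimal sketch of the TransformerEngine path, assuming the TE backend is available; the field values are illustrative, not defaults:

```python
from accelerate import Accelerator
from accelerate.utils import TERecipeKwargs

# Describe the FP8 recipe with the dedicated TE handler instead of the
# deprecated FP8RecipeKwargs(backend="TE").
fp8_kwargs = TERecipeKwargs(
    fp8_format="HYBRID",   # E4M3 in the forward pass, E5M2 in the backward pass
    amax_history_len=16,
    amax_compute_algo="max",
)
accelerator = Accelerator(mixed_precision="fp8", kwargs_handlers=[fp8_kwargs])
```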
convert_model
accelerate.utils.convert_model
( model, to_transformer_engine = True, _convert_linear = True, _convert_ln = True )
Recursively converts the linear and layernorm layers of a model to their transformer_engine counterparts.
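As an illustration, a minimal sketch of converting a small model in place, assuming transformer_engine is installed (the layer sizes are arbitrary, though TE's FP8 kernels expect dimensions divisible by 16):

```python
import torch.nn as nn
from accelerate.utils import convert_model

model = nn.Sequential(nn.Linear(64, 64), nn.LayerNorm(64))
# Replaces nn.Linear / nn.LayerNorm with te.Linear / te.LayerNorm in place.
convert_model(model)
```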
has_transformer_engine_layers
accelerate.utils.has_transformer_engine_layers
( model )
Returns whether a given model has any transformer_engine layers.
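A short self-contained sketch of using the check around a conversion, under the same transformer_engine assumption as above:

```python
import torch.nn as nn
from accelerate.utils import convert_model, has_transformer_engine_layers

model = nn.Sequential(nn.Linear(64, 64), nn.LayerNorm(64))
print(has_transformer_engine_layers(model))  # False: only plain PyTorch layers
convert_model(model)
print(has_transformer_engine_layers(model))  # True once te.Linear / te.LayerNorm are in place
```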
contextual_fp8_autocast
accelerate.utils.contextual_fp8_autocast
( model_forward, fp8_recipe, use_during_eval = False )
Wrapper for a model's forward method that applies FP8 autocast. It is context aware: by default, FP8 autocast is disabled during eval mode, which generally yields more accurate metrics.
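A hedged sketch of applying the wrapper by hand, assuming transformer_engine is installed and that the returned wrapper takes the module as its first argument (in practice, apply_fp8_autowrap below performs this binding for you):

```python
from types import MethodType

import torch.nn as nn
import transformer_engine.common.recipe as te_recipe
from accelerate.utils import contextual_fp8_autocast

model = nn.Linear(64, 64)  # stand-in; pair with convert_model in real use

# DelayedScaling recipe consumed by transformer_engine's fp8_autocast context.
fp8_recipe = te_recipe.DelayedScaling(fp8_format=te_recipe.Format.HYBRID, amax_history_len=16)

# The wrapper reads `self.training` to decide whether autocast is enabled,
# hence the MethodType binding back onto the model.
wrapped_forward = contextual_fp8_autocast(model.forward, fp8_recipe, use_during_eval=False)
model.forward = MethodType(wrapped_forward, model)
```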
apply_fp8_autowrap
accelerate.utils.apply_fp8_autowrap
( model, fp8_recipe_handler )
Applies the FP8 context manager to the model's forward method.
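Putting the pieces together, a hedged end-to-end sketch under the same transformer_engine assumption as above:

```python
import torch.nn as nn
from accelerate.utils import TERecipeKwargs, apply_fp8_autowrap, convert_model

model = nn.Sequential(nn.Linear(64, 64), nn.LayerNorm(64))
convert_model(model)  # swap in transformer_engine layers first

recipe_handler = TERecipeKwargs(fp8_format="HYBRID", amax_history_len=16)
model = apply_fp8_autowrap(model, recipe_handler)  # forward now runs under fp8_autocast
```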