
FP8

Below are the functions and classes related to the underlying FP8 implementation.

FP8RecipeKwargs

class accelerate.utils.FP8RecipeKwargs

( opt_level: typing.Literal['O1', 'O2'] = None, use_autocast_during_eval: bool = None, margin: int = None, interval: int = None, fp8_format: typing.Literal['E4M3', 'HYBRID'] = None, amax_history_len: int = None, amax_compute_algo: typing.Literal['max', 'most_recent'] = None, override_linear_precision: typing.Tuple[bool, bool, bool] = None, backend: typing.Literal['MSAMP', 'TE'] = None )

Deprecated. Please use one of the proper FP8 recipe kwargs classes such as TERecipeKwargs or MSAMPRecipeKwargs instead.
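
A minimal sketch of the replacement pattern, assuming TERecipeKwargs exposes the same TransformerEngine-related fields as the deprecated signature above (fp8_format, amax_history_len, amax_compute_algo):

```python
from accelerate import Accelerator
from accelerate.utils import TERecipeKwargs

# Build a TransformerEngine FP8 recipe handler. The field names below mirror
# the deprecated FP8RecipeKwargs signature; check them against your version.
kwargs = TERecipeKwargs(
    fp8_format="HYBRID",        # E4M3 in the forward pass, E5M2 in the backward pass
    amax_history_len=16,
    amax_compute_algo="max",
)

# Hand the handler to the Accelerator together with FP8 mixed precision.
accelerator = Accelerator(mixed_precision="fp8", kwargs_handlers=[kwargs])
```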

convert_model

accelerate.utils.convert_model

( model, to_transformer_engine = True, _convert_linear = True, _convert_ln = True )

Recursively converts the linear and layernorm layers of a model to their transformer_engine counterparts.
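
A short usage sketch; it assumes NVIDIA's transformer_engine package is installed and that the conversion mutates the model in place:

```python
import torch

from accelerate.utils import convert_model

model = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.LayerNorm(64),
)

# Recursively replace nn.Linear and nn.LayerNorm modules with their
# transformer_engine counterparts (assumed to happen in place).
convert_model(model, to_transformer_engine=True)
```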

has_transformer_engine_layers

accelerate.utils.has_transformer_engine_layers

( model )

Returns whether a given model contains any transformer_engine layers.
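
Continuing with the model from the previous sketch, this is useful as a guard before conversion:

```python
from accelerate.utils import convert_model, has_transformer_engine_layers

# Only convert if the model has not already been converted.
if not has_transformer_engine_layers(model):
    convert_model(model)
```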

contextual_fp8_autocast

accelerate.utils.contextual_fp8_autocast

( model_forward, fp8_recipe, use_during_eval = False )

Wraps a model’s forward method to apply FP8 autocast. The wrapper is context-aware: by default it disables FP8 autocast during eval mode, which generally yields more accurate metrics.
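
A hedged sketch of wrapping a forward method by hand, assuming fp8_recipe is a TransformerEngine DelayedScaling recipe and that the returned wrapper expects the module as its first argument (so it can read model.training):

```python
from types import MethodType

import torch
import transformer_engine.common.recipe as te_recipe

from accelerate.utils import contextual_fp8_autocast, convert_model

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.LayerNorm(64))
convert_model(model)  # swap in transformer_engine layers first

# DelayedScaling is TransformerEngine's standard FP8 recipe object.
fp8_recipe = te_recipe.DelayedScaling(fp8_format=te_recipe.Format.HYBRID)

# Bind the wrapper back onto the module. The assumption here is that the
# returned wrapper takes the module as its first argument so it can check
# `model.training` before enabling FP8 autocast.
model.forward = MethodType(contextual_fp8_autocast(model.forward, fp8_recipe), model)
```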

apply_fp8_autowrap

accelerate.utils.apply_fp8_autowrap

( model, fp8_recipe_handler )

Applies the FP8 autocast context manager to the model’s forward method.
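
In practice this is the convenience entry point rather than wrapping the forward method by hand. A hedged sketch, assuming the handler is one of the recipe kwargs classes mentioned above and that the wrapped model is returned:

```python
import torch

from accelerate.utils import TERecipeKwargs, apply_fp8_autowrap, convert_model

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.LayerNorm(64))
convert_model(model)

# Wrap the forward pass with FP8 autocast driven by a TransformerEngine recipe;
# the fp8_format field name mirrors the deprecated signature (an assumption).
handler = TERecipeKwargs(fp8_format="HYBRID")
model = apply_fp8_autowrap(model, handler)
```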
