
FP8

Below are the functions and classes related to the underlying FP8 implementation.

FP8RecipeKwargs

class accelerate.utils.FP8RecipeKwargs

( opt_level: typing.Literal['O1', 'O2'] = None, use_autocast_during_eval: bool = None, margin: int = None, interval: int = None, fp8_format: typing.Literal['E4M3', 'HYBRID'] = None, amax_history_len: int = None, amax_compute_algo: typing.Literal['max', 'most_recent'] = None, override_linear_precision: typing.Tuple[bool, bool, bool] = None, backend: typing.Literal['MSAMP', 'TE'] = None )

Deprecated. Please use one of the proper FP8 recipe kwargs classes such as TERecipeKwargs or MSAMPRecipeKwargs instead.
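
A minimal sketch of the replacement pattern, assuming TERecipeKwargs exposes the same TransformerEngine-related fields as the deprecated signature above (fp8_format, amax_history_len, amax_compute_algo):

```python
from accelerate import Accelerator
from accelerate.utils import TERecipeKwargs

# Build a TransformerEngine FP8 recipe handler. The field names below mirror
# the deprecated FP8RecipeKwargs signature; check them against your version.
kwargs = TERecipeKwargs(
    fp8_format="HYBRID",        # E4M3 in the forward pass, E5M2 in the backward pass
    amax_history_len=16,
    amax_compute_algo="max",
)

# Hand the handler to the Accelerator together with FP8 mixed precision.
accelerator = Accelerator(mixed_precision="fp8", kwargs_handlers=[kwargs])
```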

convert_model

accelerate.utils.convert_model

( model, to_transformer_engine = True, _convert_linear = True, _convert_ln = True )

Recursively converts the linear and layernorm layers of a model to their transformer_engine counterparts.
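
A short usage sketch; it assumes NVIDIA's transformer_engine package is installed and that the conversion mutates the model in place:

```python
import torch

from accelerate.utils import convert_model

model = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.LayerNorm(64),
)

# Recursively replace nn.Linear and nn.LayerNorm modules with their
# transformer_engine counterparts (assumed to happen in place).
convert_model(model, to_transformer_engine=True)
```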

has_transformer_engine_layers

accelerate.utils.has_transformer_engine_layers

( model )

Returns whether a given model contains any transformer_engine layers.
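
Continuing with the model from the previous sketch, this is useful as a guard before conversion:

```python
from accelerate.utils import convert_model, has_transformer_engine_layers

# Only convert if the model has not already been converted.
if not has_transformer_engine_layers(model):
    convert_model(model)
```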

contextual_fp8_autocast

accelerate.utils.contextual_fp8_autocast

( model_forward, fp8_recipe, use_during_eval = False )

Wraps a model’s forward method to apply FP8 autocast. The wrapper is context-aware: by default it disables FP8 autocast during eval mode, which generally yields more accurate metrics.
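
A hedged sketch of wrapping a forward method by hand, assuming fp8_recipe is a TransformerEngine DelayedScaling recipe and that the returned wrapper expects the module as its first argument (so it can read model.training):

```python
from types import MethodType

import torch
import transformer_engine.common.recipe as te_recipe

from accelerate.utils import contextual_fp8_autocast, convert_model

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.LayerNorm(64))
convert_model(model)  # swap in transformer_engine layers first

# DelayedScaling is TransformerEngine's standard FP8 recipe object.
fp8_recipe = te_recipe.DelayedScaling(fp8_format=te_recipe.Format.HYBRID)

# Bind the wrapper back onto the module. The assumption here is that the
# returned wrapper takes the module as its first argument so it can check
# `model.training` before enabling FP8 autocast.
model.forward = MethodType(contextual_fp8_autocast(model.forward, fp8_recipe), model)
```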

apply_fp8_autowrap

accelerate.utils.apply_fp8_autowrap

( model, fp8_recipe_handler )

Applies the FP8 autocast context manager to the model’s forward method.
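
In practice this is the convenience entry point rather than wrapping the forward method by hand. A hedged sketch, assuming the handler is one of the recipe kwargs classes mentioned above and that the wrapped model is returned:

```python
import torch

from accelerate.utils import TERecipeKwargs, apply_fp8_autowrap, convert_model

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.LayerNorm(64))
convert_model(model)

# Wrap the forward pass with FP8 autocast driven by a TransformerEngine recipe;
# the fp8_format field name mirrors the deprecated signature (an assumption).
handler = TERecipeKwargs(fp8_format="HYBRID")
model = apply_fp8_autowrap(model, handler)
```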
