The following optimizers are supported:
Adam
AdamW
Adam8Bit
AdamW8Bit
Not every optimizer has been tested with every model and parallelism setting, so some combinations may not work. Test coverage will improve over time.
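
As a rough illustration, a configured optimizer name could be checked against the list above before training starts. This is only a sketch: `SUPPORTED_OPTIMIZERS` and `resolve_optimizer_name` are hypothetical names introduced here, and the case-insensitive matching is an assumption, not documented behavior.

```python
# Hypothetical helper: the optimizer names come from the list above, but the
# constant and function are illustrative and not part of any real API.
SUPPORTED_OPTIMIZERS = ("Adam", "AdamW", "Adam8Bit", "AdamW8Bit")


def resolve_optimizer_name(name: str) -> str:
    """Return the canonical spelling of a supported optimizer name.

    Matching ignores case and surrounding whitespace (an assumption);
    an unknown name raises ValueError listing the supported options.
    """
    lookup = {canonical.lower(): canonical for canonical in SUPPORTED_OPTIMIZERS}
    try:
        return lookup[name.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unsupported optimizer {name!r}; expected one of: "
            + ", ".join(SUPPORTED_OPTIMIZERS)
        ) from None
```

Passing this check only confirms that the name is on the list. It does not mean the optimizer has been tested with a given model or parallelism setting.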