Bitsandbytes documentation
RMSprop
RMSprop is an adaptive learning rate optimizer that is very similar to Adagrad. RMSprop stores a weighted average of the squared past gradients for each parameter and uses it to scale that parameter's learning rate. The effective learning rate is therefore automatically lower for parameters with large gradients and higher for parameters with small gradients. Because the average decays old gradients instead of summing them as Adagrad does, the learning rate does not diminish toward zero over the course of training.
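The update described above can be sketched in a few lines of plain Python. This is the textbook form of RMSprop without momentum or centering, not a copy of the bitsandbytes kernels, which may differ in details such as where `eps` is applied. The helper name `rmsprop_step` and the toy objective are hypothetical and used only for illustration; the defaults for `lr`, `alpha`, and `eps` are taken from the signatures on this page.

```python
import math


def rmsprop_step(params, grads, square_avg, lr=0.01, alpha=0.99, eps=1e-8):
    """Apply one plain RMSprop update in place (no momentum, not centered).

    Hypothetical helper for illustration; not part of bitsandbytes.
    """
    for i, g in enumerate(grads):
        # Exponentially weighted average of the squared gradients.
        square_avg[i] = alpha * square_avg[i] + (1.0 - alpha) * g * g
        # Divide the step by the root of that average, so parameters with
        # large gradients take proportionally smaller steps.
        params[i] -= lr * g / (math.sqrt(square_avg[i]) + eps)
    return params, square_avg


# Toy objective: f(p) = sum(p_i ** 2), whose gradient is 2 * p_i.
params = [1.0, -2.0]
state = [0.0, 0.0]
for _ in range(3):
    grads = [2.0 * p for p in params]
    rmsprop_step(params, grads, state)
```

In this sketch both parameters move by about the same amount on the first step even though their gradients differ by a factor of two, which is the per-parameter scaling the paragraph describes.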
RMSprop
class bitsandbytes.optim.RMSprop
( params, lr = 0.01, alpha = 0.99, eps = 1e-08, weight_decay = 0, momentum = 0, centered = False, optim_bits = 32, args = None, min_8bit_size = 4096, percentile_clipping = 100, block_wise = True )
RMSprop8bit
class bitsandbytes.optim.RMSprop8bit
( params, lr = 0.01, alpha = 0.99, eps = 1e-08, weight_decay = 0, momentum = 0, centered = False, args = None, min_8bit_size = 4096, percentile_clipping = 100, block_wise = True )
RMSprop32bit
class bitsandbytes.optim.RMSprop32bit
( params, lr = 0.01, alpha = 0.99, eps = 1e-08, weight_decay = 0, momentum = 0, centered = False, args = None, min_8bit_size = 4096, percentile_clipping = 100, block_wise = True )