Loss is increasing while finetuning
#50, opened by abpani1994
I am fine-tuning the Mistral 0.3 Instruct GPTQ model with PEFT and QLoRA, using these training arguments:
per_device_train_batch_size = 8,
per_device_eval_batch_size = 4,
# gradient_accumulation_steps = 1,
optim = "paged_adamw_8bit",
save_strategy = 'steps',
save_steps = 500,
logging_steps = 5,
learning_rate = 4e-4,
gradient_checkpointing_kwargs = {"use_reentrant": False},
weight_decay = 0.15,
fp16 = False,
bf16 = True,
max_steps = -1,
group_by_length = True,
lr_scheduler_type = "linear",
The loss increases gradually after a certain point.
Please help.
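
For reference, here is a rough sketch of the learning-rate curve that learning_rate = 4e-4 and lr_scheduler_type = "linear" would produce. It assumes no warmup, since no warmup setting appears in the list above. The function name linear_lr and the total of 1000 steps are hypothetical; with max_steps = -1 the real total depends on the dataset size and number of epochs, which are not given here. This is plain Python that mirrors the usual linear-decay formula, not the Trainer's own code.

```python
def linear_lr(step, total_steps, base_lr=4e-4, warmup_steps=0):
    """Learning rate at `step` for a linear schedule with optional warmup.

    base_lr matches learning_rate in the arguments above. warmup_steps=0 is
    an assumption: no warmup setting is listed in the post.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # After warmup, decay linearly to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    TOTAL_STEPS = 1000  # hypothetical; the real value is not given in the post
    for step in (0, 250, 500, 750, 1000):
        print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the very first optimizer step already runs at the full 4e-4, and the rate only falls from there. Passing a nonzero warmup_steps to the sketch shows how a warmup phase would change the early part of the curve.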