How many GPUs and how much GPU time were used for this training?
#3 opened by aisensiy
This is a really cool model. May I know which kind of GPU you used for this and how many hours it took? Thanks.
Here's the SFT phase:
https://wandb.ai/jondurbin/bagel-34b-v0.2/runs/uo933t3v/overview?workspace=user-jondurbin
(about 3 days on 8x 80GB A100s)
And the DPO phase:
https://wandb.ai/jondurbin/bagel-dpo-34b-v0.2/runs/xeis3y61/overview?workspace=user-jondurbin
(about 18 hours on 8x 80GB A100s)
Hi @jondurbin, your DPO training was based on LoRA rather than full parameters, right? Would DPO training on full parameters make a difference, i.e. better performance?
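For context on the LoRA-vs-full-parameter question: DPO optimizes the same objective either way; only the set of trainable parameters differs. A minimal sketch of the per-preference-pair DPO loss (function name and example log-probabilities are illustrative, not taken from the bagel training code):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response
    under either the trainable policy or the frozen reference model.
    """
    # Implicit reward margins, measured relative to the reference model
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    # Loss is -log(sigmoid(beta * (policy_margin - ref_margin)))
    logits = beta * (policy_margin - ref_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When the policy still matches the reference, the loss starts at
# -log(sigmoid(0)) = log(2) ~= 0.6931
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

With LoRA, gradients of this loss update only the low-rank adapter weights (and the base model with adapters disabled can serve as the frozen reference), which cuts memory substantially; full-parameter DPO updates every weight and needs a separate reference copy.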