Training environment
#15, opened by Leeli1
Hello, I am a researcher and I would like to know which GPUs you used for training.
Thanks for your interest in our model! For our experiments, we used 8 A100/A800 80G GPUs. I'm not sure whether other GPUs would work.
shenzhi-wang changed discussion status to closed
How long does training take?
For v1, it takes about 9 hours. For v2, it takes more than 2 days (so we recommend using more GPUs to reduce the number of gradient accumulation steps).
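As a rough illustration of why more GPUs reduce the gradient accumulation steps: for a fixed global batch size, the number of accumulation steps is the global batch size divided by the samples processed per forward/backward pass across all GPUs. The sketch below is not the authors' training script; the helper `grad_accum_steps` and the batch-size numbers are hypothetical and were not given in this thread.

```python
def grad_accum_steps(global_batch_size: int,
                     per_device_batch_size: int,
                     num_gpus: int) -> int:
    """Return the accumulation steps needed to reach a fixed global batch size.

    All arguments are illustrative; the actual values used for v1/v2
    were not stated in this discussion.
    """
    # Samples processed in one forward/backward pass across all GPUs.
    samples_per_pass = per_device_batch_size * num_gpus
    if global_batch_size % samples_per_pass != 0:
        raise ValueError(
            "global_batch_size must be divisible by "
            "per_device_batch_size * num_gpus"
        )
    return global_batch_size // samples_per_pass


# Hypothetical numbers: global batch of 128, 2 samples per GPU.
steps_8_gpus = grad_accum_steps(128, 2, 8)    # 8 accumulation steps
steps_16_gpus = grad_accum_steps(128, 2, 16)  # 4 accumulation steps
print(steps_8_gpus, steps_16_gpus)
```

Doubling the GPU count halves the accumulation steps per optimizer update, which is where the wall-clock saving would come from, assuming the per-device batch size already fills GPU memory.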