Training environment

#15
by Leeli1 - opened

Hello, I am a researcher and I would like to know which GPUs you used for training.

Thanks for your interest in our model! For our experiments, we used 8 A100/A800 80G GPUs. I'm not sure whether other GPUs would work.

shenzhi-wang changed discussion status to closed

How long does it take?

For v1, training takes about 9 hours. For v2, it takes more than 2 days, so we recommend using more GPUs to reduce the number of gradient accumulation steps.
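To illustrate why more GPUs reduce gradient accumulation: if the global batch size is held fixed, the number of accumulation steps is the global batch size divided by (per-device batch size × number of GPUs). Here is a minimal sketch of that arithmetic. The function name `grad_accum_steps` and all the numbers (global batch 512, per-device batch 8) are hypothetical and are not the settings used for v1 or v2.

```python
def grad_accum_steps(global_batch_size: int, per_device_batch_size: int, num_gpus: int) -> int:
    """Accumulation steps needed to reach a fixed global batch size.

    Assumes plain data parallelism, where each optimizer step consumes
    per_device_batch_size * num_gpus * accumulation_steps samples.
    """
    samples_per_micro_step = per_device_batch_size * num_gpus
    if global_batch_size % samples_per_micro_step != 0:
        raise ValueError("global batch size must be divisible by per-device batch size * num GPUs")
    return global_batch_size // samples_per_micro_step


# Hypothetical numbers, for illustration only.
for num_gpus in (8, 16, 32):
    steps = grad_accum_steps(global_batch_size=512, per_device_batch_size=8, num_gpus=num_gpus)
    print(f"{num_gpus} GPUs -> {steps} accumulation steps")
```

Under these assumed numbers, doubling the GPU count halves the accumulation steps, so each optimizer update needs fewer sequential forward/backward passes. Actual wall-clock savings also depend on communication overhead and are not guaranteed to scale linearly.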

