Convergence with 16/8-bit

#10
by NeuroScie - opened

Hi, great work!
In your example you mention additional parameters for fitting the training onto smaller GPUs (similar to the Hugging Face fill50k example).
Can you verify that training actually converged for you using 16-bit? If so, how many steps did it take, and did you use any additional parameters?

Thanks
