The GPU memory is not released
#2 opened by cyt-Free
After a large number of requests, GPU memory is not released, eventually resulting in an OOM error.
Having the same problem here; it makes the model unusable.
Hi,
Thank you for the quick reply! Could you try running this script with this model added? https://gist.github.com/joe32140/3c38f377750202d7803b8c0fa0ef1e8b#file-evaluate_code_tasks-py-L196-L199
CodeRankEmbed always consumes much more VRAM than other models of similar size, which also makes it much slower. I believe there is a GPU memory management issue in the modeling code. (I reduced the batch size from 32 to 4, but it didn't help.)
If I had to make a quick guess, it's due to the max sequence length being longer than that of the models you listed. You can manually override it to be shorter if needed.
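A minimal sketch of that override, assuming the model is loaded through sentence-transformers; the 512 cap and the batch size are illustrative values, not tuned recommendations:

```python
from sentence_transformers import SentenceTransformer

# Load CodeRankEmbed; trust_remote_code is needed for models that ship custom modeling code.
model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)

# The default max sequence length largely determines how much VRAM each batch needs.
print("default max_seq_length:", model.max_seq_length)

# Cap it to something shorter (512 is just an example value).
model.max_seq_length = 512

# Shorter sequences, plus a smaller batch_size if needed, keep per-batch memory bounded.
embeddings = model.encode(
    ["def add(a, b):\n    return a + b"],
    batch_size=4,
)
```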
zpn changed discussion status to closed