The GPU memory is not released

#2
by cyt-Free - opened

After a large number of requests, GPU memory is not released, eventually causing an OOM.

Having the same problem here; it makes the model unusable.

Nomic AI org

Hi @cyt-Free and @joe32140 ,

Could you please share the code snippet and exact error message so I can resolve this issue?

Thanks!

Hi,

Thank you for the quick reply! Could you try running this script after adding this model? https://gist.github.com/joe32140/3c38f377750202d7803b8c0fa0ef1e8b#file-evaluate_code_tasks-py-L196-L199

CodeRankEmbed always consumes much more VRAM than other models of similar size, which also makes it much slower. I believe there is a GPU memory management issue in the modeling code. (I reduced the batch size from 32 to 4, but it didn't help.)

Nomic AI org

If I had to make a quick guess, it's due to the max sequence length being longer than that of the models you listed. You can manually override it to be shorter if needed.
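A minimal sketch of that suggestion, assuming the model is loaded through sentence-transformers (which exposes a real `max_seq_length` attribute on `SentenceTransformer` objects). The `encode_in_batches` helper below is hypothetical, not part of the repo: it caps the sequence length, encodes in small batches, moves results to CPU, and returns cached blocks to the driver between batches.

```python
import torch


def encode_in_batches(model, texts, batch_size=4, max_tokens=512):
    """Encode texts in small batches while keeping VRAM bounded.

    `model` is assumed to be a sentence-transformers model, e.g.
    SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True).
    """
    # Cap the sequence length so a few very long inputs can't blow up memory.
    model.max_seq_length = max_tokens
    embeddings = []
    for i in range(0, len(texts), batch_size):
        with torch.no_grad():
            emb = model.encode(texts[i:i + batch_size], convert_to_tensor=True)
        # Move each batch off the GPU immediately and drop the device tensor.
        embeddings.append(emb.cpu())
        del emb
        if torch.cuda.is_available():
            # Release cached allocator blocks back to the driver.
            torch.cuda.empty_cache()
    return torch.cat(embeddings)
```

Note that `torch.cuda.empty_cache()` only returns memory PyTorch has already cached; if usage grows without bound across requests, something is still holding references to device tensors.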

zpn changed discussion status to closed
