Yongchang Hao

yongchanghao

https://yongchanghao.github.io

AI & ML interests

None yet

Organizations

Posts 1

Post

We just released a paper, NeuZip, that losslessly compresses neural networks in VRAM so that larger models can run on the same hardware. This should be particularly useful when VRAM is insufficient for training or inference. Specifically, we look inside each floating-point number and find that the exponent bits are highly compressible (as shown in the figure below).

Read more about the work at NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks (2410.20650)
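
The claim in the post that exponents are highly compressible can be checked with a rough experiment. The sketch below is not the NeuZip implementation: it only measures the empirical entropy of the 8-bit exponent field of float32 values. The weights are an assumption (synthetic samples from a zero-mean Gaussian with standard deviation 0.02, standing in for real model parameters), and the helper names `float32_exponent` and `entropy_bits` are hypothetical.

```python
import math
import random
import struct
from collections import Counter


def float32_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of x stored as IEEE 754 float32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return (bits >> 23) & 0xFF


def entropy_bits(symbols) -> float:
    """Empirical Shannon entropy of a symbol sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Assumption: synthetic stand-in weights, not parameters from a real model.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(100_000)]

exponents = [float32_exponent(w) for w in weights]
exponent_entropy = entropy_bits(exponents)

# An entropy well below 8 bits means an entropy coder could, in principle,
# store the exponent field in fewer bits without losing information.
print(f"exponent entropy: {exponent_entropy:.2f} bits (field width: 8 bits)")
```

To try this on real parameters, the synthetic `weights` list could be replaced with values read from a checkpoint. The entropy figure is only a lower bound on the average code length per exponent, not a measurement of what any particular compressor achieves.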

Papers 3

arxiv:2410.20650

arxiv:2402.03293

arxiv:2210.08708

Models 1

yongchanghao/DeepSeek-R1-Distill-Qwen-1.5B

Text Generation • 2B • Updated May 8 • 3

Datasets 0

None public yet