[FEEDBACK] Daily Papers

#32
by kramp HF staff - opened
Hugging Face org
edited Jul 25

Note that this is not a post for adding new papers; it's for feedback on the Daily Papers community update feature.

How do you submit a paper to the Daily Papers, like @akhaliq (AK)?

  • Submission is open to paper authors
  • Only recent papers (less than 7 days old) can be featured on the Daily

Then drop the arXiv ID into the form at https://huggingface.co/papers/submit

  • Add media (images, videos) to the paper when relevant
  • Start a discussion to engage with the community

Please check out the documentation.

We are excited to share our recent work on MLLM architecture design titled "Ovis: Structural Embedding Alignment for Multimodal Large Language Model".

Paper: https://arxiv.org/abs/2405.20797
Github: https://github.com/AIDC-AI/Ovis
Model: https://huggingface.co/AIDC-AI/Ovis-Clip-Llama3-8B
Data: https://huggingface.co/datasets/AIDC-AI/Ovis-dataset
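For readers curious about the core idea, here is a rough sketch of structural embedding alignment as we understand it from the paper, not the released code; all names and dimensions below are illustrative. Each visual patch is mapped to a probability distribution over a learnable visual embedding table, so visual inputs index an embedding table structurally the same way text tokens do:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions, not the paper's actual sizes)
num_patches = 4     # visual patches from a vision backbone
hidden_dim = 8      # backbone feature dimension
visual_vocab = 16   # size of the learnable visual embedding table
embed_dim = 8       # LLM embedding dimension

# Continuous patch features from the visual backbone (stand-in values)
patch_features = rng.normal(size=(num_patches, hidden_dim))

# A linear head maps each patch to logits over the visual vocabulary
head = rng.normal(size=(hidden_dim, visual_vocab))
logits = patch_features @ head

# Softmax turns logits into a probabilistic "visual token" per patch
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)

# Learnable visual embedding table, analogous to the LLM's text table
visual_table = rng.normal(size=(visual_vocab, embed_dim))

# Each patch embedding is the probability-weighted mix of table rows,
# mirroring how a discrete text token selects its embedding row
visual_embeds = probs @ visual_table

print(visual_embeds.shape)  # (num_patches, embed_dim)
```

The weighted lookup, rather than a plain linear projection, is what "structurally aligns" visual embeddings with textual ones.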

Hugging Face org

@Yiwen-ntu for now we support only videos as paper covers in the Daily.


We are excited to share our work, "Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models": https://arxiv.org/abs/2406.12644

Hello, everyone. We are pleased to present our paper: "Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding"
To the best of our knowledge, this is the first training-free acceleration method for auto-regressive text-to-image generation models.
You can access the full paper here: https://arxiv.org/abs/2410.01699
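For intuition, the fixed-point idea underlying Jacobi decoding can be sketched with a toy, deterministic stand-in for greedy next-token prediction (the `next_token` function and all numbers are ours; the paper's speculative acceptance and sampling machinery are omitted):

```python
# Toy deterministic stand-in for greedy next-token prediction:
# this "model" maps the previous token to the next one.
VOCAB = 97

def next_token(prev):
    return (prev * 31 + 7) % VOCAB

def jacobi_decode(prompt_token, n_new):
    """Refine an n_new-token draft in parallel until it reaches the
    fixed point, which equals sequential greedy decoding."""
    draft = [0] * n_new  # arbitrary initial guess
    iters = 0
    while True:
        seq = [prompt_token] + draft
        # All positions are updated in parallel from the previous draft
        new_draft = [next_token(seq[i]) for i in range(n_new)]
        iters += 1
        if new_draft == draft:
            return draft, iters
        draft = new_draft

def sequential_decode(prompt_token, n_new):
    out, prev = [], prompt_token
    for _ in range(n_new):
        prev = next_token(prev)
        out.append(prev)
    return out

tokens, iters = jacobi_decode(5, 8)
assert tokens == sequential_decode(5, 8)
print(iters)  # converges in at most n_new + 1 iterations
```

The point of the paper's speculative variant is that many parallel guesses are accepted per iteration, so a real model usually converges in far fewer steps than sequential decoding would need.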

We're thrilled to share our recent works:

  1. "Collaborative Performance Prediction for Large Language Models": While scaling laws have been a popular method for predicting LLM performance on downstream tasks, our research shows that simpler approaches, such as matrix factorization and neural collaborative filtering, can yield even better results. We encourage a collaborative framework in which model design information is shared, allowing accurate predictions of future models' performance on downstream tasks. Our framework supports integration with open-source leaderboards, such as Open Leaderboard and HELM, enabling developers to predict their models' performance by leveraging historical model data. You can access the full paper here: https://arxiv.org/abs/2407.01300
  2. "RevisEval: Improving LLM-as-a-Judge via Response-Adapted References": Evaluation has long been a cornerstone of progress in text generation. Given the limitations of traditional metrics, LLM-as-a-Judge has become a viable method for assessing generative abilities in open-ended tasks, though it still faces significant reliability gaps compared to human evaluation. By harnessing the revision capabilities of LLMs, we unlock the potential of references in traditional evaluation, generating response-adapted references that can significantly enhance general evaluation methods across tasks. This approach not only boosts the accuracy of LLM-as-a-Judge but also revives traditional metrics like BLEU, enabling them to effectively evaluate tasks on benchmarks such as MT-Bench and AlpacaFarm, with results comparable to those of LLM-as-a-Judge. It also performs well when using weak LLMs for evaluation and in mitigating positional bias. You can access the full paper here: https://arxiv.org/abs/2410.05193
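As a rough illustration of the matrix-factorization idea in the first paper (a minimal sketch with made-up scores, not the authors' implementation): treat models as rows and tasks as columns of a partially observed score matrix, factorize it over the observed entries, and read predictions for unobserved model–task pairs off the reconstruction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy score matrix: rows = models, cols = tasks, NaN = unobserved.
scores = np.array([
    [0.7, 0.5, np.nan, 0.3],
    [0.8, np.nan, 0.6, 0.4],
    [np.nan, 0.4, 0.5, 0.2],
    [0.9, 0.7, 0.8, np.nan],
])
mask = ~np.isnan(scores)
target = np.nan_to_num(scores)

k, lr, reg = 3, 0.05, 0.01
M = rng.normal(scale=0.1, size=(scores.shape[0], k))  # model factors
T = rng.normal(scale=0.1, size=(scores.shape[1], k))  # task factors

# Gradient descent on squared error over observed entries only
for _ in range(5000):
    err = (M @ T.T - target) * mask
    M -= lr * (err @ T + reg * M)
    T -= lr * (err.T @ M + reg * T)

# Predicted score for model 0 on its unobserved task 2
print(round(float((M @ T.T)[0, 2]), 3))
```

In the paper's setting the "rows" would come from historical leaderboard entries, so a new model needs only a few observed scores to have the rest predicted.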

M3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation (accepted by NeurIPS 2024): https://arxiv.org/pdf/2405.16273

I don't have a paper, but I made a small sampling framework that researchers could use for sampling experiments.

https://github.com/Mihaiii/backtrack_sampler
