arXiv:2404.13628

Mixture of LoRA Experts

Published on Apr 21, 2024

Abstract

LoRA has gained widespread acceptance in the fine-tuning of large pre-trained models to cater to a diverse array of downstream tasks, showcasing notable effectiveness and efficiency, thereby solidifying its position as one of the most prevalent fine-tuning techniques. Due to the modular nature of LoRA's plug-and-play plugins, researchers have delved into the amalgamation of multiple LoRAs to empower models to excel across various downstream tasks. Nonetheless, extant approaches for LoRA fusion grapple with inherent challenges. Direct arithmetic merging may result in the loss of the original pre-trained model's generative capabilities or the distinct identity of LoRAs, thereby yielding suboptimal outcomes. On the other hand, reference tuning-based fusion exhibits limitations concerning the requisite flexibility for the effective combination of multiple LoRAs. In response to these challenges, this paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection. The MoLE approach not only achieves superior LoRA fusion performance in comparison to direct arithmetic merging but also retains the crucial flexibility for combining LoRAs effectively. Extensive experimental evaluations conducted in both the Natural Language Processing (NLP) and Vision & Language (V&L) domains substantiate the efficacy of MoLE.
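
To make the contrast concrete, below is a minimal, illustrative PyTorch sketch of the general idea the abstract describes: keeping several LoRA branches separate and mixing their outputs with a learned, input-dependent gate on top of a frozen base layer, versus collapsing them by direct arithmetic merging. The class names, gating choice (a single softmax gate per layer), and hyperparameters are assumptions for illustration, not the paper's exact MoLE architecture.

```python
import torch
import torch.nn as nn


class LoRABranch(nn.Module):
    """One LoRA adapter: a low-rank update (B @ A) alongside a frozen linear layer."""
    def __init__(self, in_dim, out_dim, rank=8, alpha=16.0):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # x: (..., in_dim) -> (..., out_dim)
        return (x @ self.A.T @ self.B.T) * self.scale


class GatedLoRAFusion(nn.Module):
    """Sketch of gated fusion over multiple LoRA experts (assumed design, not the paper's exact one).

    Each LoRA branch keeps its own weights; a small gate conditioned on the layer input
    decides how much each branch contributes, so individual adapters retain their identity.
    """
    def __init__(self, base_linear: nn.Linear, loras):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():          # base model stays frozen
            p.requires_grad_(False)
        self.loras = nn.ModuleList(loras)
        self.gate = nn.Linear(base_linear.in_features, len(loras))

    def forward(self, x):
        gate_weights = torch.softmax(self.gate(x), dim=-1)                    # (..., num_loras)
        branch_outs = torch.stack([lora(x) for lora in self.loras], dim=-1)   # (..., out_dim, num_loras)
        mixed = (branch_outs * gate_weights.unsqueeze(-2)).sum(dim=-1)        # weighted sum of branches
        return self.base(x) + mixed


def arithmetic_merge(loras):
    """Naive baseline for comparison: sum all low-rank updates into one static delta.

    The merged delta is added to the frozen weight once, with no per-input control,
    which is the 'direct arithmetic merging' the abstract argues can blur each LoRA's identity.
    """
    return sum(lora.B @ lora.A * lora.scale for lora in loras)
```

As a usage sketch, `GatedLoRAFusion(nn.Linear(768, 768), [LoRABranch(768, 768) for _ in range(3)])` would wrap one frozen layer with three LoRA experts; only the gate (and, if desired, the branches) would be trained.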
