2. AWQ and AQLM support for LoRA. You can now:
- Train adapters on top of 2-bit quantized models with AQLM
- Train adapters on top of powerful AWQ quantized models

Note that for inference, you can't merge the LoRA weights into the base model! (See the first sketch after this list.)
3. DoRA support: Enabling DoRA is as easy as adding use_dora=True to your LoraConfig (see the second sketch after this list). Find out more about this method here: https://arxiv.org/abs/2402.09353
4. Improved documentation, particularly docs regarding PEFT LoRA+DeepSpeed and PEFT LoRA+FSDP! Check out the docs at https://huggingface.co/docs/peft/index.
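To give a feel for the workflow, here is a minimal sketch of training a LoRA adapter on top of an AQLM-quantized model; the checkpoint ID and LoRA hyperparameters are illustrative assumptions, not an official example, and the same flow applies to AWQ models.

```python
# A minimal sketch, assuming the `aqlm`, `transformers`, and `peft` packages
# are installed; the checkpoint ID below is an assumption for illustration.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load a 2-bit AQLM-quantized base model (hypothetical checkpoint ID)
model = AutoModelForCausalLM.from_pretrained(
    "ISTA-DASLab/Mixtral-8x7b-AQLM-2Bit-1x16-hf",
    device_map="auto",
)

# Attach a trainable LoRA adapter on top of the frozen quantized weights
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # example target modules; adjust per model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
# Note: the LoRA weights cannot be merged into the quantized base model,
# so merge_and_unload() is not supported for inference here.
```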
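And here is the one-flag change that enables DoRA; the other hyperparameters are illustrative:

```python
# Enabling DoRA in PEFT is a single extra argument on LoraConfig.
from peft import LoraConfig

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # example target modules; adjust per model
    use_dora=True,  # the new flag that switches on DoRA
    task_type="CAUSAL_LM",
)
```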
I am thrilled to announce Gemma, new 2B and 7B models from Google, based on the same research and technology used to train the Gemini models! These models achieve state-of-the-art performance for their size, and are launched across Transformers, Google Cloud, and many other surfaces worldwide starting today.
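For the curious, loading Gemma with Transformers looks roughly like this (a minimal sketch, assuming a recent transformers release and access to the gated google/gemma-7b checkpoint on the Hub):

```python
# Minimal sketch: load Gemma 7B and generate a short completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b", device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```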
These launches are the product of an outstanding collaboration between the Google DeepMind and Hugging Face teams over the last few months -- very proud of the work both teams have done, from integration with Vertex AI to optimization across the stack. Read more about the partnership in the launch blog by @philschmid, @osanseviero, and @pcuenq: https://huggingface.co/blog/gemma
More information below if you are curious about training details, eval results, and safety characteristics!