JunxiongWang/Llama3.2-Mamba2-3B-dpo


---
license: apache-2.0
---

Zero-shot results using [Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) as the teacher model and [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) as the initialization model:

| Model | [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) | [Llama-3.2-Mamba2-0.5-3B-sft](https://huggingface.co/JunxiongWang/Mamba2InLlama3B_Half) | [Llama-3.2-Mamba2-0.5-3B-dpo](https://huggingface.co/JunxiongWang/Mamba2InLlama3B_Half_DPO) |
|---|---|---|---|
| Initialization Model | N/A | Llama-3.2-3B-Instruct | Llama-3.2-3B-Instruct |
| Teacher Model | N/A | Llama-3.1-8B-Instruct | Llama-3.1-8B-Instruct |
| arc_challenge | 0.459 | 0.4667 | 0.541 |
| arc_easy | 0.7407 | 0.7668 | 0.8026 |
| hellaswag | 0.7043 | 0.6913 | 0.7445 |
| mmlu | 0.6043 | 0.5271 | 0.5247 |
| openbookqa | 0.36 | 0.388 | 0.424 |
| piqa | 0.7568 | 0.7601 | 0.7769 |
| pubmedqa | 0.696 | 0.638 | 0.654 |
| race | 0.4067 | 0.3981 | 0.4344 |
| winogrande | 0.6748 | 0.6606 | 0.6732 |
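
For readers who want a quick summary of the table above, the following is a minimal sketch that computes per-task differences against the base model and an unweighted mean per column. The scores are copied from the table. The unweighted mean is an illustrative aggregate chosen here, not a metric reported in the model card, and the names `SCORES`, `MODELS`, `unweighted_means`, and `deltas_vs_base` are hypothetical helpers.

```python
from statistics import mean

# Column order used below: base = Llama-3.2-3B-Instruct, sft and dpo = the
# two distilled columns of the table. These short labels are assumptions
# made for this sketch only.
MODELS = ("base", "sft", "dpo")

# Scores copied verbatim from the table above: (base, sft, dpo) per task.
SCORES = {
    "arc_challenge": (0.459, 0.4667, 0.541),
    "arc_easy": (0.7407, 0.7668, 0.8026),
    "hellaswag": (0.7043, 0.6913, 0.7445),
    "mmlu": (0.6043, 0.5271, 0.5247),
    "openbookqa": (0.36, 0.388, 0.424),
    "piqa": (0.7568, 0.7601, 0.7769),
    "pubmedqa": (0.696, 0.638, 0.654),
    "race": (0.4067, 0.3981, 0.4344),
    "winogrande": (0.6748, 0.6606, 0.6732),
}


def unweighted_means(scores):
    """Return the plain average over tasks for each column."""
    columns = zip(*scores.values())
    return {name: mean(col) for name, col in zip(MODELS, columns)}


def deltas_vs_base(scores, model="dpo"):
    """Return score(model) - score(base) for every task, rounded to 4 places."""
    idx = MODELS.index(model)
    return {task: round(vals[idx] - vals[0], 4) for task, vals in scores.items()}


means = unweighted_means(SCORES)
deltas = deltas_vs_base(SCORES, "dpo")
# Tasks where the dpo column scores below the base column.
regressions = [task for task, d in deltas.items() if d < 0]

for task, d in deltas.items():
    print(f"{task:15s} {d:+.4f}")
print({name: round(value, 4) for name, value in means.items()})
print("below base:", regressions)
```

Passing `"sft"` instead of `"dpo"` to `deltas_vs_base` gives the same comparison for the SFT column.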


```
@article{junxiongdaniele2024mambainllama,
  title = {The Mamba in the Llama: Distilling and Accelerating Hybrid Models},
  author = {Junxiong Wang and Daniele Paliotta and Avner May and Alexander M. Rush and Tri Dao},
  journal = {arXiv preprint arXiv:2408.15237},
  year = {2024}
}
```