---
library_name: transformers
tags: []
---

# Model Card for Model ID

This model is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with the ORPO (Odds Ratio Preference Optimization) Trainer on the mlabonne/orpo-dpo-mix-40k dataset. Only 1,000 samples were used, to keep the ORPO training run fast.

## Model Details

### Model Description

The base model meta-llama/Llama-3.2-1B was fine-tuned with ORPO on a small subset of the mlabonne/orpo-dpo-mix-40k dataset. The Llama 3.2 instruction-tuned, text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. This fine-tuned version aims to improve the model's understanding of the context in prompts and thereby make its responses easier to interpret.

- **Finetuned from model:** meta-llama/Llama-3.2-1B
- **Model size:** 1 billion parameters
- **Fine-tuning method:** ORPO
- **Dataset:** mlabonne/orpo-dpo-mix-40k

## Evaluation

The model was evaluated on the following benchmark, with these results:

| Tasks     | Version | Filter | n-shot | Metric   |   | Value  |   | Stderr |
|-----------|--------:|--------|-------:|----------|---|-------:|---|-------:|
| hellaswag |       1 | none   |      0 | acc      | ↑ | 0.2504 | ± | 0.0043 |
|           |         | none   |      0 | acc_norm | ↑ | 0.2504 | ± | 0.0043 |
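This table matches the output format of EleutherAI's lm-evaluation-harness. A zero-shot HellaSwag run along the following lines should produce comparable numbers; the repository ID below is a placeholder, since the card does not name the published repo.

```python
# pip install lm-eval
import lm_eval

# "your-username/llama-3.2-1b-orpo" is a placeholder repo ID; substitute
# the actual published model repository.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-username/llama-3.2-1b-orpo",
    tasks=["hellaswag"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["hellaswag"])  # acc, acc_norm, and their stderrs
```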
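## Training Procedure

The exact training configuration is not published. The following is a minimal sketch of how a comparable run could be set up with TRL's `ORPOTrainer`; the hyperparameters (`beta`, batch size, learning rate, epochs) are illustrative assumptions, not the values used for this model.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Use only 1,000 samples, as described above.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train").select(range(1000))

# Illustrative hyperparameters; the actual configuration is an assumption.
config = ORPOConfig(
    output_dir="llama-3.2-1b-orpo",
    beta=0.1,                        # weight of the odds-ratio penalty term
    max_length=1024,
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=8e-6,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```

ORPO folds the preference signal into the supervised loss through an odds-ratio penalty, so unlike DPO it needs no separate frozen reference model, which is what makes a quick 1,000-sample run practical.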
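## How to Get Started with the Model

A minimal inference sketch with transformers; the repository ID is again a placeholder for the published model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/llama-3.2-1b-orpo"  # placeholder repo ID
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "Summarize the benefits of preference optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```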