---
base_model: meta-llama/Llama-2-7b-hf
library_name: peft
---
# Model Card for Instruction Backtranslation (Backward Model)
## Model Overview
This repository contains a LoRA fine-tune of Llama-2-7b-hf trained for **instruction backtranslation**, following the method described in the paper "Self-Alignment with Instruction Backtranslation." It is the backward model: given an output, it predicts the instruction that could have produced it. In the paper's method, a backward model of this kind generates candidate instructions for unlabeled text, and the resulting pairs can be curated into training data for a forward instruction-following model.
## Model Details
### Model Description
The model uses the Llama 2 architecture, fine-tuned with Low-Rank Adaptation (LoRA) through the PEFT library. Its goal is to reconstruct the instruction (`x`) from the output (`y`), so each training example is a `(y, x)` pair: the output is the input and the instruction is the prediction target.
- **Developed by:** Abhishek Sagar Sanda
- **Model type:** LoRA-finetuned Causal LM
- **Language(s):** English
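- **License:** Apache-2.0 for the adapter weights. Note that the base Llama 2 model is distributed under the Llama 2 Community License, not Apache-2.0, and its terms apply whenever the adapter is used with the base weights.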
- **Finetuned from:** [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
### Model Sources
- **Paper:** [Self-Alignment with Instruction Backtranslation](https://arxiv.org/abs/2308.06259)
- **Base Model Repository:** [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
## Intended Uses
### Direct Use
- Generating original instructions from outputs for alignment purposes.
- Research in model alignment, self-consistency, and instruction-following behavior.
### Downstream Use
- Enhancing forward instruction-following models via self-alignment methods.
- Improving instruction tuning datasets by generating diverse instructions from desired outputs.
### Out-of-Scope Use
- This model is not suited for general question-answering or generic text generation tasks.
- Avoid using in contexts requiring high factual accuracy without additional verification.
## Training Data
The model was trained on the **OpenAssistant-Guanaco** training dataset, focusing on `(output, instruction)` pairs for backward prediction.
## Training Procedure
### Preprocessing
- Dataset pairs were inverted to use outputs (`y`) as input and instructions (`x`) as labels.
- Standard tokenization was applied using LLaMA's tokenizer.
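The inversion step above can be sketched as follows. This is a minimal illustration, not the training script used for this model: it assumes each dataset record is a single text string with `### Human:` and `### Assistant:` turn markers, and the prompt template in `to_backward_example` is a hypothetical choice that this card does not specify.

```python
import re

# Assumed record layout: one text string with "### Human:" / "### Assistant:"
# turn markers. Verify this against the dataset before relying on it.
TURN_PATTERN = re.compile(r"### (Human|Assistant):")


def first_turn_pair(text):
    """Return (instruction, output) from the first Human/Assistant exchange, or None."""
    parts = TURN_PATTERN.split(text)
    # With one capturing group, split yields [prefix, role, content, role, content, ...]
    turns = [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]
    if len(turns) < 2 or turns[0][0] != "Human" or turns[1][0] != "Assistant":
        return None
    return turns[0][1], turns[1][1]


def to_backward_example(instruction, output):
    """Invert the pair: the output y becomes the prompt, the instruction x the target."""
    # Hypothetical template; the card does not state the one used in training.
    prompt = f"### Response:\n{output}\n\n### Instruction:\n"
    return {"prompt": prompt, "target": instruction}


record = "### Human: Name a prime number.### Assistant: 7 is a prime number."
pair = first_turn_pair(record)
example = to_backward_example(*pair)
```

Only the first exchange of each conversation is used here, which keeps the sketch simple; multi-turn records would need a decision about which turns to keep.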
### Training Hyperparameters
- **LoRA Rank (r):** 8
- **LoRA Alpha:** 32
- **LoRA Dropout:** 0.05
- **Target Modules:** `k_proj`, `q_proj`, `v_proj`, `o_proj`
- **Training Precision:** bf16 mixed precision
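As a rough sketch, the values above map onto a PEFT `LoraConfig` as shown below, and they imply a small number of trainable parameters. The `task_type` argument is an assumption (the card does not list it), and the parameter count assumes the standard Llama-2-7b dimensions: hidden size 4096, 32 decoder layers, and square attention projections.

```python
def lora_trainable_params(r, hidden_size, num_layers, num_target_modules):
    """Count LoRA parameters when every target module is a square
    hidden_size x hidden_size projection."""
    # Each adapted module adds A (r x d_in) and B (d_out x r)
    per_module = r * (hidden_size + hidden_size)
    return per_module * num_target_modules * num_layers


def build_lora_config():
    """Build a LoraConfig from the listed hyperparameters (requires `peft`)."""
    from peft import LoraConfig  # imported lazily so the arithmetic runs without peft

    return LoraConfig(
        r=8,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["k_proj", "q_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",  # assumption: not stated in this card
    )


# Assumed Llama-2-7b dimensions: hidden size 4096, 32 decoder layers
n_params = lora_trainable_params(r=8, hidden_size=4096, num_layers=32, num_target_modules=4)
scaling = 32 / 8  # LoRA scales the update by lora_alpha / r

print(f"trainable LoRA parameters: {n_params:,}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapter holds about 8.4 million parameters, a small fraction of the roughly 7 billion in the base model.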
## Evaluation
Evaluation involved assessing the accuracy of generated instructions against the original instructions. Key metrics include BLEU, ROUGE, and qualitative human evaluations.
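The card does not say which BLEU or ROUGE implementation was used. As an illustration only, the sketch below computes a ROUGE-1-style unigram overlap F1 between an original and a generated instruction, using lowercased whitespace tokenization. It is a simplified stand-in, not the official ROUGE package, and the function name `unigram_f1` is hypothetical.

```python
from collections import Counter


def unigram_f1(reference, candidate):
    """ROUGE-1-style F1: clipped unigram overlap between two strings."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = unigram_f1("Write a poem about rain", "Write a song about rain")
print(f"unigram F1: {score:.2f}")
```

Surface-overlap metrics penalize valid paraphrases of an instruction, which is one reason to pair them with qualitative review.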
## Technical Specifications
### Model Architecture
- LLaMA-2 Transformer Architecture with PEFT LoRA Adaptation
- **Tokenizer:** LLaMA Tokenizer (`tokenizer_class`: LlamaTokenizer)
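- **Maximum Sequence Length:** 4,096 tokens, the Llama 2 context window. The tokenizer's `model_max_length` is set very large, so the tokenizer will not truncate by default; pass an explicit `max_length` when truncation is needed.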
### Hardware
- GPUs: NVIDIA A100 GPUs recommended
- Cloud Provider: AWS/GCP
## How to Use
Here's a quick start guide:
```python
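from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_name = "your_hf_model_path"  # replace with the adapter repository or local path

# Read the base model name from the adapter config instead of hard-coding it
config = PeftConfig.from_pretrained(peft_model_name)
base_model_name = config.base_model_name_or_path  # expected: meta-llama/Llama-2-7b-hf

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, peft_model_name)
model.eval()

# The prompt is an output (y); the model generates a candidate instruction (x)
inputs = tokenizer("Output text goes here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)  # cap on generated tokens; adjust as needed
print(tokenizer.decode(outputs[0], skip_special_tokens=True))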
```
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).
- **Hardware Type:** NVIDIA GeForce RTX 4070 GPU
- **Hours Used:** 3 hours
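For a back-of-the-envelope figure, emissions can be approximated as energy used multiplied by the carbon intensity of the grid. The sketch below is not a measurement: the 200 W power draw is an assumed upper bound based on the GPU's rated board power, and the carbon intensity of 0.4 kg CO2eq/kWh is a placeholder that should be replaced with the value for the actual region.

```python
def estimate_emissions_kg(power_watts, hours, carbon_intensity_kg_per_kwh, pue=1.0):
    """Estimate emissions as energy (kWh) times grid carbon intensity.

    pue is the power usage effectiveness of the facility; 1.0 means no overhead.
    """
    energy_kwh = power_watts / 1000 * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


# Assumptions: 200 W sustained draw, 3 hours, placeholder grid intensity
estimate = estimate_emissions_kg(
    power_watts=200,
    hours=3,
    carbon_intensity_kg_per_kwh=0.4,
)
print(f"estimated emissions: {estimate:.2f} kg CO2eq")
```

With these inputs the estimate comes to roughly 0.24 kg CO2eq; the linked calculator gives a more careful figure once the region and hardware are selected.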
## Citation
If you use this model, please cite:
```bibtex
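@article{li2023selfalignment,
  title={Self-Alignment with Instruction Backtranslation},
  author={Li, Xian and Yu, Ping and Zhou, Chunting and Schick, Timo and Levy, Omer and Zettlemoyer, Luke and Weston, Jason and Lewis, Mike},
  journal={arXiv preprint arXiv:2308.06259},
  year={2023}
}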
```
## Model Card Author
- Abhishek Sagar Sanda
## Model Card Contact
- [email protected]
### Framework versions
- PEFT 0.15.1
- Transformers 4.38.1