Model Card for Instruction Backtranslation (Backward Model)

Model Overview

This repository contains a LoRA fine-tune of Llama-2-7b-hf trained for instruction backtranslation, following the method described in the paper "Self-Alignment with Instruction Backtranslation." It is a backward model: given an output, it predicts the instruction that could have produced that output. The predicted instructions are intended to support self-alignment of forward, instruction-following models.

Model Details

Model Description

The model uses the Llama-2 architecture, fine-tuned with Low-Rank Adaptation (LoRA) through the PEFT library. It is trained to reconstruct an instruction (x) from an output (y), that is, on (y, x) pairs for backward prediction.

Developed by: Abhishek Sagar Sanda
Model type: LoRA-finetuned Causal LM
Language(s): English
License: Apache-2.0 for the LoRA adapter (note that the base Llama-2 model is distributed under Meta's Llama 2 Community License, not Apache-2.0)
Finetuned from: meta-llama/Llama-2-7b-hf

Model Sources

Paper: Self-Alignment with Instruction Backtranslation (arXiv:2308.06259)
Base Model Repository: meta-llama/Llama-2-7b-hf

Intended Uses

Direct Use

Generating original instructions from outputs for alignment purposes.
Research in model alignment, self-consistency, and instruction-following behavior.

Downstream Use

Enhancing forward instruction-following models via self-alignment methods.
Improving instruction tuning datasets by generating diverse instructions from desired outputs.
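
The downstream uses above can be sketched as follows. This is a minimal illustration, not the released pipeline: `generate_instruction` is a hypothetical callable that wraps this backward model (for example, around the generation code in the How to Use section), and the de-duplication and empty-string filter are illustrative choices.

```python
from typing import Callable, Dict, Iterable, List


def build_augmented_pairs(
    outputs: Iterable[str],
    generate_instruction: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Pair each output (y) with an instruction (x) predicted by a backward model."""
    pairs: List[Dict[str, str]] = []
    seen = set()
    for output in outputs:
        instruction = generate_instruction(output).strip()
        # Skip empty predictions and exact duplicate (instruction, output) pairs.
        if not instruction or (instruction, output) in seen:
            continue
        seen.add((instruction, output))
        pairs.append({"instruction": instruction, "output": output})
    return pairs


# Stand-in for the real model call, used only to make the sketch runnable.
def fake_backward_model(output: str) -> str:
    return "Explain: " + output.split(".")[0]


candidate_pairs = build_augmented_pairs(
    ["Paris is the capital of France.", "Paris is the capital of France."],
    fake_backward_model,
)
print(candidate_pairs)
```

In the paper, generated pairs are additionally scored and filtered for quality before they are used to train the forward model; that curation step is not shown here.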

Out-of-Scope Use

This model is not suited for general question-answering or generic text generation tasks.
Avoid using in contexts requiring high factual accuracy without additional verification.

Training Data

The model was trained on the OpenAssistant-Guanaco training dataset, focusing on (output, instruction) pairs for backward prediction.

Training Procedure

Preprocessing

Dataset pairs were inverted to use outputs (y) as input and instructions (x) as labels.
Standard tokenization was applied using LLaMA's tokenizer.
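
The inversion step could look roughly like the sketch below. It assumes each dataset record is a single text string with "### Human:" and "### Assistant:" markers, and it keeps only the first turn of a conversation; both the record format and the helper name `invert_first_turn` are assumptions for illustration, not the exact preprocessing code used for this model.

```python
from typing import Optional, Tuple

HUMAN_TAG = "### Human:"
ASSISTANT_TAG = "### Assistant:"


def invert_first_turn(text: str) -> Optional[Tuple[str, str]]:
    """Return (model_input, label) = (output y, instruction x) for the first turn."""
    if HUMAN_TAG not in text or ASSISTANT_TAG not in text:
        return None
    after_human = text.split(HUMAN_TAG, 1)[1]
    instruction, _, rest = after_human.partition(ASSISTANT_TAG)
    # Keep only the first assistant reply; drop any later turns.
    output = rest.split(HUMAN_TAG, 1)[0]
    instruction, output = instruction.strip(), output.strip()
    if not instruction or not output:
        return None
    return output, instruction


sample = "### Human: Name a primary color.### Assistant: Red is a primary color."
print(invert_first_turn(sample))
```

The returned tuple places the output first, so it can be fed to the tokenizer as the model input with the instruction as the target.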

Training Hyperparameters

LoRA Rank (r): 8
LoRA Alpha: 32
LoRA Dropout: 0.05
Target Modules: k_proj, q_proj, v_proj, o_proj
Training Precision: bf16 mixed precision
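
To give a sense of what these settings imply, the sketch below computes the LoRA scaling factor and an approximate count of trainable adapter parameters. The hidden size (4096) and layer count (32) are assumed Llama-2-7B dimensions that are not stated in this card, and the calculation assumes standard alpha / r scaling with no bias terms.

```python
# Hyperparameters from the list above.
LORA_RANK = 8
LORA_ALPHA = 32
TARGET_MODULES = ["k_proj", "q_proj", "v_proj", "o_proj"]

# Assumed Llama-2-7B dimensions (not stated in this card).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32

# Standard LoRA scales the low-rank update by alpha / r.
scaling = LORA_ALPHA / LORA_RANK

# Each adapted d_in x d_out projection adds A (r x d_in) and B (d_out x r).
params_per_module = LORA_RANK * (HIDDEN_SIZE + HIDDEN_SIZE)
trainable_params = params_per_module * len(TARGET_MODULES) * NUM_LAYERS

print(f"scaling = {scaling}")
print(f"trainable LoRA parameters = {trainable_params:,}")
```

Under these assumptions the adapter has about 8.4 million trainable parameters, a small fraction of the roughly 7 billion parameters in the base model.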

Evaluation

Generated instructions were compared against the original instructions. Metrics include BLEU and ROUGE, supplemented by qualitative human evaluation.
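
As an illustration of the overlap-based metrics mentioned here, the sketch below computes a simplified ROUGE-1 F1 score (unigram overlap) with lowercase whitespace tokenization. It is not the official ROUGE implementation and not necessarily the scoring code used for this model; the function name `rouge1_f1` and the example strings are made up for the illustration.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a generated instruction."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as in the other text.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1(
    "write a poem about the sea",
    "write a short poem about the ocean",
)
print(round(score, 3))
```

Because many different instructions can plausibly lead to the same output, overlap scores like this are best read alongside the qualitative evaluation rather than on their own.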

Technical Specifications

Model Architecture

LLaMA-2 Transformer Architecture with PEFT LoRA Adaptation
Tokenizer: LLaMA Tokenizer (tokenizer_class: LlamaTokenizer)
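Maximum Sequence Length: the tokenizer's model_max_length is set to a very large value, so the tokenizer itself does not truncate inputs. The base Llama-2 model has a 4,096-token context window, so inputs should stay within that limit.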

Hardware

GPUs: NVIDIA A100 GPUs recommended
Cloud Provider: AWS/GCP

How to Use

Here's a quick start guide:

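from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_name = "your_hf_model_path"  # placeholder: path or Hub ID of this adapter

# Read the base model name from the adapter config instead of hard-coding it
config = PeftConfig.from_pretrained(peft_model_name)
base_model_name = config.base_model_name_or_path  # expected: meta-llama/Llama-2-7b-hf

# Load the base model and tokenizer, then attach the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, peft_model_name)
model.eval()

# The input is an output text (y); the model generates a candidate instruction (x)
inputs = tokenizer("Output text goes here", return_tensors="pt")
# max_new_tokens is an example value; the default generation length is short
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))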

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator.

Hardware Type: NVIDIA GeForce RTX 4070 GPU
Hours Used: 3 hours
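
A rough estimate of this kind multiplies power draw, runtime, and the carbon intensity of the electricity used. The sketch below shows the arithmetic with two assumed values that are not measurements from this training run: a 200 W board power for the RTX 4070 under full load, and a placeholder grid intensity of 0.4 kg CO2eq per kWh. The actual figure depends on real power draw and the local grid.

```python
def estimate_emissions_kg(power_watts: float, hours: float, kg_co2_per_kwh: float) -> float:
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


# Assumptions, not measurements: 200 W board power for the RTX 4070 at full
# load, and a placeholder grid intensity of 0.4 kg CO2eq per kWh.
ASSUMED_POWER_WATTS = 200
ASSUMED_KG_CO2_PER_KWH = 0.4
HOURS_USED = 3  # from this card

estimate = estimate_emissions_kg(ASSUMED_POWER_WATTS, HOURS_USED, ASSUMED_KG_CO2_PER_KWH)
print(f"{estimate:.2f} kg CO2eq")
```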

Citation

If you use this model, please cite:

@article{li2023selfalignment,
  title={Self-Alignment with Instruction Backtranslation},
  author={Li, Xian and others},
  journal={arXiv preprint arXiv:2308.06259},
  year={2023}
}

Model Card Author

Abhishek Sagar Sanda

Model Card Contact

[email protected]

Framework versions

PEFT 0.15.1
Transformers 4.38.1
