

FineLlama-3.2-3B-Instruct-ead QLoRA Adapters

This repository contains the QLoRA (Quantized Low-Rank Adaptation) adapters for the FineLlama-3.2-3B-Instruct-ead model. The adapters are applied on top of the base meta-llama/Llama-3.2-3B-Instruct model to specialize it for generating EAD (Encoded Archival Description) XML for archival records.

Overview

The QLoRA adapters were trained using Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation) on the Geraldine/Ead-Instruct-38k dataset. This approach allows for memory-efficient fine-tuning while maintaining high performance on the task of generating EAD/XML-compliant archival descriptions.

Key Features

  • Efficient Fine-Tuning: Uses 4-bit quantization and LoRA to reduce memory usage.
  • Compatibility: Designed to work with the base meta-llama/Llama-3.2-3B-Instruct model.
  • Specialization: Optimized for generating EAD/XML archival metadata.

Adapter Details

Training Configuration

  • Quantization: 4-bit quantization using bitsandbytes.
    • Quantization Type: nf4
    • Double Quantization: Enabled
    • Compute Dtype: bfloat16
  • LoRA Configuration:
    • Rank (r): 256
    • Alpha (lora_alpha): 128
    • Dropout: 0.05
    • Target Modules: All linear layers
  • Training Parameters:
    • Epochs: 3
    • Batch Size: 3
    • Gradient Accumulation Steps: 2
    • Learning Rate: 2e-4
    • Warmup Ratio: 0.03
    • Max Sequence Length: 4096
    • Scheduler: Constant

Training Infrastructure

  • Libraries: transformers, peft, trl
  • Mixed Precision: FP16/BF16 (based on hardware support)
  • Optimizer: Fused AdamW
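
For reference, here is a minimal sketch of how the settings above map onto peft and trl, assuming a trl SFTTrainer-style setup. The original training script is not part of this repository, so argument names are illustrative and may vary with library versions.

from peft import LoraConfig
from trl import SFTConfig

# LoRA settings mirroring the values listed above
peft_config = LoraConfig(
    r=256,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",  # apply LoRA to all linear layers
    task_type="CAUSAL_LM",
)

# Training arguments mirroring the values listed above
training_args = SFTConfig(
    num_train_epochs=3,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    max_seq_length=4096,
    lr_scheduler_type="constant",
    optim="adamw_torch_fused",
    bf16=True,  # or fp16=True, depending on hardware support
)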

Usage

To use the QLoRA adapters, you need to load the base model and apply the adapters using the peft library.

Installation

pip install transformers torch bitsandbytes peft accelerate

Loading the Model with Adapters

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load the base model
base_model_name = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    torch_dtype="auto",
    device_map="auto"
)

# Load the QLoRA adapters
adapter_model_name = "Geraldine/FineLlama-3.2-3B-Instruct-ead-Adapters"
model = PeftModel.from_pretrained(model, adapter_model_name)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
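
Optionally, the adapters can be merged into the base weights for standalone inference. Merging into 4-bit quantized weights is generally not recommended, so this sketch assumes the base model is reloaded in bf16 (which requires enough memory for the unquantized 3B model):

# Optional: merge the adapters into the base model (load the base without 4-bit quantization first)
full_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
full_model = PeftModel.from_pretrained(full_model, adapter_model_name)
merged_model = full_model.merge_and_unload()  # a plain transformers model with the adapters folded in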

Example Usage

messages = [
    {"role": "system", "content": "You are an expert in EAD/XML generation for archival records metadata."},
    {"role": "user", "content": "Generate a minimal and compliant <eadheader> template with all required EAD/XML tags"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=4096, use_cache=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
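
The decoded output above includes the prompt; to keep only the generated reply, slice off the prompt tokens first:

# Keep only the newly generated tokens (drop the echoed prompt)
generated = outputs[0][inputs.shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))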

Limitations

  • The adapters are specifically trained for EAD/XML generation and may not generalize well to other tasks.
  • Performance depends on the quality and specificity of the input prompts.
  • The maximum sequence length is limited to 4096 tokens.

Citation

If you use these adapters in your work, please cite the base model and this repository:

@misc{ead-llama-adapters,
  author = {Géraldine Geoffroy},
  title = {FineLlama-3.2-3B-Instruct-ead QLoRA Adapters},
  year = {2024},
  publisher = {HuggingFace},
  journal = {HuggingFace Repository},
  howpublished = {\url{https://huggingface.co/Geraldine/qlora-FineLlama-3.2-3B-Instruct-ead}}
}

License

The adapters are subject to the same license as the base meta-llama/Llama-3.2-3B-Instruct model. Please refer to Meta's Llama 3.2 Community License for usage terms and conditions.
