# Model Card: gdpr_gemma-2-2b
This model is a fine-tuned version of google/gemma-2b-it on GDPR compliance data using Direct Preference Optimization (DPO).
## Model Details
- Developed by: cycloevan
- Model type: Causal Language Model
- Language(s): English
- License: Gemma Terms of Use (inherited from base model)
- Finetuned from model: google/gemma-2b-it
- GitHub Code: gdpr-gemma2
## Uses
This model is designed to assist with GDPR compliance queries and provide information related to data protection regulations.
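A minimal inference sketch with the `transformers` library is shown below. The repo id `cycloevan/gdpr_gemma-2-2b` is inferred from this card's title and developer, not confirmed by the card, and may need adjusting.

```python
# Minimal inference sketch. The repo id below is inferred from this
# card (developer "cycloevan", model "gdpr_gemma-2-2b") and is an
# assumption, not a confirmed Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cycloevan/gdpr_gemma-2-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Gemma instruction-tuned checkpoints expect the chat template.
messages = [
    {"role": "user", "content": "What counts as personal data under the GDPR?"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```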
## Training Details
### Training Data
The model was fine-tuned on the sims2k/GDPR_QA_instruct_dataset.
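The dataset can be pulled directly from the Hugging Face Hub; a quick inspection sketch follows (the `train` split name is an assumption):

```python
# Sketch: load the preference dataset from the Hugging Face Hub and
# inspect its splits and columns before training.
from datasets import load_dataset

dataset = load_dataset("sims2k/GDPR_QA_instruct_dataset")
print(dataset)              # splits and column names
print(dataset["train"][0])  # first example ("train" split assumed)
```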
### Training Procedure
- Fine-tuning method: Direct Preference Optimization (DPO)
- Optimizer: AdamW (paged_adamw_32bit)
- Learning rate: 5e-6
- Batch size: 1
- Gradient accumulation steps: 3
- Number of epochs: 10
- LR scheduler: Cosine
- Warmup steps: 2
- Training regime: LoRA (Low-Rank Adaptation)
#### LoRA Hyperparameters
- r: 16
- lora_alpha: 32
- lora_dropout: 0.05
- Target modules: all-linear
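Taken together, the settings above correspond roughly to the following TRL + PEFT configuration. This is an illustrative reconstruction, not the original training script; in particular, DPO expects preference pairs (`prompt`/`chosen`/`rejected` columns), and any preprocessing the author applied to the dataset is not shown.

```python
# Illustrative DPO + LoRA reconstruction with TRL and PEFT. The listed
# hyperparameters come from this card; dataset preprocessing and the
# prompt/chosen/rejected formatting are assumptions.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "google/gemma-2b-it"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# DPO needs preference pairs; "prompt"/"chosen"/"rejected" columns assumed.
dataset = load_dataset("sims2k/GDPR_QA_instruct_dataset", split="train")

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

training_args = DPOConfig(
    output_dir="gdpr_gemma-2-2b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=3,
    num_train_epochs=10,
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    warmup_steps=2,
    optim="paged_adamw_32bit",
)

trainer = DPOTrainer(
    model=model,                 # no separate frozen reference model needed
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` on older TRL versions
    peft_config=peft_config,
)
trainer.train()
```

When a `peft_config` is supplied, `DPOTrainer` scores the reference policy by disabling the LoRA adapter rather than holding a second frozen copy of the base model, which keeps the memory footprint close to that of a single model.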
## Limitations and Bias
- The model's knowledge is limited to its training data and may not cover all aspects of the GDPR or recent regulatory updates.
- The model may occasionally generate incorrect or inconsistent information.