Mario12355 committed
Commit feca0b6 · verified · 1 Parent(s): 8efa428

Update README.md

Files changed (1):
  1. README.md +158 -3

README.md CHANGED
@@ -1,7 +1,45 @@
  ---
+ language:
+ - de
  license: mit
- pipeline_tag: translation
+ library_name: transformers
+ pipeline_tag: text2text-generation
+ tags:
+ - llama
+ - translation
+ - german
+ - dialect
+ - swabian
+ - qlora
+ - dpo
+ datasets:
+ - custom
+ model-index:
+ - name: swabian-german-translator
+   results:
+   - task:
+       type: translation
+       name: German-Swabian Translation
+     metrics:
+     - type: accuracy
+       value: 0.8
+       name: Training Loss
+     - type: bleu
+       value: N/A
+       name: BLEU Score
+ metadata:
+   author: [Your Name]
+   framework: pytorch
+   fine_tuning_type:
+   - dpo
+   - qlora
+   base_model: llama-3.1-8b
+   training_data: Custom dataset based on the Schwäbisch-Schwätza dictionary
+ training_processes:
+ - sft
+ - dpo
  ---
+
  # Swabian-German Translation Model (DPO-Enhanced)

  This model fine-tunes LLAMA 3.1 8B for bidirectional translation between Standard German and Swabian dialect, enhanced through Direct Preference Optimization (DPO).
@@ -19,7 +57,124 @@ This model fine-tunes LLAMA 3.1 8B for bidirectional translation between Standar
 
  ## Usage

+ ### Basic Translation
+
  ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Load model and tokenizer
+ model_name = "your-username/swabian-translator-dpo"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+
  # Example translation from Swabian to Standard German
- prompt = "Übersetze ins Hochdeutsche: Du hosch ja a blaus Mol am Arm!"
- Expected output format: "Du hast ja einen Bluterguss am Arm!""
+ def translate(text, direction="to_german"):
+     if direction == "to_german":
+         prompt = f"Übersetze ins Hochdeutsche: {text}"
+     else:
+         prompt = f"Übersetze ins Schwäbische: {text}"
+
+     inputs = tokenizer(prompt, return_tensors="pt")
+     outputs = model.generate(**inputs, max_new_tokens=100)
+     # Decode only the newly generated tokens, not the prompt.
+     new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
+     return tokenizer.decode(new_tokens, skip_special_tokens=True)
+
+ # Example usage
+ swabian_text = "Du hosch ja a blaus Mol am Arm!"
+ german_translation = translate(swabian_text, "to_german")
+ print(german_translation)  # Expected: "Du hast ja einen Bluterguss am Arm!"
+ ```
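+
+ The loader above keeps full-precision weights on CPU. As an optional variant (assuming a CUDA GPU and the `accelerate` package; not part of this card's tested setup), the model can be loaded on GPU in half precision:
+
+ ```python
+ # Optional: GPU, half precision; device_map="auto" requires `accelerate`.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+ # Move inputs to the model's device before calling generate():
+ inputs = tokenizer("Übersetze ins Hochdeutsche: I han koi Zeit",
+                    return_tensors="pt").to(model.device)
+ ```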
+
+ ### Translation Examples
+
+ Swabian to German:
+ ```
+ Input: "I han koi Zeit"
+ Output: "Ich habe keine Zeit"
+
+ Input: "Des goht et"
+ Output: "Das geht nicht"
+
+ Input: "Wo bisch du her komma?"
+ Output: "Woher kommst du?"
+ ```
+
+ German to Swabian:
+ ```
+ Input: "Ich verstehe das nicht"
+ Output: "I versteh des et"
+
+ Input: "Das schmeckt sehr gut"
+ Output: "Des schmeckt arg guat"
+ ```
+
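+ These pairs can be checked with the `translate` helper from the Usage section, for example with a small loop like the following (a sketch; exact outputs depend on decoding settings):
+
+ ```python
+ # Sanity-check the documented examples with translate() defined above.
+ examples = [
+     ("I han koi Zeit", "to_german"),
+     ("Des goht et", "to_german"),
+     ("Ich verstehe das nicht", "to_swabian"),
+ ]
+ for text, direction in examples:
+     print(f"{text!r} -> {translate(text, direction)!r}")
+ ```
+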
+ ## Model Architecture & Training
+
+ ### Training Process
+ 1. **Initial Dataset Preparation**
+    - Base dataset: 12,000+ word pairs from the Schwäbisch-Schwätza dictionary
+    - Context enhancement using LLM-generated sentences
+    - Manual verification and cleanup
+
+ 2. **SFT (Supervised Fine-Tuning)**
+    - QLoRA implementation for efficient training
+    - 2 epochs on the complete dataset
+    - Loss convergence at ~0.8
+
+ 3. **DPO (Direct Preference Optimization)**
+    - 300 carefully curated preference pairs (see the example record below)
+    - 3 epochs of preference learning
+    - Focus on natural and accurate translations
+
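+ For illustration, here is a preference record in the standard prompt/chosen/rejected layout used by DPO trainers such as TRL's; this particular pair is a hypothetical example, not one of the 300 curated records:
+
+ ```python
+ # Hypothetical DPO preference pair (prompt / chosen / rejected).
+ # The actual curated pairs are not published with this card.
+ preference_pair = {
+     "prompt": "Übersetze ins Schwäbische: Das schmeckt sehr gut",
+     "chosen": "Des schmeckt arg guat",     # natural Swabian phrasing
+     "rejected": "Des schmeckt sehr gut",   # too close to Standard German
+ }
+ ```
+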
+ ### Technical Implementation
+ - Quantized training using QLoRA (a setup sketch follows this list)
+ - 4-bit precision for efficient resource usage
+ - Training framework: UnslothAI
+ - Single GPU training (~16GB VRAM required)
+
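+ A minimal sketch of what the 4-bit QLoRA setup could look like with Unsloth; the checkpoint name and hyperparameters are illustrative assumptions, not the exact training configuration:
+
+ ```python
+ # Illustrative Unsloth QLoRA setup; values are examples only.
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="unsloth/Meta-Llama-3.1-8B",  # assumed base checkpoint
+     max_seq_length=512,
+     load_in_4bit=True,  # QLoRA: 4-bit quantized base weights
+ )
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,               # LoRA rank (illustrative)
+     lora_alpha=16,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+ )
+ ```
+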
+ ## Limitations and Considerations
+
+ 1. **Dialect Variations**
+    - Swabian varies significantly by region
+    - The model focuses on common/standard Swabian expressions
+    - May not capture all local variations
+
+ 2. **Translation Quality**
+    - Best performance on common phrases and expressions
+    - May struggle with very colloquial or context-dependent translations
+    - Not recommended for official or legal translations
+
+ 3. **Technical Limitations**
+    - Input length limited to 512 tokens (a truncation example follows this list)
+    - Generation speed affected by quantization
+    - Memory requirements: ~8GB RAM minimum
+
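+ Over-long inputs can be cut to the 512-token limit at tokenization time, using the tokenizer loaded in the Usage section (standard `transformers` options):
+
+ ```python
+ # Truncate inputs to the model's 512-token limit.
+ prompt = "Übersetze ins Hochdeutsche: ..."  # potentially very long input
+ inputs = tokenizer(prompt, return_tensors="pt",
+                    truncation=True, max_length=512)
+ ```
+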
+ ## Community and Contributions
+
+ We welcome community contributions to improve the model:
+ - Additional training data
+ - Regional variant documentation
+ - Bug reports and fixes
+ - Performance improvements
+
+ Please submit issues or pull requests through the Hugging Face repository.
+
+ ## Citation and Attribution
+
+ ```bibtex
+ @misc{swabian-german-translator,
+   author    = {[Your Name]},
+   title     = {Swabian-German Translation Model},
+   year      = {2024},
+   publisher = {Hugging Face},
+   journal   = {Hugging Face Model Hub}
+ }
+ ```
+
+ ## License
+ This project is licensed under the MIT License; see the LICENSE file for details.
+
+ ## Acknowledgments
+ - Original dictionary data: [schwäbisch-schwätza.de](http://xn--schwbisch-schwtza-tqbk.de/)
+ - UnslothAI for the training framework
+ - LLAMA 3.1 8B base model