niclasgriesshaber commited on
Commit
5d23864
1 Parent(s): 265c032

Updated README.md

Browse files
Files changed (1) hide show
  1. README.md +40 -3
README.md CHANGED
@@ -1,3 +1,40 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ base_model: unsloth/gemma-2-2b-it-bnb-4bit
3
+ language:
4
+ - en
5
+ - de
6
+ license: apache-2.0
7
+ tags:
8
+ - text-generation-inference
9
+ - transformers
10
+ - unsloth
11
+ - llama
12
+ - trl
13
+ - machine-translation
14
+ - historical-language
15
+ - early-modern-german
16
+ - legal-texts
17
+ - economic-history
18
+ - open-source
19
+ ---
20
+
21
+ # English to Early Modern Bohemian German Translation Model
22
+
23
+ ## Overview
24
+
25
+ This model translates from English to Early Modern Bohemian German (EMBG). It was fine-tuned using LoRA on a unique historical dataset of 3,873 paragraph-level translation pairs sourced from legal court records. The dataset was meticulously transcribed and translated by the Chichele Professor of Economic History, **Sheilagh Ogilvie**, from All Souls College, University of Oxford.
26
+
27
+ ### Key Features
28
+
29
+ - **Base Model**: `unsloth/gemma-2-2b-it-bnb-4bit`
30
+ - **Fine-Tuning**: Performed using [LoRA](https://arxiv.org/abs/2106.09685) and [Unsloth](https://github.com/unslothai/unsloth), leveraging Hugging Face's [Transformers](https://github.com/huggingface/transformers) and [TRL](https://github.com/huggingface/trl) libraries.
31
+ - **Languages Supported**:
32
+ - Source: English
33
+ - Target: Early Modern Bohemian German (EMBG)
34
+ - **Dataset**: Legal court records, manually transcribed and translated over five years. The dataset will be published in an upcoming [ACL](https://acl2024.org) paper.
35
+
36
+ ### Use Cases
37
+
38
+ - Research in economic history and legal studies.
39
+ - Exploration of historical dialects and their nuances.
40
+ - Applications in language revitalisation and historical text analysis.