MarineLives/mt5-small-raw-htr-clean-ver.1.0

Addaci committed on Oct 12, 2024

Commit a0cc5e5 · verified · 1 Parent(s): 9418357

Update README.md

Files changed (1)

README.md +20 -3

@@ -1,3 +1,20 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- en
+- la
+base_model:
+- google/mt5-small
+---
+Demonstration of fine-tuning of mt5-small for C17th English (and Latin) legal depositions
+Uses mt5-small, which is trained on the mC4 Common Crawl dataset covering 101 languages, including some Latin
+mt5-small is the smallest of five variants of mt5 (small; base; large; XL; XXL)
+Fine-tuned with text-to-text pairs of raw-HTR and hand-corrected Ground Truth from C17th English High Court of Admiralty depositions
+A series of fine-tuned mt5-small models will be created with ascending version numbers
+Fine-tuning experiments will include:
+* Using 1000 lines of raw-HTR paired with 1000 lines of hand-corrected Ground Truth
+* Using 2000 lines of raw-HTR paired with 1000 lines of hand-corrected Ground Truth
+* Using 1000 and 2000 lines of synthetic raw-HTR paired with 1000 lines of hand-corrected Ground Truth
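
The pairing of raw-HTR lines with hand-corrected Ground Truth described in the README could be prepared roughly as follows. This is a minimal sketch, not the repository's actual preprocessing: the function names (`build_pairs`, `split_pairs`, `to_jsonl`), the `"correct: "` task prefix, the `input_text`/`target_text` field names, and the two sample lines are all assumptions made for illustration. It also assumes a one-to-one, index-aligned pairing of lines.

```python
import json
import random


def build_pairs(raw_lines, truth_lines, prefix="correct: "):
    """Pair each raw-HTR line with the Ground Truth line at the same index.

    The task prefix is a hypothetical choice, not taken from the README.
    """
    if len(raw_lines) != len(truth_lines):
        raise ValueError("raw-HTR and Ground Truth must have the same number of lines")
    pairs = []
    for raw, truth in zip(raw_lines, truth_lines):
        raw, truth = raw.strip(), truth.strip()
        if raw and truth:  # skip pairs where either side is empty
            pairs.append({"input_text": prefix + raw, "target_text": truth})
    return pairs


def split_pairs(pairs, val_fraction=0.1, seed=0):
    """Shuffle deterministically and hold out a validation slice."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def to_jsonl(pairs):
    """Serialise pairs as JSON Lines, keeping non-ASCII characters readable."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


# Invented sample lines, not drawn from the High Court of Admiralty corpus.
raw = [
    "The deponent saith tbat he was Master of the shipp",
    "and soe departed from Londou",
]
truth = [
    "The deponent saith that he was Master of the shipp",
    "and soe departed from London",
]

pairs = build_pairs(raw, truth)
train, val = split_pairs(pairs, val_fraction=0.5)
print(to_jsonl(pairs))
```

A JSON Lines file in this shape is one common way to feed text-to-text pairs to a sequence-to-sequence fine-tuning script; the 1000-line and 2000-line experiments would differ only in which lists are passed to `build_pairs`.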