---
license: mit
language:
- de
metrics:
- cer
library_name: transformers
tags:
- kurrent
- ocr
- htr
- 16th century
- 17th century
- 18th century
- trocr
---

# TrOCR Kurrent model, 16th to 18th century

Base model: **dh-unibe/trocr-kurrent**

Epochs: 19.85 / 20

Eval CER: 0.05673

Test CER: 0.05416
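
For readers unfamiliar with the metric: CER (character error rate) is commonly defined as the character-level edit distance between prediction and ground truth, divided by the number of ground-truth characters. A test CER of 0.05416 therefore corresponds to roughly 5.4 character edits per 100 reference characters. The sketch below illustrates that common definition in plain Python. It is not the evaluation script behind the figures above, and the function names `levenshtein` and `cer` are illustrative; any text normalization applied for the reported scores is not documented here.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters yields the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Summing edits over the whole corpus before dividing weights long lines more heavily than averaging per-line scores would; which variant was used for the scores above is an open assumption.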

This model was trained on an extensive set (roughly 1,579,200 words) of German Kurrent script written in the 16th to 18th centuries. It was evaluated on an evaluation set and a test set that contain the same hands as the training set (automatic split).

The ground truth stems from different projects and partners and is biased toward Swiss documents. It is based on documents from a variety of archives and projects, among them the State Archives of Zürich (Stillstandsprotokolle, Ratsmanuale, Findmittel), the scholarly edition project Königsfelden (Universities of Zürich and Bern: www.koenigsfelden.uzh.ch), and transcriptions from Einsiedeln. Further contributions come from the university archives of Greifswald: https://rechtsprechung-im-ostseeraum.archiv.uni-greifswald.de/.

The public Transkribus model (based on PyLaia) can be found here: https://readcoop.eu/model/german-kurrent-16th-18th/

Extensive testing of the model has yet to be carried out. This is only a first attempt, but it may serve as a starting point for fine-tuning.
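
As a starting point for such experiments, the following is a minimal inference sketch using the standard TrOCR classes from `transformers`. Several things here are assumptions rather than documented facts: this card does not state its own repository ID, so the base model ID `dh-unibe/trocr-kurrent` named above is used as a stand-in; the repository is assumed to ship a processor configuration; and the helper name `transcribe_line` and the image path are hypothetical. TrOCR operates on images of single text lines, so full pages need to be segmented into lines first.

```python
# Stand-in model ID (assumption): replace with the repository ID of this model.
DEFAULT_MODEL_ID = "dh-unibe/trocr-kurrent"


def transcribe_line(image_path: str, model_id: str = DEFAULT_MODEL_ID) -> str:
    """Transcribe one cropped text-line image with a TrOCR checkpoint."""
    # Imported here so that defining the helper does not require the
    # heavy dependencies (transformers, torch, Pillow) to be installed.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    # The image processor expects RGB input, so convert grayscale scans.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Generate token IDs and decode them back into text.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call such as `transcribe_line("line_0001.png")` (hypothetical file name) would return the predicted transcription as a string. When transcribing many lines, it is more efficient to load the processor and model once and reuse them than to reload them on every call as this sketch does.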