Actually changed the Readme.md
README.md
CHANGED
@@ -5,4 +5,45 @@ base_model:
 library_name: transformers
 language:
 - la
----
+---
+Base model: **magistermilitum/tridis_HTR v1**
+
+Train Lines: ???
+
+Eval Lines: ???
+
+Test Lines: ???
+
+
+Epochs: 14.1667 / 20
+
+Eval CER: 0.0544
+
+Test CER: 0.0622
+
+
+Test results with CERberus
+| Metric                       | Value   |
+|------------------------------|---------|
+| Character Error Rate (%)     | 6.22    |
+| Number of Correct Characters | 186998  |
+| Number of Substitutions      | 5425    |
+| Number of Insertions         | 2933    |
+| Number of Deletions          | 3849    |
+| Total Character Count        | 196272  |
+| Original Lines Count         | 2288    |
+| Discarded Lines Count        | 0       |
+
+
+Fine-tuned on an Anglicana dataset of mainly Medieval Latin text, with a few Middle English and Anglo-Norman sources, containing documents from:
+
+- the Common Pleas (CP)
+- the Justices (JUST)
+
+from the English Legal Court Rolls.
+
+The model has not been extensively tested.
+
+Errors often occur in punctuation, which has an error rate of 44.44%, mostly from missed ‧ dots.
+
+Potential biases are still to be identified.
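The CERberus counts in the table can be cross-checked against the reported test CER. The sketch below assumes the conventional definition, CER = (substitutions + insertions + deletions) / reference length, with reference length = correct + substitutions + deletions. Whether CERberus computes it exactly this way is an assumption, although the published numbers are consistent with it. The function name `character_error_rate` is hypothetical and used only for illustration.

```python
# Counts copied from the CERberus table in the README diff.
correct = 186_998
substitutions = 5_425
insertions = 2_933
deletions = 3_849
total_chars = 196_272


def character_error_rate(subs: int, ins: int, dels: int, ref_len: int) -> float:
    """Conventional CER: edit operations divided by reference length.

    Assumption: this mirrors the usual definition; it is not taken
    from the CERberus source code.
    """
    if ref_len <= 0:
        raise ValueError("reference length must be positive")
    return (subs + ins + dels) / ref_len


# Every reference character is either matched, substituted, or deleted,
# so these three counts should add up to the total character count.
reference_length = correct + substitutions + deletions

cer = character_error_rate(substitutions, insertions, deletions, reference_length)

print(reference_length == total_chars)  # True
print(f"{cer:.4f}")                     # 0.0622
print(f"{cer * 100:.2f}%")              # 6.22%
```

The result matches both the `Test CER: 0.0622` line (a fraction) and the table's `6.22` (a percentage), so the two figures describe the same measurement in different units.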