Commit 327d233 by danangwijaya
1 parent: e908a76
Update README.md
README.md
CHANGED
@@ -6,6 +6,9 @@ datasets:
 model-index:
 - name: IndoRetNet-Liputan6
   results: []
+license: apache-2.0
+language:
+- id
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -13,21 +16,25 @@ should probably proofread and complete it, then remove this comment. -->
 
 # IndoRetNet-Liputan6
 
-This model is a
+This model is a Indonesian RetNet model train using the Liputan6 dataset.
+Using Tokenizer from [IndoBERT](https://huggingface.co/indolem/indobert-base-uncased)
 It achieves the following results on the evaluation set:
 - Loss: 3.4936
 
 ## Model description
 
-
+Demonstrate training and recurrent inference using a retentive network (https://arxiv.org/pdf/2307.08621.pdf).
+The code utilizes Sehyun Choi's implementation of retentive network (https://github.com/syncdoth/RetNet).
+
+- **License:** Apache 2.0.
 
 ## Intended uses & limitations
 
-
+Intended to demonstrate training and (recurrent O(1)) inference using a retentive network in Indonesian language.
 
 ## Training and evaluation data
 
-
+Using Train and validation set from Liputan6 dataset provided by [NusaCrowd](https://github.com/IndoNLP/nusa-crowd).
 
 ## Training procedure
 
@@ -74,4 +81,4 @@ The following hyperparameters were used during training:
 - Transformers 4.36.2
 - Pytorch 2.1.0+cu121
 - Datasets 2.16.1
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
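For readers unfamiliar with the "recurrent O(1) inference" that the updated model card mentions, the following is a minimal pure-Python sketch of a single, simplified retention head. It is not code from this repository or from the syncdoth/RetNet implementation: it omits the position rotation, scaling, normalization, and multi-scale decay of the full architecture, and the function names, tensor sizes, and the decay value `GAMMA` are assumptions chosen only for illustration. It shows that the recurrent form, which carries a fixed-size state from token to token, gives the same outputs as the parallel form used during training.

```python
import random

GAMMA = 0.9  # decay factor; illustrative value, not taken from the model


def retention_parallel(q, k, v, gamma=GAMMA):
    """Parallel form: o_n = sum over m <= n of gamma^(n-m) * (q_n . k_m) * v_m."""
    out = []
    for n in range(len(q)):
        o = [0.0] * len(v[0])
        for m in range(n + 1):
            score = gamma ** (n - m) * sum(a * b for a, b in zip(q[n], k[m]))
            o = [oi + score * vi for oi, vi in zip(o, v[m])]
        out.append(o)
    return out


def retention_recurrent(q, k, v, gamma=GAMMA):
    """Recurrent form: S_n = gamma * S_(n-1) + k_n^T v_n, then o_n = q_n S_n."""
    d_k, d_v = len(k[0]), len(v[0])
    # The state has a fixed d_k x d_v size, independent of sequence length,
    # so the memory and compute needed per generated token stay constant.
    state = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for qn, kn, vn in zip(q, k, v):
        state = [
            [gamma * state[i][j] + kn[i] * vn[j] for j in range(d_v)]
            for i in range(d_k)
        ]
        out.append([sum(qn[i] * state[i][j] for i in range(d_k)) for j in range(d_v)])
    return out


def random_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix of uniform values in [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
q, k, v = (random_matrix(5, 4, rng) for _ in range(3))
par = retention_parallel(q, k, v)
rec = retention_recurrent(q, k, v)
# Largest element-wise gap between the two forms (floating-point noise only).
max_diff = max(abs(a - b) for ro, rr in zip(par, rec) for a, b in zip(ro, rr))
```

The two functions compute the same quantity in different orders, so `max_diff` should differ from zero only by floating-point rounding.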