danangwijaya committed on
Commit 327d233
1 Parent(s): e908a76

Update README.md

Files changed (1)
  1. README.md +12 -5
README.md CHANGED
@@ -6,6 +6,9 @@ datasets:
  model-index:
  - name: IndoRetNet-Liputan6
    results: []
+ license: apache-2.0
+ language:
+ - id
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -13,21 +16,25 @@ should probably proofread and complete it, then remove this comment. -->

  # IndoRetNet-Liputan6

- This model is a fine-tuned version of [](https://huggingface.co/) on the liputan6 dataset.
+ This model is an Indonesian RetNet model trained on the Liputan6 dataset.
+ It uses the tokenizer from [IndoBERT](https://huggingface.co/indolem/indobert-base-uncased).
  It achieves the following results on the evaluation set:
  - Loss: 3.4936

  ## Model description

- More information needed
+ Demonstrates training and recurrent inference using a retentive network (https://arxiv.org/pdf/2307.08621.pdf).
+ The code utilizes Sehyun Choi's implementation of the retentive network (https://github.com/syncdoth/RetNet).
+
+ - **License:** Apache 2.0.

  ## Intended uses & limitations

- More information needed
+ Intended to demonstrate training and recurrent O(1) inference using a retentive network for the Indonesian language.

  ## Training and evaluation data

- More information needed
+ Uses the train and validation splits of the Liputan6 dataset provided by [NusaCrowd](https://github.com/IndoNLP/nusa-crowd).

  ## Training procedure

@@ -74,4 +81,4 @@ The following hyperparameters were used during training:
  - Transformers 4.36.2
  - Pytorch 2.1.0+cu121
  - Datasets 2.16.1
- - Tokenizers 0.15.0
+ - Tokenizers 0.15.0
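
The model description added in the diff above points to Sehyun Choi's RetNet implementation and to the IndoBERT tokenizer, but the card gives no usage snippet. The sketch below shows what loading the checkpoint and generating text might look like; the repository id `danangwijaya/IndoRetNet-Liputan6`, the `trust_remote_code=True` loading path, and the availability of `generate()` on the custom model class are assumptions, not facts taken from the card.

```python
# Hedged sketch: loading the model described in this card and generating text.
# Assumptions (not confirmed by the card): the Hub repo id, the need for
# trust_remote_code=True (RetNet is not a built-in transformers architecture),
# and that the custom model class supports generate().
from transformers import AutoModelForCausalLM, AutoTokenizer

# The card states the tokenizer comes from IndoBERT.
tokenizer = AutoTokenizer.from_pretrained("indolem/indobert-base-uncased")

model = AutoModelForCausalLM.from_pretrained(
    "danangwijaya/IndoRetNet-Liputan6",  # hypothetical repo id
    trust_remote_code=True,
)

prompt = "Jakarta adalah"  # "Jakarta is ..."
inputs = tokenizer(prompt, return_tensors="pt")

# Autoregressive decoding; a RetNet backend can serve each new token from its
# recurrent O(1)-per-step state instead of re-attending over the full prefix.
output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```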
 
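
The training-data note in the diff says the train and validation splits come from Liputan6 via NusaCrowd, but not how they were loaded. A minimal sketch follows, assuming the canonical Liputan6 release is reachable through the `datasets` library; the dataset id `id_liputan6`, the `canonical` config, the local `data_dir`, and the `clean_article` column name are all assumptions rather than details from the card.

```python
# Hedged sketch: preparing Liputan6 text for causal LM training with the
# IndoBERT tokenizer named in the card. Dataset id, config name, data_dir
# requirement, and column names are assumptions, not taken from the card.
from datasets import load_dataset
from transformers import AutoTokenizer

raw = load_dataset(
    "id_liputan6",                 # hypothetical dataset id
    "canonical",                   # hypothetical config name
    data_dir="path/to/liputan6",   # Liputan6 usually requires a manual download
)
train_ds, val_ds = raw["train"], raw["validation"]

tokenizer = AutoTokenizer.from_pretrained("indolem/indobert-base-uncased")

def tokenize(batch):
    # "clean_article" is assumed to be the article-text field in the canonical
    # Liputan6 release; adjust if the actual column name differs.
    return tokenizer(batch["clean_article"], truncation=True, max_length=512)

train_tok = train_ds.map(tokenize, batched=True, remove_columns=train_ds.column_names)
val_tok = val_ds.map(tokenize, batched=True, remove_columns=val_ds.column_names)
print(train_tok)
```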