huseinzol05 committed · verified · Commit 5eb45b4 · Parent(s): d568657

Update README.md

Files changed (1): README.md (+8 −2)
README.md CHANGED
@@ -13,7 +13,9 @@ Finetuned https://huggingface.co/mesolitica/nanot5-base-malaysian-cased using 20
 - This model natively supports code switching.
 - This model maintains `\n`, `\t`, `\r` as they are.
 
-**Still in training session**, Wandb at https://wandb.ai/huseinzol05/nanot5-base-malaysian-cased-translation-v4?nw=nwuserhuseinzol05
+Thanks to https://x.com/oscarnazhan for the GPU access to train this model.
+
+**Still in training session**, Wandb at https://wandb.ai/huseinzol05/nanot5-base-malaysian-cased-translation-v4-multipack?nw=nwuserhuseinzol05
 
 ## Supported prefix
 
@@ -78,4 +80,8 @@ Output,
 ' Bayangkan PH dan menang PRU-14. Terdapat pelbagai pintu belakang. Akhirnya, Ismail Sabri naik. Itulah sebabnya saya tidak lagi bercakap tentang politik. Saya bersumpah sudah berputus asa.']
 ```
 
 Input text can be in any language spoken in Malaysia; as long as you use the proper prefix, the model should translate it into the target language.
+
+## How to finetune on your own dataset?
+
+We finetuned using a T5 SDPA multipacking fork at https://github.com/mesolitica/t5-sdpa-multipack. It is largely undocumented, but the scripts at https://github.com/huggingface/transformers/tree/main/examples/pytorch/translation should also work.
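The translation example scripts mentioned above expect parallel data as JSON lines, one `{"translation": {...}}` record per line. A minimal sketch of preparing such a file — the language codes (`en`, `ms`) and the sentence pairs are illustrative assumptions, not from this repo:

```python
# Sketch: write a tiny train file in the JSON-lines format used by the
# Hugging Face translation example scripts. Each line is one record of
# the form {"translation": {src_lang: text, tgt_lang: text}}.
# Language codes and sentence pairs below are placeholder assumptions.
import json

pairs = [
    ("Hello, how are you?", "Hai, apa khabar?"),
    ("The weather is nice today.", "Cuaca hari ini baik."),
]

with open("train.json", "w", encoding="utf-8") as f:
    for en, ms in pairs:
        record = {"translation": {"en": en, "ms": ms}}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A file in this shape can then be passed to the example script via its `--train_file` flag, alongside the matching `--source_lang`/`--target_lang` codes.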