tisorlawan committed
Commit 8532443 · 1 Parent(s): 79cbaf4
fix: improve doc
README.md CHANGED
@@ -6,6 +6,7 @@ language:
 - ban
 - bug
 - id
+pretty_name: IndoBERTNusa
 tags:
 - generated_from_trainer
 datasets:
@@ -14,15 +15,15 @@ datasets:
 pipeline_tag: fill-mask
 ---

-#
+# IndoBERTNusa (IndoBERT Adapted for Balinese, Buginese, and Minangkabau)

 This repository contains a language adaptation and fine-tuning of the Indobenchmark IndoBERT language model for three specific languages: Balinese, Buginese, and Minangkabau.
-The adaptation was performed using nusa-
+The adaptation was performed using [nusa-translation](https://huggingface.co/datasets/prosa-text/nusa-translation) dataset.

 ## Model Details

 - **Base Model**: [indobenchmark/indobert-large-p2](https://huggingface.co/indobenchmark/indobert-large-p2)
-- **Adaptation Data**:
+- **Adaptation Data**:[nusa-translation](https://huggingface.co/datasets/prosa-text/nusa-translation)


 ## Performance Comparison / Benchmark
@@ -77,18 +78,6 @@ The following hyperparameters were used during training:
 The dataset is released under the terms of **CC-BY-SA 4.0**.
 By using this model, you are also bound to the respective Terms of Use and License of the dataset.

-### Citation Information
-
-```bibtex
-@article{purwarianti2023nusadialogue,
-title={NusaDialogue: Dialogue Summarization and Generation for Underrepresented and Extremely Low-Resource Languages},
-author={Purwarianti, Ayu and Adhista, Dea and Baptiso, Agung and Mahfuzh, Miftahul and Yusrina Sabila and Cahyawijaya, Samuel and Aji, Alham Fikri},
-journal={arXiv preprint arXiv:(coming soon)},
-url={https://huggingface.co/datasets/prosa-text/nusa-dialogue},
-year={2023}
-}
-```
-
 ### Acknowledgement
 This research work is funded and supported by The Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) GmbH and FAIR Forward - Artificial Intelligence for all. We thank Direktorat Jenderal Pendidikan Tinggi, Riset, dan Teknologi Kementerian Pendidikan, Kebudayaan, Riset, dan Teknologi (Ditjen DIKTI) for providing the computing resources for this project.
