---
datasets:
- SLPG/Punjabi_Transliteration_Corpus
language:
- pa
metrics:
- bleu
library_name: fairseq
pipeline_tag: translation
tags:
- punjabi shahmukhi
- punjabi gurmukhi
- transliteration
- punjabi transliteration
- punjabi gur to shahmukhi
- transliteration system
- punjabi transliteration system
---

### Punjabi Shahmukhi to Gurmukhi Transliteration System

Our supervised Punjabi transliteration systems, built on an unsupervised corpus, are bidirectional NMT systems that convert text between the Gurmukhi and Shahmukhi scripts. The Gurmukhi-to-Shahmukhi model achieves a BLEU score of 98.1 and 99.5% word-level accuracy, while the Shahmukhi-to-Gurmukhi model achieves a BLEU score of 87.7.

## Corpus Details

- **Total Sentences:** 6.3 million
- **Domains Covered:** Various domains, with text drawn from CCAligned, CCMatrix, TED, QED, OPUS, Tico, Wikimedia, MultiCCAligned, EMILLE, IJCNLP, XLEnt, and ParaCrawl.
- **Test Corpus:** FLORES-101

## Model Details

- **BLEU Score:** 87.7

You may also explore our <u>Gurmukhi-to-Shahmukhi model</u>, which achieves a **BLEU score of 98.1**, [here](https://huggingface.co/SLPG/Punjabi_Gurmukhi_to_Shahmukhi_Transliteration).

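
The word-level accuracy quoted in the introduction can be illustrated with a small scorer. This is a minimal sketch that assumes accuracy means the share of reference words matched exactly at the same whitespace-token position; the exact definition behind the reported 99.5% is not given here, and the function name and sample sentences are illustrative only.

```python
from typing import Sequence


def word_accuracy(hypotheses: Sequence[str], references: Sequence[str]) -> float:
    """Share of reference words reproduced exactly at the same position.

    Assumed definition for illustration; not necessarily the one used
    to produce the scores reported in this card.
    """
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")

    correct = 0
    total = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_words = hyp.split()
        ref_words = ref.split()
        total += len(ref_words)
        # zip stops at the shorter sentence, so missing words count as errors.
        correct += sum(h == r for h, r in zip(hyp_words, ref_words))

    return correct / total if total else 0.0


# Illustrative Gurmukhi sentences: the third word differs between the two.
references = ["ਪੰਜਾਬੀ ਮੇਰੀ ਭਾਸ਼ਾ ਹੈ"]
hypotheses = ["ਪੰਜਾਬੀ ਮੇਰੀ ਭਾਸਾ ਹੈ"]
print(word_accuracy(hypotheses, references))  # 0.75
```

Position-wise matching is strict: a single inserted or dropped word shifts every later word out of alignment, so an edit-distance-based measure may be preferable when output and reference lengths differ.
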
## Usage

These resources are intended to facilitate research and development in the field of Punjabi transliteration. They can be used to train new models or improve existing ones, enabling high-quality transliteration between Gurmukhi and Shahmukhi scripts.

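
Since this card lists fairseq as the library, one plausible way to run a downloaded checkpoint is through fairseq's Python hub interface. The sketch below assumes a Transformer architecture and that the checkpoint and dictionary files sit together in one local directory; the directory path, the `checkpoint_best.pt` file name, and both helper names are hypothetical. Any tokenization the model expects is not described here and would need to be applied to the input first.

```python
from typing import Iterable, List


def load_transliterator(model_dir: str, checkpoint_file: str = "checkpoint_best.pt"):
    """Load a fairseq checkpoint from a local directory (file names are hypothetical)."""
    # Imported lazily so the helper below can be used without fairseq installed.
    from fairseq.models.transformer import TransformerModel

    model = TransformerModel.from_pretrained(
        model_dir,
        checkpoint_file=checkpoint_file,
        data_name_or_path=model_dir,  # assumes the dict.*.txt files live here too
    )
    model.eval()  # disable dropout for inference
    return model


def transliterate_lines(model, lines: Iterable[str]) -> List[str]:
    """Transliterate each non-empty line with any object exposing .translate()."""
    return [model.translate(line.strip()) for line in lines if line.strip()]
```

With fairseq installed and the model files downloaded, `transliterate_lines(load_transliterator("path/to/model_dir"), open("input.txt", encoding="utf-8"))` would return one output string per non-empty input line.
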
## Citation

**If you use our model, kindly cite our paper:**

```
@article{Shehzadi2024,
  title={Unsupervised Punjabi Corpus and Neural Machine Transliteration System},
  author={Shehzadi Ambreen and Sadaf Abdul Rauf and MG Abbas Malik and Muhammad Imran},
  journal={Heliyon},
  year={2024},
  note={Under review}
}
```