---
datasets:
- SLPG/Punjabi_Transliteration_Corpus
language:
- pa
metrics:
- bleu
library_name: fairseq
pipeline_tag: translation
tags:
- punjabi shahmukhi
- punjabi gurmukhi
- transliteration
- punjabi transliteration
- punjabi gur to shahmukhi
- transliteration system
- punjabi transliteration system
---
### Punjabi Gurmukhi to Shahmukhi Transliteration System
Our Punjabi transliteration systems are supervised, bidirectional neural machine translation (NMT) models, trained on an unsupervised corpus, that convert text between the Gurmukhi and Shahmukhi scripts. The Gurmukhi-to-Shahmukhi model achieves a BLEU score of 98.1 and 99.5% word-level accuracy, while the Shahmukhi-to-Gurmukhi model achieves a BLEU score of 87.7.
## Corpus Details
- **Total Sentences:** 6.3 million
- **Domains Covered:** Various sources, including CCAligned, CCMatrix, TED, QED, OPUS, TICO, Wikimedia, MultiCCAligned, EMILLE, IJCNLP, XLEnt, and ParaCrawl.
- **Test Corpus:** FLORES-101
## Model Details
- **BLEU Score:** 87.7 (Shahmukhi-to-Gurmukhi)
You may also explore our <u>Gurmukhi-to-Shahmukhi Model</u>, which achieves a **BLEU score** of 98.1, [here](https://huggingface.co/SLPG/Punjabi_Gurmukhi_to_Shahmukhi_Transliteration).
## Usage
These resources are intended to facilitate research and development in the field of Punjabi
transliteration. They can be used to train new models or improve existing ones, enabling high-quality
transliteration between Gurmukhi and Shahmukhi scripts.
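
When comparing a new or improved model against the figures reported above, a word-level accuracy check can complement BLEU. The following is a minimal sketch only: the function name `word_accuracy` is hypothetical, and the position-by-position matching of whitespace-separated words is an assumption, since this card does not state how word-level accuracy was computed.

```python
def word_accuracy(hypotheses, references):
    """Fraction of reference words reproduced at the same position.

    Assumption: words are whitespace-separated and compared
    position-by-position; the exact definition behind the reported
    accuracy is not given in this card.
    """
    correct = 0
    total = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_words = hyp.split()
        ref_words = ref.split()
        total += len(ref_words)
        # zip stops at the shorter sequence, so missing words count as errors
        correct += sum(h == r for h, r in zip(hyp_words, ref_words))
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Placeholder tokens stand in for transliterated words
    system_output = ["w1 w2 w3", "w4 w5"]
    gold = ["w1 w2 wX", "w4 w5"]
    print(f"word-level accuracy: {word_accuracy(system_output, gold):.3f}")
```

Because transliteration largely preserves word order and word count, positional matching is a reasonable first approximation; an alignment-based measure would be more robust when the output drops or inserts words.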
## Citation
**If you use our model, kindly cite our [paper]()**:
```
@article{Shehzadi2024,
  title={Unsupervised Punjabi Corpus and Neural Machine Transliteration System},
  author={Shehzadi Ambreen and Sadaf Abdul Rauf and MG Abbas Malik and Muhammad Imran},
  journal={Heliyon},
  year={2024},
  note={Under review}
}
```