---
license: apache-2.0
language:
- hi
base_model:
- facebook/nllb-200-distilled-600M
tags:
- nllb
- hindi2kangri
- lowresourcelang
library_name: transformers
---


# Hindi to Kangri Neural Machine Translation (NMT) Model

## Overview

This repository contains a fine-tuned Neural Machine Translation (NMT) model that translates text from **Hindi** to **Kangri**. The model is built with the Hugging Face Transformers library and is designed for easy integration into applications. I added the Kangri language to the tokenizer and trained it on a Kangri corpus. This is an early version, and I plan to improve it soon.

## Model Details

- **Base Model**: facebook/nllb-200-distilled-600M
- **Languages**: Hindi (source) to Kangri (target)
- **Architecture**: Transformer-based encoder-decoder (sequence-to-sequence)
- **Training Dataset**: Custom dataset consisting of parallel Hindi and Kangri sentences.

## Installation

To use this model, install the Hugging Face Transformers library and PyTorch, both of which are imported by the sample code in the next section.
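
The exact package list is not given here, so the following is only a minimal sketch under assumptions: `transformers` and `torch` are taken from the imports in the sample code, while `sentencepiece` is an assumption (NLLB tokenizers commonly rely on it). The helper name `missing_packages` is hypothetical. The script checks which packages are missing and prints a matching `pip install` command.

```python
import importlib.util

# "transformers" and "torch" come from the sample code's imports;
# "sentencepiece" is an assumption and may not be needed in every setup.
REQUIRED = ["transformers", "torch", "sentencepiece"]


def missing_packages(names):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    # Suggest an install command for only the packages that are absent.
    print("Install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

If the check reports missing packages, run the printed command in the same environment (for example, inside a notebook cell prefixed with `!`).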


## Using the Model
You can use this model with the Hugging Face Transformers library in a Python script or a Jupyter notebook. Below is a sample code snippet to demonstrate how to load and use the model for translation.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

model_name = "cloghost/nllb-200-distilled-600M-hin-kang-v1"

model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Use the first GPU if one is available; -1 tells the pipeline to run on CPU
device = 0 if torch.cuda.is_available() else -1
translator = pipeline(
    "translation",
    model=model,
    tokenizer=tokenizer,
    src_lang="hin_Deva",
    tgt_lang="kang_Deva",
    device=device
)

text = """मगर हिमाचली भाषा तो पहले से बोली जा रही है।
लोग सदियों से ही इसके संग जी रहे हैं।
पहाड़ी भाषा का इतिहास हिन्दी साहित्य के आदिकाल ,‌जिसे सिद्ध चारण काल के नाम से भी जानते हैं
"""

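translation = translator(text)
# The pipeline returns a list of dicts; the output is under "translation_text"
print(translation[0]["translation_text"])
```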