---
library_name: peft
base_model: Sakonii/distilgpt2-nepali
license: apache-2.0
datasets:
- Bibek1129/nepali_SQuAD_single_qsn
language:
- ne
pipeline_tag: text-generation
---

# Model Card for distilgpt2-nepali-single-qs-generator

<!-- Provide a quick summary of what the model is/does. -->

A LoRA adapter for Sakonii/distilgpt2-nepali that generates a single Nepali question from a given context passage.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->
The model is fine-tuned from Sakonii/distilgpt2-nepali on the Bibek1129/nepali_SQuAD_single_qsn dataset, which was created by translating the SQuAD dataset to Nepali with the Nepali_nlp library.


- **Model type:** distilgpt2
- **Language(s) (NLP):** ne (Nepali)
- **Finetuned from model:** https://huggingface.co/Sakonii/distilgpt2-nepali

### Model Sources 

<!-- Provide the basic links for the model. -->
For training and inference snippets, see the following repository.
- **Repository:** https://github.com/HordesOfGhost/Nepali_LLMs/


## How to Get Started with the Model

Use the code below to get started with the model.
```python
!pip install peft 
!pip install transformers
!pip install sentencepiece
```
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

base_model = "Sakonii/distilgpt2-nepali"
adapter_model = "Bibek1129/distilgpt2-nepali-single-qs-generator"

tokenizer = AutoTokenizer.from_pretrained(base_model)

config = PeftConfig.from_pretrained(adapter_model)  # adapter configuration (optional, for inspection)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)
model = model.merge_and_unload()  # merge the LoRA weights into the base model for standard inference

prompt = """तपाईं एउटा प्रश्न उत्पन्न गर्ने मोडेल हुनुहुन्छ। तपाइँलाई एक सन्दर्भ दिइएको हुन्छ र तपाइँ त्यसमा आधारित एउटा प्रश्न उत्पन्न गर्नुहुन्छ।

### सन्दर्भ:
राजनीति 'शहरका मामिलाहरू') गतिविधिहरूको सेट हो जुन समूहहरूमा निर्णय गर्न वा व्यक्तिहरू बीचको शक्ति सम्बन्धका अन्य रूपहरू, जस्तै स्रोत वा स्थितिको वितरणसँग सम्बन्धित छ। राजनीति र सरकारको अध्ययन गर्ने सामाजिक विज्ञानको शाखालाई राजनीति विज्ञान भनिन्छ।
यसलाई "राजनीतिक समाधान" को सन्दर्भमा सकारात्मक रूपमा प्रयोग गर्न सकिन्छ जुन सम्झौता र अहिंसात्मक छ, वा वर्णनात्मक रूपमा "सरकारको कला वा विज्ञान" को रूपमा, तर प्राय: नकारात्मक अर्थ पनि बोक्छ। अवधारणालाई विभिन्न तरिकामा परिभाषित गरिएको छ, र यसलाई
व्यापक रूपमा प्रयोग गर्ने वा सीमित रूपमा, प्रायोगिक वा सामान्य रूपमा, र यसको लागि द्वन्द्व वा सहयोग बढी आवश्यक छ कि छैन भन्ने बारेमा विभिन्न दृष्टिकोणहरूमा मौलिक रूपमा फरक फरक विचारहरू छन्।

### प्रश्न:
"""
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=64)

def format_output(prompt, pipe):
  inference = pipe(prompt)[0]["generated_text"]

  # Keep only the text after "प्रश्न:" and break the line after each "?"
  inference = inference.split("प्रश्न:")[-1].replace("?", "?\n")

  # Only take the first question
  index = inference.find("?")
  inference = inference[:index + 1]
  return inference

print(format_output(prompt, pipe))
'''
  Output:
        राजनीतिक आन्दोलनमा, राजनीतिक कार्यसूचीको सन्दर्भमा कुन प्रकारको राजनीति महत्वपूर्ण छ?
'''
```

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
The dataset was created by translating the SQuAD dataset to Nepali using the Nepali_nlp library.

https://huggingface.co/datasets/Bibek1129/nepali_SQuAD_single_qsn
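
For a quick look at the data, it can be loaded directly from the Hub with the datasets library. This is a minimal sketch; the "train" split name is an assumption, so check the dataset card for the actual splits and columns.

```python
from datasets import load_dataset

# Load the Nepali single-question SQuAD dataset from the Hub
dataset = load_dataset("Bibek1129/nepali_SQuAD_single_qsn")

# Inspect the splits and one sample row ("train" split name is an assumption)
print(dataset)
print(dataset["train"][0])
```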

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
The model was trained with a LoRA config (rank=32, lora_alpha=64, target_modules=["c_fc", "c_attn", "c_proj", "lm_head"]), with 512 tokens per instance, 4 instances per batch, and around 118.1K training steps.

#### Training Hyperparameters
The following training hyperparameters were used (see the sketch after the LoRA config below for how they fit into a training run):
- learning_rate: 2e-4
- fp16: True
- optim: "paged_adamw_32bit"
- lr_scheduler_type: "constant"
- num_train_epochs: 15

LoRA config:

```json
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "Sakonii/distilgpt2-nepali",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layers_pattern": null,
  "layers_to_transform": null,
  "lora_alpha": 64,
  "lora_dropout": 0.05,
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 32,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "c_proj",
    "lm_head",
    "c_fc",
    "c_attn"
  ],
  "task_type": "CAUSAL_LM"
}
```
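
For reference, the hyperparameters and LoRA config above can be wired together roughly as follows. This is a minimal sketch, not the exact training script (see the linked repository for that); the "text" column name, the pad-token handling, and the output directory are assumptions.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "Sakonii/distilgpt2-nepali"

tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # a padding token is needed for the data collator

model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA configuration matching the adapter config shown above
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    bias="none",
    target_modules=["c_fc", "c_attn", "c_proj", "lm_head"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Assumption: the dataset provides prompt-formatted examples in a "text" column;
# check the dataset card for the actual column names.
dataset = load_dataset("Bibek1129/nepali_SQuAD_single_qsn")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(
    tokenize, batched=True, remove_columns=dataset["train"].column_names
)

# Hyperparameters as listed above; output_dir is a placeholder
training_args = TrainingArguments(
    output_dir="distilgpt2-nepali-single-qs-generator",
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    num_train_epochs=15,
    fp16=True,
    optim="paged_adamw_32bit",
    lr_scheduler_type="constant",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

After training, `trainer.model.save_pretrained(...)` writes only the adapter weights, which is what this repository hosts.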

### Results
- train/loss: 3.1028

### Framework versions

- PEFT 0.9.0