Model Card for Model ID

Model Details

Model Description

The model is finetuned on Sakonii/distilgpt2-nepali with Bibek1129/nepali_SQuAD_multiple_qsns dataset.The dataset is converted to nepali using Nepali_nlp library using SQuAD dataset.

Model type: distilgpt2
Language(s) (NLP): ne(Nepali)
Finetuned from model : https://huggingface.co/Sakonii/distilgpt2-nepali

Model Sources

For training snippets and inference check the following repository.

Repository: https://github.com/HordesOfGhost/Nepali_LLMs/]

How to Get Started with the Model

Use the code below to get started with the model.

!pip install peft 
!pip install transformers
!pip install sentencepiece

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM,AutoTokenizer
from transformers import pipeline

base_model = "Sakonii/distilgpt2-nepali"
adapter_model = "Bibek1129/distilgpt2-nepali-single-qs-generator"

tokenizer = AutoTokenizer.from_pretrained(base_model)

config = PeftConfig.from_pretrained(adapter_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)
model = model.merge_and_unload()

prompt = """तपाईं एउटा प्रश्न उत्पन्न गर्ने मोडेल हुनुहुन्छ। तपाइँलाई एक सन्दर्भ दिइएको हुन्छ र तपाइँ त्यसमा आधारित एउटा प्रश्न उत्पन्न गर्नुहुन्छ।

### सन्दर्भ:
राजनीति 'शहरका मामिलाहरू') गतिविधिहरूको सेट हो जुन समूहहरूमा निर्णय गर्न वा व्यक्तिहरू बीचको शक्ति सम्बन्धका अन्य रूपहरू, जस्तै स्रोत वा स्थितिको वितरणसँग सम्बन्धित छ। राजनीति र सरकारको अध्ययन गर्ने सामाजिक विज्ञानको शाखालाई राजनीति विज्ञान भनिन्छ।
यसलाई "राजनीतिक समाधान" को सन्दर्भमा सकारात्मक रूपमा प्रयोग गर्न सकिन्छ जुन सम्झौता र अहिंसात्मक छ, वा वर्णनात्मक रूपमा "सरकारको कला वा विज्ञान" को रूपमा, तर प्राय: नकारात्मक अर्थ पनि बोक्छ। अवधारणालाई विभिन्न तरिकामा परिभाषित गरिएको छ, र यसलाई
व्यापक रूपमा प्रयोग गर्ने वा सीमित रूपमा, प्रायोगिक वा सामान्य रूपमा, र यसको लागि द्वन्द्व वा सहयोग बढी आवश्यक छ कि छैन भन्ने बारेमा विभिन्न दृष्टिकोणहरूमा मौलिक रूपमा फरक फरक विचारहरू छन्।

### प्रश्न:
"""
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=64)

def format_output(prompt,pipe):
  inference = pipe(prompt)[0]["generated_text"]
  
  # Select after प्रश्नहरू: and break line after each ?
  inference = inference.split("प्रश्न:")[-1].replace("?","?\n")
  
  # only take first question
  index = inference.find("?")
  inference = inference[:index+1] 
  return inference

print(format_output(prompt, pipe))
'''
  Output:
        राजनीतिक आन्दोलनमा, राजनीतिक कार्यसूचीको सन्दर्भमा कुन प्रकारको राजनीति महत्वपूर्ण छ?
'''

Training Details

Training Data

The dataset is created by converting SQuAD dataset to nepali using Nepali_nlp using PEFT.

https://huggingface.co/datasets/Bibek1129/nepali_SQuAD_single_qsn

Training Procedure

The model is trained with the lora config (rank=32,lora_alpha=64,target_modules="c_fc","c_attn","c_proj","lm_head");with 512 tokens per instance, 4 instances per batch, and around 118.1K training steps.

Training Hyperparameters

Following are the training hyperparameters.

learning_rate:2e-4

fp16:True

optim:"paged_adamw_32bit"

lr_scheduler_type:"constant"

num_train_epochs:15

Lora Config:

config={
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "Sakonii/distilgpt2-nepali",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layers_pattern": null,
  "layers_to_transform": null,
  "lora_alpha": 64,
  "lora_dropout": 0.05,
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 32,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "c_proj",
    "lm_head",
    "c_fc",
    "c_attn"
  ],
  "task_type": "CAUSAL_LM"
}

Results

train/loss:3.1028

Framework versions

PEFT 0.9.0

Bibek1129
/

distilgpt2-nepali-single-qs-generator