Model Card for Bibek1129/distilgpt2-nepali-single-qs-generator

Model Details

Model Description

This model is a fine-tuned version of Sakonii/distilgpt2-nepali, trained on the Bibek1129/nepali_SQuAD_multiple_qsns dataset. The dataset was created by translating the SQuAD dataset into Nepali with the Nepali_nlp library.

Model Sources

For training and inference snippets, see the following repository.

How to Get Started with the Model

Use the code below to get started with the model.

# Notebook cell: install dependencies (drop the leading "!" in a regular shell).
!pip install peft
!pip install transformers
!pip install sentencepiece

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

base_model = "Sakonii/distilgpt2-nepali"
adapter_model = "Bibek1129/distilgpt2-nepali-single-qs-generator"

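tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model, attach the LoRA adapter, then merge the adapter
# weights into the base model so inference runs without the PEFT wrapper.
config = PeftConfig.from_pretrained(adapter_model)  # adapter metadata; not used further below
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)
model = model.merge_and_unload()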

prompt = """तपाईं एउटा प्रश्न उत्पन्न गर्ने मोडेल हुनुहुन्छ। तपाइँलाई एक सन्दर्भ दिइएको हुन्छ र तपाइँ त्यसमा आधारित एउटा प्रश्न उत्पन्न गर्नुहुन्छ।

### सन्दर्भ:
राजनीति 'शहरका मामिलाहरू') गतिविधिहरूको सेट हो जुन समूहहरूमा निर्णय गर्न वा व्यक्तिहरू बीचको शक्ति सम्बन्धका अन्य रूपहरू, जस्तै स्रोत वा स्थितिको वितरणसँग सम्बन्धित छ। राजनीति र सरकारको अध्ययन गर्ने सामाजिक विज्ञानको शाखालाई राजनीति विज्ञान भनिन्छ।
यसलाई "राजनीतिक समाधान" को सन्दर्भमा सकारात्मक रूपमा प्रयोग गर्न सकिन्छ जुन सम्झौता र अहिंसात्मक छ, वा वर्णनात्मक रूपमा "सरकारको कला वा विज्ञान" को रूपमा, तर प्राय: नकारात्मक अर्थ पनि बोक्छ। अवधारणालाई विभिन्न तरिकामा परिभाषित गरिएको छ, र यसलाई
व्यापक रूपमा प्रयोग गर्ने वा सीमित रूपमा, प्रायोगिक वा सामान्य रूपमा, र यसको लागि द्वन्द्व वा सहयोग बढी आवश्यक छ कि छैन भन्ने बारेमा विभिन्न दृष्टिकोणहरूमा मौलिक रूपमा फरक फरक विचारहरू छन्।

### प्रश्न:
"""
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=64)

def format_output(prompt, pipe):
    """Generate from the prompt and return only the first question."""
    inference = pipe(prompt)[0]["generated_text"]

    # Keep only the text after the last "प्रश्न:" marker.
    inference = inference.split("प्रश्न:")[-1].strip()

    # Keep only the first question; if no "?" was generated, return the text as-is.
    index = inference.find("?")
    return inference[:index + 1] if index != -1 else inference

print(format_output(prompt, pipe))
# Example output:
#   राजनीतिक आन्दोलनमा, राजनीतिक कार्यसूचीको सन्दर्भमा कुन प्रकारको राजनीति महत्वपूर्ण छ?
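
The prompt layout and the post-processing step can be tried without downloading the model. The following is a minimal sketch: `build_prompt` and `extract_first_question` are hypothetical helper names that are not part of the model repository, and the `generated` string is a hand-written stand-in, not real model output.

```python
# Hypothetical helpers for illustration; not part of the model repository.
INSTRUCTION = (
    "तपाईं एउटा प्रश्न उत्पन्न गर्ने मोडेल हुनुहुन्छ। "
    "तपाइँलाई एक सन्दर्भ दिइएको हुन्छ र तपाइँ त्यसमा आधारित एउटा प्रश्न उत्पन्न गर्नुहुन्छ।"
)


def build_prompt(context: str) -> str:
    """Wrap a Nepali context passage in the prompt layout shown above."""
    return f"{INSTRUCTION}\n\n### सन्दर्भ:\n{context.strip()}\n\n### प्रश्न:\n"


def extract_first_question(generated: str) -> str:
    """Return the text after the last 'प्रश्न:' marker, cut at the first '?'."""
    tail = generated.split("प्रश्न:")[-1].strip()
    end = tail.find("?")
    # If no question mark was generated, fall back to the whole tail.
    return tail if end == -1 else tail[: end + 1]


# Hand-written stand-in for the pipeline's "generated_text" (not real model output).
prompt = build_prompt("नेपाल दक्षिण एसियामा अवस्थित देश हो।")
generated = prompt + "नेपाल कहाँ अवस्थित छ? नेपाल के हो?"
print(extract_first_question(generated))
```

Returning the full tail when no "?" is present avoids silently producing an empty string for generations that stop before the question is complete.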

Training Details

Training Data

The dataset was created by translating the SQuAD dataset into Nepali with the Nepali_nlp library.

https://huggingface.co/datasets/Bibek1129/nepali_SQuAD_single_qsn

Training Procedure

The model was trained with LoRA (rank=32, lora_alpha=64, target_modules="c_fc", "c_attn", "c_proj", "lm_head"), with 512 tokens per instance, 4 instances per batch, and around 118.1K training steps.
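
To put the figures above in perspective, the short calculation below derives an upper bound on the tokens seen during training and the LoRA scaling factor. It is a rough sketch that assumes every instance is filled to the full 512 tokens, no gradient accumulation, and the default alpha / r scaling; the card does not state any of these.

```python
# Figures quoted in the training procedure above.
TOKENS_PER_INSTANCE = 512
INSTANCES_PER_BATCH = 4
TRAINING_STEPS = 118_100  # "around 118.1K", so treat as approximate
LORA_RANK = 32
LORA_ALPHA = 64

# Upper bound on tokens per step (ASSUMPTION: full-length instances,
# no gradient accumulation).
tokens_per_step = TOKENS_PER_INSTANCE * INSTANCES_PER_BATCH

# Upper bound on tokens processed over the whole run, same assumptions.
total_tokens = tokens_per_step * TRAINING_STEPS

# Standard LoRA scaling: the low-rank update is multiplied by alpha / r
# (ASSUMPTION: default scaling, not the rank-stabilized variant).
lora_scale = LORA_ALPHA / LORA_RANK

print(f"tokens per step (max): {tokens_per_step}")
print(f"tokens over training (max): {total_tokens:,}")
print(f"LoRA scaling factor: {lora_scale}")
```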

Training Hyperparameters

Following are the training hyperparameters.

  • learning_rate: 2e-4
  • fp16: True
  • optim: "paged_adamw_32bit"
  • lr_scheduler_type: "constant"
  • num_train_epochs: 15
  • LoRA config:
    config={
      "alpha_pattern": {},
      "auto_mapping": null,
      "base_model_name_or_path": "Sakonii/distilgpt2-nepali",
      "bias": "none",
      "fan_in_fan_out": false,
      "inference_mode": true,
      "init_lora_weights": true,
      "layers_pattern": null,
      "layers_to_transform": null,
      "lora_alpha": 64,
      "lora_dropout": 0.05,
      "modules_to_save": null,
      "peft_type": "LORA",
      "r": 32,
      "rank_pattern": {},
      "revision": null,
      "target_modules": [
        "c_proj",
        "lm_head",
        "c_fc",
        "c_attn"
      ],
      "task_type": "CAUSAL_LM"
    }
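
As a rough guide to the size of the adapter this config produces, the sketch below counts the parameters LoRA adds to one GPT-2 style transformer block. The hidden size (768) and layer count (6) are assumptions based on the standard distilgpt2 architecture and are not stated in this card. It also assumes that the "c_proj" entry matches both the attention and the MLP projection in each block. The `lm_head` adapter is left out because its size depends on the vocabulary size, which is not given here.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def per_block_lora_params(hidden: int, r: int) -> dict:
    """LoRA parameters for the adapted layers of one GPT-2 style block."""
    return {
        "attn.c_attn": lora_param_count(hidden, 3 * hidden, r),  # fused q, k, v
        "attn.c_proj": lora_param_count(hidden, hidden, r),
        "mlp.c_fc": lora_param_count(hidden, 4 * hidden, r),
        "mlp.c_proj": lora_param_count(4 * hidden, hidden, r),
    }


HIDDEN = 768  # ASSUMPTION: standard distilgpt2 hidden size
N_LAYERS = 6  # ASSUMPTION: standard distilgpt2 depth
RANK = 32     # "r" from the config above

block = per_block_lora_params(HIDDEN, RANK)
for name, count in block.items():
    print(f"{name}: {count:,}")

# Total over all blocks, excluding lm_head (vocabulary size not stated).
total_in_blocks = N_LAYERS * sum(block.values())
print(f"total across {N_LAYERS} blocks (excluding lm_head): {total_in_blocks:,}")
```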
    

Results

  • train/loss: 3.1028

Framework versions

  • PEFT 0.9.0