|
--- |
|
library_name: transformers |
|
license: apache-2.0 |
|
base_model: Hemanth-thunder/Tamil-Mistral-7B-v0.1 |
|
Pretrain_Model: mistralai/Mistral-7B-v0.1 |
|
tags: |
|
- Mistral |
|
- instruct |
|
- finetune |
|
- chatml |
|
- DPO |
|
- RLHF |
|
- gpt4 |
|
- synthetic data |
|
- distillation |
|
- function calling |
|
- json mode |
|
datasets: |
|
- Hemanth-thunder/tamil-instruction |
|
language: |
|
- ta |
|
widget: |
|
- example_title: Tamil Chat with LLM |
|
messages: |
|
- role: system |
|
content: >- |
|
சரியான பதிலுடன் வேலையை வெற்றிகரமாக முடிக்க, வழங்கப்பட்ட வழிகாட்டுதல்களைப் |
|
பின்பற்றி, தேவையான தகவலை உள்ளிடவும். |
|
- role: user |
|
content: மூன்று இயற்கை கூறுகளை குறிப்பிடவும். |
|
--- |
|
|
|
# Model Card for Tamil-Mistral-7B-Instruct-v0.1 |
|
|
|
The Tamil-Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of [Tamil-Mistral-7B-v0.1](https://huggingface.co/Hemanth-thunder/Tamil-Mistral-7B-v0.1).
|
Unlike the original English-centric Mistral model, which handles Tamil poorly, this model is fine-tuned specifically to understand and generate Tamil text. Starting from the Tamil-Mistral-7B base model, it has been instruction-tuned to follow prompts and hold conversations in Tamil, addressing a long-standing gap in open LLM support for the language.
|
|
|
# Dataset |
|
The Alpaca instruction dataset (400k examples), machine-translated into Tamil with Google Translate.
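
Each record follows the standard Alpaca schema (`instruction`, `input`, `output`). The Tamil values below are a hypothetical illustration of one translated record, not an actual row from the dataset:

```python
# Hypothetical example of one translated record (illustrative values only).
record = {
    "instruction": "மூன்று இயற்கை கூறுகளை குறிப்பிடவும்.",  # "Mention three natural elements."
    "input": "",  # many Alpaca records have no additional input
    "output": "காற்று, நீர், மண்.",  # "Air, water, soil."
}
```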
|
|
|
# Training time |
|
Fine-tuning took roughly 18 hours on a single NVIDIA RTX A6000 (48 GB) with a batch size of 30.
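
The training script itself is not published; as a rough, hypothetical sketch, a comparable causal-LM fine-tuning run with the stated batch size might be configured like this (every value other than the batch size is an assumption):

```python
from transformers import TrainingArguments

# Hypothetical configuration; only per_device_train_batch_size comes from this card.
training_args = TrainingArguments(
    output_dir="tamil-mistral-7b-instruct",
    per_device_train_batch_size=30,  # batch size reported above
    num_train_epochs=1,              # assumption
    learning_rate=2e-5,              # assumption
    bf16=True,                       # the RTX A6000 supports bfloat16
    logging_steps=100,
)
```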
|
|
|
## Kaggle demo link |
|
https://www.kaggle.com/code/hemanthkumar21/tamil-mistral-instruct-v0-1-demo/ |
|
|
|
```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TextStreamer, pipeline)
import torch

model_name = "Hemanth-thunder/Tamil-Mistral-7B-Instruct-v0.1"

# 4-bit NF4 quantization so the 7B model fits on a single GPU
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", quantization_config=nf4_config,
    use_cache=False, low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

streamer = TextStreamer(tokenizer)  # prints tokens as they are generated
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer,
                do_sample=True, repetition_penalty=1.15, top_p=0.95, streamer=streamer)

# create_prompt is defined in the "Python function to format query" section below
prompt = create_prompt("வாழ்க்கையில் ஆரோக்கியமாக இருப்பது எப்படி?")  # "How to stay healthy in life?"
result = pipe(prompt, max_length=512, pad_token_id=tokenizer.eos_token_id)
```
|
```
result:
- உடற்பயிற்சி - ஆரோக்கியமான உணவை உண்ணுங்கள் -2 புகைபிடிக்காதே - தவறாமல் உடற்பயிற்சி செய்</s>
```

(The response translates to: "Exercise - eat healthy food - don't smoke - exercise regularly.")
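
The pipeline returns the prompt plus the completion in `generated_text`; a small (hypothetical) post-processing step keeps only the answer:

```python
# Strip the prompt; everything after the Response header is the model's answer.
generated = result[0]["generated_text"]
answer = generated.split("### Response:")[-1].strip()
print(answer)
```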
|
## Instruction format |
|
|
|
To get the benefit of instruction fine-tuning, wrap your prompt in the `<s>` and `</s>` tokens and follow a template with three sections: Instruction, Input (optional), and Response. The Tamil Mistral instruct model is trained to converse using this structure. For example:
|
|
|
|
|
```python
# Without Input — the BOS token <s> is prepended later by create_prompt.
# The Tamil preamble reads: "To complete the task successfully with the
# correct answer, enter the required information."
prompt_template = """சரியான பதிலுடன் வேலையை வெற்றிகரமாக முடிக்க, தேவையான தகவலை உள்ளிடவும்.

### Instruction:
{}

### Response:"""

# With Input — the preamble additionally says "follow the provided guidelines".
prompt_template_with_input = """சரியான பதிலுடன் வேலையை வெற்றிகரமாக முடிக்க, வழங்கப்பட்ட வழிகாட்டுதல்களைப் பின்பற்றி, தேவையான தகவலை உள்ளிடவும்.

### Instruction:
{}

### Input:
{}

### Response:"""
```
|
|
|
## Python function to format query |
|
```python |
|
def create_prompt(query, prompt_template=prompt_template):
    """Wrap a query in the no-Input template and prepend the BOS token."""
    bos_token = "<s>"
    if not query:
        raise ValueError("Please provide a query")
    # The model emits the EOS token </s> itself at the end of its response,
    # so only the BOS token is prepended here.
    return bos_token + prompt_template.format(query)
|
``` |
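
The `create_prompt` helper above handles the no-Input template only; a companion helper for the two-field template could look like the sketch below (`create_prompt_with_input` is a hypothetical name, not part of the original card):

```python
def create_prompt_with_input(query, context, prompt_template=prompt_template_with_input):
    """Hypothetical helper for the 'with Input' template shown earlier."""
    if not (query and context):
        raise ValueError("Both query and context are required")
    return "<s>" + prompt_template.format(query, context)
```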
|
|
|
## Model Architecture |
|
This instruction model is based on Mistral-7B-v0.1, a transformer model with the following architecture choices: |
|
- Grouped-Query Attention |
|
- Sliding-Window Attention |
|
- Byte-fallback BPE tokenizer |
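
These settings can be inspected directly from the model's configuration; the values in the comments assume the standard Mistral-7B-v0.1 config:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Hemanth-thunder/Tamil-Mistral-7B-Instruct-v0.1")
print(config.num_attention_heads, config.num_key_value_heads)  # GQA: 32 query heads share 8 KV heads
print(config.sliding_window)  # sliding-window attention span (4096 for Mistral-7B)
```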
|
|
|
## Troubleshooting |
|
- If you see the following error: |
|
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/transformers/models/auto/auto_factory.py", line 482, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/transformers/models/auto/configuration_auto.py", line 1022, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/transformers/models/auto/configuration_auto.py", line 723, in __getitem__
    raise KeyError(key)
KeyError: 'mistral'
```
|
|
|
Installing transformers from source should solve the issue:

`pip install git+https://github.com/huggingface/transformers`
|
|
|
This should not be required after transformers-v4.33.4. |
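
A quick way to check the installed version before reinstalling:

```python
import transformers
print(transformers.__version__)  # versions after v4.33.4 include Mistral support
```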
|
|
|
## Limitations |
|
|
|
The Tamil-Mistral-7B-Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanisms. We look forward to engaging with the community on ways to
make the model respect guardrails more reliably, allowing for deployment in environments that require moderated outputs.
|
|
|
|
|
## Quantized Versions

Coming soon.
|
|
|
# How to Cite |
|
|
|
```bibtex
@misc{Tamil-Mistral-7B-Instruct-v0.1,
  url={https://huggingface.co/Hemanth-thunder/Tamil-Mistral-7B-Instruct-v0.1},
  title={Tamil-Mistral-7B-Instruct-v0.1},
  author={Hemanth Kumar}
}
|
``` |