--- license: mit --- This model is a part of two model series, AryaBhatta-1 and AryaBhatta-2 and is finetuned from HuggingFaceH4/zephyr-7b-gemma-v0.1 or Google/gemma and is finetuned on 9 Indian languages (Hindi, Tamil, Punjabi, Bengali, Gujarati, Oriya, Telugu, Kannada, Malayalam) plus English. There are two models. One finetuned on Google's Gemma and one fine-tuned on Zephyr's Gemma base. Repo for other one (Zephyr one): GenVRadmin/AryaBhatta-GemmaOrca-2-Merged To improve the resoning and maths skills, we first SFT tune the gemma on Microsoft's Orca datasets. We utilize Orca maths Hindi dataset: GenVRadmin/Aryabhatta-Orca-Maths-Hindi \ And original Orca maths dataset: microsoft/orca-math-word-problems-200k This pushes the MATHS score from 24.3 in Gemma-7B to 25.5 in Zephyr-Gemma and 31.6 in GemmaOrca. The model is then finetuned on GenVR's Samvaad datasets (GenVRadmin/Samvaad-Indic-Positive and GenVRadmin/Samvaad-Tamil-Mixtral and a subset of GenVRadmin/Samvaad-Mixed-Language-3). This is then finetuned on various open sourced datasets like: Telugu-LLM-Labs/yahma_alpaca_cleaned_telugu_filtered_and_romanized \ Telugu-LLM-Labs/teknium_GPTeacher_general_instruct_telugu_filtered_and_romanized \ abhinand/tamil-alpaca \ Tensoic/airoboros-3.2_kn \ Tensoic/gpt-teacher_kn \ Tensoic/Alpaca-Gujarati \ HydraIndicLM/bengali_alpaca_dolly_67k \ Open-Orca/OpenOrca \ pankajmathur/alpaca_orca \ OdiaGenAI/Odia_Alpaca_instructions_52k \ OdiaGenAI/gpt-teacher-roleplay-odia-3k \ GenVRadmin/Samvaad-Punjabi-Mini \ pankajmathur/WizardLM_Orca The model achieves following scores on benchmarks: Model AGIEval GPT4All TruthfulQA BigBench Average ⬇️ \ AryaBhatta-GemmaOrca 35.9 72.26 53.85 40.35 50.59 \ zephyr-7b-beta 37.52 71.77 55.26 39.77 51.08 \ zephyr-7b-gemma-v0.1 34.22 66.37 52.19 37.10 47.47 \ mlabonne/Gemmalpaca-7B 21.6 40.87 44.85 30.49 34.45 \ google/gemma-7b-it 21.33 40.84 41.70 30.25 33.53 How to use:- ``` from peft import AutoPeftModelForCausalLM from transformers import AutoTokenizer model = AutoPeftModelForCausalLM.from_pretrained( "GenVRadmin/AryaBhatta-GemmaOrca", load_in_4bit = False, token = hf_token ) tokenizer = AutoTokenizer.from_pretrained("GenVRadmin/AryaBhatta-GemmaOrca") input_prompt = """ ### Instruction: {} ### Input: {} ### Response: {}""" input_text = input_prompt.format( "Answer this question about India.", # instruction "Who is the Prime Minister of India", # input "", # output - leave this blank for generation! ) inputs = tokenizer([input_text], return_tensors = "pt").to("cuda") outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True) response = tokenizer.batch_decode(outputs)[0] ```