---
license: mit
datasets:
- tatsu-lab/alpaca
- yizhongw/self_instruct
- anon8231489123/ShareGPT_Vicuna_unfiltered
language:
- en
- es
metrics:
- accuracy
- bleu
pipeline_tag: text-generation
---

# Note

## The original LLaMA weights are not used in this model, so it is MIT licensed

I used the Alpaca prompting method:

```python
def prompt_to_instruction(instruction, input_=None, response_=None, eos='<|endoftext|>'):
    # Instruction-only prompt (no additional context)
    if input_ is None:
        st1_prompting = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n\n{instruction}\n\n'
    # Instruction paired with an input that provides further context
    else:
        st1_prompting = f'Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n\n{instruction}\n\n### Input:\n\n{input_}\n\n'
    # Append the response (with EOS) for training examples, or leave it open for generation
    resp = f'### Response:\n\n{response_}{eos}' if response_ is not None else '### Response:\n\n'
    return st1_prompting + resp
```

# Using the Model in Transformers

```python
import torch
from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM

# Load the tokenizer
tokenizer = LlamaTokenizer.from_pretrained("erfanzar/LGeM-7B")

# Generation config
generation_config = GenerationConfig(
    temperature=1,
    top_p=0.75,
    top_k=40,
    max_new_tokens=256,
    num_beams=4,
)

# Load the model in 8-bit (requires bitsandbytes)
model = LlamaForCausalLM.from_pretrained(
    "erfanzar/LGeM-7B",
    load_in_8bit=True,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Interactive loop; prompt_to_instruction is defined in the Note section above
while True:
    instruction = input('=> ')
    input_ = None
    prompt = prompt_to_instruction(instruction, input_)
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    input_ids = input_ids.to(model.device)
    with torch.no_grad():
        prediction = model.generate(
            input_ids=input_ids,
            return_dict_in_generate=True,
            generation_config=generation_config,
            output_scores=True,
        )
    response = tokenizer.decode(prediction.sequences[0], skip_special_tokens=True)
    print('\n\n\n')
    # Strip the echoed prompt and print only the generated response
    print(response[len(prompt)+1:])
    print('\n\n')
```

# Using the Model in OST

## [Open Source Transformers](https://github.com/erfanzar/OST-OpenSourceTransformers)

### LGeM 🚀

- What is LGeM? LGeM is a causal language model trained on self-instruct data (Alpaca data); for the initial training run of the main model (weights are available), I used the open-source Alpaca LoRA weights as the pretrained initialization.
- It is decoder-only.
- Built in PyTorch.
- You can simply import the model like:

```python
from modules import LGeMForCausalLM
```

- The training code is available in LGeM-train.py (check the source).
- Training parameters (see the sketch after this list):
  - learning rate: 1e-4
  - optimizer: AdamW (weight decay 1e-2)
  - batch size: 2
  - hardware: 4 × A100 80GB

```shell
python3 LGeM-train.py
```
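For reference, here is a minimal sketch of how the training parameters listed above could be wired together in plain PyTorch. The `train_loader`, the model construction, and the HF-style forward signature (returning a loss when `labels` are passed) are assumptions for illustration only; the actual training loop lives in LGeM-train.py in the OST repository.

```python
import torch
from torch.optim import AdamW

def train_sketch(model, train_loader, epochs=1, device='cuda'):
    """Hypothetical training loop using the hyperparameters from this card."""
    model.to(device)
    model.train()
    # Hyperparameters listed above: lr 1e-4, AdamW with weight decay 1e-2
    optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)
    for _ in range(epochs):
        for batch in train_loader:  # batches of size 2, per the card
            input_ids = batch['input_ids'].to(device)
            attention_mask = batch['attention_mask'].to(device)
            # Causal LM objective: labels are the input ids themselves
            # (assumes an HF-style forward that returns .loss given labels)
            outputs = model(input_ids=input_ids,
                            attention_mask=attention_mask,
                            labels=input_ids)
            outputs.loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```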