|
---
license: mit
datasets:
- tatsu-lab/alpaca
- yizhongw/self_instruct
- anon8231489123/ShareGPT_Vicuna_unfiltered
language:
- en
- es
metrics:
- accuracy
- bleu
pipeline_tag: text-generation
---
|
|
|
# Note |
|
## Original LLaMA weights are not used in this model, so it is MIT licensed
|
|
|
This model uses the Alpaca prompting method:
|
|
|
```python
def prompt_to_instruction(instruction, input_=None, response_=None, eos='<|endoftext|>'):
    # Build an Alpaca-style prompt: a header, the instruction, an optional
    # input section for extra context, and a response section. When a response
    # is supplied (e.g. for training), it is terminated with the EOS token.
    if input_ is None:
        st1_prompting = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n\n{instruction}\n\n'
    else:
        st1_prompting = f'Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n\n{instruction}\n\n### Input:\n\n{input_}\n\n'
    resp = f'### Response:\n\n{response_}{eos}' if response_ is not None else '### Response:\n\n'
    return st1_prompting + resp
```
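
For example, calling the helper with only an instruction yields the following prompt (output shown in comments):

```python
prompt = prompt_to_instruction('Explain what a tokenizer does.')
print(prompt)
# Below is an instruction that describes a task. Write a response that appropriately completes the request.
#
# ### Instruction:
#
# Explain what a tokenizer does.
#
# ### Response:
#
```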
|
|
|
# Using the Model in Transformers
|
|
|
```python
import torch
from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM

# Loading the tokenizer
tokenizer = LlamaTokenizer.from_pretrained("erfanzar/LGeM-7B")

# Generation config
generation_config = GenerationConfig(
    temperature=1,
    top_p=0.75,
    top_k=40,
    max_new_tokens=256,
    num_beams=4,
)

# Loading the model in 8-bit precision, sharded across available devices
model = LlamaForCausalLM.from_pretrained(
    "erfanzar/LGeM-7B",
    load_in_8bit=True,
    device_map="auto",
    torch_dtype=torch.float16,
)

while True:
    instruction = input('=> ')
    input_ = None

    prompt = prompt_to_instruction(instruction, input_)
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    input_ids = input_ids.to(model.device)

    with torch.no_grad():
        prediction = model.generate(
            input_ids=input_ids,
            return_dict_in_generate=True,
            generation_config=generation_config,
            output_scores=True,
        )

    # Decode and strip the prompt so only the model's response is printed
    response = tokenizer.decode(prediction.sequences[0], skip_special_tokens=True)
    print('\n\n\n')
    print(response[len(prompt) + 1:])
    print('\n\n')
```
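
Note that `load_in_8bit=True` depends on the `bitsandbytes` package, `device_map="auto"` depends on `accelerate`, and `LlamaTokenizer` needs `sentencepiece`. A typical install (exact versions are not pinned by this card) would be:

```shell
pip install transformers accelerate bitsandbytes sentencepiece
```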
|
|
|
|
|
# Using the Model in OST
|
|
|
## [Open Source Transformers](https://github.com/erfanzar/OST-OpenSourceTransformers) |
|
|
|
### LGeM
|
|
|
- What is LGeM? LGeM is a causal language model trained on self-instruct data (the Alpaca data); for the first training run of the main model (weights are available), it was initialized from the open-source Alpaca LoRA weights.
|
|
|
- it's decoder-only
- built in PyTorch
- you can simply import the model like:
|
|
|
```python |
|
from modules import LGeMForCausalLM |
|
``` |
|
|
|
- the training code is available in LGeM-train.py (check the source repository)
- training parameters (see the optimizer sketch after the launch command below):
  - learning rate: 1e-4
  - optimizer: AdamW (weight decay 1e-2)
  - batch size: 2
  - hardware: 4x A100 80GB GPUs
|
```shell
python3 LGeM-train.py
```
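
For reference, here is a minimal sketch of an optimizer configured with the hyperparameters listed above; the stand-in module is an assumption for illustration only, and the actual training logic lives in LGeM-train.py:

```python
import torch

# Stand-in module for illustration only; the real script (LGeM-train.py)
# optimizes LGeMForCausalLM.
model = torch.nn.Linear(8, 8)

# AdamW with the hyperparameters stated above: lr 1e-4, weight decay 1e-2.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)

# The stated batch size of 2 would be set on the DataLoader.
batch_size = 2
```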