IlyaGusev committed

Commit 4ff4f47
2 Parent(s): ba7ade3 52e90df

Merge branch 'main' of https://huggingface.co/IlyaGusev/saiga2_13b_lora

Files changed (1):
  1. README.md +130 -14
README.md CHANGED
@@ -1,20 +1,136 @@
  ---
- library_name: peft
  ---
- ## Training procedure
-
-
- The following `bitsandbytes` quantization config was used during training:
- - load_in_8bit: True
- - load_in_4bit: False
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: fp4
- - bnb_4bit_use_double_quant: False
- - bnb_4bit_compute_dtype: float32
- ### Framework versions
-
-
- - PEFT 0.5.0.dev0

  ---
+ datasets:
+ - IlyaGusev/ru_turbo_alpaca
+ - IlyaGusev/ru_turbo_saiga
+ - IlyaGusev/ru_sharegpt_cleaned
+ - IlyaGusev/oasst1_ru_main_branch
+ - IlyaGusev/ru_turbo_alpaca_evol_instruct
+ - lksy/ru_instruct_gpt4
+ language:
+ - ru
+ pipeline_tag: conversational
+ license: cc-by-4.0
  ---
 
+ # Saiga2 13B, Russian LLaMA-based chatbot
+ 
+ Based on [LLaMA-2 13B HF](https://huggingface.co/meta-llama/Llama-2-13b-hf).
+ 
+ This is an adapter-only version.
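+ 
+ Since only the LoRA adapter is published, you can also merge it into the base model to get a standalone checkpoint. Below is a minimal sketch using `peft`'s `merge_and_unload` (the output directory name is illustrative, not something this repo ships); the base model is loaded in fp16 rather than 8-bit because merging needs unquantized weights:
+ 
+ ```python
+ import torch
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM
+ 
+ base = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Llama-2-13b-hf",
+     torch_dtype=torch.float16,
+     device_map="auto"
+ )
+ # Apply the adapter, then fold its weights into the base model
+ merged = PeftModel.from_pretrained(base, "IlyaGusev/saiga2_13b_lora").merge_and_unload()
+ merged.save_pretrained("saiga2_13b_merged")  # illustrative output path
+ ```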
+ 
+ Training code: [link](https://github.com/IlyaGusev/rulm/tree/master/self_instruct)
+ 
+ **WARNING 1**: Run with the development versions of `transformers` and `peft` (for example, installed from their GitHub main branches)!
+ 
+ **WARNING 2**: Avoid running on a V100 (in Colab, for example): outputs are much worse on that GPU.
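+ 
+ If you want to detect this situation at run time, one option (an illustrative check, not part of the original card) is to inspect the CUDA compute capability; the V100 reports 7.0:
+ 
+ ```python
+ import torch
+ 
+ if torch.cuda.is_available():
+     major, minor = torch.cuda.get_device_capability()
+     if (major, minor) <= (7, 0):  # V100 is (7, 0)
+         print("Warning: V100-class GPU detected; generation quality may be much worse.")
+ ```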
+ 
+ ```python
+ import torch
+ from peft import PeftModel, PeftConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
+ 
+ MODEL_NAME = "IlyaGusev/saiga2_13b_lora"
+ DEFAULT_MESSAGE_TEMPLATE = "<s>{role}\n{content}</s>\n"
+ # "You are Saiga, a Russian-language automatic assistant. You talk to people and help them."
+ DEFAULT_SYSTEM_PROMPT = "Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им."
+ 
+ 
+ class Conversation:
+     def __init__(
+         self,
+         message_template=DEFAULT_MESSAGE_TEMPLATE,
+         system_prompt=DEFAULT_SYSTEM_PROMPT,
+         start_token_id=1,
+         bot_token_id=9225
+     ):
+         self.message_template = message_template
+         self.start_token_id = start_token_id
+         self.bot_token_id = bot_token_id
+         self.messages = [{
+             "role": "system",
+             "content": system_prompt
+         }]
+ 
+     def get_start_token_id(self):
+         return self.start_token_id
+ 
+     def get_bot_token_id(self):
+         return self.bot_token_id
+ 
+     def add_user_message(self, message):
+         self.messages.append({
+             "role": "user",
+             "content": message
+         })
+ 
+     def add_bot_message(self, message):
+         self.messages.append({
+             "role": "bot",
+             "content": message
+         })
+ 
+     def get_prompt(self, tokenizer):
+         final_text = ""
+         for message in self.messages:
+             message_text = self.message_template.format(**message)
+             final_text += message_text
+         # Open the bot turn so the model continues as the assistant
+         final_text += tokenizer.decode([self.start_token_id, self.bot_token_id])
+         return final_text.strip()
+ 
+ 
+ def generate(model, tokenizer, prompt, generation_config):
+     data = tokenizer(prompt, return_tensors="pt")
+     data = {k: v.to(model.device) for k, v in data.items()}
+     output_ids = model.generate(
+         **data,
+         generation_config=generation_config
+     )[0]
+     # Keep only the newly generated tokens, dropping the prompt
+     output_ids = output_ids[len(data["input_ids"][0]):]
+     output = tokenizer.decode(output_ids, skip_special_tokens=True)
+     return output.strip()
+ 
+ 
+ config = PeftConfig.from_pretrained(MODEL_NAME)
+ model = AutoModelForCausalLM.from_pretrained(
+     config.base_model_name_or_path,
+     load_in_8bit=True,
+     torch_dtype=torch.float16,
+     device_map="auto"
+ )
+ model = PeftModel.from_pretrained(
+     model,
+     MODEL_NAME,
+     torch_dtype=torch.float16
+ )
+ model.eval()
+ 
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
+ generation_config = GenerationConfig.from_pretrained(MODEL_NAME)
+ print(generation_config)
+ 
+ # "Why is grass green?", "Write a long story, making sure to mention the following objects. Given: Tanya, a ball"
+ inputs = ["Почему трава зеленая?", "Сочини длинный рассказ, обязательно упоминая следующие объекты. Дано: Таня, мяч"]
+ for inp in inputs:
+     conversation = Conversation()
+     conversation.add_user_message(inp)
+     prompt = conversation.get_prompt(tokenizer)
+ 
+     output = generate(model, tokenizer, prompt, generation_config)
+     print(inp)
+     print(output)
+     print()
+     print("==============================")
+     print()
+ ```
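+ 
+ For reference, with the default template the prompt for a single user message renders roughly as follows (assuming token id 9225 decodes to `bot`, per `bot_token_id` above):
+ 
+ ```
+ <s>system
+ Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им.</s>
+ <s>user
+ Почему трава зеленая?</s>
+ <s>bot
+ ```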
+ 
+ Examples:
+ ```
+ User: Почему трава зеленая?
+ Saiga:
+ ```
+ 
+ ```
+ User: Сочини длинный рассказ, обязательно упоминая следующие объекты. Дано: Таня, мяч
+ Saiga:
+ ```
+ 
+ v1:
+ - dataset code revision 7712a061d993f61c49b1e2d992e893c48acb3a87
+ - wandb [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/848s9kbi)
+ - 7 datasets: ru_turbo_alpaca, ru_turbo_saiga, ru_sharegpt_cleaned, oasst1_ru_main_branch, gpt_roleplay_realm, ru_turbo_alpaca_evol_instruct (iteration 1/2), ru_instruct_gpt4
+ - Datasets merging script: [create_chat_set.py](https://github.com/IlyaGusev/rulm/blob/e4238fd9a196405b566a2d5838ab44b7a0f4dc31/self_instruct/src/data_processing/create_chat_set.py)