added training and peft params
README.md

---
library_name: peft
license: apache-2.0
datasets:
- knkarthick/dialogsum
pipeline_tag: summarization
---

## Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Quantization config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
)

model_name = "TinyPixel/Llama-2-7B-bf16-sharded"

# Load the base model with the quantization config
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, return_token_type_ids=False)
tokenizer.pad_token = tokenizer.eos_token

# Load the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(model, "shenoy/DialogSumLlama2_qlora", device_map="auto")

text = """### Instruction:
Write a concise summary of the below input text. Return your response in bullet points which cover the key points of the text.
### Input:
#Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to communicate with their clients.
#Person1#: They will just have to change their communication methods. I don't want any - one using Instant Messaging in this office. It wastes too much time! Now, please continue with the memo. Where were we?
#Person2#: This applies to internal and external communications.
#Person1#: Yes. Any employee who persists in using Instant Messaging will first receive a warning and be placed on probation. At second offense, the employee will face termination. Any questions regarding this new policy may be directed to department heads.
#Person2#: Is that all?
#Person1#: Yes. Please get this memo typed up and distributed to all employees before 4 pm.
### Response :"""

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], max_new_tokens=100, repetition_penalty=1.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
59 |
+
---
|
60 |
+
|
61 |
## Training procedure
|
62 |
|
63 |
+
Training Configuration:
|
64 |
+
|
65 |
+
- `per_device_train_batch_size`: 4
|
66 |
+
- `gradient_accumulation_steps`: 4
|
67 |
+
- `optim`: "paged_adamw_8bit"
|
68 |
+
- `learning_rate`: 2e-4
|
69 |
+
- `lr_scheduler_type`: "linear"
|
70 |
+
- `save_strategy`: "epoch"
|
71 |
+
- `logging_steps`: 10
|
72 |
+
- `num_train_epochs`: 2
|
73 |
+
- `max_steps`: 50
|
74 |
+
- `fp16`: True
|
75 |
+
|
76 |
+
LORA Configuration:
|
77 |
+
|
78 |
+
- `lora_alpha`: 16
|
79 |
+
- `lora_dropout`: 0.05
|
80 |
+
- `target_modules`: ["q_proj", "v_proj"]
|
81 |
+
- `r`: 8
|
82 |
+
- `bias`: "none"
|
83 |
+
- `task_type`: "CAUSAL_LM"
|
84 |
|
85 |
The following `bitsandbytes` quantization config was used during training:
|
86 |
- load_in_8bit: False
|
|
|
92 |
- bnb_4bit_quant_type: nf4
|
93 |
- bnb_4bit_use_double_quant: False
|
94 |
- bnb_4bit_compute_dtype: float16
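
Expressed as a `BitsAndBytesConfig` object, these training-time settings would look roughly like the sketch below; `load_in_4bit=True` is inferred from the usage example rather than stated in the list above.

```python
from transformers import BitsAndBytesConfig

# Sketch of the training-time quantization settings listed above.
train_bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # inferred from the usage example
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype="float16",
)
```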

### Framework versions

- `accelerate` 0.21.0
- `peft` 0.4.0
- `bitsandbytes` 0.40.2
- `transformers` 4.30.2
- `trl` 0.4.7