shahidul034 committed
Commit 9bc7370 · 1 Parent(s): 9e77dc2

Update README.md

Files changed (1)
  1. README.md +43 -6
README.md CHANGED
@@ -7,7 +7,8 @@ model-index:
  - name: KUETLLM_zephyr
  results: []
  ---
-
+ KUETLLM is a [zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) finetune, trained on a dataset of prompts and answers about Khulna University of Engineering and Technology (KUET).
+ It was loaded in 8-bit quantization using [bitsandbytes](https://github.com/TimDettmers/bitsandbytes). [LoRA](https://huggingface.co/docs/diffusers/main/en/training/lora) was used to finetune an adapter, which was later merged with the base unquantized model.
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
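The summary in this hunk describes a three-step pipeline: load the base model in 8-bit, train a LoRA adapter, and merge the adapter back into the unquantized base. As a minimal sketch of the final merge step with `peft` and `transformers` (the adapter path and output directory below are placeholders, not taken from this commit):

```
# Minimal sketch (not the author's script): merge a trained LoRA adapter into the
# unquantized base model. The adapter path and output directory are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "HuggingFaceH4/zephyr-7b-beta"

# Load the base model in half precision; merging needs unquantized weights.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach the finetuned LoRA adapter, then fold its weights into the base layers.
model = PeftModel.from_pretrained(base, "kuetllm-lora-adapter")  # placeholder path
merged = model.merge_and_unload()

merged.save_pretrained("KUETLLM_zephyr-merged")   # placeholder output directory
tokenizer.save_pretrained("KUETLLM_zephyr-merged")
```

`merge_and_unload()` folds the adapter weights into the base linear layers, so the saved model can later be loaded without `peft` installed.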
 
 
@@ -17,11 +18,29 @@ This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://hug

  ## Model description

- More information needed
-
- ## Intended uses & limitations
-
- More information needed
+ Below are the training configurations for the finetuning process:
+ ```
+ LoraConfig:
+     r=16,
+     lora_alpha=16,
+     target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
+     lora_dropout=0.05,
+     bias="none",
+     task_type="CAUSAL_LM"
+ ```
+ ```
+ TrainingArguments:
+     per_device_train_batch_size=12,
+     gradient_accumulation_steps=1,
+     optim='paged_adamw_8bit',
+     learning_rate=5e-06,
+     fp16=True,
+     logging_steps=10,
+     num_train_epochs=1,
+     output_dir=zephyr_lora_output,
+     remove_unused_columns=False,
+ ```

  ## Training and evaluation data
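As a rough illustration of how the listed values plug into `peft` and `transformers`, here is a hedged sketch of the training setup; the base model id and the dataset/Trainer wiring are assumptions, since the card only gives the two configuration blocks above:

```
# Hedged sketch of the training setup implied by the card; only the two config
# blocks come from the card, everything else is an assumption.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

zephyr_lora_output = "zephyr_lora_output"  # output directory name used in the card

# Base model loaded in 8-bit, as described in the model summary.
model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceH4/zephyr-7b-beta",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    per_device_train_batch_size=12,
    gradient_accumulation_steps=1,
    optim="paged_adamw_8bit",
    learning_rate=5e-06,
    fp16=True,
    logging_steps=10,
    num_train_epochs=1,
    output_dir=zephyr_lora_output,
    remove_unused_columns=False,
)
# A Trainer (or trl's SFTTrainer) would then be built with training_args and the
# KUET prompt/answer dataset, which is not included in this card.
```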
 
 
@@ -43,7 +62,25 @@ The following hyperparameters were used during training:
  - num_epochs: 2
  - mixed_precision_training: Native AMP

- ### Training results
+ ### Inference
+ ```
+ def process_data_sample(example):
+     processed_example = "<|system|>\nYou are a KUET authority managed chatbot, help users by answering their queries about KUET.\n<|user|>\n" + example + "\n<|assistant|>\n"
+     return processed_example
+
+ inp_str = process_data_sample("Tell me about KUET.")
+ inputs = tokenizer(inp_str, return_tensors="pt")
+ generation_config = GenerationConfig(
+     do_sample=True,
+     top_k=1,
+     temperature=0.1,
+     max_new_tokens=256,
+     pad_token_id=tokenizer.eos_token_id
+ )
+
+ outputs = model.generate(**inputs, generation_config=generation_config)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
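The inference snippet assumes `model`, `tokenizer`, and `GenerationConfig` are already available. One way to set them up, assuming the merged model is published under a repo id such as `shahidul034/KUETLLM_zephyr` (the exact id is not stated in this commit):

```
# Hypothetical setup for the snippet above; the repo id is an assumption,
# not stated anywhere in this commit.
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

repo_id = "shahidul034/KUETLLM_zephyr"  # placeholder: replace with the actual merged-model repo

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
# For GPU inference, move both the model and the tokenized inputs to "cuda".
```

With `top_k=1` and a low temperature, sampling is effectively greedy, which keeps answers about KUET close to deterministic.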