tyzhu/lmind_nq_train6000_eval6489_v1_qa_3e-4_lora2

tyzhu committed on Jun 7, 2024

Commit 82dde3b (verified)

1 Parent(s): 0c675cc

Model save

Files changed (1)

README.md +117 -0

README.md ADDED

@@ -0,0 +1,117 @@

+---
+license: other
+base_model: Qwen/Qwen1.5-4B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: lmind_nq_train6000_eval6489_v1_qa_3e-4_lora2
+  results: []
+library_name: peft
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# lmind_nq_train6000_eval6489_v1_qa_3e-4_lora2
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.5245
+- Accuracy: 0.5456
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0003
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
+### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
+|:-------------:|:-------:|:----:|:---------------:|:--------:|
+| 1.7259        | 0.9973  | 187  | 1.6105          | 0.5753   |
+| 1.3358        | 2.0     | 375  | 1.6465          | 0.5723   |
+| 0.9854        | 2.9973  | 562  | 1.7621          | 0.5708   |
+| 0.7508        | 4.0     | 750  | 1.9381          | 0.5663   |
+| 0.6396        | 4.9973  | 937  | 1.9926          | 0.5634   |
+| 0.5833        | 6.0     | 1125 | 2.1015          | 0.5612   |
+| 0.5567        | 6.9973  | 1312 | 2.1645          | 0.5607   |
+| 0.5411        | 8.0     | 1500 | 2.2040          | 0.5614   |
+| 0.5028        | 8.9973  | 1687 | 2.2365          | 0.5608   |
+| 0.5041        | 10.0    | 1875 | 2.2862          | 0.5605   |
+| 0.4991        | 10.9973 | 2062 | 2.2851          | 0.5603   |
+| 0.5048        | 12.0    | 2250 | 2.2455          | 0.5610   |
+| 0.5067        | 12.9973 | 2437 | 2.2589          | 0.5592   |
+| 0.508         | 14.0    | 2625 | 2.2631          | 0.5584   |
+| 0.514         | 14.9973 | 2812 | 2.2773          | 0.5564   |
+| 0.5149        | 16.0    | 3000 | 2.2861          | 0.5576   |
+| 0.4835        | 16.9973 | 3187 | 2.2663          | 0.5588   |
+| 0.484         | 18.0    | 3375 | 2.3145          | 0.5575   |
+| 0.4862        | 18.9973 | 3562 | 2.2949          | 0.5559   |
+| 0.4871        | 20.0    | 3750 | 2.3217          | 0.5581   |
+| 0.4902        | 20.9973 | 3937 | 2.3256          | 0.5572   |
+| 0.492         | 22.0    | 4125 | 2.3584          | 0.5558   |
+| 0.4937        | 22.9973 | 4312 | 2.3608          | 0.5558   |
+| 0.492         | 24.0    | 4500 | 2.3685          | 0.5552   |
+| 0.4728        | 24.9973 | 4687 | 2.3752          | 0.5543   |
+| 0.4753        | 26.0    | 4875 | 2.3276          | 0.5557   |
+| 0.4788        | 26.9973 | 5062 | 2.4252          | 0.5542   |
+| 0.4812        | 28.0    | 5250 | 2.3812          | 0.5551   |
+| 0.4849        | 28.9973 | 5437 | 2.4413          | 0.5523   |
+| 0.4872        | 30.0    | 5625 | 2.3946          | 0.5526   |
+| 0.488         | 30.9973 | 5812 | 2.3911          | 0.5526   |
+| 0.4864        | 32.0    | 6000 | 2.4076          | 0.5517   |
+| 0.4667        | 32.9973 | 6187 | 2.4808          | 0.5505   |
+| 0.4694        | 34.0    | 6375 | 2.4784          | 0.5523   |
+| 0.4703        | 34.9973 | 6562 | 2.4760          | 0.5521   |
+| 0.4704        | 36.0    | 6750 | 2.5062          | 0.5519   |
+| 0.4761        | 36.9973 | 6937 | 2.4947          | 0.5525   |
+| 0.4802        | 38.0    | 7125 | 2.4657          | 0.5496   |
+| 0.4861        | 38.9973 | 7312 | 2.4472          | 0.5504   |
+| 0.4875        | 40.0    | 7500 | 2.4841          | 0.5489   |
+| 0.4681        | 40.9973 | 7687 | 2.4855          | 0.5484   |
+| 0.4661        | 42.0    | 7875 | 2.5166          | 0.5491   |
+| 0.47          | 42.9973 | 8062 | 2.5159          | 0.5487   |
+| 0.4679        | 44.0    | 8250 | 2.5625          | 0.5490   |
+| 0.4688        | 44.9973 | 8437 | 2.4849          | 0.5482   |
+| 0.4699        | 46.0    | 8625 | 2.5193          | 0.5486   |
+| 0.4724        | 46.9973 | 8812 | 2.5711          | 0.5462   |
+| 0.4753        | 48.0    | 9000 | 2.5664          | 0.5465   |
+| 0.4577        | 48.9973 | 9187 | 2.5205          | 0.5466   |
+| 0.4642        | 49.8667 | 9350 | 2.5245          | 0.5456   |
+### Framework versions
+- PEFT 0.5.0
+- Transformers 4.41.1
+- Pytorch 2.1.0+cu121
+- Datasets 2.19.1
+- Tokenizers 0.19.1
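
The hyperparameters and the step/epoch columns in the README added by this commit can be cross-checked with a little arithmetic. The sketch below is only a consistency check, not the Trainer's own bookkeeping. It recomputes `total_train_batch_size` from the listed per-device batch size, device count, and gradient accumulation steps, then converts optimizer steps to fractional epochs. The training-set size of 6000 is an assumption inferred solely from `train6000` in the model name; the README itself says the dataset is unknown. `epoch_at_step` is a hypothetical helper introduced for illustration.

```python
# Values copied from the "Training hyperparameters" list in the README.
train_batch_size = 1              # per-device batch size
num_devices = 4
gradient_accumulation_steps = 8

# Effective number of examples consumed per optimizer step.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

# ASSUMPTION: 6000 training examples, guessed from "train6000" in the
# model name. The README does not state the dataset or its size.
assumed_train_examples = 6000
steps_per_epoch = assumed_train_examples / total_train_batch_size


def epoch_at_step(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps (hypothetical helper)."""
    return round(step / steps_per_epoch, 4)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("steps_per_epoch:", steps_per_epoch)
    # Steps taken from the first, second, and last rows of the results table.
    for step in (187, 375, 9350):
        print(step, "->", epoch_at_step(step))
```

Under that assumption the computed batch size matches the listed `total_train_batch_size` of 32, and the computed epochs for steps 187, 375, and 9350 match the table's 0.9973, 2.0, and 49.8667. That agreement suggests, but does not prove, that the run used about 6000 training examples.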