RedaAlami committed
Commit 6195168 · verified · 1 Parent(s): 7423e1b

Model save

Files changed (4):
  1. README.md +50 -8
  2. all_results.json +6 -6
  3. train_results.json +6 -6
  4. trainer_state.json +0 -0
README.md CHANGED
@@ -1,7 +1,6 @@
  ---
- base_model: tiiuae/falcon-mamba-7b-instruct
+ base_model: TII-Frontier-Team/falcon3-3b-instruct
  library_name: peft
- license: other
  tags:
  - trl
  - dpo
@@ -16,7 +15,17 @@ should probably proofread and complete it, then remove this comment. -->
 
  # zephyr-7b-dpo-qlora
 
- This model is a fine-tuned version of [tiiuae/falcon-mamba-7b-instruct](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct) on an unknown dataset.
+ This model is a fine-tuned version of [TII-Frontier-Team/falcon3-3b-instruct](https://huggingface.co/TII-Frontier-Team/falcon3-3b-instruct) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0287
+ - Rewards/chosen: -4.6985
+ - Rewards/rejected: -10.6531
+ - Rewards/accuracies: 0.9276
+ - Rewards/margins: 5.9547
+ - Logps/rejected: -1101.2122
+ - Logps/chosen: -502.6163
+ - Logits/rejected: 1.9469
+ - Logits/chosen: 2.1464
 
  ## Model description
 
@@ -51,12 +60,45 @@ The following hyperparameters were used during training:
 
  ### Training results
 
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.6914 | 0.0315 | 100 | 0.6912 | 0.0006 | -0.0036 | 0.6340 | 0.0042 | -36.2582 | -32.7125 | -1.6841 | -1.6367 |
+ | 0.6743 | 0.0629 | 200 | 0.6753 | -0.0009 | -0.0462 | 0.6321 | 0.0454 | -40.5232 | -32.8573 | -1.5154 | -1.4649 |
+ | 0.6112 | 0.0944 | 300 | 0.5905 | -0.5010 | -0.8365 | 0.6631 | 0.3356 | -119.5518 | -82.8670 | -0.5166 | -0.4325 |
+ | 0.4477 | 0.1258 | 400 | 0.4026 | -1.9267 | -3.0850 | 0.7201 | 1.1583 | -344.3972 | -225.4428 | -0.5023 | -0.3494 |
+ | 0.3583 | 0.1573 | 500 | 0.3063 | -2.4869 | -4.1367 | 0.7646 | 1.6498 | -449.5698 | -281.4605 | 0.3124 | 0.4717 |
+ | 0.3041 | 0.1887 | 600 | 0.2405 | -2.9070 | -4.9732 | 0.7918 | 2.0662 | -533.2189 | -323.4665 | 0.9644 | 1.1113 |
+ | 0.2487 | 0.2202 | 700 | 0.1964 | -3.4123 | -5.8172 | 0.8209 | 2.4050 | -617.6231 | -373.9985 | 1.1343 | 1.2933 |
+ | 0.218 | 0.2517 | 800 | 0.1547 | -3.6771 | -6.6251 | 0.8336 | 2.9480 | -698.4094 | -400.4795 | 1.5710 | 1.7290 |
+ | 0.1858 | 0.2831 | 900 | 0.1394 | -3.5484 | -6.6808 | 0.8485 | 3.1324 | -703.9799 | -387.6123 | 1.6988 | 1.8631 |
+ | 0.173 | 0.3146 | 1000 | 0.1176 | -3.4824 | -6.7705 | 0.8649 | 3.2881 | -712.9531 | -381.0118 | 1.8190 | 1.9776 |
+ | 0.1494 | 0.3460 | 1100 | 0.0979 | -3.7942 | -7.4529 | 0.8713 | 3.6587 | -781.1857 | -412.1861 | 1.8179 | 1.9865 |
+ | 0.149 | 0.3775 | 1200 | 0.0817 | -4.1856 | -8.2504 | 0.8843 | 4.0648 | -860.9355 | -451.3316 | 1.8715 | 2.0581 |
+ | 0.1143 | 0.4089 | 1300 | 0.0702 | -4.2444 | -8.6154 | 0.8884 | 4.3710 | -897.4431 | -457.2141 | 1.7765 | 1.9770 |
+ | 0.1204 | 0.4404 | 1400 | 0.0642 | -4.1442 | -8.6112 | 0.8966 | 4.4670 | -897.0154 | -447.1863 | 2.1996 | 2.3734 |
+ | 0.1013 | 0.4718 | 1500 | 0.0580 | -4.5031 | -9.1159 | 0.8951 | 4.6128 | -947.4904 | -483.0838 | 1.9514 | 2.1364 |
+ | 0.1011 | 0.5033 | 1600 | 0.0567 | -4.0373 | -8.5779 | 0.9067 | 4.5406 | -893.6846 | -436.5011 | 1.9239 | 2.1103 |
+ | 0.0853 | 0.5348 | 1700 | 0.0482 | -4.3119 | -9.2927 | 0.9067 | 4.9808 | -965.1708 | -463.9637 | 2.0648 | 2.2336 |
+ | 0.0897 | 0.5662 | 1800 | 0.0449 | -4.3018 | -9.4275 | 0.9101 | 5.1257 | -978.6490 | -462.9552 | 1.9037 | 2.0822 |
+ | 0.0717 | 0.5977 | 1900 | 0.0402 | -4.4391 | -9.8395 | 0.9112 | 5.4004 | -1019.8445 | -476.6779 | 2.0003 | 2.1749 |
+ | 0.0487 | 0.6291 | 2000 | 0.0368 | -5.4728 | -11.3180 | 0.9078 | 5.8452 | -1167.6968 | -580.0486 | 1.9355 | 2.1422 |
+ | 0.0683 | 0.6606 | 2100 | 0.0356 | -4.6736 | -10.2835 | 0.9190 | 5.6099 | -1064.2465 | -500.1268 | 2.0206 | 2.2058 |
+ | 0.0514 | 0.6920 | 2200 | 0.0341 | -4.6025 | -10.2228 | 0.9209 | 5.6203 | -1058.1812 | -493.0187 | 1.9362 | 2.1272 |
+ | 0.0623 | 0.7235 | 2300 | 0.0326 | -4.9398 | -10.7061 | 0.9213 | 5.7663 | -1106.5096 | -526.7491 | 1.8240 | 2.0327 |
+ | 0.0693 | 0.7550 | 2400 | 0.0313 | -4.8024 | -10.6310 | 0.9231 | 5.8286 | -1098.9999 | -513.0095 | 1.8580 | 2.0583 |
+ | 0.0543 | 0.7864 | 2500 | 0.0303 | -4.8132 | -10.7352 | 0.9228 | 5.9221 | -1109.4199 | -514.0873 | 1.9534 | 2.1471 |
+ | 0.0555 | 0.8179 | 2600 | 0.0301 | -4.7251 | -10.5626 | 0.9261 | 5.8375 | -1092.1620 | -505.2810 | 1.9398 | 2.1357 |
+ | 0.0646 | 0.8493 | 2700 | 0.0294 | -4.6930 | -10.6307 | 0.9261 | 5.9377 | -1098.9694 | -502.0694 | 2.0003 | 2.1947 |
+ | 0.0546 | 0.8808 | 2800 | 0.0287 | -4.8085 | -10.8169 | 0.9250 | 6.0084 | -1117.5887 | -513.6258 | 1.9596 | 2.1607 |
+ | 0.0702 | 0.9122 | 2900 | 0.0288 | -4.6970 | -10.6904 | 0.9243 | 5.9934 | -1104.9371 | -502.4718 | 1.9696 | 2.1647 |
+ | 0.0623 | 0.9437 | 3000 | 0.0286 | -4.7098 | -10.6743 | 0.9269 | 5.9645 | -1103.3302 | -503.7507 | 1.9440 | 2.1437 |
+ | 0.0593 | 0.9751 | 3100 | 0.0287 | -4.6985 | -10.6531 | 0.9276 | 5.9547 | -1101.2122 | -502.6163 | 1.9469 | 2.1464 |
 
 
  ### Framework versions
 
- - PEFT 0.12.0
- - Transformers 4.45.0.dev0
- - Pytorch 2.4.0+cu121
- - Datasets 2.21.0
- - Tokenizers 0.19.1
+ - PEFT 0.13.0
+ - Transformers 4.45.1
+ - Pytorch 2.4.1+cu121
+ - Datasets 3.0.1
+ - Tokenizers 0.20.0
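For readers of the table above: in TRL's DPOTrainer, the Rewards/* columns are implicit rewards, i.e. beta-scaled log-probability ratios between the trained policy and the frozen reference model, and Logps/* are the policy's summed log-probabilities of each completion. Below is a minimal sketch of these definitions, assuming the standard sigmoid DPO loss; the beta used for this run is not shown in this diff, so 0.1 is a placeholder.

```python
# Sketch of how the DPO metrics reported above are defined (standard sigmoid
# DPO loss). beta=0.1 is a placeholder; the run's actual beta is not in the diff.
import torch
import torch.nn.functional as F

def dpo_metrics(policy_chosen_logps: torch.Tensor,
                policy_rejected_logps: torch.Tensor,
                ref_chosen_logps: torch.Tensor,
                ref_rejected_logps: torch.Tensor,
                beta: float = 0.1):
    # Implicit rewards: beta * log-ratio of policy to reference probabilities.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)        # Rewards/chosen
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)  # Rewards/rejected
    margins = chosen_rewards - rejected_rewards                             # Rewards/margins
    accuracy = (margins > 0).float().mean()                                 # Rewards/accuracies
    loss = -F.logsigmoid(margins).mean()                                    # DPO loss
    return loss, chosen_rewards.mean(), rejected_rewards.mean(), margins.mean(), accuracy
```

This matches the trajectory in the table: as training progresses the margin grows and the loss falls, even though both chosen and rejected rewards drift negative.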
all_results.json CHANGED
@@ -1,9 +1,9 @@
  {
-     "epoch": 0.9924812030075187,
+     "epoch": 1.0,
      "total_flos": 0.0,
-     "train_loss": 0.6927223205566406,
-     "train_runtime": 34619.6666,
-     "train_samples": 4242,
-     "train_samples_per_second": 0.123,
-     "train_steps_per_second": 0.001
+     "train_loss": 0.19036180805619987,
+     "train_runtime": 15997.0818,
+     "train_samples": 406907,
+     "train_samples_per_second": 25.436,
+     "train_steps_per_second": 0.199
  }
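The updated values are internally consistent; the reported throughput follows directly from the sample count and runtime:

```python
# Quick consistency check on the new values: samples / runtime = samples/sec.
train_samples = 406907
train_runtime = 15997.0818  # seconds
print(round(train_samples / train_runtime, 3))  # -> 25.436, as reported
```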
train_results.json CHANGED
@@ -1,9 +1,9 @@
  {
-     "epoch": 0.9924812030075187,
+     "epoch": 1.0,
      "total_flos": 0.0,
-     "train_loss": 0.6927223205566406,
-     "train_runtime": 34619.6666,
-     "train_samples": 4242,
-     "train_samples_per_second": 0.123,
-     "train_steps_per_second": 0.001
+     "train_loss": 0.19036180805619987,
+     "train_runtime": 15997.0818,
+     "train_samples": 406907,
+     "train_samples_per_second": 25.436,
+     "train_steps_per_second": 0.199
  }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
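As a usage note: since the card lists library_name: peft, the saved artifact is a LoRA adapter rather than full model weights. A minimal sketch of loading it for inference follows, assuming a hypothetical adapter repo id RedaAlami/zephyr-7b-dpo-qlora; substitute the actual adapter path.

```python
# Minimal sketch: attach the PEFT (LoRA) adapter from this commit to the base
# model for inference. "RedaAlami/zephyr-7b-dpo-qlora" is a hypothetical
# adapter repo id -- replace it with the actual adapter path.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TII-Frontier-Team/falcon3-3b-instruct"
adapter_id = "RedaAlami/zephyr-7b-dpo-qlora"  # hypothetical; replace as needed

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)  # loads adapter_config + weights
model.eval()

prompt = "Explain what DPO training does, in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```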