Model save
- README.md +50 -8
- all_results.json +6 -6
- train_results.json +6 -6
- trainer_state.json +0 -0
README.md
CHANGED
@@ -1,7 +1,6 @@
 ---
-base_model:
+base_model: TII-Frontier-Team/falcon3-3b-instruct
 library_name: peft
-license: other
 tags:
 - trl
 - dpo
@@ -16,7 +15,17 @@ should probably proofread and complete it, then remove this comment. -->
 
 # zephyr-7b-dpo-qlora
 
-This model is a fine-tuned version of [
+This model is a fine-tuned version of [TII-Frontier-Team/falcon3-3b-instruct](https://huggingface.co/TII-Frontier-Team/falcon3-3b-instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.0287
+- Rewards/chosen: -4.6985
+- Rewards/rejected: -10.6531
+- Rewards/accuracies: 0.9276
+- Rewards/margins: 5.9547
+- Logps/rejected: -1101.2122
+- Logps/chosen: -502.6163
+- Logits/rejected: 1.9469
+- Logits/chosen: 2.1464
 
 ## Model description
 
@@ -51,12 +60,45 @@ The following hyperparameters were used during training:
 
 ### Training results
 
+| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6914 | 0.0315 | 100 | 0.6912 | 0.0006 | -0.0036 | 0.6340 | 0.0042 | -36.2582 | -32.7125 | -1.6841 | -1.6367 |
+| 0.6743 | 0.0629 | 200 | 0.6753 | -0.0009 | -0.0462 | 0.6321 | 0.0454 | -40.5232 | -32.8573 | -1.5154 | -1.4649 |
+| 0.6112 | 0.0944 | 300 | 0.5905 | -0.5010 | -0.8365 | 0.6631 | 0.3356 | -119.5518 | -82.8670 | -0.5166 | -0.4325 |
+| 0.4477 | 0.1258 | 400 | 0.4026 | -1.9267 | -3.0850 | 0.7201 | 1.1583 | -344.3972 | -225.4428 | -0.5023 | -0.3494 |
+| 0.3583 | 0.1573 | 500 | 0.3063 | -2.4869 | -4.1367 | 0.7646 | 1.6498 | -449.5698 | -281.4605 | 0.3124 | 0.4717 |
+| 0.3041 | 0.1887 | 600 | 0.2405 | -2.9070 | -4.9732 | 0.7918 | 2.0662 | -533.2189 | -323.4665 | 0.9644 | 1.1113 |
+| 0.2487 | 0.2202 | 700 | 0.1964 | -3.4123 | -5.8172 | 0.8209 | 2.4050 | -617.6231 | -373.9985 | 1.1343 | 1.2933 |
+| 0.218 | 0.2517 | 800 | 0.1547 | -3.6771 | -6.6251 | 0.8336 | 2.9480 | -698.4094 | -400.4795 | 1.5710 | 1.7290 |
+| 0.1858 | 0.2831 | 900 | 0.1394 | -3.5484 | -6.6808 | 0.8485 | 3.1324 | -703.9799 | -387.6123 | 1.6988 | 1.8631 |
+| 0.173 | 0.3146 | 1000 | 0.1176 | -3.4824 | -6.7705 | 0.8649 | 3.2881 | -712.9531 | -381.0118 | 1.8190 | 1.9776 |
+| 0.1494 | 0.3460 | 1100 | 0.0979 | -3.7942 | -7.4529 | 0.8713 | 3.6587 | -781.1857 | -412.1861 | 1.8179 | 1.9865 |
+| 0.149 | 0.3775 | 1200 | 0.0817 | -4.1856 | -8.2504 | 0.8843 | 4.0648 | -860.9355 | -451.3316 | 1.8715 | 2.0581 |
+| 0.1143 | 0.4089 | 1300 | 0.0702 | -4.2444 | -8.6154 | 0.8884 | 4.3710 | -897.4431 | -457.2141 | 1.7765 | 1.9770 |
+| 0.1204 | 0.4404 | 1400 | 0.0642 | -4.1442 | -8.6112 | 0.8966 | 4.4670 | -897.0154 | -447.1863 | 2.1996 | 2.3734 |
+| 0.1013 | 0.4718 | 1500 | 0.0580 | -4.5031 | -9.1159 | 0.8951 | 4.6128 | -947.4904 | -483.0838 | 1.9514 | 2.1364 |
+| 0.1011 | 0.5033 | 1600 | 0.0567 | -4.0373 | -8.5779 | 0.9067 | 4.5406 | -893.6846 | -436.5011 | 1.9239 | 2.1103 |
+| 0.0853 | 0.5348 | 1700 | 0.0482 | -4.3119 | -9.2927 | 0.9067 | 4.9808 | -965.1708 | -463.9637 | 2.0648 | 2.2336 |
+| 0.0897 | 0.5662 | 1800 | 0.0449 | -4.3018 | -9.4275 | 0.9101 | 5.1257 | -978.6490 | -462.9552 | 1.9037 | 2.0822 |
+| 0.0717 | 0.5977 | 1900 | 0.0402 | -4.4391 | -9.8395 | 0.9112 | 5.4004 | -1019.8445 | -476.6779 | 2.0003 | 2.1749 |
+| 0.0487 | 0.6291 | 2000 | 0.0368 | -5.4728 | -11.3180 | 0.9078 | 5.8452 | -1167.6968 | -580.0486 | 1.9355 | 2.1422 |
+| 0.0683 | 0.6606 | 2100 | 0.0356 | -4.6736 | -10.2835 | 0.9190 | 5.6099 | -1064.2465 | -500.1268 | 2.0206 | 2.2058 |
+| 0.0514 | 0.6920 | 2200 | 0.0341 | -4.6025 | -10.2228 | 0.9209 | 5.6203 | -1058.1812 | -493.0187 | 1.9362 | 2.1272 |
+| 0.0623 | 0.7235 | 2300 | 0.0326 | -4.9398 | -10.7061 | 0.9213 | 5.7663 | -1106.5096 | -526.7491 | 1.8240 | 2.0327 |
+| 0.0693 | 0.7550 | 2400 | 0.0313 | -4.8024 | -10.6310 | 0.9231 | 5.8286 | -1098.9999 | -513.0095 | 1.8580 | 2.0583 |
+| 0.0543 | 0.7864 | 2500 | 0.0303 | -4.8132 | -10.7352 | 0.9228 | 5.9221 | -1109.4199 | -514.0873 | 1.9534 | 2.1471 |
+| 0.0555 | 0.8179 | 2600 | 0.0301 | -4.7251 | -10.5626 | 0.9261 | 5.8375 | -1092.1620 | -505.2810 | 1.9398 | 2.1357 |
+| 0.0646 | 0.8493 | 2700 | 0.0294 | -4.6930 | -10.6307 | 0.9261 | 5.9377 | -1098.9694 | -502.0694 | 2.0003 | 2.1947 |
+| 0.0546 | 0.8808 | 2800 | 0.0287 | -4.8085 | -10.8169 | 0.9250 | 6.0084 | -1117.5887 | -513.6258 | 1.9596 | 2.1607 |
+| 0.0702 | 0.9122 | 2900 | 0.0288 | -4.6970 | -10.6904 | 0.9243 | 5.9934 | -1104.9371 | -502.4718 | 1.9696 | 2.1647 |
+| 0.0623 | 0.9437 | 3000 | 0.0286 | -4.7098 | -10.6743 | 0.9269 | 5.9645 | -1103.3302 | -503.7507 | 1.9440 | 2.1437 |
+| 0.0593 | 0.9751 | 3100 | 0.0287 | -4.6985 | -10.6531 | 0.9276 | 5.9547 | -1101.2122 | -502.6163 | 1.9469 | 2.1464 |
 
 
 ### Framework versions
 
-- PEFT 0.
-- Transformers 4.45.
-- Pytorch 2.4.
-- Datasets
-- Tokenizers 0.
+- PEFT 0.13.0
+- Transformers 4.45.1
+- Pytorch 2.4.1+cu121
+- Datasets 3.0.1
+- Tokenizers 0.20.0
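The evaluation metrics added to the README can be cross-checked against each other. The sketch below assumes that the reported `Rewards/margins` is the mean of chosen-minus-rejected rewards, so it should equal `Rewards/chosen - Rewards/rejected` up to rounding. That definition is an assumption about how the trainer logs the metric and is not stated in this diff. The names `final_eval` and `margin_gap` are hypothetical helpers introduced for illustration.

```python
# Final evaluation metrics copied from the README.md diff (step 3100).
final_eval = {
    "rewards/chosen": -4.6985,
    "rewards/rejected": -10.6531,
    "rewards/margins": 5.9547,
}


def margin_gap(metrics):
    """Return |reported margin - (chosen - rejected)|.

    Assumes the margin is the mean of per-example (chosen - rejected)
    rewards, which equals the difference of the two reported means.
    """
    implied = metrics["rewards/chosen"] - metrics["rewards/rejected"]
    return abs(metrics["rewards/margins"] - implied)


gap = margin_gap(final_eval)
# A gap near the 4-decimal rounding precision suggests the numbers agree.
print(f"gap between reported and implied margin: {gap:.4f}")
```

A gap at the level of the last printed decimal is what rounding alone would produce; a larger gap would suggest the three values were not taken from the same evaluation step.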
all_results.json
CHANGED
@@ -1,9 +1,9 @@
 {
-"epoch": 0
+"epoch": 1.0,
 "total_flos": 0.0,
-"train_loss": 0.
-"train_runtime":
-"train_samples":
-"train_samples_per_second":
-"train_steps_per_second": 0.
+"train_loss": 0.19036180805619987,
+"train_runtime": 15997.0818,
+"train_samples": 406907,
+"train_samples_per_second": 25.436,
+"train_steps_per_second": 0.199
 }
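The throughput figures in `all_results.json` follow from the runtime and sample count by simple arithmetic. The sketch below recomputes them, assuming a single epoch (as the `"epoch": 1.0` field suggests). The samples-per-step figure is only an inference from two rounded rates, because the batch-size hyperparameters are not shown in this diff. `train_stats` and `derived_stats` are hypothetical names used for illustration.

```python
# Values copied from the added side of the all_results.json diff.
train_stats = {
    "train_runtime": 15997.0818,          # seconds
    "train_samples": 406907,
    "train_samples_per_second": 25.436,
    "train_steps_per_second": 0.199,
}


def derived_stats(stats):
    """Recompute throughput figures from the raw runtime and counts."""
    runtime = stats["train_runtime"]
    return {
        # Samples processed per second over one epoch.
        "samples_per_second": stats["train_samples"] / runtime,
        # Approximate number of optimizer steps (rate is rounded to 3 decimals).
        "approx_steps": stats["train_steps_per_second"] * runtime,
        # Approximate samples consumed per step, inferred from rounded rates.
        "approx_samples_per_step": (
            stats["train_samples_per_second"] / stats["train_steps_per_second"]
        ),
        "runtime_hours": runtime / 3600,
    }


for name, value in derived_stats(train_stats).items():
    print(f"{name}: {value:.3f}")
```

The recomputed samples-per-second should agree with the logged value to three decimals, and the approximate step count should be consistent with the last logged row of the training table (step 3100 at epoch 0.9751).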
train_results.json
CHANGED
@@ -1,9 +1,9 @@
 {
-"epoch": 0
+"epoch": 1.0,
 "total_flos": 0.0,
-"train_loss": 0.
-"train_runtime":
-"train_samples":
-"train_samples_per_second":
-"train_steps_per_second": 0.
+"train_loss": 0.19036180805619987,
+"train_runtime": 15997.0818,
+"train_samples": 406907,
+"train_samples_per_second": 25.436,
+"train_steps_per_second": 0.199
 }
trainer_state.json
CHANGED
The diff for this file is too large to render.