Model save


Files changed (4)

README.md +50 -8
all_results.json +6 -6
train_results.json +6 -6
trainer_state.json +0 -0

README.md CHANGED

@@ -1,7 +1,6 @@
 ---
-base_model: tiiuae/falcon-mamba-7b-instruct
 library_name: peft
-license: other
 tags:
 - trl
 - dpo
@@ -16,7 +15,17 @@ should probably proofread and complete it, then remove this comment. -->
 # zephyr-7b-dpo-qlora
-This model is a fine-tuned version of [tiiuae/falcon-mamba-7b-instruct](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct) on an unknown dataset.
 ## Model description
@@ -51,12 +60,45 @@ The following hyperparameters were used during training:
 ### Training results
 ### Framework versions
-- PEFT 0.12.0
-- Transformers 4.45.0.dev0
-- Pytorch 2.4.0+cu121
-- Datasets 2.21.0
-- Tokenizers 0.19.1

 ---
+base_model: TII-Frontier-Team/falcon3-3b-instruct
 library_name: peft
 tags:
 - trl
 - dpo
 # zephyr-7b-dpo-qlora
+This model is a fine-tuned version of [TII-Frontier-Team/falcon3-3b-instruct](https://huggingface.co/TII-Frontier-Team/falcon3-3b-instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.0287
+- Rewards/chosen: -4.6985
+- Rewards/rejected: -10.6531
+- Rewards/accuracies: 0.9276
+- Rewards/margins: 5.9547
+- Logps/rejected: -1101.2122
+- Logps/chosen: -502.6163
+- Logits/rejected: 1.9469
+- Logits/chosen: 2.1464
 ## Model description
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6914        | 0.0315 | 100  | 0.6912          | 0.0006         | -0.0036          | 0.6340             | 0.0042          | -36.2582       | -32.7125     | -1.6841         | -1.6367       |
+| 0.6743        | 0.0629 | 200  | 0.6753          | -0.0009        | -0.0462          | 0.6321             | 0.0454          | -40.5232       | -32.8573     | -1.5154         | -1.4649       |
+| 0.6112        | 0.0944 | 300  | 0.5905          | -0.5010        | -0.8365          | 0.6631             | 0.3356          | -119.5518      | -82.8670     | -0.5166         | -0.4325       |
+| 0.4477        | 0.1258 | 400  | 0.4026          | -1.9267        | -3.0850          | 0.7201             | 1.1583          | -344.3972      | -225.4428    | -0.5023         | -0.3494       |
+| 0.3583        | 0.1573 | 500  | 0.3063          | -2.4869        | -4.1367          | 0.7646             | 1.6498          | -449.5698      | -281.4605    | 0.3124          | 0.4717        |
+| 0.3041        | 0.1887 | 600  | 0.2405          | -2.9070        | -4.9732          | 0.7918             | 2.0662          | -533.2189      | -323.4665    | 0.9644          | 1.1113        |
+| 0.2487        | 0.2202 | 700  | 0.1964          | -3.4123        | -5.8172          | 0.8209             | 2.4050          | -617.6231      | -373.9985    | 1.1343          | 1.2933        |
+| 0.218         | 0.2517 | 800  | 0.1547          | -3.6771        | -6.6251          | 0.8336             | 2.9480          | -698.4094      | -400.4795    | 1.5710          | 1.7290        |
+| 0.1858        | 0.2831 | 900  | 0.1394          | -3.5484        | -6.6808          | 0.8485             | 3.1324          | -703.9799      | -387.6123    | 1.6988          | 1.8631        |
+| 0.173         | 0.3146 | 1000 | 0.1176          | -3.4824        | -6.7705          | 0.8649             | 3.2881          | -712.9531      | -381.0118    | 1.8190          | 1.9776        |
+| 0.1494        | 0.3460 | 1100 | 0.0979          | -3.7942        | -7.4529          | 0.8713             | 3.6587          | -781.1857      | -412.1861    | 1.8179          | 1.9865        |
+| 0.149         | 0.3775 | 1200 | 0.0817          | -4.1856        | -8.2504          | 0.8843             | 4.0648          | -860.9355      | -451.3316    | 1.8715          | 2.0581        |
+| 0.1143        | 0.4089 | 1300 | 0.0702          | -4.2444        | -8.6154          | 0.8884             | 4.3710          | -897.4431      | -457.2141    | 1.7765          | 1.9770        |
+| 0.1204        | 0.4404 | 1400 | 0.0642          | -4.1442        | -8.6112          | 0.8966             | 4.4670          | -897.0154      | -447.1863    | 2.1996          | 2.3734        |
+| 0.1013        | 0.4718 | 1500 | 0.0580          | -4.5031        | -9.1159          | 0.8951             | 4.6128          | -947.4904      | -483.0838    | 1.9514          | 2.1364        |
+| 0.1011        | 0.5033 | 1600 | 0.0567          | -4.0373        | -8.5779          | 0.9067             | 4.5406          | -893.6846      | -436.5011    | 1.9239          | 2.1103        |
+| 0.0853        | 0.5348 | 1700 | 0.0482          | -4.3119        | -9.2927          | 0.9067             | 4.9808          | -965.1708      | -463.9637    | 2.0648          | 2.2336        |
+| 0.0897        | 0.5662 | 1800 | 0.0449          | -4.3018        | -9.4275          | 0.9101             | 5.1257          | -978.6490      | -462.9552    | 1.9037          | 2.0822        |
+| 0.0717        | 0.5977 | 1900 | 0.0402          | -4.4391        | -9.8395          | 0.9112             | 5.4004          | -1019.8445     | -476.6779    | 2.0003          | 2.1749        |
+| 0.0487        | 0.6291 | 2000 | 0.0368          | -5.4728        | -11.3180         | 0.9078             | 5.8452          | -1167.6968     | -580.0486    | 1.9355          | 2.1422        |
+| 0.0683        | 0.6606 | 2100 | 0.0356          | -4.6736        | -10.2835         | 0.9190             | 5.6099          | -1064.2465     | -500.1268    | 2.0206          | 2.2058        |
+| 0.0514        | 0.6920 | 2200 | 0.0341          | -4.6025        | -10.2228         | 0.9209             | 5.6203          | -1058.1812     | -493.0187    | 1.9362          | 2.1272        |
+| 0.0623        | 0.7235 | 2300 | 0.0326          | -4.9398        | -10.7061         | 0.9213             | 5.7663          | -1106.5096     | -526.7491    | 1.8240          | 2.0327        |
+| 0.0693        | 0.7550 | 2400 | 0.0313          | -4.8024        | -10.6310         | 0.9231             | 5.8286          | -1098.9999     | -513.0095    | 1.8580          | 2.0583        |
+| 0.0543        | 0.7864 | 2500 | 0.0303          | -4.8132        | -10.7352         | 0.9228             | 5.9221          | -1109.4199     | -514.0873    | 1.9534          | 2.1471        |
+| 0.0555        | 0.8179 | 2600 | 0.0301          | -4.7251        | -10.5626         | 0.9261             | 5.8375          | -1092.1620     | -505.2810    | 1.9398          | 2.1357        |
+| 0.0646        | 0.8493 | 2700 | 0.0294          | -4.6930        | -10.6307         | 0.9261             | 5.9377          | -1098.9694     | -502.0694    | 2.0003          | 2.1947        |
+| 0.0546        | 0.8808 | 2800 | 0.0287          | -4.8085        | -10.8169         | 0.9250             | 6.0084          | -1117.5887     | -513.6258    | 1.9596          | 2.1607        |
+| 0.0702        | 0.9122 | 2900 | 0.0288          | -4.6970        | -10.6904         | 0.9243             | 5.9934          | -1104.9371     | -502.4718    | 1.9696          | 2.1647        |
+| 0.0623        | 0.9437 | 3000 | 0.0286          | -4.7098        | -10.6743         | 0.9269             | 5.9645          | -1103.3302     | -503.7507    | 1.9440          | 2.1437        |
+| 0.0593        | 0.9751 | 3100 | 0.0287          | -4.6985        | -10.6531         | 0.9276             | 5.9547          | -1101.2122     | -502.6163    | 1.9469          | 2.1464        |
 ### Framework versions
+- PEFT 0.13.0
+- Transformers 4.45.1
+- Pytorch 2.4.1+cu121
+- Datasets 3.0.1
+- Tokenizers 0.20.0
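
The `Rewards/margins` column in the table above appears to be the difference between `Rewards/chosen` and `Rewards/rejected`. The sketch below checks that relationship on three rows copied from the table. It assumes the margin is simply chosen minus rejected (an assumption, not something stated in this commit), and the helper names are hypothetical.

```python
# Sanity check: reported reward margin vs. chosen - rejected.
# Assumption (not stated in the README): margin = chosen - rejected.


def reward_margin(chosen: float, rejected: float) -> float:
    """Return the gap between the chosen and rejected rewards."""
    return chosen - rejected


# (step, rewards/chosen, rewards/rejected, reported rewards/margins),
# copied from the evaluation table.
ROWS = [
    (100, 0.0006, -0.0036, 0.0042),
    (2800, -4.8085, -10.8169, 6.0084),
    (3100, -4.6985, -10.6531, 5.9547),
]


def max_margin_discrepancy(rows) -> float:
    """Largest absolute gap between the recomputed and reported margins."""
    return max(abs(reward_margin(c, r) - m) for _, c, r, m in rows)


if __name__ == "__main__":
    for step, chosen, rejected, reported in ROWS:
        recomputed = reward_margin(chosen, rejected)
        print(f"step {step}: recomputed={recomputed:.4f} reported={reported:.4f}")
```

Small differences in the last decimal place are expected, because each column is rounded to four decimals independently.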

all_results.json CHANGED

@@ -1,9 +1,9 @@
 {
-    "epoch": 0.9924812030075187,
     "total_flos": 0.0,
-    "train_loss": 0.6927223205566406,
-    "train_runtime": 34619.6666,
-    "train_samples": 4242,
-    "train_samples_per_second": 0.123,
-    "train_steps_per_second": 0.001
 }

 {
+    "epoch": 1.0,
     "total_flos": 0.0,
+    "train_loss": 0.19036180805619987,
+    "train_runtime": 15997.0818,
+    "train_samples": 406907,
+    "train_samples_per_second": 25.436,
+    "train_steps_per_second": 0.199
 }

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
-    "epoch": 0.9924812030075187,
     "total_flos": 0.0,
-    "train_loss": 0.6927223205566406,
-    "train_runtime": 34619.6666,
-    "train_samples": 4242,
-    "train_samples_per_second": 0.123,
-    "train_steps_per_second": 0.001
 }

 {
+    "epoch": 1.0,
     "total_flos": 0.0,
+    "train_loss": 0.19036180805619987,
+    "train_runtime": 15997.0818,
+    "train_samples": 406907,
+    "train_samples_per_second": 25.436,
+    "train_steps_per_second": 0.199
 }
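
The throughput fields in these JSON files can be cross-checked against each other. The sketch below assumes `train_samples_per_second` is `train_samples` divided by `train_runtime` for a run of about one epoch; that is an assumption about how the values were derived, and the function names are hypothetical. It also estimates the total optimizer steps from `train_steps_per_second`.

```python
# Cross-check of the throughput fields in all_results.json / train_results.json.
# Assumption: samples_per_second = train_samples / train_runtime (about one epoch).


def samples_per_second(train_samples: int, train_runtime: float) -> float:
    """Average number of training samples processed per second."""
    return train_samples / train_runtime


def approx_total_steps(steps_per_second: float, train_runtime: float) -> float:
    """Rough step count; coarse because steps_per_second is rounded."""
    return steps_per_second * train_runtime


# Values copied from the diff: (train_samples, train_runtime, reported rate).
OLD_RUN = (4242, 34619.6666, 0.123)
NEW_RUN = (406907, 15997.0818, 25.436)

if __name__ == "__main__":
    for name, (samples, runtime, reported) in (("old", OLD_RUN), ("new", NEW_RUN)):
        rate = samples_per_second(samples, runtime)
        print(f"{name}: recomputed={rate:.3f} reported={reported}")
    print(f"new run, approximate steps: {approx_total_steps(0.199, 15997.0818):.0f}")
```

The estimated step count for the new run is in line with the evaluation table, where step 3100 corresponds to epoch 0.9751.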

trainer_state.json CHANGED

The diff for this file is too large to render.