Anu/openhermes-mistral-dpo-gptq

Files changed (5)

README.md +14 -14
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
runs/Dec17_10-06-21_5191966662f3/events.out.tfevents.1702807714.5191966662f3.214.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -15,15 +15,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6986
-- Rewards/chosen: -0.0511
-- Rewards/rejected: -0.8534
-- Rewards/accuracies: 0.75
-- Rewards/margins: 0.8023
-- Logps/rejected: -266.5457
-- Logps/chosen: -155.8428
-- Logits/rejected: -1.7291
-- Logits/chosen: -1.6870
+- Loss: 0.4545
+- Rewards/chosen: -0.0587
+- Rewards/rejected: -1.0907
+- Rewards/accuracies: 0.875
+- Rewards/margins: 1.0320
+- Logps/rejected: -312.2487
+- Logps/chosen: -273.6681
+- Logits/rejected: -1.8614
+- Logits/chosen: -1.7936
 ## Model description
@@ -56,11 +56,11 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.7107        | 0.01  | 10   | 0.6516          | -0.1321        | -0.2312          | 0.5625             | 0.0991          | -260.3234      | -156.6530    | -1.7381         | -1.6899       |
-| 0.6802        | 0.01  | 20   | 0.6420          | -0.0577        | -0.3293          | 0.875              | 0.2717          | -261.3044      | -155.9080    | -1.7371         | -1.6891       |
-| 0.7309        | 0.01  | 30   | 0.6121          | 0.1195         | -0.2346          | 0.8125             | 0.3541          | -260.3571      | -154.1360    | -1.7408         | -1.6930       |
-| 0.6636        | 0.02  | 40   | 0.6198          | 0.0962         | -0.4775          | 0.75               | 0.5737          | -262.7859      | -154.3692    | -1.7374         | -1.6921       |
-| 0.7958        | 0.03  | 50   | 0.6986          | -0.0511        | -0.8534          | 0.75               | 0.8023          | -266.5457      | -155.8428    | -1.7291         | -1.6870       |
+| 0.6989        | 0.01  | 10   | 0.6566          | -0.0830        | -0.1482          | 0.75               | 0.0652          | -302.8232      | -273.9107    | -1.8738         | -1.7954       |
+| 0.6578        | 0.01  | 20   | 0.5787          | 0.0468         | -0.2201          | 0.8125             | 0.2669          | -303.5421      | -272.6130    | -1.8707         | -1.7965       |
+| 0.715         | 0.01  | 30   | 0.5021          | 0.2256         | -0.3134          | 0.8125             | 0.5391          | -304.4756      | -270.8246    | -1.8729         | -1.8014       |
+| 0.6847        | 0.02  | 40   | 0.4673          | 0.2097         | -0.6320          | 0.875              | 0.8417          | -307.6610      | -270.9843    | -1.8682         | -1.7996       |
+| 0.7869        | 0.03  | 50   | 0.4545          | -0.0587        | -1.0907          | 0.875              | 1.0320          | -312.2487      | -273.6681    | -1.8614         | -1.7936       |
 ### Framework versions
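
Review note: in the README rows of this commit, each Rewards/margins value appears to equal Rewards/chosen minus Rewards/rejected, up to rounding in the last digit. The sketch below checks that relationship on the five added rows. The helper name `margin_matches` and the tolerance are illustrative assumptions, not part of the repository.

```python
# (Rewards/chosen, Rewards/rejected, Rewards/margins) copied from the
# five added table rows (steps 10 to 50) in this commit's README.md.
rows = [
    (-0.0830, -0.1482, 0.0652),
    (0.0468, -0.2201, 0.2669),
    (0.2256, -0.3134, 0.5391),
    (0.2097, -0.6320, 0.8417),
    (-0.0587, -1.0907, 1.0320),
]


def margin_matches(chosen, rejected, margin, tol=5e-4):
    """Return True if margin equals chosen - rejected within tol.

    The tolerance is an assumption: it absorbs the four-decimal
    rounding applied to each column independently.
    """
    return abs((chosen - rejected) - margin) <= tol


all_consistent = all(margin_matches(*row) for row in rows)
print(all_consistent)
```

The same check can be run against the removed rows to confirm that both versions of the table are internally consistent.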

adapter_config.json CHANGED

@@ -19,8 +19,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v_proj",
-    "q_proj"
+    "q_proj",
+    "v_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
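
Review note: the only change visible in this hunk is the order of the two `target_modules` entries. A minimal sketch of an order-insensitive comparison follows, assuming the consumer of `adapter_config.json` treats `target_modules` as an unordered collection of module names. That assumption is not confirmed by this diff. The JSON fragments are trimmed to the one key shown in the hunk, and `same_target_modules` is a hypothetical helper.

```python
import json

# Fragments reduced to the single key that changed in this hunk.
old_fragment = '{"target_modules": ["v_proj", "q_proj"]}'
new_fragment = '{"target_modules": ["q_proj", "v_proj"]}'


def same_target_modules(a: str, b: str) -> bool:
    """Compare target_modules of two JSON configs, ignoring list order."""
    mods_a = set(json.loads(a)["target_modules"])
    mods_b = set(json.loads(b)["target_modules"])
    return mods_a == mods_b


print(same_target_modules(old_fragment, new_fragment))
```

Under that assumption, the two versions of the config select the same modules, and the reordering reflects only how the list was serialized.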

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cd56a1c9a1196dc72c93cd6705385b9f7ee136699de0fb3ba6bce30cd3c408d7
+oid sha256:31fc61323b8f5fef918c042b2673538aee93fc87f7d454187a4949021729eba8
 size 13648432

runs/Dec17_10-06-21_5191966662f3/events.out.tfevents.1702807714.5191966662f3.214.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:baa2184db1d5d96ec3de1e06c8535f180cd2d68299c17dbd27f59355247530e3
+size 12594

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1d3db60904dc14d8aaecb00755a05a8e3c173094799939cd9800b6aef0da2824
+oid sha256:b6b42e2c057547a59e697b34c870b3212086924324c52af5fa5e7e866d1078e0
 size 4155
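
Review note: the binary files in this commit are stored as Git LFS pointers, each carrying a `version`, a `sha256` `oid`, and a `size` in bytes. The sketch below parses such a pointer and checks downloaded bytes against it. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for illustration and are not part of any Git LFS tooling.

```python
import hashlib

# New pointer for adapter_model.safetensors, copied from this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:31fc61323b8f5fef918c042b2673538aee93fc87f7d454187a4949021729eba8
size 13648432
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that data has the byte length and SHA-256 digest in the pointer."""
    if pointer["algo"] != "sha256":
        return False  # this sketch only handles sha256 oids
    if len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["size"], pointer["digest"][:8])
```

Since the old and new pointers for `adapter_model.safetensors` and `training_args.bin` report the same `size`, only the `oid` distinguishes the two versions of each file.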