LBusser/mistral_dpo0.0

Files changed (5)

README.md +14 -14
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
runs/Dec11_09-45-16_70b772f1ab48/events.out.tfevents.1702287929.70b772f1ab48.9285.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -15,15 +15,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2177
-- Rewards/chosen: -4.0177
-- Rewards/rejected: -3.1669
-- Rewards/accuracies: 0.5625
-- Rewards/margins: -0.8508
-- Logps/rejected: -609.4065
-- Logps/chosen: -540.1581
-- Logits/rejected: -0.8963
-- Logits/chosen: -1.0179
 ## Model description
@@ -56,11 +56,11 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.7105        | 0.01  | 10   | 0.8032          | -0.5208        | -0.1762          | 0.5                | -0.3446         | -579.4993      | -505.1888    | -0.9201         | -1.0572       |
-| 0.6608        | 0.01  | 20   | 1.0717          | -1.5841        | -1.1358          | 0.625              | -0.4482         | -589.0955      | -515.8216    | -0.9135         | -1.0449       |
-| 0.7477        | 0.01  | 30   | 1.0619          | -1.4672        | -1.4134          | 0.5625             | -0.0539         | -591.8709      | -514.6534    | -0.9162         | -1.0480       |
-| 0.7152        | 0.02  | 40   | 1.5518          | -2.5724        | -2.2804          | 0.5625             | -0.2919         | -600.5416      | -525.7047    | -0.9079         | -1.0351       |
-| 0.8106        | 0.03  | 50   | 2.2177          | -4.0177        | -3.1669          | 0.5625             | -0.8508         | -609.4065      | -540.1581    | -0.8963         | -1.0179       |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6931
+- Rewards/chosen: 0.0
+- Rewards/rejected: 0.0
+- Rewards/accuracies: 0.0
+- Rewards/margins: 0.0
+- Logps/rejected: -710.1063
+- Logps/chosen: -753.4531
+- Logits/rejected: -1.6178
+- Logits/chosen: -1.7566
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6931        | 0.01  | 10   | 0.6931          | 0.0            | 0.0              | 0.0                | 0.0             | -710.1063      | -753.4531    | -1.6178         | -1.7566       |
+| 0.6931        | 0.01  | 20   | 0.6931          | 0.0            | 0.0              | 0.0                | 0.0             | -710.1063      | -753.4531    | -1.6178         | -1.7566       |
+| 0.6931        | 0.01  | 30   | 0.6931          | 0.0            | 0.0              | 0.0                | 0.0             | -710.1063      | -753.4531    | -1.6178         | -1.7566       |
+| 0.6931        | 0.02  | 40   | 0.6931          | 0.0            | 0.0              | 0.0                | 0.0             | -710.1063      | -753.4531    | -1.6178         | -1.7566       |
+| 0.6931        | 0.03  | 50   | 0.6931          | 0.0            | 0.0              | 0.0                | 0.0             | -710.1063      | -753.4531    | -1.6178         | -1.7566       |
 ### Framework versions
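
A note on the new metrics: a validation loss of 0.6931 with all rewards at 0.0 is what the sigmoid DPO loss gives when the chosen/rejected reward margin is exactly zero, since -log(sigmoid(0)) = ln 2. The identical rows at every step suggest, though this diff alone cannot confirm it, that the adapter did not move away from the reference model in this run. The minimal sketch below assumes the standard per-pair sigmoid DPO loss; `dpo_loss` is a hypothetical helper written for illustration, not a function from the training code. It also checks that the margin in the removed final row equals chosen minus rejected.

```python
import math


def dpo_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Per-pair sigmoid DPO loss, -log(sigmoid(margin)).

    Hypothetical helper: assumes the rewards are the already beta-scaled
    implicit rewards that appear in the Rewards/* columns.
    """
    margin = reward_chosen - reward_rejected
    # log1p(exp(-m)) is algebraically equal to -log(sigmoid(m)).
    return math.log1p(math.exp(-margin))


# New rows: zero rewards give a zero margin, so the loss is ln 2.
print(round(dpo_loss(0.0, 0.0), 4))   # 0.6931

# Removed final row (step 50): margin = chosen - rejected.
print(round(-4.0177 - (-3.1669), 4))  # -0.8508
```

The removed rows' validation loss is an average over evaluation pairs, so it cannot be recomputed from the averaged rewards alone; only the zero-margin case is reproduced exactly here.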

adapter_config.json CHANGED

@@ -19,8 +19,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q_proj",
-    "v_proj"
   ],
   "task_type": "CAUSAL_LM"
 }

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "v_proj",
+    "q_proj"
   ],
   "task_type": "CAUSAL_LM"
 }

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c720b26e35f914cfa9725e2546ccc24226d45d573d6ba4f5b8fa5a2805296954
 size 13648432

 version https://git-lfs.github.com/spec/v1
+oid sha256:c9a52614c9912d66ee94a410085b0f48dd16cd58becb67a6e9a1bcc9e7322987
 size 13648432

runs/Dec11_09-45-16_70b772f1ab48/events.out.tfevents.1702287929.70b772f1ab48.9285.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4b746430a09b790f04fd6afe3635bf9c07bb24f6a0ae8f26c92201b44139301f
+size 12594

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1b804ae047f4e66e496b0931a01a1e0c2d0aa3a03310742121512e9a45c3a861
 size 4155

 version https://git-lfs.github.com/spec/v1
+oid sha256:2b97882146d62af5b65ef1c79b4483b39ab44d2f592e5386a256ac2f677f950d
 size 4155