Commit fd322cb by lordspline
Parent(s): ad34de3
End of training

Files changed:
- README.md (+13, -7)
- pytorch_model.bin (+1, -1)
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-base_model: lordspline/
+base_model: lordspline/mergestein
 tags:
 - axolotl
 - generated_from_trainer
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 axolotl version: `0.4.1`
 ```yaml
-base_model: lordspline/
+base_model: lordspline/mergestein
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
 
@@ -26,7 +26,13 @@ strict: false
 
 chat_template: chatml
 datasets:
-  - path: lordspline/scidata
+  # - path: lordspline/scidata
+  #   type: sharegpt
+  #   conversation: chatml
+  - path: lordspline/wizard_v2_196k_unfiltered
+    type: sharegpt
+    conversation: chatml
+  - path: lordspline/ultrainteract
     type: sharegpt
     conversation: chatml
 
@@ -64,7 +70,7 @@ gradient_checkpointing: unsloth
 gradient_checkpointing_kwargs:
   use_reentrant: true # look
 early_stopping_patience:
-resume_from_checkpoint: ./mergestein/checkpoint-8015
+resume_from_checkpoint: # ./mergestein/checkpoint-8015
 local_rank:
 logging_steps: 1
 xformers_attention:
@@ -93,9 +99,9 @@ tokens:
 
 # mergestein
 
-This model is a fine-tuned version of [lordspline/
+This model is a fine-tuned version of [lordspline/mergestein](https://huggingface.co/lordspline/mergestein) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
+- Loss: 1.0348
 
 ## Model description
 
@@ -127,7 +133,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
-| 1.
+| 1.1202        | 1.0   | 25552 | 1.0348          |
 
 
 ### Framework versions
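The config above trains on sharegpt-format conversation datasets rendered with the ChatML template (`chat_template: chatml`, `conversation: chatml`). As a rough illustration only (not axolotl's internal code), a sharegpt turn list maps onto ChatML roughly like this:

```python
# Hypothetical sketch of ChatML rendering for sharegpt-style turns.
# The role mapping and <|im_start|>/<|im_end|> delimiters follow the
# common ChatML convention; axolotl's actual renderer may differ.

ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}

def to_chatml(conversations):
    """Render a sharegpt 'conversations' list into a single ChatML string."""
    parts = []
    for turn in conversations:
        role = ROLE_MAP.get(turn["from"], turn["from"])
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>")
    return "\n".join(parts)

example = [
    {"from": "human", "value": "What is 2 + 2?"},
    {"from": "gpt", "value": "4"},
]
print(to_chatml(example))
```

Each sharegpt record's `from`/`value` pairs become delimited role blocks, which is why the config can point several differently sourced datasets at the same `type: sharegpt` / `conversation: chatml` pair.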
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fc89906f67ddbeb76efbb37239aa493a9b881a2577994f05671aea509adc0188
 size 1589947346
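`pytorch_model.bin` is stored as a Git LFS pointer file: three `key value` lines giving the spec version, the `sha256:` object id, and the byte size. A minimal sketch of verifying a downloaded blob against such a pointer (the parsing here assumes the simple three-line format shown in the diff):

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse a git-lfs pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def verify_lfs_file(pointer_text, data: bytes) -> bool:
    """Check that data matches the pointer's sha256 oid and declared size."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    assert algo == "sha256", "only sha256 oids are handled in this sketch"
    return (hashlib.sha256(data).hexdigest() == expected
            and len(data) == int(fields["size"]))

# Build a pointer for a tiny synthetic blob and verify it round-trips.
blob = b"hello"
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(blob).hexdigest()}\n"
    f"size {len(blob)}\n"
)
print(verify_lfs_file(pointer, blob))  # True
```

This is why the diff for a 1.5 GB weights file is a one-line change: only the pointer's `oid` hash moves in git, while the blob itself lives in LFS storage.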