Commit fd322cb by lordspline
Parent(s): ad34de3
End of training

Files changed:
- README.md (+13, -7)
- pytorch_model.bin (+1, -1)
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-base_model: lordspline/
+base_model: lordspline/mergestein
 tags:
 - axolotl
 - generated_from_trainer
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 axolotl version: `0.4.1`
 ```yaml
-base_model: lordspline/
+base_model: lordspline/mergestein
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
 
@@ -26,7 +26,13 @@ strict: false
 
 chat_template: chatml
 datasets:
-  - path: lordspline/scidata
+  # - path: lordspline/scidata
+  #   type: sharegpt
+  #   conversation: chatml
+  - path: lordspline/wizard_v2_196k_unfiltered
+    type: sharegpt
+    conversation: chatml
+  - path: lordspline/ultrainteract
     type: sharegpt
     conversation: chatml
 
@@ -64,7 +70,7 @@ gradient_checkpointing: unsloth
 gradient_checkpointing_kwargs:
   use_reentrant: true # look
 early_stopping_patience:
-resume_from_checkpoint: ./mergestein/checkpoint-8015
+resume_from_checkpoint: # ./mergestein/checkpoint-8015
 local_rank:
 logging_steps: 1
 xformers_attention:
@@ -93,9 +99,9 @@ tokens:
 
 # mergestein
 
-This model is a fine-tuned version of [lordspline/
+This model is a fine-tuned version of [lordspline/mergestein](https://huggingface.co/lordspline/mergestein) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
+- Loss: 1.0348
 
 ## Model description
 
@@ -127,7 +133,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
-| 1.
+| 1.1202        | 1.0   | 25552 | 1.0348          |
 
 
 ### Framework versions
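The config above trains on sharegpt-format conversation datasets rendered with the ChatML template (`chat_template: chatml`, `conversation: chatml`). As a rough illustration only (not axolotl's internal code), a sharegpt turn list maps onto ChatML roughly like this:

```python
# Hypothetical sketch of ChatML rendering for sharegpt-style turns.
# The role mapping and <|im_start|>/<|im_end|> delimiters follow the
# common ChatML convention; axolotl's actual renderer may differ.

ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}

def to_chatml(conversations):
    """Render a sharegpt 'conversations' list into a single ChatML string."""
    parts = []
    for turn in conversations:
        role = ROLE_MAP.get(turn["from"], turn["from"])
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>")
    return "\n".join(parts)

example = [
    {"from": "human", "value": "What is 2 + 2?"},
    {"from": "gpt", "value": "4"},
]
print(to_chatml(example))
```

Each sharegpt record's `from`/`value` pairs become delimited role blocks, which is why the config can point several differently sourced datasets at the same `type: sharegpt` / `conversation: chatml` pair.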
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fc89906f67ddbeb76efbb37239aa493a9b881a2577994f05671aea509adc0188
 size 1589947346
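`pytorch_model.bin` is stored as a Git LFS pointer file: three `key value` lines giving the spec version, the `sha256:` object id, and the byte size. A minimal sketch of verifying a downloaded blob against such a pointer (the parsing here assumes the simple three-line format shown in the diff):

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse a git-lfs pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def verify_lfs_file(pointer_text, data: bytes) -> bool:
    """Check that data matches the pointer's sha256 oid and declared size."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    assert algo == "sha256", "only sha256 oids are handled in this sketch"
    return (hashlib.sha256(data).hexdigest() == expected
            and len(data) == int(fields["size"]))

# Build a pointer for a tiny synthetic blob and verify it round-trips.
blob = b"hello"
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(blob).hexdigest()}\n"
    f"size {len(blob)}\n"
)
print(verify_lfs_file(pointer, blob))  # True
```

This is why the diff for a 1.5 GB weights file is a one-line change: only the pointer's `oid` hash moves in git, while the blob itself lives in LFS storage.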