Model save

Files changed (4)

README.md +97 -0
final_model/config.json +1 -1
final_model/model.safetensors +1 -1
final_model/training_args.bin +2 -2

README.md ADDED

@@ -0,0 +1,97 @@

+---
+base_model: liminerity/Bitnet-Mistral.0.2-v3
+tags:
+- generated_from_trainer
+model-index:
+- name: Bitnet-Mistral.0.2-v5
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Bitnet-Mistral.0.2-v5
+This model is a fine-tuned version of [liminerity/Bitnet-Mistral.0.2-v3](https://huggingface.co/liminerity/Bitnet-Mistral.0.2-v3) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 4.4220
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 4
+- eval_batch_size: 16
+- seed: 42
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- training_steps: 1000
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 4.6207        | 0.0226 | 25   | 4.5769          |
+| 4.3648        | 0.0451 | 50   | 4.5716          |
+| 4.4694        | 0.0677 | 75   | 4.5629          |
+| 4.1782        | 0.0903 | 100  | 4.5555          |
+| 4.4231        | 0.1128 | 125  | 4.5486          |
+| 4.2572        | 0.1354 | 150  | 4.5546          |
+| 4.3837        | 0.1580 | 175  | 4.5466          |
+| 4.1915        | 0.1805 | 200  | 4.5419          |
+| 4.3894        | 0.2031 | 225  | 4.5257          |
+| 4.2447        | 0.2257 | 250  | 4.5235          |
+| 4.0355        | 0.2483 | 275  | 4.5236          |
+| 4.1873        | 0.2708 | 300  | 4.5211          |
+| 4.3891        | 0.2934 | 325  | 4.5078          |
+| 4.1322        | 0.3160 | 350  | 4.5019          |
+| 4.0357        | 0.3385 | 375  | 4.5051          |
+| 4.3401        | 0.3611 | 400  | 4.4921          |
+| 4.3848        | 0.3837 | 425  | 4.4903          |
+| 4.305         | 0.4062 | 450  | 4.4789          |
+| 4.2776        | 0.4288 | 475  | 4.4765          |
+| 4.1802        | 0.4514 | 500  | 4.4727          |
+| 4.0785        | 0.4739 | 525  | 4.4674          |
+| 4.0607        | 0.4965 | 550  | 4.4623          |
+| 3.9385        | 0.5191 | 575  | 4.4611          |
+| 4.194         | 0.5416 | 600  | 4.4565          |
+| 4.277         | 0.5642 | 625  | 4.4478          |
+| 4.1751        | 0.5868 | 650  | 4.4457          |
+| 4.0422        | 0.6093 | 675  | 4.4428          |
+| 4.1503        | 0.6319 | 700  | 4.4406          |
+| 4.0552        | 0.6545 | 725  | 4.4366          |
+| 4.4017        | 0.6770 | 750  | 4.4327          |
+| 4.2394        | 0.6996 | 775  | 4.4300          |
+| 4.1975        | 0.7222 | 800  | 4.4277          |
+| 4.2378        | 0.7448 | 825  | 4.4279          |
+| 4.078         | 0.7673 | 850  | 4.4256          |
+| 4.4727        | 0.7899 | 875  | 4.4235          |
+| 4.1667        | 0.8125 | 900  | 4.4224          |
+| 4.4079        | 0.8350 | 925  | 4.4223          |
+| 4.3179        | 0.8576 | 950  | 4.4221          |
+| 4.0479        | 0.8802 | 975  | 4.4220          |
+| 4.0943        | 0.9027 | 1000 | 4.4220          |
+### Framework versions
+- Transformers 4.41.2
+- Pytorch 2.3.0+cu121
+- Datasets 2.20.0
+- Tokenizers 0.19.1
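
The `total_train_batch_size` reported in the card follows from the per-device batch size and the gradient accumulation steps. The sketch below reproduces that arithmetic and also converts the final validation loss to a perplexity. It assumes a single training device and that the reported loss is a mean token-level cross-entropy in nats; the card states neither, and the variable names are illustrative only.

```python
import math

# Values copied from the model card above.
train_batch_size = 4
gradient_accumulation_steps = 16
final_eval_loss = 4.4220

# Assumption: one device; the card does not state the device count.
num_devices = 1

# Samples contributing to each optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: the loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(final_eval_loss)

print(f"effective batch size: {effective_batch_size}")
print(f"perplexity: {perplexity:.1f}")
```

Under the single-device assumption the product matches the card's `total_train_batch_size` of 64.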

final_model/config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "liminerity/Bitnet-Mistral.0.2-v5",
+  "_name_or_path": "liminerity/Bitnet-Mistral.0.2-v3",
   "architectures": [
     "MistralForCausalLM"
   ],

final_model/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2723b55dc532e31d62acf4ec0eeada693872696fd7e62b7c462b38a2802c41e2
+oid sha256:fc314570a84931d72186e6dd236fbba46c68749909b5ad0826316e72549a8f2b
 size 128524840

final_model/training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6fe39a7e59cfc34bf787359c8621875a8af28b822ffcc77eace41356553656a4
-size 5112
+oid sha256:6b8375d79a12b46f0f80fef61798e6dc83ad0397be36c50baa6a2c4a13d08214
+size 5176
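
The binary files in this commit are stored as Git LFS pointers, where `oid sha256:` is the SHA-256 digest of the file contents and `size` is its length in bytes. A downloaded file can be checked against its pointer with a short sketch like the one below. The helper names `lfs_oid` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption about where the file was downloaded.

```python
import hashlib


def lfs_oid(path, chunk_size=1 << 20):
    """Return (sha256 hex digest, size in bytes) of the file at `path`."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def matches_pointer(path, expected_oid, expected_size):
    """True if the file's digest and size both match the LFS pointer fields."""
    oid, size = lfs_oid(path)
    return oid == expected_oid and size == expected_size
```

For example, `matches_pointer("final_model/model.safetensors", "fc314570a84931d72186e6dd236fbba46c68749909b5ad0826316e72549a8f2b", 128524840)` should return `True` for an intact copy of the new weights, assuming the file sits at that relative path.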